Why should you pick a license?
In other words, why not just put code online without a license? A license clarifies the conditions under which your code can be reused. In the absence of a license, the author still retains proprietary copyright, and the conditions under which the code can be used are unclear [1]. A sentence like "feel free to use this code" does not improve the situation, because it does not clarify to what extent a potential user should "feel free" to use the code. Standard licenses provide pre-defined sets of conditions, which both providers and users only have to understand once and can then easily recognise and categorise.
Also see "Why is an open-source licence useful?" in the Software Sustainability Institute's article "Choosing an open-source licence".
Selecting a license
The most common licenses for a given artifact can be determined by its type: code, data, or any other generic digital "creative work" (documentation, reports, figures). For any given project, its components can and should be licensed independently by type.
At the most basic level, one must decide whether to use a copyleft license or a more permissive license. Copyleft ensures that changes made by future contributors stay public whenever the modified work is distributed, whereas permissive licenses only require attribution in derived works.
With these two distinctions (type of work, type of license), the following decision matrix can be drawn:
|| || Code || Data || Other
|| Copyleft || GPL || ODbL, CC BY-SA || CC BY-SA
|| Permissive || MIT, BSD, Apache || ODC-BY, CC BY || CC BY
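The matrix can also be read as a simple lookup keyed by license type and type of work. The following is only a minimal sketch: the key names and the helper `candidate_licenses` are illustrative assumptions, and the GPL entry for copyleft code is taken from the discussion of code licenses further down.

```python
# Decision matrix from the text as a lookup table.
# Key names are illustrative; the ("copyleft", "code") entry is taken
# from the later discussion of code licenses.
DECISION_MATRIX = {
    ("copyleft", "code"): ["GPL"],
    ("copyleft", "data"): ["ODbL", "CC BY-SA"],
    ("copyleft", "other"): ["CC BY-SA"],
    ("permissive", "code"): ["MIT", "BSD", "Apache"],
    ("permissive", "data"): ["ODC-BY", "CC BY"],
    ("permissive", "other"): ["CC BY"],
}


def candidate_licenses(license_type, work_type):
    """Return the candidate licenses for one component of a project."""
    key = (license_type.lower(), work_type.lower())
    if key not in DECISION_MATRIX:
        raise ValueError(f"unknown combination: {key}")
    return list(DECISION_MATRIX[key])


# A hypothetical project whose components are licensed independently by type.
project = {"code": "permissive", "data": "copyleft", "other": "permissive"}
for work_type, license_type in project.items():
    print(work_type, "->", candidate_licenses(license_type, work_type))
```

Keeping one entry per component type mirrors the recommendation that code, data, and other works be licensed independently.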
The following sections elaborate on both options for each type of work: code, data, and other.
The following questions offer a rough guideline on whether to choose a copyleft or a permissive license for a project. The Wikipedia article on Free software licenses gives an in-depth overview of both types and their properties.
- Are you OK with your code becoming part of a closed-source commercial software product?
  - No: GPL
  - Yes: permissive licenses (MIT/BSD/Apache)
- Do you want to force users to publish their improvements to your software, or to software they develop based on your software, under the same licence?
  - No: permissive licenses (MIT/BSD/Apache). This makes the code more broadly usable, but also allows people to take the code without sharing their improvements.
  - Yes: GPL. This ensures that any future changes and improvements to the code remain free and open.
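The two questions can be folded into a small helper. This is a sketch only: the function name and parameters are hypothetical, and it assumes that when the two answers conflict, the stricter (copyleft) answer wins.

```python
def suggest_code_license(allow_closed_source_use, require_share_alike):
    """Map the two yes/no questions from the text to a license family.

    Assumption: if the answers conflict, the stricter (copyleft) option wins.
    """
    if require_share_alike or not allow_closed_source_use:
        return "GPL"
    return "permissive (MIT/BSD/Apache)"


# Fine with closed-source reuse, no obligation to share improvements.
print(suggest_code_license(allow_closed_source_use=True, require_share_alike=False))
# Improvements must be published under the same licence.
print(suggest_code_license(allow_closed_source_use=False, require_share_alike=True))
```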
Developers who use GPL-licensed code must make the source code available if they share or sell an application built upon it, including any changes they have made. If GPL code is used but not shared or sold, the code need not be published and any improvements may remain private. An important consequence is that software under such a license cannot be included in "non-free" software. The GPL family of licenses comprises:
- GPL: the basic GPL license
- LGPL: Lesser GPL: permits non-free software to link to the LGPL-licensed software, which the GPL does not
- AGPL: GNU Affero GPL: closes a loophole in the GPL that permits somebody to operate a web application that uses GPL code without making the code available to its users
More information, geared towards the use of copyleft licenses for code, can be found in the article How to choose a license for your own work on gnu.org.
For permissive licenses, the article Why you should use a BSD style license for your Open Source Project on freebsd.org is recommended reading. Permissive licenses allow code to be re-used with very few restrictions, including in commercial software whose new code is not made public.
The site choosealicense.com has a great three-column summary of the differences between the major open source licenses. Summarizing their explanations (with links to their license description pages):
- The MIT license is a simple permissive license
- The Apache License adds an additional term explicitly granting the patent rights of code contributors to the users of such code
- The BSD license, or its newer and even shorter successor ISC, is virtually identical to MIT but more concisely worded
One important aspect of all these licenses is the liability clause. This clause prohibits any code user from holding the code producers liable for damages arising from the use of the software code.
Data in this context includes organized databases holding datasets. Special copyright rules may apply to databases, which, under European law, are protected in their own right, irrespective of the status of the data they contain.
The go-to data licenses come from Open Data Commons, which has created the following three licenses as well as a 2-minute guide on the how and why of data licensing:
- Creative Commons licenses have been reworked since version 2 to include provisions which make them suitable for datasets and databases as well. Please refer to the section Other for their properties. See also the related question are CC licenses suitable for data? on StackExchange.
If you would like your model documentation to be used verbatim on Wikipedia, you need to use one of these Creative Commons licenses:
- CC BY, all versions and ports up to and including 4.0
- CC BY-SA, versions 1.0, 2.0, 2.5, and 3.0
However, CC BY-SA 4.0 is not compatible with Wikipedia. For more information, please see the Wikipedia compatibility table. The Wikimedia Foundation ran a community discussion on an upgrade to CC BY-SA 4.0 in November 2016, but the outcome has not been announced. If the upgrade goes ahead, this compatibility problem will disappear.
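The compatibility rules stated above can be captured in a small check. This sketch encodes only what this page says at the time of writing; the table and the function name are illustrative assumptions, and the Wikipedia compatibility table remains the authoritative source.

```python
# Versions accepted for verbatim reuse on Wikipedia, as listed in the text.
# Snapshot of this page's statements, not an authoritative table.
WIKIPEDIA_COMPATIBLE = {
    "CC BY": {"1.0", "2.0", "2.5", "3.0", "4.0"},
    "CC BY-SA": {"1.0", "2.0", "2.5", "3.0"},
}


def is_wikipedia_compatible(license_name, version):
    """Return True if the text lists this license version as compatible."""
    return version in WIKIPEDIA_COMPATIBLE.get(license_name, set())


print(is_wikipedia_compatible("CC BY", "4.0"))
print(is_wikipedia_compatible("CC BY-SA", "4.0"))
```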
GNU also recommends not using the following licenses for free documentation:
- CC-NC (non-commercial) license of any form
- CC-ND (no derivatives) license of any form
For more information, please see the GNU page on nonfree documentation licenses.
The Creative Commons family of licenses is probably the most widely known. As of December 2016, the fourth version of these licenses is current. First, the two "free licenses" are presented:
- CC BY (short for attribution, the "by" refers to authorship): requires that the source be named when sharing the original work or any derived work
- CC BY-SA: similar to the GPL in that it requires any derived works to be published under a "compatible" (easiest is the same) license. It is the license used by Wikipedia. Version 4.0 is one-way compatible with the GPLv3 license: material under CC BY-SA 4.0 can be incorporated into GPLv3-licensed works, but not the other way round.
Only the two previous CC licenses qualify as "free" licenses, as they do not restrict what a user may do with the licensed work. Two "non-free" license components are presented:
- CC NC: the non-commercial option allows one to prohibit any "commercial" usage of their work. However, it is notoriously hard to define exactly what constitutes a commercial activity. This issue is explained in detail in the brochure Consequences, risks and side-effects of the license module "non-commercial use only". In any case, the two licenses which result from this component are CC BY-NC and, with the share-alike option, CC BY-NC-SA.
- CC ND (short for no derivatives): this option is available if you do not want changes or improvements to your work to be shared or redistributed. Its main use is by musicians who want to share their music with listeners, but do not wish to allow remixes or cover versions. The resulting licenses are CC BY-ND or, by combining it with the NC option, CC BY-NC-ND.
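The way the components combine into license names can be sketched as follows. The function names are hypothetical; the sketch assumes that attribution (BY) is always present and that share-alike and no-derivatives cannot be combined, since ND forbids sharing the derived works that SA would govern.

```python
def cc_license_name(share_alike=False, non_commercial=False, no_derivatives=False):
    """Compose a Creative Commons license name from its components."""
    if share_alike and no_derivatives:
        raise ValueError("SA and ND cannot be combined")
    parts = ["BY"]  # attribution is part of every license discussed here
    if non_commercial:
        parts.append("NC")
    if share_alike:
        parts.append("SA")
    if no_derivatives:
        parts.append("ND")
    return "CC " + "-".join(parts)


def is_free_license(name):
    """Only CC BY and CC BY-SA count as "free" licenses in the text."""
    return name in {"CC BY", "CC BY-SA"}


for options in ({}, {"share_alike": True}, {"non_commercial": True},
                {"non_commercial": True, "share_alike": True},
                {"no_derivatives": True},
                {"non_commercial": True, "no_derivatives": True}):
    name = cc_license_name(**options)
    print(name, "(free)" if is_free_license(name) else "(non-free)")
```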
For inspiration, the Creative Commons homepage lists a host of example projects licensed under all different licenses to give an impression of the contexts that triggered various individuals to select a particular license.
Why is it a bad idea to use Creative Commons licenses for code?
The long answer is given in Why is CC BY-SA discouraged for code? on StackExchange.
The short answer is that the CC licenses are written for creative works (music, film, texts), not software code. Therefore, a CC license leaves many important details unspecified (what constitutes a "derived" work, is linking allowed, what about patents?). The conclusion: do not use CC licenses for code.
At what point should I choose or change a license?
You should determine and add a licence prior to the first release of your project. Ideally that would include “small” releases within your research institute or organization.
Can I change the licence? The important point is that anyone who receives a copy of the source code under a particular license has been granted that license. As long as the license has no revocation clause, that grant is permanent. See the related question on StackOverflow: What happens when an open source project changes its license?
The license of future versions can nevertheless be changed. You can swap the licence for the next version or release if all contributors (meaning copyright holders) agree, or have previously agreed to the possibility of doing so, usually via a contributor agreement. Search online for "contributor (license/assignment) agreement" for pointers on how to enable this up front. See also the FAQ on contributor agreements on opensource.org and the Civic Commons article on contributor agreements.
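The consent condition for relicensing can be expressed as a simple set check. This is an illustrative sketch only: the function and the contributor names are hypothetical, and it assumes every copyright holder must either have signed a contributor agreement beforehand or agree to the change now.

```python
def blocking_contributors(copyright_holders, signed_agreement, agreed_now):
    """Return the copyright holders whose consent is still missing."""
    cleared = set(signed_agreement) | set(agreed_now)
    return sorted(set(copyright_holders) - cleared)


# Hypothetical project with three copyright holders.
holders = ["alice", "bob", "carol"]
missing = blocking_contributors(holders,
                                signed_agreement={"alice"},
                                agreed_now={"carol"})
print("relicensing possible" if not missing else f"still needed: {missing}")
```

An empty result means the next release can carry the new licence; copies already distributed keep the license they were granted under.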
What does sublicensing mean?
Sublicensing allows somebody to relicense all or part of the licensed software, for example, to use BSD-licensed code in a closed-source commercial application.
"The basic idea ... is that if this is granted, a licensee can become a licensor of some of the rights of the grant they received regardless of any other claim they may have to copyright control over what they distribute." Source: https://programmers.stackexchange.com/questions/189633/what-sublicense-actually-means
Questions related to the selection and use of open source licenses can be posted to our mailing list. But remember that the list is public and its traffic is indexed by search engines.
- The Software Sustainability Institute (UK) provides information and answers to frequently asked questions. The questions dealt with include:
  - “Why is an open-source licence useful?”
  - “How can I tell the difference between open-source licences?”
  - “What happens if I am using someone else's code in my software?”
  - “What do I need to do before applying my choice of licence?”
- This Nature commentary article dispels common excuses for not publishing scientific code and argues that code should be published more frequently:
Barnes, Nick (13 October 2010). "Publish your computer code: it is good enough". Nature News. 467: 753–753. doi:10.1038/467753a. ISSN 0028-0836. Retrieved 2016-12-10.
- The following paper contains a good overview of different licenses and some of the questions one might ask when deciding on a license, from the point of view of a programming scientist:
Morin, Andrew; Urban, Jennifer; Sliz, Piotr (27 July 2012). "A quick guide to software licensing for the scientist-programmer". PLOS Computational Biology. 8: e1002598. doi:10.1371/journal.pcbi.1002598. ISSN 1553-7358. Retrieved 2016-12-10.
- TLDRlegal.com summarizes a wide range of popular licenses in an easy-to-read format
- Help for choosing a license for software from GitHub at ChooseALicense.com
- Opensource.org: lots of helpful information, especially in the FAQ (which provides, for instance, an answer to “Why not use CC for code?”)
- Open Source Licenses: wallow in the abundance of licenses out there, then come back to the FAQ and read “Which Open Source license should I choose to release my software under?”
[1] Morin, Andrew; Urban, Jennifer; Sliz, Piotr (26 July 2012). "A quick guide to software licensing for the scientist-programmer". PLOS Computational Biology. 8: e1002598. doi:10.1371/journal.pcbi.1002598. ISSN 1553-7358. Retrieved 2016-12-10.