Why should you pick a license?
In other words, why not just put code online without a license? A license clarifies the conditions under which your code can be re-used. In the absence of a license, the author still retains copyright, and the conditions under which the code can be used are unclear. A sentence like "feel free to use this code" does not improve the situation, because it does not clarify to what extent any possible user should "feel free" to use it. Standard licenses provide pre-defined sets of standard conditions, which both providers and users only have to understand once, and can then immediately recognise or categorise later.
Also see "Why is an open-source licence useful?" on the Software Sustainability Institute's Choosing an open-source licence page.
Selecting a license
The most common licenses for a given artifact can be determined by its type: code, data, or any other generic digital "creative work" (documentation, reports, figures). For any given project, its components can be licensed independently by type.
At the most basic level, one must decide whether to use a copyleft license or a more permissive license. Copyleft ensures that code changes by any future contributors stay public, whereas permissive licenses only require attribution in derived works.
With these two distinctions (type of work, type of license), the following decision matrix can be drawn:
- Code, copyleft: GPL licenses
- Code, permissive: MIT, BSD, Apache
The following sections elaborate on both options for each type of work: code, data, and other.
The following minimal questions can give a guideline on whether one should choose a copyleft or a permissive license for a project. The Wikipedia article on Free software licenses gives a more in-depth overview on both types and their properties.
- Are you ok with your code becoming part of a closed-source commercial software product?
  - No: GPL
  - Yes: permissive licenses (MIT/BSD/Apache)
- Do you want to force users to publish their improvements to your software, or to software they develop based on your software, under the same licence?
  - No: permissive licenses (MIT/BSD/Apache). This makes the code more broadly usable, but also allows people to take the code without sharing their improvements.
  - Yes: GPL. This ensures that any future changes/improvements to the code remain free and open.
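The two questions above can be sketched as a small decision helper. This is only an illustration of the guideline, not legal advice: the function name and its parameters are hypothetical, and the rule that either "copyleft" answer decides the outcome is an assumption made here to resolve contradictory answers.

```python
def suggest_license_family(allow_closed_source_use: bool,
                           require_share_alike: bool) -> str:
    """Map the two guideline questions to a license family.

    allow_closed_source_use: are you ok with your code becoming part of a
        closed-source commercial software product?
    require_share_alike: do you want improvements and derived software to
        be published under the same licence?
    """
    # Assumption: if either answer points to copyleft, suggest copyleft,
    # because a permissive license cannot enforce either wish afterwards.
    if require_share_alike or not allow_closed_source_use:
        return "copyleft (GPL)"
    return "permissive (MIT/BSD/Apache)"


if __name__ == "__main__":
    print(suggest_license_family(allow_closed_source_use=True,
                                 require_share_alike=False))
    print(suggest_license_family(allow_closed_source_use=False,
                                 require_share_alike=True))
```

The helper only picks a family; choosing a specific license within that family still depends on the details discussed in the sections on each type.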
Developers who use a GPL license must make the source code available if they share or sell an application built upon it. In this case, the source code must also contain any changes the developers may have made. If GPL code is used but not shared or sold, the code need not be published and any changes may remain private. An important consequence is that software written under such a license cannot be included in "non-free" software. The GPL licenses comprise:
- GPL: the basic GPL license
- LGPL: Lesser GPL. Permits non-free software to link to the LGPL-licensed software, which the GPL does not
- AGPL: GNU Affero GPL. This closes a loophole in the GPL that permits somebody to operate a web application that uses GPL code, without making the code available to users
More information, geared towards the use of copyleft licenses for code, can be found in the article How to choose a license for your own work on gnu.org.
The article Why you should use a BSD style license for your Open Source Project on freebsd.org is recommended. The permissive licenses allow code to be re-used without restrictions, including the possibility of building commercial software for which the new code is no longer made publicly available.
The site choosealicense.com has a great three-column summary of the differences between the major open source licenses. Summarizing their explanations (with links to their license description pages):
- The MIT license is a simple permissive license
- The Apache License adds a term explicitly granting the patent rights of code contributors to the users of such code
- The BSD license, or its newer and even shorter successor ISC, is virtually identical to MIT, but more concisely worded
One important aspect of all these licenses is the liability clause. This clause prohibits any code user from holding the code producer liable for any damages arising from the use of the software code.
Data in this context includes databases holding organized datasets. Special copyright rules apply to databases, which can be protected in their own right, irrespective of the status of the data they contain.
The go-to solution for data licenses comes from Open Data Commons, which has created three licenses (PDDL, ODC-By, and ODbL) and a nice 2-minute guide on the how and why of data licensing.
- Creative Commons licenses have been reworked since version 2 to include provisions which make them suitable for databases and datasets as well. Please refer to the section "Other" below for their properties. See also the related question are CC licenses suitable for data? on StackExchange.
If you would like your model documentation to be used verbatim on Wikipedia, you need to use one of these Creative Commons licenses:
- CC BY, all versions and ports up to and including 4.0
- CC BY-SA, versions 1.0, 2.0, 2.5, and 3.0
However, CC BY-SA 4.0 is not compatible with Wikipedia. For more information, please see the Wikipedia compatibility table. The Wikimedia Foundation ran a community discussion on an upgrade to a CC BY-SA 4.0 license in November 2016, but the outcome has not been announced. If this exercise is successful, the compatibility problem will disappear.
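The compatibility rules listed above can be written down as a lookup table. This is a minimal sketch that encodes only the versions named in this section; the names `WIKIPEDIA_COMPATIBLE` and `usable_verbatim_on_wikipedia` are hypothetical, and the Wikipedia compatibility table remains the authoritative source for the current status.

```python
# Versions as listed in this section; anything not listed is treated
# as incompatible (an assumption of this sketch).
WIKIPEDIA_COMPATIBLE = {
    "CC BY": {"1.0", "2.0", "2.5", "3.0", "4.0"},
    "CC BY-SA": {"1.0", "2.0", "2.5", "3.0"},
}


def usable_verbatim_on_wikipedia(license_name: str, version: str) -> bool:
    """Return True if the license/version pair appears in the list above."""
    return version in WIKIPEDIA_COMPATIBLE.get(license_name, set())


if __name__ == "__main__":
    for name, version in [("CC BY", "4.0"), ("CC BY-SA", "3.0"),
                          ("CC BY-SA", "4.0"), ("CC BY-NC", "4.0")]:
        print(name, version, usable_verbatim_on_wikipedia(name, version))
```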
GNU also recommends not using the following licenses for free documentation:
- a CC-NC (noncommercial) license of any form
- a CC-ND (no derivatives) license of any form
For more information, please see the GNU page on nonfree documentation licenses.
The Creative Commons family of licenses is probably the most widely known. As of September 2015, the fourth version of these licenses has been published. First, the two "free licenses" are presented:
- CC BY, short for attribution ("by" refers to the author), requires that the source be named when sharing the original work or a derived work
- CC BY-SA is similar to the GPL as it requires any derived works to be published under a "compatible" (easiest is the same) license. It is the license used by Wikipedia. Version 4.0 is one-way compatible with GPLv3: works under CC BY-SA 4.0 can be adapted into GPLv3-licensed projects, but not the reverse.
Only the two previous CC licenses qualify as "free" licenses, as they do not restrict what a user may do with the licensed work.
- CC NC, the non-commercial option allows one to prohibit any "commercial" usage of their work. However, it is notoriously hard to define what exactly constitutes a commercial activity. This issue is explained in detail in the brochure Consequences, risks and side-effects of the license module "non-commercial use only". In any case, the two resulting licenses are CC BY-NC and, with the share-alike option, CC BY-NC-SA.
- CC ND, the no-derivatives option, is available if you do not want changes or improvements to your work to be shared or redistributed. Its main use is by musicians who want their music to be shared by listeners, but do not wish to allow remixes or cover versions. The resulting licenses are CC BY-ND or, by combining it with the NC tag, CC BY-NC-ND.
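How the options combine into the six license names mentioned above can be sketched as follows. The function names are hypothetical, and rejecting the ND plus SA combination is an assumption drawn from the fact that no such license appears among the six listed here (share-alike only concerns derived works, which ND forbids).

```python
def cc_license_name(non_commercial: bool = False,
                    no_derivatives: bool = False,
                    share_alike: bool = False) -> str:
    """Build a Creative Commons license name from the chosen options."""
    if no_derivatives and share_alike:
        # Share-alike governs derived works, which ND does not permit.
        raise ValueError("ND and SA cannot be combined")
    parts = ["BY"]  # attribution is part of all six licenses
    if non_commercial:
        parts.append("NC")
    if no_derivatives:
        parts.append("ND")
    if share_alike:
        parts.append("SA")
    return "CC " + "-".join(parts)


def is_free_license(name: str) -> bool:
    """Only CC BY and CC BY-SA qualify as "free" licenses in this text."""
    return name in {"CC BY", "CC BY-SA"}


if __name__ == "__main__":
    name = cc_license_name(non_commercial=True, share_alike=True)
    print(name, is_free_license(name))
```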
For inspiration, the Creative Commons homepage lists a host of example projects licensed under all different licenses to give an impression of the context that triggered various individuals to select a particular license.
Why is it a bad idea to use Creative Commons licenses for code?
The long answer is given in Why is CC BY-SA discouraged for code? on StackExchange.
The short answer is that the CC licenses are written for creative works (music, film, texts), not software code. Therefore, they leave many important details (what constitutes a "derived" work, is linking allowed, what about patents?) unspecified. The conclusion: don't use CC licenses for code.
At what point should I choose or change a license?
You should determine and add a licence prior to the first release of your project. Ideally that would include “small” releases within your research institute or organization.
Can I change the licence? The important thing is that, when you receive a copy of source code with a particular license, you have been granted that license. So as long as the license does not have a revocation clause, it is permanent. See the related question on StackOverflow: What happens when an open source project changes its license?
A license can be changed. You can change the licence for the next version or release if all contributors (meaning copyright holders) agree or have previously agreed on the possibility of doing so, usually via a contributor agreement. Search for "contributor (license/assignment) agreement" online for pointers on how to enable this up front. See also the FAQ on contributor agreements on opensource.org and the Civic Commons article on contributor agreements.
What does sublicensing mean?
Sublicensing allows somebody to relicense all or part of the licensed software, for example, to use BSD-licensed code in a closed-source commercial application.
"The basic idea [...] is that if this is granted, a licensee can become a licensor of some of the rights of the grant they received regardless of any other claim they may have to copyright control over what they distribute." Source: https://programmers.stackexchange.com/questions/189633/what-sublicense-actually-means
- The Software Sustainability Institute (UK) provides information and answers to frequently asked questions. The questions dealt with include:
  - “Why is an open-source licence useful?”
  - “How can I tell the difference between open-source licences?”
  - “What happens if I am using someone else's code in my software?”
  - “What do I need to do before applying my choice of licence?”