-
Notifications
You must be signed in to change notification settings - Fork 0
/
choose_license.qmd
404 lines (271 loc) · 38.5 KB
/
choose_license.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
---
title: "Choose a License"
lang: en
bibliography: literature.bib
format:
html:
mermaid:
theme: neutral
engine: knitr
---
## TL;DR
Every expression of an idea, such as a literary or an artistic works, is automatically covered by copyright. This means that nobody else than the copyright holder is allowed to copy, modify, or share it. Copyright licenses can grant others some of the necessary rights, but mostly don't cover publicity, privacy, moral, patent, or trademark rights.
Researchers are only allowed to share material they have the necessary rights to, for example, through a free/open license. These licenses often require to attribute the authors, indicate whether changes were made, and provide the text of the license, among other things. In addition, researchers should share any work by their own under such a license.
_Choosing a license_ involves:
1. Recording the copyright notices and licenses for content by others (for example, in a file called `LICENSE.txt`) and following all their additional requirements.
2. In the simplest case, choosing two licenses (one suitable for code, one for everything else) which to apply simultaneously to all own content (for example, again by recording the license in the file `LICENSE.txt`). Recipients are then free to choose under which of the two licenses to use the material.
We recommend to choose [CC0\ 1.0](https://creativecommons.org/publicdomain/zero/1.0/) and [Apache\ 2.0](https://choosealicense.com/licenses/apache-2.0/) for all own works. Take your time to read the summaries behind the links and/or the actual license text to understand its legal effect.
You can adapt the following wording to your use case:
> Except where noted otherwise, all files in this project are made available under CC0\ 1.0 or (at your option) under the terms of the Apache Software License\ 2.0.
## A Primer on Licenses
Whenever you create a literary or artistic work (such as a text, image, video, or software), the copyright law in most countries limits other people from copying, modifying, and sharing it without your express permission. [If the work was created as part of your job, it might be your employer who holds the copyright to the work, depending on the country and contract.]{.aside} This even applies if you make it available to others (e.g., on your website): First and foremost they are not allowed to copy, modify, or share it. This legal default of "all rights reserved" was created to benefit publishers, not authors [@Fogel2006], and runs counter to many cultural and scientific processes. Copyright licenses enable authors to free up their works for reuse by others.
::: {.column-margin}
For the purpose of this tutorial, by _license_ we mean _copyright license_.
:::
A license is a legal document that regulates what _others_ are allowed to do with a copyright-protected work -- the licenses we discuss do not limit the author or copyright holder in their rights. And while you _could_ write your own, there are already many pre-formulated licenses available to choose from and to apply to your work.^[In fact, you should not create your own license if the purpose is to share your work freely and openly with others. By using a boilerplate license others know what to expect, also because they have been [tested](https://sfconservancy.org/copyleft-compliance/) in [courts](https://legaldb.creativecommons.org/).]
::: {#cau-exclusive-rights .callout-caution}
### Giving Publishers Exclusive Rights
When you have your article published under an open access license, some publishers still demand an exclusive publishing and distribution license or a copyright assignment. This would give them more rights than the readers of the article have through the chosen open access license [@Rumsey2022] and exceeds exceeds by far what is necessary to make publication possible [@Suber2022]. Consequently, authors should oppose this practice and grant publishers the same rights that every other reader of the article has. If your chosen publisher insists on an exclusive license, you may at least retain the copyright for your figures -- follow the guide "Retaining copyright for figures in academic publications to allow easy citation and reuse" by @Elson2016 to learn how to do that.
If you have published a closed-access paper before, you can consult [ShareYourPaper](https://shareyourpaper.org/) and [Dissemin](https://dissem.in/) for legal options to still make it available free of charge to readers.
:::
The licenses we discuss here mostly regard copyright. Therefore, recipients may lack other rights such as publicity, privacy, moral, patent,^[In fact, the software licenses we recommend on this page have been specifically selected to provide an express grant of patent rights.] or trademark rights. For example, sharing photos that depict people is not only a matter of copyright, but also of privacy rights. Conversely, the licenses do not apply if recipients are allowed to use the works for other reasons such as fair use, the right to quote, or because they made a different arrangement with the author.
## Which License to Choose for a Work?
Many boilerplate licenses are available to apply to your work. Which license is appropriate depends on several factors, including existing licenses in place and the type of work, but also your personal considerations. We strongly recommend to apply a _free/open_ license to your work, which means that the work "can be freely studied, applied, copied and/or modified, by anyone, for any purpose" [@Moller2015]. Importantly, this also means that others do not need to ask or notify the author and that they can use it for commercial purposes. By the choice of license, authors can, however, demand that they are credited, that the original license is indicated, that modifications are indicated, that derivative works are only shared under the same license, and that no further restrictions are imposed on the work. Software licenses may additionally require to make the source code available to everybody the software is shared with and often require to display the full text of the license upon usage. Because there are many free/open licenses available, the licenses discussed here only represent a recommended subset.
::: {.column-margin}
If you would like to choose a license not listed here, it should be appropriate for the type of work in question and be compatible with the dominant copyleft license in the respective community (see also @nte-silos). For software that's almost universally the [GPLv3](https://www.gnu.org/licenses/license-list.en.html#GPLCompatibleLicenses) and for writing, image, audio, and video that's mostly the [CC\ BY-SA\ 4.0](https://creativecommons.org/compatiblelicenses). For data, no dominant copyleft license has emerged yet, so any of [ODbL\ 1.0](https://opendatacommons.org/licenses/odbl/summary/), [CDLA\ Sharing\ 1.0](https://cdla.dev/sharing-1-0/), and CC\ BY-SA\ 4.0 are acceptable.
:::
::: {#nte-terminology .callout-note collapse="true"}
### A Note on Terminology
The terms _free_ and _open_, especially with regard to software, come with a history. The Free Software Foundation (FSF) was founded in 1985 to protect "four essential freedoms" [-@FSF2024] of a program's user. These are the freedoms to use, study, share, and improve a program. Software whose users legally and practically have these freedoms (because, among other things, they have access to its source code) is considered _free_. The four freedoms are seen as vital for a society as a whole in the sense that they enable sharing, cooperation and ultimately freedom in general. The FSF maintains [a list of software licenses](https://www.gnu.org/licenses/license-list.en.html#SoftwareLicenses) that it considers to be protecting the four freedoms. Sometimes the term _libre_ (Spanish and French for _free_) is used to make a distinction from _gratis_ software. You can learn more about free software at [Write Free Software](https://writefreesoftware.org/learn).
The Open Source Initiative (OSI), which was founded in 1998, follows a more pragmatic approach. It is concerned with developing high-quality software, for which everyone's ability to obtain, modify and contribute back the source code is considered beneficial. Access to the source code is one out of multiple conditions for software to be considered _open source_ by the OSI [-@OSI2007], which equally maintains [a list of approved licenses](https://opensource.org/licenses).
Access to the source code is a necessary, but not sufficient requirement both for _free software_ and for _open source software_. Conversely, both can (and frequently are) sold for money, as their respective criteria only apply once one has access to the software. Throughout this tutorial, we write "free/open license" to mean a license that is approved by either the FSF or the OSI. Software which is neither but makes available its source code is sometimes referred to as _source-available software_.
Apart from software, the term _open access_ has often been used for works that are available at no cost. For example, this is the commonality of bronze, green, hybrid, gold, and diamond/platinum open access _articles_, which otherwise vary in the rights that are granted to readers. In 2002, the Budapest Open Access Initiative declared that _open access_ additionally includes the right to use articles for any purpose, and in 2003, the Bethesda Statement and the Berlin Declaration added the right to make derivative works.
Two other notable definitions include the _Open Definition_ [@OKFN2016], which was first drafted in 2005, and the definition of _Free Cultural Works_ [@Moller2015], for which the open editing phase began in 2006. They are largely viewed as compatible with one another.
:::: {.column-margin}
![](images/Definition_of_Free_Cultural_Works.svg){width=250px}
::::
:::
### Existing License?
First, if you adapt (i.e., modify, build on) a work by others you need to determine if it is provided to you under a free/open license.[You can determine whether a license is free/open by searching for its name in the [SPDX License List](https://spdx.org/licenses/) and looking for at least one `Y` in the two columns _FSF Free/Libre?_ and _OSI Approved?_]{.aside} If yes, we recommend you to make your contribution available under the same license.^[Copyleft licenses even require you to choose the same or a compatible license.] For example, if you adapt code published in another paper, choose the same license for your modifications. The same applies if there are strong community norms to use a particular free/open license.^[Of course, this is only a heuristic and there might be good reasons to deviate from community norms.] Importantly, as discussed before, you are generally not allowed to adapt a work _not_ published under a free/open license.
### Work Type?
If you create a new work and no strong community norms suggest a particular license, you need to choose the license yourself. Which license to choose depends on the type of work you create. Software licenses, for example, may consider that the source code is the preferred form for making modifications, while licenses for data can differentiate between the database and any works produced from it. We have created a flowchart that covers the most likely types of works you will create as a researcher: software, writing (i.e., text), images, audio, video, and data (see @fig-flowchart-simple). This flowchart always recommends the most permissive license possible to maximize reuse -- below we provide two additional flowcharts that allow for more choices. Click on the name of a license to learn more about it.
::: {#tip-multi-licensing .callout-tip}
#### Multi-licensing
Sometimes, the type of a work is not obvious. For example, a Quarto document...
- ...contains both R code and writing, and
- ...may be distributed in the source format or as rendered document, possibly including images.
One may wonder which license to apply in this case, because Creative Commons licenses are not recommended for source code^[because (among other reasons) they explicitly disclaim any conveyance of patent rights] and applying software licenses to PDFs or images can lead to confusion or nuisance.^[because they often require to display the full text of the license]
One solution is to make such a work simultaneously available under two (or more) licenses, at the choice of the recipient: Either under a specified software license, or under a Creative Commons license^[that is, a license for writing, image, audio, and video]. This is called multi-licensing and makes it easier to reuse both the rendered document as well as the code. For example, one could write:
> The Quarto files in this project are made available under CC0\ 1.0 or (at your option) under the terms of the Apache Software License\ 2.0.
:::
Note, that for data you have, at least in principle, the option to apply different licenses to the individual entries and the collective database. For example, if you were to create a database of artworks by others, those artworks would be licensed individually as chosen by the artists, but the license for the database as a whole could be chosen by you. The latter includes the structure of the database (e.g., the selection of entries and field names) and any _sui generis_ database rights. However, if the content was created by you, we recommend you to choose the same license for both content and database. Metadata, in particular, should always be licensed under [CC0\ 1.0](https://creativecommons.org/publicdomain/zero/1.0/).
::: {#fig-flowchart-simple}
```{mermaid}
flowchart TB
start("We want<br>to choose<br>a license.") --"We adapted a work by<br>others shared under a<br>free/open license."--> use_existing_license["<em>Use its license</em>"]
start --"We created the work<br>entirely by ourselves."--> norm("Community norm<br>regarding license?")
norm --"Exists"--> follow_existing_norms["<em>Follow that norm</em>"]
norm --"Does not<br>exist"--> type("Work type?")
type --"Software"--> apache["Apache 2.0"]
type --"Writing, image, audio, video"--> cc0["CC0 1.0"]
type --"Data(base)"--> existing_license_data("Adapting individual<br>data entries by others?")
existing_license_data --"No, we created them<br>entirely by ourselves."--> cc0_data["CC0 1.0 <em>for database<br>and its content</em>"]
existing_license_data --"Yes, they were<br>shared under a<br>free/open license."--> use_existing_license_data["<em>Use their license<br>for content and</em><br>CC0 1.0 <em>for<br>the database</em>"]
click apache href "https://choosealicense.com/licenses/apache-2.0/"
click cc0 href "https://creativecommons.org/publicdomain/zero/1.0/"
click cc0_data href "https://creativecommons.org/publicdomain/zero/1.0/"
```
Flowchart for Choosing a License
:::
::: {#nte-other-work-types .callout-note collapse="true"}
#### Other Work Types
One should be cautious about the restrictions of licenses applied to the following types of works:
- __fonts:__ Copyleft licenses applied to fonts can a special case: If a font is put under the license [CC\ BY-SA\ 4.0](https://creativecommons.org/licenses/by-sa/4.0/), any documents containing texts using that font will probably be derivative works and have to be put under the same license if shared. If the intent is that only derived fonts, if published, have to be put under the same license, the [SIL Open Font License\ 1.1](https://openfontlicense.org/) is an appropriate choice. Note, however, that it doesn't require attribution for usage. If no copyleft mechanism is intended, [CC0\ 1.0](https://creativecommons.org/publicdomain/zero/1.0/) also works for fonts.
- __templates and LaTeX packages:__ If a template or, for this purpose, a LaTeX package, is licensed under a copyleft software license such as the [AGPLv3](https://choosealicense.com/licenses/gpl-3.0/), every work that is a derivative of the template or that uses the LaTeX package has to put under the same license, if shared [@Koppor2016]. And if a document contained source code covered by the GPLv3, the same license would also apply to the document [@MadHatter2022].
- __database content:__ If the work produced from a database (the "output") is a derivative of the content in the database, the output is subject to the restrictions laid out in the license. For example, if geospatial data were to be licensed under [CC\ BY\ 4.0](https://creativecommons.org/licenses/by/4.0/), all maps produced from the data would likely need to fulfill this license's obligation for unrestricted access if shared [see @Poole2017]. Similarly, following an example from @Matt2009 if one were to choose the copyleft license [CC\ BY-SA\ 4.0](https://creativecommons.org/licenses/by-sa/4.0/) for this purpose, any map that is a derivative of the data would also need to licensed under CC\ BY-SA\ 4.0 (or a compatible license) if shared. If the intention is to only have [derivative databases](https://osmfoundation.org/wiki/Licence/Community_Guidelines/Produced_Work_-_Guideline) under the same license, one might want to choose the [ODbL\ 1.0](https://opendatacommons.org/licenses/odbl/summary/) for the database, as it was specifically designed not to apply to works produced from the data in the database. Otherwise [CC0\ 1.0](https://creativecommons.org/publicdomain/zero/1.0/) is an excellent choice for data.
- __works in the public domain__: If a work is already in the public domain, it should be marked using the [PDM 1.0](https://creativecommons.org/publicdomain/mark/1.0/), rather than applying a waiver such as the [CC0\ 1.0](https://creativecommons.org/publicdomain/zero/1.0/) (or even another license).
:::
### Copyleft?
The consequence of applying a permissive license to your work is that others may distribute their modifications to it under stricter terms, possibly without a free/open license. You may prefer that everybody who receives somebody else's adaptations of your work enjoys the same rights that you granted to them. This can be achieved by _copyleft_ licenses.^[So called because copyleft licenses make content permanently free, thus turning copyright around: © → 🄯. In the realm of Creative Commons licenses, copyleft is also called _share alike_.] For software, they come in two flavors:
- __Weak copyleft licenses__ only require that modifications to the software itself are licensed under the same or a compatible license if shared. For example, if you create and publish software under a weak copyleft license, others who modify it and put their version on the internet have to apply the same license. However, people who merely _use_ your software in their own work which they make publicly available can choose any license for it.
- __Strong copyleft licenses,__ on the other side, insist that any larger works that use the copyleft-licensed software must also be licensed under the same or a compatible license if shared. For example, if your software were to be put under a strong copyleft license, everybody publishing software that uses your software would need to put it under the same license. Because of only few rulings in courts, the extent of this requirement is disputed [@Wikipedia2024].
Note, however, that the copyleft licenses we discuss here do not mandate sharing. Copyleft (and attribution) clauses are only triggered if the work is shared [@CC2015]. This means that if you only use a work internally, you do not need to share your derivative works. It is also worth reiterating that these licenses do not restrict the original author(s): They are still permitted to distribute their work under a different license and without sharing the source code.
::: {#tip-license-r .callout-tip}
#### Projects Involving R Code
In most cases, the output of software, like images or tables, does not depend on its license. Therefore, if you use an R package under a copyleft license to create a figure, you are likely the copyright owner. However, if the output is based on data, it can be considered a derivative work and the license of the data also applies. For example, maps may be considered as a derivative of the geographic data they are based on.
It is disputed whether software that uses an R package under AGPLv3 or GPLv3^[a variant of the AGPLv3 that does not cover software running as a service] can only be published under a GPL-compatible license -- or even has to be published under the same license. Posit, the company behind RStudio, does not believe that to be the case [see also @Wickham2023PackagesLicense].
You can learn which license an installed package uses via `packageDescription("<PACKAGE_NAME>", fields = "License")`. And to identify which licenses are being used by the R packages your project depends on, you can use the following code:
```r
deps <- renv::dependencies()$Package |>
unique() |>
pak::pkg_deps(dependencies = NA) |>
getElement("package")
unique(installed.packages(fields="License")[deps, "License"])
```
:::
We have prepared two advanced license flowcharts, one for software, writing, images, audio, and video in @fig-flowchart-non-data and one for data in @fig-flowchart-data where you can make additional choices. Note, however, that especially the advanced flowchart for licensing data is quite complex and we recommend you to seek legal counsel if you want to be sure.
### Attribution, State Changes, and Anti-DRM?
The advanced license flowcharts also allow you to make additional decisions:
- __Attribution__ means whether recipients of your work are required to provide attribution to you. For software licenses, this is called a copyright notice. Note that even if attribution is not a requirement of the license, good scientific practice demands that appropriate citations are made.
- __State changes__ means that recipients need to indicate if changes were made.
- __Anti-DRM__ means that when others share your work they are not allowed to apply technological measures restricting anything that the license permits. DRM is an abbreviation for **d**igital **r**ights **m**anagement.
From the Creative Commons licenses, only [CC0\ 1.0](https://creativecommons.org/publicdomain/zero/1.0/) does not require providing attribution. All software licenses in @fig-flowchart-non-data require providing attribution, although only [Apache\ 2.0](https://choosealicense.com/licenses/apache-2.0/) and [AGPLv3](https://choosealicense.com/licenses/agpl-3.0/) require others to indicate changes. For data, the [ODbL\ 1.0](https://opendatacommons.org/licenses/odbl/summary/) allows for technological measures that restrict the work only if a version of the database is provided in parallel without such measures.
::: {#nte-silos .callout-note collapse="true"}
### Other Restrictions
As indicated before, a free/open license must allow creating derivative works and must allow exercising the rights granted by it _for any purpose_, including commercial use. There are a few good resources on the reasoning behind that [e.g., see @Klimpel2013; @ODI2015; @Stallman2022NoLimit; @Moller2023], but we would like to highlight one reason in particular.
With the rights provided by free/open licenses comes the possibility to build on and combine multiple works by different authors, which is essential for any cultural and scientific activity. However, this is only possible if the various licenses involved are compatible with each other. For example, one is allowed to remix two figures if the first is licensed under CC\ BY\ 4.0 and the second under CC\ BY-SA\ 4.0 because the licenses were written to be compatible with each other [compare @CC2024FAQ]. Also note that CC\ BY-SA\ 4.0 is [one-way compatible](https://creativecommons.org/compatiblelicenses) with the GPLv3, which in turn is compatible with the AGPLv3.
However, applying a restriction such as only permitting non-commercial use or not allowing derivative works creates _silos_ of works which are mutually incompatible with each other. Put differently, one cannot share a remix of two works where one is licensed under CC\ BY-SA\ 4.0 and the other only allows non-commercial use. In order to avoid silos, one should only choose license which are compatible with the dominant copyleft license in the respective community [@Lammerhirt2017; @Wheeler2014]. If you would like to learn more about the different types of compatibility, we recommend you to read the article "A Quick Guide to Software Licensing for the Scientist-Programmer" by @Morin2012. The following diagram provides an overview of the compatibility of various licenses:
!["Open data-capable license interoperability" by Robbie Morrison licensed under [CC\ BY\ 4.0](https://creativecommons.org/licenses/by/4.0/). Taken without modification from @Morrison2024](images/Morrison2024.jpg){.lightbox}
:::
## Which Licenses to Choose for a Project?
So far, we only discussed how to choose a license for works _of one type_. But what if you want to share a project with all kinds of files? For example, the project from this tutorial (among other things) includes a data file, a manuscript file with intermingled code and writing, and an R file. And what if you also want to share files by others, as is the case with `apaquarto` which you may have installed in this project?
The answer is that you need to indicate the license on a per-file or per-folder basis (rather than choosing one for the whole project). The easiest approach is to make note of every foreign work included in your project and record its license. Then, multi-license all the remaining files, which are yours, under a code and a non-code license in parallel as explained in @tip-license-r.
## Applying the License
Having selected the licenses of your choice -- again, you might need multiple ones depending on the types of works your project contains --, we encourage you to read through the full license text (or at least a legal summary) to understand their effect. Then, you can record the license of existing content and apply the licenses of your own contributions. Mostly, this just means indicating which license applies to which file or folder, usually in the project's README (among other places), whose creation will be discussed [later](make_readme.qmd).
::: {#cau-license-versions .callout-caution}
### License Versions Are Important
You may have noticed that we mostly refer to licenses using a name _and_ a version number. This is because the organizations that created the licenses sometimes publish updated versions to accommodate for developments in copyright law and the communities that use the licenses. For example, the Creative Commons licenses (that start with `CC`) were first published in 2002. Since then, the possibility to relicense under compatible licenses has been added ([v3.0](https://creativecommons.org/2007/02/23/version-30-launched/)), a 30-day window to correct license violations has been established to combat [copyleft trolls](https://commons.wikimedia.org/wiki/Commons:Copyleft_trolling), and _sui generis_ database rights are covered explicitly ([v4.0](https://creativecommons.org/version4/)). There are many more [subtle differences between license versions](https://wiki.creativecommons.org/wiki/License_Versions), therefore it is important to indicate which license version exactly one is referring to, as the license of a work does not "update" automatically.
For the AGPLv3 it is even recommended to state whether a work is licensed under exactly the indicated version of the license or, alternatively, also under newer versions of the license [@Stallman2022Version].
:::
For example, the `apaquarto` extension that you included in your project is a work by others.^[Unless you are the author of the `apaquarto` extension.] You need to indicate its license so that others know what they are allowed to do -- and, of course, you need to comply with any terms yourself, such as retaining the copyright notice.^[In this particular case, `apaquarto` is licensed under [CC0\ 1.0](https://creativecommons.org/publicdomain/zero/1.0/), which does not require you to retain a copyright notice. In fact, it's an extremely permissive license.] In contrast, if it were not for this tutorial, the manuscript would contain your own work and you would need to indicate under which license you provide it to others.
::: {#tip-license-help .callout-tip}
### Follow the Help Provided by the License Authors
For all the licenses recommended in this tutorial, the organizations that created these licenses provide more information on how to apply them to your work:
::: {layout-ncol=2}
:::: {}
- [Apache\ 2.0](https://www.apache.org/licenses/LICENSE-2.0#apply)
- [MPL\ 2.0](https://www.mozilla.org/MPL/2.0/FAQ/#license-use)
- [AGPLv3](https://www.gnu.org/licenses/gpl-howto.html)
::::
:::: {}
- [CC licenses](https://creativecommons.org/faq/#how-do-i-apply-a-creative-commons-license-to-my-material)
- [ODbL\ 1.0](https://opendatacommons.org/licenses/odbl/)
::::
:::
Creative Commons even provides a range of considerations for licensors and licensees [@CC2013] and an [interactive chooser](https://creativecommons.org/choose/) which you can use to create text snippet that you can copy and paste to the desired location.
:::
While it is common to state the chosen license(s) in the README, usually one of the following actions is taken in addition:
In the simplest case, one just creates a file called `LICENSE.txt` where the full text of the license is copied verbatim. This is a practice propagated by GitHub, which provides instructions for and comparisons of many licenses via [ChooseALicense.com](https://choosealicense.com/licenses/). However, if the project is not completely covered by one single license, this practice may become unwieldy. For example, if a project contains different types of works by different authors, the `LICENSE.txt` needs to detail which file is covered by which license(s), along with any copyright notices.
Individual programming languages also have their own way of stating which license a package is distributed under. For R packages, this is usually set by the field `License` in the file `DESCRIPTION` [@Wickham2023PackagesLicense].
Finally, @nte-reuse explains how to use the [REUSE](https://reuse.software/) specification to make the choice of license machine-readable. This is the approach we recommend taking.
::: {#nte-reuse .callout-note collapse="true"}
### Using REUSE to Record Licenses
Every major free/open license has a unique [SPDX identifier](https://spdx.org/licenses/) which allows communicating the license choice unequivocally. We will be using that to indicate the license for every file in your project, along with the year of publication and the copyright holder. To do this, we add a comment to the beginning of every file and include the two tags `SPDX-FileCopyrightText` and `SPDX-License-Identifier`. How this works depends on the file type, as the syntax for a comment varies.
For example, if you previously created the file `create_data_dictionary.R`, you can now add the following comment to the beginning of the file, replacing `<YEAR>` and `<NAME>` with the current year and your name -- of course, you can also choose a different license:
```{.r filename="create_data_dictionary.R"}
# SPDX-FileCopyrightText: <YEAR> <NAME>
#
# SPDX-License-Identifier: CC0-1.0
```
You need to use `#` to start the comment because this is the symbol that starts comment lines in R scripts. Alternatively, you can use the [reuse tool](https://github.com/fsfe/reuse-tool) to add these information for you. After installing it with...
```{.bash filename="Terminal"}
pipx install reuse
```
...you can add the copyright information using the following command -- the current year will be added automatically:
```{.bash .code-overflow-wrap filename="Terminal"}
reuse annotate --copyright="<NAME>" --license="CC0-1.0" create_data_dictionary.R
```
In many cases, the reuse tool will figure out the appropriate comment style for you. If this is not the case, as currently with Quarto files, you can tell it directly which comment style to use (`html` in this case):
```{.bash .code-overflow-wrap filename="Terminal"}
reuse annotate --copyright="Josephine Zerna <[email protected]>" --copyright="Christoph Scheffel <[email protected]>" --copyright="Florian Kohrt" --license="CC-BY-4.0" --style=html Manuscript.qmd
```
This adds the following header to `Manuscript.qmd`:
```{.html filename="Manuscript.qmd"}
<!--
SPDX-FileCopyrightText: 2024 Christoph Scheffel <[email protected]>
SPDX-FileCopyrightText: 2024 Florian Kohrt
SPDX-FileCopyrightText: 2024 Josephine Zerna <[email protected]>
SPDX-License-Identifier: CC-BY-4.0
-->
```
Note that `Manuscript.qmd` was provided to you under [CC\ BY\ 4.0](https://creativecommons.org/licenses/by/4.0/), which is what you indicate with the previous comment. If you edited the file, you may also add yourself.^[For licenses that require that modifications are indicated, this is an easy way to comply with them, although you do not need to provide your real name.]
Sometimes, there are file types which do not allow for adding the license information inside them, such as PDF and CSV files. For these, a corresponding `.license` file can be created. Try the following command which indicates that the data were published under [CC0\ 1.0](https://creativecommons.org/publicdomain/zero/1.0/):
```{.bash .code-overflow-wrap filename="Terminal"}
reuse annotate --copyright="Kristen Gorman" --license="CC0-1.0" Data.csv
```
You will notice that this creates another file called `Data.csv.license` containing the relevant information:
```{.yml filename="Data.csv.license"}
SPDX-FileCopyrightText: 2024 Kristen Gorman
SPDX-License-Identifier: CC0-1.0
```
If you want to indicate the license for all files in a particular folder, you can create a file called `REUSE.toml` and add an `[[annotations]]` table for them:
```{.toml filename="REUSE.toml"}
version = 1
# apaquarto extension from https://github.com/wjschne/apaquarto
[[annotations]]
path = "_extensions/wjschne/apaquarto/*"
SPDX-FileCopyrightText = "2024 William Joel Schneider <[email protected]>"
SPDX-License-Identifier = "CC0-1.0"
```
Finally, there may be some minor files which are build artifacts. You can either add them to your `.gitignore` file or use the CC0\ 1.0 license/waiver with a copyright tag such as `SPDX-FileCopyrightText: NONE` to assert that there is no copyright holder. For more information, also discussing other corner cases, you can read their [Frequently Asked Questions](https://reuse.software/faq/).
Once you are done, you can download the texts of all indicated licenses using...
```{.bash filename="Terminal"}
reuse download --all
```
...and verify that you did not miss a file by running...
```{.bash filename="Terminal"}
reuse lint
```
:::
Regardless of how exactly the licenses are added to the project, this is a good opportunity to verify one last time that all third party content is provided to you under a free/open license and that you comply with it. Please add a license to your project now, either creating a file `LICENSE.txt` or following the REUSE standard.
## Wrap-up
If you would like to learn more about copyright and licenses you might find the following resources interesting:
- "Open Content -- A Practical Guide to Using Creative Commons Licences" by @Kreutzer2014
- "Creative Commons Certificate for Educators, Academic Librarians, and Open Culture" by @CC2024Certificate
- "Freie Software -- Zwischen Privat- und Gemeineigentum" by @Grassmuck2004
## Additional Figures {.appendix}
::: {#fig-flowchart-non-data}
```{mermaid}
flowchart TB
start("We want to choose a<br>license for software,<br>writing, image, audio,<br>or video.") --"We adapted a work by<br>others shared under a<br>free/open license."--> use_existing_license["<em>Use its license</em>"]
start --"We created the work<br>entirely by ourselves."--> norm("Community norm<br>regarding license?")
norm --"Exists"--> follow_existing_norms["<em>Follow that norm</em>"]
norm --"Does not<br>exist"--> type("Work type?")
type --"Software"--> code_sa("Attribution?<br>State changes?<br>Copyleft?")
type --"Writing, image, audio, video"--> nocode_cc("Attribution?<br>State changes?<br>Anti-DRM?<br>Copyleft?")
code_sa --"Attribution &<br>State changes"--> apache["Apache 2.0"]
code_sa --"Attribution &<br>Weak copyleft"--> mpl["MPL 2.0"]
code_sa --"Attribution &<br>State changes &<br>Strong copyleft"--> agpl["AGPLv3"]
nocode_cc --"Neither"--> cc0["CC0 1.0"]
nocode_cc --"Attribution &<br>State changes &<br>Anti-DRM"--> cc_by["CC BY 4.0"]
nocode_cc --"Attribution &<br>State changes &<br>Anti-DRM &<br>Copyleft"--> cc_by_sa["CC BY-SA 4.0"]
click apache href "https://choosealicense.com/licenses/apache-2.0/"
click mpl href "https://choosealicense.com/licenses/mpl-2.0/"
click agpl href "https://choosealicense.com/licenses/agpl-3.0/"
click cc0 href "https://creativecommons.org/publicdomain/zero/1.0/"
click cc_by href "https://creativecommons.org/licenses/by/4.0/"
click cc_by_sa href "https://creativecommons.org/licenses/by-sa/4.0/"
```
Note: DRM = digital rights management
Advanced License Flowchart for Software, Writing, Images, Audio, and Video
:::
::: {#fig-flowchart-data}
```{mermaid}
flowchart TB
start("We want to choose a<br>license for data.") --"We adapted a database by<br>others shared under a<br>free/open license."--> use_existing_license_db["<em>Use its license(s)<br>for content and database</em>"]
start --"We created a database<br>entirely by ourselves."--> norm("Community norm<br>regarding license?")
norm --"Exists"--> follow_existing_norms["<em>Follow that norm</em>"]
norm --"Does not<br>exist"--> existing_license_content("Adapting content<br> by others?")
subgraph "License for individual data entries (content)"
existing_license_content --"No, we created the content<br>entirely by ourselves."--> metadata("Is metadata?")
existing_license_content --"Yes, it was<br>shared under a<br>free/open license."--> use_existing_license_content["<em>Use that license</em>"]
metadata --"Yes"--> cc0_content_metadata["CC0 1.0"]
metadata --"No"--> choose_license["<em>Consult flowchart for<br>software, writing,<br>image, audio, and video</em>"]
end
subgraph "License for combination of data (database)"
choose_license --> switch_license["<em>Depending on content license</em>"]
use_existing_license_content --> switch_license
cc0_content_metadata --> cc0_db["CC0 1.0"]
switch_license --"CC0 or<br>non-CC license"--> sa("Attribution?<br>Anti-DRM?<br>Copyleft?")
switch_license --"CC BY or<br>CC BY-SA"--> same["<em>Same license for DB</em>"]
sa --"Neither"--> cc0_db
sa --"Attribution &<br>Anti-DRM &<br>Copyleft"--> odbl["ODbL 1.0"]
%% the following link is only added to have terminal nodes on the same level
sa ~~~ same
end
click cc0_content_metadata href "https://creativecommons.org/publicdomain/zero/1.0/"
click cc0_db href "https://creativecommons.org/publicdomain/zero/1.0/"
click odbl href "https://opendatacommons.org/licenses/odbl/summary/"
```
Advanced License Flowchart for Data(base)
:::