-
Notifications
You must be signed in to change notification settings - Fork 4
/
Part1_DummyModel.qmd
842 lines (619 loc) · 39.3 KB
/
Part1_DummyModel.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
# Introducing `SpaDES` with a dummy ecological model {#sec-part1}
```{r}
#| echo: false
#| cache: false
source("_common.R")
```
Authors: Ceres Barros, Tati Micheletti
```{r setup}
#| include: false
#| cache: false
projDir <- getwd() ## save this for later
rmdDir <- "~/SpaDES4Dummies_Part1"
dir.create(rmdDir)
knitr::opts_knit$set(
root.dir = rmdDir
)
options("knitr.graphics.rel_path" = FALSE,
"spades.useRequire" = FALSE)
```
Let's imagine we want to explore how the relationship between a species' abundance and temperature changes over time.
Both the abundance data and the temperature data are being constantly updated by a simulation model, and we want to analyse the relationship between the two iteratively, without needing to manually run a script to account for the newly generated data inputs.
## BEFORE `SpaDES`
If we use R to develop our species abundance and temperature simulation models in the 'conventional way', we'll probably have i) (the worst case scenario) several scripts that run simulations and data treatment/analysis separately and have to be executed manually, or ii) a long script where everything happens - the simulations and data analysis -, iii) a main script that sources others that do the simulation and analyses.
Option i) is more common when different software are used for different parts of the process (e.g., a simulation model in *C++* generates data that is then analysed in R).
Option ii) is inconvenient because very long scripts make changes and updates to the script - debugging can also be more tiresome.
Option iii), is similar to the `SpaDES` way of thinking.
The difference is that `SpaDES` defines a standard way of writing different components of a model, or of a modelling framework.
This makes changing, updating and sharing code - or modules - easier, as well as swapping and adding modules in a modelling framework.
The example below is so minimal that it is unlikely to show the full benefits of using `SpaDES` - the same could be accomplished with a fairly short script.
However, it introduces the different parts of a module and how to link modules.
[Chapter 2](Part2_SDMs.qmd) goes a step further and uses real datasets to project species presences across a landscape in Canada.
In [Chapter 2](Part2_SDMs.qmd), we introduce `SpaDES` features that we most commonly use in our work (e.g., caching and spatial data processing) and provide some coding best practices that we use ourselves (e.g., code assertions).
### Setup
This is what you'd normally do... Install all the packages in some way that you probably didn't record in your scripts and then start your script with loading the packages:
```{r loadPkgs}
#| eval: false
#| message: false
#| warning: false
#| results: "hide"
#| class-source: fold-show
## please start from a clean R session
remotes::install_github("ropensci/NLMR") ## you will need this ;)
library(terra)
library(quickPlot)
library(ggplot2)
library(ggpubr)
```
```{r loadPkgsHidden}
#| include: false
#| cache: false
library(terra)
library(quickPlot)
library(ggplot2)
library(ggpubr)
stopifnot("NLMR" %in% installed.packages())
```
And now create a raster template:
```{r the_r_way_raster}
#| message: false
#| warning: false
#| class-source: fold-show
r <- rast(nrows = 100, ncols = 100, xmin = -50, xmax = 50, ymin = -50, ymax = 50)
```
### Species abundance "simulations"
Our VERY simple "simulation" model (in form of a function) generates rasters that follow a Gaussian distribution
```{r the_r_way_simulation_function_AB}
#| class-source: fold-show
abundance_model <- function(ras, Time) {
abund_outputs <- list()
for (t in 1:Time) {
abund_outputs[[t]] <- NLMR::nlm_mpd(
ncol = ncol(ras),
nrow = nrow(ras),
resolution = unique(res(ras)),
roughness = 0.5,
rand_dev = 100,
rescale = TRUE,
verbose = FALSE
) |>
rast()
}
return(abund_outputs)
}
```
Set the length of the simulation (or simply the number of model iterations), run it and plot results (all ABUNDANCE plots together):
```{r the_r_way_simulation_length}
#| message: false
#| class-source: fold-show
Time <- 10
abundance <- abundance_model(ras = r, Time = Time)
dev()
plot(rast(abundance))
```
### Temperature "simulations"
The temperature simulation model will be similar to the vegetation one - remember this is a dummy example.
```{r the_r_way_simulation_function_TMP}
#| class-source: fold-show
temperature_model <- function(ras, Time) {
temp_outputs <- list()
for (t in 1:Time) {
temp_outputs[[t]] <- NLMR::nlm_mpd(
ncol = ncol(ras),
nrow = nrow(ras),
resolution = unique(res(ras)),
roughness = 0.5,
rand_dev = 10,
rescale = FALSE,
verbose = FALSE
) |>
rast()
}
return(temp_outputs)
}
```
Run the model and plot results (all temperature plots together)
```{r the_r_way_plot_results_TMP}
#| class-source: fold-show
temperature <- temperature_model(ras = r, Time = Time)
plot(rast(temperature))
```
### Data analysis
Now we analyse if species abundance and temperature are correlated.\
First, we create the data analysis function (a simple linear model):
```{r data_analysis_model}
#| class-source: fold-show
stats_analysis <- function(Data) {
if (all(c("abund", "temp") %in% colnames(Data))) {
lm1 <- lm(abund ~ temp, data = Data)
ggplot(Data) +
geom_point(aes(x = temp, y = abund)) +
geom_abline(intercept = lm1$coefficients["(Intercept)"],
slope = lm1$coefficients["temp"], size = 2, col = "blue") +
theme_bw() +
labs(x = "Temp.", y = "Species abundance")
} else {
stop("Data must contain 'abund' and 'temp' columns")
}
}
```
Then we create a loop to analyse each plot of our time-series:
```{r data_analysis}
#| class-source: fold-show
lmPlots <- list()
for (t in 1:Time) {
outputdata <- data.frame(abund = as.vector(abundance[[t]][]), temp = as.vector(temperature[[t]][]))
lmPlots[[t]] <- stats_analysis(Data = outputdata)
}
ggarrange(plotlist = lmPlots)
```
## AFTER `SpaDES`
### The control script
Let us now solve the same problem using the `SpaDES` approach.[^part1old-1]
We start by creating an *.R* script (it can have any name) that sets up and runs the `SpaDES` model.
The control script for this example is located on the root of the *SpaDES4Dummies* GitHub repository under the name `Part1_DummyModel.R`. Note that Markdown (`.Rmd`) scripts can also be used instead of \`.R\` scripts.
We start by making sure all `SpaDES` packages and their dependencies are installed (and that the installation is *scripted*) using the `Require` package.
```{r globalscriptPart1}
#| eval: false
#| message: false
#| warning: false
#| results: hide
## start again from a clean R session
options(repos = c("https://predictiveecology.r-universe.dev/",
CRAN = "https://cloud.r-project.org"))
## decide where you're working
mainPath <- "~/SpaDES4Dummies_Part1"
pkgPath <- file.path(mainPath, "packages", version$platform,
paste0(version$major, ".", strsplit(version$minor, "[.]")[[1]][1]))
dir.create(pkgPath, recursive = TRUE)
.libPaths(pkgPath, include.site = FALSE) ## install packages in project library (proj-lib)
if (!"remotes" %in% installed.packages(lib.loc = pkgPath))
install.packages("remotes")
if (!"Require" %in% installed.packages(lib.loc = pkgPath) ||
packageVersion("Require", lib.loc = pkgPath) < "0.3.1.9015") {
remotes::install_github("PredictiveEcology/Require@2788b023ad191c29346ef8c64df71b937be307e2",
upgrade = FALSE)
}
Require::Require(c("SpaDES", "DiagrammeR"), require = FALSE, upgrade = FALSE,
dependencies = TRUE, standAlone = TRUE) ## automatically downloads all packages in the SpaDES family and their dependencies
library(SpaDES)
setPaths(cachePath = file.path(mainPath, "cache"),
inputPath = file.path(mainPath, "inputs"),
modulePath = file.path(mainPath, "modules"),
outputPath = file.path(mainPath, "outputs"))
getPaths() ## check that this is what you wanted
## Let's create a self-contained module that will simulate the species' abundance for any given period of time and frequency.
if (!dir.exists(file.path(getPaths()$modulePath, "speciesAbundance"))) {
newModule(name = "speciesAbundance", path = getPaths()$modulePath)
}
```
```{r globalscriptPart1Hidden}
#| include: false
#| cache: false
## for faster rendering use all .libPaths to avoid installing all packages.
options(repos = c("https://predictiveecology.r-universe.dev/",
CRAN = "https://cloud.r-project.org"))
mainPath <- "~/SpaDES4Dummies_Part1"
pkgPath <- file.path(mainPath, "packages", version$platform,
paste0(version$major, ".", strsplit(version$minor, "[.]")[[1]][1]))
dir.create(pkgPath, recursive = TRUE)
.libPaths(pkgPath)
if (!"remotes" %in% installed.packages(lib.loc = pkgPath))
install.packages("remotes")
if (!"Require" %in% installed.packages(lib.loc = pkgPath) ||
packageVersion("Require", lib.loc = pkgPath) < "0.3.1.9015") {
remotes::install_github("PredictiveEcology/Require@2788b023ad191c29346ef8c64df71b937be307e2",
upgrade = FALSE)
}
Require::Require(c("SpaDES", "DiagrammeR"), require = FALSE, upgrade = FALSE,
dependencies = TRUE)
library(SpaDES)
setPaths(cachePath = file.path(mainPath, "cache"),
inputPath = file.path(mainPath, "inputs"),
modulePath = file.path(mainPath, "modules"),
outputPath = file.path(mainPath, "outputs"))
mods <- list.files(file.path(projDir, "modules"), full.names = TRUE)
mods <- mods[basename(mods) %in% c("speciesAbundance", "temperature", "speciesTempLM")]
sapply(mods, function(mod, to) {
to <- file.path(to, basename(mod))
dir.create(to, showWarnings = FALSE)
ff <- list.files(mod, full.names = TRUE) ## don't recurse into dirs to preserve structure
file.copy(ff, to, recursive = TRUE, overwrite = TRUE) ## now recurse to copy files in dirs
}, to = getPaths()$modulePath)
rm(mods)
```
We then create modules using `newModule`.
`newModule` creates a module folder (*speciesAbundance*) inside `/modules` that contains both the module `.R` script template, as well as the documentation template (the `.Rmd` file).
Although we will not be discussing the `.Rmd` file, please bear in mind that this is a **fundamental** part of creating a reproducible and transparent module - check out the [Guide to Reproducible Code in Ecology and Evolution](https://www.britishecologicalsociety.org/wp-content/uploads/2017/12/guide-to-reproducible-code.pdf) from the British Ecological Society).
The documentation should contain a the description of the module, its input, parameters and outputs, and potentially a reproducible examples of how the module is executed.
`newModule` also created the folder `/data` where data necessary to the module can be put in, and the folder `/tests` that may contain testing scripts.
We will not be using either of them in this example.
**/!\\ ATTENTION /!\\**
*`newModule` should only be run once, otherwise it will replace all edits and contents of the module folder with the templates - this is why it is wrapped in an `if` statement above.*
Now go ahead, open the `speciesAbundance.R` script and have a look at it.
------------------------------------------------------------------------
### General module structure: *speciesAbundance* module
The module template contains all the essential components of a module, with examples, and may seem overwhelming at first.
We'll go through it step by step (although not necessarily following the order of the script).
The module script can be divided into 4 parts:
[**Defining the module**]: this is where the module is **defined**, i.e., the module's metadata (e.g. module author(s), time units, basic parameters, general inputs and outputs, etc.);
[**Events and event functions**]: these are the "actions" (or events) executed in the module (i.e. species reproduction, plotting, saving parameters) - simply put, **WHAT** the module does;
[**Scheduling events**]: this is how `SpaDES` schedules when each event is going to happen - in which order (e.g. during the simulation, when will `SpaDES` plot a graph) - simply put, **WHEN** the module does it;
[**Additional module functions**]: any additional functions needed (e.g. this is used to keep the coding of your module as clear and straightforward as possible);
The first thing to note is that **the user does not need to manually run** any of the code inside a module's `.R` script.
The function `simInit()` will do so when it sets up the simulation.
We will see this see this later in detail.
#### **Defining the module**
The first section of the script defines the module's [metadata](https://data-informed.com/what-is-metadata-a-simple-guide-to-what-everyone-should-know/).
It allows defining the module's author, keywords, any required packages and module(s) and their versions, but also parameters (and their default values) and input objects that the module requires, and the output objects it creates.
Although this dummy module example requires no true input data, we will define the template raster \`r\` as an "input" in the `expectsInput` function, and provide a default object in `.inputObjects` (see below).
As for the outputs, it produces a list of abundance rasters (produced during the `abundanceSim` event).
So we define it as an output in the`createsOutput` function.
Note that we removed several parameters that come with the template created by the `newModule` function, as they are not needed for this example.
To distinguish what input and output objects are in the context of a module, a good rule of thumb is that inputs are all the `sim$...` objects that appear for the first time (in the module events) on the **right-hand side** of a `<-`, whereas output parameters are the `sim$...` objects that appear for the first time to the **left-hand side** of a `<-`.
Another way of explaining it for objects is illustrated in @fig-inOutObj :
```{r fig-inOutObj}
#| fig-align: center
#| warning: false
#| message: false
#| echo: false
#| fig-cap: "Inputs and outputs in SpaDES: Object A comes from outside of the module (e.g. from an internet URL, from data you have, or from `.inputObjects`), while Module Z produces object C. Both objects serve as an inputs for Module Y, which in return produce as outputs objects B and D, respectivelly from objects A and C. As Module Z uses a simple function *internally* to create object C, it doesn't have any inputs, such as our dummy example."
#| out-width: 80%
knitr::include_graphics(normalizePath(file.path(projDir, "figures/obj.png"), winslash = "/"))
```
The default input objects created by the `.inputObjects` function (see \[.inputObjects function\]) during the `simInit` call are exceptions to this rule.
Here is how we defined the *speciesAbundance* module:
```{r abund_module}
#| eval: false
defineModule(sim, list(
name = "speciesAbundance",
description = "",
keywords = "",
authors = person("Me", email = "[email protected]", role = c("aut", "cre")),
childModules = character(0),
version = list(speciesAbundanceData = "1.0.0"),
timeframe = as.POSIXlt(c(NA, NA)),
timeunit = "year",
citation = list("citation.bib"),
documentation = deparse(list("README.txt", "speciesAbundance.Rmd")),
reqdPkgs = list("SpaDES.core (>=2.0.2)",
"terra", "quickPlot"),
parameters = bindrows(
#defineParameter("paramName", "paramClass", value, min, max, "parameter description"),
defineParameter("simulationTimeStep", "numeric", 1, NA, NA,
"This describes the simulation time step interval"),
defineParameter(".plotInitialTime", "numeric", 1, NA, NA,
"Describes the simulation time at which the first plot event should occur."),
defineParameter(".plotInterval", "numeric", 1, NA, NA,
"Describes the simulation time interval between plot events.")
),
inputObjects = bindrows(
# expectsInput("objectName", "objectClass", "input object description", sourceURL, ...),
expectsInput("r", objectClass = "SpatRaster", desc = "Template raster")
),
outputObjects = bindrows(
#createsOutput("objectName", "objectClass", "output object description", ...),
createsOutput("abundRasters", "list", "List of layers of species abundance at any given year")
)
))
```
Note that the package versions that you define will depend on the ones that are installed on your computer.
So take care to change them accordingly.
The `SpaDES` package version suggested by the template reflects the version on your computer.
The rest of the script defines the events and their sequences for this module - remember `SpaDES` = Spatial Discrete Event Simulator - and the events themselves.
**/!\\ ATTENTION /!\\**
*`defineModule()` is not intended to be run directly by the user -- it is run internally during a `simInit()` call (see [Simulation setup in a "global" script]). In other words, you don't run any part of a module's code directly in your session; you run `simInit()` with that module listed in the modules argument.*
#### **Events and event functions**
Module events are defined and scheduled in the `doEvent.<module name>` function (in this example, `doEvent.speciesAbundance` function; see \[Scheduling events\]. Since we are only interested in simulating and plotting species abundances, we removed unnecessary events from the script and kept: the initialisation (`init`), an abundance simulation event (`SimulAbund`) and a plotting event (`abundPlot`).
Each of these events can execute one or more functions.
*Event functions* (actual R functions) mustn't be confused with *event names*, which are the names of the events appearing in the `doEvent.<module name>`.
**/!\\ ATTENTION /!\\**
*Event functions take only one argument, `sim` (the `SpaDES.core::simList` object that stores all objects, modules, functions, etc., of a simulation; see `?simList`) and event functions always (and only) return `sim` (using `return(invisible(sim))`).*
##### *Initialisation event function*
The initialisation event function (here, `abundanceInit`) can be seen as the starting point of the module.
Unlike the `init` event, which must always be present, the function itself does not need to exist (see \[Scheduling events\]) and can have whatever name we want.
Usually, this function will does pre-simulation steps that are only need to be executed once.
In our dummy example, it creates a template raster and a storage list for our species abundance outputs (which will also be rasters).
Notice that the only argument to `abundanceInit` is `sim`, a `simList` object that is also its only output.
```{r abund_module_SpaDES}
#| eval: false
abundanceInit <- function(sim) {
## create storage list of species abundance
sim$abundRasters <- list()
return(invisible(sim))
}
```
##### *Abundance simulation event function*
The function `abundanceSim` is the core event function of this module, where species abundances are generated via the event.
Notice how instead of a *for-loop*, `abundanceSim` runs the `abundance_model` function (which we define separately below) and stores its outputs in the `sim$abundRaster` object.
Notice as well that we use `time(sim)` as the identifier of the list slots where outputs are stored (see `?SpaDES.core::time`).
As before, the sole argument and output to this event function is the `sim` object.
```{r abund_module3}
#| eval: false
abundanceSim <- function(sim) {
## Generate species abundances - our "simulation"
sim$abundRasters[[as.character(time(sim))]] <- abundance_model(ras = sim$r)
return(invisible(sim))
}
```
The `abundanceSim` function was called `Event1` in the template.
##### *Plotting event function*
Finally, we created the `abundancePlot` event function to plot the species abundance rasters that are produced by the `abundanceSim` event function.
Again, the sole argument and output of this function is `sim`.
```{r abund_module4}
#| eval: false
abundancePlot <- function(sim) {
## plot abundances
plotTitle <- paste("Species abundance\nat time",
names(sim$abundRasters)[length(sim$abundRasters)])
abundPlot <- sim$abundRasters[[length(sim$abundRasters)]]
Plot(abundPlot,
title = plotTitle,
new = TRUE, addTo = "abundPlot")
return(invisible(sim))
}
```
The `abundancePlot` function was called `plotFun` in the template.
#### **Scheduling events**
The order in which module events are executed is determined by the `doEvent.<module name>` function.
This function also defines the events themselves and what happens in them.
The `switch` function executes each event (here `init`, `SimulAbund`, and `abundPlot`) and the events schedule themselves.
Two things are of particular importance:
1. The `init` event is **mandatory**. This is the only event whose *name* that cannot be changed and that cannot be removed (even if it does not execute any event functions). All other `events` are optional and can be renamed. `SpaDES` searches and executes all modules' `init` events automatically. Note that the names of event functions executed during `init` *can* have any name: here we changed the `Init` function name (suggested by the template) to `abundanceInit`.
2. **Events should only schedule themselves**. The only exception is the `init`, which schedules the first time all other events are executed (even if a particular event only occurs once at the end of the simulation).
It is usually easier to fill the `doEvent.<module name>` function *after* having defined the event functions (as we did above).
For instance, we know that plotting should occur after the generation of species abundances, and so the `abundPlot` will be scheduled to occur after the `SimulAbund` event, by changing event priority (see `?priority`).
This is how we configured our `doEvent.speciesAbundance` function:
```{r abund_module5}
#| eval: false
doEvent.speciesAbundance = function(sim, eventTime, eventType, debug = FALSE) {
switch(
eventType,
init = {
## do stuff for this event
sim <- abundanceInit(sim)
## schedule future event(s)
sim <- scheduleEvent(sim, eventTime = start(sim), moduleName = "speciesAbundance",
eventType = "SimulAbund")
sim <- scheduleEvent(sim, eventTime = P(sim)$.plotInitialTime,
moduleName = "speciesAbundance", eventType = "abundPlot",
eventPriority = .normal()+0.5)
},
SimulAbund = {
## do stuff for this event
sim <- abundanceSim(sim)
## schedule future event(s)
sim <- scheduleEvent(sim, eventTime = time(sim) + P(sim)$simulationTimeStep,
moduleName = "speciesAbundance", eventType = "SimulAbund")
},
abundPlot = {
## do stuff for this event
sim <- abundancePlot(sim)
## schedule future event(s)
sim <- scheduleEvent(sim, eventTime = time(sim) + P(sim)$.plotInterval,
moduleName = "speciesAbundance", eventType = "abundPlot",
eventPriority = .normal()+0.5)
},
warning(paste("Undefined event type: '", current(sim)[1, "eventType", with = FALSE],
"' in module '", current(sim)[1, "moduleName", with = FALSE], "'", sep = ""))
)
return(invisible(sim))
}
```
We suggest having a look at `?base::switch` too fully understand its behaviour.
In short, `base::switch` tells R to execute (or switch) different code depending on the value of `EXPR` (here `eventType`).
Here, this means that the behaviour of the function `doEvent.speciesAbundance` will change depending on the present `eventType`.
So we need to define what behaviour it should have for each event type defined in the module - namely, which functions will be executed and whether to schedule future events with `scheduleEvent`.
##### **init**
The first event is, obviously, `init` - again **its name cannot be changed**.
In `init` we run the initialisation event function (`abundanceInit`) - optional - and schedule the first occurrence of all other events (here, the abundance simulation, `SimulAbund`, and plotting, `abundPlot`, events).
Because the `init` is the only event that `SpaDES` always executes at the start of the simulation, if no events are scheduled during `init`, no events will be executed after the `init`.
Notice two things:
1. The `SimulAbund` event is scheduled at `start(sim)` (i.e. at the first time step of the simulation), which means that it will run after the `init` event, but still in the same "year".
2. `init` schedules the first plotting event to be executed at the time defined by the `.plotInitialTime` parameter, which is stored in the `sim` object (and obtained using `SpaDES.core::P(sim)`), but with a slightly lower event priority `eventPriority = .normal()+0.5` (see `?priority`).
##### **SimulAbund**
The `SimulAbund` event is defined next.
This event used to be called `event1` in the template, and we changed its name to be more informative of what it does.
It is the core event of this module, where species abundances are generated via the event function `abundanceSim`.
The even also **schedules itself** to occur at a frequency defined by the `simulationTimeStep` parameter
##### **abundPlot**
Finally, we schedule the plotting event, `abundPlot` (which used to be called `plot` in the template).
Similarly to the `SimulAbund` event, it executes an event function (`abundancePlot`) and reschedules itself.
An important difference is that it uses the `.plotInterval` parameter, instead of `simulationTimeStep`, when rescheduling itself.
This way, future events will occur depending on the time step and plot interval parameters defined in the global script (or their default values defined in the metadata section).
#### **`.inputObjects` function**
The end of the template `.R` script defines a function called `.inputObjects`.
This is where the developer should include code to provide the defaults for any input objects required by the module.
This is the ideal place to produce the template raster, `r` , instead of doing so in `abundanceInit`.
This will allow a future user (or module) to provide their own `r` object (e.g. for another study area).
If on the other hand we don't do this and create `r` during the `init` event, any `r` supplied by the user will be overridden by the execution of `init`.
As a rule of thumb, default inputs should be created in a way that allows their values to be overridden by the user (by supplying a named list of objects via `simInit(objects = ...)`) or by any other modules that produce these objects.
For this, we rely on the `SpaDES.core::suppliedElsewhere` function, which detects if a given object has already been supplied by the user or if it will be supplied by another module.
Note that `suppliedElsewhere` does not know whether the module that supplies the object will be executed *before* the present module, as it is blind to module scheduling order.
When modules are relatively simple and have an approximately linear flow of interdependencies, `SpaDES` is usually able to tell the order in which modules need to be executed.
In more complex cases it is a good idea to pass a vector of module names to `simInit(loadOrder = ...)` definining the order of module execution.
Here's an example of how to do this (the commented instructions have been deleted):
```{r abund_model6}
#| eval: false
.inputObjects <- function(sim) {
if (!suppliedElsewhere("r")) {
## make template raster if not supplied elsewhere.
sim$r <- rast(nrows = 100, ncols = 100, xmin = -50, xmax = 50, ymin = -50, ymax = 50)
}
return(invisible(sim))
}
```
If we choose to supply the default `r` in `.inputObjects`, then we need to remove its creation from the `abundanceInit` function and add `r` to the metadata as an input.
We have done this, so that `abundanceInit` only creates a storage list for the outputs:
```{r abund_model7}
#| eval: false
abundanceInit <- function(sim) {
## create storage list of species abundance
sim$abundRasters <- list()
return(invisible(sim))
}
```
It is good practice to provide default input objects to all remaining modules, so that they can work stand-alone.
We have done this below.
**/!\\ ATTENTION /!\\**
*If `r` becomes an input with defaults it must be **added to the module metadata** inside an `expectsInput` call.*
#### **Additional module functions**
Events can also rely on other functions that can either be sourced from other scripts, or defined at the end of the module script (e.g. usually before `.inputObjects`, although the order is irrelevant).
This is the case for the species abundances generator function, which we coded in a separate script called `abundance_model.R`.
Scripts with accessory functions like these go into module's `R/` folder.
Functions should also be accompanied by metadata.
Here we provide a description of the function, its parameters, returning value and what other package functions it relies on using the `roxygen2` documentation style (indicated by `#'`).
```{r abund_module8}
#| code: !expr 'readLines(file.path(projDir, "modules", "speciesAbundance", "R", "abundance_model.R"))'
#| eval: false
```
------------------------------------------------------------------------
### Creating and adding additional modules: the *temperature* module
The order in which modules are first executed (i.e. their `init` events) can be automatically determined by inter-module dependencies (i.e. module inputs that are the outputs of other modules).
If there are no inter-module dependencies this order is determined by the order in which the modules are listed in the `Part1_DummyModel.R` script, or via the `simInit(loadOrder = ...)` argument.
After the `init` event, the module execution order follows the order of events.
This means that a module's events can be scheduled before and after another module's events within the same simulation time step.
However, keep in mind that this can make the simulation flow hard to follow, debug and change when additional modules are added.
The second module we created generates yearly temperatures.
Apart from different objects and functions names, this module also has the template raster `r` as required input object.
Recall that `r` is created during the `.inputObjects` of the *speciesAbundance* module.
When the two modules are linked, this object will not be created twice because `suppliedElsewhere("r")` will tell the *temperature* module that \`r\` will be supplied by another module.
This may appear trivial in this example, but it can be extremely useful when inuts are heavy objects that require lengthy computations to be produces.
This is how we set up the `temperature.R` script looks like:
```{r temp_module}
#| code: !expr 'readLines(file.path(projDir, "modules", "temperature", "temperature.R"))'
#| eval: false
```
Again, we added an accessory `temperature_model` function in a separate script `R/temperature_model.R`:
```{r temp_module2}
#| code: !expr 'readLines(file.path(projDir, "modules", "temperature", "R", "temperature_model.R"))'
#| eval: false
```
------------------------------------------------------------------------
### Modules that depend on other modules: the *speciesTempLM* module
Our third and last module, *speciesTempLM*, will be used to run the statistical analysis at each year, after the abundances and temperatures are generated (**species** and **Temp**erature **L**inear **M**odel).
Hence, it will depend on the outputs of the *speciesAbundance* and the *temperature* modules.
The interest of keeping the statistical analysis in a separate module lies on the fact that it allows us to easily swap and compare different statistical models to analyse our data if we want to.
It also allows for greater flexibility when it comes to **when** the statistical model is supposed to run.
For example, we may want to fit it at every 5 years, instead of every year, using the previous 5 years of data.
By having the statistical analysis contained in its own module, we don't need to change other module scripts in order to make these changes.
Finally, we draw your attention to a few differences in this module's script before we see it:
- The **frequency** of the statistical analysis (and correspondent plots) will be determined by the parameter `statsTimestep`.
This parameter also determines the number of data years to be used to fit the linear model.
If `statsTimestep = 5`, the statistical analysis will use the precedent 5 years of data including the year in which the event is running (a total of 6 years of data);
- This module **requires inputs** that have no defaults in `.inputObjects`.
They are specified in `inputObjects` part of `defineModule` - notice how we've respected the names, classes and description of the objects that come from the *speciesAbundance* and the *temperature* modules;
- We have **two additional functions** in a separate script (`R/linear_model_functions.R`): the function fitting the linear model and a plotting function.
Below is the full module script.
Notice how the future events where scheduled to `P(sim)$statsTimestep + 0.1`, to force the statistical analyses to occur **after** the abundance and temperature rasters are ready.
```{r stats_module}
#| code: !expr 'readLines(file.path(projDir, "modules", "speciesTempLM", "speciesTempLM.R"))'
#| eval: false
```
And the script with the accessory functions:
```{r stats_module2}
#| code: !expr 'readLines(file.path(projDir, "modules", "speciesTempLM", "R", "linear_model_functions.R"))'
#| eval: false
```
------------------------------------------------------------------------
### Simulation
#### Simulation setup in a "global" script
We can now go back to our `Part1_DummyModel.R` script and set the simulation up.
The function `simInit` needs a few arguments listing simulation folder directories, parameters, simulation times, modules and, optionally, input objects supplied by the user.
`simInit` will prepare a simulation object that can later be run by the `spades` function:
- The first list, `modules`, contains modules we want to activate.
- `times` is a named list containing the start and end times of the simulation and what time units we're working with (with "start" and "end" being the list `names`. It thus defines the length of the simulation. It is important that the start and ending times are defined in decimals, because `SpaDES` allows decomposing time units into smaller fractions.
- `parameters` is a named list of named lists, containing parameters values passed to each module.
Note that because the module metadata will (or should) contain default parameter values, here we pass only parameters which we want to change with respect to their defaults.
For instance, `.plotInterval` is used and defined in the *speciesAbundance* and *temperature* modules, but not passed to the `simInit` function because we want to use the default value.
As a developer providing a reproducible example, we may also chose to list important and useful parameters, even if the value is the same as the default.
Here we chose to list `.plotInitialTime` (a parameter used and defined in the *speciesAbundance* and *temperature* modules), but provide the default value (we experimenting with it by changing its value in the `Part1_DummyModel.R`).
- `paths` contains the folder directory paths that we set earlier.
```{r globalscript2}
#| results: false
#| warning: false
#| message: false
## list the modules to use
simModules <- list("speciesAbundance", "temperature", "speciesTempLM")
## Set simulation and module parameters
simTimes <- list(start = 1, end = 10, timeunit = "year")
simParams <- list(
speciesAbundance = list(simulationTimeStep = 1,
.plotInitialTime = 1),
temperature = list(simulationTimeStep = 1,
.plotInitialTime = 1),
speciesTempLM = list(statsTimestep = 5)
)
## make a list of directory paths
simPaths <- getPaths()
## Simulation setup
mySim <- simInit(times = simTimes, params = simParams,
modules = simModules, paths = simPaths)
```
Finally, we highlight that `simInit` also executes all `.inputObjects` functions, and schedules the `init` events, but does not execute them:
```{r scheduled-events}
#| results: false
#| warning: false
#| message: false
events(mySim)
```
#### Checking the simulation setup
Before starting the simulations we should check if the modules were linked correctly.
**Module diagram**
`moduleDiagram` (@fig-moduleDiagram ) is a useful function that shows module inter-dependencies as a network diagram.
The direction of the arrows indicates an output to input flow.
You can see that *speciesAbundance* and *temperature* inputs (specifically our \`r\` raster) are supplied by an external source (*INPUT*) - the user or `.inputObjects`.
Whereas the inputs to the *speciesTempLM* module are outputs of the *speciesAbundance* and *temperature* modules.
```{r fig-moduleDiagram}
#| fig-height: 6
#| fig-width: 6
#| message: false
#| warning: false
#| fig-cap: "Diagram of module connections."
moduleDiagram(mySim)
```
**Object diagram**
`objectDiagram` (@fig-objectDiagram) provides another way of checking module linkages.
It explicitly shows module inter-dependencies by depicting the objects that establish links between modules.
```{r objectDiagramCode}
#| eval: false
objectDiagram(mySim)
```
```{r fig-objectDiagram}
#| echo: false
#| fig-align: center
#| fig-cap: "Module diagram showing module inter-dependencies with object names."
#| fig-width: 10
knitr::include_graphics(normalizePath(file.path(projDir, "figures/Part1_objectDiagram.png"), winslash = "/"),
dpi = NA)
```
#### Running `SpaDES`
We run the simulation using the `spades` function, which takes the output of the `simInit`, executes the already scheduled `init` events, which schedule the remainder of the events.
We passed `debug = TRUE` so that `spades` prints the events as they are being executed.
In case something fails, this helps diagnosing where the issue occurred.
```{r spadesrun}
#| eval: false
## run simulation
dev() # on Windows and Mac, this opens external device if using Rstudio, it is faster
clearPlot()
mySim2 <- spades(mySim, debug = TRUE)
```
```{r figSimulationFig}
#| fig-align: center
#| warning: false
#| message: false
#| echo: false
#| fig-cap: "Simulation plots: Final plot of the simulation"
#| out-width: 80%
knitr::include_graphics(normalizePath(file.path(projDir, "figures/simul_plot.png"), winslash = "/"))
```
We suggest experimenting with changing parameter values and trying to create and add other modules to further explore all the `SpaDES` flexibility.
The more complex the project gets, the more advantageous it is to use `SpaDES` to turn modules *on* or *off*, swapping modules to run, e.g., different statistical analyses, or to include different data.
------------------------------------------------------------------------
### Additional notes
`SpaDES` is an extremely powerful package, whose potential goes well beyond what has been discussed in this dummy example.
If you want to explore it further, we recommend following [Chapter 2](Part2_SDMs.qmd) for a more realistic (but still simple) `SpaDES` application.
Also, do go to the [`SpaDES` webpage](https://spades.predictiveecology.org/) to find further information about the platform, as well as upcoming workshops and publications and to the [Predictive Ecology Github repository](https://github.com/PredictiveEcology/) to see all the `SpaDES` modules and SpaDES-related packages that we maintain at the [Predictive Ecology Lab](https://predictiveecology.org).
------------------------------------------------------------------------
<center>**Happy SpaDESing!**</center>
[^part1old-1]: The full R script to run this example @sec-scriptPart1 .