Getting Started With care4cmodel

This vignette’s purpose is to serve as a guide for getting quickly started with the R package care4cmodel. While we give a short introduction the concept of the package below, we assume that you already know why you want to use the package, and that you are not completely unfamiliar with the idea behind it. Here, we will only provide scientific details if they are necessary for understanding how to work with the package. In parallel, a scientific publication is currently in preparation.

1 Concept of the package

The R package care4cmodel has been designed in the context of quantifying carbon flows related to the growth and management of forests. It is, in essence, a dynamic simulation model that is surrounded with tools to enable users to quickly set up and simulate scenarios and evaluate them. The current version is made for evaluating silvicultural concepts with a focus on opposing the CO2 uptake from wood growth to the CO2 emissions caused by forest operations.

The basic idea of the model is that a forest area managed under a given concept can be adequately represented by a set of forest development stages, each of which covers a certain share of the total area. These area shares change dynamically over time due to continuous forest dynamics, but they can also change abruptly due to hazard events. Any further simulation results are basically obtained by upscaling phase-wise information to these areas. Note, that care4cmodel is not a growth and yield simulator in the usual sense, i.e.  the required basic growth and yield information relating to the concept of interest must be provided by the user (see Section 3.1). While the current implementation is constrained to providing the carbon related information as mentioned above, the concept of the model is generic and lends itself to a broad variety of applications.

2 Quickstart

If you simply want to run and evaluate a simulation without further reading, here’s your code:

library(care4cmodel) # Attach the package

# Run a simulation and store its base results in a variable sim_base_out
# call ?simulate_single_concept for details
sim_base_out <- simulate_single_concept(
    concept_def = pine_thinning_from_above_1, # use pre-defined concept
    init_areas  = c(800, 0, 0, 0, 0, 200),
    time_span   = 200,
    risk_level  = 3
)

# Evaluate the base results for carbon related information
# call ?fuel_and_co2_evaluation for details
carbon_out <- fuel_and_co2_evaluation(sim_base_out, road_density_m_ha = 35)

Both result variables, sim_base_out, and carbon_out contain large objects which are, in essence, named lists. You can easily explore them manually, but there exist convenient functions for visualization:

# Plot base results. Without further specifications, this will generate a plot
# of the areas covered by the stand development phases over time

# Check the documentation with ?plot.c4c_base_result in order to see all 
# options for plotting, especially growth and yield variables
plot(sim_base_out) 

# Plot carbon related results. The selected option for plot_type generates a
# phase diagram where the total CO2 emissions due to forest operations are
# plotted over the CO2 sequestered by wood growth.

# Check the documentation with ?plot.c4c_co2_result in order to see all options
# for plotting
plot(carbon_out, plot_type = "em_vs_inc")

Great, if this got you sufficiently started. If you want more explanations, read on.

3 Using care4cmodel

After installing the package on your computer, the easiest (and standard) way to access the functionality of the package is to attach it with

library(care4cmodel)

3.1 Silvicultural concept definition

The fundamental requirement for running simulations with care4cmodel is an appropriate definition of the silvicultural concept of interest. Such definitions are objects of class c4c_concept, and the most convienent way to generate one of your own is to use the function c4c_concept(). The main argument to that function is a data frame with growth and yield information, as defined below. Many users will find it convenient to assemble this information in a spreadsheet software like MS EXCEL, and import it into R as a data frame. With ?c4c_concept you can call the documentation of c4c_concept(). Using this function includes a series of automatic checks in order to ensure that the definition is technically correct. The package contains two pre-defined concept definitions, pine_thinning_from_above_1, and, pine_no_thinning_and_clearcut_1. We use the former for explaining the main content of such a definition. Simply type the name of the concept to display its contents:

pine_thinning_from_above_1
#> $concept_name
#> [1] "Scots pine thinning from above"
#> 
#> $units
#>          time          area        volume          mass stem_diameter 
#>        "year"          "ha"          "m3"           "t"          "cm" 
#> 
#> $growth_and_yield
#> # A tibble: 6 × 14
#>   phase_no phase_name      duration n_subphases vol_standing vol_remove vol_mort
#>      <int> <chr>              <dbl>       <dbl>        <dbl>      <dbl>    <dbl>
#> 1        1 initial_stand         15           3          0         0       0    
#> 2        2 young_growth          14           3         58.5       0       0.142
#> 3        3 immature_timber       29           6        206.        1.64    1.72 
#> 4        4 mature_timber         49          10        374.        4.38    1.60 
#> 5        5 final_harvest_…       19           4        446.        8.78    0.584
#> 6        6 final_harvest_…       29           6        378.       16.5     0.471
#> # ℹ 7 more variables: n_standing <dbl>, n_remove <dbl>, dbh_standing <dbl>,
#> #   dbh_remove <dbl>, harvest_interval <dbl>, survival_cum <dbl>,
#> #   vol_increment <dbl>
#> 
#> attr(,"class")
#> [1] "c4c_concept"

Basically, such an object of class c4c_concept (as created with the function of the same name) is a list with three elements. The first element, concept_name is to be defined by the user. The second element, units lists the most important units to be used when simulating this concept. This feature is not actively used in the current version (and is not required to be defined by the user), i.e. all numbers connected to any silvicultural concept are understood in these units if not explicitly stated otherwise. However in future versions, it is planned to be more flexible in that regard. The third element, growth_and_yield is actually the most important. It is a data frame that defines the stand development phases to be considered, their duration, and a set of fundamental growth and yield variables. Typically this information comes from observational plots, silvicultural guidelines, forest growth simulators, yield tables, and similar sources, or a mixture of these.

The number of stand development phases is totally up to the user; for most applications, however, four to seven phases should be sufficient. Each line of the data frame completely defines one such phase. The phases are taken to be subsequent to each other in the order of the column phase_no. The last phase is followed by the first phase again, which typically means a final harvest followed by an initial stand. The columns of the data frame are defined as follows (note that, depending on your system, the contents of some columns, and some columns themselves might be cut off in the display above):

3.2 Running a simulation

The workhorse function for conducting simulation runs with care4cmodel is simulate_single_concept(). For simulating an example with the concept pine_thinning_from_above_1 we use it as follows:

sim_base_out <- simulate_single_concept(
    concept_def = pine_thinning_from_above_1,
    init_areas  = c(1000, 0, 0, 0, 0, 0),
    time_span   = 200,
    risk_level  = 3
)

The first argument concept_def is the definition of the silvicultural concept to be used, it must be a valid object of class c4c_concept (Section 3.1). The second argument, init_areas, is a vector that sets the initial areas covered by the stand development phases as defined in the silvicultural concept (in ascending order). In the example, init_areas indicates 1000 ha in the first stand development phase, and no areas covered by any other phase. This can be seen as an afforestation of a 1000 ha forest area. During the simulation, the total area (1000 ha in our example) will remain constant, while the shares of the phases will be changing. The size of the areas and their initial distribution is fully up to the user. Note that the initial areas can also be specified on the level of the internal sub-phases (Section 3.1). Call the function’s documentation with ?simulate_single_concept for this and other details. The argument time_span is the number of years to cover with the simulation. 200 years, as defined here, is certainly longer than required for typical applications. With the argument risk_level, you can adjust the occurence of stochastic hazard events relative to the baseline risk settings made in survival_cum in the concept definition (Section 3.1). In the example, the setting risk_level = 3 means that the actual damage probabilities are the same as if the events leading to the baseline risk would happen three times in sequence. Consequently, risk_level = 1 would use exactly the baseline risk; numbers smaller than 1 would indicate a lower risk, e.g. risk_level = 1/2 uses damage probabilities as if only half of the baseline risk events happened. Setting risk_level = 0 will simulate the development without any stochastic hazard events. In our example, we store the output in a variable, we call sim_base_out. Internally, the simulation is based on functionality provided by the R package deSolve (Karline Soetaert, Thomas Petzoldt, and R. Woodrow Setzer 2010).

3.3 Exploring the base simulation output

As its output, a simulation with simulate_single_concept() generates an object of class c4c_base_result. This large object comprises different categories of information, and it has an own plot() function for convenient result visualization as we will show below. In essence, such an object is a named list containing several vectors, matrices, and data frames. In the example above (Section 3.2), we stored such an output object in the variable sim_base_out. For the sake of clarity, we do not display the entire object here. In order to get an overview, let us first display the names of the top level list elements:

names(sim_base_out)
#> [1] "concept"              "time_span"            "detailed_init"       
#> [4] "detailed_out"         "init_areas"           "risk_level"          
#> [7] "parameters"           "sim_areas"            "sim_growth_and_yield"

The first six list elements, concept, time_span, detailed_init, detailed_out, init_areas, and risk_level document the settings of
simulate_single_concept() that were used for the simulation. You can access them most conveniently by their name, e.g.

sim_base_out$init_areas
#> [1] 1000    0    0    0    0    0

The list element parameters contains interal simulation parameters derived from the user settings; most important sub-phase wise dwell times and detailed risk related information.

The two remaining list items sim_areas, and sim_growth_and_yield contain actual simulation output. Both are named lists themselves. sim_areas contains three matrices named as follows:

names(sim_base_out$sim_areas)
#> [1] "areas"                "area_inflows_regular" "area_outflows_events"

All three matrices have the same structure; each line represents a point in time, i.e. the end of a simulation year. The first column denotes these times in years. The other columns represent the subsequent stand development phases from left to right. The matrix areas simply contains each phase’s area in ha at the end of the respective simulation year. The matrix area_inflows_regular represents the areas in ha that flowed into a phase from the previous phase during the respective simulation year. Analogously, the matrix area_outflows_events contains the areas that flowed out of a given phase into the (first subphase of the) initial phase due to hazard events. As both of the latter matrices represent flows during a year, they contain NA values for the initial situation, i.e. time 0.

head(sim_base_out$sim_areas$areas)
#>      time initial_stand young_growth immature_timber mature_timber
#> [1,]    0     1000.0000      0.00000      0.00000000  0.000000e+00
#> [2,]    1      933.3646     66.61132      0.02405194  1.194750e-11
#> [3,]    2      866.9807    132.68005      0.33920499  1.036965e-08
#> [4,]    3      801.3647    197.11993      1.51537479  5.018447e-07
#> [5,]    4      737.1068    258.66169      4.23155240  7.508342e-06
#> [6,]    5      674.7173    316.13968      9.14299728  5.902744e-05
#>      final_harvest_phase_I final_harvest_phase_II
#> [1,]          0.000000e+00           0.000000e+00
#> [2,]          2.667036e-31           2.355412e-40
#> [3,]          1.860259e-24           1.883231e-31
#> [4,]          5.427057e-21           3.371655e-27
#> [5,]          1.395680e-18           2.739986e-24
#> [6,]          9.978331e-17           4.749860e-22
head(sim_base_out$sim_areas$area_inflows_regular)
#>      time initial_stand young_growth immature_timber mature_timber
#> [1,]    0            NA           NA              NA            NA
#> [2,]    1  1.601196e-55     66.63537      0.02405194  1.194750e-11
#> [3,]    2  1.087646e-42     66.39317      0.31516207  1.035772e-08
#> [4,]    3  4.207780e-37     65.66596      1.17651109  4.915013e-07
#> [5,]    4  2.036135e-33     64.34588      2.71796583  7.007984e-06
#> [6,]    5  1.354264e-30     62.43646      4.91347614  5.152796e-05
#>      final_harvest_phase_I final_harvest_phase_II
#> [1,]                    NA                     NA
#> [2,]          2.667036e-31           2.355412e-40
#> [3,]          1.860259e-24           1.883231e-31
#> [4,]          5.425208e-21           3.371467e-27
#> [5,]          1.390282e-18           2.736636e-24
#> [6,]          9.839079e-17           4.722529e-22
head(sim_base_out$sim_areas$area_outflows_events)
#>      time initial_stand young_growth immature_timber mature_timber
#> [1,]    0            NA           NA              NA            NA
#> [2,]    1    0.17662312  0.000000000    0.000000e+00  0.000000e+00
#> [3,]    2    0.03011647  0.009276849    9.009760e-06  1.128976e-14
#> [4,]    3    0.07505072  0.049569165    3.407947e-04  2.626824e-11
#> [5,]    4    0.08116650  0.086164250    1.781222e-03  1.487123e-09
#> [6,]    5    0.02970601  0.044992483    1.979728e-03  8.860609e-09
#>      final_harvest_phase_I final_harvest_phase_II
#> [1,]                    NA                     NA
#> [2,]          0.000000e+00           0.000000e+00
#> [3,]          4.118826e-34           4.773684e-43
#> [4,]          7.697644e-27           1.022242e-33
#> [5,]          2.626638e-23           2.140410e-29
#> [6,]          2.691596e-21           6.934072e-27

The other remaining list element sim_growth_and_yield is a named list of data frames that contain growth and yield information which was, in essence, obtained by upscaling the growth and yield information provided in the silvicultural concept definition (Section 3.1). We obtain an overview with:

sim_base_out$sim_growth_and_yield
#> $gyield_summary
#> # A tibble: 201 × 8
#>     time vol_standing vol_rmv_cont vol_rmv_damage vol_rmv_total vol_mort
#>    <dbl>        <dbl>        <dbl>          <dbl>         <dbl>    <dbl>
#>  1     0           0        0              NA            0          0   
#>  2     1        3903.       0.0393          0            0.0393     9.50
#>  3     2        7833.       0.555           0.545        1.10      19.4 
#>  4     3       11846.       2.48            2.97         5.45      30.6 
#>  5     4       16007.       6.92            5.41        12.3       44.0 
#>  6     5       20382.      15.0             3.04        18.0       60.6 
#>  7     6       25019.      27.5             3.21        30.7       81.2 
#>  8     7       29871.      44.9            79.2        124.       106.  
#>  9     8       35010.      67.7            98.0        166.       136.  
#> 10     9       40498.      96.3            59.3        156.       170.  
#> # ℹ 191 more rows
#> # ℹ 2 more variables: vol_inc_ups <dbl>, vol_inc <dbl>
#> 
#> $gyield_phases
#> $gyield_phases$vol_standing
#> # A tibble: 201 × 7
#>     time initial_stand young_growth immature_timber mature_timber
#>    <dbl>         <dbl>        <dbl>           <dbl>         <dbl>
#>  1     0             0           0             0    0            
#>  2     1             0        3898.            4.96 0.00000000446
#>  3     2             0        7764.           69.9  0.00000387   
#>  4     3             0       11534.          312.   0.000187     
#>  5     4             0       15135.          872.   0.00280      
#>  6     5             0       18498.         1884.   0.0220       
#>  7     6             0       21558.         3461.   0.115        
#>  8     7             0       24213.         5658.   0.451        
#>  9     8             0       26477.         8531.   1.44         
#> 10     9             0       28367.        12127.   3.95         
#> # ℹ 191 more rows
#> # ℹ 2 more variables: final_harvest_phase_I <dbl>, final_harvest_phase_II <dbl>
#> 
#> $gyield_phases$vol_rmv_cont
#> # A tibble: 201 × 7
#>     time initial_stand young_growth immature_timber mature_timber
#>    <dbl>         <dbl>        <dbl>           <dbl>         <dbl>
#>  1     0             0            0          0           0       
#>  2     1             0            0          0.0393      5.23e-11
#>  3     2             0            0          0.555       4.54e- 8
#>  4     3             0            0          2.48        2.20e- 6
#>  5     4             0            0          6.92        3.29e- 5
#>  6     5             0            0         15.0         2.58e- 4
#>  7     6             0            0         27.5         1.35e- 3
#>  8     7             0            0         44.9         5.29e- 3
#>  9     8             0            0         67.7         1.69e- 2
#> 10     9             0            0         96.3         4.62e- 2
#> # ℹ 191 more rows
#> # ℹ 2 more variables: final_harvest_phase_I <dbl>, final_harvest_phase_II <dbl>
#> 
#> $gyield_phases$vol_rmv_damage
#> # A tibble: 201 × 7
#>     time initial_stand young_growth immature_timber mature_timber
#>    <dbl>         <dbl>        <dbl>           <dbl>         <dbl>
#>  1     0            NA       NA            NA           NA       
#>  2     1             0        0             0            0       
#>  3     2             0        0.543         0.00186      4.22e-12
#>  4     3             0        2.90          0.0702       9.81e- 9
#>  5     4             0        5.04          0.367        5.55e- 7
#>  6     5             0        2.63          0.408        3.31e- 6
#>  7     6             0        2.52          0.689        2.04e- 5
#>  8     7             0       55.3          23.8          1.99e- 3
#>  9     8             0       60.2          37.8          7.56e- 3
#> 10     9             0       31.8          27.5          1.17e- 2
#> # ℹ 191 more rows
#> # ℹ 2 more variables: final_harvest_phase_I <dbl>, final_harvest_phase_II <dbl>
#> 
#> $gyield_phases$vol_rmv_total
#> # A tibble: 201 × 7
#>     time initial_stand young_growth immature_timber mature_timber
#>    <dbl>         <dbl>        <dbl>           <dbl>         <dbl>
#>  1     0             0        0              0           0       
#>  2     1             0        0              0.0393      5.23e-11
#>  3     2             0        0.543          0.557       4.54e- 8
#>  4     3             0        2.90           2.55        2.21e- 6
#>  5     4             0        5.04           7.29        3.34e- 5
#>  6     5             0        2.63          15.4         2.62e- 4
#>  7     6             0        2.52          28.2         1.37e- 3
#>  8     7             0       55.3           68.8         7.28e- 3
#>  9     8             0       60.2          106.          2.44e- 2
#> 10     9             0       31.8          124.          5.79e- 2
#> # ℹ 191 more rows
#> # ℹ 2 more variables: final_harvest_phase_I <dbl>, final_harvest_phase_II <dbl>
#> 
#> $gyield_phases$vol_mort
#> # A tibble: 201 × 7
#>     time initial_stand young_growth immature_timber mature_timber
#>    <dbl>         <dbl>        <dbl>           <dbl>         <dbl>
#>  1     0             0         0             0           0       
#>  2     1             0         9.46          0.0414      1.91e-11
#>  3     2             0        18.8           0.584       1.65e- 8
#>  4     3             0        28.0           2.61        8.01e- 7
#>  5     4             0        36.7           7.28        1.20e- 5
#>  6     5             0        44.9          15.7         9.42e- 5
#>  7     6             0        52.3          28.9         4.93e- 4
#>  8     7             0        58.8          47.3         1.93e- 3
#>  9     8             0        64.3          71.3         6.15e- 3
#> 10     9             0        68.8         101.          1.68e- 2
#> # ℹ 191 more rows
#> # ℹ 2 more variables: final_harvest_phase_I <dbl>, final_harvest_phase_II <dbl>
#> 
#> $gyield_phases$vol_inc_ups
#> # A tibble: 201 × 7
#>     time initial_stand young_growth immature_timber mature_timber
#>    <dbl>         <dbl>        <dbl>           <dbl>         <dbl>
#>  1     0         3901.           0            0          0       
#>  2     1         3641.         711.           0.220      8.91e-11
#>  3     2         3382.        1417.           3.10       7.73e- 8
#>  4     3         3126.        2105.          13.8        3.74e- 6
#>  5     4         2875.        2762.          38.6        5.60e- 5
#>  6     5         2632.        3376.          83.5        4.40e- 4
#>  7     6         2398.        3934.         153.         2.30e- 3
#>  8     7         2180.        4419.         251.         9.01e- 3
#>  9     8         1974.        4832.         378.         2.87e- 2
#> 10     9         1780.        5177.         538.         7.88e- 2
#> # ℹ 191 more rows
#> # ℹ 2 more variables: final_harvest_phase_I <dbl>, final_harvest_phase_II <dbl>
#> 
#> 
#> $gyield_harvest_detail
#> # A tibble: 2,412 × 8
#> # Groups:   time, phase_no, phase_name, harvest_type, dbh_cm, vol_tree_m3
#> #   [2,412]
#>     time harvest_type phase_no phase_name  dbh_cm vol_tree_m3 tree_dist_m volume
#>    <dbl> <chr>           <int> <chr>        <dbl>       <dbl>       <dbl>  <dbl>
#>  1     0 damage              1 initial_st…    0          0           1.2       0
#>  2     0 regular             1 initial_st…    0          0           0         0
#>  3     0 damage              2 young_grow…    5.8        0.01        1.27      0
#>  4     0 regular             2 young_grow…    0          0           0         0
#>  5     0 damage              3 immature_t…   10.3        0.05        1.55      0
#>  6     0 regular             3 immature_t…   12.9        0.09       10.7       0
#>  7     0 damage              4 mature_tim…   22.9        0.39        3.21      0
#>  8     0 regular             4 mature_tim…   25.8        0.58       11.5       0
#>  9     0 damage              5 final_harv…   30.2        0.88        4.43      0
#> 10     0 regular             5 final_harv…   41.9        1.08       15.7       0
#> # ℹ 2,402 more rows

As these data frames are mostly self explaining, let us just point out a few details. If not stated otherwise all numbers given there represent cubic meters wood. In the first data frame gyield_summary, the column vol_rmv_cont stands for wood volume that is continually removed (harvested) as planned according to the concept definition (Section 3.1). vol_rmv_damage in contrast, is all wood volume of the stands that were lost due to hazard events, assuming that this wood is harvested in its entirety. Consequently, vol_rmv_total is the sum of both, regular and damage-induced harvest. Note that there are two variables related to volume increment, vol_inc_ups, and vol_inc. The former is upscaled from the concept definition, and this the volume increment which should be solely used for subsequent calculations. The latter results from a differently defined post-hoc calculation and is provided for internal reasons only. The data frames that make up the sub-list gyield_phases give a more detailed view on the variables provided with gyield_summary, i.e. they split them among the stand development phases.

The last data frame, gyield_harvest_detail is of special importance, as it contains detailed growth and yield information that is required for calculating fuel consumption and CO2 emissions due to harvest operations. That is, the average dbh of a tree to be harvested, dbh_cm, the average tree volume, vol_tree_m3, the average neighbor distance of the harvest trees, tree_dist_m, and the volume to be harvested, volume. Note that volume can be 0 even though there are non-zero values for the other variables.

As mentioned in the quickstart section, there is an own plot function for the output objects of the function simulate_single_concept(). Its documentation is accessible by calling ?plot.c4c_base_result. The plot shown in the quickstart section shows the default, the development of the areas covered by the different stand development phases over time. Due to the CRAN policies we can only provide very few other example plots; the reader is encouraged to try out all options listed in the documentation. First, we visualize the standing volume:

plot(sim_base_out, variable = "vol_standing") # standing volume

Second, we plot the total harvested volume. The sharp peaks result from wood that is removed due to hazard events.

plot(sim_base_out, variable = "vol_rmv_total") # total harvested volume

4 Further functions

When calling the documentation of any function contained in the package care4cmodel, e.g. with ?simulate_single_concept and scrolling down to the bottom of the page, there is a link called “Index”. Following this link leads to a complete list of documented functions the package provides for users. Typically, most of these functions are internally used in the workflow described above. Advanced users, however, might find some useful tools among them.

5 Acknowledgment

This R package has been developed as a part of the CARE4C project that has received funding from the European Union’s HORIZON 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement # 778322.

6 References

Bacescu, Narcis Mihail, Alberto Cadei, Tadeusz Moskalik, Mateusz Wiśniewski, Bruce Talbot, and Stefano Grigolato. 2022. “Efficiency Assessment of Fully Mechanized Harvesting System Through the Use of Fleet Management System.” Sustainability 14 (24). https://doi.org/10.3390/su142416751.
Enache, A, and K Stampfer. 2015. “Machine Utilization Rates, Energy Requirements and Greenhouse Gas Emissions of Forest Road Construction and Maintenance in Romanian Mountain Forests.” Journal of Green Engineering 4 (4): 325–50.
Grigolato, Stefano, and Alberto Cadei. 2022. “Full-Mechanized CTL Production Data in Scots Pine Forest in Poland.” https://researchdata.cab.unipd.it/659/.
Kärhä, Kalle, Hanna Haavikko, Heikki Kääriäinen, Teijo Palander, Lars Eliasson, and Kimmo Roininen. 2023. “Fossil-Fuel Consumption and CO2eq Emissions of Cut-to-Length Industrial Roundwood Logging Operations in Finland.” European Journal of Forest Research, 1–17.
Karline Soetaert, Thomas Petzoldt, and R. Woodrow Setzer. 2010. “Solving Differential Equations in R: Package deSolve.” Journal of Statistical Software 33 (9): 1–25. https://doi.org/10.18637/jss.v033.i09.