Data Collection and Preparation

The random forest model is best suited to a system with a large number of input parameters, and it requires both presence and absence training points for classification. Climatic parameter rasters were obtained from the University of Alberta’s ClimateWNA dataset, which includes 24 bioclimatic parameter rasters (Hamann, 2013). The model was constructed using the climate normals from 1961-1990, and predictions were made using six sets of modelled bioclimatic variables. Two time periods were considered: the 2050s, defined by Hamann (2013) as 2041-2070, and the 2080s, defined as 2071-2100. For each period, three ensemble models were selected, representing high, moderate and low emissions scenarios.

The model was trained using about 600 data points from E-Flora BC representing the observed distribution, combined with a set of 400 pseudo-absence points. Although the observed distribution dataset contained records from outside the climate normals period (Acer macrophyllum pursh, 2018), all points were included in order to retain a large enough training set. Many of the points did not include their date of collection, and points collected after 1990 were found to be sufficiently close to earlier records to reflect past habitat. Ideally, observed absence points should be used to train the model (Mi et al., 2017), but in this case absence data was not available. Based on the methodology of Mi et al. (2017), a set of pseudo-absence points was generated by randomly sampling the study area. Fewer pseudo-absence points than observed presence points were generated to reflect the uncertainty in absences, but the two sets are similar enough in size that the training set remained balanced. With pseudo-absence points, the model classifies by predicting which cells are more likely to be hospitable to A. macrophyllum than a randomly chosen cell. The point data was converted to raster format with a 1 km resolution, matching the climate data.
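The random sampling step described above can be sketched as follows. This is only an illustration on a toy grid, not the workflow actually used: the function name, the mask arrays and the option to skip cells that already hold a presence record are all assumptions, since the text states only that the study area was sampled at random.

```python
import numpy as np

def sample_pseudo_absences(study_mask, n_points, presence_mask=None, seed=0):
    """Pick n_points distinct raster cells at random from the study area.

    study_mask    -- boolean grid, True for cells inside the study area
    presence_mask -- optional boolean grid of observed presence cells; if
                     given, those cells are excluded (an assumption, the
                     source does not say whether this was done)
    Returns an (n_points, 2) array of (row, col) cell indices.
    """
    rng = np.random.default_rng(seed)
    eligible = study_mask.copy()
    if presence_mask is not None:
        eligible &= ~presence_mask
    candidates = np.flatnonzero(eligible)
    if n_points > candidates.size:
        raise ValueError("not enough eligible cells for the requested sample")
    # Sample without replacement so no cell is picked twice.
    chosen = rng.choice(candidates, size=n_points, replace=False)
    rows, cols = np.unravel_index(chosen, study_mask.shape)
    return np.column_stack([rows, cols])

# Toy 50 x 50 grid standing in for the 1 km raster (hypothetical data).
study_mask = np.ones((50, 50), dtype=bool)
study_mask[:, :5] = False            # cells outside the study area
presence_mask = np.zeros_like(study_mask)
presence_mask[10:20, 10:20] = True   # cells with observed presences

absences = sample_pseudo_absences(study_mask, 40, presence_mask)
```

Sampling cell indices rather than coordinates keeps the pseudo-absences aligned with the raster grid, so they can be stacked directly with the rasterized presence points.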

In addition to the climatic variables, a number of topographic parameter rasters were created from a digital elevation model. These included slope and aspect, which are commonly considered in plant distribution models (Garzon et al., 2006), as well as slope curvature and a topographic wetness index, which acts as a proxy for soil moisture and stream proximity. A. macrophyllum is known to be found in riparian zones and lowlands, and it requires moist soil (Acer macrophyllum pursh, 2018; Fryer, 2011).
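As a rough illustration of how two of these rasters can be derived, the sketch below computes slope from an elevation grid and a wetness index in the commonly used form ln(a / tan β), where a is the specific catchment area and β is the slope. The text does not state which formulation or software was used, so the function names, the minimum-slope clamp and the toy inputs are assumptions. The catchment area is taken as a given input here, because computing it requires a flow-routing step that is not shown.

```python
import numpy as np

def slope_degrees(dem, cell_size):
    """Slope in degrees from an elevation grid with square cells."""
    # np.gradient returns the elevation change per unit distance along
    # rows and columns; their magnitude is the steepest rise over run.
    dz_drow, dz_dcol = np.gradient(dem, cell_size)
    return np.degrees(np.arctan(np.hypot(dz_drow, dz_dcol)))

def wetness_index(specific_catchment_area, slope_deg, min_slope_deg=0.1):
    """Topographic wetness index, ln(a / tan(beta)).

    Slopes are clamped to min_slope_deg (an assumed value) so that flat
    cells do not cause a division by zero.
    """
    slope_rad = np.radians(np.maximum(slope_deg, min_slope_deg))
    return np.log(specific_catchment_area / np.tan(slope_rad))

# Toy DEM: a plane rising 1000 m per 1000 m cell (hypothetical data).
dem = np.tile(np.arange(5.0) * 1000.0, (5, 1))
slope = slope_degrees(dem, cell_size=1000.0)

# Assumed specific catchment area of 1000 m for every cell.
catchment = np.full(dem.shape, 1000.0)
twi = wetness_index(catchment, slope)
```

Higher index values mark cells that collect water from a large upslope area and drain slowly, which is why the index can stand in for soil moisture near streams and in lowlands.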
