Conclusion

Modern machine learning algorithms give researchers a powerful toolkit for discovering non-linear empirical relations, which is especially useful for classification problems. The streamflow regime classification model created for BC streams using the extreme gradient boosting (XGB) algorithm is statistically robust and requires only readily available climate and topographic data. However, the ability of the XGB algorithm to predict streamflow quantities is modest. Its performance in numeric prediction is similar to that of regression models and inferior to that of the process-based DMWBM by Moore et al. The model tends to over-predict at locations with low-flow conditions. Further studies could focus on improving the model's quantitative predictive power by investigating the sources of systematic error when the flow is low. Fortunately, the number of basins with low-flow conditions (< 1 m³/s) is within a manageable range, so examining the histories and features of these locations exhaustively is feasible and would probably lead to informative findings. Human interference or groundwater depletion due to climate change at those locations could be potential sources of the systematic error.
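
As a possible starting point for that investigation, the systematic error could be quantified separately for low-flow basins and for all other basins. The following is a minimal sketch rather than the code used in this study: the function name `mean_bias_by_group`, its `threshold` parameter, and the sample values are hypothetical and serve only to illustrate the comparison.

```python
def mean_bias_by_group(observed, predicted, threshold=1.0):
    """Mean signed error (predicted - observed) for low-flow and other basins.

    `threshold` is the low-flow cutoff in m3/s; 1.0 mirrors the cutoff
    mentioned in the text. A positive mean bias indicates over-prediction.
    """
    low, other = [], []
    for obs, pred in zip(observed, predicted):
        # Group each basin by its observed flow, then store the signed error.
        (low if obs < threshold else other).append(pred - obs)

    def mean(values):
        return sum(values) / len(values) if values else float("nan")

    return {"low_flow": mean(low), "other": mean(other), "n_low": len(low)}


# Made-up illustrative flows in m3/s, not data from the study.
observed = [0.2, 0.5, 0.8, 5.0, 20.0]
predicted = [0.6, 0.9, 1.0, 4.5, 21.0]
result = mean_bias_by_group(observed, predicted)
print(result)
```

A consistently positive low-flow bias across many basins would support the over-prediction pattern described above, and the basins contributing the largest errors would be natural candidates for the detailed site-by-site examination.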