Using Machine Learning with Fisheries Data
Caitlin Allen Akselrud; Alaska Fisheries Science Center, NOAA
Tuesday, November 19th, 09:30 AM PST
https://attendee.gotowebinar.com/register/4169976356159403096
Machine learning methods such as random forest regression models are useful tools in ecology when applied correctly, although features inherent to ecological data sets can lead to overfitting or uncertain predictions. Here, a set of methods is outlined to account for temporal autocorrelation and for sparse, short, or missing data in random forest predictions. Methods are also provided for estimating prediction uncertainty arising from the combination of inherent randomness in the random forest algorithm and sparse input data. This suite of methods was used to generate pre-season predictions of total catches, with uncertainty, for California market squid (Doryteuthis opalescens), the most valuable fishery in California by ex-vessel value. The methodology presented in this analysis is robust, incorporating key cross-validation and hyperparameter tuning techniques from across disciplines, and flexible, making it applicable to ecological and fisheries datasets beyond market squid.
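For readers unfamiliar with cross-validation on autocorrelated time series, the following is a minimal sketch of one common approach, expanding-window (forward-chaining) splits, in which each test year follows every training year. The abstract does not state which scheme the paper uses, so this is an illustrative assumption rather than the author's method; the function name `expanding_window_splits` and its parameters are hypothetical.

```python
from typing import Iterator, List, Tuple


def expanding_window_splits(
    n_obs: int, min_train: int, horizon: int = 1
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx) pairs for time-ordered observations.

    Hypothetical helper for illustration only. Every test index comes
    after all training indices, so no future observation is used to
    predict the past.
    """
    if min_train < 1 or horizon < 1:
        raise ValueError("min_train and horizon must be at least 1")
    # 'end' is the first index held out; the training window grows each step.
    for end in range(min_train, n_obs - horizon + 1):
        train_idx = list(range(end))
        test_idx = list(range(end, end + horizon))
        yield train_idx, test_idx


# Example: six yearly observations, at least three years of training data.
splits = list(expanding_window_splits(n_obs=6, min_train=3))
for train_idx, test_idx in splits:
    print(train_idx, "->", test_idx)
```

In practice, a model would be fitted on each training window and scored on the held-out year that follows it. scikit-learn's `TimeSeriesSplit` offers a comparable splitter for those who prefer a library implementation.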
The talk will be based on Caitlin’s recent publication in Fisheries Research.
Akselrud, C.I.A., 2024. Random forest regression models in ecology: Accounting for messy biological data and producing predictions with uncertainty. Fisheries Research, 280, p.107161.
Additional information: Caitlin Allen Akselrud is a stock assessment scientist at NMFS SWFSC in La Jolla. Her PhD research is co-advised by Trevor Branch and Andre Punt.