Estimating Cetacean Bycatch From Non-representative Samples (I): A Simulation Study With Regularized Multilevel Regression and Post-stratification

Citation
Authier M, Rouby E, Macleod K (2021) Estimating Cetacean Bycatch From Non-representative Samples (I): A Simulation Study With Regularized Multilevel Regression and Post-stratification. Frontiers in Marine Science 8:1459. https://doi.org/10.3389/fmars.2021.719956

Abstract
Bycatch, the non-intentional capture or killing of non-target species in commercial or recreational fisheries, is a worldwide threat to protected, endangered or threatened species (PETS) of marine megafauna. Obtaining accurate bycatch estimates of PETS is challenging: the only data available may come from non-dedicated schemes, and may not be representative of the whole fisheries effort. We investigated, with simulated data, a model-based approach for estimating PETS bycatch from non-representative samples. We leveraged recent developments in the statistical analysis of surveys, namely regularized multilevel regression with post-stratification, to infer total bycatch under realistic scenarios of data sampling such as under-sampling or over-sampling when PETS bycatch risk is high. Post-stratification is a survey technique that re-aligns the sample with the population and so addresses the problem of non-representative samples. It requires subdividing the population of interest into potentially hundreds of cells corresponding to the cross-classification of important attributes. Multilevel regression accommodates this data structure, and the statistical technique of regularization can be used to predict for each of these hundreds of cells. We illustrated these statistical ideas by modeling bycatch risk for each week within a year with as few as a handful of observed PETS bycatch events. The model-based approach led to improvements, under mild assumptions, in both the accuracy and the precision of estimates, and was more robust to non-representative samples than the more design-based methods currently in use. In our simulations, there were no detrimental effects of using the model-based approach even when sampling was representative. Estimating PETS bycatch ideally requires dedicated observer schemes and adequate coverage of fisheries effort. We showed how a model-based approach combining sparse data typical of PETS bycatch and recent methodological developments can help when both dedicated observer schemes and adequate coverage are challenging to implement.
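
The re-alignment that post-stratification performs can be illustrated with a minimal Python sketch. It is not the regularized multilevel model of the paper: it only contrasts a pooled ratio estimate with a plain post-stratified estimate. The three cells, the effort figures, and the function names `ratio_estimate` and `poststratified_estimate` are all hypothetical and chosen for illustration; in the paper's setting a cell would be, for example, a week of the year.

```python
# Hypothetical toy data, not taken from the paper.
# Total fishing effort (days at sea) in each cell of the population.
population_effort = {"winter": 1000.0, "spring": 3000.0, "summer": 6000.0}

# Observed sample per cell: (observed days, bycatch events).
# Sampling is deliberately skewed towards the high-risk cell ("winter"),
# so the sample is not representative of the population effort.
sample = {
    "winter": (200.0, 10),
    "spring": (50.0, 1),
    "summer": (50.0, 0),
}


def ratio_estimate(sample, population_effort):
    """Pooled bycatch rate times total effort; ignores the cell structure."""
    observed_days = sum(days for days, _ in sample.values())
    events = sum(n for _, n in sample.values())
    return events / observed_days * sum(population_effort.values())


def poststratified_estimate(sample, population_effort):
    """Sum over cells of (cell bycatch rate) x (cell effort in the population)."""
    total = 0.0
    for cell, effort in population_effort.items():
        days, events = sample[cell]
        total += events / days * effort  # raw per-cell rate, no regularization
    return total


if __name__ == "__main__":
    print("pooled ratio estimate:   ", ratio_estimate(sample, population_effort))
    print("post-stratified estimate:", poststratified_estimate(sample, population_effort))
```

With these made-up numbers the pooled estimate is about 367 events while the post-stratified estimate is 110, because over-sampling the high-risk cell inflates the pooled rate. The sketch also shows why raw per-cell rates break down with sparse data: a cell with no observed event contributes exactly zero, and a cell with no observed days cannot be estimated at all. This is the gap that multilevel regression with regularization is meant to fill when there are hundreds of cells.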