MODELING SETTING

Libraries

Libraries necessary to made this analysis;

# install.packages('devtools')
# devtools::install_github('r4ss/r4ss',
# ref='development') install.packages('caTools')
# library('caTools') install.packages('r4ss')
library(r4ss)
library(here)
# remotes::install_github('PIFSCstockassessments/ss3diags')
library(ss3diags)
library(kableExtra)
library(doParallel)
detectCores()
## [1] 16
registerDoParallel(8)
library(ggpubr)
library(tibble)
library(openxlsx)
library(ggthemes)
library(forecast)
library(tidyr)
library(mixR)
library(readxl)
library(tidyverse)
library(ggridges)
library(ggrepel)
library(broom)
library(lmtest)
library(flextable)
library(car)

Code Repository

The repository with files templates by scenario to replicate this analysis can be found in this GitHub author link

dir1.1 <- here("s1.1")  # no predador- environmental
dir1.2 <- here("s1.2")  # s1.1 w/ predador
dir1.3 <- here("s1.3")  # s1.1 w/ env
dir1.4 <- here("s1.4")  # s1.1 predator and env
Figs <- here("Figs")

OVERVIEW

This study aims to evaluate the impact of ecosystem components—such as environmental variables and predator-prey interactions—on the productivity and key population dynamics of Euphausia superba in Subarea 48.1. By incorporating these factors into the assessment, we analyze how krill population variables respond to ecological variability, providing insights into their resilience and potential management implications. This code forms part of the supplementary material for the scientific article titled “Assessing environmental and predator impacts on Antarctic Krill (Euphausia superba) population dynamics from an integrated length-to-age assessment model perspective.”

Spatial dimension of stock assessment

The study focuses on Subarea 48.1 in the Western Antarctic Peninsula, where most krill fishing occurs and CCAMLR is advancing spatially refined management. We use five CCAMLR-defined strata to increase spatial resolution, allowing detection of regional differences in krill dynamics (Figure 1). This structure, integrated into the stock assessment model, enhances its capacity to capture spatial variability in both biological and environmental data. In this approach, spatial structure is incorporated implicitly by treating different areas as separate fleets (Nielsen et al., 2021; Waterhouse et al., 2014). This constitutes a spatially implicit modeling framework, where differences among strata are recognized both from the perspective of krill population dynamics and from the influence of environmental variability within Subarea 48.1.

Subarea 48.1 and management strata considered in the spatio-temporal analysis of intrinsic productivity of Krill (BS=Brainsfield Strait, EI= Elephant Island, Gerlache= Gerlache strait, JOIN= Joinville Island, SSWI= South West)

Figure 1: Subarea 48.1 and management strata considered in the spatio-temporal analysis of intrinsic productivity of Krill (BS=Brainsfield Strait, EI= Elephant Island, Gerlache= Gerlache strait, JOIN= Joinville Island, SSWI= South West)

Figure 2 illustrates a conceptual model with the spatial distribution and movement of Antarctic krill in the Western Antarctic Peninsula (WAP). Adult krill are concentrated and move primarily in the northern areas, while juvenile krill dominate the southern regions. Arrows indicate directional flows, suggesting ontogenetic or environmentally-driven migration patterns. In terms of assumptions, although there are spatial differences in population structure, all these areas are considered part of a single, closed population unit upon which the stock assessment is conducted. By leveraging a spatially implicit, ecosystem-informed approach, this assessment provides a robust framework for evaluating krill stock dynamics under changing environmental conditions. These insights are crucial for informing sustainable management strategies in the Antarctic Peninsula region, where krill plays a foundational role in the marine food web.

Conceptual model used to model dynamics population in Antarctic krill in WAP

Figure 2: Conceptual model used to model dynamics population in Antarctic krill in WAP

Statistical Model (SS3)

Stock Synthesis (v.3.30.21) is a widely used tool for assessing fish and invertebrate populations, including Antarctic krill. SS3 is implemented in C++ with estimation enabled through automatic differentiation (ADMB) (Fournier et al., 2012; Methot & Wetzel, 2013). The spurce code can be find in Github SS3 repository. In this exercise, SS3 is configured as an integrated stock assessment model, explicitly accounting for age and size structure while incorporating key ecosystem drivers. The model simulates population processes such as growth, maturity, fecundity, recruitment, movement, and mortality, while also integrating environmental variability and predator-prey relationships to refine estimates of population trends in krill. The analysis of model outputs is conducted using R, utilizing the r4ss and ss3diags packages (Taylor, 2019; Winker et al., 2024). Integrated models can effectively capture the age structure by transforming length observations into population-level dynamics (Lee et al., 2024; Andre Punt et al., 2013).

On the other hand, length data, which are cost-effective and readily available (Canales et al., 2021; Chong et al., 2019), provide valuable insights into krill population structure due to their correlation with age (Thanassekos et al., 2014). For made this transformation, the models often use an Age-Length Keys (ALKs). An ALK is a probabilistic matrix that estimates the likelihood of individuals of a given length belonging to specific age classed and this method assume both aged and measured fish are random samples from the same population and should be applied within the same time period to avoid biases (ICCAT, 2003; Lee et al., 2024; André Punt, 2003; Andre Punt et al., 2013). The ALK matrix is typically defined as:

In a catch-at-length model like krill assessment the AKL matrix (Figure 3) is modeled trough parametrization process and have this shape;

Representation of ALK Matrix to krill in 48.1

Figure 3: Representation of ALK Matrix to krill in 48.1

Data

Parameters

The following table summarizes the key parameters to conditioning the reference model, including biological, growth, and population dynamics factors.

Table 1: Input parameters for the initial SS3 model of krill. Each parameter line contains a minimum value (LO), maximum value (HI), and initial value (INIT). If the phase (PHASE) for the parameter is negative, the parameter is fixed as input
LO HI INIT PHASE
Natural Mortality
Nat M 0.20 1.00 0.270 -3
Growth
Lmin 0.00 5.00 3.400 -2
Lmax 1.00 10.00 5.000 -4
VonBert K 0.05 0.80 0.470 -4
CV young 0.05 0.25 0.140 -4
CV old 0.05 0.25 0.070 -4
relationship Length-Weigth
Wt a 0.00 3.00 0.000 -3
Wt b 1.00 4.00 3.347 -3
Maturity
L50% 0.20 5.00 1.800 -4
Mat slope -3.00 3.00 -2.900 -4
S-R relation
SR_LN(R0) 3.00 30.00 23.000 1
SR_BH_steep 0.20 1.00 0.850 -4
SR_sigmaR 0.00 2.00 1.200 -4
SR_regime -5.00 5.00 0.000 -4
SR_autocorr 0.00 0.00 0.000 -99
Catchability
LnQ_base_FISHERYBS(1) -25.00 25.00 -5.722 1
LnQ_base_FISHERYEI(2) -25.00 25.00 -5.722 1
Selectivity
SizeSel_P_1_FISHERYBS(1) 0.01 8.00 2.000 -3
SizeSel_P_2_FISHERYBS(1) 0.00 8.00 2.000 -2
SizeSel_P_1_FISHERYEI(2) 0.01 8.00 3.500 -3
SizeSel_P_2_FISHERYEI(2) 0.00 8.00 4.000 -2
SizeSel_P_1_FISHERYGS(3) 0.01 8.00 2.000 -3
SizeSel_P_2_FISHERYGS(3) 0.00 8.00 2.000 2
SizeSel_P_1_FISHERYJOIN(4) 0.01 8.00 3.500 -3
SizeSel_P_2_FISHERYJOIN(4) 0.00 8.00 2.000 -2
SizeSel_P_1_FISHERYSSIW(5) 0.01 8.00 3.500 -3
SizeSel_P_2_FISHERYSSIW(5) 0.00 8.00 2.000 -2
SizeSel_P_1_SURVEYBS(6) 1.00 7.00 2.000 -2
SizeSel_P_2_SURVEYBS(6) 1.00 7.00 1.000 -3
SizeSel_P_1_SURVEYEI(7) 1.00 7.00 3.000 -2
SizeSel_P_2_SURVEYEI(7) 1.00 7.00 1.000 -3
SizeSel_P_1_SURVEYGS(8) 1.00 7.00 2.000 -2
SizeSel_P_2_SURVEYGS(8) 1.00 7.00 1.000 -3
SizeSel_P_1_SURVEYJOIN(9) 1.00 7.00 3.000 2
SizeSel_P_2_SURVEYJOIN(9) 1.00 7.00 1.000 3
SizeSel_P_1_SURVEYSSIW(10) 1.00 7.00 2.000 -2
SizeSel_P_2_SURVEYSSIW(10) 1.00 7.00 1.000 -3
SizeSel_P_1_PREDATOR(11) 0.00 3.00 0.200 2
SizeSel_P_2_PREDATOR(11) 0.00 3.00 0.200 3

Index

Abundance index in Figure 4

Standardized indices of krill index abundance and consumption from fishery-dependent, fishery-independent, and predator-based data sources across different strata within Subarea 48.1. Each panel represents a distinct spatial or functional stratum, with trend lines indicating temporal variation from 1990 to 2020. Colors denote data source categories: green for fishery, orange for scientific surveys, and purple for predator-based indices. These patterns highlight spatial and temporal heterogeneity in krill dynamics across the subarea.

Figure 4: Standardized indices of krill index abundance and consumption from fishery-dependent, fishery-independent, and predator-based data sources across different strata within Subarea 48.1. Each panel represents a distinct spatial or functional stratum, with trend lines indicating temporal variation from 1990 to 2020. Colors denote data source categories: green for fishery, orange for scientific surveys, and purple for predator-based indices. These patterns highlight spatial and temporal heterogeneity in krill dynamics across the subarea.

Length comositions

Length compositions in Figure 5

Annual length-frequency distributions of Antarctic krill (Euphausia superba) across different data sources and spatial strata within Subarea 48.1 from 1991 to 2020. Each panel represents a distinct stratum for either fishery-dependent (green), fishery-independent survey (orange), or predator-based (purple) observations. Density ridgelines illustrate variation in krill size structure across years. The red vertical line marks a recruit references length (3.6 cm).

Figure 5: Annual length-frequency distributions of Antarctic krill (Euphausia superba) across different data sources and spatial strata within Subarea 48.1 from 1991 to 2020. Each panel represents a distinct stratum for either fishery-dependent (green), fishery-independent survey (orange), or predator-based (purple) observations. Density ridgelines illustrate variation in krill size structure across years. The red vertical line marks a recruit references length (3.6 cm).

Summary data sources to Modelling

Finally, this information and all sources can be represented through the following flow diagram (Figure 6) of inputs, model, and outputs.

Framework path to stock assessment model in krill in WAP (Yellow boxes is not implemeted yet).

Figure 6: Framework path to stock assessment model in krill in WAP (Yellow boxes is not implemeted yet).

Figure ?? show time series of differente componentes of data sources to this krill stock assessment.

Scenarios

Here, the reference model is s1.1, wich represents a baseline assessment of Euphausia superba population dynamics in Subarea 48.1, excluding environmental and ecological variables. This model assumes that krill productivity and population parameters are driven by intrinsic biological processes, such as growth, mortality, and recruitment and fishery impacts without accounting for external influences like environmental variability or predation pressure. By serving as a control scenario, this model provides a benchmark against which the impact of ecosystem components in productivity can be evaluated, allowing for a direct comparison of how environmental and ecological factors influence krill stock dynamics.

Table ?? show scenarios used for modelling dynamics in krill

Table 2: Scenarios used for modelling dynamics in krill
Scenario Description
s1.1 Spatial data without environmental and predator components
s1.2 “s1.1” with predator components
s1.3 “s1.1” with environmental variable
s1.4 “s1.1” with both predator fleet and environmental variable

RESULTS

Population Variables by scenario

Main Variables poulation in s1.1 scenario (Figure 7)

Main variables scenario s1.1

Figure 7: Main variables scenario s1.1

Main Variables poulation in s1.2 scenario (Figure 8)

Main variables scenario s1.2

Figure 8: Main variables scenario s1.2

Main Variables poulation in s1.3 scenario (Figure 9)

Main variables scenario s1.3

Figure 9: Main variables scenario s1.3

Main Variables poulation in s1.4 scenario (Figure 10)

Main variables scenario s1.4

Figure 10: Main variables scenario s1.4

Selectivity estimated by scenario in Figure 11.

Selectivity by fleet in each scenario

Figure 11: Selectivity by fleet in each scenario

This Figure 12 shows standardized time series of input indices used in four different model scenarios (s1.1 to s1.4) for the stock assessment of Antarctic krill in Subarea 48.1. Each panel presents fishery-dependent (FISHERY) and fishery-independent (SURVEY) indices across five management strata: Bransfield Strait (BS), Elephant Island (EI), Gerlache Strait (GS), Joinville Island (JOIN), and South West (SW), from the mid-1990s to 2020. Scenario s1.4 additionally includes a predator index, highlighting the incorporation of ecosystem variables. The figure illustrates spatial and temporal variability in data availability and trends among sources.

Standardized indices of krill abundance used as input in four model scenarios (s1.1 to s1.4), representing fishery-dependent (FISHERY) and fishery-independent (SURVEY) data across five spatial strata: Bransfield Strait (BS), Elephant Island (EI), Gerlache Strait (GS), Joinville Island (JOIN), and South West (SW). Scenario s1.4 also incorporates a predator index (PREDATOR), reflecting the integration of ecosystem variables into the assessment framework

Figure 12: Standardized indices of krill abundance used as input in four model scenarios (s1.1 to s1.4), representing fishery-dependent (FISHERY) and fishery-independent (SURVEY) data across five spatial strata: Bransfield Strait (BS), Elephant Island (EI), Gerlache Strait (GS), Joinville Island (JOIN), and South West (SW). Scenario s1.4 also incorporates a predator index (PREDATOR), reflecting the integration of ecosystem variables into the assessment framework

Outputs Variables s1.1

Table 3: Main variables outputs from stock asssessment krill in WAP s1.1
Yr Era Seas Bio_all Bio_smry SpawnBio Recruit_0
1254 1989 VIRG 1 2184230 2176360 3565180 107962000
1255 1990 INIT 1 1498570 1490900 2259200 105278000
1256 1991 TIME 1 1498570 1490900 2259200 105278000
1257 1992 TIME 1 1616910 1609190 2452620 105844000
1258 1993 TIME 1 1723130 1715370 2661750 106370000
1259 1994 TIME 1 1814140 1806360 2839670 106759000
1260 1995 TIME 1 1889790 1881980 2987350 107049000
1261 1996 TIME 1 1951400 1943580 3107870 107266000
1262 1997 TIME 1 2000910 1993080 3204890 107430000
1263 1998 TIME 1 2031670 2031470 3280300 2722690
1264 1999 TIME 1 1950370 1950080 3329570 3861330
1265 2000 TIME 1 1709860 1703160 3365450 91895800
1266 2001 TIME 1 1523110 1504210 2802010 259242000
1267 2002 TIME 1 1638890 1627610 2264810 154705000
1268 2003 TIME 1 1971710 1971470 2294470 3332310
1269 2004 TIME 1 2079960 2078730 3317380 16879800
1270 2005 TIME 1 1916790 1913490 3747190 45270000
1271 2006 TIME 1 1724010 1708750 3238060 209287000
1272 2007 TIME 1 1640040 1631050 2612060 123361000
1273 2008 TIME 1 1833350 1832360 2343170 13566800
1274 2009 TIME 1 1869200 1864810 3037610 60221700
1275 2010 TIME 1 1731220 1719710 3226670 157914000
1276 2011 TIME 1 1606630 1586690 2567950 273609000
1277 2012 TIME 1 1905570 1872960 2385900 447447000
1278 2013 TIME 1 2520170 2519760 2744410 5517500
1279 2014 TIME 1 2918840 2914270 3602640 62737200
1280 2015 TIME 1 2758580 2740280 5255400 251086000
1281 2016 TIME 1 2672990 2619030 4425950 740188000
1282 2017 TIME 1 3269820 3269180 3819450 8874990
1283 2018 TIME 1 4057150 3997160 4289280 823040000
1284 2019 TIME 1 4780510 4672090 7517290 1487330000
1285 2020 TIME 1 6934500 6933840 6701380 8978140
1286 2021 FORE 1 8884260 8876160 10056300 111127000
1287 2022 FORE 1 8472730 8464580 16411100 111824000
1288 2023 FORE 1 7532870 7524730 14262700 111657000
1289 2024 FORE 1 6398190 6390070 11986300 111416000
1290 2025 FORE 1 5398930 5390840 9988720 111115000

Outputs Variables s1.2

Table 4: Main variables outputs from stock asssessment krill in WAP in s1.2
Yr Era Seas Bio_all Bio_smry SpawnBio Recruit_0
1354 1989 VIRG 1 2299170 2268860 2213450 415800000
1355 1990 INIT 1 2140700 2110590 1919210 413006000
1356 1991 TIME 1 2140700 2110590 1919210 413006000
1357 1992 TIME 1 2210510 2180300 2052100 414362000
1358 1993 TIME 1 2249080 2218830 2126300 415049000
1359 1994 TIME 1 2270970 2240690 2163880 415380000
1360 1995 TIME 1 2283380 2253090 2185490 415565000
1361 1996 TIME 1 2290340 2260040 2197820 415669000
1362 1997 TIME 1 2294230 2263930 2204700 415727000
1363 1998 TIME 1 2266810 2265650 2207770 15962500
1364 1999 TIME 1 1867210 1863690 2202490 48340700
1365 2000 TIME 1 1196690 1191800 2199340 67016200
1366 2001 TIME 1 828069 768871 1232060 812129000
1367 2002 TIME 1 1347830 1298870 737765 671607000
1368 2003 TIME 1 2369680 2367790 534842 25993000
1369 2004 TIME 1 2358800 2347400 2222130 156448000
1370 2005 TIME 1 1692350 1681210 2907030 152806000
1371 2006 TIME 1 1334460 1282450 1706020 713509000
1372 2007 TIME 1 1587260 1564820 1167630 307877000
1373 2008 TIME 1 2058080 2050430 968864 104953000
1374 2009 TIME 1 1799590 1784940 2225020 200938000
1375 2010 TIME 1 1401810 1386880 1981840 204876000
1376 2011 TIME 1 1141310 1130440 1172540 149095000
1377 2012 TIME 1 1076390 1042010 1058820 471730000
1378 2013 TIME 1 1258030 1225100 995034 451783000
1379 2014 TIME 1 1685050 1599720 774750 1170600000
1380 2015 TIME 1 2619570 2612840 1328470 92363200
1381 2016 TIME 1 2961350 2947640 1691030 188112000
1382 2017 TIME 1 2137730 2126280 3483620 157048000
1383 2018 TIME 1 1632790 1493160 2050430 1915550000
1384 2019 TIME 1 3002460 2903520 1450160 1357410000
1385 2020 TIME 1 5247580 5235720 1111080 162708000
1386 2021 FORE 1 5161170 5130090 5069530 426398000
1387 2022 FORE 1 3655000 3623850 5768290 427421000
1388 2023 FORE 1 2833070 2802300 3345400 422101000
1389 2024 FORE 1 2415010 2384550 2502690 417931000
1390 2025 FORE 1 2214670 2184410 2127620 415061000

Outputs Variables s1.3

Table 5: Main variables outputs from stock asssessment krill in WAP in s1.3
Yr Era Seas Bio_all Bio_smry SpawnBio Recruit_0
1305 1989 VIRG 1 1353400 1348530 2209070 66896000
1306 1990 INIT 1 924676 919929 1365380 65120800
1307 1991 TIME 1 924676 919929 1365380 65120800
1308 1992 TIME 1 1011120 1006340 1537670 65631800
1309 1993 TIME 1 1081550 1076740 1677420 65973500
1310 1994 TIME 1 1138370 1133540 1787730 66207600
1311 1995 TIME 1 1183710 1178870 1876090 66376300
1312 1996 TIME 1 1219630 1214790 1946320 66500000
1313 1997 TIME 1 1248020 1243170 2001930 66592100
1314 1998 TIME 1 1265270 1264970 2044760 4181950
1315 1999 TIME 1 1215210 1214650 2069470 7762690
1316 2000 TIME 1 1072930 1068190 2087520 64986400
1317 2001 TIME 1 973382 957638 1743440 215996000
1318 2002 TIME 1 1110280 1101160 1435720 125135000
1319 2003 TIME 1 1430490 1430440 1496900 633499
1320 2004 TIME 1 1555180 1553690 2431110 20543200
1321 2005 TIME 1 1457770 1454920 2841870 39014900
1322 2006 TIME 1 1337590 1324930 2465110 173694000
1323 2007 TIME 1 1282230 1275120 2003690 97545800
1324 2008 TIME 1 1458050 1457180 1827320 11817800
1325 2009 TIME 1 1495670 1492280 2433790 46474800
1326 2010 TIME 1 1380840 1374560 2577760 86096400
1327 2011 TIME 1 1216580 1201820 2008220 202555000
1328 2012 TIME 1 1370620 1354010 1858140 227851000
1329 2013 TIME 1 1695600 1695400 1915330 2806810
1330 2014 TIME 1 1790520 1786470 2457810 55596900
1331 2015 TIME 1 1629840 1613780 3067190 220350000
1332 2016 TIME 1 1591930 1567500 2435350 335134000
1333 2017 TIME 1 1924590 1917310 2100960 99782500
1334 2018 TIME 1 2324970 2262080 2686890 862721000
1335 2019 TIME 1 3242040 3138410 3990540 1421630000
1336 2020 TIME 1 5618600 5617370 4015810 16881500
1337 2021 FORE 1 7720710 7715680 8064010 69109800
1338 2022 FORE 1 7521960 7516890 14572800 69497300
1339 2023 FORE 1 6680720 6675660 12858000 69433000
1340 2024 FORE 1 5627440 5622390 10753000 69326200
1341 2025 FORE 1 4663130 4658090 8824990 69184200

Outputs Variables s1.4

Table 6: Main variables outputs from stock asssessment krill in WAP in s1.4
Yr Era Seas Bio_all Bio_smry SpawnBio Recruit_0
1305 1989 VIRG 1 1353400 1348530 2209070 66896000
1306 1990 INIT 1 924676 919929 1365380 65120800
1307 1991 TIME 1 924676 919929 1365380 65120800
1308 1992 TIME 1 1011120 1006340 1537670 65631800
1309 1993 TIME 1 1081550 1076740 1677420 65973500
1310 1994 TIME 1 1138370 1133540 1787730 66207600
1311 1995 TIME 1 1183710 1178870 1876090 66376300
1312 1996 TIME 1 1219630 1214790 1946320 66500000
1313 1997 TIME 1 1248020 1243170 2001930 66592100
1314 1998 TIME 1 1265270 1264970 2044760 4181950
1315 1999 TIME 1 1215210 1214650 2069470 7762690
1316 2000 TIME 1 1072930 1068190 2087520 64986400
1317 2001 TIME 1 973382 957638 1743440 215996000
1318 2002 TIME 1 1110280 1101160 1435720 125135000
1319 2003 TIME 1 1430490 1430440 1496900 633499
1320 2004 TIME 1 1555180 1553690 2431110 20543200
1321 2005 TIME 1 1457770 1454920 2841870 39014900
1322 2006 TIME 1 1337590 1324930 2465110 173694000
1323 2007 TIME 1 1282230 1275120 2003690 97545800
1324 2008 TIME 1 1458050 1457180 1827320 11817800
1325 2009 TIME 1 1495670 1492280 2433790 46474800
1326 2010 TIME 1 1380840 1374560 2577760 86096400
1327 2011 TIME 1 1216580 1201820 2008220 202555000
1328 2012 TIME 1 1370620 1354010 1858140 227851000
1329 2013 TIME 1 1695600 1695400 1915330 2806810
1330 2014 TIME 1 1790520 1786470 2457810 55596900
1331 2015 TIME 1 1629840 1613780 3067190 220350000
1332 2016 TIME 1 1591930 1567500 2435350 335134000
1333 2017 TIME 1 1924590 1917310 2100960 99782500
1334 2018 TIME 1 2324970 2262080 2686890 862721000
1335 2019 TIME 1 3242040 3138410 3990540 1421630000
1336 2020 TIME 1 5618600 5617370 4015810 16881500
1337 2021 FORE 1 7720710 7715680 8064010 69109800
1338 2022 FORE 1 7521960 7516890 14572800 69497300
1339 2023 FORE 1 6680720 6675660 12858000 69433000
1340 2024 FORE 1 5627440 5622390 10753000 69326200
1341 2025 FORE 1 4663130 4658090 8824990 69184200

Comparision between variables

Main variables population by scenario (Figure 13)

Time series of different populations variables

Figure 13: Time series of different populations variables

Another comparison in R0 (Natural log of virgin recruitment level) (Figure 14)

R0 probability predicted by scenario

Figure 14: R0 probability predicted by scenario

Comparsion in sd long term time series forecasting Figure 15

Comparsion in sd long term time series forecasting

Figure 15: Comparsion in sd long term time series forecasting

Autocorrelation in Recruit

To evaluate the temporal correlation structure of krill recruitment under different model configurations, we performed an Autocorrelation Function (ACF) analysis (Figure 16). The ACF measures the correlation between observations at different time lags, helping to assess whether recruitment estimates exhibit persistence or randomness across time.

The analysis was conducted on recruitment estimates derived from four model scenarios:

A reference model without environmental or predator influences (Ref Model: No Env-Predator). A model incorporating predator data (S1.1 w/ Predator data). A model incorporating environmental data (S1.1 w/ Env data). A model incorporating both environmental and predator data (S1.1 w/ Env and Predator data). Each model’s residuals were extracted, and the autocorrelation function (ACF) was computed for a time lag range of up to 15 years. The dashed blue lines in the plots represent the 95% confidence intervals, indicating the threshold beyond which correlation values are statistically significant. If autocorrelation values remain within this range, it suggests that the recruitment estimates behave as a random process with no significant dependence on past values. Conversely, autocorrelation values exceeding these bounds indicate recruitment persistence or cyclic patterns.

Autocorrelation in Recruit by scenario

Figure 16: Autocorrelation in Recruit by scenario

The ACF plots indicate that the reference model (without environmental or predator data) exhibits weak but noticeable positive autocorrelation at certain lags, suggesting some degree of recruitment dependence over time. However, this autocorrelation does not appear strong or systematic.

The model incorporating only predator data shows a slight reduction in autocorrelation magnitude, suggesting that predator-driven recruitment variability may have captured part of the temporal structure in the data.

The model with only environmental data exhibits a further reduction in autocorrelation, implying that environmental variability explains a larger portion of recruitment trends than predator effects alone.

Finally, the model that includes both environmental and predator data presents the lowest autocorrelation values, with nearly all bars remaining within the confidence bounds. This suggests that incorporating both factors provides the most effective explanation of recruitment fluctuations, reducing unexplained temporal structure.

Overall, these results indicate that recruitment variability is at least partially driven by environmental and predator influences, and models integrating these factors provide more robust and independent recruitment estimates, minimizing systematic dependencies over time.

Relationship Stock-Recruit

\[ R = \frac{R_0 \cdot S}{S_0 (1 - h) + S (5h - 1)} \]

Where:
- \(R\) is the predicted recruitment.
- \(S\) is the spawning stock biomass.
- \(R_0\) is the recruitment at unfished equilibrium.
- \(S_0\) is the spawning biomass at unfished equilibrium.
- \(h\) is the steepness parameter (the proportion of \(R_0\) produced when \(S = 20\% \cdot S_0\)).

The blue line represents the Beverton–Holt stock-recruitment relationship, commonly used in fisheries models to describe the compensatory response of recruitment to changes in spawning biomass.

Productivity and Interannual Variability by Scenario

Estimating the productivity of Antarctic krill is critical for understanding the species’ capacity to replenish its population in response to varying levels of spawning biomass. Productivity, defined as the ratio of recruitment to spawning stock biomass, provides a standardized measure of reproductive success and population resilience under different ecological and fishing pressures. Comparing productivity across scenarios—each representing different assumptions about environmental drivers, fishing mortality, or predator dynamics—enables a robust evaluation of how krill populations reflect this changes.

For each scenario \(i\) and year \(t\), we computed the productivity as the ratio between recruitment and spawning stock biomass (SSB):

\[ \text{Productivity}_{i,t} = \frac{\text{Recruitment}_{i,t}}{\text{SSB}_{i,t}} \]

We also calculated the interannual percentage change in recruitment and SSB as:

\[ \text{Change in Recruitment}_{i,t} = \left( \frac{\text{Recruitment}_{i,t} - \text{Recruitment}_{i,t-1}}{\text{Recruitment}_{i,t-1}} \right) \times 100 \]

\[ \text{Change in SSB}_{i,t} = \left( \frac{\text{SSB}_{i,t} - \text{SSB}_{i,t-1}}{\text{SSB}_{i,t-1}} \right) \times 100 \]

These metrics allow us to analyze both the productivity and the temporal dynamics of the population under each scenario \(i\).

These two panels in Figure 17 provide insight into the stock-recruitment dynamics of Antarctic krill under four assessment scenarios, each incorporating different levels of ecosystem complexity. Left panel compares the predicted recruitment across years with corresponding spawning biomass values for the four model scenarios (s1.1 to s1.4). Scenario s1.1 (red) shows a relatively flat and optimistic Beverton-Holt relationship, with high recruitment predicted even at low levels of SSB. The other scenarios, particularly s1.3 (gray) and s1.4 (black), show much more constrained recruitment estimates and a more saturating relationship, where recruitment increases only slightly with increasing SSB. This suggests that incorporating environmental (s1.3) and predator (s1.4) effects leads to a more conservative and ecologically realistic recruitment dynamic. Year labels illustrate how certain years (e.g., 2004, 2016, 2022) deviate between scenarios, especially at low biomass levels.

Right panel displays recruitment efficiency (Recruitment/SSB) as a function of relative spawning biomass (SSB / SSB₀). In all models, recruitment efficiency is highest at very low biomass levels and declines as SSB increases, consistent with compensatory dynamics. However, s1.3 and s1.4 (which include environmental effects) show significantly lower overall productivity across all SSB levels compared to s1.1 and s1.2. This indicates that the inclusion of ecosystem drivers reduces the per-capita productivity of the stock, especially under depleted conditions.

Together, these plots indicate that ecosystem-informed models (s1.3 and s1.4) produce more constrained and precautionary estimates of krill productivity. The more optimistic recruitment predicted by s1.1 (and to a lesser extent s1.2) may overestimate stock resilience, particularly in low SSB conditions. These findings emphasize the importance of accounting for environmental variability and predator dynamics in stock assessment models to avoid overoptimistic management advice and better reflect the biological limits of the krill population.

Left panel: Stock–recruitment relationships for Antarctic krill under four assessment scenarios. Predicted recruitment as a function of spawning stock biomass (SSB), with fitted Beverton-Holt curves and annual labels for selected years. Right panel: Recruitment per unit of SSB plotted against relative SSB (SSB/SSBo), illustrating differences in per-capita productivity across models ( s1.1 (no ecosystem variables), s1.2 (with predator effects), s1.3 (with environmental variables), and s1.4 (with both predator and environmental effects))

Figure 17: Left panel: Stock–recruitment relationships for Antarctic krill under four assessment scenarios. Predicted recruitment as a function of spawning stock biomass (SSB), with fitted Beverton-Holt curves and annual labels for selected years. Right panel: Recruitment per unit of SSB plotted against relative SSB (SSB/SSBo), illustrating differences in per-capita productivity across models ( s1.1 (no ecosystem variables), s1.2 (with predator effects), s1.3 (with environmental variables), and s1.4 (with both predator and environmental effects))

Recruit deviation

Explotation Rate

Explotation rato (havest rate) in Figure 18

Harves rate by scenario in krill overtime

Figure 18: Harves rate by scenario in krill overtime

Diagnosis and robustness

A rigorous model diagnosis is essential to ensure the reliability and robustness of stock assessment models. The key steps for a good practice in model diagnosis include:

  1. Convergence Check: The model must reach a final convergence criterion of 1.0e-04 to ensure numerical stability and reliable parameter estimation.

  2. Residual Analysis: Both visual inspection and statistical metrics are used to evaluate model residuals, helping to detect patterns of bias or misfit.

  3. Retrospective Analysis: The Mohn’s rho parameter is used to assess the consistency of model estimates when sequentially removing recent years of data, identifying potential overestimation or underestimation trends.

  4. Likelihood Profile Analysis: This approach examines how the likelihood function behaves across a range of parameter values, providing insight into parameter uncertainty and model sensitivity.

This framework follows the recommendations outlined by Carvalho et al. (2021), aiming to enhance transparency and reproducibility in model evaluation.

Convergence Criteria

The convergence criterion used for model calibration is set to a final threshold of 0.0001 (or equivalently 1.0e-04). This criterion defines the minimum acceptable difference between successive model iterations. Convergence is considered achieved when the absolute change in the objective function value or key parameters falls below this threshold. A smaller convergence value ensures that the model achieves a high degree of accuracy and stability in its final estimates, indicating that further iterations are unlikely to result in significant changes to the parameter estimates.

Residuals

Figure ??, 20, 21 and 22 displays bubble plots of krill length-frequency distributions across different spatial strata and years. Each panel corresponds to a fishery-dependent or independent data source (e.g., FISHERYBS, SURVEYBS), and shows the proportional abundance of krill in different length classes (y-axis) over time (x-axis). The size of each circle is proportional to the relative frequency or standardized number of individuals in a given length bin for a specific year, providing a visual summary of size structure dynamics across regions and time periods.

Pearson residuals, comparing across fleets (plot 1 of 2) Closed bubbles are positive residuals (observed > expected) and open bubbles are negative residuals (observed < expected) s1.1

Figure 19: Pearson residuals, comparing across fleets (plot 1 of 2) Closed bubbles are positive residuals (observed > expected) and open bubbles are negative residuals (observed < expected) s1.1

Pearson residuals, comparing across fleets (plot 1 of 2) Closed bubbles are positive residuals (observed > expected) and open bubbles are negative residuals (observed < expected) s1.2

Figure 20: Pearson residuals, comparing across fleets (plot 1 of 2) Closed bubbles are positive residuals (observed > expected) and open bubbles are negative residuals (observed < expected) s1.2

Pearson residuals, comparing across fleets (plot 1 of 2) Closed bubbles are positive residuals (observed > expected) and open bubbles are negative residuals (observed < expected) s1.3

Figure 21: Pearson residuals, comparing across fleets (plot 1 of 2) Closed bubbles are positive residuals (observed > expected) and open bubbles are negative residuals (observed < expected) s1.3

Pearson residuals, comparing across fleets (plot 1 of 2) Closed bubbles are positive residuals (observed > expected) and open bubbles are negative residuals (observed < expected) s1.4

Figure 22: Pearson residuals, comparing across fleets (plot 1 of 2) Closed bubbles are positive residuals (observed > expected) and open bubbles are negative residuals (observed < expected) s1.4

This Figure 23 shows the Pearson residuals of predicted length distributions for krill across four modeling scenarios, each incorporating different levels of ecosystem complexity. The residuals are visualized by year and length bin, allowing an assessment of model fit over time and across the size structure of the population. In the “Without Ecosystem Variables” scenario (top left), substantial lack of fit is evident across multiple years and length classes, particularly in the late 1990s and early 2000s. Positive Pearson residuals indicate that the model tends to underestimate frequencies of certain length bins, especially among smaller and mid-sized individuals, suggesting poor representation of recruitment and growth dynamics in the absence of environmental and predator effects. The “With Predator” scenario (top right) shows some improvement in model fit, with slightly more balanced residuals. However, there are still consistent deviations in the mid-size ranges across multiple years, indicating that predation alone does not fully explain observed variability in length composition. In the “With Environment” scenario (bottom left), there is a noticeable reduction in residual magnitude across several years and size classes, implying that incorporating environmental covariates helps align the model more closely with observed size distributions. This is particularly apparent in the early 2000s, where the residuals are closer to zero across most bins. Finally, the “With Predator and Ecosystem” scenario (bottom right) shows the most consistent reduction in Pearson residuals across time and size structure. The distribution of residuals is more homogeneous and centered around zero, indicating that the combined inclusion of predator and environmental drivers leads to the best model fit among the four scenarios.

Overall, the results support the conclusion that ecosystem-informed models better capture the temporal and size-structured dynamics of krill, with the joint inclusion of predator and environmental variables yielding the most robust representation of observed length compositions.

Pearson residual by scenario and fleet

Figure 23: Pearson residual by scenario and fleet

Also, we can check density of residual distribution in Figure 24

Density of residual distribution by scenarios in krill model

Figure 24: Density of residual distribution by scenarios in krill model

Residual consistency

Residual analysis is a critical component of model diagnostics in stock assessments. It helps evaluate the fit of the model to observed data and detect potential biases or inconsistencies. This process is applied to both length composition data and abundance indices such as CPUE (Catch Per Unit Effort) and survey-derived estimates. For length composition data, residuals represent the difference between observed and model-predicted length distributions. The standardized residuals are calculated as the difference between observed and expected proportions at each length bin. These residuals are plotted by year to identify systematic trends, biases, or inconsistencies in the data. Ideally, they should be randomly distributed around zero, indicating no systematic over- or underestimation.

For abundance indices such as CPUE and fishery-independent surveys, residuals are analyzed to assess model fit and potential sources of bias. Residuals are computed as the difference between observed index values and those predicted by the model, typically standardized by dividing by the standard error to facilitate comparison across years. These residuals are then plotted over time to evaluate trends. A shaded confidence region, like the green area in the provided plot, represents expected variability, with outliers highlighted in red or other distinct markers. Persistent positive or negative residuals may indicate systematic bias in the model or data collection process.

Statistical diagnostics are also performed to check for autocorrelation in residuals, which can indicate potential model misspecifications. When mean residual values are close to zero, the model fit is considered unbiased. By integrating these residual analyses for both length and abundance indices, stock assessment models can be refined, improving their reliability and increasing confidence in the assessment results.

## 
##  Running Runs Test Diagnosics for Mean length 
## Plotting Residual Runs Tests

## 
## Runs Test stats by Mean length:
##         Index runs.p     test  sigma3.lo sigma3.hi type
## 1   FISHERYBS  0.218   Passed -0.1189633 0.1189633  len
## 2   FISHERYEI  0.013   Failed -0.2839347 0.2839347  len
## 3   FISHERYGS  0.912   Passed -0.1865475 0.1865475  len
## 4 FISHERYJOIN     NA Excluded         NA        NA  len
## 5 FISHERYSSIW  0.230   Passed -0.1176572 0.1176572  len
## 6    SURVEYBS  0.221   Passed -0.2220680 0.2220680  len
## 7    SURVEYEI  0.595   Passed -0.2119927 0.2119927  len
## 8    SURVEYGS  0.454   Passed -0.2928749 0.2928749  len
## 9  SURVEYJOIN  0.541   Passed -0.4156365 0.4156365  len
## 
##  Running Runs Test Diagnosics for Mean length 
## Plotting Residual Runs Tests
## 
## Runs Test stats by Mean length:
##          Index runs.p     test   sigma3.lo  sigma3.hi type
## 1    FISHERYBS  0.150   Passed -0.04986852 0.04986852  len
## 2    FISHERYEI  0.013   Failed -0.30357206 0.30357206  len
## 3    FISHERYGS  0.346   Passed -0.38742103 0.38742103  len
## 4  FISHERYJOIN     NA Excluded          NA         NA  len
## 5  FISHERYSSIW  0.230   Passed -0.14854188 0.14854188  len
## 6     SURVEYBS  0.001   Failed -0.29126384 0.29126384  len
## 7     SURVEYEI  0.627   Passed -0.22413735 0.22413735  len
## 8     SURVEYGS  0.409   Passed -0.35196215 0.35196215  len
## 9   SURVEYJOIN  0.500   Passed -0.44688062 0.44688062  len
## 10    PREDATOR  0.607   Passed -0.18346375 0.18346375  len

## 
##  Running Runs Test Diagnosics for Mean length 
## Plotting Residual Runs Tests

## 
## Runs Test stats by Mean length:
##         Index runs.p     test  sigma3.lo sigma3.hi type
## 1   FISHERYBS  0.744   Passed -0.0853166 0.0853166  len
## 2   FISHERYEI  0.013   Failed -0.2797884 0.2797884  len
## 3   FISHERYGS  0.179   Passed -0.2016250 0.2016250  len
## 4 FISHERYJOIN     NA Excluded         NA        NA  len
## 5 FISHERYSSIW  0.230   Passed -0.1211104 0.1211104  len
## 6    SURVEYBS  0.631   Passed -0.2273600 0.2273600  len
## 7    SURVEYEI  0.786   Passed -0.1820963 0.1820963  len
## 8    SURVEYGS  0.136   Passed -0.2945415 0.2945415  len
## 9  SURVEYJOIN  0.500   Passed -0.4188837 0.4188837  len
## 
##  Running Runs Test Diagnosics for Mean length 
## Plotting Residual Runs Tests
## 
## Runs Test stats by Mean length:
##          Index runs.p     test   sigma3.lo  sigma3.hi type
## 1    FISHERYBS  0.500   Passed -0.04736845 0.04736845  len
## 2    FISHERYEI  0.013   Failed -0.27981814 0.27981814  len
## 3    FISHERYGS  0.013   Failed -0.26753759 0.26753759  len
## 4  FISHERYJOIN     NA Excluded          NA         NA  len
## 5  FISHERYSSIW  0.064   Passed -0.11817640 0.11817640  len
## 6     SURVEYBS  0.001   Failed -0.29914741 0.29914741  len
## 7     SURVEYEI  0.627   Passed -0.22176277 0.22176277  len
## 8     SURVEYGS  0.198   Passed -0.36228949 0.36228949  len
## 9   SURVEYJOIN  0.500   Passed -0.44222599 0.44222599  len
## 10    PREDATOR  0.607   Passed -0.26332169 0.26332169  len

## 
##  Running Runs Test Diagnosics for Index 
## Plotting Residual Runs Tests
## 
## Runs Test stats by Index:
##          Index runs.p     test  sigma3.lo sigma3.hi type
## 1    FISHERYBS  0.001   Failed -0.8964068 0.8964068 cpue
## 2    FISHERYEI  0.100   Passed -1.1328816 1.1328816 cpue
## 3    FISHERYGS  0.797   Passed -1.9827363 1.9827363 cpue
## 4  FISHERYJOIN  0.140   Passed -1.1347812 1.1347812 cpue
## 5  FISHERYSSIW  0.001   Failed -0.7101892 0.7101892 cpue
## 6     SURVEYBS  0.333   Passed -2.9583720 2.9583720 cpue
## 7     SURVEYEI  0.771   Passed -2.6776080 2.6776080 cpue
## 8     SURVEYGS  0.448   Passed -3.2527164 3.2527164 cpue
## 9   SURVEYJOIN  0.001   Failed -3.1976327 3.1976327 cpue
## 10  SURVEYSSIW     NA Excluded         NA        NA cpue

## 
##  Running Runs Test Diagnosics for Index 
## Plotting Residual Runs Tests

## 
## Runs Test stats by Index:
##          Index runs.p     test  sigma3.lo sigma3.hi type
## 1    FISHERYBS  0.143   Passed -1.1809409 1.1809409 cpue
## 2    FISHERYEI  0.100   Passed -1.4337655 1.4337655 cpue
## 3    FISHERYGS  0.797   Passed -2.5816226 2.5816226 cpue
## 4  FISHERYJOIN  0.582   Passed -1.2562104 1.2562104 cpue
## 5  FISHERYSSIW  0.285   Passed -0.8795652 0.8795652 cpue
## 6     SURVEYBS  0.001   Failed -2.3451953 2.3451953 cpue
## 7     SURVEYEI  0.684   Passed -2.5556960 2.5556960 cpue
## 8     SURVEYGS  0.753   Passed -3.4160804 3.4160804 cpue
## 9   SURVEYJOIN  0.268   Passed -3.2923986 3.2923986 cpue
## 10  SURVEYSSIW     NA Excluded         NA        NA cpue
## 11    PREDATOR  0.001   Failed -0.5495260 0.5495260 cpue
## 
##  Running Runs Test Diagnosics for Index 
## Plotting Residual Runs Tests
## 
## Runs Test stats by Index:
##          Index runs.p     test  sigma3.lo sigma3.hi type
## 1    FISHERYBS  0.089   Passed -0.9904944 0.9904944 cpue
## 2    FISHERYEI  0.095   Passed -1.3238558 1.3238558 cpue
## 3    FISHERYGS  0.530   Passed -2.0047873 2.0047873 cpue
## 4  FISHERYJOIN  0.140   Passed -1.0957547 1.0957547 cpue
## 5  FISHERYSSIW  0.000   Failed -0.7353154 0.7353154 cpue
## 6     SURVEYBS  0.184   Passed -2.4900423 2.4900423 cpue
## 7     SURVEYEI  0.684   Passed -2.6110623 2.6110623 cpue
## 8     SURVEYGS  0.448   Passed -3.2351728 3.2351728 cpue
## 9   SURVEYJOIN  0.718   Passed -3.1774337 3.1774337 cpue
## 10  SURVEYSSIW     NA Excluded         NA        NA cpue

## 
##  Running Runs Test Diagnosics for Index 
## Plotting Residual Runs Tests

## 
## Runs Test stats by Index:
##          Index runs.p     test  sigma3.lo sigma3.hi type
## 1    FISHERYBS  0.219   Passed -1.1537055 1.1537055 cpue
## 2    FISHERYEI  0.095   Passed -1.3503280 1.3503280 cpue
## 3    FISHERYGS  0.240   Passed -2.3807098 2.3807098 cpue
## 4  FISHERYJOIN  0.500   Passed -1.3449663 1.3449663 cpue
## 5  FISHERYSSIW  0.028   Failed -0.8088713 0.8088713 cpue
## 6     SURVEYBS  0.001   Failed -2.3417309 2.3417309 cpue
## 7     SURVEYEI  0.684   Passed -2.6617553 2.6617553 cpue
## 8     SURVEYGS  0.753   Passed -3.3151881 3.3151881 cpue
## 9   SURVEYJOIN  0.268   Passed -3.4148453 3.4148453 cpue
## 10  SURVEYSSIW     NA Excluded         NA        NA cpue
## 11    PREDATOR  0.001   Failed -0.5495260 0.5495260 cpue

Residual Analysis and RMSE

Residual analysis of mean length data is a fundamental diagnostic tool in stock assessments. It helps evaluate whether the model provides an unbiased fit to the observed data and detects potential biases over time. In this figure, mean length residuals are plotted across years, differentiated by data source, including fishery-dependent (FISHERY) and fishery-independent (SURVEY) datasets, as well as predator-related observations (PREDATOR). The residuals represent the deviation of observed mean length from model-predicted values, standardized to facilitate interpretation.

The black line represents a locally estimated scatterplot smoothing (Loess) curve, which provides a trend line to visualize systematic deviations over time. The presence of persistent positive or negative trends in the residuals may indicate biases in the growth model, selectivity assumptions, or misrepresentation of recruitment variability. The gray bars highlight periods where residual variability is particularly high, suggesting potential inconsistencies between observed and predicted size structures.

RMSE quantifies the overall deviation between observed and predicted values, providing an aggregate measure of model fit. Lower RMSE values indicate better agreement between observed and predicted data. In fisheries stock assessment (Hurtado-ferro et al., 2015), RMSE thresholds for acceptable model performance typically range between 10% and 30%, depending on the data quality and complexity of the population dynamics being modeled. Values exceeding this range suggest potential biases, requiring further investigation into the model structure, parameter estimation, or data sources.

By analyzing residual patterns and RMSE values, the model can be refined to improve the accuracy of mean length predictions, ultimately enhancing the reliability of stock assessment outcomes and management recommendations (Figure 25).

## Plotting JABBA residual plot
## 
## RMSE stats by Index:
## Plotting JABBA residual plot
## 
## RMSE stats by Index:
## Plotting JABBA residual plot
## 
## RMSE stats by Index:
## Plotting JABBA residual plot
Time series of RMSE of length compositions by scenario

Figure 25: Time series of RMSE of length compositions by scenario

## 
## RMSE stats by Index:

Figure 26 show RMSE to index.

## Plotting JABBA residual plot
## 
## RMSE stats by Index:
## Plotting JABBA residual plot
## 
## RMSE stats by Index:
## Plotting JABBA residual plot
## 
## RMSE stats by Index:
## Plotting JABBA residual plot
Time series of RMSE of CPUE compositions by scenario

Figure 26: Time series of RMSE of CPUE compositions by scenario

## 
## RMSE stats by Index:

Comparision RMSE

The Figure 27 presents evaluation results for krill models across different scenarios (s1.1 to s1.4). The RMSE (Root Mean Square Error) values range from 0.5 to 1.5, with lower values indicating better model accuracy. The Residuals vary between -1.0 and 1.0, showing deviations between predicted and observed values. Smaller residuals suggest better model fit. Scenario s1.4 appears to have the lowest RMSE (0.5), indicating higher precision, while residuals near zero in some scenarios reflect balanced predictions.

RMSE and Residual values for krill model evaluation across scenarios (s1.1-s1.4), highlighting precision and prediction errors.

Figure 27: RMSE and Residual values for krill model evaluation across scenarios (s1.1-s1.4), highlighting precision and prediction errors.

This boxplot compares a summary of the Root Mean Square Error (RMSE) across four different models (s1.1, s1.2, s1.3, and s1.4). RMSE serves as an indicator of model accuracy, with lower values representing better predictive performance.

s1.1 exhibits the lowest median RMSE, suggesting it has the best overall fit among the four models. In contrast, Models 1.2, 1.3, and 1.4 show higher RMSE values, indicating comparatively lower predictive accuracy. The interquartile range (IQR) of these models is relatively similar, suggesting comparable variability in RMSE across models. Additionally, s1.1 has an outlier above 1.5 RMSE, which could indicate a case where the model’s predictions deviated significantly from observed values.

Overall, this analysis highlights that incorporating different environmental or predator-related variables in the models impacts their predictive ability. The differences in RMSE suggest that some models may overfit or underfit the recruitment patterns of krill, emphasizing the need to refine model selection based on ecological and statistical considerations.

Retrospective Analysis in Model Evaluation

Retrospective analyses provide insights into the differences in estimation patterns (underestimation or overestimation) among the models evaluated. These analyses assess the consistency and reliability of stock assessment models by systematically removing the most recent years of data and comparing the resulting estimates with the full dataset. In this study, we conducted a retrospective analysis to examine the sensitivity of our recruitment and spawning stock biomass (SSB) estimates to the inclusion or exclusion of recent data. By applying this approach to multiple models, we identified potential biases and evaluated the stability of the recruitment estimates over time. The retrospective patterns were assessed by calculating the relative error between the predictions of truncated datasets and the full dataset. These differences allowed us to detect trends in model performance, such as systematic overestimation or underestimation of key population parameters.

Using retro() and SSplotRetro() functions, we obtain main results

Retrospective analysis for spawning biomass (Figure 28)

Retrospective analysis for spawning biomass by scenario in krill

Figure 28: Retrospective analysis for spawning biomass by scenario in krill

Retrospective analysis for fishing mortality (Figure 29)

Retrospective analysis for fishing mortality by scenario in krill

Figure 29: Retrospective analysis for fishing mortality by scenario in krill

Figure 30 presents the retrospective bias (Rho) in Spawning Stock Biomass (SSB) and Fishing Mortality (F) across different peel years (2019–2016) and for the combined period under four model scenarios (s1.1 to s1.4). In the SSB panel, all models exhibit negative Rho values across the years, indicating a downward bias. The magnitude of the bias varies among models, with s1.3 and s1.4 showing the most pronounced deviations. In the F panel, Rho values are consistently positive, suggesting an upward bias. The combined Rho values for each metric confirm these trends, showing persistent differences among model scenarios.

Summary of retrospective analisis by scenario in F and SSB

Figure 30: Summary of retrospective analisis by scenario in F and SSB

See Table @ref(tab:rho_parameters) for details.

(#tab:rho_parameters)Rho parameter by model and quantity (SSB and F)
Model Quant Rho.type Rho.peel Rho.Rho Rho.ForcastRho
s1.1 SSB SSB 2019 -0.2970246 -0.2944080
s1.1 SSB SSB 2018 -0.2707308 -0.5103302
s1.1 SSB SSB 2017 -0.1872940 -0.3297243
s1.1 SSB SSB 2016 -0.1779392 -0.1957271
s1.1 SSB SSB Combined -0.2332471 -0.3325474
s1.1 F F 2019 1.1909249 2.5686552
s1.1 F F 2018 0.7828193 1.5773644
s1.1 F F 2017 0.8435982 1.0765231
s1.1 F F 2016 0.3245750 1.1174145
s1.1 F F Combined 0.7854794 1.5849893
s1.2 SSB SSB 2019 -0.0827150 -0.1016893
s1.2 SSB SSB 2018 -0.4893559 -0.5608395
s1.2 SSB SSB 2017 -0.7272409 -0.7935804
s1.2 SSB SSB 2016 -0.4875969 -0.8027882
s1.2 SSB SSB Combined -0.4467272 -0.5647244
s1.2 F F 2019 1.0251379 3.4863606
s1.2 F F 2018 1.0700408 3.1240360
s1.2 F F 2017 3.5165245 3.4961678
s1.2 F F 2016 0.8270261 5.1191803
s1.2 F F Combined 1.6096823 3.8064362
s1.3 SSB SSB 2019 -0.2631174 -0.3176714
s1.3 SSB SSB 2018 -0.3709787 -0.4678843
s1.3 SSB SSB 2017 -0.3342903 -0.5396513
s1.3 SSB SSB 2016 -0.3209827 -0.3946523
s1.3 SSB SSB Combined -0.3223423 -0.4299648
s1.3 F F 2019 1.1085776 2.9726497
s1.3 F F 2018 0.8449169 2.3280948
s1.3 F F 2017 1.3134959 2.0784872
s1.3 F F 2016 0.7604930 2.1222896
s1.3 F F Combined 1.0068708 2.3753803
s1.4 SSB SSB 2019 -0.1266622 -0.2006111
s1.4 SSB SSB 2018 -0.4523622 -0.6020788
s1.4 SSB SSB 2017 -0.6106873 -0.8405322
s1.4 SSB SSB 2016 -0.4535118 -0.7243463
s1.4 SSB SSB Combined -0.4108059 -0.5918921
s1.4 F F 2019 0.8803373 3.3155514
s1.4 F F 2018 0.8286323 3.1905901
s1.4 F F 2017 3.6632985 6.7126908
s1.4 F F 2016 0.4471872 14.4300252
s1.4 F F Combined 1.4548638 6.9122144

Hindcast Cross-Validation and Prediction Skill

The Hindcast Cross-Validation (HCxval) diagnostic in Stock Synthesis is implemented using the model outputs generated by the r4ss::SS_doRetro() and using SSplotHCval() function. This diagnostic evaluates the predictive performance of the model by comparing hindcast predictions with observed data. To assess prediction skill, we employ the Mean Absolute Scaled Error (MASE) as a robust metric. MASE is calculated by scaling the mean absolute error of the model predictions relative to the mean absolute error of a naïve baseline prediction. Specifically, the MASE score is computed as follows:

  • A MASE score greater than 1 indicates that the model’s average forecasts are less accurate than a random walk model.
  • A MASE score equal to 1 suggests that the model’s accuracy is similar to that of a random walk.
  • A MASE score less than 1 indicates that the model performs better than a random walk.
  • A MASE score of 0.5, for example, indicates that the model’s forecasts are twice as accurate as the naïve baseline prediction, suggesting the model has predictive skill.

Hindcast validation for s1.1 (Figure 31)

Hindcast validation for s1.1 by fleet

Figure 31: Hindcast validation for s1.1 by fleet

Hindcast validation for s1.2 (Figure 32)

Hindcast validation for s1.2 by fleet

Figure 32: Hindcast validation for s1.2 by fleet

Hindcast validation for s1.3 (Figure 33)

Hindcast validation for s1.3 by fleet

Figure 33: Hindcast validation for s1.3 by fleet

Hindcast validation for s1.4 (Figure 34)

Hindcast validation for s1.4 by fleet

Figure 34: Hindcast validation for s1.4 by fleet

Likelihood tables

See Table 7 for details.

Table 7: Model Diagnosis Results
Description s1.1 s1.2 s1.3 s1.4
Convergency 0.000 0.000 0.000 0.000
AIC 2668.080 51985.800 3234.900 50435.600
Total_like 1291.040 25963.900 1591.450 25188.800
N_Params 86.000 58.000 52.000 58.000
Survey_like 412.880 23344.100 645.637 23313.300
Length_comp_like 846.103 2593.080 933.491 1860.930
RMSE_index 81.300 85.700 87.000 86.800
RMSE_length 9.700 16.600 10.700 16.800
MASE 0.427 0.517 0.427 0.517
Retro_Rho_ssb -0.233 -0.447 -0.322 -0.411
Forecast_Rho_ssb -0.333 -0.565 -0.430 -0.592
Forecast_Rho_f 0.785 1.610 1.007 1.455
Rho_f 1.585 3.806 2.375 6.912

As shown in Table 8, the models differ substantially in key parameter estimates and likelihood contributions.

Table 8: Model parameter and likelihood comparison
Label s1.1 s1.2 s1.3 s1.4
TOTAL_like 1291.0400 25963.9000 1591.45000 25188.800000
Survey_like 412.8800 23344.1000 645.63700 23313.300000
Length_comp_like 846.1030 2593.0800 933.49100 1860.930000
Parm_priors_like 0.4809 9.3817 5.01577 9.019300
Recr_Virgin_billions 107.9620 415.8000 66.89600 368.857000
SR_LN(R0) 18.4973 19.8457 18.01870 19.725900
SR_LN(R0)_dev_se NA NA 0.50000 0.500000
SR_LN(R0)_dev_autocorr NA NA 0.00000 0.000000
SR_LN(R0)_DEVmult_2000 NA NA -0.01748 -0.168254
SR_LN(R0)_DEVmult_2001 NA NA 0.12348 0.054449
SR_LN(R0)_DEVmult_2002 NA NA 0.06134 0.050924
SR_LN(R0)_DEVmult_2003 NA NA -0.60447 -0.632315
SR_LN(R0)_DEVmult_2004 NA NA -0.15152 -0.163070
SR_LN(R0)_DEVmult_2005 NA NA -0.07692 -0.115789
SR_LN(R0)_DEVmult_2006 NA NA 0.09326 0.054163
SR_LN(R0)_DEVmult_2007 NA NA 0.02918 -0.022446
SR_LN(R0)_DEVmult_2008 NA NA -0.21759 -0.243368
SR_LN(R0)_DEVmult_2009 NA NA -0.05644 -0.127149
SR_LN(R0)_DEVmult_2010 NA NA 0.01351 -0.101928
SR_LN(R0)_DEVmult_2011 NA NA 0.11352 -0.042061
SR_LN(R0)_DEVmult_2012 NA NA 0.12875 0.030838
SR_LN(R0)_DEVmult_2013 NA NA -0.39951 -0.067914
SR_LN(R0)_DEVmult_2014 NA NA -0.03596 0.004577
SR_LN(R0)_DEVmult_2015 NA NA 0.11805 -0.028420
SR_LN(R0)_DEVmult_2016 NA NA 0.16990 0.002295
SR_LN(R0)_DEVmult_2017 NA NA 0.03144 -0.055025
SR_LN(R0)_DEVmult_2018 NA NA 0.28965 0.153088
SR_LN(R0)_DEVmult_2019 NA NA 0.34139 0.149552
SR_LN(R0)_DEVmult_2020 NA NA -0.18234 -0.355401
SSB_Virgin 3565180.0000 2213450.0000 2209070.00000 2028880.000000
Bratio_2020 1.8797 0.5020 1.81788 0.838206
SPRratio_2020 0.1443 0.1207 0.17600 0.104103

Total likelihood in Figure 35

total likelihood composition by scenario

Figure 35: total likelihood composition by scenario

This bar plot presents the log-normal likelihood contributions of different model components in an Antarctic krill stock assessment under four scenarios (s1.1, s1.2, s1.3, and s1.4).

TOTAL Likelihood: The overall likelihood is highest for scenario s1.4 (black), followed by s1.2 (light orange). This suggests that s1.4 provides the best overall model fit, while s1.1 (dark red) has the lowest likelihood, indicating a poorer fit.

Survey Component: The survey likelihood follows a similar pattern, with s1.4 and s1.2 showing better agreement with the data compared to s1.1.

Recruitment, Parameter Priors, and Parameter Deviations: These components contribute minimally to the total likelihood, indicating that they do not heavily influence model fit differences.

Length Composition: There is some variation in this component, with s1.2 showing a larger contribution than s1.1 and s1.3.

Scenario s1.4 seems to fit the data best, followed closely by s1.2. Scenario s1.1 has the poorest likelihood, meaning it is the least supported by the data. If s1.4 includes both environmental and predator effects, this suggests that incorporating ecosystem variables significantly improves the stock assessment model for Antarctic krill.

Statistics analisys differences bewteen models

To evaluate the residual behavior across model scenarios, we computed residuals as the difference between observed and expected values (residual = Obs - Exp). Basic statistics, including sample size (n()), mean (mean()), and standard deviation (sd()), were calculated for each combination of model and type. To test the normality of residuals, we applied the Shapiro-Wilk test (shapiro.test()) (Shapiro & Wilk, 1965), which is appropriate for small to moderate sample sizes. Temporal autocorrelation was assessed using the Ljung-Box test (Box.test() (Ljung & Box, 1978) with type = "Ljung-Box" and lag = 10), evaluating the null hypothesis of independence across lags. To detect heteroscedasticity, we used the Breusch-Pagan test (bptest() from the lmtest package) (Breusch & Pagan, 1979), fitting a linear model of residuals against year (residual ~ Yr) and testing for non-constant variance in the residuals. These diagnostics provide insight into the validity of model assumptions across different scenarios.

As shown in Table 9, the residuals exhibit different statistical properties across model scenarios.

Table 9: Summary statistics and residual diagnostic tests by type and model
type model N Mean SD shapiro_p ljung_p bp_p
Index s1.1 159 105609.8 1.290513e+06 0 0.00080 0.27150
Index s1.2 189 106345.1 1.159477e+06 0 0.00000 0.06413
Index s1.3 159 143409.3 1.277875e+06 0 0.00072 0.01052
Index s1.4 189 126347.0 1.131044e+06 0 0.00002 0.02942
Length s1.1 3752 0.0 3.807000e-02 0 0.00000 0.00004
Length s1.2 4564 0.0 4.741000e-02 0 0.00000 0.00001
Length s1.3 3752 0.0 3.992000e-02 0 0.00000 0.00067
Length s1.4 4564 0.0 4.655000e-02 0 0.00000 0.00000

The residual analysis shows distinct patterns between the Index and Length-based models across all four scenarios (s1.1 to s1.4). For Index models, the mean residuals are notably large (ranging from approximately 105,610 to 143,409), with high standard deviations exceeding 1 million units, indicating substantial variability. The Shapiro-Wilk test consistently returned p-values of 0, strongly rejecting the null hypothesis of normality. The Ljung-Box test yielded very low p-values (≤ 0.00080), suggesting significant temporal autocorrelation in the residuals. In terms of heteroscedasticity, the Breusch-Pagan test showed moderate evidence of non-constant variance, with p-values ranging from 0.27150 (s1.1) to 0.01052 (s1.3), the latter indicating strong evidence of heteroscedasticity.

In contrast, Length models present mean residuals near zero with low standard deviations (approximately 0.038–0.047), suggesting good overall model fit and low dispersion. However, despite the small residual magnitude, the Shapiro-Wilk and Ljung-Box tests consistently return p-values of 0, indicating violation of normality and presence of temporal autocorrelation. Furthermore, the Breusch-Pagan test returned extremely low p-values (≤ 0.00067), indicating strong evidence of heteroscedasticity across all Length-based scenarios.

These results suggest that while Length models achieve better central tendencies (mean near zero and low variance), they still suffer from violations of key assumptions—most notably autocorrelation and heteroscedasticity. Index models, on the other hand, display large-scale variability and systematic residual patterns that may require model refinement or transformation.

Another statistical analysis;

## 
##  Shapiro-Wilk normality test
## 
## data:  spawn_bio_1
## W = 0.65734, p-value = 4.827e-08
## 
##  Shapiro-Wilk normality test
## 
## data:  spawn_bio_4
## W = 0.71222, p-value = 3.39e-07
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value  Pr(>F)  
## group  1  4.9249 0.02962 *
##       72                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Two Sample t-test
## 
## data:  spawn_bio_1 and spawn_bio_4
## t = 4.2682, df = 72, p-value = 5.915e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  1387011 3818074
## sample estimates:
## mean of x mean of y 
##   4600304   1997762
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  spawn_bio_1 and spawn_bio_4
## W = 1256, p-value = 6.698e-10
## alternative hypothesis: true location shift is not equal to 0

Two-Sample t-Test

  • t-value: 4.2682
  • Degrees of Freedom (df): 72
  • p-value: 5.915e-05 (very small)
  • 95% Confidence Interval for the Mean Difference: [1,387,011; 3,818,074]
  • Sample Means:
    • Model 1.1: 4,600,304
    • Model 1.4: 1,997,762

Since the p-value (5.915e-05) is much smaller than 0.05, we reject the null hypothesis. This means there is a statistically significant difference between the spawning biomass predicted by Model 1.1 and Model 1.4. The confidence interval suggests that Model 1.1 tends to estimate a higher spawning biomass than Model 1.4, with an estimated difference between 1.39 and 3.82 million tons.

Wilcoxon Rank Sum Test (Mann-Whitney U Test)

  • W-statistic: 1256
  • p-value: 6.698e-10 (extremely small)
  • Warning: “Cannot compute exact p-value with ties” (indicates that some values are repeated)

The Wilcoxon test also rejects the null hypothesis with an extremely small p-value (6.698e-10). This confirms that the distribution of spawning biomass values is significantly different between the two models. Since the Wilcoxon test does not assume normality, it supports the conclusion from the t-test, reinforcing that

Both tests strongly indicate that Model 1.1 and Model 1.4 produce significantly different estimates of spawning biomass. Specifically, Model 1.1 predicts higher values. This suggests that the factors or assumptions included in Model 1.4 result in lower spawning biomass estimates, which could have important implications for stock assessment and fisheries management.

REFERENCES

Breusch, T. S., & Pagan, A. R. (1979). A simple test for heteroscedasticity and random coefficient variation. Econometrica, 47(5), 1287–1294. https://doi.org/10.2307/1911963
Canales, C. M., Punt, A. E., & Mardones, M. (2021). Can a length-based pseudo-cohort analysis (LBPA) using multiple catch length-frequencies provide insight into population status in data-poor situations? Fisheries Research, 234(October 2020), 105810. https://doi.org/10.1016/j.fishres.2020.105810
Carvalho, F., Winker, H., Courtney, D., Kapur, M., Kell, L., Cardinale, M., Schirripa, M., Kitakado, T., Yemane, D., Piner, K. R., Maunder, M. N., Taylor, I., Wetzel, C. R., Doering, K., Johnson, K. F., & Methot, R. D. (2021). A cookbook for using model diagnostics in integrated stock assessments. Fisheries Research, 240(April), 105959. https://doi.org/10.1016/j.fishres.2021.105959
Chong, L., Mildenberger, T. K., Rudd, M. B., Taylor, M. H., Cope, J. M., Branch, T. A., Wolff, M., & Stäbler, M. (2019). Performance evaluation of data-limited, length-based stock assessment methods. ICES Journal of Marine Science, 77(1), 97–108. https://doi.org/10.1093/icesjms/fsz212
Fournier, D. A., Skaug, H. J., Ancheta, J., Ianelli, J., Magnusson, A., Maunder, M. N., Nielsen, A., & Sibert, J. (2012). AD Model Builder: Using automatic differentiation for statistical inference of highly parameterized complex nonlinear models. Optimization Methods and Software, 27(2), 233–249. https://doi.org/10.1080/10556788.2011.597854
Hurtado-ferro, F., Szuwalski, C. S., Valero, J. L., Anderson, S. C., Cunningham, C. J., Johnson, K. F., Licandeo, R., Mcgilliard, C. R., Monnahan, C. C., & Muradian, M. L. (2015). Marine Science. 72, 99–110.
ICCAT. (2003). Manual. Sampling for length frequency data and AKL description (pp. 29–41).
Lee, H. H., Maunder, M. N., & Piner, K. R. (2024). Good Practices for estimating and using length-at-age in integrated stock assessments. Fisheries Research, 270(November 2023). https://doi.org/10.1016/j.fishres.2023.106883
Ljung, G. M., & Box, G. E. P. (1978). On a measure of lack of fit in time series models. Biometrika, 65(2), 297–303. https://doi.org/10.1093/biomet/65.2.297
Methot, R., & Wetzel, C. (2013). Stock synthesis: A biological and statistical framework for fish stock assessment and fishery management. Fisheries Research, 142, 86–99. https://doi.org/10.1016/j.fishres.2012.10.012
Nielsen, A., Hintzen, N. T., Mosegaard, H., Trijoulet, V., & Berg, C. W. (2021). Multi-fleet state-space assessment model strengthens confidence in single-fleet SAM and provides fleet-specific forecast options. ICES Journal of Marine Science, 78(6), 2043–2052. https://doi.org/10.1093/icesjms/fsab078
Punt, André. (2003). The performance of a size-structured stock assessment method in the face of spatial heterogeneity in growth. Fisheries Research, 65(1-3), 391–409. https://doi.org/10.1016/j.fishres.2003.09.028
Punt, Andre, Huang, T., & Maunder, M. N. (2013). Review of integrated size-structured models for stock assessment of hard-to-age crustacean and mollusc species. ICES Journal of Marine Science, 70(1), 16–33. https://doi.org/10.4135/9781412953924.n678
Shapiro, S. S., & Wilk, M. B. (1965). An analysis of variance test for normality (complete samples). Biometrika, 52(3/4), 591–611. https://doi.org/10.2307/2333709
Taylor, I. (2019). Using R for Stock Synthesis Installing R and getting R4SS. Fisheries Science, November.
Thanassekos, S., Cox, M. J., & Reid, K. (2014). Investigating the effect of recruitment variability on length-based recruitment indices for antarctic krill using an individual-based population dynamics model. PLoS ONE, 9(12), 1–20. https://doi.org/10.1371/journal.pone.0114378
Waterhouse, L., Sampson, D. B., Maunder, M., & Semmens, B. X. (2014). Using areas-as-fleets selectivity to model spatial fishing: Asymptotic curves are unlikely under equilibrium conditions. Fisheries Research, 158, 15–25. https://doi.org/10.1016/j.fishres.2014.01.009
Winker, H., Carvalho, F., Cardinale, M., & Kell, L. (2024). ss3diags: What the package does (one line, title case). https://github.com/jabbamodel/ss3diags