MODELING SETTING

Libraries

Libraries necessary to made this analysis;

# install.packages('devtools')
# devtools::install_github('r4ss/r4ss',
# ref='development') install.packages('caTools')
# library('caTools') install.packages('r4ss')
library(r4ss)
library(here)
# remotes::install_github('PIFSCstockassessments/ss3diags')
library(ss3diags)
library(kableExtra)
library(doParallel)
detectCores()

## [1] 16

registerDoParallel(8)
library(ggpubr)
library(tibble)
library(openxlsx)
library(ggthemes)
library(forecast)
library(tidyr)
library(mixR)
library(readxl)
library(tidyverse)
library(ggridges)
library(ggrepel)
library(broom)
library(lmtest)
library(flextable)
library(car)

Code Repository

The repository with files templates by scenario to replicate this analysis can be found in this GitHub author link

dir1.1 <- here("s1.1")  # no predador- environmental
dir1.2 <- here("s1.2")  # s1.1 w/ predador
dir1.3 <- here("s1.3")  # s1.1 w/ env
dir1.4 <- here("s1.4")  # s1.1 predator and env
Figs <- here("Figs")

OVERVIEW

This study aims to evaluate the impact of ecosystem components—such as environmental variables and predator-prey interactions—on the productivity and key population dynamics of Euphausia superba in Subarea 48.1. By incorporating these factors into the assessment, we analyze how krill population variables respond to ecological variability, providing insights into their resilience and potential management implications. This code forms part of the supplementary material for the scientific article titled “Assessing environmental and predator impacts on Antarctic Krill (Euphausia superba) population dynamics from an integrated length-to-age assessment model perspective.”

Spatial dimension of stock assessment

The study focuses on Subarea 48.1 in the Western Antarctic Peninsula, where most krill fishing occurs and CCAMLR is advancing spatially refined management. We use five CCAMLR-defined strata to increase spatial resolution, allowing detection of regional differences in krill dynamics (Figure 1). This structure, integrated into the stock assessment model, enhances its capacity to capture spatial variability in both biological and environmental data. In this approach, spatial structure is incorporated implicitly by treating different areas as separate fleets (Nielsen et al., 2021; Waterhouse et al., 2014). This constitutes a spatially implicit modeling framework, where differences among strata are recognized both from the perspective of krill population dynamics and from the influence of environmental variability within Subarea 48.1.

Subarea 48.1 and management strata considered in the spatio-temporal analysis of intrinsic productivity of Krill (BS=Brainsfield Strait, EI= Elephant Island, Gerlache= Gerlache strait, JOIN= Joinville Island, SSWI= South West)

Figure 1: Subarea 48.1 and management strata considered in the spatio-temporal analysis of intrinsic productivity of Krill (BS=Brainsfield Strait, EI= Elephant Island, Gerlache= Gerlache strait, JOIN= Joinville Island, SSWI= South West)

Figure 2 illustrates a conceptual model with the spatial distribution and movement of Antarctic krill in the Western Antarctic Peninsula (WAP). Adult krill are concentrated and move primarily in the northern areas, while juvenile krill dominate the southern regions. Arrows indicate directional flows, suggesting ontogenetic or environmentally-driven migration patterns. In terms of assumptions, although there are spatial differences in population structure, all these areas are considered part of a single, closed population unit upon which the stock assessment is conducted. By leveraging a spatially implicit, ecosystem-informed approach, this assessment provides a robust framework for evaluating krill stock dynamics under changing environmental conditions. These insights are crucial for informing sustainable management strategies in the Antarctic Peninsula region, where krill plays a foundational role in the marine food web.

Figure 2: Conceptual model used to model dynamics population in Antarctic krill in WAP

Statistical Model (`SS3`)

Stock Synthesis (v.3.30.21) is a widely used tool for assessing fish and invertebrate populations, including Antarctic krill. SS3 is implemented in C++ with estimation enabled through automatic differentiation (ADMB) (Fournier et al., 2012; Methot & Wetzel, 2013). The spurce code can be find in Github SS3 repository. In this exercise, SS3 is configured as an integrated stock assessment model, explicitly accounting for age and size structure while incorporating key ecosystem drivers. The model simulates population processes such as growth, maturity, fecundity, recruitment, movement, and mortality, while also integrating environmental variability and predator-prey relationships to refine estimates of population trends in krill. The analysis of model outputs is conducted using R, utilizing the r4ss and ss3diags packages (Taylor, 2019; Winker et al., 2024). Integrated models can effectively capture the age structure by transforming length observations into population-level dynamics (Lee et al., 2024; Andre Punt et al., 2013).

On the other hand, length data, which are cost-effective and readily available (Canales et al., 2021; Chong et al., 2019), provide valuable insights into krill population structure due to their correlation with age (Thanassekos et al., 2014). For made this transformation, the models often use an Age-Length Keys (ALKs). An ALK is a probabilistic matrix that estimates the likelihood of individuals of a given length belonging to specific age classed and this method assume both aged and measured fish are random samples from the same population and should be applied within the same time period to avoid biases (ICCAT, 2003; Lee et al., 2024; André Punt, 2003; Andre Punt et al., 2013). The ALK matrix is typically defined as:

In a catch-at-length model like krill assessment the AKL matrix (Figure 3) is modeled trough parametrization process and have this shape;

Figure 3: Representation of ALK Matrix to krill in 48.1

Data

Parameters

The following table summarizes the key parameters to conditioning the reference model, including biological, growth, and population dynamics factors.

Table 1: Input parameters for the initial SS3 model of krill. Each parameter line contains a minimum value (LO), maximum value (HI), and initial value (INIT). If the phase (PHASE) for the parameter is negative, the parameter is fixed as input
	LO	HI	INIT	PHASE
Natural Mortality
Nat M	0.20	1.00	0.270	-3
Growth
Lmin	0.00	5.00	3.400	-2
Lmax	1.00	10.00	5.000	-4
VonBert K	0.05	0.80	0.470	-4
CV young	0.05	0.25	0.140	-4
CV old	0.05	0.25	0.070	-4
relationship Length-Weigth
Wt a	0.00	3.00	0.000	-3
Wt b	1.00	4.00	3.347	-3
Maturity
L50%	0.20	5.00	1.800	-4
Mat slope	-3.00	3.00	-2.900	-4
S-R relation
SR_LN(R0)	3.00	30.00	23.000	1
SR_BH_steep	0.20	1.00	0.850	-4
SR_sigmaR	0.00	2.00	1.200	-4
SR_regime	-5.00	5.00	0.000	-4
SR_autocorr	0.00	0.00	0.000	-99
Catchability
LnQ_base_FISHERYBS(1)	-25.00	25.00	-5.722	1
LnQ_base_FISHERYEI(2)	-25.00	25.00	-5.722	1
Selectivity
SizeSel_P_1_FISHERYBS(1)	0.01	8.00	2.000	-3
SizeSel_P_2_FISHERYBS(1)	0.00	8.00	2.000	-2
SizeSel_P_1_FISHERYEI(2)	0.01	8.00	3.500	-3
SizeSel_P_2_FISHERYEI(2)	0.00	8.00	4.000	-2
SizeSel_P_1_FISHERYGS(3)	0.01	8.00	2.000	-3
SizeSel_P_2_FISHERYGS(3)	0.00	8.00	2.000	2
SizeSel_P_1_FISHERYJOIN(4)	0.01	8.00	3.500	-3
SizeSel_P_2_FISHERYJOIN(4)	0.00	8.00	2.000	-2
SizeSel_P_1_FISHERYSSIW(5)	0.01	8.00	3.500	-3
SizeSel_P_2_FISHERYSSIW(5)	0.00	8.00	2.000	-2
SizeSel_P_1_SURVEYBS(6)	1.00	7.00	2.000	-2
SizeSel_P_2_SURVEYBS(6)	1.00	7.00	1.000	-3
SizeSel_P_1_SURVEYEI(7)	1.00	7.00	3.000	-2
SizeSel_P_2_SURVEYEI(7)	1.00	7.00	1.000	-3
SizeSel_P_1_SURVEYGS(8)	1.00	7.00	2.000	-2
SizeSel_P_2_SURVEYGS(8)	1.00	7.00	1.000	-3
SizeSel_P_1_SURVEYJOIN(9)	1.00	7.00	3.000	2
SizeSel_P_2_SURVEYJOIN(9)	1.00	7.00	1.000	3
SizeSel_P_1_SURVEYSSIW(10)	1.00	7.00	2.000	-2
SizeSel_P_2_SURVEYSSIW(10)	1.00	7.00	1.000	-3
SizeSel_P_1_PREDATOR(11)	0.00	3.00	0.200	2
SizeSel_P_2_PREDATOR(11)	0.00	3.00	0.200	3

Index

Abundance index in Figure 4

Figure 4: Standardized indices of krill index abundance and consumption from fishery-dependent, fishery-independent, and predator-based data sources across different strata within Subarea 48.1. Each panel represents a distinct spatial or functional stratum, with trend lines indicating temporal variation from 1990 to 2020. Colors denote data source categories: green for fishery, orange for scientific surveys, and purple for predator-based indices. These patterns highlight spatial and temporal heterogeneity in krill dynamics across the subarea.

Length comositions

Length compositions in Figure 5

Annual length-frequency distributions of Antarctic krill (Euphausia superba) across different data sources and spatial strata within Subarea 48.1 from 1991 to 2020. Each panel represents a distinct stratum for either fishery-dependent (green), fishery-independent survey (orange), or predator-based (purple) observations. Density ridgelines illustrate variation in krill size structure across years. The red vertical line marks a recruit references length (3.6 cm).

Figure 5: Annual length-frequency distributions of Antarctic krill (Euphausia superba) across different data sources and spatial strata within Subarea 48.1 from 1991 to 2020. Each panel represents a distinct stratum for either fishery-dependent (green), fishery-independent survey (orange), or predator-based (purple) observations. Density ridgelines illustrate variation in krill size structure across years. The red vertical line marks a recruit references length (3.6 cm).

Summary data sources to Modelling

Finally, this information and all sources can be represented through the following flow diagram (Figure 6) of inputs, model, and outputs.

Figure 6: Framework path to stock assessment model in krill in WAP (Yellow boxes is not implemeted yet).

Figure ?? show time series of differente componentes of data sources to this krill stock assessment.

Scenarios

Here, the reference model is s1.1, wich represents a baseline assessment of Euphausia superba population dynamics in Subarea 48.1, excluding environmental and ecological variables. This model assumes that krill productivity and population parameters are driven by intrinsic biological processes, such as growth, mortality, and recruitment and fishery impacts without accounting for external influences like environmental variability or predation pressure. By serving as a control scenario, this model provides a benchmark against which the impact of ecosystem components in productivity can be evaluated, allowing for a direct comparison of how environmental and ecological factors influence krill stock dynamics.

Table ?? show scenarios used for modelling dynamics in krill

Table 2: Scenarios used for modelling dynamics in krill
Scenario	Description
s1.1	Spatial data without environmental and predator components
s1.2	“s1.1” with predator components
s1.3	“s1.1” with environmental variable
s1.4	“s1.1” with both predator fleet and environmental variable

RESULTS

Population Variables by scenario

Main Variables poulation in s1.1 scenario (Figure 7)

Figure 7: Main variables scenario s1.1

Main Variables poulation in s1.2 scenario (Figure 8)

Figure 8: Main variables scenario s1.2

Main Variables poulation in s1.3 scenario (Figure 9)

Figure 9: Main variables scenario s1.3

Main Variables poulation in s1.4 scenario (Figure 10)

Figure 10: Main variables scenario s1.4

Selectivity estimated by scenario in Figure 11.

Figure 11: Selectivity by fleet in each scenario

This Figure 12 shows standardized time series of input indices used in four different model scenarios (s1.1 to s1.4) for the stock assessment of Antarctic krill in Subarea 48.1. Each panel presents fishery-dependent (FISHERY) and fishery-independent (SURVEY) indices across five management strata: Bransfield Strait (BS), Elephant Island (EI), Gerlache Strait (GS), Joinville Island (JOIN), and South West (SW), from the mid-1990s to 2020. Scenario s1.4 additionally includes a predator index, highlighting the incorporation of ecosystem variables. The figure illustrates spatial and temporal variability in data availability and trends among sources.

$Standardized indices of krill abundance used as input in four model scenarios (s1.1 to s1.4), representing fishery-dependent (FISHERY) and fishery-independent (SURVEY) data across five spatial strata: Bransfield Strait (BS), Elephant Island (EI), Gerlache Strait (GS), Joinville Island (JOIN), and South West (SW). Scenario s1.4 also incorporates a predator index (PREDATOR), reflecting the integration of ecosystem variables into the assessment framework$

Figure 12: Standardized indices of krill abundance used as input in four model scenarios (s1.1 to s1.4), representing fishery-dependent (FISHERY) and fishery-independent (SURVEY) data across five spatial strata: Bransfield Strait (BS), Elephant Island (EI), Gerlache Strait (GS), Joinville Island (JOIN), and South West (SW). Scenario s1.4 also incorporates a predator index (PREDATOR), reflecting the integration of ecosystem variables into the assessment framework

Outputs Variables `s1.1`

Table 3: Main variables outputs from stock asssessment krill in WAP `s1.1`
	Yr	Era	Seas	Bio_all	Bio_smry	SpawnBio	Recruit_0
1254	1989	VIRG	1	2184230	2176360	3565180	107962000
1255	1990	INIT	1	1498570	1490900	2259200	105278000
1256	1991	TIME	1	1498570	1490900	2259200	105278000
1257	1992	TIME	1	1616910	1609190	2452620	105844000
1258	1993	TIME	1	1723130	1715370	2661750	106370000
1259	1994	TIME	1	1814140	1806360	2839670	106759000
1260	1995	TIME	1	1889790	1881980	2987350	107049000
1261	1996	TIME	1	1951400	1943580	3107870	107266000
1262	1997	TIME	1	2000910	1993080	3204890	107430000
1263	1998	TIME	1	2031670	2031470	3280300	2722690
1264	1999	TIME	1	1950370	1950080	3329570	3861330
1265	2000	TIME	1	1709860	1703160	3365450	91895800
1266	2001	TIME	1	1523110	1504210	2802010	259242000
1267	2002	TIME	1	1638890	1627610	2264810	154705000
1268	2003	TIME	1	1971710	1971470	2294470	3332310
1269	2004	TIME	1	2079960	2078730	3317380	16879800
1270	2005	TIME	1	1916790	1913490	3747190	45270000
1271	2006	TIME	1	1724010	1708750	3238060	209287000
1272	2007	TIME	1	1640040	1631050	2612060	123361000
1273	2008	TIME	1	1833350	1832360	2343170	13566800
1274	2009	TIME	1	1869200	1864810	3037610	60221700
1275	2010	TIME	1	1731220	1719710	3226670	157914000
1276	2011	TIME	1	1606630	1586690	2567950	273609000
1277	2012	TIME	1	1905570	1872960	2385900	447447000
1278	2013	TIME	1	2520170	2519760	2744410	5517500
1279	2014	TIME	1	2918840	2914270	3602640	62737200
1280	2015	TIME	1	2758580	2740280	5255400	251086000
1281	2016	TIME	1	2672990	2619030	4425950	740188000
1282	2017	TIME	1	3269820	3269180	3819450	8874990
1283	2018	TIME	1	4057150	3997160	4289280	823040000
1284	2019	TIME	1	4780510	4672090	7517290	1487330000
1285	2020	TIME	1	6934500	6933840	6701380	8978140
1286	2021	FORE	1	8884260	8876160	10056300	111127000
1287	2022	FORE	1	8472730	8464580	16411100	111824000
1288	2023	FORE	1	7532870	7524730	14262700	111657000
1289	2024	FORE	1	6398190	6390070	11986300	111416000
1290	2025	FORE	1	5398930	5390840	9988720	111115000

Outputs Variables `s1.2`

Table 4: Main variables outputs from stock asssessment krill in WAP in `s1.2`
	Yr	Era	Seas	Bio_all	Bio_smry	SpawnBio	Recruit_0
1354	1989	VIRG	1	2299170	2268860	2213450	415800000
1355	1990	INIT	1	2140700	2110590	1919210	413006000
1356	1991	TIME	1	2140700	2110590	1919210	413006000
1357	1992	TIME	1	2210510	2180300	2052100	414362000
1358	1993	TIME	1	2249080	2218830	2126300	415049000
1359	1994	TIME	1	2270970	2240690	2163880	415380000
1360	1995	TIME	1	2283380	2253090	2185490	415565000
1361	1996	TIME	1	2290340	2260040	2197820	415669000
1362	1997	TIME	1	2294230	2263930	2204700	415727000
1363	1998	TIME	1	2266810	2265650	2207770	15962500
1364	1999	TIME	1	1867210	1863690	2202490	48340700
1365	2000	TIME	1	1196690	1191800	2199340	67016200
1366	2001	TIME	1	828069	768871	1232060	812129000
1367	2002	TIME	1	1347830	1298870	737765	671607000
1368	2003	TIME	1	2369680	2367790	534842	25993000
1369	2004	TIME	1	2358800	2347400	2222130	156448000
1370	2005	TIME	1	1692350	1681210	2907030	152806000
1371	2006	TIME	1	1334460	1282450	1706020	713509000
1372	2007	TIME	1	1587260	1564820	1167630	307877000
1373	2008	TIME	1	2058080	2050430	968864	104953000
1374	2009	TIME	1	1799590	1784940	2225020	200938000
1375	2010	TIME	1	1401810	1386880	1981840	204876000
1376	2011	TIME	1	1141310	1130440	1172540	149095000
1377	2012	TIME	1	1076390	1042010	1058820	471730000
1378	2013	TIME	1	1258030	1225100	995034	451783000
1379	2014	TIME	1	1685050	1599720	774750	1170600000
1380	2015	TIME	1	2619570	2612840	1328470	92363200
1381	2016	TIME	1	2961350	2947640	1691030	188112000
1382	2017	TIME	1	2137730	2126280	3483620	157048000
1383	2018	TIME	1	1632790	1493160	2050430	1915550000
1384	2019	TIME	1	3002460	2903520	1450160	1357410000
1385	2020	TIME	1	5247580	5235720	1111080	162708000
1386	2021	FORE	1	5161170	5130090	5069530	426398000
1387	2022	FORE	1	3655000	3623850	5768290	427421000
1388	2023	FORE	1	2833070	2802300	3345400	422101000
1389	2024	FORE	1	2415010	2384550	2502690	417931000
1390	2025	FORE	1	2214670	2184410	2127620	415061000

Outputs Variables `s1.3`

Table 5: Main variables outputs from stock asssessment krill in WAP in `s1.3`
	Yr	Era	Seas	Bio_all	Bio_smry	SpawnBio	Recruit_0
1305	1989	VIRG	1	1353400	1348530	2209070	66896000
1306	1990	INIT	1	924676	919929	1365380	65120800
1307	1991	TIME	1	924676	919929	1365380	65120800
1308	1992	TIME	1	1011120	1006340	1537670	65631800
1309	1993	TIME	1	1081550	1076740	1677420	65973500
1310	1994	TIME	1	1138370	1133540	1787730	66207600
1311	1995	TIME	1	1183710	1178870	1876090	66376300
1312	1996	TIME	1	1219630	1214790	1946320	66500000
1313	1997	TIME	1	1248020	1243170	2001930	66592100
1314	1998	TIME	1	1265270	1264970	2044760	4181950
1315	1999	TIME	1	1215210	1214650	2069470	7762690
1316	2000	TIME	1	1072930	1068190	2087520	64986400
1317	2001	TIME	1	973382	957638	1743440	215996000
1318	2002	TIME	1	1110280	1101160	1435720	125135000
1319	2003	TIME	1	1430490	1430440	1496900	633499
1320	2004	TIME	1	1555180	1553690	2431110	20543200
1321	2005	TIME	1	1457770	1454920	2841870	39014900
1322	2006	TIME	1	1337590	1324930	2465110	173694000
1323	2007	TIME	1	1282230	1275120	2003690	97545800
1324	2008	TIME	1	1458050	1457180	1827320	11817800
1325	2009	TIME	1	1495670	1492280	2433790	46474800
1326	2010	TIME	1	1380840	1374560	2577760	86096400
1327	2011	TIME	1	1216580	1201820	2008220	202555000
1328	2012	TIME	1	1370620	1354010	1858140	227851000
1329	2013	TIME	1	1695600	1695400	1915330	2806810
1330	2014	TIME	1	1790520	1786470	2457810	55596900
1331	2015	TIME	1	1629840	1613780	3067190	220350000
1332	2016	TIME	1	1591930	1567500	2435350	335134000
1333	2017	TIME	1	1924590	1917310	2100960	99782500
1334	2018	TIME	1	2324970	2262080	2686890	862721000
1335	2019	TIME	1	3242040	3138410	3990540	1421630000
1336	2020	TIME	1	5618600	5617370	4015810	16881500
1337	2021	FORE	1	7720710	7715680	8064010	69109800
1338	2022	FORE	1	7521960	7516890	14572800	69497300
1339	2023	FORE	1	6680720	6675660	12858000	69433000
1340	2024	FORE	1	5627440	5622390	10753000	69326200
1341	2025	FORE	1	4663130	4658090	8824990	69184200

Outputs Variables `s1.4`

Table 6: Main variables outputs from stock asssessment krill in WAP in `s1.4`
	Yr	Era	Seas	Bio_all	Bio_smry	SpawnBio	Recruit_0
1305	1989	VIRG	1	1353400	1348530	2209070	66896000
1306	1990	INIT	1	924676	919929	1365380	65120800
1307	1991	TIME	1	924676	919929	1365380	65120800
1308	1992	TIME	1	1011120	1006340	1537670	65631800
1309	1993	TIME	1	1081550	1076740	1677420	65973500
1310	1994	TIME	1	1138370	1133540	1787730	66207600
1311	1995	TIME	1	1183710	1178870	1876090	66376300
1312	1996	TIME	1	1219630	1214790	1946320	66500000
1313	1997	TIME	1	1248020	1243170	2001930	66592100
1314	1998	TIME	1	1265270	1264970	2044760	4181950
1315	1999	TIME	1	1215210	1214650	2069470	7762690
1316	2000	TIME	1	1072930	1068190	2087520	64986400
1317	2001	TIME	1	973382	957638	1743440	215996000
1318	2002	TIME	1	1110280	1101160	1435720	125135000
1319	2003	TIME	1	1430490	1430440	1496900	633499
1320	2004	TIME	1	1555180	1553690	2431110	20543200
1321	2005	TIME	1	1457770	1454920	2841870	39014900
1322	2006	TIME	1	1337590	1324930	2465110	173694000
1323	2007	TIME	1	1282230	1275120	2003690	97545800
1324	2008	TIME	1	1458050	1457180	1827320	11817800
1325	2009	TIME	1	1495670	1492280	2433790	46474800
1326	2010	TIME	1	1380840	1374560	2577760	86096400
1327	2011	TIME	1	1216580	1201820	2008220	202555000
1328	2012	TIME	1	1370620	1354010	1858140	227851000
1329	2013	TIME	1	1695600	1695400	1915330	2806810
1330	2014	TIME	1	1790520	1786470	2457810	55596900
1331	2015	TIME	1	1629840	1613780	3067190	220350000
1332	2016	TIME	1	1591930	1567500	2435350	335134000
1333	2017	TIME	1	1924590	1917310	2100960	99782500
1334	2018	TIME	1	2324970	2262080	2686890	862721000
1335	2019	TIME	1	3242040	3138410	3990540	1421630000
1336	2020	TIME	1	5618600	5617370	4015810	16881500
1337	2021	FORE	1	7720710	7715680	8064010	69109800
1338	2022	FORE	1	7521960	7516890	14572800	69497300
1339	2023	FORE	1	6680720	6675660	12858000	69433000
1340	2024	FORE	1	5627440	5622390	10753000	69326200
1341	2025	FORE	1	4663130	4658090	8824990	69184200

Comparision between variables

Main variables population by scenario (Figure 13)

Figure 13: Time series of different populations variables

Another comparison in R0 (Natural log of virgin recruitment level) (Figure 14)

Figure 14: R0 probability predicted by scenario

Comparsion in sd long term time series forecasting Figure 15

$Comparsion in sd long term time series forecasting$

Figure 15: Comparsion in sd long term time series forecasting

Autocorrelation in Recruit

To evaluate the temporal correlation structure of krill recruitment under different model configurations, we performed an Autocorrelation Function (ACF) analysis (Figure 16). The ACF measures the correlation between observations at different time lags, helping to assess whether recruitment estimates exhibit persistence or randomness across time.

The analysis was conducted on recruitment estimates derived from four model scenarios:

A reference model without environmental or predator influences (Ref Model: No Env-Predator). A model incorporating predator data (S1.1 w/ Predator data). A model incorporating environmental data (S1.1 w/ Env data). A model incorporating both environmental and predator data (S1.1 w/ Env and Predator data). Each model’s residuals were extracted, and the autocorrelation function (ACF) was computed for a time lag range of up to 15 years. The dashed blue lines in the plots represent the 95% confidence intervals, indicating the threshold beyond which correlation values are statistically significant. If autocorrelation values remain within this range, it suggests that the recruitment estimates behave as a random process with no significant dependence on past values. Conversely, autocorrelation values exceeding these bounds indicate recruitment persistence or cyclic patterns.

Figure 16: Autocorrelation in Recruit by scenario

The ACF plots indicate that the reference model (without environmental or predator data) exhibits weak but noticeable positive autocorrelation at certain lags, suggesting some degree of recruitment dependence over time. However, this autocorrelation does not appear strong or systematic.

The model incorporating only predator data shows a slight reduction in autocorrelation magnitude, suggesting that predator-driven recruitment variability may have captured part of the temporal structure in the data.

The model with only environmental data exhibits a further reduction in autocorrelation, implying that environmental variability explains a larger portion of recruitment trends than predator effects alone.

Finally, the model that includes both environmental and predator data presents the lowest autocorrelation values, with nearly all bars remaining within the confidence bounds. This suggests that incorporating both factors provides the most effective explanation of recruitment fluctuations, reducing unexplained temporal structure.

Overall, these results indicate that recruitment variability is at least partially driven by environmental and predator influences, and models integrating these factors provide more robust and independent recruitment estimates, minimizing systematic dependencies over time.

Relationship Stock-Recruit

\[ R = \frac{R_0 \cdot S}{S_0 (1 - h) + S (5h - 1)} \]

Where:
- $R$ is the predicted recruitment.
- $S$ is the spawning stock biomass.
- $R_0$ is the recruitment at unfished equilibrium.
- $S_0$ is the spawning biomass at unfished equilibrium.
- $h$ is the steepness parameter (the proportion of $R_0$ produced when $S = 20\% \cdot S_0$).

The blue line represents the Beverton–Holt stock-recruitment relationship, commonly used in fisheries models to describe the compensatory response of recruitment to changes in spawning biomass.

Productivity and Interannual Variability by Scenario

Estimating the productivity of Antarctic krill is critical for understanding the species’ capacity to replenish its population in response to varying levels of spawning biomass. Productivity, defined as the ratio of recruitment to spawning stock biomass, provides a standardized measure of reproductive success and population resilience under different ecological and fishing pressures. Comparing productivity across scenarios—each representing different assumptions about environmental drivers, fishing mortality, or predator dynamics—enables a robust evaluation of how krill populations reflect this changes.

For each scenario $i$ and year $t$, we computed the productivity as the ratio between recruitment and spawning stock biomass (SSB):

\[ \text{Productivity}_{i,t} = \frac{\text{Recruitment}_{i,t}}{\text{SSB}_{i,t}} \]

We also calculated the interannual percentage change in recruitment and SSB as:

\[ \text{Change in Recruitment}_{i,t} = \left( \frac{\text{Recruitment}_{i,t} - \text{Recruitment}_{i,t-1}}{\text{Recruitment}_{i,t-1}} \right) \times 100 \]

\[ \text{Change in SSB}_{i,t} = \left( \frac{\text{SSB}_{i,t} - \text{SSB}_{i,t-1}}{\text{SSB}_{i,t-1}} \right) \times 100 \]

These metrics allow us to analyze both the productivity and the temporal dynamics of the population under each scenario $i$.

These two panels in Figure 17 provide insight into the stock-recruitment dynamics of Antarctic krill under four assessment scenarios, each incorporating different levels of ecosystem complexity. Left panel compares the predicted recruitment across years with corresponding spawning biomass values for the four model scenarios (s1.1 to s1.4). Scenario s1.1 (red) shows a relatively flat and optimistic Beverton-Holt relationship, with high recruitment predicted even at low levels of SSB. The other scenarios, particularly s1.3 (gray) and s1.4 (black), show much more constrained recruitment estimates and a more saturating relationship, where recruitment increases only slightly with increasing SSB. This suggests that incorporating environmental (s1.3) and predator (s1.4) effects leads to a more conservative and ecologically realistic recruitment dynamic. Year labels illustrate how certain years (e.g., 2004, 2016, 2022) deviate between scenarios, especially at low biomass levels.

Right panel displays recruitment efficiency (Recruitment/SSB) as a function of relative spawning biomass (SSB / SSB₀). In all models, recruitment efficiency is highest at very low biomass levels and declines as SSB increases, consistent with compensatory dynamics. However, s1.3 and s1.4 (which include environmental effects) show significantly lower overall productivity across all SSB levels compared to s1.1 and s1.2. This indicates that the inclusion of ecosystem drivers reduces the per-capita productivity of the stock, especially under depleted conditions.

Together, these plots indicate that ecosystem-informed models (s1.3 and s1.4) produce more constrained and precautionary estimates of krill productivity. The more optimistic recruitment predicted by s1.1 (and to a lesser extent s1.2) may overestimate stock resilience, particularly in low SSB conditions. These findings emphasize the importance of accounting for environmental variability and predator dynamics in stock assessment models to avoid overoptimistic management advice and better reflect the biological limits of the krill population.

Left panel: Stock–recruitment relationships for Antarctic krill under four assessment scenarios. Predicted recruitment as a function of spawning stock biomass (SSB), with fitted Beverton-Holt curves and annual labels for selected years. Right panel: Recruitment per unit of SSB plotted against relative SSB (SSB/SSBo), illustrating differences in per-capita productivity across models ( s1.1 (no ecosystem variables), s1.2 (with predator effects), s1.3 (with environmental variables), and s1.4 (with both predator and environmental effects))

Figure 17: Left panel: Stock–recruitment relationships for Antarctic krill under four assessment scenarios. Predicted recruitment as a function of spawning stock biomass (SSB), with fitted Beverton-Holt curves and annual labels for selected years. Right panel: Recruitment per unit of SSB plotted against relative SSB (SSB/SSBo), illustrating differences in per-capita productivity across models ( s1.1 (no ecosystem variables), s1.2 (with predator effects), s1.3 (with environmental variables), and s1.4 (with both predator and environmental effects))

Recruit deviation

Explotation Rate

Explotation rato (havest rate) in Figure 18

Figure 18: Harves rate by scenario in krill overtime

Diagnosis and robustness

A rigorous model diagnosis is essential to ensure the reliability and robustness of stock assessment models. The key steps for a good practice in model diagnosis include:

Convergence Check: The model must reach a final convergence criterion of 1.0e-04 to ensure numerical stability and reliable parameter estimation.
Residual Analysis: Both visual inspection and statistical metrics are used to evaluate model residuals, helping to detect patterns of bias or misfit.
Retrospective Analysis: The Mohn’s rho parameter is used to assess the consistency of model estimates when sequentially removing recent years of data, identifying potential overestimation or underestimation trends.
Likelihood Profile Analysis: This approach examines how the likelihood function behaves across a range of parameter values, providing insight into parameter uncertainty and model sensitivity.

This framework follows the recommendations outlined by Carvalho et al. (2021), aiming to enhance transparency and reproducibility in model evaluation.

Convergence Criteria

The convergence criterion used for model calibration is set to a final threshold of 0.0001 (or equivalently 1.0e-04). This criterion defines the minimum acceptable difference between successive model iterations. Convergence is considered achieved when the absolute change in the objective function value or key parameters falls below this threshold. A smaller convergence value ensures that the model achieves a high degree of accuracy and stability in its final estimates, indicating that further iterations are unlikely to result in significant changes to the parameter estimates.

Residuals

Figure ??, 20, 21 and 22 displays bubble plots of krill length-frequency distributions across different spatial strata and years. Each panel corresponds to a fishery-dependent or independent data source (e.g., FISHERYBS, SURVEYBS), and shows the proportional abundance of krill in different length classes (y-axis) over time (x-axis). The size of each circle is proportional to the relative frequency or standardized number of individuals in a given length bin for a specific year, providing a visual summary of size structure dynamics across regions and time periods.

Figure 19: Pearson residuals, comparing across fleets (plot 1 of 2) Closed bubbles are positive residuals (observed > expected) and open bubbles are negative residuals (observed < expected) s1.1

$Pearson residuals, comparing across fleets (plot 1 of 2) Closed bubbles are positive residuals (observed > expected) and open bubbles are negative residuals (observed < expected) s1.2$

Figure 20: Pearson residuals, comparing across fleets (plot 1 of 2) Closed bubbles are positive residuals (observed > expected) and open bubbles are negative residuals (observed < expected) s1.2

Figure 21: Pearson residuals, comparing across fleets (plot 1 of 2) Closed bubbles are positive residuals (observed > expected) and open bubbles are negative residuals (observed < expected) s1.3

Figure 22: Pearson residuals, comparing across fleets (plot 1 of 2) Closed bubbles are positive residuals (observed > expected) and open bubbles are negative residuals (observed < expected) s1.4

This Figure 23 shows the Pearson residuals of predicted length distributions for krill across four modeling scenarios, each incorporating different levels of ecosystem complexity. The residuals are visualized by year and length bin, allowing an assessment of model fit over time and across the size structure of the population. In the “Without Ecosystem Variables” scenario (top left), substantial lack of fit is evident across multiple years and length classes, particularly in the late 1990s and early 2000s. Positive Pearson residuals indicate that the model tends to underestimate frequencies of certain length bins, especially among smaller and mid-sized individuals, suggesting poor representation of recruitment and growth dynamics in the absence of environmental and predator effects. The “With Predator” scenario (top right) shows some improvement in model fit, with slightly more balanced residuals. However, there are still consistent deviations in the mid-size ranges across multiple years, indicating that predation alone does not fully explain observed variability in length composition. In the “With Environment” scenario (bottom left), there is a noticeable reduction in residual magnitude across several years and size classes, implying that incorporating environmental covariates helps align the model more closely with observed size distributions. This is particularly apparent in the early 2000s, where the residuals are closer to zero across most bins. Finally, the “With Predator and Ecosystem” scenario (bottom right) shows the most consistent reduction in Pearson residuals across time and size structure. The distribution of residuals is more homogeneous and centered around zero, indicating that the combined inclusion of predator and environmental drivers leads to the best model fit among the four scenarios.

Overall, the results support the conclusion that ecosystem-informed models better capture the temporal and size-structured dynamics of krill, with the joint inclusion of predator and environmental variables yielding the most robust representation of observed length compositions.

Figure 23: Pearson residual by scenario and fleet

Also, we can check density of residual distribution in Figure 24

Figure 24: Density of residual distribution by scenarios in krill model

Residual consistency

Residual analysis is a critical component of model diagnostics in stock assessments. It helps evaluate the fit of the model to observed data and detect potential biases or inconsistencies. This process is applied to both length composition data and abundance indices such as CPUE (Catch Per Unit Effort) and survey-derived estimates. For length composition data, residuals represent the difference between observed and model-predicted length distributions. The standardized residuals are calculated as the difference between observed and expected proportions at each length bin. These residuals are plotted by year to identify systematic trends, biases, or inconsistencies in the data. Ideally, they should be randomly distributed around zero, indicating no systematic over- or underestimation.

For abundance indices such as CPUE and fishery-independent surveys, residuals are analyzed to assess model fit and potential sources of bias. Residuals are computed as the difference between observed index values and those predicted by the model, typically standardized by dividing by the standard error to facilitate comparison across years. These residuals are then plotted over time to evaluate trends. A shaded confidence region, like the green area in the provided plot, represents expected variability, with outliers highlighted in red or other distinct markers. Persistent positive or negative residuals may indicate systematic bias in the model or data collection process.

Statistical diagnostics are also performed to check for autocorrelation in residuals, which can indicate potential model misspecifications. When mean residual values are close to zero, the model fit is considered unbiased. By integrating these residual analyses for both length and abundance indices, stock assessment models can be refined, improving their reliability and increasing confidence in the assessment results.

## 
##  Running Runs Test Diagnosics for Mean length 
## Plotting Residual Runs Tests

## 
## Runs Test stats by Mean length:

##         Index runs.p     test  sigma3.lo sigma3.hi type
## 1   FISHERYBS  0.218   Passed -0.1189633 0.1189633  len
## 2   FISHERYEI  0.013   Failed -0.2839347 0.2839347  len
## 3   FISHERYGS  0.912   Passed -0.1865475 0.1865475  len
## 4 FISHERYJOIN     NA Excluded         NA        NA  len
## 5 FISHERYSSIW  0.230   Passed -0.1176572 0.1176572  len
## 6    SURVEYBS  0.221   Passed -0.2220680 0.2220680  len
## 7    SURVEYEI  0.595   Passed -0.2119927 0.2119927  len
## 8    SURVEYGS  0.454   Passed -0.2928749 0.2928749  len
## 9  SURVEYJOIN  0.541   Passed -0.4156365 0.4156365  len

## 
##  Running Runs Test Diagnosics for Mean length 
## Plotting Residual Runs Tests

## 
## Runs Test stats by Mean length:

##          Index runs.p     test   sigma3.lo  sigma3.hi type
## 1    FISHERYBS  0.150   Passed -0.04986852 0.04986852  len
## 2    FISHERYEI  0.013   Failed -0.30357206 0.30357206  len
## 3    FISHERYGS  0.346   Passed -0.38742103 0.38742103  len
## 4  FISHERYJOIN     NA Excluded          NA         NA  len
## 5  FISHERYSSIW  0.230   Passed -0.14854188 0.14854188  len
## 6     SURVEYBS  0.001   Failed -0.29126384 0.29126384  len
## 7     SURVEYEI  0.627   Passed -0.22413735 0.22413735  len
## 8     SURVEYGS  0.409   Passed -0.35196215 0.35196215  len
## 9   SURVEYJOIN  0.500   Passed -0.44688062 0.44688062  len
## 10    PREDATOR  0.607   Passed -0.18346375 0.18346375  len

## 
##  Running Runs Test Diagnosics for Mean length 
## Plotting Residual Runs Tests

## 
## Runs Test stats by Mean length:

##         Index runs.p     test  sigma3.lo sigma3.hi type
## 1   FISHERYBS  0.744   Passed -0.0853166 0.0853166  len
## 2   FISHERYEI  0.013   Failed -0.2797884 0.2797884  len
## 3   FISHERYGS  0.179   Passed -0.2016250 0.2016250  len
## 4 FISHERYJOIN     NA Excluded         NA        NA  len
## 5 FISHERYSSIW  0.230   Passed -0.1211104 0.1211104  len
## 6    SURVEYBS  0.631   Passed -0.2273600 0.2273600  len
## 7    SURVEYEI  0.786   Passed -0.1820963 0.1820963  len
## 8    SURVEYGS  0.136   Passed -0.2945415 0.2945415  len
## 9  SURVEYJOIN  0.500   Passed -0.4188837 0.4188837  len

## 
##  Running Runs Test Diagnosics for Mean length 
## Plotting Residual Runs Tests

## 
## Runs Test stats by Mean length:

##          Index runs.p     test   sigma3.lo  sigma3.hi type
## 1    FISHERYBS  0.500   Passed -0.04736845 0.04736845  len
## 2    FISHERYEI  0.013   Failed -0.27981814 0.27981814  len
## 3    FISHERYGS  0.013   Failed -0.26753759 0.26753759  len
## 4  FISHERYJOIN     NA Excluded          NA         NA  len
## 5  FISHERYSSIW  0.064   Passed -0.11817640 0.11817640  len
## 6     SURVEYBS  0.001   Failed -0.29914741 0.29914741  len
## 7     SURVEYEI  0.627   Passed -0.22176277 0.22176277  len
## 8     SURVEYGS  0.198   Passed -0.36228949 0.36228949  len
## 9   SURVEYJOIN  0.500   Passed -0.44222599 0.44222599  len
## 10    PREDATOR  0.607   Passed -0.26332169 0.26332169  len

## 
##  Running Runs Test Diagnosics for Index 
## Plotting Residual Runs Tests

## 
## Runs Test stats by Index:

##          Index runs.p     test  sigma3.lo sigma3.hi type
## 1    FISHERYBS  0.001   Failed -0.8964068 0.8964068 cpue
## 2    FISHERYEI  0.100   Passed -1.1328816 1.1328816 cpue
## 3    FISHERYGS  0.797   Passed -1.9827363 1.9827363 cpue
## 4  FISHERYJOIN  0.140   Passed -1.1347812 1.1347812 cpue
## 5  FISHERYSSIW  0.001   Failed -0.7101892 0.7101892 cpue
## 6     SURVEYBS  0.333   Passed -2.9583720 2.9583720 cpue
## 7     SURVEYEI  0.771   Passed -2.6776080 2.6776080 cpue
## 8     SURVEYGS  0.448   Passed -3.2527164 3.2527164 cpue
## 9   SURVEYJOIN  0.001   Failed -3.1976327 3.1976327 cpue
## 10  SURVEYSSIW     NA Excluded         NA        NA cpue

## 
##  Running Runs Test Diagnosics for Index 
## Plotting Residual Runs Tests

## 
## Runs Test stats by Index:

##          Index runs.p     test  sigma3.lo sigma3.hi type
## 1    FISHERYBS  0.143   Passed -1.1809409 1.1809409 cpue
## 2    FISHERYEI  0.100   Passed -1.4337655 1.4337655 cpue
## 3    FISHERYGS  0.797   Passed -2.5816226 2.5816226 cpue
## 4  FISHERYJOIN  0.582   Passed -1.2562104 1.2562104 cpue
## 5  FISHERYSSIW  0.285   Passed -0.8795652 0.8795652 cpue
## 6     SURVEYBS  0.001   Failed -2.3451953 2.3451953 cpue
## 7     SURVEYEI  0.684   Passed -2.5556960 2.5556960 cpue
## 8     SURVEYGS  0.753   Passed -3.4160804 3.4160804 cpue
## 9   SURVEYJOIN  0.268   Passed -3.2923986 3.2923986 cpue
## 10  SURVEYSSIW     NA Excluded         NA        NA cpue
## 11    PREDATOR  0.001   Failed -0.5495260 0.5495260 cpue

## 
##  Running Runs Test Diagnosics for Index 
## Plotting Residual Runs Tests

## 
## Runs Test stats by Index:

##          Index runs.p     test  sigma3.lo sigma3.hi type
## 1    FISHERYBS  0.089   Passed -0.9904944 0.9904944 cpue
## 2    FISHERYEI  0.095   Passed -1.3238558 1.3238558 cpue
## 3    FISHERYGS  0.530   Passed -2.0047873 2.0047873 cpue
## 4  FISHERYJOIN  0.140   Passed -1.0957547 1.0957547 cpue
## 5  FISHERYSSIW  0.000   Failed -0.7353154 0.7353154 cpue
## 6     SURVEYBS  0.184   Passed -2.4900423 2.4900423 cpue
## 7     SURVEYEI  0.684   Passed -2.6110623 2.6110623 cpue
## 8     SURVEYGS  0.448   Passed -3.2351728 3.2351728 cpue
## 9   SURVEYJOIN  0.718   Passed -3.1774337 3.1774337 cpue
## 10  SURVEYSSIW     NA Excluded         NA        NA cpue

## 
##  Running Runs Test Diagnosics for Index 
## Plotting Residual Runs Tests

## 
## Runs Test stats by Index:

##          Index runs.p     test  sigma3.lo sigma3.hi type
## 1    FISHERYBS  0.219   Passed -1.1537055 1.1537055 cpue
## 2    FISHERYEI  0.095   Passed -1.3503280 1.3503280 cpue
## 3    FISHERYGS  0.240   Passed -2.3807098 2.3807098 cpue
## 4  FISHERYJOIN  0.500   Passed -1.3449663 1.3449663 cpue
## 5  FISHERYSSIW  0.028   Failed -0.8088713 0.8088713 cpue
## 6     SURVEYBS  0.001   Failed -2.3417309 2.3417309 cpue
## 7     SURVEYEI  0.684   Passed -2.6617553 2.6617553 cpue
## 8     SURVEYGS  0.753   Passed -3.3151881 3.3151881 cpue
## 9   SURVEYJOIN  0.268   Passed -3.4148453 3.4148453 cpue
## 10  SURVEYSSIW     NA Excluded         NA        NA cpue
## 11    PREDATOR  0.001   Failed -0.5495260 0.5495260 cpue

Residual Analysis and RMSE

Residual analysis of mean length data is a fundamental diagnostic tool in stock assessments. It helps evaluate whether the model provides an unbiased fit to the observed data and detects potential biases over time. In this figure, mean length residuals are plotted across years, differentiated by data source, including fishery-dependent (FISHERY) and fishery-independent (SURVEY) datasets, as well as predator-related observations (PREDATOR). The residuals represent the deviation of observed mean length from model-predicted values, standardized to facilitate interpretation.

The black line represents a locally estimated scatterplot smoothing (Loess) curve, which provides a trend line to visualize systematic deviations over time. The presence of persistent positive or negative trends in the residuals may indicate biases in the growth model, selectivity assumptions, or misrepresentation of recruitment variability. The gray bars highlight periods where residual variability is particularly high, suggesting potential inconsistencies between observed and predicted size structures.

RMSE quantifies the overall deviation between observed and predicted values, providing an aggregate measure of model fit. Lower RMSE values indicate better agreement between observed and predicted data. In fisheries stock assessment (Hurtado-ferro et al., 2015), RMSE thresholds for acceptable model performance typically range between 10% and 30%, depending on the data quality and complexity of the population dynamics being modeled. Values exceeding this range suggest potential biases, requiring further investigation into the model structure, parameter estimation, or data sources.

By analyzing residual patterns and RMSE values, the model can be refined to improve the accuracy of mean length predictions, ultimately enhancing the reliability of stock assessment outcomes and management recommendations (Figure 25).

## Plotting JABBA residual plot

## 
## RMSE stats by Index:

## Plotting JABBA residual plot

## 
## RMSE stats by Index:

## Plotting JABBA residual plot

## 
## RMSE stats by Index:

## Plotting JABBA residual plot

Figure 25: Time series of RMSE of length compositions by scenario

## 
## RMSE stats by Index:

Figure 26 show RMSE to index.

## Plotting JABBA residual plot

## 
## RMSE stats by Index:

## Plotting JABBA residual plot

## 
## RMSE stats by Index:

## Plotting JABBA residual plot

## 
## RMSE stats by Index:

## Plotting JABBA residual plot

$Time series of RMSE of CPUE compositions by scenario$

Figure 26: Time series of RMSE of CPUE compositions by scenario

## 
## RMSE stats by Index:

Comparision RMSE

The Figure 27 presents evaluation results for krill models across different scenarios (s1.1 to s1.4). The RMSE (Root Mean Square Error) values range from 0.5 to 1.5, with lower values indicating better model accuracy. The Residuals vary between -1.0 and 1.0, showing deviations between predicted and observed values. Smaller residuals suggest better model fit. Scenario s1.4 appears to have the lowest RMSE (0.5), indicating higher precision, while residuals near zero in some scenarios reflect balanced predictions.

Figure 27: RMSE and Residual values for krill model evaluation across scenarios (s1.1-s1.4), highlighting precision and prediction errors.

This boxplot compares a summary of the Root Mean Square Error (RMSE) across four different models (s1.1, s1.2, s1.3, and s1.4). RMSE serves as an indicator of model accuracy, with lower values representing better predictive performance.

s1.1 exhibits the lowest median RMSE, suggesting it has the best overall fit among the four models. In contrast, Models 1.2, 1.3, and 1.4 show higher RMSE values, indicating comparatively lower predictive accuracy. The interquartile range (IQR) of these models is relatively similar, suggesting comparable variability in RMSE across models. Additionally, s1.1 has an outlier above 1.5 RMSE, which could indicate a case where the model’s predictions deviated significantly from observed values.

Overall, this analysis highlights that incorporating different environmental or predator-related variables in the models impacts their predictive ability. The differences in RMSE suggest that some models may overfit or underfit the recruitment patterns of krill, emphasizing the need to refine model selection based on ecological and statistical considerations.

Retrospective Analysis in Model Evaluation

Retrospective analyses provide insights into the differences in estimation patterns (underestimation or overestimation) among the models evaluated. These analyses assess the consistency and reliability of stock assessment models by systematically removing the most recent years of data and comparing the resulting estimates with the full dataset. In this study, we conducted a retrospective analysis to examine the sensitivity of our recruitment and spawning stock biomass (SSB) estimates to the inclusion or exclusion of recent data. By applying this approach to multiple models, we identified potential biases and evaluated the stability of the recruitment estimates over time. The retrospective patterns were assessed by calculating the relative error between the predictions of truncated datasets and the full dataset. These differences allowed us to detect trends in model performance, such as systematic overestimation or underestimation of key population parameters.

Using retro() and SSplotRetro() functions, we obtain main results

Retrospective analysis for spawning biomass (Figure 28)

Figure 28: Retrospective analysis for spawning biomass by scenario in krill

Retrospective analysis for fishing mortality (Figure 29)

Figure 29: Retrospective analysis for fishing mortality by scenario in krill

Figure 30 presents the retrospective bias (Rho) in Spawning Stock Biomass (SSB) and Fishing Mortality (F) across different peel years (2019–2016) and for the combined period under four model scenarios (s1.1 to s1.4). In the SSB panel, all models exhibit negative Rho values across the years, indicating a downward bias. The magnitude of the bias varies among models, with s1.3 and s1.4 showing the most pronounced deviations. In the F panel, Rho values are consistently positive, suggesting an upward bias. The combined Rho values for each metric confirm these trends, showing persistent differences among model scenarios.

Figure 30: Summary of retrospective analisis by scenario in F and SSB

See Table @ref(tab:rho_parameters) for details.

(#tab:rho_parameters)Rho parameter by model and quantity (SSB and F)
Model	Quant	Rho.type	Rho.peel	Rho.Rho	Rho.ForcastRho
s1.1	SSB	SSB	2019	-0.2970246	-0.2944080
s1.1	SSB	SSB	2018	-0.2707308	-0.5103302
s1.1	SSB	SSB	2017	-0.1872940	-0.3297243
s1.1	SSB	SSB	2016	-0.1779392	-0.1957271
s1.1	SSB	SSB	Combined	-0.2332471	-0.3325474
s1.1	F	F	2019	1.1909249	2.5686552
s1.1	F	F	2018	0.7828193	1.5773644
s1.1	F	F	2017	0.8435982	1.0765231
s1.1	F	F	2016	0.3245750	1.1174145
s1.1	F	F	Combined	0.7854794	1.5849893
s1.2	SSB	SSB	2019	-0.0827150	-0.1016893
s1.2	SSB	SSB	2018	-0.4893559	-0.5608395
s1.2	SSB	SSB	2017	-0.7272409	-0.7935804
s1.2	SSB	SSB	2016	-0.4875969	-0.8027882
s1.2	SSB	SSB	Combined	-0.4467272	-0.5647244
s1.2	F	F	2019	1.0251379	3.4863606
s1.2	F	F	2018	1.0700408	3.1240360
s1.2	F	F	2017	3.5165245	3.4961678
s1.2	F	F	2016	0.8270261	5.1191803
s1.2	F	F	Combined	1.6096823	3.8064362
s1.3	SSB	SSB	2019	-0.2631174	-0.3176714
s1.3	SSB	SSB	2018	-0.3709787	-0.4678843
s1.3	SSB	SSB	2017	-0.3342903	-0.5396513
s1.3	SSB	SSB	2016	-0.3209827	-0.3946523
s1.3	SSB	SSB	Combined	-0.3223423	-0.4299648
s1.3	F	F	2019	1.1085776	2.9726497
s1.3	F	F	2018	0.8449169	2.3280948
s1.3	F	F	2017	1.3134959	2.0784872
s1.3	F	F	2016	0.7604930	2.1222896
s1.3	F	F	Combined	1.0068708	2.3753803
s1.4	SSB	SSB	2019	-0.1266622	-0.2006111
s1.4	SSB	SSB	2018	-0.4523622	-0.6020788
s1.4	SSB	SSB	2017	-0.6106873	-0.8405322
s1.4	SSB	SSB	2016	-0.4535118	-0.7243463
s1.4	SSB	SSB	Combined	-0.4108059	-0.5918921
s1.4	F	F	2019	0.8803373	3.3155514
s1.4	F	F	2018	0.8286323	3.1905901
s1.4	F	F	2017	3.6632985	6.7126908
s1.4	F	F	2016	0.4471872	14.4300252
s1.4	F	F	Combined	1.4548638	6.9122144

Hindcast Cross-Validation and Prediction Skill

The Hindcast Cross-Validation (HCxval) diagnostic in Stock Synthesis is implemented using the model outputs generated by the r4ss::SS_doRetro() and using SSplotHCval() function. This diagnostic evaluates the predictive performance of the model by comparing hindcast predictions with observed data. To assess prediction skill, we employ the Mean Absolute Scaled Error (MASE) as a robust metric. MASE is calculated by scaling the mean absolute error of the model predictions relative to the mean absolute error of a naïve baseline prediction. Specifically, the MASE score is computed as follows:

A MASE score greater than 1 indicates that the model’s average forecasts are less accurate than a random walk model.
A MASE score equal to 1 suggests that the model’s accuracy is similar to that of a random walk.
A MASE score less than 1 indicates that the model performs better than a random walk.
A MASE score of 0.5, for example, indicates that the model’s forecasts are twice as accurate as the naïve baseline prediction, suggesting the model has predictive skill.

Hindcast validation for s1.1 (Figure 31)

Figure 31: Hindcast validation for s1.1 by fleet

Hindcast validation for s1.2 (Figure 32)

Figure 32: Hindcast validation for s1.2 by fleet

Hindcast validation for s1.3 (Figure 33)

Figure 33: Hindcast validation for s1.3 by fleet

Hindcast validation for s1.4 (Figure 34)

Figure 34: Hindcast validation for s1.4 by fleet

Likelihood tables

See Table 7 for details.

Table 7: Model Diagnosis Results
Description	s1.1	s1.2	s1.3	s1.4
Convergency	0.000	0.000	0.000	0.000
AIC	2668.080	51985.800	3234.900	50435.600
Total_like	1291.040	25963.900	1591.450	25188.800
N_Params	86.000	58.000	52.000	58.000
Survey_like	412.880	23344.100	645.637	23313.300
Length_comp_like	846.103	2593.080	933.491	1860.930
RMSE_index	81.300	85.700	87.000	86.800
RMSE_length	9.700	16.600	10.700	16.800
MASE	0.427	0.517	0.427	0.517
Retro_Rho_ssb	-0.233	-0.447	-0.322	-0.411
Forecast_Rho_ssb	-0.333	-0.565	-0.430	-0.592
Forecast_Rho_f	0.785	1.610	1.007	1.455
Rho_f	1.585	3.806	2.375	6.912

As shown in Table 8, the models differ substantially in key parameter estimates and likelihood contributions.

Table 8: Model parameter and likelihood comparison
Label	s1.1	s1.2	s1.3	s1.4
TOTAL_like	1291.0400	25963.9000	1591.45000	25188.800000
Survey_like	412.8800	23344.1000	645.63700	23313.300000
Length_comp_like	846.1030	2593.0800	933.49100	1860.930000
Parm_priors_like	0.4809	9.3817	5.01577	9.019300
Recr_Virgin_billions	107.9620	415.8000	66.89600	368.857000
SR_LN(R0)	18.4973	19.8457	18.01870	19.725900
SR_LN(R0)_dev_se	NA	NA	0.50000	0.500000
SR_LN(R0)_dev_autocorr	NA	NA	0.00000	0.000000
SR_LN(R0)_DEVmult_2000	NA	NA	-0.01748	-0.168254
SR_LN(R0)_DEVmult_2001	NA	NA	0.12348	0.054449
SR_LN(R0)_DEVmult_2002	NA	NA	0.06134	0.050924
SR_LN(R0)_DEVmult_2003	NA	NA	-0.60447	-0.632315
SR_LN(R0)_DEVmult_2004	NA	NA	-0.15152	-0.163070
SR_LN(R0)_DEVmult_2005	NA	NA	-0.07692	-0.115789
SR_LN(R0)_DEVmult_2006	NA	NA	0.09326	0.054163
SR_LN(R0)_DEVmult_2007	NA	NA	0.02918	-0.022446
SR_LN(R0)_DEVmult_2008	NA	NA	-0.21759	-0.243368
SR_LN(R0)_DEVmult_2009	NA	NA	-0.05644	-0.127149
SR_LN(R0)_DEVmult_2010	NA	NA	0.01351	-0.101928
SR_LN(R0)_DEVmult_2011	NA	NA	0.11352	-0.042061
SR_LN(R0)_DEVmult_2012	NA	NA	0.12875	0.030838
SR_LN(R0)_DEVmult_2013	NA	NA	-0.39951	-0.067914
SR_LN(R0)_DEVmult_2014	NA	NA	-0.03596	0.004577
SR_LN(R0)_DEVmult_2015	NA	NA	0.11805	-0.028420
SR_LN(R0)_DEVmult_2016	NA	NA	0.16990	0.002295
SR_LN(R0)_DEVmult_2017	NA	NA	0.03144	-0.055025
SR_LN(R0)_DEVmult_2018	NA	NA	0.28965	0.153088
SR_LN(R0)_DEVmult_2019	NA	NA	0.34139	0.149552
SR_LN(R0)_DEVmult_2020	NA	NA	-0.18234	-0.355401
SSB_Virgin	3565180.0000	2213450.0000	2209070.00000	2028880.000000
Bratio_2020	1.8797	0.5020	1.81788	0.838206
SPRratio_2020	0.1443	0.1207	0.17600	0.104103

Total likelihood in Figure 35

Figure 35: total likelihood composition by scenario

This bar plot presents the log-normal likelihood contributions of different model components in an Antarctic krill stock assessment under four scenarios (s1.1, s1.2, s1.3, and s1.4).

TOTAL Likelihood: The overall likelihood is highest for scenario s1.4 (black), followed by s1.2 (light orange). This suggests that s1.4 provides the best overall model fit, while s1.1 (dark red) has the lowest likelihood, indicating a poorer fit.

Survey Component: The survey likelihood follows a similar pattern, with s1.4 and s1.2 showing better agreement with the data compared to s1.1.

Recruitment, Parameter Priors, and Parameter Deviations: These components contribute minimally to the total likelihood, indicating that they do not heavily influence model fit differences.

Length Composition: There is some variation in this component, with s1.2 showing a larger contribution than s1.1 and s1.3.

Scenario s1.4 seems to fit the data best, followed closely by s1.2. Scenario s1.1 has the poorest likelihood, meaning it is the least supported by the data. If s1.4 includes both environmental and predator effects, this suggests that incorporating ecosystem variables significantly improves the stock assessment model for Antarctic krill.

Statistics analisys differences bewteen models

To evaluate the residual behavior across model scenarios, we computed residuals as the difference between observed and expected values (residual = Obs - Exp). Basic statistics, including sample size (n()), mean (mean()), and standard deviation (sd()), were calculated for each combination of model and type. To test the normality of residuals, we applied the Shapiro-Wilk test (shapiro.test()) (Shapiro & Wilk, 1965), which is appropriate for small to moderate sample sizes. Temporal autocorrelation was assessed using the Ljung-Box test (Box.test() (Ljung & Box, 1978) with type = "Ljung-Box" and lag = 10), evaluating the null hypothesis of independence across lags. To detect heteroscedasticity, we used the Breusch-Pagan test (bptest() from the lmtest package) (Breusch & Pagan, 1979), fitting a linear model of residuals against year (residual ~ Yr) and testing for non-constant variance in the residuals. These diagnostics provide insight into the validity of model assumptions across different scenarios.

As shown in Table 9, the residuals exhibit different statistical properties across model scenarios.

Table 9: Summary statistics and residual diagnostic tests by type and model
type	model	N	Mean	SD	ljung_p	bp_p
Index	s1.1	159	105609.8	1.290513e+06	0.00080	0.27150
Index	s1.2	189	106345.1	1.159477e+06	0.00000	0.06413
Index	s1.3	159	143409.3	1.277875e+06	0.00072	0.01052
Index	s1.4	189	126347.0	1.131044e+06	0.00002	0.02942
Length	s1.1	3752	0.0	3.807000e-02	0.00000	0.00004
Length	s1.2	4564	0.0	4.741000e-02	0.00000	0.00001
Length	s1.3	3752	0.0	3.992000e-02	0.00000	0.00067
Length	s1.4	4564	0.0	4.655000e-02	0.00000	0.00000

The residual analysis shows distinct patterns between the Index and Length-based models across all four scenarios (s1.1 to s1.4). For Index models, the mean residuals are notably large (ranging from approximately 105,610 to 143,409), with high standard deviations exceeding 1 million units, indicating substantial variability. The Shapiro-Wilk test consistently returned p-values of 0, strongly rejecting the null hypothesis of normality. The Ljung-Box test yielded very low p-values (≤ 0.00080), suggesting significant temporal autocorrelation in the residuals. In terms of heteroscedasticity, the Breusch-Pagan test showed moderate evidence of non-constant variance, with p-values ranging from 0.27150 (s1.1) to 0.01052 (s1.3), the latter indicating strong evidence of heteroscedasticity.

In contrast, Length models present mean residuals near zero with low standard deviations (approximately 0.038–0.047), suggesting good overall model fit and low dispersion. However, despite the small residual magnitude, the Shapiro-Wilk and Ljung-Box tests consistently return p-values of 0, indicating violation of normality and presence of temporal autocorrelation. Furthermore, the Breusch-Pagan test returned extremely low p-values (≤ 0.00067), indicating strong evidence of heteroscedasticity across all Length-based scenarios.

These results suggest that while Length models achieve better central tendencies (mean near zero and low variance), they still suffer from violations of key assumptions—most notably autocorrelation and heteroscedasticity. Index models, on the other hand, display large-scale variability and systematic residual patterns that may require model refinement or transformation.

Another statistical analysis;

## 
##  Shapiro-Wilk normality test
## 
## data:  spawn_bio_1
## W = 0.65734, p-value = 4.827e-08

## 
##  Shapiro-Wilk normality test
## 
## data:  spawn_bio_4
## W = 0.71222, p-value = 3.39e-07

## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value  Pr(>F)  
## group  1  4.9249 0.02962 *
##       72                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

## 
##  Two Sample t-test
## 
## data:  spawn_bio_1 and spawn_bio_4
## t = 4.2682, df = 72, p-value = 5.915e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  1387011 3818074
## sample estimates:
## mean of x mean of y 
##   4600304   1997762

## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  spawn_bio_1 and spawn_bio_4
## W = 1256, p-value = 6.698e-10
## alternative hypothesis: true location shift is not equal to 0

Two-Sample t-Test

t-value: 4.2682
Degrees of Freedom (df): 72
p-value: 5.915e-05 (very small)
95% Confidence Interval for the Mean Difference: [1,387,011; 3,818,074]
Sample Means:
- Model 1.1: 4,600,304
- Model 1.4: 1,997,762

Since the p-value (5.915e-05) is much smaller than 0.05, we reject the null hypothesis. This means there is a statistically significant difference between the spawning biomass predicted by Model 1.1 and Model 1.4. The confidence interval suggests that Model 1.1 tends to estimate a higher spawning biomass than Model 1.4, with an estimated difference between 1.39 and 3.82 million tons.

Wilcoxon Rank Sum Test (Mann-Whitney U Test)

W-statistic: 1256
p-value: 6.698e-10 (extremely small)
Warning: “Cannot compute exact p-value with ties” (indicates that some values are repeated)

The Wilcoxon test also rejects the null hypothesis with an extremely small p-value (6.698e-10). This confirms that the distribution of spawning biomass values is significantly different between the two models. Since the Wilcoxon test does not assume normality, it supports the conclusion from the t-test, reinforcing that

Both tests strongly indicate that Model 1.1 and Model 1.4 produce significantly different estimates of spawning biomass. Specifically, Model 1.1 predicts higher values. This suggests that the factors or assumptions included in Model 1.4 result in lower spawning biomass estimates, which could have important implications for stock assessment and fisheries management.

REFERENCES

Breusch, T. S., & Pagan, A. R. (1979). A simple test for heteroscedasticity and random coefficient variation. Econometrica, 47(5), 1287–1294. https://doi.org/10.2307/1911963

Canales, C. M., Punt, A. E., & Mardones, M. (2021). Can a length-based pseudo-cohort analysis (LBPA) using multiple catch length-frequencies provide insight into population status in data-poor situations? Fisheries Research, 234(October 2020), 105810. https://doi.org/10.1016/j.fishres.2020.105810

Carvalho, F., Winker, H., Courtney, D., Kapur, M., Kell, L., Cardinale, M., Schirripa, M., Kitakado, T., Yemane, D., Piner, K. R., Maunder, M. N., Taylor, I., Wetzel, C. R., Doering, K., Johnson, K. F., & Methot, R. D. (2021). A cookbook for using model diagnostics in integrated stock assessments. Fisheries Research, 240(April), 105959. https://doi.org/10.1016/j.fishres.2021.105959

Chong, L., Mildenberger, T. K., Rudd, M. B., Taylor, M. H., Cope, J. M., Branch, T. A., Wolff, M., & Stäbler, M. (2019). Performance evaluation of data-limited, length-based stock assessment methods. ICES Journal of Marine Science, 77(1), 97–108. https://doi.org/10.1093/icesjms/fsz212

Fournier, D. A., Skaug, H. J., Ancheta, J., Ianelli, J., Magnusson, A., Maunder, M. N., Nielsen, A., & Sibert, J. (2012). AD Model Builder: Using automatic differentiation for statistical inference of highly parameterized complex nonlinear models. Optimization Methods and Software, 27(2), 233–249. https://doi.org/10.1080/10556788.2011.597854

Hurtado-ferro, F., Szuwalski, C. S., Valero, J. L., Anderson, S. C., Cunningham, C. J., Johnson, K. F., Licandeo, R., Mcgilliard, C. R., Monnahan, C. C., & Muradian, M. L. (2015). Marine Science. 72, 99–110.

ICCAT. (2003). Manual. Sampling for length frequency data and AKL description (pp. 29–41).

Lee, H. H., Maunder, M. N., & Piner, K. R. (2024). Good Practices for estimating and using length-at-age in integrated stock assessments. Fisheries Research, 270(November 2023). https://doi.org/10.1016/j.fishres.2023.106883

Ljung, G. M., & Box, G. E. P. (1978). On a measure of lack of fit in time series models. Biometrika, 65(2), 297–303. https://doi.org/10.1093/biomet/65.2.297

Methot, R., & Wetzel, C. (2013). Stock synthesis: A biological and statistical framework for fish stock assessment and fishery management. Fisheries Research, 142, 86–99. https://doi.org/10.1016/j.fishres.2012.10.012

Nielsen, A., Hintzen, N. T., Mosegaard, H., Trijoulet, V., & Berg, C. W. (2021). Multi-fleet state-space assessment model strengthens confidence in single-fleet SAM and provides fleet-specific forecast options. ICES Journal of Marine Science, 78(6), 2043–2052. https://doi.org/10.1093/icesjms/fsab078

Punt, André. (2003). The performance of a size-structured stock assessment method in the face of spatial heterogeneity in growth. Fisheries Research, 65(1-3), 391–409. https://doi.org/10.1016/j.fishres.2003.09.028

Punt, Andre, Huang, T., & Maunder, M. N. (2013). Review of integrated size-structured models for stock assessment of hard-to-age crustacean and mollusc species. ICES Journal of Marine Science, 70(1), 16–33. https://doi.org/10.4135/9781412953924.n678

Shapiro, S. S., & Wilk, M. B. (1965). An analysis of variance test for normality (complete samples). Biometrika, 52(3/4), 591–611. https://doi.org/10.2307/2333709

Taylor, I. (2019). Using R for Stock Synthesis Installing R and getting R4SS. Fisheries Science, November.

Thanassekos, S., Cox, M. J., & Reid, K. (2014). Investigating the effect of recruitment variability on length-based recruitment indices for antarctic krill using an individual-based population dynamics model. PLoS ONE, 9(12), 1–20. https://doi.org/10.1371/journal.pone.0114378

Waterhouse, L., Sampson, D. B., Maunder, M., & Semmens, B. X. (2014). Using areas-as-fleets selectivity to model spatial fishing: Asymptotic curves are unlikely under equilibrium conditions. Fisheries Research, 158, 15–25. https://doi.org/10.1016/j.fishres.2014.01.009

Winker, H., Carvalho, F., Cardinale, M., & Kell, L. (2024). ss3diags: What the package does (one line, title case). https://github.com/jabbamodel/ss3diags

Supplementary Information 1

Assessing ecosystem effects on the stock productivity of Antarctic Krill (Euphausia superba): An Integrated modeling perspective

Mardones, M; Jarvis Mason, E.T.; Piñones, A.; Krüger, L.,Santa-Cruz, F.; Watters, G., Cárdenas, C.A; Methot, R.

21 May, 2025

MODELING SETTING

Libraries

Code Repository

OVERVIEW

Spatial dimension of stock assessment

Statistical Model (`SS3`)

Data

Parameters

Index

Length comositions

Summary data sources to Modelling

Scenarios

RESULTS

Population Variables by scenario

Outputs Variables `s1.1`

Outputs Variables `s1.2`

Outputs Variables `s1.3`

Outputs Variables `s1.4`

Comparision between variables

Autocorrelation in Recruit

Relationship Stock-Recruit

Productivity and Interannual Variability by Scenario

Recruit deviation

Explotation Rate

Diagnosis and robustness

Convergence Criteria

Residuals

Residual consistency

Residual Analysis and RMSE

Comparision RMSE

Retrospective Analysis in Model Evaluation

Hindcast Cross-Validation and Prediction Skill

Likelihood tables

Statistics analisys differences bewteen models

REFERENCES

Supplementary Information 1

Assessing ecosystem effects on the stock productivity of Antarctic Krill (Euphausia superba): An Integrated modeling perspective

Mardones, M; Jarvis Mason, E.T.; Piñones, A.; Krüger, L.,Santa-Cruz, F.; Watters, G., Cárdenas, C.A; Methot, R.

21 May, 2025

MODELING SETTING

Libraries

Code Repository

OVERVIEW

Spatial dimension of stock assessment

Statistical Model (SS3)

Data

Parameters

Index

Length comositions

Summary data sources to Modelling

Scenarios

RESULTS

Population Variables by scenario

Outputs Variables s1.1

Outputs Variables s1.2

Outputs Variables s1.3

Outputs Variables s1.4

Comparision between variables

Autocorrelation in Recruit

Relationship Stock-Recruit

Productivity and Interannual Variability by Scenario

Recruit deviation

Explotation Rate

Diagnosis and robustness

Convergence Criteria

Residuals

Residual consistency

Residual Analysis and RMSE

Comparision RMSE

Retrospective Analysis in Model Evaluation

Hindcast Cross-Validation and Prediction Skill

Likelihood tables

Statistics analisys differences bewteen models

REFERENCES

Statistical Model (`SS3`)

Outputs Variables `s1.1`

Outputs Variables `s1.2`

Outputs Variables `s1.3`

Outputs Variables `s1.4`