03b Estimating population parameters: ratio estimator
2025-10-15
Source:vignettes/v03b-Estimating-population-parameters-ratio-estimator.Rmd
Introduction
The aim of this document is to demonstrate how to estimate population parameters from RDBESDataObjects.
Prerequisites
We’ll use some example data from the RDBES. For your own data it’s a good idea to check your RDBESDataObjects are valid after any manipulations you perform.
Estimation workflow
Estimation is done on an RDBESEstObject, which is generated from an RDBESDataObject using createRDBESEstObject(…). The following example uses a dataset that in many ways resembles real data from the Estonian Baltic trawling fleet fishing sprat (Sprattus sprattus) in ICES area 27.3.d.28.1 in 2022. The dataset contains both total landings and commercial sampling data.
H8ExampleEE1
#> Hierarchy 8 RDBESdataObject:
#> DE: 1
#> SD: 1
#> TE: 11 (SRSWOR: 2-3/4)
#> VS: 15 (SRSWR: 1-2/7-12)
#> LE: 15 (SRSWOR: 1/1)
#> SS: 15 (CENSUS: 1/1)
#> SA: 15 (SRSWOR: 1/794-14268)
#> BV: 3995 (CENSUS: 16-100/16-100)
#> VD: 7
#> SL: 1
#> IS: 2
#> CL: 71
#> CE: 132
The data has not gone through the RDBES upload and download procedure, so it contains invalid data types and may have other validity problems as well. However, the fields needed for estimation should be present and valid.
#to check data types
validateRDBESDataObject(H8ExampleEE1, checkDataTypes = TRUE, strict = FALSE)
Simple Estimation Example
For the simplest case of estimation we need an RDBESDataObject with CS tables and a CL table. In the following example we will estimate the total number of sprat caught in area 27.3.d.28.1 with the gear OTM_SPF_16-31_0_0 for the first and last quarters of the year.
#From the commercial landings table we need to get the total weight of the catches
CLfieldstoSum <- c("CLoffWeight")
The most important thing in this estimation is to get the same strata for the CS and CL tables. This means we want to take the samples from the same area, with the same gear and the same species. Exactly how this is done depends on the upper and lower hierarchy used and how the sampling is stratified. In the following example we are using the lower hierarchy C meaning that we are extracting the BV data as the biological data.
#get the first quarter data from CS
strataListCS <- list(LEarea = "27.3.d.28.1",
                     LEmetier6 = "OTM_SPF_16-31_0_0",
                     TEstratumName = month.name[1:3],
                     SAspeCodeFAO = "SPR")
#get the first quarter data from CL table
strataListCL <- list(CLarea = "27.3.d.28.1",
                     CLquar = 1,
                     CLmetier6 = "OTM_SPF_16-31_0_0",
                     CLspecFAO = "SPR")
#we are using the lower hierarchy C meaning that we are extracting the BV data
#as the biological data
biolCLQ1 <- addCLtoLowerCS(H8ExampleEE1, strataListCS, strataListCL,
                           combineStrata = TRUE, lowerHierarchy = "C",
                           CLfields = CLfieldstoSum)
#> Warning in getLowerTableSubsets(strataListCS, tblName, rdbes, combineStrata = combineStrata): TEstratumName is collapsed in the result into: "January|February|March"
To estimate the total number of sprat caught in area 27.3.d.28.1 with the gear OTM_SPF_16-31_0_0 for the first quarter of the year we use the function doBVestimCANUM(…). The function takes the biological data table, the columns to be added to the final result, the class units and the class breaks. The allowed class units are “Ageyear”, “Lengthmm” and “Weightg”. The class breaks are the boundaries of the classes the data is split into; in the output below the last break starts the plus group (for example “130+”). The addColumns parameter is a vector of names of columns in the biological data table that are carried into the final result. At least one column is required: the sum column, which in our case is “sumCLoffWeight”. The function returns a table with one row of estimates per class for each unique combination of values in these columns.
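As a rough illustration of how a vector of break points can map measurements to classes with a final plus group, here is a minimal base R sketch. It is not taken from the package source: the use of cut(), the left-closed intervals and the example lengths are all assumptions made for illustration, and only the label style ("70-80", "130+") is copied from the length output in this document. Age classes are labelled differently in the output (for example "1" and "8+"), so the sketch covers the length case only.

```r
# Illustration only: NOT the internal binning of doBVestimCANUM.
# Assumption: classes are left-closed, i.e. [70, 80), ..., [130, Inf).
classBreaks <- seq(70, 130, 10)

# Hypothetical fish lengths in mm
lengthsMm <- c(75, 84, 99, 104, 118, 122, 130, 135)

# Labels in the style of the printed output: "70-80", ..., "120-130", "130+"
classLabels <- c(
  paste(head(classBreaks, -1), tail(classBreaks, -1), sep = "-"),
  paste0(max(classBreaks), "+")
)

# The last break opens the plus group, so the upper bound is Inf
groups <- cut(lengthsMm,
              breaks = c(classBreaks, Inf),
              labels = classLabels,
              right = FALSE)

# Number of hypothetical fish per class
table(groups)
```

Under these assumptions a length below the first break would fall outside all classes and become NA.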
lenCANUMQ1 <- doBVestimCANUM(biolCLQ1, c("sumCLoffWeight"),
                             classUnits = "Lengthmm",
                             classBreaks = seq(70, 130, 10),
                             verbose = FALSE)
lenCANUMQ1
#> Key: <sumCLoffWeight>
#> sumCLoffWeight Group WeightgMean WeightgLen LengthmmMean lenMeas
#> <num> <fctr> <num> <int> <num> <int>
#> 1: 396210 120-130 12.230303 66 122.21212 433
#> 2: 396210 110-120 9.111348 141 113.34752 433
#> 3: 396210 80-90 3.628125 32 84.87500 433
#> 4: 396210 100-110 7.500621 161 104.13043 433
#> 5: 396210 90-100 4.985185 27 93.25926 433
#> 6: 396210 130+ 13.650000 4 132.50000 433
#> 7: 396210 70-80 2.500000 2 75.50000 433
#> targetMeas plusGroup propSample WeightIndex WeightIndexSum TWCoef
#> <int> <num> <num> <num> <num> <num>
#> 1: 433 130 0.152424942 1.864203e-03 0.008336721 47525882
#> 2: 433 130 0.325635104 2.966975e-03 0.008336721 47525882
#> 3: 433 130 0.073903002 2.681293e-04 0.008336721 47525882
#> 4: 433 130 0.371824480 2.788915e-03 0.008336721 47525882
#> 5: 433 130 0.062355658 3.108545e-04 0.008336721 47525882
#> 6: 433 130 0.009237875 1.260970e-04 0.008336721 47525882
#> 7: 433 130 0.004618938 1.154734e-05 0.008336721 47525882
#> totWeight totNum
#> <num> <num>
#> 1: 88597.9035 7244129.9
#> 2: 141008.0855 15476095.6
#> 3: 12743.0830 3512305.4
#> 4: 132545.6247 17671286.5
#> 5: 14773.6346 2963507.7
#> 6: 5992.8711 439038.2
#> 7: 548.7977 219519.1
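The relationship between the printed columns can be checked by hand. The following base R sketch reproduces propSample, WeightIndex, TWCoef, totWeight and totNum from the values printed above. The formulas are inferred from the printed numbers, not copied from the package source, so treat them as an assumption; in particular the factor 1000 is inferred from the output and presumably converts the mean weight from grams to the unit of sumCLoffWeight.

```r
# Values copied from the printed lenCANUMQ1 table above
est <- data.frame(
  Group       = c("120-130", "110-120", "80-90", "100-110", "90-100", "130+", "70-80"),
  WeightgMean = c(12.230303, 9.111348, 3.628125, 7.500621, 4.985185, 13.650000, 2.500000),
  WeightgLen  = c(66, 141, 32, 161, 27, 4, 2)
)
sumCLoffWeight <- 396210  # landed weight summed from the CL table
targetMeas     <- 433     # number of measured fish of the target species

# Share of the measured fish that falls into each class
est$propSample <- est$WeightgLen / targetMeas

# Weight contributed by each class per sampled fish
# (assumption: division by 1000 converts grams to kilograms)
est$WeightIndex <- est$propSample * est$WeightgMean / 1000
WeightIndexSum  <- sum(est$WeightIndex)

# Ratio of landed weight to the mean weight per sampled fish
TWCoef <- sumCLoffWeight / WeightIndexSum

# Raise the class shares to the landed weight
est$totWeight <- est$WeightIndex * TWCoef
est$totNum    <- est$propSample * TWCoef

print(est[, c("Group", "propSample", "WeightIndex", "totWeight", "totNum")])

# By construction the raised weights add up to the landed weight
sum(est$totWeight)
```

Because the mean weights are copied from rounded output, the recomputed values agree with the printed table only up to rounding.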
To include estimates for several quarters in the result, the same biolCL table is also created for the last quarter of the year.
strataListCS <- list(LEarea = "27.3.d.28.1",
                     LEmetier6 = "OTM_SPF_16-31_0_0",
                     TEstratumName = month.name[10:12], #the last 3 months of the year
                     SAspeCodeFAO = "SPR")
strataListCL <- list(CLarea = "27.3.d.28.1",
                     CLquar = 4,
                     CLmetier6 = "OTM_SPF_16-31_0_0",
                     CLspecFAO = "SPR")
biolCLQ4 <- addCLtoLowerCS(H8ExampleEE1, strataListCS, strataListCL,
                           combineStrata = TRUE, lowerHierarchy = "C",
                           CLfields = CLfieldstoSum)
#> Warning in getLowerTableSubsets(strataListCS, tblName, rdbes, combineStrata = combineStrata): TEstratumName is collapsed in the result into: "October|November|December"
#let's combine the two quarters into one table
biolCL <- rbind(biolCLQ1, biolCLQ4)
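The two pairs of strata lists above differ only in the months and the quarter. A small helper can keep the CS and CL definitions in step. The function name makeStrataLists and its arguments are hypothetical and not part of the package, and the sketch assumes, as in this dataset, that the TE strata are named after the English month names.

```r
# Hypothetical helper (not part of the package): builds matching CS and CL
# strata lists for one quarter.
# Assumption: TE strata are named by English month names, as in this dataset.
makeStrataLists <- function(quarter, area, metier6, speciesFAO) {
  stopifnot(length(quarter) == 1, quarter %in% 1:4)
  # months 1-3 for quarter 1, ..., months 10-12 for quarter 4
  months <- month.name[(3 * quarter - 2):(3 * quarter)]
  list(
    CS = list(LEarea = area,
              LEmetier6 = metier6,
              TEstratumName = months,
              SAspeCodeFAO = speciesFAO),
    CL = list(CLarea = area,
              CLquar = quarter,
              CLmetier6 = metier6,
              CLspecFAO = speciesFAO)
  )
}

q4 <- makeStrataLists(4, "27.3.d.28.1", "OTM_SPF_16-31_0_0", "SPR")
q4$CS$TEstratumName

# Same form as the collapsed label reported in the warning above
paste(q4$CS$TEstratumName, collapse = "|")
```

The resulting q4$CS and q4$CL hold the same values as the strataListCS and strataListCL defined above for the last quarter.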
More columns from the above table can be added to the final result. The output is broken down by unique values in these columns.
#this allows adding extra columns to the final result
addCols <- c(names(strataListCS),
             names(strataListCL),
             paste0("sum", CLfieldstoSum))
ageCANUM <- doBVestimCANUM(biolCL, addCols,
                           classUnits = "Ageyear",
                           classBreaks = 1:8,
                           verbose = FALSE)
ageCANUM
#> Key: <LEarea, LEmetier6, TEstratumName, SAspeCodeFAO, CLarea, CLquar, CLmetier6, CLspecFAO, sumCLoffWeight>
#> LEarea LEmetier6 TEstratumName SAspeCodeFAO
#> <char> <char> <char> <char>
#> 1: 27.3.d.28.1 OTM_SPF_16-31_0_0 January|February|March SPR
#> 2: 27.3.d.28.1 OTM_SPF_16-31_0_0 January|February|March SPR
#> 3: 27.3.d.28.1 OTM_SPF_16-31_0_0 January|February|March SPR
#> 4: 27.3.d.28.1 OTM_SPF_16-31_0_0 January|February|March SPR
#> 5: 27.3.d.28.1 OTM_SPF_16-31_0_0 January|February|March SPR
#> 6: 27.3.d.28.1 OTM_SPF_16-31_0_0 January|February|March SPR
#> 7: 27.3.d.28.1 OTM_SPF_16-31_0_0 January|February|March SPR
#> 8: 27.3.d.28.1 OTM_SPF_16-31_0_0 January|February|March SPR
#> 9: 27.3.d.28.1 OTM_SPF_16-31_0_0 October|November|December SPR
#> 10: 27.3.d.28.1 OTM_SPF_16-31_0_0 October|November|December SPR
#> 11: 27.3.d.28.1 OTM_SPF_16-31_0_0 October|November|December SPR
#> 12: 27.3.d.28.1 OTM_SPF_16-31_0_0 October|November|December SPR
#> 13: 27.3.d.28.1 OTM_SPF_16-31_0_0 October|November|December SPR
#> 14: 27.3.d.28.1 OTM_SPF_16-31_0_0 October|November|December SPR
#> 15: 27.3.d.28.1 OTM_SPF_16-31_0_0 October|November|December SPR
#> 16: 27.3.d.28.1 OTM_SPF_16-31_0_0 October|November|December SPR
#> CLarea CLquar CLmetier6 CLspecFAO sumCLoffWeight Group
#> <char> <num> <char> <char> <num> <fctr>
#> 1: 27.3.d.28.1 1 OTM_SPF_16-31_0_0 SPR 396210 7
#> 2: 27.3.d.28.1 1 OTM_SPF_16-31_0_0 SPR 396210 3
#> 3: 27.3.d.28.1 1 OTM_SPF_16-31_0_0 SPR 396210 5
#> 4: 27.3.d.28.1 1 OTM_SPF_16-31_0_0 SPR 396210 1
#> 5: 27.3.d.28.1 1 OTM_SPF_16-31_0_0 SPR 396210 6
#> 6: 27.3.d.28.1 1 OTM_SPF_16-31_0_0 SPR 396210 2
#> 7: 27.3.d.28.1 1 OTM_SPF_16-31_0_0 SPR 396210 8+
#> 8: 27.3.d.28.1 1 OTM_SPF_16-31_0_0 SPR 396210 4
#> 9: 27.3.d.28.1 4 OTM_SPF_16-31_0_0 SPR 225451 2
#> 10: 27.3.d.28.1 4 OTM_SPF_16-31_0_0 SPR 225451 3
#> 11: 27.3.d.28.1 4 OTM_SPF_16-31_0_0 SPR 225451 1
#> 12: 27.3.d.28.1 4 OTM_SPF_16-31_0_0 SPR 225451 4
#> 13: 27.3.d.28.1 4 OTM_SPF_16-31_0_0 SPR 225451 8+
#> 14: 27.3.d.28.1 4 OTM_SPF_16-31_0_0 SPR 225451 5
#> 15: 27.3.d.28.1 4 OTM_SPF_16-31_0_0 SPR 225451 7
#> 16: 27.3.d.28.1 4 OTM_SPF_16-31_0_0 SPR 225451 6
#> WeightgMean WeightgLen LengthmmMean lenMeas targetMeas plusGroup propSample
#> <num> <int> <num> <int> <int> <int> <num>
#> 1: 11.500000 7 125.57143 433 433 8 0.01616628
#> 2: 9.051695 118 112.49153 433 433 8 0.27251732
#> 3: 10.204167 48 120.04167 433 433 8 0.11085450
#> 4: 3.943396 53 87.01887 433 433 8 0.12240185
#> 5: 10.811111 9 122.88889 433 433 8 0.02078522
#> 6: 7.529341 167 104.28743 433 433 8 0.38568129
#> 7: 10.040000 5 121.80000 433 433 8 0.01154734
#> 8: 13.750000 26 118.07692 433 433 8 0.06004619
#> 9: 10.383740 123 111.73171 378 356 8 0.34550562
#> 10: 10.913415 82 114.18293 378 356 8 0.23033708
#> 11: 9.576316 38 107.42105 378 356 8 0.10674157
#> 12: 12.093182 44 119.25000 378 356 8 0.12359551
#> 13: 11.362500 24 119.87500 378 356 8 0.06741573
#> 14: 12.088462 26 121.80769 378 356 8 0.07303371
#> 15: 14.883333 6 131.16667 378 356 8 0.01685393
#> 16: 12.746154 13 123.46154 378 356 8 0.03651685
#> WeightIndex WeightIndexSum TWCoef totWeight totNum
#> <num> <num> <num> <num> <num>
#> 1: 0.0001859122 0.008336721 47525882 8835.643 768316.8
#> 2: 0.0024667436 0.008336721 47525882 117234.168 12951626.1
#> 3: 0.0011311778 0.008336721 47525882 53760.224 5268458.1
#> 4: 0.0004826790 0.008336721 47525882 22939.745 5817255.8
#> 5: 0.0002247113 0.008336721 47525882 10679.604 987835.9
#> 6: 0.0029039261 0.008336721 47525882 138011.650 18329843.8
#> 7: 0.0001159353 0.008336721 47525882 5509.929 548797.7
#> 8: 0.0008256351 0.008336721 47525882 39239.037 2853748.1
#> 9: 0.0035876404 0.010983427 20526471 73641.599 7092011.2
#> 10: 0.0025137640 0.010983427 20526471 51598.706 4728007.5
#> 11: 0.0010221910 0.010983427 20526471 20981.975 2191027.9
#> 12: 0.0014946629 0.010983427 20526471 30680.156 2536979.6
#> 13: 0.0007660112 0.010983427 20526471 15723.508 1383807.1
#> 14: 0.0008828652 0.010983427 20526471 18122.107 1499124.3
#> 15: 0.0002508427 0.010983427 20526471 5148.915 345951.8
#> 16: 0.0004654494 0.010983427 20526471 9554.035 749562.2
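A quick sanity check on such a table is that the raised weights in each stratum add up to the landed weight of that stratum. The base R sketch below does this with the totWeight and sumCLoffWeight values copied from the printed ageCANUM output; the variable names chk, raised and landed are illustrative only.

```r
# Values copied from the printed ageCANUM table above (rows 1-8: quarter 1,
# rows 9-16: quarter 4)
chk <- data.frame(
  CLquar = rep(c(1, 4), each = 8),
  sumCLoffWeight = rep(c(396210, 225451), each = 8),
  totWeight = c(8835.643, 117234.168, 53760.224, 22939.745,
                10679.604, 138011.650, 5509.929, 39239.037,
                73641.599, 51598.706, 20981.975, 30680.156,
                15723.508, 18122.107, 5148.915, 9554.035)
)

# Sum of the raised weights per quarter
raised <- tapply(chk$totWeight, chk$CLquar, sum)

# Landed weight per quarter (constant within a quarter, so take the first)
landed <- tapply(chk$sumCLoffWeight, chk$CLquar, function(x) x[1])

data.frame(CLquar = names(raised),
           raised = as.numeric(raised),
           landed = as.numeric(landed),
           diff   = as.numeric(raised - landed))
```

Any difference beyond the rounding of the printed values would suggest that the CS and CL strata do not match.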
The output table should be enough to populate a classic InterCatch data call table.
See also other package vignettes:
- v01a data import
- v01b manipulating rdbesdataobjects
- v02a generating probabilities
- v02b Generating zeros for species not observed
- v02c Generating NAs for species not targeted by sampling
- v02d update SA with taxon from SL
- v03a Estimating population parameters unbiased estimator
- v03b Estimating population parameters ratio estimator
- v04 Create IC format from estimation results