Estimate Catch at Number (CANUM) for Biological Variables

This function estimates catch at number (CANUM) for a specified biological variable, such as age or length. It aggregates the data by the specified columns and creates a "plus group" that pools the values at the top of the defined classes. Classes can be defined in different units (e.g., age, length, weight), and the function calculates the indices, totals, and proportions required for each group.

Usage

doBVestimCANUM(
  bv,
  addColumns,
  classUnits = "Ageyear",
  classBreaks = 1:8,
  verbose = FALSE
)

Arguments

bv: A data.table containing biological data, with columns for the biological variable, class units (e.g., Ageyear, Lengthmm, Weightg), and other relevant variables.
addColumns: A character vector of additional column names used to group the data for aggregation (e.g., BVfishId and other identifiers).
classUnits: A character string specifying the class units of the biological variable to use for grouping (e.g., "Ageyear", "Lengthmm", "Weightg"). Default is "Ageyear".
classBreaks: A numeric vector specifying the breakpoints for classifying the biological variable. The last value defines the lower bound of the "plus group". Default is 1:8 for age groups.
verbose: Logical, if TRUE, prints detailed information about the process. Default is FALSE.

Value

A data.table containing the aggregated results, including groupings, calculated means, proportions, indices, and totals for the specified biological variable.

Details

The function performs the following steps:

Validates the presence of the classUnits in the biological variable data.
Reshapes the input data using dcast and groups the biological variable into classes using cut().
Computes mean weights and lengths for each class, together with proportions and indices based on the sample size.
Creates a "plus group" for values at or above the last classBreaks value.
Calculates total weights and catch numbers, then runs a sanity check for rounding errors in the final results.
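
The classification step can be illustrated with a minimal sketch. It is not the function's actual code: the ages are made up, and it assumes left-closed intervals with the last classBreaks value as the lower bound of the plus group, as described for the classBreaks argument. The "8+" label is also only an illustrative choice.

```r
# Hypothetical ages, not taken from any real sample
ages <- c(1, 2, 2, 5, 8, 9, 12)
classBreaks <- 1:8

# Assumed labelling: one label per class, with "+" marking the plus group
labels <- c(as.character(head(classBreaks, -1)),
            paste0(tail(classBreaks, 1), "+"))

# Left-closed intervals [1,2), ..., [7,8), [8,Inf); the last is the plus group
ageClass <- cut(ages, breaks = c(classBreaks, Inf),
                right = FALSE, labels = labels)

table(ageClass)
```

With these settings, ages 8, 9, and 12 all fall into the "8+" class, while any value below the first break would become NA.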

Mathematical Logic:

Let:

$W_{mean}$ be the mean weight for each group.
$L_{mean}$ be the mean length for each group.
$n_W$ be the number of weight measurements in each group.
$N$ be the total number of measurements in the sample.
$P$ be the proportion of the sample represented by each group.
$I_W$ be the weight index for each group.
$S$ be the sum of weight indices across all groups.
$C$ be the total catch weight.
$T_W$ be the total weight for each group.
$C_{num}$ be the total catch number for each group.

The calculations are as follows:

Proportion of sample: $$P = \frac{n_W}{N}$$
Weight Index: $$I_W = P \times \left( \frac{W_{mean}}{1000} \right)$$
Sum of Weight Indices: $$S = \sum I_W$$
Total Weight Coefficient: $$\mathrm{TWCoef} = \frac{C}{S}$$
Total Weight per Group: $$T_W = I_W \times \mathrm{TWCoef}$$
Total Catch Number per Group: $$C_{num} = \frac{T_W}{\left( \frac{W_{mean}}{1000} \right)}$$
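
The formulas above can be traced on a small worked example. This is a sketch under stated assumptions, not the function's implementation: the three classes, their mean weights and counts, and the total catch are invented, the column names are hypothetical, mean weights are assumed to be in grams with the total catch in kilograms (which would explain the division by 1000), and $N$ is taken to be the sum of $n_W$ over the classes.

```r
# Hypothetical inputs: three classes, the last one a plus group
grp <- data.frame(
  class      = c("1", "2", "3+"),
  WeightMean = c(200, 500, 1000),  # W_mean, assumed to be in grams
  nW         = c(50, 30, 20)       # n_W, weight measurements per class
)
totalCatch <- 1000                 # C, assumed to be in kilograms

N <- sum(grp$nW)                   # assumption: N is the sum of n_W
grp$P <- grp$nW / N                                   # P = n_W / N
grp$WeightIndex <- grp$P * (grp$WeightMean / 1000)    # I_W
S <- sum(grp$WeightIndex)                             # S = sum of I_W
TWCoef <- totalCatch / S                              # TWCoef = C / S
grp$TotWeight <- grp$WeightIndex * TWCoef             # T_W
grp$CatchNum <- grp$TotWeight / (grp$WeightMean / 1000)  # C_num

# Sanity check: the group weights must add up to the total catch
stopifnot(isTRUE(all.equal(sum(grp$TotWeight), totalCatch)))

print(grp)
```

Substituting $I_W$ into $T_W$ shows that the mean weight cancels, so $C_{num} = P \times \mathrm{TWCoef}$. The catch numbers are therefore distributed across groups in the same proportions as the sample.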