02d Update SA with taxon from SL
Source:vignettes/v02d-update-SA-with-taxon-from-SL.Rmd
Introduction
The aim of this document is to show how to update the species code in the Sample (SA) table based on the Species List (SL) table using the function updateSAwithTaxonFromSL() in the RDBEScore package.
Prerequisites
First, load some example data. It is good practice to check that the resulting RDBESDataObject is valid.
# load the package and an example dataset
library(RDBEScore)
# H1 directory
dirH1 <- "../tests/testthat/h1_v_20250211/"
myObject <- createRDBESDataObject(input = dirH1)
myObject <- filterAndTidyRDBESDataObject(myObject,
                                         fieldsToFilter = c("DEsampScheme"),
                                         valuesToFilter = c("National Routine"))
numOfRows <- 5
myObject[["SA"]] <- head(myObject[["SA"]],numOfRows)
myObject <- findAndKillOrphans(myObject)
# Change SA species codes to Sprat(126425)
numOfRows <- nrow(myObject[["SA"]])
myObject[["SA"]]$SAspeCode[1:numOfRows] <- rep("126425", numOfRows)
# Just keep the first row of the SL data
myObject[["SL"]] <- head(myObject[["SL"]][myObject[["SL"]]$SLspeclistName == "ZW_1965_SpeciesList1" &
                                           myObject[["SL"]]$SLcatchFrac == "Lan", ], 1)
# Just keep the first row of the IS data that matches the remaining SLid
mySLid <- myObject[["SL"]]$SLid
myObject[["IS"]] <- head(myObject[["IS"]][myObject[["IS"]]$SLid == mySLid, ], 1)
# Change IS species codes to Clupeidae(125464)
myObject[["IS"]][,"ISsppCode"] <- as.integer(125464)
myObject[["IS"]][,"IScommTaxon"] <- as.integer(125464)
# check it is a valid RDBESobject
validateRDBESDataObject(myObject, checkDataTypes = TRUE)
A closer look into the example data and its characteristics
The example uses data from hierarchy 1. It contains a single trip with a single haul. For simplicity, we restrict our analysis to the tables SL, IS, SS and SA, which are the ones handled by the function whose behaviour we want to demonstrate.
Examining a print of the Species List (SL) table, restricted to the part where SLspeclistName is ZW_1965_SpeciesList1 and SLcatchFrac is Lan, we can conclude that the sampling targeted only landings of Clupeidae (AphiaID 125464).
myObject$SL[SLspeclistName=='ZW_1965_SpeciesList1'&SLcatchFrac=='Lan',]
#> Key: <SLid>
#> SLid SLrecType SLcou SLinst SLspeclistName SLyear SLcatchFrac
#> <int> <char> <char> <char> <char> <int> <char>
#> 1: 47870 SL ZW 1051 ZW_1965_SpeciesList1 1965 Lan
Here is the single IS row that is part of this species list.
myObject[["IS"]]
#> Key: <ISid>
#> ISid SLid ISrecType IScommTaxon ISsppCode
#> <int> <int> <char> <int> <int>
#> 1: 1001 47870 IS 125464 125464
The Species Selection (SS) table, where the species list is ZW_1965_SpeciesList1 and the catch fraction is Lan:
myObject$SS[SSspecListName=='ZW_1965_SpeciesList1' & SScatchFra=='Lan',c(1,3,6,8,15,18)]
#> Key: <SSid>
#> SSid FOid SLid SSrecType SScatchFra SSspecListName
#> <int> <int> <int> <char> <char> <char>
#> 1: 225062 68983 47870 SS Lan ZW_1965_SpeciesList1
#> 2: 225063 68984 47870 SS Lan ZW_1965_SpeciesList1
#> 3: 225064 68985 47870 SS Lan ZW_1965_SpeciesList1
#> 4: 225065 68986 47870 SS Lan ZW_1965_SpeciesList1
#> 5: 225066 68987 47870 SS Lan ZW_1965_SpeciesList1
#> ---
#> 725: 225786 69707 47870 SS Lan ZW_1965_SpeciesList1
#> 726: 225787 69708 47870 SS Lan ZW_1965_SpeciesList1
#> 727: 225788 69709 47870 SS Lan ZW_1965_SpeciesList1
#> 728: 225789 69710 47870 SS Lan ZW_1965_SpeciesList1
#> 729: 225790 69711 47870 SS Lan ZW_1965_SpeciesList1
The part of the SA table that corresponds to the SS table is:
myObject$SA[,c(1,2,9,14,48,49)]
#> Key: <SAid>
#> SAid SSid SAspeCode SAcatchCat SAnoSampReason SAnonRespCol
#> <num> <int> <char> <char> <char> <char>
#> 1: 548860 225062 126425 Lan
#> 2: 548861 225063 126425 Lan
#> 3: 548862 225064 126425 Lan
#> 4: 548863 225065 126425 Lan
#> 5: 548864 225066 126425 Lan
The species list asked for 125464 (Clupeidae). In the sample we recorded 126425 (Sprattus sprattus, sprat), which is part of the Clupeidae family. This is where the updateSAwithTaxonFromSL() function is helpful.
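To see the mismatch between what was recorded and what was listed, the SA rows can be linked to the species list through the SSid and SLid keys. The following is a minimal base R sketch, assuming small hand-built data frames that copy only the key and code columns from the printed output above; the names saToy, ssToy and isToy are hypothetical and are not part of the package.

```r
# Toy copies of the relevant columns, typed in from the printed output above
saToy <- data.frame(SAid = 548860:548864,
                    SSid = 225062:225066,
                    SAspeCode = "126425",
                    stringsAsFactors = FALSE)
ssToy <- data.frame(SSid = 225062:225066, SLid = 47870L)
isToy <- data.frame(SLid = 47870L, ISsppCode = 125464L)

# Link SA -> SS (by SSid) -> IS (by SLid)
saWithTarget <- merge(merge(saToy, ssToy, by = "SSid"), isToy, by = "SLid")

# Flag the rows where the recorded code equals the listed code
saWithTarget$matchesList <-
  saWithTarget$SAspeCode == as.character(saWithTarget$ISsppCode)

saWithTarget[, c("SAid", "SAspeCode", "ISsppCode", "matchesList")]
```

None of the five sample rows matches the listed code directly, which is why the codes need to be reconciled.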
Rename the species code in the SA table using the SL table
We looked for fish from the Clupeidae family, but the sample was identified more precisely than expected: the fish were recorded at species level, which is a more precise rank than family.
myObjectnew<-updateSAwithTaxonFromSL(RDBESDataObject=myObject,
validate=TRUE,
verbose=TRUE,
strict=TRUE)
#> [1] "Note that TE is NULL but this is allowed in an RDBESDataObject"
#> [2] "Note that LO is NULL but this is allowed in an RDBESDataObject"
#> [3] "Note that OS is NULL but this is allowed in an RDBESDataObject"
#> [4] "Note that LE is NULL but this is allowed in an RDBESDataObject"
#> [1] "Number of rows changed: 5"
The function checks the taxonomic rank of the AphiaID in both the SA and SL tables. If the AphiaID in the SA table has a more precise rank than the one in the SL table and both belong to the same family, the AphiaID in the SA table is replaced by the AphiaID from the SL table. If the AphiaIDs belong to different families, the function leaves the original value unchanged. In the opposite situation, where the SL table holds a more precise rank than the SA table, the function also leaves the original value unchanged.
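The rule described above can be illustrated with a small stand-alone sketch. It is not the package's implementation: the lookup table toyTaxa, the rank ordering rankDepth, the helper chooseSaCode() and the AphiaID "900001" are all hypothetical and exist only for this illustration.

```r
# Hypothetical toy taxonomy; "900001" is a made-up code from another family
toyTaxa <- data.frame(
  aphiaId = c("125464", "126425", "900001"),
  rank    = c("Family", "Species", "Species"),
  family  = c("Clupeidae", "Clupeidae", "OtherFamily"),
  stringsAsFactors = FALSE
)

# Larger number = more precise rank (assumed ordering for the sketch)
rankDepth <- c(Family = 1, Genus = 2, Species = 3)

chooseSaCode <- function(saCode, slCode, taxa = toyTaxa) {
  sa <- taxa[taxa$aphiaId == saCode, ]
  sl <- taxa[taxa$aphiaId == slCode, ]
  # Codes missing from the toy lookup are left untouched in this sketch
  if (nrow(sa) != 1 || nrow(sl) != 1) return(saCode)
  sameFamily <- sa$family == sl$family
  saMorePrecise <- rankDepth[[sa$rank]] > rankDepth[[sl$rank]]
  # Replace only when SA is more precise and in the same family as SL
  if (sameFamily && saMorePrecise) slCode else saCode
}

chooseSaCode("126425", "125464")  # same family, SA more precise: replaced
chooseSaCode("900001", "125464")  # different family: kept
chooseSaCode("125464", "126425")  # SL more precise than SA: kept
```

Only the first call returns the SL code; the other two return the SA code they were given.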
In the SA results we can see that 126425 (Sprattus sprattus) has been changed to 125464 (Clupeidae).
myObjectnew$SA[,c(1,2,9,14,48,49)]
#> Key: <SAid>
#> SAid SSid SAspeCode SAcatchCat SAnoSampReason SAnonRespCol
#> <num> <int> <char> <char> <char> <char>
#> 1: 548860 225062 125464 Lan
#> 2: 548861 225063 125464 Lan
#> 3: 548862 225064 125464 Lan
#> 4: 548863 225065 125464 Lan
#> 5: 548864 225066 125464 Lan
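One way to confirm which rows were affected is to compare the SA table before and after the update by SAid. The sketch below uses hand-built data frames (the hypothetical saBefore and saAfter) that copy the SAid and SAspeCode columns from the two printouts in this document; with the real objects the same comparison could be made on myObject$SA and myObjectnew$SA.

```r
# Toy copies of the SA codes before and after the update
saBefore <- data.frame(SAid = 548860:548864,
                       SAspeCode = rep("126425", 5),
                       stringsAsFactors = FALSE)
saAfter <- data.frame(SAid = 548860:548864,
                      SAspeCode = rep("125464", 5),
                      stringsAsFactors = FALSE)

# Put both versions side by side, matched on SAid
cmp <- merge(saBefore, saAfter, by = "SAid", suffixes = c("Before", "After"))
cmp$changed <- cmp$SAspeCodeBefore != cmp$SAspeCodeAfter

# Number of rows whose species code differs between the two versions
sum(cmp$changed)
```

For this example the count equals the "Number of rows changed: 5" message reported by the function.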
See also other package vignettes:
- v01a data import
- v01b manipulating rdbesdataobjects
- v02a generating probabilities
- v02b Generating zeros for species not observed
- v02c Generating NAs for species not targeted by sampling
- v02d update SA with taxon from SL
- v03a Estimating population parameters unbiased estimator
- v03b Estimating population parameters ratio estimator
- v04 Create IC format from estimation results