Introduction

This document shows how to update the species codes in the Sample (SA) table based on the Species List (SL) table, using the function updateSAwithTaxonFromSL() in the RDBEScore package.

Load the package

library(RDBEScore)

Prerequisites

First, load some example data. It is good practice to check that the resulting RDBESDataObject is valid.

# load an example dataset

# H1 directory
dirH1 <- "../tests/testthat/h1_v_20250211/"

myObject <- createRDBESDataObject(input = dirH1)
myObject <- filterAndTidyRDBESDataObject(myObject,
                        fieldsToFilter =c("DEsampScheme"),
                        valuesToFilter = c("National Routine"))

numOfRows <- 5
myObject[["SA"]] <- head(myObject[["SA"]],numOfRows)
myObject <- findAndKillOrphans(myObject)

# Change SA species codes to sprat (126425)
numOfRows <- nrow(myObject[["SA"]])
myObject[["SA"]]$SAspeCode[1:numOfRows] <- rep("126425", numOfRows)

# Just keep the first row of the SL data
myObject[["SL"]] <- head( myObject[["SL"]][myObject[["SL"]]$SLspeclistName=="ZW_1965_SpeciesList1" &
                 myObject[["SL"]]$SLcatchFrac=="Lan",],1)

# Just keep the first row of the IS data that matches the remaining SLid
mySLid <- myObject[["SL"]]$SLid
myObject[["IS"]] <- head(myObject[["IS"]][myObject[["IS"]]$SLid == mySLid,],1)

# Change IS species codes to Clupeidae (125464)
myObject[["IS"]][,"ISsppCode"] <- as.integer(125464)
myObject[["IS"]][,"IScommTaxon"] <- as.integer(125464)

# check it is a valid RDBESobject
validateRDBESDataObject(myObject, checkDataTypes = TRUE)

A closer look into the example data and its characteristics

The example uses data from hierarchy 1. It contains a single trip with a single haul. For simplicity, we restrict our analysis to the tables SL, IS, SS and SA, which are the ones handled by the function whose behaviour we want to demonstrate.

Examining a print of the Species List (SL) table for the rows where SLspeclistName is ZW_1965_SpeciesList1, we can conclude that the sampling targeted only landings of Clupeidae (AphiaID 125464).

myObject$SL[SLspeclistName=='ZW_1965_SpeciesList1'&SLcatchFrac=='Lan',]
#> Key: <SLid>
#>     SLid SLrecType  SLcou SLinst       SLspeclistName SLyear SLcatchFrac
#>    <int>    <char> <char> <char>               <char>  <int>      <char>
#> 1: 47870        SL     ZW   1051 ZW_1965_SpeciesList1   1965         Lan

Here is the single IS row that is part of this species list.

myObject[["IS"]]
#> Key: <ISid>
#>     ISid  SLid ISrecType IScommTaxon ISsppCode
#>    <int> <int>    <char>       <int>     <int>
#> 1:  1001 47870        IS      125464    125464

Here are the rows of the Species Selection (SS) table where the species list is ZW_1965_SpeciesList1 and the catch fraction is Lan.

myObject$SS[SSspecListName=='ZW_1965_SpeciesList1' & SScatchFra=='Lan',c(1,3,6,8,15,18)]
#> Key: <SSid>
#>        SSid  FOid  SLid SSrecType SScatchFra       SSspecListName
#>       <int> <int> <int>    <char>     <char>               <char>
#>   1: 225062 68983 47870        SS        Lan ZW_1965_SpeciesList1
#>   2: 225063 68984 47870        SS        Lan ZW_1965_SpeciesList1
#>   3: 225064 68985 47870        SS        Lan ZW_1965_SpeciesList1
#>   4: 225065 68986 47870        SS        Lan ZW_1965_SpeciesList1
#>   5: 225066 68987 47870        SS        Lan ZW_1965_SpeciesList1
#>  ---                                                             
#> 725: 225786 69707 47870        SS        Lan ZW_1965_SpeciesList1
#> 726: 225787 69708 47870        SS        Lan ZW_1965_SpeciesList1
#> 727: 225788 69709 47870        SS        Lan ZW_1965_SpeciesList1
#> 728: 225789 69710 47870        SS        Lan ZW_1965_SpeciesList1
#> 729: 225790 69711 47870        SS        Lan ZW_1965_SpeciesList1

The part of the SA table that corresponds to these SS rows is shown here.

myObject$SA[,c(1,2,9,14,48,49)]
#> Key: <SAid>
#>      SAid   SSid SAspeCode SAcatchCat SAnoSampReason SAnonRespCol
#>     <num>  <int>    <char>     <char>         <char>       <char>
#> 1: 548860 225062    126425        Lan                            
#> 2: 548861 225063    126425        Lan                            
#> 3: 548862 225064    126425        Lan                            
#> 4: 548863 225065    126425        Lan                            
#> 5: 548864 225066    126425        Lan
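
The tables printed above are linked through their id columns: each SA row points to an SS row through SSid, and each SS row points to the species list through SLid, which is also carried by the IS row. The following is a minimal sketch of that chain in base R. It uses toy data frames that copy only the key columns from the printouts above; it does not use the RDBESDataObject itself.

```r
# Toy copies of the rows printed above (only the key columns are kept).
SA <- data.frame(SAid = 548860:548864,
                 SSid = 225062:225066,
                 SAspeCode = "126425",
                 stringsAsFactors = FALSE)
SS <- data.frame(SSid = 225062:225066,
                 SLid = 47870L)
IS <- data.frame(ISid = 1001L,
                 SLid = 47870L,
                 ISsppCode = 125464L)

# Follow the links: SA -> SS by SSid, then SS -> IS by SLid.
linked <- merge(merge(SA, SS, by = "SSid"), IS, by = "SLid")

# Recorded code (SAspeCode) next to the listed code (ISsppCode).
linked[, c("SAid", "SAspeCode", "ISsppCode")]
```

Every sampled row ends up next to the code from the species list, which makes the mismatch between 126425 and 125464 easy to see.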

The species list asked for 125464 - Clupeidae.

In the sample we recorded 126425 - Sprattus sprattus (sprat), which belongs to the Clupeidae family. This is where the updateSAwithTaxonFromSL() function is helpful.

Rename species code in SA table using SL table

We looked for fish from the Clupeidae family, but the sample was identified more precisely than expected: the fish were recorded at species level, which is a finer rank than family.

myObjectnew<-updateSAwithTaxonFromSL(RDBESDataObject=myObject,
                            validate=TRUE,
                            verbose=TRUE,
                            strict=TRUE)
#> [1] "Note that TE is NULL but this is allowed in an RDBESDataObject"
#> [2] "Note that LO is NULL but this is allowed in an RDBESDataObject"
#> [3] "Note that OS is NULL but this is allowed in an RDBESDataObject"
#> [4] "Note that LE is NULL but this is allowed in an RDBESDataObject"
#> [1] "Number of rows changed: 5"

The function checks the taxonomic rank of the AphiaID in both the SA and SL tables. If the AphiaID in the SA table has a more precise rank than the one in the SL table and both belong to the same family, the AphiaID in the SA table is replaced by the AphiaID from the SL table. If the AphiaIDs belong to different families, the function keeps the original value. In the opposite situation, where the SL table holds the more precise rank, the function also keeps the original value.
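
The decision rule described above can be sketched in a few lines of base R. This is only an illustration of the rule as stated in the text, not the implementation used by updateSAwithTaxonFromSL(). The lookup table taxa, the vector rankOrder and the helper chooseSAcode() are hypothetical names introduced for this example, and the cod row is added only to show the different-family case.

```r
# Hypothetical mini taxonomy for illustration; the package relies on its own
# taxonomic reference data, which is not reproduced here.
taxa <- data.frame(
  aphiaId = c("125464", "126425", "126436"),
  name    = c("Clupeidae", "Sprattus sprattus", "Gadus morhua"),
  rank    = c("Family", "Species", "Species"),
  family  = c("Clupeidae", "Clupeidae", "Gadidae"),
  stringsAsFactors = FALSE
)

# Ranks from coarse to fine; a larger index means a more precise rank.
rankOrder <- c("Family", "Genus", "Species")

# Return the code that SA would hold after the update (sketch only).
chooseSAcode <- function(saCode, slCode, taxa) {
  sa <- taxa[taxa$aphiaId == saCode, ]
  sl <- taxa[taxa$aphiaId == slCode, ]
  saIsFiner  <- match(sa$rank, rankOrder) > match(sl$rank, rankOrder)
  sameFamily <- sa$family == sl$family
  if (saIsFiner && sameFamily) slCode else saCode
}

chooseSAcode("126425", "125464", taxa)  # SA finer, same family: SL code is used
chooseSAcode("126436", "125464", taxa)  # different families: SA code is kept
chooseSAcode("125464", "126425", taxa)  # SL finer than SA: SA code is kept
```

The three calls cover the three situations in the paragraph above: replacement, a different family, and an SL entry that is more precise than the SA entry.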

In the SA results we can see that 126425 - Sprattus sprattus (sprat) has been changed to 125464 - Clupeidae.

myObjectnew$SA[,c(1,2,9,14,48,49)]
#> Key: <SAid>
#>      SAid   SSid SAspeCode SAcatchCat SAnoSampReason SAnonRespCol
#>     <num>  <int>    <char>     <char>         <char>       <char>
#> 1: 548860 225062    125464        Lan                            
#> 2: 548861 225063    125464        Lan                            
#> 3: 548862 225064    125464        Lan                            
#> 4: 548863 225065    125464        Lan                            
#> 5: 548864 225066    125464        Lan
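
One way to cross-check the reported number of changed rows is to compare the species codes before and after the update, matched by SAid. The sketch below does this with toy data frames that copy the SAid and SAspeCode columns from the two SA printouts; the names before, after, cmp and nChanged are chosen for this example only.

```r
# SAspeCode values copied from the two printouts of the SA table.
before <- data.frame(SAid = 548860:548864,
                     SAspeCode = rep("126425", 5),
                     stringsAsFactors = FALSE)
after  <- data.frame(SAid = 548860:548864,
                     SAspeCode = rep("125464", 5),
                     stringsAsFactors = FALSE)

# Match rows by SAid and count the codes that differ.
cmp <- merge(before, after, by = "SAid", suffixes = c("_old", "_new"))
nChanged <- sum(cmp$SAspeCode_old != cmp$SAspeCode_new)
nChanged
```

With the real objects, the same comparison could be made on the SA tables of myObject and myObjectnew.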

See also other package vignettes:
