Introduction

The aim of this document is to illustrate some of the ways of manipulating RDBESDataObjects.

Prerequisites

First we’ll load some example data from the RDBES and check that it’s valid. It’s a good idea to check that your RDBESDataObjects are valid after any manipulation you perform. To import your own data, see the vignette Import RDBES data. This vignette uses the package example data.

# load Hierarchy 1 demo data
myH1RawObject <- H1Example

validateRDBESDataObject(myH1RawObject, verbose = FALSE)

The print method lists the non-null tables in the RDBESDataObject. For each table, the output shows:

  • Table name (DE, TE, LE, etc.)
  • Number of rows
  • Sampling method (if applicable: SRSWR, NPJS, CENSUS, etc.)
  • Range of number sampled (if applicable)
  • Range of number total (if applicable)

If a single hierarchy is present, the output is ordered according to the RDBES hierarchy structure.

print(myH1RawObject)
#> Hierarchy 1 RDBESdataObject:
#>  DE: 8
#>  SD: 8
#>  VS: 1214 (SRSWOR,CENSUS,SRSWR: 2-135/4-1382)
#>  FT: 1430 (CENSUS,SRSWR: 1-3/1-100)
#>  FO: 1916 (CENSUS,SRSWR: 1-3/1-20)
#>  SS: 1916 (CENSUS,SRSWR: 1/1-4)
#>  SA: 1916 (CENSUS,SRSWR: 1/1-2)
#>  FM: 7290
#>  BV: 14580 (SRSWR: 2/2)
#>  VD: 311
#>  SL: 170

Underlying the print function is a summary function that keeps only the unique rows for the columns used in print.

h1summary <- summary(myH1RawObject)
#get the hierarchy
h1summary$hierarchy
#> [1] 1
#extract the number of rows in tables from the summary
sapply(h1summary$data, function(x){x$rows})
#>    DE    SD    VS    FT    FO    SS    SA    FM    BV    VD    SL 
#>     8     8  1214  1430  1916  1916  1916  7290 14580   311   170

To get the number of rows in each non-null table you can simply call the object:

myH1RawObject #equivalent to print(summary(myH1RawObject))
#> Hierarchy 1 RDBESdataObject:
#>  DE: 8
#>  SD: 8
#>  VS: 1214 (SRSWOR,CENSUS,SRSWR: 2-135/4-1382)
#>  FT: 1430 (CENSUS,SRSWR: 1-3/1-100)
#>  FO: 1916 (CENSUS,SRSWR: 1-3/1-20)
#>  SS: 1916 (CENSUS,SRSWR: 1/1-4)
#>  SA: 1916 (CENSUS,SRSWR: 1/1-2)
#>  FM: 7290
#>  BV: 14580 (SRSWR: 2/2)
#>  VD: 311
#>  SL: 170

To get example Hierarchy 5 data:

# Hierarchy 5 demo data
myH5RawObject <- H5Example

validateRDBESDataObject(myH5RawObject, verbose = FALSE)

# Number of rows in each non-null table and hierarchy
print(myH5RawObject)
#> Hierarchy 5 RDBESdataObject:
#>  DE: 3
#>  SD: 3
#>  OS: 27 (SRSWR: 3/100)
#>  LE: 27 (SRSWR: 1/2)
#>  FT: 27 (SRSWR: 1/1)
#>  SS: 27 (SRSWR: 1/4)
#>  SA: 27 (SRSWR: 1/2)
#>  FM: 270
#>  BV: 540 (SRSWR: 2/2)
#>  VD: 311
#>  SL: 170

Sorting RDBESDataObject

If the DE table contains a single hierarchy, a correct sort order is defined for the object and you can apply it with the sort() method. Printing the object sorts it automatically where possible.

#before sorting
names(H8ExampleEE1)
#>  [1] "DE" "SD" "VS" "FT" "FO" "TE" "LO" "OS" "LE" "SS" "SA" "FM" "BV" "VD" "SL"
#> [16] "CL" "CE"
#after sorting
names(sort(H8ExampleEE1))
#>  [1] "DE" "SD" "TE" "VS" "FT" "LE" "SS" "SA" "FM" "BV" "FO" "LO" "OS" "VD" "SL"
#> [16] "CL" "CE"
#printing the summary
H8ExampleEE1
#> Hierarchy 8 RDBESdataObject:
#>  DE: 1
#>  SD: 1
#>  TE: 11 (SRSWOR: 2-3/4)
#>  VS: 14 (SRSWR: 1-2/7-12)
#>  LE: 15 (SRSWOR: 1-2/1-2)
#>  SS: 15 (CENSUS: 2/2)
#>  SA: 15 (SRSWOR: 1/794-14268)
#>  BV: 3995 (CENSUS: 16-100/16-100)
#>  VD: 7
#>  SL: 2
#>  CL: 71
#>  CE: 132
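
The reordering shown above can be pictured with a small base R sketch. This is only a toy illustration, not the package's implementation: the object is replaced by a named list of empty placeholders, sortToyObject() is a hypothetical helper, and the table order is simply copied from the sorted Hierarchy 8 output above.

```r
# Toy stand-in for an RDBESDataObject: a named list whose names are the
# unsorted table names printed above (the table contents are irrelevant here)
tableNames <- c("DE", "SD", "VS", "FT", "FO", "TE", "LO", "OS", "LE",
                "SS", "SA", "FM", "BV", "VD", "SL", "CL", "CE")
toyObject <- setNames(vector("list", length(tableNames)), tableNames)

# Table order copied from the sorted Hierarchy 8 output in this vignette
h8Order <- c("DE", "SD", "TE", "VS", "FT", "LE", "SS", "SA", "FM",
             "BV", "FO", "LO", "OS", "VD", "SL", "CL", "CE")

# Hypothetical helper: reorder the list by each name's position in tableOrder;
# names missing from tableOrder are placed last
sortToyObject <- function(obj, tableOrder) {
  position <- match(names(obj), tableOrder)
  obj[order(position, na.last = TRUE)]
}

sortedToyObject <- sortToyObject(toyObject, h8Order)
names(sortedToyObject)
```

Because the order depends on the hierarchy, a lookup of this kind only makes sense when the object holds a single hierarchy.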

Combining RDBESDataObjects

RDBESDataObjects can be combined using the combineRDBESDataObjects() function. This might be needed when different sampling schemes are used to collect on-shore and at-sea samples: the data often has to be combined before further analysis.

myCombinedRawObject <- combineRDBESDataObjects(myH1RawObject,
                                               myH5RawObject)

# Number of rows in each non-null table and hierarchies
print(myCombinedRawObject)
#> Warning: No sort order for multiple hierarchies can be defined!
#> Warning: Mixed hierarchy RDBESDataObject!
#> Hierarchy 1 RDBESdataObject:
#>   Hierarchy 5 RDBESdataObject:
#>  DE: 11
#>  SD: 11
#>  VS: 1214 (SRSWOR,CENSUS,SRSWR: 2-135/4-1382)
#>  FT: 1457 (CENSUS,SRSWR: 1-3/1-100)
#>  FO: 1916 (CENSUS,SRSWR: 1-3/1-20)
#>  OS: 27 (SRSWR: 3/100)
#>  LE: 27 (SRSWR: 1/2)
#>  SS: 1943 (CENSUS,SRSWR: 1/1-4)
#>  SA: 1943 (CENSUS,SRSWR: 1/1-2)
#>  FM: 7560
#>  BV: 15120 (SRSWR: 2/2)
#>  VD: 311
#>  SL: 170

validateRDBESDataObject(myCombinedRawObject, verbose = FALSE)
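
The row counts above suggest what combining does: tables present in both objects are stacked (DE: 8 + 3 = 11), while rows shared by both objects appear only once (VD stays at 311). The base R sketch below mimics that behaviour on toy data. It is an assumption-laden illustration rather than the package's code: combineToyObjects() is a hypothetical helper and the toy tables hold only a few made-up columns and values.

```r
# Toy objects: named lists of data frames (column values are made up)
toyH1 <- list(
  DE = data.frame(DEid = c(1, 2), DEhierarchy = 1),
  VS = data.frame(VSid = c(1, 2, 3), SDid = c(1, 1, 2)),
  VD = data.frame(VDid = c(1, 2), VDlenCat = c("18-<24", "24-<40"))
)
toyH5 <- list(
  DE = data.frame(DEid = 3, DEhierarchy = 5),
  OS = data.frame(OSid = c(1, 2), SDid = 3),
  VD = data.frame(VDid = c(1, 2), VDlenCat = c("18-<24", "24-<40"))
)

# Hypothetical helper: stack same-named tables and drop duplicated rows
combineToyObjects <- function(a, b) {
  allNames <- union(names(a), names(b))
  combined <- lapply(allNames, function(tbl) {
    # a[[tbl]] or b[[tbl]] is NULL when a table is missing; rbind ignores NULL
    unique(rbind(a[[tbl]], b[[tbl]]))
  })
  setNames(combined, allNames)
}

combinedToy <- combineToyObjects(toyH1, toyH5)
sapply(combinedToy, nrow)
```

In this toy case DE grows to three rows, VS and OS are carried over unchanged, and the identical VD rows are kept once.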

Filtering RDBESDataObjects

RDBESDataObjects can be filtered by any field using the filterRDBESDataObject() function. A typical use of filtering is to extract all data collected in a particular ICES division.

myH1RawObject <- H1Example

# Number of rows in each non-null table
print(myH1RawObject)
#> Hierarchy 1 RDBESdataObject:
#>  DE: 8
#>  SD: 8
#>  VS: 1214 (SRSWOR,CENSUS,SRSWR: 2-135/4-1382)
#>  FT: 1430 (CENSUS,SRSWR: 1-3/1-100)
#>  FO: 1916 (CENSUS,SRSWR: 1-3/1-20)
#>  SS: 1916 (CENSUS,SRSWR: 1/1-4)
#>  SA: 1916 (CENSUS,SRSWR: 1/1-2)
#>  FM: 7290
#>  BV: 14580 (SRSWR: 2/2)
#>  VD: 311
#>  SL: 170

myFields <- c("SDctry","VDctry","VDflgCtry","FTarvLoc")
myValues <- c("ZW","ZWBZH","ZWVFA" )

myFilteredObject <- filterRDBESDataObject(myH1RawObject,
                                         fieldsToFilter = myFields,
                                         valuesToFilter = myValues )

# Number of rows in each non-null table
sapply(summary(myFilteredObject)$data, function(x){x$rows})

validateRDBESDataObject(myFilteredObject, verbose = FALSE)
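
To make the idea of field-based filtering concrete, here is a minimal base R sketch on toy tables. It is not the package's implementation: filterToyObject() is a hypothetical helper, and it assumes that a table is filtered only if it contains one of the named fields and that every named field found in a table must match one of the values. The real function may combine multiple fields differently.

```r
# Toy object: two linked tables (values are made up)
toyObject <- list(
  SD = data.frame(SDid = c(1, 2, 3), SDctry = c("ZW", "ZW", "XX")),
  VS = data.frame(VSid = c(1, 2, 3, 4), SDid = c(1, 2, 3, 3))
)

# Hypothetical helper: filter each table that contains a named field
filterToyObject <- function(obj, fieldsToFilter, valuesToFilter) {
  lapply(obj, function(tbl) {
    present <- intersect(names(tbl), fieldsToFilter)
    # Tables without any of the fields are returned unchanged
    if (length(present) == 0) return(tbl)
    keep <- rep(TRUE, nrow(tbl))
    # Assumption: all matching fields must hold one of the allowed values
    for (field in present) {
      keep <- keep & tbl[[field]] %in% valuesToFilter
    }
    tbl[keep, , drop = FALSE]
  })
}

toyFiltered <- filterToyObject(toyObject,
                               fieldsToFilter = "SDctry",
                               valuesToFilter = "ZW")
sapply(toyFiltered, nrow)
```

Note that the toy VS table is untouched: two of its rows still point to the SD row that was filtered out.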

Note that filtering is likely to produce “orphan” rows, that is, rows whose parent record has been filtered out. It is therefore usual to apply the findAndKillOrphans() function to the filtered data to remove these records.


myFilteredObjectNoOrphans <- 
  findAndKillOrphans(objectToCheck = myFilteredObject, verbose = FALSE)

validateRDBESDataObject(myFilteredObjectNoOrphans, verbose = FALSE)
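
What counts as an orphan can be shown with a small base R sketch: a child row is an orphan when its foreign key no longer matches any row in the parent table. The helper removeToyOrphans() and the toy tables are hypothetical and only illustrate the idea; they are not how findAndKillOrphans() is implemented.

```r
# Toy tables after a filter has removed SD row 3 (values are made up)
toySD <- data.frame(SDid = c(1, 2))
toyVS <- data.frame(VSid = c(1, 2, 3, 4), SDid = c(1, 2, 3, 3))
toyFT <- data.frame(FTid = c(1, 2, 3, 4, 5), VSid = c(1, 2, 3, 4, 4))

# Hypothetical helper: keep only child rows whose key exists in the parent
removeToyOrphans <- function(parent, child, key) {
  child[child[[key]] %in% parent[[key]], , drop = FALSE]
}

# Work from the top of the hierarchy downwards so that removing a parent
# also removes the rows that depended on it
toyVS <- removeToyOrphans(toySD, toyVS, "SDid")
toyFT <- removeToyOrphans(toyVS, toyFT, "VSid")

nrow(toyVS)
nrow(toyFT)
```

The order of the two calls matters: FT rows can only be judged once the VS table itself has been cleaned.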

You can also remove any records that do not link to a row in the VesselDetails (VD) table using the removeBrokenVesselLinks() function.


myFields <- c("VDlenCat")
myValues <- c("18-<24" )
myFilteredObject <- filterRDBESDataObject(myFilteredObjectNoOrphans,
                                         fieldsToFilter = myFields,
                                         valuesToFilter = myValues )

myFilteredObjectValidVesselLinks <- removeBrokenVesselLinks(
                                  objectToCheck = myFilteredObject,
                                  verbose = FALSE)

validateRDBESDataObject(myFilteredObjectValidVesselLinks, verbose = FALSE)

Finally, you can remove any records that do not link to an entry in the SpeciesListDetails (SL) table using the removeBrokenSpeciesListLinks() function.


myFields <- c("SLspeclistName")
myValues <- c("ZW_1965_SpeciesList" )
myFilteredObjectValidSpeciesLinks <- filterRDBESDataObject(myFilteredObjectValidVesselLinks,
                                         fieldsToFilter = myFields,
                                         valuesToFilter = myValues )

myFilteredObjectValidSpeciesLinks <- removeBrokenSpeciesListLinks(
                                  objectToCheck = myFilteredObjectValidSpeciesLinks,
                                  verbose = FALSE)

validateRDBESDataObject(myFilteredObjectValidSpeciesLinks, verbose = FALSE)

See also other package vignettes: