Get a new set of occurrence records from GBIF (the Global Biodiversity Information Facility) and save it as .parquet.

get_gbif(
  aoi = NULL,
  save_dir = NULL,
  get_new = FALSE,
  data_map = NULL,
  predicates = NULL,
  request_wait = 20,
  name = "gbif",
  filter_inconsistent = TRUE,
  filter_NA_date = TRUE,
  occ_char = TRUE,
  adj_spa_rel = TRUE,
  previous_key = NULL,
  ...
)

Arguments

aoi

sf defining area of interest.

save_dir

Character. Directory in which to save outputs. If NULL, results are saved to fs::path("out", "ds", "gbif") as gbif.parquet.

get_new

Logical. If FALSE, an attempt is made to load data from previously saved results.

data_map

Dataframe or NULL. Mapping of fields to retrieve. See envImport::data_map for an example.

predicates

List. Any number of GBIF predicates, e.g. as built by rgbif::pred().

request_wait

Integer. Time in seconds to wait between rgbif::occ_download_meta() requests. Passed to the status_ping argument of rgbif::occ_download_wait().

name

Character or NULL. The data_name value in envImport::data_map (or other data_map). Required if data_map is not NULL.

filter_inconsistent

Logical. If TRUE, records in which the occurrenceStatus column is inconsistent with either organismQuantity or individualCount are removed. For example, a record with occurrenceStatus == "ABSENT" but individualCount == 1 would be filtered.

filter_NA_date

Logical. If TRUE, records where is.na(eventDate) are removed.

occ_char

Logical. If TRUE, occ_derivation will be coerced to character (to match other data sources).

adj_spa_rel

Logical. If TRUE, an attempt will be made to check coordinateUncertaintyInMeters against two other columns. If informationWithheld contains "Coordinate uncertainty increased to", readr::parse_number() is used to retrieve that number, which then replaces any value in coordinateUncertaintyInMeters. If the issue column contains COORDINATE_UNCERTAINTY_METERS_INVALID, coordinateUncertaintyInMeters is limited to values of 10000 or greater.

previous_key

Character. A GBIF download key, e.g. 0092123-240506114902167. If provided, an attempt will be made to load (or download) a previous query of occurrence data.

...

Passed to envImport::file_prep()
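
The row-level clean-up described for filter_inconsistent, filter_NA_date and adj_spa_rel above can be sketched in base R as follows. This is an illustration only, not the code used inside get_gbif(): clean_gbif_sketch() is a hypothetical helper, the toy records are made up, a regular expression stands in for readr::parse_number(), treating "PRESENT" with a count of zero as inconsistent is an assumption, and "limited to 10000 or greater" is read here as raising smaller or missing values to 10000.

```r
# Hypothetical helper (not part of envImport): base R sketch of the filters
clean_gbif_sketch <- function(df,
                              filter_inconsistent = TRUE,
                              filter_NA_date = TRUE,
                              adj_spa_rel = TRUE) {

  if (filter_inconsistent) {
    # Largest count reported in either quantity column (NA if both are NA)
    n <- pmax(df$individualCount, df$organismQuantity, na.rm = TRUE)
    # "ABSENT" with a positive count; "PRESENT" with zero is an assumption
    bad <- !is.na(n) & (
      (df$occurrenceStatus %in% "ABSENT" & n > 0) |
        (df$occurrenceStatus %in% "PRESENT" & n == 0)
    )
    df <- df[!bad, , drop = FALSE]
  }

  if (filter_NA_date) {
    df <- df[!is.na(df$eventDate), , drop = FALSE]
  }

  if (adj_spa_rel) {
    marker <- "Coordinate uncertainty increased to"
    has_marker <- grepl(marker, df$informationWithheld, fixed = TRUE)
    # Keep the text after the marker, then pull out the first number in it
    txt <- sub(paste0("^.*", marker), "", df$informationWithheld[has_marker])
    digits <- sub("^[^0-9]*([0-9][0-9,]*(\\.[0-9]+)?).*$", "\\1", txt)
    stated <- rep(NA_real_, nrow(df))
    stated[has_marker] <- suppressWarnings(as.numeric(gsub(",", "", digits)))
    use <- !is.na(stated)
    df$coordinateUncertaintyInMeters[use] <- stated[use]

    # Records flagged as invalid are raised to at least 10000 m
    invalid <- grepl("COORDINATE_UNCERTAINTY_METERS_INVALID", df$issue, fixed = TRUE)
    if (any(invalid)) {
      df$coordinateUncertaintyInMeters[invalid] <-
        pmax(df$coordinateUncertaintyInMeters[invalid], 10000, na.rm = TRUE)
    }
  }

  df
}

# Made-up records using GBIF column names
recs <- data.frame(
  occurrenceStatus = c("PRESENT", "ABSENT", "PRESENT", "PRESENT"),
  individualCount = c(1, 1, NA, 2),
  organismQuantity = c(NA, NA, NA, NA_real_),
  eventDate = as.Date(c("2000-03-25", "2000-03-27", NA, "2000-09-15")),
  informationWithheld = c("Coordinate uncertainty increased to 10000m", NA, NA, NA),
  issue = c(NA, NA, NA, "COORDINATE_UNCERTAINTY_METERS_INVALID"),
  coordinateUncertaintyInMeters = c(50, 100, 30, NA),
  stringsAsFactors = FALSE
)

cleaned <- clean_gbif_sketch(recs)
```

In this toy set the second record is dropped as inconsistent and the third for its missing date; the two remaining records both end up with an uncertainty of 10000.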

Value

Dataframe of occurrences. The full download (as key.zip) and the file gbif.parquet are saved to save_dir.

Details

Uses various rgbif functions to return a dataframe of occurrence records. Requires GBIF credentials.

Any arguments to rgbif::occ_download() can be passed via predicates. For convenience, aoi can also be passed directly; internally it is converted to a bounding box in appropriate lat/long and passed to rgbif::pred_within() in WKT format.
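
As a rough illustration of the aoi handling, the base R sketch below turns a longitude/latitude bounding box into the kind of WKT polygon string that rgbif::pred_within() accepts. bbox_to_wkt() is a hypothetical helper and the coordinates are made up; inside get_gbif() the bounding box is presumably derived from the sf object itself (for example with sf::st_bbox()), which is an assumption about the internals. The ring is written anticlockwise, the winding order GBIF generally expects for polygons.

```r
# Hypothetical helper (not part of envImport): bounding box -> WKT polygon
bbox_to_wkt <- function(bbox) {
  stopifnot(all(c("xmin", "ymin", "xmax", "ymax") %in% names(bbox)))
  # Corners listed anticlockwise, closing the ring on the first corner
  ring <- rbind(
    c(bbox[["xmin"]], bbox[["ymin"]]),
    c(bbox[["xmax"]], bbox[["ymin"]]),
    c(bbox[["xmax"]], bbox[["ymax"]]),
    c(bbox[["xmin"]], bbox[["ymax"]]),
    c(bbox[["xmin"]], bbox[["ymin"]])
  )
  coords <- paste(ring[, 1], ring[, 2], collapse = ", ")
  paste0("POLYGON ((", coords, "))")
}

# Made-up bounding box in decimal degrees (x = longitude, y = latitude)
aoi_bbox <- c(xmin = 138.5, ymin = -35.2, xmax = 139, ymax = -34.7)
aoi_wkt <- bbox_to_wkt(aoi_bbox)

# The string could then be combined with other predicates, e.g.
# rgbif::pred_and(rgbif::pred_within(aoi_wkt), rgbif::pred("year", 2000))
```

Because only the bounding box is sent, records that fall inside the box but outside the aoi itself can be returned by such a query.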

Examples


  # setup -------
  library("envImport")

  # no aoi ------
  out_dir <- file.path(system.file(package = "envImport"), "examples", "get_gbif_ex")

  gbif_data <- get_gbif(save_dir = out_dir
                        , get_new = FALSE
                        #, data_map = envImport::data_map
                        , predicates = rgbif::pred_and(rgbif::pred("taxonKey", 2474903)
                                                       , rgbif::pred("year", 2000)
                                                       )
                        , previous_key = "0057516-240626123714530"
                        )
#> save_file will be C:/temp/joel/RtmpKSVQ8v/temp_libpath23d833687f18/envImport/examples/get_gbif_ex/gbif/gbif.parquet

  # 667 records 2024-08-09
  nrow(gbif_data)
#> [1] 667
  head(gbif_data)
#> # A tibble: 6 × 17
#>   data_name site       date         lat  long original_name nsx   occ_derivation
#>   <chr>     <chr>      <date>     <dbl> <dbl> <chr>         <lgl> <chr>         
#> 1 gbif      960974767  2000-03-25 -17.2  145. Ardeotis aus… NA    PRESENT       
#> 2 gbif      960883126  2000-03-27 -17.1  145. Ardeotis aus… NA    PRESENT       
#> 3 gbif      942289812  2000-09-25 -21.9  114. Ardeotis aus… NA    PRESENT       
#> 4 gbif      816864400  2000-09-15 -18.6  139. Ardeotis aus… NA    PRESENT       
#> 5 gbif      4631162629 2000-07-27 -14.6  144. Ardeotis aus… NA    PRESENT       
#> 6 gbif      4350045030 2000-09-11 -15.5  145. Ardeotis aus… NA    PRESENT       
#> # ℹ 9 more variables: quantity <chr>, rel_metres <dbl>, method <chr>,
#> #   obs <chr>, denatured <lgl>, kingdom <chr>, occ <dbl>, year <dbl>,
#> #   month <dbl>

  # with aoi
  out_dir <- file.path(system.file(package = "envImport"), "examples", "get_gbif_aoi_ex")

  gbif_data <- get_gbif(save_dir = out_dir
                        , aoi = envClean::aoi
                        , data_map = envImport::data_map
                        , get_new = FALSE
                        , predicates = rgbif::pred("year", 2000)
                        )
#> save_file will be C:/temp/joel/RtmpKSVQ8v/temp_libpath23d833687f18/envImport/examples/get_gbif_aoi_ex/gbif/gbif.parquet

  # 107 records 2024-08-09
  nrow(gbif_data)
#> [1] 107

  # .bib created
  readr::read_lines(fs::path(out_dir, "gbif", "gbif.bib"))
#>  [1] "@misc{gbif,"                                                                   
#>  [2] "  doi = {10.15468/DL.HXGMEU},"                                                 
#>  [3] "  url = {https://www.gbif.org/occurrence/download/0057584-240626123714530},"   
#>  [4] "  author = {{GBIF.Org User}},"                                                 
#>  [5] "  keywords = {GBIF, biodiversity, species occurrences},"                       
#>  [6] "  title = {Occurrence Download},"                                              
#>  [7] "  publisher = {The Global Biodiversity Information Facility},"                 
#>  [8] "  year = {2024},"                                                              
#>  [9] "  copyright = {Creative Commons Attribution Non Commercial 4.0 International},"
#> [10] "}"