Get a new occurrence record set from GBIF and save it as .parquet. GBIF is the Global Biodiversity Information Facility.
get_gbif(
  aoi = NULL,
  save_dir = NULL,
  get_new = FALSE,
  data_map = NULL,
  predicates = NULL,
  request_wait = 20,
  name = "gbif",
  filter_inconsistent = TRUE,
  filter_NA_date = TRUE,
  occ_char = TRUE,
  adj_spa_rel = TRUE,
  previous_key = NULL,
  ...
)
aoi: sf object defining the area of interest.
save_dir: Character. File path into which to save outputs. If NULL, results will be saved to fs::path("out", "ds", "gbif") as file gbif.parquet.
get_new: Logical. If FALSE, will attempt to load data from previously saved results.
data_map: Dataframe or NULL. Mapping of fields to retrieve. See the example envImport::data_map.
predicates: List. Any number of GBIF predicates, e.g. built with rgbif::pred() or rgbif::pred_and().
request_wait: Integer. Time in seconds to wait between rgbif::occ_download_meta() requests. Used as the status_ping argument of rgbif::occ_download_wait().
name: Character or NULL. data_name value in envImport::data_map (or other data_map). Required if data_map is not NULL.
filter_inconsistent: Logical. If TRUE, records with inconsistencies between the occurrenceStatus column and either organismQuantity or individualCount are removed. For example, a record with occurrenceStatus == "ABSENT" but individualCount == 1 would be filtered out.
filter_NA_date: Logical. If TRUE, records where is.na(eventDate) are filtered out.
occ_char: Logical. If TRUE, occ_derivation will be coerced to character (to match other data sources).
adj_spa_rel: Logical. If TRUE, an attempt will be made to check coordinateUncertaintyInMeters against information in informationWithheld. If informationWithheld contains "Coordinate uncertainty increased to", readr::parse_number() is used to retrieve that number, which then replaces any value in coordinateUncertaintyInMeters; and if the issue column contains COORDINATE_UNCERTAINTY_METERS_INVALID, coordinateUncertaintyInMeters is limited to 10000 or greater.
previous_key: Character, e.g. 0092123-240506114902167. If provided, an attempt will be made to load (or download) a previous query of occurrence data.
...: Passed to envImport::file_prep().
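The filter_inconsistent and adj_spa_rel rules described above can be illustrated on a toy data frame. This is only a sketch of the documented behaviour, not envImport's own code: the records are made up, the exact inconsistency rule (ABSENT with a positive count, or PRESENT with a zero count) is an assumption, and reading "limited to 10000 or greater" as a floor of 10000 is also an assumption.

```r
# Made-up records using GBIF column names (illustrative values only).
occ <- data.frame(
  occurrenceStatus = c("PRESENT", "ABSENT", "ABSENT", "PRESENT"),
  individualCount = c(2, 1, 0, NA),
  organismQuantity = c(NA, NA, NA, 5),
  coordinateUncertaintyInMeters = c(50, 100, 200, 300),
  informationWithheld = c(NA, NA, "Coordinate uncertainty increased to 30000m", NA),
  issue = c("", "", "", "COORDINATE_UNCERTAINTY_METERS_INVALID"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE where a non-missing count contradicts the status.
contradicts <- function(status, n) {
  !is.na(n) & ((status == "ABSENT" & n > 0) | (status == "PRESENT" & n == 0))
}

# filter_inconsistent: drop rows where either count column contradicts the status.
inconsistent <- contradicts(occ$occurrenceStatus, occ$individualCount) |
  contradicts(occ$occurrenceStatus, occ$organismQuantity)
kept <- occ[!inconsistent, ]

# adj_spa_rel, step 1: take the uncertainty stated in informationWithheld.
withheld <- grepl("Coordinate uncertainty increased to",
                  kept$informationWithheld, fixed = TRUE)
kept$coordinateUncertaintyInMeters[withheld] <-
  readr::parse_number(kept$informationWithheld[withheld])

# adj_spa_rel, step 2: where GBIF flags the uncertainty as invalid,
# apply a floor of 10000 (assumed interpretation).
invalid <- grepl("COORDINATE_UNCERTAINTY_METERS_INVALID", kept$issue, fixed = TRUE)
kept$coordinateUncertaintyInMeters[invalid] <-
  pmax(kept$coordinateUncertaintyInMeters[invalid], 10000)

kept[, c("occurrenceStatus", "individualCount", "coordinateUncertaintyInMeters")]
```

In this toy data only the ABSENT record with individualCount == 1 is dropped; the remaining rows keep or gain an uncertainty value according to the two adj_spa_rel steps.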
Dataframe of occurrences. The full download is saved (as key.zip) in save_dir, and the resulting file is saved to save_dir as gbif.parquet.
Uses various rgbif functions to return a dataframe of occurrence records. Requires GBIF credentials. Any arguments to rgbif::occ_download() can be passed via predicates.
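Since a download request needs GBIF credentials, it can help to check for them before calling get_gbif(). The sketch below assumes the credentials are supplied through the GBIF_USER, GBIF_PWD and GBIF_EMAIL environment variables, which rgbif reads when credentials are not passed explicitly; whether get_gbif() accepts them any other way is not stated here.

```r
# Environment variables rgbif looks up for download requests.
gbif_vars <- c("GBIF_USER", "GBIF_PWD", "GBIF_EMAIL")

# Names of any variables that are unset or empty in this session.
missing_vars <- gbif_vars[Sys.getenv(gbif_vars) == ""]

if (length(missing_vars) > 0) {
  message("Set these before requesting a new download: ",
          paste(missing_vars, collapse = ", "))
} else {
  message("GBIF credentials found.")
}
```

These variables are typically set in .Renviron so that they are available in every session without appearing in scripts.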
For convenience, aoi can also be passed directly; internally it is converted to a bounding box in the appropriate lat/long coordinates and passed to rgbif::pred_within() in WKT format.
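The aoi conversion described above might look roughly like the following. This is a sketch under assumptions rather than the function's internals: the rectangle and its CRS (EPSG:3577) are hypothetical, and EPSG:4326 is assumed to be the "appropriate lat/long".

```r
# Hypothetical area of interest: a rectangle in a projected CRS (EPSG:3577).
aoi <- sf::st_as_sfc(
  sf::st_bbox(
    c(xmin = 500000, ymin = -3900000, xmax = 600000, ymax = -3800000),
    crs = sf::st_crs(3577)
  )
)

# Reproject to lat/long, then take the bounding box of the reprojected shape.
aoi_ll <- sf::st_transform(aoi, crs = 4326)
bbox_ll <- sf::st_as_sfc(sf::st_bbox(aoi_ll))

# WKT string for the bounding-box polygon.
aoi_wkt <- sf::st_as_text(bbox_ll)

# Wrap the WKT in a GBIF predicate (only if rgbif is installed).
if (requireNamespace("rgbif", quietly = TRUE)) {
  aoi_pred <- rgbif::pred_within(aoi_wkt)
}
```

A predicate built this way could be combined with others, for example through rgbif::pred_and(), when constructing the predicates argument by hand instead of passing aoi.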
# setup -------
library("envImport")
# no aoi ------
out_dir <- file.path(system.file(package = "envImport"), "examples", "get_gbif_ex")
gbif_data <- get_gbif(save_dir = out_dir
, get_new = FALSE
#, data_map = envImport::data_map
, predicates = rgbif::pred_and(rgbif::pred("taxonKey", 2474903)
, rgbif::pred("year", 2000)
)
, previous_key = "0057516-240626123714530"
)
#> save_file will be C:/temp/joel/RtmpKSVQ8v/temp_libpath23d833687f18/envImport/examples/get_gbif_ex/gbif/gbif.parquet
# 667 records 2024-08-09
nrow(gbif_data)
#> [1] 667
head(gbif_data)
#> # A tibble: 6 × 17
#>   data_name site       date         lat  long original_name nsx   occ_derivation
#>   <chr>     <chr>      <date>     <dbl> <dbl> <chr>         <lgl> <chr>
#> 1 gbif      960974767  2000-03-25 -17.2  145. Ardeotis aus… NA    PRESENT
#> 2 gbif      960883126  2000-03-27 -17.1  145. Ardeotis aus… NA    PRESENT
#> 3 gbif      942289812  2000-09-25 -21.9  114. Ardeotis aus… NA    PRESENT
#> 4 gbif      816864400  2000-09-15 -18.6  139. Ardeotis aus… NA    PRESENT
#> 5 gbif      4631162629 2000-07-27 -14.6  144. Ardeotis aus… NA    PRESENT
#> 6 gbif      4350045030 2000-09-11 -15.5  145. Ardeotis aus… NA    PRESENT
#> # ℹ 9 more variables: quantity <chr>, rel_metres <dbl>, method <chr>,
#> #   obs <chr>, denatured <lgl>, kingdom <chr>, occ <dbl>, year <dbl>,
#> #   month <dbl>
# with aoi
out_dir <- file.path(system.file(package = "envImport"), "examples", "get_gbif_aoi_ex")
gbif_data <- get_gbif(save_dir = out_dir
, aoi = envClean::aoi
, data_map = envImport::data_map
, get_new = FALSE
, predicates = rgbif::pred("year", 2000)
)
#> save_file will be C:/temp/joel/RtmpKSVQ8v/temp_libpath23d833687f18/envImport/examples/get_gbif_aoi_ex/gbif/gbif.parquet
# 107 records 2024-08-09
nrow(gbif_data)
#> [1] 107
# .bib created
readr::read_lines(fs::path(out_dir, "gbif", "gbif.bib"))
#> [1] "@misc{gbif,"
#> [2] " doi = {10.15468/DL.HXGMEU},"
#> [3] " url = {https://www.gbif.org/occurrence/download/0057584-240626123714530},"
#> [4] " author = {{GBIF.Org User}},"
#> [5] " keywords = {GBIF, biodiversity, species occurrences},"
#> [6] " title = {Occurrence Download},"
#> [7] " publisher = {The Global Biodiversity Information Facility},"
#> [8] " year = {2024},"
#> [9] " copyright = {Creative Commons Attribution Non Commercial 4.0 International},"
#> [10] "}"