Flag local outliers using dbscan::lof()

flag_local_outliers(
  df,
  context,
  vars = context,
  min_points = 30,
  geo_rel_col = "rel_metres_adj",
  geo_rel_thresh = 100,
  iqrMult = 2,
  ...
)

Arguments

df

Dataframe containing the context column(s) and all other columns defining the space in which to look for outliers (usually environmental variables such as climate or satellite variables).

context

Character. Name of column(s) defining the context.

vars

Character. Name of column(s) to investigate for outliers. Defaults to context.

min_points

Numeric. Do not attempt the outlier calculation (dbscan::lof()) unless there are at least this many data points.

geo_rel_col

Character. Name of column containing geographic reliability information. Set to NULL to ignore.

geo_rel_thresh

Numeric. Threshold in geo_rel_col below which to filter that row from analysis. This matters when geographic reliability is coarse relative to the resolution of the variables: for example, there is no point checking whether a point is an outlier against satellite variables with a resolution of, say, 30 m if the geographic reliability of that point is 10 km. Ignored if geo_rel_col is NULL.

iqrMult

Numeric. Multiplier used in quantile(x, probs = 0.75) + iqrMult * IQR(x) to set the threshold for an outlier. For comparison, ggplot2::geom_boxplot() uses a multiplier of 1.5 by default.

...

Passed to dbscan::lof(), e.g. the minPts argument.

Value

tibble
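
Examples

The following is a minimal sketch of the threshold rule described under iqrMult, applied directly to scores from dbscan::lof(). It is not the internals of flag_local_outliers(): the data frame env, its columns temp and rain, the use of scale(), and the comparison scores > thresh are all assumptions made for illustration.

```r
# Hypothetical data: 50 clustered points plus one distant point (row 51).
set.seed(1)
env <- data.frame(
  temp = c(rnorm(50, mean = 20, sd = 1), 35),
  rain = c(rnorm(50, mean = 500, sd = 20), 900)
)

# Local outlier factor score for each row. minPts is the kind of argument
# that would be supplied through `...`. Scaling the columns first is an
# assumption of this sketch, so that no single variable dominates distances.
scores <- dbscan::lof(scale(env), minPts = 5)

# Threshold from the formula given for iqrMult (default 2).
iqrMult <- 2
thresh <- unname(quantile(scores, probs = 0.75)) + iqrMult * IQR(scores)

# Rows whose score exceeds the threshold (assumed flagging rule).
flagged <- which(scores > thresh)
flagged
```

A larger iqrMult raises the threshold and flags fewer rows; a smaller one flags more.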