Flag reverse jackknife outliers

flag_rjack_outliers(
  df,
  context,
  vars = context,
  min_points = 30,
  geo_rel_col = "rel_metres_adj",
  geo_rel_thresh = 100,
  prop_thresh = 1/3
)

Arguments

df

Dataframe containing the context column(s) and all other columns defining the space in which to look for outliers (usually environmental variables such as climate or satellite variables).

context

Character. Name(s) of the column(s) defining the context.

vars

Character. Name(s) of the column(s) to investigate for outliers.

min_points

Numeric. Don't attempt reverse jackknife calculations unless there are at least this number of data points.
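
As a rough illustration of the min_points check, the base R sketch below counts rows per context and keeps only contexts with enough points. The data frame, the column names taxa and temp, and the assumption that the count is taken per context are all hypothetical; this is not the function's actual code.

```r
# Hypothetical data: one context column ("taxa") and one variable ("temp")
df <- data.frame(
  taxa = rep(c("sp_a", "sp_b"), times = c(40, 5)),
  temp = seq_len(45),
  stringsAsFactors = FALSE
)
min_points <- 30

# Count the rows available in each context
n_per_context <- table(df$taxa)

# Contexts with at least min_points rows (assumed to be a per-context check)
eligible <- names(n_per_context)[n_per_context >= min_points]
eligible
# "sp_a"
```

Here sp_b has only 5 points, so under this reading no reverse jackknife calculation would be attempted for it.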

geo_rel_col

Character. Name of column containing geographic reliability information. Set to NULL to ignore.

geo_rel_thresh

Numeric. Threshold for geo_rel_col: rows whose geo_rel_col value exceeds this threshold are dropped from the analysis. Useful when a record's spatial reliability is much coarser than the resolution of the variables. For example, there is no point checking whether a point is an outlier against satellite variables (with a resolution of, say, 30 m) if the geographic reliability of that point is 10 km. Ignored if geo_rel_col is NULL.
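
The reliability filter can be pictured with the base R sketch below. The data frame and the site column are made up for illustration, and treating a value exactly equal to the threshold as reliable enough is an assumption rather than documented behaviour.

```r
# Hypothetical records with a geographic reliability column in metres
df <- data.frame(
  site = c("a", "b", "c"),
  rel_metres_adj = c(25, 100, 10000),
  stringsAsFactors = FALSE
)
geo_rel_col <- "rel_metres_adj"
geo_rel_thresh <- 100

# Keep rows at or under the threshold (inclusive boundary is an assumption)
keep <- df[[geo_rel_col]] <= geo_rel_thresh
df_checked <- df[keep, , drop = FALSE]
df_checked$site
# "a" "b"
```

Site c, with a reliability of 10 km, would not be compared against fine-resolution variables.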

prop_thresh

Numeric. What proportion of variables (i.e. proportion of vars) need to be reverse jackknife outliers for a point to be flagged as an outlier?
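
One way to picture prop_thresh is the base R sketch below. It assumes a per-variable outlier result is already available as a logical matrix (the point and variable names are invented), and that a point is flagged when its proportion meets or exceeds the threshold; the exact comparison the function uses is not stated here.

```r
# Hypothetical per-variable results: TRUE where a point is a
# reverse jackknife outlier for that variable
flags <- matrix(
  c(TRUE,  FALSE, FALSE, FALSE,
    TRUE,  TRUE,  FALSE, FALSE,
    FALSE, FALSE, FALSE, FALSE),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("pt1", "pt2", "pt3"), c("temp", "rain", "ndvi", "elev"))
)
prop_thresh <- 1/3

# Proportion of variables for which each point is an outlier
prop_flagged <- rowMeans(flags)

# Flag points meeting the threshold (the ">=" comparison is an assumption)
is_outlier <- prop_flagged >= prop_thresh
is_outlier
#   pt1   pt2   pt3
# FALSE  TRUE FALSE
```

With four variables, pt1 is an outlier in only one (0.25) and is not flagged, while pt2 is an outlier in two (0.5) and is.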

Value

tibble