The rdefra package retrieves air pollution data from the Air Information Resource (UK-AIR) of the Department for Environment, Food and Rural Affairs (DEFRA) in the United Kingdom. UK-AIR does not provide a public API for programmatic access to data, so this package scrapes the HTML pages to get the relevant information.
This package follows a logic similar to other packages such as
waterData and rnrfa: sites are first identified through a catalogue,
data are imported via the station identification number, then data are
visualised and/or used in analyses. The metadata related to the
monitoring stations are accessible through the function
ukair_catalogue(), missing stations’ coordinates can be obtained using
the function ukair_get_coordinates(), and time series data related to
different pollutants can be obtained using the function
ukair_get_hourly_data().
DEFRA’s servers can handle multiple data requests, so concurrent calls can be sent simultaneously using the parallel package. Although the rate limit depends on the maximum number of concurrent calls, traffic and available infrastructure, data retrieval is very efficient: multiple years of data for hundreds of sites can be downloaded in only a few minutes.
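As a minimal sketch of such concurrent retrieval, the helper below spreads requests for several sites over a small socket cluster from the parallel package. The helper name fetch_sites_parallel and its n_workers and fetch arguments are hypothetical and not part of rdefra; fetch defaults to ukair_get_hourly_data() and is only exposed so that the helper can be tried offline with a stand-in function.

```r
library(parallel)

# Hypothetical helper (not part of rdefra): download data for several
# sites concurrently, one request per site.
# `fetch` must accept a site ID as first argument and a `years` argument.
fetch_sites_parallel <- function(site_ids, years,
                                 fetch = rdefra::ukair_get_hourly_data,
                                 n_workers = 2) {
  # Start the worker processes and make sure they are stopped on exit
  cl <- makeCluster(n_workers)
  on.exit(stopCluster(cl), add = TRUE)

  # Each worker calls fetch(<site id>, years = years)
  results <- parLapply(cl, site_ids, fetch, years = years)

  # Label each result with the site it belongs to
  names(results) <- site_ids
  results
}
```

Called with a vector of site IDs (for example those returned by ukair_get_site_id()) and a range of years, this returns a named list with one element per site. The number of workers is best kept modest, since the rate limit on the server side is not documented here.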
For similar functionality, see also the openair package, which relies on a local copy of the data on servers at King’s College (UK), and the ropenaq package, which provides UK-AIR latest measured levels (see https://uk-air.defra.gov.uk/latest/currentlevels) as well as data from other countries.
Get the released version from CRAN:
install.packages("rdefra")
Or the development version from GitHub using the remotes package:
install.packages("remotes")
remotes::install_github("ropensci/rdefra")
Load the rdefra package:
library(rdefra)
The package logic assumes that users access the UK-AIR database in the following steps:
- Browse the catalogue of available stations and select some stations
  of interest (see function ukair_catalogue()).
- Get missing coordinates (see function ukair_get_coordinates()).
- Retrieve data for the selected stations (see functions
  ukair_get_site_id() and ukair_get_hourly_data()).
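The steps above can be sketched as a single function. This is an illustration under stated assumptions rather than the package's own recipe: the wrapper name fetch_first_match is hypothetical, the catalogue is assumed to be filtered with a site_name argument, and the station identifiers are assumed to sit in a column named UK.AIR.ID (check names() of the catalogue in your installed version).

```r
# Hypothetical wrapper (not part of rdefra): run the three steps for the
# first station whose name matches `site_name`.
fetch_first_match <- function(site_name, years) {
  # Step 1: browse the catalogue, keeping only matching stations
  # (assumption: ukair_catalogue() accepts a `site_name` filter)
  stations <- rdefra::ukair_catalogue(site_name = site_name)

  # Step 2: fill in missing coordinates
  stations <- rdefra::ukair_get_coordinates(stations)

  # Step 3: translate the UK-AIR ID into a site ID, then download data
  # (assumption: the IDs are stored in a column named "UK.AIR.ID")
  site_id <- rdefra::ukair_get_site_id(stations[["UK.AIR.ID"]][1])
  if (length(site_id) == 0 || is.na(site_id)) {
    stop("No site ID found for the first station matching '", site_name, "'")
  }
  rdefra::ukair_get_hourly_data(site_id = site_id, years = years)
}

# Example call (needs an internet connection):
# aberdeen <- fetch_first_match("Aberdeen", years = 2014)
```

Keeping the catalogue lookup and the download in separate steps, as the package does, means the station selection can be inspected before any time series is requested.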
For an in-depth description of the various functionalities and example applications, please refer to the package vignette.
- This package and the functions herein are part of an experimental open-source project. They are provided as is, without any guarantee.
- Please report any issues or bugs.
- License: GPL-3
- This package was reviewed by Maëlle Salmon and Hao Zhu for submission to rOpenSci (see review here) and the Journal of Open Source Software (see review here).
- Cite rdefra: citation(package = "rdefra")