-
Notifications
You must be signed in to change notification settings - Fork 0
/
README.qmd
139 lines (109 loc) · 3.76 KB
/
README.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
---
format: gfm
---
# stdmatern
This directory is for development of fast and memory-efficient code that creates Matérn precision matrices that have been standardized so that their inverse is a correlation matrix. The methods used in this code are described in [a paper I am working on](https://bggj.is/materneigenpaper/).
The package is still in very active development and should not be considered a stable product yet. The package can be installed with
```{r}
#| eval: false
pak::pak("bgautijonsson/stdmatern")
```
```{r}
library(stdmatern)
```
# Sampling spatial data
```{r}
#| include: false
#| echo: true
library(dplyr)
library(ggplot2)
```
Here we sample 50 replicates of highly dependent spatial data on a 200x100 grid, i.e. there's 20.000 observational locations.
```{r}
start <- tictoc::tic()
dim1 <- 200
dim2 <- 100
rho1 <- 0.6
rho2 <- 0.9
nu <- 2
n_obs <- 50
Z <- rmatern_copula_eigen(n_obs, dim1, dim2, rho1, rho2, nu)
stop <- tictoc::toc()
tibble(
Z = as.numeric(Z[, 1])
) |>
mutate(
lat = rep(seq_len(dim1), each = dim2),
lon = rep(seq_len(dim2), times = dim1),
) |>
ggplot(aes(lat, lon, fill = Z)) +
geom_raster() +
scale_fill_viridis_c() +
coord_fixed(expand = FALSE)
stop$callback_msg
```
On the plots below, the distribution of sample statistics are shown in black while the theoretical distributions of standard normal variables is shown in red.
```{r}
Z |> apply(1, mean) |> density() |> plot()
curve(dnorm(x, 0, 1 / sqrt(n_obs)), add = TRUE, from = -0.5, to = 0.5, col = "red")
```
```{r}
apply(Z, 1, var) |> density() |> plot()
curve((n_obs - 1) * dchisq((n_obs - 1) * x, df = (n_obs - 1)), col = "red", from = 0, to = 5, add = TRUE)
```
# Normal density
The package implements a method for calculating the log-density of a multivariate normal with appropriate precision matrix, defined as
$$
\mathbf{Q} = \left( \mathbf{Q}_{\rho_1} \otimes \mathbf{I_{n_2}} + \mathbf{I_{n_1}} \otimes \mathbf{Q}_{\rho_2} \right)^{\nu + 1}, \quad \nu \in \{0, 1, 2\},
$$
where $\nu$ is a smoothness parameter, $\otimes$ is the Kronecker product and each $Q_\rho$ is a precision matrix corresponding to a one-dimensional AR(1) process with unit marginal variance.
To evaluate the density of the random sample generated above we would run:
```{r}
tictoc::tic()
dens <- dmatern_copula_eigen(Z, dim1, dim2, rho1, rho2, nu) |> sum()
tictoc::toc()
```
## Folded Circulant Approximation
The package also provides a circulant approximation to the precision matrix, $\mathbf Q$, with reflective boundary conditions. This lets us do fast computations because of the FFT. This package uses the [FFTW C library](https://www.fftw.org/) for FFT computations, and the [Eigen C++ library](https://eigen.tuxfamily.org/index.php?title=Main_Page) for more general computations.
```{r}
tictoc::tic()
dens <- dmatern_copula_folded(Z, dim1, dim2, rho1, rho2, nu) |> sum()
tictoc::toc()
```
# Benchmarks
The following section contains a simple benchmark to compare the computation times of the exact method and the folded approximation. For further benchmarks, see the working paper linked above.
```{r}
my_fun <- function(dim) {
X <- rmatern_copula_eigen(1, dim, dim, rho, rho, nu)
bench::mark(
"Eigen" = dmatern_copula_eigen(X, dim, dim, rho, rho, nu),
"Folded" = dmatern_copula_folded(X, dim, dim, rho, rho, nu),
filter_gc = FALSE,
iterations = 20,
check = FALSE
) |>
mutate(
dim = dim
)
}
```
```{r}
#| cache: true
rho <- 0.5
nu <- 2
results <- purrr::map(
c(seq(20, 240, by = 20)),
my_fun
) |> purrr::list_rbind()
```
```{r}
results |>
mutate(
dim = glue::glue("{dim}x{dim}"),
expression = as.character(expression)
) |>
select(
Grid = dim, expression, median
) |>
tidyr::pivot_wider(names_from = expression, values_from = median)
```