---
title: "COVID-19 Doubles Every Two Days. Everywhere."
description: |
Hey US: Stay home, or become [Italy](http://www.cidrap.umn.edu/news-perspective/2020/03/doctors-covid-19-pushing-italian-icus-toward-collapse). Written 3/17 9:43pm PST.
site: distill::distill_website
---
## Quick Facts
* Although China was hit first, dozens of countries are following the same trajectory.
* In all top eight impacted countries, COVID-19 doubles every two to three days.
* The U.S. is only 10 days away from having as many cases as Italy has now.
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE)
install.load::install_load(c('distill','dplyr', 'tidyverse', 'ggplot2', 'plotly', 'rmarkdown', 'magrittr', 'RCurl', 'lubridate', 'DT', 'ggthemes'))
ggplot2::theme_set(theme_minimal())
read_data <- function(type = 'Deaths') {
  # Build the URL of the JHU CSSE time series file for the requested statistic
  fp_data <- paste0('https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_19-covid-', type, '.csv')
  # Read in data and reshape to one row per region and date
  df <- read_csv(fp_data) %>%
    select(-Lat, -Long) %>%
    gather(date_raw, stat,
           -`Province/State`, -`Country/Region`) %>%
    mutate(date = mdy(date_raw))
  # Store the statistic in a column named after its type (e.g. Deaths)
  df[[type]] <- df$stat
  # Clean column names: lower case, with '/' and ' ' replaced by '_'
  colnames(df) %<>%
    tolower() %>%
    str_replace_all(fixed('/'), '_') %>%
    str_replace_all(fixed(' '), '_')
  df %<>% select(-stat)
  return(df)
}
dfd <- read_data(type='Deaths')
dfc <- read_data(type='Confirmed')
dfr <- read_data(type='Recovered')
```
The warning is clear: COVID-19 is rising exponentially in the US.
```{r}
# US
dfc %>%
  filter(country_region == 'US') %>%
  group_by(date) %>%
  summarize(confirmed = sum(confirmed)) %>%
  ggplot(aes(x = date, y = confirmed)) +
  geom_line() +
  ggtitle('US Cases are on an Exponential Rise')
```
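One way to check the "exponential" claim is to look at the counts on a log scale, where exponential growth shows up as a straight line. Here is a minimal sketch in base R. The numbers are synthetic (made up to double every three days), not the JHU data, and the variable names are only for illustration.

```r
# Synthetic cumulative counts that double every 3 days (illustrative, not real data)
days <- 0:14
cases <- 100 * 2^(days / 3)

# On a log scale, exponential growth is a straight line in time
fit <- lm(log(cases) ~ days)
growth_rate <- unname(coef(fit)["days"])

# Doubling time implied by the fitted daily growth rate
doubling_days <- log(2) / growth_rate
print(doubling_days)  # close to 3 for this synthetic series
```

With real case counts the fit will not be perfect, so the estimate is only a rough guide to the doubling time.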
If we don't learn the lessons other countries have already learned, we'll repeat their recent history. Staying home saves lives.
## The US Had a Delayed Start, but Follows the Same Trends
While China was the first to get hit by COVID-19, everyone else is actually on a similar growth trajectory (unless we take action #stayhome!). The US was in a lull for many weeks, probably because access to and from China was wisely halted. But then cases came in from other sources. We failed to contain. We failed to prepare. Now, we're past containment.
This graph shows each country's total confirmed cases against the number of days since that country reported its first case. Day 1 is the first day a case was reported.
```{r}
# Days since each country's first confirmed case
df_sum <- dfc %>%
  group_by(country_region, date) %>%
  summarize(confirmed = sum(confirmed)) %>%
  arrange(country_region, date)
df_tmp <- df_sum %>%
  filter(confirmed > 0) %>%
  mutate(day_since_start = row_number()) %>%
  ungroup() %>%
  #filter(country_region %in% c("China", 'US', 'Italy', 'France', 'Iran')) %>% #View
  highlight_key(~country_region)
p <- ggplot(df_tmp, aes(x = day_since_start,
                        y = confirmed,
                        color = country_region)) +
  geom_line() +
  labs(title = 'Total Confirmed COVID-19 Cases by Days Since First Onset',
       x = 'Days Since Start of Onset',
       y = 'Total Confirmed') +
  theme(legend.position = "none")
ggplotly(p) %>%
  highlight('plotly_hover',
            selectize = TRUE,
            # dynamic=T,
            defaultValues = c('China', 'Italy', 'US'))
```
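The day-counting step can be sketched on toy data. This is a base R illustration with made-up countries and counts, not the code that builds the chart. It only shows how each country gets its own clock, with numbering starting at 1 on the first day with a reported case.

```r
# Toy data: two hypothetical countries observed on the same four dates
toy <- data.frame(
  country = c("A", "A", "A", "A", "B", "B", "B", "B"),
  date = as.Date("2020-03-01") + c(0:3, 0:3),
  confirmed = c(0, 2, 5, 9, 0, 0, 1, 4)
)

# Drop days before the first reported case, then number the days per country
onset <- toy[toy$confirmed > 0, ]
onset$day_since_start <- ave(seq_along(onset$country), onset$country,
                             FUN = seq_along)

print(onset[, c("country", "confirmed", "day_since_start")])
```

Country B reports its first case two dates later than country A, but both start at day 1 on their own clock, which is what lets the curves be compared.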
## COVID-19 Doubles Really Fast, Regardless of Country
No matter how long it takes a country to hit 500 cases, they all double at the same rate thereafter.
Italy hit its 500th case on day 28 of being exposed. It then doubled in 2 days. Then again. Then again.
The US didn't hit its 500th case until day 47 of the virus being in the US. This was 47 days of a false sense of security. This was 47 days we all watched China double exponentially. Now *we* are doubling.
There are eight countries that have hit 4,000 cases as of March 16, 2020. It takes about 1-3 days for the virus to double, regardless of the country it's been present in.
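To make the doubling arithmetic concrete, here is a small sketch. `project_cases` is a hypothetical helper, and the two-day doubling time is taken from the Italy example above rather than fitted to data.

```r
# Cases after `days` days, starting from n0, if the count doubles every `doubling_days` days
project_cases <- function(n0, days, doubling_days) {
  n0 * 2^(days / doubling_days)
}

# Starting at 500 cases and doubling every 2 days
projection <- project_cases(500, c(0, 2, 4, 6), doubling_days = 2)
print(projection)  # 500 1000 2000 4000
```

Three doublings take 500 cases to 4,000 in six days, which is why the table tracks the 500, 1k, 2k, and 4k marks.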
```{r}
# Days since each country's first confirmed case
df_firsts <- df_sum %>%
  group_by(country_region) %>%
  filter(confirmed >= 1) %>%
  mutate(max_confirmed = max(confirmed)) %>%
  arrange(country_region, date) %>%
  mutate(days_since_first = row_number())
# Days to reach 500 cases, then days for each doubling up to 4,000
df_firsts %>%
  filter(confirmed > 500) %>%
  mutate(
    days_until_500 = min(days_since_first)
  ) %>%
  filter(confirmed > 1000) %>%
  mutate(
    days_until_1k = min(days_since_first),
    days_500_to_1k = days_until_1k - days_until_500
  ) %>%
  filter(confirmed > 2000) %>%
  mutate(
    days_until_2k = min(days_since_first),
    days_1k_to_2k = days_until_2k - days_until_1k
  ) %>%
  filter(confirmed > 4000) %>%
  mutate(
    days_until_4k = min(days_since_first),
    days_2k_to_4k = days_until_4k - days_until_2k
  ) %>%
  # Keep each country's most recent row instead of relying on the knit date
  filter(date == max(date)) %>%
  select(country_region, confirmed, days_until_500, days_500_to_1k, days_1k_to_2k, days_2k_to_4k, days_since_first) %>%
  arrange(desc(days_until_500)) %>%
  datatable(
    colnames = c(
      'Country',
      'Total Confirmed',
      'Days to Hit 500',
      'From 500 to 1k',
      'From 1k to 2k',
      'From 2k to 4k',
      'Total Days Since First'
    ),
    options = list(
      pageLength = 100,
      dom = 't',
      scrollX = TRUE
    )
  )
```
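The "days between thresholds" columns in the table can be reproduced on a toy series. This is a sketch in base R with made-up daily totals. `first_day_above` is a hypothetical helper, not part of the table code.

```r
# First day (1-based position) on which a cumulative series exceeds a threshold
first_day_above <- function(cumulative, threshold) {
  hits <- which(cumulative > threshold)
  if (length(hits) == 0) NA_integer_ else hits[1]
}

# Made-up cumulative totals for one country, one value per day
toy <- c(120, 260, 510, 800, 1100, 1700, 2300, 3100, 4200)

# Day each threshold is first exceeded, then the gaps between those days
crossings <- sapply(c(500, 1000, 2000, 4000), first_day_above, cumulative = toy)
days_between <- diff(crossings)
print(days_between)  # 2 2 2
```

A gap of two days between each threshold is what "doubling every two days" looks like in the table.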
## Too Many are Trending Like Italy
Looking at countries that have had at least 500 cases so far, we can see how the US is trending with other countries, especially Italy. As of 3/17 there are 19 countries that have at least 500 cases. I'm guessing many of these will have to go into total lockdown to avoid the exponential growth.
```{r}
# df_sum %>%
#   filter(confirmed > 500) %>%
#   pull(country_region) %>%
#   unique() %>%
#   length()
df_tmp <- df_sum %>%
  filter(confirmed > 500) %>%
  mutate(day_since_500 = row_number()) %>%
  filter(country_region != 'Cruise Ship') %>%
  ungroup() %>%
  #filter(country_region %in% c("China", 'US', 'Italy', 'France', 'Iran')) %>% #View
  highlight_key(~country_region)
p <- ggplot(df_tmp, aes(x = day_since_500,
                        y = confirmed,
                        color = country_region)) +
  geom_line() +
  labs(title = 'Total Confirmed COVID-19 Cases by Days \nSince First Reaching 500',
       x = 'Days Since 500 Cases',
       y = 'Total Confirmed') +
  theme(legend.position = "none")
ggplotly(p) %>%
  highlight('plotly_click',
            selectize = TRUE,
            defaultValues = c('Italy', 'US'))
```
To home in on the trends, we see that the US is just ten days behind Italy.
```{r}
df_sum %>%
  filter(confirmed > 100) %>%
  mutate(days_since = row_number(),
         days_since = ifelse(country_region == 'US', days_since - 0, days_since)) %>%
  filter(country_region %in% c('Italy', 'US')) %>%
  ggplot(aes(x = days_since, y = confirmed, fill = country_region)) +
  geom_bar(stat = 'identity', position = 'dodge') +
  scale_fill_pander(name = 'Country') +
  labs(title = 'The US is Just Ten Days Behind Italy',
       x = 'Days Since Hitting 100 Confirmed Cases',
       y = 'Total Confirmed Cases',
       #subtitle = 'US-confirmed cases shifted one day to the left to align trends',
       caption = 'Data Source: github.com/CSSEGISandData/COVID-19\ngithub.com/bryanwhiting/covid19')
```
If you just shift the green trend to the left by one day, you'll notice they align perfectly. We're on the same trajectory as Italy.
```{r}
df_sum %>%
  filter(confirmed > 100) %>%
  mutate(days_since = row_number(),
         days_since = ifelse(country_region == 'US', days_since - 1, days_since)) %>%
  filter(country_region %in% c('Italy', 'US')) %>%
  ggplot(aes(x = days_since, y = confirmed, fill = country_region)) +
  geom_bar(stat = 'identity', position = 'dodge') +
  scale_fill_pander(name = 'Country') +
  labs(title = 'The US is Just Ten Days Behind Italy',
       x = 'Days Since Hitting 100 Confirmed Cases',
       y = 'Total Confirmed Cases',
       subtitle = 'US-confirmed cases shifted one day to the left to align trends',
       caption = 'Data Source: github.com/CSSEGISandData/COVID-19\ngithub.com/bryanwhiting/covid19')
```
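Instead of eyeballing the shift, you could search for the lag that lines two curves up best. Below is a rough sketch in base R, assuming both series are cumulative counts on the same calendar dates and are strictly positive. `best_lag` is a hypothetical helper, and the two series are synthetic, built so that the follower trails the leader by exactly ten days.

```r
# Lag (in days) at which the follower's curve best matches the leader's,
# scored by the mean squared difference of log counts over the overlap
best_lag <- function(leader, follower, max_lag = 20) {
  lags <- 0:max_lag
  score <- sapply(lags, function(k) {
    n <- min(length(follower) - k, length(leader))
    if (n < 3) return(Inf)  # too little overlap to compare
    mean((log(follower[(k + 1):(k + n)]) - log(leader[1:n]))^2)
  })
  lags[which.min(score)]
}

# Synthetic series: both double every 3 days, the follower 10 days later
days <- 0:24
leader <- 100 * 2^(days / 3)
follower <- 100 * 2^((days - 10) / 3)

lag_found <- best_lag(leader, follower)
print(lag_found)  # 10
```

On real data the match will never be exact, so treat the result as an estimate of how far behind one country is, not a forecast.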
Other countries in Europe are on the same trend.
```{r}
df_sum %>%
  filter(confirmed > 100) %>%
  mutate(days_since = row_number()) %>%
  filter(country_region %in% c('Italy', 'France', 'Spain', 'Germany', 'Switzerland', 'Norway')) %>%
  ggplot(aes(x = days_since, y = confirmed, color = country_region)) +
  geom_line() +
  # geom_bar(stat = 'identity', position='dodge') +
  scale_color_pander(name = 'Country') +
  labs(title = 'The Rest of Europe is on Track to Become Italy',
       x = 'Days Since Hitting 100 Confirmed Cases',
       y = 'Total Confirmed Cases',
       caption = 'Data Source: github.com/CSSEGISandData/COVID-19\ngithub.com/bryanwhiting/covid19')
```
## Stay Home!
Read the headlines. If we don't take action, we'll be in the same position.