--- title: "Short term trend" author: "Richard Aubrey White, Chi Zhang" date: "2023-05-31" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Short term trend} %\VignetteEncoding{UTF-8} %\VignetteEngine{knitr::rmarkdown} editor_options: chunk_output_type: console --- ```{r setup, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` This vignette shows how to compute short-term trend signals using `csalert::short_term_trend`. Two scenarios are covered: a single location (Norway) and multiple locations (Norwegian counties). ```{r} library(ggplot2) library(data.table) library(magrittr) ``` ## Single location ### Covid-19 hospitalisation data The dataset contains daily and weekly counts of Covid-19 hospitalisations in Norway where Covid-19 was recorded as the primary cause. It does not distinguish between age groups or sex. This extract was pulled on 2022-05-04 and covers 2020-02-21 to 2022-05-03. The raw data in CSV and XLSX formats are available on the [GitHub repository](https://github.com/folkehelseinstituttet/surveillance_data). ### Data in `cstidy` format The data have been pre-processed into [cstidy format](https://niphr.github.io/cstidy/articles/csfmt_rts_data_v2.html), which provides a standardised column structure including data types and missingness information. ```{r} d_hosp <- cstidy::nor_covid19_icu_and_hospitalization_csfmt_rts_v1 d_hosp ``` ### Weekly observations The `short_term_trend` function is run on the weekly subset of the data. The `trend_isoyearweeks` argument sets the number of weeks used to estimate the trend, and `remove_last_isoyearweeks` trims the most recent weeks to account for reporting delays. ```{r} d_hosp_weekly <- d_hosp[granularity_time=="isoyearweek"] res <- csalert::short_term_trend( d_hosp_weekly, numerator = "hospitalization_with_covid19_as_primary_cause_n", trend_isoyearweeks = 6, remove_last_isoyearweeks = 1 ) # create the trend label res[, hospitalization_with_covid19_as_primary_cause_trend0_41_status := factor( hospitalization_with_covid19_as_primary_cause_trend0_41_status, levels = c("training","forecast","notincreasing", "increasing"), labels = c("Training","Forecast","Not increasin", "Increasing") )] colnames(res) ``` The function adds several new columns to the original data. The example below shows the forecasted counts alongside the trend status. ```{r} # check some columns res[ , .( date, hospitalization_with_covid19_as_primary_cause_n, hospitalization_with_covid19_as_primary_cause_forecasted_n, hospitalization_with_covid19_as_primary_cause_trend0_41_status ) ] ``` ### Visualising trend status Trend status can be mapped to bar fill colours, with error bars showing the prediction interval. ```{r} q <- ggplot( res, aes( x = isoyearweek, y = hospitalization_with_covid19_as_primary_cause_forecasted_n, group = 1 ) ) q <- q + geom_col(mapping = aes(fill = hospitalization_with_covid19_as_primary_cause_trend0_41_status)) q <- q + geom_errorbar( mapping = aes( ymin = hospitalization_with_covid19_as_primary_cause_forecasted_predinterval_q02x5_n, ymax = hospitalization_with_covid19_as_primary_cause_forecasted_predinterval_q97x5_n ) ) q <- q + scale_y_continuous("Weekly hospitalization with Covid-19 as primary cause", expand = c(0, 0.1)) q <- q + scale_x_discrete("Isoyearweek") q <- q + expand_limits(y=0) q <- q + scale_fill_brewer("6 week trend", palette = "Set1") q ``` Point shapes can be used instead, placing a symbol above each bar to mark the trend category. ```{r} shape_adjustment_factor <- max(res$hospitalization_with_covid19_as_primary_cause_forecasted_n)*0.01 q <- ggplot( res, aes( x = isoyearweek, y = hospitalization_with_covid19_as_primary_cause_forecasted_n, group = 1 ) ) q <- q + geom_col() q <- q + geom_point(mapping = aes( y = hospitalization_with_covid19_as_primary_cause_forecasted_n + shape_adjustment_factor, shape = hospitalization_with_covid19_as_primary_cause_trend0_41_status )) q <- q + geom_errorbar( mapping = aes( ymin = hospitalization_with_covid19_as_primary_cause_forecasted_predinterval_q02x5_n, ymax = hospitalization_with_covid19_as_primary_cause_forecasted_predinterval_q97x5_n ) ) q <- q + scale_y_continuous("Weekly hospitalization with Covid-19 as primary cause", expand = c(0, 0.1)) q <- q + scale_x_discrete("Isoyearweek") q <- q + expand_limits(y=0) q <- q + scale_shape_manual("6 week trend", values = c("Increasing" = 17, "Decreasing" = 6)) q ``` ## Multiple locations When the data contain multiple locations, `short_term_trend` computes the trend for each one. Here the function is applied to weekly Covid-19 case counts at county level. ```{r} d <- cstidy::nor_covid19_cases_by_time_location_csfmt_rts_v1[ granularity_time == "isoyearweek" & granularity_geo == "county" ] trend <- csalert::short_term_trend( d, numerator = "covid19_cases_testdate_n", trend_isoyearweeks = 6, remove_last_isoyearweeks = 1 ) print(trend) ``` ### Mapping trend status The trend status for a single week can be joined to a county map polygon dataset and plotted geographically. ```{r} pd <- copy(csmaps::nor_county_map_b2020_split_dt) pd[ trend[isoyearweek == "2021-44"], on = c("location_code"), covid19_cases_testdate_trend0_41_status := covid19_cases_testdate_trend0_41_status ] # plot map q <- ggplot() q <- q + geom_polygon( data = pd, mapping = aes(x = long, y = lat, group = group,fill=covid19_cases_testdate_trend0_41_status), color="black", linewidth = 0.2 ) q <- q + coord_quickmap() q <- q + theme_void() q <- q + labs(title="MSIS cases per 100k population for week 2021-44") q <- q + scale_fill_brewer("Covid trends", palette = "Set1", direction = -1) q ```