---
title: "Introduction"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Introduction}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r echo=FALSE, include=FALSE}
library(data.table)
library(magrittr)
```

# csfmt_rts_data_v2

`csfmt_rts_data_v2` (`vignette("csfmt_rts_data_v2", package = "cstidy")`) is the Core Surveillance data format for real-time surveillance of infectious diseases.

```{r}
d <- cstidy::generate_test_data()
cstidy::set_csfmt_rts_data_v2(d)

# Looking at the dataset
d[]
```

## Smart assignment

`csfmt_rts_data_v2` supports smart assignment for time and geography.

When the **bold** variables below are set with `:=`, the associated variables are automatically derived.

**location_code**:

- granularity_geo
- country_iso3

**isoyear**:

- granularity_time
- isoweek
- isoyearweek
- season
- seasonweek
- calyear
- calmonth
- calyearmonth
- date

**isoyearweek**:

- granularity_time
- isoyear
- isoweek
- season
- seasonweek
- calyear
- calmonth
- calyearmonth
- date

**date**:

- granularity_time
- isoyear
- isoweek
- isoyearweek
- season
- seasonweek
- calyear
- calmonth
- calyearmonth

```{r}
d <- cstidy::generate_test_data()[1:5]
cstidy::set_csfmt_rts_data_v2(d)

# Looking at the dataset
d[]

# Smart assignment of time columns (note how granularity_time, isoyear, isoyearweek, date all change)
d[1,isoyearweek := "2021-01"]
d

# Smart assignment of time columns (note how granularity_time, isoyear, isoyearweek, date all change)
d[2,isoyear := 2019]
d

# Smart assignment of time columns (note how granularity_time, isoyear, isoyearweek, date all change)
d[4:5,date := as.Date("2020-01-01")]
d

# Smart assignment fails when multiple time columns are set
d[1,c("isoyear","isoyearweek") := .(2021,"2021-01")]
d

# Smart assignment of geo columns
d[1,c("location_code") := .("norge")]
d

# Collapsing down to different levels, and healing the dataset
# (so that it can be worked on further with regards to real time surveillance)
d[, .(deaths_n = sum(deaths_n), location_code = "norge"), keyby=.(granularity_time)] %>%
  cstidy::set_csfmt_rts_data_v2(create_unified_columns = FALSE) %>%
  print()

# Collapsing to different levels, and removing the class csfmt_rts_data_v2 because
# it is going to be used in new output/analyses
d[, .(deaths_n = sum(deaths_n), location_code = "norge"), keyby=.(granularity_time)] %>%
  cstidy::remove_class_csfmt_rts_data() %>%
  print()
```

## Summary

`summary()` gives a concise overview of the data structure.

```{r}
cstidy::generate_test_data() %>%
  cstidy::set_csfmt_rts_data_v2() %>%
  summary()
```

## Identifying the data structure of one column

`cstidy::identify_data_structure()` inspects a single column and returns a plottable object.

```{r}
cstidy::generate_test_data() %>%
  cstidy::set_csfmt_rts_data_v2() %>%
  cstidy::identify_data_structure("deaths_n") %>%
  plot()
```
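
Returning to the smart assignment lists above: the sketch below shows, in base R only, how some of the time columns relate to a `date`. It is an illustration under stated assumptions, not cstidy's implementation. The helper `derive_iso_fields()` is hypothetical, the integer column types are a guess, and `season`, `seasonweek`, and `granularity_time` are left out because their rules are not described here.

```{r}
# Hypothetical helper (not part of cstidy): derive a few calendar fields
# from a date with base R's format() codes.
#   %G = ISO 8601 week-based year, %V = ISO 8601 week number (01-53)
#   %Y = calendar year,            %m = calendar month (01-12)
derive_iso_fields <- function(date) {
  date <- as.Date(date)
  data.frame(
    date = date,
    isoyear = as.integer(format(date, "%G")),
    isoweek = as.integer(format(date, "%V")),
    isoyearweek = format(date, "%G-%V"),
    calyear = as.integer(format(date, "%Y")),
    calmonth = as.integer(format(date, "%m")),
    stringsAsFactors = FALSE
  )
}

# 2021-01-03 is a Sunday, so it still belongs to ISO week 53 of 2020
# even though its calendar year is 2021.
derive_iso_fields(c("2020-01-01", "2021-01-03"))
```

The second row shows why `isoyear` and `calyear` are kept as separate columns: around New Year they can disagree for the same `date`.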