---
title: "Intro to csutil"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Intro to csutil}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
editor_options:
  chunk_output_type: console
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

## Splitting

`easy_split()` divides a vector into groups, given either the target size of each group or the total number of groups.

```{r}
csutil::easy_split(letters[1:20], size_of_each_group = 3)
csutil::easy_split(letters[1:20], number_of_groups = 3)
```

## Unnesting data frames within a list

`unnest_dfs_within_list_of_fully_named_lists()` collapses a list of named lists, each containing data frames, into a single flat list. Elements that share a name across the outer lists are row-bound together.

```{r}
x <- list(
  list(
    "a" = data.frame("v1"=1),
    "b" = data.frame("v2"=3)
  ),
  list(
    "a" = data.frame("v1"=10),
    "b" = data.frame("v2"=30),
    "d" = data.frame("v3"=50)
  )
)
print(x)
csutil::unnest_dfs_within_list_of_fully_named_lists(x)
```

## Describing lists

These predicates test structural properties of a list: whether every element is named, and whether every element is either `NULL` or of a particular type.

```{r}
csutil::is_fully_named_list(list(1))
csutil::is_fully_named_list(list("a"=1))

csutil::is_all_list_elements_null_or_df(list(data.frame()))
csutil::is_all_list_elements_null_or_df(list(1, NULL))

csutil::is_all_list_elements_null_or_list(list(1, NULL))
csutil::is_all_list_elements_null_or_list(list(list(), NULL))

csutil::is_all_list_elements_null_or_fully_named_list(list(list(), NULL))
csutil::is_all_list_elements_null_or_fully_named_list(list(list("a" = 1), NULL))
```

## Applying a function via hash table

`apply_fn_via_hash_table()` extracts the unique values from the input, applies the given function once per unique value to build a lookup table, then maps the results back to the original input.
When the input contains many repeated values, this avoids redundant computation and can be substantially faster than applying the function directly to the full input.

```{r}
# Every day from 2000-01-01 to 2020-01-01, repeated 1000 times
input <- rep(seq(as.Date("2000-01-01"), as.Date("2020-01-01"), 1), 1000)

# Direct approach: format the full input
a1 <- Sys.time()
z <- format(input, "%Y")
a2 <- Sys.time()
a2 - a1

# Hash table approach: format each unique date once, then map back
b1 <- Sys.time()
z <- csutil::apply_fn_via_hash_table(
  input,
  format,
  "%Y"
)
b2 <- Sys.time()
b2 - b1
```
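
The idea described above can be sketched in base R. This is only an illustration under stated assumptions, not the implementation used by `csutil`: the helper name `apply_once_per_unique()` is hypothetical, and it assumes the function returns a single value for each input element.

```{r}
# Hypothetical helper, not part of csutil: apply `fn` once per unique value.
apply_once_per_unique <- function(x, fn, ...) {
  # Distinct values act as the keys of the lookup table.
  unique_values <- unique(x)
  # Call `fn` once per key; assumes each call returns a single value.
  lookup <- lapply(unique_values, fn, ...)
  # match() gives the key position for each element of the original input.
  unlist(lookup[match(x, unique_values)])
}

dates <- rep(as.Date(c("2000-01-01", "2010-06-15")), times = 3)
apply_once_per_unique(dates, format, "%Y")
```

Here `format()` is called twice rather than six times. The saving grows with the ratio of input length to the number of distinct values, so little is gained when most values are unique.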