Day 22: Having a rest with friends 📷


Day 21: An impressive space at the foot of a mountain 📷


Day 20: I’m looking forward to resuming winter sports 📷


Day 19: Mirror in a lake 📷


Morning arising 📷


Day 18: Lucy is finished for the day 📷


Day 17: No need for a compass when hiking in the city, just follow the sound of traffic 📷


Day 16: Rotation 📷


Day 15: Ethereal 📷


Day 14: My favourite wheels as a kid 📷


Day 13: Lucy is a couch animal 📷


Day 12: Rock legends 📷


Day 11: Hygge 📷


The cansim R package is really helpful 📦 📊

Statistics Canada has a wealth of data that are essential for good public policy. Often a good third of my analytical scripts are devoted to accessing and processing data from the Statistics Canada website, which always seems like a waste of effort and a good opportunity for making silly errors. So, I was keen to test out the cansim package for R to see how it might help. The quick answer is “very much”.

The documentation for the cansim package is thorough and doesn’t need to be repeated here. I thought it might be useful to illustrate how helpful the package can be by refactoring some earlier work that explored consumer price inflation.

These scripts always start off with downloading and extracting the relevant data file:

cpi_url <- "https://www150.statcan.gc.ca/n1/tbl/csv/18100004-eng.zip" # (1)
if (file.exists("18100004-eng.zip")) { # (2)
  # Already downloaded
} else {
  download.file(cpi_url,
    destfile = "18100004-eng.zip",
    quiet = TRUE)
  unzip("18100004-eng.zip") # (3)
}
cpi <- readr::read_csv("18100004.csv") # (4)

A few things to note here:

  1. You need to know the URL for the data. Sometimes the logic is clear and you can guess, but often that doesn’t work and you need to spelunk through the Statistics Canada website.
  2. To avoid downloading the file every time I run the script, there’s a test to see if the file already exists.
  3. This approach yields lots of files and folders that you need to manage, including making sure they’re ignored by version control.
  4. The great readr package imports the final CSV file.
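
If you do stay with the manual approach, notes 2 and 3 can be folded into a small helper. This is only a sketch: fetch_csv and its cache_dir argument are names I made up for illustration, and it assumes the zip file contains a CSV with the name you pass in.

```r
# Hypothetical helper (not part of any package): download and unzip
# a file only when the extracted CSV is not already cached.
fetch_csv <- function(url, csv_name, cache_dir = tempdir()) {
  csv_path <- file.path(cache_dir, csv_name)
  if (!file.exists(csv_path)) {
    # Keep the zip in a temporary location so it never lands in the project
    zip_path <- tempfile(fileext = ".zip")
    download.file(url, destfile = zip_path, quiet = TRUE, mode = "wb")
    unzip(zip_path, exdir = cache_dir)
    unlink(zip_path)
  }
  # Return the path so the caller can pass it to a reader
  csv_path
}
```

With this in place, something like readr::read_csv(fetch_csv(cpi_url, "18100004.csv")) would replace the if/else block. Because the default cache is R's temporary directory, nothing needs to be ignored by version control, though the file is downloaded again in each new R session.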

With cansim all I need to know is the data series number:

cansim_table <- "18-10-0004"
cpi <- cansim::get_cansim(cansim_table)

get_cansim downloads the right file to a temporary directory, extracts the data, and imports it as a tidyverse-compatible data frame.

The get_cansim function has some other nice features. It automatically creates a Date column with the right type, inferred from the standard REF_DATE column. It also creates a val_norm column that intelligently converts the VALUE column, for example by converting percentage or thousand-dollar values into standard formats.
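
To make those two columns concrete, here is roughly what I would otherwise write by hand in base R. This is a sketch on made-up toy data, not how cansim implements it: the SCALAR_FACTOR column and its "units" and "thousands" labels are my assumption about the raw CSV layout, and the percentage conversion is left out.

```r
# Toy data standing in for a raw Statistics Canada table (values are made up)
toy <- data.frame(
  REF_DATE = c("2019-01", "2019-02"),
  VALUE = c(1.5, 2),
  SCALAR_FACTOR = c("thousands", "units"), # assumed column and labels
  stringsAsFactors = FALSE
)

# Monthly REF_DATE strings become proper dates on the first of the month
toy$Date <- as.Date(paste0(toy$REF_DATE, "-01"))

# Scale VALUE by its stated factor so every row is in plain units
scale_lookup <- c(units = 1, thousands = 1e3, millions = 1e6)
toy$val_norm <- toy$VALUE * unname(scale_lookup[toy$SCALAR_FACTOR])
```

Having get_cansim do this for me means one less lookup table to maintain in every script.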

The cansim package is a great example of a really helpful utility package that allows me to focus on analysis, rather than fiddling around with data. Definitely worth checking out if you deal with data from Statistics Canada.


Day 10: The bridges of my morning run 📷 🏃‍♂️


Day 9: Swinging through the trees is safe with this gear on 📷


Day 8: A benefit of a twilight run is that the sidewalks are clear 📷


Day 7: Spice 📷


Day 6: Street 📷


Day 5: The toys are watching, always 📷