Last updated: 2020-07-13

Checks: 7 passed, 0 failed

Knit directory: wflow-divvy/analysis/

This reproducible R Markdown analysis was created with workflowr (version 1.6.2). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.

The command set.seed(1) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
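
As a small illustration of why the seed matters (this snippet is not part of the original analysis, and the variable names are made up), resetting the seed before a random draw reproduces the same draw:

```r
# Draw a random permutation of 1..10 twice, resetting the seed each time.
set.seed(1)
x <- sample(10)
set.seed(1)
y <- sample(10)

# Because the seed was the same before each draw, the permutations match.
print(identical(x, y))
```

Without the second call to set.seed, the two permutations would almost certainly differ.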

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version 1c51d7c. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .DS_Store
    Ignored:    analysis/.DS_Store
    Ignored:    data/Divvy_Stations_2016_Q1Q2.csv
    Ignored:    data/Divvy_Stations_2016_Q3.csv
    Ignored:    data/Divvy_Stations_2016_Q4.csv
    Ignored:    data/Divvy_Trips_2016_04.csv
    Ignored:    data/Divvy_Trips_2016_05.csv
    Ignored:    data/Divvy_Trips_2016_06.csv
    Ignored:    data/Divvy_Trips_2016_Q1.csv
    Ignored:    data/Divvy_Trips_2016_Q3.csv
    Ignored:    data/Divvy_Trips_2016_Q4.csv
    Ignored:    data/README.txt
    Ignored:    data/data.tar.gz

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the repository in which changes were made to the R Markdown (analysis/first-glance.Rmd) and HTML (docs/first-glance.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File Version Author Date Message
html b9ae8da Peter Carbonetto 2020-01-06 Re-built remaining analysis pages using workflowr 1.6.0.
html ee7e9d3 Peter Carbonetto 2019-07-31 Re-built first-glance page using workflowr 1.4.0.
html 5357a3b Peter Carbonetto 2019-04-10 Build site.
Rmd 61c85b2 Peter Carbonetto 2019-04-10 wflow_publish(c("seasonal-trends.Rmd", "station-map.Rmd",
html f4e627f Peter Carbonetto 2019-04-10 Re-built first-glance analysis using workflowr 1.2.0.9000.
html 66feda4 Peter Carbonetto 2018-05-07 Adjusted _site,yml slightly.
html 39bbd3a Peter Carbonetto 2018-04-14 Re-built first-glance webpage with workflowr v0.11.0.9000.
Rmd ea5fb72 Peter Carbonetto 2018-04-14 wflow_publish("first-glance.Rmd")
html 51163d7 Peter Carbonetto 2018-03-12 Ran wflow_publish("*.Rmd") with version v0.11.0 of workflowr.
html 440ea39 Peter Carbonetto 2018-03-09 Removed the code_folding feature.
html ab9176e Peter Carbonetto 2018-03-09 Added code_hiding to the analysis R Markdown files.
html 97cbef6 Peter Carbonetto 2018-01-23 Adjusted footer and re-built all pages.
html b32e833 Peter Carbonetto 2018-01-18 Re-built all webpages using workflowr v0.1.0.
html 7d0b902 Peter Carbonetto 2017-11-16 Re-built first-glance.html with workflowr v0.8.0.
Rmd 6f16f68 Peter Carbonetto 2017-08-02 Reverted changes for testing only.
Rmd 1470002 Peter Carbonetto 2017-08-02 Testing wflow_status() bug.
html 7979358 Peter Carbonetto 2017-08-02 Re-built all webpages.
Rmd 6b9ddf1 Peter Carbonetto 2017-08-02 Added header with between-section spacing adjustment, and removed <br> tags from R Markdown files.
html 13f03ed Peter Carbonetto 2017-07-31 Re-built all webpages.
html 6d2c5f4 Peter Carbonetto 2017-07-24 Re-built website after fixing MathJax settings in footer.
html e3afc60 Peter Carbonetto 2017-07-24 Re-built all the R Markdown documents using workflowr 0.7.0, and with
html 727b8d9 Peter Carbonetto 2017-07-13 Re-built all the analysis files; wflow_publish(Sys.glob("*.Rmd")).
Rmd 6d02ffc Peter Carbonetto 2017-07-13 Made a dozen or so small adjustments to the .Rmd files.
Rmd b739bf9 Peter Carbonetto 2017-07-12 Revised text in first-glance.Rmd.
html b739bf9 Peter Carbonetto 2017-07-12 Revised text in first-glance.Rmd.
html 597355d Peter Carbonetto 2017-07-07 Ran wflow_publish(c(index.Rmd,first-glance.Rmd,station-map.Rmd,time-of-day-trends.Rmd)).
Rmd f7da4f6 Peter Carbonetto 2017-07-07 Fixed a broken link, and made a bunch of small revisions to the notebooks.
html f62f674 Peter Carbonetto 2017-07-05 Re-built all the files without cached chunks.
Rmd 96f2db4 Peter Carbonetto 2017-07-05 wflow_publish(c("index.Rmd", "first-glance.Rmd", "station-map.Rmd"))
html 5a4a3bd Peter Carbonetto 2017-07-05 Another small adjustment to first-glance.Rmd.
Rmd 7d1aefc Peter Carbonetto 2017-07-05 wflow_publish("first-glance.Rmd")
html c8f1418 Peter Carbonetto 2017-07-05 Build site.
Rmd 4bb29bd Peter Carbonetto 2017-07-05 Formatting adjustments to first-glance.Rmd.
Rmd 09bb3c4 Peter Carbonetto 2017-07-05 A few adjustments to first-glance.Rmd.
html 841c429 Peter Carbonetto 2017-07-05 Updated first-glance.html.
html db8f335 Peter Carbonetto 2017-07-05 Updated first-look.html.
Rmd 5e53297 Peter Carbonetto 2017-07-05 Filled out first-glance.Rmd.
html d132d28 Peter Carbonetto 2017-07-05 Re-built first-glance.html.
Rmd bbd4aa2 Peter Carbonetto 2017-07-05 Added steps to extract dates and times from character strings in CSV files.
html bbd4aa2 Peter Carbonetto 2017-07-05 Added steps to extract dates and times from character strings in CSV files.

Here, I take a brief look at the data provided by Divvy.

I begin by loading a few packages, as well as some additional functions I wrote for this project.

library(data.table)
source("../code/functions.R")

Reading the data

I wrote a function, read.divvy.data, that reads the trip and station data from the Divvy CSV files. It uses fread from the data.table package, which is much faster than read.table. It also prepares the data, including the departure dates and times, so that they are easier to work with.

divvy <- read.divvy.data()
# Reading station data from ../data/Divvy_Stations_2016_Q4.csv.
# Reading trip data from ../data/Divvy_Trips_2016_Q1.csv.
# Reading trip data from ../data/Divvy_Trips_2016_04.csv.
# Reading trip data from ../data/Divvy_Trips_2016_05.csv.
# Reading trip data from ../data/Divvy_Trips_2016_06.csv.
# Reading trip data from ../data/Divvy_Trips_2016_Q3.csv.
# Reading trip data from ../data/Divvy_Trips_2016_Q4.csv.
# Preparing Divvy data for analysis in R.
# Converting dates and times.
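
The actual read.divvy.data function lives in code/functions.R and is not shown on this page. As a rough sketch of the kind of steps described above, the following reads one small trip file with fread and derives the departure week, day, and hour. The helper name read.trips.sketch, the toy CSV contents, the "%m/%d/%Y %H:%M" timestamp format, and the week-numbering convention are all assumptions made for illustration.

```r
library(data.table)

# Hypothetical helper (not the author's read.divvy.data): read one trip
# file with fread, then add departure week, day of week, and hour.
# The timestamp format "%m/%d/%Y %H:%M" is an assumption.
read.trips.sketch <- function (file) {
  trips <- fread(file, stringsAsFactors = FALSE)
  when  <- as.POSIXct(trips$starttime, format = "%m/%d/%Y %H:%M", tz = "UTC")
  trips[, starttime  := when]
  trips[, start.week := as.integer(format(when, "%U"))]
  trips[, start.day  := weekdays(when)]
  trips[, start.hour := as.integer(format(when, "%H"))]
  trips[]
}

# Small made-up file to exercise the helper.
tmp <- tempfile(fileext = ".csv")
writeLines(c("trip_id,starttime,from_station_name",
             "1,3/31/2016 23:53,Station A",
             "2,4/1/2016 8:05,Station B"), tmp)
trips <- read.trips.sketch(tmp)
print(trips)
```

Converting the timestamps once, at load time, means later summaries can group on start.week, start.day, or start.hour without re-parsing character strings.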

A first glance at the Divvy data

We have data on 581 Divvy stations across the city.

nrow(divvy$stations)
# [1] 581
print(head(divvy$stations),row.names = FALSE)
#                        name latitude longitude dpcapacity online_date
#         2112 W Peterson Ave 41.99118 -87.68359         15   5/12/2015
#               63rd St Beach 41.78102 -87.57612         23   4/20/2015
#           900 W Harrison St 41.87468 -87.65002         19    8/6/2013
#  Aberdeen St & Jackson Blvd 41.87773 -87.65479         15   6/21/2013
#     Aberdeen St & Monroe St 41.88042 -87.65560         19   6/26/2013
#    Ada St & Washington Blvd 41.88283 -87.66121         15  10/10/2013

We also have information about the roughly 3.6 million trips taken on Divvy bikes in 2016.

nrow(divvy$trips)
# [1] 3595383
print(head(divvy$trips),row.names = FALSE)
#  trip_id           starttime bikeid tripduration from_station_id
#  9080551 2016-03-31 23:53:00    155          841             344
#  9080550 2016-03-31 23:46:00   4831          649             128
#  9080549 2016-03-31 23:42:00   4232          210             350
#  9080548 2016-03-31 23:37:00   3464         1045             303
#  9080547 2016-03-31 23:33:00   1750          202             334
#  9080546 2016-03-31 23:31:00   4302          638              67
#              from_station_name to_station_id               to_station_name
#  Ravenswood Ave & Lawrence Ave           458      Broadway & Thorndale Ave
#        Damen Ave & Chicago Ave           213        Leavitt St & North Ave
#      Ashland Ave & Chicago Ave           210     Ashland Ave & Division St
#        Broadway & Cornelia Ave           458      Broadway & Thorndale Ave
#    Lake Shore Dr & Belmont Ave           329 Lake Shore Dr & Diversey Pkwy
#  Sheffield Ave & Fullerton Ave           304       Broadway & Waveland Ave
#    usertype gender birthyear start.week start.day start.hour
#  Subscriber   Male      1986         13  Thursday         23
#  Subscriber   Male      1980         13  Thursday         23
#  Subscriber   Male      1979         13  Thursday         23
#  Subscriber   Male      1980         13  Thursday         23
#  Subscriber   Male      1969         13  Thursday         23
#  Subscriber   Male      1991         13  Thursday         23

Out of all the Divvy stations in Chicago, the one on Navy Pier (near the corner of Streeter and Grand) had by far the most departures in 2016: about 90,000, compared with about 51,000 at the next busiest station.

departures <- table(divvy$trips$from_station_name)
as.matrix(head(sort(departures,decreasing = TRUE)))
#                               [,1]
# Streeter Dr & Grand Ave      90042
# Lake Shore Dr & Monroe St    51090
# Theater on the Lake          47927
# Clinton St & Washington Blvd 47125
# Lake Shore Dr & North Blvd   45754
# Clinton St & Madison St      41744
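
The same kind of tally can also be written with data.table's grouping syntax. This is only an alternative sketch on a toy table (the station names "A", "B", and "C" are made up); it is not the code used to produce the counts above.

```r
library(data.table)

# Toy stand-in for divvy$trips; the station names are made up.
trips <- data.table(from_station_name = c("A", "B", "A", "C", "A", "B"))

# Count departures per station (.N is the group size), then sort the
# stations from most to fewest departures.
counts <- trips[, .N, by = from_station_name][order(-N)]
print(counts)
```

Unlike table, this returns a data.table, so the counts can be joined directly to other tables such as the station data.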

Divvy bikes at the University of Chicago

I would also like to take a closer look at the trip data for the main Divvy station on the University of Chicago campus, at University Ave & 57th St. Divvy bikes were taken out from that station 7,944 times in 2016.

sum(divvy$trips$from_station_name == "University Ave & 57th St",na.rm = TRUE)
# [1] 7944

These are the versions of R and the packages that were used to generate these results.


sessionInfo()
# R version 3.6.2 (2019-12-12)
# Platform: x86_64-apple-darwin15.6.0 (64-bit)
# Running under: macOS Catalina 10.15.5
# 
# Matrix products: default
# BLAS:   /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
# LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
# 
# locale:
# [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
# 
# attached base packages:
# [1] stats     graphics  grDevices utils     datasets  methods   base     
# 
# other attached packages:
# [1] data.table_1.12.8
# 
# loaded via a namespace (and not attached):
#  [1] workflowr_1.6.2 Rcpp_1.0.3      rprojroot_1.3-2 digest_0.6.23  
#  [5] later_1.0.0     R6_2.4.1        backports_1.1.5 git2r_0.26.1   
#  [9] magrittr_1.5    evaluate_0.14   stringi_1.4.3   rlang_0.4.5    
# [13] fs_1.3.1        promises_1.1.0  whisker_0.4     rmarkdown_2.0  
# [17] tools_3.6.2     stringr_1.4.0   glue_1.3.1      httpuv_1.5.2   
# [21] xfun_0.11       yaml_2.2.0      compiler_3.6.2  htmltools_0.4.0
# [25] knitr_1.26