There are two ways to progress through this material, depending upon how much time you have to devote to it each week.
- If you are self-paced with at least 10 hours a week to devote to learning R, or if you are teaching graduate students, I’d recommend adopting the schedule below, which is designed for an intense but doable semester-long course, one module per week. It is intended to take the average graduate student roughly 10 hours per week to complete all required tasks. However, some students will find programming more challenging and may need up to 15 hours per week, while others will breeze through the material in 5.
- If you are self-paced with limited free time or teaching undergraduate students, you can take a more relaxed pace by alternating weeks: in the first week of each pair, complete the DataCamp materials, and in the second week, complete the project. If you are teaching undergraduates for a single semester, I suggest taking this approach but skipping modules 6 and 10-13.
Modules 1 – 6 are fundamental R programming. Modules 7 – 9 are traditional social scientific statistical analyses and visualization. Modules 10 – 14 are new data science skills.
If you’d like to be kept in the loop on updates to this material, including the release of an advanced course, please consider joining my lab’s Google group/email newsletter.
You can view video content either by watching the entire set of videos currently available or by clicking on individual video links below. Be sure to watch in high definition (HD; 720p or 1080p) to see text clearly.
Module | Topic | DataCamp and Readings | Lecture | Project | Debriefing |
---|---|---|---|---|---|
1 | Introduction and Software | Reading: You Say Data, I Say System | None | None | |
2 | Data Types and Basic Variable Manipulation | ||||
3 | Conditionals, Loops, and Apply | Course: Intermediate R; Course: Intermediate R – Practice | | | |
4 | Data Import and Formatting | Course: Importing Data in R, Part 1 modules: | | | |
5 | Data Manipulation | Course: Data Manipulation with dplyr in R; Course: Joining Data with dplyr in R | | | |
6 | String Manipulation | Course: String Manipulation in R with stringr; Course: RegexOne (not DataCamp) | | | |
7 | Data Visualization | Courses: Data Visualization with ggplot2: | | | |
8 | Analysis of Variance | Course: Analysis of Variance (ANOVA) | | | |
9 | General and Generalized Linear Model | Course: Correlation and Regression | | | |
10 | Generating Reports and Web Apps | Course: Communicating with Data in the Tidyverse: | | | |
11 | Web Scraping and APIs | | | | |
12 | Machine Learning | Course: Machine Learning Toolbox; Reading: Machine Learning vs. Statistics; Reading: Statistics vs. Machine Learning: Fight!; Reading: The Actual Difference Between Statistics and Machine Learning | Participate in a Kaggle Competition | | |
13 | Natural Language Processing | Course: Text Mining: Bag of Words; Course: Sentiment Analysis in R: The Tidy Way module: | Coming Next Course Refresh | | |
You can also download a sample final exam here, which is intended to take 2-3 hours, and a sample final project, which is intended to take a week.
Once you have completed the course, if you still have your DataCamp subscription, you might also consider taking their course on Python for R Users to learn about the differences and similarities between the two programming languages most commonly used for data science.