Schedule and Materials

There are two ways to progress through this material, depending upon how much time you have to devote to it each week.

  1. If you are teaching graduate students, or are self-paced with enough time to devote, I recommend the schedule below, which is designed for an intense but doable semester-long course.  It is intended to take the average graduate student roughly 10 hours per week to complete all required tasks.  However, some students will find programming more challenging and may need up to 15 hours per week, while others will breeze through the material in 5.
  2. If you are teaching undergraduate students, or are self-paced with less free time, you can take a more relaxed pace by alternating weeks: in the first week of each pair, complete the DataCamp materials, and in the second week, complete the project.  If you are teaching undergraduates for a single semester, I suggest taking this approach but skipping modules 6 and 10-13.

You can download a sample graduate-level syllabus here, which I used when I taught this course in Fall 2017.

If you are designing your own data science course using these materials, I recommend that students accomplish four objectives in each week of the course:

  1. Complete the assigned DataCamp modules.
  2. Listen to the lecture materials tying DataCamp modules to social scientific applications and highlighting critical new skills.
  3. Complete the practical project.
  4. Participate in a debriefing that explains how the DataCamp and lecture materials come together in the code written to meet the project requirements.  A sample debriefing is provided for self-paced or asynchronous online courses, but this is best done live if possible.

Weeks 1 – 6 cover fundamental R programming.  Weeks 7 – 9 cover traditional social scientific statistical analyses and visualization.  Weeks 10 – 14 cover new data science skills.

Additional videos will be released four at a time through Spring 2018 until the complete course is available.

You can view video content by either watching the entire set of videos currently available or by clicking on individual video links below.  Be sure to watch in high definition (HD; 720p or 1080p) to see text clearly.

Each module lists its topic, DataCamp courses and readings, lecture materials, project, and debriefing.

Module 1: Introduction and Software

  • DataCamp and Readings: Reading: You Say Data, I Say System
  • Lecture: Module 1 Video; Module 1a PDF
  • Project: None
  • Debriefing: None

Module 2: Data Types and Basic Variable Manipulation

  • DataCamp and Readings: Course: Introduction to R; Reading: Designing Projects
  • Lecture: Module 2 Lecture Video; Module 2 PDF
  • Project: Module 2 Project; Data for Module 2
  • Debriefing: Module 2 Debriefing Video

Module 3: Conditionals, Loops, and Apply

  • DataCamp and Readings: Course: Intermediate R; Course: Intermediate R – Practice; Reading: Using apply, sapply, lapply in R
  • Lecture: Module 3 Lecture Video; Module 3 PDF
  • Project: Module 3 PDF; Data file for Module 3
  • Debriefing: Module 3 Debriefing Video

Module 4: Data Import and Formatting

  • DataCamp and Readings: Course: Importing Data in R, Part 1 (modules: Importing data; readr & data.table); Course: Cleaning Data in R; Reading: data.table vs dplyr
  • Lecture: Module 4 Lecture Video; Module 4 PDF
  • Project: Module 4 PDF; Data file for Module 4
  • Debriefing: Module 4 Debriefing Video

Module 5: Data Manipulation

  • DataCamp and Readings: Course: Data Manipulation in R with dplyr; Course: Joining Data in R with dplyr (modules: Mutating joins; Filtering joins and set operations; Assembling data)
  • Lecture: Module 5 Lecture Video; Module 5 PDF
  • Project: Module 5 PDF; Data files for Module 5
  • Debriefing: Module 5 Debriefing Video

Module 6: String Manipulation

  • DataCamp and Readings: Course: String Manipulation with stringr; Course: RegexOne (not DataCamp); Reading: Demystifying RegEx
  • Lecture: Module 6 Lecture Video; Module 6 PDF
  • Project: Module 6 PDF; Data file for Module 6
  • Debriefing: Module 6 Debriefing Video

Module 7: Data Visualization

  • DataCamp and Readings: Courses: Data Visualization with ggplot2:; Reading: Data Visualizations That Will Blow Your Mind
  • Lecture: Module 7 Lecture Video; Module 7 PDF
  • Project: Module 7 PDF
  • Debriefing: Module 7 Debriefing Video

Module 8: Analysis of Variance

  • DataCamp and Readings: Course: Analysis of Variance (ANOVA); Course: Repeated-measures ANOVA; Reading: Quick-R ANOVA Guide; Reading: Personality Project ANOVA Guide
  • Lecture: Module 8 Lecture Video; Module 8 PDF
  • Project: Module 8 PDF
  • Debriefing: Module 8 Debriefing Video

Module 9: General and Generalized Linear Model

  • DataCamp and Readings: Course: Correlation and Regression; Course: Multiple and Logistic Regression; Reading: Why ANOVA and Linear Regression are the Same
  • Lecture: Module 9 PDF
  • Project: Module 9 PDF
  • Debriefing: Coming Mar 15

Module 10: Generating Reports and Web Apps

  • DataCamp and Readings: Course: R Markdown; Reading: Advantages of Using R Notebooks for Data Analysis
  • Lecture: Module 10 PDF
  • Project: Module 10 PDF
  • Debriefing: Coming Mar 15

Module 11: Web Scraping and APIs

  • DataCamp and Readings: Course: Working with Web Data in R; Reading: Internet Scraping for Research
  • Lecture: Module 11 PDF
  • Project: Module 11 PDF
  • Debriefing: Coming Mar 15

Module 12: Machine Learning

  • DataCamp and Readings: Course: Machine Learning Toolbox; Reading: Machine Learning vs. Statistics; Reading: Statistics vs. Machine Learning: Fight!
  • Lecture: Module 12 PDF
  • Project: See Project 13
  • Debriefing: See Project 13

Module 13: Natural Language Processing

  • DataCamp and Readings: Course: Text Mining: Bag of Words; Course: Sentiment Analysis in R: The Tidy Way module:; Reading: Words as Vectors
  • Lecture: Module 13 PDF
  • Project: Module 13 PDF
  • Debriefing: Coming Mar 15

You can also download a sample final exam here, which is intended to take 2-3 hours, and a sample final project, which is intended to take a week.

Would you like to be emailed when new videos are made available?  If so, please submit your email address here: