UNL R Workshops

Data Wrangling

This workshop will prepare you for dealing with messy data by walking you through real-life examples. We will work on improving your programming skills and help you move beyond copy-and-paste. We will discuss how to write functions to reduce duplication in your code and automate common tasks, and how to use iteration to reduce duplication further. You will leave with skills that let you tackle data problems with more ease.
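
As a rough illustration of what replacing copy-and-paste with a function plus iteration can look like, here is a minimal base R sketch. The helper `rescale01` and the toy data frame `scores` are hypothetical, invented for this example; they are not taken from the workshop materials.

```r
# Hypothetical helper: rescale a numeric vector to the range [0, 1].
rescale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

# Toy data invented for this illustration.
scores <- data.frame(
  a = c(1, 5, 9),
  b = c(10, 20, 40),
  c = c(0, 2, 4)
)

# Copy-and-paste version: one line per column, easy to get wrong.
# scores$a <- rescale01(scores$a)
# scores$b <- rescale01(scores$b)
# scores$c <- rescale01(scores$c)

# Iteration: apply the same function to every column in one call.
scaled <- as.data.frame(lapply(scores, rescale01))
print(scaled)
```

Writing the rescaling logic once means a later change, such as different handling of missing values, only has to be made in one place.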

The course will be data-centric, using many different data sets to illustrate the techniques suited to different problems.

Timetable

Time | Topic | Description and Resources
9:00 - 9:15 | Introduction | Reading in basic file types: .xls, .csv, .txt, .xport and more; general functions: filter, join, …
9:15 - 10:05 | Reading Files | Excel files vs. text, data organization; 2-Files.R, midwest.csv, midwest.xls
10:05 - 10:30 | Break |
10:30 - 12:15 | Summarizing with dplyr | Pipe operator and dplyr verbs; 3-dplyr.R, pitch.csv
12:15 - 1:15 | Lunch Break (on your own) |
1:15 - 2:45 | Tidy Data | Restructuring data with pivot_wider, pivot_longer, and separate; 4-tidyr.R, frenchfries.csv, billboard.csv, flights.csv, occupation-1870.csv
2:45 - 3:00 | Break |
3:00 - 4:00 | Joining Data | Joining data frames together using SQL-based logic; 5-joining.R, boxoffice.csv, baseball.csv
3:55 - 4:00 | Evaluation | Help us make the workshops better!
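
For a sense of how the dplyr, tidyr, and joining sessions fit together, here is a minimal sketch. It assumes the dplyr and tidyr packages are installed; the data frames `ratings` and `groups` are toy data invented for this example and are not among the workshop files.

```r
library(dplyr)
library(tidyr)

# Toy data invented for this illustration (not one of the workshop files).
ratings <- data.frame(
  subject = c(1, 1, 2, 2),
  week    = c(1, 2, 1, 2),
  crisp   = c(5.5, 6.0, 7.0, 6.5),
  salty   = c(2.0, 2.5, 3.0, 3.5)
)
groups <- data.frame(
  subject = c(1, 2),
  oil     = c("A", "B")
)

# pivot_longer: one row per subject, week, and rating scale.
long <- ratings %>%
  pivot_longer(cols = c(crisp, salty), names_to = "scale", values_to = "rating")

# dplyr verbs chained with the pipe: filter rows, group, then summarise.
means <- long %>%
  filter(week == 1) %>%
  group_by(scale) %>%
  summarise(mean_rating = mean(rating), .groups = "drop")

# pivot_wider: back to one column per rating scale.
wide <- long %>%
  pivot_wider(names_from = scale, values_from = rating)

# left_join: attach subject-level information using the shared key column.
joined <- left_join(wide, groups, by = "subject")
print(joined)
```

The long format is usually the convenient shape for grouped summaries, while the wide format is often easier to read or to join against other tables.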

Your Turn Solutions

  • Your Turn Solutions

Useful Links

  • The Split-Apply-Combine Strategy for Data Analysis, Journal of Statistical Software, 2011
  • Overview of base apply functions
  • Dplyr and Tidyr Cheat Sheet