Project: Scraping Public Data Screencast

Author

Your Name

Project Description

For your class project (which will take the place of the midterm exam), you will be recording a screencast in the style of David Robinson’s TidyTuesday screencasts.

You can find time-stamped, catalogued versions of some of David Robinson’s screencasts here.

Requirements:

  • Your screencast should be approximately 45 minutes long.

  • Your screencast should begin by identifying a source of public data and explaining how to scrape it or assemble it from API calls. The data can be generated by local, state, or national governments, nonprofit organizations, or advocacy organizations, but should not come from any for-profit entity.

  • You should showcase at least 4 different techniques you’ve learned in Stat 351. Some examples include:

    • web scraping or API use
    • efficient/“polite” techniques for acquiring data
    • functional programming
    • list processing
    • use of appropriate and well-chosen graphics
    • interactive graphics

Unlike David Robinson’s screencasts, you will write a pseudocode “script” before you start recording. This will give you a rough outline of how to do the analysis and which topics you intend to cover.

Your goal is to help a future Stat 351 student understand some of the topics covered in this class. So while David Robinson and others who record their screencasts live might not fully explain what they’re doing, you should take the time to explain each technique you use in a way that will help someone else understand it.

There will be three deliverables for this project:

  1. Plan your dataset and topics
  2. Pseudocode script uploaded to GitHub repository
  3. Screencast + GitHub repository
    • Screencast uploaded to YouTube/YuJa
    • Approximate time index provided for each of the 4 techniques you’re demonstrating (examples)
    • Code uploaded to GitHub repository

In lieu of a midterm exam, you will peer review a selection of your classmates’ screencasts.