Note: This assignment must be submitted in GitHub Classroom.
This week’s assignment uses data from Tidy Tuesday (link) and relates to food consumption and CO2 emissions.
# Credit to Kasia and minorly edited to create output file and test plot
# Blog post at https://r-tastic.co.uk/post/from-messy-to-tidy/
library(rvest)
library(dplyr)
url <- "https://web.archive.org/web/20191224072125/https://www.nu3.de/blogs/nutrition/food-carbon-footprint-index-2018"
# scrape the website
url_html <- read_html(url)
# extract the HTML table
whole_table <- url_html %>%
  html_nodes('table') %>%
  html_table(fill = TRUE) %>%
  .[[2]]
The code above reads the data in from the Wayback Machine’s archived version of the original webpage and gets it into tabular form.
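Before describing a scraped table, it usually helps to check its dimensions, column names, and column types. The sketch below does this on a small invented data frame; `toy_table` and all of its values are hypothetical stand-ins, not the assignment data, and the same calls can be pointed at the scraped object instead.

```r
# Hypothetical stand-in for a scraped table (invented values): the header text
# has landed in the first data row and every column was read as character.
toy_table <- data.frame(
  X1 = c("Country", "A", "B"),
  X2 = c("Beef consumption", "10.5", "3.2"),
  X3 = c("Beef CO2", "324.1", "98.7"),
  stringsAsFactors = FALSE
)

dim(toy_table)    # number of rows and columns
names(toy_table)  # column names as parsed
str(toy_table)    # column types (all character here)
head(toy_table)   # first few rows
```

Header rows stored as data and numbers stored as text, as in this toy example, are typical signs that a table is not yet tidy.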
Your job is to complete the following tasks:
1. Describe the state of the data set, table_content.
    - What are the variables in the data set?
        - var1
        - var2
        - (add more as necessary)
    - Is it in tidy form? What principles of tidy data does this violate?

      Your answer here
    - What steps do you need to take to get it into tidy form?
        - (add more steps as necessary)
    - Sketch out what the final (tidy) data set will look like. You can use markdown table syntax or a picture here, but if you use a picture, upload it to Imgur and include the image link in this document USING PROPER MARKDOWN SYNTAX.
2. Write R or Python code for each step in the process you identified in #1. Show what the data looks like at each step using head(). Each step should be in a different code chunk.
3. For each food type (you may have to remove total values), plot the relationship between Carbon output and Consumption (use facets to get separate plots for each type of food). What do you notice for each plot? If you want to reduce carbon emissions, what foods should you eat less of?
4. Look at the plot above again. Do you have any concerns about the data? The data source?
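As a rough illustration of the reshape-then-facet workflow the tasks above describe, here is a minimal sketch on invented toy data. It is not a solution to the assignment: `toy_wide`, its column names, and its values are all hypothetical, and the real table will likely need different cleaning steps first. It assumes the tidyr, dplyr, and ggplot2 packages are installed.

```r
library(tidyr)
library(dplyr)
library(ggplot2)

# Invented toy data (NOT the assignment data): one row per country, with a
# consumption column and a CO2 column for each food stored side by side.
toy_wide <- data.frame(
  country = c("A", "B", "C"),
  beef_consumption = c(10, 20, 30),
  beef_co2 = c(300, 600, 900),
  rice_consumption = c(50, 40, 30),
  rice_co2 = c(60, 50, 40)
)

# Step 1: stack all measurement columns, splitting each column name at "_"
# into a food name and a measure name.
toy_long <- toy_wide %>%
  pivot_longer(
    cols = -country,
    names_to = c("food", "measure"),
    names_sep = "_",
    values_to = "value"
  )
head(toy_long)

# Step 2: spread the two measures back out so each variable has its own
# column and each row is one country-food combination.
toy_tidy <- toy_long %>%
  pivot_wider(names_from = measure, values_from = value)
head(toy_tidy)

# Step 3: one panel per food, with free scales because foods can differ
# widely in both consumption and CO2.
p <- ggplot(toy_tidy, aes(x = consumption, y = co2)) +
  geom_point() +
  facet_wrap(~food, scales = "free")
p
```

Calling `head()` after each step mirrors the requirement to show the data at every stage; in a real document each step would sit in its own code chunk.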