Data Reshaping with Puppies

Author

Homework

Published

Due: March 31, 2023

Note: This assignment must be submitted through GitHub Classroom.

Setting up

In this assignment, we’ll work with American Kennel Club data on dog breeds.

Data information

Read in the CSV files provided:

# Read in the data
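As a starting point, here is a minimal sketch of loading a CSV into a list of row dictionaries using only the Python standard library. The inline text stands in for breed_traits.csv so that the snippet runs anywhere; its breed names, column subset, and values are made up for illustration and are assumptions, not the contents of the real file.

```python
import csv
import io


def read_rows(handle):
    """Return every row of an open CSV file as a dict keyed by column name."""
    return list(csv.DictReader(handle))


# Hypothetical stand-in for the real file. For the assignment, pass
# open("breed_traits.csv", newline="", encoding="utf-8") instead.
sample = io.StringIO(
    "Breed,Playfulness Level,Trainability Level\n"
    "Breed A,5,4\n"
    "Breed B,3,5\n"
)

traits = read_rows(sample)
print(traits[0])
# {'Breed': 'Breed A', 'Playfulness Level': '5', 'Trainability Level': '4'}
```

Note that csv.DictReader returns every value as a string, so numeric columns need an explicit conversion before plotting.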

This assignment asks you to sketch the layout of several datasets. Paper-and-pencil sketches are fine, but each one must be saved as a PNG or JPEG file and embedded in your document as an image. If you prefer a digital tool, Excalidraw.com is an excellent (free) option.

1 Breed Traits

Is the breed_traits.csv file in tidy form? Why or why not?

If you wanted to plot the distribution of rankings for each trait, with each trait’s distribution in a separate facet, what form would you need to use for the data?

Sketch the form the data would need to take to create this plot, and include an image of that sketch here.

What transformations are necessary for the data to be converted into this form? Include a sketch of the transformation process.
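To make the idea of a wide-to-long transformation concrete, the sketch below reshapes a tiny, made-up table by hand. The helper name pivot_longer, the output column names trait and rating, and the two placeholder breeds are all assumptions chosen for illustration; the ratings are not real AKC values.

```python
def pivot_longer(rows, id_col, value_cols, names_to="trait", values_to="rating"):
    """Turn one wide row per breed into one long row per (breed, trait) pair."""
    long_rows = []
    for row in rows:
        for col in value_cols:
            long_rows.append(
                {
                    id_col: row[id_col],       # identifier repeated on every long row
                    names_to: col,             # former column name becomes a value
                    values_to: int(row[col]),  # former cell becomes the measurement
                }
            )
    return long_rows


# Made-up wide data: one row per breed, one column per trait.
wide_rows = [
    {"Breed": "Breed A", "Playfulness Level": "5", "Trainability Level": "4"},
    {"Breed": "Breed B", "Playfulness Level": "3", "Trainability Level": "5"},
]

long_rows = pivot_longer(
    wide_rows, "Breed", ["Playfulness Level", "Trainability Level"]
)
for row in long_rows:
    print(row)
```

Two wide rows with two trait columns become four long rows. If you work in pandas, DataFrame.melt with id_vars, var_name, and value_name performs the same kind of reshaping; in R, tidyr's pivot_longer plays that role.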

Transform the data and generate a plot showing the distribution of breed rankings for the variables Affectionate With Family, Good With Young Children, Good With Other Dogs, Playfulness Level, and Trainability Level. Each variable should have its own facet.

# Transformation code here
# Plot code here
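Before plotting, it can help to check what each facet should display. The sketch below tallies how many breeds received each rating within each trait, which is the quantity a faceted bar chart or histogram would show. It assumes the data is already in a long layout with hypothetical column names trait and rating, and the rows are invented for illustration.

```python
from collections import Counter, defaultdict


def rating_counts_by_trait(long_rows):
    """Count how often each rating occurs within each trait (one Counter per facet)."""
    counts = defaultdict(Counter)
    for row in long_rows:
        counts[row["trait"]][row["rating"]] += 1
    return counts


# Made-up long-format rows: one row per (breed, trait) pair.
example_rows = [
    {"Breed": "Breed A", "trait": "Playfulness Level", "rating": 5},
    {"Breed": "Breed B", "trait": "Playfulness Level", "rating": 3},
    {"Breed": "Breed C", "trait": "Playfulness Level", "rating": 5},
    {"Breed": "Breed A", "trait": "Trainability Level", "rating": 4},
]

counts = rating_counts_by_trait(example_rows)
for trait, tally in counts.items():
    print(trait, dict(sorted(tally.items())))
# Playfulness Level {3: 1, 5: 2}
# Trainability Level {4: 1}
```

Each key of counts corresponds to one facet, and each tally gives the bar heights within that facet.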

2 Breed Ranks

Is the breed_rank_all.csv file in tidy form? Why or why not?

If you wanted to plot the popularity of Beagles and Jack Russell Terriers between 2013 and 2020, what form would you need the data to be in?

Sketch the form the data would need to take to create this plot, and include an image of that sketch here.

What transformations are necessary for the data to be converted into this form? Include a sketch of the transformation process.
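One way to picture this kind of transformation is sketched below on invented data. It assumes, without having checked the real file, that each year is stored in its own column named like "2013 Rank"; the suffix, the placeholder breed names, and the rank values are all assumptions. The helper keeps only the requested breeds and turns each year column into its own row with a numeric year.

```python
def ranks_to_long(rows, breeds, suffix=" Rank"):
    """Return one {Breed, year, rank} record per selected breed and year column."""
    records = []
    for row in rows:
        if row["Breed"] not in breeds:
            continue  # keep only the breeds of interest
        for col, value in row.items():
            # Assumed column pattern "<year><suffix>"; blank cells are skipped.
            if col.endswith(suffix) and value:
                year = int(col[: -len(suffix)])
                records.append(
                    {"Breed": row["Breed"], "year": year, "rank": int(value)}
                )
    return sorted(records, key=lambda r: (r["Breed"], r["year"]))


# Made-up wide data: one row per breed, one column per year.
wide_ranks = [
    {"Breed": "Breed A", "2013 Rank": "4", "2014 Rank": "5"},
    {"Breed": "Breed B", "2013 Rank": "10", "2014 Rank": "8"},
    {"Breed": "Breed C", "2013 Rank": "1", "2014 Rank": "1"},
]

long_ranks = ranks_to_long(wide_ranks, {"Breed A", "Breed B"})
for record in long_ranks:
    print(record)
```

With year as a numeric column and breed as a grouping column, a line chart can map year to the x-axis, rank to the y-axis, and breed to color and line type.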

Choose two dog breeds that interest you and generate a line chart showing their relative popularity between 2013 and 2020. Indicate breed by both color and line type so that the plot is double encoded.

# Transformation code here
# Plot code here