# Read in the data
Data Reshaping with Puppies
Note: This assignment must be submitted in github classroom.
Setting up
In this assignment, we’ll work with American Kennel Club data on dog breeds.
Read in the CSV files provided:
This assignment will ask you to sketch the layout of various datasets. If you want, you can use paper/pencil sketches, but the images should be provided in PNG or JPEG format and included in your document as images. If you prefer a digital tool, Excalidraw.com is an excellent (free) option.
1 Breed Traits
Is the breed_traits.csv
file in tidy form? Why or why not?
If you wanted to plot the distribution of rankings for each trait, with each trait’s distribution in a separate facet, what form would you need to use for the data?
Sketch the form the data would look like to create the necessary plot, and include the image of that sketch here.
What transformations are necessary for the data to be converted into this form? Include a sketch of the transformation process.
Transform the data and generate a plot showing the distribution of breed rankings for the variables Affectionate With Family
, Good with Young Children
, Good With Other Dogs
, Playfulness Level
, and Trainability Level
. Each variable should have its own facet.
# Transformation code here
# Plot code here
2 Breed Ranks
Is the breed_rank_all.csv
file in tidy form? Why or why not?
If you wanted to plot the popularity of Beagles and Jack Russell Terriers between 2013 and 2020, what form would you need the data to be in?
Sketch the form the data would look like to create the necessary plot, and include the image of that sketch here.
What transformations are necessary for the data to be converted into this form? Include a sketch of the transformation process.
Choose two dog breeds that you are interested in and generate a line chart showing the breeds’ relative popularity between 2013 and 2020. Breed should be indicated by color as well as linetype (so that your plot is double encoded).
# Transformation code here
# Plot code here