What are the critical components of data documentation?
- Who collected the data
- Why the data was collected
- What the data is about
- When the data was collected
- Where the data was collected
- How the data was generated/collected
- Structure of the data
- Formatting decisions in the data
- Data validation/quality control
- How the data can be reused/license
- Suggested data analysis methods
- Measurement instruments used
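One way to make this checklist actionable is to keep it as a small record and ask which components are still blank. The sketch below is only an illustration: the `DatasetDoc` class, its field names, and the example values are hypothetical and are not taken from any particular documentation standard.

```python
from dataclasses import dataclass, fields


@dataclass
class DatasetDoc:
    """Hypothetical record with one text field per documentation component."""
    who: str = ""
    why: str = ""
    what: str = ""
    when: str = ""
    where: str = ""
    how: str = ""
    structure: str = ""
    formatting: str = ""
    quality_control: str = ""
    license: str = ""
    suggested_analysis: str = ""
    instruments: str = ""


def missing_components(doc):
    """Return the names of components that are empty or whitespace only."""
    return [f.name for f in fields(doc) if not getattr(doc, f.name).strip()]


# Illustrative values only: two components filled in, the rest left blank.
doc = DatasetDoc(who="Example Lab", what="Example measurements")
print(missing_components(doc))
```

Running a check like this before sharing a dataset gives a quick list of the questions a future user will not be able to answer.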
Documentation is Project Dependent
Project 1: Building a Shoe Print Wear Database
- 150 pairs of shoes
- 2 brands of shoes
- several sizes for each brand
- step counters used on the shoes
- questionnaires measuring activities
- wearer weight/height/gait
Shoe Measurements
Initial measurement period + 2-3 additional measurement periods (~6 weeks between)
- Photos of shoe soles
- Digital shoe sole prints
- Powder prints
- Film
- Paper
- Vinyl flooring
- 3D scans of shoe soles
Measurements taken in the lab by research assistants.
Important Documentation?
- We probably should have recorded which research assistant wore each shoe, along with their weight, height, gait, and so on.
Whoops.
Documentation is Project Dependent
Project 2: Wire cuts
Goal is to estimate the length of sharp surfaces on all wire-cutting tools in people's homes
General survey with instructions for measuring each type of tool
Collected data is a list of tool types, number of blades, number of cutting surfaces, and number of tools of that type
Estimates are generated by adding up the total length of cutting surfaces across all tools
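The arithmetic behind such an estimate can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual method: the column names, the example rows, and especially the `surface_length_cm` column are hypothetical, and the sketch assumes that the number of cutting surfaces is recorded per tool and that every surface on a tool has the same length.

```python
# Hypothetical survey rows; names and values are illustrative only.
survey = [
    {"tool_type": "wire cutters", "n_tools": 2,
     "n_cutting_surfaces": 2, "surface_length_cm": 1.5},
    {"tool_type": "pliers", "n_tools": 1,
     "n_cutting_surfaces": 2, "surface_length_cm": 1.0},
]


def total_cutting_length(rows):
    """Sum tools x surfaces per tool x length per surface over all rows.

    Assumes n_cutting_surfaces is counted per tool (not per blade) and
    surface_length_cm is the length of one cutting surface.
    """
    return sum(
        r["n_tools"] * r["n_cutting_surfaces"] * r["surface_length_cm"]
        for r in rows
    )


print(total_cutting_length(survey))
```

If the documentation does not say whether cutting surfaces are counted per tool or per blade, the same rows can yield different totals, which is why that decision belongs in the documentation.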
Codebooks
Basic documentation that contains:
- Variable name in the code
- Long-form description of what was measured
- Units of measurement
- Acceptable values
- Values used to indicate missingness, refusal to respond, etc.
- Additional notes that may be relevant
Very common for government data - CDC codebooks are intense.
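A codebook entry with the fields listed above can be sketched as a plain dictionary, with a small helper that classifies a raw value against it. Everything here is hypothetical and for illustration only: the variable name `wearer_weight_kg`, the acceptable range, and the missingness codes `-99` and `-98` are assumptions, not values from a real codebook.

```python
# Hypothetical codebook with a single entry; all values are illustrative.
CODEBOOK = {
    "wearer_weight_kg": {
        "description": "Weight of the person wearing the shoe",
        "units": "kg",
        "min": 30,    # assumed lower bound of acceptable values
        "max": 200,   # assumed upper bound of acceptable values
        "missing_codes": {-99: "missing", -98: "refused"},
        "notes": "Example entry only",
    },
}


def check_value(variable, value):
    """Classify a raw value using the codebook entry for `variable`."""
    entry = CODEBOOK[variable]
    # Special codes are checked first so they are not flagged as out of range.
    if value in entry["missing_codes"]:
        return entry["missing_codes"][value]
    if entry["min"] <= value <= entry["max"]:
        return "valid"
    return "out of range"


for raw in (72.5, -99, 500):
    print(raw, check_value("wearer_weight_kg", raw))
```

Keeping the codebook in a machine-readable form like this lets the same file serve as documentation and as input to validation checks.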
Data Documentation Influences Analysis
- Experimental design
- Randomization
- Sampling strategy
- Random effects
- Transformations of collected data
- Sources of measurement error
Data Documentation
"Documentation is a love letter that you write to your future self."
- Damian Conway