This class meets Wednesday 6:30 p.m. to 9:00 p.m. (weeks 1-7 only).
- Your instructor is Hilary Mason
- Contact email: me at hilarymason dot com (or hmason at nyu dot edu)
- GitHub repo: https://github.com/hmason/datastorytelling
This short course explores the line between data analysis and storytelling. How do we find the interesting stories in data, and how do we communicate them in a compelling way that stays faithful to the data?
We'll be using GitHub to collaborate and share work during this course. GitHub provides wonderful tools while also reinforcing good software engineering practices.
We have seven short weeks to explore!
week 0 - Introduction and Philosophy
- philosophy of data science
- IPython, GitHub, and a simple analysis
- homework: get everything installed, work through the bridges and tunnels analysis, and commit it to GitHub
week 1 - Tools and process
- is this graph misleading?
- software engineering process and tools -- how to submit pull requests (and why you would ever want to)
- a few command-line utilities -- grep, awk, wc
- a very simple intro to statistics using NYC restaurant health inspection data
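
As a rough preview of the statistics topic above, here is a minimal sketch of the kind of summary we might compute first. The scores are made up for illustration (they are not real NYC inspection data), and it assumes a scoring scheme where lower is better.

```python
import statistics

# Hypothetical inspection scores, invented for this example (not real NYC data).
# Assumption: lower scores are better.
scores = [7, 12, 9, 27, 13, 10, 45, 11, 8, 12]

mean_score = statistics.mean(scores)      # sensitive to the two large outliers
median_score = statistics.median(scores)  # middle of the sorted values
stdev_score = statistics.stdev(scores)    # sample standard deviation

print(f"mean={mean_score:.1f} median={median_score} stdev={stdev_score:.1f}")
```

The mean lands well above the median here because two large scores pull it up. Which number you report changes the story, and that ties back to the "is this graph misleading?" discussion.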
week 2 - Acquiring data and process
- Guest speaker: Mike Dewar, New York Times R&D
- Let's check out this animated GIF on data viz
- ...and this visualization of NY Fashion Week
- we'll review the first two projects and make sure everyone can commit their code
- let's talk about acquiring data when it's not packaged up so neatly
week 3 - Markov Chains and art
- Guest speaker: Max Shron, Data Strategist
- Sad AI or conceptual art? Let's talk about @horse_ebooks
- We're going to explore probability and a few probabilistic data structures
- Final project discussion!
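
To make the Markov chain idea concrete, here is a small sketch of a word-level chain that generates text in the spirit of @horse_ebooks. The function names `build_chain` and `generate` and the tiny corpus are invented for this example. It is one simple way to do this, not necessarily the approach we'll use in class.

```python
import random
from collections import defaultdict


def build_chain(text):
    """Map each word to the list of words observed directly after it."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain


def generate(chain, start, length, seed=None):
    """Walk the chain from `start`, choosing each next word at random."""
    rng = random.Random(seed)
    word = start
    output = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:  # dead end: this word was never followed by anything
            break
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)


# Toy corpus, invented for illustration.
corpus = "the data tells a story and the story tells the data"
chain = build_chain(corpus)
print(generate(chain, "the", 8, seed=1))
```

Because repeated followers are stored more than once, `rng.choice` picks each next word in proportion to how often it followed the current word in the corpus.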
week 4 - Algorithms for Data Classification
- Guest speaker: Gilad Lotan, Chief Scientist at Betaworks
- Bayes' law and the naive Bayes classifier
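
For a feel of how a naive Bayes classifier works, here is a from-scratch sketch for short texts. The `train` and `predict` helpers, the labels, and the four training examples are all made up for illustration. It assumes whitespace tokenization and add-one (Laplace) smoothing, and it ignores words never seen in training.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Count documents per label and words per label from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log posterior for `text`."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label, doc_count in label_counts.items():
        score = math.log(doc_count / total_docs)  # log prior P(label)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:  # skip words never seen in training
                continue
            count = word_counts[label][word] + 1  # add-one smoothing
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training data, invented for this example.
examples = [
    ("win a free prize now", "spam"),
    ("free money win big", "spam"),
    ("meeting notes for class", "ham"),
    ("homework due before class", "ham"),
]
model = train(examples)
print(predict(model, "free prize"))
print(predict(model, "homework for class"))
```

The "naive" part is the assumption that words are independent given the label, which lets us sum per-word log probabilities instead of modeling whole sentences.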
week 5 - Data Product Design
week 6 - Final project presentations