Day 07

Tuesday, Feb 21, 2017

CSV Exercise (recap)

CSV1. Make a plot of all the abandoned buildings in Chicago using csv.DictReader() to access the CSV file you download from here.

abandoned_plot1.png abandoned_plot2.png
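
As a reminder of how csv.DictReader() behaves, here is a minimal sketch. The sample data and the column names LATITUDE and LONGITUDE are assumptions made for illustration; check the header row of the file you actually downloaded.

```python
import csv
import io

# Hypothetical sample standing in for the downloaded file; the real
# column names may differ, so check the header row first.
SAMPLE_CSV = """ADDRESS,LATITUDE,LONGITUDE
100 W EXAMPLE ST,41.88,-87.63
200 N SAMPLE AVE,41.90,-87.65
300 S MISSING RD,,
"""

def read_coordinates(csv_file, lat_key="LATITUDE", lon_key="LONGITUDE"):
    """Return parallel lists of longitudes and latitudes, skipping blank rows."""
    lons, lats = [], []
    for row in csv.DictReader(csv_file):
        # Each row is a dict keyed by the header line; all values are strings.
        if row[lat_key] and row[lon_key]:
            lats.append(float(row[lat_key]))
            lons.append(float(row[lon_key]))
    return lons, lats

lons, lats = read_coordinates(io.StringIO(SAMPLE_CSV))
print(len(lons), "points with coordinates")
```

With a real file, pass open(path, newline="") instead of the StringIO object, then hand the two lists to a plotting call such as matplotlib's pyplot.scatter.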


JSON (JavaScript Object Notation) is a lightweight data-interchange format useful for non-tabular data. It is easy for humans to read and write. It is easy for machines to parse and generate. It supports self-documentation.

It is useful to install a JSON-formatting browser plugin, such as "JSONView", "JSON Viewer", or "JSON Formatter".

Supplementary reference: Jennifer Widom database lectures.

Examples of use: Browser data, NHTSA Complaints database, and ... Jupyter Notebooks

Correspondence with Python data structures

object = dictionary (string:value)
array  = list
value  = string/number/true/false/null/object/array

json module: dumps (Python object to JSON text), loads (JSON text to Python object)


  1. JSON keys are always strings (Python dictionary keys need not be)
  2. JSON text is almost pasteable as Python code, but JSON "true/false/null" map to Python "True/False/None"
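
Both points can be seen in a short round trip through json.dumps and json.loads. This is a small sketch; the dictionary contents are made up for illustration.

```python
import json

# A Python dictionary with one non-string key (2014) to show what JSON does to it.
record = {"make": "Chevrolet", "recalled": True, "notes": None, 2014: [1, 2.5]}

# dumps: Python object -> JSON text. True/None become true/null,
# and the integer key is written as the string "2014".
text = json.dumps(record)
print(text)

# loads: JSON text -> Python object. The key comes back as a string.
restored = json.loads(text)
print(list(restored.keys()))
```

Note that the round trip is not exact: restored has the key "2014" where record had the integer 2014.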

Contrast with the Mathematica notebook format and the classic Maple worksheet format mws (since replaced by the XML-based mw).

JSON Exercise 1: load a Jupyter notebook
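
A notebook file is just JSON text, so the json module can read it directly. The sketch below parses a tiny hand-written notebook, assuming the version 4 notebook layout (a top-level "cells" list whose entries have "cell_type" and "source"); inspect your own .ipynb file to confirm what it contains.

```python
import json

# A tiny hand-written notebook in the (assumed) nbformat 4 layout.
NOTEBOOK_TEXT = """
{
  "cells": [
    {"cell_type": "markdown", "metadata": {}, "source": ["# Title\\n", "Some notes"]},
    {"cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [],
     "source": ["x = 1\\n", "print(x)"]}
  ],
  "metadata": {},
  "nbformat": 4,
  "nbformat_minor": 2
}
"""

notebook = json.loads(NOTEBOOK_TEXT)

for cell in notebook["cells"]:
    # "source" is usually a list of lines, each keeping its trailing newline.
    source = "".join(cell["source"])
    print(cell["cell_type"], repr(source))
```

For a file on disk, use json.load on the open file object instead of json.loads on a string.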

JSON Exercise 2: Chevrolet Cobalt ignition switch

From Wikipedia article on Chevrolet Cobalt: Faulty ignition switches in the Cobalts, which cut power to the car while in motion, were eventually linked to many crashes resulting in fatalities, starting with a teenager in 2005 who drove her new Cobalt into a tree. The switch continued to be used in the manufacture of the vehicles even after the problem was known to GM. On February 21, 2014, GM recalled over 700,000 Cobalts for issues traceable to the defective ignition switches. In May 2014 the NHTSA fined the company $35 million for failing to recall cars with faulty ignition switches for a decade, despite knowing there was a problem with the switches. Thirteen deaths were linked to the faulty switches during the time the company failed to recall the cars.

Exercise: Was this problem evident from the NHTSA complaint database long before the 2014 recall?

My scratch notebook from today

Please remember you have to produce a plot at the beginning of class Thursday.

JSON Exercise 3

This will be due Friday at 11:59pm.

Write an "Escape from Jupyter", i.e. a program that extracts all the Python code and Markdown from a Jupyter Notebook and writes a corresponding executable plain Python script. Markdown should be converted to Python comments.



Upload your plain-text Python file that converts a .ipynb Jupyter notebook file to a
plain-text version that can be fed directly to a Python (3) interpreter.

The input filename must be read from the command line.

The output filename should be the input filename with ".py" tacked on the end.
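
The two filename rules above could be handled along these lines. This is only a sketch: the helper names and the script name escape.py are hypothetical, not part of the assignment.

```python
def output_name(input_name):
    """Append ".py" to the full input filename, existing extension included."""
    return input_name + ".py"

def filenames_from_argv(argv):
    """Return (input_name, output_name) from an argv-style list."""
    # argv[0] is the script name; exactly one argument is expected after it.
    if len(argv) != 2:
        raise SystemExit("usage: python escape.py NOTEBOOK.ipynb")
    return argv[1], output_name(argv[1])

print(filenames_from_argv(["escape.py", "day07.ipynb"]))
```

In the real script you would import sys and pass sys.argv. Note that "tacked on the end" means day07.ipynb becomes day07.ipynb.py, not day07.py.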

All Python code in the notebook should be transcribed into the output file.

All markdown in the notebook should be transcribed into the output file as comments.
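
One possible way to treat a single cell is sketched below, assuming each cell is a dictionary with "cell_type" and "source" keys as in the version 4 notebook format. The function name cell_to_lines is hypothetical, and this is not a complete solution.

```python
def cell_to_lines(cell):
    """Turn one notebook cell into a list of plain Python source lines."""
    source = cell["source"]
    # "source" may be a list of lines or a single string; normalize to a string.
    text = "".join(source) if isinstance(source, list) else source
    lines = text.splitlines()
    if cell["cell_type"] == "code":
        return lines
    if cell["cell_type"] == "markdown":
        # Prefix every line so the interpreter treats it as a comment.
        return ["# " + line if line else "#" for line in lines]
    return []  # other cell types are ignored in this sketch

markdown_cell = {"cell_type": "markdown", "source": ["# Title\n", "\n", "text"]}
code_cell = {"cell_type": "code", "source": ["x = 1\n", "print(x)"]}

print(cell_to_lines(markdown_cell))
print(cell_to_lines(code_cell))
```

Code cells that contain IPython magics, such as a line starting with %, are not valid plain Python, so decide how your converter should handle them.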

Your code will be tested by a totally unmerciful script: be sure you follow all the
instructions precisely.