Day 25, Tuesday, May 1, 2018

Course evaluations

Your thoughts on what works well, and what would improve the course, will be valued. Thanks.

Applications of pseudo-random numbers, cont'd

Job/marriage problem

Please fix your code if necessary and email it to me.

Algorithm performance: The figures below show the fraction of trials in which your algorithm chose one of the top k offers, for k from 1 to n. An (impossibly) perfect algorithm would have a horizontal line at 1.0 all the way across. Purely random choice will give a straight line from (1,1/n) to (n,1). If your name does not appear, it means your code gave an error on import. If your curve is horizontal at zero, it means your function gave an error when called.


marriage_job_problem/marriage_algorithms_performance_noffers0005.png
marriage_job_problem/marriage_algorithms_performance_noffers0010.png
marriage_job_problem/marriage_algorithms_performance_noffers0030.png
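
For reference, here is a minimal sketch of how a top-k curve of this kind could be computed. It is not the grading code. The interface is an assumption made for illustration: a hypothetical algorithm receives the whole list of offers and returns the index it picks, whereas a real entry sees the offers one at a time. The names random_choice_algorithm and top_k_fractions are likewise made up here.

```python
import random

def random_choice_algorithm(offers):
    """Baseline (hypothetical interface): pick a position uniformly at random."""
    return random.randrange(len(offers))

def top_k_fractions(algorithm, n, trials=10000, seed=0):
    """Fraction of trials in which `algorithm` picked one of the top k offers, for k = 1..n."""
    random.seed(seed)
    rank_counts = [0] * n  # rank_counts[r]: trials where the pick had rank r (0 = best)
    for _ in range(trials):
        offers = random.sample(range(n), n)  # random permutation; larger value = better offer
        chosen = algorithm(offers)
        rank = (n - 1) - offers[chosen]
        rank_counts[rank] += 1
    # Cumulative sum: picking a top-k offer means rank 0, 1, ..., or k-1.
    fractions = []
    running = 0
    for r in range(n):
        running += rank_counts[r]
        fractions.append(running / trials)
    return fractions

fractions = top_k_fractions(random_choice_algorithm, n=5)
print(fractions)
```

For the random baseline the values should come out close to k/n, and the last entry is always 1.0 because every pick is among the top n.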

Germ avoidance problem


Code to start from: germ_avoidance_starter_code.ipynb


Computing with text

Grab an online document

import requests
r = requests.get("")
giantstring = r.text
print(giantstring[:500])

Slurp a file on disk

Use your browser to download the plain text version of a book of your choice from Project Gutenberg.

with open('hamlet.txt') as f:
    giantstring = f.read()

Or, if that raises a decoding error, you may need to specify the encoding:

with open('hamlet.txt', encoding='utf8') as f:
    giantstring = f.read()

Search for a phrase

if 'To be or not' in giantstring:
    print("Yep, it's in there.")
    print(giantstring.find('To be or not'))  # index of the first occurrence

Strings also have a count() method, which returns the number of non-overlapping occurrences of a substring.
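
A small illustration of find() and count() on a short made-up string (the variable name sample is just for this example; with a real book you would use giantstring instead):

```python
sample = "To be or not to be, that is the question."

first_index = sample.find("To be or not")  # index of the first match
missing = sample.find("Ophelia")           # find() returns -1 when there is no match
n_be = sample.count("be")                  # number of non-overlapping occurrences
n_to = sample.lower().count("to")          # lower() first makes the count case-insensitive

print(first_index, missing, n_be, n_to)
```

Note that count() matches substrings, not whole words, so counting "to" in a long text would also pick up words such as "stone".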

Clean up and count words

Most useful methods of strings: split, replace, slicing with [:], lower, join

Occasionally useful: title/capitalize, startswith/endswith

The string module has some useful things, including: string.punctuation, string.whitespace
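
One possible way to combine these pieces into a cleanup step, shown on a short made-up string rather than a whole book. This is only a sketch; the choice to replace punctuation with spaces (rather than delete it) is an assumption, and it has the side effect of splitting a word like "it's" into "it" and "s".

```python
import string

text = "To be, or not to be: that is the question!"

cleaned = text.lower()                 # treat "To" and "to" as the same word
for ch in string.punctuation:          # replace each punctuation character with a space
    cleaned = cleaned.replace(ch, " ")
words = cleaned.split()                # split on any run of whitespace

print(words)
print(" ".join(words))                 # join is the inverse of split
```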

Exercise: Count number of distinct words

Word count

Exercise: Make a dictionary whose keys are all the distinct words in your text, and the value is the number of times the word occurs.
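
If you get stuck, here is one minimal sketch of the counting loop, run on a short made-up word list (assumed to be already lowercased and stripped of punctuation). The standard library's collections.Counter produces the same counts, but writing the loop once by hand is worthwhile.

```python
words = "to be or not to be that is the question".split()

counts = {}
for w in words:
    # get() returns 0 the first time a word is seen
    counts[w] = counts.get(w, 0) + 1

print(counts["to"])   # occurrences of one word
print(len(counts))    # number of distinct words
```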

Exercise: Examine the relationship between word frequency and frequency rank.
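
A possible starting point, using a tiny dictionary of made-up counts (not taken from any real text): sort the frequencies in decreasing order, so that the most common word has rank 1, and look at both quantities on a logarithmic scale. In many natural-language texts the frequency falls off roughly as a power of the rank (Zipf's law), which would show up as an approximately straight line in log-log coordinates.

```python
import math

# Made-up counts for illustration only; use the dictionary from your own text.
counts = {"the": 50, "of": 24, "and": 17, "to": 12, "hamlet": 10}

freqs = sorted(counts.values(), reverse=True)  # most frequent first
ranks = list(range(1, len(freqs) + 1))         # rank 1 = most frequent word

for rank, freq in zip(ranks, freqs):
    print(rank, freq, round(math.log(rank), 2), round(math.log(freq), 2))
```

With matplotlib available, plt.loglog(ranks, freqs, 'o') would show the same data as a plot.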

Other texts of possible interest:

If time permits, make up a totally random text and compare. (Use numpy.random.choice to pick a bunch of letters at random from an alphabet of your choosing.)
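
A sketch of generating such a random text. The alphabet (lowercase letters plus a space, all equally likely), the length, and the seed are arbitrary choices made for this example.

```python
import numpy as np

np.random.seed(0)  # fixed seed so the run is repeatable
alphabet = list("abcdefghijklmnopqrstuvwxyz ")

letters = np.random.choice(alphabet, size=10000)  # array of single characters
random_text = "".join(letters)
random_words = random_text.split()                # the space acts as the word separator

print(random_text[:60])
print(len(random_words), len(set(random_words)))
```

The resulting word list can be fed through the same counting and ranking steps as the real text for comparison.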