i used code from here to put all the words from the first executive order into a set, which is a way to get all the unique words in the document.
## from allison parrish's http://www.decontextualize.com/teaching/rwet/simple-models-of-text/
import sys

words = set()  # a set keeps only one copy of each word
for line in sys.stdin:
    line = line.strip()
    line_words = line.split()
    for word in line_words:
        words.add(word)

for word in words:
    print(word)

which produced “words” separated by line breaks. here are some interesting sections:
all
United
burden
PENDING
out
purchasers
Patient
for
enforceable
availability
HOUSE,
health
imperative
Nothing
benefit
Human
repeal,
or
otherwise
individuals,
control
Constitution
unwarranted
fiscal
head
with
legislative
Procedure
CARE
me
commerce
agency
Act
authorities
such
WHITE
law
affect:
impair
does
In
the
insurers,
okay so obviously some of these would be neat poems, so i tried to join them:
import sys

for line in sys.stdin:
    line = line.strip()
    output = " ".join(line)
    print(output)
hmmmm noooo…
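a small sketch of what seems to be going on with the join: `" ".join(...)` expects a list of strings, and when it is handed a single string it treats every character as its own item, so each line comes back with a space between every letter. collecting the stripped lines into a list first and joining once at the end should give the run-on poem instead. the `lines` list below is a made-up stand-in so the snippet runs on its own; with the real file it would be `sys.stdin`.

```python
# stand-in for a few lines of the word list (hypothetical sample, not the real file)
lines = ["all\n", "United\n", "burden\n"]

# joining a single string treats each character as a separate item
spaced_out = " ".join(lines[0].strip())
print(spaced_out)  # a l l

# joining a list of words puts one space between whole words
poem_words = [line.strip() for line in lines if line.strip()]
poem = " ".join(poem_words)
print(poem)  # all United burden
```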

hmmmm nooooooo… okay, new activity: replacing the executive order with these poems.
i’m doing this manually for now since it would involve a bunch of regex, but i’ll record the steps here:
- replace all instances of “Minimizing the Economic Burden of the Patient Protection and Affordable Care Act Pending Repeal” with first poem above, “all United burden,” in the style in which the original text appears (so, with .title() or .upper()).
- when sections begin, keep the text naming the section as such (“Section 1”, “Sec. 2”, etc.) but replace body of the section with the next poem above.
- remove newlines from poems above so the words flow like sentences, but don’t change case, punctuation, etc.
- fill sections for as many poems as were originally picked out from the set.
- delete sections that don’t have an accompanying poem.

this feels very related to a project i did in jer’s class last year where i replaced “mortgage” language with “data” language in hank paulson’s 2008 announcement about the economy. python woulda helped with that/made it better. anyway, executive order results here, original here.
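the manual replacement steps above could probably be automated. this is only a rough sketch under assumptions, not what produced the linked results: the names `flatten`, `replace_title`, `replace_sections` and `SECTION_RE` are hypothetical, the section-label pattern is a guess at how the labels look in the file, and the sample `order` text and the way the words are grouped into `poems` are made up for the demo.

```python
import re

# the order's title, as quoted in the steps above
TITLE = ("Minimizing the Economic Burden of the Patient Protection "
         "and Affordable Care Act Pending Repeal")

# matches labels like "Section 1." or "Sec. 2." (assumed format)
SECTION_RE = re.compile(r"(Sec(?:tion|\.)\s+\d+\.?)")


def flatten(poem):
    """Remove newlines so the poem's words flow like a sentence."""
    return " ".join(poem.split())


def replace_title(text, title_poem):
    """Swap the title for the poem, copying the case style of each occurrence."""
    def swap(match):
        found = match.group(0)
        return title_poem.upper() if found.isupper() else title_poem.title()
    return re.sub(re.escape(TITLE), swap, text, flags=re.IGNORECASE)


def replace_sections(text, poems):
    """Keep each section label, replace its body with the next poem,
    and drop any section that has no poem left."""
    # with a capturing group, split gives [preamble, label, body, label, body, ...]
    parts = SECTION_RE.split(text)
    out = [parts[0]]
    remaining = list(poems)
    for label in parts[1::2]:
        if not remaining:
            break  # sections without an accompanying poem are deleted
        out.append(label + " " + flatten(remaining.pop(0)) + "\n")
    return "".join(out)


# made-up miniature of the order, just to exercise the functions
order = (
    "MINIMIZING THE ECONOMIC BURDEN OF THE PATIENT PROTECTION "
    "AND AFFORDABLE CARE ACT PENDING REPEAL\n"
    "Section 1. first body.\n"
    "Sec. 2. second body.\n"
    "Sec. 3. third body.\n"
)
# hypothetical grouping of the set's words into poems
poems = ["all\nUnited\nburden", "PENDING\nout\npurchasers", "Patient\nfor\nenforceable"]

result = replace_sections(replace_title(order, flatten(poems[0])), poems[1:])
print(result)
```

two caveats with this sketch: the label pattern would also match a mention like “Sec. 2” inside a section body, and anything after the last section (a signature block, say) gets treated as part of that section's body and replaced along with it.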

another thing i was working on was figuring out how to clean up the file without going through it manually. these are things i did in the interpreter. i wonder if there’s a way to say: if the 'space' char appears 2 or more times in a row, replace it with ' '? it’d also be cool to figure out how to split on html tags so i don’t have to manually delete those. maybe this will be useful later.
for line in lines:
    # each pass swaps a double space for a single one, so repeating it shrinks longer runs
    line = line.strip().replace('  ', ' ')
    line = line.strip().replace('  ', ' ')
    line = line.strip().replace('  ', ' ')
    line = line.strip().replace('  ', ' ')
    line = line.strip().replace('  ', ' ')
    line = line.strip().replace('  ', ' ')
    line = line.strip().replace('  ', ' ')
    line = line.strip().replace('  ', ' ')
    print(line)
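both wishes above look doable with python's built-in `re` module. a minimal sketch, assuming the tags are simple `<...>` shapes: the pattern ` {2,}` means "a space, two or more times in a row", and `re.split` can cut a line wherever something tag-shaped appears. the function name `clean_line` and the sample line are made up for illustration, and the tag pattern is a rough one rather than a real html parser.

```python
import re


def clean_line(line):
    """Drop html tags, then squeeze runs of 2+ spaces down to one."""
    # split on anything shaped like <...>; rough pattern, not a real html parser
    pieces = re.split(r"<[^>]+>", line)
    text = " ".join(pieces)
    # " {2,}" matches a space repeated two or more times in a row
    return re.sub(r" {2,}", " ", text).strip()


sample = "<p>Section  1.   <b>Policy</b>  of the order</p>"  # made-up line
print(clean_line(sample))  # Section 1. Policy of the order
```

swapping the pattern for `\s{2,}` would also catch tabs and other whitespace, if the file turns out to have those.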