Code as a Liberal Art, Spring 2025

Unit 2, Lesson 3 Homework

Due: Wednesday, April 9, 8pm

  1. Review the class notes for this week.
  2. Make sure that you have the code that we worked on in class operating correctly.

    Try running your Markov data structure generator on another sample input. Let's try Jorge Luis Borges' "The Garden of Forking Paths".

    Try using pprint.pp() to print out the next-word lists for a few selected words. Compare your outputs with the ones below and make sure they are identical:

    pprint.pp(markov["In"]):
    ['his',
     'despite',
     'the',
     'ten',
     'all',
     'all',
     'the',
     "Ts'ui",
     'the',
     'the',
     'your',
     'some',
     'this',
     'another,',
     'yet',
     'the',
     'point']
    
    pprint.pp(markov["by"]):
    ['thirteen',
     'fourteen',
     'Dr.',
     'a',
     'train.',
     'making',
     'besting',
     'an',
     'a',
     'the',
     'distance.',
     'clapping',
     'the',
     'the',
     'many',
     'the',
     'one',
     'them',
     'means',
     'the',
     'careless',
     'lightning.',
     'the',
     'the',
     'the']
    
    pprint.pp(markov["other"]):
    ['men,',
     'men,',
     'enterprise',
     'than',
     'times.',
     'bifurcations.',
     'possible',
     'through',
     'dimensions',
     'course']
    
    Can you find any other interesting outputs from running this code on this input data? Upload any interesting outputs that you can find.
  3. There seem to be a lot of garbage characters and odd formatting in the David Copperfield text. Try to clean up that file and see whether doing so improves the behavior of the Markov process. Does the data structure look good, or is there junk in it? Are there some things you can do to "clean up" this data? Post your results.

  4. Try to make some progress on the text generation part of the Markov process: get started on using your Markov data structure to generate text. Upload your code and any output.
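
If you are unsure where to begin on items 3 and 4, here is one possible starting point. It is only a sketch, not the code from class. It assumes the Markov data structure is a dictionary that maps each word to a list of the words that follow it, duplicates included, which matches the lists printed above. The function names clean_text, build_markov, and generate are made up for this example, and the rule for what counts as "junk" is a guess that you should adjust after looking at the actual file.

```python
import random
import re


def clean_text(raw):
    """Replace likely junk characters with spaces and collapse whitespace."""
    # Assumption: "junk" is anything that is not a letter, digit, whitespace,
    # or basic punctuation. Underscores are removed too, since plain-text
    # books often use them to mark italics.
    kept = re.sub(r"[^\w\s.,;:!?'\"-]|_", " ", raw)
    return re.sub(r"\s+", " ", kept).strip()


def build_markov(text):
    """Map each word to the list of words that follow it (duplicates kept)."""
    words = text.split()
    markov = {}
    for current, following in zip(words, words[1:]):
        markov.setdefault(current, []).append(following)
    return markov


def generate(markov, start, length=20, rng=random):
    """Walk the structure from `start`, choosing each next word at random."""
    word = start
    output = [word]
    for _ in range(length - 1):
        choices = markov.get(word)
        if not choices:  # dead end: nothing ever followed this word
            break
        word = rng.choice(choices)
        output.append(word)
    return " ".join(output)


# Tiny made-up input so the example runs without any files.
sample = "the cat sat on the mat and the cat ran"
markov = build_markov(clean_text(sample))
print(generate(markov, "the", length=8))
```

Because duplicates stay in each list, random.choice picks a common follower more often than a rare one. To try this on a real text, read the file into a string and pass it through clean_text and build_markov in place of the sample sentence.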