Code as a Liberal Art, Spring 2021

Unit 2, Exercise 2 homework

Due: Tuesday, March 9, 8pm

  1. Review the class notes for this week.
  2. Build on the example that we worked through in class, which demonstrates some principles of web scraping and visualization. You can use my example directly: scraping pages and their titles, and visualizing their links. Or you could try to gather some other data: perhaps metadata about those pages, or perhaps you could use Beautiful Soup to target one specific piece of data that appears on every page in a collection.

    My example turned out to be not very interesting, because the NY Times homepage links to all those pages but the pages rarely link to each other. So the visualization came out as a kind of star shape: one hub in the middle with spokes going out. If you used a different URL as your starting point you could probably find a more interesting sitemap, but you'll have to use the "Inspect" tool in your browser and poke around a bit in the HTML to find out which pieces to parse with Beautiful Soup.

    Maybe Wikipedia would have a richer link structure, if you find a good page to use as your starting point. Or maybe you could develop an idea related to the final question that you answered as part of last week's homework.
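
If you want a starting point for targeting specific pieces of a page, here is a minimal sketch. It is not the in-class example: the HTML string, the example.com URLs, and the parse_page function are all made up for illustration, and it assumes the page keeps its description in a <meta name="description"> tag, which you would need to confirm with the "Inspect" tool on whatever site you pick. It pulls out the title, one piece of metadata, and the same-site links, then turns the links into (source, target) pairs that you could feed to a visualization.

```python
from urllib.parse import urldefrag, urljoin, urlparse

from bs4 import BeautifulSoup

# Made-up page standing in for HTML you would actually download.
SAMPLE_HTML = """
<html>
  <head>
    <title>Example Article</title>
    <meta name="description" content="A made-up page for practice.">
  </head>
  <body>
    <a href="/section/arts">Arts</a>
    <a href="https://example.com/section/science">Science</a>
    <a href="https://other.example.org/elsewhere">Elsewhere</a>
    <a href="#top">Back to top</a>
  </body>
</html>
"""


def parse_page(html, page_url):
    """Return the title, meta description, and same-site links of one page."""
    soup = BeautifulSoup(html, "html.parser")

    # The <title> tag may be missing, so check before reading its text.
    title = soup.title.get_text(strip=True) if soup.title else None

    # Assumption: the description lives in <meta name="description">.
    meta = soup.find("meta", attrs={"name": "description"})
    description = meta.get("content") if meta else None

    site = urlparse(page_url).netloc
    links = []
    for a in soup.find_all("a", href=True):
        # Resolve relative links and drop any "#fragment" part.
        absolute, _fragment = urldefrag(urljoin(page_url, a["href"]))
        same_site = urlparse(absolute).netloc == site
        # Skip other sites, links back to this page, and duplicates.
        if same_site and absolute != page_url and absolute not in links:
            links.append(absolute)

    return {"title": title, "description": description, "links": links}


page_url = "https://example.com/"
info = parse_page(SAMPLE_HTML, page_url)

# One (source, target) pair per link: the raw material for a link graph.
edges = [(page_url, target) for target in info["links"]]

print(info["title"])
print(info["description"])
for source, target in edges:
    print(source, "->", target)
```

To build a sitemap rather than a single star, you would run parse_page on each page you collect and add its pairs to the same edges list. Pages that link to each other, and not just back to the starting page, are what make the shape more interesting.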