Radical Software

LCST 2234, Fall 2021 (CRN 9430)
Rory Solomon

Project 2, Tutorial 2: CSS and text modification with a browser extension

In our tutorial last week we got started on this project, downloaded some source code as a starting point, and got set up with the basic structure of your browser extension. This week we are going to talk about functionality we can start adding to your extension to allow it to intervene in a web user's browsing experience by modifying the HTML page currently being displayed.

In order to understand how this works, we need to take a bit of a detour into the world of CSS and its relationship to HTML.

Table of contents

  1. Review: HTML and CSS as structure and content vs appearance
    1. A sample HTML file to work with
  2. Introducing CSS
    1. Attaching CSS to an HTML page
  3. Inspecting the structure & appearance of an HTML page
  4. Extension technique: modifying CSS properties
  5. Extension technique: search-and-replace text
The artist Gordon Matta-Clark would often work by removing the façades of structures in the built environment, presenting structural features as aesthetic objects. Today we'll be mucking around with the structure and "façade" of HTML documents. This piece is called "Bingo", 1974, on view here at MoMA.

01. HTML and CSS: Structure and content versus appearance

Last week I gave a brief overview of HTML: its basic syntax of open and close angle brackets < > to define tags (a.k.a. elements), the way that tags often come in open and close pairs (like <p> </p>), and the way that HTML is generally used to specify the semantics or organizational structure of webpages, like paragraphs, headings, lists, etc.

That leaves CSS to control the appearance of webpages as determined by the structure of HTML.

(Throughout my class notes I will use bold to indicate key technical jargon that is important for you to understand. Learning to code is not just about learning programming language syntax, but also includes learning how to speak about programs with other programmers. The technical vocabulary of these bolded terms will help you do that, and it is as important to practice using as the specific code commands we'll be working with.)

For your lab notebook. Try looking through the notes for the technical tutorials so far and make note of the bolded terms and their definitions.


A sample HTML file to work with

Let's make a sample HTML file to work with for today. In VS Code, select File > New File, name it sample.html and save it into your project-1-main folder. Now let's add the basic starting code for a web page. Start by typing html. VS Code should give you some autocomplete options. Select html:5. Now let's start by copy/pasting the sample HTML code that we were looking at last week, taking care to put this in the <body> element (i.e. in between the open and close <body> tags):

<body>
<h1>Autobiography</h1>
<h2>By Gritty Monsteur</h2>
<div>
  <h3>Past</h3>
  <p>
    I was born thousands of years ago
    in Monsterville.
  </p>
  <p>
    I come from a long line of monsters.
  </p>
</div>
<div>
  <h3>Present</h3>
  <p>
    I like:
  </p>
  <ul>
    <li>ice skating</li>
    <li>the color orange</li>
    <li>doing pranks</li>
  </ul>
</div>
<div>
  <h3>Future</h3>
  <p>
    One day I hope to be a hockey player.
  </p>
</div>
</body>

02. Introducing CSS

Now we can get started molding this HTML structure into the layout that we're working toward. We do this with CSS, which stands for Cascading Style Sheets. Where HTML specifies the structure and semantics of a document, CSS is deployed to control the visual display of that structure. As we've seen so far, most HTML tags have some kind of default appearance. CSS can be used to change and override any of these visual aspects. Any display properties that are not explicitly modified by your CSS code will fall back on the default properties for each tag, which are specified by your browser and are sometimes slightly different from one browser to another.

We won't be writing too much CSS code together for this project; instead we will be more focused on writing JavaScript code to modify CSS. But there may be some parts of your project that will require CSS, and in any case, in order to write code that modifies CSS, it is a good idea to have a solid foundation in how it works.

CSS code is made up of a series of rules. The basic structure of a CSS rule is as follows:

(CSS code will always be displayed with an orange background
and rounded corners.)
p {
  color: red;
}
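There are four main components to each rule:

  1. The selector (here p), which names the HTML tag or tags that the rule applies to.
  2. The curly braces { }, which enclose everything that the rule sets.
  3. The property (here color), which names the aspect of the display being set.
  4. The value (here red), which is what that property gets set to.

Some other syntax rules:

  1. A colon : separates each property from its value.
  2. A semicolon ; ends each property / value pair.
  3. A single rule can set multiple properties / values, listed one after another inside the curly braces and separated with semicolons.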

We can't possibly go over every available CSS property and all their valid values. Perhaps if this were a web design course, we'd spend a few weeks doing so. For our class, I'm only going to show you a couple, and offer you some pointers to where you can look for more reference documentation about how CSS works. Probably the best guide I can recommend is W3Schools. Here's their CSS information:
W3 Schools, CSS Tutorial
Notice the column on the left side with a long list of different categories of CSS functionality. Click on each of those to learn more about the specific CSS properties and values for a range of CSS options, including: backgrounds, borders, colors, fonts, margins, height/width, links, and lists, just to name a few that I think might be useful.


02.a Attaching CSS to an HTML page

CSS rules are "attached" to the page structure of an HTML document by including the CSS code in, or linking it from, the HTML. There are three ways to do this:

  1. Inline. For any HTML tag, you can include an attribute named style, which can include CSS code. For example:
    <h1 style="color: red;">Header</h1>
    
    As above, you can specify multiple properties / values separated with semicolons:
    <h1 style="color: red; border: solid 1px;">Header</h1>
    
    In this case, these rules will apply only to this one specific <h1> tag. No others.

    This is, in general, considered the worst way to apply CSS rules to your HTML. It is not general in any way, so if you want to apply similar styles to different tags, you have to copy/paste a lot of CSS code. Sometimes in limited cases though it can be a useful shortcut. But if you find yourself including a lot of CSS this way, switch to one of the below methods.

  2. Internal. A better way is to include a big chunk of CSS rules at the top of your document, in the <head> section using a <style> tag, like so: (This HTML is a snippet from the complete HTML file above, as indicated by the jagged border.)
    <head>
      <title></title>
      <style>
        p {
          color: red;
          border: solid 1px;
        }
      </style>
    </head>
    

    This method is probably a little better than inline. It helps separate your tags specifying the structure of the document and the rules for how it is displayed.

  3. External. This is probably the best method. With this technique, you place all your CSS code in a separate file, which you reference from your HTML file.

    To do this, create a new subfolder called css in your project folder, and within that, create a new file. I like to call my CSS file style.css, but you can call it anything you'd like, just make sure it ends in .css.

    Now, place your CSS rules in this file, and save it. Then modify the <head> section of your HTML file like so:

    <head>
      <title></title>
      <link rel="stylesheet" href="css/style.css">
    </head>
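
The inline method (item 1 above) matters for our extension work because it is the one that JavaScript writes to: setting a property through an element's style object changes that element's inline style. Here is a minimal sketch of how a set of property / value pairs turns into inline CSS text. The function name toInlineStyle is made up for this example and is not part of any library.

```javascript
// Hypothetical helper: turn { property: value } pairs into inline CSS text.
function toInlineStyle(declarations) {
    return Object.entries(declarations)
        .map(([property, value]) => property + ': ' + value + ';')
        .join(' ');
}

const styleText = toInlineStyle({ color: 'red', border: 'solid 1px' });

// Prints: <h1 style="color: red; border: solid 1px;">Header</h1>
console.log('<h1 style="' + styleText + '">Header</h1>');

// In a browser, the same text can be written to every <h1> on the page.
// (Skipped when there is no page, for example when run outside a browser.)
if (typeof document !== 'undefined') {
    for (const h of document.getElementsByTagName('h1')) {
        h.style.cssText = styleText;
    }
}
```

Just like inline CSS written by hand, this only touches the specific elements it is applied to.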
    

03. Inspecting the structure & appearance of an HTML page

The browser's "inspect" tool

You can use it to explore how a webpage is built, which is a good way to learn HTML and CSS.

You can also use it to have some fun modifying the structure and appearance of an HTML page. You can even (sometimes) use it to circumvent paywall blocks that are implemented with CSS changes. Is this "radical software"?

Keep in mind that the changes you are making only have an effect locally, in the version of the page that your browser is showing you, a kind of local copy. You are not modifying the page on the server, because you do not have access to that computer. And you cannot share these changes with others, because you can't control how those web servers deliver HTML and CSS data to them.

You can, however, create a browser extension that does this, and that way anyone who installs your browser extension will see the modifications that you have made. However, that entails figuring out how to automate these changes so that they run in someone else's browser. For that we'll be using JavaScript.

04. Modifying CSS properties with an extension

We saw a simple version of this last week (adding a red border to all <p> tags). Today let's try something else.

Let's walk through an example. Rename content.js to tutorial1-content.js. Create a new file called tutorial2-content.js. Modify manifest.json to specify this filename in the js field of content_scripts.

// Tutorial 2, tutorial2-content.js

// Technique 1
// Give every element on the page its own random background color.
for (const e of document.getElementsByTagName('*')) {
    // Math.floor(Math.random() * 256) yields a whole number from 0 to 255.
    const c = "rgb(" + Math.floor(Math.random() * 256)
              + "," + Math.floor(Math.random() * 256)
              + "," + Math.floor(Math.random() * 256) + ")";

    // For testing:
    //console.log(e);

    e.style.setProperty('background-color', c);
}

Similar to last week's tutorial, this code snippet uses getElementsByTagName(), which returns a list of all the elements in an HTML document that match the tag specified. Last week we said getElementsByTagName('p') to retrieve a list of all <p> elements in the document. In the code snippet here, we're saying getElementsByTagName('*'), which returns every HTML element in the document.

The line for (const e of document.getElementsByTagName('*')) { creates a loop, which iterates over all the items in the list returned by getElementsByTagName(). The code inside the loop gets run on each item in the list. In other words, the code in between the curly braces { } in the above example gets invoked once per HTML element. Each time it gets invoked, the variable e is set to the next successive item in the list.
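
To see the loop on its own, away from any web page, here is a minimal sketch that runs the same kind of loop over an ordinary list. The list of tag names is made up for illustration.

```javascript
// A made-up list standing in for the elements of a page.
const tags = ['h1', 'p', 'li'];
const visited = [];

// The body of the loop runs once per item; e is a different item each time.
for (const e of tags) {
    visited.push(e);
    console.log('This time through the loop, e is ' + e);
}
```

The loop body runs three times here, once for each item in the list.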

The last line (e.style.setProperty()) sets a CSS property (in this case background-color) on an HTML element. Which element? The one indicated by the variable e. And which one is that? The variable e is set to each element in the document, every single one, taken one at a time.

So what this code does is randomly generate an RGB color, by creating a string that looks something like: rgb(155,155,255), then sets that RGB color string as the value for the CSS property background-color.
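
One way you might tidy this up is to move the color generation into its own function, so the loop reads more like a sentence. This is only a sketch: the function names randomChannel and randomRgb are made up for this example. Note that it multiplies by 256, so that after rounding down each channel can land anywhere from 0 to 255.

```javascript
// Hypothetical helper: a random whole number from 0 to 255.
function randomChannel() {
    return Math.floor(Math.random() * 256);
}

// Hypothetical helper: a random CSS color string like "rgb(12,200,87)".
function randomRgb() {
    return 'rgb(' + randomChannel() + ',' + randomChannel() + ',' + randomChannel() + ')';
}

// Print one example color string.
console.log(randomRgb());

// In a browser, give every element its own random background color.
// (Skipped when there is no page, for example when run outside a browser.)
if (typeof document !== 'undefined') {
    for (const e of document.getElementsByTagName('*')) {
        e.style.setProperty('background-color', randomRgb());
    }
}
```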

Remember to save all your code files and to reload the extension in the extensions manager (chrome://extensions). Then reload a web page and see what this looks like.

05. Modifying HTML content with an extension

Comment out all the above code and let's work on another technique: replacing the textual content of a web page.

Recall the text replacement that we did with tweets: we simply replaced all instances of a word or phrase with another word or phrase in the text of a tweet. We can't simply do that here in an HTML file because it could mess with the structure of the code. It could inadvertently replace HTML tags, alter CSS classes, or otherwise just do things we don't want. (Of course, if you're interested in intentionally breaking and glitching things, perhaps this may be an effect that you're interested in. What if you search-and-replaced all instances of one tag with another, for example? Or swapped CSS classes just to see what you'd get ...)
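
To make the problem concrete, here is a small sketch of what happens if we run a plain search-and-replace over HTML source as if it were a tweet. The HTML string and the words being swapped are made up for this example.

```javascript
// A made-up snippet of HTML source, held as one plain string.
const html = '<div class="past"><h3>Past</h3><p>I miss the past.</p></div>';

// Treat the whole thing as plain text and swap "past" for "future".
const naive = html.replace(/past/gi, 'future');

// Prints: <div class="future"><h3>future</h3><p>I miss the future.</p></div>
console.log(naive);
```

The visible text changed, but so did the CSS class: any CSS rule written for the class past no longer applies to this <div>.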

For the technique I want to demonstrate today, we only want to do search-and-replace within the text nodes of the DOM.

Remember this from last week? The internal DOM tree structure of this HTML snippet.

Let's step through this example:

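// Technique 2
// Start from the root <html> element of the page.
var html = document.querySelector('html');
// The walker visits only the text nodes underneath that element.
var walker = document.createTreeWalker(html, NodeFilter.SHOW_TEXT);
var node;
// nextNode() returns null once there are no more text nodes, ending the loop.
while (node = walker.nextNode()) {
    node.nodeValue = node.nodeValue.replace(/SEARCH/gi, 'REPLACE');
}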

Replace the orange SEARCH text with the word or phrase that you want to search for (make sure you include the forward slashes / before and after), and replace the orange REPLACE text with the word or phrase that you want to swap in (making sure you include the single quotes ' on either side).
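
One thing to be aware of: because Technique 2 starts from the <html> element, the walker also visits the text inside tags like <script> and <style>. If you'd rather leave those alone, here is a sketch of one possible variation that uses the optional third argument of createTreeWalker() to filter nodes. The helper names (shouldReplaceIn, replaceText) and the list of tags to skip are my own choices for this example, not anything built in.

```javascript
// Hypothetical list of tags whose text we leave alone; adjust to taste.
const SKIP_TAGS = new Set(['SCRIPT', 'STYLE', 'NOSCRIPT', 'TEXTAREA']);

// Hypothetical helper: should text inside this kind of tag be replaced?
function shouldReplaceIn(tagName) {
    return !SKIP_TAGS.has(String(tagName).toUpperCase());
}

// Hypothetical helper: search-and-replace in the text nodes under root.
function replaceText(root, pattern, replacement) {
    const walker = document.createTreeWalker(root, NodeFilter.SHOW_TEXT, {
        // Called once per text node; rejected nodes are never returned.
        acceptNode(node) {
            return shouldReplaceIn(node.parentNode.nodeName)
                ? NodeFilter.FILTER_ACCEPT
                : NodeFilter.FILTER_REJECT;
        }
    });
    let node;
    while ((node = walker.nextNode())) {
        node.nodeValue = node.nodeValue.replace(pattern, replacement);
    }
}

// Only run where a page exists (that is, in the browser).
if (typeof document !== 'undefined') {
    replaceText(document.body, /SEARCH/gi, 'REPLACE');
}
```

As before, swap in your own word or phrase for SEARCH and REPLACE.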

Documentation for querySelector()

Documentation for createTreeWalker()

The forward slashes / / indicate that this argument is something called a regular expression. This is a special string used to match, search, and replace other string patterns. Regular expressions are a powerful idea in computer science and the basis of much text and language processing. This is a very simple regular expression that just searches for any text that matches the word or phrase between the forward slashes. The gi at the end stands for "global" and "insensitive", meaning that this will match all instances of this pattern (without the g, replace() swaps only the first match and then stops) and it will match in a case-insensitive manner (by default regular expressions are case sensitive). You can read more about regular expressions in the Mozilla documentation here: Regular expressions
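
Here is a short sketch showing what each of those two letters changes. The sample sentence and the words being swapped are made up for this example.

```javascript
const text = 'Cat cat cat';

// No flags: case sensitive, and only the first match is replaced.
const plain = text.replace(/cat/, 'dog');      // 'Cat dog cat'

// g only: case sensitive, but every match is replaced.
const globalOnly = text.replace(/cat/g, 'dog'); // 'Cat dog dog'

// i only: case insensitive, but still only the first match.
const insensitiveOnly = text.replace(/cat/i, 'dog'); // 'dog cat cat'

// gi together: every match, regardless of upper or lower case.
const both = text.replace(/cat/gi, 'dog');      // 'dog dog dog'

console.log(plain, '|', globalOnly, '|', insensitiveOnly, '|', both);
```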