Code as a Liberal Art, Spring 2023

Unit 1, Tutorial 1 — Wednesday, January 25

Coding algorithms

Today we're going to go from the abstract idea of what algorithms are to actually implementing some algorithms in computer program code. We're going from the first exercise where you devised an algorithm for sorting a deck of cards, to implementing some algorithms in Python to sort numbers, words, and even filtering the pixels of an image.

So often in discussions of digital media, we talk about and theorize algorithms in general terms, as abstract, formal sequences of instructions. So much recent socio-political study of technology has rightly realized the need to reckon with the power and prevalence of algorithms and the ways they shape daily life, from shopping to socializing to how we acquire news.

Introductory computer programming and creative coding classes and bootcamps often claim to educate about how algorithms actually work, but the fact is that most of the coding that we do in those contexts does not actually involve puzzling through the implementation of algorithms of the kinds studied by computer scientists, nor of the kinds that have become such influential mediators of our socio-cultural patterns. Today we're going to work through some exercises that might hopefully help you start to fill in that gap.

(An IBM 082 punch card sorting machine. At one point in the history of computing, programs were "written" with small holes stamped into specially formatted cards. In order to run your program, a stack of these cards would be fed into a card reader, connected to a computer. Natually, for any program to run, the cards would have to fed to the machine in a precise order. Additionally, punch cards were often used to start data. In both of these cases, stacks of cards would frequently need to be sorted into the desired order. Machines like this one were created for that task. You can find more information about this here.)

But first we're going to have to talk about some preliminaries to get setup to do that work ...

Table of contents for the topics today: (Click each to jump down to the corresponding section.)

  1. The command line
  2. A simple folder structure
  3. Python
  4. What is a file, really?
  5. The VS Code text editor
  6. Coding algorithms (search)
  7. Sorting

I. The command line

(jump back up to table of contents)

II. A simple folder structure

(jump back up to table of contents)

III. Python

Next let's introduce Python in the command line context.

From the command line, you should simply be able to type python3 to invoke the Python shell:

$ python3
Python 3.8.3 (v3.8.3:6f8c8320e9, May 13 2020, 16:29:34) 
[Clang 6.0 (clang-600.0.57)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>
You may not see precisely the same output, but it should be similar. If you see Python 2.7.x let me know.

Noteworthy here is that the prompt has changed to >>>. This is to indicate that you are not in the operating system shell, but rather in the Python shell. At this point, you can no longer type the regular shell commands. For example, cd and pwd will give you an error message. Instead, you can type Python commands. Let's experiment with a few. Try this:

>>> print("Hello!")
Hello
If you get an error message like bash: syntax error near unexpected token ..., it means you are not in the Python shell. Type python, and make sure that you see the Python command prompt: >>>

You have just typed and executed a single line of Python code. This line is comprised of one command: print(). The print() command is simple: it displays a message to the user (i.e., it "prints" it to the screen). What does it print? Whatever you specify in between the parentheses, which is what we call the arguments to the command.

In this case, the argument is "Hello": a bit of text enclosed in quotes (""), which, in computer programming, we call a string. Computers can process numbers and text (strings), as well as other digital objects like images, video, etc. Let's look at some other Python commands.

In the Python shell, if you simply type a number, Python will evaluate that expression and display the result, basically just printing the value:

>>> 10
10
>>> 5
5

Note that strings are different from numbers, even if they may seem like the same value to us. Note the difference in output here. Python uses single quotes ' ' or double-quotes " " to indicate string values. Think of single and double quotes as interchangeable in Python for now and you can use either, but you must use the same end quote as you used for the beginning of a string.

>>> "Hello"
'Hello'
>>> 10
10
>>> "10"
'10'

Note that the number 10 and the string '10' are not equal. We can check values for equality with two equal signs ==. For example:

>>> 10 == 5 + 5
True
>>> 10 == "10"
False

You can also create variables, set them to values, and do various operations on them, such as arithmetic. In Python, like in most programming languages, we have addition +, subtraction -, multiplication * and division /.

>>> x = 10
>>> y = 5
>>> x + y
15
>>> x / y
2.0
>>> x * y
50

And we can combine commands in various ways, for example to print() the result of an operation on a variable:

>>> print(x-1)
9

We can also use things called control structures, such as the if command, to change the way code instructions are executed:

>>> if x > 5:
...   print("Greater")
... else:
...   print("Less than")
... 
Greater

Note that the command prompt changed for a moment to dots: .... This notation indicates that I am inside a code block, in this case, an if statement. This code block is initiated by the colon :.

Usually Python ignores all whitespace. So x = 5 is the same as x =     5, and print("Hello") is the same as print(  "Hello"  ). You can add spaces for clarity and readability. However, at the start of a line, spaces are called indentation and they are very important in Python: they indicate that you are in a code block. In the above example, both print() statements are indented because they are in a way part of the if statement.

The last line containing only ... is created by pressing ENTER with no spaces and no other text. The empty line ends the if statement, instructing Python to evaluate the above four lines.

The if statement will run differently with different values:

>>> x = 2
>>> if x > 5:
...   print("Greater")
... else:
...   print("Less than")
... 
Less than

(When working through coding examples, I will indicate any modified code in orange with a thin dashed underline. Again, I hope to accommodate all different types of seeing abilities and if this styling is not super clear, please let me know how I can make it better!)

Typing commands like this into the Python shell can be a good way to learn, but it is a very limited way to work. Algorithms are all about automation: writing some instructions and then running them on different inputs. In other words, you don't want to have to manually modify the value of x each time, you want an if statement that automatically responds to some data or user input. To achieve this, we'll have to exit the Python shell and create and executable Python computer program by typing code into a file, which we can then run many many times without re-typing.

To exit the Python shell back to the command line, you can type CONTROL-D or exit().

(jump back up to table of contents)

IV. What is a file, really?

A file is a collection of data, grouped together into one atomic unit, which exists on a computer and is identified by a name and a path.

Files are managed on computers by something called file systems. The file system is the entire scheme of folders, subfolders, and paths that your computer uses to organize and manage data as atomic files. The file system is one part of your operating system (e.g., Mac, Windows, Linux) along with other system utilities.

Files can be of all types: images, sounds, videos, games. They can be fancy text with lots of formatting like a Microsoft Word file, or they can be simple text with no formatting, which we call plain text, and which are usually opened by a program like Text Edit.

File types are usually specified by filename extensions: the part at the end that starts with a "dot" .. The file extension tells your operating system which program to use to open a given file. For example, if you double-click a .doc file, your computer will probably open it with Microsoft Word; .html files are opened with your default web browser; .pdf files will be opened with Preview (Mac), QuickLook (Windows) or Acrobat Reader; and image files like .jpg or .png will probably be opened with Preview or QuickLook.

Please adjust the preferences for Finder (Mac) or Explorer (Windows) so that your operating system will not hide file extensions to you. Understanding file extensions is a key part of digital literacy, and knowing the extensions of files will be crucial for accessing them via computer programs this semester.

Computer programs are just files! They are usually plain text files on your computer, with a file extension telling your operating system that they can be run.

(jump back up to table of contents)

V. The VS Code text editor

To create computer program files, you need a text editor: a program that lets you type and save text as plain text, without any fancy formatting like Microsoft Word offers.

The text editor that we will use this semester is a very useful, open source tool called VS Code. The fact that it is open source means that the computer program files and algorithms of this tool are freely, publicly available, not proprietary or hidden. That means you can view the source code for this tool! In fact, the full source code is posted here on GitHub. Feel free to have a look. This also means that if you wanted to, you could modify VS Code to change its behavior in some way, then rebuild it and run your own customized version. (A very advanced exercise, but possible!)

Great! Now we can get started coding in Python.

VI. Coding algorithms (search)

Let's start by thinking about how you would design the following algorithm:

 Find the smallest number in a list of numbers.
(Similar to command line code and Python code, I will use this gray background with thin double-borders to indicate "pseudocode": this is not valid programming language syntax, but is plain English that is meant to read algorithmically. It is a kind of in-between step in the process of translating from human communication to machine instructions.)

It might be hard to actually think through the process that your brain is going through when you find the smallest number in a list. Maybe it will help if we consider an actual list of (basically random) numbers:

(I have referred to the first one starting with zero because that is what nearly all computer programming languages do.)

If you are like me, your mind may have done something like this:

Look at the first number, and let's call that "the smallest number" for the moment.

Now, compare the next number to "the smallest number":
  if the next number is smaller than "the smallest number", then it now becomes the smallest.
  Repeat this for every other number.

At the end of the list, "the smallest number" will be smaller than all the others.

Let's try to translate this into Python code. Create a new file in VS Code (File > New File) and type this there.

Start by entering the list of numbers. Here is the Python syntax for that:

number_list = [ 234, 2, 6, 42, 1, 124, 687, 5, 674, 45, 67, 99, 23 ]
Now, let's implement the first part of the algorithm:
number_list = [ 234, 2, 6, 42, 1, 124, 687, 5, 674, 45, 67, 99, 23 ]

smallest_number = number_list[0]

This new syntax says take the first item in the list (the [0]) and assign that to a variable called smallest_number.

The next part of the algorithm indicated repetition: Repeat this for every other number. In Python, we do that with a for loop, and the syntax looks like this:

number_list = [ 234, 2, 6, 42, 1, 124, 687, 5, 674, 45, 67, 99, 23 ]

smallest_number = number_list[0]

for n in number_list:
    if n < smallest_number:
        smallest_number = n
That will loop over every number in the list, and temporarily refer to each one as n. Each time the loop executes, it will compare n (the current number in the list) to the current smallest number (i.e. smallest_number) and if n is smaller, it now becomes the smallest number.

That's it! That's all there is to this algorithm. Now we can just print an informative message at the end:

Example 1: Find the smallest number

number_list = [ 234, 2, 6, 42, 1, 124, 687, 5, 674, 45, 67, 99, 23 ]

smallest_number = number_list[0]

for n in number_list:
    if n < smallest_number:
        smallest_number = n

print("The smallest number in this list is: " + str(smallest_number) )

In my print() statement, the plus sign (+) is not addition, but string concatenation: it appends one string onto another. Remember how numbers and strings are different? The str() command converts the number into a string so that it can be concatenated. Without that, the addition operator + would give a TypeError.

Save this file in VS Code (File > Save) and name it unit1_smallest_number.py.

Try running this file from the command line and see what happens:

$ python unit1_smallest_number.py
The smallest number in this list is: 1	
Perhaps not the most exciting output, but part of what is great here is that you could modify this list, even adding thousands of numbers, and the algorithm would still work well.

Let's call this type of algorithm a search: we are looking through a collection of data for some particular value. In this case we are simply looking for the smallest number, but we could do more complicated things.

You can combine different logical statements with the and operator. So if we were looking for the smallest number that was greater than 100, we could say this:

for n in number_list:
    if n > 100 and n < smallest_number:
        smallest_number = n

Experimenting with this further will be part of the homework for this week.

Swapping out text for numbers

One great thing about Python is that many of the comparison operators (e.g. < and >) work equally well for text as for numbers. This will be an exercise in the homework.

VII. Sorting

Let's try a slightly more complicated algorithm: putting a list of numbers in order.

Computer scientists call this sorting. Perhaps this is somewhat of a misnomer. Usually in common language "sorting" might mean putting items into buckets or categories, while arranging things in sequence would probably be called "ordering". But in any case, in computer science, sorting means putting a collection of things into some order and it is one of the most common algorithms. Sorting algorithms can be some of the simplest introductions to working with algorithms, but they can also get incredibly complex as we try to find faster and more efficient ways of implementing them.

Thinking back to your card sorting algorithm, how might we describe a strategy to put a list of numbers into increasing order?

Using some of the same principles from our Smallest Number algorithm, let's try this:

Given a list of unordered numbers

Make a new, empty list

Put the first item from the unordered list into the new list

Now consider the next item in the unordered list:
    If it is smaller than the item in the new list, then put it first
    If it is larger than the item in the new list, put after

For each subsequent item in the unordered list, find where it fits into the new list.
In other words:
For each item in the unordered list:
    Compare it to each item in the new list:
        If it is bigger, move on
        If it is smaller, insert it before
      

Let's implement this in Python as well. Make a new file in VS Code (File > New File). Start with the input list again:

unordered_list = [ 234, 2, 6, 42, 1, 124, 687, 5, 674, 45, 67, 99, 23 ]

Now we'll make a new, empty list, like this:

unordered_list = [ 234, 2, 6, 42, 1, 124, 687, 5, 674, 45, 67, 99, 23 ]
	
new_list = []

Now, as I described, take the first item from the unordered list and put it into the new list:

unordered_list = [ 234, 2, 6, 42, 1, 124, 687, 5, 674, 45, 67, 99, 23 ]
	
new_list = []

new_list.append( unordered_list[0] )
unordered_list[0] is how we reference the first item in the unordered list, and .append() is the command to add an item to a list.

Next we'll add a loop, to loop over all the items in the unordered list:

unordered_list = [ 234, 2, 6, 42, 1, 124, 687, 5, 674, 45, 67, 99, 23 ]
	
new_list = []

new_list.append( unordered_list[0] )

for item in unordered_list:

    # something will go in here ... 
Note my use of a comment here as I work from pseudocode to actual code.

What do we do with each item from the underline list? We have to check it against each item in the new list. So we need another loop. This one will look a little different:

unordered_list = [ 234, 2, 6, 42, 1, 124, 687, 5, 674, 45, 67, 99, 23 ]

new_list = []

new_list.append( unordered_list[0] )

for item in unordered_list:

    j = 0
    while item > new_list[j]:
        j = j + 1

    new_list.insert( j, item )
This is a new control structure called a while loop. This new variable j starts out as 0. Then we compare the item from the unordered list with the first item of the new list. If it is greater, we increase j and keep repeating this process. But as soon as the item is not greater than the next item in the new list, we'll have found the place where it belongs. In other words, j will indicate its position. So we can use the .insert() command to insert item at the position indicated by j.

There's only one last problem. Try saving this and running it. You will probably get an error message like this:

$ python unit1-insertion-sort.py
Traceback (most recent call last):
  File "unit1-insertion-sort.py", line 14, in <module>
    while  item > new_list[j]:
IndexError: list index out of range
The problem here occurs when item is greater than everything in the new list. So this while loop gets to the very end, and j is bigger than the size of the list. So the insert gives an error. We can fix it with the following check:
unordered_list = [ 234, 2, 6, 42, 1, 124, 687, 5, 674, 45, 67, 99, 23 ]

new_list = []

new_list.append( unordered_list[0] )

for item in unordered_list:

    j = 0
    while j < len(new_list) and item > new_list[j]:
        j = j + 1

    new_list.insert( j, item )
This will guarantee that j does not get too big.

Save this, run it again, and you shouldn't see any error messages. But did it work? We can add some print() statements to watch the new list getting sorted as the algorithm runs. The reading for this week will talk about different techniques for visualizing algorithms. You can think of these print() statements here as a kind of very basic visualization. It allows us to observe how the algorithm ran.

Example 2: Insertion sort

unordered_list = [ 234, 2, 6, 42, 1, 124, 687, 5, 674, 45, 67, 99, 23 ]

new_list = []

new_list.append( unordered_list[0] )

print(new_list)

for item in unordered_list:

    print(item)

    j = 0
    while j < len(new_list) and item > new_list[j]:
        j = j + 1

    new_list.insert( j, item )

    print(new_list)

Wrapping up ...

Hopefully this gives you some building blocks in algorithmic thinking that you can play with. The homework for this week has some opportunities for you to experiment with these ideas. Have fun!