In class we debugged Emma's work for part 1 of last week's homework. In working on that we came up with three different ways of implementing a solution. Here is the file that we came up with, which includes all three approaches: week11-hw-part1-mags.py. I encourage you to step through each of these.
There were also some questions about part 2 of last week's homework but we didn't have time to get to those. I think the technique being developed there is one that will be pretty useful for many people with final projects. If you have any questions about it, please send me a Gist or an email. If there is interest we can take some time to review it together as a group.
Last week we ended with a tiny glimpse into the world of networking: we wrote code that made requests from Processing running on our computers, to Python code running on a different computer which returned JSON data. This allowed all of our Processing programs to access the same shared data by making requests to the same Python program. This kind of arrangement is known as a client-server model of computing, sometimes described as a client-server architecture, "architecture" being a computer science term for the structure of a digital system.
All of the Processing programs that we wrote and then ran in this case were considered the clients, and the Python code running on a different machine is what we would refer to as the server. There are many different types of servers: file servers, email servers, login servers, and thousands of other possibilities. The server that we connected to last week is what is called a web server, because it implements HTTP, which stands for hypertext transfer protocol. By "implements" I mean that this server knows how to "speak" this protocol: it has been programmed to recognize the commands that this protocol defines, and knows what commands to issue in response.
You have probably already inferred that HTTP, like most protocols, is structured around requests and responses. The request is a bit of data that a client sends to a server, and the response is the data that the server sends back. HTTP is the protocol of the world wide web. It is the communication protocol used to transmit all web pages. This is why most URLs start with "http://". (Sometimes they start with other things like ftp://, https://, or even file:// when you are viewing a file located on your own computer.) HTTP is a convenient protocol to experiment with, because at its most basic usage, the requests and responses are plain text. Let's experiment ...
Open up a command shell (e.g. Terminal), run Python, and type the following commands (remember, don't type the command prompts $ or >>>):

    $ python
    >>> import socket
    >>> s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    >>> s.connect(("google.com",80))
    >>> s.send(b'GET / HTTP/1.0\n\n')
    16

You have just sent an HTTP request!
This is using something called a socket. A socket is kind of like a file, in that it is something that your code can write to and read from, but it is not actually a file saved on your local disk. Rather, it is an open channel of communication with another computer, over a network. A socket is usually specified by an IP address (the numerical internet address of the computer you are connecting to) and a port number, which you can think of as the particular window on that computer to which you want to connect. Most computers need to do more than one network thing simultaneously (e.g. visit web pages, send and receive emails, view Zoom meetings, transfer files), and these simultaneous functions are managed through separate connections, each identified by its own port number. The port for the web is usually 80, so when doing testing, people will often use variations of this, like 8000, 8080, or 8888. (Lower numbers, those below 1024, are reserved for special functions and require special permissions, but higher numbers can be used by anyone.)
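If you want to see both ends of a socket at once, here is a minimal sketch that runs entirely on your own computer. The function names (echo_once, demo) are made up for this illustration, and asking for port 0 is simply a way of letting the operating system pick any free port for us.

```python
import socket
import threading


def echo_once(server_sock):
    """Accept one connection, read what the client sent, and send it back in upper case."""
    conn, _addr = server_sock.accept()
    data = conn.recv(1024)
    conn.sendall(data.upper())
    conn.close()


def demo():
    # The "server" end: listen on this computer only. Port 0 means "any free port".
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    host, port = server.getsockname()

    # Handle the single incoming connection in a background thread.
    worker = threading.Thread(target=echo_once, args=(server,))
    worker.start()

    # The "client" end: the same kind of socket we used to talk to google.com.
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect((host, port))
    client.sendall(b"hello")      # write to the socket ...
    reply = client.recv(1024)     # ... and read from it, much like a file
    client.close()

    worker.join()
    server.close()
    return port, reply


if __name__ == "__main__":
    port, reply = demo()
    print("server was listening on port", port)
    print("reply:", reply)
```

The port number will be different each time you run this, since the operating system chooses it.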
The above command should have displayed a number. This is just a status output telling you the number of bytes that were sent: in this case, 16, corresponding to the 16 characters of my request. (If you are reading closely, you will have noticed that there appear to be 18 characters in my request, but that is because \n is an escape code and so counts as one character, meaning "newline", like pressing enter.)
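You can check this counting for yourself. A quick sketch: the first string is the request exactly as we sent it, and the second is a "raw" string (the r prefix), in which Python leaves the backslashes alone instead of treating \n as an escape code.

```python
# The request as sent: each \n is a single newline character.
request = b'GET / HTTP/1.0\n\n'
print(len(request))   # 16

# A raw string keeps each backslash and each n as separate characters.
raw = r'GET / HTTP/1.0\n\n'
print(len(raw))       # 18
```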
Next, we want to get the response from the server. To do that, type the following:
    >>> data = s.recv(1024)
    >>> repr(data)

This should print a big chunk of text: some HTTP headers followed by HTML, which is the webpage that google.com returns for a basic request. You can experiment with the string in quotes above by replacing google.com with a different website that you'd like to visit. And you can also modify the / in the request to ask for a different page. For example, if I wanted to access an article on nytimes.com today, I would do the following:
    >>> import socket
    >>> s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    >>> s.connect(("nytimes.com",80))
    >>> s.send(b'GET /2020/11/18/health/pfizer-covid-vaccine.html HTTP/1.0\n\n')
    59

and then receive the response using the same lines as above. This request is actually giving me an error at the moment as I type this, which you can tell because the response says:

    'HTTP/1.1 500 Domain Not Found ...

But that's OK. HTTP is a complicated protocol and we should not stress if we cannot make it work perfectly by hand. The point of this experiment is to see how the back-and-forth of request and response works. With this response, you see a status code of 500, which is the part of the HTTP protocol that indicates there has been some kind of problem on the server's side.
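Since requests and responses are plain text, we can build and pick apart both with ordinary string operations. Here is a small sketch. The helper names (build_request, parse_status_line) are my own inventions. The Host header is something many web servers expect, and leaving it out is one possible reason for an error like the one above, but that is a guess on my part and not something I have confirmed for this site.

```python
def build_request(host, path="/"):
    """Return the bytes of a minimal GET request that also names the host we want."""
    lines = [
        "GET {} HTTP/1.0".format(path),
        "Host: {}".format(host),
        "",   # a blank line marks the end of the request headers
        "",
    ]
    # HTTP officially separates lines with \r\n; many servers also accept plain \n.
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(response):
    """Split the first line of a response into (version, status code, reason)."""
    first_line = response.split(b"\n", 1)[0].decode("ascii", "replace").strip()
    parts = first_line.split(" ", 2)
    reason = parts[2] if len(parts) > 2 else ""
    return parts[0], int(parts[1]), reason


if __name__ == "__main__":
    print(build_request("nytimes.com", "/2020/11/18/health/pfizer-covid-vaccine.html"))
    version, code, reason = parse_status_line(b"HTTP/1.1 500 Domain Not Found\r\n\r\n")
    # Codes in the 500s mean the server reported a problem; 200 means success.
    print(version, code, reason)
```

To try it against a real server, pass the result of build_request to s.send in place of the hand-typed request, and pass what s.recv returns to parse_status_line.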
Now that we have made some HTTP requests to a server somewhere on the internet, let's try to set up our own webserver and make requests to that. Python actually comes bundled with a very simple webserver library that implements HTTP, called SimpleHTTPServer (in Python 3, the same functionality lives in the http.server module). Using this, we can experiment with running our own very simple webserver locally on our computers.
First, make a new directory (folder) in Finder (or Explorer), create a new file in Atom, add some simple HTML code, and save it into this new folder as hello.html. You can use this basic HTML:

    <html>
    <body>
    <h1>Hello!</h1>
    <p>This is some basic HTML</p>
    </body>
    </html>

Now, cd in to this new directory, and type the following command:
    $ python -m SimpleHTTPServer 8000

(If your python command runs Python 3, type python -m http.server 8000 instead.) What this does is run a simple webserver, locally on your machine, using port 8000. In the terminal, you are now viewing a log of all the requests that the server receives, and messages about how it is handling responses to them. Open up a browser and try visiting this local webserver. The IP address 127.0.0.1 always refers to your own local computer, and remember that we are using port 8000, which you specify with a colon, so type this in to your browser: 127.0.0.1:8000. You should see something happening in the log, and your browser should show you a list of the contents of this directory. If you click on hello.html, you should now see your HTML file displayed.
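A browser is not the only possible client here. As a sketch, the following Python 3 program does the whole experiment in one go: it writes hello.html into a temporary folder, starts the same simple webserver in the background on a free port, requests the page, and shuts the server down again. The function name is made up for this example, and it assumes Python 3.7 or later.

```python
import functools
import http.client
import http.server
import os
import tempfile
import threading

HTML = "<html> <body> <h1>Hello!</h1> <p>This is some basic HTML</p> </body> </html>"


def fetch_from_temporary_server():
    """Serve a temporary folder on 127.0.0.1 and request hello.html from it."""
    with tempfile.TemporaryDirectory() as folder:
        with open(os.path.join(folder, "hello.html"), "w") as f:
            f.write(HTML)

        # Same handler that "python -m http.server" uses, pointed at our folder.
        handler = functools.partial(
            http.server.SimpleHTTPRequestHandler, directory=folder
        )
        server = http.server.HTTPServer(("127.0.0.1", 0), handler)  # 0 = any free port
        port = server.server_address[1]
        thread = threading.Thread(target=server.serve_forever)
        thread.start()
        try:
            # The client side: an HTTP request, made from Python instead of a browser.
            conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
            conn.request("GET", "/hello.html")
            response = conn.getresponse()
            status = response.status
            body = response.read().decode("utf-8")
            conn.close()
            return status, body
        finally:
            server.shutdown()
            thread.join()
            server.server_close()


if __name__ == "__main__":
    status, body = fetch_from_temporary_server()
    print(status)
    print(body)
```

When you run it, the server's request log appears in the terminal alongside the printed response, just as it does when you visit with a browser.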
When you are ready to exit out of this, you can type control-C to stop the webserver and get back to the command line. Try adding additional files and subfolders to the folder of this simple webserver and re-running the server. As you click around to HTML files in your browser, note how the URLs of these pages correspond to the folder and subfolder structure that you create. This is the principle behind how all websites and URLs are structured: the IP address and port refer to a computer program (a server) running on a computer somewhere, and the URL with all of its slashes (/) refers to folders and subfolders which that server program can see.
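To make that correspondence concrete, here is a rough sketch of the lookup a simple file-serving webserver performs. This is my own simplified illustration, not the actual code inside Python's server, and the function name and the folder name "site" are hypothetical.

```python
import os
import posixpath
from urllib.parse import unquote, urlsplit


def url_to_file_path(root, url):
    """Map the path part of a URL onto a file path inside the folder `root`."""
    # Keep only what comes after the host and port, e.g. "/projects/hello.html".
    path = unquote(urlsplit(url).path)
    # Split on slashes, dropping empty pieces and anything that would climb
    # out of the served folder.
    parts = [
        piece
        for piece in posixpath.normpath(path).split("/")
        if piece and piece not in (".", "..")
    ]
    return os.path.join(root, *parts)


if __name__ == "__main__":
    print(url_to_file_path("site", "http://127.0.0.1:8000/projects/week12/hello.html"))
    print(url_to_file_path("site", "http://127.0.0.1:8000/"))
```

Each slash-separated piece of the URL becomes one folder level below the folder the server was started in.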
OK, well that is nice and fun, but what if you want a webserver that does more than just return static HTML pages to a user? What if you want a webserver that dynamically generates content using all that you have learned about coding, and returns the output of a program to the web user?
For this, we can use a technique called CGI, which stands for common gateway interface. This is a technique where a server receives an HTTP request, but instead of simply returning an HTML file, the server executes a computer program, and the output of that program is returned as the HTTP response. It is similar to running a Python program on the command line, but instead of typing a command to execute, a user on the web requests a URL with their browser, which then invokes the program, and the browser then displays the response. In a way, it is also somewhat similar to all the work that we have been doing in Processing, except instead of interactive keyboard and mouse input, the only inputs come from the request, and instead of displaying visual graphic results, you can only return textual content in the web response.
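The heart of the idea fits in a few lines. This is only a sketch of the principle, not how any real server is written: it runs a tiny program as a separate process, captures everything the program prints, and splits that output into headers and body at the first blank line. The names SCRIPT and run_like_cgi are invented for this example.

```python
import subprocess
import sys

# A tiny stand-in for a CGI script: a header, a blank line, then the content.
SCRIPT = (
    'print("Content-Type: text/html")\n'
    'print("")\n'
    'print("<p>2 + 2 =", 2 + 2, "</p>")\n'
)


def run_like_cgi(script_source):
    """Run a program and treat whatever it prints as an HTTP response."""
    result = subprocess.run(
        [sys.executable, "-c", script_source],
        stdout=subprocess.PIPE,
        check=True,
    )
    output = result.stdout.decode("utf-8").replace("\r\n", "\n")
    # The first blank line separates the headers from the body.
    headers, _, body = output.partition("\n\n")
    return headers, body


if __name__ == "__main__":
    headers, body = run_like_cgi(SCRIPT)
    print("headers:", headers)
    print("body:", body)
```

A real CGI server also hands details of the request to the program, mostly through environment variables, before sending the program's output back to the browser.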
Let's experiment with this ...
Create a subfolder called cgi-bin. Within that subfolder, create a new file called server.py, and open that in Atom. In the terminal, cd to the folder that contains cgi-bin (the one you ran the webserver from above), and make sure that your server.py file is executable. You can check this by typing the following (remember, don't type the dollar signs):
    $ ls -l cgi-bin

It probably is not executable by default, so to make sure that it is, type the following command:

    $ chmod a+x cgi-bin/server.py

(This command is short for "change mode" and it allows you to change read, write, and execute permissions on files.)
The first line of this file should be the executable path to Python on your system. Type:

    $ which python
    /usr/local/bin/python

For me, that outputs /usr/local/bin/python, but for you it may be different. Whatever it says, copy that output, and paste it in as the first line of server.py, preceded by a pound sign and an exclamation point (#!), like this:

    #!/usr/local/bin/python

Now you can add any arbitrary Python code into this program. The first thing you must do to follow the HTTP protocol is return the following text:
server.py:

    print("Content-Type: text/html")
    print("")

After that, try adding some other Python commands. For example for now you could simply add:

    print("Hello, world!")
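Once that works, the output can come from any code you like. As a sketch, here is a complete server.py whose page changes on every request because it includes the current time. The render_page function is just a name I chose, and the first line should be whatever which python printed on your system.

```python
#!/usr/local/bin/python
import datetime


def render_page(now):
    """Build the HTML for the page, given the current date and time."""
    return (
        "<html> <body> "
        "<h1>Hello, world!</h1> "
        "<p>The time on the server is {}</p> "
        "</body> </html>"
    ).format(now.strftime("%Y-%m-%d %H:%M:%S"))


# The header and the blank line must come first, then the content.
print("Content-Type: text/html")
print("")
print(render_page(datetime.datetime.now()))
```

Reloading the page in the browser runs the program again, so the time shown updates with each request.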
Now try running an HTTP server in this directory. But we can't use the same one as before. We have to use a webserver that is capable of serving CGI scripts. Fortunately Python also comes with a library that does this. So run the following command:

    $ python -m CGIHTTPServer 8000

(On Python 3 versions that still include CGI support, the equivalent is python -m http.server --cgi 8000.) And now in your browser try visiting:

    127.0.0.1:8000/cgi-bin/server.py
In the homework for this week, I explain how you can create an HTML page with a form into which a user can enter values, which will then be made available as a dictionary of key-value pairs to your Python CGI script. This will allow your CGI script to treat the user values from the HTML form as inputs, and work with those to generate a dynamic response.
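As a preview of the idea, here is one way a CGI script could turn form input into a dictionary. It is a sketch under assumptions: it supposes the form is submitted with the GET method, in which case the values arrive in an environment variable called QUERY_STRING, and the field names name and color are made up. The homework may use a different mechanism, so treat this as an illustration only. It is written for Python 3.

```python
#!/usr/local/bin/python
import html
import os
from urllib.parse import parse_qs


def form_values(query_string):
    """Turn a query string like 'name=Ada&color=blue' into a plain dictionary."""
    parsed = parse_qs(query_string)
    # parse_qs gives a list of values per key; keep just the first of each.
    return {key: values[0] for key, values in parsed.items()}


# With a GET form, the server places everything after the "?" in QUERY_STRING.
values = form_values(os.environ.get("QUERY_STRING", ""))
name = values.get("name", "stranger")

print("Content-Type: text/html")
print("")
# html.escape keeps any < or > that the user typed from being treated as HTML.
print("<p>Hello, {}!</p>".format(html.escape(name)))
```

Visiting 127.0.0.1:8000/cgi-bin/server.py?name=Ada should then greet Ada, while visiting without the question mark part falls back to "stranger".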