Lab 6 - Web Servers
Facilitator: aaronzheng, bplate
Overview
Networking is key to many services because it allows processes and computers to communicate with each other. In this lab, we’ll explore different networked services with an emphasis on web services.
Make sure that you are doing all of these steps on your VM.
Which networked services are already running?
There are multiple networked services running on your VM right now. To list the
networked services running on the VM, use netstat (install it with apt if it
isn't already there).
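For example, one possible invocation (on Debian-based systems netstat comes from the net-tools package; the flags below just ask for listening TCP/UDP sockets with numeric ports and the owning process):
sudo apt install net-tools   # package that provides netstat
sudo netstat -tulpn          # listening TCP/UDP sockets, numeric ports, owning process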
- Question 1a: What command did you use to display the networked services?
- Question 1b: Paste the output of the command.
- Question 1c: Choose one service from the output and describe what it does.
DNS
In this section we are going to be setting up our own DNS server! Remember that
DNS is the system that maps from a domain like ocf.berkeley.edu to an IP like
169.229.226.23 (and 2607:f140:8801::1:23 for IPv6), so that computers know how
to send information over the network to servers without people having to
remember a bunch of numbers to connect to everything.
First, install the bind9 package on your VM to set up a DNS server. If dnsmasq
is already installed on your VM, uninstall it first with sudo apt purge dnsmasq.
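For reference, the whole setup is just a couple of apt commands, something like:
sudo apt purge dnsmasq    # only needed if dnsmasq was previously installed
sudo apt install bind9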
Let’s check the status of the service using systemctl. What command can you run
to do this?
In the output of the systemctl command, you should see that the bind9 service
is already running. Let’s bring it down temporarily so we can investigate:
systemctl stop bind9
The service should have a unit file at /lib/systemd/system/named.service or
/lib/systemd/system/bind9.service. If you print that file (with cat or
systemctl cat bind9), you should see something like this:
[Unit]
Description=BIND Domain Name Server
Documentation=man:named(8)
After=network.target
Wants=nss-lookup.target
Before=nss-lookup.target
[Service]
EnvironmentFile=-/etc/default/named
ExecStart=/usr/sbin/named -f $OPTIONS
ExecReload=/usr/sbin/rndc reload
ExecStop=/usr/sbin/rndc stop
[Install]
WantedBy=multi-user.target
Alias=bind9.service
This should look somewhat familiar to you after the lecture on networking! Don’t worry if it doesn’t all make sense, since there are some options in here that you haven’t seen yet, but you should at least recognize some of the options used.
If you now run dig ocf.berkeley.edu @localhost from your VM, you should see
that the command eventually times out after trying to run for about 15 seconds.
This is because it is trying to send DNS requests to your VM, but the DNS
server is not actually running yet, so it doesn’t get a response. However, if
@localhost is left off the end of the command, it succeeds. Why is this the
case? What DNS server are requests currently being sent to if @localhost is not
specified in the command?
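One way to start investigating is to check which resolver your VM falls back to when @localhost isn’t given; on most systems that’s listed in /etc/resolv.conf:
cat /etc/resolv.conf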
Try starting the DNS server using the relevant systemctl command. If you check
the status of the bind9 service after starting it, you should see the status
has changed to say that the service is active and running.
If you now run dig ocf.berkeley.edu @localhost from your VM, you should see a
response containing the correct IP (169.229.226.23)!
Now for the exciting part: the configuration! Edit /etc/bind/named.conf.local
with your favorite text editor (add sudo if you don’t have write permission).
This file should be empty apart from a few comments at the top, since you
haven’t done any local configuration yet. Add a new zone in this file for
example.com with these contents:
zone "example.com" {
type master;
file "/etc/bind/db.example.com";
};
Then, create a file /etc/bind/db.example.com to contain the responses to give
if anyone sends requests to your DNS server for example.com. The easiest way to
do this is generally to copy an existing config and modify it, rather than
starting from scratch. To make this easier, we’ve provided a valid config in
decal-labs that you can copy into place at /etc/bind/db.example.com.
Let’s start by adding a record for a subdomain named test.example.com. Add the
line below to /etc/bind/db.example.com.
test IN A 93.184.216.34
Make sure to reload the bind9 service after changing anything in /etc/bind,
since you want the running service to pick up the new configuration.
systemctl reload bind9
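If a reload ever fails or doesn’t seem to take effect, bind9 ships with checkers you can run against your files first (the paths here assume the filenames used above):
sudo named-checkconf /etc/bind/named.conf
sudo named-checkzone example.com /etc/bind/db.example.com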
Now run the commands below. For the first command, you should see the result
for example.com, which should be 1.2.3.4. For the second command, you should
see 93.184.216.34 as the result.
dig @localhost example.com
dig @localhost test.example.com
Please add a few more records of your choice. Try to add one A record and a
couple of other types of records (CNAME, SRV, TXT, etc.). Make sure to reload
the bind9 service afterwards!
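As a rough sketch of what extra records could look like (the names and values here are made up; use whatever you like):
www    IN A      203.0.113.10          ; another A record
blog   IN CNAME  www                   ; alias pointing at www.example.com
notes  IN TXT    "any text you want"   ; free-form text record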
- Question 2a: What is the systemctl command to show whether bind9 is running or not?
- Question 2b: Why does the dig command (dig ocf.berkeley.edu) work if @localhost is not present at the end (if bind9 is not started) but times out when @localhost is added?
- Question 2c: What additional entries did you add to your DNS server config (the db.example.com file)?
- Question 2d: What commands did you use to make requests to the local DNS server for your additional entries?
Load Balancing
For this section we will be using HAProxy, a commonly used open-source load
balancer. NGINX is also starting to be used as a load balancer alongside being
a web server, which is pretty interesting, but HAProxy is still commonly used.
You can install HAProxy using sudo apt install haproxy.
First, grab the Python file for the service you will be running from the
decal-labs repo, using wget or something similar to download it. You’ll likely
also need to install tornado using sudo apt install python3-tornado.
When run (python3 server.py), this script will start up 6 different HTTP server
workers listening on ports 8080 to 8085 (inclusive). Each worker returns
different content to make it clear which one you are talking to for this lab
(“Hello, I am ID 0”, for instance), but in real usage they would generally all
return the same content. You would still want something to distinguish between
them (maybe an HTTP header saying which host or instance served the request),
but only for debugging purposes, not actually differing page content like in
this lab.
You can test out each worker if you’d like by forwarding each of the ports (from 8080-8085) and making a request individually to each server. If you need a reminder on how to port forward from your VM, visit lab 4.
The idea behind using a load balancer is that requests will be spread out among instances so that if a lot of requests come in all at once, they will not overload any one instance. Another very useful feature is that if one of the instances happens to crash or become unavailable for whatever reason, another working server will be used instead. This requires some kind of health check so the load balancer can decide whether each server is healthy or not.
Your job is to do the configuration to get HAProxy working with the services
you are given! The main config file is at /etc/haproxy/haproxy.cfg, and you
should only have to append to the end of this file to finish this lab. One
snippet is provided here for you to add to the config already; this will give
you a nice status page that you can use to see which of the servers is up or
down:
listen stats
bind 0.0.0.0:7001
mode http
stats enable
stats hide-version
stats uri /stats
Make sure to forward ports 7000 and 7001 to allow access to your load balancer server and the stats page from your local machine.
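If you’re using SSH port forwarding (as in lab 4), it might look something like this, with the username and hostname as placeholders:
ssh -L 7000:localhost:7000 -L 7001:localhost:7001 <user>@<your-vm>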
After adding this, if you restart the haproxy service and open
http://localhost:7001/stats in a web browser, you should see a page with a
table and some statistics information on HAProxy (pid, sessions, bytes
transferred, uptime, etc.).
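The restart is the usual systemctl invocation, and HAProxy can also validate the config file for you before you restart:
sudo haproxy -c -f /etc/haproxy/haproxy.cfg   # check the config for syntax errors
sudo systemctl restart haproxy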
Part 1: Configuration
Your goal is to add a backend and frontend to HAProxy’s config that proxies to
all of the running workers on the ports from 8080 to 8085 and listens on port
7000 on your VM, so that if you go to http://localhost:7000 you can see the
responses from the workers. Try refreshing; what do you notice happening? Do
you notice a pattern? Based on your observations, what load balancing algorithm
are you using? What config did you add to the HAProxy config file to get this
to work? Try changing the algorithm and see what happens to your results!
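One quick way to send a bunch of requests and watch the pattern of worker IDs is a small shell loop (a sketch; adjust as you like):
for i in $(seq 1 12); do curl -s http://localhost:7000; echo; done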
Part 2: Health Checks
Now, after adding all the servers to the backend in the config, add health checks for each of them. If you refresh the stats page, what do you notice has changed? What color are each of the servers in your backend?
Some hints for Parts 1-2
- You shouldn’t need to change the current contents of haproxy.cfg; you’ll just need to append additional lines to the bottom of the file.
- You’ll need to add two sections, one for frontend and one for backend. Take a look at the Frontend and Backend sections of The Four Essential Sections of an HAProxy Configuration to learn more about the syntax and options available!
- You can label your frontend and backend sections however you wish.
- You should need to append about 10-15 lines to the config file.
- If you’d like more hints, feel free to ask on #decal-general!
Part 3: Crashing
If you make a request to http://localhost:7000/crash, it will crash the worker
that you connect to. What changes in the HAProxy stats page? (Try refreshing a
few times; the health checks can take a couple of seconds to update the status
from UP -> DOWN.) If you make a lot of requests to http://localhost:7000 again,
are all the servers present in the IDs that are returned in your requests or
not? Try crashing a particular worker by running curl localhost:<port>/crash,
substituting the port with one of the workers that is still up on your
instance. What happens on the HAProxy stats page? If you crash all the workers,
what status code does HAProxy return to you when you make a request to the
service?
- Question 3a: Do you notice any pattern when you refresh the page multiple times?
- Question 3b: What load balancing algorithm are you using?
- Question 3c: What did you add to the haproxy config? (just copy and paste the lines you added to the bottom into here)
- Question 3d: What do you notice has changed on the stats page after adding health checks? What color is each of the servers in the backend now?
- Question 3e: What changes in the stats page when you crash a worker? What happened to the pattern from before?
- Question 3f: What HTTP status code (or error message) does HAProxy return if you crash all of the workers?
Remember to submit your answers on Gradescope!