Lab 6 - Web Servers | OCF Sysadmin DeCal

Overview #

Networking is key to many services because it allows processes and computers to communicate with each other. In this lab, we’ll explore different networked services with an emphasis on web services.

Make sure that you are doing all of these steps on your VM.

Which networked services are already running? #

There are multiple networked services running on your VM right now. To output the networked services running on the VM use netstat (install using apt).

Question 1a: What command did you use to display the networked services?
Question 1b: Paste the output of the command.
Question 1c: Choose one service from the output and describe what it does.

DNS #

In this section we are going to be setting up our own DNS server! Remember that DNS is the system that maps from a domain like ocf.berkeley.edu to an IP like 169.229.226.23 (and 2607:f140:8801::1:23 for IPv6) so that computers know how to send information over the network to servers without people having to remember a bunch of numbers to connect to everything.

First, install the bind9 package on your VM to set up a DNS server. Uninstall dnsmasq if it’s previously installed on your VM by sudo apt purge dnsmasq.

Let’s check the status of the service using systemctl. What command can you run to do this?

In the output of the systemctl command, you should see that the bind9 service is already running. Let’s bring it down temporarily so we can investigate: systemctl stop bind9

The service should have a unit file at /lib/systemd/system/named.service or /lib/systemd/system/bind9.service. If you print that file (with cat or systemctl cat bind9), you should see something like this:

[Unit]
Description=BIND Domain Name Server
Documentation=man:named(8)
After=network.target
Wants=nss-lookup.target
Before=nss-lookup.target

[Service]
EnvironmentFile=-/etc/default/named
ExecStart=/usr/sbin/named -f $OPTIONS
ExecReload=/usr/sbin/rndc reload
ExecStop=/usr/sbin/rndc stop

[Install]
WantedBy=multi-user.target
Alias=bind9.service

This should look somewhat familiar to you after the lecture on networking! Don’t worry if it doesn’t all look familiar since there are some options you haven’t seen yet in here, but you should at least recognize some of the options used.

If you now run dig ocf.berkeley.edu @localhost from your VM, you should see that the command eventually times out after trying to run for about 15 seconds. This is because it is trying to send DNS requests to your VM, but the DNS server is not actually running yet so it doesn’t get a response. However, if @localhost is left off the end of the command, it succeeds. Why is this the case? What DNS server are requests currently being sent to if @localhost is not specified in the command?

Try starting the DNS server using the relevant systemctl command. If you check the status of the bind9 service after starting it, you should see the status has changed to say that the service is active and running.

If you now run dig ocf.berkeley.edu @localhost from your VM, you should now see a response containing the correct IP (169.229.226.23)!

Now to the exciting part, the configuration! Edit /etc/bind/named.conf.local with your favorite text editor (add sudo if you don’t have write permission). Inside this file, it should be empty apart from a few comments at the top because you haven’t done any local configuration yet. Add a new zone in this file for example.com with these contents:

zone "example.com" {
  type master;
  file "/etc/bind/db.example.com";
};

Then, create a file /etc/bind/db.example.com to contain the responses to give if anyone sends requests to your DNS server for example.com. The easiest way to do this is generally to copy an existing config and then make changes from there to get what you want for your config instead of having to start from scratch.

To make this easier, we’ve provided a valid config in decal-labs that you can copy in place at /etc/bind/db.example.com.

Let’s start by adding a record for a subdomain named test.example.com.

Add the line below to /etc/bind/db.example.com.

test IN A 93.184.216.34

Make sure to reload the bind9 service after changing anything in /etc/bind9, since you want the running service to change its configuration.

systemctl reload bind9

Now run the commands below. For the first command you should see the result for example.com which should be 1.2.3.4. For the second command you should see 93.184.216.34 as the result.

dig @localhost example.com

dig @localhost test.example.com

Please add few more records of your choice. Try to add one A record, and a couple of other types of records (CNAME, SRV, TXT, etc.). Make sure to reload the bind9 after!

Question 2a: What is the systemctl command to show whether bind9 is running or not?
Question 2b: Why does the dig command (dig ocf.berkeley.edu) work if @localhost is not present at the end (if bind9 is not started) but times out when @localhost is added?
Question 2c: What additional entries did you add to your DNS server config (the db.example.com file)?
Question 2d: What commands did you use to make requests to the local DNS server for your additional entries?

Load Balancing #

For this section we will be using HAProxy, a commonly-used open-source load balancer. NGINX is actually starting to become a load balancer alongside being a web server, which is pretty interesting, but HAProxy is still commonly used.

You can install HAProxy using sudo apt install haproxy.

First, grab the python file for the service you will be running from the decal-labs repo using wget or something similar to download it. You’ll likely also need to install tornado using sudo apt install python3-tornado.

Next, run the script using python3 server.py.

After running python3 server.py, the script will start up 6 different HTTP server workers listening on ports 8080 to 8085 (inclusive). Each worker returns different content to make it clear which one you are talking to for this lab (“Hello, I am ID 0” for instance), but in real usage they would generally all return the same content. You would still want something to distinguish between them (maybe a HTTP header saying which host or instance they are?), but only for debugging purposes, not like in this lab where they have actually differing content.

You can test out each worker if you’d like by making a request (e.g. using cURL) individually to each server (http://localhost:8080 to http://localhost:8085) on your VM.

The idea behind using a load balancer is that requests will be spread out among instances so that if a lot of requests are coming in all at once, they will not overload any one instance. Another very useful feature is that if one of the instances happens to crash or become unavailable for whatever reason, another working server will be used instead. This requires some kind of health checks to be implemented to decide whether a server is healthy or not.

Your job is to do the configuration to get it to work with the services you are given! The main config file is at /etc/haproxy/haproxy.cfg and you should only have to append to the end of this file to finish this lab. One snippet is provided here for you to add to the config already, this will give you a nice status page that you can use to see which of the servers is up or down:

listen stats
  bind    0.0.0.0:7001
  bind    [::]:7001
  mode    http
  stats   enable
  stats   hide-version
  stats   uri /stats

After adding this, if you restart the haproxy service and open http://[yourusername].decal.ocfhosted.com:7001/stats in a web browser, you should see a page with a table and some statistics information on HAProxy (pid, sessions, bytes transferred, uptime, etc.).

Part 1: Configuration #

Your goal is to add a backend and frontend to haproxy’s config that proxies to all of the running workers on the ports from 8080 to 8085 and listens on port 7000 on your VM, so that if you go to http://[yourusername].decal.ocfhosted.com:7000 you can see the responses from the workers. Try refreshing, what do you notice happening? Do you notice a pattern? What load balancing algorithm are you using from your observations? What config did you add to the haproxy config file to get this to work? Try changing the algorithm and see what happens to your results!

Part 2: Health Checks #

Now, after adding all the servers to the backend in the config, add health checks for each of them. If you refresh the stats page, what do you notice has changed? What color are each of the servers in your backend?

Some hints for Parts 1-2 #

You shouldn’t need to change the current contents of haproxy.cfg; you’ll just need to append additional lines to the bottom of the file.
You’ll need to add two sections, one for frontend and one for backend. Take a look at the Frontend and Backend sections of The Four Essential Sections of an HAProxy Configuration to learn more about the syntax and options available!
You can label your frontend and backend sections however you wish.
You should need to append about 10-15 lines to the config file.
Make sure you include a line to listen for IPv6 requests in addition to IPv4 (bind [::]:<port number>)
If you’d like more hints, feel free to ask on #decal-general!

Part 3: Crashing #

If you make a request to http://[yourusername].decal.ocfhosted.com:7000/crash, it will crash the worker that you connect to. What changes in the HAProxy stats page? (Try refreshing a few times, the health checks can take a couple seconds to update the status from UP -> DOWN) If you make a lot of requests to http://[yourusername].decal.ocfhosted.com:7000 again, are all the servers present in the IDs that are returned in your requests or not? Try crashing a particular worker by running curl localhost:<port>/crash on your VM, substituting the port with one of the workers that is still up on your instance. What happens on the HAProxy stats page? If you crash all the workers, what status code does HAProxy return to you when you make a request to the service?

Question 3a: Do you notice any pattern when you refresh the page multiple times?
Question 3b: What load balancing algorithm are you using?
Question 3c: What did you add to the haproxy config? (just copy and paste the lines you added to the bottom into here)
Question 3d: What do you notice has changed on the stats page after adding health checks? What color are each of the servers in the backend now?
Question 3e: What changes in the stats page when you crash a worker? What happened to the pattern from before?
Question 3f: What HTTP status code (or error message) does HAProxy return if you crash all of the workers?

Remember to submit your answers on Gradescope!