Lab 2 - Shell Commands and Pipes
UNIX machines usually have a common set of commands pre-installed. While you could do this lab on a Mac or Linux machine using the Terminal, we recommend using SSH to do them on OCF machines. Follow the instructions at ocf.io/docs/services/shell to get started on the OCF login server.
To start this lab, copy and paste this segment into the terminal (this will download the lab2 folder, fill it with some practice files, and cd into the folder):
wget -qO- https://decal.ocf.berkeley.edu/labs/lab2/lab2.tgz | tar xz && cd lab2
Navigating the filesystem
Files and Folders
To start off, type the tree command:
…and press enter. This will draw a diagram of the entire file/folder hierarchy of the folder you’re in.
Now type:
This will show you the files in the current directory. To see the files in another folder, you can use the cd
command (change directory). To go into the documents folder, type:
Then type ls
again to see the files in that folder.
To visit the folder containing your current folder (the parent folder), type:
Try using ls
and cd
to explore this lab, and answer these questions:
- How many files are in the
code
folder?
- What happens when you try to use
cd
on a file (as opposed to a folder)?
- What happens when you type
cd
by itself, without any arguments?
- Go to the lab2 folder and type
ls -A
. Do you see anything different?
Reading files
You can use the cat
command to view the contents of files. Go into the lab2
folder and type cat README
. It should output a friendly message. Now answer these questions:
- What happens when you run “cat binary_file” from inside the lab2 folder?
- What happens when you run cat on a folder (as opposed to a file)?
- Can you find the hidden file? What are its contents?
Other file manipulation
To copy a file, use the cp
command. For example, to make a copy of the README
file:
Then type cat newfile
to see that it has the same contents as README
.
Other commands:
mv newfile otherfile # Works like cp, but moves files instead of copying them
mkdir myfolder # Makes a folder (the command name is short for “make directory”)
touch myfile # Makes an empty file
rm myfile # Deletes a file (WARNING: this cannot be undone!)
Now answer these questions:
- Try copying the
code
folder into a folder called newcode
. Does it work? How can you make it work? If you aren’t sure, try doing a Google search to learn how to copy a folder.
- Make a folder called
myfolder
and try to delete it with rm
. Does it work? Again, try to figure out how to delete it.
Command flags
After answering the above questions, you’ve used some extra options in commands. Most commands have dozens of options that can modify their behavior.
To list all files in a directory, but with extra information for each file, type:
To learn about all options a command has, type man <command>
, replacing <command>
with the command you’re interested in.
Questions:
- How do you list all files in a directory, with extra information, AND showing hidden files? Write the command.
- At the start of this lab, you used the
tree
command to draw a diagram of the filesystem hierarchy. How can you show hidden files with the tree command? If you’re not sure, read man tree
.
Editing files
When you’re SSH’d into a remote server, you can’t use IDEs you’re familiar with* (IntelliJ, Eclipse, Xcode, etc). Instead, you’ll have to use text editors that work in the terminal. Vim and Emacs are popular options to use. If you’re interested in learning vim, run the “vimtutor” command. If you don’t want to learn either, you can use the “nano” editor, which is easy to use.
Use nano to edit the README file and add your own message:
* There are ways to get around this– you can use the sshfs command to mount a remote computer as a filesystem, or you can use SSH functionality in an IDE. But these methods can be time-consuming and unwieldy, so it’s important to learn how to use command-line text editors.
Other useful commands
less
Displays a file through a “pager.” This makes it easy to read through long files. For example, if you run cat logs/apache-log.txt
, the output is so long, you might have to wait a few seconds to get through all of it! Instead, you can use less
to scroll through the file easily:
Some tips for navigating in less:
- g goes to the top of the file, G goes to the bottom.
- Type / (forward slash) and then text to search for something. Use n and N to scroll through all matches.
- Use k and j to scroll up and down by small amounts.
- Use PgUp and PgDown to scroll up and down by large amounts.
- Press q to quit
Questions:
- What is the date and time of the first entry in apache-log.txt? The last entry?
- What is the date and time of the first log entry related to “archlinux”?
grep
Can be used to search through files. As an example, look in the logs/irc-ubuntu
folder, which contains logs of an IRC chatroom that provides support for Ubuntu users. Each file contains the messages sent for a particular day of the month (this sample was taken from the logs during August 2017).
To get a list of all messages containing the word “thank” on the logs for August 18, 2017:
If you don’t care about case-sensitivity (in this case, we don’t), you can use the -i
option to ignore case:
Questions:
- The documents/people file contains a list of (fake) people’s names, gender, and addresses. Use grep to find out which people live on “Agawam Street.”
head
and tail
Gets the beginning or end of a file. To get the 20 most recent log entries:
tail -n 20 logs/apache-log.txt
To get the 20 first log entries:
head -n 20 logs/apache-log.txt
wc
Stands for “word count.” It takes in a list of files and displays the number of lines, words, and bytes in each file.
The wc
command (as well as most commands in this lab) takes wildcards for processing multiple files at once. To get a word count of all the Ubuntu IRC logs:
Questions:
- Usually you only care about line counts, and the other two numbers just make things confusing. Read the manpage for wc. How can you only get the line count, without any other numbers? Write the command to get just line counts for all the Ubuntu logs.
curl
Displays a file from the internet.
curl https://www.ocf.berkeley.edu # HTML for the OCF’s website
curl https://api.github.com/users/ocf # Query Github’s JSON API
curl ifconfig.co # Figure out what your public IP address is
curl https://irclogs.ubuntu.com/2017/07/09/%23ubuntu.txt # #ubuntu IRC logs for July 9 2017
wget
Downloads a file from the internet.
wget https://irclogs.ubuntu.com/2017/07/09/%23ubuntu.txt # download IRC logs
Questions:
- What is the difference between
curl
and wget
? When might you use one over the other?
- Run
curl google.com
. Then run wget google.com
and read the downloaded file. Are there any extra differences you see?
Pipes and Redirection
Now that you’ve learned the basic commands (if you forget them, you can find a handy reference at cheatsheetworld.com/programming/unix-linux-cheat-sheet/), you’re ready to learn about what makes shells so powerful: being able to combine commands using pipes.
In the past examples, you were supplying a file argument to the commands. However, if you omit the file arguments, you can “pipe” the output from one command into another. The pipe character is |
(to the right of the ] key and above the Enter key). Try running these examples to get a feel for how pipes work: |
Print the contents of a file normally:
Pipe the output of cat README
into wc:
Same as above, but only print line counts:
Show the number of times someone said “thank” on IRC on August 3 2017:
grep -i thank logs/irc-ubuntu/03.txt | wc -l
Get a word count of War and Peace, without downloading a file to disk:
curl https://www.gutenberg.org/files/2600/2600-0.txt | wc -m
Find all occurrences of the word “elementary” in The Adventures of Sherlock Holmes:
curl https://www.gutenberg.org/cache/epub/1661/pg1661.txt | grep -i elementary
Questions:
- What command would you run to get the number of times someone said thanks on IRC, but on all days in August?
- The documents/vehicles file contains a list of cars and their owners. How many vehicles listed are of make “Toyota”? What command(s) did you use to figure this out?
- How many log entries related to
Ubuntu
are in the apache logfile? What about Kali linux?
- Scenario: you remember being in the Ubuntu chatroom when a user named “tomreyn” posted a useful link to a Github issue (you remember that the issue number was 16, but you don’t remember the repository name). Can you find what you’re looking for in the logs?