Lab 3 - Shell Scripting
Overview
Many of the tasks that someone would like to perform on a computer are regular,
require repetition, or are menial or tedious to do by hand. Shell scripting
allows one to interact programmatically with a shell to do certain tasks. For
example, the command for scanning log files in the previous topic guide could
be automated to be performed on a schedule by means of a shell script. bash
scripts are an incredibly powerful tool for sysadmins to automate tasks that
are otherwise difficult to remember or long-running.
In cases where shell syntax is inappropriate for the task at hand, one can
instead call into programs written in other languages, such as Python, which
can read from stdin, process data, and write to stdout.
What’s Covered
bash
as a scripting language
python
for system administration
Scripting with the Bourne-Again Shell (Bash)
While most programmers are likely familiar with bash
in its popular capacity
as a command line interpreter, it is in fact a powerful and full-featured
programming language. Moreover, bash
has a uniquely qualified claim to the
title of scripting language in that programs written in bash
are simply
series of shell commands which bash
reads off and executes line-by-line. Or,
conversely, one might say that bash
command line entries are simply short
one-line scripts. So really, you’ve been bash
scripting all along!
Shebang!
Shell scripts typically begin with the shebang line:
#!path/to/interpreter
.
#!
is a human-readable representation of a magic number 0x23 0x21
which can tell the shell to pass execution of the rest of the file to a
specified interpreter. If your script is run as an executable (e.g.
./awesome_shell_script
) with a shebang line, then the shell will invoke the
executable (usually an interpreter) at path/to/interpreter
to run your
script. If your script is passed as an argument to an interpreter e.g. bash
awesome_shell_script
, then the shebang has no effect and bash
will handle
the script’s execution.
Why is this important? The shebang line can be considered a useful piece of
metadata which passes the concern of how a script is executed from the user
to the program’s author. awesome_shell_script
could be a bash
script, a
python
script, a ruby
script, etc. The idea is that only the script’s
behavior, not its implementation details, should matter to the user who calls
the script.
You may have seen some variant of #!/bin/sh
. Although initially referencing
the Bourne shell, on modern systems sh
has come to reference to the Shell
Command Language, which is a POSIX specification with many implementations.
sh
is usually symlinked to one of these POSIX-compliant shells which
implement the Shell Command Language. On Debian, for instance, sh
is
symlinked to the shell dash
. It is important to note that bash
does not
comply with this standard, although running bash
as bash --posix
makes it
more compliant.
Why is this important? If awesome_shell_script
uses bashisms (i.e.
non-POSIX bash-specific features) but includes a shebang line pointing to sh
,
then trying to run the script as an executable e.g. ./awesome_shell_script
will likely fail. So if you plan to use bashisms in your script, the shebang
line should point to bash
, not sh
. Note that this will sacrifice
portability, as only systems with bash
installed will be able to execute your
script. A list of common bashisms and specification differences between common
shells can be found here. The commonly installed checkbashisms
program
can help to identify bashisms.
In contexts other than the shebang line, #
indicates the beginning of a
comment. Everything to the right of a #
on a line will not be executed.
Shell Variables and Types
Like most other programming languages, bash
facilitates stateful assignment
of names to values as variables.
Variables can be assigned in bash
with the syntax: NAME=value
. Note the
lack of spaces between the assignment operator =
and its operands. Assignment
is whitespace-sensitive.
You can retrieve the value of a variable by prepending a $
to it’s name.
Getting the value of NAME
must be done with $NAME
. This is called variable
interpolation.
$ NAME = "Tux" # Incorrect
-bash: NAME: command not found
$ NAME="Tux" # Correct
$ echo NAME # Incorrect. We want the value we assigned to NAME, not the text
NAME itself.
NAME
$ echo $NAME # Correct
Tux
$?
holds the exit code of the most recently executed command. In this
context, exit code 0
generally means that a program has executed
successfully. Other exit codes refer to the nature of the error which
caused the program to fail.
Special positional parameters allow arguments to be passed into your script.
$0
is the name of the script, $1
is the first argument passed to the
script, $2
is the second argument passed to the script, $3
is the third
argument, etc. $#
gives the number of arguments passed to the script.
So ./awesome_shell_script foo bar
could access foo
from $1
and bar
from
$2
.
Bash variables are untyped. They are usually treated as text (strings), but a
variable can be treated as a number if it contains digits and arithmetic
operations are applied to it. Note that this is different from most programming
languages. Variables don’t have types themselves, but operators will treat
their values differently in different contexts. In other words, bash
variables are text and don’t have any inherent behaviors or properties beyond
that of text which can be manipulated, but operators will interpret this text
according to its content (digits or no digits?) and the context of the
expression.
Arithmetic
Bash supports integer arithmetic with the let
builtin.
$ x=1+1
$ echo $x # Incorrect. We wanted 2, not the text 1+1.
1+1
$ let x=1+1
$ echo $x # Correct
2
Note that let
is whitespace sensitive. Operands and operators must not be
separated by spaces.
bash
does not natively support floating point arithmetic, so we must rely on
external utilities if we want to deal with decimal numbers. A common choice for
this is bc
. Fun fact: bc
is actually it’s own complete language!
We commonly access bc
via a pipe (represented as |
), which allows the
output of one command to be used as the input for another. We include the -l
option for bc
in order to enable floating point arithmetic.
$ echo 1/2 | bc -l
.50000000000000000000
test
Bash scripts frequently use the [
(a synonym for test
) shell builtin for
the conditional evaluation of expressions. test
evaluates an expression and
exits with either status code 0
(true) or status code 1
(false).
test
supports the usual string and numeric operators, as well as a number of
additional binary and unary operators which don’t have direct analogs in most
other programming languages. You can see a list of these operators, along with
other useful information, by entering help test
in your shell. The output of
this is shown below.
$ help test
test: test [expr]
Exits with a status of 0 (true) or 1 (false) depending on
the evaluation of EXPR. Expressions may be unary or binary. Unary
expressions are often used to examine the status of a file. There
are string operators as well, and numeric comparison operators.
File operators:
-a FILE True if file exists.
-b FILE True if file is block special.
-c FILE True if file is character special.
-d FILE True if file is a directory.
-e FILE True if file exists.
-f FILE True if file exists and is a regular file.
-g FILE True if file is set-group-id.
-h FILE True if file is a symbolic link.
-L FILE True if file is a symbolic link.
-k FILE True if file has its `sticky' bit set.
-p FILE True if file is a named pipe.
-r FILE True if file is readable by you.
-s FILE True if file exists and is not empty.
-S FILE True if file is a socket.
-t FD True if FD is opened on a terminal.
-u FILE True if the file is set-user-id.
-w FILE True if the file is writable by you.
-x FILE True if the file is executable by you.
-O FILE True if the file is effectively owned by you.
-G FILE True if the file is effectively owned by your group.
-N FILE True if the file has been modified since it was last
read.
FILE1 -nt FILE2 True if file1 is newer than file2 (according to
modification date).
FILE1 -ot FILE2 True if file1 is older than file2.
FILE1 -ef FILE2 True if file1 is a hard link to file2.
String operators:
-z STRING True if string is empty.
-n STRING
STRING True if string is not empty.
STRING1 = STRING2
True if the strings are equal.
STRING1 != STRING2
True if the strings are not equal.
STRING1 < STRING2
True if STRING1 sorts before STRING2 lexicographically.
STRING1 > STRING2
True if STRING1 sorts after STRING2 lexicographically.
Other operators:
-o OPTION True if the shell option OPTION is enabled.
! EXPR True if expr is false.
EXPR1 -a EXPR2 True if both expr1 AND expr2 are true.
EXPR1 -o EXPR2 True if either expr1 OR expr2 is true.
arg1 OP arg2 Arithmetic tests. OP is one of -eq, -ne,
-lt, -le, -gt, or -ge.
Arithmetic binary operators return true if ARG1 is equal, not-equal,
less-than, less-than-or-equal, greater-than, or greater-than-or-equal
than ARG2.
We can test integer equality
$ [ 0 -eq 0 ]; echo $? # exit code 0 means true
0
$ [ 0 -eq 1 ]; echo $? # exit code 1 means false
1
string equality
$ [ zero = zero ]; echo $? # exit code 0 means true
0
$ [ zero = one ]; echo $? # exit code 1 means false
1
and a number of other string and numeric operations which you are free to
explore.
Flow Control
bash
includes control structures typical of most programming languages –
if-then-elif-else
, while
for-in
, etc. You can read more about
conditional statements and iteration in the Bash Guide for
Beginners from the Linux Documentation Project (LDP). You are encouraged to
read those sections, as this guide provides only a brief summary of some
important features.
if-then-elif-else
The general form of an if-statement in bash
is
if TEST-COMMANDS; then
CONSEQUENT-COMMANDS
elif MORE-TEST-COMMANDS; then
MORE-CONSEQUENT-COMMANDS
else
ALTERNATE-CONSEQUENT-COMMANDS;
fi
Indentation is good practice, but not required.
For example, if we write
#!/bin/bash
# contents of awesome_shell_script
if [ $1 -eq $2 ]; then
echo args are equal
else
echo args are not equal
fi
we see
$ ./awesome_shell_script 0 0
args are equal
$ ./awesome_shell_script 0 1
args are not equal
while
The general form of a while loop in bash
is
while TEST-COMMANDS; do
CONSEQUENT-COMMANDS
done
If TEST-COMMANDS
exits with status code 0
, CONSEQUENT-COMMANDS
will
execute. These steps will repeat until TEST-COMMANDS
exits with some nonzero
status.
For example, if we write
#!/bin/bash
# contents of awesome_shell_script
n=$1
while [ $n -gt 0 ]; do
echo $n
let n=$n-1
done
we see
$ ./awesome_shell_script 5
5
4
3
2
1
Functions
bash
supports functions, albeit in a crippled form relative to many other
languages. Some notable differences include:
- Functions dont return anything, they just produce output streams (e.g.
echo
to stdout)
bash
is strictly call-by-value. That is, only atomic values (strings) can
be passed into functions.
- Variables are not lexically scoped.
bash
uses a very simple system of local
scope which is close to dynamic scope.
bash
does not have first-class functions (i.e. no passing functions to
other functions), anonymous functions, or closures.
Functions in bash
are defined by
name_of_function() {
FUNCTION_BODY
}
and called by
name_of_function $arg1 $arg2 ... $argN
Note the lack of parameters in the function signature. Parameters in bash
functions are treated similarly to global positional parameters, with $1
containing the $arg1
, $2
containing $arg2
, etc.
For example, if we write
#!/bin/bash
# contents of awesome_shell_script
foo() {
echo hello $1
}
foo $1
we see
$ ./awesome_shell_script world
hello world
Examples
Despite bash
’s clumsiness, recursion and more complex programming logic are
possible (read: painful).
#!/bin/bash
# contents of fibonacci
if [ $# -eq 0 ]; then
echo "fibonacci needs an argument"
exit 1
fi
fib() {
N="$1"
if [ -z "${N##*[!0-9]*}" ]; then
echo "fibonacci only makes sense for nonnegative integers"
exit 1
fi
if [ "$N" -eq 0 ]; then
echo 0
elif [ "$N" -eq 1 ]; then
echo 1
else
echo $(($(fib $((N-2))) + $(fib $((N-1)))))
fi
}
fib "$1"
bash
can give us a recursive solution to finding the n
th Fibonacci number.
Python for Sysadmins
Although bash
scripts can be a simple and straightforward way to automate
tasks involving the sequential execution of some shell commands, you may have
already gathered that venturing beyond trivial conditional logic and simple
functions introduces unnecessary syntactic complexity as compared to many other
modern interpreted languages. For this reason, more complex scripts are
popularly written in another, more general, programming language like python
.
Scripting with python
is increasingly popular among sysadmins.
Countless great tutorials for learning python
are available online.
Alternatively, Berkeley offers the self-paced course CS 9H to students
with a programming background, or CS 61A as a python
-based introductory
course in computer science.
Adapting python
to command-line scripting is only a matter of using relevant
modules. Here are some tips:
- The
argparse
module in the python
standard library is a popular way
to implement command-line interfaces for python
scripts
fabric
simplifies some sysadmin tasks, mostly in regard to
application deployment
salt
is useful for general infrastructure management
psutil
provides an interface to system information monitoring
In practice, the decision to write a script in either python
or bash
is
largely dependent on the context of the task at hand. Generally, tasks solvable
with simple shell commands and those requiring simple file reading, writing,
and appending are often a good fit for a bash
script. Those with complex
control logic, recursion, and other more general programming patterns are a
better fit for a python
script.
Scripting Lab Assignment
You’ll be completing a classic first shell scripting assignment: make a
phonebook.
Write a shell script phonebook
which has the following behavior:
./phonebook new name number
adds an entry to the phonebook. and
-
./phonebook name
displays the name and phone number associated with that
name.
-
./phonebook list
displays every entry in the phonebook (in no particular
order). If the phonebook has no entries, display phonebook is empty
-
./phonebook remove name
deletes the entry associated with that name.
./phonebook clear
deletes the entire phonebook.
For example,
$ ./phonebook new Linus Torvalds 101-110-0111
$ ./phonebook Linus Torvalds
Linus Torvalds 101-110-1010
$ ./phonebook new Tux Penguin 555-666-7777
$ ./phonebook list
Linus Torvalds 101-110-1010
Tux Penguin 555-666-7777
$ ./phonebook remove Linus Torvalds
$ ./phonebook list
Tux Penguin 555-666-7777
$ ./phonebook clear
$ ./phonebook list
phonebook is empty
Here’s the kicker: You have to implement this same functionality in both
bash
and python
. This is to help illuminate the strengths and weaknesses
of each language in the context of writing a simple CLI application.
Skeleton code for both bash
and python
formats can be found at
https://github.com/0xcf/decal-labs/tree/master/b3/skeleton.
Some tips to make things easier:
bash
has an append operator >>
which, as you might guess, appends its
second operand to the file passed in as the first operand.
$ cat foobar.txt
foobar
$ echo "hello, reader" >> foobar.txt
$ cat foobar.txt
foobar
hello, reader
- Remember that you can simply write to and read from a file to persist data
- Recall that
bash
exposes its command line arguments through the
$<integer>
positional parameters
#!/bin/bash
# contents of argscript.sh
echo "$1"
echo "$2"
$ ./argscript.sh foo bar
foo
bar
- In
bash
, single quotes ''
preserve the literal value of the characters
they enclose. Double quotes ""
preserve the literal value of all characters
except for $
, backticks ``, and the backslash \\
. The most important
implication of this is that double quotes allow for variable interpolation,
while single quotes do not. You can think of single quotes and the stronger
“escape everything” syntax while double quotes are the more lax “escape most
things” syntax.
$ echo `$LANG`
$LANG
$ echo "$LANG"
en_US.UTF-8
- In
python
, you can interact with command-line arguments through the
sys.argv
list:
# !/usr/bin/python
# contents of argscript.py
import sys print(sys.argv[1]) print(sys.argv[2])
$ ./argscript.py foo bar
foo
bar
python
lets you manipulate files with the open
function, commonly used
with the with
control structure:
# !/usr/bin/python
# contents of fileman.py
with open('./newfile.txt', 'w') as f:
f.write("hello from python\n")
$ python fileman.py
$ cat newfile.txt
hello from python
- Here you can read more about
bash
redirections
- Here you can see
python
used in a system administration context to
implement the checkoff script used for this class