Lab 3 - Shell Scripting

Overview

Many of the tasks that someone would like to perform on a computer are regular, require repetition, and are menial or tedious to do by hand. Shell scripting allows one to interact programmatically with a shell to do certain tasks. For example, the command for scanning log files in the previous topic guide could be automated to be performed on a schedule by means of a shell script. bash scripts are an incredibly powerful tool for sysadmins to automate tasks that are otherwise difficult to remember or long-running.

In cases where shell syntax is inappropriate for the task at hand, one can instead call into programs written in other languages, such as Python, which can read from stdin, process data, and write to stdout.

Scripting with the Bourne-Again Shell (Bash)

While most programmers are likely familiar with bash in its popular capacity as a command line interpreter, it is in fact a powerful and full-featured programming language. Moreover, bash has a uniquely qualified claim to the title of scripting language in that programs written in bash are simply series of shell commands which bash reads off and executes line-by-line. Or, conversely, one might say that bash command line entries are simply short one-line scripts. So really, you’ve been bash scripting all along!

Shebang!

Shell scripts typically begin with the shebang line: #!pathtointerpreter.

#! is a human-readable representation of the magic number 0x23 0x21 used by the operating system to determine whether a file is a script or an executable binary. If your script is run as an executable e.g. ./awesome_shell_script, then the operating system will invoke the executable (usually an interpreter) at pathtointerpreter to run your script. If your script is passed as an argument to an interpreter e.g. bash awesome_shell_script, then the shebang has no effect on the program’s execution and bash will handle the script’s execution.

Why is this important? The shebang line can be considered a useful piece of metadata which passes the concern of how a script is executed from the user to the program’s author. awesome_shell_script could be a bash script, a Python script, a C shell script, etc. Only the script’s behavior, not its implementation details, should matter to the user who calls the script.

You may have seen some variant of #!/bin/sh. Although initially referencing the Bourne shell, on modern systems sh has come to reference to the Shell Command Language, which is a POSIX specification with many implementations including ash, dash, ksh88, etc. sh is usually symlinked to one of these POSIX-compliant shells. On Debian, for instance, sh is symlinked to dash. It is important to note that bash does not comply with this standard, although running bash as bash --posix makes it more compliant.

Why is this important? If awesome_shell_script uses bashisms (i.e. non-POSIX bash-specific features) but includes a shebang line pointing to sh, then trying to run the script as an executable e.g. ./awesome_shell_script will likely fail. So if you plan to use bashisms in your script, the shebang line should point to bash, not sh. Note that this will sacrifice portability in that only systems with bash installed will be able to execute your script.

Note that in contexts other than the shebang line, # indicates the beginning of a comment. Everything to the right of a # on a line will not be executed.

Shell Variables and Types

It is often necessary to store some piece of information for later reuse. In bash (and most other programming languages) this idea is supported via variables.

Variables can be set in bash with the syntax: NAME=value. Note the lack of spaces between the assignment operator = and its operands. Assignment is whitespace-sensitive.

You can retrieve the value of a variable by prepending a $ to it’s name. Getting the value of NAME must be done with $NAME

$ NAME = "Tux" # Incorrect
-bash: NAME: command not found 
$ NAME="Tux" # Correct
$ echo NAME # Incorrect. We want the value we assigned to NAME, not the text NAME itself.
NAME
$ echo $NAME # Correct
Tux

Special positional parameters allow arguments to be passed into your script. $0 is the name of the script, $1 is the first argument passed to the script, $2 is the second argument passed to the script, $3 is the third argument, etc. $# gives the number of arguments passed to the script.

So ./awesome_shell_script foo bar could access foo from $1 and bar from $2.

Bash variables are untyped. They are usually treated as text (strings), but a variable can be treated as a number if it contains digits and arithmetic operations are applied to it. Note that this is different from most programming languages. Variables don’t have types themselves, but operators will treat their values differently in different contexts. In other words, bash variables are text and don’t have any inherent behaviors or properties beyond that of text which can be manipulated, but operators will interpret this text according to its content (digits or no digits?) and the context of the expression.

Arithmetic

Bash supports integer arithmetic with the let builtin.

$ x=1+1
$ echo $x # Incorrect. We wanted 2, not the text 1+1.
1+1
$ let x=1+1
$ echo $x # Correct
2

Note that let is whitespace sensitive. Operands and operators must not be separated by spaces.

bash does not natively support floating point arithmetic, so we must rely on external utilities if we want to deal with decimal numbers. A common choice for this is bc. Fun fact: bc is actually it’s own complete language!

We commonly access bc via a pipe (represented as |), which allows the output of one command to be used as the input for another. We include the -l option for bc in order to enable floating point arithmetic.

$ echo 1/2 | bc -l
.50000000000000000000

[ ]

Bash scripts frequently use the [ (a synonym for test) shell builtin for the conditional evaluation of expressions. test evaluates an expression and exits with either status code 0 (true) or status code 1 (false).

test supports the usual string and numeric operators, as well as a number of additional binary and unary operators which don’t have direct analogs in most other programming languages. You can see a list of these operators along, along with other useful information, by entering help test in your shell. The output of this is shown below.

$ help test
test: test [expr]
    Exits with a status of 0 (true) or 1 (false) depending on
    the evaluation of EXPR.  Expressions may be unary or binary.  Unary
    expressions are often used to examine the status of a file.  There
    are string operators as well, and numeric comparison operators.

    File operators:

        -a FILE        True if file exists.
        -b FILE        True if file is block special.
        -c FILE        True if file is character special.
        -d FILE        True if file is a directory.
        -e FILE        True if file exists.
        -f FILE        True if file exists and is a regular file.
        -g FILE        True if file is set-group-id.
        -h FILE        True if file is a symbolic link.
        -L FILE        True if file is a symbolic link.
        -k FILE        True if file has its `sticky' bit set.
        -p FILE        True if file is a named pipe.
        -r FILE        True if file is readable by you.
        -s FILE        True if file exists and is not empty.
        -S FILE        True if file is a socket.
        -t FD          True if FD is opened on a terminal.
        -u FILE        True if the file is set-user-id.
        -w FILE        True if the file is writable by you.
        -x FILE        True if the file is executable by you.
        -O FILE        True if the file is effectively owned by you.
        -G FILE        True if the file is effectively owned by your group.
        -N FILE        True if the file has been modified since it was last read.

      FILE1 -nt FILE2  True if file1 is newer than file2 (according to
                       modification date).

      FILE1 -ot FILE2  True if file1 is older than file2.

      FILE1 -ef FILE2  True if file1 is a hard link to file2.

    String operators:

        -z STRING      True if string is empty.

        -n STRING
        STRING         True if string is not empty.

        STRING1 = STRING2
                       True if the strings are equal.
        STRING1 != STRING2
                       True if the strings are not equal.
        STRING1 < STRING2
                       True if STRING1 sorts before STRING2 lexicographically.
        STRING1 > STRING2
                       True if STRING1 sorts after STRING2 lexicographically.

    Other operators:

        -o OPTION      True if the shell option OPTION is enabled.
        ! EXPR         True if expr is false.
        EXPR1 -a EXPR2 True if both expr1 AND expr2 are true.
        EXPR1 -o EXPR2 True if either expr1 OR expr2 is true.

        arg1 OP arg2   Arithmetic tests.  OP is one of -eq, -ne,
                       -lt, -le, -gt, or -ge.

    Arithmetic binary operators return true if ARG1 is equal, not-equal,
    less-than, less-than-or-equal, greater-than, or greater-than-or-equal
    than ARG2.

We can test integer equality

$ [ 0 -eq 0 ]; echo $? # exit code 0 means true
0
$ [ 0 -eq 1 ]; echo $? # exit code 1 means false
1

string equality

$ [ zero = zero ]; echo $? # exit code 0 means true
0
$ [ zero = one ]; echo $? # exit code 1 means false
1

and a number of other string and numeric operations which you are free to explore.

Flow Control

bash includes control structures typical of most programming languages – if-then-elif-else, while for-in, etc. Conditional statements and iteration are treated excellently in the Bash Guide for Beginners from the Linux Documentation Project (LDP). You are encouraged to read those sections, as this guide provides only a brief summary of some important features.

if-then-elif-else

The general form of an if-statement in bash is

if TEST-COMMANDS; then

  CONSEQUENT-COMMANDS

elif MORE-TEST-COMMANDS; then

  MORE-CONSEQUENT-COMMANDS

else 

  ALTERNATE-CONSEQUENT-COMMANDS;

fi

Indentation is good practice, but not required.

For example, if we write

#!/bin/bash
# contents of awesome_shell_script

if [ $1 -eq $2 ]; then
  echo args are equal
else
  echo args are not equal
fi

we see

$ ./awesome_shell_script 0 0
args are equal
$ ./awesome_shell_script 0 1
args are not equal

while

The general form of a while loop in bash is

while TEST-COMMANDS; do

  CONSEQUENT-COMMANDS

done

If TEST-COMMANDS exits with status code 0, CONSEQUENT-COMMANDS will execute. These steps will repeat until TEST-COMMANDS exits with some nonzero status.

For example, if we write

#!/bin/bash
# contents of awesome_shell_script

n=$1
while [ $n -gt 0 ]; do
  echo $n
  let n=$n-1
done

we see

$ ./awesome_shell_script 5
5
4
3
2
1

Functions

bash supports functions, albeit in a crippled form relative to many other languages. Some notable differences include:

Functions dont return anything, they just produce output streams (e.g. echo to stdout)
bash is strictly call-by-value. That is, only atomic values (strings) can be passed into functions.
Variables are not lexically scoped. bash uses a very simple system of local scope which is close to dynamic scope.
bash does not have first-class functions (i.e. no passing functions to other functions), anonymous functions, or closures.

Functions in bash are defined by

name_of_function() {

  FUNCTION_BODY

}

and called by

name_of_function $arg1 $arg2 ... $argN

Note the lack of parameters in the function signature. Parameters in bash functions are treates similarly to global positional parameters, with $1 containing the $arg1, $2 containing $arg2, etc.

For example, if we write

#!/bin/bash
# contents of awesome_shell_script

foo() {
  echo hello $1
}

foo $1

we see

$ ./awesome_shell_script world
hello world

Examples

Despite bash’s clumsiness, recursion and more complex programming logic are possible (read: painful).

#!/bin/bash
# contents of fibonacci

if [ $# -eq 0 ]; then
    echo "fibonacci needs an argument"
    exit 1
fi

fib() {
    N="$1"
    if [ -z "${N##*[!0-9]*}" ]; then
        echo "fibonacci only makes sense for nonnegative integers"
        exit 1
    fi

    if [ "$N" -eq 0 ]; then
        echo 0
    elif [ "$N" -eq 1 ]; then
        echo 1
    else
        echo $(($(fib $((N-2))) + $(fib $((N-1)))))
    fi
}

fib "$1"

bash can give us the classic recursive solution to finding the nth Fibonacci number.

$ ./fibonacci 10
55

Lab Assignment

You’ll be completing a classic first shell scripting assignment: make a phonebook.

Write a shell script phonebook which has the following behavior:

./phonebook new name number adds an entry to the phonebook.
./phonebook name displays the name and phone number associated with that name.
./phonebook list displays every entry in the phonebook (in no particular order). If the phonebook has no entries, display phonebook is empty
./phonebook remove name deletes the entry associated with that name.
./phonebook clear deletes the entire phonebook.

For example,

$ ./phonebook new Linus Torvalds 101-110-0111
$ ./phonebook Linus Torvalds
Linus Torvalds 101-110-1010
$ ./phonebook new Tux Penguin 555-666-7777
$ ./phonebook list
Linus Torvalds 101-110-1010
Tux Penguin 555-666-7777
$ ./phonebook remove Linus Torvalds
$ ./phonebook list
Tux Penguin 555-666-7777
$ ./phonebook clear
$ ./phonebook list
phonebook is empty