Dave's Slackware Package Blog

bash

The default Slackware shell!
Created: 2018-08-23

Oh boy! I saw this one coming up and knew it would be a beast to tackle. But I’m looking forward to it.

Bash the interface and Bash the language

Any UNIX shell provides a lot of functionality. Bash is no exception, providing all sorts of interesting little corners to delve into. For deep learning, my habit is to hit the books.

In this case, I’m using a fairly slim (300 pages) O’Reilly volume from 1998 (See Fig. 1). This book has floated between my bookshelf and desk for ages. I’ve read the first few chapters a number of times, thumbed through it frequently, and made extensive use of its index. But now I’ll be reading it cover-to-cover, finally devoting a block of time to learning the shell I’ve been using for so long.

[Photo: my worn and bookmarked copy of "Learning the bash Shell"]
Figure 1. This O’Reilly title has sat at my desk for years

One thing I’ve learned is that Bash scripts may be awkward to read and write, but there are some incredibly handy things built into the language that make it exceptionally good for its intended purpose: making your life at the command line easier!

The other thing I already knew: I haven’t been making good use of Bash as an interface. Understanding job control, variable scope, and using the history mechanism are important for making the command line a completely comfortable environment.

The Slackware bash package

I’d like to note that this package includes exactly one executable, bash. It’s initially installed to /bin/bash4.new when it’s unarchived and then renamed by the install script:

mv bin/bash4.new bin/bash

I’m very pleased that my dailypkg Ruby script (described here) detected the rename and added it as a note in the blog entry skeleton for this page you’re now reading.

Bash’s lineage

By the way, I think you can get a lot of insight by reading the Wikipedia entry for Bash to get a historical context (the O’Reilly book also has some).

You are probably already aware that many of the features of Bash come from earlier shells. Indeed, though Bash is the "Bourne-again shell", taking its inspiration from the Bourne shell (1979), it’s helpful to know that the Bourne shell was itself a replacement for the Thompson shell (1971).

There’s a whole family tree of shells which have inter-bred and inspired each other. Many share features, but there are also a huge number of subtle (and not-so-subtle) differences between them.

Lots of things you learn about Bash do apply to other shells. And plenty of things don’t. You can go mad trying to learn the subset of POSIX-standard shell features and to write scripts that will work on every shell you might find in the wild. There are good reasons to try to write fairly "universal" shell scripts, but ultimately, you’re just going to have to test on any system you actually intend to support.

At any rate, I’d just like to make it clear that a lot of the things I’ll praise in Bash are also available in other shells and while I’m aware of that, I’m not going to make any attempt to point them out.

Bash: the user interface

Bash has some really neat features for advanced users. If you spend as much time at the command line as I do these days, it’s worth investing a little time now and then to learn some of them.

Pathname expansion or globbing

You’re undoubtedly familiar with the standard usage of "wildcards" to match pathnames:

$ echo wig*
wiggler.c wiggler.h wiggler.txt

$ echo *.txt
LICENSE.txt tables.txt wiggler.txt

$ echo *able*
tables.txt

Bash returns the filenames which match the pattern. Simple stuff.

But while it’s not regex, Bash pattern matching is pretty powerful: it can also match sets of characters and negated sets, and brace expansion can generate whole lists of strings!

$ echo wiggler.[ch]
wiggler.c wiggler.h

$ echo wiggler.[!c]*
wiggler.h wiggler.txt

$ echo {dang,bogg,wigg}ler.c
dangler.c boggler.c wiggler.c

You can also use wildcards and sets with both filenames and directories:

$ echo set*/*.txt
set1/foo.txt set2/foo.txt set3/baz.txt set3/foo.txt setX/zoom.txt

$ echo set[12345]/*.txt
set1/foo.txt set2/foo.txt set3/baz.txt set3/foo.txt

So if you’re clever enough, the sky seems to be the limit for the number of ways you can select files by name at the command line.

The big "ah ha!" that has taken me years to understand well enough to exploit it creatively about filename expansion is that it is exactly as if I had typed the list on the command line myself. So filename expansions are perhaps valid in places that are a bit unexpected at first.

Let me give you a concrete example. Often I’ll need to do something with a file that shares a somewhat long prefix with one or more other files. Check out this annoying listing:

$ ls -1
tools
tornado-9.8
tornado-9.8-super
tornado-9.8-super.src
tornado-9.8-super.src.patch
tornado-9.8-super.src.patch.1
tornado-9.8-super.src.patch.2
tornado-9.8-super.src.patch.3

Let’s say I want to edit tornado-9.8-super.src.patch.2 with Vim. Well, tab completion can only do so much: it’s going to be a painful, halting experience to specify each little step of the filename:

$ vim tor<tab>-<tab>.<tab>.<tab>.2

Yuck, it feels faster to me just to type the darn thing.

But if you realize that a wildcard expansion resulting in one filename is exactly the same as typing that one filename, then you realize that you’re free to do this instead:

$ vim *.2

Of course, you have to be very careful making destructive changes this way (preview the glob with echo *.2 before running rm -rf *.2 or something). But you already know that.

Job control

Having a lot of background jobs can get out of hand pretty fast (and I find that it’s a great way to get confused, especially when I have my work spread across multiple tiled windows and multiple desktops already!)

But gosh, it’s wonderfully handy to be able to suspend an interactive process with Ctrl-z in order to do a quick man page lookup or check the names of some files, and then bring the interactive process back to the foreground with the fg command.
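Here’s roughly what that dance looks like (the job number and programs are just examples):

$ vim notes.txt
<press Ctrl-z>
[1]+  Stopped                 vim notes.txt
$ man signal
<read, quit>
$ fg
<and vim is back, exactly as you left it>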

I’ll address a similar subject, processes, later.

Misc. Emacs editing mode commands

I’m a Vim user (learning Emacs is on my long-term TODO list), but when I tried it, I didn’t find modal editing at the command line to be particularly natural for most quick tasks, and it’s not the default anywhere I "ssh into", so I keep Bash in its default Emacs editing mode to retain familiarity.

However, I must say, I think the vi editing mode has an awesome set of features that make it very compelling (dare I even say better?). If you’re a vi user and you’re okay with investing time learning a non-standard environment, I highly recommend checking out what it has to offer!

Some useful (or just interesting) commands are:

  • Ctrl-a - move to beginning of line

  • Ctrl-e - move to end of line

  • Esc-b - move one word backward

  • Esc-f - move one word forward

  • Ctrl-k - delete to the end

  • Ctrl-u - delete to the beginning

  • Ctrl-y - paste the last chunk deleted (word or more)

  • Esc-. - insert last word from previous command (love this one!)

  • Ctrl-t - transpose two characters

But there’s a lot more to explore. Check out the readline section below.

History

I use the Up Arrow key all the time to get commands from the history. I probably use that almost as much as I use the Backspace key (which is a lot).

I’m also pretty good about searching the history for the command I want and running it with the ! operator:

$ history | grep tornado
 1054  touch tornado-9.8
 1055  touch tornado-9.8-super
 1056  touch tornado-9.8-super.src
 1057  touch tornado-9.8-super.src.patch
 1058  touch tornado-9.8-super.src.patch.1
 1059  touch tornado-9.8-super.src.patch.2
 1060  touch tornado-9.8-super.src.patch.3
 1063  history | grep tornado
$ !1059
touch tornado-9.8-super.src.patch.2

But it wasn’t until I read the O’Reilly book that I finally learned how to properly use the 'reverse-i-search' history feature and I’m glad I did.

Here’s how it works:

  1. Type Ctrl-r (think 'r' for 'reverse')

  2. Type part of the command (the most recent match appears)

  3. Type Ctrl-r again and again to look further back

When it finds what you need, it’s way faster than grepping your history.
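For example, searching for one of those tornado files from earlier looks something like this:

$ <press Ctrl-r>
(reverse-i-search)`patch.2': touch tornado-9.8-super.src.patch.2

Press Enter to run the match as-is, or an arrow key to edit it first.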

(Now this is one area where I find the vi-style controls to be vastly superior (all the familiar /, ?, n commands work), but as I mentioned above, every terminal I sit down at has Emacs-style controls…and I swim upstream from convention enough as it is!)

The other big history-related command that I never used prior to reading the manual and which I now find indispensable is fc.

If you just wrote a really long command and it failed (as really long commands are wont to do), then rather than pressing Up Arrow and trying to edit it at the command line (argh!!), you can edit it in your favorite editor and then execute it!

Let’s say we typed this brutally long command:

$ transmogrify -n 78 -q /tmp/smell-collectioz --output=~noob/goat
--mother-ship="Mars Explorer 988" dangler-objectives.trn
Path not found: /tmp/smell-collectioz

Oh man, I mistyped smell-collection! That’s a fast fix in my text editor (defined in $FCEDIT or $EDITOR):

$ fc
<make the correction using favorite editor>

transmogrify -n 78 -q /tmp/smell-collection --output=~noob/goat
--mother-ship="Mars Explorer 988" dangler-objectives.trn

Transmogrifying....100%
$

You can also use fc to list and then edit a command that happened previously:

$ fc -l
103    touch goats
104    echo $SNAKE_PIS
105    vim ~/dangler/plasm
106    vomit ~/dangler/plasm > ~/wiggler/out
$ fc 104
<fix the typo in your editor: SNAKE_PIS becomes SNAKE_PIT>
echo $SNAKE_PIT
Hisssssss

What would be really nice would be the ability to edit the current line in an external editor, and guess what!? I found it! To edit the current command and execute it, type Ctrl-x, Ctrl-e (that is, first Ctrl-x and then Ctrl-e in sequence - Emacs users will find this sequence very natural to perform).

(How did I find this sequence even though it wasn’t in my book? Through the wonderful bind command. See the Readline section below!)

As far as replaying history with various forms of ! goes, I knew about !! to re-run the last command and !<number> to run the command by its history number, but there is a pretty wild assortment of history expansions, such as:

  • !-<number> - run the command that is number back in the history

  • !<string> - run the most recent command that starts with the string

  • !?<string>? - run the most recent command that contains the string

  • ^<mistake>^<correction> - repeat the last command, replacing mistake with correction

That last one is pretty neat, so here’s an example:

$ get_wiggler_log -n60 | grep "frist post" > plop
$ ^frist^first
get_wiggler_log -n60 | grep "first post" > plop

The truth is, I’m probably more likely to Up Arrow and fix the mistake manually. But depending on the mistake, the substitution expansion form is really elegant.

Quite frankly, the rest of the expansions make tons of sense in a teletype environment or over a really slow network connection, but otherwise you’re probably better off with interactive editing.

(Slow network connections are still a thing. And now that some of us find ourselves occasionally using our phones as SSH terminals, the old ways suddenly make sense again! Vi commands are awesome on little touch-screen keyboards. I find myself using commands I rarely use at a proper keyboard, where I can brute-force changes by touch-typing quickly. (On that topic, I can’t even imagine what it’s like to try to use Emacs with all of those modifier keys on a touchscreen device…maybe there are alternative keyboard layouts that help?))

readline

As a developer, I was vaguely aware that readline is a standard library for taking buffered, line-based input from terminals.

What I did not realize is that you can configure readline for your own personal preferences.

I’ll be honest, after reading through the options, I was a little bit overstimulated and didn’t find anything I was dying to change…but I do think it’s fascinating that there are over 60 functions and a couple handfuls of variables you can use to re-configure the behavior of readline. The changes go into .inputrc in your home directory.

By far the easiest way to play with readline is to experiment with the bind command. You can list the readline functions with the -l option:

$ bind -l
abort
accept-line
alias-expand-line
arrow-key-prefix
backward-byte
backward-char
...
vi-subst
vi-tilde-expand
vi-yank-arg
vi-yank-to
yank
yank-last-arg
yank-nth-arg
yank-pop

Which should get your imagination going.

You should absolutely check out the list of keyboard bindings for your system with -P:

$ bind -P
abort can be found on "\C-g", "\C-x\C-g", "\e\C-g".
accept-line can be found on "\C-j", "\C-m".
alias-expand-line is not bound to any keys
arrow-key-prefix is not bound to any keys
backward-byte is not bound to any keys
backward-char can be found on "\C-b", "\eOD", "\e[D".
...
vi-tilde-expand is not bound to any keys
vi-yank-arg is not bound to any keys
vi-yank-to is not bound to any keys
yank can be found on "\C-y".
yank-last-arg can be found on "\e.", "\e_".
yank-nth-arg can be found on "\e\C-y".
yank-pop can be found on "\ey".

Of course, grep is your friend when you want to find a specific binding. That’s how I found out about the ability to edit the current line in an external editor:

$ bind -P | grep edit
edit-and-execute-command can be found on "\C-x\C-e".
emacs-editing-mode is not bound to any keys
vi-editing-mode is not bound to any keys

I’ll show an example of making a new binding, but first, there’s another simply awesome feature of bind: the -p option (that’s a lowercase pee), which lists the current bindings in a format that bind itself understands:

$ bind -p
"\C-g": abort
"\C-x\C-g": abort
"\e\C-g": abort
"\C-j": accept-line
"\C-m": accept-line
# alias-expand-line (not bound)
# arrow-key-prefix (not bound)
# backward-byte (not bound)
"\C-b": backward-char
"\eOD": backward-char
"\e[D": backward-char
"\C-h": backward-delete-char
"\C-?": backward-delete-char
...

This is the same format you can put into .inputrc and bind accepts it at the command line.

(Funny aside: before I figured out the correct syntax, I accidentally re-bound the letter 'e' and wasn’t able to type "th lttr " ("the letter e") until I killed my shell session.)

As an example, let’s make Ctrl-t comment the current line:

$ bind '"\C-t": insert-comment'
$ la la la la la <press Ctrl-t>
$ #la la la la la

Another fun thing you can do is bind shortcuts to snippets of text:

$ bind '"\C-t": "Hello World"'
$ echo <press Ctrl-t>
$ echo Hello World

Sweet! I have a new "Hello World" shortcut!

I suspect the true road to guru-like shell usage is to become one with the mappings of these readline commands.

Command hashing

This one is just interesting rather than practical. Bash doesn’t perform a full search of your $PATH to find an executable if it’s already found it. Instead, it enters each found item in a hash table for faster lookup. It also keeps track of the number of times you’ve used each executable.

To see the list, use the hash command:

$ hash
hits	command
   2	/bin/tttml-fmt
   2	/usr/bin/rm
   1	/usr/bin/cat
   6	/usr/bin/vim
   1	/usr/bin/diff
   1	/usr/bin/mkdir
   5	/usr/bin/scp
   1	/usr/bin/less
  17	/usr/bin/ls

You can pipe through sort to see what you use the most. Like I said, interesting.
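For instance, a minimal sketch (sort -rn reads the non-numeric header line as zero, so it conveniently sinks to the bottom):

$ hash | sort -rn
  17	/usr/bin/ls
   6	/usr/bin/vim
   5	/usr/bin/scp
...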

xtrace

If a complicated statement is failing for some baffling reason, this option might help:

$ set -o xtrace
$ echo "It is $(date)"
++ date
+ echo 'It is Mon Sep 10 20:22:39 MST 2018'
It is Mon Sep 10 20:22:39 MST 2018

As you can see, the xtrace option displays each level of substitution and expansion.

Very helpful.

shopt and shell variables

As you can imagine, there are a whole bunch of options and variables for Bash itself. I shall not list them here.

But here’s one option I really like: cdable_vars. When you turn it on, the cd command will also check for variable names (if the directory wasn’t found in the current directory, of course). This is best demonstrated with an example. Note that shopt is the Bash command to view and change shell options; the -s flag sets (enables) the option you name:

$ shopt -s cdable_vars # turn it on
$ gopher=/home/dave/d/proj/sdf/gopher/
$ cd gopher
/home/dave/d/proj/sdf/gopher/

This is by far the best built-in mechanism I’ve seen for making quick shortcuts to locations at the command line.

In that vein, I already knew about the pwd command and accompanying $PWD variable to print the working (current) directory.

But I did not know about $OLDPWD, which is the previous working directory before the last cd command!

$ pwd
/home/dave/d/proj/sdf/gopher
$ cd /tmp/set1
$ pwd
/tmp/set1
$ cd $OLDPWD
$ pwd
/home/dave/d/proj/sdf/gopher

Where has that been all my life?

Sure, you can also use pushd and popd, but only if you remember to use them before you go to the other location; otherwise it’s too late.

type

Another great Bash built-in command is type. It tells you what a command is:

$ type cd
cd is a shell built-in
$ type type
type is a shell built-in
$ type lopher
lopher is aliased to `lynx gopher://sdf.org'
$ type bash
bash is /usr/bin/bash

Oh, and now’s a good time to mention that there aren’t any man pages for built-in commands. Instead, you use the built-in help to learn about them:

$ man shopt
No manual entry for shopt
$ type shopt
shopt is a shell built-in
$ help shopt
shopt: shopt [-pqsu] [-o] [optname ...]
    Set and unset shell options.

    Change the setting of each shell option OPTNAME.  Without any option
    arguments, list all shell options with an indication of whether or not each
    is set.

    Options:
      -o	restrict OPTNAMEs to those defined for use with `set -o'
      -p	print each shell option with an indication of its status
      -q	suppress output
      -s	enable (set) each OPTNAME
      -u	disable (unset) each OPTNAME

    Exit Status:
    Returns success if OPTNAME is enabled; fails if an invalid option is
    given or OPTNAME is disabled.

I love Bash’s help command

Bash’s help is really great. The results are truly helpful and completely free of fluff, basically what I wish all man pages were. When you get a chance, try help to see a list of topics and help help for a couple other tips.

(Usually when I get advice like that I think, "yeah, yeah, I’ll do that when I’m dead." But really, help doesn’t waste your time at all. And there’s no new interface to learn - it just prints out some info and you’re back at the command line.)

Though I’m still no expert, I do feel a lot better now that I have a stronger grasp of the fundamentals of Bash as a user interface. The convenience features are also wonderful to have now that I find myself doing 90% or more of my work at the command line.

Command line order of operations and quoting

The relationship between quoting and how command lines are processed is easily important enough to warrant a recap and further explanation. I also feel like I gained a lot of insight and clarity when I was able to "zoom out" to see the full sequence all at once:

  1. A line of input is a "pipeline"

  2. Pipelines are split by the | character into commands

  3. Commands are split into tokens (factoring whitespace and quotes)

  4. If the first token is a keyword (function, if, etc.), handle the construct and send its commands back to step 1

  5. Is the first token an alias? Substitute (recursively, if needed)

  6. Brace expansion (foo.{c,h} becomes foo.c foo.h)

  7. Tilde expansion for user directories (~dave)

  8. Variable substitution ($foo)

  9. Command substitution ($(date))

  10. Arithmetic substitution ($((1+1)))

  11. Re-split into words (similar to token split, but now uses $IFS as separator)

  12. Path name expansion (*, ?, [ABCDabcd])

  13. Run the command (function, built-in, or executable found via $PATH)

This list is not only a summary of the operations Bash can perform. It’s also an overview of the order of operations.

The best part is how it aids in understanding quoting. Basically, the rules are as follows:

  • Anything inside single quotes ('…') skips all of the transformations

  • Anything inside double quotes ("…") skips everything except the substitutions that start with a dollar sign ($)
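Here’s a quick demonstration of both rules with a variable and a command substitution (assuming it’s still 2018 when you try it):

$ foo=bar
$ echo '$foo in $(date +%Y)'
$foo in $(date +%Y)
$ echo "$foo in $(date +%Y)"
bar in 2018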

By the way, I mentioned the special $IFS variable in the list above. Search the man bash page for "IFS" to see how that works.

Redirection, pipes, and processes

All three of these concepts straddle the line between using Bash as an interface and using Bash as a programming language.

I/O Redirection

It turns out that Bash has all kinds of exotic redirection operators; the bash man page has the full list.

I use > and >> all the time to write to files and < sometimes to read from files.

In scripts, I’ve occasionally redirected stderr to stdout with 2>&1 or redirected stderr to a file with 2> filename.
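For instance, here’s a throwaway sketch that writes the matches to a file and discards the inevitable permission errors:

$ grep -r dangler /etc > matches.txt 2> /dev/null   # errors vanish, matches saved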

It was extremely helpful to learn from my book that functions and blocks (multiple command lines contained in { … }) can take I/O redirectors, which allows you to redirect the input or output of multiple statements easily. Here’s a silly example block:

$ { echo "it is "; date; } > /tmp/clock
$ cat /tmp/clock
it is
Sun Sep  9 18:57:13 MST 2018

Most of the rest of the redirections seem like good things to know are possible should the need arise, but I’m not going to bother to memorize them.

In particular, I was baffled at the existence of <> which opens a file descriptor (stdin by default) in both read and write mode. Luckily, an excellent unix.stackexchange.com answer by Stéphane Chazelas clearly outlines what it can do for you.

As we get further into the depths, though, one has to ponder the tricky line between systems programming (with a systems programming language) and "scripting" with a glue language like Bash.

Pipes

To be at the Unix command line is to pipe the output of one utility through another. I love being able to use tools like wc (word count), sort, grep, tr (translate), and friends to do useful things quickly and efficiently.

Pipes aren’t specific to Bash, but I found this paragraph from the Wikipedia pipeline article very enlightening as a high-level summary of how pipes actually work:

In most Unix-like systems, all processes of a pipeline are started at the same time, with their streams appropriately connected, and managed by the scheduler together with all other processes running on the machine. An important aspect of this…​is the concept of buffering: for example a sending program may produce 5000 bytes per second, and a receiving program may only be able to accept 100 bytes per second, but no data is lost. Instead, the output of the sending program is held in the buffer. When the receiving program is ready to read data, the next program in the pipeline reads from the buffer. In Linux, the size of the buffer is 65536 bytes (64KiB).

I am reminded of one of the prime reasons not to use cat foo.txt | sort when you mean sort < foo.txt: pipes are fast, but having the shell connect a file directly to the stdin of one process is sure to be more efficient than starting two processes, connecting them with a buffered pipeline, and then having the first process read the file!

Processes

The most I usually see of processes at the command line are the commands and shortcuts mentioned above in the Job control section.

But there are a few things we can explore to obtain a more complete understanding.

For one thing, there is a variable called $$ which contains the process ID of the current process.

$ echo $$
23750

We can use this to show that invoking a script runs a separate process. Here, foo is a one-line script which echoes its name and process ID:

$ echo 'echo foo $$' > foo
$ bash foo
foo 25587

We can also see how variables interact between processes. Let’s set a variable:

$ HELLO=howdy
$ echo $HELLO
howdy

Regular variables aren’t shared between processes:

$ echo 'echo hello: $HELLO' > foo
$ bash foo
hello:

But we can set variables for a process by setting them on the command line before a command:

$ HELLO=howdy bash foo
hello: howdy

And we can use the export built-in to make them available to sub-processes:

$ export HELLO
$ bash foo
hello: howdy

However, it doesn’t work the other way:

$ echo 'export ZAP=zippity' > foo
$ bash foo
$ echo $ZAP

This makes sense from both a security and a "clobbering the namespace" point of view. By the way, using export technically makes what is known as an "environment variable". We can see this by listing and searching for our $HELLO variable with the env program from coreutils:

$ env | grep HELLO
HELLO=howdy

"Hello" yourself, you nifty little variable! (Look at how it’s made its way in the world, all grown up!)

Bash: the programming language

I’ll readily admit that I put off for years learning the syntax of Bash as a programming language. It seemed hopelessly arcane and awkward and I didn’t need to use it that often anyway.

But as I find myself writing more and more shell scripts (and relying less and less on heavier alternatives such as Python, Perl, and Ruby), I’m starting to actually learn and remember the syntax. Yeah, I still think most of it is awkward as hell to type.

But the more I use it, I’ll be darned if I’m not discovering that there are some excellent conveniences built into Bash that are absolutely worth learning. It’s really good at its job.

Functions (and programming at the command line)

I’ll just take this opportunity to say that this is my preferred syntax for defining functions:

function foo
{
	echo "Foo to you!"
}

The function keyword is easy to search for and since we don’t declare arguments anyway, the foo() { …​ } definition syntax makes less semantic sense to me.

Programming interactively at the command line has its pluses and minuses:

$ function foo
> {
> echo "Foo to you!"
> }
$ foo
Foo to you!

The immediate feedback is wonderful in the way that all REPLs are wonderful - but the lack of a proper programmer’s editor sucks.

However, now that I know of the Ctrl-x, Ctrl-e shortcut to edit the current command in an external editor (and fc to edit the previous command; both explained in full above), this changes everything! Now it’s the best of both worlds.

If you ever have a task that involves repeatedly running one or more commands, consider writing a quick function to make it easier. Something changes between each successive run? No problem, just use a command line argument:

function foo
{
	echo "I am a $1."
}

Then you can repeat what needs repeating and change what needs changing:

$ function foo
> {
> echo "I am a $1."
> }
$ foo "good boy"
I am a good boy.
$ foo "mysterious programmer"
I am a mysterious programmer.
$ foo goat
I am a goat.

Variables and quoting

Really learning the syntax rules for variables and quoting is as essential with Bash as it is with any programming language.

I’ll try to summarize the basics with some examples:

$ foo=hello
$ echo foo world      # foo world
$ echo $foo world     # hello world
$ echo "$foo world"   # hello world
$ echo '$foo world'   # $foo world
$ echo $fooworld      # (blank - no variable named "fooworld")
$ echo ${foo}world    # helloworld

String operators

You’re either going to love or hate these. They’re cryptic and terse, but they’re also incredibly useful.

Default value: ${<var>:-<value>}
$ foo=${NOPE:-"Does not exist!"}
$ echo $foo
Does not exist!
$ NOPE=Hello
$ foo=${NOPE:-"Does not exist!"}
$ echo $foo
Hello
Message if undefined/null: ${<var>:?<message>}
$ foo=${NOPE:?"Must be defined!"}
bash: NOPE: Must be defined!
$ NOPE=howdy
$ foo=${NOPE:?"Must be defined!"}
Value if defined: ${<var>:+<value>}
$ COUNT=17
$ echo ${COUNT:+"Yup!"}
Yup!
Substring/slice: ${<var>:<start>:<length>} (where :<length> is optional)
$ phrase="Hey there, moo cows!"
$ echo ${phrase:5}
here, moo cows!
$ echo ${phrase:5:4}
here
$ echo ${phrase::-5}
Hey there, moo
Delete shortest beginning match: ${<var>#<pattern>}
$ phrase="Hey, there, moo cows!"
$ echo ${phrase#*,}
there, moo cows!
Delete longest beginning match: ${<var>##<pattern>}
$ phrase="Hey, there, moo cows!"
$ echo ${phrase##*,}
moo cows!

(Delete ending matches by replacing the # characters with % in the examples above.)

Find/Replace first: ${<var>/<find>/<replace>}
$ phrase="Hey moo cows!"
$ echo ${phrase/cows/cats}
Hey moo cats!

(Replace all matches with ${<var>//<find>/<replace>} - note the doubled slash.)

Command substitution

Perhaps you know about command substitution wherein Bash runs a command and returns its value in place:

$ echo "My $(date +%Y) Smelling Diary"
My 2018 Smelling Diary

There are a million and one creative ways to use command substitution within your scripts and at the command line, such as processing strings with sed or tr, sorting things, or getting lists of files and directories.
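A couple of quick sketches (the file names are made up): shouting a string with tr, and opening every C file that mentions wiggler:

$ echo "Shouting: $(echo 'moo cows' | tr a-z A-Z)"
Shouting: MOO COWS

$ vim $(grep -l wiggler *.c)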

In a lot of ways, this gets at the heart of what makes shell scripting so attractive: some things are just so easy at the UNIX command line once you’re familiar with the basic tool set. A lot easier than rolling your own from scratch in a "real" programming language. If you can’t beat 'em, join 'em.

Flow control and conditions

Now it’s time to control the flow with if/elses and fors and whiles and cases!

The structure of an if/else statement in Bash isn’t particularly interesting if you’ve done any programming at all before, but the syntax of the condition is!

Conditions in Bash aren’t like those of ordinary programming languages; they’re actually a command. Even more confoundingly, the "truthy-ness" of the command isn’t a literal return value such as "true" or "1"; it’s the exit status of the command!

Understanding this will take more than a couple examples, but this should at least be revealing. Let’s start by displaying the exit status of a success and failure by using the special variable $?:

$ echo "hello" > foo.txt   # create a file
$ cat foo.txt              # it exists, success!
hello
$ echo $?                  # $? is the last exit status
0                          # '0' is typically success or 'true'!

$ cat nothing.txt          # does not exist, fail!
cat: nothing.txt: No such file or directory
$ echo $?
1                          # '1' (non-zero) indicates an error or 'false'!

Now let’s use that in an if/else statement:

$ function happycat {
	if cat "$1"
	then
		echo "Yay! I am so glad I could be helpful!"
	else
		echo "Oh no! I am so sorry I couldn't help. Maybe next time?"
	fi
}

$ happycat foo.txt
hello
Yay! I am so glad I could be helpful!

$ happycat nothing.txt
cat: nothing.txt: No such file or directory
Oh no! I am so sorry I couldn't help. Maybe next time?

Just to reiterate: what gets tested isn’t a value in the normal standard input/output flow, nor a literal value in the script or command line - it’s a command. To drive this home, let’s try to use 1 (one) as a condition:

$ if 1; then echo "so sorry"; fi
bash: 1: command not found

See, Bash is trying to find a command named 1 (one). It doesn’t matter if you quote the value or anything like that - Bash is still going to treat the condition as a command.

So…​how do we test against values?

Ah, for that we have the test built-in command. The help information about test is incredibly handy as a reference for shell programming. Heck, here’s the whole thing:

$ help test
test: test [expr]
    Evaluate conditional expression.

    Exits with a status of 0 (true) or 1 (false) depending on
    the evaluation of EXPR.  Expressions may be unary or binary.  Unary
    expressions are often used to examine the status of a file.  There
    are string operators and numeric comparison operators as well.

    The behavior of test depends on the number of arguments.  Read the
    bash manual page for the complete specification.

    File operators:

      -a FILE        True if file exists.
      -b FILE        True if file is block special.
      -c FILE        True if file is character special.
      -d FILE        True if file is a directory.
      -e FILE        True if file exists.
      -f FILE        True if file exists and is a regular file.
      -g FILE        True if file is set-group-id.
      -h FILE        True if file is a symbolic link.
      -L FILE        True if file is a symbolic link.
      -k FILE        True if file has its `sticky' bit set.
      -p FILE        True if file is a named pipe.
      -r FILE        True if file is readable by you.
      -s FILE        True if file exists and is not empty.
      -S FILE        True if file is a socket.
      -t FD          True if FD is opened on a terminal.
      -u FILE        True if the file is set-user-id.
      -w FILE        True if the file is writable by you.
      -x FILE        True if the file is executable by you.
      -O FILE        True if the file is effectively owned by you.
      -G FILE        True if the file is effectively owned by your group.
      -N FILE        True if the file has been modified since it was last read.

      FILE1 -nt FILE2  True if file1 is newer than file2 (according to
                       modification date).

      FILE1 -ot FILE2  True if file1 is older than file2.

      FILE1 -ef FILE2  True if file1 is a hard link to file2.

    String operators:

      -z STRING      True if string is empty.

      -n STRING
         STRING      True if string is not empty.

      STRING1 = STRING2
                     True if the strings are equal.
      STRING1 != STRING2
                     True if the strings are not equal.
      STRING1 < STRING2
                     True if STRING1 sorts before STRING2 lexicographically.
      STRING1 > STRING2
                     True if STRING1 sorts after STRING2 lexicographically.

    Other operators:

      -o OPTION      True if the shell option OPTION is enabled.
      -v VAR	 True if the shell variable VAR is set
      -R VAR	 True if the shell variable VAR is set and is a name reference.
      ! EXPR         True if expr is false.
      EXPR1 -a EXPR2 True if both expr1 AND expr2 are true.
      EXPR1 -o EXPR2 True if either expr1 OR expr2 is true.

      arg1 OP arg2   Arithmetic tests.  OP is one of -eq, -ne,
                     -lt, -le, -gt, or -ge.

    Arithmetic binary operators return true if ARG1 is equal, not-equal,
    less-than, less-than-or-equal, greater-than, or greater-than-or-equal
    than ARG2.

    Exit Status:
    Returns success if EXPR evaluates to true; fails if EXPR evaluates to
    false or an invalid argument is given.

Notice how test "exits with a status" to indicate true/false values.

There are a lot of really neat test options, but let’s just do a simple string comparison:

function animaltest {
	if test "$1" = meow
	then
		echo "You're a cat!"
	elif test "$1" = moo
	then
		echo "You're a cow!"
	fi
}

$ animaltest meow
You're a cat!
$ animaltest moo
You're a cow!

Now comes the really interesting syntactic thing: there is an alias for test called [ which requires a final argument of ]! I’ve never personally found an active script in the wild that used test, so you might as well get used to [ (and ]).

Before I even bother showing an example of the [ … ] command, let’s also mention the double bracket "upgrade" ([[ … ]]). It gives you everything the single bracket version does, but also allows parenthetical grouping of expressions as well as logical (and/or) operations within the expression and (here’s the big one) supports the =~ operator for regular expression matching.

(I’m not sure when [[ was added to Bash, but there’s no mention of it in my O’Reilly book from 1998 (when Bash 2.x was pretty new and 1.x was still in common usage). For what little it’s worth, my advice is to use [ if you think you may need to adapt your script for other shells some time in the future. Otherwise, use [[ and reap the rewards.)

Here’s our previous example with test replaced by [[:

function animaltest {
	if [[ $1 = meow ]]
	then
		echo "You're a cat!"
	elif [[ $1 = moo ]]
	then
		echo "You're a cow!"
	fi
}

Now let’s do a file test and a regular expression just to stretch our wings a bit:

$ touch foo.conf
$ critter="700 snakes"
$ if [[ -f foo.conf && "$critter" =~ [0-9]+\ snakes? ]]
  then
    echo "I can configure those snakes for you!"
  fi
I can configure those snakes for you!

Okay, that was just silly. Good examples don’t just grow on trees, you know.

My final word on conditionals before we see some other control flow constructs is that the string and numeric comparison operators are probably exactly the opposite of what you’d assume they are: < tests if a string is less than another string (lexicographically); -lt tests if a number is less than another number.
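Here’s a quick demonstration of why the distinction matters. Numerically, 10 is not less than 9, but the string "10" sorts before "9" (remember, 0 is the "true" exit status):

$ [[ 10 -lt 9 ]]; echo $?
1
$ [[ 10 < 9 ]]; echo $?
0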

Here’s for:

$ for i in 2 3 4 5; do echo "I have $i cows."; done
I have 2 cows.
I have 3 cows.
I have 4 cows.
I have 5 cows.

Bash’s for gets it right: it operates on lists of fields or "words". And what, exactly, is a list in Bash? It’s a string in which fields are separated by the value of $IFS (internal field separator). By default, $IFS is set to a value which makes it match any whitespace characters. So 99% of the time, a "list" in Bash is a series of "words" separated by spaces.

(There was a brief mention of $IFS above in the Order of Operations section.)

In the above example, the numbers 2 3 4 5 are separated by spaces. We can tell Bash to not consider these values separately (essentially escaping the whitespaces) by quoting them:

$ for i in "2 3 4 5"; do echo "I have $i cows."; done
I have 2 3 4 5 cows.

Shell scripting is so weird, right? Yes, but with all of this complexity you get the ability to express what you want concisely. On the command line, both space and time are at an absolute premium.

See how the ls command also returns a list that the for statement can work with:

$ for i in $(ls); do echo "I have $i cows."; done
I have animal cows.
I have foo.c cows.
I have foo.conf cows.
I have foo.txt cows.
I have myfile1535930751 cows.

The case statement is great when you would otherwise have a sprawling set of if/else statements. I’ll be up front in telling you that my professional opinion as a developer is that sprawling if/else statements (and therefore case statements) are often a sign that your program has some structural issues.

But there are completely legitimate reasons to have case statements in programs where there are, in fact, many options to choose from - in those cases, I don’t hesitate to use them. The syntax is predictably weird with ) after each pattern and ;; separating commands. Let’s revisit our earlier animaltest function:

$ function animaltest {
	case $1 in
		meow | purr ) echo "You're a cat!" ;;
		woof | bark ) echo "You're a dog!" ;;
		moo         ) echo "You're a cow!" ;;
		*           ) echo "You're a squid!" ;;
	esac
}

$ animaltest meow
You're a cat!
$ animaltest purr
You're a cat!
$ animaltest bark
You're a dog!
$ animaltest moo
You're a cow!
$ animaltest glerpglerp
You're a squid!

Now for one of my favorite recent discoveries: select. A word of warning: this one is not available in many non-Bash shells. But it is a huge convenience.

What does select do? It makes menus for you! Check it out:

$ select thing in beans skull goat rat ball QUIT
do
	echo "You picked $thing!"
	if [[ $thing = QUIT ]]; then break; fi
done

1) beans
2) skull
3) goat
4) rat
5) ball
6) QUIT
#? 1
You picked beans!
#? 2
You picked skull!
#? 6
You picked QUIT!

Any list will work and the menu will keep looping until you stop it with break. I recently used this to generate a menu from a list of files. What a huge time-saver!

Finally, let’s have an old-school loop party with everybody’s favorite, while:

while [[ $(date +%Y) == "2018" ]]
do
	echo "Still the year 2018..."
	sleep 30
done

Still the year 2018...
Still the year 2018...
Still the year 2018...

To properly test this program, you will need to start it in 2018 and make sure it has stopped when it turns 2019. Give your computer a kiss for good luck at midnight!

Math

Now that you’ve seen a fair bit of Bash syntax, you’re probably wondering how the heck you can do arithmetic in the shell.

There are two ways: declare some variables as integers and Bash will treat them differently (good luck figuring that out years later when you have to revisit your script!); or enclose the expression in an operator made specifically for that purpose: $(( … )).

$ echo $(( 4 + 8 ))
12
$ foo=$(( 6 * 1024 ))
$ echo $foo
6144
$ echo $(( $foo / 17 ))
361

As you can see, it’s all integer math as nature intended. You can also shift bits, perform logical operations, and test relationships between numbers.
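For the record, here’s a minimal sketch of the integer-variable route mentioned above. Note how the assignment looks like a string but gets evaluated as arithmetic - exactly the sort of thing that puzzles future readers:

$ declare -i total    # give total the integer attribute
$ total="6 * 7"       # assignments are now evaluated as arithmetic
$ echo $total
42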

On the other hand, doing large amounts of math in the shell is possibly silly. There are dc (reverse-polish "desk calculator" with a programmable stack) and bc ("basic calculator" - a full-blown mathematical language with variables and conditionals) for that. Just pipe your problem into one of those and get the answer so fast it turns your socks inside-out.
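For example, a quick bit of floating-point division piped into bc:

$ echo "scale=4; 22/7" | bc
3.1428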

getopts

If you’ve ever written a command line utility that takes a lot of optional arguments, you know what a total pain in the crotch-area it can be to perform the logic of matching and parsing all of those options.

Fortunately, Bash provides a convenience feature for parsing these blasted things for you. As with everything "shell", it looks like something designed by an alien, but it’s compact and it works! It’s really easy to explain with an example:

#!/bin/bash

while getopts ":abc:def" opt
do
	case $opt in
		a ) echo "Thanks for choosing to fly with option -a!" ;;
		b ) echo "Sorry, option -b has been deprecated because" ;;
		c ) echo "Yes, we will totally -c the $OPTARG." ;;
		d | f ) echo "Options -d and -f are synonyms:\
they both tell you that they are synonyms." ;;
	esac
done
shift $(($OPTIND -1))

echo "Now we can continue to do things to the '$1'."

See how it runs:

$ chmod +x opts

$ ./opts -a -d wiggler
Thanks for choosing to fly with option -a!
Options -d and -f are synonyms:they both tell you that they are synonyms.
Now we can continue to do things to the 'wiggler'.

$ ./opts -c slippery -b dangler
Yes, we will totally -c the slippery.
Sorry, option -b has been deprecated because
Now we can continue to do things to the 'dangler'.

The bits of the script which need some explanation are these:

  • getopts ":abc:def" opt - the getopts command takes two parameters: the first is a string which defines the single-letter option names (sorry, no long "--GNU-style" options), the second is the name of the variable to hold the option parsed

  • In ":abc:def", the first colon tells getopts to shut up if somebody types a bad option, the colon after 'c' (abc:) tells getopts that the -c option takes an argument

  • Whenever an option takes an argument, it is put into a variable named $OPTARG for you

  • shift $(($OPTIND - 1)) - this evil-looking construct renames the arg variables so that $1 becomes the first argument after the last option parsed by getopts

You can type help shift and help getopts for the full story or just memorize (or copy/paste) this boilerplate - either way, getopts still beats the heck out of manually parsing options!

Arrays

I don’t like Bash’s arrays very much. Change my mind.

echo and read

Two very helpful built-in commands for getting line-based information in and out of Bash scripts are echo and read. Of course, we’re already very familiar with echo, right?

I just want to take this opportunity to point out two very helpful options for echo: -n (no newline) and -e (process escapes, such as \n for newline):

$ echo -n "The date is: "; date
The date is: Tue Sep 11 11:36:22 MST 2018

$ echo -e "Look Ma,\n\tnewlines\n\tand\n\ttabs"
Look Ma,
	newlines
	and
	tabs

It’s also fun (and often quite helpful) to print ANSI escape sequences to display colors and other effects (echo -e "Look at the \e[32mpretty green\e[0m text.").

Akin to echo for output, read reads a line of input from stdin.

$ echo "What kind of beans do you like?"
What kind of beans do you like?
$ read beans
pinto
$ echo "Oh, you like $beans beans. Cool."
Oh, you like pinto beans. Cool.

In addition to being indispensable tools for making interactive scripts, read and echo also let you process lines from any input and output source, not just the terminal, so there are all kinds of things you can do with files or other streams in Bash scripts.
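Here’s a minimal sketch of that kind of line processing, using the one-line foo.txt from the exit status examples above:

$ while read line
> do
>     echo "Got a line: $line"
> done < foo.txt
Got a line: hello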

The O’Reilly book makes an interesting point that reading and writing lines with the shell is fine in small doses, but inefficient in large doses and a bit anti-Unix, because scripts should really be sharing whole streams of input via pipelines, which are far more performant.

Food for thought.

eval

Honestly, I’m still struggling to find truly compelling uses of eval. It does what you’d expect (if you’re familiar with evals in other programming languages): it evaluates an expression and executes it as if you’d typed it yourself at the command line.

I keep looking for good examples, but most of them seem better expressed in other, safer ways. On the other hand, knowing that eval exists will no doubt get me out of some sort of jam in a script some day, so I’ll try to remember it.

The only thing that seems potentially useful is the ability to get the value of a variable whose name is stored in another variable (I’ve seen these called "variable-variables" in other languages) as a level of indirection:

$ bar=snakes
$ foo=bar
$ eval echo \$$foo
snakes

It’s one of those things that makes you say, "why would you want to do that?" until one day you run into a particularly interesting one-off problem with a fiendishly simple solution "if only I could have the shell evaluate my expression twice…​" and on that day you will be glad eval exists.

Trapping signals, etc.

It’s surprising how application-like a Bash script can be: you can even trap signals sent to your script’s process. A fun example:

$ cat > foo
function sulky_exit {
	echo "Fine, whatever."
	exit
}
trap sulky_exit INT
for i in 3 2 1
do
	echo "$(( $i * 5 )) seconds left..."
	sleep 5
done

$ bash foo
15 seconds left...
10 seconds left...
^CFine, whatever.

At first, this seems like a bit of an overkill feature for a shell script to have. But the more I think about it, it is important to be able to have your script clean up files or otherwise gracefully handle exits.

You can do other exotic things like starting subshells (by nesting code in parentheses ( … )), waiting for background subprocesses to finish with wait, and even treating a process as if it were a file with process substitution: you can imagine foo | grep bar rewritten as grep bar <(foo), which gives you some interesting abilities.
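The classic demonstration of that last trick is comparing the output of two commands without making temporary files (the file names here are hypothetical):

$ diff <(sort list-a.txt) <(sort list-b.txt)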

The list goes on.

Bash has taken functionality from nearly 50 years (holy cow!) of Unix shell tradition and combined it into a whole. That’s a lot of stuff.

As always, you have to decide how much of it you want to learn.

Is it okay to rely on Bash?

It certainly is on Slackware for the foreseeable future!

Bash is the default shell for a lot of Linux distros. It has been available (default, in fact) on every Linux-based virtual machine image I’ve encountered on Web/cloud hosting platforms. Is Bash available everywhere?

As near as I can tell, it is pre-installed on or available as a package on just about every current Unix-like OS.

But, honestly, it has been quite challenging to find a single source of information on the availability of various shells on different Linuxes and BSDs. It might be an interesting project some day to put together a survey of current FOSS and commercial Unixes and attempt to figure out which shells are installed by default (in a "normal" or "full" install) and which shell is the default on each.

But that’s a big project. Heck, just figuring out which shell is the default on a completely stock NetBSD 8.0 install is a challenge. The man page for sh(1) just calls it "sh" and says that it takes a lot of inspiration from the Korn shell, however the Wikipedia entry for the Almquist shell says that NetBSD’s sh is ash, the Almquist shell. Oh, and you’ll love this: that information is quoted from the Slackware description of the ash package! Oh, what a tangled web!

(NetBSD boots in under 30 seconds on my old ASUS EeePC 701 palmtop, by the way. Oh, and Bash was not part of the stock install of NetBSD.)

In my opinion, it’s worth knowing all but the darkest corners of the tools you’re using on a constant basis. You could do a lot worse than to invest some time in Bash. Many shells are at least mostly Bash compatible (or the other way around), so a lot of that learning is transferable.

Choose your own adventure!

Conclusion

This post is ridiculously long. I know that.

I plan to extract the most useful bits out into a Bash cheatsheet for myself. If I do that and if I remember, I’ll link that here.

Whew! I’m glad this package is over, but it has been a wonderful learning experience reading and writing so much about Bash - and it’s already paid off. I have no idea what package is next in line. Hopefully something small!

Until next time, happy hacking!

-Dave