
Dave's Slackware Package Blog
coreutils
This package is not what I had in mind when I started this project! It’s quite a large collection of utilities and it’s going to take me a while to get through them all.
I’m going to simply tackle each executable in alphabetical order.
Note: If you really want to dig into the ultimate "source" of truth (pun intended) for these tools, check out the GitHub mirror of the coreutils source:
Well, let’s get on with it: an exploration of each of the coreutils commands!
[ and test
These both take an expression as a command line argument, evaluate it, and exit with a 0
(success) or 1
(fail) status.
In terms of the values of the argument string itself, empty args are fail/false and everything else is success/true. We can see that with the special variable $?
in bash, which prints the last exit status:
$ test; echo $?        # 1
$ test 1; echo $?      # 0
$ test 0; echo $?      # 0
$ test true; echo $?   # 0
$ test false; echo $?  # 0
Or to perhaps make it clearer:
$ if test 1 ; then echo 'true' ; else echo 'false' ; fi
true
$ if test ; then echo 'true' ; else echo 'false' ; fi
false
Hey, wait a second, aren’t test and [ provided by Bash!?
Yes, that’s correct.
Without explicitly specifying the path to the executables, Bash supplies the test functionality with its shell built-in versions:
$ [ --version
bash: [: missing `]'
So to call the executables from the coreutils package, we have to supply an explicit path:
$ /bin/[ --version
[ (GNU coreutils) 8.25
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Kevin Braunsdorf and Matthew Bradburn.
I’ll be honest, I couldn’t be bothered to see if there were any differences between the GNU Bash and GNU coreutils versions of test. My eyes glazed over after staring at the output of help test (Bash) and man test (coreutils) for a while.
I did think it was interesting that [
and test
are separate executables as installed by this package (as opposed to one of them being a symlink to the other). And the executables are not identical:
$ md5sum /bin/[ /bin/test
6e6588788b3ec5110b4532f8f1d912e3  /bin/[
76db2a10639424c0ba7c09c9d6626ec5  /bin/test
So I took a look at the coreutils source (github mirror) of test.c and found that they are the same program, but compiled with slightly different options.
...
if (LBRACKET)
  {
    /* Recognize --help or --version, but only when invoked in the
       "[" form, when the last argument is not "]".  Use direct
       parsing, rather than parse_long_options, to avoid accepting
       abbreviations.  POSIX allows "[ --help" and "[ --version" to
       have the usual GNU behavior, but it requires "test --help"
       and "test --version" to exit silently with status 0.  */
...
That’s the only difference.
Here is the entire contents of lbracket.c:
#define LBRACKET 1
#include "test.c"
arch
Prints the machine architecture.
Example:
$ arch
x86_64
The man page points out that this is the same as running uname -m
, which I can confirm so you don’t have to.
base32
Encodes and decodes (typically binary) data as a string using A-Z and 2-7. Here I encode and decode a string. This is generally a silly thing to do, but the base32 encoded string is a valid filename, unlike the input string (which contains a '/').
$ echo "Hello! Foo/Bar" | base32 JBSWY3DPEEQEM33PF5BGC4QK $ echo "JBSWY3DPEEQEM33PF5BGC4QK" | base32 -d Hello! Foo/Bar
The -w
option to wrap columns may be quite handy if you’re sending this data through a communication platform that will do hard line wrapping for you. Or perhaps for printing if you want to have hardcopy of some small piece of binary data?
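For example, wrapping the encoded string from above at 8 columns should give you something like:

$ echo "Hello! Foo/Bar" | base32 -w 8
JBSWY3DP
EEQEM33P
F5BGC4QK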
I enjoy the --ignore-garbage
option, which I also use around the house when I want to relax.
base64
Works exactly the same way, but uses 64 characters (including '/'), so the result can’t reliably be used as a filename, etc.:
$ echo "Hello! Foo/Bar" | base64 SGVsbG8hIEZvby9CYXIK $ echo "SGVsbG8hIEZvby9CYXIK" | base64 -d Hello! Foo/Bar
Notice the inclusion of lower case characters and the better information density of the base64 encoding.
Any time you find yourself in a text-only medium but need to send or store binary data, base32 and base64 are your friends.
basename
A very handy utility to know for both scripting and constructing cool one-off commands. Best demonstrated and learned by trying it out:
$ basename /foo/bar.txt
bar.txt
$ basename --suffix=.txt /foo/bar.txt
bar
$ basename /foo/bar.txt /foo/baz.txt
bar.txt
$ basename --multiple /foo/bar.txt /foo/baz.txt
bar.txt
baz.txt
$ basename --suffix=.txt /foo/bar.txt /foo/baz.txt
bar
baz
There are also short options for each of these (-s
for suffix and -a
for multiple ("all"?)).
cat
That amazingly handy command we all know and love.
I often use it to quickly write multi-line files and to build up larger commands starting with getting input from a file. Some people say that this is a "useless use of cat", but I think that 'rule' has been parroted far too much.
Sprinkle cat
around as much as you like. Do it with pride. Do it with a "meow". You can always remove it later if you need to.
Knowing cat
is arguably more about knowing your shell (with redirection, pipes, etc.) but it does have some interesting options of its own such as --squeeze-blank
, --number
, --number-nonblank
, --show-nonprinting
, --show-ends
, --show-tabs
, and --show-all
. All of these have short option names too.
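As a quick taste, here's --show-all (-A) making tabs and line endings visible (a minimal example I cooked up for this, not from my original notes):

$ printf 'one\ttwo\nthree\n' | cat -A
one^Itwo$
three$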
Of course, the intended function of cat
is to conCATenate files together, which cat does by simply streaming each file sequentially to STDOUT:
$ cat foo1.txt foo2.txt > multifoo.txt
Now multifoo.txt
contains the contents of both foos.
chcon
Change the security context of a file.
Okay, what is a "security context"? It comes from SELinux, which sounds like the name of a Linux distro. But it’s not, it’s a kernel module.
I’m not really sure what the current story is with Slackware and SELinux and casual searching didn’t reveal any easy answers. As near as I can tell, it’s not a first-class citizen on Slackware. So I’m not going to dive into this right now.
chgrp, chmod, chown
Change a file’s group, "mode" (permissions), owner:
$ chgrp apache foo.html  # apache group now owns file
$ chmod a+r foo.html     # give "all" the read permission
$ chown dave foo.html    # dave now owns file. hell yeah
chroot
I’d seen this mentioned in the context of sandboxing or "jailing" processes as a security measure. But Michael Kerrisk makes it clear in The Linux Programming Interface that this is not the purpose nor strength of chroot
. Instead, there are plenty of useful reasons to give a process a new root directory.
Wikipedia has a history of chroot which indicates it is possible the command was created by Bill Joy in order to test Version 7 UNIX’s installation and build system.
The other place I’d seen mention of chroot
was when using it to recover a system using the Slackware install media
like so.
What chroot
does is quite simple: it modifies the apparent root directory for all file access for the affected process. In other words, it sets /
and everything under /
to a different directory of your choice.
So, how do we go about trying it out?
The man page is a little…uh…light on details.
As usual, we’re supposed to use the info
documentation instead.
I’ll perhaps rant about that some other day.
At any rate, we do learn that the most basic invocation is (roughly):
chroot NEWROOT [COMMAND]
and that
If no command is given, run '"$SHELL" -i' (default: '/bin/sh -i').
Well, that sounds easy. Let’s try:
$ mkdir newroot
$ chroot newroot
chroot: cannot change root directory to 'newroot': Operation not permitted
Uh, okay. The man page failed to mention that we also have to run chroot
as the superuser.
$ sudo chroot newroot
chroot: failed to run command '/bin/bash': No such file or directory
Ah, now that makes sense! Indeed, there is no /bin/bash
under the new root - it’s completely empty!
So how do we get Bash in there? Well, we can just copy it:
$ mkdir newroot/bin
$ cp /bin/bash newroot/bin
$ tree newroot
newroot
`-- bin
    `-- bash

1 directory, 1 file
That ought to do it:
$ sudo chroot newroot
chroot: failed to run command '/bin/bash': No such file or directory
Huh? Okay, so it turns out the error message here is extremely unhelpful - there is, indeed, such a file. However, it could not be executed because it relies on dynamic libraries. That executable is all alone in a cruel, formless void.
Fine, we can fix that, but I want to see something running right now! I’m going to compile a static executable.
$ cat > hello.c
#include <stdio.h>
int main(){ printf("Hello world!"); return 0; }
$ gcc -static hello.c -o hello
$ cp hello newroot/bin/
$ sudo chroot newroot /bin/hello
Hello world!
Ha!
Now, back to the problem at hand: we need an environment that can actually run regular executables that use dynamic shared libraries. Well, I could copy all of those. But I don’t want to. And I can’t just make a symlink because symlinks resolve using paths…and those paths would point outside of my new root.
That’s where "bind mounts" come in. You can call mount --bind <fromdir> <todir>
which will make the contents of fromdir
accessible in both places.
$ mkdir -p newroot/lib64 newroot/lib newroot/usr
$ sudo mount --bind /lib64 newroot/lib64
$ sudo mount --bind /lib newroot/lib
$ sudo mount --bind /usr newroot/usr
Now let’s see if Bash is happy:
$ sudo chroot newroot
$
No news is good news. Now let’s look around:
$ ls
bash: ls: command not found
Huh? But I thought ls
was in /usr/bin
?
$ exit
$ which ls
/usr/bin/ls
$ ls -l newroot/usr/bin/ls
lrwxrwxrwx 1 root root 12 Apr 15 02:35 newroot/usr/bin/ls -> ../../bin/ls
Ohhhhh, that’s a broken symlink to a /bin/ls
that doesn’t exist. Since we already created /bin
to put our copy of Bash in, I’ll just copy ls
over there too:
$ cp /bin/ls newroot/bin
$ sudo chroot newroot
$ ls
bin  lib  lib64  usr
Hey! There we are!
So, I would probably want to mount the real /bin
in my newroot as we have with /lib
and /usr
.
And we might also need /dev
and /proc
and the like. But this definitely works.
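For the record, I believe the incantations for those would look something like this (untested in the session above - /proc is a virtual filesystem, so it gets mounted rather than bind-mounted):

$ mkdir newroot/dev newroot/proc
$ sudo mount --bind /dev newroot/dev
$ sudo mount -t proc proc newroot/proc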
cksum
This does a cyclic redundancy check (CRC) of a given file (or STDIN).
$ echo "hello" | cksum 3015617425 6
The first number is the CRC sum, the second number is the size of input data.
I reckon the real question is: are there still uses for cksum
when MD5 and SHA-based hashing methods are so much more popular for checking file integrity?
I haven’t been able to find any evidence that there are any compelling reasons to use cksum
. It is simple and fast, but so is md5sum
. It appears this utility simply exists for POSIX compatibility. So perhaps you can dig up some ancient scripts which rely on it. I didn’t find any on my Slackware 14.2 system using a naive search.
comm
Compares the sorted lines of two files and displays the results in three columns. The output is easiest to understand with an example:
$ cat foo1.txt
ant
bat
cat
dog
$ cat foo2.txt
ant
bat
crunch
dog
$ comm foo1.txt foo2.txt
                ant
                bat
cat
        crunch
                dog
As you can see, this is perfect for comparing lists of items. Turn specific columns off, and specify the output delimiter between columns.
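For example, the -1, -2, and -3 options suppress the corresponding columns, so with the files above you can pull out just the common lines, or just the lines unique to the second file:

$ comm -12 foo1.txt foo2.txt
ant
bat
dog
$ comm -13 foo1.txt foo2.txt
crunch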
I’m probably most likely to reach for diff
to compare files 99.9% of the time out of sheer muscle memory. But this could certainly be extremely handy in specific circumstances.
It looks like later versions of GNU comm have a --total option which prints a numeric summary of each column at the end. My version doesn’t have this option.
cp
Copy file(s)!
There are some really interesting options to cp
which I have to admit, I didn’t know existed:
-i, --interactive       prompt before overwrite (overrides a previous -n option)
-n, --no-clobber        do not overwrite an existing file (overrides a previous -i option)
-s, --symbolic-link     make symbolic links instead of copying
-u, --update            copy only when the SOURCE file is newer than the
                        destination file or when the destination file is missing
    --backup[=CONTROL]  make a backup of each existing destination file
The backup feature is interesting. You can have cp
make backups of any files which are about to be overwritten. You can even specify how the backups are named. See the man page for details.
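A quick sketch of what I mean, using numbered backups (filenames made up for the example):

$ cp --backup=numbered notes.txt backup/notes.txt
$ ls backup
notes.txt  notes.txt.~1~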
I do use -r
sometimes to copy directories:
$ cp -r foo foo2 # where foo is a dir
UPDATE: Scroll down to 'dd' for more about the AMAZING POWER of cp!
csplit
Like comm
above, csplit
is a utility that may come in very handy for specific needs. In this case, you can split an input file (or STDIN if you use -
as the filename) into separate files using the delimiter(s) of your choice. There are options for specifying output filenames, etc.
Sure, it’s not as flexible as a bespoke script (AWK, Perl, etc.), but it’s a lot easier to learn than a full-blown programming language. And it’s pretty dang flexible. I’m not going to go into a deep exploration of all of the options, but I did learn enough to come up with this example:
$ cat | csplit - /===/ {*}
Line 1
Line 2
===
14
Line 3
Line 4
Line 5
===
25
Line 6
Line infinity.
26
(I typed everything above except the numbers after each === line - csplit wrote those - they’re the number of bytes written to each file it created.)
By default, the output files are named xxNN
where NN is a number starting with 00. Here’s the output:
$ head xx*
==> xx00 <==
Line 1
Line 2

==> xx01 <==
===
Line 3
Line 4
Line 5

==> xx02 <==
===
Line 6
Line infinity.
As you can see, csplit created three files for us: xx00, xx01, xx02.
cut
cut
splits up lines by delimiters and prints out only the fields (columns) you specify. Example:
$ cat | cut -d' ' -f2 -
cat cow dog
cow
moose lemon snakes
lemon
As you can see, we specified a space ' '
and requested field 2
. So we get the second item of each list echoed back at us.
Honestly, I have no use for cut
. AWK is just so much better.
Here’s the same thing in AWK:
$ cat | awk '{print $2}'
cat cow dog
cow
moose lemon snakes
lemon
By default, AWK breaks up tokens by whitespace. cut
would fail in the above example if there were two spaces (or a tab) between items. AWK would succeed "out of the box".
Note: also see paste
below.
date
By default, date
gives you a human-readable date and time:
$ date
Thu Jan 23 20:07:38 EST 2020
You can also set the date with this utility or parse and reformat a date. It’s pretty impressive what it can understand:
$ date -d'feb 29 2020'
Sat Feb 29 00:00:00 EST 2020   # the 29th will be on a Saturday
$ date -d'monday'
Mon Jan 27 00:00:00 EST 2020   # the next Monday will be on the 27th
Handy output formatting options include:
$ date -u
Fri Jan 24 01:10:12 UTC 2020   # UTC date
$ date +%F
2020-01-23                     # YYYY-MM-DD format
$ date +%T
20:17:19                       # HH:MM:SS format
$ date +%s
1579828611                     # standard UNIX epoch timestamp
So to get the current datetime in the One True Datetime format, ISO 8601, you do this:
$ date +'%FT%T'
2020-01-23T20:30:07
dd
Copy to and from files (including UNIX devices). Probably the weirdest old-school weapon of destruction you’re likely to actually use on a somewhat regular basis.
I use this all the time to write file system images to USB "thumb" flash memory drives.
(Update: Ha ha, I don’t any more! Keep reading.)
I have to look up the syntax with man
every. Single. Damn. Time.
The basic usage is simple: specify an input file ("if") and output file ("of"). Writing a bootable ISO image to a USB drive might look like this:
$ dd if=/slackware14.2.iso of=/dev/sdb
There are about a billion options with this utility (at last, you can convert your old EBCDIC text files!).
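For completeness, here's what my ISO-writing command above might look like with an explicit block size and a progress report (status=progress is a GNU dd extension; bs=4M is just a popular choice, not a recommendation):

$ dd if=/slackware14.2.iso of=/dev/sdb bs=4M status=progress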
One thing I’ve always wondered: how much of a performance hit am I taking if I don’t specify the block size to read/write?
In searching for an answer to this question (which amounts to, basically, "it varies"), I came across this really interesting answer on Unix & Linux Stack Exchange:
There’s a bit of a cargo cult around dd. Originally, there were two bugs in cp that caused problems: It would misdetect files as sparse when reported with a block size other than 512 (Linux used a block size of 1024), and it did not clear empty blocks from the destination when copying from a sparse file to a block device…
This led me to The Cult of DD and Efficient File Copying On Linux
I love this stuff! Turns out you can use cp
or even cat
(!) to perform exactly the same thing (and probably even better, since these tools do their best to figure out an optimal block size for you!)
I will try to remember to use cp
next time I write an ISO. So long, dd
.
A year later: I’m still writing this article and since then, I’ve written dozens of OS install images to USB using cp
. It works great. I won’t claim any speed increases since I haven’t benchmarked the two methods, but the syntax is so much easier:
# cp downloads/os.3.9.iso /dev/sdb
Second year: I’ve written ISOs for all three major BSD installers (FreeBSD, OpenBSD, and NetBSD) and lost track of the number of Linux distro installers to several generations of SanDisk and Kingston USB "thumb" drives with the humble cp
command like the example above and have had absolutely no issues with it!
Do not waste any more of your brain cells on dd
. "Burn" your ISOs with cp
with this one weird trick!
Third year: LOL, I even used cp to recover an entire drive in 2022.
df
df
stands for "disk free" and shows the usage and free space available on the mounted filesystems available.
dave@europa~$ df
Filesystem     1K-blocks      Used Available Use% Mounted on
tmpfs              32768      1492     31276   5% /run
devtmpfs        16399100         0  16399100   0% /dev
/dev/sda3      894228804 113366324 735368584  14% /
tmpfs           16407384     28408  16378976   1% /dev/shm
cgroup_root         8192         0      8192   0% /sys/fs/cgroup
/dev/sda1         101590     17120     84470  17% /boot/efi
none              102400         8    102392   1% /var/run/user/1000
The best option is the -h
flag which (like many utilities) displays "human readable" values rather than bytes:
dave@europa~$ df -h
Filesystem      Size  Used Avail Use% Mounted on
tmpfs            32M  1.5M   31M   5% /run
devtmpfs         16G     0   16G   0% /dev
/dev/sda3       853G  109G  702G  14% /
tmpfs            16G   28M   16G   1% /dev/shm
cgroup_root     8.0M     0  8.0M   0% /sys/fs/cgroup
/dev/sda1       100M   17M   83M  17% /boot/efi
none            100M  8.0K  100M   1% /var/run/user/1000
dir
Amazingly, dir
is not just an alias to ls
. It is very, very similar, but does provide a slightly different output format.
Both utilities share the same source. Check it out!
#include "ls.h" int ls_mode = LS_MULTI_COL;
The source for ls
is the same thing, but with ls_mode = LS_LS;
. You can see the little subtle differences by searching for ls_mode
in ls.c
.
dircolors
I had no idea this existed. It creates a pretty setting for the LS_COLORS
environment variable used by ls
. Check it out:
$ dircolors
LS_COLORS='rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=00:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*....
Wooooeeee, that would have been fun to type by hand. Good thing dircolors does it for us.
If you eval
the output of dircolors
, you’ll set the environment variable in your current shell session and ls
will be mighty pretty.
$ eval $(dircolors)
See more invocation goodness here:
dirname
You may have used this one in shell scripts before (I know I have). It chops the last item after the '/' slash directory separator from a string.
Note that a trailing slash doesn’t "count" as the final component of the path:
$ dirname /foo/bar/baz
/foo/bar
$ dirname /foo/bar/baz/
/foo/bar
du
Stands for 'disk usage' and the default output is the size in bytes of every dang file in every dang directory and subdirectory in the current directory.
Most of the time, you probably want the -s
("summarize") and -h
("human readable") options to clean up the output. Here’s the size of the current directory (including any subdirectories):
$ du -sh
448M    .
Given file/directory name arguments (or a glob), it will summarize those as well. This is what I want most of the time:
$ du -sh *
11M     bin
110M    docs
7.1M    dotfiles
53M     img
96M     proj
173M    wiki
echo
Right up there with cat
in terms of importance in CLI and shell script wizardry.
$ echo "Echo!" Echo!
GNU echo
has a couple really important options. -n
does not output the trailing newline. -e
turns on backslash string escape characters.
But, reading man echo
will steer you to a really important note: the echo
command is usually provided by your shell!
Check it out in Bash:
$ help echo
echo: echo [-neE] [arg ...]
    Write arguments to the standard output.

    Display the ARGs, separated by a single space character and followed by a
    newline, on the standard output.

    Options:
      -n    do not append a newline
      -e    enable interpretation of the following backslash escapes
      -E    explicitly suppress interpretation of backslash escapes
    ...
Of course, you can still call GNU echo explicitly like so:
$ /bin/echo "In the name of the Senate and the people of Rome." In the name of the Senate and the people of Rome.
Either way, it’s cool to embed things like ANSI color codes in strings and output them with echo like so:
$ echo -e "\e[31mRed Text\e[0m"
Red Text
You can’t see it here, but in my terminal, the above "Red Text" is red.
It’s fun to read the source code for GNU echo here:
The whole thing is under 300 lines and boils down to a series of calls to putchar()
.
env
I’m most familiar with this little util from writing scripts in languages such as Ruby where the standard "shebang" invocation is thus:
#!/usr/bin/env ruby
The description "run a program in a modified environment" simply means that env
gives you control over the environment variables, working directory, signal handling (such as blocking signals from being delivered), and CLI arguments.
When we use env
in a script shebang line, we’re typically not using any of those features. We’re just using it to invoke the executable using the current $PATH
rather than by absolute path. In the above Ruby example, env
will execute whichever version of ruby
it can find based on my environment rather than executing a specific version such as whatever I have installed at /usr/bin/ruby
.
The -s
or --split-string
option lets you pass multiple arguments to an executable on the "shebang" line, which is something that is not otherwise possible if you’re calling the executable directly.
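If your env is new enough (I believe -S arrived around coreutils 8.30, so your system may or may not have it), a shebang like this passes the -w flag along to Ruby, which the plain #!/usr/bin/env ruby form can't do:

#!/usr/bin/env -S ruby -w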
expand
Convert tabs to spaces! expand -t 4 fart.c
will convert tabs to four spaces. The -i
option only converts the initial tabs on each line to spaces.
Fire the first shot in a "tabs vs spaces wars" with your friends!
Also see the unexpand
command which is the reverse of this below.
expr
According to Wikipedia, expr
was born in 1979. It can be used as a serious scripting tool or just do some quick math at the command line:
$ expr 5 + 5
10
Supports string (including regex), numeric, and logical functions and relational tests.
$ expr length "Goat cheese" 11 $ expr index "Goat cheese" c 6 $ expr substr "Goat cheese" 2 8 oat chee $ expr match foo foo 3 $ expr match foo bar 0
This is one of those legacy things that will probably be around forever. So you can count on it. But there are likely better alternatives for most of its functionality.
factor
Returns the prime factors of an integer:
$ factor 96
96: 2 2 2 2 2 3
This comes to us from 1974. Is it the weirdest thing that ships with coreutils? We’ll see…
false
Always returns a failure (false) exit status. Though it can be used in shell scripts when we want a false condition to always occur, I think I’ve mostly seen it used as the "dummy shell" of user accounts that cannot log in. As in, "enjoy running that in false, ha ha ha."
I recall that Bash also has a false
built-in and other shells may also. So unless you specifically call /usr/bin/false
, you’re actually calling your shell’s command.
Not to be confused with the False esoteric programming language which was apparently one of the inspirations for everybody’s favorite esolang: Brainfuck.
fmt
Oooh, I have some experience with this one. It re-wraps lines to fit a certain width. I tried to script a Gopher content generator using a very light script wrapper around fmt
. Sadly, it just wasn’t quite featureful enough to format all of my content (there was no way to have it ignore my "ASCII art" and stuff like that). Otherwise, it does a really nice job of formatting paragraphs of text using an algorithm that tries to even out the lengths of neighboring lines to make the output attractive.
$ fmt -w 80 meow
The Cats Will Know By Cesare Pavese Translated by Geoffrey Brock

Rain will fall again on your smooth pavement, a light rain like a breath or
a step. The breeze and the dawn will flourish again when you return, as if
beneath your step. Between flowers and sills the cats will know.

$ fmt -w 20 meow
The Cats Will Know
By Cesare Pavese
Translated by
Geoffrey Brock

Rain will fall
again on your
smooth pavement, a
light rain like a
breath or a step.
The breeze and the
dawn will flourish
again when you
return, as if
beneath your step.
Between flowers
and sills the cats
will know.
fold
Reformats line lengths. You probably want fmt
(above), which is generally superior. This version might be used in scripts that want very specific behavior.
groups
Prints the groups a user is in. If you don’t specify the user, it’s you! Here’s the interesting bit: if you don’t specify the user, you may see different groups based on the current process:
$ groups dave
dave : users wheel
$ groups
users lp wheel floppy audio video cdrom scanner
Hey, I can look in /etc/passwd
and /etc/group
and see that I’m explicitly a member of users
and wheel
, but where are all these extra "system" groups coming from?
I spent the better part of an evening trying to figure it out and then finally asked and got the answer here:
They’re being assigned at runtime by the login
executable. Specifically, they’re set in the CONSOLE_GROUPS
setting in /etc/login.defs
:
$ grep floppy /etc/login.defs
CONSOLE_GROUPS          floppy:audio:cdrom:video:lp:scanner
head
One of those truly great little utilities that has all kinds of uses. One of the most important options to know is -n
, which lets you specify the number of lines to display from the "head" of a text file:
$ head -n 2 meow
The Cats Will Know
By Cesare Pavese
I use head
all the time to preview files. I especially like using it to preview multiple files at once:
$ head -n 2 meow*
==> meow <==
The Cats Will Know
By Cesare Pavese

==> meow2 <==
The cat's song
By Marge Piercy

==> meow3 <==
A Little Language
By Robert Duncan
I’ll often use it with a glob and pipe it through a pager to view a whole directory’s worth of files really quickly.
UPDATE: I also recently used this to send part of a binary file over a serial
connection while debugging some microcontroller code.
To end the stream after a number of bytes, use the -c
option like so:
$ head -c 1024 rickroll.ogg > /dev/boombox0
You can also end the stream some number of bytes before the end. Check the man page. This is the sort of utility that makes "The Unix Way" glorious sometimes.
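If I'm reading the man page right, a leading - on the byte count means "all but the last NUM bytes," so trimming the tail off a file would look something like this (trimmed.ogg is just a made-up output name):

$ head -c -1024 rickroll.ogg > trimmed.ogg  # everything except the last KB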
hostid
Huh, this was completely new to me. Returns a unique ID for the current machine.
$ hostid
007f0100
Basically a wrapper for gethostid()
. See man gethostid
:
NOTES
   In the glibc implementation, the hostid is stored in the file
   /etc/hostid.  (In glibc versions before 2.2, the file /var/adm/hostid
   was used.)

   In the glibc implementation, if gethostid() cannot open the file
   containing the host ID, then it obtains the hostname using
   gethostname(2), passes that hostname to gethostbyname_r(3) in order
   to obtain the host's IPv4 address, and returns a value obtained by
   bit-twiddling the IPv4 address.  (This value may not be unique.)
Apparently this is mostly used for software licensing. A lazy Web search on my part returned no other uses.
id
The output of id
is a little geekier than groups
(above), but I like that it shows the uid
and gid
along with the supplementary groups. Otherwise, it’s the exact same info:
$ id
uid=1000(dave) gid=100(users) groups=100(users),7(lp),10(wheel),11(floppy),17(audio),18(video),19(cdrom),93(scanner)
$ id dave
uid=1000(dave) gid=100(users) groups=100(users),10(wheel)
(Note that my current shell has been granted membership in additional groups by the login
process at "runtime" - default Slackware behavior.)
install
I’d never heard of this util. I’m not surprised. It’s hilariously difficult to look up using a Web search (with a name like "install"…)
It can create directories and copy files and set attributes (such as permissions) on them.
My first thought was "is this traditionally related to the 'install' step of some Makefiles?" Looks like that can be the case:
Here’s the only other non-manpage link I found in a lazy search:
I suspect you’re generally better off with cp
(which can do recursive copying and has a lot of the same features, such as --backup
) or something much more powerful like rsync -a
.
I’d love to know if there’s a compelling use case for install
.
2021-05-11 Update
Danny writes in:
I actually know the reason for install! In unix-likes (I’ve verified on Linux and OpenBSD) you can’t overwrite the file of a running executable. If you have, say, vi running and your install script tries to cp over /bin/vi it’ll fail. Install does the proper dance to handle this (rename /bin/vi to something like /bin/vi.old, cp vi into /bin/vi, rm /bin/vi.old)
Which makes total sense. Thank you Danny!
I just tried it. Terminal 1:
$ cp /usr/bin/ed .
$ ./ed
Terminal 2:
$ cp foo ed
cp: cannot create regular file 'ed': Text file busy
But it would let me delete and replace ed
!
2022-10-25 Update
Julien writes:
I use install a lot to make one-liners: install -d -m0750 /tmp/toto, instead of mkdir + chmod. …And yes, install is very much used in autoconf/automake dev setups.
join
I’ve seen examples of join
before, but never used it myself. It does a relational join (like database tables) on two files based on a common field. If you don’t specify the common field to join on, it defaults to the first field. Of course, the concept of "field" is common to a lot of UNIX tools (like awk) and means "whitespace-separated stuff" unless you specify otherwise.
I’ll make a little contrived example:
$ cat > users
101 Dave
102 Skeletor
103 Beansmaster
104 Thegoat
$ cat > email
101 dave@example.com
102 skeleton99@example.com
103 iluvbeans@example.com
104 goat.goaterson@example.com
$ join users email
101 Dave dave@example.com
102 Skeletor skeleton99@example.com
103 Beansmaster iluvbeans@example.com
104 Thegoat goat.goaterson@example.com
Boom! You have a text file database!
There are all kinds of ways to use this utility and the man page could sure use some more examples! But I think the most important thing is to just remember that it exists. I could easily see this saving me from writing a shell script someday!
link
The moment I saw this I typed "ln vs link" in my search engine of choice which returned this as the top answer:
Short answer: Link always creates hard links and differs in some other details. But I can’t fathom why you would use link
on purpose! Perhaps someone can enlighten me.
ln
Ah, this is the linking tool we’re looking for. You can create hard and soft links with ln
. I make a fair amount of soft links to make my life easier. Often to put executable stuff in my path without having to add to my path or move files out of their homes (like when I compile Git projects I’ve cloned).
There are a modest number of options, so check out man ln
for the full details. Basically, you can have ln
prompt you before overwriting, or make a backup, or force, or make hard/soft/"physical"/relative links.
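Here's the sort of thing I mean (hypothetical paths), including the -r option that computes a relative symlink for you:

$ ln -s ~/proj/mytool/mytool ~/bin/mytool   # put a cloned project's binary in my PATH
$ ln -sr shared/config.ini app/config.ini   # -r makes the link target relative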
logname
Nobody knows who wrote this and it doesn’t work?
$ man logname
logname - print user's login name
$ echo $LOGNAME
dave
$ logname
logname: no login name
On Slackware, I rate this: "1/10 Not recommended."
UPDATE: This is a fractal of garbage. I was amused by my original review and decided
to pursue it further. The source of logname.c
has 73 lines of GNU boilerplate code
and then this:
/* POSIX requires using getlogin (or equivalent code) and prohibits
   using a fallback technique.  */
cp = getlogin ();
if (! cp)
  die (EXIT_FAILURE, 0, _("no login name"));
puts (cp);
From this we learn two things:
- This is just a front-end for the libc function getlogin().
- POSIX requires that it behave this way.
The man page for getlogin() appears to be the output of some tortured souls. Here’s the BUGS section:
Unfortunately, it is often rather easy to fool getlogin(). Sometimes it does not work at all, because some program messed up the utmp file. Often, it gives only the first 8 characters of the login name. The user currently logged in on the controlling terminal of our program need not be the user who started it. Avoid getlogin() for security-related purposes. Note that glibc does not follow the POSIX specification...
And the best part:
Nobody knows precisely what cuserid() does; avoid it in portable programs. Or avoid it altogether: use getpwuid(geteuid()) instead, if that is what you meant. Do not use cuserid().
ls
List files! Often used as an example of a utility that breaks the "do one thing and do it well" maxim. Check out the man page for the (surprisingly long) list of options.
Truth is, I think it’s pretty handy to have all of those output options for ls
. I pretty much never need them, but when I do, a lot of them would be really hard to replicate using other tools, right?
md5sum
Gotta have this. People love taking a big old stinky dump on MD5 all the time, but it still does exactly what it was meant to do: fast, low-collision hashes. It’s the right tool for the job when the job calls for MD5 sums!
$ md5sum meow
2731deb26ce04aa850042dcc40cccdb3  meow
Having said that, coreutils also comes with sha1sum
(see below) which "better". But it really depends on what you’re trying to accomplish.
mkdir
Make directories! Looking at the man page, I see that you can set the mode (as in permissions) as you create the directory, which is nice. But I think the best option to memorize is the -p
option to make any parent directories as needed:
$ tree
.

0 directories, 0 files
$ mkdir -p a/b/c
$ tree
.
`-- a
    `-- b
        `-- c

3 directories, 0 files
mkfifo
As a developer, I know the acronym FIFO means "First In, First Out", which describes a queue - the first thing into the queue is the first thing that comes out. (The opposite is what we call a "stack" - FILO or "First In, Last Out" in which items are pushed onto and popped off of the top of a stack like dinner plates - you can’t get to the first plate until all of the ones on top of it have been removed.)
It turns out, mkfifo
creates a queue mechanism we’re all very familiar with, a pipe. Specifically a named pipe. I’d heard of named pipes, but had never used them before.
It’s really interesting. First you create the pipe "file":
dave@europa~/tmp/foo$ mkfifo foo
dave@europa~/tmp/foo$ ls -l
prw-r--r-- 1 dave users 0 Aug 30 14:43 foo
Note the p at the beginning of prw-r--r--.
Now I’m going to start displaying the "contents" of the foo pipe "file":
$ cat < foo
Then in another terminal, I’ll send something to that "file":
$ date > foo
In my first terminal, I now see this:
$ cat < foo
Sun Aug 30 14:48:47 EDT 2020
The pipe closed when date
was finished. Now, in the second terminal I can repeat the output…
$ date > foo
…and it sits there waiting until I open the pipe up for output again.
It’s a fun party trick, but I’ll be keeping an eye out for uses in my real computing life.
mknod
This command line tool seems to be almost entirely a historical curiosity.
Wikipedia has a great writeup in its Device file article.
In Unix-like operating systems, a device file or special file is an interface to a device driver that appears in a file system as if it were an ordinary file.
and
Nodes are created by the mknod system call. The command-line program for creating nodes is also called mknod. Nodes can be moved or deleted by the usual filesystem system calls (rename, unlink) and commands (mv, rm).
mknod
takes four parameters: a filename (to create), a type (b=block, c|u=character, p=FIFO), and then two numbers: major and minor device type IDs. The device numbers are the most opaque (and stupid) thing about this system.
The best explanation came from Chapter 14 (File Systems) in the book The Linux Programming Interface. Reading that is how I learned that you can list these magic numbers with ls -l
:
$ ls -l /dev
total 0
crw-------  1 root root  10, 235 Sep  3 10:44 autofs
drwxr-xr-x  2 root root      600 Sep  3 10:44 block
drwxr-xr-x  2 root root       60 Sep  3 10:44 bsg
crw-------  1 root root  10, 234 Sep  3 10:44 btrfs-control
drwxr-xr-x  3 root root       60 Sep  3 10:44 bus
drwxr-xr-x  2 root root     5600 Sep  3 14:48 char
crw-------  1 root root   5,   1 Sep  3 14:44 console
lrwxrwxrwx  1 root root       11 Sep  3 10:44 core -> /proc/kcore
drwxr-xr-x 18 root root      380 Sep  3 10:44 cpu
crw-------  1 root root  10,  60 Sep  3 14:44 cpu_dma_latency
crw-------  1 root root  10, 203 Sep  3 10:44 cuse
drwxr-xr-x  6 root root      120 Sep  3 10:44 disk
drwxr-xr-x  3 root root      100 Sep  3 10:44 dri
crw-------  1 root root 239,   0 Sep  3 14:44 drm_dp_aux0
...
Huh, so can I make a "fun" one like /dev/random
? Let’s see. I’ll start by looking at the device numbers on the current one:
$ ls -l /dev/random
crw-rw-rw- 1 root root 1, 8 Sep  3 14:44 /dev/random
$ mknod rando c 1 8
mknod: rando: Operation not permitted
$ sudo mknod rando c 1 8
$ ls -l rando
crw-r--r-- 1 root root 1, 8 Sep  3 17:37 rando
$ head --bytes=16 rando | hexdump
0000000 1c1a 28e5 8690 7c84 486b 2d64 504a f88f
0000010
Ha ha, that’s pretty cool. By the way, I totally guessed the c
option for "character" device.
The list of device numbers should be here: http://lanana.org/ But when I checked, the Linux Device List page was throwing a 404 error.
The Wayback Machine has a copy of the list that was current in 2009 (archived in 2019): https://web.archive.org/web/20190429050512/http://www.lanana.org/docs/device-list/devices-2.6+.txt
And yeah, check it out, there’s random and some other character devices near the top of the list:
  1 char        Memory devices
                  1 = /dev/mem          Physical memory access
                  2 = /dev/kmem         Kernel virtual memory access
                  3 = /dev/null         Null device
                  4 = /dev/port         I/O port access
                  5 = /dev/zero         Null byte source
                  6 = /dev/core         OBSOLETE - replaced by /proc/kcore
                  7 = /dev/full         Returns ENOSPC on write
                  8 = /dev/random       Nondeterministic random number gen.
                  9 = /dev/urandom      Faster, less secure random number gen.
                 10 = /dev/aio          Asyncronous I/O notification interface
                 11 = /dev/kmsg         Writes to this come out as printk's
The rest of the list contains devices from the general (my main SSD drive is a block,8,0 "First SCSI disk whole disk") to the specific (char,10,4 "Amiga mouse (68k/Amiga)").
There are a couple great references to mknod
in the Unix Admin Horror Story list here:
http://www-uxsup.csx.cam.ac.uk/misc/horror.txt
mktemp
Note that depending on when you read this, mktemp
might be the GNU version or an old Debian version, which Slackware used for backward compatibility with scripts until 2018. If you have the new GNU version, you’ll see this:
$ mktemp --version
mktemp (GNU coreutils) 8.32
...
…in which case, the old version is still available as mktemp-debianutils
.
The olde mktemp that comes with Slackware 14.2 tells you its version like so:
$ mktemp -V
mktemp version debianutils-2.7
mktemp
is an incredibly handy tool for a lot of file-handling tasks, especially in scripts. If you don’t know about it, you’ll end up re-implementing it poorly (like me).
The important thing to know is that if you run it, it’ll create a unique temporary file for you in /tmp/
and return the name so that you can capture it in your script:
$ mktemp
/tmp/tmp.tVn7z3
I think the most important option to remember is -d
to make a new directory rather than a file (which is usually what I need):
$ mktemp -d
/tmp/tmp.ecmk4s
$ ls -ld /tmp/tmp*
drwx------ 2 dave users 4096 Sep  4 00:49 /tmp/tmp.ecmk4s/
-rw------- 1 dave users    0 Sep  4 00:48 /tmp/tmp.tVn7z3
Both versions of mktemp
work the same way for simple file and directory creation, so just consult the man page for the installed version if you need to get advanced (like using templates for the filename format).
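The typical script pattern, as I understand it, pairs mktemp -d with a trap so the scratch directory cleans itself up (a sketch, with a made-up work step):

#!/bin/sh
workdir=$(mktemp -d) || exit 1        # bail if we can't get a temp dir
trap 'rm -rf "$workdir"' EXIT         # clean up no matter how we exit
sort input.txt > "$workdir/sorted"    # hypothetical work using the dir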
mv
Move files or directories!
$ mv source destination
There are a lot of nice options and you should definitely skim the man page real quick if you’re about to use it in a script.
Otherwise, just know that it can let you control how to deal with existing files with the same names including prompting you (interactively) for each one rather than just overwriting them.
nice
You know, I have yet to find a compelling reason to set the priority of processes on my systems. Maybe I’m just not cool enough. I did know about nice
, though.
I reckon there are two hard things to remember about the command.
First, the -
in front of the priority argument is not a negation, but the argument prefix:
$ nice foo -10
$ nice bar --15
launches foo
with a priority of positive ten and bar
with negative fifteen!
Second, the priority scale goes from -20
to 19
where the lower the number, the higher the priority given to the process.
I ran some examples and looked at them with htop
and…the results are too boring even for me to care about.
Oh, and if you run it without any arguments, nice
will tell you what your current shell’s priority is:
$ nice
0
Boring.
nl
Huh, I had no idea this existed. nl
stands for "number lines" and here’s what it does:
$ cat foo
Wolves enjoy waterballoons.
Don't eat the wrong fruit.
My skull is made of cheese.
Welcome to the pain castle.
$ nl foo
     1  Wolves enjoy waterballoons.
     2  Don't eat the wrong fruit.
     3  My skull is made of cheese.
     4  Welcome to the pain castle.
Clearly it is giving us some padding in the gutter for multiple digits. By the way, that’s also a TAB character after the numbers. As you’d expect, you can adjust these settings:
$ nl -w2 -s' ' foo
 1 Wolves enjoy waterballoons.
 2 Don't eat the wrong fruit.
 3 My skull is made of cheese.
 4 Welcome to the pain castle.
It’s actually a pretty cool tool. You can specify the format of the numbering, choose to number only non-blank lines, choose a delimiter for "sections" (and restart numbering for sections). I would seriously consider reading the man page for this tool before embarking on any sort of scripted numbering task. It looks to be flexible enough to cover most uses. I like it.
nohup
Like nice
above, I’d heard of but never used the "no hangup" command nohup
. I understood that it lets your process run even when the terminal "hangs up the phone" (more or less literally back in the early days of UNIX).
But this leads to those "when should I use" and "why should I use" questions that the man pages and other documentation carefully avoid answering. (Grrrr.)
Thankfully, there is this excellent explanation here on serverfault:
nohup makes a program ignore the HUP signal, allowing it to run after current terminal is closed / user logged out. Nohup does not send program to background.
Usually nohup and & are combined to launch program which runs after logout of user and allows to continue work at current shell session.
So, if you want to launch a process from the current terminal and not have it die when you exit that terminal, use nohup
.
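Putting that together, the usual incantation looks something like this (job and log names made up, of course):

$ nohup ./long-job.sh > job.log 2>&1 &

The redirection is optional - without it, nohup appends the output to a file called nohup.out.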
But, as Tim O’Reilly (yes, that Tim O’Reilly) points out in my copy of the book Unix Power Tools, maybe what you really want to do is start a job with something like at
, cron
, or batch
.
nproc
This was a surprise. Did you know there is a command that returns the number of processors ("processing units") available to the current process?
Not to brag (in the year 2020 - I realize that when you are reading this, you’ll have a much larger number), but here’s mine:
$ nproc
16
With apologies to Tennessee Ernie Ford:
You have sixteen cores, what do you get?
Another day older and deeper in technical debt.
Saint IGNUcius don't you call me, 'cause I can't go
I owe my soul to the company repo.
numfmt
Today is just full of surprises. Need that number in a different (human-readable) format? GNU’s got the coreutil for you!
Let’s jump straight to some examples:
$ numfmt --to=si 10000
10K
$ numfmt --to=iec 10000
9.8K
$ numfmt --grouping 8712634782364
8,712,634,782,364
Note: the "iec" format above refers to the International Electrotechnical Commission binary prefix designations. Where 1K is 1024 as in 210. Not to be confused with the International System of Units (SI).
od
As you’re probably used to hearing a lot around the house, it’s "time to take an od!" Referring, of course, to an "octal dump".
$ od foo
0000000 067527 073154 071545 062440 065156 074557 073440 072141
0000020 071145 060542 066154 067557 071556 005056 067504 023556
0000040 020164 060545 020164 064164 020145 071167 067157 020147
It’s just like the hexdump
command (which I was familiar with) from the util-linux package…but in octal!
Actually, od
can output hexadecimal and a ton of other units and formats. Here’s hex:
$ od -x foo
0000000 6f57 766c 7365 6520 6a6e 796f 7720 7461
0000020 7265 6162 6c6c 6f6f 736e 0a2e 6f44 276e
There’s a huge number of formatting options including the endianness of the bytes, so if you have some specific needs while trying to view the contents of a file or stream, check out the man page!
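For instance, if my reading of the man page is right, this combination gets you a hexdump -C style view (hex addresses, single bytes, printable characters alongside):

$ od -A x -t x1z foo
000000 57 6f 6c 76 65 73 20 65 6e 6a 6f 79 20 77 61 74  >Wolves enjoy wat<
...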
paste
This is a weird utility.
$ paste foo bar
Wolves enjoy waterballoons.     Bar line one.
Don't eat the wrong fruit.      Bar line two.
My skull is made of cheese.     Bar line three.
Welcome to the pain castle.     Bar line four.
                                Bar line five!
In the words of the man page, it will "write lines consisting of the sequentially corresponding lines from each FILE, separated by TABs, to standard output."
So the question is not "what", but "why". Why would we use this?
First of all, it’s clearly a companion to the cut
command (reviewed above):
$ cut -d' ' -f 1 foo | paste - bar
Wolves          Bar line one.
Don't           Bar line two.
My              Bar line three.
Welcome         Bar line four.
                Bar line five!
or with AWK (which I find a lot easier to remember):
$ awk '{print $1}' foo | paste - bar
Wolves          Bar line one.
Don't           Bar line two.
My              Bar line three.
Welcome         Bar line four.
                Bar line five!
In the above examples, I’ve used the standard -
filename placeholder for STDIN.
A neat trick is to use multiple -
to put any input into columns:
$ cat | paste - - -
dog
cat
cow
dog     cat     cow
$ cat foo | paste - -
Wolves enjoy waterballoons.     Don't eat the wrong fruit.
My skull is made of cheese.     Welcome to the pain castle.
This isn’t something you’re going to need every day, but when you do, it will be waiting, ever faithful.
pathchk
Wow, the man page for this is useless!
pathchk
takes a pathname as a parameter and exits with a non-zero status if it’s not a valid path.
Mind you, this utility only tells you if a path is, like, potentially valid.
$ if pathchk 'foo'; then echo "ok"; else echo "bad"; fi
ok
$ if pathchk '~f'; then echo "ok"; else echo "bad"; fi
ok
$ if pathchk '/////~f'; then echo "ok"; else echo "bad"; fi
ok
$ if pathchk '/////~f\9000'; then echo "ok"; else echo "bad"; fi
ok
$ if pathchk '/////~f\x9'; then echo "ok"; else echo "bad"; fi
ok
$ if pathchk '/////~f$meow!@#$'; then echo "ok"; else echo "bad"; fi
ok
$ if pathchk ''; then echo "ok"; else echo "bad"; fi
pathchk: '': No such file or directory
bad
It’s actually pretty hard to make a path that isn’t valid. What’s probably more useful is knowing if a path actually points to a file or directory and your shell has way better utilities for that.
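The stricter POSIX mode (-p) is where it gets pickier - as I understand it, it checks against the portable filename character set and length limits, so even a humble space is enough to fail (exact message wording may vary by version):

$ pathchk -p 'foo bar'; echo $?
pathchk: nonportable character ' ' in file name 'foo bar'
1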
Even a full StackExchange search for pathchk reveals very little. I would say that uses of this utility are going to be pretty rare.
pinky
From the man page:
pinky - lightweight finger
Ha ha, what?
Oh yeah, that finger
! Okay, how do they compare?
$ finger dave
Login: dave                             Name:
Directory: /home/dave                   Shell: /bin/bash
On since Sun Sep 13 10:55 (EDT) on tty1    2 hours 25 minutes idle
No mail.
No Plan.
$ pinky dave
Login    Name     TTY      Idle   When              Where
dave              tty1     02:25  2020-09-13 10:55
Okay. Ha ha, looks like I need to make a .plan
file so all of my friends on this system can see what I’m up to these days, huh? :-)
pr
Considering all of the time I’ve put into learning about the text formatting tools available on Linux, I’m kind of amazed I hadn’t yet run into pr
before. It’s for converting text files "for printing".
$ pr foo


2020-09-13 11:09                    foo                    Page 1


Wolves enjoy waterballoons.
Don't eat the wrong fruit.
My skull is made of cheese.
Welcome to the pain castle.
Huh, so it’s like groff/troff/roff with plaintext output. Neat.
It’s actually got a pretty great set of options like columnar output:
$ pr foo -2


2020-09-13 11:09                    foo                    Page 1


Wolves enjoy waterballoons.         My skull is made of cheese.
Don't eat the wrong fruit.          Welcome to the pain castle.
Oooh! Or check out the merge (-m
) feature (eat your heart out, paste
):
$ pr -m foo bar


2020-09-13 13:31                                           Page 1


Wolves enjoy waterballoons.         Bar line one.
Don't eat the wrong fruit.          Bar line two.
My skull is made of cheese.         Bar line three.
Welcome to the pain castle.         Bar line four.
                                    Bar line five!
This is seriously great. Waaaay lighter and easier to use than anything similar (like the aforementioned 'roffs). I have no idea when I’ll need to paginate some text files…but I’ll try to remember it exists.
printenv
Super useful. Prints all of the current environment variables. Usually best with grep
or a pager, since you’re likely to have screenfuls of output:
$ printenv | grep PERL
PERL5LIB=/home/dave/perl5/lib/perl5:/home/dave/perl5/lib/perl5
PERL_MB_OPT=--install_base "/home/dave/perl5"
PERL_MM_OPT=INSTALL_BASE=/home/dave/perl5
PERL_LOCAL_LIB_ROOT=/home/dave/perl5:/home/dave/perl5
printf
When you want precise control over output to the terminal, look no further than printf
. By default, it does not print newlines:
$ printf hello
hello$
But if you’re a developer, the usual escapes work exactly as expected. Here’s a newline:
$ printf "hello\n" hello
A Unicode smiley:
$ printf "\u263a\n" âº
ptx
The man page is useless, just this description and a list of command options:
ptx - produce a permuted index of file contents
The info document is somehow even worse because it’s just additional explanation of the options.
Output is interesting. My foo file contains four short sentences, one per line.
$ ptx foo
                               Don't eat the wrong fruit.
                               My skull is made of cheese.
                               Welcome to the pain castle.
                               Wolves enjoy waterballoons.
        Welcome to the pain    castle.
        My skull is made of    cheese.
                      Don't    eat the wrong fruit.
                     Wolves    enjoy waterballoons.
        Don't eat the wrong    fruit.
                   My skull    is made of cheese.
                My skull is    made of cheese.
           My skull is made    of cheese.
             Welcome to the    pain castle.
                         My    skull is made of cheese.
                       Don'    t eat the wrong fruit.
                  Don't eat    the wrong fruit.
                 Welcome to    the pain castle.
                    Welcome    to the pain castle.
               Wolves enjoy    waterballoons.
              Don't eat the    wrong fruit.
Fascinating!
Here’s a paragraph from The Lord of the Rings:
$ cat baggins.txt
I am old, Gandalf. I don't look it, but I am beginning to feel it in my
heart of hearts. Well-preserved indeed! Why, I feel all thin, sort of
stretched, if you know what I mean: like butter that has been scraped
over too much bread. That can't be right. I need a change, or something.
$ ptx baggins.txt
beginning to feel/ I am old, Gandalf. I don't look it, but I am
, but I am beginning to/ I am old, Gandalf. I don't look it
beginning/ I am old, Gandalf. I don't look it, but I am
/, Gandalf. I don't look it, but I am beginning to feel it in my/
/. Well-preserved indeed! Why, I feel all thin, sort of stretched/
/of stretched, if you know what I mean: like butter that has been/
/bread. That can't be right. I need a change, or something.
/scraped over too much bread. That can't be right. I need a/
/feel it in my heart of hearts. Well-preserved indeed! Why, I feel/
hearts. Well-preserved indeed! Why, I feel all thin, sort of/
/of
... (continues for a total of 59 lines)
What the heck are we looking at here?
Wikipedia has a useful writeup here with an additional clue: https://en.wikipedia.org/wiki/Ptx_(Unix)
ptx is a Unix utility, named after the permuted index algorithm which it uses to produce a search or concordance report in the Keyword in Context (KWIC) format.
This brings us to https://en.wikipedia.org/wiki/Key_Word_in_Context
Ah, now I see it! If you look at both examples above, you’ll see that the second column is in alphabetical order. What made it tricky to recognize this is that upper and lower case letters are treated differently by default.
Using the -f
or --ignore-case
option fixes this and now it’s much more obvious:
$ ptx -f baggins.txt
. That can't be right. I need a change, or something. /bread
/-preserved indeed! Why, I feel all thin, sort of stretched, if/
but I am beginning to feel/ I am old, Gandalf. I don't look it,
Gandalf. I don't look it, but I am beginning to feel it in my/ /,
/over too much bread. That can't be right. I need a change, or/
/I mean: like butter that has been scraped over too much bread./
/. I don't look it, but I am beginning to feel it in my heart/
/has been scraped over too much bread. That can't be right. I need/
/old, Gandalf. I don't look it, but I am beginning to feel it in/
/, if you know what I mean: like butter that has been scraped over/
...and so forth...
The first column is textual context that comes before the alphabetized term at the beginning of the second column. The rest of the second column is context after the term. It’s a little unusual, but pretty nice once you get used to reading it.
My copy of the book Unix Power Tools doesn’t mention ptx
, but it does have "permuted indexes" in the index which leads to…an example of how these permuted indexes were used in the traditional UNIX manuals and an explanation of how and why they’re used.
It turns out, there’s currently a copy of UNIX System User’s Manual, Release 5.0 by Bell Laboratories in the Internet Archive. Here’s a screenshot of a portion of the permuted index:

You can see the rest here.
After having spent a couple days learning about these as I complete this coreutils entry over a series of evenings, I’ve come to really like this style of index. It takes up more room than a normal book index with a list of terms and page numbers. But it’s way more useful. Having context for each term makes it incredibly quick and easy to find the usage you’re looking for (it might even provide the answer you’re looking for without having to go to the actual source material).
That’s neat and all. But how can we use this tool for other interesting stuff?
One of the most useful/interesting options is -o
or --only-file
, which lets you specify a file that contains a list of terms to include in the index. (Note that there is also an ignore-file option.)
You can use the old -
trick to specify STDIN as the filename, so making an index for a single term could be done like this:
$ echo foo | ptx --only-file=- input.txt
And now I’ve got a compelling example of usage. Compare the useless grep
output when I search for the word "I" in our Bilbo Baggins quote:
$ grep I baggins.txt
I am old, Gandalf. I don't look it, but I am beginning to feel it in my
heart of hearts. Well-preserved indeed! Why, I feel all thin, sort of
stretched, if you know what I mean: like butter that has been scraped
over too much bread. That can't be right. I need a change, or something.
Versus the excellent output of ptx for the same query:
$ echo I | ptx --only-file=- baggins.txt
, but I am beginning to/ I am old, Gandalf. I don't look it
beginning/ I am old, Gandalf. I don't look it, but I am
/, Gandalf. I don't look it, but I am beginning to feel it in my/
/. Well-preserved indeed! Why, I feel all thin, sort of stretched/
/of stretched, if you know what I mean: like butter that has been/
/bread. That can't be right. I need a change, or something.
I just wish it were a little more intuitive to do this sort of search. Easily scripted.
pwd
Always handy: "print working directory":
$ cd ~/tmp/foobar
$ pwd
/home/dave/tmp/foobar
$ echo $PWD
/home/dave/tmp/foobar
Also interesting are the -L
(logical) and -P
(physical) options, which change how symlinks are treated:
$ ln -s /usr/bin bonk
$ cd bonk
$ pwd
/home/dave/tmp/foobar/bonk
$ pwd -L
/home/dave/tmp/foobar/bonk
$ pwd -P
/usr/bin
readlink
Read a file link. You can also use -f
to get the canonical path.
$ ln -s foo bar
$ readlink bar
foo
$ readlink -f bar
/home/dave/tmp/foo
Consider using the next command, realpath
instead of -f
.
Check out some historical background
here.
realpath
Basically readlink -f
. Has a lot more options and flexibility, so use this instead and check out the man page.
$ realpath bar
/home/dave/tmp/foo
rm
You know this one from the first Star Wars prequel:
sidious# rm -rf /naboo/gungans/ # wipe them out, all of them
But wow, next time you need to delete a bunch of stuff, check out the man page. Like cp
and friends, there are a lot of options including two levels of interactive prompting.
I wish the -d
option were on by default (remove empty directories). Sure, I could make an alias, but my point is that I wish this were the default behavior. I’m not about to start messing with the default behavior of rm
on some machines. That leads to fear. Fear leads to hate. Hate leads to the dark side. And prequels.
rmdir
Remove a directory. The nice feature I didn’t know about was -p
, which removes empty ancestor directories (exactly the opposite of mkdir -p
)!
$ tree foo foo `-- bar `-- biz `-- baz 3 directories, 0 files $ rmdir -p foo/bar/biz/baz
runcon
Another SELinux thing (the first was chcon
above).
To quote Wikipedia from the SELinux article:
The command runcon allows for the launching of a process into an explicitly specified context (user, role, and domain), but SELinux may deny the transition if it is not approved by the policy.
I have no interest in this until someone demonstrates how SELinux helps you make a more delicious hot dog.
seq
My favorite type of UNIX utility! It simply prints a sequence of numbers. At its simplest:
$ seq 4 1 2 3 4
Nice and terse usage, too. Here’s a sequence separated by commas counting down from ten to zero subtracting twos:
$ seq -s, 10 -2 0 10,8,6,4,2,0
It’ll save you a minute writing a script to do the same thing some day.
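For example, here’s a quick sketch of a loop that uses seq to make some numbered copies of a file (file names invented):

$ for i in $(seq 3); do cp notes.txt notes.txt.$i; done
$ ls notes.txt*
notes.txt notes.txt.1 notes.txt.2 notes.txt.3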
sha1sum
Pretend that you are Git by computing some SHA-1 hashes for files! Here’s a mini-tutorial for the lifecycle of a SHA-1 checksum for a file:
$ sha1sum foo.txt fa7dd7e51436401f0555f0cb6030393a0f18cfd5 foo.txt $ sha1sum foo.txt > foo.sha1 $ sha1sum -c foo.sha1 foo.txt: OK
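And if the file changes after you’ve recorded the checksum, the check should fail with output along these lines:

$ echo "tampered" >> foo.txt
$ sha1sum -c foo.sha1
foo.txt: FAILED
sha1sum: WARNING: 1 computed checksum did NOT match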
There are also the sha224sum, sha256sum, sha384sum, and sha512sum strengths available:
$ sha1sum foo.txt fa7dd7e51436401f0555f0cb6030393a0f18cfd5 foo.txt $ sha224sum foo.txt 0dbfebfe2057dd9b63ebbbeb8d21925323bc4ea293e4b23e1eb4a66b foo.txt $ sha256sum foo.txt 959a0da619f2594a481ee6e78c7c11f3686abdbbbab91f5b7d429ef8a0b46383 foo.txt $ sha384sum foo.txt ce87107ae3baa9f2217266770d37ddc8350609f856fd4441b6a80dd7a1fb0c362bdc427f5505a56e70aed083154fce2f foo.txt $ sha512sum foo.txt 0e83f638730bec5d0531382a4e40ea4fe9b1da05e444833282af16af03020697faf0baaa8db23b05a650b210477b7e50618a903584d140529cb2203198906b92 foo.txt
I’m sure someday there will be a sha9087123448sum
available in coreutils to fill your screen with hex goop. (Not to be confused with "octal dump" - see od
above.)
shred
This is a super-cool spy command. It overwrites a file multiple times with random data, which makes it very hard to recover the file…from traditional, old-school spinning magnetic platters with traditional, old-school file systems where writing data to the same file would likely overwrite the same physical space on the storage media.
Shred is pretty much useless in our modern times, but I shredded a file just for fun:
$ cat > foo.txt Super cool spy stuff. I am a secret agent from Mars. $ file foo.txt foo.txt: ASCII text $ shred foo.txt $ file foo.txt foo.txt: data $ hexdump foo.txt 0000000 5b90 3445 6e50 da24 69f4 5f77 4ee9 3f9e 0000010 6d1b ddfe 47d8 ba69 bd10 72cc a59f ee52 0000020 2184 3f03 3d29 8de9 fb32 3bc2 f758 242e ...
Now no one will know my secret.
shuf
This one is great, and I didn’t even know it existed! How have you people been keeping this a secret from me for so long?
shuf
is short for shuffle (say that three times fast) and it randomly shuffles elements from a file:
$ cat > words.txt Apple Bat Cat Donkey Elephant Fruit Goat Horse $ shuf words.txt Elephant Donkey Bat Goat Fruit Cat Apple Horse
It’s a really nice tool with excellent, useful options. For example, -e
shuffles the input arguments:
$ shuf -e A B C D E F G E F B G A C D
Or you can give it a range of numbers:
$ shuf -i 1-10 10 5 2 1 4 8 3 9 7 6
And you can request a certain number of results for any of the above:
$ shuf -n 1 words.txt Fruit $ shuf -n 1 -e A B C D E F G E $ shuf -n 1 -i 1-10 2
I just added this:
$ alias rolld6='shuf -n 1 -i 1-6' $ rolld6 6 $ rolld6 4
This is the most fun utility yet and it combines great with a lot of the others. I can’t believe this is the first time I’ve encountered it!
Check out this handy one-liner to delete a random file from the current directory:
$ alias randorm='ls | shuf -n 1 | xargs rm' $ touch foo{1..5}.txt $ ls foo1.txt foo2.txt foo3.txt foo4.txt foo5.txt $ randorm $ ls foo1.txt foo2.txt foo4.txt foo5.txt
Bye bye foo3.txt
. I bet you’ll be using that one in your next project. I know I will.
sleep
Pauses the current process (script or terminal) for a specified amount of time, in seconds by default:
$ sleep 5 $ sleep 10m $ sleep 3d
So the question is: why would you want to do this? The most common case seems to be when scripting a loop where you don’t want something to happen too frequently - maybe some sort of network request.
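For example, a polling loop like this sketch (the URL is invented) makes its request only once per minute:

$ while true; do curl -s http://example.com/status; sleep 60; done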
Another fun use I just came up with is a simple command line timer. This plays a bell MP3 sound after one minute:
$ sleep 60s && play ~/Downloads/bell.mp3
sort
Well, this one I use all the time. It’s one of the classic, indispensable UNIX tools.
Let’s get these lines into alphabetical order:
$ cat foo Wolves enjoy waterballoons. Don't eat the wrong fruit. My skull is made of cheese. Welcome to the pain castle. $ sort foo Don't eat the wrong fruit. My skull is made of cheese. Welcome to the pain castle. Wolves enjoy waterballoons.
I’m not going to list all of the options for this command, but there are a ton of them and they’re really useful. You can ignore leading whitespace, case, or non-printable characters, and sort in numeric order.
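You can also sort by a specific field with -k (plus -t to set the field separator). Here’s a sketch that sorts /etc/passwd numerically by the third colon-separated field, the UID:

$ sort -t: -k3 -n /etc/passwd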
You can even sort by month order! Check it out:
$ cat | sort -M June - eat a cake August - burn wood January - melt everything December - eat wax July - drink all the liquids January - melt everything June - eat a cake July - drink all the liquids August - burn wood December - eat wax
By far, the option I use most is -n
for numeric sort:
$ cat | sort -n 198 clowns 16 dogs 985 snakes 84 goats 16 dogs 84 goats 198 clowns 985 snakes
Brings order to shuf
.
Love it. Can’t do without it.
split
Splits a file (or STDIN) into files based on size in bytes, lines, or by generating a specified number of files and letting split
figure out how big each one should be.
It’s a great little utility. I just haven’t had need for it yet.
Here I split a text file into separate files with one file per line:
$ cat foo Wolves enjoy waterballoons. Don't eat the wrong fruit. My skull is made of cheese. Welcome to the pain castle. $ split --lines=1 foo $ ls foo xaa xab xac xad $ head x* ==> xaa <== Wolves enjoy waterballoons. ==> xab <== Don't eat the wrong fruit. ==> xac <== My skull is made of cheese. ==> xad <== Welcome to the pain castle.
The other super useful option is -t
or --separator
to specify something other than newline to separate "records".
Here I separate the Bilbo Baggins quote on the word "I":
$ cat baggins.txt I am old, Gandalf. I don't look it, but I am beginning to feel it in my heart of hearts. Well-preserved indeed! Why, I feel all thin, sort of stretched, if you know what I mean: like butter that has been scraped over too much bread. That can't be right. I need a change, or something. $ split -t 'I' -l 1 baggins.txt $ head x* ==> xaa <== I ==> xab <== am old, Gandalf. I ==> xac <== don't look it, but I ==> xad <== am beginning to feel it in my heart of hearts. Well-preserved indeed! Why, I ==> xae <== feel all thin, sort of stretched, if you know what I ==> xaf <== mean: like butter that has been scraped over too much bread. That can't be right. I ==> xag <== need a change, or something.
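Splitting by size works the same way, and cat glues the pieces back together. A sketch with invented file names:

$ split -b 1M big.bin piece_
$ cat piece_* > big_again.bin
$ md5sum big.bin big_again.bin # the two sums should match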
The man page for split
lists Richard Stallman as one of the authors. I wonder how many of the coreutils share this distinction?
stat
Sigh. Man pages should really be required to have some examples up at the top.
This tool’s man page is not an especially bad example. I’m just getting worn down by the annoying format after using it so much lately (man <thing>
is 100% better than trying to do a Web search for most of these tools)… So, I’m super thankful for man. I just wish most of them were better.
Anyway, point stat
at a file and you can see details about it:
$ stat foo File: foo Size: 111 Blocks: 8 IO Block: 4096 regular file Device: 803h/2051d Inode: 1573151 Links: 1 Access: (0644/-rw-r--r--) Uid: ( 1000/ dave) Gid: ( 100/ users) Access: 2020-09-15 18:09:48.662941892 -0400 Modify: 2020-09-13 11:09:19.077060019 -0400 Change: 2020-09-13 11:09:19.077060019 -0400 Birth: 2020-09-13 11:08:12.213055161 -0400
We can also learn about the filesystem that the file lives on with the -f
option:
$ stat -f foo File: "foo" ID: 53e08d3f116ca070 Namelen: 255 Type: ext2/ext3 Block size: 4096 Fundamental block size: 4096 Blocks: Total: 223557201 Free: 193352658 Available: 181979184 Inodes: Total: 56852480 Free: 55538042
You can even give it a format to output, making it extremely useful for scripting.
Everybody’s favorite thing to do (after tell you that you’re using cat
wrong) is to tell you that you shouldn’t parse the output of ls
. Formatted output from stat
seems like a great alternative:
Here’s the octal permissions and owner of foo
$ stat --printf="perms: %a, owner: %U\n" foo perms: 644, owner: dave
stdbuf
Interesting. This lets you set the modes of the STDIN, STDOUT, and STDERR of a process.
And the man page for stdbuf has an interesting example (which immediately displays unique entries from a file called access.log):
tail -f access.log | stdbuf -oL cut -d ' ' -f1 | uniq
Unfortunately, when I tried a contrived example, it worked just fine without using stdbuf
:
# terminal 1: $ echo "cat" >> lawg.log $ echo "cat" >> lawg.log $ echo "cat" >> lawg.log $ echo "cow" >> lawg.log # terminal 2: $ tail -f lawg.log | uniq cat cow
So that kind of ruins my plans to demonstrate a "before and after" example of stdbuf
in action. (But I love the idea of piping tail -f
(which I use all the time to follow changes to Apache web logs while developing) through uniq
to get only unique messages.)
There’s a really great article with information about UNIX buffering here. But I am running out of interest in this topic quickly.
The important thing is that if you need it, this tool exists and may someday solve a problem you have.
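For what it’s worth, the textbook case is a pipeline where a middle command (like grep) block-buffers its output when writing to a pipe. A sketch of the before and after:

# without line buffering, grep's output may lag way behind the log:
$ tail -f lawg.log | grep cat | uniq
# with stdbuf forcing grep to line-buffer its output:
$ tail -f lawg.log | stdbuf -oL grep cat | uniq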
stty
Terminal settings are a deep subject.
This utility can either print the current settings or change them. Here are my current settings:
$ stty speed 38400 baud; line = 0; -brkint -imaxbel
Change them at your own risk. I put this in the same category as stdbuf
above - it’s there in the unlikely event that you need it.
UPDATE: I ended up using this to set the baud rate of a serial device to talk to a microcontroller. Check it out here. But the usage is beyond arcane; treat it as a last resort. You’re better off doing almost anything else before reaching for this utility.
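For reference, that kind of serial setup looks roughly like this (device path and baud rate are made up; yours will vary):

$ stty -F /dev/ttyUSB0 9600 raw -echo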
sum
This one appears to be strictly historic. I haven’t been able to find anything interesting about this command. Here it is:
$ sum foo 49295 1
The first number is a checksum (using one of two algorithms available) and the second is the number of disk blocks it’s using.
I’d love to know if any person or thing is still using this utility in the year 2020 and why.
sync
There’s a good write-up of this command here.
In short, Linux buffers writes in memory and flushes them to disk when it’s efficient to do so. This utility forces all buffered data - or even just a single file - to be written to disk immediately.
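You can also hand it a file to flush just that file’s data, or use -f to sync the whole filesystem containing it. A sketch with an invented file name:

$ sync important.txt
$ sync -f important.txt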
Why would you need to do this? I guess if you’re about to throw your computer in a lake, you’d want to do it?
$ sync $ # throw computer in lake
Otherwise, I’m a big believer in letting the OS do its thing.
Also, there is zero reason to do this on a device that you are about to unmount. Unmount does a better job of making sure data is synced than you can. Have faith.
tac
Reverse lines! (It’s cat
backwards, get it?)
dave@europa~$ tac Cat is a cool guy. Ant is a friend. Termite eats houses. Termite eats houses. Ant is a friend. Cat is a cool guy.
But it can also reverse using any separator:
$ tac --separator=, a,b,c,d,e e d,c,b,a,$
(You can fix the trailing comma by using the --before
option. Experiment until your output is delicious.)
tail
I use this pretty much daily, much like head
above.
The killer feature (other than the ability to see the end of files, of course), is the -f
for "follow" option. It shows additional lines as they are appended to the file:
$ tail -f /var/log/httpd/error_log [Wed Sep 16 20:46:53.530838 2020] CRM9921: Web crimes detected. [Thu Sep 17 19:04:49.254350 2020] SML0012: Smells too intense. Stop it.
It’s hard to demonstrate this in action on a web page, because it’s a dynamic thing. But as new errors roll into this unfortunate server, we’d see them pop up in realtime. I use this all the time for PHP error debugging and such.
tee
Such a clever name. The tee
is like a "T" junction in a water pipe. It lets you send output to multiple places at once! Ever want to redirect output to a file but also see it? Check it out:
$ uname | tee outpoot Linux $ cat outpoot Linux
I can see it in the outpoot file and I can see it on the screeeeeeen.
Okay, these coreutils have got me a bit slap-happy at this point.
Anybody still reading?
timeout
Oooh, this is cool! I had no idea this existed. It runs a command and then kills it after a specified timeout:
$ timeout 5 ping phobos PING phobos.localdomain (10.0.0.37) 56(84) bytes of data. 64 bytes from phobos.localdomain (10.0.0.37): icmp_seq=1 ttl=64 time=0.161 ms 64 bytes from phobos.localdomain (10.0.0.37): icmp_seq=2 ttl=64 time=0.914 ms 64 bytes from phobos.localdomain (10.0.0.37): icmp_seq=3 ttl=64 time=0.900 ms 64 bytes from phobos.localdomain (10.0.0.37): icmp_seq=4 ttl=64 time=0.919 ms 64 bytes from phobos.localdomain (10.0.0.37): icmp_seq=5 ttl=64 time=0.904 ms $
But this is a super flexible tool. You can specify the signal to send (default is TERM
) after the timeout. The duration can be really long (like 16d
is sixteen days). And a handful of other useful settings. Check out the man page!
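One detail worth knowing for scripts: when timeout actually has to kill the command, it exits with status 124 (or 128 plus the signal number if you’ve picked a harsher signal like KILL). A quick sketch:

$ timeout 2 sleep 10; echo $?
124
$ timeout -s KILL 2 sleep 10; echo $?
137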
touch
Another essential in the UNIX toolbox. Learn it. Know it. Live it.
$ touch foobar $ ls -l foobar -rw-r--r-- 1 dave users 0 Sep 17 20:07 foobar
I reached out my hand and touched that file right into existence. Time and space mean nothing to me. All matter is an extension of my mind.
Also, you can update just the access time, modification time, set times to specific dates (default is right the heck now), or even say "set this file’s access and mod time to be identical to that other file’s time". Whatever. The world is your oyster. Carpe diem.
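A couple of sketches of those (file names invented):

$ touch -d '2001-01-01 00:00' oldfile # set a specific date
$ touch -r oldfile newfile # copy oldfile's times to newfile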
tr
Translates (or removes) characters from the input stream and writes out the result.
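The removal part uses -d. For example, stripping out all the digits:

$ echo "foo123bar" | tr -d '0-9'
foobar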
Gosh, I always just use sed
for this stuff. Or awk
or even whatever general purpose programming language I’m particularly into that week.
Having said that, sometimes these really specific tools are way more efficient than a more powerful/general tool.
I guess if you’re really just replacing all instances of single characters, tr
is shorter than the equivalent sed
command because it’s global by default and doesn’t need regex syntax…
$ echo "foo" | tr 'o' 'z' fzz $ echo "foo" | sed 's/o/z/g' fzz
But for me, learning sed
lets you do so much more. Typing, what?, three more characters on the command line is nothing compared to having to remember how to use a different tool.
I guess the -s
("squeeze repeated characters into one") is also a place where the tr
command is going to be way more terse:
$ echo "fooooo" | tr -s 'o' fo $ echo "fooooo" | sed 's/o\+/o/g' fo
Clearly you’d want to type the tr
command instead. But, again, if I already have the sed
syntax memorized, I can type that faster than I can look up the squeeze option in the tr man page…
I don’t mean to pick on the tr
command. But suffice it to say that I don’t think you need to memorize everything. Just know it exists. Ideally, do memorize a few of the tools you use most often.
true
See what I wrote about false
above. This is like the lawful good to false’s lawful evil.
truncate
Interesting. Hacks the ends off of files, making them the size you specify.
$ cat foo2 Wolves enjoy waterballoons. Don't eat the wrong fruit. My skull is made of cheese. Welcome to the pain castle. $ truncate --size=16 foo2 $ cat foo2 Wolves enjoy wat$
It also grows files to make them longer (where the rest of the file becomes "sparse"):
$ ls -l foo2 -rw-r--r-- 1 dave users 16 Sep 17 20:33 foo2 $ truncate --size=128 foo2 $ ls -l foo2 -rw-r--r-- 1 dave users 128 Sep 17 20:35 foo2
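You can see the sparseness by comparing the apparent size against the actual disk usage. A sketch:

$ truncate -s 1G sparsefile
$ ls -lh sparsefile # reports a 1.0G file
$ du -h sparsefile # reports 0 - no blocks actually allocated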
Very interesting. Hmm. Maybe some day I’ll find a use for that.
tsort
This tool’s man page really needs an example or two.
tsort - perform topological sort
The info page(s) are almost as bad, but if you read the following sentences slowly enough, and run the example, it does actually make sense:
tsort reads its input as pairs of strings, separated by blanks, indicating a partial ordering. The output is a total ordering that corresponds to the given partial ordering.
Alright, let’s give it some pairs:
$ cat | tsort d e b c a b c d e f a b c d e f
Neat. A sorted alphabet. What’s going on here? Okay, so the input consists of pairs of letters. The first pair is:
d e
Which simply means that d
comes before e
(or "points to" or "is the parent of" or whatever directed relationship you’d like). That pair becomes the rule "d comes before e".
Now let’s add the rest of the rules: "b before c", "a before b", "c before d", "e before f".
(Now, you and I know that this is just alphabetical order. But tsort
doesn’t know that. I’m just using this example because it’s easy to type and easy to confirm correctness.)
What’s cool is that tsort
took all of these rules and then constructed the sorted list correctly for us: a b c d e f.
Just for fun, let’s also use Graphviz (the dot
command and others) to draw this set of rules - which, as we say in graphing terminology, is a
directed graph.
$ cat | dot -Tsvg > tsort1.svg digraph G { rankdir=LR; node [shape=circle] d->e b->c a->b c->d e->f }
Which produces the delightful graphic below:
But that first example was pretty weak. Let’s give it something to chew on. I’ve made this very abridged history of program language evolution:
$ cat langs_unordered bcpl b c c++ algol simula fortran algol b c smalltalk object_pascal c++ java algol pascal simula smalltalk algol bcpl smalltalk java
Which looks like this when I use AWK and dot
(from Graphviz) to turn this list of pairs into an SVG of the directed graph:
Now let’s tsort
this to see in what order these languages had to be born to make chronological sense:
$ tsort langs_unordered fortran algol bcpl pascal simula b smalltalk c object_pascal c++ java
Yup! That’s exactly right.
Apparently, the historical purpose for this tool was for input to the old UNIX linker. Info has the whole story, which you can read here. Cool.
I don’t have any immediate uses for this. But it’s certainly interesting.
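One more trick: if your pairs contain a cycle, tsort detects it and prints a diagnostic naming the loop members (while still producing a best-effort ordering), so it doubles as a cheap cycle detector for dependency lists. Something like:

$ echo "a b b c c a" | tsort
tsort: -: input contains a loop:
tsort: a
tsort: b
tsort: c
a
b
c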
I have a really similar dataset for my website - it’s a list of all of the redirects I’ve ever had (as pages change names and locations). To generate my Apache aliases, I have a Ruby script that gets all of the unique old URLs and finds the new URL endpoints (following multiple moves and renames as needed). Sadly, this tool doesn’t do anything like that. But I’m just saying, I see how this sort of thing can be useful.
This sort of thing.
This topological sort of thing.
tty
Nice and simple: prints the device filename of your current terminal:
$ tty /dev/pts/1
If we run a command from a pipe, that command is not connected directly to the terminal. The same goes if it’s launched from cron or another process. In that case, we get this output:
$ ls | tty not a tty
Ah, but beyond that message, we can also use the exit code from tty
to determine if we’re attached to a terminal or not. We don’t want any additional messages in that case, so we can use the -s
(silent) option to suppress the message and just get the exit status:
$ if tty -s; then echo "Hello, hoopy frood!"; else echo "gruntbuggly"; fi Hello, hoopy frood! $ ls | if tty -s; then echo "Hello, hoopy frood!"; else echo "gruntbuggly"; fi gruntbuggly
Obviously, this is something you’re more likely to do in a script than on the command line.
uname
The standard tool for getting some basic information about the system:
$ uname Linux $ uname -a Linux europa.ratfactor.com 5.4.35 #1 SMP Thu Apr 23 13:47:56 CDT 2020 x86_64 AMD Ryzen 7 3700X 8-Core Processor AuthenticAMD GNU/Linux
unexpand
The opposite of expand
! Converts spaces to tabs. It’ll be especially important to take a look at the man page to make sure you’re converting the right number of spaces to tabs because the document will (presumably!) use a specific number of spaces as tab stops.
Hint: you use the -t
option to specify the number of spaces in thy input:
$ ed tab_poem tab_poem: No such file or directory i Start of poem Indented by two I say this to you Enjoy the indent Expanded intent It's always nice to have tabs when they'll do! . wq 158 bonus Unix points awarded for the use of ed. $ unexpand -t 2 tab_poem Start of poem Indented by two I say this to you Enjoy the indent Expanded intent It's always nice to have tabs when they'll do!
My terminal displays tabs as 8 characters, so we know this has worked because the indented lines are now way more indented.
Also, note the bonus points awarded to me for the use of ed, the standard text editor. Start earning yours today!
uniq
Given a sorted list of items, returns only uniq items:
$ cat > animals cow cow chicken chicken chicken pig pig pig pig chicken cow $ uniq animals cow chicken pig chicken cow
Note it only works as you’d expect for a sorted list. I can write a better uniq
in one line of AWK:
$ awk '{lines[$0]=$0} END{for(l in lines) print l}' animals chicken pig cow
To get the same effect, we can run our file through sort
first:
$ sort animals | uniq chicken cow pig
And that is certainly easier to type than the AWK program.
Plus, GNU uniq
has got some other great features such as counting:
$ uniq -c sorted_animals 4 chicken 3 cow 4 pig
Or printing only the duplicated entries (with -d
).
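With the same sorted_animals file, that looks like this (every animal appears more than once, so they all show up):

$ uniq -d sorted_animals
chicken
cow
pig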
Or showing all lines, but grouping them:
$ uniq --group sorted_animals chicken chicken chicken chicken cow cow cow pig pig pig pig
Which I imagine would be most useful when paired with -w
, which lets you specify how many characters to check:
$ uniq -w 1 --group sorted_animals chicken chicken chicken chicken cow cow cow pig pig pig pig
(Note how the chickens and cows are now in the same group because they both start with 'c'.)
It’s not perfect, but give uniq
a good, hard look before you write your own script for this sort of task.
unlink
Unlinking a file means detaching it from a "link" (filename).
I’m most familiar with "unlink" as unlink()
, the system call to delete a file.
Of course, that’s what rm
does, too. So what’s the difference?
In practice, unlink is just a much less useful and less safe rm
.
Use rm
.
users
Who’s logged into this system?
$ users dave
Heck yeah. And loving it.
vdir
Huh, another historical way to display files.
See dir
above.
This one does a "long" output, much like ls -l
:
$ vdir total 8 -rw-r--r-- 1 dave users 285 Sep 15 21:06 baggins.txt -rw-r--r-- 1 dave users 111 Sep 15 20:58 foo
It has many options.
wc
Word count. One of my favorites! I use this all the time.
$ cat foo Wolves enjoy waterballoons. Don't eat the wrong fruit. My skull is made of cheese. Welcome to the pain castle. $ wc foo 4 19 111 foo
The output above is as follows:
4 lines 19 words 111 characters
You can also request one of those three items:
$ wc -w foo 19 foo
It also understands the difference between bytes and characters.
For historical reasons, -c
is bytes, and the newer -m
is actual characters.
Here’s an interesting option I didn’t know about, -L
, for max line width:
$ wc -L foo 27 foo
who
Who is currently logged in?
$ who dave tty1 2020-09-19 12:19
There are quite a few options. You can see most of them with -a for "all", and -H for "headers" makes the output more readable:
$ who -a -H NAME LINE TIME IDLE PID COMMENT EXIT 2020-09-19 12:19 478 id=si term=0 exit=0 system boot 2020-09-19 12:19 run-level 3 2020-09-19 12:19 last=S 2020-09-19 12:19 1114 id=rc term=0 exit=0 dave + tty1 2020-09-19 12:19 02:18 1439 LOGIN tty2 2020-09-19 12:19 1440 id=c2 LOGIN tty3 2020-09-19 12:19 1441 id=c3 LOGIN tty4 2020-09-19 12:19 1442 id=c4 LOGIN tty5 2020-09-19 12:19 1443 id=c5 LOGIN tty6 2020-09-19 12:19 1444 id=c6
whoami
Simple existential answers:
$ whoami dave
yes
Repeats the string of your choice (default "y"), followed by a newline, to STDOUT forever.
$ yes y y y y ... $ yes no no no no no ...
Typically used to "answer" programs which expect you to answer "y" to confirm things interactively.
$ yes | annoying_script.sh
GNU yes
is hilariously good at its job. Check out this classic:
How is GNU yes so fast?.
Conclusion
Holy cow. After…(checks date)…almost exactly a year, I have finally completed this entry. Arguably a bunch of these could have been (and should be) separate articles/blog posts in their own right. Then people would at least have a chance to find the info they’re looking for when a search engine sends them here, right?
TODO: Split some of these into their own pages for quicker reference.
Anyway, after a year (mostly a long hiatus, mind you), I’ve completed this. I seriously doubt I’ll have too many more entries as big as this. The previous record was bash:
$ wc -w coreutils.adoc bash.adoc 13797 coreutils.adoc 8602 bash.adoc
Wow, over 13,000 words. Now I’m getting into NaNoWriMo territory.
I’m extremely excited to get to the next package, which looks to be the straightforward utility, cpio
.
Should be able to knock that one out in way less than a year.
Until next time, happy hacking!