Dave's Paper Notes: A Tutorial for the Sam Command Language

Page created: 2025-10-01

Sam is an interesting text editor by Rob Pike. This paper is a brief tutorial for the command language used by the editor.

I got a lot out of reading this 14-page document, mostly regarding language design.

Language Idioms

The first insight from the paper is the idea of language idioms. Of course, all languages have idioms, and we often speak of idiomatic solutions to problems, as in, "The idiomatic way to do this in Ruby is…​" But Pike explicitly calls out a two-character sequence and says:

The address +- is an idiom.

(Specifically, +- is short for implicit +1-1, "select the next line, then select the previous line", which has the effect of selecting the entire current line.)

Why does this matter? Well, if I were designing a language and I needed to have the ability to select the current line, I would probably be tempted to make a new command L or something that did just that. But this is how a language or tool can start becoming unwieldy, with an enormous number of separate commands to learn. (We all know an example or two.)

Ideally, you make your language elements work together in useful ways, and then have as few as you can get away with.

Zero ceremony looping and conditionals

The x command is, in my opinion, the key to what makes the sam command language so brilliant. It’s short for "extract", but I came to think of it as a "for each" loop.

This example will replace every "hello" with "bye":

,x/hello/ c/bye/

The , address is short for 0,$, which initially selects the entire document (another sam idiom!). x/hello/ is, "for each selected match of 'hello'…​" And c/bye/ means, "replace the selection with 'bye'."

Of course, that’s just a find-and-replace. Here’s an equivalent substitute command in Vim:

:%s/hello/bye/g

But whereas ex/vi/Vim’s s is specialized and only does substitution, x can run any command, such as d (delete).

This deletes all shell-style comments:

,x/^#.*\n/ d

To do something for every line in a file in sam:

,x/.*\n/ <command>
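To make the "for each" reading concrete, here is a minimal Python sketch of how I picture it. This is not how sam is implemented. The function names x, c, and d are just my stand-ins for the sam commands, and I'm modeling a "command" as a function from the selected text to its replacement text.

```python
import re

def x(pattern, command):
    """Sketch of sam's x: run `command` on every match of `pattern` in the selection."""
    def run(selection):
        # re.MULTILINE lets ^ match at the start of each line.
        return re.sub(pattern, lambda m: command(m.group(0)), selection,
                      flags=re.MULTILINE)
    return run

def c(replacement):
    """Sketch of sam's c: replace the selection with fixed text."""
    return lambda selection: replacement

def d(selection):
    """Sketch of sam's d: delete the selection."""
    return ""

text = "hello there\n# say hello\nhello again\n"

# Roughly: ,x/hello/ c/bye/
print(x(r"hello", c("bye"))(text))

# Roughly: ,x/^#.*\n/ d
print(x(r"^#.*\n", d)(text))
```

The point of the shape is that x doesn't care what the command is. It only hands each match to whatever comes next.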

Ah, you say, but this is just like ex/vi/Vim’s g ("global") command, which can also run arbitrary commands for every pattern match…​

You are correct. But we’re only scratching the surface. How about conditionals?

Here are conditionals with sam’s g, for "guard":

,x/.*\n/ g/foo/ d

(Sorry, I know it’s confusing to mention ex/vi/Vim’s g and then show sam’s g. They’re not the same.)

I think of sam’s g as an "if" statement.

So the above reads to me as "for each line, if it contains the string 'foo', delete that line."

Note that the d for delete still applies to the entire line. That’s because g/foo/ means "…​if it contains 'foo'". Like an "if" statement, g doesn’t select a match, it just determines if we should continue.
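Here is how I'd sketch the guard in Python, again as a rough mental model rather than anything from sam's source. The names x, g, and d are my own stand-ins, and a "command" is assumed to be a function from the selected text to its replacement.

```python
import re

def x(pattern, command):
    """Sketch of sam's x: run `command` on every match of `pattern`."""
    def run(selection):
        return re.sub(pattern, lambda m: command(m.group(0)), selection,
                      flags=re.MULTILINE)
    return run

def g(pattern, command):
    """Sketch of sam's g: an 'if'. The selection passes through unchanged;
    the pattern only decides whether `command` runs on it."""
    def run(selection):
        if re.search(pattern, selection):
            return command(selection)  # the whole selection, not just the match
        return selection
    return run

def d(selection):
    """Sketch of sam's d: delete the selection."""
    return ""

# Roughly: ,x/.*\n/ g/foo/ d
delete_foo_lines = x(r".*\n", g(r"foo", d))
print(delete_foo_lines("keep me\nfoo here\nalso keep\n"))
```

Notice that g hands the entire line to d, which is why the whole line disappears and not just the three letters "foo".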

Composition

Here’s where I really got excited because this paper has finally allowed me to articulate something that has bothered me for a long time.

The problem with sed

I like sed because it is so terse. I also appreciate readability of source: see RubyLit ("A literate programming system in 35 lines of Ruby," or "This README is a program!"). But at the command line, terse is good! I remember when PowerShell came out and, while it has plenty of good ideas, I saw the long command names with Capital-Letters and thought, "Are you out of your minds?!" The last thing I want to do when I’m trying to accomplish stuff at the command line is type out paragraph-long sequences of text! (And yes, I’m aware of PowerShell shorthand.)

But sed frustrates me because it teases programmability by being technically "Turing-complete" but was never intended to express complex programming logic, and it shows.

The problem with Awk

I also like Awk because the implicit line (or other) parsing loop and statement structure of the rules, pattern {action}, is absolutely brilliant and very useful. Also, the book by A, W, and K is very, very good!

But Awk frustrates me even more than sed because it is a programming language (and highly innovative for the time), but in Awk, there’s no way to express rules within rules (e.g. pattern { action, pattern {action}}), so you can’t use the wonderful expressive power from the top level to match within an outer match.

In other words, Awk’s implicit loop doesn’t nest.

(After matching stuff in the outer loop, the rest of your Awk program is highly imperative C-like programming with for(i=0; i<3; i++) and that sort of thing. What a shame.)

The brilliance of sam

The thing that sam has is the thing that Awk is missing: Composition.

The x command composes. Here’s Pike’s example:

A simple example is to change all occurrences of Emacs
to emacs; certainly the command

    ,x/Emacs/ c/emacs/

will work, but we can use an x command to save retyping
most of the word Emacs:

    ,x/Emacs/ x/E/ c/e/

Look carefully at that second example and maybe you’ll find it as exciting as I do.

The way I read that statement is: "over the entire document, match each 'Emacs', and then inside those, each 'E', and then change them to 'e'."
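If it helps, the nesting can be sketched in Python as functions that take a command and return a command. This is only my mental model, not sam's implementation; x and c are hypothetical stand-ins for the sam commands.

```python
import re

def x(pattern, command):
    """Sketch of sam's x: run `command` on every match of `pattern`.
    The result is itself a command, so an x can be nested inside an x."""
    def run(selection):
        return re.sub(pattern, lambda m: command(m.group(0)), selection,
                      flags=re.MULTILINE)
    return run

def c(replacement):
    """Sketch of sam's c: replace the selection with fixed text."""
    return lambda selection: replacement

# Roughly: ,x/Emacs/ x/E/ c/e/
lowercase_emacs = x(r"Emacs", x(r"E", c("e")))

print(lowercase_emacs("Every Emacs user knows Emacs\n"))
```

The inner x only ever sees the text the outer x selected, so the "E" in "Every" is never touched.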

Since x is properly composable with itself and other commands, you can use this one easily-learned mechanism to succinctly and accurately drill down into any arbitrarily complex selection without any of the mental gymnastics required when you try to do all of this with a single regular expression.

Of course, the example above would be a trivial find-and-replace task, but I trust you can easily imagine non-trivial cases.

The whole second half of the paper is mostly made up of statements that use x.

I guess what I find so appealing about this is how naturally these "loops" compose with absolutely no need for explicit variable names or code delimiters.

Grouping

All of the examples above are arguably concatenative programming - each statement operates tacitly on the results of the previous (like pipes on the command line). But you can also run statements in a group, which substantially changes their relationship:

{
    command1
    command2
}

Pike describes grouped statements as "applied in parallel".

What this means in practice is that each statement sees the state of the world as it existed when the group began.

For example, if you did two replacements without grouping:

,x/foo|bar/ g/foo/c/bar/ g/bar/c/foo/

you would end up changing all "foo"s and "bar"s to "foo", because after the first change, there would be no more "foo"s and after the second change, there would be no more "bar"s. The intent was probably to swap them, but how do we accomplish that? It’s like trying to swap two variables without a third temporary space.

Well, with sam’s grouping, you can do this instead:

,x/foo|bar/ {
    g/foo/c/bar/
    g/bar/c/foo/
}

The first statement still changes all the foos to bars…​but the second statement sees the original text, so it can change bars to foos as intended!
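The difference between the two forms can be sketched in Python. This is a simplified model under my own assumptions: sequential and group are names I made up, a "command" is a function from selected text to replacement text, and the group detects a change simply by comparing a command's result with its input.

```python
import re

def x(pattern, command):
    """Sketch of sam's x: run `command` on every match of `pattern`."""
    return lambda sel: re.sub(pattern, lambda m: command(m.group(0)), sel)

def g(pattern, command):
    """Sketch of sam's g: run `command` only if the selection contains `pattern`."""
    return lambda sel: command(sel) if re.search(pattern, sel) else sel

def c(replacement):
    """Sketch of sam's c: replace the selection with fixed text."""
    return lambda sel: replacement

def sequential(*commands):
    """Each command sees the output of the one before it."""
    def run(sel):
        for command in commands:
            sel = command(sel)
        return sel
    return run

def group(*commands):
    """Each command sees the ORIGINAL selection (my reading of 'in parallel')."""
    def run(sel):
        changed = [r for r in (command(sel) for command in commands) if r != sel]
        if len(changed) > 1:
            raise ValueError("two commands changed the same text")
        return changed[0] if changed else sel
    return run

text = "foo bar foo\n"
swap = (g(r"foo", c("bar")), g(r"bar", c("foo")))

print(x(r"foo|bar", sequential(*swap))(text))  # everything ends up "foo"
print(x(r"foo|bar", group(*swap))(text))       # foo and bar are swapped
```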

To me, this is pretty wild and I’m not sure I completely have my head around how this might be accomplished under the hood. It’s easy enough to imagine the grouped statements working with separate copies of the incoming strings, but how are the results merged back into the final result?

The only clue Pike mentions in this document is:

"This means, as mentioned, that commands within a compound command see the state of the file before any of the changes apply. […​] An indirect ramification is that changes must occur in forward order through the file, and must not overlap."

I think I can picture that, but I’m fuzzy on the details.
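Here is one way I can picture it, purely as a guess and not taken from sam's source: instead of editing the text directly, every command records an edit as (start, end, replacement) offsets into the original text, and the edits are applied in a single forward pass at the end. The function names apply_edits and swap_edits are hypothetical.

```python
import re

def apply_edits(original, edits):
    """Apply (start, end, replacement) edits whose offsets all refer to
    `original`. Edits must be in forward order and must not overlap."""
    out = []
    pos = 0  # how far into `original` we have copied so far
    for start, end, replacement in edits:
        if start < pos or end < start:
            raise ValueError("edits must be in forward order and must not overlap")
        out.append(original[pos:start])  # untouched text before this edit
        out.append(replacement)
        pos = end
    out.append(original[pos:])  # untouched tail
    return "".join(out)

def swap_edits(text):
    """Record the foo/bar swap as edits against the original text."""
    swap = {"foo": "bar", "bar": "foo"}
    return [(m.start(), m.end(), swap[m.group(0)])
            for m in re.finditer(r"foo|bar", text)]

text = "foo bar foo\n"
print(apply_edits(text, swap_edits(text)))
```

Under that model, the "forward order" and "must not overlap" rules fall out naturally: since every offset refers to the original text, two edits that touch the same characters would have no sensible merge.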

Conclusion

As I said in the beginning, I found a ton of value in this short "tutorial" for a command language for a text editor I will likely never use.

The sam language is conceptually small and simple, which demonstrates the power of having a few well-chosen elements that play well together.

It was worth the price of admission for the ability to describe what I consider to be Awk’s greatest missed opportunity: the lack of composability of its main feature, pattern {action} rules.

The sam language is similar enough to ex or sed that it feels pretty familiar at first. But sam shows that you can have much greater flexibility and expressive power without really any added complexity - just the fairly intuitive concept of commands working together.

Readability is in the eye of the beholder and some of Pike’s examples are easier to read than others, but because it breaks the problem up into small pieces, I find the x command examples way easier to read than a typical complex ex/vi/Vim statement.

Here’s one I wrote in my .vimrc a couple weeks ago:

vnoremap <leader>td =gv:g/^$/d<cr>gv:s/^\s*/* [ ] /g<cr>:noh<cr>

It takes a selected list of statements and turns them into a VimWiki to-do list. It would be an interesting exercise to convert that to sam!

Big thanks to Will Clardy (quexxon.net) for pointing me to this document!