Dave's Paper Notes: Programming as Theory Building

Page created: 2023-05-09
Updated: 2025-07-10


(There’s also a handy text version provided by Diogo Felix: Programming as Theory Building (github.com).)

This paper was suggested by my Internet friend, Eli.

What it’s NOT about

First of all, let’s get this out of the way: This paper is not making a case that learning to program is theory building.

I’ve seen people make this assumption on certain popular discussion websites and then attempt to argue against it.

What it IS about

Instead, Naur is making the case that the most important part of writing and understanding a program is building a "theory" about that program.

Again, the "theory" is for a program, not "programming" in general.

My favorite part of this paper is where Naur makes a scorching argument against programming being an act of "text production". If you read nothing else on this page, check out those quoted chunks below!

Theory?

The whole first part of the paper is establishing what Naur means by "theory". The term comes from a book by Gilbert Ryle called The Concept of Mind (wikipedia.org). You’re welcome to dive into that, but I don’t recommend it.

Instead, I think Naur explains it best when he describes what you can do when you have developed (or acquired) the theory:

"…​not any particular knowledge of facts, but the ability to do certain things, such as to make and appreciate jokes, to talk grammatically, or to fish."

(I wish he’d expanded on what "to fish" means, but I’m guessing this is in the sense of the English proverb, (wiktionary.org) "Give a man a fish and you feed him for a day; teach a man to fish and you feed him for a lifetime.")

And later:

"…​theory is understood as the knowledge a person must have in order not only to do certain things intelligently but also to explain them, to answer queries about them, to argue about them, and so forth."

My wife also suggested an excellent analogy: Replace a program with a reasonably complicated strategy board game. The rules for the game are contained in the game manual. But the "theory" of the game is what exists in the experienced player’s head.

Update: A month or so later and I’m still using the term "theory" to refer to the concepts in the paper. It describes an aspect of software understanding for which I didn’t previously have a word and now I do. I wish I could think of a different word for this concept, though, because it sounds weird and requires explanation whenever I mention it: "The Theory of the project has changed…​you know, like as in Naur’s Programming as Theory Building."

The theory is vital for working effectively on the program

A program consists of source code and (if you’re lucky) comments and documentation.

But as anyone who has ever worked on a large software project can attest, these written things are seldom enough on their own to work effectively on the codebase!

Naur points out two case studies exemplifying this phenomenon:

  • A compiler written by one group and modified by another. The second group was unable to use the existing codebase effectively because they did not really grasp the "theory" of how it was built.

  • An industrial control monitoring system developed and installed by "old timers" who knew the software inside and out, but updated by on-site developers who didn’t have the same "theory" and were unable to gain a proper understanding of its inner workings.

Unobtainable through documentation?

I think the most contentious point Naur makes in this paper is the idea that the "theory" of a program cannot be written down. The very definition of "theory" here is the part that is constructed in the programmer’s mind.

Again, I think a board game analogy works really well to explain why this is so. A board game rulebook can attempt to explain the various strategies of playing, but I don’t think any amount of reading will replace actually playing the game with experienced players!

Programming is not text production!

I absolutely loved how Naur used the notion of theory building to dismantle the myth that modifying a program is a mechanical act of modifying source code:

"The expectation that program modifications at low cost ought to be possible is one that calls for closer analysis. First it should be noted that such an expectation cannot be supported by analogy with modifications of other complicated man-made constructions. Where modifications are occasionally put into action, for example in the case of buildings, they are well known to be expensive and in fact complete demolition of the existing building followed by new construction is often found to be preferable economically. Second, the expectation of the possibility of low cost program modifications conceivably finds support in the fact that a program is a text held in a medium allowing for easy editing. For this support to be valid it must clearly be assumed that the dominating cost is one of text manipulation. This would agree with a notion of programming as text production. On the Theory Building View this whole argument is false. This view gives no support to an expectation that program modifications at low cost are generally possible."

(Emphasis mine.)


"This observation leads to the important conclusion that the problems of program modification arise from acting on the assumption that programming consists of program text production, instead of recognizing programming as an activity of theory building. On the basis of the Theory Building View the decay of a program text as a result of modifications made by programmers without a proper grasp of the underlying theory becomes understandable."

Hell yeah.

He further explains why trying to make the program "flexible" enough to accommodate future changes is a mistake:

"It is often stated that programs should be designed to include a lot of flexibility, so as to be readily adaptable to changing circumstances. Such advice may be reasonable as far as flexibility that can be easily achieved is concerned. However, flexibility can in general only be achieved at a substantial cost. Each item of it has to be designed, including what circumstances it has to cover and by what kind of parameters it should be controlled. Then it has to be implemented, tested, and described. This cost is incurred in achieving a program feature whose usefulness depends entirely on future events. It must be obvious that built-in program flexibility is no answer to the general demand for adapting programs to the changing circumstances of the world."

(Again, the emphasis is mine.)

Smart programmers know YAGNI ("You Aren't Gonna Need It") and know to postpone implementing heavy abstractions, so they avoid writing code that never gets used or painting themselves into a corner.

Likewise, smart programmers should also reconsider trying to plan too far into the future.

(I won’t say no such thing has ever existed, but I have never seen a program with a correct and clearly defined map of future modifications. I’m tempted to argue that there can’t be such a thing, except by random chance.)

Do not try to predict the future. Build what you need now based on what you know about the problem at hand. Understand that you may have to re-write some things later. It is inevitable.
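To make the cost of built-in flexibility a little more concrete, here's a minimal sketch in Python. Everything in it (the function names, the report format, the parameters) is made up for illustration and does not come from Naur's paper. The first function guesses at future needs with a handful of knobs, each of which has to be designed, implemented, tested, and described. The second just solves the problem at hand.

```python
# Hypothetical example: every name here is invented for illustration.

# Speculative version: each optional parameter is a bet on future events,
# and each one still has to be designed, implemented, tested, and described.
def format_report_flexible(rows, separator=",", quote_char=None,
                           header=True, line_ending="\n"):
    def cell(value):
        text = str(value)
        return f"{quote_char}{text}{quote_char}" if quote_char else text

    lines = []
    if header and rows:
        lines.append(separator.join(cell(key) for key in rows[0]))
    for row in rows:
        lines.append(separator.join(cell(v) for v in row.values()))
    return line_ending.join(lines)


# Version built for what is known today: comma-separated rows with a header.
def format_report(rows):
    if not rows:
        return ""
    lines = [",".join(rows[0])]  # joining a dict yields its keys
    for row in rows:
        lines.append(",".join(str(v) for v in row.values()))
    return "\n".join(lines)


rows = [{"name": "compiler", "loc": 1200}, {"name": "monitor", "loc": 800}]
print(format_report(rows))
```

For today's requirements both functions produce the same text. The difference is that the flexible one carries four extra parameters whose usefulness, as Naur puts it, "depends entirely on future events." If a real need for another separator shows up later, the simple version can be rewritten then, with the benefit of actually knowing what is needed.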

The entire section titled "Problems and Costs of Program Modifications" from which these quotes were taken was a battle cry for me. I wish everyone in the software business would learn it, know it, live it.

New programmers and new teams

If the only way to properly work with a program’s codebase is to have the "theory" of that codebase, then does that mean getting new developers going on a programming project is a non-trivial task?

You better believe it! And if you don’t take this stuff seriously, you can expect hacky solutions and technical cruft.

Naur even mentions that if you try to put a whole new team on an existing programming project (in, say, an attempt to revive it or as a purchase from another organization), you’re probably better off in the long run, both in terms of cost and in terms of quality, just starting over.

Simple, elegant, sustainable solutions can only come from having a full grasp of the "theory" of the program. There are no shortcuts.

Commentary on this paper by others

I discovered these in a low-effort search after writing the above:

Have you written (or seen) an opinion about this paper and put it up on the Web somewhere? Let me know!