Mark Damon Hughes
Don Knuth is Wrong, Alas
Sat, 2008Apr26 12:41:53 PDT
in Software by kamikaze

InformIT interview with Donald E. Knuth

I'm astounded, disappointed, and frankly repelled by much of what he says. This is almost tragic: I pretty much learned my serious computer science skills from Don Knuth's The Art of Computer Programming. They're extraordinarily difficult books to read and work through, but they're very rewarding in teaching algorithm design and optimization. The beta versions of the new editions have been interesting, and the MMIX virtual machine was much more relevant to modern hardware. So imagine my surprise here:


the idea of immediate compilation and "unit tests" appeals to me only rarely, when I’m feeling my way in a totally unknown environment and need feedback about what works and what doesn’t. Otherwise, lots of time is wasted on activities that I simply never need to perform or even think about. Nothing needs to be "mocked up."

I'm not entirely a test-first, test-driven developer. Graphics and interaction make up most of my code, and those aren't productive to unit test (QA testing, later, yes). But by and large, and especially when building any algorithm or back-end logic, the tests are the proof that I've actually done what I thought, that there are no typos, and that it runs in something less than geological time.

While some smaller segments of algorithms can be proven mathematically, most code cannot, and the larger interactions absolutely cannot. So you're left with testing the edge cases and the most common cases. Unit testing is the only way to isolate those tests from the rest of your app: QA testing just shows that the app looks correct, while unit tests show that each part is still working correctly.
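To make the edge-case/common-case distinction concrete, here is a minimal sketch in Python's standard unittest framework. The function and its tests are entirely hypothetical, invented for illustration; the point is the shape of the test suite: boundaries, typical inputs, and deliberate misuse, each isolated from the rest of the app.

```python
import unittest

def clamp(value, low, high):
    """Clamp value into [low, high] -- a hypothetical back-end helper."""
    if low > high:
        raise ValueError("low must not exceed high")
    return max(low, min(high, value))

class TestClamp(unittest.TestCase):
    # Edge cases: the boundaries themselves.
    def test_at_bounds(self):
        self.assertEqual(clamp(0, 0, 10), 0)
        self.assertEqual(clamp(10, 0, 10), 10)

    # Common cases: values inside and outside the range.
    def test_common(self):
        self.assertEqual(clamp(5, 0, 10), 5)
        self.assertEqual(clamp(-3, 0, 10), 0)
        self.assertEqual(clamp(99, 0, 10), 10)

    # Misuse: an inverted range should fail loudly, not silently.
    def test_inverted_range(self):
        with self.assertRaises(ValueError):
            clamp(5, 10, 0)
```

Run with `python -m unittest` and the suite proves, in seconds, that the code does what you thought.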


Still, I hate to duck your questions even though I also hate to offend other people’s sensibilities—given that software methodology has always been akin to religion. With the caveat that there’s no reason anybody should care about the opinions of a computer scientist/mathematician like me regarding software development, let me just say that almost everything I’ve ever heard associated with the term "extreme programming" sounds like exactly the wrong way to go...with one exception. The exception is the idea of working in teams and reading each other’s code. That idea is crucial, and it might even mask out all the terrible aspects of extreme programming that alarm me.

I also must confess to a strong bias against the fashion for reusable code. To me, "re-editable code" is much, much better than an untouchable black box or toolkit. I could go on and on about this. If you’re totally convinced that reusable code is wonderful, I probably won’t be able to sway you anyway, but you’ll never convince me that reusable code isn’t mostly a menace.

Yow. Extreme Programming's main practices are just good software engineering practices (GSEP hereafter) pushed to their logical conclusion. Unit testing is GSEP, so wherever practical, tests are written first and the code implemented to pass them. Code review is GSEP, so all code is written in pairs and constantly reviewed. Regular check-in to source control is GSEP, so code is written in short sessions and committed; if you can't check in, you throw the work out and try again with a smaller task. Regularly integrating all code in the repository is GSEP, so you set up a build machine that immediately fetches all checked-in code, builds it, and runs the unit tests, raising a prominent alert if the repository has broken code, and nobody checks in anything more until the build is fixed.

These are all clear, demonstrable improvements in software engineering. The name "extreme programming" might be a bit silly, but it's actually the most serious set of practices I know of. For a single developer, some of them can slide without harm, and some teams might consider them more overhead than they're comfortable with, but they're as real as gravity, the heliocentric model of the solar system, and evolution. These are facts. It is nonsensical and religious to dispute them. I'm really appalled.

The notion that reusable code is a menace, that libraries of well-tested, carefully designed tools do not lift you up and make you more powerful, is so alien and dysfunctional that I hardly know how to respond to it.

And now for the real horrorshow:


I might as well flame a bit about my personal unhappiness with the current trend toward multicore architecture. To me, it looks more or less like the hardware designers have run out of ideas, and that they’re trying to pass the blame for the future demise of Moore’s Law to the software writers by giving us machines that work faster only on a few key benchmarks! I won’t be surprised at all if the whole multithreading idea turns out to be a flop, worse than the "Itanium" approach that was supposed to be so terrific—until it turned out that the wished-for compilers were basically impossible to write.

Let me put it this way: During the past 50 years, I’ve written well over a thousand programs, many of which have substantial size. I can’t think of even five of those programs that would have been enhanced noticeably by parallelism or multithreading. Surely, for example, multiple processors are no help to TeX.[1]

How many programmers do you know who are enthusiastic about these promised machines of the future? I hear almost nothing but grief from software people, although the hardware folks in our department assure me that I’m wrong.

Actually, most good programmers are pretty enthusiastic about multithreading. If you perform a long operation in response to a user event, a single-threaded application blocks (and on the Mac, gives you the spinning beach ball of death). A multithreaded application can respond to the event, fire off a background task to handle it, and return to the user; that is a gigantic leap forward in usability.
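The pattern is simple enough to sketch in a few lines of Python. Everything here is hypothetical scaffolding (the event handler, the fake workload): the shape to notice is that the handler returns immediately while a worker thread does the slow part and posts its result to a queue.

```python
import threading
import queue
import time

# Results flow back to the main (UI) thread through a thread-safe queue.
results = queue.Queue()

def long_operation(n):
    time.sleep(0.1)        # stand-in for a genuinely slow computation
    results.put(n * n)

def on_user_event(n):
    # Hand the work to a background thread and return to the user at once;
    # the event loop never blocks, so no beach ball.
    threading.Thread(target=long_operation, args=(n,), daemon=True).start()

on_user_event(12)
# ...the UI keeps handling events here while the worker runs...
print(results.get())       # collect the answer when it is ready: 144
```

Real GUI frameworks wrap this same idea in higher-level APIs (NSOperation, Java executors), but the division of labor is identical.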

The OS/2 user interface guidelines required that you react to an event within 0.1 seconds; not surprisingly, OS/2 had fantastically good threading support for its time. Threading is hard with most older languages and APIs, but modern languages and frameworks make it approachable: Java has had quite good threading and tools for years, from Doug Lea's Concurrent Programming in Java through today's java.util.concurrent libraries. Objective-C/Cocoa has NSOperation. Functional languages such as Haskell, Scala, and Dylan are better suited to multithreading, and they are growing in popularity. One of the few major problems with Python is that the "Global Interpreter Lock" prevents true multithreading, which cripples its long-term performance. Python was never meant to be a fast language, but every generation of chips will leave it further and further behind until the GIL is removed.
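The GIL problem, and the standard workaround, can be demonstrated in a few lines. This is a sketch with a made-up workload: for pure-Python CPU-bound code, threads all contend for the one interpreter lock, so the usual escape hatch is to use separate processes, each with its own interpreter and its own GIL.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def cpu_bound(n):
    # Pure-Python arithmetic: threads serialize on the GIL here,
    # while separate processes can genuinely run in parallel.
    total = 0
    for i in range(n):
        total += i * i
    return total

if __name__ == "__main__":
    jobs = [200_000] * 4
    # Threads: correct answers, but no CPU parallelism under the GIL.
    with ThreadPoolExecutor() as pool:
        threaded = list(pool.map(cpu_bound, jobs))
    # Processes: real parallelism, at the cost of pickling arguments
    # and results between interpreters.
    with ProcessPoolExecutor() as pool:
        forked = list(pool.map(cpu_bound, jobs))
    assert threaded == forked
```

On a multicore machine the process pool finishes the batch roughly in proportion to the core count, while the thread pool runs no faster than a plain loop; actual timings depend on the hardware, so none are claimed here.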


I know that important applications for parallelism exist—rendering graphics, breaking codes, scanning images, simulating physical and biological processes, etc. But all these applications require dedicated code and special-purpose techniques, which will need to be changed substantially every few years.

Even if I knew enough about such methods to write about them in TAOCP, my time would be largely wasted, because soon there would be little reason for anybody to read those parts. (Similarly, when I prepare the third edition of Volume 3 I plan to rip out much of the material about how to sort on magnetic tapes. That stuff was once one of the hottest topics in the whole software field, but now it largely wastes paper when the book is printed.)

The machine I use today has dual processors. I get to use them both only when I’m running two independent jobs at the same time; that’s nice, but it happens only a few minutes every week. If I had four processors, or eight, or more, I still wouldn’t be any better off, considering the kind of work I do—even though I’m using my computer almost every day during most of the day. So why should I be so happy about the future that hardware vendors promise? They think a magic bullet will come along to make multicores speed up my kind of work; I think it’s a pipe dream. (No—that’s the wrong metaphor! "Pipelines" actually work for me, but threads don’t. Maybe the word I want is "bubble.")

This is entirely backwards. The kind of work Knuth is doing is approaching irrelevance, because it depends on having a single super-fast monolithic computing core, like an old-fashioned mainframe. But we don't build those anymore. We have networks and clouds of computing systems: we push work out to a bunch of them, collect results as they finish, and make the whole thing fault-tolerant. SETI@home is impossible on a monolithic computer, but a cloud of cheap, simple computers is chewing away on it all the time.

This isn't the future of computing; it's the present. Single-processor systems are archaic and cannot scale much further. We can get nearly unbounded scaling through parallelism, following the model of the best computer around: the human brain. There is no central CPU in the brain; it's just a bunch of tiny, almost useless processing nodes chatting with their neighbors along weighted connections.
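The push-out/collect pattern reduces to a toy on one machine. This sketch uses Python's multiprocessing to scatter chunks of work across local worker processes and gather the partial results; a real SETI@home-style system adds networking, retries, and result verification, none of which is shown here, and every name below is invented for the example.

```python
from multiprocessing import Pool

def analyze_chunk(chunk):
    # Stand-in for one work unit shipped out to a node.
    return sum(chunk)

def scatter_gather(data, workers=4, chunk_size=1000):
    # Scatter: split the job into independent chunks...
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    # ...farm them out to a pool of workers...
    with Pool(workers) as pool:
        partials = pool.map(analyze_chunk, chunks)
    # ...gather: combine the partial results.
    return sum(partials)

if __name__ == "__main__":
    total = scatter_gather(list(range(10_000)))
    assert total == sum(range(10_000))
```

Because the chunks are independent, adding workers (or whole machines) scales the job with essentially no change to the program's structure, which is exactly why this model, not the single fast core, is winning.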


I currently use Ubuntu Linux, on a standalone laptop—it has no Internet connection. I occasionally carry flash memory drives between this machine and the Macs that I use for network surfing and graphics; but I trust my family jewels only to Linux.


From the opposite point of view, I do grant that web browsing probably will get better with multicores. I’ve been talking about my technical work, however, not recreation.

This is perhaps the weirdest and most alien part. The Web is not only "recreation"; it certainly can be, like any other medium, but it was designed for publishing scientific papers, and its primary uses are news and business; and, sure, communications and porn and games, so it really covers all of human life. Like most technical people, I now spend most of my day on the Web, or using Web-related services like Twitter.

I apologize in advance for the following unpleasant comparison with Professor Knuth (who, while obviously out of touch now, has produced good work in the past), but I must note that Filthy Communist Richard Stallman does not have Internet access or surf the Web. Is this just generational? I can't think of a lot of older computer scientists online; maybe our culture scares them and they're unable to filter the entertainment parts from the business parts? Vint Cerf is still adapting and surviving in the real world. Maybe it's just 50 years of insular University life that makes you fear change and reality.

Copyright © 2003-2010 by Mark Damon Hughes