Who's in control?

or
What you see is what you deserve
by Greg Lehey

Feedback

I received a number of mainly positive comments about the October article. One opinion in particular interested me: the next step in ``user-friendliness'' is likely to be voice input.

Now anybody who can type properly knows that typing is faster and more efficient than speaking, and there's much less danger of confusion. But that's not likely to stop big companies from doing it anyway, any more than the disadvantages of mice and GUIs have stopped them.

Assuming, then, that the next kind of input will be voice: what are you going to say to the computer? ``Move the cursor to that funny looking icon over there on the left and click'' or ``save this file and open one called foo.txt''? It's pretty obvious that you're going to use language directly, rather than pointing or clicking. In other words, the GUI is an evolutionary dead end.

While I'm on the subject, I'll take the opportunity to solicit further input on this series of articles. They're supposed to be interesting and encourage you to think about alternatives. I don't ask you to agree with me, but I would like to hear what you think about them.

This month, I'll look at another aspect of conventional office software, one for which UNIX doesn't have an obvious solution.

What you see is all you get

I'm currently having an extension put on my house so that I can finally find place to put all my old hardware and my musical instruments. In the process, of course, we're talking to all sorts of people. We went to the timber people to find out about roofing trusses. Nowadays, of course, they design the trusses with the aid of a computer. This particular program took exception to the fact that we had proposed a building that was almost square, but not quite. It seems it couldn't handle that, so we had to change the shape of the house. It's a disappointing fact of life that inadequate computer software can dictate your choices. Sure, we could have asked somebody to design the trusses for us, but that would have cost a lot of money. It was cheaper to change the shape of the house: we chose a workaround rather than a bug fix.

We also went to the Energy Information Centre in Adelaide to find out about how to heat the place. Again, they used a computer. This time, though, they didn't have a special program, they had a spreadsheet, something called Excel. The lady who was running it seemed to know more or less what to do: she started with a spreadsheet which had been prepared for somebody else and changed what needed to be changed.

I took the printout home with me, and looked at it in more detail. I found a whole lot of discrepancies: sections that didn't make any sense (``Whole house heating and cooling'' showed a total floor size of 0 square metres), comments that didn't apply, comments that should never have appeared on the spreadsheet. It did, however, contain the information I wanted: I would need a heating system capable of producing an output of 14.792 kW. Notice the accuracy to the nearest Watt, enough to drive any engineer crazy.

Where does this information come from? I only have a printout, so I have no idea. I have to trust that this particular part of the spreadsheet was correct, although I have evidence that other parts aren't, and I saw the lady correcting some formulae while she was inputting data.

What's the big deal?

What am I complaining about? After all, it's nice to have software that does things for you and only shows you what you want to see. Compare the following excerpts from the beginning of this article:

I'm currently having an extension put on my house so that I can finally find place to put all my old hardware and my musical instruments.

The source (in HTML, Hypertext Markup Language, the formatting language for web pages) looks something like:

I'm currently having an extension put on my house so that I can finally find place to put all my <a href=http://www.lemis.com/grog/history.html>old hardware</a> and my <a href=http://www.lemis.com/grog/instruments.html>musical instruments</a>.

Clearly, the formatted version is much easier to read: that's why we have HTML in the first place. On the other hand, it hides information. If you have difficulty following the links, you might find the second format more convenient: you can check what's going on. This is a basic problem with software that makes things easy for you by hiding the details. Spreadsheets are a particularly difficult example.
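Because the HTML source is plain text, a few lines of code are enough to see what the formatted version hides. The following is a minimal sketch in Python, using only the standard library's html.parser module; the class name LinkLister and the function list_links are made up for this illustration and are not part of the article's tooling.

```python
from html.parser import HTMLParser


class LinkLister(HTMLParser):
    """Collect the href target of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None when absent.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def list_links(source):
    """Return the link targets found in an HTML fragment, in order."""
    parser = LinkLister()
    parser.feed(source)
    parser.close()
    return parser.links


# The excerpt quoted above, shortened; the unquoted href values are kept as is.
EXCERPT = (
    "all my <a href=http://www.lemis.com/grog/history.html>old hardware</a> "
    "and my <a href=http://www.lemis.com/grog/instruments.html>musical "
    "instruments</a>."
)

if __name__ == "__main__":
    for target in list_links(EXCERPT):
        print(target)
```

Nothing comparable is possible with a printout of the formatted page: the link targets simply aren't there to be read.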

I've used spreadsheets for decades, originally on a Microsoft platform. Even after making the move from Microsoft to BSD, I still used spreadsheets for my expense reports. There are some spreadsheets available for UNIX–for example sc and ss. They look pretty bare-bones compared to Excel, but they do the job. Still, I felt uneasy, and I wasn't sure why.

The solution

Finally, earlier this year, I realized what the problem was, and I found the obvious solution: instead of inputting the data directly, I wrote it in a pseudo-spreadsheet form with Emacs, my trusty holy-war-inspiring do-everything editor.

This is obvious, you ask? I'll discuss that point in more detail later on.

But Emacs doesn't perform spreadsheet calculations, right? Right. To perform the spreadsheet calculation, I had to convert this form into something that ss would understand. That was straightforward enough: a 40 line awk script did just what I wanted.

That's a solution?

By now, you may be asking yourself ``Why did this guy do things the hard way when he could have done it directly with much less effort?''.

That's the theme of this article. Why?

ss offers functionality comparable with early versions of Lotus 1-2-3. Not much, you might think, but then, it's a spreadsheet, not a word processor, a database, a graphics package, a display manager or an operating system. But that's not important: I want a spreadsheet program, a program which calculates formulæ and displays the results on the screen. This is what ss does: it implements one function well, the UNIX way.

Unfortunately, it doesn't stop there. It tries to be an editor as well. In contrast to the job it does well, its editing capabilities are, to put it mildly, rather less than those of Emacs. This restriction applies to every spreadsheet I have ever used. The last time I tried to use Excel, it drove me mad because I didn't know how to use it, and I didn't think it reasonable that it should expect me to learn another editor when I already had one that worked better.

So I did all this just to be able to use Emacs? Well, that was one of the considerations, but it wasn't the only one. In fact, there were three: the editing commands, the format, and the issue of control.

Let's look at each of these in more detail.

The editing commands

On the face of it, it looks like a wimp's way out to change everything just so you can use your favourite editor. In fact, it's anything but. Commercial ``integrated software'' usually offers you an editor which will do the job–barely. It does this at considerable cost:

I've used Netscape in this example, but these points are typical of most commercial software packages.

The format

Have you ever taken a look at the files that monolithic office packages produce? Here's one representation of the start of a Microsoft Word document:

-D"I^Q`a,i+-^Z'a^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@>^@^C^@pb^~?^@^F^@^@^@^@^@ ^@^@^@^@^@^@^C^@^@^@^@^@^@^@^@^@^@^@^@^P^@^@^C^@^@^@^A^@^@^@pb^~?^~?^~?^@^ @^@^@^A^@^@^@^~?^@^@^@^A^A^@^@^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^ ~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~ ?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~? ^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^ ~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~ ?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?

It's difficult to tell what this is, although if you persevere you'll find the text Microsoft Word 6.0 a bit further on in the gibberish. It's obvious that this format was not designed for easy interpretation by mere humans. Now Microsoft is infamous for its undocumented document formats, which appear to change–always only for the best technical reasons–between each release of Word, thus forcing people to upgrade en masse. Still, other packages use similar formats. For a while my wife used StarOffice for word processing, while I used troff. We couldn't exchange data, which basically killed our use of StarOffice.

The issue of control

These points may look bad enough already, but they're just a minor inconvenience compared to the third: the question ``have I done this correctly?'' Spreadsheet cells contain more than what you see, including extremely important information which is not visible on the screen, for example formulæ. There are other examples.

Let's look at my spreadsheet again. On the screen it looks something like this:

Image of the sample spreadsheet

It's pretty primitive stuff: type in the data in the columns on the left, calculate the fees in column H, create sums of each row in column J and of each column in row 16. Still, the opportunity exists for error. What if I made a special price for one particular customer, and forgot to take account of it in the spreadsheet? There's nothing in this display, short of comparing the results, to tell me that I've charged the wrong rate. Sure, I can go and check the cell formulae, but that's a lot of work, and to do it right I have to check every one. What if I added row 12 later and forgot to update the formulae in row 16 to include it? I'd end up sending in a bill for less than the full amount. Again, I can add up the columns manually and check, but then what do I need a spreadsheet for?

This is the real reason why I changed the way I did things: accountability. My awk script recognizes different kinds of lines: one kind for headers, one for the ``meat'' of the spreadsheet, and a different one for the telephone bills. Here's the input:

C Sunroot telco Benchmark in Austin
# Hours Hotel Lunch Dinner Incidentals Transport
8 June 1998 8 8
9 June 1998 10 15
10 June 1998 9.5 50 34.62
11 June 1998 11 7.04 25.85
12 June 1998 10.5 6.5
13 June 1998 2 20 55 5 12.16
T Telephone 19.68

The awk script looks at the beginning of each line:

In this way, awk makes a spreadsheet which ss can understand. It creates the formulae for the calculations itself, so I can check in the script if it has charged the correct rate, and I know that the ranges for the additions are correct because they're programmed that way. Yes, I know that there's still ample opportunity for bugs, but that's a different order of magnitude: with this method, I have control.
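The idea can be sketched in a few lines. What follows is not the awk script described above: it is a minimal illustration in Python, under several assumptions. The input is taken to be tab-separated; the line kinds (C for a customer header, # for column headings, T for a telephone charge, anything else a dated row) are guessed from the sample input; the column layout is simplified to date, hours, expenses, fee and total; the hourly rate is invented; and the @sum(...) notation is borrowed from sc-style spreadsheets, not taken from the real ss file format.

```python
def convert(lines, rate):
    """Turn line-oriented input into a dict mapping cell names to contents.

    All formulae are generated here, never typed in by hand.
    """
    cells = {}
    data_rows = []
    row = 0
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        row += 1
        kind = fields[0]
        if kind in ("C", "#"):
            # Header lines: copy the text into successive columns A, B, C...
            for offset, text in enumerate(fields[1:]):
                cells[f"{chr(ord('A') + offset)}{row}"] = text
        elif kind == "T":
            # Telephone line: a label and an amount, with no hours and no fee.
            cells[f"A{row}"] = fields[1]
            cells[f"C{row}"] = fields[2]
            cells[f"E{row}"] = f"C{row}"
            data_rows.append(row)
        else:
            # Dated line: date, hours worked, expenses (may be missing).
            date, hours, expenses = (fields + ["", ""])[:3]
            cells[f"A{row}"] = date
            cells[f"B{row}"] = hours
            cells[f"C{row}"] = expenses or "0"
            cells[f"D{row}"] = f"B{row}*{rate}"   # fee, generated
            cells[f"E{row}"] = f"C{row}+D{row}"   # row total, generated
            data_rows.append(row)
    if data_rows:
        # The range is computed from the rows actually emitted, so a row
        # added later cannot be left out of the grand total.
        cells[f"E{row + 1}"] = f"@sum(E{data_rows[0]}:E{data_rows[-1]})"
    return cells


# Hypothetical sample input, loosely modelled on the article's data.
SAMPLE = [
    "C\tSunroot telco",
    "#\tDate\tHours\tExpenses\tFee\tTotal",
    "8 June 1998\t8\t8",
    "9 June 1998\t10\t15",
    "T\tTelephone\t19.68",
]

if __name__ == "__main__":
    for cell, content in convert(SAMPLE, rate=100).items():
        print(cell, "=", content)
```

The rate appears in exactly one place, and the range of the final sum is derived from the data rather than maintained by hand. Those are the two properties the approach relies on.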

Goodbye stress

If you've been around computers for any length of time, you'll know all about stress. One of the main causes of stress is lack of control over what you're doing. This method trades stress for just a little extra work. Once the work is done, you have the confidence that the calculations are correct, since you can't accidentally overwrite them. If you want to be sure, you can go and check the awk source, or you can change things if you find there are additional requirements.

Everything has its limits

If you've been programming for any length of time, you may now be saying, ``OK, he found a good example, one where this approach works. Most times it doesn't''.

Well, yes, there's a limit to what you can do yourself. This example was simple enough for me to be able to show it in its entirety, and that's why I chose it. Other problems are significantly more complicated. Whenever you choose a strategy for approaching this kind of problem, you need to decide early on whether it will be worth the effort. Another thing I tried, for example, was a set of macros to get HTML output from troff. I wrote the original version of this article in troff and converted it–almost. I found I had to touch up the HTML output to make it usable. That wasn't the correct approach: the correct approach would have been to go back and fix the macros, but it started to become apparent that that was more trouble than it was worth. Since we're planning to go to SGML (Standard Generalized Markup Language) as a documentation base and use that language to create troff or HTML output, it seems to be a better idea to work on those conversions rather than a troff to HTML converter.

But this means programming!

On the other hand, you may be saying, ``Alright, but he knows how to program. I don't.'' Sure, you need a little extra understanding to be able to do things this way. Is it difficult? No.

One of the big problems in bringing computers to the ``masses'' was making them easier to use. The initial attempts in this direction were pretty good: many computer systems were difficult to use simply because nobody had ever tried to make them easy to use. But then a different attitude took over: ``Let's make it possible for every idiot to use a computer without knowing anything about how it works''. This approach doesn't even work for cars. Why should it work for computers?

No programming, please

In the process of making computers suitable for idiots, the vendors forgot that idiots only make up a very small percentage of their customers. On the other hand, this approach made it more difficult for the rest of the customers to get beyond the idiot user stage. One consequence of this attitude is that people say ``programming is difficult, I'll never learn it''. This is a self-perpetuating attitude, and it has some unexpected results.

Recently, my daughter's school has come out with a computer usage policy. You know, ``use your common sense'', ``don't download pornography'', and so on. It's not really clear why they need this kind of policy, since it turns out they're not allowed to use or possess pornography while studying other subjects either. But that's a different story.

One of the things that upset me about the policy was:

Students are not permitted to have in their folders or on floppy disks:

If you're going to learn to program, you'll need at least utilities such as an editor, a C compiler and a debugger. You'll produce executable output files, and probably want to save them on floppy. You might be asked to use an integrated package such as the ones I have been decrying above, but if you're serious you'll use real tools, such as the GNU C compiler. Never mind that its author, Richard Stallman, definitely considers it a hacking utility: the school is confused in its terminology when it speaks of hacking utilities.

The real problem is that computer education at my daughter's school, the Eastern Fleurieu campus in Strathalbyn, isn't computer education at all. It's a series of classes telling you how to use various Microsoft products–rather like booking a course in automobile mechanics and finding yourself being taught how to drive a particular model of car. The policy didn't intend to make it punishable to learn to program–it happened simply because nobody had ever thought about the possibility.

So what about the directory nesting limit and the prohibition on invisible files? I don't know, and they haven't been able to tell me yet. If you have a good guess, I'd be interested to hear it.

Programming, please

If you've read this much of the article, you're probably not surprised that I didn't accept this policy. In today's computer-based society, the ability to program has the same kind of importance that being able to read and write had a thousand years ago. Unwittingly, Microsoft makes this similarity more obvious by using pictures rather than (written) words to operate its software. You don't have to be directly involved in the computer industry to find a need for programming.

This doesn't mean that you should try to become a world-class programmer. Not many people make it that far. Not many people make it to being world-class racing drivers, either, but that doesn't stop them from driving cars. In a similar manner, learning the basics of programming isn't difficult, and it saves you a lot of time waiting in line at bus stops.