Following on from Monday’s post, I thought it might be interesting to go into a bit more detail about why I won’t ever use Microsoft Word (or, indeed, any word processor) unless I have to. I have three reasons – one is personal experience, one is technical and the third is largely philosophical.
Firstly, the personal experience. I consider (or, more accurately, considered) myself a Word power-user. Many years ago I used to write Word macros to automate document creation. I know what I’m doing with Word. So when I wrote my first book, six or so years ago, Word was my first choice as the tool to use. I was using Word 97 and some formatting templates that had been supplied by my publishers. I remember that I had read some stories about problems that people had experienced writing a whole book in one Word document so I followed what seemed to be considered best practice and created a separate document for each chapter. Now this is all some time ago so my memory is a bit hazy, but I distinctly remember often having to spend time reapplying formatting that got scrambled when I opened a chapter. Now that could have been the version of Word, a problem with the templates, or some other technical glitch. All I know is that it left a bad taste in my mouth and over the months that I spent writing the book I went from being a Word fan to hating it with a passion.
My second reason is more technical. Any word processor will store your document in a proprietary binary file format. If you were to open a Word document in a text editor like Notepad then you wouldn’t be able to make much sense of what you see. That’s because all of the formatting information is stored in a manner that only a word processing program can understand. On the other hand, Unix (and therefore Linux) has a long tradition of dealing with plain text files[1]. The Unix tool set has a large number of interoperable tools which can be used to manipulate text files in various ways. For example, it’s simple to use “find” and “grep” to recursively search a directory and all of its subdirectories for the files that contain a particular phrase. Another good example is getting a word count for a set of documents. With Word you would need to open each file individually, get the word count and add the numbers manually to get a total. With Unix tools, it’s a simple process to get the word count for each individual file and the total across all the files. It’s probably just what I’m used to, but I find it far easier to deal with plain text files.
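To make that concrete, here is a minimal sketch in Python rather than the Unix tools themselves. It assumes the chapters are UTF-8 text files ending in .txt somewhere under one directory. The function names (count_words, word_counts, files_containing) are invented for this illustration. The first two give a per-file and total word count, roughly what “wc -w” reports, and the third lists the files containing a phrase, roughly what a recursive “grep -l” does.

```python
from __future__ import annotations

from pathlib import Path


def count_words(path: Path) -> int:
    """Count whitespace-separated words in one text file, much as `wc -w` does."""
    return len(path.read_text(encoding="utf-8", errors="replace").split())


def word_counts(root: Path, pattern: str = "*.txt") -> dict[Path, int]:
    """Map every matching file under root (searched recursively) to its word count."""
    return {p: count_words(p) for p in sorted(root.rglob(pattern)) if p.is_file()}


def files_containing(root: Path, phrase: str, pattern: str = "*.txt") -> list[Path]:
    """List the files under root whose text contains phrase."""
    return [
        p
        for p in sorted(root.rglob(pattern))
        if p.is_file() and phrase in p.read_text(encoding="utf-8", errors="replace")
    ]
```

Calling word_counts(Path("book")) on a hypothetical “book” directory returns a dictionary of per-file counts, and summing its values gives the total for the whole manuscript. None of this is possible without extra tooling when the chapters are binary word processor files.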
My final reason is, as I said above, more philosophical. I don’t think that WYSIWYG tools are a good way to produce documents. Think about it. How often do you spend almost as much time fiddling with the formatting of a Word document as you do actually writing? A WYSIWYG program encourages you to see the presentation of your document as intrinsically linked to the content. We used to see web publishing the same way – the presentation of a web page (lots of <font> tags and too many nested tables) was completely intertwined with the actual content, making it hard to change one without changing the other. Now, of course, we laugh at the old days as we all produce semantically meaningful markup which will be formatted using an external stylesheet. And it should be the same with documents. Write what you have to write, only pausing to add extra information to define the various parts of the document (this is the title, this is a subsection header, this is a bullet list, and so on). Once you’ve created the document that way, you can start to think about how it should look and apply styles appropriately. I realise that Word can be used that way (the default document styles allow you to define the various parts of your document) but I don’t think that a WYSIWYG program encourages you to think about your writing that way – the presentation always gets in the way.
I’m not saying anything new or radical here. People have been producing documents this way for years (ask your neighbourhood Unix geek about LaTeX). It’s just a shame that the most popular end-user tools for document creation don’t encourage this mode of working.
So that’s why I prefer to work in a plain text format (or something that is, at least, stored as text like POD or DocBook) and why I’ll never use a word processor unless it’s something that a client insists on for some reason.
[1] And yes, I realise that text is (strictly speaking) another binary format. The point is that it is a simple and well-understood format. Of course Unicode encodings complicate that somewhat.
Personally, I find it much easier to get plaintext + markup to look the way I intend than to figure out WYSIWYG + magic keystrokes. Expressing complex things like nested lists with headers inside blockquotes is completely straightforward with markup but an exercise in hair-pulling with WYSIWYG apps. Additionally, using lightweight markup like Markdown makes source documents just as readable as the formatted end result.
Something you didn’t mention explicitly (although your point about using standard tools alludes to it) is that being plaintext makes a document easy to track using standard version control systems. In comparison, the built-in versioning in MSWord and friends is limited and unreliable (although admittedly, the integration makes some nice visualisation features possible).
Finally, as far as I’m concerned, MSWord can’t hold a candle to vim’s raw editing power. Other people will want to substitute Emacs here if they belong to that church – but the point is that proficiency with a powerful editor gives you productivity far beyond anything you could achieve with MSWord.
Personally I prefer UltraEdit on Windows to vim or Emacs, but each to their own.
Anyway, Dave, what is it exactly that you do with your text files using Unix that is so useful but that Word won’t do? I’m sure there are lots of things, but I just can’t think of them.
Ian,
I’ve already mentioned getting the word count across multiple files and Aristotle has mentioned source code control. Of course, it’s possible to store binary word processing documents in source code control as well, but showing the differences between versions becomes much harder for non-text formats.
And then there’s stuff like searching for files using a combination of ‘find’ and ‘grep’ or making changes to multiple files with one command line Perl program.
But, to me, the major advantage is knowing that anyone on any computer in the world will be able to open my document with just the tools that come as standard with their operating system.
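As a rough illustration of the point about showing differences between versions, here is a minimal Python sketch using the standard difflib module. It is not one of the tools mentioned above, and the function name text_diff and the default file name are made up for the example. It compares two versions of a plain text document line by line, which is essentially what a version control system does when it shows you a diff.

```python
import difflib


def text_diff(old: str, new: str, name: str = "chapter.txt") -> str:
    """Return a unified diff between two versions of a plain text document."""
    lines = difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile=f"{name} (old)",
        tofile=f"{name} (new)",
    )
    # An empty string means the two versions are identical.
    return "".join(lines)
```

Changed lines come back prefixed with “-” and “+”, so you can see exactly which sentences were edited. With a binary format the same comparison would show little more than the fact that the file had changed.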
I write a lot of shorter reports and my tools of choice are TextPad (a great Windows text editor) and ReST (reStructuredText, part of the Docutils package). ReST is easy to read, yet can be quickly converted to HTML. I also like to use KeyNote, a hierarchical notebook. I set KeyNote to format all nodes (pages in the notebook) to be monospaced (Courier), 9pt. I cut and paste between TextPad and KeyNote as needed. TextPad has very nice regex-based search and replace. Many times I don’t even convert to HTML, etc., since ReST is so easy to read.
http://www.textpad.com/
http://docutils.sourceforge.net
http://docutils.sourceforge.net/rst.html
http://www.tranglos.com/free/keynote.html
-- Clint (clint at robotic dot com)