
Friday, October 14, 2005

Laziness and Open Document Format

Categories: technology, work
Current music: Groove Salad

Just before I took off for lunch today, the contractor who picked up the projects I was working on before the reorg motioned me over and asked me, “How did you do it? You put together the whole shell of this project, and I’m just hanging stuff on it. Especially the command-line stuff... how did you get so much of it done with nothing to work with?” Pulling miracles out of my, er, back pocket has been a lot of what I’ve done at the office for the last few years. I’ve had deadlines, limited access (at best) to equipment, a little help from my boss when he’s not swamped with other stuff, and very little in the way of specifications, and somehow I managed to maintain documentation for three entire product lines.

My secret is: I’m lazy.

Look, I sit in front of a computer all day. If I can get the computer to do something for me, especially if it’s something that needs to be done more than once, I’ll do it. For example, our original (now “legacy”) product line came with about 4000 pages of documentation scattered over about 20 different manuals. We provided a master index, a 110-page book of its own, as a way to let customers zero in on which manual(s) covered a particular topic. The first time I did the master index, it took two solid weeks of nothing else. This is one of those prime examples for automation: I had to build a book in FrameMaker of all the other books, tag each chapter in each manual, run the index, convert the tags to document references (for example, change “EG” to Engineering Guidelines), remove all but the first reference to any chapter, blah blah blah. To make a long story shorter, I wrote a handful of AppleScripts that eliminated literally 80% of the grunt work: instead of two weeks, I could build the master index in two days. Yesterday, I wrote a script that created index entries from headings (which is OK for a first pass at indexing commands) that saved me a day’s worth of work.
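
The original scripts were AppleScripts driving FrameMaker, and they aren't reproduced here. As a rough illustration of two of the cleanup steps described above (expanding manual tags and keeping only the first reference to each chapter), here is a minimal Python sketch. The tuple layout, the function name, and the "IG" entry are assumptions made up for the example; only "EG" comes from the post.

```python
# Hypothetical tag-to-title table; only "EG" appears in the post.
MANUAL_TITLES = {
    "EG": "Engineering Guidelines",
    "IG": "Installation Guide",  # made-up entry for illustration
}


def clean_references(refs):
    """Expand manual tags and keep only the first reference to each chapter.

    refs is a list of (manual_tag, chapter_number) tuples in index order.
    """
    seen = set()
    cleaned = []
    for tag, chapter in refs:
        if (tag, chapter) in seen:
            continue  # a repeat reference to a chapter already listed
        seen.add((tag, chapter))
        # Unknown tags fall through unchanged rather than raising an error.
        title = MANUAL_TITLES.get(tag, tag)
        cleaned.append(f"{title}, chapter {chapter}")
    return cleaned
```

Calling `clean_references([("EG", 3), ("EG", 3), ("IG", 1)])` would list chapter 3 of the Engineering Guidelines once, followed by the (hypothetical) Installation Guide entry.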

I told you all that to tell you this.

A couple of weeks ago, I wrote about Massachusetts adopting Open Document Format (ODF) for state government documents. Between then and now, OpenOffice 2.0 went into beta test; yesterday, OASIS (Organization for the Advancement of Structured Information Standards) submitted ODF to the International Organization for Standardization (ISO) for consideration as an international standard for office document interchange.

The neat thing about all this is that ODF is easy to pick apart and fiddle with. Internally, content, graphics, and style information are separated and the whole thing is rolled up in a ZIP file. Content and style files use an XML format, which is important for two reasons: XML is plain text, and there are lots of utilities to work with it. So what does that mean? There are several Free programs that support ODF already (OpenOffice and AbiWord run on most computers, while KOffice runs on Linux systems). But the really fun part is, given a document format both open and relatively easy to parse, you don’t need an office application to do things with ODF files.
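
To show how little it takes, here is a minimal Python sketch that pulls the paragraph and heading text out of an ODF text document using only the standard library. It assumes the package keeps its body in an entry named `content.xml` and uses the OASIS text namespace shown below; the function name is made up for the example, and real documents have more structure (lists, tables, footnotes) than this handles.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace for text elements in ODF content files.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def odf_paragraphs(path):
    """Return the text of each heading and paragraph, in document order."""
    # An ODF file is a ZIP archive; content.xml holds the document body.
    with zipfile.ZipFile(path) as odf:
        root = ET.fromstring(odf.read("content.xml"))

    wanted = {f"{{{TEXT_NS}}}p", f"{{{TEXT_NS}}}h"}
    paragraphs = []
    for elem in root.iter():
        if elem.tag in wanted:
            # itertext() also collects text nested inside spans, links, etc.
            paragraphs.append("".join(elem.itertext()))
    return paragraphs
```

No office application is involved: `odf_paragraphs("manual.odt")` (a hypothetical file name) just unzips and parses.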

In the computing world, when a group like OASIS sets out to nail down a standard, they form a Technical Committee (TC) of interested parties. In the case of the ODF TC, some of the interested parties include companies that make content management systems (or CMS... the alphabet soup is sloshing around quite a bit tonight!) — suffice it to say that a CMS allows you to store, retrieve, and process documents to make something new (kind of like putting basil leaves in a food processor and making pesto). Given the job of a CMS, it usually doesn’t just store a document as-is. In the case of an ODF file, the CMS would probably unzip it and extract just the content and metadata (data about the data) components. The graphics are already stored in the CMS. Let’s say I send a document to the CMS and come back for it a couple of months later. During that time, some of the artwork has been changed. The CMS grabs the original content and metadata, rolls in the updated graphics, and hands me an ODF file. Oh cool, I didn’t have to update the graphics myself!
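
The "roll in the updated graphics" step can be sketched in a few lines of Python. This is not how any particular CMS does it; it is a guess at the general shape, assuming the newer artwork is handed over as a dictionary keyed by entry name inside the package (for example `Pictures/logo.png`, a made-up name). The function name is hypothetical too.

```python
import zipfile


def refresh_graphics(src_path, dst_path, updated):
    """Copy an ODF package, swapping in newer bytes for entries named in `updated`.

    updated maps entry names inside the package to replacement bytes.
    Returns the list of entry names that were replaced.
    """
    replaced = []
    with zipfile.ZipFile(src_path) as src, zipfile.ZipFile(dst_path, "w") as dst:
        # Walking infolist() keeps the original entry order, and passing each
        # ZipInfo to writestr() keeps that entry's compression setting.
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename in updated:
                data = updated[info.filename]
                replaced.append(info.filename)
            dst.writestr(info, data)
    return replaced
```

Content and metadata pass through untouched; only the entries named in `updated` change, which is the whole point of keeping the pieces separate inside the file.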

Another handy utility might be nightly publishing runs. Sometimes, I’m working on a manual that’s getting change requests and bug reports coming in fast & furious. Some of the manuals I deal with have a lot of bitmap graphics, and can take nearly an hour to generate a PDF. Remember, I’m lazy... I don’t want to sit at work an hour overtime just to watch the computer make a PDF. In my theoretical ODF-based system, I simply send in the stuff I worked on during the day, and the CMS builds a new document and emails it (with a summary of what changed) to all the reviewers. The reviewers get fresh hot documentation every morning; I get to go home, sit on the porch, and write haiku before it gets dark.
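
The "summary of what changed" part of that nightly run could be as simple as comparing yesterday's paragraphs with today's. Here is a minimal Python sketch using the standard `difflib` module; it assumes the paragraphs have already been extracted into plain lists of strings, and the function name is invented for the example.

```python
import difflib


def change_summary(old_paragraphs, new_paragraphs):
    """List removed ("- ") and added ("+ ") paragraphs between two versions."""
    summary = []
    matcher = difflib.SequenceMatcher(
        a=old_paragraphs, b=new_paragraphs, autojunk=False
    )
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        # "replace" means old text went away and new text took its place,
        # so it contributes both removals and additions.
        if op in ("delete", "replace"):
            summary.extend(f"- {p}" for p in old_paragraphs[i1:i2])
        if op in ("insert", "replace"):
            summary.extend(f"+ {p}" for p in new_paragraphs[j1:j2])
    return summary
```

A reworded paragraph shows up as one removal and one addition, which is crude but enough for a reviewer to see where to look.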

With the manual finished, I have to send it to the translators. Currently, this involves gathering all the various files together and archiving them (and sending missing pieces or assuring them that the extraneous files aren’t important). In my dream system, I tell the CMS to give me an ODF document of the book. Boom, all the pieces get wrapped together, nothing gets dropped, nothing extra gets added, and I send one file to the translators.

I’m willing to put some effort into making this a reality. After all, I want the computer to do the work for me.


  1. Did you hear that Microsoft Office 12 will have native support for PDF? I read somewhere that it's partly due to the Massachusetts decision to insist on open standards. Hang on, where's that link..aahh, here it is:

  2. Well, as you are lazy FARfetched, so am I.

  3. Hey FARfetched, I changed my URL to http://austinpost.blogspot.com/

  4. Anyway, I changed my URL in order to take the bias out of it.

  5. Mr. Burrard... yes, I’d heard about the PDF generation in Office 12. Given the state of the development, though, I think it was going on before the fun in Massachusetts got started.

    Austin... OK, I’ll get the link fixed here. I liked the old name, but hey it’s your blog. ;-)



