When I noted that I had been busy with conference planning, one angle I left out was my crash education in DocBook XML, a markup language used for technical documentation.
I’ve spent close to a year circling around the question of documentation for an open source software project. Documentation is one of those maturational issues for open source software (and before we get too far, I will add that there’s no shortage of lame documentation in the proprietary software world — but that’s not the problem I’m trying to solve).
I know what doesn’t work, such as assuming documentation will naturally bubble up from the gift economy (the kind of woo-woo philosophizing up there with assuming an unregulated market will police itself). That approach yields at best a smattering of notes in a hodgepodge of formats. You also can’t just point contractors toward the project and say “write this.” I mean, you can, but it won’t work.
In the end, you need focus and direction — or as I put it in a talk a couple weeks back, some people, a plan, and a pickaxe.
The kewl thing about Evergreen is that the project is now approaching the critical mass required to support almost anything the community wants to do, including establishing a documentation project. (I don’t kid myself that a community documentation project could necessarily handle all documentation needs for an open source community, but without a project, we’ll never know what those needs are to begin with — and a community can bite off some chunks of the problem.)
Evergreen’s now got the people, and they are ready and willing to plan. But to give this project direction, it also needed the pickaxe, which is where DocBook XML comes in.
When you look at all the options for formatting documentation, and then look at the basic documentation needs of any project, you work your way to DocBook XML by process of elimination. Assuming your project needs a single-source, standards-based, non-binary documentation format that supports translation, reuse, and other requirements, with an active user community, and strong fee-or-free toolsets, you end up with DocBook XML or DITA. The ramp-up for DocBook XML is much less daunting than DITA (though not without plenty of daunt on its own), in part due to a couple of excellent books (and though they are freely available online, it’s much easier to buy the print books and have them parked near your keyboard for ready reference).
DocBook XML is a lot like democracy (to paraphrase some pundit): it doesn’t look so great until you compare the alternatives. Nobody thinks writing XML is a walk in the park, and after you’ve produced lengthy XML documents, you still have to transform them into HTML (or PDF), and even at that you need to style the pages so they’re all purty, because plain HTML looks so 1993. But again, after close to a year of banging my head on the wall, I get it. DocBook. All righty.
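For readers who have not seen the write-then-transform workflow, here is a minimal sketch of the idea in Python, using only the standard library. It is a toy stand-in, not the real toolchain: actual DocBook projects typically run the DocBook XSL stylesheets through an XSLT processor. The function name `docbook_to_html`, the sample article text, and the `docs.css` stylesheet name are all hypothetical, made up for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# A tiny DocBook 5 article (sample content invented for this sketch).
DOCBOOK_SOURCE = """\
<article xmlns="http://docbook.org/ns/docbook" version="5.0">
  <title>Checking Out an Item</title>
  <para>Scan the patron's card, then scan the item barcode.</para>
  <para>Confirm the due date before handing over the item.</para>
</article>
"""

# DocBook 5 elements live in this XML namespace.
DB = "{http://docbook.org/ns/docbook}"


def docbook_to_html(source: str) -> str:
    """Turn the title and paragraphs of a DocBook article into bare HTML.

    Hypothetical helper: it handles only <title> and <para>, which is
    far less than the real DocBook XSL stylesheets do.
    """
    root = ET.fromstring(source)
    title = root.findtext(f"{DB}title", default="Untitled")
    parts = [
        "<html>",
        f"<head><title>{escape(title, quote=False)}</title>",
        # docs.css is an assumed stylesheet name; this is where the
        # styling step would hook in.
        '<link rel="stylesheet" href="docs.css"></head>',
        "<body>",
        f"<h1>{escape(title, quote=False)}</h1>",
    ]
    for para in root.iter(f"{DB}para"):
        # itertext() flattens any inline markup inside the paragraph.
        text = "".join(para.itertext()).strip()
        parts.append(f"<p>{escape(text, quote=False)}</p>")
    parts += ["</body>", "</html>"]
    return "\n".join(parts)


if __name__ == "__main__":
    print(docbook_to_html(DOCBOOK_SOURCE))
```

The point of the sketch is the separation of concerns: the XML source says what each piece of text is, the transform decides what HTML it becomes, and the stylesheet decides how it looks.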
But it’s one thing to suggest using DocBook XML, and building an entire project around it, and another to actually demonstrate it in action. So about six weeks ago I realized that if I was going to make a convincing, project-energizing argument for DocBook XML (an argument first made two years ago by others in the community and repeated several times since, with no objection but also no action), I was going to have to get serious about learning DocBook XML, if not to the level of expertise, at least to a minimal competence.
(It helped that I had been reviewing an intern’s beginning DocBook projects for a couple of months; as is often the case with teaching, I quietly absorbed more than I realized during the process of evaluating the student’s work.)
So in addition to working on the conference planning stuff, I got up at the butt-crack of dawn for weeks on end to review, validate, revise, tweak, experiment with, and otherwise produce real DocBook XML examples. After experiencing the pain of working at a DOS prompt with some free tools, I moved to a nice editor, oXygen, and that helped somewhat — but there was still much to learn (and I repeated all my examples with the free tools just to be sure they could be produced that way as well).
And then, of course, there’s the beer connection
When I started writing this blog post I saw a clear link between this and homebrewing. Circling back to that idea, I still see the similarities.
In both cases I have been learning a fairly arcane skill through books, websites, discussion groups, and iterative practice. There’s a geek level to both I enjoy; I’m not ever going to be a truly yee-haw XML/XSL cowgirl any more than I am going to open my own brewery, but I admit that the first time I got a reasonably long document to not only transform but to get styled with CSS, I did feel a wee spark of pride — similar to the first beer batch I made where I actually, and successfully, “mashed” (that is, converted malted barley into wort, the liquid that when boiled with hops and activated with yeast, eventually becomes beer).
Plus in both cases, by mastering some fundamental skills (and a domain vocabulary), I can now communicate within their respective communities. I understand terms such as single-source, transform, validate, XSL, stylesheet, FO, FOP; sparge, pitch, vorlauf, lauter, rack, mash, tun. (And to my delight, there is an XML schema for beer called, of course, BeerXML, proving that all roads lead to London.)
The ability to communicate is key; getting past that initial hurdle is crucial for learning. (Remember Helen Keller, spelling out “water”?) I may not understand every question that flies past me, but my feet have some purchase in the loam of their fields.
I don’t know. Maybe I’m just in it for the language. But these processes happening in parallel have me marveling at our capacity to keep learning, sometimes when we least expect to.
Successful completion of a graduate level XML class is one way to satisfy the “core technology requirement” at the University of Washington’s Information School. Certainly everyone who takes the class understands that XML is a way to model information. It is less widely understood how XML is pertinent to the library profession.
Your post provides a great example of exactly why and how XML is relevant to libraries. I will be sending friends and colleagues over this way.
This post also reinforces an argument I will be making in a meeting tomorrow — MLIS students at UW’s iSchool would greatly benefit from working on “real world” solutions to problems.
And you can rarely go wrong with the inclusion of beer.
Thanks for posting.
As an instructor, I was all about real-world instruction; I believe it makes all the difference. Keep us in mind if you’re looking for internships!
Some folks will read this and look at the technical issues you have “conquered.” Others (your “beer” friends) will look at that language. I know just enough of each to be dangerous. However, what I got, and what I think is critical for success, is something which I have appreciated in you for many years: the ability to take concrete examples and use them to rise to the really big picture. I count on you for that (at times). You can help us (well, at least me) focus on the big issue/big picture. Here is what I think is the most important sentence, and it comes near the end, after you have used your writing skills to get us there: “The ability to communicate is key.”
Thanks Karen.
I know some folks like using XMLMind as a DocBook editor as well, although I have not used it enough to form an opinion. oXygen is nice for integrating with XSL.
@Jen –
To me, another aspect of XML, besides its use in documents and the resulting ease of certain types of manipulation once information is made explicit, is the increasing use of XML for data exchange and communication.
There are lots of tools and sites that offer data in various XML schemas that I’d love to see more integrated with library systems. Off the top of my head: RSS and Atom feeds, ONIX, and web services for sites like BoardGameGeek, MusicBrainz, and DBpedia all provide some form of XML.
(Many actually provide multiple formats, such as JSON and the like. That’s a whole other debate.)
Great article. I can’t wait to check out DocBook XML. Adobe has lots of examples of XML performing stupid browser tricks using Spry, its JavaScript framework for Ajax.
http://labs.adobe.com/technologies/spry/samples/
One of the fun and easy things to do is to use Ajax to make tables with sortable columns.
I enjoyed the article. One system that at least bears mentioning as a DocBook alternative is reStructuredText. When paired with Sphinx, it’s one of the easier ways to document a large, collaborative technical project. Probably the best example is Python’s documentation project (http://docs.python.org).
In general, I think it’s a lot more pleasant for folks to write documentation in plain text than in XML, and so for many applications outside of universities and corporations, plain text formats like Wikipedia’s MediaWiki markup or Python’s reStructuredText are (and will remain) the norm.