[lxc-devel] Putting man pages on website?

Rob Landley rlandley at parallels.com
Tue Feb 15 14:37:30 UTC 2011


On 02/15/2011 04:29 AM, Daniel Lezcano wrote:
>>> PS: the extension ought to be .xml, not .sgml, and I recommend you
>>> switch from Emacs' sgml-mode to nxml-mode, which is the default for .xml
>>> files in recent GNU Emacs releases.
>> I'd rather not get any emacs on me.
>>
>> But this should be enough to put something on the website.  Thanks.
> 
> Won't be easier man2html ?

No.  Converting one modernish, human-readable, angle-bracket-delimited
format into another is a lot easier and more reliable than producing a
typesetting language for daisy wheel printers from the 1970s and then
running a pile of regex heuristics to try to parse it back.  Under the
covers that's a bit like running it through babelfish twice.
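
To illustrate why the angle-bracket-to-angle-bracket route is the easy
one, here is a minimal sketch in Python.  It assumes a tiny DocBook-style
refentry; the sample page, the INLINE mapping, and the function names
are all made up for illustration and are not taken from the lxc tree or
from any real stylesheet.  A real conversion would more likely use the
DocBook XSL stylesheets than hand-written code like this.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical, minimal DocBook-style man page source (not from the lxc tree).
DOCBOOK = """<refentry>
  <refnamediv>
    <refname>lxc-example</refname>
    <refpurpose>illustrative placeholder command</refpurpose>
  </refnamediv>
  <refsect1>
    <title>Description</title>
    <para>Runs <command>lxc-example</command> with <option>-n</option>.</para>
  </refsect1>
</refentry>"""

# Inline DocBook elements mapped to HTML tags (a tiny, assumed subset).
INLINE = {"command": "b", "option": "code"}


def inline_html(elem):
    """Render an element's mixed content (text plus inline children) as HTML."""
    parts = [escape(elem.text or "")]
    for child in elem:
        tag = INLINE.get(child.tag, "span")
        parts.append(f"<{tag}>{inline_html(child)}</{tag}>")
        parts.append(escape(child.tail or ""))
    return "".join(parts)


def refentry_to_html(xml_text):
    """Walk the already-parsed tree; no guessing about structure is needed."""
    root = ET.fromstring(xml_text)
    name = root.findtext("refnamediv/refname", default="")
    purpose = root.findtext("refnamediv/refpurpose", default="")
    out = [f"<h1>{escape(name)}</h1>", f"<p>{escape(purpose)}</p>"]
    for sect in root.findall("refsect1"):
        out.append(f"<h2>{escape(sect.findtext('title', default=''))}</h2>")
        for para in sect.findall("para"):
            out.append(f"<p>{inline_html(para)}</p>")
    return "\n".join(out)


if __name__ == "__main__":
    print(refentry_to_html(DOCBOOK))
```

The source already says what is a title, what is a paragraph, and what
is a command name, so the converter only has to rename things.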

Years ago I helped debug doclifter, which is a gigantic pile of
heuristics and attempts at AI to translate man pages into DocBook,
badly.  It's written in Python.  Here's a description of _some_ of the
heuristics it uses:

  http://www.catb.org/~esr/doclifter/doclifter.html

The purpose of that package (and the reason you don't really hear about
it anymore) was to let people do a one-time conversion: stop
maintaining troff sources and switch to something anybody under the
age of 50 still understands.  Lots of packages did such one-time
conversions a decade ago, and ever since they've maintained their man
pages in a source format _other_ than troff macros.

The package you suggested is essentially a pile of Perl regexes to do
something similar, only less thorough and targeting HTML directly
instead of DocBook (from which you can also produce a PDF).  It has
similar problems parsing the horrors of troff:

  http://trac.osgeo.org/grass/ticket/612
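
As a rough idea of how the regex approach goes wrong, here is a toy
sketch in Python.  These two patterns are invented for illustration and
are NOT taken from man2html or doclifter; they handle the plain .B
macro and the \fB...\fR font escape, and nothing else.

```python
import re

# Toy heuristics, NOT taken from man2html or doclifter.
BOLD_MACRO = re.compile(r"^\.B (.+)$")          # ".B word" request line
BOLD_ESCAPE = re.compile(r"\\fB(.*?)\\f[RP]")   # inline \fB...\fR or \fB...\fP


def naive_troff_to_html(line):
    """Translate the two bold forms above; pass everything else through."""
    m = BOLD_MACRO.match(line)
    if m:
        return f"<b>{m.group(1)}</b>"
    return BOLD_ESCAPE.sub(r"<b>\1</b>", line)


if __name__ == "__main__":
    print(naive_troff_to_html(".B lxc-start"))          # recognized
    print(naive_troff_to_html(r"see \fBlxc-start\fR"))  # recognized
    print(naive_troff_to_html(".BR lxc-start (1)"))     # falls through as raw troff
```

The third line uses .BR, the man macro that alternates bold and roman
arguments, and it comes out as raw troff because no pattern covers it.
Each macro, escape, and quoting rule needs another heuristic like this.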

Both are dealing with an easier problem these days because nobody really
maintains troff and all those macro packages as a source format anymore,
so it's all generated from other source formats by programs like pod2man
with recognizable idiosyncrasies that don't exercise all the corner
cases of the ancient dead macro languages.  (This wasn't true when
doclifter was written, but a decade's a long time on the internet...)



