Section: User Contributed Perl Documentation (3)
Updated: 2010-11-30
NAME
Locale::Po4a::Sgml - convert SGML documents from/to PO files
DESCRIPTION
The goal of the po4a (PO for anything) project is to ease translations (and,
more interestingly, the maintenance of translations) by using gettext tools in
areas where they were not originally expected, such as documentation.
Locale::Po4a::Sgml is a module to help the translation of documentation in
the SGML format into other [human] languages.
This module uses nsgmls to parse the SGML files. Make sure it is
installed.
Also make sure that the DTDs of the SGML files are installed on the system.
OPTIONS ACCEPTED BY THIS MODULE
debug
Space separated list of keywords indicating which parts you want to debug. Possible values are: tag, generic, entities and refs.
verbose
Give more information about what's going on.
translate
Space separated list of extra tags (besides those provided by the DTD) whose
content should form an extra msgid.
section
Space separated list of extra tags (besides those provided by the DTD)
that contain other tags, some of which are of category translate.
indent
Space separated list of tags which increase the indentation level.
verbatim
The layout within those tags should not be changed. The paragraph won't get
wrapped, and no extra indentation space or new line will be added for
cosmetic purposes.
empty
Tags not needing to be closed.
ignore
Tags ignored and treated as plain character data by po4a. That is to say,
they can be part of a msgid. For example, <b> is a good candidate
for this category, since putting it in the translate category would create
msgids that are not whole sentences, which is bad.
attributes
A space separated list of attributes that need to be translated. You can
specify the attributes by name (for example, "lang"), but you can also
prefix the name with a tag hierarchy to specify that the attribute will only
be translated when it is inside the specified tags. For example,
<bbb><aaa>lang specifies that the lang attribute will only be
translated if it is in an <aaa> tag which is itself in a <bbb> tag.
The tag names are actually regular expressions, so you can also write things
like <aaa|bbb>lang to only translate lang attributes that are in
an <aaa> or a <bbb> tag.
qualify
A space separated list of attributes for which the translation must be
qualified by the attribute name. Note that this setting automatically adds the
given attribute into the 'attributes' list too.
force
Proceed even if the DTD is unknown or if nsgmls finds errors in the input
file.
include-all
By default, msgids containing only one entity (like '&version;') are skipped
for the translator's comfort. Activating this option disables this
optimisation. It can be useful if the document contains a construction like
"<title>&Aacute;</title>", even if I doubt such things ever happen...
ignore-inclusion
Space separated list of entities that won't be inlined.
Use this option with caution: it may cause nsgmls (used internally) to add
tags and render the output document invalid.
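As an illustration only, the tag hierarchy syntax of the attributes option
described above can be read as in the following Python sketch. This is not
the module's own code (the module is written in Perl). The function names
parse_attribute_spec and attribute_is_translatable are hypothetical, and the
sketch assumes that the tags of the hierarchy must match the directly
enclosing elements, outermost first; the module may match more loosely.

```python
import re


def parse_attribute_spec(spec):
    """Split a spec such as '<bbb><aaa>lang' into (['bbb', 'aaa'], 'lang').

    Hypothetical helper: the tag patterns are returned outermost first,
    in the order they are written in the spec.
    """
    tag_patterns = re.findall(r"<([^<>]+)>", spec)
    attr_name = re.sub(r"<[^<>]+>", "", spec)
    return tag_patterns, attr_name


def attribute_is_translatable(spec, open_tags, attr_name):
    """Decide whether attr_name, found on the innermost element of
    open_tags (a list of open tag names, outermost first), is selected
    by spec.

    Assumption: every tag pattern is a regular expression that must match
    a whole tag name, and the hierarchy must match the innermost elements.
    """
    tag_patterns, wanted = parse_attribute_spec(spec)
    if attr_name != wanted:
        return False
    if len(tag_patterns) > len(open_tags):
        return False
    # Compare the hierarchy against the innermost open elements only.
    innermost = open_tags[len(open_tags) - len(tag_patterns):]
    return all(
        re.fullmatch(pattern, tag) is not None
        for pattern, tag in zip(tag_patterns, innermost)
    )
```

Under these assumptions, the spec <bbb><aaa>lang selects a lang attribute on
an <aaa> element whose parent is <bbb>, and a bare lang selects the attribute
on any element.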
STATUS OF THIS MODULE
The result is perfect, i.e., the generated documents are exactly the same as
the originals. But there are still some problems:
•
The error output of nsgmls is redirected to /dev/null, which is clearly
bad. I don't know how to avoid that.
The problem is that I have to "protect" the conditional inclusions (i.e., the
"<! [ %foo [" and "]]>" stuff) from nsgmls. Otherwise
nsgmls eats them, and I don't know how to restore them in the final
document. To prevent that, I rewrite them to "{PO4A-beg-foo}" and
"{PO4A-end}".
The problem with this is that the "{PO4A-end}" and such that I add are
invalid in the document (they are not in a <p> tag or so).
Everything works well with nsgmls's output redirected that way, but it will
prevent us from detecting that the document is badly formatted.
•
It works only with the DebianDoc and DocBook DTDs. Adding support for a
new DTD should be very easy. The mechanism is the same for every DTD; you just
have to give a list of the existing tags and some of their characteristics.
I agree, this needs some more documentation, but it is still considered
beta, and I hate to document stuff which may/will change.
•
Warning, support for DTDs is quite experimental. I did not read any
reference manual to find the definition of every tag. I added tag
definitions to the module until it worked for some documents I found on the
net. If your document uses more tags than mine, it won't work. But as I said
above, fixing that should be quite easy.
I tested DocBook against the SAG (System Administrator Guide) only, but
this document is quite big, and should use most of the DocBook
specificities.
For DebianDoc, I tested some of the manuals from the DDP, but not all yet.
•
In case of file inclusion, the string references of messages in PO files
(i.e., lines like "#: en/titletoc.sgml:9460") will be wrong.
This is because I preprocess the file to protect the conditional inclusions
(i.e., the "<! [ %foo [" and "]]>" stuff) and some entities (like
&version;) from nsgmls, because I want them to appear verbatim in the
generated document. For that, I make a temp copy of the input file and make
all the changes I want to it before passing it to nsgmls for parsing.
For this to work, I replace the entities that request a file inclusion with
the content of the given file (so that I can also protect what needs
protecting in the subfile). But nothing is done so far to correct the
references (i.e., filename and line number) afterward. I'm not sure what the
best thing to do is.
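The protection of conditional inclusions described in this section can be
pictured with the following Python sketch. It is only an illustration under
stated assumptions, not the module's code (the module is written in Perl):
the function names protect_conditionals and restore_conditionals are
hypothetical, the opening delimiter is assumed to be of the form
"<![ %foo; [" with optional whitespace and an optional semicolon, and the
restored form always uses "<![ %foo; [".

```python
import re

# Assumed shape of the opening of a conditional inclusion, e.g. "<![ %foo; [".
OPENING = re.compile(r"<!\s*\[\s*%([\w.-]+);?\s*\[")
# Marker written in place of the opening, e.g. "{PO4A-beg-foo}".
BEGIN_MARKER = re.compile(r"\{PO4A-beg-([\w.-]+)\}")


def protect_conditionals(text):
    """Hide conditional inclusions behind plain-text markers so that the
    parser does not consume them."""
    text = OPENING.sub(r"{PO4A-beg-\1}", text)
    # Naive: this also rewrites any other "]]>" (for instance the end of
    # a CDATA section), which a real implementation may need to avoid.
    return text.replace("]]>", "{PO4A-end}")


def restore_conditionals(text):
    """Turn the markers back into conditional inclusion delimiters."""
    text = BEGIN_MARKER.sub(r"<![ %\1; [", text)
    return text.replace("{PO4A-end}", "]]>")


if __name__ == "__main__":
    source = "<p>a</p>\n<![ %foo; [\n<p>b</p>\n]]>\n"
    protected = protect_conditionals(source)
    print(protected)
    print(restore_conditionals(protected) == source)
```

The markers are ordinary character data, which is why they are not valid at
the places where they end up, and why the parser's error output cannot be
trusted once they have been inserted.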
AUTHORS
This module is an adapted version of sgmlspl (SGML postprocessor for the
SGMLS and NSGMLS parsers) which was: