Extract node info from WML into PO

Discuss and coordinate development of mainline and user-made content translations.


caslav.ilic
Translator
Posts: 61
Joined: February 5th, 2007, 4:17 pm
Location: Brunswick, Germany

Extract node info from WML into PO

Post by caslav.ilic »

While translating, I often had to peek into the source WML files to tell, for example, who is talking in a particular message, or whether anyone is talking at all. In my language, sentence elements may depend on whether a male or a female character is speaking, not to mention possible differences in style between characters.

So I thought it would be nice if WML extraction added the node info to each PO entry, like this:


#. message: description=Nil-Galion
#: data/campaigns/Two_Brothers/scenarios/2_The_Chase.cfg:477
msgid "You foolish human, you killed me but you won't be able to reach the undead. I fulfilled the pact and will be reanimated soon to be a Lord of their armies."
msgstr ""

#. message: description=Arne
#: data/campaigns/Two_Brothers/scenarios/2_The_Chase.cfg:481
msgid "Hurry, we have to track them down. Maybe we can still get them. They have to be in the north!"
msgstr ""

#. objective: condition=win
#: data/campaigns/Two_Brothers/scenarios/2_The_Chase.cfg:486
msgid "Hurry to the north and stop the kidnappers"
msgstr ""
The patch for wmlxgettext to do this is attached.
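
To make the idea concrete, here is a minimal sketch in Python. It is not the attached patch and not how wmlxgettext is written; the `INFO_KEYS` table, the `extract` function, and the simplified line-by-line handling of WML (no macros, no multi-line strings, no quote escaping) are all assumptions made for illustration.

```python
import re

# Hypothetical table: which attributes count as node info for which tag.
INFO_KEYS = {"message": ("description",), "objective": ("condition",)}

OPEN_TAG = re.compile(r"^\[\+?(\w+)\]$")
CLOSE_TAG = re.compile(r"^\[/(\w+)\]$")
ATTRIBUTE = re.compile(r"^(\w+)\s*=\s*(.*)$")
TRANSLATABLE = re.compile(r'^_\s*"(.*)"$')


def extract(wml_text, filename):
    """Return PO entries, each led by a '#.' comment carrying node info."""
    entries = []
    stack = []  # one dict per open node: tag, info attributes, messages
    for lineno, raw in enumerate(wml_text.splitlines(), start=1):
        line = raw.strip()
        if CLOSE_TAG.match(line):
            if not stack:
                continue  # stray closing tag: simply ignored in this sketch
            node = stack.pop()
            info = ", ".join(f"{key}={value}" for key, value in node["info"])
            comment = f"#. {node['tag']}: {info}" if info else f"#. {node['tag']}"
            # Entries are written only when the node closes, because the
            # info attribute may come after the translatable string.
            for msg_line, msgid in node["messages"]:
                entry = "\n".join([comment, f"#: {filename}:{msg_line}",
                                   f'msgid "{msgid}"', 'msgstr ""'])
                entries.append((msg_line, entry))
            continue
        opened = OPEN_TAG.match(line)
        if opened:
            stack.append({"tag": opened.group(1), "info": [], "messages": []})
            continue
        attribute = ATTRIBUTE.match(line)
        if attribute and stack:
            key, value = attribute.groups()
            node = stack[-1]
            translatable = TRANSLATABLE.match(value)
            if translatable:
                node["messages"].append((lineno, translatable.group(1)))
            elif key in INFO_KEYS.get(node["tag"], ()):
                node["info"].append((key, value))
    # Restore source order, since nested nodes close before their parents.
    return [entry for _, entry in sorted(entries)]


SAMPLE = """[message]
    description=Arne
    message= _ "Hurry, we have to track them down."
[/message]
[objective]
    description= _ "Hurry to the north and stop the kidnappers"
    condition=win
[/objective]
"""

if __name__ == "__main__":
    print("\n\n".join(extract(SAMPLE, "2_The_Chase.cfg")))
```

Even this toy version has to track the node stack by itself, which is where the brittleness comes from: anything the real WML parser understands and this loop does not will desynchronise the stack.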

However... I'm not posting this straight to the patch section at Gna, because node info parsing is inherently brittle: it seems to me that nothing short of the reference WML parser itself would do it perfectly. wmlxgettext as patched above passes make update-po for the current trunk, but only with a patch to one of the WML files (the second attachment), and similar tweaks may be needed in the future.

So the question: is it worth it? Would the node info be useful for enough languages/translators, such that the need for occasional fixes to wmlxgettext or slight adjustments to WML files is acceptable?
Attachments
nodeinfo-wml-adjust-01.txt
Slight modification for make update-po to work with patched wmlxgettext.
(629 Bytes) Downloaded 453 times
wmlxgettext-nodeinfo-01.txt
Patch for wmlxgettext: extract node info.
(4.33 KiB) Downloaded 465 times
Chusslove Illich (Часлав Илић)
vicza
Posts: 238
Joined: January 16th, 2008, 11:40 pm
Location: Moscow

Re: Extract node info from WML into PO

Post by vicza »

caslav.ilic wrote:So the question: is it worth it? Would the node info be useful for enough languages/translators
I think, yes. It would be quite useful.

Though, maybe, it would be better not in comments, but in msgid itself, separated by ^ sign, as it's already used in some cases in wesnoth.po

That is:

msgid "Arne^Hurry, we have to track them down...."
caslav.ilic
Translator
Posts: 61
Joined: February 5th, 2007, 4:17 pm
Location: Brunswick, Germany

Re: Extract node info from WML into PO

Post by caslav.ilic »

vicza wrote:Though, maybe, it would be better not in comments, but in msgid itself, separated by ^ sign [...]
This is not really technically feasible, as it would make the message different from what the program sees at runtime, and thus it would fail to be translated. The existing "...^..." messages are stated like that in the code itself; it is not the message extraction that composes them.

Besides, using a #. ... comment (called automatic comment) to provide such information, rather than "...^..." (called context), is customary for Gettext-based translations and PO file format. Translators really ought to keep an eye for any automatic comments that appear in the messages.
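
A small sketch of why the msgid cannot be altered at extraction time, written in Python for illustration. It models the catalog as a plain dictionary keyed by msgid, which is a simplification of how Gettext looks strings up; the `translate` helper and the placeholder translation text are hypothetical.

```python
def translate(catalog, source_string):
    """Look up the exact source string; fall back to it if no entry exists."""
    return catalog.get(source_string, source_string)


# What the program passes to the lookup at runtime: the string as written in WML.
runtime_string = "Hurry, we have to track them down."

# Node info carried in a '#.' comment leaves the msgid untouched.
catalog_with_comment = {
    "Hurry, we have to track them down.": "<translated text>",
}

# Node info glued into the msgid by the extractor changes the lookup key.
catalog_with_prefix = {
    "Arne^Hurry, we have to track them down.": "<translated text>",
}

if __name__ == "__main__":
    print(translate(catalog_with_comment, runtime_string))
    print(translate(catalog_with_prefix, runtime_string))
```

With the prefixed key the lookup misses and the untranslated string comes back, which is the failure described above.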
Chusslove Illich (Часлав Илић)
vicza
Posts: 238
Joined: January 16th, 2008, 11:40 pm
Location: Moscow

Re: Extract node info from WML into PO

Post by vicza »

caslav.ilic wrote:Besides, using a #. ... comment (called automatic comment) to provide such information, rather than "...^..." (called context), is customary for Gettext-based translations and PO file format. Translators really ought to keep an eye for any automatic comments that appear in the messages.
Well, maybe... Though I sometimes use the comment field for my own comments during translation, so too many automatic comments wouldn't be too good. But better thus, than not at all, anyway.
caslav.ilic
Translator
Posts: 61
Joined: February 5th, 2007, 4:17 pm
Location: Brunswick, Germany

Re: Extract node info from WML into PO

Post by caslav.ilic »

vicza wrote:Though I sometimes use the comment field for my own comments during translation, so too many automatic comments wouldn't be too good. [...]
Worry not :) Upon merge with templates, comments are arranged such that the translator's comments (those with a blank following the hash, # ...) are put first, then the automatic comments (#.), and finally the source comments (#:). Text editors with PO syntax highlighting will usually show translator comments in a different color (e.g. in Kate they are grey, while auto- and source comments are blue), and dedicated PO editors will typically show them in different widgets, so that translator comments may be edited.
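
For illustration, the arrangement can be mimicked with a few lines of Python. This is only a sketch of the ordering described above, not the merge tool's implementation; `comment_kind` and `arrange` are hypothetical helper names.

```python
# Rank of each comment kind in a merged PO entry, as described above.
ORDER = {"translator": 0, "automatic": 1, "source": 2}


def comment_kind(line):
    """Classify a PO comment line by the character after the hash."""
    if line.startswith("#."):
        return "automatic"
    if line.startswith("#:"):
        return "source"
    if line == "#" or line.startswith("# "):
        return "translator"
    return "other"  # e.g. flag lines, left at the end by this sketch


def arrange(comment_lines):
    """Stable sort: translator comments, then automatic, then source."""
    return sorted(comment_lines, key=lambda line: ORDER.get(comment_kind(line), 3))


if __name__ == "__main__":
    mixed = [
        "#: data/campaigns/Two_Brothers/scenarios/2_The_Chase.cfg:481",
        "#. message: description=Arne",
        "# my own note about this line",
    ]
    print("\n".join(arrange(mixed)))
```

Because the sort is stable, several translator comments keep their relative order and are never interleaved with the automatic ones.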
Chusslove Illich (Часлав Илић)
VS
Translator
Posts: 187
Joined: November 27th, 2005, 10:07 am

Re: Extract node info from WML into PO

Post by VS »

This is a very good idea, and seeing you already coded it doubles the "good" part :D

I'm really looking forward to seeing speakers in comments - one translator in my team added them manually for the whole campaign he worked on, and suddenly I saw it's highly useful...

You should post the patch to Wesnoth's gna! patch section since that's where the real stuff is...
ivanovic
Lord of Translations
Posts: 1149
Joined: September 28th, 2004, 10:10 pm
Location: Germany

Re: Extract node info from WML into PO

Post by ivanovic »

Great idea. Definitely worth a look; I will do so later, but I have no time at the moment. Please add the patch to patches.wesnoth.org, so that it does not get forgotten.
torangan
Retired Developer
Posts: 1365
Joined: March 27th, 2004, 12:25 am
Location: Germany

Re: Extract node info from WML into PO

Post by torangan »

Your patch has a slight error - #textdomain is supposed to be independent of the point in the WML tree. Right now it fails for the Raajal campaign, which has #textdomain inside [campaign]; that is allowed.
Example use:
#textdomain mine
[unit]
description=...
...
#textdomain wesnoth
... some stuff sharing strings with mainline...
#textdomain mine
...
[/unit]

This is supposed to work.
WesCamp-i18n - Translations for User Campaigns:
http://www.wesnoth.org/wiki/WesCamp

Translators for all languages required: contact me. No geek skills required!
Noyga
Inactive Developer
Posts: 1790
Joined: September 26th, 2005, 5:56 pm
Location: France

Re: Extract node info from WML into PO

Post by Noyga »

In fact the patch was nice and has been applied, but it has a small problem: it seems #textdomain needs to be at the top level, otherwise the parser gets confused.
This is something we can fix for mainline, but for UMC we have no control.
Also, you can sometimes expect #textdomain to start in the middle of a file, because the other parts of the file are handled by another textdomain.
The following code caused us some problems on Wescamp-i18n until we realised what was going on:


[textdomain]
    name="wesnoth-Raajal"
    path="data/campaigns/Raajal/translations"
[/textdomain]
[campaign]
    #textdomain wesnoth-Raajal
    icon=units/human-magi/arch-mage.png
    name= _ "Raajal"
    abbrev= _ "Raajal"
    description= _ "An Arch Mage with the power of resurrection seeks a new apprentice. Version" + " 0.6.3, 03-14-2008"
    define=CAMPAIGN_RAAJAL
    difficulties=EASY,NORMAL,HARD
    difficulty_descriptions={MENU_IMG_TXT "units/undead-skeletal/skeleton.png~TC(5,magenta)" _"Easy"} +
    ";*" + {MENU_IMG_TXT "units/undead-skeletal/revenant.png~TC(5,magenta)" _"Normal"} + ";" +
    {MENU_IMG_TXT "units/undead-skeletal/draug.png~TC(5,magenta)" _"Hard"}
    first_scenario=Resurrection
    [about]
        title = _ "Campaign Design"
        [entry]
            name = "Sam M. (Genosuke)"
            email = "genosuke69@hotmail.com"
        [/entry]
    [/about]
[/campaign]

#ifdef CAMPAIGN_RAAJAL
    [binary_path]
        path=data/campaigns/Raajal
    [/binary_path]
    [+units]
        {@campaigns/Raajal/units}
    [/units]
    {@campaigns/Raajal/utils/}
    {@campaigns/Raajal/utils/utils.cfg}
    {@campaigns/Raajal/scenarios}
#endif
Here wmlxgettext crashes on [/campaign] because the #textdomain comes after [campaign]:
expected closed node 'top' got 'campaign' at ./Raajal/_main.cfg:24 at /home/torangan/wesnoth/trunk/utils/wmlxgettext line 134, <FILE> line 24.
If you can change this to a non fatal error it would be nice and would be usable for wescamp-i18n (the current version of wmlxgettext causes too much trouble here)
"Ooh, man, my mage had a 30% chance to miss, but he still managed to hit! Awesome!" ;) -- xtifr
caslav.ilic
Translator
Posts: 61
Joined: February 5th, 2007, 4:17 pm
Location: Brunswick, Germany

Re: Extract node info from WML into PO

Post by caslav.ilic »

Noyga wrote:If you can change this to a non fatal error it would be nice and would be usable for wescamp-i18n (the current version of wmlxgettext causes too much trouble here)
The problem with the #textdomain directive was that wmlxgettext simply skipped lines not in the extraction domain, which had the effect of "commenting out" portions of the WML, and hence the problem. I've now switched it so that all messages are parsed regardless of the domain, but only those with a matching domain are actually recorded for output.
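
The difference between the two behaviours can be sketched in Python as follows. This is an illustration under assumptions, not the attached patch: the `collect` function and its `skip_foreign_lines` switch are hypothetical, WML is handled line by line, and the initial domain is assumed to be "wesnoth".

```python
import re

TEXTDOMAIN = re.compile(r"^#\s*textdomain\s+(\S+)")
OPEN_TAG = re.compile(r"^\[\+?(\w+)\]$")
CLOSE_TAG = re.compile(r"^\[/(\w+)\]$")
TRANSLATABLE = re.compile(r'_\s*"([^"]*)"')


def collect(wml_text, wanted_domain, skip_foreign_lines=False):
    """Return (node path, msgid) pairs for messages in wanted_domain."""
    domain = "wesnoth"  # assumption: domain in effect before any directive
    stack = []
    messages = []
    for lineno, raw in enumerate(wml_text.splitlines(), start=1):
        line = raw.strip()
        directive = TEXTDOMAIN.match(line)
        if directive:
            domain = directive.group(1)
            continue
        if skip_foreign_lines and domain != wanted_domain:
            continue  # old behaviour: tags in other domains are never seen
        opened = OPEN_TAG.match(line)
        if opened:
            stack.append(opened.group(1))
            continue
        closed = CLOSE_TAG.match(line)
        if closed:
            expected = stack.pop() if stack else "top"
            if expected != closed.group(1):
                raise ValueError(
                    f"expected closed node '{expected}' "
                    f"got '{closed.group(1)}' at line {lineno}"
                )
            continue
        # New behaviour: structure is always tracked, recording is filtered.
        if domain == wanted_domain:
            for msgid in TRANSLATABLE.findall(line):
                messages.append((tuple(stack), msgid))
    return messages


SAMPLE = """[campaign]
    #textdomain wesnoth-Raajal
    name= _ "Raajal"
[/campaign]
"""

if __name__ == "__main__":
    print(collect(SAMPLE, "wesnoth-Raajal"))
```

Calling `collect(SAMPLE, "wesnoth-Raajal", skip_foreign_lines=True)` never sees the opening `[campaign]` tag and fails on `[/campaign]`, in the same way as the reported crash.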

This also means that wmlxgettext would stop on otherwise invalid WML, rather than treat it as non-fatal. I did this because if the WML is not valid, messages may get the wrong node info, which is worse than no info at all. To keep that safeguard but add more robustness, I've now made it so that on invalid WML (i.e. if the parser thinks so), node info is simply not output into the PO at all (and a warning is given).

The patch is attached (should I head over to Gna patches...?)
Attachments
wmlxgettext-robustness-01.diff.txt
(3.31 KiB) Downloaded 425 times
Chusslove Illich (Часлав Илић)
Noyga
Inactive Developer
Posts: 1790
Joined: September 26th, 2005, 5:56 pm
Location: France

Re: Extract node info from WML into PO

Post by Noyga »

Thanks for this very quick reply and fix.
I don't think that rejecting invalid WML is a problem.
This has been committed, so no need to submit a patch ;).
caslav.ilic
Translator
Posts: 61
Joined: February 5th, 2007, 4:17 pm
Location: Brunswick, Germany

Re: Extract node info from WML into PO

Post by caslav.ilic »

There is in any language, probably, a proverb illustrating the inverse relation of quickness and savviness.

At any rate: there was no need to dump all node info for the extracted PO in case of a WML parse problem. Not for each of the WML files parsed for the given PO, and not upstream of the error in the problematic WML file. Attached is the patch to curb this eager dumping of valid info (it applies on top of the previous one, no need to revert anything beforehand).
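
A sketch of the narrower policy, in Python and under the same caveats as before (hypothetical function names, line-by-line WML, and the enclosing tag name standing in for the full node info): node info is withheld only from the point of the parse problem onwards, and only within the file that has it.

```python
import re

OPEN_TAG = re.compile(r"^\[\+?(\w+)\]$")
CLOSE_TAG = re.compile(r"^\[/(\w+)\]$")
TRANSLATABLE = re.compile(r'_\s*"([^"]*)"')


def extract_file(wml_text):
    """Return ([(msgid, node info or None)], whether a parse problem was seen)."""
    stack, broken, entries = [], False, []
    for raw in wml_text.splitlines():
        line = raw.strip()
        opened = OPEN_TAG.match(line)
        closed = CLOSE_TAG.match(line)
        if opened:
            stack.append(opened.group(1))
        elif closed:
            if stack and stack[-1] == closed.group(1):
                stack.pop()
            else:
                broken = True  # from here on, node info cannot be trusted
        else:
            for msgid in TRANSLATABLE.findall(line):
                info = None if broken or not stack else stack[-1]
                entries.append((msgid, info))
    return entries, broken


def extract_all(files):
    """Process each file independently so one bad file does not affect others."""
    result = {}
    for name, text in files.items():
        entries, broken = extract_file(text)
        if broken:
            print(f"warning: {name}: WML parse problem, node info dropped after it")
        result[name] = entries
    return result


FILES = {
    "good.cfg": '[objective]\n    description= _ "Win"\n[/objective]\n',
    "bad.cfg": (
        '[message]\n    message= _ "first"\n[/message]\n'
        "[/event]\n"
        '[message]\n    message= _ "second"\n[/message]\n'
    ),
}

if __name__ == "__main__":
    print(extract_all(FILES))
```

In `bad.cfg` the message before the stray `[/event]` keeps its info, the one after it does not, and `good.cfg` is unaffected.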
Attachments
wmlxgettext-robustness-02.diff.txt
(3.88 KiB) Downloaded 432 times
Chusslove Illich (Часлав Илић)
caslav.ilic
Translator
Posts: 61
Joined: February 5th, 2007, 4:17 pm
Location: Brunswick, Germany

Re: Extract node info from WML into PO

Post by caslav.ilic »

There was a bug: info variables were not extracted when the value was quoted. Fix attached.
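
For illustration only, handling both forms might look like the following Python sketch. The `info_value` helper is hypothetical and is not taken from the attached fix; it deals with a single attribute on a single line.

```python
import re


def info_value(line, key):
    """Return the value of key on this line, with surrounding quotes removed."""
    match = re.match(rf"\s*{re.escape(key)}\s*=\s*(.*?)\s*$", line)
    if not match:
        return None
    value = match.group(1)
    # Accept both description=Arne and description="Arne".
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        value = value[1:-1]
    return value


if __name__ == "__main__":
    print(info_value('    description="Nil-Galion"', "description"))
    print(info_value("    description=Arne", "description"))
```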
Attachments
wmlxgettext-robustness-03.diff.txt
(2.45 KiB) Downloaded 429 times
Chusslove Illich (Часлав Илић)