libwml

iceiceice
Posts: 1056
Joined: August 23rd, 2013, 2:10 am

libwml

Post by iceiceice »

I wanted to start a thread about a Wesnoth-related project that I worked on and recently open-sourced.

libwml is a parser library which can be used to parse WML.

The main distinctive feature of it is that it parses WML without expanding the macros -- the macros get represented in an unexpanded state, and it stores the definitions of the macros as well.
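To illustrate what "parsing without expanding macros" can look like, here is a minimal Python sketch. It is not libwml's code or API: the names `Tag`, `MacroCall` and `parse` are made up for this example, and it only handles a tiny line-oriented subset of WML. Macro calls are kept as nodes in the tree, and `#define` blocks are stored as raw definitions instead of being substituted.

```python
from dataclasses import dataclass, field


@dataclass
class MacroCall:
    """An unexpanded macro invocation such as {PLACE_GUARD 3 4}."""
    name: str
    args: list


@dataclass
class Tag:
    """A WML tag with its attributes and children (Tag or MacroCall)."""
    name: str
    attributes: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def parse(text):
    """Parse a tiny WML subset; return (root Tag, macro definitions)."""
    root = Tag("root")
    stack = [root]
    macros = {}  # macro name -> (parameter names, raw body lines)
    lines = iter(text.splitlines())
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#define "):
            # Store the definition verbatim; the body is not parsed here.
            name, *params = line[len("#define "):].split()
            body = []
            for body_line in lines:
                if body_line.strip() == "#enddef":
                    break
                body.append(body_line)
            macros[name] = (params, body)
        elif line.startswith("#"):
            continue  # comment or a directive this sketch ignores
        elif line.startswith("[/"):
            if len(stack) == 1 or stack[-1].name != line[2:-1]:
                raise ValueError(f"unexpected closing tag: {line}")
            stack.pop()
        elif line.startswith("["):
            tag = Tag(line[1:-1])
            stack[-1].children.append(tag)
            stack.append(tag)
        elif line.startswith("{") and line.endswith("}"):
            # Keep the macro call as a node instead of expanding it.
            name, *args = line[1:-1].split()
            stack[-1].children.append(MacroCall(name, args))
        elif "=" in line:
            # Attribute values may still contain {MACRO} text; keep it raw.
            key, value = line.split("=", 1)
            stack[-1].attributes[key.strip()] = value.strip()
        else:
            raise ValueError(f"cannot parse line: {line}")
    if len(stack) != 1:
        raise ValueError(f"unclosed tag: [{stack[-1].name}]")
    return root, macros
```

With this representation a later pass can decide, per macro, whether to expand it, translate it, or just report on it.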

It also knows most of the schema of WML files known by the wesnoth program, as described on the wiki. It can read standard WML files and check if they match the schema.

I made an early version of this in 2015 which found a small number of bugs in mainline campaigns and core macros -- those bugs were reported and fixed at that time. They were mostly mistyped attribute names, which caused an attribute to be ignored, or illegal values.
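As a rough picture of the kind of check that catches these bugs, here is a small Python sketch. The `SCHEMA` table and the `check_tag` function are hypothetical and only cover a made-up excerpt of `[unit]`; libwml's real schema is much larger and is not structured like this.

```python
import difflib

# Hypothetical schema excerpt: tag name -> {attribute name: expected type}.
SCHEMA = {
    "unit": {"type": str, "side": int, "x": int, "y": int},
}


def check_tag(tag_name, attributes, schema=SCHEMA):
    """Return a list of human-readable problems for one tag."""
    spec = schema.get(tag_name)
    if spec is None:
        return [f"unknown tag [{tag_name}]"]
    problems = []
    for key, value in attributes.items():
        if key not in spec:
            # A mistyped key would otherwise just be ignored by the engine.
            hint = difflib.get_close_matches(key, list(spec), n=1)
            suffix = f" (did you mean '{hint[0]}'?)" if hint else ""
            problems.append(f"[{tag_name}] unknown attribute '{key}'{suffix}")
        elif "{" in value:
            continue  # unexpanded macro: cannot check the value here
        elif spec[key] is int:
            try:
                int(value)
            except ValueError:
                problems.append(
                    f"[{tag_name}] attribute '{key}' expects a number, got '{value}'"
                )
    return problems
```

For example, `check_tag("unit", {"tpye": "Spearman", "side": "one"})` reports one unknown attribute and one illegal value, while a value like `{PLAYER_SIDE}` is skipped because it is still an unexpanded macro.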

There are a number of possible uses for this. The main ones are:
  • Make a new linter tool which makes fewer mistakes than `wmllint` and doesn't rely on regex.
  • Make a tool to automatically rewrite wesnoth campaigns and addons into WSL (fabi's lua moonscript language for wesnoth)
  • Make a tool to automatically rewrite wesnoth campaigns and addons into FFL (the anura formula language used by wesnoth2).
It is still too immature for any of these applications right now, but I thought I would post it anyway and gauge interest. It is free for anyone to use.

I will probably continue to tinker with it and develop it in the near future. See the README for a more extended discussion of my goals with this: https://github.com/cbeck88/libwml
Astoria
Inactive Developer
Posts: 1007
Joined: March 20th, 2008, 5:54 pm
Location: Netherlands

Re: libwml

Post by Astoria »

This is honestly very nice. Very curious to see what you're going to do with it. Might even look into doing something with it myself if I can find the time.
Formerly known as the creator of Era of Chaos and maintainer of The Aragwaithi and the Era of Myths.
iceiceice
Posts: 1056
Joined: August 23rd, 2013, 2:10 am

Re: libwml

Post by iceiceice »

:D

So I decided to change how the whole thing works yesterday and today.

The idea is that there is an XML file that knows about every tag documented on the wiki: what attributes it has, what types they can have, what their default values are, what child tags it can have, how many of each are allowed, and when their order matters.

The XML file looks like this:

https://github.com/cbeck88/libwml/blob/ ... h_1_12.xml

It's about 1500 lines. I actually thought it would be a lot larger.

There is a Python script, ~500 lines, which reads that file and converts it into a C++ header that defines a structure type for each tag, plus a number of useful power tools for working with WML. The XML file gets translated "sort of" line-for-line into C++ (or block-for-block, rather). The XML format is documented in that Python file (py/generate.py in the repository).
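To give a feel for the XML-to-header step, here is a toy Python sketch. The XML element and attribute names (`spec`, `tag`, `attribute`, `child`) are invented for this example and are not the real format, which is documented in py/generate.py. The emitted C++ is also simplified: includes and declaration order are left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical spec format, NOT the one used by libwml.
SPEC = """
<spec>
  <tag name="modifications"/>
  <tag name="unit">
    <attribute name="type" type="string"/>
    <attribute name="side" type="int" default="1"/>
    <child name="modifications"/>
  </tag>
</spec>
"""

CPP_TYPES = {"string": "std::string", "int": "int", "bool": "bool"}


def emit_struct(tag):
    """Translate one <tag> element into the text of a C++ struct."""
    lines = [f"struct {tag.get('name')} {{"]
    for attr in tag.findall("attribute"):
        kind = attr.get("type")
        default = attr.get("default")
        if default is None:
            init = ""
        elif kind == "string":
            init = f' = "{default}"'
        else:
            init = f" = {default}"
        lines.append(f"  {CPP_TYPES[kind]} {attr.get('name')}{init};")
    for child in tag.findall("child"):
        name = child.get("name")
        lines.append(f"  std::vector<{name}> {name}_children;")
    lines.append("};")
    return "\n".join(lines)


def generate(xml_text):
    """Emit one struct per <tag>, in document order."""
    root = ET.fromstring(xml_text.strip())
    return "\n\n".join(emit_struct(tag) for tag in root.findall("tag"))
```

Calling `print(generate(SPEC))` prints one struct per tag, which is the block-for-block correspondence in miniature.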

There are still some issues with the XML spec. It's not 100% correct yet, and I think there are some parts of WML that I didn't cover, mainly related to AI. I'm also not sure whether terrain graphics and some parts of the unit_type variations are properly specced right now. Oh, and I don't think I did specials or abilities yet.

The best way to check that stuff is probably to try to get the linter mode running again, but it might be that just carefully double checking it against the wiki is good enough...

To make an emitter, there are some more things that need to happen:
- Currently struct members have type "int", "bool", "std::string", etc., depending on whether the corresponding attribute is a number, a boolean, and so on. This is correct when macros are fully expanded, but for unexpanded macros we probably want something like "int_expression" or "bool_expression". Those don't actually exist yet :D
- One thing that makes this hard right now is that macros don't really have a type; they are pure textual substitution. In almost all cases, though (when the WML writer is sane), the macro evaluates to a string, a number, or a table of some form. In a language that isn't statically typed, such as Lua, this isn't that big a deal. In FFL, functions are supposed to have a static return type. We could cheat by returning "any" instead, but we could also try to deduce the return type from context, i.e. from what the function looks like it is supposed to produce. If the function is used in many places, one of them might be unambiguous, or we might detect a macro that we have no hope of translating properly into a function. (It might be imbalanced or something.)
- When we've concluded that we can't properly convert a macro to a function, we can either fail or just inline the macro there and try to proceed with translation. This means the goal doesn't need to be to succeed in all of the most difficult cases: we can just keep improving it until we are satisfied. We can also inline a macro in some of the places where it is used, but use it as a function elsewhere.
- A big piece of the problem is how to translate action_wml: once we can translate action_wml, the remaining barriers should be surmountable. A small piece of action_wml where one might start is conditional_wml. From there, one could attempt to translate either control flow (if, while, switch) or filters... I think filters should have quite an elegant expression in wesnoth2 FFL.
- Before making an emitter, it might be better to make a really smart linter that analyzes the WML for lots of problems, e.g. some macro is supposed to expand to a string, but instead it expands to something that doesn't look like a string... afaik no linter can currently see a problem like that.
- Another fairly large issue is that a macro is sometimes used not just for an attribute or a child, but for a collection of attributes within a tag. The macro effectively returns a table which should be merged into the current tag. There's no good way to represent that situation right now. In many cases I think we just want to inline the macro, but there might be an alternative data structure we can use, depending on what we hope to emit.
- A different goal would be to avoid scenarios and units entirely for now: just try to write something that can translate the terrain graphics logic into another language like Lua, and see if it works. terrain_graphics has a lot of macros, so that's a pretty decent test case.
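A rough Python sketch of the "guess what a macro produces, otherwise inline it" idea from the list above. None of this exists in libwml: `classify_macro_body` and `translation_plan` are hypothetical names, the heuristics are deliberately crude (a quoted string containing "=" would be misclassified, for instance), and the choice to always inline attribute collections is just one possible policy.

```python
import re

TAG_OPEN = re.compile(r"^\[(\w+)\]$")
TAG_CLOSE = re.compile(r"^\[/(\w+)\]$")
NUMBER = re.compile(r"-?\d+(\.\d+)?")


def classify_macro_body(body):
    """Guess the kind of thing an unexpanded macro body produces."""
    stack = []            # currently open tag names
    top_level_attr = False
    saw_tag = False
    scalars = []          # lines that are neither tags nor key=value
    for line in (raw.strip() for raw in body.splitlines()):
        if not line:
            continue
        closing = TAG_CLOSE.match(line)
        opening = TAG_OPEN.match(line)
        if closing:
            if not stack or stack.pop() != closing.group(1):
                return "unbalanced"
        elif opening:
            stack.append(opening.group(1))
            saw_tag = True
        elif "=" in line:
            if not stack:
                top_level_attr = True
        else:
            scalars.append(line)
    if stack:
        return "unbalanced"
    if top_level_attr:
        return "attributes"   # a table to merge into the enclosing tag
    if saw_tag:
        return "tags"
    if len(scalars) == 1:
        return "number" if NUMBER.fullmatch(scalars[0]) else "string"
    return "unknown"


def translation_plan(kind):
    """Decide whether to emit a function or fall back to inlining."""
    if kind in ("number", "string", "tags"):
        return "function"
    return "inline"  # unbalanced, attributes, unknown
```

A real version would also look at the call sites, since one unambiguous use can settle the type for a macro whose body alone is inconclusive.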

Cheers,
iceiceice