Python Toolchain - Moving forward.
Hey,
I've been doing some digging and thinking about the python tool
chain, and given that we have a number of interested python
programmers buzzing around, I think it's more important to get
discussion going. My idea isn't 100% thought through in all
important details so far, but I'm feeling pretty good about my
general idea, so let's get this rolling.
As far as I can see, the big objective for the python toolchain
is to simplify the life and work of content creators. Immediately,
this would require the existing WML checking and some WML formatting,
but later on this topic could branch out, with WML generation,
editor integrations, and probably more stuff. This is good, since
this means this project will be going for quite some time.
Given this, the more immediate goals will be to keep the
current feature set of the WML data tooling. This includes, but
isn't limited to:
- Syntactic validation of WML files. This is mostly just parsing the
files.
- Semantic validation of WML files after being pre-processed. This
mostly consists of questions like "Is this tag allowed to have this child"
or "Are there spelling errors in this value". This probably also
includes some cross-checking on a WML level, if possible.
- Reformatting of well-formed WML files before being pre-processed.
- If we consider wmlscope (and we should, mid-term): semantic
validation before preprocessing, as long as the macros are
sane.
- If we're feeling brave, more precise macro analysis might be possible.
This will require some deep meditation with the dragon book, though,
so it's more a nice-to-have than something we should really plan for.
This means we need to be quite careful when structuring a solution,
since these are some well-known, but rather tricky compiler problems.
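To make the "Is this tag allowed to have this child" kind of check concrete, here is a minimal sketch. The `Tag` class and the `ALLOWED_CHILDREN` table are hypothetical stand-ins, not the real wmlparser2 tree or the real WML schema; the point is only that such a check is a short recursive walk once a tree exists.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Tag:
    """Hypothetical stand-in for a node of the fully expanded WML tree."""
    name: str
    children: List["Tag"] = field(default_factory=list)


# Hypothetical, incomplete schema: which child tags each tag may contain.
ALLOWED_CHILDREN: Dict[str, Set[str]] = {
    "scenario": {"event", "side"},
    "side": {"unit"},
    "event": {"message"},
}


def check_children(tag: Tag, path: str = "") -> List[str]:
    """Return one message per child tag that the schema does not allow."""
    here = f"{path}/{tag.name}"
    allowed = ALLOWED_CHILDREN.get(tag.name, set())
    problems = []
    for child in tag.children:
        if child.name not in allowed:
            problems.append(
                f"{here}: [{child.name}] is not allowed inside [{tag.name}]"
            )
        # Keep walking so that nested problems are reported as well.
        problems.extend(check_children(child, here))
    return problems


tree = Tag("scenario", [Tag("side", [Tag("unit")]), Tag("event", [Tag("unit")])])
for problem in check_children(tree):
    print(problem)
```

With the toy schema above, only the `[unit]` placed inside `[event]` is reported. A real checker would read its schema from data rather than hard-coding it.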
Now, given that, let's talk about the code base and the course of action
I'd propose.
- I don't think wmllint is salvageable. Not gonna beat around the bush.
The code consists of two weird, hairy loops with multiple kinds
of states flying around somehow and I don't even know what.
- wmlindent works, however, I'd need to check if we can adapt it
into the structure I'll propose in a second.
- I have not yet had time to check wmlscope, but I've heard that it
has non-trivial problems, and I doubt it will be compatible.
- I'll need to think about wmliterator.
- I don't like how wmlparser2 has a dependency on the wesnoth binary. This
will make testing this software a lot harder, and I'd want the new
wml data suite to be pre-processor aware to be able to perform
macro checking. However, I think we can use wmlparser2's syntax tree
as the representation for the fully expanded wml tree and wmlparser2
itself to kickstart development.
but start a new modern, well-tested python3 tool. This tool should
follow a number of fundamental design decisions:
- The tool should use subparsers to structure different commands.
There'd be "wmldat lint", "wmldat migrate", "wmldat indent" and
so on. This makes it easier to use, since there's just one entry
point into the system, and it's easy to extend, since we can just
add more sub commands.
- We need to be very clear about separating different tasks. Lint
should be a read-only operation which runs checkers on a tree.
Autocorrect could fix some issues with simple fixes. Migrate
assumes a sane file (or enforces it) and changes
things in a single file. Indent assumes a syntactically sane file
and outputs it again.
- Internally, the tool should be task- or phase-oriented and very clear about
the state of the files it is looking at. Thus, for example, most
semantic checks would require a task 'fully-expand-input-wml', which
is guaranteed to provide a fully parsed and preprocessed
AST of the wml we're working on.
Other, braver tasks might just depend on 'preprocessor-parsed' or
'lexer-created' or something like that, and deal with mostly
unparsed and unreliable files.
This, overall, would give us great flexibility to reuse existing
code in all kinds of ways, and it would enable us to have many
developers work in parallel on the same code base.
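A rough sketch of how the single entry point and the phase idea could fit together, using `argparse` subparsers. The phase functions are placeholders (nothing here parses real WML), and `COMMANDS`, `run_until` and `main` are hypothetical names chosen for illustration; only the "wmldat" sub command names and the phase names come from the proposal above.

```python
import argparse


# Placeholder phases, ordered from raw text to fully expanded tree.
# Each phase takes the result of the phase before it.
def lex(text):
    return text.split()                  # placeholder, not a real WML lexer


def parse_preprocessor(tokens):
    return {"tokens": tokens}            # placeholder


def fully_expand(parsed):
    return {"ast": parsed["tokens"]}     # placeholder


PHASES = [
    ("lexer-created", lex),
    ("preprocessor-parsed", parse_preprocessor),
    ("fully-expand-input-wml", fully_expand),
]

# Each sub command declares the phase it depends on.
COMMANDS = {
    "lint": "fully-expand-input-wml",
    "indent": "lexer-created",
}


def run_until(phase_name, text):
    """Run the phases in order and return the output of `phase_name`."""
    if phase_name not in [name for name, _ in PHASES]:
        raise ValueError(f"unknown phase: {phase_name}")
    result = text
    for name, func in PHASES:
        result = func(result)
        if name == phase_name:
            return result


def build_parser():
    parser = argparse.ArgumentParser(prog="wmldat")
    sub = parser.add_subparsers(dest="command", required=True)
    for name, phase in COMMANDS.items():
        cmd = sub.add_parser(name, help=f"needs phase '{phase}'")
        cmd.add_argument("path")
    return parser


def main(argv, read_file):
    """Parse `argv`, then hand the file content to the required phase."""
    args = build_parser().parse_args(argv)
    return run_until(COMMANDS[args.command], read_file(args.path))
```

`read_file` is passed in so the sketch can be exercised without touching the disk, for example `main(["indent", "scenario.cfg"], lambda path: "[scenario] [/scenario]")`. Adding a sub command is then one more entry in `COMMANDS`.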
Furthermore, I maintain that I'd love to move the project into its
own repository. This makes it easier to find the project, and we can
use travis to checkout our code. With tox, we could also go ahead and
maintain python2 and python3 compatibility and make upgrading easier.
However, I understand how this is hard and opposed due to current
build chains.
From here, I'd like to see the following happen:
- Please discuss this. I don't want to just do something without
backing of the more senior devs.
- We need to decide the repository question. Code can't happen
without a repo.
- Hopefully there are more python devs here. We'd need to flesh
out some of the more concrete implementation details. After
that, we'd probably start bootstrapping the task architecture,
add in wmlparser2 as a start and branch into linters, checkers
and all of the good stuff to get something going quickly.
Re: Python Toolchain - Moving forward.
please count on me
I agree on the own repo for the project, but it should maintain some kind of link with the main wesnoth code, so that content creators use the new tools.
Let me think about everything you are proposing so that I can state more concrete opinions here, but I just wanted to raise my hand in support of the idea
Eru kaluva tielyanna
Re: Python Toolchain - Moving forward.
Hey, I'm one of the python devs that's been trying to reach you to coordinate on this. Your goals all seem attainable and I'm willing to help out with them.
I'm fully behind your suggestion of moving this code to another repo. Four reasons off the top of my head for moving to a separate repo:
- Focus: A dedicated repo will help to clarify the separation between our code and code that is not impacted by our changes.
- Structure: In a dedicated repo we will be free to structure code in a way that best serves our code and doesn't have to be weighed against unrelated code.
- Minimalism: I shouldn't have to download several GB of assets to work on code that never leverages it.
- Attractiveness: You'll have an easier time attracting people to work on that portion of the code if you can lower the barrier to entry.
My own background:
I've been working in python for most of my career, not because my companies natively use it but because I enjoy it so much. I've done all my work in python 2 but I'm not at all concerned about moving to 3 if that's what the project needs.
I have a passion for elegant software design and robustness, achieved through well-understood design practices and rigorous testing. I'm certain I can bring a lot of value to the project. I'd like to meet with other python devs in an IRC session if our schedules can align.
- Elvish_Hunter
- Posts: 1585
- Joined: September 4th, 2009, 2:39 pm
- Location: Lintanir Forest...
Re: Python Toolchain - Moving forward.
Tetha wrote:I don't think wmllint is salvageable. Not gonna beat around this.
The code consists of two weird, hairy loops with multiple kinds
of states flying around somehow and I don't even know what.
Yesterday I took another look at wmllint. Its main problem (IMO) is that its conversion and sanity check functions are simply too long, and as such they cannot be easily maintained.
A good idea may be to split them into smaller functions and/or move them to a separate file.
Tetha wrote:Furthermore, I maintain that I'd love to move the project into it's
own repository. This makes it easier to find the project, and we can
use travis to checkout our code.
jstitch wrote:I agree on the own repo for the project, but it should maintain some kind of link with the main wesnoth code, so that content creators use the new tools.
chaverma wrote:I'm fully behind your suggestion of moving this code to another repo.
I disagree with this approach.
From a developer's point of view, having a separate repo is the best solution. If it was only for me, I'd say "what are we waiting for?!?". But the problem is that these tools are to be used by UMC authors, who may or may not know how to use GitHub, and who may think "so, I need to install these tools. Then I need to install Python. Well, too much work: forget it", and we'd risk having exactly what we don't want: nobody using them but the mainline devs.
Instead, as things currently are, we can tell a UMC author "Hey, you need to run wmlindent". He answers "How do I do it?", and our answer is "Install Python, then double click on the GUI.pyw file. That's it".
Besides, some tools (like GUI.pyw and trackplacer) assume that they're installed in the wesnoth/data/tools directory. Granted, since I'm maintaining GUI.pyw it won't be much work for me to add a preferences file and ask for the core directory on the first run, or add a preferences dialog, but this point may apply for some other scripts as well.
chaverma wrote:I'm not at all concerned about moving to 3 if that's what the project needs.
Well, I already started moving by using the statement
Code:
from __future__ import print_function
Tetha wrote:I don't like how wmlparser2 has a dependency on the wesnoth binary.
The problem, using a parser-based solution, is that we may not be able to keep some of the data contained in the file, like comments - it depends on how the new parser will be implemented, of course. I guess that this is the reason why wmllint uses regexps and string functions instead of a proper parser.
Besides, I never wrote a parser so far, but I guess that this won't be an issue.
chaverma wrote:I'd like to meet with other python devs in an IRC session if our schedules can align.
In my case, keep in mind that I read the logs, and when I'm not in the channel you can always drop me a PM
EDIT: I almost forgot one thing. Trackplacer, which is currently broken, uses PyGTK as GUI toolkit. If someone wants to work on it (I must have some half-baked attempt somewhere), keep in mind that a refactoring will have to use Tkinter/ttk as GUI toolkit, mainly because this is the default one for Python. It'll still require installation of the Pillow library (BTW, I should update wmlscope to warn about installing the more up-to-date Pillow instead of PIL), but for the average user installing a 1.3 MB library is better than downloading and installing, say, a 38.1 MB one (PyQt5), right?
Current maintainer of these add-ons, all on 1.16:
The Sojournings of Grog, Children of Dragons, A Rough Life, Wesnoth Lua Pack, The White Troll (co-author)
Re: Python Toolchain - Moving forward.
UMC creator here. I don't know how many are in my boat, but I thought I'd wade in. I don't use most of the tools because I find them too difficult to use, not to install (admittedly I haven't tried for quite a period). For me, if the devs are in an environment that works better for them and lets them build more effective, more user-friendly tools, I think that's a better route than having tools that not everyone can use anyway.
Surely part of the making them user-friendly could be building some kind of tool to help download and install the tools, right? :p
Maintainer of the Imperial Era and the campaigns Dreams of Urduk, Epic of Vaniyera, Up from Slavery, Fall of Silvium, Alfhelm the Wise and Gali's Contract.
But perhaps 'maintainer' is too strong a word.
Re: Python Toolchain - Moving forward.
If we decide to split the Python tools into their own repo, maybe git subtree could be an option:
https://developer.atlassian.com/blog/20 ... t-subtree/
https://medium.com/@porteneuve/masterin ... 3d29a798ec
It allows including a snapshot of another repository inside Wesnoth and merging updates to it. It seems that, apart from merging the subtree from time to time, no changes to the wesnoth repo would be necessary.
I never used it and can't say whether it's the correct solution, but perhaps 8680 (c74d on IRC) knows more about it?
Re: Python Toolchain - Moving forward.
aquileia wrote:If we decide to split the Python tools into their own repo, maybe git subtree could be an option
It allows to include a snapshot of another repository inside Wesnoth and merge updates to it. It seems that, besides from merging the subtree from time to time, no changes to the wesnoth repo would be necessary.
I never used it and can't say whether it's the correct solution, but perhaps 8680 (c74d on IRC) knows more about it?
I use subtrees a lot at my job. And they are definitely an option.
When merging from a subtree, at least two commits are generated on the main repo: the merge one and the commit you are merging from the subtree (and using squash it can always be reduced to one commit here)
http://git-scm.com/book/en/v2/Git-Tools ... tree_merge
Eru kaluva tielyanna
Re: Python Toolchain - Moving forward.
chaverma wrote:Hey, I'm one of the python devs that's been trying to reach you to coordinate on this. Your goals all seem attainable and I'm willing to help out with them.
Yup, noticed you. Sadly, the plan to stumble into you on IRC isn't working out as intended.
Elvish_Hunter wrote:The problem, using a parser-based solution, is that we may not be able to keep some of the data contained in the file, like comments - it depends on how the new parser will be implemented, of course. I guess that this is the reason why wmllint uses regexps and string functions instead of a proper parser.
Besides, I never wrote a parser so far, but I guess that this won't be an issue.
This is mostly why I think we need a phase- or task-based approach.
If you need to retain as much from the input file as possible, and potentially rewrite the input file just a little bit, you'll want to hook into the very early lexical phase. Whitespace- and comment-aware lexers are possible. So for example wmlindent would just consume the token stream from the lexer, ignore input whitespace tokens and write the rest to an output formatted correctly.
On the other hand, if you want to do matching on the tree structure, there's no need to worry about whitespace and comments. You need to be able to find tags and structures of tags, which can be greatly simplified with a strong AST in the background. And the last times I've worked on systems like this.. you'll end up with checks where you love the strong AST, because it's hell to write the check even with that.
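To illustrate the first case, here is a toy whitespace- and comment-aware lexer with an indenter that consumes its token stream. It is a sketch under loud assumptions: the token patterns are invented for this example, any `#` starts a comment, and macros, preprocessor directives and multi-line quoted strings are not handled at all. It is not how wmlindent works today.

```python
import re

# Toy token patterns (assumed for this example, far from complete WML).
TOKEN_RE = re.compile("|".join([
    r"(?P<ws>[^\S\n]+)",            # whitespace other than newline
    r"(?P<newline>\n)",
    r"(?P<comment>#[^\n]*)",        # kept as a token, never thrown away
    r"(?P<close>\[/[^\]\n]+\])",    # [/tag]
    r"(?P<open>\[[^\]\n]+\])",      # [tag]
    r"(?P<text>[^\s\[#][^\n#]*)",   # anything else, e.g. key=value
]))


def lex(source):
    """Yield (kind, text) tokens; joining all texts reproduces `source`."""
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise ValueError(f"cannot tokenize at offset {pos}")
        yield match.lastgroup, match.group()
        pos = match.end()


def indent(source, step="    "):
    """Re-indent by tag depth: input whitespace is dropped, comments stay."""
    out, depth, line = [], 0, []
    # The extra newline token flushes a last line that lacks one.
    for kind, text in list(lex(source)) + [("newline", "\n")]:
        if kind == "ws":
            continue
        if kind != "newline":
            line.append((kind, text.rstrip()))
            continue
        kinds = [k for k, _ in line]
        if kinds[:1] == ["close"]:
            # A leading closing tag is printed one level further out.
            depth = max(depth - 1, 0)
            kinds = kinds[1:]
        out.append(step * depth + " ".join(t for _, t in line) if line else "")
        depth = max(depth + kinds.count("open") - kinds.count("close"), 0)
        line = []
    return "\n".join(out)
```

Because the lexer never discards anything, a tool that only wants to touch a few tokens can write all the others back unchanged, which is what keeps comments alive.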
chaverma wrote:I'd like to meet with other python devs in an IRC session if our schedules can align.
During the week I'm available at around 2000 - 2200 UTC+2. On weekends, I can stretch that somewhat.
Elvish_Hunter wrote:Instead, as things currently are, we can tell to a UMC author "Hey, you need to run wmlindent". He answers "How do I do it?", and our answer is "Install Python, then double click on the GUI.pyw file. That's it".
In combination with UnwiseOwl's comment, this actually seems more like a user story to me. "As a content developer, I want a simple and easy way to set up a development environment with all tools available".
In fact, why'd you even need to run wmllint manually? We could go ahead and beef up GUI.pyw with inotify or equivalent tools to just run wmllint if you save files, so you get immediate feedback about the state of your code.
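A minimal sketch of the "run the linter when a file is saved" idea. It uses plain polling from the standard library as a portable stand-in for inotify (which is Linux-only); the `on_change` callback is hypothetical and is where a lint run would be triggered. The `.cfg` suffix is an assumption about which files to watch.

```python
import os
import time


def snapshot(root, suffix=".cfg"):
    """Map each matching file under `root` to its last-modified time."""
    found = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            if name.endswith(suffix):
                path = os.path.join(folder, name)
                found[path] = os.stat(path).st_mtime_ns
    return found


def changed_files(before, after):
    """Return paths that are new or modified between two snapshots."""
    return sorted(p for p, mtime in after.items() if before.get(p) != mtime)


def watch(root, on_change, interval=1.0):
    """Poll forever, calling `on_change(path)` for every saved file."""
    before = snapshot(root)
    while True:
        time.sleep(interval)
        after = snapshot(root)
        for path in changed_files(before, after):
            on_change(path)
        before = after
```

Something like `watch("my_addon", print)` would echo every saved file; a GUI would call its linter there instead and would run the loop in a background thread.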
Re: Python Toolchain - Moving forward.
Tetha wrote:In fact, why'd you even need to run wmllint manually? We could go ahead and beef up GUI.pyw with inotify or equivalent tools to just run wmllint if you save files, so you get immediate feedback about the state of your code.
a Wesnoth IDE for WML developers
Eru kaluva tielyanna
Re: Python Toolchain - Moving forward.
jstitch wrote:a Wesnoth IDE for WML developers
There is one, though it is actually an Eclipse plug-in. http://forums.wesnoth.org/viewtopic.php?f=21&t=30880
Tetha wrote:In fact, why'd you even need to run wmllint manually? We could go ahead and beef up GUI.pyw with inotify or equivalent tools to just run wmllint if you save files, so you get immediate feedback about the state of your code.
The problem of distributing <anything> can easily be solved by creating a web service that does the job.
- Pentarctagon
- Project Manager
- Posts: 5599
- Joined: March 22nd, 2009, 10:50 pm
- Location: Earth (occasionally)
Re: Python Toolchain - Moving forward.
Xudo wrote:
Tetha wrote:In fact, why'd you even need to run wmllint manually? We could go ahead and beef up GUI.pyw with inotify or equivalent tools to just run wmllint if you save files, so you get immediate feedback about the state of your code.
Problem of distribution of <anything> can be easily solved by creating web service, which will do the job.
Until the internet goes out
99 little bugs in the code, 99 little bugs
take one down, patch it around
-2,147,483,648 little bugs in the code
Re: Python Toolchain - Moving forward.
If the internet goes out, I'd like my next version of Wesnoth on CD posted to my address, please.
Maintainer of the Imperial Era and the campaigns Dreams of Urduk, Epic of Vaniyera, Up from Slavery, Fall of Silvium, Alfhelm the Wise and Gali's Contract.
But perhaps 'maintainer' is too strong a word.
Re: Python Toolchain - Moving forward.
Elvish_Hunter wrote:
chaverma wrote:I'm fully behind your suggestion of moving this code to another repo.
I disagree with this approach.
From a developer's point of view, having a separate repo is the best solution. If it was only for me, I'd say "what are we waiting to move?!?". But the problem is that these tools are to be used by UMC authors, which may or may not know how to use Github, and which may think "so, I need to install these tools. Then I need to install Python. Well, too much work: forget it", and we'd risk having exactly what we don't want: nobody using them but the mainline devs.
There is a tradeoff in your argument not mentioned: that keeping the status quo hinders the tools' progress on the fronts of correctness, usability and robustness. All of these contribute to that undesirable end state of "too much work, forget it."
Elvish_Hunter wrote:Instead, as things currently are, we can tell to a UMC author "Hey, you need to run wmlindent". He answers "How do I do it?", and our answer is "Install Python, then double click on the GUI.pyw file. That's it".
There's no reason we can't store an artifact from the separate repo in the main repo in a predictable place, whose interface can be as simple as the double-click. It can yield the same usability standard you've established. Or we could use git subtrees, as aquileia suggests.
Re: Python Toolchain - Moving forward.
tetha wrote:During the week I'm available at around 2000 - 2200 UTC+2. On weekends, I can stretch that somewhat.
Weekdays are out, but I should be able to swing that this coming weekend.
-
- Inactive Developer
- Posts: 2461
- Joined: August 15th, 2008, 8:46 pm
- Location: Germany
Re: Python Toolchain - Moving forward.
One of the main problems with wmllint and wmlscope in my opinion always was the point that they are using their own ways of preprocessing wml, redundant code to the C++ preprocessor, with a lot of bugs/differences. wmllint is unable to spellcheck translatable strings hidden in macros for instance. Couldn't they call the wesnoth executable as a subprocess to use the C++ preprocessor!? wmllint should operate on the --preprocess output.
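A sketch of what calling the C++ preprocessor as a subprocess could look like. The `--preprocess` option is the one named above; the argument order (source, then target directory) is my reading of the wesnoth man page and should be checked against the installed version. The function names and the way results are collected are assumptions made for this example.

```python
import subprocess
import tempfile
from pathlib import Path


def preprocess_command(wesnoth_exe, source, target_dir):
    """Build the argv for a preprocessing run (argument order is assumed)."""
    return [str(wesnoth_exe), "--preprocess", str(source), str(target_dir)]


def preprocess(wesnoth_exe, source):
    """Run the preprocessor and return {output file name: text}."""
    with tempfile.TemporaryDirectory() as target:
        # check=True raises CalledProcessError if wesnoth exits non-zero.
        subprocess.run(preprocess_command(wesnoth_exe, source, target), check=True)
        return {
            path.name: path.read_text(encoding="utf-8", errors="replace")
            for path in Path(target).rglob("*")
            if path.is_file()
        }
```

A linter could then spellcheck the returned text, including strings that were hidden inside macros. The price is the one raised earlier in the thread: tests for that code path need a wesnoth binary on the machine.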
As for trackplacer being broken... I used to be able to use it on various Linux systems, but not on Windows. There was some problem with the track markers not appearing in the UI, IIRC. I reported it to esr once (it must have been on IRC), and judging from the console messages he said something about a bad library. IIRC the problem seemed to be upstream (pygtk or gtk), not in trackplacer itself.
projects (BfW 1.12):
A Simple Campaign: campaign draft for wml starters • Plan Your Advancements: mp mod
The Earth's Gut: sp campaign • Settlers of Wesnoth: mp scenario • Wesnoth Lua Pack: lua tags and utils
updated to 1.8 and handed over: A Gryphon's Tale: sp campaign