I think I want to start thinking about "leftmost" and "rightmost" and
it's just easier and faster if cons has an actual list in it instead
of dotted pairs.
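
A rough sketch of the difference, not the real code: `Pair` and `Cons`
here are hypothetical stand-ins. With dotted pairs, "rightmost" means
walking the whole chain; with a real list inside it is a single index.

```python
from dataclasses import dataclass, field


@dataclass
class Pair:
    """Dotted-pair style (hypothetical): a head plus a tail Pair or None."""
    head: object
    tail: "Pair | None" = None


def rightmost_of_pairs(pair):
    # Has to walk the entire chain to find the last element: O(n).
    while pair.tail is not None:
        pair = pair.tail
    return pair.head


@dataclass
class Cons:
    """List style (hypothetical): the node holds its items directly."""
    items: list = field(default_factory=list)

    def leftmost(self):
        return self.items[0]

    def rightmost(self):
        # Direct index, no walking.
        return self.items[-1]
```
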
This required rebuilding the matcher compiler significantly, and it
was a lot a lot of work. But now we don't generate so many spurious
newlines and the document we produce is a lot a lot better.
It's weird that it counts against the line length though, like if you
were going to break you could ignore it right? At least, for the
grammar I'm working on here...
Now we're cooking with gas ALTHOUGH now we have to deal with the fact
that we're gluing everything together where there *should* be spaces.
Many more improvements to come.
Remember that the children on a tree level form a context-free
language, not a regular language, and so they can only be recognized
by pushdown automata, not finite state machines.
What happened was that I failed to account for transparent rules.
Without transparent rules the children of a tree node do not have any
recursion in them (by definition!) and so therefore *are* a regular
language. But transparent rules change that: there *can be* recursion
hidden on the same tree level, and it should have been clear from a
moment's reflection that the recursion there meant that tree levels
were once again a context-free language.
Fortunately we have a recognizer for context-free languages lying
around, so we can just use that I guess.
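
To make the problem concrete, here is a small sketch with a made-up
transparent rule, `_parens := "(" _parens ")" | ITEM`. Because the rule
is transparent, all of its tokens land on the same tree level, so the
valid child sequences are "(" n times, ITEM, ")" n times. No finite
state machine can check that the counts match. The recognizer below is
a hand-written recursive one just for this one rule, not the recognizer
mentioned above.

```python
def match_parens(children, pos=0):
    """Match the hypothetical rule _parens starting at pos.

    Returns the position just past the match, or None on failure.
    """
    if pos < len(children) and children[pos] == "ITEM":
        return pos + 1
    if pos < len(children) and children[pos] == "(":
        # Recursion on the same tree level: this is what a regular
        # language cannot express.
        inner = match_parens(children, pos + 1)
        if inner is not None and inner < len(children) and children[inner] == ")":
            return inner + 1
    return None


def is_valid_level(children):
    """True if the whole child list is exactly one _parens."""
    return match_parens(children) == len(children)
```
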
Still very garbage but I think the "hard" part of building a Wadler
document from a parse tree might be there. It's a backtracking matcher
which might turn out to be too slow for alternatives but maybe will be
fine?
Still needs lots of tests.
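
For reference, the general shape of a backtracking matcher, as a
minimal sketch. The names `tok`, `then`, `either` and `matches` are
invented for illustration, and this only recognizes; it does not build
any document. Each matcher yields every position it could end at, so a
failed alternative just falls through to the next one. That retrying
is also where the slowness with alternatives would come from.

```python
def tok(name):
    """Match a single item equal to name."""
    def match(items, pos):
        if pos < len(items) and items[pos] == name:
            yield pos + 1
    return match


def then(*matchers):
    """Match each matcher in order, backtracking into earlier ones."""
    def match(items, pos):
        if not matchers:
            yield pos
            return
        first, rest = matchers[0], then(*matchers[1:])
        for mid in first(items, pos):
            yield from rest(items, mid)
    return match


def either(*matchers):
    """Try every alternative; later ones are reached by backtracking."""
    def match(items, pos):
        for sub in matchers:
            yield from sub(items, pos)
    return match


def matches(matcher, items):
    """True if some way of matching consumes all of items."""
    return any(end == len(items) for end in matcher(items, 0))
```
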
Instead of trying to build a regular expression string, just build a
structured thing with seq() and choice() and whatnot. This is
technically uglier but fixes a problem I found with comment regular
expressions so you know, it works, which is better than not working.
Also now tokens get named and maybe that's good? It's so hard to say.
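
A sketch of the idea, with hypothetical `Lit`, `Star`, `seq()` and
`choice()` definitions rather than the real ones. The point is that
escaping and grouping happen in one place when the structure is
rendered, instead of at every string concatenation. The comment-opener
example is only an illustration of the kind of thing string building
gets wrong; the actual bug is not described here.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Lit:
    text: str

    def to_regex(self):
        # Literal text is always escaped, so "*" never becomes a quantifier.
        return re.escape(self.text)


@dataclass(frozen=True)
class Seq:
    parts: tuple

    def to_regex(self):
        return "".join(f"(?:{p.to_regex()})" for p in self.parts)


@dataclass(frozen=True)
class Choice:
    options: tuple

    def to_regex(self):
        # Every option is grouped, so "|" cannot leak into its neighbours.
        return "|".join(f"(?:{o.to_regex()})" for o in self.options)


@dataclass(frozen=True)
class Star:
    inner: object

    def to_regex(self):
        return f"(?:{self.inner.to_regex()})*"


def seq(*parts):
    return Seq(parts)


def choice(*options):
    return Choice(options)


# Either "/*" or "//" as a comment opener.
comment_start = choice(Lit("/*"), Lit("//"))
pattern = re.compile(comment_start.to_regex())
```

Gluing the raw strings together as "/*|//" would instead read "/*" as
"zero or more slashes", which matches things like "///".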
This allows most of our precedence to be re-used. There are some cases
still where tree-sitter gets confused (and we don't), see the
corresponding change to grammar.py. I wish I knew how to fix this but
I don't. :(
tree-sitter doesn't let me do token-based precedence? I don't like
tree-sitter's "make it inline but give it a number" system; it seems
like a bug farm to me.