Look, I'm just thinking hard about converting to a parser generator
because I want to derive the pretty-printer from the parser without
having to repeat myself all over the place.
This parser.py is derived from my old LRParsers project, and should
go back there eventually, but for now I'm driving the work from here.
It's another jump, perhaps, but the arrays are smaller, and now we can track
ephemera efficiently without bloating child trees. (We could also
put ephemera inline with the child trees but then nth_token would be
unwieldy, and it would lower our data density.)
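
Roughly the shape I mean, as a sketch only. The names `SyntaxTree`, `Child`, `ephemera_before`, and the exact layout are made up for illustration; the point is that ephemera live in a side table keyed by token index, so child lists hold nothing but tokens and trees.

```rust
use std::ops::Range;

// Hypothetical layout: every name here is an assumption for illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Child {
    Token(usize), // index into `SyntaxTree::tokens`
    Tree(usize),  // index into `SyntaxTree::trees`
}

#[derive(Debug, Default)]
pub struct Tree {
    pub children: Vec<Child>,
}

#[derive(Debug, Default)]
pub struct SyntaxTree {
    pub trees: Vec<Tree>,
    /// Source span of each token.
    pub tokens: Vec<Range<usize>>,
    /// (index of the token that follows, source span), sorted by token index.
    pub ephemera: Vec<(usize, Range<usize>)>,
}

impl SyntaxTree {
    /// The child at position `n` of `tree`, if it is a token. There are no
    /// ephemera in the child list to skip over.
    pub fn nth_token(&self, tree: usize, n: usize) -> Option<usize> {
        match self.trees[tree].children.get(n)? {
            Child::Token(t) => Some(*t),
            Child::Tree(_) => None,
        }
    }

    /// All ephemera recorded immediately before `token`, found by binary
    /// search over the sorted side table.
    pub fn ephemera_before(&self, token: usize) -> &[(usize, Range<usize>)] {
        let start = self.ephemera.partition_point(|(t, _)| *t < token);
        let end = self.ephemera.partition_point(|(t, _)| *t <= token);
        &self.ephemera[start..end]
    }
}
```

Putting comments and whitespace inline would mean `nth_token` has to step over them on every call, which is the unwieldy part.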
It's a little bit complicated: loading a module is a two-step dance,
but here's how it's done. Probably some surface-area refactoring needs
to happen so that we do the right thing.
This is simpler because we don't "discover" functions to compile as we
go; we just compile all the ones we can find, and functions have
pre-defined exports. This is good and useful to us as we can now refer
to functions in different modules by known indices, but it *does* make
me wonder what we're going to do for compiling generic specializations.
The previous technique was better for that sort of thing.
This is all predicated on the idea that I want to have
partially-compiled modules, and I can't really say why I want them. If
I'm happy to just compile things cross-module in the same kind of
space then it's much easier to go back to the function key way of
working.
The system has an invariant that if you ever return an error
sentinel (error environment, error type) then that sentinel is caused
by an error that was reported to the user. We have had too many bugs
over the last little while where that was not the case!
(An example is if we misinterpret the tree by calling `nth_tree` with
the wrong index or something, and get `None`, and think "oh must be a
syntax error", but it was really just the wrong index. Then there's an
error sentinel with no error diagnostic and we don't discover the
mistake until much farther along.)
Now we enforce this by requiring that whoever constructs the error
sentinel *prove* that they can do so by providing a diagnostic. It's
less efficient but prevents the problem.
This actually uncovered a couple of latent bugs where we were
generating error sentinels instead of a more appropriate type! Whoops!
I'm worried that this will turn into something I always have to do
instead of something I just do sometimes: always provide the
provenance of some error type or another. Link error types to
diagnostics, etc.
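
A minimal sketch of the "prove it with a diagnostic" idea, assuming a setup like the one described above. `Diagnostics`, `Diagnostic`, `Type`, and `type_of_name` are hypothetical names, not the real ones; what matters is that the error variant cannot be built without a value that only `report` hands out.

```rust
// Hypothetical names throughout: a sketch of "no sentinel without a diagnostic".

/// Proof that an error was reported. The field is private, so code outside
/// this module cannot build one by hand; `Diagnostics::report` is the only
/// source.
#[derive(Debug, Clone, PartialEq)]
pub struct Diagnostic {
    index: usize,
}

#[derive(Debug, Default)]
pub struct Diagnostics {
    messages: Vec<String>,
}

impl Diagnostics {
    /// Record a message for the user and hand back proof that it happened.
    pub fn report(&mut self, message: impl Into<String>) -> Diagnostic {
        self.messages.push(message.into());
        Diagnostic { index: self.messages.len() - 1 }
    }

    /// Look up the message a proof refers to.
    pub fn message(&self, diagnostic: &Diagnostic) -> &str {
        &self.messages[diagnostic.index]
    }
}

#[derive(Debug, Clone, PartialEq)]
pub enum Type {
    F64,
    String,
    /// The error sentinel carries the diagnostic that justifies it.
    Error(Diagnostic),
}

/// Hypothetical checker step: an unknown name is reported first, and only
/// then can the error type be constructed.
pub fn type_of_name(name: &str, diagnostics: &mut Diagnostics) -> Type {
    match name {
        "f64" => Type::F64,
        "string" => Type::String,
        other => Type::Error(diagnostics.report(format!("unknown type '{other}'"))),
    }
}
```

The sentinel now drags a payload around, which is the "less efficient" part, but it also means the link from error type to diagnostic comes for free.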
Going to need to normalize that name though, because right now it
really *really* sucks to have a big \\?\ kinda name. Probably
normalize it relative to the base directory.
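
Something like this is probably enough, as a sketch. `display_name` is a made-up helper, and it assumes both the file path and the base directory were produced the same way (for example, both canonicalized), since `strip_prefix` only matches when the leading components are identical.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: name a source file relative to `base` when the file
/// lives under it; otherwise keep the path unchanged.
pub fn display_name(path: &Path, base: &Path) -> PathBuf {
    match path.strip_prefix(base) {
        Ok(relative) => relative.to_path_buf(),
        Err(_) => path.to_path_buf(),
    }
}
```

Falling back to the original path keeps files outside the base directory nameable, at the cost of the long form showing up for those.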
So it turns out that I can't hold `&str` in a token because it makes it
impossible to encapsulate a source file in the larger context:
self-referential structure problems again. Everything gets rebuilt so that
the source can be passed through. While we're at it, more things
become Rc<> because, man..... life is too short.
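
For illustration, here is one way the rebuilt shape could look. This is a sketch under assumptions: `Token`, `SourceFile`, and the whitespace-only `lex` are stand-ins, not the real types. Tokens keep byte offsets instead of a borrow, so the struct that owns the source can own the tokens too, with no lifetime parameter.

```rust
use std::ops::Range;
use std::rc::Rc;

/// A token stores where its text is, not a borrow of the text itself.
#[derive(Debug, Clone, PartialEq)]
pub struct Token {
    pub span: Range<usize>, // byte offsets into the source
}

/// Owns the source and the tokens together; nothing here borrows from
/// anything else in the struct.
#[derive(Debug, Clone)]
pub struct SourceFile {
    pub source: Rc<str>,
    pub tokens: Vec<Token>,
}

impl SourceFile {
    /// Hypothetical whitespace-only lexer, just enough to produce spans.
    pub fn lex(source: Rc<str>) -> SourceFile {
        let mut tokens = Vec::new();
        let mut start = None;
        for (i, c) in source.char_indices() {
            match (c.is_whitespace(), start) {
                (false, None) => start = Some(i),
                (true, Some(s)) => {
                    tokens.push(Token { span: s..i });
                    start = None;
                }
                _ => {}
            }
        }
        if let Some(s) = start {
            tokens.push(Token { span: s..source.len() });
        }
        SourceFile { source, tokens }
    }

    /// The source is passed through at the point of use.
    pub fn text(&self, token: &Token) -> &str {
        &self.source[token.span.clone()]
    }
}
```

With `&str` inside `Token`, `SourceFile` would have to borrow from its own `source` field, which is the self-referential problem.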
Semantics in particular has become a giant hub of the module state: we
can basically just hold an Rc<Semantics> and have everything we could
possibly want to know about a source file, computed lazily if
necessary.
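
The lazy part can be sketched with `std::cell::OnceCell`. The fields below are assumptions: `line_count` is a deliberately trivial stand-in for whatever the real analyses are, and the actual `Semantics` surely holds much more.

```rust
use std::cell::OnceCell;
use std::rc::Rc;

/// Hypothetical hub for everything known about one source file.
pub struct Semantics {
    source: Rc<str>,
    line_count: OnceCell<usize>, // filled in on first request
}

impl Semantics {
    pub fn new(source: Rc<str>) -> Rc<Semantics> {
        Rc::new(Semantics {
            source,
            line_count: OnceCell::new(),
        })
    }

    pub fn source(&self) -> &str {
        &self.source
    }

    /// Computed the first time it is asked for, then cached. Every holder
    /// of the `Rc<Semantics>` sees the same cached value.
    pub fn line_count(&self) -> usize {
        *self.line_count.get_or_init(|| self.source.lines().count())
    }
}
```

Because `OnceCell` gives interior mutability, the cache can be filled through a shared `Rc<Semantics>` without reaching for `RefCell` or `&mut`.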