Commit graph

19 commits

SHA1 Message Date
2d87207b54 Some small tweaks 2024-08-30 09:04:18 -07:00
80d932b36a Refactor to use non_terminals() 2024-08-29 08:23:55 -07:00
f8b62bf4a4 Terminal 'value' is 'name', compile_lexer is method 2024-08-29 08:22:23 -07:00
344dde51be Grammar start symbol is public 2024-08-29 08:12:08 -07:00
dc03bf7373 Grammars can be named 2024-08-29 08:00:40 -07:00
abcb0e516a OptionalRule is not required but MetadataRule is 2024-08-28 08:33:32 -07:00
02c1aa507e Muck around with usability 2024-08-28 08:27:46 -07:00
0be0075cfe Generic token stream
Compatible with the harness
2024-08-27 16:47:26 -07:00
49ad7fdb52 Associate metadata with terminals
This is a half-assed attempt at doing syntax coloring which I think
will almost certainly turn out to be insufficient. I'm committing it
just to record some of the work I've done but. BUT.

Probably trying to match tree-sitter is a better way of doing
this. (But, like, emitting tree-sitter grammars? Really? Wow, dude.
Way to give up.)
2024-08-27 15:43:07 -07:00
76ef85483e Accept is single-valued, the multi-value thing didn't ever make sense
I mean, it did when we thought we were going to weave NFA states as we
were building them but we ended up not doing that and instead just
using the fancy EdgeList splitting magic when building DFAs from the
NFA. It has the same power and is simpler code, and also means that
we'll *never* be asked to have multiple Terminals be accepted from a
single NFA state.
2024-08-27 15:43:01 -07:00
7a5f17f74b Specify and honor trivia tokens
e.g. "this is how machine-generated parsers know to skip blanks and
comments"

The run time implementation could be better; we don't really want to
just discard trivia because it's useful for e.g. doc comments and the
like. BUT for now this is fine.
2024-08-24 10:01:40 -07:00
f29ec5072f Augment number pattern, tests
More robust testing. Error messages would be nice but.
2024-08-24 09:38:21 -07:00
0c952e4905 Correct NFA construction
There was a bug in the way that I was converting regular expressions
to NFAs. I'm still not entirely sure what was going on, but I
re-visited the construction and made it follow the literature more
closely and it fixed the problem.
2024-08-24 09:24:29 -07:00
454e6fd6fd Regex API "improvements"
I mean, is it better than a regex parser? No, probably not.
2024-08-24 08:35:45 -07:00
6d6aabdeb3 Terminal name must be explicit on construction 2024-08-24 08:35:10 -07:00
72052645d6 Generated lexers actually kinda work
But regular expressions are underpowered and verbose
2024-08-23 15:32:35 -07:00
58c3004702 Move terminals into grammar definition
Starting to work on machine-generated lexers too
2024-08-23 07:24:30 -07:00
e04aa1966e Start moving the examples into tests 2024-06-15 07:52:16 -07:00
7b3c94c469 Move things into more modules
It will help with testing and profiling
It breaks pyright (it's probably time to abandon pyright)
2024-06-10 05:50:09 -07:00