lrparsers

Author	SHA1	Message	Date
John Doty	7edf5e06bf	Rebuild the matcher on grammars. Well that wasn't so bad now was it? Eh? Nice to have a parser generator lying around. Let's keep working to see if I can actually finish it.	2024-09-09 11:40:14 -07:00
John Doty	1d28c82007	Saving this for posterity, but it is doomed. Remember that tree levels are generated by context free languages, not regular languages, and so they can only be recognized by push-down automatons, not finite state machines. What happened was that I failed to account for transparent rules. Without transparent rules the children of a tree node do not have any recursion in them (by definition!) and so therefore are a regular language. But transparent rules change that: there can be recursion hidden on the same tree level, and it should have been clear from a moment's reflection that the recursion there meant that tree levels were once again a context free language. Fortunately we have a recognizer for context free languages lying around, so we can just use that I guess.	2024-09-09 06:23:25 -07:00
John Doty	0cbf696303	The start rule cannot be transparent	2024-09-09 06:23:11 -07:00
John Doty	49b76b9bcc	Teach trees to format themselves.	2024-09-09 06:22:56 -07:00
John Doty	a2c6390c23	Start to work on a prettier system. Still very garbage but I think the "hard" part of building a Wadler document from a parse tree might be there. It's a backtracking matcher which might turn out to be too slow for alternatives but maybe will be fine? Still needs lots of tests.	2024-09-07 13:02:09 -07:00
John Doty	00b4cd4702	Starting to look at pretty-printing with the idea of auto-indentation. I wonder if it will work?	2024-09-06 16:23:14 -07:00
John Doty	d7dfd556ec	Emit an emacs major mode. With coloring! Next up: formatting but that might be hard.	2024-09-06 11:51:09 -07:00
John Doty	4941cd049c	Helper routines for generating source code. This includes "signing" source to detect modifications, and maintaining user-modified sections. Hooray!	2024-09-06 11:50:17 -07:00
John Doty	23981f82ce	Start working on emacs mode generation	2024-09-06 10:20:34 -07:00
John Doty	501c2e3fbe	Teach the highlight meta about emacs face names	2024-09-06 10:20:17 -07:00
John Doty	676ddedbaf	Add the language name to the end of generated scopes	2024-09-05 15:12:55 -07:00
John Doty	dbf893e48b	Generate queries a little better	2024-09-05 15:03:44 -07:00
John Doty	51c4f14c26	Emit highlight queries for tree-sitter. Now we're starting to get somewhere!	2024-09-05 14:52:35 -07:00
John Doty	ea5fab4e4e	Tree-sitter regexps are structured. Instead of trying to build a regular expression string, just build a structured thing with seq() and choice() and whatnot. This is technically uglier but fixes a problem I found with comment regular expressions so you know, it works, which is better than not working. Also now tokens get named and maybe that's good? It's so hard to say.	2024-09-05 11:51:29 -07:00
John Doty	be8e017fd9	Fix regex generation, extras	2024-09-05 06:32:28 -07:00
John Doty	94f5958087	Field propagation	2024-09-05 06:30:55 -07:00
John Doty	591da0c971	Rework highlighting metadata. GOD	2024-09-02 08:50:36 -07:00
John Doty	e4a8ad7b76	Trailing thing	2024-09-01 11:30:59 -07:00
John Doty	a99b3ecb70	Interpret precedence the way tree-sitter does, kinda. This allows most of our precedence to be re-used. There are some cases still where tree-sitter gets confused (and we don't), see the corresponding change to grammar.py. I wish I knew how to fix this but I don't. :(	2024-09-01 07:38:46 -07:00
John Doty	0354fbf4a4	More ways of writing. Sometimes prettier	2024-09-01 06:52:13 -07:00
John Doty	3012df4ac6	Precedence but it doesn't work. Tree-sitter doesn't let me do token-based precedence? I don't like tree-sitter's "make it inline but give it a number" system; seems like a bug farm to me.	2024-08-31 07:22:49 -07:00
John Doty	98c4bb950f	Fix bugs but still doesn't work for Fine	2024-08-30 09:14:01 -07:00
John Doty	066d2d8439	A converter from grammars to tree-sitter grammars. Hmm, isn't this fine!	2024-08-30 09:04:32 -07:00
John Doty	2d87207b54	Some small tweaks	2024-08-30 09:04:18 -07:00
John Doty	80d932b36a	Refactor to use non_terminals()	2024-08-29 08:23:55 -07:00
John Doty	f8b62bf4a4	Terminal 'value' is 'name', compile_lexer is method	2024-08-29 08:22:23 -07:00
John Doty	344dde51be	Grammar start symbol is public	2024-08-29 08:12:08 -07:00
John Doty	dc03bf7373	Grammars can be named	2024-08-29 08:00:40 -07:00
John Doty	abcb0e516a	OptionalRule is not required but MetadataRule is	2024-08-28 08:33:32 -07:00
John Doty	02c1aa507e	Muck around with usability	2024-08-28 08:27:46 -07:00
John Doty	0be0075cfe	Generic token stream. Compatible with the harness	2024-08-27 16:47:26 -07:00
John Doty	49ad7fdb52	Associate metadata with terminals. This is a half-assed attempt at doing syntax coloring which I think will almost certainly turn out to be insufficient. I'm committing it just to record some of the work I've done but. BUT. Probably trying to match tree-sitter is a better way of doing this. (But, like, emitting tree-sitter grammars? Really? Wow, dude. Way to give up.)	2024-08-27 15:43:07 -07:00
John Doty	76ef85483e	Accept is single-valued, the multi-value thing didn't ever make sense. I mean, it did when we thought we were going to weave NFA states as we were building them, but we ended up not doing that and instead just using the fancy EdgeList splitting magic when building DFAs from the NFA. It has the same power and is simpler code, and also means that we'll never be asked to have multiple Terminals be accepted from a single NFA state.	2024-08-27 15:43:01 -07:00
John Doty	7a5f17f74b	Specify and honor trivia tokens, e.g. "this is how machine-generated parsers know to skip blanks and comments." The run time implementation could be better; we don't really want to just discard trivia because it's useful for e.g. doc comments and the like. BUT for now this is fine.	2024-08-24 10:01:40 -07:00
John Doty	f29ec5072f	Augment number pattern, tests. More robust testing. Error messages would be nice but.	2024-08-24 09:38:21 -07:00
John Doty	0c952e4905	Correct NFA construction. There was a bug in the way that I was converting regular expressions to NFAs. I'm still not entirely sure what was going on, but I re-visited the construction and made it follow the literature more closely and it fixed the problem.	2024-08-24 09:24:29 -07:00
John Doty	454e6fd6fd	Regex API "improvements". I mean, is it better than a regex parser? No, probably not.	2024-08-24 08:35:45 -07:00
John Doty	6d6aabdeb3	Terminal name must be explicit on construction	2024-08-24 08:35:10 -07:00
John Doty	72052645d6	Generated lexers actually kinda work. But regular expressions are underpowered and verbose	2024-08-23 15:32:35 -07:00
John Doty	58c3004702	Move terminals into grammar definition. Starting to work on machine-generated lexers too	2024-08-23 07:24:30 -07:00
John Doty	e04aa1966e	Start moving the examples into tests	2024-06-15 07:52:16 -07:00
John Doty	7b3c94c469	Move things into more modules. It will help with testing and profiling. It breaks pyright (it's probably time to abandon pyright).	2024-06-10 05:50:09 -07:00

42 commits