lrparsers

Author	SHA1	Message	Date
John Doty	385c378edb	[parser] Everything is an ItemSet now	2024-10-26 07:51:13 -07:00
John Doty	923b01f6fd	[parser] Simplify StateGraph	2024-10-26 07:35:28 -07:00
John Doty	27e6bb413c	[parser] Remove Canonical LR1 generator. This is fine probably.	2024-10-26 07:25:37 -07:00
John Doty	2b72811486	[parser] ConfigurationSetInfo -> StateGraph	2024-10-26 06:56:30 -07:00
John Doty	e501caa073	[parser] Remove unused import	2024-10-26 06:53:53 -07:00
John Doty	e55bc140f9	[parser] Move ItemSet	2024-10-26 06:53:36 -07:00
John Doty	2d5c73f0b0	[parser] Remove LR0 and SLR1. Sorry, when this was educational it was nice to have the other generators, but as part of cleaning I'm just getting rid of them.	2024-10-15 07:43:52 -07:00
John Doty	bb94fc6c9c	[parser] clean clean clean	2024-10-11 07:52:48 -07:00
John Doty	2656a1d328	[parser] Remove bad LALR implementation, start cleanup	2024-10-10 07:58:16 -07:00
John Doty	eef1db72da	[parser] Pager's algorithm. Faster. As good as LALR but the implementation isn't embarrassing. (Still pretty bad though.) Honestly the next thing to do is to delete LALR and just use Pager's and also rebuild ConfigSet et al to be ItemSet so that Pager's alg can go even faster. I think I want to keep LR1 just for completeness so I might as well not delete SLR and LR0, although I could I suppose.	2024-10-05 16:00:41 -07:00
John Doty	bb52ab8da5	[parser] Error recovery tests. Based on the blog post "Resilient LL Parsing Tutorial" by Alex Kladov (https://matklad.github.io/2023/05/21/resilient-ll-parsing-tutorial.html). Because I was trying to be "simple" in my grammar definition I found a bug in the grammar class, whoops! :)	2024-09-22 08:46:54 -07:00
John Doty	071cd29d8f	[readme] Rewrite the readme and add a helper. The helper is nice actually.	2024-09-21 08:45:49 -07:00
John Doty	b1f4c56f49	[wadler] One more bit of writing.	2024-09-21 07:45:59 -07:00
John Doty	1a3ce02d48	[wadler] Re-factor into multiple modules. Hard split between builder and runtime, as is proper.	2024-09-21 07:42:52 -07:00
John Doty	1f84752538	[wadler] Refactor: data and runtime split. Now we convert the grammar into data for a pretty-printer, so in theory we could write the pretty-printer in a different language.	2024-09-21 06:44:53 -07:00
John Doty	e4585170d8	[wadler] Fixup EOF trivia. Trivia handling feels pretty good now actually.	2024-09-19 20:23:02 -07:00
John Doty	8a17cfd586	[wadler] Prettier handling of trivia. Split the rules for pre- and post- trivia, understand when we want to do either, handle multi-line-break (in an unsatisfying way, I guess) but otherwise lay the groundwork for thinking about it better. Also now we don't generate lazy "Text" nodes because I thought I might want to actually look at the newlines in the source but I don't yet. I can now, though. (I can also detect EOF so there's that.)	2024-09-19 16:39:32 -07:00
John Doty	c31d527077	[wadler] Trivia escapes groups. This means that forced breaks from comments don't screw up the following single-line things. But this still isn't right; we need to fine-tune how we represent trivia.	2024-09-15 08:51:18 -07:00
John Doty	9d55588a35	[wadler] Cons has a list of documents in it. I think I want to start thinking about "leftmost" and "rightmost" and it's just easier and faster if cons has an actual list in it instead of dotted pairs.	2024-09-15 08:12:30 -07:00
John Doty	d5ccd5b147	Really messing around with trivia, it's not good yet. It's really not clear how to track it and how to compose it with groups yet. Really very difficult.	2024-09-14 17:14:07 -07:00
John Doty	f3a4c4348a	Custom indentation	2024-09-14 07:28:18 -07:00
John Doty	39ae937ddc	Comments. Not really.	2024-09-13 23:16:46 -07:00
John Doty	c8fef52c0c	Support prefix newlines, stop generating empty productions. This required rebuilding the matcher compiler significantly, and it was a lot a lot of work. But now we don't generate so many spurious newlines and the document we produce is a lot a lot better.	2024-09-13 22:41:33 -07:00
John Doty	709ba060b4	Hack for metadata in the document	2024-09-13 16:53:51 -07:00
John Doty	8845bcfdab	Fix accidental group duplication, which was leading to unnecessary ambiguities	2024-09-13 15:28:50 -07:00
John Doty	ca240906c9	Remove the text follow stuff. I think we have a good solution to the problem.	2024-09-13 12:02:44 -07:00
John Doty	d7a6891519	Finish annotating test grammar, forced breaks, fixes. Forced breaks force a newline in a spot, which is sometimes what we want. (Like, this syntax should never be on a single line.)	2024-09-13 11:57:16 -07:00
John Doty	938f0e5c69	Support newline replacements. This allows us to do maybe more complicated spacing. Still unclear about identifier/punctuation spacing.	2024-09-12 11:09:14 -07:00
John Doty	b3b2102864	Record trivia in tokens. This will make our formatting better I think.	2024-09-12 06:22:49 -07:00
John Doty	276449287d	Allow for text to follow tokens in pretty-printing. It's weird that it counts against the line length though, like if you were going to break you could ignore it right? At least, for the grammar I'm working here....	2024-09-11 11:22:41 -07:00
John Doty	d6dd54f4df	Actual pretty-printing! Now we're cooking with gas, ALTHOUGH now we have to deal with the fact that we're gluing everything together where there should be spaces. Many more improvements to come.	2024-09-11 11:08:02 -07:00
John Doty	443bf8bd33	Move formatting meta around, actually mark stuff up	2024-09-10 11:47:22 -07:00
John Doty	7edf5e06bf	Rebuild the matcher on grammars. Well that wasn't so bad now was it? Eh? Nice to have a parser generator lying around. Let's keep working to see if I can actually finish it.	2024-09-09 11:40:14 -07:00
John Doty	1d28c82007	Saving this for posterity, but it is doomed. Remember that tree levels are generated by context free languages, not regular languages, and so they can only be recognized by push-down automatons, not finite state machines. What happened was that I failed to account for transparent rules. Without transparent rules the children of a tree node do not have any recursion in them (by definition!) and so therefore are a regular language. But transparent rules change that: there can be recursion hidden on the same tree level, and it should have been clear from a moment's reflection that the recursion there meant that tree levels were once again a context free language. Fortunately we have a recognizer for context free languages lying around, so we can just use that I guess.	2024-09-09 06:23:25 -07:00
John Doty	0cbf696303	The start rule cannot be transparent	2024-09-09 06:23:11 -07:00
John Doty	49b76b9bcc	Teach trees to format themselves.	2024-09-09 06:22:56 -07:00
John Doty	a2c6390c23	Start to work on a prettier system. Still very garbage but I think the "hard" part of building a Wadler document from a parse tree might be there. It's a backtracking matcher which might turn out to be too slow for alternatives but maybe will be fine? Still needs lots of tests.	2024-09-07 13:02:09 -07:00
John Doty	00b4cd4702	Starting to look at pretty-printing with the idea of auto-indentation. I wonder if it will work?	2024-09-06 16:23:14 -07:00
John Doty	d7dfd556ec	Emit an emacs major mode. With coloring! Next up: formatting but that might be hard.	2024-09-06 11:51:09 -07:00
John Doty	4941cd049c	Helper routines for generating source code. This includes "signing" source to detect modifications, and maintaining user-modified sections. Hooray!	2024-09-06 11:50:17 -07:00
John Doty	23981f82ce	Start working on emacs mode generation	2024-09-06 10:20:34 -07:00
John Doty	501c2e3fbe	Teach the highlight meta about emacs face names	2024-09-06 10:20:17 -07:00
John Doty	676ddedbaf	Add the language name to the end of generated scopes	2024-09-05 15:12:55 -07:00
John Doty	dbf893e48b	Generate queries a little better	2024-09-05 15:03:44 -07:00
John Doty	51c4f14c26	Emit highlight queries for tree-sitter. Now we're starting to get somewhere!	2024-09-05 14:52:35 -07:00
John Doty	ea5fab4e4e	Tree-sitter regexps are structured. Instead of trying to build a regular expression string, just build a structured thing with seq() and choice() and whatnot. This is technically uglier but fixes a problem I found with comment regular expressions so you know, it works, which is better than not working. Also now tokens get named and maybe that's good? It's so hard to say.	2024-09-05 11:51:29 -07:00
John Doty	be8e017fd9	Fix regex generation, extras	2024-09-05 06:32:28 -07:00
John Doty	94f5958087	Field propagation	2024-09-05 06:30:55 -07:00
John Doty	591da0c971	Rework highlighting metadata. GOD	2024-09-02 08:50:36 -07:00
John Doty	e4a8ad7b76	Trailing thing	2024-09-01 11:30:59 -07:00

74 commits