Compare commits

...

4 commits

Author SHA1 Message Date
42183be0f4 [dingus] Build out of tree 2024-11-02 09:57:26 -07:00
fb181667b5 [dingus] Home tweaks 2024-11-02 09:57:17 -07:00
3d9c3b2c99 [dingus] Fix links in about page 2024-11-02 09:56:49 -07:00
953d2ee2a0 [dingus] about.html 2024-11-02 09:29:37 -07:00
4 changed files with 261 additions and 9 deletions

209
dingus/about.md Normal file
View file

@ -0,0 +1,209 @@
% About The Grammar Dingus
<!-- Lots of this writing is taken from the project readme, so keep them in sync. -->
[This is a demo](index.html) for a [library](https://github.com/decarabas/lrparsers)
about doing fun things with grammars.
## How to Use The Dingus
- Define your grammar in the left hand pane in python.
- Write some text in your language in the middle pane.
- Poke around the tree and errors on the right hand side.
## Making Grammars
To get started, create a grammar that derives from the `Grammar`
class. Create one method per non-terminal, decorated with the `rule`
decorator. Here's an example:
```python {.numberLines}
from parser import *
class SimpleGrammar(Grammar):
start = "expression"
@rule
def expression(self):
return seq(self.expression, self.PLUS, self.term) | self.term
@rule
def term(self):
return seq(self.LPAREN, self.expression, self.RPAREN) | self.ID
PLUS = Terminal('+')
LPAREN = Terminal('(')
RPAREN = Terminal(')')
ID = Terminal(
Re.seq(
Re.set(("a", "z"), ("A", "Z"), "_"),
Re.set(("a", "z"), ("A", "Z"), ("0", "9"), "_").star(),
),
)
```
Terminals can be plain strings or regular expressions constructed with
the `Re` object. (Ironically, I guess this library is not clever
enough to parse a regular expression string into one of these
structures. If you want to build one, go nuts! It's just Python, you
can do whatever you want so long as the result is an `Re` object.)
Productions can be built out of terminals and non-terminals,
concatenated with the `seq` function or the `+` operator. Alternatives
can be expressed with the `alt` function or the `|` operator. These
things can be freely nested, as desired.
There are no helpers (yet!) for consuming lists, so they need to be
constructed in the classic context-free grammar way:
```python {.numberLines}
class NumberList(Grammar):
start = "list"
@rule
def list(self):
return self.NUMBER | (self.list + self.COMMA + self.NUMBER)
NUMBER = Terminal(Re.set(("0", "9")).plus())
COMMA = Terminal(',')
```
(Unlike with PEGs, you can write grammars with left or right-recursion,
without restriction, either is fine.)
When used to generate a parser, the grammar describes a concrete
syntax tree. Unfortunately, that means that the list example above
will generate a very awkward tree for `1,2,3`:
```
list
list
list
NUMBER ("1")
COMMA
NUMBER ("2")
COMMA
NUMBER ("3")
```
In order to make this a little cleaner, rules can be "transparent",
which means they don't generate nodes in the tree and just dump their
contents into the parent node instead.
```python {.numberLines}
class NumberList(Grammar):
start = "list"
@rule
def list(self):
# The starting rule can't be transparent: there has to be something to
# hold on to!
return self.transparent_list
@rule(transparent=True)
def transparent_list(self) -> Rule:
return self.NUMBER | (self.transparent_list + self.COMMA + self.NUMBER)
NUMBER = Terminal(Re.set(("0", "9")).plus())
COMMA = Terminal(',')
```
This grammar will generate the far more useful tree:
```
list
NUMBER ("1")
COMMA
NUMBER ("2")
COMMA
NUMBER ("3")
```
Rules that start with `_` are also interpreted as transparent,
following the lead set by tree-sitter, and so the grammar above is
probably better-written as:
```python {.numberLines}
class NumberList(Grammar):
start = "list"
@rule
def list(self):
return self._list
@rule
def _list(self):
return self.NUMBER | (self._list + self.COMMA + self.NUMBER)
NUMBER = Terminal(Re.set(("0", "9")).plus())
COMMA = Terminal(',')
```
That will generate the same tree, but a little more succinctly.
### Trivia
Most folks that want to parse something want to skip blanks when they
do it. Our grammars don't say anything about that by default (sorry),
so you probably want to be explicit about such things.
To allow (and ignore) spaces, newlines, tabs, and carriage-returns in
our number lists, we would modify the grammar as follows:
```python {.numberLines}
class NumberList(Grammar):
start = "list"
trivia = ["BLANKS"] # <- Add a `trivia` member
@rule
def list(self):
return self._list
@rule
def _list(self):
return self.NUMBER | (self._list + self.COMMA + self.NUMBER)
NUMBER = Terminal(Re.set(("0", "9")).plus())
COMMA = Terminal(',')
BLANKS = Terminal(Re.set(" ", "\t", "\r", "\n").plus())
# ^ and add a new terminal to describe it
```
Now we can parse a list with spaces! "1 , 2, 3" will parse happily
into:
```
list
NUMBER ("1")
COMMA
NUMBER ("2")
COMMA
NUMBER ("3")
```
### Error recovery
In order to get good error recovery, you have to... do nothing.
The parser runtime we're using here uses a non-interactive version of
[CPCT+](https://tratt.net/laurie/blog/2020/automatic_syntax_error_recovery.html).
I find that it actually works quite well! If you're skeptical that a
machine-generated parser can do well enough for, say, an LSP, give
your favorite examples a try here. You might be surprised.
(Go ahead, give it some of your [favorite examples of resilient
parsing](https://matklad.github.io/2023/05/21/resilient-ll-parsing-tutorial.html)
and see how it does. I would love to see examples of where the
recovery went fully off the rails!)
### Syntax highlighting
*You can annotate the terminals and nonterminals to generate syntax
highlighting but the dingus doesn't have it wired into the editors
yet.*
### Pretty-printing
*You can annotate the grammar with rules for pretty printing but the
dingus doesn't expose it yet.*

View file

@ -4,17 +4,19 @@
<title>Dingus</title>
<script src="https://cdn.jsdelivr.net/pyodide/v0.26.2/full/pyodide.js"></script>
<!-- CodeMirror -->
<link rel="stylesheet" href="./codemirror/codemirror.css" />
<script src="./codemirror/codemirror.js"></script>
<script src="./codemirror/python.js"></script>
<link rel="stylesheet" href="./style.css" />
<style>
</style>
</head>
<body>
<div class="page-container">
<div class="page-title">
<h1>Grammar Dingus</h1>
<a href="about.html">What is this?</a>
</div>
<div class="panel-top grammar-title">
Write Your Grammar Here
</div>

View file

@ -1,14 +1,18 @@
/* set codemirror ide height to 100% of the textarea */
body {
height: 100vh;
box-sizing: border-box;
margin: 0;
}
h1 {
margin-top: 0;
margin-bottom: 0.25rem;
}
.page-container {
display: grid;
grid-template-columns: 1fr 1fr 1fr;
grid-template-rows: 4rem 2rem 1fr 2rem;
grid-template-rows: auto 2rem 1fr 2rem;
width: 100%;
height: 100%;
}
@ -33,6 +37,12 @@ body {
border-top: 1px solid;
}
.page-title {
grid-column: 1 / 4;
grid-row: 1;
padding: 0.5rem;
}
.grammar-title {
grid-column: 1;
grid-row: 2;

View file

@ -23,11 +23,42 @@ dist/lrparsers-$(VERSION).tar.gz dist/lrparsers-$(VERSION)-py3-none-any.whl: pyp
.PHONY: clean
clean:
rm -rf ./dist
rm -rf ./dingus/wheel/*
DINGUS_FILES=\
dingus/srvit.py \
dingus/index.html \
dingus/dingus.js \
dingus/worker.js \
dingus/style.css \
dingus/codemirror/codemirror.css \
dingus/codemirror/codemirror.js \
dingus/codemirror/python.js \
dingus/pyodide/micropip-0.6.0-py3-none-any.whl \
dingus/pyodide/micropip-0.6.0-py3-none-any.whl.metadata \
dingus/pyodide/packaging-23.2-py3-none-any.whl \
dingus/pyodide/packaging-23.2-py3-none-any.whl.metadata \
dingus/pyodide/pyodide.asm.js \
dingus/pyodide/pyodide.asm.wasm \
dingus/pyodide/pyodide-core-0.26.2.tar \
dingus/pyodide/pyodide.d.ts \
dingus/pyodide/pyodide.js \
dingus/pyodide/pyodide-lock.json \
dingus/pyodide/pyodide.mjs \
dingus/pyodide/python_stdlib.zip \
DINGUS_TARGETS=$(addprefix dist/, $(DINGUS_FILES))
.PHONY: dingus
dingus: dingus/wheel/lrparsers-$(VERSION)-py3-none-any.whl
python3 ./dingus/srvit.py
dingus: $(DINGUS_TARGETS) dist/dingus/wheel/lrparsers-$(VERSION)-py3-none-any.whl dist/dingus/about.html
python3 ./dist/dingus/srvit.py
dingus/wheel/lrparsers-$(VERSION)-py3-none-any.whl: dist/lrparsers-$(VERSION)-py3-none-any.whl
dist/dingus/%: dingus/%
mkdir -p $(dir $@)
ln $< $@
dist/dingus/about.html: dingus/about.md
pandoc $< -o $@ -s
dist/dingus/wheel/lrparsers-$(VERSION)-py3-none-any.whl: dist/lrparsers-$(VERSION)-py3-none-any.whl
mkdir -p $(dir $@)
cp $< $@