Vendor things
parent 5deceec006
commit 977e3c17e5
19434 changed files with 10682014 additions and 0 deletions
65
third-party/vendor/regex-automata/tests/gen/README.md
vendored
Normal file
@@ -0,0 +1,65 @@
This directory contains tests for serialized objects from the regex-automata
crate. Currently, there are only two supported such objects: dense and sparse
DFAs.

The idea behind these tests is to commit some serialized objects and run some
basic tests by deserializing them and running searches and ensuring they are
correct. We also make sure these are run under Miri, since deserialization is
one of the biggest places where undefined behavior might occur in this crate
(at the time of writing).

The main thing we're testing is that the *current* code can still deserialize
*old* objects correctly. Generally speaking, compatibility extends to semver
compatible releases of this crate. Beyond that, no promises are made, although
in practice callers can at least depend on errors occurring. (The serialized
format always includes a version number, and incompatible changes increment
that version number such that an error will occur if an unsupported version is
detected.)
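
To make the version check concrete, here is a minimal sketch of the idea in
Rust. It uses a toy format, not this crate's actual wire format: the
`SUPPORTED_VERSION` constant, the `ToyError` enum and the `check_version`
function are hypothetical names invented for illustration, as is the choice of
a little-endian `u32` at the front of the bytes.

```rust
/// Hypothetical version number for a toy format (not regex-automata's real value).
const SUPPORTED_VERSION: u32 = 2;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum ToyError {
    /// Fewer than four bytes were available for the version field.
    Truncated,
    /// The version field did not match what this code supports.
    VersionMismatch { expected: u32, found: u32 },
}

/// Reads a little-endian `u32` version from the front of `bytes` and returns
/// the remaining payload only if the version is the supported one.
fn check_version(bytes: &[u8]) -> Result<&[u8], ToyError> {
    if bytes.len() < 4 {
        return Err(ToyError::Truncated);
    }
    let (head, rest) = bytes.split_at(4);
    let found = u32::from_le_bytes([head[0], head[1], head[2], head[3]]);
    if found != SUPPORTED_VERSION {
        return Err(ToyError::VersionMismatch {
            expected: SUPPORTED_VERSION,
            found,
        });
    }
    Ok(rest)
}
```

With this shape, bytes written under a different version number fail with
`VersionMismatch` instead of being misinterpreted as a valid payload.
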

To generate the dense DFAs, I used this command:

```
$ regex-cli generate serialize dense regex \
    MULTI_PATTERN_V2 \
    tests/gen/dense/ \
    --rustfmt \
    --safe \
    --starts-for-each-pattern \
    --specialize-start-states \
    --start-kind both \
    --unicode-word-boundary \
    --minimize \
    '\b[a-zA-Z]+\b' \
    '(?m)^\S+$' \
    '(?Rm)^\S+$'
```

And to generate the sparse DFAs, I used this command, which is the same as
above, but with `s/dense/sparse/g`.

```
$ regex-cli generate serialize sparse regex \
    MULTI_PATTERN_V2 \
    tests/gen/sparse/ \
    --rustfmt \
    --safe \
    --starts-for-each-pattern \
    --specialize-start-states \
    --start-kind both \
    --unicode-word-boundary \
    --minimize \
    '\b[a-zA-Z]+\b' \
    '(?m)^\S+$' \
    '(?Rm)^\S+$'
```

The idea is to try to enable as many of the DFA's options as possible in order
to test that serialization works for all of them.

Arguably we should increase test coverage here, but this is a start. Note
that in particular, this does not need to test that serialization and
deserialization correctly roundtrips on its own. Indeed, the normal regex test
suite has a test that does a serialization round trip for every test supported
by DFAs. So that has very good coverage. What we're interested in testing here
is our compatibility promise: do DFAs generated with an older revision of the
code still deserialize correctly?
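
The difference between the two kinds of test can be sketched with a toy
format. Everything below is hypothetical (the `serialize` and `deserialize`
functions, the length-prefixed layout and the `OLD_FIXTURE` bytes) and is
unrelated to the real DFA format. It only illustrates that a round trip
exercises today's writer against today's reader, while a committed fixture
exercises today's reader against bytes written in the past.

```rust
/// Toy serializer standing in for the current code: one length byte, then
/// the payload. Panics if more than 255 values are given.
fn serialize(values: &[u8]) -> Vec<u8> {
    assert!(values.len() <= 255, "toy format holds at most 255 values");
    let mut out = Vec::with_capacity(values.len() + 1);
    out.push(values.len() as u8);
    out.extend_from_slice(values);
    out
}

/// Toy deserializer: returns `None` when the input is empty or when the
/// length byte disagrees with the payload that follows it.
fn deserialize(bytes: &[u8]) -> Option<Vec<u8>> {
    let (&len, payload) = bytes.split_first()?;
    if usize::from(len) != payload.len() {
        return None;
    }
    Some(payload.to_vec())
}

/// Bytes imagined to have been written by an older revision and then
/// committed to the repository (an assumption made for this sketch).
const OLD_FIXTURE: &[u8] = &[3, 10, 20, 30];

/// Round trip: only shows that the current writer and reader agree.
fn round_trips(values: &[u8]) -> bool {
    deserialize(&serialize(values)) == Some(values.to_vec())
}

/// Compatibility: shows that the current reader still understands old bytes.
fn reads_old_fixture() -> bool {
    deserialize(OLD_FIXTURE) == Some(vec![10, 20, 30])
}
```

If the layout were changed in both functions at once, `round_trips` would keep
passing while `reads_old_fixture` would typically start failing, which is the
kind of regression the tests in this directory are meant to catch.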