3d: Dependent Data Descriptions for Verified Validation¶
Note
The documentation from this page covers the EverParse/3d releases v2025.05.20 or later. Documentations for earlier versions may be found in the Git history.
EverParse/3d lets you write a data format description and automatically produces efficient data validation code formally verified for memory-safety (absence of buffer overruns), functional correctness (validation is correct wrt. your data format) and security (absence of double memory fetches.) A gentle overview of formal verification is available in the documentation of EverCrypt; EverParse provides similar formal guarantees except secret-independence.
Summary¶
- Write your data format description in the 3d language, as
MyFile1.3d
,MyFile2.3d
. - Give MyFile1.3d and MyFile2.3d to EverParse:
- with the Windows binary package:
everparse.cmd MyFile1.3d MyFile2.3d
- with the Linux binary package:
everparse.sh MyFile1.3d MyFile2.3d
- with the source tree:
bin/3d.exe --batch MyFile1.3d MyFile2.3d
- with the Windows binary package:
Then, EverParse/3d produces:
MyFile1.fsti
,MyFile1.fst
: the verified F* specification and Low* implementation of parsers and validators for your data format fromMyFile1.3d
, and similarly forMyFile2.3d
MyFile1.c
,MyFile1.h
: the verified C code generated from the verified F* and Low* files, and similarly forMyFile2.3d
MyFile1Wrapper.c
,MyFile1Wrapper.h
: an entrypoint API to call into the verified C code, and similarly forMyFile2.3d
EverParse.h
,EverParseEndianness.h
: various EverParse helpers for the implementation of verified validators, common toMyFile1.3d
andMyFile2.3d
- Insert
#include "MyFile1Wrapper.h"
into any of your C compilation units that need to call a validator fromMyFile1.3d
, and similarly forMyFile2.3d
.
Formal guarantees¶
For each message type in the source data formats, the generated C code provides a _validator_, a procedure to determine if a given input array of bytes contains a valid representation of a message, with the following properties:
- Memory safety: The validator is memory safe, meaning it never accesses memory out of bounds, accesses memory at an incompatible type, uses memory after it has been freed, etc.
- Arithmetic safety: The validator has no integer overflows or underflows.
- Double-fetch freedom: The validator reads every byte in the input array at most once, defending against certain kinds of time-of-check/time-of-use bugs.
- Write footprint: The validator does not modify the input array, and only modifies memory as described by user-controlled imperative actions that are explicit in the specification.
- No heap allocation: The validator does not allocate any memory on the heap.
- Functionally correct: The validator accepts all input arrays that contain valid representations of the message according to its specification, rejecting all others.
Download and install¶
We provide binary (x86_64 only) packages for Windows and Linux.
- Download the binary package corresponding to your platform, from the latest EverParse/3d release.
- Extract it (Windows:
unzip
; Linux:tar xzf
) anywhere you like. No need to configure it anyhow.
Then, everything is contained in a everparse
directory, whose
actual entrypoint executable is everparse.cmd
(for Windows) or
everparse.sh
(for Linux.)
Binary packages already contain all dependencies required by EverParse, such as F*, KaRaMeL and z3, so no need to install anything beforehand.
The Windows binary package contains a binary snapshot of clang-format to pretty-print the produced .c
and
.h
files. On Linux, you may want to install clang-format (more
info at the LLVM download page), but this is optional.
The Windows binary package is fully native, in the sense that it does not require any POSIX emulation layer such as Cygwin or WSL.
Run¶
On Windows: everparse.cmd
[options] input_files
On Linux: everparse.sh
[options] input_files
From the source tree: bin/3d.exe
. Caveat: in that case, no options
are toggled by default (not even --batch
or --clang_format
)
and you need to provide all options by hand.
Options are treated from left to right, with higher priority on
right-most options. Thus, any --no_clang_format
will override any
occurrence of --clang_format
or --clang_format_executable
to
its left, and vice-versa.
You can provide one or more input_files
to EverParse. Then,
EverParse.h
and EverParseEndianness.h
will be shared across
.c
and .h
files produced for all the .3d
files that you
provided.
Note
EverParse input files must all bear the .3d
file name extension,
and their names must start with a capital letter.
Default mode¶
In default mode, EverParse3d produces C code corresponding to validators proven correct with respect to the 3d specification.
Options are treated from left to right:
--odir output_directory_name
Tell EverParse to write all files to the directory
output_directory_name
. This directory must already exist (EverParse will not create it.) The default is.
(current directory)--batch
Verify the produced F*
.fst
and.fsti
files and generate C.c
and.h
files. This is toggled by default.--no_batch
Do not verify the produced F*
.fst
and.fsti
files or generate C.c
and.h
files.--clang_format
Call
clang-format
on extracted .c/.h files. This option automatically toggles--batch
.This is toggled by default on Windows, since the Windows binary package contains a copy of clang-format. On Linux, this is toggled by default only if
clang-format
is reachable through thePATH
environment variable.--no_clang_format
Do not call clang-format on extracted .c/.h files.
--clang_format_executable
Provide the path to clang-format if not reachable through
PATH
(or to override the one provided in the Windows binary package.) This option automatically toggles--clang_format
and--batch
.--clang_format_use_custom_config
With
--clang_format
, skip copying the.clang-format
file from the EverParse distribution, and assume that there is already an existing.clang-format
--cleanup
With
--batch
, remove all produced intermediate files other than.c
and.h` files (thus: ``.fst
,.fsti
,.checked
,.krml
)--no_cleanup
Keep all produced intermediate files. This is toggled by default.
--skip_c_makefiles
Skip generating the default Makefiles produced by Karamel to compile the produced C code.
--add_include <myinclude.h>
--add_include "myinclude.h"
Prepend each produced C file with a
#include <myinclude.h>
or a#include "myinclude.h"
C preprocessor directive.--config myfile.config
Use the specified JSON file
myfile.config
to declare compile-time flags (see compile-time configuration in the 3D language reference.)--fstar path/to/fstar.exe
Allows to specify the location of the F* executable. The
everparse.sh
/everparse.cmd
script in the EverParse binary package is already configured to use the F* executable provided in the package. Otherwise, the default is to use thefstar.exe
found from thePATH
.--input_stream buffer|extern|static
Specifies the format of validator input buffers:
With
buffer
(which is the default), a validator takes as argument a byte array and its length.With
extern
, a validator takes as argument a value of a user-definedEVERPARSE_INPUT_STREAM_BASE
type, representing an input byte stream, for which the following functions need to be defined:extern BOOLEAN EverParseHas( EVERPARSE_EXTRA_T _extra_t, EVERPARSE_INPUT_STREAM_BASE x, uint64_t n );
EverParseHas
returns1
if the input streamx
hasn
bytes available to read,0
otherwise.extern uint8_t* EverParseRead( EVERPARSE_EXTRA_T _extra_t, EVERPARSE_INPUT_STREAM_BASE x, uint64_t n, uint8_t *dst );
EverParseRead
takes and readsn
bytes from the input streamx
, possibly copying them intodst
, and returns a byte array containing the bytes read. There is no requirement that the function returnsdst
, or even that a copy todst
is made. `extern void EverParseSkip( EVERPARSE_EXTRA_T _extra_t, EVERPARSE_INPUT_STREAM_BASE x, uint64_t n );
EverParseSkip
advances the input streamx
byn
bytes without reading them.extern uint8_t* EverParsePeep( EVERPARSE_EXTRA_T _extra_t, EVERPARSE_INPUT_STREAM_BASE x, uint64_t n );
EverParsePeep
returns a pointer to a byte array that is part of the input streamx
and points to its nextn
bytes, then advances the input stream byn
. If such operation is impossible, then returnsNULL
. Such a pointer is returned by thefield_ptr
action (see “Atomic actions” in the 3d language reference), so it is the responsibility of the application to ensure double-fetch freedom on the contents of the returned field is preserved.extern uint64_t EverParseEmpty( EVERPARSE_EXTRA_T _extra_t, EVERPARSE_INPUT_STREAM_BASE x );
EverParseEmpty
skips to the end of the input streamx
and returns the number of bytes so skipped, without reading them.In all such functions,
EVERPARSE_EXTRA_T
is a user-defined type for the context used for error handling (see “Error handling” in the 3d language reference.)static
is likeextern
except that the functions onEVERPARSE_INPUT_STREAM_BASE
are declaredstatic
.
In any case, to compile the C code produced by EverParse, you need to add to the include path the subdirectory
src/3d/prelude/<input_stream_format>
of the EverParse binary package or repository; this directory contains theEverParse.h
corresponding to the input stream format given by--input_stream
--input_stream_include myinclude.h
With
--input_stream
, defines the name of the C include file that provides definitions forEVERPARSE_INPUT_STREAM_BASE
and its associated functions.
Alternate mode: generating a Makefile¶
By default, with --batch
, given a .3d
specification, EverParse
produces .c
and .h
files by first producing .fsti
and
.fst
files files containing verified F* specifications and
implementations of validators, and then calling F* and KaRaMeL on each
such file to verify and extract them to C. EverParse will issue all
such calls to F* and KaRaMeL sequentially.
Alternatively, with everparse.cmd --makefile gmake Test.3d
(on
Windows; use ./everparse.sh
on Linux), instead of producing .c
and .h
files, EverParse will take Test.3d
and all of the
modules it depends on, and produce a GNU Makefile containing all
recipes and rules to produce .c
and .h
files and all
intermediate files with proper dependencies so that the user can
leverage GNU Make parallelism for any rules that do not depend on one
another. Other build systems are supported with variations of the
--makefile
option, as stated below.
The produced Makefile will make use of three variables:
EVERPARSE_CMD
: the command to run EverParse. This variable must contain the path and name of the EverParse executable, and any options (--clang_format
, etc.) useful to produce .EVERPARSE_INPUT_DIR
: the path to the input files.
Note
Currently, all .3d
files must be located in the same input
directory.
EVERPARSE_OUTPUT_DIR
: the path to the output files. EverParse will append the--odir
option appropriately.CC
: the path and name of the C compiler that will be used to compile the produced C files.CFLAGS
: options to provide to the C compiler to compile the produced C files.
The produced Makefile, by default called EverParse.Makefile
, is
meant to be included in the main Makefile
of your project. On
Linux, such a typical main Makefile
for use with GNU Make may look
like:
Also valid in Makefile generation mode are the following options:
--clang_format
Add
.clang-format
as a dependency of generated.c
and.h
files, and produce a rule to generate.clang-format
--clang_format_use_custom_config
Same as
--clang_format
, but do not produce a rule to generate.clang-format
all: test
EVERPARSE_CMD=/path/to/everparse.sh
EVERPARSE_OUTPUT_DIR=obj
EVERPARSE_INPUT_DIR=src
everparse_makefile=$(EVERPARSE_OUTPUT_DIR)/EverParse.Makefile
source_files=$(wildcard $(EVERPARSE_INPUT_DIR)/*.3d)
$(EVERPARSE_OUTPUT_DIR):
mkdir -p $@
$(everparse_makefile): $(source_files) $(EVERPARSE_OUTPUT_DIR)
$(EVERPARSE_CMD) --makefile gmake --makefile_name $@ $(source_files)
include $(everparse_makefile)
test: $(EVERPARSE_ALL_O_FILES)
clean:
rm -rf $(EVERPARSE_OUTPUT_DIR)
.PHONY: all test clean
A fully commented version of that main Makefile
is available in
the EverParse repository.
EverParse supports the following command-line options relevant to this mode:
--makefile gmake
: produce a Makefile for use with GNU Make.--makefile nmake
: produce a Makefile for use with Microsoft NMAKE.--makefile_name my_file_name
: the name of the produced Makefile shall bemy_file_name
instead of the defaultEverParse.Makefile
.--skip_o_rules
: do not output rules that would produce.o
files by compiling.c
files.
Alternate mode: hash checking¶
To speed up some scenarios where the user maintains a snapshot of the
generated .c and .h files to skip unnecessary verification time when
.3d files are not changed, EverParse provides an optional mechanism to
record the hash of .3d, .c and .h files and to compare them to skip
reverification and regeneration. This mechanism is disabled by
default, and has to be explicitly enabled using the --check_hashes
option as described below. WARNING: this mode will weaken
verification guarantees to the probability of hash collisions!
If enabled, then EverParse checks the hashes of the .3d and maybe .c and .h files (depending on the variant, as described below) and fails if hashes do not match; no verification or regeneration is performed at all.
Hashes are computed and checked using the EverCrypt verified cryptographic library, which is included in EverParse binary packages.
Options relevant to this mode:
--save_hashes
: save the hashes of the contents of source .3d, .c and .h files into a .json file. This option can be added in default mode when generating .c and .h files.--check_hashes strong
: checks the hashes of the contents of source .3d, .c and .h files, by reading the recorded hash from the .json file created by--save_hashes
.--check_hashes weak
: checks the hashes of the contents of source .3d files only, by reading the recorded hash from the .json file created by--save_hashes
, leaving the option for the user to manually change the generated .c and .h files (WARNING: this will void all verification guarantees!)--check_hashes inplace
: same as--check_hashes weak
, but uses the hash recorded in the copyright header of the generated .c and .h files instead of the .json file. The hash is introduced by using theEVERPARSEHASHES
keyword in the corresponding.copyright.txt
used to prepend generated .c and .h files with copyright headers; see Copyright headers for more details.--check_inplace_hash MyFile.3d=MyFile.h
: same as--check_hashes inplace
, but instead manually specifies the pair of the source.3d
file and the already-generated.h
(or.c
) file in which to check the hash. This option can be specified several times to provide several such pairs to check.
Alternate mode: test case generation¶
Given a 3d specification, EverParse can now generate positive (resp. negative) test cases. Internally, EverParse converts the 3d specification into a SMT2 theory, and then asks Z3 for the following satisfiability question: is there any sequence of bytes passing (resp. failing) the parser specification?
Note
Test case generation currently does not work with probes that coerce pointer types or specialization for different pointer sizes.
The following options are mutually exclusive:
--z3_test MyModule.my_parser
(requires
MyModule.3d
to be specified on the command line) produces positive and negative test cases formy_parser
defined inMyModule.3d
.This option produces the following results:
a series of binary files containing the contents of each test case. Each binary file is named as follows:
witness.n.POS.i.validatorname.args.dat
where:
n
is the test case numberPOS
(resp.NEG
) indicates a positive (resp. negative) test casei
is a probe counter:0
corresponds to the input buffer passed to the validator; a positive value corresponds to thei
-th time aprobe
3d construct is encountered. (If the 3d specification contains noprobe
construct, then only the0
file will be created for each test case.)More precisely, when encountering a
T *ptr probe ProbeFunction(length = ..., destination = ...)
pointer field, the binary contents for that field in a test case is the big-endian representation of a positive 64-bit integeri
so that the contents of the file with probe indexi
is meant to be passed to the validator forT
.validatorname
is the name of the validator function produced by EverParse. It has the shapeModuleValidateMyParser
args
is the sequence of the non-outparameter argument values passed to the validator, separated by.
; all of those argument values are integers.
a C test program,
testcases.c
, that produces C definitions of the test cases, and calls to the validator with those test cases.A given test case is an array of layers; a layer is a byte array and its length. Layer 0 contains the byte array passed to the validator; layer of a positive index
i
contains the byte array passed to the validator corresponding to thei
-th time aprobe
pointer field is encountered.with
--batch
(set by default when using theeverparse.sh
/everparse.bat
script from the binary package), this option also compiles the C test program intotest.exe
and runs it to check whether the generated test cases are actually accepted or rejected by the verified validator implementations for the given 3d parser specifications. Then, the test program prints on standard output the C definitions of the test cases.Use
--no_batch
to disable compilation of the C test program.
--z3_diff_test MyModule1.my_parser1,MyModule2.my_parser2
(requires
MyModule1.3d
andMyModule2.3d
to be specified on the command line) produces a set of test cases that are positive formy_parser1
but negative formy_parser2
, and a set of test cases for the converse.The shape of the produced results is similar to those produced by
--z3_test
(see above), except that the test file name also contains the name of the second validator against which it is negative.
Note
If Z3 produces no test cases and instead answers unsat
both
ways (my_parser1
and not my_parser2
; and my_parser2
and
not my_parser1
), then the two specifications are equivalent in
terms of validation: they accept the exact same inputs. However,
they do not necessarily write the same values into outparameters.
--emit_smt_encoding
produces the SMT2 encoding corresponding to the
.3d
files passed as argument. Such encoding is printed on standard output.--test_checker MyModule.my_parser
(requires
MyModule.3d
to be specified on the command line) produces a test checker executable for the given parser,test.exe
. The test checker takes as arguments:- a binary file name containing the contents of the input buffer to be passed to the validator
- the non-outparameter integer argument values to be passed to the validator
The test checker executable returns exit code:
- 0 if validation succeeds and consumes all input bytes
- 1 if validation succeeds but the validator does not consume all input bytes
- 2 if validation fails
- 3 if the input file is larger than 4 GB, or if any system error occurs (file does not exist, read error, etc.)
Note
The test checker executable currently does not work with 3d specifications containing probes.
Options for --z3_test
¶
--z3_test_mode pos
Only produces positive test cases
--z3_test_mode neg
Only produces negative test cases
Branch coverage analysis¶
Unless explicitly specified otherwise, the following options can be
used with -z3_test
or --z3_diff_test
:
--z3_branch_depth n
Ask z3 to generate test cases for all possible branches of a 3D specification up to depth
n
, by depth-first search. Branching points in a 3D specification are field constraints andcasetype
; a branch is a sequence of choices performed at successive branching points. Default value forn
is0
to not perform branch coverage analysis.--z3_witnesses n
If
--z3_test_mode
is specified, then this option producesn
test cases for each branch. Otherwise, producen
positive andn
negative tests for each branch. In both cases,n
must be a positive integer.
Alternate mode: help and version¶
Some options make EverParse do nothing other than show a help or version message:
--version
Print the version of EverParse/3d and exit (do not process any further options to the right, nor any files at all.)
--help
Show the usage message and the status of some options given so far, and exit (do not process any further options to the right, nor any files at all.)
Language Reference¶
Contents:
- Overview
- Base types
- Structs
- Constraints
- Bitfields
- Constants and Enumerations
- Parameterized data types
- Tagged unions or
casetype
- Arrays
- Actions
- Alignment
- Explicitly checking 3d types for correspondence with existing C types
- Generating code with for several compile-time configurations
- Validating Data with Indirections or Pointers
- Specialization for Different Pointer Sizes
- Comments
- Adding copyright notices to produced .c/.h files
- Modular structure and files
- Error handling
- Fully worked examples: TCP Segment Headers
- Fully worked examples: ELF files