Mypyc Development Workflows
This page explains some common workflows for developing mypyc.
Most mypyc test cases are defined in the same format (`.test`) as used for test cases for mypy. Look at the mypy developer documentation for a general overview of how things work. Test cases live under `mypyc/test-data/`, and you can run all mypyc tests via `pytest -q mypyc`. If you don't make changes to code under `mypy/`, it's not important to regularly run mypy tests during development.
When you create a PR, Continuous Integration jobs compile mypy using mypyc and run the mypy test suite using the compiled mypy. This will sometimes catch additional issues not caught by the mypyc test suite. It's okay not to do this in your local development environment.
We discuss writing tests in more detail later in this document.
It's often useful to look at the generated IR when debugging issues or when trying to understand how mypyc compiles some code. When you compile some module by running `mypyc`, mypyc will write the pretty-printed IR into `build/ops.txt`. This is the final IR, which includes the output from the exception and reference count handling insertion passes.
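For example (the module below is just an illustration), compiling it with `mypyc example.py` leaves the final IR for the module in `build/ops.txt`:

```python
# example.py -- compile with: mypyc example.py
# Afterwards, inspect build/ops.txt for the pretty-printed IR of this module.
from typing import List

def first(items: List[int]) -> int:
    return items[0]
```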
We also have tests that verify the generated IR (`mypyc/test-data/irbuild-*.test`).
`./runtests.py self` type checks mypy and mypyc. This is pretty slow, however, since it's using an uncompiled mypy.

Installing a released version of mypy using pip (which is compiled) and using `dmypy` (mypy daemon) is a much, much faster way to type check mypyc during development.
It's often useful to inspect the C code generated by mypyc to debug issues. Mypyc stores the generated C code as `build/__native.c`. Compiled native functions have the prefix `CPyDef_`, while wrapper functions used for calling functions from interpreted Python code have the `CPyPy_` prefix.
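For example, compiling a module containing the function below (the names are illustrative) and then searching `build/__native.c` should turn up both kinds of functions:

```python
# example.py -- compile with: mypyc example.py
def add(x: int, y: int) -> int:
    return x + y

# In build/__native.c, look for symbols with the prefixes described above,
# e.g. a CPyDef_... function with the native implementation of add() and a
# CPyPy_... wrapper used when add() is called from interpreted code.
```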
This section gives an overview of where to look for and what to do to implement specific kinds of mypyc features.
Our bread-and-butter testing strategy is compiling code with mypyc and running it. There are downsides to this (kind of slow, tests a huge number of components at once, insensitive to the particular details of the IR), but there really is no substitute for running code. You can also write tests that test the generated IR, however.
Test cases that compile and run code are located in `mypyc/test-data/run*.test` and the test runner is in `mypyc.test.test_run`. The code to compile comes after `[case test<name>]`. The code gets saved into the file `native.py`, and it gets compiled into the module `native`.
Each test case uses a non-compiled Python driver that imports the `native` module and typically calls some compiled functions. Some tests also perform assertions and print messages in the driver.
If you don't provide a driver, a default driver is used. The default driver just calls each module-level function that is prefixed with `test_` and reports any uncaught exceptions as failures. (Failure to build or a segfault also count as failures.) `testStringOps` in `mypyc/test-data/run-strings.test` is an example of a test that uses the default driver.
You should usually use the default driver (don't include `driver.py`). It's the simplest way to write most tests.
Here's an example test case that uses the default driver:
```
[case testConcatenateLists]
def test_concat_lists() -> None:
    assert [1, 2] + [5, 6] == [1, 2, 5, 6]

def test_concat_empty_lists() -> None:
    assert [] + [] == []
```
There is one test case, `testConcatenateLists`. It has two sub-cases, `test_concat_lists` and `test_concat_empty_lists`. Note that you can use the pytest `-k` argument to only run `testConcatenateLists`, but you can't filter tests at the sub-case level.
It's recommended to have multiple sub-cases per test case, since each test case has significant fixed overhead. Each test case is run in a fresh Python subprocess.
Many of the existing test cases provide a custom driver by having `[file driver.py]`, followed by the driver implementation. Here the driver is not compiled, which is useful if you want to test interactions between compiled and non-compiled code. However, many of the tests don't have a good reason to use a custom driver -- when they were written, the default driver wasn't available.
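Here's a sketch of what a custom-driver test case might look like (the names are illustrative):

```
[case testCallCompiledFromInterpreted]
def add(x: int, y: int) -> int:
    return x + y

[file driver.py]
# The driver is not compiled, so this exercises calling the compiled
# native module from interpreted Python code.
from native import add
assert add(1, 2) == 3
assert add(-1, 1) == 0
```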
Test cases can also have an `[out]` section, which specifies the expected contents of stdout the test case should produce. New test cases should prefer assert statements to `[out]` sections.
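For reference, a sketch of the older `[out]` style (new tests should generally prefer asserts, as in the examples above):

```
[case testPrintHello]
def f() -> None:
    print("hello")

[file driver.py]
from native import f
f()
[out]
hello
```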
If the specifics of the generated IR for a change are important (because, for example, you want to make sure a particular optimization is triggering), you should add a `mypyc.irbuild` test as well. Test cases are located in `mypyc/test-data/irbuild-*.test` and the test driver is in `mypyc.test.test_irbuild`. IR build tests do a direct comparison of the IR output, so try to make the test as targeted as possible so as to capture only the important details. (Many of our existing IR build tests do not follow this advice, unfortunately!)
If you pass the `--update-data` flag to pytest, it will automatically update the expected output of any tests to match the actual output. This is very useful for changing or creating IR build tests, but make sure to carefully inspect the diff!
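As a sketch (the code is illustrative), an IR build test case looks roughly like this, with the expected IR following an `[out]` header:

```
[case testListLen]
from typing import List
def f(x: List[int]) -> int:
    return len(x)
[out]
```

The expected IR itself is omitted above; rather than writing it by hand, run something like `pytest -q mypyc -k testListLen --update-data` to let pytest fill it in, then inspect the result.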
You may also need to add some definitions to the stubs used for builtins during tests (`mypyc/test-data/fixtures/ir.py`). We don't use full typeshed stubs to run tests since they would seriously slow down tests.
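For example, if a new test uses a builtin member that the fixture doesn't declare yet, you might extend the relevant stub with just that member (a hypothetical sketch; match the style of the existing fixture):

```python
# In mypyc/test-data/fixtures/ir.py (sketch): add only what your test needs,
# e.g. a single extra method on the minimal str stub.
class str:
    def upper(self) -> str: ...
```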
Many mypyc improvements attempt to make some operations faster. For any such change, you should run some measurements to verify that there actually is a measurable performance impact.
A typical benchmark would initialize some data to be operated on, and then measure time spent in some function. In particular, you should not measure time needed to run the entire benchmark program, as this would include Python startup overhead and other things that aren't relevant. In general, for microbenchmarks, you want to do as little as possible in the timed portion. So ideally you'll just have some loops and the code under test. Be ready to provide your benchmark in code review so that mypyc developers can check that the benchmark is fine (writing a good benchmark is non-trivial).
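Here's a minimal sketch of that shape (the names and sizes are illustrative). Compile the module with mypyc, run it several times for each variant, and compare the timings:

```python
import time
from typing import List

def do_work(data: List[int]) -> int:
    # The code under test: replace this with the operation you're optimizing.
    total = 0
    for x in data:
        total += x * x
    return total

def bench() -> None:
    data = list(range(1_000_000))   # set-up is outside the timed region
    t0 = time.time()
    for _ in range(50):             # time only the loop and the code under test
        do_work(data)
    print("elapsed:", time.time() - t0)

if __name__ == "__main__":
    bench()
```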
You should run a benchmark at least five times, in both original and changed versions, ignore outliers, and report the average runtime. Actual performance of a typical desktop or laptop computer is quite variable, due to dynamic CPU clock frequency changes, background processes, etc. If you observe a high variance in timings, you'll need to run the benchmark more times. Also try closing most applications, including web browsers.
Interleave original and changed runs. Don't run 10 runs with variant A followed by 10 runs with variant B, but run an A run, a B run, an A run, etc. Otherwise you risk that the CPU frequency will be different between variants. You can also try adding a delay of 5 to 20s between runs to avoid CPU frequency changes.
Instead of averaging over many measurements, you can try to adjust your environment to provide more stable measurements. However, this can be hard to do with some hardware, including many laptops. Victor Stinner has written a series of blog posts about making measurements stable:
- https://vstinner.github.io/journey-to-stable-benchmark-system.html
- https://vstinner.github.io/journey-to-stable-benchmark-average.html
If you add an operation that compiles into a lot of C code, you may also want to add a C helper function for the operation to make the generated code smaller. Here is how to do this:
- Declare the operation in `mypyc/lib-rt/CPy.h`. We avoid macros, and we generally avoid inline functions to make it easier to target additional backends in the future.
- Consider adding a unit test for your C helper in `mypyc/lib-rt/test_capi.cc`. We use Google Test for writing tests in C++. The framework is included in the repository under the directory `googletest/`. The C unit tests are run as part of the pytest test suite (`test_c_unit_test`).
Mypyc speeds up operations on primitive types such as `list` and `int` by having primitive operations specialized for specific types. These operations are declared in `mypyc.primitives` (and `mypyc/lib-rt/CPy.h`). For example, `mypyc.primitives.list_ops` contains primitives that target list objects.
The operation definitions are data driven: you specify the kind of operation (such as a call to `builtins.len` or a binary addition) and the operand types (such as `list_primitive`), and what code should be generated for the operation. Mypyc does AST matching to find the most suitable primitive operation automatically.
Look at the existing primitive definitions and the docstrings in `mypyc.primitives.registry` for examples and more information.
Some types (typically Python built-in types), such as `int` and `list`, are special cased in mypyc to generate optimized operations specific to these types. We'll occasionally want to add additional primitive types.
Here are some hints about how to add support for a new primitive type (this may be incomplete):
- Decide whether the primitive type has an "unboxed" representation (a representation that is not just `PyObject *`). For most types we'll use a boxed representation, as it's easier to implement and more closely matches Python semantics.
- Create a new instance of `RPrimitive` to support the primitive type and add it to `mypyc.ir.rtypes`. Make sure all the attributes are set correctly and also define `<foo>_rprimitive` and `is_<foo>_rprimitive`.
- Update `mypyc.irbuild.mapper.Mapper.type_to_rtype()`.
- If the type is not unboxed, update `emit_cast` in `mypyc.codegen.emit`.
If the type is unboxed, there are some additional steps:
- Update `emit_box` in `mypyc.codegen.emit`.
- Update `emit_unbox` in `mypyc.codegen.emit`.
- Update `emit_inc_ref` and `emit_dec_ref` in `mypyc.codegen.emit`. If the unboxed representation does not need reference counting, these can be no-ops.
- Update `emit_error_check` in `mypyc.codegen.emit`.
- Update `emit_gc_visit` and `emit_gc_clear` in `mypyc.codegen.emit` if the type has an unboxed representation with pointers.
The above may be enough to allow you to declare variables with the type, pass values around, perform runtime type checks, and use generic fallback primitive operations to perform method calls, binary operations, and so on. You likely also want to add some faster, specialized primitive operations for the type (see Adding a Specialized Primitive Operation above for how to do this).
Add a test case to `mypyc/test-data/run*.test` to test compilation and running compiled code. Ideas for things to test (a sketch follows the list):
- Test using the type as an argument.
- Test using the type as a return value.
- Test passing a value of the type to a function both within compiled code and from regular Python code. Also test this for return values.
- Test using the type as a list item type. Test both getting a list item and setting a list item.
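Here's a sketch of those ideas, using `bytes` purely as a stand-in for the new type (note that the default driver is interpreted, so it also exercises calls from regular Python code):

```
[case testNewPrimitiveBasics]
from typing import List

def as_arg(b: bytes) -> int:
    return len(b)

def as_return() -> bytes:
    return b"abc"

def test_arg_and_return() -> None:
    assert as_arg(b"xyz") == 3
    assert as_return() == b"abc"

def test_list_items() -> None:
    items: List[bytes] = [b"a", b"b"]
    assert items[0] == b"a"       # getting a list item
    items[1] = b"c"               # setting a list item
    assert items == [b"a", b"c"]
```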
Mypyc supports most Python syntax, but there are still some gaps. Support for syntactic sugar that doesn't need additional IR operations typically only requires changes to `mypyc.irbuild`.
Some new syntax also needs new IR primitives to be added to `mypyc.primitives`. See `mypyc.primitives.registry` for documentation about how to do this.
- This developer documentation is not aimed to be very complete. Much of our documentation is in comments and docstrings in the code. If something is unclear, study the code.
- It can be useful to look through some recent PRs to get an idea of what typical code changes, test cases, etc. look like.
- Feel free to open GitHub issues with questions if you need help when contributing, or ask questions in existing issues. Note that we only support contributors. Mypyc is not (yet) an end-user product. You can also ask questions in our Gitter chat (https://gitter.im/mypyc-dev/community).
These workflows would be useful for mypyc contributors. We should add them to mypyc developer documentation:
- How to inspect the generated IR before some transform passes.