Add a new marker to check for memory leaks #52

Merged
merged 5 commits into from
Aug 23, 2023
18 changes: 18 additions & 0 deletions docs/api.rst
@@ -0,0 +1,18 @@
.. module:: pytest_memray

pytest-memray API
=================

Types
-----

.. autoclass:: LeaksFilterFunction()
:members: __call__
:show-inheritance:

.. autoclass:: Stack()
:members:

.. autoclass:: StackFrame()
:members:

11 changes: 11 additions & 0 deletions docs/conf.py
@@ -9,8 +9,10 @@
from sphinxcontrib.programoutput import Command

extensions = [
"sphinx.ext.autodoc",
"sphinx.ext.extlinks",
"sphinx.ext.githubpages",
"sphinx.ext.intersphinx",
"sphinxarg.ext",
"sphinx_inline_tabs",
"sphinxcontrib.programoutput",
@@ -36,6 +38,15 @@
"https://github.com/bloomberg/pytest-memray/issues/.*": "https://github.com/bloomberg/pytest-memray/pull/.*"
}

# Try to resolve Sphinx references as Python objects by default. This means we
# don't need :func: or :class: etc., which keeps docstrings more human readable.
default_role = "py:obj"

# Automatically link to Python standard library types.
intersphinx_mapping = {
"python": ("https://docs.python.org/3", None),
}


def _get_output(self):
code, out = prev(self)
1 change: 1 addition & 0 deletions docs/index.rst
@@ -13,4 +13,5 @@ reports like:

usage
configuration
api
news
122 changes: 102 additions & 20 deletions docs/usage.rst
@@ -31,33 +31,115 @@ reported after tests run ends:
Markers
~~~~~~~

This plugin provides markers that can be used to enforce additional checks and
validations on tests when this plugin is enabled.
This plugin provides `markers <https://docs.pytest.org/en/latest/example/markers.html>`__
that can be used to enforce additional checks and validations on tests.

.. important:: These markers do nothing when the plugin is not enabled.

.. py:function:: pytest.mark.limit_memory(memory_limit: str)

``limit_memory``
----------------
Fail the execution of the test if the test allocates more memory than allowed.

When this marker is applied to a test, it will cause the test to fail if the execution
of the test allocates more memory than allowed. It takes a single argument with a
string indicating the maximum memory that the test can allocate.
When this marker is applied to a test, it will cause the test to fail if the
execution of the test allocates more memory than allowed. It takes a single argument
with a string indicating the maximum memory that the test can allocate.

The format for the string is ``<NUMBER> ([KMGTP]B|B)``. The marker will raise
``ValueError`` if the string format cannot be parsed correctly.
The format for the string is ``<NUMBER> ([KMGTP]B|B)``. The marker will raise
``ValueError`` if the string format cannot be parsed correctly.
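
To make the size format concrete, a parser for such strings could look like the sketch below. The helper name ``parse_size`` and the use of 1024-based multipliers are assumptions made for this illustration; the plugin's actual parsing code may differ.

```python
import re

# Assumption: units are binary multiples (1 KB = 1024 B).
_UNITS = {
    "B": 1,
    "KB": 1024,
    "MB": 1024**2,
    "GB": 1024**3,
    "TB": 1024**4,
    "PB": 1024**5,
}

# A number, optional whitespace, then one of B, KB, MB, GB, TB, PB.
_SIZE_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([KMGTP]?B)\s*$", re.IGNORECASE)


def parse_size(text: str) -> float:
    """Hypothetical helper: convert a string such as "24 MB" to bytes."""
    match = _SIZE_RE.match(text)
    if match is None:
        raise ValueError(f"Invalid memory size format: {text!r}")
    number, unit = match.groups()
    return float(number) * _UNITS[unit.upper()]
```

A string that does not match the pattern, such as ``"lots"``, raises ``ValueError`` in this sketch, mirroring the behaviour described above.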

.. warning::
.. warning::

As the Python interpreter has its own
`object allocator <https://docs.python.org/3/c-api/memory.html>`__ is possible
that memory is not immediately released to the system when objects are deleted, so
tests using this marker may need to give some room to account for this.
As the Python interpreter has its own
`object allocator <https://docs.python.org/3/c-api/memory.html>`__ it's possible
that memory is not immediately released to the system when objects are deleted,
so tests using this marker may need to give some room to account for this.

Example of usage:
Example of usage:

.. code-block:: python
.. code-block:: python

@pytest.mark.limit_memory("24 MB")
def test_foobar():
pass # do some stuff that allocates memory
@pytest.mark.limit_memory("24 MB")
def test_foobar():
    pass  # do some stuff that allocates memory


.. py:function:: pytest.mark.limit_leaks(location_limit: str, filter_fn: LeaksFilterFunction | None = None)

Fail the execution of the test if any call stack in the test leaks more memory than
allowed.

.. important::
To detect leaks, Memray needs to intercept calls to the Python allocators and
report native call frames. This adds significant overhead and will slow your
test down.

When this marker is applied to a test, the plugin will analyze the memory
allocations that are made while the test body runs and not freed by the time the
test body function returns. It groups them by the call stack leading to the
allocation, and sums the amount leaked by each **distinct call stack**. If the total
amount leaked from any particular call stack is greater than the configured limit,
the test will fail.
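
The per-stack accounting described above can be sketched roughly as follows. This is only an illustration of the idea, not the plugin's implementation: the representation of a call stack as a tuple of function names and the helper ``stacks_over_limit`` are both hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical representation: a call stack is a tuple of function names,
# and each leaked allocation is a (call_stack, size_in_bytes) pair.
CallStack = Tuple[str, ...]


def stacks_over_limit(
    leaked: Iterable[Tuple[CallStack, int]], limit_bytes: int
) -> Dict[CallStack, int]:
    """Sum leaked bytes per distinct call stack; keep stacks above the limit."""
    totals: Dict[CallStack, int] = defaultdict(int)
    for stack, size in leaked:
        totals[stack] += size
    return {stack: total for stack, total in totals.items() if total > limit_bytes}


leaked = [
    (("malloc", "load_config", "test_foobar"), 600_000),
    (("malloc", "load_config", "test_foobar"), 600_000),
    (("malloc", "compile_regex", "test_foobar"), 4_096),
]
# Only the load_config stack exceeds 1 MB once its two allocations are summed.
offenders = stacks_over_limit(leaked, limit_bytes=1024 * 1024)
```

Note that neither of the two ``load_config`` allocations exceeds the limit on its own; the stack is flagged because the limit applies to the sum per distinct call stack.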

.. important::
It's recommended to run your API or code in a loop when utilizing this plugin.
This practice helps in distinguishing genuine leaks from the "noise" generated
by internal caches and other incidental allocations.

The format for the string is ``<NUMBER> ([KMGTP]B|B)``. The marker will raise
``ValueError`` if the string format cannot be parsed correctly.

The marker also takes an optional keyword-only argument ``filter_fn``. This argument
represents a filtering function that will be called once for each distinct call
stack that leaked more memory than allowed. If it returns *True*, leaks from that
location will be included in the final report. If it returns *False*, leaks
associated with the stack it was called with will be ignored. If all leaks are
ignored, the test will not fail. This can be used to discard any known false
positives.
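
A filter function might look something like the sketch below. To keep the snippet runnable without the plugin installed, it uses stand-in classes ``FakeFrame`` and ``FakeStack``; these, the attribute names ``frames``, ``function``, ``filename`` and ``lineno``, and the function name ``warm_cache`` are assumptions made for illustration, so check the API page for the real ``Stack`` and ``StackFrame`` members.

```python
from dataclasses import dataclass
from typing import Tuple


# Stand-ins for the plugin's Stack and StackFrame types so this sketch runs
# on its own. The attribute names are assumptions for illustration.
@dataclass(frozen=True)
class FakeFrame:
    function: str
    filename: str
    lineno: int


@dataclass(frozen=True)
class FakeStack:
    frames: Tuple[FakeFrame, ...]


def ignore_known_cache(stack) -> bool:
    """Return False (ignore the leak) if the stack passes through warm_cache.

    ``warm_cache`` is a hypothetical function known to populate a cache.
    Returning True keeps the leak in the final report.
    """
    through_cache = any(frame.function == "warm_cache" for frame in stack.frames)
    return not through_cache
```

Such a function would then be passed to the marker as ``filter_fn=ignore_known_cache``.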

.. tip::

You can pass the ``--memray-bin-path`` argument to ``pytest`` to specify
a directory where Memray will store the binary files with the results. You
can then use the ``memray`` CLI to further investigate the allocations and the
leaks using any Memray reporters you'd like. Check `the memray docs
<https://bloomberg.github.io/memray/getting_started.html>`_ for more
information.

Example of usage:

.. code-block:: python

@pytest.mark.limit_leaks("1 MB")
def test_foobar():
    # Run the function we're testing in a loop to ensure
    # we can differentiate leaks from memory held by
    # caches inside the Python interpreter.
    for _ in range(100):
        do_some_stuff()

.. warning::
It is **very** challenging to write tests that do not "leak" memory in some way,
due to circumstances beyond your control.

There are many caches inside the Python interpreter itself. Just a few examples:

- The `re` module caches compiled regexes.
- The `logging` module caches whether a given log level is active for
a particular logger the first time you try to log something at that level.
- A limited number of objects of certain heavily used types are cached for reuse
so that `object.__new__` does not always need to allocate memory.
- The mapping from bytecode index to line number for each Python function is
cached when it is first needed.

There are many more such caches. Also, within pytest, any message that you log or
print is captured, so that it can be included in the output if the test fails.

Memray sees these all as "leaks", because something was allocated while the test
ran and it was not freed by the time the test body finished. We don't know that
it's due to an implementation detail of the interpreter or pytest that the memory
wasn't freed. Moreover, because these caches are implementation details, the
amount of memory allocated, the call stack of the allocation, and even the
allocator that was used can all change from one version to another.

Because of this, you will almost certainly need to allow some small amount of
leaked memory per call stack, or use the ``filter_fn`` argument to filter out
false-positive leak reports based on the call stack they're associated with.
6 changes: 6 additions & 0 deletions src/pytest_memray/__init__.py
@@ -1,7 +1,13 @@
from __future__ import annotations

from ._version import __version__ as __version__
from .marks import LeaksFilterFunction
from .marks import Stack
from .marks import StackFrame

__all__ = [
"__version__",
"LeaksFilterFunction",
"Stack",
"StackFrame",
]