Add a new marker to check for memory leaks #52

pablogsal · 2022-11-15T18:09:29Z

Users have indicated that it would be very useful if the plugin exposed a
way to detect memory leaks in tests. This is possible, but it is a bit
tricky, as the interpreter can allocate memory for internal caches, as
can user functions.

To make this more reliable, the new marker will take two parameters:

* The watermark of memory to ignore. If the memory leaked by the test is
  higher than this value, the test will fail; otherwise it will pass.
* The number of warmup runs. This allows the test to be run multiple times
  (assuming it passes) before actually checking for leaks, which warms up
  user and interpreter caches.
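
The warmup-then-measure idea described above can be sketched with the standard library's `tracemalloc`. This is only an illustration under assumptions: the helper name `leaked_bytes`, the sample functions, and the byte threshold are hypothetical, and the plugin itself relies on memray rather than `tracemalloc`.

```python
import gc
import tracemalloc


def leaked_bytes(test_fn, warmups=3):
    """Return roughly how many bytes survive one call of test_fn.

    Hypothetical helper: the function is run a few times first so that
    interpreter and user caches are populated, then one run is measured.
    """
    for _ in range(warmups):
        test_fn()
    gc.collect()
    tracemalloc.start()
    try:
        before, _ = tracemalloc.get_traced_memory()
        test_fn()
        gc.collect()
        after, _ = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return max(0, after - before)


_kept_alive = []


def leaky():
    # Keeps 1 MiB alive after every call.
    _kept_alive.append(bytearray(1024 * 1024))


def clean():
    # Allocates temporary objects but keeps nothing alive.
    return sum(len(str(i)) for i in range(1000))


LIMIT = 100_000  # bytes; an arbitrary watermark chosen for this sketch

print("leaky over limit:", leaked_bytes(leaky) > LIMIT)
print("clean over limit:", leaked_bytes(clean) > LIMIT)
```

The watermark exists because even a well-behaved test can leave a few bytes behind, so comparing against a threshold is more robust than demanding exactly zero.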

Closes: #45

gaborbernat

We should document this in the readme somehow too.

pablogsal · 2022-11-16T12:44:56Z

I have added docs

tonybaloney · 2022-11-16T22:32:34Z

Just tried 466743b and I'm getting the warning:

PytestUnknownMarkWarning: Unknown pytest.mark.check_leaks - is this a typo?  You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/stable/how-to/mark.html
    @pytest.mark.check_leaks()

$ pip freeze
....
pytest-memray @ git+https://github.com/pablogsal/pytest-memray@466743b4853261a35c3ff84d4d4654d6586918a1
...

Update:

$ pytest --markers

Doesn't show the markers, but

$ pytest --memray --markers

Does. Compare this with hypothesis and other extensions, which show their markers regardless. This is separate (or by design) to this PR.
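
For context, pytest plugins usually register their markers from a `pytest_configure` hook by calling `config.addinivalue_line("markers", ...)`. One plausible explanation for the behaviour above is that registration only happens when the plugin is activated. The sketch below contrasts the two approaches; it is not the plugin's actual code. `FakeConfig` is a stand-in so the example runs without pytest, and the marker description string is invented.

```python
class FakeConfig:
    """Stand-in for pytest's Config object, used only in this sketch."""

    def __init__(self, memray_enabled):
        self.memray_enabled = memray_enabled
        self.ini_lines = []

    def getoption(self, name):
        # Only the hypothetical "memray" option is modelled here.
        return self.memray_enabled if name == "memray" else None

    def addinivalue_line(self, name, line):
        # Same (name, line) shape as pytest's Config.addinivalue_line.
        self.ini_lines.append((name, line))


MARKER_DOC = "check_leaks(limit, warmups): fail the test if it leaks memory"


def configure_only_when_enabled(config):
    # The marker is registered only if the plugin was switched on.
    if config.getoption("memray"):
        config.addinivalue_line("markers", MARKER_DOC)


def configure_always(config):
    # The marker is registered unconditionally, so it is always known.
    config.addinivalue_line("markers", MARKER_DOC)
```

With the first variant a run without the flag never learns about the marker, which would produce `PytestUnknownMarkWarning`; the second variant avoids that at the cost of advertising a marker that does nothing unless the plugin is active.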

godlygeek · 2022-11-16T22:55:00Z

> This is separate (or by design) to this PR

@tonybaloney Yep, and we have #49 for that already.

godlygeek · 2022-11-16T22:55:44Z

Other than needing to provide --memray, did things work nicely for you when you tried it?

tonybaloney · 2022-11-16T22:56:59Z

> Other than needing to provide --memray, did things work nicely for you when you tried it?

My tests fail because memray writes a warning to stderr on macOS (see bloomberg/memray#254), but other than that, it seems to work when I tried it. Next, I'm going to purposefully write a leaky test and verify that it catches it.

pablogsal · 2022-11-16T22:57:12Z

Apart from that, does it work for your use case? Can you simulate a leak in your extension and try the marker?

tonybaloney · 2022-11-17T00:06:52Z

I tried adding a leaking String object to the __repr__ function for an extension type:

PyObject* Logger_repr(Logger *self) {
    std::string level = _getLevelName(self->effective_level);
    PyObject* leaky_object = PyUnicode_FromString("Hello"); /* TEST */
    return PyUnicode_FromFormat("<Logger '%U' (%s)>", self->name, level.c_str());
}

Running memray by hand using a script that instantiates an instance of the extension type (100,000 times), I can see the leak in the graph:

This is what I'm currently doing. But using this marker doesn't seem to catch the same thing?

@pytest.mark.parametrize("level", levels)
@pytest.mark.check_leaks("0B", warmups=3)
def test_logger_repr(level):
    logger = picologging.Logger("test", level)
    assert repr(logger) == f"<Logger 'test' ({level_names[levels.index(level)]})>"

Doesn't give me any information about leaks; every test seems to leak at least something, even ones not testing my extension module:


--------------------------------------------------------------------------------- memray-leaked-memory ----------------------------------------------------------------------------------
Test leaked 108.0B out of limit of 0.0B
List of leaked allocations: 
        - test_logger_repr:/Users/anthonyshaw/projects/picologging/tests/unit/test_logger.py:279 -> 54.0B
        - test_logger_repr:/Users/anthonyshaw/projects/picologging/tests/unit/test_logger.py:280 -> 54.0B

I guess I could run each test and work out how much memory it should consume and then set the limit manually?

pablogsal · 2022-11-17T00:12:40Z

Well, that’s because something is indeed being leaked, but it is not what you are hunting for. The marker will complain about anything created in the test that survives the test. In your flamegraph for leaks you can see many more objects, so it would not be surprising that a lot more than the string is being kept alive. That’s precisely why the marker takes a minimum amount of memory that is allowed to leak: so you can filter that stuff out.
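
As a rough illustration of how such a watermark filters noise, the sketch below applies a size limit to the two 54.0B entries from the report earlier in this thread. The helper names (`parse_size`, `exceeds_watermark`) and the accepted size-string format are assumptions for this example, not the plugin's implementation.

```python
import re

# Decimal units are an assumption; a real tool may use powers of 1024.
_UNITS = {"B": 1, "KB": 1000, "MB": 1000**2, "GB": 1000**3}


def parse_size(text):
    """Convert a string such as "0B" or "1.5 KB" into a number of bytes."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([KMG]?B)\s*", text.upper())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = match.groups()
    return int(float(number) * _UNITS[unit])


def exceeds_watermark(leaks, limit):
    """Return True when the total leaked size is above the watermark."""
    return sum(leaks.values()) > parse_size(limit)


# Leak sizes in bytes per source line, taken from the report in this thread.
leaks = {
    "tests/unit/test_logger.py:279": 54,
    "tests/unit/test_logger.py:280": 54,
}

print(exceeds_watermark(leaks, "0B"))
print(exceeds_watermark(leaks, "1KB"))
```

With a limit of `"0B"` any surviving allocation fails the test, while a small non-zero limit lets this kind of background noise through and still catches larger leaks.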

pablogsal · 2022-11-17T00:13:07Z

> Doesn't give me any information about leaks

What information are you expecting?

tonybaloney · 2022-11-17T00:17:45Z

I was expecting it to report that the PyUnicode object allocated in the Logger_repr function is leaked, like:

Test leaked XXB out of limit of 0.0B
List of leaked allocations: 
        - _Logger_repr:/Users/anthonyshaw/projects/picologging/src/logger.cxx:Logger_repr : 142 -> XXB

pablogsal · 2022-11-17T00:18:05Z

> test_logger_repr:/Users/anthonyshaw/projects/picologging/tests/unit/test_logger.py:279 -> 54.0B
> test_logger_repr:/Users/anthonyshaw/projects/picologging/tests/unit/test_logger.py:280 -> 54.0B

Hmm, to me this looks right. It is telling you that your C extension type is leaking memory. I don’t see why you say that it is not giving you any information about the leak.

By default the marker doesn’t run in native mode, so that’s why it doesn’t show anything below that. It tells you the lowest Python frame that called something that leaked. As it cannot see below that frame, it cannot split the leak into the different calls that you see in the flamegraph, but it is telling you that something is leaking, which is the main purpose of the marker.

To debug the leak you can use the profiler directly, which allows you to use more powerful visualisations.

pablogsal · 2022-11-17T00:19:13Z

> was expecting it to report that the PyUnicode object allocated in the Logger_repr function is leaked, like:

The marker doesn’t run in native mode (for performance reasons as that’s quite heavy for a test suite) so it won’t show you any C code. Does that explain what’s going on?

tonybaloney · 2022-11-17T00:20:26Z

> > was expecting it to report that the PyUnicode object allocated in the Logger_repr function is leaked, like:
>
> The marker doesn’t run in native mode so it won’t show you any C code. Does that explain what’s going on?

Ok, that makes sense then. Is there a way to run it in native mode? The settings I use on the CLI are:

$ PYTHONMALLOC=malloc memray run --trace-python-allocators -o .profiles/memray_logger.py.bin -f --native memray_logger.py
$ memray flamegraph --leaks -f .profiles/memray_logger.py.bin
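
If this manual investigation has to be repeated often, the two commands above could be wrapped in a small script. The sketch below only assembles the environment and argument lists, copying the flags verbatim from those commands; the helper names are hypothetical, and `profile_leaks` assumes memray is installed and on `PATH`.

```python
import os
import subprocess


def memray_leak_commands(script, output):
    """Build the environment and argv lists for the two commands quoted above."""
    env = dict(os.environ, PYTHONMALLOC="malloc")
    run_cmd = [
        "memray", "run", "--trace-python-allocators",
        "-o", output, "-f", "--native", script,
    ]
    report_cmd = ["memray", "flamegraph", "--leaks", "-f", output]
    return env, run_cmd, report_cmd


def profile_leaks(script, output):
    """Run both steps in order; raises if either command fails."""
    env, run_cmd, report_cmd = memray_leak_commands(script, output)
    subprocess.run(run_cmd, env=env, check=True)
    subprocess.run(report_cmd, check=True)
```

Calling `profile_leaks("memray_logger.py", ".profiles/memray_logger.py.bin")` would then reproduce the workflow shown in the comment.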

pablogsal · 2022-11-17T00:20:48Z

The idea of the marker is that it will complain that there are leaks and then show you the Python call that leaked. If you need to investigate native code, you can do that later with the profiler. Or at least that’s the idea.

The marker is not a substitute for the profiler. It is a way to ensure that you are not leaking, and it will tell you something about the Python call that leaked, but then you should investigate with better reporters.

tonybaloney · 2022-11-17T00:22:05Z

Ok, then yes, this will serve the requested use case in #45. I'll test it with more scenarios.

pablogsal · 2022-11-17T00:25:02Z

> Ok, then yes, this will serve the requested use case in #45. I'll test it with more scenarios.

We could add a way to run in native mode, but that would make the happy path (the test passing) extra slow for no reason. As failures are the rare case, we are prioritising detection over exhaustive reports and leaving manual execution of the profiler for later.

godlygeek · 2022-11-17T00:28:12Z

> Is there a way to run it in native mode?

@pablogsal and I were going back and forth about this on Slack for quite a while today, so I'm feeling a bit vindicated 😛

tonybaloney · 2022-11-17T00:47:30Z

> > Is there a way to run it in native mode?
>
> @pablogsal and I were going back and forth about this on Slack for quite a while today, so I'm feeling a bit vindicated 😛

I understand it would be really slow, but it would be handy as an optional flag.

pablogsal · 2023-07-10T17:58:30Z

@godlygeek This is ready for another round

sarahmonod

I found a few typos as I was reading this

Users have indicated that it will be very useful if the plugin exposes a way to detect memory leaks in tests. This is possible, but is a bit tricky as the interpreter can allocate memory for internal caches, as well as user functions.

To make this more reliable, the new marker will take two parameters:

* The limit of memory per location to consider an allocation. If the memory leaked by any allocation location in the test is higher than this value, the test will fail.
* An optional callable function that can be used to filter out locations. This will allow users to remove false positives.

Signed-off-by: Pablo Galindo <[email protected]>
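
The revised design (a per-location limit plus an optional filter) can be sketched as follows. This is a minimal illustration under assumptions: the function name, the `(location, size)` input shape, and the sample data are invented here and do not reflect the plugin's internals.

```python
from collections import defaultdict


def locations_over_limit(allocations, limit_bytes, filter_fn=None):
    """Return the locations whose total leaked bytes exceed limit_bytes.

    allocations: iterable of (location, size_in_bytes) pairs.
    filter_fn: optional callable; locations for which it returns False
    are ignored, which lets a user discard known false positives.
    """
    totals = defaultdict(int)
    for location, size in allocations:
        if filter_fn is not None and not filter_fn(location):
            continue
        totals[location] += size
    return {loc: size for loc, size in totals.items() if size > limit_bytes}


allocations = [
    ("test_logger.py:279", 54),
    ("test_logger.py:280", 54),
    ("importlib/_bootstrap.py:241", 4096),  # invented interpreter-cache entry
]


def ignore_importlib(location):
    # Hypothetical filter: drop anything allocated by the import machinery.
    return not location.startswith("importlib/")


print(locations_over_limit(allocations, 100))
print(locations_over_limit(allocations, 100, ignore_importlib))
```

Checking each location against the limit, rather than the test's total, means many tiny unrelated allocations do not add up to a failure, and the filter gives users a way to silence one specific noisy location.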

godlygeek

This looks a lot cleaner. I've taken a first pass at reviewing this and I still see some aspects of the API that I think we need to polish. I've focused only on how we describe failures, how we describe the functionality, and how users hook their own filtering into it so far.

godlygeek · 2023-08-18T00:09:00Z

I've pushed another fixup. It adds another dataclass to represent a stack frame, so that the users get named fields instead of needing to work with a tuple[str, str, int] (especially because it's not easy to remember which str is which). I've also renamed the stack frame type from StackElement to StackFrame, since I think that'll be more intuitive for users. I've also updated Stack to say that frames is a Sequence rather than a Collection, since it makes no sense to say that they're ordered but not allow users to reverse them or subscript them.

Honestly, I think we should probably change that to list instead. Once we release this, we're never going to change it to be anything other than a list, thanks to Hyrum's Law, and just telling people that it's a list makes for more readable documentation than telling them that it's a Sequence and making them jump to the CPython docs to double check which operations a Sequence supports.
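
To make the trade-off concrete, here is a minimal sketch of a frame type with named fields and a stack that exposes its frames as a plain list. The field names and the helper functions are assumptions for illustration; only the type names `StackFrame` and `Stack` and the `tuple[str, str, int]` shape come from the comment above.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class StackFrame:
    """One frame with named fields (the field names are assumptions)."""

    function: str
    filename: str
    lineno: int


@dataclass(frozen=True)
class Stack:
    """An ordered list of frames that can be indexed and reversed."""

    frames: List[StackFrame]


def from_tuples(raw: List[Tuple[str, str, int]]) -> Stack:
    # Convert the anonymous tuple form into named frames.
    return Stack([StackFrame(*item) for item in raw])


def allocated_in_tests(stack: Stack) -> bool:
    # Example user filter; treating index 0 as the relevant frame is an
    # assumption made for this sketch.
    return stack.frames[0].filename.startswith("tests/")


stack = from_tuples([("test_logger_repr", "tests/unit/test_logger.py", 279)])
print(stack.frames[0].function, stack.frames[0].lineno)
```

With named fields, a filter can read `frame.filename` instead of remembering whether the first or second string in the tuple is the file name.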

Signed-off-by: Matt Wozniski <[email protected]>

godlygeek · 2023-08-22T22:51:11Z

OK. I've added another fixup commit getting this into a state where I'm happy to land it. I changed a fair amount of stuff, but most of it is pretty minor. The biggest changes were to the documentation.

Sorry that this turned out to be a lot of changes, but I hope most of them will be pretty uncontroversial.

godlygeek · 2023-08-22T22:54:09Z

https://godlygeek.github.io/pytest-memray/usage.html and https://godlygeek.github.io/pytest-memray/api.html show the rendered documentation after my changes.

godlygeek

After my (extensive 😓) changes, this LGTM. If you're happy with it, feel free to squash and land. If not, let me know what I screwed up and I'll be happy to fix it.

gaborbernat previously requested changes Nov 15, 2022

pablogsal force-pushed the leaks branch from ffc7d38 to d588227 Compare November 16, 2022 12:44

pablogsal force-pushed the leaks branch from d588227 to 5382442 Compare November 16, 2022 12:54

pablogsal mentioned this pull request Nov 16, 2022

Marker for fail a leaking test #45

Closed

pablogsal force-pushed the leaks branch 2 times, most recently from a3864b0 to daa4374 Compare November 16, 2022 19:13

godlygeek reviewed Nov 16, 2022

pablogsal force-pushed the leaks branch 2 times, most recently from 9e550f5 to 466743b Compare November 16, 2022 21:43

gaborbernat reviewed Nov 16, 2022

godlygeek force-pushed the leaks branch from 466743b to 557653f Compare November 16, 2022 22:52

godlygeek requested changes Nov 16, 2022

pablogsal force-pushed the leaks branch 6 times, most recently from 60c08f0 to cf05664 Compare July 10, 2023 17:58

pablogsal requested a review from godlygeek July 10, 2023 17:58

sarahmonod reviewed Aug 9, 2023

pablogsal force-pushed the leaks branch 3 times, most recently from 712b65d to e261803 Compare August 16, 2023 16:18

pablogsal force-pushed the leaks branch from e261803 to 8aea9b0 Compare August 16, 2023 16:20

godlygeek reviewed Aug 16, 2023

pablogsal requested a review from godlygeek August 17, 2023 14:59

fixup! Add a new marker to check for memory leaks

42495b8

pablogsal force-pushed the leaks branch from 4ab0480 to 42495b8 Compare August 17, 2023 15:18

pablogsal commented Aug 17, 2023

fixup! fixup! Add a new marker to check for memory leaks

b71d68d

fixup! fixup! fixup! Add a new marker to check for memory leaks

7705a53

godlygeek force-pushed the leaks branch from 05cf2cf to 7705a53 Compare August 18, 2023 00:13

fixup! fixup! fixup! fixup! Add a new marker to check for memory leaks

8dc0dcf

Signed-off-by: Matt Wozniski <[email protected]>

godlygeek approved these changes Aug 22, 2023

pablogsal merged commit 0e33179 into bloomberg:main Aug 23, 2023
10 checks passed

pablogsal deleted the leaks branch August 23, 2023 10:33
