|
| 1 | +# Python Reference Counting in the Phlex Python Plugin |
| 2 | + |
| 3 | +## Overview |
| 4 | + |
| 5 | +Phlex's Python plugin bridges C++ and Python through `intptr_t` values |
| 6 | +that represent `PyObject*` pointers. These values flow through the |
| 7 | +framework's `declared_transform` nodes, which cache their outputs for |
| 8 | +reuse when duplicate hashes appear. This document describes the |
| 9 | +reference counting discipline used to avoid both leaks and crashes. |
| 10 | + |
| 11 | +## Architecture |
| 12 | + |
| 13 | +A typical Python transform pipeline looks like this: |
| 14 | + |
| 15 | +```text |
| 16 | +Provider → [input converter] → [py_callback] → [output converter] → Observer |
| 17 | + (C++ → PyObject) (Py → Py) (PyObject → C++) |
| 18 | +``` |
| 19 | + |
| 20 | +Each `[…]` above is its own `declared_transform` node. Each node is |
| 21 | +named with a combination of the type, the product label, and the |
| 22 | +consumer algorithm, so every converter has a **unique** identity. |
| 23 | + |
| 24 | +## The Caching Mechanism |
| 25 | + |
| 26 | +`declared_transform` (see `phlex/core/declared_transform.hpp`) caches |
| 27 | +its output product store by `data_cell_index::hash()`: |
| 28 | + |
| 29 | +```cpp |
| 30 | +if (stores_.insert(a, hash)) { |
| 31 | + // First time: call the transform, store result |
| 32 | + a->second = make_shared<product_store>(...); |
| 33 | +} else { |
| 34 | + // Cache hit: reuse a->second without calling the transform |
| 35 | +} |
| 36 | +``` |
| 37 | + |
| 38 | +Each transform function is called **exactly once** per unique hash. |
| 39 | +Duplicate events with the same hash reuse the cached product store |
| 40 | +without re-invoking the transform. The cache entry is erased once |
| 41 | +`done_with()` returns true (all events for that hash have been |
| 42 | +processed). |
| 43 | + |
| 44 | +## The One-to-One Model |
| 45 | + |
| 46 | +Because each converter node has a unique name and each transform is |
| 47 | +called once per unique hash, reference counting is **one-to-one**: |
| 48 | + |
| 49 | +1. **Input converters** create a new `PyObject*` reference (refcnt=1). |
| 50 | +2. **Consumers** (`py_callback` or output converters) `Py_DECREF` the |
| 51 | + reference after use (refcnt→0, freed). |
| 52 | + |
| 53 | +No `XINCREF`/`XDECREF` wrapping is needed around the callback call. |
| 54 | +Adding one would produce a net +1 leak per product. |
| 55 | + |
| 56 | +### Input Converters (`_to_py`) |
| 57 | + |
| 58 | +Each input converter creates a **new reference** that is stored in the |
| 59 | +product store as an `intptr_t`. The framework treats `intptr_t` as an |
| 60 | +opaque integer — it has no C++ destructor and cannot call `Py_DECREF`. |
| 61 | +The consumer is responsible for releasing the reference. |
| 62 | + |
| 63 | +```cpp |
| 64 | +// BASIC_CONVERTER: creates new reference via Python C API |
| 65 | +static intptr_t int_to_py(int a) { |
| 66 | + PyGILRAII gil; |
| 67 | + return (intptr_t)PyLong_FromLong(a); // new reference, refcnt=1 |
| 68 | +} |
| 69 | + |
| 70 | +// VECTOR_CONVERTER: creates new PhlexLifeline wrapping a numpy view |
| 71 | +static intptr_t vint_to_py(shared_ptr<vector<int>> const& v) { |
| 72 | + // ... creates PyArrayObject + PhlexLifeline ... |
| 73 | + return (intptr_t)pyll; // new reference, refcnt=1 |
| 74 | +} |
| 75 | +``` |
| 76 | +
|
| 77 | +### py_callback (`call` / `callv`) |
| 78 | +
|
| 79 | +`py_callback::call()` and `py_callback::callv()` receive `intptr_t` |
| 80 | +args that are **owned references** created by the upstream input |
| 81 | +converter, dedicated to this specific consumer. They: |
| 82 | +
|
| 83 | +1. Call `lifeline_transform()` to unwrap any `PhlexLifeline` objects |
| 84 | + (extracting the numpy view for the Python function). |
| 85 | +2. Pass the args to `PyObject_CallFunctionObjArgs()`. |
| 86 | +3. Call `decref_all(args...)` to release the input references. |
| 87 | +
|
| 88 | +```cpp |
| 89 | +template <typename... Args> |
| 90 | +intptr_t call(Args... args) { |
| 91 | + PyGILRAII gil; |
| 92 | + PyObject* result = PyObject_CallFunctionObjArgs( |
| 93 | + m_callable, lifeline_transform(args)..., nullptr); |
| 94 | + // ... error handling ... |
| 95 | + decref_all(args...); // release input references |
| 96 | + return (intptr_t)result; // new reference from Python call |
| 97 | +} |
| 98 | +``` |
| 99 | + |
| 100 | +Both `call()` and `callv()` use `lifeline_transform()` symmetrically. |
| 101 | + |
| 102 | +### Output Converters (`py_to_*`) |
| 103 | + |
| 104 | +Output converters receive the `intptr_t` from `py_callback`'s product |
| 105 | +store — again, an **owned reference** created by the Python call, |
| 106 | +dedicated to this specific output converter. They extract the C++ |
| 107 | +value, then `Py_DECREF` the input: |
| 108 | + |
| 109 | +```cpp |
| 110 | +// BASIC_CONVERTER py_to_* |
| 111 | +static int py_to_int(intptr_t pyobj) { |
| 112 | + PyGILRAII gil; |
| 113 | + int i = (int)PyLong_AsLong((PyObject*)pyobj); |
| 114 | + // ... error handling ... |
| 115 | + Py_DECREF((PyObject*)pyobj); // release the reference |
| 116 | + return i; |
| 117 | +} |
| 118 | + |
| 119 | +// NUMPY_ARRAY_CONVERTER py_to_* |
| 120 | +static shared_ptr<vector<int>> py_to_vint(intptr_t pyobj) { |
| 121 | + PyGILRAII gil; |
| 122 | + auto vec = make_shared<vector<int>>(); |
| 123 | + // ... copy data from PyArray or PyList ... |
| 124 | + Py_DECREF((PyObject*)pyobj); // release the reference |
| 125 | + return vec; |
| 126 | +} |
| 127 | +``` |
| 128 | +
|
| 129 | +## Reference Flow Diagram |
| 130 | +
|
| 131 | +```text |
| 132 | + ┌──────────────┐ |
| 133 | + │ Provider │ C++ value (int, float, vector<T>) |
| 134 | + └──────┬───────┘ |
| 135 | + │ |
| 136 | + ┌──────▼───────┐ |
| 137 | + │ input conv. │ Creates NEW PyObject* (refcnt=1) |
| 138 | + │ (e.g. int_ │ Stored in product_store as intptr_t |
| 139 | + │ to_py) │ One per unique hash; consumer owns it |
| 140 | + └──────┬───────┘ |
| 141 | + │ intptr_t (owned reference, refcnt=1) |
| 142 | + ┌──────▼───────┐ |
| 143 | + │ py_callback │ lifeline_transform() to unwrap |
| 144 | + │ ::call() │ PyObject_CallFunctionObjArgs() |
| 145 | + │ │ decref_all(args...) → refcnt=0, freed |
| 146 | + │ │ Returns result (NEW reference, refcnt=1) |
| 147 | + └──────┬───────┘ |
| 148 | + │ intptr_t (owned reference, refcnt=1) |
| 149 | + ┌──────▼───────┐ |
| 150 | + │ output conv. │ Reads PyObject* value |
| 151 | + │ (e.g. py_to_ │ Py_DECREF → refcnt=0, freed |
| 152 | + │ int) │ Returns C++ value |
| 153 | + └──────┬───────┘ |
| 154 | + │ |
| 155 | + ┌──────▼───────┐ |
| 156 | + │ Observer │ Uses C++ value |
| 157 | + └──────────────┘ |
| 158 | +``` |
| 159 | + |
| 160 | +## Error Handling |
| 161 | + |
| 162 | +### VECTOR_CONVERTER Must Throw on Error |
| 163 | + |
| 164 | +`VECTOR_CONVERTER` error paths throw `std::runtime_error` instead of |
| 165 | +returning `(intptr_t)nullptr`. A null `intptr_t` passed to |
| 166 | +`PyObject_CallFunctionObjArgs` would act as the argument-list sentinel, |
| 167 | +silently truncating the argument list. |
| 168 | + |
| 169 | +### lifeline_transform Returns a Borrowed Reference |
| 170 | + |
| 171 | +`lifeline_transform()` unwraps `PhlexLifeline` objects to extract the |
| 172 | +numpy array view (`m_view`). The returned pointer is a borrowed |
| 173 | +reference from the lifeline object, which remains alive because the |
| 174 | +caller still holds the owned reference to the lifeline (the DECREF |
| 175 | +happens after the call returns). |
| 176 | + |
| 177 | +### ll_new Must Return nullptr on Allocation Failure |
| 178 | + |
| 179 | +`ll_new` (in `lifelinewrap.cpp`) returns `nullptr` when `tp_alloc` |
| 180 | +fails, rather than falling through to dereference the null pointer. |
| 181 | + |
| 182 | +## Common Pitfalls |
| 183 | + |
| 184 | +1. **Do not add `Py_XINCREF`/`Py_XDECREF` around `py_callback` calls.** |
| 185 | + The one-to-one model means each reference is created once and |
| 186 | + consumed once. Adding INCREF/DECREF introduces a net +1 leak per |
| 187 | + product — potentially millions of leaked objects. |
| 188 | + |
| 189 | +2. **Do not remove `Py_DECREF` from output converters or `decref_all` |
| 190 | + from `py_callback`.** These are the "consume" side of the |
| 191 | + one-to-one contract. Removing them leaks every converted object. |
| 192 | + |
| 193 | +3. **Do not return `(intptr_t)nullptr` from converters.** It acts as |
| 194 | + the `PyObject_CallFunctionObjArgs` sentinel. Throw an exception |
| 195 | + instead. |
| 196 | + |
| 197 | +## Known Limitations |
| 198 | + |
| 199 | +- `BASIC_CONVERTER`'s `_to_py` functions can return |
| 200 | + `(intptr_t)nullptr` if the underlying Python C API call fails (e.g., |
| 201 | + `PyLong_FromLong` on OOM). This null would propagate through the |
| 202 | + product store and cause a crash in `decref_all`. This is an |
| 203 | + extremely rare edge case that has not been addressed yet. |
0 commit comments