Commit

Change .rst file maximum line length to 80
Align the .rst line length with the one in use in the php/policies
repository. [1]

Policies uses the same Sphinx / Docutils based documentation build
including the same rstfmt formatting utility.

Additionally, the line length of 80 has been mentioned in the Python
Developer Guide, home of the file format: [2]

    The maximum line length is 80 characters for normal text, but \
    tables, deeply indented code samples and long links may extend \
    beyond that.

[1]: https://github.com/php/policies
[2]: https://devguide.python.org/documentation/markup/#use-of-whitespace
hakre committed Nov 16, 2024
1 parent d42b8c0 commit 965343c
Showing 10 changed files with 648 additions and 516 deletions.
2 changes: 1 addition & 1 deletion .editorconfig
Original file line number Diff line number Diff line change
@@ -35,4 +35,4 @@ trim_trailing_whitespace = false

[*.rst]
indent_style = space
max_line_length = 100
max_line_length = 80
2 changes: 1 addition & 1 deletion docs/Makefile
@@ -8,7 +8,7 @@ SPHINXBUILD ?= sphinx-build
SOURCEDIR = source
BUILDDIR = build
RSTFMT = rstfmt
RSTFMTFLAGS = -w 100
RSTFMTFLAGS = -w 80

rwildcard = $(foreach d,$(wildcard $(1:=/*)),$(call rwildcard,$d,$2) $(filter $(subst *,%,$2),$d))
FILES = $(call rwildcard,$(SOURCEDIR),*.rst)
174 changes: 96 additions & 78 deletions docs/source/core/data-structures/reference-counting.rst
@@ -2,20 +2,22 @@
Reference counting
####################

In languages like C, when you need memory for storing data for an indefinite period of time or in a
large amount, you call ``malloc`` and ``free`` to acquire and release blocks of memory of some size.
This sounds simple on the surface but turns out to be quite tricky, mainly because the data may not
be freed for as long as it is used anywhere in the program. Sometimes this makes it unclear who is
responsible for freeing the memory, and when to do so. Failure to handle this correctly may result
in a use-after-free, double-free, or memory leak.

In PHP you usually do not need to think about memory management. The engine takes care of allocating
and freeing memory for you by tracking which values are no longer needed. It does this by assigning
a reference count to each allocated value, often abbreviated as refcount or RC. Whenever a reference
to a value is passed somewhere else, its reference count is increased to indicate the value is now
used by another party. When the party no longer needs the value, it is responsible for decreasing
the reference count. Once the reference count reaches zero, we know the value is no longer needed
anywhere, and that it may be freed.
In languages like C, when you need memory for storing data for an indefinite
period of time or in a large amount, you call ``malloc`` and ``free`` to acquire
and release blocks of memory of some size. This sounds simple on the surface but
turns out to be quite tricky, mainly because the data may not be freed for as
long as it is used anywhere in the program. Sometimes this makes it unclear who
is responsible for freeing the memory, and when to do so. Failure to handle this
correctly may result in a use-after-free, double-free, or memory leak.

In PHP you usually do not need to think about memory management. The engine
takes care of allocating and freeing memory for you by tracking which values are
no longer needed. It does this by assigning a reference count to each allocated
value, often abbreviated as refcount or RC. Whenever a reference to a value is
passed somewhere else, its reference count is increased to indicate the value is
now used by another party. When the party no longer needs the value, it is
responsible for decreasing the reference count. Once the reference count reaches
zero, we know the value is no longer needed anywhere, and that it may be freed.

.. code:: php
@@ -24,18 +26,20 @@ anywhere, and that it may be freed.
unset($a); // RC 1
unset($b); // RC 0, free
Reference counting is needed for types that store auxiliary data, which are the following:
Reference counting is needed for types that store auxiliary data, which are the
following:

- Strings
- Arrays
- Objects
- References
- Resources

These are either reference types (objects, references and resources) or they are large types that
don't fit in a single ``zend_value`` directly (strings, arrays). Simpler types either don't store a
value at all (``null``, ``false``, ``true``) or their value is small enough to fit directly in
``zend_value`` (``int``, ``float``).
These are either reference types (objects, references and resources) or they are
large types that don't fit in a single ``zend_value`` directly (strings,
arrays). Simpler types either don't store a value at all (``null``, ``false``,
``true``) or their value is small enough to fit directly in ``zend_value``
(``int``, ``float``).

All of the reference counted types share a common initial struct sequence.

@@ -58,19 +62,20 @@ All of the reference counted types share a common initial struct sequence.
// ...
};
The ``zend_refcounted_h`` struct is simple. It contains the reference count, and a ``type_info``
field that repeats some of the type information that is also stored in the ``zval``, for situations
where we're not dealing with a ``zval`` directly. It also stores some additional fields, described
under `GC flags`_.
The ``zend_refcounted_h`` struct is simple. It contains the reference count, and
a ``type_info`` field that repeats some of the type information that is also
stored in the ``zval``, for situations where we're not dealing with a ``zval``
directly. It also stores some additional fields, described under `GC flags`_.
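The idea of a shared header can be illustrated with a minimal, self-contained
sketch. It is not the engine's code: ``demo_refcounted_h``, ``demo_string`` and
the helper functions are hypothetical names invented for this example, and the
layout is simplified. It only shows how a structure that begins with a
reference count header can be shared and released.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a refcount header: a count plus type info. */
typedef struct demo_refcounted_h {
    uint32_t refcount;
    uint32_t type_info;
} demo_refcounted_h;

/* A reference counted string. The header is the first member, so every
 * reference counted type in this sketch starts with the same fields. */
typedef struct demo_string {
    demo_refcounted_h gc;
    size_t len;
    char val[];
} demo_string;

/* Allocate a new string with an initial reference count of 1. */
demo_string *demo_string_new(const char *src)
{
    size_t len = strlen(src);
    demo_string *s = malloc(sizeof(*s) + len + 1);
    if (s == NULL) {
        return NULL;
    }
    s->gc.refcount = 1;
    s->gc.type_info = 0;
    s->len = len;
    memcpy(s->val, src, len + 1);
    return s;
}

/* Record that another party now uses the value. */
void demo_addref(demo_refcounted_h *gc)
{
    gc->refcount++;
}

/* Drop one reference. Frees the string and returns 1 when the count
 * reaches zero, otherwise returns 0. */
int demo_release(demo_string *s)
{
    assert(s->gc.refcount > 0);
    if (--s->gc.refcount == 0) {
        free(s);
        return 1;
    }
    return 0;
}
```

In this sketch, creating a string, calling ``demo_addref`` once and then
``demo_release`` twice mirrors the ``RC 1``, ``RC 2``, ``RC 1``, ``RC 0, free``
sequence from the PHP example.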

********
Macros
********

As with ``zval``, ``zend_refcounted_h`` members should not be accessed directly. Instead, you should
use the provided macros. There are macros that work with reference counted types directly, prefixed
with ``GC_``, or macros that work on ``zval`` values, usually prefixed with ``Z_``. Unfortunately,
naming is not always consistent.
As with ``zval``, ``zend_refcounted_h`` members should not be accessed directly.
Instead, you should use the provided macros. There are macros that work with
reference counted types directly, prefixed with ``GC_``, or macros that work on
``zval`` values, usually prefixed with ``Z_``. Unfortunately, naming is not
always consistent.

.. list-table:: ``zval`` macros
:header-rows: 1
@@ -93,12 +98,14 @@ naming is not always consistent.

- - ``zval_ptr_dtor``
- Yes
- Decreases the reference count and frees the value if the reference count reaches zero.
- Decreases the reference count and frees the value if the reference
count reaches zero.

.. [#non-rc]
Whether the macro works with non-reference counted types. If it does, the operation is usually a
no-op. If it does not, using the macro on these values is undefined behavior.
Whether the macro works with non-reference counted types. If it does, the
operation is usually a no-op. If it does not, using the macro on these values is
undefined behavior.
.. list-table:: ``zend_refcounted_h`` macros
:header-rows: 1
@@ -121,27 +128,32 @@ naming is not always consistent.

- - ``GC_DTOR[_P]``
- Yes
- Decreases the reference count and frees the value if the reference count reaches zero.
- Decreases the reference count and frees the value if the reference
count reaches zero.

.. [#immutable]
Whether the macro works with immutable types, described under `Immutable reference counted types`_.
Whether the macro works with immutable types, described under `Immutable
reference counted types`_.
************
Separation
************

PHP has value and reference types. Reference types are types that are shared through a reference, a
"pointer" to the value, rather than the value itself. Modifying such a value in one place changes it
for all of its observers. For example, writing to a property changes the property in every place the
object is referenced. Value types, on the other hand, are copied when passed to another party.
Modifying the original value does not affect the copy, and vice versa.

In PHP, arrays and strings are value types. Since they are also reference counted types, this
requires some special care when modifying values. In particular, we need to make sure that modifying
the value is not observable from other places. Modifying a value with RC 1 is unproblematic, since
we are the value's sole owner. However, if the value has a reference count of >1, we need to create a
fresh copy before modifying it. This process is called separation or CoW (copy on write).
PHP has value and reference types. Reference types are types that are shared
through a reference, a "pointer" to the value, rather than the value itself.
Modifying such a value in one place changes it for all of its observers. For
example, writing to a property changes the property in every place the object is
referenced. Value types, on the other hand, are copied when passed to another
party. Modifying the original value does not affect the copy, and vice versa.

In PHP, arrays and strings are value types. Since they are also reference
counted types, this requires some special care when modifying values. In
particular, we need to make sure that modifying the value is not observable from
other places. Modifying a value with RC 1 is unproblematic, since we are the
value's sole owner. However, if the value has a reference count of >1, we need
to create a fresh copy before modifying it. This process is called separation
or CoW (copy on write).
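Separation can be sketched in plain C as follows. This is an illustration
under simplifying assumptions, not the engine's implementation: ``cow_buf`` and
the ``cow_*`` functions are hypothetical names made up for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reference counted byte buffer with value semantics. */
typedef struct cow_buf {
    unsigned refcount;
    size_t len;
    char data[];
} cow_buf;

/* Allocate a buffer holding a copy of src, with a reference count of 1. */
cow_buf *cow_new(const char *src)
{
    size_t len = strlen(src);
    cow_buf *b = malloc(sizeof(*b) + len + 1);
    if (b == NULL) {
        return NULL;
    }
    b->refcount = 1;
    b->len = len;
    memcpy(b->data, src, len + 1);
    return b;
}

/* "Copy" the value cheaply by sharing it and bumping the count. */
cow_buf *cow_share(cow_buf *b)
{
    b->refcount++;
    return b;
}

/* Return a buffer that is safe to modify. A sole owner gets the same
 * buffer back. Otherwise a fresh copy with a count of 1 is created and
 * the caller's reference to the shared buffer is dropped. */
cow_buf *cow_separate(cow_buf *b)
{
    if (b->refcount == 1) {
        return b;
    }
    cow_buf *copy = malloc(sizeof(*copy) + b->len + 1);
    if (copy == NULL) {
        return NULL;
    }
    copy->refcount = 1;
    copy->len = b->len;
    memcpy(copy->data, b->data, b->len + 1);
    b->refcount--;
    return copy;
}

/* Drop one reference and free the buffer when none remain. */
void cow_release(cow_buf *b)
{
    assert(b->refcount > 0);
    if (--b->refcount == 0) {
        free(b);
    }
}
```

A writer in this sketch always calls ``cow_separate`` first and then modifies
the returned buffer, so a write through one handle is never visible through
another handle that shared the original buffer.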

.. code:: php
@@ -155,20 +167,24 @@ fresh copy before modifying it. This process is called separation or CoW (copy o
Immutable reference counted types
***********************************

Sometimes, even a reference counted type is not reference counted. When PHP runs in a multi-process
or multi-threaded environment with opcache enabled, it shares some common values between processes
or threads to reduce memory consumption. As you may know, sharing memory between processes or
threads can be tricky and requires special care when modifying values. In particular, modification
usually requires exclusive access to the memory so that the other processes or threads wait until
the value is done being updated. In this case, this synchronization is avoided by making the value
immutable and never modifying the reference count. Such values will receive the ``GC_IMMUTABLE``
flag in their ``gc->u.type_info`` field.

Some macros, like ``GC_TRY_ADDREF``, guard against immutable values. Others, like ``GC_ADDREF``,
must not be used on immutable values: doing so results in undefined behavior, because the macro does
not check whether the value is immutable before modifying the reference count. You may execute PHP
with the ``-d opcache.protect_memory=1`` flag to mark the shared memory as read-only and trigger a
hardware exception if the code accidentally attempts to modify it.
Sometimes, even a reference counted type is not reference counted. When PHP runs
in a multi-process or multi-threaded environment with opcache enabled, it shares
some common values between processes or threads to reduce memory consumption. As
you may know, sharing memory between processes or threads can be tricky and
requires special care when modifying values. In particular, modification usually
requires exclusive access to the memory so that the other processes or threads
wait until the value is done being updated. In this case, this synchronization
is avoided by making the value immutable and never modifying the reference
count. Such values will receive the ``GC_IMMUTABLE`` flag in their
``gc->u.type_info`` field.

Some macros, like ``GC_TRY_ADDREF``, guard against immutable values. Others,
like ``GC_ADDREF``, must not be used on immutable values: doing so results in
undefined behavior, because the macro does not check whether the value is
immutable before modifying the reference count. You may execute PHP with the
``-d opcache.protect_memory=1`` flag to mark the shared memory as read-only and
trigger a hardware exception if the code accidentally attempts to modify it.
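The difference between a guarded and an unguarded increment can be sketched as
follows. The names ``demo_gc``, ``DEMO_IMMUTABLE``, ``demo_addref`` and
``demo_try_addref`` are hypothetical, and the bit position of the flag is an
assumption made for the example. The real macros may be implemented
differently.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed flag bit, for illustration only. */
#define DEMO_IMMUTABLE (1u << 6)

/* Hypothetical reference count header. */
typedef struct demo_gc {
    uint32_t refcount;
    uint32_t type_info;
} demo_gc;

/* Unguarded increment: the caller must guarantee that the value is not
 * immutable. This sketch asserts it. An unguarded macro would simply
 * write to the (possibly shared, read-only) memory. */
void demo_addref(demo_gc *gc)
{
    assert(!(gc->type_info & DEMO_IMMUTABLE));
    gc->refcount++;
}

/* Guarded increment: leaves immutable values untouched, so shared
 * memory is never written to. */
void demo_try_addref(demo_gc *gc)
{
    if (!(gc->type_info & DEMO_IMMUTABLE)) {
        gc->refcount++;
    }
}
```

With this sketch, ``demo_try_addref`` is safe to call on any value, while
``demo_addref`` is only appropriate when the caller already knows the value is
mutable.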

*****************
Cycle collector
@@ -185,14 +201,15 @@ Sometimes, reference counting is not enough. Consider the following example:
unset($a);
unset($b);
When this code finishes, the reference count of both instances of ``stdClass`` will still be 1, as
they reference each other. This is called a reference cycle.
When this code finishes, the reference count of both instances of ``stdClass``
will still be 1, as they reference each other. This is called a reference cycle.

PHP implements a cycle collector that detects such cycles and frees values that are only reachable
through their own references. The cycle collector will record values that may be involved in a
cycle, and run when this buffer becomes full. It is also possible to invoke it explicitly by calling
the ``gc_collect_cycles()`` function. The cycle collector's design is described in the `Cycle
collector <todo>`_ chapter.
PHP implements a cycle collector that detects such cycles and frees values that
are only reachable through their own references. The cycle collector will record
values that may be involved in a cycle, and run when this buffer becomes full.
It is also possible to invoke it explicitly by calling the
``gc_collect_cycles()`` function. The cycle collector's design is described in
the `Cycle collector <todo>`_ chapter.

**********
GC flags
@@ -207,22 +224,23 @@ collector <todo>`_ chapter.
#define GC_PERSISTENT (1<<7) /* allocated using malloc */
#define GC_PERSISTENT_LOCAL (1<<8) /* persistent, but thread-local */
The ``GC_NOT_COLLECTABLE`` flag indicates that the value may not be involved in a reference cycle.
This allows for a fast way to detect values that don't need to be added to the cycle collector
buffer. Only arrays and objects may actually be involved in reference cycles.
The ``GC_NOT_COLLECTABLE`` flag indicates that the value may not be involved in
a reference cycle. This allows for a fast way to detect values that don't need
to be added to the cycle collector buffer. Only arrays and objects may actually
be involved in reference cycles.

The ``GC_PROTECTED`` flag is used to protect against recursion in various internal functions. For
example, ``var_dump`` recursively prints the contents of values, and marks visited values with the
``GC_PROTECTED`` flag. If the value is recursive, it prevents the same value from being visited
again.
The ``GC_PROTECTED`` flag is used to protect against recursion in various
internal functions. For example, ``var_dump`` recursively prints the contents of
values, and marks visited values with the ``GC_PROTECTED`` flag. If the value is
recursive, it prevents the same value from being visited again.

``GC_IMMUTABLE`` has been discussed in `Immutable reference counted types`_.

The ``GC_PERSISTENT`` flag indicates that the value was allocated using ``malloc``, instead of PHP's
own allocator. Usually, such values are alive for the entire lifetime of the process, instead of
being freed at the end of the request. See the `Zend allocator <todo>`_ chapter for more
information.
The ``GC_PERSISTENT`` flag indicates that the value was allocated using
``malloc``, instead of PHP's own allocator. Usually, such values are alive for
the entire lifetime of the process, instead of being freed at the end of the
request. See the `Zend allocator <todo>`_ chapter for more information.

The ``GC_PERSISTENT_LOCAL`` flag indicates that a ``GC_PERSISTENT`` value is only accessible in one
thread, and is thus still safe to modify. This flag is only used in debug builds to satisfy an
``assert``.
The ``GC_PERSISTENT_LOCAL`` flag indicates that a ``GC_PERSISTENT`` value is
only accessible in one thread, and is thus still safe to modify. This flag is
only used in debug builds to satisfy an ``assert``.
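The recursion protection described for ``GC_PROTECTED`` can be sketched with a
small, self-contained example. ``demo_node``, ``DEMO_PROTECTED`` and
``demo_count`` are hypothetical names, and the bit position is an assumption
made for illustration. The sketch only shows the general mark, recurse, unmark
pattern, not how ``var_dump`` is actually implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed flag bit, for illustration only. */
#define DEMO_PROTECTED (1u << 5)

/* Hypothetical value that may point back to one of its ancestors. */
typedef struct demo_node {
    uint32_t flags;
    struct demo_node *child;
} demo_node;

/* Count the nodes reachable through child pointers. A node that is
 * already marked is part of the current traversal, so it is skipped
 * instead of being visited again. */
int demo_count(demo_node *n)
{
    if (n == NULL) {
        return 0;
    }
    if (n->flags & DEMO_PROTECTED) {
        return 0; /* recursion detected */
    }
    n->flags |= DEMO_PROTECTED;  /* mark as being visited */
    int total = 1 + demo_count(n->child);
    n->flags &= ~DEMO_PROTECTED; /* unmark on the way out */
    return total;
}
```

Without the flag check, calling ``demo_count`` on two nodes that point at each
other would recurse forever. With it, each node on the cycle is counted once
and the flags are cleared again afterwards.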