Skip to content

Conversation

@Andy-Jost
Copy link
Contributor

Summary

This PR adds a comprehensive style guide for Python and Cython code in cuda/core/experimental. The guide establishes clear conventions and best practices to maintain code quality and consistency across the codebase.

What's Included

The style guide covers 18 major areas:

  1. File Structure - Ordering of elements (SPDX headers, imports, __all__, classes, functions)
  2. Package Layout - Organization of .pyx, .pxd, and .py files, subpackage patterns
  3. Import Statements - Five-group organization with alphabetical sorting
  4. Class and Function Definitions - Ordering within classes (dunder methods, methods, properties)
  5. Naming Conventions - PascalCase, snake_case, UPPER_SNAKE_CASE patterns
  6. Type Annotations - Modern PEP 604 union syntax (X | Y | None), forward references
  7. Docstrings - NumPy docstring style with comprehensive examples
  8. Errors and Warnings - Custom exceptions, error messages, warning patterns
  9. Memory Management - Resource lifecycle and cleanup patterns
  10. Thread Safety and Concurrency - Locks, thread-local storage patterns
  11. Cython-Specific Features - cdef/cpdef/def, nogil, inline functions
  12. Constants and Magic Numbers - Naming conventions, CUDA constants
  13. Comments and Inline Documentation - TODO, NOTE patterns
  14. Code Organization Within Files - Structure and helper function placement
  15. Performance Considerations - GIL management, C types, optimization guidelines
  16. API Design Principles - Public vs private API, backward compatibility
  17. CUDA-Specific Patterns - GIL management for CUDA driver API calls
  18. Copyright and Licensing - SPDX format requirements

Key Principles

  • PEP 8 Base: Follows PEP 8 as the normative base, highlighting project-specific rules
  • Modern Python: Promotes PEP 604 (X | Y union syntax) and PEP 563 (from __future__ import annotations)
  • Codebase-Aligned: Examples reflect actual patterns from the codebase
  • Forward-Looking: Documents current patterns while promoting migration to modern best practices

Benefits

  • Consistency: Establishes clear conventions for all contributors
  • Onboarding: Helps new contributors understand codebase patterns quickly
  • Code Review: Provides objective standards for reviewing code
  • Maintainability: Reduces cognitive load by standardizing structure and style

File Added

  • cuda/core/style-guide.md (1,818 lines)

Next Steps

After merge, this guide can be:

  • Referenced in code reviews
  • Used as a checklist for new code
  • Gradually applied to existing code during refactoring
  • Extended with additional patterns as needed

@copy-pr-bot
Copy link
Contributor

copy-pr-bot bot commented Dec 4, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@Andy-Jost Andy-Jost added the documentation Improvements or additions to documentation label Dec 4, 2025
@Andy-Jost Andy-Jost self-assigned this Dec 4, 2025
@Andy-Jost Andy-Jost added the cuda.core Everything related to the cuda.core module label Dec 4, 2025
@Andy-Jost Andy-Jost added this to the cuda-python 13-next, 12-next milestone Dec 4, 2025
@Andy-Jost Andy-Jost requested review from keenan-simpson and removed request for keenan-simpson December 4, 2025 20:50
@Andy-Jost
Copy link
Contributor Author

Link to rendered document: style-guide.md

@Andy-Jost
Copy link
Contributor Author

I'm interested in feedback on this idea. With a comprehensive style guide, aside from the other benefits listed in the description, we can use LLMs to help refactor, clean, and document the code. If the team is comfortable with this approach, I can begin to apply rules conservatively to the code base.

@Andy-Jost Andy-Jost removed the request for review from keenan-simpson December 4, 2025 20:56
@Andy-Jost Andy-Jost removed this from the cuda-python 13-next, 12-next milestone Dec 4, 2025
This commit adds a complete style guide covering conventions for Python
and Cython code in cuda/core/experimental. The guide includes:

- File structure and organization (SPDX headers, imports, __all__, ordering)
- Package layout (.pyx/.pxd/.py files, subpackage patterns)
- Import statement organization (5 groups with alphabetical sorting)
- Class and function definition ordering (dunder methods, methods, properties)
- Naming conventions (PascalCase, snake_case, UPPER_SNAKE_CASE)
- Type annotations (PEP 604 union syntax, forward references)
- Docstrings (NumPy style with comprehensive examples)
- Error handling and warnings (custom exceptions, stacklevel, one-time warnings)
- Memory management (resource lifecycle, cleanup patterns)
- Thread safety and concurrency (locks, thread-local storage)
- Cython-specific features (cdef/cpdef/def, nogil, inline functions)
- Constants and magic numbers (naming, CUDA constants)
- Comments and inline documentation (TODO, NOTE patterns)
- Code organization within files
- Performance considerations (GIL management, C types)
- API design principles (public vs private, backward compatibility)
- CUDA-specific patterns (GIL management for driver API calls)
- Copyright and licensing (SPDX format)

The guide follows PEP 8 as the base and promotes modern Python practices
(PEP 604, PEP 563) while documenting current codebase patterns.
@rwgk
Copy link
Collaborator

rwgk commented Dec 5, 2025

Will look more closely asap.

Drive-by comment: I was surprised to see the new file under cuda_core/cuda/core/. I'd expect it to be under cuda_core/, like other meta-kind-of files, e.g. LICENSE, README.md, etc.

@Andy-Jost
Copy link
Contributor Author

Will look more closely asap.

Drive-by comment: I was surprised to see the new file under cuda_core/cuda/core/. I'd expect it to be under cuda_core/, like other meta-kind-of files, e.g. LICENSE, README.md, etc.

I thought about that, too. I agree it belongs at the root.

Document the two-phase development approach:
- Phase 1: Start with Python driver implementation and tests
- Phase 2: Optimize by switching to cydriver with nogil blocks

Includes step-by-step conversion guide and before/after examples.
@rparolin
Copy link
Collaborator

rparolin commented Dec 5, 2025

Will look more closely asap.
Drive-by comment: I was surprised to see the new file under cuda_core/cuda/core/. I'd expect it to be under cuda_core/, like other meta-kind-of files, e.g. LICENSE, README.md, etc.

I thought about that, too. I agree it belongs at the root.

Do we want this in the root? I'd expect it within a docs folder. That shouldn't impact an LLMs ability to discover and utilize it.

Copy link
Contributor

@cpcloud cpcloud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These aren't blocking, but I have a few concerns:

  1. I am a bit concerned about the team all using various agents with this an getting a variety of different results, with no way to actually enforce any of these guidelines without the human effort of checking the guide, and comparing a give changeset to it.
  2. Given the extensive guidelines, I think this will actually add human time for: 1) waiting for the LLM to take action, 2) babysitting it when it, as expected, doesn't follow the guidelines in an acceptable-to-the-author way.


Files in `cuda/core/experimental` must follow a consistent structure. The ordering of elements within a file is as follows:

### 1. SPDX Copyright Header
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm -1 on including things in this guide that can be automated, and I realize that results in a lack of completeness.

In fact, I'd rather our lint rule for this particular thing just add it to the top of the file when it fails.

I think this can be done for all file types, including pyx files.


### 3. `__all__` Declaration

If the module exports public API elements, include an `__all__` list after the imports and before any other code. This explicitly defines the public API of the module.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is slightly misleading, since __all__ is about what happens when you star-import a module.

+1 on defining this, but stating it defines the public API I would remove from the description.

LEGACY_DEFAULT_STREAM = C_LEGACY_DEFAULT_STREAM
```

### 5. Principal Class or Function
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit mixed on enforcing ordering of elements here. I do think it can help cut review time, but in practice I find that this kind of strict ordering only starts mattering when files get huge, to which I would probably say "Split the file up" 😆

- Not every file will have all sections. For example, a utility module may not have a principal class.
- The distinction between "principal" and "other" classes is based on the file's primary purpose. If a file exists primarily to define one class, that class is the principal class.
- Private implementation functions should be placed at the end of the file to keep the public API visible at the top.
- **Within each section**, classes and functions should be sorted alphabetically by name. The principal class or function is an exception to this rule, as it appears first in its respective section.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this should be enforced without a linter. It feels a bit draconian to me to require someone to reorder their methods.

For example, I like to order methods by "relevance radius", which is a squishy concept that is something like "methods that are related to each other are close to each other". I find this makes debugging way easier, if only because I can see more related code in a single vim window.


2. **Keep `.pxd` files minimal**: Only include declarations needed for Cython compilation. Omit implementation details, docstrings, and Python-only code.

3. **Use `__all__` in submodules**: Each submodule should define `__all__` to explicitly declare its public API.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The public API bit seems to be repeated in a bunch of places, and it's misleading. That said, I'm unsure of a better way to phrase it, but perhaps there's a slightly less "sure" way of stating it?

How about just "Each submodule should define __all__"?


### Implementation Comments

Add comments to explain complex logic or non-obvious behavior:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really question whether this is useful for humans. I suppose it may be useful for an LLM 🤷🏻


### Prefer `cdef` for Internal Functions

Use `cdef` functions for internal operations that don't need to be callable from Python:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this already specified above?

Comment on lines +1563 to +1568
result = []
for i in range(n):
result.append(i)

# Preferred: Use C array or pre-allocate
cdef int* c_result = <int*>malloc(n * sizeof(int))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seeing as how these are not equivalent, it seems odd to say "prefer X over Y" when X and Y are very different. Sometimes you want a list, sometimes you don't need it and can use a C array.

Use naming conventions to distinguish public and private APIs:

- **Public API**: No leading underscore, documented in docstrings, included in `__all__`
- **Private API**: Leading underscore (`_`), may have minimal documentation, not in `__all__`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since __all__ only affects star import behavior, this is redundant, since leading underscore names are not included in star imports.


### Guidelines

1. **Placement**: The copyright header must be the first lines of the file, before any imports or other code.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this covered by the "all files must include a copyright header"?

@leofang leofang linked an issue Dec 6, 2025 that may be closed by this pull request
Copy link
Contributor

@mdboom mdboom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a defacto standard for where to put this kind of stuff and tell the IDE where and when to apply it.

- **Abstract base classes**: ABCs that define interfaces (e.g., `MemoryResource` in `_buffer.pyx`)
- **Other public classes**: Additional classes exported by the module

**All classes and functions in this section should be sorted alphabetically by name**, regardless of their relationship to the principal class. The principal class appears first as an exception to this rule.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-1 on alphabetical ordering. I think we should reflect the logical relationships between classes instead (which I understand is a bit mushy as a concept). It's easy enough in an IDE to jump to the one you are looking for using search.


- Functions with names starting with `_` (private)
- `cdef inline` functions used for internal implementation
- Helper functions not part of the public API
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose this is fine, but seems backward to me, given that languages like C have always required forward declarations to do things like this, code tends to be ordered from low-level to high-level, with the "principal public thing" at the bottom, not the top. I realize Python has never had this problem, but it does seems reversed from almost anything else.

The `cuda/core/experimental` package uses three types of files:

1. **`.pyx` files**: Cython implementation files containing the actual code
2. **`.pxd` files**: Cython declaration files containing type definitions and function signatures for C-level access
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(For the future): We really should be requiring that all .pyx files have a corresponding .pyi file. Without it, modern IDEs (which don't actually import modules) can not perform auto completion and cross referencing. But we can add that later as an effort to improve the IDE experience.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In 2025, is there a way we can auto-generate .pyi from .py/.pxd/.pyx? If not, let's not make it a must.

One thing I hate the most is Python typing. Does not make any sense to me to maintain 2-3 files.

For each `.pyx` file that defines classes or functions used by other Cython modules, create a corresponding `.pxd` file:

- **`.pxd` file**: Contains `cdef` class declarations, `cdef`/`cpdef` function signatures, and `cdef` attribute declarations
- **`.pyx` file**: Contains the full implementation including Python methods, docstrings, and implementation details
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(As above): The better way is to include docstrings and type annotations in a .pyi file and just implementation in the .pyx file.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think Cython would honor .pyi files? To the very least, I don't think it is friendly to write .pyi files when the types are needed to complete a cdef function?


6. **Separate concerns**: Use `.py` files for pure Python utilities, `.pyx` files for Cython implementations that need C-level performance.

## Import Statements
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I were writing this for a human, I would take this section (and some other sections below that are basically "PEP8") and just link to them so we aren't maintaining duplication of community standards. If the LLM can handle something like that, I think we should do that here. Given "agents", shouldn't they just be able to run ruff?


#### Cython `cdef` Variables

Consider prefixing `cdef` variables with `c_` to distinguish them from Python variables. This improves code readability by making it clear which variables are C-level types.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍


```python
LEGACY_DEFAULT_STREAM = C_LEGACY_DEFAULT_STREAM
PER_THREAD_DEFAULT_STREAM = C_PER_THREAD_DEFAULT_STREAM
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additionally, related constants should be grouped into enums.

def __dealloc__(self):
"""Automatic cleanup when object is garbage collected."""
DMR_close(self)
```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we recommend that all such objects that require cleanup define a context manager, i.e. __enter__ and __exit__ methods?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not for now. I can write a section explaining the current rationale in a separate PR if it is OK.


5. **Use stream-ordered deallocation**: When deallocating buffers, use the appropriate stream for asynchronous cleanup to avoid blocking operations.

6. **Track resource ownership**: Clearly document which objects own CUDA handles and are responsible for cleanup.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we do better than just documenting, by specifying all objects contain a strong reference to their owner to prevent use-after-free errors etc.?

EDIT: Though maybe for a style guide this is too complex of an issue. For style, documenting that such a situation exists is the first requirement for actually patching these sort of holes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, some of the things I'd like to document would more properly belong in a "Developer's Guide." I am therefore considering whether to expand the scope of this document. I'd restructure it into sections where "Style" is one section.


### Inline Functions

Use `inline` for small, frequently-called functions:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for Cython, which is not a modern C compiler ;) IME, sometimes the C compiler can see through it and make the right choice and other times not. The salient point for me is that we would ideally backup the choice of inline with a benchmark.

@leofang
Copy link
Member

leofang commented Dec 9, 2025

Merging my offline comment to here. My 2c:

  • If possible I’d like to make part (or all) of this PR rendered as Sphinx docs, for both internal and external contributors to use as developer guide
  • I’d also like all LLM to be able to read it when working on the codebase, including but not limited to: Cursor, CodeRabbit, Copilot, …

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cuda.core Everything related to the cuda.core module documentation Improvements or additions to documentation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Update cuda.core developer guide

6 participants