Skip to content
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
112 changes: 112 additions & 0 deletions docs/decisions/0025-standardize-serializer-usage.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,112 @@
Standardize Serializer Usage Across APIs
========================================

:Status: Proposed
:Date: 2026-03-09
:Deciders: API Working Group
:Technical Story: Open edX REST API Standards - Serializer standardization for consistency

Context
-------

Many Open edX platform API endpoints manually construct JSON responses using Python dictionaries instead of Django REST Framework (DRF) serializers. This leads to inconsistent schema responses, makes validation errors harder to manage, and creates unpredictable formats that AI and third-party systems struggle with.

Decision
--------

We will standardize all Open edX REST APIs to use **DRF serializers** for request and response handling.

Implementation requirements:

* All API views MUST define explicit serializers for request and response handling.
* Replace manual JSON construction with serializer-based responses.
* Use serializers for both input validation and output formatting.
* Ensure serializers are properly documented with field descriptions and validation rules.
* Maintain backward compatibility for all APIs during migration.

Relevance in edx-platform
-------------------------

Current patterns that should be migrated:

* **Certificates API** (``/api/certificates/v0/``) constructs JSON manually with nested dictionaries.
* **Enrollment API** endpoints manually build response objects without serializers.
* **Course API** views use hand-coded JSON responses instead of structured serializers.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think most of the course API uses serializers, but I did find one that needs fixed.

def get_block_metadata(block, includes=()):
"""Get metadata about the specified XBlock."""
data = {
"id": str(block.scope_ids.usage_id),
"type": block.scope_ids.block_type,
}
if "index_dictionary" in includes:
data["index_dictionary"] = block.index_dictionary()
return data


Code example (target serializer usage)
--------------------------------------

**Example serializer and APIView using DRF best practices:**

.. code-block:: python

# serializers.py
from rest_framework import serializers

class CertificateSerializer(serializers.Serializer):
username = serializers.CharField(
help_text="The username of the certificate holder"
)
course_id = serializers.CharField(
help_text="The course identifier"
)
status = serializers.CharField(
help_text="The certificate status (e.g., downloadable, generating)"
)
grade = serializers.FloatField(
help_text="The final grade achieved"
)

# views.py
from rest_framework.views import APIView
from rest_framework.response import Response
from rest_framework import status

class CertificateAPIView(APIView):
def get(self, request):
data = {
"username": "john_doe",
"course_id": "course-v1:edX+DemoX+1T2024",
"status": "downloadable",
"grade": 0.95,
}
serializer = CertificateSerializer(data)
return Response(serializer.data, status=status.HTTP_200_OK)

Consequences
------------

Positive
~~~~~~~~

* Simplifies validation and ensures consistent response contracts.
* Improves AI compatibility through predictable data structures.
* Enables automatic schema generation and documentation.
* Reduces code duplication and maintenance overhead.

Negative / Trade-offs
~~~~~~~~~~~~~~~~~~~~~

* Requires refactoring existing endpoints that manually construct JSON.
* Initial development overhead for creating comprehensive serializers.
* May require updates to existing client code that expects legacy formats.

Alternatives Considered
-----------------------

* **Keep manual JSON construction**: rejected due to inconsistency and maintenance burden.
* **Use DRF defaults only**: rejected because explicit serializers provide better validation and documentation.
Comment on lines +97 to +98
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about using Pydantic models and/or dataclasses? Both are much nicer to work with than DRF Serializers. I believe drf-spectacular already supports Pydantic models, and djangorestframework-dataclasses would support dataclasses.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point — Pydantic(drf-pydantic) and dataclasses(djangorestframework-dataclasses) are definitely nice to work with.
But the intent of this ADR, however, is mainly to standardize APIs within the existing Django REST Framework ecosystem, which edx-platform already relies on heavily. DRF serializers integrate directly with request parsing, validation, responses, and schema generation. Introducing wrapper libraries Pydantic/dataclasses would likely require additional translation layers and introduce additional dependencies and patterns that the platform would need to maintain long-term(shifting existing APIs from using drf serializers and use Pydantic/dataclasses instead), whereas serializers fit naturally into the current DRF stack and keeps things simpler and consistent with existing openedx-platform APIs.
That said, I am still open to discussion like what would be the best path to approach standardization & consistency across the platform.
@feanil what do you suggest on this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now we are in a situation where some APIs use serializers and others build response documents more manually. I like the idea of moving to newer easier to work tools but to fully do that we would need to migrate both sets of existing implementations no the 3rd new implementation.

@Faraz32123 how much more work would that be and how quickly could we do it?

I'd love to move to dataclasses since they offer better ergonomics but I'm worried about continuing to run in a mixed state moving forward.

Alternative option: If we decide to only migrate the very oldest non-serializer use to Dataclasses, could we also use linting or import linting to prevent the creation of new standard DRF serializers? eg. We leapfrog to the future we want for the oldest code, we make it so future new code doesn't use Serializers classes directly, we do future work to migrate the rest of the code to dataclasses.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that we rely heavily on DRF, and a much bigger priority should be making all our APIs consistent(ly using DRF), more so than migrating to newer tech.

It's also true that pydantic/dataclasses don't integrate so well with DRF, and are a better fit with newer frameworks like django-ninja. If we were starting from scratch, I would push strongly for django-ninja.

But since that's not the case, I think recommending only DRF serializers could be a reasonable recommendation. I just want to make sure we've considered the newer alternatives and made an explicit decision in this ADR. If there's an easy way to bring in dataclasses or pydantic alongside serializers and slowly convert everything to dataclasses, I do think that would be better. But if it creates a mess, let's avoid it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, @Faraz32123 maybe you can quickly dig into feasibility here and then we can either document it as a rejected alternative or potentially update our plan.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd love to move to dataclasses since they offer better ergonomics but I'm worried about continuing to run in a mixed state moving forward.

Yup, these native python libraries are fast & easier to work with But yeah I was also worried about the same mixed state.

Alternative option: If we decide to only migrate the very oldest non-serializer use to Dataclasses, could we also use linting or import linting to prevent the creation of new standard DRF serializers? eg. We leapfrog to the future we want for the oldest code, we make it so future new code doesn't use Serializers classes directly, we do future work to migrate the rest of the code to dataclasses.

I like your idea of using linting to prevent new standard DRF serializers. It would enforce the “future direction” while giving us breathing room to gradually migrate the rest.

@Faraz32123 how much more work would that be and how quickly could we do it?

For migrating to dataclasses, the main work would be:

  1. Oldest manual-response APIs - updating them to use DataclassSerializer for type safety, validation, and schema integration.
  2. Longer-term migration of existing DRF serializers - as many endpoints rely on serializers and some have nested or complex validation.
    Using linting/import rules, we can prevent new DRF serializers in new code as you have suggested, so we don’t increase mixed-state complexity moving forward.

Agreed, @Faraz32123 maybe you can quickly dig into feasibility here and then we can either document it as a rejected alternative or potentially update our plan.

@feanil, To properly answer that, we’d need to scope the APIs across the platform and estimate roughly how many fall into each category. It would probably take around 2–3 days to get a reasonable estimate of the effort and feasibility. Let me know if that’s what you had in mind, then we can look into it. Based on the findings, we can then decide whether to document it as a rejected alternative or update our plan.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to understand if there are any technical issues with moving to Dataclasses using the djangorestframework-dataclasses library. I also want to know how hard would it be to prevent usage of the default serializers across the platform moving forward. Is it as simple as adding a new entry to the import linter for the base serializer classes coming from DRF and adding all the existing files to the exception list for now? Or would it need to be more complicated than that?

I don't think we need to understand how much effort it would be to convert all the existing serializers to dataclasses at this point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a quick look at these,

I want to understand if there are any technical issues with moving to Dataclasses using the djangorestframework-dataclasses library.

djangorestframework-dataclasses with Django REST Framework is generally feasible because it integrates through DataclassSerializer, which behaves similarly to normal DRF serializers. There are no fundamental technical blockers, but migration would require checking:

  • nested serializers
  • complex validation logic
  • ModelSerializer(that automatically maps Django models whereas Dataclasses do not automatically mirror Django models) heavy endpoints

I also want to know how hard would it be to prevent usage of the default serializers across the platform moving forward. Is it as simple as adding a new entry to the import linter for the base serializer classes coming from DRF and adding all the existing files to the exception list for now? Or would it need to be more complicated than that?

I think its more complicated than just using our existing approach using import-linter(Link). I tried prototyping this with our existing import-linter setup. Unfortunately import-linter cannot forbid submodules of external packages, so we can't directly block rest_framework.serializers. It can only block top-level packages like rest_framework, which would also prevent legitimate DRF imports such as APIView, Response, or status.


Rollout Plan
------------

1. Audit existing endpoints to identify those using manual JSON construction.
2. Create a library of common serializers for shared data structures.
3. Migrate high-impact endpoints first (certificates, enrollment, courses).
4. Update tests to validate serializer-based responses.
5. Update API documentation to reflect new serializer-based contracts.

References
----------

* Open edX REST API Standards: "Serializer Usage" recommendations for API consistency.
Loading