
feat(sw): add I4 (4-bit indexed) destination support to software blender #10027

Open
fluffyspace wants to merge 1 commit into lvgl:master from fluffyspace:feat/sw-blend-i4-destination


@fluffyspace

Summary

Adds LV_COLOR_FORMAT_I4 as a render target for the software rasterizer, mirroring the existing I1 / L8 destination handlers. LVGL 9.5 already defines LV_COLOR_FORMAT_I4 = 0x09 as an image-asset format, but src/draw/sw/blend/lv_draw_sw_blend.c has no case LV_COLOR_FORMAT_I4 in either lv_draw_sw_blend_color() or lv_draw_sw_blend_image(), and there is no lv_draw_sw_blend_to_i4.c alongside lv_draw_sw_blend_to_i1.c / lv_draw_sw_blend_to_l8.c. This PR fills that gap.

Motivation

Indexed-color AMOLED controllers (e.g. CO5300, several Sitronix and Solomon Systech parts) accept a 4-bit-per-pixel framebuffer with a 16-entry user-defined palette loaded into the panel's COLSET registers. At typical sizes (e.g. 410x502) a full I4 framebuffer is ~25% the size of the equivalent RGB565 buffer — small enough to fit a single full-frame buffer in the SRAM of MCUs that cannot hold an RGB565 framebuffer (e.g. ESP32-C6, 512 KB SRAM, no PSRAM controller). That unlocks a single-FB tear-free render path on these targets.
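The size comparison above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming rows are padded only to whole bytes; `fb_size_bytes` is a hypothetical helper, not an LVGL function, and a real draw buffer may add further stride alignment.

```c
#include <stdint.h>

/* Hypothetical helper: bytes needed for one full framebuffer when each
 * row is rounded up to a whole number of bytes. */
static uint32_t fb_size_bytes(uint32_t width, uint32_t height, uint32_t bpp)
{
    uint32_t stride = (width * bpp + 7u) / 8u;  /* bytes per row, rounded up */
    return stride * height;
}
```

For a 410x502 panel this gives `fb_size_bytes(410, 502, 4) == 102910` for I4 and `fb_size_bytes(410, 502, 16) == 411640` for RGB565, i.e. roughly 100 KB versus 402 KB.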

The downstream consumer of this PR is a watch firmware on an ESP32-C6 + CO5300 (Waveshare 2.06" AMOLED). CO5300 datasheet §7.5.32 / §7.5.50 / §7.5.51 describe the panel side: COLMOD 0x3A=0x33, COLOPT 0x80 with RGB4bit_en=1, palette loaded via COLSET 0x70..0x7F.

Pixel packing

Two pixels per byte, lower-x pixel in the upper nibble (D[7:4]), higher-x pixel in the lower nibble (D[3:0]) — matching the wire format these panels expect. Stride is rounded up to whole bytes ((width + 1) / 2); the existing lv_draw_buf stride math already handles 4 bpp via lv_color_format_get_bpp and needed no changes.
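A minimal sketch of this packing, assuming each row starts byte-aligned at x = 0. The helper names (`i4_stride`, `i4_get_px`, `i4_set_px`) are illustrative and are not the functions added by this PR.

```c
#include <stdint.h>

/* Bytes per row for a 4 bpp buffer: two pixels per byte, rounded up. */
static inline uint32_t i4_stride(uint32_t width)
{
    return (width + 1u) / 2u;
}

/* Read the 4-bit index of pixel x. Even x lives in the upper nibble (D[7:4]),
 * odd x in the lower nibble (D[3:0]). */
static inline uint8_t i4_get_px(const uint8_t * row, uint32_t x)
{
    uint8_t byte = row[x >> 1];
    return (x & 1u) ? (uint8_t)(byte & 0x0Fu) : (uint8_t)(byte >> 4);
}

/* Write a 4-bit index to pixel x, leaving the neighboring nibble untouched. */
static inline void i4_set_px(uint8_t * row, uint32_t x, uint8_t idx)
{
    uint8_t * p = &row[x >> 1];
    idx &= 0x0Fu;
    if(x & 1u) *p = (uint8_t)((*p & 0xF0u) | idx);
    else       *p = (uint8_t)((*p & 0x0Fu) | (uint8_t)(idx << 4));
}
```

Writing index 0xA at x = 0 and 0x5 at x = 1 into a zeroed row yields the byte 0xA5.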

Palette-binding API

I4 has 16 freely defined entries (rather than I1's fixed black/white), so the blender needs to know the palette to quantize incoming RGB / RGB565 / ARGB8888 / etc. pixels.

void                 lv_display_set_palette(lv_display_t * disp,
                                            const lv_color32_t * palette, uint32_t size);
const lv_color32_t * lv_display_get_palette(lv_display_t * disp);
uint32_t             lv_display_get_palette_size(lv_display_t * disp);

The palette is not copied — the caller owns it. The blender pulls it from lv_refr_get_disp_refreshing() at render time. If no palette is registered, a default 16-entry grayscale ramp is used so canvas / test code can render without configuration.
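Because the display stores only a pointer, the palette needs storage that outlives the display. The sketch below shows one way to prepare such a palette. `color32_t` is a stand-in for `lv_color32_t` so the snippet compiles without LVGL, and the evenly spaced ramp values are an assumption; the default ramp in this PR may use different values.

```c
#include <stdint.h>

/* Stand-in for lv_color32_t (assumed field order: blue, green, red, alpha). */
typedef struct {
    uint8_t blue;
    uint8_t green;
    uint8_t red;
    uint8_t alpha;
} color32_t;

#define PALETTE_SIZE 16

/* Static storage: the caller owns the palette, so it must stay valid
 * for as long as the display may render. */
static color32_t watch_palette[PALETTE_SIZE];

/* Fill the palette with an evenly spaced grayscale ramp (0, 17, ..., 255). */
static void palette_init_gray(color32_t * pal)
{
    for(uint32_t i = 0; i < PALETTE_SIZE; i++) {
        uint8_t v = (uint8_t)(i * 17u);   /* 15 * 17 == 255 */
        pal[i].red   = v;
        pal[i].green = v;
        pal[i].blue  = v;
        pal[i].alpha = 0xFF;
    }
}

/* With the API proposed here (and lv_color32_t instead of the stand-in),
 * registration would look like:
 *     palette_init_gray(watch_palette);
 *     lv_display_set_palette(disp, watch_palette, PALETTE_SIZE);
 */
```

A stack-allocated palette passed from a short-lived init function would leave the display with a dangling pointer, which is why the array is static in this sketch.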

Quantization is a 16-entry linear search over squared RGB distance (48 multiplies per pixel). On RV32 / Cortex-M this is well under a microsecond per pixel, so it is not the bottleneck at typical refresh rates. A precomputed RGB565→index lookup table is a possible future optimization, but it adds 32 KB of static data (65,536 colors at 4 bits each) and isn't worth the cost for the first cut.
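The linear search can be sketched as follows. `rgb_t` and `quantize_to_palette` are hypothetical names for illustration; the actual implementation in `lv_draw_sw_blend_to_i4.c` may differ in types and tie-breaking.

```c
#include <stdint.h>

/* Hypothetical helper type for this sketch. */
typedef struct {
    uint8_t r;
    uint8_t g;
    uint8_t b;
} rgb_t;

/* Return the index of the palette entry closest to c, measured by squared
 * Euclidean distance in RGB. Ties resolve to the lowest index. */
static uint8_t quantize_to_palette(const rgb_t * pal, uint32_t size, rgb_t c)
{
    uint8_t best = 0;
    uint32_t best_dist = UINT32_MAX;

    for(uint32_t i = 0; i < size; i++) {
        int32_t dr = (int32_t)c.r - (int32_t)pal[i].r;
        int32_t dg = (int32_t)c.g - (int32_t)pal[i].g;
        int32_t db = (int32_t)c.b - (int32_t)pal[i].b;
        /* Three multiplies per entry; at most 3 * 255^2, so no overflow. */
        uint32_t dist = (uint32_t)(dr * dr + dg * dg + db * db);
        if(dist < best_dist) {
            best_dist = dist;
            best = (uint8_t)i;
        }
    }
    return best;
}
```

With 16 entries the loop performs 16 x 3 = 48 multiplies, which matches the per-pixel cost quoted above.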

Coverage

lv_draw_sw_blend_image_to_i4 implements all source-format inner loops mirrored from lv_draw_sw_blend_to_i1.c:

  • I1, L8, AL88, RGB565, RGB565_SWAPPED, RGB888, XRGB8888, ARGB8888, I4

Blend modes: NORMAL, plus ADDITIVE / SUBTRACTIVE / MULTIPLY / DIFFERENCE via palette round-trip (index → palette RGB → blend in RGB → quantize back).
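The palette round-trip for a non-normal mode can be sketched like this for ADDITIVE. All names are hypothetical, opacity is ignored for brevity, a full 16-entry palette is assumed, and the saturating add is an assumption about how the RGB step behaves.

```c
#include <stdint.h>

typedef struct {
    uint8_t r;
    uint8_t g;
    uint8_t b;
} rgb_t;

/* Nearest palette entry by squared RGB distance (lowest index wins ties). */
static uint8_t nearest_index(const rgb_t * pal, uint32_t size, rgb_t c)
{
    uint8_t best = 0;
    uint32_t best_dist = UINT32_MAX;
    for(uint32_t i = 0; i < size; i++) {
        int32_t dr = (int32_t)c.r - (int32_t)pal[i].r;
        int32_t dg = (int32_t)c.g - (int32_t)pal[i].g;
        int32_t db = (int32_t)c.b - (int32_t)pal[i].b;
        uint32_t dist = (uint32_t)(dr * dr + dg * dg + db * db);
        if(dist < best_dist) {
            best_dist = dist;
            best = (uint8_t)i;
        }
    }
    return best;
}

/* Add two channel values, clamping at 255. */
static uint8_t sat_add_u8(uint8_t a, uint8_t b)
{
    uint32_t s = (uint32_t)a + (uint32_t)b;
    return (uint8_t)(s > 255u ? 255u : s);
}

/* index -> palette RGB -> blend in RGB -> quantize back to an index.
 * Assumes pal holds 16 entries. */
static uint8_t i4_blend_additive(const rgb_t * pal, uint8_t dest_idx, rgb_t src)
{
    rgb_t d = pal[dest_idx & 0x0Fu];
    rgb_t out;
    out.r = sat_add_u8(d.r, src.r);
    out.g = sat_add_u8(d.g, src.g);
    out.b = sat_add_u8(d.b, src.b);
    return nearest_index(pal, 16, out);
}
```

Since the result is snapped back to one of 16 entries, repeated non-normal blends accumulate quantization error; how visible that is depends on the palette.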

Configuration

  • LV_DRAW_SW_SUPPORT_I4 added to lv_conf_template.h (default on, consistent with LV_DRAW_SW_SUPPORT_I1)
  • matching Kconfig entry in Kconfig
  • matching fallback in src/lv_conf_internal.h

Edge cases covered by tests

tests/src/test_cases/draw/test_draw_sw_blend_to_i4.c:

  • Solid fill of all 16 indices on a byte-aligned destination.
  • Odd dest_x start — first byte's upper nibble must be untouched.
  • Odd width — last byte's lower nibble must be untouched.
  • Odd start AND odd width — both ends partial.
  • RGB565 image source quantizing to known palette entries.
  • ADDITIVE blend mode via palette round-trip.
  • MULTIPLY blend mode via palette round-trip.
  • ARGB8888 source with non-cover alpha mixing into I4 destination.

Known limitations / follow-ups (not part of this PR)

  • I4-to-I4 image blend assumes the source and destination share a palette (fast-path nibble copy). This matches I1's behavior, but for I4 image assets whose own palette is carried in lv_image_dsc_t it's not generally correct. A follow-up can plumb a source-side palette through lv_draw_sw_blend_image_dsc_t. Not blocking for the display-framebuffer use case.
  • No assembly hooks. I1 / L8 carry ~80 LV_DRAW_SW_*_TO_I1-style stub macros for accelerated backends. I omitted them to keep this PR focused on the C path; they can be added incrementally if backends grow I4 support.

Test plan

  • Local build (host gcc, cmake -B build && cmake --build build) of the LVGL library + examples + demos is green.
  • Strict compile of new files with -Wall -Wextra -Werror -pedantic -Wshadow -Wundef -Wmissing-prototypes -Wsign-compare -Wfloat-conversion: clean.
  • Full tests/main.py test — needs CI (Ruby + several apt-only libs unavailable in my local env).
  • I1 / L8 / RGB565 destination tests still pass — needs CI.
  • On-hardware sanity check on ESP32-C6 + CO5300 — will follow once this is merged and the firmware is switched over to the new path.

Adds `LV_COLOR_FORMAT_I4` as a render target for the software rasterizer,
mirroring the existing I1 / L8 destination handlers.

Indexed-color AMOLED controllers (e.g. CO5300, several Sitronix and
Solomon Systech parts) accept a 4-bit-per-pixel framebuffer with a
16-entry user-defined palette loaded into the panel's COLSET registers.
At typical sizes (e.g. 410x502) a full I4 framebuffer is roughly a
quarter the size of the equivalent RGB565 buffer, which can unlock a
single-FB tear-free render path on memory-constrained MCUs that cannot
fit an RGB565 framebuffer in SRAM.

Pixel packing follows the wire convention used by these panels: two
pixels per byte, lower-x pixel in the upper nibble (D[7:4]),
higher-x pixel in the lower nibble (D[3:0]). Stride is rounded up to
whole bytes (`(width + 1) / 2`).

Palette-binding API:
* `lv_display_set_palette(disp, palette, size)` — register a palette
  on the display so the blender can quantize incoming RGB pixels to
  the closest of the 16 entries.
* `lv_display_get_palette(disp)` / `lv_display_get_palette_size(disp)`
  retrieve it.
* If no palette is set, the I4 blender falls back to a default 16-entry
  grayscale ramp so canvas/test code can render without configuration.

Quantization is a 16-entry linear search over squared RGB distance
(48 multiplies per pixel; trivial on RV32 / Cortex-M). Faster lookup
tables can be added later if profiling shows they are needed.

Implements all source-format inner loops (I1, L8, AL88, RGB565,
RGB565_SWAPPED, RGB888, XRGB8888, ARGB8888, I4) plus normal/additive/
subtractive/multiply/difference blend modes. Non-normal modes round-trip
each destination pixel through the palette: index -> palette RGB ->
blend in RGB -> quantize back to index.

Edge cases covered by the new unit tests in
`tests/src/test_cases/draw/test_draw_sw_blend_to_i4.c`:
* solid fill of all 16 indices on a byte-aligned destination
* odd `dest_x` start (first byte's upper nibble must be untouched)
* odd width (last byte's lower nibble must be untouched)
* odd start AND odd width (both ends partial)
* RGB565 image source quantizing to known palette entries
* ADDITIVE / MULTIPLY blend modes via palette round-trip
* ARGB8888 source with non-cover alpha mixing into I4 destination

Configuration:
* `LV_DRAW_SW_SUPPORT_I4` in `lv_conf_template.h` (default on,
  consistent with `LV_DRAW_SW_SUPPORT_I1`)
* matching Kconfig entry in `Kconfig`
* matching fallback in `lv_conf_internal.h`

Existing I1 / L8 / RGB565 destination handlers are unchanged.

Downstream motivation: a watch firmware on an ESP32-C6 + CO5300 panel
that wants the single-FB path. CO5300 datasheet sections 7.5.32 / 7.5.50
/ 7.5.51 describe the panel side: `COLMOD 0x3A=0x33`, `COLOPT 0x80`
with `RGB4bit_en=1`, palette loaded via `COLSET 0x70..0x7F`.
@github-actions

Hi 👋, thank you for your PR!

We've run benchmarks in an emulated environment. Here are the results:

ARM Emulated 32b - lv_conf_perf32b

| Scene Name | Avg CPU (%) | Avg FPS | Avg Time (ms) | Render Time (ms) | Flush Time (ms) |
|---|---|---|---|---|---|
| All scenes avg. | 28 | 37 | 7 | 7 | 0 |

Detailed Results Per Scene

| Scene Name | Avg CPU (%) | Avg FPS | Avg Time (ms) | Render Time (ms) | Flush Time (ms) |
|---|---|---|---|---|---|
| Empty screen | 11 | 33 | 0 | 0 | 0 |
| Moving wallpaper | 2 | 33 | 1 | 1 | 0 |
| Single rectangle | 0 | 50 | 0 | 0 | 0 |
| Multiple rectangles | 0 | 33 (-1) | 0 | 0 | 0 |
| Multiple RGB images | 0 | 39 | 0 | 0 | 0 |
| Multiple ARGB images | 10 (-6) | 41 (+3) | 2 (-2) | 2 (-2) | 0 |
| Rotated ARGB images | 57 (-2) | 44 | 15 | 15 | 0 |
| Multiple labels | 4 (+1) | 35 (+2) | 0 | 0 | 0 |
| Screen sized text | 83 (+2) | 45 | 17 | 17 | 0 |
| Multiple arcs | 39 | 33 | 7 | 7 | 0 |
| Containers | 4 (+1) | 37 (-1) | 0 | 0 | 0 |
| Containers with overlay | 89 (-1) | 21 | 44 | 44 | 0 |
| Containers with opa | 14 | 37 | 1 | 1 | 0 |
| Containers with opa_layer | 19 (+1) | 34 | 5 | 5 | 0 |
| Containers with scrolling | 45 (+1) | 45 | 12 | 12 | 0 |
| Widgets demo | 72 (+1) | 39 (-1) | 16 (-1) | 16 (-1) | 0 |
| All scenes avg. | 28 | 37 | 7 | 7 | 0 |

ARM Emulated 64b - lv_conf_perf64b

| Scene Name | Avg CPU (%) | Avg FPS | Avg Time (ms) | Render Time (ms) | Flush Time (ms) |
|---|---|---|---|---|---|
| All scenes avg. | 25 | 37 | 6 | 6 | 0 |

Detailed Results Per Scene

| Scene Name | Avg CPU (%) | Avg FPS | Avg Time (ms) | Render Time (ms) | Flush Time (ms) |
|---|---|---|---|---|---|
| Empty screen | 11 | 33 | 0 | 0 | 0 |
| Moving wallpaper | 1 | 33 | 0 | 0 | 0 |
| Single rectangle | 0 | 50 | 0 | 0 | 0 |
| Multiple rectangles | 0 | 35 | 0 | 0 | 0 |
| Multiple RGB images | 0 | 39 | 0 | 0 | 0 |
| Multiple ARGB images | 11 | 42 | 0 | 0 | 0 |
| Rotated ARGB images | 29 | 33 | 9 | 9 | 0 |
| Multiple labels | 2 | 35 | 0 | 0 | 0 |
| Screen sized text | 85 | 46 | 18 | 18 | 0 |
| Multiple arcs | 40 (+7) | 33 | 6 | 6 | 0 |
| Containers | 4 | 37 (-1) | 0 | 0 | 0 |
| Containers with overlay | 88 (-9) | 23 (+1) | 41 | 41 | 0 |
| Containers with opa | 16 (+1) | 37 (-1) | 1 (+1) | 1 (+1) | 0 |
| Containers with opa_layer | 10 (+3) | 38 (+2) | 2 (+1) | 2 (+1) | 0 |
| Containers with scrolling | 49 (+1) | 48 (-1) | 12 | 12 | 0 |
| Widgets demo | 68 (+1) | 40 | 15 | 15 | 0 |
| All scenes avg. | 25 | 37 | 6 | 6 | 0 |

Disclaimer: These benchmarks were run in an emulated environment using QEMU with instruction counting mode.
The timing values represent relative performance metrics within this specific virtualized setup and should
not be interpreted as absolute real-world performance measurements. Values are deterministic and useful for
comparing different LVGL features and configurations, but may not correlate directly with performance on
physical hardware. The measurements are intended for comparative analysis only.


🤖 This comment was automatically generated by a bot.


@cubic-dev-ai cubic-dev-ai Bot left a comment


1 issue found across 10 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/draw/sw/blend/lv_draw_sw_blend_to_i4.c">

<violation number="1" location="src/draw/sw/blend/lv_draw_sw_blend_to_i4.c:254">
P2: lv_draw_sw_blend_image_to_i4 does not handle LV_COLOR_FORMAT_I1, so I1 source images are rejected by the default path instead of being blended into I4 destinations.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

        case LV_COLOR_FORMAT_I4:
            i4_image_blend(dsc);
            break;
        default:

@cubic-dev-ai cubic-dev-ai Bot Apr 26, 2026


P2: lv_draw_sw_blend_image_to_i4 does not handle LV_COLOR_FORMAT_I1, so I1 source images are rejected by the default path instead of being blended into I4 destinations.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/draw/sw/blend/lv_draw_sw_blend_to_i4.c, line 254:

<comment>lv_draw_sw_blend_image_to_i4 does not handle LV_COLOR_FORMAT_I1, so I1 source images are rejected by the default path instead of being blended into I4 destinations.</comment>

<file context>
@@ -0,0 +1,842 @@
+        case LV_COLOR_FORMAT_I4:
+            i4_image_blend(dsc);
+            break;
+        default:
+            LV_LOG_WARN("Not supported source color format");
+            break;
</file context>


Copilot AI left a comment


Pull request overview

Adds software-rendering support for LV_COLOR_FORMAT_I4 (4-bit indexed) as a destination framebuffer format, including palette-based quantization via the active display, plus configuration and unit tests to validate nibble packing and blending behavior.

Changes:

  • Add I4 destination blenders (lv_draw_sw_blend_*_to_i4) and wire them into the SW blend dispatch.
  • Introduce display palette APIs (lv_display_set/get_palette*) and store palette pointer/size on lv_display_t.
  • Add configuration toggles (LV_DRAW_SW_SUPPORT_I4) and a new Unity test suite for I4 blending.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.

| File | Description |
|---|---|
| tests/src/test_cases/draw/test_draw_sw_blend_to_i4.c | New Unity tests validating I4 nibble packing and blend behavior. |
| src/lv_conf_internal.h | Adds internal fallback/config plumbing for LV_DRAW_SW_SUPPORT_I4. |
| src/draw/sw/blend/lv_draw_sw_blend_to_i4.h | Declares I4 destination blender entry points. |
| src/draw/sw/blend/lv_draw_sw_blend_to_i4.c | Implements I4 destination fill/image blending with palette quantization. |
| src/draw/sw/blend/lv_draw_sw_blend.c | Routes I4 destinations to the new I4 blender functions. |
| src/display/lv_display_private.h | Extends lv_display_t with palette pointer + size fields. |
| src/display/lv_display.h | Adds public palette setter/getter APIs and documentation. |
| src/display/lv_display.c | Implements palette setter/getter APIs. |
| lv_conf_template.h | Adds LV_DRAW_SW_SUPPORT_I4 to the template config. |
| Kconfig | Adds a Kconfig option for LV_DRAW_SW_SUPPORT_I4. |

int32_t mask_stride = dsc->mask_stride;

int32_t dest_nibble_ofs = dsc->relative_area.x1 & 1;
int32_t src_nibble_ofs = dsc->src_area.x1 & 1;
Comment on lines +38 to +40
#include "../../../core/lv_refr_private.h"
#include "../../../misc/lv_color.h"
#include "../../../stdlib/lv_string.h"
Comment thread src/display/lv_display.c
Comment on lines +629 to +632
{
    if(disp == NULL) disp = lv_display_get_default();
    if(disp == NULL) return;
