PCI: Release unused bridge resources during resize #6653

Draft
wants to merge 819 commits into
base: rpi-6.12.y

P33M
Contributor

@P33M P33M commented Feb 5, 2025

From https://patchwork.kernel.org/project/linux-pci/patch/[email protected]/

Take the suggested patch which frees upstream bridge resources if a BAR is being resized.

6by9 and others added 23 commits February 24, 2025 12:04
The DMA block has a clock, but wasn't defined in the driver. This
resulted in the parent being disabled as unused, and then DMA
stopped working.

Signed-off-by: Dave Stevenson <[email protected]>
There should be no issue in disabling the RP1 clocks as long as
the kernel knows about all consumers.

Signed-off-by: Dave Stevenson <[email protected]>
hclk and pclk of the MAC are connected to clk_sys, so define
them as being connected accordingly, rather than having fake
fixed clocks for them.

Signed-off-by: Dave Stevenson <[email protected]>
This makes the kernel representation of the clock structure
match reality.

Signed-off-by: Dave Stevenson <[email protected]>
In the move to the upstream bcm2712.dts, the A76 PMU was omitted from
the required downstream additions.

Link: raspberrypi#6507

Signed-off-by: Phil Elwell <[email protected]>
Following the merging of [1], it is safe to re-enable DMA to UART0
without fear of losing data.

Seen while looking at raspberrypi#6507.

[1] dmaengine: dw-axi-dmac: Allow client-chosen width

Signed-off-by: Phil Elwell <[email protected]>
gpio-direct mode is a modification of the brcmstb GPIO driver that
makes it play nicely with the userspace pinctrl utility. The mode
forces the driver to read its state from the hardware each time, rather
than relying on cached state. Doing so slightly reduces performance,
but this is not a heavily used code path.

Signed-off-by: Phil Elwell <[email protected]>
Fractional source co-ordinates can be used to setup the scaling
filters, so retain the information.

Signed-off-by: Dom Cobley <[email protected]>
Reviewed-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
Apply fractional source co-ordinates into the scaling filters.

Signed-off-by: Dom Cobley <[email protected]>
Reviewed-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
When the margins are changed, the dlist needs to be regenerated
with the updated dest regions for each of the planes.

Setting the zpos_changed flag is sufficient to trigger that
without doing a full modeset, therefore set it should the
margins be changed.

Reviewed-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
Support displaying DRM_FORMAT_YUV444 and DRM_FORMAT_YVU444 formats.
Tested with kmstest and kodi. e.g.

kmstest -r 1920x1080@60 -f 400x300-YU24

Note: without the shift of width, only half the chroma is fetched,
resulting in correct left half of image and corrupt colours on right half.

The increase in width shouldn't affect fetching of Y data,
as the hardware will clamp at dest width.

Signed-off-by: Dom Cobley <[email protected]>
Reviewed-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
The VC4 HDMI driver has a bunch of accessors to read from a register.
The read accessor was warning when accessing an unknown register, but
the write one was just returning silently.

Let's make sure we warn also when writing to an unknown register.

Signed-off-by: Maxime Ripard <[email protected]>
Reviewed-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
DLIST generation can get pretty tricky and there's not a lot of debug in
the driver to help. Let's add a few more to track the generated DLIST
size.

Signed-off-by: Maxime Ripard <[email protected]>
Reviewed-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
We need to allocate a few additional structures when checking our
atomic_state, especially related to hardware SRAM that will hold the
plane descriptors (DLIST) and the current line context (LBM) during
composition.

Since those allocations can fail, let's add some error messages in that
case to help debug what goes wrong.

Signed-off-by: Maxime Ripard <[email protected]>
Reviewed-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
LBM allocations need a different size depending on the line length,
format, etc.

This can get tricky, and fail. Let's add some more prints to ease the
debugging when it does.

Signed-off-by: Maxime Ripard <[email protected]>
Reviewed-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
The vc4_plane_atomic_check() directly returns the result of the final
function it calls.

Using the already defined ret variable to check its content on error,
and a separate return 0 on success, makes it easier to extend.

Signed-off-by: Maxime Ripard <[email protected]>
Reviewed-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
We access the vc4_crtc_state->assigned_channel variable multiple times
in the vc4_crtc_get_scanout_position() function, so let's store it in a
local variable.

Signed-off-by: Maxime Ripard <[email protected]>
Reviewed-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
It has been observed that a YUV422 unity scaled plane isn't displayed.
Enabling vertical scaling on the UV planes solves this. There is
already a similar clause to always enable horizontal scaling on the
UV planes.

Reviewed-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
ABORT_ON_EMPTY chooses whether the HVS abandons the current frame
when it experiences an underflow, or attempts to continue.

In theory the frame should be black from the point of underflow,
compared to a shift of subsequent pixels to the left.

Unfortunately it seems to put the HVS in a bad state from which it is
not possible to recover simply. This typically requires a reboot
following the 'flip done timed out' message.

Discussion with Broadcom has suggested we don't use this flag.
All their testing is done with it disabled.

Additionally setting BLANK_INSERT_EN causes the HDMI to output
blank pixels on an underflow which avoids it losing sync.

After this change a 'flip done timed out' due to sdram bandwidth
starvation or too low a clock is recoverable once the situation improves.

Signed-off-by: Dom Cobley <[email protected]>
Reviewed-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
The V3D IP has been separate since BCM2711, so let's make sure we issue
a WARN if we're running not only on BCM2711, but also anything newer.

Signed-off-by: Maxime Ripard <[email protected]>
Reviewed-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
…output

Since we'll support BCM2712 soon, let's move the logic behind
vc4_hvs_get_fifo_from_output() to a switch to extend it more easily.

Signed-off-by: Maxime Ripard <[email protected]>
Reviewed-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
Since the BCM2712 will feature a significantly different HVS, let's move
the hardware initialisation part of our bind function into a separate
function.

That way, it will be easier to extend in the future.

Signed-off-by: Maxime Ripard <[email protected]>
Reviewed-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
Just like the HVS itself, the COB parameters will be fairly different in
the BCM2712.

Let's move the COB parameters computation and its initialisation to a
separate function that will be easier to extend in the future.

Signed-off-by: Maxime Ripard <[email protected]>
Reviewed-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dave Stevenson <[email protected]>
pelwell and others added 14 commits February 24, 2025 12:05
To avoid pointless retries, let the probe function succeed if the
firmware interface is configured correctly but the firmware is
incompatible. The value of the private drvdata field holds the outcome.

Link: raspberrypi#6642

Signed-off-by: Phil Elwell <[email protected]>
The of_xlate method saves the calculated event mask in the con_priv
field. It also rejects subsequent attempts to use that channel because
the mask is non-zero, which causes a repeated instantiation of a client
driver to fail.

The of_xlate method is not meant to be a point of resource acquisition.
Leave the con_priv initialisation, but drop the test that it was
previously zero.

Signed-off-by: Phil Elwell <[email protected]>
With a future release of the firmware, it will be possible to use dts
files with a top-level #size-cells of 2. This patch adds the remaining
necessary changes to make that work, gated by the macro symbol
FIRMWARE_UDPATED.

Signed-off-by: Phil Elwell <[email protected]>
Without this, a default of 0 is used which is very suboptimal for timely
service. Consistency with Pi 5 is desired.

Signed-off-by: Jonathan Bell <[email protected]>
Allows the usage of ADC8x stacked on top of the DAC8x.
Activates all I2S pins and now uses the dummy-dai instead
of the formerly used pcm5102, so that a capture device can
be used as well. The simple card driver will
probe for the ADC8x and may activate the 8 channel
capture. Uses GPIO5 for detection.

Signed-off-by: j-schambacher <[email protected]>
The driver probes for the ADC8x which can be stacked on top
of the DAC8x. It enables a symmetric 8 channel capture using
the dummy-dai.

Signed-off-by: j-schambacher <[email protected]>
Add pmkid parameter in "brcmf_auth_req_status_le" structure to
align the buffer size defined in firmware "wl_auth_req_status"
structure.

Link: raspberrypi#6130

Signed-off-by: Ting-Ying Li <[email protected]>
Signed-off-by: Phil Elwell <[email protected]>
It was noted that if PV1 was in use to drive DSI1, then the
writeback connector could not be used as HVS channel 2 was
already in use.
The HVS allows PV1 (HVS output 2) to be driven by any HVS
channel via the DSP3_MUX setting, but that was hardcoded to be
either 2 (for PV1) or disabled for TXP.

Expand the available channels field for PV1, and configure
DSP3_MUX accordingly.

Signed-off-by: Dave Stevenson <[email protected]>
The tests on vc4 (BCM2835-7) were checking for DSI1 muxing being
restricted to channel 2, and therefore muxing with TXP was impossible.

As we no longer have that restriction, update the capabilities
defined for DSI1, move the tests that used to be impossible to the
valid list, and extend for additional combinations that are now
possible.

Signed-off-by: Dave Stevenson <[email protected]>
If an HDMI connector has no EDID and the mode is set via the
kernel command line, then drm_reset_display_info() is the only
thing that will have set up any of connector->display_info.

With commit 26ff1c3 ("drm/connector: hdmi: Compute bpc
and format automatically"), it is now checked that
DRM_COLOR_FORMAT_RGB444 is supported. Whilst it doesn't fail
the request, it does log dev_warn for every commit, spamming
the log.

For HDMI connectors initialise the color_format field to say
it supports RGB444.

Signed-off-by: Dave Stevenson <[email protected]>
6.12 kernel reworked how ZRAM backends were configured.
Let's enable the ones we've lost.

I've chosen zstd as the default.

6.6 kernel supported:
$ cat /sys/block/zram0/comp_algorithm
[lzo-rle] lzo lz4 zstd

6.12 currently supports:
$ cat /sys/block/zram0/comp_algorithm
[lzo-rle] lzo

With this PR 6.12 supports:
$ cat /sys/block/zram0/comp_algorithm
lzo-rle lzo lz4 [zstd]
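
In the comp_algorithm output quoted above, the bracketed entry is the currently selected compressor. As a minimal sketch (the function name is illustrative and not part of any kernel tool), the lines can be parsed in Python to confirm which backends this change restores and which one becomes the default:

```python
def parse_comp_algorithm(line):
    """Return (selected, available) from a zram comp_algorithm line.

    The selected algorithm is the token wrapped in square brackets.
    """
    selected = None
    available = []
    for token in line.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]
            selected = token
        available.append(token)
    return selected, available


# Inputs copied from the commit message above.
before = parse_comp_algorithm("[lzo-rle] lzo")
after = parse_comp_algorithm("lzo-rle lzo lz4 [zstd]")

# Backends present after the change but missing before it.
restored = sorted(set(after[1]) - set(before[1]))
print(after[0], restored)
```

On a running system the same function could be fed the contents of /sys/block/zram0/comp_algorithm instead of the literal strings.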

See: https://forums.raspberrypi.com/viewtopic.php?p=2296678#p2296678

Signed-off-by: Dom Cobley <[email protected]>
mickeprag and others added 13 commits February 24, 2025 13:54
Change the standard rate of PLL_AUDIO_SEC from 192MHz to
153.6MHz to suit audio out.

Declare audio out hardware and give it a named pin control.

Signed-off-by: Nick Hollinghurst <[email protected]>
…rror

Connect PLL_AUDIO_SEC to CLK_AUDIO_OUT, which had been commented out
to avoid interference with I2S: we expect them never to be enabled
at the same time. Work around a rounding error that occurs when the
desired rate is exactly the max but not exactly achievable by the PLL.

Signed-off-by: Nick Hollinghurst <[email protected]>
Only 48000Hz stereo 16-bit output is currently supported.

It requires some additional OF plumbing to connect it to a
"dummy" codec and generic sound card.

Signed-off-by: Nick Hollinghurst <[email protected]>
Since RP1 Audio Out can only work on GPIOs 12, 13 which would
previously have needed dtoverlay=audremap, overload it both to
enable and pin-map the block (do not enable for other pinouts).

At the same time, generate a default "codec" and "sound card".

Signed-off-by: Nick Hollinghurst <[email protected]>
The mutex used in arducam-pivariety was not properly initialized,
which could lead to undefined behavior. This also caused a NULL
pointer dereference under certain conditions.

This patch ensures the mutex is correctly initialized during probe
and prevents NULL pointer dereferences.

Signed-off-by: Yuriy Pasichnyk <[email protected]>
Support for the RP1 firmware mailbox API is rolling out to Pi 5 EEPROM
images. For most users, the fact that the PIO is not available is no
cause for alarm. Change the message to a warning, so that it does not
appear with "quiet" in cmdline.txt.

Link: raspberrypi#6642

Signed-off-by: Phil Elwell <[email protected]>
On Pi 5, GPIOs 46/48 are made available on the 'CAM/DISP 1' connector as
'CD1_IO0_MICCLK'/'CD1_IO1_MICDAT1'. These GPIOs are not connected on
CM5.

Add hogs for GPIO 46/48 on CM5 to prevent camera drivers from
inadvertently using them when connected to 'CAM/DISP 1'.

Signed-off-by: Richard Oliver <[email protected]>
In some circumstances, devm_gpiod_get_array_optional() can return
PTR_ERR rather than NULL to indicate failure. Handle these cases.

Signed-off-by: Richard Oliver <[email protected]>
Fast transfer mode requires that the first bit of data is clocked with a
rising edge. This can cause extra bits of data to be clocked on hardware
where the clock signal uses a pull-up. This change ensures that clk is
driven low before fast data transfer mode is entered.

Signed-off-by: Richard Oliver <[email protected]>
Acknowledge the fact that bcmrpi3_defconfig is neither used nor
supported by us, and avoid a bunch of future merge conflicts since
it is already gone from rpi-6.14.y.

Signed-off-by: Phil Elwell <[email protected]>
Resizing BARs can be blocked when a device in the bridge hierarchy
itself consumes resources from the resized range.  This scenario is
common with Intel Arc DG2 GPUs where the following is a typical
topology:

 +-[0000:5d]-+-00.0-[5e-61]----00.0-[5f-61]--+-01.0-[60]----00.0  Intel Corporation DG2 [Arc A380]
                                             \-04.0-[61]----00.0  Intel Corporation DG2 Audio Controller

Here the system BIOS has provided a large 64bit, prefetchable window:

pci_bus 0000:5d: root bus resource [mem 0xb000000000-0xbfffffffff window]

But only a small portion is programmed into the root port aperture:

pci 0000:5d:00.0:   bridge window [mem 0xbfe0000000-0xbff07fffff 64bit pref]

The upstream port then provides the following aperture:

pci 0000:5e:00.0:   bridge window [mem 0xbfe0000000-0xbfefffffff 64bit pref]

With the missing range found to be consumed by the switch port itself:

pci 0000:5e:00.0: BAR 0 [mem 0xbff0000000-0xbff07fffff 64bit pref]

The downstream port above the GPU provides the same aperture as upstream:

pci 0000:5f:01.0:   bridge window [mem 0xbfe0000000-0xbfefffffff 64bit pref]

Which is entirely consumed by the GPU:

pci 0000:60:00.0: BAR 2 [mem 0xbfe0000000-0xbfefffffff 64bit pref]

In summary, iomem reports the following:

b000000000-bfffffffff : PCI Bus 0000:5d
  bfe0000000-bff07fffff : PCI Bus 0000:5e
    bfe0000000-bfefffffff : PCI Bus 0000:5f
      bfe0000000-bfefffffff : PCI Bus 0000:60
        bfe0000000-bfefffffff : 0000:60:00.0
    bff0000000-bff07fffff : 0000:5e:00.0

The GPU at 0000:60:00.0 supports a Resizable BAR:

	Capabilities: [420 v1] Physical Resizable BAR
		BAR 2: current size: 256MB, supported: 256MB 512MB 1GB 2GB 4GB 8GB

However when attempting a resize we get -ENOSPC:

pci 0000:60:00.0: BAR 2 [mem 0xbfe0000000-0xbfefffffff 64bit pref]: releasing
pcieport 0000:5f:01.0: bridge window [mem 0xbfe0000000-0xbfefffffff 64bit pref]: releasing
pcieport 0000:5e:00.0: bridge window [mem 0xbfe0000000-0xbfefffffff 64bit pref]: releasing
pcieport 0000:5e:00.0: bridge window [mem size 0x200000000 64bit pref]: can't assign; no space
pcieport 0000:5e:00.0: bridge window [mem size 0x200000000 64bit pref]: failed to assign
pcieport 0000:5f:01.0: bridge window [mem size 0x200000000 64bit pref]: can't assign; no space
pcieport 0000:5f:01.0: bridge window [mem size 0x200000000 64bit pref]: failed to assign
pci 0000:60:00.0: BAR 2 [mem size 0x200000000 64bit pref]: can't assign; no space
pci 0000:60:00.0: BAR 2 [mem size 0x200000000 64bit pref]: failed to assign
pcieport 0000:5d:00.0: PCI bridge to [bus 5e-61]
pcieport 0000:5d:00.0:   bridge window [mem 0xb9000000-0xba0fffff]
pcieport 0000:5d:00.0:   bridge window [mem 0xbfe0000000-0xbff07fffff 64bit pref]
pcieport 0000:5e:00.0: PCI bridge to [bus 5f-61]
pcieport 0000:5e:00.0:   bridge window [mem 0xb9000000-0xba0fffff]
pcieport 0000:5e:00.0:   bridge window [mem 0xbfe0000000-0xbfefffffff 64bit pref]
pcieport 0000:5f:01.0: PCI bridge to [bus 60]
pcieport 0000:5f:01.0:   bridge window [mem 0xb9000000-0xb9ffffff]
pcieport 0000:5f:01.0:   bridge window [mem 0xbfe0000000-0xbfefffffff 64bit pref]
pci 0000:60:00.0: BAR 2 [mem 0xbfe0000000-0xbfefffffff 64bit pref]: assigned

In this example we need to resize all the way up to the root port
aperture, but we refuse to change the root port aperture while resources
are allocated for the upstream port BAR.
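
The arithmetic behind the -ENOSPC can be checked from the address ranges quoted above. The following Python sketch is illustrative only: the helper name is made up, and it ignores alignment and the kernel's actual allocation logic. It simply compares the requested size with the windows that are available at each level.

```python
def span(start, end):
    """Size in bytes of an inclusive [start, end] address range."""
    return end - start + 1


MiB = 1 << 20
GiB = 1 << 30

# Ranges copied from the log lines in the commit message.
root_bus = span(0xB000000000, 0xBFFFFFFFFF)     # root bus 0000:5d window
root_port = span(0xBFE0000000, 0xBFF07FFFFF)    # root port 0000:5d:00.0 aperture
switch_bar0 = span(0xBFF0000000, 0xBFF07FFFFF)  # BAR 0 of 0000:5e:00.0
requested = 0x200000000                         # "mem size 0x200000000" in the log

print(root_bus // GiB, "GiB root bus window")
print(root_port // MiB, "MiB root port aperture")
print(switch_bar0 // MiB, "MiB upstream port BAR 0")
print(requested // GiB, "GiB requested for BAR 2")

# The resized BAR plus the switch port BAR cannot fit in the current
# root port aperture, but the root bus window has ample room.
fits_in_root_port = requested + switch_bar0 <= root_port
fits_in_root_bus = requested + switch_bar0 <= root_bus
print(fits_in_root_port, fits_in_root_bus)
```

The root port aperture is only 256MB for the GPU plus 8MB for the switch port BAR, so the resize can only succeed if that aperture itself is released and reassigned from the 64GB root bus window.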

The solution proposed here builds on the idea in commit 91fa127
("PCI: Expose PCIe Resizable BAR support via sysfs") where the BAR can
be resized while there is no driver attached.  In this case, when there
is no driver bound to the upstream switch port we'll release resources
of the bridge which match the reallocation.  Therefore we can achieve
the below successful resize operation by unbinding 0000:5e:00.0 from the
pcieport driver before invoking the resource2_resize interface on the
GPU at 0000:60:00.0.

pci 0000:60:00.0: BAR 2 [mem 0xbfe0000000-0xbfefffffff 64bit pref]: releasing
pcieport 0000:5f:01.0: bridge window [mem 0xbfe0000000-0xbfefffffff 64bit pref]: releasing
pci 0000:5e:00.0: bridge window [mem 0xbfe0000000-0xbfefffffff 64bit pref]: releasing
pci 0000:5e:00.0: BAR 0 [mem 0xbff0000000-0xbff07fffff 64bit pref]: releasing
pcieport 0000:5d:00.0: bridge window [mem 0xbfe0000000-0xbff07fffff 64bit pref]: releasing
pcieport 0000:5d:00.0: bridge window [mem 0xb000000000-0xb2ffffffff 64bit pref]: assigned
pci 0000:5e:00.0: bridge window [mem 0xb000000000-0xb1ffffffff 64bit pref]: assigned
pci 0000:5e:00.0: BAR 0 [mem 0xb200000000-0xb2007fffff 64bit pref]: assigned
pcieport 0000:5f:01.0: bridge window [mem 0xb000000000-0xb1ffffffff 64bit pref]: assigned
pci 0000:60:00.0: BAR 2 [mem 0xb000000000-0xb1ffffffff 64bit pref]: assigned
pci 0000:5e:00.0: PCI bridge to [bus 5f-61]
pci 0000:5e:00.0:   bridge window [mem 0xb9000000-0xba0fffff]
pci 0000:5e:00.0:   bridge window [mem 0xb000000000-0xb1ffffffff 64bit pref]
pcieport 0000:5d:00.0: PCI bridge to [bus 5e-61]
pcieport 0000:5d:00.0:   bridge window [mem 0xb9000000-0xba0fffff]
pcieport 0000:5d:00.0:   bridge window [mem 0xb000000000-0xb2ffffffff 64bit pref]
pci 0000:5e:00.0: PCI bridge to [bus 5f-61]
pci 0000:5e:00.0:   bridge window [mem 0xb9000000-0xba0fffff]
pci 0000:5e:00.0:   bridge window [mem 0xb000000000-0xb1ffffffff 64bit pref]
pcieport 0000:5f:01.0: PCI bridge to [bus 60]
pcieport 0000:5f:01.0:   bridge window [mem 0xb9000000-0xb9ffffff]
pcieport 0000:5f:01.0:   bridge window [mem 0xb000000000-0xb1ffffffff 64bit pref]
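
The sequence described in this commit message, unbinding the upstream switch port from pcieport and then writing to resource2_resize on the GPU, can be sketched in Python as follows. This is a hedged illustration rather than a tested procedure: the helper names are hypothetical, the code only builds the list of sysfs writes and does not perform them, and it assumes the encoding documented in the sysfs-bus-pci ABI, where the value written to resourceN_resize is the bit position of the new size and bit 0 means 1MB.

```python
def rebar_size_to_bit(size_bytes):
    """Map a power-of-two BAR size (>= 1MB) to its resize bit position."""
    mib = size_bytes >> 20
    if size_bytes <= 0 or (mib << 20) != size_bytes or mib & (mib - 1):
        raise ValueError("size must be a power of two and at least 1MB")
    return mib.bit_length() - 1


def resize_plan(bridge, device, bar, size_bytes):
    """Return the (sysfs path, value) writes, in order, without doing them."""
    return [
        # Release the upstream switch port from its driver first.
        ("/sys/bus/pci/drivers/pcieport/unbind", bridge),
        # Then request the new BAR size on the endpoint.
        (f"/sys/bus/pci/devices/{device}/resource{bar}_resize",
         str(rebar_size_to_bit(size_bytes))),
    ]


# Addresses and size taken from the example topology and log above.
for path, value in resize_plan("0000:5e:00.0", "0000:60:00.0", 2, 0x200000000):
    print(f"echo {value} > {path}")
```

The plan also assumes that no driver is bound to the GPU itself, in line with the referenced sysfs commit, which allows the resize only while no driver is attached.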

Link: https://patchwork.kernel.org/project/linux-pci/patch/[email protected]/

Signed-off-by: Alex Williamson <[email protected]>
Signed-off-by: Jonathan Bell <[email protected]>