Add RGMII core #7

Open
wants to merge 3 commits into
base: main
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -6,3 +6,4 @@
* Add Lattice Semi SB_IO primitive
* Add UART core
* Add CRC core
* Add RGMII core
2 changes: 2 additions & 0 deletions clash-cores.cabal
@@ -108,6 +108,7 @@ common basic-config
build-depends:
base >= 4.10 && < 5,
clash-prelude,
clash-protocols,
constraints,
containers >=0.5 && <0.8,
ghc-typelits-extra >= 0.3.2,
@@ -133,6 +134,7 @@ library
Clash.Cores.LineCoding8b10b
Clash.Cores.LineCoding8b10b.Decoder
Clash.Cores.LineCoding8b10b.Encoder
Clash.Cores.Rgmii
Clash.Cores.SPI
Clash.Cores.UART
Clash.Cores.Xilinx.BlockRam
4 changes: 4 additions & 0 deletions nix/nixpkgs.nix
@@ -30,6 +30,10 @@ let
self.callCabal2nix "doctest-parallel" sources.doctest-parallel {};
clash-prelude =
self.callCabal2nix "clash-prelude" (sources.clash-compiler + "/clash-prelude") {};
clash-protocols-base =
self.callCabal2nix "clash-protocols-base" (sources.clash-protocols + "/clash-protocols-base") {};
clash-protocols =
self.callCabal2nix "clash-protocols" (sources.clash-protocols + "/clash-protocols") {};
clash-lib =
self.callCabal2nix "clash-lib" (sources.clash-compiler + "/clash-lib") {};
clash-ghc =
18 changes: 15 additions & 3 deletions nix/sources.json
@@ -5,13 +5,25 @@
"homepage": "https://clash-lang.org/",
"owner": "clash-lang",
"repo": "clash-compiler",
"rev": "aba55fed9f45711c8336935721a43d243f7f78c1",
"sha256": "1hrzp8g189v46qfr9ds7w6w0yj5w8y4im1pa3lf5vjx3z64v26qv",
"rev": "f946617561565440d82f67747acb2486f6526a66",
"sha256": "0924xzzwzrpjb1yid9mvy2imxwrzyxfdmkd2l1wfrsdwgrc53dpg",
"type": "tarball",
"url": "https://github.com/clash-lang/clash-compiler/archive/aba55fed9f45711c8336935721a43d243f7f78c1.tar.gz",
"url": "https://github.com/clash-lang/clash-compiler/archive/f946617561565440d82f67747acb2486f6526a66.tar.gz",
"url_template": "https://github.com/<owner>/<repo>/archive/<rev>.tar.gz",
"version": "1.8.1"
},
"clash-protocols": {
"branch": "packetstream",
"description": "a battery-included library for dataflow protocols",
"homepage": null,
"owner": "clash-lang",
"repo": "clash-protocols",
"rev": "b893b1b22e8157c1352295fa115e3f9c01fcaf5c",
"sha256": "0v21bzmg0p2fdzkvri9wvzhbg79zjxmp4zg53h3fb1rlcxnmxq9r",
"type": "tarball",
"url": "https://github.com/clash-lang/clash-protocols/archive/b893b1b22e8157c1352295fa115e3f9c01fcaf5c.tar.gz",
"url_template": "https://github.com/<owner>/<repo>/archive/<rev>.tar.gz"
},
"doctest-parallel": {
"branch": "main",
"description": "Test interactive Haskell examples",
216 changes: 216 additions & 0 deletions src/Clash/Cores/Rgmii.hs
@@ -0,0 +1,216 @@
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE RecordWildCards #-}

{- |
Module : Clash.Cores.Rgmii
Description : Functions and types to connect an RGMII PHY to a packet stream interface.

To keep this module generic, users will have to provide their own "primitive" functions:

1. delay functions that apply the proper amount of delay (which can differ between RX and TX);
2. iddr function to turn a single DDR (Double Data Rate) signal into 2 non-DDR signals;
3. oddr function to turn two non-DDR signals into a single DDR signal.

Note that Clash models a DDR signal as being twice as fast, thus both facilitating
and requiring type-level separation between the two "clock domains".
-}
module Clash.Cores.Rgmii (
RgmiiRxChannel (..),
RgmiiTxChannel (..),
rgmiiReceiver,
rgmiiSender,
unsafeRgmiiRxC,
rgmiiTxC,
) where

import Clash.Prelude

import Protocols
import Protocols.PacketStream

import Data.Maybe (isJust)

-- | RX channel from the RGMII PHY
data RgmiiRxChannel dom domDDR = RgmiiRxChannel
{ rgmiiRxClk :: "rx_clk" ::: Clock dom
, rgmiiRxCtl :: "rx_ctl" ::: Signal domDDR Bit
, rgmiiRxData :: "rx_data" ::: Signal domDDR (BitVector 4)
}

instance Protocol (RgmiiRxChannel dom domDDR) where
type Fwd (RgmiiRxChannel dom domDDR) = RgmiiRxChannel dom domDDR
type Bwd (RgmiiRxChannel dom domDDR) = Signal dom ()

-- | TX channel to the RGMII PHY
data RgmiiTxChannel domDDR = RgmiiTxChannel
{ rgmiiTxClk :: "tx_clk" ::: Signal domDDR Bit
, rgmiiTxCtl :: "tx_ctl" ::: Signal domDDR Bit
, rgmiiTxData :: "tx_data" ::: Signal domDDR (BitVector 4)
}

instance Protocol (RgmiiTxChannel domDDR) where
type Fwd (RgmiiTxChannel domDDR) = RgmiiTxChannel domDDR
type Bwd (RgmiiTxChannel domDDR) = Signal domDDR ()
Member

Why are there two separate protocols for RX and TX?
We could also have a single

data Rgmii dom domDDR = Rgmii
  { clk :: "clk" ::: Clock dom
  , ctl :: "ctl" ::: Signal domDDR Bit
  , data :: "data" ::: Signal domDDR (BitVector 4)
  }

Right?
We would require OverloadedRecordDot together with DuplicateRecordFields and NoFieldSelectors

Member

@rowanG077 rowanG077 Aug 28, 2024

Because on the RX side you receive a clock signal, but on the TX side you have to forward the clock using an ODDR primitive. From the FPGA's point of view it is actually no longer a clock and can't be used as such.

Member

Aren't you basically abusing the ODDR primitive here as a clock generator? Maybe we should add a wrapper for the ODDR primitive that outputs a Clock dom rather than a Signal domDDR Bit (note difference in domain, factor two period difference).

I don't think there's anything wrong with using an ODDR primitive to generate a clock. But Clash is strongly typed and while there is a certain isomorphism, I still feel this outputs a Clock, not a Signal. It is used as a time reference rather than being relative to one. The Signal domDDRs have a setup and hold constraint, the clock output does not. And more arguments revolving around the same principle.

Member

@DigitalBrains1 DigitalBrains1 Sep 1, 2024

We could add a function

oddrClockGen ::
  forall dom domDdr .
  (DomainPeriod dom ~ (*) 2 (DomainPeriod domDdr)) =>
  Clock dom ->
  Reset dom ->  
  ( Clock dom ->
    Reset dom ->
    Enable dom ->
    Bit ->
    Signal dom Bit ->
    Signal domDdr Bit
  ) ->
  Clock dom

where we construct the function such that in HDL, it outputs the output of the DDR register while in Haskell it just outputs a Clock.

[edit]
If we add an internal helper primitive that is:

ddrSignalToClock ::
  forall dom domDdr .
  (DomainPeriod dom ~ (*) 2 (DomainPeriod domDdr)) =>
  Clock dom ->
  Signal domDdr Bit ->
  Clock dom
ddrSignalToClock clk !_ = clk

where the black box instead picks the DDR Bit signal as the output and ignores the clock input, the previous function becomes trivial. This internal primitive could go into Clash.Signal.Internal, and oddrClockGen could be added to Clash.Explicit.DDR.
[/edit]

Member

> From the FPGA view it's actually no longer clock and can't be used as such.

I don't think you can use the output of the ODDR primitive, period. You can only wire it to a pin, not to any internal logic. So that remains the same whether it's a Signal or a Clock: it cannot be used at all.

Member

True, perhaps it's better to make it neither a clock nor a signal, but rather something like WorldSignal: just a newtype around Signal that you can only route.
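
A rough plain-Haskell sketch of what such a route-only wrapper could look like. This is only an illustration under assumptions: a list stands in for a Clash `Signal`, and `Samples`, `toWorld`, and `unsafeFromWorld` are hypothetical names; only `WorldSignal` comes from the comment above.

```haskell
-- Stand-in for a Clash Signal: one sample per cycle (an assumption for this sketch).
type Samples a = [a]

-- | Hypothetical route-only signal. In a real library the constructor would
-- stay private to the defining module, and there is deliberately no Functor
-- or Applicative instance, so user logic cannot inspect or transform samples.
newtype WorldSignal a = WorldSignal (Samples a)

-- | Wrap the output of a primitive (for example an ODDR) for routing to a pin.
toWorld :: Samples a -> WorldSignal a
toWorld = WorldSignal

-- | Escape hatch intended for test benches only.
unsafeFromWorld :: WorldSignal a -> Samples a
unsafeFromWorld (WorldSignal samples) = samples
```

The point of the design is what is missing: without `fmap` or pattern matching available to users, the value can only be passed along to a top-level port.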

Member

@DigitalBrains1 DigitalBrains1 Sep 2, 2024

That sounds more like a GitHub wishlist issue. For the short term, I'd like to propose adding oddrClockGen to clash-prelude and changing the type here. If I understand correctly, that would also make it possible to have one Protocol with both RX and TX in it, which would be a nice improvement, right?

(Provided I can convince other people working on clash-prelude that this is a good thing to add)

Collaborator Author

What's the status on this? I like the idea of having a Signal type that can only be routed. Changing it to a clock doesn't make sense to me.

Member

The status from my point of view is that I was waiting for consensus or some form of decision of what to do.

Why doesn't the clock make sense to you? To me, the signal doesn't really make sense for the reasons I already indicated. The output of the DDR primitive is used as a clock for external circuitry and that is its only purpose.

Member

I'd say just outputting Clock dom is not qualitatively different than outputting the ODDR-generated clock; it's just a different construction inside the FPGA that makes stuff easier. Even Xilinx recommends it here; one of the first search results I found.


-- | RGMII receiver.
rgmiiReceiver ::
forall dom domDDR.
(DomainPeriod dom ~ (*) 2 (DomainPeriod domDDR)) =>
(KnownDomain dom) =>
-- | RX channel from the RGMII PHY
RgmiiRxChannel dom domDDR ->
-- | RX delay function
(forall a. Signal domDDR a -> Signal domDDR a) ->
-- | iddr function
( forall a.
(NFDataX a, BitPack a) =>
Clock dom ->
Reset dom ->
Signal domDDR a ->
Signal dom (a, a)
) ->
Member

I note this function doesn't match the signature of Clash.Explicit.DDR.ddrIn. To use it, one would need to do something like:

... ddrInComp ...
 where
    ddrInComp = \clk rst -> ddrIn clk rst enableGen (deepErrorX "Undefined initial value")

I'd suggest adding Enable and reset value to the iddr function. Same thing for oddr.

-- | (Error bit, Received data)
Signal dom (Bool, Maybe (BitVector 8))
rgmiiReceiver RgmiiRxChannel{..} rxdelay iddr = bundle (ethRxErr, ethRxData)
where
ethRxCtl :: Signal dom (Bool, Bool)
ethRxCtl = iddr rgmiiRxClk resetGen (rxdelay (bitToBool <$> rgmiiRxCtl))

-- The RXCTL signal at the falling edge is the XOR of RXDV and RXERR
-- meaning that RXERR is the XOR of it and RXDV.
-- See RGMII interface documentation.
ethRxDv, ethRxErr :: Signal dom Bool
(ethRxDv, ethRxErr) = unbundle ((\(dv, err) -> (dv, dv `xor` err)) <$> ethRxCtl)

-- LSB first! See RGMII interface documentation.
ethRxData1, ethRxData2 :: Signal dom (BitVector 4)
(ethRxData2, ethRxData1) = unbundle $ iddr rgmiiRxClk resetGen (rxdelay rgmiiRxData)

ethRxData :: Signal dom (Maybe (BitVector 8))
ethRxData =
(\(dv, dat) -> if dv then Just dat else Nothing)
<$> bundle (ethRxDv, liftA2 (++#) ethRxData1 ethRxData2)

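As a plain-Haskell illustration of the decoding in `rgmiiReceiver` (a simplified model, not the Clash implementation): the control line carries RXDV in the first sample and RXDV xor RXERR in the second, and the low nibble arrives first. `Word8` stands in for `BitVector 4` and `BitVector 8`, and the names `decodeRxCtl`, `combineNibbles`, and `decodeCycle` are hypothetical.

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Word (Word8)

-- | Recover (RXDV, RXERR) from the two RX_CTL samples of one clock cycle.
-- The first sample is RXDV; the second is RXDV xor RXERR.
decodeRxCtl :: (Bool, Bool) -> (Bool, Bool)
decodeRxCtl (firstSample, secondSample) = (rxDv, rxErr)
  where
    rxDv = firstSample
    rxErr = rxDv /= secondSample -- (/=) on Bool is xor

-- | Join two nibbles into a byte; the nibble sampled first is the low one.
combineNibbles :: Word8 -> Word8 -> Word8
combineNibbles firstNibble secondNibble =
  (secondNibble `shiftL` 4) .|. (firstNibble .&. 0x0F)

-- | One cycle of the receiver: (error bit, received byte if valid).
decodeCycle :: (Bool, Bool) -> (Word8, Word8) -> (Bool, Maybe Word8)
decodeCycle ctl (firstNibble, secondNibble) = (rxErr, rxData)
  where
    (rxDv, rxErr) = decodeRxCtl ctl
    rxData
      | rxDv = Just (combineNibbles firstNibble secondNibble)
      | otherwise = Nothing
```

For example, `decodeCycle (True, True) (0x5, 0xA)` evaluates to `(False, Just 165)`, that is, byte 0xA5 with no error.
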
-- | RGMII sender. Does not consider transmission error.
rgmiiSender ::
forall dom domDDR.
(DomainPeriod dom ~ (*) 2 (DomainPeriod domDDR)) =>
Clock dom ->
Reset dom ->
-- | TX delay function
(forall a. Signal domDDR a -> Signal domDDR a) ->
-- | oddr function
( forall a.
(NFDataX a, BitPack a) =>
Clock dom ->
Reset dom ->
Signal dom a ->
Signal dom a ->
Signal domDDR a
) ->
-- | Maybe the byte we have to send
Signal dom (Maybe (BitVector 8)) ->
-- | Error signal indicating whether the current packet is corrupt
Signal dom Bool ->
-- | TX channel to the RGMII PHY
RgmiiTxChannel domDDR
rgmiiSender txClk rst txdelay oddr input err = channel
where
txEn, txErr :: Signal dom Bit
txEn = boolToBit . isJust <$> input
txErr = fmap boolToBit err

ethTxData1, ethTxData2 :: Signal dom (BitVector 4)
(ethTxData1, ethTxData2) = unbundle $ maybe (undefined, undefined) split <$> input

-- The TXCTL signal at the falling edge is the XOR of TXEN and TXERR
-- meaning that TXERR is the XOR of it and TXEN.
-- See RGMII interface documentation.
txCtl :: Signal domDDR Bit
txCtl = oddr txClk rst txEn (liftA2 xor txEn txErr)

-- LSB first! See RGMII interface documentation.
txData :: Signal domDDR (BitVector 4)
txData = oddr txClk rst ethTxData2 ethTxData1

channel =
RgmiiTxChannel
{ rgmiiTxClk = txdelay (oddr txClk rst (pure 1) (pure 0))
, rgmiiTxCtl = txdelay txCtl
, rgmiiTxData = txdelay txData
}
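
The encoding done by `rgmiiSender` can be modeled in plain Haskell as the mirror image of the receiver. This is a sketch under assumptions: `Word8` stands in for the `BitVector` types, idle cycles yield `Nothing` for the data nibbles instead of an undefined value, and `encodeTxCtl`, `splitNibbles`, and `encodeCycle` are hypothetical names.

```haskell
import Data.Bits (shiftR, (.&.))
import Data.Maybe (isJust)
import Data.Word (Word8)

-- | The two TX_CTL samples for one clock cycle:
-- TXEN first, then TXEN xor TXERR.
encodeTxCtl :: Bool -> Bool -> (Bool, Bool)
encodeTxCtl txEn txErr = (txEn, txEn /= txErr) -- (/=) on Bool is xor

-- | Split a byte into (low nibble, high nibble); the low nibble is sent first.
splitNibbles :: Word8 -> (Word8, Word8)
splitNibbles byte = (byte .&. 0x0F, byte `shiftR` 4)

-- | One cycle of the sender: control samples plus data nibbles (if any).
encodeCycle :: Maybe Word8 -> Bool -> ((Bool, Bool), Maybe (Word8, Word8))
encodeCycle input txErr =
  (encodeTxCtl (isJust input) txErr, splitNibbles <$> input)
```

For example, `encodeCycle (Just 0xA5) False` evaluates to `((True, True), Just (5, 10))`.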

{- |
Circuit that adapts an `RgmiiRxChannel` to a `PacketStream`. Forwards data
from the RGMII receiver with one clock cycle latency so that we can properly
mark the last transfer of a packet: if we received valid data from the RGMII
receiver in the last clock cycle and the data in the current clock cycle is
invalid, we set `_last`. If the RGMII receiver gives an error, we set `_abort`.

__UNSAFE__: ignores backpressure, because the RGMII PHY is unable to handle that.
-}
unsafeRgmiiRxC ::
Member

Can you add a configuration option that asserts abort when we lose transactions due to backpressure?

Member

I'm pondering whether it will still be unsafe then...
We could also open an issue to make a safe version (it would just drop packets).

Collaborator Author

Setting _abort does not make it safe. Imagine you get backpressure while transmitting the last transfer of a packet, then setting _abort does nothing.

I do not think there is a way to make this safe. If the RX receives backpressure while in the middle of receiving a packet, we have already sent transfers so we cannot drop the packet there anymore. You could maybe add a packet fifo to buffer entire packets to achieve this, but as we are in a 125 MHz clock domain, I think that will ruin our timings.

Member

Since forward cannot depend on backward, backpressure on the last transaction is indeed a problem, since you cannot set abort anymore.

It would be possible if we could terminate a packet with a 0-byte transaction.

@rowanG077 What is the motivation for making the Just (Index n) contain the index of the last byte rather than the number of bytes in the transaction?

Member

@rowanG077 rowanG077 Aug 28, 2024

Precisely so that 0-byte transactions are not possible. You are basically saying: "Hello, I have something for you." "Oh, please show me what you have." And then it's nothing.

If you use a skid buffer (registerFwd), it's possible to allow forward to depend on backward. I would even add this to the documentation of the skid buffers in clash-protocols. It's one of the killer features of a skid buffer, in my opinion.

Member

Why would you want to explicitly disallow 0 byte termination transfers?

Now any master must postpone its transaction until it knows more data is coming or until it knows "this" is the end of the packet.

Member

@rowanG077 rowanG077 Aug 28, 2024

Because a zero-byte transfer just adds overhead for, I think, little reason. You have to add another state to handle in all PacketStream components. It's also ambiguous when it occurs. For example, when depacketizing, should you add a zero-byte termination? Can zero-byte terminations be stripped off? That is something you'd want to know in PacketFifo, and it is even essential for upConverter. And probably more questions... It just makes the protocol more complex.

Why is it a problem that the RGMII PacketStream master must postpone its transaction? All *MII interfaces I know behave this way: you get some bits, and when valid falls it's the end of the packet. RGMII is even special in that it transfers a full byte per clock cycle, while others don't; RMII is 2 bits per cycle, for example.

Member

@rowanG077 rowanG077 Aug 28, 2024

On the other hand, a UDP packet without payload is valid, and the same goes for TCP. So perhaps we need this extra complexity.

Do you have some thoughts @t-wallet?

Collaborator Author

@t-wallet t-wallet Aug 28, 2024

We can explore adding zero-data transactions. I think the fact that they can only happen on a packet boundary should make them a little easier to implement. Indeed, with the current setup, handling UDP packets without payload is impossible.

Another benefit of that is that we can remove depacketizeToDfC :) (Edit: that's incorrect, as we need to consume padding for some use cases, which depacketizerC obviously does not do)

Collaborator Author

@t-wallet t-wallet Sep 2, 2024

Empty UDP packets are not a problem for Ethernet because of padding. But we should of course not rely on padding, as other protocols may have bigger headers or may not pad at all.

I have experimented with changing the type of _last to Maybe (Index (dataWidth + 1)), so that its meaning becomes "the amount of valid bytes in the transaction". It was surprisingly easy to implement for all the generic components in clash-protocols: it only took me 2 hours to adjust everything. I didn't test it, but I suspect the extra resource overhead is very minimal. So, it's looking very promising. The next step is to also adapt our Ethernet components, and test whether it works in hardware.
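
To make the difference between the two `_last` encodings concrete, here is a small plain-Haskell sketch. It is only a model: `Int` stands in for the `Index` types, `width` for `dataWidth`, and the function names are hypothetical.

```haskell
-- | Current encoding: `Just i` marks the last transfer and gives the index of
-- its last valid byte, so at least one byte is always valid.
validBytesFromIndex :: Int -> Maybe Int -> Int
validBytesFromIndex width Nothing = width -- not the last transfer: all bytes valid
validBytesFromIndex _ (Just i) = i + 1

-- | Proposed encoding: `Just n` marks the last transfer and gives the number
-- of valid bytes, which may be zero.
validBytesFromCount :: Int -> Maybe Int -> Int
validBytesFromCount width Nothing = width
validBytesFromCount _ (Just n) = n

-- | Translate a byte count on the last transfer back to the index encoding.
-- A count of zero has no representation there.
countToIndex :: Int -> Maybe Int
countToIndex n
  | n <= 0 = Nothing
  | otherwise = Just (n - 1)
```

With a width of 4, `Just 0` means one valid byte in the index encoding but an empty terminating transfer in the count encoding, which is exactly the case the index encoding cannot express.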

forall dom domDDR.
(HiddenClockResetEnable dom) =>
(DomainPeriod dom ~ (*) 2 (DomainPeriod domDDR)) =>
-- | RX delay function
(forall a. Signal domDDR a -> Signal domDDR a) ->
-- | iddr function
( forall a.
(NFDataX a, BitPack a) =>
Clock dom ->
Reset dom ->
Signal domDDR a ->
Signal dom (a, a)
) ->
Circuit (RgmiiRxChannel dom domDDR) (PacketStream dom 1 ())
unsafeRgmiiRxC rxDelay iddr = fromSignals ckt
where
ckt (fwdIn, _) = (pure (), fwdOut)
where
(rxErr, rxData) = unbundle (rgmiiReceiver fwdIn rxDelay iddr)
lastRxErr = register False rxErr
lastRxData = register Nothing rxData

fwdOut = go <$> bundle (rxData, lastRxData, lastRxErr)

go (currData, lastData, lastErr) =
( \byte ->
PacketStreamM2S
{ _data = singleton byte
, _last = case currData of
Nothing -> Just 0
Just _ -> Nothing
, _meta = ()
, _abort = lastErr
}
)
<$> lastData

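The one-cycle delay and end-of-packet marking in `unsafeRgmiiRxC` can be modeled over plain lists. This is a sketch under assumptions, not the Clash circuit: each list element is one clock cycle, `Transfer` is a simplified stand-in for a one-byte `PacketStreamM2S`, and `markLast` is a hypothetical name.

```haskell
import Data.Maybe (isNothing)
import Data.Word (Word8)

-- | Simplified stand-in for a one-byte PacketStream transfer.
data Transfer = Transfer
  { transferByte :: Word8
  , transferLast :: Bool
  , transferAbort :: Bool
  }
  deriving (Eq, Show)

-- | Delay the receiver output by one cycle and mark the end of each packet.
-- Input per cycle: (error bit, byte if valid). Output per cycle: a transfer, if any.
markLast :: [(Bool, Maybe Word8)] -> [Maybe Transfer]
markLast cycles = zipWith step previous cycles
  where
    -- Register contents: the reset values first, then the input one cycle late.
    previous = (False, Nothing) : cycles
    -- Forward the previous byte; it is the last one if nothing valid follows it.
    step (prevErr, prevData) (_, currData) =
      fmap (\b -> Transfer b (isNothing currData) prevErr) prevData
```

Feeding it two valid bytes followed by an idle cycle yields nothing in the first cycle, the first byte without the last flag in the second, and the second byte with the last flag set in the third.
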
{- |
Circuit that adapts a `PacketStream` to an `RgmiiTxChannel`.
Never asserts backpressure.
-}
rgmiiTxC ::
forall dom domDDR.
(HiddenClockResetEnable dom) =>
(DomainPeriod dom ~ (*) 2 (DomainPeriod domDDR)) =>
-- | TX delay function
(forall a. Signal domDDR a -> Signal domDDR a) ->
-- | oddr function
( forall a.
(NFDataX a, BitPack a) =>
Clock dom ->
Reset dom ->
Signal dom a ->
Signal dom a ->
Signal domDDR a
) ->
Circuit (PacketStream dom 1 ()) (RgmiiTxChannel domDDR)
rgmiiTxC txDelay oddr = fromSignals ckt
where
ckt (fwdIn, _) = (pure (PacketStreamS2M True), fwdOut)
where
input = fmap (head . _data) <$> fwdIn
err = maybe False _abort <$> fwdIn
fwdOut = rgmiiSender hasClock hasReset txDelay oddr input err