This project calculates the bandwidth achievable when the Linked Direct Memory Access (LDMA) is used to read data from an SPI flash memory with the USART operating in synchronous mode.
Because this code benchmarks read performance, there is no need to connect an actual SPI flash device to the EFR32xG21. The timing of read operations is gated by the timing achievable with the USART and the GPIO pins that would otherwise interface to such a device. These pins are driven as they would be if connected to an actual IC and can be observed on an oscilloscope.
Modules used: CMU, EMU, LDMA, GPIO, Sleep Timer, USART0 (for VCOM), and USART2 (SPI flash).
- GSDK v4.4.3
Connect the board via the connector cable to your PC to flash the example.
To test this application, you can either create a project based on an example project or start with an "Empty C Project" based on your hardware.
- Make sure that this repository is added to Preferences > Simplicity Studio > External Repos.
- From the Launcher Home, add your product name to My Products, click on it, and click on the EXAMPLE PROJECTS & DEMOS tab. Find the example project by filtering on "ldma" and "throughput".
- Click the Create button on the Platform - EFR32xG21 LDMA SPI Throughput example. When the example project creation dialog appears, click Create and Finish to generate the project.
- Build and flash this example to the board.
- Create an "Empty C Project" for the "BRD4180A" board using Simplicity Studio v5. Use the default project settings.
- Copy the `app.c` file in the `src` folder to the project root folder (overwriting the existing file).
- Install the software components:
  - Open the .slcp file in the project.
  - Select the SOFTWARE COMPONENTS tab.
  - Install the following components:
    - [Services] → [IO Stream] → [IO Stream: USART]: use the default instance name, vcom
    - [Application] → [Utility] → [Log]
    - [Services] → [Timers] → [Sleep Timer]
    - [Services] → [Device Initialization] → [Peripherals] → [Digital Phase-Locked Loop (DPLL)]: use the default configuration or configure other clock frequencies
    - [Platform] → [Board] → [Board Control]: turn on the Enable Virtual COM UART option
- Build and flash this example to the board.
This code provides a reasonable default configuration. It is a simple matter to change the amount of data read or, in particular, the frequency of the USART module clock. The Digital Phase-Locked Loop (DPLL) is used to generate the system clock (SYSCLK), the top-level clock from which the bus clock (HCLK) and the synchronous peripheral clock (PCLK) are derived. Initialization structures are present in the code (all but one of which are commented out) to set the DPLL output to 80, 50, or 40 MHz. The resulting PCLK frequency is 40, 50, or 40 MHz, respectively.
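As a rough illustration of the clock arithmetic, the following C sketch derives PCLK from a chosen HCLK, the fastest SPI clock (PCLK / 2, the divider setting this example uses), and the resulting ceiling on payload throughput. The helper names are hypothetical and do not come from the project source. The sketch also assumes that HCLK equals SYSCLK and that PCLK runs at half of HCLK whenever HCLK exceeds 50 MHz.

```c
#include <stdint.h>

/* Assumed PCLK ceiling for this sketch: 50 MHz. */
#define PCLK_MAX_HZ 50000000u

/* Hypothetical helper: PCLK follows HCLK up to the ceiling, otherwise HCLK / 2. */
uint32_t pclk_from_hclk(uint32_t hclk_hz)
{
  return (hclk_hz > PCLK_MAX_HZ) ? (hclk_hz / 2u) : hclk_hz;
}

/* Fastest synchronous-mode USART clock used by this example: PCLK / 2. */
uint32_t max_spi_clock_hz(uint32_t pclk_hz)
{
  return pclk_hz / 2u;
}

/* Upper bound on payload throughput: one bit per SPI clock, 8 bits per byte. */
uint32_t max_bytes_per_second(uint32_t spi_clock_hz)
{
  return spi_clock_hz / 8u;
}
```

With a 50 MHz PCLK this gives a 25 MHz SPI clock and a ceiling of 3,125,000 bytes per second. Any measured result should come in at or below that figure.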
After the SPI flash read command and the 24-bit address are sent, 1 Mbyte of dummy data is clocked out of the TX pin, which the SPI flash would ignore. Meanwhile, 1 Mbyte of data is clocked in on the RX pin in blocks of 1 Kbyte at a time. The sleep timer is started and stopped immediately before and after the read sequence, and the difference between the start and end times is used to calculate bandwidth.
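The bandwidth calculation described above amounts to dividing the number of bytes read by the elapsed time. Here is a minimal sketch, assuming a free-running 32-bit tick counter. The function name, the constants, and the 1 kHz tick rate in the usage note are illustrative and are not taken from `app.c`.

```c
#include <stdint.h>

#define BLOCK_SIZE_BYTES 1024u  /* 1 Kbyte per LDMA transfer */
#define BLOCK_COUNT      1024u  /* 1024 blocks = 1 Mbyte in total */
#define TOTAL_BYTES      ((uint64_t)BLOCK_SIZE_BYTES * BLOCK_COUNT)

/* Hypothetical helper: convert start/end tick counts to bytes per second.
 * timer_hz is the tick rate of the timer used for the measurement. */
uint64_t bytes_per_second(uint64_t total_bytes,
                          uint32_t start_ticks,
                          uint32_t end_ticks,
                          uint32_t timer_hz)
{
  /* Unsigned 32-bit subtraction stays correct across one counter wrap. */
  uint32_t elapsed = end_ticks - start_ticks;

  if (elapsed == 0u) {
    return 0u;  /* too short to measure at this tick rate */
  }
  return (total_bytes * timer_hz) / elapsed;
}
```

For example, `bytes_per_second(TOTAL_BYTES, 0, 500, 1000)` models 1 Mbyte read in 500 ticks of a 1 kHz timer and works out to 2,097,152 bytes per second.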
The application flow is described as follows:
- USART2 is initialized for operation in synchronous mode (SPI).
  - The divider is set for the maximum possible clock frequency (PCLK / 2). The USART is set to transmit and receive data MSB first. This is the standard for SPI devices and what any M25P40-compatible flash expects. The clock phase (CLKPHA) and clock polarity (CLKPOL) are both set to 0, which is often called SPI mode 0.
  - Because the delays through the EFR32 GPIO multiplexing logic are relatively long, it is necessary to enable synchronous master sample delay (USART_CTRL_SMSDELAY), which results in input data being sampled on the subsequent clock edge. In SPI mode 0, this means that input data is sampled not on the rising edge of the clock but on the next falling edge. This is perfectly allowable and expected because any modern SPI flash device supports clock rates well in excess (100 MHz is not unusual) of the maximum 50 MHz supported by the original M25P40. The slave device will not change the transmitted data until this edge is received, at which point the master will have already latched it.
- The LDMA is initialized with one channel configured to transmit the outgoing data and another configured to receive the incoming data.
  - Note that only 1024 bytes of data (1 Kbyte as defined) are transmitted and received at a time. While this is a synthetic benchmark, the 1 Kbyte block size is a reasonable amount because the data has to be stored somewhere and processed within the limited RAM available on the EFR32xG21.
- The sleep timer is started so that its counter increments every millisecond. The start time is synchronized, that is, saved as soon as the sleep timer counter increments.
- The read sequence is started by sending the READ command and the 24-bit flash address.
- The LDMA is started to transmit and receive 1024 bytes of data.
  - This is performed repeatedly in a for-loop that executes the specified number of times. Two state variables (`rxDone` and `txDone`) are set to `false` and the device enters EM1. When the LDMA channels handling the transmit and receive data streams are complete, the device exits EM1. The flags are set to `true` in the respective callback functions.
- The sleep timer is stopped and the counter is captured.
- The data transfer rate is calculated and displayed along with the LDMA clock frequency.
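The read sequence in the flow above begins with a four-byte header: the READ opcode followed by the 24-bit address, sent MSB first. The sketch below shows one way to assemble that header. The function and macro names are hypothetical and are not taken from `app.c`; the opcode value 0x03 is the READ instruction on M25P40-compatible devices.

```c
#include <stdint.h>

#define FLASH_CMD_READ 0x03u  /* READ opcode on M25P40-compatible devices */

/* Hypothetical helper: fill hdr[0..3] with the READ opcode followed by
 * the 24-bit address, most significant byte first. */
void build_read_header(uint32_t address, uint8_t hdr[4])
{
  hdr[0] = FLASH_CMD_READ;
  hdr[1] = (uint8_t)((address >> 16) & 0xFFu);
  hdr[2] = (uint8_t)((address >> 8) & 0xFFu);
  hdr[3] = (uint8_t)(address & 0xFFu);
}
```

Because this benchmark runs without a flash device attached, the address value does not affect the measured rate; only the four header bytes clocked out ahead of the data matter for timing.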