|ARM Technical Support Knowledge Articles|
Applies to: Versatile Baseboards
Our development boards enable you to prototype hardware and software with the latest ARM cores. A question often asked is "Will baseboard X give me sufficient performance to run my software application?" This question is more complicated than it may first appear, since no two applications are identical, and they utilize board resources in different ways. Generally speaking, the more time the application spends accessing system memory and peripherals, the slower it will run. In a complex system (such as one running Linux), devices other than the CPU (such as the DMAC and LCD controller) will also contend for access to these resources, reducing the maximum instantaneous throughput.
The bus architecture and peripheral controllers in an ARM "Platform Baseboard" are distributed between a CPU test chip or "Development Chip", a system ASIC and/or a system FPGA. This arrangement allows us to develop and release boards quickly for each new ARM processor architecture, but it does not produce a system that is optimized for performance and bus throughput. Consequently, these development systems will always give lower performance than the ASICs that they allow you to prototype.
This article uses a standard test suite to compare the bus throughput of several ARM platform baseboards. It is analogous to the FAQ "How fast is the EB?"
The table below shows typical maximum throughput figures for memory accesses on the Versatile family "PB" boards, with throughput measured in MBytes/second. The memory types and transfer paths listed on the left of the table are: SDRAM and SSRAM on the baseboard, SSRAM on the Logic Tile (LT), NOR Flash, PCI, and DMA from SDRAM to SDRAM.
|SDRAM||109 / 21 / 100 / 42 / 148||122 / 17 / 119 / 18 / 137||70 / 8 / 57 / 11 / 22||119 / 10 / 21 / 11 / 21||86 / 5 / 33 / 5 / 10|
|SSRAM||26 / 22 / 25 / 38 / 44||22 / 13 / 21 / 22 / 27||30 / 7 / 29 / 12 / 23||- / 8 / 15 / 13 / 15||- / 5 / 5 / 23 / 10|
|LT SSRAM||56 / - / 57 / - / 56||145 / 11 / 73 / 10 / 64||70 / 8 / 58 / 11 / 22||119 / 10 / 21 / 12 / 21||86 / 5 / 33 / 5 / 10|
|NOR Flash||- / 17 / 19 / - / -||- / 13 / 22 / - / -||- / 7 / 25 / - / -||- / 8 / 14 / - / -||- / 5 / 20 / - / -|
|PCI||5 / - / 4 / - / 13||- / - / - / - / -||- / - / - / - / -||- / - / - / - / -||- / - / - / - / -|
|DMA||- / - / 34 / - / 34||No DMAC||- / - / 37 / - / 37||- / - / 38 / - / 38||- / - / 38 / - / 38|
Notes on table:
Table entries with five figures are, respectively:
Instruction fetch / Data read (single) / Data read (burst) / Data write (single) / Data write (burst). For example, on the PB11MPCore, a CPU can burst write to the SDRAM at 22 MBytes/second.
Entries which show only a dash "-" indicate that the test software does not accommodate that particular test; in some instances because the test is nonsensical - e.g. you cannot DMA instruction fetches.
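The five-figure entry format described in these notes can be read mechanically. The following is a minimal Python sketch, not part of the test suite on the CD-ROM. The names `Throughput`, `parse_entry` and `transfer_time_ms` are hypothetical, and the sketch assumes 1 MByte = 1,000,000 bytes, which the article does not specify.

```python
from typing import NamedTuple, Optional


class Throughput(NamedTuple):
    """One table entry in MBytes/second; None means the test was not run ("-")."""
    instruction_fetch: Optional[int]
    read_single: Optional[int]
    read_burst: Optional[int]
    write_single: Optional[int]
    write_burst: Optional[int]


def parse_entry(entry: str) -> Throughput:
    """Parse an entry such as '70 / 8 / 57 / 11 / 22' into its five figures."""
    fields = [f.strip() for f in entry.split("/")]
    if len(fields) != 5:
        raise ValueError(f"expected five figures, got {len(fields)}: {entry!r}")
    return Throughput(*(None if f == "-" else int(f) for f in fields))


def transfer_time_ms(num_bytes: int, mbytes_per_second: int) -> float:
    """Best-case time in milliseconds to move num_bytes at the given throughput.

    Assumption: 1 MByte = 1,000,000 bytes.
    """
    return num_bytes / (mbytes_per_second * 1_000_000) * 1000.0


# The SDRAM entry whose burst-write figure is 22, matching the
# PB11MPCore example given in the notes.
sdram = parse_entry("70 / 8 / 57 / 11 / 22")
print(sdram.write_burst)

# Illustrative payload (an assumption): one 640x480 frame at 2 bytes per pixel.
print(transfer_time_ms(640 * 480 * 2, sdram.write_burst))
```

Cells that are not five-figure entries, such as "No DMAC", raise `ValueError` in this sketch rather than being silently skipped. The time returned is a lower bound, since the table lists typical maximum throughput.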
It may be possible to increase the ARM core clock frequency, but this has little effect on memory performance since the main restricting factors are: delays through the bus bridges, memory controller burst performance and the various bus clock ratios.
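One rough way to see why a faster core clock helps so little is to split run time into a core-bound part that scales with the core clock and a memory-bound part that does not. This is an Amdahl's-law style model, introduced here as an assumption rather than something measured by the test suite. The function name and the example fraction are hypothetical.

```python
def speedup_from_core_clock(memory_fraction: float, clock_ratio: float) -> float:
    """Overall speedup when only the core-bound share of run time scales.

    memory_fraction: share of the original run time spent waiting on memory
                     and buses (0.0 to 1.0); assumed unaffected by core clock.
    clock_ratio:     new core clock frequency divided by the old one.
    """
    if not 0.0 <= memory_fraction <= 1.0:
        raise ValueError("memory_fraction must be between 0 and 1")
    if clock_ratio <= 0.0:
        raise ValueError("clock_ratio must be positive")
    core_fraction = 1.0 - memory_fraction
    return 1.0 / (memory_fraction + core_fraction / clock_ratio)


# Hypothetical workload: 80% of run time is memory-bound, core clock doubled.
print(speedup_from_core_clock(0.8, 2.0))
```

With these illustrative numbers, doubling the core clock gives an overall gain of only about 11%, because the memory-bound share of the run time is unchanged.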
For further information on this subject, see the source code and the readme file explaining the test algorithms, both of which can be found on the Versatile family CD-ROM.