Processing in the FPGA
Processing in the FPGA results in a compact system. There are two main reasons for this. First, the ADQ digitizers require advanced FPGAs because of the high data rate. Second, these FPGAs are powerful computational resources, so custom processing in the FPGA uses the available resources efficiently and off-loads the host PC. Balancing the data rate between the different nodes is the most important aspect. For example, the ADCs in ADQ7DC deliver 20 Gbytes/s, which is too much to transmit to the host PC. Only in the FPGA can all data be accessed. We address this in the following ways:
- Trigger rate: The effective data rate is record length x trigger rate. This means that only a subset of the data is considered. This is data selection in the time domain. It is supported by FWDAQ and FWPD.
- DDC (digital down-conversion): filtering and decimation. This is data selection in the frequency domain and is supported by FWSDR.
- Application-specific firmware: These are firmware packages that perform application-specific data rate reduction in the FPGA: FWATD, FWPD, FWSDR.
- Custom firmware: We have a wide variety of custom firmware implementations.
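The trigger-rate and decimation options above come down to simple arithmetic. The sketch below is illustrative only: the function names, the 2 bytes per sample, and the example record length, trigger rate and decimation factor are assumptions, not values from a specific digitizer configuration. Only the 20 Gbytes/s ADC rate and the approximate 7 Gbytes/s PCIe rate are figures quoted in this text. The sketch also ignores any change in sample format after decimation.

```python
def triggered_rate(record_length_samples, trigger_rate_hz, bytes_per_sample=2):
    """Effective data rate in bytes/s for triggered acquisition:
    record length x trigger rate (x assumed bytes per sample)."""
    return record_length_samples * trigger_rate_hz * bytes_per_sample


def decimated_rate(raw_rate_bytes_per_s, decimation_factor):
    """Data rate in bytes/s after decimation by an integer factor."""
    return raw_rate_bytes_per_s / decimation_factor


def fits_link(rate_bytes_per_s, link_bytes_per_s):
    """True if the reduced data rate fits within the link budget."""
    return rate_bytes_per_s <= link_bytes_per_s


ADC_RATE = 20e9   # bytes/s, ADQ7DC figure quoted in the text
PCIE_RATE = 7e9   # bytes/s, approximate PCIe figure quoted in the text

# Hypothetical example: 10,000-sample records at a 100 kHz trigger rate.
rate = triggered_rate(10_000, 100_000)
print(f"triggered: {rate / 1e9:.1f} Gbytes/s, fits PCIe: {fits_link(rate, PCIE_RATE)}")

# Hypothetical example: decimating the full ADC stream by 4.
rate = decimated_rate(ADC_RATE, 4)
print(f"decimated: {rate / 1e9:.1f} Gbytes/s, fits PCIe: {fits_link(rate, PCIE_RATE)}")
```

With these assumed numbers, both reduced streams fit the PCIe budget, while the raw 20 Gbytes/s stream does not.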
The advantage of processing in the FPGA is a free choice of connection to the PC. The fastest is PCIe, which offers up to about 7 Gbytes/s, and with powerful processing in the FPGA, USB3.0 or 10GbE becomes a possibility. Processing in the FPGA also allows for maximum flexibility in the mechanical design. But of course, there are trade-offs:
- The development time for FPGA code is very long compared to C code. The development cost is therefore high, but the production cost is low. As a rule of thumb, custom firmware is more suitable for large volumes.
- A data interface requirement may make custom firmware the more suitable choice in low volume as well. (Note that the pre-defined application firmware packages are already available independently of volume.)
- Some types of processing are more efficient in the FPGA and some are not. The FPGA is good at streaming-data applications, such as filters. Memory-intensive operations such as FFT are less suitable.
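To illustrate why a filter maps well to streaming processing, here is a minimal Python sketch of a FIR filter that consumes one sample at a time and keeps only a short, fixed-size delay line as state. It is a software model for illustration, not FPGA code; the function name and the example coefficients are assumptions.

```python
from collections import deque


def fir_stream(samples, coefficients):
    """Yield one filtered output per input sample.

    The only state is a delay line of len(coefficients) samples, so memory
    use is fixed regardless of how long the input stream is.
    """
    delay_line = deque([0.0] * len(coefficients), maxlen=len(coefficients))
    for sample in samples:
        delay_line.appendleft(sample)  # newest sample first, oldest drops out
        yield sum(c * x for c, x in zip(coefficients, delay_line))


# Hypothetical 4-tap moving average applied to a short step input.
taps = [0.25, 0.25, 0.25, 0.25]
step = [0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0]
print(list(fir_stream(step, taps)))
```

By contrast, a block FFT generally needs a whole block of samples buffered before it can produce its output, which is the kind of memory-intensive pattern the last point refers to.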
Peer-to-peer streaming means that the data is sent to the computational node without passing through the host PC's CPU or DRAM. There are three levels of data transfer:
- The flexibility and compatibility of a PC system come from each piece of hardware having its own driver, which connects to the user’s application. Data between boards is then sent via the PC and copied from one driver’s memory space to the other driver’s memory space. This is safe, flexible and compatible. The drawback is a heavy load on the PC, since the data is copied between two memory spaces. For example, streaming data at 7 Gbytes/s means a 28 Gbytes/s load on the DRAM.
- The next step is to share the memory between the drivers. This is called a pinned buffer, and it removes the need to copy data. The comparable data load is halved to 14 Gbytes/s. This depends on the design of the driver: the driver has to allow memory sharing, which is sometimes possible. (There is thus a dependency on the third-party driver and operating system that has to be investigated for the specific installation.)
- The highest level is peer-to-peer streaming. The data is sent from one PCIe board to another via the PCIe switch (or root complex) without passing through the host PC. ADQ7, ADQ32 and ADQ14 (only in Windows) support peer-to-peer streaming. The data transfer is very efficient and frees the host PC for tasks other than copying data.
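The DRAM load figures in the three levels above can be reproduced by counting memory accesses per streamed byte. The accounting below is an interpretation, not something the text spells out: it assumes that each pass of the data into or out of DRAM counts once, so the copied path costs four accesses (write by the source, read and write for the copy, read by the destination), the shared pinned buffer costs two, and peer-to-peer costs none. The names are hypothetical.

```python
# Assumed DRAM accesses per streamed byte for each transfer level.
DRAM_ACCESSES = {
    "copy_between_drivers": 4,  # write in, read for copy, write copy, read out
    "pinned_shared_buffer": 2,  # write in, read out
    "peer_to_peer": 0,          # data bypasses host DRAM
}


def dram_load(stream_rate_gbytes_per_s, level):
    """DRAM traffic in Gbytes/s for a given stream rate and transfer level."""
    return stream_rate_gbytes_per_s * DRAM_ACCESSES[level]


# 7 Gbytes/s is the example stream rate used in the text.
for level in DRAM_ACCESSES:
    print(f"{level}: {dram_load(7, level)} Gbytes/s")
```

Under this assumption, a 7 Gbytes/s stream gives 28, 14 and 0 Gbytes/s of DRAM traffic respectively, matching the figures quoted for the first two levels.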
The special advantage of PXIe is that a PXIe chassis can have up to 18 board slots. With peer-to-peer streaming from, for example, ADQ7 to ADQDSU, several data streams can be set up via the backplane. Streaming from 4x ADQ7 to 4x ADQDSU gives a 20 Gbytes/s effective data rate without loading the PC. TSPD works with peer-to-peer streaming as the main data transfer method.
Streaming to GPU
Streaming data to GPU is a powerful way to implement AI and machine learning algorithms. The ADQ series of digitizers is suitable for gathering large amounts of data. Combining it with the peer-to-peer streaming concept creates a powerful computational machine.
Streaming to CPU
Streaming to CPU is a complement to streaming to GPU. The CPU is generally suitable for random, data-driven tasks. Streaming to CPU therefore suits data-driven acquisition, such as mass spectrometry or neutron time-of-flight, and use with FWPD.
Streaming data to SSD
- TSPD’s digitizers support peer-to-peer streaming to SSD storage.
- The concept includes an ADQ digitizer, firmware for SSD streaming, a software package for setting up the link, and an SSD.
- For PXIe, we offer the TSPD ADQDSU SSD unit as a turn-key recording system.
- For PCIe, the SSD can be a PCIe-based SSD using NVMe. We can advise on specific models.
- The ADQ digitizer can always stream data to an SSD via the PC DRAM. Peer-to-peer streaming (to save DRAM bandwidth and PC load) requires the firmware and software package.