Chapter 3 System Overview
The ETA radio telescope is designed to detect transient pulses from celestial emissions. To observe unpredicted transient events, the telescope must have a large viewing area and its sensitivity must be limited by the galactic background noise. It has been shown that sufficient sensitivity and viewing area can be achieved at low frequencies through the use of inexpensive dipole-like antennas [2]. The ETA system also aids data analysis by performing real-time processing of the received RF signal. System processing is performed by commercially available Altera and Xilinx FPGA development boards. The use of commercially available parts and FPGAs facilitates ETA system design and reduces development time. The system must also be capable of recording large amounts of data for an extended period of time. Data capture and system control must be manageable off-site, as the ETA system is located near Rosman, North Carolina, a five-hour drive from the Virginia Tech campus. This chapter gives a brief overview of the entire system and then describes the individual elements in greater detail. The overview does not discuss the RCC in detail but is meant to provide a basis for further discussion of the RCC in Chapters 4 and 5.
Overview
As described, the ETA radio telescope is designed for low-frequency operation between 29 and 47 MHz, full-sky observation, high throughput, real-time digital signal processing, and continuous remote operation. Figure 3.1 provides a graphical representation of the ETA system architecture described in this section. The system comprises 24 dipole-like antennas and feed systems, 24 analog receivers, twelve digital receiver Altera FPGA boards, sixteen Xilinx FPGA boards, four acquisition PCs, one control PC (not shown in Figure 3.1), and interconnect.
The array of twelve dual-polarized dipole-like stands receives radio frequency signals. Figure 3.2 shows one of the ETA antenna stands. Each stand consists of two orthogonally placed active antennas with corresponding feed systems. The two orthogonal polarizations provide a total of 24 radio frequency inputs to the system. The feed system for each of the 24 antennas is located inside the protective PVC tubing and comprises a preamplifier, balun, and coaxial cable. The coaxial cable carries the signals underground to a central building and connects to the analog receivers. The 24 analog receivers amplify and filter the dipole inputs before they are converted to digital signals in the digital receiver nodes.
The twelve receiver nodes are implemented on Altera Stratix DSP development boards and are referred to as S25s [26]. Each S25 digitally samples two antenna feeds at 120 Msps. The digital signal is then processed in the Stratix EP1S25 FPGAs to perform the digital filtering, down conversion, and multiplexing of two antenna data streams. The multiplexed data stream is then passed through a source synchronous 5-bit parallel connection to the reconfigurable computing cluster (RCC).
The RCC is a two-level network of sixteen Xilinx ML310 FPGA development boards, referred to as ML310s [15]. Twelve RCC outer nodes receive digital data streams from the S25 boards, perform the required digital signal processing, and pass the resulting data streams to two of four RCC inner nodes. The network between RCC outer and inner nodes is formed using the Aurora protocol over InfiniBand cables. Aurora provides a lightweight, high-speed, self-synchronous serial protocol well suited to data streaming in the ETA architecture. The RCC inner nodes complete the required signal processing, provide additional filtering, and output data for acquisition to one of four PC acquisition nodes.
Each acquisition node is capable of recording data continuously for up to one hour. This results in approximately 200 GB of data per PC, or 800 GB in aggregate. Data is then archived to tape for easy transport, analysis, and long-term off-site storage. The control PC, which is not shown in Figure 3.1, allows the operation of the system to be observed, controlled, or modified remotely. The control PC connects to the four acquisition nodes through a Gigabit Ethernet switch and to the sixteen RCC nodes through two 8-port serial RS232 PCI cards. Through the RS232 interface, the RCC nodes can be controlled or monitored with a hardware UART controller. Users can log into the control PC from the Internet to manage system status and acquisitions. The remainder of this chapter describes the ETA system elements in more detail.
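As a rough check on the recording figures above, the sustained write rate implied by about 200 GB in one hour can be worked out directly. The sketch below assumes decimal gigabytes (10^9 bytes) and exactly one hour of recording; both are assumptions, since the text gives only approximate values, and the variable names are illustrative.

```python
# Back-of-the-envelope check of the recording figures quoted in the text.
# Assumptions (not stated in the text): 1 GB = 1e9 bytes, recording lasts 3600 s.

GB = 1e9
BYTES_PER_PC = 200 * GB       # approximate capture per acquisition PC
RECORD_SECONDS = 3600         # one hour of continuous recording
NUM_ACQUISITION_PCS = 4


def sustained_rate_mb_per_s(total_bytes: float, seconds: float) -> float:
    """Average write rate in MB/s (1 MB = 1e6 bytes)."""
    return total_bytes / seconds / 1e6


per_pc_rate = sustained_rate_mb_per_s(BYTES_PER_PC, RECORD_SECONDS)
aggregate_gb = NUM_ACQUISITION_PCS * BYTES_PER_PC / GB

print(f"per-PC sustained rate: {per_pc_rate:.1f} MB/s")
print(f"aggregate per hour:    {aggregate_gb:.0f} GB")
```

Under these assumptions each acquisition PC must sustain a write rate on the order of 55 MB/s for the full hour.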
RF Antenna and Amplifier Systems
Low-frequency transient pulse searches require antennas and feed systems that have a wide field of view and are limited in sensitivity by the radio frequency noise produced in the galaxy. This can be attained with a simple, low-cost inverted-V dipole-like antenna and feed system [2]. The dipole antenna arms are constructed from 3/4-in (1.9 cm) x 3/4-in aluminum L-shaped angle stock that is 1/8-in thick. Angle stock was chosen because it yields rigid, low-cost antennas that have the desired bandwidth and are easily constructed. The total length of a dipole, including both arms and the feed gap, is 3.8 m. This gives the dipoles a resonant frequency near the middle of the ETA frequency range, at 38 MHz. The dipole arms are also bent down at a 45 degree angle, which broadens the antenna pattern. Each of the twelve antenna stands has one antenna placed in a north-south orientation and another placed orthogonally in an east-west orientation. This results in twelve dual-polarized antenna stands. The antenna terminals are located at the top of a two-meter-high mast, which places them approximately one-quarter wavelength above the ground at resonance. The mast is constructed from PVC electrical conduit with a four-inch diameter. The antenna dimensions are shown in Figure 3.3 and the twelve-stand arrangement is shown in Figure .
Each dipole is connected through a preamplifier and balun circuit to a coaxial cable inside the mast. Figure 3.5 shows the preamplifier and balun circuit [27]. The coaxial cable carries each signal to a central building, where it is processed by an analog receiver. The analog receiver, shown in Figure 3.6, amplifies and filters the incoming signals.
Tests of the antenna and feed system design showed that it is galactic noise limited and that further design optimization would provide little, if any, benefit [27]. Figure 3.7 shows the received signal at the antenna terminals. The figure shows man-made RFI outside the 29–47 MHz band and an in-band noise level that follows the expected galactic noise background curve. The simple antenna design suits the ETA system because the antennas and amplifiers can be produced easily and cheaply. The antennas achieve the maximum possible performance in that their sensitivity is limited by galactic noise over the 29–47 MHz band. From the output of the analog receivers, each signal passes to the digital receiver nodes, where it is digitally sampled and processed further.
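The antenna dimensions quoted above can be checked against the 38 MHz resonance with a short calculation. This is only a free-space sketch: it ignores the bent arms, the conductor cross-section, and ground effects, all of which shift the real resonance, and the names used are illustrative.

```python
# Free-space sanity check of the dipole length and mast height at 38 MHz.
# Assumption: ideal free-space wavelengths; real antennas deviate from these.

SPEED_OF_LIGHT_M_S = 299_792_458.0
RESONANT_HZ = 38e6
DIPOLE_LENGTH_M = 3.8     # total length, both arms plus feed gap (from the text)
MAST_HEIGHT_M = 2.0       # height of the antenna terminals (from the text)


def wavelength_m(freq_hz: float) -> float:
    """Free-space wavelength in meters."""
    return SPEED_OF_LIGHT_M_S / freq_hz


lam = wavelength_m(RESONANT_HZ)
half_wave = lam / 2
quarter_wave = lam / 4

print(f"wavelength at 38 MHz: {lam:.2f} m")
print(f"half wavelength:      {half_wave:.2f} m (dipole is {DIPOLE_LENGTH_M} m)")
print(f"quarter wavelength:   {quarter_wave:.2f} m (mast is {MAST_HEIGHT_M} m)")
```

The 3.8 m dipole comes out a few percent shorter than a free-space half wavelength, and the 2 m mast sits very close to a quarter wavelength, consistent with the description.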
Digital Receiver Boards
The digital receiver nodes digitally sample, filter, and downconvert the analog antenna inputs. Since designing custom hardware to perform this function would be costly and time consuming, receiver functionality is implemented on Altera Stratix DSP development boards (S25s) [26]. The S25 is used because it contains the necessary hardware elements for the design and is familiar to the design team. Each board contains two analog-to-digital converters (ADCs) capable of sampling at 125 Msps and can therefore process two antenna inputs. All 24 analog input streams can be digitally sampled using twelve S25 boards. The on-board Stratix EP1S25 FPGA [28] performs the necessary signal processing to filter and downconvert the signal. The board also contains ample I/O connectors for data transfers. The S25 board is shown in Figure 3.8.
The on-board ADCs in this design sample the analog stream at a sampling frequency (Fs) of 120 Msps. The digitally sampled signal is then sent to the Stratix FPGA for digital processing. Figure 3.9 summarizes the processing which takes place in the Stratix FPGA.
Figure 3.10 shows the transfer function of the signal sent to the Stratix FPGA from the ADCs. In the Stratix chip, the Fs/4 Shift Left block downconverts the digital signal by one quarter of the sample frequency, or 30 MHz. Downconversion by Fs/4 is a special case that is easily implemented in hardware by changing the signs of the sampled data stream. This downconversion shifts the passband center frequency from 38 MHz to 8 MHz and maintains the sample rate of 120 Msps. The output of this block, whose transfer function is shown in Figure 3.11, is then passed through a 21 MHz low-pass FIR filter, which also reduces the sample rate to 60 Msps. The filter suppresses out-of-band signals that could otherwise be aliased into the baseband during subsequent processing steps. The signal keeps its center frequency of 8 MHz but now has a sample rate of 60 Msps. A 60 Msps numerically controlled oscillator (NCO) and complex tuner are then used to shift the center frequency from 8 MHz to 0 MHz. The output of the complex tuner, whose transfer function is shown in Figure 3.12, passes through a second multirate FIR filter. The second FIR is a 2.5 MHz low-pass filter with an input sample rate of 60 Msps, which is reduced to 7.5 Msps at the output. The transfer function at the output of the second FIR is shown in Figure 3.13; this stage provides a 5 MHz passband to the rest of the system.
Using this method, any 5 MHz band between 29 and 47 MHz can be selected by making minor adjustments to the NCO. Since each S25 board processes two data streams, most of this circuitry is replicated for the second input. At the end of this process, each of the two signals is represented as a complex value with a 7-bit I and a 7-bit Q component, giving 28 data bits, which are transmitted to the RCC outer nodes. The Output Formatter prepares data for transmission by multiplexing the two streams, a sample counter, and a start-of-frame signal onto low-voltage differential signal outputs. Five data bits along with a clock are transmitted in parallel at 60 MHz, or 37.5 MB/s, over Precision Interconnect Blue Ribbon brand coaxial cable with MICTOR connectors [29] to the RCC outer nodes. The mini-coaxial ribbon cables were chosen because they have excellent noise immunity, produce minimal RFI, and provide sufficient throughput.
Each S25 board performs its signal processing in parallel with the other S25s. It is important that the boards source data synchronously at identical sample rates. If data is not sourced at the same rate, downstream synchronization buffers in the RCC will overflow, causing loss of synchronization and data. To maintain a synchronous system, one S25 generates the clock and reset signals for all of the S25 boards. This increases clock skew between boards but allows every S25 board to output data at the same sample rate. Multiple data streams are then synchronized using the embedded counter and start-of-frame bits after they are received in the RCC nodes.
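The Fs/4 special case can be illustrated in software. Mixing with a complex exponential at Fs/4 means multiplying successive samples by the repeating sequence 1, -j, -1, j, which needs only sign changes and routing to the I or Q path, with no multipliers. The sketch below is a pure-Python illustration, not the FPGA implementation; the test tone, the 120-point DFT, and the function names are assumptions chosen so that each DFT bin is 1 MHz wide.

```python
import cmath
import math

FS_HZ = 120e6        # ADC sample rate from the text
N = 120              # 120 samples at 120 Msps gives 1 MHz per DFT bin
TONE_HZ = 38e6       # test tone at the passband center

# exp(-j*pi*n/2) cycles through these four values, so the mixer reduces
# to sign changes and I/Q routing.
FS4_SEQUENCE = (1, -1j, -1, 1j)


def fs4_downconvert(samples):
    """Shift the spectrum down by Fs/4 using the four-entry sequence."""
    return [x * FS4_SEQUENCE[n % 4] for n, x in enumerate(samples)]


def dft_magnitude(samples, k):
    """Magnitude of DFT bin k, computed directly (fine for small N)."""
    total = len(samples)
    acc = sum(x * cmath.exp(-2j * math.pi * k * n / total)
              for n, x in enumerate(samples))
    return abs(acc)


real_tone = [math.cos(2 * math.pi * TONE_HZ * n / FS_HZ) for n in range(N)]
shifted = fs4_downconvert(real_tone)

bin_hz = FS_HZ / N
mag_at_8mhz = dft_magnitude(shifted, round(8e6 / bin_hz))
mag_at_38mhz = dft_magnitude(shifted, round(38e6 / bin_hz))
mag_at_52mhz = dft_magnitude(shifted, round(52e6 / bin_hz))

print(f"|X| at  8 MHz: {mag_at_8mhz:.1f}")
print(f"|X| at 38 MHz: {mag_at_38mhz:.1f}")
print(f"|X| at 52 MHz: {mag_at_52mhz:.1f}")
```

The 38 MHz tone reappears at 8 MHz, as the text describes. Because the input is real, its negative-frequency image also shifts and shows up at the 52 MHz bin (equivalently -68 MHz); that image lies well outside a 21 MHz low-pass response, which is consistent with the filter that follows the shift.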
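The link figures in this paragraph can be cross-checked with simple arithmetic. The sketch below takes the 7.5 Msps output rate, the 28 payload bits, and the 5-bit, 60 MHz link from the text, and assumes one 5-bit word is transferred per clock cycle. How the remaining link capacity is divided between the sample counter, the start-of-frame signal, and any padding is not stated here, so the code reports only the total.

```python
# Cross-check of the S25-to-RCC link budget described in the text.
# Assumption: one 5-bit word is transferred per 60 MHz clock cycle.

SAMPLE_RATE_HZ = 7.5e6       # complex sample rate after the second FIR
BITS_I = 7
BITS_Q = 7
STREAMS_PER_BOARD = 2        # two antenna inputs per S25
LINK_WIDTH_BITS = 5
LINK_CLOCK_HZ = 60e6

payload_bits_per_sample = STREAMS_PER_BOARD * (BITS_I + BITS_Q)
payload_bps = payload_bits_per_sample * SAMPLE_RATE_HZ

link_bps = LINK_WIDTH_BITS * LINK_CLOCK_HZ
link_mbytes_per_s = link_bps / 8 / 1e6

# Link bits available during each sample period, and what is left over
# for the counter, start-of-frame signal, and any padding.
link_bits_per_sample = link_bps / SAMPLE_RATE_HZ
overhead_bits_per_sample = link_bits_per_sample - payload_bits_per_sample

print(f"payload:  {payload_bits_per_sample} bits/sample, {payload_bps / 1e6:.0f} Mb/s")
print(f"link:     {link_bps / 1e6:.0f} Mb/s = {link_mbytes_per_s} MB/s")
print(f"overhead: {overhead_bits_per_sample:.0f} bits per sample period")
```

The 300 Mb/s link matches the quoted 37.5 MB/s and leaves room beyond the 210 Mb/s of I/Q payload for the framing information.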
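The idea of aligning streams on an embedded counter can be sketched in software. The example below is a hypothetical illustration only: the representation of each stream as a list of (counter, sample) pairs and the function name align_streams are assumptions, counter wrap-around is ignored, and the actual RCC logic is hardware described in later chapters.

```python
# Hypothetical software analogue of counter-based stream alignment.
# Each stream is modeled as a list of (counter, sample) pairs; this data
# layout is an assumption made for illustration, not the ETA frame format.

def align_streams(streams):
    """Trim streams so that all start at the same counter value.

    The latest-starting stream defines the common start. Leading words of
    the other streams are dropped, and all streams are cut to equal length.
    Counter wrap-around is not handled in this sketch.
    """
    start = max(stream[0][0] for stream in streams)
    aligned = []
    for stream in streams:
        # Index of the first word whose counter equals the common start.
        offset = next(i for i, (count, _) in enumerate(stream) if count == start)
        aligned.append(stream[offset:])
    length = min(len(s) for s in aligned)
    return [s[:length] for s in aligned]


# Two streams whose first received words carry different counter values.
stream_a = [(n, f"a{n}") for n in range(10, 16)]   # counters 10..15
stream_b = [(n, f"b{n}") for n in range(12, 18)]   # counters 12..17

aligned = align_streams([stream_a, stream_b])
for stream in aligned:
    print([count for count, _ in stream])
```

Both aligned streams cover counters 12 through 15. Alignment of this kind only works if every board advances its counter at the same rate, which is why a shared clock and reset matter.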
Reconfigurable Computing Cluster (RCC)
The reconfigurable computing cluster consists of sixteen Xilinx ML310 development boards (ML310s) [15] that are networked using InfiniBand cables [30]. Similar to the S25s, the Xilinx ML310 is used because it is familiar to the design team and provides sufficient hardware components and I/O connections. The nodes are divided between twelve outer nodes, which connect directly to the digital receiver nodes, and four inner nodes, which connect directly to the acquisition PCs. The purpose of the RCC is to implement signal processing, provide a reconfigurable network that can be modified as required, and reformat the data stream for acquisition. Figure 3.14 shows an ML310 board and the two custom adapter boards required to connect the S25s, RCC network, and acquisition PCs.
The ML310's multi-gigabit transceivers (MGTs) allow for up to eight InfiniBand connections to each board. The Aurora lightweight, point-to-point serial protocol [25] is implemented over the InfiniBand connections. Using these links, the twelve RCC outer nodes stream processed data to the RCC inner nodes. The inner nodes combine data from multiple outer nodes and send the resultant data stream, through a 16-bit parallel connection, to the acquisition PCs. Each RCC node also communicates with the control PC to send and receive status or control information. These connections are made through the serial port using a UART and a physical RS232 connection, and they allow the RCC nodes to be observed and updated through the control PC. A more detailed discussion of the RCC is provided in Chapters 4 and 5.
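A quick count shows that the two-level network fits within the eight connections available per board. The overview states that each of the twelve outer nodes passes data to two of the four inner nodes; the specific pairing is not given in this chapter, so the round-robin assignment below is purely hypothetical and serves only to count links.

```python
from collections import Counter

NUM_OUTER = 12            # RCC outer nodes (from the text)
NUM_INNER = 4             # RCC inner nodes (from the text)
LINKS_PER_OUTER = 2       # each outer node feeds two inner nodes (from the text)
MAX_LINKS_PER_BOARD = 8   # InfiniBand connections available per board (from the text)


def assign_links(num_outer, num_inner, links_per_outer):
    """Hypothetical round-robin pairing: outer node i feeds inner nodes
    i, i+1, ... (mod num_inner). The real ETA pairing may differ."""
    return {
        outer: [(outer + k) % num_inner for k in range(links_per_outer)]
        for outer in range(num_outer)
    }


links = assign_links(NUM_OUTER, NUM_INNER, LINKS_PER_OUTER)
load = Counter(inner for targets in links.values() for inner in targets)

for inner in range(NUM_INNER):
    print(f"inner node {inner}: {load[inner]} incoming links")
```

With the links spread evenly, each inner node terminates six of the 24 outer-to-inner links, which is below the limit of eight connections per board.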
1 Introduction
1.1 Motivation
1.2 Contributions
1.3 Thesis Organization
2 Background
2.1 Radio Astronomy
2.2 Algorithms
2.3 FPGAs
2.4 Application-Specific Processing
2.5 Custom Communication
2.6 Clocking and Synchronization
3 System Overview
3.1 Overview
3.2 RF Antenna and Amplifier Systems
3.3 Digital Receiver Boards
3.4 Reconfigurable Computing Cluster (RCC)
3.5 Acquisition PCs
3.6 Control PC
4 RCC Design Considerations
4.1 System Modes
4.2 Networking
4.3 Synchronization
4.4 Remote Access to System Nodes
5 Reconfigurable Computing Cluster
5.1 Overview
5.2 Raw Mode
5.3 Beamforming Mode
5.4 FFT Beamforming Mode
6 Results and Analysis
6.1 Partitioning
6.2 Synchronization
6.3 Computational Performance
6.4 Communication Performance
6.5 Resource Efficiency
6.6 Scalability
7 Conclusions
7.1 Summary and Contributions
7.2 Conclusions
7.3 Future Work
Bibliography
Stream Communication and Computation in the Eight-meter-wavelength Transient Array (ETA) Radio Telescope