Digital FIR Decimating Filter for GALFA Spectrometer
Documented by Wonsop Sim, July 30th, 2004

Introduction: The GALFA Spectrometer Board requires a decimating FIR filter to reduce the bandwidth from 100 MHz by a factor of fourteen to 7.14 MHz. A four-channel FIR filter is required for the real and imaginary parts of the two polarizations coming from the Arecibo feed. This FIR filter is implemented digitally on an FPGA board. It has been designed both with the Xilinx System Generator toolset for MATLAB Simulink and in the Verilog HDL.

Filter Specifications: The FIR filter uses a Hanning window with 224 taps. The coefficients are scaled so that the largest coefficient has the value 0.95. This introduces a gain of approximately 24.2. The coefficients are also converted to an 18-bit fixed-point format with a binary point of 17, which introduces some quantization noise. The cutoff frequency used for the Hanning window is 3.21 MHz, or 0.9 times the Nyquist frequency of the decimated output (3.57 MHz, half of 7.14 MHz). This choice is made because the Hanning window has an attenuation of 6 dB at the cutoff frequency, but an attenuation of at least 10 dB is desired at the Nyquist frequency. Because the filter is decimating, all frequencies outside the Nyquist frequency will be aliased into the passband. It is therefore important that the filter response falls off quickly and reaches high attenuation. The Hanning window attenuates by more than 100 dB beyond 10 MHz, and its attenuation is well above 15 dB at the Nyquist frequency, as shown in the figure below.
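
The scaling and fixed-point conversion described above can be illustrated with a short Python sketch. It assumes a conventional windowed-sinc design with a Hann window that has non-zero endpoints; the original design procedure is not stated here, so the function names, the window formula, and the rounding mode are all assumptions for illustration. For the same reason, the gain this sketch computes need not match the figure quoted above.

```python
import math

N_TAPS = 224        # filter length from the specification
FS_MHZ = 100.0      # input sample rate
FC_MHZ = 3.21       # cutoff frequency
PEAK = 0.95         # largest coefficient after scaling
WORD_BITS = 18      # fixed-point word length
FRAC_BITS = 17      # position of the binary point


def hann_lowpass(n_taps=N_TAPS, fc=FC_MHZ, fs=FS_MHZ):
    """Windowed-sinc low-pass prototype (assumed design method).

    Only the first half is computed and then mirrored, so the
    result is exactly symmetric. n_taps is assumed to be even.
    """
    centre = (n_taps - 1) / 2.0
    wc = 2.0 * fc / fs                      # cutoff as a fraction of fs/2
    half = []
    for n in range(n_taps // 2):
        t = n - centre                      # never zero for even n_taps
        ideal = math.sin(math.pi * wc * t) / (math.pi * t)
        # Hann window with non-zero endpoints (assumption).
        window = 0.5 - 0.5 * math.cos(2.0 * math.pi * (n + 1) / (n_taps + 1))
        half.append(ideal * window)
    return half + half[::-1]


def quantize(coeffs, peak=PEAK, word_bits=WORD_BITS, frac_bits=FRAC_BITS):
    """Scale so the largest magnitude equals `peak`, then round to integers."""
    scale = peak / max(abs(c) for c in coeffs)
    lo = -(1 << (word_bits - 1))
    hi = (1 << (word_bits - 1)) - 1
    return [min(hi, max(lo, round(c * scale * (1 << frac_bits)))) for c in coeffs]


coeffs = quantize(hann_lowpass())
# Largest stored value is round(0.95 * 2**17) = 124518.
print("largest coefficient:", max(coeffs))
# The DC gain is the sum of the scaled coefficients.
print("DC gain of this sketch:", sum(coeffs) / (1 << FRAC_BITS))
```

Each stored value differs from its unquantized counterpart by at most half of one least significant bit, that is 2^-18, which is the source of the quantization noise mentioned above.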


Filter Implementation: The design of the FIR filter was optimized to reduce the number of multipliers needed. Two properties of the filter make this possible: symmetric coefficients and decimation. A filter with symmetric coefficients can halve the number of multiplications by pre-adding the input samples that share a coefficient. For a decimating filter, an output sample needs to be computed only once every fourteen clock cycles, so the computation can be spread over those cycles. The number of multipliers can therefore be reduced further by time-multiplexing the pre-added input samples during these fourteen clock cycles. With 224 taps, pre-adding leaves 112 products per output sample, and spreading them over fourteen clock cycles requires eight multipliers per channel, or 32 for the four channels.

After compilation on a Xilinx Virtex-2 6000 chip, the device utilization is as follows:

Logic Utilization:
  Number of Slice Flip Flops:        14,806 out of 67,584   21%
  Number of 4 input LUTs:             7,459 out of 67,584   11%
Logic Distribution:
  Number of occupied Slices:          8,439 out of 33,792   24%
  Total Number 4 input LUTs:          7,564 out of 67,584   11%
    Number used as logic:             7,459
    Number used as a route-thru:        105
  Number of bonded IOBs:                 71 out of    684   10%
    IOB Flip Flops:                      69
  Number of MULT18X18s:                  32 out of    144   22%
  Number of GCLKs:                        1 out of     16    6%
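
The two optimizations in this section, pre-adding and time-multiplexing, can be checked numerically. The following Python sketch is an illustration only, not the Verilog or Simulink implementation; the function names and the indexing of the input buffer are assumptions. It computes one decimated output sample with and without pre-adding and derives the multiplier count.

```python
N_TAPS = 224   # filter length
DECIM = 14     # decimation factor


def direct_output(x, h, m):
    """m-th decimated output using one multiplication per tap."""
    start = m * DECIM
    return sum(h[k] * x[start + N_TAPS - 1 - k] for k in range(N_TAPS))


def folded_output(x, h, m):
    """Same output, but samples sharing a coefficient are added first.

    Relies on h[k] == h[N_TAPS - 1 - k], so only half of the
    coefficients are multiplied.
    """
    start = m * DECIM
    return sum(
        h[k] * (x[start + N_TAPS - 1 - k] + x[start + k])
        for k in range(N_TAPS // 2)
    )


# Multiplier budget: 112 products per output, spread over 14 clocks.
products_per_output = N_TAPS // 2
multipliers_per_channel = products_per_output // DECIM
multipliers_four_channels = 4 * multipliers_per_channel

# Deterministic integer test data so the comparison is exact.
h_half = [(k * k) % 17 - 8 for k in range(N_TAPS // 2)]
h = h_half + h_half[::-1]
x = [(i * 37) % 101 - 50 for i in range(N_TAPS + 3 * DECIM)]

for m in range(4):
    assert direct_output(x, h, m) == folded_output(x, h, m)
print(multipliers_per_channel, multipliers_four_channels)
```

The total of 32 multipliers agrees with the MULT18X18 count in the utilization report.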

FIR Modules: The four-channel FIR filter is composed of many smaller modules. Details for some of these modules are provided below. Modules that are not mentioned should be straightforward to understand. Verilog implementations as well as Simulink models can be found at http://ssl.berkeley.edu/galfa.

*The Simulink and Verilog implementations share the same functionality, but may differ in implementation.

fir.v (four channel filter)
The top-level four-channel filter contains the coefficient ROMs and four single-channel filters. Since the single-channel filters share the same coefficients, storing the coefficients in the top-level module reduces memory usage.

fir_1c.v (single channel filter)
The single-channel filter module contains eight filter sections, which together form the 224-tap delay line needed for the input. The single-channel filter generates all the control signals for these sections using a counter. It also contains a three-level adder tree, which combines the eight section outputs, and rounding logic to produce the final output sample.
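
As an illustration of the last stage of the single-channel filter, the Python sketch below sums eight partial results in a pairwise adder tree and then rounds the total. The rounding scheme shown (add half of the dropped weight, then shift) and the `drop_bits` parameter are assumptions; the rounding logic and bit widths actually used in fir_1c.v are not described here.

```python
def adder_tree(values):
    """Sum a power-of-two number of values pairwise.

    Returns the total and the number of tree levels used.
    """
    level = list(values)
    levels = 0
    while len(level) > 1:
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
        levels += 1
    return level[0], levels


def round_off(acc, drop_bits):
    """Drop `drop_bits` low bits, rounding half toward +infinity (assumed)."""
    return (acc + (1 << (drop_bits - 1))) >> drop_bits


# Eight hypothetical section outputs.
section_sums = [1, 2, 3, 4, 5, 6, 7, 8]
total, levels = adder_tree(section_sums)
print(total, levels)          # eight inputs need three levels
print(round_off(total, 2))    # 36 / 4 = 9 exactly
```

Eight inputs reduce to four, then two, then one, which is why three adder levels are sufficient.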


filter_slice.v (filter section)
The filter sections sample the input at twenty-eight time instances. Fourteen of the input samples are from the first half of the delay line, while the remaining fourteen are from the latter half. These input samples are chosen so that each pair of inputs shares a single coefficient. The pairs of input samples can then be pre-added. However, a parallel-to-serial converter is required to align the pairs properly before pre-adding. The fourteen pre-added input samples are multiplied by their coefficients and accumulated. A model of the filter slice is shown in the figure below.
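
The behaviour of one filter section can be sketched in Python as a multiply-accumulate loop that performs one multiplication per clock over fourteen clocks. The mapping of section number to delay-line positions and the helper names are assumptions chosen for illustration; the actual tap assignment in filter_slice.v may differ.

```python
TAPS = 224          # length of the delay line
PER_SECTION = 14    # products computed by one section per output
SECTIONS = 8        # sections in a single-channel filter


def split_coefficients(h):
    """Group the first 112 coefficients into eight sets of fourteen."""
    return [h[s * PER_SECTION:(s + 1) * PER_SECTION] for s in range(SECTIONS)]


def filter_section(delay_line, section_coeffs, s):
    """Accumulate fourteen pre-added products for section s."""
    acc = 0
    for clk in range(PER_SECTION):
        k = s * PER_SECTION + clk
        early = delay_line[k]               # sample from the first half
        late = delay_line[TAPS - 1 - k]     # its partner in the latter half
        acc += section_coeffs[clk] * (early + late)   # one multiply per clock
    return acc


# Deterministic integer data: symmetric coefficients and a full delay line.
h_half = [(3 * k) % 11 - 5 for k in range(TAPS // 2)]
h = h_half + h_half[::-1]
line = [(i * 29) % 53 - 26 for i in range(TAPS)]

roms = split_coefficients(h)
total = sum(filter_section(line, roms[s], s) for s in range(SECTIONS))
reference = sum(h[k] * line[k] for k in range(TAPS))
print(total == reference)
```

Summing the eight section results reproduces the full 224-tap dot product, which is the quantity the adder tree in the single-channel filter combines.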

coef_rom.v (coefficient ROMs)
The coefficient ROMs are implemented in distributed RAM. Because the filter is symmetric, each coefficient is shared by a pair of input samples, so the 224 18-bit coefficients reduce to 112 distinct values. These are divided into groups of fourteen and stored in eight coefficient ROMs. The coefficients are grouped so that they correspond to the fourteen pre-added input samples that are calculated in a filter section. The eight ROMs correspond to the eight filter sections that make up a single-channel filter.

vw_delay.v (vector warning delay)
The GALFA spectrometer board uses a vector warning system to synchronize all the modules. The vector warning system requires that a vector warning flag go high one clock cycle before valid data. Each module is responsible for delaying the vector warning flag by its own latency. It would be inefficient to implement a delay of over two hundred clock cycles using flip-flops, so a down counter is used instead. The FIR filter has the added complexity of having to hold the vector warning pulse for fourteen clocks, since the modules that follow operate at the decimated clock frequency. This is handled by enabling an output register once every fourteen clock cycles.
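
The vector warning delay can be modelled behaviourally in a few lines of Python. This is a rough sketch under stated assumptions, not a transcription of vw_delay.v: the function name, the counter load value, the phase of the output-register enable, and the handling of a single pulse at a time are all hypothetical.

```python
def simulate_vw_delay(vw_in, latency, decim=14):
    """Delay a one-clock flag with a down counter, then stretch it.

    vw_in   : list of 0/1 flag values, one per input clock
    latency : number of clocks to count down before releasing the flag
    decim   : the output register is enabled once every `decim` clocks
    """
    counter = -1     # -1 means the counter is idle
    pending = 0      # flag waiting for the next register enable
    out_reg = 0      # output register, held between enables
    out = []
    for t, flag in enumerate(vw_in):
        out.append(out_reg)              # value visible during this clock
        if t % decim == decim - 1:       # output register enable
            out_reg = pending
            pending = 0
        if counter == 0:                 # countdown finished
            pending = 1
            counter = -1
        elif counter > 0:
            counter -= 1
        if flag:                         # incoming flag loads the counter
            counter = latency - 1
    return out


vw = [1] + [0] * 299
out = simulate_vw_delay(vw, latency=224)
first = out.index(1)
print(first, sum(out))
```

In this model a single counter replaces a shift register of more than two hundred flip-flops, and the output flag stays high for exactly fourteen input clocks, which is one period of the decimated clock. In a real design the counter load value would have to be tuned to the actual pipeline latency.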