

# A 16-channel Digital TDC Chip with internal Buffering and Selective Readout

P. Bailly, J. Chauveau, J.F. Genat, J.F. Huppert, H. Lebbolo, L. Roos  
 LPNHE, Universités Paris 6 et 7, T33 RC, 4 place Jussieu, 75252 Paris, France  
 Zhang Bo  
 University of Alberta, Edmonton, Canada

October 9, 1998

## Abstract

A 16-channel digital TDC chip has been built for the DIRC Cerenkov counter of the BaBar experiment at the SLAC B-factory (Stanford, USA). The binning is 0.5 ns and the full scale is 32 microseconds. The data-driven architecture integrates channel buffering and selective readout of data falling within a programmable time window. The linearity is better than 100 ps rms on 90% of the production parts.

## 1 Introduction

The Front-End electronics of the Detector of Internally Reflected Cerenkov light (DIRC) for the BaBar experiment is presented. Its aim is to measure, to better than 1 ns, the arrival time of Cerenkov photoelectrons detected in an array of 11,000 phototubes, together with their amplitude spectra. It mainly comprises 64-channel DIRC Front-End Boards (DFB), each equipped with eight full-custom Analog Chips performing zero-cross discrimination with a 2 mV threshold and pulse shaping, four full-custom Digital TDC chips for timing measurements with 500 ps binning, and a readout logic selecting hits in the trigger window; and DIRC Crate Controller cards (DCC) serializing the data collected from up to 16 DFBs onto a 1.2 Gb/s optical link. Extensive test results of the pre-production chips are presented, as well as the system test results.

## 2 Context

### 2.1 The DIRC in the BABAR experiment

The Detector of Internally Reflected Cerenkov light (DIRC) sub-system of the BaBar Detector [1] is intended to provide particle identification, in particular to separate $\pi$ and K mesons to better than three sigma for momenta between 0.15 and 4 GeV/c. Cerenkov photons are internally reflected in 144 quartz bars towards 10,752 photomultiplier tubes (PMT). A water tank is used as leverage between the quartz bar ends and the photomultipliers. The Cerenkov angles are deduced from the pattern created on the PMT wall. The momentum is measured in the BaBar Drift Chamber. The noise in the DIRC due to the PMTs themselves is estimated at 1 kHz, and the PEP II machine noise at 30 kHz, including a safety factor of ten. For an average $b\bar{b}$ event, 5.7 primary tracks hit the DIRC and each produce 33 photoelectrons, to which must be added six extra hits from background generated by these tracks. An equivalent number of background photons, of the order of 200, is generated by secondary interactions between the event tracks and the detector (mostly Compton scattering in the quartz). Therefore, photomultipliers with a 1 ns time resolution have been chosen, which the Front-End electronics should not degrade significantly.

The Level 1 (L1) trigger is built from Drift Chamber, Calorimeter, and Muon Detector primitives. Its latency is 12 $\mu$s, with an uncertainty of one microsecond. Raw data are stored locally in the TDC chips during the L1 latency. A local selection is furthermore performed in the chip, so that only data in time with the one-microsecond trigger window are sent to the DAQ system after reception of the L1 trigger.

#### 2.1.1 The photodetectors and their signal

The photodetectors are fast PMTs with single photo-electron resolution. Typical timing resolution is 0.7 ns. The typical gain is $1.25 \cdot 10^7$ using high voltages between 900 and 1400 V, with a single photo-electron peak to noise ratio of 2.1. Using a threshold of 10% of the single photoelectron response (2 mV on 50 Ohms), the rate is found to be less than 100 Hz.

#### 2.1.2 Trigger and Data acquisition

The noise in the DIRC due to the PMTs themselves is estimated at 1 kHz, and the PEP II machine noise at 30 kHz, including a safety factor of ten. For an average $b\bar{b}$ event, 5.7 primary tracks hit the DIRC and each produce 33 photoelectrons, to which must be added six extra hits from background generated by these tracks. An equivalent number of background photons, of the order of 200, is generated by secondary interactions between the event tracks and the detector (mostly Compton scattering in the quartz). The Level 1 (L1) trigger is built from Drift Chamber, Calorimeter, and Muon Detector primitives. Its latency is 12 $\mu$s, with an uncertainty of one microsecond. Raw data are stored locally in the TDC chips during the L1 trigger latency. A local selection is furthermore performed in the chip, so that only data falling within a one-microsecond window around the trigger are sent to the Data Acquisition System. This data-driven scheme avoids buffering a huge amount of data in the Readout modules: the dataflow from the Front-end to the Readout modules is reduced by a factor of ten, according to the current estimates of the background noise.

### 2.2 The TDC requirements

The DIRC TDC chip integrates the following functions and features:

- 0.5 ns binning with 250 ps rms precision.
- 32  $\mu$ s full-scale.
- 32 ns double-hit resolution.
- 59.5 MHz reference clock.
- Simultaneous Read and Write operations.
- Maximum rates:
  - Single channel rates = 10 times nominal background: 100 kHz
  - Maximum random rate on 16 channels simultaneously: 10 kHz

- Selection of data within a programmable time window available at any time for readout.
  - latency from 64 ns to 16  $\mu$ s (eight bits).
  - window size from 64 ns to 2  $\mu$ s (five bits).
- Bit pattern flagging the overloads during the trigger window.
- Channel disabling of noisy channels, between 500 kHz and 8 MHz.
- Power should be under 200 mW at 100 kHz average input rate.
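As a quick consistency check of the figures in this list, the following sketch recomputes the register ranges and the word width implied by the full scale. It assumes that the latency and window registers count in steps of 64 ns and that an n-bit register spans 1 to 2^n steps; that mapping, and all names in the code, are illustrative assumptions rather than part of the specification.

```python
import math

# Nominal figures taken from the requirements list.
BIN_NS = 0.5             # fine binning
FULL_SCALE_NS = 32_000   # 32 microseconds
STEP_NS = 64             # assumed register step (hypothetical mapping)


def register_range_ns(bits, step_ns=STEP_NS):
    """Return (min, max) in ns, assuming an n-bit code spans 1..2**n steps."""
    return step_ns, step_ns * 2 ** bits


latency_min, latency_max = register_range_ns(8)  # eight-bit latency register
window_min, window_max = register_range_ns(5)    # five-bit window register

# Number of bits needed to cover the full scale at 0.5 ns per bin.
time_word_bits = math.ceil(math.log2(FULL_SCALE_NS / BIN_NS))

print(latency_min, latency_max)  # 64 ns to 16384 ns, i.e. about 16 microseconds
print(window_min, window_max)    # 64 ns to 2048 ns, i.e. about 2 microseconds
print(time_word_bits)            # 16-bit time word
```

Under these assumptions the quoted ranges (64 ns to 16 $\mu$s, 64 ns to 2 $\mu$s) come out as powers of two of the 64 ns step.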

### 2.3 The TDC within the DIRC Front-end Electronics

The DIRC digital TDC chip is a building block of the DIRC Front-end electronics. It receives 16 outputs from two 8-channel zero-crossing discriminators timed with the single photo-electron responses of the DIRC detector PMTs. On any Level 1 trigger (L1) occurrence, the digitized time data associated with this trigger are transferred to a Multi-Event Buffer (MEB), where they stay until a readout request (Readout Strobe) originating from the central control and timing system occurs.

## 3 TDC Implementation

A block diagram of the chip is shown in Figure 1. Within a 59.5 MHz input clock period, a fine time measurement on 5 bits is achieved using voltage-controlled digital delay lines. These are synchronized on the clock period using a calibration channel that generates a reference voltage, to compensate for delay variations due to temperature, supply voltage, and process. This calibration is fully transparent to the TDC operations.

A synchronous counter covers the 11 higher order bits.
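The resulting 16-bit time word (11 coarse bits above 5 fine bits) can be illustrated with a minimal sketch. The function names are hypothetical, and the conversion to nanoseconds assumes that the fine code simply counts 1/32 of a clock period upwards from the coarse count; the sign convention of the fine code relative to the latching clock edge is not modelled here.

```python
CLOCK_MHZ = 59.5
COARSE_BITS = 11
FINE_BITS = 5

CLOCK_PERIOD_NS = 1e3 / CLOCK_MHZ                 # about 16.8 ns
FINE_BIN_NS = CLOCK_PERIOD_NS / 2 ** FINE_BITS    # about 0.525 ns


def pack_time_word(coarse, fine):
    """Concatenate the coarse count (high bits) and the fine code (low bits)."""
    if not 0 <= coarse < 2 ** COARSE_BITS:
        raise ValueError("coarse count out of range")
    if not 0 <= fine < 2 ** FINE_BITS:
        raise ValueError("fine code out of range")
    return (coarse << FINE_BITS) | fine


def unpack_time_word(word):
    """Split a 16-bit time word back into (coarse, fine)."""
    return word >> FINE_BITS, word & (2 ** FINE_BITS - 1)


def word_to_ns(word):
    """Convert a time word to nanoseconds (assumed linear mapping)."""
    return word * FINE_BIN_NS


word = pack_time_word(3, 7)
print(word, unpack_time_word(word), round(word_to_ns(word), 3))
```

With a 59.5 MHz clock the fine bin is slightly above the nominal 0.5 ns, which is why the sketch derives it from the clock period instead of hard-coding it.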

In order to allow data-driven operations and asynchronous readout occurring at any trigger time during BaBar runs, sixteen dual-port FIFOs allow data to be written from the TDC section while earlier data are being read.

Three levels of buffering in FIFO memories allow data to be sorted in time with an incoming trigger and made available for readout. This feature reduces by a factor of 10 the amount of data to be read from the DIRC detector. Each FIFO overload or channel disabling during a trigger window is reported at the end of each data block as a sixteen-bit pattern.

Figure 1. DIRC Digital TDC Chip Block Diagram.

A maximum average channel input rate of 600 kHz is accepted, provided there are fewer than 4 input hits on the same channel within a window of one microsecond, and fewer than 96 inputs on any of the 16 channels within one trigger latency. The input double-pulse resolution is 32 ns.

The chip is manufactured by ES2, using a 0.8 micron CMOS process.

### 3.1 Time measurement

#### 3.1.1 The time measuring channel

The TDC section integrates one 60 MHz counter, 16 digital delay lines with 32 taps of 500 ps delay each, and a calibration channel. The calibration channel is a delay line identical to those of the measuring channels. It is locked on the clock using two analog controls stored on the gate capacitors of nMOS transistors, which set the delays of the 32 identical stages and a time offset.

These analog controls are common to all channels, assuming a good process uniformity within the chip. The uniformity has been measured on previous TDC chip designs using the same technology [2] [3].

An incoming signal latches the counter state in an 11-bit register. It is also propagated through the delay line. The next positive clock edge latches the state of the delay line in a 32-bit register, the result being binary encoded to five bits. Extra cells on each side of the delay line allow the total delay and offset to be locked on the clock period.
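The fine measurement described above can be sketched as a thermometer code followed by a binary encoder. The model below is a simplification under stated assumptions: ideal 500 ps taps, bit `i` of the 32-bit register set once the hit edge has reached tap `i`, and tap 0 taken as the line input. The extra cells and the offset are ignored, and the function names are hypothetical.

```python
TAP_NS = 0.5   # nominal tap delay
N_TAPS = 32


def latch_delay_line(dt_ns, tap_ns=TAP_NS, n_taps=N_TAPS):
    """Snapshot of the delay line dt_ns after the hit entered it.

    Bit i is set if the edge has reached tap i (ideal, identical taps).
    """
    if dt_ns < 0:
        raise ValueError("the hit must precede the latching clock edge")
    reached = min(int(dt_ns // tap_ns) + 1, n_taps)
    return (1 << reached) - 1


def encode_thermometer(snapshot):
    """Binary-encode a thermometer pattern as the index of its highest set bit."""
    if snapshot == 0:
        raise ValueError("empty snapshot: no edge in the delay line")
    return snapshot.bit_length() - 1


snapshot = latch_delay_line(3.2)   # hit arrived 3.2 ns before the clock edge
print(f"{snapshot:032b}", encode_thermometer(snapshot))
```

In this toy model a hit arriving 3.2 ns before the latching edge yields the five-bit code 6, that is, 6 complete taps of 500 ps.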

The delay lines, the synchronization mechanism between the clock and the time inputs, the 60 MHz counter, and the charge pump used for phase feedback have been implemented as full-custom designs, simulated with SPICE before and after layout.

#### 3.1.2 Phase locking

A state machine sequences the calibration process, sending clock pulses as inputs and tuning the channel to give back zero and full-scale digitizations alternately. This process is inherently convergent, and no loss of lock has been observed. Therefore, it is not monitored. Calibration is internally activated at a 1 MHz frequency.
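The alternating zero and full-scale calibration can be pictured as a bang-bang feedback loop acting on two controls. The sketch below is a behavioural toy model, not the circuit: the two analog controls are represented directly as an offset and a tap delay in nanoseconds, the digitization is abstracted to a sign comparison, and the pump step `pump_ns` and the starting values are arbitrary assumptions.

```python
CLOCK_PERIOD_NS = 1e3 / 59.5
N_TAPS = 32


def calibrate(offset_ns, tap_ns, steps=10_000, pump_ns=0.001):
    """Alternate zero and full-scale corrections until both are locked.

    Even steps nudge the offset towards zero, odd steps nudge the tap
    delay so that offset + 32 taps matches one clock period.
    """
    for i in range(steps):
        if i % 2 == 0:
            # Zero-scale pulse: pump the offset towards zero.
            offset_ns += -pump_ns if offset_ns > 0.0 else pump_ns
        else:
            # Full-scale pulse: compare the total line delay with the period.
            total = offset_ns + N_TAPS * tap_ns
            step = pump_ns / N_TAPS
            tap_ns += -step if total > CLOCK_PERIOD_NS else step
    return offset_ns, tap_ns


offset, tap = calibrate(offset_ns=0.3, tap_ns=0.45)
print(round(offset, 3), round(tap, 4), round(offset + N_TAPS * tap, 3))
```

Each correction always moves towards the target, so the loop settles into a small dither around the lock point, which is the sense in which such a scheme is convergent.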

### 3.2 Layout

The layout has been done using a symbolic editor, drawing the transistors from a stick representation and using the silicon manufacturer's design rules. The resulting layout is compact: the sixteen time channels, the 60 MHz counter, and the channel FIFOs fit within half the chip area.

The latency and output FIFOs have been provided by the manufacturer. The associated counters and logic are merged into the standard cells generated by the Synopsys compiler from Verilog hardware descriptions. A post-layout simulation verified the design with the safety margins recommended by the manufacturer.

### 3.3 Selective Readout

#### 3.3.1 Overview

A block diagram of the selective readout is shown in Fig. 3. The TDC section is sensitive to any positive edge applied to the inputs. Each datum is stored for one $\mu$s at most in a four-deep channel FIFO. There is one FIFO for each channel. They are emptied by a continuous read process at 30 MHz that extracts the oldest datum among the sixteen channel FIFO outputs (each output being the oldest datum of its FIFO) and transfers it to a 32-deep latency FIFO (FIFO1), shared by all channels, where it stays until the minimum L1 latency (actually the latency minus half the resolution).

Figure 2. DIRC Digital TDC Chip Layout.

Figure 3. Selective Readout Architecture.

It is then transferred to a 32-deep FIFO where it stays until the maximum trigger latency (latency plus half the resolution). During that stage, an L1 accept is followed by a readout command that empties this FIFO and outputs a data packet whose header is the L1 accept time (on 11 bits), followed by the time words ordered by 32 ns slices, and terminated by a trailer flagging input FIFO overloads or channels that have been self-disabled due to input overload. This information has been buffered in a dedicated FIFO with the associated time, and processed in the same way as the time data during the selective readout process.
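The packet structure just described (header, time words ordered by 32 ns slices, trailer) can be sketched as follows. The tuple representation, the function name `build_packet`, and the use of nominal 0.5 ns bins (so that a 32 ns slice spans 64 bins) are assumptions made for illustration; the actual word formats are not specified here.

```python
from typing import List, Tuple

SLICE_BINS = 64  # 32 ns slice at the nominal 0.5 ns binning (assumption)


def build_packet(l1_time: int,
                 hits: List[Tuple[int, int]],
                 flags: int) -> List[tuple]:
    """Assemble one data packet.

    l1_time : 11-bit L1 accept time used as header
    hits    : (channel, time_word) pairs selected for this trigger
    flags   : 16-bit pattern, one bit per overloaded or disabled channel
    """
    if not 0 <= l1_time < 2 ** 11:
        raise ValueError("L1 accept time must fit in 11 bits")
    if not 0 <= flags < 2 ** 16:
        raise ValueError("flag pattern must fit in 16 bits")
    # sorted() is stable: hits in the same slice keep their arrival order.
    ordered = sorted(hits, key=lambda hit: hit[1] // SLICE_BINS)
    packet = [("header", l1_time)]
    packet += [("hit", channel, word) for channel, word in ordered]
    packet.append(("trailer", flags))
    return packet


for item in build_packet(5, [(1, 200), (0, 70), (2, 10)], 0x8001):
    print(item)
```

Only the slice index is used as the sort key, mirroring the statement that the time words are ordered by slices and not by their full time value.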

The readout process is sequenced at 30 MHz, and can be completed within the time before another L1 accept comes (1.5 $\mu$s). While data are being read out, the selective readout process filling the output FIFO keeps working. There is no associated deadtime.

#### 3.3.2 Fast sort

The fast sort algorithm is based on slicing in time windows. Within each of these time windows, data are compared using the binary tree shown in Figure 44. From two adjacent channels, two bits from the time words (bits 10 and 9) belonging to a window of 256 ns are input to eight comparators, the oldest data being compared again two by two, until the last. There are 128 time windows. The tree therefore uses fifteen two-bit comparators (8 + 4 + 2 + 1 for sixteen channels). The width of 269 ns is fixed by the response time of the comparator tree and the maximum input occupancy. It allows the use of the smallest and fastest comparators, with an 18 ns response time. The tree returns the address of the oldest time data for transfer to the latency FIFO.
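The comparator tree can be illustrated by a small tournament over the sixteen channel FIFO outputs. This is a simplified sketch: it compares whole integer time values instead of two-bit fields within a time window, ignores counter wrap-around, and represents an empty FIFO by `None`. The function names are hypothetical.

```python
def older(times, a, b):
    """Return the index (a or b) holding the older datum; empty slots lose."""
    if times[a] is None:
        return b
    if times[b] is None:
        return a
    return a if times[a] <= times[b] else b


def oldest_channel(times):
    """Pairwise elimination over the channels.

    Returns (channel index or None if all are empty, comparators used).
    The number of channels is assumed to be a power of two.
    """
    contenders = list(range(len(times)))
    comparators = 0
    while len(contenders) > 1:
        next_round = []
        for a, b in zip(contenders[0::2], contenders[1::2]):
            comparators += 1
            next_round.append(older(times, a, b))
        contenders = next_round
    winner = contenders[0]
    return (winner if times[winner] is not None else None), comparators


fifo_heads = [None] * 16
fifo_heads[3], fifo_heads[9], fifo_heads[12] = 5, 2, 7
print(oldest_channel(fifo_heads))
```

For sixteen inputs the four rounds use 8, 4, 2 and 1 comparisons, which matches the fifteen comparators quoted in the text.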

#### 3.3.3 Latency and Output FIFO management

This section sends data from the Latency FIFO to the Output FIFO if they are in time with an arriving trigger (Figure 4). The current time is first subtracted from the time data. If the result is greater than or equal to the sum of the latency and half the resolution, the time data is sent to the Output FIFO after the trigger latency minus half the trigger resolution, provided the Output FIFO is not full. If no trigger has come after the latency plus half the resolution, the data is lost. The writing rate at that stage can be up to one word per 32 ns.
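The selection criterion can be sketched as an age test on each time word. The sketch assumes 16-bit time words in nominal 0.5 ns bins, a modulo-65536 subtraction to handle counter wrap-around, and a window centred on the programmed latency. The function name and these conventions are illustrative assumptions, not the chip's actual arithmetic.

```python
MODULUS = 1 << 16  # 16-bit time words (assumption)


def in_trigger_window(hit_time, trigger_arrival, latency, window):
    """True if the hit is in time with a trigger arriving at trigger_arrival.

    All quantities are in TDC bins. The hit age is the wrapped difference
    between the trigger arrival time and the hit time.
    """
    age = (trigger_arrival - hit_time) % MODULUS
    return latency - window // 2 <= age <= latency + window // 2


LATENCY = 24_000  # 12 microseconds in 0.5 ns bins
WINDOW = 2_000    # 1 microsecond in 0.5 ns bins

print(in_trigger_window(1_000, 25_000, LATENCY, WINDOW))   # age equals the latency
print(in_trigger_window(1_000, 27_000, LATENCY, WINDOW))   # hit too old
print(in_trigger_window(65_000, 23_464, LATENCY, WINDOW))  # wrap-around case
```

Hits older than the latency plus half the window can no longer match any trigger, which is the condition under which the text says the data is lost.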



Figure 4. Latency FIFO to Output FIFO transfer.

### 3.4 Expected performance

These three levels of buffering have been implemented in order to cope with the required single-channel rate of ten times the nominal background of 10 kHz per channel, and the simultaneous maximum random rate of 10 kHz on 16 channels. The transfer from each TDC section to each channel FIFO is immediate. With the same assumption for the transfer from the channel FIFO to the latency FIFO, the dead time can be estimated for a latency FIFO depth of 32 (Figure 5). FIFO depths of 4 and 32 have been implemented for the channel and the latency FIFO respectively, according to the simulation results, leading to a dead time of less than 0.1%, as required, for input rates of 100 kHz.
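As a rough order-of-magnitude check of the channel FIFO depth, and not a substitute for the simulation of Figure 5, one can assume Poisson-distributed hits and ask how often more than four hits fall on one channel within the one microsecond a datum may spend in the channel FIFO. The Poisson assumption and the function name are illustrative.

```python
import math


def poisson_tail(mean, k_max, terms=50):
    """P(N > k_max) for a Poisson variable, summed directly over the tail."""
    return sum(
        math.exp(-mean) * mean ** k / math.factorial(k)
        for k in range(k_max + 1, k_max + 1 + terms)
    )


RATE_HZ = 100e3       # required single-channel rate
RESIDENCE_S = 1e-6    # maximum time a datum stays in the channel FIFO
CHANNEL_FIFO_DEPTH = 4

mean_hits = RATE_HZ * RESIDENCE_S  # 0.1 hit per microsecond
p_overflow = poisson_tail(mean_hits, CHANNEL_FIFO_DEPTH)
print(f"{p_overflow:.2e}")
```

Under these assumptions the overflow probability of a four-deep channel FIFO is several orders of magnitude below the 0.1% requirement, so the shared latency FIFO is the more demanding stage.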

## 4 Performance tests

### 4.1 Test bench

Test benches have been set up at each stage of the chip development. The first versions were used for checking the linearity and overall functionality of the prototypes. The last test bench aimed to fully test the production parts within a few minutes. Tests were also performed on the Front-end cards of the DIRC with actual PMT signals as inputs. The production test bench is shown in Figure 6. It makes use of a Pentium PC, a LeCroy precision pulser 9210, and a dedicated printed circuit board housing a hardwired sequencer, digital interfaces to the PC, and the chip socket.



Figure 5. Deadtime for a 32-deep latency FIFO.



Figure 6. Production test bench.



Figure 7. Channel histogram with selective readout activated for channel 0.



Figure 8. Typical delay line differential linearity.

### 4.2 Selective readout

The selective readout process has been checked by sending an input synchronous with the trigger window while a set of PMTs generated random noise. A peak at a fixed position within the trigger window is found.

### 4.3 Linearity

The histogram of the bin widths has been built by sending random inputs. A typical result is shown in Figure 8. On channels 14 and 15, the last bin has been found to be up to two times too wide for a few chips. This is understood as a process-dependent layout effect introduced by parasitic components not taken into account in the simulation, since the TDC sections have been replicated from channel to channel.
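Measuring bin widths with random inputs is commonly known as a code density test: with hits uniformly distributed in time, the population of each fine code is proportional to the width of its bin. A minimal sketch follows, assuming a 32-bin histogram spanning one clock period; the function names and the example counts are hypothetical.

```python
import math


def bin_widths_ns(counts, period_ns):
    """Estimate each bin width from its share of uniformly distributed hits."""
    total = sum(counts)
    if total == 0:
        raise ValueError("empty histogram")
    return [c / total * period_ns for c in counts]


def dnl_rms_ns(widths):
    """RMS deviation of the bin widths from their mean."""
    mean = sum(widths) / len(widths)
    return math.sqrt(sum((w - mean) ** 2 for w in widths) / len(widths))


# Illustrative histogram: 31 equal bins and a last bin twice as populated.
counts = [100] * 31 + [200]
widths = bin_widths_ns(counts, period_ns=16.0)
print(round(widths[0], 4), round(widths[-1], 4), round(dnl_rms_ns(widths), 4))
```

In this made-up example the doubled population of the last bin translates directly into a bin twice as wide as the others, the kind of signature described above for channels 14 and 15.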

### 4.4 Monotonicity

A coarse counter slipping by one clock tick was observed on a few channels (3 out of 896) when all channels are fired together at the same time. For the reason detailed above, the delay lines of some chips needed to be calibrated on 31 bins instead of 32. The number of faulty channels is then reduced to 0.3%. However, in both cases, the overall differential linearity remains less than 100 ps rms.

Figure 9. Simulated and measured deadtime for random and synchronous events.

### 4.5 Dead time

A good agreement is found between simulated and measured deadtime, both for random events on the sixteen channels, and synchronous events (Figure 9).

### 4.6 Production for the DIRC

#### 4.6.1 Acceptance criteria

All the chips have been successfully tested by the manufacturer using an 8k test-vector file. Out of 1250 chips, 38 were not working properly due to faults in the analog sections, tested using an extra 10k test vectors not accepted by the manufacturer's standard flow. As a further selection criterion, it was decided to reject chips with a bin 31 larger than the others by 20%, or chips with bin 0 and/or bin 1 smaller than the average by more than 60%, after selecting the best calibration scheme. 805 chips satisfied these criteria.

### 4.7 Test of the TDC chips on the DIRC Front-End Board

56 chips from the preproduction set were mounted on 14 DFBs in order to read out one full DIRC sector (896 PMTs). This test bench has been operational at CEA Saclay since October 1997 and is used to test the 12 DIRC sectors after the PMTs have been mounted and fully cabled. The PMTs were illuminated by one LED (LEDTRONIC BP280CW1K), and both ADC and TDC spectra were recorded. In total, 805 chips satisfied the selection criteria.

#### 4.7.1 Statistics

The distribution of the delay line differential linearity peaked at $35\text{ ps}$ when the chips were calibrated on 32 bins. As for the preproduction chips, the tail towards high values was mainly due to channels 14 and 15 (i.e. channels far from the calibration line).

The average differential linearity on all chips and all channels is $73\text{ ps}$. This corresponds to a time resolution of about $196\text{ ps}$ including the binning error.

## 5 Conclusion

This digital TDC chip is a building block of the Front-End electronics for the Detector of Internally Reflected Cerenkov light of the BaBar experiment at SLAC (Stanford, USA). Twelve hundred parts have now been fabricated and tested with a very good yield. Resolution and input rate capabilities have been measured to be within the initial requirements. The readout of data within a programmable time window reduces by a factor of ten the amount of information to be read for the DIRC detector at BaBar.

## References

- [1] SLAC-PUB-7428 and LBNL-40099 March 1997, *Nuclear Instruments and Methods*, Vol A397, pp 261-271, 1997.
- [2] J.F. Genat. "High Resolution Time to Digital Converters", *Nuclear Instruments and Methods*, Vol A315, pp 411-414, 1992.
- [3] P. Bailly, J. Chauveau, J.F. Genat, H. Lebbolo, B. Zhang. "A 500 picosecond resolution TDC for the DIRC at BaBar", *Proceedings of the International Conference on Electronics for High Energy and Nuclear Physics*, LeCroy, New-York, May 1997.

- [4] B. Zhang. "Développement d'un Circuit intégré VLSI assurant Mesure de Temps et Lecture sélective dans l'Electronique Frontale du Détecteur DIRC de l'expérience BaBar à SLAC". Thèse de Doctorat. Université Pierre et Marie Curie. Paris, France. Sept. 1997.