

Available online at www.sciencedirect.com



**The Journal of China Universities of Posts and Telecommunications**

 August 2017, 24(4): 69–75 www.sciencedirect.com/science/journal/10058885 http://jcupt.bupt.edu.cn

# Area-efficient analog decoder design for low density parity check codes in deep-space applications

Zhao Zhe  $(\boxtimes)$ , Gao Fei, Zheng Hao, Yin Xue

School of Information and Electronics, Beijing Institute of Technology, Beijing 100081, China

#### **Abstract**

An area-efficient design methodology is proposed for analog decoding implementations of the rate-1/2 accumulate repeat-4 jagged-accumulate (AR4JA) low density parity check (LDPC) code. The proposed approach combines an optimized decoding architecture with a regularized routing network, so that the overall wiring overhead is minimized and the silicon area utilization is significantly improved. The prototype chip used to verify the approach is fully integrated in a four-metal double-poly 0.35 µm complementary metal oxide semiconductor (CMOS) technology, and includes an input-output interface that maximizes the decoder throughput. The decoding core area is  $2.02 \text{ mm}^2$  with a post-layout area utilization of 80%. The decoder was successfully tested at the maximum data rate of 10 Mbit/s, with a core power consumption of 6.78 mW at 3.3 V, which corresponds to an energy per decoded bit of 0.677 nJ. With its low processing power and high reliability, the proposed analog LDPC decoder is suitable for space- and power-constrained spacecraft systems.

**Keywords** low density parity check (LDPC) code, analog decoding, iterative message-passing algorithms, hardware efficient, area utilization

## **1 Introduction**

Forward error correction (FEC) codes, which are employed to correct transmission errors, have long been an important component of space communication [1–2]. In deep-space missions, the use of FEC codes is perhaps the single most cost-effective means of improving system performance, compared with larger power amplifiers and bigger antennas. LDPC codes have recently gained acceptance in the aerospace community [3–4] because of their capacity-approaching performance and the ease of parallel implementation in hardware. A family of AR4JA LDPC codes [5] has been incorporated into the Consultative Committee for Space Data Systems (CCSDS) standards, and has been in use on several current missions [6]. When designing LDPC decoders for deep-space applications, power efficiency is one of the paramount concerns. The digital approach currently follows a field programmable gate array (FPGA)-based architecture with a precision analog-to-digital converter (ADC) [7–8]. However, as data storage capacity and processing speed increase with the rapid pace of integration, digital decoders become more and more expensive in terms of hardware complexity and power consumption [9–10].

Received date: 02-06-2017

Corresponding author: Zhao Zhe, E-mail: 3120130329@bit.edu.cn

DOI: 10.1016/S1005-8885(17)60225-5

The major motivation for using analog decoders is the promise of low power dissipation and fast processing speed [11–12]. Compared with their digital counterparts, analog microchips can perform the complex operations in iterative decoding algorithms more efficiently, with less hardware and lower power consumption. Without the quantization process, an analog implementation can provide a finer estimate of the logic state of a single information bit. By means of a highly modular design, it is more immune to non-ideal effects such as transistor mismatch and noise. Owing to its fully parallel, asynchronous, continuous-time processing, it also improves the total system efficiency. Because of the probabilistic computing paradigm, it offers more efficient error resilience against single event transients (SET). As a consequence, the analog message-passing methodology is suitable for space- and power-constrained spacecraft systems.

Over the past few years, several analog decoding chips have been reported [13–17], claiming an outstanding improvement in power efficiency with respect to their digital counterparts. However, these implementations are limited to proof-of-concept decoders with very short block lengths, which cannot provide enough coding gain for practical applications. Bringing analog decoding approaches based on the fully parallel architecture into practice introduces various challenges. The major challenge is that, because of the customized hand-crafted design, straightforward implementations suffer from extremely poor area utilization and a considerable speed penalty. The work in Ref. [15] devotes roughly 40% of the chip area to routing. Seen from another perspective, that implementation revealed routing congestion, rather than gate count, as the bottleneck in implementing analog decoding circuits.

In this work, a design approach is proposed for the analog decoder of the rate-1/2 AR4JA LDPC code, which is suitable for deep-space applications. In this approach, the combination of a scalable decoding architecture and a well-thought-out routing strategy minimizes the hardware complexity and maximizes the area efficiency. A prototype chip used to verify the approach is fabricated in a 0.35 µm standard CMOS process, and the measurement results show that the decoder achieves a throughput of 10 Mbit/s at a power consumption of 6.768 mW.

The paper is organized as follows. Sect. 2 describes some preliminaries, including the AR4JA LDPC codes, the basics of the iterative message-passing algorithm and the basics of the analog sum-product module. Details on the proposed decoding architecture design and routing strategy are discussed in Sect. 3. The experimental setup and measurements obtained from fabricated decoder prototypes are shown in Sect. 4. Finally, Sect. 5 concludes this paper.

## **2 Preliminaries**

### 2.1 AR4JA LDPC code

In this paper, we consider the analog implementation of the rate-1/2 LDPC code defined by the CCSDS standard. The selected code belongs to a family of AR4JA LDPC codes that exhibit very low error floors. Moreover, the AR4JA design also ensures that the code's minimum distance grows linearly with the block size.

The AR4JA codes are structured LDPC codes built by making copies of a protograph and permuting the connecting edges. A protograph is a Tanner graph with a relatively small number of nodes. The protograph of the rate-1/2 AR4JA LDPC code is shown in Fig. 1, where the filled circles represent the variable nodes, the squares with a cross represent the check nodes, and the open circle represents variable nodes corresponding to symbols that are punctured, i.e., are not transmitted over the channel.



**Fig. 1** Protograph for rate-1/2 AR4JA LDPC code
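The copy-and-permute construction described above can be sketched in a few lines of Python. This is a minimal illustration rather than the construction used by the standard: the base matrix `BASE` lists the number of parallel edges between each check node (row) and variable node (column), with values read off the block structure of Eq. (1). The number of copies and the random permutations are assumptions chosen only for the example.

```python
import random

# Protograph as a base matrix: rows = check nodes, columns = variable nodes,
# entry = number of parallel edges (read off the block structure of Eq. (1)).
BASE = [
    [0, 0, 1, 0, 2],
    [1, 1, 0, 1, 3],
    [1, 2, 0, 2, 1],
]

def lift(base, copies, seed=0):
    """Copy the protograph `copies` times and permute each edge bundle.

    The permutations here are random placeholders (an assumption); the
    standard prescribes specific permutations instead.
    """
    rng = random.Random(seed)
    edges = []
    for c, row in enumerate(base):
        for v, mult in enumerate(row):
            for _ in range(mult):
                perm = list(range(copies))
                rng.shuffle(perm)
                # Copy i of check c connects to copy perm[i] of variable v.
                for i in range(copies):
                    edges.append((c * copies + i, v * copies + perm[i]))
    return edges

def degrees(edges, side, count):
    """Count edges per node; side 0 = check nodes, side 1 = variable nodes."""
    deg = [0] * count
    for edge in edges:
        deg[edge[side]] += 1
    return deg

COPIES = 4  # hypothetical lifting size, kept small for readability
edges = lift(BASE, COPIES)
check_deg = degrees(edges, 0, 3 * COPIES)
var_deg = degrees(edges, 1, 5 * COPIES)
```

Because every edge bundle is permuted by a bijection, each lifted node keeps the degree of its protograph node, which is why the lifted code inherits the degree profile of the protograph.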

Similarly, in terms of the parity-check matrix, the AR4JA codes are designed by lifting a protograph parity-check matrix into a larger parity-check matrix consisting of circulants. The parity-check matrix *H* for the rate-1/2 code is constructed from *M*×*M* sub-matrices and is specified as follows:

$$
H = \begin{bmatrix} \mathbf{0}_M & \mathbf{0}_M & I_M & \mathbf{0}_M & I_M \oplus \mathbf{\Pi}_1 \\ I_M & I_M & \mathbf{0}_M & I_M & \mathbf{\Pi}_2 \oplus \mathbf{\Pi}_3 \oplus \mathbf{\Pi}_4 \\ I_M & \mathbf{\Pi}_5 \oplus \mathbf{\Pi}_6 & \mathbf{0}_M & \mathbf{\Pi}_7 \oplus \mathbf{\Pi}_8 & I_M \end{bmatrix} \tag{1}
$$

where  $I_M$  and  $\mathbf{0}_M$  are the  $M \times M$  identity and zero matrices, respectively, and  $\mathbf{\Pi}_1$  through  $\mathbf{\Pi}_8$  are permutation matrices.
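As a rough sketch of how the block structure of Eq. (1) translates into a full matrix, the Python snippet below assembles $H$ from $M \times M$ blocks and checks its dimensions and weights. The permutations used here are plain cyclic shifts, which is purely an assumption to keep the example self-contained; the actual $\mathbf{\Pi}_1$ through $\mathbf{\Pi}_8$ are those specified by the CCSDS standard. The value of `M` and all helper names are likewise hypothetical.

```python
def zeros(m):
    """M x M zero matrix."""
    return [[0] * m for _ in range(m)]

def shift_perm(m, s):
    """M x M permutation matrix with a 1 at (i, (i + s) mod m); s = 0 gives I_M."""
    return [[1 if j == (i + s) % m else 0 for j in range(m)] for i in range(m)]

def xor_sum(*mats):
    """Element-wise modulo-2 sum of matrices (the circled plus in Eq. (1))."""
    m = len(mats[0])
    out = zeros(m)
    for a in mats:
        for i in range(m):
            for j in range(m):
                out[i][j] ^= a[i][j]
    return out

def assemble(blocks):
    """Stack a 2-D list of M x M blocks into one flat matrix."""
    rows = []
    for block_row in blocks:
        m = len(block_row[0])
        for i in range(m):
            rows.append([x for blk in block_row for x in blk[i]])
    return rows

def build_h(m):
    """Assemble H following the block layout of Eq. (1)."""
    I, Z = shift_perm(m, 0), zeros(m)
    # PLACEHOLDER permutations: cyclic shifts by k, NOT the standard's Pi_k.
    P = {k: shift_perm(m, k) for k in range(1, 9)}
    return assemble([
        [Z, Z, I, Z, xor_sum(I, P[1])],
        [I, I, Z, I, xor_sum(P[2], P[3], P[4])],
        [I, xor_sum(P[5], P[6]), Z, xor_sum(P[7], P[8]), I],
    ])

M = 16  # hypothetical sub-matrix size for the example
H = build_h(M)
row_weights = [sum(r) for r in H]
col_weights = [sum(r[j] for r in H) for j in range(5 * M)]
```

With these placeholder shifts the resulting matrix has $3M$ rows and $5M$ columns, and the row and column weights follow directly from the number of terms in each block of Eq. (1).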

### 2.2 Message-passing schedules for decoding

The iterative message-passing algorithm, which offers near-optimum decoding performance at a manageable complexity, is the most widely used decoding method for large linear LDPC codes. The LDPC decoding procedure operates on a graphical representation of the code dependencies, on which the sum-product algorithm is executed. A simplified illustration of the decoding procedure is shown in Fig. 2, in which soft messages in the form of log-likelihood ratios (LLR) are passed between check nodes and variable nodes. In the first step, variable node  $v_i$  is initialized with the prior message  $L(v_i)$  using the noisy channel information of the transmitted bit  $y_i$ :
