Various implementation approaches have been researched to meet stringent application goals such as low power, high throughput, and low area, but the ultimate goal for many researchers is a compact hardware footprint for the S-box circuit. In this paper, we present our version of a minimized S-box with two separate proposals and improvements in the overall gate count. The compact S-box is paired with a compact processor architecture specifically tailored for AES, namely, the compact instruction set architecture (CISA). After various proposal submissions failed to meet the rigorous design requirements, a cipher candidate developed at IBM was deemed suitable, and the NSA worked closely with IBM to strengthen that algorithm. From then on, DES became the established model for data encryption and influenced the advancement of modern cryptography for many years.
|Published (Last):||13 December 2007|
Since cryptographic solutions are often used to provide integrity and security for sensitive data transmitted over our communication media, it is important that their cryptographic strength remains consistent and does not decay over time. However, the strength of the encryption rests on the key itself, so it can be broken given enough computing power to search the finite key space.
Over time, advances in computing technology have dramatically improved processing power and have rendered the earlier DES, with its small 56-bit key, no longer safe. Today's computers are far more powerful than those available when DES was proposed.
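The key-space argument above can be made concrete with a short calculation. The following is a minimal sketch; the search rate of 10^12 keys per second is a purely hypothetical figure chosen for illustration, and the function name is our own.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def exhaustive_search_years(key_bits, keys_per_second):
    """Worst-case time, in years, to try every key of the given length."""
    return (2 ** key_bits) / keys_per_second / SECONDS_PER_YEAR


# Hypothetical attacker rate (an assumption, not a measured figure).
RATE = 1e12

des_years = exhaustive_search_years(56, RATE)    # DES key length
aes_years = exhaustive_search_years(128, RATE)   # shortest AES key length
print(f"56-bit key:  {des_years:.6f} years")
print(f"128-bit key: {aes_years:.3e} years")
```

Each additional key bit doubles the worst-case search time, which is why a fixed key length weakens as computing power grows.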
This was soon addressed by replacing DES with triple-DES, which was in turn outpaced by relentless advances in computing. The successor standard, AES, has been fully documented and made available in [1].
Before going into the details of the AES algorithmic structure, we first discuss the progress made by other researchers on improving and implementing AES.
When developing a design for an application, the outcome is often driven and shaped by the requirements of the application environment. High throughput is the most important requirement in a high-speed communication or optical link environment. On the other hand, the work in [6–9] targets the lowest power consumption. In some resource-constrained environments such as the wireless sensor network (WSN), the lifetime of a node is very limited and power is scarce, which makes power consumption a vital concern.
Most designs involve tradeoffs: a high-throughput circuit sacrifices design area, low-power techniques result in low throughput, and a circuit that achieves both high throughput and low power consumption costs an extremely large circuit area.
All of these depend heavily on the requirements of the intended application. In AES, the most resource-consuming section, and the bottleneck, is the S-box. This is because the S-box is essentially a combination of an affine transformation and a multiplicative inversion in the finite field GF(2^8), which requires complex computations. The inversion in GF(2^8) is complex in practice and is therefore identified as a design bottleneck.
The paper in [10] mentions that non-LUT-based approaches are fairly attractive since they have breakable delay. The author explains that there are two types of S-box designs. Type 1 is a direct circuit generated from the truth table, using the sum of products (SOP) or product of sums (POS) form, and it usually features higher throughput at the expense of an extremely large circuit area.
On the other hand, type 2 designs feature higher area efficiency. Type 2 designs are slowly gaining popularity since the design trend has shifted towards searching for efficient logic minimization and circuit depth reduction techniques. The construction of good combinational circuits is important because it affects almost every metric in digital circuit design. Gate count, critical-path delay, clocking and timing, circuit jitter, and power consumption are all considered when a circuit is designed.
In this work, we focus on low-gate-count hardware designs for resource-constrained environments such as the wireless sensor network (WSN), radio frequency identification (RFID), and even the newly developed wireless identification and sensing platform (WISP). In this paper, we discuss the development of our proposed solution in three areas: a review of the current S-box design trend towards the low-gate-count approach, a small and compact footprint for AES designs, and lastly a complete system with an AES block for a selective encryption architecture (SEA).
The structure of the paper is as follows. Section 1 is the introduction, which presents some of the key algorithms adapted in our work and other related work for benchmarking and comparison. Section 2 reviews different design approaches for low-gate-count S-boxes. Section 3 presents our version of a small S-box design with two approaches: an additional instruction set and circuit minimization.
Section 4 introduces our proposed instruction set computer architecture, namely, the compact instruction set architecture (CISA), which adopts our proposed small S-box design. Section 5 discusses a higher-level implementation that incorporates the set partitioning in hierarchical trees (SPIHT) compression algorithm as a source to the CISA running AES, forming a complete selective encryption architecture.
Section 6 presents the results and discussion, and lastly, Section 7 is the conclusion.
AES is a symmetric block cipher with a 128-bit block length that supports key lengths of 128, 192, and 256 bits with 10, 12, or 14 iterations of the AES transformation, respectively.
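The relation between key length and round count can be written down directly. The sketch below uses the parameter names Nb, Nk, and Nr from the AES standard; the function name is our own, and the rule Nr = Nk + 6 simply reproduces the three standard cases.

```python
def aes_parameters(key_bits):
    """Return (Nb, Nk, Nr) for a standard AES key length.

    Nb: number of 32-bit columns in the state (always 4 for AES)
    Nk: number of 32-bit words in the key
    Nr: number of rounds
    """
    if key_bits not in (128, 192, 256):
        raise ValueError("AES supports only 128-, 192-, and 256-bit keys")
    nb = 128 // 32
    nk = key_bits // 32
    nr = nk + 6
    return nb, nk, nr


for bits in (128, 192, 256):
    print(bits, aes_parameters(bits))
```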
The encryption and decryption operations repeat a substitution-permutation network (SPN) operation on the input data. The cipher is applied to a two-dimensional 4 by 4 state array. The state consists of four rows of bytes, each containing Nb bytes, where Nb is the block length divided by 32. AES can be configured in several modes of operation, which serve different purposes. One such mode generates keystream blocks from an initialization vector and XORs them with the respective plaintext blocks to obtain the complete ciphertext.
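As an illustration of the state layout, the sketch below maps a 16-byte block onto the 4 by 4 state array and back. It assumes the column-major ordering of the AES standard (input byte r + 4c goes to row r, column c); the function names are our own.

```python
def bytes_to_state(block):
    """Arrange a 16-byte block as a 4x4 state, filled column by column."""
    if len(block) != 16:
        raise ValueError("an AES block is exactly 16 bytes")
    return [[block[row + 4 * col] for col in range(4)] for row in range(4)]


def state_to_bytes(state):
    """Flatten a 4x4 state back into a 16-byte block (column by column)."""
    return bytes(state[row][col] for col in range(4) for row in range(4))


state = bytes_to_state(bytes(range(16)))
for row in state:
    print(row)
```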
Fundamentally, AES has four basic steps in each round of encryption: SubBytes, ShiftRows, MixColumns, and AddRoundKey. In brief, the initial round consists of AddRoundKey only, the subsequent nine rounds include all four transformations, and the tenth round omits MixColumns. Note that this applies only to forward encryption. In the decryption rounds, AddRoundKey remains unchanged and the remaining transformations are replaced by their mathematical inverses, namely, InvSubBytes, InvShiftRows, and InvMixColumns.
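The round structure just described can be summarized as a schedule of step names. This is only a sketch of the ordering for forward encryption, not an implementation of the transformations; the function name is our own.

```python
def encryption_round_schedule(nr=10):
    """List the transformations applied in each stage of forward encryption.

    nr is the number of rounds (10 for a 128-bit key).
    """
    schedule = [["AddRoundKey"]]  # initial key addition
    for _ in range(nr - 1):       # full rounds
        schedule.append(["SubBytes", "ShiftRows", "MixColumns", "AddRoundKey"])
    # The final round omits MixColumns.
    schedule.append(["SubBytes", "ShiftRows", "AddRoundKey"])
    return schedule


for index, steps in enumerate(encryption_round_schedule()):
    print(index, " -> ".join(steps))
```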
Figure 1 shows a block diagram of how AES works. Apart from SubBytes, the other three transformations are modulo-2 bitwise calculations, which are easy to implement.
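To illustrate why the transformations other than SubBytes reduce to shifts and XORs, here is a minimal sketch of MixColumns on a single column. It assumes the standard AES reduction polynomial x^8 + x^4 + x^3 + x + 1 (0x11B); the helper names are our own.

```python
def xtime(a):
    """Multiply a byte by x (that is, by 2) in GF(2^8) modulo 0x11B."""
    a <<= 1
    if a & 0x100:
        a ^= 0x11B
    return a & 0xFF


def mix_single_column(col):
    """Apply the MixColumns matrix (2 3 1 1 / 1 2 3 1 / ...) to one column."""
    a0, a1, a2, a3 = col
    # Multiplication by 3 is xtime(a) XOR a.
    return [
        xtime(a0) ^ (xtime(a1) ^ a1) ^ a2 ^ a3,
        a0 ^ xtime(a1) ^ (xtime(a2) ^ a2) ^ a3,
        a0 ^ a1 ^ xtime(a2) ^ (xtime(a3) ^ a3),
        (xtime(a0) ^ a0) ^ a1 ^ a2 ^ xtime(a3),
    ]


print([hex(b) for b in mix_single_column([0xDB, 0x13, 0x53, 0x45])])
```

Only one-bit shifts, a conditional XOR with a constant, and byte-wide XORs are needed, which maps naturally onto a small number of gates.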
Early versions of the S-box circuit are essentially an 8-bit-input, 8-bit-output look-up table (LUT) and can be found in the following proposals: [12, 13]. An illustration of the LUT is shown in Table 1. For hardware implementations of AES, however, the look-up table approach has one drawback. Each copy of the table requires 256 bytes of storage, along with the circuitry to address the table and fetch the results.
The most straightforward way is to store all these values within a memory block. The problem arises because a fully unrolled AES requires 10 rounds of SubBytes and, in effect, each byte of data requires an independent S-box. In the end, the 160 S-boxes (16 state bytes in each of 10 rounds) would drain all the available memory.
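The storage cost of the unrolled LUT approach follows from simple arithmetic. The sketch below assumes one independent 256-byte table per state byte per round and ignores any tables needed for key expansion; the function name and defaults are our own.

```python
def unrolled_sbox_lut_cost(rounds=10, state_bytes=16, table_bytes=256):
    """Return (number of S-box copies, total LUT storage in bytes)."""
    copies = rounds * state_bytes
    return copies, copies * table_bytes


copies, total = unrolled_sbox_lut_cost()
print(f"{copies} S-box copies, {total} bytes ({total // 1024} KiB) of table storage")
```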
Note that this is the worst-case implementation approach and does not consider pipelining. Even with a pipelined architecture, the read and write cycles would slow down the design. Although the mathematical complexity of the multiplicative inversion and affine transformation is hidden by predefining the LUT values, so that each access is merely a read or a write, the LUT approach has irreducible read-write delays and is therefore not suitable for high-speed applications.
On the other hand, some authors suggest that a combinational circuit can be derived using subfield arithmetic. Daemen and Rijmen [11, 15] suggested that using subfield arithmetic in the crucial step of computing an inverse in the Galois field of 256 elements, by reducing an 8-bit input to subcalculations on 4-bit variables, may yield a very small S-box circuit.
In [16], the S-box used is derived from the multiplicative inverse over the Galois field GF(2^8). To avoid attacks based on simple algebraic properties, the S-box is constructed by combining the inverse function with an invertible affine transformation (a matching inverse affine transformation is included in the decryption).
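The construction described above, a multiplicative inverse in GF(2^8) followed by an affine transformation, can be computed directly in software. The following is a minimal reference sketch, not a hardware design; it assumes the standard AES polynomial 0x11B and affine constant 0x63, and it computes the inverse naively as a^254.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
    return result


def gf_inv(a):
    """Multiplicative inverse in GF(2^8); 0 maps to 0 by convention."""
    if a == 0:
        return 0
    result = 1
    for _ in range(254):  # a^254 == a^-1 because a^255 == 1
        result = gf_mul(result, a)
    return result


def affine(x):
    """AES affine transformation over GF(2) with constant 0x63."""
    out = 0
    for i in range(8):
        bit = (x >> i) ^ (x >> ((i + 4) % 8)) ^ (x >> ((i + 5) % 8))
        bit ^= (x >> ((i + 6) % 8)) ^ (x >> ((i + 7) % 8)) ^ (0x63 >> i)
        out |= (bit & 1) << i
    return out


def sbox(x):
    """Forward AES S-box: inversion followed by the affine map."""
    return affine(gf_inv(x))


print(hex(sbox(0x00)), hex(sbox(0x01)), hex(sbox(0x53)))
```

The inversion step is where the hardware cost concentrates, which is why the compact designs discussed in this paper replace it with subfield arithmetic rather than repeated multiplication.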
Not only is the S-box used in the main AES iterations, but it is also shared with the key expansion operation [17]. The key expansion algorithm reuses the forward S-boxes in both encryption and decryption.
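To show where the forward S-box enters the key expansion, here is a minimal software sketch for a 128-bit key. It follows the standard AES key schedule; the S-box table is generated on the fly by a brute-force inverse search purely to keep the example self-contained, and all function names are our own.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo 0x11B."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
    return result


def rotl8(x, n):
    """Rotate a byte left by n bits."""
    return ((x << n) | (x >> (8 - n))) & 0xFF


def build_sbox():
    """Build the forward S-box table: inverse, then affine transformation."""
    table = []
    for x in range(256):
        inv = 0
        if x:
            inv = next(y for y in range(1, 256) if gf_mul(x, y) == 1)
        s = inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^ rotl8(inv, 3) ^ rotl8(inv, 4)
        table.append(s ^ 0x63)
    return table


def expand_key_128(key):
    """Expand a 16-byte key into 44 four-byte words (11 round keys)."""
    if len(key) != 16:
        raise ValueError("this sketch handles 128-bit keys only")
    sbox = build_sbox()
    words = [list(key[4 * i:4 * i + 4]) for i in range(4)]
    rcon = 0x01
    for i in range(4, 44):
        temp = list(words[i - 1])
        if i % 4 == 0:
            temp = temp[1:] + temp[:1]        # RotWord
            temp = [sbox[b] for b in temp]    # SubWord: the forward S-box
            temp[0] ^= rcon                   # round constant
            rcon = gf_mul(rcon, 0x02)
        words.append([a ^ b for a, b in zip(words[i - 4], temp)])
    return words


key = bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c")
round_words = expand_key_128(key)
print(bytes(round_words[4]).hex())
```

Because SubWord uses only the forward S-box, the same expander can serve both encryption and decryption; a decryption data path needs the inverse S-box only for the state, not for the round keys.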
Note also that during AES decryption, the same key expander uses the same forward S-box to generate the round keys. Later on, Satoh et al. From the hardware implementation point of view, the search for the multiplicative inverse in GF(2^8) is too complex and resource exhaustive. Because the S-box is derived from the multiplicative inverse over GF(2^8), it exhibits good nonlinearity but may have high hardware complexity. Moreover, in a resource-constrained design environment, a small S-box has a greater impact, since a sufficiently small implementation allows unrolled or parallel designs for higher throughput.
Recently, the design trend has shifted to further minimizing and optimizing the S-box circuit [19]. In this section, we review only the standard hardware proposals and implementations of the AES S-box, without taking into account the various proposed variants or tweaks of the AES S-box.
We focus only on the original version of the S-box and its minimization techniques; implementation methodologies and design approaches are surveyed and taken into account. In practice, circuit designs are built using numerous heuristics, which can lead to exponential time complexity and can therefore be applied only to small circuits. The heuristic approach works well on circuit functions that can be broken down into subfunctions, for example, matrix multiplication, which decomposes into smaller submatrix multiplications.
The initial work of Boyar and Peralta [20] proposes a new logic minimization technique, which can be applied to arbitrary combinational logic problems and even to circuits that have already been optimized by standard methodologies. The authors describe their technique as a two-step process: nonlinear gate reduction followed by linear gate reduction. The result is by far the smallest S-box combinational circuit that they have produced.
Since there are multiple representations of Galois fields, there are multiple versions of efficient circuits. The first step consists of identifying the nonlinear components and reducing the number of AND gates. The authors choose to focus on reducing only the GF(2^4) circuit, since this is where the reduction is most beneficial. The second step focuses on minimizing the linear components with their newly proposed heuristics.
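To give a feel for linear gate reduction, the sketch below applies a simple greedy pair-sharing heuristic to a set of XOR sums. This is not the heuristic proposed in [20]; it is a much simpler common-subexpression approach, swapped in here only to illustrate how sharing XOR terms reduces the gate count. The example matrix and all names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def greedy_xor_sharing(rows, n_inputs):
    """Share the most frequent pair of signals until no pair repeats.

    rows: list of collections of signal indices; each row is one XOR sum.
    Returns (definitions, reduced_rows), where definitions maps each new
    signal index to the pair of earlier signals it XORs together.
    """
    rows = [set(r) for r in rows]
    definitions = {}
    next_index = n_inputs
    while True:
        pair_counts = Counter()
        for row in rows:
            pair_counts.update(combinations(sorted(row), 2))
        if not pair_counts:
            break
        (a, b), count = pair_counts.most_common(1)[0]
        if count < 2:
            break  # no pair is shared by two rows; nothing left to gain
        definitions[next_index] = (a, b)
        for row in rows:
            if a in row and b in row:
                row.discard(a)
                row.discard(b)
                row.add(next_index)
        next_index += 1
    return definitions, rows


def xor_count(definitions, rows):
    """Count two-input XOR gates: shared terms plus remaining sums."""
    return len(definitions) + sum(len(r) - 1 for r in rows if r)


def evaluate(definitions, rows, inputs):
    """Evaluate the shared circuit on a list of input bits."""
    signals = list(inputs)
    for index in sorted(definitions):
        a, b = definitions[index]
        signals.append(signals[a] ^ signals[b])
    outputs = []
    for row in rows:
        value = 0
        for s in row:
            value ^= signals[s]
        outputs.append(value)
    return outputs


# Hypothetical 3-output linear map over 4 input bits.
example = [[0, 1, 2], [0, 1, 3], [0, 1, 2, 3]]
defs, reduced = greedy_xor_sharing(example, 4)
print("naive XORs:", xor_count({}, [set(r) for r in example]))
print("shared XORs:", xor_count(defs, reduced))
```

A greedy choice of this kind is fast but not guaranteed to be optimal, which is one motivation for the more refined heuristics surveyed in this section.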
Hence, the authors present two matrices for linear minimization: an initial linear expansion matrix and a final linear contraction matrix. These matrices were defined to contain as much of the circuit as possible while still maintaining linearity. Thus, the authors explain, the portion of the circuit they define overlaps with the GF(2^8) inversion. So, the true purpose of the second step is to minimize the circuits that compute these two matrices. The two matrices, both reproduced from [20], are shown in (1) and (2).
The technique of Boyar and Peralta yields a circuit for the AES S-box composed of three primary parts: the top-linear transformation, the middle nonlinear block, and the bottom-linear transformation [20]. The top-linear transformation, the result of the minimized linear expansion matrix, uses a total of 23 XOR gates at depth 7 and has 8 inputs and 22 outputs. Lastly, the bottom-linear block converts the 18 outputs of the middle nonlinear block into the 8-bit output using 26 XOR and 4 XNOR gates.
Together, these three blocks form the final S-box circuit. Note that in [20] the authors present only the forward version of the S-box, together with its total gate count.
Figure 2 shows a block diagram of the S-box proposed in [20]. The authors later presented an improved version of this work in [21].
This time, Boyar applies a greedy heuristic for linear minimization together with several depth reduction techniques. The largest circuit components are the top- and bottom-linear circuits. As explained previously, the top- and bottom-linear components contain more than just the linear operations in the definition of the complete AES S-box.
A Very Compact AES-SPIHT Selective Encryption Computer Architecture Design with Improved S-Box
Compact and high-speed hardware architectures and logic optimization methods for the AES algorithm Rijndael are described. Encryption and decryption data paths are combined and all arithmetic components are reused. By introducing a new composite field, the S-Box structure is also optimized. An extremely small size of 5. It requires only 0. By making effective use of the SPN parallel feature, the throughput can be boosted up to 2.