The present disclosure includes methods and devices for parallel encryption/decryption. In one or more embodiments, an encryption/decryption device includes an input logic circuit, an output logic circuit, and a number of encryption/decryption circuits arranged in parallel between the input logic circuit and the output logic circuit. For example, each encryption/decryption circuit can be capable of processing data at an encryption/decryption rate, and the number of encryption/decryption circuits can be equal to or greater than an interface throughput rate divided by the encryption/decryption rate.
|
1. An encryption/decryption device, comprising
an input logic circuit;
an output logic circuit; and
a number of encryption/decryption circuits arranged in parallel between the input logic circuit and the output logic circuit, each encryption/decryption circuit being capable of processing data at a respective encryption/decryption rate,
wherein initialization vectors are combined with a first number of groups parsed from an input data stream to each parallel encryption/decryption circuit, the initialization vectors are incremented for a first parallel encryption/decryption circuit, and the incremented initialization vectors are used as initialization vectors for a second parallel encryption/decryption circuit, and
wherein the number of encryption/decryption circuits is equal to or greater than an interface throughput rate divided by the encryption/decryption rate, and the input logic circuit operates to parse an input data stream into a number of groups, and distribute the number of groups to at least some of the number of encryption/decryption circuits according to a distribution order.
33. An encryption/decryption method, comprising
parsing an input data stream into a number of groups, the input data stream having an interface uppermost throughput rate; and
distributing the number of groups in a round robin sequence among a number of parallel encryption/decryption circuits operating in a cipher block chaining mode, a plurality of the groups being distributed per each selection of a particular encryption/decryption circuit in the round robin sequence;
processing one group of the plurality of groups at a time through one of the number of parallel encryption/decryption circuits at an encryption/decryption rate;
combining initialization vectors with a first number of groups to each parallel encryption/decryption circuit;
incrementing the initialization vectors for a first parallel encryption/decryption circuit; and
using the incremented initialization vectors as initialization vectors for a second parallel encryption/decryption circuit,
wherein the number of parallel encryption/decryption circuits is at least the uppermost throughput rate divided by the encryption/decryption rate.
22. An encryption/decryption method, comprising
parsing an input data stream into a number of groups, the input data stream having an interface uppermost throughput rate; and
distributing the number of groups in a round robin sequence among a number of parallel encryption/decryption circuits operating in an electronic codebook mode, one group being distributed per each selection of a particular encryption/decryption circuit in the round robin sequence; and
processing a particular group at a time through one of the number of parallel encryption/decryption circuits at an encryption/decryption rate,
wherein the number of parallel encryption/decryption circuits is at least the uppermost throughput rate divided by the encryption/decryption rate, and
wherein initialization vectors are combined with a first number of groups parsed from an input data stream to each parallel encryption/decryption circuit, the initialization vectors are incremented for a first parallel encryption/decryption circuit, and the incremented initialization vectors are used as initialization vectors for a second parallel encryption/decryption circuit.
14. A solid state memory system, comprising
at least one memory device; and
a controller communicatively coupled to the at least one memory device, and having an encryption/decryption device, the encryption/decryption device including:
an input multiplexer;
an output multiplexer; and
a number of encryption/decryption circuits arranged in parallel between the input multiplexer and the output multiplexer, each encryption/decryption circuit being capable of processing data at a respective encryption/decryption rate,
wherein the number of encryption/decryption circuits is equal to or greater than an interface throughput rate divided by the encryption/decryption rate, and
wherein initialization vectors are combined with a first number of groups parsed from an input data stream to each parallel encryption/decryption circuit, the initialization vectors are incremented for a first parallel encryption/decryption circuit, and the incremented initialization vectors are used as initialization vectors for a second parallel encryption/decryption circuit,
wherein the input multiplexer operates to parse an input data stream into a number of groups, and distribute the number of groups to at least some of the number of encryption/decryption circuits according to a distribution order.
21. A memory controller, comprising
a host interface configured to be communicatively coupled to a host through a communication interface having a throughput rate;
a front end direct memory access (DMA) communicatively coupled to the host interface;
a number of back end memory channels communicatively coupled to the front end DMA; and
an encryption/decryption device communicatively coupled between the host interface and the number of back end memory channels, the encryption/decryption device including a number of encryption/decryption circuits arranged in parallel, each encryption/decryption circuit being capable of processing data at an encryption/decryption rate,
wherein the number of parallel encryption/decryption circuits is at least the throughput rate divided by the encryption/decryption rate, and
wherein initialization vectors are combined with a first number of groups parsed from an input data stream to each parallel encryption/decryption circuit, the initialization vectors are incremented for a first parallel encryption/decryption circuit, and the incremented initialization vectors are used as initialization vectors for a second parallel encryption/decryption circuit, and
wherein the encryption/decryption device is configured to parse an input data stream into a number of groups, and distribute the number of groups to at least some of the number of encryption/decryption circuits according to a distribution order.
2. The encryption/decryption device of
3. The encryption/decryption device of
at least one input buffer;
at least one output buffer; and
an encryption/decryption engine coupled between the input buffer and the output buffer,
wherein the encryption/decryption engine is configured to process data a group at a time.
5. The encryption/decryption device of
6. The encryption/decryption device of
7. The encryption/decryption device of
8. The encryption/decryption device of
9. The encryption/decryption device of
10. The encryption/decryption device of
11. The encryption/decryption device of
12. The encryption/decryption device of
13. The encryption/decryption device of
15. The solid state memory system of
the interface throughput rate is P bits per second;
the encryption/decryption rate is an uppermost rate of E bits per second; and
the number of encryption/decryption circuits is equal to or greater than P divided by E.
16. The solid state memory system of
17. The solid state memory system of
18. The solid state memory system of
19. The solid state memory system of
20. The solid state memory system of
23. The encryption/decryption method of
24. The encryption/decryption method of
25. The encryption/decryption method of
26. The encryption/decryption method of
27. The encryption/decryption method of
28. The encryption/decryption method of
29. The encryption/decryption method of
30. The encryption/decryption method of
the number of groups are distributed among N parallel encryption/decryption circuits;
distributing one of the number of groups to an encryption/decryption circuit takes T clock cycles; and
processing one of the number of groups at a time through an encryption/decryption circuit takes N times T clock cycles.
32. The encryption/decryption method of
34. The encryption/decryption method of
35. The encryption/decryption method of
the encryption/decryption circuit implements an Advanced Encryption Standard (AES) algorithm in cipher block chaining mode, and
four groups are distributed per each selection of a particular encryption/decryption circuit in the round robin sequence.
|
The present disclosure relates generally to semiconductor memory devices, methods, and systems, and more particularly, to parallel encryption and decryption.
Memory devices are typically provided as internal, semiconductor, integrated circuits in computers or other electronic devices. There are many different types of memory including volatile and non-volatile memory. Volatile memory can require power to maintain its data and includes random-access memory (RAM), dynamic random access memory (DRAM), and synchronous dynamic random access memory (SDRAM), among others. Non-volatile memory can provide persistent data by retaining stored information when not powered and can include NAND flash memory, NOR flash memory, read only memory (ROM), electrically erasable programmable ROM (EEPROM), erasable programmable ROM (EPROM), and phase change random access memory (PCRAM), among others.
Memory devices can be combined to form a solid state drive (SSD). An SSD can include non-volatile memory, e.g., NAND flash memory and NOR flash memory, and/or can include volatile memory, e.g., DRAM and SRAM, among various other types of non-volatile and volatile memory.
An SSD can be used to replace hard disk drives as the main storage device for a computer, as the SSD can have advantages over hard drives in terms of performance, size, weight, ruggedness, operating temperature range, and power consumption. For example, SSDs can have superior performance when compared to magnetic disk drives due to their lack of moving parts, which may ameliorate seek time, latency, and other electro-mechanical delays associated with magnetic disk drives. SSD manufacturers can use non-volatile flash memory to create flash SSDs that may not use an internal battery supply, thus allowing the drive to be more versatile and compact.
An SSD can include a number of memory devices, e.g., a number of memory chips (as used herein, “a number of” something can refer to one or more such things; for example, a number of memory devices can refer to one or more memory devices). As one of ordinary skill in the art will appreciate, a memory chip can include a number of dies. Each die can include a number of memory arrays and peripheral circuitry thereon. A memory array can include a number of planes, with each plane including a number of physical blocks of memory cells. Each physical block can include a number of pages of memory cells that can store a number of sectors of data.
Memory systems (e.g., a solid state drive) may be coupled to a host computer system by a communication interface (e.g., bus). Serial Advanced Technology Attachment (SATA) is a high speed serial computer bus primarily designed for transfer of data between the host computer system (e.g., motherboard) and mass storage devices, such as hard disk drives, optical drives, and solid state drives. SATA interfaces provide fast data transfer, ability to remove or add devices while operating (hot swapping when the operating system supports it), thinner cables that let air cooling work more efficiently, and reliable operation.
Whether to safeguard information stored in a portable memory system (such as a flash drive), or to protect the confidentiality of information stored in a memory system portion of a computer system (such as in an internal solid state drive), or as a means to secure data processing on an unsecured communications path (such as the Internet), encryption has been used to encode data. Various encryption/decryption algorithms exist. The Advanced Encryption Standard (AES) is a block cipher adopted as an encryption standard by the U.S. government, replacing its predecessor, the Data Encryption Standard (DES). AES is an encryption standard which non-strictly implements the Rijndael algorithm. AES is implemented as a symmetric block cipher with 128 bit data blocks and a key size that can be chosen from 128, 192, or 256 bits. AES may be implemented by software and/or hardware, may be relatively fast (relative to other encryption methodologies), is rather secure, is relatively easy to implement, and requires little memory. As an encryption standard, AES is currently being deployed on a large scale.
An AES engine receives an input (e.g., plaintext), and produces an encrypted output (e.g., ciphertext). There are several possible implementation modes of the AES standard. For example, the algorithm may be employed as an electronic code book (ECB), with no feedback. An implementation of the AES standard may have a high data rate. Several AES designs achieve a high data rate based on pipelined architectures when employing the AES algorithm as an ECB.
However, the AES standard is most often used in one of several feedback modes of operation for added security, including Cipher Block Chaining (CBC), Cipher Feedback (CFB), and Output Feedback (OFB). In these modes, the output of the AES algorithm is fed back to the input. The AES feedback modes of operation can introduce latencies to pipelined data processing.
The present disclosure includes methods and devices for parallel encryption/decryption. In one or more embodiments, an encryption/decryption device includes an input logic circuit, an output logic circuit, and a number of encryption/decryption circuits arranged in parallel between the input logic circuit and the output logic circuit. For example, in some embodiments, each encryption/decryption circuit is capable of processing data at an encryption/decryption rate, and the number of encryption/decryption circuits is equal to or greater than an interface throughput rate divided by the encryption/decryption rate.
The figures herein follow a numbering convention in which the first digit or digits correspond to the drawing figure number and the remaining digits identify an element or component in the drawing. Similar elements or components between different figures may be identified by the use of similar digits. For example, 104 may reference element “04” in
While parallel encryption apparatus and methods of the present disclosure may be described and illustrated as being implemented as part of a memory controller on a solid state drive having a SATA communication interface, the reader will appreciate that such an implementation is only one example implementation of many possible implementations and applications. The apparatus and methods of the present disclosure may be applied to other signal processing applications, including but not limited to, hardware and software implementations, memory storage systems involving magnetic, optical and other media, at various other physical and logical locations within a computing system, and as part of wired or wireless communication systems, among others. Implementations of the present disclosure within a memory system are not limited to a particular memory technology, e.g., flash. The reader will appreciate that although an example implementation is described herein, the apparatus and methods of the present disclosure may be applied to memory systems and devices using any type of memory backend, e.g., not just those utilizing flash memory devices.
Host system 102 can include a processor 105 coupled to a memory and bus control 107. The processor 105 can be a microprocessor, or some other type of controlling circuitry such as an application-specific integrated circuit (ASIC). Other components of the computing system may also have processors. The memory and bus control 107 can have memory and other components directly coupled thereto, for example, dynamic random access memory (DRAM) 111, graphic user interface 113, or other user interface (e.g., display monitor, keyboard, mouse, etc.).
The memory and bus control 107 can also have a peripheral and bus control 109 coupled thereto, which, in turn, can connect to a number of devices, such as a flash drive 115, e.g., using a universal serial bus (USB) interface, a non-volatile memory host control interface (NVMHCI) flash memory 117, and/or SSD 104. As the reader will appreciate, an SSD 104 can be used in addition to, or in lieu of, a hard disk drive (HDD) in a number of different computing systems. The computing system 100 illustrated in
SATA was designed as a successor to the Advanced Technology Attachment (ATA) standard, which is often referred to as Parallel ATA (PATA). First-generation SATA interfaces, also known as SATA/150 or unofficially as SATA 1, have an uppermost throughput rate of about 1.5 gigabits per second (Gb/s), or 150 megabytes per second (MB/s). Subsequently, a 3.0 Gb/s signaling rate was added to the physical layer, effectively doubling the uppermost throughput rate from 150 MB/s to 300 MB/s. The 3.0 Gb/s specification is also known as SATA/300 or unofficially as SATA II or SATA2. SATA/300's transfer rate may satisfy magnetic hard disk drive throughput requirements for some time; however, solid state drives using multiple channels of fast flash may support much higher throughput rates, so even faster SATA standards (e.g., 6 Gb/s) may be implemented in supporting flash solid state drive read speeds.
The communication interface 206 can be used to communicate information between SSD 204 and another device, such as a host system 202. According to one or more embodiments, SSD 204 can be used as a mass data storage memory system in computing system 200. According to one or more embodiments, SSD 204 can be used as an external, and/or portable, memory system for computing system 200 (e.g., with plug-in connectivity). Thus, communication interface 206 can be a USB, PCI, SATA/150, SATA/300, or SATA/600 interface, among others.
The controller 210 can communicate with the solid state memory devices 212-0, . . . , 212-N to read, write, and erase data. The controller 210 can be used to manage the sensing, programming, and erasing of data in the SSD 204. Controller 210 can have circuitry that may be one or more integrated circuits and/or discrete components. For one or more embodiments, the circuitry in controller 210 may include control circuitry for controlling access across a number of channels (e.g., to a number of memory arrays) and/or for providing a translation layer between the external host system 202 and the SSD 204. Thus, the memory controller 210 can selectively communicate through a particular channel (not shown in
The communication protocol between the host system 202 and the SSD 204 may be different than what is required for accessing a memory device, e.g., solid state memory devices 212-0, . . . , 212-N. Memory controller 210 can process host command sequences and associated data, among others, into the appropriate channel command sequences, for example to store data.
According to one or more embodiments of the present disclosure, each solid state memory device 212-0, . . . , 212-N can include a number of memory cells. The solid state memory devices 212-0, . . . , 212-N can be formed using various types of volatile and/or non-volatile memory arrays (e.g., NAND flash, DRAM, among others). Memory devices 212-0, . . . , 212-N can include a number of memory cells that can be arranged to provide particular physical or logical configurations, such as a page, block, plane, die, array, or other group.
Each memory device, e.g., 312-0, . . . , 312-7, can be organized as previously described with respect to memory devices 212-0, . . . , 212-N, and can include one or more arrays of memory cells, e.g., non-volatile memory cells. In one or more embodiments, controller 310 can be a component of an SSD (e.g., controller 210 of SSD 204 shown in
Controller 310 can include a front end portion 344 and a back end portion 346. As shown in
The host FIFO 322 can be communicatively coupled to an encryption device 324 having one or more encryption engines (e.g., encryption engines implementing an AES algorithm). The encryption device 324 may be communicatively coupled to an encryption device buffer 326 (e.g., an AES FIFO). As illustrated in
Furthermore, the encryption device 324 may be arranged and configured to process (e.g., encrypt) the payload to provide at an output 373, through the encryption device buffer 326, to a front end direct memory access (DMA) 316. The encryption device 324 can provide at its output, either an unencrypted payload (e.g., plaintext abbreviated in
The front end DMA 316 can be communicatively coupled to a command dispatcher 318. A controller may have a number of channels (e.g., 0, . . . , N) corresponding to a number of memory devices. The front end DMA 316 can effectively couple the front end 344 circuitry to the back end channels, e.g., back end channel 0 (350-0), . . . , back end channel N (350-N).
Referring now to the back end portion 346 of controller 310, the back end portion 346 can include a number of channels, e.g., 350-0, . . . , 350-N. Each back end channel can include a channel processor and a channel DMA, among other components, each back end channel being communicatively coupled to the front end DMA 316. As shown in
Host interface 314 can be used to communicate information between controller 310, and a host system (e.g., 202 in
Within the AES engine 462B (operating in CBC mode), some portion of the encrypted output 466B may be fed back and combined with input 464B to produce the input 469B to an AES engine 462A (operating in ECB mode). Because a subsequent input group of data to a particular AES engine 462B (operating in CBC mode) is encrypted using the feedback of some portion of encrypted output from a previous group of encrypted data by the particular AES engine 462B (operating in CBC mode), the groups of data input linked by feedback may be referred to as being “chained” together. Groups of data which will be linked together through feedback from one to the next may be referred to as being a chain, e.g., of input data.
The feedback loop for the AES engine 462A (operating in ECB mode) can include control logic, e.g., a switch, multiplexer, etc., to select between the encrypted output 466B (ciphertext) and initialization vectors 463B. According to one or more embodiments, the initialization vectors 463B are used, e.g., selected by switch 465, for combining, e.g., by an XOR function, with a first number of bytes of a chain to a particular AES engine 462B (operating in CBC mode), e.g., 16 bytes, and encrypted output 466B (ciphertext) is fed back and used for combining with a second number of bytes of a chain to a particular AES engine 462B (operating in CBC mode), e.g., the balance of bytes associated with a particular data packet. However, embodiments are not limited to using the initialization vectors 463B for the first 16 bytes, and the initialization vectors 463B may be used for combining with more or fewer bytes.
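The chaining described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the disclosed circuit: toy_block_encrypt and toy_block_decrypt are hypothetical, insecure stand-ins for an ECB-mode AES core, and every function name is introduced here for illustration only. The initialization vector is XORed with the first 16-byte block, and each later block is XORed with the previous ciphertext block.

```python
BLOCK_SIZE = 16  # bytes; AES operates on 128-bit blocks


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def toy_block_encrypt(block: bytes, key: bytes) -> bytes:
    """Hypothetical placeholder for an ECB-mode AES core. NOT secure."""
    rotated = block[1:] + block[:1]  # rotate left by one byte
    return xor_bytes(rotated, key)


def toy_block_decrypt(block: bytes, key: bytes) -> bytes:
    """Inverse of toy_block_encrypt."""
    unmasked = xor_bytes(block, key)
    return unmasked[-1:] + unmasked[:-1]  # rotate right by one byte


def cbc_encrypt(plaintext: bytes, key: bytes, iv: bytes) -> bytes:
    """Encrypt one chain: the IV feeds the first block, ciphertext feeds the rest."""
    if len(plaintext) % BLOCK_SIZE:
        raise ValueError("plaintext must be a multiple of the block size")
    feedback = iv  # the switch initially selects the initialization vector
    out = []
    for i in range(0, len(plaintext), BLOCK_SIZE):
        block = plaintext[i:i + BLOCK_SIZE]
        cipher_block = toy_block_encrypt(xor_bytes(block, feedback), key)
        out.append(cipher_block)
        feedback = cipher_block  # the switch now selects the fed-back ciphertext
    return b"".join(out)


def cbc_decrypt(ciphertext: bytes, key: bytes, iv: bytes) -> bytes:
    """Decrypt one chain using the same key and initialization vector."""
    if len(ciphertext) % BLOCK_SIZE:
        raise ValueError("ciphertext must be a multiple of the block size")
    feedback = iv
    out = []
    for i in range(0, len(ciphertext), BLOCK_SIZE):
        block = ciphertext[i:i + BLOCK_SIZE]
        out.append(xor_bytes(toy_block_decrypt(block, key), feedback))
        feedback = block
    return b"".join(out)
```

Because each block depends on the ciphertext of the block before it, the blocks of a single chain must be processed in sequence, which is why a chain is kept on one engine while separate chains can be processed in parallel.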
Initialization vectors used for encrypting data can be persistent since the same initialization vectors are used for decrypting the data. According to one or more embodiments, initialization vectors associated with encrypting a particular quantity of data may be stored, and retrieved for decrypting the data. According to one or more embodiments of the present disclosure, initialization vectors associated with encrypting a particular quantity of data may be generated for encrypting the data, and rather than being stored, re-generated for decrypting the data, thus saving having to store and protect associated initialization vectors.
According to one or more embodiments of the present disclosure, a hashed version of the logical block address (LBA) sectors is used for the generation of initialization vectors 463B, at the time of encryption, or decryption, of the data. However, if a standard, e.g., known, hashing algorithm is used, one could determine the initialization vectors from a known input, e.g., the LBA, compromising the encryption security. Therefore, according to one or more embodiments of the present disclosure, a confidential one-way hashing scheme can be utilized to protect the encryption security. In this way, even if the input to the hashing algorithm becomes known, e.g., the LBA of the data, generation of the initialization vectors can remain confidential, thus maintaining the integrity of the encryption security. Multiple encryption engines may be used to each generate respective initialization vectors, or one encryption engine may be used to generate initialization vectors for each of multiple encryption engines.
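One way to picture regenerating an initialization vector from an LBA is a keyed one-way hash. The sketch below is an assumption for illustration only: the disclosure does not name a hashing scheme, and HMAC-SHA-256 with a secret key, the 8-byte LBA encoding, and the function name iv_from_lba are all choices made here, not the disclosed method.

```python
import hashlib
import hmac

IV_SIZE = 16  # bytes, matching the 128-bit AES block size


def iv_from_lba(lba: int, secret: bytes) -> bytes:
    """Derive an initialization vector from an LBA with a keyed one-way hash.

    Assumption: HMAC-SHA-256 stands in for the confidential hashing scheme.
    Without the secret, knowing the LBA does not reveal the vector.
    """
    message = lba.to_bytes(8, "big")  # assumed fixed-width LBA encoding
    digest = hmac.new(secret, message, hashlib.sha256).digest()
    return digest[:IV_SIZE]  # truncate the 32-byte digest to one block
```

Because the derivation is deterministic, calling iv_from_lba with the same LBA and secret at decryption time yields the same vector, so the vector itself never has to be stored.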
However, embodiments of the present disclosure are not limited to such an implementation, and other methods for developing the initialization vectors 463B are contemplated. In one or more embodiments having multiple, e.g., parallel, AES engines 462B (operating in CBC mode), 64 byte portions of a sector are chained, so eight such 64-byte portions belonging to a same LBA may be chained together, using the hashed version of the LBA sector for the initialization vectors 463B of the first 64-byte portion, and using the same initialization vectors 463B for the other seven 64-byte portions as well. According to a number of embodiments, an LBA field can be extended by additional bits, e.g., three bits, which are hashed together to generate separate initialization vectors 463B for each 64-byte portion, all derived from the same sector LBA. According to one or more other embodiments, the initialization vectors 463B for the first 64-byte portion may be incremented, e.g., by one, to develop initialization vectors 463B for subsequent portions. Other methods for modifying the initialization vectors 463B from one portion to another are contemplated so that the initialization vectors 463B are variable from one portion to another.
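The incrementing variant described above can be sketched as follows. This is a hedged illustration: treating the initialization vector as a big-endian integer that wraps on overflow is an assumption made here, and portion_ivs is a hypothetical helper name.

```python
from typing import List


def portion_ivs(first_iv: bytes, portions: int = 8, step: int = 1) -> List[bytes]:
    """Return one initialization vector per 64-byte portion of a sector.

    The first portion uses first_iv unchanged; each later portion uses the
    previous vector incremented by step (big-endian, wrapping on overflow).
    """
    size = len(first_iv)
    modulus = 1 << (8 * size)
    base = int.from_bytes(first_iv, "big")
    return [((base + k * step) % modulus).to_bytes(size, "big")
            for k in range(portions)]
```

With eight 64-byte portions per 512-byte sector, the default arguments give eight distinct vectors derived from a single starting value, so each portion can be sent to a different engine while remaining decryptable from the sector LBA alone.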
Although a CBC mode encryption process is illustrated in
One configuration for arranging a number of encryption engines (e.g., AES engines) is in parallel. A first group of incoming streamed data may then be directed to a first encryption engine, a second group of incoming streamed data may be directed to a second encryption engine, a third group of incoming streamed data may be directed to a third encryption engine, . . . , and an Nth group of incoming streamed data may be directed to an Nth encryption engine. The data allocation process may then be repeated as necessary, for example, in a round robin sequence, such that by the time a group of data at an input to a particular encryption engine has finished being transferred to that encryption engine, the engine has completed its previous encryption task and is ready to process another group of data.
For illustration purposes, apparatus and methods of the present disclosure are described in the context of encrypting data; however, one having ordinary skill in the art will appreciate from this disclosure that the apparatus and methods may be applied for the purposes of decrypting previously-encrypted data. Thus, as used herein, the term “encryption/decryption” denotes a general term encompassing encryption and/or decryption. That is, for example, an encryption/decryption device is to be interpreted as a device that may be implemented to achieve encryption, or to achieve decryption, or to achieve both encryption and decryption. Thus, “encrypting/decrypting” data is to be interpreted herein as denoting a general term encompassing encrypting and/or decrypting data. Furthermore, embodiments of the present disclosure may be described using one term, such as encryption, which is not intended to indicate an apparatus or method excludes the converse implementation, e.g., decryption. While reference is made herein to the Advanced Encryption Standard (AES), the reader will appreciate that AES techniques may be utilized to decrypt data, as well as encrypt data.
In addition, while a round robin sequence involving N encryption engines is disclosed with respect to a data distribution pattern, the particular order of distribution is not limiting, and any distribution order that achieves the principles of the present disclosure is contemplated. For example, data may be distributed to a first encryption engine, then to a third encryption engine, and then to a second encryption engine, etc. Data need not be distributed to all available encryption engines if not necessary to accommodate the rate at which data is received by the encryption device. For example, data may be distributed to only 3 of 4 encryption engines in a round robin sequence, if that is sufficient to process the rate of incoming data.
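The parsing and distribution just described can be sketched as follows. The function names, the dictionary of per-engine queues, and the groups_per_turn parameter are assumptions introduced for illustration; the last models sending several consecutive groups to the same engine per selection, as when a chain of groups must stay on one engine.

```python
from typing import Dict, List, Sequence


def parse_groups(stream: bytes, group_size: int) -> List[bytes]:
    """Parse an input data stream into fixed-size groups (last may be short)."""
    return [stream[i:i + group_size] for i in range(0, len(stream), group_size)]


def distribute(groups: Sequence[bytes], order: Sequence[int],
               groups_per_turn: int = 1) -> Dict[int, List[bytes]]:
    """Assign groups to engines by cycling through a distribution order.

    order lists engine indices in the order they are selected, e.g.
    [0, 1, 2, 3] for a plain round robin or [0, 2, 1] to use only three
    engines in a different sequence.
    """
    queues: Dict[int, List[bytes]] = {engine: [] for engine in order}
    for index, group in enumerate(groups):
        turn = index // groups_per_turn  # which selection this group belongs to
        queues[order[turn % len(order)]].append(group)
    return queues
```

For example, distribute(parse_groups(data, 16), [0, 1, 2, 3]) gives a plain round robin over four engines, while passing groups_per_turn=4 keeps each run of four 16-byte groups on a single engine.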
As shown in
The input logic circuit 574 operates to parse the input data stream into a number of groups, and direct the number of groups to the number of encryption circuits according to a distribution order, such as in a round robin sequence. The output logic circuit 576 operates to gather data groups from the encryption circuit outputs according to the round robin sequence and direct the groups into an encrypted output data stream corresponding to an arrangement of the input data stream, e.g., in the same order by which the input data stream was parsed. While the input logic circuit 574 and output logic circuit 576 are shown in
While encryption circuits discussed herein are taken to have the same encryption rate (e.g., data processing rate), embodiments of the present disclosure are not so limited, and an encryption circuit can have the same or different encryption rate as other parallel encryption circuits. However, different encryption rates will complicate the order and speed of the distribution of data groups thereto, the distribution pattern having to account for different speeds at which a particular encryption circuit may be ready for a next data group.
Furthermore, embodiments of the present disclosure are not limited to the encryption rates (e.g., 75 MB/s) used herein, and can be implemented using slower, or faster encryption rates, as may be achievable using other circuit geometries and fabrication techniques. The throughput of a particular encryption circuit, including an AES engine for example, is related to the process geometry and the clock frequency of the application, e.g., module, to which the encryption circuit is applied. Circuit footprint of each encryption circuit, as well as the total footprint associated with the number of encryption circuits are other considerations in determining encryption rate. For example, an encryption rate faster than 75 MB/s may be implemented using 180 nm technology and 6 layer metal fabrication techniques, thereby reducing the quantity of encryption circuits for achieving a given throughput rate; however, synthesizing an encryption circuit with an AES engine having double the 75 MB/s encryption rate may utilize three to four times more logic, e.g., buffers, etc., for a given process geometry node. Thus, doubling the encryption rate of an encryption circuit may halve the quantity of encryption circuits, but in doing so may increase the circuit size, complexity, power usage, etc. of the encryption device.
According to various embodiments of the present disclosure, the number of encryption circuits, e.g., 578-0, 578-1, 578-2, 578-3, is equal to or greater than an interface throughput rate (e.g., a SATA/300 rate of 300 MB/s) divided by the encryption rate (e.g., 75 MB/s). For example, given a controller with a SATA/300 interface to a host system with a throughput rate of 300 MB/s, and having encryption engines each with an encryption rate of 75 MB/s, at least 4 encryption circuits, working in parallel, can be used to encrypt data at the uppermost rate of the interface, e.g., “on the fly,” in order to keep up with the host system. The incoming streamed data, e.g., from a host system, is distributed to the number of parallel encryption circuits, e.g., 578-0, 578-1, 578-2, 578-3, in a round robin sequence, and thereby divided amongst the respective encryption engines (e.g., AES encryption engines) of the encryption circuits, e.g., 578-0, 578-1, 578-2, 578-3.
According to another example for a controller with a SATA/300 interface to a host system with a throughput rate of 300 MB/s, but having encryption engines each with an encryption rate of 70 MB/s, at least 5 encryption circuits, working in parallel, will be needed to encrypt data at least at the uppermost rate of the interface, e.g., “on the fly,” in order to keep up with the host system. Some encryption capacity may be underutilized in this arrangement. Embodiments of the present disclosure also contemplate utilizing fewer encryption engines than would be required to support the uppermost interface throughput rate, to provide a reduced combined data encryption rate, which may be sufficient in certain applications, or with adequate buffering to accommodate finite durations of uppermost throughput rates (but not continuous uppermost throughput rates).
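The sizing relationship in the two preceding examples can be illustrated with a short calculation. The following is only a sketch of the arithmetic, not part of any claimed circuit; the function names min_parallel_circuits and spare_capacity_mb_s are hypothetical and chosen for this illustration.

```python
import math


def min_parallel_circuits(interface_rate_mb_s: float, engine_rate_mb_s: float) -> int:
    """Smallest whole number of circuits whose combined rate meets the interface rate."""
    if interface_rate_mb_s <= 0 or engine_rate_mb_s <= 0:
        raise ValueError("rates must be positive")
    # The quotient is rounded up because a fraction of a circuit cannot be built.
    return math.ceil(interface_rate_mb_s / engine_rate_mb_s)


def spare_capacity_mb_s(interface_rate_mb_s: float, engine_rate_mb_s: float) -> float:
    """Combined encryption rate left unused at the uppermost interface rate."""
    count = min_parallel_circuits(interface_rate_mb_s, engine_rate_mb_s)
    return count * engine_rate_mb_s - interface_rate_mb_s


if __name__ == "__main__":
    print(min_parallel_circuits(300, 75))  # SATA/300 with 75 MB/s engines
    print(min_parallel_circuits(300, 70))  # SATA/300 with 70 MB/s engines
    print(spare_capacity_mb_s(300, 70))    # capacity left idle in the second case
```

With 70 MB/s engines the quotient is not a whole number, so rounding up leaves part of the combined capacity idle, which corresponds to the underutilization noted above.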
Referring again to
The output of each of the parallel encryption circuits, e.g., 578-0, 578-1, 578-2, 578-3, is coupled to one of multiple inputs of the output multiplexer 576. Output multiplexer 576 receives a control signal at an output control 577, by which output multiplexer 576 is controlled to sequentially select one of its inputs from which to route data to its output. This data assembling process may be accomplished by selecting, in a round robin sequence, an input corresponding to an encryption circuit, e.g., 578-0, 578-1, 578-2, 578-3, having encrypted data emerging from an encryption process. In this manner, the output multiplexer assembles the parsed, and now encrypted, data from the parallel encryption circuits, e.g., 578-0, 578-1, 578-2, 578-3, into an output data stream.
Each of the parallel encryption circuits, e.g., 578-0, 578-1, 578-2, 578-3, includes, coupled in series from input to output, an input buffer, e.g., 580-0, 580-1, 580-2, 580-3, an encryption engine, e.g., 562-0, 562-1, 562-2, 562-3, and an output buffer, e.g., 582-0, 582-1, 582-2, 582-3. According to one or more embodiments, the encryption engine, e.g., 562-0, 562-1, 562-2, 562-3, can be an encryption engine implementing an AES algorithm (e.g., an AES core) based on a key, e.g., 568-0, 568-1, 568-2, 568-3. The keys, e.g., 568-0, 568-1, 568-2, 568-3, received by the respective encryption engines, e.g., 562-0, 562-1, 562-2, 562-3, may all be the same key, but need not be. One having ordinary skill in the art will recognize that, where different keys are used, the data stream can be similarly parsed and directed to a decryption circuit utilizing a key corresponding to the key used to encrypt the group of data. Utilizing the same key in all parallel encryption engines can simplify the decryption process.
According to one or more embodiments, the input buffer, e.g., 580-0, 580-1, 580-2, 580-3, can be a number of registers each having a capacity equal to the quantity of data bits of the group into which the input data stream is parsed and directed to each encryption circuit. For example, the input buffer, e.g., 580-0, 580-1, 580-2, 580-3, can be four 16 byte registers to hold 64 bytes of data that can be chained together to supply one or more embodiments of an encryption engine operating in CBC mode. The input data stream from the host system (e.g., 102 in
According to one or more embodiments, the output buffer, e.g., 582-0, 582-1, 582-2, 582-3, can be a number of registers each having a capacity equal to the quantity of data bits of the group into which the input data stream is parsed and directed to each encryption circuit. As previously described, the quantity of bits of a group of data into which the input data stream is parsed, directed to each encryption circuit, may be set equal to the quantity of bits that are processed as a unit by the encryption engine, e.g., 562-0, 562-1, 562-2, 562-3. For example, for an encryption engine implementing a 128 bit AES algorithm, the incoming data stream may be parsed into 128 bit groups (e.g., sixteen 8-bit bytes), and the output buffer, e.g., 582-0, 582-1, 582-2, 582-3, can be, for example, two 16 byte registers.
Embodiments of the present disclosure are not limited to the quantities, or sizes, provided as examples above. For example, input and output registers may utilize more or fewer registers, of smaller or greater capacity, which may be compatible with the particular encryption engine used, number of parallel encryption circuits, data rates, and group size into which the incoming data stream is parsed and directed to the number of parallel encryption circuits. Some implementations of the present disclosure may use additional data buffering capabilities, such as where the uppermost encryption rate may be less than the uppermost throughput rate of a host system or communication interface between the host system and memory system within which the encryption device is incorporated.
As previously described with respect to
In one or more embodiments, an encryption engine can implement a 128-bit AES algorithm (e.g., as illustrated in
For data that is transmitted across a communication interface (e.g., 206 in
From
Considering the output end of the parallel encryption circuits, e.g., 678-0, 678-1, 678-2, 678-3, the reader will observe that encrypted data initially emerges from the first encryption circuit, e.g., 678-0, at clock cycle 20. Thus, an initial latency (e.g., 684) occurs that is attributable to the encryption process, of 16 clock cycles. One having ordinary skill in the art will appreciate that an AES encryption algorithm may be executed in various ways, for example using a number (e.g., 11, 13, 15) of rounds of data manipulation, each round being performed in one clock cycle. Thus, the 16 clock cycle initial latency includes not only the AES encryption algorithm, but also movement of data into, through (if necessary), and out of the input, e.g., 680-0, 680-1, 680-2, 680-3, and output, e.g., 682-0, 682-1, 682-2, 682-3, buffers.
According to one or more embodiments of the present disclosure, encrypted data is continuously transferred out of each of the parallel encryption circuits, e.g., 678-0, 678-1, 678-2, 678-3, at the same rate as it is being input. For example, the first 16-byte group of encrypted output data (DATA OUTPUT0) can be clocked out of the first encryption circuit, e.g., 678-0, over 4 cycles beginning with clock cycle 20 (i.e., clock cycles 20-23), then the next (e.g., second) 16-byte group of encrypted output data (DATA OUTPUT1) can be clocked out of the second encryption circuit, e.g., 678-1, over 4 cycles beginning with the next clock cycle 24 (i.e., clock cycles 24-27), and so on in a round robin sequence corresponding to the input round robin sequence, until the last (e.g., 32nd) 16-byte group of encrypted output data (DATA OUTPUT31) of a 512 byte packet can be clocked out of the fourth encryption circuit, e.g., 678-3, over 4 cycles beginning with clock cycle 148 (e.g., over clock cycles 148-151). As is indicated, the packet delay, from the time that a particular packet begins to be clocked into an encryption circuit, e.g., 678-0, until the last group of data begins to emerge from being encrypted, e.g., from encryption circuit 678-3, can be 148 clock cycles.
As previously described with respect to
According to the encryption method embodiment illustrated in
In one or more embodiments, each clock cycle can transfer 4 bytes (i.e., 32 bits at 8 bits per byte), and corresponding to the AES engine processing (e.g., encrypting, decrypting) 128 bit (i.e., 16 byte) blocks at a time, the input data stream can still be parsed into 16 byte groups. Therefore, 4 clock cycles, at 4 bytes per clock cycle, are used to transfer the 16 byte group of parsed data (e.g., from an input multiplexer to a particular encryption circuit, e.g., 778-0, 778-1, 778-2, 778-3).
For data transmitted across a communication interface (e.g., 206 in
From
For example, a first 16-byte group of data (DATA INPUT0) is distributed (e.g., directed by an input multiplexer) to the input of a first parallel encryption circuit, e.g., 778-0, during clock cycles 1-4. However, the next (e.g., second) 16-byte group of data (DATA INPUT1) parsed from an input data stream is also distributed to the input of the first parallel encryption circuit, e.g., 778-0, during clock cycles 5-8. The next two (e.g., third and fourth) 16-byte groups of data (DATA INPUT2 and DATA INPUT3) are likewise distributed to the input of the first parallel encryption circuit, e.g., 778-0, during clock cycles 9-12 and 13-16 respectively. Thus, as indicated on
Then, the round robin sequence moves to the next parallel encryption circuit, e.g., 778-1, for example by the input multiplexer (e.g., 574 in
In a similar manner, DATA INPUT8-11 are parsed from the input data stream and distributed to the input of the third parallel encryption circuit, e.g., 778-2, during clock cycles 33-48, and DATA INPUT12-15 are parsed from the input data stream and distributed to the input of the fourth parallel encryption circuit, e.g., 778-3, during clock cycles 49-64. According to the round robin sequence, the first parallel encryption circuit is again selected, and DATA INPUT16-19 are parsed from the input data stream and distributed to the input of the first parallel encryption circuit, e.g., 778-0, during clock cycles 65-80. The above-described round robin distribution continues until data groups parsed from a received packet (e.g., 512 bytes) are distributed as shown in
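The input timing described above, in which four consecutive 16-byte groups go to one encryption circuit before the round robin sequence advances, can be tabulated with a short sketch. The function name input_schedule and its parameter names are hypothetical; the default values (4 circuits, 4 groups per selection, 4 clock cycles per group at 4 bytes per cycle) are taken from the example in the text, and the sketch models only the input side, not the encryption latency.

```python
from typing import List, Tuple


def input_schedule(
    num_groups: int,
    num_circuits: int = 4,
    groups_per_selection: int = 4,
    cycles_per_group: int = 4,
) -> List[Tuple[int, int, int, int]]:
    """Return (group, circuit, first_cycle, last_cycle) for each parsed group.

    Clock cycles are numbered from 1, as in the example in the text.
    """
    schedule = []
    for group in range(num_groups):
        # The same circuit stays selected for a whole run of groups,
        # then the round robin sequence moves to the next circuit.
        circuit = (group // groups_per_selection) % num_circuits
        first_cycle = group * cycles_per_group + 1
        last_cycle = first_cycle + cycles_per_group - 1
        schedule.append((group, circuit, first_cycle, last_cycle))
    return schedule


if __name__ == "__main__":
    # A 512 byte packet parsed into 16 byte groups yields 32 groups.
    for group, circuit, first, last in input_schedule(32)[:20]:
        print(f"DATA INPUT{group} -> circuit {circuit}, cycles {first}-{last}")
```

Under these assumptions DATA INPUT8-11 land on the third circuit over cycles 33-48 and DATA INPUT16-19 return to the first circuit over cycles 65-80, which agrees with the sequence described above.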
Considering the output end of the parallel encryption circuits, e.g., 778-0, 778-1, 778-2, 778-3, the reader will observe that encrypted data initially emerges from the first encryption circuit, e.g., 778-0, at clock cycle 20. Thus, an initial latency (e.g., 784) occurs that is attributable to the encryption process, of 16 clock cycles. The first group of data (of four groups of data distributed in sequence to an encryption engine) is encrypted essentially in an ECB mode (e.g., without feedback) since it does not follow a group through the encryption engine from which feedback may be obtained. The initial latency (e.g., 784) shown in
However, unlike the encryption engines shown in
The reader can see from
An encryption method according to one or more embodiments of the present disclosure can include parsing an input data stream into a number of groups (e.g., 0-15). The data groups are numbered in
This round robin distribution of individual data groups per round robin selection of the destination circuit continues with the data groups of a packet (e.g., data groups 4-15 for a 512 byte packet and 16 byte data groups). That is, the number of groups are distributed in a round robin sequence among a number of parallel encryption circuits operating in an electronic codebook mode, one data group being distributed per each selection of a particular encryption circuit in the round robin sequence. The reader can see that the above-described distribution sequence continues with data group 4 being directed to encryption circuit 0, in sequence behind data group 0.
Each group is processed one at a time through its respective one of the number of parallel encryption circuits, for example at a data processing rate. There is no feedback between respective groups, since the encryption circuits are operating in ECB mode. It is desirable that the number of groups are distributed to the number of parallel encryption circuits such that transfer of a next group to a particular encryption circuit is completed just as processing of the preceding group by the encryption circuit is completed (e.g., distribution of data group 4 is completed just as encryption circuit 0 completes processing data group 0 and is ready to process a next data group).
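The ECB-mode distribution just described, with one group per round robin selection, can be sketched in software together with the matching round robin reassembly performed at the output. This is an illustrative model only; the function names parse_groups, distribute_round_robin, and reassemble_round_robin are hypothetical, and lists stand in for the input and output buffers of the parallel circuits.

```python
from typing import List


def parse_groups(data: bytes, group_size: int = 16) -> List[bytes]:
    """Split an input data stream into fixed-size groups (16 bytes for 128-bit AES)."""
    if len(data) % group_size != 0:
        raise ValueError("data length must be a multiple of the group size")
    return [data[i:i + group_size] for i in range(0, len(data), group_size)]


def distribute_round_robin(groups: List[bytes], num_circuits: int = 4) -> List[List[bytes]]:
    """Send one group to each circuit in turn: group g goes to circuit g mod N."""
    queues: List[List[bytes]] = [[] for _ in range(num_circuits)]
    for index, group in enumerate(groups):
        queues[index % num_circuits].append(group)
    return queues


def reassemble_round_robin(queues: List[List[bytes]]) -> List[bytes]:
    """Select circuit outputs in the same round robin order to restore the stream order."""
    total = sum(len(queue) for queue in queues)
    return [queues[i % len(queues)][i // len(queues)] for i in range(total)]


if __name__ == "__main__":
    packet = bytes(range(256)) * 2          # a 512 byte packet
    groups = parse_groups(packet)           # 32 groups of 16 bytes
    queues = distribute_round_robin(groups)
    print(queues[0][1] == groups[4])        # group 4 follows group 0 on circuit 0
    print(reassemble_round_robin(queues) == groups)
```

Because the output selection follows the same order as the input distribution, the groups leave in their original sequence without any per-group tagging, assuming each circuit processes its groups in arrival order.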
Assuming the input data stream is derived from a Serial Advanced Technology Attachment (SATA) interface having an uppermost throughput rate, the number of parallel encryption circuits needed for “on the fly” (e.g., continuous) encryption is at least the uppermost throughput rate divided by the data processing rate. For example, assuming a SATA interface uppermost throughput rate of 300 MB/s, and a data processing rate of 75 MB/s, then the number of parallel encryption circuits to provide continuous encryption capability is at least four.
An encryption method according to one or more embodiments of the present disclosure can include parsing an input data stream into a number of groups (e.g., 0-15). The data groups are numbered within the boxes shown in
As is further shown in
According to one or more embodiments, some portion of encrypted output (ciphertext) is then fed back (instead of the initialization vectors) and combined, e.g., by an XOR function, with subsequent bytes of the input data chain. For example, some portion of the output from encrypting data group 0, e.g., 884 in
It is desirable that the number of groups are distributed to the number of parallel encryption circuits such that transfer of a next plurality of groups to a particular encryption circuit is completed just as processing of the preceding plurality of groups by the encryption circuit is completed (e.g., distribution of a next plurality of data groups is completed just as encryption circuit 0 completes processing data group 3 and is ready to process a first data group of a next plurality of data groups).
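The chaining of a plurality of groups within one encryption circuit can be illustrated with a minimal sketch of cipher block chaining. Several assumptions are made here and should be read as such: the block function toy_encrypt_block is a trivial, insecure stand-in for an AES engine (it is not AES), the whole preceding ciphertext block is fed back as in textbook CBC (the disclosure refers to "some portion" of the output), and all function names are hypothetical.

```python
from typing import List

BLOCK_SIZE = 16  # bytes, matching a 128-bit block


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt_block(block: bytes, key: bytes) -> bytes:
    """Insecure placeholder for an AES engine: XOR with the key, rotate left one byte."""
    mixed = xor_bytes(block, key)
    return mixed[1:] + mixed[:1]


def toy_decrypt_block(block: bytes, key: bytes) -> bytes:
    """Inverse of toy_encrypt_block: rotate right one byte, XOR with the key."""
    unrotated = block[-1:] + block[:-1]
    return xor_bytes(unrotated, key)


def cbc_encrypt_chain(groups: List[bytes], key: bytes, iv: bytes) -> List[bytes]:
    """Encrypt one chain of groups; the IV seeds the first group, ciphertext the rest."""
    feedback = iv
    ciphertext = []
    for group in groups:
        encrypted = toy_encrypt_block(xor_bytes(group, feedback), key)
        ciphertext.append(encrypted)
        feedback = encrypted  # fed back in place of the initialization vector
    return ciphertext


def cbc_decrypt_chain(ciphertext: List[bytes], key: bytes, iv: bytes) -> List[bytes]:
    """Recover the plaintext groups of one chain."""
    feedback = iv
    plaintext = []
    for encrypted in ciphertext:
        plaintext.append(xor_bytes(toy_decrypt_block(encrypted, key), feedback))
        feedback = encrypted
    return plaintext


if __name__ == "__main__":
    key = bytes(range(BLOCK_SIZE))
    iv = bytes([0xA5] * BLOCK_SIZE)
    chain = [b"A" * BLOCK_SIZE] * 4  # four identical groups sent to one circuit
    encrypted = cbc_encrypt_chain(chain, key, iv)
    print(encrypted[0] != encrypted[1])                    # chaining hides the repetition
    print(cbc_decrypt_chain(encrypted, key, iv) == chain)  # round trip
```

In this model each plurality of groups distributed per round robin selection forms its own chain, so the feedback dependency stays inside a single encryption circuit and the parallel circuits do not need to exchange ciphertext.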
Assuming the input data stream from which the data groups shown in
The present disclosure includes methods and devices for parallel encryption/decryption. In one or more embodiments, an encryption/decryption device includes an input logic circuit, an output logic circuit, and a number of encryption/decryption circuits arranged in parallel between the input logic circuit and the output logic circuit. Each encryption/decryption circuit is capable of processing data at an encryption/decryption rate, and the number of encryption/decryption circuits is equal to or greater than an interface throughput rate divided by the encryption/decryption rate.
In the detailed description of the present disclosure, reference is made to the accompanying drawings that form a part hereof, and in which is shown by way of illustration how one or more embodiments of the present disclosure may be practiced. These embodiments are described in sufficient detail to enable those of ordinary skill in the art to practice the embodiments of this disclosure, and it is to be understood that other embodiments may be utilized and that process, electrical, and/or structural changes may be made without departing from the extent of the present disclosure.
As used herein, the designators “N” and “M,” particularly with respect to reference numerals in the drawings, indicate that a number of the particular feature so designated can be included with one or more embodiments of the present disclosure. As will be appreciated, elements shown in the various embodiments herein can be added, exchanged, and/or eliminated so as to provide a number of additional embodiments of the present disclosure. In addition, as will be appreciated, the proportion and the relative scale of the elements provided in the figures are intended to illustrate the embodiments of the present disclosure, and should not be taken in a limiting sense.
It will be understood that when an element is referred to as being “on,” “connected to” or “coupled with” another element, it can be directly on, connected, or coupled with the other element or intervening elements may be present. In contrast, when an element is referred to as being “directly on,” “directly connected to” or “directly coupled with” another element, there are no intervening elements present. As used herein, the term “and/or” includes any and all combinations of one or more of the associated listed items.
It will be understood that, although the terms first, second, etc. may be used herein to describe various elements, components, regions, layers, and/or sections, these elements, components, regions, wiring lines, layers, and/or sections should not be limited by these terms. These terms are only used to distinguish one element, component, region, wiring line, layer, or section from another region, layer, or section. Thus, a first element, component, region, wiring line, layer or section discussed below could be termed a second element, component, region, wiring line, layer, or section without departing from the teachings of the present disclosure.
Spatially relative terms, such as “beneath,” “below,” “lower,” “above,” “upper,” and the like, may be used herein for ease of description to describe one element or feature's relationship to another element(s) or feature(s) as illustrated in the figures rather than an absolute orientation in space. It will be understood that the spatially relative terms are intended to encompass different orientations of the device in use or operation in addition to the orientation depicted in the figures. For example, if the device in the figures is turned over, elements described as “below” or “beneath” other elements or features would then be oriented “above” the other elements or features. Thus, the example term “below” can encompass both an orientation of above and below. The device may be otherwise oriented (rotated 90 degrees or at other orientations) and the spatially relative descriptors used herein interpreted accordingly.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used herein, the singular forms “a,” “an,” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and the present disclosure, and should not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
Embodiments of the present disclosure are described herein with reference to functional block illustrations that are schematic illustrations of idealized embodiments of the present disclosure. As such, variations from the shapes of the illustrations as a result, for example, of manufacturing techniques and/or tolerances, are to be expected. Thus, embodiments of the present disclosure should not be construed as limited to the particular shapes of regions illustrated herein but are to include deviations in shapes that result, for example, from manufacturing. For example, a region illustrated or described as flat may, typically, have rough and/or nonlinear features. Moreover, sharp angles that are illustrated may be rounded. Thus, the regions illustrated in the figures are schematic in nature and their shapes and relative sizes, thicknesses, and so forth, are not intended to illustrate the precise shape/size/thickness of a region and are not intended to limit the scope of the present disclosure.
Although specific embodiments have been illustrated and described herein, those of ordinary skill in the art will appreciate that an arrangement calculated to achieve the same results can be substituted for the specific embodiments shown. This disclosure is intended to cover adaptations or variations of one or more embodiments of the present disclosure. It is to be understood that the above description has been made in an illustrative fashion, and not a restrictive one. Combination of the above embodiments, and other embodiments not specifically described herein will be apparent to those of skill in the art upon reviewing the above description. The scope of the one or more embodiments of the present disclosure includes other applications in which the above structures and methods are used. Therefore, the scope of one or more embodiments of the present disclosure should be determined with reference to the appended claims, along with the full range of equivalents to which such claims are entitled.
In the foregoing Detailed Description, some features are grouped together in a single embodiment for the purpose of streamlining the disclosure. This method of disclosure is not to be interpreted as reflecting an intention that the disclosed embodiments of the present disclosure have to use more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive subject matter lies in less than all features of a single disclosed embodiment. Thus, the following claims are hereby incorporated into the Detailed Description, with each claim standing on its own as a separate embodiment.
Asnaashari, Mehdi, Sarno, Robin
Executed on | Assignor | Assignee | Conveyance | Frame | Reel | Doc |
Dec 02 2008 | SARNO, ROBIN | Micron Technology, Inc | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 021972 | /0369 | |
Dec 02 2008 | ASNAASHARI, MEHDI | Micron Technology, Inc | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 021972 | /0369 | |
Dec 12 2008 | Micron Technology, Inc. | (assignment on the face of the patent) | / | |||
Apr 26 2016 | Micron Technology, Inc | U S BANK NATIONAL ASSOCIATION, AS COLLATERAL AGENT | SECURITY INTEREST SEE DOCUMENT FOR DETAILS | 038669 | /0001 | |
Apr 26 2016 | Micron Technology, Inc | U S BANK NATIONAL ASSOCIATION, AS COLLATERAL AGENT | CORRECTIVE ASSIGNMENT TO CORRECT THE REPLACE ERRONEOUSLY FILED PATENT #7358718 WITH THE CORRECT PATENT #7358178 PREVIOUSLY RECORDED ON REEL 038669 FRAME 0001 ASSIGNOR S HEREBY CONFIRMS THE SECURITY INTEREST | 043079 | /0001 | |
Apr 26 2016 | Micron Technology, Inc | MORGAN STANLEY SENIOR FUNDING, INC , AS COLLATERAL AGENT | PATENT SECURITY AGREEMENT | 038954 | /0001 | |
Jun 29 2018 | U S BANK NATIONAL ASSOCIATION, AS COLLATERAL AGENT | Micron Technology, Inc | RELEASE BY SECURED PARTY SEE DOCUMENT FOR DETAILS | 047243 | /0001 | |
Jul 03 2018 | MICRON SEMICONDUCTOR PRODUCTS, INC | JPMORGAN CHASE BANK, N A , AS COLLATERAL AGENT | SECURITY INTEREST SEE DOCUMENT FOR DETAILS | 047540 | /0001 | |
Jul 03 2018 | Micron Technology, Inc | JPMORGAN CHASE BANK, N A , AS COLLATERAL AGENT | SECURITY INTEREST SEE DOCUMENT FOR DETAILS | 047540 | /0001 | |
Jul 31 2019 | JPMORGAN CHASE BANK, N A , AS COLLATERAL AGENT | Micron Technology, Inc | RELEASE BY SECURED PARTY SEE DOCUMENT FOR DETAILS | 051028 | /0001 | |
Jul 31 2019 | JPMORGAN CHASE BANK, N A , AS COLLATERAL AGENT | MICRON SEMICONDUCTOR PRODUCTS, INC | RELEASE BY SECURED PARTY SEE DOCUMENT FOR DETAILS | 051028 | /0001 | |
Jul 31 2019 | MORGAN STANLEY SENIOR FUNDING, INC , AS COLLATERAL AGENT | Micron Technology, Inc | RELEASE BY SECURED PARTY SEE DOCUMENT FOR DETAILS | 050937 | /0001 |