The present disclosure relates to a structure including a read controller configured to receive a burst enable signal and a word line pulse signal, identify consecutive read operations from storage cells accessed via a word line, precharge bit lines once during consecutive, sequential reads, and hold the word line active through N−1 of the consecutive read operations, and N is an integer number of the consecutive read operations.
8. A method of reading from a memory structure, the method comprising:
performing N−1 consecutive read operations including sensing bits from N storage cells along a first word line of the memory structure, where N is an integer greater than two, and wherein, during the N−1 consecutive read operations, the bit of the N−1th storage cell and the bit of the Nth storage cell are concurrently read from the N−1th storage cell and the Nth storage cell to enable early restoring of bit line of the memory structure and early activating of a next word line of the memory structure; and
concurrently
receiving the bit of the Nth storage cell read by activation of the first word line and read during the N−1th consecutive read operation, and
restoring the bit lines of the memory structure and activating the next word line.
1. A memory structure comprising:
an array of storage cells comprising bit lines and N storage cells, wherein the N storage cells are accessible via a first word line, and where N is an integer greater than two;
at least one of
a single sense amplifier configured to sense bits from N−1 storage cells of the N storage cells during N−1 consecutive read operations, or
a first latch configured to latch the bits from the N−1 storage cells during the N−1 consecutive read operations; and
a read controller configured to, during the N−1 consecutive read operations, i) precharge the bit lines once, ii) access at least N−1 of the N storage cells via the first word line, and, iii) while holding the first word line active, signal at least one of the single sense amplifier or the first latch to acquire the bits of the N−1 storage cells.
2. The structure of
3. The structure of
the array of storage cells includes a plurality of word lines corresponding to the rows of the array of storage cells, wherein the plurality of word lines comprises the first word line; and
the bit lines correspond to the columns of the array of storage cells.
4. The structure of
the holding of the first word line active through the N−1 consecutive read operations is enabled by an alternative read path; and
the alternative read path senses data from a Nth storage cell of the N storage cells using a tri-buffer and a second latch while a N−1th column of the array of storage cells including the N storage cells is read using the first latch.
6. The structure of
the precharging of the bit lines once and the holding of the first word line active through the N−1 consecutive read operations is triggered based on a burst enable signal and a word line pulse signal; and
the burst enable signal and the word line pulse signal are indicative of a static random access memory operating in a sequential read mode.
7. The structure of
9. The method of
10. The method of
11. The method of
12. The method of
13. The method of
14. The structure of
the first latch configured to latch the bits from the N−1 storage cells of the N storage cells during the N−1 consecutive read operations; and
a second latch configured to latch a bit from the Nth storage cell of the N storage cells during the N−1th consecutive read operation.
15. The structure of
16. The structure of
the first latch latches the bits of the N−1 storage cells consecutively; and
the second latch latches the bit of the Nth storage cell in parallel with the latching of the bit of the N−1th storage cell by the first latch.
17. The structure of
a plurality of pairs of inputs connected to respective pairs of the bit lines; and
a single pair of outputs providing the bits from the N−1 storage cells to the first latch.
18. The structure of
wherein the read controller is configured to control the multiplexer to select the output of the second latch while restoring the bit lines, restoring the single sense amplifier, and activating a second word line.
19. The structure of
the read controller is configured to
when selecting the first word line, select a word line of a first memory bank, and when selecting the second word line, select a word line of a second memory bank;
the second memory bank includes a different array of storage cells and different word lines than the first memory bank; and
clock timing of the second memory bank is phase shifted from clock timing of the first memory bank.
20. The method of
concurrently latching the bit of the N−1th storage cell into a first latch and latching the bit of the Nth storage cell into a second latch;
subsequent to latching the bit of the N−1th storage cell and the bit of the Nth storage cell, receiving the bit of the N−1th storage cell at a multiplexer, and
subsequent to the bit of the Nth storage cell being received at a multiplexer, concurrently receiving the bit of the Nth storage cell at the multiplexer and restoring the bit lines of the memory structure and activating the next wordline.
The present disclosure relates to a phase shifted sequential read mode in a static random access memory, and more particularly, to a circuit and a method for using a phase shifted burst mode in a static random access memory to save power and improve performance associated with address switching and decoding.
Memory devices are employed as internal storage areas in a computer or other electronic equipment. One specific type of memory used to store data in a computer is random access memory (RAM). RAM is typically used as a main memory in a computer environment, and is generally volatile in that once power is turned off all data stored in the RAM is lost.
A static random access memory (SRAM) is one example of a RAM. The SRAM has the advantage of holding data without a need for refreshing. A typical SRAM device includes an array of individual SRAM cells. Each SRAM cell is capable of storing a binary voltage value that represents a logical data bit (e.g., “0” or “1”).
In SRAM, energy efficiency is a challenge with a need for lower power. For example, typical machine learning applications require lower power as well as faster memory access. In typical deep neural network hardware, memory is used to store weight parameters and activations as an input propagates through the network. The typical neural network application uses SRAMs to store weights and activations. These weights are stored in consecutive (i.e., next to one another) memory locations to address spatial locality. Further, a typical deep-learning architecture reads a full layer matrix in a linear fashion and uses the data to generate the next layer. Applications that require sequential access rather than full random access can achieve power savings and performance enhancement (higher bandwidth) when a signal is developed on multiple adjacent words in parallel (i.e., multiple columns connected to a single sense amplifier) and control circuitry enables sensing these words consecutively in a burst mode.
In an aspect of the disclosure, a structure includes a read controller configured to receive a burst enable signal and a word line pulse signal, identify consecutive read operations from storage cells accessed via a word line, precharge bit lines once during consecutive, sequential reads, and hold the word line active through N−1 of the consecutive read operations, and N is an integer number of the consecutive read operations.
In another aspect of the disclosure, a circuit includes a plurality of first bitswitches which are connected to corresponding bit lines, a second bitswitch which receives a corresponding bit line and is connected to a tribuffer circuit that outputs a tribuffer output signal to a second latch, and a sense amplifier that receives the outputs of the first bitswitches and outputs a sense output signal to a first latch.
In another aspect of the disclosure, a method includes sensing sequential read operations of N−1 consecutive read operations through a sense amplifier, and sensing a Nth read operation through a skewed tribuffer circuit to enable early bit line restore and activation of a next word line in a cycle earlier than a next cycle, and N is an integer number of the consecutive read operations.
The present disclosure is described in the detailed description which follows, in reference to the noted plurality of drawings by way of non-limiting examples of exemplary embodiments of the present disclosure.
The present disclosure relates to a phase shifted sequential read mode in a static random access memory, and more particularly, to a circuit and a method for using a phase shifted burst mode in a static random access memory to save power and improve performance associated with address switching and decoding. More specifically, the present disclosure uses phase shifted burst arrays to address sense amplifier restore bubbles with interleaved banks. Advantageously, the present disclosure improves bandwidth and performance of SRAM read operations.
In conventional technology, a burst mode architecture can improve both power and bandwidth in multiplexer-wide burst operations for memory. However, in the conventional technology, at the end of the burst operations, a precharge time is required to precharge the bit lines back up and activate another word line before the next read operation. This precharge time creates a bubble in the operation stream. The precharge time bubble occurs when there is a transition from one group of sequential reads (e.g., 4 columns) to the next group of columns (i.e., a new word line activation). The precharge time bubble causes an issue since the signal development time (i.e., the precharge time) is large for the first slow read operation in the next group of columns. For example, a sense amplifier in a first bank (i.e., BANK0) may be busy precharging the data lines so it is not possible to turn on the bitswitch for the next column. Precharging the data lines is required to clear the read data from the last sense operation.
The present disclosure uses a phase-shift operation which reads from different banks to hide the bitline restore dead time that occurs in the conventional technology due to the bubble (i.e., the bitline precharge time bubble) when transitioning from one group of sequential reads to the next group of columns. Therefore, the present disclosure increases bandwidth and performance without increasing power requirements.
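As a rough illustration of the bubble described above, the following Python sketch counts cycles for a stream of burst reads with and without an exposed restore slot. It is a toy model and not the disclosed circuit: the function names, the one-cycle-per-column read, and the one-cycle restore are assumptions chosen only for illustration.

```python
def exposed_restore_cycles(groups, cols_per_group=4, restore_cycles=1):
    """Cycles needed when every group of column reads is followed by a visible restore.

    Assumption: each column read takes one cycle and each restore takes
    `restore_cycles` cycles during which no data is delivered.
    """
    return groups * (cols_per_group + restore_cycles)


def hidden_restore_cycles(groups, cols_per_group=4):
    """Cycles needed when the restore overlaps other work and costs no extra slot."""
    return groups * cols_per_group


if __name__ == "__main__":
    groups = 3
    slow = exposed_restore_cycles(groups)   # 3 * (4 + 1)
    fast = hidden_restore_cycles(groups)    # 3 * 4
    print(slow, fast)
```

Under these assumed numbers, three groups of four column reads take 15 cycles with an exposed restore and 12 cycles when the restore is hidden.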
The first system 105 also includes a read controller 130, a first sense amplifier/multiplexer circuit 135, a second sense amplifier/multiplexer circuit 140, a third sense amplifier/multiplexer circuit 145, a first wordline driver 115, a second wordline driver 120, and a third wordline driver 125 corresponding to three adjacent word lines in the memory array. The first sense amplifier/multiplexer circuit 135 is shared by the first column 1, the second column 2, the third column 3, and the fourth column 4 in a basic decode 4 architecture. Similarly, the second sense amplifier/multiplexer circuit 140 is shared by the next consecutive four columns. One skilled in the art would recognize that the four bit line true and complement pairs 1, 2, 3, 4 would connect to data lines true and complement that feed into a shared sense amplifier/multiplexer circuit (i.e., one of the first sense amplifier/multiplexer circuit 135, the second sense amplifier/multiplexer circuit 140, and the third sense amplifier/multiplexer circuit 145) and output a single data output from the data out bus Q0<0:n>.
In the operation of
Reading a column comprises activating the multiplexer/bitswitch (i.e., the first sense amplifier/multiplexer circuit 135) to connect a desired column to the sense amplifier data lines DLT and DLC, setting the first sense amplifier/multiplexer circuit 135, latching the read data RDT0, RDT1, and restoring the sense amplifier data lines DLT and DLC. After the first read on column 1 is completed, a second read is performed for the second column 2. The sense amplifier of the sense amplifier/multiplexer circuit 135 is reset and the multiplexer of the first sense amplifier/multiplexer circuit 135 is shifted to the adjacent bit line pair BLT, BLC corresponding to the third column 3. The sense and read operation is performed for the third column 3 and the sense amplifier of the first sense amplifier/multiplexer circuit 135 is reset. The reading and sense operation for the column 4 is performed in parallel to reading and sense operation of column 3. This is possible due to the second sense structure 635 comprising the skewed inverter and latch. The bitlines are restored immediately after the column 3 is read, as opposed to being restored after column 4 in a conventional architecture. This enables the bitline precharge time to be hidden when moving from one burst read operation to the other burst read operation.
In embodiments, the first system 105 and the second system 150 include a read controller 130, 175 configured to receive a burst enable signal and a word line pulse signal, identify consecutive read operations from storage cells accessed via a word line WL, and precharge bit lines (i.e., BLT, BLC) once during consecutive, sequential reads, and hold the word line WL active through N−1 of the consecutive read operations. N is an integer number of the consecutive read operations. The read controller 130, 175 is part of an SRAM which comprises two sensing paths for the array of the storage cells. Further, the structure includes an array of the storage cells of the first system 105 and the second system 150 being arranged as rows corresponding to word lines and columns corresponding to bit lines.
In embodiments, the holding of the word line WL active through N−1 of the consecutive read operations is enabled by an alternative read path that senses data using a tri-buffer structure 740 and a latch 635 in parallel with the (N−1)th column read. In embodiments, the precharging of the bit lines once during consecutive, sequential reads and holding the word line active through N−1 of the consecutive reads occurs based on the burst enable signal and the word line pulse signal indicating that the SRAM is operating in a sequential read mode. The sequential read mode hides a bit line restore time and allows an early bit line restore and early word line activation of a next word. The early bit line restore is performed by precharging a bit line immediately after a (N−1)th column read.
Further, the memory banks 100 have an early bit line BL restore 110, 155 and word line activation WL to address the burst-to-burst bubble. Therefore, when using the first system 105 and the second system 150, which is offset from the first system 105 by 180 degrees, a read mode bandwidth is increased by providing a burst architecture and two banks that are clocked offset by 180 degrees to approximately double the bandwidth (while restoring the data lines of first system 105, data from the second system 150 is output from the memory). In
In
On the other hand,
As an example, the SET1 and SALCLKN1 operations are performed in the interleaved burst mode operation 500 where stall operations would normally occur in
In conventional circuitry, a bit line precharge forms a limitation on a cycle time when going from one set of column bursts (i.e., base decode4) to the other column set. Therefore, in the conventional circuitry, the next column read of the other column set cannot start before the bit lines are precharged. In contrast, the structures 600, 700, and 740 of
Moreover, a burst to burst challenge (i.e., going from one set of 4 column reads to another set of 4 column reads) can cause a cycle time limitation due to a bit line BL restore and a signal development of a next word line WL activation. In this scenario, the bit lines BLs are precharged before the next read operation. Precharging the bit lines BLs can be an issue because the time to precharge the bit lines BLs all the way from ground to VDD limits how short the cycle time can be for arrays with a higher number of cells per bit line (precharging the bit lines takes a long time because of their large capacitance). Further, the burst to burst challenge also poses a challenge for the signal development of the first read operation of the next four columns. In order to mitigate the burst to burst challenge, the present disclosure uses the structure 600 of the sense amplifier bypass scheme in
More specifically,
The sense amplifier 625 outputs true data bit line DBT and complement data bit line DBC to the first latch 630. Further, the second latch 635 receives the fourth true bit line BLT3 and outputs a second true read data RDT1 to the 2:1 multiplexer 640. The first latch 630 also outputs a first true read data RDT0 to the 2:1 multiplexer 640. The 2:1 multiplexer 640 outputs a final true read data RDT_F.
In embodiments, the 2:1 multiplexer 640 selects among the tri-buffer output signal RDT1 and the sense output signal RDT0 and outputs a read output RDT_F for a column structure including N columns.
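The selection performed by the 2:1 multiplexer 640 can be sketched behaviorally in Python as follows. This is only an illustration under stated assumptions: the function names and the select input `select_bypass` are hypothetical, and the sketch assumes that the last of the N columns is the one routed through the tri-buffer path while all other columns use the sense amplifier path.

```python
def mux_2to1(rdt0, rdt1, select_bypass):
    """Return RDT_F: RDT1 (tri-buffer / second latch path) when select_bypass
    is set, otherwise RDT0 (sense amplifier / first latch path)."""
    return rdt1 if select_bypass else rdt0


def uses_bypass_path(col, n_columns):
    """Assumed mapping (0-based): only the last column uses the tri-buffer path."""
    return col == n_columns - 1


if __name__ == "__main__":
    n_columns = 4
    for col in range(n_columns):
        path = "RDT1" if uses_bypass_path(col, n_columns) else "RDT0"
        print(f"column {col}: RDT_F taken from {path}")
```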
In
The control logic 735 receives a clock CLK signal, a RESET signal, and the BURST_ENABLE signal. Further, the control logic 735 outputs the WBS0 and RBSN0 signals to the first bitswitch 605, the WBS1 and RBSN1 signals to the second bitswitch 610, and the WBS2 and RBSN2 signals to the third bitswitch 615. The RBSN3 and WBS3 signals are sent to the AND gate 725 and the OR gate 730. Further, the control logic 735 outputs the sense amplifier sense SA_SENSE signal, the sense amplifier latch SA_LATCH signal, the sense amplifier restore SA_RESTORE signal, and the bit line restore BL_RESTORE signal.
The tri-buffer structure 740 receives the RDT1 signal which is output from the second latch 635. Further, a second tri-buffer 750 receives the RDT1 signal, the RBSN2_L signal, the WBS2_L signal and outputs a signal to the input of an inverter 755. Further, the first tri-buffer 745 receives an input signal, the WBS2_L signal, the RBSN2_L signal and outputs the signal to the input of the inverter 755. The tri-buffer structure 740 is a skewed tri-buffer sensing structure which enables an early bit line restore and activation of a next word line in a cycle before a next cycle.
The timing diagram 800 includes the following signals: a READ control signal that goes to “1” to signify that a read is requested from memory; a CLK signal which is a free running clock that is supplied by a customer/tester for the memory; CLK0/CLK1 signals which are even and odd versions of the CLK signal shown in
The timing diagram 800 also includes a word line WL which is the word line activation (row). The word line WL is held active for three of the four consecutive read operations. The timing diagram 800 further includes a sense amplifier restore SARST signal which is required between read operations to restore the true and complement data line signals within a sense amplifier and prepares data sensing on the next word. In addition, a sense amplifier set SET signal is used to set a full differential voltage level and transfer data out of the sense amplifier and onto local/global data lines. A bit line restore BLR signal is used no more than once during a group of consecutive reads to restore the bit lines to a known voltage. The timing diagram 800 also includes a local data out Q_signal from a bank being read that includes a pattern that assumes toggling data on each time period, and a final data out Q signal from the memory which is run at approximately 4 times a normal SRAM as data is coming in a burst fashion from two banks which are ping ponged (i.e., toggled back and forth). The final data out Q signal will capture data from Q_BA and Q_BB and deliver twice the bandwidth to the customer/tester. Lastly, the timing diagram 800 includes a select signal QSEL to select between the Q_BA and Q_BB to output the final data out Q signal.
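The ping-pong selection between Q_BA and Q_BB can be pictured with a short Python sketch. It is a behavioral illustration only: the function name is hypothetical, and it assumes that both banks deliver the same number of values and that bank A is selected first.

```python
def merge_bank_outputs(q_ba, q_bb):
    """Build the final Q stream by alternating between the two bank streams.

    Assumption: QSEL toggles every external cycle, starting with bank A.
    """
    q = []
    for a, b in zip(q_ba, q_bb):
        q.append(a)  # QSEL selects Q_BA
        q.append(b)  # QSEL selects Q_BB
    return q


if __name__ == "__main__":
    print(merge_bank_outputs(["a0", "a1"], ["b0", "b1"]))
```

In this model the merged stream carries one value per external cycle while each bank supplies a value only every second cycle, which matches the doubling of bandwidth described above.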
In the timing diagram 800 of
Also, in the timing diagram 800, true random access in the SRAM is exchanged for higher throughput or read bandwidth (i.e., there is a tradeoff) by sequentially reading column 0, column 1, column 2, and then with a domino style sense, column 3. The COL signal will sequentially count from 0 when reading column 0, to 1 when reading column 1, to 2 when reading column 2, and the BSEL signal transitions to “1”, column 3 regardless of the COL signal value. Column 3 is sensed with a domino style sense (i.e., inverter directly connected to a bit line) instead of being sensed with a sense amplifier. Sensing column 3 with a domino style sense enables higher performance by allowing a bit line restore (i.e., BLR signal goes to “1”) after reading column 2. For example,
If the number of memory banks is 1 (i.e., YES), in step S1012, the word line WL is turned on. In step S1013, the first N−2 columns of N columns attached to a sensing structure are read, without precharging the bit lines. In step S1014, the (N−1)th column of N columns in parallel with the (N−2)th column is read through a tri-buffer and a latch. In step S1015, the word line WL is turned off and the bit lines BLs are restored. In step S1016, the multiplexer (e.g., 2:1 MUX) is switched such that the data stored in LATCH2 (i.e., latch in tri-buffer path) is passed. In step S1017, at a same CLK cycle, the word line WL (address) is turned ON for the next set of columns, and the read operation is continued.
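The single-bank flow of steps S1012 to S1017 can be sketched as a small behavioral model in Python. This is not cycle-accurate and not the disclosed circuit: the function name, the representation of each word line as a list of bits, and the use of 0-based column indices (columns 0 to N−2 on the sense amplifier path, column N−1 on the tri-buffer path) are assumptions made for illustration.

```python
def burst_read(rows):
    """Read every word line in `rows` (each a list of N bits) in burst mode.

    Returns (out, restore_cycles): the bits in output order and the cycle
    index at which each bit line restore takes place.
    """
    out, restore_cycles = [], []
    cycle = 0
    for bits in rows:                      # S1012 / S1017: word line turned on
        n = len(bits)
        latch2 = None
        for col in range(n - 1):           # S1013: columns 0..N-2 via the sense amplifier
            latch1 = bits[col]
            if col == n - 2:               # S1014: last column captured in parallel
                latch2 = bits[n - 1]       # tri-buffer path into LATCH2
            out.append(latch1)
            cycle += 1
        restore_cycles.append(cycle)       # S1015: word line off, bit lines restored
        out.append(latch2)                 # S1016: multiplexer passes LATCH2
        cycle += 1
    return out, restore_cycles


if __name__ == "__main__":
    data, restores = burst_read([[1, 0, 1, 1], [0, 0, 1, 0]])
    print(data, restores)
```

In this model a bit is delivered on every cycle, and each restore falls on the cycle in which the multiplexer outputs the bit held in LATCH2, so no separate restore slot appears in the output stream.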
If the number of memory banks is not 1 (i.e., NO), in step S1003, the read operations are performed for two or more banks in a time-interleaved burst fashion. The sequence of events in BANK0 (or BANKA) is toggled off of CLK0 and the sequence of events in BANK1 (or BANKB) is toggled off of CLK1. In step S1004, the word line WL corresponding to BANK0 (or BANKA) is turned on. The contents of the cell are read in a burst manner while the word line WL in BANK0 (or BANKA) is kept on. In step S1005, on the rising edge of CLK1, the word line WL for the BANK1 (or BANKB) is turned on and the contents of the cell in BANK1 (or BANKB) are read in a burst manner while the word line in BANK1 (or BANKB) is kept on. In step S1006, the first N−2 columns of N columns attached to a sensing structure are read, without precharging the bit lines in BANK0 (or BANKA) and BANK1 (or BANKB).
In step S1007, in BANK0 (or BANKA), the (N−1)th column of N columns is read in parallel with the (N−2)th column through a tri-buffer and a latch. Further, in BANK1 (or BANKB), similar operations are performed as BANK0 (or BANKA) on the rising edge of the phase shifted clock. In step S1008, in BANK0 (or BANKA), the word line WL is turned off on the next clock CLK0 and the bit lines are precharged. In BANK1 (or BANKB), the word line WL is turned off on the corresponding rising edge of CLK1 and the bit lines are precharged. In step S1009, in BANK0 (or BANKA), the multiplexer (i.e., the 2:1 MUX) is switched such that the data stored in LATCH2 (i.e., latch in tri-buffer path) is passed. At a same time, the word line WL (address) is turned on for the next set of columns and the read operation is continued. In BANK1 (or BANKB), the multiplexer (i.e., the 2:1 MUX) performs switching on the phase shifted version of the clock. In step S1010, at the same clock cycle CLK0 in BANK0 (or BANKA) and CLK1 in BANK1 (or BANKB), the word line WL (address) is turned on for the next set of columns, and the read operations continue for the next set of columns.
In embodiments, a method may include sensing sequential read operations of N−1 consecutive read operations through a sense amplifier 625 and sensing an Nth read operation through a skewed tribuffer circuit 740 to enable early bit line restore and activation of a next word line in a cycle earlier than a next cycle. The sequential read operations of N−1 consecutive read operations are performed by reading at least two banks of memory in a sequential interleaved read mode operation.
In embodiments, the sequential interleaved read mode operation includes performing a read operation in a first bank of the at least two banks of memory while performing a sense amplifier restore operation in a second bank of the at least two banks of memory. The first bank has a clock which is phase shifted from a clock of the second bank, and the clock of the first bank and the clock of the second bank run at approximately half an external clock frequency. The method may further include preventing a read operation to one of the first bank and the second bank in back to back clock cycles.
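The relationship between the external clock, the two bank clocks, and the back to back restriction can be illustrated with the following Python sketch. The function names and the even/odd assignment of external cycles to banks are assumptions for illustration; the source only states that the bank clocks are phase shifted and run at approximately half the external clock frequency.

```python
def bank_for_cycle(cycle):
    """Assumed mapping: bank 0 is clocked on even external cycles, bank 1 on odd."""
    return cycle % 2


def has_back_to_back_access(banks):
    """Return True if two consecutive external cycles read the same bank.

    `banks` lists the bank read in each external cycle; None marks an idle cycle.
    """
    return any(a is not None and a == b for a, b in zip(banks, banks[1:]))


if __name__ == "__main__":
    schedule = [bank_for_cycle(c) for c in range(6)]
    print(schedule, has_back_to_back_access(schedule))
```

A controller following this model would reject or delay any request for which `has_back_to_back_access` would become true, since each bank sees only every second external clock edge.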
The circuit and the method for using a phase shifted burst mode in a static random access memory to save power and improve performance associated with address switching and decoding of the present disclosure can be manufactured in a number of ways using a number of different tools. In general, though, the methodologies and tools are used to form structures with dimensions in the micrometer and nanometer scale. The methodologies, i.e., technologies, employed to manufacture the circuit and the method for using a phase shifted burst mode in a static random access memory to save power and improve performance associated with address switching and decoding of the present disclosure have been adopted from integrated circuit (IC) technology. For example, the structures are built on wafers and are realized in films of material patterned by photolithographic processes on the top of a wafer. In particular, the fabrication of the circuit and the method for using a phase shifted burst mode in a static random access memory to save power and improve performance associated with address switching and decoding uses three basic building blocks: (i) deposition of thin films of material on a substrate, (ii) applying a patterned mask on top of the films by photolithographic imaging, and (iii) etching the films selectively to the mask.
The method(s) as described above is used in the fabrication of integrated circuit chips. The resulting integrated circuit chips can be distributed by the fabricator in raw wafer form (that is, as a single wafer that has multiple unpackaged chips), as a bare die, or in a packaged form. In the latter case the chip is mounted in a single chip package (such as a plastic carrier, with leads that are affixed to a motherboard or other higher level carrier) or in a multichip package (such as a ceramic carrier that has either or both surface interconnections or buried interconnections). In any case the chip is then integrated with other chips, discrete circuit elements, and/or other signal processing devices as part of either (a) an intermediate product, such as a motherboard, or (b) an end product. The end product can be any product that includes integrated circuit chips, ranging from toys and other low-end applications to advanced computer products having a display, a keyboard or other input device, and a central processor.
The descriptions of the various embodiments of the present disclosure have been presented for purposes of illustration, but are not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, the practical application or technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.
Arsovski, Igor, Hunt-Schroeder, Eric D., Patil, Akhilesh
Executed on | Assignor | Assignee | Conveyance | Frame | Reel | Doc |
Jan 23 2019 | ARSOVSKI, IGOR | GLOBALFOUNDRIES Inc | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 048126 | /0135 | |
Jan 23 2019 | PATIL, AKHILESH | GLOBALFOUNDRIES Inc | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 048126 | /0135 | |
Jan 23 2019 | HUNT-SCHROEDER, ERIC D | GLOBALFOUNDRIES Inc | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 048126 | /0135 | |
Jan 24 2019 | Marvell Asia Pte, Ltd. | (assignment on the face of the patent) | / | |||
Nov 05 2019 | GLOBALFOUNDRIES, INC | MARVELL INTERNATIONAL LTD | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 051061 | /0681 | |
Dec 31 2019 | MARVELL INTERNATIONAL LTD | CAVIUM INTERNATIONAL | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 053987 | /0949 | |
Dec 31 2019 | CAVIUM INTERNATIONAL | MARVELL ASIA PTE, LTD | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 053988 | /0195 |