The present invention relates to the design of highly reliable high performance microprocessors, and more specifically to designs that use cache memory protection schemes such as, for example, a 1-hot plus valid bit scheme and a 2-hot vector cache scheme. These protection schemes protect the 1-hot vectors used in the tag array in the cache and are designed to provide hardware savings, operate at higher speeds and be simple to implement. In accordance with an embodiment of the present invention, a tag array memory circuit includes a plurality of memory bit circuits coupled together to form an n-bit memory cell, and a valid bit circuit coupled to the n-bit memory cell, the valid bit circuit being configured to be accessed simultaneously with the plurality of memory bit circuits.
1. A tag array memory circuit, comprising:
a plurality of memory bit circuits coupled together to form an n-bit memory cell; and a valid bit circuit coupled to the n-bit memory cell, the valid bit circuit to be accessed simultaneously with the plurality of memory bit circuits, and the valid bit circuit including: a valid bit memory bit circuit coupled to the n-bit memory cell, the valid bit memory bit circuit to include an external enable line; a word line coupled to the valid bit memory bit circuit; and a plurality of transistor circuits coupled to the word line, each of the plurality of transistor circuits coupled to a separate one of the plurality of memory bit circuits, and said word line to read one of said plurality of memory bit circuits.
19. An apparatus comprising:
a cache memory coupled to the processor, the cache memory to store 1-hot vectors and to associate a separate valid bit with each 1-hot vector, and said cache memory comprising: a 1-hot tag array to store a plurality of 1-hot vectors and to store a valid bit for each of the plurality of 1-hot vectors, said 1-hot tag array comprising: a plurality of tag array memory circuits, each of said tag array memory circuits comprising: a plurality of memory bit circuits coupled together to form an n-bit memory cell; and a valid bit circuit coupled to the n-bit memory cell, the valid bit circuit to be accessed simultaneously with the plurality of memory bit circuits, and the valid bit circuit including: a valid bit memory bit circuit coupled to the n-bit memory cell, the valid bit memory bit circuit to include an external enable line; a word line coupled to the valid bit memory bit circuit; and a plurality of transistor circuits coupled to the word line, each of the plurality of transistor circuits coupled to a separate one of the plurality of memory bit circuits, and said word line to read one of said plurality of memory bit circuits.
15. A computer system comprising:
a processor; a translation look-aside buffer (TLB) coupled to the processor; and a cache memory coupled to the processor, the cache memory to store 1-hot vectors and to associate a separate valid bit with each 1-hot vector, and said cache memory comprising: a 1-hot tag array to store a plurality of 1-hot vectors and to store a valid bit for each of the plurality of 1-hot vectors, said 1-hot tag array comprising: a plurality of tag array memory circuits, each of said tag array memory circuits comprising: a plurality of memory bit circuits coupled together to form an n-bit memory cell; and a valid bit circuit coupled to the n-bit memory cell, the valid bit circuit to be accessed simultaneously with the plurality of memory bit circuits, and the valid bit circuit including: a valid bit memory bit circuit coupled to the n-bit memory cell, the valid bit memory bit circuit to include an external enable line; a word line coupled to the valid bit memory bit circuit; and a plurality of transistor circuits coupled to the word line, each of the plurality of transistor circuits coupled to a separate one of the plurality of memory bit circuits, and said word line to read one of said plurality of memory bit circuits.
2. The tag array memory circuit of
3. The tag array memory circuit of
5. The tag array memory circuit of
6. The tag array memory circuit of
7. The tag array memory circuit of
8. The tag array memory circuit of
associate a valid bit with a 1-hot vector; store the 1-hot vector and the valid bit; output the 1-hot vector, if the valid bit is set to valid; and invalidate the 1-hot vector, if the valid bit is set to invalid.
9. The tag array memory circuit of
receive a request to use the 1-hot vector; read out the 1-hot vector and the associated valid bit; determine that the valid bit is set to valid; and output the 1-hot vector.
10. The tag array memory circuit of
receive a request to use the 1-hot vector; read out the 1-hot vector and the associated valid bit; determine that the valid bit is set to invalid; and clear the 1-hot vector and the associated valid bit.
11. The tag array memory circuit of
read out the 1-hot vector; and set the 1-hot vector and the associated valid bit equal to zero.
12. The tag array memory circuit of
13. The tag array memory circuit of
force a plurality of bit locations to contain the value 1; and write an inverted 1-hot vector and an inverted valid bit to the plurality of bit locations.
14. The tag array memory circuit of
16. The computer system of
a TLB virtual address; and a TLB data array coupled to the TLB virtual address array.
17. The computer system of
a plurality of comparators coupled to the 1-hot tag; a first multiplexer coupled to the plurality of comparators; and a cache data array coupled to the first multiplexer.
18. The computer system of
20. The apparatus of
a TLB virtual address; and a TLB data array coupled to the TLB virtual address array.
21. The apparatus of
a plurality of comparators coupled to the 1-hot tag; a first multiplexer coupled to the plurality of comparators; and a cache data array coupled to the first multiplexer.
22. The apparatus of
The present invention relates to the design of highly reliable high performance microprocessors, and more specifically to designs using a 1-hot vector tag plus valid bit protection scheme and/or a 2-hot vector tag protection scheme in high speed memories, such as caches.
Modern high-performance processors, for example, Intel® Architecture 32-bit (IA-32) processors, include on-chip memory buffers, called caches, to speed up memory accesses. IA-32 processors are manufactured by Intel Corporation of Santa Clara, Calif. These caches generally consist of a tag array and a data array. The data array generally stores the data that is needed during the execution of the program. The tag array generally stores either a physical address or a virtual address of the data as tags. For reliability reasons, these stored tags are often protected for error detection by associating a separate parity bit with each tag.
In even higher performance processors, for example, Intel® Architecture 64-bit (IA-64) processors, each tag is generally stored as a 1-hot vector in a 1-hot cache, which is derived during a Translation Look-aside Buffer (TLB) lookup for an address translation. IA-64 processors are manufactured by Intel Corporation of Santa Clara, Calif. A "1-hot vector" is an n-bit, binary address in which a single bit is set to specify a matching address translation entry in the TLB. The advantage of using a 1-hot vector as a tag is that it improves the operating frequency of a cache.
Unfortunately, the protection of these 1-hot vectors presents a great challenge since the conventional parity bit protection scheme used to protect the standard tag in the conventional cache does not work well for the 1-hot vectors. For example, when an entry in the TLB is replaced, all of the tags with the corresponding 1-hot vectors in the 1-hot cache must be invalidated. This invalidation can be performed using a blind invalidate operation, in which all 1-hot vectors in the cache with the "1" bit matching the selected TLB entry will be invalidated. However, since the blind invalidate operation only overwrites the 1-hot vector and not the associated parity bit, the associated parity bit is no longer valid for the new value in the 1-hot vector.
In addition, in the 1-hot cache, since all of the cleared bits are now zero, if any of the bits are changed by a soft error to a 1, then, the cleared entry becomes a 1-hot vector, which is indistinguishable from a real, valid 1-hot vector that also may be stored in the 1-hot cache. A "soft" error is an error that occurs when a bit value that is set to a particular value in the processor is changed to an opposite value by, for example, an alpha particle bombardment and/or gamma-ray irradiation of the bit.
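The two failure modes described above can be sketched in a few lines of Python. This is an illustrative model only; the parity function, bit width, and bit positions are assumptions for the example, not values taken from the patent:

```python
def parity(bits: int, width: int) -> int:
    """Even-parity bit over a width-bit value (illustrative helper)."""
    return bin(bits & ((1 << width) - 1)).count("1") % 2

WIDTH = 8
tag = 0b00001000              # a valid 8-bit 1-hot vector
stored_parity = parity(tag, WIDTH)

# Blind invalidate overwrites the 1-hot vector but NOT the stored parity,
# so the stored parity bit is stale for the new, all-zero value.
invalidated = 0b00000000
assert parity(invalidated, WIDTH) != stored_parity

# Soft error: one cleared bit flips to 1; the entry is again a well-formed
# 1-hot vector, indistinguishable from a real, valid tag.
soft_error = invalidated | (1 << 2)
assert bin(soft_error).count("1") == 1
```

The second assertion is the key point: no per-vector check can tell the corrupted entry apart from a live tag, which motivates the valid-bit and 2-hot schemes that follow.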
A straightforward protection scheme for the 1-hot tag cache that does work for the 1-hot vectors involves having a second tag array to maintain a duplicate copy of the 1-hot vectors in the tag array. However, although this duplicate tag array scheme works, it requires a larger chip area and has a significant timing impact.
In accordance with embodiments of the present invention, circuits and methods to protect the 1-hot vectors used in the tag cache are described herein. As a way of illustration only, two embodiments of the present invention are described: a 1-hot plus valid bit scheme and a 2-hot vector scheme. However, these two embodiments should not be taken to limit any alternative embodiments, which fall within the spirit and scope of the appended claims.
In general, a cache that stores 1-hot vectors as tags is referred to as a 1-hot tag cache and a cache that stores 2-hot vectors as tags is referred to as a 2-hot tag cache. A 1-hot vector is an n-bit string that contains a single "1" and n-1 "0's", for example, "00001000" is an eight-bit 1-hot vector. Similarly, a 2-hot vector is an n-bit string that contains two consecutive "1's" and n-2 "0's", for example, "00011000" is an eight-bit 2-hot vector. The right most "1" bit in a 2-hot vector is called a primary bit and a left neighbor "1" bit of the primary bit is called an aux (auxiliary) bit.
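The definitions above translate directly into bit tests. As a hedged sketch (the function names are illustrative, not from the patent), a vector is 1-hot when exactly one bit is set, and 2-hot when the only set bits are the primary bit and its left-neighbor aux bit:

```python
def is_one_hot(v: int) -> bool:
    # Exactly one bit set: clearing the lowest set bit leaves zero.
    return v != 0 and (v & (v - 1)) == 0

def is_two_hot(v: int) -> bool:
    # Exactly two adjacent bits set: the primary (rightmost "1") bit
    # plus its left neighbor, and nothing else.
    if v == 0:
        return False
    primary = v & -v                    # isolate the rightmost set bit
    return v == (primary | (primary << 1))

assert is_one_hot(0b00001000)           # the 1-hot example from the text
assert not is_one_hot(0b00011000)
assert is_two_hot(0b00011000)           # the 2-hot example from the text
assert not is_two_hot(0b00101000)       # two bits, but not adjacent
assert not is_two_hot(0b00000100)       # a single flipped bit is rejected
```

Note that a single soft-error bit flip in a cleared (all-zero) entry fails the `is_two_hot` test, which is the detection property the 2-hot scheme relies on.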
In
Operation of the 1-hot tag array. In
In
In
In accordance with an embodiment of the present invention, a 1-hot plus valid bit scheme involves adding one bit to each 1-hot vector to serve as a valid identification (Vid) bit. While the 1-hot plus valid bit scheme is conceptually simple, a multi-cycle read-modify operation can be used to update the valid bit to avoid the timing impact. In addition, in accordance with an embodiment of the present invention, in the 1-hot plus valid bit scheme an additional word line is used to read out the content of the 1-hot column. Therefore, in accordance with an embodiment of the present invention, in this scheme, a single bit is appended at the end of each 1-hot vector to serve as the Vid bit.
In accordance with embodiments of the present invention, on a read operation in the 1-hot plus valid bit scheme, the Vid bit is accessed at the same time as the 1-hot vector and, if the Vid bit is set, the 1-hot vector is considered valid; otherwise, the 1-hot vector is considered invalid by external processor logic (not shown). The Vid bit is cleared on a blind invalidate just as for the 1-hot tag array. The detailed operation of the 1-hot plus Vid bit is described below. It should be noted that the 1-hot plus Vid bit scheme is somewhat slower than the 1-hot tag memory cell due to the added read port via word line wl3 being slower than wl0 and wl1.
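The read behavior just described can be modeled behaviorally in a short sketch. The class and method names below are illustrative assumptions; the model captures only the logical protocol (Vid checked on read, cleared on blind invalidate, lazy clearing of an invalid vector), not the circuit timing:

```python
class OneHotTagEntry:
    """Behavioral model of one 1-hot-plus-Vid tag entry (illustrative)."""

    def __init__(self) -> None:
        self.vector = 0     # the stored 1-hot vector
        self.vid = 0        # the appended valid identification (Vid) bit

    def write(self, one_hot: int) -> None:
        # Writing a new tag stores the 1-hot vector and sets the Vid bit.
        self.vector = one_hot
        self.vid = 1

    def blind_invalidate(self) -> None:
        # A blind invalidate only clears the Vid bit; the stale vector is
        # cleaned up later by the multi-cycle read-modify operation.
        self.vid = 0

    def read(self):
        # The Vid bit is accessed simultaneously with the vector.
        if self.vid:
            return self.vector          # valid: output the 1-hot vector
        self.vector = 0                 # invalid: clear vector and Vid bit
        self.vid = 0
        return None
```

A usage example: after `write(0b00001000)` a read returns the vector; after `blind_invalidate()` the next read returns `None` and zeroes the stale vector, matching the invalidate-on-read flow in the claims.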
Operation of the 1-hot plus valid bit. In
In accordance with an embodiment of the present invention, in
In accordance with an embodiment of the present invention, in
2-hot vector protection scheme. In accordance with an embodiment of the present invention, in the 2-hot vector scheme, the 1-hot vector is converted to a 2-hot vector. This is accomplished by local logic prior to the cache tag during the write operation of the 1-hot vector into the tag. During the read out, the 2-hot vector is automatically converted back to a 1-hot vector by local logic subsequent to the cache tag. In this way, the accesses of the cache work identically to the 1-hot tag cache described above.
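The two conversions are simple shift-and-mask operations. The sketch below is an assumption about the logic, consistent with the aux bit being the left neighbor of the primary bit; it does not handle a primary bit in the leftmost position, which the patent does not specify and which would require widening the vector or other boundary handling:

```python
def to_two_hot(one_hot: int, width: int = 8) -> int:
    # Write path: set the left neighbor of the primary bit as the aux bit.
    mask = (1 << width) - 1
    return (one_hot | (one_hot << 1)) & mask

def to_one_hot(two_hot: int) -> int:
    # Read path: keep only the primary (rightmost) bit, dropping the aux bit.
    return two_hot & -two_hot

assert to_two_hot(0b00001000) == 0b00011000   # the examples from the text
assert to_one_hot(0b00011000) == 0b00001000
```

Because both conversions are purely local combinational logic on the write and read paths, the rest of the cache sees ordinary 1-hot accesses, as the paragraph above notes.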
In accordance with an embodiment of the present invention, while the 2-hot vector scheme is more complicated, it does not require the multi-cycle operation of the 1-hot plus valid bit scheme. In addition, in accordance with an embodiment of the present invention, the 2-hot scheme does not require additional bit lines or word lines.
Operation of the 2-hot tag cache. In
In accordance with an embodiment of the present invention, in
In accordance with an embodiment of the present invention, in
While the aux bit has been described as located in the bit just to the left of the primary bit, in an alternative embodiment of the present invention, the aux bit can be located in any bit position within the 2-hot vector. However, embodiments in which the aux bit is located closer to the primary bit, in general, perform better than those embodiments in which the aux bit is located farther away from the primary bit.
In accordance with an embodiment of the present invention, in
In accordance with an embodiment of the present invention, in
In accordance with an embodiment of the present invention, in
In accordance with an embodiment of the present invention, in
While the embodiments described above relate to the 1-hot plus valid bit and 2-hot vector embodiments, they are not intended to limit the scope or coverage of the present invention. In fact, for example, the 2-hot scheme described above can be extended to a 3-hot vector to protect against errors in 2 consecutive bits, or to a 4-hot or higher vector to protect against errors in 3 and higher consecutive bits, respectively. Similarly, bit patterns other than the 2-hot scheme may be used depending on the type of errors, such as, for example, double bit errors, that a designer is trying to protect against.
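The k-hot generalization mentioned above follows the same shift-and-OR pattern as the 2-hot conversion. This is a hedged sketch of one way such logic could work (the function names and boundary behavior are assumptions; as with the 2-hot case, a primary bit near the leftmost position is not handled):

```python
def to_k_hot(one_hot: int, k: int, width: int) -> int:
    # Extend the primary bit into a run of k consecutive "1" bits.
    v = one_hot
    for _ in range(k - 1):
        v |= v << 1
    return v & ((1 << width) - 1)

def from_k_hot(k_hot: int) -> int:
    # Recover the primary (rightmost) bit of the run.
    return k_hot & -k_hot

assert to_k_hot(0b00001000, 3, 8) == 0b00111000   # 3-hot: run of three "1"s
assert from_k_hot(0b00111000) == 0b00001000
```

A run of k "1" bits survives up to k-1 consecutive bit errors in the sense that no such corruption of an all-zero entry can recreate a valid k-hot pattern from fewer than k flipped bits.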
In addition, the 1-hot plus valid bit scheme is, generally, good for microprocessor designs that are not wire congested in the physical layout and, thus, have available area for the additional read line. Likewise, the 2-hot scheme is good for microprocessor designs that are, generally, wire congested in the physical layout and, thus, do not have much available area for the additional hardware that is associated with the 1-hot plus valid bit scheme.
The 2-hot scheme described above minimizes global routing at the expense of local interconnect and transistors. Other 2-hot schemes can use a multiple clock blind invalidation scheme by using a different signal for invalidating the aux bit.
Both the 1-hot plus valid bit and 2-hot vector protection schemes can be implemented in high performance microprocessors and high performance multi-processors on a single chip.
In accordance with an embodiment of the present invention, a computer system includes a processor and a cache memory coupled to the processor, where the cache memory is configured to use 1-hot vectors and to associate a separate valid bit with each 1-hot vector.
In accordance with an embodiment of the present invention, a multi-processor computer system includes a first processor, a second processor and a cache memory coupled to the first and second processors, where the cache memory is configured to use 1-hot vectors and to associate a separate valid bit with each 1-hot vector.
In accordance with an embodiment of the present invention, a multi-processor computer system includes a first processor, a second processor, a first cache memory coupled to the first processor and a second cache memory coupled to the second processor, where the first and second cache memories are configured to use 1-hot vectors and to associate a separate valid bit with each 1-hot vector.
In accordance with an embodiment of the present invention, a computer system includes a processor and a cache memory coupled to the processor, where the cache memory is configured to use 2-hot vectors.
In accordance with an embodiment of the present invention, a multi-processor computer system includes a first processor, a second processor and a cache memory coupled to the first and second processors, where the cache memory is configured to use 2-hot vectors.
In accordance with an embodiment of the present invention, a multi-processor computer system includes a first processor, a second processor, a first cache memory coupled to the first processor and a second cache memory coupled to the second processor, where the first and second cache memories are configured to use 2-hot vectors.
In accordance with an embodiment of the present invention, a tag array memory circuit includes a plurality of memory bit circuits coupled together to form an n-bit memory cell, and a valid bit circuit coupled to each of the plurality of memory bit circuits in the n-bit memory cell, the valid bit circuit being configured to be accessed simultaneously with the plurality of memory bit circuits.
In accordance with an embodiment of the present invention, a tag array memory circuit includes a plurality of memory bit circuits coupled together to form an n-bit memory cell, and a blind invalidate circuit coupled to the n-bit memory cell, the blind invalidate circuit including a primary clear bit line, a primary clear circuit coupled to the primary clear bit line and configured to receive a bit value of a left-adjacent memory bit circuit, and an auxiliary clear circuit coupled to a primary clear circuit and to the primary clear circuit of a right-adjacent memory bit circuit, and configured to receive a bit value of the right-adjacent memory bit circuit.
In accordance with an embodiment of the present invention, a method for protecting 1-hot vectors includes associating a valid bit with each 1-hot vector, storing the 1-hot vector and the valid bit, outputting the 1-hot vector if the valid bit is set to valid, and invalidating the 1-hot vector if the valid bit is set to invalid.
In accordance with an embodiment of the present invention, a machine-readable medium has stored thereon a plurality of executable instructions for defining a series of steps to protect 1-hot vectors, the plurality of executable instructions including instructions to associate a valid bit with each 1-hot vector, store the 1-hot vector and the valid bit, output the 1-hot vector if the valid bit is set to valid, and invalidate the 1-hot vector if the valid bit is set to invalid.
In accordance with an embodiment of the present invention, a tag array memory includes an input conversion circuit, the input conversion circuit configured to receive a 1-hot vector and to convert the 1-hot vector to a 2-hot vector, a memory array coupled to the input conversion circuit, the memory array configured to store the 2-hot vector, and an output conversion circuit coupled to the memory array, the output conversion circuit being configured to receive the 2-hot vector and to convert the 2-hot vector back to the 1-hot vector.
In accordance with an embodiment of the present invention, a computer system includes a processor, a translation look-aside buffer (TLB), and a cache memory coupled to the processor, the cache memory being configured to store 1-hot vectors and to associate a separate valid bit with each 1-hot vector.
It should, of course, be understood that while the present invention has been described mainly in terms of microprocessor- and multi-processor-based personal computer systems, those skilled in the art will recognize that the principles of the invention may be used advantageously with alternative embodiments involving other integrated processor chips and computer systems. Accordingly, all such implementations which fall within the spirit and the broad scope of the appended claims will be embraced by the principles of the present invention.
Crawford, John, Grochowski, Edward, Kosaraju, Chakravarthy, Quach, Nhon, Mathews, Greg S.
Patent | Priority | Assignee | Title |
6826645, | Dec 13 2000 | Intel Corporation | Apparatus and a method to provide higher bandwidth or processing power on a bus |
6907490, | Dec 13 2000 | Intel Corporation | Method and an apparatus for a re-configurable processor |
7020752, | Feb 07 2003 | Oracle America, Inc | Apparatus and method for snoop access in a dual access, banked and pipelined data cache memory unit |
7133957, | Dec 13 2000 | Intel Corporation | Method and an apparatus for a re-configurable processor |
7185170, | Aug 27 2004 | SHENZHEN XINGUODU TECHNOLOGY CO , LTD | Data processing system having translation lookaside buffer valid bits with lock and method therefor |
Patent | Priority | Assignee | Title |
4910668, | Sep 25 1986 | Matsushita Electric Industrial Co., Ltd. | Address conversion apparatus |
6166939, | Jul 12 1999 | AVAGO TECHNOLOGIES GENERAL IP SINGAPORE PTE LTD | Method and apparatus for selective match line pre-charging in a content addressable memory |
20020087825, |
Executed on | Assignor | Assignee | Conveyance | Frame | Reel | Doc |
Dec 29 2000 | Intel Corporation | (assignment on the face of the patent) | / | |||
Mar 28 2001 | QUACH, NHON, | Intel Corporation | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 011684 | /0617 | |
Mar 28 2001 | MATHEWS, GREG S | Intel Corporation | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 011684 | /0617 | |
Mar 28 2001 | KOSARAJU, CHAKRAVARTHY | Intel Corporation | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 011684 | /0617 | |
Mar 29 2001 | CRAWFORD, JOHN | Intel Corporation | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 011684 | /0617 | |
Mar 30 2001 | GROCHOWSKI, EDWARD | Intel Corporation | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 011684 | /0617 |