A cache line replacement protocol for selecting a cache line for replacement based at least in part on the inter-cache traffic generated as a result of the cache line being replaced.
1. A method for replacing a line in a configuration with multiple levels of cache memories, each cache memory having a plurality of cache ways, each cache way having a plurality of lines, comprising:
assigning a state for at least one cache line in at least two cache ways;
assigning a relative cost function for potential inter-cache traffic caused by replacing the cache line, wherein the inter-cache traffic comprises traffic between one level of cache in a processor and another level of cache in the processor, wherein data is to be stored in a cache line based at least in part on the relative cost function assignment; and
selecting one of two possible victim cache lines for replacement based on the cost function, wherein the two possible victim cache lines are from two different cache levels.
3. The method of
4. A method for replacing a line in a configuration with multiple levels of cache memories, each cache memory having a plurality of cache ways, each cache way having a plurality of lines, comprising:
assigning a state for at least one cache line in at least two cache ways;
assigning a relative cost function for inter-cache traffic for the state of the cache line associated with replacing the cache line; and
selecting the cache line for replacement, from two possible victim cache lines, based at least in part on the cost function, wherein the inter-cache traffic comprises traffic between one level of cache in a processor and another level of cache in the processor, wherein the two possible victim cache lines are from two different cache levels.
6. The method of
7. A method for a cache line replacement protocol for replacing a line in a configuration with multiple levels of cache memories, each cache memory having a plurality of cache ways, each cache way having a plurality of lines, comprising:
assigning a state for at least one cache line in at least two cache ways;
assigning a relative cost function for inter-cache traffic for the state of the cache line associated with replacing the cache line;
identifying two possible cache lines for replacement, wherein the two possible cache lines are from two different cache levels; and
selecting one of the two possible cache lines for replacement based at least in part on the cost function, wherein the inter-cache traffic comprises traffic between one level of cache in a processor and another level of cache in the processor.
9. The method of
10. The method of
11. A system comprising:
at least one processor;
a cache, coupled to the processor, with a plurality of cache ways, each of the plurality of cache ways having a plurality of lines; wherein a state is assigned for at least one cache line in at least two cache ways, and
a relative cost function for inter-cache traffic is assigned for the state of the cache line associated with replacing the cache line; and
the cache line is selected for replacement based at least in part on the cost function from two possible victim cache lines, wherein the two possible victim cache lines are from two different cache levels, wherein the inter-cache traffic comprises traffic between one level of cache and another level of cache.
14. The system of
15. A method for replacing a line in a multi-level cache memory in a multi-core processor with a plurality of cache ways, each cache way having a plurality of lines, comprising:
assigning a state for at least one cache line in at least two cache ways;
assigning a core bit for each core that shares a last level cache in the multi-level cache memory;
assigning a relative cost function for inter-cache traffic for the state of the cache line associated with replacing the cache line, wherein data is to be stored in a cache line based at least in part on the relative cost function assignment, wherein the inter-cache traffic comprises traffic between one level of cache in a core and another level of cache in the core; and
selecting one of two possible victim cache lines for replacement based on the cost function, wherein the two possible victim cache lines are from two different cache levels.
17. The method of
18. A method for replacing a line in a configuration with multiple levels of cache memories, each cache memory having a plurality of cache ways, each cache way having a plurality of lines, comprising:
assigning a state for at least one cache line in at least two cache ways;
assigning a core bit for each core that shares a last level cache in the multi-level cache memory;
assigning a relative cost function for inter-cache traffic for the state of the cache line associated with replacing the cache line; and
selecting the cache line for replacement, from two possible victim cache lines, based at least in part on the cost function, wherein the inter-cache traffic comprises traffic between one level of cache in a core and another level of cache in the core, wherein the two possible victim cache lines are from two different cache levels.
20. The method of
The present disclosure relates to cache memory, and in particular to a cache line replacement scheme.
As is well known, a cache stores information for a computer or computing system in order to decrease data retrieval times for a processor. Some examples of computing systems are a personal digital assistant, an internet tablet, and a cellular phone. The cache stores specific subsets of information, such as instructions, addresses, and data, in high-speed memory. When a processor requests a piece of information, the system first checks whether the information is stored within the cache. If so, the processor can retrieve the information much faster than if it were stored in other computer-readable media, such as random access memory, a hard drive, a compact disc read-only memory (CD-ROM), or a floppy disk.
Cache memories have a range of architectures with respect to how address locations map to cache locations. For example, a cache memory may be direct mapped or fully associative; a set associative cache is a compromise between the two. In a direct mapped cache, each address maps to exactly one cache location. Conversely, in a fully associative cache, a line may be placed in any location, so a cache with N blocks behaves as a single set with N ways. Finally, a set associative cache, commonly referred to as N-way set associative, divides the cache into N ways; an address selects a set, and the N ways of that set are searched associatively for a matching tag.
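As a minimal sketch of the set associative organization described above, the following C fragment shows how a lookup can decompose an address into tag, set, and offset and then search every way of the selected set. The geometry (64-byte lines, 4 ways, 256 sets) and all identifiers are illustrative assumptions, not details taken from the patent.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative geometry: 64-byte lines, 4 ways, 256 sets. */
#define LINE_SIZE 64
#define NUM_WAYS  4
#define NUM_SETS  256

struct line {
    uint64_t tag;
    bool     valid;
};

struct cache {
    struct line sets[NUM_SETS][NUM_WAYS];
};

/* Offset bits select the byte within a line, set bits pick the set,
 * and the remaining high-order bits form the tag. */
static bool lookup(const struct cache *c, uint64_t addr)
{
    uint64_t set = (addr / LINE_SIZE) % NUM_SETS;
    uint64_t tag = addr / (LINE_SIZE * (uint64_t)NUM_SETS);

    /* Every way of the selected set is searched associatively for the tag. */
    for (int way = 0; way < NUM_WAYS; way++) {
        const struct line *l = &c->sets[set][way];
        if (l->valid && l->tag == tag)
            return true;   /* hit */
    }
    return false;          /* miss: a victim in this set must be replaced */
}
```

On a miss, a victim within the selected set must be chosen, which is where the replacement techniques described next come in.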
Efficient cache operation utilizes cache management techniques for replacing cache locations in the event of a cache miss. On a typical cache miss, the address and data fetched from the system or main memory are stored in cache memory, and the cache must determine which cache location is to be replaced by the new address and data. One technique for replacing cache locations implements a protocol with least recently used (LRU) bits, which are stored for each cache location and updated when the location is accessed or replaced. Valid bits indicate whether the respective cache location holds valid data. Therefore, based on the values of the least recently used bits and the valid bits, the cache replaces the location that the LRU bits indicate is least recently used, or that is not valid. A variety of replacement protocols are utilized by cache memory, such as pseudo-LRU, random, and not recently used (NRU) protocols. However, the present replacement protocols may result in increased inter-cache traffic. For example, replacing a line from an inclusive last level cache requires the same line to be evicted from all the lower level caches, which increases inter-cache traffic.
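The LRU-plus-valid-bits decision just described can be sketched as follows; the per-way fields and the "larger counter means older access" encoding are assumptions made for illustration, not details from the disclosure.

```c
#include <stdbool.h>
#include <stdint.h>

/* One entry per way of a set; field names are illustrative assumptions. */
struct way_state {
    bool    valid;
    uint8_t age;   /* incremented while the line goes unused; larger = older */
};

/* The decision described above: a line that is not valid is replaced
 * immediately; otherwise the least recently used line is the victim. */
static int choose_victim(const struct way_state *set, int num_ways)
{
    int victim = 0;
    for (int way = 0; way < num_ways; way++) {
        if (!set[way].valid)
            return way;                 /* invalid line: replace at no cost */
        if (set[way].age > set[victim].age)
            victim = way;               /* oldest access seen so far */
    }
    return victim;
}
```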
Claimed subject matter is particularly and distinctly pointed out in the concluding portion of the specification. The claimed subject matter, however, both as to organization and method of operation, together with objects, features, and advantages thereof, may best be understood by reference to the following detailed description when read with the accompanying drawings.
In the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of the claimed subject matter. However, it will be understood by those skilled in the art that the claimed subject matter may be practiced without these specific details. In other instances, well-known methods, procedures, components and circuits have not been described in detail so as not to obscure the claimed subject matter.
An area of current technological development relates to improving the speed and efficiency of cache memory replacement protocols. As previously described, present replacement protocols may result in increased inter-cache traffic: replacing a line from an inclusive last level cache, for example, requires the same line to be evicted from all the lower level caches.
In contrast, the claimed subject matter facilitates a replacement protocol for selecting a victim line for replacement based at least in part on the potential inter-cache traffic associated with that particular victim line. Selecting a victim line based on the inter-cache traffic allows for efficient cache line replacement and decreases contention on the various levels of the caches, which results in more efficient bus utilization. In one embodiment, the replacement protocol is a four-way pseudo least recently used (LRU) replacement protocol and supports the MESI (Modified, Exclusive, Shared, Invalid) cache states. In another embodiment, the replacement protocol additionally supports cache states such as MI, MS, and ES, which facilitate snoop filtering. To interpret these compound states from a system perspective, the first letter denotes the state of the line in the last level cache, while the second letter denotes the state of the line in the next lower level cache; the same M, E, S, and I definitions apply to each letter. For example, a cache line in the MI state has a Modified state in the last level cache and an Invalid state in the next lower level cache.
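As a rough illustration of how a relative cost might be attached to each state, consider the sketch below. The specific numeric costs are invented for illustration; the disclosure itself only requires that states whose eviction generates more inter-cache traffic carry a higher relative cost, with the Invalid state costing nothing.

```c
/* Compound states pair the line's state in the last level cache (first
 * letter) with its state in the next lower level cache (second letter).
 * The numeric costs are illustrative assumptions. */
enum llc_state { ST_M, ST_E, ST_S, ST_I, ST_MI, ST_MS, ST_ES };

static int replacement_cost(enum llc_state s)
{
    switch (s) {
    case ST_I:              /* not present anywhere: eviction is free */
        return 0;
    case ST_MI:             /* dirty in the LLC only: writeback, but no
                             * back-invalidation of a lower level cache */
        return 1;
    case ST_E:
    case ST_S:
    case ST_ES:             /* clean copies below must still be invalidated */
        return 2;
    case ST_M:
    case ST_MS:             /* dirty data below must be flushed and invalidated */
        return 3;
    }
    return 3;               /* defensive default */
}
```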
The claimed subject matter facilitates an LRU protocol to support a single core processor, as depicted in the accompanying figures.
In contrast, evicting a cache line in an Invalid state (I state) does not cause any additional traffic, because there is no need to remove the line from any other cache. Therefore, a cost of zero is assigned to this cache state in the cost table.
Therefore, the victim is chosen from the two possible candidates based at least in part on the assigned relative cost, favoring the line whose eviction generates less inter-cache traffic.
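A minimal sketch of this selection step, reusing the `llc_state` enumeration and `replacement_cost()` function from the sketch above; the candidate structure and the tie-break toward the first candidate are assumptions for illustration.

```c
/* One entry per candidate victim, e.g. the pseudo-LRU choice from each
 * of two cache levels. Field names are illustrative assumptions. */
struct candidate {
    int            way;    /* which way holds the candidate line */
    enum llc_state state;  /* its (compound) coherence state */
};

/* Evict whichever candidate's state carries the lower relative
 * inter-cache-traffic cost; ties go to the first candidate. */
static struct candidate select_victim(struct candidate a, struct candidate b)
{
    return (replacement_cost(b.state) < replacement_cost(a.state)) ? b : a;
}
```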
The systems can support any combination of dynamic (e.g., random access memory) and static (e.g., read-only memory, CD-ROM, disk storage, flash memory) memory devices and associated drives, where appropriate. The memory devices are used to store information and instructions to be executed by one or more processors.
Instructions can be provided to the system 1002 or 1004 from a static or remote storage device, such as a magnetic disk, a read-only memory (ROM) integrated circuit, a CD-ROM, or a DVD, via a connection that is either wired or wireless. In alternative embodiments, hard-wired circuitry can be used in place of or in combination with software instructions. Thus, execution of sequences of instructions is not limited to any specific combination of hardware circuitry and software instructions.
Reference in the specification to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the invention. The appearances of the phrase “in one embodiment” in various places in the specification are not necessarily all referring to the same embodiment.
In the foregoing specification, the invention has been described with reference to specific embodiments thereof. It will, however, be evident that various modifications and changes can be made thereto without departing from the broader spirit and scope of the invention. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.
Shannon, Christopher J., Srinivasa, Ganapati, Rowland, Mark