A method of operating a cache memory includes the step of storing a set of data in a first space in a cache memory, the set of data associated with a set of tags. A subset of the set of data is stored in a second space in the cache memory, the subset of the set of data associated with a tag that is a subset of the set of tags. The tag portion of an address is compared with the tag associated with the subset of data in the second space in the cache memory, and the subset of data is read when the tag portion of the address and the tag associated with the subset of data match. The tag portion of the address is compared with the set of tags associated with the set of data in the first space in the cache memory, and the set of data in the first space is read when the tag portion of the address matches one of the set of tags associated with the set of data in the first space and the tag portion of the address and the tag associated with the subset of data in the second space do not match.

Patent
   RE45078
Priority
Jan 16 2001
Filed
Oct 23 2012
Issued
Aug 12 2014
Expiry
Jan 16 2021
9. A processing system comprising:
a system memory;
a cache memory comprising first and second peer cache memory spaces;
a first table for storing tags associated with data stored in the first cache memory space;
a second table for storing tags associated with data stored in the second cache memory space;
processing circuitry operable to:
access a plurality of blocks of data from said system memory in response to a plurality of addresses;
store said blocks of data accessed from said system memory within said first cache memory space, said blocks of data associated with a set of tags in said first table, said blocks further being associated with a set of valid indicators, each valid indicator indicating whether data associated with a particular block is valid;
maintain a first pointer to a location in the first cache memory space, the first pointer referencing a first block in the first space in the cache memory;
maintain a second pointer to a location in the first cache memory space, the second pointer referencing a second block in the first space in the cache memory;
store a selected block of said blocks of data accessed from said system memory within said second cache memory space, the selected block being associated with a valid indicator indicative of whether the data associated with the block is valid, and said block associated with a tag in said second table;
generate a read address including a tag field;
compare said tag field of said read address with said tag in said second table associated with said selected block and access said selected block from said second cache memory space when said tag field and said tag in said second table match; and
compare said tag field of said read address with said set of tags in said first table when said tag field and said tag in said second table do not match and access a corresponding block in said first cache memory space when said tag field and a tag in said first table match.
1. A method of operating a cache memory comprising the steps of:
storing a set of data in a first space in the cache memory, the set of data associated with a set of tags, the set of data in the first space in the cache memory being comprised of a first plurality of cache lines, each cache line in the first plurality of cache lines being associated with a valid indicator indicative as to whether data associated with the cache line is valid;
maintaining a first pointer to a location in the first space in the cache memory, the first pointer referencing a first cache line in the first space in the cache memory;
maintaining a second pointer to a location in the first space in the cache memory, the second pointer referencing a second cache line in the first space in the cache memory;
storing a subset of the set of data in a second space in the cache memory and associated with a tag, the tag associated with the subset of data and being a subset of the set of tags, the set of data in the second space in the cache memory being comprised of a second plurality of cache lines, each cache line in the second plurality of cache lines being associated with a valid indicator indicative as to whether data associated with the cache line is valid;
wherein one of the second plurality of the cache lines is associated with the tag associated with the subset of data;
comparing a tag portion of an address with the tag associated with the subset of data in the second space in the cache memory;
reading the subset of data in the second space when the tag portion of the address and the tag associated with the subset of data match;
comparing the tag portion of the address with the set of tags associated with the set of data in the first space in the cache memory; and
reading the set of data in the first space when the tag portion of the address matches one of the set of tags associated with the set of data in the first space and the tag portion of the address and the tag associated with the subset of data in the second space do not match.
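The lookup order in claim 1 (probe the small second space first, then fall back to the first space) can be illustrated with a minimal Python sketch. This is not part of the patent; the class and field names (`TwoSpaceCache`, `first_space`, `second_space`) are hypothetical, and simple dicts stand in for the tag directories and cache lines.

```python
# Illustrative sketch of the claim 1 read path: the tag portion of the
# address is compared against the subset's tag in the second space first;
# only on a miss (or invalid line) is the first space's tag set searched.

class TwoSpaceCache:
    def __init__(self):
        self.first_space = {}   # tag -> (data, valid) for the set of data
        self.second_space = {}  # tag -> (data, valid) for the subset

    def read(self, tag):
        # Compare the tag portion of the address with the subset's tag.
        entry = self.second_space.get(tag)
        if entry is not None and entry[1]:   # valid hit in the second space
            return entry[0]
        # Otherwise compare against the set of tags in the first space.
        entry = self.first_space.get(tag)
        if entry is not None and entry[1]:   # valid hit in the first space
            return entry[0]
        return None                          # miss: go to lower-level memory
```

A read that hits the second space never consults the first space, which is the priority ordering the claim describes.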
0. 48. A processing system comprising:
a system memory;
a cache memory comprising first and second peer cache memory spaces;
a processor coupled to the cache memory;
a first table for storing tags associated with data stored in the first cache memory space;
a second table for storing tags associated with data stored in the second cache memory space;
processing circuitry operable to:
access a plurality of blocks of data from said system memory in response to a plurality of addresses;
store said blocks of data accessed from said system memory within said first cache memory space, said blocks of data associated with a set of tags in said first table, said blocks further being associated with a set of valid indicators, each valid indicator indicating whether data associated with a particular block is valid;
store a selected block of said blocks of data accessed from said system memory within said second cache memory space, the selected block being associated with a valid indicator indicative of whether the data associated with the block is valid, and said block associated with a tag in said second table;
generate a read address including a tag field;
compare said tag field of said read address with said tag in said second table associated with said selected block and access said selected block from said second cache memory space when said tag field and said tag in said second table match;
compare said tag field of said read address with said set of tags in said first table when said tag field and said tag in said second table do not match and access a corresponding block in said first cache memory space when said tag field and a tag in said first table match;
the processing circuitry being further operable to write data associated with a write address from the processor to a block in the first cache memory space, compare the tag portion of the write address with the tag in said second table, and set the valid indicator associated with the block associated with the tag in said second table when the tag portion of the write address matches the tag in said second table; and
wherein the processing circuitry is further operable to:
maintain a first pointer to a location in the first cache memory space, the first pointer referencing a first block in the first space in the cache memory; and
maintain a second pointer to a location in the first cache memory space, the second pointer referencing a second block in the first space in the cache memory wherein the valid indicator associated with the second block indicates that the data associated with the second block is invalid.
0. 44. A method of operating a cache memory comprising the steps of:
storing a set of data in a first space in the cache memory, the set of data associated with a set of tags, the set of data in the first space in the cache memory being comprised of a first plurality of cache lines, each cache line in the first plurality of cache lines being associated with a valid indicator indicative as to whether data associated with the cache line is valid;
storing a subset of the set of data in a second space in the cache memory and associated with a tag, the tag associated with the subset of data and being a subset of the set of tags, the set of data in the second space in the cache memory being comprised of a second plurality of cache lines, each cache line in the second plurality of cache lines being associated with a valid indicator indicative as to whether data associated with the cache line is valid;
wherein one of the second plurality of the cache lines is associated with the tag associated with the subset of data;
comparing a tag portion of an address with the tag associated with the subset of data in the second space in the cache memory;
reading the subset of data in the second space when the tag portion of the address and the tag associated with the subset of data match;
comparing the tag portion of the address with the set of tags associated with the set of data in the first space in the cache memory; and
reading the set of data in the first space when the tag portion of the address matches one of the set of tags associated with the set of data in the first space and the tag portion of the address and the tag associated with the subset of data in the second space do not match;
writing data associated with a write address from a processor coupled to the cache memory to a memory location of the first space in the cache memory;
comparing the tag portion of the write address with the tag associated with the subset of the set of data stored in the second space in the cache memory;
setting the valid indicator of the cache line associated with the tag to indicate that the data associated with the cache line is invalid when the tag portion of the write address matches the tag associated with the subset of the set of data;
maintaining a first pointer to a location in the first space in the cache memory, the first pointer referencing a first cache line in the first space in the cache memory; and
maintaining a second pointer to a location in the first space in the cache memory, the second pointer referencing a second cache line in the first space in the cache memory, wherein the valid indicator associated with the second cache line indicates that the data associated with the second cache line is invalid.
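The write-coherency step of claim 44 (write into the first space, then invalidate any matching line in the second space) can be sketched as a few lines of Python. This is an illustration only, not the patent's implementation; the dict-based structures and names are assumptions.

```python
# Sketch of the claim 44 write path: data are written to the first space,
# the write address tag is compared with the subset's tag in the second
# space, and a match clears that line's valid indicator.

def write_with_invalidate(first_space, second_space, tag, data):
    first_space[tag] = {"data": data, "valid": True}   # write to first space
    if tag in second_space:
        # Tag match: mark the second-space line invalid to keep coherency.
        second_space[tag]["valid"] = False
```

The invalidation leaves the stale copy in place but guarantees a later read will not return it.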
2. The method of claim 1 further comprising the steps of:
when the tag portion of the address does not match one of the set of tags associated with the set of data in the first space, storing a second set of data in the first space in cache memory and associated with a second set of tags, the second set of data including a second subset of data associated with a tag matching the tag portion of the address; and
storing the second subset of data in the second space in the cache memory tagged with the tag matching the tag portion of the address.
3. The method of claim 1 further comprising the steps of:
during a write operation, comparing the tag portion of the write address with the set of tags associated with the set of data in the first memory space; and
if the tag portion of the write address matches one of the set of tags associated with the set of data in the first memory space, overwriting the data in the first memory space associated with the matching tag.
4. The method of claim 3 and further comprising the steps of:
if the tag portion of the write address does not match one of the set of tags associated with the set of data in the first space in the cache memory, retrieving the data associated with the tag portion of the write address from a second memory; and
storing the retrieved data in the first space of the cache memory tagged with a tag corresponding to the tag portion of the write address.
5. The method of claim 2 wherein said step of storing the second set of data in the first space in the cache memory comprises the step of storing the second set of data in a least recently used set of locations in the first space.
6. The method of claim 2 wherein said step of storing the second set of data in the first space in the cache memory comprises the step of storing the second set of data in a randomly selected set of locations in the first space.
7. The method of claim 4 wherein said step of storing the retrieved data comprises the step of storing the retrieved data at a least recently used set of locations in the first space.
8. The method of claim 4 wherein said step of storing the retrieved data comprises the step of storing the retrieved data at a randomly selected set of locations in the first space.
10. The processing system of claim 9 wherein said processing circuitry is further operable when said tag field does not match a tag in the first table to:
retrieve a second plurality of blocks of data from said system memory;
store the second plurality of blocks of data in the first cache memory space, the second plurality of blocks associated with a second set of tags in said first table; and
store a second selected block of the second plurality of blocks in said second cache memory space, said second selected block associated with a second tag in said second table matching said tag field of said address.
11. The processing system of claim 9 wherein said processing circuitry is further operable to:
generate a write address including a tag field;
compare said tag field of said write address with the set of tags in the first table; and
overwrite data in the first memory space associated with a corresponding tag in the first table matching said tag field of said write address.
12. The processing system of claim 9 wherein said second cache memory space is smaller than said first cache memory space.
13. The processing system of claim 9 wherein said cache memory system comprises a discrete cache memory system.
14. The processing system of claim 9 wherein said cache memory system comprises an on-board cache memory system integrated with said processing circuitry.
15. The processing system of claim 9 wherein said processing circuitry comprises a central processing unit.
16. The processing system of claim 9 wherein said processing circuitry comprises a cache memory controller.
0. 17. The method of claim 3 further comprising the steps of:
comparing the tag portion of the write address with the tag associated with the subset of the set of data; and
when the tag portion of the write address matches the tag associated with the subset of the set of data, changing the valid indicator of the cache line associated with the tag associated with the subset of the set of data to indicate that the data associated with the cache line is invalid.
0. 18. The method of claim 1 further comprising the steps of:
writing data associated with a write address from a processor coupled to the cache memory to a memory location associated with the first cache line in the first space in the cache memory;
comparing the tag portion of the write address with the tag associated with the subset of the set of data stored in the second space in the cache memory; and
setting the valid indicator of the cache line associated with the tag to indicate that the data associated with the cache line is invalid when the tag portion of the write address matches the tag associated with the subset of the set of data.
0. 19. The method of claim 1 further comprising the steps of:
when the tag portion of the address does not match one of the set of tags associated with the set of data in the first space, storing a second set of data to a memory location associated with the second cache line in the first space in the cache memory; and
incrementing the second pointer to reference a third cache line in the first space in the cache memory.
0. 20. The method of claim 19 further comprising the step of writing data associated with a write address from a processor coupled to the cache memory to a memory location associated with the first cache line in the first space in the cache memory.
0. 21. The method of claim 20 further comprising the steps of:
maintaining a third pointer to a location in the second space in the cache memory, the third pointer referencing a first cache line in the second memory space; and
storing a subset of the second set of data to a memory location associated with the first cache line in the second memory space.
0. 22. The method of claim 20 further comprising the step of incrementing the first pointer to reference a fourth cache line wherein the valid indicator associated with the fourth cache line indicates that the data associated with the fourth cache line is invalid.
0. 23. The method of claim 20 further comprising the steps of:
comparing the tag portion of the write address with the tag associated with the subset of the set of data stored in the second space in the cache memory; and
setting the valid indicator of the cache line associated with the tag to indicate that the data associated with the cache line is invalid when the tag portion of the write address matches the tag associated with the subset of the set of data.
0. 24. The method of claim 23 wherein the associativity of the first space in the cache memory and the second space of the cache memory is dynamic.
0. 25. The method of claim 1 wherein the cache memory is part of a multiprocessor system.
0. 26. The method of claim 1 wherein the cache memory is implemented using DRAM.
0. 27. The method of claim 1 wherein the cache memory is implemented using SRAM.
0. 28. The method of claim 1 wherein the cache memory is implemented using multi-ported memory.
0. 29. The method of claim 1 further comprising the steps of:
prefetching data into the first space in the cache memory; and
responding to a read request from a processor coupled to the cache memory with data from the second space in the cache memory.
0. 30. The method of claim 1 further comprising the steps of:
reading data associated with the cache line referenced by the second pointer; and
at substantially the same time writing data to the cache line referenced by the first pointer.
0. 31. The system of claim 9 further comprising:
a processor coupled to the cache memory;
the processing circuitry being further operable to write data associated with a write address from the processor to the block referenced by the first pointer, compare the tag portion of the write address with the tag in said second table, and set the valid indicator associated with the block associated with the tag in said second table when the tag portion of the write address matches the tag in said second table.
0. 32. The system of claim 9 wherein the processing circuitry is further operable when the tag portion of the read address does not match one of the set of tags stored in said first table, store a second set of blocks of data to a memory location referenced by the second pointer, and increment the second pointer to reference a third block in the first space in the cache memory.
0. 33. The system of claim 32 wherein the processing circuitry is further operable to write data associated with a write address from the processor to the block referenced by the first pointer.
0. 34. The system of claim 33 wherein the processing circuitry is further operable to:
maintain a third pointer to a second block within the second cache memory space;
store a block of data that is a subset of the second set of blocks of data within the block referenced by the third pointer.
0. 35. The system of claim 33 wherein the processing circuitry is further operable to increment the first pointer to reference a fourth block wherein the valid indicator associated with the fourth block indicates that the data associated with the fourth block is invalid.
0. 36. The system of claim 33 wherein the processing circuitry is further operable to:
compare the tag portion of the write address with the tag stored in the second table; and
set the valid indicator associated with the block associated with the tag stored in the second table to indicate that the data associated with the block is invalid when the tag portion of the write address matches the tag associated stored in the second table.
0. 37. The system of claim 36 wherein the associativity of the first space in the cache memory and second space in the cache memory is dynamic.
0. 38. The system of claim 9 further comprising a first processor coupled to the cache memory, and a second processor coupled to the cache memory.
0. 39. The system of claim 9 wherein the cache memory is implemented using DRAM.
0. 40. The system of claim 9 wherein the cache memory is implemented using SRAM.
0. 41. The system of claim 9 wherein the cache memory is implemented using multi-ported memory.
0. 42. The system of claim 9 further comprising:
prefetching circuitry operable to prefetch data into the first space in the cache memory; and wherein the processing circuitry is further operable to respond to a read request with data from the second space in the cache memory.
0. 43. The system of claim 9 further comprising:
a processor coupled to the cache memory; and
wherein the processing circuitry is further operable to read data associated with the block referenced by the second pointer; and
at substantially the same time write data to the block referenced by the first pointer.
0. 45. The method of claim 44 further comprising the steps of:
when the tag portion of the address does not match one of the set of tags associated with the set of data in the first space, storing a second set of data to a memory location associated with the second cache line in the first space in the cache memory;
incrementing the second pointer to reference a third cache line in the first space in the cache memory;
maintaining a third pointer to a location in the second space in the cache memory, the third pointer referencing a first cache line in the second memory space; and
storing a subset of the second set of data to a memory location associated with the first cache line in the second memory space.
0. 46. The method of claim 45 further comprising the steps of:
comparing the tag portion of the write address with the tag associated with the subset of the set of data stored in the second space in the cache memory; and
setting the valid indicator of the cache line associated with the tag to indicate that the data associated with the cache line is invalid when the tag portion of the write address matches the tag associated with the subset of the set of data.
0. 47. The method of claim 44 further comprising the steps of:
reading data associated with the cache line referenced by the second pointer; and
at substantially the same time writing data to the cache line referenced by the first pointer.

This formula determines the associativity between Bank 1 and Bank 2, where Bank 1 is K-way set associative. Note, however, that Bank 1 and Bank 2 are fully independent, direct-mapped caches. The associativity between Bank 1 and Bank 2 can be changed by employing a different prefetching scheme, which in turn changes the formula for calculating the Bank 1 write pointer from the Bank 2 write pointer.

The controller also runs the protocol responsible for the write to a cache. The WRITE protocol used by the controller is shown in FIG. 9. In the first write scenario:

    • 1) Processor Write→Bank 2 Search (Tag Search)→Tag Hit→Overwrite the matching Tag entry in the Tag directory for Bank 2 and “Set” the valid bit. Overwrite the data corresponding to the Tag entry in Bank 2.

To maintain coherency of data, Bank 1 is also searched, and the valid bit is changed to “Dirty” if there is a Tag hit in the Tag directory for Bank 1. The data are not overwritten, so the entry at the line number corresponding to this Tag value is free to be overwritten in the next write cycle by a lower-level memory prefetch, fetch, or an update from Bank 2. The processor write pointer, which is separate from the memory write pointer, is not updated; it points to the line with the first “Dirty” valid bit in Bank 2, or to the first line in Bank 2 if no Dirty bit is set. On the first Tag hit on a processor write, and on subsequent processor writes, the processor write pointer is updated by 2, so as not to overwrite data from Address ‘A.’
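Write scenario 1 above can be summarized in a short Python sketch. This is a hedged illustration, not the patented controller: the dict-based tag directories and the `"state"`/`"valid"` field names are assumptions introduced here.

```python
# Sketch of WRITE scenario 1: on a Bank 2 Tag hit, overwrite the Bank 2
# entry and set its valid bit; if Bank 1 also holds the Tag, mark that
# line "Dirty" without overwriting its data (it may be replaced later).

def bank2_write_hit(bank1_dir, bank2_dir, tag, data):
    if tag not in bank2_dir:
        return False                       # Tag miss: handled by scenario 2
    bank2_dir[tag] = {"data": data, "valid": True}  # overwrite data, set valid
    if tag in bank1_dir:
        bank1_dir[tag]["state"] = "Dirty"  # coherency: mark, do not overwrite
    return True
```

Marking rather than overwriting Bank 1 leaves that line free for the next lower-level prefetch, fetch, or Bank 2 update, as described above.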

The processor write pointer is only used as a replacement instrument in case of a Tag miss as shown in scenario 2:

    • 2) Processor Write→Bank 2 Search (Tag Search)→Tag Miss→Overwrite the Tag entry in the Tag directory for Bank 2 at the index equal to the processor write pointer with the new address Tag generated by the processor. Replace the data in the line that corresponds to the index of the processor write pointer. To avoid coherency problems, check the Bank 1 Tags in the Tag directory entries for Bank 1. (This step is necessary because there might be a Tag match in Bank 1 even though there is a Tag miss in Bank 2.) If there is a Tag hit in the Tag directory for Bank 1, then set the valid bit to “Dirty.” If there is no match, the directory entries for Bank 1 are left unchanged.
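Scenario 2 can likewise be sketched in Python. Again this is illustrative only: the list-based Bank 2 arrays, the dict-based Bank 1 directory, and the pointer-advance policy at the end are assumptions, not the patent's formula (which depends on the prefetching scheme).

```python
# Sketch of WRITE scenario 2: on a Bank 2 Tag miss, the entry at the
# processor write pointer index is replaced with the new Tag and data,
# and Bank 1 is checked so any matching line can be marked "Dirty".

def bank2_write_miss(bank1_dir, bank2_tags, bank2_data, wptr, tag, data):
    bank2_tags[wptr] = tag      # overwrite Tag directory entry at the pointer
    bank2_data[wptr] = data     # replace the corresponding data line
    if tag in bank1_dir:        # coherency check against the Bank 1 Tags
        bank1_dir[tag]["state"] = "Dirty"
    # Advance the replacement pointer; a simple wrap-around is one possible
    # policy, not the formula used in the patent.
    return (wptr + 1) % len(bank2_tags)
```

The returned value would become the next processor write pointer under this assumed policy.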

The selection of two pointers for memory and processor writes allows the application or the instruction set associated with it to dynamically determine the data distribution within this memory subsystem. This allows for dynamic utilization of spatial and temporal locality of data.

If the processor accesses more recently written data from memory, these reads are more likely to generate hits in Bank 2. If the accesses are more random, more hits are likely to be generated in Bank 1. The underlying assumption is that there is some degree of spatial locality associated with instructions and data for all applications.

This cache design offers the advantage of a direct-mapped cache on writes and the speed of associativity on reads. The independent processor write pointer can also be updated using a method where it always points to the first “Dirty” line in the Bank.

In sum, the mirrored memory architecture of the present invention can advantageously be used to maintain the spatial and/or temporal locality of the encached data required for a set of processing operations. Specifically, a set of data and the corresponding tags are stored in Bank 2 and the associated Bank 2 directory, respectively. A subset of those data are stored, along with the corresponding tags, in Bank 1 and the associated Bank 1 directory. When a memory address is received from the CPU or memory controller, the tag is first compared with those in the tag directories. If a hit is found in the Bank 1 tag directory, Bank 1 of the mirrored memory is preferentially accessed. Otherwise, if the address tag misses the Bank 1 directory but hits an entry in the Bank 2 directory, Bank 2 is used for the access. When the address tag does not match a tag in either of the two directories, a lower level of memory must be accessed and the mirrored memory contents updated.

During the update of the mirrored memory contents on a read miss, a block or other set of data associated with a set of addresses is copied into Bank 2 of the mirrored memory, and the associated tags are loaded into the Bank 2 directory. A subset of this block of data, having a tag matching that of the address causing the miss, is also loaded into Bank 1, and that tag is loaded into the corresponding entry in the Bank 1 directory. On a write miss, a victim line or block at the write pointer is overwritten and the corresponding entry in the Bank 2 directory is updated with the tag from the address causing the miss.
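The read-miss update above can be sketched as a small Python routine. This is a simplified illustration under assumptions stated in the comments; dicts stand in for each bank and its tag directory, and addresses double as tags.

```python
# Sketch of the read-miss update: the whole block for a set of addresses
# is copied from lower-level memory into Bank 2, and the subset whose tag
# matches the address that caused the miss is mirrored into Bank 1.

def update_on_read_miss(bank1, bank2, lower_memory, block_addrs, miss_tag):
    for addr in block_addrs:
        bank2[addr] = lower_memory[addr]      # block + tags into Bank 2
        if addr == miss_tag:
            bank1[addr] = lower_memory[addr]  # matching subset into Bank 1
```

After the update, the address that missed hits in both banks, with Bank 1 preferred on the next access.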

While a particular embodiment of the invention has been shown and described, it will be obvious to those skilled in the art that changes and modifications may be made therein without departing from the invention in its broader aspects, and, therefore, the aim in the appended claims is to cover all such changes and modifications as fall within the true scope of the invention.

Kavipurapu, Gautam Nag

Patent Priority Assignee Title
5361391, Jun 22 1992 Sun Microsystems, Inc. Intelligent cache memory and prefetch method based on CPU data fetching characteristics
5386527, Dec 27 1991 Texas Instruments Incorporated; TEXAS INSTRUMENTS INCORPORATED A CORP OF DELAWARE Method and system for high-speed virtual-to-physical address translation and cache tag matching
5386547, Jan 21 1992 HEWLETT-PACKARD DEVELOPMENT COMPANY, L P System and method for exclusive two-level caching
5581725, Sep 30 1992 NEC Corporation Cache memory system having first and second direct-mapped cache memories organized in hierarchical structure
5619675, Jun 14 1994 Storage Technology Corporation Method and apparatus for cache memory management using a two level scheme including a bit mapped cache buffer history table and circular cache buffer list
5638506, Apr 08 1991 Storage Technology Corporation Method for logically isolating a cache memory bank from a memory bank group
5689679, Apr 28 1993 HEWLETT-PACKARD DEVELOPMENT COMPANY, L P Memory system and method for selective multi-level caching using a cache level code
5706464, Jan 29 1993 International Business Machines Corporation Method and system for achieving atomic memory references in a multilevel cache data processing system
5715428, Feb 28 1994 Intel Corporation Apparatus for maintaining multilevel cache hierarchy coherency in a multiprocessor computer system
5787478, Mar 05 1997 International Business Machines Corporation Method and system for implementing a cache coherency mechanism for utilization within a non-inclusive cache memory hierarchy
5895484, Apr 14 1997 International Business Machines Corporation Method and system for speculatively accessing cache memory data within a multiprocessor data-processing system using a cache controller
5916314, Sep 11 1996 International Business Machines Corporation Method and apparatus for cache tag mirroring
5926830, Oct 07 1996 International Business Machines Corporation Data processing system and method for maintaining coherency between high and low level caches using inclusive states
5937431, Jul 12 1996 SAMSUNG ELECTRONICS CO , LTD Multi- node, multi-level cache- only memory architecture with relaxed inclusion
5956746, Aug 13 1997 Intel Corporation Computer system having tag information in a processor and cache memory
5963978, Oct 07 1996 International Business Machines Corporation High level (L2) cache and method for efficiently updating directory entries utilizing an n-position priority queue and priority indicators
6078992, Dec 05 1997 Intel Corporation Dirty line cache
6253291, Feb 13 1998 Oracle America, Inc Method and apparatus for relaxing the FIFO ordering constraint for memory accesses in a multi-processor asynchronous cache system
6321297,
6629210, Oct 26 2000 GOOGLE LLC Intelligent cache management mechanism via processor access sequence analysis
6826652, Dec 06 1999 Texas Instruments Incorporated Smart cache
20020078303,
EP192578,
Executed on: Oct 23 2012
Assignee: Narada Systems, LLC (assignment on the face of the patent)
Date Maintenance Fee Events
Dec 17 2014M1553: Payment of Maintenance Fee, 12th Year, Large Entity.


Date Maintenance Schedule
Aug 12 2017: 4 years fee payment window open
Feb 12 2018: 6 months grace period start (w surcharge)
Aug 12 2018: patent expiry (for year 4)
Aug 12 2020: 2 years to revive unintentionally abandoned end (for year 4)
Aug 12 2021: 8 years fee payment window open
Feb 12 2022: 6 months grace period start (w surcharge)
Aug 12 2022: patent expiry (for year 8)
Aug 12 2024: 2 years to revive unintentionally abandoned end (for year 8)
Aug 12 2025: 12 years fee payment window open
Feb 12 2026: 6 months grace period start (w surcharge)
Aug 12 2026: patent expiry (for year 12)
Aug 12 2028: 2 years to revive unintentionally abandoned end (for year 12)