A virtual memory system including a local-to-global virtual address translator for translating local virtual addresses having associated task specific address spaces into global virtual addresses corresponding to an address space associated with multiple tasks, and a global virtual-to-physical address translator for translating global virtual addresses to physical addresses. Protection information is provided by each of the local virtual-to-global virtual address translator, the global virtual-to-physical address translator, the cache tag storage, or a protection information buffer depending on whether a cache hit or miss occurs during a given data or instruction access. The cache is configurable such that it can be configured into a buffer portion or a cache portion for faster cache accesses.
40. A memory storage system operable to store information accessed by a processor, the memory storage system comprising:
an access path operable to access a main memory; and
a memory storage area which is configurable into a cache portion and a buffer portion, wherein the memory storage system receives an address for accessing the memory storage area, said address including a field for indicating whether the access includes a cache access or a buffer access.
62. A memory storage system operable to store information accessed by a processor, the memory storage system comprising:
an access path operable to access a main memory; and
a memory storage area which is configurable into a cache portion and a buffer portion in accordance with configuration information received by the memory storage system, wherein an address received by the memory storage system is modified in accordance with the configuration information to produce a modified address provided to the memory storage area.
8. A method for storing recently accessed data from a main memory in a computer system, comprising the steps of:
configuring a memory storage area into a cache portion and a buffer portion; and
storing indices corresponding to data stored in said memory storage area;
wherein when said cache portion is accessed by a given address, said stored indices are also accessed by said given address to check if said data accessed from said cache portion is valid and when said buffer portion is accessed by said given address, said stored indices are not checked.
15. A computer-readable medium containing a program that performs the steps of:
receiving an indication that a memory storage area has been configured into a cache portion and a buffer portion; and
storing indices corresponding to data stored in said memory storage area;
wherein when said cache portion is accessed by a given address, said stored indices are also accessed by said given address to check if said data accessed from said cache portion is valid and when said buffer portion is accessed by said given address, said stored indices are not checked.
1. A memory storage system for storing recently accessed data from a main memory in a computer system, said memory storage system comprising:
a memory storage area which is configurable into a cache portion and a buffer portion; and
means for storing indices corresponding to data stored in said memory storage area;
wherein when said cache portion is accessed by a given address, said index storage means is also accessed by said address to check if said data accessed from said cache portion is valid and when said buffer portion is accessed by said given address, said index storage means is not checked.
64. A memory storage system operable to store information accessed by a processor, the memory storage system comprising:
an access path operable to access a main memory;
a memory storage area which is configurable into a cache portion and a buffer portion; and
a cache tag memory configured to store indices corresponding to information stored in the memory storage area,
wherein the memory storage system receives an address for accessing the memory storage area, said address including a field for indicating whether the access is a cache access or a buffer access, wherein a cache access results in an access to the main memory if the cache tag memory does not contain an index corresponding to the cache access, and a buffer access does not result in an access to the main memory.
63. A memory storage system operable to store information accessed by a processor, the memory storage system comprising:
an access path operable to access a main memory; and
a memory storage area which is configurable into a cache portion and a buffer portion in accordance with configuration information received by the memory storage system,
wherein the memory storage system receives an address for accessing the memory storage area, said address being within an address space divided into a cache address space portion and a buffer address space portion, and the address including a field for indicating whether the access to the memory storage area is a cache access or a buffer access, and
wherein a portion of the address is modified in accordance with the configuration information to produce a modified address, the modified address together with an un-modified portion of the address being provided to the memory storage area.
2. The memory system as described in
3. The memory system as described in
4. The memory system as described in
5. The memory system as described in
6. The memory system as described in
7. A memory system as described in
9. The method of
10. The method of claim 14, wherein said protection information includes a field that defines a coherence state of data stored at said given address, wherein said coherence state indicates whether data stored at said given address may be read, written into, or replaced.
11. The method of claim 14, wherein said protection information includes a field that defines an access priority of said given address, wherein said access priority indicates the order at which said given address is accessed with respect to other accesses in said memory system.
12. The method of claim 14, wherein said protection information includes a field for indicating when a detail exception should occur.
13. The method of claim 14, wherein said protection information includes a field that defines a cache control condition of said given address, wherein said cache control condition indicates states of said data stored in said given address including a cache coherent state, a non-allocated state, and a physical state.
14. The method of
16. The computer-readable medium of
17. The computer-readable medium of claim 21, wherein said protection information includes a field that defines a coherence state of data stored at said given address, wherein said coherence state indicates whether data stored at said given address may be read, written into, or replaced.
18. The computer-readable medium of claim 21, wherein said protection information includes a field that defines an access priority of said given address, wherein said access priority indicates the order at which said given address is accessed with respect to other accesses in said memory system.
19. The computer-readable medium of claim 21, wherein said protection information includes a field for indicating when a detail exception should occur.
20. The computer-readable medium of claim 21, wherein said protection information includes a field that defines a cache control condition of said given address, wherein said cache control condition indicates states of said data stored in said given address including a cache coherent state, a non-allocated state, and a physical state.
21. The computer-readable medium of
22. The memory storage system of
a compare unit coupled to compare said given address to the indices upon accessing the cache portion, and generate a cache hit if a match is found, and generate a cache miss if no match is found.
23. The memory storage system of
24. The memory storage system of
25. The memory storage system of
26. The memory storage system of
27. The memory storage system of
28. The memory storage system of
29. The memory storage system of
30. The memory storage system of
31. The method of
upon accessing the cache portion, comparing said given address to the indices and generating a cache hit if a match is found, and generating a cache miss if no match is found.
32. The method of
if a cache miss is detected in an access to the cache portion, accessing the main memory for transferring data to and from the main memory.
33. The method of
34. The method of
35. The method of
upon accessing the memory storage area, receiving information indicating whether the access is a cache access or a buffer access.
36. The method of
partitioning the memory storage area between the cache portion and the buffer portion in accordance with configuration information received by the memory storage system.
37. The method of
partitioning the memory storage area into one of a predetermined number of combinations of cache portion size and buffer portion size in accordance with configuration information received by the memory storage system.
38. The method of
setting the configuration information in a status register when the computer system is configured by software.
39. The method of
allocating address ranges to the cache portion and the buffer portion in accordance with the partitioning of the memory storage area between the cache portion and the buffer portion.
41. The memory storage system of
42. The memory storage system of
a cache tag memory configured to store indices corresponding to information stored in the memory storage area, wherein upon accessing the cache portion the cache tag memory is checked to determine whether the cache tag memory contains an index corresponding to the cache access.
43. The memory storage system of
a cache tag memory configured to store indices corresponding to information stored in the memory storage area; and
a compare unit adapted to compare an address corresponding to a cache access to the indices stored in the cache tag memory, and generate a cache hit if a match is found, and generate a cache miss if no match is found.
44. The memory storage system of
45. The memory storage system of
46. The memory storage system of
47. The memory storage system of
48. The memory storage system of
49. The memory storage system of
50. The memory storage system of
51. The memory storage system of
52. The memory storage system of
53. The memory storage system of
54. The memory storage system of
55. The memory storage system of
56. The memory storage system of
57. The memory storage system of
58. The memory storage system of
59. The memory storage system of
60. The memory storage system of
61. A processor operable to simultaneously execute multiple threads and comprising the memory storage system of
This application is a continuation of application Ser. No. 08/541,416, filed Oct. 10, 1995, now U.S. Pat. No. 6,101,590.
The present invention relates to computer memory systems and particularly to virtual memory systems.
In order to enhance performance and utility in a computer system, a technique called virtual memory is frequently used. One motivation for using virtual memory is to allow multiple programs to simultaneously share a computer system's main memory. This is achieved by allocating individual portions (referred to as blocks or segments) of the main memory to each of the programs being run (also referred to as tasks). Virtual memory systems are also used in cases when a single program is too large to fit into main memory. In this case, portions of the program are stored in secondary memory and the virtual memory system assists in retrieving these portions from the secondary memory.
Virtual memory is implemented by using virtual addresses at the task or program level, with each task having its own set of independent addresses. When a program performs a memory access, the virtual addresses are translated into physical addresses that may or may not be the same as other physical addresses for other tasks. The translation may be successful, leading to an access to main memory using that physical address. The translation may be unsuccessful, indicating that physical, or main, memory has not been allocated for that virtual address, leading to a processor exception, from which the program may be aborted or physical memory may be allocated and the task restarted. To enhance the translation performance, virtual addresses are translated to physical addresses using information stored in a translation look-aside buffer (TLB), also known as a translation cache. The TLB provides the information that defines the mapping for each of the virtual addresses.
There are basically two categories of virtual memory systems presently utilized: paging and segmentation. Paging systems typically use fixed size blocks for allocating memory to processes. Segmentation, in contrast, uses variable size blocks which may range from a value as small as one byte. Paging suffers from the disadvantage that sections of contiguous physical memory become unused because the page size is fixed: this is known as internal fragmentation. Segmentation, on the other hand, has the disadvantage that the variable-sized segments may create unused regions of memory as segments are allocated, deallocated, and reallocated in arbitrary order, leaving holes in the consecutive memory allocation which become unused or unusable because they are not of a suitable size. A hybrid of the two categories has been employed in prior art systems in which segmentation and paging are both employed together.
Virtual memory systems may also employ a memory cache system, which includes a cache data storage and its corresponding cache tag storage, to minimize virtual memory misses. The cache stores recently accessed data and the tag storage stores a portion of the virtual address or physical address, providing the means by which it can be determined whether the cache contains the requested address. Only a portion of the address is usually required because the remaining portion of the address is used to locate (index) a reference within the cache data and tag storage, and so need not be checked again.
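By way of illustration only, the division of an address into a tag portion and an index portion may be sketched as follows. The cache geometry (64-byte lines, 256 entries, direct mapping) and the names `split_address` and `DirectMappedCache` are assumptions chosen for the example and are not taken from the description.

```python
# Hypothetical geometry, not from the source: 64-byte lines, 256 entries.
LINE_BITS = 6
INDEX_BITS = 8


def split_address(addr):
    """Return (tag, index, offset) for an address."""
    offset = addr & ((1 << LINE_BITS) - 1)
    index = (addr >> LINE_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (LINE_BITS + INDEX_BITS)
    return tag, index, offset


class DirectMappedCache:
    """Toy model of a cache data storage with its cache tag storage."""

    def __init__(self):
        self.tags = [None] * (1 << INDEX_BITS)  # cache tag storage
        self.data = [None] * (1 << INDEX_BITS)  # cache data storage

    def lookup(self, addr):
        # The index selects the entry; only the tag needs to be compared.
        tag, index, _ = split_address(addr)
        hit = self.tags[index] == tag
        return hit, (self.data[index] if hit else None)

    def fill(self, addr, line):
        tag, index, _ = split_address(addr)
        self.tags[index] = tag
        self.data[index] = line
```

Because the index bits already select the entry, two addresses that differ only in their tag bits compete for the same entry, and the stored tag is what distinguishes them.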
Caches may use either a virtual or physical address to index the cache, known as a virtual-index cache or a physical-index cache. Additionally, caches may use either a virtual or physical address stored and compared against in the cache tag storage, known as a virtual-tag cache or a physical-tag cache. Virtual-index and virtual-tag caches are generally able to attain higher peak performance, but add constraint to the mapping of addresses available when sharing data or changing the address mapping. In particular, the problem called aliasing occurs, in which two tasks use different virtual addresses to reference the same physical memory. Aliasing may require that tasks sharing memory space use identical or similar virtual addresses.
Since virtual memory allows two processes to share the same portion of physical memory such that each process's virtual memory is mapped to different addresses, it is necessary to implement a protection scheme that prevents one task (i.e. a set of program instructions) from modifying a portion of memory, unless specifically allowed. Typically, tasks are assigned privilege levels which indicate the task's ability to modify areas within physical memory and establish a control hierarchy, where higher privileged tasks are able to manipulate the storage of lower privileged tasks, including the possibility of higher privileged tasks manipulating the state of the virtual memory system itself.
One implementation of a protection scheme presently employed by virtual memory systems is the use of “gateways” or “call gates” that function to provide a given task limited access privilege to areas in the physical memory having higher privilege than the task. The disadvantages of this prior art gateway implementation are that it utilizes the CPU's status register, requiring additional instructions in order to modify the status register, and that it fails to provide securely initialized machine state, requiring additional instructions to initialize CPU registers used to access privileged memory regions. As a result, prior art gateway methods tend to reduce overall system performance by increasing execution times.
The present invention is a virtual memory system that performs virtual address-to-physical address translations in a manner that increases the overall efficiency and flexibility of the virtual memory system.
A virtual memory system that functions to translate a task specific virtual address (referred to as a local virtual address) into a virtual address that is generalized to all or a group of tasks (referred to as a global virtual address) and then translates the global virtual address into an address which points to a block of physical memory, (referred to as the physical address) is described. A first embodiment of the virtual memory system of the present invention includes a local-to-global virtual address translator for translating the local virtual address into the global virtual address and a global virtual-to-physical address translator for translating the global virtual address into the physical address. In an alternate embodiment, separate local-to-global virtual address translators are used for translating each of the data and instruction access addresses.
In one embodiment of the present invention, the local-to-global virtual address translator and the global virtual-to-physical address translator each include a plurality of cells, each cell implementing a single entry in a translation look-aside buffer (TLB) which defines a particular address space mapping. The TLB entry includes a match field, a mask field, an XOR field, and a protection field. Each cell includes a first logic means for matching the input address to be translated with the contents of the cell's match field to generate a match indicator output signal, a second logic means for masking the match indicator output signal with the contents of the cell's mask field to generate a masked output signal, a third logic means for generating a select signal if all of the signals making up the masked output signal are at the same logic level, a fourth logic means for outputting the cell's XOR value if the cell is selected, and a fifth logic means for providing a protection signal when the cell is selected. Each of the translators also includes a means for multiplexing all of the XOR values from each cell and outputting the XOR value of the selected cell and a second means for multiplexing all of the protection information from each cell and outputting the protection information of the selected cell. Further, each of the translators includes a logic means for combining the XOR value from the selected cell with the address to be translated, using a bitwise exclusive-or operation to generate the translated address (either global virtual address or physical address).
In another embodiment, the local-to-global virtual address translator comprises a single cell. In this case, all of the local virtual addresses are translated by the one cell and thus this embodiment of the local-to-global virtual address translator does not require the first and second multiplexers.
The local-to-global virtual memory translator, the global virtual-to-physical (GV/P) address translator and a cache tag storage provide protection information for each instruction or data access. The protection information includes the task privilege for the given access. The address's protection information is passed to the processor and is checked with the protection information associated with the corresponding local virtual, global virtual, or physical memory area that the address is attempting to access.
The present invention utilizes gateways to allow a given address to access high privilege areas of memory at certain entry points. A gateway instruction includes a gateway operation code and the gateway operation's privilege level in its protection information. The gateway instruction also provides an address that points to the gateway for the gateway operation. The gateway is a register that stores the gateway entry pointer and a data register pointer. The gateway operation comprises several steps which include storing the address of the gateway entry pointer in the processor program counter, storing the contents of the previous value stored in the program counter in a first register, initializing a second register with data to be used in the operation, performing the gateway operation, and reloading the contents of the first register into the program counter.
The virtual memory system also includes a cache and a tag structure. The cache stores the accessed data or instruction that was recently retrieved by a previously translated virtual address. In one embodiment of the present invention a separate cache and tag system is used for each of the data and instruction accesses. In this embodiment, if an instruction access is being performed, the instruction cache tag utilizes the virtual address's corresponding global virtual address to tag the instruction stored in the virtual instruction cache. If a data access is being performed, the data cache tag utilizes the virtual address's corresponding physical address to tag the data stored in the virtual data cache.
The virtual cache structure also employs a means for determining whether a local virtual address cache hit or miss has occurred. If a cache hit occurs, the data in the virtual cache is accessed and the access operation is complete. If a cache miss occurs, the global virtual address is passed to the global virtual-to-physical address translator where it is translated into a physical address. The physical address is then used to access the data from either main memory or secondary memory.
A virtual memory system is described. In the following description, numerous specific details are set forth, such as cache size, address field size and bus widths etc., in order to provide a thorough understanding of the present invention. It will be obvious, however, to one skilled in the art that these specific details need not be employed to practice the present invention. In other instances, well-known memory system structures have not been described in detail in order to avoid unnecessarily obscuring the present invention.
Local-to-global virtual address translation is performed by mapping the higher order 16 bit portions 100a of local virtual address 100 to 16 bit portions 102a of global virtual address 102. As shown in
Global virtual-to-physical address translation is performed by translator 130. The higher order 58 bits of global virtual address 102 (i.e. portions 102a and 102b) are coupled to translator 130. Translator 130 maps the 58 bits of the selected global virtual address to 58 bits of a physical address. Portion 104a in physical address 104 corresponds to the translated 58 higher order bits of the global virtual address. The 6 lower bits of address 102 (i.e. portion 102c) are coupled to the 6 lower order bits of address 104 (i.e. portion 104b).
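By way of illustration only, the bit-field handling of the two translation stages may be sketched as follows. The sketch assumes that the selected mapping is applied as a bitwise exclusive-or, as in the TLB embodiment summarized above, and that the lower 48 bits of the local virtual address pass through unchanged; the function names and the XOR values used are hypothetical.

```python
def local_to_global(lva, xor16):
    """Translate the higher order 16 bits of a 64-bit local virtual address.

    Assumption: the lower 48 bits pass through unchanged.
    """
    high = (lva >> 48) ^ (xor16 & 0xFFFF)
    return (high << 48) | (lva & ((1 << 48) - 1))


def global_to_physical(gva, xor58):
    """Translate the higher order 58 bits; the 6 lower bits pass through."""
    high = (gva >> 6) ^ (xor58 & ((1 << 58) - 1))
    return (high << 6) | (gva & 0x3F)


# Example with hypothetical XOR values.
lva = 0x0001_0000_0000_1234
gva = local_to_global(lva, 0x00F1)
pa = global_to_physical(gva, 0x1)
```

In this sketch the 6 lower order bits are never modified, which corresponds to portion 102c being coupled directly to portion 104b.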
When an address accesses the virtual memory system shown in
If a cache miss occurs (i.e. the accessed index provided by the cache tag does not match the higher order 52 bits in the global virtual instruction address), a cache miss is transmitted to the CPU on bus 136 indicating that the data sent on data bus 133 is not valid, at which time the CPU discards the data sent by the cache and the global virtual address is translated into a physical address. If the translation is successful, as indicated by translation hit and protection information 160, the physical address is then used to access the physical memory, the data returned from memory is written into the cache storage 111, and the global virtual address 102 and protection information 160 are written into cache tag 112. If the translation is unsuccessful, an exception is indicated to the CPU, and the physical memory access may not occur.
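By way of illustration only, the hit and miss handling described above may be sketched in software as follows. The dictionary-based cache model, the 256-entry geometry, the `translate` callback and the `TranslationFault` exception are assumptions made for the example; they stand in for the hardware structures and are not part of the description.

```python
class TranslationFault(Exception):
    """Stands in for the exception indicated to the CPU."""


def access(gva, cache, translate, memory):
    """Return (line, outcome) for a global virtual address.

    cache:     dict mapping index -> (tag, protection, line)   (hypothetical)
    translate: callable gva -> (physical_address, protection) or None
    memory:    dict mapping physical line number -> line
    """
    index = (gva >> 6) & 0xFF   # hypothetical 256-entry, 64-byte-line cache
    tag = gva >> 14
    entry = cache.get(index)
    if entry is not None and entry[0] == tag:
        return entry[2], "hit"

    # Miss: translate the global virtual address to a physical address.
    result = translate(gva)
    if result is None:
        raise TranslationFault(hex(gva))
    physical, protection = result

    # Fetch from physical memory, then fill cache storage and cache tag.
    line = memory[physical >> 6]
    cache[index] = (tag, protection, line)
    return line, "miss"
```

The sketch stores a virtual tag together with the protection information on a fill, mirroring the embodiment in which the global virtual address is written into the cache tag.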
It should be noted that in one embodiment of the present invention, the type of addresses stored in the cache tag 112 is a physical address that can be used to access the physical memory. In such an embodiment, tag compare unit 113 is coupled to physical address 104 instead of global virtual address 102. Because a virtual-index cache is employed, the portion of the physical address which must be stored in cache tag 112 is larger and therefore a more limited set of protection information is stored in cache tag 112 and transmitted to the processor on bus 135. This information is supplemented by protection information 160 from the global virtual to physical translator 130.
Employing physical address indexes in the cache tag facilitates memory operations such as writing data out of the cache storage 111 and into physical memory when data is modified in the cache in order to retain data consistency. Another reason for using physical address indexes in the cache tag 112 is to facilitate purging data from the cache in order to make room for recently accessed memory storage. In both of these situations, since the cache tag stores physical address indexes, the data can be written into the physical memory without going through a global virtual-to-physical address translation. It should be apparent that the type of address (virtual or physical) stored in the cache tag is chosen so as to optimize the overall accessing operations in the memory system.
In the embodiment shown in
For instance, referring to
It should also be understood that since the global virtual-to-physical address translator 130 is shared between paths, multiplexers 131 and 132, controlled by path select signals, are employed to route the global virtual instruction or data addresses through the shared path.
In an alternate embodiment of the present invention, local-to-global virtual address translation for both of the instruction and data addresses is performed by a single, shared translator.
The virtual instruction or data address is referred to as local since it belongs to a virtual address space that is unique to one specific task. Generally, in a computer system that performs multiple tasks, each task in the system has a unique local virtual address space. Accordingly, in one method of the present invention for performing local-to-global virtual translation, each of the local virtual address spaces is mapped into a common global virtual address space. In an alternative embodiment, the local virtual address spaces are mapped into more than one global virtual address space.
The 16 bit match indicator output signal from XOR gate 511 is coupled to a first parallel input of AND gate 512. The second parallel input is coupled to 16 bit mask register 401. The mask register is used to mask out certain of the 16 bits that are non-zero. The two parallel inputs of AND gate 512 are bit-by-bit ANDed together. If all of the 16 bits from XOR gate 511 are zero, then the output of AND gate 512 will also be zero no matter what the mask value is. If any of the bits from XOR gate 511 are non-zero, the mask may be employed to mask these out so that the output of AND gate 512 is 16 bits of all zeros. However, if any of the bits from XOR gate 511 are non-zero and they are not masked out by mask 401, then no match occurs and the 16 bit masked output signal of AND gate 512 will contain a “1”. The 16 bits from AND gate 512 are coupled to NOR gate 513 and are NORed together. If all of the 16 bits are zero, NOR gate 513 outputs a “1” indicating that the cell is selected. If any of the 16 bits are non-zero, NOR gate 513 outputs a “0” indicating that the cell is not selected. The select signal is ANDed with each of the values in XOR register 403. If the select signal is a 1, indicating that the cell is selected, then AND gate 514 outputs the contents of XOR register 403. If the select signal is a 0, indicating the cell is not selected, AND gate 514 outputs a 16 bit word of “0”s.
The select signal from the output of NOR gate 513 is also coupled to one input of AND gate 515. The other input of gate 515 is coupled to register 404 which contains the protection information for the global virtual address space defined by cell 1. In the event the select signal is a “1”, AND gate 515 passes the content of the protection register 404 to one input of OR gate 570. If the select signal is a “0”, then AND gate 515 outputs all “0”s. Since only one cell is selected at a time, gate 570 outputs protection information from only one cell. In other words, OR gate 570 outputs the protection information from the selected cell. This information is then transmitted to the CPU on bus 111.
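By way of illustration only, the cell logic described above may be modeled in software as follows. The class and function names are hypothetical. Combining the per-cell XOR outputs with a bitwise OR is an assumption that relies on unselected cells outputting all zeros, and returning `None` when zero or several cells are selected is likewise an assumption, since the description states only that one cell is selected at a time.

```python
from functools import reduce

WIDTH = 16
ALL_ONES = (1 << WIDTH) - 1


class Cell:
    """One TLB entry: match, mask, XOR and protection registers."""

    def __init__(self, match, mask, xor, protect):
        self.match, self.mask, self.xor, self.protect = match, mask, xor, protect

    def evaluate(self, addr_high):
        diff = (addr_high ^ self.match) & ALL_ONES  # XOR gate 511
        masked = diff & self.mask                   # AND gate 512
        select = masked == 0                        # NOR gate 513
        xor_out = self.xor if select else 0         # AND gate 514
        prot_out = self.protect if select else 0    # AND gate 515
        return select, xor_out, prot_out


def translate(cells, addr_high):
    """Return (translated_high_bits, protection), or None if no single cell is selected."""
    outs = [cell.evaluate(addr_high) for cell in cells]
    if sum(1 for select, _, _ in outs if select) != 1:
        return None
    xor_value = reduce(lambda acc, o: acc | o[1], outs, 0)
    protection = reduce(lambda acc, o: acc | o[2], outs, 0)  # OR gate 570
    return addr_high ^ xor_value, protection


# Two hypothetical entries, each comparing only the upper 8 of the 16 bits.
cells = [
    Cell(match=0x1200, mask=0xFF00, xor=0x0300, protect=0b01),
    Cell(match=0x3400, mask=0xFF00, xor=0x00FF, protect=0b10),
]
```

In this model a mask bit of zero removes the corresponding address bit from the comparison, which follows from the mask being ANDed with the output of XOR gate 511.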
It should be noted that in an alternative, space-conserving embodiment of the present invention, translator 110 comprises a single cell 1. In this embodiment, the match register is all “0”s and the mask register is all “1”s so that they become ineffective. As a result, all of the local virtual addresses are translated by the single cell having a mapping defined by the XOR register 403 in that cell. Also, since only one cell (i.e. only one global virtual address space) is used, local protection is not required. Thus, the local-to-global translator would not need to generate any local protection at this point and the local protection portion of the cell would not be used.
Global virtual-to-physical address translator 130 is implemented in the same manner as translator 110 except that bus widths are increased to accommodate translation of the higher order 58 bits of the global virtual address. It should be apparent that the match, mask, and XOR values are also increased to 58 bits.
As shown in
The global TLB is similarly structured, having a different field for each of the mask, match, XOR and protect values. In other embodiments of the present invention, the number of entries in the global TLB ranges from 64 to 256.
As described above, each of the local and global TLBs generates protection information by passing the contents of the cell's protection register 404 (
The d field, when set, causes a detail exception to take place. This exception passes control to a software routine which may use a variety of methods to further determine whether an exception should be handled or suppressed for this particular memory reference. For example, it may determine from the virtual address which bytes within a memory block are to be accessed, and consult a bit-map for each byte as to whether the particular type of access is to proceed. In this manner, access may be controlled at the byte level. In order to suppress the handling of the exception, the software routine sets system state which is preserved only until the target instruction is re-executed upon returning from the exception handler. The system state is then reset so that detail exceptions may occur on the next successive instruction and particularly upon any further re-execution of the same instruction as may occur after a branch operation.
The r, w, x, and g fields define the minimum privilege level of the access being performed. In one embodiment of the present invention, each field is two bits wide thus allowing for four levels of privilege for each type of access (i.e. minimum privilege=3 and maximum privilege=0). The r field indicates the access privilege level for a read access, the w field indicates the privilege level for a write access, the x field indicates the privilege level for an execute access and the g field indicates the privilege level for a gateway access. A higher privilege level can always access a memory space having a lower privilege level, but a lower privilege level cannot access data stored at a higher privilege level unless a gateway access is specified.
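By way of illustration only, a check against the r, w, x, and g fields may be sketched as follows. The numeric ordering is an assumption taken from the parenthetical above, namely that level 0 is the maximum privilege and level 3 the minimum; the names `Protection` and `access_allowed` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Protection:
    """Two-bit minimum privilege level for each type of access."""
    r: int  # read
    w: int  # write
    x: int  # execute
    g: int  # gateway


def access_allowed(prot, kind, task_level):
    """Return True if a task at task_level may perform the given kind of access.

    Assumption: 0 is the most privileged level and 3 the least, so an access
    is permitted when the task's level is numerically no greater than the
    level stored in the corresponding field.
    """
    if kind not in ("r", "w", "x", "g"):
        raise ValueError("unknown access type: " + kind)
    if not 0 <= task_level <= 3:
        raise ValueError("privilege level must fit in two bits")
    return task_level <= getattr(prot, kind)


# Hypothetical region: readable by any level, writable only by levels 0 and 1.
region = Protection(r=3, w=1, x=2, g=3)
```

With the example region, a task at the least privileged level may read but not write, while a task at level 0 may perform every type of access.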
It should be noted that, since cache coherency information is provided in addition to the access privilege level at the local virtual, global virtual and physical memory levels, cache coherency is maintained throughout the virtual memory system of the present invention at each of these levels.
Gateway accesses provide a means of accessing high privileged areas of memory in a very controlled manner. For instance, it may be desired to protect certain types of routines, such as operating system software, in a more robust manner than other software routines stored in memory. To provide this protection, the system software is stored in an area of memory (physical or virtual) having the highest privilege level, where only accesses having that privilege level are able to access the system software. Further access control to this area is imposed by ensuring that lower privileged accesses are only allowed to access this high privileged area by passing through a gateway. The gateway controls the specific entrance and exit points into the privileged memory area.
Gateway accesses are implemented with gateway pointers which are 128 bits of data providing two pointers. The first pointer gives the memory address of where the higher priority area of memory is to be entered. The second pointer is 64 bits of data that can be used for a general purpose, but has the special characteristic that it is loaded as the gateway is entered and that the target procedure may depend upon it being properly loaded into a particular general register of the processor. Because of this characteristic, it is normally loaded with a pointer to a data region to be employed by the target procedure.
Gateways are implemented by the gateway instruction of the present invention.
The Branch: Gateway: Immediate (B.GATE.I) Operation
The B.GATE.I operation provides a secure means to call a procedure, including those at a higher privilege level. The operation code for the gateway instruction format shown in
Operation Code
B.GATE.I
Branch gateway immediate
For the instruction format shown in
An access disallowed exception occurs if the new privilege level is greater than the privilege level required to write the memory data, or if the old privilege level is lower than the privilege required to access the memory data as a gateway. Furthermore, an access disallowed exception occurs if the target virtual address is a higher privilege than the current level and gateway access is not set for the gateway virtual address, or if the access is not aligned on a 16-byte boundary. A reserved instruction exception occurs if the rb field is non-zero.
The following is a definition in terms of an algorithm showing the operations performed to implement the B.GATE.I instruction of the present invention.
Definition
def BranchGatewayImmediate(ra,rb,imm) as
This operation provides a secure means to call a procedure, including those at a higher privilege level. The operation code for the gateway instruction format shown in
Operation Codes
B.GATE
Branch gateway
For the instruction format shown in
An access disallowed exception occurs if the new privilege level is greater than the privilege level required to write the memory data, or if the old privilege level is lower than the privilege required to access the memory data as a gateway. An access disallowed exception occurs if the target virtual address is a higher privilege than the current level and gateway access is not set for the gateway virtual address, or if the access is not aligned on a 16-byte boundary. A reserved instruction exception occurs if the rb field is non-zero.
The following is a definition in terms of an algorithm showing the operations performed to implement the B.GATE instruction of the present invention.
Definition
def BranchGateway(ra,rb,rc) as
The gateway operation includes several steps. First, the address stored at the first pointer is loaded into the CPU's program counter, causing the CPU to start fetching instructions at that address, performing the instructions, and incrementing the program counter. Further, the previous address stored in the program counter is stored in the location of the first pointer. In addition, the second pointer indicates the address of a data register that is initialized with any data needed to process the instructions. After the access operation is performed, the value stored in the first register is reloaded into the program counter so that processing can return back to its original processing location. This gateway method, i.e. 1) setting the program counter to an entry point address, 2) loading another register with the program counter's previous value, and then 3) initializing a data register, reduces processing steps in comparison to prior art methods that utilize CPU status registers.
In one embodiment of the present invention a 64 bit program counter register is employed where the lower order two bits of the register correspond to the privilege level of the access and the rest of the bits make up the instruction address. At the beginning of the gateway operation, the new instruction address provided by the 128 bit gateway is loaded into the program counter register along with its corresponding new privilege level. At the end of the gateway procedure, another instruction is executed which causes the program counter to be reloaded with the return address and the old privilege level. In this way, privilege level consistency is maintained.
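The program counter layout described above can be modeled with a short sketch. The helper names are hypothetical, and the model assumes instruction addresses whose two low-order bits are zero so that those bits are free to carry the privilege level; it makes no assumption about which numeric value denotes more privilege.

```python
PRIV_MASK = 0b11                        # low-order two bits: privilege level
ADDR_MASK = ((1 << 64) - 1) ^ PRIV_MASK  # remaining 62 bits: instruction address


def pack_pc(address, privilege):
    """Combine an instruction address and a privilege level into one 64-bit value."""
    if address & ~ADDR_MASK:
        raise ValueError("address must fit in 64 bits with its low two bits clear")
    if not 0 <= privilege <= 3:
        raise ValueError("privilege level must fit in two bits")
    return address | privilege


def unpack_pc(pc):
    """Split a program counter value into (instruction address, privilege level)."""
    return pc & ADDR_MASK, pc & PRIV_MASK


def enter_gateway(current_pc, gateway_pc):
    """Load the gateway's address and privilege; return (new_pc, saved_pc)."""
    return gateway_pc, current_pc


def return_from_gateway(saved_pc):
    """Reload the return address together with the old privilege level."""
    return saved_pc
```

Because the address and privilege level travel in a single register value, saving and restoring that one value is enough to restore both, which is the consistency property the paragraph describes.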
Each of the cache portions 300 for each of the instruction and data caches utilizes its corresponding cache tag. For instance, the cache portion in the instruction cache 111 utilizes cache tag 112 and the cache portion in data cache 121 utilizes cache tag 122. When cache portion 300 of cache 111 is accessed, a portion of the instruction address that is used to access the cache portion is also coupled to cache tag 112. Buffer portion 301, on the other hand, does not make use of cache tags 112 or 122. Caches 111 and 121 are configured with 2 bits each of configuration information: instruction cache configuration (icc) information and data cache configuration (dcc) information, respectively, that are initially set in a status register when the processor is configured by software prior to running tasks.
In one embodiment of the present invention, 32K bytes of memory are utilized for each of instruction and data caches 111 and 121. In the case of instruction cache 111, when icc=0, zero bytes of cache 111 function as a cache portion and 32K bytes of cache 111 function as a buffer portion. For icc=1, cache 111 is divided into 4K bytes of a cache portion and 28K bytes of a buffer portion. For icc=2, 8K bytes are utilized as cache and 24K bytes function as buffer, and finally for icc=3, 16K bytes are cache and 16K bytes are buffer. Thus, for this embodiment, a maximum of half of the 32K bytes of cache 111 may be used as cache and a maximum of 32K bytes of cache 111 may be used as buffer.
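The four icc settings listed above can be captured in a small lookup, shown here as an illustrative sketch. The function and constant names are hypothetical; the sizes are taken directly from this embodiment.

```python
KIB = 1024
TOTAL_BYTES = 32 * KIB  # size of instruction cache 111 in this embodiment

# Cache-portion size for each 2-bit icc value, as listed in the text.
CACHE_BYTES_BY_ICC = {0: 0, 1: 4 * KIB, 2: 8 * KIB, 3: 16 * KIB}


def partition(icc):
    """Return (cache_bytes, buffer_bytes) for a given icc setting."""
    if icc not in CACHE_BYTES_BY_ICC:
        raise ValueError("icc is a 2-bit field; expected a value from 0 to 3")
    cache_bytes = CACHE_BYTES_BY_ICC[icc]
    return cache_bytes, TOTAL_BYTES - cache_bytes
```

For example, `partition(1)` yields 4K bytes of cache and 28K bytes of buffer, and the two parts always sum to 32K bytes.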
In addition to the configuration information, a flag bit in the instruction address (a47, for this embodiment) is used to indicate whether the access is being performed with cache portion 300 or buffer portion 301 of either of caches 111 or 121. If a47=1, then buffer portion 301 is accessed with the address in the instruction register and if a47=0, cache portion 300 is accessed.
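As a minimal sketch of how the a47 flag distinguishes the two kinds of access, the following helper tests bit 47 of an address. The function name and the returned labels are illustrative assumptions; only the bit position and its meaning come from this embodiment.

```python
A47 = 1 << 47  # flag bit position used in this embodiment


def selected_portion(address):
    """Return which portion an access targets, based on address bit a47."""
    return "buffer portion 301" if address & A47 else "cache portion 300"
```

An address with bit 47 set, such as `A47 | 0x40`, is routed to the buffer portion, while `0x40` alone is routed to the cache portion.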
In the embodiment shown in
In one embodiment of the present invention the cache portion is divided into blocks of 64 bytes. Thus, in the case of the maximum cache portion size condition (i.e. icc=3), the 16K bytes of the cache portion are divided into 256 blocks of 64 bytes each. In order to access each of the 256 blocks, eight address bits are needed to access each cache entry (i.e. 2^8=256). In the embodiment shown in
The present invention also stores static information indicating the minimum page size of the memory. For the embodiment shown in
In one embodiment of the present invention, portions of the local-to-global virtual address translator are shared between the instruction path and the data path. The main translator is located in the data path as block 120, and the instruction translator 110 is loaded upon demand from the data translator 120. Consequently, the number of entries in instruction translator 110 need not be as large as that of data translator 120. In one embodiment, the instruction translator 110 holds a single entry. The single entry is loaded from the data translator 120 whenever either a branch operation occurs or a page boundary is crossed. In one embodiment, the single entry further may eliminate the masking and matching portions of the entry, thus consisting only of XOR data or the global virtual address which is coupled to address 102.
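The demand-loading behavior can be illustrated with a simplified software model. It assumes, for illustration only, that a data translator entry matches when the masked local address equals the match value and that the global address is the local address XORed with the entry's XOR value; the class names, the first-use reload and the reload counter are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    mask: int
    match: int
    xor: int


class DataTranslator:
    """Full translator (block 120 in the text), modeled as a list of entries."""

    def __init__(self, entries):
        self.entries = list(entries)

    def lookup(self, local_addr):
        for entry in self.entries:
            if (local_addr & entry.mask) == entry.match:  # assumed match rule
                return entry
        raise LookupError("no matching entry")


class InstructionTranslator:
    """Single-entry translator (block 110) that keeps only the XOR data."""

    def __init__(self, data_translator):
        self.data_translator = data_translator
        self.xor = None
        self.reloads = 0

    def translate(self, local_addr, branch_or_page_cross):
        # Reload on a branch or page crossing (and on first use in this model).
        if self.xor is None or branch_or_page_cross:
            self.xor = self.data_translator.lookup(local_addr).xor
            self.reloads += 1
        return local_addr ^ self.xor
```

Sequential fetches within a page reuse the stored XOR value without consulting the data translator, which is why a single entry can suffice on the instruction path.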
When the instruction cache 111 is accessed as a cache portion (a47=0), cache tag 112 is simultaneously accessed and provides previously cached protection information to the processor on bus 135 to multiplexer 154 (FIG. 7). In the case when cache 111 is accessed as a buffer portion (a47=1) and the cache tag is not used, protection information buffer 150 provides the appropriate protection information on bus 156 to multiplexer 154 (FIG. 7). Instruction address bit a47, which indicates whether instruction cache 111 is configured as a buffer or a cache, controls multiplexer 154 so that it passes the correct protection information to the CPU. Protection information 160 (
Boundary crossing detector 151 functions to evaluate whether a page boundary is potentially crossed by detecting a carry in certain bits in the instruction address (bits a31-a6 in this embodiment), as controlled by the minimum page size. When the instruction address indicates that a boundary is potentially being crossed or the instruction corresponds to a branch operation, OR gate 152 outputs a reload buffer signal indicating that the protection information provided on bus 153 from the global TLB for that particular instruction is to be loaded into instruction protection buffer 150.
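The reload condition can be modeled in software as follows. This is a sketch under stated assumptions: the hardware detects a carry in the address bits, whereas this model compares page numbers of consecutive addresses, which gives the same answer for a power-of-two page size. The function names and the 4096-byte page size used in the usage note are illustrative, not taken from this description.

```python
def crosses_page(addr, next_addr, min_page_size):
    """True when two consecutive instruction addresses lie in different pages."""
    if min_page_size <= 0 or min_page_size & (min_page_size - 1):
        raise ValueError("page size must be a positive power of two")
    return addr // min_page_size != next_addr // min_page_size


def reload_buffer(addr, next_addr, is_branch, min_page_size):
    """Model of OR gate 152: reload on a branch or a potential page crossing."""
    return is_branch or crosses_page(addr, next_addr, min_page_size)
```

With an assumed 4096-byte minimum page, stepping from 0x0FFC to 0x1000 asserts the reload signal, while stepping from 0x1000 to 0x1004 does not unless the instruction is a branch.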
Protection information buffer 150 is used when translating consecutive addresses, such as instruction addresses. For instance, when an instruction address is first translated by local-to-global TLB 110 and then translated by global-to-physical TLB 130, protection information is provided to the processor by the global TLB. For the previously described embodiment shown in
In the embodiment of the present invention shown in
The virtual memory of the present invention may be implemented in a computer system having multiple simultaneous threads of execution, referred to as a superthreaded or multi-threaded computer system. The computer system is designed such that it employs a different copy of the register file, the local-to-global TLB and the protection information buffer for each thread, and only one data cache, instruction cache and global-to-physical TLB for all threads. Consequently, each of the threads of the machine may correspond to different tasks that operate independently and may be protected from each other. Data may be shared or maintained independently between each thread using the virtual memory system of the present invention since each thread has a distinct version of the local-to-global TLB so that they may use the same address to mean different things, or may use different addresses to mean the same thing, or may use the same address to reference the same memory, depending on the settings of the local-to-global translators.
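The three sharing cases named above can be illustrated with a toy model in which each thread's local-to-global translation is reduced to a single XOR value. The thread names, XOR values and addresses are invented for illustration, and a real translator would hold multiple entries with mask and match fields.

```python
# Hypothetical per-thread local-to-global settings (one XOR value per thread).
THREAD_XOR = {
    "thread_a": 0x1000_0000,
    "thread_b": 0x2000_0000,
    "thread_c": 0x1000_0000,  # same setting as thread_a
}


def to_global(thread, local_addr):
    """Translate a thread's local virtual address to a global virtual address."""
    return local_addr ^ THREAD_XOR[thread]


# Same local address, different meaning in threads a and b.
same_addr_differs = to_global("thread_a", 0x400) != to_global("thread_b", 0x400)

# Different local addresses, same global location.
diff_addr_same = to_global("thread_a", 0x2000_0400) == to_global("thread_b", 0x1000_0400)

# Same local address, same global location in threads a and c.
same_addr_same = to_global("thread_a", 0x400) == to_global("thread_c", 0x400)
```

All three flags evaluate to true in this model, showing that isolation or sharing between threads follows entirely from how each thread's local-to-global translator is set.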
Thus, a virtual memory system utilizing local virtual and global virtual addresses is described. Although the elements of the present invention have been described in conjunction with a certain embodiment, it is appreciated that the invention may be implemented in a variety of other ways. Consequently, it is to be understood that the particular embodiments shown and described by way of illustration are in no way intended to be considered limiting. Reference to the details of these embodiments is not intended to limit the scope of the claims which themselves recite only those features regarded as essential to the invention.