A system for writing to a cache line, the system including: at least one processor; and at least one memory having stored thereon instructions that, when executed by the at least one processor, controls the at least one processor to: pre-emptively invalidate a cache line at a reader device; receive, from the reader device, a read request for the invalidated cache line; delay a response to the read request; and after the delay, output for transmission a response to the read request to the reader device.
1. A method for reducing observed write transaction latency of a writer in a cache coherent shared memory system, the method comprising:
pre-emptively invalidating a cache line;
receiving a read request for the invalidated cache line;
temporarily withholding a response to the read request;
responding, after temporarily withholding the response, to the read request; and
pre-emptively re-invalidating the cache line after responding to a read request.
10. A method for reducing an amount of wasted work done by a reader in a cache coherent shared memory system capable of out-of-order processing polling on a cache line, the method comprising:
speculatively invalidating a cache line;
receiving, by a writer, a read request for the invalidated cache line;
withholding, by the writer, a response to the read request until a condition is met; and
speculatively re-invalidating the cache line again after responding to the read request when the condition is met.
12. A system for writing to a cache line, the system comprising:
at least one processor; and
at least one memory having stored thereon instructions that, when executed by the at least one processor, controls the at least one processor to:
pre-emptively invalidate a cache line at a reader device;
receive, from the reader device, a read request for the invalidated cache line;
delay a response to the read request;
after the delay, output for transmission a response to the read request to the reader device; and
pre-emptively re-invalidate the cache line at the reader device after responding to a read request.
2. The method of
3. The method of
4. The method of
5. The method of
7. The method of
8. The method of
9. The method of
11. The method of
13. The system of
14. The system of
15. The system of
16. The system of
17. The system of
18. The system of
This application claims the benefit of U.S. Provisional Application No. 62/585,153, filed Nov. 13, 2017.
Modern computer systems include distributed cache memories to speed access to memory shared among multiple components in a system. The shared memory systems that include cache memories typically utilize a cache coherency protocol such as MOESI, MESI, MESIF, or another related cache coherency protocol. As will be understood by one of ordinary skill, under these protocols, cache lines may be assigned (and transition between) various different states, such as “Modified” (“M”), “Exclusive” (“E”), “Shared” (“S”), “Invalid” (“I”), “Owned” (“O”), and “Forward” (“F”). The protocols are designed to arbitrate shared memory utilization in a coherent and consistent manner among multiple components in the presence of distributed cache memories.
Shared memory can be logically organized into units called cache lines. Copies of a particular cache line may be present in multiple components' local cache memories. In many implementations of cache coherency, to maintain coherency and consistency, the protocols require that a component intending to write to a cache line first notify all other components (or a directory) in the system of the component's intent to write to the cache line and then confirm that the component has the only writable copy of the cache line in question. Put differently, the component must gain “Modified” (also commonly referred to as “Dirty”) or “Exclusive” (also commonly referred to as “Valid”) state on its own local copy of the cache line. In the research literature, this technique is commonly called “invalidation.” Note that invalidation may be in the form of explicit invalidation or implied in actions such as, but not limited to, read for exclusive control. Modified (“M”) and Exclusive (“E”) states share a property—the writer with a local copy of a cache line in those states is the only component in the system that has permission to write to the cache line if the system's shared memory is to stay coherent and consistent.
When writing is initiated and the writer's local copy of the relevant cache line is not already in the M or E state, the write is delayed by “coordination overhead” wherein the system expends time and resources granting M or E state to the writer's local copy of the cache line. This coordination overhead therefore increases “Observed Latency” (i.e., the time which elapses between when the writer initiates a write to a cache line and when the data is permitted to be read from that cache line). In the related art, a writer can pre-emptively invalidate all remote copies so as to hide the coordination overhead, thereby reducing Observed Latency. For many workflows, pre-emptive invalidation (i.e., “write prefetch”) is effective. However, once a reader requests a cache line that has been invalidated pre-emptively and the request is granted prior to the writer initiating its write, the pre-emptive invalidation becomes wasted work because granting the reader's request moves the writer's copy of the cache line out of M or E state. In this scenario, the writer must again incur the coordination overhead at a future time when it wants to initiate a write and consequently will have wasted resources in the system on its unused pre-emptive invalidation. Generally, pre-emptive invalidation approaches known in the related art result in wasted work and therefore are not implemented.
Embodiments of the disclosed technology address the issues mentioned above and lower the latency experienced in data transfers within a shared memory architecture in the presence of distributed cache memories.
According to some embodiments, there is provided a method for reducing observed write transaction latency of a writer in a cache coherent shared memory system, the method including: pre-emptively invalidating a cache line; receiving a read request for the invalidated cache line; temporarily withholding a response to the read request; responding, after temporarily withholding the response, to the read request; and pre-emptively re-invalidating the cache line after responding to a read request.
Temporarily withholding the response to the read request may include withholding the response until the writer writes fresh data to the cache line.
The method may further include determining that the writer has interest in the cache line, wherein temporarily withholding the response to the read request comprises withholding the response until the writer indicates that it is no longer interested in the cache line.
Temporarily withholding the response to the read request may include withholding the response until a timeout occurs.
The timeout may be based from a time of pre-emptively invalidating the cache line. The timeout may be based from a time of receiving the read request.
The method may further include aborting the timeout in response to the writer determining that a write command to the cache line is unlikely to occur during the timeout period.
The timeout length may be based on at least one from among a historical rate of data write commands to the cache line, a historical frequency of read requests for the cache line, and an estimated fabric delay between the writer and the reader.
The method may further include temporarily delaying the pre-emptive re-invalidation based on historical read requests rates.
According to some embodiments, there is provided a method for reducing the amount of wasted work done by a reader in a cache coherent shared memory system capable of out-of-order processing polling on a cache line, the method including: speculatively invalidating a cache line; receiving, by a writer, a read request for the invalidated cache line; and withholding, by the writer, a response to the read request until a condition is met.
The method may further include speculatively re-invalidating the cache line again after responding to the read request.
The condition may include a first from among the writer writing useful data to the cache line, determining that the writer is no longer interested in holding the cache line, and a timeout occurring.
According to some embodiments, there is provided a system for writing to a cache line, the system including: at least one processor; and at least one memory having stored thereon instructions that, when executed by the at least one processor, controls the at least one processor to: pre-emptively invalidate a cache line at a reader device; receive, from the reader device, a read request for the invalidated cache line; delay a response to the read request; and after the delay, output for transmission a response to the read request to the reader device.
The instructions, when executed by the at least one processor, may further control the at least one processor to pre-emptively re-invalidate the cache line at the reader device after responding to a read request.
The instructions, when executed by the at least one processor, may further control the at least one processor to delay the pre-emptive re-invalidation based on historical read requests rates.
Delaying the response to the read request may include delaying the response until at least one of the system writes fresh data to the cache line and the system is no longer interested in the cache line.
Delaying the response to the read request may include delaying the response until a timeout occurs.
The timeout may be based from either a time of pre-emptively invalidating the cache line or a time of receiving the read request.
The instructions, when executed by the at least one processor, may further control the at least one processor to abort the timeout in response to determining that a write command to the cache line is unlikely to occur during the timeout period.
The instructions, when executed by the at least one processor, may further control the at least one processor to determine a timeout length based on at least one from among a historical rate of data write commands to the cache line, a historical frequency of read requests for the cache line, and an estimated fabric delay between the system and the reader device.
Implementations, features, and aspects of the disclosed technology are described in detail herein and are considered a part of the claimed disclosed technology. Other implementations, features, and aspects can be understood with reference to the following detailed description, accompanying drawings, and claims. Reference will now be made to the accompanying figures and flow diagrams, which are not necessarily drawn to scale.
A related art scheme for exchanging data between two components in a shared memory system is to have the reader constantly poll a particular memory location until the writer updates the memory location and indicates that it is safe for the reader to proceed. Sample implementations make use of spin locks, semaphores, ring buffers, and indicator flags among other techniques. It will be appreciated that in the presence of a continuously polling reader, the writer's local copy of the contended cache line will rarely be in an M or E state at the moment the writer initiates a write because the writer's responses to the frequent read requests move, or “downgrade,” the writer's local copy of the cache line out of either of these two states. Therefore, a write initiated by the writer almost always requires it to first gain M or E state for its local copy of a cache line. As explained, this process involves invalidating all other copies of the cache line in the system thereby incurring a coordination overhead which increases the Observed Latency of writer-to-reader data transfer relative to the case in which the writer's local cache line is already in M or E state. Further, the reader's polling needlessly consumes instruction execution resources. Accordingly, the related art's approaches to reducing latency have various technical drawbacks.
The disclosed technology hides coordination overhead typically required by a cache line write by speculatively and pre-emptively undertaking the steps necessary to acquire an M or E state. As such, it reduces Observed Latency of cache writes. In some implementations, anticipating that a polling reader will soon request the latest updated copy of the cache line in an M or E state, the writer unilaterally withholds the response to a read request regardless of whether or not there is data to be written until the first to occur of (i) a predetermined amount of time has elapsed or (ii) the writer allows the read request to be processed. In the case of (i), the writer relinquishes the M or E state because otherwise the reader will infer that an error has occurred. Additionally, after responding to the reader, because the writer is still interested in the cache line just relinquished, the writer will again undertake the steps necessary to acquire an M or E state speculatively. In the case of (ii), the writer has determined that it no longer requires the cache line and therefore does not re-initiate invalidation, either because it is done with the write and wants to make the updated cache line available to be read or because it has determined that the write is no longer necessary.
Typically, a polling reader, upon learning that its local copy of a particular cache line has been invalidated by a writer, will send another read request immediately. In conventional schemes, the writer responds immediately to a read request and relinquishes the M or E state that its local copy of the cache line has just gained. Therefore, a potential write only has a fleeting moment to catch the cache line in an M or E state so that it can successfully finish its write. By withholding its response to a read request, the writer increases its window of opportunity for a fast finish of a write, thereby avoiding the coordination overhead and associated extra latency of acquiring an M or E state. It will be appreciated that in practice, the held-up read response is treated by the reader as an abnormally delayed reply. As such, typically no modification to the logic in the reader is required to maintain interoperability with writers that can withhold read responses. Additionally, since most modern processors are able to execute instructions out-of-order and/or execute multiple threads simultaneously (e.g., SMT and Hyperthreading), a reading processor which is not validating the value read is not expending resources executing useless instructions. This frees up processor resources to execute useful instructions while waiting for the held-up read response to arrive.
In cache coherency protocols that support “updates,” modified cache lines can be pushed out to the interested readers rather than just invalidated and then requested by the readers. In this model of cache coherency, a polling reader may not need to repeatedly request data from the writer through the data fabric. However, if the protocol design necessitates that the writer gain some sort of special state that the reader, by the very act of initiating a read, subsequently alters, then the “update” case is no different than the “invalidate” case: traditional prefetch mechanisms are ineffective and the writer is forced to incur undesirable coordination overhead associated with regaining the special state for its local copy of the cache line whenever it needs to write to a cache line. The disclosed technology can be adapted to address this “update-based” implementation of cache coherency in addition to the “invalidate-based” implementation discussed at greater length herein.
Some implementations of the disclosed technology will be described more fully hereinafter with reference to the accompanying drawings. This disclosed technology may, however, be embodied in many different forms and should not be construed as limited to the implementations set forth herein.
In the following description, numerous specific details are set forth. It is to be understood, however, that implementations of the disclosed technology may be practiced without these specific details. In other instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description. References to “one implementation,” “an implementation,” “example implementation,” “various implementations,” etc., indicate that the implementation(s) of the disclosed technology so described may include a particular feature, structure, or characteristic, but not every implementation necessarily includes the particular feature, structure, or characteristic. Further, repeated use of the phrase “in one implementation” does not necessarily refer to the same implementation, although it may.
Throughout the specification and the claims, the following terms take at least the meanings explicitly associated herein, unless the context clearly dictates otherwise. The term “connected” means that one function, feature, structure, or characteristic is directly joined to or in communication with another function, feature, structure, or characteristic. The term “coupled” means that one function, feature, structure, or characteristic is directly or indirectly joined to or in communication with another function, feature, structure, or characteristic. The term “or” is intended to mean an inclusive “or.” Further, the terms “a,” “an,” and “the” are intended to mean one or more unless specified otherwise or clear from the context to be directed to a singular form.
As used herein, unless otherwise specified the use of the ordinal adjectives “first,” “second,” “third,” etc., to describe a common object, merely indicate that different instances of like objects are being referred to, and are not intended to imply that the objects so described must be in a given sequence, either temporally, spatially, in ranking, or in any other manner.
Example implementations of the disclosed technology will now be described with reference to the accompanying figures.
In computing systems, such as a computing system having the architecture shown in
As will be appreciated by one of skill in the art, a second factor when considering latency inherent in cache coherency-based communications schemes relates to the data writing side of the system. Again, referring to
Thus, in order for Actor B 220 to write to that region of memory in its local cache, several events must transpire. First, CCB 224 must send a control request to CCA 214. Second, CCA 214 must receive that request and mark Actor A 210's copies of the requested memory addresses as invalid to prevent Actor A 210's application logic 212 from potentially reading stale data. Third, CCA 214 must send an acknowledgment to CCB 224 that CCA 214 has indeed marked Actor A 210's copy of designated memory addresses as invalid. Finally, CCB 224 must receive CCA 214's acknowledgment message before it finally allows Actor B 220's application logic 222 to write the data to local memory. In sum, unless the memory location that Actor B 220's application logic 222 wishes to update with new data is already in M or E state, CCB 224 must undertake a time-consuming message passing exercise with CCA 214 to ensure that (i) Actor A 210's copy of the relevant memory location has been marked as invalid with respect to Actor A 210's potential reads and writes and (ii) CCB 224 has received acknowledgment from CCA 214 that Actor A 210's copy of the memory block has been marked invalid, at which time CCB 224 is able to mark the relevant cache lines in Actor B 220's cache as having M or E state and thus available for Actor B 220's writes.
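The message exchange described above can be sketched as a toy model. This is only an illustration under stated assumptions, not the disclosed implementation: the `CacheController` class, the `acquire_write_permission` function, and the choice to charge exactly one fabric delay per message are hypothetical names and simplifications introduced here.

```python
from dataclasses import dataclass, field

M, E, S, I = "M", "E", "S", "I"  # cache line states used in this sketch


@dataclass
class CacheController:
    """Hypothetical stand-in for a cache controller such as CCA or CCB."""
    name: str
    state: str = S
    log: list = field(default_factory=list)

    def handle_invalidate(self) -> str:
        # Mark the local copy invalid, then acknowledge the requester.
        self.state = I
        self.log.append(f"{self.name}: copy marked invalid, ack sent")
        return "ack"


def acquire_write_permission(writer, reader, fabric_delay_ns):
    """Return the coordination overhead (ns) paid before the writer may write."""
    if writer.state in (M, E):
        return 0  # already writable, so no message exchange is needed
    elapsed = fabric_delay_ns           # control request: writer -> reader
    ack = reader.handle_invalidate()    # reader invalidates its copy
    elapsed += fabric_delay_ns          # acknowledgment: reader -> writer
    if ack != "ack":
        raise RuntimeError("invalidation was not acknowledged")
    writer.state = E                    # writer's copy is now writable
    writer.log.append(f"{writer.name}: ack received, line set to E")
    return elapsed


cca = CacheController("CCA")
ccb = CacheController("CCB")
first = acquire_write_permission(ccb, cca, fabric_delay_ns=200)
second = acquire_write_permission(ccb, cca, fabric_delay_ns=200)
print(first, second)
```

In this model the first call pays the full round trip, while the second call pays nothing because the writer's copy is already in E state, which is the situation pre-emptive invalidation aims to create.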
In some instances, the application logic 222 in Actor B 220, periodically, but unpredictably, has new data that should be made available to Actor A 210's application logic 212 as quickly as possible. As shown in the preceding discussion, in the sample cache coherent system considered here, two time-consuming processes relating to CCA 214 and CCB 224 coordination must be undertaken before the updated data is made available to Actor A 210's logic 212. First, CCB 224 must ensure that Actor B 220 has obtained exclusive rights to write to a designated region of its local cache memory; in other words, CCB 224 must send a message to CCA 214 and wait for an affirmative response before it can put Actor B 220's designated cache lines into E or M state (so Actor B 220 is permitted to write to its local memory). Second, even after Actor B 220 has written to its local cache memory, considerable time may elapse before CCB 224 happens to receive a read request made on Actor A 210's behalf. As discussed above, when Actor B 220 has obtained exclusive rights to write to the designated region, application logic 212 in Actor A 210 will only get access to the new data when CCA 214 sends a read request to CCB 224 having recognized that Actor A 210's local copy of a region of memory is invalid (a function of CCB 224's earlier procedure of gaining E or M state for Actor B 220's relevant cache lines), thus requiring CCA 214 to reach out to Actor B 220. To the extent Actor B 220 has written the new data to Actor B 220's local cache, CCB 224 can then respond to CCA 214's read request by including the new data in the read response message.
The methods described in this disclosure function to lower the overall Observed Latency that arises when the writer has data which the developer wishes to be made available to the reader in two ways. In some embodiments, the disclosed systems and methods avoid contribution to latency arising out of the coordination overhead associated with CCB 224 gaining E or M state for Actor B 220 by having CCB 224 optimistically request E or M state for a designated region of memory prior to Actor B 220's application logic 222 actually having new data it wishes to write to memory. In this way, Actor B 220 can write data to its local cache the moment the data is made available to the application logic 222 without having to wait to first gain write permission to its local memory.
In some embodiments, the disclosure details how the Reader's (Actor A 210) cache controller 214, in the foregoing example, might optimistically poll the Writer's (Actor B 220) cache controller 224, for any new data in a designated region of memory in Actor B 220's cache. In many cases, CCB 224 may recognize that there is no new data at the time the read request from CCA 214 arrives. However, by holding on to CCA 214's read request and not responding immediately, CCB 224 has a high probability of being able to respond to the read request with new data in Actor B 220's cache that was written some time after the read request arrived. In this way, CCB 224 has an opportunity to transfer to CCA 214, and thus Actor A 210's application logic 212, new data as soon as it is written to Actor B 220's cache rather than having to wait for a new read request to arrive from CCA 214 some time after new data is written to Actor B 220's cache. To avoid destabilizing the system, in some implementations, even in the absence of new data, CCB 224 will be forced to periodically respond to CCA 214's read request and simultaneously relinquish the E or M state on the designated region of Actor B 220's memory. In such implementations, as discussed in this disclosure, the system is designed wherein CCB 224 may again proactively reacquire write permissions for Actor B 220 and optimistically hold a subsequent read request generated from CCA 214 which has been instructed to poll CCB 224 with read requests.
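The effect of holding a read request rather than answering it immediately can be sketched with a small timing model. The function name `respond_to_held_read`, its parameters, and the single-write simplification are assumptions made for illustration; the source does not prescribe this interface.

```python
def respond_to_held_read(read_arrival_ns, write_time_ns, max_hold_ns):
    """Decide when a held read request is answered and whether it carries new data.

    read_arrival_ns: time the read request reaches the writer's controller.
    write_time_ns:   time the writer's logic writes new data, or None if it never does.
    max_hold_ns:     longest time the controller is willing to withhold the response.
    Returns (response_time_ns, carries_new_data).
    """
    deadline_ns = read_arrival_ns + max_hold_ns
    if write_time_ns is not None and write_time_ns <= deadline_ns:
        # New data exists by the deadline: answer as soon as both the
        # request and the data are present.
        return max(read_arrival_ns, write_time_ns), True
    # No new data in time: respond anyway so the reader does not
    # conclude that an error has occurred.
    return deadline_ns, False


# Data written 150 ns after the request arrived is forwarded at once.
print(respond_to_held_read(0, 150, 400))
# A write that comes too late misses this held request.
print(respond_to_held_read(0, 900, 400))
```

The forced response in the second branch corresponds to the periodic response described above, after which the controller may reacquire write permission and hold the next read request.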
While the foregoing discussion is based on Actor B 220 only writing to its local caches, one of ordinary skill will recognize that implementations of the disclosed technology can make use of “write-through,” wherein Actor B 220 sets a cache line to E or M in another device (e.g., Actor A 210), such as a system's main memory, and simultaneously changes its local cache to invalid. As will be appreciated by one of skill in the art, any variations needed to adapt the disclosed technology to this write-through approach will be readily apparent in light of the present disclosure.
In
At some point while reader 395's copy is invalid, reader 395 polls its local copy (“Read Data”). Reader 395 then requests an update from writer 390 (“Read”). When received, writer 390 persists its local copy (e.g., changing the state from M to I or S), and transfers the updated copy to reader 395 (“ReadResp”). Reader 395 updates its local copy and sets the flag from I to S. The time from writer 390 receiving the read request from reader 395 to updating reader's local copy (“Xfer”) is at least equal to the fabric delay. Although as illustrated writer 390 awaits a read request from reader 395 to update reader 395's copy of the data, this is merely an example and, in some cases, writer 390 may release the data and update reader 395's copy of the data as soon as writer 390's local copy is updated. Furthermore, as the data is ready to be updated at least one fabric delay before reader 395's cache line is notified, reader 395's local copy may return an outdated read.
As a non-limiting example, if the fabric delay is 200 nanoseconds (ns), then the Observed Latency (i.e., the time between the write data command from writer 390 and updating the reader 395 copy) is at least 600 ns (i.e., 3*200-ns) of which 400 ns (i.e., 2*200-ns) is due to the act of acquiring M or E state (i.e., “coordination delay” represented by “Inv Overhead” in the figure) and 200-ns is due to actual data transfer from writer 390 to reader 395 (i.e., “Xfer” in the figure) between the writer's and the reader's local cache lines. Note that the 600 ns value assumes that the gap between the “Inv Overhead” and the “Xfer” is at its theoretical minimum of zero, though the figure, for illustrative purposes, shows a small gap of time (e.g., to account for write times of writer 390, and/or any processing delays of writer 390 or reader 395).
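The arithmetic in this example can be written out as a short calculation. The function `observed_latency_ns` and its parameters are hypothetical names; the model simply adds the invalidation overhead (two fabric delays, or zero when the writer's copy is already in M or E state), an optional gap, and one fabric delay for the transfer.

```python
def observed_latency_ns(fabric_delay_ns, already_m_or_e=False, gap_ns=0):
    """Lower bound on Observed Latency under the simple model described above."""
    # "Inv Overhead": request plus acknowledgment, skipped if already writable.
    inv_overhead_ns = 0 if already_m_or_e else 2 * fabric_delay_ns
    # "Xfer": at least one fabric delay to move the data to the reader.
    xfer_ns = fabric_delay_ns
    return inv_overhead_ns + gap_ns + xfer_ns


baseline_ns = observed_latency_ns(200)
prefetched_ns = observed_latency_ns(200, already_m_or_e=True)
print(baseline_ns, prefetched_ns)
```

The second value is the best case in this model, reached only when the write lands while the writer's copy is already in M or E state; it is an inference from the model rather than a figure stated in the text.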
As shown in timing diagram 400 of
In aspects of the disclosed technology, writer 690 unilaterally withholds the read response, thereby increasing the likelihood that, when writer 690 wants to write, it does not have to incur the coordination overhead (of at least two times the fabric delay) associated with gaining E or M state in its local copy of the cache line. As a result, Observed Latency can be reduced.
As will be appreciated, writer 790 cannot withhold read responses indefinitely, however, because reader 395 will assume that some critical error, such as one arising from a defective component, has occurred after a timeout period (i.e., when writer 790 does not respond within some predetermined amount of time). Accordingly, writer 790 must respond to the reader before such a timeout is reached and relinquish M or E state in its local copy of the cache line. However, writer 790 can immediately re-initiate the prefetching invalidation process (i.e., “refetch”) and regain the very E or M state in the local copy that it had just relinquished.
As an example, assuming a 200 ns delay in traversing the data fabric, a maximum withholding time of 400 ns, and an immediate polling by reader 395, writer 790 has around a 50% chance (i.e., 400 ns/(2*200 ns+400 ns)) of avoiding the coordination overhead completely and incurs some fraction of the coordination overhead in the other 50% of cases. As another example, assuming a 200 ns delay in traversing the data fabric, a maximum withholding time of 400 ns, and polling by reader 395 on average 200 ns after refetching, writer 790 has around a 60% chance (i.e., (400 ns+200 ns)/(2*200 ns+400 ns+200 ns)) of avoiding coordination overhead completely and incurs some fraction of the coordination overhead in the other 40% of cases. If writer 790 initiates a write immediately after the re-initiation of the invalidation process, writer 790 will incur the maximum coordination overhead of 400 ns. If writer 790 initiates a write near the end of the invalidation process, writer 790 will incur almost no coordination overhead. If writer 790 initiates a write after the end of the invalidation process and/or while writer 790 is withholding the read response, writer 790 completely avoids the coordination overhead.
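The two percentages above follow from one ratio, sketched below. The function names are hypothetical. The second function goes beyond the text: it assumes a write is equally likely at any instant of the refetch cycle, so that a write landing inside the invalidation window waits, on average, half of that window.

```python
def chance_of_avoiding_overhead(fabric_delay_ns, max_withhold_ns, poll_gap_ns=0):
    """Fraction of the refetch cycle during which a write pays no overhead."""
    invalidation_ns = 2 * fabric_delay_ns          # time spent regaining M or E
    covered_ns = max_withhold_ns + poll_gap_ns     # time the line sits in M or E
    return covered_ns / (invalidation_ns + covered_ns)


def expected_overhead_ns(fabric_delay_ns, max_withhold_ns, poll_gap_ns=0):
    """Mean overhead per write, ASSUMING uniformly distributed write times."""
    invalidation_ns = 2 * fabric_delay_ns
    cycle_ns = invalidation_ns + max_withhold_ns + poll_gap_ns
    chance_inside = invalidation_ns / cycle_ns
    # A write that lands inside the window waits for the remainder of it,
    # which averages half the window under the uniform assumption.
    return chance_inside * invalidation_ns / 2


immediate_poll = chance_of_avoiding_overhead(200, 400)
delayed_poll = chance_of_avoiding_overhead(200, 400, poll_gap_ns=200)
print(immediate_poll, delayed_poll)
print(expected_overhead_ns(200, 400), expected_overhead_ns(200, 400, 200))
```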
In some cases, the withholding time may be variable (e.g., between a maximum withholding period and no withholding period). For instance, if writer 790 predicts that no write data instructions will be received within the maximum withholding period from receiving the read request, writer 790 may ignore the withholding period, immediately release the local cache, and immediately respond to the read request. In some cases, historical write data and read requests may be tracked, and at least one of the refetching timing and holding periods may be adjusted. For example, if writer 790 identifies read request patterns (e.g., that they tend to come in bursts), it may delay initiating refetching until after a determined time period has elapsed after a read request. As another example, if writer 790 determines that write data commands occur very infrequently, writer 790 may halt prefetching. If write data commands later increase, writer 790 may again perform prefetching. One of ordinary skill will understand that these are merely examples, and various modifications and alternatives would be apparent in light of the present disclosure.
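One possible way to turn tracked write history into these decisions is sketched below. Everything here is an assumption made for illustration: the Poisson model of write arrivals, the threshold values, and the function names `choose_withhold_ns` and `should_prefetch` do not come from the source, which leaves the prediction method open.

```python
import math


def write_probability(window_ns, writes_per_ns):
    """Chance of at least one write in the window, assuming Poisson arrivals."""
    return 1.0 - math.exp(-writes_per_ns * window_ns)


def choose_withhold_ns(max_withhold_ns, writes_per_ns, min_probability=0.05):
    """Withhold for the full period only if a write is reasonably likely."""
    if write_probability(max_withhold_ns, writes_per_ns) < min_probability:
        return 0  # respond immediately; withholding would likely be wasted
    return max_withhold_ns


def should_prefetch(writes_per_ns, fabric_delay_ns, min_writes_per_round_trip=1e-4):
    """Halt prefetching when writes are very rare relative to a round trip."""
    return writes_per_ns * 2 * fabric_delay_ns >= min_writes_per_round_trip


busy_rate = 1 / 1000   # hypothetical: one write per microsecond on average
idle_rate = 1e-9       # hypothetical: one write per second on average
print(choose_withhold_ns(400, busy_rate), choose_withhold_ns(400, idle_rate))
print(should_prefetch(busy_rate, 200), should_prefetch(idle_rate, 200))
```

A rate estimate of this kind could be refreshed from historical write commands, so that prefetching resumes once write activity increases again.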
In the flowchart 1100 of
Meanwhile, if the cache line is being withheld (1125—Yes), the writer determines 1150 whether a timeout has expired. For example, in some cases a response to a read request may only be withheld for a set period of time (e.g., twice a fabric delay) before the requesting reader believes a problem has occurred. If the timeout has not expired, the writer waits 1160. Once the timeout expires, the writer may again determine 1120 whether the local copy of the cache line is in E or M state. However, this is merely an example and, in some instances, after the writer waits 1160 for the timeout to expire, it may proceed with determining 1170 whether the writer is still interested in withholding the cache line. Similarly, if the timeout has expired (1150—Yes), the writer determines 1170 whether the writer is interested in withholding the cache line. If the writer is not interested (1170—No), the writer responds 1140 to the read request with data and downgrades its local copy of the cache line. Meanwhile, if the writer is still interested in withholding the cache line (1170—Yes), the writer sends 1180 a response to the reader with data, downgrades its local copy from E or M, and re-initiates the invalidation process (e.g., in order to hold the local copy of the cache line in E or M state). As a consequence of the withholding, the odds increase of the writer incurring lower Observed Latency if it attempts to write data shortly after receiving the read request, as the cache line continues to be in the E or M state.
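The branch of the flowchart described in this paragraph can be sketched as a small decision function. The `Action` enum, `next_action`, and the tick-based `run_until_response` driver are hypothetical constructs for illustration, and the sketch covers only the withheld-read branch (the timeout check 1150, the wait 1160, and the interest check 1170).

```python
from enum import Enum


class Action(Enum):
    WAIT = "wait"
    RESPOND_AND_DOWNGRADE = "respond, downgrade"
    RESPOND_DOWNGRADE_REINVALIDATE = "respond, downgrade, re-invalidate"


def next_action(timeout_expired, still_interested):
    """Pick the writer's next step while a read response is being withheld."""
    if not timeout_expired:
        return Action.WAIT                            # keep withholding
    if still_interested:
        return Action.RESPOND_DOWNGRADE_REINVALIDATE  # answer, then refetch
    return Action.RESPOND_AND_DOWNGRADE               # answer and let go


def run_until_response(timeout_ns, tick_ns, still_interested):
    """Step simulated time until the writer finally responds; return a trace."""
    if tick_ns <= 0:
        raise ValueError("tick_ns must be positive")
    elapsed_ns, trace = 0, []
    while True:
        action = next_action(elapsed_ns >= timeout_ns, still_interested)
        trace.append((elapsed_ns, action))
        if action is not Action.WAIT:
            return trace
        elapsed_ns += tick_ns


trace = run_until_response(timeout_ns=400, tick_ns=200, still_interested=True)
for elapsed_ns, action in trace:
    print(elapsed_ns, action.value)
```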
By combining (i) a writer's prefetching of cache lines with (ii) its withholding of read requests, aspects of the disclosed technology offer a means by which the Observed Latency of data transfer in cache coherent systems can be meaningfully reduced.
Aspects of the disclosed technology may be implemented using at least some of the components illustrated in the computing device architecture 1200 of
In an example implementation, the network connection interface 1212 may be configured as a communication interface and may provide functions for rendering video, graphics, images, text, other information, or any combination thereof on the display. In one example, a communication interface may include a serial port, a parallel port, a general purpose input and output (GPIO) port, a game port, a universal serial bus (USB), a micro-USB port, a high definition multimedia (HDMI) port, a video port, an audio port, a Bluetooth port, a near-field communication (NFC) port, another like communication interface, or any combination thereof. In one example, the display interface 1204 may be operatively coupled to a local display, such as a touch-screen display associated with a mobile device. In another example, the display interface 1204 may be configured to provide video, graphics, images, text, other information, or any combination thereof for an external/remote display that is not necessarily connected to the mobile computing device. In one example, a desktop monitor may be utilized for mirroring or extending graphical information that may be presented on a mobile device. In another example, the display interface 1204 may wirelessly communicate, for example, via the network connection interface 1212 such as a Wi-Fi transceiver to the external/remote display.
The computing device architecture 1200 may include a keyboard interface 1206 that provides a communication interface to a keyboard. In one example implementation, the computing device architecture 1200 may include a presence-sensitive display interface 1208 for connecting to a presence-sensitive display 1207. According to certain example implementations of the disclosed technology, the presence-sensitive display interface 1208 may provide a communication interface to various devices such as a pointing device, a touch screen, a depth camera, etc. which may or may not be associated with a display.
The computing device architecture 1200 may be configured to use an input device via one or more of the input/output interfaces (for example, the keyboard interface 1206, the display interface 1204, the presence-sensitive display interface 1208, the network connection interface 1212, the camera interface 1214, the sound interface 1216, etc.) to allow a user to capture information into the computing device architecture 1200. The input device may include a mouse, a trackball, a directional pad, a track pad, a touch-verified track pad, a presence-sensitive track pad, a presence-sensitive display, a scroll wheel, a digital camera, a digital video camera, a web camera, a microphone, a sensor, a smartcard, and the like. Additionally, the input device may be integrated with the computing device architecture 1200 or may be a separate device. For example, the input device may be an accelerometer, a magnetometer, a digital camera, a microphone, or an optical sensor.
Example implementations of the computing device architecture 1200 may include an antenna interface 1210 that provides a communication interface to an antenna, and a network connection interface 1212 that provides a communication interface to a network. As mentioned above, the display interface 1204 may be in communication with the network connection interface 1212, for example, to provide information for display on a remote display that is not directly connected or attached to the system. In certain implementations, a camera interface 1214 is provided that acts as a communication interface and provides functions for capturing digital images from a camera. In certain implementations, a sound interface 1216 is provided as a communication interface for converting sound into electrical signals using a microphone and for converting electrical signals into sound using a speaker. According to example implementations, a random-access memory (RAM) 1218 is provided, where computer instructions and data may be stored in a volatile memory device for processing by the CPU 1202.
According to an example implementation, the computing device architecture 1200 includes a read-only memory (ROM) 1220 where invariant low-level system code or data for basic system functions such as basic input and output (I/O), startup, or reception of keystrokes from a keyboard are stored in a non-volatile memory device. According to an example implementation, the computing device architecture 1200 includes a storage medium 1222 or other suitable type of memory (e.g., RAM, ROM, programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), magnetic disks, optical disks, floppy disks, hard disks, removable cartridges, flash drives), where files including an operating system 1224, application programs 1226 (including, for example, a web browser application, a widget or gadget engine, and/or other applications, as necessary), and data files 1228 are stored. According to an example implementation, the computing device architecture 1200 includes a power source 1230 that provides an appropriate alternating current (AC) or direct current (DC) to power components.
According to an example implementation, the computing device architecture 1200 includes a telephony subsystem 1232 that allows the device 1200 to transmit and receive sound over a telephone network. The constituent devices and the CPU 1202 communicate with each other over a bus 1234.
According to an example implementation, the CPU 1202 has appropriate structure to be a computer processor. In one arrangement, the CPU 1202 may include more than one processing unit. The RAM 1218 interfaces with the computer bus 1234 to provide quick RAM storage to the CPU 1202 during the execution of software programs such as the operating system, application programs, and device drivers. More specifically, the CPU 1202 loads computer-executable process steps from the storage medium 1222 or other media into a field of the RAM 1218 in order to execute software programs. Data may be stored in the RAM 1218, where the data may be accessed by the computer CPU 1202 during execution. In one example configuration, the device architecture 1200 includes at least 128 MB of RAM and 256 MB of flash memory.
The storage medium 1222 itself may include a number of physical drive units, such as a redundant array of independent disks (RAID), a floppy disk drive, a flash memory, a USB flash drive, an external hard disk drive, thumb drive, pen drive, key drive, a High-Density Digital Versatile Disc (HD-DVD) optical disc drive, an internal hard disk drive, a Blu-Ray optical disc drive, or a Holographic Digital Data Storage (HDDS) optical disc drive, an external mini-dual in-line memory module (DIMM) synchronous dynamic random access memory (SDRAM), or an external micro-DIMM SDRAM. Such computer readable storage media allow a computing device to access computer-executable process steps, application programs and the like, stored on removable and non-removable memory media, to off-load data from the device or to upload data onto the device. A computer program product, such as one utilizing a communication system may be tangibly embodied in storage medium 1222, which may comprise a machine-readable storage medium.
According to one example implementation, the term computing device, as used herein, may be a CPU, or conceptualized as a CPU (for example, the CPU 1202 of
In example implementations of the disclosed technology, a computing device may include any number of hardware and/or software applications that are executed to facilitate any of the operations. In example implementations, one or more I/O interfaces may facilitate communication between the computing device and one or more input/output devices. For example, a universal serial bus port, a serial port, a disk drive, a CD-ROM drive, and/or one or more user interface devices, such as a display, keyboard, keypad, mouse, control panel, touch screen display, microphone, etc., may facilitate user interaction with the computing device. The one or more I/O interfaces may be utilized to receive or collect data and/or user instructions from a wide variety of input devices. Received data may be processed by one or more computer processors as desired in various implementations of the disclosed technology and/or stored in one or more memory devices.
One or more network interfaces may facilitate connection of the computing device inputs and outputs to one or more suitable networks and/or connections; for example, the connections that facilitate communication with any number of sensors associated with the system. The one or more network interfaces may further facilitate connection to one or more suitable networks; for example, a local area network, a wide area network, the Internet, a cellular network, a radio frequency network, a Bluetooth enabled network, a Wi-Fi enabled network, a satellite-based network, any wired network, any wireless network, etc., for communication with external devices and/or systems.
Certain embodiments of the disclosed technology are described above with reference to block and flow diagrams of systems and/or methods according to example embodiments of the disclosed technology. Some blocks of the block diagrams and flow diagrams may not necessarily need to be performed in the order presented, or may not necessarily need to be performed at all, according to some embodiments of the disclosed technology.
While certain embodiments of the disclosed technology have been described in connection with what is presently considered to be the most practical embodiments, it is to be understood that the disclosed technology is not to be limited to the disclosed embodiments, but on the contrary, is intended to cover various modifications and equivalent arrangements included within the scope of the appended claims. Although specific terms are employed herein, they are used in a generic and descriptive sense only and not for purposes of limitation.
This written description uses examples to disclose certain embodiments of the disclosed technology, including the best mode, and also to enable any person skilled in the art to practice certain embodiments of the disclosed technology, including making and using any devices or systems and performing any incorporated methods. The patentable scope of certain embodiments of the disclosed technology is defined in the claims, and may include other examples that occur to those skilled in the art. Such other examples are intended to be within the scope of the claims if they have structural elements that do not differ from the literal language of the claims, or if they include equivalent structural elements with insubstantial differences from the literal language of the claims.
Executed on | Assignor | Assignee | Conveyance | Frame | Reel | Doc |
Nov 08 2018 | Johnny, Yau | (assignment on the face of the patent) | / |
Date | Maintenance Fee Events |
Nov 08 2018 | BIG: Entity status set to Undiscounted (note the period is included in the code). |
Nov 29 2018 | SMAL: Entity status set to Small. |
Dec 28 2023 | M2551: Payment of Maintenance Fee, 4th Yr, Small Entity. |