An example of a controller circuit may include a policy module to generate a power reduction policy output based on a processor power state input. The power reduction policy output may also be generated based on a graphics render engine idleness input. The circuit can also include a clock masking cell to apply a clock masking configuration to a graphics render clock trunk based on the power reduction policy output.
9. A circuit comprising:
a policy module to generate a power reduction policy output based on a processor power state input and a graphics render engine idleness input, wherein the policy module further includes an AND gate to evaluate the processor power state input and the graphics render engine idleness input; and
a clock masking cell to apply a clock masking configuration to a graphics render clock trunk based on the power reduction policy output.
20. An apparatus comprising:
logic to,
determine a power state of a processor;
determine an idleness of a graphics render engine;
determine a power reduction policy and a clock edge masking configuration based on one or more register values;
apply the power reduction policy to a graphics render clock trunk based on the power state of the processor and the idleness of the graphics render engine; and
apply the clock edge masking configuration to one or more clock edges of the graphics render clock trunk.
15. A method comprising:
determining a power state of a processor;
determining an idleness of a graphics render engine;
determining a power reduction policy and a clock edge masking configuration based on one or more register values; and
applying the power reduction policy to a graphics render clock trunk based on the power state of the processor and the idleness of the graphics render engine, wherein applying the power reduction policy includes applying the clock edge masking configuration to one or more clock edges of the graphics render clock trunk.
1. A system comprising:
a platform controller;
a processor coupled to the platform controller;
a graphics render engine having at least one of vertex processing logic, texture application logic and rasterization logic to operate based on a graphics render clock; and
a graphics memory controller having a policy module, a masking multiplexer, and a clock masking cell, the policy module to generate a power reduction policy output based on a processor power state input, the masking multiplexer to sequentially select a masking register input from a plurality of masking register inputs based on a clock window input corresponding to a graphics render clock trunk, and the clock masking cell to apply successive masking register inputs to successive clock edges of the graphics render clock trunk based on the power reduction policy output.
2. The system of
3. The system of
4. The system of
5. The system of
6. The system of
7. The system of
8. The system of
a policy register to provide the policy register input;
a masking register to provide the plurality of masking register inputs; and
a basic input/output system (BIOS) memory programmed to write a policy register value to the policy register and a plurality of masking register values to the masking register.
10. The circuit of
11. The circuit of
12. The circuit of
13. The circuit of
14. The circuit of
16. The method of
17. The method of
18. The method of
19. The method of
21. The apparatus of
22. The apparatus of
23. The apparatus of
24. The apparatus of
This application claims priority to Malaysian Patent Application PI 20095512, filed Dec. 22, 2009, titled “GRAPHICS RENDER CLOCK THROTTLING AND GATING MECHANISM FOR POWER SAVING,” which is incorporated herein by reference in its entirety.
Embodiments generally relate to the reduction of power consumption in computing platforms. In particular, embodiments relate to throttling and gating graphics render clocks to reduce power consumption.
As the use of system-on-chip (SoC) architectures in computing platforms increases, the importance of power saving techniques to system designs may also grow. For example, although processor-based power reduction techniques such as low power state operation may be available, other functionality such as graphics rendering can constitute a significant portion of the total idle/average power in a given system.
The various advantages of the embodiments of the present invention will become apparent to one skilled in the art by reading the following specification and appended claims, and by referencing the following drawings, in which:
Embodiments of the present invention provide for a system including a processor, a graphics render engine and a graphics memory controller, wherein the graphics render engine receives and/or operates based on a graphics render clock. The graphics memory controller can have a policy module, a masking multiplexer and a clock masking cell, where the policy module may generate a power reduction policy output based on a processor power state input. The masking multiplexer can sequentially select a masking register input from a plurality of masking register inputs based on a clock window input corresponding to a trunk of the graphics render clock. The clock masking cell may apply successive masking register inputs to successive clock edges of the graphics render clock trunk based on the power reduction policy output.
Other embodiments provide for a controller circuit including a policy module to generate a power reduction policy output based on a processor power state input. In addition, the power reduction policy output can be generated based on a graphics render engine idleness input. The circuit may also include a clock masking cell to apply a clock masking configuration to a graphics render clock trunk based on the power reduction policy output.
Other embodiments provide for a method in which a power state of a processor is determined. An idleness of a graphics render engine may also be determined. A power reduction policy can be applied to a graphics render clock trunk based on the power state of the processor and the idleness of the graphics render engine.
In addition, embodiments provide for an apparatus having logic to determine a power state of a processor and an idleness of a graphics render engine. The logic may also apply a power reduction policy to a graphics render clock trunk based on the power state of the processor and the idleness of the graphics render engine.
The policy module 12 may also generate the power reduction policy output 14 based on a graphics render engine idleness input 20. In particular, the illustrated policy module 12 includes a counter 22 to determine whether a graphics render engine has been idle for a predetermined time period. The counter 22 can therefore be used to ensure that there are no residual transactions in the render engine pipeline upon idle indication. The policy module 12 may use an AND gate 24 to evaluate the processor power state input 16a and the graphics render engine idleness input 20, and use an AND gate 26 to evaluate the processor power state input 16d and the graphics render engine idleness input 20. The illustrated policy module 12 also includes a policy multiplexer 28 that selects the power reduction policy output 14 from a plurality of power reduction policy outputs based on a policy register input 30. Thus, the policy register input 30, which may be obtained from a policy register (not shown) that is set by the basic input/output system (BIOS) at start-up, could select any one of five power reduction policy options as identified below in Table 1.
TABLE 1
Option | Description |
000 | Disable render engine clock trunk throttling and gating. |
001 | Enable render engine clock trunk gating on render engine idle in C2, C3, C4 |
010 | Enable render engine clock trunk gating on render engine idle in C3, C4 |
011 | Enable render engine clock trunk throttling only in C2, C3, C4 (regardless of render engine idleness) |
100 | Enable render engine clock trunk throttling only in C3, C4 (regardless of render engine idleness) |
For example, if the policy register (not shown) contains the value of 001, render engine clock trunk gating may be implemented if the graphics render engine is idle for the predetermined time period and the processor is in low power state C2, C3 or C4. By contrast, if the policy register (not shown) contains the value of 011, render engine clock trunk throttling may be implemented if the processor is in low power state C2, C3 or C4, regardless of the idleness of the graphics render engine. The use of the policy multiplexer 28 can therefore enable greater flexibility in the design and operation of the circuit 10 as well as the functionality of the overall platform.
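The selection among the Table 1 options can be sketched in software as follows. This is a minimal behavioral model, not the illustrated hardware: the function name `policy_output`, the `CState` enumeration, the `idle_threshold` parameter, and the treatment of unlisted register codes as "no action" are all assumptions made for illustration.

```python
from enum import IntEnum


class CState(IntEnum):
    """Processor power states named in Table 1 (C0 is the executing state)."""
    C0 = 0
    C2 = 2
    C3 = 3
    C4 = 4


def policy_output(policy_code: int, c_state: CState,
                  idle_cycles: int, idle_threshold: int) -> str:
    """Return 'none', 'gate', or 'throttle' for a Table 1 policy code.

    idle_cycles / idle_threshold stand in for the idle counter: the engine
    counts as idle only after the predetermined period has elapsed.
    """
    idle_ok = idle_cycles >= idle_threshold
    in_c2_or_deeper = c_state in (CState.C2, CState.C3, CState.C4)
    in_c3_or_deeper = c_state in (CState.C3, CState.C4)

    if policy_code == 0b001:   # gate on idle in C2, C3, C4
        return "gate" if (in_c2_or_deeper and idle_ok) else "none"
    if policy_code == 0b010:   # gate on idle in C3, C4
        return "gate" if (in_c3_or_deeper and idle_ok) else "none"
    if policy_code == 0b011:   # throttle in C2, C3, C4, idleness ignored
        return "throttle" if in_c2_or_deeper else "none"
    if policy_code == 0b100:   # throttle in C3, C4, idleness ignored
        return "throttle" if in_c3_or_deeper else "none"
    return "none"              # 000, and (by assumption) any unlisted code
```

For instance, `policy_output(0b001, CState.C2, idle_cycles=5, idle_threshold=8)` yields no action because the idle period has not yet elapsed, whereas the same call with policy code `0b011` throttles immediately.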
The circuit 10 may also include a masking module 32 having a masking multiplexer 34 that sequentially selects a masking register input from a plurality of masking register inputs 36 based on a clock window input 38 that corresponds to the graphics render clock trunk. The render engine clock trunk may be the clock on which the graphics render engine operates, before it is distributed to the various logic regions of the graphics render engine and/or graphics controller. In the illustrated example, the clock window 38 provides a 16-clock window count and the inputs of the masking multiplexer 34 are wired to sixteen-bit values that may be set in a masking configuration register (not shown) by the BIOS at boot-up. A 16-clock window is used herein to facilitate discussion only, and larger or smaller clock windows may be used without departing from the spirit and scope of the embodiments described.
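The sequential selection performed by the masking multiplexer can be modeled as indexing a sixteen-bit register with a free-running window count. The sketch below assumes that bit i of the register corresponds to window count i (least significant bit first); the bit ordering and the names `select_mask_bit` and `mask_sequence` are illustrative assumptions, not taken from the illustrated circuit.

```python
WINDOW = 16  # clocks per window, matching the 16-clock example in the text


def select_mask_bit(mask_register: int, window_count: int) -> int:
    """Pick the masking bit for the current position in the clock window."""
    return (mask_register >> (window_count % WINDOW)) & 1


def mask_sequence(mask_register: int, n_clocks: int) -> list[int]:
    """Masking bits presented on successive clocks; repeats every WINDOW clocks."""
    return [select_mask_bit(mask_register, t) for t in range(n_clocks)]
```

With a register value of `0xAAAA`, the selected bit alternates on every clock; with `0x0001`, only the first clock of each sixteen-clock window selects a set bit.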
Thus, if the selected power reduction policy is satisfied, the power reduction policy output 14 may assert, and the desired graphics render clock trunk masking configuration from the masking register inputs 36 is applied, via a NAND gate 40 and a clock masking cell 42, to a graphics render clock trunk 18. In particular, the clock masking cell 42 may include an OR gate 44 that will toggle a latch 46 with an active low enable based on the signal from the NAND gate 40 and a bypass signal 48 that enables the clock trunk to be kept running even if the power reduction policy is satisfied. The output of an AND gate 50 may therefore be a gated or throttled graphics render clock trunk as a result of the application of the render clock masking configuration in accordance with the selected power reduction policy. Thus, the clock masking cell 42 can provide for significant power reduction through clock gating or clock throttling.
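A gate-level reading of the clock masking cell can be sketched as below. The wiring is an interpretation of the description rather than a confirmed netlist: it assumes that a masking bit of 1 suppresses the corresponding clock pulse, that the latch is transparent while the clock is low (the conventional glitch-free gating arrangement), and that the clock is sampled twice per cycle as a list of 0/1 values. The function name `clock_masking_cell` is hypothetical.

```python
def clock_masking_cell(clock: list[int], mask_bits: list[int],
                       policy: int, bypass: int) -> list[int]:
    """Return the gated clock for a sampled clock waveform.

    clock     -- 0/1 samples of the free-running clock trunk
    mask_bits -- masking bit presented at each sample (1 = suppress, assumed)
    policy    -- power reduction policy output (1 = policy satisfied)
    bypass    -- 1 keeps the clock running regardless of the policy
    """
    latched = 1  # assume the clock is enabled before the first sample
    gated = []
    for clk, mask in zip(clock, mask_bits):
        nand_out = 0 if (policy and mask) else 1  # NAND of policy and mask bit
        enable = nand_out | bypass                # OR with the bypass signal
        if clk == 0:                              # latch transparent while clock is low
            latched = enable
        gated.append(clk & latched)               # AND of latch output and clock
    return gated
```

Under these assumptions, a clock of `[0, 1, 0, 1, 0, 1, 0, 1]` with masking bits `[1, 1, 0, 0, 1, 1, 0, 0]` and the policy asserted loses its first and third pulses, while asserting the bypass signal passes the clock through unchanged.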
Turning now to
A processor state signal 58 demonstrates that the processor may switch from the C0 state to the C2 state at transition 60. A gating enable signal 62 may not undergo a transition at this point, however, because an illustrated render engine idle signal 64 has not yet asserted. When the render engine idle signal 64 asserts at transition 66, the delay time period defined by the gating configuration register signal 56 is permitted to expire. At such time, a gating counter signal 68 may undergo a transition 70, which can in turn trigger a transition 72 in the gating enable signal 62 because all of the conditions for the power reduction policy have been satisfied. Thus, a masking edge signal 74 may be applied to a free running, ungated render clock trunk signal 78 in response to a render clock gate enable signal 76 switching to a gate enable condition. The illustrated ungated render clock trunk signal 78 is synched via a 16-clock window count signal 80 (e.g., a free running phase count that repeats every sixteen render clocks), and a gated render clock trunk signal 82 may result. At an illustrated transition 84, the processor state signal switches back to the C0 state, causing clock gating to be deactivated and the render clock trunk to resume normal toggling.
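The sequence of transitions described above can be reproduced with a small cycle-by-cycle model. This sketch assumes a gating policy that applies in C2, C3 and C4, a counter that restarts whenever the render engine is busy, and a programmable `delay` expressed in clocks; the function name and the string encoding of processor states are illustrative only.

```python
LOW_POWER_STATES = {"C2", "C3", "C4"}


def gating_enable_trace(c_states: list[str], idle: list[int],
                        delay: int) -> list[int]:
    """Per-clock gating enable: asserted once the processor is in a low
    power state and the engine has been idle for `delay` consecutive clocks."""
    count = 0
    trace = []
    for state, is_idle in zip(c_states, idle):
        count = count + 1 if is_idle else 0   # restart whenever the engine is busy
        counter_expired = count >= delay
        low_power = state in LOW_POWER_STATES
        trace.append(int(low_power and counter_expired))
    return trace
```

For example, with states `["C0", "C2", "C2", "C2", "C2", "C2", "C0"]`, idle samples `[0, 0, 1, 1, 1, 1, 1]` and a delay of 3, the enable stays low through the C0-to-C2 transition, asserts on the third consecutive idle clock, and drops again when the processor returns to C0.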
Turning now to
The GMCH 114 may also communicate with the graphics controller 113 via a graphics bus 115 such as a PCI Express Graphics (PEG, e.g., Peripheral Component Interconnect/PCI Express x16 Graphics 150W-ATX Specification 1.0, PCI Special Interest Group) bus, or Accelerated Graphics Port (e.g., AGP V3.0 Interface Specification, September 2002) bus. The GMCH 114 may also communicate with the PCH 116, which may be referred to as a Southbridge, over a hub bus 130. In one embodiment, the hub bus 130 is a DMI (Direct Media Interface) bus. The PCH 116 could also be incorporated with the processor 112 and GMCH 114 onto a common SoC. The illustrated system 110 also has one or more peripheral controllers 124 such as a Wi-Fi (e.g., Institute of Electrical and Electronics Engineers/IEEE 802.11a, b, g, n) network interface, an Ethernet controller (e.g., IEEE 802.3), PC Card controller (e.g., CardBus PCMCIA standard), and so on.
The PCH 116 may also have internal controllers such as USB (Universal Serial Bus, e.g., USB Specification 2.0, USB Implementers Forum), Serial ATA (SATA, e.g., SATA Rev. 3.0 Specification, May 27, 2009, SATA International Organization/SATA-IO), High Definition Audio, and other controllers. The PCH 116 may be able to place the cores of the processor 112 in one or more low power states to reduce power consumption by issuing various power state control signals to a voltage regulator (not shown) that supplies an operating voltage to the processor 112. Alternatively, the processor 112 itself could place the cores in the various low power states and inform the PCH 116 and/or GMCH 114 of its low power state status. In one embodiment, a chipset defined by the GMCH 114 and PCH 116 may include one or more blocks (e.g., chips or units within an integrated circuit) to perform various interface control functions (e.g., memory control, graphics control, I/O interface control, and the like). As already noted, these circuits may be implemented on one or more separate chips and/or may be partially or wholly implemented within the processor 112.
The illustrated graphics render engine 120, which is integrated with the GMCH 114 and processor 112 onto a common SoC, includes a wide variety of logic such as vertex processing logic (Lvp) 132, texture application logic (Lta) 134, and rasterization logic (Lr) 136. This logic, while significantly enhancing graphics performance, may constitute a relatively large portion of the overall power consumption of the processor 112. The illustrated GMCH 114 uses an oscillator 138 and a phase locked loop (PLL) 140 to generate a graphics render clock trunk, which may be distributed to the individual units of logic 132, 134, 136, within the graphics render engine 120. Throttling and/or gating the graphics render clock trunk as described herein may therefore provide significant power savings. Accordingly, the GMCH 114 may include a power circuit 142 such as the circuit 10 (
As already noted, the system 110 may implement a variety of different computing devices or other appliances with computing capability. Such devices include but are not limited to test systems, design/debug tools, laptop computers, notebook computers, PDAs, cellular phones, audio and/or video media players, desktop computers, servers, and the like. The system 110 could constitute one or more complete computing systems or alternatively, it could constitute one or more components useful within a computing system.
Turning now to
A power reduction policy may be applied to a graphics render clock trunk at block 152 based on the power state of the processor and the idleness of the graphics render engine. As already discussed, applying the power reduction policy could include throttling the graphics render clock trunk if the processor is in a low power state. If such an approach is used, the idleness determination at block 150 may be circumvented and/or omitted. Applying the power reduction policy may also include gating the graphics render clock trunk if the processor is in a low power state and an idleness condition of the graphics render engine is satisfied. The idleness condition could include the graphics render engine being idle for a predetermined time period. As also already discussed, application of the power reduction policy could entail applying a clock edge masking configuration to one or more clock edges of the graphics render clock trunk. Other clock throttling and/or gating techniques may also be used. For example, to the extent to which the oscillator 138 (
Thus, instances in which the processor is in a non-executing mode can be leveraged when the clocks of other power-intensive devices such as graphics render engines are reduced in frequency (e.g., throttled) or totally zeroed out (e.g., gated). The result is a reduction or elimination of the toggle rate of targeted clock registers and a lowering of dynamic power consumption. Moreover, the use of registers and BIOS programming enables post-silicon optimization of power saving versus performance or functionality of the integrated graphics. For more aggressive power saving, clock gating may be the appropriate choice. On the other hand, if gating is not permissible or desired, a clock throttling scheme may be used. In either case, the configuration can be programmable after fabrication of the platform.
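As a rough illustration of the trade-off between throttling and gating, the fraction of clock pulses that survive a given masking configuration can be computed directly from the register value. The sketch assumes that a set bit masks one pulse of the window, and it relies on the standard first-order CMOS model in which dynamic power scales linearly with toggle rate; that model is a textbook approximation and is not stated in this description. The function name `surviving_edge_fraction` is hypothetical.

```python
def surviving_edge_fraction(mask_register: int, window: int = 16) -> float:
    """Fraction of clock pulses left running for a given masking value.

    Under a linear dynamic-power model this is also the approximate
    fraction of clock-tree dynamic power that remains.
    """
    masked = bin(mask_register & ((1 << window) - 1)).count("1")
    return (window - masked) / window
```

Under these assumptions, a value of `0xFFFF` corresponds to full gating (no pulses survive), `0xAAAA` halves the toggle rate, and `0x0000` leaves the clock trunk untouched.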
Embodiments described herein are applicable for use with all types of semiconductor integrated circuit (“IC”) chips. Examples of these IC chips include but are not limited to processors, controllers, chipset components, programmable logic arrays (PLA), memory chips, network chips, and the like. In addition, in some of the drawings, signal conductor lines are represented with lines. Some may be thicker, to indicate more constituent signal paths, have a number label, to indicate a number of constituent signal paths, and/or have arrows at one or more ends, to indicate primary information flow direction. This, however, should not be construed in a limiting manner. Rather, such added detail may be used in connection with one or more exemplary embodiments to facilitate easier understanding of a circuit. Any represented signal lines, whether or not having additional information, may actually comprise one or more signals that may travel in multiple directions and may be implemented with any suitable type of signal scheme, e.g., digital or analog lines implemented with differential pairs, optical fiber lines, and/or single-ended lines.
Example sizes/models/values/ranges may have been given, although embodiments of the present invention are not limited to the same. As manufacturing techniques (e.g., photolithography) mature over time, it is expected that devices of smaller size could be manufactured. In addition, well known power/ground connections to IC chips and other components may or may not be shown within the figures, for simplicity of illustration and discussion, and so as not to obscure certain aspects of the embodiments of the invention. Further, arrangements may be shown in block diagram form in order to avoid obscuring embodiments of the invention, and also in view of the fact that specifics with respect to implementation of such block diagram arrangements are highly dependent upon the platform within which the embodiment is to be implemented, i.e., such specifics should be well within purview of one skilled in the art. Where specific details (e.g., circuits) are set forth in order to describe example embodiments of the invention, it should be apparent to one skilled in the art that embodiments of the invention can be practiced without, or with variation of, these specific details. The description is thus to be regarded as illustrative instead of limiting.
The term “coupled” may be used herein to refer to any type of relationship, direct or indirect, between the components in question, and may apply to electrical, mechanical, fluid, optical, electromagnetic, electromechanical or other connections. In addition, the terms “first”, “second”, etc. may be used herein only to facilitate discussion, and carry no particular temporal or chronological significance unless otherwise indicated.
Those skilled in the art will appreciate from the foregoing description that the broad techniques of the embodiments of the present invention can be implemented in a variety of forms. Therefore, while the embodiments of this invention have been described in connection with particular examples thereof, the true scope of the embodiments of the invention should not be so limited since other modifications will become apparent to the skilled practitioner upon a study of the drawings, specification, and following claims.
Tang, Lai Guan; Chong, Lai Kuan
Executed on | Assignor | Assignee | Conveyance | Frame | Reel | Doc |
Oct 19 2010 | TANG, LAI GUAN | Intel Corporation | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 025167 | /0096 | |
Oct 20 2010 | Intel Corporation | (assignment on the face of the patent) | / | |||
Oct 20 2010 | CHONG, LAI KUAN | Intel Corporation | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 025167 | /0096 |
Date | Maintenance Fee Events |
Jun 13 2014 | ASPN: Payor Number Assigned. |
Jan 04 2018 | M1551: Payment of Maintenance Fee, 4th Year, Large Entity. |
Mar 07 2022 | REM: Maintenance Fee Reminder Mailed. |
Aug 22 2022 | EXP: Patent Expired for Failure to Pay Maintenance Fees. |