In one embodiment of the present invention, a processing system for processing information efficiently and cost-effectively by switching between execution of time-critical and non-time-critical tasks includes a processing unit. The processing system further includes a first register group coupled to the processing unit and including a first set of registers, the processing unit reading the status of the first set of registers to execute time-critical tasks. The processing system further includes a second register group coupled to the processing unit and including a second set of registers, the second register group for updating the status of the second set of registers, the processing unit reading the status of the second set of registers to execute the non-time-critical tasks by avoiding saving the status of the first set of registers, wherein the processing unit switches between executing time-critical tasks and non-time-critical tasks efficiently and cost-effectively by avoiding saving status of the first or second set of registers.
1. A processing core of a processing system, employed in an audio and video encoder/decoder, for performing hardware-assisted context switching, comprising:
a processing unit;
an instruction cache for storing instructions for non-time-critical tasks and a code random access memory for storing instructions required for causing serving of time-critical tasks, the instruction cache and the code random access memory coupled to the processing unit;
a register group for updating status of registers related to time critical tasks; and
another register group coupled to the processing unit,
wherein the processing unit need access status of registers only in the register group for execution of time-critical tasks thereby avoiding saving and restoring status of the another register group for execution of time-critical tasks.
15. An audio and video encoder/decoder, comprising:
a processing system;
a video engine unit and a video interface unit both coupled to the processing system;
an audio engine unit and an audio interface unit both coupled to the processing system; and
wherein the processing system has a processing core that includes:
a processing unit;
an instruction cache for storing instructions for non-time-critical tasks and a code random access memory for storing instructions required for causing serving of time-critical tasks, the instruction cache and the code random access memory coupled to the processing unit;
a register group for updating status of registers related to time critical tasks; and
another register group coupled to the processing unit;
a data memory and another data memory coupled to the processing unit; and
wherein the processing unit need access status of registers only in the register group for execution of time-critical tasks thereby avoiding saving and restoring status of the another registers for execution of time-critical tasks.
8. A method of processing information using hardware-assisted context switching in an audio and video encoder/decoder, comprising:
providing a processing unit;
coupling an instruction cache and a code random access memory to the processing unit;
coupling a register group and another register group to the processing unit;
coupling a data memory and another data memory to the processing unit;
coupling a low priority interrupt controller and a high priority interrupt controller to the processing unit;
performing non-time-critical tasks through dedicated use by the processing unit of the instruction cache, the register group, the data memory, and the low priority interrupt controller;
performing time-critical tasks through dedicated use by the processing unit of the code random access memory, the another register group, the another data memory, and the high priority interrupt controller; and
accessing status of registers only in the register group for execution of time-critical tasks thereby avoiding saving and restoring status of the another registers for execution of time-critical tasks.
2. The processing core according to
3. The processing core according to
4. The processing core according to
5. The processing core according to
6. The processing core according to
7. The processing core according to
9. The method according to
10. The method according to
11. The method according to
12. The method according to
13. The method according to
14. The method according to
16. The audio and video encoder/decoder according to
17. The audio and video encoder/decoder according to
18. The audio and video encoder/decoder according to
19. The audio and video encoder/decoder according to
20. The audio and video encoder/decoder according to
21. The audio and video encoder/decoder according to
1. Field of the Invention
The present invention relates generally to the design of multi-task processors and particularly to the design of processors employing hardware-assisted context switching to perform a plurality of tasks.
2. Description of the Prior Art
Modern electronic devices such as integrated circuit devices often include systems for performing a multiplicity of tasks. For example, a system on an integrated circuit device may perform audio and video compression and decompression. Alternatively, an electronic device such as a personal computer may include a processor for performing a multiplicity of tasks. In general, such electronic devices include one or more processors representing a multi-task system.
Conventional multi-task systems operate by switching between the tasks, a process known as context switching. For example, a task may constitute compression of video data, which is a time-critical task, wherein the video engine performing the compression may only wait for time intervals of less than 2 microseconds (1 microsecond = 10^-6 seconds) before requiring additional data. On the other hand, tasks such as user menu display or on-screen display are non-critical tasks wherein up to 2 seconds may lapse before additional data is provided without adversely affecting the quality of display. The processor employed in multi-task systems performs context switching between the tasks with different response time requirements.
Performing context switching includes saving the contents of registers of the hardware (hw) units and the status of resources employed in executing a task. The processor may subsequently perform a completely different task and, upon completion thereof, reinitialize all the saved registers and restore the status of the resources to continue executing the original task.
Hardware units requiring service generally send an interrupt command to the processor. If the interrupt command sent by the hw unit requires performing a time-critical task, the processor performs context switching to provide service to the hw unit. After providing the service, the processor resumes executing the same task that was being executed prior to the arrival of the interrupt command.
In modern electronic devices there are two conventional approaches to executing multiple tasks. In one approach, a single processor is employed to perform both the time-critical and non-time-critical tasks. As some time-critical tasks require a very short response time, a powerful processor operating at high speeds as well as a high-speed real-time operating system are required. However, a more powerful processor is larger and often too expensive for use in an electronic device.
In an alternative approach, conventional electronic devices employ two different processors. One processor executes time-critical tasks and the other processor executes non-time-critical tasks. Utilizing two processors, however, requires more extensive hw in the electronic device. In addition, due to differences in speed and power of the processors, two different software (sw) development environments are needed, thereby increasing the cost of the electronic device.
Referring now to
Step 16 indicates interrupt service wherein the processor provides service to the hw unit issuing the interrupt command. For example, interrupt service 16 may include providing additional data to the hw unit by the processor. After providing the service, the processor performs context restore, as indicated in step 18, by reinitializing all the registers and restoring all the RTOS resources to the conditions prevailing prior to the arrival of the hw interrupt 12. Context restore 18 enables the processor to continue thread execution in the same manner as was done before the arrival of the hw interrupt 12, as indicated in step 19.
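The conventional sequence described above (context save, interrupt service, context restore, resumed thread execution) can be sketched in Python as follows. This is only a toy model under stated assumptions: the `Processor` class, its register names, and `handle_interrupt` are hypothetical and do not correspond to any real processor or RTOS interface.

```python
from dataclasses import dataclass, field


@dataclass
class Processor:
    """Toy processor with one register file shared by all tasks (hypothetical)."""
    registers: dict = field(default_factory=lambda: {"r0": 0, "r1": 0, "pc": 0})
    saved_contexts: list = field(default_factory=list)

    def context_save(self):
        # Context save (step 14): copy every register so the thread can resume later.
        self.saved_contexts.append(dict(self.registers))

    def interrupt_service(self, data):
        # Interrupt service (step 16): serving the hw unit overwrites shared registers.
        self.registers["r0"] = data
        self.registers["r1"] = data * 2
        self.registers["pc"] = 0x100

    def context_restore(self):
        # Context restore (step 18): put back the values saved before the interrupt.
        self.registers = self.saved_contexts.pop()


def handle_interrupt(cpu, data):
    """Conventional handling: save, serve, restore, then the thread continues."""
    cpu.context_save()
    cpu.interrupt_service(data)
    cpu.context_restore()
```

Because the single register file is shared, the save and restore steps are unavoidable in this model; skipping either one would leave the interrupted thread with clobbered registers.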
Context save step 14 and context restore step 18 each may require up to 50 machine cycles for completion. Therefore, context switching requires approximately 100 machine cycles, which at a typical processor speed of 80 megahertz (1 megahertz = 10^6 hertz, where 1 hertz is 1 cycle/second) amounts to 1.25×10^-6 seconds, i.e., 1.25 microseconds.
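The overhead figure can be checked with a short calculation. The Python sketch below uses only the numbers given in the text (50 cycles each for save and restore, an 80 megahertz clock, and the 2 microsecond response window of the video engine); the function and constant names are illustrative assumptions.

```python
SAVE_CYCLES = 50        # context save, upper bound from the text
RESTORE_CYCLES = 50     # context restore, upper bound from the text
CLOCK_HZ = 80e6         # 80 megahertz
DEADLINE_S = 2e-6       # time-critical response window of 2 microseconds


def context_switch_seconds(save_cycles, restore_cycles, clock_hz):
    """Time spent purely on saving and restoring context."""
    return (save_cycles + restore_cycles) / clock_hz


overhead_s = context_switch_seconds(SAVE_CYCLES, RESTORE_CYCLES, CLOCK_HZ)
fraction_of_deadline = overhead_s / DEADLINE_S
```

Under these figures the save and restore steps alone take 1.25 microseconds, which is 62.5% of the 2 microsecond window, before any useful interrupt service is counted.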
In light of the foregoing, it is desirable to design a single processor to execute both time-critical and non-time-critical tasks without requiring considerable power and speed. In addition, the processor should operate with a conventional real-time operating system without requiring an extensive sw development environment.
Briefly, a processing system for processing information efficiently and cost-effectively by switching between execution of time-critical and non-time-critical tasks includes a processing unit in accordance with an embodiment of the present invention. The processing system further includes a first register group coupled to the processing unit and including a first set of registers, the processing unit reading the status of the first set of registers to execute time-critical tasks. The processing system further includes a second register group coupled to the processing unit and including a second set of registers, the second register group for updating the status of the second set of registers, the processing unit reading the status of the second set of registers to execute the non-time-critical tasks by avoiding saving the status of the first set of registers, wherein the processing unit switches between executing time-critical tasks and non-time-critical tasks efficiently and cost-effectively by avoiding saving status of the first or second set of registers.
The foregoing and other objects, features and advantages of the present invention will be apparent from the following detailed description of the preferred embodiments which make reference to several figures of the drawing.
Referring now to
Audio and video codec 20 performs audio and video encoding and decoding for a variety of Moving Picture Experts Group 2 (MPEG-2)-based applications used in electronic devices such as super video compact disk (SVCD) recorders and universal serial bus (USB)-based television/video players and recorders. Processing system 38 in one embodiment of the present invention is an ARC core-based processor manufactured by ARC International located in Elstree, UK. Processing system 38 functions as a central controller for audio and video codec 20. In addition, processing system 38 performs multiplexing (combining a number of input signals into one output signal) of audio and video streams to generate an MPEG stream as well as demultiplexing (parsing one input signal into a number of output signals) of an MPEG stream into audio and video streams.
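As a rough illustration of multiplexing and demultiplexing in the general sense defined above, the following Python sketch tags packets by stream type and interleaves them into one sequence, then splits that sequence back apart. It is a toy model only: real MPEG multiplexing involves packetization and timing information that is not modeled here, and the names `multiplex` and `demultiplex` are hypothetical.

```python
def multiplex(audio_packets, video_packets):
    """Combine two input streams into one output stream of tagged packets."""
    combined = []
    for i in range(max(len(audio_packets), len(video_packets))):
        if i < len(video_packets):
            combined.append(("video", video_packets[i]))
        if i < len(audio_packets):
            combined.append(("audio", audio_packets[i]))
    return combined


def demultiplex(combined):
    """Parse one tagged stream back into separate audio and video streams."""
    streams = {"audio": [], "video": []}
    for kind, payload in combined:
        streams[kind].append(payload)
    return streams["audio"], streams["video"]
```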
The VIO 22, the AIU 30, the VEU 24, processing system 38, the DCU 40, the DSP 34, and the HIU 36 inter-communicate via the R-bus 42 and the D-bus 44. The D-bus 44 is a 64-bit wide data bus and the R-bus 42 is a 16-bit wide register bus. Audio and video codec 20 receives digital audio and video input and generates digital audio and video output. The DCU 40 provides an interface between the audio and video codec 20 and an external (SDRAM) memory. The HIU 36 enables the audio and video codec 20 to communicate with a host controller and an external memory.
In one embodiment of the present invention processing system 38 executes both time-critical and non-time-critical tasks without requiring considerable power and speed. In fact, processing system 38 executes the time-critical tasks without the need for a real-time operating system, as described in detail hereinbelow.
The HIU 36 is used to communicate with a host controller and an external memory. The HIU 36 provides an interface to USB controllers and may also communicate with a personal computer (PC) host system via a peripheral component interconnect (PCI) bridge. In addition, the HIU 36 is used for input/output of the encoded audio and video streams between the audio and video codec 20 and a host controller. The AIU 30 provides an interface between the audio and video codec 20 and an external audio device by transferring signals using inter-IC sound (I2S) signaling. The audio PLL 32 provides a user-configurable output clock for external audio analog-to-digital and digital-to-analog converters.
The audio engine unit 34, being an embedded, 24-bit, general-purpose, programmable DSP, performs audio-related functions. The DSP 34 performs a multiply-accumulate operation in a single cycle with no overhead delay. The DSP 34 also provides automatic translation from 64 to 24 bits. The DSP 34 performs audio encoding and decoding for all popular audio formats, such as Dolby Digital and MPEG audio.
The VEU 24 is the video processor core for the audio and video codec 20. During encoding, the VEU 24 operates on preprocessed video data to generate an MPEG-compliant video stream by performing such tasks as motion estimation and compensation. During decoding, the VEU 24 operates on video streams to generate decompressed video data.
The VIO 22 preprocesses the input video data to generate preprocessed video data, thereby facilitating the subsequent encoding operation. Examples of video preprocessing include spatial and/or temporal pre-filtering for video input noise reduction and inverse telecine for converting the television (TV) format to the film format. The VIO 22 also performs a variety of post-processing operations including horizontal and vertical scaling and telecine for converting the film format to the TV format.
The DCU 40 provides an interface between the audio and video codec 20 and an external memory (SDRAM). The DCU 40 sustains transfer of real-time audio and video data for encoding and decoding operations at 30 frames per second. The DCU 40 arbitrates requests from the audio and video codec 20 and generates the necessary control signals for the audio and video codec 20 as well as the external SDRAM. In addition, the I2C controller 26 provides control for external video encoders and decoders. The PLL 28 provides clocking for both the audio and video codec 20 and the external memory.
Referring now to
The R-bus bridge 74 provides an interface between data memory 68 and other communication systems within the audio and video codec 20 such as the DSP 34, the VEU 24, etc. shown in
In an embodiment of the present invention, processing unit 58 functions as a computational unit including an arithmetic-logic unit (ALU). Processing unit 58 executes both the time-critical and the non-time-critical tasks. Time-critical tasks must be completed within a very short time interval. An example of a time-critical task is providing additional data to the VEU during video compression. Additional data should be provided to the VEU within a time interval of less than 2 microseconds (1 microsecond = 10^-6 second). Non-time-critical tasks include multiplexing of audio and video streams, demultiplexing of MPEG streams, and user interface applications. Additional data to a user interface application such as the on-screen display may be provided every 2 seconds.
In the present invention, most of the resources in the processing system 50, other than processing unit 58, are partitioned into two sets, each of which is dedicated to either the time-critical tasks or the non-time-critical tasks. The non-shaded components in the CPU core 52, i.e., the I-cache 56, register group 62, low priority interrupt controller 66, and data memory 68, are present in conventional processors. In an embodiment of the present invention shown in
More specifically, instructions for the non-time-critical tasks are stored within the I-cache 56 and instructions for the time-critical tasks are stored within the code RAM 54. Processing unit 58 fetches instructions for time-critical and non-time-critical tasks from the code RAM 54 and the I-cache 56, respectively. Accordingly, processing unit 58 fetches all the instructions for executing the interrupt commands from the code RAM 54 rather than from an outside memory thereby enhancing efficiency of data processing.
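The split between the two instruction stores can be sketched as a small Python function. The model below is an assumption-laden illustration: memories are plain dictionaries keyed by address, the task kinds are strings, and `fetch_instruction` is a hypothetical name. The cache-fill path on a miss reflects the text's statement that the I-cache may fetch instructions from an external memory for non-time-critical tasks.

```python
def fetch_instruction(task_kind, address, code_ram, i_cache, external_memory):
    """Return the instruction for `address`, choosing the store by task kind."""
    if task_kind == "time_critical":
        # Code RAM 54 holds every time-critical instruction, so there is
        # never an external memory access on this path.
        return code_ram[address]
    if address not in i_cache:
        # I-cache 56 miss: fill the line from external memory first.
        i_cache[address] = external_memory[address]
    return i_cache[address]
```

The point of the arrangement is that only the non-time-critical path can incur the variable delay of an external fetch.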
Register group 60 updates the status of registers related to time-critical tasks and register group 62 updates the status of registers related to non-time-critical tasks. For example, registers updating the status of video compression are located in register group 60 while registers updating the status of audio/video multiplexing are located within register group 62. When an interrupt command is received, processing unit 58 reads the registers in register group 60.
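A minimal Python sketch of two register groups behind one processing unit is shown below. The class name `BankedRegisters`, the register names, and the `select` method are hypothetical; the sketch only illustrates the idea that writes to one group leave the other group untouched.

```python
class BankedRegisters:
    """Two independent register groups; only the selected one is accessed."""

    def __init__(self, names):
        self.groups = {
            "time_critical": dict.fromkeys(names, 0),      # register group 60
            "non_time_critical": dict.fromkeys(names, 0),  # register group 62
        }
        self.active = "non_time_critical"

    def select(self, group):
        # Switching groups copies nothing; it only changes which group is visible.
        self.active = group

    def write(self, name, value):
        self.groups[self.active][name] = value

    def read(self, name):
        return self.groups[self.active][name]
```

For example, a value written while the non-time-critical group is selected is still there after the time-critical group has been selected, written to, and deselected.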
Furthermore, each of the two sets of priority interrupt controllers 64 and 66 is dedicated to a particular type of task. Interrupt commands generated by the communication systems are sent to the interrupt controllers. High priority interrupt controller 64 signals processing unit 58 whenever the interrupt command requires execution of time-critical tasks. Similarly, low priority interrupt controller 66 signals processing unit 58 whenever the interrupt command requires execution of non-time-critical tasks. Accordingly, processing unit 58 prioritizes the interrupt commands based on the response time required to provide service. It is noted that the instructions required for providing service to the communication systems for time-critical tasks are stored within the code RAM 54. However, for non-time-critical tasks there are interrupt commands, such as a stop or play command generated by a user, that require instructions external to instruction cache 56. Under such circumstances, the I-cache 56 fetches the relevant instructions from an external memory via the D-bus bridge 76.
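The two-controller arrangement can be sketched in Python as two pending queues with a fixed service order. This is an illustrative assumption rather than the described hardware: the text states only that interrupts are prioritized by required response time, so serving every pending high priority interrupt before any low priority one is a simplification, and `InterruptController` and `next_interrupt` are hypothetical names.

```python
from collections import deque


class InterruptController:
    """Holds pending interrupt sources in arrival order."""

    def __init__(self):
        self.pending = deque()

    def raise_irq(self, source):
        self.pending.append(source)


def next_interrupt(high, low):
    """Pick the next interrupt to serve, preferring the high priority controller."""
    if high.pending:
        return ("time_critical", high.pending.popleft())
    if low.pending:
        return ("non_time_critical", low.pending.popleft())
    return None
```

With a user play command already pending on the low priority controller and a VEU data request arriving afterwards on the high priority controller, `next_interrupt` returns the VEU request first.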
Similarly, data memory is partitioned to provide data for the two types of tasks separately. Data memory 72 stores data required for executing time-critical tasks and data memory 68 stores data required for executing non-time-critical tasks. For time-critical tasks all the data is stored within data memory 72 to avoid fetching data from an external memory thereby enhancing efficiency of the data processing operation.
Referring now to
An example of thread execution is multiplexing of audio and video streams, which is a non-time-critical task. In another embodiment of the present invention executing a time-critical task is an example of thread execution. The interrupt command, for example, may be issued by the VEU while compressing video data. The VEU may request more video data for compression. Providing additional video data to the VEU is a time-critical task for which the processing system stops audio/video multiplexing to provide service to the VEU, as indicated in the interrupt service step 84. However, in the present invention, through hw-assisted context switching, status of registers of the communication system requesting service, e.g. the VEU, is being updated according to the information stored in register group 60 (shown in
Accordingly, processing unit 58 fetches the necessary instructions from the code RAM 54 to execute time-critical tasks, as shown in interrupt service step 84. In the example considered hereinabove, processing unit 58 provides data to the VEU during interrupt service. Upon completion of the interrupt service, processing unit 58 resumes thread execution, as indicated in step 86. Processing unit 58 is no longer required to restore status of the registers, as required in conventional context switching, since in hw-assisted context switching the status of registers is being updated according to the information stored in register group 60.
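The difference between conventional and hw-assisted context switching can be illustrated by counting how many register values are copied around an interrupt. The Python sketch below is a toy comparison under stated assumptions: registers are dictionaries, the function names are hypothetical, and the cost of selecting the dedicated register group is taken as zero copies because the text gives no figure for it.

```python
def conventional_service(shared_regs, data):
    """Serve an interrupt on a shared register file; return values copied."""
    saved = dict(shared_regs)      # context save
    copies = len(saved)
    shared_regs["r0"] = data       # interrupt service clobbers a shared register
    shared_regs.clear()
    shared_regs.update(saved)      # context restore
    copies += len(saved)
    return copies


def hw_assisted_service(thread_regs, irq_regs, data):
    """Serve an interrupt using only the dedicated time-critical register group."""
    irq_regs["r0"] = data          # interrupt service (step 84) touches its own group
    return 0                       # the thread's registers are never saved or restored
```

In both cases the interrupted thread sees its registers unchanged when it resumes (step 86), but only the conventional path pays for two full copies of the register file.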
Referring now to
Examples of top-level application 90 include MPEG audio and video encoding and decoding. There are a number of time-critical tasks related to MPEG encoding and decoding that are represented by a plurality of sub-functions such as motion estimation 98 and audio encoding 100. Examples of sub-functions representing non-time-critical tasks include audio/video multiplexing 92, audio/video demultiplexing 94, and transfer of user data from an external memory 96. Sub-functions 92–100, shown in
There are RTOS services 102 such as message queue and memory management provided for non-time-critical tasks. The RTOS services 102 are provided to the units participating in execution of non-time-critical tasks, such as I-cache 56, register group 62, data memory 68, and low priority interrupt controller 66 shown in
The hw drivers 106 and 108 represent examples of sw programs that enable the processing system to communicate with other communication systems. For example, if there is a need for the processing system to communicate with a peripheral component interconnect (PCI) bus, a hw driver such as 106 provides the necessary sw allowing communication therebetween. The hw drivers 110–114 also represent sw programs that enable the processing system to communicate with the communication systems of the audio and video codec 20 such as the VEU 24 shown in
In the present invention the processing unit, comprising the arithmetic-logic unit (ALU) and other computational units, is shared between the time-critical and the non-time-critical tasks. Such an arrangement is possible because the communication systems require a relatively short computational time to be served, i.e., the time-critical and non-time-critical tasks do not require intensive computations. However, the remaining resources in the processing system, such as the instruction and data memory, are partitioned into two separate sets each of which is dedicated to either time-critical or non-time-critical tasks whereby data is processed efficiently and cost-effectively.
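The partitioning summarized above can be written down as a small lookup table. The Python sketch below uses the component names and reference numerals from the description; the dictionary layout and the `resources_for` helper are illustrative assumptions.

```python
SHARED = ["processing unit 58"]

DEDICATED = {
    "time_critical": {
        "instructions": "code RAM 54",
        "registers": "register group 60",
        "data": "data memory 72",
        "interrupts": "high priority interrupt controller 64",
    },
    "non_time_critical": {
        "instructions": "I-cache 56",
        "registers": "register group 62",
        "data": "data memory 68",
        "interrupts": "low priority interrupt controller 66",
    },
}


def resources_for(task_kind):
    """List every resource a task of the given kind uses, shared ones first."""
    return SHARED + list(DEDICATED[task_kind].values())
```

The only entry common to both task kinds is the processing unit; every other resource appears in exactly one set.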
Although the present invention has been described in terms of specific embodiments, it is anticipated that alterations and modifications thereof will no doubt become apparent to those skilled in the art. It is therefore intended that the following claims be interpreted as covering all such alterations and modifications as fall within the true spirit and scope of the invention.
Executed on | Assignor | Assignee | Conveyance | Frame | Reel | Doc |
Jul 25 2001 | FENG, CHENHUI | STREAM MACHINE, INC | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 012031 | /0630 | |
Jul 25 2001 | CHENG, HOWN | STREAM MACHINE, INC | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 012031 | /0630 | |
Jul 27 2001 | Magnum Semiconductor, Inc. | (assignment on the face of the patent) | / | |||
Sep 30 2005 | STREAM MACHINE, INC | MAGNUM SEMICONDUCTORS, INC | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 016702 | /0775 | |
Sep 30 2005 | STREAM MACHINE, INC | MAGNUM SEMICONDUCTOR, INC | CORRECTIVE ASSIGNMENT TO CORRECT THE ASSIGNEE S NAME PREVIOUSLY RECORDED AT REEL: 016702 FRAME: 0775 ASSIGNOR S HEREBY CONFIRMS THE ASSIGNMENT | 034037 | /0084 | |
Jun 12 2006 | MAGNUM SEMICONDUCTOR, INC | Silicon Valley Bank | SECURITY AGREEMENT | 017766 | /0005 | |
Jun 12 2006 | MAGNUM SEMICONDUCTOR, INC | SILICON VALLEY BANK AS AGENT FOR THE BENEFIT OF THE LENDERS | SECURITY AGREEMENT | 017766 | /0605 | |
Apr 26 2013 | SILICON VALLEY BANK , AS AGENT FOR THE BENEFIT OF THE LENDERS | MAGNUM SEMICONDUCTOR, INC | RELEASE BY SECURED PARTY SEE DOCUMENT FOR DETAILS | 030310 | /0985 | |
Apr 26 2013 | Silicon Valley Bank | MAGNUM SEMICONDUCTOR, INC | RELEASE BY SECURED PARTY SEE DOCUMENT FOR DETAILS | 030310 | /0764 | |
Oct 31 2014 | MAGNUM SEMICONDUCTOR, INC | CAPITAL IP INVESTMENT PARTNERS LLC, AS ADMINISTRATIVE AGENT | SHORT-FORM PATENT SECURITY AGREEMENT | 034114 | /0102 | |
Apr 05 2016 | CAPITAL IP INVESTMENT PARTNERS LLC | MAGNUM SEMICONDUCTOR, INC | RELEASE BY SECURED PARTY SEE DOCUMENT FOR DETAILS | 038440 | /0565 | |
Apr 05 2016 | MAGNUM SEMICONDUCTOR, INC | Silicon Valley Bank | SECURITY AGREEMENT | 038366 | /0098 | |
Apr 04 2017 | Silicon Valley Bank | MAGNUM SEMICONDUCTOR, INC | RELEASE BY SECURED PARTY SEE DOCUMENT FOR DETAILS | 042166 | /0405 |
Date | Maintenance Fee Events |
Jan 14 2011 | M1551: Payment of Maintenance Fee, 4th Year, Large Entity. |
Jan 20 2011 | STOL: Pat Hldr no Longer Claims Small Ent Stat |
Feb 27 2015 | REM: Maintenance Fee Reminder Mailed. |
Jul 17 2015 | EXP: Patent Expired for Failure to Pay Maintenance Fees. |