A finite state machine (FSM) for a redundant array of independent disks (RAID) includes a single process context that maintains an entire finite state required for input/output operations performed in a RAID system. The finite state is only updated in response to calls and call-backs. The call-backs can include procedure returns and interrupt signals. The call is received directly from an application program, and the call-backs are received from a driver and passed back directly to the application software by the finite state machine.
1. A finite state machine stored on a computer-readable medium for a redundant array of independent disks, the finite state machine comprising:
a single process context configured to maintain a plurality of entire finite states required for an input/output operation performed in the redundant array of independent disks;
means for initializing the finite states only in response to a call; and
means for updating the finite states only in response to call-backs;
wherein there is one entire finite state for each one of a plurality of concurrent I/O operations performed in the redundant array of independent disks.
13. A computer-implemented method for operating a redundant array of independent disks, the method comprising:
providing a finite state machine for the redundant array of independent disks;
maintaining a plurality of entire finite states required for an input/output operation performed in the redundant array of independent disks in a single process context;
initializing the finite states only in response to a call; and
updating the finite states only in response to call-backs;
wherein there is one entire finite state for each one of a plurality of concurrent I/O operations performed in the redundant array of independent disks.
9. A finite state machine stored on a computer-readable medium for a redundant array of independent disks, the finite state machine comprising:
a single process context configured to maintain an entire finite state required for an input/output operation performed in the redundant array of independent disks;
means for initializing the finite state only in response to a call; and
means for updating the finite state only in response to call-backs,
wherein the call is received by the finite state machine directly from application software, and the call-backs are received from a driver and passed back directly to the application software by the finite state machine.
17. A computer-implemented method for operating a redundant array of independent disks, the method comprising:
providing a finite state machine for the redundant array of independent disks;
maintaining an entire finite state required for an input/output operation performed in the redundant array of independent disks in a single process context;
initializing the finite state only in response to a call; and
updating the finite state only in response to call-backs, wherein the call is received by the finite state machine directly from application software, and the call-backs are received from a driver and passed back directly to the application software by the finite state machine.
12. A finite state machine stored on a computer-readable medium for a redundant array of independent disks, the finite state machine comprising:
a single process context configured to maintain an entire finite state required for a multi-block input/output operation including more than two blocks performed in the redundant array of independent disks;
means for initializing the finite state only in response to a call; and
means for updating the finite state only in response to call-backs,
wherein the finite state machine enters a first state in response to receiving the call, enters a second state in response to a call-back indicating completion of an operation on a first block, enters a third state while performing an operation on all but a last block, enters a third state when initiating an operation on the last block, and enters a fourth state upon completing the operation on the last block.
20. A computer-implemented method for operating a redundant array of independent disks (RAID) in response to input/output calls from an external application, the method comprising:
providing an interface in the RAID that receives the input/output calls from the external application and transmits responses thereto;
providing a RAID finite state process operating entirely within the RAID, connected to said interface, that controls a plurality of concurrent input/output operations performed in the RAID without reliance on an external operating system;
operating the RAID finite state process based on calls and call-backs received from the external application through said interface; and
maintaining a plurality of entire finite states in the RAID finite state process, there being one entire finite state for each one of the plurality of concurrent I/O operations performed in the redundant array of independent disks.
2. The finite state machine of
3. The finite state machine of
4. The finite state machine of
5. The finite state machine of
6. The finite state machine of
7. The finite state machine of
a bus connecting the finite state machine to an external computer system via a software interface configured to translate a driver call into the call for the finite state machine; and
a hardware interface configured to translate the call into a driver call after the finite state is initialized.
8. The finite state machine of
a bus connecting the finite state machine to a context-less driver via a software interface configured to translate a driver call into the call for the finite state machine; and
a hardware interface configured to translate the call into a driver call after the finite state is initialized.
11. The finite state machine of
a software interface configured to translate a driver call into the call for the finite state machine; and
a hardware interface configured to translate the call into a driver call after the finite state is initialized.
14. The method of
15. The method of
19. The method of
21. The method of
22. The method of
23. The method of
placing a RAID instruction from the external application in said register;
signaling said finite state process that an instruction is available;
transmitting an acknowledgement signal to the external application; and
transmitting status information to the external application as the finite state process attempts to perform the RAID instruction.
24. The method of
This invention relates generally to the field of disk storage systems, and more particularly to redundant arrays of independent disks.
Most modern, mid-range to high-end disk storage systems are arranged as redundant arrays of independent disks (RAID). A number of RAID levels are known. RAID-1 includes sets of N data disks and N mirror disks for storing copies of the data disks. RAID-3 includes sets of N data disks and one parity disk. RAID-4 also includes sets of N+1 disks; however, data transfers are performed in multi-block operations. RAID-5 distributes parity data across all disks in each set of N+1 disks. At any level, it is desired to have RAID systems where an input/output (I/O) operation can be performed with minimal operating system intervention.
In most modern RAID systems, application software issues a procedure call to a state driven I/O driver of the operating system to perform the I/O operations. The I/O driver then passes the call to the RAID system. Successful or unsuccessful completion of the I/O operation is signaled from the RAID system to the I/O driver, and then to the application via call-backs, e.g., procedure returns and interrupt signals.
Often, the RAID system is used as the core for a file server, or a large database. There, the RAID system must be able to interact with a number of different types of hardware platforms, e.g., end-user PCs and work stations, and compute, print, and network servers, and the like. Consequently, it is a major problem to ensure that the RAID system will work concurrently and reliably with a variety of different operating systems, e.g., UNIX, LINUX, NT, WINDOWS, etc. Key among those problems is to determine how to give the RAID system a process context in which to perform multi-block operations, such as generating parity in a RAID-5 set, or copying data in a RAID-1 set when thousands of blocks need to be processed with a single I/O operation. On-line expansion and RAID level migration also require a process context.
Because process contexts can have different states and different state transitions in different operating systems, it is difficult to make a generic RAID system operate reliably. Also, prior art RAID systems require that there be some process context in the operating system to perform a multi-block operation in the RAID system.
Therefore, it is desired to provide a RAID system that can operate with any operating system, or no operating system at all.
A primary objective of the present invention is to provide a RAID system, which is, in its entirety, state driven, and, therefore, has no dependencies on operating system process contexts.
A related object of the invention is to provide a RAID system, which can be used without an operating system to enhance the performance of a RAID system by eliminating the overhead of operating system process contexts.
In accordance with the invention, only input/output (I/O) calls and call-backs drive a RAID finite state machine (FSM). The entire process state required to perform the I/O operations in the RAID system is maintained within the RAID FSM, and the I/O calls and call-backs are the only stimuli that change the state of the RAID FSM.
The RAID FSM according to the invention can be used with any operating system, any input/output driver, or no external process context at all. The RAID FSM according to the invention uses a small number of I/O calls and call-backs, and a small number of well-defined states and state transitions maintained entirely within the RAID FSM.
More particularly, the invention provides a finite state machine (FSM) for a redundant array of independent disks. The RAID FSM includes a single process context that maintains an entire finite state required for input/output operations performed in the RAID system. The finite state is only updated in response to procedure calls and call-backs. The call-backs can be procedure returns and interrupt signals. The procedure call can be received directly from application software, or an application interface. The call-backs are received from a driver and passed back directly to the application software by the finite state machine. The single process context is external to an operating system, and the input/output operation can specify a large, multi-block operation.
System Structure
The procedure calls can be issued, for example, by application software, and call-backs are due to interrupt signals generated by the redundant arrays of independent disks. In the preferred embodiment, the FSM 100 is implemented with software procedures, although it should be understood that the RAID FSM 100 can also be implemented with a hardware controller or firmware.
As an advantage of the present invention, the entire necessary state related to processing input/output operations in the RAID system is maintained by the FSM 100, and not by any operating system process contexts. As an additional advantage, the RAID FSM 100 according to the invention can be used concurrently with any number of computer systems, perhaps executing different operating systems, or none at all.
The system also includes an I/O driver 110 and I/O hardware registers 120. The driver 110 and registers 120 are coupled to a RAID 130. The structure and operation of these components are well known.
System Operation
During operation, an application (hardware or software) can directly request the RAID FSM 100 to perform an I/O operation via an I/O call 101. In the case that the application is implemented in software, the call can be a procedure or function call. In the case that the application is implemented in hardware, as described below, the call can be in the form of electronic signals, for example, values in controller or bus registers.
A small number of calls can be defined, for example, initialize, write, read, or copy N blocks beginning at block X. To distinguish these calls from traditional driver calls, these can be called FSM calls.
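The small call vocabulary described above can be pictured as a plain data type. The following is only an illustrative sketch in Python: the names `FsmOp`, `FsmCall`, `start_block`, and `block_count` are hypothetical and do not appear in this description, and the embodiment itself is not tied to any particular language.

```python
from dataclasses import dataclass
from enum import Enum, auto


class FsmOp(Enum):
    """Hypothetical names for the example FSM calls listed in the text."""
    INITIALIZE = auto()
    WRITE = auto()
    READ = auto()
    COPY = auto()


@dataclass(frozen=True)
class FsmCall:
    """One FSM call: an operation on N blocks beginning at block X."""
    op: FsmOp
    start_block: int = 0   # "block X" in the text
    block_count: int = 0   # "N blocks" in the text

    def blocks(self) -> range:
        """Block numbers covered by this call."""
        return range(self.start_block, self.start_block + self.block_count)


# Example: read 4 blocks beginning at block 100.
call = FsmCall(FsmOp.READ, start_block=100, block_count=4)
print(list(call.blocks()))
```

Keeping the call a small immutable value mirrors the point that only a few well-defined calls drive the FSM.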
In response to the call 101, the RAID FSM 100 initializes 108 the state 102 related to processing the I/O call 101. The RAID FSM then issues a driver call 103 to the I/O driver 110. The I/O driver can maintain driver state 104 related to the driver call 103. Some drivers, as described below, cannot maintain state. This does not matter. The I/O driver 110 then writes I/O data 105 into the hardware registers 120 to begin the requested operation in the RAID 130. After the driver 110 has written the registers 120, the driver calls back 106 the RAID FSM 100 so that the RAID FSM state 102 can be updated. The RAID FSM 100 then calls back 107 the application that the requested operation has begun, and the application can resume execution.
At this time, both the RAID FSM 100 and the I/O driver 110 are temporarily finished. Indeed, no code needs to execute in either the RAID FSM 100 or the I/O driver 110 while the RAID 130 performs the requested I/O operation 101. Furthermore, no code needs to execute in the operating system, now or later, to manage the I/O operation and its completion.
After the requested I/O operation completes in the RAID 130, successful or not, an interrupt (call-back) 115 signals the driver 110, perhaps causing a completion procedure to be executed in the driver. The call-back can include status information, such as performance data or the reason for failure, e.g., corrupted data, time-out, etc. The completion procedure can update the driver state 104, and in turn call back 106 the RAID FSM 100. The RAID FSM updates its state 102, and signals 111 completion of the requested operation 101 to the application in another call-back. The application now knows that the requested operation in the I/O call 101 has been completed, and acknowledges this fact to the RAID FSM in signal 117. The RAID FSM can then discard the state 102 related to processing the I/O call 101, which is another form of state update. For completeness, the RAID FSM can, in turn, signal 116 the driver 110 to do the same. Note that the signals shown as dotted lines will not be further described, although they can be assumed to be used in the description below.
It should be noted that the RAID FSM 100 according to the invention is arranged between the application software and the I/O driver, whereas in traditional RAID systems the application usually communicates first with the I/O driver, and then the I/O driver communicates with the RAID driver. It should also be noted, that the RAID FSM can maintain multiple finite states, one for each I/O operation that is concurrently in progress.
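The call and call-back sequence described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the class and method names (`RaidFsm`, `OpState`, `driver_started`, `driver_completed`, `acknowledge`) are hypothetical, the driver and the application are stood in for by plain Python callables, and the reference numerals in the comments only indicate which step of the description each line is meant to mirror.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class OpState:
    """Entire finite state kept for one in-progress I/O operation (state 102)."""
    request: str
    phase: str = "initialized"


class RaidFsm:
    """State changes only inside call() and the call-back methods."""

    def __init__(self, driver_call: Callable[[int, str], None],
                 app_callback: Callable[[int, str], None]) -> None:
        self._driver_call = driver_call      # stands in for I/O driver 110
        self._app_callback = app_callback    # stands in for the application
        self._states: Dict[int, OpState] = {}
        self._next_id = 0

    def call(self, request: str) -> int:
        """I/O call 101: initialize state, then issue driver call 103."""
        op_id = self._next_id
        self._next_id += 1
        self._states[op_id] = OpState(request)
        self._driver_call(op_id, request)
        return op_id

    def driver_started(self, op_id: int) -> None:
        """Call-back 106 after the driver wrote the registers."""
        self._states[op_id].phase = "started"
        self._app_callback(op_id, "started")   # call-back 107

    def driver_completed(self, op_id: int, status: str) -> None:
        """Call-back 106 that follows interrupt 115."""
        self._states[op_id].phase = "completed"
        self._app_callback(op_id, status)      # completion signal 111

    def acknowledge(self, op_id: int) -> None:
        """Signal 117: the application acknowledges; the state is discarded."""
        del self._states[op_id]

    def in_flight(self) -> int:
        """Number of operations whose state is currently held."""
        return len(self._states)


# Two concurrent operations, each with its own finite state.
events, issued = [], []
fsm = RaidFsm(lambda op_id, req: issued.append((op_id, req)),
              lambda op_id, status: events.append((op_id, status)))
a = fsm.call("read 4 blocks at 100")
b = fsm.call("write 2 blocks at 7")
fsm.driver_started(a)
fsm.driver_started(b)
fsm.driver_completed(b, "ok")
fsm.acknowledge(b)
print(fsm.in_flight(), events)
```

Nothing in the sketch runs between call-backs, which is the property the description relies on: no thread or operating system process context is needed to carry an operation forward.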
Although the above described structure and operation might seem straightforward, this is not the case when the I/O request is for a multi-block operation, especially when it is desired to do so with a single process context entirely within the RAID FSM, i.e., external to any operating system context, so that the RAID 130 can operate with different operating systems, or none at all. In the prior art, state for large, multi-block I/O operations is usually maintained in a process context of the operating system.
Multi-Block Operation
From state A, the RAID FSM 100 issues a driver call 209 using the interface 103–106 to the I/O driver to start the operation for the first block 210 of the requested multi-block operation 206. When this operation completes, the driver signals 211 the RAID FSM 100 using the interface 113–116. This signal causes the RAID FSM to transition 212 to state B. State B triggers 213 the operation for the next block 214 using the interface 103–106. When that operation is complete, the driver signals 215 the RAID FSM 100 using the interface 113–116, and the RAID FSM transitions 216 to state C. From this point forward, the RAID FSM remains in state C while issuing 217 driver calls for all remaining blocks 218 using the interface 103–106. However, when the driver signals 219 completion of operation on the last block 119 using the interface 113–116, the RAID FSM 100 transitions 220 to state D. State D causes the RAID FSM 100 to acknowledge 222 to the application, via a call-back, using the interface 111–117, that the entire multi-block operation has completed.
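The A, B, C, D progression described above can be sketched as a small call-back driven class. This is an illustration only, with several assumptions: the names `MultiBlockOp`, `issue_block`, and `notify_app` are hypothetical, exactly one block is outstanding at a time, and error handling and status reporting are omitted.

```python
from enum import Enum
from typing import Callable


class State(Enum):
    A = "A"   # call received, first block issued
    B = "B"   # first block done, next block issued
    C = "C"   # steady state while remaining blocks are issued
    D = "D"   # last block done, application notified


class MultiBlockOp:
    """Advances only when the driver calls back; no thread or loop is needed."""

    def __init__(self, first_block: int, block_count: int,
                 issue_block: Callable[[int], None],
                 notify_app: Callable[[], None]) -> None:
        if block_count < 1:
            raise ValueError("block_count must be at least 1")
        self.state = State.A
        self._next = first_block
        self._end = first_block + block_count
        self._issue_block = issue_block   # stands in for the driver call
        self._notify_app = notify_app     # stands in for the final call-back
        self._issue_next()

    def _issue_next(self) -> None:
        block = self._next
        self._next += 1
        self._issue_block(block)

    def on_block_complete(self) -> None:
        """Driver call-back: one outstanding block has completed."""
        if self.state is State.D:
            raise RuntimeError("operation already complete")
        if self._next == self._end:
            # Every block was issued, so this completion is for the last one.
            self.state = State.D
            self._notify_app()
            return
        self.state = State.B if self.state is State.A else State.C
        self._issue_next()


# Four blocks starting at block 10; record the state after each call-back.
issued, done, trace = [], [], []
op = MultiBlockOp(10, 4, issued.append, lambda: done.append(True))
for _ in range(4):
    op.on_block_complete()
    trace.append(op.state.value)
print(issued, trace, done)
```

All progress information (the next block and the end of the range) lives in the object itself, so the operation can span thousands of blocks without borrowing a process context from an operating system.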
Mini-Port Driver Operation
In most operating systems, the I/O driver is used to translate the software I/O calls from the application to the RAID hardware registers. A commonly used driver with minimum functionality is called a “mini-port” driver (MPD). As a characteristic, a mini-port driver executing under a host computer operating system, such as Windows NT or Windows 95, is limited in how it can operate. For example, the mini-driver, by design, has no access to processes or threads. That is, it can be called a context-less driver. It is called in only one context, and it is expected to initialize the hardware and return as quickly as possible. Traditional RAID systems cannot operate solely within this limitation, particularly while performing multi-block operations. Therefore, prior art RAID systems must also use an operating system process or thread. A RAID system that uses the RAID FSM 100 according to the invention has no such requirements.
As shown in
During operation of the system, a user application 301 issues an I/O call to the operating system 302. The operating system 302 translates the I/O call into a driver call, and calls the software interface 303. The software interface 303 translates the driver call into the call format 101–107–111–117 used by the RAID FSM 100, i.e., RAID FSM calls as described above. The RAID FSM 100 initializes the finite state, and in turn calls the mini-driver 304 via the hardware interface 306, using the interface 103–106–113–116, and the mini-driver interacts 105–115 with the hardware, i.e., the registers 120 and RAID 130 of
Because the RAID FSM 100 according to the invention operates in a single process context, the driver calls and interrupts (call-backs) are sufficient to accomplish all operations, including multi-block operations, such as writing to the entire RAID, on-line expansion, and on-line RAID level migration.
Large RAID Storage System Operation
In this embodiment, the operating system or application software of the external system 401 calls a software interface 403 via the connection 402. The application may, or may not use multiple tasks 404–406 to manipulate data, e.g., interact with a large file system or database.
The tasks 404–406 can be controlled by a real-time operating system (RTOS) 407, and use calls specific for the RTOS. However, the RAID FSM 100 operates without using any RTOS context. Calls from the software interface 403 to the RAID FSM 100 are via the tasks and/or the RTOS using the interface 101–107–111–117. The RAID FSM 100 then calls the hardware interface 409 using the interface 103–106–113–116 which interacts 105–115 with the hardware, i.e., writes registers 120 and receives interrupts from the RAID 130, as described above. Translation of the calls and call-backs between the various components is done as described above.
Because the RAID FSM 100 only uses a finite state, the calls and call-backs are enough for all operations, including multi-block operations, such as writing to the entire RAID, on-line expansion, and on-line RAID level migration. No RTOS specific tasks are needed, nor are any RTOS specific functions. All RAID system operations are accomplished by using only I/O calls, and completion call-backs.
Operating a RAID System without an Operating System
In this embodiment of the invention, the host drivers 501 write and read PCI bus registers to initiate an I/O operation, i.e., electronic signals. The software interface 504 translates the registers written by the host driver into calls that are compatible with the RAID FSM 100, using the interface 101–107, as described above. The RAID FSM then calls the hardware interface 506 using the interface 103–106, which in turn interacts 105–115 with the hardware 120–130 as described above.
When the I/O operation is complete, the hardware interface 506 receives an interrupt, and calls back the RAID FSM 100 using the interface 113–116, as described above. The RAID FSM 100 in turn causes a call-back to the PCI software interface 504, using the interface 111–117, which then interrupts (calls-back) the host driver 501 (application) through the PCI bus 502. This embodiment has the same advantages as described above. All RAID system operations are done with I/O calls and completion call-backs. Such a system can be used, for example, to automatically and reliably perform large-scale periodic data back-ups without operating system intervention.
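The translation performed by the software interface 504 can be illustrated with a short sketch. Everything specific here is an assumption made for the example: the description does not give a register layout, so the register names (`opcode`, `start_block`, `block_count`), the opcode values, and the names `translate_registers` and `SoftwareInterface` are all hypothetical.

```python
from typing import Callable, Dict, Tuple

# Hypothetical opcode encoding; the text does not define one.
OPCODES = {0: "initialize", 1: "read", 2: "write", 3: "copy"}


def translate_registers(regs: Dict[str, int]) -> Tuple[str, int, int]:
    """Turn values written by the host driver into an FSM call tuple."""
    opcode = regs["opcode"]
    if opcode not in OPCODES:
        raise ValueError(f"unknown opcode {opcode}")
    return OPCODES[opcode], regs["start_block"], regs["block_count"]


class SoftwareInterface:
    """Sits between the bus registers and the FSM; keeps no I/O state itself."""

    def __init__(self, fsm_call: Callable[[str, int, int], None],
                 interrupt_host: Callable[[str], None]) -> None:
        self._fsm_call = fsm_call              # call into the RAID FSM
        self._interrupt_host = interrupt_host  # interrupt back over the bus

    def registers_written(self, regs: Dict[str, int]) -> None:
        """Host driver wrote the registers: translate and call the FSM."""
        self._fsm_call(*translate_registers(regs))

    def fsm_completed(self, status: str) -> None:
        """Call-back from the FSM: interrupt the host driver."""
        self._interrupt_host(status)


calls, interrupts = [], []
iface = SoftwareInterface(lambda op, start, n: calls.append((op, start, n)),
                          interrupts.append)
iface.registers_written({"opcode": 1, "start_block": 100, "block_count": 4})
iface.fsm_completed("ok")
print(calls, interrupts)
```

The interface only translates in both directions; the finite state for the operation stays in the RAID FSM, consistent with the description above.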
Detailed descriptions of the preferred embodiment are provided herein. It is to be understood, however, that the present invention may be embodied in various forms. Therefore, specific details disclosed herein are not to be interpreted as limiting, but rather as a basis for the claims and as a representative basis for teaching one skilled in the art to employ the present invention in virtually any appropriately detailed system, structure or manner.
Noya, Eric S., Arnott, Randy M., Wong, Jeffrey T., Franklin, Chris R.
Executed on | Assignor | Assignee | Conveyance | Frame | Reel | Doc |
Mar 06 2000 | ARNOTT, RANDY M | RAIDCORE, INC | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 011574 | /0693 | |
Mar 06 2000 | NOYA, ERIC S | RAIDCORE, INC | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 011574 | /0693 | |
Mar 06 2000 | WONG, JEFFREY T | RAIDCORE, INC | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 011574 | /0693 | |
Mar 06 2000 | FRANKLIN, CHRIS R | RAIDCORE, INC | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 011574 | /0693 | |
Feb 27 2001 | Broadcom Corporation | (assignment on the face of the patent) | / | |||
Apr 11 2007 | RAIDCORE, INC | Broadcom Corporation | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 019146 | /0643 | |
Feb 01 2016 | Broadcom Corporation | BANK OF AMERICA, N A , AS COLLATERAL AGENT | PATENT SECURITY AGREEMENT | 037806 | /0001 | |
Jan 19 2017 | BANK OF AMERICA, N A , AS COLLATERAL AGENT | Broadcom Corporation | TERMINATION AND RELEASE OF SECURITY INTEREST IN PATENTS | 041712 | /0001 | |
Jan 20 2017 | Broadcom Corporation | AVAGO TECHNOLOGIES GENERAL IP SINGAPORE PTE LTD | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 041706 | /0001 | |
May 09 2018 | AVAGO TECHNOLOGIES GENERAL IP SINGAPORE PTE LTD | AVAGO TECHNOLOGIES INTERNATIONAL SALES PTE LIMITED | MERGER SEE DOCUMENT FOR DETAILS | 047196 | /0097 | |
Sep 05 2018 | AVAGO TECHNOLOGIES GENERAL IP SINGAPORE PTE LTD | AVAGO TECHNOLOGIES INTERNATIONAL SALES PTE LIMITED | CORRECTIVE ASSIGNMENT TO CORRECT THE EXECUTION DATE PREVIOUSLY RECORDED AT REEL: 047196 FRAME: 0097 ASSIGNOR S HEREBY CONFIRMS THE MERGER | 048555 | /0510 |
Date | Maintenance Fee Events |
Dec 20 2010 | REM: Maintenance Fee Reminder Mailed. |
Mar 11 2011 | M1551: Payment of Maintenance Fee, 4th Year, Large Entity. |
Mar 11 2011 | M1554: Surcharge for Late Payment, Large Entity. |
Dec 01 2014 | M1552: Payment of Maintenance Fee, 8th Year, Large Entity. |
Dec 01 2014 | M1555: 7.5 yr surcharge - late pmt w/in 6 mo, Large Entity. |
Nov 15 2018 | M1553: Payment of Maintenance Fee, 12th Year, Large Entity. |
Date | Maintenance Schedule |
May 15 2010 | 4 years fee payment window open |
Nov 15 2010 | 6 months grace period start (w surcharge) |
May 15 2011 | patent expiry (for year 4) |
May 15 2013 | 2 years to revive unintentionally abandoned end. (for year 4) |
May 15 2014 | 8 years fee payment window open |
Nov 15 2014 | 6 months grace period start (w surcharge) |
May 15 2015 | patent expiry (for year 8) |
May 15 2017 | 2 years to revive unintentionally abandoned end. (for year 8) |
May 15 2018 | 12 years fee payment window open |
Nov 15 2018 | 6 months grace period start (w surcharge) |
May 15 2019 | patent expiry (for year 12) |
May 15 2021 | 2 years to revive unintentionally abandoned end. (for year 12) |