A method, system and apparatus for cascading backup mirrors are provided. A mirroring map is created. The mirroring map includes at least three mirrors. A first mirror of the three mirrors is set to synchronize to a second mirror and a third mirror is set to synchronize to the first mirror. The first and the third mirror are backup mirrors and the second mirror is a working mirror. One of the backup mirrors is located remotely and the other locally.
1. A method of cascading backup logical volume mirrors comprising the steps of:
creating a mirroring map, the mirroring map including at least three mirrors: a working mirror and a first and a second backup logical volume mirror; and setting the first backup logical volume mirror to be synchronized to the working mirror and the second backup logical volume mirror to be synchronized to the first backup logical volume mirror.
7. An apparatus for cascading backup logical volume mirrors comprising:
means for creating a mirroring map, the mirroring map including at least three mirrors: a working mirror and a first and a second backup logical volume mirror; and means for setting the first backup logical volume mirror to be synchronized to the working mirror and the second backup logical volume mirror to be synchronized to the first backup logical volume mirror.
4. A computer program product on a computer readable medium for cascading backup logical volume mirrors comprising:
code means for creating a mirroring map, the mirroring map including at least three mirrors: a working mirror and a first and a second backup logical volume mirror; and code means for setting the first backup logical volume mirror to be synchronized to the working mirror and the second backup logical volume mirror to be synchronized to the first backup logical volume mirror.
10. A computer system for cascading backup logical volume mirrors comprising:
at least one storage device for storing code data; and at least one processor for processing the code data to create a mirroring map, the mirroring map including at least three mirrors: a working mirror and a first and a second backup logical volume mirror, and to set the first backup logical volume mirror to be synchronized to the working mirror and the second backup logical volume mirror to be synchronized to the first backup logical volume mirror.
2. The method of
3. The method of
5. The computer program product of
6. The computer program product of
8. The apparatus of
9. The apparatus of
11. The computer system of
12. The computer system of
This application is related to co-pending U.S. patent application Ser. No. 10/116,520, entitled APPARATUS AND METHOD OF MAINTAINING RELIABLE OFFLINE MIRROR COPIES IN VIRTUAL VOLUME GROUPS, filed on even date herewith and assigned to the common assignee of this application.
1. Technical Field
The present invention is directed to a method and apparatus for managing data storage systems. More specifically, the present invention is directed to a method and apparatus for cascading logical volume mirrors.
2. Description of Related Art
Most computer systems are made up of at least one processor and one physical storage system. The processor processes, stores and retrieves data from the physical storage system under the guidance of an application program.
Application programs generally run atop an operating system. Among the many tasks of an operating system is that of allowing an application program to have a rather simplistic view of how data (i.e., data files) are stored within a physical storage system. Typically, an application program views the physical storage system as containing a number of hierarchical partitions (i.e., directories) within which entire data files are stored. This simplistic view is often referred to as a logical view since most files are not really stored as unit bodies into directories but rather are broken up into data blocks that may be strewn across the entire physical storage system.
The operating system is able to allow an application program to have this simplistic logical view with the help of a file management system. The file management system stores directory structures, breaks up data files into their constituent data blocks, stores the data blocks throughout a physical storage system and maintains data logs of where every piece of data is stored. Thus, the file management system is consulted whenever data files are being stored or retrieved from storage.
Computer systems that have a plurality of physical storage systems (e.g., servers) use an added layer of abstraction when storing and retrieving data. The added layer of abstraction is a logical volume manager (LVM). Volume, in this case, is the storage capacity of a physical storage system. Thus, volume and physical storage system will henceforth be used interchangeably.
The LVM arranges the physical storage systems into volume groups in order to give the impression that a storage system with a much more voluminous storage capacity is being used. Within each volume group, one or more logical volumes may be defined. Data stored in a logical volume appears to be stored contiguously. In actuality, however, the data may be interspersed among many different locations across all the physical storage systems that make up the volume group.
Stated differently, each logical volume in a logical volume group is divided into logical partitions. Likewise, each physical volume in a physical volume group is divided into physical partitions. Each logical partition corresponds to at least one physical partition. But although the logical partitions in a logical volume are numbered consecutively or appear to be contiguous to each other, the physical partitions to which they each correspond need not be contiguous to each other. And indeed, most often, the physical partitions are not contiguous to each other. Thus, one of the many tasks of the LVM is to keep tabs on the location of each physical partition that corresponds to a logical partition.
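As a sketch, the logical-to-physical correspondence described above might look like the following; the map layout and the disk and partition identifiers are purely illustrative and do not appear in the specification.

```python
# Consecutively numbered logical partitions, each backed by one or more
# (physical volume, physical partition) pairs that need not be contiguous
# on any physical volume.
logical_to_physical = {
    0: [("disk0", 17)],
    1: [("disk1", 3)],
    2: [("disk0", 92)],
}

def resolve(logical_partition):
    """One of the LVM's tasks: locate the physical partition(s) backing a logical one."""
    return logical_to_physical[logical_partition]
```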
For fault tolerance and performance, some servers store at least one extra copy of each piece of data onto the physical storage systems they use. Storing more than one copy of a piece of data is called mirroring the data. In order to store mirrored data, each logical partition used must correspond to as many physical partitions as there are mirrors (or copies) of the data. In other words, if the data is mirrored three times, for example, each logical partition will correspond to three physical partitions.
Writing data in mirrors is quite a time-consuming and CPU-intensive endeavor. Thus, when there are more than two mirrors, some system administrators designate one of the mirrors as a backup mirror and the others as working mirrors. As alluded to above, data is usually written concurrently into all the working mirrors. However, updates are made to the backup mirror only periodically (e.g., once a day). One working mirror is usually designated as the mirror that will provide the updates. Using data from a working mirror to update a backup mirror is referred to as synchronizing the backup mirror to the designated working mirror.
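The write-concurrently-then-synchronize-periodically behavior described above can be sketched as follows; the mirror names, the use of dictionaries for mirrors, and the pending-change set are illustrative assumptions, not structures from the specification.

```python
# Sketch of two working mirrors and one backup mirror. Writes go to both
# working mirrors immediately; the backup mirror is refreshed only when it
# is synchronized to a designated working mirror.
working_a = {}   # working mirror 1
working_b = {}   # working mirror 2
backup = {}      # backup mirror, updated only periodically
pending = set()  # partitions modified since the last synchronization

def write(partition, data):
    working_a[partition] = data   # data is written concurrently
    working_b[partition] = data   # into all the working mirrors
    pending.add(partition)        # noted for the next backup synchronization

def synchronize_backup(designated_working_mirror):
    """Synchronize the backup mirror to the designated working mirror."""
    for partition in pending:
        backup[partition] = designated_working_mirror[partition]
    pending.clear()

write(0, "alpha")
write(1, "beta")
synchronize_backup(working_a)   # e.g., the once-a-day update
```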
For disaster recovery, some computer systems may have another mirror located at a remote location. This mirror may also be designated as another backup mirror. It should be noted however, that during the time a backup mirror is being synchronized with a working mirror, no application programs may have access to any of the working mirrors. Therefore, it may not be practical to synchronize a remote backup mirror with a working mirror.
Thus, what is needed is an apparatus and method of synchronizing one backup mirror to another backup mirror.
The present invention provides a method, system and apparatus for cascading backup mirrors. A mirroring map is created. The mirroring map includes at least three mirrors. A first mirror of the three mirrors is set to synchronize to a second mirror and a third mirror is set to synchronize to the first mirror. The first and the third mirror are backup mirrors and the second mirror is a working mirror. One of the backup mirrors is located remotely and the other locally.
The novel features believed characteristic of the invention are set forth in the appended claims. The invention itself, however, as well as a preferred mode of use, further objectives and advantages thereof, will best be understood by reference to the following detailed description of an illustrative embodiment when read in conjunction with the accompanying drawings, wherein:
With reference now to the figures, FIG. 1 depicts a network data processing system 100 in which the present invention may be implemented.
In the depicted example, server 104 is connected to network 102 along with storage unit 106. In addition, clients 108, 110, and 112 are connected to network 102. These clients 108, 110, and 112 may be, for example, personal computers or network computers. In the depicted example, server 104 provides data, such as boot files, operating system images, and applications to clients 108, 110 and 112. Clients 108, 110 and 112 are clients to server 104. Network data processing system 100 may include additional servers, clients, and other devices not shown. In the depicted example, network data processing system 100 is the Internet with network 102 representing a worldwide collection of networks and gateways that use the TCP/IP suite of protocols to communicate with one another. At the heart of the Internet is a backbone of high-speed data communication lines between major nodes or host computers, consisting of thousands of commercial, government, educational and other computer systems that route data and messages. Of course, network data processing system 100 also may be implemented as a number of different types of networks, such as for example, an intranet, a local area network (LAN), or a wide area network (WAN).
Referring to FIG. 2, a server such as server 104 is depicted in greater detail. Peripheral component interconnect (PCI) bus bridge 214 connected to I/O bus 212 provides an interface to PCI local bus 216. A number of modems may be connected to PCI local bus 216. Typical PCI bus implementations will support four PCI expansion slots or add-in connectors. Communications links to network computers 108, 110 and 112 in FIG. 1 may be provided through such modems.
Those of ordinary skill in the art will appreciate that the hardware depicted in FIG. 2 may vary depending on the implementation. The data processing system depicted in FIG. 2 is intended as an example of a server on which the present invention may be practiced, not as an architectural limitation.
With reference now to FIG. 3, an operating system runs on processor 302 and is used to coordinate and provide control of various components within data processing system 300. The operating system may be a commercially available operating system, such as Windows 2000, which is available from Microsoft Corporation. An object-oriented programming system such as Java may run in conjunction with the operating system and provide calls to the operating system from Java programs or applications executing on data processing system 300. "Java" is a trademark of Sun Microsystems, Inc. Instructions for the operating system, the object-oriented programming system, and applications or programs are located on storage devices, such as hard disk drive 326, and may be loaded into main memory 304 for execution by processor 302.
Those of ordinary skill in the art will appreciate that the hardware in FIG. 3 may vary depending on the implementation; for example, other peripheral devices may be used in addition to or in place of the hardware depicted.
As another example, data processing system 300 may be a stand-alone system configured to be bootable without relying on some type of network communication interface, whether or not data processing system 300 comprises some type of network communication interface. As a further example, data processing system 300 may be a Personal Digital Assistant (PDA) device, which is configured with ROM and/or flash ROM in order to provide non-volatile memory for storing operating system files and/or user-generated data.
The depicted example in FIG. 3 and above-described examples are not meant to imply architectural limitations. For example, data processing system 300 may also be a notebook computer or hand held computer in addition to taking the form of a PDA. Data processing system 300 also may be a kiosk or a Web appliance.
The present invention provides an apparatus and method of synchronizing one backup mirror to another backup mirror. Although the invention may preferably be local to server 104, it may nonetheless be local to client systems 108, 110 and 112 of FIG. 1.
To better understand the invention, a more detailed explanation of the LVM is needed. The LVM interacts with application programs and the physical storage devices as shown in FIG. 4.
The logical layer 410 is, for all intents and purposes, the LVM. The LVM may be regarded as being made up of a set of operating system commands, library subroutines or other tools that allow a user to establish and control logical volume storage. The LVM controls physical storage system resources by mapping data between a simple and flexible logical view of storage space and the actual physical storage system. The LVM does this by using a layer of device driver code that runs above traditional device drivers. This logical view of the disk storage is provided to application programs and is independent of the underlying physical disk structure.
The logical layer 410 contains a logical volume 412 that interacts with logical volume device driver 414. A device driver, as is well known in the art, acts as a translator between a device and the programs that use the device. That is, the device driver accepts generic commands from programs and translates them into specialized commands for the device. In this case, the logical volume device driver 414 translates commands from an application program that may be executing on the computer system for device driver 430. Thus, when an application program sends commands to file system manager 402 to store or retrieve data from logical volume 412, the file system manager 402 informs the LVM of the application program's request. The LVM then conveys the request to the logical volume device driver 414. The logical volume device driver 414 then consults the appropriate map and instructs the device driver 430 which of the physical storage systems 422, 424, 426 and 428 to use for the data.
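The chain of translations just described (file system manager to logical volume device driver to device driver) can be sketched with toy classes; all class and method names here are invented for illustration and the real drivers are, of course, kernel code rather than Python objects.

```python
class DeviceDriver:
    """Stand-in for device driver 430: issues specialized commands to disks."""
    def __init__(self):
        self.log = []  # records (storage system, physical partition, data)

    def write(self, storage_system, physical_partition, data):
        self.log.append((storage_system, physical_partition, data))

class LogicalVolumeDeviceDriver:
    """Stand-in for driver 414: consults the map to pick physical targets."""
    def __init__(self, partition_map, device_driver):
        self.partition_map = partition_map  # logical partition -> physical locations
        self.device_driver = device_driver

    def write(self, logical_partition, data):
        for storage_system, physical_partition in self.partition_map[logical_partition]:
            self.device_driver.write(storage_system, physical_partition, data)

class FileSystemManager:
    """Stand-in for file system manager 402: accepts requests from applications."""
    def __init__(self, lv_device_driver):
        self.lv_device_driver = lv_device_driver

    def store(self, logical_partition, data):
        self.lv_device_driver.write(logical_partition, data)

# One logical partition mirrored onto two physical storage systems.
driver = DeviceDriver()
manager = FileSystemManager(
    LogicalVolumeDeviceDriver({0: [("PSS-1", 5), ("PSS-2", 9)]}, driver))
manager.store(0, "block")
```

A single store request fans out into one physical write per mirrored physical partition, which is the behavior the layered design is meant to hide from the application.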
When a system administrator wants to mirror a piece of data, the administrator has to devise a map (or mirroring scheme) to correlate the logical volume being used to the actual physical storage systems in which the data is to be stored. Generally, this map correlates the logical partitions to the physical partitions of the physical storage systems that are to be used. This map is stored in the LVM.
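A mirroring map with the cascade settings described in the summary (a remote backup synchronized to a local backup, which is in turn synchronized to a working mirror) might be represented as below; the dictionary layout, the role labels, and the helper function are illustrative assumptions, not the LVM's actual representation.

```python
# Hypothetical mirroring map: each mirror records the mirror it is
# synchronized to. Working mirrors have no synchronization source.
mirroring_map = {
    "PSS-1": {"role": "working",       "sync_source": None},
    "PSS-2": {"role": "working",       "sync_source": None},
    "PSS-3": {"role": "local backup",  "sync_source": "PSS-1"},
    "PSS-4": {"role": "remote backup", "sync_source": "PSS-3"},
}

def sync_chain(mirror):
    """Follow sync_source links to see which mirror a backup ultimately tracks."""
    chain = [mirror]
    while mirroring_map[mirror]["sync_source"] is not None:
        mirror = mirroring_map[mirror]["sync_source"]
        chain.append(mirror)
    return chain
```

Here `sync_chain("PSS-4")` walks PSS-4 to PSS-3 to PSS-1, showing that the remote backup never synchronizes directly against a working mirror.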
When the computer system effectuates a write operation into PSS-1 500, a write operation is also effectuated into PSS-2 510 and vice versa, and the data is written into the appropriate locations as per the mirroring map. However, a write operation is not performed into PSS-3 520 and PSS-4 530. Instead, a table of modifications is updated in the LVM. This table is consulted periodically. Specifically, the table is consulted before PSS-3 520 is to be synchronized to a designated local working mirror (i.e., either PSS-1 500 or PSS-2 510) to determine which pieces of data are to be written into PSS-3 520 to perform the synchronization.
As stated in the Background of the Invention, when the working mirrors are being backed up by the local backup mirror (i.e., when PSS-3 520 is being synchronized to PSS-1 500) application programs that are running on the computer system do not have access to the working mirrors (e.g., cannot read or write into either PSS-1 500 or PSS-2 510). Some application programs may have time-sensitive information that may need to be stored or read from the working mirrors. Thus, a method that allows the application programs to have constant access to the data as well as to modify and write new data into the physical systems must be devised.
One way to allow the application programs to continue reading and writing data is to split off the working mirrors. So, just before the time that PSS-3 520 is to be synchronized to PSS-1 500, PSS-1 500 may be disassociated from PSS-2 510. The application programs will continue to read from or write to PSS-2 510 but not PSS-1 500. After PSS-3 520 is synchronized to PSS-1 500, PSS-1 500 may be re-associated with PSS-2 510.
Ordinarily, when the two working mirrors are disassociated, the working mirror to which the backup mirror is to be synchronized will be ported to another computer system. There, the file systems on the mirror will be mounted. To mount a file system is to make it available for use. Once mounted, the synchronization procedure may be initiated. When the file systems are mounted, some data may be written into the mirror. The written data may be metadata, such as the date and time the file systems were mounted. Any data written into the mirror in this fashion should be marked stale before re-associating the mirrors. Upon re-association, all data marked stale will be discarded.
As mentioned before, during disassociation new data may have been written into PSS-2 510 or existing data in PSS-2 510 may have been modified. Thus, upon re-association, PSS-1 500 will not be a true mirror of PSS-2 510. To ensure that PSS-1 500 again becomes a true mirror of PSS-2 510 after re-association, new and modified data written into PSS-2 510 while the mirrors were disassociated are entered into a modification table. After re-association, the new and modified data are copied from PSS-2 510 into PSS-1 500. When this occurs, PSS-1 500 and PSS-2 510 become true mirrors of each other again.
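The split-off, synchronize, and re-associate sequence of the last few paragraphs can be sketched as follows; all names, and the use of plain dictionaries for mirrors and for the modification table, are illustrative assumptions.

```python
pss1 = {"a": 1, "b": 2}     # working mirror that will be split off
pss2 = dict(pss1)           # working mirror that stays online
pss3 = {}                   # local backup mirror
modification_table = {}     # changes made to PSS-2 while the mirrors are split

def write_while_split(key, value):
    """Applications keep writing, but only into the online mirror PSS-2."""
    pss2[key] = value
    modification_table[key] = value  # remembered for re-association

def synchronize_backup():
    """Synchronize the backup PSS-3 to the split-off mirror PSS-1."""
    pss3.clear()
    pss3.update(pss1)

def reassociate():
    """Replay the modification table so PSS-1 is a true mirror of PSS-2 again."""
    pss1.update(modification_table)
    modification_table.clear()

write_while_split("c", 3)   # happens while PSS-1 is disassociated
synchronize_backup()        # PSS-3 receives the split-off mirror's contents
reassociate()               # PSS-1 catches up with PSS-2
```

Note that the backup reflects the mirror as it stood at the split, while the re-associated working mirrors converge on the newest data.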
To disassociate PSS-1 500 from PSS-2 510, the mirroring map must be modified. That is, partitions of the logical volume that originally correspond to physical partitions in both PSS-1 500 and PSS-2 510 will correspond only to partitions in PSS-2 510. After re-association, the mirroring map may be modified once more to correlate each partition in the logical volume to the physical partitions in both PSS-1 500 and PSS-2 510.
As described, the remote backup mirror is synchronized to the local backup mirror. The invention can therefore be extended to have a second remote or local backup mirror synchronized to the first remote backup mirror, a third remote backup mirror synchronized to the second remote backup mirror, and so on, to obtain cascaded backup mirrors.
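The cascade extension amounts to synchronizing every backup mirror from its immediate predecessor in a chain, so that only the first backup ever touches a working mirror; a toy sketch, with all names and the dictionary representation assumed for illustration:

```python
def cascade_sync(mirrors):
    """Synchronize each mirror in the chain from the one before it.

    mirrors[0] is the designated working mirror; each later entry is a
    backup mirror synchronized to its immediate predecessor.
    """
    for upstream, downstream in zip(mirrors, mirrors[1:]):
        downstream.clear()
        downstream.update(upstream)

working = {"x": 1, "y": 2}
local_backup = {}
first_remote_backup = {}
second_remote_backup = {}
cascade_sync([working, local_backup, first_remote_backup, second_remote_backup])
```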
The description of the present invention has been presented for purposes of illustration and description, and is not intended to be exhaustive or limited to the invention in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art. The embodiment was chosen and described in order to best explain the principles of the invention, the practical application, and to enable others of ordinary skill in the art to understand the invention for various embodiments with various modifications as are suited to the particular use contemplated.
McBrearty, Gerald Francis, Mullen, Shawn Patrick, Shieh, Johnny Meng-Han
Assignors: MCBREARTY, GERALD FRANCIS; MULLEN, SHAWN PATRICK; SHIEH, JOHNNY MENG-HAN (assignment of assignors' interest executed Mar 27 2002; Reel/Frame 012773/0708). Assignee: International Business Machines Corporation. Application filed Apr 04 2002.