Changed information is provided to multiple masters of a multi-master environment. In order to facilitate the providing of the changed information to the various masters, at least one replication data structure is used. This data structure is managed in such a way that conflicts are avoided in updating the data structure, and thus, in communicating the changed information to the masters.
1. A computer-implemented method of managing replication information, said method comprising the steps of:
generating a first table having rows, each of said rows identifying a respective change replicated by a respective server, describing said respective change replicated by said respective server, and identifying said respective server which replicated said respective change;
generating a second table,
said second table having a first row for a first one of a plurality of servers, wherein said first row comprising an identity of a last change which has been replicated by said first server and originated from said first server and an identity of a last change which has been replicated by said first server and originated from a second server of said plurality of servers, and
said second table having a second row for said second server of said plurality of servers, wherein said second row comprising an identity of a last change which has been replicated by said second server and originated from said second server and an identity of a last change which has been replicated by said second server and originated from said first server; and
identifying from said second table an identification of a change which has been replicated by both said first and second servers, and in response, deleting a row in said first table for said change which has been replicated by both said first and second servers, wherein said identifying and said deleting are performed by one of said first or second server.
7. A system for managing replication information, said system comprising:
a processor;
means for generating a first table having rows, each of said rows identifying a respective change replicated by a respective server, describing said respective change replicated by said respective server, and identifying said respective server which replicated said respective change;
means for generating a second table,
said second table having a first row for a first one of a plurality of servers, wherein said first row comprising an identity of a last change which has been replicated by said first server and originated from said first server and an identity of a last change which has been replicated by said first server and originated from a second server of said plurality of servers, and
said second table having a second row for said second server of said plurality of servers, wherein said second row comprising an identity of a last change which has been replicated by said second server and originated from said second server and an identity of a last change which has been replicated by said second server and originated from said first server; and
means for identifying from said second table an identification of a change which has been replicated by both said first and second servers, and in response, deleting a row in said first table for said change which has been replicated by both said first and second servers, wherein said identifying and said deleting are performed by one of said first or second server.
13. A computer program product for managing replication information, said computer program product comprising:
a computer-readable storage media;
first program instructions to generate a first table having rows, each of said rows identifying a respective change replicated by a respective server, describing said respective change replicated by said respective server, and identifying said respective server which replicated said respective change;
second program instructions to generate a second table,
said second table having a first row for a first one of a plurality of servers, wherein said first row comprising an identity of a last change which has been replicated by said first server and originated from said first server and an identity of a last change which has been replicated by said first server and originated from a second server of said plurality of servers, and
said second table having a second row for said second server of said plurality of servers, wherein said second row comprising an identity of a last change which has been replicated by said second server and originated from said second server and an identity of a last change which has been replicated by said second server and originated from said first server; and
third program instructions to identify from said second table an identification of a change which has been replicated by both said first and second servers, and in response, delete a row in said first table for said change which has been replicated by both said first and second servers, wherein said third program instructions to identify and delete are performed by one of said first or second server; and,
wherein said first, second, and third program instructions are stored on said computer-readable storage media in functional form.
2. A computer-implemented method as set forth in
3. A computer-implemented method as set forth in
4. A computer-implemented method as set forth in
identifying from said first table at least one other change originating from said first server, wherein said at least one other change is still listed in said first table and is older than said change which has been replicated by both said first and second servers; and
deleting from said first table said at least one other change originating from said first server, wherein said at least one other change is still listed in said first table and is older than said change which has been replicated by both said first and second servers.
5. A computer-implemented method as set forth in
6. A computer-implemented method as set forth in
said first server replicating a latest change in said first server, and in response, said first server updating said first row of said second table to indicate said first server replicated said latest change in said first server; and
said second server replicating a latest change in said second server, and in response, said second server updating said second row of said second table to indicate said second server replicated said latest change in said second server.
8. A system as set forth in
9. A system as set forth in
10. A system as set forth in
means for identifying from said first table at least one other change originating from said first server, wherein said at least one other change is still listed in said first table and is older than said change which has been replicated by both said first and second servers; and
means for deleting from said first table said at least one other change originating from said first server, wherein said at least one other change is still listed in said first table and is older than said change which has been replicated by both said first and second servers.
11. A system as set forth in
12. A system as set forth in
said first server including means for replicating a latest change in said first server and updating said first row of said second table to indicate said first server replicated said latest change in said first server; and
said second server including means for replicating a latest change in said second server and updating said second row of said second table to indicate said second server replicated said latest change in said second server.
14. A computer program product as set forth in
15. A computer program product as set forth in
16. A computer program product as set forth in
fourth program instructions to identify from said first table at least one other change originating from said first server, wherein said at least one other change is still listed in said first table and is older than said change which has been replicated by both said first and second servers; and
fifth program instructions to delete from said first table said at least one other change originating from said first server, wherein said at least one other change is still listed in said first table and is older than said change which has been replicated by both said first and second servers; and,
wherein said fourth and fifth program instructions are stored on said computer-readable storage media in functional form.
17. A computer program product as set forth in
18. A computer program product as set forth in
fourth program instructions for execution in said first server to replicate a latest change in said first server and update said first row of said second table to indicate said first server replicated said latest change in said first server; and
fifth program instructions for execution in said second server to replicate a latest change in said second server and update said second row of said second table to indicate said second server replicated said latest change in said second server; and,
wherein said fourth and fifth program instructions are stored on said computer-readable storage media in functional form.
This invention relates, in general, to replicating changed information between servers of a computing environment, and in particular, to replicating changed information between multiple master servers of a multi-master server environment.
Replication is a mechanism in which information held on one server, e.g., a directory server, is copied or replicated to one or more other servers, all of which provide access to the same set of information. Replication can be performed in various ways and based on different replication models. As examples, one replication model includes a master/slave model, while another model includes a multi-master model, each of which is described below.
In the master/slave replication model, a single server is the updateable master server, while the other servers are read-only slave servers. Although this model can assist in load-balancing heavily read-biased workloads, it is deficient in those environments that require highly available, workload-balanced read/write access to information in a directory. To address these needs, multi-master replication is employed.
With multi-master replication, multiple servers allow simultaneous write access to the same set of information by allowing each server to update its own copy of the set of information. Protocols are then used to transmit changes made at the different master servers to other master servers so that each server can update its copy of the information. Conflict resolution techniques are also used to work out differences resulting from simultaneous and/or conflicting updates made at multiple different master servers in the multi-master environment.
Although replication protocols have been established to manage the provision of changes made at multiple servers, enhancements are still needed. For example, a need still exists for a replication capability that provides the information in a simpler and more timely manner. As another example, a need still exists for a replication capability that avoids the need for conflict resolution at the time of providing the changed information.
The shortcomings of the prior art are overcome and additional advantages are provided through the provision of a method of facilitating the providing of change information to masters of a multi-master environment, wherein at least two masters of the multi-master environment have a copy of the replicated set of information. The method includes, for instance, writing by a master of the multi-master environment a change information entry to a data structure modifiable by a plurality of masters of the multi-master environment, wherein the change information entry corresponds to one or more changes of the master's copy of the replicated set of information; and providing the data structure to another master of the multi-master environment to enable the another master to update its copy of the replicated set of information.
System and computer program products corresponding to the above-summarized methods are also described and claimed herein.
Additional features and advantages are realized through the techniques of the present invention. Other embodiments and aspects of the invention are described in detail herein and are considered a part of the claimed invention.
The subject matter which is regarded as the invention is particularly pointed out and distinctly claimed in the claims at the conclusion of the specification. The foregoing and other objects, features, and advantages of the invention are apparent from the following detailed description taken in conjunction with the accompanying drawings in which:
In accordance with an aspect of the present invention, a replication capability is provided for a multi-master environment in which changed information is replicated among multiple masters of the environment, in such a manner that conflict resolution is substantially avoided in communicating the changes. In a further aspect, change information is communicated, even if a particular server is not available at the time the change is made. This is accomplished by, for example, using one or more data structures to provide the changed information to the various masters.
One embodiment of a computing environment incorporating and using one or more aspects of the present invention is described with reference to
A directory is a database, which is structured as a hierarchical tree of entries. An entry includes information in the form of attribute/value combinations. An attribute has a name, referred to as the attribute type, and one or more values. An attribute containing more than one value is referred to as multi-valued. Operations can be performed against the directory tree. These include adding an entry to the tree; modifying an entry; deleting an entry; searching for entries matching a filter, starting at some root entry; reading an entry; and comparing an attribute value to a test/assertion value. In one example, the directories are managed by one or more masters, which are also known as directory servers or replicas. A master provides access to one or more trees of entries, as well as some form of replication with other directory servers.
In one example, replication includes transmitting changes made to a directory between multiple masters holding copies of the directory. In accordance with an aspect of the present invention, the transmission includes employing at least one replication data structure 104. As one example, there is a set of replication data structures (e.g., a change table and a status table) for each replicated set of information (e.g., each replicated directory). That is, if there are two directories replicated on four masters, then there is a change table and a status table for each of the two directories. The tables (e.g., the change tables) serve as the communications medium by which the masters or replicas are informed of changes made to the directory entries. Further details regarding the replication data structures are described below.
One embodiment of a status data structure is described with reference to
In one example, the vector includes one or more pairs of values 210 (
One embodiment of a change data structure is described with reference to
Rows are deleted from the change table, in response to any master noticing that the changes of the rows have been seen by all the masters participating in the replication environment. Rows in the change table are not updated. This enables multiple masters to simultaneously write to the change table without conflicts.
In one example, row 222 of the change table includes, for instance, a ChangeId 224, which is a primary key for the table, and is a value that uniquely identifies the change; a ReplicaId 226 that indicates the master or replica that performed the change; an EntryId 228 that indicates the entry in the directory upon which the change was made; and ChangeInformation 230 that includes the detailed set of changes that were applied to the entry by the replica that made the change. The ReplicaId aids in processing/searching the change table as changes sometimes are searched by the replica in which the change was made. The changes in the change table are provided to the various replicas of the environment, so that the replicas can apply the changes to their copies of the directory.
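As a minimal illustrative sketch, a change table row of the kind just described could be modeled as follows. The Python names (`ChangeRow`, `ChangeTable`, `by_replica`) are hypothetical and are not part of the described embodiment, and integer ChangeIds that increase over time are an assumption made so that "older" and "newer" changes can be compared.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: rows are never updated once written
class ChangeRow:
    change_id: int           # ChangeId: primary key, uniquely identifies the change
    replica_id: int          # ReplicaId: the master (replica) that performed the change
    entry_id: str            # EntryId: the directory entry that was changed
    change_information: str  # ChangeInformation: details of the change applied


class ChangeTable:
    """Hypothetical in-memory stand-in for the change table."""

    def __init__(self):
        self._rows = {}  # keyed by ChangeId, the primary key

    def add(self, row):
        # A primary key may appear only once; rows are added, never overwritten.
        if row.change_id in self._rows:
            raise KeyError(f"duplicate ChangeId {row.change_id}")
        self._rows[row.change_id] = row

    def by_replica(self, replica_id):
        # Search by the replica that made the change, oldest change first.
        rows = (r for r in self._rows.values() if r.replica_id == replica_id)
        return sorted(rows, key=lambda r: r.change_id)
```

Keeping the ReplicaId on each row is what allows the lookup by originating replica that the description mentions.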
The replication data structures may be centrally located and remotely accessible to the masters, as depicted in
To manage the replication data structures, operations are applied thereto. As an example, the status table is modified (i.e., rows are added or deleted), in response to replicas being added or deleted from the environment. Further details regarding these operations are described below.
For example, one embodiment of the logic associated with modifying the status table to reflect an add of a replica to the environment is described with reference to
As a further example, the status table is updated when a replica is deleted from the computing environment. One embodiment of the logic associated with modifying the status table to reflect the deletion is described with reference to
In addition to performing operations on the status table, operations are also performed on the change table. In one example, the change table is modified (i.e., a row is added), in response to adding an entry to the directory, modifying an existing directory entry or directory entry's name, or deleting a directory entry.
One embodiment of the logic associated with modifying the change table, as well as the status table, based on an add/modification to a directory entry is described with reference to
Moreover, in one embodiment, the status table is updated to reflect the addition, STEP 502. For example, the replica's LastChangeVector value corresponding to itself is updated with the ChangeId of the row added to the change table. As an example, assume Replica 1 added a change identified as ChangeId 7, then the LastChangeVector in the row owned by Replica 1 is updated by changing the ChangeId corresponding to Replica 1 to 7 (e.g., 1,7).
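The Replica 1 example above can be sketched as follows. This is only an illustration: the status table is assumed to be a mapping from each owning replica to its LastChangeVector, itself a mapping of ReplicaId to the last ChangeId processed, and the function name `record_own_change` is hypothetical.

```python
def record_own_change(status_table, replica_id, change_id):
    """Update the replica's own LastChangeVector entry for itself.

    Only the row owned by `replica_id` is touched, so no other replica
    ever writes to the same status table row.
    """
    status_table[replica_id][replica_id] = change_id


# Assumed starting state: both replicas have processed change 5 from
# Replica 1 and change 4 from Replica 2.
status_table = {
    1: {1: 5, 2: 4},  # row owned by Replica 1
    2: {1: 5, 2: 4},  # row owned by Replica 2
}

# Replica 1 adds a change identified as ChangeId 7.
record_own_change(status_table, 1, 7)
print(status_table[1])  # the pair for Replica 1 is now (1, 7)
```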
Similar logic is performed when a directory entry is modified. For example, with reference to
In addition to adding or modifying an entry, an existing entry's name may also be modified. Again, one embodiment of the logic associated with modifying the replication data structures to reflect a modification to an entry's name is described with reference to
A further operation that can be performed on a directory is deleting an entry from a directory. Thus, one embodiment of the logic associated with updating the replication data structures in view of a delete is described with reference to
In the above processing, it is shown that in order to reflect a change made to the directory, a unique row is added to the change table. No row in the change table is updated, rather a new row is added. This enables conflict resolution to be avoided when updating the replication data structures to reflect the adding, modifying or deleting of entries in the directory.
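A minimal sketch of this append-only behavior is shown below. The helper `make_change_logger` and the shared counter used to hand out ChangeIds are assumptions for illustration; the description does not specify how unique ChangeIds are generated.

```python
import itertools


def make_change_logger():
    """Return (change_table, log_change) backed by one hypothetical id source."""
    change_table = []             # rows are only ever appended, never updated
    next_id = itertools.count(1)  # assumed source of unique ChangeIds

    def log_change(replica_id, entry_id, operation, details=None):
        # Every add, modify, rename, or delete produces a brand new row,
        # even when the same directory entry is changed repeatedly.
        row = {
            "ChangeId": next(next_id),
            "ReplicaId": replica_id,
            "EntryId": entry_id,
            "ChangeInformation": {"op": operation, "details": details},
        }
        change_table.append(row)
        return row["ChangeId"]

    return change_table, log_change
```

Because two masters changing the same entry write two different rows, neither write can overwrite the other at the time the change is logged.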
After each replica participating in replication has processed a change, that row can be removed from the change table. There is no requirement that the rows be removed immediately after the replicas have processed them. However, it is desirable that they be removed at some point after all the replicas have processed the change.
There are various strategies for removing the old changes from the change table. The process of removing this information is called purge processing. Regardless of the strategy used (e.g., check on every update, check only after every n updates, separate background task), the tables are constructed such that any replica can perform purge processing, even multiple replicas simultaneously. The worst case scenario is that multiple replicas try to delete the same row from the change table. Duplicate deletion is an easy conflict to resolve, however. The additional deletion is simply ignored; or, from the replica's point of view, if the deletion fails because the row does not exist, it is considered normal processing and no error is reported.
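The tolerance for duplicate deletion can be sketched as follows, assuming rows are held in a mapping keyed by ChangeId. The function name `purge_row` is hypothetical.

```python
def purge_row(change_table, change_id):
    """Delete one row during purge processing.

    Returns True if this call removed the row and False if the row was
    already gone (for example, purged by another replica). A missing row
    is treated as normal processing, so no error is raised.
    """
    return change_table.pop(change_id, None) is not None
```

Two replicas purging the same row therefore both complete normally; only the first call actually removes anything.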
One embodiment of the logic associated with purge processing is described with reference to
If there is a replica to participate in the processing, then for the selected replica, a determination is made as to the oldest ChangeId of the selected replica that has been processed by all the replicas, STEP 702. This is accomplished by examining the LastChangeVector for the replicas and determining the oldest ChangeId for the ReplicaId. Thereafter, all changes for that ReplicaId, where the ChangeId is older than or equal to the oldest ChangeId determined in STEP 702, are deleted from the change table, STEP 704.
Subsequently, a determination is made as to whether there are remaining changes for a ReplicaId that no longer exists, INQUIRY 706. In one example, this determination is made by searching the change table by ReplicaId. If there are no remaining changes for a ReplicaId which no longer exists in the set of replicas, then the status table may be updated, STEP 708. For example, a replica can remove the ChangeId value for that replica from its own LastChangeVector. In this example, replicas only update their own status table row, so a replica during purge processing can only update its own status table row. (In other embodiments, a replica can remove a replica from its own LastChangeVector during add, modify and/or delete processing.)
If, on the other hand, there are remaining changes for the ReplicaId, then processing continues with INQUIRY 700. When all of the participating replicas have been processed, the purge processing is complete, STEP 710.
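The core of the purge processing described above (STEPs 702 and 704) can be sketched as follows. This is a simplified illustration under stated assumptions: ChangeIds are integers for which a smaller value means an older change, the status table maps each owning replica to its LastChangeVector, and the handling of replicas that no longer exist (INQUIRY 706 and STEP 708) is omitted. The function name `purge` is hypothetical.

```python
def purge(change_table, status_table):
    """Remove changes that every participating replica has processed.

    change_table: dict mapping ChangeId -> row dict containing "ReplicaId"
    status_table: dict mapping owning ReplicaId -> LastChangeVector,
                  where a vector maps ReplicaId -> last ChangeId processed
    """
    for origin in list(status_table):  # select each participating replica
        # STEP 702: examine every LastChangeVector for this ReplicaId.
        seen = [vector.get(origin) for vector in status_table.values()]
        if any(s is None for s in seen):
            continue  # some replica has recorded nothing for this origin yet
        # The oldest value recorded by any replica marks the newest change
        # that all replicas have processed.
        safe = min(seen)
        # STEP 704: delete this origin's changes at or before that point.
        doomed = [cid for cid, row in change_table.items()
                  if row["ReplicaId"] == origin and cid <= safe]
        for cid in doomed:
            change_table.pop(cid, None)  # already-deleted rows are ignored
```

For instance, if Replica 1 records (1,7) and Replica 2 records (1,5), then changes 5 and older that originated at Replica 1 can be removed, while change 7 must remain until Replica 2 has processed it.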
The information held in the change table is used to inform masters participating in the replication environment of changes made to the directory by other masters. One embodiment of the logic associated with providing this changed information to one or more masters is described with reference to
In response to learning of changes, the replica applies the changes logged in the change table to its own copy of the directory, STEP 802. The manner in which these changes are applied is database dependent. Each database may be of its own type, have its own structure and/or format, and the master is able to understand the nature of the proposed updates, even though they originated from another server. Known techniques are used to reconcile any changes to be applied, if there are conflicts at the application time. Each master updates its own database in the structure and format of its database.
Thereafter, the replica updates its own row in the status table, STEP 804. In one example, this is accomplished by changing the LastChangeVector of the replica's row based on the changes it processed from the change table. For example, if ReplicaId 1 just applied change 7 of changes originating from ReplicaId 2, then the ChangeId in the LastChangeVector corresponding to ReplicaId 2 in the row owned by ReplicaId 1 is updated to 7, etc. This completes replication processing.
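One replication pass (STEPs 802 and 804) might be sketched as follows. The sketch assumes the replica discovers changes by scanning the change table, that ChangeIds are positive integers ordered from oldest to newest, and that the database-dependent apply step is supplied by the caller as `apply_change`. All of these names are hypothetical.

```python
def replicate(me, change_table, status_table, apply_change):
    """Apply unseen changes from other replicas, then record progress.

    change_table: dict mapping ChangeId -> row dict containing "ReplicaId"
    status_table: dict mapping owning ReplicaId -> LastChangeVector
    apply_change: callable performing the database-dependent update
    """
    vector = status_table[me]  # a replica updates only its own row
    pending = [
        (cid, row) for cid, row in change_table.items()
        if row["ReplicaId"] != me
        and cid > vector.get(row["ReplicaId"], 0)  # not yet processed
    ]
    for cid, row in sorted(pending, key=lambda item: item[0]):
        apply_change(row)               # STEP 802: apply to the local copy
        vector[row["ReplicaId"]] = cid  # STEP 804: update own LastChangeVector
    return [cid for cid, _ in sorted(pending, key=lambda item: item[0])]
```

With this sketch, if ReplicaId 1 has last processed change 4 from ReplicaId 2 and the change table holds changes 5 and 7 from ReplicaId 2, both are applied in order and the entry for ReplicaId 2 in the row owned by ReplicaId 1 ends at 7.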
Described in detail above is a capability for facilitating the providing of changed information to multiple masters of a multi-master environment, while substantially avoiding conflict resolution during the providing of the changed information. The capability is relatively simple to use and provides the information in a timely manner. In one example, the capability employs one or more data structures to communicate the changed information.
In accordance with an aspect of the present invention, non-conflicting updates are made to the data structures for various operations applied thereto. Row additions are uniquely made, either by ReplicaId or ChangeId. Further, row updates of the status table are isolated to a single updater, since each status table row is owned by a particular replica.
Although in the above embodiments, the replication data structures are tables, this is only one example. Other types of data structures may also be used. Further, in another embodiment, techniques other than employing status data structures may be used in cleaning up the change data structures. The use of status data structures is only one example.
Additionally, in other embodiments, the LastChangeVector does not include values for the replica owning that row. Thus, the status table is not updated in those situations in which the owning replica is adding a row to the change table, as an example.
Moreover, although in the above embodiments, there is a change table for each replicated set of information, in other embodiments the change table may accommodate multiple replicated sets of information.
Further, even though the masters are servers in the examples above, other types of masters can participate in one or more aspects of the present invention.
In another aspect of the present invention, information in the change table is protected from viewing except from other masters, while the information is in the table and in transit between the masters. This is facilitated by using, for instance, a digital envelope technique. The digital envelope includes, for instance, encrypted bulk-encryption keys for each recipient server, which are encrypted using the public key of each recipient server. The bulk-encryption keys can be decrypted by each recipient using its private key. This is a form of multi-cast of enveloped information.
The present invention can be included in an article of manufacture (e.g., one or more computer program products) having, for instance, computer usable media. The media has embodied therein, for instance, computer readable program code means for providing and facilitating the capabilities of the present invention. The article of manufacture can be included as a part of a computer system or sold separately.
Additionally, at least one program storage device readable by a machine, tangibly embodying at least one program of instructions executable by the machine to perform the capabilities of the present invention can be provided.
The flow diagrams depicted herein are just examples. There may be many variations to these diagrams or the steps (or operations) described therein without departing from the spirit of the invention. For instance, the steps may be performed in a differing order, or steps may be added, deleted or modified. All of these variations are considered a part of the claimed invention.
Although preferred embodiments have been depicted and described in detail herein, it will be apparent to those skilled in the relevant art that various modifications, additions, substitutions and the like can be made without departing from the spirit of the invention and these are therefore considered to be within the scope of the invention as defined in the following claims.
Hahn, Timothy J., McGarvey, John R.
Patent | Priority | Assignee | Title |
10817506, | May 07 2018 | Microsoft Technology Licensing, LLC | Data service provisioning, metering, and load-balancing via service units |
10885018, | May 07 2018 | Microsoft Technology Licensing, LLC | Containerization for elastic and scalable databases |
10970269, | May 07 2018 | Microsoft Technology Licensing, LLC | Intermediate consistency levels for database configuration |
10970270, | May 07 2018 | Microsoft Technology Licensing, LLC | Unified data organization for multi-model distributed databases |
11030185, | May 07 2018 | Microsoft Technology Licensing, LLC | Schema-agnostic indexing of distributed databases |
11269925, | May 15 2019 | International Business Machines Corporation | Data synchronization in a data analysis system |
11321303, | May 07 2018 | Microsoft Technology Licensing, LLC | Conflict resolution for multi-master distributed databases |
11379461, | May 07 2018 | Microsoft Technology Licensing, LLC | Multi-master architectures for distributed databases |
11397721, | May 07 2018 | Microsoft Technology Licensing, LLC | Merging conflict resolution for multi-master distributed databases |
11487714, | May 15 2019 | International Business Machines Corporation | Data replication in a data analysis system |
11893041, | May 15 2019 | International Business Machines Corporation | Data synchronization between a source database system and target database system |
7917609, | Aug 24 2007 | International Business Machines Corporation | Method and apparatus for managing lightweight directory access protocol information |
Patent | Priority | Assignee | Title |
5596748, | Sep 29 1994 | International Business Machines Corporation | Functional compensation in a heterogeneous, distributed database environment |
5832517, | Dec 29 1993 | Xerox Corporation | System logging using embedded database |
5838923, | Jun 22 1993 | Microsoft Technology Licensing, LLC | Method and system for synchronizing computer mail user directories |
5999935, | Mar 28 1997 | International Business Machines Corporation | Tail compression of a sparse log stream of a multisystem environment |
6092084, | Mar 28 1997 | International Business Machines Corporation | One system of a multisystem environment taking over log entries owned by another system |
6122640, | Sep 22 1998 | CA, INC | Method and apparatus for reorganizing an active DBMS table |
6615223, | Feb 29 2000 | ORACLE INTERNATIONAL CORPORATION OIC | Method and system for data replication |
20040068516, | |||
JP2001084267, |
Executed on | Assignor | Assignee | Conveyance | Frame | Reel | Doc |
Dec 03 2002 | HAHN, TIMOTHY J | International Business Machines Corporation | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 013665 | /0213 | |
Dec 18 2002 | MCGARVEY, JOHN R | International Business Machines Corporation | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 013665 | /0213 | |
Jan 09 2003 | International Business Machines Corporation | (assignment on the face of the patent) | / |
Date | Maintenance Fee Events |
Aug 28 2006 | ASPN: Payor Number Assigned. |
Jan 21 2010 | M1551: Payment of Maintenance Fee, 4th Year, Large Entity. |
May 16 2014 | REM: Maintenance Fee Reminder Mailed. |
Oct 03 2014 | EXP: Patent Expired for Failure to Pay Maintenance Fees. |