A redundant system (20) comprises a first unit (21), a second unit (22), and a synchronization server (24). The first unit (21) and the second unit (22) each include plural state machines (SMs) for performing tasks. At least some of the state machines provided at the first unit simultaneously perform same tasks as at least corresponding ones of some of the state machines provided at the second unit. When a state machine of the second unit needs to be resynchronized, the synchronization server (24) receives a resynchronization request (2-1) from the resync requesting state machine of the second unit and thereupon provides an out-of-synchronization indication (2-2) to a corresponding state machine of the first unit. In response to the out-of-synchronization indication (2-2) from the synchronization server, the corresponding state machine of the first unit generates a resynchronization request (2-3) at a time deemed appropriate by the corresponding state machine. In response to the resynchronization request (2-3) from the corresponding state machine, the synchronization server issues a resynchronization command (2-4) to both the resync requesting state machine of the second unit and the corresponding state machine of the first unit, thereby causing simultaneous resynchronization. At least one of the first unit and the second unit includes restart logic that prescribes an order of restart of the plural state machines, thereby providing a gradual start up of the state machines of the redundant units. Thereafter each of the plural state machines controls its own resynchronization.
1. A redundant system comprising:
a first unit and a second unit each including plural state machines for performing tasks, at least some of the state machines provided at the first unit simultaneously performing same tasks as at least corresponding ones of some of the state machines provided at the second unit; a synchronization server which receives a resynchronization request from a resync requesting state machine of the second unit and thereupon provides an out-of-synchronization indication to a corresponding state machine of the first unit; wherein in response to the out-of-synchronization indication from the synchronization server, the corresponding state machine of the first unit generates a resynchronization request; wherein in response to the resynchronization request from the state machine of the first unit, the synchronization server issues a resynchronization command to both the corresponding state machine of the first unit and the resync requesting state machine of the second unit; wherein the resync requesting state machine of the second unit and the corresponding state machine of the first unit simultaneously resynchronize in response to the resynchronization command.
12. A method of operating a redundant system having a first unit and a second unit, each of the first unit and the second unit including plural state machines for performing tasks, at least some of the state machines provided at the first unit simultaneously performing same tasks as at least corresponding ones of some of the state machines provided at the second unit; the method comprising the following steps:
(1) receiving, at a synchronization server, a resynchronization request from a resync requesting state machine of the second unit; (2) in response to step (1), providing an out-of-synchronization indication to a corresponding state machine of the first unit; (3) in response to step (2), generating at the corresponding state machine of the first unit a corresponding state machine resynchronization request; (4) in response to step (3), issuing from the synchronization server a resynchronization command to both the resync requesting state machine of the second unit and the corresponding state machine of the first unit; (5) simultaneously resynchronizing the resync requesting state machine of the second unit and the corresponding state machine of the first unit in response to the resynchronization command.
21. A telecommunications system comprising:
a node having a redundant system having a first unit and a second unit each including plural state machines for performing tasks of the node, at least some of the state machines provided at the first unit simultaneously performing same tasks as at least corresponding ones of some of the state machines provided at the second unit; a synchronization server which receives a resynchronization request from a resync requesting state machine of the second unit and thereupon provides an out-of-synchronization indication to a corresponding state machine of the first unit; wherein in response to the out-of-synchronization indication from the synchronization server, the corresponding state machine of the first unit generates a resynchronization request; wherein in response to the resynchronization request from the corresponding state machine of the first unit, the synchronization server issues a resynchronization command to both the resync requesting state machine of the second unit and the corresponding state machine of the first unit; wherein the resync requesting state machine of the second unit and the corresponding state machine of the first unit simultaneously resynchronize in response to the resynchronization command.
28. A method of operating a telecommunications system comprising a node having a redundant system including a first unit and a second unit, each of the first unit and the second unit including plural state machines for performing tasks of the node, at least some of the state machines provided at the first unit simultaneously performing same tasks as at least corresponding ones of some of the state machines provided at the second unit; the method comprising the following steps:
(1) receiving, at a synchronization server, a resynchronization request from a resync requesting state machine of the second unit; (2) in response to step (1), providing an out-of-synchronization indication to a corresponding state machine of the first unit; (3) in response to step (2), generating at the corresponding state machine of the first unit a corresponding resynchronization request; (4) in response to step (3), issuing from the synchronization server a resynchronization command to both the resync requesting state machine of the second unit and the corresponding state machine of the first unit; (5) simultaneously resynchronizing the resync requesting state machine of the second unit and the corresponding state machine of the first unit in response to the resynchronization command.
2. The apparatus of
3. The apparatus of
4. The apparatus of
5. The apparatus of
6. The apparatus of
7. The apparatus of
8. The apparatus of
9. The apparatus of
10. The apparatus of
11. The apparatus of
13. The method of
14. The method of
15. The method of
16. The method of
17. The method of
18. The method of
19. The method of
20. The method of
22. The apparatus of
23. The apparatus of
24. The apparatus of
25. The apparatus of
26. The apparatus of
27. The apparatus of
29. The method of
30. The method of
31. The method of
32. The method of
33. The method of
34. The method of
1. Field of the Invention
The present invention pertains to synchronizing of a redundant system, and particularly to synchronizing of a redundant computer system having two independent units which execute the same tasks simultaneously.
2. Related Art and other Considerations
Some activities are so supremely important that it is judicious to have two units or processors for performing the activities, just in case one of the units malfunctions or terminates (e.g., crashes). Such is expected to be the case in future generations of mobile telecommunications systems, for example. See U.S. Pat. No. 5,469,503 to Butensky et al. which discloses a recovery algorithm for a redundant system in a telecommunications environment.
In such mission critical situations, multiple units (e.g., computer processors or CPUs) are employed. In some instances, one or more of the units serves the role of a standby unit that is essentially idle until such time as it is necessary to replace an active unit. In other instances, multiple redundant units execute the same tasks simultaneously but independently. Unfortunately, in these other instances, the redundant units can, under certain circumstances, become out of phase with one another. In such circumstances it is necessary to re-coordinate or "resynchronize" the redundant units.
Coordination of redundant units has heretofore been addressed using various techniques. One known technique is to have the two redundant systems, e.g., two CPUs, use the same clock and the same memory. In this technique, the two CPUs execute the same instructions out of the same memory at the same time, thereby obviating the need for software synchronization. A third unit compares the results from the two CPUs. A disadvantage of this technique is that its implementation requires specialized hardware rather than conventional components.
Employment of a third unit which coordinates synchronization of two processors is taught in U.S. Pat. No. 5,748,873 to Ohguro et al. When a first processor needs to resynchronize with a second processor, the first processor sends a re-synchronizing indication to a match control logic, which then outputs synchronous reset indication signals so that both the first and second processors are reset in synchronism.
Some redundant units (independently executing the same task) comprise state machines which are synchronized on the basis of external stimuli, such as bus events. But if one of the redundant units is reset (e.g., rebooted after a failure or the like), the reset unit somehow has to catch up with the other unit so that the tasks and any state machines of the reset unit can be synchronized. A possible solution to this synchronization problem is to ignore external stimuli and transfer all states from an active unit to the newly reset unit. Unfortunately, this solution would require both units to appear isolated, at least temporarily during the transfer of states, to external equipment communicating with the two units. If the overall system contains much state information, such resynchronization time might be unacceptably long.
What is needed, therefore, and an object of the present invention, is an effective way of resynchronizing redundant units which employ external stimuli-synchronized state machines.
A redundant system comprises a first unit, a second unit, and a synchronization server. The first unit and the second unit each include plural state machines for performing tasks. At least some of the state machines provided at the first unit simultaneously perform same tasks as at least corresponding ones of some of the state machines provided at the second unit.
In accordance with the invention, when a state machine of the second unit determines that it needs to be resynchronized, the synchronization server receives a resynchronization request from the state machine requesting resynchronization and thereupon provides an out-of-synchronization indication to a corresponding state machine in the first unit. In response to the out-of-synchronization indication from the synchronization server, the corresponding state machine in the first unit generates a resynchronization request at a time deemed appropriate by the corresponding state machine in the first unit. In response to the resynchronization request from the corresponding state machine in the first unit, the synchronization server issues a resynchronization command to both the corresponding state machine in the first unit and the state machine of the second unit which sought resynchronization. The state machines of the first and second units simultaneously resynchronize in response to the resynchronization command.
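The four-message exchange summarized above (request 2-1, indication 2-2, request 2-3, command 2-4) might be sketched as follows. This is only an illustrative sketch: the class names `StateMachine` and `SyncServer`, the pairing of twins by name, and the reset to state 0 are assumptions made for the example and are not taken from the specification. In this single-threaded sketch "simultaneous" merely means that both machines are commanded in the same server step.

```python
from dataclasses import dataclass, field


@dataclass
class StateMachine:
    """One state machine of a unit; `name` pairs it with its twin in the other unit."""
    name: str
    unit: str
    state: int = 0
    log: list = field(default_factory=list)

    def on_out_of_sync(self, server):
        # Indication 2-2 received. This sketch answers at once; an actual
        # machine would send request 2-3 at a time it deems appropriate.
        self.log.append("2-2 out-of-sync indication")
        server.request_resync(self)

    def on_resync_command(self, initial_state):
        # Command 2-4 received: jump to the agreed starting state (assumed 0).
        self.log.append("2-4 resync command")
        self.state = initial_state


class SyncServer:
    """Hypothetical synchronization server that pairs machines by name."""

    def __init__(self):
        self.machines = {}    # (unit, name) -> StateMachine
        self.pending = set()  # names with an outstanding 2-1 request

    def register(self, sm):
        self.machines[(sm.unit, sm.name)] = sm

    def _twin(self, sm):
        other = "first" if sm.unit == "second" else "second"
        return self.machines[(other, sm.name)]

    def request_resync(self, sm):
        twin = self._twin(sm)
        if sm.name not in self.pending:
            # Request 2-1: first request of the pair, so notify the twin (2-2).
            self.pending.add(sm.name)
            twin.on_out_of_sync(self)
        else:
            # Request 2-3: the twin has answered, so command both (2-4).
            self.pending.discard(sm.name)
            for m in (sm, twin):
                m.on_resync_command(initial_state=0)


server = SyncServer()
a = StateMachine("call_setup", "first", state=5)
b = StateMachine("call_setup", "second", state=9)
server.register(a)
server.register(b)
server.request_resync(b)  # the second unit's machine asks to be resynchronized
```

After the final call both machines hold the same state, and only the first unit's machine has logged the out-of-synchronization indication.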
At least one of the first unit and the second unit includes a restart logic that prescribes an order of restart of the plural state machines when a unit is restarted or rebooted. This provides for a gradual start up of the state machines of the redundant units. Preferably, the prescribed order is on a basis of one state machine at a time. When a state machine is restarted, it is up to the individual state machine when to synchronize in accordance with the steps summarized above.
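A prescribed, one-at-a-time restart order could be sketched as below. The class name `RestartLogic`, the callback `start_fn`, and the state machine names are hypothetical and chosen only for illustration; the specification does not prescribe any particular data structure.

```python
class RestartLogic:
    """Hypothetical restart logic: restarts state machines one at a time, in a fixed order."""

    def __init__(self, order):
        self.order = list(order)  # prescribed restart order
        self.started = []         # machines restarted so far

    def restart_all(self, start_fn):
        for name in self.order:
            start_fn(name)             # restart exactly one state machine
            self.started.append(name)  # only then move on to the next one
        return self.started


events = []
logic = RestartLogic(["sm_a", "sm_b", "sm_c"])
logic.restart_all(events.append)  # each restarted machine is recorded in order
```

Each restarted machine would then decide for itself when to request resynchronization; the restart logic only fixes the order in which the machines come up.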
In an illustrated embodiment, the first unit and the second unit are situated in a node of a cellular telecommunications system, such as a base station (BTS) node, for example.
The synchronization server can be a distinct unit, or can be included in one of the first unit and the second unit.
The foregoing and other objects, features, and advantages of the invention will be apparent from the following more particular description of preferred embodiments as illustrated in the accompanying drawings in which reference characters refer to the same parts throughout the various views. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating the principles of the invention.
In the following description, for purposes of explanation and not limitation, specific details are set forth such as particular architectures, interfaces, techniques, etc. in order to provide a thorough understanding of the present invention. However, it will be apparent to those skilled in the art that the present invention may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well known devices, circuits, and methods are omitted so as not to obscure the description of the present invention with unnecessary detail.
Preferably one or more of the pairs of state machines shown in
Each of first CPU 21 and second CPU 22 also includes restart logic. In
In the particular embodiment shown in
In accordance with one mode of the present invention as illustrated in
In another mode of the invention, illustrated in
The following discussion of the resynchronization process is applicable both to the modes of FIG. 3 and FIG. 3A and variations thereof. As any one of the state machines of a CPU begins its resynchronization, the restarted state machine sends the resynchronization request 2-1 of
Upon receipt of the out-of-synchronization indication 2-2, whether the corresponding state machine in the other CPU is in a position to resynchronize depends upon that corresponding state machine in the other CPU. In this regard, and as stated above, it was noted in
In the particular example base station (BTS) 40 shown in
The extension terminal (ET) 47 serves to connect base station (BTS) 40 over terrestrial link 49 to another node of the telecommunications system of
The structure and operation of example nodes such as base station (BTS) 40 and radio network controller (RNC) 50 can be understood from, for example, the following United States Patent applications, all of which are incorporated herein by reference: U.S. patent application Ser. No. 09/188,102 for "Asynchronous Transfer Mode System"; U.S. patent application Ser. No. 09/035,821 for "Telecommunications Inter-Exchange Measurement Transfer"; U.S. patent application Ser. No. 09/035,788 for "Telecommunications Inter-Exchange Congestion Control"; and U.S. patent application Ser. No. 09/071,886 for "Inter-Exchange Paging".
In the particular example shown in
Thus, the present invention starts up the state machines of a reset unit in a gradual, predefined manner which allows each state machine to decide by itself when it should be synchronized. Moreover, the present invention provides a scheme for the state machines to employ a third party, e.g., the server 24, in the resynchronization process. Of course, the reset unit must know which state machines are to be restarted, and in what order, sequence, or logic pattern.
Thus, when one of the CPUs (e.g., a standby CPU) needs to be restarted, the gradual restarting of the present invention will never leave the other or active CPU appearing lost relative to the outside world. The only part of the overall system which will seem lost relative to its external environment is any state machine which is currently resynchronizing itself. But by letting each state machine decide when to perform resynchronization, the time that the state machine seems lost is negligible (since the state machine itself may know or anticipate the best time for it to resynchronize).
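One way a state machine might choose its own resynchronization moment is to defer its request until it is idle. The following sketch assumes a simple busy/idle criterion, which is an illustrative choice and not a condition stated in the specification; the class `DeferringStateMachine` and the callback `send_request` are likewise hypothetical.

```python
class DeferringStateMachine:
    """Hypothetical state machine that defers its resync request while busy."""

    def __init__(self, send_request):
        self.busy = False
        self.resync_pending = False
        self.send_request = send_request  # callback that delivers request 2-3

    def on_out_of_sync(self):
        # Indication 2-2: remember it, and request now only if idle.
        self.resync_pending = True
        self._maybe_request()

    def begin_work(self):
        self.busy = True

    def end_work(self):
        # Becoming idle is the assumed "appropriate time" to resynchronize.
        self.busy = False
        self._maybe_request()

    def _maybe_request(self):
        if self.resync_pending and not self.busy:
            self.resync_pending = False
            self.send_request()


sent = []
sm = DeferringStateMachine(lambda: sent.append("2-3"))
sm.begin_work()
sm.on_out_of_sync()          # busy, so nothing is sent yet
sent_while_busy = list(sent)
sm.end_work()                # idle again, so the request goes out now
```

Because the request is issued only at a moment the machine itself selects, the interval during which that machine is out of step with its external environment can be kept short.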
It should be understood that the term "unit" as employed herein is not necessarily confined to a processor which executes programmed instructions, but extends to any type of logic device or circuit which supervises plural state machines. Nor does the fact that the state machines illustrated herein are realized in software preclude the invention from applying to hardware or circuit-formed state machines which utilize the resynchronization principles of the present invention.
Nor should particular significance be attached to the fact that the state machines illustrated herein are depicted as receiving only one stimulus. Such simplified illustration has been provided merely for purposes of clarity. Plural stimuli applied to state machines are certainly envisioned by the present invention.
The location of resynchronization server 24 and resynchronization server 24' as described herein is not limiting in the sense that the person skilled in the art will realize the resynchronization servers can be situated and connected in other manners, e.g., that other elements can be connected between resynchronization server and one or more of the CPUs.
Moreover, the illustration provided in
While the invention has been described in connection with what is presently considered to be the most practical and preferred embodiment, it is to be understood that the invention is not to be limited to the disclosed embodiment, but on the contrary, is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.
Karlsson, Marcus, Rynbäck, Patrik
Patent | Priority | Assignee | Title |
6931568, | Mar 29 2002 | International Business Machines Corporation | Fail-over control in a computer system having redundant service processors |
7143249, | Oct 04 2000 | Network Appliance, Inc | Resynchronization of mirrored storage devices |
7512061, | Feb 11 2005 | RPX Corporation | Recovery of state information of a first tunnel end-point |
7620845, | Mar 12 2004 | Toshiba Solutions Corporation; Toshiba Digital Solutions Corporation | Distributed system and redundancy control method |
7647462, | Sep 25 2003 | International Business Machines Corporation | Method, system, and program for data synchronization between a primary storage device and a secondary storage device by determining whether a first identifier and a second identifier match, where a unique identifier is associated with each portion of data |
8677169, | Jun 30 2007 | Cisco Technology, Inc. | Session redundancy using a replay model |
8732556, | Feb 16 2011 | SCHNEIDER ELECTRIC SYSTEMS USA, INC | System and method for fault tolerant computing using generic hardware |
8745467, | Feb 16 2011 | SCHNEIDER ELECTRIC SYSTEMS USA, INC | System and method for fault tolerant computing using generic hardware |
Patent | Priority | Assignee | Title |
4569017, | Dec 22 1983 | AG COMMUNICATION SYSTEMS CORPORATION, 2500 W UTOPIA RD , PHOENIX, AZ 85027, A DE CORP | Duplex central processing unit synchronization circuit |
4635186, | Jun 20 1983 | International Business Machines Corporation; INTERNATIONAL BUSINESS MACHINES CORPORATION, A CORP OF NY | Detection and correction of multi-chip synchronization errors |
4733353, | Dec 13 1985 | General Electric Company | Frame synchronization of multiply redundant computers |
5301308, | Apr 25 1989 | Siemens Aktiengesellschaft | Method for synchronizing redundant operation of coupled data processing systems following an interrupt event or in response to an internal command |
5428637, | Aug 24 1994 | The United States of America as represented by the Secretary of the Army | Method for reducing synchronizing overhead of frequency hopping communications systems |
5452441, | Mar 30 1994 | AT&T IPM Corp | System and method for on-line state restoration of one or more processors in an N module redundant voting processor system |
5469503, | Jul 27 1993 | Wilmington Trust, National Association, as Administrative Agent | Method for resynchronizing secondary database and primary database with preservation of functionality of an automatic call distribution system |
5579220, | Jul 28 1993 | Siemens Aktiengesellschaft | Method of updating a supplementary automation system |
5583986, | Dec 21 1994 | Electronics and Telecommunications Research Institute; Korea Telecommunication Authority | Apparatus for and method of duplex operation and management for signalling message exchange no. 1 system |
5680594, | May 24 1995 | Apple Inc | Asic bus interface having a master state machine and a plurality of synchronizing state machines for controlling subsystems operating at different clock frequencies |
5748873, | Sep 17 1992 | Hitachi,Ltd. | Fault recovering system provided in highly reliable computer system having duplicated processors |
5841963, | Jun 08 1994 | Hitachi, Ltd. | Dual information processing system having a plurality of data transfer channels |
6223304, | Jun 18 1998 | Telefonaktiebolaget LM Ericsson (publ) | Synchronization of processors in a fault tolerant multi-processor system |
GB2308040, | |||
WO9602115, |
Executed on | Assignor | Assignee | Conveyance | Frame | Reel | Doc |
Feb 25 1999 | Telefonaktiebolaget LM Ericsson (publ) | (assignment on the face of the patent) | / | |||
Mar 11 1999 | KARLSSON, MARCUS | Telefonaktiebolaget LM Ericsson | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 009841 | /0858 | |
Mar 11 1999 | RYNBACK, PATRIK | Telefonaktiebolaget LM Ericsson | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 009841 | /0858 |
Date | Maintenance Fee Events |
Apr 24 2006 | M1551: Payment of Maintenance Fee, 4th Year, Large Entity. |
Apr 22 2010 | M1552: Payment of Maintenance Fee, 8th Year, Large Entity. |
Apr 22 2014 | M1553: Payment of Maintenance Fee, 12th Year, Large Entity. |
Date | Maintenance Schedule |
Oct 22 2005 | 4 years fee payment window open |
Apr 22 2006 | 6 months grace period start (w surcharge) |
Oct 22 2006 | patent expiry (for year 4) |
Oct 22 2008 | 2 years to revive unintentionally abandoned end. (for year 4) |
Oct 22 2009 | 8 years fee payment window open |
Apr 22 2010 | 6 months grace period start (w surcharge) |
Oct 22 2010 | patent expiry (for year 8) |
Oct 22 2012 | 2 years to revive unintentionally abandoned end. (for year 8) |
Oct 22 2013 | 12 years fee payment window open |
Apr 22 2014 | 6 months grace period start (w surcharge) |
Oct 22 2014 | patent expiry (for year 12) |
Oct 22 2016 | 2 years to revive unintentionally abandoned end. (for year 12) |