An apparatus and a method are presented for node synchronization that can be used in a heterogeneous computer system where nodes in the system do not share a common system clock. Time stamps, which are critically important, are attached to transaction requests. Time stamps are based on a "time of day" value, which may simply be a register incremented by a system clock. Since each node has its own system clock, the frequency of these clocks may drift, which results in variation in the time stamp values. If the values drift too far apart, data updates may be lost. A frequency synthesizer capable of high resolution and rapid frequency adjustments can be connected to the system clock. When a shift in phase between the master and slave time of day values is detected, the frequency synthesizer output can be changed by a small amount to bring the two signals back into phase.
6. An apparatus in a multiple processor data processing system to synchronize counters incremented by local clocks for a plurality of nodes, the apparatus comprising:
a frequency synthesizer connected to a system clock at each node from the plurality of nodes, wherein the frequency synthesizer includes a plurality of stages and wherein at least two stages within the plurality of stages include a variable frequency divider and wherein the frequency synthesizer makes small incremental adjustments in output frequency; and a comparator, wherein the comparator determines a change in direction of a phase difference between the phase associated with a slave node and the phase associated with a master node.
1. A method in a multiple processor data processing system to synchronize counters incremented by local clocks for a plurality of nodes, the method comprising:
designating a master node from the plurality of nodes, wherein remaining nodes are designated as slave nodes; determining a phase difference between a phase associated with a counter incremented by a clock signal of a slave node to a phase associated with a counter incremented by a local clock signal of the master node; detecting a change in direction of the phase difference between the phase associated with the slave node and the phase associated with the master node; and adjusting the clock frequency of the slave node by a first amount in a first stage in a multiple stage frequency synthesizer and by a second amount in a second stage of the multiple stage frequency synthesizer to cause the phase difference between the phase associated with the slave node and the phase associated with the master node to switch direction.
2. The method of
3. The method of
4. The method of
5. The method of
7. The apparatus of
8. The apparatus of
9. The apparatus of
10. The apparatus of
11. The apparatus of
12. The apparatus of claim 11, wherein a change in direction of the phase difference between the phase associated with the counter at the slave node and the phase associated with the counter at the master node causes the frequency synthesizer output of the slave node to shift so that the phase difference changes direction.
The present invention is related to the following applications entitled "AN APPARATUS AND METHOD FOR HIGH RESOLUTION FREQUENCY ADJUSTMENT IN A MULTISTAGE FREQUENCY SYNTHESIZER", U.S. application Ser. No. 09/631,718, now issued as U.S. Pat. No. 6,566,921; "AN APPARATUS AND METHOD FOR DYNAMIC FREQUENCY ADJUSTMENT IN A FREQUENCY SYNTHESIZER", U.S. application Ser. No. 09/631,720, now issued as U.S. Pat. No. 6,522,207; which are incorporated herein by reference.
1. Technical Field
The present invention relates generally to an improved method for system synchronization and in particular to an apparatus and a method for adjusting the time of day clocks in a heterogeneous computer system. Still more particularly, the present invention provides an apparatus and a method for high resolution frequency adjustment for node synchronization that can be used in a non-uniform memory access (NUMA) computer system.
2. Description of the Related Art
A phase locked loop (PLL) is a very interesting integrated circuit that blends analog and digital techniques. Although the basic design of a PLL has been known for decades, the circuit only became a practical building block in integrated circuit form where the cost has become affordable and the design has become more reliable.
The PLL contains a phase detector, an amplifier, a voltage controlled oscillator (VCO), and a feedback loop that allows the output frequency to be a replication of the input signal with noise removed or a multiple of the frequency of the input signal. PLLs have been used for demodulation of FM signals, for tone decoding, for frequency generation, for generation of "clean" signals, and for pulse synchronization, to name but a few of the applications. Because the output frequency is a multiple of the input frequency, it is difficult to make fine frequency adjustments using such a frequency synthesizer.
A non-uniform memory access (NUMA) computer system is a multiple processor architecture where there is a single memory address space but where memory is separated into "close" banks of memory and "distant" banks of memory. Access is "non-uniform" because the access times for the close banks of memory directly associated with the node that contains the CPU are much faster than the access times for distant memory banks at other nodes in the system. A distinct advantage of a NUMA architecture is that it scales well, in the sense that adding more nodes and processors to the system does not create bottlenecks that degrade performance in the same way as other parallel architectures.
One problem with NUMA architectures is to keep the nodes synchronized. Transactions are often labeled with time stamps that are generated by the time of day at each node in the system. Since these nodes have independent clocks, even though they are initialized at precisely the same time, they will eventually drift apart and require re-synchronization. It is important to have precise time stamps with as little "cycle slippage" as possible between the nodes.
Therefore, it would be advantageous to have a method for high resolution frequency adjustment for node synchronization that can be used in a non-uniform memory access (NUMA) computer system.
An apparatus and a method are presented for node synchronization that can be used in a heterogeneous computer system where nodes in the system do not share a common system clock. A non-uniform memory access (NUMA) computer system is one such system where this method and apparatus can be applied.
Transactions in a multiprocessor computer system must be coordinated precisely for correct operation. Time stamps are attached to transaction requests, and when data is changed in the system, the relative values of time stamps are critically important. These time stamps are based on a "time of day" value, which may simply be a register incremented by a system clock. Since each node has its own system clock, the frequency of these clocks may drift, which results in variation in the time stamp values. If the values drift too far apart, data updates in the multiprocessor computer system may be lost.
This invention monitors the relative phase of a "master" time of day register with one or more "slave" time of day registers. A frequency synthesizer capable of high resolution and rapid frequency adjustments can be connected to the system clock. When a shift in phase between the master and slave time of day values is detected, the frequency synthesizer output can be changed by a small amount to bring the two signals back into phase.
The novel features believed characteristic of the invention are set forth in the appended claims. The invention itself, however, as well as a preferred mode of use, further objectives and advantages thereof, will best be understood by reference to the following detailed description of an illustrative embodiment when read in conjunction with the accompanying drawings, wherein:
With reference now to the figures, and in particular with reference to
External disk drive 156 is connected to input/output channel 152. The nodes are interconnected using high speed channels 116 and 136. This system contains a single address space composed of memory banks 110, 130, and 150. Access of a CPU to its local memory bank, such as CPU 102 accessing memory 110, will be very fast since it does not need to use the node interconnections 116 or 136. Access by a CPU to a distant memory bank, such as CPU 102 accessing memory 130, will be slower since data must be transferred on communications channel 116.
Those of ordinary skill in the art will appreciate that the hardware depicted in
Even if the nodes are architecturally similar, one of the nodes needs to be designated the master, in this diagram Node 0, and the other nodes are "slaves", in the sense that their time of day is re-synchronized to the "master" time of day. It is important to have precise frequency generation with as little "cycle slippage" as possible between the nodes. What is required is a frequency generation system with the possibility of making fine adjustments to the system clock frequency on a dynamic basis so that the time of day register value can be changed.
The values of K1 and K2 must be fixed to avoid cycle-slipping due to PLL pullout frequency. The value of fout is equal to (K1/K2) fref. By setting K1 and K2 to different integer values, the output frequency is synthesized based on the input frequency. However, these values cannot be changed dynamically, as explained below.
The input to the phase locked loop is reference frequency 502, which is fed into phase detector 504. The other input to the phase detector will be discussed below. The output of phase detector 504 is fed into charge pump 506. The charge pump creates a current for the period of time during which the phase error exists, which is integrated by capacitor C1 310 to create a voltage Vc, which is fed into voltage controlled oscillator (VCO) 510. The VCO output equals K1 fref. This signal is fed into frequency divider 516, which divides it by K1, an integer value in the range of 1, 2, . . . , N1. The output of frequency divider 516 equals fref, and this is the second input to phase detector 504. This completes the feedback loop. Since both inputs to phase detector 504 equal fref, any shift in one of these frequencies will be detected by phase detector 504 and fed through charge pump 506 to voltage controlled oscillator 510.
Circuit output fout 514 is generated by feeding the output of VCO 510 into frequency divider 512 which divides its input by K2 to produce the value (K1/K2) fref. This is the same output value as the circuit in
Of particular interest is the case where K1 is approximately equal to K2, so that the ratio K1/K2 is equal to 1 plus or minus a small delta factor. Substituting these values in the equation for the output frequency results in fout=(1±Δ) fref. So by varying the value of K2, which can be changed without cycle slipping, the output frequency can be adjusted up or down by small amounts relative to the input frequency.
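As an illustrative sketch (not part of the original disclosure), the following Python fragment evaluates fout=(K1/K2) fref for divisors near 100. The function names and the choice K1=100 are assumptions made only for illustration. The sketch suggests that a one-count change in K2 moves a single stage by roughly 1%, which is coarse compared with adjustments of a few hundred parts per million.

```python
def output_frequency(fref_hz, k1, k2):
    """Single-stage synthesizer output: fout = (K1/K2) * fref."""
    return fref_hz * k1 / k2


def step_ppm(k1):
    """Relative output change, in PPM, when K2 moves from K1 to K1 + 1."""
    return (k1 / (k1 + 1) - 1.0) * 1e6


if __name__ == "__main__":
    fref = 150e6  # hypothetical reference frequency in Hz
    for k2 in (99, 100, 101):
        print(k2, output_frequency(fref, 100, k2))
    print(step_ppm(100))  # roughly -9900 PPM, i.e. about -1%
```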
The circuit in
To close the loop, the output of phase lock loop 604 is fed through frequency divider 606 where the division is by K1. This output is fed back as the second input to the phase detector that is part of phase lock loop 604. The frequency output from this conventional frequency synthesizer is K1 fref/L, where both K1 and L are fixed.
To allow for dynamic frequency adjustment, the output of phase lock loop 604 is the input to frequency divider 608, which divides its input frequency by K2. The value for K2 can be varied dynamically, in a manner similar to the dynamic adjustments to frequency divider 512 in FIG. 5. The detailed circuitry of this dynamic frequency divider is disclosed in FIG. 7. The output from stage 1 of the three stage frequency adjuster is (K1 fref)/(K2 L), where K1 and L are fixed and K2 is variable; this output is labeled f2.
Stage 2 of the three stage frequency adjuster contains the forward path of phase locked loop 610, the feedback circuit with frequency divider 612 that divides by K3, and the frequency divider 614 at the output that divides by K4. The frequency output of stage 2 equals (K3/K4) f2; this frequency is referred to as f3. The value of K3 is fixed but the value of K4 is variable.
Stage 3 of the three stage frequency adjuster has the same structure as stage 2. It contains the forward path of phase locked loop 618, feedback loop with frequency divider 620 that divides by M, and frequency divider 622 on the output that divides by N. The frequency output, fout, of this final stage equals (M/N) f3. The values of M and N are both fixed.
Substituting the various formulas for each stage of the circuit, it can be seen that fout=(K1/K2) (K3/K4) (M/N) (fref/L) where K2 and K4 are variable. It is instructive to substitute typical frequency values to see how the output frequency can be tuned with high refinement. Let fref be 150 MHz. The values of the various dividers will be chosen so that the output frequency will also be 150 MHz, but by varying the values of K2 and K4, fine adjustments can be obtained. L, K1, and K3 are set to 100. M and N are set to 200 and 2, respectively. In case 1, K2 is set to 119 and K4 is set to 84. The resultant output frequency is 150.06 MHz; this is a change of +60,000 Hz for 150 MHz or +400 parts per million (PPM). In case 2, K2 is set to 122 and K4 is set to 82. The resultant output frequency is 149.94 MHz; this is a change of -60,000 Hz for 150 MHz or -400 PPM.
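The arithmetic above can be checked with a short Python sketch. The function and parameter names are hypothetical; the formula and the divider values are the ones given in the text, and the stage-by-stage structure follows the description of f2, f3, and fout.

```python
def synth_output_hz(fref_hz, L, K1, K2, K3, K4, M, N):
    """fout = (K1/K2) * (K3/K4) * (M/N) * (fref/L), computed stage by stage."""
    f2 = (fref_hz / L) * K1 / K2   # stage 1 output
    f3 = f2 * K3 / K4              # stage 2 output
    return f3 * M / N              # stage 3 output


def ppm_offset(fout_hz, fref_hz):
    """Deviation of fout from fref in parts per million."""
    return (fout_hz - fref_hz) / fref_hz * 1e6


FREF = 150e6
FIXED = dict(L=100, K1=100, K3=100, M=200, N=2)

case1 = synth_output_hz(FREF, K2=119, K4=84, **FIXED)
case2 = synth_output_hz(FREF, K2=122, K4=82, **FIXED)

if __name__ == "__main__":
    print(case1, ppm_offset(case1, FREF))  # about 150.06 MHz, about +400 PPM
    print(case2, ppm_offset(case2, FREF))  # about 149.94 MHz, about -400 PPM
```

The product K2*K4 is 9996 in case 1 and 10004 in case 2, so the overall ratio differs from 1 by about 4 parts in 10,000 in each direction.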
By carrying through the calculations stage-by-stage, it is found that the frequency shifts at stage 2 are less than 2.5% and at stage 3 are less than 0.08%. As one of ordinary skill in the art will appreciate, greater refinement of frequency adjustment can be obtained when more stages are cascaded. The frequency divider at the output of each stage, except for the last stage, can be made variable.
Frequency dividers in the prior art are hardwired to a particular divisor value. Therefore, a new circuit had to be devised that could divide by any integer value and that could change the divisor value very quickly.
With reference now to
Comparator A>B COMP 706 is on whenever the current counter value is less than the current divisor value. Whenever the comparator 706 is on, the incrementer INC 710 increases the counter value by 1 and saves the new value in REG_B 712. The output state based on the setting of REG_OUT 716 remains the same. When the counter value exceeds the divisor value, then the output of comparator 706 is off, which causes the incrementer to be set back to 1 and the value of REG_OUT 716 to be toggled resulting in the output frequency changing state.
Examination of particular frequency values helps understand operation of this circuit. Suppose the output of the multiplexer is a divisor value of 120 and the value in REG_B 712 has just been reset, so REG_B 712 counts from the value 1 up to the divisor value. When this counter equals the divisor value, it triggers the output of A>B COMP 706 to change state. This has two effects: it resets the value in REG_B to 1 and it toggles the output frequency from REG_OUT 716. For every 120 pulses on the input, there is one pulse on the output. So the circuit functions like a "divide by 120" circuit.
Suppose that the value of NEW_K is 110 and the CHANGE_K command is received; this transfers the value of 110 to the "A INPUT" of the multiplexer. There are two possible cases: the counter value in REG_B is less than 110, or the counter value is between 110 and 120. In the case where the counter is less than 110, REG_B 712 continues to count but now will be reset when 110 is reached. If the value in REG_B is already greater than 110, then the output of comparator A>B COMP is switched, which results in a toggle of the output frequency and a reset of the counter.
As one of ordinary skill in the art will appreciate, the case where NEW_K is larger than CURRENT_K is even easier. The current counter value is less than NEW_K, so once the multiplexer switches the input to the comparator, the counter will continue to count up until the new divisor value is reached.
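The counter behavior described above can be modeled in software. The following Python sketch is a behavioral approximation only, not the disclosed circuit: the class and method names are hypothetical, the multiplexer and registers are collapsed into plain attributes, and the output is assumed to toggle on the input pulse at which the counter has reached the divisor.

```python
class DynamicDivider:
    """Behavioral model of a divider whose divisor can change on the fly."""

    def __init__(self, divisor):
        self.divisor = divisor   # value presented to the comparator's A input
        self.counter = 1         # stands in for REG_B, counting from 1
        self.out = 0             # stands in for REG_OUT

    def change_k(self, new_k):
        """Model of CHANGE_K: switch the comparator to the NEW_K value."""
        self.divisor = new_k

    def clock(self):
        """Apply one input pulse; return True if the output toggled."""
        if self.divisor > self.counter:   # comparator on: keep counting
            self.counter += 1
            return False
        self.counter = 1                  # comparator off: reset and toggle
        self.out ^= 1
        return True


def clocks_until_toggle(divider):
    """Count input pulses up to and including the next output toggle."""
    n = 1
    while not divider.clock():
        n += 1
    return n
```

In this model a divider created with a divisor of 120 toggles once every 120 input pulses. Calling change_k(110) while the counter is above 110 produces a toggle and reset on the very next pulse, while calling it with the counter below 110 simply shortens the current count, matching the two cases described in the text.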
With reference now to
The master frequency fm appears as horizontal line 806 at the bottom of the figure. Slave frequency 808 is shown as a dashed line at the bottom of the figure; it is shown initially 200 PPM greater than the master frequency 806. Vertical dashed lines 810, 812, 814, 816, 818, 820, 822, and 824 indicate the times that the phases of the master and slave signals are compared and, when required, corrections are applied.
At times 810, 812, 814, and 816 the phase of the slave 802 is greater than the phase of the master 804. During these same intervals, the frequency of the slave 808 is 200 PPM greater than the frequency of the master 806. At synchronization time 818, the phase of the slave 802 is less than the phase of the master 804. When this is detected, the variable dividers in the multistage frequency synthesizer associated with the slave are adjusted to produce a frequency of the slave 808 that is 200 PPM less than the frequency of the master 806. This causes the phase of the slave 802 to rise quickly until at time 820 it is again greater than the master. This causes the frequency of the slave 808 to switch from being 200 PPM less than the master 806 to being 200 PPM greater than the master. During time intervals 822 and 824 the phase of the slave 802 is still greater than the phase of the master 804, so the frequency of the slave 808 remains at 200 PPM greater than the frequency of the master 806.
As one of ordinary skill in the art will appreciate, once the phase of the slave 802 becomes less than the phase of the master 804, the frequency of the slave 808 will drop to 200 PPM less than the frequency of the master 806 to bring the system back into balance. This continual detection of phase differences and resulting frequency adjustments will keep the time of day registers synchronized during the operation of the computer system. If these adjustments were not made, over a longer period of time the register discrepancies would become so large as to cause system malfunctions as a result of timestamp problems. However, since these synchronization times occur every thousand clock cycles or so in a typical embodiment, the time of day values never shift enough to cause any serious problems. Solutions to this problem in the prior art involved expensive hardware, such as using an external atomic clock, to provide synchronization. Using this invention, two or more time of day registers at different nodes in a multiple processor system can be synchronized with a minimum of additional hardware.
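The bounded behavior of this kind of two-level correction can be illustrated with a small Python simulation. This is a simplified sketch under stated assumptions, not the disclosed hardware: the function name is hypothetical, the slave is modeled as running exactly 200 PPM fast or slow, comparisons occur every 1000 master clock cycles, and the sign convention (the slave is slowed when its counter is ahead and sped up when it is behind) is chosen for the sketch and may not match the labeling of the phase traces in the figure.

```python
def simulate(steps, interval_cycles=1000, ppm=200, initial_offset=0.0):
    """Return the slave-minus-master counter offset at each comparison time."""
    step = interval_cycles * ppm * 1e-6   # drift in cycles per interval
    offset = initial_offset
    sign = +1                             # assumption: slave starts out fast
    history = []
    for _ in range(steps):
        offset += sign * step
        history.append(offset)
        # Comparator decision for the next interval (assumed convention):
        # slave ahead -> run slow, slave behind or level -> run fast.
        sign = -1 if offset > 0 else +1
    return history


if __name__ == "__main__":
    print(simulate(8))
    print(simulate(12, initial_offset=1.0))
```

With these assumed numbers the offset moves by 0.2 cycle per comparison interval, so once the loop has pulled in an initial error the offset stays within about a fifth of a clock cycle of zero.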
The description of the present invention has been presented for purposes of illustration and description but is not intended to be exhaustive or limited to the invention in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art. The embodiment was chosen and described in order to best explain the principles of the invention and the practical application, and to enable others of ordinary skill in the art to understand the invention for various embodiments with various modifications as are suited to the particular use contemplated.
Dean, Mark Edward, Boerstler, David William, Ngo, Hung Cai, Zimmerman, Andrew Christian
Executed on | Assignor | Assignee | Conveyance | Frame | Reel | Doc |
Jul 25 2000 | DEAN, MARK EDWARD | International Business Machines Corporation | CORRECTIVE ASSIGNMENT TO CORRECT THE SECOND & FOURTH ASSIGNOR S NAME, PREVIOUSLY RECORDED AT REEL 011128 FRAME 0201 | 012443 | /0486 | |
Jul 25 2000 | ZIMMERMAN, ANDREW CHRISTIAN | International Business Machines Corporation | CORRECTIVE ASSIGNMENT TO CORRECT THE SECOND & FOURTH ASSIGNOR S NAME, PREVIOUSLY RECORDED AT REEL 011128 FRAME 0201 | 012443 | /0486 | |
Jul 25 2000 | DEAH, MARK EDWARD | International Business Machines Corporation | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 011128 | /0201 | |
Jul 25 2000 | ZIMMERMAN, ANDREW CHRISTAIN | International Business Machines Corporation | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 011128 | /0201 | |
Jul 27 2000 | BOERSTLER, DAVID WILLIAM | International Business Machines Corporation | CORRECTIVE ASSIGNMENT TO CORRECT THE SECOND & FOURTH ASSIGNOR S NAME, PREVIOUSLY RECORDED AT REEL 011128 FRAME 0201 | 012443 | /0486 | |
Jul 27 2000 | NGO, HUNG CAI | International Business Machines Corporation | CORRECTIVE ASSIGNMENT TO CORRECT THE SECOND & FOURTH ASSIGNOR S NAME, PREVIOUSLY RECORDED AT REEL 011128 FRAME 0201 | 012443 | /0486 | |
Jul 27 2000 | BOERSTLER, DAVID WILLIAM | International Business Machines Corporation | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 011128 | /0201 | |
Jul 27 2000 | NGO, HUNG CAI | International Business Machines Corporation | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 011128 | /0201 | |
Aug 03 2000 | International Business Machines Corporation | (assignment on the face of the patent) | / |
Date | Maintenance Fee Events |
Jul 13 2004 | ASPN: Payor Number Assigned. |
Jan 21 2008 | REM: Maintenance Fee Reminder Mailed. |
Jul 13 2008 | EXP: Patent Expired for Failure to Pay Maintenance Fees. |
Date | Maintenance Schedule |
Jul 13 2007 | 4 years fee payment window open |
Jan 13 2008 | 6 months grace period start (w surcharge) |
Jul 13 2008 | patent expiry (for year 4) |
Jul 13 2010 | 2 years to revive unintentionally abandoned end. (for year 4) |
Jul 13 2011 | 8 years fee payment window open |
Jan 13 2012 | 6 months grace period start (w surcharge) |
Jul 13 2012 | patent expiry (for year 8) |
Jul 13 2014 | 2 years to revive unintentionally abandoned end. (for year 8) |
Jul 13 2015 | 12 years fee payment window open |
Jan 13 2016 | 6 months grace period start (w surcharge) |
Jul 13 2016 | patent expiry (for year 12) |
Jul 13 2018 | 2 years to revive unintentionally abandoned end. (for year 12) |