A method for fault tolerance in concurrently executing computer programs is presented. The present invention controls the re-execution of concurrent programs in order to avoid a recurrence of synchronization failure. The invention (i) traces an execution, (ii) detects a synchronization failure, (iii) determines a control strategy, and (iv) re-executes under control. Control is achieved by tracing information during an execution and using this information to add synchronizations during the re-execution.
6. A method for fault tolerance in concurrently executed software programs, comprising:
tracing executions of concurrent programs to identify faults and to provide tracing information; identifying faults in the execution of said concurrent programs; and using controlled re-execution of said concurrent programs based on the identification of faults, said tracing information being used to add synchronizations during the re-execution, and said re-execution of said concurrent programs being based on an automatically formulated computer operation that is derived from said tracing information.
1. A method of providing fault tolerance in concurrently executing computer programs by controlling the re-execution of concurrent programs in order to avoid a recurrence of synchronization failures, comprising:
(a) tracing the execution of concurrent programs to provide tracing information; (b) detecting synchronization failures resulting from said execution of the concurrent programs; and (c) applying a control strategy, based on said detection of failures, for a re-execution of the concurrent programs said control strategy using the tracing information to add synchronizations during said re-execution, said re-execution of said concurrent programs being based on an automatically formulated computer operation that is derived from information traced during a failed execution.
13. A method of providing tolerance to synchronization faults during concurrent execution of computer program components, one of said concurrent program components having a first critical section and another of said concurrent program components having a second critical section, said method comprising the steps of:
tracing the execution of concurrent program components to acquire tracing information pertaining thereto, including information pertaining to a synchronization failure resulting from said execution of said concurrent program components; using said tracing information to derive one or more synchronizations for use in selectively controlling a re-execution of said concurrent program components; and re-executing said concurrent program components with said derived synchronizations being added to said re-execution, a particular one of said derived synchronizations being operative to ensure that execution of said first critical section is completed before execution of said second critical section commences.
11. A method of correcting synchronization faults occurring during the execution of concurrent programs, comprising:
tracing the execution of concurrent programs; detecting a synchronization error within the execution of said concurrent programs; resetting program processes and associated programs files to a consistent global state prior to said error; recording a section of the traced vector clock values that occur after the rolled-back state, wherein said section indicates the critical section entry and exit points required by the algorithm and further indicates failure points within execution of said concurrent programs; adding synchronizations specified as pairs of critical section boundary points to identified failures within said sections; replaying said concurrent programs using the logged vector clock values of the receive points, wherein each receive point must be blocked until the same message arrives as in the previous execution; and using each of said added synchronizations during said replaying of said concurrent programs to ensure that execution of a critical section of one of said concurrent programs is completed before execution of a critical section of another of said concurrent programs commences.
2. The method of
3. The method of
4. The method of
5. The method of
7. The method of
8. The method of
9. The method of
10. The method of
12. The method of
14. The method of
one of said derived synchronizations is implemented by a control message sent from an exit point of said first critical section, and received before an entry point of said second critical section.
This application is a conversion from and claims priority of U.S. Provisional Application No. 60/159,253, filed Oct. 13, 1999.
The present invention is generally directed to methods for correcting synchronization faults in concurrently executed computer programs and, more particularly, to methods and systems for fault tolerance of concurrently executed software programs using controlled re-execution of the programs.
Concurrent programs are difficult to write. The programmer is presented with the task of balancing two competing forces: safety and liveness. Frequently, the programmer leans too much in one of the two directions, causing either safety failures (e.g. races) or liveness failures (e.g. deadlocks). Such failures arise from a particular kind of software fault (bug), known as a synchronization fault. Studies have shown that synchronization faults account for a sizeable fraction of observed software faults in concurrent programs. Locating synchronization faults and eliminating them by reprogramming is always the best strategy. However, many systems must maintain availability in spite of software failures. Concurrent programs include all parallel programming paradigms such as multi-threaded programs, shared-memory parallel programs, message-passing distributed programs, distributed shared-memory programs, etc. A parallel entity may be referred to as a process, although in practice it may also be a thread.
Traditionally, it was believed that software failures are permanent in nature and, therefore, they would recur in every execution of the program with the same inputs. This belief led to the use of design diversity to recover from software failures. In approaches based on design diversity, redundant modules with different designs are used, ensuring that there is no single point-of-failure. Contrary to this belief, it was observed that many software failures are, in fact, transient (they may not recur when the program is re-executed with the same inputs). In particular, the failures caused by synchronization faults are usually transient in nature.
The existence of transient software failures motivated a new approach to software fault tolerance based on rolling back the processes to a previous state and then restarting them (possibly with message reordering), in the hope that the transient failure will not recur in the new execution. Methods based on this approach have mostly relied on chance in order to recover from a transient software failure. In the special case of synchronization faults, however, it is desirable to do better.
It would therefore be desirable to be able to bypass a synchronization fault and recover from the resulting failure.
The present invention controls the re-execution of concurrent programs in order to avoid a recurrence of the synchronization failure. The invention provides a method of (i) tracing an execution, (ii) detecting a synchronization failure, (iii) determining a control strategy, and (iv) re-executing under control.
Control is achieved by tracing information during an execution and using this information to add synchronizations during the re-execution.
In accordance with the present invention, a method of providing fault tolerance in concurrently executing computer programs by controlling the re-execution of concurrent programs in order to avoid a recurrence of synchronization failures is provided, comprising:
(a) tracing the execution of concurrent programs;
(b) detecting synchronization failures resulting from said execution of the concurrent programs; and
(c) applying a control strategy, based on said detection of failures, for said execution of the concurrent programs.
Also in accordance with the present invention, application of a control strategy includes causing a re-execution of said concurrent programs under a control derived from tracing information during an execution, and wherein said control includes using said information to add synchronizations to said concurrent programs during re-execution.
For a more complete understanding of the invention and for application of further advantages thereof, reference is now made to the following description of preferred embodiments taken in conjunction with the accompanying drawings, in which:
In accordance with the present invention, methods for tolerating transient synchronization failures in software are provided. The present invention controls the re-execution of concurrent programs, based on information traced during a failed execution. The following description covers the management of races, tracing, failure detection, and re-execution under control.
Referring to
The present invention allows for the development of a control strategy by determining which synchronizations to add to an execution trace in order to tolerate a synchronization fault during the re-execution of the computer program, or programs, that encountered the fault. This proves to be an important problem in its own right and can be applied in areas other than software fault tolerance, such as concurrent debugging. The problem is generalized using a framework known as the off-line predicate control problem. The problem, as it applies to debugging, was introduced in a paper entitled "Predicate control for active debugging of distributed programs," by A. Tarafdar and V. K. Garg, IEEE Proceedings of the 9th Symposium on Parallel and Distributed Processing, Orlando, USA, April 1998. Informally, off-line predicate control specifies that, given a computation and a property on the computation, one must determine a controlled computation (one with more synchronizations) that maintains the property (the term computation is used for a formal model of an execution). Previous attempts solved the predicate control problem for a class of properties called disjunctive predicates. Applying the results of the Tarafdar et al. study to software fault tolerance would mean avoiding synchronization failures of the form l1 ∧ l2 ∧ l3, where li is a local property specified on process Pi. For example, if li specifies that a server is unavailable, the synchronization failure is that all servers are unavailable at the same time.
The invention addresses a class of off-line predicate control problems, characterized by the mutual exclusion property, that is especially useful in tolerating races. Four classes of mutual exclusion properties are considered: off-line mutual exclusion, off-line readers writers, off-line independent mutual exclusion, and off-line independent read-write mutual exclusion. For each of these classes of properties, necessary and sufficient conditions under which the problem may be solved are determined. Furthermore, an efficient algorithm that solves the most general of the problems, off-line independent read-write mutual exclusion, is presented, which also solves each of the other three problems. The algorithm takes O(np) time, where n is the number of concurrent processes and p is the number of critical sections.
These problems have been termed off-line problems to distinguish them from their more popular on-line variants (i.e. the usual mutual exclusion problems). The difference between the on-line and off-line problems is that in the on-line case, the computation is revealed only as it unfolds, whereas in the off-line case, the computation is known a priori. Ignorance of the future makes on-line mutual exclusion a harder problem to solve.
In general, in on-line mutual exclusion, one cannot avoid deadlocks without making some assumptions (e.g. critical sections do not block). Thus, without such assumptions, on-line mutual exclusion is impossible to solve in general. To understand why this is true, consider the scenario in FIG. 1. Any on-line algorithm, being unaware of the future computation, would have a symmetric choice of entering CS1 or CS2 first. If CS2 is entered first, it would result in a deadlock. An off-line algorithm, being aware of the future computation, could make the correct decision to enter CS1 first and add a synchronization from CS1 to CS2. There will always be scenarios where on-line mutual exclusion algorithms will fail, resulting in either race conditions or deadlocks. In such scenarios, controlled re-execution based on off-line mutual exclusion becomes vitally important.
The model that is now presented is of a single execution of the concurrent program. The model is not at the programming language level, but at a lower level, at which the execution consists of a sequence of states for each process and the communications that occurred among them. Let S be a finite set of elementary entities known as states. S is partitioned into subsets S1, S2, . . . , Sn, where n>1. These partitions correspond to n processes in the system. A subset G of S is called a global state if ∀i: |G∩Si|=1. Let Gi denote the unique element in G∩Si. A global predicate is a function that maps a global state onto a boolean value.
A computation is a partial order → on S such that ∀i: →i is a total order on Si, where →i represents → restricted to the set Si. Note that the states in a single process are totally ordered while the states across processes are partially ordered. We will use →, →k, →c to denote computations, and ∥, ∥k, ∥c to denote the respective incomparability relations (e.g. s∥t ≡ (s↛t) ∧ (t↛s)). Given a computation → and a subset K of S, →-consistent(K) ≡ ∀s,t∈K: s∥t. In particular, a global state may be →-consistent. The notion of consistency indicates when a set of states could have occurred concurrently in a computation.
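The consistency test defined above can be illustrated with a small sketch. It is a toy under stated assumptions rather than the patented implementation: states are hypothetical string labels, the computation is supplied as a set of direct precedence pairs, and the helper names `transitive_closure`, `incomparable`, and `is_consistent` are invented for illustration.

```python
from itertools import combinations


def transitive_closure(edges):
    """Return the transitive closure of a set of (s, t) precedence pairs."""
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def incomparable(order, s, t):
    """s || t: neither state precedes the other in the computation."""
    return (s, t) not in order and (t, s) not in order


def is_consistent(order, states):
    """A set of states is consistent if its members are pairwise incomparable."""
    return all(incomparable(order, s, t) for s, t in combinations(states, 2))


# Hypothetical computation: process a has states a1 -> a2, process b has
# states b1 -> b2, and a message sent after a1 is received before b2.
order = transitive_closure({("a1", "a2"), ("b1", "b2"), ("a1", "b2")})
```

In this example `{"a2", "b1"}` is consistent, because neither state precedes the other, while `{"a1", "b2"}` is not, because the message orders a1 before b2.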
A computation → is extensible in S if:
Intuitively, extensibility allows for the extension of a consistent set of states to a consistent global state. Any computation in S can be made extensible by adding "dummy" states to S. Therefore, it can be implicitly assumed that any computation is extensible.
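Read formally, the intuition above suggests a condition of the following shape. This is a reconstruction offered under that assumption, not a quotation: every consistent set of states K is contained in some consistent global state G.

```latex
\forall K \subseteq S:\quad
  \rightarrow\text{-consistent}(K)
  \;\Rightarrow\;
  \exists\ \text{global state } G \supseteq K:\ \rightarrow\text{-consistent}(G)
```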
Given a computation →, let ≦ be a relation on global states defined as: G≦H≡∀i: (Gi→i Hi)
Given a computation → and a global predicate B, a computation →c is called a controlling computation of B in → if (1) →
In the paper by Tarafdar et al., it was proved that the Off-line Predicate Control Problem is NP-hard. Therefore, it is important to solve useful restricted forms of the Off-line Predicate Control Problem. Since it is desirable to avoid race conditions, the general problem is restricted by letting B specify the mutual exclusion property.
Referring to
In traditional on-line mutual exclusion, there has been no "independent" variant, since it trivially involves applying the same algorithm for each lock. However, in off-line mutual exclusion, such an approach will not work, since the synchronizations added by each independent algorithm may cause deadlocks when applied together.
For the practitioner, an algorithm which solves Off-line Independent Read-Write Mutual Exclusion would suffice, since it can be used to solve all other variants. For the purpose of describing the invention in straightforward terms, however, the simplest problem, Off-line Mutual Exclusion, will be described first in generalized steps. For each problem, the necessary and sufficient conditions for finding a solution will also be described.
Off-line Mutual Exclusion is a specialization of Off-line Predicate Control to the following class of global predicates:
where critical is a function that maps a state onto a boolean value. Thus, Bmutex specifies that at most one process may be critical in a global state.
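The property stated in words above, that at most one process may be critical in a global state, admits the following formalization. It is offered as one plausible reading under that description; the exact notation is an assumption.

```latex
B_{mutex}(G) \;\equiv\;
  \forall\, s, t \in G,\ s \neq t:\quad
  \neg\big(\mathit{critical}(s) \wedge \mathit{critical}(t)\big)
```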
Based on the critical boolean function on states, critical sections are defined as maximal intervals of critical states. More precisely, given a critical function on S and a computation → on S, a critical section, CS, is a non-empty, maximal subset of some Si such that: (1) ∀s∈CS: critical(s), and (2) ∀s,t∈CS: ∀u∈Si: s→u→t ⇒ u∈CS
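The notion of a critical section as a maximal interval of critical states can be sketched for a single process as follows. The function name `critical_sections` and the representation of a process as an ordered list of state labels are assumptions made for illustration.

```python
def critical_sections(states, critical):
    """Split one process's totally ordered states into maximal runs of
    critical states. Each run is returned as a (first, last) pair."""
    sections = []
    run = []
    for s in states:
        if critical(s):
            run.append(s)          # extend the current interval
        elif run:
            sections.append((run[0], run[-1]))  # interval ended
            run = []
    if run:                        # interval that reaches the final state
        sections.append((run[0], run[-1]))
    return sections


# Hypothetical process with six states, three of which are critical.
states = ["s0", "s1", "s2", "s3", "s4", "s5"]
critical_states = {"s1", "s2", "s4"}
sections = critical_sections(states, lambda s: s in critical_states)
```

Here `sections` holds two intervals, one spanning s1 to s2 and one consisting of s4 alone.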
Let CS.first and CS.last be the minimum and maximum states respectively in CS (w.r.t. →i). Let
All computations will have the same total order →i for each Si. Therefore, the set of critical sections will not change for each computation. However, the
Theorem 1 (Necessary Condition) For a computation → of S, and a global predicate Bmutex,
a controlling computation of Bmutex in → exists
Proof: The contrapositive is proved. Let
Case 1: [s1∈CS1
Case 2: [s1∉CS1
So in either case, →c is not a controlling computation of Bmutex in →.
Theorem 2 (Sufficient Condition) For a computation → of S and a global predicate Bmutex,
Proof: Since
The remaining obligation of proof is that →c is a partial order. To this end, let →k be defined as: (→∪{(CSi.last, CSi+1.first)|1≦i≦k-1})+. The following claim can be made: Claim:∀1≦k≦m: (1)→k is a partial order, and (2) CSi
Proof of Claim: (by Induction on k)
Base Case: Immediate from →=→1.
Inductive Case: We make the inductive hypothesis that →k-1 is a partial order, and that CSi
(i) Irreflexivity: Let s→kt. There are two possibilities: either s
(ii) Transitivity: This is immediate from the definition of →k.
Therefore, →k is a partial order. The second part of the claim is now illustrated. Suppose CSi
In conclusion, the necessary and sufficient condition for finding a controlling computation for Bmutex is that there is no cycle of critical sections with respect to
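The construction used in the sufficiency proof chains the critical sections in an order compatible with the computation and adds a synchronization from each CSi.last to CSi+1.first. A minimal sketch of that idea follows. The relation between critical sections is taken as a hypothetical input `precedes`, since its exact definition is not reproduced here, and the function name `mutex_synchronizations` is invented. The standard-library `graphlib` module (Python 3.9 or later) supplies the topological sort.

```python
from graphlib import TopologicalSorter, CycleError


def mutex_synchronizations(sections, precedes):
    """sections: dict mapping a section name to (first_state, last_state).
    precedes: set of (a, b) pairs meaning section a must come before b
              (assumed input standing in for the relation on critical sections).
    Returns the added synchronizations as (CS_i.last, CS_{i+1}.first) pairs,
    or None when the sections form a cycle and no control is possible."""
    predecessors = {name: set() for name in sections}
    for a, b in precedes:
        predecessors[b].add(a)
    try:
        order = list(TopologicalSorter(predecessors).static_order())
    except CycleError:
        return None
    # Chain consecutive sections: the exit of one precedes the entry of the next.
    return [(sections[a][1], sections[b][0]) for a, b in zip(order, order[1:])]


# Hypothetical example with three critical sections on three processes.
sections = {
    "CS1": ("p1.s2", "p1.s4"),
    "CS2": ("p2.s1", "p2.s3"),
    "CS3": ("p3.s5", "p3.s6"),
}
syncs = mutex_synchronizations(sections, {("CS1", "CS2"), ("CS2", "CS3")})
```

For this chain the result is two synchronizations, from p1.s4 to p2.s1 and from p2.s3 to p3.s5. When `precedes` contains a cycle the function returns `None`, which corresponds to the case in which the necessary condition fails.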
The following is an explanation of the Off-line Readers Writers Problem. Let read_critical and write_critical be functions that map a state onto a boolean value. Further, no state can be both read_critical and write_critical (any state that is both read locked and write locked is considered to be only write locked). Let critical(s)≡read_critical(s)
Given a read-critical function and a write-critical function on S and a computation → on S, we define a read critical section and a write critical section in an analogous fashion to the critical sections that were defined before. Note that, since no state is both read_critical and write_critical, critical sections in a process do not overlap.
Let
Theorem 3 (Necessary Condition) For a computation → of S, and a global predicate Brw,
a controlling computation of Brw in → exists ⇒ all cycles in → contain only read critical sections
Proof: The proof is similar to the proof of Theorem 1. Proving the contrapositive: Let
First, there is at least one critical section in the cycle, say CSk (where k≠1), such that CS1.last ↛c CSk.first and CSk.last ↛c CS1.first. To prove this, assume the opposite:
and prove a contradiction as follows: CSm
Since the existence of a CSk has been demonstrated such that CS1.last ↛c CSk.first and CSk.last ↛c CS1.first, a proof similar to the one in Theorem 1 can be used to show that →c is not a controlling computation of Brw in →.
Theorem 4 (Sufficient Condition) For a computation → of S, and a global predicate Brw,
all cycles in → contain only read critical sections ⇒ a controlling computation of Brw in → exists
Proof: Consider the set of strongly connected components of the set of critical sections with respect to the
It is shown that →c is a controlling computation of Brw in →. Suppose G is a global state such that
Note, as before, that the proof of Theorem 4 can be used to design an algorithm to find a controlling computation.
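The condition shared by Theorems 3 and 4, that every cycle contains only read critical sections, is equivalent to saying that no write critical section lies on a cycle. A rough sketch of that check is given below. The graph over critical sections is a hypothetical input, and the names `reachable` and `controllable_rw` are invented for illustration.

```python
def reachable(graph, start):
    """Nodes reachable from start by following at least one edge."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def controllable_rw(graph, is_write):
    """True when no write critical section can reach itself, i.e. when
    every cycle consists of read critical sections only.
    graph: dict mapping each section to the set of sections it precedes.
    is_write: dict mapping each section to True for a write section."""
    return not any(is_write[cs] and cs in reachable(graph, cs) for cs in graph)


# Two read sections in a cycle plus an isolated write section: controllable.
ok_graph = {"R1": {"R2"}, "R2": {"R1"}, "W1": set()}
ok_kinds = {"R1": False, "R2": False, "W1": True}

# A write section inside a cycle: not controllable.
bad_graph = {"R1": {"W1"}, "W1": {"R1"}}
bad_kinds = {"R1": False, "W1": True}
```

The check runs a search from every write section, which is adequate for a sketch. It does not construct the controlling computation itself.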
The following is an explanation of the Off-line Independent Mutual Exclusion Problem. Let critical1, critical2, . . . criticalm be functions that map a state onto a boolean value. The Off-line Independent Mutual Exclusion Problem is a specialization of the Off-line Predicate Control Problem to the following class of global predicates:
Given a function criticali on S and a computation → on S, an i-critical section is defined in an analogous fashion to the critical sections that were defined before. Note that the definition allows independent critical sections on the same process to overlap. In particular the same set of states may correspond to two different critical sections (corresponding to a critical section with multiple locks). Let
Theorem 5 (Necessary Condition)
For a computation → of S, and a global predicate Bind,
a controlling computation of Bind in → exists ⇒ → has no cycles of i-critical sections, for some i
Proof: The proof is almost identical to the proof of Theorem 1.
Theorem 6 (Sufficient Condition) For a computation → of S, and a global predicate Bind,
→ has no cycles of i-critical sections, for some i ⇒ a controlling computation of Bind in → exists
Proof: The proof is along similar lines to the proof of Theorem 4. In this case, strongly connected components are taken as before, using the fact that no two i-critical sections may be in the same strongly connected component (otherwise, there would be a cycle of i-critical sections).
Using similar definitions, the Off-line Independent Read-Write Mutual Exclusion Problem is a specialization of the Off-line Predicate Control Problem to the following class of global predicates:
As before, we define i-read critical sections and i-write critical sections (1≦i≦m). Similarly, let → be a relation on all critical sections. The necessary and sufficient condition is a combination of those of the previous two sections. Since the proofs are similar to the previous ones, we simply state:
Theorem 7 (Necessary and Sufficient Condition)
For a computation → of S, and a global predicate Bind-rw,
a controlling computation of Bind-rw in → exists ≡ all cycles of i-critical sections in → contain only read critical sections
The input to the algorithm is the computation, represented by n lists of critical sections C1, . . . , Cn. For now, to simplify presentation, assume that critical sections are totally ordered on each process. Each critical section is represented as its process id, its first and last states, a type identifier cs_id that specifies the criticalcs
The first while loop of the algorithm builds ordered, a totally ordered set of strongly connected components of critical sections (called scc's from here on). The second while loop simply uses ordered to construct the →c relation.
The goal of each iteration of the first loop is to add an scc, which is minimal w.r.t. →, to ordered (where → is the relation on scc's defined in the proof of Theorem 4). To determine this scc, it first computes the set of scc's among the leading critical sections in C1, . . . Cn. Since no scc can contain two critical sections from the same process, it is sufficient to consider only the leading critical sections. From the set of scc's, it determines the set of minimal scc's, crossable. It then randomly selects one of the minimal scc's. Finally, before adding the scc to ordered, it must check if the scc is not_valid, where not_valid(crossed)≡∀cs1cs1∈crossed:cs.cs--id . . . cs1.cs_id
The main while loop of the algorithm executes p times in the worst case, where p is the number of critical sections in the computation. Each iteration takes O(n²) since it must compute the scc's. Thus, a simple implementation of the algorithm will have a time complexity of O(n²p). A better implementation of the algorithm, however, would amortize the cost of computing scc's over multiple iterations of the loop. Each iteration would compare each of the critical sections that have newly reached the heads of the lists with the existing scc's, thus forming new scc's. Each of the p critical sections, therefore, reaches the head of the list just once, when it is compared with n-1 critical sections to determine the new scc's. The time complexity of the algorithm with this improved implementation is, therefore, O(np). Note that a naive algorithm based directly on the constructive proof of the sufficient condition in Theorem 7 would take O(p²). The complexity has been significantly reduced by using the fact that the critical sections in a process are totally ordered.
The algorithm has implicitly assumed a total ordering of critical sections in each process. As noted before, however, independent critical sections on the same process may overlap, and may even coincide exactly (a critical section with multiple locks is treated as multiple critical sections that completely overlap). The algorithm can be extended to handle such cases by first determining the scc's within a process. These scc's correspond to maximal sets of overlapping critical sections. The input to the algorithm would consist of n lists of such process-local scc's. The remainder of the algorithm remains unchanged.
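The preprocessing step described above, grouping overlapping critical sections of one process before the main algorithm runs, can be approximated by merging overlapping intervals. The sketch below assumes each section is given as a hypothetical triple of first state index, last state index, and lock identifier. The name `process_local_groups` is invented.

```python
def process_local_groups(sections):
    """Group one process's critical sections into maximal sets of
    overlapping sections. Each section is (first_index, last_index, lock_id),
    where the indices are positions in the process's state sequence."""
    groups = []
    for sec in sorted(sections, key=lambda s: (s[0], s[1])):
        if groups and sec[0] <= groups[-1]["last"]:
            # Shares at least one state with the current group: merge it in.
            groups[-1]["members"].append(sec)
            groups[-1]["last"] = max(groups[-1]["last"], sec[1])
        else:
            groups.append({"first": sec[0], "last": sec[1], "members": [sec]})
    return groups


# Locks A and B overlap on states 2..3, then coincide exactly on states 7..8.
groups = process_local_groups([(0, 3, "A"), (2, 5, "B"), (7, 8, "A"), (7, 8, "B")])
```

The example yields two groups, one covering states 0 to 5 and one covering states 7 to 8, each containing two sections. Each group would then be treated as a single entry in that process's input list.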
Referring to
In determining a control strategy, the algorithm used in the present invention determines which synchronizations to add in order to avoid very general forms of mutual exclusion violation. As mentioned before, the other three parts of the scheme (tracing, failure detection, and re-execution under control) have been addressed as independent problems. All of the pieces are now put together for a comprehensive look at how race failures (mutual exclusion violations) can be tolerated.
Consider a distributed system of processes that write to a single shared file. The file system itself does not synchronize accesses and so the processes are responsible for synchronizing their accesses to the file. If they do not do so, the writes may interleave and the data may be corrupted. Since the file data is crucial, it must be ensured that races can be tolerated.
Synchronization occurs through the use of explicit message passing between the processes. The first part of our mechanism involves tracing the execution. The concern during tracing is to reduce the space and time overhead, so that tolerating a possible fault does not come at too great a cost. In the example, a vector clock mechanism is used, updating the vector clock at each send and receive point. This vector clock needs to be logged for each of the writes to the file (for the algorithm of the present invention). The vector clock values must also be logged for each receive point (for replay purposes). When a write is initiated, and when it returns, the vector clock must be logged. In the example, the writes are typically very long and therefore are performed asynchronously. Thus, execution continues while the write is in progress. In particular, the process may receive a message from another process during its write to the file. Inserting some computation at the send, receive, write initiation, and write completion points can be achieved either by code instrumentation, or by modifying the run-time environment (message-passing interface and the file system interface).
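The tracing step described above can be sketched with a small vector clock class. This is an illustrative toy under assumptions: the class name `VectorClock`, its methods, and the log labels are invented, and a real system would hook these calls into the message-passing and file system interfaces. The clock is updated only at send and receive points, and its value is logged at receive points and at write initiation and completion.

```python
class VectorClock:
    """Per-process vector clock with a log of traced points."""

    def __init__(self, pid, n):
        self.pid = pid
        self.clock = [0] * n
        self.log = []  # (label, clock snapshot) pairs kept for later analysis

    def _tick(self):
        self.clock[self.pid] += 1

    def record(self, label):
        """Log the current clock, e.g. at write initiation or completion."""
        self.log.append((label, tuple(self.clock)))

    def send(self):
        """Update the clock at a send point; the returned timestamp is
        piggybacked on the outgoing message."""
        self._tick()
        return list(self.clock)

    def receive(self, timestamp):
        """Merge the sender's timestamp at a receive point and log it,
        since receive points are needed again for replay."""
        self.clock = [max(a, b) for a, b in zip(self.clock, timestamp)]
        self._tick()
        self.record("receive")


# Two hypothetical processes: P0 writes, then messages P1, which then writes.
p0, p1 = VectorClock(0, 2), VectorClock(1, 2)
p0.record("write_begin")
p0.record("write_end")
stamp = p0.send()
p1.receive(stamp)
p1.record("write_begin")
```

After this run P1's log shows that its write began with clock (1, 1), which reflects the message from P0.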
The second part of our mechanism is detecting when a race occurs. Since message passing is used as the synchronization mechanism, the methods described in a Ph.D. thesis by R. H. B. Netzer, entitled "Race condition detection for debugging shared-memory parallel programs," University of Wisconsin-Madison, 1991, are particularly applicable. Once a race has been detected, all processes are rolled back to a consistent global state prior to the race. The file is also rolled back to a version consistent with the rolled-back state of the processes. (A versioned file system with the ability to roll back must be assumed.) The section of the traced vector clock values that occur after the rolled-back state is then noted. The section indicates the critical section entry and exit points required by the algorithm. The algorithm would take O(np) time, where n is the number of processes and p is the number of critical sections that have been rolled back. The output of the algorithm is the set of added synchronizations specified as pairs of critical section boundary points.
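As a rough illustration of how logged vector clocks can reveal a race between two writes, the sketch below tests whether either write completed before the other began. It is not the detection method of the cited thesis. It assumes that each process increments its own clock component only at send and receive points, so that a state s on process i precedes a state t on another process exactly when s's component i is smaller than t's. The names `happened_before` and `writes_race` are invented.

```python
def happened_before(pid_s, clock_s, clock_t):
    """True if state s (on process pid_s) precedes state t on another process,
    under the assumption that clocks tick only at send and receive points."""
    return clock_s[pid_s] < clock_t[pid_s]


def writes_race(write_a, write_b):
    """Each write is (pid, clock_at_initiation, clock_at_completion).
    Two writes on different processes race when neither one completed
    before the other was initiated."""
    pid_a, begin_a, end_a = write_a
    pid_b, begin_b, end_b = write_b
    if pid_a == pid_b:
        return False  # this sketch only compares writes on different processes
    a_first = happened_before(pid_a, end_a, begin_b)
    b_first = happened_before(pid_b, end_b, begin_a)
    return not (a_first or b_first)


# P0 finishes its write, then sends a message that P1 receives before writing.
ordered_a = (0, (0, 0), (0, 0))
ordered_b = (1, (1, 1), (1, 1))

# P1 writes without having heard from P0 at all.
unsynchronized_b = (1, (0, 0), (0, 0))

# P0 sends its message while its own write is still in progress.
early_send_a = (0, (0, 0), (1, 0))
```

The first pair is ordered by the message, so no race is reported. The other two combinations are reported as races, because no message separates the end of one write from the start of the other.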
The next step is to replay the processes using the logged vector clock values of the receive points. Each receive point must be blocked until the same message arrives as in the previous execution. This is a standard replay mechanism. In addition, during this replay, additional synchronizations must be imposed. For example, suppose (s, t) is one of the synchronizations output by our algorithm. The state s is a critical section exit point while t is a critical section entry point. Each of these additional synchronizations is implemented by a control message sent from s and received before t. Thus, at each critical section exit point, the added synchronizations must be checked to decide if a control message must be sent. At each critical section entry point, the added synchronizations must be checked to decide if the process must block waiting for a control message. As in tracing, the points at which computation must be added are the write initiation and completion points, and the send and receive points. Again, this can be accomplished by code instrumentation or run-time environment modification.
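The control-message mechanism can be sketched with threads standing in for processes. Everything here is an assumption made for illustration: each process has a single critical section, entry and exit points are named by hypothetical labels such as "P0.exit", and a `threading.Event` plays the role of a control message.

```python
import threading


def replay(added_syncs, num_procs):
    """Re-execute num_procs writers under the added synchronizations.
    added_syncs: list of (exit_point, entry_point) label pairs."""
    control = {pair: threading.Event() for pair in added_syncs}
    shared_file = []

    def process(pid):
        entry, exit_point = f"P{pid}.entry", f"P{pid}.exit"
        # Entry point: block until every control message aimed here arrives.
        for (s, t), message in control.items():
            if t == entry:
                message.wait()
        shared_file.append(f"P{pid} write begins")
        shared_file.append(f"P{pid} write ends")
        # Exit point: send every control message that originates here.
        for (s, t), message in control.items():
            if s == exit_point:
                message.set()

    # Start the processes in reverse order so that only the control
    # messages, not the start order, decide who writes first.
    threads = [threading.Thread(target=process, args=(pid,))
               for pid in reversed(range(num_procs))]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return shared_file


result = replay([("P0.exit", "P1.entry")], 2)
```

With the single added synchronization, P1 cannot begin its write until P0 has finished, so the two writes never interleave. A production mechanism would use real messages between processes and would also need to handle the case in which a control message never arrives.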
An example has been chosen in which the processes only write to the file. If the processes were to read from the file as well, the reads would create causal dependencies between processes, and these dependencies would have to be tracked in the same way as those created by messages. Another option would be to assume that these causal dependencies do not affect the message communications, in which case they do not need to be tracked. However, if this approach is taken, it would be necessary to check that the computation being replayed is the same as the traced one. In case of a divergence, the execution would be left to proceed uncontrolled from the point of divergence.
Whereas the invention has been described with respect to specific embodiments thereof, it will be understood that various changes and modifications will be suggested to one skilled in the art and it is intended to encompass such changes and modifications as fall within the scope of the appended claims.
Garg, Vijay K., Tarafdar, Ashis
| Patent | Priority | Assignee | Title |
| 6845470 | Feb 27 2002 | International Business Machines Corporation | Method and system to identify a memory corruption source within a multiprocessor system |
| 6851075 | Jan 04 2002 | International Business Machines Corporation | Race detection for parallel software |
| 7086053 | Jun 12 2000 | Oracle America, Inc | Method and apparatus for enabling threads to reach a consistent state without explicit thread suspension |
| 7770064 | Oct 05 2007 | International Business Machines Corporation | Recovery of application faults in a mirrored application environment |
| 7856536 | Oct 05 2007 | LinkedIn Corporation | Providing a process exclusive access to a page including a memory address to which a lock is granted to the process |
| 7921272 | Oct 05 2007 | International Business Machines Corporation | Monitoring patterns of processes accessing addresses in a storage device to determine access parameters to apply |
| 7926058 | Feb 06 2007 | RENEO, INC | Resource tracking method and apparatus |
| 7941616 | Oct 21 2008 | Microsoft Technology Licensing, LLC | System to reduce interference in concurrent programs |
| 7971248 | Aug 15 2007 | Microsoft Technology Licensing, LLC | Tolerating and detecting asymmetric races |
| 8051423 | Feb 06 2007 | RENEO, INC | System and method for tracking resources during parallel processing |
| 8055855 | Oct 05 2007 | International Business Machines Corporation | Varying access parameters for processes to access memory addresses in response to detecting a condition related to a pattern of processes access to memory addresses |
| 8321838 | Sep 24 2001 | Oracle International Corporation | Techniques for debugging computer programs involving multiple computing machines |
| 8533728 | Feb 06 2007 | RENEO, INC | Resource tracking method and apparatus |
| 8572577 | Jun 20 2008 | International Business Machines Corporation | Monitoring changes to data within a critical section of a threaded program |
| 8799863 | Sep 24 2001 | Oracle International Corporation | Techniques for debugging computer programs involving multiple computing machines |
| 9086969 | Dec 15 2009 | F5 Networks, Inc | Establishing a useful debugging state for multithreaded computer program |
| Patent | Priority | Assignee | Title |
| 4358823 | Mar 25 1977 | TRW, Inc. | Double redundant processor |
| 5016249 | Dec 22 1987 | Lucas Industries public limited company | Dual computer cross-checking system |
| 5249187 | Sep 04 1987 | HEWLETT-PACKARD DEVELOPMENT COMPANY, L P | Dual rail processors with error checking on I/O reads |
| 5423024 | May 06 1991 | STRATUS COMPUTER, INC | Fault tolerant processing section with dynamically reconfigurable voting |
| 5440726 | Jun 22 1994 | AT&T IPM Corp | Progressive retry method and apparatus having reusable software modules for software failure recovery in multi-process message-passing applications |
| 5530802 | Jun 22 1994 | THE CHASE MANHATTAN BANK, AS COLLATERAL AGENT | Input sequence reordering method for software failure recovery |
| 5590277 | Jun 22 1994 | THE CHASE MANHATTAN BANK, AS COLLATERAL AGENT | Progressive retry method and apparatus for software failure recovery in multi-process message-passing applications |
| 6038684 | Jul 17 1992 | Sun Microsystems, Inc. | System and method for diagnosing errors in a multiprocessor system |
| 6058491 | Sep 15 1997 | International Business Machines Corporation | Method and system for fault-handling to improve reliability of a data-processing system |
| 6161196 | Jun 19 1998 | WSOU Investments, LLC | Fault tolerance via N-modular software redundancy using indirect instrumentation |
| 6173414 | May 12 1998 | McDonnell Douglas Corporation; TRW, Inc. | Systems and methods for reduced error detection latency using encoded data |
| Executed on | Assignor | Assignee | Conveyance | Reel | Frame | Doc |
| Oct 11 2000 | GARG, VIJAY K | Board of Regents, The University of Texas System | ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS) | 011504 | 0069 | |
| Oct 13 2000 | | Board of Regents, The University of Texas System | (assignment on the face of the patent) | | | |
| Oct 16 2000 | TARAFDAR, ASHIS NMI | Board of Regents, The University of Texas System | ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS) | 011504 | 0069 | |
| Apr 17 2007 | TARAFDAR, ASHIS | Board of Regents, The University of Texas System | CORRECTIVE ASSIGNMENT TO CORRECT THE DOCUMENT TO INCLUDE THE SERIAL NUMBER AND PATENT NUMBER OF THE ASSIGNMENT PREVIOUSLY RECORDED ON REEL 011504 FRAME 0069. ASSIGNOR(S) HEREBY CONFIRMS THE ASSIGNMENT OF THE ENTIRE INTEREST | 019175 | 0704 | |
| Apr 17 2007 | GARG, VIJAY K | Board of Regents, The University of Texas System | CORRECTIVE ASSIGNMENT TO CORRECT THE DOCUMENT TO INCLUDE THE SERIAL NUMBER AND PATENT NUMBER OF THE ASSIGNMENT PREVIOUSLY RECORDED ON REEL 011504 FRAME 0069. ASSIGNOR(S) HEREBY CONFIRMS THE ASSIGNMENT OF THE ENTIRE INTEREST | 019175 | 0704 | |
| Apr 19 2007 | The Board of Regents of the University of Texas System | LOT 25D ACQUISITION FOUNDATION, LLC | ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS) | 019974 | 0995 | |
| Apr 27 2007 | GARG, VIJAY K | Board of Regents, The University of Texas System | CORRECTIVE ASSIGNMENT TO CORRECT THE ASSIGNMENT TO CLARIFY GRANT OF RIGHTS PREVIOUSLY RECORDED ON REEL 019175 FRAME 0704. ASSIGNOR(S) HEREBY CONFIRMS THE ASSIGNMENT OF THE ENTIRE INTEREST | 019288 | 0641 | |
| May 03 2007 | TARAFDAR, ASHIS | Board of Regents, The University of Texas System | CORRECTIVE ASSIGNMENT TO CORRECT THE ASSIGNMENT TO CLARIFY GRANT OF RIGHTS PREVIOUSLY RECORDED ON REEL 019175 FRAME 0704. ASSIGNOR(S) HEREBY CONFIRMS THE ASSIGNMENT OF THE ENTIRE INTEREST | 019288 | 0641 | |
| Aug 12 2015 | LOT 25D ACQUISITION FOUNDATION, LLC | S AQUA SEMICONDUCTOR, LLC | MERGER (SEE DOCUMENT FOR DETAILS) | 036907 | 0278 | |
| Date | Maintenance Fee Events |
| May 05 2008 | ASPN: Payor Number Assigned. |
| May 05 2008 | RMPN: Payer Number De-assigned. |
| May 07 2008 | M1551: Payment of Maintenance Fee, 4th Year, Large Entity. |
| May 07 2008 | M1554: Surcharge for Late Payment, Large Entity. |
| May 13 2008 | R2551: Refund - Payment of Maintenance Fee, 4th Yr, Small Entity. |
| May 13 2008 | STOL: Pat Hldr no Longer Claims Small Ent Stat |
| Sep 23 2011 | M1552: Payment of Maintenance Fee, 8th Year, Large Entity. |
| Mar 11 2016 | REM: Maintenance Fee Reminder Mailed. |
| Aug 03 2016 | EXP: Patent Expired for Failure to Pay Maintenance Fees. |
| Date | Maintenance Schedule |
| Aug 03 2007 | 4 years fee payment window open |
| Feb 03 2008 | 6 months grace period start (w surcharge) |
| Aug 03 2008 | patent expiry (for year 4) |
| Aug 03 2010 | 2 years to revive unintentionally abandoned end. (for year 4) |
| Aug 03 2011 | 8 years fee payment window open |
| Feb 03 2012 | 6 months grace period start (w surcharge) |
| Aug 03 2012 | patent expiry (for year 8) |
| Aug 03 2014 | 2 years to revive unintentionally abandoned end. (for year 8) |
| Aug 03 2015 | 12 years fee payment window open |
| Feb 03 2016 | 6 months grace period start (w surcharge) |
| Aug 03 2016 | patent expiry (for year 12) |
| Aug 03 2018 | 2 years to revive unintentionally abandoned end. (for year 12) |