Embodiments of systems and methods for reducing false positives during the linking of data records are disclosed herein. Broadly speaking, embodiments of the present invention may be used in the generation of an overall weight from the comparison of various attributes of data records, where the linking of the data records is dependent on the overall weight. More specifically, embodiments of the present invention may calculate a false positive penalty based on a set of results, each of the set of results based on a comparison of an attribute. The false positive penalty may be subtracted from the overall weight generated from the comparison of the attributes of data records to adjust the overall weight. By configuring which attributes of the data records are used as the set of attributes for generating the false positive penalty, and the penalties associated with a particular combination of results for the comparisons of these attributes, the incidence of false positives in the linking of data records may be significantly reduced.
1. A method for association of data records, comprising: providing a system comprising an identity hub running an identity hub engine, the identity hub coupled to one or more external data sources through one or more networks, each external data source at a corresponding database;
receiving a first data record and a second data record from the one or more external data sources at the identity hub;
obtaining a first set of results, wherein each of the first set of results is a value generated based on a comparison between one of a first set of attributes from the first data record and the second data record, wherein the comparison between each of the first set of attributes is performed by the identity hub engine;
determining a first overall weight for a comparison between the first data record and the second data record using the first set of results;
generating a first false positive penalty based on the first set of results, wherein the first false positive penalty is associated with the comparison of the first data record and the second data record, and wherein each possible permutation of the first set of results corresponds to a different false positive penalty;
adjusting the first overall weight to reduce a likelihood of incorrect linking of the first data record and second data record; and
determining whether the first data record and the second data record should be linked based on the adjusted first overall weight.
6. A computer readable storage media, comprising instructions translatable for implementing an identity hub engine on an identity hub, the identity hub coupled to one or more external data sources through one or more networks, each external data source at a corresponding database, the identity hub engine operable for:
receiving a first data record and a second data record from the one or more external data sources at the identity hub;
obtaining a first set of results, wherein each of the first set of results is a value generated based on a comparison between one of a first set of attributes from the first data record and the second data record, wherein the comparison between each of the first set of attributes is performed by the identity hub engine;
determining a first overall weight for a comparison between the first data record and the second data record using the first set of results;
generating a first false positive penalty based on the first set of results, wherein the first false positive penalty is associated with the comparison of the first data record and the second data record, and wherein each possible permutation of the first set of results corresponds to a different false positive penalty;
adjusting the first overall weight to reduce a likelihood of incorrect linking of the first data record and second data record; and
determining whether the first data record and the second data record should be linked based on the adjusted first overall weight.
11. A system for the linking of data records, comprising:
one or more data sources, each data source at a corresponding database; and
an identity hub linked to the one or more data sources through one or more networks, wherein the identity hub comprises a computer readable medium including instructions operable for implementing an identity hub engine for:
receiving a first data record and a second data record from the one or more external data sources at the identity hub;
obtaining a first set of results, wherein each of the first set of results is a value generated based on a comparison between one of a first set of attributes from the first data record and the second data record, wherein the comparison between each of the first set of attributes is performed by the identity hub engine;
determining a first overall weight for a comparison between the first data record and the second data record using the first set of results;
generating a first false positive penalty based on the first set of results, wherein the first false positive penalty is associated with the comparison of the first data record and the second data record, and wherein each possible permutation of the first set of results corresponds to a different false positive penalty;
adjusting the first overall weight to reduce a likelihood of incorrect linking of the first data record and second data record; and
determining whether the first data record and the second data record should be linked based on the adjusted first overall weight.
2. The method of
4. The method of
5. The method of
obtaining a second set of results, wherein each of the second set of results is based on a comparison between one of the first set of attributes from a third data record and a fourth data record, the second set of results differing from the first set of results, wherein obtaining the second set of results is performed by the identity hub engine;
determining a second overall weight for a comparison between the third data record and the fourth data record using the second set of results;
generating a second false positive penalty based on the second set of results wherein the second false positive penalty is associated with the comparison of the third data record and the fourth data record;
adjusting the second overall weight to reduce the likelihood of the incorrect linking of the third data record and fourth data record; and
determining whether the third data record and the fourth data record should be linked based on the adjusted second overall weight.
7. The computer readable storage media of
8. The computer readable storage media of
9. The computer readable storage media of
10. The computer readable storage media of
obtaining a second set of results, wherein each of the second set of results is based on a comparison between one of the first set of attributes from a third data record and a fourth data record, the second set of results differing from the first set of results, wherein obtaining the second set of results is performed by the identity hub engine;
determining a second overall weight for a comparison between the third data record and the fourth data record using the second set of results;
generating a second false positive penalty based on the second set of results wherein the second false positive penalty is associated with the comparison of the third data record and the fourth data record;
adjusting the second overall weight to reduce the likelihood of the incorrect linking of the third data record and fourth data record; and
determining whether the third data record and the fourth data record should be linked based on the adjusted second overall weight.
12. The system of
14. The system of
15. The system of
obtaining a second set of results, wherein each of the second set of results is based on a comparison between one of the first set of attributes from a third data record and a fourth data record, the second set of results differing from the first set of results, wherein obtaining the second set of results is performed by the identity hub engine;
determining a second overall weight for a comparison between the third data record and the fourth data record using the second set of results;
generating a second false positive penalty based on the second set of results wherein the second false positive penalty is associated with the comparison of the third data record and the fourth data record;
adjusting the second overall weight to reduce the likelihood of the incorrect linking of the third data record and fourth data record; and
determining whether the third data record and the fourth data record should be linked based on the adjusted second overall weight.
This application is a continuation of, and claims a benefit of priority under 35 U.S.C. 120 of the filing date of U.S. patent application Ser. No. 11/521,946 by inventors Norm Adams et al. entitled “Method and System for Filtering False Positives” filed on Sep. 15, 2006, the entire contents of which are hereby expressly incorporated by reference for all purposes.
This invention relates generally to associating data records, and in particular to identifying data records that may contain information about the same entity such that these data records may be associated. Even more particularly, this invention relates to the statistical identification of data records for association.
In today's day and age, the vast majority of businesses retain extensive amounts of data regarding various aspects of their operations, such as inventories, customers, products, etc. Data about entities, such as people, products, parts or anything else may be stored in digital format in a data store such as a computer database. These computer databases permit the data about an entity to be accessed rapidly and permit the data to be cross-referenced to other relevant pieces of data about the same entity. The databases also permit a person to query the database to find data records pertaining to a particular entity, such that data records from various data stores pertaining to the same entity may be associated with one another.
A data store, however, has several limitations which may limit the ability to find the correct data about an entity within the data store. The actual data within the data store is only as accurate as the person who entered the data, or an original data source. Thus, a mistake in the entry of the data into the data store may cause a search for data about an entity in the database to miss relevant data about the entity because, for example, a last name of a person was misspelled or a social security number was entered incorrectly, etc. A whole host of these types of problems may be imagined: two separate records may be created for an entity that already has a record within the database, such that several data records may contain information about the same entity, but, for example, the names or identification numbers contained in the data records may be different so that it may be difficult to associate the data records referring to the same entity with one another.
For a business that operates one or more data stores containing a large number of data records, the ability to locate relevant information about a particular entity within and among the respective databases is very important, but not easily obtained. Once again, any mistake in the entry of data (including without limitation the creation of more than one data record for the same entity) at any information source may cause relevant data to be missed when the data for a particular entity is searched for in the database. In addition, in cases involving multiple information sources, each of the information sources may have slightly different data syntax or formats which may further complicate the process of finding data among the databases. An example of the need to properly identify an entity referred to in a data record, and to locate all data records relating to that entity, arises in the health care field, where a number of different hospitals associated with a particular health care organization may have one or more information sources containing information about their patients, and the health care organization collects the information from each of the hospitals into a master database. It is necessary to link data records from all of the information sources pertaining to the same patient to enable searching for information for a particular patient in all of the hospital records.
There are several problems which limit the ability to find all of the relevant data about an entity in such a database. Multiple data records may exist for a particular entity as a result of separate data records received from one or more information sources, which leads to a problem that can be called data fragmentation. In the case of data fragmentation, a query of the master database may not retrieve all of the relevant information about a particular entity. In addition, as described above, the query may miss some relevant information about an entity due to a typographical error made during data entry, which leads to the problem of data inaccessibility. In addition, a large database may contain data records which appear to be identical, such as a plurality of records for people with the last name of Smith and the first name of Jim. A query of the database will retrieve all of these data records, and the person who made the query may often choose, at random, one of the data records retrieved, which may be the wrong data record. The person often may not attempt to determine which of the records is appropriate. This can lead to the data records for the wrong entity being retrieved even when the correct data records are available. These problems limit the ability to locate the information for a particular entity within the database.
To reduce the amount of data that must be reviewed, and prevent the user from picking the wrong data record, it is also desirable to identify and associate data records from the various information sources that may contain information about the same entity. There are conventional systems that locate duplicate data records within a database and delete those duplicate data records, but these systems may only locate data records which are substantially identical to each other. Thus, these conventional systems cannot determine if two data records, with, for example, slightly different last names, nevertheless contain information about the same entity. In addition, these conventional systems do not attempt to index data records from a plurality of different information sources, locate data records within the one or more information sources containing information about the same entity, and link those data records together. Consequently, it would be desirable to be able to associate data records from a plurality of information sources which pertain to the same entity, despite discrepancies between attributes of these data records.
No matter the system utilized to identify and associate data records, however, certain conditions may arise with respect to associating these data records. More specifically, there will almost certainly be cases where data records which should be associated are not (known as false negatives) and cases where data records are associated when they do not refer to the same entity (known as false positives). In certain areas, these conditions may be relatively innocuous: the false negatives and false positives are easily dealt with and no harm may arise. In highly critical areas such as health care industries, however, these conditions may have the potential to cause great harm. This is particularly true for false positives. Mistakenly associating data records which refer to distinct entities may have large ramifications when it comes to the application of medical care and pharmaceuticals.
Thus, there is a need for systems and methods for comparing attributes of data records and linking these data records which are operable to filter these linked data records for false positives, and it is to this end that embodiments of the present invention are directed.
Embodiments of systems and methods for reducing false positives during the linking of data records are disclosed herein. Broadly speaking, embodiments of the present invention may be used in the generation of an overall weight from the comparison of various attributes of data records, where the linking of the data records is dependent on the overall weight. More specifically, embodiments of the present invention may provide a set of code (e.g., a computer program product comprising a set of computer instructions stored on a computer readable medium and executable or translatable by a computer processor) translatable to calculate a false positive penalty based on a set of results, each of the set of results based on a comparison of an attribute. The false positive penalty may be subtracted from the overall weight generated from the comparison of the attributes of data records to adjust the overall weight. By configuring which attributes of the data records are used as the set of attributes for generating the false positive penalty, and the penalties associated with a particular combination of results for the comparisons of these attributes, the incidence of false positives in the linking of data records may be significantly reduced.
In one embodiment, a comparison between attributes from two data records yields a set of results which are, in turn, used to generate a false positive penalty. The overall weight which was determined for the two data records may then be adjusted using this false positive penalty.
In some embodiments, the set of results to utilize in determining a false positive penalty may be configured.
In other embodiments, the attributes utilized to generate the false positive penalty may be the attributes used to determine the overall weight for the two records, or a subset thereof.
In still other embodiments, the false positive penalty to be utilized in conjunction with any particular set of results may be configured.
Embodiments of the present invention may provide the technical advantage of more effective linking of data records through the reduction of the incidences of false positives when linking these data records. More specifically, embodiments of the present invention may prove effective at reducing the incidences of false positives when linking data records pertaining to people, especially when the data records undergoing comparison and linking pertain to members of the same family or to people with the same name.
Other technical advantages of embodiments of the present invention include a high degree of configurability and flexibility. In other words, the attributes, results and other parameters utilized in the determination of the false positive penalty to impose (if any), and the values for the false positive penalties themselves, may be configurable such that embodiments of the present invention may be fine-tuned according to a wide variety of variables.
These, and other, aspects of the invention will be better appreciated and understood when considered in conjunction with the following description and the accompanying drawings. The following description, while indicating various embodiments of the invention and numerous specific details thereof, is given by way of illustration and not of limitation. Many substitutions, modifications, additions or rearrangements may be made within the scope of the invention, and the invention includes all such substitutions, modifications, additions or rearrangements.
The drawings accompanying and forming part of this specification are included to depict certain aspects of the invention. A clearer impression of the invention, and of the components and operation of systems provided with the invention, will become more readily apparent by referring to the exemplary, and therefore nonlimiting, embodiments illustrated in the drawings, wherein identical reference numerals designate the same components. Note that the features illustrated in the drawings are not necessarily drawn to scale.
The invention and the various features and advantageous details thereof are explained more fully with reference to the nonlimiting embodiments that are illustrated in the accompanying drawings and detailed in the following description. Descriptions of well known starting materials, processing techniques, components and equipment are omitted so as not to unnecessarily obscure the invention in detail. Skilled artisans should understand, however, that the detailed description and the specific examples, while disclosing preferred embodiments of the invention, are given by way of illustration only and not by way of limitation. Various substitutions, modifications, additions or rearrangements within the scope of the underlying inventive concept(s) will become apparent to those skilled in the art after reading this disclosure.
Reference is now made in detail to the exemplary embodiments of the invention, examples of which are illustrated in the accompanying drawings. Wherever possible, the same reference numbers will be used throughout the drawings to refer to the same or like parts (elements).
Before turning to embodiments of the present invention, a general description of an example infrastructure or context which may be helpful in explaining these various embodiments will be described. A block diagram of one embodiment of just such an example infrastructure is described in
As shown, the identity hub 32 may receive data records from the data sources 34, 36, 38 as well as write corrected data back into the information sources 34, 36, 38. The corrected data communicated to the data sources 34, 36, 38 may include information that was correct, but has changed, information about fixing information in a data record or information about links between data records.
In addition, one of the operators 40, 42, 44 may transmit a query to the identity hub 32 and receive a response to the query back from the identity hub 32. The one or more data sources 34, 36, 38 may be, for example, different databases that possibly have data records about the same entities. For example, in the health care field, each information source 34, 36, 38 may be associated with a particular hospital in a health care organization and the health care organization may use the identity hub 32 to relate the data records associated with the plurality of hospitals so that a data record for a patient in Los Angeles may be located when that same patient is on vacation and enters a hospital in New York. The identity hub 32 may be located at a central location and the data sources 34, 36, 38 and users 40, 42, 44 may be located remotely from the identity hub 32 and may be connected to the identity hub 32 by, for example, a communications link, such as the Internet or any other type of communications network, such as a wide area network, intranet, wireless network, leased network, etc.
The identity hub 32 may have its own database that stores complete data records in the identity hub, or alternatively, the identity hub may also only contain sufficient data to identify a data record (e.g., an address in a particular data source 34, 36, 38) or any portion of the data fields that comprise a complete data record so that the identity hub 32 can retrieve the entire data record from the data source 34, 36, 38 when needed. The identity hub 32 may link data records together containing information about the same entity utilizing an entity identifier or an associative database separate from actual data records. Thus, the identity hub 32 may maintain links between data records in one or more data sources 34, 36, 38, but does not necessarily maintain a single uniform data record for an entity.
More specifically, the identity hub may link data records in data sources 34, 36, 38 by comparing a data record (received from an operator, or from a data source 34, 36, 38) with other data records in data sources 34, 36, 38 to identify data records which should be linked together. This identification process may entail a comparison of one or more of the attributes of the data records with like attributes of the other data records. For example, a name attribute associated with one record may be compared with the name of other data records, social security number may be compared with the social security number of another record, etc. In this manner, data records which should be linked may be identified.
It will be apparent to those of ordinary skill in the art that both the data sources 34, 36, 38 and the operators 40, 42, 44 may be affiliated with similar or different organizations or owners. For example, data source 34 may be affiliated with a hospital in Los Angeles run by one health care network, while data source 36 may be affiliated with a hospital in New York run by another health care network. Thus, the data records of each of the data sources may be of a different format.
This may be illustrated more clearly with reference to
Notice, however, that each of the records may have a different format. For example, data record 202 may have a field for the attribute of driver's license number, while data record 200 may have no such field. Similarly, like attributes may have different formats as well. For example, name fields 210a, 210b, 210c in record 200 may accept the entry of a full first, last and middle name, while name fields 210d, 210e, 210f in record 202 may be designed for full first and last names, but only allow the entry of a middle initial.
As may be imagined, discrepancies such as this may be problematic when comparing two or more data records (e.g. attributes of data records) to identify data records which should be linked. Complicating the linking of data records, information pertaining to the same entity may be incorrectly entered into a data record, or may change in one data record pertaining to the entity but not in another data record, etc.
To deal with these possibilities, a system may be utilized which compares the various attributes of data records according to statistical algorithms to determine if data records refer to identical entities and hence, should be linked. To aid in an understanding of the systems and methods of the present invention it will be helpful to present an example embodiment of a methodology for identifying records pertaining to the same entity which may utilize these systems and methods.
This standardization may comprise the standardization of attributes of a data record into a standard format, such that subsequent comparisons between like attributes of different data records may be performed according to this standard format. It will be apparent that each of the attributes of the data records to be compared may be standardized according to a different format, a different set of semantics or lexicon, etc.
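By way of illustration only, the standardization of a few attributes might be sketched as follows. The function names, the target formats (uppercase name tokens, nine-digit SSN, YYYYMMDD dates) and the list of accepted input date formats are assumptions made for this sketch and are not taken from the disclosure.

```python
import re
from datetime import datetime
from typing import Optional

# Assumed input date formats; a real deployment would configure these per data source.
_DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%Y%m%d")


def standardize_name(raw: Optional[str]) -> Optional[str]:
    """Uppercase the name and reduce it to space-separated alphabetic tokens."""
    if not raw:
        return None
    tokens = re.sub(r"[^A-Za-z\s]", " ", raw).upper().split()
    return " ".join(tokens) or None


def standardize_ssn(raw: Optional[str]) -> Optional[str]:
    """Keep digits only; treat anything that is not nine digits as missing."""
    if not raw:
        return None
    digits = re.sub(r"\D", "", raw)
    return digits if len(digits) == 9 else None


def standardize_date(raw: Optional[str]) -> Optional[str]:
    """Return the date as YYYYMMDD, or None if no assumed format matches."""
    if not raw:
        return None
    for fmt in _DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).strftime("%Y%m%d")
        except ValueError:
            continue
    return None
```

Returning None for values that cannot be standardized lets later comparison steps treat the attribute as missing rather than as a mismatch.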
Once the attributes of the data records to be compared have been standardized at step 320, a set of candidates may be selected to compare to the new data record at step 330. This candidate selection process may comprise a comparison of one or more attributes of the new data record to the existing data records to determine which of the existing data records are similar enough to the new data record to entail further comparison. These candidates may then undergo a more detailed comparison to the new data record where a set of attributes are compared between the records to determine if an existing data record should be linked or associated with the new data record. This more detailed comparison may entail comparing each of the set of attributes of one record (e.g. an existing record) to the corresponding attribute in the other record (e.g. the new record) to generate a weight for that attribute. The weights for the set of attributes may then be summed to generate an overall weight which can then be compared to a threshold to determine if the two records should be linked.
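The summation of per-attribute weights and the comparison to a threshold may be sketched as below. The agree and disagree weights, the treatment of missing values as contributing zero, and the use of a greater-than-or-equal test against the threshold are assumptions chosen for illustration; the disclosure does not fix any of them.

```python
from typing import Dict, Iterable, Optional

Record = Dict[str, Optional[str]]


def attribute_weight(value_a: Optional[str], value_b: Optional[str],
                     agree: float = 2.0, disagree: float = -1.0) -> float:
    """Hypothetical per-attribute weight: reward agreement, penalize disagreement."""
    if value_a is None or value_b is None:
        return 0.0  # assumption: a missing value contributes nothing
    return agree if value_a == value_b else disagree


def overall_weight(existing: Record, new: Record, attributes: Iterable[str]) -> float:
    """Sum the weights generated for each compared attribute."""
    return sum(attribute_weight(existing.get(a), new.get(a)) for a in attributes)


def should_link(existing: Record, new: Record,
                attributes: Iterable[str], threshold: float) -> bool:
    """Link the two records when the overall weight reaches the threshold."""
    return overall_weight(existing, new, attributes) >= threshold


ATTRIBUTES = ("name", "ssn", "birth_date")
existing = {"name": "JIM SMITH", "ssn": "123456789", "birth_date": "19700101"}
new = {"name": "JIM SMITH", "ssn": "123456789", "birth_date": None}
```

With these illustrative values, two agreeing attributes and one missing attribute give an overall weight of 4.0, so the pair would be linked at a threshold of 3.0 but not at 5.0.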
In some cases, however, data records which do not represent the same entity may be mistakenly linked. These false positives may occur for a variety of reasons. For example, family members may share a variety of characteristics which, in turn, may lead to data records for different members of the same family being incorrectly linked (e.g. it may be incorrectly determined that these data records refer to the same person, and thus the data records are linked). Typically, methods for reducing the occurrence of false positives take one or more attributes of the data records being compared and use a fixed penalty if there is a mismatch between any corresponding attribute in their respective data records (e.g. a mismatch penalty is imposed if there is a mismatch between the names in each of the data records). Because of this, these prior solutions were severely limited in both the combinations of attributes which could be utilized in filtering for false positives, and the final weight penalties that could be imposed for specific combinations of the results of the comparisons of these attributes. Consequently, it would be desirable to implement a false positive filter (e.g. algorithm) to help reduce the likelihood of incorrectly linking data records which are associated with different entities, which can be configured to use various attributes which may be used in the linking of records and which may impose differing penalties based on different combinations of results for the evaluation of these attributes.
To that end, attention is now directed to systems and methods for reducing false positives during the linking of data records. Broadly speaking, embodiments of the present invention may be used in the generation of an overall weight from the comparison of various attributes of data records, where the linking of the data records is dependent on the overall weight. More specifically, embodiments of the present invention may calculate a false positive penalty based on a set of results, each of the set of results based on a comparison of an attribute. The false positive penalty may be subtracted from the overall weight generated from the comparison of the attributes of data records to adjust the overall weight. By configuring which attributes of the data records are used as the set of attributes for generating the false positive penalty, and the penalties associated with a particular combination of results for the comparisons of these attributes, the incidence of false positives in the linking of data records may be significantly reduced.
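The adjustment itself is a simple subtraction, and a minimal sketch may make its effect on the linking decision concrete. The function names, the numeric values in the usage note and the greater-than-or-equal threshold test are assumptions for illustration only.

```python
def adjusted_weight(overall_weight: float, false_positive_penalty: float) -> float:
    """Subtract the false positive penalty from the overall weight."""
    return overall_weight - false_positive_penalty


def link_after_penalty(overall_weight: float, false_positive_penalty: float,
                       threshold: float) -> bool:
    """Decide on linking using the adjusted weight (threshold test is an assumption)."""
    return adjusted_weight(overall_weight, false_positive_penalty) >= threshold
```

For example, a pair with an overall weight of 9.5 would be linked at a threshold of 8.0 when no penalty applies, but would not be linked once a penalty of 3.0 reduces the adjusted weight to 6.5.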
Turning now to
At step 410, then, the data resulting from the comparisons of a set of attributes may be obtained. The set of attributes or results utilized may be configured to include any set of attributes of the data records being compared. The data resulting from the comparisons may be configured to include the results of the comparison of the attributes or of any tokens of the attributes, or intermediary results used in the comparisons of attributes. The results obtained may also be generated from comparisons of attributes, or parts of attributes, which were not used in the generation of an overall weight for the two data records.
In one embodiment, the set of attributes utilized may include name, gender, birth date and SSN. The comparison of the name attribute may yield one of four values: “equal”, where all the tokens of the names from each data record match exactly; “partial”, where there are one or more initial or nickname/phonetic matches (as depicted in U.S. patent application Ser. No. 11/521,928 titled “Method and System For Comparing Attributes Such as Business Names” by Norm Adams et al. filed on Sep. 15, 2006, and U.S. patent application Ser. No. 11/522,223 entitled “Method and System For Comparing Attributes Such as Personal Names” by Norm Adams et al. and filed on Sep. 15, 2006, both of which are hereby fully incorporated herein by reference) between tokens of the names from the two data records and no mismatches between tokens; “different”, where there is at least one token mismatch between the two names of the data records; and “missing”, where one of the data records is missing name data or no comparison between name attributes was conducted during generation of a weight. It will be apparent that a numerical value may be assigned designating each of the results above.
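A simplified sketch of such a four-valued name comparison is given below. It is not the comparison of the incorporated applications: nickname and phonetic matching are omitted, tokens are compared position by position, and names with differing token counts are treated as “different”. The numeric codes other than 0 for “missing” are likewise assumptions.

```python
from enum import IntEnum
from typing import List, Optional


class NameResult(IntEnum):
    MISSING = 0    # no name data on at least one record
    EQUAL = 1      # assumed code
    PARTIAL = 2    # assumed code
    DIFFERENT = 3  # assumed code


def _tokens(name: Optional[str]) -> List[str]:
    """Uppercase the name and split it into tokens, ignoring periods and commas."""
    if not name:
        return []
    return name.upper().replace(".", " ").replace(",", " ").split()


def _token_match(a: str, b: str) -> str:
    """Classify one token pair as an exact match, an initial match, or a mismatch."""
    if a == b:
        return "exact"
    if (len(a) == 1 or len(b) == 1) and a[0] == b[0]:
        return "initial"
    return "mismatch"


def compare_names(name_a: Optional[str], name_b: Optional[str]) -> NameResult:
    tokens_a, tokens_b = _tokens(name_a), _tokens(name_b)
    if not tokens_a or not tokens_b:
        return NameResult.MISSING
    if len(tokens_a) != len(tokens_b):
        return NameResult.DIFFERENT  # simplifying assumption of this sketch
    outcomes = [_token_match(a, b) for a, b in zip(tokens_a, tokens_b)]
    if "mismatch" in outcomes:
        return NameResult.DIFFERENT
    if "initial" in outcomes:
        return NameResult.PARTIAL
    return NameResult.EQUAL
```

Under these assumptions, “J. Smith” compared with “Jim Smith” yields the “partial” result, while “Jim Smith” compared with “Tom Smith” yields “different”.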
Similarly, the results of a comparison of the values for the gender attribute of the two data records may be utilized, with the three possible results being “agree”, “disagree” or “missing”, where “missing” indicates that at least one of the data records does not have gender data or that no comparison of the gender attributes was conducted during the generation of a weight. The results of comparisons of the values for the date of birth attribute of the two data records may also be utilized. One comparison may be the edit distance between the two dates of birth of the data records (e.g. a value of 1 for an edit distance of 1, a value of 2 for an edit distance of 2, etc.), while another comparison may be the difference in birth year. The difference in birth year may be represented by values, where a value of 0 indicates that birth year data is missing or has not been compared, a value of 1 indicates a difference in birth year of between 0 and 4 years, a value of 2 indicates a difference of between 5 and 9 years, etc. The edit distance between the social security numbers of the two records may also be utilized.
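By way of non-limiting illustration, the gender, edit distance and birth year results may be sketched as follows. The use of Levenshtein distance is an assumption, as the description does not name a particular edit distance, and the function names and the string formats of the inputs are hypothetical.

```python
from typing import Optional


def compare_gender(gender_a: Optional[str], gender_b: Optional[str]) -> str:
    """Return 'agree', 'disagree' or 'missing' for two gender values."""
    if not gender_a or not gender_b:
        return "missing"
    return "agree" if gender_a.lower() == gender_b.lower() else "disagree"


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (assumed here) between two strings."""
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def birth_year_value(year_a: Optional[int], year_b: Optional[int]) -> int:
    """Map a birth year difference to the values described in the text:
    0 = missing or not compared, 1 = 0 to 4 years, 2 = 5 to 9 years, etc."""
    if year_a is None or year_b is None:
        return 0
    return abs(year_a - year_b) // 5 + 1
```

For instance, under these assumptions two dates of birth written as `"19800101"` and `"19800102"` have an edit distance of 1, and birth years of 1980 and 1987 map to a value of 2.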
The results of the comparison of the various attributes obtained in step 410 may then be utilized to generate a false positive penalty in step 420. In one embodiment, the specific permutation of the set of results obtained from the comparison of the attributes at step 410 may be used to generate a false positive penalty. More specifically, in one embodiment, the combination of results obtained at step 410 may be used to access a data structure (e.g. index into a table, etc.) which may store a penalty value to be utilized based on that combination of results. In other words, the results of the comparisons may have an associated numeric value (e.g. “missing” for an attribute may have a value of 0, edit distance for an attribute may be a value of 1, etc.) and each of these numeric values may be used to index into a data structure to locate the false positive penalty associated with the particular permutation of results represented by those values. In one embodiment, a program (written in, for example, Python) may be used to generate data structures such as tables comprising false positive penalties for use with embodiments of the present invention.
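By way of non-limiting illustration, the lookup of a penalty by a permutation of result values may be sketched as follows. The table contents, the three-value key layout and the default penalty of 0.0 for permutations not listed are assumptions made for the example only.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical table. Each key is a permutation of numeric result values,
# here assumed to be (name result, gender result, date of birth edit distance).
# The penalty values are illustrative and carry no significance.
PENALTY_TABLE: Dict[Tuple[int, ...], float] = {
    (2, 1, 1): 1.0,
    (2, 2, 3): 4.0,
    (3, 1, 0): 2.5,
}


def false_positive_penalty(results: Sequence[int],
                           table: Dict[Tuple[int, ...], float],
                           default: float = 0.0) -> float:
    """Return the penalty stored for this permutation of result values.

    Permutations absent from the table receive `default`, which is an
    assumption of this sketch rather than a stated behavior.
    """
    return table.get(tuple(results), default)
```

A dictionary keyed by tuples is used here for brevity; a dense multi-dimensional table indexed by the same values would serve the same purpose.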
In one particular embodiment, a four dimensional table may be utilized, with the first dimension of the table indexed by a value resulting from a combination of the result for the name comparisons and the result for the gender comparisons, the second dimension indexed by the result of the edit distance comparisons between the date of birth attributes, the third dimension indexed by the difference between the birth years of the two data records, and the fourth dimension indexed by the result of the edit distance comparisons between the SSNs of the two data records. For example, a certain false positive penalty corresponding to partial agreement on the name attribute between the two records, agreement of the gender attribute, an edit distance of one between the two date of birth attributes, a difference of 20 or more on the year of birth, and an edit distance of 3 between the two SSNs of the data records may correspond to entry (6, 2, 5, 4) in the table.
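By way of non-limiting illustration, a four dimensional table of this kind may be sketched as follows. The description does not give the encoding that yields entry (6, 2, 5, 4); the name and gender codes, the way they are combined, the offset of one applied to edit distances (reserving index 0 for missing data), the cap on the birth year index and the dimension sizes are all hypothetical choices made only so that the example reproduces that entry.

```python
from typing import Optional, Tuple

# Hypothetical result codes (0 reserved for "missing").
NAME_CODES = {"missing": 0, "equal": 1, "partial": 2, "different": 3}
GENDER_CODES = {"missing": 0, "agree": 1, "disagree": 2}

# Assumed sizes: name/gender combinations, date of birth edit distance,
# birth year difference, SSN edit distance.
DIMS = (12, 10, 6, 11)


def make_table(dims: Tuple[int, int, int, int], fill: float = 0.0) -> list:
    """Build a dense four dimensional table of penalties as nested lists."""
    d0, d1, d2, d3 = dims
    return [[[[fill for _ in range(d3)] for _ in range(d2)]
             for _ in range(d1)] for _ in range(d0)]


def distance_index(distance: Optional[int], size: int) -> int:
    """Index 0 means missing; otherwise distance + 1, capped to the dimension."""
    if distance is None:
        return 0
    return min(distance + 1, size - 1)


def year_index(year_diff: Optional[int]) -> int:
    """Index 0 means missing; 1 = 0 to 4 years, ..., 5 = 20 years or more."""
    if year_diff is None:
        return 0
    return min(year_diff // 5 + 1, DIMS[2] - 1)


def table_index(name: str, gender: str, dob_distance: Optional[int],
                year_diff: Optional[int],
                ssn_distance: Optional[int]) -> Tuple[int, int, int, int]:
    """Map the comparison results to a four dimensional table entry."""
    return (
        GENDER_CODES[gender] * len(NAME_CODES) + NAME_CODES[name],
        distance_index(dob_distance, DIMS[1]),
        year_index(year_diff),
        distance_index(ssn_distance, DIMS[3]),
    )


def lookup_penalty(table: list, index: Tuple[int, int, int, int]) -> float:
    """Read the penalty stored at a four dimensional table entry."""
    i, j, k, m = index
    return table[i][j][k][m]


# Partial name match, gender agreement, date of birth edit distance of 1,
# birth year difference of 25, SSN edit distance of 3.
example_index = table_index("partial", "agree", 1, 25, 3)
example_table = make_table(DIMS)
example_table[6][2][5][4] = 2.5  # illustrative penalty value
```

Under these assumed encodings `example_index` evaluates to `(6, 2, 5, 4)`, so `lookup_penalty(example_table, example_index)` returns the illustrative value stored at that entry.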
Returning to
It may be helpful here to depict an example of various permutations of the results of the comparisons of various attributes which may cause a false positive penalty to be applied.
It will be noted, in conjunction with the above discussions, that each of these values for the results of the comparisons of the various attributes may have a numerical value associated with it, and these values may be used to index a table comprising values for the false positive penalty to be imposed for any permutation of values from the results of the comparisons of attributes. Thus, for the above example, the values associated with each of the permutations listed in
In the foregoing specification, the invention has been described with reference to specific embodiments. However, one of ordinary skill in the art appreciates that various modifications and changes can be made without departing from the scope of the invention as set forth in the claims below. Accordingly, the specification and figures are to be regarded in an illustrative rather than a restrictive sense, and all such modifications are intended to be included within the scope of the invention.
Benefits, other advantages, and solutions to problems have been described above with regard to specific embodiments. However, the benefits, advantages, solutions to problems, and any component(s) that may cause any benefit, advantage, or solution to occur or become more pronounced are not to be construed as a critical, required, or essential feature or component of any or all the claims.
Schumacher, Scott, Adams, Norm, Ellard, Scott
Executed on | Assignor | Assignee | Conveyance | Frame | Reel | Doc |
Oct 23 2006 | ELLARD, SCOTT | INITIATE SYSTEMS, INC | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 023941 | /0709 | |
Oct 24 2006 | SCHUMACHER, SCOTT | INITIATE SYSTEMS, INC | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 023941 | /0709 | |
Oct 27 2006 | ADAMS, NORM | INITIATE SYSTEMS, INC | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 023941 | /0709 | |
Jan 14 2010 | International Business Machines Corporation | (assignment on the face of the patent) | / | |||
Oct 20 2010 | INITIATE SYSTEMS, INC | International Business Machines Corporation | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 025373 | /0013 |
Date | Maintenance Fee Events |
Apr 18 2017 | M1551: Payment of Maintenance Fee, 4th Year, Large Entity. |
Apr 16 2021 | M1552: Payment of Maintenance Fee, 8th Year, Large Entity. |
Date | Maintenance Schedule |
Nov 19 2016 | 4 years fee payment window open |
May 19 2017 | 6 months grace period start (w surcharge) |
Nov 19 2017 | patent expiry (for year 4) |
Nov 19 2019 | 2 years to revive unintentionally abandoned end. (for year 4) |
Nov 19 2020 | 8 years fee payment window open |
May 19 2021 | 6 months grace period start (w surcharge) |
Nov 19 2021 | patent expiry (for year 8) |
Nov 19 2023 | 2 years to revive unintentionally abandoned end. (for year 8) |
Nov 19 2024 | 12 years fee payment window open |
May 19 2025 | 6 months grace period start (w surcharge) |
Nov 19 2025 | patent expiry (for year 12) |
Nov 19 2027 | 2 years to revive unintentionally abandoned end. (for year 12) |