Methods and systems for maintaining the internal consistency of a fact repository are described. Accessed objects are checked for attribute-value pairs that have links to other objects. For any link to an object, the name of the linked-to object is inserted into the attribute-value pair having the link. The accessed objects are filtered to remove attribute-value pairs meeting predefined criteria, possibly resulting in null objects. Links to null objects are identified and removed.
|
10. A system for improving internal consistency of a database, comprising:
one or more processors;
memory;
means for accessing a first set of objects in the database, each object including one or more attribute-value pairs, wherein at least a subset of the values in the one or more attribute-value pairs includes respective links to other objects in the database;
means for filtering the database by identifying attribute-value pairs of the first set of objects in the database that meet predefined criteria and removing the identified attribute-value pairs from the first set of objects in the database, wherein objects in the database meeting a null information criterion after the filtering comprise null objects;
means for identifying, after the filtering, a second set of objects that include attribute-value pairs having links to null objects; and
means for removing from the second set of objects the attribute-value pairs having links to the null objects.
11. A computer-implemented method of improving internal consistency of a database, comprising:
at a system having one or more processors and memory storing one or more modules to be executed by the one or more processors:
accessing a first set of objects in the database, each object including one or more attribute-value pairs, wherein at least a subset of the values in the one or more attribute-value pairs includes respective links to other objects in the database, wherein the other objects include one or more null objects having null attribute-value pairs;
filtering the database by identifying attribute-value pairs of the first set of objects in the database that meet predefined criteria and removing the identified attribute-value pairs from the first set of objects in the database;
identifying a second set of the attribute-value pairs of the first set of objects including links to the null objects; and
removing from the second set of objects the attribute-value pairs having links to the null objects.
7. A non-transitory computer readable storage medium storing one or more programs for execution by one or more processors in a computer system, the one or more programs comprising instructions for:
accessing a first set of objects in the database, each object including one or more attribute-value pairs, wherein at least a subset of the values in the one or more attribute-value pairs includes respective links to other objects in the database;
filtering the database by identifying attribute-value pairs of the first set of objects in the database that meet predefined criteria and removing the identified attribute-value pairs from the first set of objects in the database, wherein objects in the database meeting a null information criterion after the filtering comprise null objects;
identifying, after the filtering, a second set of objects that include attribute-value pairs having links to the null objects; and
removing from the second set of objects the attribute-value pairs having links to the null objects.
1. A computer-implemented method of improving internal consistency of a database, comprising:
at a system having one or more processors and memory storing one or more modules to be executed by the one or more processors:
accessing a first set of objects in the database, each object including one or more attribute-value pairs, wherein at least a subset of the values in the one or more attribute-value pairs includes respective links to other objects in the database;
filtering the database by identifying attribute-value pairs of the first set of objects in the database that meet predefined criteria and removing the identified attribute-value pairs from the first set of objects in the database, wherein objects in the database meeting a null information criterion after the filtering comprise null objects;
after the filtering, identifying a second set of objects that include attribute-value pairs having links to the null objects; and
removing from the second set of objects the attribute-value pairs having links to the null objects.
4. A system for improving internal consistency of a database, comprising:
one or more processors; and
memory storing one or more modules to be executed by the one or more processors;
the one or more modules including instructions:
to access a first set of objects in the database, each object including one or more attribute-value pairs, wherein at least a subset of the values in the one or more attribute-value pairs includes respective links to other objects in the database;
to filter the database by identifying attribute-value pairs of the first set of objects in the database that meet predefined criteria and removing the identified attribute-value pairs from the first set of objects in the database, wherein objects in the database meeting a null information criterion after the filtering comprise null objects;
to identify, after the filtering, a second set of objects that include attribute-value pairs having links to the null objects; and
to remove from the second set of objects the attribute-value pairs having links to the null objects.
2. The computer-implemented method of
3. The computer-implemented method of
5. The system of
6. The system of
8. The non-transitory computer readable storage medium of
9. The non-transitory computer readable storage medium of
12. The computer-implemented method of
13. The computer-implemented method of
14. The method of
15. The method of
16. The method of
17. The system of
18. The non-transitory computer readable storage medium of
19. The system of
20. The method of
|
This application is related to the following applications, each of which is hereby incorporated by reference:
U.S. patent application Ser. No. 11/097,688, “Corroborating Facts Extracted from Multiple Sources,” filed on Mar. 31, 2005;
U.S. patent application Ser. No. 11/097,690, “Selecting the Best Answer to a Fact Query from Among a Set of Potential Answers,” filed on Mar. 31, 2005;
U.S. patent application Ser. No. 11/097,689, “User Interface for Facts Query Engine with Snippets from Information Sources that Include Query Terms and Answer Terms,” filed on Mar. 31, 2005;
U.S. patent application Ser. No. 11/142,853, “Learning Facts from Semi-Structured Text,” filed on May 31, 2005;
U.S. patent application Ser. No. 11/142,740, “Merging Objects in a Facts Database,” filed on May 31, 2005; and
U.S. patent application Ser. No. 11/142,765, “Identifying the Unifying Subject of a Set of Facts,” filed on May 31, 2005.
The disclosed embodiments relate generally to fact databases. More particularly, the disclosed embodiments relate to methods and systems for maintaining the internal consistency of a fact database.
The World Wide Web (also known as the “Web”) and the web pages within the Web are a vast source of factual information. Users may look to web pages to get answers to factual questions, such as “what is the capital of Poland” or “what is the birth date of George Washington.” The factual information included in web pages may be extracted and stored in a fact database.
A fact database may, at times, become internally inconsistent. When a fact database is populated with data, there may be gaps that the database building module lacks the data to fill. When fact database maintenance operations are performed, data may be modified or removed, resulting in possible data inconsistencies. These internal inconsistencies may diminish the quality of the fact database.
According to an aspect of the invention, a method of improving internal consistency of a database includes accessing a set of objects in the database (e.g., a fact repository), each object including one or more attribute-value pairs, wherein at least a subset of the values in the attribute-value pairs includes respective links to other objects; filtering the attribute-value pairs of the objects to remove attribute-value pairs meeting predefined criteria, wherein objects meeting a null information criterion after the filtering comprise null objects; identifying attribute-value pairs of the objects including links to null objects; and removing the links to null objects.
Like reference numerals refer to corresponding parts throughout the drawings.
Within a fact repository organized based on objects (representing entities and concepts) and facts associated with objects, a fact may reference another object. In other words, facts may serve as connections between objects. For example, the fact that Tokyo is the capital of Japan connects objects representing Tokyo and Japan. The reference to the other object may include a link to the other object (such as an object identifier) and a name of the other object. However, the name of the other object may be missing or incorrect, even though the link may be correct. Furthermore, the other object may be “removed” from the fact database during fact database maintenance operations, resulting in a dangling link. Both missing or incorrect object names and dangling links represent internal inconsistencies in the fact database. The internal inconsistencies may be remedied by inserting the name of an object into facts that link to the object and removing dangling links.
The document hosts 102 store documents and provide access to documents. A document may be any machine-readable data including any combination of text, graphics, multimedia content, etc. In some embodiments, a document may be a combination of text, graphics and possibly other forms of information written in the Hypertext Markup Language (HTML), i.e., a web page. A document may include one or more hyperlinks to other documents. A document may include one or more facts within its contents. A document stored in a document host 102 may be located and/or identified by a Uniform Resource Locator (URL), or Web address, or any other appropriate form of identification and/or location.
The fact repository engine 106 includes an importer 108, a repository manager 110, a fact index 112, and a fact repository 114. The importer 108 extracts factual information from documents stored on document hosts 102. The importer 108 analyzes the contents of the documents stored in document host 102, determines if the contents include factual information and the subject or subjects with which the factual information is associated, and extracts any available factual information within the contents.
The repository manager 110 processes facts extracted by the importer 108. The repository manager 110 builds and manages the fact repository 114 and the fact index 112. The repository manager 110 receives facts extracted by the importer 108 and stores them in the fact repository 114. The repository manager 110 may also perform operations on facts in the fact repository 114 to “clean up” the data within the fact repository 114. For example, the repository manager 110 may look through the fact repository 114 to find duplicate facts (that is, facts that convey the exact same factual information) and merge them. The repository manager 110 may also normalize facts into standard formats. The repository manager 110 may also remove unwanted facts from the fact repository 114, such as facts related to pornographic content.
The fact repository 114 stores factual information extracted from a plurality of documents that are located on the document hosts 102. In other words, the fact repository 114 is a database of factual information. A document from which a particular fact may be extracted is a source document (or “source”) of that particular fact. In other words, a source of a fact includes that fact within its contents. Source documents may include, without limitation, Web pages. Within the fact repository 114, entities, concepts, and the like for which the fact repository 114 may have factual information stored are represented by objects. An object may have one or more facts associated with it. Within each object, each fact associated with the object is stored as an attribute-value pair. Each fact also includes a list of source documents that include the fact within their contents and from which the fact was extracted. Further details about objects and facts in the fact repository are described below, in relation to
The fact index 112 provides an index to the fact repository 114 and facilitates efficient lookup of information in the fact repository 114. The fact index 112 may index the fact repository 114 based on one or more parameters. For example, the fact index 112 may have an index that maps unique terms (e.g., words, numbers and the like) to records or locations within the fact repository 114. More specifically, the fact index 112 may include entries mapping every term in every object name, fact attribute and fact value of the fact repository to records or locations within the fact repository.
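By way of illustration only, a term-to-record index of the kind described above may be sketched as follows. The record layout (a tuple of object name, attribute and value keyed by a record identifier), the function name `build_term_index`, and the simple word tokenizer are assumptions made for this sketch and are not taken from the described embodiments.

```python
import re
from collections import defaultdict


def build_term_index(records):
    """Map each lowercase term to the set of record IDs that contain it.

    `records` is a hypothetical dict: record_id -> (object_name, attribute, value).
    """
    index = defaultdict(set)
    for record_id, fields in records.items():
        for text in fields:
            # Split every field into word-like terms (assumed tokenization).
            for term in re.findall(r"\w+", text.lower()):
                index[term].add(record_id)
    return index


records = {
    "f1": ("Japan", "capital", "Tokyo"),
    "f2": ("Poland", "capital", "Warsaw"),
}
index = build_term_index(records)
```

A lookup such as `index["capital"]` then yields the identifiers of every record whose object name, attribute or value contains that term.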
It should be appreciated that each of the components of the fact repository engine 106 may be distributed over multiple computers. For example, the fact repository 114 may be deployed over N servers, with a mapping function such as the “modulo N” function being used to determine which facts are stored in each of the N servers. Similarly, the fact index 112 may be distributed over multiple servers, and the importer 108 and repository manager 110 may each be distributed over multiple computers. However, for convenience of explanation, we will discuss the components of the fact repository engine 106 as though they were implemented on a single computer.
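As a minimal sketch of such a “modulo N” mapping, a fact may be assigned to a server by hashing an identifier of the fact and reducing the result modulo the number of servers. The choice of the fact identifier as the key, the use of CRC-32 as the hash, and the name `shard_for` are assumptions for illustration; the description above does not specify them.

```python
import zlib


def shard_for(fact_id: str, num_servers: int) -> int:
    """Return the index of the server (0..N-1) assumed to hold this fact."""
    # CRC-32 is stable across processes, unlike Python's built-in hash() on str.
    return zlib.crc32(fact_id.encode("utf-8")) % num_servers


server = shard_for("fact-12345", 4)
```

Because the mapping is deterministic, any component that knows N can compute where a given fact resides without consulting a directory.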
Each fact 204 also may include one or more metrics 218. The metrics may provide indications of the quality of the fact. In some embodiments, the metrics include a confidence level and an importance level. The confidence level indicates the likelihood that the fact is correct. The importance level indicates the relevance of the fact to the object, compared to other facts for the same object. The importance level may optionally be viewed as a measure of how vital a fact is to an understanding of the entity or concept represented by the object.
Each fact 204 includes a list of sources 220 that include the fact and from which the fact was extracted. Each source may be identified by a Uniform Resource Locator (URL), or Web address, or any other appropriate form of identification and/or location, such as a unique document identifier.
In some embodiments, some facts may include an agent field 222 that identifies the module that extracted the fact. For example, the agent may be a specialized module that extracts facts from a specific source (e.g., the pages of a particular web site, or family of web sites) or type of source (e.g., web pages that present factual information in tabular form), or a module that extracts facts from free text in documents throughout the Web, and so forth.
In some embodiments, an object 200 may have one or more specialized facts, such as a name fact 206 and a property fact 208. A name fact 206 is a fact that conveys a name for the entity or concept represented by the object 200. For example, for an object representing the country Spain, there may be a fact conveying the name of the object as “Spain.” A name fact 206, being a special instance of a general fact 204, includes the same parameters as any other fact 204; it has an attribute, a value, a fact ID, metrics, sources, etc. The attribute 224 of a name fact 206 indicates that the fact is a name fact, and the value is the actual name. The name may be a string of characters. An object 200 may have one or more name facts, as many entities or concepts can have more than one name. For example, an object representing Spain may have name facts conveying the country's common name “Spain” and the official name “Kingdom of Spain.” As another example, an object representing the U.S. Patent and Trademark Office may have name facts conveying the agency's acronyms “PTO” and “USPTO” and the official name “United States Patent and Trademark Office.” If an object does have more than one name fact, one of the name facts may be designated as a primary name and other name facts may be designated as secondary names.
A property fact 208 is a fact that conveys a statement about the entity or concept represented by the object 200 that may be of interest. For example, for the object representing Spain, a property fact may convey that Spain is a country in Europe. A property fact 208, being a special instance of a general fact 204, also includes the same parameters (such as attribute, value, fact ID, etc.) as other facts 204. The attribute field 226 of a property fact 208 indicates that the fact is a property fact, and the value field is a string of text that conveys the statement of interest. For example, for the object representing Spain, the value of a property fact may be the text string “is a country in Europe.” Some objects 200 may have one or more property facts while other objects may have no property facts.
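One possible in-memory representation of the objects and facts described above is sketched below. The class and field names, the separate `link` field, and the `primary` flag are hypothetical choices for this example; other embodiments may, for instance, store a link inside the value field.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Fact:
    fact_id: str
    attribute: str                      # e.g. "name", "property", "capital"
    value: str
    link: Optional[str] = None          # object ID of a linked-to object, if any
    confidence: float = 0.0             # metric: likelihood the fact is correct
    importance: float = 0.0             # metric: relevance of the fact to the object
    sources: List[str] = field(default_factory=list)  # e.g. URLs
    agent: Optional[str] = None         # module that extracted the fact
    primary: bool = False               # meaningful for name facts only


@dataclass
class FactObject:
    object_id: str
    facts: List[Fact] = field(default_factory=list)

    def names(self) -> List[str]:
        return [f.value for f in self.facts if f.attribute == "name"]

    def primary_name(self) -> Optional[str]:
        name_facts = [f for f in self.facts if f.attribute == "name"]
        for f in name_facts:
            if f.primary:
                return f.value
        # Assumed fallback: the first name fact when none is marked primary.
        return name_facts[0].value if name_facts else None


spain = FactObject("obj-spain", [
    Fact("f1", "name", "Spain", primary=True),
    Fact("f2", "name", "Kingdom of Spain"),
    Fact("f3", "property", "is a country in Europe"),
])
```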
It should be appreciated that the data structure illustrated in
An object is a collection of facts. An object may become a null or empty object when facts are removed from the object. In some embodiments, a null object is an object that has had all of its facts (including name facts) removed, leaving the object with only its object ID. In some other embodiments, a null object is an object that has had all of its facts other than name facts removed, leaving the object with its object ID and name facts. In still other embodiments, where an object has names in special records that have a different format from general facts, the object is a null object only if all of its associated facts, not including the special records for its names, are removed. Alternatively, the object may be a null object only if all of its facts and the special records for its names are removed. A null object represents an entity or concept for which the fact repository engine 106 has no factual information and which, as far as the fact repository engine 106 is concerned, does not exist. In some embodiments, a null object may be left in the fact repository 114. However, the null object is treated as if it were removed from the fact repository 114. In some other embodiments, null objects are removed from the fact repository 114.
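The two principal null information criteria described above may be expressed, purely as a sketch, by a single predicate with a switch. Facts are represented here as (attribute, value) tuples, and the parameter name `count_name_facts` is an assumption introduced for this example.

```python
def is_null(facts, count_name_facts=True):
    """Decide whether an object with the given facts is a null object.

    facts: list of (attribute, value) tuples remaining on the object.
    count_name_facts=True:  null only when every fact, names included, is gone.
    count_name_facts=False: null when nothing other than name facts remains.
    """
    if count_name_facts:
        return len(facts) == 0
    return all(attribute == "name" for attribute, _ in facts)


only_name = [("name", "Spain")]
name_and_property = [("name", "Spain"), ("property", "is a country in Europe")]
```

Under the first criterion an object holding only the name “Spain” is not null; under the second criterion the same object is null.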
A set of objects, stored in the fact repository 114, is accessed (302). The set of objects accessed may be the entirety of objects that are stored in the fact repository 114, or the set of accessed objects may be a subset of the entirety of objects stored in the fact repository 114. At least some of the accessed objects include one or more facts. In some embodiments, a fact, as described above in relation to
One of the objects in the set is selected (304). If the object does not include an A-V pair that includes a link to another object (306—no), nothing is done to that object. If the object includes one or more A-V pairs that include links to other objects (306—yes), the name of the respective linked-to object is inserted into the respective value of each A-V pair with a link to the linked-to object (308). If the value of an A-V pair already has a name for the linked-to object, in some embodiments that name is replaced by the inserted name, regardless of whether the pre-existing name is the same as the inserted name. In some other embodiments, the pre-existing name in an A-V pair is first compared against the name of the linked-to object, and the pre-existing name is replaced with the name of the linked-to object only if the names do not match. In embodiments where a fact stores a link in the value field, the link is not replaced by the name when the name is inserted. Rather, the name and the link are concatenated and the concatenated string is stored in the value field.
If there are objects in the set remaining to be selected (310—no), another object is selected (304). Otherwise (310—yes), the process ends (312). The process may be repeated at scheduled intervals, or as needed.
In some embodiments, after operation 302, an optional table of object identifiers and object names may be built. The table maps object identifiers to their corresponding object names (in some embodiments, the primary names). The table may be loaded into memory. When inserting names into values, as described above, the fact repository engine 106 may refer to the table rather than searching for the object identifier in the fact repository itself. This may help make the dereferencing process more efficient.
As described above, an object may have more than one name. If the linked-to object has more than one name and one of them is designated the primary name, then the primary name is the one that is inserted into the value.
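The name insertion described above, including the optional identifier-to-name table and the preference for a primary name, may be sketched as follows. Objects are modeled as a dict from object ID to a list of fact dicts; the keys `link`, `link_name` and `primary` are hypothetical, and the sketch follows the embodiment in which a pre-existing name is replaced only when it differs from the name of the linked-to object.

```python
def build_name_table(objects):
    """Map each object ID to its primary name (or first name if none is primary)."""
    table = {}
    for object_id, facts in objects.items():
        name_facts = [f for f in facts if f["attribute"] == "name"]
        if not name_facts:
            continue
        chosen = next((f for f in name_facts if f.get("primary")), name_facts[0])
        table[object_id] = chosen["value"]
    return table


def insert_names(objects):
    """Write the linked-to object's name into every A-V pair that has a link."""
    table = build_name_table(objects)
    for facts in objects.values():
        for fact in facts:
            target = fact.get("link")
            # Replace only when a name is known and differs from the stored one.
            if target in table and fact.get("link_name") != table[target]:
                fact["link_name"] = table[target]


objects = {
    "o1": [
        {"attribute": "name", "value": "Japan", "primary": True},
        {"attribute": "capital", "link": "o2", "link_name": "Tokio"},
    ],
    "o2": [
        {"attribute": "name", "value": "Tokyo Metropolis"},
        {"attribute": "name", "value": "Tokyo", "primary": True},
    ],
}
insert_names(objects)
```

Building the table once per pass avoids a repository lookup for every link, which is the efficiency the table is intended to provide.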
One or more filters are applied to the A-V pairs of the objects and A-V pairs meeting predefined criteria are removed (404). The filters identify A-V pairs that meet predefined criteria and remove them. The predefined criteria may be defined based on the information conveyed by the A-V pair. For example, one predefined criterion for removal may be that an A-V pair is to be removed if it conveys a fact associated with pornography. In some embodiments, the predefined criteria may be implemented using heuristics and/or blacklists. The filters would apply the heuristics to the A-V pairs or compare the A-V pairs against a blacklist to determine which A-V pairs warrant removal. After the filtering, some objects may become null objects due to the removal of A-V pairs from the object.
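A blacklist-based filter of the kind described above may be sketched as follows. The substring match against the attribute and value text, the function name `filter_objects`, and the criterion that an object with no remaining A-V pairs is a null object are assumptions for this example; actual predefined criteria may rely on other heuristics.

```python
def filter_objects(objects, blacklist):
    """Remove A-V pairs containing a blacklisted term; return IDs of null objects.

    objects: dict of object_id -> list of (attribute, value) tuples (modified in place).
    blacklist: iterable of lowercase terms.
    """
    null_ids = set()
    for object_id, pairs in list(objects.items()):
        kept = [
            (attribute, value)
            for attribute, value in pairs
            if not any(term in f"{attribute} {value}".lower() for term in blacklist)
        ]
        objects[object_id] = kept
        if not kept:
            # Assumed null information criterion: no A-V pairs remain.
            null_ids.add(object_id)
    return null_ids


objects = {
    "o1": [("name", "Spam Corp"), ("slogan", "buy spam")],
    "o2": [("name", "Japan")],
}
null_ids = filter_objects(objects, {"spam"})
```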
One of the objects in the set is selected (406). If the object does not include an A-V pair that includes a link to another object or if all links in the A-V pairs of the object are to non-null objects (408—no), nothing is done to that object. If the object includes one or more A-V pairs that include links to null objects (408—yes), the links to null objects are removed (410).
In some embodiments, the removal of a link to a null object is performed by removing the identifier of the null object from the value of the A-V pair having the link to the null object. This link removal method is used only if there is already a name of the null object in the value of the A-V pair. In some other embodiments, the link is removed by removing the A-V pair (i.e., removing the fact) from the object. In further other embodiments, both manners of removal may be used; which one is used may depend on the circumstances with regard to how the linked-to object became a null object. For example, if the linked-to object became a null object because its associated A-V pairs were removed due to their meeting a first predefined criterion, then the manner of removal to be used may be removal of the A-V pair. On the other hand, if the associated A-V pairs were removed due to their meeting a second predefined criterion, then the manner of removal to be used may be removal of the identifier of the null object, leaving the name of the linked-to object in the value of the A-V pair.
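The two manners of link removal described above may be sketched with a single function and a flag. The fact layout (dicts with hypothetical `link` and `link_name` keys) and the flag name `drop_pair` are assumptions. The sketch also assumes that, when only the identifier is to be removed but no name is present in the value, the whole A-V pair is removed instead; the description above does not prescribe that fallback.

```python
def remove_null_links(objects, null_ids, drop_pair=False):
    """Remove links that point at null objects.

    drop_pair=False: strip the identifier but keep the stored name.
    drop_pair=True:  remove the entire A-V pair (i.e., the fact).
    """
    for object_id, facts in list(objects.items()):
        kept = []
        for fact in facts:
            if fact.get("link") in null_ids:
                if drop_pair or not fact.get("link_name"):
                    continue  # remove the whole A-V pair
                fact = {**fact, "link": None}  # keep the name, drop the identifier
            kept.append(fact)
        objects[object_id] = kept


def sample():
    return {
        "o1": [
            {"attribute": "capital", "link": "o2", "link_name": "Tokyo"},
            {"attribute": "neighbor", "link": "o3"},
        ],
        "o2": [],
        "o3": [],
    }


keep_names = sample()
remove_null_links(keep_names, {"o2", "o3"})

drop_pairs = sample()
remove_null_links(drop_pairs, {"o2", "o3"}, drop_pair=True)
```

Selecting `drop_pair` per link, based on which predefined criterion caused the linked-to object to become null, would correspond to the embodiments in which both manners of removal are used.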
If there are objects in the set remaining to be selected (412—no), another object is selected (406). Otherwise (412—yes), the process ends (414). In the embodiments where a manner of removal of the links to null objects includes removing the A-V pair with the link, instead of ending at 414, operations 406-412 may be repeated for additional iterations because the removal of the A-V pairs may create new null objects. While operations 406-412 may be repeated indefinitely, in some embodiments, a predefined limit on the number of additional iterations may be set. For example, after the first iteration, a limit may be set such that only one additional iteration is performed. It should be appreciated, however, that the process as illustrated in
The system 500 also includes a fact storage system 530 for storing facts. As described above, in some embodiments each fact stored in the fact storage system 530 includes a corresponding list of sources from which the respective fact was extracted.
Each of the above identified elements may be stored in one or more of the previously mentioned memory devices, and corresponds to a set of instructions for performing a function described above. The above identified modules or programs (i.e., sets of instructions) need not be implemented as separate software programs, procedures or modules, and thus various subsets of these modules may be combined or otherwise re-arranged in various embodiments. In some embodiments, memory 512 may store a subset of the modules and data structures identified above. Furthermore, memory 512 may store additional modules and data structures not described above.
Although
The foregoing description, for purpose of explanation, has been described with reference to specific embodiments. However, the illustrative discussions above are not intended to be exhaustive or to limit the invention to the precise forms disclosed. Many modifications and variations are possible in view of the above teachings. The embodiments were chosen and described in order to best explain the principles of the invention and its practical applications, to thereby enable others skilled in the art to best utilize the invention and various embodiments with various modifications as are suited to the particular use contemplated.
Hogue, Andrew William, Betz, Jonathan T., Siemborski, Robert Joseph
Patent | Priority | Assignee | Title |
10019431, | May 02 2014 | WELLS FARGO BANK, N A | Systems and methods for active column filtering |
11243987, | Jun 16 2016 | Microsoft Technology Licensing, LLC | Efficient merging and filtering of high-volume metrics |
Patent | Priority | Assignee | Title |
5010478, | Apr 11 1986 | Entity-attribute value database system with inverse attribute for selectively relating two different entities | |
5133075, | Dec 19 1988 | Hewlett-Packard Company | Method of monitoring changes in attribute values of object in an object-oriented database |
5347653, | Jun 28 1991 | HEWLETT-PACKARD DEVELOPMENT COMPANY, L P | System for reconstructing prior versions of indexes using records indicating changes between successive versions of the indexes |
5440730, | Aug 09 1990 | TTI Inventions C LLC | Time index access structure for temporal databases having concurrent multiple versions |
5475819, | Oct 02 1990 | HEWLETT-PACKARD DEVELOPMENT COMPANY, L P | Distributed configuration profile for computing system |
5519608, | Jun 24 1993 | Xerox Corporation | Method for extracting from a text corpus answers to questions stated in natural language by using linguistic analysis and hypothesis generation |
5546507, | Aug 20 1993 | Unisys Corporation | Apparatus and method for generating a knowledge base |
5560005, | Feb 25 1994 | WebMD Corporation | Methods and systems for object-based relational distributed databases |
5574898, | Jan 08 1993 | IBM Corporation | Dynamic software version auditor which monitors a process to provide a list of objects that are accessed |
5675785, | Oct 04 1994 | CA, INC | Data warehouse which is accessed by a user using a schema of virtual tables |
5680622, | Jun 30 1994 | CODEGEAR LLC | System and methods for quickly detecting shareability of symbol and type information in header files |
5694590, | Sep 27 1991 | Green Wireless LLC | Apparatus and method for the detection of security violations in multilevel secure databases |
5701470, | Dec 08 1995 | Oracle America, Inc | System and method for space efficient object locking using a data subarray and pointers |
5717911, | Jan 23 1995 | HEWLETT-PACKARD DEVELOPMENT COMPANY, L P | Relational database system and method with high availability compliation of SQL programs |
5717951, | Aug 07 1995 | Method for storing and retrieving information on a magnetic storage medium via data blocks of variable sizes | |
5724571, | Jul 07 1995 | Oracle America, Inc | Method and apparatus for generating query responses in a computer-based document retrieval system |
5778373, | Jul 15 1996 | AT&T Corp | Integration of an information server database schema by generating a translation map from exemplary files |
5778378, | Apr 30 1996 | MICHIGAN, REGENTS OF THE UNIVERSITY OF, THE | Object oriented information retrieval framework mechanism |
5787413, | Jul 29 1996 | International Business Machines Corporation | C++ classes for a digital library |
5793966, | Dec 01 1995 | Microsoft Technology Licensing, LLC | Computer system and computer-implemented process for creation and maintenance of online services |
5802299, | Feb 13 1996 | 3M Innovative Properties Company | Interactive system for authoring hypertext document collections |
5815415, | Jan 19 1996 | Bentley Systems, Incorporated | Computer system for portable persistent modeling |
5819210, | Jun 21 1996 | Xerox Corporation | Method of lazy contexted copying during unification |
5819265, | Jul 12 1996 | International Business Machines Corporation; IBM Corporation | Processing names in a text |
5822743, | Apr 08 1997 | CASEBANK SUPPORT SYSTEMS INC | Knowledge-based information retrieval system |
5826258, | Oct 02 1996 | Amazon Technologies, Inc | Method and apparatus for structuring the querying and interpretation of semistructured information |
5838979, | Oct 31 1995 | Rocket Software, Inc | Process and tool for scalable automated data field replacement |
5909689, | Sep 18 1997 | Sony Corporation; Sony Electronics, Inc. | Automatic update of file versions for files shared by several computers which record in respective file directories temporal information for indicating when the files have been created |
5920859, | Feb 05 1997 | Fidelity Information Services, LLC | Hypertext document retrieval system and method |
5943670, | Nov 21 1997 | International Business Machines Corporation; IBM Corporation | System and method for categorizing objects in combined categories |
5956718, | Dec 15 1994 | Oracle International Corporation | Method and apparatus for moving subtrees in a distributed network directory |
5974254, | Jun 06 1997 | National Instruments Corporation | Method for detecting differences between graphical programs |
5987460, | Jul 05 1996 | Hitachi, Ltd. | Document retrieval-assisting method and system for the same and document retrieval service using the same with document frequency and term frequency |
6006221, | Aug 16 1995 | Syracuse University | Multilingual document retrieval system and method using semantic vector matching |
6018741, | Oct 22 1997 | International Business Machines Corporation | Method and system for managing objects in a dynamic inheritance tree |
6038560, | May 21 1997 | Oracle International Corporation | Concept knowledge base search and retrieval system |
6044366, | Mar 16 1998 | Microsoft Technology Licensing, LLC | Use of the UNPIVOT relational operator in the efficient gathering of sufficient statistics for data mining |
6052693, | Jul 02 1996 | DELVE SOFTWARE LIMITED | System for assembling large databases through information extracted from text sources |
6064952, | Nov 18 1994 | MATSUSHITA ELECTRIC INDUSTRIAL CO , LTD | Information abstracting method, information abstracting apparatus, and weighting method |
6073130, | Sep 23 1997 | GOOGLE LLC | Method for improving the results of a search in a structured database |
6078918, | Apr 02 1998 | Conversant, LLC | Online predictive memory |
6112203, | Apr 09 1998 | R2 SOLUTIONS LLC | Method for ranking documents in a hyperlinked environment using connectivity and selective content analysis |
6112210, | Oct 31 1997 | Oracle International Corporation | Apparatus and method for null representation in database object storage |
6122647, | May 19 1998 | AT HOME BONDHOLDERS LIQUIDATING TRUST | Dynamic generation of contextual links in hypertext documents |
6134555, | Mar 10 1997 | International Business Machines Corporation | Dimension reduction using association rules for data mining application |
6138270, | Jun 06 1997 | National Instruments Corporation | System, method and memory medium for detecting differences between graphical programs |
6182063, | Jul 07 1995 | Oracle America, Inc | Method and apparatus for cascaded indexing and retrieval |
6202065, | Jul 02 1997 | TVL LP | Information search and retrieval with geographical coordinates |
6212526, | Dec 02 1997 | Microsoft Technology Licensing, LLC | Method and apparatus for efficient mining of classification models from databases |
6240546, | Jul 24 1998 | International Business Machines Corporation | Identifying date fields for runtime year 2000 system solution process, method and article of manufacture |
6263328, | Apr 09 1999 | International Business Machines Corporation | Object oriented query model and process for complex heterogeneous database queries |
6263358, | Jul 25 1997 | British Telecommunications public limited company | Scheduler for a software system having means for allocating tasks |
6266805, | Jul 25 1997 | British Telecommunications plc | Visualization in a modular software system |
6285999, | Jan 10 1997 | GOOGLE LLC | Method for node ranking in a linked database |
6289338, | Dec 15 1997 | Manning And Napier Information Services LLC | Database analysis using a probabilistic ontology |
6311194, | Mar 15 2000 | ALTO DYNAMICS, LLC | System and method for creating a semantic web and its applications in browsing, searching, profiling, personalization and advertising |
6314555, | Jul 25 1997 | British Telecommunications public limited company | Software system generation |
6327574, | Jul 07 1998 | CALLAHAN CELLULAR L L C | Hierarchical models of consumer attributes for targeting content in a privacy-preserving manner |
6349275, | Nov 24 1997 | Nuance Communications, Inc | Multiple concurrent language support system for electronic catalogue using a concept based knowledge representation |
6377943, | Jan 20 1999 | ORACLE INTERNATIONAL CORPORATION OIC | Initial ordering of tables for database queries |
6397228, | Mar 31 1999 | GOOGLE LLC | Data enhancement techniques |
6438543, | Jun 17 1999 | International Business Machines Corporation | System and method for cross-document coreference |
6470330, | Nov 05 1998 | SYBASE, INC | Database system with methods for estimation and usage of index page cluster ratio (IPCR) and data page cluster ratio (DPCR) |
6473898, | Jul 06 1999 | VERSATA SOFTWARE, INC | Method for compiling and selecting data attributes |
6487495, | Jun 02 2000 | HERE GLOBAL B V | Navigation applications using related location-referenced keywords |
6502102, | Mar 27 2000 | Accenture Global Services Limited | System, method and article of manufacture for a table-driven automated scripting architecture |
6519631, | Aug 13 1999 | Answers Corporation | Web-based information retrieval |
6556991, | Sep 01 2000 | ACCESSIFY, LLC | Item name normalization |
6565610, | Feb 11 1999 | HERE GLOBAL B V | Method and system for text placement when forming maps |
6567846, | May 15 1998 | WILMINGTON TRUST, NATIONAL ASSOCIATION, AS COLLATERAL AGENT | Extensible user interface for a distributed messaging framework in a computer network |
6567936, | Feb 08 2000 | ZHIGU HOLDINGS LIMITED | Data clustering using error-tolerant frequent item sets |
6572661, | Jan 11 1999 | Cisco Technology, Inc | System and method for automated annotation of files |
6578032, | Jun 28 2000 | Microsoft Technology Licensing, LLC | Method and system for performing phrase/word clustering and cluster merging |
6584464, | Mar 19 1999 | IAC SEARCH & MEDIA, INC | Grammar template query system |
6584646, | Feb 29 2000 | Ethicon, Inc | Tilt hinge for office automation equipment |
6594658, | Jul 07 1995 | Sun Microsystems, Inc. | Method and apparatus for generating query responses in a computer-based document retrieval system |
6606625, | Jun 03 1999 | IMPORT IO GLOBAL INC | Wrapper induction by hierarchical data analysis |
6606659, | Jan 28 2000 | FORCEPOINT FEDERAL HOLDINGS LLC; Forcepoint LLC | System and method for controlling access to internet sites |
6609123, | Sep 01 2000 | International Business Machines Corporation | Query engine and method for querying data using metadata model |
6636742, | Dec 23 1997 | UNWIRED BROADBAND, INC | Tracking of mobile terminal equipment in a mobile communications system |
6643641, | Apr 27 2000 | Mineral Lassen LLC | Web search engine with graphic snapshots |
6656991, | Aug 22 2000 | Ausimont S.p.A. | Blends of fluorinated and acrylic elastomers |
6665659, | Feb 01 2000 | RATEZE REMOTE MGMT L L C | Methods and apparatus for distributing and using metadata via the internet |
6665666, | Oct 26 1999 | GOOGLE LLC | System, method and program product for answering questions using a search engine |
6665837, | Aug 10 1998 | R2 SOLUTIONS LLC | Method for identifying related pages in a hyperlinked database |
6675159, | Jul 27 2000 | Leidos, Inc | Concept-based search and retrieval system |
6684205, | Oct 18 2000 | International Business Machines Corporation | Clustering hypertext with applications to web searching |
6693651, | Feb 07 2001 | International Business Machines Corporation | Customer self service iconic interface for resource search results display and selection |
6704726, | Dec 28 1998 | BULL S A | Query processing method |
6738767, | Mar 20 2000 | Meta Platforms, Inc | System and method for discovering schematic structure in hypertext documents |
6745189, | Jun 05 2000 | International Business Machines Corporation | System and method for enabling multi-indexing of objects |
6754873, | Sep 20 1999 | GOOGLE LLC | Techniques for finding related hyperlinked documents using link-based analysis |
6763496, | Mar 31 1999 | Microsoft Technology Licensing, LLC | Method for promoting contextual information to display pages containing hyperlinks |
6799176, | Jan 10 1997 | GOOGLE LLC | Method for scoring documents in a linked database |
6804667, | Nov 30 1999 | TERADATA US, INC | Filter for checking for duplicate entries in database |
6820081, | Mar 19 2001 | NUIX NORTH AMERICA INC | System and method for evaluating a structured message store for message redundancy |
6820093, | Jul 30 1996 | HyperPhrase Technologies, LLC | Method for verifying record code prior to an action based on the code |
6823495, | Sep 14 2000 | Microsoft Technology Licensing, LLC | Mapping tool graphical user interface |
6832218, | Sep 22 2000 | International Business Machines Corporation | System and method for associating search results |
6845354, | Sep 09 1999 | Institute For Information Industry | Information retrieval system with a neuro-fuzzy structure |
6850896, | Oct 28 1999 | Market-Touch Corporation | Method and system for managing and providing sales data using world wide web |
6868411, | Aug 13 2001 | Xerox Corporation | Fuzzy text categorizer |
6873982, | Jul 16 1999 | International Business Machines Corporation | Ordering of database search results based on user feedback |
6873993, | Jun 21 2000 | Canon Kabushiki Kaisha | Indexing method and apparatus |
6886005, | Feb 17 2000 | E-NUMERATE SOLUTIONS, INC | RDL search engine |
6886010, | Sep 30 2002 | The United States of America as represented by the Secretary of the Navy; NAVY, UNITED STATES OF AMERICA, AS REPRESENTED BY THE SEC Y OF THE | Method for data and text mining and literature-based discovery |
6901403, | Mar 02 2000 | ROGUE WAVE SOFTWARE, INC | XML presentation of general-purpose data sources |
6904429, | Sep 29 1997 | Kabushiki Kaisha Toshiba | Information retrieval apparatus and information retrieval method |
6957213, | May 17 2000 | Oracle OTC Subsidiary LLC | Method of utilizing implicit references to answer a query |
6963880, | May 10 2002 | Oracle International Corporation | Schema evolution of complex objects |
6965900, | Dec 19 2001 | X-Labs Holdings, LLC | METHOD AND APPARATUS FOR ELECTRONICALLY EXTRACTING APPLICATION SPECIFIC MULTIDIMENSIONAL INFORMATION FROM DOCUMENTS SELECTED FROM A SET OF DOCUMENTS ELECTRONICALLY EXTRACTED FROM A LIBRARY OF ELECTRONICALLY SEARCHABLE DOCUMENTS |
7003506, | Jun 23 2000 | Microsoft Technology Licensing, LLC | Method and system for creating an embedded search link document |
7003522, | Jun 24 2002 | Microsoft Technology Licensing, LLC | System and method for incorporating smart tags in online content |
7003719, | Jan 25 1999 | Thomson Reuters Enterprise Centre GmbH | System, method, and software for inserting hyperlinks into documents |
7007228, | Jul 29 1999 | MEDIATEK INC | Encoding geographic coordinates in a fuzzy geographic address |
7013308, | Nov 28 2000 | Amazon Technologies, Inc | Knowledge storage and retrieval system and method |
7020662, | May 29 2001 | Oracle America, Inc | Method and system for determining a directory entry's class of service based on the value of a specifier in the entry |
7043521, | Mar 21 2002 | ALVARIA CAYMAN CX | Search agent for searching the internet |
7051023, | Apr 04 2003 | R2 SOLUTIONS LLC | Systems and methods for generating concept units from search queries |
7076491, | Nov 09 2001 | YOZOSOFT CO , LTD | Upward and downward compatible data processing system |
7080073, | Aug 18 2000 | AUREA SOFTWARE, INC | Method and apparatus for focused crawling |
7080085, | Jul 12 2000 | International Business Machines Corporation | System and method for ensuring referential integrity for heterogeneously scoped references in an information management system |
7100082, | Aug 04 2000 | Oracle America, Inc | Check creation and maintenance for product knowledge management |
7143099, | Feb 08 2001 | AMDOCS DEVELOPMENT LIMITED; Amdocs Software Systems Limited | Historical data warehousing system |
7146536, | Aug 04 2000 | Oracle America, Inc | Fact collection for product knowledge management |
7158980, | Oct 02 2003 | Acer Incorporated | Method and apparatus for computerized extracting of scheduling information from a natural language e-mail |
7162499, | Jun 21 2000 | Microsoft Technology Licensing, LLC | Linked value replication |
7165024, | Feb 22 2002 | NEC Corporation | Inferring hierarchical descriptions of a set of documents |
7174504, | Nov 09 2001 | YOZOSOFT CO , LTD | Integrated data processing system with links |
7181471, | Nov 01 1999 | Fujitsu Limited | Fact data unifying method and apparatus |
7194380, | Feb 28 2003 | PEGASYSTEMS INC | Classification using probability estimate re-sampling |
7197449, | Oct 30 2001 | Intel Corporation | Method for extracting name entities and jargon terms using a suffix tree data structure |
7216073, | Mar 13 2001 | INTELLIGATE, LTD | Dynamic natural language understanding |
7233943, | Oct 18 2000 | International Business Machines Corporation | Clustering hypertext with applications to WEB searching |
7260573, | May 17 2004 | GOOGLE LLC | Personalizing anchor text scores in a search engine |
7263565, | Sep 21 2004 | Renesas Electronics Corporation; NEC Electronics Corporation | Bus system and integrated circuit having an address monitor unit |
7277879, | Dec 17 2002 | Hewlett Packard Enterprise Development LP | Concept navigation in data storage systems |
7302646, | Feb 06 2002 | GOOGLE LLC | Information rearrangement method, information processing apparatus and information processing system, and storage medium and program transmission apparatus therefor |
7305380, | Dec 15 1999 | GOOGLE LLC | Systems and methods for performing in-context searching |
7325160, | Nov 09 2001 | YOZOSOFT CO , LTD | Data processing system with data recovery |
7363312, | Jul 04 2002 | Hewlett Packard Enterprise Development LP | Combining data descriptions |
7376895, | Nov 09 2001 | YOZOSOFT CO , LTD | Data object oriented repository system |
7398461, | Jan 24 2002 | R2 SOLUTIONS LLC | Method for ranking web page search results |
7409381, | Jul 30 1998 | British Telecommunications public limited company | Index to a semi-structured database |
7412078, | Jul 18 2001 | INPEG VISION CO , LTD | System for automatic recognizing license number of other vehicles on observation vehicles and method thereof |
7418736, | Mar 27 2002 | British Telecommunications public limited company | Network security system |
7472182, | Dec 31 2002 | EMC IP HOLDING COMPANY LLC | Data collection policy for storage devices |
7483829, | Jul 26 2001 | International Business Machines Corporation | Candidate synonym support device for generating candidate synonyms that can handle abbreviations, mispellings, and the like |
7493308, | Oct 03 2000 | Amazon Technologies, Inc | Searching documents using a dimensional database |
7493317, | Oct 20 2005 | Adobe Inc | Result-based triggering for presentation of online content |
7587387, | Mar 31 2005 | GOOGLE LLC | User interface for facts query engine with snippets from information sources that include query terms and answer terms |
7644076, | Sep 12 2003 | TERADATA US, INC | Clustering strings using N-grams |
7672971, | Feb 17 2006 | GOOGLE LLC | Modular architecture for entity normalization |
7685201, | Sep 08 2006 | Microsoft Technology Licensing, LLC | Person disambiguation using name entity extraction-based clustering |
7698303, | Jan 14 2002 | BEIJING ZITIAO NETWORK TECHNOLOGY CO , LTD | System for categorizing and normalizing knowledge data based on user's affinity to knowledge |
7716225, | Jun 17 2004 | GOOGLE LLC | Ranking documents based on user behavior and/or feature data |
7747571, | Apr 15 2004 | AT&T Intellectual Property, I,L.P. | Methods, systems, and computer program products for implementing logical and physical data models |
7756823, | Mar 26 2004 | Lockheed Martin Corporation | Dynamic reference repository |
7797282, | Sep 29 2005 | MICRO FOCUS LLC | System and method for modifying a training set |
7885918, | Jul 29 2005 | K MIZRA LLC | Creating a taxonomy from business-oriented metadata content |
7917154, | Nov 01 2006 | R2 SOLUTIONS LLC | Determining mobile content for a social network based on location and time |
7953720, | Mar 31 2005 | GOOGLE LLC | Selecting the best answer to a fact query from among a set of potential answers |
8024281, | Feb 29 2008 | Red Hat, Inc | Alpha node hashing in a rule engine |
8065290, | Mar 31 2005 | GOOGLE LLC | User interface for facts query engine with snippets from information sources that include query terms and answer terms |
8108501, | Nov 01 2006 | R2 SOLUTIONS LLC | Searching and route mapping based on a social network, location, and time |
20010021935, | |||
20020022956, | |||
20020038307, | |||
20020042707, | |||
20020065845, | |||
20020073115, | |||
20020083039, | |||
20020087567, | |||
20020107861, | |||
20020147738, | |||
20020169770, | |||
20020174099, | |||
20020178448, | |||
20020194172, | |||
20030018652, | |||
20030058706, | |||
20030069880, | |||
20030078902, | |||
20030088607, | |||
20030097357, | |||
20030120644, | |||
20030120675, | |||
20030126102, | |||
20030126152, | |||
20030149567, | |||
20030149699, | |||
20030154071, | |||
20030167163, | |||
20030177110, | |||
20030182310, | |||
20030195872, | |||
20030195877, | |||
20030196052, | |||
20030204481, | |||
20030208354, | |||
20040003067, | |||
20040015481, | |||
20040024739, | |||
20040049503, | |||
20040059726, | |||
20040064447, | |||
20040088292, | |||
20040107125, | |||
20040122844, | |||
20040122846, | |||
20040123240, | |||
20040128624, | |||
20040143600, | |||
20040153456, | |||
20040167870, | |||
20040167907, | |||
20040167911, | |||
20040177015, | |||
20040177080, | |||
20040199923, | |||
20040243552, | |||
20040243614, | |||
20040255237, | |||
20040260979, | |||
20040267700, | |||
20040268237, | |||
20050055365, | |||
20050076012, | |||
20050080613, | |||
20050086211, | |||
20050086222, | |||
20050086251, | |||
20050097150, | |||
20050108630, | |||
20050125311, | |||
20050149576, | |||
20050149851, | |||
20050187923, | |||
20050188217, | |||
20050240615, | |||
20050256825, | |||
20060036504, | |||
20060041597, | |||
20060047691, | |||
20060047838, | |||
20060053171, | |||
20060053175, | |||
20060064411, | |||
20060074824, | |||
20060074910, | |||
20060085465, | |||
20060112110, | |||
20060123046, | |||
20060129843, | |||
20060136585, | |||
20060143227, | |||
20060143603, | |||
20060152755, | |||
20060167991, | |||
20060224582, | |||
20060238919, | |||
20060242180, | |||
20060248045, | |||
20060248456, | |||
20060253418, | |||
20060259462, | |||
20060277169, | |||
20060288268, | |||
20060293879, | |||
20070005593, | |||
20070005639, | |||
20070016890, | |||
20070038610, | |||
20070043708, | |||
20070055656, | |||
20070073768, | |||
20070094246, | |||
20070100814, | |||
20070130123, | |||
20070143282, | |||
20070143317, | |||
20070150800, | |||
20070198451, | |||
20070198480, | |||
20070198481, | |||
20070198503, | |||
20070198577, | |||
20070198598, | |||
20070198600, | |||
20070203867, | |||
20070208773, | |||
20070271268, | |||
20080071739, | |||
20080104019, | |||
20090006359, | |||
20090119255, | |||
JP11265400, | |||
JP2002157276, | |||
JP2002540506, | |||
JP2003281173, | |||
JP5174020, | |||
WO127713, | |||
WO2004114163, | |||
WO2006104951, |
Executed on | Assignor | Assignee | Conveyance | Frame | Reel | Doc |
May 25 2005 | HOGUE, ANDREW WILLIAM | Google Inc | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 016196 | /0533 | |
May 25 2005 | SIEMBORSKI, ROBERT JOSEPH | Google Inc | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 016196 | /0533 | |
May 25 2005 | BETZ, JONATHAN T | Google Inc | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 016196 | /0533 | |
May 31 2005 | Google Inc. | (assignment on the face of the patent) | / | |||
Sep 29 2017 | Google Inc | GOOGLE LLC | CHANGE OF NAME SEE DOCUMENT FOR DETAILS | 044334 | /0466 |
Date | Maintenance Fee Events |
Oct 01 2018 | M1551: Payment of Maintenance Fee, 4th Year, Large Entity. |
Sep 30 2022 | M1552: Payment of Maintenance Fee, 8th Year, Large Entity. |
Date | Maintenance Schedule |
Mar 31 2018 | 4 years fee payment window open |
Oct 01 2018 | 6 months grace period start (w surcharge) |
Mar 31 2019 | patent expiry (for year 4) |
Mar 31 2021 | 2 years to revive unintentionally abandoned end. (for year 4) |
Mar 31 2022 | 8 years fee payment window open |
Oct 01 2022 | 6 months grace period start (w surcharge) |
Mar 31 2023 | patent expiry (for year 8) |
Mar 31 2025 | 2 years to revive unintentionally abandoned end. (for year 8) |
Mar 31 2026 | 12 years fee payment window open |
Oct 01 2026 | 6 months grace period start (w surcharge) |
Mar 31 2027 | patent expiry (for year 12) |
Mar 31 2029 | 2 years to revive unintentionally abandoned end. (for year 12) |