Method of checking index integrity in a DB2 database
US5579515

A single-phase CHECK INDEX operation for DB2 entails creating a special sort record for each data record and each index entry, collating the special sort records in a particular way to group together the sort records for each data record and any associated index entries, and performing diagnostic operations on the sorted records.

Patent 5579515
Priority Dec 16 1993
Filed Dec 16 1993
Issued Nov 26 1996
Expiry Dec 16 2013
Inventors Hintz, Tho…
Assg.orig BMC Softwa…
Assg.curr BMC SOFTWA…
Entity Large
Referenced by 33
References 10
Maint.: all paid

1. A method of testing db2 index integrity, comprising the steps of:

(a) reading, in parallel, (1) a plurality of db2 data records from a db2 table space, and (2) a plurality of db2 index entries from each of at least one db2 indexes associated with said db2 table space, each said index entry being nominally associated with exactly one said db2 data record;

(b) constructing a sort record for each said data record and each said index entry;

(c) collating the sort records for the data records and said sort records for the index entries into a single sequence of sort records to group together (1) the sort records for each data record, with (2) the one or more sort records for said one or more index entries associated with said respective data record; and

(d) performing a specified diagnosis routine utilizing said single sequence of sort records as an input.

2. A method of testing db2 index integrity, comprising the steps of:

(a) reading, in parallel, (1) a plurality of db2 data records from a db2 table space, and (2) a plurality of db2 index entries from each of at least one db2 indexes associated with said db2 table space, each said index entry being nominally associated with exactly one said db2 data record;

(b) constructing a sort record for each data record, each said sort record respectively including (1) a row identifier field whose value uniquely identifies said respective data record, and (2) an index identifier field whose value indicates that said sort record is associated with a data record;

(c) constructing a sort record for each said index entry, each said sort record for said index entries respectively including (1) a field identifying the data record with which said index entry is associated, and (2) a field identifying the db2 index from which the index entry was read;

(d) collating the sort records for the data records and said sort records for the index entries into a single sequence of sort records to group together (1) the sort records for each data record, with (2) the one or more sort records for said one or more index entries associated with said respective data record; and

(e) performing a specified diagnosis routine utilizing said single sequence of sort records as an input.

3. The method of a specified one of claims 1 and 2, wherein the sort records are grouped so that the sort record for a specified data record precedes any sort record for an index entry associated with said specified data record.

4. A program storage device encoding a program of instructions, for a general-purpose programmable machine, wherein execution of said instructions by said machine results in performance of the method steps of claim 1.

5. A program storage device encoding a program of instructions, for a general-purpose programmable machine, wherein execution of said instructions by said machine results in performance of the method steps of claim 2.

6. A program storage device encoding a program of instructions, for a general-purpose programmable machine, wherein execution of said instructions by said machine results in performance of the method steps of claim 3.

BACKGROUND OF THE INVENTION

In the prior art, a well-known process that is executed periodically for DB2 installations is to check the indexes of a database. Generally speaking, this entails ensuring that the physical data records (sometimes referred to as rows) are properly indexed. That is, each data row should have exactly one index entry in each index (e.g., "Last Name" for a personnel data record), and the key value in the physical data record or row should match the key value stored in the appropriate index entry for that data row.

Some nomenclature is introduced here to aid in understanding the terminology used. A principal term used here is that of a "data set." Generally speaking, a data set is a collection of data that is referred to in an operating system environment (e.g., the well-known MVS operating system environment) by a single name, in much the same way as a word processing file might be given a single directory name for easy retrieval of the data in the file (even though the data might physically be stored in a variety of locations on a disk). Examples of MVS data sets include DB2 indexes and DB2 table spaces. Typically, maintenance operations in the DB2 environment involve three steps: First, reading the data to be reorganized (e.g., from a table space or index), often from a variety of different physical locations identified by a data set name. The step of data reading is typically referred to as an UNLOAD process that involves physically copying the data to some other memory or other storage. The second step is that of sorting or otherwise ordering the data to conform to the desired ordering and performing any other desired processing. Finally, the third step is that of rewriting the sorted data to storage (table space or index) designated with the same data set name. The third step is typically referred to as a RELOAD process.

A conventional way of checking DB2 indexes is illustrated in FIGS. 1 and 2. One or more DB2 indexes 100 are UNLOADed in a first processing phase 105, which typically reads the index entries and sorts them into the order of the physical sequence of the rows indexed. A work or intermediate data set 110 is created from the output of the first processing phase 105. In a second processing phase 115, the contents of the work data set 110 are methodically compared with the actual physical data records read from DB2 table space(s) 120 to check the index integrity as described above, and appropriate diagnostic message(s) 125 are displayed. This process is shown in more detail in FIG. 2. The index(es) in question are UNLOADed sequentially at block 200. At block 210, the index(es) are sorted by the data row identifier (sometimes referred to as the row ID or RID). At block 220, the physical data records are read, and at block 230, the respective data records are compared with the corresponding index key values from the index-related work data set 110. At block 240, appropriate conventional diagnostic routines are executed to check the index integrity.
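
For orientation, the two-phase flow of FIG. 2 can be sketched roughly as follows. This is a minimal illustration only, assuming in-memory Python structures stand in for a single index and a table space; the names `unload_index` and `two_phase_check` and the tuple layout are hypothetical and do not come from the patent.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-ins: an index entry is (rid, key_value), and the table
# space maps each row ID (RID) to the key value stored in that data row.
IndexEntry = Tuple[int, str]


def unload_index(index: List[IndexEntry]) -> List[IndexEntry]:
    """Phase 1 (blocks 200-210): unload the index and sort it by RID.

    The returned list plays the role of the intermediate work data set 110.
    """
    return sorted(index, key=lambda entry: entry[0])


def two_phase_check(index: List[IndexEntry], table: Dict[int, str]) -> List[str]:
    """Phase 2 (blocks 220-240): compare the work data set with the data rows."""
    work_data_set = unload_index(index)
    messages: List[str] = []
    seen = set()
    for rid, key in work_data_set:
        seen.add(rid)
        if rid not in table:
            messages.append(f"RID {rid}: index entry has no data row")
        elif table[rid] != key:
            messages.append(f"RID {rid}: key mismatch ({key!r} vs {table[rid]!r})")
    # Data rows that no index entry pointed at are reported last.
    for rid in sorted(table):
        if rid not in seen:
            messages.append(f"RID {rid}: data row has no index entry")
    return messages
```

Note that the data rows are consulted only after the whole index has been unloaded and sorted, which is the sequential behavior the text goes on to criticize.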

A significant disadvantage of the foregoing prior-art approach to the CHECK INDEX operation is that the various indexes are UNLOADed one after another, and subsequently the physical data records are read. The elapsed time required for the operation thus includes the sum of the elapsed times required for each of the index UNLOAD operations plus the time required for the reading of the associated physical data records. This is especially undesirable in installations which attempt to operate 24 hours a day, seven days a week; it may be highly undesirable to disable a data set, in effect, for the length of time required for a two-phase CHECK INDEX operation. Moreover, such an approach plainly requires additional input/output (I/O) steps, which can have tangible financial costs associated therewith. In some actual systems, storage space may be at a premium and thus the use of storage to create a working or intermediate data set may be undesirable.

SUMMARY OF THE INVENTION

A method in accordance with the invention entails performing a single-phase CHECK INDEX operation for a DB2 environment by creating a special SORT record for each data record and each index entry, collating the special SORT records in a particular way to group together the SORT records for each data record and any associated index entries, and performing diagnostic operations on the sorted records.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a data flow diagram overview, and FIG. 2 is a flow-chart overview, of a typical prior-art two-phase CHECK INDEX process in a DB2 database.

FIG. 3 is a data flow diagram of an improved single-phase CHECK INDEX process.

FIG. 4 is a flow-chart representation of the same process.

FIG. 5 shows a sort key used in the process.

FIGS. 6 and 7 are pseudocode and flow-chart representations of an illustrative diagnostic routine.

DETAILED DESCRIPTION OF SPECIFIC EMBODIMENTS

One illustrative embodiment of the invention is described below as it might be implemented by loading a program of instructions (e.g., executable code or compilable or interpretable source code) on a general purpose computer system from a program storage device such as a magnetic tape, a floppy disk, an optical disk, etc., and causing the computer system to execute the program. In the interest of clarity, not all features of an actual implementation are described in this specification. It will of course be appreciated that in the development of any such actual implementation (as in any software development project), numerous implementation-specific decisions must be made to achieve the developers' specific goals and subgoals, such as compliance with system- and business-related constraints, which will vary from one implementation to another. Moreover, it will be appreciated that such a development effort might be complex and time-consuming, but would nevertheless be a routine undertaking of software engineering for those of ordinary skill having the benefit of this disclosure.

Referring to FIGS. 3 and 4, an improved CHECK INDEX operation takes advantage of an innovative use of the system SORT routine (e.g., the standard SORT routine supplied with the IBM MVS operating system or any desired third-party SORT routine). As shown in FIG. 3, the DB2 index(es) 100 and the physical data rows in the DB2 table space 120 are read in parallel and fed to a SORT routine 300. The output of the SORT routine is examined to diagnose potential problems with index integrity.

Referring now to FIG. 4, the method of the invention is shown in more detail. The physical data records to be checked are read from the DB2 table space(s) 120 along with the index records from the DB2 index(es) 100, as shown in block 400. Significantly, these READ operations are performed in parallel, thus reducing the elapsed time required to perform the CHECK INDEX operation.
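
The parallel READ of block 400 might look something like the sketch below. It assumes each input (the table space and each index) can be treated as an independent iterable and uses a thread pool to drain them concurrently; the function names are hypothetical, and a real DB2 utility would use its own I/O and tasking facilities rather than Python threads.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Iterable, List, Sequence, Tuple


def read_all(source: Iterable[tuple]) -> List[tuple]:
    """Drain one input source (the table space or one index) into a list."""
    return list(source)


def read_in_parallel(
    table_rows: Iterable[tuple],
    indexes: Sequence[Iterable[tuple]],
) -> Tuple[List[tuple], List[List[tuple]]]:
    """Read the data rows and every index concurrently, one worker per source."""
    sources = [table_rows, *indexes]
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        # Executor.map returns results in the same order as the inputs.
        results = list(pool.map(read_all, sources))
    return results[0], results[1:]
```

With this arrangement the elapsed time tends toward that of the slowest single source rather than the sum of all reads, provided the reads are I/O-bound.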

At block 405, a series of SORT input records is constructed, one for each data row read from a DB2 table space and one for each index entry read from a DB2 index, as illustrated in more detail in FIG. 5. As is well known to those of ordinary skill, each index entry includes a value identifying the data row that is indexed by the index entry (commonly referred to as the row ID or "RID"). Each SORT input record accordingly includes a row ID: if the SORT input record is for a data row, then the row ID is that of the data row; if the SORT input record is for an index entry, the row ID is that of the data row that is indexed. In the SORT record, the row ID immediately precedes an index identifier for the record in question. Similarly, the table ID (TB ID) identifies the DB2 table associated with the record in question (whether a table space record or an index record). Finally, the key value field of the SORT record is the index key value for the record in question. If the record in question is a physical data record from a DB2 table space, then the value of the index identifier is set to zero; otherwise, the value of the index identifier is a non-zero value uniquely identifying the particular DB2 index from which the index entry was read.
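
One way to picture the SORT input record is the small sketch below. The four fields follow the description above, but the class name, the helper functions, the Python types, and the placement of the table ID and key value after the first two fields are assumptions made for illustration; the actual layout is the one shown in FIG. 5, which is not reproduced here. How a data-row record would carry key values for several indexes is likewise left open.

```python
from dataclasses import dataclass

DATA_ROW_INDEX_ID = 0  # index identifier reserved for data-row sort records


@dataclass(frozen=True)
class SortRecord:
    rid: int        # row ID of the data row, or of the row an index entry points to
    index_id: int   # 0 for a data row; non-zero ID of the source index otherwise
    table_id: int   # TB ID of the DB2 table associated with the record
    key_value: str  # index key value for the record


def from_data_row(rid: int, table_id: int, key_value: str) -> SortRecord:
    """Build the sort record for a physical data row (index ID is zero)."""
    return SortRecord(rid, DATA_ROW_INDEX_ID, table_id, key_value)


def from_index_entry(rid: int, index_id: int, table_id: int, key_value: str) -> SortRecord:
    """Build the sort record for an index entry read from index `index_id`."""
    if index_id == DATA_ROW_INDEX_ID:
        raise ValueError("index entries need a non-zero index identifier")
    return SortRecord(rid, index_id, table_id, key_value)
```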

Returning to FIG. 4, the SORT records so constructed are fed to the system SORT routine where a two-part sort key is used. The sort key, as shown in FIG. 5, consists of the row ID immediately followed by the index ID. Consequently, the output of the SORT routine is a series of records in row ID order and for each row ID, in index ID order.
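
The effect of the two-part key can be illustrated with a short sketch. Python's built-in `sorted` stands in here for the system SORT routine, and the record type is a simplified, hypothetical one holding only the row ID, the index ID, and a key value.

```python
from typing import Iterable, List, NamedTuple


class SortRecord(NamedTuple):
    rid: int        # row ID
    index_id: int   # 0 = data row, non-zero = index entry
    key_value: str


def collate(records: Iterable[SortRecord]) -> List[SortRecord]:
    """Order records by row ID, then by index ID within each row ID."""
    return sorted(records, key=lambda r: (r.rid, r.index_id))


if __name__ == "__main__":
    mixed = [
        SortRecord(2, 1, "Baker"),
        SortRecord(1, 2, "D01"),
        SortRecord(2, 0, "Baker"),
        SortRecord(1, 0, "Adams"),
        SortRecord(1, 1, "Adams"),
    ]
    for record in collate(mixed):
        print(record.rid, record.index_id, record.key_value)
```

Because data-row records carry an index ID of zero, each data row's record sorts ahead of the index entries that point to it, which is the grouping recited in claim 3.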

As shown in block 415 and FIGS. 6 and 7, the output of the SORT routine is tested, using any convenient diagnostic process, to determine whether any integrity problems exist in the data being tested. FIG. 6 shows a pseudocode representation of one diagnostic process, which is also shown in graphic flow chart form in FIG. 7.
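
FIGS. 6 and 7 are not reproduced in this text, so the following is not the patent's diagnostic routine. It is only a sketch of one possible diagnostic pass over the sorted output, under two assumptions that are not in the source: the record type is a simplified hypothetical one, and each data-row record carries a `row_keys` mapping from index ID to the key value the row holds for that index.

```python
from itertools import groupby
from typing import Dict, Iterable, List, NamedTuple, Optional


class SortRecord(NamedTuple):
    rid: int
    index_id: int                              # 0 = data row, non-zero = index entry
    key_value: Optional[str] = None            # key held by an index entry
    row_keys: Optional[Dict[int, str]] = None  # assumption: data row's key per index ID


def diagnose(sorted_records: Iterable[SortRecord], index_ids: List[int]) -> List[str]:
    """Scan records already sorted by (rid, index_id) and report problems."""
    messages: List[str] = []
    for rid, group in groupby(sorted_records, key=lambda r: r.rid):
        records = list(group)
        # After sorting, the data-row record (index ID 0) leads its group if present.
        row = records[0] if records[0].index_id == 0 else None
        if row is None:
            messages.append(f"RID {rid}: index entries without a data row")
            continue
        entries = records[1:]
        expected = row.row_keys or {}
        for index_id in index_ids:
            matches = [e for e in entries if e.index_id == index_id]
            if not matches:
                messages.append(f"RID {rid}: no entry in index {index_id}")
            elif len(matches) > 1:
                messages.append(f"RID {rid}: {len(matches)} entries in index {index_id}")
            elif matches[0].key_value != expected.get(index_id):
                messages.append(f"RID {rid}: key mismatch in index {index_id}")
    return messages
```

Since all records for a given row ID arrive together, the check needs only one pass and holds a single group in memory at a time.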

A significant advantage of the method described herein is that the data-record and index-entry data are read in parallel, thus reducing elapsed time. Moreover, because a work data set 110 is not created, I/O resources are thereby conserved.

INVENTORS:

Hintz, Thomas E., Tenberg, Kerry C.

THIS PATENT IS REFERENCED BY THESE PATENTS:

Patent	Priority	Assignee	Title
5842197,	Aug 29 1996	Oracle International Corporation	Selecting a qualified data repository to create an index
5842208,	Apr 09 1997	International Business Machines Corporation	High performance recover/build index system by unloading database files in parallel
5878410,	Sep 13 1996	Microsoft Technology Licensing, LLC	File system sort order indexes
5978793,	Apr 18 1997	International Business Machines Corporation	Processing records from a database
6065005,	Dec 17 1997	International Business Machines Corporation	Data sorting
6081799,	May 21 1998	International Business Machines Corporation	Executing complex SQL queries using index screening for conjunct or disjunct index operations
6144970,	Sep 24 1998	International Business Machines Corporation	Technique for inplace reorganization of a LOB table space
6272486,	Apr 16 1998	International Business Machines Corporation	Determining the optimal number of tasks for building a database index
6282570,	Dec 07 1998	International Business Machines Corporation	Monitoring a large parallel database through dynamic grouping and sequential sampling
6295539,	Sep 14 1998	CA, INC	Dynamic determination of optimal process for enforcing constraints
6343286,	Sep 24 1998	International Business Machines Corporation	Efficient technique to defer large object access with intermediate results
6343293,	Sep 24 1998	International Business Machines Corporation	Storing the uncompressed data length in a LOB map to speed substring access within a LOB value
6363389,	Sep 24 1998	International Business Machines Corporation	Technique for creating a unique quasi-random row identifier
6366902,	Sep 24 1998	International Business Machines Corp.	Using an epoch number to optimize access with rowid columns and direct row access
6427143,	Apr 10 1998	CA, INC	Method for loading rows into a database table while enforcing constraints
6470359,	Sep 24 1998	International Business Machines Corporation	Fast technique for recovering an index on an auxiliary table
6598041,	Sep 07 2000	International Business Machines Corporation	Method, system, and program for processing modifications to data in tables in a database system
6604097,	Sep 07 2000	International Business Machines Corporation	Method, system, and program for using control data structures when performing operations with respect to a database
6606617,	Sep 24 1998	International Business Machines Corporation	Optimized technique for prefetching LOB table space pages
6643637,	Sep 07 2000	International Business Machines Corporation	Method, system, and program for using a fetch request to make data available to an application program
6665678,	Sep 07 2000	International Business Machines Corporation	Method, system, and program for optimistic concurrency control for scrollable cursors in a database
6694305,	Sep 07 2000	International Business Machines Corporation	Method, system, and program for processing modifications to data in tables in a database system
6694340,	Sep 24 1998	International Business Machines Corporation	Technique for determining the age of the oldest reading transaction with a database object
6721731,	Sep 07 2000	International Business Machines Corporation	Method, system, and program for processing a fetch request for a target row at an absolute position from a first entry in a table
6754653,	Sep 07 2000	International Business Machines Corporation	Method, system, and program for processing a fetch request for a target row in a table that precedes as current row
6775676,	Sep 11 2000	International Business Machines Corporation; SAP AG	Defer dataset creation to improve system manageability for a database system
6856996,	Mar 30 2001	International Business Machines Corporation	Method, system, and program for accessing rows in one or more tables satisfying a search criteria
6983321,	Jul 10 2000	BMC SOFTWARE, INC	System and method of enterprise systems and business impact management
7376665,	Nov 06 2003	AT&T Intellectual Property I, L P	Systems, methods and computer program products for automating retrieval of data from a DB2 database
7774458,	Jul 10 2000	BMC Software, Inc.	System and method of enterprise systems and business impact management
7933922,	Nov 06 2003	RAKUTEN GROUP, INC	Systems, methods and computer program products for automating retrieval of data from a DB2 database
8438173,	Jan 09 2009	Microsoft Technology Licensing, LLC	Indexing and querying data stores using concatenated terms
8818953,	Nov 29 2004	BMC Software, Inc.	Method and apparatus for loading data into multi-table tablespace

THIS PATENT REFERENCES THESE PATENTS:

Patent	Priority	Assignee	Title
4817036,	Mar 15 1985	Brigham Young University	Computer system and method for data base indexing and information retrieval
4933848,	Jul 15 1988	International Business Machines Corporation	Method for enforcing referential constraints in a database management system
5089985,	Apr 07 1988	International Business Machines Corporation	System and method for performing a sort operation in a relational database manager to pass results directly to a user without writing to disk
5121493,	Jan 19 1990	AMALGAMATED SOFTWARE OF NORTH AMERICA, INC , A CORP OF TEXAS	Data sorting method
5133068,	Sep 23 1988	International Business Machines Corporation	Complied objective referential constraints in a relational database having dual chain relationship descriptors linked in data record tables
5146590,	Jan 13 1989	International Business Machines Corporation	Method for sorting using approximate key distribution in a distributed system
5222235,	Feb 01 1990	BMC SOFTWARE, INC	Databases system for permitting concurrent indexing and reloading of data by early simulating the reload process to determine final locations of the data
5307484,	Mar 06 1991	NEW CARCO ACQUISITION LLC; Chrysler Group LLC	Relational data base repository system for managing functional and physical data structures of nodes and links of multiple computer networks
5404510,	May 21 1992	Oracle International Corporation	Database index design based upon request importance and the reuse and modification of similar existing indexes
5497486,	Mar 15 1994	LOT 19 ACQUISITION FOUNDATION, LLC	Method of merging large databases in parallel

ASSIGNMENT RECORDS

Executed on	Assignor	Assignee	Conveyance	Frame	Reel	Doc
Dec 16 1993		BMC Software, Inc.	(assignment on the face of the patent)
Feb 16 1994	TENBERG, KERRY C	BMC SOFTWARE, INC	ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS	006927	0811	pdf
Feb 16 1994	HINTZ, THOMAS E	BMC SOFTWARE, INC	ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS	006927	0811	pdf
Sep 10 2013	BLADELOGIC, INC	CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH, AS COLLATERAL AGENT	SECURITY AGREEMENT	031204	0225	pdf
Sep 10 2013	BMC SOFTWARE, INC	CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH, AS COLLATERAL AGENT	SECURITY AGREEMENT	031204	0225	pdf
Oct 02 2018	Credit Suisse AG, Cayman Islands Branch	BMC ACQUISITION L L C	RELEASE OF PATENTS	047198	0468	pdf
Oct 02 2018	BMC SOFTWARE, INC	CREDIT SUISSE, AG, CAYMAN ISLANDS BRANCH, AS COLLATERAL AGENT	SECURITY INTEREST SEE DOCUMENT FOR DETAILS	047185	0744	pdf
Oct 02 2018	BLADELOGIC, INC	CREDIT SUISSE, AG, CAYMAN ISLANDS BRANCH, AS COLLATERAL AGENT	SECURITY INTEREST SEE DOCUMENT FOR DETAILS	047185	0744	pdf
Oct 02 2018	Credit Suisse AG, Cayman Islands Branch	BLADELOGIC, INC	RELEASE OF PATENTS	047198	0468	pdf
Oct 02 2018	Credit Suisse AG, Cayman Islands Branch	BMC SOFTWARE, INC	RELEASE OF PATENTS	047198	0468	pdf
Jun 01 2020	BMC SOFTWARE, INC	THE BANK OF NEW YORK MELLON TRUST COMPANY, N A , AS COLLATERAL AGENT	SECURITY INTEREST SEE DOCUMENT FOR DETAILS	052844	0646	pdf
Jun 01 2020	BLADELOGIC, INC	THE BANK OF NEW YORK MELLON TRUST COMPANY, N A , AS COLLATERAL AGENT	SECURITY INTEREST SEE DOCUMENT FOR DETAILS	052844	0646	pdf
Sep 30 2021	BLADELOGIC, INC	ALTER DOMUS US LLC	GRANT OF SECOND LIEN SECURITY INTEREST IN PATENT RIGHTS	057683	0582	pdf
Sep 30 2021	BMC SOFTWARE, INC	ALTER DOMUS US LLC	GRANT OF SECOND LIEN SECURITY INTEREST IN PATENT RIGHTS	057683	0582	pdf
Jan 31 2024	ALTER DOMUS US LLC	BMC SOFTWARE, INC	TERMINATION AND RELEASE OF SECURITY INTEREST IN PATENTS	066567	0283	pdf
Jan 31 2024	ALTER DOMUS US LLC	BLADELOGIC, INC	TERMINATION AND RELEASE OF SECURITY INTEREST IN PATENTS	066567	0283	pdf
Feb 29 2024	CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH, AS RESIGNING COLLATERAL AGENT	GOLDMAN SACHS BANK USA, AS SUCCESSOR COLLATERAL AGENT	OMNIBUS ASSIGNMENT OF SECURITY INTERESTS IN PATENT COLLATERAL	066729	0889	pdf
Jul 31 2024	THE BANK OF NEW YORK MELLON TRUST COMPANY, N A , AS COLLATERAL AGENT	BMC SOFTWARE, INC	RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL FRAME 052844 0646	068339	0408	pdf
Jul 31 2024	THE BANK OF NEW YORK MELLON TRUST COMPANY, N A , AS COLLATERAL AGENT	BLADELOGIC, INC	RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL FRAME 052844 0646	068339	0408	pdf
Jul 31 2024	THE BANK OF NEW YORK MELLON TRUST COMPANY, N A , AS COLLATERAL AGENT	BMC SOFTWARE, INC	RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL FRAME 052854 0139	068339	0617	pdf
Jul 31 2024	THE BANK OF NEW YORK MELLON TRUST COMPANY, N A , AS COLLATERAL AGENT	BLADELOGIC, INC	RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL FRAME 052854 0139	068339	0617	pdf

MAINTENANCE FEES AND DATES:

Date	Maintenance Fee Events
May 15 2000	M183: Payment of Maintenance Fee, 4th Year, Large Entity.
Apr 20 2004	M1552: Payment of Maintenance Fee, 8th Year, Large Entity.
May 16 2008	M1553: Payment of Maintenance Fee, 12th Year, Large Entity.

Date	Maintenance Schedule
Nov 26 1999	4 years fee payment window open
May 26 2000	6 months grace period start (w surcharge)
Nov 26 2000	patent expiry (for year 4)
Nov 26 2002	2 years to revive unintentionally abandoned end. (for year 4)
Nov 26 2003	8 years fee payment window open
May 26 2004	6 months grace period start (w surcharge)
Nov 26 2004	patent expiry (for year 8)
Nov 26 2006	2 years to revive unintentionally abandoned end. (for year 8)
Nov 26 2007	12 years fee payment window open
May 26 2008	6 months grace period start (w surcharge)
Nov 26 2008	patent expiry (for year 12)
Nov 26 2010	2 years to revive unintentionally abandoned end. (for year 12)