A method and apparatus are provided for presenting various levels of detail about successful error recoveries and background hardware optimizations during the recording and retrieval of digital information on magnetic tape. A first, Band Summary, report presents a high-level summary of recovery methods by data band and wrap. A second, Detail Summary, report presents a mid-level summary of recovery methods by track and longitudinal position (LPOS) region within one wrap of a band on the tape. A third, ERP Summary, report presents a low-level summary of errors and specific recovery methods and optimizations by LPOS region within each wrap. Such “telescoping” views permit pattern analysis to be performed at different resolutions. Thus, correlations of possible interactions between hardware and microcode activities that result in changes of the nominal operating point of the drive may be identified. Possible failure patterns may also be identified, fed back to design personnel and incorporated in microcode design changes for more effective ERP.
1. A method for mapping error corrections in a magnetic tape drive data storage system, comprising:
performing data write and/or read operations on a tape medium mounted in a data storage tape drive;
receiving information pertaining to successful recoveries from corresponding errors during the data operations;
mapping the successful error recovery information to associate each error recovery with a physical location on the tape medium;
mapping any hardware optimization of the read/write channel or servo system performed as a preventative measure;
generating a first output report providing a first level of error recovery detail; and
generating a second output report providing a second level of error recovery detail, the second output report having more detail than the first output report.
6. A system for mapping error corrections in a magnetic tape drive data storage system, comprising:
an error recovery controller operable to initiate recovery processes in response to errors detected during data write and/or read operations on a tape medium mounted in a data storage tape drive;
an error recovery controller operable to initiate preventative recovery processes in response to statistical assessment of read/write channel and servo performance detected during data write and/or read operations on a tape medium mounted in a data storage tape drive;
an error recovery logger operable to record:
locations of the errors on the tape medium;
a recovery method associated with each error;
a preventative recovery method associated with thresholded statistical performance data; and
values of a plurality of operational parameters at the time of each error; and
a report generator operable to generate:
a first output report providing a first level of error recovery detail; and
a second output report providing a second level of error recovery detail, the second output report having more detail than the first output report.
2. The method of
3. The method of
4. The method of
5. The method of
7. The system of
8. The system of
9. The system of
10. The system of
The present invention is directed generally to the recording and retrieval of digital information on magnetic tape and, in particular, to providing various levels of detail about successful error recoveries.
Conventional data storage tape drives employ various error correction and recovery methods to detect and correct data errors which, if left unresolved, would compromise the integrity of information read from or written to the magnetic tape media. Events which can lead to data errors include defects on the media, debris between the tape head and the media, and other conditions that interfere with head/media data transfer operations.
Error correction and recovery may be thought of as two distinct operations that are employed at different stages of error processing. Error correction is conventionally implemented using error correction coding (ECC) techniques in which host data to be placed on a tape medium is encoded in a well-defined structure by introducing data-dependent redundancy information. The presence of data errors is detected when the encoded structure is disturbed. The errors are corrected by making minimal alterations to reestablish the structure. ECC error correction is usually implemented “on-the-fly” as data is processed by the tape drive apparatus. Various encoding schemes are known in the art.
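The ECC principle described above (redundancy added on encoding, errors detected when the structure is disturbed, correction by minimal alteration) can be illustrated with a toy Hamming(7,4) code. This is only a sketch for illustration: the function names are hypothetical, and the code is not the encoding scheme of any particular tape drive, which would use considerably stronger codes.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits as a 7-bit codeword laid out p1 p2 d1 p3 d2 d3 d4."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_correct(codeword):
    """Return (corrected codeword, syndrome); a syndrome of 0 means no error seen."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3  # 1-based position of a single flipped bit
    if syndrome:
        c[syndrome - 1] ^= 1  # minimal alteration: flip exactly one bit
    return c, syndrome


def hamming74_decode(codeword):
    """Extract the 4 data bits from a (corrected) codeword."""
    return [codeword[2], codeword[4], codeword[5], codeword[6]]


original = [1, 0, 1, 1]
sent = hamming74_encode(original)
received = list(sent)
received[4] ^= 1  # simulate a single-bit error at codeword position 5
repaired, syndrome = hamming74_correct(received)
recovered = hamming74_decode(repaired)
```

A single flipped bit is located by the syndrome and repaired without reprocessing the data, which mirrors the on-the-fly character of ECC. Errors beyond the correcting power of the code are the cases that fall through to error recovery.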
Error recovery occurs when ECC error correction is unable to correct data errors or when thresholds for allowable error correction are exceeded. The error recovery process may require stopping the tape and reprocessing a data block in which an error was detected. Typical error recovery procedures include tape refresh operations, in which a tape is wound to its end and brought back to the error recovery point; tape backhitch or “shoeshine” operations, in which a tape is drawn back and forth across the tape head; backward tape read operations; tape tension adjustment operations; and tape servo adjustment operations, to name a few. Not all drives are capable of performing all such error recovery procedures.
Basic tape “mapping” has been employed to summarize errors and performance parameters by physical tape location. The resulting map may be offloaded from the tape drive via a host interface command or as a subset of a product dump file; it may then be formatted for engineering analysis by the manufacturer of the drive. Such mapping has typically been designed to focus on visualizing the tape media quality and recording channel defects. However, with the increasing design sophistication required to accomplish ever-increasing data densities on the tape, there is a correspondingly increasing reliance on complex recoveries and optimizations performed internally by microcode, some of which may not be visible and are therefore not available for analysis.
The present invention provides a method, a system and a computer program product for presenting various levels of detail about successful error recoveries during the recording and retrieval of digital information on magnetic tape. A method includes performing data write and/or read operations on a tape medium mounted in a data storage tape drive, receiving information pertaining to successful recoveries from corresponding errors during the data operations, mapping the successful error recovery information to associate each error recovery with a physical location on the tape medium, mapping any hardware optimization of the read/write channel or servo system performed as a preventative measure, generating a first output report providing a first level of error recovery detail, and generating a second output report providing a second level of error recovery detail, the second output report having more detail than the first output report.
The system includes an error recovery controller operable to initiate recovery processes in response to errors detected during data write and/or read operations on a tape medium mounted in a data storage tape drive, an error recovery controller operable to initiate preventative recovery processes in response to statistical assessment of read/write channel and servo performance detected during data write and/or read operations on a tape medium mounted in a data storage tape drive, an error recovery logger and a report generator. The error recovery logger is operable to record locations of the errors on the tape medium, a recovery method associated with each error, a preventative recovery method associated with thresholded statistical performance data, and values of a plurality of operational parameters at the time of each error. The report generator is operable to generate a first output report providing a first level of error recovery detail and a second output report providing a second level of error recovery detail, the second output report having more detail than the first output report.
The computer program product has computer-readable code embodied therein for mapping error corrections in a magnetic tape drive data storage system, the computer-readable code comprising instructions for performing data write and/or read operations on a tape medium mounted in a data storage tape drive, receiving information pertaining to recoveries from corresponding errors during the data operations, receiving information pertaining to background hardware optimization not performed in response to error stimulus but due to thresholding of statistical data collected dynamically during data operations, mapping the error recovery information to associate each error recovery with a physical location on the tape medium, generating a first output report providing a first level of error recovery detail, and generating a second output report providing a second level of error recovery detail, the second output report having more detail than the first output report.
A first, Band Summary, report may present a high-level summary of recovery methods by data band and wrap. A second, Detail Summary, report may present a mid-level summary of recovery methods by track and longitudinal position (LPOS) region within one wrap of a band on the tape. A third, ERP Summary, report may present a low-level summary of errors and specific recovery methods by LPOS region within each wrap. Such “telescoping” views permit pattern analysis to be performed at different resolutions. Thus, correlations of possible interactions between hardware and microcode activities that result in changes of the nominal operating point of the drive may be identified. Possible failure patterns may also be identified and fed back to design personnel and incorporated in microcode design changes for more effective ERP.
The microprocessor controller 120 provides overhead control functionality for the operations of all other components of the tape drive 100. The functions performed by the microprocessor controller 120 are programmable via microcode routines, as is known in the art. During data write operations (with all dataflow being reversed for data read operations), the microprocessor controller 120 activates the adaptor 102 to perform the required host interface protocol for receiving an information data block. The adaptor 102 communicates the data block to the data buffer 104 which stores the data for subsequent read/write processing. The data buffer 104 in turn communicates the data to the read/write dataflow circuitry 106, which formats the device data into physically formatted data that may be recorded on the magnetic tape 200. The read/write dataflow circuitry 106 is also responsible for executing all read/write data transfer operations under the control of the microprocessor controller 120. Formatted physical data from the read/write dataflow circuitry 106 is communicated to a tape interface system 110 which includes one or more read/write heads within a head assembly 114 and appropriate drive components (not shown) for performing forward and reverse movement of the tape 200 mounted on supply and take-up reels 116A and 116B. The drive components are controlled by the motion control system 108 to execute such tape movements as forward and reverse recording and playback, rewind and other tape motion functions. In addition, in multi-track tape drive systems, the motion control system 108 positions the read/write heads transversely relative to the longitudinal direction of tape movement in order to record data in a plurality of tracks.
High density multi-track recording may be accomplished by recording multiple data tracks onto the tape 200 using a plurality of small head elements incorporated into the head assembly 114, with each data track being written by one head element (i.e., read/write head channel). This data storage protocol is achieved using multiple tape wraps and tape wrap halves. A tape wrap consists of one outbound and one inbound recording/playback pass across the entire allocated length of the tape 200. The outbound pass represents a first wrap half while the inbound pass represents a second wrap half. There are typically multiple wraps, such as 42, recorded on the tape 200. Each wrap half extends across the entire usable portion of the tape 200.
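The relationship between wraps and wrap halves can be sketched in a few lines. The sketch assumes, purely for illustration, that wrap halves are numbered consecutively from 0 and that even-numbered halves are outbound passes; neither the numbering convention nor the function name comes from the drive itself.

```python
WRAPS_PER_TAPE = 42  # example wrap count mentioned in the text


def describe_wrap_half(wrap_half_index):
    """Map a consecutive wrap-half index to (wrap number, pass direction)."""
    wrap, half = divmod(wrap_half_index, 2)
    direction = "outbound" if half == 0 else "inbound"
    return wrap, direction


total_wrap_halves = 2 * WRAPS_PER_TAPE  # one outbound and one inbound pass per wrap
first = describe_wrap_half(0)
last = describe_wrap_half(total_wrap_halves - 1)
```

Under these assumptions, a tape with 42 wraps has 84 wrap halves, the first being the outbound half of wrap 0 and the last the inbound half of wrap 41.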
For accurate longitudinal positioning of the tape 200 relative to the head assembly 114, the servo pattern is encoded with longitudinal position (LPOS) information which represents an absolute longitudinal address that appears at set intervals 210 along the length of the tape 200. In the LTO (“Linear Tape-Open”) tape format, a unique LPOS word occurs every 7.2 mm along the tape 200. Thus, the drive can position itself longitudinally to a given LPOS with a resolution of 7.2 mm. Longitudinal resolution can be further improved, such as to 200 μm, by sub-dividing each LPOS 212. In this disclosure, for logging purposes the LPOS positions encountered during a full tape pass are aggregated into larger, more manageable units referred to as LPOS regions.
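The LPOS arithmetic above can be worked through as follows. The 7.2 mm pitch and the 200 μm sub-division come from the text; the number of LPOS words aggregated into one LPOS region (`lpos_per_region`) is a hypothetical value chosen only for illustration, since the text does not state it. Integer micrometres are used to avoid floating-point rounding.

```python
LPOS_PITCH_UM = 7200       # one LPOS word every 7.2 mm
SUB_RESOLUTION_UM = 200    # finer resolution obtained by sub-dividing each LPOS

SUBDIVISIONS_PER_LPOS = LPOS_PITCH_UM // SUB_RESOLUTION_UM


def lpos_to_um(lpos, subdivision=0):
    """Longitudinal offset in micrometres for an LPOS word and sub-division."""
    return lpos * LPOS_PITCH_UM + subdivision * SUB_RESOLUTION_UM


def lpos_region(lpos, lpos_per_region=1000):
    """Aggregate an LPOS word into a coarser logging region (region size assumed)."""
    return lpos // lpos_per_region


offset_um = lpos_to_um(10, subdivision=5)
region = lpos_region(2500)
```

Each 7.2 mm interval divides into 36 steps of 200 μm. With the assumed region size of 1000 LPOS words, one LPOS region would span 7.2 m of tape.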
After the tape 200 is mounted in the drive 100, tape processing proceeds through successive LPOS regions, wrap halves and wraps and various errors may occur. As disclosed in commonly-assigned U.S. Pat. No. 5,331,476, entitled “Apparatus and Method for Dynamically Performing Knowledge-Based Error Recovery”, which patent is incorporated herein by reference in its entirety, the microprocessor controller 120 populates data structures with detected errors, successful recovery mechanisms, background optimizations, speed and dataflow corrections and other performance information, all associated with physical locations (wrap and LPOS region) on the tape 200, over the course of the tape mount.
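A minimal sketch of a data structure keyed by physical location (wrap and LPOS region) might look like the following. The class and field names are hypothetical and are not taken from the drive microcode or from the referenced patent; the sketch only shows how errors, recovery methods and background optimizations could accumulate per location over the course of a tape mount.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class RegionLog:
    """Counters for one (wrap, LPOS region) location; the fields are illustrative."""
    errors: int = 0
    recovery_methods: list = field(default_factory=list)
    background_optimizations: int = 0


class RecoveryLogger:
    """Accumulates events by physical tape location during one mount."""

    def __init__(self):
        self._regions = defaultdict(RegionLog)

    def log_recovery(self, wrap, lpos_region, method):
        """Record a detected error together with the method that recovered it."""
        entry = self._regions[(wrap, lpos_region)]
        entry.errors += 1
        entry.recovery_methods.append(method)

    def log_optimization(self, wrap, lpos_region):
        """Record a background optimization that was not triggered by an error."""
        self._regions[(wrap, lpos_region)].background_optimizations += 1

    def region(self, wrap, lpos_region):
        """Return the log for a location (empty if nothing was recorded there)."""
        return self._regions.get((wrap, lpos_region), RegionLog())


logger = RecoveryLogger()
logger.log_recovery(wrap=10, lpos_region=3, method="S")
logger.log_recovery(wrap=10, lpos_region=3, method="C")
logger.log_optimization(wrap=10, lpos_region=4)
```

Keying every entry by location is what later allows the same raw data to be summarized at several resolutions.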
The present invention employs additional microcode executed by the microprocessor controller 120 to identify where errors occur on the tape 200 as well as the specific hardware procedures, initiated by the microprocessor controller 120, that were required to resolve the errors. Such information highlights the effects on drive performance of both reactive and preventative microcode procedures. Reactive procedures involve error recovery procedures (ERP) in response to error situations whereas preventative procedures are performed in response to thresholded information that dynamically optimizes internal operating parameters of the drive. Both kinds of ERP can induce calibration, adaptive equalizations, servo tracking changes, mechanical brushing and cartridge reseating. Because of the large amount of information which is available, the present invention presents “telescoping” views of increasing resolution.
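The distinction between reactive and preventative procedures can be sketched as a simple decision rule. The metric, the moving-average window and the threshold below are assumptions made for illustration only; the text does not specify which statistics are thresholded or how.

```python
from collections import deque


class ProcedureSelector:
    """Chooses between reactive and preventative handling (illustrative only)."""

    def __init__(self, threshold, window=4):
        self.threshold = threshold
        self.samples = deque(maxlen=window)  # recent values of some channel metric

    def observe(self, metric, error_detected=False):
        """Return 'reactive', 'preventative' or None for the latest sample."""
        self.samples.append(metric)
        if error_detected:
            # An actual error always leads to an error recovery procedure.
            return "reactive"
        mean = sum(self.samples) / len(self.samples)
        if mean > self.threshold:
            # No error yet, but the statistics have drifted past the threshold.
            return "preventative"
        return None


selector = ProcedureSelector(threshold=0.5, window=2)
decisions = [
    selector.observe(0.1),
    selector.observe(0.4),
    selector.observe(0.9),
    selector.observe(0.0, error_detected=True),
]
```

In this sketch the third sample pushes the two-sample average above the threshold and triggers a preventative action, while the fourth sample is handled reactively because an error was flagged.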
The table of
The calibration information includes full recalibrations (column 8), partial calibrations (column 9) and background channel optimization by track (columns 10 and 11). The recovery method information includes cartridge reseating attempts or “re-chucks” (column 12), read and write speed (columns 13 and 14), servo and StopWrite errors (columns 15 and 16), required reprocessing of read or write operations (columns 17 and 18) and the particular method used to recover from the errors (column 19). The servo and StopWrite errors are summarized as a bit-mask of errors detected on a given half wrap. The recovery methods are organized into five groups of related hardware modifications and are logged by the identifiers N, M, S, C and D. These groups include, but are not limited to, the following: (S) servo methods such as OppServo (in which the forward (or backward) servo readers are employed when the backward (or forward) servo readers are nominally selected), PES (for servo tracking or offset changes to better track data), AgaGain (in which the servo reader gain is changed) and MatchFilter (which refers to the method by which the servo system interprets multiple servo signal feedback in order to maintain the correct longitudinal position); (C) channel/calibration methods such as read/write channel calibration; (D) dataflow correction methods such as operating range parameter changes; (M) mechanical methods such as rechuck and stepper motor indexing (which pertains to a servo modification used to control the vertical head position within the servo bands); and (N) no method, which indicates a transient or a recovery without hardware intervention. It will be appreciated that the foregoing list of recovery methods is merely representative and that additional, fewer or other methods may be used. The Band 2 Summary of
Such detail is provided in the table of
The table of
The third column in each section provides the number of channels involved in background asymmetry cancellation table (ACT) adjustment for the indicated wrap and LPOS region while the fourth and fifth columns provide an indication of the relative write and read speeds, respectively (higher values indicate slower speeds relative to 1 as the highest supported drive speed).
The Band Summary report of
By employing the “telescoping” views provided by the present invention, pattern analysis may be performed at different resolutions. Thus, correlations of possible interactions (both positive and negative) between hardware and microcode activities that result in changes of the nominal operating point of the drive may be identified. Possible failure patterns may also be identified and fed back to design personnel and incorporated in microcode design changes for more effective ERP.
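The telescoping idea can be sketched as successive roll-ups of the same location-tagged records. The records below are made-up sample data, the function names are hypothetical, and the sketch omits per-track detail; only the group identifiers (N, M, S, C, D) and the band, wrap and LPOS region hierarchy come from the description.

```python
from collections import Counter

# Recovery method groups as described in the text.
GROUPS = {
    "N": "no method",
    "M": "mechanical",
    "S": "servo",
    "C": "channel/calibration",
    "D": "dataflow",
}

# Made-up sample records: (band, wrap, lpos_region, group identifier).
records = [
    (2, 10, 3, "S"),
    (2, 10, 3, "C"),
    (2, 10, 7, "S"),
    (2, 11, 1, "N"),
    (0, 4, 5, "M"),
]


def band_summary(recs):
    """High-level view: count of each recovery group per (band, wrap)."""
    counts = Counter()
    for band, wrap, _region, group in recs:
        counts[(band, wrap, group)] += 1
    return counts


def detail_summary(recs, band, wrap):
    """Finer view: counts per (LPOS region, group) within one wrap of one band."""
    counts = Counter()
    for b, w, region, group in recs:
        if b == band and w == wrap:
            counts[(region, group)] += 1
    return counts


high_level = band_summary(records)
zoomed = detail_summary(records, band=2, wrap=10)
```

The high-level view shows two servo recoveries on wrap 10 of band 2; zooming in reveals that they occurred in two different LPOS regions, which is the kind of pattern the coarser view alone would hide.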
It is important to note that while the present invention has been described in the context of a fully functioning data processing system, those of ordinary skill in the art will appreciate that the processes of the present invention are capable of being distributed in the form of a computer readable medium of instructions and a variety of forms and that the present invention applies regardless of the particular type of signal bearing media actually used to carry out the distribution. Examples of computer readable media include recordable-type media such as a floppy disk, a hard disk drive, a RAM, and CD-ROMs and transmission-type media such as digital and analog communication links.
The description of the present invention has been presented for purposes of illustration and description, but is not intended to be exhaustive or limited to the invention in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art. The embodiment was chosen and described in order to best explain the principles of the invention, the practical application, and to enable others of ordinary skill in the art to understand the invention for various embodiments with various modifications as are suited to the particular use contemplated. Moreover, although described above with respect to methods and systems, the need in the art may also be met with a computer program product containing instructions for multi-level mapping of tape error recoveries.
Nylander-Hill, Pamela R, Gale, Ernest S
Executed on | Assignor | Assignee | Conveyance | Frame | Reel | Doc |
Jun 30 2005 | GALE, ERNEST S | INTERNATIONAL BUSINESS MACHINES IBM CORPORATION | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 016332 | /0989 | |
Jul 15 2005 | NYLANDER-HILL, PAMELA R | INTERNATIONAL BUSINESS MACHINES IBM CORPORATION | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 016332 | /0989 | |
Jul 18 2005 | International Business Machines Corporation | (assignment on the face of the patent) | / | |||
May 03 2011 | International Business Machines Corporation | Google Inc | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 026664 | /0866 | |
Sep 29 2017 | Google Inc | GOOGLE LLC | CHANGE OF NAME SEE DOCUMENT FOR DETAILS | 044127 | /0735 |
Date | Maintenance Fee Events |
Sep 07 2007 | ASPN: Payor Number Assigned. |
May 16 2011 | REM: Maintenance Fee Reminder Mailed. |
Jun 15 2011 | M1554: Surcharge for Late Payment, Large Entity. |
Jun 15 2011 | M1551: Payment of Maintenance Fee, 4th Year, Large Entity. |
Apr 09 2015 | M1552: Payment of Maintenance Fee, 8th Year, Large Entity. |
May 27 2019 | REM: Maintenance Fee Reminder Mailed. |
Nov 11 2019 | EXP: Patent Expired for Failure to Pay Maintenance Fees. |
Aug 19 2020 | M1553: Payment of Maintenance Fee, 12th Year, Large Entity. |
Aug 19 2020 | M1558: Surcharge, Petition to Accept Pymt After Exp, Unintentional. |
Aug 19 2020 | PMFG: Petition Related to Maintenance Fees Granted. |
Aug 19 2020 | PMFP: Petition Related to Maintenance Fees Filed. |
Date | Maintenance Schedule |
Oct 09 2010 | 4 years fee payment window open |
Apr 09 2011 | 6 months grace period start (w surcharge) |
Oct 09 2011 | patent expiry (for year 4) |
Oct 09 2013 | 2 years to revive unintentionally abandoned end. (for year 4) |
Oct 09 2014 | 8 years fee payment window open |
Apr 09 2015 | 6 months grace period start (w surcharge) |
Oct 09 2015 | patent expiry (for year 8) |
Oct 09 2017 | 2 years to revive unintentionally abandoned end. (for year 8) |
Oct 09 2018 | 12 years fee payment window open |
Apr 09 2019 | 6 months grace period start (w surcharge) |
Oct 09 2019 | patent expiry (for year 12) |
Oct 09 2021 | 2 years to revive unintentionally abandoned end. (for year 12) |