A system for authenticating data of interest includes a digest locator engine capable to locate a first and a second digest result in a data file, including a set of data; a first digest creator capable to create, using a first digest function, a first digest of the set of data, the first digest function being identical to a digest function used to create the first digest result; a second digest creator capable to create, using a second digest function that is incompatible with the first digest function, a second digest of the set of data, the second digest function being identical to a second digest function used to create the second digest result; and a digest comparator engine, communicatively coupled to the digest locator, first digest creator and the second digest creator, capable to compare the first and second created digests with the first and second located digest results respectively.
1. An apparatus, comprising:
a marking node comprising a marking engine, storage, and a communications interface interconnected for data communication with one another by a communication channel;
a data of interest identifier to identify a data of interest;
a first digest creator to create a first digest of said data of interest, using a first digest function; and
a second digest creator to create a second digest of said data of interest, using a second digest function that is incompatible with said first digest function, with said first digest and said second digest undeceptively identifying said data of interest,
said marking engine to append said first digest and said second digest to a file holding said data of interest to create said file including said first digest, said second digest and said data of interest.
12. An apparatus, comprising:
an authenticating node comprising a digest comparator engine, storage, and a communications interface interconnected for data communication with one another by a communication channel;
a data of interest locator to locate a first digest result and a second digest result in a file, said file including data of interest, and said first digest result and said second digest result undeceptively identifying said data of interest;
a first digest creator to create a first digest of said data of interest, using a first digest function, said first digest function being identical to a digest function used to create said first digest result;
a second digest creator to create a second digest of said data of interest, using a second digest function that is incompatible with said first digest function, said second digest function being identical to a second digest function used to create said second digest result;
said digest comparator engine to compare said first digest and said second digest with said first digest result and said second digest result respectively to create a match of the digests to undeceivably authenticate said data of interest; and
an output interface to display a positive authentication message if said match of said digests indicates said first digest and said second digest match said first digest result and said second digest result respectively, undeceivably authenticating said data of interest.
2. The apparatus of
3. The apparatus of
5. The apparatus of
6. The apparatus of
7. The apparatus of
8. The apparatus of
9. The apparatus of
10. The apparatus of
a plurality of digest creators to create additional digests of said set of data using additional digest functions that are incompatible with one another; and
said marking engine to append said additional digests to said file holding said data of interest.
11. The apparatus of
14. The apparatus of
15. The apparatus of
16. The apparatus of
17. The apparatus of
18. The apparatus of
19. The apparatus of
said digest locator engine to locate additional digest results in said file, said data file including said data of interest;
a plurality of digest creators to create, using additional digest functions that are incompatible with one another, additional digests of data of interest, said additional digest functions being identical to said additional digest functions used to create said additional digest results in said file; and
said digest comparator engine to compare said additional digests with said additional digest results respectively to further create said match.
This application is a continuation of patent application Ser. No. 10/102,507, filed Mar. 19, 2002, now issued as U.S. Pat. No. 7,480,796 on Jan. 20, 2009, which further claims the benefit of the priority date of U.S. provisional patent application Ser. No. 60/296,820, filed Jun. 7, 2001.
This invention relates to methods of data authentication and more particularly, but not exclusively, provides a system and method for creating, attaching, and using digest results of digital data to authenticate the data.
Information that is used by computers is often stored in digital data files of various formats. A digital file refers to digital data that together and as a group have meaning or use to a party in possession of the file. A computer file refers to digital data that can be stored as a file on a medium such as a hard disk, a flash memory card, random access memory (RAM), read-only memory (ROM), CD-ROM, DVD-ROM, or any other medium or device designed to store data in digital form.
Digital files can be transported by wireless means, such as over a cellular phone or wireless modem, or through a wire such as over the Internet, a wired modem, a local area network (LAN), a wide area network (WAN), a virtual private network (VPN), or any other similar means.
Because digital data can be easily modified, it is often important to be able to verify the integrity of the file to confirm that the file or a subset of the file has not been altered. The important data to be verified, which may consist of the entire file or a subset of it, is referred to as “data of interest.”
One conventional way to verify the integrity of the data of interest has been to generate a data file that contains both the data of interest and a redundant copy of the data of interest encrypted in such a way that a comparison may be made between the data of interest and the redundant data. The comparison may be made by decrypting the redundant data and comparing the decrypted redundant data with the unencrypted data of interest in the file. Alternatively, the same encryption algorithm used to encrypt the redundant data may be applied to the data of interest and the newly encrypted version of the data of interest may be compared against the stored encrypted redundant data. These techniques have the drawback of requiring the storage of two copies of the digital data and are not desirable for large data files.
Another conventional technique involves applying a digest function to the data of interest to create a digest result. A digest result is a shorthand way of representing the data. Examples of a digest function include a simple checksum, a weighted checksum, bit operations, or other functions. In a simple checksum, all the bytes of the data of interest are added together and stored in an integer where the overflow bits are dropped (e.g., a CRC). In a weighted checksum, each byte of data is multiplied by a weight factor before being added to the other bytes. In a bit-operations technique, each byte of the data of interest is subjected to various operations. For example, each byte of data may be subjected to an exclusive or (XOR) operation with a subsequent byte of data, and the result XOR'ed with the next byte until all the bytes of the data of interest are exhausted. The resulting value is the digest result.
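The three kinds of digest function described above can be sketched as follows. This is a minimal illustration, not the implementation of any particular embodiment: the function names, the 16-bit result width, and the default weights of 2 and 4 are assumptions chosen for the example.

```python
def simple_checksum(data: bytes, bits: int = 16) -> int:
    """Add every byte and keep only the low `bits` bits (overflow is dropped)."""
    return sum(data) & ((1 << bits) - 1)


def weighted_checksum(data: bytes, weights=(2, 4), bits: int = 16) -> int:
    """Multiply each byte by a weight that repeats cyclically, then add."""
    total = sum(b * weights[i % len(weights)] for i, b in enumerate(data))
    return total & ((1 << bits) - 1)


def xor_digest(data: bytes) -> int:
    """XOR each byte into a running result until the data is exhausted."""
    result = 0
    for b in data:
        result ^= b
    return result


sample = bytes([1, 3, 5, 2])
print(simple_checksum(sample), weighted_checksum(sample), xor_digest(sample))
```

Each function reads every byte of the data of interest, so the result is far shorter than the data it represents.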
Since all bytes of the data of interest are used to calculate a digest result, if one or more bytes are altered to another value, a user can detect that a change has occurred. If the user reapplies the digest function to the changed data, the digest result will not match the digest result calculated from the original data. The ability to compare the recalculated digest result against the original digest result, and to note the difference if the data of interest has been changed, allows the user to authenticate the data and detect whether any alterations have occurred.
However, using a digest function has drawbacks. A digest function may map multiple sets of data into the same digest result. Therefore, a digest result calculated for an altered, forged set of data may be equal to the digest result calculated for the authentic set of data. To pass a forged set of data as authentic, it is possible to analyze the digest function to find out which sets of data that are different from the authentic data set yield the same digest result as the authentic set. The data of interest cannot be authenticated with any reliability if a digest function alone is used.
An example follows that demonstrates how using a digest function may yield the same digest result for different data sets. In this example, a weighted checksum is used as the digest function. Two different sets of data are subjected to this digest function. The authentic set contains the values 1, 3, 5, and 2 and the forged set contains the values 3, 2, 5, and 2. The weighted sum uses weights of 2 and 4 multiplied by the data values and repeats the multiplications periodically until all data is exhausted:
2*1 + 4*3 + 2*5 + 4*2 = 32    (1)
2*3 + 4*2 + 2*5 + 4*2 = 32    (2)
Application of this digest function to both sets of data yields the same digest result of 32.
The type of weakness demonstrated by the foregoing example can exist with other authentication schemes that are based on digest results of data. Because it is possible to map more than one set of data to the same digest result, it is possible for a clever programmer to break the authentication scheme and cause forged data to be mistaken for authentic data. In the above example, the first weight factor, the number 2, is half the second weight factor, the number 4. Increasing the first data point from 1 to 3 adds 2*2 = 4 to the weighted sum, while decreasing the second data point from 3 to 2 subtracts 4*1 = 4, thus canceling the effect of the increase in the first data point.
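The cancellation in the weighted checksum example can be checked directly. The sketch below uses the weights and data values from the example; the variable and function names are illustrative only.

```python
WEIGHTS = (2, 4)  # weights from the example, repeated cyclically


def weighted_checksum(values, weights=WEIGHTS):
    """Weighted sum with weights applied periodically over the values."""
    return sum(v * weights[i % len(weights)] for i, v in enumerate(values))


authentic = [1, 3, 5, 2]
forged = [3, 2, 5, 2]

# Per-position change introduced by the forgery, and its weighted contribution.
deltas = [f - a for a, f in zip(authentic, forged)]
weighted_deltas = [d * WEIGHTS[i % len(WEIGHTS)] for i, d in enumerate(deltas)]

print(weighted_checksum(authentic), weighted_checksum(forged))
print(deltas, weighted_deltas, sum(weighted_deltas))
```

Any alteration whose weighted contributions sum to zero leaves this digest result unchanged, which is exactly the weakness a forger can exploit.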
Therefore, a more secure system and method for verifying the integrity of a data of interest are needed that do not require storing redundant copies of data within a file.
The present invention provides a system for authenticating data using incompatible digest functions. The system comprises a marking node and an authenticating node. The marking node comprises a data of interest identifier, a first digest creator, a second digest creator and a marking engine. The data of interest identifier identifies data to be subjected to the first and second digest creators. The first and second digest creators, using a first digest function and a second digest function respectively, create digests of the data of interest. The first digest function is incompatible with the second digest function. The marking engine then appends the digests to a file holding the data of interest.
The authenticating node comprises a data of interest identifier, a first digest creator, a second digest creator, a digest locator engine and a digest comparator engine. The data of interest identifier identifies data within a file that has been subjected to two or more digest creator functions. The first and second digest creators are substantially identical to the first and second digest creators of the marking node and use the same digest functions. The digest locator engine locates the digests appended by the marking engine to the file holding the data of interest. The digest comparator engine compares the digests created by the first and second digest creators of the authenticating node with the digests appended to the file. If the digests created by the authenticating node match the stored digests, then the data is authentic. If the digests do not match, then the data is not authentic (e.g., tampered with or incorrectly copied/transmitted).
The present invention further provides a method for marking data for authentication. The method comprises: identifying data of interest; creating, using a first function, a first digest for the data of interest; creating, using a second function that is incompatible with the first function, a second digest for the data of interest; identifying a location in the data of interest or file holding the data of interest to append the digests; and appending the digests to the identified location.
The present invention further provides a method of authenticating data that has been marked. The method comprises: locating appended digests; identifying data of interest; creating a first digest, using the first function, of the identified data of interest; creating a second digest, using the second function, of the identified data of interest; and comparing the created digests with located appended digests to verify authenticity of the data of interest.
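The marking and authenticating methods can be sketched together as follows. This is a simplified illustration under stated assumptions, not the disclosed implementation: the digests are assumed to be appended as an 8-byte trailer at the end of the file, both digest functions are weighted checksums, and the second weight set (1, 3, 5) and all names are hypothetical.

```python
import struct

# Hypothetical trailer layout: two 32-bit big-endian digests at the end of the file.
TRAILER = struct.Struct(">II")


def weighted_checksum(data: bytes, weights) -> int:
    total = sum(b * weights[i % len(weights)] for i, b in enumerate(data))
    return total & 0xFFFFFFFF


def digest_one(data: bytes) -> int:
    return weighted_checksum(data, (2, 4))      # weights from the background example


def digest_two(data: bytes) -> int:
    return weighted_checksum(data, (1, 3, 5))   # assumed second weight set


def mark(data_of_interest: bytes) -> bytes:
    """Create both digests and append them to the data of interest."""
    trailer = TRAILER.pack(digest_one(data_of_interest), digest_two(data_of_interest))
    return data_of_interest + trailer


def authenticate(marked: bytes) -> bool:
    """Locate the appended digests, recompute both, and compare."""
    if len(marked) < TRAILER.size:
        return False
    data, trailer = marked[:-TRAILER.size], marked[-TRAILER.size:]
    stored_one, stored_two = TRAILER.unpack(trailer)
    return digest_one(data) == stored_one and digest_two(data) == stored_two


marked = mark(bytes([1, 3, 5, 2]))
forged = bytes([3, 2, 5, 2]) + marked[-TRAILER.size:]
print(authenticate(marked), authenticate(forged))
```

In this sketch the forged values from the background example still match the first digest but fail the second, so the comparison rejects them.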
Accordingly, the system and methods advantageously enable authentication of data.
Non-limiting and non-exhaustive embodiments of the present invention are described with reference to the following figures, wherein like reference numerals refer to like parts throughout unless otherwise specified.
A method and system are disclosed for authenticating data of interest that make forgery nearly impossible. While using one digest function creates a digest result that can be forged, using two or more incompatible digest functions yields improved security.
The marking node 110 is connected to the data generator 105. The data generator 105 may be a digital camera, word processor, scanner, microphone, keyboard, or any other device capable of creating digital data. The data generator 105 communicates the digital data to the marking node 110 automatically or upon request. Alternatively, the data may be transferred to the marking node 110 by a compact disk or other data storage medium. It will be appreciated that the data generator 105 and all or parts of the marking node 110 may be integral to the same device such as a digital camera used by police.
The marking node 110 comprises a marker system 111, a data file 114, which includes data of interest 113 and digest results 112 corresponding to the data of interest 113, and a communications engine 115. By creating two or more digest results 112 that correspond to the data of interest and storing the digest results in the file 114 containing the data of interest 113, the marker system 111 enables authentication of the data of interest 113. The details of the marker system will be provided with the discussion of
The authenticating node 130 comprises an authenticator system 131, a data file 124, and a communications engine 132. The data file 124 includes data of interest 123, that may be authentic or forged, and digest results 112 that were formed from the authentic data of interest 113. The authenticator system 131 performs the task of finding the digest results 112 within the data file 124 and, based on the digest results, determines whether the data of interest 123 received is authentic. Details of the authenticator system 131 will be provided with reference to
The marker 111 and authenticator 131 systems will likely reside in permanent storage 208 before they are retrieved into the working memory 209 to operate on data files 114 and 124 that are received through the input devices 203 or are available from a computer readable storage medium 206. The authentication results may be communicated to the user via the communication interface 207 or the output devices 204 or stored in storage 208 or working memory 209 of the computer.
It is appreciated that if an authentic copy of the file 114 containing the data of interest 113 is received by the authenticating node 130, then the file 124 at the authenticating node 130 will be the same as file 114 and the data of interest 123 in that file 124 will be the same as data of interest 113. Different reference numerals used for the data files 114 and 124 and the data of interest 113 and 123 at the marking node 110 and the authenticating node 130 respectively reflect the scenario that a forged set of data of interest 123 may be present at the authenticating node 130. The digest results 112 refer to the results of performing the digest operations on the authentic data of interest 113 that are attached to that data 113 and are assumed to be unaltered.
It is appreciated that not all of the order presented in the flowchart of
Digest function one 720 and digest function two 730 are incompatible in the following sense. Suppose that applying digest function one to a first data set yields digest result one, and that applying digest function two to the same first data set yields digest result two. Then, for a second, different data set, if digest function two yields digest result two, digest function one will likely not yield digest result one; conversely, if digest function one yields digest result one, digest function two will likely not yield digest result two. In other words, in the case of incompatible digest functions there is a very small possibility that two or more differing data sets simultaneously satisfy all digest functions identically.
The two digest functions are chosen such that running another set of data points through them is not likely to yield the same two digest results simultaneously. Therefore, a forged data set is unlikely to be passed off in place of an authentic set.
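The effect of requiring two digest results to match at once can be illustrated by brute force on a toy case. The sketch below assumes two weighted checksums, one with the weights 2 and 4 from the background example and one with a hypothetical weight set (1, 3, 5), and searches all four-value data sets with single-digit values.

```python
from itertools import product


def weighted_checksum(values, weights):
    return sum(v * weights[i % len(weights)] for i, v in enumerate(values))


W1 = (2, 4)      # weights from the background example (period 2)
W2 = (1, 3, 5)   # assumed second weight set (period 3)

authentic = (1, 3, 5, 2)
target_one = weighted_checksum(authentic, W1)
target_two = weighted_checksum(authentic, W2)

# Every four-value data set with values 0..9 (an arbitrary toy search space).
candidates = list(product(range(10), repeat=len(authentic)))
match_one = {c for c in candidates if weighted_checksum(c, W1) == target_one}
match_both = {c for c in match_one if weighted_checksum(c, W2) == target_two}

print(len(match_one), len(match_both))
```

The data sets matching both digest results are a strict subset of those matching the first alone; the forged set (3, 2, 5, 2) is excluded. With functions this simple some joint collisions remain, for example (5, 2, 5, 1), which is why digest functions with larger periods and larger result spaces are preferable in practice.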
A person skilled in the art will recognize that the approach of using two incompatible digest functions can be generalized to two or more incompatible digest functions. Further, the digest functions may be of many types, such as hash functions, checksums, weighted checksums, one-way encryption functions or any other type of digest function that can be applied to a set of data.
Digest functions may be periodic or aperiodic. Periodic digest functions, such as the weighted sum function depicted in
The overall period of a number of incompatible digest functions used together is equal to the product of the periods of the functions, provided the periods share no common factor; in general it is the least common multiple of the periods. For improved security, it is desired to use digest functions with large periods and to have a different period length for each digest function used. Security is further enhanced if the overall period length of the combination of digest functions exceeds the length of the data of interest.
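The overall period of a combination of periodic digest functions can be computed and checked as in the sketch below. It treats the period of a weighted checksum as the length of its repeating weight sequence; the weight sets and function names are assumptions for illustration. The combined pattern repeats after the least common multiple of the individual periods, which coincides with their product when the periods have no common factor.

```python
from functools import reduce
from math import gcd


def combined_period(periods):
    """Least common multiple of the individual periods."""
    return reduce(lambda a, b: a * b // gcd(a, b), periods, 1)


def weight_pattern(weight_sets, length):
    """At each position, the tuple of weights applied by every digest function."""
    return [tuple(w[i % len(w)] for w in weight_sets) for i in range(length)]


weight_sets = [(2, 4), (1, 3, 5)]   # periods 2 and 3 (assumed example weights)
period = combined_period([len(w) for w in weight_sets])
pattern = weight_pattern(weight_sets, 2 * period)

print(period)                                  # periods 2 and 3 combine to 6
print(pattern[:period] == pattern[period:])    # the joint pattern repeats
print(combined_period([4, 6]))                 # shared factor: 12, not 24
```

Choosing period lengths with no common factor therefore gives the longest overall period for a given set of functions.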
Digest functions with no set period, such as the XOR function, do not suffer from the periodic effect. The length of the period for these functions can be considered to be infinite. In order to form a set of incompatible digest functions, both periodic and aperiodic digest functions may be included. For additional security, it is preferred but not required that digest functions of each type be mixed together to form the set of digest functions that is applied to the data of interest.
An important issue regarding authentication of data files arises from the need to hide the digest results so that they are not apparent to a file parser.
The foregoing description of the embodiments is by way of example only, and other variations and modifications of the above-described embodiments and methods are possible in light of the foregoing teaching. The embodiments described herein are not intended to be exhaustive or limiting. The present invention is limited only by the following claims.
Executed on | Assignor | Assignee | Conveyance | Frame | Reel | Doc |
Jan 20 2009 | Kwan Software Engineering, Inc. | (assignment on the face of the patent) | / | |||
Mar 27 2009 | KWAN, JOHN MAN KWONG | KWAN SOFTWARE ENGINEERING, INC | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 022498 | /0461 | |
May 01 2014 | KWAN SOFTWARE ENGINEERING, INC | HARRINGTON, KEVIN J | SECURITY INTEREST SEE DOCUMENT FOR DETAILS | 032805 | /0108 |
Date | Maintenance Fee Events |
Dec 09 2016 | REM: Maintenance Fee Reminder Mailed. |
Apr 28 2017 | M2551: Payment of Maintenance Fee, 4th Yr, Small Entity. |
Apr 28 2017 | M2554: Surcharge for late Payment, Small Entity. |
Oct 07 2020 | M2552: Payment of Maintenance Fee, 8th Yr, Small Entity. |
Dec 16 2024 | REM: Maintenance Fee Reminder Mailed. |
Date | Maintenance Schedule |
Apr 30 2016 | 4 years fee payment window open |
Oct 30 2016 | 6 months grace period start (w surcharge) |
Apr 30 2017 | patent expiry (for year 4) |
Apr 30 2019 | 2 years to revive unintentionally abandoned end. (for year 4) |
Apr 30 2020 | 8 years fee payment window open |
Oct 30 2020 | 6 months grace period start (w surcharge) |
Apr 30 2021 | patent expiry (for year 8) |
Apr 30 2023 | 2 years to revive unintentionally abandoned end. (for year 8) |
Apr 30 2024 | 12 years fee payment window open |
Oct 30 2024 | 6 months grace period start (w surcharge) |
Apr 30 2025 | patent expiry (for year 12) |
Apr 30 2027 | 2 years to revive unintentionally abandoned end. (for year 12) |