A structure and technique for preventing collisions using a hash table in conjunction with a content addressable memory (CAM) to identify and prevent collisions of binary keys. A portion of the hash value of a binary key, which does not collide with the corresponding portion of the hash value of any other reference binary key, is used to select an entry in the hash table. If two or more binary keys have identical values of the portions of the hash values, each of these binary keys is stored, in its entirety, in the CAM. The key in the CAM provides a pointer to a data structure where the action associated with that binary key is stored. If the binary key is not found in the CAM, the binary key is hashed, and a specific entry in the hash table is selected using a portion of this hash value.

Patent: 7349397
Priority: May 13 2002
Filed: Aug 03 2006
Issued: Mar 25 2008
Expiry: May 13 2022

TERM.DISCL.
Assg.orig Entity: Large
5
12
all paid
11. A method for preventing collisions between two or more binary keys wherein each binary key corresponds to an action to be taken, comprising:
providing a hash table having a plurality of entries, each entry associated with a binary key and indexed by a selected portion of a hash value of said associated binary key, each entry pointing to a location in a data structure for storing the non-selected portion of, or the entire hash value of, the binary key and action data corresponding to the value of the binary key;
providing a content addressable memory (cam) having a plurality of entries, each configured to store a binary key, or a value unique to a binary key, and an association to a corresponding action associated therewith; each entry in said hash table having an entry and a pointer to said data structure using a selected one portion of a first hash value of a first binary key as an index into the hash table when and only when a selected one portion of the first hash value is not the selected one portion of the hash value of any other binary key using said cam and hash table to prevent collision of any binary keys;
presenting a second binary key for insertion into one of the hash table and the cam;
creating a second hash value of the second binary key;
searching the hash table using a first portion of the second hash value;
detecting that the hash table includes an entry indexed by the first portion of the second hash value for a third binary key;
creating a first entry in the cam indexed by the second binary key and storing a pointer to a data structure corresponding to the first entry;
creating a second entry in the cam indexed by the third binary key and storing a pointer to a data structure corresponding to the second entry; and
deleting the entry in the hash table indexed by the first portion of the second hash value and data corresponding to the entry.
1. A method of preventing collisions between two or more binary keys wherein each binary key corresponds to an action to be taken, comprising the steps of:
providing a hash table having a plurality of entries, each entry associated with a binary key and indexed by a selected portion of a hash value of said associated binary key, each entry pointing to a location in a data structure for storing the non-selected portion of, or the entire hash value of, the binary key and action data corresponding to the value of the binary key, and a content addressable memory (cam) having a plurality of entries, each configured to store a binary key, or a value unique to a binary key, and an association to a corresponding action associated therewith;
storing in said hash table a pointer to said data structure using a selected one portion of a first hash value of a first binary key as an index into the hash table when and only when a selected one portion of the first hash value is not the selected one portion of the hash value of any other binary key, and storing in the cam the first binary key or a value unique to the first binary key, and establishing an association between the associated cam entry location and a location of an associated data structure, when and only when the selected portion of the first hash value of the first binary key is the same as the selected portion of the hash value of one or more other binary keys;
presenting a second binary key for insertion into one of the hash table and the cam;
creating a second hash value of the second binary key;
searching the hash table using a first portion of the second hash value;
detecting that the hash table includes an entry indexed by the first portion of the second hash value for a third binary key;
creating an entry in the cam indexed by the second binary key;
creating an entry in the cam indexed by the third binary key; and
deleting the entry in the hash table indexed by the first portion of the second hash value.
2. The invention as defined in claim 1 further characterized by:
presenting a binary key for action; searching the cam to see if the binary key is stored in said cam;
if said binary key is found, using established association with said data structure in order to access the action data associated therewith;
hashing the binary key and accessing an entry in the hash table using a portion of resulting hashed binary key as an index into said hash table to determine if the selected entry of the hash table is valid;
if the entry is valid, then pointing to said data structure in order to access the action data associated therewith.
3. The invention as defined in claim 1 wherein said cam and said hash table are searched simultaneously.
4. The invention as defined in claim 1 wherein said cam is searched first, and said hash table is searched if and only if the value of the second binary key is not found in the cam.
5. The invention as defined in claim 1 wherein said second binary key is stored unaltered in the cam.
6. The invention as defined in claim 2 wherein said second binary key is stored unaltered in the cam.
7. The invention as defined in claim 3 wherein said second binary key is stored unaltered in the cam.
8. The invention as defined in claim 4 wherein said second binary key is stored unaltered in the cam.
9. The invention as defined in claim 1 wherein said action is stored in the cam.
10. The invention as defined in claim 1 wherein said action is directed by an offset value in said cam location.
12. The invention as defined in claim 11 wherein said value stored in the cam is the unaltered value of each key.
13. The invention as defined in claim 11 wherein said action related to each entry in said cam is stored in said cam.
14. The invention as defined in claim 12 wherein said action related to each entry in said cam is stored in said cam.
15. The invention as defined in claim 11 wherein said action related to each entry in said cam is designated by an offset.
16. The invention as defined in claim 12 wherein said action related to each entry in said cam is designated by an offset.

This application is a continuation of application Ser. No. 10/144,610, filed May 13, 2002, which has issued as U.S. Pat. No. 7,116,664.

This invention relates to a method and structure for preventing collisions between two or more stored hash values of binary keys to action items in a network environment.

In certain networks, specific fields within message headers are used as binary keys to search data structures for specific details regarding actions necessary for appropriate processing of those messages. The length of a binary key is dependent on the size of the field(s) used to create the key. A few example key lengths may include 32 bits for an IP address, 48 bits for an Ethernet MAC address, or 104 bits for a TCP/IP 5-tuple. It is impractical to use these keys in their full form to directly address corresponding entries due to the length of the keys. This can theoretically be done in content addressable memory (CAM), but typically creates practical disadvantages because of the cost of a CAM of such size. Hence, a common approach is to hash the value of the binary key and use a pre-selected first portion of the hashed value to address a specific entry in a hash table. Hashing can be accomplished by creating a new value of the binary key having the same number of bits, which are unique to any given binary key, and then using only a portion of the bits, e.g. the first N bits to select the corresponding hash table entry. This value is then used to address a specific entry in a hash table, sometimes referred to as a direct table DT. Either the entire hashed value or the remaining portion of the hashed value is stored in a data structure, together with the corresponding function-specific data denoted by the binary key. Whenever a binary key is extracted from received messages, its value is hashed and the first portion of the hash value is used to access an entry in the hash table. If a valid hash table entry is found, that location in the hash table points to a data structure containing a complementary portion of a reference hash value that is compared with the equivalent complementary portion of the hash value generated from the message key to confirm the validity of the key and declare the associated action if the key is, in fact, valid. 
This works well for some numbers; however, in some cases, the first portion of the hashed value of one binary reference key is the same as the first portion of the hashed value of another binary reference key. This occurs because only a portion of the newly created value of the binary key is used to select an entry in the hash table and, hence, this portion of the new value of one binary key may be the same as that of another binary key. This is often referred to as a “collision”. In the past, collisions have been dealt with by the use of Patricia tree structures or the like, but this approach is cumbersome and relatively slow. Hence, a faster, relatively inexpensive technique is needed.
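The collision described above can be illustrated with a minimal Python sketch. The 32-bit key width, the 8-bit index width, the multiplicative hash constant, and the names `hash_key`, `index_bits`, and `first_collision` are assumptions chosen for illustration; the text does not specify a particular hash function.

```python
M = 32                      # key width in bits (assumed; e.g. an IPv4 address)
N = 8                       # bits used to index the direct table (assumed)
MASK = (1 << M) - 1
MULT = 0x9E3779B1           # odd constant, so x -> x*MULT mod 2**M is one-to-one


def hash_key(x):
    """Hypothetical hash: an M-bit value that is unique to each M-bit key."""
    return (x * MULT) & MASK


def index_bits(x):
    """First (most significant) N bits of the hashed key."""
    return hash_key(x) >> (M - N)


def first_collision(keys):
    """Return the first pair of distinct keys whose index bits coincide."""
    seen = {}
    for key in keys:
        idx = index_bits(key)
        if idx in seen:
            return seen[idx], key
        seen[idx] = key
    return None


# 1000 distinct keys cannot all fit into 2**8 = 256 table entries,
# so at least one pair must share the same index bits.
pair = first_collision(range(1000))
a, b = pair
print(f"keys {a} and {b} both select direct-table entry {index_bits(a)}")
```

Although the full hashed values of the two keys differ, the direct table alone cannot tell them apart, which is the situation the remainder of the description addresses.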

The present invention provides a structure and technique for totally preventing collisions by using a hash table or direct table DT in conjunction with a content addressable memory (CAM) to identify and prevent any collisions of selected first portions of hash values, i.e., identified first portions of different binary keys. In operation, a selected portion of the hash value of any reference binary key that does not collide with an identified selected portion of the hash value of any other reference binary key is used to select an entry in the direct table. Each location addressed by the selected portion of a hash value holds a pointer to a data structure where the action represented by the binary key corresponding to that hashed value and the remaining portion of the hash value, or the entire hash value, are stored. However, if it is determined that two or more binary keys have identical values of the selected first portions of the hash values, each of these binary keys in its entirety, or an identification specific to the key, is stored in the CAM, with the entry number of the matching CAM entry providing a pointer to a data structure where the action associated with that binary key is stored. Such binary keys, which are different but have the same value of the selected portion of their hash values, are not associated with entries stored in the hash table.

In operation, when a binary key is presented for search, the CAM is first searched to see if the binary key is stored in the CAM. If it is, the location in the CAM at which it is stored is mapped into an address pointer to the data structure containing details regarding the action to be performed. If the binary key is not found in the CAM, which indicates that there are no collisions, the binary key is hashed, and a specific entry in the hash table is selected by using the first portion of this hash value as an offset into the hash table. If the selected first portion of the hash value accesses a valid entry in the hash table, that entry contains a pointer to the data structure containing details regarding appropriate actions for processing the associated message. Either the remainder of the hashed value, or the entire hashed value, is also stored in this data structure, so it can be compared with the hashed key constructed from the message during the process of accessing the data structure. If the remainder or total hashed key stored in the data structure matches the hashed key used for the search, the search process has identified the desired match, and the associated action defined by data in the structure is indicated. If the remainder of the hash value does not compare, or if the selected portion of the hash value selects an invalid entry in the hash table, a no-match indication is given, and the associated software performs appropriate default actions. In any event, potential collisions have been avoided by using conventional data tables for storing selected portions of the hash values of binary keys where there is no collision; in those few cases where there would be a collision based on selected portions of hash values, the collision is anticipated and avoided by storing the binary keys in a CAM with pointers to the associated actions.

FIG. 1 is a diagram of the structure of this invention; and

FIG. 2 is a flow diagram of one search protocol according to this invention.

FIG. 1 is a high level view of the configuration of the present invention. A combination of a content addressable memory (CAM) 10 and a hash table or direct table DT 12 is shown. A data structure 14 is also shown having a CAM portion and hash table portion. Hardware 16 is provided which will perform a hashing function of a binary key. An alternate embodiment of the invention includes software for implementing these hash functions. In either case, software is typically used to implement the reverse hashing required by Insert procedures to construct a binary key from the selected portions of the hashed value. A control function (either hardware or software) 18 is provided to control the operation of the CAM 10 and hashing function 16 responsive to a compare function 20. The underlying premise of the invention is that CAMs 10 are relatively expensive but do function well to provide a positive indication of a match of a binary bit number being delivered thereto. On the other hand, direct tables or hash tables DT 12 are relatively inexpensive and can provide maximum storage of entries corresponding to selected segments of hashed values at a minimum cost. However, when a binary number is hashed and a selected portion of the hashed value is used to identify the entire hashed value and, hence, the binary key value, there is a possibility that two different binary keys or binary numbers will have the same value of the selected portion of the hashed value. Briefly, hashing, as used herein, refers to generating a number of bits in a selected manner from the bits of a given binary number, such as a binary key, and then using a certain predetermined portion of the number of those bits to identify the binary key value, e.g., a typical binary key may have 32, 48, 104 bits or more, and the first N bits are used to select an entry from the hash table, where N may be limited in practical implementations to 20 or less.
A technique for providing a hash function and reverse hash function is shown in commonly owned application Ser. No. 09/210,222, filed Dec. 10, 1998, now U.S. Pat. No. 6,785,278, which is incorporated herein by reference.

In hashing, it should be understood that for any given binary key, X, there is a hash function for generating a hashed key, H(X), having the same number of bits, x, as in the original binary key, X. H(X) may be partitioned into two segments, h(X) and h′(X), with h(X) having a fixed number of bits N in a specific place so that the number of bits in h(X) is greater than zero (0) and less than the total number of bits in the key X. The segment h′(X) is the complementary function of h(X), so that h′(X) has x-N bits. Thus, h(X) concatenated with h′(X) reconstructs the hashed key H(X). Moreover, knowing both the hash function h(X) and the complement h′(X) allows the value X of the binary key to be recalculated precisely.
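The partition of H(X) into h(X) and h′(X), and the recovery of X from the two segments, can be sketched in Python as follows. The multiplicative hash here is an assumption standing in for the hash function of the referenced application, chosen only because it is easy to invert; the widths and the names `hash_key`, `split`, `join`, and `unhash` are likewise hypothetical. The call `pow(MULT, -1, 1 << M)` requires Python 3.8 or later.

```python
M, N = 32, 8                       # key width and index width (assumed values)
MASK = (1 << M) - 1
MULT = 0x9E3779B1                  # odd, therefore invertible modulo 2**M
MULT_INV = pow(MULT, -1, 1 << M)   # modular inverse used for reverse hashing


def hash_key(x):
    """H(X): an M-bit hashed key with the same number of bits as X."""
    return (x * MULT) & MASK


def split(hx):
    """Partition H(X) into h(X) (first N bits) and h'(X) (remaining M-N bits)."""
    return hx >> (M - N), hx & ((1 << (M - N)) - 1)


def join(h, h_rest):
    """Concatenate h(X) with h'(X) to reconstruct H(X)."""
    return (h << (M - N)) | h_rest


def unhash(hx):
    """Reverse hash: recover X from H(X)."""
    return (hx * MULT_INV) & MASK


x = 0xC0A80001                     # example key (the IPv4 address 192.168.0.1)
h, h_rest = split(hash_key(x))
assert 0 <= h < 2 ** N             # h(X) has N bits
assert 0 <= h_rest < 2 ** (M - N)  # h'(X) has M-N bits
assert join(h, h_rest) == hash_key(x)
assert unhash(join(h, h_rest)) == x
```

Any one-to-one M-bit function with a known inverse would serve equally well in this sketch; the reversibility is what allows a key held only as h(X) and h′(X) to be reconstructed later.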

Thus, it is possible for the selected first portions of the hashed values of two or more different binary keys to be identical. Such a condition is known as a collision, and collisions must be avoided so that, when a key is presented for search, it points unambiguously to the action associated with that key.

In many network systems, different binary keys are typically contained in the header of a message that is being distributed within the system, and are used to guide actions taken on these messages by networking devices. Each binary key corresponds to attributes or details of actions to be taken in processing a message containing the key. For example, an IP destination address may be used as a key to access data structures identifying the next hop address, target port to be used for transmitting the data, transmit vs. discard indication, etc. When this particular key is presented within the system for search and execution, the key will be used to locate details of actions to be taken, and the system will take those actions based upon the particular action data associated with the binary key. Thus, whenever a particular key is presented, this must be recognized as a unique binary key and a pointer declares the action indicated by the key.

According to the present invention, a CAM 10 is used in conjunction with a hash table 12, a data structure 14, and hardware 16 to perform a hashing and unhashing function to effectively utilize the capability of the CAM while minimizing its size and, thus, its cost and using the hashed value in a hash table when the CAM is not needed.

According to the present invention, a hash function accepts a binary key, X, consisting of M bits, and computes a corresponding hashed key, H(X), that also consists of M bits. A selected first portion, h(X), of hashed key, H(X), is used to map to a corresponding entry in a hash table. The selected first portion, h(X), consists of N bits (where N<M). Likewise, the hash table uses an N-bit address to select one of 2^N entries. The output of a search is uniquely determined by the full M bits of X or H(X). However, the first N bits (i.e. h(X)) might not correlate to a single unique entry. A complementary function, h′(X), consisting of M-N bits, is, therefore, defined as the remainder from H(X), after h(X) has been segmented from it. This complementary function is used to validate the uniqueness of an entry in the hash table via comparison with a stored equivalent, h′(x). The invention operates as follows. If the selected portion h(X) of the hashed value of two or more binary keys is the same value, then each of these binary keys, or a value unique to each key, is stored in the CAM 10, with the location of each CAM entry associated with the address of a corresponding data structure containing appropriate data to guide message processing actions. The location of the data structure can be an offset value based on the location of the matching CAM entry, or the data structure itself could be contained within the CAM 10, or any other technique could be used to recognize and initiate action. (The technique for insertion and deletion of values into the CAM will be described presently.)
If, however, the selected first portion h(X) of the hash value of any binary key is unique and the binary key is not stored in the CAM 10, then the selected portion of the hash value h(X) is used to access a specific entry in the hash or direct table DT 12 at a particular location, with a pointer from that location to the data structure having a corresponding action of the binary key having that hashed value. The remainder h′(X), or the hash value H(X), is also included in the data structure. If the remainder or all of the hash value stored in the data structure matches the hashed key used to locate the data structure, then the action is declared. Thus, in operation, when a binary key is presented, a comparison is first made in the CAM 10 to see if the binary key is stored. It will be remembered that the only binary key numbers stored in the CAM in their entirety (i.e., in their unhashed value) are those binary keys which have selected first portions of their hash values that are identical to the selected first portion of some other binary key. Thus, there are a minimum number of binary keys that need to be stored in the CAM 10. If the value corresponding to the binary key is found in the CAM 10, then a pointer from that entry points to the data structure where action to be taken is declared, or the location stores the required action. If, however, the binary key is not found in the CAM 10, then the binary key is hashed and the selected portion h(X) of the hash value is used to access the hash table or direct table DT 12. If a valid entry is found, that means that there are no other identical selected first portions of the hash values, and so a pointer from that value in the hash table or direct table DT 12 points to the data structure containing the remainder of the hash value and the action of the binary key.

In a preferred embodiment, once a binary key is placed into the CAM (due to a collision in the hash table), it will remain in the CAM even if the other colliding entries are eventually deleted via administrative table maintenance. A table maintenance task can manage these situations by periodically hashing each binary key in the CAM, and searching for entries that are unique in the selected first portion h(X). Any CAM entries identified to have a unique selected first portion h(X) can then be added to the hash table and removed from the CAM. Those skilled in the art will recognize that more complex implementations are possible that would maintain a separate data structure or an additional segment of the base data structure to identify which CAM entries have matching selected first portions h(X). Such additional data structures can enable the delete process to test an entry being deleted from the CAM to determine if the deletion would result in a remaining CAM entry that no longer matches other entries in the selected first portion h(X), thus enabling that remaining CAM entry to be moved from the CAM into the hash table.
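The periodic table maintenance task described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the CAM is modeled as a dictionary from key to action, the hash table as a dictionary from h(X) to a pair of h′(X) and action, and the multiplicative hash, the widths, and every function name are hypothetical.

```python
from collections import defaultdict

M, N = 32, 8                       # key width and index width (assumed values)
MASK = (1 << M) - 1
MULT = 0x9E3779B1                  # assumed invertible hash constant
MULT_INV = pow(MULT, -1, 1 << M)


def hash_key(x):
    return (x * MULT) & MASK


def split(hx):
    return hx >> (M - N), hx & ((1 << (M - N)) - 1)


def join(h, h_rest):
    return (h << (M - N)) | h_rest


def unhash(hx):
    return (hx * MULT_INV) & MASK


def migrate_unique_entries(cam, table):
    """Move CAM keys whose h(X) no longer collides back into the hash table."""
    groups = defaultdict(list)
    for key in cam:                        # hash every key held in the CAM
        h, _ = split(hash_key(key))
        groups[h].append(key)
    moved = []
    for h, keys in groups.items():
        # Unique among CAM entries, and the table slot is free.
        if len(keys) == 1 and h not in table:
            key = keys[0]
            _, h_rest = split(hash_key(key))
            table[h] = (h_rest, cam.pop(key))
            moved.append(key)
    return moved


# Demo keys built by reverse hashing so that k1 and k2 share h(X) = 5.
k1, k2, k3 = unhash(join(5, 1)), unhash(join(5, 2)), unhash(join(9, 7))
cam = {k1: "A", k2: "B", k3: "C"}
table = {}
first_pass = migrate_unique_entries(cam, table)   # only k3 is unique
del cam[k2]                                       # administrative deletion
second_pass = migrate_unique_entries(cam, table)  # k1 is now unique
```

The check that the table slot is free is a defensive addition in this sketch; under the insertion procedure described below, a valid table entry and a CAM entry with the same h(X) can coexist, in which case the CAM entry should stay where it is.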

Thus, the search policy can be characterized as follows where a key X is presented for search:

The following designations are used in the description of the Search policy, Insertion policy, and Deletion policy:

SEARCH POLICY (FIG. 2)
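A minimal Python sketch of the search procedure described above (CAM first, then the hash table with a compare on the stored remainder) might look like the following. The dictionary-based CAM and table, the multiplicative hash, the widths, the example actions, and the function names are all assumptions made for illustration.

```python
M, N = 32, 8                       # key width and index width (assumed values)
MASK = (1 << M) - 1
MULT = 0x9E3779B1                  # assumed hash constant


def hash_key(x):
    return (x * MULT) & MASK


def split(hx):
    return hx >> (M - N), hx & ((1 << (M - N)) - 1)


def search(key, cam, table):
    """Return the action for key, or None for a no-match indication."""
    # Step 1: look for the full, unhashed key in the CAM.
    if key in cam:
        return cam[key]
    # Step 2: hash the key and index the table with the first N bits.
    h, h_rest = split(hash_key(key))
    entry = table.get(h)
    if entry is None:              # invalid (empty) table entry
        return None
    # Step 3: compare the stored remainder with the remainder of this key.
    stored_rest, action = entry
    return action if stored_rest == h_rest else None


cam = {0x0A000001: "forward to port 3"}
key = 0xC0A80001
h, h_rest = split(hash_key(key))
table = {h: (h_rest, "forward to port 7")}

print(search(0x0A000001, cam, table))   # found in the CAM
print(search(key, cam, table))          # found through the hash table
print(search(0xC0A80002, cam, table))   # no match
```

A `None` result corresponds to the no-match indication, after which the surrounding software would apply its default action.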

In another embodiment, both the hash table 12 and CAM 10 are searched simultaneously. In yet another embodiment, the hash table 12 is searched first rather than the CAM 10.

For insertion of an entry corresponding to a binary key in the CAM 10 or in the hash table or direct table DT 12, the following steps are performed. A binary key X and action A are presented for insertion. First, the M-bit binary key X is sought in the CAM 10, and if X is found, then the system will write A over the existing action and end the procedure. If X is not found, then X is hashed and the entry in hash table or direct table DT 12 is accessed using an N-bit address corresponding to h(X) to see if a valid entry is found there. If the hash table entry is invalid, then h(X) is used to store a pointer to a data structure containing h′(X) and action A corresponding to X, and the program is ended. If an entry exists at index h(X) and contains a pointer to an obsolete version of action A corresponding to X (i.e., the stored complement equals h′(X)), then action data A is updated in the corresponding data structure. However, if an entry exists at index h(X) and contains a pointer to an action B corresponding to a different key Y (i.e., h(X)=h(Y) but h′(X)≠h′(Y)), then the binary key X is entered into the CAM 10, with the entry index corresponding to the location of a new data structure containing action A. Following this, binary key Y is recreated from the hash h(Y), or equivalently h(X), and the complement h′(Y) stored in the data structure pointed to by the hash table entry indexed by h(X), and then binary key Y is entered into the CAM 10 with the entry index corresponding to the location of a new data structure containing action B. The entry corresponding to h(Y) is deleted in the hash table 12 (i.e., marked invalid), the original data structure containing action B is deleted (since this data is moved to a location corresponding to the CAM entry index for Y), and the program is ended.

An alternate implementation includes the step of copying the pointer from the hash table entry to a small data portion of the new CAM entry. In this alternative, the data structure does not have to be moved, since the new pointer continues to point to the same data structure location. The Insertion Policy can be characterized as follows where a key X and action A are presented for insertion.

INSERTION POLICY
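The insertion steps described above can be sketched in Python as follows. The sketch assumes a dictionary-based CAM and hash table, an invertible multiplicative hash standing in for the hash and reverse hash functions, and hypothetical names and return strings throughout.

```python
M, N = 32, 8                       # key width and index width (assumed values)
MASK = (1 << M) - 1
MULT = 0x9E3779B1                  # assumed invertible hash constant
MULT_INV = pow(MULT, -1, 1 << M)


def hash_key(x):
    return (x * MULT) & MASK


def split(hx):
    return hx >> (M - N), hx & ((1 << (M - N)) - 1)


def join(h, h_rest):
    return (h << (M - N)) | h_rest


def unhash(hx):
    return (hx * MULT_INV) & MASK


def insert(key, action, cam, table):
    """Insert (key, action) into the CAM or the hash table."""
    if key in cam:                            # X already in the CAM: overwrite
        cam[key] = action
        return "updated in CAM"
    h, h_rest = split(hash_key(key))
    entry = table.get(h)
    if entry is None:                         # invalid table entry: use it
        table[h] = (h_rest, action)
        return "added to hash table"
    stored_rest, stored_action = entry
    if stored_rest == h_rest:                 # same key: refresh the action
        table[h] = (h_rest, action)
        return "updated in hash table"
    # Collision with a different key Y: recreate Y by reverse hashing,
    # put both X and Y in the CAM, and invalidate the table entry.
    other_key = unhash(join(h, stored_rest))
    cam[key] = action
    cam[other_key] = stored_action
    del table[h]
    return "collision: both keys moved to CAM"


cam, table = {}, {}
x = unhash(join(5, 1))             # two demo keys built to share h = 5
y = unhash(join(5, 2))
print(insert(y, "B", cam, table))  # added to hash table
print(insert(x, "A", cam, table))  # collision: both keys moved to CAM
print(sorted(cam.values()), table) # ['A', 'B'] {}
```

Only the first N bits and the stored remainder are kept for a table entry, so the reverse hash is what makes it possible to rebuild the displaced key Y before writing it into the CAM.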

For deletion, a key X is presented and sought in the CAM 10. If X is found, then X and the corresponding data structure containing action A are deleted and marked for overwriting, and the program is done. Otherwise, if X is not found in the CAM 10, X is hashed and the pre-selected N-bit portion h(X) of the hash value is used to index into the hash table (DT) 12. If the hash table slot at the h(X) index is occupied and the compare against the stored complement matches, then the entry and the corresponding data structure containing action A are deleted, the hash table entry is marked invalid, and the program ends. If the entry indexed by h(X) is invalid, or if it points to a data structure containing h′(Y), such that the compare at the end of the search does not match, then a message is logged indicating that the entry targeted for deletion was not found, and the program ends. This can be written as follows when a key X is presented for deletion.

DELETION POLICY
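The deletion steps described above can be sketched in Python as follows. As with the other sketches, the dictionary-based CAM and table, the multiplicative hash, the widths, and the function names are assumptions; Python's standard `logging` module stands in for whatever logging facility an implementation would use.

```python
import logging

M, N = 32, 8                       # key width and index width (assumed values)
MASK = (1 << M) - 1
MULT = 0x9E3779B1                  # assumed hash constant


def hash_key(x):
    return (x * MULT) & MASK


def split(hx):
    return hx >> (M - N), hx & ((1 << (M - N)) - 1)


def delete(key, cam, table):
    """Delete key from the CAM or the hash table; return True on success."""
    if key in cam:                         # found in the CAM: remove it
        del cam[key]
        return True
    h, h_rest = split(hash_key(key))
    entry = table.get(h)
    # Invalid entry, or an entry that belongs to a different key.
    if entry is None or entry[0] != h_rest:
        logging.warning("entry targeted for deletion was not found: %#x", key)
        return False
    del table[h]                           # mark the table entry invalid
    return True


key = 0xC0A80001
h, h_rest = split(hash_key(key))
cam = {0x0A000001: "A"}
table = {h: (h_rest, "B")}

removed_from_cam = delete(0x0A000001, cam, table)
removed_from_table = delete(key, cam, table)
removed_again = delete(key, cam, table)    # logs a not-found message
```

In this model the action is stored alongside the entry, so removing the dictionary item also removes the associated data structure.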

Rinaldi, Mark Anthony; Davis, Gordon Taylor; Jeffries, Clark Debs; Herkersdorf, Andreas Guenther

Executed on | Assignor | Assignee | Conveyance | Frame | Reel | Doc
Aug 03 2006 | International Business Machines Corporation | (assignment on the face of the patent)
Mar 27 2012 | International Business Machines Corporation | Facebook, Inc | ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS) | 0280150863 pdf
Oct 28 2021 | Facebook, Inc | Meta Platforms, Inc | CHANGE OF NAME (SEE DOCUMENT FOR DETAILS) | 0585530802 pdf
Date Maintenance Fee Events
Jan 25 2008 | ASPN: Payor Number Assigned.
Nov 07 2011 | REM: Maintenance Fee Reminder Mailed.
Mar 22 2012 | M1551: Payment of Maintenance Fee, 4th Year, Large Entity.
Mar 22 2012 | M1554: Surcharge for Late Payment, Large Entity.
Sep 09 2015 | M1552: Payment of Maintenance Fee, 8th Year, Large Entity.
Sep 19 2019 | M1553: Payment of Maintenance Fee, 12th Year, Large Entity.


Date Maintenance Schedule
Mar 25 2011 | 4 years fee payment window open
Sep 25 2011 | 6 months grace period start (w surcharge)
Mar 25 2012 | patent expiry (for year 4)
Mar 25 2014 | 2 years to revive unintentionally abandoned end. (for year 4)
Mar 25 2015 | 8 years fee payment window open
Sep 25 2015 | 6 months grace period start (w surcharge)
Mar 25 2016 | patent expiry (for year 8)
Mar 25 2018 | 2 years to revive unintentionally abandoned end. (for year 8)
Mar 25 2019 | 12 years fee payment window open
Sep 25 2019 | 6 months grace period start (w surcharge)
Mar 25 2020 | patent expiry (for year 12)
Mar 25 2022 | 2 years to revive unintentionally abandoned end. (for year 12)