Methods and apparatus to perform string matching for network packet inspection are disclosed. In some embodiments there is a set of string matching slice circuits, each slice circuit of the set being configured to perform string matching steps in parallel with other slice circuits. Each slice circuit may include an input window storing some number of bytes of data from an input data stream. The input window of data may be padded if necessary, and then multiplied by a polynomial modulo an irreducible Galois-field polynomial to generate a hash index. A storage location of a memory corresponding to the hash index may be accessed to generate a slice-hit signal of a set of h slice-hit signals. The slice-hit signal may be provided to an AND-OR logic array where the set of h slice-hit signals is logically combined into a match result.
9. An apparatus comprising:
an AND-OR logic array configurable to receive a set of h slice-hit signals and to combine the set of h slice-hit signals into a match result; and
a set of h slice circuits, each ith slice circuit of the set comprising:
an input window configurable to independently store Wi bytes of data from an input data stream;
a ghash unit coupled with the input window and configurable to receive the Wi bytes of data, pad the Wi bytes of data if necessary, and multiply the Wi bytes of data by a Galois-field polynomial modulo an irreducible Galois-field polynomial combined with a randomly generated polynomial multiplier to generate an index; and
a memory coupled with the ghash unit and configurable to access a storage location responsive to the index to generate a slice-hit signal and to provide the slice-hit signal to said AND-OR logic array as one of the set of h slice-hit signals.
1. A method to perform string matching for network packet inspection, the method comprising:
configuring a set of h slice circuits, each ith slice circuit of the set of h slice circuits being configured to perform the steps of:
independently storing an ith input window of Wi bytes of data from an input data stream;
padding the Wi bytes of data if necessary, and multiplying the Wi bytes of data by a Galois-field polynomial modulo an irreducible Galois-field polynomial combined with a randomly generated polynomial multiplier to generate an ith hash index;
accessing a storage location of a memory corresponding to the ith hash index to generate an ith slice-hit signal of a set of h slice-hit signals; and
providing the ith slice-hit signal to an AND-OR logic array as one of the set of h slice-hit signals; and
configuring the AND-OR logic array to receive the set of h slice-hit signals and to combine the set of h slice-hit signals into a match result.
17. A packet processing system to perform string matching for network packet inspection, the system comprising:
a system processor;
an AND-OR logic array configurable to receive a set of h slice-hit signals and to combine the set of h slice-hit signals into a match result; and
a set of h slice circuits, each ith slice circuit of the set comprising:
an input window configurable to independently store Wi bytes of data from an input data stream;
a ghash unit coupled with the input window and configurable to receive the Wi bytes of data, pad the Wi bytes of data if necessary, and multiply the Wi bytes of data by a Galois-field polynomial modulo an irreducible Galois-field polynomial combined with a randomly generated polynomial multiplier to generate an index; and
a memory coupled with the ghash unit and configurable to access a storage location responsive to the index to generate a slice-hit signal and to provide the slice-hit signal to said AND-OR logic array as one of the set of h slice-hit signals; and
a machine readable medium to store executable instructions, such that when said executable instructions are executed by the system processor, the system processor is caused to:
set a pointer to a first character of the input data stream to establish a starting point for the input window of each ith slice circuit, and
increment the pointer until the match result is positive or until an end-of-file is reached in the input data stream.
2. The method of
storing the ith slice-hit signal in the storage location of the memory corresponding to the ith hash index.
3. The method of
4. The method of
reading out the ith slice-hit signal, from the storage location of the memory corresponding to the ith hash index, to the AND-OR logic array as the ith one of the set of h slice-hit signals.
5. The method of
multiplexing the ith slice-hit signal from the storage location of the memory corresponding to the ith hash index, to the AND-OR logic array as the ith one of the set of h slice-hit signals.
6. The method of
7. The method of
8. The method of
10. The apparatus of
reading out the slice-hit signal, from the storage location of the memory corresponding to the index of the ith slice circuit, to the AND-OR logic array as the ith one of the set of h slice-hit signals.
11. The apparatus of
multiplexing the slice-hit signal, from the storage location of the memory corresponding to the index of the ith slice circuit, to the AND-OR logic array as the ith one of the set of h slice-hit signals.
12. The apparatus of
13. The apparatus of
14. The apparatus of
15. The apparatus of
16. The apparatus of
18. The system of
19. The system of
20. The system of
21. The system of
22. The system of
23. The system of
reading out the slice-hit signal, from the storage location of the memory corresponding to the index of the ith slice circuit, to the AND-OR logic array as the ith one of the set of h slice-hit signals.
24. The system of
multiplexing the slice-hit signal, from the storage location of the memory corresponding to the index of the ith slice circuit, to the AND-OR logic array as the ith one of the set of h slice-hit signals.
This disclosure relates generally to the field of network processing. In particular, the disclosure relates to a novel filter architecture to accelerate string matching in packet inspection for network applications such as intrusion detection/prevention and virus detection.
In modern networks, applications such as intrusion detection/prevention and virus detection are important for protecting the networks and/or network users from attacks. In such applications network packets are often inspected to identify problematic packets by finding matches to a known set of data patterns. Matching every byte of an incoming data stream against a large database of patterns (e.g. up to hundreds of thousands) is very compute-intensive. Programs have used techniques such as finite-state machines and filters to find matches to known sets.
A Bloom filter, conceived by Burton H. Bloom in 1970, is a probabilistic structure for determining whether an element is a member of a set. Hashing is performed on the element. Multiple different hash functions are used to generate multiple different hash indices into an array of bits. To add or insert an element into the set, these hash functions are used to index multiple bit locations in the array for the element and these bit locations are then set to one. To query the filter for an arbitrary element the hash functions are used to index multiple bit locations in the array for the element and these bit locations are then checked to see if they are all set to one. If they are not all set to one, the arbitrary element in question is not a member of the set.
Whenever a filter generates a positive outcome for an element that is not actually a member of the set, the outcome is called a false positive. The Bloom filter will not generate a false negative. It is a goal of any particular filter design that the probability of false positives be “small.” For Bloom filters, after inserting n elements into a set represented by an array of m bits using k different hash functions, the probability of a false positive is $(1-(1-1/m)^{kn})^k$.
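By way of illustration only, the Bloom filter insert and query operations and the false-positive estimate described above may be sketched in Python as follows. The class name `BloomFilter`, the helper `false_positive_probability`, and the use of salted SHA-256 digests as the k hash functions are assumptions made for this sketch and are not taken from the disclosure.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter with m bits and k hash functions (illustrative)."""

    def __init__(self, m, k):
        self.m = m
        self.k = k
        self.bits = bytearray(m)  # one byte per bit, for clarity only

    def _indices(self, item):
        # Assumption: k hash functions are derived by salting SHA-256 with
        # the hash number j (so k must not exceed 256 in this sketch).
        for j in range(self.k):
            digest = hashlib.sha256(bytes([j]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item):
        # Insert: set every indexed bit location to one.
        for idx in self._indices(item):
            self.bits[idx] = 1

    def __contains__(self, item):
        # Query: positive only if every indexed bit location is one.
        return all(self.bits[idx] for idx in self._indices(item))


def false_positive_probability(m, k, n):
    """Evaluate (1 - (1 - 1/m)^(k*n))^k for n inserted elements."""
    return (1.0 - (1.0 - 1.0 / m) ** (k * n)) ** k
```

An inserted element always queries positive (no false negatives), whereas an element that was never inserted may still query positive with roughly the probability returned by `false_positive_probability`.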
Designing a filter for a specific problem may be tedious, and at high data rates it is difficult or impossible for state-of-the-art processors to implement the design at rates even close to line-rate. To achieve rates close to one or more gigabits per second, specialized field-programmable gate array solutions or custom circuits have been proposed.
To date, more generalized reconfigurable architectures to accelerate string matching in packet inspection for network applications such as intrusion detection/prevention and virus detection have not been fully explored.
The present invention is illustrated by way of example and not limitation in the figures of the accompanying drawings.
Methods and apparatus to perform string matching for network packet inspection are disclosed below. In some embodiments, a filter apparatus may be configured as a set of string matching slice circuits, each slice circuit of the set being configured to perform string matching steps in parallel with other slice circuits. Each slice circuit may include an input window storing some number of bytes of data from an input data stream. The input window of data may be padded if necessary, and may be multiplied by a distinct Galois-field polynomial modulo an irreducible Galois-field polynomial to generate a hash index. A storage location of a memory slice corresponding to the hash index may be accessed to generate a slice-hit signal of a plurality of slice-hit signals. The slice-hit signal may be provided to an AND-OR logic array where the plurality of slice-hit signals is logically combined into a match result.
Embodiments of such methods and apparatus represent reconfigurable architectures to accelerate string matching in packet inspection for network applications such as intrusion detection/prevention and virus detection.
In the following description, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In other instances, well-known circuits, structures and techniques have not been shown in detail in order not to obscure the understanding of this description. These and other embodiments of the present invention may be realized in accordance with the following teachings and it should be evident that various modifications and changes may be made in the following teachings without departing from the broader spirit and scope of the invention. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense and the invention measured only in terms of the claims and their equivalents.
It will be appreciated that some embodiments of filter apparatus 101 may use the same irreducible Galois-field polynomial in each of the Ghash units 112-152 with H distinct polynomial multipliers selected at random (each having a good mixture of 1's and 0's) to generate H distinct hash indices, thus simplifying the task of generating distinct hash indices for each Ghash unit. It will also be appreciated that in embodiments of filter apparatus 101 where, unlike the Bloom filter, input windows 111-151 are independently configurable to store Wi bytes of data from input data stream 120, the filter apparatus 101 may be used to solve multiple problems of different sizes (e.g. a 2-byte match, a 3-byte match, a 6-byte match, and an 8-byte match, etc.) at the same time in parallel.
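As a hedged sketch of how one shared irreducible polynomial and H random multipliers might yield H distinct indices, the following Python models a Ghash unit in software. The disclosure does not name a field or polynomial; the choice of GF(2^128) with x^128 + x^7 + x^2 + x + 1 (the polynomial used by GCM), zero padding, a plain big-endian bit ordering, and the helper names `gf_mul`, `make_multipliers` and `ghash_index` are all assumptions of this example.

```python
import random

# Assumption: GF(2^128) reduced by x^128 + x^7 + x^2 + x + 1.
# The text does not specify which irreducible polynomial is used.
FIELD_BITS = 128
REDUCTION = (1 << 128) | 0x87


def gf_mul(a, b):
    """Carry-less multiply of a and b, reduced modulo REDUCTION."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> FIELD_BITS:  # degree reached 128: subtract the modulus
            a ^= REDUCTION
    return result


def make_multipliers(h, seed=0):
    """Draw h distinct, nonzero random multipliers (hypothetical helper).

    The "good mixture of 1's and 0's" mentioned in the text is not checked.
    """
    rng = random.Random(seed)
    multipliers = set()
    while len(multipliers) < h:
        candidate = rng.getrandbits(FIELD_BITS)
        if candidate:
            multipliers.add(candidate)
    return sorted(multipliers)


def ghash_index(window, multiplier, index_bits=10):
    """Zero-pad the window to 16 bytes, multiply, keep the top index_bits."""
    if len(window) > FIELD_BITS // 8:
        raise ValueError("window longer than the field width")
    padded = window.ljust(FIELD_BITS // 8, b"\x00")
    product = gf_mul(int.from_bytes(padded, "big"), multiplier)
    return product >> (FIELD_BITS - index_bits)
```

Under these assumptions, each slice i would call `ghash_index` with its own Wi-byte window and its own multiplier, so windows of 2, 3, 6 and 8 bytes can be hashed side by side with the same reduction logic.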
Slice circuits 110-150, respectively, also include memories 113-153 coupled with the Ghash units 112-152 and configurable to access respective storage locations responsive to their respective indices (e.g. at the addresses specified by some field of bits from respective indices) to each generate an ith slice-hit signal and to provide the ith slice-hit signal to AND-OR logic array 140 as one of the set of H slice-hit signals 115-155. Some embodiments of memories 113-153 are configurable from a larger memory 130 to serve as individual memories 113-153 for slice circuits 110-150 respectively. Some alternative embodiments of memories 113-153 may be N-entry (e.g. 1K entries) read/write random-access memories (RAMs) of fixed width (e.g. 64-bits wide) and are configurable to be combined into larger memories (e.g. memory 130) as necessary (e.g. when a very large set of patterns is required). Slice circuits 110-150 may also include multiplexers 114-154, respectively, configurable to access respective bit storage locations responsive to portions of their respective indices to generate the ith slice-hit signal and to provide the ith slice-hit signal to AND-OR logic array 140 as one of the set of H slice-hit signals 115-155.
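One possible reading of the memory and multiplexer arrangement above is that part of the index addresses a word of the RAM and the remaining part selects one bit of that word. The following Python sketch models that reading for a 1K-entry, 64-bit-wide memory; the class name `SliceMemory` and the particular split of the index (low bits select the bit, the next bits select the word) are assumptions of the example, not details from the disclosure.

```python
class SliceMemory:
    """Software model of one slice memory plus its bit-select multiplexer."""

    def __init__(self, entries=1024, width=64):
        self.entries = entries        # N-entry RAM (e.g. 1K entries)
        self.width = width            # fixed word width (e.g. 64 bits)
        self.words = [0] * entries

    def _split(self, index):
        # Assumed split: low bits drive the multiplexer, the rest address
        # the RAM word.
        bit = index % self.width
        addr = (index // self.width) % self.entries
        return addr, bit

    def set_hit(self, index):
        """Initialization: store a slice-hit bit at the indexed location."""
        addr, bit = self._split(index)
        self.words[addr] |= 1 << bit

    def read_hit(self, index):
        """Matching: read the word, then multiplex out the selected bit."""
        addr, bit = self._split(index)
        return bool((self.words[addr] >> bit) & 1)
```

With these default sizes a 16-bit index covers the whole memory: 10 bits choose the word and 6 bits choose the bit within it.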
AND-OR logic array 140 is configurable to receive a set of H slice-hit signals 115-155 and to combine the set of H slice-hit signals 115-155 into a match result 145, a copy of which may be stored as a match result 185. Some embodiments of AND-OR logic array 140 may be configurable to perform a simple AND (e.g. as in a Bloom filter) or a simple OR (e.g. as in solving multiple problems of different sizes in parallel) of the set of H slice-hit signals 115-155 to get a match result 145. Alternative embodiments of AND-OR logic array 140 may be configurable to perform a complex AND-OR of the set of H slice-hit signals 115-155 (e.g. $\mathrm{temp}_k = \bigwedge_{i \in S_k} \text{slice-hit signal}_i$ for each set $S_k$, and then the final match result $= \bigvee_k \mathrm{temp}_k$) to get a match result 145. The complex AND-OR of the set of H slice-hit signals 115-155 may be used, for example, in embodiments of filter apparatus 101 to provide multiple Bloom filters in parallel.
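The three configurations of the AND-OR logic array described above (simple AND, simple OR, and complex AND-OR over sets Sk) may be modeled by a single function. This is a minimal software sketch; the function name `and_or_combine` and the representation of each set Sk as a list of slice numbers are assumptions of the example.

```python
def and_or_combine(slice_hits, groups):
    """Return the match result for a list of slice-hit signals.

    slice_hits: list of booleans, one per slice circuit.
    groups: list of index lists; each inner list plays the role of a set S_k.
    Each group is ANDed together, and the group results are ORed.
    Note: an empty group evaluates to True, so callers should avoid one.
    """
    return any(all(slice_hits[i] for i in group) for group in groups)


def simple_and(h):
    """Configuration for a single Bloom filter over all h slices."""
    return [list(range(h))]


def simple_or(h):
    """Configuration for h independent problems, one per slice."""
    return [[i] for i in range(h)]
```

For example, with four slices the configuration `[[0, 1], [2, 3]]` would behave as two two-hash Bloom filters operating in parallel, reporting a match when either filter is positive.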
It will be appreciated that when a final match result is positive, a verification process may be used to check against false positives. Such verification process may be relatively slower than using filter apparatus 101 and so the configuration of filter apparatus 101 should be carefully made to avoid frequent false positives.
In processing block 211 a set of H slice circuits is configured. In processing block 212, i is set to zero (0). In processing block 213, i is incremented. In processing block 214, i is checked to see if it has exceeded H. It will be appreciated that even though initialization of the H slice circuits is shown as an iterative process 201, in at least some preferred embodiments of process 201, the set of H slice circuits is configured to concurrently perform initialization according to processing blocks 215-220 of process 201 for use in string matching during network packet inspections. Therefore, for each of the H slice circuits processing blocks 215-220 are executed as follows, before proceeding to processing block 222.
In processing block 215 Wi bytes of data are stored from an input data stream in an ith input window. In processing block 216 the Wi bytes of data are padded if necessary. Then in processing block 217 the Wi bytes of data are multiplied by a Galois-field polynomial modulo an irreducible Galois-field polynomial to generate an ith hash index. In processing block 218 a storage location of a memory corresponding to the ith hash index is accessed, and in processing block 220 an ith slice-hit signal is stored (i.e. set) in the storage location of the memory corresponding to the ith hash index. When all of the H slice circuits have completed processing blocks 215-220 of process 201, processing proceeds to processing block 222 where a pointer in the input data stream is moved (e.g. to a new string in the database). Then from processing block 224, if the data stream is empty processing terminates. Otherwise processing repeats in processing block 212.
It will be appreciated that the process 201 may be iterated hundreds to hundreds of thousands of times in order to initialize a filter apparatus for string matching patterns in packet inspection. Thus when the set of H slice circuits is configured to concurrently perform initialization, substantial performance improvements may be realized. It will also be appreciated that the process 201 of initializing a filter apparatus (by setting slice-hit signals) may be performed in a manner substantially similar to a process of utilizing a filter apparatus for string matching (by reading the slice-hit signals) in packet inspection. In some embodiments of processing block 222 a pointer into the input data stream may be moved for each ith slice, in such a way as to provide each ith slice with a new complete pattern, whereas in utilizing a filter apparatus for string matching a pointer into the input data stream may be simply incremented.
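The initialization flow of process 201 may be approximated in software as a loop over the pattern database in which every slice hashes the current pattern and sets its slice-hit bit. The sketch below is sequential, whereas the slices are described as operating concurrently; the `Slice` class, the `initialize` function, the GF(2^128) field with x^128 + x^7 + x^2 + x + 1, zero padding, and truncation of long patterns to the window size are all assumptions of the example.

```python
FIELD_BITS = 128
REDUCTION = (1 << 128) | 0x87  # assumed irreducible polynomial (as in GCM)


def gf_mul(a, b):
    """Carry-less multiply of a and b, reduced modulo REDUCTION."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> FIELD_BITS:
            a ^= REDUCTION
    return result


class Slice:
    """One slice: a W-byte window, a multiplier and a bit memory."""

    def __init__(self, window_bytes, multiplier, index_bits=12):
        assert window_bytes <= FIELD_BITS // 8
        self.window_bytes = window_bytes
        self.multiplier = multiplier
        self.index_bits = index_bits
        self.bits = bytearray(1 << index_bits)

    def index(self, data):
        # Blocks 215-217: store W bytes, pad if necessary, multiply.
        window = data[:self.window_bytes].ljust(FIELD_BITS // 8, b"\x00")
        product = gf_mul(int.from_bytes(window, "big"), self.multiplier)
        return product >> (FIELD_BITS - self.index_bits)

    def set_hit(self, data):
        # Blocks 218-220: access the location and set the slice-hit bit.
        self.bits[self.index(data)] = 1


def initialize(slices, patterns):
    """Sequential model of process 201 over a pattern database."""
    for pattern in patterns:      # block 222: move to the next pattern
        for s in slices:          # concurrent in hardware, a loop here
            s.set_hit(pattern)
```

Reading the same locations instead of setting them gives the matching flow, which is why the text notes that the two processes are substantially similar.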
In processing block 315 Wi bytes of data are stored from an input data stream in an ith input window. In processing block 316 the Wi bytes of data are padded if necessary. Then in processing block 317 the Wi bytes of data are multiplied by a Galois-field polynomial modulo an irreducible Galois-field polynomial to generate an ith hash index. In processing block 319 a storage location of a memory corresponding to the ith hash index is accessed to generate an ith slice-hit signal of a set of H slice-hit signals. In processing block 321 the ith slice-hit signal is provided to an AND-OR logic array as one of the set of H slice-hit signals. When all of the H slice circuits have completed processing blocks 315-321 of process 301, processing proceeds to processing block 323 where the AND-OR logic array is configured to receive the set of H slice-hit signals and to combine the set of H slice-hit signals into a match result. Then from processing block 323 processing terminates.
It will be appreciated that iterations of process 301 may be configured in accordance with embodiments of filter apparatus 101 to substantially accelerate string matching in packet inspection.
System 401 includes an input data stream 420, which may be in system memory 470 as shown, or may comprise an optional data stream buffer of filter 480 for storing packet data for inspection and/or a pattern database to initialize filter 480.
Filter 480 includes a set of H slice circuits 410-450, each ith slice circuit of the set being configurable for providing an ith slice-hit signal to a configurable AND-OR logic array 440 as one of a set of H slice-hit signals. Slice circuits 410-450, respectively, include input windows 411-451 each configurable to store Wi bytes of data from input data stream 420, and Ghash units 412-452 coupled with input windows 411-451 and configurable to receive the Wi bytes of data, to pad the Wi bytes of data if necessary, and to multiply their respective Wi bytes of data by a polynomial modulo an irreducible Galois-field polynomial to generate an index.
Slice circuits 410-450, respectively, also include memories 413-453 coupled with the Ghash units 412-452 and configurable to access respective storage locations responsive to their respective indices to each generate an ith slice-hit signal and to provide the ith slice-hit signal to AND-OR logic array 440 as one of the set of H slice-hit signals 415-455. Memories 413-453 may be N-entry read/write RAMs of any fixed width and configurable to be combined into larger memories (e.g. memory 430) as necessary. Alternatively some embodiments of memories 413-453 may be configurable from a larger memory 430. Slice circuits 410-450 may also include multiplexers 414-454, respectively, configurable to access respective bit storage locations responsive to portions of their respective indices to generate the ith slice-hit signal and to provide the ith slice-hit signal to AND-OR logic array 440 as one of the set of H slice-hit signals 415-455. AND-OR logic array 440 may receive the set of H slice-hit signals 415-455 and combine the set of H slice-hit signals 415-455 into a match result 445.
System 401 also includes system processor 460 to execute a program 471 in system memory 470 to accelerate string matching in packet inspection for network applications using filter 480, and to move or increment a pointer 461 into input data stream 420 until a match result 445 is positive (in the case of string matching for packet inspections) or until an end-of-file is reached in the input data stream 420. In some embodiments of system 401, processor 460 may check a copy of match result 445 stored in system memory 470 as a match result 485 when string matching for packet inspections to determine if match result 445 was positive.
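The pointer-driven scan performed by the program may be sketched as below, together with the slower verification step mentioned earlier for positive match results. To keep the example short, the Ghash unit and memory of each slice are replaced by a deliberately lossy stand-in (a 64-entry table indexed by the byte sum of the window), which is not the hash of the disclosure. The names `make_lossy_slice` and `scan`, the decision to keep scanning after a failed verification, and the treatment of a truncated final window as a miss are assumptions of the example.

```python
def make_lossy_slice(window_bytes, patterns, table_bits=64):
    """Stand-in for one slice: (window size, slice-hit function).

    The table is indexed by sum(window) % table_bits, a toy hash chosen
    only so that false positives are easy to provoke.
    """
    table = [False] * table_bits
    for p in patterns:
        if len(p) == window_bytes:
            table[sum(p) % table_bits] = True
    return window_bytes, lambda window: table[sum(window) % table_bits]


def scan(data, slices, groups, verify):
    """Advance a pointer through data; return the first verified offset.

    slices: list of (window_bytes, hit_fn) pairs.
    groups: AND-OR configuration, a list of lists of slice numbers.
    verify: exact (slow) check called only when the filter is positive.
    """
    pointer = 0
    while pointer < len(data):                 # until end of input
        hits = []
        for window_bytes, hit_fn in slices:
            window = data[pointer:pointer + window_bytes]
            # Assumption: a window cut short by end of input is a miss.
            hits.append(len(window) == window_bytes and hit_fn(window))
        match_result = any(all(hits[i] for i in g) for g in groups)
        if match_result and verify(data, pointer):
            return pointer
        pointer += 1                           # increment the pointer
    return None
```

With patterns `b"ab"` and `b"evil"`, a 2-byte slice and a 4-byte slice combined by a simple OR, the window `b"ba"` produces a false positive under the toy hash (same byte sum as `b"ab"`), which the verification step rejects before the scan continues.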
The above description is intended to illustrate preferred embodiments of the present invention. From the discussion above it should also be apparent that especially in such an area of technology, where growth is fast and further advancements are not easily foreseen, the invention may be modified in arrangement and detail by those skilled in the art without departing from the principles of the present invention within the scope of the accompanying claims and their equivalents.
Wolrich, Gilbert M., Gopal, Vinodh, Feghali, Wajdi K., Clark, Christopher F.
Executed on | Assignor | Assignee | Conveyance | Frame | Reel | Doc |
Dec 30 2008 | Intel Corporation | (assignment on the face of the patent) | / | |||
Jan 13 2009 | GOPAL, VINODH | Intel Corporation | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 022800 | /0236 | |
Jan 13 2009 | WOLRICH, GILBERT | Intel Corporation | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 022800 | /0236 | |
Jan 13 2009 | FEGHALI, WAJDI | Intel Corporation | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 022800 | /0236 | |
Mar 13 2009 | CLARK, CHRISTOPHER F | Intel Corporation | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 022800 | /0236 | |
Jul 18 2022 | Intel Corporation | TAHOE RESEARCH, LTD | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 061175 | /0176 |
Date | Maintenance Fee Events |
Oct 31 2016 | ASPN: Payor Number Assigned. |
Oct 14 2019 | REM: Maintenance Fee Reminder Mailed. |
Mar 30 2020 | EXP: Patent Expired for Failure to Pay Maintenance Fees. |
Date | Maintenance Schedule |
Feb 23 2019 | 4 years fee payment window open |
Aug 23 2019 | 6 months grace period start (w surcharge) |
Feb 23 2020 | patent expiry (for year 4) |
Feb 23 2022 | 2 years to revive unintentionally abandoned end. (for year 4) |
Feb 23 2023 | 8 years fee payment window open |
Aug 23 2023 | 6 months grace period start (w surcharge) |
Feb 23 2024 | patent expiry (for year 8) |
Feb 23 2026 | 2 years to revive unintentionally abandoned end. (for year 8) |
Feb 23 2027 | 12 years fee payment window open |
Aug 23 2027 | 6 months grace period start (w surcharge) |
Feb 23 2028 | patent expiry (for year 12) |
Feb 23 2030 | 2 years to revive unintentionally abandoned end. (for year 12) |