A method for processing a datagram, including receiving an initial fragment of the datagram over a communication link and classifying in an initial classification the initial fragment as a first fragment, a middle fragment, or a last fragment of the datagram. The method further includes receiving one or more subsequent fragments over the communication link, following the initial fragment, and classifying each of the one or more subsequent fragments in respective subsequent classifications so as to find among the subsequent fragments at least one of the first fragment, the middle fragment, and the last fragment of the datagram.
Responsive to the initial and the one or more subsequent classifications, a determination is made whether the datagram is completely constituted by the initial fragment and no more than two of the subsequent fragments. The datagram is reassembled responsive to the determination.
18. A method for processing data, the method comprising:
in a single chip comprising an on-chip processor and an on-chip memory:
classifying, utilizing said on-chip processor, a received initial fragment of a datagram as one of the following: a first fragment, a middle fragment, and a last fragment of the datagram;
classifying, utilizing said on-chip processor, a plurality of received subsequent fragments of the datagram as at least two of the following: a first fragment, a middle fragment, and a last fragment of the datagram;
determining, utilizing said on-chip processor, whether the datagram is completely constituted by the initial fragment and no more than two of the plurality of subsequent fragments; and
reassembling, utilizing said on-chip processor, the datagram based on the determination.
15. A system for processing data, the system comprising:
a single chip comprising at least one on-chip processor and an on-chip memory, wherein:
the at least one on-chip processor enables classifying of a received initial fragment of a datagram as one of the following: a first fragment, a middle fragment, and a last fragment of the datagram;
the at least one on-chip processor enables classifying of a plurality of received subsequent fragments of the datagram as at least two of the following: a first fragment, a middle fragment, and a last fragment of the datagram; and
the at least one on-chip processor enables determination of whether the datagram is completely constituted by the initial fragment and no more than two of the plurality of subsequent fragments and reassembling the datagram based on the determination.
8. An apparatus for processing a datagram, comprising:
a single chip comprising an on-chip processor and an on-chip memory, wherein:
said on-chip memory is adapted to receive an initial fragment and at least two subsequent fragments of a datagram and store the received initial fragment and the at least two subsequent fragments;
said on-chip processor is adapted to classify the initial fragment and the at least two subsequent fragments as one of a first fragment, a middle fragment, and a last fragment of the datagram; and
said on-chip processor is adapted to make a determination, responsive to the classification of each of the stored fragments, whether the datagram is completely constituted by the initial fragment and no more than two of the at least two subsequent fragments, and to reassemble the datagram based on the determination.
1. A method for processing a datagram, comprising:
in a single chip comprising an on-chip processor and an on-chip memory:
receiving, by said single chip, an initial fragment of the datagram over a communication link;
classifying, utilizing said on-chip processor, the initial fragment as one of the following: a first fragment, a middle fragment, and a last fragment of the datagram;
receiving, by said single chip, a plurality of subsequent fragments over the communication link, following the initial fragment;
classifying, utilizing said on-chip processor, the plurality of subsequent fragments to find among the plurality of subsequent fragments two of the following: the first fragment, the middle fragment, and the last fragment of the datagram;
determining, utilizing said on-chip processor, whether the datagram is completely constituted by the initial fragment and no more than two of the plurality of subsequent fragments; and
reassembling, utilizing said on-chip processor, the datagram based on the determination.
2. The method according to
3. The method according to
4. The method according to
5. The method according to
6. The method according to
7. The method according to
9. The apparatus according to
10. The apparatus according to
an ordering buffer which is adapted to store ordering data from a fragment header; and
a reassembly buffer which is adapted to store payload data conveyed by each fragment, and wherein the on-chip processor is adapted to reassemble the payload data from the reassembly buffer.
11. The apparatus according to
12. The apparatus according to
13. The apparatus according to
14. The apparatus according to
16. The system according to
17. The system according to
19. The method according to
20. The method according to
This application claims the benefit of U.S. Provisional Patent Application No. 60/317,670, filed Sep. 6, 2001, which is incorporated herein by reference.
The present invention relates generally to transmission of datagrams, and specifically to reassembling fragments of Internet Protocol (IP) datagrams.
The Transmission Control Protocol/Internet Protocol suite is a widely-used transport protocol in digital packet networks. The Internet Protocol is described by Postel in Request For Comments (RFC) 791 of the U.S. Defense Advanced Research Projects Agency (DARPA), published in 1981, which is incorporated herein by reference. The Internet Protocol (IP) enables an IP datagram to be split into two or more IP fragments when an interface is unable to transmit the original datagram due to the latter being too large. The oversized datagram is split into separate IP fragments, each fragment being small enough to be transmitted by the interface. The process of fragmentation may occur more than once, depending on the maximum transmission unit (MTU) of each network component. For example, a datagram which is originally 1518 bytes (the maximum datagram size for networks operating according to an Ethernet protocol) may be sent to a first router having an MTU of 1000 bytes. The router divides the datagram into two IP fragments, 1000 bytes and 518 bytes, and forwards the two fragments to a second router having an MTU of 576 bytes. The second router divides the 1000 byte fragment into a 576 byte fragment and a 424 byte fragment, and thus transmits three fragments representing the original 1518 byte datagram.
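By way of illustration only, the repeated fragmentation in the example above may be sketched as follows. The sketch uses the same simplified byte-count model as the example: it ignores the IP header carried by each fragment and the requirement that fragment offsets fall on 8-byte boundaries. The function names `fragment_sizes` and `forward` are hypothetical and do not appear in RFC 791.

```python
def fragment_sizes(size, mtu):
    """Split one unit of `size` bytes into pieces of at most `mtu` bytes.

    Simplified model: headers and 8-byte offset alignment are ignored.
    """
    sizes = []
    while size > mtu:
        sizes.append(mtu)
        size -= mtu
    sizes.append(size)
    return sizes


def forward(sizes, mtu):
    """Pass every incoming unit through a hop with the given MTU."""
    out = []
    for s in sizes:
        out.extend(fragment_sizes(s, mtu))
    return out


after_first_router = forward([1518], 1000)               # [1000, 518]
after_second_router = forward(after_first_router, 576)   # [576, 424, 518]
```

The fragments leaving the second hop always sum to the original 1518 bytes, since each hop only subdivides the units it receives.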
The IP layer at the receiving host accumulates the fragments until enough have arrived to reconstitute the original datagram. RFC 791 describes a reassembly mechanism, and an algorithm for reassembly based on tracking arriving fragments in a vector of bits. The algorithm operates in substantially the same manner regardless of the number of fragments.
A field 22 comprises 1-bit flags 24, 26, and 28, and a 13-bit fragment offset 30. Flag 24 must be set to zero. Flag 26 is set to 0 if the datagram may be fragmented, and is set to 1 if the datagram may not be fragmented. Flag 28 is set to 0 to indicate that this fragment is the last fragment, and is set to 1 to indicate that there are more fragments. Fragment offset 30 indicates where in the datagram the fragment belongs. It is calculated in units of 8 bytes, and is set to 0 for the first fragment. Field 22 is used by the datagram receiver to know in which order fragments are placed, and in order to correctly reassemble the fragments to the original datagram.
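As a minimal sketch, and assuming that field 22 is available as a single 16-bit integer in network bit order (the three flags in the top three bits, followed by the 13-bit offset, as laid out in RFC 791), the field may be decoded as shown below. The function name and the dictionary keys are hypothetical.

```python
def parse_flags_and_offset(field):
    """Decode the 16-bit flags / fragment-offset word of an IPv4 header."""
    offset_units = field & 0x1FFF              # fragment offset 30, in 8-byte units
    return {
        "reserved": (field >> 15) & 1,         # flag 24, must be zero
        "dont_fragment": (field >> 14) & 1,    # flag 26
        "more_fragments": (field >> 13) & 1,   # flag 28
        "offset_bytes": offset_units * 8,      # position of the fragment in the datagram
    }


# A fragment with "more fragments" set that starts 576 bytes into the datagram.
parsed = parse_flags_and_offset(0x2000 | (576 // 8))
```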
RFC 815, “IP Datagram Reassembly Algorithms,” by David D. Clark, published in 1982, which is incorporated herein by reference, describes an alternative fragment reassembly system to that described in RFC 791. RFC 815 refers to a partially reassembled datagram which is assumed to have missing areas, termed holes. Each hole is characterized by a first byte number and a last byte number of the hole, the pair of numbers being termed a hole descriptor. A processor stores each hole descriptor, together with a pointer to the next hole, in its respective hole. The partially reassembled datagram is stored with its hole descriptors, by the processor, in a reassembly buffer. (The buffer size must be sufficient to accommodate the largest datagram transmitted by IP.) The buffer also maintains a global pointer to the first hole in the datagram.
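The hole-descriptor bookkeeping may be illustrated by the following sketch. It follows the update steps given in RFC 815, but, as a simplifying assumption, keeps the hole descriptors in an ordinary Python list rather than inside the holes of the reassembly buffer. The function name `add_fragment` is hypothetical.

```python
import math


def add_fragment(holes, first, last, more_fragments):
    """Return the hole list after a fragment covering bytes first..last arrives.

    `holes` is a list of (hole_first, hole_last) pairs with inclusive bounds.
    """
    updated = []
    for hole_first, hole_last in holes:
        # The fragment does not touch this hole: keep the hole unchanged.
        if first > hole_last or last < hole_first:
            updated.append((hole_first, hole_last))
            continue
        # The hole is dropped and replaced by whatever remains uncovered.
        if first > hole_first:
            updated.append((hole_first, first - 1))
        if last < hole_last and more_fragments:
            updated.append((last + 1, hole_last))
    return updated


# Reassembly starts with one hole covering the whole, as yet unknown, length.
holes = [(0, math.inf)]
holes = add_fragment(holes, 576, 999, True)     # a middle fragment arrives first
holes_after_middle = list(holes)
holes = add_fragment(holes, 0, 575, True)       # then the first fragment
holes = add_fragment(holes, 1000, 1517, False)  # then the last fragment
```

The datagram is complete when the hole list is empty, which is the case after the third call above.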
As long as network speed was the main factor limiting receiver rates, software implementations of IP receiver logic provided adequate performance levels. However, with the advent of network speeds in the 1 Gbps and 10 Gbps range, this is no longer the case. Faster IP receiver processing is needed, which calls for a new approach to the original specifications in RFC 791 and/or RFC 815. Among the issues to be addressed are maximization of parallel processing, efficient information passing, and rapid classification and handling of fragments.
It is an object of some aspects of the present invention to provide apparatus and a method for efficient reassembly of datagram fragments.
In preferred embodiments of the present invention, a processor classifies an incoming fragment, which has been generated from a complete datagram, as a first, a middle, or a last fragment. The processor performs similar classifications on up to two subsequent fragments. If the first two classifications result in first and last fragment classifications, and if the two fragments form the complete datagram, the complete datagram is reassembled from the two fragments. If the first two classifications do not result in fragments forming the complete datagram, but do imply that the complete datagram may be split into three fragments, the processor classifies a third fragment. If the three classifications result in the first, the middle, and the last fragment which together form the complete datagram, the complete datagram is reassembled from the three fragments. By classifying incoming fragments as first, middle, or last fragments, re-assembling the complete datagram (where it is initially divided into two or three fragments) is made significantly faster than in prior art systems for re-assembling datagrams from fragments.
If the classifications indicate that the datagram has been split into more than three fragments, for example, if the first two classifications yield different middle fragments, the fragments are processed using any suitable prior art reassembly method. Thus, the prior art method is only implemented for cases of four or more fragments. Most preferably, datagrams and their fragments are generated according to a standard protocol, such as the Internet Protocol (IP), in which case the prior art reassembly method is preferably the Clark algorithm described in the Background of the Invention.
There is therefore provided, according to a preferred embodiment of the present invention, a method for processing a datagram, including:
Preferably, each fragment includes a header, and classifying each fragment includes determining the classification of the fragment responsive to data comprised in the header.
Preferably, receiving the initial fragment and the one or more subsequent fragments includes storing ordering data from a header of each fragment in an ordering buffer and storing payload data conveyed by each fragment in a reassembly buffer, and reassembling the datagram includes reassembling the payload data from the reassembly buffer.
The method preferably also includes providing a state machine having a plurality of initial states, the state machine existing in one of the initial states responsive to receiving the initial fragment and the initial classification thereof. The state machine preferably also has a plurality of subsequent states, the state machine existing in one of the subsequent states responsive to receiving the initial fragment and the initial classification thereof, and to receiving the one or more subsequent fragments and the respective classifications of the one or more subsequent fragments.
Preferably, making the determination includes determining that the datagram is not completely constituted by the initial fragment and the no more than two of the subsequent fragments, and transferring the data fragments to a memory for subsequent reassembly responsive to the determination.
Preferably, the datagram for the method is generated according to an Internet protocol.
There is further provided, according to a preferred embodiment of the present invention, apparatus for processing a datagram, including:
a memory which receives an initial fragment and one or more subsequent fragments from a communication link and which stores the fragments; and
a processor which is adapted to classify each of the fragments as a first fragment, a middle fragment, or a last fragment of the datagram and to make a determination, responsive to the classifications of each of the stored fragments, whether the datagram is completely constituted by the initial fragment and no more than two of the subsequent fragments and to reassemble the datagram responsive to the determination.
Preferably, each fragment includes a header, and classifying each fragment includes determining the classification of the fragment responsive to data comprised in the header.
Preferably, the memory includes:
an ordering buffer which is adapted to store ordering data from a header included in each fragment; and
a reassembly buffer which is adapted to store payload data conveyed by each fragment; and
wherein the processor is adapted to reassemble the payload data from the reassembly buffer.
The apparatus preferably also includes a state machine which is implemented from the memory and the processor, the state machine having a plurality of initial states, and existing in one of the initial states responsive to receiving the initial fragment and the initial classification thereof.
The state machine preferably has a plurality of subsequent states, the state machine existing in one of the subsequent states responsive to receiving the initial fragment and the initial classification thereof, and to receiving the one or more subsequent fragments and the respective classifications of the one or more subsequent fragments.
Preferably, making the determination includes determining that the datagram is not completely constituted by the initial fragment and the no more than two of the subsequent fragments, and the processor is adapted to transfer the data fragments within the memory for subsequent reassembly responsive to the determination.
Preferably, the datagram for the apparatus is generated according to an Internet protocol.
The present invention will be more fully understood from the following detailed description of the preferred embodiments thereof, taken together with the drawings, in which:
Class 1. The fragment start and hole start are the same, and the fragment is shorter than the hole. A previous hole is partly filled by the fragment.
Class 2. The fragment end and hole end are the same, and the fragment is shorter than the hole. A previous hole is partly filled by the fragment.
Class 3. The fragment fills the “middle” of an existing hole.
Class 4. The fragment start and hole start are the same, and the fragment end and hole end are also the same, so that the hole is filled by the fragment.
After the classification has been made, hole information, such as new hole start and/or end, and hole pointers, are updated as illustrated in 80, 82, 84, and 86.
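The four classes listed above may be expressed as a small function, given here as an illustrative sketch only. It assumes inclusive byte bounds for both the fragment and the hole; the name `clark_class` and the `None` return value for a fragment that does not lie within the hole are hypothetical.

```python
def clark_class(frag_start, frag_end, hole_start, hole_end):
    """Classify a fragment against one hole, returning 1, 2, 3, 4 or None."""
    starts_match = frag_start == hole_start
    ends_match = frag_end == hole_end
    if starts_match and ends_match:
        return 4  # the fragment fills the hole exactly
    if starts_match and frag_end < hole_end:
        return 1  # fills the beginning of the hole
    if ends_match and frag_start > hole_start:
        return 2  # fills the end of the hole
    if hole_start < frag_start and frag_end < hole_end:
        return 3  # fills the "middle" of the hole
    return None   # not wholly inside this hole (outside the four classes)


classes = [
    clark_class(0, 499, 0, 999),
    clark_class(500, 999, 0, 999),
    clark_class(200, 299, 0, 999),
    clark_class(0, 999, 0, 999),
]
```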
The algorithms described with reference to
Reference is now made to
As described in the Background of the Invention, an Internet Protocol (IP) datagram may be divided into two or more fragments before being transmitted from a transmitter, depending on the size of the datagram and the maximum transmission unit (MTU) of the path from the transmitter. Each fragment produced comprises identifying information in the fragment's header that enables a receiver of the fragment to identify the connection and socket of the datagram. Each fragment header also comprises sequential information of data conveyed in the fragment, such as a first and last number of bytes of the fragment data, or equivalent information. While the description hereinbelow is directed to reassembling fragments which have been generated according to the Internet Protocol, it will be appreciated that the scope of the present invention applies to any other protocol wherein datagrams are divided into fragments, and wherein the fragments comprise sequential information of data conveyed in the fragments.
In order to classify the first fragment, CPU 92 uses flag 28 and fragment offset field 30 (
TABLE I
Flag 28 State | Fragment Offset Field 30 | Fragment Classification |
Set (1) | 0 | First part |
Set (1) | >0 | Middle part |
Not set (0) | >0 | Last part |
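The classification of Table I may be sketched as follows. The function name `classify` is hypothetical. The case in which the flag is not set and the offset is zero is not listed in Table I; the sketch assumes, consistently with RFC 791, that such a packet is a complete, unfragmented datagram.

```python
def classify(more_fragments, fragment_offset):
    """Classify a fragment from flag 28 and fragment offset field 30."""
    if more_fragments and fragment_offset == 0:
        return "first"
    if more_fragments and fragment_offset > 0:
        return "middle"
    if not more_fragments and fragment_offset > 0:
        return "last"
    # Assumption: flag clear and offset zero means the datagram was not fragmented.
    return "unfragmented"


kinds = [classify(1, 0), classify(1, 72), classify(0, 144)]
```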
In addition to classifying the fragment, CPU 92 determines start and end values of the fragment, fragment1.start and fragment1.end respectively, in bytes, using length field 18 and offset 30. Using fragment1.start and/or fragment1.end, CPU 92 calculates connection parameters, consisting of potential values middle_start and/or middle_end that a next fragment might have. The connection parameters are stored in ordering buffer 96, and data comprised in the fragment is stored in reassembly buffer 98. The connection parameters calculated depend on the initial classification determined in step 104, and are listed in Table II below.
TABLE II
Fragment Classification | Connection Parameter |
First part | middle_start = fragment1.end + 1 |
Middle part | middle_start = fragment1.start; middle_end = fragment1.end |
Last part | middle_end = fragment1.start − 1 |
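A minimal sketch of the calculation of the start and end values and of the connection parameters of Table II is given below. It assumes that the start value is the fragment offset multiplied by 8, that the payload length is the total length less the header length, and that the end value is inclusive. The function names and the `header_length` parameter are hypothetical.

```python
def fragment_range(total_length, header_length, offset_units):
    """Return (start, end) of the fragment payload in bytes, end inclusive."""
    start = offset_units * 8
    end = start + (total_length - header_length) - 1
    return start, end


def connection_parameters(kind, start, end):
    """Return the Table II parameters for an initial fragment of the given kind."""
    if kind == "first":
        return {"middle_start": end + 1}
    if kind == "middle":
        return {"middle_start": start, "middle_end": end}
    if kind == "last":
        return {"middle_end": start - 1}
    raise ValueError("unknown fragment classification: %r" % (kind,))


# A first fragment carrying 552 payload bytes behind a 20-byte header.
first_range = fragment_range(572, 20, 0)
first_params = connection_parameters("first", *first_range)
```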
In a second step 106 a second fragment is received and CPU 92 determines start and end values of the fragment, fragment2.start and fragment2.end respectively, using length field 18 and offset 30. The fragment is classified substantially as described above for step 104 with reference to Table I.
In comparison steps 108 and 109, the two fragments are compared. Tables III, IV and V below list possible types of the second fragment and comparisons between first and second fragment parameters. The tables give results of the comparison and updates to the connection parameters, where appropriate, and a state that machine 130 is in after the comparison. Tables III, IV, and V apply when the first fragment has been classified as a first part, middle part, and last part respectively.
TABLE III
First fragment is First part
Second Fragment | Result | State |
Last part. middle_start = fragment2.start | First and second fragments make a complete datagram | Finished two fragments total state 140 |
Last part. middle_start < fragment2.start | Missing middle part. middle_end = fragment2.start − 1 | Missing middle state 142 |
Middle part. middle_start = fragment2.start | Missing last part. middle_end = fragment2.end + 1 | Missing last state 144 |
Middle part. middle_start ≠ fragment2.start | More than three fragments | More than three fragments state 146 |
None of the above | Error | First exists state 134 |
TABLE IV
First fragment is Middle part
Second Fragment | Result | State |
Middle part. middle_end < fragment2.start or middle_start > fragment2.end | More than three fragments in datagram | More than three fragments state 146 |
Last part. middle_end + 1 = fragment2.start | Missing first part. | Missing first state 148 |
First part. middle_start = fragment2.end + 1 | Missing last part | Missing last state 144 |
None of the above | Error | Middle exists state 136 |
TABLE V
First fragment is Last part
Second Fragment | Result | State |
First part. middle_end = fragment2.end | First and second fragments make a complete datagram | Finished two fragments total state 140 |
First part. middle_end < fragment2.end | Missing middle part. middle_start = fragment2.end + 1 | Missing middle state 142 |
Middle part. middle_end = fragment2.end | Missing first part. middle_start = fragment2.start | Missing first state 148 |
Middle part. middle_end > fragment2.end | More than three fragments | More than three fragments state 146 |
None of the above | Error | Middle exists state 134 |
If the first and second fragments make a complete datagram, corresponding to the first rows of Tables III and V, comparison 108 is positive. In this case process 100 completes in complete datagram step 110, corresponding to state machine 130 moving to “Finished two fragments total” state 140. When comparison 108 is negative, comparison 109 is invoked, to check if there are more than three fragments in the datagram, corresponding to the fourth rows of Tables III and V and the first row of Table IV.
If comparison 109 is positive, process 100 finishes with an invoke Clark algorithm step 118, corresponding to machine 130 moving to state 146. If comparison 109 is negative, process 100 continues to a receive third fragment step 110, corresponding to state machine 130 being in states 142, 144, or 148. On receipt of the third fragment CPU 92 determines start and end values of the fragment, fragment3.start and fragment3.end respectively, and in a comparison step 112 the CPU compares these with parameters derived from the two fragments already received. Details of the comparisons are given in Tables VI, VII, and VIII below, corresponding to state machine 130 being in states 148, 142, and 144 respectively. The tables also show the final state of machine 130.
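The outcome of comparing the first two fragments may be illustrated by the sketch below. It is not a transcription of Tables III, IV, and V: the conditions are re-derived under the assumption that each fragment is described by its classification and an inclusive byte range, and the two fragments are compared in order of their start values regardless of arrival order. The state numbers are those of state machine 130; the function name `state_after_two` and the `ERROR` marker are hypothetical.

```python
FINISHED_TWO = 140
MISSING_MIDDLE = 142
MISSING_LAST = 144
MORE_THAN_THREE = 146
MISSING_FIRST = 148
ERROR = -1  # hypothetical marker for overlapping or inconsistent fragments


def state_after_two(f1, f2):
    """Return the state reached after two fragments of (kind, start, end)."""
    a, b = sorted((f1, f2), key=lambda f: f[1])  # order by start byte
    (a_kind, _, a_end), (b_kind, b_start, _) = a, b
    if a_end >= b_start:
        return ERROR  # duplicate or overlapping data
    adjacent = a_end + 1 == b_start
    kinds = (a_kind, b_kind)
    if kinds == ("first", "last"):
        return FINISHED_TWO if adjacent else MISSING_MIDDLE
    if kinds == ("first", "middle"):
        return MISSING_LAST if adjacent else MORE_THAN_THREE
    if kinds == ("middle", "last"):
        return MISSING_FIRST if adjacent else MORE_THAN_THREE
    if kinds == ("middle", "middle"):
        return MORE_THAN_THREE  # two distinct middles imply four or more fragments
    return ERROR


two_fragment_state = state_after_two(("first", 0, 999), ("last", 1000, 1517))
```

Because the pair is sorted by start value first, the same function covers the cases in which the last or the middle fragment arrives before the first.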
TABLE VI
Missing first fragment state 148
Third Fragment | Result | State |
First part. middle_start = fragment3.end | Three fragments make a complete datagram | Finished three fragments total state 150 |
First part. middle_start > fragment3.end | More than 3 fragments. | More than three fragments state 146 |
Middle part. middle_start ≥ fragment3.end | More than 3 fragments. | More than three fragments state 146 |
None of the above | Error | Missing first fragment state 148 |
TABLE VII
Missing middle fragment state 142
Third Fragment | Result | State |
Middle part. middle_start = fragment3.start and middle_end = fragment3.end | Three fragments make a complete datagram | Finished three fragments total state 150 |
Middle part. middle_start < fragment3.start | More than 3 fragments. | More than three fragments state 146 |
Middle part. middle_end > fragment3.end | More than 3 fragments. | More than three fragments state 146 |
None of the above | Error | Missing first fragment state 142 |
TABLE VIII
Missing last fragment state 144
Third Fragment | Result | State |
Last part. middle_end = fragment3.start − 1 | Three fragments make a complete datagram | Finished three fragments total state 150 |
Last part. middle_end < fragment3.start − 1 | More than 3 fragments. | More than three fragments state 146 |
Middle part. middle_end ≤ fragment3.end | More than 3 fragments. | More than three fragments state 146 |
None of the above | Error | Missing last fragment state 144 |
If in comparison 112 it is found that the three received fragments form a complete datagram, process 100 finishes at complete datagram step 114, corresponding to the first rows of Tables VI, VII, and VIII, and to state machine 130 being in state 150. If comparison 112 is false, process 100 concludes by transferring to a reassembly method suited to more than three fragments, such as the Clark algorithm. This corresponds to state machine 130 moving from state 146 to a further reassembly state 152, and to the already received fragments preferably being transferred to a different region of memory 94. Alternatively, the reassembly method may use links, stored in memory 94, to the already received fragments.
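The overall decision, namely to reassemble directly when at most three fragments constitute the datagram and otherwise to hand over to a general method, may be sketched as follows. This is a simplified completeness check and does not reproduce state machine 130: apart from the case of two distinct middle fragments, it defers the fallback decision until three fragments have been received. The function name, the tuple layout, and the result strings are hypothetical.

```python
def try_fast_reassembly(fragments):
    """Check whether at most three fragments form a complete datagram.

    Each fragment is a (start, more_fragments, payload) tuple, where `start`
    is the byte offset of the payload within the datagram.
    Returns ("complete", data), ("need_more", None) or ("fallback", None).
    """
    if not fragments:
        return ("need_more", None)
    if len(fragments) > 3:
        return ("fallback", None)
    ordered = sorted(fragments, key=lambda f: f[0])
    middles = sum(1 for start, more, _ in ordered if more and start > 0)
    if middles > 1:
        return ("fallback", None)  # two middles imply four or more fragments
    has_first = ordered[0][0] == 0 and bool(ordered[0][1])
    has_last = len(ordered) > 1 and not ordered[-1][1]
    contiguous = all(
        a[0] + len(a[2]) == b[0] for a, b in zip(ordered, ordered[1:])
    )
    if has_first and has_last and contiguous:
        return ("complete", b"".join(payload for _, _, payload in ordered))
    if len(fragments) == 3:
        return ("fallback", None)  # e.g. hand over to the Clark algorithm
    return ("need_more", None)


# Fragments arriving out of order: middle, then first, then last.
arrivals = [(576, True, b"B" * 576), (0, True, b"A" * 576), (1152, False, b"C" * 366)]
status, data = try_fast_reassembly(arrivals)
```

In this sketch the payloads are concatenated in order of their start values, so the arrival order of the fragments does not affect the reassembled data.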
Inspection of
It will be appreciated that state machine 130, by classifying datagram fragments as first, middle, or last fragments, is able to re-assemble datagrams which have been fragmented into up to three fragments extremely efficiently.
Data networks which operate according to an Ethernet protocol are able to transmit frames having a maximum length of 1518 bytes. A maximum transmission unit (MTU) for each component of the network, such as a router which conveys frames over the network, must be at least 576 bytes; typically, a number of routers within the network have the same values of MTU, such as 576 bytes. Thus, an Ethernet frame of 1518 bytes would be fragmented into three fragments if passing through one or more routers having MTUs of 576 bytes. State machine 130 will efficiently reassemble such fragments, without having to transfer to state 152, i.e., without having to implement a further reassembly algorithm.
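The fragment count in this example may also be checked with header overhead taken into account, as in the sketch below. The sketch assumes a 20-byte IP header without options on every fragment, payload sizes rounded down to a multiple of 8 bytes for all but the last fragment, and a 1500-byte IP datagram as the content of a maximum-length 1518-byte Ethernet frame (which also carries a 14-byte header and a 4-byte frame check sequence). The function name `count_fragments` is hypothetical.

```python
def count_fragments(datagram_len, mtu, header_len=20):
    """Number of IP fragments needed to carry a datagram over a link with `mtu`."""
    if datagram_len <= mtu:
        return 1  # fits without fragmentation
    payload = datagram_len - header_len
    # Every fragment except the last must carry a multiple of 8 payload bytes.
    per_fragment = ((mtu - header_len) // 8) * 8
    return -(-payload // per_fragment)  # ceiling division


fragments_needed = count_fragments(1500, 576)
```

Under these assumptions each fragment carries at most 552 payload bytes, so the 1480 payload bytes of the datagram are conveyed in three fragments.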
It will be appreciated that the preferred embodiments described above are cited by way of example, and that the present invention is not limited to what has been particularly shown and described hereinabove. Rather, the scope of the present invention includes both combinations and subcombinations of the various features described hereinabove, as well as variations and modifications thereof which would occur to persons skilled in the art upon reading the foregoing description and which are not disclosed in the prior art.
Shalom, Rafi, Mizrachi, Shay, Grinfeld, Ron
Executed on | Assignor | Assignee | Conveyance | Frame | Reel | Doc |
Aug 20 2002 | MIZRACHI, SHAY | SILIQUENT TECHNOLOGIES INC | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 013278 | /0480 | |
Aug 20 2002 | SHALOM, RAFI | SILIQUENT TECHNOLOGIES INC | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 013278 | /0480 | |
Aug 20 2002 | GRINFELD, RON | SILIQUENT TECHNOLOGIES INC | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 013278 | /0480 | |
Sep 06 2002 | Broadcom Corporation | (assignment on the face of the patent) | / | |||
Aug 31 2005 | SILIQUENT TECHNOLOGIES, INC | Broadcom Corporation | CHANGE OF NAME SEE DOCUMENT FOR DETAILS | 017477 | /0524 | |
Feb 01 2016 | Broadcom Corporation | BANK OF AMERICA, N A , AS COLLATERAL AGENT | PATENT SECURITY AGREEMENT | 037806 | /0001 | |
Jan 19 2017 | BANK OF AMERICA, N A , AS COLLATERAL AGENT | Broadcom Corporation | TERMINATION AND RELEASE OF SECURITY INTEREST IN PATENTS | 041712 | /0001 | |
Jan 20 2017 | Broadcom Corporation | AVAGO TECHNOLOGIES GENERAL IP SINGAPORE PTE LTD | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 041706 | /0001 |
Date | Maintenance Fee Events |
Apr 08 2016 | REM: Maintenance Fee Reminder Mailed. |
Aug 28 2016 | EXP: Patent Expired for Failure to Pay Maintenance Fees. |