Improved packet scheduling methods and apparatuses for use in, among other things, a network interface of a router (or other network element) are described herein. In one such improved method, packets buffered in a network interface for transmission on a communications link are segmented into multiple scheduling domains, each represented by a scheduling tree; each scheduling tree is assigned to a separate virtual port scheduling engine; and a top level scheduling engine is employed to schedule between the outputs of the virtual port scheduling engines to make the final choice of which buffered packet to transmit on the communications link (e.g., to move to the transmit queue of the network interface). By having the virtual port scheduling engines operate in parallel and substantially independently of each other, the rate at which packets can be moved into the transmit queue may increase greatly, thereby increasing the bandwidth of the network interface of the router.
|
1. A packet scheduling apparatus, comprising:
a first scheduling engine operable to (a) select a packet queue from a first set of packet queues and (b) move a packet from the selected packet queue to an intermediate packet queue included in a first set of intermediate packet queues, wherein the first scheduling engine is configured to perform the packet queue selection using information corresponding to a first set of scheduling nodes;
a second scheduling engine operable to (a) select a packet queue from a second set of packet queues and (b) move a packet from the selected packet queue to an intermediate packet queue included in a second set of intermediate packet queues, wherein the second scheduling engine is configured to perform the packet queue selection using information corresponding to a second set of scheduling nodes;
a third scheduling engine operable to (a) select a packet queue from a set of packet queues comprising the first set of intermediate packet queues and the second set of intermediate packet queues and (b) move a packet from the selected packet queue to a transmit queue;
a packet transmitter configured to transmit on to a communications link packets from the transmit queue,
wherein the first scheduling engine and the second scheduling engine are configured to select packet queues independently of each other such that state information need not be shared between the first and second scheduling engines.
12. A packet scheduling method, the method comprising:
assigning a first set of packet queues to a first scheduling engine;
assigning a second set of packet queues to a second scheduling engine;
assigning a first packet flow to a set of packet queues included in the first set of packet queues;
assigning a second packet flow to a set of packet queues included in the second set of packet queues;
receiving, at a network interface of a network element, a packet;
determining a packet flow to which the packet belongs;
if the received packet belongs to the first packet flow, then placing the received packet in one of the packet queues included in the set of packet queues to which the first packet flow is assigned in response to determining that the received packet belongs to the first packet flow; and
if the received packet belongs to the second packet flow, then placing the received packet in one of the packet queues included in the set of packet queues to which the second packet flow is assigned in response to determining that the received packet belongs to the second packet flow, wherein
the first scheduling engine (a) selects a packet queue from the first set of packet queues and (b) moves a packet from the selected packet queue to an intermediate packet queue included in a first set of intermediate packet queues,
the second scheduling engine (a) selects a packet queue from the second set of packet queues and (b) moves a packet from the selected packet queue to an intermediate packet queue included in a second set of intermediate packet queues,
a third scheduling engine (a) selects a packet queue from a set of packet queues comprising the first set of intermediate packet queues and the second set of intermediate packet queues and (b) moves a packet from the selected packet queue to a transmit queue, and
a packet transmitter of the network interface transmits on to a communications link packets from the transmit queue.
2. The packet scheduling apparatus of
3. The packet scheduling apparatus of
a set of packet queues included in the first set of packet queues is associated with a first packet flow,
a set of packet queues included in the second set of packet queues is associated with a second packet flow, and
the packet scheduling apparatus further includes a packet receiving and processing unit (PRPU) configured to (a) receive a packet, (b) determine a packet flow to which the packet belongs, and (c) place the packet in a packet queue associated with the packet flow.
4. The packet scheduling apparatus of
the third scheduling engine is operable to select a packet queue from a set of packet queues comprising the first set of intermediate packet queues, the second set of intermediate packet queues, and a third set of packet queues,
a set of packet queues included in the third set of packet queues is associated with a third packet flow, and
the PRPU is configured such that (a) when the PRPU receives a packet and determines that the packet belongs to the first packet flow, the PRPU places the packet in one of the packet queues included in the set of packet queues that is associated with the first packet flow, (b) when the PRPU receives a packet and determines that the packet belongs to the second packet flow, the PRPU places the packet in one of the packet queues included in the set of packet queues that is associated with the second packet flow, and (c) when the PRPU receives a packet and determines that the packet belongs to the third packet flow, the PRPU places the packet in one of the packet queues included in the set of packet queues that is associated with the third packet flow.
5. The packet scheduling apparatus of
6. The packet scheduling apparatus of
a configuration module configured to examine information defining the scheduling tree, assign to the first scheduling engine a first sub-tree of the scheduling tree, and assign to the second scheduling engine a second, different sub-tree of the scheduling tree.
7. The packet scheduling apparatus of
8. The packet scheduling apparatus of
the information corresponding to the first set of scheduling nodes comprises: (a) first maximum data rate information associated with one of the scheduling nodes included in the first set of scheduling nodes and (b) information identifying a first scheduling algorithm, and
the information corresponding to the second set of scheduling nodes comprises: (a) second maximum data rate information associated with one of the scheduling nodes included in the second set of scheduling nodes and (b) information identifying a second scheduling algorithm.
9. The packet scheduling apparatus of
the first scheduling engine is configured to select a packet queue from which to remove a packet using the first maximum data rate information and the first scheduling algorithm, and
the second scheduling engine is configured to select a packet queue from which to remove a packet using the second maximum data rate information and the second scheduling algorithm.
10. The packet scheduling apparatus of
a data processing system; and
a non-transitory computer readable medium accessible to the data processing system, the non-transitory computer readable medium storing computer readable program code that when executed by the data processing system cause the data processing system to (a) select a packet queue from the first set of packet queues and (b) move a packet from the selected packet queue to an intermediate packet queue included in the first set of intermediate packet queues.
11. The packet scheduling apparatus of
13. The method of
14. The method of
the first scheduling engine performs the packet queue selection using information corresponding to a first scheduling tree comprising a first set of hierarchically arranged scheduling nodes, and
the second scheduling engine is configured to perform the packet queue selection using information corresponding to a second scheduling tree comprising a second set of hierarchically arranged scheduling nodes.
15. The method of
16. The method of
the information corresponding to the first set of scheduling nodes comprises: (a) first maximum data rate information associated with one of the scheduling nodes included in the first set of scheduling nodes and (b) a first scheduling algorithm, and
the information corresponding to the second set of scheduling nodes comprises: (a) second maximum data rate information associated with one of the scheduling nodes included in the second set of scheduling nodes and (b) a second scheduling algorithm.
17. The method of
the first scheduling engine is configured to select a packet queue from which to remove a packet using the first maximum data rate information and the first scheduling algorithm, and
the second scheduling engine is configured to select a packet queue from which to remove a packet using the second maximum data rate information and the second scheduling algorithm.
18. The method of
19. The method of
20. The method of
the third scheduling engine selects a packet queue from a set of packet queues comprising the first set of intermediate packet queues, the second set of intermediate packet queues, and a third set of packet queues,
a third packet flow is assigned to a set of packet queues included in the third set of packet queues, and
the method further comprises placing the received packet in one of the packet queues included in the set of packet queues to which the third packet flow is assigned in response to determining that the received packet belongs to the third packet flow.
21. The method of
assigning a third packet flow to (i) a set of packet queues included in the first set of packet queues and (ii) a set of packet queues included in the second set of packet queues; and
if the received packet belongs to the third packet flow, then, using a load balancer, placing the received packet in one of the packet queues to which the third packet flow is assigned.
|
The invention relates to packet scheduling. As used herein, the term “packet” is used broadly to encompass, for example, any unit of data at any layer of the OSI model (e.g., network layer, transport layer, data link layer, application layer, etc.).
Packet scheduling is necessary when multiple packets compete for a common outgoing communications link (e.g., a physical communications link or a pseudo-wire). This scenario commonly occurs in routers (and other network elements). At its simplest, a router connects a first network with a second network. That is, there is a first physical communications link that connects a first network interface of the router to the first network and a second physical communications link that connects a second network interface of the router to the second network, thereby enabling the router to route packets between the two networks. The router may receive from the first network, via the first physical communications link, packets destined for a node in the second network. At certain points in time, the rate at which these packets arrive at the router may exceed the rate at which the router can transmit packets onto the second physical communications link (e.g., the second physical communications link may have a lower bandwidth than the first physical communications link). Thus, the router may employ packet queues to temporarily store the received packets. Consequently, at any given point in time, it is likely that the router is storing multiple packets in its packet queues that were received from the first network and destined for the second network. As there may be a single physical communications link connecting the router to the second network, the queued packets all “compete” for this common outgoing physical communications link. As such, the router requires some method of packet scheduling. That is, the router needs some way to select which of the queued packets will be next in line for outgoing transmission.
One packet scheduling technique involves (a) creating a scheduling tree having a root scheduling node, a set of leaf scheduling nodes and zero or more aggregate scheduling nodes, where each leaf scheduling node is associated with a packet queue, and (b) employing a scheduling engine to, on a continuous basis, traverse the scheduling tree to arrive at a leaf scheduling node and to move a packet from the packet queue associated with the leaf scheduling node to a transmit queue. A problem with this technique is that the performance of the scheduling engine may be limited due to, among other things, memory bandwidth limitations and contention overhead for accessing and updating the shared state information of each scheduling node.
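The following is a minimal, hedged sketch of the scheduling-tree technique described in the preceding paragraph: a tree of scheduling nodes in which each leaf is tied to a packet queue, and an engine that repeatedly traverses the tree to a leaf and moves one packet to the transmit queue. The class and function names (SchedNode, pick_leaf, schedule_one) are illustrative, simple round-robin stands in for whatever scheduling algorithm a node is configured with, and the per-node rate limits discussed later are omitted for brevity.

```python
# Minimal sketch of the scheduling-tree technique; names and the round-robin
# choice are illustrative, not taken from any particular embodiment.
from collections import deque
from itertools import cycle

class SchedNode:
    """A scheduling node; a node with no children is a leaf tied to a packet queue."""
    def __init__(self, name, children=None, queue=None):
        self.name = name
        self.children = children or []
        self.queue = queue                                    # only leaf nodes carry a packet queue
        self._rr = cycle(self.children) if self.children else None  # round-robin position

    def is_leaf(self):
        return not self.children

def pick_leaf(node):
    """Traverse top-down, choosing a child at each level (round-robin here)."""
    while not node.is_leaf():
        node = next(node._rr)
    return node

def schedule_one(root, transmit_queue):
    """Move one packet from the chosen leaf's packet queue to the transmit queue."""
    leaf = pick_leaf(root)
    if leaf.queue:                                            # skip this round if the queue is empty
        transmit_queue.append(leaf.queue.popleft())

# usage: two leaves under one aggregate node plus a leaf directly under the root
q1, q2, q3 = deque(["p1a", "p1b"]), deque(["p2a"]), deque(["p3a"])
root = SchedNode("root", children=[
    SchedNode("agg", children=[SchedNode("leaf1", queue=q1), SchedNode("leaf2", queue=q2)]),
    SchedNode("leaf3", queue=q3),
])
tx = deque()
for _ in range(3):
    schedule_one(root, tx)
print(list(tx))
```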
What is desired, therefore, is an improved packet scheduling process.
Methods and apparatuses for improving packet scheduling in a network interface of a router (or other network element) are described herein. One method is to segment the packets buffered in the network interface for transmission (e.g., for transmission on a physical communications link or port, a pseudo-wire, or a Link Aggregation Group (LAG)) into multiple scheduling domains, where each scheduling domain is represented by a scheduling tree, assign each scheduling tree to a separate virtual port scheduling engine, and employ a top level scheduling engine to schedule between the outputs of the virtual port scheduling engines to make the final choice of which buffered packet to transmit (e.g., to move to a transmit queue of the network interface).
Having the virtual port scheduling engines operate in parallel and substantially independently of each other greatly reduces the amount of shared state that must be considered for each individual scheduling decision. Consequently, with this technique, the rate at which packets can be moved into the transmit queue may increase substantially. Thus, if the network interface is connected to a high-speed communications link (e.g., a 100 Gigabits per second (Gbps) physical communications link), the ability of the scheduling system to operate fast enough to utilize the full bandwidth of the communications link is enhanced.
Accordingly, in one aspect, a packet scheduling apparatus is provided. In some embodiments, the packet scheduling apparatus includes: a first scheduling engine (e.g., a first virtual port scheduling engine); a second scheduling engine (e.g., a second virtual port scheduling engine); and a third scheduling engine (e.g., a top level scheduling engine). The first scheduling engine is operable to (a) select a packet queue from a first set of packet queues and (b) move a packet from the selected packet queue to an intermediate packet queue included in a first set of intermediate packet queues. The first scheduling engine may be configured to perform the packet queue selection using information corresponding to a first set of scheduling nodes (e.g., a hierarchically arranged set of scheduling nodes that forms a scheduling tree).
Like the first scheduling engine, the second scheduling engine is operable to (a) select a packet queue from a second set of packet queues and (b) move a packet from the selected packet queue to an intermediate packet queue included in a second set of intermediate packet queues. The second scheduling engine may be configured to perform the packet queue selection using information corresponding to a second set of scheduling nodes. In some embodiments, the first scheduling engine and the second scheduling engine are configured to select packet queues independently of each other such that state information need not be shared between the first and second scheduling engines. The third scheduling engine is operable to (a) select a packet queue from a set of packet queues that includes the first set of intermediate packet queues and the second set of intermediate packet queues and (b) move a packet from the selected packet queue to a transmit queue.
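As an illustration of this arrangement, below is a single-threaded, hedged sketch in which two virtual port engines each drain their own packet queues into an intermediate queue, and a top-level engine drains the intermediate queues into the transmit queue. The class names (VirtualPortEngine, TopLevelEngine) and the round-robin selection are illustrative assumptions rather than terms or algorithms taken from the embodiments themselves.

```python
# Illustrative data flow: two virtual-port engines feed intermediate queues,
# a top-level engine feeds the transmit queue. Names are assumptions.
from collections import deque
from itertools import cycle

class VirtualPortEngine:
    def __init__(self, packet_queues, intermediate_queue):
        self.queues = packet_queues
        self.iq = intermediate_queue
        self._rr = cycle(packet_queues)        # simple round-robin selection

    def step(self):
        for _ in range(len(self.queues)):      # find a non-empty packet queue, if any
            q = next(self._rr)
            if q:
                self.iq.append(q.popleft())
                return

class TopLevelEngine:
    def __init__(self, intermediate_queues, transmit_queue):
        self.iqs = intermediate_queues
        self.tx = transmit_queue
        self._rr = cycle(intermediate_queues)

    def step(self):
        for _ in range(len(self.iqs)):         # find a non-empty intermediate queue, if any
            iq = next(self._rr)
            if iq:
                self.tx.append(iq.popleft())
                return

# usage: four packet queues split across the two virtual-port engines
q1, q2, q3, q4 = (deque([f"pkt{i}"]) for i in range(4))
iq1, iq2, tx = deque(), deque(), deque()
e1 = VirtualPortEngine([q1, q2], iq1)
e2 = VirtualPortEngine([q3, q4], iq2)
e3 = TopLevelEngine([iq1, iq2], tx)
for _ in range(4):
    e1.step(); e2.step(); e3.step()
print(list(tx))
```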
In some embodiments, the packet scheduling apparatus may be implemented in a network interface and may also include a packet transmitter configured to transmit on to a communications link packets from the transmit queue.
In some embodiments, the first and second scheduling engines are software based scheduling engines, each comprising a computer readable medium having computer code stored therein that is loaded into, and executed by, a processor, and the third scheduling engine is a purely hardware based scheduling engine that is implemented using an application specific integrated circuit (ASIC).
In some embodiments, a set of packet queues included in the first set of packet queues is associated with a first packet flow, a set of packet queues included in the second set of packet queues is associated with a second packet flow, and the packet scheduling apparatus further includes a packet receiving and processing unit (PRPU) configured to (a) receive a packet, (b) determine a packet flow to which the packet belongs, and (c) place the packet in an egress packet queue associated with the packet flow. The PRPU may be software based (e.g., the PRPU may include a computer readable medium having computer code stored therein loaded into, and executed by, a processor) or hardware based (e.g., the PRPU may be implemented using an application specific integrated circuit (ASIC)).
In some embodiments, the third scheduling engine is operable to select a packet queue from a set of packet queues comprising the first set of intermediate packet queues, the second set of intermediate packet queues, and a third set of packet queues, where a set of packet queues included in the third set of packet queues is associated with a third packet flow. In such embodiments, the PRPU is configured such that (a) when the PRPU receives a packet and determines that the packet belongs to the first packet flow, the PRPU places the packet in one of the packet queues included in the set of packet queues that is associated with the first packet flow, (b) when the PRPU receives a packet and determines that the packet belongs to the second packet flow, the PRPU places the packet in one of the packet queues included in the set of packet queues that is associated with the second packet flow, and (c) when the PRPU receives a packet and determines that the packet belongs to the third packet flow, the PRPU places the packet in one of the packet queues included in the set of packet queues that is associated with the third packet flow.
In some embodiments, the first set of scheduling nodes includes scheduling nodes from a first sub-tree of a scheduling tree and the second set of scheduling nodes comprises scheduling nodes from a second, different sub-tree of the scheduling tree. In such embodiments, a configuration module may be configured to examine information defining the scheduling tree, assign to the first scheduling engine a first sub-tree of the scheduling tree, and assign to the second scheduling engine a second, different sub-tree of the scheduling tree.
In some embodiments, the first set of scheduling nodes includes a set of scheduling nodes that are also included in the second set of scheduling nodes.
In some embodiments, the information that corresponds to the first set of scheduling nodes comprises: (a) first maximum data rate information associated with one of the scheduling nodes included in the first set of scheduling nodes and (b) information identifying a first scheduling algorithm, and the information that corresponds to the second set of scheduling nodes comprises: (a) second maximum data rate information associated with one of the scheduling nodes included in the second set of scheduling nodes and (b) information identifying a second scheduling algorithm.
In some embodiments, the first scheduling engine is configured to select a packet queue from which to remove a packet using the first maximum data rate information and the first scheduling algorithm, and the second scheduling engine is configured to select a packet queue from which to remove a packet using the second maximum data rate information and the second scheduling algorithm.
In some embodiments, the first scheduling engine includes: a data processing system; and a computer readable medium accessible to the data processing system. The computer readable medium may store computer readable program code that, when executed by the data processing system, causes the data processing system to (a) select a packet queue from the first set of packet queues and (b) move a packet from the selected packet queue to an intermediate packet queue included in the first set of intermediate packet queues.
In another aspect, a packet scheduling method is provided. In some embodiments, the packet scheduling method includes the following steps: assigning a first set of packet queues to a first scheduling engine; assigning a second set of packet queues to a second scheduling engine; assigning a first packet flow to a set of packet queues included in the first set of packet queues; assigning a second packet flow to a set of packet queues included in the second set of packet queues; receiving, at a network interface of a network element, a packet; determining a packet flow to which the packet belongs; if the received packet belongs to the first packet flow, then placing the received packet in one of the packet queues included in the set of packet queues to which the first packet flow is assigned in response to determining that the received packet belongs to the first packet flow; and if the received packet belongs to the second packet flow, then placing the received packet in one of the packet queues included in the set of packet queues to which the second packet flow is assigned in response to determining that the received packet belongs to the second packet flow.
In some embodiments, the first scheduling engine (a) selects a packet queue from the first set of packet queues and (b) moves a packet from the selected packet queue to an intermediate packet queue included in a first set of intermediate packet queues, the second scheduling engine (a) selects a packet queue from the second set of packet queues and (b) moves a packet from the selected packet queue to an intermediate packet queue included in a second set of intermediate packet queues, and a third scheduling engine (a) selects a packet queue from a set of packet queues comprising the first set of intermediate packet queues and the second set of intermediate packet queues and (b) moves a packet from the selected packet queue to a transmit queue. A packet transmitter of the network interface is configured to transmit on to a communications link packets from the transmit queue.
The above and other aspects and embodiments are described below with reference to the accompanying drawings.
The accompanying drawings, which are incorporated herein and form part of the specification, illustrate various embodiments of the present invention and, together with the description, further serve to explain the principles of the invention and to enable a person skilled in the pertinent art to make and use the invention. In the drawings, like reference numbers indicate identical or functionally similar elements.
As used herein the indefinite articles “a” and “an” mean “one or more.”
In the example shown, access network 103 is a digital subscriber line (DSL) access network, but any type of access network may be used. The example DSL access network 103 includes DSL modems 102 connected to DSL access multiplexers (DSLAMs) 104, which in turn are connected to a switch 106. For example, DSL modem 102 is connected via a physical communications link 121 with DSLAM 104, which is connected via a physical communications link 122 with switch 106. Switch 106 is connected via a physical communications link 123 (which may be wired or wireless) with a network interface 191 of network element 108. Similarly, network 110 is connected via a physical communications link 124 with a network interface 192 of edge router 108. Network interfaces 191 and 192 may be connected by a backplane component (not shown) of network element 108. Another network 112 may also be connected to switch 106.
In one embodiment, each packet received by PRPU 202 belongs to a single packet flow. In this embodiment, for each packet received by PRPU 202, PRPU 202 functions to determine the packet flow to which the received packet belongs. In some embodiments, PRPU 202 determines the packet flow to which a received packet belongs by examining data included in a packet header included in the packet or by examining meta-data for the packet, if any. For instance, the packet header (or meta-data) may include one or more virtual local area network (VLAN) tags (e.g., an outer VLAN tag and an inner VLAN tag) and may also include information identifying the type of payload data the packet is carrying (e.g., real-time data, such as voice-over IP data, or non-real time data, such as HTTP messages). As a specific example, all packets associated with a certain outer VLAN tag, inner VLAN tag, and payload type are determined to belong to the same flow, whereas all packets associated with a different outer VLAN tag, inner VLAN tag, or payload type are determined to belong to a different packet flow.
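To make the flow-determination step concrete, here is a hedged sketch in which the flow key is derived from the outer VLAN tag, inner VLAN tag, and payload type carried in the packet header or its meta-data, and then looked up in a flow-to-queue mapping. The field names, flow keys, and queue names are illustrative assumptions; real header parsing is omitted.

```python
# Hedged sketch: classify a packet into a flow using (outer VLAN, inner VLAN,
# payload type), then look the flow up in a flow-to-queue mapping.
def flow_key(pkt_meta):
    """Packets sharing outer VLAN, inner VLAN, and payload type belong to one flow."""
    return (pkt_meta["outer_vlan"], pkt_meta["inner_vlan"], pkt_meta["payload_type"])

# illustrative flow-to-queue mapping (the role played by database 204)
flow_to_queue = {
    ("vlan1", "vlan1.1", "voice"): "q1",
    ("vlan1", "vlan1.1", "data"):  "q2",
    ("vlan1", "vlan1.2", "data"):  "q3",
}

pkt_meta = {"outer_vlan": "vlan1", "inner_vlan": "vlan1.1", "payload_type": "voice"}
print(flow_to_queue[flow_key(pkt_meta)])   # -> q1
```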
PRPU 202 also functions to add the received packet to a packet queue based on the determined packet flow to which the packet belongs. That is, in some embodiments, each packet flow is associated with a packet queue. For instance, network interface 191 may include a packet flow to packet queue database (DB), which may be implemented in, for example, a computer readable medium into which data is written and read, that stores information that maps each one of a set of defined packet flows to a packet queue. As shown in
For example, if it is assumed that all packets received by PRPU 202 and destined for network 112 belong to the same packet flow, then this packet flow may be associated with, for example, q8. Thus, in this example, when PRPU 202 receives from network interface 192 a packet destined for network 112 (or meta-data for the packet—e.g., a packet identifier, a memory location identifier identifying the memory location where the packet is stored, destination address information), PRPU 202 will “add” the packet to q8. The packet queues in packet queue set 206 do not need to be physical packet queues in the sense that all packets in a packet queue are located in sequence in the same storage device. Rather, the packet queues described herein may be logical packet queues, such as logical first-in-first-out (FIFO) packet queues. The packets themselves may be stored anywhere. Thus, “adding” a packet to a packet queue may consist of merely adding to a data structure that implements the packet queue (e.g., a linked list data structure) an identifier uniquely associated with the packet (e.g., an identifier identifying the memory location where the packet is stored).
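A minimal sketch of the “logical packet queue” idea in the preceding paragraph follows: the queue stores only identifiers of packets that may themselves be stored anywhere, so “adding” a packet is simply appending its identifier. The names (packet_store, add_packet, q8) are illustrative; q8 is reused from the example above purely for continuity.

```python
# Hedged sketch: a logical FIFO packet queue that holds packet identifiers only.
from collections import deque

packet_store = {}          # packet_id -> packet bytes, stored anywhere
q8 = deque()               # logical FIFO queue, e.g., for the flow destined for network 112

def add_packet(queue, packet_id, packet_bytes):
    packet_store[packet_id] = packet_bytes
    queue.append(packet_id)                 # "adding" a packet == appending its identifier

def pop_packet(queue):
    return packet_store.pop(queue.popleft())

add_packet(q8, "pkt-0001", b"\x45\x00...")
print(pop_packet(q8))
```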
While PRPU 202 is processing packets (e.g., adding packets to one of the packet queues 206), scheduling system 212 continuously selects one of the packet queues 206 and moves a packet from the selected packet queue to a transmit queue 214. In parallel, packet transmitter 216 continuously removes packets from transmit queue 214 and transmits those packets onto physical communications link 123. In some embodiments, packet transmitter 216 may, prior to transmitting a packet, add a header to the packet, thereby creating a protocol data unit. In this manner, packets flow into and out of the egress portion of network interface 191.
In some embodiments, when it is time for scheduling system 212 to select a packet queue, scheduling system 212 traverses a scheduling tree to determine the packet queue from packet queue set 206 that it should select. Thus, network interface 191 may include a scheduling tree database 210, which may be implemented in a computer readable medium into which data is written and read, for storing information defining the scheduling tree.
In the example shown, each leaf scheduling node and each aggregate scheduling node represents a subset of the packet flows received by PRPU 202 that may be transmitted onto physical communications link 123, and root scheduling node 301 represents all of the packet flows received by PRPU 202 that may be transmitted onto physical communications link 123. Additionally, each leaf scheduling node is associated with a unique packet queue. Thus, scheduling tree 300 illustrates a packet flow to packet queue mapping that may be stored in database 204 and used by PRPU 202, as discussed above.
As a specific example, leaf scheduling node 303 represents the flow of packets to network 112, leaf scheduling node 309 represents the flow of voice packets (packets containing voice data, such as voice-over IP data) destined for VLAN 1.1, leaf scheduling node 310 represents the flow of non-voice packets destined for VLAN 1.1, aggregate scheduling node 305 represents the flow of all packets destined for VLAN 1.1 (i.e., voice and non-voice), and leaf scheduling node 306 represents the flow of all packets destined for VLAN 1.2. In this example, it is assumed that VLAN 1 is associated with DSLAM 104, such that all traffic destined for VLAN 1 is transmitted by switch 106 on physical communications link 122, and VLAN 1.1 is associated with DSL modem 102, such that all traffic destined for VLAN 1.1 is transmitted by DSLAM 104 onto physical communications link 121.
As discussed, each scheduling node may be implemented as a data structure stored in a computer readable medium. Thus, in some embodiments, each data structure that implements a scheduling node may include (i) a parent pointer data element that stores a parent scheduling node pointer that points to another data structure that implements another scheduling node (i.e., the scheduling node's parent) and (ii) a set of child pointer data elements, where each child pointer data element stores a child scheduling node pointer that points to another data structure that implements another scheduling node (i.e., one of the scheduling node's children). Thus, each scheduling node, relative to another scheduling node, may be the parent or the child of that other scheduling node. In the case of a data structure that implements a root node, the parent scheduling node pointer of that data structure may point to NULL because, in some embodiments, by definition, a root node may not have a parent scheduling node. Likewise, in the case of a data structure that implements a leaf node, each child scheduling node pointer of that data structure may point to NULL because, in some embodiments, by definition, a leaf node may not have any child scheduling nodes.
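A hedged sketch of such a scheduling-node data structure is shown below: each node carries a parent pointer (None standing in for NULL at the root) and a list of child pointers (empty for a leaf), plus optional per-node parameters such as a maximum data rate. The dataclass and field names are illustrative assumptions rather than the data elements used in any particular embodiment.

```python
# Hedged sketch of a scheduling-node data structure: parent pointer, child
# pointers, and an optional maximum data rate. Names are illustrative.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class SchedulingNode:
    name: str
    parent: Optional["SchedulingNode"] = None                       # None plays the role of NULL at the root
    children: List["SchedulingNode"] = field(default_factory=list)  # empty list for a leaf node
    max_rate_bps: Optional[int] = None                              # per-node maximum data rate, if configured

    def add_child(self, child: "SchedulingNode") -> "SchedulingNode":
        child.parent = self
        self.children.append(child)
        return child

root = SchedulingNode("root")
agg = root.add_child(SchedulingNode("agg-302", max_rate_bps=10_000_000_000))
leaf = agg.add_child(SchedulingNode("leaf-309"))
print(leaf.parent.name, root.parent, leaf.children)   # agg-302 None []
```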
As discussed above, scheduling system 212 may be configured to select a packet queue from which to obtain a packet for delivery to a transmit queue 214 by traversing scheduling tree 300. In some embodiments, scheduling system 212 traverses scheduling tree 300 in a top-down manner (but, in other embodiments, scheduling system 212 may traverse scheduling tree 300 using a bottom-up traversal algorithm) by starting at root scheduling node 301 and then selecting a child scheduling node (e.g., selecting a child pointer data element from the data structure that implements root node 301). In some embodiments, root scheduling node 301 may be associated with a scheduling algorithm (e.g., round-robin). Also, each scheduling node may be associated with a maximum data rate (and other parameters, such as a minimum target data rate). For example, as discussed above, a scheduling node may be implemented as a data structure; therefore, a scheduling node may be associated with a maximum data rate by storing the maximum data rate in a data element of the data structure that implements the scheduling node.
In such embodiments, scheduling system 212 selects a child scheduling node of root scheduling node 301 using the scheduling algorithm associated with root scheduling node 301 and the maximum data rates. For example, if it is assumed that (a) the maximum data rate associated with scheduling node 302 is 10 Gbps and (b) the scheduling algorithm associated with root scheduling node 301 indicates that scheduling system 212 should select aggregate scheduling node 302, then scheduling system 212 will select aggregate scheduling node 302, unless, within the last second of time (or other period of time), scheduling system 212 has already selected more than 10 Gb of data from the packet queues associated with scheduling node 302 (i.e., packet queues q1, q2 and q3), in which case scheduling system 212 will select one of the other scheduling nodes directly connected to root scheduling node 301 (i.e., scheduling node 303 or 304, in this example).
If the selected child scheduling node is a leaf scheduling node, then scheduling system 212 selects the packet queue associated with the selected leaf scheduling node and moves a packet from the selected packet queue to the transmit queue 214. If the selected child scheduling node is not a leaf scheduling node (i.e., is an aggregate scheduling node), then scheduling system 212 selects a child scheduling node of the selected aggregate scheduling node. This process repeats until scheduling system 212 selects a leaf scheduling node. In this manner, scheduling system 212 traverses scheduling tree 300, considering and enforcing maximum rates or other scheduling rules at each level and node of the tree.
Like root scheduling node 301, the selected aggregate scheduling node may be associated with a scheduling algorithm, and each child scheduling node of the selected aggregate scheduling node may be associated with a maximum data rate (and/or other parameters). Thus, scheduling system 212 uses the scheduling algorithm and maximum data rates to determine which child scheduling node will be selected. As discussed above, this process repeats until scheduling system 212 selects a scheduling node that is a leaf scheduling node (i.e., a scheduling node that does not have any child scheduling nodes). After selecting a leaf scheduling node and moving to transmit queue 214 a packet from the packet queue associated with the selected leaf scheduling node, scheduling system 212 will once again traverse the scheduling tree 300 starting at root scheduling node 301. Thus, scheduling system 212 continuously traverses the scheduling tree 300 and, thereby, continuously selects a packet queue from which to move a packet to transmit queue 214. In this manner, packets are queued for transmission on physical communications link 123.
As is evident from the above description, scheduling system 212 maintains state information for at least some of the scheduling nodes. For example, if a scheduling node has a maximum data rate associated with it, then scheduling system 212 will keep track of how much data has been selected for transmission from the packet queues associated (directly and indirectly) with the scheduling node. As another example, if a scheduling node is associated with a scheduling algorithm, then scheduling system 212 may maintain state information required to implement the scheduling algorithm (e.g., in the case where the scheduling algorithm of the scheduling node is a round-robin scheduling algorithm, then scheduling system 212 may keep track of which child of the scheduling node had the last “turn”). In some embodiments, scheduling system 212 may store the state information for a scheduling node in one or more data elements of the data structure that implements the scheduling node.
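The sketch below illustrates how such per-node state might be used during traversal, under the assumption of a fixed one-second accounting window: a child is eligible only while the bytes already scheduled through it in the current window stay under its maximum data rate, and a scheduled packet is charged against the chosen leaf and all of its ancestors. The node type is redefined here only to keep the example self-contained; the window length, field names, and the "pick the first eligible child" rule are illustrative stand-ins for whatever scheduling algorithm a node is actually configured with.

```python
# Hedged sketch: per-node rate state and its use during a top-down traversal.
# Assumes a one-second accounting window; names and numbers are illustrative.
import time
from dataclasses import dataclass, field
from typing import List, Optional

WINDOW_SECONDS = 1.0

@dataclass
class Node:
    name: str
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)
    max_rate_bps: Optional[int] = None
    bytes_scheduled: int = 0      # state: data selected for transmission in the current window
    window_start: float = 0.0

def eligible(node: Node, now: float) -> bool:
    """A node may be selected only while it is under its maximum rate for the window."""
    if node.max_rate_bps is None:
        return True
    if now - node.window_start >= WINDOW_SECONDS:
        node.window_start, node.bytes_scheduled = now, 0   # start a new accounting window
    return node.bytes_scheduled * 8 < node.max_rate_bps

def charge(leaf: Node, packet_len: int) -> None:
    """Charge a scheduled packet against the leaf and every ancestor up to the root."""
    node = leaf
    while node is not None:
        node.bytes_scheduled += packet_len
        node = node.parent

def pick_leaf(root: Node, now: float) -> Optional[Node]:
    node = root
    while node.children:
        candidates = [c for c in node.children if eligible(c, now)]
        if not candidates:
            return None             # nothing schedulable under the rate limits right now
        node = candidates[0]        # stand-in for the node's scheduling algorithm
    return node

# usage: one aggregate node capped at 10 Gbps with a single leaf beneath it
root = Node("root")
agg = Node("agg-302", parent=root, max_rate_bps=10_000_000_000); root.children.append(agg)
leaf = Node("leaf-309", parent=agg); agg.children.append(leaf)
chosen = pick_leaf(root, time.time())
if chosen is not None:
    charge(chosen, 1500)            # a 1500-byte packet was moved toward the transmit queue
print(chosen.name if chosen else "nothing eligible")
```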
In situations where the transmission capacity of physical communications link 123 is high (e.g., 100 Gbps), scheduling system 212 may not be able to move packets into transmit queue 214 quickly enough to utilize all of the 100 Gbps capacity, because the scheduling tree used by scheduling system 212 has too many decision points. In such situations, multiple new scheduling trees can be formed from the existing scheduling tree. For example,
In the example shown, tree 401 is used by scheduling engine 521 to select a packet queue from the packet queue set that consists of iq1, iq2 and q8; tree 402 is used by virtual port scheduling engine 522 to select a packet queue from the packet queue set that consists of q1-q3; and tree 403 is used by virtual port scheduling engine 523 to select a packet queue from the packet queue set that consists of q4-q7. Each scheduling engine 521-523 operates in the same manner as scheduling system 212 described above in connection with tree 300. That is, each scheduling engine 521-523 continually traverses its corresponding scheduling tree; thus each scheduling engine 521-523 continually moves packets from a packet queue selected based on the corresponding tree to transmit queue 214 or to an intermediate packet queue.
More specifically, scheduling engine 521 is configured such that it will move a packet from a selected packet queue to transmit queue 214, whereas scheduling engines 522 and 523 are configured such that each will move a packet from a selected packet queue to an intermediate packet queue (e.g., iq1 and iq2, respectively). Scheduling engines 521, 522, and 523 may be configured to operate in parallel. That is, while scheduling engines 522 and 523 are moving packets into the intermediate packet queues (iq1 and iq2), scheduling engine 521 may be moving packets out of those packet queues and into transmit queue 214. Additionally, scheduling engines 521, 522, and 523 may be configured to operate independently of each other such that no scheduling engine needs state information maintained by another scheduling engine. In this manner, the rate at which packets are moved into transmit queue 214 can increase greatly. For example, if we assume that at least one of the intermediate packet queues always contains at least one packet, then the rate at which packets are moved into transmit queue 214 depends solely on the “bandwidth” of scheduling engine 521 (i.e., the rate at which scheduling engine 521 can transfer packets to transmit queue 214). Moreover, in some embodiments, scheduling engine 521 can be a very simple scheduling engine because its scheduling tree (e.g., tree 401) may only require traversing a single level (e.g., all of the scheduling nodes connected to the root scheduling node 301 are leaf scheduling nodes). Thus, in some embodiments, scheduling engine 521 is implemented substantially purely in hardware so that it will have high bandwidth. For example, in some embodiments, scheduling engine 521 consists of (or consists essentially of) an application specific integrated circuit (ASIC), whereas scheduling engines 522 and 523 are software based (e.g., implemented using a general purpose processor having associated therewith a computer readable medium having a computer program stored thereon, such as a program written in an assembly language).
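To illustrate the parallel operation described above, here is a hedged, thread-based sketch: two software "virtual port" engines fill their intermediate queues while a top-level engine concurrently drains them into the transmit queue. In a real device the top-level stage might be hardware (e.g., an ASIC); here all three stages are threads, and the function names, queue sizes, and stop-sentinel protocol are illustrative assumptions.

```python
# Hedged sketch: two virtual-port engines and one top-level engine running in
# parallel, coupled only through the intermediate queues. Names are illustrative.
import queue
import threading

STOP = object()   # sentinel telling the top-level engine that a producer is done

def virtual_port_engine(packet_queues, iq):
    """Round-robin over this engine's own packet queues, feeding the intermediate queue."""
    while any(packet_queues):
        for q in packet_queues:
            if q:
                iq.put(q.pop(0))
    iq.put(STOP)

def top_level_engine(iqs, tx):
    """Drain the intermediate queues into the transmit queue until both producers finish."""
    done = 0
    while done < len(iqs):
        for iq in iqs:
            try:
                pkt = iq.get(timeout=0.01)
            except queue.Empty:
                continue
            if pkt is STOP:
                done += 1
            else:
                tx.append(pkt)

q1, q2 = [f"a{i}" for i in range(3)], [f"b{i}" for i in range(3)]
q3, q4 = [f"c{i}" for i in range(3)], [f"d{i}" for i in range(3)]
iq1, iq2 = queue.Queue(maxsize=8), queue.Queue(maxsize=8)   # bounded intermediate queues
tx = []                                                     # stand-in for the transmit queue
threads = [
    threading.Thread(target=virtual_port_engine, args=([q1, q2], iq1)),
    threading.Thread(target=virtual_port_engine, args=([q3, q4], iq2)),
    threading.Thread(target=top_level_engine, args=([iq1, iq2], tx)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(tx))   # all twelve packets reach the transmit queue
```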
In step 814, PRPU 202 receives a packet. In step 814, PRPU 202 may also receive meta-data associated with the packet. In step 816, PRPU 202 determines the packet flow to which the packet belongs. As discussed above, PRPU 202 may determine the packet flow using data contained in the packet (e.g., a destination address) and/or the meta-data, which may identify one or more VLANs to which the packet is destined. In step 818, PRPU 202 places the received packet in the packet queue associated with the determined packet flow. For example, in step 818 PRPU 202 may use the determined packet flow (e.g., determined VLAN identifiers) to look up in database 204 the packet queue that is assigned to the determined packet flow. PRPU 202 may perform steps 814-818 continuously.
In step 820, the first scheduling engine selects a packet queue from the first set of packet queues. For example, in step 820 the first scheduling engine may traverse a scheduling tree to arrive at a leaf scheduling node of the tree and, thereby, select the packet queue associated with the leaf scheduling node. In step 822, the first scheduling engine moves a packet from the selected packet queue to a first intermediate packet queue (e.g., iq1). The first scheduling engine may perform steps 820-822 continuously.
In some embodiments, the first scheduling engine periodically monitors the state of the first intermediate packet queue (e.g., periodically determines the length of the packet queue), and, depending on the state of the packet queue, may cease performing steps 820-822 for a short period of time (i.e., the first scheduling engine may pause). For example, if the first scheduling engine determines that the length of the first intermediate packet queue is greater than a predetermined threshold, then the first scheduling engine, in response to that determination, may pause for some amount of time or temporarily selectively schedule only packets that are bound for other intermediate queues that are not full, thereby preventing the first intermediate packet queue from growing too large. This feature provides the advantages of: (i) bounding the amount of system resources (e.g., packet buffers) consumed by the intermediate queues, (ii) bounding the additional forwarding latency that could be incurred while a packet is waiting in an intermediate queue, and (iii) ensuring that rules associated with the virtual port scheduling engines ultimately determine scheduling behavior.
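A minimal, hedged sketch of this backpressure behavior follows: before each scheduling step the engine checks the intermediate queue's length and pauses briefly once it exceeds a threshold, keeping the queue bounded. The threshold, pause interval, and function names are illustrative values, not parameters taken from any embodiment.

```python
# Hedged sketch: pause the engine when the intermediate queue grows past a
# high-watermark, so the queue length stays bounded. Values are illustrative.
import queue
import time

IQ_HIGH_WATERMARK = 64        # assumed threshold on intermediate-queue length
PAUSE_SECONDS = 0.001         # assumed length of one pause

def schedule_with_backpressure(select_packet, iq, steps):
    for _ in range(steps):
        if iq.qsize() > IQ_HIGH_WATERMARK:
            time.sleep(PAUSE_SECONDS)        # pause instead of growing the queue further
            continue
        pkt = select_packet()                # e.g., the tree traversal sketched earlier
        if pkt is not None:
            iq.put(pkt)

# usage with a trivial packet source
packets = iter(range(10))
iq1 = queue.Queue()
schedule_with_backpressure(lambda: next(packets, None), iq1, steps=10)
print(iq1.qsize())   # 10
```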
In step 824, the second scheduling engine selects a packet queue from the second set of packet queues. For example, in step 824 the second scheduling engine may traverse a scheduling tree to arrive at a leaf scheduling node of the tree and, thereby, select the packet queue associated with the leaf scheduling node. In step 826, the second scheduling engine moves a packet from the selected packet queue to a second intermediate packet queue (e.g., iq2). The second scheduling engine may perform steps 824-826 continuously and independently of the first scheduling engine. Like the first scheduling engine, the second scheduling engine may periodically monitor the state of the second intermediate packet queue, and may be configured to pause depending on the state of the packet queue.
In step 828, the third scheduling engine selects a packet queue from a set of packet queues that includes the first and second intermediate packet queues and the third set of packet queues. In step 830, the third scheduling engine moves a packet from the selected packet queue to the transmit queue 214. The third scheduling engine may perform steps 828-830 continuously and independently of the first scheduling engine and the second scheduling engine. Like the first and second scheduling engines, the third scheduling engine may periodically monitor the state of transmit queue 214, and may be configured to pause depending on the state of the packet queue.
In the above manner, multiple, independent scheduling engines are employed to move packets to the transmit queue, thereby increasing the throughput of network interface 191.
As a specific example, assume that a new DSLAM 901 (see
After the scheduling nodes are defined, the packet flow represented by each leaf scheduling node needs to be associated with a unique packet queue. Configuration module 208 may perform this function by updating packet flow to packet queue database 204, adding to database 204, for each leaf scheduling node, information mapping the packet flow defined by the leaf scheduling node to a packet queue.
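For concreteness, below is a hedged sketch of that database update: each new leaf scheduling node's packet flow is mapped to a previously unused packet queue. The flow keys, queue names, and helper function are illustrative assumptions about how database 204 might be represented.

```python
# Hedged sketch: map each new leaf scheduling node's flow to an unused packet
# queue in the flow-to-queue database. Flow keys and queue names are illustrative.
flow_to_queue = {                      # existing (illustrative) contents of database 204
    ("vlan1.1", "voice"): "q1",
    ("vlan1.1", "data"):  "q2",
}
free_queues = ["q9", "q10", "q11"]     # packet queues not yet assigned to any flow

def add_leaf_flows(new_leaf_flows):
    """Assign each newly defined leaf scheduling node's flow to an unused packet queue."""
    for flow in new_leaf_flows:
        if flow not in flow_to_queue:
            flow_to_queue[flow] = free_queues.pop(0)

# e.g., flows for two newly defined leaf scheduling nodes (such as 1002 and 1003)
add_leaf_flows([("vlan2.1", "voice"), ("vlan2.1", "data")])
print(flow_to_queue)
```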
Additionally, after the scheduling nodes are defined, one or more of the scheduling trees that are currently being used by scheduling system 212 need to be modified to accommodate the leaf scheduling nodes 1002 and 1003 and/or a new scheduling tree needs to be created. This can be done manually by the network administrator or automatically by configuration module 208.
In embodiments where configuration module 208 automatically reconfigures the scheduling trees, configuration module 208 may be programmed to take into account scheduling engine bandwidth and the maximum bandwidths associated with scheduling nodes. For example, if we assume that (a) the maximum bandwidth of scheduling engine 522 is 15 Gbps, (b) the maximum bandwidth associated with scheduling node 302 is 10 Gbps, and (c) the maximum bandwidth associated with scheduling node 1001 is also 10 Gbps, then configuration module 208 would not add leaf scheduling nodes 1002 and 1003 to tree 402, as shown in
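The capacity check described in this paragraph might look like the hedged sketch below: before attaching a new sub-tree to an existing virtual port scheduling engine, the configuration module compares the engine's maximum bandwidth with the sum of the maximum rates already assigned to it plus the new sub-tree's maximum rate. The function name and the simple additive rule are illustrative assumptions.

```python
# Hedged sketch: decide whether an existing scheduling engine has enough
# headroom to host a new sub-tree. The additive rule is an illustrative assumption.
def can_host(engine_max_gbps, assigned_max_gbps, new_subtree_max_gbps):
    """True if the engine's bandwidth covers its current sub-trees plus the new one."""
    return sum(assigned_max_gbps) + new_subtree_max_gbps <= engine_max_gbps

# Scheduling engine 522 is capped at 15 Gbps and already hosts a 10 Gbps sub-tree
# (node 302); a new 10 Gbps sub-tree (node 1001) does not fit, so the configuration
# module would place it on another engine or on a newly created one.
print(can_host(15, [10], 10))   # False
```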
While various embodiments of the present invention have been described above, it should be understood that they have been presented by way of example only, and not limitation. Thus, the breadth and scope of the present invention should not be limited by any of the above-described exemplary embodiments. Moreover, any combination of the above-described elements in all possible variations thereof is encompassed by the invention unless otherwise indicated herein or otherwise clearly contradicted by context.
Additionally, while the processes described above and illustrated in the drawings are shown as a sequence of steps, this was done solely for the sake of illustration. Accordingly, it is contemplated that some steps may be added, some steps may be omitted, the order of the steps may be re-arranged, and some steps may be performed in parallel.