A multiport memory architecture, systems including the same and methods for using the same. The architecture generally includes (a) a memory array; (b) a plurality of ports configured to receive and/or transmit data; and (c) a plurality of port buffers, each of which is configured to transmit the data to and/or receive the data from one or more of the ports, and all of which are configured to (i) transmit the data to the memory array on a first common bus and (ii) receive the data from the memory array on a second common bus. The systems generally include those that embody one or more of the inventive concepts disclosed herein. The methods generally relate to writing blocks of data to, reading blocks of data from, and/or transferring blocks of data across a memory. The present invention advantageously reduces latency in data communications, particularly in network switches, by tightly coupling port buffers to the main memory and advantageously using point-to-point communications over long segments of the memory read and write paths, thereby reducing routing congestion and enabling the elimination of a FIFO. The invention advantageously shrinks chip size and provides increased data transmission rates and throughput, and in preferred embodiments, reduced resistance and/or capacitance in the memory read and write busses.
1. A method of writing data to a multiport memory, the method comprising:
converting serial data to n-bit-wide parallel data, n bits of data forming a word;
selecting one of multiple write lines of a port buffer;
buffering a k-word-long block of the n-bit-wide parallel data into the selected write line of the port buffer;
transmitting a block of the k*n bits of data to the multiport memory on a bus common to each of the multiple write lines of the port buffer; and
writing the k*n bits of data into the multiport memory.
10. A method of reading data from a multiport memory, the method comprising:
outputting k*n bits of data from the multiport memory onto a k*n-bit-wide bus, the bus being common to each of multiple read lines of a port buffer;
selecting one of the multiple read lines of the port buffer;
converting the k*n bits of data into n-bit-wide parallel data by buffering k words of the n-bit-wide parallel data into the selected read line of the port buffer; and
converting the n-bit-wide parallel data into serial data to be read externally from the multiport memory.
2. The method of
3. The method of
5. The method of
the serial data is converted to the n-bit-wide parallel data at a first frequency;
the k-word-long block of the n-bit-wide parallel data is buffered at a second frequency; and
the k*n bits of data are written into the multiport memory at a third frequency, the first frequency differing from the third frequency.
6. The method of
7. The method of
8. The method of
selecting one of multiple read lines of the port buffer; and
buffering the k*n bits of data as k words of n-bit-wide parallel data into the selected read line of the port buffer.
9. The method of
11. The method of
12. The method of
13. The method of
14. The method of
the n-bit-wide parallel data is converted into the serial data at a first frequency;
the k*n bits of data are converted into the n-bit-wide parallel data at a second frequency; and
the k*n bits of data are simultaneously outputted at a third frequency, the first frequency differing from the third frequency.
15. The method of
16. The method of
17. The method of
18. The method of
19. The method of
20. The method of
This application is a divisional of U.S. patent application Ser. No. 10/702,744, filed Nov. 5, 2003, incorporated herein by reference in its entirety and which claims the benefit of U.S. Provisional Application No. 60/454,443, filed Mar. 13, 2003, which is incorporated herein by reference in its entirety.
The present invention generally relates to the field of multiport memories. More specifically, embodiments of the present invention pertain to architectures, systems and methods for data communications in a network using a multiport memory.
Memories are used in networks to enable rapid data transfer from one or more data sources to any of a plurality of destinations, oftentimes in switching devices.
Ports 30-37 typically operate at network speeds; e.g., at or about 1 GHz. However, memory array 20 typically operates at a significantly slower speed; e.g., 100-200 MHz. Consequently, the architecture 10 requires FIFO buffers to temporarily store the data that is going into or coming out of memory array 20. However, FIFO buffers 40-47 are typically located close to ports 30-37, which limits the effective operational rate of FIFO buffers 40-47 and memory array 20 due to the loading requirements of busses 50 and 52 (e.g., the current and/or voltage needed to overcome or control the inherent capacitance[s], resistance[s] and/or impedance of busses 50 and 52). Thus, to improve throughput using the architecture of
There are physical limits to the maximum throughput of architecture 10, however. Memory can only go so fast in any given process technology, and increasing the width of the memory limits its speed due to internal loading of the memory's control signals. Increasing the external width of a memory causes increased die area and die cost. In the example of
A need therefore exists to increase the operational speed of multiport memories to keep up with ever-increasing demands for increased network speeds and high network switching flexibility.
Embodiments of the present invention relate to multiport memory architectures, systems and methods for using the same. The multiport memory architecture generally comprises (a) a memory array; (b) a plurality of ports configured to receive and/or transmit data; and (c) a plurality of port buffers, each of which is configured to transmit the data to and/or receive the data from one or more of the ports, and all of which are configured to (i) transmit the data to the memory array on a first common bus and (ii) receive the data from the memory array on a second common bus. The systems and network switches generally comprise those that include an architecture embodying one or more of the inventive concepts disclosed herein.
The method of writing generally comprises the steps of (1) converting serial data to n-bit-wide parallel data, n bits of data forming a word; (2) buffering a k-word-long block of the n-bit-wide parallel data; and (3) substantially simultaneously writing the k*n bits of data into the memory. The invention also relates to a method of reading data from a memory, comprising the steps of (1′) substantially simultaneously outputting k*n bits of data from the memory onto a k*n-bit-wide bus; (2′) converting the k*n bits of data into n-bit-wide parallel data; and (3′) converting the n-bit-wide parallel data into serial data to be read externally from the memory. The invention also concerns a method of transferring data in a network, comprising a combination of one or more steps from each of the present methods of writing to and reading from a memory.
The present invention advantageously reduces latency in data communications, particularly in packet network switches, by tightly coupling the port buffers to the main memory, thereby advantageously enabling (1) use of point-to-point communications over relatively long segments of the memory read and write paths and (2) the elimination of a FIFO memory in the memory read and write paths. Thus, the invention also provides generally reduced routing congestion and reduced die sizes, particularly when using standard cell-based design techniques. On-chip point-to-point communications from bond pad to port buffers and vice versa further reduces parasitics in the corresponding wires. By tightly coupling port buffers to the main memory array, the invention advantageously reduces RC components of the memory read and write busses, further increasing data transmission rates and throughput. In contrast, the routing of the architecture of
These and other advantages of the present invention will become readily apparent from the detailed description of preferred embodiments below.
Reference will now be made in detail to the preferred embodiments of the invention, examples of which are illustrated in the accompanying drawings. While the invention will be described in conjunction with the preferred embodiments, it will be understood that they are not intended to limit the invention to these embodiments. On the contrary, the invention is intended to cover alternatives, modifications and equivalents, which may be included within the spirit and scope of the invention as defined by the appended claims. Furthermore, in the following detailed description of the present invention, numerous specific details are set forth in order to provide a thorough understanding of the present invention. However, it will be readily apparent to one skilled in the art that the present invention may be practiced without these specific details. In other instances, well-known methods, procedures, components, and circuits have not been described in detail so as not to unnecessarily obscure aspects of the present invention.
Some portions of the detailed descriptions which follow are presented in terms of processes, procedures, logic blocks, functional blocks, processing, and other symbolic representations of operations on data bits, data streams or waveforms within a computer, processor, controller and/or memory. These descriptions and representations are generally used by those skilled in the data processing arts to effectively convey the substance of their work to others skilled in the art. A process, procedure, logic block, function, process, etc., is herein, and is generally, considered to be a self-consistent sequence of steps or instructions leading to a desired and/or expected result. The steps generally include physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical, magnetic, optical, or quantum signals capable of being stored, transferred, combined, compared, and otherwise manipulated in a computer or data processing system. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, waves, waveforms, streams, values, elements, symbols, characters, terms, numbers, or the like.
It should be borne in mind, however, that all of these and similar terms are associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise and/or as is apparent from the following discussions, it is appreciated that throughout the present application, discussions utilizing terms such as “processing,” “operating,” “computing,” “calculating,” “determining,” “manipulating,” “transforming,” “displaying” or the like, refer to the action and processes of a computer or data processing system, or similar processing device (e.g., an electrical, optical, or quantum computing or processing device), that manipulates and transforms data represented as physical (e.g., electronic) quantities. The terms refer to actions and processes of the processing devices that manipulate or transform physical quantities within the component(s) of a system or architecture (e.g., registers, memories, other such information storage, transmission or display devices, etc.) into other data similarly represented as physical quantities within other components of the same or a different system or architecture.
Furthermore, for the sake of convenience and simplicity, the terms “clock,” “time,” “rate,” “period” and “frequency” may be used somewhat interchangeably herein, but are generally given their art-recognized meanings. Also, for convenience and simplicity, the terms “data,” “data stream,” “signal,” “waveform” and “information” may be used interchangeably, as may the terms “connected to,” “coupled with,” “coupled to,” and “in communication with” (which may refer to a direct or indirect link or signal path), but these terms are also generally given their art-recognized meanings.
The present invention concerns a multiport memory architecture, and systems comprising and methods of using the same. The multiport memory architecture generally comprises (a) a memory array; (b) a plurality of ports configured to receive and/or transmit data; and (c) a plurality of port buffers, each of which is configured to transmit the data to and/or receive the data from one or more of the ports, and all of which are configured to (i) transmit the data to the memory array on a first common bus and (ii) receive the data from the memory array on a second common bus. A further aspect of the invention concerns a network switch, system, and network generally comprising the present architecture and/or embodying one or more of the inventive concepts described herein.
Even further aspects of the invention concern methods of reading from and/or writing to a memory. The method of writing generally comprises the steps of (1) converting serial data to n-bit-wide parallel data, n bits of data forming a word; (2) buffering a k-word-long block of the n-bit-wide parallel data; and (3) substantially simultaneously writing the k*n bits of data into the memory. The invention also relates to a method of reading data from a memory, comprising the steps of (1′) substantially simultaneously outputting k*n bits of data from the memory onto a k*n-bit-wide bus; (2′) converting the k*n bits of data into n-bit-wide parallel data; and (3′) converting the n-bit-wide parallel data into serial data to be read externally from the memory. The invention also concerns a method of transferring data in a network, comprising a combination of one or more steps from each of the present methods of writing to and reading from a memory.
The invention, in its various aspects, will be explained in greater detail below with regard to exemplary embodiments.
An Exemplary Memory Architecture
In one aspect, the present invention relates to a multiport memory architecture that generally comprises (a) a memory array; (b) a plurality of ports configured to receive and/or transmit data; and (c) a plurality of port buffers, each of which is configured to transmit data to and/or receive data from one or more of the ports, and all of which are configured to (i) transmit a block of the data to the memory array on a first common bus and (ii) receive a block of the data from the memory array on a second common bus.
In the present architecture, the memory array is conventional, and may comprise a plurality of memory sub-arrays. These sub-arrays may comprise one or more rows, columns, blocks or pages of memory, pages being a preferred implementation (a so-called “multiport page mode memory,” or MPPM). Each of the memory rows, columns, blocks and/or pages may be identifiable and/or accessible by a unique memory address corresponding to the row, column, block and/or page. In a preferred implementation, each of the blocks of data transferred between memory array 110 and a port buffer 120-127 comprises a page of data. Typically, the minimum density of the memory array 110 is 256 kb or 1 Mb. While the maximum density of the memory array 110 is not limited, as a practical matter, a typical maximum density is about 32 Mb or 128 Mb.
The nature of the memory elements in memory array 110 is also not particularly limited, and may include latches, static random access memory (SRAM), dynamic random access memory (DRAM), magnetic random access memory (MRAM), electrically erasable and programmable read only memory (EEPROM) and flash memory, although for simplicity, speed and low power considerations, latches are preferred. The memory array 110 may also be synchronous or asynchronous, but for speed and timing considerations, synchronous memory is preferred.
In the present architecture, the port buffers 120-127 may be considered “tightly coupled” to the memory array 110. In essence, “tightly coupled” means that the port buffers 120-127 are in closer proximity to the memory array 110 than they are to ports 130-145, and that the memory busses 150a, 150b, 155a and 155b are designed to reduce or minimize RC components, such as bus length (corresponding to resistance) and/or parasitic capacitance between adjacent metal lines in the bus. While the port buffers 120-127 are shown on different sides of memory array 110, and the ports 130-144 are shown on different sides of port buffers 120-127, and the port buffers 120-127 can be located on one side of array 110 (see, e.g.,
In the present multiport memory architecture, the number of port buffers may be any integer of 2 or more, 3 or more, or 4 or more. In certain implementations, there may be (2^x − d) port buffers in the architecture, x being an integer of at least 3, and in various embodiments, of from 4 to 8 (e.g., 5 or 6), and d is 0 or an integer of (2^(x−1) − 1) or less. The value of d may be determined by the number of parallel registers that accompany the port buffers (e.g., that have a port buffer address), but which provide a different function, such as “snoop” register 140 and/or parallel read and write registers 141-142. Independently, the number of corresponding ports is generally 2 or more, 3 or more, or 4 or more, and in certain implementations, may be (2^x − d), where x and d are as described above. In one implementation, there are 10 ports. Preferably, the ports and port buffers are in a 1:1 relationship, although it is not necessarily the case that each port communicates with only a single port buffer (or vice versa; a so-called “dedicated” port or port buffer).
Referring now to
In preferred implementations, the read portion 250i and the write portion 240i each independently comprises a*(2^y + b) entries, where a is the number of lines or rows of entries (e.g., write lines 242 and/or 244), 2^y is the number of entries in a line or row, y is an integer of at least 3, and b is 0 or an integer of (2^y − 1) or less. In some embodiments, b is 0 and y is an integer of from 4 to 8, and in specific embodiments, y is 5 or 6.
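As a small worked illustration of the entry count, the following Python sketch evaluates a*(2^y + b) and enforces the stated bounds on y and b. The function name entries_per_portion is hypothetical and is used here only for illustration.

```python
def entries_per_portion(a: int, y: int, b: int = 0) -> int:
    """Return a * (2**y + b), the entry count of a read or write portion.

    a: number of lines (rows) of entries
    y: exponent, an integer of at least 3
    b: 0 or an integer of (2**y - 1) or less
    """
    if a < 1:
        raise ValueError("a must be at least 1")
    if y < 3:
        raise ValueError("y must be at least 3")
    if not 0 <= b <= 2**y - 1:
        raise ValueError("b must be between 0 and 2**y - 1")
    return a * (2**y + b)


# Double-buffered portion (a = 2) with 32 entries per line (y = 5, b = 0).
print(entries_per_portion(a=2, y=5))  # 64
```

With a = 2, y = 5 and b = 0, each portion holds two lines of 32 entries.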
Referring back to
Again referring back to
The port buffers in the present architecture may be single buffered (see, e.g.,
The present architecture enables processing and/or transfers of data at a variety of rates and/or across time domains. For example, the memory array may operate at a first frequency, and each of the ports may operate independently at a second frequency greater or less than the first frequency. For example, and referring back to
Continuing to refer to
Read-only “snoop” register 142 (
An Exemplary Packet Network Switch, System, and Network
In a further aspect of the invention, the network switch, system, and network generally comprise those that include an architecture embodying one or more of the inventive concepts disclosed herein. For example, the network switch may simply comprise the present multiport memory architecture. In preferred embodiments, the network switch is embodied on a single integrated circuit.
As discussed above, one advantage of the present invention is that a FIFO buffer to buffer data between a port and main memory is not necessary, thereby reducing the area of an IC dedicated to FIFO-main memory routing and (ideally) increasing data transmission speeds through the IC. Therefore, the present network switch may comprise a plurality of port buffers that each (i) transmit the data to a corresponding port along a first data path and (ii) receive the data from the corresponding port along a second data path, wherein none of these data paths includes a first-in-first-out (FIFO) memory.
In further embodiments, the system may include a port that is configured to convert serial data from the network to parallel data for processing in the network switch, and/or convert parallel data from the network switch to serial data for the network. In most implementations, the system port will be the memory architecture port described above, but in some implementations, the system port can be a separate port configured to transmit data externally to an integrated circuit (IC) that includes the memory architecture and a transmitter. Thus, the system may further include (i) at least one port (and preferably a plurality of ports) comprising a transmitter configured to transmit serial data to an external receiver; and (ii) at least one port (and preferably a plurality of ports) comprising a receiver configured to receive externally-generated serial data (e.g., serial data from an external transmitter).
The invention further relates to a network, comprising at least one of the present systems, and a plurality of storage or data communications devices, each of the devices being communicatively coupled to the system. In further embodiments, the network may comprise (a) a plurality of the present systems, which may be communicatively coupled to each other and/or cascaded with each other; and (b) a plurality of storage or communications devices, wherein each storage or communications device is communicatively coupled to at least one of the systems. In one implementation, each of the devices is communicatively coupled to a unique system. The network may be any kind of known network, such as a packet switching network.
Exemplary Methods
The present invention further relates to a method of writing data to a memory, comprising the steps of (a) converting serial data to n-bit-wide parallel data, n bits of data forming a word; (b) buffering a k-word-long block of the n-bit-wide parallel data; and (c) substantially simultaneously writing the k*n bits of data into the memory. The invention also relates to a method of reading data from a memory, comprising the steps of (1) substantially simultaneously outputting k*n bits of data from the memory onto a k*n-bit-wide bus; (2) converting the k*n bits of data into n-bit-wide parallel data; and (3) converting the n-bit-wide parallel data into serial data to be read externally from the memory. The invention also concerns a method of transferring data in a network, comprising combinations of steps in the methods of writing and reading.
In one embodiment of the method of writing, buffering may comprise sequentially writing k words of the n-bit-wide parallel data into k data storage elements. In further embodiments of the method(s) of reading and/or writing, the step of converting serial data to n-bit-wide parallel data may be conducted at a first frequency, the buffering step at a second frequency, and the step of substantially simultaneously writing the k*n bits of data at a third frequency, the first frequency being the same as or different from both the second and the third frequencies. As discussed above, the first frequency may be greater or less than the second and third frequencies. However, the third frequency is generally substantially the same as or higher than the second frequency.
The method of writing data may further comprise the step(s) of (i) identifying one of a plurality of buffer addresses for buffering the k-word-long block of the n-bit-wide parallel data, (ii) identifying one of a plurality of memory addresses for substantially simultaneously writing all k*n bits of data into the memory, and/or (iii) receiving the serial data.
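The three steps of the method of writing can be sketched in software as follows. This is a minimal Python illustration only, not a description of the hardware: the function names, the dictionary used to stand in for the memory, and the most-significant-bit-first bit order are all assumptions made for the example.

```python
from typing import Dict, List, Tuple


def serial_to_words(bits: List[int], n: int) -> List[int]:
    """Step (a): group a serial bit stream into n-bit words.

    Assumption: the first bit received is the most significant bit.
    """
    if len(bits) % n != 0:
        raise ValueError("bit count must be a multiple of n")
    words = []
    for i in range(0, len(bits), n):
        word = 0
        for bit in bits[i:i + n]:
            word = (word << 1) | (bit & 1)
        words.append(word)
    return words


def buffer_block(words: List[int], k: int) -> List[int]:
    """Step (b): collect exactly k words into one buffer line."""
    if len(words) != k:
        raise ValueError("a block must hold exactly k words")
    return list(words)


def write_block(memory: Dict[int, Tuple[int, ...]], address: int,
                line: List[int]) -> None:
    """Step (c): store all k*n bits at one memory address in one operation."""
    memory[address] = tuple(line)


n, k = 8, 4
bits = [int(c) for c in "00000001" "00000010" "00000011" "11111111"]
memory: Dict[int, Tuple[int, ...]] = {}
write_block(memory, address=0, line=buffer_block(serial_to_words(bits, n), k))
print(memory[0])  # (1, 2, 3, 255)
```

The buffer address and memory address of steps (i) and (ii) correspond here to the choice of buffer line and the address argument, respectively.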
The invention further encompasses a method of transferring data in a network, comprising: the present method of writing data to a memory, and substantially simultaneously reading the k*n bits of data from the memory. As one might expect, in a preferred implementation, the step of substantially simultaneously reading the k*n bits of data comprises buffering the k*n bits of data as k words of n-bit-wide data, and may further comprise converting the n-bit-wide data into serial data to be read externally from the memory.
The method of reading data from a memory generally comprises the steps of (1) substantially simultaneously outputting k*n bits of data from the memory onto a k*n-bit-wide bus; (2) converting the k*n bits of data into n-bit-wide parallel data; and (3) converting the n-bit-wide parallel data into serial data to be read externally from the memory. In preferred embodiments, the step of converting the k*n bits of data into n-bit-wide parallel data comprises buffering k words of n-bit-wide data, and the buffering step may comprise storing the k words of n-bit-wide data in k registers, each register having n data storage elements (where k and n are as described above). In other words, in the method of reading, converting k*n bits of data into n-bit-wide parallel data comprises buffering the data as k words of n-bit-wide data. In a preferred implementation, the step of converting the k*n bits of data into n-bit-wide parallel data further comprises sequentially shifting the k words of n-bit-wide data onto an n-bit-wide bus. As described above, the step of converting n-bit-wide parallel data into serial data may be conducted at a first frequency, the step of converting the k*n bits of data into n-bit-wide parallel data may be conducted at a second frequency, and the step of substantially simultaneously outputting the k*n bits of data may be conducted at a third frequency, the first, second and third frequencies being as described above.
The method of reading data from a memory may further comprise (a) identifying one of a plurality of buffer addresses for buffering the k words of the n-bit-wide data, and/or (b) identifying one of a plurality of memory addresses for simultaneously outputting the k*n bits of data from the memory.
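The method of reading can be sketched in the same spirit. The Python below is an illustrative model under stated assumptions: the function names are hypothetical, a dictionary stands in for the memory, and bits are serialized most significant bit first.

```python
from typing import Dict, List, Tuple


def read_block(memory: Dict[int, Tuple[int, ...]],
               address: int) -> Tuple[int, ...]:
    """Step (1): output all k words (k*n bits) of one page at once."""
    return memory[address]


def block_to_words(block: Tuple[int, ...]) -> List[int]:
    """Step (2): hold the block in k registers and shift the words out in order."""
    registers = list(block)  # k registers, one n-bit word each
    return [registers[i] for i in range(len(registers))]


def words_to_serial(words: List[int], n: int) -> List[int]:
    """Step (3): serialize each n-bit word (assumption: MSB first)."""
    bits = []
    for word in words:
        for shift in range(n - 1, -1, -1):
            bits.append((word >> shift) & 1)
    return bits


n = 8
memory = {0: (1, 2, 3, 255)}
serial = words_to_serial(block_to_words(read_block(memory, 0)), n)
print(len(serial))  # 32
```

Here k = 4 and n = 8, so one page read yields k*n = 32 serial bits.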
An Exemplary Implementation
Referring now to
This memory contains two major functional units: port pages 220a-k and memory block 210. Memory access from a port goes through a port page 220i (the designation “i” refers to any one of a plurality of substantially structurally and/or functionally identical elements), which serves as a bridge between the internal memory block interface (e.g., buffers 230) and the port interface, reconciling the difference between the memory block bandwidth and the bandwidth of an individual port while allowing efficient use of the memory block bandwidth. Since the internal memory block data interface 230 is relatively wide, and the port data interface is relatively narrow, the port pages act as temporary storage as well as parallel-to-serial and serial-to-parallel converters.
With the double buffering of port pages for both read and write accesses, the multi-port memory 200 can be used such that sustained concurrent non-blocking accesses between memory 210 and all ports can be maintained indefinitely. For port write accesses, the corresponding page entries are filled sequentially with write data through a dedicated 8-bit port write data bus. Subsequently, at the cue of a memory write signal, the entire contents of a page 220i are written into a selected page in the memory 210.
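A simplified behavioral model of a double-buffered write port page is sketched below in Python. It follows the signal names of Table 1 (active-low NWSE0/NWSE1 line selects, WEPR pointer reset, 8-bit write data), but the class itself, the dictionary standing in for memory 210, and the choice that no data is stored on the cycle in which WEPR is asserted are assumptions made for illustration; the actual cycle-level timing is not specified here.

```python
from typing import Dict, List, Optional

ENTRIES = 32  # entries per line in the 32-entry implementation


class WritePortPage:
    """Illustrative model of a write port page with two lines."""

    def __init__(self) -> None:
        self.lines: List[List[int]] = [[0] * ENTRIES for _ in range(2)]
        self.pointer: List[Optional[int]] = [None, None]  # None = null pointer

    def clock(self, nwse0: int, nwse1: int, wepr: int, wd: int) -> None:
        """Model one rising edge of the port write clock."""
        for line, nwse in enumerate((nwse0, nwse1)):
            if nwse != 0:
                continue  # line not selected (select is active low)
            if wepr:
                self.pointer[line] = 0  # reset the entry select pointer
                continue  # assumption: nothing is stored on the reset cycle
            ptr = self.pointer[line]
            if ptr is None:
                continue  # null pointer: the write is ignored
            self.lines[line][ptr] = wd & 0xFF
            self.pointer[line] = ptr + 1 if ptr + 1 < ENTRIES else None

    def write_to_memory(self, memory: Dict[int, bytes], line: int,
                        page_address: int) -> None:
        """Copy one whole line into the selected memory page."""
        memory[page_address] = bytes(self.lines[line])


page = WritePortPage()
memory: Dict[int, bytes] = {}
page.clock(nwse0=0, nwse1=1, wepr=1, wd=0)       # reset line 0 pointer
for value in range(ENTRIES):
    page.clock(nwse0=0, nwse1=1, wepr=0, wd=value)
page.clock(nwse0=0, nwse1=1, wepr=0, wd=0xEE)    # ignored: pointer is null
page.write_to_memory(memory, line=0, page_address=5)
print(memory[5][:4])  # b'\x00\x01\x02\x03'
```

While line 0 is being transferred to memory, line 1 can be filled in the same way, which is what allows sustained write accesses.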
Through the memory control interface and the page control interface (not shown), the user can control when the page contents are written to the memory 210. Referring now to
Port read accesses are performed by first loading the contents from the desired page in memory 210 (up to 32 bytes) into the read buffer portion 250i of port page 220i. Next, the contents of the port page 220i are clocked out sequentially through the dedicated 8-bit port read bus RD[7:0]. By selecting a line using appropriate states of control signals NRSEi and multiplexer 256, the second read page line 226 is available for the next page of data from memory as soon as it is available, while the port is sending data from the first line 228. As soon as data is exhausted from the first line 228, data can be sent from the second line 226, and the first line 228 is available for the next page of data from memory 210.
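The alternation between the two read lines can be illustrated with a short Python sketch. The function name stream_pages and the dictionary standing in for memory 210 are hypothetical; in hardware the load of the idle line and the draining of the active line proceed concurrently, whereas this sequential model simply performs one after the other.

```python
from typing import Dict, List


def stream_pages(memory: Dict[int, bytes], page_addresses: List[int]) -> bytes:
    """Stream pages through two read lines, alternating between them."""
    lines: List[bytes] = [b"", b""]
    out = bytearray()
    if not page_addresses:
        return bytes(out)
    lines[0] = memory[page_addresses[0]]  # load the first line before sending
    for i in range(len(page_addresses)):
        active = i % 2       # line currently being clocked out to the port
        idle = 1 - active    # line free to receive the next page
        if i + 1 < len(page_addresses):
            lines[idle] = memory[page_addresses[i + 1]]  # next page from memory
        out.extend(lines[active])  # one byte per port read clock cycle
    return bytes(out)


memory = {0: bytes([1] * 32), 1: bytes([2] * 32), 2: bytes([3] * 32)}
data = stream_pages(memory, [0, 1, 2])
print(len(data))  # 96
```

Because the idle line is always refilled before the active line is exhausted, the port sees an uninterrupted byte stream across page boundaries.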
The memory block 210 is accessed through memory control signals, a dedicated read bus 212 and a dedicated write bus 214 to the port pages. The width of the data busses is the number of entries 242a-o, 244a-o, 252a-o or 254a-o in a page multiplied by 8. The memory read and write busses 212 and 214 are coupled to the port read and write pages 250i and 240i, respectively. A source address and a destination address must accompany each memory request. For a write access, the source address is the port page 220i address, and the destination address is the page address in memory 210. For the read access, the source address is the page address in memory 210, and the destination address is the port page 220i address. The user controls the scheduling of the write and read operations to the port pages 220i and memory block 210 according to the temporal validity of the data in the port pages 220i and the memory block 210.
In most cases, operating in the sustained concurrent non-blocking mode will require that the number of entries 242i, 244i, 252i and 254i per page 220i be greater than the number of ports divided by two, and that the memory bandwidth be greater than the required aggregate bandwidth of the port pages 220a-220k.
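The two conditions above can be expressed as a simple check. The Python sketch below is illustrative only: the function name is hypothetical, and the example figures (sixteen 8-bit ports and a 256-bit memory bus, all clocked at 200 MHz) are assumed values chosen to be consistent with the bus widths and frequencies mentioned in this description.

```python
from typing import Sequence


def supports_nonblocking(entries_per_page: int,
                         port_bandwidths_mbps: Sequence[int],
                         memory_bandwidth_mbps: int) -> bool:
    """Check the two conditions for sustained concurrent non-blocking mode."""
    num_ports = len(port_bandwidths_mbps)
    # Condition 1: entries per page exceed half the number of ports.
    enough_entries = entries_per_page > num_ports / 2
    # Condition 2: memory bandwidth exceeds the aggregate port bandwidth.
    enough_bandwidth = memory_bandwidth_mbps > sum(port_bandwidths_mbps)
    return enough_entries and enough_bandwidth


# Assumed example: 8 bits * 200 MHz = 1600 Mb/s per port,
# 32 entries * 8 bits * 200 MHz = 51200 Mb/s for the memory bus.
ports = [1600] * 16
print(supports_nonblocking(32, ports, 51200))  # True
print(supports_nonblocking(8, ports, 51200))   # False
```

In the second call the page has only 8 entries, which is not greater than half of 16 ports, so the first condition fails.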
The port count, memory capacity and memory bandwidth can be increased by using multiple blocks of the multi-port memory system described above. By cascading two multi-port page mode (MPPM) memory architectures 200 by techniques known in the art, sustained concurrent access of up to 2*2^z (and in one specific implementation, 8192) pages containing up to 2^y (and in one specific implementation, 32) bytes of data per line can be attained by up to 2*2^x (and in one specific implementation, 32) read and/or write (R/W) ports. Up to m MPPM memories 200 may be cascaded, enabling sustained concurrent access of up to m*2^z (where z is, e.g., from 8 to 15) pages containing 2^y (where y is, e.g., from 3 to 8) bytes of data per line by up to m*2^x (where x is, e.g., from 2 to 7) R/W ports. The exact number of ports depends on the desired aggregate port bandwidth and the memory operating frequency.
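The cascading arithmetic can be worked through in a few lines of Python, taking the page, byte and port counts as powers of two (2^z, 2^y and 2^x). The function name cascade_capacity and the returned field names are hypothetical; the exponent values in the example are those that reproduce the specific figures given above for two cascaded memories.

```python
from typing import Dict


def cascade_capacity(m: int, x: int, y: int, z: int) -> Dict[str, int]:
    """Upper bounds for m cascaded memories.

    Each memory has up to 2**z pages of 2**y bytes and up to 2**x R/W ports.
    """
    pages = m * 2**z
    bytes_per_line = 2**y
    return {
        "pages": pages,
        "bytes_per_line": bytes_per_line,
        "ports": m * 2**x,
        "total_bits": pages * bytes_per_line * 8,
    }


# Two cascaded memories with z = 12, y = 5, x = 4.
print(cascade_capacity(m=2, x=4, y=5, z=12))
# {'pages': 8192, 'bytes_per_line': 32, 'ports': 32, 'total_bits': 2097152}
```

A single memory with the same exponents holds 4096 pages of 32 bytes, or 1,048,576 bits (1 Mb).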
Applications of multi-port page mode memory 200 include those that can use a high port count, high bandwidth switch fabric. Features of memory 200 include support for any number of ports (e.g., in one implementation, 10, and in another, 16), dedicated read and write page blocks for each port, dedicated double buffered read port pages, dedicated double buffered write port pages, any number of entries (e.g., up to 2^y, and in one implementation, 32) of any number of bits (e.g., up to (2^p + c), and in one implementation, 8) each per page line, any number of pages or memory blocks (e.g., up to 2^z, and in one implementation, 4096), port page operational frequencies up to 200 MHz (or faster depending upon the technology used), memory block operational frequencies up to 200 MHz (or faster), a 2-cycle memory read latency, a 2-cycle memory write latency, simple interfaces, a write snoop register 260, a parallel read port register 270, and a parallel write port register 280. Hardware descriptions of the memory 200 exist or can be provided without undue experimentation in 0.13 or 0.15 μm CMOS technology. Approximate dimensions of a 1 Mb 9-port, double buffer configuration are about 1880 μm×2870 μm; approximate dimensions of a 2 Mb, 26-port, single buffer configuration are about 3800 μm×3120 μm (both estimated for 0.15 μm technology). Predicted power dissipation at 200 MHz (page clock and memory clock frequencies) is less than 1 W.
The following name and usage conventions are used in
TABLE 1
Port interface signal descriptions.
Width | Signal Name | Type | Description
1 | WPCK | Input | Port write clock. A dedicated clock should accompany each write port to synchronize the loading of write data into the write page entries. A common clock can be used for all ports if timing permits and power is not a significant concern.
1 | NWSE0 | Input | Write Line Select signal. When low, line 0 of write double buffer is activated. When writing to entire line, NWSE0 must be held low for 32 WPCK cycles. If both NWSE0 and NWSE1 are asserted, the same data is written to both lines.
1 | NWSE1 | Input | Write Line Select signal. When low, line 1 of write double buffer is activated. When writing to entire line, NWSE1 must be held low for 32 WPCK cycles. If both NWSE0 and NWSE1 are asserted, the same data is written to both lines.
1 | WEPR | Input | Write entry select pointer reset signal. This signal is used in conjunction with NWSE and is synchronized to WPCK. Assertion of WEPR relative to the rising edge of WPCK sets the selected write entry select pointer to entry 0. If both NWSE0 and NWSE1 are asserted, the write entry select pointer for both write lines is reset to entry 0. After de-assertion of WEPR, each subsequent cycle of WPCK advances the selected write entry select pointer. After the entry select pointer advances to the last entry, all subsequent WPCK cycles will produce a null pointer. The selected write pointer will point to entry 0 upon the next assertion of WEPR across the rising edge of WPCK.
8 | WD[7:0] | Input | Port write 8-bit data bus.
1 | RPCK | Input | Port read clock. This clock strobes data onto the port read data bus from the read entry buffers. A dedicated clock may accompany each read port to synchronize the reading of data from the read page entries.
1 | NRSE0 | Input | Read line 0 select signal. When low, line 0 of the read double buffer is activated. To shift out contents of the 32 entries, NRSE0 is asserted for 32 RPCK cycles.
1 | NRSE1 | Input | Read line 1 select signal. When low, line 1 of the read double buffer is activated. To shift out contents of the 32 entries, NRSE1 is asserted for 32 RPCK cycles.
8 | RD[7:0] | Output | Port read 8-bit data bus.
1 | PWCK | Input | Parallel write port clock.
1 | LPWR | Input | Load Parallel Write Register. Synchronous to PWCK.
N*8 | PRD[N*8-1:0] | Output | Read bus for the Parallel Read Port. Synchronous to MCK.
N*8 | PWD[N*8-1:0] | Input | Write bus for the Parallel Write Port. Synchronous to PWCK.
N*8 | SBUS[N*8-1:0] | Output | Read bus for the Snoop Register. Synchronous to MCK.
1 | SLD | Input | Snoop Register load signal. Synchronous to MCK.
1
NRST
Input
Port logic reset signal.
1
NWR
Input
Memory write signal. Active low. Synchronous to MCK
clock. When asserted, the memory performs a write-
operation using source (PA) and destination (MA)
addresses. The contents of the specified port page are
written into the specified memory block page.
1
NRD
Input
Memory read signal. Active low. Synchronous to MCK.
When asserted, the memory performs a read operation using
source (MA) and destination (PA) addresses. The contents
are read into the specified port page.
5
PA[4:0]
Input
Port Address. Maximum of 30 ports (for a 5-bit address).
This is the source address for a write operation to main
memory from a port page or the destination address for a
read operation from main memory to a port page.
1
PL
Input
Specifies from which line of the double buffered page to
access. “0” specifies Line 0. “1” specifies Line 1. Not used
for single buffer configuration.
12
MA[11:0]
Input
Memory page address for read or write operations.
Maximum of 4096 pages. This is the destination address for
a port page write to memory operation and the source
address for memory page to port page read.
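The write entry select pointer behavior described for WEPR, NWSE0 and NWSE1 can be illustrated with a small behavioral model. This is a minimal Python sketch, not RTL and not the disclosed circuit. The class and method names are hypothetical, and it assumes that a WPCK edge with WEPR asserted only resets the pointer, that each later edge stores WD[7:0] into the pointed entry and then advances the pointer, and that the pointer is null until the first reset.

```python
class WritePort:
    """Behavioral sketch of one double-buffered write port (hypothetical names)."""

    ENTRIES = 32  # entries per line in the example implementation

    def __init__(self):
        self.lines = [[0] * self.ENTRIES, [0] * self.ENTRIES]
        # One entry pointer per line; None models the "null pointer".
        self.pointer = [None, None]

    def clock(self, nwse0, nwse1, wepr, wd):
        """Model one rising edge of WPCK. NWSE0 and NWSE1 are active low."""
        for i, nwse in enumerate((nwse0, nwse1)):
            if nwse != 0:
                continue  # this line is not selected
            if wepr:
                # Assumption: the reset edge only sets the pointer to entry 0.
                self.pointer[i] = 0
            elif self.pointer[i] is not None:
                self.lines[i][self.pointer[i]] = wd & 0xFF
                nxt = self.pointer[i] + 1
                # Past the last entry the pointer becomes null.
                self.pointer[i] = nxt if nxt < self.ENTRIES else None


port = WritePort()
port.clock(nwse0=0, nwse1=1, wepr=1, wd=0)      # reset line 0 pointer
for byte in range(WritePort.ENTRIES):
    port.clock(nwse0=0, nwse1=1, wepr=0, wd=byte)
print(port.pointer[0])    # None
print(port.lines[0][:4])  # [0, 1, 2, 3]
```

In this model, further WPCK edges after the 32nd entry leave the line unchanged until WEPR is asserted again, and selecting both lines at once stores the same data in both.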
Descriptions of the memory interface signals shown in
TABLE 2
Memory interface signal descriptions.
Width | Signal Name | Type | Description |
1 | MCK | Input | MCK is the clock for the memory block. Can be asynchronous to PCK. All memory block operations are synchronized to MCK. |
1 | FDINH | Input | Redundancy information from fuse block for memory sub-block H, loaded through this port after system reset. |
1 | FDINL | Input | Redundancy information from fuse block for memory sub-block L, loaded through this port after system reset. |
1 | FSCKH | Input | Clock from fuse block to latch data in from FDINH. |
1 | FSCKL | Input | Clock from fuse block to latch data in from FDINL. |
2 (or more) | WTC | Input | Code for setting internal write timing margin. May be input into programmable register. |
3 (or more) | RTC | Input | Code for setting internal read timing margin. May be input into programmable register. |
Functional Description
Referring to
Referring back to
While
Referring now to
As shown in part in
Memory Interface
Referring back to
Writing to Memory
Loading of all entries in a write page must be tracked. This may be done automatically using conventional logic (e.g., an n-bit counter that is reset in response to an appropriate transition or combination of WEPR and/or NWSE). Once all entries in a port page are loaded, the entire contents of this page are written into memory 210 by asserting NWR, de-asserting NRD and specifying the appropriate source and destination addresses. Referring to
Referring back to
On MCK edge 303, data from port 240k, line 1, is written to memory 210, page address Z. As with port clock WPCK[j], the rising edge of port clock WPCK[k] that writes data into the last entry in write page 224k must occur at least a period of time TLEMW before MCK edge 303. Data from port 240j, line 0, is latched in snoop register 260 on MCK edge 303.
On MCK edge 305, data from port 240q, line 0, is written to memory 210, page address X. Thus, the present architecture allows for data in memory 210 to be overwritten as soon as two clock cycles after a first block or page of data is written. Data from port 240k, line 1, is latched in snoop register 260 on MCK edge 305.
The MPPM block 200 may also include a page 280 with N*8 parallel inputs, PWD[N*8-1:0], and N*8 parallel outputs to the memory write bus 214. When present, the parallel write port page register 280 may have a port address of 31. The contents of this register may be written to memory 210 using the memory write command with the parallel write port 280 as the source address and with a page in the memory 210 as the destination address.
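The memory write command, including use of the parallel write port register as a source, can be sketched as a simple page copy. The Python below is a hedged illustration rather than the disclosed implementation. The function name, the dictionary-based storage and the example port number are hypothetical, and the 2-cycle write latency and the snoop register are deliberately not modeled.

```python
PAGE_ENTRIES = 32        # entries per page in the example implementation
NUM_PAGES = 4096         # addressable with MA[11:0]
PARALLEL_WRITE_PA = 31   # port address of parallel write port register 280


def memory_write(memory, port_pages, parallel_write_reg, nwr, nrd, pa, pl, ma):
    """Apply one MCK-synchronous command; return True if a write occurred."""
    if nwr != 0 or nrd != 1:
        return False  # NWR is active low and NRD must be de-asserted
    if not 0 <= ma < NUM_PAGES:
        raise ValueError("MA must fit in 12 bits")
    # PA selects the source: the parallel write register or a port page line.
    if pa == PARALLEL_WRITE_PA:
        source = parallel_write_reg
    else:
        source = port_pages[pa][pl]
    if len(source) != PAGE_ENTRIES:
        raise ValueError("a full page must be loaded before it is written")
    memory[ma] = list(source)  # the whole page moves in one operation
    return True


memory = {}
port_pages = {3: [list(range(32)), [0] * 32]}  # hypothetical port 3, lines 0 and 1
pwr = [0xFF] * 32                              # parallel write register contents
memory_write(memory, port_pages, pwr, nwr=0, nrd=1, pa=3, pl=0, ma=0x010)
memory_write(memory, port_pages, pwr, nwr=0, nrd=1, pa=31, pl=0, ma=0x011)
print(memory[0x010][:4])  # [0, 1, 2, 3]
print(memory[0x011][:2])  # [255, 255]
```

Copying the page with `list(source)` reflects that the port page can be reloaded after the write without disturbing the stored memory page.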
Reading from Memory
Referring now to
Referring back to
Referring now to
At MCK edge 335, data is read from memory 210, page X, into port buffer 220q, line 0 in accordance with the memory read operations described above, since the commands and signals 336 on the address/command interface waveform have the values MA[X], PA[q], NRD=0, NWR=1 and RPL=0. Data 337 from MA[Z] is read onto parallel read port bus PRD two MCK cycles plus a period of time TPRDO after the corresponding read command edge 333. The commands and signals 338 on the address/command interface waveform have the values MA[Y], PA[30], NRD=0 and NWR=1. Therefore, at MCK edge 339, data is read from memory 210, page Y, into parallel read port 270. This data 338 is read onto parallel read port bus PRD two MCK cycles plus a period of time TPRDO after the corresponding read command edge 339.
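The parallel read port timing stated above (two MCK cycles plus TPRDO after the read command edge) reduces to simple arithmetic. The following Python sketch is illustrative only. The function name is hypothetical, and the default TPRDO of 1.0 ns is a placeholder, since no value for TPRDO is given here.

```python
def prd_valid_time_ns(command_edge_ns, mck_hz=200e6, t_prdo_ns=1.0):
    """Time at which read data appears on PRD, in nanoseconds.

    command_edge_ns: time of the MCK edge carrying the read command.
    mck_hz:          memory clock frequency (200 MHz in the example).
    t_prdo_ns:       output delay TPRDO; the default is a placeholder only.
    """
    period_ns = 1e9 / mck_hz
    return command_edge_ns + 2 * period_ns + t_prdo_ns


# Read command issued at t = 0 ns with a 200 MHz MCK (5 ns period).
print(prd_valid_time_ns(0.0))  # 11.0
```

With a 200 MHz memory clock the two-cycle latency contributes 10 ns, so the result is 10 ns plus whatever TPRDO is for a given process and layout.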
Thus, the present invention provides a multiport memory architecture, and a system and method for operating on data in such a memory and/or in a network or network device including such a memory. The present invention advantageously reduces die size in data communications devices, particularly in very high speed network switches, by tightly coupling port buffers to the main memory and advantageously using narrow width point-to-point communications from a port to a port buffer, thereby reducing routing congestion over significant areas of a chip and enabling the elimination of a FIFO in the memory read and write paths. By eliminating the FIFO, the invention provides increased data transmission rates and throughput. In certain embodiments using point-to-point communications, the invention advantageously increases memory frequency due to the reduced RC components of the memory read and write busses, further increasing data transmission rates and throughput.
The foregoing descriptions of specific embodiments of the present invention have been presented for purposes of illustration and description. They are not intended to be exhaustive or to limit the invention to the precise forms disclosed, and obviously many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the invention and its practical application, to thereby enable others skilled in the art to best utilize the invention and various embodiments with various modifications as are suited to the particular use contemplated. It is intended that the scope of the invention be defined by the Claims appended hereto and their equivalents.
Sutardja, Sehat, Lee, Winston, Pannell, Donald
Executed on | Assignor | Assignee | Conveyance | Frame | Reel | Doc |
Jun 29 2009 | Marvell World Trade Ltd. | (assignment on the face of the patent) | / | |||
Dec 31 2019 | Marvell World Trade Ltd | MARVELL INTERNATIONAL LTD | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 051778 | /0537 | |
Dec 31 2019 | MARVELL INTERNATIONAL LTD | CAVIUM INTERNATIONAL | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 052918 | /0001 | |
Dec 31 2019 | CAVIUM INTERNATIONAL | MARVELL ASIA PTE, LTD | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 053475 | /0001 |