A hypothetical reference decoder.
1. A method comprising:
(a) receiving a first set of multiple values, each value in the first set being characteristic of a transmission bit rate for a first access point at a start of a video sequence;
(b) receiving a second set of multiple values each characteristic of a buffer size for said first access point;
(c) receiving a third set of multiple values each characteristic of an initial delay for said first access point;
(d) receiving a fourth set of multiple values characteristic of an initial delay for other access points of the video sequence, the other access points being distinct access points from the first access point;
wherein
the values within said first set, said second set, and said third set, respectively, are defined so that data received by a decoder for constructing a plurality of video frames is free from an overflow state for said first access point; and
the values within said first set, said second set, and said fourth set, respectively, are defined so that data received by a decoder for constructing a plurality of video frames is free from an overflow state for each of said other access points.
2. The method of
3. The method of
4. The method of
5. The method of
6. The method of
7. The method of
8. The method of
9. The method of
10. The method of
11. The method of
12. The method of claim 1 wherein at least one of said first access point or said other access points corresponds to a local maximum buffer fullness state of at least one leaky bucket model for a buffer of a hypothetical reference decoder.
13. The method of claim 1 wherein at least one of said first access point or said other access points corresponds to a local minimum buffer fullness state of at least one leaky bucket model for a buffer of a hypothetical reference decoder.
Typically, $t_{i+1} - t_i = 1/M$ seconds, where M is the frame rate (normally in frames/sec) of the bit stream.
A leaky bucket model with parameters (R, B, F) contains a bit stream if there is no underflow of the decoder buffer. Because the encoder and decoder buffer fullness are complements of each other this is equivalent to no overflow of the encoder buffer. However, the encoder buffer (the leaky bucket) is allowed to become empty, or equivalently the decoder buffer may become full, at which point no further bits are transmitted from the encoder buffer to the decoder buffer. Thus, the decoder buffer stops receiving bits when it is full, which is why the min operator in equation (1) is included. A full decoder buffer simply means that the encoder buffer is empty.
The following observations may be made:
Assume that the system fixes F=aB for all leaky buckets, where a is the desired initial buffer fullness expressed as a fraction of the buffer size. For each value of the peak bit rate R, the system can find the minimum buffer size Bmin that will contain the bit stream using equation (1). The plot of the curve of R-B values is shown in
By observation, the curve of (Rmin, Bmin) pairs for any bit stream (such as the one in
MPEG Video Buffering Verifier (VBV)
The MPEG video buffering verifier (VBV) can operate in two modes: constant bit rate (CBR) and variable bit rate (VBR). MPEG-1 only supports the CBR mode, while MPEG-2 supports both modes.
The VBV operates in CBR mode when the bit stream is contained in a leaky bucket model of parameters (R, B, F) and:
R=Rmax=the average bit rate of the stream.
The VBV operates in VBR mode when the bit stream is constrained in a leaky bucket model of parameters (R, B, F) and:
R=Rmax=the peak or maximum rate. Rmax is higher than the average rate of the bit stream.
The decoder buffer fullness evolves according to the following equations:
$B_0 = B$
$B_{i+1} = \min(B,\; B_i - b_i + R_{\max}/M), \quad i = 0, 1, 2, \ldots$ (3)
The encoder ensures that Bi−bi is always greater than or equal to zero. That is, the encoder must ensure that the decoder buffer does not underflow. However, in this VBR case the encoder does not need to ensure that the decoder buffer does not overflow. If the decoder buffer becomes full, then it is assumed that the encoder buffer is empty and hence no further bits are transmitted from the encoder buffer to the decoder buffer.
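The recursion of equation (3) may be illustrated with a short simulation. The following is only a sketch under stated assumptions: the function name `simulate_vbv_vbr` and the convention that `frame_bits[i]` holds the size b_i of frame i are hypothetical and are not taken from the MPEG specification.

```python
def simulate_vbv_vbr(frame_bits, buffer_size, peak_rate, frame_rate):
    """Apply equation (3) and return the fullness B_i seen before each frame is removed.

    frame_bits  -- list of frame sizes b_i in bits (assumed input format)
    buffer_size -- B, in bits
    peak_rate   -- Rmax, in bits/sec
    frame_rate  -- M, in frames/sec
    Raises ValueError if removing a frame would underflow the decoder buffer.
    """
    fullness = float(buffer_size)       # B_0 = B: decoding starts with a full buffer
    per_frame = peak_rate / frame_rate  # bits arriving during one frame interval, Rmax/M
    history = []
    for i, bits in enumerate(frame_bits):
        history.append(fullness)
        if fullness - bits < 0:
            raise ValueError(f"decoder buffer underflow at frame {i}")
        # The min() models a full decoder buffer: no further bits are accepted.
        fullness = min(buffer_size, fullness - bits + per_frame)
    return history
```

For instance, with frame sizes of 4, 1, 1 and 6 bits, B=10 bits and Rmax/M=2 bits per frame interval, the buffer refills to its cap before the last frame is removed and no underflow occurs.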
The VBR mode is useful for devices that can read data up to the peak rate Rmax. For example, a DVD includes VBR clips where Rmax is about 10 Mbits/sec, which corresponds to the maximum reading speed of the disk drive, even though the average rate of the DVD video stream is only about 4 Mbits/sec.
Referring to
Broadly speaking, the CBR mode can be considered a special case of VBR where Rmax happens to be the average rate of the clip.
H.263's Hypothetical Reference Decoder (HRD)
The hypothetical reference model for H.263 is similar to the CBR mode of MPEG's VBV previously discussed, except for the following:
Previously existing hypothetical reference decoders operate at only one point (R, B) of the curve in
A generalized hypothetical reference decoder (GHRD) can operate given the information of N leaky bucket models,
$(R_1, B_1, F_1),\ (R_2, B_2, F_2),\ \ldots,\ (R_N, B_N, F_N),$ (4)
each of which contains the bit stream. Without loss of generality, assume that these leaky buckets are ordered from smallest to largest bit rate, i.e., Ri<Ri+1. Also assume that the encoder computes these leaky bucket models correctly and hence Bi>Bi+1, since a higher bit rate requires a smaller buffer.
The desired value of N can be selected by the encoder. If N=1, the GHRD is essentially equivalent to MPEG's VBV. The encoder can choose to: (a) pre-select the leaky bucket values and encode the bit stream with a rate control that makes sure that all of the leaky bucket constraints are met, (b) encode the bit stream and then use equation (1) to compute a set of leaky buckets containing the bit stream at N different values of R, or (c) do both. The first approach (a) can be applied to live or on-demand transmission, while (b) and (c) only apply to on-demand.
The number of leaky buckets N and the leaky bucket parameters (4) are inserted into the bit stream. In this way, the decoder can determine which leaky bucket it wishes to use, knowing the peak bit rate available to it and/or its physical buffer size. The leaky bucket models in (4) as well as all the linearly interpolated or extrapolated models are available for use.
The interpolated buffer size B between points k and k+1 follows the straight line:
$B = \frac{R_{k+1} - R}{R_{k+1} - R_k} B_k + \frac{R - R_k}{R_{k+1} - R_k} B_{k+1}, \quad R_k < R < R_{k+1}$
Likewise, the initial decoder buffer fullness F can be linearly interpolated:
$F = \frac{R_{k+1} - R}{R_{k+1} - R_k} F_k + \frac{R - R_k}{R_{k+1} - R_k} F_{k+1}, \quad R_k < R < R_{k+1}$
The resulting leaky bucket with parameters (R, B, F) contains the bit stream, because the minimum buffer size Bmin is convex in both R and F, that is, the minimum buffer size Bmin corresponding to any convex combination (R, F)=a(Rk, Fk)+(1−a)(Rk+1, Fk+1), 0<a<1, is less than or equal to B=aBk+(1−a)Bk+1.
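The linear interpolation of B and F may be sketched as follows. This is an illustrative example only; the function name `interpolate_bucket` and the representation of the signalled leaky buckets as a list of (R, B, F) tuples sorted by increasing R are assumptions made for the example.

```python
def interpolate_bucket(buckets, rate):
    """Return (rate, B, F) linearly interpolated between adjacent signalled buckets.

    buckets -- list of (R, B, F) tuples sorted by increasing R (assumed layout)
    rate    -- channel rate R available to the decoder
    Raises ValueError when rate lies outside [R_1, R_N]; this sketch does not
    cover rates outside the signalled range.
    """
    for (r_k, b_k, f_k), (r_k1, b_k1, f_k1) in zip(buckets, buckets[1:]):
        if r_k <= rate <= r_k1:
            # Weight on bucket k; the weight on bucket k+1 is (1 - w),
            # which equals (R - R_k) / (R_{k+1} - R_k).
            w = (r_k1 - rate) / (r_k1 - r_k)
            return (rate, w * b_k + (1 - w) * b_k1, w * f_k + (1 - w) * f_k1)
    raise ValueError("rate is outside the range of the signalled leaky buckets")
```

At a rate equal to one of the signalled rates, the sketch simply returns that bucket's own B and F.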
It is observed that if R is larger than RN, the leaky bucket (R, BN, FN) will also contain the bit stream, and hence BN and FN are the buffer size and initial decoder buffer fullness recommended when R>=RN. If R is smaller than R1, the upper bound B=B1+(R1−R)T can be used (and one can set F=B), where T is the time length of the stream in seconds. These (R, B) values outside the range of the N points are also shown in
The Joint Video Team of ISO/IEC MPEG and ITU-T VCEG Working Draft Number 2, Revision 0 (WD-2) incorporated many of the concepts of the hypothetical reference decoder proposed by Jordi Ribas-Corbera, et al. of Microsoft Corporation, incorporated by reference herein. The WD-2 document is similar to that proposal, though the syntax is somewhat modified. In addition, WD-2 describes an example algorithm to compute B and F for a given rate R.
As previously described, the JVT standard (WD-2) allows the storing of N (N>=1) leaky buckets, (R1, B1, F1), . . . , (RN, BN, FN), whose values are contained in the bit stream. These values may be stored in the header. Using Fi as the initial buffer fullness and Bi as the buffer size guarantees that the decoder buffer will not underflow when the input stream comes in at the rate Ri. This will be the case if the user desires to present the encoded video from start to end. In a typical video-on-demand application the user may want to seek to different portions of the video stream. The point that the user desires to seek to may be referred to as the access point. During the process of receiving video data and constructing video frames the amount of data in the buffer fluctuates. After consideration, the present inventor came to the realization that if the Fi value of the initial buffer fullness (when the channel rate is Ri) is used before starting to decode the video from the access point, then it is possible that the decoder will have an underflow. For example, at the access point or sometime thereafter the number of bits necessary for video reconstruction may be greater than the number of bits currently in the buffer, resulting in underflow and an inability to present video frames in a timely manner. It can likewise be shown that in a video stream the value of initial buffer fullness required to make sure there is no underflow at the decoder varies based on the point to which the user seeks. This value is bounded by Bi. Accordingly, the combination of B and F provided for the entire video sequence, if used for an intermediate point in the video, will likely not be appropriate, resulting in an underflow and thus frozen frames.
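The dependence of the required initial buffer fullness on the access point may be illustrated numerically. The following sketch is not taken from the specification or from WD-2; it assumes the simplified buffer model of equation (3), in which R/M bits arrive per frame interval and the fullness is capped at B. The function name `required_initial_fullness` and its backward recursion are hypothetical.

```python
def required_initial_fullness(frame_bits, buffer_size, rate, frame_rate):
    """Return, for each frame index k, the smallest initial fullness that avoids
    underflow when decoding starts at frame k (assumed model: equation (3)).

    Works backwards from the last frame: the fullness needed before removing
    frame i is b_i plus whatever the next frame needs beyond the R/M bits that
    arrive during one frame interval.
    """
    per_frame = rate / frame_rate
    required = [0.0] * len(frame_bits)
    need_next = 0.0  # nothing is needed after the last frame
    for i in range(len(frame_bits) - 1, -1, -1):
        required[i] = frame_bits[i] + max(0.0, need_next - per_frame)
        if required[i] > buffer_size:
            raise ValueError(f"buffer too small to start decoding at frame {i}")
        need_next = required[i]
    return required
```

With frame sizes of 4, 1, 1 and 6 bits, B=10 bits and R/M=2 bits per frame interval, the sketch yields required fullness values of 6, 4, 5 and 6 bits for the four possible starting frames, so a single F chosen for one starting point is not necessarily sufficient for another.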
Based upon this previously unrealized underflow potential, the present inventor then came to the realization that if only a set of R, B, and F values are defined for an entire video segment, then the system should wait until the buffer B for the corresponding rate R is full or substantially full (or greater than 90% full) to start decoding frames when a user jumps to an access point. In this manner, the initial fullness of the buffer will be at a maximum and thus there is no potential of underflow during subsequent decoding starting from the access point. This may be achieved without any additional changes to the existing bit stream, thus not impacting existing systems. Accordingly, the decoder would use the value of initial buffering Bj for any point the user seeks to when the rate is Rj, as shown in
The initial buffer fullness (F) may likewise be characterized as a delay until the video sequence is presented (e.g., initial_cpb_removal_delay). The delay is temporal in nature, being related to the time necessary to achieve the initial buffer fullness (F). The delay and/or F may be associated with the entire video or with the access points. It is likewise to be understood that the delay may be substituted for F in all embodiments described herein (e.g., (R, B, delay)). One particular value for the delay may be calculated as delay=F/R, expressed in a special time unit (units of a 90 kHz clock).
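The conversion delay=F/R into units of a 90 kHz clock may be sketched as follows. The function name `initial_delay_ticks` and the rounding to the nearest tick are assumptions made for the example; an actual bit stream syntax may prescribe a different rounding rule.

```python
def initial_delay_ticks(fullness_bits, rate_bps, clock_hz=90_000):
    """Convert an initial buffer fullness F (bits) at rate R (bits/sec) into a
    delay in clock ticks: delay = F / R seconds, multiplied by the clock rate.

    Rounding to the nearest tick is an assumption of this sketch.
    """
    return round(fullness_bits * clock_hz / rate_bps)
```

For example, F=2,000,000 bits at R=4,000,000 bits/sec corresponds to 0.5 seconds, i.e., 45,000 ticks of the 90 kHz clock.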
To reduce the potential delay the present inventor came to the realization that sets of (R, B, F) may be defined for a particular video stream at each access point. Referring to
The sets of R, B, F values for each access point may be located at any suitable location, such as for example, at the start of the video sequence together with sets of (R, B, F) values for the entire video stream or before each access point which avoids the need for an index; or stored in a manner external to the video stream itself which is especially suitable for a server/client environment.
This technique may be characterized by the following model:
$(R_1, B_1, F_1, M_1, f_{11}, t_{11}, \ldots, f_{M_1 1}, t_{M_1 1}),\ \ldots,\ (R_N, B_N, F_N, M_N, f_{1N}, t_{1N}, \ldots, f_{M_N N}, t_{M_N N}),$
where fkj denotes the initial buffer fullness value at rate Rj at access point tkj (time stamp). The values of Mj may be provided as an input parameter or may be automatically selected.
For example, Mj may include the following options:
The system may, for a given Rj, use an initial buffer fullness equal to fkj if the user seeks to an access point tkj. This occurs when the user selects to start at an access point, or when the system otherwise adjusts the user's selection to one of the access points.
It is noted that in the case that a variable bit rate bit stream is used, the initial buffer fullness value (or delay) is preferably different from the buffer size, although it may be the same. In the case of variable bit rate in MPEG-2, the VBV buffer is filled until it is full, i.e., F=B (the value of B is represented by vbv_buffer_size).
If the system permits the user to jump to any frame of the video in the manner of an access point, then the decoding data set would need to be provided for each and every frame. While permissible, the resulting data set would be excessively large and consume a significant amount of the bitrate available for the data. A more reasonable approach would be to limit the user to specific access points within the video stream, such as every second, 10 seconds, 1 minute, etc. While an improvement, the resulting data set may still be somewhat extensive resulting in excessive data for limited bandwidth devices, such as mobile communication devices.
In the event that the user selects a position that is not one of the access points with an associated data set, then the initial buffer fullness may be equal to max(fkj, f(k+1)j) for a time between tkj and t(k+1)j, especially if the access points are properly selected. In this manner, the system has a set of values that avoids an underflow condition, or at least reduces the likelihood of an underflow condition, as explained below.
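The selection rule for a seek position, whether it coincides with an access point or falls between two access points, may be sketched as follows. The function name `fullness_for_seek` and the representation of the per-rate data as a time-sorted list of (t, f) pairs are assumptions made for the example; positions outside the listed access points are not covered by this sketch.

```python
from bisect import bisect_right


def fullness_for_seek(access_points, seek_time):
    """Pick the initial buffer fullness for a seek position at one rate R_j.

    access_points -- list of (t_kj, f_kj) pairs sorted by time (assumed layout)
    seek_time     -- position the user seeks to, in the same time units
    Returns f_kj when seek_time is exactly an access point, otherwise
    max(f_kj, f_(k+1)j) for the two access points surrounding seek_time.
    """
    times = [t for t, _ in access_points]
    if not times or seek_time < times[0] or seek_time > times[-1]:
        raise ValueError("seek position is outside the listed access points")
    k = bisect_right(times, seek_time) - 1  # last access point at or before seek_time
    t_k, f_k = access_points[k]
    if seek_time == t_k:
        return f_k
    return max(f_k, access_points[k + 1][1])
```

For access points at 0, 10 and 20 seconds with fullness values 6000, 4000 and 5000 bits, a seek to 10 seconds uses 4000 bits, while a seek to 12.5 seconds uses max(4000, 5000)=5000 bits.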
To select a set of values that will ensure no underflow condition (or otherwise reduce) when the above-referenced selection criteria is used, reference is made to
Based upon the selection criteria a set of 10 points for
In addition, if the bit rate and the buffer size remain the same while selecting a different access point, then merely the modified buffer fullness, F, needs to be provided or otherwise determined.
All the references cited herein are incorporated by reference.
The terms and expressions that have been employed in the foregoing specification are used as terms of description and not of limitation, and there is no intention, in the use of such terms and expressions, of excluding equivalents of the features shown and described or portions thereof, it being recognized that the scope of the invention is defined and limited only by the claims that follow.
Executed on | Assignor | Assignee | Conveyance | Frame | Reel | Doc |
Mar 28 2003 | DESHPANDE, SACHIN G | SHARP LABORAORIES OF AMERICA | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 038182 | /0361 | |
Apr 28 2008 | Sharp Laboratories of America, Inc | Sharp Kabushiki Kaisha | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 050407 | /0100 | |
Sep 29 2015 | Sharp Kabushi Kaisha | Dolby Laboratories Licensing Corporation | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 050407 | /0176 | |
Sep 29 2015 | Sharp Kabushiki Kaisha | Dolby Laboratories Licensing Corporation | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 038182 | /0555 | |
Dec 15 2015 | Dolby Laboratories Licensing Corporation | (assignment on the face of the patent) | / |