In an image processing system, successive segmentation steps are carried out. In the case of a “hole” in the segmentation result, which is defined as a small area of one type (such as black) within a slightly larger “island” of another type (such as white), a basic segmentation technique may result in an error. The error is avoided by retaining, for each distinct area in the image, a variable indicating whether the area has been “flipped” from one type to another in a previous processing operation.
1. A method of processing image data associated with an MRC selector plane, the image data including a first subset of the image data and a second subset of the image data, comprising:
identifying in the image data a hole, a hole being an area associated with the first subset, the hole being substantially surrounded by an island associated with the second subset, and the island being substantially surrounded by a greater area associated with the first subset;
altering the image data so that the hole becomes associated with the second subset; and
maintaining, for at least a subset of image data associated with the hole, a data structure including a variable indicative of whether the image data has been previously altered.
2. The method of
applying a simplification algorithm to the image data; and
suppressing alteration of image data associated with the hole during the applying step.
3. The method of
altering the image data if the hole is within a predetermined size range.
4. The method of
maintaining, for at least a subset of image data associated with the island, a data structure including a variable indicative of whether the image data has been previously altered.
5. The method of
U.S. Pat. No. 6,240,205 is hereby incorporated by reference for the teachings therein.
The present disclosure relates to a technique for organizing and segmenting image data, as would be useful in, for example, digital scanners, cameras or printers.
Image data is often stored in the form of multiple scanlines, each scanline comprising multiple pixels. When processing this type of image data, it is helpful to know the type of image represented by the data. For instance, the image data could represent graphics, text, a halftone, contone, or some other recognized image type. A page of image data could be all one type, or some combination of image types.
It is known in the art to take a page of image data and to separate, or “segment,” the image data into windows of similar image types. For instance, a page of image data may include a halftone picture with accompanying text describing the picture. In order to efficiently process the image data, it is desirable to segment the pictorial area from the text area. Processing of the page of image data can then be efficiently carried out by tailoring the processing to the type of image data being processed based on the segmentation result.
One common overall method for performing image segmentation is the use of a “mixed-raster content” or MRC representation of image data. There are several variations of MRC representation, as shown for example in
In segmentation of MRC image data to yield a selector plane, as well as in other activities with any kind of image data, a kind of error of segmentation is called the “hole” problem. A “hole” in an initial segmentation result (such as a selector plane in the three layer MRC case) can be defined as a small area associated with a first subset or type of image data surrounded by a greater “island” of pixels associated with a second subset or type of image data, the island in turn being substantially surrounded by a greater area associated with the first subset. As will be described in detail below, the presence of such holes in image data, such as in an MRC image plane, can lead to special problems of misclassification of portions of the image data.
U.S. Pat. No. 6,240,205 gives a general description of segmenting and classifying image data, including steps of separating each scanline of image data into edges and image runs and classifying each of the edges and image runs as standard image types.
U.S. Pat. Nos. 5,778,092 and 6,608,928 disclose examples of processing MRC image planes.
There is provided a method of processing image data, the image data including a first subset of the image data and a second subset of the image data. Holes associated with the first subset image data, surrounded by islands associated with the second subset image data, the islands being substantially surrounded by a greater area associated with the first subset, are identified in the image data.
There is further provided a method of processing image data, the image data describing runs and windows in an image. For the image data, there is maintained a set of window data structures associated with windows in the image, each of the window data structures including a history variable indicative of whether the image data associated therewith has been previously altered.
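The run and window records mentioned above might be laid out along the following lines. This is only a sketch in Python under stated assumptions: the class names `Run` and `Window`, every field name, the 0/1 encoding of the two image types, and the `flip` helper are hypothetical and are not taken from the disclosure, which specifies only that each window data structure carries a history variable.

```python
from dataclasses import dataclass


@dataclass
class Window:
    """One unconnected area of the image (hypothetical layout)."""
    window_id: int
    state: int        # assumed encoding: 0 = white, 1 = black
    history: int = 0  # 1 once the window has been altered, else 0

    def flip(self) -> None:
        """Switch the window to the other type and record that fact."""
        self.state = 1 - self.state
        self.history = 1


@dataclass
class Run:
    """A stretch of same-type pixels along one scanline (hypothetical)."""
    scanline: int
    start: int      # index of the first pixel in the run
    length: int     # number of pixels in the run
    window_id: int  # the window this run belongs to


# A small black island that a simplification pass decides to flip.
island = Window(window_id=1, state=1)
island.flip()
run = Run(scanline=3, start=2, length=3, window_id=island.window_id)
```

Keeping the history flag on the window rather than on each run means that a single alteration is visible to every run that refers to that window.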
In the following detailed description, a method of processing image data will be specifically described with regard to a mixed-raster content (MRC) selector plane, but it will be understood that the method can be applied to the processing of any type of image data for any purpose.
As part of an algorithm to simplify the selector plane, as described above, a typical action is to identify unconnected shapes or “islands” in a segmentation result, and if the island is smaller than a certain threshold size, “flip” the segmentation result (i.e., change the pixels in the selector plane from their original black to white, or vice-versa) in the island to assume the image type of its surrounding pixels. The hole can have tens of thousands of pixels depending on the resolution of the image.
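The island-flipping step described above can be sketched as follows. This is a minimal illustration in Python, not the implementation of the disclosure: the function names `label_components` and `flip_small_islands`, the 0/1 encoding (0 for white, 1 for black), the use of 4-connectivity, and the `threshold` parameter are all assumptions made for the example.

```python
from collections import deque


def label_components(plane):
    """Label 4-connected areas of equal value; return (labels, sizes)."""
    rows, cols = len(plane), len(plane[0])
    labels = [[-1] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] != -1:
                continue  # pixel already belongs to a labelled area
            label = len(sizes)
            value = plane[r][c]
            labels[r][c] = label
            queue = deque([(r, c)])
            count = 0
            while queue:
                y, x = queue.popleft()
                count += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and labels[ny][nx] == -1
                            and plane[ny][nx] == value):
                        labels[ny][nx] = label
                        queue.append((ny, nx))
            sizes.append(count)
    return labels, sizes


def flip_small_islands(plane, threshold):
    """Flip every area whose pixel count is below the assumed threshold."""
    labels, sizes = label_components(plane)
    return [
        [1 - v if sizes[labels[r][c]] < threshold else v
         for c, v in enumerate(row)]
        for r, row in enumerate(plane)
    ]


# A single black pixel on a 5x5 white plane is erased by the flip.
speck = [[0] * 5 for _ in range(5)]
speck[2][2] = 1
cleaned = flip_small_islands(speck, threshold=4)

# A ring-shaped black island enclosing one white pixel on a 7x7 plane.
ring = [[0] * 7 for _ in range(7)]
for y in range(2, 5):
    for x in range(2, 5):
        ring[y][x] = 1
ring[3][3] = 0
naive = flip_small_islands(ring, threshold=9)
```

In this sketch each small area is flipped independently of its neighbors. For the `ring` plane, the island and the enclosed pixel both fall below the threshold, so the island turns white while the enclosed pixel turns black.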
In a general case, the flipping will effectively erase the island; for example, a small island of black pixels surrounded by white pixels will be “flipped” to be white pixels and thus disappear. A problem occurs with a hole within an island, as in
According to the present embodiment, the “hole” problem of
What follows is a description of one practical implementation of a method of overcoming the “hole” problem, with reference to
When an image, or image-like data set such as a selector plane, is processed, the pixel data is processed through a series of scanlines in the image, each scan line including a series of pixels, and the scanlines arranged next to each other forming a raster which creates the two-dimensional image. When the image data of a selector plane is processed on such a line-by-line basis, there can be identified “runs” of black or white pixels along each scanline. In
In addition to line definitions, another set of data structures is “window definitions.” Window definitions describe unconnected areas (windows, which can be islands or holes) of black or white pixels in the selector plane. Examples of areas associated with window definitions are shown in
A method for avoiding the “hole problem” can, with the above-described line definition and window definition data structures, be carried out on an on-the-fly basis by processing a series of line definitions forming a selector plane and cross-checking each line definition with its corresponding window definition. According to this method, all “runs” such as R1, R2, R3 within the selector plane are considered sequentially along a scanline. For any two adjacent runs within a single scanline, their corresponding window definitions are checked. If the two corresponding windows are of different states AND they have BOTH been previously flipped from their original states (i.e., their history variables are BOTH 1), the state of the second of the two runs in the scanline is in effect reversed, or in other words the flipping of the state of the second of the two runs is suppressed if it is otherwise mandated by a generally-applied algorithm. This operation has the effect of overcoming the “hole problem” as described above.
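The adjacent-run cross-check can be illustrated with a short Python sketch. It reflects one reading of the paragraph above and rests on several assumptions that are not in the disclosure: the names `Window`, `Run` and `resolve_scanline`, the 0/1 state encoding, and in particular the choice to write the reversed state back into the window record so that later runs of the same scanline see it.

```python
from dataclasses import dataclass


@dataclass
class Window:
    state: int        # assumed encoding: 0 = white, 1 = black
    history: int = 0  # 1 if the window was flipped earlier


@dataclass
class Run:
    start: int   # first pixel of the run on the scanline
    length: int  # number of pixels in the run
    window: int  # index of the run's window in the window table


def resolve_scanline(runs, windows):
    """Return the output state of each run, left to right.

    When two adjacent runs belong to windows of different states that
    have both been flipped before, the second window's state is
    reversed (assumption: the change is stored in the window record).
    """
    states = []
    for i, run in enumerate(runs):
        win = windows[run.window]
        if i > 0:
            prev = windows[runs[i - 1].window]
            if (prev.state != win.state
                    and prev.history == 1 and win.history == 1):
                win.state = 1 - win.state  # undo the earlier flip
        states.append(win.state)
    return states


# Window 0: white background, never flipped.
# Window 1: island, flipped from black to white.
# Window 2: hole, flipped from white to black by the same pass.
windows = [Window(0, 0), Window(0, 1), Window(1, 1)]
# One scanline crossing background, island, hole, island, background.
runs = [Run(0, 2, 0), Run(2, 1, 1), Run(3, 1, 2), Run(4, 1, 1), Run(5, 2, 0)]
states = resolve_scanline(runs, windows)
```

Here the island and hole windows differ in state and both have a history value of 1, so the hole's run is reversed and the whole scanline comes out white. The following island run is left alone because, once the hole's window has been updated, the two windows no longer differ.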
The described embodiment can address, or be readily adapted to address, situations in which multiple island-and-hole relationships are cascaded, e.g., a hole within an island, the island surrounded by a greater area, the greater area being surrounded by, in effect, a more-greater area.
Patent | Priority | Assignee | Title |
7693329, | Jun 30 2004 | CHINA CITIC BANK CORPORATION LIMITED, GUANGZHOU BRANCH, AS COLLATERAL AGENT | Bound document scanning method and apparatus |
Patent | Priority | Assignee | Title |
4539704, | Sep 15 1983 | Pitney Bowes Inc. | Image thinning process |
5778092, | Dec 20 1996 | Nuance Communications, Inc | Method and apparatus for compressing color or gray scale documents |
6173075, | Aug 30 1995 | TOON BOOM ANIMATION INC | Drawing pixmap to vector conversion |
6240205, | Jul 26 1996 | Xerox Corporation | Apparatus and method for segmenting and classifying image data |
6249604, | Nov 19 1991 | Technology Licensing Corporation | Method for determining boundaries of words in text |
6498608, | Dec 15 1998 | Microsoft Technology Licensing, LLC | Method and apparatus for variable weight outline emboldening of scalable outline fonts |
6608928, | Nov 03 1999 | Xerox Corporation | Generic pre-processing of mixed raster content planes |
7043080, | Nov 21 2000 | Sharp Kabushiki Kaisha | Methods and systems for text detection in mixed-context documents using local geometric signatures |
Executed on | Assignor | Assignee | Conveyance | Frame | Reel | Doc |
Jun 25 2003 | Xerox Corporation | JPMorgan Chase Bank, as Collateral Agent | SECURITY AGREEMENT | 015722 | /0119 | |
Nov 20 2003 | Li, Xing | Xerox Corporation | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 014737 | /0092 | |
Nov 20 2003 | BAI, YINGJUN | Xerox Corporation | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 014737 | /0092 | |
Nov 21 2003 | Xerox Corporation | (assignment on the face of the patent) | / | |||
Aug 22 2022 | JPMORGAN CHASE BANK, N A AS SUCCESSOR-IN-INTEREST ADMINISTRATIVE AGENT AND COLLATERAL AGENT TO BANK ONE, N A | Xerox Corporation | RELEASE BY SECURED PARTY SEE DOCUMENT FOR DETAILS | 061360 | /0501 | |
Nov 07 2022 | Xerox Corporation | CITIBANK, N A , AS AGENT | SECURITY INTEREST SEE DOCUMENT FOR DETAILS | 062740 | /0214 | |
May 17 2023 | CITIBANK, N A , AS AGENT | Xerox Corporation | RELEASE OF SECURITY INTEREST IN PATENTS AT R F 062740 0214 | 063694 | /0122 | |
Jun 21 2023 | Xerox Corporation | CITIBANK, N A , AS COLLATERAL AGENT | SECURITY INTEREST SEE DOCUMENT FOR DETAILS | 064760 | /0389 | |
Nov 17 2023 | Xerox Corporation | JEFFERIES FINANCE LLC, AS COLLATERAL AGENT | SECURITY INTEREST SEE DOCUMENT FOR DETAILS | 065628 | /0019 | |
Feb 06 2024 | Xerox Corporation | CITIBANK, N A , AS COLLATERAL AGENT | SECURITY INTEREST SEE DOCUMENT FOR DETAILS | 066741 | /0001 | |
Feb 06 2024 | CITIBANK, N A , AS COLLATERAL AGENT | Xerox Corporation | TERMINATION AND RELEASE OF SECURITY INTEREST IN PATENTS RECORDED AT RF 064760 0389 | 068261 | /0001 |
Date | Maintenance Fee Events |
Dec 14 2011 | M1551: Payment of Maintenance Fee, 4th Year, Large Entity. |
Jan 22 2016 | M1552: Payment of Maintenance Fee, 8th Year, Large Entity. |
Feb 02 2020 | M1553: Payment of Maintenance Fee, 12th Year, Large Entity. |
Date | Maintenance Schedule |
Aug 05 2011 | 4 years fee payment window open |
Feb 05 2012 | 6 months grace period start (w surcharge) |
Aug 05 2012 | patent expiry (for year 4) |
Aug 05 2014 | 2 years to revive unintentionally abandoned end. (for year 4) |
Aug 05 2015 | 8 years fee payment window open |
Feb 05 2016 | 6 months grace period start (w surcharge) |
Aug 05 2016 | patent expiry (for year 8) |
Aug 05 2018 | 2 years to revive unintentionally abandoned end. (for year 8) |
Aug 05 2019 | 12 years fee payment window open |
Feb 05 2020 | 6 months grace period start (w surcharge) |
Aug 05 2020 | patent expiry (for year 12) |
Aug 05 2022 | 2 years to revive unintentionally abandoned end. (for year 12) |