A method and system for determining the orientation of an image of a picture within a scanned image, including the steps of locating in the scanned image the contour of the image of the picture, determining a plurality of bounding boxes confining the contour of the image of the picture, selecting one of the plurality of bounding boxes that is substantially aligned with the contour of the image of the picture, and calculating an angle of rotation of the picture based on the selected bounding box.
0. 33. An apparatus, comprising:
means for locating in the scanned image a contour of the image of the picture;
means for determining one or more bounding boxes confining the contour of the image of the picture;
means for selecting one of the bounding boxes that is substantially aligned with the contour of the image of the picture; and
means for calculating an angle of rotation of the picture based at least in part on the selected bounding box.
0. 64. An apparatus, comprising:
means for locating in a scanned image a contour of the image of a picture;
means for determining one or more bounding boxes confining the contour of the image of the picture;
means for selecting one or more of the bounding boxes that is significantly aligned with the contour of the image of the picture; and
means for calculating an angle of rotation of the picture based at least in part on the selected bounding box.
1. A method for determining the orientation of an image of a picture within a scanned image, comprising the steps of:
locating in the scanned image a contour of the image of the picture;
determining a plurality of bounding boxes confining the contour of the image of the picture;
selecting one of the plurality of bounding boxes that is substantially aligned with the contour of the image of the picture; and
calculating an angle of rotation of the picture based on the selected bounding box.
32. A system for determining the orientation of an image of a picture within a scanned image, comprising:
an image processor locating in the scanned image a contour of the image of the picture;
a box generator determining a plurality of bounding boxes confining the contour of the image of the picture;
a box processor selecting one of the plurality of bounding boxes that is significantly aligned with the contour of the image of the picture; and
an angle processor calculating an angle of rotation of the picture based on the selected bounding box.
16. A method for operating a scanner, comprising:
detecting that a user has placed a plurality of pictures in arbitrary orientations on a scanner bed of the scanner;
scanning the plurality of pictures to generate a scanned image containing a plurality of images of the pictures; and
automatically determining an orientation of at least one of the images of the pictures relative to the scanner bed using the scanned image;
applying edge detection to the scanned image to locate edges of the plurality of images of the pictures, and
identifying bounding areas of pixel locations for each image of a picture from among the plurality of images of pictures, each bounding area surrounding one image of a picture from among the plurality of images of pictures, such that identifying bounding areas includes,
initializing a plurality of expandable groups of pixel locations;
expanding each of the expandable groups of pixel locations until none of the pixel locations on its boundary are situated at edges.
0. 48. An apparatus, comprising:
means for detecting that a user has placed one or more pictures in arbitrary orientations on a scanner bed of a scanner;
means for scanning the one or more pictures to generate a scanned image containing one or more images of the pictures; and
means for determining an orientation of at least one of the images of the pictures relative to the scanner bed using the scanned image;
means for applying edge detection to the scanned image to locate edges of the one or more images of the pictures, and
means for identifying bounding areas of pixel locations for one or more image of a picture from among the one or more images of pictures, one or more bounding area surrounding one image of a picture from among the one or more images of pictures, such that said means for identifying bounding areas comprises:
means for initializing one or more expandable groups of pixel locations;
means for expanding one or more of the expandable groups of pixel locations until none, or nearly none, of the pixel locations on its boundary are situated at edges.
0. 55. An apparatus, comprising:
means for detecting that a user has placed one or more pictures in arbitrary orientations on a scanner bed of the scanner;
means for scanning the one or more pictures to generate a scanned image containing one or more images of the pictures;
means for determining an orientation of at least one of the images of the pictures relative to the scanner bed using the scanned image;
means for applying edge detection to the scanned image to locate edges of the one or more images of the pictures, and
means for identifying bounding areas of pixel locations for one or more image of a picture from among the one or more images of pictures, one or more bounding area surrounding one image of a picture from among the one or more images of pictures, wherein said means for determining the orientation comprises:
means for detecting a contour of the image of the picture;
means for determining one or more bounding boxes enclosing the contour of the image of the picture;
means for selecting one of the one or more bounding boxes that is substantially aligned with the contour of the image of the picture; and
means for calculating an angle of rotation of the picture based at least in part on the selected bounding box.
23. A method for operating a scanner, comprising:
detecting that a user has placed a plurality of pictures in arbitrary orientations on a scanner bed of the scanner;
scanning the plurality of pictures to generate a scanned image containing a plurality of images of the pictures;
automatically determining an orientation of at least one of the images of the pictures relative to the scanner bed using the scanned image;
applying edge detection to the scanned image to locate edges of the plurality of images of the pictures, and
identifying bounding areas of pixel locations for each image of a picture from among the plurality of images of pictures, each bounding area surrounding one image of a picture from among the plurality of images of pictures, wherein said automatically determining the orientation comprises, for each image of a picture from among the plurality of images of pictures, the steps of:
detecting a contour of the image of the picture;
determining a plurality of bounding boxes enclosing the contour of the image of the picture;
selecting one of the plurality of bounding boxes that is substantially aligned with the contour of the image of the picture; and
calculating an angle of rotation of the picture based on the selected bounding box.
2. The method of
3. The method of
4. The method of
5. The method of
6. The method of
7. The method of
8. The method of
9. The method of
10. The method of
providing a reference box; and
rotating the reference box about its center through multiple angles of rotation.
11. The method of
finding a region that bounds the contour of the image of the picture; and
generating a box that encloses within it a circle, the circle being large enough to enclose the region.
12. The method of
13. The method of
14. The method of
determining for each of the bounding boxes an average distance between the bounding box and the contour of the image of the picture; and
choosing a bounding box for which the average distance between the chosen bounding box and the contour of the image of the picture is small as compared to the respective average distances between others of the plurality of bounding boxes and the contour of the image of the picture.
15. The method of
17. The method of
18. The method of
19. The method of
20. The method of
21. The method of
22. The method of
determining a separation between two expandable groups of pixel locations; and
coalescing the two expandable groups of pixel locations when their separation is smaller than a prescribed threshold.
24. The method of
25. The method of
26. The method of
providing a reference box; and
rotating the reference box about its center through multiple angles of rotation.
27. The method of
finding a region that bounds the contour of the image of the picture; and
generating a reference box that encloses within it a circle, the circle being large enough to enclose the region.
28. The method of
29. The method of
30. The method of
choosing a bounding box for which the average distance between the chosen bounding box and the contour of the image of the picture is small as compared to the respective average distances between others of the plurality of bounding boxes and the contour of the image of the picture.
31. The method of
0. 34. An apparatus as claimed in claim 33, further comprising means for applying edge detection to the scanned image, to locate edges of the image of the picture.
0. 35. An apparatus as claimed in claim 34, wherein said means for applying edge detection comprises a Gaussian filter.
0. 36. An apparatus as claimed in claim 34, said means for applying edge detection comprises a Laplace filter.
0. 37. An apparatus as claimed in claim 34, said means for applying edge detection comprises a Laplacian of Gaussian filter.
0. 38. An apparatus as claimed in claim 34, further comprising means for identifying a bounding area of pixel locations that surrounds the image of the picture.
0. 39. An apparatus as claimed in claim 38, further comprising means for pre-scanning the picture at a low resolution to obtain a pre-scanned image, and wherein the pre-scanned image is used to identify the bounding area of pixel locations.
0. 40. An apparatus as claimed in claim 38, said means for locating comprises means for locating pixel locations situated at edges of the image of the picture and within the bounding area that are nearest to the border of the bounding area.
0. 41. An apparatus as claimed in claim 33, wherein one or more of the bounding boxes is associated with an angle, and wherein the positioning angle is the angle associated with the selected bounding box.
0. 42. An apparatus as claimed in claim 33 wherein said means for determining comprises:
means for providing a reference box; and
means for rotating the reference box about its center through multiple angles of rotation.
0. 43. An apparatus as claimed in claim 42 wherein said means for providing comprises:
means for finding a region that bounds the contour of the image of the picture; and
means for generating a box that encloses within it a circle, the circle being large enough to enclose the region.
0. 44. An apparatus as claimed in claim 43, wherein the center of the circle is coincident with the center of the reference box.
0. 45. An apparatus as claimed in claim 43, wherein the center of the reference box is coincident with the centroid of the image of the picture.
0. 46. An apparatus as claimed in claim 33, said means for selecting comprising:
means for determining for one or more of the bounding boxes an average distance between the bounding box and the contour of the image of the picture; and
means for choosing a bounding box for which the average distance between the chosen bounding box and the contour of the image of the picture is small as compared to the respective average distances between others of the plurality of bounding boxes and the contour of the image of the picture.
0. 47. An apparatus as claimed in claim 46 wherein said means for choosing comprises means for choosing a bounding box for which the average distance between the bounding box and the contour of the image of the picture is smallest among the respective average distances between the one or more bounding boxes and the contour of the image of the picture.
0. 49. An apparatus as claimed in claim 48, said means for applying edge detection comprising a Gaussian filter.
0. 50. An apparatus as claimed in claim 48, said means for applying edge detection comprising a Laplace filter.
0. 51. An apparatus as claimed in claim 48, said means for applying edge detection comprising a Laplacian of Gaussian filter.
0. 52. An apparatus as claimed in claim 48, further comprising means for pre-scanning the one or more pictures at a lower resolution to obtain a pre-scanned image, and wherein the pre-scanned image is used to identify the bounding areas.
0. 53. An apparatus as claimed in claim 48, further comprising means for coalescing two or more expandable groups of pixel locations when the two or more expandable groups at least partially overlap as a result of said means for expanding.
0. 54. An apparatus as claimed in claim 48, further comprising:
means for determining a separation between two or more expandable groups of pixel locations; and
means for coalescing the two or more expandable groups of pixel locations when their separation is smaller than a prescribed threshold.
0. 56. An apparatus as claimed in claim 55, wherein one or more of the bounding boxes is associated with an angle, and wherein the positioning angle is the angle associated with the selected bounding box.
0. 57. An apparatus as claimed in claim 55, said means for detecting comprising means for locating pixel locations situated at edges of the image of the picture and within the bounding area of the image of the picture that are nearest to the border of the bounding area.
0. 58. An apparatus as claimed in claim 55, said means for determining comprising:
means for providing a reference box; and
means for rotating the reference box about its center through multiple angles of rotation.
0. 59. An apparatus as claimed in claim 58, said means for providing comprising:
means for finding a region that bounds the contour of the image of the picture; and
means for generating a reference box that encloses within it a circle, the circle being large enough to enclose the region.
0. 60. An apparatus as claimed in claim 59, wherein the center of the circle is coincident with the center of the reference box.
0. 61. An apparatus as claimed in claim 59, wherein the center of the box is coincident with the centroid of the picture.
0. 62. An apparatus as claimed in claim 55, said means for selecting comprising:
means for determining for one or more of the bounding boxes, an average distance between the bounding box and the contour of the image of the picture; and
means for choosing a bounding box for which the average distance between the chosen bounding box and the contour of the image of the picture is relatively small as compared to the respective average distances between others of the one or more bounding boxes and the contour of the image of the picture.
0. 63. An apparatus as claimed in claim 62, said means for choosing comprising means for choosing a bounding box for which the average distance between the chosen bounding box and the contour of the image of the picture is smallest, or nearly smallest, among the respective average distances between the bounding boxes and the contour of the image of the picture.
This is a continuation-in-part of application U.S. Ser. No. 09/151,437, filed on Sep. 11, 1998.
The value of σ for human visual perception is taken to be between 1.2 and 1.4, and the window for the convolution with the above filter is taken to be a square of size Round(6σ+3.5).
When using the Marr-Hildreth LoG operator, edge locations are identified as the zero-crossings of the result when the scanned image is convolved with the above filter. Since zero crossings do not always lie at pixel locations, several methods are used to identify edge locations. These methods appear in Pratt, and are known to those skilled in the art of edge detection. For example, in one method edge locations are marked at each pixel with a positive response that has a neighbor with a negative response. In another method, the maximum of all positive responses and the minimum of all negative responses are formed in a 3×3 window around a selected pixel location. If the magnitude of the difference between the maximum and the minimum exceeds a threshold, an edge is judged to be present at the selected pixel location.
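By way of illustration, the following short function is a minimal sketch of the second zero-crossing test described above, i.e., the 3×3 window method. The function name and signature are illustrative assumptions: logmap is a row-major array of LoG responses, and the caller is assumed to keep the 3×3 window inside the image. This sketch is not the listing used in the preferred embodiment, which appears hereinbelow.
// Sketch of the 3x3 max/min zero-crossing test; logmap is a row-major array
// of LoG responses, and the caller keeps the 3x3 window inside the image.
bool EdgeAt3x3(const float* logmap, int width, int r, int c, float threshold)
{
float maxPos = float(0.0); // largest positive response in the window
float minNeg = float(0.0); // most negative response in the window
for (int dr = -1; dr <= 1; dr++)
for (int dc = -1; dc <= 1; dc++)
{
float v = logmap[(r + dr)*width + (c + dc)];
if (v > maxPos) maxPos = v;
if (v < minNeg) minNeg = v;
}
// an edge is judged present when the magnitude of the difference between
// the maximum and the minimum exceeds the threshold
return (maxPos - minNeg) > threshold;
}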
In a preferred embodiment of the present invention, since it is the contour of the picture within the scanned image that is being sought, edge locations are only computed near the sides of the bounding box. Starting from each pixel location on each side of the bounding box, the edge detection processing moves inwards towards the picture, along lines perpendicular to that side of the bounding box, one pixel at a time, until the first edge location is found. By repeating this for each pixel location on each side of the bounding box, all pixel locations where outermost edges of the picture are situated are determined.
The following computer listing of software in the C++ programming language implements Marr-Hildreth Laplacian of Gaussian edge detection as used in a preferred embodiment of the present invention. The constructor marrHildrethFilter calculates the values of the Mexican hat filter from Equation 3 and sets the elements of the array logaussian accordingly. The window size, fwidth, is set equal to the value sint32(float(6.0)*sigma+float(3.5)), as described hereinabove. The method MarrHildrethSmoothing carries out the convolution for performing the LoG filter on the luminance values of an input array, named “image.”
The method MarrHildrethEdgesDetection computes the zero crossings of the LoG for the luminance values. The long string of logical tests within the if-statements does the actual check for a zero crossing. The first if-statement examines the values of the LoG at the four neighboring pixel locations (r, c), (r+1, c), (r, c+1) and (r+1, c+1), and the second if-statement examines the values of the LoG at the four neighboring pixel locations (r, c), (r+1, c), (r, c−1) and (r+1, c−1). If the LoG value at (r, c) has a different sign than one of its values at the three other pixel locations, then (r, c) is marked as an edge location and added to the list of outermost edges. Whenever an edge is detected, the bounding rectangle newCrop is adjusted as necessary so as to include the detected edge location within it.
For each row value, r, the edge detection proceeds in two directions; namely, in the direction of increasing column value, c, and in the direction of decreasing column value, c. The edge detection breaks out of the loop on column value, c, as soon as the first edge location is detected, which suffices to detect the outermost edges.
class marrHildrethFilter
{
public:
marrHildrethFilter(float sigma, uint32 windowsize = 0);
virtual
~marrHildrethFilter(void);
void
SetElement(sint32 i, sint32 j, float x);
float
LoG(sint32 i, sint32 j) const;
sint32
Size(void) const;
private:
float**
logaussian;
sint32
halfsize;
};
inline void marrHildrethFilter::SetElement(sint32 i, sint32 j, float x)
{
logaussian[i+halfsize][j+halfsize]= x;
}
inline float marrHildrethFilter::LoG(sint32 i, sint32 j) const
{
return logaussian[i+halfsize][j+halfsize];
}
inline sint32 marrHildrethFilter::Size(void) const
{
return halfsize;
}
inline marrHildrethFilter::~marrHildrethFilter(void)
{
if (logaussian)
{
for (sint32 i=-halfsize; i<=halfsize; i++)
delete[] logaussian[i+halfsize];
delete[] logaussian;
}
}
typedef float *floatPtr;
marrHildrethFilter::marrHildrethFilter(float sigma, uint32 windowsize)
{
if (windowsize) halfsize = (windowsize − 1)/2;
else
{
sint32 fwidth = sint32(float(6.0)*sigma+float(3.5));
halfsize = (fwidth − 1)/2;
}
sint32 size = halfsize*2+1;
logaussian = new floatPtr[size];
sint32 x,y;
for (x = 0; x<size; x++)
logaussian[x]= new float[size];
float sigma2 = sigma * sigma;
float dc = float(1.0/(2.0*PI*sigma2*sigma2));
dc *= float(0.5 * sigma2);
float sigNorm = float(-1.0/(2.0*sigma2));
float norm,coef = float(1.0/sigma2);
for (y=-halfsize; y<=halfsize; y++)
for (x=-halfsize; x<=halfsize; x++)
{
norm = float(x*x + y*y);
SetElement(x,y,float(dc*(2.0-norm*coef)*exp(norm*sigNorm)));
}
}
float PSmartScanHighRes::MarrHildrethSmoothing(
const NPixelBufferMgr& image, PProcessBuffers* buffer, sint32 i,
sint32 j,
marrHildrethFilter* mh)
{
static const float EPS = PThresholdValues::ZeroCrossingPrecision( );
register float* ptr = &(buffer->gradient[i*width+j]);
if (*ptr != NOT_COMPUTED) return (*ptr);
(*ptr) = 0.0;
register sint32 size = mh->Size( );
sint32 x,y;
for (y=-size; y<=size; y++)
for (x=-size; x<=size; x++)
*ptr += mh->LoG(y,x)*ComputeLuminance(image,buffer,i+x,j+y);
if (ABS(*ptr)<EPS)
(*ptr) = 0.0;
return (*ptr);
}
bool PSmartScanHighRes::MarrHildrethEdgesDetection(
const NPixelBufferMgr& image, PProcessBuffers*
buffer,
PIntRectangle* newCrop, PEdgeIterator* list)
{
register sint32 r;
register sint32 c,c0;
register sint32 rmax = sint32(height);
register sint32 cmax = sint32(width);
marrHildrethFilter mh(PThresholdValues::LogSigma( ));
if (!buffer->IsIntensityMapValid( ) || !buffer->IsGradientMapValid( ))
return false;
*newCrop = PIntRectangle( );
bool is1stCall = true;
for (r=0; r<rmax; r++)
{
for (c=0; c<cmax; c++)
{
if ((MarrHildrethSmoothing(image,buffer,r,c,&mh) != float(0.0)) &&
((MarrHildrethSmoothing(image,buffer,r,c,&mh) >0) &&
( ((r+1<rmax) &&
(MarrHildrethSmoothing(image,buffer,r+1,c,&mh)<0)) ||
((c+1<cmax) &&
(MarrHildrethSmoothing(image,buffer,r,c+1,&mh)<0)) ||
((r+1<rmax) && (c+1<cmax) &&
(MarrHildrethSmoothing(image,buffer,r+1,c+1,&mh)<0)))) ||
((MarrHildrethSmoothing(image,buffer,r,c,&mh) <0) &&
(((r+1<rmax) &&
(MarrHildrethSmoothing(image,buffer,r+1,c,&mh)>0)) ||
((c+1<cmax) &&
(MarrHildrethSmoothing(image,buffer,r,c+1,&mh)>0)) ||
((r+1<rmax) && (c+1<cmax) &&
(MarrHildrethSmoothing(image,buffer,r+1,c+1,&mh)>0)))))
{
list->AddTail(new PChainedEdge(r,c,0));
if (is1stCall)
{
*newCrop = PIntRectangle(c,r,c+1,r+1);
is1stCall = false;
}
else
{
if (sint32(r)<newCrop->Top( )) newCrop->SetTop(r);
else if (sint32(r+1)>newCrop->Bottom( ))
newCrop->SetBottom(r+1);
if (sint32(c)<newCrop->Left( )) newCrop->SetLeft(c);
}
break;
}
}
if(c != cmax)
{
c0 = c;
for (c=cmax-1; c>c0; c--)
{
if ((MarrHildrethSmoothing(image,buffer,r,c,&mh) != float(0.0))
&&
((MarrHildrethSmoothing(image,buffer,r,c,&mh) >0) &&
( ((r+1<rmax) &&
(MarrHildrethSmoothing(image,buffer,r+1,c,&mh)<0)) ||
((c>0) &&
(MarrHildrethSmoothing(image,buffer,r,c-1,&mh)<0)) ||
((r+1<rmax) && (c>0) &&
(MarrHildrethSmoothing(image,buffer,r+1,c-1,&mh)<0)))) ||
((MarrHildrethSmoothing(image,buffer,r,c,&mh) <0) &&
( ((r+1<rmax) &&
(MarrHildrethSmoothing(image,buffer,r+1,c,&mh)>0)) ||
((c>0) &&
(MarrHildrethSmoothing(image,buffer,r,c-1,&mh)>0)) ||
((r+1<rmax) && (c>0) &&
(MarrHildrethSmoothing(image,buffer,r+1,c-1,&mh)>0)))))
{
list->AddTail(new PChainedEdge(r,c,0));
if (sint32(c+1)>newCrop->Right( )) newCrop->SetRight(c+1);
break;
}
}
}
}
return true;
}
As in the variable box size embodiment described hereinabove, the second step in the fixed box size embodiment generates boxes having various angles of rotation X relative to the borders of the scanner bed, and the third step analyzes the boxes to determine the box that is most aligned with the picture. However, in the fixed box size embodiment it is not necessary that each of the boxes generated is itself “smallest,” as was the case in the variable box size embodiment. Instead, in the fixed box size embodiment, the various boxes are generated by rotating a reference box about its center. The reference box is chosen large enough so that irrespective of how it is rotated about its center, it encloses the entire contour of the picture within the high resolution scanned image.
In a preferred embodiment of the present invention, the center of the reference box is chosen to be the centroid of the shape enclosed by the contour of the picture; i.e., the centroid of the shape of the picture. The centroid of a shape is the average of all pixel locations within the shape, and corresponds to the center of gravity of a uniform plate having such a shape. Moreover, as described hereinbelow with reference to
Reference is now made to FIG. 5.
A reference box 540 is then chosen to have the same center point, O, and to enclose circle 510. This ensures that when reference box 540 is rotated about its center O by any angle of rotation, the rotated box 550 will still enclose the entire picture within it. It is not essential in this fixed box size embodiment of the present invention that boxes 540 or 550 be constructed to be as small as possible.
The following computer listing of software in the C++ programming language calculates the reference box 540 according to a preferred embodiment of the present invention. The variables i0 and j0 are used to accumulate sums of the row and column coordinates, respectively, of all pixels on the outermost edges of the picture. The sums are averaged by dividing by the number of such pixels, and the average values are stored in rowCenter and columnCenter. These average values determine the center of the desired reference box 540. The half-width/half-length of the reference box is set to half of the length of the diagonal of a given bounding box cropArea, plus an additional pixel.
PFloatRectangle PSmartScanHighRes::BarycentricBoundingBox(
const PIntRectangle& cropArea, PEdgeIterator*
edges,
float* rowCenter, float* columnCenter)
{
sint32 i0=0, j0=0, nb = 0;
PEdgePtr cur = edges->First( );
edges->Reset( );
while (cur = edges->Current( ))
{
i0 += cur->Row( );
j0 += cur->Column( );
edges->Next( );
nb++;
}
*rowCenter = float(i0)/float(nb);
*columnCenter = float(j0)/float(nb);
sint32 size = sint32(sqrt(cropArea.Width( )*cropArea.Width( ) +
cropArea.Height( )*cropArea.Height( ))/2.0 + 1.0);
return PFloatRectangle(*columnCenter-size, *rowCenter-size,
*columnCenter+size, *rowCenter+size);
}
In a preferred fixed box size embodiment, the present invention calculates, for each orientation angle X, a sum D(X) that gives a measure of the average distance between the box oriented at angle X and the picture. Specifically, D(X) is given by the expression
D(X)=Σp ∂(p, B(X)),  (1)
where p denotes a pixel location in the edge envelope of the picture, and B(X) denotes the box that is oriented at angle X. Thus D(X) equals the sum of the distances, ∂(p, B(X)), from each pixel location p in the edge envelope of the picture to the box B(X). The distance from a pixel location p to a box B(X) is defined to be the minimal distance from p to any of the four sides of B(X). Specifically, the four distances from pixel location p to each side of box B(X) are considered, and the value of ∂(p, B(X)) is set to the smallest of these four distances. The distance from a pixel location p to a side of a box is measured along the line perpendicular to the side.
Were D(X) to be normalized by dividing by the number of pixel locations in the outermost edges, it would represent an average distance between the outermost edges of the picture and box B(X). As such, D(X) serves as a metric for how well box B(X) is aligned with the picture. The angle of rotation X for which D(X) is smallest is the angle that brings the reference box closest to the pixels of the outermost edges of the picture. In other words, the box B(X) that is “most aligned” with the picture is the one for which D(X) is smallest, among all values of X.
For the special case of the reference box 540 in
Reference is now made to FIG. 6.
A point A denotes the point of intersection of the lines L-0 and L-X. The location of point A can be readily determined from the distance OR and the angle X. Indeed, referring back to
Referring again to
The distance AB can also be readily determined since the locations of point A and point B are both known.
The line L-X* is taken from point P perpendicular to line L-X, and it intersects line L-X at a point C. The length PC represents the desired distance between pixel location P and side L-X. A point D in
PC=PB*cos(X)−AB*sin(X). (2)
For each pixel location P on an outermost edge of the picture, the calculation of PC is performed for each of the four sides of box B(X), and the smallest of these four values is used for the term ∂(p,B(X)) in Equation 1 above. These terms are cumulatively summed over all such pixel locations to calculate the value of D(X) in Equation 1 above. The desired angle of rotation of the picture is that angle X for which D(X) is minimized.
The following computer listing of software in the C++ programming language implements the calculation of D(X) as used in a preferred embodiment of the present invention. The method InitializeDistanceEdges calculates the sum D(0) of all the distances from each pixel location in an outermost edge of the picture to the reference box, named “box,” for which X=0. It uses a class PDistanceEdge inherited from PEdgeElt that includes four members leftDist, topDist, rightDist and bottomDist, representing the distances from the edge location to the four sides of a box. The method Distance computes the sum D(X) of the distances from each pixel location in an outermost edge of the picture to the box obtained by rotating the reference box by an angle alpha (i.e., X) about its center, using Equation 2 above.
float PSmartScanHighRes::InitializeDistanceEdges(PEdgeIterator* edges,
PEdgeIterator* new_edges, const PFloatRectangle&
box)
{
float distance = 0.0;
PEdgePtr chained_edge = edges->First( ), next = 0;
edges->Reset( );
PEdgePtr dist_edge;
while (chained_edge)
{
next = chained_edge->next;
dist_edge = new PDistanceEdge(chained_edge->Row( ),
chained_edge->Column( ),
float(box.Left( )-chained_edge->Column( )),
float(box.Top( )-chained_edge->Row( )),
float(box.Right( )-chained_edge->Column( )),
float(box.Bottom( )-chained_edge->Row( )));
new_edges->AddTail(dist_edge);
distance += DistanceMinimal(((PDistanceEdge*)dist_edge)->Left( ),
((PDistanceEdge*)dist_edge)->Top( ),
((PDistanceEdge*)dist_edge)->Right( ),
((PDistanceEdge*)dist_edge)->Bottom( ));
chained_edge = next;
}
return distance;
}
float PSmartScanHighRes::Distance(PEdgeIterator* edges, const
PFloatRectangle& boundingBox, float Xc, float Yc, float angle)
{
float distance = 0.0, cosinus = float(cos(angle)), sinus = float(sin(angle));
float left, top, right, bottom;
float Xm = float(boundingBox.Left( ));
float Ym = Yc;
float XXm = Xm*cosinus - Ym*sinus;
float YYm = Xm*sinus + Ym*cosinus;
float left_Xr = Xm;
float left_Yr = (-(Xc-XXm)*Xm+(Xc-XXm)*XXm+
(Yc-YYm)*YYm)/(Yc-YYm);
Xm = float(boundingBox.Right( ));
Ym = Yc;
XXm = Xm*cosinus - Ym*sinus;
YYm = Xm*sinus + Ym*cosinus;
float right_Xr = Xm;
float right_Yr = (-(Xc-XXm)*Xm+(Xc-XXm)*XXm+
(Yc-YYm)*YYm)/(Yc-YYm);
Xm = Xc;
Ym = float(boundingBox.Top( ));
XXm = Xm*cosinus - Ym*sinus;
YYm = Xm*sinus + Ym*cosinus;
float top_Yr = Ym;
float top_Xr = (-(Yc-YYm)*Ym+(Xc-XXm)*XXm+
(Yc-YYm)*YYm)/(Xc-XXm);
Xm = Xc;
Ym = float(boundingBox.Bottom( ));
XXm = Xm*cosinus - Ym*sinus;
YYm = Xm*sinus + Ym*cosinus;
float bottom_Yr = Ym;
float bottom_Xr = (-(Yc-YYm)*Ym+(Xc-XXm)*XXm+
(Yc-YYm)*YYm)/(Xc-XXm);
PDistanceEdge* cur = (PDistanceEdge*)edges->First( );
edges->Reset( );
while (cur)
{
left = cur->Left( )*cosinus - (cur->Row( ) - left_Yr)*sinus;
right = cur->Right( )*cosinus - (cur->Row( ) - right_Yr)*sinus;
top = cur->Top( )*cosinus - (cur->Column( ) - top_Xr)*sinus;
bottom = cur->Bottom( )*cosinus - (cur->Column( ) - bottom_Xr)*sinus;
distance += DistanceMinimal(left, top, right, bottom);
cur = (PDistanceEdge*) (edges->Next( ));
}
return distance;
}
float PSmartScanHighRes::DistanceMinimal(float aa, float ab, float ac,
float ad)
{
float a = ABS(aa);
float b = ABS(ab);
float c = ABS(ac);
float d = ABS(ad);
if (a<b)
{
if (a<c)
return (a<d)?a:d;
else
return (c<d)?c:d;
}
else
{
if (b<c)
return (b<d)?b:d;
else
return (c<d)?c:d;
}
}
class SMARTSCANDEC PDistanceEdge:public PEdgeElt
{
friend PEdgeIterator;
public:
PDistanceEdge(void);
PDistanceEdge(const PDistanceEdge&);
PDistanceEdge(uint32 arow, uint32 acolumn, float leftDist,
float topDist, float rightDist,
float bottomDist, PEdgePtr next = 0);
PDistanceEdge& operator=(const PDistanceEdge&);
virtual
~PDistanceEdge(void);
float
Left(void) const;
float
Top(void) const;
float
Right(void) const;
float
Bottom(void) const;
void
SetLeft(float);
void
SetTop(float);
void
SetRight(float);
void
SetBottom(float);
private:
float
leftDist, topDist, rightDist, bottomDist;
};
inline PDistanceEdge::PDistanceEdge(void):PEdgeElt( )
{ leftDist = topDist = rightDist = bottomDist = 0.0; }
inline PDistanceEdge::PDistanceEdge(
const PDistanceEdge& elt):
PEdgeElt(elt)
{ leftDist = elt.leftDist; topDist = elt.topDist; rightDist = elt.rightDist;
bottomDist = elt.bottomDist; }
inline PDistanceEdge::PDistanceEdge(uint32 arow, uint32 acolumn,
float aleftDist, float atopDist,
float arightDist, float abottomDist,
PEdgePtr anext):PEdgeElt(arow, acolumn,
anext)
{ leftDist = aleftDist; topDist = atopDist; rightDist = arightDist;
bottomDist = abottomDist; }
inline PDistanceEdge& PDistanceEdge::operator=(
const PDistanceEdge&
elt)
{ this->PEdgeElt::operator=(elt); leftDist = elt.leftDist; topDist = elt.topDist;
rightDist = elt.rightDist; bottomDist = elt.bottomDist; return (*this); }
inline PDistanceEdge::~PDistanceEdge(void)
{ }
inline float PDistanceEdge::Left(void) const
{ return leftDist; }
inline float PDistanceEdge::Top(void) const
{ return topDist; }
inline float PDistanceEdge::Right(void) const
{ return rightDist; }
inline float PDistanceEdge::Bottom(void) const
{ return bottomDist; }
inline void PDistanceEdge::SetLeft(float val)
{ leftDist = val; }
inline void PDistanceEdge::SetTop(float val)
{ topDist = val; }
inline void PDistanceEdge::SetRight(float val)
{ rightDist = val; }
inline void PDistanceEdge::SetBottom(float val)
{ bottomDist = val; }
Reference is now made to FIG. 7.
The angle of orientation, X, is varied within a range from X_START to X_END with a step size of DX, in order to search for the desired angle of rotation of the picture. It is assumed that X_START is less than X_END. At step 730, X is initialized to the value X_START. At step 740 a test is made whether or not X exceeds the value X_END. If so, execution terminates at step 750, and the desired angle of rotation is given by the variable X_ANGLE. Otherwise, execution continues by advancing to step 760, which initializes to zero the variable D for the running sum in Equation 1 above.
At step 770 all of the pixel locations within the contour of the picture that were identified at step 710 are marked as being unprocessed. In addition, a specific pixel location, P, is selected as an initial location for processing. At step 780 P is marked as being processed. At step 790 the variable L_SMALLEST is initialized to a very large positive value. This ensures that the first value of the variable L calculated below is less than L_SMALLEST, and thus accepted.
At step 800 a box, B, is considered as being oriented in the direction of angle X relative to the borders of the scanner bed. This is the box 550 from FIG. 5.
The four sides of box B are marked as unprocessed, and at step 810 a specific side, S, is selected as an initial side for processing. At step 820, S is marked as being processed. At step 830 the distance, L, between pixel location P and side S is calculated, preferably based on Equation 2 above. At step 840 a determination is made as to whether or not L is smaller than L_SMALLEST. This is done in order to choose the smallest of the four distances from P to each of the four sides of box B, so as to compute the term ∂(p, B(X)) from Equation 1 above. If L is smaller than L_SMALLEST, then at step 850 L_SMALLEST is set to L, and the flow of execution advances to step 860. If L is not smaller than L_SMALLEST, then step 850 is by-passed, and the flow of execution advances directly from step 840 to step 860. As mentioned hereinabove, the first time the determination of step 840 is made, L is less than L_SMALLEST, since L_SMALLEST was initialized to a very large positive value at step 790.
At step 860 a determination is made whether there remain any unprocessed sides S of box B. If so, the flow of execution returns to step 820. If not, then the calculation of L_SMALLEST is complete, and L_SMALLEST is equal to the term ∂(p,B(X)) from Equation 1 above. At step 870 this term is added cumulatively to the running sum variable D. At step 880 a determination is made whether or not there remain any unprocessed pixel locations within the contour of the picture. If so, then control returns to step 780. If not, then all of the pixel locations within the outermost edges of the picture have been accounted for in the sum D(X) from Equation 1, and hence the variable D equals this sum.
The flow of execution then advances to step 890. At step 890 a determination is made as to whether or not D is less than D_SMALLEST. If not, the flow of execution advances to step 900 where the orientation angle is incremented by an amount DX, and from there the flow of execution returns to step 740. Otherwise, step 910 is executed, which sets the angle of rotation, X_ANGLE, to X, and sets D_SMALLEST to D. D_SMALLEST thus represents the smallest value of D currently produced by the search, and X_ANGLE represents the angle of orientation that produced this value of D. As mentioned hereinabove, the first time the determination of step 890 is made, D is less than D_SMALLEST, since D_SMALLEST was initialized to a very large positive value at step 720.
Various well-known search techniques can be used to determine the angle X_ANGLE that produces the smallest value of D(X). Although the search technique presented in
In a preferred embodiment of the present invention a search for the angle of rotation is made by varying X in units of one degree from 0° to 90° or from 0° to −90°. The decision as to whether to search in the direction of positive or negative angles is made by initially selecting a small angle of rotation in one angular direction. If this causes the value of D(X) to increase, then the search is made in the opposite direction. Otherwise, if this causes the value of D(X) to decrease, then the search is made in the same direction. Moreover, since most people are right-handed and since right-handed people tend to rotate the picture clockwise when placing it on the scanner bed, the positive angular direction from 0° to 90° is the more probable one to give rise to the picture's angle of rotation. As such, in a preferred embodiment of the present invention, the initial angle of rotation is selected in the positive angular direction. In any event the largest number of directions to search through does not exceed 180, since 180 degrees spans all of the angles between −90° and +90° in units of one degree.
Regarding the method illustrated in
The following computer listing of software in the C++ programming language implements the flowchart of FIG. 7.
bool PSmartScanHighRes::RotationEstimate(
const PIntRectangle& cropArea, PEdgeIterator* edges,
float* rowCenter, float* columnCenter, float*
rotateAngle)
{
if ((cropArea.Area( ) == 0) || (edges->Size( ) == 0)) return false;
const float PId2=float(PI/2.0);
*rowCenter = *columnCenter = *rotateAngle = 0.0;
PFloatRectangle boundingBox(BarycentricBoundingBox(cropArea, edges,
rowCenter,
columnCenter));
float rotationCenterRow = float(boundingBox.Top( ) +
boundingBox.Height( )/2.0);
float rotationCenterColumn = float(boundingBox.Left( ) +
boundingBox.Width( )/2.0);
PEdgeIterator new_edge;
float distance0 = InitializeDistanceEdges(edges, &new_edge,
boundingBox);
float newDistance = distance0, distance = distance0;
float angle = 0.0;
sint32 iteration = 0;
while (((newDistance < distance) || DIST_EQUAL(newDistance, distance))
&& (angle <= PId2))
{
angle += float(RADIANT_STEP); iteration++;
distance = newDistance;
newDistance = Distance(&new_edge, boundingBox, rotationCenterRow,
rotationCenterColumn, angle);
}
iteration--; angle -= float(RADIANT_STEP);
if ((iteration == 0) || (angle > PId2))
{
angle = 0.0; distance = newDistance = distance0;
while (((newDistance < distance) ||
DIST_EQUAL(newDistance, distance)) && (angle >= -PId2))
{
angle -= RADIANT_STEP;
distance = newDistance;
newDistance = Distance(&new_edge, boundingBox,
rotationCenterRow, rotationCenterColumn,
angle);
}
iteration--; angle += float(RADIANT_STEP);
}
if (angle<-PId2) return false;
*rotateAngle = angle;
return true;
}
Once the angle of rotation of the picture is identified, the photo kiosk can display a corrected image of the picture by rotating the scanned image in the direction opposite to its rotation angle, so that it appears oriented correctly in the kiosk display, and aligned with the kiosk display axes. This correction can be accomplished by either modifying the scanned image data, or by preserving the scanned image data and simply including the angle of rotation as part of the image data as described hereinbelow.
A photo kiosk may represent a scanned image internally in the Flashpix image format. FLASHPIX is a trademark of the Digital Imaging Group. A reference for Flashpix is the document “Flashpix Format Specification,” ©1996, 1997, Eastman Kodak Company, the contents of which are hereby incorporated by reference.
The Flashpix image format allows for the inclusion of meta-data. Meta-data is auxiliary data to the image, such as data describing the creator, the contents, the copyright, the date and time created, the date and time last modified, the camera information and the scanner information. Meta-data can also include parameters for transformations to be applied to the image, such as rotations and scaling, general affine transformations, contrast adjustments and color space transformations. When displaying a Flashpix image, a Flashpix viewer must check for the presence of such transformations. If they are present within the Flashpix file, the viewer must apply them to the portion of the image being viewed.
As mentioned above, a photo kiosk may provide the consumer with hard copy or soft copy photo products. If the photo kiosk uses the Flashpix image format internally, then the kiosk performs the transformations embedded within the image file prior to displaying the image on the kiosk display and prior to printing out a hard copy photo product. For soft copy products, the Flashpix image file would be delivered to the consumer with the transformation data embedded within the file, in which case the consumer would need to have a Flashpix viewer in his home or office computer in order to properly display or print his photo product.
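As a simple illustration of the second approach, the following sketch keeps the scanned pixel data untouched and carries the detected angle as auxiliary data that a viewer applies at display time. The types and function names here are hypothetical stand-ins, not part of the Flashpix specification or of the kiosk software.
// Sketch only: hypothetical types showing a rotation angle carried as
// meta-data alongside unmodified scanned pixel data.
struct ScannedImage { int width; int height; /* pixel data omitted */ };
struct ImageFile
{
ScannedImage pixels; // original scanned data, never modified
float rotateAngle; // detected angle of rotation, in radians
};
// hypothetical rendering entry points, assumed provided elsewhere
void Render(const ScannedImage& image);
void RenderRotated(const ScannedImage& image, float angle);
// A viewer checks for an embedded rotation and applies the opposite
// rotation at display time, aligning the picture with the display axes.
void DisplayCorrected(const ImageFile& file)
{
if (file.rotateAngle != float(0.0))
RenderRotated(file.pixels, -file.rotateAngle);
else
Render(file.pixels);
}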
Multiple Scans
The present invention can be used to automatically generate multiple images when multiple pictures are placed on a scanner bed together. Regions of interest and angles of rotation can be determined for each of the pictures independently.
In order to apply the various techniques described hereinabove, the present invention operates by first scanning an area of the scanner bed containing all of the pictures therewithin, to produce a single scanned digital image, and then individually isolating the pictures within the scanned image. In a preferred embodiment, this is accomplished by identifying areas of pixel locations within the scanned image that bound each of the individual pictures, and then determining the contours of each individual picture within the respective bounding areas.
To find bounding areas of pixel locations within the scanned image for each of the pictures, the present invention first applies edge detection to the scanned image, in order to identify the edges of all of the pictures together within the scanned image. The present invention then applies a “blob growing” algorithm as described in detail hereinbelow. A “blob” is a connected group of pixel locations. A blob is initialized by choosing the group to be a small set of pixel locations, known to be contained entirely within a single one of the pictures. In a preferred embodiment of the present invention blobs are initialized as small circular sets of pixel locations, centered at locations within the scanned image where there are large densities of edges. This ensures that each blob is initially contained within a single one of the pictures.
A blob “grows” by expanding outwards. For example, initially a blob could be a circular set of pixel locations, and the growing could be implemented by inclusion of additional pixel locations as the set expands radially outward. In a preferred embodiment of the present invention, blobs grow by following edges that impinge upon their outer boundary. That is, given a current blob shape, if an edge point (i.e., a pixel location belonging to an edge) is found on the boundary of the blob, then the blob is expanded outwards in the direction of the edge. The growth process continues until there are no edge points on the boundary of the blob.
Blobs do not typically grow symmetrically in all directions, and blob growth can appear random. Even if a blob is initialized to be a circular set of pixel locations, its shape changes as it grows, and is typically neither circular, oval-shaped, nor even convex at any stage other than the initial stage.
Several blobs located at different parts of the scanned image are initialized and grown simultaneously, in order to take all of the pictures into consideration together in determining each of their individual bounding areas. Two blobs that grow in such a way that they intersect are coalesced into a single blob.
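The following is a minimal sketch of such edge-guided growth, assuming a boolean edge map produced by the edge detection step; the 4-neighbor boundary test and the names are illustrative choices, not the kiosk implementation.
// Sketch: grow one blob, held as a boolean membership map, until no pixel
// location on its boundary is situated at an edge. edge[r*width+c] is true
// where the edge detection step found an edge location.
void GrowBlob(const bool* edge, bool* blob, int width, int height)
{
bool grew = true;
while (grew)
{
grew = false;
for (int r = 0; r < height; r++)
for (int c = 0; c < width; c++)
{
if (blob[r*width+c] || !edge[r*width+c]) continue;
// a non-member pixel 4-adjacent to the blob lies on the blob boundary
bool onBoundary =
(r > 0 && blob[(r-1)*width+c]) ||
(r+1 < height && blob[(r+1)*width+c]) ||
(c > 0 && blob[r*width+c-1]) ||
(c+1 < width && blob[r*width+c+1]);
// expand outwards where an edge impinges on the boundary
if (onBoundary) { blob[r*width+c] = true; grew = true; }
}
}
}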
Reference is now made to FIG. 9.
The boundary of blob 940 does not intersect any edges, and as such blob 940 does not grow. The boundary of blob 950 intersects four of the edges 920. The vectors 970 indicate the outward directions of the intersecting edges. Blob 950 grows by expanding along the vectors 970.
Provided that the pictures in the scanner bed do not overlap, it is expected that the blobs will dynamically grow and coalesce until each blob contains a single entire picture within it. At that point, the blobs constitute the desired bounding areas of pixel locations.
If a single picture contains multiple disjoint objects within it, it may happen that multiple blobs are generated within such a picture, each blob bounding one of the disjoint objects. This is undesirable, since the purpose of the blobs is to bound an entire picture, and not just an object within a picture.
To overcome this undesirable result, the present invention operates by calculating a “separation” between two blobs, and coalescing two blobs together if the separation between them is smaller than a prescribed threshold. Thus blobs within the same picture will be combined if they are close enough together. Examples for measures of separation between blobs include the shortest distance between the blobs, or the area in a section between the blobs.
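As an illustration of the shortest-distance measure, two blobs could be compared as in the following sketch, in which a blob is represented simply as a list of its pixel locations; the representation and the threshold value are illustrative assumptions.
#include <cstddef>
#include <cmath>
#include <vector>
// Sketch: shortest Euclidean distance between any pixel of one blob and
// any pixel of the other, used as the separation measure.
struct BlobPixel { int row; int col; };
float Separation(const std::vector<BlobPixel>& a, const std::vector<BlobPixel>& b)
{
float best = 1.0e30f; // larger than any plausible separation
for (std::size_t i = 0; i < a.size(); i++)
for (std::size_t j = 0; j < b.size(); j++)
{
float dr = float(a[i].row - b[j].row);
float dc = float(a[i].col - b[j].col);
float d = std::sqrt(dr*dr + dc*dc);
if (d < best) best = d;
}
return best;
}
// coalesce the two blobs when their separation is below the threshold
bool ShouldCoalesce(const std::vector<BlobPixel>& a,
const std::vector<BlobPixel>& b, float threshold)
{ return Separation(a, b) < threshold; }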
Reference is now made to FIG. 10.
A supporting line 1000 for blobs 980 and 990 can be computer generated by initializing a line connecting the centroids 1010 and 1020 of the two blobs 980 and 990, respectively. The left endpoint of the line is moved upward, with the right endpoint being held fixed, until the line no longer intersects blob 980. The right endpoint is subsequently moved upward, with the left endpoint being held fixed, until the line no longer intersects blob 990, at which point the line connecting the left and right endpoints is a supporting line. Similarly, a second supporting line for blobs 980 and 990 can be computer generated by repeating the above algorithm with the endpoints moving downward rather than upward.
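The endpoint-raising procedure could be sketched as follows, reusing the BlobPixel list representation from the previous sketch; the segment-to-blob intersection test and the step size are illustrative assumptions.
// Sketch of the supporting-line construction described above. A segment is
// taken to intersect a blob when it passes within half a pixel of a member.
struct LinePoint { float x; float y; };
bool IntersectsBlob(const std::vector<BlobPixel>& blob, LinePoint p, LinePoint q)
{
float vx = q.x - p.x, vy = q.y - p.y;
float len2 = vx*vx + vy*vy;
for (std::size_t i = 0; i < blob.size(); i++)
{
float wx = float(blob[i].col) - p.x, wy = float(blob[i].row) - p.y;
float t = (len2 > 0.0f) ? (wx*vx + wy*vy)/len2 : 0.0f;
if (t < 0.0f) t = 0.0f; else if (t > 1.0f) t = 1.0f;
float dx = wx - t*vx, dy = wy - t*vy;
if (dx*dx + dy*dy < 0.25f) return true; // within half a pixel of the segment
}
return false;
}
void SupportingLine(const std::vector<BlobPixel>& leftBlob,
const std::vector<BlobPixel>& rightBlob,
LinePoint* leftEnd, LinePoint* rightEnd, float step)
{
// raise the left endpoint (image rows grow downward, so upward is -y),
// holding the right endpoint fixed, until the line clears the left blob
while (IntersectsBlob(leftBlob, *leftEnd, *rightEnd))
leftEnd->y -= step;
// then raise the right endpoint, holding the left endpoint fixed
while (IntersectsBlob(rightBlob, *leftEnd, *rightEnd))
rightEnd->y -= step;
// (leftEnd, rightEnd) now defines one supporting line; repeating the
// procedure with downward motion (+step) yields the second supporting line
}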
When the generation of the blobs is complete, and the blobs no longer grow nor coalesce, the present invention uses the bounding areas of pixel locations corresponding to each blob to process each picture individually, as described above with reference to single scanned pictures. Specifically, for the variable box size embodiment, a search for the angle of rotation is conducted according to the flowchart in
In the foregoing specification, the invention has been described with reference to specific exemplary embodiments thereof. It will, however, be evident that various modifications and changes may be made to the specific exemplary embodiments without departing from the broader spirit and scope of the invention as set forth in the appended claims. Accordingly, the specification and drawings are to be regarded in an illustrative rather than a restrictive sense.
Vallmajo, Patrice, Bossut, Phillippe Joseph Ghislain
Patent | Priority | Assignee | Title |
8600167, | May 21 2010 | Hand Held Products, Inc.; HAND HELD PRODUCTS, INC | System for capturing a document in an image signal |
9047531, | May 21 2010 | Hand Held Products, Inc.; HAND HELD PRODUCTS, INC | Interactive user interface for capturing a document in an image signal |
9256974, | May 04 2010 | 3-D motion-parallax portable display software application | |
9319548, | May 21 2010 | Hand Held Products, Inc. | Interactive user interface for capturing a document in an image signal |
9451132, | May 21 2010 | Hand Held Products, Inc. | System for capturing a document in an image signal |
9521284, | May 21 2010 | Hand Held Products, Inc. | Interactive user interface for capturing a document in an image signal |
9681041, | Nov 02 2012 | DNP Imagingcomm America Corporation | Apparatus, system and method for capturing and compositing an image using a light-emitting backdrop |
Patent | Priority | Assignee | Title |
5555042, | Oct 06 1994 | Eastman Kodak Company; EASTMAN KODAK COMPANY ROCHESTER, NY 14650-2201 | Apparatus for automatically feeding slides into a film scanner |
5623581, | Jan 22 1996 | AMERICAN PHOTO BOOTHS, INC | Direct view interactive photo kiosk and image forming process for same |
5913019, | Jan 22 1996 | FOTO FANTASY, INC | Direct view interactive photo kiosk and composite image forming process for same |
6049636, | Jun 27 1997 | Microsoft Technology Licensing, LLC | Determining a rectangular box encompassing a digital picture within a digital image |
6111667, | Dec 12 1995 | Minolta Co., Ltd. | Image processing apparatus and image forming apparatus connected to the image processing apparatus |
6369908, | Mar 31 1999 | Photo kiosk for electronically creating, storing and distributing images, audio, and textual messages | |
6597808, | Dec 06 1999 | Panasonic Corporation of North America | User drawn circled region extraction from scanned documents |
6750988, | Sep 11 1998 | CALLSTAT SOLUTIONS LLC | Method and system for scanning images in a photo kiosk |
6791723, | Sep 11 1998 | CALLSTAT SOLUTIONS LLC | Method and system for scanning images in a photo kiosk |
20100104194, |
Executed on | Assignor | Assignee | Conveyance | Frame | Reel | Doc |
May 04 1999 | VALLMAJO, PATRICE | LIVE PICTURE, INC | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 018314 | /0663 | |
May 04 1999 | BOSSUT, PHILIPPE JOSEPH GHISLAIN | LIVE PICTURE, INC | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 018314 | /0663 | |
Jun 30 1999 | LIVE PICTURE, INC | MGI SOFTWARE CORP | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 026735 | /0568 | |
Jul 03 2002 | MGI SOFTWARE CORP | ROXIO, INC | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 018314 | /0937 | |
Dec 17 2004 | ROXIO, INC | Sonic Solutions | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 018314 | /0950 | |
Apr 21 2005 | Sonic Solutions | Kwok, Chu & Shindler LLC | CORRECTIVE ASSIGNMENT TO CORRECT THE ASSIGNEE: KWOK, CHU & SHINDLER LLP PREVIOUSLY RECORDED ON REEL 025976 FRAME 0636 ASSIGNOR S HEREBY CONFIRMS THE ASSIGNEE: KWOK, CHU & SHINDLER LLC | 026360 | /0379 | |
Apr 21 2005 | Sonic Solutions | KWOK, CHU & SHINDLER LLP | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 025976 | /0636 | |
Sep 13 2006 | Intellectual Ventures I LLC | (assignment on the face of the patent) | / | |||
Jul 18 2011 | Kwok, Chu & Shindler LLC | Intellectual Ventures I LLC | MERGER SEE DOCUMENT FOR DETAILS | 026637 | /0623 | |
Nov 26 2019 | Intellectual Ventures I LLC | INTELLECTUAL VENTURES ASSETS 161 LLC | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 051945 | /0001 | |
Dec 06 2019 | INTELLECTUAL VENTURES ASSETS 161 LLC | HANGER SOLUTIONS, LLC | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 052159 | /0509 | |
Feb 09 2021 | HANGER SOLUTIONS, LLC | CALLSTAT SOLUTIONS LLC | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 058203 | /0890 |