Face recognition system

RE36041

A recognition system for identifying members of an audience, the system including an imaging system which generates an image of the audience; a selector module for selecting a portion of the generated image; a detection means which analyzes the selected image portion to determine whether an image of a person is present; and a recognition module responsive to the detection means for determining whether a detected image of a person identified by the detection means resembles one of a reference set of images of individuals.


Patent RE36041
Priority Nov 16 1994
Filed Nov 16 1994
Issued Jan 12 1999
Expiry Jan 12 2016
Inventors Pentland, …
Assg.orig Massachuse…
Assg.curr Massachuse…
Entity Large
Referenced by 133
References 9
Maint.: all paid REINSTATED


12. A method for identifying members of an audience, the method comprising:

generating an image of the audience;

selecting a portion of said generated image;

representing a reference set of images of individuals as a set of eigenvectors in a multi-dimensional image space;

representing said selected image portion as an input vector in said multi-dimensional image space;

computing the distance between a point identified by said input vector and a multi-dimensional subspace defined by said set of eigenvectors;

using the computed distance to determine whether the selected image portion contains an image that can be classified as an image of a person; and

if it is determined that the selected image contains an image that can be classified as an image of a person, determining whether said image of a person resembles one of a reference set of images of individuals.

1. A recognition system for identifying members of an audience, the system comprising:

an imaging system which generates an image of the audience;

a selector module for selecting a portion of said generated image;

means for representing a reference set of images of individuals as a set of eigenvectors in a multi-dimensional image space;

a detection means which determines whether the selected image portion contains an image that can be classified as an image of a person, said detection means including means for representing said selected image portion as an input vector in said multi-dimensional image space and means for computing the distance between a point identified by said input vector and a multi-dimensional subspace defined by said set of eigenvectors, wherein said detection means uses the computed distance to determine whether the selected image portion contains an image that can be classified as an image of a person; and

2. The recognition system of claim 1 wherein said detection means further comprises a thresholding means for determining whether an image of a person is present by comparing said computed distance to a preselected threshold.

3. The recognition system of claim 1 wherein said selector module comprises a motion detector for identifying the selected portion of said image by detecting motion.

4. The recognition system of claim 3 wherein said selector module further comprises a locator module for locating the portion of said image corresponding to a face of the person based on motion detected by said motion detector.

5. The recognition system of claim 1 wherein said image of a person is an image of a person's face and wherein said reference set comprises images of faces of said individuals.

6. The recognition system of claim 1 wherein said recognition module comprises means for representing each member of said reference set as a corresponding point in said subspace.

7. The recognition system of claim 6 wherein the location of each point in subspace associated with a corresponding member of said reference set is determined by projecting a vector associated with that member onto said subspace.

8. The recognition system of claim 7 wherein said recognition module further comprises means for projecting said input vector onto said subspace.

9. The recognition system of claim 8 wherein said recognition module further comprises means for selecting a particular member of said reference set and means for computing a distance within said subspace between a point identified by the projection of said input vector onto said subspace and the point in said subspace associated with said selected member.

10. The recognition system of claim 8 wherein said recognition module further comprises means for determining for each member of said reference set a distance in subspace between the location associated with that member in subspace and the point identified by the projection of said input vector onto said subspace.

11. The recognition system of claim 10 wherein said image of a person is an image of a person's face and wherein said reference set comprises images of faces of said individuals.

13. The method of claim 12 further comprising the step of determining which one, if any, of the members of said reference set said image of a person resembles.

14. The method of claim 12 wherein the image of the audience is a sequence of image frames and wherein the method further comprises detecting motion within the sequence of image frames and wherein the selected image portion is determined on the basis of the detected motion.

15. The method of claim 12 wherein the step of determining whether the selected image portion contains an image that can be classified as an image of a person further comprises comparing said computed distance to a preselected threshold.

16. The method of claim 15 wherein the step of determining whether said image of a person resembles a member of said reference set comprises representing each member of said reference set as a corresponding point in said subspace.

17. The method of claim 16 wherein the step of determining whether said image of a person resembles a member of said reference set further comprises determining the location of each point in subspace associated with a corresponding member of said reference set by projecting a vector associated with that member onto said subspace.

18. The method of claim 17 wherein the step of determining whether said image of a person resembles a member of said reference set further comprises projecting said input vector onto said subspace.

19. The method of claim 18 wherein the step of determining whether said image of a person resembles a member of said reference set further comprises selecting a member of said reference set and computing a distance within said subspace between a point identified by the projection of said input vector onto said subspace and the point in said subspace associated with said selected member.

20. The method of claim 18 wherein the step of determining whether said image of a person resembles a member of said reference set further comprises determining for each member of said reference set a distance in subspace between the location for that member in subspace and the point identified by the projection of said input vector onto said subspace.

21. The method of claim 20 wherein said image of a person is an image of a person's face and wherein said reference set comprises images of faces of said individuals.

22. A recognition system comprising:

an imaging system which generates an image;

a selector module for selecting a portion of said generated image;

means for representing a reference set of images of individuals as a set of eigenvectors in a multi-dimensional image space;

a recognition module responsive to said detection means for determining whether a detected image of a person identified by said detection means resembles one of the reference set of images of individuals.

23. The recognition system of claim 22 wherein said detection means further comprises a thresholding means for determining whether an image of a person is present by comparing said computed distance to a preselected threshold.

24. The recognition system of claim 22 wherein said image of a person is an image of a person's face and wherein said reference set comprises images of faces of said individuals.

25. The recognition system of claim 22 wherein said recognition module comprises means for representing each member of said reference set as a corresponding point in said subspace.

26. The recognition system of claim 25 wherein the location of each point in subspace associated with a corresponding member of said reference set is determined by projecting a vector associated with that member onto said subspace.

27. The recognition system of claim 26 wherein said recognition module further comprises means for projecting said input vector onto said subspace.

28. The recognition system of claim 27 wherein said recognition module further comprises means for selecting a particular member of said reference set and means for computing a distance within said subspace between a point identified by the projection of said input vector onto said subspace and the point in said subspace associated with said selected member.

29. The recognition system of claim 27 wherein said recognition module further comprises means for determining for each member of said reference set a distance in subspace between the location associated with that member in subspace and the point identified by the projection of said input vector onto said subspace.

30. The recognition system of claim 24 wherein said means for representing said reference set includes means for adding a member to said reference set by projecting into said subspace an input vector having a computed distance indicative of an image of a face.

31. A method comprising:

generating an image;

selecting a portion of said generated image;

representing a reference set of images of faces of individuals as a set of eigenvectors in a multi-dimensional image space;

representing said selected image portion as an input vector in said multi-dimensional image space;

computing the distance between a point identified by said input vector and a multi-dimensional subspace defined by said set of eigenvectors;

using the computed distance to determine whether the selected image portion contains an image that can be classified as an image of a person's face; and

if it is determined that the selected image contains an image that can be classified as an image of a person's face, determining whether said image of a person's face resembles one of a reference set of images of faces of individuals.

32. The method of claim 31 further comprising the step of determining which one, if any, of the members of said reference set said image of a person's face resembles.

33. The method of claim 31 wherein the step of determining whether the selected image portion contains an image that can be classified as an image of a person's face further comprises comparing said computed distance to a preselected threshold.

34. The method of claim 33 wherein the step of determining whether said image of a person's face resembles a member of said reference set comprises representing each member of said reference set as a corresponding point in said subspace.

35. The method of claim 34 wherein the step of determining whether said image of a person's face resembles a member of said reference set further comprises determining the location of each point in subspace associated with a corresponding member of said reference set by projecting a vector associated with that member onto said subspace.

36. The method of claim 35 wherein the step of determining whether said image of a person's face resembles a member of said reference set further comprises projecting said input vector onto said subspace.

37. The method of claim 36 wherein the step of determining whether said image of a person's face resembles a member of said reference set further comprises determining for each member of said reference set a distance in subspace between the location for that member in subspace and the point identified by the projection of said input vector onto said subspace.

BACKGROUND OF THE INVENTION

The invention relates to a system for identifying members of a viewing audience.

For a commercial television network, the cost of its advertising time depends critically on the popularity of its programs among the television viewing audience. Popularity, in this case, is typically measured in terms of the program's share of the total audience viewing television at the time the program airs. As a general rule of thumb, advertisers prefer to place their advertisements where they will reach the greatest number of people. Thus, there is a higher demand among commercial advertisers for advertising time slots alongside more popular programs. Such time slots can also command a higher price.

Because the economics of television advertising depends so critically on the tastes and preferences of the television audience, the television industry invests a substantial amount of time, effort and money in measuring those tastes and preferences. One preferred approach involves monitoring the actual viewing habits of a group of volunteer families which represent a cross-section of all people who watch television. Typically, the participants in such a study allow monitoring equipment to be placed in their homes. Whenever a participant watches a television program, the monitoring equipment records the time, the identity of the program and the identity of the members of the viewing audience. Many of these systems require active participation by the television viewer to obtain the monitoring information. That is, the viewer must in some way interact with the equipment to record his presence in the viewing audience. If the viewer forgets to record his presence the monitoring statistics will be incomplete. In general, the less manual intervention required by the television viewer, the more likely it is that the gathered statistics on viewing habits will be complete and error free.

Systems have been developed which automatically identify members of the viewing audience without requiring the viewer to enter any information. For example, U.S. Pat. No. 4,858,000 to Daozheng Lu, issued Aug. 15, 1989, describes such a system. In the system, a scanner using infrared detectors locates a member of the viewing audience, captures an image of the located member, extracts a pattern signature for the captured image and then compares the extracted pattern signature to a set of stored pattern image signatures to identify the audience member.

SUMMARY OF THE INVENTION

In general, in one aspect, the invention is a recognition system for identifying members of an audience. The invention includes an imaging system which generates an image of the audience; a selector module for selecting a portion of the generated image; a detection means which analyzes the selected image portion to determine whether an image of a person is present; and a recognition module for determining whether a detected image of a person resembles one of a reference set of images of individuals.

Preferred embodiments include the following features. The recognition module also determines which one, if any, of the individuals in the reference set the detected image resembles. The selection means includes a motion detector for identifying the selected portion of the image by detecting motion and it includes a locator module for locating the portion of the image corresponding to the face of the person detected. In the recognition system, the detection means and the recognition module employ first and second pattern recognition techniques, respectively, to determine whether an image of a person is present in the selected portion of the image, and both pattern recognition techniques employ a set of eigenvectors in a multi-dimensional image space to characterize the reference set. In addition, the second pattern recognition technique also represents each member of the reference set as a point in a subspace defined by the set of eigenvectors. Also, the image of a person is an image of a person's face and the reference set includes images of faces of the individuals.

Also in preferred embodiments, the recognition system includes means for representing the reference set as a set of eigenvectors in a multi-dimensional image space and the detection means includes means for representing the selected image portion as an input vector in the multi-dimensional image space and means for computing the distance between a point identified by the input vector and a subspace defined by the set of eigenvectors. The detection means also includes a thresholding means for determining whether an image of a person is present by comparing the computed distance to a preselected threshold. The recognition module includes means for representing each member of the reference set as a corresponding point in the subspace. To determine the location of each point in subspace associated with a corresponding member of the reference set, a vector associated with that member is projected onto the subspace.

The recognition module also includes means for projecting the input vector onto the subspace, means for selecting a particular member of the reference set, and means for computing a distance within the subspace between a point identified by the projection of the input vector onto the subspace and the point in the subspace associated with the selected member.

In general, in another aspect, the invention is a method for identifying members of an audience. The invention includes the steps of generating an image of the audience; selecting a portion of the generated image; analyzing the selected image portion to determine whether an image of a person is present; and if an image of a person is determined to be present, determining whether the image of a person resembles one of a reference set of images of individuals.

One advantage of the invention is that it is fast, relatively simple and works well in a constrained environment, i.e., an environment for which the associated image remains relatively constant except for the coming and going of people. In addition, the invention determines whether a selected portion of an image actually contains an image of a face. If it is determined that the selected image portion contains an image of a face, the invention then determines which one of a reference set of known faces the detected face image most resembles. If the detected face image is not present among the reference set, the invention reports the presence of an unknown person in the audience. The invention has the ability to discriminate face images from images of other objects.

Other advantages and features will become apparent from the following description of the preferred embodiment and from the claims.

DESCRIPTION OF THE PREFERRED EMBODIMENT

FIG. 1 is a block diagram of a face recognition system;

FIG. 2 is a flow diagram of an initialization procedure for the face recognition module;

FIG. 3 is a flow diagram of the operation of the face recognition module; and

FIG. 4 is a block diagram of a motion detection system for locating faces within a sequence of images.

STRUCTURE AND OPERATION

Referring to FIG. 1, in an audience monitoring system 2, a video camera 4, which is trained on an area where members of a viewing audience generally sit to watch the TV, sends a sequence of video image frames to a motion detection module 6. Video camera 4, which may, for example, be installed in the home of a family that has volunteered to participate in a study of public viewing habits, generates images of the TV viewing audience. Motion detection module 6 processes the sequence of image frames to identify regions of the recorded scene that contain motion, and thus may be evidence of the presence of a person watching TV. In general, motion detection module 6 accomplishes this by comparing successive frames of the image sequence so as to find those locations containing image data that changes over time. Since the image background (i.e., images of the furniture and other objects in the room) will usually remain unchanged from frame to frame, the areas of movement will generally be evidence of the presence of a person in the viewing audience.
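The frame-comparison idea described for motion detection module 6 can be illustrated with a minimal sketch. It assumes grayscale frames held as NumPy arrays; the function name `motion_block` and the per-pixel `diff_threshold` value are hypothetical choices for illustration and are not taken from the specification.

```python
import numpy as np

def motion_block(prev_frame, next_frame, diff_threshold=25):
    """Return (top, bottom, left, right) bounding the changed pixels, or None.

    diff_threshold is an assumed per-pixel intensity change, not a value
    given in the specification.
    """
    # Widen to a signed type so the subtraction of 8-bit values cannot wrap.
    diff = np.abs(next_frame.astype(np.int16) - prev_frame.astype(np.int16))
    moving = diff > diff_threshold
    if not moving.any():
        return None  # only static background between the two frames
    rows = np.flatnonzero(moving.any(axis=1))
    cols = np.flatnonzero(moving.any(axis=0))
    return int(rows[0]), int(rows[-1]) + 1, int(cols[0]), int(cols[-1]) + 1

# Toy usage: a static background, then a bright 3x4 patch appears.
background = np.full((12, 16), 40, dtype=np.uint8)
frame = background.copy()
frame[5:8, 2:6] = 200
box = motion_block(background, frame)
```

The returned box is the kind of region that would then be handed on for face analysis; a practical system would presumably also smooth or accumulate differences over several frames.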

When movement is identified, a head locator module 8 selects a block of the image frame containing the movement and sends it to a face recognition module 10 where it is analyzed for the presence of recognizable faces. Face recognition module 10 performs two functions. First, it determines whether the image data within the selected block resembles a face. Then, if it does resemble a face, module 10 determines whether the face is one of a reference set of faces. The reference set may include, for example, the images of faces of all members of the family in whose house the audience monitoring system has been installed.

To perform its recognition functions, face recognizer 10 employs a multi-dimensional representation in which face images are characterized by a set of eigenvectors or "eigenfaces". In general, according to this technique, each image is represented as a vector (or a point) in very high dimensional image space in which each pixel of the image is represented by a corresponding dimension or axis. The dimension of this image space thus depends upon the size of the image being represented and can become very large for any reasonably sized image. For example, if the block of image data is N pixels by N pixels, then the multi-dimensional image space has dimension N². The image vector which represents the N×N block of image data in this multi-dimensional image space is constructed by simply concatenating the rows of the image data to generate a vector of length N².
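As a small illustration of the row-concatenation step, the following sketch flattens an N by N block into a vector of length N². The 4 by 4 block and its values are made up for the example.

```python
import numpy as np

N = 4
# A toy N x N block of 8-bit intensity values.
block = np.arange(N * N, dtype=np.uint8).reshape(N, N)

# Row-major flattening concatenates the rows into one vector of length N*N.
vector = block.reshape(-1)

# The original block can be recovered by reshaping back.
restored = vector.reshape(N, N)
```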

Face images, like all other possible images, are represented by points within this multi-dimensional image space. The distribution of faces, however, tends to be grouped within a region of the image space. Thus, the distribution of faces of the reference set can be characterized by using principal component analysis. The resulting principal components of the distribution of faces, or the eigenvectors of the covariance matrix of the set of face images, define the variation among the set of face images. These eigenvectors are typically ordered, each one accounting for a different amount of variation among the face images. They can be thought of as a set of features which together characterize the variation between face images within the reference set. Each face image location within the multi-dimensional image space contributes more or less to each eigenvector, so that each eigenvector represents a sort of ghostly face which is referred to herein as an eigenface.

Each individual face from the reference set can be represented exactly in terms of a linear combination of M non-zero eigenfaces. Each face can also be approximated using only the M' "best" eigenfaces, i.e., those that have the largest eigenvalues, and which therefore account for the most variance within the set of face images. The best M' eigenfaces span an M'-dimensional subspace (referred to hereinafter as "face space") of all possible images.

This approach to face recognition involves the initialization operations shown in FIG. 2 to "train" recognition module 10. First, a reference set of face images is obtained and each of the faces of that set is represented as a corresponding vector or point in the multi-dimensional image space (step 100). Then, using principal component analysis, the distribution of points for the reference set of faces is characterized in terms of a set of eigenvectors (or eigenfaces) (step 102). If a full characterization of the distribution of points is performed, it will yield N² eigenfaces of which M are non-zero. Of these, only the M' eigenfaces corresponding to the highest eigenvalues are chosen, where M'<M<<N². This subset of eigenfaces is used to define a subspace (or face space) within the multidimensional image space. Finally, each member of the reference set is represented by a corresponding point within face space (step 104). For a given face, this is accomplished by projecting its point in the higher dimensional image space onto face space.

If additional faces are added to the reference set at a later time, these operations are repeated to update the set of eigenfaces characterizing the reference set.

After face recognition module 10 is initialized, it implements the steps shown in FIG. 3 to recognize face images supplied by head locator module 8. First, face recognition module 10 projects the input image (i.e., the image presumed to contain a face) onto face space by projecting it onto each of the M' eigenfaces (step 200). Then, module 10 determines whether the input image is a face at all (whether known or unknown) by checking to see if the image is sufficiently close to "face space" (step 202). That is, module 10 computes how far the input image in the multi-dimensional image space is from the face space and compares this to a preselected threshold. If the computed distance is greater than the preselected threshold, module 10 indicates that it does not represent a face image and motion detection module 6 locates the next block of the overall image which may contain a face image.

If the input image is sufficiently close to face space (i.e., the computed distance is less than the preselected threshold), recognition module 10 treats it as a face image and proceeds with determining whose face it is (step 206). This involves computing distances between the projection of the input image onto face space and each of the reference face images in face space. If the projected input image is sufficiently close to any one of the reference faces (i.e., the computed distance in face space is less than a predetermined distance), recognition module 10 identifies the input image as belonging to the individual associated with that reference face. If the projected input image is not sufficiently close to any one of the reference faces, recognition module 10 reports that a person has been located but the identity of the person is unknown.

The mathematics underlying each of these steps will now be described in greater detail.

Calculating Eigenfaces

Let a face image I(x,y) be a two-dimensional N by N array of (8-bit) intensity values. The face image is represented in the multi-dimensional image space as a vector of dimension N². Thus, a typical image of size 256 by 256 becomes a vector of dimension 65,536, or, equivalently, a point in 65,536-dimensional image space. An ensemble of images, then, maps to a collection of points in this huge space.

Images of faces, being similar in overall configuration, are not randomly distributed in this huge image space and thus can be described by a relatively low dimensional subspace. Using principal component analysis, one identifies the vectors which best account for the distribution of face images within the entire image space. These vectors, namely, the "eigenfaces", define the "face space". Each vector is of length N², describes an N by N image, and is a linear combination of the original face images of the reference set.

Let the training set of face images be Γ₁, Γ₂, Γ₃, . . . , Γ_M. The average face of the set is defined by

$$\Psi = \frac{1}{M} \sum_{n=1}^{M} \Gamma_n \tag{1}$$

where the summation is from n=1 to M. Each face differs from the average by the vector Φ_i =Γ_i -Ψ. This set of very large vectors is then subject to principal component analysis, which seeks a set of M orthonormal vectors, u_n, which best describes the distribution of the data. The kth vector, u_k, is chosen such that:

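$$\lambda_k = \frac{1}{M} \sum_{n=1}^{M} \left(u_k^T \Phi_n\right)^2 \tag{2}$$

is a maximum, subject to the orthonormality constraint:

$$u_l^T u_k = \delta_{lk} = \begin{cases} 1, & l = k \\ 0, & \text{otherwise} \end{cases} \tag{3}$$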

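The vectors u_k and scalars λ_k are the eigenvectors and eigenvalues, respectively, of the covariance matrix

$$C = \frac{1}{M} \sum_{n=1}^{M} \Phi_n \Phi_n^T = \frac{1}{M} A A^T \tag{4}$$

where the matrix A=[Φ₁ Φ₂ . . . Φ_M ]. The constant factor 1/M rescales the eigenvalues but leaves the eigenvectors unchanged, so the analysis that follows works with AA^T directly. The matrix C, however, is N² by N², and determining the N² eigenvectors and eigenvalues can become an intractable task for typical image sizes.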

If the number of data points in the face space is less than the dimension of the overall image space (namely, if M<N²), there will be only M-1, rather than N², meaningful eigenvectors. (The remaining eigenvectors will have associated eigenvalues of zero.) One can solve for the N²-dimensional eigenvectors in this case by first solving for the eigenvectors of an M by M matrix (e.g., solving a 16×16 matrix rather than a 16,384 by 16,384 matrix) and then taking appropriate linear combinations of the face images Φ_i. Consider the eigenvectors v_i of A^T A such that:

$$A^T A v_i = \mu_i v_i \tag{5}$$

Premultiplying both sides by A yields:

$$A A^T A v_i = \mu_i A v_i \tag{6}$$

from which it is apparent that the vectors Av_i are eigenvectors of AA^T, and therefore of C.

Following this analysis, it is possible to construct the M by M matrix L=A^T A, where L_mn =Φ_m^T Φ_n, and find the M eigenvectors, v_l, of L. These vectors determine linear combinations of the M training set face images to form the eigenfaces u_l:

$$u_l = \sum_{k=1}^{M} v_{lk} \Phi_k, \qquad l = 1, \ldots, M \tag{7}$$

With this analysis the calculations are greatly reduced, from the order of the number of pixels in the images (N²) to the order of the number of images in the training set (M). In practice, the training set of face images will be relatively small (M<<N²), and the calculations become quite manageable. The associated eigenvalues provide a basis for ranking the eigenvectors according to their usefulness in characterizing the variation among the images.
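The reduction from an N² by N² problem to an M by M problem can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `compute_eigenfaces`, the random stand-in images, the 32 by 32 image size, and the cutoff used to discard numerically zero eigenvalues are all assumptions made for the example.

```python
import numpy as np

def compute_eigenfaces(faces):
    """faces: (M, N*N) array with one flattened face image per row.

    Returns (mean_face, eigenfaces, eigenvalues). Eigenfaces are unit-length
    rows sorted by decreasing eigenvalue; the eigenvalues are those of
    L = A^T A (equivalently A A^T), without the 1/M factor.
    """
    mean_face = faces.mean(axis=0)       # average face Psi
    A = (faces - mean_face).T            # columns are the Phi_n; shape (N*N, M)
    L = A.T @ A                          # small M x M matrix
    mu, V = np.linalg.eigh(L)            # symmetric solver, ascending order
    order = np.argsort(mu)[::-1]
    mu, V = mu[order], V[:, order]
    keep = mu > 1e-10 * mu[0]            # assumed cutoff for "zero" eigenvalues
    mu, V = mu[keep], V[:, keep]
    U = A @ V                            # linear combinations of the Phi_n
    U /= np.linalg.norm(U, axis=0)       # normalize each eigenface
    return mean_face, U.T, mu

# Synthetic stand-in for M = 16 training images of 32 x 32 pixels.
rng = np.random.default_rng(0)
faces = rng.random((16, 32 * 32))
mean_face, eigenfaces, eigenvalues = compute_eigenfaces(faces)
```

With 16 mean-subtracted images, at most 15 eigenfaces have non-zero eigenvalues, which matches the M-1 count given in the text. The eigen-decomposition here touches only a 16 by 16 matrix rather than a 1,024 by 1,024 one.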

In practice, a smaller M' is sufficient for identification, since accurate construction of the image is not a requirement. In this framework, identification becomes a pattern recognition task. The eigenfaces span an M'-dimensional subspace of the original N² image space. The M' significant eigenvectors of the L matrix are chosen as those with the largest associated eigenvalues. In test cases based upon M=16 face images, M'=7 eigenfaces were found to yield acceptable results, i.e., a level of accuracy sufficient for monitoring a TV audience for purposes of studying viewing habits and tastes.
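Selecting the M' eigenvectors with the largest eigenvalues can be sketched as below. The helper name `choose_top` and the "fraction of variation retained" figure it reports are illustrative additions; the text itself only states that the eigenvectors with the largest eigenvalues are chosen.

```python
import numpy as np

def choose_top(eigenvalues, eigenfaces, m_prime):
    """Keep the m_prime eigenfaces (rows) with the largest eigenvalues.

    Also returns the fraction of total variation they account for, which is
    a diagnostic assumed here for illustration.
    """
    order = np.argsort(eigenvalues)[::-1][:m_prime]
    explained = eigenvalues[order].sum() / eigenvalues.sum()
    return eigenfaces[order], float(explained)

# Toy values: five eigenvalues paired with five placeholder eigenfaces.
toy_values = np.array([1.0, 5.0, 3.0, 0.5, 0.5])
toy_faces = np.eye(5)
top_faces, explained = choose_top(toy_values, toy_faces, m_prime=2)
```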

A new face image (Γ) is transformed into its eigenface components (i.e., projected into "face space") by a simple operation,

$$\omega_k = u_k^T (\Gamma - \Psi) \tag{8}$$

for k=1, . . . , M'. This describes a set of point-by-point image multiplications and summations, operations which may be performed at approximately frame rate on current image processing hardware.
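The projection of Eq. 8 amounts to one matrix-vector product once the eigenfaces are stacked as rows. The sketch below assumes orthonormal eigenfaces and uses a randomly generated orthonormal basis as a stand-in; the function name `project` and all data are hypothetical.

```python
import numpy as np

def project(image_vec, mean_face, eigenfaces):
    """Return the weights omega_k = u_k^T (Gamma - Psi), one per eigenface row."""
    return eigenfaces @ (image_vec - mean_face)

# Stand-in data: 3 orthonormal "eigenfaces" in a 64-dimensional image space.
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((64, 3)))
toy_eigenfaces = Q.T
toy_mean = rng.random(64)

# Build an image with known weights, then recover them by projection.
true_weights = np.array([2.0, -1.0, 0.5])
gamma = toy_mean + true_weights @ toy_eigenfaces
omega = project(gamma, toy_mean, toy_eigenfaces)
```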

The weights form a vector Ω^T =[ω₁ ω₂ . . . ω_M'] that describes the contribution of each eigenface in representing the input face image, treating the eigenfaces as a basis set for face images. The vector may then be used in a standard pattern recognition algorithm to find which of a number of pre-defined face classes, if any, best describes the face. The simplest method for determining which face class provides the best description of an input face image is to find the face class k that minimizes the Euclidean distance

$$\varepsilon_k = \lVert \Omega - \Omega_k \rVert^2 \tag{9}$$

where Ω_k is a vector describing the kth face class. The face classes Ω_i are calculated by averaging the results of the eigenface representation over a small number of face images (as few as one) of each individual. A face is classified as belonging to class k when the minimum ε_k is below some chosen threshold θ_ε. Otherwise the face is classified as "unknown", and optionally used to create a new face class.
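A nearest-class search based on Eq. 9 might look like the following sketch. The function name `nearest_class`, the toy class vectors, and the threshold value are assumptions for illustration; the comparison is made on the squared distance, as Eq. 9 is written.

```python
import numpy as np

def nearest_class(omega, class_vectors, theta_eps):
    """Return (class index or None, minimum squared distance).

    class_vectors holds one class vector Omega_k per row; theta_eps is the
    chosen threshold on the squared distance of Eq. 9.
    """
    eps = np.sum((class_vectors - omega) ** 2, axis=1)
    k = int(np.argmin(eps))
    return (k if eps[k] < theta_eps else None), float(eps[k])

# Three toy face classes in a two-dimensional face space.
toy_classes = np.array([[0.0, 0.0], [3.0, 4.0], [10.0, 0.0]])
known, d_known = nearest_class(np.array([3.2, 3.9]), toy_classes, theta_eps=1.0)
unknown, d_unknown = nearest_class(np.array([6.0, 6.0]), toy_classes, theta_eps=1.0)
```

In the first call the pattern vector falls close to the second class; in the second call no class is within the threshold, so the result is `None`, corresponding to the "unknown" outcome.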

Because creating the vector of weights is equivalent to projecting the original face image onto the low-dimensional face space, many images (most of them looking nothing like a face) will project onto a given pattern vector. This is not a problem for the system, however, since the squared distance ε² between the image and face space is simply the squared distance between the mean-adjusted input image Φ=Γ-Ψ and its projection onto face space, Φ_f=Σω_k u_k (where the summation is over k from 1 to M'):

$$\varepsilon^2 = \lVert \Phi - \Phi_f \rVert^2 \tag{10}$$

Thus, there are four possibilities for an input image and its pattern vector: (1) near face space and near a face class; (2) near face space but not near a known face class; (3) distant from face space and near a face class; and (4) distant from face space and not near a known face class.

In the first case, an individual is recognized and identified. In the second case, an unknown individual is present. The last two cases indicate that the image is not a face image. Case three typically shows up as a false positive in most other recognition systems. In the described embodiment, however, the false recognition may be detected because of the significant distance between the image and the subspace of expected face images.
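The third case can be made concrete with a small sketch of Eq. 10, assuming orthonormal eigenfaces stored as rows. The two toy images below are hypothetical; they are constructed so that both project onto the same pattern vector while only one of them lies in face space.

```python
import numpy as np

def face_space_distance_sq(gamma, psi, eigenfaces):
    """Return (eps_sq, omega): squared distance to face space (Eq. 10) and weights."""
    phi = gamma - psi                 # mean-adjusted image
    omega = eigenfaces @ phi          # pattern vector (Eq. 8)
    phi_f = eigenfaces.T @ omega      # projection of phi onto face space
    return float(np.sum((phi - phi_f) ** 2)), omega

psi = np.zeros(4)
eigenfaces = np.array([[1.0, 0.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0, 0.0]])
face_like = np.array([2.0, -1.0, 0.0, 0.0])   # lies entirely in face space
non_face = np.array([2.0, -1.0, 7.0, 7.0])    # same weights, large residual

d_face, w_face = face_space_distance_sq(face_like, psi, eigenfaces)
d_non, w_non = face_space_distance_sq(non_face, psi, eigenfaces)
print(w_face, d_face)
print(w_non, d_non)
```

Both inputs would be assigned to the same face class by the distance of Eq. 9 alone; only the residual distinguishes them.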

Summary of Eigenface Recognition Procedure

To summarize, the eigenfaces approach to face recognition involves the following steps:

1. Collect a set of characteristic face images of the known individuals. This set may include a number of images for each person, with some variation in expression and in lighting. (Say four images of ten people, so M=40.)

2. Calculate the (40×40) matrix L, find its eigenvectors and eigenvalues, and choose the M' eigenvectors with the highest associated eigenvalues. (Let M'=10 in this example.)

3. Combine the normalized training set of images according to Eq. 7 to produce the (M'=10) eigenfaces u_k.

4. For each known individual, calculate the class vector Ω_k by averaging the eigenface pattern vectors Ω (from Eq. 8) calculated from the original (four) images of the individual. Choose a threshold θ_ε which defines the maximum allowable distance from any face class, and a threshold θ_t which defines the maximum allowable distance from face space (according to Eq. 10).

5. For each new face image to be identified, calculate its pattern vector Ω, the distances ε_i to each known class, and the distance ε to face space. If the distance ε>θ_t, classify the input image as not a face. If the minimum distance ε_k ≦θ_ε and the distance ε≦θ_t, classify the input face as the individual associated with class vector Ω_k. If the minimum distance ε_k >θ_ε and ε≦θ_t, then the image may be classified as "unknown", and optionally used to begin a new face class.

6. If the new image is classified as a known individual, this image may be added to the original set of familiar face images, and the eigenfaces may be recalculated (steps 1-4). This gives the opportunity to modify the face space as the system encounters more instances of known faces.
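Steps 1 through 5 can be sketched end to end as follows. This is an illustration under stated assumptions, not the described embodiment: L is taken to be the M×M matrix of inner products between the mean-adjusted training images, the eigenfaces are formed as linear combinations of those images weighted by the eigenvectors of L (Eq. 7 is not reproduced in this section), both thresholds are applied to squared distances, and the training data are synthetic random vectors rather than real faces. All function names are hypothetical.

```python
import numpy as np

def train_eigenfaces(images, num_eigenfaces):
    """Steps 1-3: images is (M, P), one flattened training image per row."""
    psi = images.mean(axis=0)                       # average face
    A = images - psi                                # rows are mean-adjusted images
    L = A @ A.T                                     # (M, M) matrix of inner products
    eigvals, eigvecs = np.linalg.eigh(L)            # eigh returns ascending eigenvalues
    top = np.argsort(eigvals)[::-1][:num_eigenfaces]
    U = eigvecs[:, top].T @ A                       # combine training images into eigenfaces
    U /= np.linalg.norm(U, axis=1, keepdims=True)   # scale each eigenface to unit length
    return psi, U

def class_vectors(images, labels, psi, U):
    """Step 4: average the pattern vectors of each individual's images."""
    omegas = (images - psi) @ U.T
    return np.stack([omegas[labels == p].mean(axis=0) for p in np.unique(labels)])

def identify(gamma, psi, U, classes, theta_eps, theta_t):
    """Step 5: return ("known", k), ("unknown", None) or ("not a face", None)."""
    phi = gamma - psi
    omega = U @ phi
    if np.sum((phi - U.T @ omega) ** 2) > theta_t:  # squared distance to face space
        return ("not a face", None)
    eps = np.sum((classes - omega) ** 2, axis=1)    # squared distance to each class
    k = int(np.argmin(eps))
    return ("known", k) if eps[k] <= theta_eps else ("unknown", None)

# Synthetic stand-in data: 10 people x 4 images of 16x16 pixels, so M = 40.
rng = np.random.default_rng(0)
prototypes = rng.normal(size=(10, 256))
labels = np.repeat(np.arange(10), 4)
images = prototypes[labels] + 0.05 * rng.normal(size=(40, 256))

psi, U = train_eigenfaces(images, num_eigenfaces=10)   # M' = 10
classes = class_vectors(images, labels, psi, U)

print(identify(images[0], psi, U, classes, theta_eps=5.0, theta_t=10.0))
print(identify(rng.normal(size=256), psi, U, classes, theta_eps=5.0, theta_t=10.0))
```

The threshold values are chosen by hand for this synthetic data; in practice they would have to be tuned on real images.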

In the described embodiment, calculation of the eigenfaces is done offline as part of the training. The recognition currently takes about 400 msec running rather inefficiently in Lisp on a Sun 4, using face images of size 128×128. With some special-purpose hardware, the current version could run at close to frame rate (33 msec).

Designing a practical system for face recognition within this framework requires assessing the tradeoffs between generality, required accuracy, and speed. If the face recognition task is restricted to a small set of people (such as the members of a family or a small company), a small set of eigenfaces is adequate to span the faces of interest. If the system is to learn new faces or represent many people, a larger basis set of eigenfaces will likely be required.

Motion Detection And Head Tracking

In the described embodiment, motion detection module 6 and head locator module 8 locate and track the position of the head of any person within the scene viewed by video camera 4 by implementing the tracking algorithm depicted in FIG. 4. A sequence of image frames 30 from video camera 4 first passes through a spatio-temporal filtering module 32 which accentuates image locations which change with time. Spatio-temporal filtering module 32 identifies the locations of motion by performing a differencing operation on successive frames of the sequence of image frames. In the output of the spatio-temporal filter module 32, a moving person "lights up" whereas the other areas of the image containing no motion appear as black.

The spatio-temporal filtered image passes to a thresholding module 34 which produces a binary motion image identifying the locations of the image for which the motion exceeds a preselected threshold. That is, it locates the areas of the image containing the most motion. In all such areas, the presence of a person is postulated.
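The differencing and thresholding stages can be sketched in a few lines. This is the simplest reading of the text, an absolute difference of two successive frames followed by a fixed threshold, with no spatial smoothing; the actual filter of module 32 may differ. The frame contents and the threshold value are hypothetical.

```python
import numpy as np

def binary_motion_image(prev_frame, next_frame, threshold):
    """Return a boolean image that is True where the frame-to-frame change exceeds threshold."""
    # Widen to a signed type first so that uint8 subtraction cannot wrap around.
    diff = np.abs(next_frame.astype(np.int32) - prev_frame.astype(np.int32))
    return diff > threshold

# Hypothetical 6x6 frames: a bright 2x2 patch moves one pixel to the right.
prev_frame = np.zeros((6, 6), dtype=np.uint8)
next_frame = np.zeros((6, 6), dtype=np.uint8)
prev_frame[1:3, 1:3] = 200
next_frame[1:3, 2:4] = 200

motion = binary_motion_image(prev_frame, next_frame, threshold=50)
print(motion.astype(int))
```

Only the leading and trailing edges of the moving patch are marked; the column that is bright in both frames produces no difference.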

A motion analyzer module 36 analyzes the binary motion image to watch how "motion blobs" change over time to decide if the motion is caused by a person moving and to determine head position. A few simple rules are applied, such as "the head is the small upper blob above a larger blob (i.e., the body)", and "head motion must be reasonably slow and contiguous" (i.e., heads are not expected to jump around the image erratically).

The motion image also allows for an estimate of scale. The size of the blob that is assumed to be the moving head determines the size of the subimage to send to face recognition module 10 (see FIG. 1). This subimage is rescaled to fit the dimensions of the eigenfaces.
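The cropping and rescaling step might look like the following sketch. It assumes the head blob has already been reduced to a square bounding box (top, left, size) and uses nearest-neighbour resampling for brevity; the text does not specify the interpolation method. All names are hypothetical.

```python
import numpy as np

def crop_and_rescale(frame, top, left, size, out_size):
    """Cut a size x size subimage from frame and resample it to out_size x out_size."""
    sub = frame[top:top + size, left:left + size]
    # Nearest-neighbour resampling: map each output index to a source index.
    idx = (np.arange(out_size) * size) // out_size
    return sub[np.ix_(idx, idx)]

# Hypothetical 8x8 frame whose pixel values encode their own position.
frame = np.arange(64).reshape(8, 8)
small = crop_and_rescale(frame, top=2, left=2, size=4, out_size=2)   # shrink a 4x4 head box
large = crop_and_rescale(frame, top=2, left=2, size=4, out_size=8)   # enlarge the same box
print(small)
print(large.shape)
```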

Using "Face Space" To Locate The Face

Face space may also be used to locate faces in single images, either as an alternative to locating faces from motion (e.g. if there is too little motion or many moving objects) or as a method of achieving more precision than is possible by use of motion tracking alone.

Typically, images of faces do not change radically when projected into the face space, whereas the projection of non-face images appears quite different. This basic idea may be used to detect the presence of faces in a scene. To implement this approach, the distance ε between the local subimage and face space is calculated at every location in the image. This calculated distance from face space is then used as a measure of "faceness". The result of calculating the distance from face space at every point in the image is a "face map" ε(x,y) in which low values (i.e., the dark areas) indicate the presence of a face.

Direct application of Eq. 10, however, is rather expensive computationally. A simpler, more efficient method of calculating the face map ε(x,y) is as follows.

To calculate the face map at every pixel of an image I(x,y), the subimage centered at that pixel is projected onto face space and the projection is then subtracted from the original subimage. To project a subimage Γ onto face space, one first subtracts the mean image (i.e., Ψ), resulting in Φ=Γ-Ψ. With Φ_f being the projection of Φ onto face space, the distance measure at a given image location is then:

ε² =∥Φ-Φ_f ∥² =(Φ-Φ_f)^T (Φ-Φ_f) =Φ^T Φ-Φ_f^T Φ_f (11)

since Φ_f ⊥(Φ-Φ_f). Because Φ_f is a linear combination of the eigenfaces (Φ_f =Σ_i ω_i u_i) and the eigenfaces are orthonormal vectors,

Φ_f^T Φ_f =Σ_i ω_i² (12)

and

ε² (x,y)=Φ^T (x,y) Φ(x,y)-Σω_i² (x,y) (13)

where ε(x,y) and ω_i (x,y) are scalar functions of image location, and Φ(x,y) is a vector function of image location.

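The second term of Eq. 13 is calculated in practice by a correlation with the L eigenfaces:

Σ_i ω_i² (x,y)=Σ_i [Φ^T (x,y) u_i]² =Σ_i [Γ^T (x,y) u_i -Ψ^T u_i]² =Σ_i [I(x,y)⊗u_i -Ψ⊗u_i]² (14)

where ⊗ is the correlation operator. The first term of Eq. 13 becomes

Φ^T (x,y) Φ(x,y)=[Γ(x,y)-Ψ]^T [Γ(x,y)-Ψ]=Γ^T (x,y) Γ(x,y)-2Ψ^T Γ(x,y)+Ψ^T Ψ=Γ^T (x,y) Γ(x,y)-2I(x,y)⊗Ψ+Ψ^T Ψ (15)

Since the average face Ψ and the eigenfaces u_i are fixed, the terms Ψ^T Ψ and Ψ⊗u_i may be computed ahead of time.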

Thus, the computation of the face map involves only L+1 correlations over the input image and the computation of the first term Γ^T (x,y)Γ(x,y). This is computed by squaring the input image I(x,y) and, at each image location, summing the squared values of the local subimage.
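The correlation-based computation can be sketched as below. Several simplifications are assumed: each map entry is indexed by the top-left corner of its window rather than its centre, only windows lying fully inside the image are evaluated, and the correlations are computed with NumPy's sliding_window_view (NumPy 1.20 or later) rather than dedicated hardware. The eigenfaces here are random orthonormal vectors standing in for real ones, and all names are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def correlate_valid(image, template):
    """Correlate image with template at every window that fits inside the image."""
    windows = sliding_window_view(image, template.shape)   # (H-h+1, W-w+1, h, w)
    return np.einsum("xyij,ij->xy", windows, template)

def face_map(image, psi, eigenfaces):
    """Squared distance from face space per window, following Eq. 13.

    psi        -- average face, shape (h, w)
    eigenfaces -- orthonormal eigenfaces, shape (L, h, w)
    """
    # First term: Gamma^T Gamma - 2 (I correlated with Psi) + Psi^T Psi.
    gamma_sq = correlate_valid(image ** 2, np.ones(psi.shape))
    first = gamma_sq - 2.0 * correlate_valid(image, psi) + np.sum(psi ** 2)
    # Second term: sum over eigenfaces of (I correlated with u_i - Psi^T u_i)^2.
    second = np.zeros_like(first)
    for u in eigenfaces:
        second += (correlate_valid(image, u) - np.sum(psi * u)) ** 2
    return first - second

rng = np.random.default_rng(1)
h = w = 4
psi = rng.normal(size=(h, w))
Q, _ = np.linalg.qr(rng.normal(size=(h * w, 3)))    # 3 orthonormal columns
eigenfaces = Q.T.reshape(3, h, w)

image = rng.normal(size=(12, 12))
image[5:9, 2:6] = psi + 2.0 * eigenfaces[0] - 1.0 * eigenfaces[2]   # plant a "face"

fmap = face_map(image, psi, eigenfaces)
best = np.unravel_index(np.argmin(fmap), fmap.shape)
print(best)
```

The one-off sums Ψ^T Ψ and Ψ^T u_i are recomputed inside face_map for brevity; a practical version would cache them, as the text suggests.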

Scale Invariance

Experiments reveal that recognition performance decreases quickly as the head size, or scale, is mis-judged. It is therefore desirable that the head size in the input image be close to that of the eigenfaces. The motion analysis can give an estimate of head size, from which the face image is rescaled to the eigenface size.

Another approach to the scale problem, which may be separate from or in addition to the motion estimate, is to use multiscale eigenfaces, in which an input face image is compared with eigenfaces at a number of scales. In this case the image will appear to be near the face space of only the closest scale eigenfaces. Equivalently, the input image (i.e., the portion of the overall image selected for analysis) can be scaled to multiple sizes and the scale which results in the smallest distance measure to face space used.
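The second variant, rescaling the selected image portion to several sizes and keeping the one closest to face space, can be sketched as follows. The sketch assumes square crops anchored at a known top-left corner, nearest-neighbour resampling, and random orthonormal vectors standing in for real eigenfaces; the candidate sizes and all names are hypothetical.

```python
import numpy as np

def rescale(sub, out_size):
    """Nearest-neighbour resampling of a square subimage to out_size x out_size."""
    idx = (np.arange(out_size) * sub.shape[0]) // out_size
    return sub[np.ix_(idx, idx)]

def face_space_distance_sq(gamma, psi, eigenfaces):
    """Squared distance between a flattened image and face space (Eq. 10)."""
    phi = gamma - psi
    omega = eigenfaces @ phi
    return float(np.sum((phi - eigenfaces.T @ omega) ** 2))

def best_scale(image, top, left, sizes, psi, eigenfaces, face_size):
    """Try each crop size and return (size, distance) for the smallest distance."""
    scores = {}
    for s in sizes:
        sub = rescale(image[top:top + s, left:left + s], face_size)
        scores[s] = face_space_distance_sq(sub.ravel(), psi, eigenfaces)
    best = min(scores, key=scores.get)
    return best, scores[best]

rng = np.random.default_rng(2)
face_size = 4
psi = rng.normal(size=face_size * face_size)
Q, _ = np.linalg.qr(rng.normal(size=(face_size * face_size, 3)))
eigenfaces = Q.T                                   # 3 orthonormal eigenfaces as rows

# A 4x4 pattern lying in face space, shown at twice the eigenface size in the scene.
face = (psi + 1.5 * eigenfaces[0] + 0.5 * eigenfaces[1]).reshape(face_size, face_size)
image = rng.normal(size=(16, 16))
image[2:10, 2:10] = np.kron(face, np.ones((2, 2)))

size, dist = best_scale(image, 2, 2, [4, 8, 12], psi, eigenfaces, face_size)
print(size, dist)
```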

Other embodiments are within the following claims. For example, although the eigenfaces approach to face recognition has been presented as an information processing model, it may also be implemented using simple parallel computing elements, as in a connectionist system or artificial neural network.

INVENTORS:

Pentland, Alex P., Turk, Matthew

THIS PATENT REFERENCES THESE PATENTS:

Patent	Priority	Assignee	Title
4636862,	Feb 13 1984	Kokusai Denshin Denwa Kabushiki Kaisha	System for detecting vector of motion of moving objects on picture
4651289,	Jan 29 1982	Tokyo Shibaura Denki Kabushiki Kaisha	Pattern recognition apparatus and method for making same
4752957,	Sep 07 1983	Kabushiki Kaisha Toshiba	Apparatus and method for recognizing unknown patterns
4838644,	Sep 15 1987	The United States of America as represented by the United States	Position, rotation, and intensity invariant recognizing method
4858000,	Sep 14 1988	MALACHI PARTNERS LLC, A FLORIDA LIMITED LIABILITY COMPANY	Image recognition audience measurement system and method
4926491,	Sep 17 1984	Kabushiki Kaisha Toshiba	Pattern recognition device
4930011,	Aug 02 1988	NIELSEN MEDIA RESEARCH, INC , A DELAWARE CORP	Method and apparatus for identifying individual members of a marketing and viewing audience
4998286,	Feb 13 1987	Olympus Optical Co., Ltd.	Correlation operational apparatus for multi-dimensional images
5031228,	Sep 14 1988	NIELSEN MEDIA RESEARCH, INC , A DELAWARE CORP	Image recognition system and method

ASSIGNMENT RECORDS

Executed on	Assignor	Assignee	Conveyance	Frame	Reel	Doc
Nov 16 1994		Massachusetts Institute of Technology	(assignment on the face of the patent)

MAINTENANCE FEES AND DATES:

Date	Maintenance Fee Events
Aug 24 2000	M184: Payment of Maintenance Fee, 8th Year, Large Entity.
Aug 24 2000	M186: Surcharge for Late Payment, Large Entity.
Jun 30 2005	M1553: Payment of Maintenance Fee, 12th Year, Large Entity.
Jun 30 2005	M1558: Surcharge, Petition to Accept Pymt After Exp, Unintentional.
Jun 30 2005	PMFP: Petition Related to Maintenance Fees Filed.
Jul 28 2005	PMFG: Petition Related to Maintenance Fees Granted.
Feb 03 2010	ASPN: Payor Number Assigned.

Date	Maintenance Schedule
Jan 12 2002	4 years fee payment window open
Jul 12 2002	6 months grace period start (w surcharge)
Jan 12 2003	patent expiry (for year 4)
Jan 12 2005	2 years to revive unintentionally abandoned end. (for year 4)
Jan 12 2006	8 years fee payment window open
Jul 12 2006	6 months grace period start (w surcharge)
Jan 12 2007	patent expiry (for year 8)
Jan 12 2009	2 years to revive unintentionally abandoned end. (for year 8)
Jan 12 2010	12 years fee payment window open
Jul 12 2010	6 months grace period start (w surcharge)
Jan 12 2011	patent expiry (for year 12)
Jan 12 2013	2 years to revive unintentionally abandoned end. (for year 12)