A method and system for creating a transition between a first scene and a second scene on a computer system display, simulating motion. The method includes determining a transformation that maps the first scene into the second scene. Motion between the scenes is simulated by displaying transitional images that include a transitional scene based on a transitional object in the first scene and in the second scene. The rendering of the transitional object evolves according to specified transitional parameters as the transitional images are displayed. A viewer receives a sense of the connectedness of the scenes from the transitional images. Virtual tours of broad areas, such as cityscapes, can be created using inter-scene transitions among a complex network of pairs of scenes.
|
1. A system for creating a transition between a first scene and a second scene simulating motion, the first scene observed from a first viewpoint including a first digital image and the second scene observed from a second viewpoint including a second digital image, the system comprising:
a processor, the processor configured to:
determine, for each digital image, a directional vector corresponding to a pan and tilt of a camera that captured the digital image;
transform, for each digital image, corresponding digital image data to rotate the digital image so that the directional vector of the transformed digital image is parallel to one of the axes of a global coordinate system;
further transform digital image data of at least one of the transformed digital images so as to align the directional vectors of the transformed digital images, thereby achieving alignment of corresponding features in each of the transformed digital images;
define a common ground plane shared by the first transformed digital image and the second transformed digital image;
define a footprint of structures within the first transformed digital image and the second transformed digital image on the ground plane;
extrude geometries from the footprint, wherein the footprint and the extruded geometries define a three-dimensional geometry representative of the structures;
determine a view position along a path for a transitional image in relation to the three dimensional geometry; and
projectively map data from the first digital image and the second digital image based on the three-dimensional geometry and the view position to create a transitional image.
11. A system for creating a transition between a first scene and a second scene simulating motion, the first scene observed from a first viewpoint including a first digital image and the second scene observed from a second viewpoint including a second digital image, the system comprising:
means for determining, for each digital image, a directional vector corresponding to a pan and tilt of a camera that captured the digital image;
means for transforming, for each digital image, corresponding digital image data to rotate the digital image so that the directional vector of the transformed digital image is parallel to one of the axes of a global coordinate system;
means for further transforming digital image data of at least one of the transformed digital images so as to align the directional vectors of the transformed digital images, thereby achieving alignment of corresponding features in each of the transformed digital images;
means for defining a common ground plane shared by the first transformed digital image and the second transformed digital image;
means for defining a footprint of structures within the first transformed digital image and the second transformed digital image on the ground plane;
means for extruding geometries from the footprint, wherein the footprint and the extruded geometries define a three-dimensional geometry representative of the structures;
means for determining a view position along a path for a transitional image in relation to the three-dimensional geometry; and
means for projectively mapping data from the first digital image and the second digital image based on the three-dimensional geometry and the view position to create a transitional image.
2. The system according to
3. The system according to
4. The system according to
5. The system according to
display on a display the first digital image, one or more transitional images, and the second digital image sequentially.
6. The system according to
7. The system according to
8. The system according to
display on a display the first digital image with a first navigational icon embedded; and
when the first navigational icon is activated, display the transitional image, such that there is simulated motion from the first digital image to the second digital image.
9. The system according to
10. The system according to
display the first scene;
receive an indication of the view position; and
when the indication of the view position is received, display the transitional image that includes at least one transitional scene, such that there is simulated motion from the first scene to the second scene.
12. The system according to
13. The system according to
14. The system according to
15. The system according to
means for displaying on a display the first digital image, one or more transitional images, and the second digital image sequentially.
16. The system according to
17. The system according to
18. The system according to
means for displaying on a display the first digital image with a first navigational icon embedded; and
means for displaying the transitional image, such that there is simulated motion from the first digital image to the second digital image when the first navigational icon is activated.
19. The system according to
20. The system according to
means for displaying the image of the first scene;
means for receiving an indication of the view position; and
means for displaying the transitional image that includes at least one transitional scene, such that there is simulated motion from the first scene to the second scene when the indication of the view position is received.
|
This application is a continuation of, and therefore claims priority to, U.S. patent application Ser. No. 16/042,309 (the '309 application—U.S. Pat. No. 10,304,233), filed Jul. 23, 2018, entitled “Method for Inter-Scene Transitions,” the disclosure of which is incorporated herein by reference. The '309 application is a continuation of, and therefore claims priority to, U.S. patent application Ser. No. 14/969,669 (the '669 application—U.S. Pat. No. 10,032,306), filed Dec. 15, 2015, entitled “Method for Inter-Scene Transitions”, the disclosure of which is incorporated herein by reference. The '669 application is a continuation of, and therefore claims priority from, U.S. patent application Ser. No. 14/090,654 (the '654 application—Abandoned), filed Nov. 26, 2013, entitled “Method for Inter-Scene Transitions”, the disclosure of which is incorporated herein by reference. The '654 application is a continuation of, and therefore claims priority from, U.S. patent application Ser. No. 11/271,159 (the '159 application—Abandoned), filed Nov. 11, 2005, entitled “Method for Inter-Scene Transitions”, the disclosure of which is incorporated herein by reference. The '159 application claims priority from U.S. provisional patent application Ser. No. 60/712,356 (the '356 application), filed Aug. 30, 2005, entitled “Method for Inter-Scene Transitions”, the disclosure of which is incorporated herein by reference. The '159 application also claims priority from U.S. provisional patent application Ser. No. 60/627,335, filed Nov. 12, 2004, entitled “Method for Inter-Scene Transitions,” which is incorporated herein by reference.
The invention relates to computer graphics methods and systems and, in particular, to methods and systems for creating smooth transitions between two or more related images or panoramas on a computer display.
Virtual tours have become a frequently used technique for providing viewers with information about scenes of interest. Such tours can provide a photorealistic, interactive and immersive experience of a scene or collection of scenes. These tours can incorporate one or more of a wide variety of graphic display techniques in representing the scenes.
One effective technique for presenting information as part of these tours is display of a panorama or panoramic image. Panoramic viewers can display images with wide fields of view, while maintaining detail across the entire picture. Several steps are required for creation and display of these panoramas: image capture, image “stitching”, and panorama display (or viewing). The first step is capturing an image of the scene 100, which is also known as the acquisition step. Multiple photographs are typically taken from various angles from a single position 110 in space, as shown in
Current panoramic virtual tours have significant limitations. By their nature, panoramas (including regular photographs and images) are taken from a single acquisition position, and thus the images are static. To describe a broader area, i.e., beyond a view from a point in space, panoramic virtual tours typically employ a “periscope view”: the end user “pops” into a point in space, looks around, and then instantaneously “pops” into another position in space to navigate through a wider area. Assuming a simple case of two panoramic scenes, even when the acquisition positions are very close, it is often difficult for the viewer to mentally connect the two scenes. The two panoramas are not inherently capable of describing how they are connected and oriented with respect to each other. With these limitations, it is difficult for the viewer to understand the space, orientation, and scale of a wider area with current virtual tours.
In a first embodiment of the invention, there is provided a method for creating a transition between a first scene and a second scene simulating motion in a computer system having a display. The first scene is observed from a first viewpoint and includes a feature. The second scene is observed from a second viewpoint and includes a second feature. The method includes first graphically identifying on the display the feature in the first scene and the feature in the second scene and determining a transformation mapping the first scene into the second scene using the two features. Then, one or more transitional images are created that include at least one transitional scene based on the feature in the first scene and on the feature in the second scene, such that there is simulated motion from the first scene to the second scene.
In another embodiment of the invention, a method is provided for displaying a transition between a first scene and a second scene simulating motion on a computer system display. The first scene is observed from a first viewpoint and includes a first feature, and the second scene is observed from a second viewpoint and includes a second feature. The method includes displaying a navigational icon embedded in the first scene. When the navigational icon is activated, at least one transitional image is displayed that includes at least one transitional scene based on the first feature and on the second feature, such that there is simulated motion from the first scene to the second scene.
In a further embodiment of the invention, a method is provided for displaying a transition between a first scene and a selected scene simulating motion on a computer system display. The first scene is observed from a first viewpoint and includes a first feature, and the selected scene is observed from a second viewpoint and includes a second feature. The method includes displaying the first scene and receiving an indication of the location of the selected scene. When the indication of the location of the selected scene is received, at least one transitional image is displayed that includes at least one transitional scene based on the first feature and on the second feature, such that there is simulated motion from the first scene to the selected scene. In a specific embodiment of the invention, the indication may be received from search engine output, from a user selection from a list, or from activation of an icon anywhere on a display, etc.
In a further embodiment of the invention, a method is provided for displaying a transition between a first scene and a second scene and between the second scene and a third scene simulating motion on a computer system display. The first scene is observed from a first viewpoint and includes a first feature; the second scene is observed from a second viewpoint and includes a second feature; and the third scene is observed from a third viewpoint and includes a third feature. The method includes:
providing a first transitional image that includes at least one transitional scene based on the first feature and on the second feature, such that there is simulated motion from the first scene to the second scene; and providing a second transitional image that includes at least one transitional scene based on the second feature and on the third feature, such that there is simulated motion from the second viewpoint to the third viewpoint. The first transitional image and the second transitional image are formed without determining the absolute positions and orientations in a frame of reference of each of the first, second and third scenes.
In another embodiment of the invention, a method is provided for displaying a transition between a first scene and a selected scene simulating motion on a computer system display. The first scene is observed from a first viewpoint and includes a first feature; a second scene is observed from a second viewpoint and includes a second feature; and the selected scene is observed from a selected scene viewpoint. The method includes: displaying the first scene; receiving an indication of the location of the selected scene viewpoint; and determining a route from the first viewpoint to the selected scene viewpoint, where the route includes the second viewpoint. When the indication of the location of the selected scene viewpoint is received, a transitional image is displayed that includes at least one transitional scene based on the first feature and on the second feature, such that there is simulated motion from the first scene to the second scene.
The foregoing features of the invention will be more readily understood by reference to the following detailed description, taken with reference to the accompanying drawings in which:
Note that as used in this description and the accompanying claims, the following terms shall have the meanings indicated, unless the context otherwise requires: The term “perspective view” shall mean a 2D view of an image in a world plane projected on an image plane. The image plane will frequently be a display surface, but in general, may be any plane. A “perspective rectangle” shall mean a 2D polygon in a perspective view which is a projection of a rectangle in world space onto the image plane. A “transitional parameter” shall mean a measure of the contribution of a first image versus a second image to a transitional object formed from a combination of the first image and the second image. For example, if the transitional object is derived from alpha blending the first image and the second image, the transitional parameter measures the degree of transparency and opacity of the contribution of each image to the transitional object. An “active element” shall mean an icon displayed in an image such that selection of the icon by an input device initiates an action. A “navigational icon” shall mean an active element icon displayed in an image such that selection of the icon by an input device causes a displayed image of a scene to update.
In broad overview, embodiments of the invention provide a system and a method that simulate smooth motion between images of two or more connected locations or scenes. Simulated motion provides a sense of orientation and an understanding of the space to users navigating through a series of images of locations. To navigate from one image to another, a user may select a portion of a first scene that connects to a second scene. The view is then transitioned to the second scene. This type of navigation may be disorienting if the second scene simply replaces the first scene—there is no sense of motion between the scenes to emphasize the geographic connection between them. Instead, motion between the two scenes may be simulated to provide the viewer a better sense of the relationships between the two scenes, including a sense of space and orientation.
In further embodiments of the invention, this concept of simulating motion between images can be extended to create a connected network of multiple image pairs forming a tour of a space, such as a neighborhood, a boulevard, or even a town or city. Such a network of scenes will be called below a “supertour.” The term “supertour” is used for convenience in description and not by way of limitation: the network of images may extend from two images to an arbitrarily large number of images. An overview flow diagram for a method of creating a supertour is shown in
One method of providing a sense of connection between scenes uses techniques known as zooming and fading. From an initial panorama or image, the viewer orients towards the second scene (panorama or image), zooms in by varying the field-of-view (“FOV”) of a virtual camera, then fades out of the first panorama, then fades into the second panorama. This technique may provide some sense of orientation, but it is very dependent on the scene: how closely the panoramic images have been acquired, whether the scenes contain substantial amounts of common visual features, and the complexity of visibility and occlusions among the objects within the scene. When these conditions are unfavorable, zooming and fading works no better than “popping” into the destination panorama without the zoom-fade effects. Furthermore, zooming into an image cannot properly simulate moving in three-dimensional space. Zooming into a flat image is the same as “having a closer look” at an image and does not simulate motion in 3D space. Realistic motion depends heavily on the parallax effect as the relative positions between objects and camera change.
Another method of providing a simulation of motion between two images is to create a physical movie of the motion between images, which is played when a user chooses to move between two scenes. Capturing an actual movie between positions in physical space could be done using a video camera and other camera positioning equipment. This approach of using movies is particularly useful for transitioning between images on Web pages. Because most Web browsers include software that is capable of playing streaming video or other digital movie or video formats, no additional software is needed to display such movies. However, actual physical movies for transitions between scenes can be time consuming and expensive to acquire, especially for large environments, e.g., cityscapes. The movies also require significant data and post processing. Because of differences in points-of-view, it is typically necessary to create separate movies for each direction in which motion between images or panoramas is desired. Thus, for movement between two images, two movies are needed: one movie for movement from the first image to the second, and a different movie for movement from the second image to the first. This further complicates the acquisition process, since accurate connections of the bidirectional movies are important in creating seamless movies and images/panoramas. Specialized equipment and a crew of people are necessary for such endeavors.
Another method of simulating motion between two images involves creating a three-dimensional model that represents the path between two images. Once such a three-dimensional model exists, motion between the images can be simulated by moving the position of a virtual camera in the three-dimensional model. This approach provides a high degree of flexibility, permitting a user to view the area represented by the model from any vantage point. Techniques such as those illustrated in U.S. patent application Ser. No. 10/780,500, entitled “Modeling and Editing Image Panoramas,” which is incorporated herein by reference, may be used to create three dimensional models from panoramic images. However, these techniques create visual artifacts and seams, since photo-textured models have static texture maps.
In various embodiments of the present invention, a method and a system are provided for generating a substantially seamless transition between two scenes—a “first scene” and a “second scene”—simulating motion on a computer display screen. The first scene is observed from a first viewpoint and the second scene is observed from a second viewpoint. These scenes may be a single source image or a panoramic source image or any portion thereof. Images may include virtually any type of digitized graphical content including photographs, pictures, sketches, paintings, etc.
The first step 500, acquisition camera pose estimation, determines relative acquisition positions of the first and second scenes in 3D space (i.e., a world space). More technically, the pose estimation step determines the camera extrinsics, i.e., the position and orientation of the acquisition camera. To simulate 3D motion from one point in space to another, it is necessary to compute relative distances and orientations of the source images with respect to each other. Typically, to compute the pair-wise pose estimation, correspondences between common features in the source images are established, automatically or with human intervention. With appropriate levels of corresponded features, the relative camera extrinsics may be computed. In a specific embodiment of the invention, planar rectangular feature correspondences between the scenes are used to estimate the pose. In another specific embodiment of the invention, a perspective rectangle tool (“PRT”) is provided, as described below, to facilitate tracing of rectangular features in an image. Note that this step establishes a transformation that maps the first scene into the second scene and that, in embodiments of the invention, a variety of techniques, as are known in the art, may be used to determine this transformation. Note also that the source images may show the same physical location or different physical locations, and that features within the source images that are corresponded need not be the same feature or at the same location.
Transitional objects are then created 510. Once the relative positions of the first and second scenes are determined, then a path for a virtual camera is selected from the first scene to the second scene. The camera path may be any arbitrary path, but, by default, the camera path may be a straight line. To simulate motion, “transitional objects” are created. Transitional scenes incorporating these transitional objects are displayed to simulate motion from the first scene to the second scene. These transitional objects are typically objects in the transitional scenes that are formed by combining a portion or feature of the first scene and a portion or feature of a second scene. The combining operators are what we call transitional parameters, described in detail below. In a specific embodiment of the invention, three-dimensional geometry with projective texture mapping may be used to create transitional objects. The projective textures are either from the first source image, or the second source image, or a blend of both. When the transition to the second scene has been achieved, the transitional scenes including the transitional objects disappear, and the user sees only the second scene. For example, transitional objects in a beach scene may include people, beach umbrellas, the beach, and/or the sky. As the virtual camera travels to the second scene, the people, the beach, the sky and the umbrellas pass by to correctly simulate a 3D motion in space.
Next, transitional parameters may be entered and adjusted 520. As the virtual camera travels from the first scene to the second scene, transitional parameters determine how the transitional objects in the transitional scenes vary in time, as the motion is simulated from the first scene to the second scene. Transitional parameters may include alpha blending (transparency), motion blurring, feature morphing, etc. In general, the transitional parameters may be thought of as image processing filters (both 2D and 3D) that are applied over time during the flight of a virtual camera along a path.
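As an illustration of a transitional parameter applied over time, the following minimal sketch alpha-blends a pixel from the first scene with the corresponding pixel from the second scene. The function names `alpha_blend` and `blend_schedule`, the RGB tuple representation, and the linear schedule are assumptions made for this example; they are not taken from the described embodiments.

```python
def alpha_blend(first_pixel, second_pixel, t):
    """Blend two RGB pixels.

    t is the (assumed) transitional parameter in [0, 1]:
    t = 0 shows only the first scene, t = 1 only the second.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    return tuple((1.0 - t) * a + t * b
                 for a, b in zip(first_pixel, second_pixel))


def blend_schedule(num_frames):
    """Evenly spaced blend values, one per transitional image (assumed linear)."""
    if num_frames < 2:
        raise ValueError("need at least two frames")
    return [i / (num_frames - 1) for i in range(num_frames)]


if __name__ == "__main__":
    red, blue = (255, 0, 0), (0, 0, 255)
    for t in blend_schedule(5):
        print(t, alpha_blend(red, blue, t))
```

A non-linear schedule could be substituted for `blend_schedule` to hold the first texture longer or to reveal the second texture earlier.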
Finally, the virtual camera path from the first scene to the second scene may be edited 530. In some embodiments of the invention, the virtual camera path may be linear by default from the acquisition point of the first scene to the acquisition point of the second scene. Alternatively, the virtual camera path may be determined to be an arbitrary path, e.g., a curved path. Further, the speed at which the path is traversed may vary. Furthermore, the viewing direction may point in any direction and may change during the transition from the first scene to the second scene.
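The default straight-line camera path with a varying traversal speed can be sketched as follows. This is only an illustrative sketch: the names `lerp`, `smoothstep`, and `camera_positions`, and the choice of a smoothstep easing curve to vary the speed, are assumptions for the example rather than part of the described embodiments.

```python
def lerp(a, b, s):
    """Linear interpolation between 3D points a and b for s in [0, 1]."""
    return tuple(x + (y - x) * s for x, y in zip(a, b))


def smoothstep(u):
    """Ease-in/ease-out curve: slow at both ends, faster in the middle."""
    return u * u * (3.0 - 2.0 * u)


def camera_positions(start, end, num_frames, ease=None):
    """Virtual camera positions from the first acquisition point to the second.

    With ease=None the path is traversed at constant speed; passing an
    easing function such as smoothstep varies the speed along the path.
    """
    if num_frames < 2:
        raise ValueError("need at least two frames")
    positions = []
    for i in range(num_frames):
        u = i / (num_frames - 1)
        s = ease(u) if ease else u
        positions.append(lerp(start, end, s))
    return positions


if __name__ == "__main__":
    for pos in camera_positions((0, 0, 0), (10, 0, 0), 5, ease=smoothstep):
        print(pos)
```

A curved path could be obtained by replacing `lerp` with a spline evaluation while keeping the same per-frame parameter.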
In an embodiment of the invention, a “perspective rectangle tool” (“PRT”) enables a user to draw “exact” rectangular features on a source image (in perspective) using a constrained user interface. (By “exact,” we mean the measure of each corner angle of the rectangle is 90 degrees in a world plane.)
If we assume that the perspective rectangle on the image plane is an exact rectangle, then we can compute a world plane where the corresponding rectangle is an exact rectangle. We describe next an embodiment of the invention where constraints are applied to the user interface such that the four points clicked on the image plane (via a pointing device on the display surface) will always create an exact perspective rectangle, thereby enabling a world plane to be defined in which the corresponding rectangle is a rectified exact rectangle.
As shown in
Once the four corners of the rectangular feature have been established, any of the corners may be selected with a pointing device and edited. Similar constraints are applied such that any edits to the corner will maintain the exactness of the rectangle. Edges may also be edited while maintaining the exactness requirement. In a specific embodiment of the invention, as illustrated in
A flow diagram of a process for determining a perspective rectangle is shown in
We now describe a 3D graphics-oriented technique to compute vanishing vectors (
In an embodiment of the invention, a corner (e.g., the fourth point) is moved via a pointing device with a click-and-drag command. When the user presses a button on the pointing device, the fourth point is determined, and as the user drags to decide where to place the fourth point, the vanishing vectors are computed and the edges 1-4 and 3-4 are placed such that the exactness constraint remains valid.
As shown in 1045 and 1050, while moving the fourth point, a “control edge” is determined by the user. A “control edge” in this case is either edge 3-4 or 1-4. In a specific embodiment of the invention, different pointing device buttons are used to determine the control edge. Without loss of generality, if the control edge is defined as 3-4, then as the fourth point is moved using a pointing device, the control edge 3-4 is defined by drawing a line from point 3 to the current position of the pointing device. Point 4, which is on the solution curve, lies somewhere on this line. Vanishing vector y may be defined using the technique mentioned above, the two planes being p, v1, v2, and p, v3, m, where m is the current mouse position on the image plane. To compute the orthogonal vanishing vector x, two planes are again intersected, the first plane being p, v2, v3, and the second plane being the dual of vector y. Each vector in 3D space has its dual: an orthogonal plane. The computed x and y are guaranteed to be orthogonal. Finally, intersecting the plane p, v3, m with the line defined by v1+x computes the 3D position of v4. Projecting the 3D point v4 onto the image plane provides the exact position of point 4 while maintaining the exactness constraint.
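The plane-intersection construction described above can be sketched with cross products: the normal of a plane through p and two image points is a cross product, and the intersection direction of two planes is the cross product of their normals. The sketch below is a hedged illustration, not the tool's implementation; the function names, the use of plain tuples, and the assumption that the image plane is z = 1 with the focal point p at the origin are all choices made for this example.

```python
def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def fourth_corner(p, v1, v2, v3, m):
    """Return (v4, x, y) for control edge 3-4.

    p is the focal point; v1, v2, v3 and m are 3D points on the image plane.
    """
    n12 = cross(sub(v1, p), sub(v2, p))   # normal of plane p, v1, v2
    n3m = cross(sub(v3, p), sub(m, p))    # normal of plane p, v3, m
    y = cross(n12, n3m)                   # vanishing vector y (edges 1-2, 3-4)
    n23 = cross(sub(v2, p), sub(v3, p))   # normal of plane p, v2, v3
    x = cross(n23, y)                     # in plane p, v2, v3 and in the dual plane of y
    denom = dot(n3m, x)
    if abs(denom) < 1e-12:
        raise ValueError("degenerate configuration")
    # Intersect the line v1 + t*x with the plane p, v3, m.
    t = dot(n3m, sub(p, v1)) / denom
    v4 = tuple(a + t * b for a, b in zip(v1, x))
    return v4, x, y


def project_to_image(p, point, image_z=1.0):
    """Project a 3D point through p onto the (assumed) image plane z = image_z."""
    s = (image_z - p[2]) / (point[2] - p[2])
    return tuple(a + s * (b - a) for a, b in zip(p, point))
```

For a quick check, project the corners of a known 3D rectangle, pick m on the image line through points 3 and 4, and confirm that the projection of the returned v4 coincides with the projected fourth corner.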
In a specific embodiment of the invention, acquisition camera pose estimation may be computed by corresponding rectangular features in a first scene and a second scene by using PRT.
Once corresponding features have been selected, a solution for the extrinsics of the acquisition points (camera pose) relative to each other may be computed. This solution involves maintaining the first scene static while rotating and translating the second scene, so that the rectangular feature in the second scene matches in direction, size and placement the corresponding feature in the first scene. From these operations, the relative positions and orientations of the two scenes in world space may be determined. Thus, the transformation mapping the first scene into the second scene using the rectangular features may be determined.
The rotation needed to align the second scene to the first scene is determined from the normals of the respective world planes. PRT defines first and second world planes from the corresponding rectangular features, and each plane has its dual, a normal. As discussed before, each rectangular feature in the world plane provides a pair of parallel lines that meet at a vanishing point (via PRT). Similarly to
The translation step is a two-step process. The first step involves reducing the translation solution space to a one-dimensional problem; and the second step then computes the solution in the one-dimensional space (
Next, the centroid of each PRT rectangle is computed. To compute the centroid, we first place the world planes at an arbitrary distance from the acquisition position. The four corners of the rectangle are then projected onto the plane. The four projected points, which are now specific points in 3D space, are averaged to compute the centroid. The centroid of the second PRT rectangle is then translated to match the centroid of the first PRT rectangle. As shown in
The line that goes through the centroid (now commonly shared point in space) to the new position of the viewpoint for the second panorama position is the one-dimensional solution space 1660. We call this the “solution line.” Moving the second scene position along the solution line means the projected rectangle on the common world plane changes in size, i.e., area. The final step, a translation along the solution line, is illustrated 1670. The second translation, 1670, matches the areas of the PRT rectangles in the world plane.
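The centroid step described above (project the four corners onto a world plane placed at an arbitrary distance, average them, then translate one centroid onto the other) can be sketched as follows. The function names, the representation of each corner as a view direction from the acquisition position, and the plane parameterization are assumptions made for this illustration.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def ray_plane_hit(origin, direction, normal, distance):
    """Point where origin + t*direction meets the plane normal . (x - origin) = distance."""
    denom = dot(normal, direction)
    if abs(denom) < 1e-12:
        raise ValueError("ray is parallel to the world plane")
    t = distance / denom
    return tuple(o + t * d for o, d in zip(origin, direction))


def rectangle_centroid(origin, corner_dirs, normal, distance):
    """Average of the four corner rays projected onto the world plane."""
    hits = [ray_plane_hit(origin, d, normal, distance) for d in corner_dirs]
    return tuple(sum(c) / len(hits) for c in zip(*hits))


def translation_to_match(first_centroid, second_centroid):
    """Translation that moves the second centroid onto the first."""
    return tuple(a - b for a, b in zip(first_centroid, second_centroid))


if __name__ == "__main__":
    dirs = [(1, 1, 1), (-1, 1, 1), (-1, -1, 1), (1, -1, 1)]
    c = rectangle_centroid((0, 0, 0), dirs, (0, 0, 1), 2.0)
    print(c)
```

After this translation the two rectangles share a centroid, which leaves only the one-dimensional search along the solution line.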
The exact solution is now computed by matching the area of the rectangle of the second panorama to that of the first panorama.
Computing the distance hd determines the final translation position. Equation (1) shows the length of hd, where it is the hypotenuse of a right triangle, and rd and bd are opposite and adjacent sides, respectively. Equation (2) shows how to compute the orthogonal distance to the normal plane rd, where Ad and As are areas of the projected rectangles of second and first panoramas onto the world plane, respectively. By computing hd, we are computing the distance from c to pd, such that the projected areas of the first and second PRT rectangles are the same.
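Written out from the right-triangle relationship stated above (hd the hypotenuse, rd and bd the other two sides), the length of hd takes the following form. This is a restatement of that stated geometry only; the area-based expression for rd in Equation (2) is not reproduced here.

```latex
h_d = \sqrt{r_d^{2} + b_d^{2}}
```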
In another embodiment of the invention, multiple pairs of rectangles may be corresponded to further improve the alignment. This is done by using the weighted average of each solution position of the second panorama positions. There are two aspects of the user-specified rectangle to consider: the angle and the size of the user-specified rectangles. The final position of the second panorama is determined by:
where k is the number of corresponded rectangle pairs, variable j is for second panorama and first panorama rectangles, ni,j is the normal of the rectangle, vi,j is the unit view vector from the acquisition position to the center of the rectangle (in 3D space), Ai,j is the solid angle of the projected rectangle subtended on a unit sphere, and pi is the solution position of the second panorama computed from our alignment algorithm.
More intuitively, (ni,j·vi,j) considers the angle of the rectangle as seen from the acquisition position—the more grazing the angle, the less confidence that the user-specified rectangle is correct. The size of the rectangle is also considered, since with a larger relative rectangle, user errors are less likely.
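The weighting equation itself is not reproduced in this text, so the following is only a minimal sketch of a confidence-weighted average of solution positions. It assumes, purely for illustration, that the confidence of one rectangle is the magnitude of (n·v) multiplied by the solid angle A; the function names `rectangle_confidence` and `weighted_position` are hypothetical, and the actual weighting may combine the terms differently.

```python
import numpy as np

def rectangle_confidence(normal, view, solid_angle):
    """Assumed confidence term: |n . v| * A (illustrative, not the original formula)."""
    n = np.asarray(normal, dtype=float)
    v = np.asarray(view, dtype=float)
    n = n / np.linalg.norm(n)
    v = v / np.linalg.norm(v)
    # A grazing view gives |n . v| near 0, hence low confidence.
    return abs(float(np.dot(n, v))) * float(solid_angle)

def weighted_position(positions, weights):
    """Weighted average of the per-pair solution positions p_i."""
    p = np.asarray(positions, dtype=float)
    w = np.asarray(weights, dtype=float)
    if w.sum() <= 0.0:
        raise ValueError("at least one weight must be positive")
    return (w[:, None] * p).sum(axis=0) / w.sum()
```

With this sketch, a rectangle seen head-on and covering a large solid angle pulls the final position toward its own solution, while a rectangle seen at a grazing angle contributes almost nothing.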
In preferred embodiments of the invention, once the camera pose has been estimated, transitional objects may then be modeled. As mentioned above, transitional objects are transient objects created for simulating motion from a first scene to a second scene.
In a specific embodiment of the invention, three-dimensional geometry and projective texture mapping may be used to create transitional objects, similar to those described in U.S. patent application Ser. No. 10/780,500, entitled “Modeling and Editing Image Panoramas.” In such techniques, a single merged texture map is used for each geometry, where the respective texture may be created from a blend of multiple source images.
In a specific embodiment of the invention, two textures may be stored for each geometry (or geometric element)—one texture from the first scene, and the other texture from the second scene. During the transition from the first scene to the second scene, these textures may also transition—i.e., alpha blending (i.e. transparency), morphing, motion blurring, and other types of image processing may be applied to the scenes, according to transitional parameters. (Transitional parameters are discussed in detail below.)
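As an illustration of the two-texture idea, the following is a minimal sketch of alpha blending between the first-scene and second-scene textures of one geometry. The function name `blend_textures` and the convention that `alpha` runs from 0 (first scene) to 1 (second scene) are assumptions made for this example.

```python
import numpy as np

def blend_textures(first_texture, second_texture, alpha):
    """Linear alpha blend: alpha=0 gives the first texture, alpha=1 the second."""
    first = np.asarray(first_texture, dtype=float)
    second = np.asarray(second_texture, dtype=float)
    if first.shape != second.shape:
        raise ValueError("textures must have the same shape")
    # Clamp so out-of-range parameter values cannot over- or under-shoot.
    a = float(np.clip(alpha, 0.0, 1.0))
    return (1.0 - a) * first + a * second
```

During a transition, `alpha` would be driven by the transitional parameters, so that the first scene's texture gradually gives way to the second's.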
The transitional object modeling tool may also be used for non-planar geometries. Various 3D primitives, such as cubes, spheres, and cylinders, may be modeled. Triangle meshes and analytical geometric descriptions may also be modeled, coupled with projective texture mapping. Furthermore, transitional objects that do not have corresponding views may be modeled (as is described below). Oftentimes, due to the complexity of scenes, not every feature is visible in both scenes. In this case, the geometry may still be modeled, but there may be only a single texture, either from the first scene or from the second scene.
In a preferred embodiment of the invention, the transition from the first to the second scene is modeled using a “virtual camera.”
In a specific embodiment of the invention, a user interface provides for interactive editing of transitional parameters.
In specific embodiments of the invention, transitional parameters may include: alpha blending, motion blurring, morphing, saturation change, camera speed, and camera XY-offset factors. Other transitional parameters may be defined as desired. In general, any type of image processing filter or algorithm for both 2D and 3D may be applied to the transitional images, and transitional parameters may be entered to control the filters or algorithms as a function of time (or position) along the path.
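One simple way to express a transitional parameter as a function of position along the path is a set of keyframes with linear interpolation between them. The sketch below assumes that representation; the keyframe format, the normalized path position in [0, 1], and the name `evaluate_parameter` are hypothetical choices for this example.

```python
import numpy as np

def evaluate_parameter(keyframes, t):
    """Piecewise-linear value of a transitional parameter at path position t.

    keyframes: iterable of (position, value) pairs, position in [0, 1].
    Positions outside the keyframed range take the nearest end value.
    """
    positions, values = zip(*sorted(keyframes))
    return float(np.interp(t, positions, values))

# Example: alpha ramps up over the first half of the path, then holds.
alpha_keys = [(0.0, 0.0), (0.5, 1.0), (1.0, 1.0)]
alpha_at_quarter = evaluate_parameter(alpha_keys, 0.25)
```

The same function could drive any scalar parameter named above, such as blur strength, saturation, or camera speed.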
An intermediate image (or images) taken between two scenes (images or panoramas) may be used as a further source image in conjunction with these alpha blending, motion blurring, morphing, etc. techniques to improve the appearance of a transition between a first scene and a second scene. For example, on the path between a first panorama and a second panorama, there may be several ordinary images (i.e., images that are not necessarily panoramic) available. These images can be used as intermediate points for the alpha blending, motion blurring, morphing, etc., to create an even more visually convincing transition between the two panoramas.
Morphing for a transitional object requires additional feature correspondences as compared to other techniques, such as alpha-blending, motion blurring, etc.
Examples of inter-scene transitions created with embodiments of the present invention are shown below for a variety of scene types. These examples show the importance of transitional parameters to alleviate the necessity of precision in pose estimation for traditional vision and computer graphics problems.
The next example is of two scenes that do not have exact features to correspond.
The final example is shown in
In embodiments of the invention, once the inter-scene motion has been created, the scenes may be populated with artificial entities that interact with the user—called “active elements.” Typically, active elements are activated through a pointing device. Other methods of active element activation are described below.
One of the most important active elements is called a “navigational icon.” A navigational icon activates motion within scenes, such as from a first scene to a second scene.
Navigational icons can play an important role in viewing scenes, enabling the user to visually understand that once a navigational icon is activated, inter-scene motion is triggered. This consistency in “visual language” is an important concept, especially in virtual environments. Furthermore, the navigational icon enables a complex network of inter-scene motions, not only between two scenes in one direction, but potentially among thousands of scenes interconnected in multiple directions. An example of such a “supertour” with city-scale inter-scene connections is shown below.
Active elements are inserted into the scene with correct perspectives. This is done via an embodiment of the invention called the “Active Element Creator” (“AEC”) that enables the user to determine existing planar perspectives in the scene, and then create and edit layers of information into the scene.
Note that defining image-plane and world-plane rectangles that correspond to each other not only creates rectangles, but also creates a one-to-one mapping between the two coordinate systems, x-y and x′-y′.
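A one-to-one mapping between two planar coordinate systems defined by four corresponding corners is commonly represented as a planar homography. The text does not say which representation is used, so the sketch below is an assumption: it solves for a 3×3 homography from the four corner correspondences and maps a point from x-y to x′-y′. The names `homography_from_corners` and `map_point` are hypothetical.

```python
import numpy as np

def homography_from_corners(src, dst):
    """Solve for H (3x3, H[2,2]=1) such that dst ~ H @ src for 4 corner pairs."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h11 x + h12 y + h13) / (h31 x + h32 y + 1), and likewise for v.
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.extend([u, v])
    h = np.linalg.solve(np.array(rows, dtype=float), np.array(rhs, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)

def map_point(H, point):
    """Map an (x, y) point through H and divide out the homogeneous coordinate."""
    x, y, w = H @ np.array([point[0], point[1], 1.0])
    return (x / w, y / w)
```

The inverse mapping, from x′-y′ back to x-y, is obtained by applying `np.linalg.inv(H)` in the same way, which is what makes the correspondence one-to-one.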
In embodiments of the invention, a supertour is created including a complex network of scenes, inter-scene motions, active elements, and overview maps.
In some embodiments of the invention, a scene viewer, which shows perspective images or panoramas, is coupled with an overview map viewer.
In various embodiments of the invention, a method provides a means to “script” a series of scenes and transitions to play in sequence. In a supertour, a user typically invokes a transition from one scene to another by activating a navigational icon using a pointing device. Scripting may be thought of as a means to “record” a supertour path through multiple scenes and their corresponding inter-scene motions, and “play” the pre-determined path once invoked by the user. The scripted path may be a user-recorded path, or may be algorithmically determined, e.g. a shortest driving direction between two points in a city, according to specific embodiments of the invention. This is different from using additional source images to create a transition; scripts may be dynamically customized on the fly.
For instance, assume scenes “A” through “Z” exist in the supertour. Scene “A” is connected to “Z” only via intermediate scenes (corresponding to intermediate locations), “B” through “Y.” If the current scene is “A” and a user selects a navigational icon “Z” on the overview map, a script may be triggered that plays the scenes and the inter-scene motions from “A” through to “Z” automatically and sequentially, such that the user may have a continuous and connected experience.
In specific embodiments of the invention, for the automatic playing necessary for scripting, as well as for simple navigation through navigational icons, scene viewers provide what we call “orientation matching.” The scene viewer automatically aligns itself to the starting orientation of its connected inter-scene motion. For example, while traversing from scene “A” to scene “Z,” the user comes to an intersection scene, where a turn is necessary. The orientation matching feature automatically turns the viewer to align to the next inter-scene motion, and then triggers the transition.
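An algorithmically determined script of this kind can be sketched as a path search over the network of connected scenes. The example below assumes the supertour is available as an adjacency mapping from each scene to the scenes it has inter-scene motions to, and uses breadth-first search to find a route with the fewest transitions. The name `script_path` and the data layout are hypothetical.

```python
from collections import deque

def script_path(connections, start, goal):
    """Return the list of scenes from start to goal with the fewest transitions,
    or None if the goal cannot be reached."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        scene = queue.popleft()
        if scene == goal:
            path = []
            while scene is not None:
                path.append(scene)
                scene = previous[scene]
            return path[::-1]
        for neighbor in connections.get(scene, ()):
            if neighbor not in previous:
                previous[neighbor] = scene
                queue.append(neighbor)
    return None

# Small chain of scenes standing in for "A" through "Z".
tour = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
route = script_path(tour, "A", "D")
```

A viewer could then play the inter-scene motion for each consecutive pair in `route`. A weighted search (for example, by travel distance) would be a natural substitute where the shortest driving route is wanted.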
Also, in embodiments of the invention, at each given panoramic scene, the user can interactively change the viewing orientation using a pointing device. To smoothly and seamlessly transition from one scene to another, it is preferable that the user's viewing orientation first match the beginning of the transitional image, and then initiate the transition from the first to the second scene. This feature is especially useful for transitional images in the form of pre-rendered movies, since the panorama viewing orientation should be aligned to the first frame of the transitional movie to provide a seamless experience to the end user.
In an embodiment of the invention, a data structure is implemented for each pair of connected source images and their respective directional transitional image, where the orientation angles (θ,φ)1 are the zenith and azimuth angles of the first scene, and the orientation angles (θ,φ)2 are the zenith and azimuth angles of the second scene that match the first and last frames of the transitional image, respectively. These orientation-matching data are stored during the inter-scene motion authoring process. In accordance with an embodiment of the invention, the transitional images are created in a three-dimensional system, so it is easy to determine the exact view orientation of the virtual camera along the transitional image's path.
In an embodiment of the invention, once a transition from the first scene to the second scene has been triggered, e.g., via a navigational icon, a panorama viewer is provided that automatically reorients the view of the first scene from any given arbitrary viewpoint (θ′,φ′)1 to match (θ,φ)1 via interpolation of the view angles. Once (θ′,φ′)1=(θ,φ)1, the viewer renders the transitional image to simulate smooth motion to the second scene. On reaching the second scene, the viewer transitions from displaying the transitional image to the second scene's panorama, which is oriented to the viewing angle (θ,φ)2 for a smooth and seamless transition.
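The orientation-matching record and the reorientation step can be sketched as follows. This is an illustrative assumption rather than the stored format: angles are taken to be in degrees, the record type `TransitionOrientation` and the function `interpolate_view` are hypothetical names, and the azimuth is interpolated along the shorter way around the circle.

```python
from dataclasses import dataclass
from typing import Tuple

Angles = Tuple[float, float]  # (zenith theta, azimuth phi), degrees (assumed)

@dataclass
class TransitionOrientation:
    """Orientations matching the first and last frames of a transitional image."""
    first_scene: Angles   # (theta, phi)1
    second_scene: Angles  # (theta, phi)2

def interpolate_view(current: Angles, target: Angles, t: float) -> Angles:
    """Interpolate from current to target for t in [0, 1]."""
    theta0, phi0 = current
    theta1, phi1 = target
    # Signed azimuth difference wrapped into [-180, 180) so the viewer
    # turns the short way around.
    delta_phi = (phi1 - phi0 + 180.0) % 360.0 - 180.0
    theta = theta0 + t * (theta1 - theta0)
    phi = (phi0 + t * delta_phi) % 360.0
    return (theta, phi)
```

A viewer would step `t` from 0 to 1 to bring the user's arbitrary view to `first_scene`, play the transitional image, and then show the second panorama at `second_scene`.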
In these examples, it may seem as though all the scenes are connected to each other in an “absolute” sense. In other words, the multiple scenes displayed on the overview map and the scene viewer may seem like they are all positioned correctly with each other's position and orientation in world space. In embodiments of the present invention, supertours are created using only relative pose estimation between pairs of source images. This approach contrasts with many vision research and image-based modeling systems, in which it is important to compute as precise a pose estimation as possible via feature correspondences among source images. This is a complex optimization problem, and is more difficult and error-prone as the number of source images increases.
For example, in a simple scenario, assume there are three input source images, A, B, and C, that share corresponding features, e.g., the photographs are taken around a building, and each pair shares common features, e.g., A-with-B, B-with-C, and C-with-A. Typical vision systems compute the camera pose of B relative to A, then compute the camera pose of C relative to B, etc. The computation error from A-to-B pose estimation would naturally propagate to the pose estimation of B-to-C, since all source images reside in the same “absolute” coordinate system. If there are feature correspondences between C and A, then it is necessary to have a global optimization algorithm to “spread” and lessen the error propagation. Note that due to A-to-B and B-to-C pose estimation, A and C already have their positions set in an absolute coordinate system. Trying to then compute the pose of A from C will naturally create more pose estimation errors. In more complex scenarios, e.g., real-world data, such a system of coupled optimization problems is difficult to solve, often lacks robustness, and, once an error is introduced, is difficult to “debug.”
In embodiments of the present invention, supertours are created using relative pose estimation only between pairs of source images. In other words, pose estimation for each pair of source images resides in relative coordinate systems. There is no need for global optimization, since the pose estimation problem is determined for each pair of source images. For the simplistic scenario of input source images A, B, and C, a supertour requires only approximate pose estimations between A-to-B, B-to-C, and C-to-A, all of which are computed separately regardless of the error in each computation. This embodiment allows the user to smoothly and continuously “move” from one source image to another. Therefore, from the viewpoint of scene A, the inter-scene transition simulates motion from A-to-B, and then ends up in scene B. Once reaching scene B, the coordinate system may change (which is seamless to the user). Then simulating motion from B-to-C may be performed separately from pose estimation of A-to-B, regardless of its computation errors. This approach advantageously reduces computing complexity and opportunities for errors, allowing supertour embodiments to scale up more easily as the number of nodes increases.
In preferred embodiments of the invention, the final process as shown in the overview flow diagram (
In various embodiments of the invention, a method provides a transition, in a computer system having a display, that simulates motion between a first scene and a second scene. The method includes receiving an indication of a viewpoint in the second scene towards which a transition is to be made. The indication may be received from a variety of sources. For example, the indication may be produced by entering search parameters into a search engine, and the search engine may identify the location. The indication may be received upon activation of an icon anywhere on the display; the icon need not be located on a plan view map or a panorama viewer. When the location is received, a transitional image or a series of such images is displayed simulating motion toward the location. In a further example, a list of locations may be presented on the screen and the indication is received based on selection of an item in the list.
Any of the above described embodiments of the invention may be implemented in a system that includes a computer or other type of processor. The computer or processor includes memory for instructions implementing the method steps. The computer or processor is coupled to a display device for displaying output and may be coupled to one or more input devices for receiving input from users. Instructions implementing the method may be executed on a single processor or multiple processors. Processors may be organized in a client-server fashion. Multiple processors may be connected by public or private communication systems of any type known in the art. Such communication systems may include, without limitation, data networks as are known in the art, such as the internet, using both wired and wireless link-level and physical media, point-to-point communication means, such as the public telephone system, satellite links, a T1 line, a microwave link, a wire line or a radio link, etc. Display devices used in the system may be of any type suitable for providing graphical displays. Displays may be directed from any processor to any display surface and multiple display surfaces may be employed in embodiments of the invention. Input devices for receiving inputs from users may take diverse forms including, without limitation, a keyboard, a pointing device, such as a trackball or mouse or touchpad, etc.
Systems according to embodiments of the invention may be described by the following clauses:
A system for creating a transition between a first scene and a second scene simulating motion, the first scene observed from a first viewpoint and including a first feature, and the second scene observed from a second viewpoint and including a second feature, the system comprising: a computer including a processor, memory and a display, the memory containing instructions that cause the computer to:
Computer program products according to embodiments of the invention may be described by the following clauses:
A computer program product for use on a computer system for creating a transition between a first scene and a second scene simulating motion, the first scene observed from a first viewpoint and including a first feature, and the second scene observed from a second viewpoint and including a second feature, the computer program product comprising a computer usable medium having computer readable program code thereon, the computer readable program code including program code for:
Additional computer program product embodiments of the invention may be described by adding program code steps according to the below listed method claims for the processor to execute.
Computer program logic implementing all or part of the functionality previously described herein may be embodied in various forms, including, but in no way limited to, a source code form, a computer executable form, and various intermediate forms (e.g., forms generated by an assembler, compiler, linker, or locator). Source code may include a series of computer program instructions implemented in any of various programming languages (e.g., an object code, an assembly language, or a high-level language such as Fortran, C, C++, JAVA, or HTML) for use with various operating systems or operating environments. The source code may define and use various data structures and communication messages. The source code may be in a computer executable form (e.g., via an interpreter), or the source code may be converted (e.g., via a translator, assembler, or compiler) into a computer executable form.
The computer program may be fixed in any form (e.g., source code form, computer executable form, or an intermediate form) either permanently or transitorily in a tangible storage medium, such as a semiconductor memory device (e.g., a RAM, ROM, PROM, EEPROM, or Flash-Programmable RAM), a magnetic memory device (e.g., a diskette or fixed disk), an optical memory device (e.g., a CD-ROM), a PC card (e.g., PCMCIA card), or other memory device. The computer program may be fixed in any form in a signal that is transmittable to a computer using any of various communication technologies, including, but in no way limited to, analog technologies, digital technologies, optical technologies, wireless technologies, networking technologies, and internetworking technologies. The computer program may be distributed in any form as a removable storage medium with accompanying printed or electronic documentation (e.g., shrink wrapped software or a magnetic tape), preloaded with a computer system (e.g., on system ROM or fixed disk), or distributed from a server or electronic bulletin board over the communication system (e.g., the Internet or World Wide Web.)
Hardware logic (including programmable logic for use with a programmable logic device) implementing all or part of the functionality previously described herein may be designed using traditional manual methods, or may be designed, captured, simulated, or documented electronically using various tools, such as Computer Aided Design (CAD), a hardware description language (e.g., VHDL or AHDL), or a PLD programming language (e.g., PALASM, ABEL, or CUPL.)
While the invention has been particularly shown and described with reference to specific embodiments, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention as defined by the appended claims. As will be apparent to those skilled in the art, techniques described above for panoramas may be applied to images that have been captured as non-panoramic images, and vice versa.
Executed on | Assignor | Assignee | Conveyance | Frame | Reel | Doc |
Feb 13 2006 | OH, BYONG MOK | MOK3, INC | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 049524 | /0717 | |
Feb 28 2008 | MOK3, INC | EVERYSCAPE, INC | CHANGE OF NAME SEE DOCUMENT FOR DETAILS | 049528 | /0718 | |
May 23 2019 | Smarter Systems, Inc. | (assignment on the face of the patent) | / | |||
Sep 04 2019 | EVERYSCAPE, INC | SMARTER SYSTEMS, INC | PATENT ASSIGNMENT AGREEMENT | 050488 | /0061 |