A dialog supporting apparatus which can support an on-going dialog so that the dialog is smoothly completed. The dialog supporting apparatus includes: an utterance receiving unit which receives an utterance of a dialog participant and outputs utterance information; an utterance processing unit which translates the utterance identified by the utterance information; an utterance output unit which outputs the translated utterance information; a dialog history database; and an utterance prediction unit which generates first utterance prediction information based on the dialog history stored in the dialog history database, obtains second utterance prediction information from the other dialog supporting apparatus, and predicts a next utterance of the dialog participant based on the first utterance prediction information and the second utterance prediction information.
15. A dialog supporting method, performed by a dialog supporting apparatus, for supporting a present dialog between dialog participants, said dialog supporting method comprising:
generating first utterance prediction information based on utterances included in a previous dialog of a first dialog participant;
obtaining, from another dialog supporting apparatus, second utterance prediction information including utterances included in a previous dialog of a second dialog participant;
predicting a next utterance in the present dialog of the first dialog participant using the dialog supporting apparatus, the next utterance being predicted based on an utterance that (i) matches an opening utterance included in the second utterance prediction information, and (ii) is a replacement for an opening utterance included in the first utterance prediction information, the utterance that is the replacement for the opening utterance included in the first utterance prediction information being used for the prediction of the next utterance when a first utterance included in the first prediction information and a second utterance included in the second prediction information do not match by comparison; and
displaying the predicted next utterance.
16. A non-transitory computer-readable recording medium storing a program thereon, the program for supporting a present dialog between dialog participants, said program causing a computer, as a dialog supporting apparatus, to execute a dialog supporting method comprising:
generating first utterance prediction information based on utterances included in a previous dialog of a first dialog participant;
obtaining, from another dialog supporting apparatus, second utterance prediction information including utterances included in a previous dialog of a second dialog participant;
predicting a next utterance in the present dialog of the first dialog participant using the dialog supporting apparatus, the next utterance being predicted based on an utterance that (i) matches an opening utterance included in the second utterance prediction information, and (ii) is a replacement for an opening utterance included in the first utterance prediction information, the utterance that is the replacement for the opening utterance included in the first utterance prediction information being used for the prediction of the next utterance when a first utterance included in the first prediction information and a second utterance included in the second prediction information do not match by comparison; and
displaying the predicted next utterance.
1. A dialog supporting apparatus which supports a present dialog between dialog participants, said dialog supporting apparatus comprising:
a dialog history database for storing a previous dialog of a first dialog participant, the previous dialog of the first dialog participant including utterances;
an utterance prediction unit for (a) generating first utterance prediction information based on the utterances of the previous dialog stored in said dialog history database, (b) obtaining, from another dialog supporting apparatus, second utterance prediction information including utterances of a previous dialog of a second dialog participant, and (c) predicting a next utterance in the present dialog of the first dialog participant using said dialog supporting apparatus, such that the next utterance is predicted based on an utterance that (i) matches an opening utterance included in the second utterance prediction information, and (ii) is a replacement for an opening utterance included in the first utterance prediction information, the utterance that is the replacement for the opening utterance included in the first utterance prediction information being used for the prediction of the next utterance when a first utterance included in the first prediction information and a second utterance included in the second prediction information do not match by comparison; and
a display unit for displaying the next utterance predicted by said utterance prediction unit.
14. A dialog supporting system for supporting a present dialog between dialog participants, said dialog supporting system comprising:
a first dialog supporting apparatus; and
a second dialog supporting apparatus,
wherein said first dialog supporting apparatus includes:
a first dialog history database for storing a previous dialog of a first dialog participant, the previous dialog of the first dialog participant including utterances;
a first utterance prediction unit for (a) generating first utterance prediction information based on the utterances of the previous dialog stored in said first dialog history database, (b) obtaining, from said second dialog supporting apparatus, second utterance prediction information including utterances included in a previous dialog of a second dialog participant, and (c) predicting a next utterance in the present dialog of the first dialog participant using said first dialog supporting apparatus, such that the next utterance predicted by said first utterance prediction unit is predicted based on an utterance that (i) matches an opening utterance included in the second utterance prediction information, and (ii) is a replacement for an opening utterance included in the first utterance prediction information, the utterance that is the replacement for the opening utterance included in the first utterance prediction information being used for the prediction of the next utterance when a first utterance included in the first prediction information and a second utterance included in the second prediction information do not match by comparison; and
a first display unit for displaying the next utterance predicted by said first utterance prediction unit, and
wherein said second dialog supporting apparatus includes:
a second dialog history database for storing the previous dialog of a second dialog participant;
a second utterance prediction unit for (a) generating the second utterance prediction information based on the utterances of the previous dialog stored in said second dialog history database, (b) obtaining, from said first dialog supporting apparatus, the first utterance prediction information including the utterances included in the previous dialog of the first dialog participant, and (c) predicting a next utterance in the present dialog of the second dialog participant using said second dialog supporting apparatus, such that the next utterance predicted by said second utterance prediction unit is predicted based on an utterance that (i) matches the opening utterance included in the first utterance prediction information, and (ii) is the replacement for the opening utterance included in the second utterance prediction information, the utterance that is the replacement for the opening utterance included in the second utterance prediction information being used for the prediction of the next utterance when the second utterance included in the second prediction information and the first utterance included in the first prediction information do not match by comparison; and
a second display unit for displaying the next utterance predicted by said second utterance prediction unit.
2. The dialog supporting apparatus according to
wherein said utterance prediction unit transmits the generated first utterance prediction information to the other dialog supporting apparatus.
3. The dialog supporting apparatus according to
wherein said dialog history database stores a plurality of previous dialogs, each previous dialog of the plurality of previous dialogs including utterances, and
wherein said utterance prediction unit extracts, from the plurality of previous dialogs, a previous dialog most similar to the present dialog, such that the generated first utterance prediction information is based on the utterances of the extracted previous dialog.
4. The dialog supporting apparatus according to
wherein said utterance prediction unit generates a first prediction stack based on an assembly of successive utterances included in the first utterance prediction information and based on an assembly of successive utterances included in the second utterance prediction information obtained from the other dialog supporting apparatus.
5. The dialog supporting apparatus according to
wherein said utterance prediction unit predicts the next utterance of the first dialog participant as an opening utterance of the first prediction stack.
6. The dialog supporting apparatus according to
wherein, when an utterance of the first dialog participant or an utterance of the second dialog participant appears in an assembly of utterances included in the first prediction stack, said utterance prediction unit moves the assembly including the utterance of the first dialog participant or the utterance of the second dialog participant to an opening part of the first prediction stack and deletes, from the first prediction stack, any utterances previously located ahead of the assembly moved to the opening part of the first prediction stack.
7. The dialog supporting apparatus according to
wherein said utterance prediction unit adjusts a number of utterances included in the first utterance prediction information and a number of utterances included in the second utterance prediction information to a same number using dynamic programming.
8. The dialog supporting apparatus according to
an utterance receiving unit for receiving an utterance of the first dialog participant;
an utterance processing unit for transforming the utterance received by said utterance receiving unit into another utterance form; and
an utterance output unit for outputting the utterance of the other utterance form transformed by said utterance processing unit.
9. The dialog supporting apparatus according to
wherein said utterance receiving unit performs speech recognition of speech inputted thereto, after narrowing down a speech recognition dictionary to one of predicted utterances predicted by said utterance prediction unit, sentences which are similar to the predicted utterances, words included in the predicted utterances, and words associated with the predicted utterances, and receives a result of the speech recognition as the utterance of the first dialog participant.
10. The dialog supporting apparatus according to
wherein said utterance receiving unit receives the predicted next utterance as the utterance of the first dialog participant, when the predicted next utterance is selected by the first dialog participant.
11. The dialog supporting apparatus according to
wherein said utterance prediction unit predicts a development of the utterances made by the first and second dialog participants, based on the first utterance prediction information and the second utterance prediction information, and displays the predicted development on said display unit.
12. The dialog supporting apparatus according to
wherein said utterance prediction unit transmits the predicted development of the utterances to the other dialog supporting apparatus.
13. The dialog supporting apparatus according to
This is a divisional application of U.S. application Ser. No. 11/353,199, filed Feb. 14, 2006, now U.S. Pat. No. 7,346,515, which is a continuation application of PCT application No. PCT/JP2005/018426, filed Oct. 5, 2005, designating the United States of America.
(1) Field of the Invention
The present invention relates to a dialog supporting apparatus which supports an on-going dialog between people.
(2) Description of the Related Art
Conventionally, translation devices have been developed with the purpose of supporting dialogs in different languages, such as those between travelers and local people at overseas travel destinations. A representative example is a translation apparatus obtained by providing a translation scheme, based on the text translation of example sentences and example usages, on a small information processing apparatus such as a PDA (Personal Digital Assistant). Such an apparatus is provided with thousands of example usages in order to cover general travel conversation, and requires a user to select a desired example usage by viewing a list of them. Hence, the apparatus has a usability problem in actual use. This problem is especially noticeable when the apparatus has a small display for listing example usages, so that only a few example usages can be viewed at one time. In addition, assuming the general use status of a translation apparatus, example usages corresponding to several sentences must be used in the dialog with the other party in a great many cases, so completing a dialog by means of a translation apparatus takes more time than expected. Therefore, in order to achieve the final purpose of supporting a dialog between people, a supplementary function is needed which enables a user to immediately select a desired example usage from among the large number of example usages.
As a method for solving this problem, there has been provided an approach for narrowing down candidate next utterances of a user using example dialog models or conversation training history corpuses (for example, refer to Japanese Laid-open Patent Application No. 2003-30187).
Narrowing down candidate next utterances based on the past dialog history of the user of the translation device is effective in the case where the utterances of the other party are included in that history. In addition, narrowing down candidate next utterances based on a virtual dialog previously uttered in training, or on typical dialog patterns, is effective in the case where the other party utters in compliance with the dialog pattern expected by the user. However, dialog patterns commonly vary among people. Consider an example case where a traveler starts a dialog with a waiter of a restaurant in order to reserve a table. In response to the traveler's utterance of “I'd like to reserve a table”, one waiter may start the dialog with an utterance relating to the date and time of the reservation, saying “What date and time would you like to reserve a table?”, while another waiter may start the dialog with an utterance relating to the number of people, saying “How many people are in your party?”. Thus, there is a problem that the narrowing-down of candidate utterances fails depending on the other party in the dialog. A further problem is that inappropriate narrowing-down confuses the dialog participants, increasing the time needed to complete the dialog, contrary to the purpose. Especially when traveling in a region where no communication infrastructure is established, such problems must be solved using only the user's translation apparatus, without using any network.
The present invention has been conceived in view of these circumstances. An object of the present invention is to provide a dialog supporting apparatus which can support an on-going dialog so that the dialog is smoothly completed irrespective of who the other party is, even in the case where no network is available.
In order to achieve the above-described object, the dialog supporting apparatus of the present invention supports an on-going dialog made by dialog participants. The dialog supporting apparatus includes: a dialog history database in which a dialog history of one of the dialog participants is stored; and an utterance prediction unit which generates first utterance prediction information based on the dialog history stored in the dialog history database, obtains second utterance prediction information from the other dialog supporting apparatus, and predicts the next utterance in the dialog of the dialog participant who uses the dialog supporting apparatus, based on the first utterance prediction information and the second utterance prediction information.
The dialog supporting apparatus of the present invention enables a user to easily select example usages from among the candidate next utterances of the user. Thus, it eliminates the necessity for the other party to wait, and therefore the dialog supported by the dialog supporting apparatus can be smoothly advanced. In addition, since a candidate next utterance is generated based on only the dialog histories of the user and the other party, there is no need to install information such as typical dialog patterns in the apparatus, and thus it becomes possible to reduce the implementation scale of the whole apparatus.
The disclosure of Japanese Patent Application No. 2004-296776 filed on Oct. 8, 2004 including specification, drawings and claims is incorporated herein by reference in its entirety.
The disclosure of PCT application No. PCT/JP2005/018426, filed Oct. 5, 2005, including specification, drawings and claims is incorporated herein by reference in its entirety.
These and other objects, advantages and features of the invention will become apparent from the following description thereof taken in conjunction with the accompanying drawings that illustrate a specific embodiment of the invention. In the Drawings:
The dialog supporting apparatus of the embodiment of the present invention supports an on-going dialog made by dialog participants. The dialog supporting apparatus includes: a dialog history database in which a dialog history of one of the dialog participants is stored; and an utterance prediction unit which (a) generates first utterance prediction information based on the dialog history stored in the dialog history database, (b) obtains second utterance prediction information from the other dialog supporting apparatus, and (c) predicts the next utterance in the dialog of the dialog participant who uses the dialog supporting apparatus.
This makes it possible to predict the next utterance, based on only the dialog histories of the user and the other party. Therefore, the dialog supporting apparatus can support an on-going dialog so that the dialog is smoothly completed irrespective of the other party in the dialog.
Here, it is preferable that the utterance prediction unit extracts, from among the dialog histories stored in the dialog history database, the dialog history whose situation is most similar to that of the present dialog, and that the extracted dialog history be used as the first utterance prediction information.
In addition, in one aspect of the present invention, it is preferable that the utterance prediction unit of the dialog supporting apparatus generates a prediction stack based on assemblies of successive utterances which are commonly included in the dialog histories included in the first utterance prediction information and the second utterance prediction information, respectively.
This makes it possible to predict the next utterance of one of the dialog participants, based on the dialog history whose dialog situation is most similar.
In another aspect of the present invention, the dialog supporting apparatus may further include: an utterance receiving unit which receives an utterance of one of the dialog participants; an utterance processing unit which transforms the utterance received by the utterance receiving unit into another utterance form; and an utterance output unit which outputs the utterance of the other utterance form transformed by the utterance processing unit.
This makes it possible to support an on-going dialog where different languages such as Japanese and English are used.
Note that the present invention can be realized not only as a dialog supporting apparatus, but also as a dialog supporting method having steps corresponding to the unique units which are provided to the dialog supporting apparatus, and as a program causing a computer to execute these steps. Of course, the program can be distributed through a recording medium such as a CD-ROM and a communication medium such as the Internet.
The embodiment of the present invention will be described with reference to figures.
The dialog supporting apparatus 100 is intended for supporting an on-going dialog made between people. As shown in
The utterance receiving unit 101 receives an utterance of a dialog participant, and outputs utterance information for identifying the utterance. The utterance processing unit 102 transforms the utterance identified by the utterance information outputted by the utterance receiving unit 101 into another utterance form. The utterance output unit 103 outputs the utterance information transformed into the other utterance form as an utterance to the other dialog participant. The dialog history database 104 stores a dialog history in which two or more pieces of past utterance information of the dialog participants are placed in time sequence.
The utterance prediction unit 105 generates first utterance prediction information based on the dialog history stored in the dialog history database 104. In addition, the utterance prediction unit 105 obtains second utterance prediction information from the other dialog supporting apparatus. Further, in a dialog that the dialog participants are about to start, the utterance prediction unit 105 predicts the next utterance of the dialog participant who uses the dialog supporting apparatus 100, based on the first utterance prediction information and the second utterance prediction information. In addition, the utterance prediction unit 105 notifies the other dialog supporting apparatus of the generated first utterance prediction information.
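Before the operation is described, the component structure above can be summarized in a short sketch. This is a minimal illustrative skeleton, not the patent's implementation; the class and method names are assumptions that simply mirror reference numerals 101 to 105.

```python
# A minimal structural sketch of the apparatus; names mirroring the
# reference numerals 101-105 are illustrative assumptions only.
from dataclasses import dataclass, field

@dataclass
class DialogSupportingApparatus:
    dialog_history_db: list = field(default_factory=list)  # 104: past dialogs
    prediction_stack: list = field(default_factory=list)   # working state of 105

    def receive_utterance(self, utterance):        # 101: utterance receiving unit
        """Map a raw utterance to utterance information (e.g. an utterance number)."""
        raise NotImplementedError

    def process_utterance(self, utterance_info):   # 102: utterance processing unit
        """Transform the identified utterance into another utterance form."""
        raise NotImplementedError

    def output_utterance(self, transformed):       # 103: utterance output unit
        print(transformed)

    def predict_next_utterance(self, first_info, second_info):  # 105
        """Predict the user's next utterance from the first and second
        utterance prediction information (see the procedure below)."""
        raise NotImplementedError
```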
The dialog supporting apparatus 100 shown in
Next, an operation performed in the case of supporting an on-going dialog in different languages using the dialog supporting apparatuses configured as described above will be described. It is assumed here that the dialog participant 1, who speaks Japanese, uses a dialog supporting apparatus 100a and the dialog participant 2, who speaks English, uses a dialog supporting apparatus 100b.
The utterance receiving unit 101a transforms the received utterance of the dialog participant 1 into the corresponding utterance information. The utterance information is, for example, an utterance number in
In contrast to the dialog supporting apparatus 100a, the utterance processing unit 102b of the dialog supporting apparatus 100b transforms an utterance of the dialog participant 2 into corresponding utterance information.
The utterance receiving unit 101a allows the dialog participant 1 to directly select an utterance from the list of Japanese utterances in
Here will be described, as shown in
The utterance prediction unit 105a and the utterance prediction unit 105b specify the topic of a dialog first in order to search, respectively, the corresponding dialog history database 104a and the dialog history database 104b for the dialog history needed for performing utterance prediction (Step S601). The utterance prediction unit 105a searches the dialog history database 104a for the needed dialog history, and the utterance prediction unit 105b searches the dialog history database 104b for the needed dialog history. For example, the dialog participant 1 uses the dialog supporting apparatus 100a realized as, for example, a PDA shown in
When a dialog start button 705 and a dialog start button 706 are pressed by the respective dialog participants, the utterance prediction unit 105a selects a dialog history d1 for the dialog participant 1, and the utterance prediction unit 105b selects a dialog history d3 for the dialog participant 2. This is because the topics of the dialog history d1 and the dialog history d3 relate to a hotel. The dialog history d2 and the dialog history d4 are not selected because their topics are different. After that, the utterance prediction unit 105a notifies the utterance prediction unit 105b of the dialog history d1 which is the first utterance prediction information, and the utterance prediction unit 105b notifies the utterance prediction unit 105a of the dialog history d3 which is the second utterance prediction information.
Next, the utterance prediction unit 105a and the utterance prediction unit 105b start to generate prediction stacks using the dialog history d1: E1, J2, E3, J4, E5, J6, E7, E8 and the dialog history d3: E1, E3, J4, J2, E8, E5, J6, E7, respectively (Step S602).
The utterance prediction unit 105a and the utterance prediction unit 105b adjust the numbers of the utterances in the respective dialog histories to the same number using the dynamic programming shown in
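The dynamic programming itself is given in a figure that is not reproduced here, so the following Python sketch rests on assumptions: a standard edit-distance style alignment with cost 0 for a match and 1 for a gap or mismatch, where the empty label φ is inserted opposite each gap so that both utterance strings end up the same length.

```python
# A sketch of the length-adjustment step, assuming an edit-distance
# style dynamic programming (match = 0, gap/mismatch = 1). PHI stands
# for the patent's empty utterance label φ.
PHI = "phi"

def align_histories(a, b):
    """Pad utterance strings a and b with PHI so they have equal length."""
    n, m = len(a), len(b)
    dp = [[0] * (m + 1) for _ in range(n + 1)]  # dp[i][j]: cost of a[:i] vs b[:j]
    for i in range(1, n + 1):
        dp[i][0] = i
    for j in range(1, m + 1):
        dp[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j - 1] + sub,   # match or mismatch
                           dp[i - 1][j] + 1,         # gap in b
                           dp[i][j - 1] + 1)         # gap in a
    out_a, out_b, i, j = [], [], n, m
    while i > 0 or j > 0:                            # trace back, inserting PHI
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + (0 if a[i - 1] == b[j - 1] else 1):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + 1:
            out_a.append(a[i - 1]); out_b.append(PHI); i -= 1
        else:
            out_a.append(PHI); out_b.append(b[j - 1]); j -= 1
    return out_a[::-1], out_b[::-1]

print(align_histories(["E1", "J2", "E3"], ["E1", "E3"]))
# (['E1', 'J2', 'E3'], ['E1', 'phi', 'E3'])
```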
Next, the utterance prediction unit 105a and the utterance prediction unit 105b respectively determine the utterance blocks of the dialog histories. An utterance block is an assembly which corresponds to successive utterances and is commonly included in each history. Here, utterance blocks that include the maximum numbers of utterances are determined, so that the number of utterance blocks included in each dialog history becomes the minimum. In other words, the number of utterances included in the utterance string A (utterance string B) is assumed to be m (Step S902). Next, i is set to 1 (Step S903). Whether or not A[i] is present in the utterance string B is judged (Step S904). Note that A[i] denotes the i-th utterance in the utterance string A. In addition, as for φ, even when A[i]=φ and B[j]=φ, A[i] and B[j] are not regarded as the same. As a result of this judgment, in the case where A[i] is present in the utterance string B (Step S904: YES), that utterance is assumed to be B[j] (Step S905). After that, the maximum n which satisfies the condition that A[i] to A[i+n] are the same as B[j] to B[j+n] is calculated, and each such run is assumed to be a block (Step S906). Next, i is set to i+n+1 (Step S907).
In the case where the judgment of whether or not A[i] is present in the utterance string B shows that A[i] is not present in the utterance string B (Step S904: NO), A[i] by itself is assumed to be a block (Step S908). After that, i is set to i+1 (Step S909).
Next, whether or not i is greater than m is judged (Step S910). In the case where the judgment shows that i is not greater than m (Step S910: NO), the processing steps starting from the judgment of whether or not A[i] is present in the utterance string B (Step S904) are repeated. In the case where the judgment shows that i is greater than m (Step S910: YES), the processing is completed.
In an example case of dialog history d1: E1, J2, E3, J4, E5, J6, E7, E8 and the dialog history d3: E1, E3, J4, J2, E8, E5, J6, E7, the dialog history d1: E1, J2, (E3, J4), (E5, J6, E7), E8 and the dialog history d3: E1, (E3, J4), J2, E8, (E5, J6, E7) are obtained by performing the above operation. Each utterance label enclosed by parentheses corresponds to an utterance block. Note that utterance blocks made up of only a single utterance are not provided with parentheses in order to simplify the transcription. The dialog history d1: E1, J2, (E3, J4), (E5, J6, E7), E8 becomes the prediction stack J: E1, J2, (E3, J4), (E5, J6, E7), E8 of the utterance prediction unit 105a, and the dialog history d3: E1, (E3, J4), J2, E8, (E5, J6, E7) becomes the prediction stack E: E1, (E3, J4), J2, E8, (E5, J6, E7) of the utterance prediction unit 105b.
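A compact Python sketch of this block-determination procedure (Steps S902 to S910) follows; it assumes utterances are compared simply by their labels, and it reproduces the d1/d3 example above as a check.

```python
# A sketch of the block determination of steps S902-S910. Utterances are
# compared by label; PHI entries never match, per the note about φ above.
def to_blocks(a, b, phi="phi"):
    """Split utterance string a into maximal runs shared with b."""
    blocks, i, m = [], 0, len(a)
    while i < m:
        if a[i] != phi and a[i] in b:          # S904: A[i] present in B?
            j = b.index(a[i])                  # S905 (first occurrence assumed)
            n = 0                              # S906: extend the matching run
            while (i + n + 1 < m and j + n + 1 < len(b)
                   and a[i + n + 1] == b[j + n + 1]):
                n += 1
            blocks.append(tuple(a[i:i + n + 1]))
            i += n + 1                         # S907
        else:
            blocks.append((a[i],))             # S908: unmatched utterance
            i += 1                             # S909
    return blocks

d1 = ["E1", "J2", "E3", "J4", "E5", "J6", "E7", "E8"]
d3 = ["E1", "E3", "J4", "J2", "E8", "E5", "J6", "E7"]
print(to_blocks(d1, d3))  # [('E1',), ('J2',), ('E3','J4'), ('E5','J6','E7'), ('E8',)]
print(to_blocks(d3, d1))  # [('E1',), ('E3','J4'), ('J2',), ('E8',), ('E5','J6','E7')]
```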
The processing of judging whether or not two utterances are the same (Steps S901, S904 and S906) is performed in the procedure of generating each prediction stack. It should be noted that the judgment may be made by performing at least one of the following matchings: a matching based on utterance information, which is an utterance number for identifying the utterance; a matching based on a surface expression using natural language processing; and a matching based on content words. In an example case where one of the dialog histories includes the utterance of
and the other dialog history includes the utterance of
it is possible to judge these utterances as included in common in the two dialog histories even in the case where each of them is provided with a different utterance number, because the surface expressions of the respective utterances are similar. This is true of another example case where one of the dialog histories includes the utterance of “Thank you.” and the other dialog history includes the utterance of “Thank you very much.”. Furthermore, here is another case where one of the dialog histories includes the utterance of
and the other dialog history includes the utterance of
In this case, it is possible to judge these utterances as the utterance included in both the two dialog histories as long as both
are defined as content words. This is because many content words are commonly included in both dialog histories. In addition, these judgment methods may be used in combination. Employing flexible utterance matching like this makes it possible to keep the number of utterance blocks small even in cases where matching based only on utterance numbers would increase the number of blocks.
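The following sketch shows how the three matching strategies might be combined in code. The use of difflib for the surface comparison and the 0.6 threshold are illustrative assumptions; the patent names the strategies but does not fix a particular similarity measure.

```python
# A hedged sketch of flexible utterance matching. The 0.6 threshold and
# the use of difflib for surface similarity are illustrative assumptions.
import difflib

def same_utterance(u1, u2, content_words=frozenset(), threshold=0.6):
    # (1) matching based on the utterance number, when both are known
    if u1.get("number") is not None and u1.get("number") == u2.get("number"):
        return True
    s1, s2 = u1.get("text", ""), u2.get("text", "")
    # (2) matching based on the surface expression,
    #     e.g. "Thank you." vs "Thank you very much."
    if s1 and s2 and difflib.SequenceMatcher(None, s1, s2).ratio() >= threshold:
        return True
    # (3) matching based on shared content words
    words1 = set(s1.lower().rstrip(".?!").split()) & content_words
    words2 = set(s2.lower().rstrip(".?!").split()) & content_words
    return bool(words1 & words2)

u_a = {"number": None, "text": "Thank you."}
u_b = {"number": None, "text": "Thank you very much."}
print(same_utterance(u_a, u_b))  # True, via surface similarity
```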
Here, the flow chart shown in
After structuring an utterance prediction stack, the utterance prediction unit 105a and the utterance prediction unit 105b judge whether or not the respective prediction stacks are empty (Step S603). In the case where one of the prediction stacks is empty for the reason that the utterance prediction unit 105a or the utterance prediction unit 105b cannot structure an appropriate prediction stack or for another reason (Step S603: YES), the utterance prediction unit 105a or the utterance prediction unit 105b completes the processing without performing any utterance prediction operation, and follows the operation of the other party's utterance prediction unit which is the utterance prediction unit 105a or the utterance prediction unit 105b.
On the other hand, in the case where the prediction stack is not empty (Step S603: NO), the utterance prediction unit 105a and the utterance prediction unit 105b each displays the opening utterance of the prediction stack as the candidate next utterance (Step S604).
The dialog participant 1 and the dialog participant 2 can select an arbitrary utterance from among all the utterances defined in
In
At this time, the utterance prediction unit 105a and the utterance prediction unit 105b judge whether or not an utterance is inputted by the respective dialog participants (Step S605). When utterances are inputted by the respective dialog participants (Step S605: YES), the utterance prediction unit 105a and the utterance prediction unit 105b search the respective prediction stacks for a matching utterance, starting with the opening utterances (Step S606), and judge whether or not a matching utterance is present (Step S607). In the case where a matching utterance is present (Step S607: YES), the utterance prediction unit 105a and the utterance prediction unit 105b judge whether or not the matching utterance is the opening utterance of the prediction stacks (Step S608). In the case where the matching utterance is the opening utterance (Step S608: YES), they delete the opening utterances in the prediction stacks so as to update the prediction stacks (Step S609). After that, in the case where there emerge utterance blocks which can be combined with each other after the utterance is deleted, they combine the utterance blocks which can be combined in the prediction stacks (Step S611). In contrast, in the case where the matching utterance is not the opening utterance (Step S608: NO), they move the block including the matching utterance to the opening part of the prediction stacks (Step S610), and delete the utterances from the opening utterance to the matching utterance so as to update the prediction stacks. After that, they return to the processing of judging whether or not the prediction stacks are empty (Step S603).
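As a concrete illustration of Steps S606 to S611, the sketch below treats a prediction stack as a list of tuples (utterance blocks). The handling of Step S611 — merging the two opening blocks when the same consecutive utterances open both participants' stacks — is an assumption inferred from the J4/J2 example worked through in the following paragraphs.

```python
# A sketch of the stack update of steps S606-S611; a stack is a list of
# utterance blocks (tuples). The combine rule is an assumption inferred
# from the worked example below.
def update_stack(stack, utterance):
    """Update one prediction stack after `utterance` has been input."""
    for k, block in enumerate(stack):
        if utterance in block:
            if k > 0:                         # S610: move the block forward
                stack.insert(0, stack.pop(k))
            pos = stack[0].index(utterance)
            rest = stack[0][pos + 1:]         # S609: delete through the match
            stack.pop(0)
            if rest:
                stack.insert(0, rest)
            return stack
    return stack                              # no match found (Step S607: NO)

def combine_openings(stack_a, stack_b):
    """S611: merge the opening blocks when both stacks start with the
    same consecutive utterances (e.g. J4 followed by J2)."""
    if (len(stack_a) >= 2 and len(stack_b) >= 2
            and stack_a[0] + stack_a[1] == stack_b[0] + stack_b[1]):
        stack_a[:2] = [stack_a[0] + stack_a[1]]
        stack_b[:2] = [stack_b[0] + stack_b[1]]
    return stack_a, stack_b

stack_j = [("J2",), ("E3", "J4"), ("E5", "J6", "E7"), ("E8",)]
stack_e = [("E3", "J4"), ("J2",), ("E8",), ("E5", "J6", "E7")]
update_stack(stack_j, "E3"); update_stack(stack_e, "E3")
combine_openings(stack_j, stack_e)
print(stack_j)  # [('J4', 'J2'), ('E5', 'J6', 'E7'), ('E8',)]
print(stack_e)  # [('J4', 'J2'), ('E8',), ('E5', 'J6', 'E7')]
```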
Since the utterance of one of the dialog participants is E1 in the above example, the utterance prediction unit 105a and the utterance prediction unit 105b delete the utterance E1, which is the opening utterance in the prediction stacks, so as to update the prediction stacks to the prediction stack J: J2, (E3, J4), (E5, J6, E7), E8 and the prediction stack E: (E3, J4), J2, E8, (E5, J6, E7). Since no utterance blocks which can be combined are present in the prediction stacks J: J2, (E3, J4), (E5, J6, E7), E8 and E: (E3, J4), J2, E8, (E5, J6, E7), these prediction stacks do not change. Since those prediction stacks are still not empty, the utterance prediction unit 105a assumes the utterance J2, which is the opening utterance of the prediction stack J: J2, (E3, J4), (E5, J6, E7), E8, to be a prediction candidate. More specifically, the utterance prediction unit 105a displays the utterance J2 of
on the prediction display area 1105. In addition, the utterance prediction unit 105b displays the utterance E3 of “Have you made reservation?” on the prediction display area 1106 assuming the utterance E3, which is the opening utterance of the prediction stack E: (E3, J4), J2, E8, (E5, J6, E7) to be a prediction candidate. The utterance prediction unit 105a and the utterance prediction unit 105b wait for an utterance by the other dialog participant.
The dialog participant 1 or the dialog participant 2 may, respectively, select an utterance in the usage example list 1101 or the usage example list 1102. However, since a desired utterance has already been displayed on the prediction display area 1105 or the prediction display area 1106, it is a good idea for the dialog participant to select the next utterance from among the prediction candidates. Here, in the case where the dialog participant 2 selected the prediction display area 1106 earlier than the time when the dialog participant 1 selects the prediction display area 1105 or the like, the utterance E3 is translated into Japanese in the utterance processing unit 102b, and the utterance of
is notified to the dialog participant 1. Since the utterance E3 from the dialog participant is not present in the opening utterance block of the prediction stack J: J2, (E3, J4), (E5, J6, E7), E8, the utterance prediction unit 105a changes the prediction stack J into J: (E3, J4), J2, (E5, J6, E7), E8 (Step S610), and updates it to J: J4, J2, (E5, J6, E7), E8 (Step S609). On the other hand, since the utterance E3 is present in the opening utterance block of the prediction stack E: (E3, J4), J2, E8, (E5, J6, E7), the utterance prediction unit 105b updates the prediction stack E to E: J4, J2, E8, (E5, J6, E7). At this time, the successive utterances of J4 and J2 are commonly included in the respective prediction stacks (Step S611). Therefore, the utterance prediction unit 105a and the utterance prediction unit 105b update the prediction stack J to J: (J4, J2), (E5, J6, E7), E8 and the prediction stack E to E: (J4, J2), E8, (E5, J6, E7), respectively, by combining J4 and J2 with each other so as to include them in one utterance block.
Likewise, since the prediction stack J is J: (J4, J2), (E5, J6, E7), E8 and the prediction stack E is E: (J4, J2), E8, (E5, J6, E7), the utterance prediction unit 105a displays the prediction candidate J4 of
on the prediction display area 1105 as shown in
The dialog participant 1 may select an utterance in the example usage list 1101. However, since a desired utterance has already been displayed on the prediction display area 1105, the dialog participant 1 selects the utterance on the prediction display area 1105. In response to this, the utterance J4 is translated into English by the utterance processing unit 102a and the utterance of “Yes.” is notified to the dialog participant 2. Likewise, the utterance prediction unit 105a updates the prediction stack J to J: J2, (E5, J6, E7), E8 and displays the utterance J2 of
on the prediction display area 1105 as shown in
Next, an effect of the present invention will be described from an objective standpoint.
was displayed on the prediction display area 1105, nothing was displayed on the prediction display area 1106, and the dialog participant 1 inputted the utterance J4.
which is a prediction candidate is displayed on the prediction display area 1105, and the utterance E3 of “Have you made reservation?” is displayed on the prediction display area 1106. Here, Branches 1802 in the dialog show the following two cases: the case where the dialog participant 1 inputs the utterance J2 before the dialog participant 2 inputs the utterance E3; and the case where the dialog participant 2 inputs the utterance E3 before the dialog participant 1 inputs the utterance J2. An example taken next is the case of a dialog which advances in the direction shown by the bold arrows among such advancement patterns of dialogs. In this example, the dialog d′ shown in
Here, a degree of similarity of dialog histories is defined. For example, r(da|db) is the degree of similarity of the dialog history da with respect to the dialog history db, and it is defined by an equation 2001 of
As shown in
It will now be shown that the present invention can provide an effect even in the case where a dialog is continued without a dialog participant selecting some of the prediction candidates.
As shown in
The case of assuming that the dialog participant 1 who speaks Japanese uses the dialog supporting apparatus 100a and the dialog participant 2 who speaks English uses the dialog supporting apparatus 100b has already been described up to this point. In the case described next, it is assumed that a dialog participant 1 who speaks Chinese uses a dialog supporting apparatus 100a and a dialog participant 2 who speaks English uses a dialog supporting apparatus 100b.
The utterance receiving unit 101a transforms the received utterance of the dialog participant 1 into the corresponding utterance information. The utterance information is, for example, an utterance number in
In contrast to the dialog supporting apparatus 100a, the utterance processing unit 102b of the dialog supporting apparatus 100b transforms the received utterance of the dialog participant 2 into the corresponding utterance information.
which is the utterance information, to the utterance output unit 103b. After that, in order to simplify the following description while considering the language directions, the utterance number 1 inputted by the dialog participant 1 is abbreviated as C1, and the utterance number 1 inputted by the dialog participant 2 is abbreviated as E1.
The utterance receiving unit 101a allows the dialog participant 1 to directly select an utterance in the list of Chinese utterances in
As shown in
The utterance prediction unit 105a and the utterance prediction unit 105b first specify the topic of the dialog in order to search the dialog history database 104a and the dialog history database 104b for the dialog history needed for predicting utterances (Step S601). The utterance prediction unit 105a searches the dialog history database 104a for the needed dialog history, and the utterance prediction unit 105b searches the dialog history database 104b for the needed dialog history. The dialog participant 1 uses the dialog supporting apparatus 100a realized as a PDA or the like shown in
When the dialog start button 705 and the dialog start button 706 are pressed by the respective dialog participants, the utterance prediction unit 105a selects the dialog history d5 for the dialog participant 1, and the utterance prediction unit 105b selects the dialog history d7 for the dialog participant 2. This is because the topics of the dialog history d5 and the dialog history d7 relate to a hotel. The dialog history d6 and the dialog history d8 are not selected because their topics are different. After that, the utterance prediction unit 105a notifies the utterance prediction unit 105b of the dialog history d5, which is the first utterance prediction information, and the utterance prediction unit 105b notifies the utterance prediction unit 105a of the dialog history d7, which is the second utterance prediction information.
Likewise, the utterance prediction unit 105a makes a prediction stack using the dialog history d5: E1, C2, E3, C4, E5, C6, E7, E8, and also the utterance prediction unit 105b makes a prediction stack using the dialog history d7: E1, E3, C4, C2, E8, E5, C6, E7 (Step S602). After that, for example, the utterance prediction unit 105a makes the prediction stack C: E1, C2, (E3, C4), (E5, C6, E7), E8, and the utterance prediction unit 105b makes the prediction stack E: E1, (E3, C4), C2, E8, (E5, C6, E7), respectively.
After generating the prediction stack, each of the utterance prediction unit 105a and the utterance prediction unit 105b judges whether or not the prediction stack is empty (Step S603). In the case where the utterance prediction unit 105a or the utterance prediction unit 105b cannot structure any appropriate prediction stack for some reason and the prediction stack is empty (Step S603: YES), the utterance prediction unit 105a or the utterance prediction unit 105b completes the processing without performing utterance prediction operation, and follows the operation of the other party's utterance prediction unit which is the utterance prediction unit 105a or the utterance prediction unit 105b.
On the other hand, in the case where the prediction stack is not empty (Step S603: NO), the utterance prediction unit 105a and the utterance prediction unit 105b display the opening utterance of the prediction stack as the candidate next utterance (Step S604).
The dialog participant 1 and the dialog participant 2 can select an arbitrary utterance from among all the utterances defined in
In
At this time, the utterance prediction unit 105a and the utterance prediction unit 105b judge whether or not an utterance is inputted by a dialog participant (Step S605). When an utterance is inputted by a dialog participant (Step S605: YES), the utterance prediction unit 105a and the utterance prediction unit 105b search the prediction stacks for a matching utterance, starting with the opening utterances (Step S606), and judge whether or not there is a matching utterance (Step S607). In the case where there is a matching utterance (Step S607: YES), the utterance prediction unit 105a and the utterance prediction unit 105b judge whether or not the matching utterance is the opening utterance of the prediction stacks (Step S608). In the case where it is the opening utterance (Step S608: YES), each of them deletes the opening utterance of the prediction stack so as to update the prediction stack (Step S609). After that, in the case where utterance blocks which can be combined with each other emerge after the utterance is deleted, each of them combines the utterance blocks which can be combined in the prediction stack (Step S611). On the other hand, in the case where the utterance is not the opening utterance (Step S608: NO), each of them moves the block including the matching utterance to the opening part of the prediction stack, and deletes the matching utterance and the utterances placed before the matching utterance so as to update the prediction stack (Step S610). After that, each utterance prediction unit returns to the processing of judging whether or not the prediction stack is empty (Step S603).
Since the utterance of a dialog participant is E1 in the above example, the respective utterance prediction units 105a and 105b delete the utterance E1 which is the opening utterance of the prediction stacks so as to update the prediction stacks to the prediction stack C: C2, (E3, C4), (E5, C6, E7), E8 and the prediction stack E: (E3, C4), C2, E8, (E5, C6, E7), respectively. Note that the prediction stack C: C2, (E3, C4), (E5, C6, E7), and E8 and the prediction stack E: (E3, C4), C2, E8, and (E5, C6, E7) do not change. This is because no utterance blocks which can be combined are present in the prediction stacks. Since the prediction stack is still not empty, the utterance prediction unit 105a assumes the utterance C2 to be a prediction candidate. Here, the utterance C2 is the opening utterance of the prediction stack C: C2, (E3, C4), (E5, C6, E7), and E8. In other words, the utterance prediction unit 105a displays the utterance C2 of
on the prediction display area 1105 as shown in
The dialog participant 1 or the dialog participant 2 may select an utterance from the example usage list 1101 or the example usage list 1102. However, desired utterances have already been displayed on the prediction display area 1105 or the prediction display area 1106. Therefore, it is advisable for them to select an utterance from among the prediction candidates. In the case where the dialog participant 2 selects the prediction display area 1106 before the dialog participant 1 selects the prediction display area 1105 or the like, the utterance E3 is transformed into Chinese by the utterance processing unit 102b and the utterance of
is notified to the dialog participant 1. Here, the utterance E3 from the dialog participant is not present in the opening utterance block of the prediction stack C: C2, (E3, C4), (E5, C6, E7), and E8. Therefore, the utterance prediction unit 105a makes the prediction stack C into C: (E3, C4), C2, (E5, C6, E7) and E8 (Step S610), and updates the prediction stack C to C: C4, C2, (E5, C6, E7) and E8 (Step S609). On the other hand, the utterance E3 from the dialog participant is present in the opening utterance block of the prediction stack E: (E3, C4), C2, E8, (E5, C6, E7), and the utterance prediction unit 105b updates the prediction stack E to E: C4, C2, E8, and (E5, C6, E7). At this time, successive utterances of C4 and C2 are commonly included in the prediction stacks (Step S611). Thus, the utterance prediction unit 105b updates the prediction stack C to C: (C4, C2), (E5, C6, E7), and E8, and the prediction stack E to E: (C4, C2), E8, and (E5, C6, E7) by combining these utterances of C4 and C2 with each other and including them in an utterance block.
Likewise, since the prediction stack C has been updated to C: (C4, C2), (E5, C6, E7) and E8, and the prediction stack E has been updated to E: (C4, C2), E8 and (E5, C6, E7), the utterance prediction unit 105a displays the candidate prediction utterance C4 of
on the prediction display area 1105 as shown in
The dialog participant 1 may select an utterance in the example usage list 1101. However, desired utterances have already been displayed on the prediction display area 1105. When the prediction display area 1105 is selected as shown in
on the prediction display area 1105 as shown in
Next, the effect which is obtained also in the case of Chinese and English as well as the above-described case of Japanese and English will be described in an objective manner.
was displayed on the prediction display area 1105, after that nothing is displayed on the prediction display area 1106, and thus the dialog participant 1 inputted the utterance C4.
earlier than the time when the dialog participant 2 inputs the utterance E3 while the utterance E3 of “Have you made reservation?” is displayed on the prediction display area 1106, and the case where the dialog participant 2 inputted the utterance E3 earlier than the time when the dialog participant 1 inputs the utterance C2 while the utterance E3 of “Have you made reservation?” is displayed on the prediction display area 1106 also. An example to be taken here is an advancement pattern of a dialog which advances along with the bold arrow among plural advancement patterns of the dialog. In this case, a dialog f shown in
As shown in
It is shown next that the present invention has an effect even in the case where the dialog participants continue a dialog without selecting some of the prediction candidates.
As shown in
Note that the dialog supporting apparatus can be configured so that it has a history registration unit in addition to the configuration shown in
In addition, the dialog supporting system can be configured so that one dialog supporting apparatus is shared by the dialog participants as shown in
In addition, dialog supporting apparatuses can be configured to have a speech recognition unit 401a and a speech recognition unit 402b, respectively, shown in
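Claim 9 above describes narrowing the speech recognition dictionary down to the predicted utterances, similar sentences, and their words before recognition is performed. A minimal sketch of that narrowing step follows; the set-of-strings vocabulary representation is a hypothetical one, since the patent does not specify a particular recognition engine.

```python
# A hedged sketch of dictionary narrowing before speech recognition:
# the active vocabulary is restricted to the predicted utterances and
# their words, plus any associated words. The vocabulary representation
# is an illustrative assumption.
def narrowed_vocabulary(predicted_utterances, associated_words=()):
    vocab = set()
    for sentence in predicted_utterances:
        vocab.add(sentence)                           # the whole predicted sentence
        vocab.update(sentence.rstrip(".?!").split())  # its individual words
    vocab.update(associated_words)                    # related words, if any
    return vocab

print(narrowed_vocabulary(["Have you made reservation?"], ["booking"]))
```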
Note that it is possible to implement an utterance output unit 502a and an utterance output unit 502b so that they use the utterance processing unit of the other party's dialog supporting apparatus as shown in
In addition, a button 1107 and a button 1108 shown in
In addition,
An example of Japanese and English and an example of Chinese and English have been taken in the embodiment. However, the other languages such as French are also available. It should be noted that the present invention does not depend on language.
Although only an exemplary embodiment of this invention has been described in detail above, those skilled in the art will readily appreciate that many modifications are possible in the exemplary embodiment without materially departing from the novel teachings and advantages of this invention. Accordingly, all such modifications are intended to be included within the scope of this invention.
The dialog supporting apparatus of the present invention has a function for smoothly inputting utterances of dialog participants. It is useful as translation application software for mobile phones, mobile terminals, and the like. In addition, it is applicable to public town terminals, guidance terminals, and the like. Further, it is applicable to, for example, a chat system where typical sentences are used.
Inventors: Kenji Mizutani, Yoshiyuki Okimoto
References Cited

U.S. Patent Documents:
- US 5,216,603 (priority Nov. 18, 1985), Silicon Valley Bank: Method and apparatus for structuring and managing human communications by explicitly defining the types of communications permitted between participants
- US 5,682,539 (priority Sep. 29, 1994), Leverance, Inc.: Anticipated meaning natural language interface
- US 5,748,841 (priority Feb. 25, 1994), Matsushita Electric Corporation of America: Supervised contextual language acquisition system
- US 5,812,126 (priority Dec. 31, 1996), Intel Corporation: Method and apparatus for masquerading online
- US 5,854,997 (priority Sep. 7, 1994), Hitachi, Ltd.: Electronic interpreter utilizing linked sets of sentences
- US 6,233,561 (priority Apr. 12, 1999), Panasonic Intellectual Property Corporation of America: Method for goal-oriented speech translation in hand-held devices using meaning extraction and dialogue
- US 6,321,188 (priority Nov. 15, 1994), Fuji Xerox Co., Ltd.: Interactive system providing language information for communication between users of different languages
- US 6,505,162 (priority Jun. 11, 1999), Industrial Technology Research Institute: Apparatus and method for portable dialogue management using a hierarchial task description table
- US 6,622,119 (priority Oct. 30, 1999), Nuance Communications, Inc.: Adaptive command predictor and method for a natural language dialog system
- US 6,792,406 (priority Dec. 24, 1998), Sony Corporation: Information processing apparatus, portable device, electronic pet apparatus recording medium storing information processing procedures and information processing method
- US 6,917,920 (priority Jan. 7, 1999), Rakuten, Inc.: Speech translation device and computer readable medium
- US 7,050,979 (priority Jan. 24, 2001), Panasonic Intellectual Property Corporation of America: Apparatus and method for converting a spoken language to a second language
- US 7,162,412 (priority Nov. 20, 2001), Evidence Corporation: Multilingual conversation assist system
- US 7,251,595 (priority Mar. 22, 2001), Nippon Telegraph and Telephone Corporation: Dialogue-type information providing apparatus, dialogue-type information providing processing method, and program and recording medium for the same
- US 7,346,515 (priority Oct. 8, 2004), Intertrust Technologies Corporation: Dialog supporting apparatus
- US 7,505,893 (priority Nov. 11, 2005), Panasonic Intellectual Property Corporation of America: Dialogue supporting apparatus
- US 2002/0120436
- US 2004/0172236

Foreign Patent Documents:
- JP 2003-030187
- JP 2003-288339