A method uses natural language for visual analysis of a dataset. A data visualization is displayed based on a first dataset. The method extracts a first set of analytic phrases from a first natural language command related to the data visualization, computes conversation centers based on those analytic phrases, and computes analytical functions associated with the conversation centers, thereby creating functional phrases. The method updates the data visualization based on the functional phrases. The method then extracts a second set of analytic phrases from a second natural language command related to the updated data visualization and computes temporary conversation centers from these analytic phrases. The method computes cohesion between the first and second sets of analytic phrases to derive a second set of conversation centers, and computes analytical functions from the second set of conversation centers, thereby creating a second set of functional phrases. The method updates the data visualization based on the second set of functional phrases.
26. A non-transitory computer readable storage medium storing one or more programs configured for execution by an electronic device with a display, the one or more programs comprising instructions for:
displaying a data visualization based on a first dataset retrieved from a database using a first set of one or more queries;
receiving a first user input that specifies a first natural language command related to the data visualization;
extracting a first set of one or more independent analytic phrases from the first natural language command;
computing a first set of one or more conversation centers associated with the first natural language command based on the first set of one or more analytic phrases;
creating a first set of one or more functional phrases by computing a first set of analytical functions associated with the first set of one or more conversation centers;
updating the data visualization based on the first set of one or more functional phrases;
receiving a second user input that specifies a second natural language command related to the updated data visualization;
extracting a second set of one or more independent analytic phrases from the second natural language command;
computing a temporary set of one or more conversation centers associated with the second natural language command based on the second set of one or more analytic phrases;
computing cohesion between the first set of one or more analytic phrases and the second set of one or more analytic phrases, wherein the cohesion includes similarity in lexical or grammatical structures between the first set of one or more analytic phrases and the second set of one or more analytic phrases;
deriving a second set of one or more conversation centers from the first set of one or more conversation centers and the temporary set of one or more conversation centers based on the cohesion;
creating a second set of one or more functional phrases by computing a second set of one or more analytical functions associated with the second set of one or more conversation centers; and
updating the data visualization based on the second set of one or more functional phrases, including highlighting or filtering data marks whose characteristics correspond to a data attribute that is semantically related to the second set of one or more extracted analytic phrases.
1. A method of using natural language for visual analysis of a dataset, comprising:
at a computer having a display, one or more processors, and memory storing one or more programs configured for execution by the one or more processors:
displaying a data visualization based on a first dataset retrieved from a database using a first set of one or more queries;
receiving a first user input that specifies a first natural language command related to the data visualization;
extracting a first set of one or more independent analytic phrases from the first natural language command;
computing a first set of one or more conversation centers associated with the first natural language command based on the first set of one or more analytic phrases;
creating a first set of one or more functional phrases by computing a first set of analytical functions associated with the first set of one or more conversation centers;
updating the data visualization based on the first set of one or more functional phrases;
receiving a second user input that specifies a second natural language command related to the updated data visualization;
extracting a second set of one or more independent analytic phrases from the second natural language command;
computing a temporary set of one or more conversation centers associated with the second natural language command based on the second set of one or more analytic phrases;
computing cohesion between the first set of one or more analytic phrases and the second set of one or more analytic phrases, wherein the cohesion includes similarity in lexical or grammatical structures between the first set of one or more analytic phrases and the second set of one or more analytic phrases;
deriving a second set of one or more conversation centers from the first set of one or more conversation centers and the temporary set of one or more conversation centers based on the cohesion;
creating a second set of one or more functional phrases by computing a second set of one or more analytical functions associated with the second set of one or more conversation centers; and
updating the data visualization based on the second set of one or more functional phrases, including highlighting or filtering data marks whose characteristics correspond to a data attribute that is semantically related to the second set of one or more extracted analytic phrases.
25. An electronic device, comprising:
a display;
one or more processors;
memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions for:
displaying a data visualization based on a first dataset retrieved from a database using a first set of one or more queries;
receiving a first user input that specifies a first natural language command related to the data visualization;
extracting a first set of one or more independent analytic phrases from the first natural language command;
computing a first set of one or more conversation centers associated with the first natural language command based on the first set of one or more analytic phrases;
creating a first set of one or more functional phrases by computing a first set of analytical functions associated with the first set of one or more conversation centers;
updating the data visualization based on the first set of one or more functional phrases;
receiving a second user input that specifies a second natural language command related to the updated data visualization;
extracting a second set of one or more independent analytic phrases from the second natural language command;
computing a temporary set of one or more conversation centers associated with the second natural language command based on the second set of one or more analytic phrases;
computing cohesion between the first set of one or more analytic phrases and the second set of one or more analytic phrases, wherein the cohesion includes similarity in lexical or grammatical structures between the first set of one or more analytic phrases and the second set of one or more analytic phrases;
deriving a second set of one or more conversation centers from the first set of one or more conversation centers and the temporary set of one or more conversation centers based on the cohesion;
creating a second set of one or more functional phrases by computing a second set of one or more analytical functions associated with the second set of one or more conversation centers; and
updating the data visualization based on the second set of one or more functional phrases, including highlighting or filtering data marks whose characteristics correspond to a data attribute that is semantically related to the second set of one or more extracted analytic phrases.
2. The method of
identifying a phrase structure of the second set of one or more analytic phrases;
identifying one or more forms of pragmatics based on the phrase structure; and
deriving the second set of one or more conversation centers from the first set of one or more conversation centers and the temporary set of one or more conversation centers based on the one or more forms of pragmatics.
3. The method of
obtaining a parsed output by parsing the second natural language command using a probabilistic grammar; and
resolving the parsed output to corresponding categorical and data attributes.
4. The method of
5. The method of
identifying the one or more forms of pragmatics comprises determining whether the second natural language command is an incomplete utterance by determining whether one or more linguistic elements are absent in the phrase structure; and
deriving the second set of one or more conversation centers comprises:
in accordance with the determination that the second natural language command is an incomplete utterance:
determining a first subset of conversation centers in the first set of one or more conversation centers, the first subset of conversation centers corresponding to the one or more linguistic elements absent in the phrase structure; and
computing the second set of one or more conversation centers by combining the temporary set of one or more conversation centers with the first subset of conversation centers.
6. The method of
identifying the one or more forms of pragmatics comprises determining whether the second natural language command is a reference expression by determining whether one or more anaphoric references is present in the phrase structure; and
deriving the second set of one or more conversation centers comprises:
in accordance with the determination that the second natural language command is a reference expression:
searching the first set of one or more conversation centers to find a first subset of conversation centers that corresponds to a phrasal chunk in the second natural language command that contains a first anaphoric reference of the one or more anaphoric references; and
computing the second set of one or more conversation centers based on the temporary set of one or more conversation centers and the first subset of conversation centers.
7. The method of
determining whether the first anaphoric reference is accompanied by a verb in the second natural language command;
in accordance with a determination that the anaphoric reference is accompanied by a verb:
searching the first set of one or more conversation centers to find a first action conversation center that refers to an action verb; and
computing the second set of one or more conversation centers based on the temporary set of one or more conversation centers, the first subset of conversation centers, and the first action conversation center.
8. The method of
determining whether the first anaphoric reference is a deictic reference that refers to some object in the environment;
in accordance with a determination that the anaphoric reference is a deictic reference, computing the second set of one or more conversation centers based on the temporary set of one or more conversation centers, and a characteristic of the object.
9. The method of
determining whether the first anaphoric reference is a reference to a visualization property in the updated data visualization;
in accordance with a determination that the first anaphoric reference is a reference to a visualization property, computing the second set of one or more conversation centers based on the temporary set of one or more conversation centers, and data related to the visualization property.
10. The method of
identifying the one or more forms of pragmatics comprises determining whether the second natural language command is a repair utterance by determining whether the phrase structure corresponds to one or more predefined repair utterances; and
deriving the second set of one or more conversation centers comprises:
in accordance with the determination that the second natural language command is a repair utterance:
computing the second set of one or more conversation centers based on the temporary set of one or more conversation centers; and
updating one or more data attributes in the second set of one or more conversation centers based on the one or more predefined repair utterances and the phrase structure.
11. The method of
determining whether the phrase structure corresponds to a repair utterance to change a default behavior related to displaying a data visualization; and
in accordance with a determination that the phrase structure corresponds to a repair utterance to change a default behavior, changing the default behavior related to displaying.
12. The method of
identifying the one or more forms of pragmatics comprises determining whether the second natural language command is a conjunctive expression by (i) determining explicit or implicit presence of conjunctions in the phrase structure, and (ii) determining whether the temporary set of one or more conversation centers includes each conversation center in the first set of one or more conversation centers; and
deriving the second set of one or more conversation centers comprises:
in accordance with the determination that the second natural language command is a conjunctive expression, computing the second set of one or more conversation centers based on the temporary set of one or more conversation centers.
13. The method of
determining whether the second natural language command has more than one conjunct; and
in accordance with the determination that the second natural language command has more than one conjunct, computing the second set of one or more analytical functions by linearizing the second natural language command.
14. The method of
generating a parse tree for the second natural language command;
traversing the parse tree in post-order to extract a first analytic phrase and a second analytic phrase, wherein the first analytic phrase and the second analytic phrase are adjacent nodes in the parse tree;
computing a first analytical function and a second analytical function corresponding to the first analytic phrase and the second analytic phrase, respectively; and
combining the first analytical function with the second analytical function by applying one or more logical operators based on one or more characteristics of the first analytical function and the second analytical function, wherein the one or more characteristics include attribute type, operator type, and a value.
15. The method of
the first analytical function comprises a first attribute, a first operator, and a first value;
the second analytical function comprises a second attribute, a second operator, and a second value; and
combining the first analytical function with the second analytical function comprises:
determining whether the first attribute is a categorical type attribute or an ordered type attribute, and determining whether the second attribute is a categorical type attribute or an ordered type attribute;
determining whether the first attribute and the second attribute are identical; and
in accordance with a determination that the first attribute and the second attribute are identical and are both categorical type attributes, applying a union operator to combine the first analytical function and the second analytical function.
16. The method of
the first analytical function comprises a first attribute, a first operator, and a first value;
the second analytical function comprises a second attribute, a second operator, and a second value; and
combining the first analytical function with the second analytical function comprises:
determining whether the first attribute is a categorical type attribute or an ordered type attribute, and determining whether the second attribute is a categorical type attribute or an ordered type attribute;
determining whether the first attribute and the second attribute are identical; and
in accordance with a determination that the first attribute and the second attribute are non-identical, applying the intersection operator to combine the first analytical function and the second analytical function.
17. The method of
the first analytical function comprises a first attribute, a first operator, and a first value;
the second analytical function comprises a second attribute, a second operator, and a second value; and
combining the first analytical function with the second analytical function comprises:
determining whether the first attribute is a categorical type attribute or an ordered type attribute, and determining whether the second attribute is a categorical type attribute or an ordered type attribute;
determining whether the first attribute and the second attribute are identical; and
in accordance with a determination that the first attribute and the second attribute are identical and are both ordered type attributes:
determining the operator types of the first operator and the second operator; and
in accordance with a determination that the first operator and the second operator are both equality operators, applying the union operator to combine the first analytical function and the second analytical function.
18. The method of
the first analytical function comprises a first attribute, a first operator, and a first value;
the second analytical function comprises a second attribute, a second operator, and a second value; and
combining the first analytical function with the second analytical function comprises:
determining whether the first attribute is a categorical type attribute or an ordered type attribute, and determining whether the second attribute is a categorical type attribute or an ordered type attribute;
determining whether the first attribute and the second attribute are identical; and
in accordance with a determination that the first attribute and the second attribute are identical and are both ordered type attributes:
determining the operator types of the first operator and the second operator; and
in accordance with a determination that the first operator is a “less than” operator and the second operator is a “greater than” operator:
determining whether the first value is less than the second value; and
in accordance with a determination that the first value is less than the second value, applying the union operator to combine the first analytical function and the second analytical function.
19. The method of
the first analytical function comprises a first attribute, a first operator, and a first value;
the second analytical function comprises a second attribute, a second operator, and a second value; and
combining the first analytical function with the second analytical function comprises:
determining whether the first attribute is a categorical type attribute or an ordered type attribute, and determining whether the second attribute is a categorical type attribute or an ordered type attribute;
determining whether the first attribute and the second attribute are identical; and
in accordance with a determination that the first attribute and the second attribute are identical and are both ordered type attributes:
determining the operator types of the first operator and the second operator; and
in accordance with a determination that the first operator is a “greater than” operator and the second operator is a “less than” operator:
determining whether the first value is less than the second value; and
in accordance with a determination that the first value is less than the second value, applying the intersection operator to combine the first analytical function and the second analytical function.
20. The method of
computing semantic relatedness between the second set of one or more extracted analytic phrases and one or more attributes of data included in the updated data visualization; and creating the second set of one or more functional phrases, based on the semantically related one or more attributes of data, by computing analytical functions associated with the second set of one or more analytic phrases.
21. The method of
learning word embeddings by training a first neural network model on a large corpus of text;
computing a first word vector for a first word in a first phrase in the second set of one or more analytic phrases using a second neural network model, the first word vector mapping the first word to the word embeddings;
computing a second word vector for a first data attribute in the one or more data attributes using the second neural network model, the second word vector mapping the first data attribute to the word embeddings; and
computing relatedness between the first word vector and the second word vector using a similarity metric.
22. The method of
23. The method of
24. The method of
obtaining word definitions for the second set of one or more analytic phrases from a publicly available dictionary;
determining whether the word definitions contain one or more predefined adjectives using a part-of-speech API provided by a natural language toolkit; and
in accordance with the determination that the word definitions contain one or more predefined adjectives, mapping the one or more predefined adjectives to one or more analytical functions.
This application is a continuation-in-part of U.S. patent application Ser. No. 15/804,991, filed Nov. 6, 2017, entitled “Systems and Methods of Using Natural Language Processing for Visual Analysis of a Data Set,” which is a continuation-in-part of U.S. patent application Ser. No. 15/486,265, filed Apr. 12, 2017, entitled “Systems and Methods of Using Natural Language Processing for Visual Analysis of a Data Set,” which claims priority to both (1) U.S. Provisional Application Ser. No. 62/321,695, filed Apr. 12, 2016, entitled “Using Natural Language Processing for Visual Analysis of a Data Set” and (2) U.S. Provisional Application Ser. No. 62/418,052, filed Nov. 4, 2016, entitled “Using Natural Language Processing for Visual Analysis of a Data Set,” each of which is incorporated by reference herein in its entirety. U.S. patent application Ser. No. 15/804,991 also claims priority to U.S. Provisional Application Ser. No. 62/500,999, filed May 3, 2017, entitled “Applying Pragmatics Principles for Interaction with Visual Analytics,” which is incorporated by reference herein in its entirety. This application also claims priority to U.S. Provisional Application Ser. No. 62/598,399, filed Dec. 13, 2017, entitled “Identifying Intent in Visual Analytical Conversations,” which is incorporated by reference herein in its entirety.
This application is related to U.S. Pat. No. 9,183,235, filed Mar. 3, 2015, which is incorporated by reference herein in its entirety.
The disclosed implementations relate generally to data visualization and more specifically to systems, methods, and user interfaces that enable users to interact with and explore datasets using a natural language interface.
Data visualization applications enable a user to understand a data set visually, including distribution, trends, outliers, and other factors that are important to making business decisions. Some data sets are very large or complex, and include many data fields. Various tools can be used to help understand and analyze the data, including dashboards that have multiple data visualizations. However, some functionality may be difficult to use or hard to find within a complex user interface. Most systems return only very basic interactive visualizations in response to queries, and others require expert modeling to create effective queries. Other systems require simple closed-ended questions, and then are only capable of returning a single text answer or a static visualization.
Accordingly, there is a need for tools that allow users to effectively utilize functionality provided by data visualization applications. One solution to the problem is providing a natural language interface as part of a data visualization application (e.g., within the user interface for the data visualization application) for an interactive query dialog that provides graphical answers to natural language queries. The natural language interface allows users to access complex functionality using ordinary questions or commands. Questions and insights often emerge from previous questions and patterns of data that a person sees. By modeling the interaction behavior as a conversation, the natural language interface can apply principles of pragmatics to improve interaction with visual analytics. Through various techniques for deducing the grammatical and lexical structure of utterances and their context, the natural language interface supports various pragmatic forms of natural language interaction with visual analytics. These pragmatic forms include understanding incomplete utterances, referring to entities within utterances and visualization properties, supporting long, compound utterances, identifying synonyms and related concepts, and ‘repairing’ responses to previous utterances. Furthermore, the natural language interface provides appropriate visualization responses either within an existing visualization or by creating new visualizations when necessary, and resolves ambiguity through targeted textual feedback and ambiguity widgets. In this way, the natural language interface allows users to efficiently explore data displayed (e.g., in a data visualization) within the data visualization application.
In accordance with some implementations, a method executes at an electronic device with a display. For example, the electronic device can be a smart phone, a tablet, a notebook computer, or a desktop computer. The device displays a data visualization based on a dataset retrieved from a database using a first set of one or more database queries. A user specifies a first natural language command related to the displayed data visualization. Based on the displayed data visualization, the device extracts a first set of one or more independent analytic phrases from the first natural language command. The device then computes a first set of one or more conversation centers associated with the first natural language command based on the first set of one or more analytic phrases. The device then computes a first set of analytical functions associated with the first set of one or more conversation centers, thereby creating a first set of one or more functional phrases. The device then updates the data visualization based on the first set of one or more functional phrases.
In some implementations, the device receives a second natural language command related to the updated data visualization. After receiving the second natural language command, the device extracts a second set of one or more independent analytic phrases from the second natural language command, and computes a temporary set of one or more conversation centers associated with the second natural language command based on the second set of one or more analytic phrases, according to some implementations. The device then derives a second set of one or more conversation centers from the first set of one or more conversation centers and the temporary set of one or more conversation centers using one or more transitional rules. The device computes a second set of one or more analytical functions associated with the second set of one or more conversation centers, thereby creating a second set of one or more functional phrases. The device then updates the data visualization based on the second set of one or more functional phrases.
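To make this flow concrete, the following minimal Python sketch models conversation centers and functional phrases as simple data structures. All names here (ConversationCenter, FunctionalPhrase, the attribute names, and the example command) are illustrative assumptions and do not come from the patent text.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ConversationCenter:
    variable: str                # a data attribute or visualization property, e.g. "neighborhood"
    value: Optional[str] = None  # the value currently in focus, e.g. "Ballard"

@dataclass
class FunctionalPhrase:
    function: str                                       # e.g. "filter" or "highlight"
    centers: List[ConversationCenter] = field(default_factory=list)

# A first command such as "houses under 600k in Ballard" might yield two conversation
# centers and one filter-type functional phrase that drives the visualization update.
first_centers = [ConversationCenter("last_sale_price", "< 600000"),
                 ConversationCenter("neighborhood", "Ballard")]
first_functional_phrases = [FunctionalPhrase("filter", first_centers)]
print(first_functional_phrases)
```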
In some implementations, each of the conversation centers of the first set of one or more conversation centers, the temporary set of one or more conversation centers, and the second set of one or more conversation centers comprises a value for a variable (e.g., a data attribute or a visualization property). In such implementations, the device uses the transitional rules by performing a sequence of operations that comprises: determining whether a first variable is included in the first set of one or more conversation centers; determining whether the first variable is included in the temporary set of one or more conversation centers; determining a respective transitional rule of the one or more transitional rules to apply based on whether the first variable is included in the first set of one or more conversation centers and/or the temporary set of one or more conversation centers; and applying the respective transitional rule.
In some implementations, the one or more transitional rules used by the device comprise a CONTINUE rule for including each conversation center in the first set of one or more conversation centers in the second set of one or more conversation centers, and adding one or more conversation centers from the temporary set of one or more conversation centers to the second set of one or more conversation centers.
In some such implementations, the device applies the respective transitional rule by performing a sequence of operations that comprises: when (i) the first variable is included in the temporary set of one or more conversation centers, and (ii) the first variable is not included in the first set of one or more conversation centers, the device applies the CONTINUE rule to include the first variable in the second set of one or more conversation centers.
In some implementations, the one or more transitional rules used by the device comprise a RETAIN rule for retaining each conversation center in the first set of one or more conversation centers in the second set of one or more conversation centers without adding any conversation center from the temporary set of one or more conversation centers to the second set of one or more conversation centers.
In some such implementations, the device applies the respective transitional rule by performing a sequence of operations that comprises: when (i) the first variable is included in the first set of one or more conversation centers, and (ii) the first variable is not included in the temporary set of one or more conversation centers, the device applies the RETAIN rule to include each conversation center in the first set of one or more conversation centers in the second set of one or more conversation centers.
In some implementations, the one or more transitional rules used by the device comprise a SHIFT rule for including each conversation center in the first set of one or more conversation centers in the second set of one or more conversation centers, and replacing one or more conversation centers in the second set of one or more conversation centers with conversation centers in the temporary set of one or more conversation centers.
In some such implementations, the device applies the respective transitional rule by performing a sequence of operations that comprises: when (i) the first variable is included in the first set of one or more conversation centers, and (ii) the first variable is included in the temporary set of one or more conversation centers, the device determines whether a first value of the first variable in the first set of one or more conversation centers is different from a second value of the first variable in the temporary set of one or more conversation centers. When the first value is different from the second value, the device applies the SHIFT rule to include each conversation center in the first set of one or more conversation centers in the second set of one or more conversation centers, and replaces the value for the first variable in the second set of one or more conversation centers with the second value.
In some such implementations, the device further determines whether a widget corresponding to the first variable has been removed by the user. When the widget has been removed, the device applies the SHIFT rule to include each conversation center in the first set of one or more conversation centers in the second set of one or more conversation centers, and replaces the value for the first variable in the second set of one or more conversation centers with a new value (e.g., a maximum value or a super-set value) that includes the first value.
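As a rough illustration of the CONTINUE, RETAIN, and SHIFT rules described above, the sketch below models each set of conversation centers as a dictionary from variable to value. The function name, the "ALL" placeholder used after a widget is removed, and the example attributes are assumptions, not the patent's implementation.

```python
from typing import Dict, Optional, Set

def derive_second_centers(first: Dict[str, str],
                          temp: Dict[str, str],
                          removed_widgets: Optional[Set[str]] = None) -> Dict[str, str]:
    second = dict(first)               # RETAIN by default: prior centers carry forward unchanged
    for variable, value in temp.items():
        if variable not in first:
            second[variable] = value   # CONTINUE: add a center newly introduced by the command
        elif first[variable] != value:
            second[variable] = value   # SHIFT: keep prior centers, but replace this variable's value
    for variable in (removed_widgets or set()):
        if variable in second:
            second[variable] = "ALL"   # SHIFT on widget removal: widen back to a superset value
    return second

# "houses under 600k" followed by "townhomes": the price center continues, a home-type
# center is added.
print(derive_second_centers({"price": "< 600000"}, {"home_type": "townhome"}))
```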
In some implementations, the device creates a first set of one or more queries based on the first set of one or more functional phrases, and requeries the database using the first set of one or more queries, thereby retrieving a second dataset, and then displays an updated data visualization using the second dataset. In some implementations, the device creates a second set of one or more queries based on the second set of one or more functional phrases, and requeries the database using the second set of one or more queries, thereby retrieving a third dataset, and then displays an updated data visualization using the third dataset. In some instances, requerying the database is performed locally at the computing device using cached or stored data at the computing device. For example, requerying is commonly performed locally when the natural language command specifies one or more filters. In some implementations, the device further creates and displays a new data visualization using the second dataset or the third dataset.
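When the functional phrases are filters, local requerying against cached rows can look like the following sketch; the rows, column names, and helper function are invented for illustration.

```python
# Hypothetical cached rows retrieved by the first set of queries.
cached_rows = [
    {"neighborhood": "Ballard", "price": 550000},
    {"neighborhood": "Queen Anne", "price": 910000},
]

def apply_filter_phrases(rows, filters):
    """filters: (attribute, predicate) pairs derived from filter-type functional phrases."""
    for attribute, predicate in filters:
        rows = [row for row in rows if predicate(row[attribute])]
    return rows

# "houses under 600k": requery locally instead of going back to the database.
second_dataset = apply_filter_phrases(cached_rows, [("price", lambda v: v < 600000)])
print(second_dataset)   # only the Ballard row remains
```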
In some implementations, the device further determines if the user has selected a dataset different from the first dataset, or if the user has reset the data visualization, and, if so, resets each of the first set of one or more conversation centers, the temporary set of one or more conversation centers, and the second set of one or more conversation centers to an empty set that includes no conversation centers.
Typically, an electronic device includes one or more processors, memory, a display, and one or more programs stored in the memory. The programs are configured for execution by the one or more processors and are configured to perform any of the methods described herein. The one or more programs include instructions for displaying a data visualization based on a first dataset retrieved from a database using a first set of one or more queries. The one or more programs also include instructions for receiving a first user input to specify a first natural language command related to the data visualization. The one or more programs also include instructions for extracting a first set of one or more independent analytic phrases from the first natural language command. The one or more programs also include instructions for computing a first set of one or more conversation centers associated with the first natural language command based on the first set of one or more analytic phrases. The one or more programs also include instructions for computing a first set of analytical functions associated with the first set of one or more conversation centers, thereby creating a first set of one or more functional phrases, and updating the data visualization based on the first set of one or more functional phrases.
In some implementations, the one or more programs include instructions for receiving a second user input to specify a second natural language command related to the updated data visualization. The one or more programs also include instructions for extracting a second set of one or more independent analytic phrases from the second natural language command. The one or more programs also include instructions for computing a temporary set of one or more conversation centers associated with the second natural language command based on the second set of one or more analytic phrases. The one or more programs also include instructions for deriving a second set of one or more conversation centers from the first set of one or more conversation centers and the temporary set of one or more conversation centers using one or more transitional rules. The one or more programs also include instructions for computing a second set of one or more analytical functions associated with the second set of one or more conversation centers, thereby creating a second set of one or more functional phrases, and updating the data visualization based on the second set of one or more functional phrases.
In some implementations, a non-transitory computer readable storage medium stores one or more programs configured for execution by a computing device having one or more processors, memory, and a display. The one or more programs are configured to perform any of the methods described herein. The one or more programs include instructions for displaying a data visualization based on a first dataset retrieved from a database using a first set of one or more queries. The one or more programs also include instructions for receiving a first user input to specify a first natural language command related to the data visualization. The one or more programs also include instructions for extracting a first set of one or more independent analytic phrases from the first natural language command. The one or more programs also include instructions for computing a first set of one or more conversation centers associated with the first natural language command based on the first set of one or more analytic phrases. The one or more programs also include instructions for computing a first set of analytical functions associated with the first set of one or more conversation centers, thereby creating a first set of one or more functional phrases, and updating the data visualization based on the first set of one or more functional phrases.
In some implementations, the one or more programs include instructions for receiving a second user input to specify a second natural language command related to the updated data visualization. The one or more programs also include instructions for extracting a second set of one or more independent analytic phrases from the second natural language command. The one or more programs also include instructions for computing a temporary set of one or more conversation centers associated with the second natural language command based on the second set of one or more analytic phrases. The one or more programs also include instructions for deriving a second set of one or more conversation centers from the first set of one or more conversation centers and the temporary set of one or more conversation centers using one or more transitional rules. The one or more programs also include instructions for computing a second set of one or more analytical functions associated with the second set of one or more conversation centers, thereby creating a second set of one or more functional phrases, and updating the data visualization based on the second set of one or more functional phrases.
In another aspect, in accordance with some implementations, the device displays a data visualization based on a dataset retrieved from a database using a first set of one or more database queries. A user specifies a first natural language command related to the displayed data visualization. Based on the displayed data visualization, the device extracts a first set of one or more independent analytic phrases from the first natural language command. The device then computes a first set of one or more conversation centers associated with the first natural language command based on the first set of one or more analytic phrases. The device then computes a first set of analytical functions associated with the first set of one or more conversation centers, thereby creating a first set of one or more functional phrases. The device then updates the data visualization based on the first set of one or more functional phrases. The user specifies a second natural language command related to the updated data visualization. After receiving the second natural language command, the device extracts a second set of one or more independent analytic phrases from the second natural language command, and computes a temporary set of one or more conversation centers associated with the second natural language command based on the second set of one or more analytic phrases. The device then computes cohesion between the first set of one or more analytic phrases and the second set of one or more analytic phrases. The device then derives a second set of one or more conversation centers from the first set of one or more conversation centers and the temporary set of one or more conversation centers based on the cohesion. The device computes a second set of one or more analytical functions associated with the second set of one or more conversation centers, thereby creating a second set of one or more functional phrases. The device then updates the data visualization based on the second set of one or more functional phrases.
In some implementations, the device computes the cohesion and derives the second set of one or more conversation centers by performing a sequence of operations that comprises: identifying a phrase structure of the second set of one or more analytic phrases; identifying one or more forms of pragmatics based on the phrase structure; and deriving the second set of one or more conversation centers from the first set of one or more conversation centers and the temporary set of one or more conversation centers based on the one or more forms of pragmatics.
In some such implementations, the device identifies the phrase structure by performing a sequence of operations that comprises: parsing the second natural language command using a probabilistic grammar, thereby obtaining a parsed output; and resolving the parsed output to corresponding categorical and data attributes. In some such implementations, parsing the second natural language command further comprises deducing syntactic structure by employing a part-of-speech API (e.g., a part-of-speech (POS) tagger) provided by a natural language toolkit.
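One way to realize the part-of-speech step is with NLTK; the patent does not name a specific toolkit, so NLTK and the example command below are assumptions (resource names may differ slightly across NLTK versions).

```python
import nltk

# The tokenizer and tagger models are downloaded on first use.
nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

command = "show me houses under 600k near Ballard"
tokens = nltk.word_tokenize(command)
print(nltk.pos_tag(tokens))   # a list of (token, tag) pairs, e.g. ('houses', 'NNS'), ('near', 'IN')
```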
In some implementations, the device identifies the one or more forms of pragmatics by performing a sequence of operations that comprises determining whether the second natural language command is an incomplete utterance (sometimes called an Ellipsis) by determining whether one or more linguistic elements are absent in the phrase structure. In some such implementations, the device derives the second set of one or more conversation centers by performing a sequence of operations that comprises: in accordance with the determination that the second natural language command is an incomplete utterance: determining a first subset of conversation centers in the first set of one or more conversation centers, the first subset of conversation centers corresponding to the one or more linguistic elements absent in the phrase structure; and computing the second set of one or more conversation centers by combining the temporary set of one or more conversation centers with the first subset of conversation centers.
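A minimal sketch of the ellipsis case, assuming conversation centers are dictionaries keyed by variable: the prior centers corresponding to the omitted linguistic elements are merged back in behind the new command's centers. The example follow-up utterance and attribute names are illustrative.

```python
from typing import Dict, Set

def handle_ellipsis(first_centers: Dict[str, str],
                    temp_centers: Dict[str, str],
                    absent_elements: Set[str]) -> Dict[str, str]:
    # Subset of prior centers corresponding to elements missing from the phrase structure.
    carried = {var: val for var, val in first_centers.items() if var in absent_elements}
    # The new command's centers take precedence where both mention the same variable.
    return {**carried, **temp_centers}

# "houses under 600k in Ballard" followed by the incomplete utterance "how about Fremont":
first = {"price": "< 600000", "neighborhood": "Ballard"}
temp = {"neighborhood": "Fremont"}
print(handle_ellipsis(first, temp, absent_elements={"price"}))
# -> {'price': '< 600000', 'neighborhood': 'Fremont'}
```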
In some implementations, the device identifies the one or more forms of pragmatics by performing a sequence of operations that comprises determining whether the second natural language command is a reference expression by determining whether one or more anaphoric references is present in the phrase structure; and the device derives the second set of one or more conversation centers by performing another sequence of operations that comprises: in accordance with the determination that the second natural language command is a reference expression: searching the first set of one or more conversation centers to find a first subset of conversation centers that corresponds to a phrasal chunk in the second natural language command that contains a first anaphoric reference of the one or more anaphoric references; and computing the second set of one or more conversation centers based on the temporary set of one or more conversation centers and the first subset of conversation centers.
In some such implementations, the device further determines if the first anaphoric reference is accompanied by a verb in the second natural language command, and if so, searches the first set of one or more conversation centers to find a first action conversation center that refers to an action verb (e.g., “filter out”); and computes the second set of one or more conversation centers based on the temporary set of one or more conversation centers, the first subset of conversation centers, and the first action conversation center.
In some such implementations, the device determines if the first anaphoric reference is a deictic reference that refers to some object in the environment, typically by pointing, and if so, computes the second set of one or more conversation centers based on the temporary set of one or more conversation centers, and a characteristic of the object. Deictic references are typically enabled through multimodal interaction (e.g., via the use of a mouse in addition to speech or text).
In some such implementations, the device further determines if the first anaphoric reference is a reference to a visualization property in the updated data visualization, and if so, computes the second set of one or more conversation centers based on the temporary set of one or more conversation centers and data related to the visualization property.
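The anaphora cases above (a plain anaphor, an anaphor accompanied by a verb, and a deictic reference resolved by pointing) can be sketched as follows; the dictionary representation, the "action" key, and the example commands are assumptions for illustration only. A reference to a visualization property would be handled analogously, by adding data related to that property as a conversation center.

```python
from typing import Dict, Optional

def resolve_anaphora(temp_centers: Dict[str, str],
                     first_centers: Dict[str, str],
                     accompanying_verb: Optional[str] = None,
                     selected_mark: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    if selected_mark is not None:
        # Deictic reference ("these", while pointing at marks): use a characteristic of
        # the pointed-to object as an additional conversation center.
        return {**temp_centers, **selected_mark}
    # Plain anaphor: combine the prior centers the phrasal chunk refers back to with the
    # new command's centers (the new command takes precedence on conflicts).
    carried = {k: v for k, v in first_centers.items() if k != "action"}
    second = {**carried, **temp_centers}
    if accompanying_verb is not None and "action" in first_centers:
        # Anaphor accompanied by a verb ("do that ..."): also reuse the prior action center.
        second["action"] = first_centers["action"]
    return second

# "filter out condos" ... "do that for apartments too"
print(resolve_anaphora({"home_type": "apartment"},
                       {"home_type": "condo", "action": "filter_out"},
                       accompanying_verb="do"))
```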
In some implementations, the device identifies the one or more forms of pragmatics by performing a sequence of operations that comprises determining whether the second natural language command is a repair utterance by determining whether the phrase structure corresponds to one or more predefined repair utterances (say, to repair a potential ambiguity in the first natural language command or how the results are presented to the user). For example, the user utters “get rid of condo” or “change from condo to townhomes.” In such implementations, if the device determines that the second natural language command is a repair utterance, the device computes the second set of one or more conversation centers based on the temporary set of one or more conversation centers; and updates one or more data attributes in the second set of one or more conversation centers based on the one or more predefined repair utterances and the phrase structure.
In some such implementations, the device determines if the phrase structure corresponds to a repair utterance to change a default behavior related to displaying a data visualization (e.g., highlighting for selection, such as in response to “no filter, instead”), and if so, the device changes the default behavior related to displaying.
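Repair-utterance handling might, as one possible sketch, match the phrase structure against a small table of predefined repair patterns and update the affected attribute; the regular expressions and attribute names below are illustrative assumptions, not an enumeration from the patent.

```python
import re
from typing import Dict

# Hypothetical predefined repair patterns.
REMOVE_PATTERN = re.compile(r"get rid of (\w+)")
REPLACE_PATTERN = re.compile(r"change from (\w+) to (\w+)")

def apply_repair(command: str, centers: Dict[str, str], attribute: str) -> Dict[str, str]:
    repaired = dict(centers)
    if REMOVE_PATTERN.search(command.lower()):
        repaired.pop(attribute, None)             # drop the repaired-away value
    match = REPLACE_PATTERN.search(command.lower())
    if match:
        repaired[attribute] = match.group(2)      # swap in the corrected value
    return repaired

print(apply_repair("change from condo to townhomes", {"home_type": "condo"}, "home_type"))
# -> {'home_type': 'townhomes'}
```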
In some implementations, the device identifies the one or more forms of pragmatics by performing a sequence of operations that comprises determining whether the second natural language command is a conjunctive expression by (i) determining explicit or implicit presence of conjunctions in the phrase structure, and (ii) determining whether the temporary set of one or more conversation centers includes each conversation center in the first set of one or more conversation centers. In such implementations, the device derives the second set of one or more conversation centers by performing another set of operations that comprises: in accordance with the determination that the second natural language command is a conjunctive expression, computing the second set of one or more conversation centers based on the temporary set of one or more conversation centers.
In some such implementations, the device determines if the second natural language command has more than one conjunct; and in accordance with the determination that the second natural language command has more than one conjunct, the device computes the second set of one or more analytical functions by linearizing the second natural language command. In some such implementations, the device linearizes the second natural language command by performing a sequence of operations that comprises: generating a parse tree for the second natural language command; traversing the parse tree in post-order to extract a first analytic phrase and a second analytic phrase, wherein the first analytic phrase and the second analytic phrase are adjacent nodes in the parse tree; computing a first analytical function and a second analytical function corresponding to the first analytic phrase and the second analytic phrase, respectively; and combining the first analytical function with the second analytical function by applying one or more logical operators based on one or more characteristics of the first analytical function and the second analytical function, wherein the one or more characteristics include attribute type, operator type, and a value.
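A rough sketch, under assumed names, of linearizing a compound command: build a parse tree, walk it in post-order, collect the analytic phrases at adjacent leaf nodes, and hand each adjacent pair to a combining rule such as the union/intersection logic sketched further below.

```python
from typing import List, Tuple, Union

# A parse node is either a leaf phrase (str) or (label, child, child, ...).
Tree = Union[str, Tuple]
CONJUNCTIONS = {"and", "or"}

def postorder_phrases(node: Tree, out: List[str]) -> List[str]:
    if isinstance(node, str):
        if node.lower() not in CONJUNCTIONS:
            out.append(node)               # a leaf analytic phrase
    else:
        for child in node[1:]:             # visit children before the parent label
            postorder_phrases(child, out)
    return out

# Illustrative parse of "houses under 600k in Ballard and townhomes in Fremont".
parse = ("S",
         ("NP", "houses under 600k", "in Ballard"),
         "and",
         ("NP", "townhomes", "in Fremont"))
phrases = postorder_phrases(parse, [])
print(phrases)   # adjacent phrases are then mapped to analytical functions and combined pairwise
```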
In some such implementations, the first analytical function comprises a first attribute (sometimes herein called a variable, and includes a visualization property), a first operator, and a first value; the second analytical function comprises a second attribute (sometimes herein called a variable, and includes a visualization property), a second operator, and a second value.
In some such implementations, combining the first analytical function with the second analytical function comprises: determining whether the first attribute is a categorical type attribute or an ordered type attribute, and determining whether the second attribute is a categorical type attribute or an ordered type attribute; determining whether the first attribute and the second attribute are identical; and, in accordance with a determination that the first attribute and the second attribute are identical and are both categorical type attributes, applying a union operator to combine the first analytical function and the second analytical function.
In some such implementations, combining the first analytical function with the second analytical function comprises: determining whether the first attribute is a categorical type attribute or an ordered type attribute, and determining whether the second attribute is a categorical type attribute or an ordered type attribute; determining whether the first attribute and the second attribute are identical; and, in accordance with a determination that the first attribute and the second attribute are non-identical, applying the intersection operator to combine the first analytical function and the second analytical function.
In some such implementations, combining the first analytical function with the second analytical function comprises: determining whether the first attribute is a categorical type attribute or an ordered type attribute, and determining whether the second attribute is a categorical type attribute or an ordered type attribute; determining whether the first attribute and the second attribute are identical; and, in accordance with a determination that the first attribute and the second attribute are identical and are both ordered type attributes: determining the operator types of the first operator and the second operator; and, in accordance with a determination that the first operator and the second operator are both equality operators, applying the union operator to combine the first analytical function and the second analytical function.
In some such implementations, combining the first analytical function with the second analytical function comprises: determining whether the first attribute is a categorical type attribute or an ordered type attribute, and determining whether the second attribute is a categorical type attribute or an ordered type attribute; determining whether the first attribute and the second attribute are identical; and, in accordance with a determination that the first attribute and the second attribute are identical and are both ordered type attributes: determining the operator types of the first operator and the second operator; and in accordance with a determination that the first operator is a “less than” operator and the second operator is a “greater than” operator: determining whether the first value is less than the second value; and in accordance with a determination that the first value is less than the second value, applying the union operator to combine the first analytical function and the second analytical function.
In some such implementations, combining the first analytical function with the second analytical function comprises: determining whether the first attribute is a categorical type attribute or an ordered type attribute, and determining whether the second attribute is a categorical type attribute or an ordered type attribute; determining whether the first attribute and the second attribute are identical; and, in accordance with a determination that the first attribute and the second attribute are identical and are both ordered type attributes: determining the operator types of the first operator and the second operator; and in accordance with a determination that the first operator is a “greater than” operator and the second operator is a “less than” operator: determining whether the first value is less than the second value; and in accordance with a determination that the first value is less than the second value, applying the intersection operator to combine the first analytical function and the second analytical function.
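The attribute, operator, and value cases described in the last several paragraphs can be summarized as a single rule table. In the sketch below each analytical function is modeled as an (attribute, attribute_kind, operator, value) tuple; only the listed cases come from the description, and the fall-through defaults are assumptions.

```python
def combine_operator(f1, f2) -> str:
    """Return 'UNION' or 'INTERSECTION' for combining two analytical functions."""
    attr1, kind1, op1, val1 = f1
    attr2, kind2, op2, val2 = f2
    if attr1 != attr2:
        return "INTERSECTION"                      # non-identical attributes
    if kind1 == kind2 == "categorical":
        return "UNION"                             # identical categorical attribute
    if kind1 == kind2 == "ordered":
        if op1 == op2 == "==":
            return "UNION"                         # both equality operators
        if op1 == "<" and op2 == ">" and val1 < val2:
            return "UNION"                         # e.g. "less than 400k or greater than 800k"
        if op1 == ">" and op2 == "<" and val1 < val2:
            return "INTERSECTION"                  # e.g. "greater than 400k and less than 800k"
    return "INTERSECTION"                          # fall-through default (an assumption)

print(combine_operator(("price", "ordered", "<", 400000),
                       ("price", "ordered", ">", 800000)))   # -> UNION
```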
In some implementations, the device further computes semantic relatedness between the second set of one or more extracted analytic phrases and one or more attributes of data included in the updated data visualization, and computes analytical functions associated with the second set of one or more analytic phrases, thereby creating the second set of one or more functional phrases, based on the semantically related one or more attributes of data. As opposed to grammatical cohesion or cohesion between contexts, lexical cohesion looks for cohesion within the context.
In some such implementations, the device computes semantic relatedness by performing a sequence of operations that comprises: training a first neural network model on a large corpus of text, thereby learning word embeddings; computing a first word vector for a first word in a first phrase in the second set of one or more analytic phrases using a second neural network model, the first word vector mapping the first word to the word embeddings; computing a second word vector for a first data attribute in the one or more data attributes using the second neural network model, the second word vector mapping the first data attribute to the word embeddings; and computing relatedness between the first word vector and the second word vector using a similarity metric.
In some such implementations, the first neural network model is a Word2vec™ model. In some such implementations, the second neural network model is a recurrent neural network model.
In some such implementations, the similarity metric is based at least on (i) Wu-Palmer distance between word senses associated with the first word vector and the second word vector, (ii) a weighting factor, and (iii) a pairwise cosine distance between the first word vector and the second word vector.
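A hedged sketch of a combined similarity along these lines: the Wu-Palmer similarity between WordNet senses blended with the cosine similarity (one minus cosine distance) between word vectors using a weighting factor. The exact combination, the weight, and the random stand-in vectors (used in place of embeddings learned by a neural model) are assumptions.

```python
import numpy as np
import nltk

nltk.download("wordnet", quiet=True)
from nltk.corpus import wordnet as wn

def wup_similarity(word_a: str, word_b: str) -> float:
    # Maximum Wu-Palmer similarity over the words' sense pairs (0.0 if no senses compare).
    scores = [s1.wup_similarity(s2) or 0.0
              for s1 in wn.synsets(word_a) for s2 in wn.synsets(word_b)]
    return max(scores, default=0.0)

def cosine_similarity(v1: np.ndarray, v2: np.ndarray) -> float:
    return float(np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2)))

def relatedness(word_a: str, word_b: str,
                vec_a: np.ndarray, vec_b: np.ndarray, weight: float = 0.5) -> float:
    return (weight * wup_similarity(word_a, word_b)
            + (1.0 - weight) * cosine_similarity(vec_a, vec_b))

rng = np.random.default_rng(0)
print(relatedness("cheap", "price", rng.normal(size=50), rng.normal(size=50)))
```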
In some such implementations, the device computes analytical functions by performing a sequence of operations that comprises: obtaining word definitions for the second set of one or more analytic phrases from a publicly available dictionary; determining whether the word definitions contain one or more predefined adjectives using a part-of-speech API provided by a natural language toolkit; and in accordance with the determination that the word definitions contain one or more predefined adjectives, mapping the one or more predefined adjectives to one or more analytical functions.
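As an illustrative sketch of this mapping, the dictionary lookup below is replaced by a hard-coded table of definitions, and the adjective-to-function table is invented; only the part-of-speech tagging step uses a real toolkit call (NLTK, as an assumed choice).

```python
import nltk

nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

# Stand-ins for definitions fetched from a publicly available dictionary.
DEFINITIONS = {"cheap": "low in price", "expensive": "high in price"}
# Hypothetical mapping from predefined adjectives to analytical functions.
ADJECTIVE_TO_FUNCTION = {"low": ("price", "filter_bottom"), "high": ("price", "filter_top")}

def functions_for_word(word: str):
    definition = DEFINITIONS.get(word, "")
    tagged = nltk.pos_tag(nltk.word_tokenize(definition))
    adjectives = [token for token, tag in tagged if tag.startswith("JJ")]
    return [ADJECTIVE_TO_FUNCTION[adj] for adj in adjectives if adj in ADJECTIVE_TO_FUNCTION]

print(functions_for_word("cheap"))   # -> [('price', 'filter_bottom')] when "low" is tagged JJ
```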
Typically, an electronic device includes one or more processors, memory, a display, and one or more programs stored in the memory. The programs are configured for execution by the one or more processors and are configured to perform any of the methods described herein. The one or more programs include instructions for displaying a data visualization based on a first dataset retrieved from a database using a first set of one or more queries. The one or more programs also include instructions for receiving a first user input to specify a first natural language command related to the data visualization. The one or more programs also include instructions for extracting a first set of one or more independent analytic phrases from the first natural language command. The one or more programs also include instructions for computing a first set of one or more conversation centers associated with the first natural language command based on the first set of one or more analytic phrases. The one or more programs also include instructions for computing a first set of analytical functions associated with the first set of one or more conversation centers, thereby creating a first set of one or more functional phrases. The one or more programs also include instructions for updating the data visualization based on the first set of one or more functional phrases. The one or more programs also include instructions for receiving a second user input to specify a second natural language command related to the updated data visualization. The one or more programs also include instructions for extracting a second set of one or more independent analytic phrases from the second natural language command. The one or more programs also include instructions for computing a temporary set of one or more conversation centers associated with the second natural language command based on the second set of one or more analytic phrases. The one or more programs also include instructions for computing cohesion between the first set of one or more analytic phrases and the second set of one or more analytic phrases. The one or more programs also include instructions for deriving a second set of one or more conversation centers from the first set of one or more conversation centers and the temporary set of one or more conversation centers based on the cohesion. The one or more programs also include instructions for computing a second set of one or more analytical functions associated with the second set of one or more conversation centers, thereby creating a second set of one or more functional phrases, and updating the data visualization based on the second set of one or more functional phrases.
In some implementations, a non-transitory computer readable storage medium stores one or more programs configured for execution by a computing device having one or more processors, memory, and a display. The one or more programs are configured to perform any of the methods described herein. The one or more programs include instructions for displaying a data visualization based on a first dataset retrieved from a database using a first set of one or more queries. The one or more programs also include instructions for receiving a first user input to specify a first natural language command related to the data visualization. The one or more programs also include instructions for extracting a first set of one or more independent analytic phrases from the first natural language command. The one or more programs also include instructions for computing a first set of one or more conversation centers associated with the first natural language command based on the first set of one or more analytic phrases. The one or more programs also include instructions for computing a first set of analytical functions associated with the first set of one or more conversation centers, thereby creating a first set of one or more functional phrases. The one or more programs also include instructions for updating the data visualization based on the first set of one or more functional phrases. The one or more programs also include instructions for receiving a second user input to specify a second natural language command related to the updated data visualization. The one or more programs also include instructions for extracting a second set of one or more independent analytic phrases from the second natural language command. The one or more programs also include instructions for computing a temporary set of one or more conversation centers associated with the second natural language command based on the second set of one or more analytic phrases. The one or more programs also include instructions for computing cohesion between the first set of one or more analytic phrases and the second set of one or more analytic phrases. The one or more programs also include instructions for deriving a second set of one or more conversation centers from the first set of one or more conversation centers and the temporary set of one or more conversation centers based on the cohesion. The one or more programs also include instructions for computing a second set of one or more analytical functions associated with the second set of one or more conversation centers, thereby creating a second set of one or more functional phrases, and updating the data visualization based on the second set of one or more functional phrases.
In another aspect, in accordance with some implementations, a method executes at an electronic device with a display. For example, the electronic device can be a smart phone, a tablet, a notebook computer, or a desktop computer. The device displays a data visualization based on a dataset retrieved from a database using a first set of one or more database queries. A user specifies a first natural language command related to the displayed data visualization. Based on the displayed data visualization, the device extracts a first set of one or more independent analytic phrases from the first natural language command. The device then computes a first set of one or more conversation centers associated with the first natural language command based on the first set of one or more analytic phrases. The device then computes a first set of analytical functions associated with the first set of one or more conversation centers, thereby creating a first set of one or more functional phrases. The device then updates the data visualization based on the first set of one or more functional phrases. The user specifies a second natural language command related to the updated data visualization. After receiving the second natural language command, the device extracts a second set of one or more independent analytic phrases from the second natural language command, and computes a temporary set of one or more conversation centers associated with the second natural language command based on the second set of one or more analytic phrases. The device then derives a second set of one or more conversation centers from the first set of one or more conversation centers and the temporary set of one or more conversation centers using one or more transitional rules. The device then updates the data visualization based on the second set of one or more conversation centers.
In some implementations, the device further determines one or more data attributes corresponding to the second set of one or more conversation centers; scans displayed data visualizations to identify one or more of the displayed data visualizations that contain data marks whose characteristics correspond to a first data attribute in the one or more data attributes; and highlights the data marks whose characteristics correspond to the first data attribute. In some such implementations, the device further filters, from the displayed data visualizations, results that contain data marks whose characteristics do not correspond to the one or more data attributes. Further, in some such implementations, the device receives a user input to determine whether to filter or to highlight the data marks (e.g., via a natural language command, such as ‘exclude,’ ‘remove,’ and ‘filter only’).
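By way of illustration only, the following Python sketch shows one way the filter-versus-highlight decision described above could be applied to a set of data marks; the mark records, attribute names, and the apply_action helper are hypothetical and are not drawn from the described implementation.

```python
# Illustrative sketch only: choosing between filtering and highlighting data marks.
# The mark records, attribute name, and helper name are hypothetical.
marks = [
    {"home_type": "condo", "price": 450000},
    {"home_type": "townhome", "price": 550000},
    {"home_type": "house", "price": 900000},
]

def apply_action(marks, attribute, value, command):
    # Natural language cues such as 'exclude' or 'remove' request filtering;
    # otherwise the matching marks are highlighted by default.
    if any(cue in command for cue in ("exclude", "remove")):
        return [m for m in marks if m.get(attribute) != value]
    for m in marks:
        m["highlighted"] = (m.get(attribute) == value)
    return marks

print(apply_action(list(marks), "home_type", "condo", "remove condos"))
# -> the condo mark is filtered out of the result
```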
In some implementations, the visualization characteristics include one or more of color, size, and shape. In some implementations, the visualization characteristics correspond to a visual encoding of data marks. In some implementations, the visual encoding is one or more of color, size, and shape.
In some implementations, the device determines if none of the displayed data visualizations contain data marks whose characteristics correspond to the first data attribute, and if so, generates a specification for a new data visualization with the first data attribute (e.g., aggregation types) and displays the new data visualization. In some such implementations, displaying the new data visualization further comprises determining a chart type based on the specification; and generating and displaying the chart. Further, in some such implementations, the chart is positioned using a two-dimensional grid-based layout algorithm, automatically coordinated with other data visualizations (sometimes herein called views).
In some implementations, the device further performs a sequence of operations comprising: computing a second set of one or more analytical functions associated with the second set of one or more conversation centers, thereby creating a second set of one or more functional phrases; selecting a first functional phrase from the second set of one or more functional phrases, wherein the first functional phrase comprises a parameterized data selection criterion; selecting an initial range for values of the parameters of the parameterized data selection criterion; displaying an editable user interface control (e.g., widgets) corresponding to the parameterized data selection criterion, wherein the user interface control displays the current values of the parameters; and ordering a displayed set of one or more editable user interface controls based on the order of queries in the second natural language command, wherein the order of queries is inferred while extracting the second set of one or more analytic phrases from the second natural language command. In some such implementations, the user interface control allows adjustment of the first functional phrase. Further, in some such implementations, the user interface control displays a slider, which enables a user to adjust the first functional phrase. In some such implementations, ordering the displayed set of one or more editable user interface controls further comprises using a library that facilitates the compact placement of small word-scale visualization within text. In some such implementations, the library is Sparklificator™.
In some implementations, the device performs a sequence of operations aimed at automatically correcting some user errors. The sequence of operations comprises: determining a first token in the second natural language command that does not correspond to any of the analytic phrases in the second set of one or more analytic phrases (for example, due to a parsing failure); searching for a correctly spelled term corresponding to the first token using a search library by comparing the first token with one or more features of the first dataset; substituting the correctly spelled term for the first token in the second natural language command to obtain a third natural language command; and extracting the second set of one or more analytic phrases from the third natural language command. In some such implementations, the one or more features include data attributes, cell values, and related keywords of the first dataset. In some such implementations, the search library is a fuzzy string library, such as Fuse.js™.
In some such implementations, the device further performs a sequence of operations comprising: determining whether there is no correctly spelled term corresponding to the first token; and in accordance with a determination that there is no correctly spelled term corresponding to the first token: parsing the second natural language command to obtain a parse tree; pruning the parse tree to remove the portion of the tree corresponding to the first token; and extracting the second set of one or more analytic phrases based on the pruned parse tree.
In some implementations, the device further generates textual feedback indicating that the first token was unrecognized and therefore removed from the second natural language command. This situation typically occurs when the utterance was only partially understood. In some such implementations, the device displays the first token.
In some implementations, the device further generates textual feedback indicating that the correctly spelled term has been substituted for the first token in the second natural language command. This typically occurs when the utterance was not successfully understood and the device suggests an alternative query. Further, in some such implementations, the device displays and highlights the correctly spelled term.
Typically, an electronic device includes one or more processors, memory, a display, and one or more programs stored in the memory. The programs are configured for execution by the one or more processors and are configured to perform any of the methods described herein. The one or more programs include instructions for displaying a data visualization based on a first dataset retrieved from a database using a first set of one or more queries. The one or more programs also include instructions for receiving a first user input to specify a first natural language command related to the data visualization. The one or more programs also include instructions for extracting a first set of one or more independent analytic phrases from the first natural language command. The one or more programs also include instructions for computing a first set of one or more conversation centers associated with the first natural language command based on the first set of one or more analytic phrases. The one or more programs also include instructions for computing a first set of analytical functions associated with the first set of one or more conversation centers, thereby creating a first set of one or more functional phrases. The one or more programs also include instructions for updating the data visualization based on the first set of one or more functional phrases. The one or more programs also include instructions for receiving a second user input to specify a second natural language command related to the updated data visualization. The one or more programs also include instructions for extracting a second set of one or more independent analytic phrases from the second natural language command. The one or more programs also include instructions for computing a temporary set of one or more conversation centers associated with the second natural language command based on the second set of one or more analytic phrases. The one or more programs also include instructions for deriving a second set of one or more conversation centers from the first set of one or more conversation centers and the temporary set of one or more conversation centers using one or more transitional rules. The one or more programs also include instructions for updating the data visualization based on the second set of one or more conversation centers, wherein the updating comprises: determining one or more data attributes corresponding to the second set of one or more conversation centers; scanning displayed data visualizations to identify one or more of the displayed data visualizations that contain data marks whose characteristics correspond to a first data attribute in the one or more data attributes; and highlighting the data marks whose characteristics correspond to the first data attribute.
In some implementations, a non-transitory computer readable storage medium stores one or more programs configured for execution by a computing device having one or more processors, memory, and a display. The one or more programs are configured to perform any of the methods described herein. The one or more programs include instructions for displaying a data visualization based on a first dataset retrieved from a database using a first set of one or more queries. The one or more programs also include instructions for receiving a first user input to specify a first natural language command related to the data visualization. The one or more programs also include instructions for extracting a first set of one or more independent analytic phrases from the first natural language command. The one or more programs also include instructions for computing a first set of one or more conversation centers associated with the first natural language command based on the first set of one or more analytic phrases. The one or more programs also include instructions for computing a first set of analytical functions associated with the first set of one or more conversation centers, thereby creating a first set of one or more functional phrases. The one or more programs also include instructions for updating the data visualization based on the first set of one or more functional phrases. The one or more programs also include instructions for receiving a second user input to specify a second natural language command related to the updated data visualization. The one or more programs also include instructions for extracting a second set of one or more independent analytic phrases from the second natural language command. The one or more programs also include instructions for computing a temporary set of one or more conversation centers associated with the second natural language command based on the second set of one or more analytic phrases. The one or more programs also include instructions for deriving a second set of one or more conversation centers from the first set of one or more conversation centers and the temporary set of one or more conversation centers using one or more transitional rules. The one or more programs also include instructions for updating the data visualization based on the second set of one or more conversation centers, wherein the updating comprises: determining one or more data attributes corresponding to the second set of one or more conversation centers; scanning displayed data visualizations to identify one or more of the displayed data visualizations that contain data marks whose characteristics correspond to a first data attribute in the one or more data attributes; and highlighting the data marks whose characteristics correspond to the first data attribute.
Thus methods, systems, and graphical user interfaces are disclosed that allow users to efficiently explore data displayed within a data visualization application by using natural language commands.
Both the foregoing general description and the following detailed description are exemplary and explanatory, and are intended to provide further explanation of the invention as claimed.
For a better understanding of the aforementioned systems, methods, and graphical user interfaces, as well as additional systems, methods, and graphical user interfaces that provide data visualization analytics, reference should be made to the Description of Implementations below, in conjunction with the following drawings in which like reference numerals refer to corresponding parts throughout the figures.
Reference will now be made to implementations, examples of which are illustrated in the accompanying drawings. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. However, it will be apparent to one of ordinary skill in the art that the present invention may be practiced without requiring these specific details.
The graphical user interface 100 also includes a data visualization region 112. The data visualization region 112 includes a plurality of shelf regions, such as a columns shelf region 120 and a rows shelf region 122. These are also referred to as the column shelf 120 and the row shelf 122. As illustrated here, the data visualization region 112 also has a large space for displaying a visual graphic (also referred to herein as a data visualization). Because no data elements have been selected yet, the space initially has no visual graphic. In some implementations, the data visualization region 112 has multiple layers that are referred to as sheets.
In some implementations, the graphical user interface 100 also includes a natural language processing region 124. The natural language processing region 124 includes an input bar (also referred to herein as a command bar) for receiving natural language commands. A user may interact with the input bar to provide commands. For example, the user may type a command in the input bar to provide the command. In addition, the user may indirectly interact with the input bar by speaking into a microphone (e.g., an audio input device 220) to provide commands. In some implementations, data elements are initially associated with the column shelf 120 and the row shelf 122 (e.g., using drag and drop operations from the schema information region 110 to the column shelf 120 and/or the row shelf 122). After the initial association, the user may use natural language commands (e.g., in the natural language processing region 124) to further explore the displayed data visualization. In some instances, a user creates the initial association using the natural language processing region 124, which results in one or more data elements being placed in the column shelf 120 and the row shelf 122. For example, the user may provide a command to create a relationship between data element X and data element Y. In response to receiving the command, the column shelf 120 and the row shelf 122 may be populated with the data elements (e.g., the column shelf 120 may be populated with data element X and the row shelf 122 may be populated with data element Y, or vice versa).
The memory 206 includes high-speed random-access memory, such as DRAM, SRAM, DDR RAM, or other random-access solid-state memory devices; and may include non-volatile memory, such as one or more magnetic disk storage devices, optical disk storage devices, flash memory devices, or other non-volatile solid-state storage devices. In some implementations, the memory 206 includes one or more storage devices remotely located from the processor(s) 202. The memory 206, or alternately the non-volatile memory device(s) within the memory 206, includes a non-transitory computer-readable storage medium. In some implementations, the memory 206 or the computer-readable storage medium of the memory 206 stores the following programs, modules, and data structures, or a subset or superset thereof:
In some implementations, the data visualization application 230 includes a data visualization generation module 234, which takes user input (e.g., a visual specification 236), and generates a corresponding visual graphic. The data visualization application 230 then displays the generated visual graphic in the user interface 232. In some implementations, the data visualization application 230 executes as a standalone application (e.g., a desktop application). In some implementations, the data visualization application 230 executes within the web browser 226 or another application using web pages provided by a web server (e.g., a server-based application).
In some implementations, the information the user provides (e.g., user input) is stored as a visual specification 236. In some implementations, the visual specification 236 includes previous natural language commands received from a user or properties specified by the user through natural language commands.
In some implementations, the data visualization application 230 includes a language processing module 238 for processing (e.g., interpreting) commands provided by a user of the computing device. In some implementations, the commands are natural language commands (e.g., captured by the audio input device 220). In some implementations, the language processing module 238 includes sub-modules such as an autocomplete module, a pragmatics module, and an ambiguity module, each of which is discussed in further detail below.
In some implementations, the memory 206 stores metrics and/or scores determined by the language processing module 238. In addition, the memory 206 may store thresholds and other criteria, which are compared against the metrics and/or scores determined by the language processing module 238. For example, the language processing module 238 may determine a relatedness metric (discussed in detail below) for an analytic word/phrase of a received command. Then, the language processing module 238 may compare the relatedness metric against a threshold stored in the memory 206.
Each of the above identified executable modules, applications, or sets of procedures may be stored in one or more of the previously mentioned memory devices, and corresponds to a set of instructions for performing a function described above. The above identified modules or programs (i.e., sets of instructions) need not be implemented as separate software programs, procedures, or modules, and thus various subsets of these modules may be combined or otherwise re-arranged in various implementations. In some implementations, the memory 206 stores a subset of the modules and data structures identified above. Furthermore, the memory 206 may store additional modules or data structures not described above.
Sequences of utterances that exhibit coherence form a conversation. Coherence is a semantic property of conversation, based on the interpretation of each individual utterance relative to the interpretation of other utterances. As previously mentioned, in order to correctly interpret a set of utterances, the process framework utilizes and extends a model commonly used for discourse structure called conversational centering, in accordance with some implementations. In this model, utterances are divided into constituent discourse segments, with embedding relationships that may hold between two segments. A center refers to those entities serving to link that utterance to other utterances in the discourse. Consider a discourse segment DS with utterances U1, . . . , Um. Each utterance Un (1≤n<m) in DS is assigned a set of forward-looking centers, Cf (Un, DS), referring to the current focus of the conversation; each utterance other than the segment's initial utterance is assigned a set of backward-looking centers, Cb (Un, DS). The set of backward-looking centers of a new utterance Un+1 is Cb (Un+1, DS), which is equal to the forward-looking centers of Un (i.e., Cf (Un, DS)). In the context of visual analytic conversations, forward and backward-looking centers include data attributes and values, visual properties, and analytical actions (e.g., filter, highlight).
Each discourse segment exhibits both global coherence (i.e., the global context of the entire conversation, usually referring to a topic or subject of the conversation) and local coherence (i.e., coherence amongst the utterances within that conversation). Local coherence refers to inferring a sequence of utterances within a local context through transitional states of continuing, retaining, and replacing between Cf (Un, DS) and Cb (Un, DS). The framework extends this conversational centering theory for visual analytical conversation by introducing a set of rules for each of these local coherence constructs, in accordance with some implementations.
Given an utterance Un, a system implementing this framework responds by executing a series of analytical functions derived from the forward-looking centers Cf (Un, DS). An analytical function F(X, op, v) consists of a variable X (which can be an attribute or a visualization property), an operator op, and a value v (typically a constant), according to some implementations. For example, when the user says “measles in the uk,” the system creates two functions, namely F_CAT(diseases, ==, measles) and F_CAT(country, ==, uk). When the user provides a new utterance Un+1, the system first creates a set of temporary centers Ctemp(Un+1, DS) from Un+1 without considering any previous context. The system then applies a set of rules to create a set of forward-looking centers, Cf (Un+1, DS) based on some set operations between Cb (Un+1,DS) and Ctemp (Un+1, DS). The forward-looking centers are then used to respond to the user utterance according to some implementations.
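By way of illustration only, the following minimal Python sketch shows one way an analytical function F(X, op, v), and the two functions produced for “measles in the uk,” could be represented; the class name is hypothetical and not drawn from the described implementation.

```python
# Illustrative sketch only: a minimal representation of an analytical function F(X, op, v).
from dataclasses import dataclass

@dataclass
class AnalyticalFunction:
    variable: str   # X: a data attribute or a visualization property
    op: str         # an operator such as '==', '<', or '>'
    value: object   # v: typically a constant

# "measles in the uk" yields two categorical functions:
functions = [
    AnalyticalFunction("diseases", "==", "measles"),   # F_CAT(diseases, ==, measles)
    AnalyticalFunction("country", "==", "uk"),         # F_CAT(country, ==, uk)
]
```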
The transition Continue 323 continues the context from the backward-looking centers Cb to the forward-looking centers Cf, in accordance with some implementations. In other words, each of the conversation centers in the backward-looking conversation centers Cb is included (324) in the forward-looking conversation centers Cf. Using set notation, for a given utterance Un+1, in a discourse segment DS, as a result of this transition,
Cb(Un+1,DS)⊂Cf(Un+1,DS).
This transition occurs when a variable X is in Ctemp (Un+1, DS) but not in Cb (Un+1, DS), in accordance with some implementations. In this case, the system performs the following union operation:
Cf(Un+1,DS)=Cb(Un+1,DS)∪Ctemp(Un+1,DS).
The transition Retain 325 retains (326) the context from the backward-looking centers Cb (322) in the forward-looking centers Cf without adding additional entities to the forward-looking centers, in accordance with some implementations. In other words,
Cf(Un+1,DS)=Cb(Un+1,DS).
The transition Retain 325 triggers when the variable X is in Cb (Un+1, DS) but not in Ctemp(Un+1, DS), in accordance with some implementations.
In some implementations, with the Shift transition 327, the context shifts from the backward-looking conversation centers 322 to the forward-looking conversation centers 328. That is,
Cf(Un+1, DS)≠Cb(Un+1, DS).
In some implementations, the Shift transition 327 occurs when the variable X is in both Cb (Un+1, DS) and Ctemp (Un+1, DS), but the corresponding values are different. In this case, the system replaces all the backward-looking centers in Cb (Un+1, DS) that contain X with the corresponding centers in Ctemp (Un+1, DS). As a result,
Cf(Un+1, DS)=Cb(Un+1, DS)−XCb+XCtemp,
where XCb and XCtemp denote the conversation centers containing the variable X in Cb (Un+1, DS) and Ctemp (Un+1, DS), respectively.
In some implementations, the Shift transition 327 also occurs when a filter constraint is removed (e.g., removing a widget for measles shifts the disease variable from measles to all diseases).
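By way of illustration only, the following Python sketch shows how the Continue, Retain, and Shift transitions described above might be applied when conversation centers are modeled as simple variable-to-value mappings; the function and variable names are hypothetical.

```python
# Illustrative sketch only: applying the Continue / Retain / Shift transitions.
# Conversation centers are modeled as {variable: value} mappings; names are hypothetical.
def derive_forward_centers(cb, ctemp):
    """cb: backward-looking centers Cb(Un+1, DS); ctemp: temporary centers Ctemp(Un+1, DS)."""
    cf = dict(cb)                       # Retain: carry the previous context forward
    for variable, value in ctemp.items():
        if variable not in cb:
            cf[variable] = value        # Continue: new variable, union with the prior context
        elif cb[variable] != value:
            cf[variable] = value        # Shift: same variable, different value, so replace it
    return cf

# "houses in ballard under 1M" followed by "townhomes":
cb = {"home_type": "houses", "place": "ballard", "price": "1M"}
ctemp = {"home_type": "townhomes"}
print(derive_forward_centers(cb, ctemp))
# {'home_type': 'townhomes', 'place': 'ballard', 'price': '1M'}
```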
Given an utterance Un, a system implementing this framework responds by executing a series of analytical functions derived from the forward-looking centers Cf (Un, DS). An analytical function F(X, op, v) consists of a variable X (which can be an attribute or a visualization property), an operator op, and a value v (typically a constant). For example, when the user says “measles in the uk,” the system creates two functions, such as the function F_CAT(diseases, ==, measles) and the function F_CAT(country, ==, uk). When the user provides a new utterance Un+1, the system first creates a set of temporary centers Ctemp(Un+1, DS) from Un+1 without considering any previous context. The system then applies a set of rules to create a set of forward-looking centers, Cf (Un+1, DS) based on some set operations between Cb (Un+1,DS) and Ctemp (Un+1, DS). The forward-looking centers are then used to respond to the user utterance according to some implementations.
Conversation centering posits that utterances display connectedness between them. The manner in which these utterances link up with each other to form a conversation is cohesion. Cohesion comes about as a result of the combination of both lexical and grammatical structures in the constituent phrases. Identifying phrase structure is thus a logical starting point to resolve an utterance into one or more analytical functions applied to the visualization. Phrase structure includes both lexical and grammatical structure. In FIG. 5, a system implementing this framework computes phrase structure for utterance Un+1 (520) in step 522. Typically, a parser is used to compute the phrase structure. A parser accepts an input sentence (sometimes called a query or a natural language command) and breaks the input sentence into a sequence of tokens (linguistic elements) by applying a set of grammar rules specific to a particular natural language, such as English. In some implementations, the grammar rules can be modified to suit the environment. In some implementations, a probabilistic grammar is applied to provide a structural description of the input queries. Probabilistic grammars are useful in resolving ambiguities in sentence parsing. The probability distributions (for grammar production rules) can be estimated from a corpus of hand-parsed sentences, for instance. Some implementations deduce additional syntactic structure by employing a Part-Of-Speech (POS) Tagger that assigns parts of speech, such as noun, verb, adjective, to each word (sometimes called a token). Some implementations resolve the parsed output to corresponding categorical and ordered data attributes. As the dashed lines that connect the blocks 500 and 510 show, in some implementations, the system also computes (510) phrase structure for the utterance Un (500).
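By way of illustration only, the following sketch tokenizes and part-of-speech tags an utterance with the NLTK toolkit (assuming its tokenizer and tagger resources have been downloaded); the actual implementation may use any parser, probabilistic grammar, or POS tagger.

```python
# Illustrative sketch only: tokenizing and POS-tagging an utterance with NLTK.
# Assumes nltk plus its 'punkt' and 'averaged_perceptron_tagger' resources are installed.
import nltk

utterance = "townhomes in ballard under 1M"
tokens = nltk.word_tokenize(utterance)   # break the command into tokens (linguistic elements)
tagged = nltk.pos_tag(tokens)            # assign a part of speech to each token
# tagged is roughly [('townhomes', 'NNS'), ('in', 'IN'), ('ballard', 'NN'), ...]
candidate_phrases = [word for word, tag in tagged if tag.startswith("NN")]
```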
With the phrase structure(s), the system proceeds to determine (530) the type of pragmatic forms present in the utterance (examples of which are described below).
In some implementations, when the system receives an utterance Un+1 (620), which in this example is the utterance “townhomes,” the system computes (626) temporary conversation centers for Un+1 (620). For this example, the system computes the conversation centers (628) to be the set {townhomes}. Additionally, the system computes (622) phrase structure for the utterance Un+1 (620) using techniques described above in reference to step 522.
As mentioned above, ellipses exclude one or more linguistic elements. With the aid of the phrase structures (612 and 624), the system determines a subset of conversation centers of utterance Un (600) that corresponds to missing linguistic elements in utterance Un+1 (620), in accordance with some implementations. In this example, the system computes the subset to be the set {ballard, 1M}, because the linguistic elements, viz., a noun phrase that refers to a place following a prepositional phrase (corresponding to “ballard”) and a noun phrase that refers to a price value following another prepositional phrase (corresponding to “1M” or, more precisely, “under 1M”), are missing in utterance Un+1 (620) but were present in utterance Un (600). On the other hand, the phrase “houses” in the utterance Un (600) and the phrase “townhomes” in the utterance Un+1 (620) correspond to similar linguistic elements (e.g., both phrases are noun phrases and refer to types of houses).
In step 634, the system combines the temporary set of conversation centers, which in this example is the set {townhomes}, with the subset of conversation centers (632) to arrive at a set of forward-looking conversation centers (636) for utterance Un+1, in accordance with some implementations. Based on the computed set of forward-looking conversation centers (636), the system determines the type of filters to apply to the dataset and applies the appropriate filters in step 638 to display an appropriate data visualization (640), in accordance with some implementations. In this example, because the conversation centers “ballard” and “1M” were retained from the backward-looking conversation centers (604), the system retains the numerical filter (corresponding to 1M) and the spatial filter (corresponding to Ballard). Also, since the value of the conversation center (corresponding to the home_type variable) changed from houses to townhomes, the system applies a categorical filter on home_type to show townhomes (instead of houses).
As mentioned above, referring expressions with anaphoric references make references to something else within the text. Based on the phrase structure (724), the system identifies (726) anaphora in the utterance Un+1 (720), in accordance with some implementations. In this example, the system identifies the anaphora (728) “previous.” Using the identified anaphora, the system next identifies (734) phrasal chunk (732) containing the reference to identify the entities the reference is referring to, in accordance with some implementations. For the example shown, the system identifies the phrasal chunk “year” that corresponds to the anaphora “previous.” Based on the identified anaphora and the phrasal chunk, in step 730, the system searches through the backward-looking centers to find such entities and replaces the anaphoric reference with these entities, in accordance with some implementations. Additionally, in some implementations, as is the case in this example, the system also detects and applies appropriate functions to the value of the entity. For the example shown, the system also detects that the user is referring to the “previous” year, and therefore the value of 2015 is decremented by 1 before arriving at the right value for the year variable. The system computes the date for ‘previous’ using a temporal function (e.g., DATECALC), in accordance with some implementations. The system arrives at a set of forward-looking conversation centers (736), which for this example is the set {prices, 2014}. Based on this set, the system takes necessary steps to update the visualization in step 738, in accordance with some implementations. For this example, the system retains a reference to year and updates the temporal filter to 2014, to show the visualization in 740.
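By way of illustration only, the following sketch resolves an anaphoric reference such as “previous year” against the backward-looking centers; the dictionary form of the centers and the decrement logic stand in for the temporal function (e.g., DATECALC) mentioned above and are hypothetical.

```python
# Illustrative sketch only: resolving an anaphoric reference such as "previous year"
# against the backward-looking centers. The dictionary form and helper are hypothetical.
def resolve_previous_year(backward_centers):
    centers = dict(backward_centers)
    if "year" in centers:
        centers["year"] -= 1          # 'previous' decrements the year (a DATECALC-style step)
    return centers

print(resolve_previous_year({"measure": "prices", "year": 2015}))
# {'measure': 'prices', 'year': 2014}, matching the forward-looking centers {prices, 2014}
```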
In some implementations, the references refer to values of a data attribute. In some implementations, the references refer to actions that need to be executed by the system. For instance, consider the utterance “filter out ballard” followed by “do that to fremont.” Here, the word ‘that’ is not immediately followed by any noun, but is immediately preceded by the verb ‘do.’ In such cases, the system determines one or more actions mentioned in the previous utterance, which for this example is the action ‘filter out’.
In some implementations, the system supports references that lie outside the text, and in the context of the visualization. In some such implementations, the forward-looking center Cf references context within the visualization as opposed to text in the backward-looking center Cb. In some implementations, this form of indirect referencing includes a deictic reference that refers to some object in the environment, usually by pointing. In some such implementations, the system supports deictic references by enabling multimodal interaction (mouse+speech/text).
When the system receives an utterance Un+1 (810), which in this example is the utterance “houses in Ballard under 600 k last summer,” the system computes (812) phrase structure for the utterance Un+1 (810) using techniques described above in reference to step 522.
In some implementations, the system resolves multiple conjuncts within compound utterances to invoke one or more corresponding analytical functions through a process of linearization. In some such implementations, an analytical function F(X, op, v) consists of a variable X (e.g., an attribute), an operator op, and a value v. Each attribute is either categorical or ordered. The ordered data type is further categorized into ordinal and quantitative. The linearization process considers the types of attributes and operators to combine analytical functions using the logical operator AND (represented as “∧”) and the logical operator OR (represented as “∨”).
Applying the ∨ operator: When two or more adjacent conjuncts share an attribute and that attribute's data type is categorical, the system connects these conjuncts by ∨, in accordance with some implementations. Similarly, when the shared attribute is ordered and the function's operator is ==, the system applies ∨, in accordance with some implementations. In such cases, ∨ is logically more appropriate as a choice because applying ∧ would not match to any item in the data table. For example, if the utterance is “show me condos and townhomes,” then the system generates the following combination of analytical functions: (F_CAT(homeType, ==, condo)∨F_CAT(homeType, ==, townhome)), in accordance with some implementations. In this example, both ‘condo’ and ‘town-home’ belong to the same categorical attribute (e.g., homeType). Because a particular house (item) cannot be both ‘condo’ and ‘townhome’ at the same time, applying the ∨ operator is logically more appropriate than applying the ∧ operator. Similarly, if the user utters “2 3 bedroom houses,” the system generates (F_ORDINAL(bed, ==, 2)∨F_ORDINAL(bed, ==, 3)), in accordance with some implementations. The ∨ operator is also appropriate if the attribute type is ordered and involves the condition X<v1 and X>v2, where v1<v2. For instance, if the utterance is “before 2013 and after 2014,” then the ∨ operator will be used between the two conjuncts, in accordance with some implementations. Again, in this instance, applying the ∧ operator would result in matching no item in the data table.
Applying the ∧ operator: The ∧ operator is appropriate if the attribute type is ordered and involves the condition X>v1 and X<v2, where v1<v2. For example, “houses over 400 k and under 700 k” resolves to (F_NUMERIC(price, >, 400000)∧F_NUMERIC(price, <, 700000)). “Beds between 2 to 4” resolves to (F_ORDINAL(beds, >=, 2)∧F_ORDINAL(beds, <=, 4)). Notice that applying the ∨ operator would result in matching all items in the data table. In some implementations, the ∧ operator is also applied when there is no common attribute between two conjuncts. For example, the utterance “price under 600 k with 2 beds” resolves to (F_ORDINAL(beds, ==, 2)∧F_NUMERIC(price, <=, 600000)).
In order to generate the analytical function representation of the whole utterance, the system traverses a corresponding parse tree for the utterance generated by a parser (e.g., the parser described above).
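By way of illustration only, the following sketch applies the ∧/∨ rules described above to a pair of adjacent conjuncts; the attribute-type table and function names are hypothetical, and a full implementation would traverse the parse tree as described rather than consider only two conjuncts.

```python
# Illustrative sketch only: combining two adjacent conjuncts with AND / OR
# based on the shared attribute, its type, and the operators involved.
from collections import namedtuple

F = namedtuple("F", ["variable", "op", "value"])   # analytical function F(X, op, v)

def combine(f1, f2, attr_types):
    if f1.variable != f2.variable:
        return (f1, "AND", f2)                     # no shared attribute -> AND
    if attr_types[f1.variable] == "categorical":
        return (f1, "OR", f2)                      # shared categorical attribute -> OR
    if f1.op == "==" and f2.op == "==":
        return (f1, "OR", f2)                      # e.g., "2 3 bedroom houses"
    lower = f1 if f1.op in (">", ">=") else f2     # the lower-bound constraint
    upper = f2 if lower is f1 else f1              # the upper-bound constraint
    # "over 400k and under 700k" is a satisfiable range -> AND;
    # "before 2013 and after 2014" is unsatisfiable as a conjunction -> OR
    return (lower, "AND", upper) if lower.value < upper.value else (lower, "OR", upper)

types = {"price": "quantitative", "homeType": "categorical", "year": "ordinal"}
print(combine(F("price", ">", 400000), F("price", "<", 700000), types))               # AND
print(combine(F("homeType", "==", "condo"), F("homeType", "==", "townhome"), types))  # OR
print(combine(F("year", "<", 2013), F("year", ">", 2014), types))                     # OR
```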
When the system receives an utterance Un+1 (910), which in this example is the utterance “the cheapest,” the system computes (912) phrase structure for the utterance Un+1 (914) using techniques described above in reference to step 522.
In some implementations, the system identifies attribute word senses by employing the Word2vec™ model containing learned vector representations of large text corpora, computing word vectors using a recurrent neural network. In some implementations, the semantic relatedness Srel between a word wi in a given utterance and a data attribute dj is the maximum value of a score computed as follows:
In formula (1), dist(Si,m, Sj,n) is the Wu-Palmer distance between the two senses Si,m and Sj,n. vwi and vdj are the vector representations of wi and dj, respectively. λ is a weighting factor applied to a pairwise cosine distance between the vectors.
The Word2vec™ model is used here only as an example. A number of other neural network models can be used to identify word senses, such as Stanford University's GloVe™. Some libraries, such as GenSim™ and Deeplearning4j™, provide a choice of different word embedding models in a single package.
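By way of illustration only, the following sketch computes a relatedness score from a Wu-Palmer score over WordNet senses and a weighted cosine term between word vectors; the exact combination used in formula (1) is not reproduced here, so the additive combination shown is an assumption, and the NLTK WordNet corpus plus precomputed vectors (e.g., from a gensim embedding model) are assumed to be available.

```python
# Illustrative sketch only: a relatedness score combining a Wu-Palmer score over WordNet
# senses with a weighted cosine term between word vectors. The additive combination is an
# assumption and does not reproduce formula (1) exactly.
import numpy as np
from nltk.corpus import wordnet as wn   # assumes the NLTK 'wordnet' corpus is installed

def semantic_relatedness(word, attribute, vec_word, vec_attr, lam=0.5):
    cosine = float(np.dot(vec_word, vec_attr) /
                   (np.linalg.norm(vec_word) * np.linalg.norm(vec_attr)))
    best = 0.0
    for sense_i in wn.synsets(word):
        for sense_j in wn.synsets(attribute):
            wup = sense_i.wup_similarity(sense_j) or 0.0   # Wu-Palmer score for the sense pair
            best = max(best, wup + lam * cosine)           # take the maximum over sense pairs
    return best
```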
In some implementations, the system not only computes semantic relatedness between terms and data attributes, but also computes the type of analytical function associated with each term. For example, the system performs these additional steps for queries such as “show me the cheapest houses near Ballard” or “where are the mansions in South Lake Union?” The system considers the corresponding dictionary definitions as additional features to these word vectors and checks whether the definitions contain quantitative adjectives such as ‘less,’ ‘more,’ ‘low,’ or ‘high’ using a POS tagger, in accordance with some implementations. The system then maps appropriate analytical functions to these adjectives, in accordance with some implementations.
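By way of illustration only, the following sketch checks WordNet glosses of a query term for quantitative adjectives and maps them to a min/max style function; the adjective list, the mapping, and the attribute name are hypothetical.

```python
# Illustrative sketch only: mapping quantitative adjectives found in dictionary
# definitions (WordNet glosses) to min/max analytical functions. Assumes NLTK with
# the 'wordnet' corpus and POS tagger resources; the mapping itself is hypothetical.
import nltk
from nltk.corpus import wordnet as wn

ADJECTIVE_TO_FUNCTION = {"less": "min", "low": "min", "more": "max", "high": "max"}

def infer_function(term, attribute="price"):
    for synset in wn.synsets(term):
        definition = synset.definition()                       # the dictionary definition
        for word, tag in nltk.pos_tag(nltk.word_tokenize(definition)):
            if tag.startswith("JJ") and word.lower() in ADJECTIVE_TO_FUNCTION:
                return f"{ADJECTIVE_TO_FUNCTION[word.lower()]}({attribute})"
    return None

print(infer_function("cheapest"))   # may yield 'min(price)' if a gloss mentions 'low'
```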
When the system receives an utterance Un+1 (1010), which in this example is the utterance “remove condos,” the system computes (1012) phrase structure for the utterance Un+1 (1010) using techniques described above in reference to step 522.
Given an utterance Un, a system implementing this framework responds by executing a series of analytical functions derived from the forward-looking centers Cf (Un, DS). An analytical function F(X, op, v) consists of a variable X (which can be an attribute or a visualization property), an operator op, and a value v (typically a constant). For example, when the user says “measles in the uk,” the system creates two functions, namely F_CAT(diseases, ==, measles) and F_CAT(country, ==, uk). When the user provides a new utterance Un+1, the system first creates a set of temporary centers Ctemp(Un+1, DS) from Un+1 without considering any previous context. The system then applies a set of rules to create a set of forward-looking centers, Cf (Un+1, DS) based on some set operations between Cb (Un+1,DS) and Ctemp (Un+1, DS). The forward-looking centers are then used to respond to the user utterance according to some implementations.
To support a conversation, the visualizations shown by the system provide cohesive and relevant responses to various utterances. Sometimes, the system responds by changing the visual encoding of existing visualizations, while in other cases the system creates a new chart to support the visual analytical conversation more effectively. In addition to appropriate visualization responses, the system helps the user understand how the system has interpreted an utterance by producing appropriate feedback and allows the user to rectify the interpretation through some interface controls as necessary. In a traditional dashboard, users interact by selecting items or attributes in a visualization that are highlighted to provide immediate visual feedback. Simultaneously, other charts are updated by highlighting or filtering out items. In a natural language interface, however, instead of making explicit selection by mouse/keyboard, the user mentions different attributes and values, making it a non-trivial task of deciding how each view within a dashboard should respond to the utterance. Another complication arises when the system has to support multiple visualizations.
During visual analysis flow, there may be situations where the existing visualization cannot meet the evolving information needs of the user. This scenario could arise, for example, when a particular data attribute cannot be encoded effectively in the existing visualization (e.g., time values in a map), warranting the need for creating a new visualization as a response. Drawing inspiration from work that connects visualization with language specification, the system supports the creation of different types of visualizations (e.g., bar chart, line chart, map chart, and scatterplot), in accordance with some implementations.
In some implementations, the system identifies one or more widgets from the analytical functions derived from an utterance. In some such implementations, the system organizes and presents the widgets in an intuitive way so that the user can understand how the system interprets her utterance and subsequently modify the interpretation using these widgets. For this purpose, the system takes the original utterance and orders the widgets in the same sequence as the corresponding query terms. In some such implementations, the system achieves this by using a library, such as Sparklificator™, that facilitates the placement of small word-scale visualization within text in a compact way. In addition, some implementations provide a set of interfaces to users including the ability to manipulate and/or remove a widget, to modify the query, and to resolve ambiguous queries.
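By way of illustration only, the following sketch orders widgets in the same sequence as their query terms appear in the original utterance; the widget mapping is hypothetical, and a library such as Sparklificator™ would handle the actual in-text placement.

```python
# Illustrative sketch only: ordering widgets by where their query terms appear
# in the original utterance. The widget mapping is hypothetical.
utterance = "townhomes in ballard under 600k"
widgets = {"home_type": "townhomes", "neighborhood": "ballard", "price": "600k"}

def order_widgets(utterance, widgets):
    return sorted(widgets.items(), key=lambda item: utterance.find(item[1]))

print(order_widgets(utterance, widgets))
# [('home_type', 'townhomes'), ('neighborhood', 'ballard'), ('price', '600k')]
```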
In addition to ambiguity handling, in some implementations, the system also provides feedback and meaningful hints to modify the text, when the system fails to completely understand the query. For instance, if the system cannot successfully parse the given utterance, the system first attempts to automatically correct the misspelled terms by comparing the tokens with the attributes, cell values, and related keywords in the current dataset using fuzzy string matching. When the user forms a query that is partially recognized, the system prunes the unrecognized terms from the corresponding parse tree and then shows the results based on the tokens that are understood.
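By way of illustration only, the following sketch auto-corrects a misspelled token against dataset terms, using the Python standard library's difflib as a stand-in for a fuzzy string library such as Fuse.js™; the term list is hypothetical.

```python
# Illustrative sketch only: fuzzy-matching a misspelled token against the attributes,
# cell values, and related keywords of the dataset. difflib stands in for a fuzzy
# string library such as Fuse.js; the term list is hypothetical.
import difflib

dataset_terms = ["ballard", "fremont", "townhome", "condo", "price", "year_built"]

def autocorrect(token, terms=dataset_terms, cutoff=0.8):
    matches = difflib.get_close_matches(token.lower(), terms, n=1, cutoff=cutoff)
    return matches[0] if matches else None   # None: prune the token from the parse tree

print(autocorrect("balard"))   # 'ballard'
print(autocorrect("xyzzy"))    # None -> show results based on the tokens that are understood
```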
In some implementations, the computer displays (1308) a data visualization based on a dataset retrieved from a database using a first set of one or more queries.
The computer receives (1310) a first user input to specify a first natural language command related to the displayed data visualization. In some implementations, the user input is received as text input (e.g., via a keyboard 216 or via a touch sensitive display 214) from a user in a data-entry region on the display in proximity to the displayed data visualization. In some implementations, the user input is received as a voice command using a microphone (e.g., an audio input device 220) coupled to the computer.
Based on the displayed data visualization, the computer extracts (1312) a first set of one or more independent analytic phrases from the first natural language command.
The language processing module 238 computes (1314) a first set of one or more conversation centers associated with the first natural language command based on the first set of one or more analytic phrases, in accordance with some implementations. A framework based on a conversational interaction model is described above.
Subsequently, the language processing module 238 computes (1316) a first set of analytic functions associated with the first set of one or more conversation centers, thereby creating a first set of one or more functional phrases, in accordance with some implementations.
In some implementations, the computer updates (1318) the data visualization based on the first set of one or more functional phrases computed in step 1316.
Based on the displayed data visualization, the computer extracts (1322) a second set of one or more independent analytic phrases from the second natural language command.
The language processing module computes (1324) a temporary set of one or more conversation centers associated with the second natural language command based on the second set of one or more analytic phrases, in accordance with some implementations.
The language processing module derives (1326) a second set of one or more conversation centers from the first set of one or more conversation centers and the temporary set of one or more conversation centers using one or more transitional rules, in accordance with some implementations. In some such implementations (1332), each of the conversation centers of the first set of one or more conversation centers, the temporary set of one or more conversation centers, and the second set of one or more conversation centers comprises a value for a variable (e.g., a data attribute or a visualization property). In some such implementations, the language processing module uses the transitional rules by performing a sequence of operations.
The computer updates (1330) the data visualization based on the second set of one or more functional phrases, in accordance with some implementations.
In some implementations, the computer displays (1408) a data visualization based on a dataset retrieved from a database using a first set of one or more queries.
The computer receives (1410) a first user input to specify a first natural language command related to the displayed data visualization. In some implementations, the user input is received as text input (e.g., via a keyboard 216 or via a touch sensitive display 214) from a user in a data-entry region on the display in proximity to the displayed data visualization. In some implementations, the user input is received as a voice command using a microphone (e.g., an audio input device 220) coupled to the computer.
Based on the displayed data visualization, the computer extracts (1412) a first set of one or more independent analytic phrases from the first natural language command.
The language processing module 238 computes (1414) a first set of one or more conversation centers associated with the first natural language command based on the first set of one or more analytic phrases, in accordance with some implementations. A framework based on a conversational interaction model is described above.
Subsequently, the language processing module 238 computes (1416) a first set of analytic functions associated with the first set of one or more conversation centers, thereby creating a first set of one or more functional phrases, in accordance with some implementations.
In some implementations, the computer updates (1418) the data visualization based on the first set of one or more functional phrases computed in step 1416.
Based on the displayed data visualization, the computer extracts (1422) a second set of one or more independent analytic phrases from the second natural language command.
The language processing module computes (1424) a temporary set of one or more conversation centers associated with the second natural language command based on the second set of one or more analytic phrases, in accordance with some implementations.
The language processing module computes (1426) cohesion between the first set of one or more analytic phrases and the second set of one or more analytic phrases and derives a second set of one or more conversation centers from the first set of one or more conversation centers and the temporary set of one or more conversation centers based on the cohesion, in accordance with some implementations.
Once the phrase structure is identified in step 1434, the language processing module identifies one or more pragmatic forms based on the phrase structure, according to some implementations. Subsequently, the language processing module derives (1446) the second set of one or more conversation centers from the first set of one or more conversation centers and the temporary set of one or more conversation centers based on the identified pragmatic forms.
In some implementations, the language processing module 238 computes (1430) a second set of one or more analytical functions associated with the second set of one or more conversation centers, thereby creating a second set of one or more functional phrases. The language processing module 238 performs this step, using the second set of one or more conversation centers computed in step 1426, in a manner similar to step 1416 described above.
In some implementations, the language processing module 238 determines (1468) whether the first anaphoric reference is a reference to a visualization property in the updated data visualization (sometimes herein called a deictic reference), and, in accordance with a determination that the anaphoric reference is a deictic reference, computes (1470) the second set of one or more conversation centers based on the temporary set of one or more conversation centers, and data related to the visualization property.
In some implementations, the language processing module 238 determines (1494) whether the phrase structure corresponds to a repair utterance to change a default behavior related to displaying a data visualization, and, in accordance with a determination that the phrase structure corresponds to a repair utterance to change a default behavior, changes (1496) the default behavior related to displaying.
In some implementations, the language processing module 238 determines (14.106) whether the second natural language command has more than one conjunct, and, in accordance with the determination that the second natural language command has more than one conjunct, computes (14.108) the second set of one or more analytical functions by linearizing the second natural language command.
In some implementations, the language processing module 238 linearizes the second natural language command by performing a sequence of operations.
In some implementations, the computer displays (1508) a data visualization based on a dataset retrieved from a database using a first set of one or more queries.
The computer receives (1510) a first user input to specify a first natural language command related to the displayed data visualization. In some implementations, the user input is received as text input (e.g., via a keyboard 216 or via a touch sensitive display 214) from a user in a data-entry region on the display in proximity to the displayed data visualization. In some implementations, the user input is received as a voice command using a microphone (e.g., an audio input device 220) coupled to the computer.
Based on the displayed data visualization, the computer extracts (1512) a first set of one or more independent analytic phrases from the first natural language command.
The language processing module 238 computes (1514) a first set of one or more conversation centers associated with the first natural language command based on the first set of one or more analytic phrases, in accordance with some implementations. A framework based on a conversational interaction model is described above.
Subsequently, the language processing module 238 computes (1516) a first set of analytic functions associated with the first set of one or more conversation centers, thereby creating a first set of one or more functional phrases, in accordance with some implementations.
In some implementations, the computer updates (1518) the data visualization based on the first set of one or more functional phrases computed in step 1516.
Based on the displayed data visualization, the computer extracts (1522) a second set of one or more independent analytic phrases from the second natural language command.
The language processing module computes (1524) a temporary set of one or more conversation centers associated with the second natural language command based on the second set of one or more analytic phrases, in accordance with some implementations.
The language processing module derives (1526) a second set of one or more conversation centers from the first set of one or more conversation centers and the temporary set of one or more conversation centers using one or more transitional rules, in accordance with some implementations.
The computer updates (1528) the data visualization based on the second set of one or more functional phrases, in accordance with some implementations.
Subsequently, the computer highlights (1538) the data marks whose characteristics correspond to the first data attribute, in accordance with some implementations. In some such implementations, the computer filters (1540) results from the displayed data visualizations that contain data marks whose characteristics do not correspond to the one or more data attributes. Further, in some such implementations, the computer receives (1542) a user input to determine whether to filter or to highlight the data marks and filters or highlights the data marks on the displayed data visualizations based on the determination.
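A minimal sketch of highlighting versus filtering data marks based on an attribute match follows; the mark representation and apply_to_marks function are illustrative:

# Sketch: highlighting or filtering data marks whose characteristics match a
# data attribute constraint, depending on a user-selected mode.
def apply_to_marks(marks, attribute, value, mode="highlight"):
    """marks: list of dicts describing rendered data marks.
       mode: 'highlight' keeps all marks and flags the matches;
             'filter' drops marks that do not match."""
    if mode == "filter":
        return [m for m in marks if m.get(attribute) == value]
    return [{**m, "highlighted": m.get(attribute) == value} for m in marks]

if __name__ == "__main__":
    marks = [{"neighbourhood": "Ballard", "price": 450_000},
             {"neighbourhood": "Fremont", "price": 520_000}]
    print(apply_to_marks(marks, "neighbourhood", "Ballard", mode="highlight"))
    print(apply_to_marks(marks, "neighbourhood", "Ballard", mode="filter"))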
In some implementations, when a first token in the second natural language command is misspelled, the language processing module 238 identifies a correctly spelled term for the first token, substitutes (1584) the correctly spelled term for the first token in the second natural language command to obtain a third natural language command, and extracts (1586) the second set of one or more analytic phrases from the third natural language command.
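A minimal sketch of this substitution step follows, using Python's difflib for approximate matching against a dataset vocabulary; the vocabulary and correct_command function are illustrative assumptions:

# Sketch: repairing a misspelled token by substituting the closest known term
# from the dataset's vocabulary, yielding a corrected command to re-parse.
import difflib

VOCABULARY = ["Ballard", "Fremont", "Queen Anne", "price", "neighbourhood"]

def correct_command(command: str) -> str:
    corrected = []
    for token in command.split():
        match = difflib.get_close_matches(token, VOCABULARY, n=1, cutoff=0.8)
        corrected.append(match[0] if match else token)   # substitute if close enough
    return " ".join(corrected)

if __name__ == "__main__":
    third_command = correct_command("condos in Balard under 600k")
    print(third_command)   # 'condos in Ballard under 600k'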
The terminology used in the description of the invention herein is for the purpose of describing particular implementations only and is not intended to be limiting of the invention. As used in the description of the invention and the appended claims, the singular forms “a,” “an,” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will also be understood that the term “and/or” as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, elements, components, and/or groups thereof.
The foregoing description, for purposes of explanation, has been provided with reference to specific implementations. However, the illustrative discussions above are not intended to be exhaustive or to limit the invention to the precise forms disclosed. Many modifications and variations are possible in view of the above teachings. The implementations were chosen and described in order to best explain the principles of the invention and its practical applications, thereby enabling others skilled in the art to best utilize the invention and various implementations with various modifications as are suited to the particular use contemplated.
Inventors: Gossweiler, III, Richard C.; Tory, Melanie K.; Setlur, Vidya R.; Battersby, Sarah E.; Chang, Angel Xuan; Dykeman, Isaac J.; Prince, Md Enamul Hoque