Approaches for indexing and comparing charts are described. A system can receive one or more charts, which may include depictions of signals, and index portions of a chart using a sliding window algorithm. Subsequently, a system can receive a query that can be compared to the indexed portions of one or more charts. After a comparison, the most similar portions of the compared charts are provided based on a nearest neighbour search.

Patent: 11314738
Priority: Dec 23 2014
Filed: Sep 11 2019
Issued: Apr 26 2022
Expiry: May 11 2035

Terminal disclaimer: yes
Extension: 139 days
Original assignee entity: Large
Status: currently ok
12. A method comprising:
displaying a first chart comprising at least a first axis;
receiving a query comprising at least a user-selected portion of the first chart along at least the first axis of the first chart;
analyzing at least the user-selected portion of the first chart to determine a first set of features associated with the user-selected portion of the first chart;
generating an index of a plurality of sub-charts, wherein generating the index comprises at least:
determining a plurality of sub-charts based on at least a portion of a second chart;
analyzing the respective sub-charts to determine respective second sets of features; and
generating the index of at least the plurality of sub-charts and respective second sets of features;
comparing, using at least the index, the first set of features to one or more of the second sets of features associated with the respective sub-charts to determine an amount of similarity between each combination of the first set of features and the one or more of the second sets of features; and
in response to determining that the first set of features has a threshold amount of similarity to one of the second sets of features:
determining one of the sub-charts that is associated with the one of the second sets of features; and
outputting for display, as a response to the query, at least a representation of the one of the sub-charts.
1. A system comprising:
a memory device that stores a set of instructions;
one or more processors configured to execute the set of instructions to cause the system to:
display a first chart comprising at least a first axis;
receive a query comprising at least a user-selected portion of the first chart along at least the first axis of the first chart;
analyze at least the user-selected portion of the first chart to determine a first set of features associated with the user-selected portion of the first chart;
generate an index of a plurality of sub-charts, wherein generating the index comprises at least:
determining a plurality of sub-charts based on at least a portion of a second chart;
analyzing the respective sub-charts to determine respective second sets of features; and
generating the index of at least the plurality of sub-charts and respective second sets of features;
compare, using at least the index, the first set of features to one or more of the second sets of features associated with the respective sub-charts to determine an amount of similarity between each combination of the first set of features and the one or more of the second sets of features; and
in response to determining that the first set of features has a threshold amount of similarity to one of the second sets of features:
determine one of the sub-charts that is associated with the one of the second sets of features; and
output for display, as a response to the query, at least a representation of the one of the sub-charts.
20. A non-transitory computer-readable medium storing a set of instructions that are executable by one or more processors to cause the one or more processors to perform a method, the method comprising:
displaying a first chart comprising at least a first axis;
receiving a query comprising at least a user-selected portion of the first chart along at least the first axis of the first chart;
analyzing at least the user-selected portion of the first chart to determine a first set of features associated with the user-selected portion of the first chart;
generating an index of a plurality of sub-charts, wherein generating the index comprises at least:
determining a plurality of sub-charts based on at least a portion of a second chart;
analyzing the respective sub-charts to determine respective second sets of features; and
generating the index of at least the plurality of sub-charts and respective second sets of features;
comparing, using at least the index, the first set of features to one or more of the second sets of features associated with the respective sub-charts to determine an amount of similarity between each combination of the first set of features and the one or more of the second sets of features; and
in response to determining that the first set of features has a threshold amount of similarity to one of the second sets of features:
determining one of the sub-charts that is associated with the one of the second sets of features; and
outputting for display, as a response to the query, at least a representation of the one of the sub-charts.
2. The system of claim 1, wherein the set of instructions further causes the system to:
determine a distance matrix associated with the first chart, and
wherein comparing the first set of features to one or more of the second sets of features comprises at least comparing the first set of features to the one or more of the second set of features using the distance matrix and a nearest neighbor search or a Euclidean distance.
3. The system of claim 2, wherein the distance matrix is used to determine a distance from the first set of features to the second set of features.
4. The system of claim 2, wherein:
the distance matrix is determined using a machine learning approach, the second sets of features are transformed, for use with the distance matrix, before receipt of the query, and
the distance matrix is one of a plurality of different distance matrices corresponding to different charts.
5. The system of claim 1, wherein each sub-chart of the plurality of sub-charts comprises a portion of the second chart included in another sub-chart of the plurality of sub-charts.
6. The system of claim 5, wherein the respective second sets of features associated with the respective sub-charts comprise at least one of: one or more values that correspond to an average value of a signal associated with the respective sub-charts, one or more slopes associated with the respective sub-charts, one or more min-max differences associated with the respective sub-charts, or ranges of data included in the respective sub-charts.
7. The system of claim 1, wherein at least some of the sub-charts overlap with adjacent sub-charts of the second chart.
8. The system of claim 7, wherein each sub-chart represents a portion of a signal.
9. The system of claim 1, wherein the set of instructions further causes the system to:
further in response to determining that the first set of features has the threshold amount of similarity to one of the second sets of features:
display the query as a line chart; and
display the one of the sub-charts as a line chart.
10. The system of claim 1, wherein the set of instructions further causes the system to:
normalize at least the user-selected portion of the first chart and at least the portion of the second chart,
wherein analyzing at least the user-selected portion of the first chart comprises analyzing the normalized user-selected portion of the first chart, and
wherein determining the plurality of sub-charts comprises determining the plurality of sub-charts based on the normalized portion of the second chart.
11. The system of claim 10, wherein:
the plurality of sub-charts are determined from at least the portion of the second chart using a sliding window,
each of the plurality of sub-charts corresponds to a different portion of the normalized portion of the second chart within the sliding window as the sliding window is moved within the normalized portion of the second chart,
each sub-chart overlaps with an adjacent sub-chart of the second chart, and
a size of the sliding window is determined based on at least one of: a storage space of the system, a processing power of the system, network resources of the system, or a type of data represented by the second chart.
13. The method of claim 12 further comprising:
determining a distance matrix associated with the first chart, and
wherein comparing the first set of features to one or more of the second sets of features comprises at least comparing the first set of features to the one or more of the second set of features using the distance matrix and a nearest neighbor search or a Euclidean distance.
14. The method of claim 12, wherein each sub-chart of the plurality of sub-charts comprises a portion of the second chart included in another sub-chart of the plurality of sub-charts.
15. The method of claim 14, wherein the respective second sets of features associated with the respective sub-charts comprise at least one of: one or more values that correspond to an average value of a signal associated with the respective sub-charts, one or more slopes associated with the respective sub-charts, one or more min-max differences associated with the respective sub-charts, or ranges of data included in the respective sub-charts.
16. The method of claim 12, wherein at least some of the sub-charts overlap with adjacent sub-charts of the second chart.
17. The method of claim 12 further comprising:
further in response to determining that the first set of features has the threshold amount of similarity to one of the second sets of features:
displaying the query as a line chart; and
displaying the one of the sub-charts as a line chart.
18. The method of claim 12 further comprising:
normalizing at least the user-selected portion of the first chart and at least the portion of the second chart,
wherein analyzing at least the user-selected portion of the first chart comprises analyzing the normalized user-selected portion of the first chart, and
wherein determining the plurality of sub-charts comprises determining the plurality of sub-charts based on the normalized portion of the second chart.
19. The method of claim 18, wherein:
the plurality of sub-charts are determined from at least the portion of the second chart using a sliding window,
each of the plurality of sub-charts corresponds to a different portion of the normalized portion of the second chart within the sliding window as the sliding window is moved within the normalized portion of the second chart,
each sub-chart overlaps with an adjacent sub-chart of the second chart, and
a size of the sliding window is determined based on at least one of: a storage space of a system, a processing power of a system, network resources of a system, or a type of data represented by the second chart.

This application is a continuation of U.S. patent application Ser. No. 14/581,227, filed Dec. 23, 2014, and titled “SEARCHING CHARTS.” The entire disclosure of the above item is hereby made part of this specification as if set forth fully herein and incorporated by reference for all purposes, for all that it contains.

Any and all applications for which a foreign or domestic priority claim is identified in the Application Data Sheet as filed with the present application are hereby incorporated by reference under 37 CFR 1.57.

As larger amounts of data can be stored in smaller spaces, users can store more data than ever before. Data can be graphically represented in a variety of ways, including by a chart. Many types of charts can be used to represent data, including bar charts, pie charts, and line charts. Line charts are similar to scatter plots, except that measurement points are ordered (typically by their X-axis value) and joined with straight line segments. Oftentimes, line charts show how data changes over equal intervals of time. As such, the X-axis in a line chart can represent an hour, a day, a year, etc. Line charts are often used to visualize a trend in data over intervals of time, and thus the line is often drawn chronologically.

Comparing two line charts by their appearance can be difficult for many reasons. For instance, various line charts may display time in different intervals/resolutions. While the data displayed on two line charts may look the same, one chart may include data for a given day while another may include data for a given week. Similarly, the Y-axes on various charts can have different minimum and/or maximum values. For instance, the Y-axes on two charts may both display temperature, but one chart may display up to 200 degrees, while the other chart only displays up to 100 degrees. Thus, although the scales of two line charts may be different, the data represented by the charts may appear very similar when normalized.

Reference will now be made to the accompanying drawings, which illustrate exemplary embodiments of the present disclosure and in which:

FIG. 1 is a block diagram of an exemplary system for chart searching, consistent with embodiments of the present disclosure;

FIG. 2 is a block diagram of an exemplary system for chart searching, consistent with embodiments of the present disclosure;

FIG. 3 is a diagram of exemplary charts, consistent with embodiments of the present disclosure;

FIG. 4 is a flowchart representing an exemplary method for indexing charts, consistent with embodiments of the present disclosure;

FIG. 5 is a flowchart representing an exemplary method for chart searching, consistent with embodiments of the present disclosure; and

FIG. 6 is a block diagram of an exemplary computer system, consistent with embodiments of the present disclosure.

Reference will now be made in detail to exemplary embodiments, the examples of which are illustrated in the accompanying drawings. Whenever possible, the same reference numbers will be used throughout the drawings to refer to the same or like parts.

Embodiments describing methods, systems, and non-transitory computer-readable mediums for indexing and searching for charts are described herein. It should be understood that charts, as used herein, include data that is displayed in a visual format. That data can include features, which in turn can be stored as a vector. Typically, a chart includes an X-axis and a Y-axis, and represents a first set of data with reference to a second set of data (e.g., temperature/time, etc.).

In some embodiments, a chart may be a portion of a larger chart (which can be referred to herein as a sub-chart, a sub-window, sub-set data (which can comprise sub-set features), etc.). For example, a chart that includes time data can include a week of data, but a sub-chart of the chart might display only a day of data. If a user were to view a sub-chart of the sub-chart, the user may see a chart that displays only an hour of data, or even a second or less of data (e.g., sub-set data). Of course, it should be appreciated that data, which can be displayed as a chart, can represent values other than time.

Often, a signal can be displayed in a chart. In some examples, signals can be displayed as values and their relationship to time. Thus, signals are easily displayed on charts. For example, an electrical signal or a radio wave may be displayed on a chart. As another example, large amounts of data received from sensors or other electronic devices can be displayed as a chart. Sensors can deliver one or more data sets over a period of time. For example, a sensor can monitor pressure, temperature, vibration, oxygen levels, radiation, etc.

In some cases, a user may want to search a chart to see if there are any one or more portions of the chart that resemble another chart. For instance, a user might find an interesting pattern displayed in a chart, and want to search other portions of that chart, or another chart, for a similar pattern. It should be appreciated that, in various embodiments, although charts are used as examples, the present disclosure can determine similar patterns independent of the form of visualization (e.g., a chart).

As an example, a user may want to determine what causes a device to overheat. In such a case, the user can view a portion of a chart indicating that the device is about to overheat and compare that portion of the chart to other charts, or to previous portions of the same chart, to see if a similar pattern occurred prior to another overheating event. Based on the appearance of patterns in charts that indicate that a device is about to overheat, a user or an electronic device can predict future overheating. Further, based on this information, a user or electronic device can deactivate a device when it detects a particular pattern in a chart in order to prevent the device from overheating.

Described herein are approaches for comparing and/or associating charts which include both indexing at least portions of one or more charts, as well as searching a database or other data storage device comprising at least portions of one or more charts.

With regard to indexing features of a chart, approaches described herein contemplate determining sub-charts of one or more charts (which can include one or more graphical representations of a signal) using a “sliding window” algorithm, and computing a set of features for each sub-chart (also referred to as sub-set features or sub-chart features) created by the sliding window. Once captured, the sub-charts and/or sub-set features/sub-chart features can be stored. For example, the sub-set features may be stored as a vector in a database, or some other storage device (as with all charts, sub-charts, sub-sets of data, etc.). The number of sub-charts (also known as sub-windows when referenced in association with a sliding window algorithm) to be stored and indexed can be configured by a user, or can be based on resources such as an amount of bandwidth available or an amount of processing available.

In various embodiments described herein, a sliding window algorithm includes determining and storing portions of a chart (interchangeably referred to as sub-charts, windows of a chart, sub-windows, etc.). Sub-charts determined/created based on a sliding window algorithm overlap as the “window” slides across the larger chart. For example, a sliding window may begin by capturing a sub-chart that has a height (also referred to as a range or scale) that is the size of the chart (e.g., the Y-axis), but only a portion of the width of the chart (e.g., the X-axis, time). Thus, for example, when applying a sliding window to a chart that includes 1 minute of data, the sliding window can create and store sub-charts for all values of the 1 minute chart between 0 seconds and 10 seconds, 1 and 11 seconds, 2 and 12 seconds, etc., such that fifty 10-second-interval sub-charts are created that correspond with the 1 minute chart. Similarly, fifty-five 5-second-interval sub-charts can be created that correspond with the minute-long chart. Charts of any size can be created, stored and indexed using a sliding window algorithm. Although not always necessary, each sub-chart can overlap with another sub-chart, such that some or all of the larger chart is stored as a set of sub-charts. In some embodiments, the size and number of charts/sub-charts that are stored can be configured by an administrator, or based on an amount of resources such as storage space or processing power. For instance, an administrator or a device can specify that a set of sub-charts to be stored should include either 5 seconds of data or 10 seconds of data based on an amount of storage space, network resources, the type of data (e.g., temperature data, vibration data, etc.), or any other factor. In some embodiments, sub-charts of different sizes (e.g., lengths of time) can be created and/or indexed using a sliding window algorithm. For instance, a set of sub-charts can include a combination of sub-charts including 5 seconds of data and sub-charts including 10 seconds of data. It is appreciated that a set of sub-charts created by a sliding window algorithm could include sub-charts of many different resolutions (e.g., 5 seconds, 10 seconds, 20 seconds, etc.).
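
As a non-limiting illustration, the sliding window described above can be sketched in Python. The function name `sliding_windows`, the `step` parameter, and the one-sample-per-second signal are assumptions made for this example and do not appear in the disclosure. Exact sub-chart counts depend on how window boundaries are counted; with 60 one-second samples, the convention used here yields 51 ten-second windows and 56 five-second windows, close to the fifty and fifty-five given above.

```python
from typing import List, Sequence, Tuple


def sliding_windows(
    values: Sequence[float], window: int, step: int = 1
) -> List[Tuple[int, List[float]]]:
    """Return (start_index, sub_chart) pairs for every full window position."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [
        (start, list(values[start:start + window]))
        for start in range(0, len(values) - window + 1, step)
    ]


# Hypothetical data: one minute of a signal sampled once per second.
signal = [float(t % 7) for t in range(60)]

# Overlapping sub-charts: the window advances one second at a time.
ten_second = sliding_windows(signal, window=10)
five_second = sliding_windows(signal, window=5)

# A set of sub-charts can mix several resolutions.
mixed_resolutions = ten_second + five_second

# Non-overlapping variant: the window advances by its own width.
non_overlapping = sliding_windows(signal, window=10, step=10)
```

Setting `step` equal to `window` gives sub-charts that share no samples, while a smaller `step` trades additional storage for a lower chance that a pattern is split across sub-chart boundaries.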

The sliding window technique is often used because, if a section of a chart is not stored, or is stored without the sections of the chart immediately preceding and/or succeeding it, a chart search at a later time may miss that section because the pattern sought by the query (e.g., chart(s) to search for) was spread across multiple sub-charts, which might not have happened had the sliding window technique been employed to capture overlapping sections of the chart. For example, if a chart looked like a set of stairs, some sub-charts can include a single step, multiple steps, or a straight line depending on their respective size (e.g., amount of time shown in the respective sub-chart). It should, however, be noted that in some embodiments a sliding window algorithm can be non-overlapping (e.g., sub-charts created by a sliding window algorithm might not include portions of one or more other sub-charts).

By indexing sub-charts in a database, the portions can be searched faster than if someone were to simply compare a query (e.g., a sub-chart) to an entire chart (or multiple charts) that has not been indexed as sub-charts. In some cases, a user may wish to compare a query comprising multiple sub-charts, which can take even more time if the charts to be searched are not stored in “smaller” portions within a database. As will be discussed below, the term smaller is relative, since all charts (e.g., charts and their respective sub-charts) can be stored as a set of data that is the same size regardless of the respective “size” of the chart/sub-chart. For example, regardless of the amount of time two charts represent, they can be stored in vectors that are the same size and/or comprise the same number of features.

As described above, some charts may have different resolutions (e.g., a time span of 1 week, 2 weeks, etc.), ranges/scales (e.g., temperatures, pressure, values often shown on Y-axes), or dimensions (e.g., the number of channels) than other charts. The type or amplitude of values in one chart may be different from the type or amplitude of values in another chart, even though the features of the first chart and a section of the second chart appear the same visually. Conversely, two charts can appear the same visually, but can be very different if they utilize different units (e.g., different time intervals). As will be described in more detail below, in some embodiments charts may be normalized prior to being compared. Thus, while portions of this disclosure discuss chart searching, in various embodiments this disclosure discloses searching for similar patterns in multi-dimensional windows (e.g., periods of time) across different resolutions and scales.

FIG. 1 is a block diagram 100 of an exemplary system for comparing charts, consistent with embodiments of the present disclosure. Block diagram 100 includes a first chart 120 and a second chart 130 (which can also be referred to as a query comprising two charts, or a two-dimensional chart). Charts 120 and 130 are inputted into a chart searching system 110. The chart searching system 110 then outputs at least one chart. As shown in diagram 100, chart searching system 110 provides eight charts, 122A-D (collectively known as 122) and 132A-D (collectively known as 132), as output (which can also be referred to as four two-dimensional charts). In some embodiments, chart searching system 110 may output more or fewer charts.

It should be noted that a query comprising a set of two charts 120 and 130 was used as input for chart searching system 110. In some embodiments, the chart searching system 110 may return the same number of charts that it received as input. Of course, chart searching system 110 may return more or fewer charts than were inputted. In some embodiments where more than one chart is entered as a query, the similarities between a first chart within a query and the chart search results associated with that first chart can be weighted more heavily than the similarities between the second chart of the query and the chart search results that it caused to be returned. Herein, the term chart search can be used to describe a search that accepts a query in the form of one or more charts and provides results in the form of one or more charts, in a manner similar to image searches where an image is the input and similar images are the output.

In some embodiments, for example block diagram 100, different charts 120 and 130 can display identical time periods on their X-axes. The identical period of time can be the actual period of time (e.g., the same resolution and the same start time, such as charts 120 and 130 both consisting of values that occurred from 12:30 a.m. to 1:00 p.m. on the same day of the same year), or an identical amount of time (e.g., the same resolution but not necessarily the same start time, such as charts 120 and 130 both showing 30 minutes of values). In response, the output of chart searching system 110 can display charts with similar features that occurred at the same actual time (e.g., charts 122 and 132 both show values occurring between 12:30 a.m. and 1:00 p.m. on the same day of the same year as the inputted charts, or charts 122 and 132 both show values occurring during the same 30 minutes on the same day of the year as each other). Alternatively, outputted charts 122 and 132 may display sub-charts that most closely resemble the respective inputs (120 and 130, respectively) regardless of the times represented in charts 122 and 132.

FIG. 2 is a block diagram of an exemplary system 200 for chart searching, consistent with embodiments of the present disclosure. In some embodiments, chart searching system 110 can be implemented using a portion or all of system 200. System 200 can comprise a series database 205, a query 210, channel 1 resolutions 220, channel 2 resolutions 225, a feature computation module 230, and a transformation module 240. Further, system 200 can include groundtruth annotations 250, a metric learning module 255, and at least one distance matrix 260. System 200 can also comprise a chunk index 270, a nearest neighbours search module 280, a database query module 290, and a plot module 295.

Series database 205 can comprise charts, which as described above may include, but are not limited to: portions of charts (e.g., sub-charts, sub-windows, sub-sets of data, etc.), features associated with charts, sub-set features, channels associated with charts (e.g., expressions of electrical signals), etc. The series database 205 can comprise raw data (e.g., charts), which can be compared. For example, the series database 205 can contain a set of points (e.g., X and Y values), and data for a particular chart can be determined by retrieving all of the X values, then filtering and/or uniformly sampling the X values to create, for example, a chart with 1,000 equally spaced points. In some embodiments, a series database 205 may comprise multiple channels (e.g., channel 1 resolutions 220, channel 2 resolutions 225, multiple Y values for each X value, etc.) that can be depicted as sub-charts, which can then be compared to a query 210. The sub-charts (e.g., channel resolutions 220 and 225), as well as the query 210, can be transmitted to feature computation module 230 from series database 205.
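
By way of example only, the uniform sampling step mentioned above could be sketched as follows. The function name `resample_uniform` and the use of linear interpolation between stored points are assumptions chosen for illustration; the disclosure does not specify how intermediate values are obtained. The sketch also assumes the stored X values are strictly increasing.

```python
from bisect import bisect_right
from typing import List, Sequence, Tuple


def resample_uniform(
    xs: Sequence[float], ys: Sequence[float], n_points: int
) -> Tuple[List[float], List[float]]:
    """Linearly interpolate (xs, ys) onto n_points equally spaced X values."""
    if len(xs) != len(ys) or len(xs) < 2 or n_points < 2:
        raise ValueError("need matching xs/ys with at least two points each")
    x_first, x_last = xs[0], xs[-1]
    spacing = (x_last - x_first) / (n_points - 1)
    out_x: List[float] = []
    out_y: List[float] = []
    for i in range(n_points):
        # Pin the final point to x_last to avoid floating-point overshoot.
        x = x_last if i == n_points - 1 else x_first + i * spacing
        # Locate the stored segment [xs[j-1], xs[j]] that contains x.
        j = min(max(bisect_right(xs, x), 1), len(xs) - 1)
        fraction = (x - xs[j - 1]) / (xs[j] - xs[j - 1])
        out_x.append(x)
        out_y.append(ys[j - 1] + fraction * (ys[j] - ys[j - 1]))
    return out_x, out_y


# Hypothetical irregularly spaced points lying on the line y = 2x.
raw_x = [0.0, 1.0, 4.0, 10.0]
raw_y = [0.0, 2.0, 8.0, 20.0]
chart_x, chart_y = resample_uniform(raw_x, raw_y, 1000)
```

Resampling every series to the same number of points is one way to make charts with different native sampling rates directly comparable before features are computed.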

A query 210 can be received by feature computation module 230 from a variety of sources. For instance, the query 210 can be received from a series database 205. In some embodiments, the query 210 can be received from another source such as a sensor, or another electronic device such as a computer, server, or a second series database (e.g., a series database that did not provide the channel resolutions 220 and 225). Of course, a query may include more than one chart, as described above.

Feature computation module 230 can perform a variety of functions. In some embodiments, feature computation module 230 can create sub-charts and compute features associated with those sub-charts (which may be derived from channel resolutions 220 and 225). For example, a chart (e.g., a sub-chart derived from a channel, or a sub-chart of a sub-chart derived from a channel (also known as a sub-sub-chart)) can comprise features, and values associated with those features, such as values at particular points (e.g., Y-axis values), average slopes (e.g., the slope between three or more points, the average slope in an entire chart), and a min-max difference (e.g., the difference between the highest value represented by a line in a chart and the lowest value represented by the line in the chart). Regardless of the size (e.g., amount of time, scale, etc.) of a chart or sub-chart (which can include a sub-sub-chart), any chart or sub-chart can be represented by the same number of values (e.g., in a vector or other data structure). In some embodiments, for each sub-chart indexed by a chart searching system, 5 values at particular points may be determined, an average slope of the sub-chart may be determined, and a min-max difference may be determined. These values can be compared with the same values determined in one or more charts included in a query. Thus, if 7 values are associated with each sub-chart, 10 overlapping sub-charts would include 70 values. Of course, additional or fewer features may be determined by a feature computation module 230. In some embodiments, whether additional or fewer features are to be determined for various sub-charts and queries may be modified by a user (e.g., by setting the values on a computer), or may be modified automatically based on resources associated with one or more processors or networks. Typically, the more features that are captured, the better the results of a chart search will be. Further, features may be stored in a database, table, or other type of data structure for faster searching.
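
The seven-value example above can be illustrated with a brief sketch. The function name `compute_features`, the choice of equally spaced sample positions, and the assumption of unit spacing between samples when computing the average slope are all hypothetical details introduced for this example.

```python
from typing import List, Sequence


def compute_features(values: Sequence[float], n_samples: int = 5) -> List[float]:
    """Return a fixed-length feature vector: n_samples point values,
    the average slope, and the min-max difference."""
    if len(values) < 2 or n_samples < 2:
        raise ValueError("need at least two values and two sample positions")
    last = len(values) - 1
    # Values at equally spaced positions along the sub-chart.
    sampled = [values[(i * last) // (n_samples - 1)] for i in range(n_samples)]
    # Mean of successive differences, assuming unit spacing between samples.
    average_slope = (values[-1] - values[0]) / last
    min_max_difference = max(values) - min(values)
    return sampled + [average_slope, min_max_difference]


# Two hypothetical sub-charts of very different lengths.
short_chart = [float(v) for v in range(11)]
long_chart = [0.5 * v for v in range(101)]
short_features = compute_features(short_chart)
long_features = compute_features(long_chart)

# Ten overlapping 10-sample windows taken from a 19-sample series.
series = [float(v * v) for v in range(19)]
window_features = [compute_features(series[s:s + 10]) for s in range(10)]
total_values = sum(len(f) for f in window_features)
```

Both `short_features` and `long_features` contain seven values even though the underlying sub-charts differ in length, which is what allows sub-charts of different resolutions to be compared in the same index.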

After the features associated with sub-charts (e.g., channel resolutions 220 and 225) and query 210 are computed, they can be transmitted as a data structure such as a vector to transformation module 240. Transformation module 240 can perform functions including, but not limited to: normalizing charts, scaling charts, adjusting charts based on the values indicated by X-axes and Y-axes, etc. Normalizing can include normalizing the range of a chart (e.g., a signal going from 1 to 100 is normalized to go from 0 to 1) and/or normalizing one or more features such that the overall mean for a feature is 0 and its variance is 1 (in other words, the features in a series database 205 are computed, the mean and variance for each feature are computed, and those statistics are then used to normalize values). Transformation module 240 can normalize sub-charts based on query 210 such that the differences in the values and/or intervals between values are accounted for. In other words, the charts (or values representing those charts) can be normalized such that they look the same if they are displaying similar data, or such that they do not look the same if they are displaying dissimilar data.
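
The two kinds of normalization described above can be sketched as follows. The function names `normalize_range` and `standardize_features`, and the handling of constant signals or constant features, are assumptions made for this example.

```python
from statistics import fmean, pstdev
from typing import List, Sequence


def normalize_range(values: Sequence[float]) -> List[float]:
    """Rescale a signal so that its minimum maps to 0 and its maximum to 1."""
    low, high = min(values), max(values)
    if high == low:
        # Assumption: a constant signal is mapped to all zeros.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


def standardize_features(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Give each feature (column) a mean of 0 and a variance of 1
    across all indexed feature vectors (rows)."""
    columns = list(zip(*rows))
    means = [fmean(column) for column in columns]
    # Assumption: a constant feature keeps a divisor of 1 to avoid dividing by 0.
    deviations = [pstdev(column) or 1.0 for column in columns]
    return [
        [(v - m) / d for v, m, d in zip(row, means, deviations)]
        for row in rows
    ]


# Two hypothetical signals with the same shape but different scales.
hot = normalize_range([100.0, 150.0, 200.0])
cool = normalize_range([10.0, 15.0, 20.0])

# Hypothetical feature vectors for two sub-charts (two features each).
standardized = standardize_features([[1.0, 10.0], [3.0, 30.0]])
```

After range normalization, `hot` and `cool` are identical, which reflects the point above that charts displaying similar data can be made to look the same despite different Y-axis scales.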

Transformation module 240 can also receive input from at least one distance matrix 260. All, or a portion of, distance matrix 260 can be applied (e.g., multiplied) to one or more sub-charts and charts included in a query. A machine learning approach (e.g., a metric learning approach such as Large Margin Nearest Neighbor (LMNN)) can be used to learn a distance matrix 260.

Distance matrix 260 can be a square matrix of several values, and can give weight to different values in each sub-chart. If more than one chart (e.g., type of value such as temperature, pressure, etc.) is being searched for, a distance matrix 260 may correspond to each respective chart. For example, one, or more, distance matrices 260 can be associated with both temperature and pressure. In some embodiments, a chart searching system can determine whether two measurements occurred at substantially the same time. In such a case, the corresponding charts can be combined using the geometric mean of the inverses of distances (e.g., via a nearest neighbor search).
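
One possible reading of combining charts using the geometric mean of the inverses of distances is sketched below. The function name `combined_similarity`, the `floor` guard against zero distances, and the interpretation of the result as a similarity score (larger meaning more similar) are assumptions for illustration only.

```python
import math
from typing import Sequence


def combined_similarity(distances: Sequence[float], floor: float = 1e-12) -> float:
    """Geometric mean of the inverse distances across channels.

    `distances` holds one distance per channel (e.g., temperature, pressure)
    between a query and a candidate covering the same time span.
    """
    if not distances:
        raise ValueError("at least one distance is required")
    # Clamp tiny distances so that an exact match does not divide by zero.
    inverses = [1.0 / max(d, floor) for d in distances]
    # Geometric mean computed in log space for numerical stability.
    return math.exp(sum(math.log(v) for v in inverses) / len(inverses))


# Hypothetical per-channel distances for two candidate time spans.
candidate_a = combined_similarity([0.5, 2.0])   # close on one channel only
candidate_b = combined_similarity([0.5, 0.5])   # close on both channels
```

Because the geometric mean is pulled down by any single large distance, a candidate generally needs to resemble the query on every channel to score well under this interpretation.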

Groundtruth annotations 250 and metric learning module 255 assist a chart searching system with providing the most accurate results. The system 200 can be provided with a set of queries, provide results, and a user or another module can annotate results that are similar to the query. These annotations can be stored as groundtruth annotations 250, and used in a metric learning algorithm/module 255 to learn a distance matrix 260. This allows a chart searching system to provide more accurate results when performing subsequent queries.

The distance matrix 260 can be decomposed using Cholesky decomposition. Thus, if D is the distance matrix 260, D = L·L^T, where L^T represents the transpose of L. The matrix L can be used to transform features included in the query and channel resolutions at transformation module 240. Subsequently, in some embodiments, transformed sub-charts can be transmitted to a chunk index 270. Of course, in some embodiments the transformation module might not transform a sub-chart or query 210. In any case, the chunk index 270 can store vectors that contain values associated with the features of a sub-chart. For example, a vector may contain 5 values associated with a sub-chart, an average slope associated with a sub-chart, and a min-max value associated with a sub-chart. Further, these seven values could have been transformed by transformation module 240. Once indexed in a chunk index 270, the sub-charts (which can be represented as vectors), can be compared to a query using a nearest neighbours search, and if the sub-charts meet a particular threshold (e.g., similarity threshold on distance (all results closer than a particular distance), and/or a threshold on the number of results (only a few results are shown which are typically the most similar results), etc.), they can be associated with one or more sub-charts. In some embodiments, a threshold amount of closeness can be configured by a user, or can be based on system resources such as bandwidth and/or processing characteristics.
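The decomposition described above can be sketched as follows. The sketch assumes the distance matrix is symmetric and positive definite, and it assumes the transformation applied to a feature vector x is L^T·x, which is the choice that makes the squared Euclidean length of the transformed vector equal to x^T·D·x. The function names are hypothetical.

```python
import math


def cholesky(d):
    """Return the lower-triangular matrix L such that d equals L times L^T.

    Assumes `d` is a symmetric positive definite matrix given as a list of rows.
    """
    n = len(d)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(d[i][i] - s)
            else:
                low[i][j] = (d[i][j] - s) / low[j][j]
    return low


def transform(low, features):
    """Map a feature vector x to L^T x, so that |L^T x|^2 equals x^T D x."""
    n = len(features)
    return [sum(low[k][i] * features[k] for k in range(n)) for i in range(n)]
```

With D = [[4, 2], [2, 3]], `cholesky` gives L = [[2, 0], [1, √2]], and `transform(L, [0, 1])` has squared length 3, which is the matching entry of D.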

The nearest neighbours search module 280 can receive inputs from the transformation module 240. Inputs may include one or more sub-charts stored in the chunk index 270, and one or more queries 210. The nearest neighbours search module can perform a search to determine whether the query comprises features similar to any portions of the sub-charts. After a nearest neighbours search is performed, database module 290 can compare one or more queries 210 with the channel resolutions 220 and 225 from the series database 205 by retrieving raw data from series database 205 for visualization. A result can be a set of results, where each result comprises a start time, an end time, and various channels. Subsequent to the database query, the results may be plotted by plot module 295. The plot module can be used to display results of a chart searching system in a variety of methods, including exporting them as spreadsheets or .CSV files. In some embodiments, a plot module (or another device) can transmit results over a network.
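As an illustration of exporting results as a .CSV file, the sketch below serializes results that each comprise a start time, an end time, and a list of channels. The dictionary keys, the column names, and the use of a semicolon to join channel names are assumptions; the source does not define a file layout.

```python
import csv
import io


def results_to_csv(results):
    """Serialize search results to CSV text, one row per result.

    Each result is assumed to be a dict with hypothetical keys
    "start_time", "end_time", and "channels" (a list of channel names).
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["start_time", "end_time", "channels"])
    for result in results:
        writer.writerow(
            [result["start_time"], result["end_time"], ";".join(result["channels"])]
        )
    return buffer.getvalue()
```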

FIG. 3 is a diagram that shows exemplary charts 300, consistent with embodiments of the present disclosure. Exemplary charts 300 include a chart 310, which can be a portion of a signal as described above. Chart 310 includes sub-chart 320. In some embodiments, sub-chart 320 can be used as a query. In FIG. 3, results 330A-F (collectively 330) can be the results if sub-chart 320 was a query, for example. In various embodiments, results 330 may be organized based on their similarity to a query sub-chart 320. Of course, in various embodiments a chart search system can be configured to provide more than one chart as a result 330, as shown in FIG. 1.

FIG. 4 is a flowchart 400 representing an exemplary method for indexing charts. While the flowchart discloses the following steps in a particular order, at least some of the steps can be performed in a different order, performed in parallel, modified, or deleted where appropriate, consistent with the teachings of the present disclosure. Further, steps may be added to flowchart 400. The indexing can be performed in full or in part by a chart searching system (e.g., chart searching system 110, system 200, etc.). In addition or alternatively, some or all of these steps can be performed in full or in part by other devices and/or modules.

FIG. 4 starts at 410 and receives a chart, as shown in step 420. As discussed above, more than one chart may be received. In any case, after a chart is received, sub-charts are created at step 430 using a sliding window algorithm. The term sub-charts may be used interchangeably with portions of a chart, etc., since a sub-chart is simply a portion of another chart. The sub-charts can overlap with each other. That is to say, that the sub-charts may include portions of other sub-charts since they are created based on a sliding window algorithm. If one were to look at a chart, a first sub-chart may start at time 00:00 and end at time 00:10, while a second sub-chart may start at time 00:01 and end at time 00:11. In some embodiments, a sub-chart may overlap with 99% of another sub-chart. Similarly, in some embodiments, a sub-chart may overlap with 90%, 85%, 50%, or any percentage of another sub-chart. The percentage can be determined by an administrator or a device and be based on various factors such as amounts of storage space, processing power, or other resources.
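The sliding window step can be sketched as below, assuming a chart is available as a sequence of evenly spaced samples. The name `sliding_windows` and the parameters `window` and `step` are hypothetical; for full windows, the overlap between consecutive sub-charts is 1 − step/window, so a window of 10 samples with a step of 1 overlaps by 90%.

```python
def sliding_windows(samples, window, step=1):
    """Yield (start_index, sub_chart) pairs from a sequence of samples.

    Consecutive sub-charts overlap whenever `step` is smaller than `window`.
    Trailing samples that do not fill a whole window are skipped.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    for start in range(0, len(samples) - window + 1, step):
        yield start, samples[start:start + window]
```

A smaller `step` produces more sub-charts and therefore a larger index, which is the storage and processing trade-off noted above.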

At step 440, the sub-charts are indexed. Indexing, as used herein, involves storing sub-charts (e.g., portions of a chart), with one or more identifiers (also known as keys) such as the chart's start time and/or length. For example, each sub-chart can be characterized by a channel, a start time, and a length. An index, in such an example, can be a table as characterized below:

In some embodiments, a table (or database, etc.) can be appended and scaled in real or near-real-time by adding additional rows. It is contemplated that other identifiers can be associated with a sub-chart such as an identifier associated with the larger chart (e.g., channel, signal, etc.) from which the sub-chart is derived. The flowchart 400 then ends at step 450.
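A minimal sketch of such an index follows, keyed by channel, start time, and length as described above. The names `IndexKey` and `build_index` are hypothetical, an in-memory dictionary stands in for the table or database, and start times are assumed to be sample offsets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexKey:
    """Identifier for one sub-chart: its channel, start time, and length."""
    channel: str
    start_time: int
    length: int


def build_index(channel, samples, window, step):
    """Index every sliding-window sub-chart of `samples` under its key."""
    index = {}
    for start in range(0, len(samples) - window + 1, step):
        key = IndexKey(channel, start, window)
        index[key] = tuple(samples[start:start + window])
    return index
```

Adding sub-charts from newly arrived data amounts to inserting further keys, which corresponds to appending rows to the table.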

FIG. 5 is a flowchart 500 representing an exemplary method for chart searching. As with flowchart 400, some or all of the steps included in flowchart 500 can be modified, removed, performed in a different order, or performed in parallel. Also, steps can be added to flowchart 500. Chart searching may be done in whole or in part by an electronic device such as chart searching system 110, system 200, etc. In some embodiments, the chart searching can be performed partially or wholly by additional devices, modules, etc., that are not shown in the figures.

Flowchart 500 starts 510 by receiving a query 520. A query can be a whole chart, a portion of a larger chart, or can include multiple charts. In any case, a query can comprise features that can be compared to and/or associated with charts that have been stored and/or indexed. The query can be received from the same location as the charts that it will be compared to. In some embodiments, a query can be received from another location. For example, it may be provided by a user, retrieved from a network storage device or website, received from a sensor or other electronic device, etc. In some embodiments where a query includes multiple charts, the multiple charts similarly can be received from the same or different places. For example, one of the received charts can be stored in a database with the charts that it is going to be compared with, while another chart can be received from another location. In some embodiments, charts received from different places/locations/devices can include the same time span.

At step 530, a set of charts is received. The set of charts includes the charts that will be compared to the query (or at least the features of each chart will be compared). At step 540, at least the query is normalized. Any type of normalization can be used to normalize the query and/or other charts, as described above. In some embodiments, the amplitude of some or all of the query and/or charts (which as described above can include signals) can be normalized such that their amplitude has a mean value of 0 and energy of 1. Energy of a query and/or chart can be the square norm of its features, which can be stored in a vector.
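The amplitude normalization described above (mean value of 0 and energy of 1, with energy taken as the squared norm) can be sketched as follows. The function name `normalize_energy` is hypothetical, and returning the centered values unchanged for a constant signal is an assumption.

```python
import math


def normalize_energy(values):
    """Return a copy of `values` with mean 0 and energy (squared norm) 1."""
    m = sum(values) / len(values)
    centered = [v - m for v in values]
    energy = sum(c * c for c in centered)
    if energy == 0:
        # A constant signal cannot be scaled to unit energy.
        return centered
    scale = math.sqrt(energy)
    return [c / scale for c in centered]
```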

At step 550, the query is compared to the set of charts. In various embodiments, the way in which a query is compared to a set of charts may vary based on the configuration of a system. In some embodiments, features based on a chart and/or query can be stored in a vector, normalized, and/or compared with one another. The query may be compared with all, or some of the charts stored in a system. For example, a comparison can be made between a query and each sub-chart/sub-window as determined using a sliding-window algorithm. The comparison, as described in an example above, may comprise comparing features stored in vectors. In the example above, the vectors could comprise 70 dimensions (e.g., (5 features, an average slope, a min-max value)×10 sub-charts/sub-windows). When comparing these vectors with a nearest neighbour search, in some embodiments, a Euclidean distance can be used. Alternatively, a metric learning algorithm (e.g., an LMNN search) can be used to learn a distance matrix. Annotations (generated manually or otherwise) can also be associated with metrics and a distance matrix to generate more accurate results. In some embodiments, a comparison can be represented as:

d(feat1, feat2) = (feat1−feat2)^T · M · (feat1−feat2)

where M is a distance matrix and (feat1−feat2)^T is (feat1−feat2) transposed. Such a comparison could increase the cost of calculating the distance from O(D) arithmetic operations to O(D^2) arithmetic operations, where D here denotes the number of dimensions of the feature vectors. In some embodiments, features/vectors can be transformed at the time that a sub-chart is indexed, prior to receipt of a query. When a query is received, the sub-charts have therefore already been transformed and only the query (e.g., chart(s), vector(s) corresponding with the query) needs to be transformed. Thus, in some embodiments where the indexed charts/sub-charts are already transformed, a comparison can be performed using a Euclidean distance, a transformed query, and the set of indexed charts/sub-charts that were previously transformed, thereby reducing the calculations necessary to perform a comparison.
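The difference in cost between the two comparisons can be seen in a short sketch. It assumes the comparison with distance matrix M is the quadratic form of the feature difference, and the function names are hypothetical. The first function touches every entry of M, which is on the order of D^2 operations for D-dimensional vectors, while the second touches each dimension once.

```python
def learned_distance(feat1, feat2, m):
    """Quadratic-form distance (feat1 - feat2)^T M (feat1 - feat2)."""
    diff = [a - b for a, b in zip(feat1, feat2)]
    n = len(diff)
    # Every (i, j) entry of M is visited: about n * n multiplications.
    return sum(diff[i] * m[i][j] * diff[j] for i in range(n) for j in range(n))


def squared_euclidean(feat1, feat2):
    """Squared Euclidean distance: one pass over the dimensions."""
    return sum((a - b) ** 2 for a, b in zip(feat1, feat2))
```

When M is the identity matrix the two functions agree, so the learned distance can be viewed as a weighted generalization of the Euclidean case.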

At step 560, charts from the set of charts that include features similar to the features of the query are returned. For example, the top 10 charts that are closest to a query may be provided and/or displayed for a user. The term closest, as used herein, can refer to the results of a nearest neighbour search (also known as a proximity search or similarity search), with the top results being the charts that are the most similar. The method then ends at step 570.
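Returning the closest charts can be sketched with a brute-force nearest neighbour search, shown below. It assumes the indexed feature vectors are held in a dictionary keyed by a sub-chart identifier and that a Euclidean distance is used; the function name and the parameters `k` and `max_distance` are hypothetical and correspond to the count threshold and the distance threshold mentioned earlier.

```python
import heapq
import math


def nearest_neighbours(query, indexed, k=10, max_distance=None):
    """Return up to `k` (distance, key) pairs, closest first.

    `indexed` maps a sub-chart key to its feature vector. When
    `max_distance` is given, results farther than that are discarded.
    """
    scored = ((math.dist(query, vector), key) for key, vector in indexed.items())
    if max_distance is not None:
        scored = ((d, key) for d, key in scored if d <= max_distance)
    return heapq.nsmallest(k, scored)
```

A brute-force scan is used here only for clarity; the source does not state which search structure is used.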

FIG. 6 is a block diagram of an exemplary computer system 600, consistent with embodiments of the present disclosure. Components of chart search system 110, or system 200, can include the architecture based on or similar to that of computer system 600.

As illustrated in FIG. 6, computer system 600 can include a bus 602 or other communication mechanism for communicating information, and one or more hardware processors 604 (denoted as processor 604 for purposes of simplicity) coupled with bus 602 for processing information. Hardware processor 604 can be, for example, one or more microprocessors or it can be a reduced instruction set of one or more microprocessors.

Computer system 600 also includes a main memory 606, such as a random access memory (RAM) or other dynamic storage device, coupled to bus 602 for storing information and instructions to be executed by processor 604. Main memory 606 also can be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor 604. Such instructions, after being stored in non-transitory storage media accessible to processor 604, render computer system 600 into a special-purpose machine that is customized to perform the operations specified in the instructions.

Computer system 600 further includes a read only memory (ROM) 608 or other static storage device coupled to bus 602 for storing static information and instructions for processor 604. A storage device 610, such as a magnetic disk, optical disk, or USB thumb drive (Flash drive), etc. is provided and coupled to bus 602 for storing information and instructions.

Computer system 600 can be coupled via bus 602 to a display 612, such as a cathode ray tube (CRT), liquid crystal display, or touch screen, for displaying information to a computer user. An input device 614, including alphanumeric and other keys, is coupled to bus 602 for communicating information and command selections to processor 604. Another type of user input device is cursor control 616, such as a mouse, a trackball, or cursor direction keys for communicating direction information and command selections to processor 604 and for controlling cursor movement on display 612. The input device typically has two degrees of freedom in two axes, a first axis (for example, x) and a second axis (for example, y), that allows the device to specify positions in a plane. In some embodiments, the same direction information and command selections as cursor control can be implemented via receiving touches on a touch screen without a cursor.

Computing system 600 can include a user interface module to implement a graphical user interface that can be stored in a mass storage device as executable software codes that are executed by the one or more computing devices. This and other modules can include, by way of example, components, such as software components, object-oriented software components, class components and task components, processes, functions, fields, procedures, subroutines, segments of program code, drivers, firmware, microcode, circuitry, data, databases, data structures, tables, arrays, and variables.

In general, the word “module,” as used herein, refers to logic embodied in hardware or firmware, or to a collection of software instructions, possibly having entry and exit points, written in a programming language, such as, for example, Java, Lua, C or C++. A software module can be compiled and linked into an executable program, installed in a dynamic link library, or written in an interpreted programming language such as, for example, BASIC, Perl, or Python. It will be appreciated that software modules can be callable from other modules or from themselves, and/or can be invoked in response to detected events or interrupts. Software modules configured for execution on computing devices can be provided on a computer readable medium, such as a compact disc, digital video disc, flash drive, magnetic disc, or any other tangible medium, or as a digital download (and can be originally stored in a compressed or installable format that requires installation, decompression, or decryption prior to execution). Such software code can be stored, partially or fully, on a memory device of the executing computing device, for execution by the computing device. Software instructions can be embedded in firmware, such as an EPROM. It will be further appreciated that hardware modules can be comprised of connected logic units, such as gates and flip-flops, and/or can be comprised of programmable units, such as programmable gate arrays or processors. The modules or computing device functionality described herein are preferably implemented as software modules, but can be represented in hardware or firmware. Generally, the modules described herein refer to logical modules that can be combined with other modules or divided into sub-modules despite their physical organization or storage.

Computer system 600 can implement the techniques described herein using customized hard-wired logic, one or more ASICs or FPGAs, firmware and/or program logic which in combination with the computer system causes or programs computer system 600 to be a special-purpose machine. According to some embodiments, the operations, functionalities, and techniques and other features described herein are performed by computer system 600 in response to processor 604 executing one or more sequences of one or more instructions contained in main memory 606. Such instructions can be read into main memory 606 from another storage medium, such as storage device 610. Execution of the sequences of instructions contained in main memory 606 causes processor 604 to perform the process steps described herein. In alternative embodiments, hard-wired circuitry can be used in place of or in combination with software instructions.

The term “non-transitory media” as used herein refers to any non-transitory media storing data and/or instructions that cause a machine to operate in a specific fashion. Such non-transitory media can comprise non-volatile media and/or volatile media. Non-volatile media can include, for example, optical or magnetic disks, such as storage device 610. Volatile media can include dynamic memory, such as main memory 606. Common forms of non-transitory media can include, for example, a floppy disk, a flexible disk, hard disk, solid state drive, magnetic tape, or any other magnetic data storage medium, a CD-ROM, any other optical data storage medium, any physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM or any other flash memory, NVRAM, any other memory chip or cartridge, a cache, a register, and networked versions of the same.

Non-transitory media is distinct from, but can be used in conjunction with, transmission media. Transmission media can participate in transferring information between storage media. For example, transmission media can include coaxial cables, copper wire and fiber optics, including the wires that comprise bus 602. Transmission media can also take the form of acoustic or light waves, such as those generated during radio-wave and infra-red data communications.

Various forms of media can be involved in carrying one or more sequences of one or more instructions to processor 604 for execution. For example, the instructions can initially be carried on a magnetic disk or solid state drive of a remote computer. The remote computer can load the instructions into its dynamic memory and send the instructions over a telephone line using a modem. A modem local to computer system 600 can receive the data on the telephone line and use an infra-red transmitter to convert the data to an infra-red signal. An infra-red detector can receive the data carried in the infra-red signal and appropriate circuitry can place the data on bus 602. Bus 602 carries the data to main memory 606, from which processor 604 retrieves and executes the instructions. The instructions received by main memory 606 can optionally be stored on storage device 610 either before or after execution by processor 604.

Computer system 600 can also include a communication interface 618 coupled to bus 602. Communication interface 618 can provide a two-way data communication coupling to a network link 620 that can be connected to a local network 622. For example, communication interface 618 can be an integrated services digital network (ISDN) card, cable modem, satellite modem, or a modem to provide a data communication connection to a corresponding type of telephone line. As another example, communication interface 618 can be a local area network (LAN) card to provide a data communication connection to a compatible LAN. Wireless links can also be implemented. In any such implementation, communication interface 618 can send and receive electrical, electromagnetic or optical signals that carry digital data streams representing various types of information.

Network link 620 can typically provide data communication through one or more networks to other data devices. For example, network link 620 can provide a connection through local network 622 to a host computer 624 or to data equipment operated by an Internet Service Provider (ISP) 626. ISP 626 in turn can provide data communication services through the world wide packet data communication network now commonly referred to as the “Internet” 628. Local network 622 and Internet 628 can both use electrical, electromagnetic or optical signals that carry digital data streams. The signals through the various networks and the signals on network link 620 and through communication interface 618, which carry the digital data to and from computer system 600, can be example forms of transmission media.

Computer system 600 can send messages and receive data, including program code, through the network(s), network link 620 and communication interface 618. In the Internet example, a server 630 can transmit a requested code for an application program through Internet 628, ISP 626, local network 622 and communication interface 618. The received code can be executed by processor 604 as it is received, and/or stored in storage device 610, or other non-volatile storage for later execution. In some embodiments, server 630 can provide information for being displayed on a display.

Embodiments of the present disclosure have been described herein with reference to numerous specific details that can vary from implementation to implementation. Certain adaptations and modifications of the described embodiments can be made. Other embodiments can be apparent to those skilled in the art from consideration of the specification and practice of the embodiments disclosed herein. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the present disclosure being indicated by the following claims. It is also intended that the sequence of steps shown in figures are only for illustrative purposes and are not intended to be limited to any particular sequence of steps. As such, it is appreciated that these steps can be performed in a different order while implementing the exemplary methods or processes disclosed herein.

Palou, Guillem

Executed on | Assignor | Assignee | Conveyance | Frame/Reel/Doc
Dec 18 2014 | PALOU, GUILLEM | Palantir Technologies Inc | ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS) | 054972/0941
Sep 11 2019 | | PALANTIR TECHNOLOGIES INC. | (assignment on the face of the patent) |
Jun 04 2020 | Palantir Technologies Inc | MORGAN STANLEY SENIOR FUNDING, INC | SECURITY INTEREST (SEE DOCUMENT FOR DETAILS) | 052856/0817
Jul 01 2022 | Palantir Technologies Inc | WELLS FARGO BANK, N A | SECURITY INTEREST (SEE DOCUMENT FOR DETAILS) | 060572/0506
Date Maintenance Fee Events
Sep 11 2019 | BIG: Entity status set to Undiscounted (note the period is included in the code).


Date Maintenance Schedule
Apr 26 2025 | 4 years fee payment window open
Oct 26 2025 | 6 months grace period start (w surcharge)
Apr 26 2026 | patent expiry (for year 4)
Apr 26 2028 | 2 years to revive unintentionally abandoned end. (for year 4)
Apr 26 2029 | 8 years fee payment window open
Oct 26 2029 | 6 months grace period start (w surcharge)
Apr 26 2030 | patent expiry (for year 8)
Apr 26 2032 | 2 years to revive unintentionally abandoned end. (for year 8)
Apr 26 2033 | 12 years fee payment window open
Oct 26 2033 | 6 months grace period start (w surcharge)
Apr 26 2034 | patent expiry (for year 12)
Apr 26 2036 | 2 years to revive unintentionally abandoned end. (for year 12)