Methods, systems and computer readable code for forecasting commodity consumption and for forecasting time series are provided. According to some embodiments, the forecasting includes deriving at least one population commodity consumption forecasting model from population historical consumption data, deriving an individual commodity consumption forecasting model for at least one individual of the population from at least one population commodity consumption forecasting model and from individual historical consumption data, and forecasting future individual commodity consumption for the individual using the individual commodity consumption forecasting model. According to some embodiments, the presently disclosed forecasting includes forecasting future values of an individual time series within a population of time series, where each time series is on the same domain. Thus, according to some embodiments, the forecasting includes deriving at least one population forecasting model from past values of the population of time series, deriving, for at least one individual time series, an individual time series forecasting model from past individual time series values and from at least one population forecasting model, and forecasting future values of said individual time series using the individual time series forecasting model.
|
1. A computer program product encoding a computer program stored on a non-transitory computer readable storage medium for executing a process on a digital computer processor, the process comprising a set of instructions for forecasting commodity consumption by a target individual of a large population of a size of at least on the order of magnitude of one million individuals, the large population including individuals exhibiting irregular historical consumption patterns of a commodity, the target individual not necessarily having a commodity consumption pattern representative of the large population, wherein the set of instructions comprises instructions for:
a) selecting, from the large population of individuals having a size on the order of magnitude of one million, a representative sub-set of the population, the representative sub-set of the population, when combined with the target individual, not required to coincide with the large population;
b) providing historical consumption data describing actual historical consumptions of the commodity by the representative sub-set of the large population during one or more less-recent historical time period(s) and during one or more more-recent historical time period(s);
c) for a plurality of commodity consumption forecast models, evaluating, for the specific case of the representative sub-set selected in step (a), performance of each forecast model of the plurality of forecast models by determining the ability of each forecast model of the plurality of commodity consumption forecast models to predict:
i) consumption of the commodity by the representative population sub-set during the more-recent historical time period(s) from
ii) data describing consumption of the commodity by the representative population sub-set during the less-recent historical time period(s);
d) according to the results of the forecast performance evaluating of step (c), selecting a sub-plurality of commodity consumption forecast models from the plurality of consumption forecast models;
e) for the target individual of the large population, providing historical consumption data describing actual historical consumptions of the commodity by the target individual during one or more less-recent historical time period(s) and during one or more more-recent historical time period(s), the historical consumption data of the target individual not necessarily representative of the large population or of the representative sub-set of the large population;
f) for the selected sub-plurality of commodity consumption forecast models, evaluating, for the specific case of the target individual, performance of each forecast model of the sub-plurality of forecast models by determining the ability of each forecast model of the sub-plurality to predict:
i) consumption of the commodity by the target individual during the more-recent historical time period(s) from
ii) data describing consumption of the commodity by the target individual during the less-recent historical time period(s);
g) formulating a combined forecasting model adapted for the target individual that includes at least some forecast models of the sub-plurality weighted for the target individual, in accordance with the results of the model performance-evaluating for the specific case of the target individual; and
h) forecasting future consumption of the commodity by the target individual using the combined forecast model.
8. A system for forecasting commodity consumption by a target individual of a large population of a size of at least on the order of magnitude of one million individuals, the large population including individuals exhibiting irregular historical consumption patterns of a commodity, the target individual not necessarily having a commodity consumption pattern representative of the large population, the system comprising:
a) a data storage including at least one of volatile and non-volatile memory configured to store:
i) a description of a representative sub-set of the population selected from the large population of individuals having a size at least on the order of magnitude of one million, the representative sub-set of the population, when combined with the target individual, not required to coincide with the large population;
ii) historical consumption data describing actual historical consumptions of the commodity by the representative sub-set of the large population during one or more less-recent historical time period(s) and during one or more more-recent historical time period(s);
iii) a description of a plurality of commodity consumption forecast models; and
iv) historical consumption data describing actual historical consumptions of the commodity by the target individual during one or more less-recent historical time period(s) and during one or more more-recent historical time period(s), the historical consumption data of the target individual not necessarily representative of the large population or of the representative sub-set of the large population;
b) a digital computer processor; and
c) computer code stored on a computer-readable medium, the digital computer processor and the computer code configured such that execution of the computer code by the digital computer processor effects the following steps:
i) selecting the representative sub-set of the population from the large population of individuals having a size at least on the order of magnitude of one million;
ii) for the plurality of commodity consumption forecast models, evaluating, for the specific case of the representative sub-set selected in step (i), performance of each forecast model of the plurality of forecast models by determining the ability of each forecast model of the plurality of commodity consumption forecast models to predict:
A) consumption of the commodity by the representative population sub-set during the more-recent historical time period(s) from
B) data describing consumption of the commodity by the representative population sub-set during the less-recent historical time period(s);
iii) according to the results of the forecast performance evaluating of step (ii), selecting a sub-plurality of commodity consumption forecast models from the plurality of consumption forecast models;
iv) for the selected sub-plurality of commodity consumption forecast models, evaluating, for the specific case of the target individual, performance of each forecast model of the sub-plurality of forecast models by determining the ability of each forecast model of the sub-plurality to predict:
A) consumption of the commodity by the target individual during the more-recent historical time period(s) from
B) data describing consumption of the commodity by the target individual during the less-recent historical time period(s);
v) formulating a combined forecast model adapted for the target individual that includes at least some forecast models of the sub-plurality weighted, for the target individual, in accordance with the results of the model performance-evaluating for the specific case of the target individual; and
vi) forecasting future consumption of the commodity by the target individual using the combined forecast model.
2. The computer program product of
3. The computer program product of
4. The computer program product
5. The method computer program product
6. The method computer program product of
i) applying each commodity consumption model of the plurality of models to historical consumption data for the representative sub-set of the population during less-recent historical time period(s) in order to generate a test forecast of the consumption of the commodity by the representative sub-set during the more-recent historical time period(s); and
ii) comparing the test forecasts to the actual historical consumptions of the commodity by the representative sub-set of the large population during the more-recent historical time period(s).
7. The computer program product
9. The system of
10. The system of
11. The system of
12. The system of
13. The system of
i) applying each commodity consumption model of the plurality of models to historical consumption data for the representative sub-set of the population during less-recent historical time period(s) in order to generate a test forecast of the consumption of the commodity by the representative sub-set during the more-recent historical time period(s); and
ii) comparing the test forecasts to the actual historical consumptions of the commodity by the representative sub-set of the large population during the more-recent historical time period(s).
14. The system of
|
The present invention relates to methods and systems for forecasting commodity consumption and for forecasting time series.
Forecasting Time Series
A time series is a sequence of observations that are ordered in time (e.g., observations made at evenly spaced time intervals). Some examples of time series data may include end-of-month stock prices for General Electric, hits per day at a web site, the volume of usage of a communications network, weekly sales of Windows XP, electrical demand per hour in Seattle, daily high tide readings in San Francisco Bay, etc.
Various forecasting methods exist that attempt to predict future values of the time series based on the past time series data. Some forecasting methods are as simple as continuing the trend curve smoothly by a straight line. Other forecasting methods are more sophisticated. The best-known and most widely used method is the ARIMA (autoregressive integrated moving average) procedure due to Box and Jenkins (George E. P. Box, et al., “Time Series Analysis: Forecasting And Control,” 3rd Edition, Prentice Hall, Feb. 9, 1994), a procedure which assumes that each measurement in a time series is generated by a linear combination of past measurements plus noise.
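By way of non-limiting illustration, the assumption that each measurement is a linear combination of past measurements plus noise can be sketched as a purely autoregressive model fitted by least squares. This is a simplification and not the full Box and Jenkins procedure: differencing and moving-average terms are omitted, and the intercept term, the function names `fit_ar` and `forecast_ar`, and the requirement that the order p be at least 1 are assumptions made for this sketch only.

```python
import numpy as np

def fit_ar(series, p):
    """Least-squares fit of x[t] = c + sum_k a[k] * x[t-k] + noise (p >= 1)."""
    x = np.asarray(series, dtype=float)
    # Each row holds the p values preceding the target x[t], most recent first.
    rows = [x[t - p:t][::-1] for t in range(p, len(x))]
    design = np.column_stack([np.ones(len(rows)), np.array(rows)])
    coef, *_ = np.linalg.lstsq(design, x[p:], rcond=None)
    return coef  # coef[0] is the intercept c, coef[1:] are a[1..p]

def forecast_ar(series, coef, steps):
    """Iterate the fitted recursion forward, feeding forecasts back in."""
    p = len(coef) - 1
    history = list(np.asarray(series, dtype=float))
    out = []
    for _ in range(steps):
        lags = history[-p:][::-1]
        nxt = coef[0] + float(np.dot(coef[1:], lags))
        history.append(nxt)
        out.append(nxt)
    return out

# Noise-free toy series x[t] = 0.5 * x[t-1] + 1, so an AR(1) fit recovers it closely.
x = [0.0]
for _ in range(30):
    x.append(0.5 * x[-1] + 1.0)
coef = fit_ar(x, p=1)
preds = forecast_ar(x, coef, steps=2)
```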
Statistical methods (Gilchrist W., Statistical Forecasting, John Wiley & Sons; December 1976) related to solving forecast problems include Taylor Series Exponential Smoothing, Decision Trees, Neural Network and Heuristic Networks.
There are a number of publications describing time series forecasting as applied to a number of problems. For example, time series forecasting has proven useful for forecasting usage of network resources (see U.S. Pat. No. 5,884,037, U.S. Pat. No. 6,125,105 and US 2001/0013008, the disclosures of which are incorporated herein by reference). Another disclosure providing potentially relevant background information is Wolski R., “Dynamically forecasting network performance using the Network Weather Service” (Cluster Computing, vol. 1, no. 1, pp. 119-132, 1998).
Other applications of time series forecasting include the forecasting of glucose concentration (see U.S. Pat. No. 6,272,364 and U.S. Pat. No. 6,546,269, the disclosures of which are incorporated herein by reference), and the forecasting of macroeconomic data (see Clements M., and Hendry D., “Forecasting Economic Time Series”, Cambridge University Press, 1998).
One known technique for improving forecast quality is to use a multiple forecasting model which combines forecasts obtained from a plurality of different forecasting models. For example, U.S. Pat. No. 6,535,817, the disclosure of which is incorporated herein by reference, discloses methods, systems and computer program products for generating weather forecasts from a multi-model superensemble. In particular, U.S. Pat. No. 6,535,817 and T. N. Krishnamurti et al., “Improved Weather and Seasonal Climate Forecasts from Multimodel Superensemble”, Science, vol. 285, No. 5433, pp. 1548-1550, Sep. 3, 1999, disclose the generation of a model that combines the historical performance of forecasting data from multiple weather forecasting models over a large number of geographic areas or regions to produce a unified forecast.
U.S. Pat. No. 6,032,125 discloses the use of a plurality of neural networks to forecast the sales of products.
Other disclosures providing potentially relevant background material and related to combining forecasts include:
In general, there are a number of difficulties which need to be overcome when combining forecasting models. For applications where limited computational resources are a factor, care must be taken to avoid or minimize redundancy between forecasting models. Furthermore, it is not always clear a priori how much weight to assign to each of the constituent forecast models. In situations where there are numerous models to be combined and numerous time series to be forecast, the computational costs associated with building an appropriate forecasting model for each time series can be prohibitive. Finally, it is noted that issues associated with model overfitting can be difficult to treat appropriately in a combined model.
An important application of time series forecasting methods is the prediction of the future consumption of a commodity from historical consumption data. Exemplary commodities include electricity, natural resources, network bandwidth, and money spent in a retail store.
Forecasting Commodity Consumption by Individuals
Although techniques for forecasting consumption of commodities by an entire population have been disclosed, the more difficult task of predicting future consumption of commodities by individual consumers within a population of consumers remains an open problem. While the former problem addresses the question “how much of the commodity in total will be consumed at a certain time,” the latter problem attempts to forecast “how much will each specific consumer within the large population consume at a certain time.”
The differences between these two problems are substantial. Thus, for many applications the number of commodities to be forecasted is typically small, and the number of consumers is typically large. Because certain consumers exhibit irregular consumption habits, the forecasting of commodity consumption by individual consumers is prone either to overfitting or to large inexactitudes. Furthermore, it is noted that in the interests of building an accurate model, it is, for many applications, desirable to combine forecasting models. For the specific case where a forecast model is needed for a large number of consumers, this can be computationally expensive if specific forecasting models for each individual consumer are combined.
There is an ongoing need for models, systems and computer readable code for forecasting commodity consumptions of individuals within large, non-homogenous populations. One exemplary commercial application relates to comparing predicted future commodity consumption values with actual commodity consumption values. In one specific example, an individual customer consumes wireless telephone services from several providers. Should the individual under-consume the telephone services as compared to the forecast consumption, it could indicate that the consumer has switched to another provider. According to this example, it could be advantageous to offer a discount in order to stimulate consumption of the wireless service by the individual customer.
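By way of non-limiting illustration, the comparison of forecast consumption with actual consumption described in the example above can be sketched as a simple threshold test. The 30% tolerance, the function name `under_consumption_flags` and the sample figures are hypothetical; the disclosure does not specify how large a shortfall should trigger an offer.

```python
def under_consumption_flags(forecast, actual, tolerance=0.3):
    """Flag each period in which actual consumption falls short of the
    forecast by more than the given fraction (tolerance is an assumption)."""
    return [a < (1.0 - tolerance) * f for f, a in zip(forecast, actual)]

# Hypothetical monthly minutes of wireless service for one customer.
forecast_minutes = [100.0, 100.0, 100.0]
actual_minutes = [95.0, 60.0, 80.0]
flags = under_consumption_flags(forecast_minutes, actual_minutes)
```

A period flagged `True` is one in which the individual consumed markedly less than forecast, which in the example above could prompt a discount offer.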
The aforementioned needs are satisfied by several aspects of the present invention.
It is now disclosed for the first time a method of forecasting commodity consumption by an individual within a population. The presently disclosed method includes deriving at least one population commodity consumption forecasting model from population historical consumption data, deriving an individual commodity consumption forecasting model for at least one individual of the population from at least one population commodity consumption forecasting model and from individual historical consumption data, and forecasting future individual commodity consumption for the individual using the individual commodity consumption forecasting model.
According to some embodiments, the population historical consumption data and/or the individual historical consumption data is provided as a time series.
According to some embodiments, the population historical consumption data includes data from a representative subset of the population.
Any technique known in the art for obtaining data from a representative subset is appropriate for the present invention. According to some embodiments, the representative subset is a randomly selected subset.
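By way of non-limiting illustration, a randomly selected subset can be drawn by uniform sampling without replacement over individual identifiers. The function name, the integer identifiers, the subset size and the fixed seed are assumptions made for this sketch; any other sampling technique known in the art could be substituted.

```python
import random

def sample_representative_subset(individual_ids, subset_size, seed=None):
    """Uniform random sample without replacement from the population identifiers."""
    rng = random.Random(seed)  # a local generator keeps the draw reproducible
    ids = list(individual_ids)
    return rng.sample(ids, min(subset_size, len(ids)))

# Hypothetical population of one million individuals identified by integers.
subset = sample_representative_subset(range(1_000_000), 1_000, seed=0)
```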
There is no specific limitation in how the population commodity consumption forecasting model is derived. According to some embodiments, the step of deriving at least one population commodity consumption forecast model includes selecting at least one population commodity consumption forecast model from a plurality of candidate forecast models.
According to some embodiments, selection includes evaluating a forecast quality of the candidate forecast model using the population historical data as a training set.
According to some embodiments, the evaluating of the forecast quality includes determining an aggregate function of qualities of individual forecasts for a plurality of subsets of the population historical data.
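By way of non-limiting illustration, one possible aggregate function is the mean, over several subsets of the population historical data, of the mean absolute error of a candidate model's forecast. The two toy candidate models, the choice of mean absolute error as the per-forecast quality measure (lower is better) and the sample data are assumptions made for this sketch.

```python
from statistics import mean

def naive_last(history, horizon):
    """Candidate model: repeat the last observed value."""
    return [history[-1]] * horizon

def naive_mean(history, horizon):
    """Candidate model: repeat the mean of the history."""
    return [mean(history)] * horizon

def forecast_error(model, history, actual):
    """Mean absolute error of one forecast against the held-out actual values."""
    predicted = model(history, len(actual))
    return mean(abs(p - a) for p, a in zip(predicted, actual))

def aggregate_error(model, cases):
    """Aggregate (here: mean) of the individual forecast errors over all cases."""
    return mean(forecast_error(model, h, a) for h, a in cases)

# Each case pairs a less-recent history with the more-recent values to be predicted.
cases = [
    ([10, 12, 11, 13], [12, 14]),
    ([0, 0, 5, 0], [0, 1]),
]
scores = {m.__name__: aggregate_error(m, cases) for m in (naive_last, naive_mean)}
```

On this toy data the last-value model has the lower aggregate error, so it would rank ahead of the mean-value model during selection.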
According to some embodiments, selection includes an iterative process wherein individual candidate forecast models are appended to a set of previously selected candidate forecast models.
There is no specific limitation on the model selection process. According to some embodiments, the selection process is a greedy selection process.
According to some embodiments, the step of selecting includes employing a genetic selection algorithm.
According to some embodiments, the appending of a single candidate forecast model includes comparing forecast qualities of unchosen candidate forecast models.
According to some embodiments, the single appended forecast model has the best forecast quality among the previously unchosen candidate forecast models.
According to some embodiments, appending of a single candidate forecast model includes analyzing a redundancy parameter of unchosen candidate forecast models relative to the set of said previously chosen candidate forecast models. Thus, according to some embodiments, candidate models which add information to the previously chosen forecast models are chosen.
According to some embodiments, the stage of selecting excludes analyzing forecast quality of a combination of said candidate models.
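By way of non-limiting illustration, the iterative greedy selection described above can be sketched as follows. Each round appends the unchosen candidate with the best score, where the score combines the candidate's own mean error with a redundancy term relative to the already chosen models; no combination of models is ever evaluated. The redundancy measure used here (the largest positive correlation between a candidate's per-test-case errors and those of a chosen model), the additive penalty and all names are assumptions made for this sketch, since the disclosure does not fix a particular redundancy parameter.

```python
import numpy as np

def redundancy(a, b):
    """Positive correlation between two error profiles; 0.0 if either is constant."""
    if a.std() == 0 or b.std() == 0:
        return 0.0  # assumption: a constant profile is treated as non-redundant
    return max(0.0, float(np.corrcoef(a, b)[0, 1]))

def greedy_select(errors, max_models, redundancy_penalty=1.0):
    """errors[i, j]: error of candidate model i on test case j (lower is better)."""
    errors = np.asarray(errors, dtype=float)
    mean_error = errors.mean(axis=1)
    selected = []
    while len(selected) < min(max_models, len(errors)):
        best, best_score = None, np.inf
        for i in range(len(errors)):
            if i in selected:
                continue
            worst_overlap = max(
                (redundancy(errors[i], errors[s]) for s in selected), default=0.0
            )
            score = mean_error[i] + redundancy_penalty * worst_overlap
            if score < best_score:
                best, best_score = i, score
        selected.append(best)
    return selected

# Model 1 is slightly better than model 2 on average but errs exactly where model 0 does.
errors = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 3.1],
    [3.0, 2.0, 1.6],
]
chosen = greedy_select(errors, max_models=2)
```

With the penalty enabled, the second pick is model 2 rather than model 1, because model 1 adds little information beyond model 0; setting `redundancy_penalty=0.0` reduces the procedure to picking by forecast quality alone.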
According to some embodiments, a plurality of population commodity consumption forecasting models is derived.
According to some embodiments, the deriving of the individual commodity consumption forecasting model includes forming a weighted combination of population commodity consumption forecasting models selected from the plurality.
According to some embodiments, the weight coefficients are derived by analyzing forecast quality of a population commodity consumption forecasting model on the individual historical data.
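By way of non-limiting illustration, weight coefficients can be derived from the forecast error that each selected population model achieves on the individual's own historical data, and the individual forecast formed as the weighted sum of the models' forecasts. The inverse-error weighting rule, the small constant guarding against division by zero and the sample figures are assumptions made for this sketch; other weighting rules are equally possible.

```python
def inverse_error_weights(errors, eps=1e-9):
    """Weights proportional to 1 / error, normalized to sum to one."""
    raw = [1.0 / (e + eps) for e in errors]
    total = sum(raw)
    return [w / total for w in raw]

def combine(forecasts, weights):
    """Weighted sum of per-model forecasts, one value per forecast period."""
    horizon = len(forecasts[0])
    return [sum(w * f[t] for w, f in zip(weights, forecasts)) for t in range(horizon)]

# Hypothetical errors of two selected population models on one individual's history.
errors_for_individual = [1.0, 3.0]
weights = inverse_error_weights(errors_for_individual)

# Hypothetical two-period forecasts produced by the same two models.
combined = combine([[10.0, 10.0], [14.0, 18.0]], weights)
```

Here the model with one third of the error receives three times the weight, so the combined forecast lies closer to that model's output.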
According to some embodiments, at least one forecast model selected from the group consisting of the population commodity consumption forecasting model and the individual commodity consumption forecasting model is derived from a statistical forecast model.
According to some embodiments, deriving of the at least one population commodity consumption forecasting model includes deriving a first population commodity consumption forecasting model from a first subset of population historical consumption data and deriving a second population commodity consumption forecasting model from a second subset of the population historical consumption data.
There is no limitation on the type of forecast model. Exemplary forecast models include but are not limited to statistical models, pure statistical models, neural networks, decision trees, heuristic algorithms, simulated annealing algorithms, and genetic algorithms. Furthermore, it is appreciated that any equivalent forecast model or combination of the aforementioned forecast models is appropriate for the present invention.
According to some embodiments, the forecasting of the future individual commodity consumption includes forecasting a network bandwidth consumption.
It is now disclosed for the first time a system for forecasting commodity consumption by an individual within a population. The presently disclosed system includes a population model input for receiving at least one population commodity consumption model associated with population historical consumption data, a consumption data input for receiving individual historical consumption data for at least one individual, a model formulator for deriving from the at least one population commodity consumption model and from the individual historical consumption data an individual commodity consumption forecasting model, and a forecasting unit for outputting a future individual commodity consumption using the individual commodity consumption forecasting model.
According to some embodiments, the population model input is operative to receive a plurality of candidate population commodity consumption models and the model formulator includes a population model selector for selecting at least one population commodity consumption model from a plurality of candidate models.
According to some embodiments, the model formulator is operative to derive the individual commodity consumption forecasting model from a plurality of the population commodity consumption models, and the model formulator includes a model combiner for producing a combined model from the plurality of models and from the individual consumption data for an individual.
According to some embodiments, the model combiner includes a coefficient generator for generating weighting coefficients for the plurality of models.
According to some embodiments, the model formulator includes a model forecast quality analyzer for analyzing a forecast quality associated with the population commodity consumption model relative to the individual historical consumption data of an individual.
It is now disclosed for the first time a computer readable storage medium having computer readable code embodied in the computer readable storage medium, the computer readable code for forecasting commodity consumption by an individual within a population, the computer readable code comprising instructions for deriving at least one population commodity consumption forecasting model from population historical consumption data, deriving an individual commodity consumption forecasting model for at least one individual of the population from at least one population commodity consumption forecasting model and from individual historical consumption data, and forecasting future individual commodity consumption for the individual using the individual commodity consumption forecasting model.
It is now disclosed for the first time a method of forecasting future values of an individual time series within a population of time series, each time series on the same domain. The presently disclosed method includes deriving at least one population forecasting model from past values of the population of time series, deriving for at least one individual time series an individual time series forecasting model from past individual time series values and from the at least one population forecasting model, and forecasting future values of the individual time series using the individual time series forecasting model.
According to some embodiments, the population forecasting model is operative to forecast future values of the population of time series.
It is now disclosed for the first time a computer readable storage medium having computer readable code embodied in the computer readable storage medium, the computer readable code for forecasting future values of an individual time series within a population of time series, each time series on the same domain. The presently disclosed computer readable code includes instructions for deriving at least one population forecasting model from past values of the population of time series, deriving for at least one individual time series an individual time series forecasting model from past individual time series values and from the at least one population forecasting model, and forecasting future values of the individual time series using the individual time series forecasting model.
It is now disclosed for the first time a system for forecasting future values of an individual time series within a population of time series, each time series on the same domain. The presently disclosed system includes a population model input for receiving at least one population forecasting model operative to forecast future values associated with the population of time series, an individual time series input for receiving at least one individual time series, a model formulator for deriving an individual time series forecasting model from past values of the received individual time series and from the received at least one population forecasting model, and a forecasting unit for forecasting future values of the individual time series using the individual time series forecasting model.
These and further embodiments will be apparent from the detailed description and examples that follow.
The present invention will now be described in terms of specific, example embodiments. It is to be understood that the invention is not limited to the example embodiments disclosed. It should also be understood that not every feature of the method and system for forecasting large numbers of time series of the same domain and/or for forecasting commodity consumption is necessary to implement the invention as claimed in any particular one of the appended claims. Various elements and features of devices are described to fully enable the invention. It should also be understood that throughout this disclosure, where a process or method is shown or described, the steps of the method may be performed in any order or simultaneously, unless it is clear from the context that one step depends on another being performed first.
Exemplary embodiments of the present invention will be explained in terms of forecasting commodity consumption by an individual based on both population and individual historical data. The historical data in the presently presented examples is provided as time series data, though it is understood that this is not a limitation of the present invention, and individual and/or population historical data can be provided in any form appropriate for the building of forecast models.
As used herein, “individual historical data” is data about past commodity consumption by an individual within a population of individuals. According to some embodiments, the individual historical data is provided as time series data, i.e. a set of values for specific times or time periods, where each value represents a quantity of commodity consumed by the individual at the specific time or during the specific time period.
In some embodiments, “population historical data” is data related to past commodity consumption by a population of individuals. According to some embodiments, the population historical data is provided as time series data, i.e. a set of values for specific times or time periods, where each value represents a quantity of commodity consumed at the specific time or during the specific time period. According to some embodiments, this quantity of commodity consumed relates to an aggregate commodity consumption by the entire population. Alternatively or additionally, population historical data relates to actual commodity consumption by actual individuals selected from the population, e.g. individuals who are from a representative subset of the population including but not limited to a randomly selected subset of the population.
In some embodiments, the population is represented by a set of individuals, e.g. a representative subset of the population, and each individual is represented by the respective individual historical data.
According to some embodiments, the population is a “large” population, e.g. at least thousands or at least millions of individuals.
Furthermore, it will be appreciated that the presently presented methods and systems for forecasting future values of a time series representing commodity consumption are also applicable for forecasting future values of any individual time series of a given domain from a population of multiple time series of the same domain.
Exemplary time series domains and examples of consumed commodities include but are not limited to hourly usage of bandwidth by each of the consumers (users) of a communications network, or time series of daily expenses of each consumer (customer) of a retail store. Both of these aforementioned time series describe commodity consumption, where in the case of the communications network the bandwidth functions as the resource, while in the specific case of the retail store money spent is the specific resource consumed.
The term “multiple time series” covers any number of time series, from two up to and beyond hundreds of millions. In the communications network example there may be millions of users; in the retail store example there may be thousands of customers.
In accordance with some embodiments of the present invention, methods, systems and computer readable code for forecasting are disclosed. It is noted that unless specified otherwise, these methods, systems and computer readable code can be implemented as software, hardware, or a combination thereof.
It is noted that throughout the figures, variables are defined as follows:
i,l are dummy variables to iterate over population forecast models;
Qh represents a single time series related to the population;
h is a dummy variable to iterate over single time series related to the population;
j is a dummy variable to iterate over historical population test cases Pj related to time series of population historical data; for the embodiments presented in the figures, each test case Pj is given as a five-tuple (Qh, Hk,start, Hk,end, Tk,start, Tk,end);
k is a dummy variable to iterate over pairings between history periods (H) and test forecasting periods (T);
L represents the total number of time series associated with the population;
ValArr1[i,j] represents an evaluation of the effectiveness of the ith population forecast model on the jth test case of data;
r is a total number of candidate population forecasting models (including both those eventually selected and those rejected);
m is the number of test cases selected from population historical data;
Max is the total number of population forecasting models selected;
Sel represents the set of already selected population forecast models;
s is the number of population forecast models in the set Sel;
S1 . . . SN are time series related to a single individual for a single forecasting period F;
α is a dummy variable to iterate over historical individual test cases Iα associated with a single individual; each individual test case Iα is given as a five-tuple (Sβ, Hk,start, Hk,end, Tk,start, Tk,end);
n is the number of test cases selected from individual historical data for the particular individual;
ValArr2[i,α] represents an evaluation of the effectiveness of the ith population forecast model on the αth test case of data;
N is the total number of time series for single individual related to a single forecasting period F; According to some embodiments, each individual can have more than one relevant time series (e.g. N>1).
β is a dummy variable to iterate among these time series related to a single individual;
γ is a dummy variable representing a discrete point in time;
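The five-tuple form of a test case defined above can be illustrated with a minimal Python sketch. The class name `HistoricalCase`, its field names, and the use of inclusive integer indices into the series are assumptions made for illustration only; the definitions above specify only the five components of the tuple.

```python
from typing import List, NamedTuple, Sequence


class HistoricalCase(NamedTuple):
    """Hypothetical container for one test case (Pj or I-alpha)."""
    series: Sequence[float]  # Qh for a population case, S-beta for an individual case
    h_start: int             # Hk,start: first index of the history period
    h_end: int               # Hk,end: last index of the history period (inclusive, assumed)
    t_start: int             # Tk,start: first index of the test forecasting period
    t_end: int               # Tk,end: last index of the test forecasting period (inclusive, assumed)

    def history(self) -> List[float]:
        """Values a model may see when producing the test forecast."""
        return list(self.series[self.h_start:self.h_end + 1])

    def truth(self) -> List[float]:
        """True values against which the test forecast is compared."""
        return list(self.series[self.t_start:self.t_end + 1])


# Illustrative data only: six observations, four used as history, two as test.
usage = [5.0, 7.0, 6.0, 8.0, 9.0, 7.0]
case = HistoricalCase(usage, 0, 3, 4, 5)
```

Here `case.history()` yields the first four values and `case.truth()` the last two.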
According to specific embodiments described in
It is noted that the order of steps presented in
Furthermore, it is noted that although the building of a repository of a population of forecasting models from population historical data 102 is recited as a specific step in
There is no limitation on the specifics of the forecasting models. According to some embodiments, multiple models are created by running the same forecasting algorithm on the same set of population historical data (e.g. time series) using different parameters. For example, in ARIMA different numbers of autoregressive terms may be selected, and each set of parameters yields a different population forecasting model. According to some embodiments, the same exact forecasting algorithm using the same exact forecasting parameters can still yield multiple models when applied to different subsets of the population historical data.
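The idea of obtaining several candidate models from one algorithm by varying its parameters can be sketched as follows. This is an illustrative assumption only: a trivial moving-average forecaster stands in for a real algorithm such as ARIMA, and the names `make_moving_average_model` and `repository` are hypothetical.

```python
from typing import Callable, Dict, List, Sequence

# A forecaster maps (history, horizon) to a list of forecast values.
Forecaster = Callable[[Sequence[float], int], List[float]]


def make_moving_average_model(window: int) -> Forecaster:
    """Return a forecaster that repeats the mean of the last `window` values.

    Stand-in for a parameterised algorithm; `window` plays the role that,
    for example, the number of autoregressive terms plays in ARIMA.
    The history is assumed to be non-empty.
    """
    def forecast(history: Sequence[float], horizon: int) -> List[float]:
        recent = list(history[-window:])
        level = sum(recent) / len(recent)
        return [level] * horizon
    return forecast


# Same algorithm, three parameter settings, three candidate models A1..A3.
repository: Dict[str, Forecaster] = {
    f"A{i}": make_moving_average_model(w)
    for i, w in enumerate([1, 3, 6], start=1)
}
```

Each entry of `repository` is then a distinct candidate population forecasting model, even though all three share one underlying algorithm.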
According to some embodiments, different population forecasting models are derived from different specific forecasting algorithms. Exemplary forecasting algorithms include but are not limited to forecasting algorithms derived from statistical algorithms (e.g. ARIMA), expert systems, neural networks, and the like. The providing of the population forecasting model can involve automatic methods, manual methods or any combination thereof. Thus, according to some embodiments, parameters for the forecasting algorithms (e.g. ARIMA parameters) are selected in part by a human expert. Alternatively or additionally, automatic techniques are used, such as application of a genetic algorithm on one or more samples of one or more population time series.
Thus, as described in
The selection of appropriate test cases (step 410) may be carried out by any method known in the art. Thus, according to some embodiments, test cases are selected randomly by randomly selecting the history and the test forecasting period. Alternatively or additionally, test cases are selected based on domain knowledge. In one particular example, it is known that there is a correlation between certain behavior in the historic data (e.g. consumption of the commodity) and certain months of the year. Thus, domain knowledge may be used in order to select test cases appropriately.
Alternatively or additionally, based on properties previously discovered, different algorithms such as genetic algorithms are used to learn which test cases are good test cases for predicting the performance of the models in a certain domain of time series. It is further understood that any combination of the aforementioned techniques for selecting test cases is appropriate.
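Random selection of test cases, the simplest of the options mentioned above, might be sketched as below. The function name `random_test_case` and the assumption that the test forecasting period immediately follows the history period are illustrative choices, not requirements stated in the text.

```python
import random
from typing import Tuple


def random_test_case(series_count: int, series_length: int,
                     history_len: int, test_len: int,
                     rng: random.Random) -> Tuple[int, int, int, int, int]:
    """Pick a series index h and random history/test windows.

    Returns (h, h_start, h_end, t_start, t_end) with inclusive indices.
    Assumes the test period starts right after the history period ends.
    """
    h = rng.randrange(series_count)
    # Latest start that still leaves room for both windows.
    h_start = rng.randrange(series_length - history_len - test_len + 1)
    h_end = h_start + history_len - 1
    t_start = h_end + 1
    t_end = t_start + test_len - 1
    return (h, h_start, h_end, t_start, t_end)


# Illustrative values: 1000 series of 24 points, 12-point history, 6-point test.
rng = random.Random(0)
cases = [random_test_case(1000, 24, 12, 6, rng) for _ in range(50)]
```

A selection driven by domain knowledge, for example windows aligned to particular months, would replace the two `randrange` calls with a deterministic rule.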
Steps 420 through 480 describe the nested loop in which the forecast performance of each model is evaluated using the specific population test cases, where i is the dummy variable to iterate over a total of r population forecast models (see steps 420, 470 and 480) while j is the dummy variable to iterate over a total of m population test cases (see steps 430, 450 and 460). In step 440 the two dimensional array ValArr1 is set to record the performance of model Ai on test case Pj. The evaluation is calculated by a function Eval, and is based on comparing the forecasted values with the true values in the time series. For the purposes of these examples, the convention that a lower value of Eval indicates a better quality forecast has been adopted. The present invention imposes no specific limitation on the Eval function that is adopted. In one exemplary embodiment, a function Eval returns the sum of the absolute values of the differences between the forecasted values and the true values at each point in the forecast period.
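The nested evaluation loop and the exemplary Eval function (sum of absolute differences) can be sketched as follows. The function names, the representation of a test case as a `(history, truth)` pair, and the two toy models are assumptions for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Forecaster = Callable[[Sequence[float], int], List[float]]
Case = Tuple[List[float], List[float]]  # (history values, true test values)


def eval_forecast(forecast: Sequence[float], truth: Sequence[float]) -> float:
    """Exemplary Eval: sum of absolute errors; lower means a better forecast."""
    return sum(abs(f - t) for f, t in zip(forecast, truth))


def build_val_arr(models: Sequence[Forecaster],
                  test_cases: Sequence[Case]) -> List[List[float]]:
    """Return val_arr with val_arr[i][j] = Eval of model i on test case j."""
    val_arr = []
    for model in models:                     # outer loop over i (r models)
        row = []
        for history, truth in test_cases:    # inner loop over j (m test cases)
            row.append(eval_forecast(model(history, len(truth)), truth))
        val_arr.append(row)
    return val_arr


def last_value(history, horizon):
    """Toy model: repeat the last observed value."""
    return [history[-1]] * horizon


def mean_value(history, horizon):
    """Toy model: repeat the mean of the history."""
    return [sum(history) / len(history)] * horizon


test_cases = [([1.0, 2.0, 3.0], [4.0, 5.0]), ([4.0, 4.0], [4.0])]
val_arr1 = build_val_arr([last_value, mean_value], test_cases)
```

On this toy data the first model scores 3.0 and the second 5.0 on the first test case, so the first model is the better of the two there under the lower-is-better convention.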
Furthermore, it is noted that in
MinDistance is the minimal distance required between Sel and a candidate model in order to consider the addition of the model to Sel. MinDistance is used to ensure that the set Sel will not be populated by models whose predictions are too homogeneous and therefore fail to cover all the test cases. Choosing a larger MinDistance imposes a more stringent requirement of non-redundancy between Sel and a given model Ai to be appended to Sel. The measure used for this requirement is given in step 520 by Distance(Ai,Sel).
In step 520, a specific model Ai is selected for appending to Sel if it meets the following conditions: the model Ai must not have already been selected, e.g. Ai∉Sel; the model Ai must exhibit a minimum non-redundancy relative to the previously selected models in Sel, e.g. Distance(Ai,Sel)≥MinDistance; and the model Ai must be the model with the best forecast quality, e.g. $\sum_{j=1}^{m} \mathrm{ValArr1}[i,j] \le \sum_{j=1}^{m} \mathrm{ValArr1}[l,j]$, amongst all previously non-selected models Al meeting the non-redundancy requirement, e.g. Distance(Al,Sel)≥MinDistance. Furthermore, it will be appreciated that the aforementioned criteria for best forecast quality and non-redundancy are exemplary criteria, and other appropriate forecast quality and/or redundancy criteria may be employed.
Upon selection of the appropriate model Ai, this model is appended to the set of selected models Sel, and the variable measuring the size of Sel is incremented. In the event that no such model was selected (step 530) or if the size of the set of previously selected models Sel exceeds Max (step 560), no further models are selected (step 570).
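The greedy selection just described can be sketched in Python. The text does not define the Distance function in this passage, so the definition used here (the smallest mean absolute difference between a candidate's row of evaluation scores and the row of any already selected model, with an infinite distance to an empty set) is purely an assumption. Stopping as soon as Sel holds `max_models` entries is likewise one reading of the Max condition.

```python
import math
from typing import List, Sequence


def distance(i: int, sel: Sequence[int], val_arr: Sequence[Sequence[float]]) -> float:
    """Assumed Distance(Ai, Sel): min over selected k of mean |ValArr1[i][j] - ValArr1[k][j]|."""
    if not sel:
        return math.inf  # any model is non-redundant relative to an empty set
    m = len(val_arr[i])
    return min(
        sum(abs(a - b) for a, b in zip(val_arr[i], val_arr[k])) / m
        for k in sel
    )


def select_models(val_arr: Sequence[Sequence[float]],
                  max_models: int, min_distance: float) -> List[int]:
    """Greedily pick the best-scoring model that is far enough from Sel."""
    sel: List[int] = []
    while len(sel) < max_models:
        candidates = [
            i for i in range(len(val_arr))
            if i not in sel and distance(i, sel, val_arr) >= min_distance
        ]
        if not candidates:
            break  # no model satisfies the non-redundancy requirement
        # Best forecast quality = lowest total Eval over all test cases.
        best = min(candidates, key=lambda i: sum(val_arr[i]))
        sel.append(best)
    return sel


# Illustrative scores: model 1 is nearly a copy of model 0 and is skipped.
scores = [[1, 1, 1], [1, 1, 2], [5, 0, 5], [9, 9, 9]]
selected = select_models(scores, max_models=2, min_distance=1.0)
```

With these scores, model 0 is chosen first for its lowest total, model 1 is rejected as too similar to it, and model 2 is chosen next, so `selected` is `[0, 2]`.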
It is noted that the exemplary selection process described in
After selecting a subset of population models in the original repository (step 106) for the population (or population of time series), it is possible to adapt the selected models according to individual historical data (step 108) for each individual of interest. Thus, for each individual (or for each individual time series), an individual forecasting model is built (step 110) by adapting the subset of population forecasting models using historical data associated with the individual.
Towards this end, the performance of the selected population models is evaluated for each individual or time series of interest, and the forecasts are then combined into a single forecast, adapted or optimized for the individual. This adaptation is carried out according to the performance of the different models on the test cases relevant for the individual.
It is noted that this adaptation can be accomplished using any appropriate method, and in the following figures an exemplary technique is presented. According to this exemplary technique, the models that perform best on the test cases are given greater consideration or weight when building the combined individual-specific model. This weight is determined according to the success of a population model for the individual specific test cases.
Referring to
The selection of appropriate test cases (step 620) may be carried out by any method known in the art. Thus, according to some embodiments, test cases are selected randomly by randomly selecting the history and the test forecasting period. Alternatively or additionally, test cases are selected based on domain knowledge. In one particular example, it is known that there is a correlation between certain behavior in the historic data (e.g. consumption of the commodity) and certain months of the year. Thus, domain knowledge may be used in order to select test cases appropriately.
Alternatively or additionally, based on properties previously discovered, different algorithms such as genetic algorithms are used to learn which test cases are good test cases for predicting the performance of the models in a certain domain of time series. It is further understood that any combination of the aforementioned techniques for selecting test cases is appropriate.
It is noted that step 750 relates to the fact that a total of n individual-specific test cases are analyzed in
After the two dimensional array ValArr2 is populated with the appropriate values, the performance values of the various population models on individual test cases as recorded in the array ValArr2 are utilized in order to populate (step 780) the one dimensional array WeightArr of size s. The weights are calculated based on the performance of each model on the test cases. Any method known in the art for deriving or calculating values of WeightArr is appropriate for the present invention. In one exemplary embodiment, the model that performed best in the greatest number of test cases is given a weight of 1 and all other models are given a weight of 0, thereby allowing for selection of one particular population model for a given individual with a given forecasting period, though it is understood that other methods are appropriate.
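The exemplary weighting rule (weight 1 for the model that wins the most individual test cases, weight 0 for the rest) can be sketched as below. The function names are hypothetical, ties are broken in favour of the lowest model index as an arbitrary choice, and the `combine` helper, which forms a weighted sum of the models' forecasts, is an assumption about how the weights might be applied rather than a statement of the disclosed step.

```python
from typing import List, Sequence


def one_hot_weights(val_arr2: Sequence[Sequence[float]]) -> List[float]:
    """Build WeightArr: 1.0 for the model best on the most test cases, else 0.0.

    val_arr2[i][a] is the Eval score of selected model i on individual
    test case a; lower is better. Ties go to the lowest index (assumed).
    """
    s = len(val_arr2)        # number of selected population models
    n = len(val_arr2[0])     # number of individual test cases
    wins = [0] * s
    for a in range(n):
        best = min(range(s), key=lambda i: val_arr2[i][a])
        wins[best] += 1
    winner = max(range(s), key=lambda i: wins[i])
    return [1.0 if i == winner else 0.0 for i in range(s)]


def combine(forecasts: Sequence[Sequence[float]],
            weights: Sequence[float]) -> List[float]:
    """Assumed combination rule: weighted sum of the per-model forecasts."""
    horizon = len(forecasts[0])
    return [
        sum(w * f[t] for w, f in zip(weights, forecasts))
        for t in range(horizon)
    ]


# Illustrative scores: model 1 is best on two of the three test cases.
val_arr2 = [[2, 5, 4], [3, 1, 1]]
weight_arr = one_hot_weights(val_arr2)
individual_forecast = combine([[10, 12], [20, 22]], weight_arr)
```

With one-hot weights the combined forecast reduces to the forecast of the single winning model; a softer scheme, such as weights inversely proportional to total error, would blend several models instead.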
The array WeightArr provides relative weights of the population models appropriate or adapted for the individual. Thus, according to exemplary embodiments presented, the individual forecast model with which future values (e.g. commodity consumption) for the particular individual are to be forecast (step 790) for a given forecast period F is represented by the set of population models Sel and the weight array WeightArr. Exemplary implementations for step 790 are described in
According to the exemplary embodiments of
Some embodiments of the present invention provide systems for implementing any of the aforementioned methods.
In the two phrases “less-recent historical time period” and “more-recent historical time period” the words “less” and “more” are to be taken as relative to one another. Accordingly, when these two phrases appear near each other in the same sentence, each of the two phrases is to be interpreted relative to the nearest other of the two phrases in the same sentence.
In the description and claims of the present application, each of the verbs, “comprise” “include” and “have”, and conjugates thereof, are used to indicate that the object or objects of the verb are not necessarily a complete listing of members, components, elements or parts of the subject or subjects of the verb.
The present invention has been described using detailed descriptions of embodiments thereof that are provided by way of example and are not intended to limit the scope of the invention. The described embodiments comprise different features, not all of which are required in all embodiments of the invention. Some embodiments of the present invention utilize only some of the features or possible combinations of the features. Variations of embodiments of the present invention that are described and embodiments of the present invention comprising different combinations of features noted in the described embodiments will occur to persons of the art. The scope of the invention is limited only by the following claims.
Solotorevsky, Gad, Pedan, Sveta
Executed on | Assignor | Assignee | Conveyance | Frame | Reel | Doc |
Jun 19 2005 | Cvidya Networks | (assignment on the face of the patent) | / | |||
Dec 18 2006 | SOLOTOREVSKY, GAD | CVIDYA NETWORKS LTD | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 026105 | /0242 | |
Dec 18 2006 | PEDAN, SVETA | CVIDYA NETWORKS LTD | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 026105 | /0242 | |
Mar 18 2009 | CVIDYA NETWORKS LTD | PLENUS II, LIMITED PARTNERSHIP | CORRECTIVE ASSIGNMENT TO CORRECT THE SECURITY AGREEMENT NAMES OF ASSIGNEES PREVIOUSLY RECORDED ON REEL 022425 FRAME 0703 ASSIGNOR S HEREBY CONFIRMS THE SECURITY AGREEMENT | 022486 | /0797 | |
Mar 18 2009 | CVIDYA NETWORKS LTD | PLENUS MANAGEMENT 2004 LTD | SECURITY AGREEMENT | 022425 | /0703 | |
Mar 18 2009 | CVIDYA NETWORKS LTD | PLENUS III C I , LP | CORRECTIVE ASSIGNMENT TO CORRECT THE SECURITY AGREEMENT NAMES OF ASSIGNEES PREVIOUSLY RECORDED ON REEL 022425 FRAME 0703 ASSIGNOR S HEREBY CONFIRMS THE SECURITY AGREEMENT | 022486 | /0797 | |
Mar 18 2009 | CVIDYA NETWORKS LTD | PLENUS III 2 LIMITED PARTNERSHIP | CORRECTIVE ASSIGNMENT TO CORRECT THE SECURITY AGREEMENT NAMES OF ASSIGNEES PREVIOUSLY RECORDED ON REEL 022425 FRAME 0703 ASSIGNOR S HEREBY CONFIRMS THE SECURITY AGREEMENT | 022486 | /0797 | |
Mar 18 2009 | CVIDYA NETWORKS LTD | PLENUS III D C M LIMITED PARTNERSHIP | CORRECTIVE ASSIGNMENT TO CORRECT THE SECURITY AGREEMENT NAMES OF ASSIGNEES PREVIOUSLY RECORDED ON REEL 022425 FRAME 0703 ASSIGNOR S HEREBY CONFIRMS THE SECURITY AGREEMENT | 022486 | /0797 | |
Mar 18 2009 | CVIDYA NETWORKS LTD | PLENUS III, LIMITED PARTNERSHIP | CORRECTIVE ASSIGNMENT TO CORRECT THE SECURITY AGREEMENT NAMES OF ASSIGNEES PREVIOUSLY RECORDED ON REEL 022425 FRAME 0703 ASSIGNOR S HEREBY CONFIRMS THE SECURITY AGREEMENT | 022486 | /0797 | |
Mar 18 2009 | CVIDYA NETWORKS LTD | PLENUS II D C M LIMITED PARTNERSHIP | CORRECTIVE ASSIGNMENT TO CORRECT THE SECURITY AGREEMENT NAMES OF ASSIGNEES PREVIOUSLY RECORDED ON REEL 022425 FRAME 0703 ASSIGNOR S HEREBY CONFIRMS THE SECURITY AGREEMENT | 022486 | /0797 | |
Mar 18 2009 | CVIDYA NETWORKS LTD | PLENUS MANAGEMENT III 2007 LTD | SECURITY AGREEMENT | 022425 | /0703 | |
Jan 07 2010 | CVIDYA NETWORKS LTD | PLENUS II, LIMITED PARTNERSHIP | SECURITY AGREEMENT | 023790 | /0170 | |
Jan 07 2010 | CVIDYA NETWORKS LTD | PLENUS III, LIMITED PARTNERSHIP | SECURITY AGREEMENT | 023790 | /0170 | |
Jan 07 2010 | CVIDYA NETWORKS LTD | PLENUS III D C M , LIMITED PARTNERSHIP | SECURITY AGREEMENT | 023790 | /0170 | |
Jan 07 2010 | CVIDYA NETWORKS LTD | PLENUS III 2 , LIMITED PARTNERSHIP | SECURITY AGREEMENT | 023790 | /0170 | |
Jan 07 2010 | CVIDYA NETWORKS LTD | PLENUS III C I , L P | SECURITY AGREEMENT | 023790 | /0170 | |
Jan 07 2010 | CVIDYA NETWORKS LTD | PLENUS II D C M , LIMITED PARTNERSHIP | SECURITY AGREEMENT | 023790 | /0170 | |
Mar 09 2016 | PLENUS III C I L P | CVIDYA NETWORKS, LTD | RELEASE BY SECURED PARTY SEE DOCUMENT FOR DETAILS | 037933 | /0876 | |
Mar 09 2016 | PLENUS III 2 , LIMITED PARTNERSHIP | CVIDYA NETWORKS, LTD | RELEASE BY SECURED PARTY SEE DOCUMENT FOR DETAILS | 037933 | /0876 | |
Mar 09 2016 | PLENUS III, LIMITED PARTNERSHIP | CVIDYA NETWORKS, LTD | RELEASE BY SECURED PARTY SEE DOCUMENT FOR DETAILS | 037933 | /0876 | |
Mar 09 2016 | PLENUS II D C M , LIMITED PARTNERSHIP | CVIDYA NETWORKS, LTD | RELEASE BY SECURED PARTY SEE DOCUMENT FOR DETAILS | 037933 | /0876 | |
Mar 09 2016 | PLENUS II, LIMITED PARTNERSHIP | CVIDYA NETWORKS, LTD | RELEASE BY SECURED PARTY SEE DOCUMENT FOR DETAILS | 037933 | /0876 | |
Mar 09 2016 | PLENUS III D C M , LIMITED PARTNERSHIP | CVIDYA NETWORKS, LTD | RELEASE BY SECURED PARTY SEE DOCUMENT FOR DETAILS | 037933 | /0876 | |
Jun 04 2016 | CVIDYA NETWORKS LTD | AMDOCS DEVELOPMENT LTD | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 038983 | /0371 |
Date | Maintenance Fee Events |
Jun 26 2015 | M2551: Payment of Maintenance Fee, 4th Yr, Small Entity. |
Jan 11 2018 | BIG: Entity status set to Undiscounted (note the period is included in the code). |
Jun 24 2019 | M1552: Payment of Maintenance Fee, 8th Year, Large Entity. |
Sep 18 2023 | REM: Maintenance Fee Reminder Mailed. |
Mar 04 2024 | EXP: Patent Expired for Failure to Pay Maintenance Fees. |
Date | Maintenance Schedule |
Jan 31 2015 | 4 years fee payment window open |
Jul 31 2015 | 6 months grace period start (w surcharge) |
Jan 31 2016 | patent expiry (for year 4) |
Jan 31 2018 | 2 years to revive unintentionally abandoned end. (for year 4) |
Jan 31 2019 | 8 years fee payment window open |
Jul 31 2019 | 6 months grace period start (w surcharge) |
Jan 31 2020 | patent expiry (for year 8) |
Jan 31 2022 | 2 years to revive unintentionally abandoned end. (for year 8) |
Jan 31 2023 | 12 years fee payment window open |
Jul 31 2023 | 6 months grace period start (w surcharge) |
Jan 31 2024 | patent expiry (for year 12) |
Jan 31 2026 | 2 years to revive unintentionally abandoned end. (for year 12) |