A computer-executable method, system, and computer program product for managing a flash cache having modes of cache management. The computer-executable method, system, and computer program product may be enabled to optimize the flash cache by using a model to determine an optimized mode for the flash cache.
1. A computer-executable method, on one or more processors and memory, for managing a flash cache, having modes of cache management, the computer-executable method comprising:
optimizing, via the one or more processors and memory, the flash cache by using a model to determine a first mode of the modes of the flash cache, wherein the model is used to analyze an activity of the flash cache.
6. A system, comprising:
a flash cache having modes of cache management; and
computer-executable program logic operating in memory, wherein the computer-executable program logic is configured to enable one or more processors to execute:
optimizing the flash cache by using a model to determine a first mode of the modes of the flash cache, wherein the model is used to analyze an activity of the flash cache.
11. A computer program product, on one or more processors and memory, for managing a flash cache, having one or more modes, the computer program product comprising:
a non-transitory computer readable medium encoded with computer-executable program code executing on the one or more processors and memory, the code configured to enable the execution of:
optimizing the flash cache by using a model to determine a first mode of the modes of the flash cache, wherein the model is used to analyze an activity of the flash cache.
2. The computer-executable method of
3. The computer-executable method of
communicating simulated data to the flash cache;
collecting statistics while the flash cache is processing the simulated data; and
analyzing the statistics to create a model which includes a mapping of simulated data to the modes of the flash cache.
5. The computer-executable method of
collecting statistics on the activity of the flash cache; and
analyzing the collected statistics on the activity using the model to determine the first mode of the modes of the flash cache.
7. The system of
8. The system of
communicating simulated data to the flash cache;
collecting statistics while the flash cache is processing the simulated data; and
analyzing the statistics to create a model which includes a mapping of simulated data to the modes of the flash cache.
10. The system of
collecting statistics on the activity of the flash cache; and
analyzing the collected statistics on the activity using the model to determine the first mode of the modes of the flash cache.
12. The computer program product of
13. The computer program product of
communicating simulated data to the flash cache;
collecting statistics while the flash cache is processing the simulated data; and
analyzing the statistics to create a model which includes a mapping of simulated data to the modes of the flash cache.
15. The computer program product of
collecting statistics on the activity of the flash cache; and
analyzing the collected statistics on the activity using the model to determine the first mode of the modes of the flash cache.
A portion of the disclosure of this patent document may contain command formats and other computer language listings, all of which are subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
This invention relates to data storage.
Computer systems are constantly improving in terms of speed, reliability, and processing capability. As is known in the art, computer systems which process and store large amounts of data typically include one or more processors in communication with a shared data storage system in which the data is stored. The data storage system may include one or more storage devices, usually of a fairly robust nature and useful for storage spanning various temporal requirements, e.g., disk drives. The one or more processors perform their respective operations using the storage system. Mass storage systems (MSS) typically include an array of a plurality of disks with on-board intelligent and communications electronics and software for making the data on the disks available.
Companies that sell data storage systems and the like are very concerned with providing customers with an efficient data storage solution that minimizes cost while meeting customer data storage needs. It would be beneficial for such companies to have a way to reduce the complexity of implementing data storage.
A computer-executable method, system, and computer program product for managing a flash cache, having modes of cache management, the computer-executable method, system, and computer program product comprising optimizing the flash cache by using a model to determine a first mode of the modes of the flash cache, wherein the model is used to analyze an activity of the flash cache.
Objects, features, and advantages of embodiments disclosed herein may be better understood by referring to the following description in conjunction with the accompanying drawings. The drawings are not meant to limit the scope of the claims included herewith. For clarity, not every element may be labeled in every figure. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating embodiments, principles, and concepts. Thus, features and advantages of the present disclosure will become more apparent from the following detailed description of exemplary embodiments thereof taken in conjunction with the accompanying drawings in which:
Typically, flash caching solutions operate in one or more modes. Generally, performance of a flash caching solution may be affected depending on which mode is chosen. Traditionally, a mode may be chosen based on prior knowledge of a data storage device or a data storage system. Conventionally, being able to dynamically optimize cache performance of a flash caching solution may be beneficial to the performance of a data storage device or a data storage system.
In many embodiments, the current disclosure may enable dynamic optimization of a flash caching solution. In some embodiments, the current disclosure may enable a user, customer, or administrator to use a learning scenario to create a model to optimize a flash caching solution. In various embodiments, a learning scenario may include collecting performance statistics while a flash caching solution processes a simulated scenario. In particular embodiments, a simulated scenario may include having a flash caching solution process a read and/or write scenario. In certain embodiments, a simulated scenario may include having a flash caching solution process simulated data. In many embodiments, a simulated scenario may include having a flash caching solution process a read and/or write scenario using simulated data.
In many embodiments, the current disclosure may enable a flash caching solution to switch between cache modes to test each cache mode. In various embodiments, the current disclosure may enable a flash caching solution to choose the best mode suitable for a given working environment. In some embodiments, the best mode suitable for a given working environment may include a mode where a flash caching solution may operate faster than other available modes. In other embodiments, the current disclosure may enable a flash caching solution to collect data in one mode which may be used to determine which mode may be best suited for a given working environment. In many embodiments, a best mode suitable for a given working environment may include an optimized mode. In some embodiments, determining an optimized mode may include determining whether an application may have a faster storage response time in a particular mode, thus performing data storage operations faster than when configured in an alternate mode.
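The mode-testing idea above can be sketched in a few lines: run the same workload under each cache mode and keep the mode with the fastest response time. This is a minimal illustration, not any product's actual interface; the mode names and the `run_workload()` benchmark (with its hard-coded latencies) are assumptions standing in for real measurements.

```python
# Hypothetical sketch: try each cache mode on a replayable workload and
# keep the one with the lowest measured latency. Mode names and the
# run_workload() benchmark are illustrative, not a real API.

def run_workload(mode: str) -> float:
    """Stand-in benchmark: returns average IO latency (ms) for a mode."""
    simulated_latency = {"cache_all_writes": 0.42,
                         "invalidate_on_write": 0.58,
                         "cache_on_write_hit": 0.37}
    return simulated_latency[mode]

def choose_optimized_mode(modes):
    # The optimized mode is the one with the fastest storage response
    # time on the given (simulated) working environment.
    return min(modes, key=run_workload)

modes = ["cache_all_writes", "invalidate_on_write", "cache_on_write_hit"]
print(choose_optimized_mode(modes))
```

In practice the benchmark would replay recorded or simulated IO against the cache in each mode rather than return canned numbers.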
In various embodiments, the current disclosure may enable a user, customer, or administrator to use flash cache performance statistics to create a model that may enable optimization of a flash caching solution. In some embodiments, a model may be created by analyzing a flash caching solution in one or more simulated scenarios. In particular embodiments, a model may be created by analyzing performance statistics collected while a flash caching solution may be processing one or more simulated scenarios. In some embodiments, the analysis may include comparing operation of a flash caching solution in one or more modes while the flash caching solution is processing a simulated scenario. In some embodiments, a flash caching solution may have one or more modes enabling the caching solution to be optimized based on current or future usage of a data storage device and/or data storage system. In various embodiments, a caching solution may include EMC VFCache™.
Typically, EMC VFCache™ may be a server flash caching solution that reduces latency and increases throughput to dramatically improve application performance by leveraging intelligent caching software and PCIe flash technology. Generally, VFCache may accelerate reads and protect data by using a write-through cache to the networked storage to deliver persistent high availability, integrity, and disaster recovery. Conventionally, on a write request, VFCache may first write to an array, then to the cache, and then may complete the application IO. Traditionally, on a read request, VFCache may satisfy a request with cached data, or, when the data is not present, may retrieve the data from the array, write the data to cache, and then may return the data to the application.
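The write-through read and write paths described above can be illustrated with a toy in-memory sketch. The class and its dictionaries are invented for illustration only and do not reflect VFCache's actual interfaces; the point is the ordering: writes go to the array first and then to the cache, while reads are served from the cache on a hit and populated from the array on a miss.

```python
# Illustrative sketch of a write-through cache: writes hit the array
# first, then the cache; reads are served from cache on a hit, else
# fetched from the array and cached. All names are assumptions.

class WriteThroughCache:
    def __init__(self, array: dict):
        self.array = array      # stand-in for networked storage
        self.cache = {}         # stand-in for PCIe flash

    def write(self, key, value):
        self.array[key] = value   # 1. write to the array first
        self.cache[key] = value   # 2. then to the cache
        return True               # 3. then complete the application IO

    def read(self, key):
        if key in self.cache:         # hit: serve from flash
            return self.cache[key]
        value = self.array[key]       # miss: retrieve from the array,
        self.cache[key] = value       # write it to cache,
        return value                  # and return it to the application

backend = {"lun0": b"data"}
c = WriteThroughCache(backend)
c.write("lun1", b"new")
print(c.read("lun0"), c.read("lun1"))
```

Because every write also lands on the array, the cache can be lost without losing data, which is the high-availability property the paragraph describes.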
Generally, VFCache may operate in one or more modes. Traditionally, the modes of VFCache may include Mode #1, a cache-all-writes mode or current write mode; Mode #2, a no-caching-for-writes mode or invalidate mode; and Mode #3, a cache-on-write-hit mode. Typically, Mode #1 may mean that every write received is written to the cache. Generally, Mode #2 may mean that corresponding entries in memory may be marked invalid. Conventionally, Mode #3 may mean that data is written to the cache only in instances where the data may already exist in the cache. Traditionally, there may be difficulties switching between cache modes in order to test each mode and choose the best cache mode for a given working environment.
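The three write modes above differ only in how a write request touches the cache, which a short sketch can make concrete. The mode constants and the `handle_write()` helper are hypothetical names for illustration, not part of VFCache.

```python
# Sketch of the three write-mode behaviors described above. The mode
# constants and handle_write() are hypothetical illustration names.

CACHE_ALL_WRITES = 1     # Mode #1: every write is written to the cache
INVALIDATE = 2           # Mode #2: no caching; cached entry marked invalid
CACHE_ON_WRITE_HIT = 3   # Mode #3: cache only if the data is already cached

def handle_write(cache: dict, mode: int, key, value):
    if mode == CACHE_ALL_WRITES:
        cache[key] = value
    elif mode == INVALIDATE:
        cache.pop(key, None)          # drop any stale cached copy
    elif mode == CACHE_ON_WRITE_HIT:
        if key in cache:              # update only on a write-hit
            cache[key] = value
    return cache

print(handle_write({"a": 1}, CACHE_ON_WRITE_HIT, "b", 2))  # 'b' not cached
```

In all three cases the write still goes through to the array (as in the write-through path); only the cache's copy is handled differently.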
In many embodiments, the current disclosure may enable creation of a model enabled to predict and/or estimate the performance of a flash caching solution in one or more scenarios. In some embodiments, the model may be created by analyzing performance statistics collected from a flash caching solution. In various embodiments, performance statistics may be collected from a flash caching solution while the flash caching solution may be in an active state. In particular embodiments, an active state may include a state where a flash caching solution may be actively processing customer data.
In various embodiments, performance statistics may be collected from a flash caching solution while the flash caching solution may be processing a simulated scenario. In some embodiments, performance statistics may be collected while a flash caching solution may be in an offline state. In particular embodiments, an offline state may mean a state where a flash caching solution may not be actively processing customer data. In many embodiments, a simulated scenario may include simulated data. In some embodiments, a simulated scenario may include a read and/or write scenario. In various embodiments, a simulated scenario may include simulated data and a read and/or write scenario. In particular embodiments, a simulated scenario may encompass every variation of data processing and/or read and write scenarios that a caching solution may encounter.
In many embodiments, the current disclosure may enable the use of performance statistics to create a model that may be enabled to predict and/or estimate the performance of a flash caching solution in one or more modes. In some embodiments, a model may be used to map a simulated scenario to a best suited mode of a flash caching solution. In various embodiments, a model may enable a caching solution to map a simulated scenario to an active state. In some embodiments, a model may include one or more data patterns that may be mapped to one or more modes of a flash caching solution. In various embodiments, a data pattern may include a collection of data processing, a type of data being processed, and/or an amount of reads and writes being processed. In particular embodiments, a model may enable selection of a mode best suited for an active state. In various embodiments, the current disclosure may enable a flash caching solution to choose an optimized mode based on a model. In certain embodiments, the current disclosure may enable a flash caching solution to dynamically change optimized modes based on a model. In various embodiments, a model may map one or more data patterns to one or more modes of a flash caching solution.
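One minimal way to realize the pattern-to-mode mapping above is a nearest-neighbor lookup: the "model" is a table of representative data patterns (feature vectors observed in simulated scenarios), each labeled with its best-suited mode, and live statistics are matched to the closest pattern. The features, pattern values, and mode names below are all invented for illustration.

```python
# Hedged sketch: map an observed data pattern to a cache mode via
# nearest-neighbor lookup in a small learned table. All values are
# illustrative assumptions, not learned from real data.
import math

# (read_ratio, write_hit_ratio) -> best-suited mode, learned offline
MODEL = {
    (0.9, 0.1): "cache_all_writes",
    (0.2, 0.8): "cache_on_write_hit",
    (0.1, 0.1): "invalidate_on_write",
}

def select_mode(observed):
    """Return the mode of the pattern nearest to the observed vector."""
    return MODEL[min(MODEL, key=lambda pattern: math.dist(pattern, observed))]

print(select_mode((0.85, 0.2)))
```

A production model would more likely be a trained classifier over many such features, but the interface, statistics in and a recommended mode out, is the same.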
In many embodiments, a model may include an algorithm that may combine different variable selection methods in order to estimate a consensus optimization choice across the variables. In other embodiments, an algorithm may combine different variable selection techniques to select predictors for cache optimization. In some embodiments, an algorithm may use a random-forest technique and/or a stochastic gradient boosting technique as a predictor of the importance of each variable. In various embodiments, data to be analyzed by a model may be divided into chunks of data. In particular embodiments, the chunks of data may be analyzed independently. In certain embodiments, one or more techniques may be used to determine a score for each chunk of data. In particular embodiments, each score for each chunk of data may be normalized (i.e., to between 0 and 100). In various embodiments, a final score for each predictor may be calculated as a sum of all scores from the one or more techniques used to determine the importance of each variable. In some embodiments, the determined importance of each variable may be averaged among all data chunks.
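The consensus-scoring steps above, split the data into chunks, score every predictor with each technique, normalize each chunk's scores to 0-100, sum across techniques, and average across chunks, can be sketched directly. The two "technique" functions below are toy stand-ins (a real system might plug in random-forest or stochastic gradient boosting importances instead); everything here is an illustrative assumption.

```python
# Sketch of consensus variable importance: per chunk, each technique
# scores every predictor; scores are normalized to 0..100, summed over
# techniques, and averaged over chunks. Techniques are toy stand-ins.

def normalize(scores):
    """Rescale one technique's raw scores on one chunk to 0..100."""
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0
    return {k: 100.0 * (v - lo) / span for k, v in scores.items()}

def consensus_importance(chunks, techniques):
    totals = {}
    for chunk in chunks:                       # chunks analyzed independently
        for technique in techniques:           # sum over techniques
            for name, score in normalize(technique(chunk)).items():
                totals[name] = totals.get(name, 0.0) + score
    # average the summed importance across all data chunks
    return {name: total / len(chunks) for name, total in totals.items()}

def variance_score(chunk):     # toy technique #1: spread of each predictor
    return {p: max(v) - min(v) for p, v in chunk.items()}

def magnitude_score(chunk):    # toy technique #2: mean of each predictor
    return {p: sum(v) / len(v) for p, v in chunk.items()}

chunks = [
    {"reads": [10, 30, 20], "writes": [1, 2, 1]},
    {"reads": [5, 25, 15], "writes": [2, 2, 3]},
]
print(consensus_importance(chunks, [variance_score, magnitude_score]))
```

With two techniques the per-predictor score ranges from 0 to 200, matching the "sum of all scores" step in the text before the cross-chunk average.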
In many embodiments, a flash caching solution may be queried to provide performance statistics such as:
Total number of pending reads
Total number of pending writes
Total number of IOs
Total number of reads
Total number of read-hits
Total number of writes
Total number of write-hits
Total number of skipped IOs
Total number of unaligned IOs
Total Kbytes transferred for reads
Total read latency in microseconds
Total Kbytes transferred for writes
Total write latency in microseconds.
In various embodiments, the above-mentioned performance statistics may be used as predictive variables in a model to determine an optimized mode for a flash caching solution.
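Raw counters like those listed above are usually more useful to a model as ratios and averages. The sketch below derives a few such predictive variables; the field names paraphrase the statistics list and the derived features are illustrative assumptions, not a prescribed feature set.

```python
# Illustrative sketch: derive predictive variables from the raw
# performance counters listed above. Field names paraphrase the list;
# the chosen features are assumptions for illustration.

def derive_features(stats):
    reads = stats["total_reads"] or 1       # guard against divide-by-zero
    writes = stats["total_writes"] or 1
    return {
        "read_hit_ratio": stats["total_read_hits"] / reads,
        "write_hit_ratio": stats["total_write_hits"] / writes,
        "avg_read_latency_us": stats["total_read_latency_us"] / reads,
        "avg_write_latency_us": stats["total_write_latency_us"] / writes,
        "read_fraction": stats["total_reads"] / (stats["total_ios"] or 1),
    }

sample = {
    "total_reads": 800, "total_read_hits": 600,
    "total_writes": 200, "total_write_hits": 50,
    "total_read_latency_us": 160000, "total_write_latency_us": 90000,
    "total_ios": 1000,
}
print(derive_features(sample))
```

Normalizing counters this way also keeps features comparable across collection intervals of different lengths.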
In many embodiments, flash cache performance statistics may be collected in each available mode of a flash caching solution. In some embodiments, flash cache performance statistics collected in one mode may be applied to a second mode. In various embodiments, an optimization of a flash caching solution having one or more modes may be created from performance statistics collected in a single mode.
In many embodiments, flash cache performance statistics may be collected in one cache mode. In some embodiments, flash cache performance statistics may be collected in a cache all writes mode which may enable collection of all available statistical information for application data flow. In various embodiments, the current disclosure may enable collection of performance statistics in an offline mode. In particular embodiments, an offline mode may include running test data through a flash caching solution while recording performance statistics. In many embodiments, a flash cache may be placed in a learning phase where simulated data and/or read/write scenarios may be passed to the flash cache while performance statistics are being recorded. In various embodiments, performance statistics may be recorded in each mode configuration of a flash caching solution.
In many embodiments, collected performance statistics may be used to create a model that may be enabled to predict and/or estimate the cache performance of a flash cache solution in one or more modes. In some embodiments, the model may be created based on performance statistics collected in a cache all writes mode. In various embodiments, a model may enable a flash caching solution to dynamically change modes depending on the working environment.
In many embodiments, the current disclosure may enable a caching solution to adapt to a current state. In various embodiments, a current state may include a current data processing load. In some embodiments, a flash caching solution may collect performance statistics while in an active state. In various embodiments, performance statistics collected in an active state may be analyzed using a model. In particular embodiments, performance statistics collected in an active state may be compared to a model.
In many embodiments, the current disclosure may enable a caching solution to adapt to an active state, where the flash caching solution is currently processing customer data. In some embodiments, a flash caching solution may collect performance statistics in an active state. In various embodiments, performance statistics collected in an active state may be used to determine an optimized mode. In certain embodiments, performance statistics collected in an active state may be compared with a model to determine an optimized mode. In other embodiments, performance statistics collected in an active state may be analyzed using a model. In some embodiments, model analysis of performance statistics collected in an active state may include mapping data patterns within the model to the performance statistics. In various embodiments, the current disclosure may enable the flash caching solution to dynamically change modes based on the model analysis of performance statistics of an active state. In many embodiments, performance statistics of an active state may be collected during finite time intervals. In some embodiments, a time interval may be measured in seconds, minutes, dozens of minutes, or other time increments.
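The adaptive loop described above, collect statistics over a finite interval, consult the model, and switch modes only when the recommendation changes, can be sketched as follows. Every interface here (the `adapt()` loop, the toy rule-based model, the interval statistics) is a hypothetical illustration of the idea, not a disclosed implementation.

```python
# Sketch of dynamic mode adaptation: per finite time interval, apply the
# model to the interval's statistics and switch modes if it recommends a
# different one. All interfaces are hypothetical.

def adapt(current_mode, intervals, model):
    """model: maps one interval's statistics to a recommended mode."""
    history = []
    for stats in intervals:              # one entry per time interval
        recommended = model(stats)
        if recommended != current_mode:
            current_mode = recommended   # dynamic mode change
        history.append(current_mode)
    return history

def toy_model(stats):
    # Toy rule standing in for the learned model: write-heavy intervals
    # favor caching all writes; read-heavy ones favor write-hit caching.
    return ("cache_all_writes" if stats["write_ratio"] > 0.5
            else "cache_on_write_hit")

intervals = [{"write_ratio": 0.7}, {"write_ratio": 0.6}, {"write_ratio": 0.2}]
print(adapt("cache_on_write_hit", intervals, toy_model))
```

A real deployment might add hysteresis so the cache does not thrash between modes on borderline intervals; the interval length (seconds to dozens of minutes, per the text) serves a similar damping role.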
In many embodiments, the current disclosure may enable a computer-executable method for managing a flash cache, having modes of cache management, the computer-executable method including optimizing the flash cache by using a model to determine a first mode of the modes of the flash cache, wherein the model is used to analyze an activity of the flash cache.
Refer now to the example embodiment of
Refer now to the example embodiment of
Refer now to the example embodiment of
Refer now to the example embodiment of
Refer now to the example embodiment of
Refer now to the example embodiments of
Refer now to the example embodiments of
Refer now to the example embodiments of
Refer now to the example embodiment of
The methods and apparatus of this invention may take the form, at least partially, of program code (i.e., instructions) embodied in tangible non-transitory media, such as floppy diskettes, CD-ROMs, hard drives, random access or read-only memory, or any other machine-readable storage medium.
The logic for carrying out the method may be embodied as part of the aforementioned system, which is useful for carrying out a method described with reference to embodiments shown in, for example,
Although the foregoing invention has been described in some detail for purposes of clarity of understanding, it will be apparent that certain changes and modifications may be practiced within the scope of the appended claims. Accordingly, the present implementations are to be considered as illustrative and not restrictive, and the invention is not to be limited to the details given herein, but may be modified within the scope and equivalents of the appended claims.
Derbeko, Philip, Sherman, Yulia, Moldavsky, Vadim