An embodiment of the invention introduces a method for accessing big data, which contains at least the following steps. A data access request is received from one of a plurality of database frontends of different kinds. A data access operation is performed for the data access request by using an API (Application Programming Interface) to manipulate one of a plurality of cloud file systems of different kinds.
1. A method for accessing big data, comprising:
dispatching a db (database) API (Application Programming Interface) request from a client to one of a plurality of proxy modules, wherein the proxy module receiving the db API request connects to a database frontend, wherein the database frontend is one of a plurality of database frontends of different kinds, wherein the database frontend interprets the db API request from the client and issues a data access request;
receiving the data access request from the database frontend; and
performing a data access operation for the data access request by using an API to manipulate one of a plurality of cloud file systems of different kinds, thereby enabling any of the database frontends of different kinds to use unified access methods to request information of the cloud file systems of different kinds.
13. A system for accessing big data, comprising:
a virtual server dispatching a db (database) API (Application Programming Interface) request from a client to one of a plurality of proxy modules, wherein the proxy module receiving the db API request connects to a database frontend, wherein the database frontend is one of a plurality of database frontends of different kinds, wherein the database frontend interprets the db API request from the client and issues a data access request; and
an SVH (Secure Verification Hashing) module, coupled between the database frontend and a plurality of cloud file systems of different kinds, receiving the data access request from the database frontend, and performing a data access operation for the data access request by using an API to manipulate one of the cloud file systems, thereby enabling any of the database frontends of different kinds to use unified access methods to request information of the cloud file systems of different kinds.
2. The method of
3. The method of
4. The method of
5. The method of
generating an ECC (Error Check and Correction) code according to data to be written, wherein the data to be written is stored in a data fragment of a file and the ECC code is stored in a parity fragment of the file; and
writing values of the data fragment and the parity fragment to the cloud file system by using the API.
6. The method of
computing a hash value for each of the data fragment and the parity fragment; and
storing the hash values to the data fragment and the parity fragment.
7. The method of
storing a meta tag to the data fragment for recognizing a data type of the data fragment.
8. The method of
reading data from the cloud file system using the API, wherein the data comprises values of a data fragment and a parity fragment;
determining whether the read data is correct;
replying with the read data when the read data is correct;
performing an error correction algorithm to attempt to correct error bits of the read data and determining whether the error bits have been corrected successfully when the read data is not correct;
replying with the corrected data when the error bits have been corrected successfully; and
replying with a data read failure message when the error bits have not been corrected successfully.
9. The method of
writing the corrected data back to the cloud file system by using the API when the error bits have been corrected successfully.
10. The method of
sending the meta tag of the corrected data fragment to a SIA (Security Intelligence Analytics) module when the error bits have been corrected successfully, wherein the sent meta tag is used to recognize a data type of the corrected data fragment.
11. The method of
receiving a specified number of the same meta tag;
publishing a fake document as a honeypot by the SIA module.
12. The method of
14. The system of
15. The system of
an erasure coding module, generating an ECC (Error Check and Correction) code according to data to be written, wherein the data to be written is stored in a data fragment of a file and the ECC code is stored in a parity fragment of the file; and writing values of the data fragment and the parity fragment to the cloud file system by using the API.
16. The system of
17. The system of
18. The system of
19. The system of
20. The system of
21. The system of
22. The system of
23. The system of
This Application claims priority of Taiwan Patent Application No. 103126348, filed on Aug. 1, 2014, the entirety of which is incorporated by reference herein.
Technical Field
The present invention relates to data access, and in particular, to methods for accessing big data and systems using the same.
Description of the Related Art
More and more enterprises are establishing cloud computing environments with big data storage. However, big data is so large and complex that it is difficult to process with on-hand database management tools or traditional data processing applications. Thus, it is desirable to have methods for accessing big data, and systems using the same, that improve scalability.
An embodiment of the invention introduces a method for accessing big data, which contains at least the following steps. A data access request is received from one of a plurality of database frontends of different kinds. A data access operation is performed for the data access request by using an API (Application Programming Interface) to manipulate one of a plurality of cloud file systems of different kinds.
An embodiment of the invention introduces a system for accessing big data, which contains at least an SVH (Secure Verification Hashing) module. The SVH module is coupled between database frontends of different kinds and cloud file systems of different kinds. The SVH module receives a data access request from one of the database frontends and performs a data access operation for the data access request by using an API to manipulate one of the cloud file systems.
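The coupling described above can be illustrated with a minimal sketch. It is not the implementation of the embodiment: the names `CloudFileSystem`, `InMemoryBackend`, `DataAccessRequest` and `SVHModule.handle`, the backend labels, and the read/write-only API are all assumptions made for illustration. The point is only that requests from frontends of different kinds pass through one entry point, which selects the API of the targeted cloud file system.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Protocol


class CloudFileSystem(Protocol):
    """Hypothetical minimal API that every backend adapter exposes."""

    def read(self, path: str) -> bytes: ...

    def write(self, path: str, data: bytes) -> None: ...


class InMemoryBackend:
    """Stand-in for one kind of cloud file system (illustration only)."""

    def __init__(self, kind: str) -> None:
        self.kind = kind
        self._files: Dict[str, bytes] = {}

    def read(self, path: str) -> bytes:
        return self._files[path]

    def write(self, path: str, data: bytes) -> None:
        self._files[path] = data


@dataclass
class DataAccessRequest:
    frontend: str                  # database frontend that issued the request
    backend: str                   # cloud file system to manipulate
    op: str                        # "read" or "write"
    path: str
    payload: Optional[bytes] = None


class SVHModule:
    """Single entry point between the frontends and the file systems."""

    def __init__(self, backends: Dict[str, CloudFileSystem]) -> None:
        self._backends = backends

    def handle(self, req: DataAccessRequest) -> Optional[bytes]:
        fs = self._backends[req.backend]   # pick the API of the target system
        if req.op == "write":
            if req.payload is None:
                raise ValueError("write request without payload")
            fs.write(req.path, req.payload)
            return None
        if req.op == "read":
            return fs.read(req.path)
        raise ValueError(f"unsupported operation: {req.op}")


svh = SVHModule({
    "document": InMemoryBackend("document"),
    "batch": InMemoryBackend("batch"),
})
# Two frontends of different kinds use the same access method.
svh.handle(DataAccessRequest("sql-frontend", "batch", "write", "/t/1", b"row"))
result = svh.handle(DataAccessRequest("kv-frontend", "batch", "read", "/t/1"))
```

In this sketch a frontend never calls a backend directly, so adding a further kind of file system only requires registering another adapter with the same two methods.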
A detailed description is given in the following embodiments with reference to the accompanying drawings.
The present invention can be fully understood by reading the subsequent detailed description and examples with references made to the accompanying drawings, wherein:
The following description is of the best-contemplated mode of carrying out the invention. This description is made for the purpose of illustrating the general principles of the invention and should not be taken in a limiting sense. The scope of the invention is best determined by reference to the appended claims.
The present invention will be described with respect to particular embodiments and with reference to certain drawings, but the invention is not limited thereto and is only limited by the claims. It will be further understood that the terms “comprises,” “comprising,” “includes” and/or “including,” when used herein, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
Use of ordinal terms such as “first”, “second”, “third”, etc., in the claims to modify a claim element does not by itself connote any priority, precedence, or order of one claim element over another or the temporal order in which acts of a method are performed, but are used merely as labels to distinguish one claim element having a certain name from another element having the same name (but for use of the ordinal term) to distinguish the claim elements.
The reception module 130, the virtual server 140, the proxy module 150_1 and the database frontend 160_1 may be integrated in a single computer apparatus or distributed over several computer apparatuses. Similarly, a pairing of the proxy module 150_2 with the database frontend 160_2 may be integrated in a single computer apparatus or distributed over several computer apparatuses. A pairing of the proxy module 150_i with the database frontend 160_i may be integrated in a single computer apparatus or distributed over several computer apparatuses. Any of the big data storage nodes 170_1, 170_2 to 170_i may contain many computer apparatuses to complete the big data retrieval and computation.
The SVH module 650 reads data from a corresponding file system by using an API provided by the document-oriented database 331, the distributed batch database 351 or the distributed real-time database 371 according to a data read request sent by the ULDSFS 611, 613 or 615. The SVH module 650 verifies the integrity of the data read from the document-oriented database 331, the distributed batch database 351 or the distributed real-time database 371, and attempts to correct the data when it contains error bits. When the read data has no error or the error bits of the read data have been corrected, the SVH module 650 transmits the error-free or corrected data to the erasure coding module 630, and the erasure coding module 630 then replies with the error-free or corrected data to the ULDSFS 611, 613 or 615.
In addition, when error bits have been corrected, the SVH module 650 writes the corrected data back to the corresponding file system by using an API provided by the document-oriented database 331, the distributed batch database 351 or the distributed real-time database 371, and sends an event containing one or more meta tags of the corrected data fragments to a SIA (Security Intelligence Analytics) module. After receiving a specified number of events regarding the same meta tag, the SIA module may publish fake documents as honeypots to attract and thus pinpoint malware. Alternatively, the SIA module may use machine learning to predict which categories of data are hotspots for hackers and take a proper action, such as rectifying the security breaches.
When the read data contains too many error bits to be corrected, the SVH module 650 notifies the erasure coding module 630 that the data has failed to be read, so that the erasure coding module 630 replies to the ULDSFS 611, 613 or 615 with the failure of the data read request. The SVH module 650 may use a well-known error correction algorithm to attempt to correct a tolerable number of error bits that occur in the data and parity fragments.
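The read path described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the algorithm of the embodiment: the description leaves the error correction algorithm open, so a single XOR parity fragment is substituted here, which can rebuild at most one damaged fragment once a SHA-256 mismatch has located it. The names `Fragment`, `encode`, `read_file` and `SIAModule`, the in-memory `store`, and the threshold value are all hypothetical.

```python
import hashlib
from collections import Counter
from dataclasses import dataclass
from functools import reduce
from typing import Dict, List, Optional


@dataclass
class Fragment:
    payload: bytes
    digest: str     # hash value stored with the fragment
    meta_tag: str   # label used to recognize the data type


def make_fragment(payload: bytes, meta_tag: str) -> Fragment:
    return Fragment(payload, hashlib.sha256(payload).hexdigest(), meta_tag)


def xor_bytes(blocks: List[bytes]) -> bytes:
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def encode(data_blocks: List[bytes], meta_tag: str) -> List[Fragment]:
    """Return the data fragments followed by one XOR parity fragment."""
    frags = [make_fragment(b, meta_tag) for b in data_blocks]
    frags.append(make_fragment(xor_bytes(data_blocks), meta_tag))
    return frags


class SIAModule:
    """Counts correction events per meta tag (assumed behaviour)."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        self.events: Counter = Counter()
        self.honeypots: List[str] = []

    def report(self, meta_tag: str) -> None:
        self.events[meta_tag] += 1
        if self.events[meta_tag] == self.threshold:
            # Placeholder for publishing a fake document as a honeypot.
            self.honeypots.append(meta_tag)


def read_file(store: Dict[str, List[Fragment]], name: str,
              sia: SIAModule) -> Optional[bytes]:
    """Verify, correct if possible, write back, and reply; None on failure."""
    frags = store[name]
    bad = [i for i, f in enumerate(frags)
           if hashlib.sha256(f.payload).hexdigest() != f.digest]
    if len(bad) > 1:
        return None                      # too many errors: read failure
    if bad:
        i = bad[0]
        others = [f.payload for j, f in enumerate(frags) if j != i]
        fixed = make_fragment(xor_bytes(others), frags[i].meta_tag)
        frags[i] = fixed                 # write the corrected fragment back
        sia.report(fixed.meta_tag)       # event carrying the meta tag
    return b"".join(f.payload for f in frags[:-1])


store = {"report": encode([b"abcd", b"efgh"], "payroll")}
sia = SIAModule(threshold=2)
store["report"][0].payload = b"XXXX"     # simulate tampering
recovered = read_file(store, "report", sia)
```

Here the stored hash only detects which fragment is damaged, and the parity fragment supplies the redundancy needed to rebuild it. A production erasure code would normally tolerate more than one lost fragment.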
Although the embodiment has been described as having specific elements in
While the invention has been described by way of example and in terms of the preferred embodiments, it is to be understood that the invention is not limited to the disclosed embodiments. On the contrary, it is intended to cover various modifications and similar arrangements (as would be apparent to those skilled in the art). Therefore, the scope of the appended claims should be accorded the broadest interpretation so as to encompass all such modifications and similar arrangements.
Patent | Priority | Assignee | Title |
8839417, | Nov 17 2003 | JPMORGAN CHASE BANK, N A , AS ADMINISTRATIVE AGENT | Device, system and method for defending a computer network |
9251194, | Jul 26 2012 | Microsoft Technology Licensing, LLC | Automatic data request recovery after session failure |
20030219008, | |||
20110282982, | |||
20120079380, | |||
20120221683, | |||
20120303814, | |||
20130205183, | |||
20140040197, | |||
20150121526, | |||
20150149870, | |||
20150254150, | |||
20160191631, | |||
20160217159, | |||
TW200539640, | |||
TW201428624, |
Executed on | Assignor | Assignee | Conveyance | Frame | Reel | Doc |
Oct 22 2014 | CHEN, CHIH-MING | Wistron Corp | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 034075 | 0832 | |
Oct 29 2014 | Wistron Corp. | (assignment on the face of the patent) |
Date | Maintenance Fee Events |
Jul 09 2020 | M1551: Payment of Maintenance Fee, 4th Year, Large Entity. |
Jul 11 2024 | M1552: Payment of Maintenance Fee, 8th Year, Large Entity. |
Date | Maintenance Schedule |
Mar 21 2020 | 4 years fee payment window open |
Sep 21 2020 | 6 months grace period start (w surcharge) |
Mar 21 2021 | patent expiry (for year 4) |
Mar 21 2023 | 2 years to revive unintentionally abandoned end. (for year 4) |
Mar 21 2024 | 8 years fee payment window open |
Sep 21 2024 | 6 months grace period start (w surcharge) |
Mar 21 2025 | patent expiry (for year 8) |
Mar 21 2027 | 2 years to revive unintentionally abandoned end. (for year 8) |
Mar 21 2028 | 12 years fee payment window open |
Sep 21 2028 | 6 months grace period start (w surcharge) |
Mar 21 2029 | patent expiry (for year 12) |
Mar 21 2031 | 2 years to revive unintentionally abandoned end. (for year 12) |