Systems and methods are provided that enable probabilistic application of data traffic scanning in an effort to catch malicious software or code being carried by the data traffic. The methodology and systems operate by monitoring data traffic in an data network via an interface with the data network, calculating a first conditional probability that content in first given data traffic being monitored is malicious, calculating a second conditional probability that content in second given data traffic being monitored is malicious, ranking the first and second conditional probabilities resulting in ranked conditional probabilities, and performing at least one of anti-virus (AV) or anti-malware (AM) scanning of the content of the first or second given data traffic depending on whose conditional probability is ranked higher in the ranked conditional probabilities.
|
1. A method comprising:
monitoring data traffic in a data network via an electronic interface with the data network, wherein monitoring comprises monitoring data packets traversing the data network at an appliance that is logically disposed between an end-point device and the Internet;
calculating a first conditional probability that content in first given data traffic being monitored is malicious;
calculating a second conditional probability that content in second given data traffic being monitored is malicious;
ranking the first and second conditional probabilities with respect to each other resulting in ranked conditional probabilities; and
performing at least one of anti-virus or anti-malware scanning of the content of the first or second given data traffic depending on whose conditional probability is ranked higher in the ranked conditional probabilities to the exclusion of the other of the first or second given data traffic;
selecting a first one of a plurality of scanners for performing the at least one of anti-virus or anti-malware scanning, wherein selecting is based on an efficacy value associated with respective ones of the plurality of scanners for a given type of content that is being carried by the first or second given data traffic whose conditional probability is ranked higher in the ranked conditional probabilities;
selecting a second one of the plurality of scanners for a performing a subsequent operation of performing at least one of anti-virus or anti-malware scanning after scanning by the first one of the plurality of scanners, wherein selecting of the second one of the plurality of scanners is based on a conditional probability that the second one of the plurality of scanners can catch malicious content not caught by the first one of the plurality of scanners,
wherein a same content of the first or second given data traffic is processed by the first one of a plurality of scanners and the second one of a plurality of scanners, and
wherein selecting the first one of a plurality of scanners is based on a data type of the content of the first or second given data traffic.
10. One or more non-transitory computer readable storage media encoded with software comprising computer executable instructions and when the software is executed operable to:
monitor data traffic in a data network via an interface;
calculate a first conditional probability that content in first given data traffic being monitored is malicious;
calculate a second conditional probability that content in second given data traffic being monitored is malicious;
rank the first and second conditional probabilities with respect to one another resulting in ranked conditional probabilities; and
cause at least one of anti-virus or anti-malware scanning of the content of the first or second given data traffic to be performed depending on whose conditional probability is ranked higher in the ranked conditional probabilities to the exclusion of the other of the first or second given data traffic,
wherein the instructions are further operable to select a first one of a plurality of scanners for performing the at least one of anti-virus or anti-malware scanning based on an efficacy value associated with respective ones of the plurality of scanners for a given type of content that is being carried by the first or second given electronic data traffic whose conditional probability is ranked higher in the ranked conditional probabilities,
wherein the instructions are further operable to select a second one of the plurality of scanners for performing a subsequent operation of performing at least one of anti-virus or anti-malware scanning after scanning by the first one of the plurality of scanners, wherein the processor is configured to so select the second one of the plurality of scanners based on a conditional probability that the second one of the plurality of scanners can catch malicious content not caught by the first one of the plurality of scanners,
wherein a same content of the first or second given data traffic is processed by the first one of a plurality of scanners and the second one of a plurality of scanners, and
wherein the first one of a plurality of scanners is selected based on a data type of the content of the first or second given data traffic.
6. An apparatus comprising:
a network interface unit configured to communicate over a data network;
a memory; and
a processor configured to:
monitor electronic data traffic in the data network via the interface, by monitoring data packets traversing the data network when the appliance is logically disposed between an end-point device and the Internet;
calculate a first conditional probability that content in first given data traffic being monitored is malicious;
calculate a second conditional probability that content in second given data traffic being monitored is malicious;
rank the first and second conditional probabilities with respect to one another resulting in ranked conditional probabilities; and
cause at least one of anti-virus or anti-malware scanning of the content of the first or second given data traffic to be performed depending on whose conditional probability is ranked higher in the ranked conditional probabilities to the exclusion of the other of the first or second given data traffic,
wherein the processor is further configured to select a first one of a plurality of scanners for performing the at least one of anti-virus or anti-malware scanning based on an efficacy value associated with respective ones of the plurality of scanners for a given type of content that is being carried by the first or second given electronic data traffic whose conditional probability is ranked higher in the ranked conditional probabilities,
wherein the processor is further configured to select a second one of the plurality of scanners for performing a subsequent operation of performing at least one of anti-virus or anti-malware scanning after scanning by the first one of the plurality of scanners, wherein the processor is configured to so select the second one of the plurality of scanners based on a conditional probability that the second one of the plurality of scanners can catch rate malicious content not caught by the first one of the plurality of scanners,
wherein a same content of the first or second given data traffic is processed by the first one of a plurality of scanners and the second one of a plurality of scanners, and
wherein the first one of a plurality of scanners is selected based on a data type of the content of the first or second given data traffic.
2. The method of
3. The method of
performing at least one of anti-virus or anti-malware scanning of content of third given data traffic;
determining, as a result of the anti-virus or anti-malware scanning, whether the content of the third given data traffic is malicious; and
when the content of the third given data traffic is determined to be malicious, capturing characteristics of the third given data traffic, wherein the characteristics are employed to calculate conditional probabilities that content of still other given data traffic is malicious.
4. The method of
5. The method of
7. The apparatus of
cause at least one of anti-virus or anti-malware scanning of content of third given data traffic to be performed;
determine, as a result of the anti-virus or anti-malware scanning, whether the content of the third given data traffic is malicious; and
when the content of the third given electronic data traffic is determined to be malicious, capture characteristics of the third given data traffic, wherein the characteristics are employed to calculate conditional probabilities that content of still other given data traffic is malicious.
8. The apparatus of
9. The apparatus of
11. The computer readable storage media of
cause at least one of anti-virus or anti-malware scanning of content of third given data traffic to be performed;
determine, as a result of the anti-virus or anti-malware scanning, whether the content of the third given data traffic is malicious; and
when the content of the third given electronic data traffic is determined to be malicious, capture characteristics of the third given data traffic, wherein the characteristics are employed to calculate conditional probabilities that content of still other given data traffic is malicious.
12. The computer readable storage media of
13. The computer readable storage media of
|
The present disclosure relates to electronic data network security.
Computer systems, networks and data centers are exposed to a constant and differing variety of attacks that expose vulnerabilities of such systems in order to compromise their security and/or operation. As an example, various forms of malicious software program attacks include viruses, worms, Trojan horses and the like that computer systems can obtain over a network such as the Internet. Quite often, users of such computer systems are not even aware that such malicious programs have been obtained within the computer system. Once resident within a given computer, a malicious program that executes might disrupt operation of that computer to a point of inoperability and/or might spread itself to other computers within a network or data center by exploiting vulnerabilities of the computer's operating system or resident application programs. Other malicious programs, such as “Spyware,” might operate within a computer to secretly extract and transmit information within the computer to remote computer systems for various suspect purposes.
To combat the proliferation of malicious software program attacks, Anti-Malware (AM) and Anti-Virus (AV) scanners have been widely deployed in different security appliances and Intrusion Prevention Systems (IPSs) to detect known malicious software, and to thereafter block or filter such threats. Compared with, e.g., universal resource locator (URL) and reputation-based malware filtering, which rely broadly on the source of content by monitoring, e.g., an IP address or a domain name, AM/AV scanning may enjoy a lower false positive rate by employing, e.g., well-defined signatures on the actual content that is responsible for malicious behavior. On the other hand, AM/AV scanning can be expensive in terms of central processing unit (CPU) and memory usage which, in many instances, especially as electronic data network traffic grows at increasing rates, makes it impractical to apply AM/AV scanning to all of the content being carried by network traffic, or even destined for a single given computer within that network.
Overview
Systems and methods are described that enable probabilistic application of data traffic scanning in an effort to catch malicious software or code being carried by the data traffic. The methodology and systems operate by monitoring data traffic in a data network via an interface with the electronic data network, calculating a first conditional probability that content in first given data traffic being monitored is malicious, calculating a second conditional probability that content in second given data traffic being monitored is malicious, ranking the first and second conditional probabilities resulting in ranked conditional probabilities, and performing at least one of anti-virus (AV) or anti-malware (AM) scanning of the content of the first or second given data traffic depending on whose conditional probability is ranked higher in the ranked conditional probabilities. In one embodiment, the functionality for computing or determining the conditional probabilities is embodied in computer executable instructions or code that is hosted by a network device or appliance, which can be deployed in one of several different locations in a network topology.
Embodiments described herein include a computer or other processing system configured to execute a probabilistic application of Anti-Malware/Anti-Virus (AM/AV) scanning. The number of security threats introduced by electronic network (e.g., world wide web) traffic is reaching epidemic proportions. Traditional gateway defenses are proving inadequate against a variety of malware, leaving corporate electronic data networks exposed. The speed, variety and damage potential of malware attacks highlight the utility of a robust, secure platform to protect the enterprise network perimeter or, at the very least, individual end point or end user devices within that perimeter.
Network 101 is also connected to an edge router 110 that couples network 101 to a wide area network (WAN) 115 such as the Internet and that allows communication between the end point devices 105 and other computers, servers and other devices worldwide. Such other computers, servers, etc. (not shown) may be disposed, for example, inside a data center 120 that may be a physical data center or a virtual data center that functions, from the perspective of an end point device 105, as a dedicated physical data center.
Also shown in
In addition, yet another device shown in computer architecture 100 is an administration device 140, which hosts AM/AV scanning administration logic 250. Administration device 140 may be a stand alone computer or device as shown or may be incorporated into any one of the network connected devices shown in
In the same way, although AM/AV scanners 125 shown in
The functionality of AM/AV scanning application logic 200 will now be described. At a high level, a main function of AM/AV scanning application logic 200 is to identify, based on probabilistic methodologies, which electronic data traffic is to be scanned by a given one of the AM/AV scanners 125, and where appropriate, to determine, in a probabilistic manner, which additional AM/AV scanner or scanners 125 may also be applied to the electronic data traffic after a given first one of the AM/AV scanners 125 has completed its scan operations.
A typical AM/AV scanner can accommodate approximately 100 scanning requests per second, i.e., a request to scan content carried by given electronic data traffic. In today's network environment, however, in order to effectively scan substantially all electronic data traffic, a scanner would have to support potentially thousands of requests per second. To address this problem, some scanning implementations scan only higher risk but lower volume file types, such as executable (.exe.) and zipped (.zip) file types. However, the trend is that malware and viruses are increasingly embedded in many other file types including hyper text markup language (HTML) files, JavaScript files, portable document format (PDF) documents and other image format files
To address the forgoing, AM/AV scanning application logic 200 provides a dynamic electronic data traffic scanning approach that determines which high-risk traffic to scan based on a probabilistic prioritization, selects which AM/AV scanner(s) 125 to user if there are two or more AM/AV scanners available for use, and monitors overall system load to dynamically scan more content when a given one or more of AM/AV scanners 125 may be idle. AM/AV scanner application logic 200 may also periodically cause AM/AV scanners 125 to scan random electronic data traffic, determine whether that traffic includes malicious software programs, and then based on results of the determination, provide a feedback mechanism by which future probabilistic prioritization techniques can be tuned or optimized.
More specifically, probabilistic prioritization, in accordance with principles of embodiments described herein, is based on conditional probability. In the case of identifying malicious software in electronic data traffic, the conditional probability of an event B (i.e., existence of malicious software programs) is the probability that the event will occur given the knowledge that an event A (i.e., selected characteristics of the electronic data traffic are present) has already occurred. This probability is written as P(B|A) and referred to as “the probability of B given A.” In the instant context, this would be referred to as the probability of finding malicious code or software in given electronic data traffic given the attributes or characteristics of that electronic data traffic.
If events A and B are not independent, then the probability of the intersection of A and B (the probability that both events occur) is defined by P(A and B)=P(A)P(B|A). From this definition, the conditional probability P(B|A) is obtained by dividing by P(A):
Event “A” in the context of characteristics of the electronic data traffic might include, for example, the URL from which the electronic data traffic is being received, and/or a corresponding reputation thereof, a category of the content being received (e.g., gambling pages, or cooking recipes), or type of file be carried by the electronic data traffic (e.g., .exe, .pdf, etc.), a user-agent type (e.g., an application used to request the content), geographical of either or both a server from which the electronic data traffic is being received or a user or end point device that is to receive the electronic data traffic, among many other possibilities.
In one implementation, AM/AV scanning application logic 200 has a set of a priori knowledge about the likelihood (probability) of given electronic data traffic carrying malicious software given certain characteristics of that given electronic data traffic. As the traffic passes through host 130, AM/AV scanning application logic 200 operates on or processes the traffic and generates a conditional probability that the traffic is carrying malicious software. It is noted that there may be multiple characteristics that AM/AV scanning application logic 200 considers in determining the conditional probability. As such, a final conditional probability that given electronic data traffic carries malicious software may be a combination of multiple conditional probabilities.
That is, rather than setting hard thresholds or limiting scanning of content carried by electronic data traffic to certain types of files, embodiments described herein rank the electronic data traffic based on a plurality of characteristics and resulting conditional probabilities. Scanning can then take place on the electronic data traffic (i.e., the content therein) with the highest rankings. In this way, scanner 125 can be put to use in the a more optimized way by passing content through the scanners that is determined to more likely carry malicious software.
Thus, in accordance with an embodiment, AM/AV scanning application logic 200 first selects an available and appropriate scanner 125 to employ for a first pass of scanning. Then, if another scanner might be available and it is determined that such a scanner might catch malicious software that the first scanner did not detect, then AM/AV scanning application logic 200 may further cause the electronic data traffic being processed to be processed by a second or even third, etc. scanner.
Still with reference to
With the additional scanner(s) so selected, the same electronic data traffic can be passed through the additional scanner(s) such that an overall catch rate of malicious code can be improved.
To keep the system and methodology relevant and updated over time, a periodic feedback mechanism is employed.
Referring to
At 712, it is determined whether malicious software has been found via the scanning procedure. If not, operation 710 is repeated. If malicious code is found, then at 714, relevant characteristics of the electronic data traffic are captured. These characteristics might include, e.g., a URL, the category of the content being carried or the type of file be carried, among many other possible characteristics. At 716 this information (namely that malicious software was (or was not) found in electronic data traffic having the captured characteristics) is broadcast or otherwise made available to other instances of AM/AV scanning application logic 200 that may be operating within other network architectures. In this way, different instances of AM/AV scanning application logic 200 can share information with one another, enabling individual instances to be as up to date as may be practicable so that the conditional probabilities that are calculated or as accurate as possible.
Reference has been made to portions or segments of electronic data traffic. In the context of the embodiments described herein, those segments or portions can be based on information gleaned from layers 3-7 of Open Systems Interconnection (OSI) model wherein the headers and payloads of various packets or frames of the electronic data traffic at these several layers are inspected for maliciousness and for the characteristics that are used to compute the conditional probabilities.
Thus, as explained, electronic data traffic transiting a security appliance or host is dynamically selected to be scanned based on a function of the probability that the traffic contains malicious software or code. In one embodiment, scanners are employed at their maximum throughput such that the greatest amount of electronic data traffic can be scanned without substantially impacting an expected quality of service by an end user. Scanners can also be used in sequence in an attempt to capture as much malicious code as possible. Periodically, a random sample of electronic data traffic is scanned by one or more scanners and information gleaned from such scans is distributed to network appliances that operate in accordance with the principles described herein.
Reference is now made to
Operation 810 includes monitoring electronic data traffic in an electronic data network via an electronic interface with the electronic data network. Operation 812 includes calculating a first conditional probability that content in first given electronic data traffic being monitored is malicious. Operation 814 includes calculating a second conditional probability that content in second given electronic data traffic being monitored is malicious.
Operation 816 includes ranking the first and second conditional probabilities resulting in ranked conditional probabilities, and operation 818 includes performing at least one of anti-virus (AV) or anti-malware (AM) scanning of the content of the first or second given electronic data traffic depending on whose conditional probability is ranked higher in the ranked conditional probabilities.
Operation 820 includes performing at least one of AV or AM scanning of content of third given electronic data traffic, determining, as a result of the AV or AM scanning, whether the content of the third given electronic data traffic is malicious, and when the content of the third given electronic data traffic is determined to be malicious, capturing characteristics of the third given electronic data traffic, wherein the characteristics are employed to calculate conditional probabilities that content of still other given electronic data traffic is malicious
Operations may further include selecting a first one of a plurality of scanners for performing the at least one of AV or AM scanning, wherein selecting is based on an efficacy value associated with respective ones of the plurality of scanners for a given type of content that is being carried by the first or second given electronic data traffic whose conditional probability is ranked higher in the ranked conditional probabilities.
Operations may still further comprise selecting a second one of the plurality of scanners for a performing a subsequent operation of performing at least one of AV or AM scanning after scanning by the first one of the plurality of scanners, wherein selecting of the second one of the plurality of scanners is based on a catch rate of malicious content not caught by the first one of the plurality of scanners.
In one embodiment, the functionality for calculating the conditional probabilities and controlling which scanners are to be employed for scanning may be deployed at a network security appliance that is logically disposed between an end-point device and a firewall that is in communication with the Internet. This functionality may also be deployed with a cloud computing environment.
Although the apparatus, system and method are illustrated and described herein as embodied in one or more specific examples, it is nevertheless not intended to be limited to the details shown, since various modifications and structural changes may be made therein without departing from the scope of the apparatus, system, and method and within the scope and range of equivalents of the claims. A Data Center can represent any location supporting capabilities enabling service delivery that are advertised. A Provider Edge Routing Node represents any system configured to receive, store or distribute advertised information as well as any system configured to route based on the same information. Accordingly, it is appropriate that the appended claims be construed broadly and in a manner consistent with the scope of the apparatus, system, and method, as set forth in the following.
Wang, Jisheng, Jones, Lee, Quinlan, Daniel
Patent | Priority | Assignee | Title |
10027689, | Sep 29 2014 | FireEye Security Holdings US LLC | Interactive infection visualization for improved exploit detection and signature generation for malware and malware families |
10594542, | Oct 27 2017 | Cisco Technology, Inc | System and method for network root cause analysis |
10681080, | Jun 30 2015 | NTT RESEARCH, INC | System and method for assessing android applications malware risk |
10868818, | Sep 29 2014 | FireEye Security Holdings US LLC | Systems and methods for generation of signature generation using interactive infection visualizations |
10887324, | Sep 19 2016 | NTT RESEARCH, INC | Threat scoring system and method |
10904071, | Oct 27 2017 | Cisco Technology, Inc. | System and method for network root cause analysis |
11757857, | Jan 23 2017 | NTT RESEARCH, INC | Digital credential issuing system and method |
9773112, | Sep 29 2014 | FireEye Security Holdings US LLC | Exploit detection of malware and malware families |
Patent | Priority | Assignee | Title |
6851058, | Jul 26 2000 | JPMORGAN CHASE BANK, N A ; MORGAN STANLEY SENIOR FUNDING, INC | Priority-based virus scanning with priorities based at least in part on heuristic prediction of scanning risk |
7882560, | Dec 16 2005 | Cisco Technology, Inc. | Methods and apparatus providing computer and network security utilizing probabilistic policy reposturing |
8370943, | Oct 28 2009 | NetApp, Inc. | Load balancing of scan requests to all antivirus servers in a cluster |
20120110667, | |||
20130291109, |
Executed on | Assignor | Assignee | Conveyance | Frame | Reel | Doc |
May 17 2012 | WANG, JISHENG | Cisco Technology, Inc | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 028420 | /0276 | |
May 17 2012 | JONES, LEE | Cisco Technology, Inc | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 028420 | /0276 | |
Jun 20 2012 | QUINLAN, DANIEL | Cisco Technology, Inc | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 028420 | /0276 | |
Jun 21 2012 | Cisco Technology, Inc. | (assignment on the face of the patent) | / |
Date | Maintenance Fee Events |
Aug 23 2019 | M1551: Payment of Maintenance Fee, 4th Year, Large Entity. |
Aug 21 2023 | M1552: Payment of Maintenance Fee, 8th Year, Large Entity. |
Date | Maintenance Schedule |
Feb 23 2019 | 4 years fee payment window open |
Aug 23 2019 | 6 months grace period start (w surcharge) |
Feb 23 2020 | patent expiry (for year 4) |
Feb 23 2022 | 2 years to revive unintentionally abandoned end. (for year 4) |
Feb 23 2023 | 8 years fee payment window open |
Aug 23 2023 | 6 months grace period start (w surcharge) |
Feb 23 2024 | patent expiry (for year 8) |
Feb 23 2026 | 2 years to revive unintentionally abandoned end. (for year 8) |
Feb 23 2027 | 12 years fee payment window open |
Aug 23 2027 | 6 months grace period start (w surcharge) |
Feb 23 2028 | patent expiry (for year 12) |
Feb 23 2030 | 2 years to revive unintentionally abandoned end. (for year 12) |