Embodiments disclosed herein relate to a method, system, and computer-readable medium for monitoring an application executing across a plurality of containers on a computing system. A performance monitor requests a list of containers created on the computing system. The performance monitor retrieves information associated with a creation of each container in the list. The performance monitor parses the information associated with each container in the list to identify a cluster of related containers that are running the application. The performance monitor associates the cluster of related containers with the application. The performance monitor assesses a health of the application based on metrics collected from the identified cluster of containers.
1. A method of monitoring an application executing across a plurality of containers in a computing system, comprising:
requesting a list of containers created on a computing system;
retrieving information associated with a creation of each container in the list;
parsing the information associated with each container in the list to identify a cluster of related containers that are running the application; and
assessing a health of the application executing on the cluster of related containers based on metrics collected from the cluster of related containers, wherein parsing the information associated with each container in the list to identify the cluster of related containers comprises identifying the cluster of related containers by parsing the information to search for at least one of:
a link command,
a common network,
creation of a configuration file for the application,
a semaphore command related to one or more containers,
a command creating a control group, or
containers that use a same storage volume.
9. A computer system, comprising:
a processor; and
a memory storing program code, which, when executed on the processor, performs a method of monitoring an application executing across a plurality of containers in a computing system, comprising:
requesting a list of containers created on the computing system;
retrieving information associated with a creation of each container in the list;
parsing the information associated with each container in the list to identify a cluster of related containers that are running the application; and
assessing a health of the application executing on the cluster of related containers based on metrics collected from the cluster of related containers, wherein parsing the information associated with each container in the list to identify the cluster of related containers comprises identifying the cluster of related containers by parsing the information to search for at least one of:
a link command,
a common network,
creation of a configuration file for the application,
a semaphore command related to one or more containers,
a command creating a control group, or
containers that use a same storage volume.
17. A non-transitory computer readable medium comprising instructions, which when executed in a computer system, cause the computer system to carry out a method of monitoring an application executing across a plurality of containers in a computing system, comprising:
requesting a list of containers created on the computing system;
retrieving information associated with a creation of each container in the list;
parsing the information associated with each container in the list to identify a cluster of related containers that are running the application; and
assessing a health of the application executing on the cluster of related containers based on metrics collected from the cluster of related containers, wherein parsing the information associated with each container in the list to identify the cluster of related containers comprises identifying the cluster of related containers by parsing the information to search for at least one of:
a link command,
a common network,
creation of a configuration file for the application,
a semaphore command related to one or more containers,
a command creating a control group, or
containers that use a same storage volume.
2. The method of claim 1, further comprising:
filtering the list of containers created on the computing system to include only those containers that are active on the computing system.
3. The method of claim 1, wherein parsing the information associated with each container in the list to identify the cluster of related containers comprises:
identifying the cluster of related containers by parsing the information to search for a link command.
4. The method of claim 1, wherein parsing the information associated with each container in the list to identify the cluster of related containers comprises:
identifying the cluster of related containers by parsing the information to search for a common network.
5. The method of claim 1, wherein parsing the information associated with each container in the list to identify the cluster of related containers comprises:
identifying the cluster of related containers by parsing the information to search for creation of a configuration file for the application.
6. The method of claim 1, wherein parsing the information associated with each container in the list to identify the cluster of related containers comprises:
identifying the cluster of related containers by parsing the information to search for a semaphore command related to one or more containers.
7. The method of claim 1, wherein parsing the information associated with each container in the list to identify the cluster of related containers comprises:
identifying the cluster of related containers by parsing the information to search for a command creating a control group.
8. The method of claim 1, wherein parsing the information associated with each container in the list to identify the cluster of related containers comprises:
identifying the cluster of related containers by parsing the information to search for containers that use a same storage volume.
10. The computer system of claim 9, wherein the method further comprises:
filtering the list of containers created on the computing system to include only those containers that are active on the computing system.
11. The computer system of claim 9, wherein parsing the information associated with each container in the list to identify the cluster of related containers comprises:
identifying the cluster of related containers by parsing the information to search for a link command.
12. The computer system of claim 9, wherein parsing the information associated with each container in the list to identify the cluster of related containers comprises:
identifying the cluster of related containers by parsing the information to search for a common network.
13. The computer system of claim 9, wherein parsing the information associated with each container in the list to identify the cluster of related containers comprises:
identifying the cluster of related containers by parsing the information to search for creation of a configuration file for the application.
14. The computer system of claim 9, wherein parsing the information associated with each container in the list to identify the cluster of related containers comprises:
identifying the cluster of related containers by parsing the information to search for a semaphore command related to one or more containers.
15. The computer system of claim 9, wherein parsing the information associated with each container in the list to identify the cluster of related containers comprises:
identifying the cluster of related containers by parsing the information to search for a command creating a control group.
16. The computer system of claim 9, wherein parsing the information associated with each container in the list to identify the cluster of related containers comprises:
identifying the cluster of related containers by parsing the information to search for containers that use a same storage volume.
18. The non-transitory computer readable medium of claim 17, wherein the method further comprises:
filtering the list of containers created on the computing system to include only those containers that are active on the computing system.
19. The non-transitory computer readable medium of claim 17, wherein parsing the information associated with each container in the list to identify the cluster of related containers comprises:
identifying the cluster of related containers by parsing the information to search for a link command.
20. The non-transitory computer readable medium of claim 17, wherein parsing the information associated with each container in the list to identify the cluster of related containers comprises:
identifying the cluster of related containers by parsing the information to search for a common network.
Benefit is claimed under 35 U.S.C. 119(a)-(d) to Foreign Application Serial No. 201741017168 filed in India entitled “MONITORING APPLICATIONS RUNNING ON CONTAINERS”, filed on May 16, 2017 by VMware, Inc., which is herein incorporated in its entirety by reference for all purposes.
Computer virtualization is a technique that involves encapsulating a physical computing machine platform into virtual machine(s) executing under control of virtualization software on a hardware computing platform or “host.” A virtual machine provides virtual hardware abstractions for processor, memory, storage, and the like to a guest operating system. The virtualization software, also referred to as a “hypervisor,” includes one or more virtual machine monitors (VMMs) to provide execution environment(s) for the virtual machine(s). As physical hosts have grown larger, with greater processor core counts and terabyte memory sizes, virtualization has become key to the economic utilization of available hardware.
Virtual machines provide for hardware-level virtualization. Another virtualization technique is operating system-level (OS-level) virtualization, where an abstraction layer is provided on top of a kernel of an operating system executing on a host computer. Such an abstraction is referred to herein as a “container.” A container executes as an isolated process in user-space on the host operating system (referred to as the “container host”) and shares the kernel with other containers. A container relies on the kernel's functionality to make use of resource isolation (processor, memory, input/output, network, etc.).
Performance monitoring has become increasingly important because it aids in troubleshooting a virtualized environment. As systems become more complex, it becomes increasingly important to provide customers with a scalable method for retrieving data and an easy way to analyze that data. Performance monitoring tools currently available typically provide computation metrics for the individual containers themselves, but not for the applications running thereon. Because containers (e.g., stateless containers) are short-lived, information directed to the containers themselves is not of much importance.
Embodiments disclosed herein relate to a method, system, and computer-readable medium for monitoring an application executing across a plurality of containers on a computing system. A performance monitor requests a list of containers created on the computing system. The performance monitor retrieves information associated with a creation of each container in the list. The performance monitor parses the information associated with the creation of each container in the list to identify a cluster of related containers. The containers are considered related if the containers are running the same application. The performance monitor associates the cluster of related containers with the application. The performance monitor assesses a health of the application based on metrics collected from the identified cluster of containers.
To facilitate understanding, identical reference numerals have been used, where possible, to designate identical elements that are common to the figures. It is contemplated that elements disclosed in one embodiment may be beneficially utilized on other embodiments without specific recitation.
Computer system 106 supports one or more container hosts 108. In an embodiment, container host 108 may be a virtualized host, such as a virtual machine (VM), executing on a physical host. Each container host 108 includes an agent 112, one or more containers (“container(s) 110”), and operating system (OS) platform 116. Container host(s) 108 can be managed (e.g., provisioned, started, stopped, removed) using installer(s)/uninstaller(s) 105 executing on client computer(s) 102. In one embodiment, container host 108 may be a physical computer, such as a desktop computer, a mobile device, or the like. A container 110 may include binaries, configuration settings, and resource constraints (e.g., assigned processor, memory, and network resources).
Agent 112 provides an interface to computer system 106 for the creation of container(s) 110. Agent 112 provides an application programming interface (API) endpoint for container host 108. Client(s) 104 communicate with agent 112 to build, run, stop, update, and delete containers 110. Client(s) 104 can be any type of existing client for managing conventional containers, such as a Docker client. Agent 112 interfaces with computer system 106 to provision, start, stop, update, and delete containers 110. Agent 112 can also interface with containers 110 to control operations performed therein, such as launching processes, streaming standard output/standard error, setting environment variables, and the like.
Host 108 further includes an operating system platform 116. Operating system platform 116 provides a virtualization layer that allows multiple containers 110 to share resources of an operating system (OS) (“operating system-level virtualization”). The processes carried out by the multiple containers 110 are isolated in the underlying operating system. Operating system platform 116 includes kernel 224. Each container 110 runs on top of kernel 224, which enables sharing of OS resources of host 108 by containers 110. In general, clients can use the API provided by agent 112 to manage containers 110, such as provisioning, starting, stopping, and deleting containers. In an embodiment, a user interacts with agent 112 through an API using client software executing on client computer 102 to create or delete one or more containers 110.
Containers 110 implement OS-level virtualization, wherein an abstraction layer is provided on top of kernel 224 of operating system 116 of host 108. The abstraction layer supports multiple containers 110, with each container including an application 220 and its dependencies. Each container 110 runs as an isolated process in userspace on host operating system 116 and shares kernel 224 with other containers 110. For example, each container 110i (from 1101 to 110n) shares kernel 224. Each container 110 relies on the functionality of kernel 224 to make use of resource isolation (CPU, memory, block I/O, network, etc.) and separate namespaces to completely isolate the application's view of the operating environment.
Traditionally, virtual machines (VMs) have been extensively used in cloud computing because they ensure isolation and limits on resources. Container-based virtualization has emerged as an alternative to VMs for deploying applications in the cloud and has simplified deployment of those applications. Even though containers have eased many tasks in the cloud, they have also increased the complexity of performance debugging.
Performance monitoring is of great importance for any system because it aids in troubleshooting a virtualized environment. As the demands of customers increase, so does the importance of providing customers with a scalable method that retrieves data from the virtualized environment and provides an easy way to analyze that data. Existing tools, such as monitoring tools, typically provide computation metrics for containers. However, existing tools do not provide computation metrics for applications running on the containers. Because containers are typically short-lived, computation metrics directed to the containers themselves are not as important as the health of the application.
To aid in monitoring the health of an application executing in one or more containers 110, host 108 may communicate with a performance monitor 230. In an embodiment, performance monitor 230 may reside in a separate server (not shown) that has access to the network (e.g., network 103) on which containers 110 are running. In an embodiment, performance monitor 230 can reside on the same host (e.g., host 108) on which the containers (e.g., containers 110) are created. Performance monitor 230 is configured to monitor applications running on containers 110. Typically, an application, such as application 220, may be executed across multiple containers 110. For example, application 2201 may be executed across containers 1101-1104 and application 2202 may be executed across containers 1105-1107. Performance monitor 230 monitors the host (e.g., host 108) on which a number of containers 110 (e.g., containers 1101-1107) are running. Performance monitor 230 then processes a list of containers 110 to identify one or more applications (e.g., application 2201 and application 2202). As such, performance monitor 230 provides the user with health/performance information to aid the user in deciding how to optimize the resource requirements for a resource-intensive application.
At step 302, performance monitor 230 identifies all containers 110i running on a computer system. For example, performance monitor 230 may ping agent 112 to determine the containers 110i running in computer system 106. In one embodiment, performance monitor 230 may ping agent 112 to determine only those containers 110i that are active (i.e., running) in computer system 106. An example environment 400 is shown in FIG. 4.
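For illustration only, the following sketch shows how a performance monitor might request the container list from a Docker-style agent and filter it down to active containers, as in step 302. It assumes the container host exposes the Docker Engine API and that the monitor uses the Docker SDK for Python; the function name list_active_containers is a hypothetical label, not an element of the disclosed embodiments.

# Sketch (assumption): query a Docker-style agent for all containers and
# keep only those that are currently running.
import docker

def list_active_containers(base_url=None):
    # Connect to the agent; docker.from_env() reads DOCKER_HOST and related
    # environment variables, or base_url can point at a remote container host.
    client = docker.DockerClient(base_url=base_url) if base_url else docker.from_env()
    all_containers = client.containers.list(all=True)  # created, exited, and running
    return [c for c in all_containers if c.status == "running"]

if __name__ == "__main__":
    for c in list_active_containers():
        print(c.short_id, c.name, c.status)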
Using the example of
Referring back to
The information associated with the creation of active containers 110i helps performance monitor 230 identify clusters 402-406. Clusters 402-406 are further identified by determining which containers 110i communicate with each other. For example, application 2203 may comprise a web server in container 1108 and a database in container 1109. As such, container 1108 communicates with container 1109. Thus, performance monitor 230 may group containers 1108 and 1109 in the same cluster.
To link a webapp container (i.e., a separate container) to the new database container, a user may enter the following on the command line:
docker run -d -P --name web --link db:db training/webapp python app.py
With the --link option, as shown in the above command, a new webapp container 110i is created and linked to the database container 110i. As such, when performance monitor 230 accesses the retrieved creation information for containers 110i, performance monitor 230 parses the information, searching for "--link" to determine which containers 110i are related.
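A minimal sketch of this parsing step follows, assuming the creation information is retrieved as Docker inspect data through the Docker SDK for Python; in that data the --link option appears under HostConfig.Links. The helper name group_by_links is hypothetical.

# Sketch (assumption): group containers related through "--link" by reading
# HostConfig.Links from each container's inspect data.
import docker

def group_by_links():
    client = docker.from_env()
    groups = []  # each group is a set of container names that are linked
    for c in client.containers.list(all=True):
        links = c.attrs.get("HostConfig", {}).get("Links") or []
        # A link entry looks like "/db:/web/db": source container, then alias path.
        related = {c.name} | {link.split(":")[0].strip("/") for link in links}
        if len(related) > 1:
            groups.append(related)
    return groups

print(group_by_links())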
At step 504, performance monitor 230 identifies clusters of containers 110i that comprise individual applications by determining common networks on which the containers are running. For example, performance monitor 230 may utilize a network feature provided by the OS-level virtualization layer that provides complete isolation among containers 110i. This feature provides the end user with control over the networks on which the applications 220i are running. This is particularly useful for applications 220i that involve several containers 110i running in combination. For example, Docker natively provides a "--net" flag to specify on which network a given container 110i will run. Containers 110i connected to the same network/bridge can communicate with one another. As such, performance monitor 230 may access the retrieved creation information and parse the information for "--net" to determine on which network the container is running. Performance monitor 230 utilizes the network information to group containers into one or more clusters, such as those illustrated in FIG. 4.
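The sketch below illustrates grouping by shared network, again assuming Docker inspect data obtained through the Docker SDK for Python; networks attached with --net/--network are recorded under NetworkSettings.Networks. The function name cluster_by_network is hypothetical.

# Sketch (assumption): cluster containers attached to the same user-defined
# network, as recorded in NetworkSettings.Networks.
from collections import defaultdict
import docker

def cluster_by_network():
    client = docker.from_env()
    clusters = defaultdict(set)  # network name -> container names
    for c in client.containers.list():
        networks = c.attrs.get("NetworkSettings", {}).get("Networks", {})
        for net_name in networks:
            if net_name not in ("bridge", "host", "none"):  # skip default networks
                clusters[net_name].add(c.name)
    return dict(clusters)

print(cluster_by_network())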
At step 506, performance monitor 230 identifies any configuration file or tool provided by the OS-level virtualization layer that is used to configure the services of an application 220i. For example, Compose is a tool for Docker that is used to define and run multi-container Docker applications. With Compose, a Compose configuration file is used to configure an application's services. As such, Compose allows the user to use a single command to create and start all application services from the configuration. Performance monitor 230 parses the creation information to locate the Compose file configuration used to configure the services of an application 220i, and thereby determines which containers 110i are associated with a given application.
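As one hedged illustration of this step: when containers are started through Compose, the Docker engine attaches labels such as com.docker.compose.project to each container, so the project label offers one practical way to recover the application grouping defined in the Compose file. This is an assumption about a possible implementation rather than the only way to locate the configuration; the helper name cluster_by_compose_project is hypothetical.

# Sketch (assumption): group containers by the Compose project label that the
# engine records when an application is brought up from a Compose file.
from collections import defaultdict
import docker

def cluster_by_compose_project():
    client = docker.from_env()
    clusters = defaultdict(set)  # compose project -> container names
    for c in client.containers.list(all=True):
        labels = c.attrs.get("Config", {}).get("Labels") or {}
        project = labels.get("com.docker.compose.project")
        if project:
            clusters[project].add(c.name)
    return dict(clusters)

print(cluster_by_compose_project())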
At step 508, performance monitor 230 identifies interprocess communication among multiple containers 110i. Interprocess communication (IPC) is used for high performance computing. IPC allows multiple containers to share data and information through semaphores. A semaphore is a variable or abstract data type that is used to control access to a common resource by multiple processes in a concurrent system. Performance monitor 230 parses the extracted information collected from the containers and locates IPC among multiple containers by finding semaphores defined in the program code.
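The embodiments above locate IPC by finding semaphores in program code; as a hedged, container-level alternative signal, the sketch below checks Docker's --ipc setting, which inspect data records as HostConfig.IpcMode (for example, "container:<id>" when one container shares another's IPC namespace). This is only one possible proxy for IPC-based relatedness, and the helper name cluster_by_shared_ipc is hypothetical.

# Sketch (assumption): treat containers that share an IPC namespace
# (run with --ipc container:<other>) as related.
import docker

def cluster_by_shared_ipc():
    client = docker.from_env()
    clusters = []
    for c in client.containers.list(all=True):
        ipc_mode = c.attrs.get("HostConfig", {}).get("IpcMode", "")
        if ipc_mode.startswith("container:"):
            other = client.containers.get(ipc_mode.split(":", 1)[1])
            clusters.append({c.name, other.name})
    return clusters

print(cluster_by_shared_ipc())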
At step 510, performance monitor 230 identifies control groups (Cgroups) of containers 110i. Generally, containers 110i may be made from a combination of namespaces, capabilities, and Cgroups. A Cgroup is a collection of processes that are bound by the same criteria. A given Cgroup may be associated with a set of parameters or limits. In the case of Docker OS-level virtualization, Cgroups may be formed through a "--cgroup-parent" flag, which allows a user to create and manage resources, as well as categorize containers under a common parent group. Performance monitor 230 identifies Cgroups by parsing the creation information collected from agent 112 to locate a "--cgroup-parent" instruction. As such, performance monitor 230 identifies an application by determining which containers are associated with the same Cgroup.
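A sketch of this check follows, assuming Docker inspect data: the --cgroup-parent value supplied at creation time is recorded as HostConfig.CgroupParent, so containers that share the same non-empty parent can be grouped together. The function name cluster_by_cgroup_parent is hypothetical.

# Sketch (assumption): group containers whose creation information records
# the same --cgroup-parent value (HostConfig.CgroupParent).
from collections import defaultdict
import docker

def cluster_by_cgroup_parent():
    client = docker.from_env()
    clusters = defaultdict(set)  # cgroup parent -> container names
    for c in client.containers.list(all=True):
        parent = c.attrs.get("HostConfig", {}).get("CgroupParent", "")
        if parent:  # an empty string means the default parent was used
            clusters[parent].add(c.name)
    return dict(clusters)

print(cluster_by_cgroup_parent())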
At step 512, performance monitor 230 identifies which containers share the same data volume and the same image to run a given container. For example, in the Docker case, a user may enter the following instructions when creating a container:
docker create -v /dbdata --name dbdata training/postgres /bin/true
docker run -d --volumes-from dbdata --name db1 training/postgres
The “-volume” parameter identifies the same data volume and same image to run a given container. Performance monitor 230 identifies containers 1101 sharing the same data volume by parsing the retrieved creation information and searching for the “-volume” flag. As such, performance monitor 230 is able to identify linked containers by determining which containers share the same data volume.
Referring back to
At step 310, performance monitor 230 generates a health assessment of each application running on host 108. Once all clusters are discovered and the associated applications 220i are identified, performance monitor 230 determines the health of each application 220i based at least on metrics collected from the identified cluster of containers. This enables an end user to pinpoint service methods that consume maximum resources (e.g., CPU, disk, or network time) for each request. For example, a simple web application may comprise a web server, a database server, and a load balancer. Assume that the web server is running on a first container, the database server executes on a second container, and the load balancer runs on a third container. The first, second, and third containers define a cluster that was identified in step 304 above. Performance monitor 230 assesses the health of the overall cluster, i.e., the application. As such, performance monitor 230 may determine whether the application consumes an excessive amount of host 108 resources. For example, a given application may consume 90% of the memory resources of host 108. This information is provided to the end user, such that the end user can take action.
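As a final hedged illustration, the sketch below aggregates per-container resource metrics over one identified cluster and flags the application when its combined memory share crosses a threshold, assuming the metrics are pulled with the Docker SDK's stats call. The name assess_cluster_health and the 90% threshold are illustrative assumptions, not limits taught by the embodiments.

# Sketch (assumption): assess application health by summing memory usage
# across the containers of one identified cluster.
import docker

def assess_cluster_health(container_names, memory_alert_fraction=0.9):
    client = docker.from_env()
    total_usage = 0
    total_limit = 0
    for name in container_names:
        stats = client.containers.get(name).stats(stream=False)  # one-shot sample
        mem = stats.get("memory_stats", {})
        total_usage += mem.get("usage", 0)
        # Unconstrained containers report (roughly) the host memory as their limit.
        total_limit = max(total_limit, mem.get("limit", 0))
    share = total_usage / total_limit if total_limit else 0.0
    status = "unhealthy" if share >= memory_alert_fraction else "healthy"
    return {"memory_share": share, "status": status}

# Example: the web server, database, and load balancer containers of one application.
print(assess_cluster_health(["web", "db", "lb"]))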
Monitoring containerized applications can maximize the efficiency of the underlying host. Without proper monitoring, for example, servers and cloud infrastructure may become unusable due to the load of resource-starved containers. The above method provides a way to maximize the resource utilization of the host's infrastructure without sacrificing the performance of any applications or microservices. The collected application metrics may subsequently be used to automate a large part of capacity planning.
The above process of monitoring applications on a host may be used on several occasions. For example, the application-centric information can be used for resource-aware scheduling and auto-scaling, i.e., the information can be used to start new containers on hosts where the load is low, and stop containers on hosts where the performance load is high. Performance monitor 230 helps users and administrators investigate the applications and manage the containers such that each container has the resources it needs and all hosts are running at their most efficient capacity. In another example, the monitoring process may be useful for automated monitoring and alerting of applications. For example, an alerting system can be built that raises alerts and informs the user when the performance load of containers on a host is too high.
The various embodiments described herein may employ various computer-implemented operations involving data stored in computer systems. For example, these operations may require physical manipulation of physical quantities—usually, though not necessarily, these quantities may take the form of electrical or magnetic signals, where they or representations of them are capable of being stored, transferred, combined, compared, or otherwise manipulated. Further, such manipulations are often referred to in terms, such as producing, identifying, determining, or comparing. Any operations described herein that form part of one or more embodiments of the invention may be useful machine operations. In addition, one or more embodiments of the invention also relate to a device or an apparatus for performing these operations. The apparatus may be specially constructed for specific required purposes, or it may be a general purpose computer selectively activated or configured by a computer program stored in the computer. In particular, various general purpose machines may be used with computer programs written in accordance with the teachings herein, or it may be more convenient to construct a more specialized apparatus to perform the required operations.
The various embodiments described herein may be practiced with other computer system configurations including hand-held devices, microprocessor systems, microprocessor-based or programmable consumer electronics, minicomputers, mainframe computers, and the like.
One or more embodiments of the present invention may be implemented as one or more computer programs or as one or more computer program modules embodied in one or more computer readable media. The term computer readable medium refers to any data storage device that can store data which can thereafter be input to a computer system. Computer readable media may be based on any existing or subsequently developed technology for embodying computer programs in a manner that enables them to be read by a computer. Examples of a computer readable medium include a hard drive, network attached storage (NAS), read-only memory, random-access memory (e.g., a flash memory device), a CD (Compact Disc) such as a CD-ROM, a CD-R, or a CD-RW, a DVD (Digital Versatile Disc), a magnetic tape, and other optical and non-optical data storage devices. The computer readable medium can also be distributed over a network-coupled computer system so that the computer readable code is stored and executed in a distributed fashion.
Although one or more embodiments of the present invention have been described in some detail for clarity of understanding, it will be apparent that certain changes and modifications may be made within the scope of the claims. Accordingly, the described embodiments are to be considered as illustrative and not restrictive, and the scope of the claims is not to be limited to details given herein, but may be modified within the scope and equivalents of the claims. In the claims, elements and/or steps do not imply any particular order of operation, unless explicitly stated in the claims.
Virtualization systems in accordance with the various embodiments may be implemented as hosted embodiments, as non-hosted embodiments, or as embodiments that tend to blur distinctions between the two; all are envisioned. Furthermore, various virtualization operations may be wholly or partially implemented in hardware. For example, a hardware implementation may employ a look-up table for modification of storage access requests to secure non-disk data.
Many variations, modifications, additions, and improvements are possible, regardless of the degree of virtualization. The virtualization software can therefore include components of a host, console, or guest operating system that performs virtualization functions. Plural instances may be provided for components, operations, or structures described herein as a single instance. Boundaries between various components, operations, and data stores are somewhat arbitrary, and particular operations are illustrated in the context of specific illustrative configurations. Other allocations of functionality are envisioned and may fall within the scope of the invention(s). In general, structures and functionality presented as separate components in exemplary configurations may be implemented as a combined structure or component. Similarly, structures and functionality presented as a single component may be implemented as separate components. These and other variations, modifications, additions, and improvements may fall within the scope of the appended claim(s).
Gupta, Gaurav; Bhandari, Akshay; Asawa, Aayush