A visualization tool for debugging and monitoring parallel codes, and for managing the resources of distributed clusters.
The Lilith Lights visualization tool provides graphical information about CPU usage and communication among nodes of a cluster. Application programmers can use the tool to visually discover problems in their algorithms, such as unexpected interactions between various parallel message passing library components, and the emergence of hot spots in the calculation. System resource managers can also use the tool to discover areas in the cluster which are underutilized.
Performance monitoring on clusters of computers is traditionally performed by invasive message-passing taps, and the data is examined only after the run is complete. Lilith Lights is a non-invasive network and processor monitoring tool which collects and processes data from individual cluster machines. The data is gathered and displayed concurrently with the running application.
The display shows system load on each node as a moving vertical bar. Traffic on the network links connecting nodes is represented by colored unidirectional arrows; colors indicate the level of consumed bandwidth.
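The bandwidth-to-color mapping for the arrows can be sketched as a simple threshold function. This is an illustrative sketch only; the thresholds, color names, and function name below are assumptions, not Lilith's actual values.

```python
# Hypothetical mapping from consumed link bandwidth to an arrow color,
# in the spirit of the display described above. Thresholds are
# illustrative, not taken from Lilith.

def link_color(bytes_per_sec, capacity_bytes_per_sec):
    """Bucket a link's consumed bandwidth into a display color."""
    fraction = bytes_per_sec / capacity_bytes_per_sec
    if fraction < 0.25:
        return "green"     # lightly loaded link
    elif fraction < 0.50:
        return "yellow"
    elif fraction < 0.75:
        return "orange"
    else:
        return "red"       # link near saturation

print(link_color(30e6, 100e6))  # → yellow
```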
The user code gets system load information from /proc/stat and gets traffic information from a small kernel module which records the transfer of every packet. The Lilith framework handles the details of distributing the user code and collecting the results. On our cluster, the Lilith traffic is sent across a slow secondary network, so the traffic displayed is only that of the application being monitored, without interference from the monitoring tool's own traffic.
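Reading CPU usage from /proc/stat amounts to differencing jiffy counters between two samples. A minimal sketch, assuming the standard Linux /proc/stat layout; the function names are illustrative, not Lilith's:

```python
# Sketch of computing CPU utilization from /proc/stat, as the user
# code described above does. The busy fraction is the delta of
# non-idle jiffies over the delta of total jiffies between samples.
import time

def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line into (busy, total) jiffy counts."""
    fields = [int(v) for v in line.split()[1:]]
    idle = fields[3]                     # fourth field is idle time
    return sum(fields) - idle, sum(fields)

def cpu_utilization(interval=1.0):
    """Fraction of CPU time spent busy over the sampling interval."""
    with open("/proc/stat") as f:
        busy1, total1 = parse_cpu_line(f.readline())
    time.sleep(interval)
    with open("/proc/stat") as f:
        busy2, total2 = parse_cpu_line(f.readline())
    return (busy2 - busy1) / max(total2 - total1, 1)
```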
Also in production is a System Status Monitoring tool, necessary for the use and maintenance of large, distributed, heterogeneous clusters of computers. Monitoring and maintaining such systems incur difficulties which do not encumber more conventional systems. Even the simple task of obtaining system load, memory usage, etc., for each component of the machine can require some thought. This sort of system status monitoring is required by parallel applications and is essential for a remotely hosted system, where normal status indicators are absent. The information is necessary for system administrators to monitor system performance, and is needed by users to check the status of their jobs and to select the placement of those jobs in a clustered environment.
Lilith's scalable nature and encapsulation of the tool code make it an ideal basis for the status tool on the distributed system. Normal tools like shell scripts that rely on serial algorithms will not finish in a meaningful period of time. Additionally, on experimental clusters, the monitoring section of the code must be altered frequently to adapt to changes in the rapidly evolving system configuration. Lilith is ideal for this situation because the infrastructure needed to scalably and securely span the machine is written and debugged once. The much smaller code that actually performs the system monitoring function can be altered independently. Lilith takes care of getting the code and the data to the machine in a scalable way, executing it on every configured node, and scalably returning data to the originator.
The user code consists of a thread that waits for a flag from its parent describing the type of data to be returned (a delta or absolute quantity). The flag is passed to the children, and several system files in /proc are read (this includes downloadable kernel modules that expose kernel internals via the /proc interface). The values are read, processed, and merged with values returned by the other processors. All the user code has to do is specify the code for processing the values; Lilith handles the details of code distribution and result collection.
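The delta-versus-absolute protocol described above can be sketched as a worker thread that blocks on a flag from its parent. This is a hedged sketch under stated assumptions: the flag names, class name, and queue-based plumbing are illustrative, and the real Lilith user code runs inside the Lilith framework rather than standalone.

```python
# Illustrative sketch of the flag-driven monitoring thread described
# above: the parent sends a flag, and the thread replies with either
# a delta or an absolute reading of the sampled counters.
import queue
import threading

DELTA, ABSOLUTE = "delta", "absolute"   # hypothetical flag values

class MonitorThread(threading.Thread):
    def __init__(self, requests, results, sampler):
        super().__init__(daemon=True)
        self.requests = requests    # flags arriving from the parent
        self.results = results      # processed values for the parent
        self.sampler = sampler      # callable returning current counters
        self.last = sampler()       # baseline for delta calculations

    def run(self):
        while True:
            flag = self.requests.get()        # block until parent asks
            if flag is None:                  # shutdown sentinel
                break
            current = self.sampler()
            if flag == DELTA:
                value = {k: current[k] - self.last[k] for k in current}
            else:
                value = dict(current)
            self.last = current
            self.results.put(value)
```

In the real tool, merging values returned by the other processors happens in the same user code; Lilith distributes the code and gathers the results.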
The application provides statistics on CPU utilization, memory utilization, and Ethernet packet and error statistics. For systems that have an ATM interface, we provide values for the AAL5 traffic. These files are read in such a way as to minimize the system overhead (file open/close on an NFS-mounted root file system). The application can be tailored to perform queries only on demand, thus avoiding adverse effects on system load.
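One way to avoid repeated open/close overhead on the NFS-mounted root is to keep the file descriptor open and rewind it before each sample. A minimal sketch, assuming the standard /proc/net/dev layout (receive packets and errors are the second and third fields after the interface name); the class and function names are illustrative:

```python
# Sketch of sampling Ethernet packet and error counts from
# /proc/net/dev without reopening the file on every query.

def parse_net_dev(text):
    """Parse /proc/net/dev text into {iface: (rx_packets, rx_errs)}."""
    stats = {}
    for line in text.splitlines()[2:]:     # skip the two header lines
        iface, data = line.split(":", 1)
        fields = data.split()
        stats[iface.strip()] = (int(fields[1]), int(fields[2]))
    return stats

class NetSampler:
    def __init__(self, path="/proc/net/dev"):
        self.f = open(path)                # opened once, kept open
    def sample(self):
        self.f.seek(0)                     # rewind instead of reopening
        return parse_net_dev(self.f.read())
```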
A snapshot of the graphical display appears above. Users can select which machines to view (in this case the system is configured into sets of racks of 8 machines) and which quantities. The set-up pictured is limited to two quantities displayed simultaneously; this number is a user-specified parameter at start-up and, in practice, is set higher. Results are displayed as color-coded bars sized proportionally to their value, similar to the display provided by SGI's gr_osview. Total scales for the display values can be entered by the user and updated dynamically.
The cluster-wide PS tool provides a tabular display of the results of a ps on all the nodes of a cluster; the table is sortable by hostname, CPU usage, memory usage, time, etc.
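The merge-and-sort step can be sketched as follows. This is an illustrative sketch, not the tool's implementation; the row layout and column names are assumptions.

```python
# Hypothetical sketch of merging per-node ps output into one table
# sortable by hostname, CPU usage, memory usage, or time, as the
# cluster-wide PS tool described above does.

def merge_ps(per_node, sort_key="cpu", descending=True):
    """per_node: {hostname: [(pid, cpu, mem, time, cmd), ...]}.
    Returns rows (host, pid, cpu, mem, time, cmd) sorted by sort_key."""
    cols = {"pid": 1, "cpu": 2, "mem": 3, "time": 4}
    rows = [(host,) + proc
            for host, procs in per_node.items()
            for proc in procs]
    if sort_key == "hostname":
        return sorted(rows, key=lambda r: r[0])
    return sorted(rows, key=lambda r: r[cols[sort_key]],
                  reverse=descending)
```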