The presentation layer determines the user friendliness of the monitor. It should be designed to demand as little knowledge as possible from the user. The user need not be aware of the current system configuration or even the current monitor configuration. If possible, the monitor should learn both the system configuration and its own configuration automatically.
The monitor should be able to run interactively or in the background mode. Most system managers want the system to be observed continuously and use the monitor interactively only when a problem is sensed. Thus, batch mode as well as interactive mode operation of the monitor should be possible.
Some monitors allow callable interfaces so that extensions can be easily built on top of the functions provided by the monitor.
The preceding issues are general and apply to all applications of the monitor. There are a number of presentation issues that apply to specific applications of a monitor. Three key applications of the monitor are related to the speed, accuracy, and availability of the services provided by the system. These are termed performance monitoring, error monitoring, and configuration monitoring, respectively. The presentation issues related to each of these applications will now be described.
- 1. Performance Monitoring: Performance monitoring helps to quantify the quality of service when the service is provided correctly. The users are generally interested in observing the system performance in terms of throughput, response time, and utilization of various resources. They may also be interested in seeing summary statistics of various events on the system. Depending upon the frequency of occurrence of each event and its importance, the summary statistics may consist of counts, class counts (or histograms), or a time-stamped trace.
A monitor can simply count all service requests regardless of their type. It can categorize a request into several classes based on the type and then count the number for each class. This is called a class count or histogram. The monitor can sort these class counts in decreasing order and present only the top 10. It can present other statistical characteristics of the class counts, for example, average, standard deviation, and histogram. The monitor can also prepare a time-stamped trace of requests to be analyzed by other programs.
- 2. Error Monitoring: An error is defined as incorrect performance. The system appears to be operating, accepting user requests, and performing the service, but the end result is not what the user wanted. A monitor should provide the error statistics, counts, class counts, or traces in a manner similar to that described under performance monitoring. The error statistics on various components of the system may be sorted to determine the unreliable parts of the system.
- 3. Configuration Monitoring: Configuration monitoring relates to the nonperformance of the system components. It allows the user to determine which components are up. A monitor can determine this by promiscuous observation of traffic on the system bus or on the broadcast network medium, by polling the components, or by sending a marked packet. A monitor should record system initializations and any configuration changes due to components joining or leaving the configuration. The user can get a count, class count, or a trace of these events. The class count is generally that of interevent time, that is, the interval for which the component was up or down.
Often incremental configuration information is more useful than knowing the full configuration. Such information includes a list of subsystems that were added or dropped from the system. Also useful is the list of subsystems that are not on a list prepared by the system manager. This facility may be used, for example, on a network to identify unknown stations joining the network.
The systems manager should be able to scope the configuration monitoring to any part of the system: the whole system, a single subsystem, or a set of subsystems.
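The class counts described under performance monitoring and the incremental configuration information described above can be illustrated with a short sketch. It is only one possible arrangement, not a prescribed design: the function names, the request types, and the station names are hypothetical, and a real monitor would collect the events from the system rather than from lists written into the program.

```python
from collections import Counter
from statistics import mean, pstdev


def top_class_counts(requests, n=10):
    """Classify requests by type and return the n largest class counts."""
    counts = Counter(requests)
    # most_common sorts the classes in decreasing order of count.
    return counts.most_common(n)


def summarize(requests):
    """Return the average and standard deviation of the class counts."""
    values = list(Counter(requests).values())
    return mean(values), pstdev(values)


def config_changes(previous, current, approved):
    """Compare two configuration snapshots (sets of subsystem names).

    Returns the subsystems added, the subsystems dropped, and the
    subsystems present now that are not on the manager's approved list.
    """
    added = sorted(current - previous)
    dropped = sorted(previous - current)
    unknown = sorted(current - approved)
    return added, dropped, unknown


# Hypothetical request stream, for illustration only.
requests = ["read", "write", "read", "open", "read", "write"]
top = top_class_counts(requests, n=2)
avg, sd = summarize(requests)

# Hypothetical configuration snapshots, for illustration only.
previous = {"disk0", "nodeA", "nodeB"}
current = {"disk0", "nodeB", "nodeC", "nodeX"}
approved = {"disk0", "nodeA", "nodeB", "nodeC"}
added, dropped, unknown = config_changes(previous, current, approved)

print("Top classes:", top)
print("Average and standard deviation of class counts:", avg, sd)
print("Added:", added, "Dropped:", dropped, "Unknown:", unknown)
```

Representing each configuration snapshot as a set keeps the comparison to three set differences. The same comparison can be scoped to a single subsystem or a set of subsystems by restricting the snapshots before they are compared.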
7.7.5 Interpretation
The interpretation of data requires a set of rules from which the interpreter judges the probable state of the system. This requires building an expert system to warn the systems manager about probable faults before they occur or to ask the manager to change system parameters.
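A rule-based interpreter of this kind can be sketched in a few lines. The sketch below is a simplified illustration, not a full expert system: the rule names, the metric names, and the thresholds are all hypothetical, and a practical rule set would have to be derived from experience with the particular system.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    """One interpretation rule: a condition on the metrics and the advice."""
    name: str
    condition: Callable[[Dict[str, float]], bool]
    advice: str


# Hypothetical rules and thresholds, chosen only for illustration.
RULES = [
    Rule(
        name="disk_saturation",
        condition=lambda m: m.get("disk_utilization", 0.0) > 0.90,
        advice="disk is close to saturation; consider moving files",
    ),
    Rule(
        name="error_burst",
        condition=lambda m: m.get("errors_per_hour", 0.0) > 100,
        advice="error rate is unusually high; a component may be failing",
    ),
]


def interpret(metrics: Dict[str, float], rules: List[Rule] = RULES) -> List[str]:
    """Return a warning for every rule whose condition holds."""
    return [f"{r.name}: {r.advice}" for r in rules if r.condition(metrics)]


# Hypothetical measurements, for illustration only.
warnings = interpret({"disk_utilization": 0.95, "errors_per_hour": 3})
for w in warnings:
    print(w)
```

Keeping the rules as data, separate from the code that evaluates them, allows the systems manager to add or adjust rules without changing the interpreter.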
7.7.6 Console Functions
Console functions allow the systems manager to change system parameters, reconfigure the system, and bring system components up or down. Remote console functions allow a remote diagnostic link to the system. Although console functions are not an essential part of a monitor, they are activated as a result of the data provided by the monitor. It is easier for the systems manager to get feedback (monitor) and apply control (console) from the same location. Unfortunately, consoles and monitors are often designed and sold by different vendors, and their activation from the same workstation may not be possible.
EXERCISES
- 7.1 For each of the following measurements, list the types of monitor that can and cannot be used. Which type of monitor would you prefer and why?
- a. Interrupt response time
- b. Instruction opcode frequency
- c. Program reference pattern
- d. Virtual memory reference pattern in a multiprogramming system
- e. CPU time required to send one packet on a network
- f. Response time for a database query
- 7.2 For each of the following environments, describe how you would implement a monitor to produce a program counter histogram:
- a. Using a hardware monitor
- b. Using a software monitor on an IBM PC with the CPU having a trace bit
- c. Using a software monitor on a TRS-80 with the CPU not having a trace bit
- 7.3 Choose a computer system or subsystem. Assume that prototypes of systems you selected already exist and you have decided to measure their performance. Make a list of quantities, if any, that you could measure using a
- a. Software monitor
- b. Hardware monitor
- c. Firmware monitor
In each case, describe how performance metrics of interest to you could be calculated using the quantities measured. Discuss how you would resolve some of the issues you would face in using or designing a monitor for your system.