The above list includes only those mistakes that an analyst may make inadvertently due to inexperience. The benchmarking tricks that have been used by experienced analysts to show the superiority of their systems are discussed in the next section.

9.4 BENCHMARKING GAMES

Benchmarking is the process of comparing two systems using standard, well-known benchmarks. The process is not always carried out fairly. Some of the ways in which the results of a benchmarking study may be misleading or biased are discussed next.

1.  Differing configurations may be used to run the same workload on the two systems. The configurations may have different amounts of memory, different disks, or a different number of disks.
2.  The compilers may be wired to optimize the workload. In one case, the compiler totally eliminated the main loop in the synthetic program, thereby giving infinitely better performance than other systems (see the first sketch following this list).
3.  Test specifications may be written so that they are biased toward one machine. This may happen if the specifications are written based upon an existing environment without consideration to generalizing the requirements for different vendors.
4.  A synchronized job sequence may be used. It is possible to manipulate a job sequence so that CPU-bound and I/O-bound steps synchronize to give a better overall performance.
5.  The workload may be arbitrarily picked. Many of the well-known kernels, such as sieve and puzzle, have not been verified to be representative of real-world applications.
6.  Very small benchmarks may be used. Such benchmarks give 100% cache hits, thereby ignoring the inefficiency of memory and cache organizations. Small benchmarks may also fail to show the effect of I/O overhead and context switching. The results then depend mainly on the few instructions that occur in the inner loop, and by judicious choice of those instructions the results can be skewed by any amount desired (see the second sketch following this list).
Most real systems run a wide variety of workloads. To compare two systems, one should therefore use as many workloads as possible. By using only a few selected benchmarks, an analyst can bias the results as desired.
7.  Benchmarks may be manually translated to optimize the performance. Often benchmarks need to be manually translated to make them runnable on different systems. The performance may then depend more on the ability of the translator than on the system under test.
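
The dead-loop pitfall of item 2 is easy to reproduce. In the C sketch below, the loop's result is never used, so an optimizing compiler is entitled to remove the loop entirely, and the benchmark then measures nothing. The iteration count and variable names are illustrative assumptions only.

    #include <stdio.h>
    #include <time.h>

    int main(void) {
        clock_t start = clock();

        /* Synthetic kernel: sum is never used afterward, so a
           compiler may delete the whole loop as dead code. */
        long sum = 0;
        for (long i = 0; i < 100000000L; i++)
            sum += i;

        clock_t stop = clock();
        printf("elapsed: %f s\n",
               (double)(stop - start) / CLOCKS_PER_SEC);
        /* Printing (or otherwise using) sum would prevent the
           elimination and restore a meaningful measurement. */
        return 0;
    }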
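
The second sketch illustrates item 6: two loops that make the same number of memory accesses, one over a buffer small enough to stay resident in any cache and one over a buffer that does not fit. Timing each call separately shows how a tiny benchmark hides the cost of the memory hierarchy. The buffer sizes and access count are arbitrary assumptions.

    #include <stdlib.h>

    #define SMALL (4 * 1024)           /* fits easily in cache        */
    #define LARGE (64 * 1024 * 1024)   /* exceeds typical cache sizes */

    /* Cycle through the first n bytes of buf until 'touches'
       accesses have been made in total. */
    static long sweep(const char *buf, size_t n, long touches) {
        long sum = 0;
        for (long t = 0; t < touches; t++)
            sum += buf[t % n];
        return sum;
    }

    int main(void) {
        char *small = calloc(SMALL, 1);
        char *large = calloc(LARGE, 1);
        long touches = 100000000L;

        /* sweep(small, ...) measures almost pure cache hits;
           sweep(large, ...) also measures memory traffic.     */
        long a = sweep(small, SMALL, touches);
        long b = sweep(large, LARGE, touches);

        free(small);
        free(large);
        return (int)((a + b) & 1);  /* use the sums so the loops survive */
    }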

9.5 LOAD DRIVERS

In order to measure the performance of a computer system, it is necessary to have some means of putting loads on the system. Although our interest in load drivers here is purely for performance measurement, it must be pointed out that the same load drivers can also be used for other purposes, such as the following:


FIGURE 9.2  An RTE and a SUT.

  Component Certification: This requires rigorous testing of hardware and software components by imposing sequential and random combinations of workload demands.
  System Integration: This involves verifying that various hardware and software components of distributed systems work compatibly under different environments.
  Stress-Load Analysis: This requires putting high loads on the system to test for stable and error-free operation at these loads.
  Regression Testing: After every change in the system, the new version of the system should be tested to demonstrate that all previous capabilities are functional along with the new ones.

Three techniques have been used for load driving: internal drivers, live operators, and remote-terminal emulators.

The internal-driver method consists of loading programs directly into memory and executing them. If there is more than one program per user, the whole sequence of commands is put in a disk file and run as a batch job. The main problem with the internal-driver method is that the effect of the terminal communication overhead is not visible. Also, the loading overhead may affect the system performance.

One way to account for terminal communication overhead is to have live operators use the system. To test a multiuser system, many people would each need to sit at a terminal and execute a predetermined set of commands. This is a costly process and one that is difficult to control. The presence of the human element in the measurement increases the variance of the results, and increased variance means that more trials are required to obtain a desired level of confidence.
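
The effect of variance on the number of trials can be made concrete. Assuming the usual normal-approximation formula for sample size, n = (zs/e)^2, where s is the standard deviation of the measurements, e is the desired absolute accuracy, and z is the confidence coefficient, doubling the standard deviation quadruples the number of required trials. A minimal sketch with illustrative numbers:

    #include <math.h>
    #include <stdio.h>

    /* Trials needed so the sample mean is within +/- e of the true
       mean at the confidence implied by z (z = 1.96 for 95%).
       Standard normal-approximation formula. */
    static double trials_needed(double s, double e, double z) {
        return ceil((z * s / e) * (z * s / e));
    }

    int main(void) {
        /* Illustrative numbers: response-time standard deviations of
           0.5 s and 1.0 s, desired accuracy of +/- 0.1 s at 95%. */
        printf("s = 0.5: %.0f trials\n", trials_needed(0.5, 0.1, 1.96));
        printf("s = 1.0: %.0f trials\n", trials_needed(1.0, 0.1, 1.96));
        return 0;
    }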

The most desirable and popular method for putting the load on the system is to make use of computers. One computer can generally simulate many users in a very controlled and repeatable fashion. These computers are called Remote-Terminal Emulators (RTEs) (see Figure 9.2). The remainder of this chapter discusses the design and use of RTEs.
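
As a rough illustration of how a single computer can emulate many users, the sketch below round-robins over a set of simulated terminals, each working through its own copy of a command script. Everything here is a hypothetical stand-in: the send_command() stub, the script, and the scheduling. A production RTE would drive real terminal lines, honor per-user think times, and log every response.

    #include <stdio.h>

    #define USERS 3

    /* Hypothetical stand-in for transmitting one command to the
       SUT over an emulated terminal line. */
    static void send_command(int user, const char *cmd) {
        printf("user %d -> SUT: %s\n", user, cmd);
    }

    int main(void) {
        const char *script[] = { "LOGIN", "EDIT FILE", "COMPILE", "LOGOUT" };
        int steps = sizeof script / sizeof script[0];
        int pos[USERS] = { 0 };   /* each user's position in the script */
        int done = 0;

        /* Round-robin: one command per user per pass.  A real RTE
           would interleave users on timer and I/O events instead. */
        while (done < USERS) {
            for (int u = 0; u < USERS; u++) {
                if (pos[u] < steps) {
                    send_command(u, script[pos[u]]);
                    if (++pos[u] == steps)
                        done++;
                }
            }
        }
        return 0;
    }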

9.6 REMOTE-TERMINAL EMULATION

An RTE emulates the terminals, the terminal communication equipment, the operators, and the requests to be submitted to the System Under Test (SUT), as shown in Figure 9.3. In general, the RTE is a full-fledged computer that includes disks, magtapes, and at least one console terminal. In many cases, the RTE may be more powerful than the SUT. For example, super-minicomputers may be used to drive minicomputers and workstations. Most RTEs have their own operating system designed specifically for this real-time operation.


FIGURE 9.3  Components emulated by an RTE.


FIGURE 9.4  Sample scenario.

The RTE sends commands to the SUT at appropriate intervals. The user commands are read from a disk file called a script. The script file contains user commands as well as other instructions for the RTE, such as when the RTE should send out a command.
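
The concrete notation of a script is RTE specific. The two-directive format below and the interpreter sketch are purely illustrative assumptions: WAIT lines tell the RTE to pause (emulating operator think time), and TYPE lines carry the user commands to be sent to the SUT.

    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    /* Hypothetical script format, one directive per line:
         WAIT <seconds>   -- RTE instruction: pause for think time
         TYPE <command>   -- user command destined for the SUT    */
    int main(void) {
        char line[256];
        FILE *fp = fopen("user.scr", "r");   /* illustrative file name */
        if (fp == NULL)
            return 1;

        while (fgets(line, sizeof line, fp) != NULL) {
            int secs;
            if (sscanf(line, "WAIT %d", &secs) == 1)
                sleep(secs);              /* emulate operator think time */
            else if (strncmp(line, "TYPE ", 5) == 0)
                fputs(line + 5, stdout);  /* stand-in for sending to SUT */
        }
        fclose(fp);
        return 0;
    }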

Scripts written for one RTE system cannot be used on another RTE system because they may include an incompatible set of commands. For the same reason, scripts written for one SUT cannot be used on another SUT. To compare two incompatible SUTs, the workload should first be described in a manner independent of the SUT and the RTE. This description is called a scenario. An example of a scenario is shown in Figure 9.4.

