TABLE 4.1 Gibson Instruction Mix

| Instruction Type | Percentage (%) |
|---|---|
| 1. Load and Store | 31.2 |
| 2. Fixed-Point Add and Subtract | 6.1 |
| 3. Compares | 3.8 |
| 4. Branches | 16.6 |
| 5. Floating Add and Subtract | 6.9 |
| 6. Floating Multiply | 3.8 |
| 7. Floating Divide | 1.5 |
| 8. Fixed-Point Multiply | 0.6 |
| 9. Fixed-Point Divide | 0.2 |
| 10. Shifting | 4.4 |
| 11. Logical, And, Or | 1.6 |
| 12. Instructions not using registers | 5.3 |
| 13. Indexing | 18.0 |
| Total | 100.0 |
Instruction mixes have several disadvantages. Today's computers provide many more complex classes of instructions that are not reflected in the mixes. In modern computer systems, instruction time is highly variable depending upon addressing modes, cache hit rates, pipeline efficiency, and interference from other devices during processor-memory access cycles. The instruction times also vary according to parameter values such as the frequency of zeros, the distribution of zero digits in a multiplier, the average number of preshift positions in floating-point addition, and the number of times a conditional branch is taken. The mixes do not reflect the virtual addressing facilities (for example, page translation tables) that are provided by some processors.
Despite these limitations, instruction mixes do provide a single number for use in relative comparisons with other computers of similar architectures. Either this combined single time or a complete list of individual instruction times is useful in estimating the time required to execute key algorithms in applications and system programs. The inverse of average instruction time is commonly quoted as the MIPS (Millions of Instructions Per Second) or MFLOPS (Millions of Floating-Point Operations Per Second) rates for the processor.
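The weighted-average instruction time and the resulting MIPS rate can be sketched with a short calculation. The per-class instruction times below are hypothetical values chosen only for illustration; they are not measurements of any real processor (the weights are the Gibson percentages from Table 4.1):

```python
# Weighted-average instruction time from an instruction mix, and the
# MIPS rate as its inverse. Times (in microseconds) are hypothetical.

mix = [
    # (instruction class, weight %, time in microseconds -- assumed)
    ("Load and Store",                 31.2, 0.40),
    ("Fixed-Point Add and Subtract",    6.1, 0.20),
    ("Compares",                        3.8, 0.20),
    ("Branches",                       16.6, 0.30),
    ("Floating Add and Subtract",       6.9, 0.70),
    ("Floating Multiply",               3.8, 1.00),
    ("Floating Divide",                 1.5, 2.00),
    ("Fixed-Point Multiply",            0.6, 0.80),
    ("Fixed-Point Divide",              0.2, 1.50),
    ("Shifting",                        4.4, 0.30),
    ("Logical, And, Or",                1.6, 0.20),
    ("Instructions not using registers", 5.3, 0.20),
    ("Indexing",                       18.0, 0.30),
]

def average_time_us(mix):
    """Weighted average instruction time in microseconds."""
    total_weight = sum(w for _, w, _ in mix)
    return sum(w * t for _, w, t in mix) / total_weight

avg = average_time_us(mix)
# 1 instruction per avg microseconds = (1/avg) million instructions/second
mips = 1.0 / avg
```

The same function works with any mix and any set of measured times, which is how a single figure of merit is obtained for relative comparisons.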
It must be pointed out that instruction mixes measure only the speed of the processor. This may or may not have an effect on the total system performance when the system consists of many other components. System performance is limited by the performance of the bottleneck component, and unless the processor is the bottleneck (that is, the workload is mostly compute bound), the MIPS rate of the processor does not reflect the system performance.
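The bottleneck argument can be illustrated with a toy calculation. The per-job service demands below are hypothetical; the point is only that doubling processor speed leaves throughput unchanged when a device other than the processor is the bottleneck:

```python
# Throughput of a system is capped by its bottleneck component:
# at most 1 / (largest service demand) jobs per second.

def max_throughput(demands):
    """Upper bound on system throughput, jobs/second."""
    return 1.0 / max(demands.values())

# Hypothetical service demands in seconds per job; disk is the bottleneck.
demands = {"cpu": 0.02, "disk": 0.05}
before = max_throughput(demands)                 # 20 jobs/second

# Make the CPU twice as fast: the bound does not move.
demands_fast_cpu = {"cpu": 0.01, "disk": 0.05}
after = max_throughput(demands_fast_cpu)         # still 20 jobs/second
```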
The introduction of pipelining, instruction caching, and various address translation mechanisms made computer instruction times highly variable. An individual instruction could no longer be considered in isolation. Instead, it became more appropriate to consider a set of instructions, which constitutes a higher level function, a service provided by the processor. Researchers started making lists of such functions and using the most frequent function as the workload. Such a function is called a kernel. Since most of the initial kernels did not make use of the input/output (I/O) devices and concentrated solely on the processor performance, this class of kernels could be called processing kernels.
A kernel is a generalization of the instruction mix. The word kernel means nucleus. In some specialized applications, one can identify a set of common operations, for example, matrix inversion. Different processors can then be compared on the basis of their performance on this kernel operation. Some of the commonly used kernels are Sieve, Puzzle, Tree Searching, Ackermann's Function, Matrix Inversion, and Sorting. However, unlike instruction mixes, most kernels are not based on actual measurements of systems. Rather, they became popular after being used by a number of researchers trying to compare their processor architectures.
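As a concrete instance, the Sieve kernel mentioned above (the Sieve of Eratosthenes) can be written in a few lines. The version below is a minimal sketch: it exercises only the CPU and memory, with no I/O or operating system services, which is exactly what makes it a processing kernel:

```python
import time

def sieve(n):
    """Sieve of Eratosthenes: return all primes <= n."""
    is_prime = [True] * (n + 1)
    is_prime[0:2] = [False, False]          # 0 and 1 are not prime
    for i in range(2, int(n ** 0.5) + 1):
        if is_prime[i]:
            # Cross out every multiple of i, starting at i*i.
            for j in range(i * i, n + 1, i):
                is_prime[j] = False
    return [i for i, p in enumerate(is_prime) if p]

# Timing the kernel gives a single number for comparing processors.
start = time.perf_counter()
sieve(100_000)
elapsed = time.perf_counter() - start       # compare across machines
```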
Most of the disadvantages of instruction mixes also apply to kernels, although some of the disadvantages related to parameter values, such as frequency of zeros and frequency of branches, no longer apply. The main disadvantage of kernels is that they do not typically make use of I/O devices, and thus, the kernel performance does not reflect the total system performance.
The processing kernels do not make use of any operating system services or I/O devices. As the applications of computer systems proliferate, systems are no longer used for processing-only applications. Input/output operations have become an important part of real workloads. Initial attempts to measure I/O performance led analysts to develop simple exerciser loops that make a specified number of service calls or I/O requests. This allows them to compute the average CPU time and elapsed time per service call. To maintain portability across operating systems, such exercisers are usually written in high-level languages such as FORTRAN or Pascal.
The first exerciser loop was proposed by Buchholz (1969), who called it a synthetic program. A sample exerciser is shown in Figure 4.1. It makes a number of I/O requests. By adjusting the control parameters, one can control the number of times the request is made. Exerciser loops are also used to measure operating system services such as process creation, forking, and memory allocation.
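A minimal sketch of such an exerciser, in the spirit of the one in Figure 4.1 (the figure itself is not reproduced here), is shown below. The request count and block size are arbitrary control parameters; the loop issues write requests to a temporary file and reports average CPU and elapsed time per request:

```python
import os
import tempfile
import time

def exercise_io(n_requests, block_size=4096):
    """Issue n_requests writes and return (avg CPU s, avg elapsed s) per request."""
    buf = b"x" * block_size
    fd, path = tempfile.mkstemp()
    try:
        cpu0, wall0 = time.process_time(), time.perf_counter()
        for _ in range(n_requests):
            os.write(fd, buf)     # the service call being exercised
            os.fsync(fd)          # force the request out to the device
        cpu = time.process_time() - cpu0
        wall = time.perf_counter() - wall0
    finally:
        os.close(fd)
        os.unlink(path)
    return cpu / n_requests, wall / n_requests

avg_cpu, avg_elapsed = exercise_io(100)
```

The same skeleton applies to exercising operating system services: replace the write call with, say, process creation, and the timing logic is unchanged.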
The main advantage of exerciser loops is that they can be quickly developed and given to different vendors. It is not necessary to use real data files, which may contain proprietary information. The programs can be easily modified and ported to different systems. Further, most exercisers have built-in measurement capabilities. Thus, once developed, the measurement process is automated and can be repeated easily on successive versions of the operating system to characterize relative performance gains or losses.