PART IV
EXPERIMENTAL DESIGN AND ANALYSIS
Performance often depends upon more than one factor such as the system and the workload. Proper analysis requires that the effects of each factor be isolated from those of others so that meaningful statements can be made about different levels of the factor, for instance, different systems. Such analysis is the main topic of this part. The techniques presented in this part will enable you to do the following:
- Design a proper set of experiments for measurement or simulation.
- Develop a model that best describes the data obtained.
- Estimate the contribution of each alternative (for example, each processor and each workload) to the performance.
- Isolate the measurement errors.
- Estimate confidence intervals for model parameters.
- Check if the alternatives are significantly different.
- Check if the model is adequate.
Chapter 16 introduces various types of experimental designs and defines several new terms. The remaining chapters in this part present techniques to analyze a number of popular designs.
CHAPTER 16
INTRODUCTION TO EXPERIMENTAL DESIGN
The first ninety percent of the task takes ten percent of the time, and the last ten percent takes the other ninety percent.
Ninety-ninety rule of project schedules
The goal of a proper experimental design is to obtain the maximum information with the minimum number of experiments. This saves considerable labor that would otherwise be spent gathering data. A proper analysis of experiments also helps in separating out the effects of the various factors that might affect performance. In addition, it makes it possible to determine whether a factor has a significant effect or whether the observed difference is simply due to random variations caused by measurement errors and parameters that were not controlled.
Several new terms that are used in experimental design and analysis are explained first in Section 16.1. There are numerous possible experimental designs. Some sample designs are briefly described in Section 16.3. Of these, some are more popular and generally applicable than others.
16.1 TERMINOLOGY
This section explains the terms that are used in experimental design and analysis. It does so by using the example of a personal workstation design study. The problem is to design a personal workstation, where several choices have to be made. First, a microprocessor has to be chosen for the CPU. The alternatives are the 68000, Z80, or 8086 microprocessor. Second, a memory size of 512 kbytes, 2 Mbytes, or 8 Mbytes has to be chosen. Third, the workstation could have one, two, three, or four disk drives. Fourth, the workload on the workstations could be one of three types: secretarial, managerial, or scientific. Performance also depends on user characteristics, such as whether users are at a high school, college, or postgraduate level.
The following terms are frequently used in the design and analysis of experiments:
- Response Variable: The outcome of an experiment is called the response variable. Generally the response variable is the measured performance of the system. For example, in the workstation design study the response variable could be the throughput expressed in tasks completed per unit time, or response time for tasks, or any other metric. Since the techniques of experimental design are applicable for any kind of measurements, not just performance measurements, the more general term response is used in place of performance.
- Factors: Each variable that affects the response variable and has several alternatives is called a factor. For example, there are five factors in the workstation design study. The factors are CPU type, memory size, number of disk drives, workload used, and the user's educational level. The factors are also called predictor variables or predictors.
- Levels: The values that a factor can assume are called its levels. In other words, each factor level constitutes one alternative for that factor. For example, in the workstation design study the CPU type has three levels: 68000, Z80, or 8086. Memory size has three levels: 512 kbytes, 2 Mbytes, or 8 Mbytes. The number of disk drives has four levels: 1, 2, 3, or 4. The workload has three levels: secretarial, managerial, or scientific. Finally, users could be placed in one of three educational levels: high school graduates, college graduates, and postgraduates. An alternative term, treatment, is also used in experimental design literature in place of level.
- Primary Factors: The factors whose effects need to be quantified are called primary factors. For example, in the workstation design study, one may be primarily interested in quantifying the effects of only CPU type, memory size, and number of disk drives. Thus, there are three primary factors in this case.
- Secondary Factors: Factors that impact the performance but whose impact we are not interested in quantifying are called secondary factors. For example, in the workstation study we may not be interested in determining whether performance with postgraduates is better than that with college graduates. Similarly, we do not want to quantify the difference between the three workloads. These are the secondary factors.
- Replication: Repetition of all or some experiments is called replication. For example, if all experiments in a study are repeated three times, the study is said to have three replications.
- Design: An experimental design consists of specifying the number of experiments, the factor level combinations for each experiment, and the number of replications of each experiment. For example, in the workstation design study, we could perform experiments corresponding to all possible combinations of levels of five factors. This would require 3 × 3 × 4 × 3 × 3, or 324, experiments. We could repeat each experiment five times, leading to a total of 324 × 5, or 1620, observations. This is one possible experimental design. Later, in Section 16.3, several other possible experimental designs will be described.
- Experimental Unit: Any entity that is used for the experiment is called an experimental unit. Generally only those experimental units that are considered as one of the factors in the study are of interest. For example, in the workstation design study, the users hired to use the workstation while measurements are being performed could be considered the experimental unit. Other examples of experimental units are patients in medical experiments or land used in agricultural experiments. In all such cases, we are really not interested in comparing the experimental units, although they affect the response. Therefore, one goal of the experimental design is to minimize the impact of variation among the experimental units.
- Interaction: Two factors A and B are said to interact if the effect of one depends upon the level of the other. For example, Table 16.1 shows the performance of a system with two factors. As factor A is changed from level A1 to level A2, the performance increases by 2 regardless of the level of factor B. In this case there is no interaction. Table 16.2 shows another possibility. In this case, as factor A is changed from level A1 to level A2, the performance increases either by 2 or by 3 depending upon whether B is at level B1 or level B2, respectively. The two factors interact in this case. A graphical presentation of this example is given in Figure 16.1. In case (a), the lines are parallel, indicating no interaction. In case (b), the lines are not parallel, indicating interaction.
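The Design and Interaction definitions above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not a prescribed procedure: the names `full_factorial` and `interaction_contrast` are hypothetical helpers, and the response values in the two 2 × 2 tables are made-up numbers chosen only to reproduce the increments described in the text (2 and 2 without interaction, 2 and 3 with interaction). The first part enumerates every factor level combination of the workstation study and counts the observations for five replications. The second part compares the effect of A at the two levels of B.

```python
from itertools import product

# Factor levels from the workstation design study in the text.
factors = {
    "cpu": ["68000", "Z80", "8086"],
    "memory": ["512 kbytes", "2 Mbytes", "8 Mbytes"],
    "disk_drives": [1, 2, 3, 4],
    "workload": ["secretarial", "managerial", "scientific"],
    "user_level": ["high school", "college", "postgraduate"],
}


def full_factorial(factors):
    """Return one dict per experiment, covering every level combination."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


design = full_factorial(factors)
replications = 5  # each experiment repeated five times, as in the text
n_experiments = len(design)                    # 3 * 3 * 4 * 3 * 3
n_observations = n_experiments * replications  # experiments * replications


def interaction_contrast(table):
    """Difference between the effect of A at B2 and the effect of A at B1.

    `table` maps (A level, B level) to a response. A result of zero means
    the effect of A does not depend on B, i.e. no interaction.
    """
    effect_at_b1 = table[("A2", "B1")] - table[("A1", "B1")]
    effect_at_b2 = table[("A2", "B2")] - table[("A1", "B2")]
    return effect_at_b2 - effect_at_b1


# Hypothetical responses (assumed values, not taken from the tables).
no_interaction = {("A1", "B1"): 3, ("A2", "B1"): 5,
                  ("A1", "B2"): 6, ("A2", "B2"): 8}
with_interaction = {("A1", "B1"): 3, ("A2", "B1"): 5,
                    ("A1", "B2"): 6, ("A2", "B2"): 9}

print(n_experiments, n_observations)
print(interaction_contrast(no_interaction),
      interaction_contrast(with_interaction))
```

A zero contrast corresponds to the parallel lines of the no-interaction case, and a nonzero contrast corresponds to lines that are not parallel.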