Small sample sizes can be neatly disguised by specifying the percentage. Saying that 83.3% of the universities in town use system X is more impressive than saying that five of the six universities use the system.
The base in the percentage should be the initial value, that is, the value that comes first in the time order. Many people, particularly those trying to market a product aggressively, ignore this and specify percentages with respect to final values. For example, when they say that memory prices have gone down by 400%, it sounds like they are now paying you to buy memory. Actually, current prices are one-fifth of the original, which is an 80% decrease.
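The arithmetic behind the memory-price example can be checked with a minimal sketch. The function name `percent_decrease` and the prices are hypothetical, chosen only so that the current price is one-fifth of the original.

```python
def percent_decrease(initial, final, base="initial"):
    """Percentage drop from `initial` to `final`, relative to the chosen base."""
    denominator = initial if base == "initial" else final
    return 100.0 * (initial - final) / denominator

# Hypothetical prices: memory that cost 5.0 units now costs 1.0 unit.
old_price, new_price = 5.0, 1.0

# Correct base: the initial value (first in time order).
correct = percent_decrease(old_price, new_price, base="initial")

# Misleading base: the final value.
misleading = percent_decrease(old_price, new_price, base="final")

print(f"Relative to the initial price: {correct:.0f}% decrease")
print(f"Relative to the final price:   {misleading:.0f}% decrease")
```

The same price change yields either figure depending only on which value is placed in the denominator.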
11.5 STRATEGIES FOR WINNING A RATIO GAME
Having seen several examples of ratio games, one obvious question is: under what conditions can the conclusions be reversed by changing the base? This is the topic of this section. Only games involving the choice of a base system are considered. Application to other types of ratio games is straightforward. To win a ratio game, the guidelines are as follows:
- 1. If one system is better on all benchmarks, contradicting conclusions cannot be drawn by any ratio game technique. A contradicting conclusion means that one system is the best with one base and another is the best on some other base. Notice that in both case studies of the ratio games given in Section 11.1, the measurements were such that one system was better on some benchmark and worse on other benchmarks. If this is not the case, then the same system will come out the best using any base. For example, Table 11.9 shows the execution times of System A and System B on benchmarks I and J. The times relative to System A and System B are also shown in the table. In all three cases shown, System A is the better of the two systems.
- 2. Even if one system is better than the other on all benchmarks, a better relative performance can be shown by selecting the appropriate base. Thus, although the three analyses shown in Table 11.9 show System A to be better, System A is 40% better than System B using raw data, 43% better using System A as a base, and 42% better using System B as a base. The System A designers would therefore prefer to use System A as the base.
TABLE 11.9 An Example with One System Better on Both Metrics

| Benchmark | Raw Measurements: System A | Raw Measurements: System B | With A as a Base: System A | With A as a Base: System B | With B as a Base: System A | With B as a Base: System B |
|---|---|---|---|---|---|---|
| I | 0.50 | 1.00 | 1.00 | 2.00 | 0.50 | 1.00 |
| J | 1.00 | 1.50 | 1.00 | 1.50 | 0.67 | 1.00 |
| Average | 0.75 | 1.25 | 1.00 | 1.75 | 0.58 | 1.00 |
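The averages in Table 11.9 and the 40%, 43%, and 42% figures can be reproduced with a short sketch. It assumes that "percent better" means the gap between the two averages expressed as a fraction of System B's average; that definition is an assumption, although it does match the quoted numbers. The helper names are hypothetical.

```python
# Execution times from Table 11.9 (lower is better).
times = {
    "I": {"A": 0.50, "B": 1.00},
    "J": {"A": 1.00, "B": 1.50},
}

def average_ratios(times, base=None):
    """Average each system's times, optionally normalized per benchmark by `base`."""
    systems = sorted(next(iter(times.values())))
    averages = {}
    for s in systems:
        values = [
            row[s] / (row[base] if base else 1.0)
            for row in times.values()
        ]
        averages[s] = sum(values) / len(values)
    return averages

def percent_better(avg, winner="A", loser="B"):
    """Assumed definition: gap between averages as a percentage of the loser's."""
    return 100.0 * (avg[loser] - avg[winner]) / avg[loser]

results = {}
for label, base in [("raw", None), ("A as base", "A"), ("B as base", "B")]:
    avg = average_ratios(times, base)
    results[label] = percent_better(avg)
    print(f"{label:10s} A={avg['A']:.2f} B={avg['B']:.2f} "
          f"A better by {results[label]:.1f}%")
```

System A has the lower average under every base, so the ranking never flips; only the size of the margin changes.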
- 3. If a system is better on some benchmarks and worse on others, contradicting conclusions can be drawn in some cases, but not in every case. The easiest way to verify whether a contradictory conclusion can be drawn for a particular data set is to try all possible bases. The following rules may help select the base.
- 4. If the performance metric is an LB (lower is better) metric, it is better to use your system as the base. The execution time is an LB metric, and as shown in Table 11.9, System A designers would be better off using System A as the base. System B designers would prefer to use System B as the base.
- 5. If the performance metric is an HB (higher is better) metric, it is better to use your opponent as the base. Throughput, efficiency, MIPS, and MFLOPS are examples of HB metrics. In these cases, a higher average ratio would be obtained for System A if System B is used as a base.
- 6. Those benchmarks that perform better on your system should be elongated and those that perform worse should be shortened. The time duration of the benchmarks is often adjustable. For example, in the Sieve benchmark (described in Section 4.6), the number of prime numbers to be generated can be set by the experimenter. If System A performs better than System B on the Sieve benchmark, choosing to generate a larger number of prime numbers would make the results more favorable to System A designers if a ratio-of-totals technique is used to compare the two systems. This in effect increases the weight on the favorable benchmark. Shortening unfavorable benchmarks also has the same effect.
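Guidelines 3, 4, and 6 can be illustrated with a small sketch. The execution times below are hypothetical (they are not from the text), the benchmark name "Sort" is invented for the example, and the scaling step assumes that run time grows linearly with the benchmark's length.

```python
# Hypothetical execution times (lower is better); not taken from the text.
times = {
    "Sieve": {"A": 1.0, "B": 2.0},  # System A is faster here
    "Sort":  {"A": 4.0, "B": 2.0},  # System B is faster here
}

def winner_by_average_ratio(times, base):
    """System with the lowest average of per-benchmark ratios to `base`."""
    systems = next(iter(times.values())).keys()
    avg = {
        s: sum(row[s] / row[base] for row in times.values()) / len(times)
        for s in systems
    }
    return min(avg, key=avg.get)

def ratio_of_totals(times, scale=None):
    """Total time of A divided by total time of B, after scaling benchmarks."""
    scale = scale or {}
    total = {"A": 0.0, "B": 0.0}
    for name, row in times.items():
        k = scale.get(name, 1.0)  # assumption: time scales linearly with length
        for s in total:
            total[s] += k * row[s]
    return total["A"] / total["B"]

# Guidelines 3 and 4: try every base and see whether the winner changes.
winners = {base: winner_by_average_ratio(times, base) for base in ("A", "B")}
print("Winner by base:", winners)

# Guideline 6: elongate the benchmark that favors System A.
print("Ratio of totals, as measured:   ", ratio_of_totals(times))
print("Ratio of totals, Sieve 4x longer:", ratio_of_totals(times, {"Sieve": 4.0}))
```

With these numbers each system wins when it is its own base, and making the Sieve run four times longer moves the ratio of totals from above 1 (System B ahead) to below 1 (System A ahead).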
Once again, remember that taking an average of ratios is not a correct way to analyze the data. Unfortunately, it is done so often that it is useful to know the rules for self-protection. The correct method to analyze such data is by using techniques to analyze experimental designs, as discussed later in Part IV of this book. For mathematically oriented readers, a derivation of these rules follows.
TABLE 11.10 Derivation of the Rules
