Contents of /License/Flops.txt


Revision 11
Wed Feb 10 18:21:00 2010 UTC by sho1get
File MIME type: text/plain
File size: 4224 byte(s)


/*****************************/
/*          FLOPS.c          */
/* Version 2.0,  18 Dec 1992 */
/*         Al Aburto         */
/*  aburto@marlin.nosc.mil   */
/*       'ala' on BIX        */
/*****************************/


Flops.c is a 'c' program which attempts to estimate your system's
floating-point 'MFLOPS' rating for the FADD, FSUB, FMUL, and FDIV
operations based on specific 'instruction mixes' (discussed below).
The program provides an estimate of PEAK MFLOPS performance by making
maximal use of register variables with minimal interaction with main
memory. The execution loops are all small so that they will fit in
any cache. Flops.c can be used along with Linpack and the Livermore
kernels (which exercise memory much more extensively) to gain further
insight into the limits of system performance. The flops.c execution
modules also include various percent weightings of FDIV's (from 0% to
25% FDIV's) so that the range of performance can be obtained when
using FDIV's. FDIV's, being computationally more intensive than
FADD's or FMUL's, can impact performance considerably on some systems.
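 
As a rough illustration of what an 'MFLOPS' rating means, the sketch
below converts a floating-point operation count and an elapsed time
into millions of operations per second. The function name mflops and
its parameters are assumptions made for this example; they are not
taken from flops.c, which may organize its arithmetic differently.

```c
/* Hypothetical helper, not part of flops.c.
   flops_per_iter : floating-point operations in one pass of a loop
   iterations     : number of passes executed
   seconds        : elapsed time for all passes
   Returns millions of floating-point operations per second. */
double mflops(double flops_per_iter, double iterations, double seconds)
{
    if (seconds <= 0.0)
        return 0.0;   /* timer too coarse to resolve the run */
    return flops_per_iter * iterations / (seconds * 1.0e6);
}
```

For example, a loop with 14 operations per pass that completes one
million passes in one second would rate 14 MFLOPS.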
 
Flops.c consists of 8 independent modules (routines) which, except for
module 2, conduct numerical integration of various functions. Module
2 estimates the value of pi based upon the Maclaurin series expansion
of atan(1). MFLOPS ratings are provided for each module, but the
program's overall results are summarized by the MFLOPS(1), MFLOPS(2),
MFLOPS(3), and MFLOPS(4) outputs.
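 
The series approach mentioned for module 2 can be sketched as follows.
This is only an illustration of summing atan(1) = 1 - 1/3 + 1/5 - ...
and multiplying by 4; the function name pi_estimate and its variables
are assumptions for this example and the code is not the module 2
source, which performs a different mix of operations per pass.

```c
/* Sketch only, not the module 2 source.
   Sums n terms of atan(1) = 1 - 1/3 + 1/5 - ... and returns 4 * sum. */
double pi_estimate(long n)
{
    double s = 1.0;   /* alternating sign, carried from pass to pass */
    double u = 1.0;   /* odd denominator: 1, 3, 5, ... */
    double w = 0.0;   /* running sum of the series */
    long i;

    for (i = 0; i < n; i++) {
        w = w + s / u;   /* one FDIV and one FADD */
        u = u + 2.0;     /* one FADD */
        s = -s;          /* each pass needs the sign left by the previous one */
    }
    return 4.0 * w;
}
```

The series converges slowly, with an error of roughly 1/n after n
terms, so a large n is needed for even a few correct digits. The sign
variable is updated from its own previous value on every pass, which
is the kind of dependence that keeps a loop from being vectorized
easily.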
 
The MFLOPS(1) result is identical to the result provided by all
previous versions of flops.c. It is based only upon the results from
modules 2 and 3. Two problems surfaced in using MFLOPS(1). First, it
was difficult to completely 'vectorize' the result due to the
recurrence of the 's' variable in module 2. This problem is addressed
in the MFLOPS(2) result which does not use module 2, but maintains
nearly the same weighting of FDIV's (9.2%) as in MFLOPS(1) (9.6%).
The second problem with MFLOPS(1) centers around the percentage of
FDIV's (9.6%) which was viewed as too high for an important class of
problems. This concern is addressed in the MFLOPS(3) result, which
lowers the weighting of FDIV's to 3.4%, and in the MFLOPS(4) result,
where NO FDIV's are conducted at all.
 
The number of floating-point instructions per iteration (loop) is
given below for each module executed:

MODULE   FADD   FSUB   FMUL   FDIV   TOTAL  Comment
  1         7      0      6      1      14  7.1%  FDIV's
  2         3      2      1      1       7  difficult to vectorize.
  3         6      2      9      0      17  0.0%  FDIV's
  4         7      0      8      0      15  0.0%  FDIV's
  5        13      0     15      1      29  3.4%  FDIV's
  6        13      0     16      0      29  0.0%  FDIV's
  7         3      3      3      3      12  25.0% FDIV's
  8        13      0     17      0      30  0.0%  FDIV's

A*2+3      21     12     14      5      52  A=5, MFLOPS(1), Same as
        40.4%  23.1%  26.9%   9.6%          previous versions of the
                                            flops.c program. Includes
                                            only Modules 2 and 3, does
                                            9.6% FDIV's, and is not
                                            easily vectorizable.

1+3+4      58     14     66     14     152  A=4, MFLOPS(2), New output
+5+6+   38.2%   9.2%  43.4%   9.2%          does not include Module 2,
A*7                                         but does 9.2% FDIV's.

1+3+4      62      5     74      5     146  A=0, MFLOPS(3), New output
+5+6+   42.5%   3.4%  50.7%   3.4%          does not include Module 2,
7+8                                         but does 3.4% FDIV's.

3+4+6      39      2     50      0      91  A=0, MFLOPS(4), New output
+8      42.9%   2.2%  54.9%   0.0%          does not include Module 2,
                                            and does NO FDIV's.
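 
The composite rows of the table can be checked with a few lines of
arithmetic. The sketch below copies the per-module counts from the
table and sums them under a set of integer weights. The names OPS,
mix_total and fdiv_percent are assumptions made for this example; it
only reproduces the instruction counts and says nothing about how
flops.c itself combines its timings.

```c
/* Per-iteration counts copied from the table; row 0 is module 1. */
static const int OPS[8][4] = {
    /* FADD FSUB FMUL FDIV */
    {  7,   0,   6,   1 },
    {  3,   2,   1,   1 },
    {  6,   2,   9,   0 },
    {  7,   0,   8,   0 },
    { 13,   0,  15,   1 },
    { 13,   0,  16,   0 },
    {  3,   3,   3,   3 },
    { 13,   0,  17,   0 }
};

/* Hypothetical helper: weight[m] is how many times module m+1 counts.
   Fills mix[0..3] with the weighted FADD, FSUB, FMUL and FDIV counts
   and returns their total. */
int mix_total(const int weight[8], int mix[4])
{
    int total = 0;
    int k, m;

    for (k = 0; k < 4; k++) {
        mix[k] = 0;
        for (m = 0; m < 8; m++)
            mix[k] += weight[m] * OPS[m][k];
        total += mix[k];
    }
    return total;
}

/* Share of FDIV's in a weighted mix, in percent. */
double fdiv_percent(const int weight[8])
{
    int mix[4];
    int total = mix_total(weight, mix);

    return total > 0 ? 100.0 * mix[3] / total : 0.0;
}
```

With weights {0,5,1,0,0,0,0,0}, which correspond to A*2+3 with A=5,
mix_total gives 21, 12, 14 and 5 for a total of 52, matching the
MFLOPS(1) row. Weights {1,0,1,1,1,1,4,0} give the MFLOPS(2) total of
152.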
 
NOTE: Various timer routines are included as indicated below. The
      timer routines, with some comments, are attached at the end
      of the main program.
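 
To show the general shape of a timer routine, here is a minimal
sketch built on the standard C clock() call. It is not one of the
routines shipped with the program; the names cpu_seconds and
time_loop are assumptions for this example, and clock() may be too
coarse on some systems for short runs.

```c
#include <time.h>

/* Hypothetical stand-in for a timer routine: CPU time used by the
   process so far, in seconds, from the standard clock() call. */
double cpu_seconds(void)
{
    return (double)clock() / (double)CLOCKS_PER_SEC;
}

/* Brackets a simple workload with two timer readings and returns
   the elapsed CPU time in seconds. */
double time_loop(long n)
{
    volatile double x = 0.0;   /* volatile keeps the loop from being removed */
    double start = cpu_seconds();
    long i;

    for (i = 0; i < n; i++)
        x = x + 1.0;
    return cpu_seconds() - start;
}
```

A run that returns zero means the workload finished within a single
clock tick, so the iteration count should be raised until the elapsed
time is well above the timer's resolution.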
 
NOTE: Please do not remove any of the printouts.

EXAMPLE COMPILATION:
UNIX based systems
    cc -DUNIX -O flops20.c -o flops
    cc -DUNIX -DROPT flops20.c -o flops
    cc -DUNIX -fast -O4 flops20.c -o flops
             .
             .
             .
           etc.

Al Aburto
aburto@marlin.nosc.mil
