Annotation of /License/Flops.txt

Revision 11
Wed Feb 10 18:21:00 2010 UTC (14 years, 2 months ago) by sho1get
File MIME type: text/plain
File size: 4224 byte(s)


/*****************************/
/* FLOPS.c */
/* Version 2.0, 18 Dec 1992 */
/* Al Aburto */
/* aburto@marlin.nosc.mil */
/* 'ala' on BIX */
/*****************************/

Flops.c is a C program which attempts to estimate your system's
floating-point 'MFLOPS' rating for the FADD, FSUB, FMUL, and FDIV
operations based on specific 'instruction mixes' (discussed below).
The program provides an estimate of PEAK MFLOPS performance by making
maximal use of register variables with minimal interaction with main
memory. The execution loops are all small so that they will fit in
any cache. Flops.c can be used along with Linpack and the Livermore
kernels (which exercise memory much more extensively) to gain further
insight into the limits of system performance. The flops.c execution
modules also include various percent weightings of FDIV's (from 0% to
25% FDIV's) so that the range of performance can be obtained when
using FDIV's. FDIV's, being computationally more intensive than
FADD's or FMUL's, can impact performance considerably on some systems.

Flops.c consists of 8 independent modules (routines) which, except for
module 2, conduct numerical integration of various functions. Module 2
estimates the value of pi based upon the Maclaurin series expansion
of atan(1). MFLOPS ratings are provided for each module, but the
program's overall results are summarized by the MFLOPS(1), MFLOPS(2),
MFLOPS(3), and MFLOPS(4) outputs.

The MFLOPS(1) result is identical to the result provided by all
previous versions of flops.c. It is based only upon the results from
modules 2 and 3. Two problems surfaced in using MFLOPS(1). First, it
was difficult to completely 'vectorize' the result due to the
recurrence of the 's' variable in module 2. This problem is addressed
in the MFLOPS(2) result which does not use module 2, but maintains
nearly the same weighting of FDIV's (9.2%) as in MFLOPS(1) (9.6%).
The second problem with MFLOPS(1) centers around the percentage of
FDIV's (9.6%) which was viewed as too high for an important class of
problems. This concern is addressed in the MFLOPS(3) result, which
does only 3.4% FDIV's, and in the MFLOPS(4) result, where NO FDIV's
are conducted at all.

The number of floating-point instructions per iteration (loop) is
given below for each module executed:

MODULE   FADD   FSUB   FMUL   FDIV   TOTAL  Comment
  1        7      0      6      1      14   7.1%  FDIV's
  2        3      2      1      1       7   difficult to vectorize.
  3        6      2      9      0      17   0.0%  FDIV's
  4        7      0      8      0      15   0.0%  FDIV's
  5       13      0     15      1      29   3.4%  FDIV's
  6       13      0     16      0      29   0.0%  FDIV's
  7        3      3      3      3      12   25.0% FDIV's
  8       13      0     17      0      30   0.0%  FDIV's

A*2+3     21     12     14      5      52   A=5, MFLOPS(1), Same as
          40.4%  23.1%  26.9%  9.6%         previous versions of the
                                            flops.c program. Includes
                                            only Modules 2 and 3, does
                                            9.6% FDIV's, and is not
                                            easily vectorizable.

1+3+4     58     14     66     14     152   A=4, MFLOPS(2), New output
+5+6+     38.2%  9.2%   43.4%  9.2%         does not include Module 2,
A*7                                         but does 9.2% FDIV's.

1+3+4     62      5     74      5     146   A=0, MFLOPS(3), New output
+5+6+     42.5%  3.4%   50.7%  3.4%         does not include Module 2,
7+8                                         but does 3.4% FDIV's.

3+4+6     39      2     50      0      91   A=0, MFLOPS(4), New output
+8        42.9%  2.2%   54.9%  0.0%         does not include Module 2,
                                            and does NO FDIV's.

NOTE: Various timer routines are included as indicated below. The
      timer routines, with some comments, are attached at the end
      of the main program.

NOTE: Please do not remove any of the printouts.

EXAMPLE COMPILATION:
UNIX based systems
    cc -DUNIX -O flops20.c -o flops
    cc -DUNIX -DROPT flops20.c -o flops
    cc -DUNIX -fast -O4 flops20.c -o flops
    .
    .
    .
    etc.

Al Aburto
aburto@marlin.nosc.mil
