irix-657m-src/eoe/cmd/bru/doc/notes/performance
2022-09-29 17:59:04 +03:00

(1) INTRODUCTION
To discover where bru spends most of its time, and how
inclusion of the macro-based debugger package (dbug) affects
execution, bru was profiled both with and without the
debugger included. The results are presented here.
First some information about the machine and the version of
bru used:
    Bru version:        4.8
    Machine:            Callan Unistar 200
    Cpu:                68000
    Cpu clock:          8 MHz
    Wait states:        2
    Archive device:     MPI 8304 (640Kb floppy)
    Archive size:       272Kb
    Files in archive:   32
(2) TIMING TESTS
The raw statistics from timing execution with the "time" command
are:
              No dbug                With dbug
Mode      Real   User   Sys      Real   User   Sys
----    -------------------    -------------------
c       1:05.8    3.9   2.4    1:08.7    7.4   2.7
d       1:15.1    5.0   4.6    1:19.7   10.0   4.5
e          6.4    0.7   2.9       7.6    1.2   2.9
i       1:00.9    3.8   2.2    1:04.9    7.4   2.3
t         31.5    2.8   3.9      33.1    4.9   4.2
x       1:15.9    3.9   6.0    1:16.5    8.5   5.8
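
As a rough check on how much of the elapsed time is cpu time, the
table above can be reduced to (user + sys) / real for each mode.
The following Python sketch does that arithmetic. The names TIMES,
to_seconds and cpu_share are illustrative only and are not part of
bru; the figures are copied from the table.

```python
# Timing table from section (2): (real, user, sys) for each mode,
# first without dbug, then with dbug. Values copied as printed.
TIMES = {
    "c": (("1:05.8", 3.9, 2.4), ("1:08.7", 7.4, 2.7)),
    "d": (("1:15.1", 5.0, 4.6), ("1:19.7", 10.0, 4.5)),
    "e": (("6.4", 0.7, 2.9), ("7.6", 1.2, 2.9)),
    "i": (("1:00.9", 3.8, 2.2), ("1:04.9", 7.4, 2.3)),
    "t": (("31.5", 2.8, 3.9), ("33.1", 4.9, 4.2)),
    "x": (("1:15.9", 3.9, 6.0), ("1:16.5", 8.5, 5.8)),
}


def to_seconds(stamp):
    """Convert 'M:SS.s' or 'SS.s' (as printed by time) to seconds."""
    minutes, _, seconds = stamp.rpartition(":")
    return (int(minutes) if minutes else 0) * 60 + float(seconds)


def cpu_share(real, user, sys_time):
    """Fraction of elapsed time accounted for by user + system cpu time."""
    return (user + sys_time) / to_seconds(real)


for mode, (plain, dbug) in TIMES.items():
    print(f"{mode}: no dbug {cpu_share(*plain):6.1%}   with dbug {cpu_share(*dbug):6.1%}")
```

For mode c without dbug this gives roughly 10 percent, while mode e
comes out above 50 percent in both builds.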
(3) PROFILING TESTS
The profiling tests were run using the System V "prof" facility.
The following tables list at least the top 10 functions for
each mode. Because the top functions are not always the same
from mode to mode, each table contains more than 10 entries.
          Percent time spent in the function
                  (dbug package used)

Function        c      d      e      i      t      x
--------    -----  -----  -----  -----  -----  -----
__doprnt    22.55   5.73  14.43   7.60   7.09   5.59
_chksum     19.57  37.16   0.00  49.54  25.59  42.57
_diff        0.00  19.27   0.00   0.00   0.00   0.00
_zeroblk    10.07   0.00   0.00   0.00   0.00   0.00
aldiv        6.10   0.69   2.32   0.61   1.18   0.25
lrem         5.11   0.46   3.61   0.00   9.06   0.25
_memcpy      3.26   0.92   1.55   1.82   0.79   0.25
mcount       2.55   0.23   1.29   0.61   1.57   2.02
_sprintf     2.13   0.00   0.00   1.22   0.39   0.00
_fromhex     0.00   2.52   0.00   3.04   4.33   2.02
smul         0.43   0.92   0.52   0.30   0.00   0.76
_tree_wa     0.14   0.00   3.87   0.00   0.00   0.00
_fwrite      1.13   0.46   3.87   0.91   2.36   0.76
__flsbuf     0.00   0.46   3.09   0.00   2.36   0.76
ldiv         0.43   0.46   2.58   0.00   1.57   0.76
_tohex       1.99   0.46   0.00   1.22   0.79   0.50
_memccpy     0.14   0.46   0.26   0.30   2.76   0.50
_ar_seek     0.57   0.69   0.00   0.91   0.39   1.26
_ar_read     0.00   0.46   0.00   0.61   0.00   1.01

(debugger functions)
__db_key     5.25  10.09  19.33  11.85  13.78  16.62
__db_pri     1.42   1.83   3.35   2.74   2.36   3.37
__db_ent     1.56   5.05   7.22   3.04   4.72   4.03
__db_ret     3.26   4.59   8.25   2.13   4.72   3.27
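
The share of profiled time spent in the debugger for each mode can
be obtained by summing the four debugger rows above. A small Python
sketch of that sum follows. MODES, DBUG_ROWS and dbug_share are
illustrative names, and the sketch assumes the four listed rows are
the only debugger routines that matter (the table shows only the top
functions).

```python
MODES = ("c", "d", "e", "i", "t", "x")

# Debugger rows copied from the "dbug package used" table.
DBUG_ROWS = {
    "__db_key": (5.25, 10.09, 19.33, 11.85, 13.78, 16.62),
    "__db_pri": (1.42, 1.83, 3.35, 2.74, 2.36, 3.37),
    "__db_ent": (1.56, 5.05, 7.22, 3.04, 4.72, 4.03),
    "__db_ret": (3.26, 4.59, 8.25, 2.13, 4.72, 3.27),
}


def dbug_share():
    """Return {mode: percent of profiled time in the listed dbug routines}."""
    return {
        mode: round(sum(row[i] for row in DBUG_ROWS.values()), 2)
        for i, mode in enumerate(MODES)
    }


for mode, pct in dbug_share().items():
    print(f"{mode}: {pct:5.2f}")
```

The sums come out to 11.49, 21.56, 38.15, 19.76, 25.58 and 27.29
percent for modes c, d, e, i, t and x respectively.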
          Percent time spent in the function
                (dbug package not used)

Function        c      d      e      i      t      x
--------    -----  -----  -----  -----  -----  -----
__doprnt    31.91   5.86  29.18  10.90   9.44   8.02
_chksum     22.27  53.52   0.00  65.88  37.22  61.79
aldiv        8.35   0.39   6.01   0.95   1.11   0.94
_zeroblk     6.85   0.00   0.00   0.00   0.00   0.00
lrem         4.71   0.39   1.72   0.47  15.56   0.00
_memcpy      3.85   1.17   2.58   1.42   1.67   1.89
_sleep       2.14   0.78   0.43   0.00   0.00   0.00
mcount       2.14   0.00   2.58   1.42   2.78   0.94
_tohex       1.71   0.39   0.00   0.00   0.00   0.47
_sprintf     1.71   0.00   0.00   0.47   0.56   1.42
_diff        0.00  23.05   0.00   0.00   0.00   0.00
_s_fprin     0.43   5.86   0.00   0.95   2.22   0.00
_fromhex     0.00   1.56   0.00   5.21   3.33   6.60
_verbosi     0.43   1.17   0.86   0.95   0.00   0.47
__flsbuf     0.46   1.17   4.29   0.00   2.22   0.94
__xflsbu     0.21   0.78   3.86   0.95   0.00   0.00
_ar_seek     0.21   0.78   0.00   0.00   1.67   0.00
ldiv         0.21   0.78   6.44   0.95   1.67   0.94
_tree_wa     0.21   0.00   3.86   0.47   0.00   0.00
_fwrite      1.07   0.39   3.43   0.95   1.11   0.94
smul         1.07   0.78   3.00   0.95   1.11   1.89
_estimat     0.00   0.00   2.58   0.00   0.00   0.00
_endpwen     0.00   0.00   0.00   0.95   1.11   0.00
_gmtime      0.00   0.00   0.00   0.00   2.22   0.00
__cleanu     0.00   0.00   1.72   0.47   1.67   0.00
_s_signa     0.00   0.00   0.00   0.00   1.42   1.42
_write       0.43   0.00   1.72   0.47   1.42   1.42
(4) CONCLUSIONS
1. Because the elapsed (real) time is so much larger
than the sum of the user and system times, bru
appears to be mostly I/O bound. Thus performance
will be highly dependent upon the speed of the
peripherals used as the archive device.
2. Because bru is I/O bound, inclusion of the debugger
does not significantly affect total execution time.
However, note that approximately 10-30 percent of
the time spent executing user code is spent in the
debugger routines. Thus in a multiuser environment,
inclusion of the debugger adds a significant cpu
burden to the system.
3. Depending upon mode, bru spends from 20-60 percent of
its time computing block checksums in the routine
"chksum". It might be worthwhile to recode this
particular routine in assembly language for a
specific implementation in a multiuser or
multiprocessing environment. If bru is used
mostly on a system in "single-user" mode and
is the only significant process running, there is
probably no performance advantage to be gained by
recoding chksum since bru is I/O bound anyway.
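
The last point can be put in numbers with a simple upper bound: even
if chksum cost nothing at all, the elapsed time could shrink by no
more than the cpu time chksum uses now. The Python sketch below
computes that bound from the figures in sections (2) and (3) for the
build without dbug. It assumes the prof percentages apply to user
cpu time only; that assumption, and the names max_elapsed_saving and
CASES, are illustrative and not taken from bru.

```python
def max_elapsed_saving(real_s, user_s, chksum_pct):
    """Upper bound on the fraction of elapsed time saved if chksum were free.

    Assumes chksum_pct is a percentage of user cpu time (an assumption).
    """
    return (user_s * chksum_pct / 100.0) / real_s


# (real seconds, user seconds, chksum percent), no-dbug build.
CASES = {
    "d": (75.1, 5.0, 53.52),
    "i": (60.9, 3.8, 65.88),
    "x": (75.9, 3.9, 61.79),
}

for mode, case in CASES.items():
    print(f"mode {mode}: at most {max_elapsed_saving(*case):.1%} of elapsed time")
```

Each bound comes out under 5 percent of elapsed time, which is
consistent with the view that recoding chksum mainly reduces cpu
load rather than run time.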