From cpirazzi Mon Nov 24 23:42:36 1997
Subject: tserialio loops

I noticed that you changed all the for loops from this sort of
thing:

1612:   urbidx = -1;
1613:   for(i=0; i < N_URB; i++)
1614:   {
1615:     if (urbtab[i].allocated == 0)
1616:     {
1617:       urbidx = i;
1618:       break;
1619:     }
1620:   }

to this sort of thing:

1629:   urb = (tsio_urb_t *)urbtab;
1630:   urbidx = -1;
1631:   for(i=0; i < N_URB; i++, urb++)
1632:   {
1633:     if (urb->allocated == 0)
1634:     {
1635:       urbidx = i;
1636:       break;
1637:     }
1638:   }

Despite what is commonly taught in compiler classes these days, I have
found that our compilers generate code for the first form that is as
good as or better than the code for the second form.

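For anyone who wants to try the comparison outside tserialio.c, here is a
minimal sketch of the two forms as standalone functions. Only the
`allocated` field and the names `urbtab`, `tsio_urb_t` and `N_URB` come
from the code above; the struct padding, the value 16 for `N_URB`
(inferred from the `li s4,16` in the listing), the static array, and the
helper function names are assumptions made up for illustration.

```c
#include <assert.h>

#define N_URB 16   /* assumed; inferred from "li s4,16" in the listing */

/* Hypothetical stand-in for the real tsio_urb_t. Only `allocated`
 * appears in the mail; the padding is invented so that the element
 * is 76 bytes on platforms with a 4-byte int. */
typedef struct {
    int  allocated;
    char other_fields[72];
} tsio_urb_t;

static tsio_urb_t urbtab[N_URB];

/* First form: index the table on every iteration. */
int find_free_indexed(void)
{
    int i, urbidx = -1;
    for (i = 0; i < N_URB; i++) {
        if (urbtab[i].allocated == 0) {
            urbidx = i;
            break;
        }
    }
    return urbidx;
}

/* Second form: carry a pointer along with the index. */
int find_free_pointer(void)
{
    tsio_urb_t *urb = urbtab;
    int i, urbidx = -1;
    for (i = 0; i < N_URB; i++, urb++) {
        if (urb->allocated == 0) {
            urbidx = i;
            break;
        }
    }
    return urbidx;
}

/* Helper (hypothetical): mark the first n slots busy, the rest free. */
void mark_first_busy(int n)
{
    int i;
    for (i = 0; i < N_URB; i++)
        urbtab[i].allocated = (i < n);
}

/* The two forms should agree for every fill level, full table included. */
void check_forms_agree(void)
{
    int n;
    for (n = 0; n <= N_URB; n++) {
        mark_first_busy(n);
        assert(find_free_indexed() == find_free_pointer());
    }
}
```

Since the two functions are behaviorally identical, any difference between
them comes down to the code the compiler emits, which is what the listing
in this mail compares.
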
|
I discovered that one reason for this is that all our compiler people
|
|
are ex-fortran heads, and in fortran the first form is the only
|
|
choice, so they spent all their time optimizing the first form :)
|
|
|
|
For example, I put the two code segments above in tserialio.c and
compiled it non-debug.

Here is the resulting assembly. I rearranged a few instructions
used by both loops to form the preamble:

preamble (used equally by both):

[1613] 0x f54:  ff b4 00 40  sd      s4,64(sp)
[1613] 0x f58:  24 14 00 10  li      s4,16
[1613] 0x f5c:  ff b6 00 48  sd      s6,72(sp)
[1613] 0x f7c:  ff b1 00 30  sd      s1,48(sp)

first form:

1612:   urbidx = -1;
1613:   for(i=0; i < N_URB; i++)
[1613] 0x f38:  00 00 10 25  move    v0,zero
[1613] 0x f50:  8f 85 88 04  lw      a1,-30716(gp)
        [f54 above]
        [f58 above]
        [f5c above]
[1612] 0x f60:  24 16 ff ff  li      s6,-1
[1613] 0x f64:  8c a3 00 00  lw      v1,0(a1)
[1613] 0x f68:  10 60 00 77  beq     v1,zero,0x1148
[1613] 0x f6c:  24 a5 00 4c  addiu   a1,a1,76
[1613] 0x f70:  24 42 00 01  addiu   v0,v0,1
[1613] 0x f74:  54 54 ff fc  bnel    v0,s4,0xf68
[1613] 0x f78:  8c a3 00 00  lw      v1,0(a1)
        [f7c above]
1614:   {
1615:     if (urbtab[i].allocated == 0)
1616:     {
1617:       urbidx = i;
1618:       break;
[1618] 0x1148:  10 00 ff 8c  b       0xf7c
[1617] 0x114c:  00 40 b0 25  move    s6,v0
1619:     }
1620:   }

second form:

1629:   urb = (tsio_urb_t *)urbtab;
1630:   urbidx = -1;
[1630] 0x f94:  24 16 ff ff  li      s6,-1
[1629] 0x f98:  8f 91 88 04  lw      s1,-30716(gp)
1631:   for(i=0; i < N_URB; i++, urb++)
[1631] 0x f9c:  00 00 10 25  move    v0,zero
[1631] 0x fa0:  8e 26 00 00  lw      a2,0(s1)
[1631] 0x fa4:  10 c0 00 6a  beq     a2,zero,0x1150
[1631] 0x fa8:  26 31 00 4c  addiu   s1,s1,76
[1631] 0x fac:  24 42 00 01  addiu   v0,v0,1
[1631] 0x fb0:  54 54 ff fc  bnel    v0,s4,0xfa4
[1631] 0x fb4:  8e 26 00 00  lw      a2,0(s1)
1632:   {
1633:     if (urb->allocated == 0)
1634:     {
1635:       urbidx = i;
1636:       break;
[1636] 0x1150:  10 00 ff 99  b       0xfb8
[1635] 0x1154:  00 40 b0 25  move    s6,v0
1637:     }
1638:   }

Notice that both loops are exactly 4 instructions long (plus the load
in the bnel delay slot), and both code segments are 11 instructions
long.

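Read back into C, the two listings have the same shape: a pointer stepped
by the element size (the `addiu ...,76`), a counter in v0 compared against
the 16 held in s4, and the result in s6. The sketch below is one reading of
that shape, not compiler output. The struct layout, the value of `N_URB`,
and the function names are assumptions introduced for illustration; only
`allocated`, `urbidx` and the register roles are taken from the mail.

```c
#include <assert.h>

#define N_URB 16   /* assumed; inferred from "li s4,16" */

/* Hypothetical 76-byte element (with a 4-byte int), matching the
 * stride in "addiu a1,a1,76". Only `allocated` is from the mail. */
typedef struct {
    int  allocated;
    char other_fields[72];
} tsio_urb_t;

/* Roughly what both loops were lowered to, per the listing. */
int find_free_lowered(const tsio_urb_t *table)
{
    const tsio_urb_t *p = table;  /* a1 (first form) or s1 (second form) */
    int i = 0;                    /* v0 */
    int urbidx = -1;              /* s6 */

    do {
        if (p->allocated == 0) {  /* lw + beq */
            urbidx = i;           /* move s6,v0 after the branch */
            break;
        }
        p++;                      /* addiu by the element size */
        i++;                      /* addiu v0,v0,1 */
    } while (i != N_URB);         /* bnel v0,s4 */

    return urbidx;
}

/* Reference: the index form from the mail, taking the table as an
 * argument so the two can be compared on the same data. */
int find_free_indexed(const tsio_urb_t *table)
{
    int i, urbidx = -1;
    for (i = 0; i < N_URB; i++) {
        if (table[i].allocated == 0) {
            urbidx = i;
            break;
        }
    }
    return urbidx;
}

/* Check that the lowered shape matches the index form for every fill
 * level, from an empty table to a full one. */
void check_lowered_matches_indexed(void)
{
    tsio_urb_t table[N_URB];
    int n, i;
    for (n = 0; n <= N_URB; n++) {
        for (i = 0; i < N_URB; i++)
            table[i].allocated = (i < n);
        assert(find_free_lowered(table) == find_free_indexed(table));
    }
}
```

The point of the sketch is that the index form ends up as a pointer walk
anyway, so writing the pointer by hand in the source buys nothing here.
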
I have found that this works for longer code segments as well.

--

Besides, all those changes you put into tserialio.c make it really
hard to merge the changes from bonsai :0

        - Chris Pirazzi