From: gt7080a@prism.gatech.EDU (Nathan Laredo)
Subject: gcc2 woes
Date: 6 Jun 1992 08:28:59 GMT

Today I decided I'd conduct a little experiment with gcc under
Linux and compare it with the gcc 2.1 implementation on our
Sequent S81 here at Georgia Tech. The results were somewhat
disturbing. I would have expected the assembler output to
be the same using the same command line options. Here's
what I got:
Stripped a.out filesize:
    Sequent S81:              8192 bytes
    Linux on my 486DX-33MHz:  9220 bytes
I suppose that's understandable given that Dynix is an established
OS with probably better support, but the following is what I found
most disturbing:
For those who are not familiar with the Sequent S81, it's a
multiprocessor 386 16 MHz machine. However, we as lowly students
would never be allowed to compile things to run on multiple
processors, or else we'd bring the system to its knees with a
100+ user load.
The following is the source followed by the assembler output, which
I had originally expected to be identical. It was not.
/* This is a test */
#include <stdio.h>
main()
{
    int i;
    puts("start");
    for (i=0;i<10000000;i++);
    puts("done");
}
Sequent S81 with gcc -O6 -S test.c:
.file "test.c"
gcc2_compiled.:
.text
LC0:
.ascii "start\0"
LC1:
.ascii "done\0"
.align 2
.globl _main
_main:
pushl %ebp
movl %esp,%ebp
call ___main
pushl $LC0
call _puts
addl $4,%esp
movl $9999999,%eax
.align 2
L4:
decl %eax
jns L4
pushl $LC1
call _puts
leave
ret
.align 2
Now that was short, sweet and to the point... almost as good as I'd
write it in assembler. Now look at this:
My 486 under Linux with gcc -O6 -S test.c:
.file "test.c"
gcc2_compiled.:
.text
LC0:
.ascii "start\0"
LC1:
.ascii "done\0"
.align 2
.globl _main
_main:
pushl %ebp
movl %esp,%ebp
call ___main
pushl $__cout_sbuf
pushl $LC0
call _fputs
addl $8,%esp
cmpl $-1,%eax
je L3
movl __cout_sbuf+24,%edx
cmpl %edx,__cout_sbuf+20
jb L5
pushl $10
pushl $__cout_sbuf
call ___overflow
addl $8,%esp
jmp L3
L5:
movl __cout_sbuf+20,%eax
movb $10,(%eax)
incl __cout_sbuf+20
.align 2,0x90
L3:
movl $9999999,%eax
L9:
decl %eax
jns L9
pushl $__cout_sbuf
pushl $LC1
call _fputs
addl $8,%esp
cmpl $-1,%eax
je L12
movl __cout_sbuf+24,%edx
cmpl %edx,__cout_sbuf+20
jb L14
pushl $10
pushl $__cout_sbuf
call ___overflow
leave
ret
.align 2,0x90
L14:
movl __cout_sbuf+20,%eax
movb $10,(%eax)
incl __cout_sbuf+20
.align 2,0x90
L12:
leave
ret
It's trash! Look at how wasteful it is... In fact, I
took the other assembler code from the Sequent, ran it
through as (the assembler) on my machine, and it still
worked perfectly.
IMHO there's something definitely screwy around here...
I find the whole thing repulsive. The same compiler for
the same platform should not be doing this. There is also
absolutely no reason for half the statements that I see
there in the Linux assembler output. Are we all now so
blinded by high level languages that we forget to look
at what's underneath? I'd think in 24 years we'd have come
up with something that optimized better than this. Back
24 years ago (I was not there, but I'm told) there were
such things as optimizing compilers that could even
outcode the best assembler programmer because they were
really good at bookkeeping. Well, it's 24 years later
and gcc2 sure can't outcode me, and I'm nothing but
an undergrad CS student. You'd think the level of skill
in writing compilers would be at the point today where I
might not even be able to understand what comes out of
the compiler, but it turns out trash in this case,
while the Sequent version isn't all that bad. Same
software, different OS, so what's the deal?
-Nathan Laredo
--
Nathan Laredo                   |begin 600 mean.msg.Z
Georgia Institute of Technology |@'YV06V)T`;$"!`@B;]R4,<BPH<.'<MJ`:"''#`@3"@``
Box 37080                       |`
Atlanta, Georgia, 30332         |end