gcc's Introduction

This directory contains the GNU Compiler Collection (GCC).

The GNU Compiler Collection is free software.  See the files whose
names start with COPYING for copying permission.  The manuals, and
some of the runtime libraries, are under different terms; see the
individual source files for details.

The directory INSTALL contains copies of the installation information
as HTML and plain text.  The source of this information is
gcc/doc/install.texi.  The installation information includes details
of what is included in the GCC sources and what files GCC installs.

See the file gcc/doc/gcc.texi (together with other files that it
includes) for usage and porting information.  An online readable
version of the manual is in the files gcc/doc/gcc.info*.

See http://gcc.gnu.org/bugs/ for how to report bugs usefully.

Copyright years on GCC source files may be listed using range
notation, e.g., 1987-2012, indicating that every year in the range,
inclusive, is a copyrightable year that could otherwise be listed
individually.

gcc's Issues

SPEC2017 519.lbm regression on riscv gcc 11 backport of b646d7d279ae

Note: This issue involves an upstream gcc patch (for gcc 13, officially backported to gcc-12), but the regression occurs on a gcc-11 backport, hence there is no official Bugzilla entry or issue on the riscv gcc repo.

The gcc change b646d7d ("RISC-V: Inhibit FP <--> int register moves via tune param"), which elides FMV instructions, can benefit performance on uarches where FMV is costlier and repeated stack spills could potentially be cached. It is supposed to be code-size neutral: under high register pressure it replaces an FMV.X.D/FMV.D.X pair with an FSD/FLD pair.

However, with a gcc 11 backport of the above, we see a regression on the SPECFP2017 benchmark 519.lbm (dynamic instruction count).
The additional stack spills are expected, but there are also additional FLDs that load floating-point constants from .rodata, multiple times, and in a few cases the register holding the constant is still live.

336:		u2 = 1.5 * (ux*ux + uy*uy + uz*uz);

338:		feqs[C ] =            (1.0/3.0)*rho*(1.0                         - u2);
339:		feqs[N ] = feqs[S ] = (1.0/18.0)*rho*(1.0 + 4.5*(+uy )*(+uy ) - u2);

...

codegen BEFORE b646d7d

   10cb8:	8345784b          	fnmsub.d	fa6,fa0,fs4,fa6
   10cbc:	d3b27243          	fmadd.d	ft4,ft4,fs11,fs10
   10cc0:	5b4575cb          	fnmsub.d	fa1,fa0,fs4,fa1
   10cc4:	1236f6d3          	fmul.d	fa3,fa3,ft3
   10cc8:	d3457dcb          	fnmsub.d	fs11,fa0,fs4,fs10
   10ccc:	0b4570cb          	fnmsub.d	ft1,fa0,fs4,ft1
   10cd0:	0345704b          	fnmsub.d	ft0,fa0,fs4,ft0
   10cd4:	2b4572cb          	fnmsub.d	ft5,fa0,fs4,ft5
   10cd8:	3345734b          	fnmsub.d	ft6,fa0,fs4,ft6
   10cdc:	2345724b          	fnmsub.d	ft4,fa0,fs4,ft4
   10ce0:	1345714b          	fnmsub.d	ft2,fa0,fs4,ft2
   10ce4:	9345794b          	fnmsub.d	fs2,fa0,fs4,fs2
   10ce8:	bc42                	fsd	fa6,56(sp)
   10cea:	b42e                	fsd	fa1,40(sp)
   10cec:	ac36                	fsd	fa3,24(sp)
   10cee:	be05                	j	1081e <main+0x2e0>

codegen AFTER b646d7d

Note the repeated reloads of the constant .LC29 (1.5), which is used as the multiplier on line 336 of the source.

# lbm.c:338: 		feqs[C ] =            (1.0/3.0)*rho*(1.0                         - u2);
   10c8e:	c316f8cb          	fnmsub.d	fa7,fa3,fa7,fs8
   10c92:	3c06                	fld	fs8,96(sp)
   10c94:	bcc6                	fsd	fa7,120(sp)
   10c96:	8a81b887          	fld	fa7,-1880(gp)              # fld fa7, %lo(.LC29)(a3)
   10c9a:	c316f8cb          	fnmsub.d	fa7,fa3,fa7,fs8
   10c9e:	b8c6                	fsd	fa7,112(sp)
   10ca0:	8a81bc07          	fld	fs8,-1880(gp)              # fld fs8, %lo(.LC29)(a3)
   10ca4:	28e6                	fld	fa7,88(sp)
   10ca6:	f386ff4b          	fnmsub.d	ft10,fa3,fs8,ft10
   10caa:	b4fa                	fsd	ft10,104(sp)
   10cac:	8b86ff4b          	fnmsub.d	ft10,fa3,fs8,fa7
   10cb0:	2c46                	fld	fs8,80(sp)
   10cb2:	8a81b887          	fld	fa7,-1880(gp)              # fld fa7, %lo(.LC29)(a3)
   10cb6:	b0fa                	fsd	ft10,96(sp)
   10cb8:	c316ff4b          	fnmsub.d	ft10,fa3,fa7,fs8
   10cbc:	2886                	fld	fa7,64(sp)          # NOK: could have reused fa7 and fld into fs8
   10cbe:	8a81bc07          	fld	fs8,-1880(gp)              # fld fs8, %lo(.LC29)(a3)
   10cc2:	acfa                	fsd	ft10,88(sp)
   10cc4:	8b86ff4b          	fnmsub.d	ft10,fa3,fs8,fa7
   10cc8:	38c2                	fld	fa7,48(sp)
   10cca:	8a81bc07          	fld	fs8,-1880(gp)              # fld fs8, %lo(.LC29)(a3)
   10cce:	a8fa                	fsd	ft10,80(sp)
   10cd0:	8b86ff4b          	fnmsub.d	ft10,fa3,fs8,fa7
   10cd4:	3c22                	fld	fs8,40(sp)
   10cd6:	a0fa                	fsd	ft10,64(sp)
   10cd8:	8a81bf07          	fld	ft10,-1880(gp)              # fld ft10, %lo(.LC29)(a3)
   10cdc:	c3e6f8cb          	fnmsub.d	fa7,fa3,ft10,fs8
   10ce0:	2c62                	fld	fs8,24(sp)
   10ce2:	b846                	fsd	fa7,48(sp)
   10ce4:	c3e6f8cb          	fnmsub.d	fa7,fa3,ft10,fs8
   10ce8:	2c42                	fld	fs8,16(sp)
   10cea:	a83a                	fsd	fa4,16(sp)
   10cec:	c3e6f6cb          	fnmsub.d	fa3,fa3,ft10,fs8
   10cf0:	b446                	fsd	fa7,40(sp)
   10cf2:	ac36                	fsd	fa3,24(sp)
   10cf4:	b61d                	j	1081a <main+0x2dc>
...
.LC29:
	.word	0
	.word	1073217536	# 1.5

Just before this codegen, there is another such needless refetch of a different constant, .LC27 (1.0), used in lines 339 and beyond.

	fld	fa7,%lo(.LC27)(s10)	
	fmadd.d	fa3,fa3,fs4,fa7	
# lbm.c:336: 		u2 = 1.5 * (ux*ux + uy*uy + uz*uz);
	fld	fa7,16(sp)	
	fsd	fs8,104(sp)	
	fmul.d	fs8,fa4,fa4	
	fmul.d	fa4,fa4,fs7	
	fsd	fa3,96(sp)	
	fadd.d	fa3,fa7,fs8
	fld	fa7,%lo(.LC27)(s10)
	fmadd.d	fs8,fs8,fs4,fa7
	fmadd.d	ft10,ft10,fs4,fa7
	fld	fa7,%lo(.LC27)(s10)
	fsd	fs8,88(sp)
	fmul.d	fs8,fs10,fs10	
	fmadd.d	fs8,fs8,fs4,fa7	
	fsd	fs8,80(sp)	
	fmul.d	fs8,fs9,fs9	
	fmadd.d	fa7,fs8,fs4,fa7	
	fld	fs8,24(sp)	
	fsd	fa7,64(sp)	
	fld	fa7,%lo(.LC27)(s10)	
	fmadd.d	fs8,fs8,fs4,fa7	
	fld	fa7,%lo(.LC27)(s10)	
	fsd	fs8,48(sp)	
	fld	fs8,40(sp)	
	fmadd.d	fs8,fs8,fs4,fa7	
	fld	fa7,%lo(.LC27)(s10)	
	fsd	fs8,40(sp)	
	fmul.d	fs8,fs11,fs11	
	fmadd.d	fs8,fs8,fs4,fa7	
	fsd	fs8,24(sp)	
	fld	fs8,104(sp)	
	fmadd.d	fa7,fs8,fs4,fa7	
	fld	fs8,%lo(.LC27)(s10)
	fsd	fa7,16(sp)
...
	.align	3
.LC27:
	.word	0
	.word	1072693248	# 1.0
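
The two `.word` values in each constant-pool entry can be checked by reassembling them into a double. The helper below is a minimal sketch, not part of the report: the name `words_to_double` is made up for illustration, and it assumes the little-endian layout used on RV64 (first `.word` is the low half) and an IEEE-754 binary64 `double` on the host.

```c
#include <stdint.h>
#include <string.h>

/* Rebuild a double from the two .word values of a constant-pool entry.
   Assumption: little-endian target, so the first .word is the low
   32 bits and the second .word is the high 32 bits. */
static double words_to_double(uint32_t lo, uint32_t hi)
{
    uint64_t bits = ((uint64_t)hi << 32) | lo;
    double d;

    /* memcpy avoids the aliasing problems of a pointer cast. */
    memcpy(&d, &bits, sizeof d);
    return d;
}
```

With this, `words_to_double(0, 1073217536u)` gives 1.5 for .LC29 and `words_to_double(0, 1072693248u)` gives 1.0 for .LC27, matching the annotations on the listings.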

I have yet to create a small test case, as this only happens with -flto=auto on the final link of the benchmark.
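
A possible starting point for a reduction is a kernel with the same shape as lbm.c lines 336-339. The sketch below is untested against the regression and may well not reproduce it, since the problem has so far only been seen in the LTO link. The function name `feq_kernel` and the `DIR_*` indices are hypothetical, and the E/W and T/B lines are assumed by symmetry with the quoted N/S line rather than taken from the report.

```c
/* Hypothetical direction indices; only C, N and S appear in the
   source lines quoted in this report. */
enum { DIR_C, DIR_N, DIR_S, DIR_E, DIR_W, DIR_T, DIR_B, N_DIRS };

/* Same shape as lbm.c lines 336-339: one 1.5 multiplier feeding u2,
   then many expressions that each need the constant 1.0 and u2. */
void feq_kernel(double rho, double ux, double uy, double uz,
                double feqs[N_DIRS])
{
    double u2 = 1.5 * (ux*ux + uy*uy + uz*uz);

    feqs[DIR_C] = (1.0/3.0) * rho * (1.0 - u2);
    feqs[DIR_N] = feqs[DIR_S] = (1.0/18.0) * rho * (1.0 + 4.5*uy*uy - u2);
    /* Assumed by symmetry, not quoted in the report: */
    feqs[DIR_E] = feqs[DIR_W] = (1.0/18.0) * rho * (1.0 + 4.5*ux*ux - u2);
    feqs[DIR_T] = feqs[DIR_B] = (1.0/18.0) * rho * (1.0 + 4.5*uz*uz - u2);
}
```

The real benchmark computes many more of these terms in one loop body, which is presumably what creates the register pressure, so a reduction would likely need more terms and the same -flto=auto link step before the repeated FLDs show up.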
