- Subject: Re: [ANN] source code optimizer - function inlining
- From: David Manura <dm.lua@...>
- Date: Fri, 15 May 2009 22:18:25 -0400
On Fri, May 15, 2009 at 11:25 AM, Philippe Lhoste wrote:
> On 15/05/2009 14:14, Olivier Hamel wrote:
>> code, there's some un-optimized stuff IMO (unless I'm wrong):
>
> Beside, depending on platform, using an integer power operator might result
> in slower operation than several multiplications. Unless I am mistaken? Is
> it false on boards with floating-point operations?

Currently it just unrolls functions and eliminates dead ones. There is room for performance improvement, but note that (x*x)*(x*x) does not necessarily equal x^4 if you redefine the __mul and __pow metamethods. In some cases we can deduce that x is a plain number, in which case these metamethods are not applied, but even in the case of plain numbers, the possibility of overflow can still break some mathematical properties.

We could just avoid optimizations that are conceivably--even remotely--unsafe, and this is a good option as a default. A way around the difficulty is to allow the programmer to give the optimizer, such as via pragmas or switches, additional information that would allow it to determine when certain optimizations are safe or unsafe (e.g. x is a plain number, all variables named according to a given convention shall be assumed to be plain numbers, or all arithmetic operations in the current file scope shall be assumed to have normal properties). That information would also be useful for a lint tool (luaanalyze) or optimizations in lua2c.

For standard Lua on x86, I have found x*x to be faster than x^2 [1] and have at times manually "strength reduced" [2] x^2, x^3 and x^4 in code to make it run faster at the expense of being uglier.

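As a minimal sketch of the manual strength reduction and the metamethod caveat described above: the names pow4_pow, pow4_mul, Weird and new below are hypothetical and chosen only for illustration. The Weird metatable deliberately defines __mul and __pow so that they do not follow ordinary arithmetic, which is exactly the situation in which the rewrite changes behaviour.

```lua
-- x^4 written with the power operator and with multiplications only.
local function pow4_pow(x) return x^4 end
local function pow4_mul(x)
  local x2 = x * x          -- manual strength reduction
  return x2 * x2
end

-- For a small plain number the two forms agree.
assert(pow4_pow(3) == pow4_mul(3))

-- Hypothetical type whose metamethods are intentionally inconsistent.
local Weird = {}
local function new(n) return setmetatable({n = n}, Weird) end
Weird.__mul = function(a, b) return new(a.n * b.n + 1) end
Weird.__pow = function(a, k) return new(a.n ^ k) end

local by_pow = pow4_pow(new(3))   -- goes through __pow once
local by_mul = pow4_mul(new(3))   -- goes through __mul twice
print(by_pow.n, by_mul.n)         -- the two results differ
```

An optimizer that cannot prove x is a plain number would have to leave pow4_pow alone, or be told by the programmer that the rewrite is acceptable.
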
However, one can now avoid some of the ugliness by writing

  return pow4(x+y) + 1

following

  local function pow2(x) return x*x end
  local function pow4(x) return pow2(pow2(x)) end

and it will automatically reduce to

  local __v4x = x + y
  local __v5x = __v4x  -- note: this line could be eliminated
  local __v3x = __v5x * __v5x
  return __v3x * __v3x + 1

Unfortunately, the optimization pass messes up the debug info (line numbers and variable names). One way to mitigate the line number problem is for the translator to add empty lines and put multiple generated statements on the same line, in such a way as to preserve most line numbers. It may also name the temporaries in a more meaningful way:

  local x = setmetatable({}, {__add=function() return nil end})
  local y = x
  .....
  local __x_plus_y = x + y; local __pow2_x_plus_y = __x_plus_y * __x_plus_y; return __pow2_x_plus_y * __pow2_x_plus_y + 1

which would raise the somewhat meaningful error

  lua: 1.lua:3: attempt to perform arithmetic on local '__x_plus_y' (a nil value)
  stack traceback:
          1.lua:3: in main chunk
          [C]: ?

A patch to Lua would allow debugging information to be injected into the source code, somewhat like the C preprocessor [3]:

  .....
  --! __LINE__=3
  local $"x+y" = x + y
  local $"pow2(?)" = $"x+y" * $"x+y"
  return $"pow2(?)" * $"pow2(?)" + 1
  --! __LINE__=4

Metalua resorts instead to writing its own bytecode generator.

[1] http://lua-users.org/wiki/OptimisationCodingTips
[2] http://en.wikipedia.org/wiki/Strength_reduction
[3] http://gcc.gnu.org/onlinedocs/cpp/Line-Control.html
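
P.S. A rough, self-contained check of the inlining example earlier in this message, assuming the meaningful temporary names suggested there. The wrapper functions original and inlined are hypothetical and exist only so the two forms can be compared and the error can be caught with pcall; the exact wording of the error message varies between Lua versions, so the sketch only checks that the temporary's name appears in it.

```lua
local function pow2(x) return x*x end
local function pow4(x) return pow2(pow2(x)) end

-- The code as the programmer writes it.
local function original(x, y)
  return pow4(x+y) + 1
end

-- The same code after inlining, with descriptive temporaries.
local function inlined(x, y)
  local __x_plus_y = x + y
  local __pow2_x_plus_y = __x_plus_y * __x_plus_y
  return __pow2_x_plus_y * __pow2_x_plus_y + 1
end

-- For plain numbers both forms give the same result.
assert(original(1, 2) == inlined(1, 2))

-- A value whose __add returns nil makes the next multiplication fail.
local x = setmetatable({}, {__add = function() return nil end})
local ok, err = pcall(inlined, x, x)
print(ok, err)  -- the message names the temporary __x_plus_y
```

With the generated names __v4x, __v3x and so on, the same error would refer to a variable the programmer never wrote, which is the motivation for deriving names from the inlined expression.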