Register Allocation


View My Stats

I shall look at the following:

All in the context of compiling assignment statements

Example a:=w*(w+x*(x+y*z));

As stack code this compiles as

push w

push w

push x

push x

push y

push z

*

+

*

+

*

Max stack depth used =6

Registers Naive allocation

Assume we have instructions of the form

add reg, reg/mem

Given a tree of the form

 Figure

reserve a register r1

load x into r1

reserve a register r2

load y into r2

emit add r1,r2

unreserve r2

result in r1

We assume that we maintain reserve bits for each register at compile time

Code generated this way

 mov edi,DWORD  [  w]    ;edi:=w

 mov esi,DWORD  [  w]    ;esi:=w

 mov ebx,DWORD  [  x]    ;ebx:=x

 mov eax,DWORD  [  x]    ;eax:=x

 mov edx,DWORD  [  y]    ;edx:=y

 imul  edx, [  z]        ;edx:=y*z

 add  DWORD eax,edx      ;eax:=x+y*z

 imul  ebx,eax           ;ebx:=x*(x+y*z)

 add  DWORD esi,ebx      ;esi:=w+x*(x+y*z)      

 imul  edi,esi           ;edi:=w*(w+x*(x+y*z))

 ; result in edi

5 registers,10 instructions, 6 memory accesses

This is almost equivalent to the stack code

Problems

This kind of code can rapidly lead to the number of registers being exhausted, especially if you get a lot of right nested expresions of the form a*(b+(c*(d+(e*......

It uses more instructions than are strictly neccessary

We can however use far fewer registers consider:

 mov edi,DWORD  [ y];edi:=y

 imul  edi, [ z]    ;edi:=y*z

 add edi, [  x]     ;edi:=x+Y*z

 imul  edi, [  x]   ;edi:=x*(x+Y*z)

 add edi, [  w]     ;edi:=w+x*(x+Y*z)

 imul  edi, [  w]   ;edi:=w*(w+x*(x+Y*z))

Uses 1 register, 6 instructions, 6 memory accesses

this was done by reorganising the expression to ((y*z+x)*x+w)*w

Weights

We can reorganise expressions to minimize their usage of registers. The weight function estimates how many registers are used to compile a sub-tree.

On encountering a dyadic operator

If weight(right)>weight(left)then

 if commutes(operator) then 

   swap (left,right)

This is because we already need r1 in

ADD r1,r2        ; r1:=r1+r2
to hold the result so a left expression uses 1 less free register than a right expression.

Commuting operators

for the reorganisation to work we need


aWb º bWa

for some operator W

Examples : +,×, and, or, xor,=

Not valid with: -,¸, < , > , £ , ³ ,

The weight function which estimates register usage for an expression

weight(node->int)

case node.type of

  Dyad:       l= weight(left); 

              r= weight(right);  

              return (l<r?r+1:(l>r?l:l+1));

  register,

  constant:  return 0;

  memoryloc: i= weight(address_expr);

             return (w<1?1:w);

Minimising instruction usage and register usage is not always optimal. It may be better to minimise memory fetches

Example of the same original expression using common sub expression elimination

 mov eax,DWORD  [  x] ;eax:=x

 mov esi,DWORD  [  w] ;esi:=w

 mov edi,DWORD  [  y] ;edi:=y

 imul  edi, [  z]     ;edi:=y*z

 add  DWORD edi,eax   ;edi:=x+y*z

 imul  edi,eax        ;edi:=x*(x+y*z)

 add  DWORD edi,esi   ;edi:=w+x*(x+y*z)

 imul  edi,esi        ;edi:=w*(w+x*(x+y*z))

Uses 3 registers, 8 instructions, 4 memory accesses

This was done by first reorganising the expression to ((y*z+x)*x+w)*w

then eliminating repeated variable accesses

eax:=x

in

 ((y*z+eax)*eax+w)*w 

then to

eax:=x

in

  esi:=w

  in

    ((y*z+eax)*eax+esi)*esi

Algorithm

SubexpOpt(T)

  1. Search T for the repeated sub-expression e with greatest total cost.

    1. Cost = num memory accesses ×number of occurences of e
  2. Reserve register r
  3. Emit code for r:=e
  4. Substitute all occurences of e in T with r to give T'
  5. Emit code for T'
  6. Free register r

Register spilling

If an expression is complex it may not be possible to evaluate all of it in registers or, on a machine with a limited evaluation stack like the Pentium FPU, all of it on the stack.

In that case we have to split the expression up into parts each of which can be evaluated within the available number of registers.

Expression splitting

Use the weight function to determine if an expression needs to be split.

If weight(e) > the number of free regs

then split (e)

To split we take the subtree with the greatest weight and assign it to a temporary store location. The temporary store location is then subsituted for the subtree.

Given that we have already re-ordered expressions so that the tree on the left has the greatest weight, this amounts to assigning the subtree on the left to a store location.


File translated from TEX by TTH, version 2.78.
On 8 Dec 2003, 16:05.