Working with Loki’s internal representation
The objective of this notebook is to get an impression how Loki’s internal representation (IR) can be traversed, searched and manipulated using the provided visitor utilities.
We are again going to work with the phys_kernel_LITE_LOOP
routine. Let’s start by parsing the source file and extracting the routine from it. Note, that we can also directly access the routine using its name, although it is wrapped inside a Module
object as we have seen in the previous notebook:
[1]:
from loki import Sourcefile
source = Sourcefile.from_file('src/phys_mod.F90')
routine_lite_loop = source['phys_kernel_LITE_LOOP']
routine_lite_loop
[1]:
Subroutine:: phys_kernel_LITE_LOOP
We are going to manipulate this routine and want to try two different ways of doing that, so we start by creating a copy. That way, we don’t change the original object in the subsequent steps:
[2]:
routine = routine_lite_loop.clone()
print(routine.to_fortran())
SUBROUTINE phys_kernel_LITE_LOOP (dim1, dim2, i1, i2, in1, in2, in3, in4, in5, in6, in7, in8, in9, in10, out1)
INTEGER(KIND=ip), INTENT(IN) :: dim1, dim2, i1, i2
REAL(KIND=lp), INTENT(INOUT), DIMENSION(1:dim1, 1:dim2) :: in1, in2, in3, in4, in5, in6, in7, in8, in9, in10
REAL(KIND=lp), INTENT(INOUT), DIMENSION(1:dim1, 1:dim2) :: out1
INTEGER(KIND=ip) :: i, k
DO k=1,dim2
DO i=i1,i2
out1(i, k) = (in1(i, k) + in2(i, k) + in3(i, k) + in4(i, k) + in5(i, k) + in6(i, k) + in7(i, k) + in8(i, k) + in9(i, k) + &
& in10(i, k))*0.1
in1(i, k) = out1(i, k)
END DO
END DO
END SUBROUTINE phys_kernel_LITE_LOOP
The routine body consists of two nested loops. What we want to try first is to change the order of the loops (i.e., have the i
loop outermost and the k
loop innermost) but leave the loop body untouched.
For that, we first need to find the loops in the IR, which can be done using the FindNodes visitor. As argument to the constructor we provide the node type (or a tuple of multiple types) that we want to look for and call the visit
method with the tree to search. The visitor traverses the IR and collects all matching nodes into a list that is returned. For our purposes we are interested in the
Loop nodes:
[3]:
from loki import FindNodes, Loop
loops = FindNodes(Loop).visit(routine.body)
loops
[3]:
[Loop:: k=1:dim2, Loop:: i=i1:i2]
As we can see, the visitor has found both loops. Next, we create a substitution map - essentially a dictionary that maps the original node to its replacement. To exchange the two loops, we use the outer loop but with the inner loop’s body and make it the body of the inner loop:
[4]:
outer_loop, inner_loop = loops
new_inner_loop = outer_loop.clone(body=inner_loop.body)
loop_map = {outer_loop: inner_loop.clone(body=(new_inner_loop,))}
loop_map
[4]:
{Loop:: k=1:dim2: Loop:: i=i1:i2}
With the substitution map in place, we can call the Transformer. It takes the map as argument to the constructor and applies it to the control flow tree given to the visit
method:
[5]:
from loki import Transformer
routine.body = Transformer(loop_map).visit(routine.body)
The result is the original routine with the exchanged loop order.
[6]:
reordered_loops = FindNodes(Loop).visit(routine.body)
assert len(reordered_loops) == 2
assert reordered_loops[0].variable == 'i' and reordered_loops[1].variable == 'k'
print(routine.to_fortran())
SUBROUTINE phys_kernel_LITE_LOOP (dim1, dim2, i1, i2, in1, in2, in3, in4, in5, in6, in7, in8, in9, in10, out1)
INTEGER(KIND=ip), INTENT(IN) :: dim1, dim2, i1, i2
REAL(KIND=lp), INTENT(INOUT), DIMENSION(1:dim1, 1:dim2) :: in1, in2, in3, in4, in5, in6, in7, in8, in9, in10
REAL(KIND=lp), INTENT(INOUT), DIMENSION(1:dim1, 1:dim2) :: out1
INTEGER(KIND=ip) :: i, k
DO i=i1,i2
DO k=1,dim2
out1(i, k) = (in1(i, k) + in2(i, k) + in3(i, k) + in4(i, k) + in5(i, k) + in6(i, k) + in7(i, k) + in8(i, k) + in9(i, k) + &
& in10(i, k))*0.1
in1(i, k) = out1(i, k)
END DO
END DO
END SUBROUTINE phys_kernel_LITE_LOOP
Next, we want to start again with the original routine and this time keep the loop order as is but reverse the memory layout of all arrays. We start by creating another copy of the original routine and verify that it is indeed the original version without the above transformations:
[7]:
routine = routine_lite_loop.clone()
print(routine.to_fortran())
SUBROUTINE phys_kernel_LITE_LOOP (dim1, dim2, i1, i2, in1, in2, in3, in4, in5, in6, in7, in8, in9, in10, out1)
INTEGER(KIND=ip), INTENT(IN) :: dim1, dim2, i1, i2
REAL(KIND=lp), INTENT(INOUT), DIMENSION(1:dim1, 1:dim2) :: in1, in2, in3, in4, in5, in6, in7, in8, in9, in10
REAL(KIND=lp), INTENT(INOUT), DIMENSION(1:dim1, 1:dim2) :: out1
INTEGER(KIND=ip) :: i, k
DO k=1,dim2
DO i=i1,i2
out1(i, k) = (in1(i, k) + in2(i, k) + in3(i, k) + in4(i, k) + in5(i, k) + in6(i, k) + in7(i, k) + in8(i, k) + in9(i, k) + &
& in10(i, k))*0.1
in1(i, k) = out1(i, k)
END DO
END DO
END SUBROUTINE phys_kernel_LITE_LOOP
This time, we want to modify variables instead of loops. Loki uses a two-level internal representation that separates expressions from control flow. This means, the IR that we have worked with so far, is in fact the control flow tree and, nested inside, we have a second tree level as property of certain control flow nodes. For example, the loop bounds of the Loop
node or the left and right hand side expressions in
Assignment nodes are such expression trees. The advantage of this is that it makes traversing the control flow tree a lot faster and allows to recurse into expressions only when required.
Since we are now looking for variables we need to actually search the expression trees and therefore have to use a different visitor FindVariables. Here, we are only interested in arrays and can further restrict ourselves to Array expression nodes. We build again a
substitution map with the subscript dimensions
of the arrays reversed:
[8]:
from loki import FindVariables, Array
variable_map = {}
for var in FindVariables().visit(routine.body):
if isinstance(var, Array) and var.dimensions:
variable_map[var] = var.clone(dimensions=var.dimensions[::-1])
print('\n'.join(f'{a!s} -> {b!s}' for a, b in variable_map.items()))
in10(i, k) -> in10(k, i)
in2(i, k) -> in2(k, i)
in3(i, k) -> in3(k, i)
in5(i, k) -> in5(k, i)
in9(i, k) -> in9(k, i)
in7(i, k) -> in7(k, i)
out1(i, k) -> out1(k, i)
in4(i, k) -> in4(k, i)
in1(i, k) -> in1(k, i)
in8(i, k) -> in8(k, i)
in6(i, k) -> in6(k, i)
Just like we have a separate find utility for expression trees there is a separate transformer SubstituteExpressions. Applying this to the routine’s body we obtain the following result:
[9]:
from loki import SubstituteExpressions
routine.body = SubstituteExpressions(variable_map).visit(routine.body)
print(routine.to_fortran())
SUBROUTINE phys_kernel_LITE_LOOP (dim1, dim2, i1, i2, in1, in2, in3, in4, in5, in6, in7, in8, in9, in10, out1)
INTEGER(KIND=ip), INTENT(IN) :: dim1, dim2, i1, i2
REAL(KIND=lp), INTENT(INOUT), DIMENSION(1:dim1, 1:dim2) :: in1, in2, in3, in4, in5, in6, in7, in8, in9, in10
REAL(KIND=lp), INTENT(INOUT), DIMENSION(1:dim1, 1:dim2) :: out1
INTEGER(KIND=ip) :: i, k
DO k=1,dim2
DO i=i1,i2
out1(k, i) = (in1(k, i) + in2(k, i) + in3(k, i) + in4(k, i) + in5(k, i) + in6(k, i) + in7(k, i) + in8(k, i) + in9(k, i) + &
& in10(k, i))*0.1
in1(k, i) = out1(k, i)
END DO
END DO
END SUBROUTINE phys_kernel_LITE_LOOP
The routine’s body is correctly modified, with the array subscript dimensions reversed, but the declarations are still unchanged. For that, we need to change the shape
of the variables as well as the dimensions
property of the variable nodes inside the declarations.
There are two ways of achieving this: The first and easier way would be to modify the variables property of the Subroutine
object and update all array dimensions and shapes. This automatically recreates declarations for modified variables but inserts separate new declarations for each. Let’s try this approach for a copy of the routine:
[10]:
routine_variant1 = routine.clone()
variables = []
for var in routine_variant1.variables:
if isinstance(var, Array):
shape = var.shape[::-1]
variables += [var.clone(dimensions=shape, type=var.type.clone(shape=shape))]
else:
variables += [var]
routine_variant1.variables = variables
print(routine_variant1.to_fortran())
SUBROUTINE phys_kernel_LITE_LOOP (dim1, dim2, i1, i2, in1, in2, in3, in4, in5, in6, in7, in8, in9, in10, out1)
INTEGER(KIND=ip), INTENT(IN) :: dim1, dim2, i1, i2
INTEGER(KIND=ip) :: i, k
REAL(KIND=lp), INTENT(INOUT) :: in1(1:dim2, 1:dim1)
REAL(KIND=lp), INTENT(INOUT) :: in2(1:dim2, 1:dim1)
REAL(KIND=lp), INTENT(INOUT) :: in3(1:dim2, 1:dim1)
REAL(KIND=lp), INTENT(INOUT) :: in4(1:dim2, 1:dim1)
REAL(KIND=lp), INTENT(INOUT) :: in5(1:dim2, 1:dim1)
REAL(KIND=lp), INTENT(INOUT) :: in6(1:dim2, 1:dim1)
REAL(KIND=lp), INTENT(INOUT) :: in7(1:dim2, 1:dim1)
REAL(KIND=lp), INTENT(INOUT) :: in8(1:dim2, 1:dim1)
REAL(KIND=lp), INTENT(INOUT) :: in9(1:dim2, 1:dim1)
REAL(KIND=lp), INTENT(INOUT) :: in10(1:dim2, 1:dim1)
REAL(KIND=lp), INTENT(INOUT) :: out1(1:dim2, 1:dim1)
DO k=1,dim2
DO i=i1,i2
out1(k, i) = (in1(k, i) + in2(k, i) + in3(k, i) + in4(k, i) + in5(k, i) + in6(k, i) + in7(k, i) + in8(k, i) + in9(k, i) + &
& in10(k, i))*0.1
in1(k, i) = out1(k, i)
END DO
END DO
END SUBROUTINE phys_kernel_LITE_LOOP
As a finger exercise we demonstrate also a second approach that avoids recreating the declarations but modifies them directly. For that, we search for all VariableDeclaration nodes in the routine’s specification part (spec
) and build a substitution map with updated declarations where necessary. This involves updating the list of variables declared in a VariableDeclaration
node and making sure that only
Array
nodes are modified.
Fortran allows to specify array dimensions either using the DIMENSION
attribute or as dimensions in brackets after the declared symbol’s name (e.g., var(dim1, dim2)
). Loki’s default behaviour is the latter (as visible from the auto-generated declarations above). To accommodate both variants in Loki’s IR, we allow an optional property dimensions
on VariableDeclaration
nodes to produce the syntax of the first. Importantly, in both cases Loki stores the dimensions
property also
on the declared variable nodes to make sure they are always accessible in a uniform way.
When building the substitution map for the declaration nodes, we honour both versions and adapt our behaviour accordingly:
[11]:
from loki import VariableDeclaration
decl_map = {}
for decl in FindNodes(VariableDeclaration).visit(routine.spec):
if decl.dimensions:
shape = decl.dimensions[::-1]
symbols = [var.clone(dimensions=shape, type=var.type.clone(shape=shape)) for var in decl.symbols]
decl_map[decl] = decl.clone(dimensions=shape, symbols=symbols)
elif any(isinstance(var, Array) for var in decl.symbols):
symbols = []
for var in decl.symbols:
if isinstance(var, Array):
shape = var.shape[::-1]
symbols += [var.clone(dimensions=shape, type=var.type.clone(shape=shape))]
else:
symbols += [var]
decl_map[decl] = decl.clone(symbols=symbols)
routine.spec = Transformer(decl_map).visit(routine.spec)
print(routine.to_fortran())
SUBROUTINE phys_kernel_LITE_LOOP (dim1, dim2, i1, i2, in1, in2, in3, in4, in5, in6, in7, in8, in9, in10, out1)
INTEGER(KIND=ip), INTENT(IN) :: dim1, dim2, i1, i2
REAL(KIND=lp), INTENT(INOUT), DIMENSION(1:dim2, 1:dim1) :: in1, in2, in3, in4, in5, in6, in7, in8, in9, in10
REAL(KIND=lp), INTENT(INOUT), DIMENSION(1:dim2, 1:dim1) :: out1
INTEGER(KIND=ip) :: i, k
DO k=1,dim2
DO i=i1,i2
out1(k, i) = (in1(k, i) + in2(k, i) + in3(k, i) + in4(k, i) + in5(k, i) + in6(k, i) + in7(k, i) + in8(k, i) + in9(k, i) + &
& in10(k, i))*0.1
in1(k, i) = out1(k, i)
END DO
END DO
END SUBROUTINE phys_kernel_LITE_LOOP
And with that we have achieved the same result while retaining the compacted notation for declarations. Notably, the body is the same for both variants:
[16]:
from loki import fgen
fcode = fgen(routine.spec)
assert '(1:dim2, 1:dim1)' in fcode
assert '(1:dim1, 1:dim2)' not in fcode
fcode = fgen(routine_variant1.spec)
assert '(1:dim2, 1:dim1)' in fcode
assert '(1:dim1, 1:dim2)' not in fcode
assert fgen(routine.body) == fgen(routine_variant1.body)
For further details on how to work with Loki’s internal representation, have a look at the relevant section in the documentation.