loki.transformations.raw_stack_allocator
Classes
TemporariesRawStackTransformation | Transformation to inject stack arrays at the driver level.
- class TemporariesRawStackTransformation(block_dim, horizontal, stack_name='STACK', local_int_var_name_pattern='JD_{name}', directive=None, driver_horizontal=None, **kwargs)
Bases:
Transformation
Transformation to inject stack arrays at the driver level. These, together with their corresponding sizes, are passed on to the kernels. Any temporary arrays with the horizontal dimension as leading dimension are then allocated as offsets in the stack array.
The transformation needs to be applied in reverse traversal order, so that called kernels are processed before their callers. For each kernel, it will:
Add arguments to the kernel call signature to pass the stack arrays and their (free) size
Determine the combined size of all local arrays that are to be allocated on the stack, taking into account calls to nested kernels. This is reported in the Item's trafo_data.
Replace any access to temporary arrays with the corresponding offsets in the stack array
Pass the stack arrays as arguments to any nested kernel calls
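The offset idea in the kernel steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not Loki's implementation: the helper `assign_stack_offsets`, the temporary names, and the layout (sizes counted in units of the horizontal dimension) are all hypothetical.

```python
from math import prod


def assign_stack_offsets(temporaries, pattern="JD_{name}"):
    """Map each temporary to an offset into one flat stack array.

    `temporaries` maps a (hypothetical) array name to its shape beyond
    the leading horizontal dimension, e.g. {"ZB": (137,)} for ZB(NLON, 137).
    Sizes are counted in units of the horizontal dimension.
    """
    offsets = {}
    used = 0
    for name, trailing_shape in temporaries.items():
        # The integer variable holding the offset is named via the pattern.
        offsets[pattern.format(name=name)] = used
        # prod(()) == 1, so a purely horizontal temporary takes one slot.
        used += prod(trailing_shape)
    return offsets, used


offsets, total = assign_stack_offsets({"ZA": (), "ZB": (137,), "ZC": (137, 2)})
```

Here `total` is the combined size that a kernel would report for its own temporaries; the real transformation operates on Loki's IR and may lay out the stack differently.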
In a driver routine, the transformation will:
Determine the required scratch space from trafo_data
Allocate the stack arrays
Insert data sharing clauses into OpenMP or OpenACC pragmas
Pass the stack arrays and sizes into the kernel calls
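The reason for the reverse order can be seen in a small sketch of how sizes might propagate from kernels up to the driver. The function `required_stack_sizes`, the routine names, and the rule "own size plus the largest requirement among callees" are assumptions made for illustration; the source only states that nested kernel calls are taken into account.

```python
def required_stack_sizes(own_size, callees, reverse_order):
    """Accumulate stack requirements, visiting callees before callers.

    own_size:      routine name -> size of its own stack temporaries
    callees:       routine name -> list of kernels it calls
    reverse_order: routine names, ordered so callees come first
    """
    trafo_data = {}
    for routine in reverse_order:
        # Assumption: sibling calls can reuse the same stack region,
        # so only the largest nested requirement is added.
        nested = max(
            (trafo_data[callee] for callee in callees.get(routine, [])),
            default=0,
        )
        trafo_data[routine] = own_size.get(routine, 0) + nested
    return trafo_data


sizes = required_stack_sizes(
    own_size={"kernel_a": 10, "kernel_b": 4, "kernel_c": 7},
    callees={"driver": ["kernel_a"], "kernel_a": ["kernel_b", "kernel_c"]},
    reverse_order=["kernel_b", "kernel_c", "kernel_a", "driver"],
)
```

By the time the driver is visited, every kernel below it has already recorded its requirement, so the driver entry gives the size to allocate.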
- Parameters:
block_dim (Dimension) – Dimension object to define the blocking dimension
horizontal (Dimension) – Dimension object to define the horizontal dimension
stack_name (str, optional) – Name of the scratch space variable that is allocated in the driver (default: 'STACK')
local_int_var_name_pattern (str, optional) – Python format string pattern for the name of the integer variable for each temporary (default: 'JD_{name}')
directive (str, optional) – Can be 'openmp' or 'openacc'. If given, insert data sharing clauses for the stack derived type, and insert data transfer statements (for OpenACC only).
driver_horizontal (str, optional) – Override string if a separate variable name should be used for the horizontal when allocating the stack in the driver.
key (str, optional) – Overwrite the key that is used to store analysis results in trafo_data.
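Since local_int_var_name_pattern is an ordinary Python format string, its effect can be previewed with str.format. The temporary names and the alternative pattern below are hypothetical and serve only as an illustration.

```python
# Documented default of local_int_var_name_pattern.
pattern = "JD_{name}"

# Hypothetical temporary array names found in a kernel.
names = [pattern.format(name=tmp) for tmp in ("ZTMP1", "ZTMP2")]

# A hypothetical custom pattern; it should keep the {name} placeholder
# so that each temporary gets a distinct integer variable.
custom = "IOFF_{name}".format(name="ZTMP1")
```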
- reverse_traversal = True
- type_name_dict = {BasicType.LOGICAL: {'driver': 'LL', 'kernel': 'LD'}, BasicType.INTEGER: {'driver': 'I', 'kernel': 'K'}, BasicType.REAL: {'driver': 'Z', 'kernel': 'P'}}
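The table above is keyed first by data type and then by role. A minimal lookup sketch follows; the `BasicType` enum here is a stand-in defined only for this example, and `type_prefix` is a hypothetical helper. How the transformation combines these prefixes into full variable names is not stated here, so the sketch shows the lookup only.

```python
from enum import Enum, auto


class BasicType(Enum):
    """Stand-in for Loki's BasicType, for illustration only."""
    LOGICAL = auto()
    INTEGER = auto()
    REAL = auto()


type_name_dict = {
    BasicType.LOGICAL: {"driver": "LL", "kernel": "LD"},
    BasicType.INTEGER: {"driver": "I", "kernel": "K"},
    BasicType.REAL: {"driver": "Z", "kernel": "P"},
}


def type_prefix(dtype, role):
    """Return the name prefix for a data type in a 'driver' or 'kernel'."""
    return type_name_dict[dtype][role]


real_in_kernel = type_prefix(BasicType.REAL, "kernel")
logical_in_driver = type_prefix(BasicType.LOGICAL, "driver")
```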
- property int_type
- transform_subroutine(routine, **kwargs)
Defines the transformation to apply to Subroutine items.
For transformations that modify Subroutine objects, this method should be implemented. It gets called via the dispatch method apply().
- Parameters:
routine (Subroutine) – The subroutine to be transformed.
**kwargs (optional) – Keyword arguments for the transformation.
- insert_stack_in_calls(routine, stack_arg_dict, successors)
Insert stack arguments into calls to successor routines.
- Parameters:
routine (Subroutine) – The routine in which to transform call statements
stack_arg_dict (dict) – dict that maps dtype and kind to the sets of stack size variables and their corresponding stack array variables
successors (list of Item) – The items corresponding to successor routines called from routine
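As a rough sketch of what this method does, the following plain-Python stand-in appends stack arguments to calls that target successor routines and leaves other calls alone. The nested-dict layout assumed for stack_arg_dict, the `(size variable, stack array)` pairing, and all variable and routine names are assumptions; the real method works on Loki call statements rather than tuples.

```python
def insert_stack_args(calls, stack_arg_dict, successor_names):
    """Append stack size and stack array arguments to successor calls.

    calls:           list of (callee_name, [argument names]) pairs
    stack_arg_dict:  assumed layout {dtype: {kind: (size_var, stack_var)}}
    successor_names: names of routines that receive the stack arguments
    """
    extra = []
    for dtype in sorted(stack_arg_dict):
        for kind in sorted(stack_arg_dict[dtype]):
            size_var, stack_var = stack_arg_dict[dtype][kind]
            extra += [size_var, stack_var]
    return [
        (name, list(args) + extra) if name in successor_names
        else (name, list(args))
        for name, args in calls
    ]


calls = [("kernel_a", ["NLON", "ZFIELD"]), ("external_lib", ["X"])]
stack_arg_dict = {"real": {"jprb": ("K_REAL_STACK_SIZE", "P_REAL_STACK")}}
result = insert_stack_args(calls, stack_arg_dict, {"kernel_a"})
```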
- create_stacks_driver(routine, stack_dict, successors)
Create stack variables in the driver routine, add pragma directives to create the stacks on the device (if self.directive), and add the stack_variables to kernel call arguments.
- Parameters:
routine (Subroutine) – The driver subroutine to get the stack_variables
stack_dict (dict) – dict that maps dtype and kind to an expression for the required stack size
successors (list of Item) – The items corresponding to successor routines called from routine
- create_stacks_kernel(routine, stack_dict, successors)
Create stack variables in the kernel routine, add pragma directives to create the stacks on the device (if self.directive), and add the stack_variables to kernel call arguments.
- Parameters:
routine (Subroutine) – The kernel subroutine to get the stack_variables
stack_dict (dict) – dict that maps dtype and kind to an expression for the required stack size
successors (list of Item) – The items corresponding to successor routines called from routine
- apply_raw_stack_allocator_to_temporaries(routine, item=None)
Apply raw stack allocator to local temporary arrays
This appends the relevant argument to the routine’s dummy argument list and creates the assignment for the local copy of the stack type. For all local arrays, a Cray pointer is instantiated and the temporaries are mapped via Cray pointers to the pool-allocated memory region.
The cumulative size of all temporary arrays is determined and returned.
- Parameters:
routine (Subroutine) – Subroutine object to apply transformation to
- Returns:
stack_dict – dict with required stack size mapped to type and kind
- Return type: