loki.transformations.data_offload

Classes

DataOffloadTransformation(**kwargs)

Utility transformation to insert data offload regions for GPU devices based on marked !$loki data regions.

GlobalVarHoistTransformation([...])

Transformation to hoist module variables used in device routines

GlobalVarOffloadTransformation([key])

Transformation to insert offload directives for module variables used in device routines

GlobalVariableAnalysis([key])

Transformation pass to analyse the declaration and use of (global) module variables.

class DataOffloadTransformation(**kwargs)

Bases: Transformation

Utility transformation to insert data offload regions for GPU devices based on marked !$loki data regions. Currently this inserts OpenACC data offload regions, but it can be extended to other offload region semantics (e.g. OpenMP 5) in the future.

Parameters:
  • remove_openmp (bool) – Remove any existing OpenMP pragmas inside the marked region.

  • assume_deviceptr (bool) – Mark all offloaded arrays as true device-pointers if data offload is being managed outside of structured OpenACC data regions.

transform_subroutine(routine, **kwargs)

Apply the transformation to a Subroutine object.

Parameters:
  • routine (Subroutine) – Subroutine to apply this transformation to.

  • role (string) – Role of the routine in the scheduler call tree. This transformation will only apply at the 'driver' level.

  • targets (list or string) – List of subroutines that are to be considered as part of the transformation call tree.

insert_data_offload_pragmas(routine, targets)

Find !$loki data pragma regions and create the corresponding !$acc update regions.

Parameters:
  • routine (Subroutine) – Subroutine to apply this transformation to.

  • targets (list or string) – List of subroutines that are to be considered as part of the transformation call tree.

remove_openmp_pragmas(routine, targets)

Remove any existing OpenMP pragmas in the offload regions that will have been intended for OpenMP threading rather than offload.

Parameters:
  • routine (Subroutine) – Subroutine to apply this transformation to.

  • targets (list or string) – List of subroutines that are to be considered as part of the transformation call tree.

class GlobalVariableAnalysis(key=None)

Bases: Transformation

Transformation pass to analyse the declaration and use of (global) module variables.

This analysis is a requirement before applying GlobalVarOffloadTransformation.

The pass collects data for ProcedureItem and ModuleItem items and stores the analysis results in each item's trafo_data under the provided key (default: 'GlobalVariableAnalysis').

For procedures, use the Loki dataflow analysis functionality to compile a list of used and/or defined variables (i.e., read and/or written). Store these under the keys 'uses_symbols' and 'defines_symbols', respectively.

For modules/ModuleItem, store the list of variables declared in the module under the key 'declares' and, of these, the subset of variables that need offloading to the device under the key 'offload'.

Note that in every case, the full variable symbols are stored to allow access to type information in transformations using the analysis data.

The generated trafo_data has the following schema:

ModuleItem: {
    'declares': set(Variable, Variable, ...),
    'offload': set(Variable, ...)
}

ProcedureItem: {
    'uses_symbols': set((Variable, '<module_name>'), (Variable, '<module_name>'), ...),
    'defines_symbols': set((Variable, '<module_name>'), (Variable, '<module_name>'), ...)
}

Parameters:
  • key (str, optional) – Specify a different identifier under which trafo_data is stored.
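
The schema above can be mimicked with plain Python containers. The following is a minimal sketch only: plain strings stand in for Loki's Variable objects, a plain dict stands in for Item.trafo_data, and the helpers module_record and procedure_record are hypothetical names introduced for illustration, not part of Loki.

```python
# Hypothetical stand-ins: strings replace Loki's Variable objects and a
# plain dict replaces Item.trafo_data. Not Loki's actual implementation.
DEFAULT_KEY = 'GlobalVariableAnalysis'  # default key named in the text above


def module_record(declared, offload=()):
    """Build a per-module record: declared variables and the offload subset."""
    declared = set(declared)
    offload = set(offload)
    if not offload <= declared:
        raise ValueError("'offload' must be a subset of 'declares'")
    return {'declares': declared, 'offload': offload}


def procedure_record(uses, defines):
    """Build a per-procedure record from (variable, module_name) pairs."""
    return {'uses_symbols': set(uses), 'defines_symbols': set(defines)}


# Records for the kernel and moduleB used in the examples on this page,
# stored under the analysis key.
kernel_trafo_data = {
    DEFAULT_KEY: procedure_record(
        uses=[('var2', 'moduleB'), ('var3', 'moduleB')],
        defines=[('var4', 'moduleC'), ('var5', 'moduleC')],
    )
}
moduleb_trafo_data = {DEFAULT_KEY: module_record(['var2', 'var3'])}
```

Keeping the module name next to each symbol lets a later pass group variables by their defining module without a second lookup.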

reverse_traversal = True

Traversal from the leaves upwards, i.e., modules with global variables are processed first, then kernels using them before the driver.
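
To illustrate the leaves-first ordering, here is a small sketch that uses Python's standard graphlib module on a hand-written dependency graph. The item names are taken from the examples on this page; the graph and the ordering logic are assumptions for illustration and do not reproduce Loki's scheduler.

```python
from graphlib import TopologicalSorter

# Each item maps to the items it depends on (illustrative names only;
# this is not how Loki's scheduler stores its graph).
dependencies = {
    'driver': {'kernel'},
    'kernel': {'moduleB', 'moduleC'},
    'moduleB': set(),
    'moduleC': set(),
}

# static_order() yields a node only after all of its dependencies, so the
# modules come first, then the kernel, and the driver comes last.
reverse_order = list(TopologicalSorter(dependencies).static_order())
```

With this ordering, the module and kernel records already exist by the time the driver is processed.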

item_filter = (<class 'loki.batch.item.ProcedureItem'>, <class 'loki.batch.item.ModuleItem'>)

Process procedures and modules with global variable declarations.

transform_module(module, **kwargs)

Defines the transformation to apply to Module items.

For transformations that modify Module objects, this method should be implemented. It gets called via the dispatch method apply().

Parameters:
  • module (Module) – The module to be transformed.

  • **kwargs (optional) – Keyword arguments for the transformation.

transform_subroutine(routine, **kwargs)

Defines the transformation to apply to Subroutine items.

For transformations that modify Subroutine objects, this method should be implemented. It gets called via the dispatch method apply().

Parameters:
  • routine (Subroutine) – The subroutine to be transformed.

  • **kwargs (optional) – Keyword arguments for the transformation.

class GlobalVarOffloadTransformation(key=None)

Bases: Transformation

Transformation to insert offload directives for module variables used in device routines

Currently, only OpenACC data offloading is supported.

This requires a prior analysis pass with GlobalVariableAnalysis to collect the relevant global variable use information.

The offload directives are inserted by replacing !$loki update_device and !$loki update_host pragmas in the driver’s source code. Importantly, no offload directives are added if these pragmas have not been added to the original source code!

For global variables, the device-side declarations are added in transform_module(). For driver procedures, the data offload and pull-back directives are added in the utility method process_driver(), which is invoked by transform_subroutine().

For example, the following code:

module moduleB
   real :: var2
   real :: var3
end module moduleB

module moduleC
   real :: var4
   real :: var5
end module moduleC

subroutine driver()
implicit none

!$loki update_device
!$acc serial
call kernel()
!$acc end serial
!$loki update_host

end subroutine driver

subroutine kernel()
use moduleB, only: var2,var3
use moduleC, only: var4,var5
implicit none
!$acc routine seq

var4 = var2
var5 = var3

end subroutine kernel

is transformed to:

module moduleB
   real :: var2
   real :: var3
  !$acc declare create(var2)
  !$acc declare create(var3)
end module moduleB

module moduleC
   real :: var4
   real :: var5
  !$acc declare create(var4)
  !$acc declare create(var5)
end module moduleC

subroutine driver()
implicit none

!$acc update device( var2,var3 )
!$acc serial
call kernel()
!$acc end serial
!$acc update self( var4,var5 )

end subroutine driver

Nested Fortran derived types and arrays of derived types are not currently supported. If such an import is encountered, only the device-side declaration is added to the relevant module file, and the offload instructions must be added manually afterwards.

item_filter = (<class 'loki.batch.item.ProcedureItem'>, <class 'loki.batch.item.ModuleItem'>)

transform_module(module, **kwargs)

Add device-side declarations for imported variables

transform_subroutine(routine, **kwargs)

Add data offload and pull-back directives to the driver

process_kernel(item, successors)

Propagate offload requirement to the items of the global variables

process_driver(routine, successors)

Add data offload and pullback directives

The list of variables that require offloading is obtained from the analysis data stored for each successor in successors.
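
As a rough sketch of this step, the function below gathers variable names from per-successor analysis records and formats the two OpenACC update directives shown in the example above. The function name build_update_directives and the use of plain strings for variables are assumptions made for illustration; Loki's actual process_driver() works on its own IR nodes and may differ in detail.

```python
def build_update_directives(successor_data):
    """Collect global variables from successors and format OpenACC updates.

    successor_data: iterable of dicts with 'uses_symbols' and
    'defines_symbols', each a set of (variable_name, module_name) pairs.
    Hypothetical helper; not part of Loki.
    """
    used, defined = set(), set()
    for data in successor_data:
        used |= {name for name, _ in data.get('uses_symbols', ())}
        defined |= {name for name, _ in data.get('defines_symbols', ())}

    directives = {}
    if used:
        # Variables read on the device must be copied host -> device.
        directives['update_device'] = f"!$acc update device( {','.join(sorted(used))} )"
    if defined:
        # Variables written on the device must be copied device -> host.
        directives['update_host'] = f"!$acc update self( {','.join(sorted(defined))} )"
    return directives


kernel_data = {
    'uses_symbols': {('var2', 'moduleB'), ('var3', 'moduleB')},
    'defines_symbols': {('var4', 'moduleC'), ('var5', 'moduleC')},
}
directives = build_update_directives([kernel_data])
```

Sorting the names keeps the generated directives deterministic, which is convenient when comparing generated source between runs.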

class GlobalVarHoistTransformation(hoist_parameters=False, ignore_modules=None, key=None)

Bases: Transformation

Transformation to hoist module variables used in device routines

This requires a prior analysis pass with GlobalVariableAnalysis to collect the relevant global variable use information.

Modules to be ignored can be specified. It is also possible to configure whether parameters (compile-time constants) are hoisted as well.

Note

Hoisted variables that could theoretically be intent(out) are nevertheless specified as intent(inout).

For example, the following code:

module moduleB
   real :: var2
   real :: var3
end module moduleB

module moduleC
   real :: var4
   real :: var5
end module moduleC

subroutine driver()
implicit none

call kernel()

end subroutine driver

subroutine kernel()
use moduleB, only: var2,var3
use moduleC, only: var4,var5
implicit none

var4 = var2
var5 = var3

end subroutine kernel

is transformed to:

module moduleB
   real :: var2
   real :: var3
end module moduleB

module moduleC
   real :: var4
   real :: var5
end module moduleC

subroutine driver()
use moduleB, only: var2,var3
use moduleC, only: var4,var5
implicit none

call kernel(var2, var3, var4, var5)

end subroutine driver

subroutine kernel(var2, var3, var4, var5)
implicit none
real, intent(in) :: var2
real, intent(in) :: var3
real, intent(inout) :: var4
real, intent(inout) :: var5

var4 = var2
var5 = var3

end subroutine kernel

Parameters:
  • hoist_parameters (bool, optional) – Whether or not to hoist module variables that are parameters/compile-time constants (default: False).

  • ignore_modules ((list, tuple) of str) – Modules to be ignored (default: None, i.e. no modules are ignored).

  • key (str, optional) – Overwrite the key that is used to store analysis results in trafo_data.
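
The effect of hoist_parameters and ignore_modules can be pictured as a filter over the analysed symbols. The sketch below is an assumption-laden illustration: select_hoistable is a hypothetical helper, variables are plain strings, and module names are compared case-insensitively on the assumption that this mirrors Fortran's case-insensitive identifiers.

```python
def select_hoistable(symbols, parameters=(), hoist_parameters=False,
                     ignore_modules=None):
    """Return the (variable, module) pairs that would be hoisted.

    symbols: iterable of (variable_name, module_name) pairs.
    parameters: names of variables that are compile-time constants.
    Hypothetical helper for illustration; not part of Loki.
    """
    ignored = {module.lower() for module in (ignore_modules or ())}
    constants = set(parameters)
    selected = []
    for name, module in sorted(symbols):
        if module.lower() in ignored:
            continue  # the whole module is excluded from hoisting
        if name in constants and not hoist_parameters:
            continue  # compile-time constants stay as module imports
        selected.append((name, module))
    return selected


symbols = {('var2', 'moduleB'), ('nmax', 'moduleB'), ('var4', 'moduleC')}
default_selection = select_hoistable(symbols, parameters=['nmax'])
without_c = select_hoistable(symbols, parameters=['nmax'],
                             hoist_parameters=True, ignore_modules=['moduleC'])
```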

item_filter

alias of ProcedureItem

transform_subroutine(routine, **kwargs)

Hoist module variables.

process_driver(routine, successors)

Hoist module variables for driver routines.

This includes: appending the corresponding variables to calls within the driver and adding the relevant imports.

process_kernel(routine, successors, item)

Hoist module variables for kernel routines.

This includes: appending the corresponding variables to the routine arguments as well as to calls within the kernel and removing the imports that became unused.
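
The intent rule from the note above (read-only variables become intent(in), anything written becomes intent(inout)) can be sketched as follows. Both helpers are hypothetical and operate on plain strings rather than Loki's symbol and IR types; they only illustrate the rule and the resulting argument list from the example.

```python
def hoisted_intents(uses_symbols, defines_symbols):
    """Map each hoisted variable name to the intent of its new dummy argument.

    Variables that are only read become intent(in); any variable that is
    written becomes intent(inout), even where intent(out) would suffice.
    Hypothetical helper for illustration; not part of Loki.
    """
    used = {name for name, _ in uses_symbols}
    defined = {name for name, _ in defines_symbols}
    intents = {}
    for name in sorted(used | defined):
        intents[name] = 'inout' if name in defined else 'in'
    return intents


def kernel_signature(name, intents):
    """Format the kernel's subroutine statement after hoisting."""
    return f"subroutine {name}({', '.join(intents)})"


intents = hoisted_intents(
    uses_symbols={('var2', 'moduleB'), ('var3', 'moduleB')},
    defines_symbols={('var4', 'moduleC'), ('var5', 'moduleC')},
)
signature = kernel_signature('kernel', intents)
```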