loki.transformations.single_column.scc
Module Attributes
The basic Single Column Coalesced (SCC) transformation with vector-level kernel parallelism.
The basic Single Column Coalesced (SCC) transformation with sequential kernels.
SCC-style transformation with "vector-parallel" kernels that additionally hoists local temporary arrays that cannot be demoted to the outer driver call.
SCC-style transformation with sequential kernels that additionally hoists local temporary arrays that cannot be demoted to the outer driver call.
SCC-style transformation with "vector-parallel" kernels that additionally pre-allocates a "stack" pool allocator and associates local arrays with preallocated memory.
SCC-style transformation with sequential kernels that additionally pre-allocates a "stack" pool allocator and associates local arrays with preallocated memory.
SCC-style transformation with "vector-parallel" kernels that additionally pre-allocates a "stack" pool allocator and associates local arrays with preallocated memory.
SCC-style transformation with sequential kernels that additionally pre-allocates a "stack" pool allocator and associates local arrays with preallocated memory.
SCC-style transformation with "vector-parallel" kernels that additionally pre-allocates a "stack" pool allocator and replaces local temporaries with indexed sub-arrays of this preallocated array.
SCC-style transformation with sequential kernels that additionally pre-allocates a "stack" pool allocator and replaces local temporaries with indexed sub-arrays of this preallocated array.
SCC-style transformation with "vector-parallel" kernels that additionally pre-allocates a "stack" pool allocator and replaces local temporaries with indexed sub-arrays of this preallocated array.
SCC-style transformation with sequential kernels that additionally pre-allocates a "stack" pool allocator and replaces local temporaries with indexed sub-arrays of this preallocated array.
SCC-style transformation with "vector-parallel" kernels that additionally pre-allocates a "stack" pool allocator and associates local arrays with preallocated memory.
SCC-style transformation with sequential kernels that additionally pre-allocates a "stack" pool allocator and associates local arrays with preallocated memory.
Classes
A special version of |
- SCCVVectorPipeline = functools.partial(<class 'loki.batch.pipeline.Pipeline'>, classes=(<class 'loki.transformations.single_column.vertical.SCCFuseVerticalLoops'>, <class 'loki.transformations.single_column.base.SCCBaseTransformation'>, <class 'loki.transformations.single_column.devector.SCCDevectorTransformation'>, <class 'loki.transformations.single_column.demote.SCCDemoteTransformation'>, <class 'loki.transformations.single_column.revector.SCCVecRevectorTransformation'>, <class 'loki.transformations.single_column.annotate.SCCAnnotateTransformation'>, <class 'loki.transformations.pragma_model.PragmaModelTransformation'>))
The basic Single Column Coalesced (SCC) transformation with vector-level kernel parallelism.
This transformation converts kernels with innermost vectorisation along a common horizontal dimension to a GPU-friendly loop layout via loop inversion and local array variable demotion. The resulting kernel remains “vector-parallel”, but with the horizontal loop as the outermost iteration dimension (as far as data dependencies allow). This allows local temporary arrays to be demoted to scalars, where possible.
The outer “driver” loop over blocks is used as the secondary dimension of parallelism, where the outer data indexing dimension (block_dim) is resolved in the first call to a “kernel” routine. This is equivalent to a so-called “gang-vector” parallelisation scheme.
This Pipeline applies the following Transformation classes in sequence:
1. SCCBaseTransformation - Ensure utility variables and resolve problematic code constructs.
2. SCCDevectorTransformation - Remove horizontal vector loops.
3. SCCDemoteTransformation - Demote local temporary array variables where appropriate.
4. SCCVecRevectorTransformation - Re-insert the vector loops outermost, according to identified vector sections.
5. SCCAnnotateTransformation - Annotate loops according to programming model (directive).
- Parameters:
horizontal (Dimension) – Dimension object describing the variable conventions used in code to define the horizontal data dimension and iteration space.
block_dim (Dimension) – Optional Dimension object to define the blocking dimension to use for hoisted column arrays if hoisting is enabled.
directive (string or None) – Directives flavour to use for parallelism annotations; either 'openacc', 'omp-gpu' or None.
trim_vector_sections (bool) – Flag to trigger trimming of extracted vector sections to remove nodes that are not assignments involving vector parallel arrays.
demote_local_arrays (bool) – Flag to trigger local array demotion to scalar variables where possible
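The loop inversion and demotion described for this pipeline can be illustrated outside of Loki. The following is a minimal pure-Python sketch, not Loki output: the function names, array shapes and arithmetic are hypothetical, chosen only to contrast a horizontal-innermost kernel that needs a temporary array with an equivalent horizontal-outermost kernel whose temporary has become a scalar.

```python
# Illustrative sketch only: plain Python standing in for Fortran kernels.
# All names (kernel_vectorised, kernel_scc, nlon, nlev) are hypothetical.

def kernel_vectorised(field, nlon, nlev):
    """Original layout: horizontal loop innermost, temporary is an array."""
    out = [[0.0] * nlon for _ in range(nlev)]
    tmp = [0.0] * nlon                      # local temporary over the horizontal
    for jk in range(nlev):                  # vertical loop (outer)
        for jl in range(nlon):              # horizontal vector loop (inner)
            tmp[jl] = 2.0 * field[jk][jl]
        for jl in range(nlon):              # second vector loop reads the temporary
            out[jk][jl] = tmp[jl] + 1.0
    return out


def kernel_scc(field, nlon, nlev):
    """SCC layout: horizontal loop outermost, temporary demoted to a scalar."""
    out = [[0.0] * nlon for _ in range(nlev)]
    for jl in range(nlon):                  # horizontal loop (outer), one column each
        for jk in range(nlev):              # vertical loop (inner)
            tmp = 2.0 * field[jk][jl]       # scalar: no horizontal dimension left
            out[jk][jl] = tmp + 1.0
    return out


field = [[float(jk * 10 + jl) for jl in range(4)] for jk in range(3)]
assert kernel_vectorised(field, 4, 3) == kernel_scc(field, 4, 3)
```

Both functions compute the same result; only the loop order and the storage needed for the temporary differ.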
- SCCSVectorPipeline = functools.partial(<class 'loki.batch.pipeline.Pipeline'>, classes=(<class 'loki.transformations.single_column.vertical.SCCFuseVerticalLoops'>, <class 'loki.transformations.single_column.base.SCCBaseTransformation'>, <class 'loki.transformations.single_column.devector.SCCDevectorTransformation'>, <class 'loki.transformations.single_column.demote.SCCDemoteTransformation'>, <class 'loki.transformations.single_column.revector.SCCSeqRevectorTransformation'>, <class 'loki.transformations.single_column.annotate.SCCAnnotateTransformation'>, <class 'loki.transformations.pragma_model.PragmaModelTransformation'>))
The basic Single Column Coalesced (SCC) transformation with sequential kernels.
This transformation converts kernels with innermost vectorisation along a common horizontal dimension to a GPU-friendly loop layout via loop inversion and local array variable demotion. The resulting kernel becomes sequential as the horizontal loop is hoisted to the driver and the loop index becomes an argument to the kernel(s). Moreover, this allows local temporary arrays to be demoted to scalars, where possible.
The outer “driver” loop over blocks is used as the secondary dimension of parallelism, where the outer data indexing dimension (block_dim) is resolved in the first call to a “kernel” routine. This is equivalent to a so-called “gang-vector” parallelisation scheme.
This Pipeline applies the following Transformation classes in sequence:
1. SCCBaseTransformation - Ensure utility variables and resolve problematic code constructs.
2. SCCDevectorTransformation - Remove horizontal vector loops.
3. SCCDemoteTransformation - Demote local temporary array variables where appropriate.
4. SCCSeqRevectorTransformation - Re-insert the vector loops outermost, according to identified vector sections.
5. SCCAnnotateTransformation - Annotate loops according to programming model (directive).
- Parameters:
horizontal (Dimension) – Dimension object describing the variable conventions used in code to define the horizontal data dimension and iteration space.
block_dim (Dimension) – Optional Dimension object to define the blocking dimension to use for hoisted column arrays if hoisting is enabled.
directive (string or None) – Directives flavour to use for parallelism annotations; either 'openacc', 'omp-gpu' or None.
trim_vector_sections (bool) – Flag to trigger trimming of extracted vector sections to remove nodes that are not assignments involving vector parallel arrays.
demote_local_arrays (bool) – Flag to trigger local array demotion to scalar variables where possible
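As a rough illustration of the sequential variant, the sketch below moves the horizontal loop into a driver and passes the loop index to a kernel that handles one column at a time. It is plain Python with hypothetical names (`kernel_seq`, `driver`, `nlon`, `nlev`) and made-up arithmetic; it does not reproduce code generated by Loki.

```python
# Illustrative sketch only: plain Python standing in for Fortran routines.
# All names and the arithmetic are hypothetical.

def kernel_seq(field, out, jl, nlev):
    """Sequential kernel: processes the single column selected by index jl."""
    for jk in range(nlev):                  # only the vertical loop remains
        tmp = 2.0 * field[jk][jl]           # demoted scalar temporary
        out[jk][jl] = tmp + 1.0


def driver(blocks, nlon, nlev):
    """Driver: loops over blocks, then over the horizontal, then calls the kernel."""
    results = []
    for field in blocks:                    # block loop
        out = [[0.0] * nlon for _ in range(nlev)]
        for jl in range(nlon):              # horizontal loop hoisted into the driver
            kernel_seq(field, out, jl, nlev)
        results.append(out)
    return results


# Two blocks, each with nlev=2 levels and nlon=2 columns.
blocks = [[[float(b + jk + jl) for jl in range(2)] for jk in range(2)]
          for b in range(2)]
results = driver(blocks, nlon=2, nlev=2)
```

The kernel no longer contains any loop over the horizontal, which is what makes it "sequential" in the sense used above.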
- SCCVHoistPipeline = functools.partial(<class 'loki.batch.pipeline.Pipeline'>, classes=(<class 'loki.transformations.single_column.vertical.SCCFuseVerticalLoops'>, <class 'loki.transformations.single_column.base.SCCBaseTransformation'>, <class 'loki.transformations.single_column.devector.SCCDevectorTransformation'>, <class 'loki.transformations.single_column.demote.SCCDemoteTransformation'>, <class 'loki.transformations.single_column.revector.SCCVecRevectorTransformation'>, <class 'loki.transformations.single_column.scc.RemoveUnusedVarTransformation'>, <class 'loki.transformations.temporaries.hoist_variables.HoistTemporaryArraysAnalysis'>, <class 'loki.transformations.single_column.hoist.SCCHoistTemporaryArraysTransformation'>, <class 'loki.transformations.single_column.annotate.SCCAnnotateTransformation'>, <class 'loki.transformations.pragma_model.PragmaModelTransformation'>))
SCC-style transformation with “vector-parallel” kernels that additionally hoists local temporary arrays that cannot be demoted to the outer driver call.
For details of the kernel and driver-side transformations, please refer to SCCVVectorPipeline.
In addition, this pipeline will invoke HoistTemporaryArraysAnalysis and SCCHoistTemporaryArraysTransformation before the final annotation step to hoist multi-dimensional local temporary array variables to the “driver” routine, where they will be allocated on device and passed down as arguments.
- Parameters:
horizontal (Dimension) – Dimension object describing the variable conventions used in code to define the horizontal data dimension and iteration space.
block_dim (Dimension) – Optional Dimension object to define the blocking dimension to use for hoisted column arrays if hoisting is enabled.
directive (string or None) – Directives flavour to use for parallelism annotations; either 'openacc', 'omp-gpu' or None.
trim_vector_sections (bool) – Flag to trigger trimming of extracted vector sections to remove nodes that are not assignments involving vector parallel arrays.
demote_local_arrays (bool) – Flag to trigger local array demotion to scalar variables where possible.
dim_vars (tuple of str, optional) – Variables that must appear in the dimensions of the arrays to be hoisted. If not provided, no checks will be done for the array dimensions in HoistTemporaryArraysAnalysis.
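The effect of hoisting can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not Loki output: the routine names, the extra block dimension on the hoisted array and the arithmetic are all hypothetical. It shows a kernel that allocates a two-dimensional temporary locally next to a variant that receives the same storage from the driver as an argument.

```python
# Illustrative sketch only; every name here is hypothetical.

def kernel_local(field, nlon, nlev):
    """Before hoisting: the 2-D temporary is allocated inside the kernel."""
    tmp = [[0.0] * nlon for _ in range(nlev)]
    for jl in range(nlon):
        for jk in range(nlev):
            tmp[jk][jl] = field[jk][jl] * 0.5
    return [sum(tmp[jk][jl] for jk in range(nlev)) for jl in range(nlon)]


def kernel_hoisted(field, nlon, nlev, tmp):
    """After hoisting: the temporary arrives as an argument."""
    for jl in range(nlon):
        for jk in range(nlev):
            tmp[jk][jl] = field[jk][jl] * 0.5
    return [sum(tmp[jk][jl] for jk in range(nlev)) for jl in range(nlon)]


def driver_hoisted(blocks, nlon, nlev):
    """Driver allocates the temporary once, with an extra block dimension."""
    nblocks = len(blocks)
    tmp = [[[0.0] * nlon for _ in range(nlev)] for _ in range(nblocks)]
    return [kernel_hoisted(blocks[b], nlon, nlev, tmp[b])
            for b in range(nblocks)]


blocks = [[[1.0, 2.0], [3.0, 4.0]]]         # one block, nlev=2, nlon=2
assert driver_hoisted(blocks, 2, 2) == [kernel_local(blocks[0], 2, 2)]
```

The kernel body is unchanged apart from where the temporary comes from, so no allocation happens inside the kernel call.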
- SCCSHoistPipeline = functools.partial(<class 'loki.batch.pipeline.Pipeline'>, classes=(<class 'loki.transformations.single_column.vertical.SCCFuseVerticalLoops'>, <class 'loki.transformations.single_column.base.SCCBaseTransformation'>, <class 'loki.transformations.single_column.devector.SCCDevectorTransformation'>, <class 'loki.transformations.single_column.demote.SCCDemoteTransformation'>, <class 'loki.transformations.single_column.revector.SCCSeqRevectorTransformation'>, <class 'loki.transformations.single_column.scc.RemoveUnusedVarTransformation'>, <class 'loki.transformations.temporaries.hoist_variables.HoistTemporaryArraysAnalysis'>, <class 'loki.transformations.single_column.hoist.SCCHoistTemporaryArraysTransformation'>, <class 'loki.transformations.single_column.annotate.SCCAnnotateTransformation'>, <class 'loki.transformations.pragma_model.PragmaModelTransformation'>))
SCC-style transformation with sequential kernels that additionally hoists local temporary arrays that cannot be demoted to the outer driver call.
For details of the kernel and driver-side transformations, please refer to SCCSVectorPipeline.
In addition, this pipeline will invoke HoistTemporaryArraysAnalysis and SCCHoistTemporaryArraysTransformation before the final annotation step to hoist multi-dimensional local temporary array variables to the “driver” routine, where they will be allocated on device and passed down as arguments.
- Parameters:
horizontal (Dimension) – Dimension object describing the variable conventions used in code to define the horizontal data dimension and iteration space.
block_dim (Dimension) – Optional Dimension object to define the blocking dimension to use for hoisted column arrays if hoisting is enabled.
directive (string or None) – Directives flavour to use for parallelism annotations; either 'openacc', 'omp-gpu' or None.
trim_vector_sections (bool) – Flag to trigger trimming of extracted vector sections to remove nodes that are not assignments involving vector parallel arrays.
demote_local_arrays (bool) – Flag to trigger local array demotion to scalar variables where possible.
dim_vars (tuple of str, optional) – Variables that must appear in the dimensions of the arrays to be hoisted. If not provided, no checks will be done for the array dimensions in HoistTemporaryArraysAnalysis.
- SCCVStackPipeline = functools.partial(<class 'loki.batch.pipeline.Pipeline'>, classes=(<class 'loki.transformations.single_column.vertical.SCCFuseVerticalLoops'>, <class 'loki.transformations.single_column.base.SCCBaseTransformation'>, <class 'loki.transformations.single_column.devector.SCCDevectorTransformation'>, <class 'loki.transformations.single_column.demote.SCCDemoteTransformation'>, <class 'loki.transformations.single_column.revector.SCCVecRevectorTransformation'>, <class 'loki.transformations.single_column.scc.RemoveUnusedVarTransformation'>, <class 'loki.transformations.single_column.annotate.SCCAnnotateTransformation'>, <class 'loki.transformations.temporaries.pool_allocator.TemporariesPoolAllocatorTransformation'>, <class 'loki.transformations.pragma_model.PragmaModelTransformation'>))
SCC-style transformation with “vector-parallel” kernels that additionally pre-allocates a “stack” pool allocator and associates local arrays with preallocated memory.
For details of the kernel and driver-side transformations, please refer to SCCVVectorPipeline.
In addition, this pipeline will invoke TemporariesPoolAllocatorTransformation to back the remaining locally allocated arrays from a “stack” pool allocator that is pre-allocated in the driver routine and passed down via arguments.
- Parameters:
horizontal (Dimension) – Dimension object describing the variable conventions used in code to define the horizontal data dimension and iteration space.
block_dim (Dimension) – Optional Dimension object to define the blocking dimension to use for hoisted column arrays if hoisting is enabled.
directive (string or None) – Directives flavour to use for parallelism annotations; either 'openacc', 'omp-gpu' or None.
trim_vector_sections (bool) – Flag to trigger trimming of extracted vector sections to remove nodes that are not assignments involving vector parallel arrays.
demote_local_arrays (bool) – Flag to trigger local array demotion to scalar variables where possible
check_bounds (bool, optional) – Insert bounds-checks in the kernel to make sure the allocated stack size is not exceeded (default: True)
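The idea of a pre-allocated “stack” pool with an optional bounds check can be sketched as follows. This is a toy model in plain Python, not the code that TemporariesPoolAllocatorTransformation generates: the `StackPool` class, its `allocate` method and the use of `MemoryError` are assumptions made for illustration only.

```python
# Illustrative sketch only; StackPool and kernel are hypothetical names.

class StackPool:
    """Toy pool allocator: one preallocated buffer plus a moving offset."""

    def __init__(self, size, check_bounds=True):
        self.buffer = [0.0] * size          # allocated once, in the "driver"
        self.size = size
        self.check_bounds = check_bounds

    def allocate(self, offset, n):
        """Reserve n elements starting at offset; return (start, new_offset)."""
        new_offset = offset + n
        if self.check_bounds and new_offset > self.size:
            raise MemoryError(
                f"stack overflow: need {new_offset}, have {self.size}")
        return offset, new_offset


def kernel(pool, offset, nlev):
    """Works on a local copy of the offset, so temporaries vanish on return."""
    a, offset = pool.allocate(offset, nlev)     # temporary a(nlev)
    b, offset = pool.allocate(offset, nlev)     # temporary b(nlev)
    for jk in range(nlev):
        pool.buffer[a + jk] = float(jk)
        pool.buffer[b + jk] = 2.0 * pool.buffer[a + jk]
    return sum(pool.buffer[b:b + nlev])


pool = StackPool(size=8)
total = kernel(pool, offset=0, nlev=4)      # two temporaries of 4 elements each
```

With `check_bounds=True` the sketch fails early when the two temporaries do not fit into the pool, which mirrors the purpose of the `check_bounds` parameter described above.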
- SCCSStackPipeline = functools.partial(<class 'loki.batch.pipeline.Pipeline'>, classes=(<class 'loki.transformations.single_column.vertical.SCCFuseVerticalLoops'>, <class 'loki.transformations.single_column.base.SCCBaseTransformation'>, <class 'loki.transformations.single_column.devector.SCCDevectorTransformation'>, <class 'loki.transformations.single_column.demote.SCCDemoteTransformation'>, <class 'loki.transformations.single_column.revector.SCCSeqRevectorTransformation'>, <class 'loki.transformations.single_column.scc.RemoveUnusedVarTransformation'>, <class 'loki.transformations.single_column.annotate.SCCAnnotateTransformation'>, <class 'loki.transformations.temporaries.pool_allocator.TemporariesPoolAllocatorTransformation'>, <class 'loki.transformations.pragma_model.PragmaModelTransformation'>))
SCC-style transformation with sequential kernels that additionally pre-allocates a “stack” pool allocator and associates local arrays with preallocated memory.
For details of the kernel and driver-side transformations, please refer to SCCSVectorPipeline.
In addition, this pipeline will invoke TemporariesPoolAllocatorTransformation to back the remaining locally allocated arrays from a “stack” pool allocator that is pre-allocated in the driver routine and passed down via arguments.
- Parameters:
horizontal (Dimension) – Dimension object describing the variable conventions used in code to define the horizontal data dimension and iteration space.
block_dim (Dimension) – Optional Dimension object to define the blocking dimension to use for hoisted column arrays if hoisting is enabled.
directive (string or None) – Directives flavour to use for parallelism annotations; either 'openacc', 'omp-gpu' or None.
trim_vector_sections (bool) – Flag to trigger trimming of extracted vector sections to remove nodes that are not assignments involving vector parallel arrays.
demote_local_arrays (bool) – Flag to trigger local array demotion to scalar variables where possible
check_bounds (bool, optional) – Insert bounds-checks in the kernel to make sure the allocated stack size is not exceeded (default: True)
- SCCVStackFtrPtrPipeline = functools.partial(<class 'loki.batch.pipeline.Pipeline'>, classes=(<class 'loki.transformations.single_column.vertical.SCCFuseVerticalLoops'>, <class 'loki.transformations.single_column.base.SCCBaseTransformation'>, <class 'loki.transformations.single_column.devector.SCCDevectorTransformation'>, <class 'loki.transformations.single_column.demote.SCCDemoteTransformation'>, <class 'loki.transformations.single_column.revector.SCCVecRevectorTransformation'>, <class 'loki.transformations.single_column.scc.RemoveUnusedVarTransformation'>, <class 'loki.transformations.single_column.annotate.SCCAnnotateTransformation'>, <class 'loki.transformations.temporaries.stack_allocator.FtrPtrStackTransformation'>, <class 'loki.transformations.pragma_model.PragmaModelTransformation'>))
SCC-style transformation with “vector-parallel” kernels that additionally pre-allocates a “stack” pool allocator and associates local arrays with preallocated memory.
For details of the kernel and driver-side transformations, please refer to SCCVVectorPipeline.
In addition, this pipeline will invoke FtrPtrStackTransformation to back the remaining locally allocated arrays from a “stack” pool allocator that is pre-allocated in the driver routine and passed down via arguments.
- Parameters:
horizontal (Dimension) – Dimension object describing the variable conventions used in code to define the horizontal data dimension and iteration space.
block_dim (Dimension) – Optional Dimension object to define the blocking dimension to use for hoisted column arrays if hoisting is enabled.
directive (string or None) – Directives flavour to use for parallelism annotations; either 'openacc', 'omp-gpu' or None.
trim_vector_sections (bool) – Flag to trigger trimming of extracted vector sections to remove nodes that are not assignments involving vector parallel arrays.
demote_local_arrays (bool) – Flag to trigger local array demotion to scalar variables where possible
check_bounds (bool, optional) – Insert bounds-checks in the kernel to make sure the allocated stack size is not exceeded (default: True)
- SCCSStackFtrPtrPipeline = functools.partial(<class 'loki.batch.pipeline.Pipeline'>, classes=(<class 'loki.transformations.single_column.vertical.SCCFuseVerticalLoops'>, <class 'loki.transformations.single_column.base.SCCBaseTransformation'>, <class 'loki.transformations.single_column.devector.SCCDevectorTransformation'>, <class 'loki.transformations.single_column.demote.SCCDemoteTransformation'>, <class 'loki.transformations.single_column.revector.SCCSeqRevectorTransformation'>, <class 'loki.transformations.single_column.scc.RemoveUnusedVarTransformation'>, <class 'loki.transformations.single_column.annotate.SCCAnnotateTransformation'>, <class 'loki.transformations.temporaries.stack_allocator.FtrPtrStackTransformation'>, <class 'loki.transformations.pragma_model.PragmaModelTransformation'>))
SCC-style transformation with sequential kernels that additionally pre-allocates a “stack” pool allocator and associates local arrays with preallocated memory.
For details of the kernel and driver-side transformations, please refer to SCCSVectorPipeline.
In addition, this pipeline will invoke FtrPtrStackTransformation to back the remaining locally allocated arrays from a “stack” pool allocator that is pre-allocated in the driver routine and passed down via arguments.
- Parameters:
horizontal (Dimension) – Dimension object describing the variable conventions used in code to define the horizontal data dimension and iteration space.
block_dim (Dimension) – Optional Dimension object to define the blocking dimension to use for hoisted column arrays if hoisting is enabled.
directive (string or None) – Directives flavour to use for parallelism annotations; either 'openacc', 'omp-gpu' or None.
trim_vector_sections (bool) – Flag to trigger trimming of extracted vector sections to remove nodes that are not assignments involving vector parallel arrays.
demote_local_arrays (bool) – Flag to trigger local array demotion to scalar variables where possible
check_bounds (bool, optional) – Insert bounds-checks in the kernel to make sure the allocated stack size is not exceeded (default: True)
- SCCVStackDirectIdxPipeline = functools.partial(<class 'loki.batch.pipeline.Pipeline'>, classes=(<class 'loki.transformations.single_column.vertical.SCCFuseVerticalLoops'>, <class 'loki.transformations.single_column.base.SCCBaseTransformation'>, <class 'loki.transformations.single_column.devector.SCCDevectorTransformation'>, <class 'loki.transformations.single_column.demote.SCCDemoteTransformation'>, <class 'loki.transformations.single_column.revector.SCCVecRevectorTransformation'>, <class 'loki.transformations.single_column.scc.RemoveUnusedVarTransformation'>, <class 'loki.transformations.single_column.annotate.SCCAnnotateTransformation'>, <class 'loki.transformations.temporaries.stack_allocator.DirectIdxStackTransformation'>, <class 'loki.transformations.pragma_model.PragmaModelTransformation'>))
SCC-style transformation with “vector-parallel” kernels that additionally pre-allocates a “stack” pool allocator and replaces local temporaries with indexed sub-arrays of this preallocated array.
For details of the kernel and driver-side transformations, please refer to SCCVVectorPipeline.
In addition, this pipeline will invoke DirectIdxStackTransformation to back the remaining locally allocated arrays from a “stack” pool allocator that is pre-allocated in the driver routine and passed down via arguments.
- Parameters:
horizontal (Dimension) – Dimension object describing the variable conventions used in code to define the horizontal data dimension and iteration space.
block_dim (Dimension) – Optional Dimension object to define the blocking dimension to use for hoisted column arrays if hoisting is enabled.
directive (string or None) – Directives flavour to use for parallelism annotations; either 'openacc', 'omp-gpu' or None.
trim_vector_sections (bool) – Flag to trigger trimming of extracted vector sections to remove nodes that are not assignments involving vector parallel arrays.
demote_local_arrays (bool) – Flag to trigger local array demotion to scalar variables where possible
check_bounds (bool, optional) – Insert bounds-checks in the kernel to make sure the allocated stack size is not exceeded (default: True)
driver_horizontal (str, optional) – Override string if a separate variable name should be used for the horizontal when allocating the stack in the driver.
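Replacing local temporaries with indexed sub-arrays of one preallocated array can be pictured with the sketch below. It is plain Python with hypothetical names and a hypothetical memory layout (two temporaries per column, stored one after the other); it is not the indexing scheme that DirectIdxStackTransformation actually emits. The driver sizes the stack from its own horizontal extent (`nproma` here, standing in for what `driver_horizontal` would name), while the kernel uses `nlon`.

```python
# Illustrative sketch only; names and layout are hypothetical.

def driver(nproma, nlev):
    """Driver preallocates one flat stack, sized from its own horizontal extent."""
    ntemps = 2                              # assumed: two temporaries per kernel
    stack = [0.0] * (nproma * nlev * ntemps)
    return [kernel(stack, jl, nproma, nlev) for jl in range(nproma)]


def kernel(stack, jl, nlon, nlev):
    """Temporaries a(nlon, nlev) and b(nlon, nlev) become offsets into stack."""
    a_base = 0                              # a occupies the first nlon*nlev slots
    b_base = nlon * nlev                    # b follows directly after a
    for jk in range(nlev):
        a = a_base + jk * nlon + jl         # index replacing a(jl, jk)
        b = b_base + jk * nlon + jl         # index replacing b(jl, jk)
        stack[a] = float(jl + jk)
        stack[b] = stack[a] * stack[a]
    return sum(stack[b_base + jk * nlon + jl] for jk in range(nlev))


result = driver(nproma=3, nlev=2)
```

No separate arrays are created inside the kernel; each former temporary is just a base offset plus ordinary index arithmetic into the shared stack.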
- SCCSStackDirectIdxPipeline = functools.partial(<class 'loki.batch.pipeline.Pipeline'>, classes=(<class 'loki.transformations.single_column.vertical.SCCFuseVerticalLoops'>, <class 'loki.transformations.single_column.base.SCCBaseTransformation'>, <class 'loki.transformations.single_column.devector.SCCDevectorTransformation'>, <class 'loki.transformations.single_column.demote.SCCDemoteTransformation'>, <class 'loki.transformations.single_column.revector.SCCSeqRevectorTransformation'>, <class 'loki.transformations.single_column.scc.RemoveUnusedVarTransformation'>, <class 'loki.transformations.single_column.annotate.SCCAnnotateTransformation'>, <class 'loki.transformations.temporaries.stack_allocator.DirectIdxStackTransformation'>, <class 'loki.transformations.pragma_model.PragmaModelTransformation'>))
SCC-style transformation with sequential kernels that additionally pre-allocates a “stack” pool allocator and replaces local temporaries with indexed sub-arrays of this preallocated array.
For details of the kernel and driver-side transformations, please refer to SCCSVectorPipeline.
In addition, this pipeline will invoke DirectIdxStackTransformation to back the remaining locally allocated arrays from a “stack” pool allocator that is pre-allocated in the driver routine and passed down via arguments.
- Parameters:
horizontal (Dimension) – Dimension object describing the variable conventions used in code to define the horizontal data dimension and iteration space.
block_dim (Dimension) – Optional Dimension object to define the blocking dimension to use for hoisted column arrays if hoisting is enabled.
directive (string or None) – Directives flavour to use for parallelism annotations; either 'openacc', 'omp-gpu' or None.
trim_vector_sections (bool) – Flag to trigger trimming of extracted vector sections to remove nodes that are not assignments involving vector parallel arrays.
demote_local_arrays (bool) – Flag to trigger local array demotion to scalar variables where possible
check_bounds (bool, optional) – Insert bounds-checks in the kernel to make sure the allocated stack size is not exceeded (default: True)
driver_horizontal (str, optional) – Override string if a separate variable name should be used for the horizontal when allocating the stack in the driver.
- SCCVRawStackPipeline = functools.partial(<class 'loki.batch.pipeline.Pipeline'>, classes=(<class 'loki.transformations.single_column.base.SCCBaseTransformation'>, <class 'loki.transformations.single_column.devector.SCCDevectorTransformation'>, <class 'loki.transformations.single_column.demote.SCCDemoteTransformation'>, <class 'loki.transformations.single_column.revector.SCCVecRevectorTransformation'>, <class 'loki.transformations.single_column.scc.RemoveUnusedVarTransformation'>, <class 'loki.transformations.single_column.annotate.SCCAnnotateTransformation'>, <class 'loki.transformations.temporaries.raw_stack_allocator.TemporariesRawStackTransformation'>, <class 'loki.transformations.pragma_model.PragmaModelTransformation'>))
SCC-style transformation with “vector-parallel” kernels that additionally pre-allocates a “stack” pool allocator and replaces local temporaries with indexed sub-arrays of this preallocated array.
For details of the kernel and driver-side transformations, please refer to SCCVVectorPipeline.
In addition, this pipeline will invoke TemporariesRawStackTransformation to back the remaining locally allocated arrays from a “stack” pool allocator that is pre-allocated in the driver routine and passed down via arguments.
- Parameters:
horizontal (Dimension) – Dimension object describing the variable conventions used in code to define the horizontal data dimension and iteration space.
block_dim (Dimension) – Optional Dimension object to define the blocking dimension to use for hoisted column arrays if hoisting is enabled.
directive (string or None) – Directives flavour to use for parallelism annotations; either 'openacc', 'omp-gpu' or None.
trim_vector_sections (bool) – Flag to trigger trimming of extracted vector sections to remove nodes that are not assignments involving vector parallel arrays.
demote_local_arrays (bool) – Flag to trigger local array demotion to scalar variables where possible
check_bounds (bool, optional) – Insert bounds-checks in the kernel to make sure the allocated stack size is not exceeded (default: True)
driver_horizontal (str, optional) – Override string if a separate variable name should be used for the horizontal when allocating the stack in the driver.
- SCCSRawStackPipeline = functools.partial(<class 'loki.batch.pipeline.Pipeline'>, classes=(<class 'loki.transformations.single_column.base.SCCBaseTransformation'>, <class 'loki.transformations.single_column.devector.SCCDevectorTransformation'>, <class 'loki.transformations.single_column.demote.SCCDemoteTransformation'>, <class 'loki.transformations.single_column.revector.SCCSeqRevectorTransformation'>, <class 'loki.transformations.single_column.annotate.SCCAnnotateTransformation'>, <class 'loki.transformations.temporaries.raw_stack_allocator.TemporariesRawStackTransformation'>, <class 'loki.transformations.pragma_model.PragmaModelTransformation'>))
SCC-style transformation with sequential kernels that additionally pre-allocates a “stack” pool allocator and replaces local temporaries with indexed sub-arrays of this preallocated array.
For details of the kernel and driver-side transformations, please refer to SCCSVectorPipeline.
In addition, this pipeline will invoke TemporariesRawStackTransformation to back the remaining locally allocated arrays from a “stack” pool allocator that is pre-allocated in the driver routine and passed down via arguments.
- Parameters:
horizontal (Dimension) – Dimension object describing the variable conventions used in code to define the horizontal data dimension and iteration space.
block_dim (Dimension) – Optional Dimension object to define the blocking dimension to use for hoisted column arrays if hoisting is enabled.
directive (string or None) – Directives flavour to use for parallelism annotations; either 'openacc', 'omp-gpu' or None.
trim_vector_sections (bool) – Flag to trigger trimming of extracted vector sections to remove nodes that are not assignments involving vector parallel arrays.
demote_local_arrays (bool) – Flag to trigger local array demotion to scalar variables where possible
check_bounds (bool, optional) – Insert bounds-checks in the kernel to make sure the allocated stack size is not exceeded (default: True)
driver_horizontal (str, optional) – Override string if a separate variable name should be used for the horizontal when allocating the stack in the driver.
- SCCSEcStackPipeline = functools.partial(<class 'loki.batch.pipeline.Pipeline'>, classes=(<class 'loki.transformations.single_column.vertical.SCCFuseVerticalLoops'>, <class 'loki.transformations.single_column.base.SCCBaseTransformation'>, <class 'loki.transformations.single_column.devector.SCCDevectorTransformation'>, <class 'loki.transformations.single_column.demote.SCCDemoteTransformation'>, <class 'loki.transformations.single_column.revector.SCCSeqRevectorTransformation'>, <class 'loki.transformations.single_column.scc.RemoveUnusedVarTransformation'>, <class 'loki.transformations.single_column.annotate.SCCAnnotateTransformation'>, <class 'loki.transformations.temporaries.pool_allocator.EcstackPoolAllocatorTransformation'>, <class 'loki.transformations.pragma_model.PragmaModelTransformation'>))
SCC-style transformation with sequential kernels that additionally pre-allocates a “stack” pool allocator and associates local arrays with preallocated memory.
For details of the kernel and driver-side transformations, please refer to SCCSVectorPipeline.
In addition, this pipeline will invoke EcstackPoolAllocatorTransformation to back the remaining locally allocated arrays from a “stack” pool allocator that requests a chunk of offloaded memory from an externally defined module.
- Parameters:
horizontal (Dimension) – Dimension object describing the variable conventions used in code to define the horizontal data dimension and iteration space.
block_dim (Dimension) – Optional Dimension object to define the blocking dimension to use for hoisted column arrays if hoisting is enabled.
directive (string or None) – Directives flavour to use for parallelism annotations; either 'openacc', 'omp-gpu' or None.
trim_vector_sections (bool) – Flag to trigger trimming of extracted vector sections to remove nodes that are not assignments involving vector parallel arrays.
demote_local_arrays (bool) – Flag to trigger local array demotion to scalar variables where possible
check_bounds (bool, optional) – Insert bounds-checks in the kernel to make sure the allocated stack size is not exceeded (default: True)
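Every pipeline in this module is defined as a `functools.partial` over `Pipeline` with a preset `classes` tuple. The sketch below shows how such a definition behaves, using toy stand-ins rather than Loki classes. `ToyPipeline`, `Devector` and `Annotate` are hypothetical, and the way `ToyPipeline` hands each class only the keyword arguments its constructor accepts is an assumption made for this illustration, not a statement about Loki's implementation.

```python
import inspect
from functools import partial


class ToyPipeline:
    """Stand-in pipeline: builds each class from the kwargs it accepts."""

    def __init__(self, classes, **kwargs):
        self.transformations = []
        for cls in classes:
            accepted = inspect.signature(cls.__init__).parameters
            args = {k: v for k, v in kwargs.items() if k in accepted}
            self.transformations.append(cls(**args))

    def apply(self, source):
        # Run the transformations in the order given by the classes tuple.
        for transformation in self.transformations:
            source = transformation.apply(source)
        return source


class Devector:
    """Hypothetical step that only needs the horizontal dimension."""

    def __init__(self, horizontal):
        self.horizontal = horizontal

    def apply(self, source):
        return source + [f"devector({self.horizontal})"]


class Annotate:
    """Hypothetical step that only needs the directive flavour."""

    def __init__(self, directive=None):
        self.directive = directive

    def apply(self, source):
        return source + [f"annotate({self.directive})"]


# Same pattern as the module attributes above: preset the classes tuple.
ToyVectorPipeline = partial(ToyPipeline, classes=(Devector, Annotate))

pipeline = ToyVectorPipeline(horizontal="jl", directive="openacc")
log = pipeline.apply([])
```

Because the `classes` tuple is bound in advance, a caller only supplies the parameters listed for the chosen pipeline, and the order of the tuple fixes the order in which the steps run.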