loki.analyse.analyse_dataflow

Collection of dataflow analysis schema routines.

Functions

attach_dataflow_analysis(module_or_routine)

Determine and attach to each IR node dataflow analysis metadata.

dataflow_analysis_attached(module_or_routine)

Create a context in which information about defined, live and used symbols is attached to each IR node

detach_dataflow_analysis(module_or_routine)

Remove from each IR node the stored dataflow analysis metadata.

loop_carried_dependencies(loop)

Find variables that are potentially loop-carried dependencies.

read_after_write_vars(ir, inspection_node)

Find variables that are read after being written in the given IR.

Classes

DataflowAnalysisAttacher(**kwargs)

Analyse and attach in-place the definition, use and live status of symbols.

DataflowAnalysisDetacher(**kwargs)

Remove in-place any dataflow analysis properties.

FindReads([start, stop, active, ...])

Look for reads in a specified part of a control flow tree.

FindWrites([start, stop, active, candidate_set])

Look for writes in a specified part of a control flow tree.

dataflow_analysis_attached(module_or_routine)

Create a context in which information about defined, live and used symbols is attached to each IR node

This makes for each IR node the following properties available:

  • Node.live_symbols: symbols defined before the node;

  • Node.defines_symbols: symbols (potentially) defined by the node;

  • Node.uses_symbols: symbols used by the node that had to be defined before.

This is an in-place update of nodes and thus existing references to IR nodes remain valid. When leaving the context the information is removed from IR nodes, while existing references remain valid.

The analysis is based on a rather crude regions-based analysis, with the hierarchy implied by (nested) InternalNode IR nodes used as regions in the reducible flow graph (cf. Chapter 9, in particular 9.7 of Aho, Lam, Sethi, and Ulliman (2007)). Our implementation shares some similarities with a full reaching definitions dataflow analysis but is not quite as powerful.

In reaching definitions dataflow analysis (cf. Chapter 9.2.4 Aho et. al.), the transfer function of a definition \(d\) can be expressed as:

\[f_d(x) = \operatorname{gen}_d \cup (x - \operatorname{kill}_d)\]

with the set of definitions generated \(\operatorname{gen}_d\) and the set of definitions killed/invalidated \(\operatorname{kill}_d\).

We, however, do not record definitions explicitly and instead operate on consolidated sets of defined symbols, i.e., effectively evaluate the chained transfer functions up to the node. This yields a set of active definitions at this node. The symbols defined by these definitions are in Node.live_symbols, and the symbols defined by the node (i.e., symbols defined by definitions in \(\operatorname{gen}_d\)) are in Node.defines_symbols.

The advantage of this approach is that it avoids the need to introduce a layer for definitions and dependencies. A downside is that this focus on symbols instead of definitions precludes, in particular, the ability to take data space into account, which makes it less useful for arrays.

Note

The context manager operates only on the module or routine itself (i.e., its spec and, if applicable, body), not on any contained subroutines or functions.

Parameters:

module_or_routine (Module or Subroutine) – The object for which the IR is to be annotated.

read_after_write_vars(ir, inspection_node)

Find variables that are read after being written in the given IR.

This requires prior application of dataflow_analysis_attached() to the corresponding Module or Subroutine.

The result is the set of variables with a data dependency across the inspection_node.

See the remarks about implementation and limitations in the description of dataflow_analysis_attached(). In particular, this does not take into account data space and iteration space for arrays.

Parameters:
  • ir (Node) – The root of the control flow (sub-)tree to inspect.

  • inspection_node (Node) – Only variables with a write before and a read at or after this node are considered.

Returns:

The list of read-after-write variables.

Return type:

set of Scalar or Array

loop_carried_dependencies(loop)

Find variables that are potentially loop-carried dependencies.

This requires prior application of dataflow_analysis_attached() to the corresponding Module or Subroutine.

See the remarks about implementation and limitations in the description of dataflow_analysis_attached(). In particular, this does not take into account data space and iteration space for arrays. For cases with a linear mapping from iteration to data space and no overlap, this will falsely report loop-carried dependencies when there are in fact none. However, the risk of false negatives should be low.

Parameters:

loop (Loop) – The loop node to inspect.

Returns:

The list of variables that potentially have a loop-carried dependency.

Return type:

set of Scalar or Array