@@ -58,11 +58,11 @@ as it's been utilised within a target region.
5858# Declare Target as represented in the OpenMP Dialect
5959
6060In the OpenMP Dialect ` declare target ` is not represented by a specific
61- ` operation ` . Instead it's a OpenMP dialect specific ` attribute ` that can be
62- applied to any operation in any dialect. This helps to simplify the
63- utilisation of it, instead of replacing or modifying existing global or
64- function ` operations ` in a dialect it applies to it as extra metadata that
65- the lowering can use in different ways as is necesary .
61+ ` operation ` . Instead, it's an OpenMP dialect specific ` attribute ` that can be
62+ applied to any operation in any dialect, which helps to simplify the
63+ utilisation of it. Rather than replacing or modifying existing global or
64+ function ` operations ` in a dialect, it applies to it as extra metadata that
65+ the lowering can use in different ways as is necessary .
6666
6767The ` attribute ` is composed of multiple fields representing the clauses you
6868would find on the ` declare target ` directive i.e. device type (` nohost ` ,
@@ -92,9 +92,9 @@ declareTargetGlobal.isDeclareTarget();
9292# Declare Target Fortran OpenMP Lowering
9393
9494The initial lowering of ` declare target ` to MLIR for both use-cases is done
95- inside of the usual OpenMP lowering in flang/lib/Lower/OpenMP.cpp. However some
96- direct calls to ` declare target ` related functions from Flang's lowering bridge
97- in flang/lib/Lower/Bridge.cpp are made.
95+ inside of the usual OpenMP lowering in flang/lib/Lower/OpenMP.cpp. However,
96+ some direct calls to ` declare target ` related functions from Flang's
97+ lowering bridge in flang/lib/Lower/Bridge.cpp are made.
9898
9999The marking of operations with the declare target attribute happens in two
100100phases, the second one optional and contingent on the first failing. The
@@ -108,7 +108,7 @@ types (e.g. `host`, `nohost`), then it will swap the device type to `any`.
108108
109109Whenever we invoke ` genFIR ` on an ` OpenMPDeclarativeConstruct ` from the
110110lowering bridge, we are also invoking another function called
111- ` gatherOpenMPDeferredDeclareTargets ` which gathers information relevant to the
111+ ` gatherOpenMPDeferredDeclareTargets ` , which gathers information relevant to the
112112application of the ` declare target ` attribute. This information
113113includes the symbol that it should be applied to, device type clause,
114114and capture clause, and it is stored in a vector that is part of the lowering
@@ -125,7 +125,7 @@ operations that have delayed generation and cannot be proccessed in the
125125first phase. The main notable case this occurs currently is when a
126126Fortran function interface has been marked. This is
127127done via the function
128- ` markOpenMPDeferredDeclareTargetFunctions ` which is called from the lowering
128+ ` markOpenMPDeferredDeclareTargetFunctions ` , which is called from the lowering
129129bridge at the end of the lowering process allowing us to mark those where
130130possible. It iterates over the data previously gathered by
131131` gatherOpenMPDeferredDeclareTargets `
@@ -140,7 +140,7 @@ by the initial semantic analysis.
140140
141141NOTE: ` declare target ` can be applied to implicit ` SAVE ` attributed variables.
142142However, by default Flang does not represent these as ` GlobalOp ` 's, which means
143- we cannot tag and lower them as ` declare target ` normally. Instead similarly
143+ we cannot tag and lower them as ` declare target ` normally. Instead, similarly
144144to the way ` threadprivate ` handles these cases, we raise and initialize the
145145variable as an internal ` GlobalOp ` and apply the attribute. This occurs in the
146146flang/lib/Lower/OpenMP.cpp function ` genDeclareTargetIntGlobal ` .
@@ -154,7 +154,7 @@ of `declare target`:
154154` declare target ` . It does so recursively, i.e. nested calls will also be
155155implicitly marked. It currently will try to mark things as conservatively as
156156possible, e.g. if captured in a ` target ` region it will apply ` nohost ` , unless
157- it encounters a ` host ` ` declare target ` in which case it will apply the any
157+ it encounters a ` host ` ` declare target ` in which case it will apply the ` any `
158158device type. Functions are handled similarly, except we utilise the parent's
159159device type where possible.
160160* ` OMPFunctionFiltering ` - This is executed after the ` OMPMarkDeclareTarget `
@@ -169,40 +169,38 @@ Host functions with `target` regions are marked with a `declare target host`
169169attribute so they will be removed after outlining the target regions contained
170170inside.
171171
172- While this infrastructure is generally applicable to more than just Flang, we
173- currently only utilise them in the Flang frontend and they are part of the
174- Flang codebase, rather than the OpenMP dialect codebase.
172+ While this infrastructure could be generally applicable to more than just Flang,
173+ it is only utilised in the Flang frontend, so it resides there rather than in
174+ the OpenMP dialect codebase.
175175
176176# Declare Target OpenMP Dialect To LLVM-IR Lowering
177177
178- The OpenMP dialect lowering of ` declare target ` is a little unique currently,
179- as it's not an ` operation ` and is an ` attribute ` we process it utilising the
180- LLVM Target lowerings ` amendOperation ` , which occurs immediately after an
181- operation has been lowered to LLVM-IR. As it can be applicable to multiple
182- different operations, we must specialise this function for each operation
183- type that we may encounter. Currently this is ` GlobalOp ` 's and ` FuncOp ` 's.
184-
185- In the case where we encounter a ` FuncOp ` our processing is fairly simple,
186- if we're processing the device code, we will finish up our removal of ` host `
187- marked functions, anything we could not remove previously we now remove, e.g.
188- if it had a ` target ` directive in it (which we need to keep a hold of to
189- this point, to actually outline the ` target ` kernel for device). This hopefully
190- leaves us with only ` any ` , ` device ` or undeterminable functions left in the
191- module to lower further, reducing the possibiltiy of device incompatible
192- code being in the module.
193-
194- For ` GlobalOp ` 's, the processing is a little more complex, we currently
195- leverage two OMPIRBuilder functions which we have inherited from Clang and
196- moved to the ` OMPIRBuilder ` to share across the two compiler
197- frontends` registerTargetGlobalVariable ` and ` getAddrOfDeclareTargetVar ` .
198- These two functions are actually recursive and invoke each other depending
199- on the clauses and options provided to the ` OMPIRBuilder ` (in particular
200- unified shared memory), but the main functionality they provide is the
201- generation of a new global pointer for device with a "ref_ " prefix, and
202- enqueuing metadata generation by the ` OMPIRBuilder ` at the end of the
203- module, for both host and device that links the newly generated device
204- global pointer and the host pointer together across the two modules
205- (and resulting binaries).
178+ The OpenMP dialect lowering of ` declare target ` is done through the
179+ ` amendOperation ` flow, as it's not an ` operation ` but rather an
180+ ` attribute ` . This is triggered immediately after the corresponding
181+ operation has been lowered to LLVM-IR. As it is applicable to
182+ different types of operations, we must specialise this function for
183+ each operation type that we may encounter. Currently, this is
184+ ` GlobalOp ` 's and ` FuncOp ` 's.
185+
186+ ` FuncOp ` processing is fairly simple. When compiling for the device,
187+ ` host ` marked functions are removed, including those that could not
188+ be removed earlier due to having ` target ` directives within. This
189+ leaves ` any ` , ` device ` or indeterminable functions left in the
190+ module to lower further. When compiling for the host, no filtering is
191+ done because ` nohost ` functions must be available as a fallback
192+ implementation.
193+
194+ For ` GlobalOp ` 's, the processing is a little more complex. We
195+ currently leverage the ` registerTargetGlobalVariable ` and
196+ ` getAddrOfDeclareTargetVar ` ` OMPIRBuilder ` functions shared with Clang.
197+ These two functions invoke each other depending on the clauses and options
198+ provided to the ` OMPIRBuilder ` (in particular, unified shared memory). Their
199+ main purposes are the generation of a new global device pointer with a
200+ "ref_ " prefix on the device and enqueuing metadata generation by the
201+ ` OMPIRBuilder ` to be produced at module finalization time. This is done
202+ for both host and device and it links the newly generated device global
203+ pointer and the host pointer together across the two modules.
206204
207205Similarly to other metadata (e.g. for ` TargetOp ` ) that is shared across
208206both host and device modules, processing of ` GlobalOp ` 's in the device
@@ -224,7 +222,7 @@ global symbol, with our new global ref pointer's symbol. Currently we do not
224222remove or delete the old symbol, this is due to the fact that the same symbol
225223can be utilised across multiple target regions, if we remove it, we risk
226224breaking lowerings of target regions that will be processed at a later time.
227- To appropriately delete these no longer necesary symbols we would need a
225+ To appropriately delete these no longer necessary symbols we would need a
228226deferred removal process at the end of the module, which is currently not in
229227place. It may be possible to store this information in the OMPIRBuilder and
230228then perform this cleanup process on finalization, but this is open for
0 commit comments