Skip to content

Commit 6a8d163

Browse files
committed
Address reviewer comments
1 parent 04be830 commit 6a8d163

File tree

3 files changed

+47
-49
lines changed

3 files changed

+47
-49
lines changed

flang/docs/OpenMP-declare-target.md

Lines changed: 42 additions & 44 deletions
Original file line numberDiff line numberDiff line change
@@ -58,11 +58,11 @@ as it's been utilised within a target region.
5858
# Declare Target as represented in the OpenMP Dialect
5959

6060
In the OpenMP Dialect `declare target` is not represented by a specific
61-
`operation`. Instead it's a OpenMP dialect specific `attribute` that can be
62-
applied to any operation in any dialect. This helps to simplify the
63-
utilisation of it, instead of replacing or modifying existing global or
64-
function `operations` in a dialect it applies to it as extra metadata that
65-
the lowering can use in different ways as is necesary.
61+
`operation`. Instead, it's an OpenMP dialect specific `attribute` that can be
62+
applied to any operation in any dialect, which helps to simplify the
63+
utilisation of it. Rather than replacing or modifying existing global or
64+
function `operations` in a dialect, it applies to it as extra metadata that
65+
the lowering can use in different ways as is necessary.
6666

6767
The `attribute` is composed of multiple fields representing the clauses you
6868
would find on the `declare target` directive i.e. device type (`nohost`,
@@ -92,9 +92,9 @@ declareTargetGlobal.isDeclareTarget();
9292
# Declare Target Fortran OpenMP Lowering
9393

9494
The initial lowering of `declare target` to MLIR for both use-cases is done
95-
inside of the usual OpenMP lowering in flang/lib/Lower/OpenMP.cpp. However some
96-
direct calls to `declare target` related functions from Flang's lowering bridge
97-
in flang/lib/Lower/Bridge.cpp are made.
95+
inside of the usual OpenMP lowering in flang/lib/Lower/OpenMP.cpp. However,
96+
some direct calls to `declare target` related functions from Flang's
97+
lowering bridge in flang/lib/Lower/Bridge.cpp are made.
9898

9999
The marking of operations with the declare target attribute happens in two
100100
phases, the second one optional and contingent on the first failing. The
@@ -108,7 +108,7 @@ types (e.g. `host`, `nohost`), then it will swap the device type to `any`.
108108

109109
Whenever we invoke `genFIR` on an `OpenMPDeclarativeConstruct` from the
110110
lowering bridge, we are also invoking another function called
111-
`gatherOpenMPDeferredDeclareTargets` which gathers information relevant to the
111+
`gatherOpenMPDeferredDeclareTargets`, which gathers information relevant to the
112112
application of the `declare target` attribute. This information
113113
includes the symbol that it should be applied to, device type clause,
114114
and capture clause, and it is stored in a vector that is part of the lowering
@@ -125,7 +125,7 @@ operations that have delayed generation and cannot be proccessed in the
125125
first phase. The main notable case this occurs currently is when a
126126
Fortran function interface has been marked. This is
127127
done via the function
128-
`markOpenMPDeferredDeclareTargetFunctions` which is called from the lowering
128+
`markOpenMPDeferredDeclareTargetFunctions`, which is called from the lowering
129129
bridge at the end of the lowering process allowing us to mark those where
130130
possible. It iterates over the data previously gathered by
131131
`gatherOpenMPDeferredDeclareTargets`
@@ -140,7 +140,7 @@ by the initial semantic analysis.
140140

141141
NOTE: `declare target` can be applied to implicit `SAVE` attributed variables.
142142
However, by default Flang does not represent these as `GlobalOp`'s, which means
143-
we cannot tag and lower them as `declare target` normally. Instead similarly
143+
we cannot tag and lower them as `declare target` normally. Instead, similarly
144144
to the way `threadprivate` handles these cases, we raise and initialize the
145145
variable as an internal `GlobalOp` and apply the attribute. This occurs in the
146146
flang/lib/Lower/OpenMP.cpp function `genDeclareTargetIntGlobal`.
@@ -154,7 +154,7 @@ of `declare target`:
154154
`declare target`. It does so recursively, i.e. nested calls will also be
155155
implicitly marked. It currently will try to mark things as conservatively as
156156
possible, e.g. if captured in a `target` region it will apply `nohost`, unless
157-
it encounters a `host` `declare target` in which case it will apply the any
157+
it encounters a `host` `declare target` in which case it will apply the `any`
158158
device type. Functions are handled similarly, except we utilise the parent's
159159
device type where possible.
160160
* `OMPFunctionFiltering` - This is executed after the `OMPMarkDeclareTarget`
@@ -169,40 +169,38 @@ Host functions with `target` regions are marked with a `declare target host`
169169
attribute so they will be removed after outlining the target regions contained
170170
inside.
171171

172-
While this infrastructure is generally applicable to more than just Flang, we
173-
currently only utilise them in the Flang frontend and they are part of the
174-
Flang codebase, rather than the OpenMP dialect codebase.
172+
While this infrastructure could be generally applicable to more than just Flang,
173+
it is only utilised in the Flang frontend, so it resides there rather than in
174+
the OpenMP dialect codebase.
175175

176176
# Declare Target OpenMP Dialect To LLVM-IR Lowering
177177

178-
The OpenMP dialect lowering of `declare target` is a little unique currently,
179-
as it's not an `operation` and is an `attribute` we process it utilising the
180-
LLVM Target lowerings `amendOperation`, which occurs immediately after an
181-
operation has been lowered to LLVM-IR. As it can be applicable to multiple
182-
different operations, we must specialise this function for each operation
183-
type that we may encounter. Currently this is `GlobalOp`'s and `FuncOp`'s.
184-
185-
In the case where we encounter a `FuncOp` our processing is fairly simple,
186-
if we're processing the device code, we will finish up our removal of `host`
187-
marked functions, anything we could not remove previously we now remove, e.g.
188-
if it had a `target` directive in it (which we need to keep a hold of to
189-
this point, to actually outline the `target` kernel for device). This hopefully
190-
leaves us with only `any`, `device` or undeterminable functions left in the
191-
module to lower further, reducing the possibiltiy of device incompatible
192-
code being in the module.
193-
194-
For `GlobalOp`'s, the processing is a little more complex, we currently
195-
leverage two OMPIRBuilder functions which we have inherited from Clang and
196-
moved to the `OMPIRBuilder` to share across the two compiler
197-
frontends`registerTargetGlobalVariable` and `getAddrOfDeclareTargetVar`.
198-
These two functions are actually recursive and invoke each other depending
199-
on the clauses and options provided to the `OMPIRBuilder` (in particular
200-
unified shared memory), but the main functionality they provide is the
201-
generation of a new global pointer for device with a "ref_" prefix, and
202-
enqueuing metadata generation by the `OMPIRBuilder` at the end of the
203-
module, for both host and device that links the newly generated device
204-
global pointer and the host pointer together across the two modules
205-
(and resulting binaries).
178+
The OpenMP dialect lowering of `declare target` is done through the
179+
`amendOperation` flow, as it's not an `operation` but rather an
180+
`attribute`. This is triggered immediately after the corresponding
181+
operation has been lowered to LLVM-IR. As it is applicable to
182+
different types of operations, we must specialise this function for
183+
each operation type that we may encounter. Currently, this is
184+
`GlobalOp`'s and `FuncOp`'s.
185+
186+
`FuncOp` processing is fairly simple. When compiling for the device,
187+
`host` marked functions are removed, including those that could not
188+
be removed earlier due to having `target` directives within. This
189+
leaves `any`, `device` or indeterminable functions left in the
190+
module to lower further. When compiling for the host, no filtering is
191+
done because `nohost` functions must be available as a fallback
192+
implementation.
193+
194+
For `GlobalOp`'s, the processing is a little more complex. We
195+
currently leverage the `registerTargetGlobalVariable` and
196+
`getAddrOfDeclareTargetVar` `OMPIRBuilder` functions shared with Clang.
197+
These two functions invoke each other depending on the clauses and options
198+
provided to the `OMPIRBuilder` (in particular, unified shared memory). Their
199+
main purposes are the generation of a new global device pointer with a
200+
"ref_" prefix on the device and enqueuing metadata generation by the
201+
`OMPIRBuilder` to be produced at module finalization time. This is done
202+
for both host and device and it links the newly generated device global
203+
pointer and the host pointer together across the two modules.
206204

207205
Similarly to other metadata (e.g. for `TargetOp`) that is shared across
208206
both host and device modules, processing of `GlobalOp`'s in the device
@@ -224,7 +222,7 @@ global symbol, with our new global ref pointer's symbol. Currently we do not
224222
remove or delete the old symbol, this is due to the fact that the same symbol
225223
can be utilised across multiple target regions, if we remove it, we risk
226224
breaking lowerings of target regions that will be processed at a later time.
227-
To appropriately delete these no longer necesary symbols we would need a
225+
To appropriately delete these no longer necessary symbols we would need a
228226
deferred removal process at the end of the module, which is currently not in
229227
place. It may be possible to store this information in the OMPIRBuilder and
230228
then perform this cleanup process on finalization, but this is open for

flang/lib/Lower/Bridge.cpp

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -5092,7 +5092,7 @@ class FirConverter : public Fortran::lower::AbstractConverter {
50925092
/// processed at the time of lowering the declare target construct, such
50935093
/// as certain cases where interfaces are declared but not defined within
50945094
/// a module.
5095-
llvm::SmallVector<Fortran::lower::OMPDeferredDeclareTargetInfo, 2>
5095+
llvm::SmallVector<Fortran::lower::OMPDeferredDeclareTargetInfo>
50965096
ompDeferredDeclareTarget;
50975097

50985098
const Fortran::lower::ExprToValueMap *exprValueOverrides{nullptr};

flang/lib/Lower/OpenMP/OpenMP.cpp

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -1175,7 +1175,7 @@ static void collectDeferredDeclareTargets(
11751175
const Fortran::parser::OpenMPDeclareTargetConstruct &declareTargetConstruct,
11761176
llvm::SmallVectorImpl<Fortran::lower::OMPDeferredDeclareTargetInfo>
11771177
&deferredDeclareTarget) {
1178-
llvm::SmallVector<DeclareTargetCapturePair, 0> symbolAndClause;
1178+
llvm::SmallVector<DeclareTargetCapturePair> symbolAndClause;
11791179
mlir::omp::DeclareTargetDeviceType devType = getDeclareTargetInfo(
11801180
converter, semaCtx, eval, declareTargetConstruct, symbolAndClause);
11811181
// Return the device type only if at least one of the targets for the
@@ -1200,7 +1200,7 @@ getDeclareTargetFunctionDevice(
12001200
Fortran::lower::pft::Evaluation &eval,
12011201
const Fortran::parser::OpenMPDeclareTargetConstruct
12021202
&declareTargetConstruct) {
1203-
llvm::SmallVector<DeclareTargetCapturePair, 0> symbolAndClause;
1203+
llvm::SmallVector<DeclareTargetCapturePair> symbolAndClause;
12041204
mlir::omp::DeclareTargetDeviceType deviceType = getDeclareTargetInfo(
12051205
converter, semaCtx, eval, declareTargetConstruct, symbolAndClause);
12061206

@@ -1989,7 +1989,7 @@ static void genOMP(Fortran::lower::AbstractConverter &converter,
19891989
Fortran::lower::pft::Evaluation &eval,
19901990
const Fortran::parser::OpenMPDeclareTargetConstruct
19911991
&declareTargetConstruct) {
1992-
llvm::SmallVector<DeclareTargetCapturePair, 0> symbolAndClause;
1992+
llvm::SmallVector<DeclareTargetCapturePair> symbolAndClause;
19931993
mlir::ModuleOp mod = converter.getFirOpBuilder().getModule();
19941994
mlir::omp::DeclareTargetDeviceType deviceType = getDeclareTargetInfo(
19951995
converter, semaCtx, eval, declareTargetConstruct, symbolAndClause);
@@ -2514,7 +2514,7 @@ bool Fortran::lower::markOpenMPDeferredDeclareTargetFunctions(
25142514
llvm::SmallVectorImpl<OMPDeferredDeclareTargetInfo> &deferredDeclareTargets,
25152515
AbstractConverter &converter) {
25162516
bool deviceCodeFound = false;
2517-
auto modOp = llvm::dyn_cast<mlir::ModuleOp>(mod);
2517+
auto modOp = llvm::cast<mlir::ModuleOp>(mod);
25182518
for (auto declTar : deferredDeclareTargets) {
25192519
mlir::Operation *op = modOp.lookupSymbol(converter.mangleName(declTar.sym));
25202520

0 commit comments

Comments
 (0)