amgxsolver.hpp
// Copyright (c) 2010-2020, Lawrence Livermore National Security, LLC. Produced
// at the Lawrence Livermore National Laboratory. All Rights reserved. See files
// LICENSE and NOTICE for details. LLNL-CODE-806117.
//
// This file is part of the MFEM library. For more information and source code
// availability visit https://mfem.org.
//
// MFEM is free software; you can redistribute it and/or modify it under the
// terms of the BSD-3 license. We welcome feedback and contributions, see file
// CONTRIBUTING.md for details.

#ifndef MFEM_AMGX_SOLVER
#define MFEM_AMGX_SOLVER

#include "../config/config.hpp"

#ifdef MFEM_USE_AMGX

#include <amgx_c.h>
#ifdef MFEM_USE_MPI
#include <mpi.h>
#include "hypre.hpp"
#else
#include "operator.hpp"
#include "sparsemat.hpp"
#endif

namespace mfem
{

/**
 MFEM wrapper for Nvidia's multigrid library, AmgX (github.com/NVIDIA/AMGX)

 AmgX requires building MFEM with both CUDA and AmgX enabled. For distributed
 memory parallelism, MPI and Hypre (version 2.16.0+) are also required.
 Although CUDA is required for building, the AmgX solver is compatible with
 an MFEM CPU device configuration.

 The AmgXSolver class is designed to work as a solver or preconditioner for
 existing MFEM solvers. The AmgX solver class may be configured in one of
 three ways:

 Serial - Takes a SparseMatrix, solves on a single GPU, and assumes no MPI
 communication.

 Exclusive GPU - Takes a HypreParMatrix and assumes each MPI rank is paired
 with an Nvidia GPU.

 MPI Teams - Takes a HypreParMatrix and enables flexibility between the
 number of MPI ranks and GPUs. Specifically, MPI ranks are grouped with GPUs
 and a matrix consolidation step is taken so that the MPI root of each team
 performs the necessary AmgX library calls. The solution is then broadcast to
 the appropriate ranks. This is particularly useful when configuring MFEM's
 device as CPU. This work is based on the AmgXWrapper of Chuang and Barba;
 routines were adapted and modified for setting up MPI communicators.

 Examples 1/1p in the examples/amgx directory demonstrate configuring the
 wrapper as a solver and preconditioner, as well as configuring and running
 with the exclusive GPU or MPI teams modes. A short usage sketch follows this
 comment.

 This work is partially based on:

 Pi-Yueh Chuang and Lorena A. Barba (2017).
 AmgXWrapper: An interface between PETSc and the NVIDIA AmgX library.
 J. Open Source Software, 2(16):280, doi:10.21105/joss.00280

 See https://github.com/barbagroup/AmgXWrapper.
*/
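/* A minimal usage sketch of the wrapper as a preconditioner for an existing
   MFEM solver (illustrative only; assumes the parallel matrix A and vectors
   b, x come from the surrounding application, and that MFEM was built with
   MPI, CUDA, and AmgX enabled):

      AmgXSolver amgx(MPI_COMM_WORLD, AmgXSolver::PRECONDITIONER, false);
      amgx.SetOperator(A); // A is a HypreParMatrix

      CGSolver cg(MPI_COMM_WORLD);
      cg.SetOperator(A);
      cg.SetPreconditioner(amgx);
      cg.Mult(b, x);
*/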
class AmgXSolver : public Solver
{
public:

   /// Flags to configure AmgXSolver as a solver or preconditioner
   enum AMGX_MODE {SOLVER, PRECONDITIONER};

   /**
    Flags to determine whether user solver settings are defined internally in
    the source code or will be read through an external JSON file.
   */
   enum CONFIG_SRC {INTERNAL, EXTERNAL, UNDEFINED};

   AmgXSolver() = default;

   /**
    Configures AmgX with a default configuration based on the AmgX mode and
    verbosity. Assumes no MPI parallelism.
   */
   AmgXSolver(const AMGX_MODE amgxMode_, const bool verbose);

   /**
    Once the solver configuration has been established through either the
    ReadParameters method or the constructor, InitSerial will initialize the
    library. If configured via the constructor, the constructor will make this
    call.
   */
   void InitSerial();

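   // A minimal serial-setup sketch (illustrative only; "amgx.json" is a
   // hypothetical user-supplied AmgX configuration file):
   //
   //    AmgXSolver amgx;
   //    amgx.ReadParameters("amgx.json", AmgXSolver::EXTERNAL);
   //    amgx.InitSerial();
   //    amgx.SetOperator(A); // A is an mfem::SparseMatrix
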
#ifdef MFEM_USE_MPI

   /**
    Configures AmgX with a default configuration based on the AmgX mode and
    verbosity. Pairs each MPI rank with one GPU.
   */
   AmgXSolver(const MPI_Comm &comm, const AMGX_MODE amgxMode_,
              const bool verbose);

   /**
    Configures AmgX with a default configuration based on the AmgX mode and
    verbosity. Creates MPI teams around GPUs to support more MPI ranks than
    GPUs. Consolidates linear solver data to avoid multiple ranks sharing
    GPUs. Requires specifying the number of devices on each compute node.
   */
   AmgXSolver(const MPI_Comm &comm, const int nDevs,
              const AMGX_MODE amgxMode_, const bool verbose);

   /**
    Once the solver configuration has been established, either through the
    constructor or the ReadParameters method, InitExclusiveGPU will initialize
    the library. If configured via the constructor, the constructor will make
    this call.
   */
   void InitExclusiveGPU(const MPI_Comm &comm);

   /**
    Once the solver configuration has been established, either through the
    constructor or the ReadParameters method, InitMPITeams will initialize the
    library and create MPI teams based on the number of devices on each node
    (nDevs). If configured via the constructor, the constructor will make this
    call.
   */
   void InitMPITeams(const MPI_Comm &comm,
                     const int nDevs);
#endif
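
   // A minimal MPI-teams sketch (illustrative only; assumes "amgx.json" is a
   // user-supplied configuration file and each compute node has two GPUs):
   //
   //    AmgXSolver amgx;
   //    amgx.ReadParameters("amgx.json", AmgXSolver::EXTERNAL);
   //    amgx.InitMPITeams(MPI_COMM_WORLD, 2);
   //    amgx.SetOperator(A); // A is a HypreParMatrix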

   /**
    Sets the Operator for the AmgX library, either an MFEM SparseMatrix or a
    HypreParMatrix.
   */
   virtual void SetOperator(const Operator &op);

   /**
    Replaces the matrix coefficients in the AmgX solver.
   */
   void UpdateOperator(const Operator &op);

   virtual void Mult(const Vector& b, Vector& x) const;

   int GetNumIterations();

   void ReadParameters(const std::string config, CONFIG_SRC source);

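   // A hypothetical external JSON configuration for ReadParameters (the full
   // parameter schema is documented in the AmgX repository; the values below
   // are illustrative only):
   //
   //    {
   //       "config_version": 2,
   //       "solver": {
   //          "solver": "AMG",
   //          "max_iters": 100,
   //          "tolerance": 1e-6,
   //          "monitor_residual": 1
   //       }
   //    }
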
   /**
    @param [in] amgxMode_ AmgXSolver::PRECONDITIONER or AmgXSolver::SOLVER.

    @param [in] verbose Specifies the level of verbosity (true or false).

    When configured as a preconditioner, the default configuration applies two
    iterations of an AMG V-cycle with AmgX's default smoother (block Jacobi).

    As a solver, the preconditioned conjugate gradient method is used, with an
    AMG V-cycle and a block Jacobi smoother as the preconditioner.
   */
   void DefaultParameters(const AMGX_MODE amgxMode_, const bool verbose);

   ~AmgXSolver();

   void Finalize();

private:

   AMGX_MODE amgxMode;

   std::string amgx_config = "";

   CONFIG_SRC configSrc = UNDEFINED;

#ifdef MFEM_USE_MPI
   // Consolidates matrix diagonal and off-diagonal data and uploads the
   // matrix to AmgX.
   void SetMatrixMPIGPUExclusive(const HypreParMatrix &A,
                                 const Array<double> &loc_A,
                                 const Array<int> &loc_I,
                                 const Array<int64_t> &loc_J,
                                 const bool update_mat = false);

   // Consolidates matrix diagonal and off-diagonal data for all ranks in an
   // MPI team. The root rank of each MPI team holds the consolidated data and
   // sets the matrix.
   void SetMatrixMPITeams(const HypreParMatrix &A, const Array<double> &loc_A,
                          const Array<int> &loc_I, const Array<int64_t> &loc_J,
                          const bool update_mat = false);

   // The following methods consolidate array data to the root node in an MPI
   // team.
   void GatherArray(const Array<double> &inArr, Array<double> &outArr,
                    const int mpiTeamSz, const MPI_Comm &mpiTeam) const;

   void GatherArray(const Vector &inArr, Vector &outArr,
                    const int mpiTeamSz, const MPI_Comm &mpiTeam) const;

   void GatherArray(const Array<int> &inArr, Array<int> &outArr,
                    const int mpiTeamSz, const MPI_Comm &mpiTeam) const;

   void GatherArray(const Array<int64_t> &inArr, Array<int64_t> &outArr,
                    const int mpiTeamSz, const MPI_Comm &mpiTeam) const;

   // The following methods consolidate array data to the root node in an MPI
   // team and also store array partitions and displacements.
   void GatherArray(const Vector &inArr, Vector &outArr,
                    const int mpiTeamSz, const MPI_Comm &mpiTeamComm,
                    Array<int> &Apart, Array<int> &Adisp) const;

   void ScatterArray(const Vector &inArr, Vector &outArr,
                     const int mpiTeamSz, const MPI_Comm &mpi_comm,
                     Array<int> &Apart, Array<int> &Adisp) const;

   void SetMatrix(const HypreParMatrix &A, const bool update_mat = false);
#endif

   void SetMatrix(const SparseMatrix &A, const bool update_mat = false);

   static int count;

   // Indicates whether this instance has been initialized.
   bool isInitialized = false;

#ifdef MFEM_USE_MPI
   // The name of the node that this MPI process belongs to.
   std::string nodeName;

   // Number of local GPU devices used by AmgX.
   int nDevs;

   // The ID of the corresponding GPU device used by this MPI process.
   int devID;

   // A flag indicating whether this process will invoke AmgX.
   int gpuProc = MPI_UNDEFINED;

   // Communicator for all MPI ranks.
   MPI_Comm globalCpuWorld = MPI_COMM_NULL;

   // Communicator for ranks in the same node.
   MPI_Comm localCpuWorld;

   // Communicator for ranks sharing a device.
   MPI_Comm devWorld;

   // Communicator for the MPI processes that will launch AmgX (the roots of
   // devWorld).
   MPI_Comm gpuWorld;

   // Global number of MPI procs and this process's rank id.
   int globalSize;

   int myGlobalRank;

   // Total number of MPI procs in a node and this process's rank id.
   int localSize;

   int myLocalRank;

   // Total number of MPI ranks sharing a device and this process's rank id.
   int devWorldSize;

   int myDevWorldRank;

   // Total number of MPI procs calling AmgX and this process's rank id.
   int gpuWorldSize;

   int myGpuWorldRank;
#endif

   // A parameter used by AmgX.
   int ring;

   // Sets the AmgX precision (currently only double is supported).
   AMGX_Mode precision_mode = AMGX_mode_dDDI;

   // AmgX config object.
   AMGX_config_handle cfg = nullptr;

   // AmgX matrix object.
   AMGX_matrix_handle AmgXA = nullptr;

   // AmgX vector object representing the unknowns.
   AMGX_vector_handle AmgXP = nullptr;

   // AmgX vector object representing the RHS.
   AMGX_vector_handle AmgXRHS = nullptr;

   // AmgX solver object.
   AMGX_solver_handle solver = nullptr;

   // AmgX resource object.
   static AMGX_resources_handle rsrc;

   // Sets the ID of the corresponding GPU used by this process.
   void SetDeviceIDs(const int nDevs);

#ifdef MFEM_USE_MPI
   // Initializes all MPI communicators.
   void InitMPIcomms(const MPI_Comm &comm, const int nDevs);
#endif

   void InitAmgX();

   // Row partition for the HypreParMatrix.
   int64_t mat_local_rows;

   std::string mpi_gpu_mode;
};

} // namespace mfem

#endif // MFEM_USE_AMGX
#endif // MFEM_AMGX_SOLVER