amgxsolver.hpp
// Copyright (c) 2010-2022, Lawrence Livermore National Security, LLC. Produced
// at the Lawrence Livermore National Laboratory. All Rights reserved. See files
// LICENSE and NOTICE for details. LLNL-CODE-806117.
//
// This file is part of the MFEM library. For more information and source code
// availability visit https://mfem.org.
//
// MFEM is free software; you can redistribute it and/or modify it under the
// terms of the BSD-3 license. We welcome feedback and contributions, see file
// CONTRIBUTING.md for details.

#ifndef MFEM_AMGX_SOLVER
#define MFEM_AMGX_SOLVER

#include "../config/config.hpp"

#ifdef MFEM_USE_AMGX

#include <amgx_c.h>
#ifdef MFEM_USE_MPI
#include <mpi.h>
#include "hypre.hpp"
#else
#include "operator.hpp"
#include "sparsemat.hpp"
#endif

namespace mfem
{

/**
 MFEM wrapper for Nvidia's multigrid library, AmgX (github.com/NVIDIA/AMGX)

 AmgX requires building MFEM with CUDA and AmgX enabled. For distributed
 memory parallelism, MPI and Hypre (version 16.0+) are also required. Although
 CUDA is required for building, the AmgX solver is compatible with an MFEM CPU
 device configuration.

 The AmgXSolver class is designed to work as a solver or preconditioner for
 existing MFEM solvers. The AmgX solver class may be configured in one of
 three ways:

 Serial - Takes a SparseMatrix, solves on a single GPU, and assumes no MPI
 communication.

 Exclusive GPU - Takes a HypreParMatrix and assumes each MPI rank is paired
 with an Nvidia GPU.

 MPI Teams - Takes a HypreParMatrix and enables flexibility between the number
 of MPI ranks and GPUs. Specifically, MPI ranks are grouped with GPUs and a
 matrix consolidation step is taken so that the MPI root of each team performs
 the necessary AmgX library calls. The solution is then broadcast to the
 appropriate ranks. This is particularly useful when configuring MFEM's device
 as CPU. This work is based on the AmgXWrapper of Chuang and Barba. Routines
 were adopted and modified for setting up MPI communicators.

 Examples 1/1p in the examples/amgx directory demonstrate configuring the
 wrapper as a solver and preconditioner, as well as configuring and running
 with exclusive GPU or MPI teams modes; a minimal sketch is given below.
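
 A minimal sketch of use as a preconditioner (illustrative only; assumes an
 existing HypreParMatrix A and a CGSolver cg, which are not defined here):

 \code
 AmgXSolver amgx(MPI_COMM_WORLD, AmgXSolver::PRECONDITIONER, false);
 amgx.SetOperator(A); // A is a HypreParMatrix (or a SparseMatrix in serial)
 cg.SetPreconditioner(amgx);
 \endcode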

 This work is partially based on:

 Pi-Yueh Chuang and Lorena A. Barba (2017).
 AmgXWrapper: An interface between PETSc and the NVIDIA AmgX library.
 J. Open Source Software, 2(16):280, doi:10.21105/joss.00280

 See https://github.com/barbagroup/AmgXWrapper.
*/
class AmgXSolver : public Solver
{
public:

   /// Flags to configure AmgXSolver as a solver or preconditioner
   enum AMGX_MODE { SOLVER, PRECONDITIONER };

   /// Flag to check for convergence
   bool ConvergenceCheck;

   /**
    Flags to determine whether user solver settings are defined internally in
    the source code or will be read through an external JSON file.
   */
   enum CONFIG_SRC { INTERNAL, EXTERNAL, UNDEFINED };

   AmgXSolver();

   /**
    Configures AmgX with a default configuration based on the AmgX mode and
    verbosity. Assumes no MPI parallelism.
   */
   AmgXSolver(const AMGX_MODE amgxMode_, const bool verbose);

   /**
    Once the solver configuration has been established through either the
    ReadParameters method or the constructor, InitSerial will initialize the
    library. If configuring with the constructor, the constructor will make
    this call.
   */
   void InitSerial();
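
   // A minimal serial sketch (illustrative; "amgx.json" is a hypothetical
   // configuration file, and A, b, x are an existing SparseMatrix and
   // Vectors):
   //
   //   AmgXSolver amgx;
   //   amgx.ReadParameters("amgx.json", AmgXSolver::EXTERNAL);
   //   amgx.InitSerial();
   //   amgx.SetOperator(A);
   //   amgx.Mult(b, x);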
100 
101 #ifdef MFEM_USE_MPI
102 
103  /**
104  Configures AmgX with a default configuration based on the AmgX mode, and
105  verbosity. Pairs each MPI rank with one GPU.
106  */
107  AmgXSolver(const MPI_Comm &comm, const AMGX_MODE amgxMode_, const bool verbose);
108 
109  /**
110  Configures AmgX with a default configuration based on the AmgX mode, and
111  verbosity. Creates MPI teams around GPUs to support MPI ranks >
112  GPUs. Consolidates linear solver data to avoid multiple ranks sharing
113  GPUs. Requires specifying number of devices in each compute node.
114  */
115  AmgXSolver(const MPI_Comm &comm, const int nDevs,
116  const AMGX_MODE amgx_Mode_, const bool verbose);
117 
118  /**
119  Once the solver configuration has been established, either through the
120  constructor or the ReadParameters method, InitSerial will initalize the
121  library. If configuring with constructor, the constructor will make this
122  call.
123  */
124  void InitExclusiveGPU(const MPI_Comm &comm);
125 
126  /**
127  Once the solver configuration has been established, either through the
128  ReadParameters method, InitMPITeams will initialize the library and create
129  MPI teams based on the number of devices on each node (nDevs). If
130  configuring with constructor, the constructor will make this call.
131  */
132  void InitMPITeams(const MPI_Comm &comm,
133  const int nDevs);
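
   // A sketch of the two parallel setups (illustrative; exactly one of the
   // two Init* calls would be used, and nDevs is the number of GPUs per
   // compute node):
   //
   //   AmgXSolver amgx;
   //   amgx.DefaultParameters(AmgXSolver::PRECONDITIONER, false);
   //   amgx.InitExclusiveGPU(MPI_COMM_WORLD);    // one MPI rank per GPU, or
   //   amgx.InitMPITeams(MPI_COMM_WORLD, nDevs); // ranks grouped per GPU
   //   amgx.SetOperator(A);                      // A is a HypreParMatrix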
#endif

   /**
    Sets the Operator for the AmgX library: either an MFEM SparseMatrix or a
    HypreParMatrix.
   */
   virtual void SetOperator(const Operator &op);

   /**
    Replaces the matrix coefficients in the AmgX solver.
   */
   void UpdateOperator(const Operator &op);
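
   // Sketch: refresh the solver after the values of A change while its
   // sparsity pattern stays fixed (illustrative):
   //
   //   amgx.UpdateOperator(A);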

   virtual void Mult(const Vector& b, Vector& x) const;

   int GetNumIterations();

   void ReadParameters(const std::string config, CONFIG_SRC source);
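
   // ReadParameters accepts either a path to a JSON file (EXTERNAL) or a
   // configuration string (INTERNAL). A sketch with an illustrative, not
   // validated, AmgX JSON string:
   //
   //   amgx.ReadParameters(
   //      R"({"config_version": 2, "solver": {"solver": "AMG"}})",
   //      AmgXSolver::INTERNAL);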

   /**
    @param [in] amgxMode_ AmgXSolver::PRECONDITIONER,
                          AmgXSolver::SOLVER.

    @param [in] verbose true, false. Specifies the level
                        of verbosity.

    When configured as a preconditioner, the default configuration is to apply
    two iterations of an AMG V-cycle with AmgX's default smoother (block
    Jacobi).

    As a solver, the preconditioned conjugate gradient method is used, with an
    AMG V-cycle and a block Jacobi smoother as the preconditioner.
   */
   void DefaultParameters(const AMGX_MODE amgxMode_, const bool verbose);

   /// Add a check for convergence after applying Mult.
   void SetConvergenceCheck(bool setConvergenceCheck_=true);

   ~AmgXSolver();

   void Finalize();

private:

   AMGX_MODE amgxMode;

   std::string amgx_config = "";

   CONFIG_SRC configSrc = UNDEFINED;

#ifdef MFEM_USE_MPI
   // Consolidates matrix diagonal and off-diagonal data and uploads the
   // matrix to AmgX.
   void SetMatrixMPIGPUExclusive(const HypreParMatrix &A,
                                 const Array<double> &loc_A,
                                 const Array<int> &loc_I,
                                 const Array<int64_t> &loc_J,
                                 const bool update_mat = false);

   // Consolidates matrix diagonal and off-diagonal data for all ranks in an
   // MPI team. The root rank of each MPI team holds the consolidated data and
   // sets the matrix.
   void SetMatrixMPITeams(const HypreParMatrix &A, const Array<double> &loc_A,
                          const Array<int> &loc_I,
                          const Array<int64_t> &loc_J,
                          const bool update_mat = false);

   // The following methods consolidate array data to the root node in an MPI
   // team.
   void GatherArray(const Array<double> &inArr, Array<double> &outArr,
                    const int mpiTeamSz, const MPI_Comm &mpiTeam) const;

   void GatherArray(const Vector &inArr, Vector &outArr,
                    const int mpiTeamSz, const MPI_Comm &mpiTeam) const;

   void GatherArray(const Array<int> &inArr, Array<int> &outArr,
                    const int mpiTeamSz, const MPI_Comm &mpiTeam) const;

   void GatherArray(const Array<int64_t> &inArr, Array<int64_t> &outArr,
                    const int mpiTeamSz, const MPI_Comm &mpiTeam) const;

   // The following methods consolidate array data to the root node in an MPI
   // team as well as store array partitions and displacements.
   void GatherArray(const Vector &inArr, Vector &outArr,
                    const int mpiTeamSz, const MPI_Comm &mpiTeamComm,
                    Array<int> &Apart, Array<int> &Adisp) const;

   void ScatterArray(const Vector &inArr, Vector &outArr,
                     const int mpiTeamSz, const MPI_Comm &mpi_comm,
                     Array<int> &Apart, Array<int> &Adisp) const;

   void SetMatrix(const HypreParMatrix &A, const bool update_mat = false);
#endif

   void SetMatrix(const SparseMatrix &A, const bool update_mat = false);

   static int count;

   // Indicates if this instance has been initialized.
   bool isInitialized = false;

#ifdef MFEM_USE_MPI
   // The name of the node that this MPI process belongs to.
   std::string nodeName;

   // Number of local GPU devices used by AmgX.
   int nDevs;

   // The ID of the corresponding GPU device used by this MPI process.
   int devID;

   // A flag indicating if this process will invoke AmgX.
   int gpuProc = MPI_UNDEFINED;

   // Communicator for all MPI ranks.
   MPI_Comm globalCpuWorld = MPI_COMM_NULL;

   // Communicator for ranks in the same node.
   MPI_Comm localCpuWorld;

   // Communicator for ranks sharing a device.
   MPI_Comm devWorld;

   // A communicator for MPI processes that will launch AmgX (root of
   // devWorld).
   MPI_Comm gpuWorld;

   // Global number of MPI procs and this process's global rank id.
   int globalSize;

   int myGlobalRank;

   // Total number of MPI procs in a node and this process's local rank id.
   int localSize;

   int myLocalRank;

   // Total number of MPI ranks sharing a device and this process's rank id
   // within that group.
   int devWorldSize;

   int myDevWorldRank;

   // Total number of MPI procs calling AmgX and this process's rank id within
   // that group.
   int gpuWorldSize;

   int myGpuWorldRank;
#endif

277 
278  // A parameter used by AmgX.
279  int ring;
280 
281  // Sets AmgX precision (currently on double is supported)
282  AMGX_Mode precision_mode = AMGX_mode_dDDI;
283 
284  // AmgX config object.
285  AMGX_config_handle cfg = nullptr;
286 
287  // AmgX matrix object.
288  AMGX_matrix_handle AmgXA = nullptr;
289 
290  // AmgX vector object representing unknowns.
291  AMGX_vector_handle AmgXP = nullptr;
292 
293  // AmgX vector object representing RHS.
294  AMGX_vector_handle AmgXRHS = nullptr;
295 
296  // AmgX solver object.
297  AMGX_solver_handle solver = nullptr;
298 
299  // AmgX resource object.
300  static AMGX_resources_handle rsrc;
301 
302  // Set the ID of the corresponding GPU used by this process.
303  void SetDeviceIDs(const int nDevs);
304 
305  // Initialize all MPI communicators.
306 #ifdef MFEM_USE_MPI
307  void InitMPIcomms(const MPI_Comm &comm, const int nDevs);
308 #endif
309 
310  void InitAmgX();
311 
312  // Row partition for the HypreParMatrix
313  int64_t mat_local_rows;
314 
315  std::string mpi_gpu_mode;
316 };
317 }
318 #endif // MFEM_USE_AMGX
319 #endif // MFEM_AMGX_SOLVER