MFEM  v4.3.0
Finite element discretization library
device.hpp
// Copyright (c) 2010-2021, Lawrence Livermore National Security, LLC. Produced
// at the Lawrence Livermore National Laboratory. All Rights reserved. See files
// LICENSE and NOTICE for details. LLNL-CODE-806117.
//
// This file is part of the MFEM library. For more information and source code
// availability visit https://mfem.org.
//
// MFEM is free software; you can redistribute it and/or modify it under the
// terms of the BSD-3 license. We welcome feedback and contributions, see file
// CONTRIBUTING.md for details.

#ifndef MFEM_DEVICE_HPP
#define MFEM_DEVICE_HPP

#include "globals.hpp"
#include "mem_manager.hpp"

namespace mfem
{

/// MFEM backends.
/** Individual backends will generally implement only a subset of the kernels
    implemented by the default CPU backend. The goal of the backends is to
    accelerate data-parallel portions of the code and they can use a device
    memory space (e.g. GPUs) or share the memory space of the host (OpenMP). */
struct Backend
{
   /** @brief In the documentation below, we use square brackets to indicate the
       type of the backend: host or device. */
   enum Id: unsigned long
   {
      /// [host] Default CPU backend: sequential execution on each MPI rank.
      CPU = 1 << 0,
      /// [host] OpenMP backend. Enabled when MFEM_USE_OPENMP = YES.
      OMP = 1 << 1,
      /// [device] CUDA backend. Enabled when MFEM_USE_CUDA = YES.
      CUDA = 1 << 2,
      /// [device] HIP backend. Enabled when MFEM_USE_HIP = YES.
      HIP = 1 << 3,
      /** @brief [host] RAJA CPU backend: sequential execution on each MPI rank.
          Enabled when MFEM_USE_RAJA = YES. */
      RAJA_CPU = 1 << 4,
      /** @brief [host] RAJA OpenMP backend. Enabled when MFEM_USE_RAJA = YES
          and MFEM_USE_OPENMP = YES. */
      RAJA_OMP = 1 << 5,
      /** @brief [device] RAJA CUDA backend. Enabled when MFEM_USE_RAJA = YES
          and MFEM_USE_CUDA = YES. */
      RAJA_CUDA = 1 << 6,
      /** @brief [device] RAJA HIP backend. Enabled when MFEM_USE_RAJA = YES
          and MFEM_USE_HIP = YES. */
      RAJA_HIP = 1 << 7,
      /** @brief [host] OCCA CPU backend: sequential execution on each MPI rank.
          Enabled when MFEM_USE_OCCA = YES. */
      OCCA_CPU = 1 << 8,
      /// [host] OCCA OpenMP backend. Enabled when MFEM_USE_OCCA = YES.
      OCCA_OMP = 1 << 9,
      /** @brief [device] OCCA CUDA backend. Enabled when MFEM_USE_OCCA = YES
          and MFEM_USE_CUDA = YES. */
      OCCA_CUDA = 1 << 10,
      /** @brief [host] CEED CPU backend. GPU backends can still be used, but
          with expensive memory transfers. Enabled when MFEM_USE_CEED = YES. */
      CEED_CPU = 1 << 11,
      /** @brief [device] CEED CUDA backend working together with the CUDA
          backend. Enabled when MFEM_USE_CEED = YES and MFEM_USE_CUDA = YES.
          NOTE: The current default libCEED CUDA backend is non-deterministic! */
      CEED_CUDA = 1 << 12,
      /** @brief [device] CEED HIP backend working together with the HIP
          backend. Enabled when MFEM_USE_CEED = YES and MFEM_USE_HIP = YES. */
      CEED_HIP = 1 << 13,
      /** @brief [device] Debug backend: host memory is READ/WRITE protected
          while a device is in use. It allows testing the "device" code-path
          (using separate host/device memory pools and host <-> device
          transfers) without any GPU hardware. As 'DEBUG' is sometimes used
          as a macro, `_DEVICE` has been added to avoid conflicts. */
      DEBUG_DEVICE = 1 << 14
   };

   /** @brief Additional useful constants. For example, the *_MASK constants can
       be used with Device::Allows(). */
   enum
   {
      /// Number of backends: from (1 << 0) to (1 << (NUM_BACKENDS-1)).
      NUM_BACKENDS = 15,

      /// Bitwise-OR of all CPU backends
      CPU_MASK = CPU | RAJA_CPU | OCCA_CPU | CEED_CPU,
      /// Bitwise-OR of all CUDA backends
      CUDA_MASK = CUDA | RAJA_CUDA | OCCA_CUDA | CEED_CUDA,
      /// Bitwise-OR of all HIP backends
      HIP_MASK = HIP | RAJA_HIP | CEED_HIP,
      /// Bitwise-OR of all OpenMP backends
      OMP_MASK = OMP | RAJA_OMP | OCCA_OMP,
      /// Bitwise-OR of all CEED backends
      CEED_MASK = CEED_CPU | CEED_CUDA | CEED_HIP,
      /// Bitwise-OR of all device backends
      DEVICE_MASK = CUDA_MASK | HIP_MASK | DEBUG_DEVICE,

      /// Bitwise-OR of all RAJA backends
      RAJA_MASK = RAJA_CPU | RAJA_OMP | RAJA_CUDA | RAJA_HIP,
      /// Bitwise-OR of all OCCA backends
      OCCA_MASK = OCCA_CPU | OCCA_OMP | OCCA_CUDA
   };
};


/** @brief The MFEM Device class abstracts hardware devices such as GPUs, as
    well as programming models such as CUDA, OCCA, RAJA and OpenMP. */
/** This class represents a "virtual device" with the following properties:
    - At most one object of this class can be constructed and that object is
      controlled by its static methods.
    - If no Device object is constructed, the static methods will use a default
      global object which is never configured and always uses Backend::CPU.
    - Once configured, the object cannot be re-configured during the program
      lifetime.
    - MFEM classes use this object to determine where (host or device) to
      perform an operation and which backend implementation to use.
    - Multiple backends can be configured at the same time; currently, a fixed
      priority order is used to select a specific backend from the list of
      configured backends. See the Backend class and the Configure() method in
      this class for details. */
class Device
{
private:
   friend class MemoryManager;
   enum MODES {SEQUENTIAL, ACCELERATED};

   static bool device_env, mem_host_env, mem_device_env, mem_types_set;
   static Device device_singleton;

   MODES mode = Device::SEQUENTIAL;
   int dev = 0;   ///< Device ID of the configured device.
   int ngpu = -1; ///< Number of detected devices; -1: not initialized.
   /// Bitwise-OR of all configured backends.
   unsigned long backends = Backend::CPU;
   /// Set to true during configuration, except in 'device_singleton'.
   bool destroy_mm = false;
   bool mpi_gpu_aware = false;

   MemoryType host_mem_type = MemoryType::HOST;    ///< Current Host MemoryType
   MemoryClass host_mem_class = MemoryClass::HOST; ///< Current Host MemoryClass

   /// Current Device MemoryType
   MemoryType device_mem_type = MemoryType::HOST;
   /// Current Device MemoryClass
   MemoryClass device_mem_class = MemoryClass::HOST;

   char *device_option = NULL;
   Device(Device const&);
   void operator=(Device const&);
   static Device& Get() { return device_singleton; }

   /// Setup switcher based on configuration settings
   void Setup(const int dev = 0);

   void MarkBackend(Backend::Id b) { backends |= b; }

   void UpdateMemoryTypeAndClass();

   /// Enable the use of the configured device in the code that follows.
   /** After this call MFEM classes will use the backend kernels whenever
       possible, transferring data automatically to the device, if necessary.

       If the only configured backend is the default host CPU one, the device
       will remain disabled.

       If the device is actually enabled, this method will also update the
       current host/device MemoryType and MemoryClass. */
   static void Enable();

public:
   /** @brief Default constructor. Unless Configure() is called later, the
       default Backend::CPU will be used. */
   /** @note At most one Device object can be constructed during the lifetime of
       a program.
       @note This object should be destroyed after all other MFEM objects that
       use the Device are destroyed. */
   Device();

   /** @brief Construct a Device and configure it based on the @a device string.
       See Configure() for more details. */
   /** @note At most one Device object can be constructed during the lifetime of
       a program.
       @note This object should be destroyed after all other MFEM objects that
       use the Device are destroyed. */
   Device(const std::string &device, const int dev = 0)
   { Configure(device, dev); }

   /// Destructor.
   ~Device();

   /// Configure the Device backends.
   /** The string parameter @a device must be a comma-separated list of backend
       string names (see below). The @a dev argument specifies the ID of the
       actual devices (e.g. GPU) to use.
       * The available backends are described by the Backend class.
       * The string name of a backend is the lowercase version of the
         Backend::Id enumeration constant with '_' replaced by '-', e.g. the
         string name of 'RAJA_CPU' is 'raja-cpu'. The string name of the debug
         backend (Backend::Id 'DEBUG_DEVICE') is exceptionally set to 'debug'.
       * The 'cpu' backend is always enabled with lowest priority.
       * The current backend priority from highest to lowest is:
         'ceed-cuda', 'occa-cuda', 'raja-cuda', 'cuda',
         'ceed-hip', 'hip', 'debug',
         'occa-omp', 'raja-omp', 'omp',
         'ceed-cpu', 'occa-cpu', 'raja-cpu', 'cpu'.
       * Multiple backends can be configured at the same time.
       * Only one 'occa-*' backend can be configured at a time.
       * The backend 'occa-cuda' enables the 'cuda' backend unless 'raja-cuda'
         is already enabled.
       * The backend 'occa-omp' enables the 'omp' backend (if MFEM was built
         with MFEM_USE_OPENMP=YES) unless 'raja-omp' is already enabled.
       * Only one 'ceed-*' backend can be configured at a time.
       * The backend 'ceed-cpu' delegates to a libCEED CPU backend the setup and
         evaluation of the operator.
       * The backend 'ceed-cuda' delegates to a libCEED CUDA backend the setup
         and evaluation of operators and enables the 'cuda' backend to avoid
         transfers between host and device.
       * The backend 'ceed-hip' delegates to a libCEED HIP backend the setup
         and evaluation of operators and enables the 'hip' backend to avoid
         transfers between host and device.
       * The 'debug' backend should not be combined with other device backends.
   */
   void Configure(const std::string &device, const int dev = 0);

   /// Set the default host and device MemoryTypes, @a h_mt and @a d_mt.
   /** The host and device MemoryTypes are also set to be dual to each other.

       These two MemoryType%s are used by most MFEM classes when allocating
       memory used on host and device, respectively.

       This method can only be called before Device construction and
       configuration, and the specified memory types must be compatible with
       the subsequent Device configuration. */
   static void SetMemoryTypes(MemoryType h_mt, MemoryType d_mt);

   /// Print the configuration of the MFEM virtual device object.
   void Print(std::ostream &out = mfem::out);

   /// Return true if Configure() has been called previously.
   static inline bool IsConfigured() { return Get().ngpu >= 0; }

   /// Return true if an actual device (e.g. GPU) has been configured.
   static inline bool IsAvailable() { return Get().ngpu > 0; }

   /// Return true if any backend other than Backend::CPU is enabled.
   static inline bool IsEnabled() { return Get().mode == ACCELERATED; }

   /// The opposite of IsEnabled().
   static inline bool IsDisabled() { return !IsEnabled(); }

   /// Get the device id of the configured device.
   static inline int GetId() { return Get().dev; }

   /** @brief Return true if any of the backends in the backend mask, @a b_mask,
       are allowed. */
   /** This method can be used with any of the Backend::Id constants, the
       Backend::*_MASK, or combinations of those. */
   static inline bool Allows(unsigned long b_mask)
   { return Get().backends & b_mask; }

   /** @brief Get the current Host MemoryType. This is the MemoryType used by
       most MFEM classes when allocating memory used on the host.
   */
   static inline MemoryType GetHostMemoryType() { return Get().host_mem_type; }

   /** @brief Get the current Host MemoryClass. This is the MemoryClass used
       by most MFEM host Memory objects. */
   static inline MemoryClass GetHostMemoryClass() { return Get().host_mem_class; }

   /** @brief Get the current Device MemoryType. This is the MemoryType used by
       most MFEM classes when allocating memory to be used with device kernels.
   */
   static inline MemoryType GetDeviceMemoryType() { return Get().device_mem_type; }

   /// (DEPRECATED) Equivalent to GetDeviceMemoryType().
   /** @deprecated Use GetDeviceMemoryType() instead. */
   static inline MemoryType GetMemoryType() { return Get().device_mem_type; }

   /** @brief Get the current Device MemoryClass. This is the MemoryClass used
       by most MFEM device kernels to access Memory objects. */
   static inline MemoryClass GetDeviceMemoryClass() { return Get().device_mem_class; }

   /// (DEPRECATED) Equivalent to GetDeviceMemoryClass().
   /** @deprecated Use GetDeviceMemoryClass() instead. */
   static inline MemoryClass GetMemoryClass() { return Get().device_mem_class; }

   /// Set the internal GPU-aware MPI flag to @a force.
   static void SetGPUAwareMPI(const bool force = true)
   { Get().mpi_gpu_aware = force; }

   /// Return the internal GPU-aware MPI flag.
   static bool GetGPUAwareMPI() { return Get().mpi_gpu_aware; }
};


// Inline Memory access functions using the mfem::Device DeviceMemoryClass or
// the mfem::Device HostMemoryClass.

/** @brief Return the memory class to be used by the functions Read(), Write(),
    and ReadWrite(), while setting the device use flag in @a mem, if @a on_dev
    is true. */
template <typename T>
MemoryClass GetMemoryClass(const Memory<T> &mem, bool on_dev)
{
   if (!on_dev)
   {
      return Device::GetHostMemoryClass();
   }
   else
   {
      mem.UseDevice(true);
      return Device::GetDeviceMemoryClass();
   }
}

/** @brief Get a pointer for read access to @a mem with the mfem::Device's
    DeviceMemoryClass, if @a on_dev = true, or the mfem::Device's
    HostMemoryClass, otherwise. */
/** Also, if @a on_dev = true, the device flag of @a mem will be set. */
template <typename T>
inline const T *Read(const Memory<T> &mem, int size, bool on_dev = true)
{
   return mem.Read(GetMemoryClass(mem, on_dev), size);
}

/** @brief Shortcut to Read(const Memory<T> &mem, int size, false) */
template <typename T>
inline const T *HostRead(const Memory<T> &mem, int size)
{
   return mfem::Read(mem, size, false);
}

/** @brief Get a pointer for write access to @a mem with the mfem::Device's
    DeviceMemoryClass, if @a on_dev = true, or the mfem::Device's
    HostMemoryClass, otherwise. */
/** Also, if @a on_dev = true, the device flag of @a mem will be set. */
template <typename T>
inline T *Write(Memory<T> &mem, int size, bool on_dev = true)
{
   return mem.Write(GetMemoryClass(mem, on_dev), size);
}

/** @brief Shortcut to Write(Memory<T> &mem, int size, false) */
template <typename T>
inline T *HostWrite(Memory<T> &mem, int size)
{
   return mfem::Write(mem, size, false);
}

/** @brief Get a pointer for read+write access to @a mem with the mfem::Device's
    DeviceMemoryClass, if @a on_dev = true, or the mfem::Device's
    HostMemoryClass, otherwise. */
/** Also, if @a on_dev = true, the device flag of @a mem will be set. */
template <typename T>
inline T *ReadWrite(Memory<T> &mem, int size, bool on_dev = true)
{
   return mem.ReadWrite(GetMemoryClass(mem, on_dev), size);
}

/** @brief Shortcut to ReadWrite(Memory<T> &mem, int size, false) */
template <typename T>
inline T *HostReadWrite(Memory<T> &mem, int size)
{
   return mfem::ReadWrite(mem, size, false);
}

} // mfem

#endif // MFEM_DEVICE_HPP
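
The naming rule and bitmask test described in the Configure() and Allows() comments can be tried out without MFEM. The following is a minimal standalone sketch, not the library's implementation: the helpers `BackendName` and `ParseBackends` and the constant `CUDA_SUBSET_MASK` are hypothetical names introduced here, only a handful of backends are recognized, and the side effects documented for Configure() (for example 'occa-cuda' also enabling 'cuda') are deliberately left out.

```cpp
#include <cassert>
#include <cctype>
#include <sstream>
#include <string>

// Illustrative copies of a few Backend::Id bit values from the listing.
enum : unsigned long
{
   CPU = 1ul << 0,
   OMP = 1ul << 1,
   CUDA = 1ul << 2,
   RAJA_CUDA = 1ul << 6,
   DEBUG_DEVICE = 1ul << 14,
   // Hypothetical mask covering only the CUDA backends modeled in this sketch.
   CUDA_SUBSET_MASK = CUDA | RAJA_CUDA
};

// Hypothetical helper: map an enumerator name to its string name using the
// documented rule (lowercase, '_' replaced by '-', 'debug' as the exception).
std::string BackendName(const std::string &id)
{
   if (id == "DEBUG_DEVICE") { return "debug"; }
   std::string name = id;
   for (char &c : name)
   {
      c = (c == '_') ? '-'
          : static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
   }
   return name;
}

// Hypothetical helper: turn a comma-separated device string into a bitmask.
// Names this sketch does not know are silently ignored.
unsigned long ParseBackends(const std::string &device)
{
   unsigned long mask = CPU; // the 'cpu' backend is always enabled
   std::istringstream in(device);
   std::string token;
   while (std::getline(in, token, ','))
   {
      if (token == "omp") { mask |= OMP; }
      else if (token == "cuda") { mask |= CUDA; }
      else if (token == "raja-cuda") { mask |= RAJA_CUDA; }
      else if (token == "debug") { mask |= DEBUG_DEVICE; }
   }
   return mask;
}

// Same test as the body of Device::Allows(): true if any bit of b_mask is set.
bool Allows(unsigned long backends, unsigned long b_mask)
{
   return (backends & b_mask) != 0;
}
```

With these definitions, `Allows(ParseBackends("raja-cuda,omp"), CUDA_SUBSET_MASK)` is true because the `RAJA_CUDA` bit falls inside the mask, while `Allows(ParseBackends("omp"), CUDA_SUBSET_MASK)` is false. Passing a mask rather than a single Id is what lets a caller ask "is any CUDA-type backend configured?" in one call.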