MFEM v4.9.0
Finite element discretization library
Loading...
Searching...
No Matches
hash.hpp
Go to the documentation of this file.
1// Copyright (c) 2010-2025, Lawrence Livermore National Security, LLC. Produced
2// at the Lawrence Livermore National Laboratory. All Rights reserved. See files
3// LICENSE and NOTICE for details. LLNL-CODE-806117.
4//
5// This file is part of the MFEM library. For more information and source code
6// availability visit https://mfem.org.
7//
8// MFEM is free software; you can redistribute it and/or modify it under the
9// terms of the BSD-3 license. We welcome feedback and contributions, see file
10// CONTRIBUTING.md for details.
11
12#ifndef MFEM_HASH
13#define MFEM_HASH
14
15#include "../config/config.hpp"
16#include "array.hpp"
17#include "globals.hpp"
18
19#include <array>
20#include <cstdint>
21#include <type_traits>
22#include <utility>
23
24namespace mfem
25{
26
27/** A concept for items that should be used in HashTable and be accessible by
28 * hashing two IDs.
29 */
30struct Hashed2
31{
32 int p1, p2;
33 int next;
34};
35
36/** A concept for items that should be used in HashTable and be accessible by
37 * hashing four IDs.
38 */
39struct Hashed4
40{
41 int p1, p2, p3; // NOTE: p4 is neither hashed nor stored
42 int next;
43};
44
45
46/** HashTable is a container for items that require associative access through
47 * pairs (or quadruples) of indices:
48 *
49 * (p1, p2) -> item
50 * (p1, p2, p3, p4) -> item
51 *
52 * An example of this are edges and faces in a mesh. Each edge is uniquely
53 * identified by two parent vertices and so can be easily accessed from
54 * different elements using this class. Similarly for faces.
55 *
56 * The order of the p1, p2, ... indices is not relevant as they are sorted
57 * each time this class is invoked.
58 *
59 * There are two main methods this class provides. The Get(...) methods always
60 * return an item given the two or four indices. If the item did not previously
61 * exist, the methods creates a new one. The Find(...) methods, on the other
62 * hand, just return NULL or -1 if the item does not exist.
63 *
64 * Each new item is automatically assigned a unique ID - the index of the item
65 * inside the BlockArray. The IDs may (but need not) be used as p1, p2, ... of
66 * other items.
67 *
68 * The item type (T) needs to follow either the Hashed2 or the Hashed4
69 * concept. It is easiest to just inherit from these structs.
70 *
71 * All items in the container can also be accessed sequentially using the
72 * provided iterator.
73 *
74 * Notes:
75 * The data structure and implementation is based on a BlockArray<T> which
76 * provides an efficient item storage that avoids heap fragmentation, and
77 * index-based item access. The hash table implemented on top of the
78 * BlockArray provides fast associative (key -> value) access by grouping
79 * items into bins (buckets) of O(1) size.
80 * - "id" denotes the index of an item in the underlying BlockArray<T>,
81 * - "idx" denotes the index of a bin, determined by hashing a key with
82 * the function `Hash`.
83 */
84template<typename T>
85class HashTable : public BlockArray<T>
86{
87protected:
89
90public:
91 /** @brief Main constructor of the HashTable class.
92
93 @param[in] block_size The size of the storage blocks of the underlying
94 BlockArray<T>.
95 @param[in] init_hash_size The initial size of the hash table. Must be
96 a power of 2. */
97 HashTable(int block_size = 16*1024, int init_hash_size = 32*1024);
98 /// Deep copy
99 HashTable(const HashTable& other);
100 /// Copy assignment not supported
101 HashTable& operator=(const HashTable&) = delete;
103
104 /** @brief Item accessor with key (or parents) the pair p1, p2. Default
105 construct an item of type T if no value corresponds to the requested key.
106
107 @param[in] p1 First part of the key.
108 @param[in] p2 Second part of the key.
109 @return The index "id" of the key in the BlockArray<T>.
110
111 @warning This method should only be called if T inherits from Hashed2. */
112 T* Get(int p1, int p2);
113
114 /** @brief Item accessor with key (or parents) the quadruplet p1, p2, p3, p4.
115 The key p4 is optional. Default construct an item of type T if no value
116 corresponds to the requested key.
117
118 @param[in] p1 First part of the key.
119 @param[in] p2 Second part of the key.
120 @param[in] p3 Third part of the key.
121 @param[in] p4 Fourth part of the key (optional).
122 @return The index "id" of the key in the BlockArray<T>.
123
124 @warning This method should only be called if T inherits from Hashed4. */
125 T* Get(int p1, int p2, int p3, int p4 = -1 /* p4 optional */);
126
127 /** @brief Get the "id" of the item whose parents are p1, p2, this "id"
128 corresponding to the index of the item in the underlying BlockArray<T>
129 object. Default construct an item and "id" if no value corresponds to the
130 requested key.
131
132 @param[in] p1 First part of the key.
133 @param[in] p2 Second part of the key.
134 @return The index "id" of the key in the BlockArray<T>.
135
136 @warning This method should only be called if T inherits from Hashed2. */
137 int GetId(int p1, int p2);
138
139 /** @brief Get the "id" of an item, this "id" corresponding to the index of
140 the item in the underlying BlockArray<T> object. Default construct an item
141 and "id" if no value corresponds to the requested key.
142
143 @param[in] p1 First part of the key.
144 @param[in] p2 Second part of the key.
145 @param[in] p3 Third part of the key.
146 @param[in] p4 Fourth part of the key (optional).
147 @return The index "id" of the key in the BlockArray<T>.
148
149 @warning This method should only be called if T inherits from Hashed4. */
150 int GetId(int p1, int p2, int p3, int p4 = -1);
151
152 /** @brief Item accessor with key (or parents) the pair p1, p2. Return
153 NULL if no value corresponds to the requested key.
154
155 @param[in] p1 First part of the key.
156 @param[in] p2 Second part of the key.
157 @return The item associated to the key (p1,p2).
158
159 @warning This method should only be called if T inherits from Hashed2. */
160 T* Find(int p1, int p2);
161
162 /** @brief Item accessor with key (or parents) the quadruplet p1, p2, p3, p4.
163 The key p4 is optional. Return NULL if no value corresponds to the
164 requested key.
165
166 @param[in] p1 First part of the key.
167 @param[in] p2 Second part of the key.
168 @param[in] p3 Third part of the key.
169 @param[in] p4 Fourth part of the key (optional).
170 @return The item associated to the key (p1,p2,p3,p4).
171
172 @warning This method should only be called if T inherits from Hashed4. */
173 T* Find(int p1, int p2, int p3, int p4 = -1);
174
175 /** @brief Item const accessor with key (or parents) the pair p1, p2.
176 Return NULL if no value corresponds to the requested key.
177
178 @param[in] p1 First part of the key.
179 @param[in] p2 Second part of the key.
180 @return The item associated to the key (p1,p2).
181
182 @warning This method should only be called if T inherits from Hashed2. */
183 const T* Find(int p1, int p2) const;
184
185 /** @brief Item const accessor with key (or parents) the quadruplet p1, p2,
186 p3, p4. The key p4 is optional. Return NULL if no value corresponds to the
187 requested key.
188
189 @param[in] p1 First part of the key.
190 @param[in] p2 Second part of the key.
191 @param[in] p3 Third part of the key.
192 @param[in] p4 Fourth part of the key (optional).
193 @return The item associated to the key (p1,p2,p3,p4).
194
195 @warning This method should only be called if T inherits from Hashed4. */
196 const T* Find(int p1, int p2, int p3, int p4 = -1) const;
197
198 /** @brief Find the "id" of an item whose parents are p1, p2. Return -1 if it
199 does not exist.
200
201 This "id" corresponds to the index of the item in the underlying
202 BlockArray<T> object. Default construct an item and "id" if no value
203 corresponds to the requested key.
204
205 @param[in] p1 First part of the key.
206 @param[in] p2 Second part of the key.
207 @return The index "id" of the key in the BlockArray<T>.
208
209 @warning This method should only be called if T inherits from Hashed2. */
210 int FindId(int p1, int p2) const;
211
212 /** @brief Find the "id" of an item, this "id" corresponding to the index of
213 the item in the underlying BlockArray<T> object. Return -1 if it does not
214 exist. Default construct an item and "id" if no value corresponds to the
215 requested key.
216
217 @param[in] p1 First part of the key.
218 @param[in] p2 Second part of the key.
219 @param[in] p3 Third part of the key.
220 @param[in] p4 Fourth part of the key (optional).
221 @return The index "id" of the key in the BlockArray<T>.
222
223 @warning This method should only be called if T inherits from Hashed4. */
224 int FindId(int p1, int p2, int p3, int p4 = -1) const;
225
226 /// Return the number of elements currently stored in the HashTable.
227 int Size() const { return Base::Size() - unused.Size(); }
228
229 /// Return the total number of ids (used and unused) in the HashTable.
230 int NumIds() const { return Base::Size(); }
231
232 /// Return the number of free/unused ids in the HashTable.
233 int NumFreeIds() const { return unused.Size(); }
234
235 /** @brief Return true if item @a id exists in (is used by) the container.
236
237 @param[in] id Index of the item in the underlying BlockArray<T>.
238
239 @warning It is assumed that 0 <= id < NumIds(). */
240 bool IdExists(int id) const { return (Base::At(id).next != -2); }
241
242 /** @brief Remove an item from the hash table.
243
244 @param[in] id Index of the item in the underlying BlockArray<T>.
245
246 @warning Its @a id will be reused by newly added items. */
247 void Delete(int id);
248
249 /// Remove all items.
250 void DeleteAll();
251
252 /** @brief Allocate an item at @a id. Enlarge the underlying BlockArray if
253 necessary.
254
255 @param[in] id Index of the item in the underlying BlockArray<T>.
256 @param[in] p1 First part of the key.
257 @param[in] p2 Second part of the key.
258
259 @warning This is a special purpose method used when loading data from a
260 file. Does nothing if the slot @a id has already been allocated. */
261 void Alloc(int id, int p1, int p2);
262
263 /** @brief Reinitialize the internal list of unallocated items.
264
265 @warning This is a special purpose method used when loading data from a file. */
267
268 /** @brief Change the key associated with an item.
269
270 In other words, makes an item hashed under different parent IDs.
271
272 @param[in] id Index of the item in the underlying BlockArray<T>.
273 @param[in] new_p1 First part of the new key.
274 @param[in] new_p2 Second part of the new key.
275
276 @warning This method should only be called if T inherits from Hashed2. */
277 void Reparent(int id, int new_p1, int new_p2);
278
279 /** @brief Change the key associated with an item.
280
281 In other words, makes an item hashed under different parent IDs.
282
283 @param[in] id Index of the item in the underlying BlockArray<T>.
284 @param[in] new_p1 First part of the new key.
285 @param[in] new_p2 Second part of the new key.
286 @param[in] new_p3 Third part of the new key.
287 @param[in] new_p4 Fourth part of the new key (optional).
288
289 @warning This method should only be called if T inherits from Hashed4. */
290 void Reparent(int id, int new_p1, int new_p2, int new_p3, int new_p4 = -1);
291
292 /// Return total size of allocated memory (tables plus items), in bytes.
293 std::size_t MemoryUsage() const;
294
295 /// Write details of the memory usage to the mfem output stream.
296 void PrintMemoryDetail() const;
297
298 /// Print a histogram of bin sizes for debugging purposes.
299 void PrintStats() const;
300
302 {
303 protected:
304 friend class HashTable;
305 typedef typename Base::iterator base;
306
308 iterator(const base &it) : base(it)
309 {
310 while (base::good() && (*this)->next == -2) { base::next(); }
311 }
312
313 public:
315 {
316 while (base::next(), base::good() && (*this)->next == -2) { }
317 return *this;
318 }
319 };
320
322 {
323 protected:
324 friend class HashTable;
325 typedef typename Base::const_iterator base;
326
328 const_iterator(const base &it) : base(it)
329 {
330 while (base::good() && (*this)->next == -2) { base::next(); }
331 }
332
333 public:
335 {
336 while (base::next(), base::good() && (*this)->next == -2) { }
337 return *this;
338 }
339 };
340
342 iterator end() { return iterator(); }
344 const_iterator end() const { return const_iterator(); }
345
347 const_iterator cend() const { return const_iterator(); }
348
349protected:
350 /** The hash table: each bin is a linked list of items. For each non-empty
351 bin, this arrays stores the "id" of the first item in the list, or -1
352 if the bin is empty. */
353 int* table;
354
355 /** mask = table_size-1. Used for fast modulo operation in Hash(), to wrap
356 the raw hashed index around the current table size (which must be a power
357 of two). */
358 int mask;
359
360 /** List of deleted items in the BlockArray<T>. New items are created with
361 these ids first, before they are appended to the block array. */
363
364 /** @brief hash function for Hashed2 items.
365
366 @param[in] p1 First part of the key.
367 @param[in] p2 Second part of the key.
368 @return The hash key "idx" identifying a bin/bucket.
369
370 NOTE: the constants are arbitrary
371 @warning This method should only be called if T inherits from Hashed2. */
372 inline int Hash(size_t p1, size_t p2) const
373 { return (984120265ul*p1 + 125965121ul*p2) & mask; }
374
375 /** @brief hash function for Hashed4 items.
376
377 @param[in] p1 First part of the key.
378 @param[in] p2 Second part of the key.
379 @param[in] p3 Third part of the key.
380 @return The hash key "idx" identifying a bin/bucket.
381
382 NOTE: The constants are arbitrary.
383 NOTE: p4 is not hashed nor stored as p1, p2, p3 identify a face uniquely.
384 @warning This method should only be called if T inherits from Hashed4. */
385 inline int Hash(size_t p1, size_t p2, size_t p3) const
386 { return (984120265ul*p1 + 125965121ul*p2 + 495698413ul*p3) & mask; }
387
388 // Delete() and Reparent() use one of these:
389 /// Hash function for items of type T that inherit from Hashed2.
390 inline int Hash(const Hashed2& item) const
391 { return Hash(item.p1, item.p2); }
392
393 /// Hash function for items of type T that inherit from Hashed4.
394 inline int Hash(const Hashed4& item) const
395 { return Hash(item.p1, item.p2, item.p3); }
396
397 /** @brief Search the index of the item associated to the key (p1,p2)
398 starting from the item with index @a id.
399
400 @param[in] id Index of the item in the underlying BlockArray<T>.
401 @param[in] p1 First part of the key.
402 @param[in] p2 Second part of the key.
403 @return The index "id" of the key in the BlockArray<T>.
404
405 @warning This method should only be called if T inherits from Hashed2. */
406 int SearchList(int id, int p1, int p2) const;
407
408 /** @brief Search the index of the item associated to the key (p1,p2,p3,(p4))
409 starting from the item with index @a id.
410
411 @param[in] id Index of the item in the underlying BlockArray<T>.
412 @param[in] p1 First part of the key.
413 @param[in] p2 Second part of the key.
414 @param[in] p3 Third part of the key.
415 @return The index "id" of the key in the BlockArray<T>.
416
417 @warning This method should only be called if T inherits from Hashed4. */
418 int SearchList(int id, int p1, int p2, int p3) const;
419
420 /** @brief Insert the item @a id into bin @a idx.
421
422 @param[in] idx The bin/bucket index.
423 @param[in] id The index of the item in the BlockArray<T>.
424 @param[in] item The item to insert at the beginning of the linked list.
425
426 @warning The method only works with bin @a idx and does not check the
427 overall fill factor of the hash table. If appropriate, use
428 CheckRehash() for that. */
429 inline void Insert(int idx, int id, T &item);
430
431 /** @brief Unlink an item @a id from the linked list of bin @a idx.
432
433 @param[in] idx The bin/bucket index.
434 @param[in] id The index of the item in the BlockArray<T>.
435
436 @warning The method aborts if the item is not found. */
437 void Unlink(int idx, int id);
438
439 /** @brief Check table fill factor and resize if necessary.
440
441 The method checks the average size of the bins (i.e., the fill factor).
442 If the fill factor is > 2, the table is enlarged (see DoRehash()). */
443 inline void CheckRehash();
444
445 /** @brief Double the size of the hash table (i.e., double the number of bins)
446 and reinsert all items into the new bins.
447
448 NOTE: Rehashing is computationally expensive (O(N) in the number of items),
449 but since it is only done rarely (when the number of items doubles), the
450 amortized complexity of inserting an item is still O(1). */
451 void DoRehash();
452
453 /** @brief Return the size of the bin @a idx.
454
455 @param[in] idx The index of the bin.
456 @return The size of the bin. */
457 int BinSize(int idx) const;
458};
459
460///
461/// @brief streaming implementation for murmurhash3 128 (x64).
462/// Constructs the hash in 3 stages: init, append, finalize.
463///
464struct Hasher
465{
466 /// where the final hash result is stored after finalize. Use data[1] when
467 /// only 64 bits are required.
468 uint64_t data[2] = {0, 0};
469
470private:
471 uint64_t nbytes = 0;
472
473 uint64_t buf_[2] = {0, 0};
474
475public:
476
477 /// resets this hasher back to an initial seed
478 void init(uint64_t seed = 0);
479 void append(const uint8_t *vs, uint64_t bytes);
480
481 void finalize();
482
483private:
484 // add 16 bytes
485 void add_block(uint64_t k1, uint64_t k2);
486
487 // add [1-8] more bytes, then finalize
488 void finalize(uint64_t k1, int num);
489
490 // add [1-15] more bytes, then finalize
491 // 0 < num < 16
492 void finalize(uint64_t k1, uint64_t k2, int num);
493};
494
495/// Helper class for hashing std::pair. Usable in place of std::hash<std::pair<T,U>>
497{
498 template <class T, class V>
499 size_t operator()(const std::pair<T, V> &v) const noexcept
500 {
501 Hasher hash;
502 // chosen randomly with a 2^64-sided dice
503 hash.init(0xfebd1fe69813c14full);
504 hash.append(reinterpret_cast<const uint8_t *>(&v.first), sizeof(T));
505 hash.append(reinterpret_cast<const uint8_t *>(&v.second), sizeof(V));
506 hash.finalize();
507 return hash.data[1];
508 }
509};
510
511/// Helper class for hashing std::array. Usable in place of std::hash<std::array<T,N>>
513{
514 template <class T, size_t N>
515 size_t operator()(const std::array<T, N> &v) const noexcept
516 {
517 Hasher hash;
518 // chosen randomly with a 2^64-sided dice
519 hash.init(0xfebd1fe69813c14full);
520 for (size_t i = 0; i < N; ++i)
521 {
522 hash.append(reinterpret_cast<const uint8_t *>(&v[i]), sizeof(T));
523 }
524 hash.finalize();
525 return hash.data[1];
526 }
527};
528
529/// Hash function for data sequences.
530/** Depends on GnuTLS for SHA-256 hashing. */
532{
533protected:
535
536 /// Add a sequence of bytes for hashing
537 void HashBuffer(const void *buffer, size_t num_bytes);
538
539 /// Integer encoding method; result is independent of endianness and type
540 template <typename int_type_const_iter>
541 HashFunction &EncodeAndHashInts(int_type_const_iter begin,
542 int_type_const_iter end);
543
544 /// Double encoding method: encode in little-endian byte-order
545 template <typename double_const_iter>
546 HashFunction &EncodeAndHashDoubles(double_const_iter begin,
547 double_const_iter end);
548
549public:
550 /// Default constructor: initialize the hash function
551 HashFunction();
552
553 /// Destructor
555
556 /// Add a sequence of bytes for hashing
557 HashFunction &AppendBytes(const void *seq, size_t num_bytes)
558 { HashBuffer(seq, num_bytes); return *this; }
559
560 /// Add a sequence of integers for hashing, given as a c-array.
561 /** Before hashing the sequence is encoded so that the result is independent
562 of endianness and type: int, long, unsigned, etc. */
563 template <typename int_type>
564 HashFunction &AppendInts(const int_type *ints, size_t num_ints)
565 { return EncodeAndHashInts(ints, ints + num_ints); }
566
567 /// Add a sequence of integers for hashing, given as a fixed-size c-array.
568 /** Before hashing the sequence is encoded so that the result is independent
569 of endianness and type: int, long, unsigned, etc. */
570 template <typename int_type, size_t num_ints>
571 HashFunction &AppendInts(const int_type (&ints)[num_ints])
572 { return EncodeAndHashInts(ints, ints + num_ints); }
573
574 /// Add a sequence of integers for hashing, given as a container.
575 /** Before hashing the sequence is encoded so that the result is independent
576 of endianness and type: int, long, unsigned, etc. */
577 template <typename int_type_container>
578 HashFunction &AppendInts(const int_type_container &ints)
579 { return EncodeAndHashInts(ints.begin(), ints.end()); }
580
581 /// Add a sequence of doubles for hashing, given as a c-array.
582 /** Before hashing the sequence is encoded so that the result is independent
583 of endianness. */
584 HashFunction &AppendDoubles(const real_t *doubles, size_t num_doubles)
585 { return EncodeAndHashDoubles(doubles, doubles + num_doubles); }
586
587 /// Add a sequence of doubles for hashing, given as a fixed-size c-array.
588 /** Before hashing the sequence is encoded so that the result is independent
589 of endianness. */
590 template <size_t num_doubles>
591 HashFunction &AppendDoubles(const real_t (&doubles)[num_doubles])
592 { return EncodeAndHashDoubles(doubles, doubles + num_doubles); }
593
594 /// Add a sequence of doubles for hashing, given as a container.
595 /** Before hashing the sequence is encoded so that the result is independent
596 of endianness. */
597 template <typename double_container>
598 HashFunction &AppendDoubles(const double_container &doubles)
599 { return EncodeAndHashDoubles(doubles.begin(), doubles.end()); }
600
601 /** @brief Return the hash string for the current sequence and reset (clear)
602 the sequence. */
603 std::string GetHash() const;
604};
605
606
607// implementation
608
609template<typename T>
610HashTable<T>::HashTable(int block_size, int init_hash_size)
611 : Base(block_size)
612{
613 mask = init_hash_size-1;
614 MFEM_VERIFY(!(init_hash_size & mask), "init_size must be a power of two.");
615
616 table = new int[init_hash_size];
617 for (int i = 0; i < init_hash_size; i++)
618 {
619 table[i] = -1;
620 }
621}
622
623template<typename T>
625 : Base(other), mask(other.mask)
626{
627 int size = mask+1;
628 table = new int[size];
629 memcpy(table, other.table, size*sizeof(int));
630 other.unused.Copy(unused);
631}
632
633template<typename T>
635{
636 delete [] table;
637}
638
639namespace internal
640{
641
642inline void sort3(int &a, int &b, int &c)
643{
644 if (a > b) { std::swap(a, b); }
645 if (a > c) { std::swap(a, c); }
646 if (b > c) { std::swap(b, c); }
647}
648
649inline void sort4(int &a, int &b, int &c, int &d)
650{
651 if (a > b) { std::swap(a, b); }
652 if (a > c) { std::swap(a, c); }
653 if (a > d) { std::swap(a, d); }
654 sort3(b, c, d);
655}
656
657inline void sort4_ext(int &a, int &b, int &c, int &d)
658{
659 if (d < 0) // support optional last index
660 {
661 sort3(a, b, c);
662 }
663 else
664 {
665 sort4(a, b, c, d);
666 }
667}
668
669} // internal
670
671template<typename T>
672inline T* HashTable<T>::Get(int p1, int p2)
673{
674 return &(Base::At(GetId(p1, p2)));
675}
676
677template<typename T>
678inline T* HashTable<T>::Get(int p1, int p2, int p3, int p4)
679{
680 return &(Base::At(GetId(p1, p2, p3, p4)));
681}
682
683template<typename T>
684int HashTable<T>::GetId(int p1, int p2)
685{
686 // search for the item in the hashtable
687 if (p1 > p2) { std::swap(p1, p2); }
688 int idx = Hash(p1, p2);
689 int id = SearchList(table[idx], p1, p2);
690 if (id >= 0) { return id; }
691
692 // not found - use an unused item or create a new one
693 int new_id;
694 if (unused.Size())
695 {
696 new_id = unused.Last();
697 unused.DeleteLast();
698 }
699 else
700 {
701 new_id = Base::Append();
702 }
703 T& item = Base::At(new_id);
704 item.p1 = p1;
705 item.p2 = p2;
706
707 // insert into hashtable
708 Insert(idx, new_id, item);
709 CheckRehash();
710
711 return new_id;
712}
713
714template<typename T>
715int HashTable<T>::GetId(int p1, int p2, int p3, int p4)
716{
717 // search for the item in the hashtable
718 internal::sort4_ext(p1, p2, p3, p4);
719 int idx = Hash(p1, p2, p3);
720 int id = SearchList(table[idx], p1, p2, p3);
721 if (id >= 0) { return id; }
722
723 // not found - use an unused item or create a new one
724 int new_id;
725 if (unused.Size())
726 {
727 new_id = unused.Last();
728 unused.DeleteLast();
729 }
730 else
731 {
732 new_id = Base::Append();
733 }
734 T& item = Base::At(new_id);
735 item.p1 = p1;
736 item.p2 = p2;
737 item.p3 = p3;
738
739 // insert into hashtable
740 Insert(idx, new_id, item);
741 CheckRehash();
742
743 return new_id;
744}
745
746template<typename T>
747inline T* HashTable<T>::Find(int p1, int p2)
748{
749 int id = FindId(p1, p2);
750 return (id >= 0) ? &(Base::At(id)) : NULL;
751}
752
753template<typename T>
754inline T* HashTable<T>::Find(int p1, int p2, int p3, int p4)
755{
756 int id = FindId(p1, p2, p3, p4);
757 return (id >= 0) ? &(Base::At(id)) : NULL;
758}
759
760template<typename T>
761inline const T* HashTable<T>::Find(int p1, int p2) const
762{
763 int id = FindId(p1, p2);
764 return (id >= 0) ? &(Base::At(id)) : NULL;
765}
766
767template<typename T>
768inline const T* HashTable<T>::Find(int p1, int p2, int p3, int p4) const
769{
770 int id = FindId(p1, p2, p3, p4);
771 return (id >= 0) ? &(Base::At(id)) : NULL;
772}
773
774template<typename T>
775int HashTable<T>::FindId(int p1, int p2) const
776{
777 if (p1 > p2) { std::swap(p1, p2); }
778 return SearchList(table[Hash(p1, p2)], p1, p2);
779}
780
781template<typename T>
782int HashTable<T>::FindId(int p1, int p2, int p3, int p4) const
783{
784 internal::sort4_ext(p1, p2, p3, p4);
785 return SearchList(table[Hash(p1, p2, p3)], p1, p2, p3);
786}
787
788template<typename T>
789int HashTable<T>::SearchList(int id, int p1, int p2) const
790{
791 while (id >= 0)
792 {
793 const T& item = Base::At(id);
794 if (item.p1 == p1 && item.p2 == p2) { return id; }
795 id = item.next;
796 }
797 return -1;
798}
799
800template<typename T>
801int HashTable<T>::SearchList(int id, int p1, int p2, int p3) const
802{
803 while (id >= 0)
804 {
805 const T& item = Base::At(id);
806 if (item.p1 == p1 && item.p2 == p2 && item.p3 == p3) { return id; }
807 id = item.next;
808 }
809 return -1;
810}
811
812template<typename T>
814{
815 const int fill_factor = 2;
816
817 // is the table overfull?
818 if (Base::Size() > (mask+1) * fill_factor)
819 {
820 DoRehash();
821 }
822}
823
824template<typename T>
826{
827 delete [] table;
828
829 // double the table size
830 int new_table_size = 2*(mask+1);
831 table = new int[new_table_size];
832 for (int i = 0; i < new_table_size; i++) { table[i] = -1; }
833 mask = new_table_size-1;
834
835#if defined(MFEM_DEBUG) && !defined(MFEM_USE_MPI)
836 mfem::out << _MFEM_FUNC_NAME << ": rehashing to size " << new_table_size
837 << std::endl;
838#endif
839
840 // reinsert all items
841 for (iterator it = begin(); it != end(); ++it)
842 {
843 Insert(Hash(*it), it.index(), *it);
844 }
845}
846
847template<typename T>
848inline void HashTable<T>::Insert(int idx, int id, T &item)
849{
850 // add item at the beginning of the linked list
851 item.next = table[idx];
852 table[idx] = id;
853}
854
855template<typename T>
856void HashTable<T>::Unlink(int idx, int id)
857{
858 // remove item from the linked list
859 int* p_id = table + idx;
860 while (*p_id >= 0)
861 {
862 T& item = Base::At(*p_id);
863 if (*p_id == id)
864 {
865 *p_id = item.next;
866 return;
867 }
868 p_id = &(item.next);
869 }
870 MFEM_ABORT("HashTable<>::Unlink: item not found!");
871}
872
873template<typename T>
875{
876 T& item = Base::At(id);
877 Unlink(Hash(item), id);
878 item.next = -2; // mark item as unused
879 unused.Append(id); // add its id to the unused ids
880}
881
882template<typename T>
884{
885 Base::DeleteAll();
886 for (int i = 0; i <= mask; i++) { table[i] = -1; }
887 unused.DeleteAll();
888}
889
890template<typename T>
891void HashTable<T>::Alloc(int id, int p1, int p2)
892{
893 // enlarge the BlockArray to hold 'id'
894 while (id >= Base::Size())
895 {
896 Base::At(Base::Append()).next = -2; // append "unused" items
897 }
898
899 T& item = Base::At(id);
900 if (item.next == -2)
901 {
902 item.next = -1;
903 item.p1 = p1;
904 item.p2 = p2;
905
906 Insert(Hash(p1, p2), id, item);
907 CheckRehash();
908 }
909}
910
911template<typename T>
913{
914 unused.DeleteAll();
915 for (int i = 0; i < Base::Size(); i++)
916 {
917 if (Base::At(i).next == -2) { unused.Append(i); }
918 }
919}
920
921template<typename T>
922void HashTable<T>::Reparent(int id, int new_p1, int new_p2)
923{
924 T& item = Base::At(id);
925 Unlink(Hash(item), id);
926
927 if (new_p1 > new_p2) { std::swap(new_p1, new_p2); }
928 item.p1 = new_p1;
929 item.p2 = new_p2;
930
931 // reinsert under new parent IDs
932 int new_idx = Hash(new_p1, new_p2);
933 Insert(new_idx, id, item);
934}
935
936template<typename T>
938 int new_p1, int new_p2, int new_p3, int new_p4)
939{
940 T& item = Base::At(id);
941 Unlink(Hash(item), id);
942
943 internal::sort4_ext(new_p1, new_p2, new_p3, new_p4);
944 item.p1 = new_p1;
945 item.p2 = new_p2;
946 item.p3 = new_p3;
947
948 // reinsert under new parent IDs
949 int new_idx = Hash(new_p1, new_p2, new_p3);
950 Insert(new_idx, id, item);
951}
952
953template<typename T>
954std::size_t HashTable<T>::MemoryUsage() const
955{
956 return (mask+1) * sizeof(int) + Base::MemoryUsage() + unused.MemoryUsage();
957}
958
959template<typename T>
961{
962 mfem::out << Base::MemoryUsage() << " + " << (mask+1) * sizeof(int)
963 << " + " << unused.MemoryUsage();
964}
965
966template<typename T>
967int HashTable<T>::BinSize(int idx) const
968{
969 int count = 0;
970 int id = table[idx];
971 while (id >= 0)
972 {
973 const T& item = Base::At(id);
974 id = item.next;
975 count++;
976 }
977 return count;
978}
979
980template<typename T>
982{
983 int table_size = mask+1;
984 mfem::out << "Hash table size: " << table_size << "\n";
985 mfem::out << "Item count: " << Size() << "\n";
986 mfem::out << "BlockArray size: " << Base::Size() << "\n";
987
988 const int H = 16;
989 int hist[H];
990
991 for (int i = 0; i < H; i++) { hist[i] = 0; }
992
993 for (int i = 0; i < table_size; i++)
994 {
995 int bs = BinSize(i);
996 if (bs >= H) { bs = H-1; }
997 hist[bs]++;
998 }
999
1000 mfem::out << "Bin size histogram:\n";
1001 for (int i = 0; i < H; i++)
1002 {
1003 mfem::out << " size " << i << ": "
1004 << hist[i] << " bins" << std::endl;
1005 }
1006}
1007
1008
1009template <typename int_type_const_iter>
1011 int_type_const_iter end)
1012{
1013 // For hashing, an integer k is encoded as follows:
1014 // * 1 byte = sign_bit(k) + num_bytes(k), where
1015 // - sign_bit(k) = (k >= 0) ? 0 : 128
1016 // - num_bytes(k) = minimum number of bytes needed to represent abs(k)
1017 // with the convention that num_bytes(0) = 0.
1018 // * num_bytes(k) bytes = the bytes of abs(k), starting with the least
1019 // significant byte.
1020
1021 static_assert(
1022 std::is_integral<
1023 /**/ typename std::remove_reference<decltype(*begin)>::type
1024 /**/ >::value,
1025 "invalid iterator type");
1026
1027 // Skip encoding if hashing is not available:
1028 if (hash_data == nullptr) { return *this; }
1029
1030 constexpr int max_buffer_bytes = 64*1024;
1031 unsigned char buffer[max_buffer_bytes];
1032 int buffer_counter = 0;
1033 while (begin != end)
1034 {
1035 int byte_counter = 0;
1036 auto k = *begin;
1037 buffer[buffer_counter] = (k >= 0) ? 0 : (k = -k, 128);
1038 while (k != 0)
1039 {
1040 byte_counter++;
1041 buffer[buffer_counter + byte_counter] = (unsigned char)(k % 256);
1042 k /= 256; // (k >>= 8) results in error, e.g. for 'char'
1043 }
1044 buffer[buffer_counter] |= byte_counter;
1045 buffer_counter += (byte_counter + 1);
1046
1047 ++begin;
1048
1049 if (begin == end ||
1050 buffer_counter + (1 + sizeof(*begin)) > max_buffer_bytes)
1051 {
1052 HashBuffer(buffer, buffer_counter);
1053 buffer_counter = 0;
1054 }
1055 }
1056 return *this;
1057}
1058
1059template <typename double_const_iter>
1061 double_const_iter end)
1062{
1063 // For hashing, a double is encoded in little endian byte-order.
1064
1065 static_assert(
1066 std::is_same<decltype(*begin), const real_t &>::value,
1067 "invalid iterator type");
1068
1069 // Skip encoding if hashing is not available:
1070 if (hash_data == nullptr) { return *this; }
1071
1072 constexpr int max_buffer_bytes = 64*1024;
1073 unsigned char buffer[max_buffer_bytes];
1074 int buffer_counter = 0;
1075 while (begin != end)
1076 {
1077 auto k = reinterpret_cast<const uint64_t &>(*begin);
1078 for (int i = 0; i != 7; i++)
1079 {
1080 buffer[buffer_counter++] = (unsigned char)(k & 255); k >>= 8;
1081 }
1082 buffer[buffer_counter++] = (unsigned char)k;
1083
1084 ++begin;
1085
1086 if (begin == end || buffer_counter + 8 > max_buffer_bytes)
1087 {
1088 HashBuffer(buffer, buffer_counter);
1089 buffer_counter = 0;
1090 }
1091 }
1092 return *this;
1093}
1094
1095} // namespace mfem
1096
1097#endif
int Size() const
Return the logical size of the array.
Definition array.hpp:166
void Copy(Array &copy) const
Create a copy of the internal array to the provided copy.
Definition array.hpp:1042
iterator begin()
Definition array.hpp:710
T & At(int index)
Access item of the array.
Definition array.hpp:600
const_iterator cbegin() const
Definition array.hpp:715
int Size() const
Return the number of items actually stored.
Definition array.hpp:616
Hash function for data sequences.
Definition hash.hpp:532
HashFunction()
Default constructor: initialize the hash function.
Definition hash.cpp:29
HashFunction & AppendInts(const int_type(&ints)[num_ints])
Add a sequence of integers for hashing, given as a fixed-size c-array.
Definition hash.hpp:571
HashFunction & AppendDoubles(const real_t(&doubles)[num_doubles])
Add a sequence of doubles for hashing, given as a fixed-size c-array.
Definition hash.hpp:591
HashFunction & AppendDoubles(const double_container &doubles)
Add a sequence of doubles for hashing, given as a container.
Definition hash.hpp:598
HashFunction & EncodeAndHashInts(int_type_const_iter begin, int_type_const_iter end)
Integer encoding method; result is independent of endianness and type.
Definition hash.hpp:1010
HashFunction & EncodeAndHashDoubles(double_const_iter begin, double_const_iter end)
Double encoding method: encode in little-endian byte-order.
Definition hash.hpp:1060
std::string GetHash() const
Return the hash string for the current sequence and reset (clear) the sequence.
Definition hash.cpp:60
~HashFunction()
Destructor.
Definition hash.cpp:38
HashFunction & AppendBytes(const void *seq, size_t num_bytes)
Add a sequence of bytes for hashing.
Definition hash.hpp:557
HashFunction & AppendDoubles(const real_t *doubles, size_t num_doubles)
Add a sequence of doubles for hashing, given as a c-array.
Definition hash.hpp:584
HashFunction & AppendInts(const int_type *ints, size_t num_ints)
Add a sequence of integers for hashing, given as a c-array.
Definition hash.hpp:564
HashFunction & AppendInts(const int_type_container &ints)
Add a sequence of integers for hashing, given as a container.
Definition hash.hpp:578
void HashBuffer(const void *buffer, size_t num_bytes)
Add a sequence of bytes for hashing.
Definition hash.cpp:45
Base::const_iterator base
Definition hash.hpp:325
const_iterator(const base &it)
Definition hash.hpp:328
const_iterator & operator++()
Definition hash.hpp:334
iterator(const base &it)
Definition hash.hpp:308
Base::iterator base
Definition hash.hpp:305
iterator & operator++()
Definition hash.hpp:314
void Reparent(int id, int new_p1, int new_p2, int new_p3, int new_p4=-1)
Change the key associated with an item.
Definition hash.hpp:937
const_iterator end() const
Definition hash.hpp:344
T * Get(int p1, int p2)
Item accessor with key (or parents) the pair p1, p2. Default construct an item of type T if no value ...
Definition hash.hpp:672
void PrintMemoryDetail() const
Write details of the memory usage to the mfem output stream.
Definition hash.hpp:960
int Hash(const Hashed4 &item) const
Hash function for items of type T that inherit from Hashed4.
Definition hash.hpp:394
void DeleteAll()
Remove all items.
Definition hash.hpp:883
T * Find(int p1, int p2)
Item accessor with key (or parents) the pair p1, p2. Return NULL if no value corresponds to the reque...
Definition hash.hpp:747
bool IdExists(int id) const
Return true if item id exists in (is used by) the container.
Definition hash.hpp:240
int Hash(size_t p1, size_t p2, size_t p3) const
hash function for Hashed4 items.
Definition hash.hpp:385
iterator begin()
Definition hash.hpp:341
HashTable(int block_size=16 *1024, int init_hash_size=32 *1024)
Main constructor of the HashTable class.
Definition hash.hpp:610
void Reparent(int id, int new_p1, int new_p2)
Change the key associated with an item.
Definition hash.hpp:922
BlockArray< T > Base
Definition hash.hpp:88
T * Get(int p1, int p2, int p3, int p4=-1)
Item accessor with key (or parents) the quadruplet p1, p2, p3, p4. The key p4 is optional....
Definition hash.hpp:678
int Hash(size_t p1, size_t p2) const
hash function for Hashed2 items.
Definition hash.hpp:372
const_iterator begin() const
Definition hash.hpp:343
void Insert(int idx, int id, T &item)
Insert the item id into bin idx.
Definition hash.hpp:848
int FindId(int p1, int p2, int p3, int p4=-1) const
Find the "id" of an item, this "id" corresponding to the index of the item in the underlying BlockArr...
Definition hash.hpp:782
int SearchList(int id, int p1, int p2) const
Search the index of the item associated to the key (p1,p2) starting from the item with index id.
Definition hash.hpp:789
int GetId(int p1, int p2)
Get the "id" of the item whose parents are p1, p2, this "id" corresponding to the index of the item i...
Definition hash.hpp:684
void CheckRehash()
Check table fill factor and resize if necessary.
Definition hash.hpp:813
int NumIds() const
Return the total number of ids (used and unused) in the HashTable.
Definition hash.hpp:230
const T * Find(int p1, int p2, int p3, int p4=-1) const
Item const accessor with key (or parents) the quadruplet p1, p2, p3, p4. The key p4 is optional....
Definition hash.hpp:768
int SearchList(int id, int p1, int p2, int p3) const
Search the index of the item associated to the key (p1,p2,p3,(p4)) starting from the item with index ...
Definition hash.hpp:801
iterator end()
Definition hash.hpp:342
int Hash(const Hashed2 &item) const
Hash function for items of type T that inherit from Hashed2.
Definition hash.hpp:390
int GetId(int p1, int p2, int p3, int p4=-1)
Get the "id" of an item, this "id" corresponding to the index of the item in the underlying BlockArra...
Definition hash.hpp:715
void Delete(int id)
Remove an item from the hash table.
Definition hash.hpp:874
HashTable(const HashTable &other)
Deep copy.
Definition hash.hpp:624
int BinSize(int idx) const
Return the size of the bin idx.
Definition hash.hpp:967
int FindId(int p1, int p2) const
Find the "id" of an item whose parents are p1, p2. Return -1 if it does not exist.
Definition hash.hpp:775
Array< int > unused
Definition hash.hpp:362
void DoRehash()
Double the size of the hash table (i.e., double the number of bins) and reinsert all items into the n...
Definition hash.hpp:825
T * Find(int p1, int p2, int p3, int p4=-1)
Item accessor with key (or parents) the quadruplet p1, p2, p3, p4. The key p4 is optional....
Definition hash.hpp:754
void Alloc(int id, int p1, int p2)
Allocate an item at id. Enlarge the underlying BlockArray if necessary.
Definition hash.hpp:891
const_iterator cend() const
Definition hash.hpp:347
int Size() const
Return the number of elements currently stored in the HashTable.
Definition hash.hpp:227
void Unlink(int idx, int id)
Unlink an item id from the linked list of bin idx.
Definition hash.hpp:856
const T * Find(int p1, int p2) const
Item const accessor with key (or parents) the pair p1, p2. Return NULL if no value corresponds to the...
Definition hash.hpp:761
void UpdateUnused()
Reinitialize the internal list of unallocated items.
Definition hash.hpp:912
const_iterator cbegin() const
Definition hash.hpp:346
int NumFreeIds() const
Return the number of free/unused ids in the HashTable.
Definition hash.hpp:233
void PrintStats() const
Print a histogram of bin sizes for debugging purposes.
Definition hash.hpp:981
HashTable & operator=(const HashTable &)=delete
Copy assignment not supported.
std::size_t MemoryUsage() const
Return total size of allocated memory (tables plus items), in bytes.
Definition hash.hpp:954
real_t b
Definition lissajous.cpp:42
real_t a
Definition lissajous.cpp:41
OutStream out(std::cout)
Global stream used by the library for standard output. Initially it uses the same std::streambuf as s...
Definition globals.hpp:66
float real_t
Definition config.hpp:46
Helper class for hashing std::array. Usable in place of std::hash<std::array<T,N>>
Definition hash.hpp:513
size_t operator()(const std::array< T, N > &v) const noexcept
Definition hash.hpp:515
streaming implementation for murmurhash3 128 (x64). Constructs the hash in 3 stages: init,...
Definition hash.hpp:465
void init(uint64_t seed=0)
resets this hasher back to an initial seed
Definition hash.cpp:88
uint64_t data[2]
Definition hash.hpp:468
void finalize()
Definition hash.cpp:158
void append(const uint8_t *vs, uint64_t bytes)
Definition hash.cpp:131
Helper class for hashing std::pair. Usable in place of std::hash<std::pair<T,U>>
Definition hash.hpp:497
size_t operator()(const std::pair< T, V > &v) const noexcept
Definition hash.hpp:499