Generating De Bruijn Sequences

Published sometime in 2015, when I was 15

I love bit-hackery. But I also remember how difficult it was to learn some aspects of it. So in this article I'm writing out some things I know about De Bruijn Sequences, and present a inefficient De Bruijn Sequence Generator.

De Bruijn Sequences

A De Bruijn Sequence is used to find the index of the least significant (or first) 1 bit (aka LSB) and most significant (or last) 1 bit (aka MSB) set in a given integer. What you actually do is use the special De Bruijn constant to build up a multiply-right-shift perfect hashing algorithm to index the LSB/MSB of the given integer, and use that index to return LSB/MSB index of the given integer. De Bruijn based LSB is also very fast due to fast multiplication on recent processors.

The De Bruijn Bit Scan Forward Algorithm

Note: Bit Scan Forward is just another name for locating LSB.

1. Isolate LSB. Isolating LSB means given an integer, find an integer that will only have the LSB bit of the given integer set. It is easy to see that Isolated LSB is simply 1 for odd numbers. And for 152 (0x98 or 10011000b), it is 8 (or 0x8, or 1000b). Isolating LSB is efficiently done with a just two basic operations, thanks to two's complement representation:

int isolate_lsb(int x) { 
    return x & -x;    // Will only have LSB bit set, try it! 
}

2. Multiply by a De Bruijn Sequence. A good De Bruijn sequence must hash all possible power-of-two integers (i.e. that have only one bit set) uniquely.
An example 32-bit De Bruijn Sequence is 0x6EB14F9.

3. Shifting by the required bits. This given you the unique index, after which you can use a predefined table to get the LSB index! The higher the required bits, the larger the table size, and the more the number of possible De Bruijn sequence.

Code:

int bit_scan_forward(uint32_t b)
{
  static const int BitTable[32] = {
     0,  1, 16,  2, 29, 17,  3, 22,
    30, 20, 18, 11, 13,  4,  7, 23,
    31, 15, 28, 21, 19, 10, 12,  6,  
    14, 27,  9,  5, 26,  8, 25, 24,  
  };  

  assert(b);  
  b &= -b;    // Isolate LSB  
  b *= 0x6EB14F9;    // Multiply by De Bruijn sequence.  
  b >>= 27;     // Shift by the required bits;  
  return BitTable[b];    // Table lookup  
}  

As a small example, consider the getting LSB of the number 19018432. First, you get isolated LSB as 64. Then you multiply it by 0x6EB14F9 to get 0xBAC53E40 (assuming modulo 2^32 multiplication). The you right shift 0xBAC53E40 by 27 to get 23. And BitTable[23] == 6, and BINGO! 6 is the Required LSB!

MSB separation

MSB separation is much more expensive than LSB separation. In the following example, we fill with ones every bit behind MSB bit and apply the same idea to hash it. Note that Bit Scan Reverse is just another name for MSB index location.

Code:

int bit_scan_reverse(uint32_t b)  
{  
  static const uint32_t Magic = (uint32_t) 0x07C4ACDD;  

  static const int BitTable[32] = {  
     u,  9,  1, 10, 13, 21,  2, 29,  
    11, 14, 16, 18, 22, 25,  3, 30,  
     8, 12, 20, 28, 15, 17, 24,  7,  
    19, 27, 23,  6, 26,  5,  4, 31,  
  };  

  assert(b);  
  b |= b >> 1;  
  b |= b >> 2;  
  b |= b >> 4;  
  b |= b >> 8;  
  b |= b >> 16;  
  return BitTable[(b * Magic) >> 27];  
}  

Also, all MSB magics can also be used to find LSB, by the idea of separated LSB, that is xor with ones' decrement. The separation b ^ (b-1) contains only bits behind and including LSB set. For example, 1010000 ^ (1010000-1) = 1010000^1001111 = 0011111. Here's the code, using the same magic and table of the bit_scan_reverse() given above:

int bit_scan_forward(uint32_t b)  
{  
  static const uint32_t Magic = (uint32_t) 0x07C4ACDD;  

  static const int BitTable[32] = {  
     u,  9,  1, 10, 13, 21,  2, 29,  
    11, 14, 16, 18, 22, 25,  3, 30,  
     8, 12, 20, 28, 15, 17, 24,  7,  
    19, 27, 23,  6, 26,  5,  4, 31,  
  };  

  assert(b);  
  b ^= b - 1;  
  return BitTable[(b * Magic) >> 27];  
}  

Using shifts instead of multiplication

If the given De Bruijn sequence can be written as a product of numbers each one of which is of the form 2^n, 2^n - 1, 2^n + 1, then mulitiplication with that number can be replaced by addition and shifting. Here is an example:

int bit_scan_forward(uint32_t b)  
{  
  static const int BitTable[64] = {  
    31,  u,  u, 10,  u,  4,  u, 11,  
     1,  u,  5,  u,  u,  u,  u, 12,  
     8,  2,  u,  u,  6,  u,  u, 24,  
     u,  u,  u,  u,  u, 20, 26, 13,  
    30,  9,  3,  u,  u,  u,  u,  u,  
     7,  u,  u, 23,  u,  u, 19, 25,  
    29,  u,  u,  u,  u, 22,  u, 18,  
    28,  u, 21, 17, 27, 16, 15, 14,  
  };  

  assert(b);  
  b &= -b;  
  b += b << 6;              // Multiply by 65  
  b += b << 4;              // Multiply by 17  
  b = (b << 18) - (b << 1); // Multiply by 262142  
  return BitTable[b >> 26];  
  // TODO: Try incorporating the multiplication by 2 into the shift  
}  

Only two such numbers exist for 32-bit LSB with optimal index bits (i.e. five index bits): 0x6EB14F9 and 0xDD629F2. Here are both:

int bit_scan_forward(uint32_t b)  
{  
  static const int BitTable[32] = {  
     0,  1, 16,  2, 29, 17,  3, 22,  
    30, 20, 18, 11, 13,  4,  7, 23,  
    31, 15, 28, 21, 19, 10, 12,  6,  
    14, 27,  9,  5, 26,  8, 25, 24,  
  };  

  assert(b);  
  b &= -b;  
  b = (b << 8) - b;    // Multiply by 255  
  b = (b << 8) - b;    // Multiply by 255  
  b = (b << 8) - b;    // Multiply by 255  
  b = (b << 3) - b;    // Multiply by 7  
  return BitTable[b >> 27];  
}  

int bit_scan_forward(uint32_t b)  
{  
  static const int BitTable[32] = {  
    31,  0, 15,  1, 28, 16,  2, 21,  
    29, 19, 17, 10, 12,  3,  6, 22,  
    30, 14, 27, 20, 18,  9, 11,  5,  
    13, 26,  8,  4, 25,  7, 24, 23,  
  };  

  assert(b);  
  b &= -b;  
  b = (b << 8) - b;    // Multiply by 255  
  b = (b << 8) - b;    // Multiply by 255  
  b = (b << 8) - b;    // Multiply by 255  
  b = (b << 4) - (b << 1);    // Multiply by 14  
  return BitTable[b >> 27];  
}  

Unfortunately, no such numbers exist for MSB with optimal 32-bit index bits. For 6 index bits, however, 289 of them exist. Here's one:

int bit_scan_reverse(uint32_t b)  
{  
  static const int BitTable[64] = {  
     2,  u,  u,  u,  9,  u, 29,  5,  
     u,  u,  3,  u,  u,  u, 21,  u,  
     u, 18, 10,  u, 23, 13, 30,  u,  
     6,  u, 27,  1,  u,  u,  u,  4,  
     u,  u, 20,  u, 17, 22, 12,  u,  
    26,  u,  u,  u, 19, 16, 11, 25,  
     u,  u, 15, 24, 14,  u, 31,  u,  
     u,  u,  7,  u,  u,  8, 28,  u,  
  };  

  assert(b);  
  b |= b >> 1;  
  b |= b >> 2;  
  b |= b >> 4;  
  b |= b >> 8;  
  b |= b >> 16;  
  b += (b << 3);     // Multiply by 9  
  b += (b << 4);     // Multiply by 17  
  b  = (b << 5) - b; // Multiply by 31  
  b  = (b << 17) - b;// Multiply by 131071  
  return BitTable[b >> 26];  
}  

For 64 bits, things get worse. There are no such magics for Optimal 6 index bits, neither for 7 index bits, and only one exists for 8 index bits!  Here's that number:

int bit_scan_forward(uint64_t b)  
{  
  static const int BitTable[256] = {  
    63,  u,  u,  1,  u,  u,  u,  2,  
     u, 44,  u,  u,  u,  u,  3,  u,  
     u,  u,  u, 45, 38,  u,  u,  u,  
    58,  u,  u,  u,  u,  4,  u,  u,  
     u,  u,  u, 55,  u,  u, 46,  u,  
    39,  u,  u,  u,  u, 20,  u,  u,  
    59,  u,  u,  u, 49,  u,  u,  u,  
     u,  u,  5,  u,  u,  u,  u, 30,  
     u,  u, 42,  u,  u,  u, 56,  u,  
    53,  u,  u,  u,  u, 47,  u,  u,  
    40,  u,  u,  u,  u,  u,  u,  u,  
     u,  u,  u, 21,  u,  u, 10,  u,  
    60,  u,  u,  u,  u, 17,  u,  u,  
     u, 50,  u,  u,  u,  u, 23,  u,  
     u,  u,  u, 26,  u,  6,  u,  u,  
     u,  u,  u, 12,  u,  u, 31,  u,  
    62,  u,  u,  u, 43,  u,  u,  u,  
     u,  u, 37,  u, 57,  u,  u,  u,  
     u, 54,  u,  u,  u,  u, 19,  u,  
     u,  u, 48,  u,  u,  u,  u, 29,  
     u, 41,  u,  u, 52,  u,  u,  u,  
     u,  u,  u,  u,  u,  u,  u,  9,  
     u,  u, 16,  u,  u,  u,  u, 22,  
     u, 25,  u,  u,  u, 11,  u,  u,  
    61,  u,  u,  u,  u, 36,  u,  u,  
     u,  u,  u, 18,  u,  u,  u, 28,  
     u,  u, 51,  u,  u,  u,  u,  8,  
     u, 15,  u,  u, 24,  u,  u,  u,  
     u,  u, 35,  u,  u,  u,  u, 27,  
     u,  u,  u,  7, 14,  u,  u,  u,  
     u, 34,  u,  u,  u,  u, 13,  u,  
    33,  u,  u,  u, 32,  u,  u,  u,  
  };  

  assert(b);  
  b &= -b;  
  // Multiply by 3*9*9*15*33*257*4294967294  
  b += b << 1;    // Multiply by 3  
  b += b << 3;    // Multiply by 9  
  b += b << 3;    // Multiply by 9  
  b = (b << 4) - b;    // Multiply by 15  
  b += b << 5;    // Multiply by 33  
  b += b << 8;    // Multiply by 257  
  b = (b << 32) - (b << 1);    // Multiply by 4294967294  
   
  return BitTable[b >> 56];  
}

Super Magics

There are some De Bruijn magic numbers that can hash both isolated LSB and MSB uniquely. I call them Super Magics. 0x06EB14F9 is one of them:

const uint32_t Magic = (uint32_t) 0x06EB14F9;  
int bit_scan_forward(uint32_t b)  
{  
  static const int BitTable[64] = {  
     u,  u,  u,  1,  u, 16,  2,  u,  
    29,  u, 17,  u,  u,  3,  u, 22,  
    30,  u,  u, 20, 18,  u, 11,  u,  
    13,  u,  u,  4,  u,  7,  u, 23,  
    31,  u, 15,  u, 28,  u,  u, 21,  
     u, 19,  u, 10, 12,  u,  6,  u,  
     u, 14, 27,  u,  u,  9,  u,  5,  
     u, 26,  8,  u, 25,  u, 24,  u,  
  };  

  assert(b);  
  b &= -b;  
  return BitTable[(b * Magic) >> 26];  
}  

int bit_scan_reverse(uint32_t b)  
{  
  static const int BitTable[64] = {  
     u,  u,  u, 15,  u,  1, 28,  u,  
    16,  u,  u,  u,  2, 21, 29,  u,  
     u,  u, 19, 17, 10,  u, 12,  u,  
     u,  3,  u,  6,  u, 22, 30,  u,  
    14,  u, 27,  u,  u,  u, 20,  u,  
    18,  9, 11,  u,  5,  u,  u, 13,  
    26,  u,  u,  8,  u,  4,  u, 25,  
     u,  7, 24,  u, 23,  u, 31,  u,  
  };  

  assert(b);  
  b |= b >> 1;  
  b |= b >> 2;  
  b |= b >> 4;  
  b |= b >> 8;  
  b |= b >> 16;  
  return BitTable[(b * Magic) >> 26];  
}

MagicGen

This C++11 program can generate all 8, 16, and 32-bit De Bruijn sequences of different kinds. Tweak 'BYTES' and 'OptimalBits' to experiment. I haven't made any efforts to optimize the code, and you could get something magnitudes faster by applying a backtracking-based algorithm.

Code:

/*  
** MagicGen, a C++ implementation of a DeBruijn Sequence Searcher  
** This program searches for DeBruijn Sequences that can be used to  
** do forward (aka lsb) and reverse (aka msb) bit scans.  

** Version 1.0  
** Copyright (C) 2015 Syed Fahad  

** MagicGen is free software: you can redistribute it and/or modify  
** it under the terms of the GNU General Public License as published by  
** the Free Software Foundation, either version 3 of the License, or  
** (at your option) any later version.  

** MagicGen is distributed in the hope that it will be useful,  
** but WITHOUT ANY WARRANTY; without even the implied warranty of  
** MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the  
** GNU General Public License for more details.  

** For a copy of the GNU General Public License , see  
** <http://www.gnu.org/licenses/>.  

** PS Do whatever the heck you want to do with it, but if you've got some  
** suggestion, question, or anything, do drop me a line.  
** See <http://sites.google.com/site/sydfhd/> for contact details.  
*/  

#include <cstdint>  
#include <iostream>  
#include <cassert>  
#include <array>  
#include <vector>  
#include <string>  
#include <sstream>  
#include <chrono>  
#include <fstream>  

enum GenType {  
  BIT_SCAN_FORWARD /* aka lsb */, BIT_SCAN_REVERSE /* aka msb */  
};  

#define BYTES 4    /* Set to 1 for 8 bit magics, 2 for 16 bit magics, and 4 for 32 bit magics*/  

#if BYTES == 1  
typedef uint8_t magic_t;  
const int OptimalBits = 3;  
const std::string TypeStr("uint8_t");  
#elif BYTES == 2  
typedef uint16_t magic_t;  
const int OptimalBits = 4;  
const std::string TypeStr("uint16_t");  
#elif BYTES == 4  
typedef uint32_t magic_t;  
const int OptimalBits = 5;  
const std::string TypeStr("uint32_t");  
#else  
#error 'BYTES' must be 1, 2, or 4.  
#endif  

// Returns the next number based on type which is needed to be hashed by the magic  
// For example, if Type is BIT_SCAN_FORWARD then it'll return 1, 2, 4, 8, 16, ...  
template <GenType Type>  
magic_t next(magic_t x)  
{  
  static_assert(Type == BIT_SCAN_FORWARD || Type == BIT_SCAN_REVERSE, "Unknown GenType");  
  if (Type == BIT_SCAN_FORWARD)  
    return x ? x << 1 : 1;  
  else if (Type == BIT_SCAN_REVERSE) {  
    if (x == magic_t(-1)) return 0;  
    return (x << 1) + 1;  
  }  
}  

// Checks if all magic can hash all the possible numbers uniquely  
template <GenType Type, int Bits>  
bool is_good_magic(magic_t magic, std::array< bool, 1 << Bits >& used)  
{  
  magic_t s = 1;  // First number to hash  
  used.assign(false);  
  do {  
    magic_t a = magic_t(magic * s) >> ((BYTES * 8) - Bits);  
    assert(a < used.size());  
    if (used[a] == false)  
      used[a] = true;  
    else  
      return false;  
  } while (s = next<Type>(s));  
  return true;  
}  

template <GenType Type, int Bits>  
struct Magic {  
  static const int TableSize = 1 << Bits;  
  int table[TableSize];  
  magic_t magic;  

  Magic() = delete;  
  Magic(magic_t m_) : magic(m_)  
  {  
    std::memset(table, 0, sizeof table);  
    magic_t s = 1;  
    int i = 0;  
    do  
      table[magic_t(magic * s) >> ((BYTES * 8) - Bits)] = i++;  
    while (s = next<Type>(s));  
  }  
  friend std::ostream& operator<<(std::ostream& os, Magic<Type, Bits> m)  
  {  
    os << "int bit_scan_";  
    os << (Type == BIT_SCAN_FORWARD ? "forward(" : "reverse(")  
      << TypeStr << " b)\n{ \n  ";  
    os << "static const " << TypeStr << " Magic = (" << TypeStr << ") 0x";  
    os << std::hex << std::uppercase;  
    os.fill('0');  
    os.width(BYTES * 2);  
    os << uint64_t(m.magic) << std::dec << std::nouppercase << ";\n\n  ";  
    os << "static const int BitTable[" << m.TableSize << "] = {";  
    for (int i = 0; i < m.TableSize; ++i) {  
      if ((i & 7) == 0) os << "\n    ";  
      os.fill(' ');  
      os.width(2);  
      if (m.table[i])  
        os << m.table[i] << ", ";  
      else os << " u, ";  
    }  
    os << "\n  };\n\n  ";  
    os << "assert(b);\n  ";  
    if (Type == BIT_SCAN_FORWARD)  
      os << "b &= -b;\n  ";  
    else if (Type == BIT_SCAN_REVERSE) {  
      os << "b |= b >> 1;\n  ";  
      os << "b |= b >> 2;\n  ";  
      os << "b |= b >> 4;\n  ";  
      if (BYTES > 1)  
        os << "b |= b >> 8;\n  ";  
      if (BYTES > 2)  
        os << "b |= b >> 16;\n  ";  
    }  
    os << "return BitTable[(b * Magic) >> " << (BYTES * 8) - Bits << "];\n}";  
    return os;  
  }  
};  

// Super Magic is (not officially) defined as a magic which can be used for both  
// forward and reverse bit scans, although the resulting lookup would  
// be different  
template <int Bits>  
struct SuperMagic {  
  Magic<BIT_SCAN_FORWARD, Bits> fmagic;  
  Magic<BIT_SCAN_REVERSE, Bits> rmagic;  
  SuperMagic() = delete;  
  SuperMagic(magic_t m) : fmagic(m), rmagic(m)  
  {};  
  friend std::ostream& operator<<(std::ostream& os, SuperMagic<Bits>& sm)  
  {  
    os << sm.fmagic << "\n\n" << sm.rmagic;  
    return os;  
  }  
};  

template <GenType Type, int Bits, bool FindSuperMagic = false, bool FindMulShift = false>  
void find_all(std::ostream& os)  
{  
  using namespace std;  
  array<bool, 1 << Bits> used;  // Temporary table  
  magic_t i, n = 0;  
  for (i = 1; i; i += 1) {  
    if ((i & 0xfffffff) == 0) cout << (i * 100.) / magic_t(-1) << "%..." << endl;  
    if (is_good_magic<Type, Bits>(i, used)) {  
      if (FindMulShift && !is_mulshift(i))  
        continue;  
      if (FindSuperMagic && is_good_magic<GenType(!Type), Bits>(i, used)) {  
        ++n;  
        SuperMagic<Bits> sm(i);  
        os << sm << endl;  
      } else {  
        ++n;  
        Magic<Type, Bits> m(i);  
        os << m << endl;  
      }  
    };  
  }  
  os << "\nFound: " << dec << uint64_t(n) << " total magics with index bits: " << Bits << endl;  
}  

int main()  
{  
  // You can redirect output to file very easily, just pass file as  
  // an argument to find_all() instead of std::cout.  
  using namespace std;  
  std::ofstream out1("bsf_optimal_32.txt");  
  cout << "Finding all BitScanForward Magics with optimal index bits..." << endl;  
  find_all<BIT_SCAN_FORWARD, OptimalBits>(out1);  

  std::ofstream out2("bsr_optimal_32.txt");  
  cout << "Finding all BitScanReverse Magics with optimal index bits..." << endl;  
  find_all<BIT_SCAN_REVERSE, OptimalBits>(out2);  

  std::ofstream out3("super_magics_optimal_32.txt");  
  cout << "Finding all Super Magics with optimal index bits..." << endl;  
  find_all<BIT_SCAN_FORWARD, OptimalBits, true, false>(out3);  

  std::ofstream out4("mushift_32");  
  cout << "Finding all MulShift Magics with optimal index bits..." << endl;  
  find_all<BIT_SCAN_FORWARD, OptimalBits, false, true>(out4);  
#ifdef _WIN32  
  system("PAUSE");  
#endif  
}