Bit counting

Create a class named bitarray<P> which represents an array of bits and supports the following operations:

The array size is passed as an argument to the constructor; allocation is done during construction. The set/reset operations shall have constant complexity; the others shall be linear with respect to the array size. The counting operation is expected to be slower than the others (i.e., the count shall not be precomputed during other operations).
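The constant-time requirement for set/reset can be illustrated with a minimal sketch. The class name bit_storage_sketch and its member names are hypothetical, not the interface required by the assignment; the sketch also assumes 64-bit storage words.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical illustration only; names are assumptions, not the required interface.
class bit_storage_sketch {
public:
    // Storage is allocated once, in the constructor; all bits start cleared.
    explicit bit_storage_sketch(std::size_t bits)
        : size_(bits), words_((bits + 63) / 64, 0) {}

    // O(1): one index computation and one read-modify-write of a single word.
    void set(std::size_t i) {
        assert(i < size_);
        words_[i / 64] |= (std::uint64_t{1} << (i % 64));
    }

    // O(1): clears the bit by masking it out of its word.
    void reset(std::size_t i) {
        assert(i < size_);
        words_[i / 64] &= ~(std::uint64_t{1} << (i % 64));
    }

    // O(1): reads a single bit.
    bool test(std::size_t i) const {
        assert(i < size_);
        return ((words_[i / 64] >> (i % 64)) & 1u) != 0;
    }

    std::size_t size() const { return size_; }

private:
    std::size_t size_;
    std::vector<std::uint64_t> words_;
};
```

Neither set nor reset touches any word other than the one holding the addressed bit, so their cost does not depend on the array size.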

The template argument P is used to specify the vector platform available. You shall define the following three policy classes: policy_sse, policy_avx, and policy_avx512. You shall implement bitarray<policy_sse>, bitarray<policy_avx>, and bitarray<policy_avx512>.
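One possible way to organize the policies is sketched below. The vector_bits member, the block_bits constant, and block_count() are assumptions made for illustration; the assignment only prescribes the three policy names and the bitarray<P> template. The widths reflect the register sizes of SSE (128 bits), AVX (256 bits), and AVX512 (512 bits).

```cpp
#include <cstddef>

// Assumption: each policy advertises the width of its vector registers in bits.
struct policy_sse    { static constexpr std::size_t vector_bits = 128; };
struct policy_avx    { static constexpr std::size_t vector_bits = 256; };
struct policy_avx512 { static constexpr std::size_t vector_bits = 512; };

// Skeleton only: it shows how the policy could drive the block size.
// The real class also needs storage and the required operations.
template <typename P>
class bitarray {
public:
    static constexpr std::size_t block_bits = P::vector_bits;

    // Round the requested number of bits up to whole vector-sized blocks.
    constexpr explicit bitarray(std::size_t bits)
        : blocks_((bits + block_bits - 1) / block_bits) {}

    constexpr std::size_t block_count() const { return blocks_; }

private:
    std::size_t blocks_;
};
```

With this layout, bitarray<policy_sse>, bitarray<policy_avx>, and bitarray<policy_avx512> are distinct types that can each be specialized or dispatched on the policy.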

The test wrapper du6main.cpp uses policy_sse by default; other policies may be selected using the command-line argument "avx" or "avx512".

If you use AVX512 instructions, be aware that not every compiler supports them. Enclose the parts that use AVX512 in "#ifdef USE_AVX512" directives to avoid error messages in non-supporting compilers, and define the macro if your compiler supports AVX512. du6main.cpp will not require policy_avx512 if this macro is not defined.

If you do not want to write a dedicated AVX512 implementation, define policy_avx512 to be equal to policy_avx.
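The guard and the fallback might be arranged as in the following sketch. Placing the alias in the #else branch is an assumption made here so that policy_avx512 always names some type; the assignment itself only requires the policy when USE_AVX512 is defined. The empty policy bodies are placeholders.

```cpp
struct policy_sse {};
struct policy_avx {};

#ifdef USE_AVX512
// Dedicated policy. Any code that uses AVX512 intrinsics belongs inside
// this guard so that non-supporting compilers never see it.
struct policy_avx512 {};
#else
// Fallback (assumption): no dedicated implementation, reuse the AVX policy.
using policy_avx512 = policy_avx;
#endif
```

With the alias in place, bitarray<policy_avx512> and bitarray<policy_avx> are the same type, so no separate implementation has to be written.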

The du6bitcount.hpp file (as distributed) contains an implementation based on int64_t blocks and an 8-bit lookup table for bit counting (all policy classes are identical).
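The idea of counting with an 8-bit lookup table can be sketched as follows. This is not the code from du6bitcount.hpp; the function names are hypothetical, and the sketch uses uint64_t rather than int64_t blocks so that the right shifts are well defined for every value.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Build the 256-entry table using bits(v) = bits(v / 2) + (v & 1).
inline std::array<std::uint8_t, 256> make_popcount_table() {
    std::array<std::uint8_t, 256> table{};  // table[0] == 0
    for (std::size_t v = 1; v < 256; ++v)
        table[v] = static_cast<std::uint8_t>(table[v / 2] + (v & 1));
    assert(table[255] == 8);  // sanity check: all eight bits set
    return table;
}

// Count the set bits by summing one table lookup per byte of each block.
inline std::size_t count_bits(const std::vector<std::uint64_t>& blocks) {
    static const std::array<std::uint8_t, 256> table = make_popcount_table();
    std::size_t total = 0;
    for (std::uint64_t w : blocks)
        for (int byte = 0; byte < 8; ++byte)
            total += table[(w >> (8 * byte)) & 0xFF];
    return total;
}
```

The loop performs eight lookups per 64-bit block, so the count is linear in the array size and nothing is precomputed by the other operations.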

The empty.err file contains the stderr output of the test wrapper. Your implementation shall produce the same contents.

The test wrapper outputs the measured time (in nanoseconds per bit). Note that the count operation is measured together with an and operation (to detect tricks such as precomputing the count). The reference implementation shows the following times (for 1M bits):