**Array sizing**

As an alternative for `qsort()`, consider a signature more like `void qsort(void *base, size_t nmemb, size_t size, int (*compar)(const void *, const void *));`

For `void BRESort_byte(unsigned char arr[], int n);`, this would mean using `size_t n` instead of `int n`.

`size_t` is also the [Goldilocks]() type for array sizing and indexing: not too narrow, not too wide.

This is also a fairer comparison against `qsort()`, which can handle arrays more than `INT_MAX`.

Since `size_t` is an _unsigned_ type, code needs review to avoid negatives. 

This also reduces the casual type conversions going on with `mem...()` calls.

**Avoid naked magic numbers**

Instead of values like 3, 8, 32, ... (with no explanation), use a self-documenting macro or named constant.

Use standard library ones when possible. E.g. rather than 255, `UCHAR_MAX` when assigning the maximal `unsigned char`, `int count[256] = {0};` --> `int count[UCHAR_MAX + 1] = {0};`.

If code depends on an 8-bit `char`, use `static_assert(CHAR_BIT == 8, "Must use 8-bit char");`. I suspect `CHAR_BIT <= 12` (or so) may be supportable.

**Back up the claim**

To avoid dismissing "3-4x faster than qsort", post deserves a test harness to demo it.

**Minor: Argument order**

`(unsigned char arr[], int n)` is more common that `(int n, unsigned char arr[])` yet I find `(int n, unsigned char arr[n])` or `(int n, unsigned char arr[static n])` more useful as it allows various analysis inspections.