Many arguments can be made for why one or the other is used. One is the programmer's view, which, and this may surprise some, may or may not coincide with the hardware being used. A great recent (*6) example here is the PowerPC with its fluid endianness: the hardware bit (and byte) order within a word may or may not be the same as the one seen by the programmer. Thus one needs a way to number bits in a more abstract way.
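To make the two conventions concrete, here is a minimal sketch in C (my own illustration, not taken from any manual), assuming a 32-bit word: value based numbering where bit 0 is the least significant bit, versus the IBM/PowerPC documentation style where bit 0 is the most significant bit.

    #include <stdint.h>
    #include <stdio.h>

    /* Value based (LSB-0) numbering: bit n has the value 2^n. */
    static unsigned bit_lsb0(uint32_t word, unsigned n)
    {
        return (word >> n) & 1u;
    }

    /* Position based (MSB-0) numbering, as in IBM / PowerPC manuals:
       bit 0 is the most significant bit of the word. */
    static unsigned bit_msb0(uint32_t word, unsigned n)
    {
        return (word >> (31u - n)) & 1u;
    }

    int main(void)
    {
        uint32_t w = 0x80000000u;   /* only the top bit set */
        printf("LSB-0: bit 0 = %u, bit 31 = %u\n", bit_lsb0(w, 0), bit_lsb0(w, 31));
        printf("MSB-0: bit 0 = %u, bit 31 = %u\n", bit_msb0(w, 0), bit_msb0(w, 31));
        return 0;
    }

The same physical bit answers to "bit 0" in one scheme and "bit 31" in the other, which is exactly why documentation has to pick one abstraction and stick to it.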
That, and the mentioned HP2100, hint at a possible reason why so much modern documentation goes for value based numbering: the evolution of modern machines and their understanding went through a bottleneck, much like the Indian Tiger, namely the invention of very simple byte oriented microprocessors. Where historical machines had serial data formats, byte sizes smaller than machine words, or used byte sized addressing but big endian ordering for larger data items, the very simple minis and even simpler micros cut it all down to the byte as the basic unit.
That byte, being the CPU word of those micros and universally fixed at 8 bits (*7), was no longer a subdivision but seen as a monolithic building block for larger structures of words, which in turn are extensions of that byte. This hides most of the history, and with it the reasons to count differently.
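That byte-as-building-block view can be sketched in a few lines of C (again my own illustration, using little-endian byte order in the style of 8080/6502 class micros, not a description of any particular machine):

    #include <stdint.h>
    #include <stdio.h>

    /* The micro's view: a 16-bit value is nothing but two 8-bit bytes
       glued together, low byte first (little endian). */
    static uint16_t word_from_bytes(uint8_t low, uint8_t high)
    {
        return (uint16_t)(((uint16_t)high << 8) | low);
    }

    int main(void)
    {
        uint8_t mem[2] = { 0x34, 0x12 };   /* bytes as stored in memory */
        printf("word = 0x%04X\n", (unsigned)word_from_bytes(mem[0], mem[1]));  /* 0x1234 */
        return 0;
    }

Seen this way, the word is derived from the byte, not the byte from the word, and the older view of bytes as subdivisions of a machine word simply never enters the picture.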
*5 - Divided by the good old discussion about an array starting at ZERO or ONE (Hello BASIC).
*6 - I.e. 1990s instead of 1950s and 60s.
*7 - This byte orientation is so canon today that many questions on RC.SE show a complete lack of awareness that not only could byte sizes differ, but that there were machines without bytes at all. Even more, everything is seen as bytes, no matter what medium or format, as if there were nothing else but bytes.




