Haskell 59.35 * 228 = 13520.413566
import Data.List import Data.List.Split import Prelude.Unicode import System.Random findBW strm | i == 0 = if(16, rest) | i < 3 then = findBW rest else| otherwise = (i, rest) where (bw,rest) = splitAt 4 strm i = sum $ zipWith ((*).(2^)) [0..] bw mkStream i strm = first ++ second ++ sum ++ mkStream i rest where (first,tmp) = splitAt i strm (second,rest) = splitAt i tmp sum = reverse $ g (reverse first) (reverse second) 0 run 100000 reps _ = print $ (reps*16*3) / 100000 run n reps gen = do p validstream run (n+1) (reps+rep) gen2 where (gen1,gen2) = System.Random.split gen (bytewidth, rndstream) = findBW $ randomRs (0::Int,1) gen1 validstream = mkStream bytewidth rndstream rep = fst $ until(\(_,t)->sum[1|x<-h#t,x]<2)(\(n,s)->(n+1,tail#s))(1,f validstream) main = do gen <- getStdGen run 0 0 gen r=reverse (#)=map h=head g[]_ _=[] g(a:b)(c:d)e=mod(a+c+e)2:g b d(div(a+c+e)2) c(m:n:o:p)=(g(r m)(r n)0≡r o):c p f s=(scanl1(∧).c.(`chunksOf`s))#[3..16] p=print.(3+).h.elemIndices True .(h#).until(\t->sum[1|x<-h#t,x]<2)(tail#).f The average bits consumed in a run on 100,000 streams are around 59.2x3x / 59.4x , so I take 59.35 as the multiplier. For the golf part I count the characters of function p and it's helper functions, i.e. all characters beginning with r=reverse to the end. p is the function that analyses a bit stream and prints the byte-width. Note: there are some unicode characters, all counting 1.
How it works: split the bit stream in chunks of 3 bits, 4 bits, ... 16 bits and check for validity. Repeat taking 3 of the chunks for every size until there's only one valid set. That means I consume 3*16 = 48 bits per repetition. Most of the time one ore two repetitions are enough.