Although this code looks dangerously wrong on many accounts at first glance, it actually only contains a single problem. Otherwise it is perfectly fine, within its limitations.
You are creating a vector initialized to some particular size, and you do not allow more elements being pushed than this given size. That is somewhat "odd behavior" but if this is what is desired, then there is no problem.
Calling vector::operator[] looks very troublesome for thread-safety because it is not an atomic operation, but it is really not a problem as such. All that vector::operator[] does is return a reference to the element corresponding to the index that you provide. It doesn't do any bounds-checking or reallocation or any other complicated stuff that might break in presence of concurrency (in all likelihood, it boils down to something like a single LEA instruction).
You are using fetch_add in every case, which is the right mindset to guarantee that indices are unique among threads. If indices ar unique, no two threads will ever access the same element, insofar it does not matter whether that access is atomic. Even if several threads atomically increment a counter at the same time, they will all get different results (i.e. no increments are "lost" on the way). At least that's the theory, and it's true for as far push is concerned, too.
The one real problem lies in the pop function where (other than in push) you do not atomically fetch_add the index (or more correctly, both indices) into local variables prior to comparing them and to calling operator[].
This is a problem since if(start==end) is not atomic as such, and it is not atomic in conjuction with calling operator[] a few lines later. You must fetch both values to be compared in one atomic operation, or you have no way of asserting that the comparison is meaningful in any way. Otherwise:
- Another thread (or another two or three threads) can increment either
start or end (or both) while you compare them, and your check for having reached the maximum capacity will fail. - Another thread may increment
start after that check, but before the fetch_add inside the call to operator[] is seen, causing you to index out-of-bounds and invoking undefined behavior.
The non-intuitive thing to note here is that although you do "the right thing" by using atomic operations, the program will still not behave correctly, since not just individual operations need to be atomic.