"Thinking aloud" tests are better suited to design and prototype testing sessions than to validation sessions. For validation you want to check timing, and you want to check the correctness of the data users enter: if your users often enter incorrect data, you may need additional input validation.
Yes, you need a variety of test subjects: beginners, experts, those familiar with your products, those unfamiliar with your products.
You might also want to capture your users' path. For example, if they often leave the cart while checking out but don't buy any additional merchandise, determine why; perhaps they find the buttons or prompts confusing.
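As a rough sketch of what that path analysis could look like, assuming you log each screen visit together with the number of items in the cart (the screen names, the `sessions` data, and the `unexplained_exits` helper below are all made up for illustration, not taken from any particular analytics tool):

```python
# Hypothetical event log: for each session, the screens visited in order,
# paired with the number of items in the cart at that moment.
sessions = {
    "s1": [("cart", 2), ("checkout", 2), ("catalog", 2),
           ("cart", 2), ("checkout", 2), ("confirmation", 2)],
    "s2": [("cart", 1), ("checkout", 1), ("catalog", 1),
           ("cart", 3), ("checkout", 3), ("confirmation", 3)],
    "s3": [("cart", 1), ("checkout", 1), ("confirmation", 1)],
}

# Assumption: these screens make up the checkout flow on this imaginary site.
CHECKOUT_FLOW = {"cart", "checkout", "confirmation"}


def unexplained_exits(events):
    """Count how often a user left the checkout flow and came back
    with the same number of items in the cart."""
    count = 0
    left_with = None    # cart size when the user left the flow, else None
    was_inside = False  # was the previous screen part of the flow?
    last_items = 0
    for screen, items in events:
        inside = screen in CHECKOUT_FLOW
        if was_inside and not inside:
            # The user just stepped out of the checkout flow.
            left_with = last_items
        elif inside and left_with is not None:
            # The user is back; did the detour change the cart?
            if items == left_with:
                count += 1
            left_with = None
        was_inside = inside
        last_items = items
    return count


exits_per_session = {sid: unexplained_exits(ev) for sid, ev in sessions.items()}
```

Sessions with a non-zero count (here only `s1`) are the ones worth replaying or asking the tester about, since the detour did not end in extra merchandise.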
EDIT: add timing to the path data. Find out where your users consistently slow down. You can average the times from different users to establish an overall benchmark, and you can also keep a separate benchmark for each category of tester. I'm often interested in the fastest overall time, so I know what experienced users are capable of. I also want to know which screens the various users stall on. If every other screen on your site is completed in 40 seconds on average, a screen that takes the median user 10 minutes to complete means something's likely very wrong.
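To make the benchmarking concrete, here is a minimal sketch, assuming you can export one record per user per screen. The record layout, the sample numbers, and the `factor=5.0` threshold are my own assumptions for illustration; pick a threshold that fits your site.

```python
from collections import defaultdict
from statistics import median

# Hypothetical records: (user_id, tester_category, screen, seconds_on_screen)
timings = [
    ("u1", "beginner", "cart", 45),
    ("u1", "beginner", "shipping", 50),
    ("u1", "beginner", "payment", 640),
    ("u2", "expert", "cart", 30),
    ("u2", "expert", "shipping", 35),
    ("u2", "expert", "payment", 560),
    ("u3", "beginner", "cart", 42),
    ("u3", "beginner", "shipping", 38),
    ("u3", "beginner", "payment", 600),
]


def median_by_screen(records):
    """Median time per screen across all users."""
    by_screen = defaultdict(list)
    for _user, _category, screen, seconds in records:
        by_screen[screen].append(seconds)
    return {screen: median(values) for screen, values in by_screen.items()}


def median_by_category_and_screen(records):
    """Median time per (tester category, screen) pair."""
    grouped = defaultdict(list)
    for _user, category, screen, seconds in records:
        grouped[(category, screen)].append(seconds)
    return {key: median(values) for key, values in grouped.items()}


def fastest_total_time(records):
    """Shortest total time any single user needed for the whole path."""
    totals = defaultdict(int)
    for user, _category, _screen, seconds in records:
        totals[user] += seconds
    return min(totals.values())


def stalled_screens(screen_medians, factor=5.0):
    """Screens whose median time exceeds `factor` times the median
    of all the other screens' medians."""
    flagged = []
    for screen, value in screen_medians.items():
        others = [v for s, v in screen_medians.items() if s != screen]
        if others and value > factor * median(others):
            flagged.append(screen)
    return sorted(flagged)


screen_medians = median_by_screen(timings)
print(stalled_screens(screen_medians))  # ['payment']
print(fastest_total_time(timings))      # 625
```

I compare each screen against the median of the others rather than the mean, so that one very slow screen does not drag the baseline up and hide itself.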