I think what you're proposing is pretty reasonable, with some tweaks:
I would report the median -- or a bunch of percentiles -- rather than the minimum. If your code places a lot of strain on the garbage collector, simply taking the minimum can easily fail to pick up on that (all it takes is for a single iteration to fit between two consecutive GC pauses).
In many cases it makes sense to measure the CPU time rather than the wall-clock time. This takes care of some of the impact of having other code running on the same box.
Some benchmarking tools use two levels of loops: the inner loop repeatedly performs the operation, and the outer loop looks at the clock before and after the inner loop. The observations are then aggregated across the iterations of the outer loop. This keeps the timer's own overhead and limited resolution from dominating when a single operation is very short.
Finally, the following gives a very good overview of JVM-specific issues to be aware of: How do I write a correct micro-benchmark in Java?
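To make the first three points concrete, here is a rough sketch of how they might fit together. It is only an illustration, not a replacement for a proper harness: the class name `BenchSketch`, the helpers `measure` and `percentile`, and the loop counts are all made up for this example. It uses `ThreadMXBean.getCurrentThreadCpuTime()` for per-thread CPU time where the JVM supports it and falls back to `System.nanoTime()` otherwise.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.Arrays;

public class BenchSketch {

    // Hypothetical helper: nearest-rank percentile of an ascending-sorted array, 0 < p <= 100.
    static long percentile(long[] sorted, double p) {
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(0, Math.min(sorted.length - 1, rank - 1))];
    }

    // Hypothetical helper: the outer loop reads the clock, the inner loop repeats the operation.
    // Returns the per-operation time in nanoseconds for each outer iteration, sorted ascending.
    static long[] measure(Runnable op, int outer, int inner) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        // Use CPU time of the current thread if available, otherwise wall-clock time.
        boolean cpu = bean.isCurrentThreadCpuTimeSupported() && bean.isThreadCpuTimeEnabled();
        long[] perOp = new long[outer];
        for (int i = 0; i < outer; i++) {
            long start = cpu ? bean.getCurrentThreadCpuTime() : System.nanoTime();
            for (int j = 0; j < inner; j++) {
                op.run();
            }
            long end = cpu ? bean.getCurrentThreadCpuTime() : System.nanoTime();
            perOp[i] = (end - start) / inner;
        }
        Arrays.sort(perOp);
        return perOp;
    }

    public static void main(String[] args) {
        long[] sink = new long[1]; // keeps the result live so the JIT cannot drop the work
        Runnable op = () -> {
            long s = 0;
            for (int k = 0; k < 1000; k++) {
                s += k * k;
            }
            sink[0] += s;
        };

        measure(op, 20, 1000);              // warm-up run, results discarded
        long[] t = measure(op, 50, 1000);   // measured run

        System.out.printf("min=%d median=%d p90=%d p99=%d (ns/op)%n",
                t[0], percentile(t, 50), percentile(t, 90), percentile(t, 99));
        System.out.println("sink=" + sink[0]);
    }
}
```

Printing the minimum next to the percentiles makes it easy to see how far apart they are. Note that the CPU time reported here is for the benchmarking thread only, so work done on other threads is not counted.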