I wouldn't say Mathematica does GPU or parallel-CPU computing automatically, at least not in general. To use parallel kernels you have to launch additional kernels, and to use the GPU you have to load CUDALink or OpenCLLink; in both cases you then use specific Mathematica functionality to exploit the potential of the CPU and/or GPU.
For example, I don't have a very powerful graphics card (NVIDIA GeForce 9400 GT), but we can still test how CUDALink works. First I have to load CUDALink:
Needs["CUDALink`"]
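Before timing anything, it may be worth confirming that CUDALink can actually see a supported device. The following is only a minimal sketch: the variable name cudaAvailable is my own, and what CUDAInformation[] reports depends entirely on the installed hardware and driver.

```mathematica
Needs["CUDALink`"]

(* True if CUDALink finds a usable CUDA device and driver, False otherwise *)
cudaAvailable = CUDAQ[]

(* If a device is available, list its properties (name, memory, compute capability, ...) *)
If[cudaAvailable, CUDAInformation[]]
```

If CUDAQ[] gives False, the CUDADot timing in this answer cannot be reproduced on that machine.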
I am going to test multiplication of large matrices. I choose a random 5000 x 5000 matrix of real numbers in the range (-1, 1):
M = RandomReal[{-1,1}, {5000, 5000}];
Now we can check the computing time without GPU support:
In[4]:= AbsoluteTiming[ Dot[M, M]; ]
Out[4]= {26.3780000, Null}
and with GPU support:
In[5]:= AbsoluteTiming[ CUDADot[M, M]; ]
Out[5]= {6.6090000, Null}
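A timing comparison like this is only meaningful if both functions return essentially the same result. A rough sanity check could look like the sketch below. The matrix name small and its 500 x 500 size are my own choices to keep the check quick, and I am assuming a working CUDA device. The difference need not be exactly zero, in particular if the GPU computes in single precision.

```mathematica
Needs["CUDALink`"]

(* Hypothetical small test matrix, so the check runs quickly *)
small = RandomReal[{-1, 1}, {500, 500}];

(* Largest absolute difference between the CPU and the GPU product *)
maxDiff = Max[Abs[Dot[small, small] - CUDADot[small, small]]]
```

A small maxDiff relative to the size of the entries suggests the two products agree up to floating-point rounding.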
In this case we obtained a speed-up of roughly a factor of 4 by using CUDADot instead of Dot.
Edit
To add an example of parallel CPU acceleration (on a dual-core machine), I select all prime numbers in the range [2^300, 2^300 + 10^6]. First without parallelization:
In[139]:= AbsoluteTiming[ Select[ Range[ 2^300, 2^300 + 10^6], PrimeQ ]; ]
Out[139]= {121.0860000, Null}
and then using Parallelize[expr], which evaluates expr using automatic parallelization:
In[141]:= AbsoluteTiming[ Parallelize[ Select[ Range[ 2^300, 2^300 + 10^6], PrimeQ ] ]; ]
Out[141]= {63.8650000, Null}
As one would expect on two cores, the evaluation is almost twice as fast.
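The achievable speed-up depends on how many subkernels are actually running, so it can help to launch them explicitly and check the count before timing. Below is a minimal sketch under my own assumptions: the names range, serial and parallel are hypothetical, and the range is shortened to 10^4 numbers so the comparison finishes quickly.

```mathematica
(* Launch the default subkernels only if none are running yet *)
If[$KernelCount == 0, LaunchKernels[]];

(* Number of running subkernels; this bounds the speed-up one can hope for *)
$KernelCount

(* Shortened version of the range used above *)
range = Range[2^300, 2^300 + 10^4];

serial = Select[range, PrimeQ];
parallel = Parallelize[Select[range, PrimeQ]];

(* Compare the parallel result with the serial one *)
serial === parallel
```

Launching the kernels beforehand also keeps their start-up time out of the AbsoluteTiming measurement.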
Options such as PerformanceGoal -> "Speed" and MaxPlotPoints, and many other things like this, would help. So I would imagine a faster graphics card would help as well. As for other aspects, Mathematica can utilize GPUs with CUDA, but I do not use this part of it.
DateListPlot is slow because the date and time functions in Mathematica are very slow. I'd like to see a 50-fold speed improvement to make them competitive with e.g. VBA.