TL;DR: The problem is the pivot-choosing strategy, which repeatedly makes poor choices on these kinds of inputs (A- and V-shaped sequences). These poor choices make quicksort's recursive calls highly imbalanced, which in turn makes the algorithm perform very poorly (quadratic time for A-shaped sequences).
Congratulations, you've (re)discovered an adversarial input (or rather a family of inputs) for the version of quicksort that chooses the middle element as the pivot.
For reference, an example of an A-shaped sequence is 1 2 3 4 3 2 1, i.e., a sequence that increases, reaches its peak at the middle, and then decreases; an example of a V-shaped sequence is 4 3 2 1 2 3 4, i.e., a sequence that decreases, reaches its minimum at the middle, and then increases.
Think about what happens when you pick the middle element as the pivot of an A- or a V-shaped sequence. In the first case, when you pass the algorithm the A-shaped sequence 1 2 ... n-1 n n-1 ... 2 1, the pivot is the largest element of the array---this is because the largest element of an A-shaped sequence is the middle one, and you choose the middle element as the pivot---and you make recursive calls on subarrays of sizes 0 (your code doesn't actually make the call on 0 elements) and n-1. In the next call, on the subarray of size n-1, you again pick the largest element of the subarray (which is the second-largest element of the original array) as the pivot; and so on. This results in poor performance: in each step you essentially pass on almost the whole array (all elements except the pivot), so the sizes of the subarrays in the recursive calls are highly imbalanced, and the total running time is O(n)+O(n-1)+...+O(1) = O(n^2).
Here's the trace for the A-shaped sequence 1 2 3 4 5 4 3 2 1:
```
blazs@blazs:/tmp$ ./test
pivot=5
1 2 3 4 1 4 3 2 5
pivot=4
1 2 3 2 1 3 4 4
pivot=3
1 2 3 2 1 3
pivot=3
1 2 1 2 3
pivot=2
1 2 1 2
pivot=2
1 1 2
pivot=1
1 1
pivot=4
4 4
1 1 2 2 3 3 4 4 5
```
You can see from the trace that at each recursive call the algorithm chooses a largest element (there can be up to two largest elements, hence the article "a", not "the") as the pivot. This means that the running time for the A-shaped sequence really is O(n)+O(n-1)+...+O(1) = O(n^2). (In technical jargon, the A-shaped sequence is an example of an adversarial input that forces the algorithm to perform poorly.)
This means that if you plot running times for "perfectly" A-shaped sequences of the form
1 2 3 ... n-1 n n-1 ... 3 2 1
for increasing n, you will see a nice quadratic function. Here's a graph I just computed for n = 5, 105, 205, 305, ..., 9905 for A-shaped sequences 1 2 ... n-1 n n-1 ... 2 1:

In the second case, when you pass the algorithm a V-shaped sequence, you choose the smallest element of the array as the pivot, and you thus make recursive calls on subarrays of sizes n-1 and 0 (again, your code doesn't actually make the call on 0 elements). In the next call, on the subarray of size n-1, you pick the largest element as the pivot; and so on. But you won't always make such terrible choices, and it's hard to say much more in general; this case is more complicated because it depends on how the "moving" step rearranges the elements. For example, on the V-shaped sequence 4 3 2 1 2 3 4 the chosen pivots are 1, 4, 4, 2, 3 (you can verify this with the trace program below): first the minimum, then a largest element, and only then some middling choices. This still results in poor performance, for similar reasons.
Here's a graph of running times for V-shaped sequences n n-1 ... 2 1 2 ... n-1 n for n = 5, 105, 205, ..., 49905. The running times are somewhat less regular---as I said, this case is more complicated because you don't always pick the smallest element as the pivot. The graph:

Code that I used to measure time:
```c
double seconds(size_t n)
{
    int *tab = (int *)malloc(sizeof(int) * (2*n - 1));
    size_t i;
    // Construct the A-shaped sequence 1 2 3 ... n-1 n n-1 ... 3 2 1.
    for (i = 0; i < n-1; i++) {
        tab[i] = tab[2*n-i-2] = i+1;
        // To generate the V-shaped sequence, use tab[i] = tab[2*n-i-2] = n-i;
    }
    tab[n-1] = n; // For the V-shaped sequence use tab[n-1] = 1;

    clock_t start = clock();
    quicksort(0, 2*n-2, tab);
    clock_t finish = clock();

    free(tab);
    return (double)(finish - start) / CLOCKS_PER_SEC;
}
```
I adapted your code to print the "trace" of the algorithm so that you can play with it yourself and gain insight into what's going on:
```c
#include <stdio.h>

void print(int *a, size_t l, size_t r);
void quicksort(int l, int p, int *tab);

int main() {
    int tab[] = {1, 2, 3, 4, 5, 4, 3, 2, 1};
    size_t sz = sizeof(tab) / sizeof(int);
    quicksort(0, sz-1, tab);
    print(tab, 0, sz-1);
    return 0;
}

void print(int *a, size_t l, size_t r) {
    size_t i;
    for (i = l; i <= r; ++i) {
        printf("%4d", a[i]);
    }
    printf("\n");
}

void quicksort(int l, int p, int *tab) {
    int i = l, j = p, x = tab[(l+p)/2], w; // x - pivot
    printf("pivot=%d\n", x);
    do {
        while (tab[i] < x) { i++; }
        while (x < tab[j]) { j--; }
        if (i <= j) {
            w = tab[i];
            tab[i] = tab[j];
            tab[j] = w;
            i++;
            j--;
        }
    } while (i <= j);
    print(tab, l, p);
    if (l < j) { quicksort(l, j, tab); }
    if (i < p) { quicksort(i, p, tab); }
}
```
By the way, I think the graph showing the running times would be smoother if you took the average of, say, 100 running times for each input sequence.
We see that the problem here is the pivot-choosing strategy. Let me note that you can alleviate the problems with adversarial inputs by randomizing the pivot-choosing step. The simplest approach is to pick the pivot uniformly at random (each element is equally likely to be chosen as the pivot); you can then show that the algorithm runs in O(n log n) time with high probability. (Note, however, that to show this sharp tail bound you need some assumptions on the input; the result certainly holds if the numbers are all distinct; see, for example, Motwani and Raghavan's Randomized Algorithms book.)
To corroborate my claims, here's the graph of running times for the same sequences if you choose the pivot uniformly at random, with x = tab[l + (rand() % (p-l+1))]; (note the +1: with % (p-l) you could never pick tab[p] as the pivot, and you'd risk rand() % 0 on one-element ranges; also make sure you call srand(time(NULL)) once in main). For A-shaped sequences:
For V-shaped sequences:
