Consider this scenario:
I am about to break into website X using an SQL injection to retrieve the list of users, along with their password hashes and salts. Suppose website X also uses a global pepper.
All I would have to do is register a user at website X, with a user name and password known to me, before performing the SQL injection. For that one record in the database I would then know the password hash, the plain text password and the salt (stored as plain text), and it would be computationally trivial for me to crack the global pepper on the basis of this one record.
So really, a pepper would only slow an attacker down by a trivial amount of extra time. They wouldn't have to brute force the password + salt + pepper, as intended, only the pepper.
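To make the scenario concrete, here is a minimal sketch of the attack in Python. Everything in it is an assumption for illustration: the scheme is taken to be SHA-256 over password + salt + pepper, the pepper is a deliberately tiny three lowercase letters, and the names hash_password and recover_pepper are hypothetical. A real site may combine the inputs differently or use a slow hash.

```python
import hashlib
import itertools
import string


def hash_password(password: str, salt: str, pepper: str) -> str:
    # Assumed scheme (not taken from any real site): SHA-256 over the
    # concatenation password + salt + pepper.
    data = (password + salt + pepper).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def recover_pepper(password, salt, target_hash,
                   alphabet=string.ascii_lowercase, max_len=3):
    # Try every candidate pepper up to max_len characters. The password
    # and salt are already known, so the pepper is the only unknown.
    for length in range(1, max_len + 1):
        for candidate in itertools.product(alphabet, repeat=length):
            pepper = "".join(candidate)
            if hash_password(password, salt, pepper) == target_hash:
                return pepper
    return None


# Simulate website X: the attacker registered with a known password, then
# read the salt and the hash for that account via the SQL injection.
SECRET_PEPPER = "qzx"        # hypothetical pepper, deliberately short
known_password = "hunter2"   # chosen by the attacker at registration
known_salt = "a1b2c3"        # leaked, stored in plain text
leaked_hash = hash_password(known_password, known_salt, SECRET_PEPPER)

recovered = recover_pepper(known_password, known_salt, leaked_hash)
print(recovered)
```

The search only succeeds quickly because the pepper space here is tiny (at most 26 + 26^2 + 26^3 candidates). The same loop over a long random pepper would not finish in any practical time.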
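The above is a form of chosen-plaintext attack. As long as attackers know the algorithm (hash()), the output ($hashed_password), and all but one of the inputs (the known "constants" $salt and $password, and the unknown "variable" $pepper), they can "solve for x" much as in a simple algebra equation (h = s + p + x, so x = h - s - p), except that a hash cannot be inverted, so the solving has to be done by brute force. The cost of that brute force grows exponentially with the length of the pepper. Making the pepper longer than 56 bytes (448 bits), the figure often quoted as bcrypt's key limit (most implementations actually accept up to 72 bytes), increases the time cost further and is as good as bcrypt, but it still may not be as good as scrypt. So, as long as the pepper is sufficiently long and random, it is an improvement.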
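As a rough illustration of what "sufficiently long" means, the sketch below estimates how long it would take to try every value of a uniformly random pepper. The guess rate of 10^10 hashes per second is an assumed figure chosen only for the arithmetic, not a measurement, and years_to_exhaust is a hypothetical helper.

```python
def years_to_exhaust(pepper_bits: int, guesses_per_second: float) -> float:
    # Worst-case time to try all 2**pepper_bits candidate peppers.
    seconds = 2 ** pepper_bits / guesses_per_second
    return seconds / (365 * 24 * 3600)


ASSUMED_RATE = 1e10  # guesses per second; an assumption for illustration

for bits in (24, 64, 128):
    years = years_to_exhaust(bits, ASSUMED_RATE)
    print(f"{bits}-bit pepper: about {years:.3g} years")
```

Under that assumption a 24-bit pepper falls in well under a second, while a 128-bit pepper is far out of reach, which is why the length and randomness of the pepper decide whether the attack is practical.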