65

I'm not sure if this is a proper programming question, but it's something that has always bothered me, and I wonder if I'm the only one.

When initially learning C++, I understood the concept of references, but pointers had me confused. Why, you ask? Because of how you declare a pointer.

Consider the following:

#include <cstddef>  // for NULL

void foo(int* bar)
{
}

int main()
{
    int x = 5;
    int* y = NULL;
    y = &x;
    *y = 15;
    foo(y);
}

The function foo(int*) takes an int pointer as its parameter. Since I've declared y as an int pointer, I can pass y to foo. But when I was first learning C++, I associated the * symbol with dereferencing, so I figured a dereferenced int needed to be passed. I would try to pass *y into foo, which obviously doesn't work.
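A minimal sketch of the distinction, reusing foo, x and y from the snippet above. The body of foo (writing 42) and the wrapper demo_pass_pointer are assumptions added for illustration; the original foo is empty.

```cpp
#include <cassert>
#include <type_traits>

// In a declaration, '*' is part of the type: bar is a pointer to int.
void foo(int* bar)
{
    // In an expression, '*' dereferences: this writes through the pointer.
    *bar = 42;  // illustrative body, not in the original question
}

// Hypothetical wrapper around the question's main() body so it can return a value.
int demo_pass_pointer()
{
    int x = 5;
    int* y = &x;  // declaration: y has type int*

    // y is a pointer to int; the expression *y is an int lvalue.
    static_assert(std::is_same<decltype(y), int*>::value, "y is an int*");
    static_assert(std::is_same<decltype(*y), int&>::value, "*y is an int lvalue");

    *y = 15;      // expression: dereference y and assign to x
    assert(x == 15);

    foo(y);       // OK: foo expects an int*, and y is an int*
    // foo(*y);   // error: *y is an int, which does not convert to int*
    return x;     // x was modified through the pointer inside foo
}
```

The commented-out call is the mistake described above: the parameter type is int*, so the argument has to be the pointer itself, not the int it points to.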

Wouldn't it have been easier to have a separate operator for declaring a pointer (or for dereferencing)? For example:

void test(int@ x) { } 
14
  • 11
    This question can't be answered, only speculated upon. Commented Dec 31, 2011 at 0:58
  • 21
    @bmargulies It can be answered directly; the creator of C wrote a document explaining exactly why this is so. Commented Dec 31, 2011 at 1:15
  • 4
    A question can be in the form of genuine curiosity, right? In fact, I find Crashworks' answer to be just that, a direct answer to my question. So why could this only be speculated upon? Commented Dec 31, 2011 at 1:16
  • 3
    @ildjarn: This isn't discussion-based, there's a clear answer to the question and we've all given it. Commented Dec 31, 2011 at 1:35
  • 2
    It might make sense to take out the phooehy case and reduce the code to a simplified main. I removed reference from the tags, as the question does not seem to be about references at all. Although the related question might be: "Why is the address-of operator (&) also used to declare a reference?" It is also overloaded in a similar fashion. Commented Dec 31, 2011 at 1:39

6 Answers

92

In The Development of the C Language, Dennis Ritchie explains his reasoning thusly:

The second innovation that most clearly distinguishes C from its predecessors is this fuller type structure and especially its expression in the syntax of declarations... given an object of any type, it should be possible to describe a new object that gathers several into an array, yields it from a function, or is a pointer to it.... [This] led to a declaration syntax for names mirroring that of the expression syntax in which the names typically appear. Thus,

int i, *pi, **ppi;

declare an integer, a pointer to an integer, a pointer to a pointer to an integer. The syntax of these declarations reflects the observation that i, *pi, and **ppi all yield an int type when used in an expression.

Similarly,

int f(), *f(), (*f)();

declare a function returning an integer, a function returning a pointer to an integer, a pointer to a function returning an integer.

int *api[10], (*pai)[10];

declare an array of pointers to integers, and a pointer to an array of integers.

In all these cases the declaration of a variable resembles its usage in an expression whose type is the one named at the head of the declaration.

An accident of syntax contributed to the perceived complexity of the language. The indirection operator, spelled * in C, is syntactically a unary prefix operator, just as in BCPL and B. This works well in simple expressions, but in more complex cases, parentheses are required to direct the parsing. For example, to distinguish indirection through the value returned by a function from calling a function designated by a pointer, one writes *fp() and (*pf)() respectively. The style used in expressions carries through to declarations, so the names might be declared

int *fp();
int (*pf)();

In more ornate but still realistic cases, things become worse: int *(*pfp)(); is a pointer to a function returning a pointer to an integer.

There are two effects occurring. Most important, C has a relatively rich set of ways of describing types (compared, say, with Pascal). Declarations in languages as expressive as C—Algol 68, for example—describe objects equally hard to understand, simply because the objects themselves are complex. A second effect owes to details of the syntax. Declarations in C must be read in an `inside-out' style that many find difficult to grasp. Sethi [Sethi 81] observed that many of the nested declarations and expressions would become simpler if the indirection operator had been taken as a postfix operator instead of prefix, but by then it was too late to change.
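As a rough illustration of the "declaration mirrors use" rule from the quote, here is a small sketch that declares fp, pf and pfp as in Ritchie's examples and then uses each one in the expression form its declaration spells out. The names value, seven and demo_declaration_mirrors_use are hypothetical, added only to make the sketch compile.

```cpp
#include <cassert>

static int value = 7;  // hypothetical storage for the pointers to refer to

// "int *fp();" : the expression *fp() is an int,
// so fp is a function returning a pointer to int.
int *fp() { return &value; }

// A plain function, used below as the target of a function pointer.
int seven() { return 7; }

// "int (*pf)();" : the expression (*pf)() is an int,
// so pf is a pointer to a function returning int.
int (*pf)() = seven;

// "int *(*pfp)();" : the expression *(*pfp)() is an int,
// so pfp is a pointer to a function returning a pointer to int.
int *(*pfp)() = fp;

int demo_declaration_mirrors_use()
{
    int i = 1, *pi = &i, **ppi = &pi;

    // i, *pi and **ppi are all expressions of type int.
    int sum = i + *pi + **ppi;  // 1 + 1 + 1
    assert(sum == 3);

    sum += *fp();      // indirection through the value returned by a function
    sum += (*pf)();    // call through a pointer to a function
    sum += *(*pfp)();  // both combined
    return sum;        // 3 + 7 + 7 + 7
}
```

In each case the expression on the right-hand side has the same shape as the declarator, which is the whole point of the syntax.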


12 Comments

That is an excellent explanation. I wish we were actually told these kinds of things in class; it would've helped me understand pointers much sooner.
This is very helpful! I've always wondered about the weird syntax of function pointers. Knowing the background makes reading and writing them much easier. Love this website <3
@Tomalak Geret'kal: Shucks, too bad I can't edit it now. Well, these things do happen when writing in a foreign language.
@diggingforfire You may also like my "graph paper and pencil" technique for reasoning with pointers (example: stackoverflow.com/questions/7062853/…). I generally believe that learning what the machine actually does with pointers first makes learning the C abstraction easier; rather than trying to learn the abstraction first and then the concrete after.
What part of this quote explains why prefix * was selected over alternatives?
17

The reason is clearer if you write it like this:

int x, *y; 

That is, both x and *y are ints. Thus y is an int *.
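A small sketch of what this grouping means in practice, including the well-known trap when the * is written next to the type. The variables a and b and the function demo_star_binds_to_name are hypothetical names for illustration.

```cpp
#include <cassert>
#include <type_traits>

int x, *y;  // x is an int; *y is an int, so y is a pointer to int

int* a, b;  // the '*' still belongs to a only: a is an int*, b is a plain int

static_assert(std::is_same<decltype(x), int>::value,  "x is an int");
static_assert(std::is_same<decltype(y), int*>::value, "y is an int*");
static_assert(std::is_same<decltype(a), int*>::value, "a is an int*");
static_assert(std::is_same<decltype(b), int>::value,  "b is an int, not an int*");

bool demo_star_binds_to_name()
{
    y = &x;
    *y = 3;   // writes to x through y
    b = *y;   // fine: b is an int, and so is the expression *y
    assert(x == 3);
    return x == 3 && b == 3;
}
```

This is one reason the one-declaration-per-line style is usually recommended: it sidesteps the question of which names the * applies to.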

7 Comments

Actually, this is not clearer if you expand it. I don't want to start an argument, but I'd like to point out that this syntax is bound to lead to one.
I agree that you shouldn't use this syntax in practice - it was merely to illustrate the point (which is the same point David makes).
Indeed, nothing against you or your answer (except perhaps the "clearer" part)
No offence taken :) Just trying to clarify.
Probably not the clearest answer in the world, but typed on an iPad hence shorter than normal. The point I was trying to make is that if you group the * with the variable name then it becomes clearer where this syntax came from - namely the idea that by saying * y is an int, you are implying that y itself is an int *. As this answer has been borne out in practice by the reference given (and it wasn't a guess), I'm not sure where the hostility is coming from. Anyway, let's agree to disagree.
12

That is a language decision that predates C++, as C++ inherited it from C. I once heard that the motivation was that the declaration and the use would be equivalent, that is, given a declaration int *p; the expression *p is of type int in the same way that with int i; the expression i is of type int.


11

Because the committee, and those that developed C++ in the decades before its standardisation, decided that * should retain its original three meanings:

  • A pointer type
  • The dereference operator
  • Multiplication

You're right to suggest that the multiple meanings of * (and, similarly, &) are confusing. I've been of the opinion for some years that they are a significant barrier to understanding for language newcomers.
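To make the overloading concrete, here is a short sketch that uses * in all three roles listed above, plus the two analogous roles of &. The function name demo_three_meanings and the values are made up for illustration.

```cpp
#include <cassert>

int demo_three_meanings()
{
    int n = 6;

    int *p = &n;          // 1. in a declaration: p is a pointer to int
                          //    ('&' here is the address-of operator)
    int v = *p;           // 2. unary operator: dereference p to read n
    int product = v * 7;  // 3. binary operator: multiplication
    assert(product == 42);

    int &r = n;           // '&' in a declaration: r is a reference to n
    r = product;          // assigns to n through the reference

    return *p * 2;        // dereference, then multiply: 42 * 2
}
```

The last line shows why newcomers struggle: the same character appears twice in one expression with two different meanings, and only its position tells them apart.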


Why not choose another symbol for C++?

Backwards-compatibility is the root cause... best to re-use existing symbols in a new context than to break C programs by translating previously-not-operators into new meanings.


Why not choose another symbol for C?

It's impossible to know for sure, but there are several arguments that can be — and have been — made. Foremost is the idea that:

when [an] identifier appears in an expression of the same form as the declarator, it yields an object of the specified type. {K&R, p216}

This is also why C programmers tend to[citation needed] prefer aligning their asterisks to the right rather than to the left, i.e.:

int *ptr1; // roughly C-style
int* ptr2; // roughly C++-style

though both varieties are found in programs of both languages, varyingly.

4 Comments

I don't think the committee decided anything; it was probably Dennis Ritchie when he created C.
The committee is responsible for the language we know today. Fair point, it started with DR, but that's not "C" as we know it now. And, more than much else, the C++ committee made the conscious decision for the C++ language -- Ritchie did not.
Before there was a committee, Ritchie invented that syntax. Stroustrup was not going to break backwards compatibility with C by not using it. All these decisions were made before the development of C and C++ was put under a standards body.
@BrianNeal: Fine, but the languages as we know them are governed by those committees, and those committees are -- currently -- responsible for that decision. Unless you want to credit that first protein in the primordial soup.
6

Page 65 of Expert C Programming: Deep C Secrets includes the following: "And then, there is the C philosophy that the declaration of an object should look like its use."

Page 216 of The C Programming Language, 2nd edition (aka K&R) includes: "A declarator is read as an assertion that when its identifier appears in an expression of the same form as the declarator, it yields an object of the specified type."

I prefer the way van der Linden puts it.


6

Haha, I feel your pain, I had the exact same problem.

I thought a pointer should be declared as &int because it makes sense that a pointer is an address of something.

After a while I figured out for myself that every type in C can be read backwards, like

int * const x

can be read as

x const * int

A constant x, when dereferenced (signaled with *) is of type int. So something that has to be dereferenced, has to be a pointer.
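A quick sketch of that backwards reading, contrasting int * const with const int *. The variables a, b and y and the function demo_read_backwards are hypothetical names for illustration.

```cpp
#include <cassert>

int demo_read_backwards()
{
    int a = 1, b = 2;

    int * const x = &a;  // backwards: x is a const pointer to int
    *x = 10;             // OK: the int it points to is not const
    // x = &b;           // error: x itself is const and cannot be reseated
    assert(a == 10);

    const int * y = &a;  // backwards: y is a pointer to an int that is const
    y = &b;              // OK: the pointer itself can be reseated
    // *y = 20;          // error: the int cannot be modified through y

    return *x + *y;      // 10 + 2
}
```

Where the const lands relative to the * decides whether it is the pointer or the pointed-to int that cannot change.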

