0

Let's say we have the following two pieces of code:

    int *a = (int *)malloc(sizeof(*a));
    int *b = (int *)malloc(sizeof(*b));

And

    int *a = (int *)malloc(2 * sizeof(*a));
    int *b = a + 1;

Both of them allocate two integers on the heap and (assuming normal usage) they should be equivalent. The first seems to be slower as it calls malloc twice, while the second can result in more cache-friendly code. The second, however, is possibly insecure, as we can accidentally overwrite the value b points to just by incrementing a and writing through the resulting pointer (or someone malicious can change the value b points to just by knowing where a is).

It's possible that the above claims are not true (for example, the speed is questioned here: Minimizing the amount of malloc() calls improves performance?), but my question is just: can the compiler do this type of transformation, or is there something fundamentally different between the two according to the standard? If it is possible, what compiler flags (let's say for gcc) allow it?
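To make the difference between the two fragments concrete, here is a minimal sketch that reports how far apart the two ints end up in memory. The helper names `separate_distance` and `adjacent_distance` are made up for illustration. Addresses are compared through `uintptr_t` because subtracting pointers into different allocations is undefined, and the first result depends entirely on the allocator in use.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: byte distance between two separately malloc'd ints.
   Returns 0 if either allocation fails. The value depends on the allocator. */
size_t separate_distance(void)
{
    int *a = malloc(sizeof *a);
    int *b = malloc(sizeof *b);
    size_t dist = 0;

    if (a != NULL && b != NULL) {
        uintptr_t pa = (uintptr_t)a;
        uintptr_t pb = (uintptr_t)b;
        dist = (size_t)(pa > pb ? pa - pb : pb - pa);
    }

    free(a);   /* free(NULL) is a no-op */
    free(b);
    return dist;
}

/* Hypothetical helper: byte distance when both ints share one block. */
size_t adjacent_distance(void)
{
    int *a = malloc(2 * sizeof *a);
    if (a == NULL)
        return 0;

    int *b = a + 1;
    size_t dist = (size_t)((char *)b - (char *)a);

    free(a);
    return dist;
}
```

Printing both results on a given system shows how much bookkeeping and alignment padding sits between two separate allocations; the second function always yields `sizeof(int)`.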

5
  • 2
    You should specify which language; I believe the answer is different in both cases. Commented Feb 9, 2021 at 0:50
  • There's a very good chance that less space would be used by the second (single malloc()) fragment than by the first. The first would need two lots of accounting overhead, so the chances are good that the values in a and b are more than 4 bytes apart in that case (probably at least 8 bytes apart on a 32-bit system and at least 16 bytes on a 64-bit system, but those numbers vary freely between implementations). Commented Feb 9, 2021 at 0:51
  • 1
    You urgently need to get in the habit of doing sizeof(int) rather than assuming 4. Code like this is what makes porting from 32-bit to 64-bit far, far harder than it should be. (Sure, int is the same size on both, but other types differ considerably.) Commented Feb 9, 2021 at 0:59
  • 1
    Attempt to fool the optimizer at your own peril. Commented Feb 9, 2021 at 1:24
  • Writing out of bounds will be possible with both examples. Neither is safe against that. Commented Feb 9, 2021 at 1:40

5 Answers

3

In reality, no, the compiler will never combine the two malloc() calls into a single malloc() call automatically. Each call to malloc() returns the address of a new memory block; there is no guarantee that the allocated blocks will be located anywhere close to each other, and each allocated block must be free()'d individually. So no compiler will ever assume anything about the relationship between multiple allocated blocks and try to optimize their allocations for you.

Now, it is possible that in a very simplified use case, where the allocation and deallocation are in the same scope and the transformation can be proven safe, a compiler vendor might decide to try to optimize, i.e.:

    void doIt()
    {
        int *a = (int *)malloc(sizeof(*a));
        int *b = (int *)malloc(sizeof(*b));
        ...
        free(a);
        free(b);
    }

Could become:

    void doIt()
    {
        void *ptr = malloc(sizeof(int) * 2);
        int *a = (int *)ptr;
        int *b = a + 1;
        ...
        free(ptr);
    }

But in reality, no compiler vendor will actually attempt to do this. It is not worth the effort, or the risk, for such little gain. And it would not work in more complex scenarios anyway, e.g.:

    void UseAndFree(int *a, int *b); /* declared before its first use */

    void doIt()
    {
        int *a = (int *)malloc(sizeof(*a));
        int *b = (int *)malloc(sizeof(*b));
        ...
        UseAndFree(a, b);
    }

    void UseAndFree(int *a, int *b)
    {
        ...
        free(a);
        free(b);
    }

4 Comments

I have seen a compiler optimize a malloc and free away completely. So it might not reduce two to one, but it would reduce two to zero.
@EricPostpischil that would only be possible if the free() were in scope of the malloc(), and the compiler knew the pointer wasn't being used for anything other than the free().
Yes, and I have seen it.
You do not need to have free in the scope. Just drop the pointer and malloc is optimized away.
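The elision mentioned in these comments can be sketched as follows. This is an illustration, not a guarantee: whether the calls disappear depends on the compiler, its version, and the optimization level, and the function name `sum_via_heap` is hypothetical. With GCC, `-fno-builtin-malloc` stops the compiler from treating `malloc` as a built-in, which, as far as I understand, should also prevent this kind of removal.

```c
#include <stdlib.h>

/* Hypothetical example: both blocks are allocated, used, and freed inside
   one function, and neither pointer escapes. Under the as-if rule an
   optimizer may keep the values in registers and drop the malloc()/free()
   calls entirely; an unoptimized build performs both allocations. */
int sum_via_heap(int x, int y)
{
    int *a = malloc(sizeof *a);
    int *b = malloc(sizeof *b);
    int result = 0;

    if (a != NULL && b != NULL) {
        *a = x;
        *b = y;
        result = *a + *b;
    }

    free(a);   /* free(NULL) is a no-op, so no extra checks are needed */
    free(b);
    return result;
}
```

Comparing the assembly produced at `-O0` and `-O2` (for example with `gcc -S`) shows whether a given compiler performs the removal. Note that this reduces two calls to zero; it does not merge them into one.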
3

No, it can't, because the compiler (in general) doesn't know when a and b might get free()'d, and if it allocates them both as part of a single allocation, then it would need to free() them both at the same time also.

6 Comments

I'm not sure I follow the logic. The compiler could choose to free them both at the last free of either pointer, right? Would the as-if rule be violated in that case? It's unlikely that a compiler would do this, but I don't see why they can't.
@cigien It could, but why? Allocators are already complicated enough. This takes it to entirely new levels of complexity.
It wouldn't be as-if, if the code did it over and over again. Alloc A+B, free A, alloc C+D, free C, etc... memory exhaustion. And determining whether this situation occurs or not probably reduces to the halting problem.
"it can't" That's a bit too strong of a statement. For example, a compliant compiler could hypothetically use a dedicated fixed-size allocator for int sized chunks, and track those independently. Unlikely? Yes, of course. Impossible? No.
To clarify: I'm not saying that a compiler would do this. But the question is asking "can", and the answer is saying "no", which seems wrong. The compiler could choose to release all the memory at the end of the program (when it's definitely safe). This would be inefficient, but is a QOI issue; I think compilers would be compliant in doing this.
2

There are a number of reasons why this will likely never happen, but the most important is lifetimes: allocations made independently can be freed independently, whereas allocations made together are locked to the same lifetime.

This sort of nuance is best expressed by the developer rather than determined by the compiler.
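A minimal sketch of the lifetime point, with hypothetical function names: with two allocations, one value can be released while the other lives on, whereas with one block the only pointer that may be passed to free() is the one malloc() returned.

```c
#include <stdlib.h>

/* Two allocations: a is released early while b stays alive.
   Returns NULL on allocation failure; the caller must free() the result. */
int *keep_second_separate(int first, int second)
{
    int *a = malloc(sizeof *a);
    int *b = malloc(sizeof *b);
    if (a == NULL || b == NULL) {
        free(a);
        free(b);
        return NULL;
    }
    *a = first;
    *b = second;
    free(a);      /* a's lifetime ends here ... */
    return b;     /* ... while b lives on independently */
}

/* One allocation: a + 1 is an interior pointer, so passing it to free()
   would be undefined behavior. The whole block must stay alive as long as
   either element is needed, and it is released through the base pointer. */
int *keep_second_combined(int first, int second)
{
    int *a = malloc(2 * sizeof *a);
    if (a == NULL)
        return NULL;
    a[0] = first;
    a[1] = second;
    return a;     /* caller reads a[1] and later calls free(a) */
}
```

In the combined version the storage for the first int cannot be returned to the allocator until the second one is no longer needed, which is exactly the coupling a compiler would have to prove harmless before merging the calls.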

Is the second "insecure" in that you can overwrite values? In C, and by extension C++, the language does not protect you from bad programming. You are free to shoot yourself in the foot at any time, using any means necessary:

    int a;
    int b;
    int* p = &a;
    p[1] = 9;     // Bullet, meet foot
    (&b)[-1] = 9; // Why not?

If you want to allocate N of something by all means use calloc() to express it, or an appropriately sized malloc(). Doing individual allocations is pointless unless there's a good reason.
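As a small sketch of expressing "N of something" in a single call (the function name `make_int_array` is made up here): calloc() takes the element count and the element size as separate arguments and zero-fills the block.

```c
#include <stdlib.h>

/* Hypothetical helper: allocate n zero-initialized ints in one block.
   Returns NULL on failure; the block is released with a single free(). */
int *make_int_array(size_t n)
{
    return calloc(n, sizeof(int));
}
```

The malloc() equivalent would be `malloc(n * sizeof(int))`, which leaves the contents uninitialized.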

Normally you wouldn't allocate a single int, that's kind of useless, but there are cases where that might be the only reasonable option. Typically it's larger blocks of things, like a full struct or a character buffer.

5 Comments

@0___________ My bad typing. Fixed. Thanks.
"If made together they're locked to the same lifetime" In the end it's a matter of implementation, and there is no guarantee about that. Suppose a fixed-size allocator where the used blocks are marked with bits in a master "bitmap". Then it would be entirely possible for one malloc to flip two bits on, then each free to flip one bit back off.
@dxiv Depends on the internals of the allocator and how it interfaces with the host OS. A trivial allocator might find this easy to do, but that doesn't mean it's a good idea.
@tadman To repeat an earlier comment, unlikely - yes of course, but impossible - no. Guess it depends on how you read OP's question: "can the compiler do ...?". In the sense of "will a common/sane compiler do ..." the answer is negative. But in the sense of "is the compiler technically allowed to ever dream of doing ..." the answer is just maybe.
@dxiv There's an infinite number of things a compiler can do in theory. I'd prefer to keep the scope to things that they're likely to do, or where it makes practical sense to do such a thing. Like "Can a compiler detect all the bugs in your code?" or "Can a compiler prove P=NP?" Technically, yes. Practically, no.
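The bitmap scheme described in the comments above can be sketched as a toy fixed-size pool. Everything here is hypothetical (`pool_alloc_pair`, `pool_free_one`, `pool_used_count`, the 32-slot pool); it only illustrates that one allocation call can flip two bits on while each release flips one bit off, not how any real malloc() works.

```c
#include <stddef.h>
#include <stdint.h>

#define POOL_SLOTS 32u

static int      pool[POOL_SLOTS];   /* backing storage, one int per slot */
static uint32_t used_bits;          /* bit i set => pool[i] is in use    */

/* Hand out two adjacent slots with a single call, flipping two bits on.
   Returns NULL if no adjacent pair of free slots exists. */
int *pool_alloc_pair(void)
{
    for (unsigned i = 0; i + 1 < POOL_SLOTS; i++) {
        uint32_t mask = (uint32_t)3 << i;
        if ((used_bits & mask) == 0) {
            used_bits |= mask;
            return &pool[i];
        }
    }
    return NULL;
}

/* Release one slot, flipping one bit off. Each half of a pair can be
   released on its own, in either order. p must point into pool. */
void pool_free_one(int *p)
{
    if (p == NULL)
        return;
    size_t i = (size_t)(p - pool);
    used_bits &= ~((uint32_t)1 << i);
}

/* Number of slots currently marked as used. */
unsigned pool_used_count(void)
{
    unsigned n = 0;
    for (uint32_t bits = used_bits; bits != 0; bits &= bits - 1)
        n++;
    return n;
}
```

With such a pool, `int *a = pool_alloc_pair(); int *b = a + 1;` followed by `pool_free_one(a)` leaves `b` usable until it is released separately, which is the behavior the comment argues is technically possible.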
1

First of all:

    int *a = (int *)malloc(8);
    int *b = a + 4;

Is not what you think. You want:

    int *a = malloc(sizeof(*a) * 2);
    int *b = a + 1;

It shows that pointer arithmetic is something you need to learn.

Secondly: the compiler does not change anything in your code, and it will not combine any function calls into one. What you are trying to achieve is a micro-optimization. If you want to use a larger chunk of memory, simply use arrays.

    int *a = malloc(sizeof(*a) * 2);
    a[0] = 5;
    a[1] = 6;
    /* some other code */
    free(a);

Do not use "magic" numbers in malloc, only the sizeof of the objects. Do not cast the result of malloc.

1 Comment

Thanks for the sizeof suggestion, actually I didn't know that I can use sizeof on the variable that I'm currently declaring. I've edited my question.
0

I've done exactly that with a bignum library, but you only free the one pointer.

    //initialization every time program runs
    extern bignum_t *scratch00; //these are useful for taylor series, etc.
    extern bignum_t *scratch01;
    extern bignum_t *scratch02;
    . . .
    bignum_t *bn_malloc(int bignums)
    {
        return(malloc(bignums * bn_numbytes));
    }
    . . .
    //bignums specific to the program being written at the moment
    bignum_t *numerator;
    bignum_t *denom;
    bignum_t *denom_add;
    bignum_t *accum;
    bignum_t *term;
    . . .
    numerator = bn_malloc(1);
    denom = bn_malloc(1);
    denom_add = bn_malloc(1);
    accum = bn_malloc(1);
    term = bn_malloc(1);
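The snippet above calls bn_malloc(1) once per variable. A single-block variant, where only one pointer is ever freed, might look roughly like the sketch below. It is a guess at the general shape, not the library's actual code: `scratch_t`, `SCRATCH_BYTES`, `scratch_set`, and both functions are hypothetical stand-ins, since the real definitions of `bignum_t` and `bn_numbytes` are not shown.

```c
#include <stdlib.h>

enum { SCRATCH_BYTES = 64 };   /* assumed size of one number, for illustration */
typedef struct { unsigned char bytes[SCRATCH_BYTES]; } scratch_t;

typedef struct {
    scratch_t *base;        /* the only pointer ever passed to free() */
    scratch_t *numerator;
    scratch_t *denom;
    scratch_t *accum;
} scratch_set;

/* One zero-filled allocation, carved into three numbers.
   Returns 0 on success, -1 on allocation failure. */
int scratch_set_init(scratch_set *s)
{
    s->base = calloc(3, sizeof *s->base);
    if (s->base == NULL)
        return -1;
    s->numerator = s->base;
    s->denom     = s->base + 1;
    s->accum     = s->base + 2;
    return 0;
}

/* Releases all three numbers at once through the base pointer. */
void scratch_set_free(scratch_set *s)
{
    free(s->base);
    s->base = s->numerator = s->denom = s->accum = NULL;
}
```

Keeping the base pointer in the same struct as the carved pointers makes it harder to free an interior pointer by mistake.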

1 Comment

Of course the programmer can implement blocked memory... but the question is about whether the compiler can do it, i.e. as a behind-the-scenes optimization without requiring any special steps on the part of the programmer.
