2

I'm using this code to dynamically create a 2d array:

char **FileTables; int rows = 1000; int i; FileTables = (char**)malloc(rows * sizeof(char)); for (i = 0; i < rows; i++) { FileTables[i] = (char*)malloc(256 * sizeof(char)); } 

Problem is with 1000 rows, and there could be more, it takes a couple of seconds to allocate all the memory. Is there any faster/better method to doing this?

EDIT: Is there an advantage to using one of these methods over the other, besides the obvious simpler code?

char **FileTables; int rows = 1000; int i; FileTables = malloc(rows * sizeof(char*)); FileTables[0] = malloc(rows * 256 * sizeof(char)); for (i = 0; i < rows; i++) { FileTables[i] = FileTables[0] + i * 256; } 

And..

char (*FileTables)[256]; int rows = 1000; FileTables = malloc(rows * sizeof(*FileTables)); 

(And yes, I fixed the unnecessary casting)

4
  • 1
    Note: you shouldn't cast the result of malloc in C - it's unnecessary and can mask bugs that the compiler might otherwise warn you about. Commented Sep 12, 2011 at 16:29
  • 2
    Paul: +1. I really wonder why people do this. At what point did they say, "Hm, my code works well and looks great. I wonder if I can add some noise to it by inserting gratuitous casts. for (i = (int)0; (int)i < 12; ++(int)i). Commented Sep 12, 2011 at 16:33
  • 1
    @Kerrek: ++(int)i does not compile. ++ requires an lvalue and (int)i is not an lvalue. Commented Sep 12, 2011 at 16:45
  • Paul: Yes, indeed, it was an over-exaggerated example. To be fair, the malloc-casts usually do actually change types, too -- but unnecessarily so, as the conversion is already implicit. Commented Sep 12, 2011 at 16:49

7 Answers 7

6

You could get away with just two allocations and some pointer arithmetic:

int rows = 1000; int cols = 256; char *data; char **FileTables; int i; data = malloc(rows * cols); FileTables = malloc(rows * sizeof(char*)); for (i = 0; i < rows; i++) { FileTables[i] = data + i * cols; } 

Also note that I fixed a bug in malloc(rows * sizeof(char)) (the sizeof(char) should be sizeof(char*), since you're allocating an array of pointers to char).

Sign up to request clarification or add additional context in comments.

1 Comment

Noted (and comment deleted). +1 for the answer.
4

As long as the number of columns is constant, or if you're using C99, you can get away with a single malloc without having to do ugly row/column addressing arithmetic yourself:

char (*FileTables)[256] = malloc(rows * sizeof *FileTables); 

4 Comments

I was going to use the solution posted by aix, but this is alot simpler, and alot less code. Could anyone comment on which is a better approach, and why?
I like mine better because the type of FileTables is correct such that you can just write FileTables[row][col], and FileTables[row] by itself will decay to a pointer to that row (a normal char *). You don't have to keep track of the arithmetic yourself.
Do I free it with the usual free(FileTables); or do i need to do something special?
That's right, just free(FileTables);. You can also use realloc on it; just be sure to compute the new size right.
3

If the array is always of the size row × 256, then you might consider a one-dimensional array malloc(row * 256), and access it in strides:

char get(unsigned i, unsigned j, char * array) { return array[j + 256 * i]; } void set(char value, unsigned i, unsigned j, char * array) { array[j + 256 * i] = value; } 

This avoids multiple allocations and gives better memory locality. On top of that, you can pick row or column ordering to micro-optimize.

Comments

1
char **FileTables; int rows = 1000; int i; FileTables = (char**)malloc(rows * sizeof(char *)); char *data = (char *)malloc(256 * 1000 * sizeof(char)); for (i = 0; i < rows; ++i) { FileTables[i] = data; data += 256 * sizeof(char); } 

Should be a better solution.

Comments

1

I don't believe you will get anywhere near seconds. Increasing the rows to 10 million is still under a second on my machine.

However if you want to minimise allocations, you only need one.

FileTables = (char**) malloc(rows * (sizeof(char *) + 256*sizeof(char))); FileTables[0] = (char *) &FileTables[rows]; for (i = 1; i < rows; i++) { FileTables[i] = FileTables[i-1] + 256 * sizeof (char); } free(FileTables); 

A more efficient way to do this is to avoid the second level of indirection.

typedef char chars[256]; int main(int argc, char** argv) { chars* FileTables; int rows = 100000000; int i; FileTables = (chars*) malloc(rows * sizeof (chars)); free(FileTables); return (EXIT_SUCCESS); } 

This avoid a pointer lookup as the C can calculate the rest.

Comments

0

First of all, are you sure it's the memory allocation that is the problem? allocating 1000 blocks of memory should generally not take a few seconds.

You could look into alternate malloc implementations if you have particular needs (e.g., google's tcmalloc if you are allocating memory in threads).

Otherwise, the real "slow" part of malloc is actually getting memory from the OS (with sbrk() or mmap()), and most malloc implementations will grab a big chunk at a time and give it back in smaller pieces, so there are not 1000 calls to allocate 1k each here, there are maybe 60 calls to allocate 16k. Running the program under strace or similar may give you an idea of how many slow system calls are really being made.. You could implement similar behavior yourself, by making a single call to allocate 256K and subdividing that up into smaller chunks. You could try allocating a big chunk of memory and then immediately free()-ing it and hope that the library malloc holds onto that memory and doesn't go back to the OS for more.

Comments

0

This really looks like premature optimization; because, you are asking for faster, but you haven't indicated how fast is fast enough. Still, if you really need to do it this way...

Tips to speed up allocation:

  1. Do fewer allocations
  2. Do smaller allocations

As you can see, if you need 10M allocated, these tips soon become conflicting. To determine the right balance between smaller and fewer allocations, on needs to do profiling.

Look to your memory block size and allocate whole pages of memory at once. It's an old hardware hack, but it does guarantee that you don't ask for multiple pages of continuous memory at once (which speeds up the selecting from the free page lists), and it also guarantees that you don't waste some cycles address space by asking for addresses already reserved by the block reservation subsystem of the memory manager.

If that doesn't get you the performance you need, then rewrite the code to not require allocation the way it's been presented.

Either way, it's not possible to guarantee optimum allocation speed without detailed knowledge of how the memory management subsystem on your computer is actually designed.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.