I am trying to parallelize a program that was originally written in the Verilog hardware description language and then translated into C/C++ by a translator program. The fact that it is a circuit-level C/C++ program should not make a difference. I am essentially trying to follow the approach in

https://stackoverflow.com/users/2979872/user2979872

But I am getting a segmentation fault when I try to parallelize it with OpenMP. Here is the code. When I make "top", which is a pointer to an object, private by writing #pragma omp parallel num_threads(2) private(top), I get a segmentation fault.

//////////////////////////////////////////////////////////////////////////////////////////

    int main(int argc, char **argv, char **env) {
        Verilated::commandArgs(argc, argv);
        Vaes_cipher_top* top = new Vaes_cipher_top; // this is the aes object that will do the enc
        unsigned int i = 0;
        unsigned int set_done;
        unsigned int ld_set = 0;
        top->rst = 1; // assert reset

        #pragma omp parallel num_threads(2) private(top)
        while (i < 2) {
            if (main_time > 10) {
                top->rst = 0; // Deassert reset
            }
            if ((main_time % 10) == 1) {
                top->clk = 1; // Toggle clock (posedge)
            }
            if ((main_time % 10) == 6) {
                top->clk = 0;
                // setting DUT values
                if (ld_set != 1 && main_time > 10) {
                    top->ld = 1;
                    top->key = {0x00000000,0x00000000,0x00000000,0x00000000};
                    top->text_in = {0x00000000,0x00000000,0x00000000,0x00000000};
                    ld_set++;
                } else if (ld_set == 1 && main_time > 10) {
                    top->ld = 0;
                    set_done = 0;
                }
            } // (main_time % 10) == 6

            top->eval(); // Evaluate model
            if (top->done && !set_done) {
                print(top->key);
                print(top->text_in);
                print(top->text_out);
                ld_set = 0; // reset
                i++;
                set_done = 1;
            } // if (top->done)
            main_time++; // Time passes...
        } // end of while

        printf("\n Test Done\n");
        top->final(); // Done simulating
        delete top;
        return 0;
    } // end of main

Trying to move forward as suggested by Hristo. There is no more segmentation fault, but the result is incorrect due to race conditions.

    int main(int argc, char **argv, char **env) {
        Verilated::commandArgs(argc, argv);
        Vaes_cipher_top* top; // this is the aes object that will do the enc
        unsigned int i = 0;
        unsigned int set_done;
        unsigned int ld_set = 0;
        //top->rst = 1; // assert reset
        unsigned int iter_count = 1;

        #pragma omp parallel num_threads(2) firstprivate(iter_count,ld_set,set_done,i)
        while (i < 2) {
            if (iter_count) {
                top = new Vaes_cipher_top;
                iter_count = 0;
            }
            if (main_time == 0)
                top->rst = 1; // assert reset
            if (main_time > 10) {
                top->rst = 0; // Deassert reset
            }
            if ((main_time % 10) == 1) {
                top->clk = 1; // Toggle clock (posedge)
            }
            if ((main_time % 10) == 6) {
                top->clk = 0;
                // setting DUT values
                if (ld_set != 1 && main_time > 10) {
                    top->ld = 1;
                    top->key = {0x00000000,0x00000000,0x00000000,0x00000000};
                    top->text_in = {0x00000000,0x00000000,0x00000000,0x00000000};
                    ld_set++;
                } else if (ld_set == 1 && main_time > 10) {
                    top->ld = 0;
                    set_done = 0;
                }
            } // (main_time % 10) == 6

            top->eval(); // Evaluate model
            if (top->done && !set_done) {
                print(top->key);
                print(top->text_in);
                print(top->text_out);
                ld_set = 0; // reset
                i++;
                set_done = 1;
                iter_count = 1;
            } // if (top->done)
            main_time++; // Time passes...
        } // end of while

        printf("\n Test Done\n");
        top->final(); // Done simulating
        delete top;
        return 0;
    } // end of main

////////////////////////////////////////////////////////////////////////////////////

Updated as suggested by Hristo to move the declaration Vaes_cipher_top *top inside the while loop:

    int main(int argc, char **argv, char **env) {
        Verilated::commandArgs(argc, argv);
        unsigned int i = 0;
        unsigned int set_done;
        unsigned int ld_set = 0;
        //top->rst = 1; // assert reset
        unsigned int iter_count = 1;

        #pragma omp parallel num_threads(2) firstprivate(iter_count,ld_set,set_done,i)
        while (i < 2) {
            if (iter_count) {
                Vaes_cipher_top* top; // this is the aes object that will do the enc
                top = new Vaes_cipher_top;
                iter_count = 0;
            }
            if (main_time == 0)
                top->rst = 1; // assert reset
            if (main_time > 10) {
                top->rst = 0; // Deassert reset
            }
            if ((main_time % 10) == 1) {
                top->clk = 1; // Toggle clock (posedge)
            }
            if ((main_time % 10) == 6) {
                top->clk = 0;
                // setting DUT values
                if (ld_set != 1 && main_time > 10) {
                    top->ld = 1;
                    top->key = {0x00000000,0x00000000,0x00000000,0x00000000};
                    top->text_in = {0x00000000,0x00000000,0x00000000,0x00000000};
                    ld_set++;
                } else if (ld_set == 1 && main_time > 10) {
                    top->ld = 0;
                    set_done = 0;
                }
            } // (main_time % 10) == 6

            top->eval(); // Evaluate model
            if (top->done && !set_done) {
                print(top->key);
                print(top->text_in);
                print(top->text_out);
                ld_set = 0; // reset
                i++;
                set_done = 1;
                iter_count = 1;
            } // if (top->done)
            main_time++; // Time passes...
        } // end of while

        printf("\n Test Done\n");
        top->final(); // Done simulating
        delete top;
        return 0;
    } // end of main

////////////////////////////////////////////////////////////////////////////////////////////

Here is the output. All the errors are the same, so I am showing only a few of them:

    ./sim_main.cpp:76: error: ‘top’ was not declared in this scope (on line where top->rst=1)
    ../sim_main.cpp:80: error: ‘top’ was not declared in this scope (on line where top->rst=0)
    ../sim_main.cpp:84: error: ‘top’ was not declared in this scope (on line where top->clk =1)
    ../sim_main.cpp:89: error: ‘top’ was not declared in this scope (on line where top->clk=0)

If you remove the if surrounding the Vaes_cipher_top declaration, it becomes an infinite loop!

///////////////////////////////////////////////////////////////////////////////////////////

The simulation hangs, and the output appears at different times on every run. I am using 2 threads, i.e. num_threads(2).

(1) This is the run where the simulation terminates:

    key=67fd3c2821b9201521d6a87f205e3039 text_in=67fd3c2821b9201521d6a87f205e3039 Time=251,text_out=71a354729996bac975784dcdb50260d9, done= 1 on 0 of 2 i= 1
    key=1a857b7f39a0290d20bbf2466b5b14e8 text_in=1a857b7f39a0290d20bbf2466b5b14e8 Time=321,text_out=da36095f53fd86a57f9d147e8e05603, done= 1 on 1 of 2 i= 1
    key=67fd3c2821b9201521d6a87f205e3039 text_in=67fd3c2821b9201521d6a87f205e3039 Time=401,text_out=71a354729996bac975784dcdb50260d9, done= 1 on 0 of 2 i= 2
    key=1a857b7f39a0290d20bbf2466b5b14e8 text_in=1a857b7f39a0290d20bbf2466b5b14e8 Time=601,text_out=da36095f53fd86a57f9d147e8e05603, done= 1 on 1 of 2 i= 2
    key=67fd3c2821b9201521d6a87f205e3039 text_in=67fd3c2821b9201521d6a87f205e3039 Time=641,text_out=71a354729996bac975784dcdb50260d9, done= 1 on 0 of 2 i= 3
    key=1a857b7f39a0290d20bbf2466b5b14e8 text_in=1a857b7f39a0290d20bbf2466b5b14e8 Time=841,text_out=da36095f53fd86a57f9d147e8e05603, done= 1 on 1 of 2 i= 3
    key=67fd3c2821b9201521d6a87f205e3039 text_in=67fd3c2821b9201521d6a87f205e3039 Time=911,text_out=71a354729996bac975784dcdb50260d9, done= 1 on 0 of 2 i= 4
    key=1a857b7f39a0290d20bbf2466b5b14e8 text_in=1a857b7f39a0290d20bbf2466b5b14e8 Time=991,text_out=da36095f53fd86a57f9d147e8e05603, done= 1 on 1 of 2 i= 4

     Test Done

(2) This is the run where the simulation does NOT terminate and I had to press Ctrl+C to abort it:

    key=75f1bcf47451ab0f33b58a5e1adfdd6 text_in=75f1bcf47451ab0f33b58a5e1adfdd6 Time=411,text_out=9049c33819d61de5c09aa388479ef10, done= 1 on 0 of 2 i= 1
    key=75f1bcf47451ab0f33b58a5e1adfdd6 text_in=75f1bcf47451ab0f33b58a5e1adfdd6 Time=696,text_out=9049c33819d61de5c09aa388479ef10, done= 1 on 0 of 2 i= 2
    key=75f1bcf47451ab0f33b58a5e1adfdd6 text_in=75f1bcf47451ab0f33b58a5e1adfdd6 Time=931,text_out=9049c33819d61de5c09aa388479ef10, done= 1 on 0 of 2 i= 3
    key=75f1bcf47451ab0f33b58a5e1adfdd6 text_in=75f1bcf47451ab0f33b58a5e1adfdd6 Time=1151,text_out=9049c33819d61de5c09aa388479ef10, done= 1 on 0 of 2 i= 4
    ^C

In the second run I had to press Ctrl+C to abort the simulation, and only 1 core was being used instead of 2. Why is this happening, and how can I prevent it? Why is the output of the two threads not appearing at the same time? Can this be done?

Please click the button add / show 1 more comment at the bottom of this webpage to see new comments

Thanks

///////////////////////////////////////////////////////

Here is the final working code that I want to share with everybody:

///////////////////////////////////////////////////////////////////

    #include <omp.h>
    #include "Vaes_cipher_top.h"
    #include "verilated.h"
    #include "verilated_vcd_c.h"
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    //#pragma omp threadprivate(top)
    vluint64_t main_time = 0; // Current simulation time
    // This is a 64-bit integer to reduce wrap over issues and
    // allow modulus. You can also use a double, if you wish.

    double sc_time_stamp () { // Called by $time in Verilog
        return main_time;     // converts to double, to match
                              // what SystemC does
    }

    int main(int argc, char **argv, char **env) {
        Verilated::commandArgs(argc, argv);
        srand(time(NULL));
        unsigned int set_done = 0;
        unsigned int i = 0;
        unsigned int ld_set = 0;

        // Only the pragma is guarded, so the braces and the declaration of
        // top also compile when OMP is not defined.
    #ifdef OMP
        #pragma omp parallel default(none) firstprivate(i,set_done,ld_set,main_time)
    #endif
        {
            // unsigned int set_done = 0;
            // unsigned int i = 0;
            // unsigned int ld_set = 0;
            Vaes_cipher_top* top = new Vaes_cipher_top; // this is the aes object that will do the enc
            top->rst = 1; // assert reset

            while (i < 65000)
            // #pragma omp parallel for ordered schedule(static)
            // for(i=0; (i<65000);i++)
            {
                if (main_time > 10) {
                    top->rst = 0; // Deassert reset
                }
                if ((main_time % 10) == 1) {
                    top->clk = 1; // Toggle clock (posedge)
                }
                if ((main_time % 10) == 6) {
                    top->clk = 0;
                    // setting DUT values
                    if (ld_set != 1 && main_time > 10) {
                        top->ld = 1;
                        //unsigned int rand_state = time(NULL) + 1337*omp_get_thread_num();
                        //unsigned int rnd[4];
                        //rnd[0] = rand_r(&rand_state);
                        //rnd[1] = rand_r(&rand_state);
                        //rnd[2] = rand_r(&rand_state);
                        //rnd[3] = rand_r(&rand_state);
                        top->key = {rand(),rand(),rand(),rand()};     // {0x00000000,0x000000000,0x00000000,0x00000000};
                        top->text_in = {rand(),rand(),rand(),rand()}; // {0x00000000,0x00000000,0x00000000,0x00000000};
                        //top -> key = {0x00000000,0x00000000,0x00000000,0x00000000};
                        //top -> text_in = {0x00000000,0x00000000,0x00000000,0x00000000};
                        ld_set++;
                    } else if (ld_set == 1 && main_time > 10) {
                        top->ld = 0;
                        set_done = 0;
                    }
                } // (main_time % 10) == 6

                top->eval(); // Evaluate model
                if (top->done == 1 && set_done == 0) {
                    // main_time is 64-bit, so it is cast and printed with %llu;
                    // every argument has a matching conversion in the format.
    #ifdef OMP
                    printf("Time=%2llu, key=%2x%2x%2x%2x,text_in=%2x%2x%2x%2x,text_out=%2x%2x%2x%2x, done=%2d on %2d of %2d\n",
                           (unsigned long long)main_time,
                           top->key[3],top->key[2],top->key[1],top->key[0],
                           top->text_in[3],top->text_in[2],top->text_in[1],top->text_in[0],
                           top->text_out[3],top->text_out[2],top->text_out[1],top->text_out[0],
                           top->done, omp_get_thread_num(), omp_get_num_threads());
    #else
                    printf("Time=%2llu, key=%2x%2x%2x%2x,text_in=%2x%2x%2x%2x,text_out=%2x%2x%2x%2x, done=%2d\n",
                           (unsigned long long)main_time,
                           top->key[3],top->key[2],top->key[1],top->key[0],
                           top->text_in[3],top->text_in[2],top->text_in[1],top->text_in[0],
                           top->text_out[3],top->text_out[2],top->text_out[1],top->text_out[0],
                           top->done);
    #endif
                    ld_set = 0; // reset the value
                    i++;
                    // printf("i=%2d\n",i);
                    set_done = 1;
                } // if (top->done)
                //#pragma omp barrier
                main_time++;
            } // end of while

            top->final(); // Done simulating
            delete top;
        } // pragma omp

        printf("\n Test Done\n");
        return 0;
    } // end of main
  • private(top) doesn't make the object pointed to by top private, but rather the pointer itself. Each thread ends up with an uninitialised private pointer, and any attempt to dereference such a pointer with -> ends up in a segmentation fault. How about you move the Vaes_cipher_top* top = new Vaes_cipher_top; statement inside the parallel loop? Some variables look like they too need the private treatment. Also, each thread executes the same while loop and the work is not distributed among the threads. Are you testing if Vaes_cipher_top is thread-safe? Commented Nov 20, 2013 at 23:18
  • Hi Hristo, thanks for your comment again. I am not testing whether Vaes_cipher_top is thread-safe; I just want to make it multithreaded. Secondly, I have coded the while loop as you see it so that each thread does the same work. Once each thread produces the correct result, I shall add random number generation for the key and plaintext so that each thread works on a separate plaintext and key. Right now things are not working. Please see the modified code above. Commented Nov 21, 2013 at 2:08
  • You moved only the new statement inside the parallel region but left the definition of top outside and it is still shared. Commented Nov 21, 2013 at 7:44
  • You get scoping errors like ./sim_main.cpp:76: error: ‘top’ was not declared in this scope. See above for updated code and error messages Commented Nov 21, 2013 at 11:04
  • Hristo, just to let you know. I shall acknowledge your contribution in helping me out in the source code. Also this code will be made opensource for anyone to use. Commented Nov 21, 2013 at 11:12
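
The point made in the comments above, that private(top) privatises only the pointer and that the declaration has to live inside the parallel region, can be sketched with a stand-in class. This is a minimal sketch under stated assumptions, not the Verilator API: Model, run_private_models, and the cycle counting are hypothetical names introduced only for illustration. Declaring the pointer inside the region gives every thread its own pointer and its own object.

```cpp
#include <cassert>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical stand-in for the generated Vaes_cipher_top class.
struct Model {
    unsigned clk = 0;
    unsigned cycles = 0;
    void eval() { if (clk) ++cycles; }  // count evaluations with clk high
};

// Toggles the clock `steps` times on one private Model per thread and
// returns the cycle count recorded by each thread.
std::vector<unsigned> run_private_models(int num_threads, unsigned steps) {
    std::vector<unsigned> cycles(num_threads, 0);
    #pragma omp parallel num_threads(num_threads)
    {
        int tid = 0;  // stays 0 when compiled without OpenMP
#ifdef _OPENMP
        tid = omp_get_thread_num();
#endif
        // Declared inside the region: private pointer AND private object,
        // so no thread dereferences an uninitialised pointer.
        Model* top = new Model;
        for (unsigned t = 0; t < steps; ++t) {
            top->clk = t % 2;
            top->eval();
        }
        cycles[tid] = top->cycles;  // each thread writes its own slot
        delete top;
    }
    return cycles;
}
```

Calling run_private_models(2, 10) should give every participating thread the same count, because no state is shared between the model instances.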

1 Answer

You almost got it right, but Vaes_cipher_top* top; should be moved before the if statement, e.g. immediately after the opening brace of the while loop. Also in this case top would not be visible outside the parallel region. I would suggest that you restructure the code as follows:

    int main(int argc, char **argv, char **env) {
        Verilated::commandArgs(argc, argv);

        #pragma omp parallel num_threads(2)
        {
            unsigned int i = 0;
            unsigned int set_done;
            unsigned int ld_set = 0;
            //top->rst = 1; // assert reset
            unsigned int iter_count = 1;
            Vaes_cipher_top* top = new Vaes_cipher_top; // this is the aes object that will do the enc

            while (i < 2) {
                if (iter_count) {
                    top = new Vaes_cipher_top;
                    iter_count = 0;
                }
                if (main_time == 0)
                    top->rst = 1; // assert reset
                if (main_time > 10) {
                    top->rst = 0; // Deassert reset
                }
                if ((main_time % 10) == 1) {
                    top->clk = 1; // Toggle clock (posedge)
                }
                if ((main_time % 10) == 6) {
                    top->clk = 0;
                    // setting DUT values
                    if (ld_set != 1 && main_time > 10) {
                        top->ld = 1;
                        top->key = {0x00000000,0x00000000,0x00000000,0x00000000};
                        top->text_in = {0x00000000,0x00000000,0x00000000,0x00000000};
                        ld_set++;
                    } else if (ld_set == 1 && main_time > 10) {
                        top->ld = 0;
                        set_done = 0;
                    }
                } // (main_time % 10) == 6

                top->eval(); // Evaluate model
                if (top->done && !set_done) {
                    print(top->key);
                    print(top->text_in);
                    print(top->text_out);
                    ld_set = 0; // reset
                    i++;
                    set_done = 1;
                    iter_count = 1;
                } // if (top->done)
                main_time++; // Time passes...
            } // end of while

            top->final(); // Done simulating
            delete top;
        }

        printf("\nTest Done\n");
        return 0;
    } // end of main

Though this has nothing to do with OpenMP, the above code leaks memory. top is being allocated anew each time when iter_count is non-zero, which happens when the if (top->done && !set_done) conditional is true and i is still equal to 0. Probably you should think about how to remedy that problem.
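
One possible way to remedy the leak is to let a std::unique_ptr own the model, so that replacing it destroys the previous instance. The following is a minimal sketch, not the original program: Model, live_models, and simulate_runs are hypothetical names, and the model here simply reports done after four evaluations. The same pattern should work inside the parallel region, since each thread would hold its own unique_ptr.

```cpp
#include <atomic>
#include <cassert>
#include <memory>

// Counts Model instances that are currently alive (illustration only).
static std::atomic<int> live_models{0};

// Hypothetical stand-in for Vaes_cipher_top.
struct Model {
    Model() { ++live_models; }
    ~Model() { --live_models; }
    Model(const Model&) = delete;
    Model& operator=(const Model&) = delete;

    unsigned ticks = 0;
    bool done = false;
    void eval() { if (++ticks == 4) done = true; }  // "finishes" after 4 evals
    void final() {}
};

// Runs `runs` simulations back to back with a fresh model for each one and
// returns how many completed.
int simulate_runs(int runs) {
    int completed = 0;
    std::unique_ptr<Model> top(new Model);
    while (completed < runs) {
        top->eval();
        if (top->done) {
            top->final();
            ++completed;
            // reset() deletes the old model once the new one is installed,
            // so nothing is leaked when the pointer is reassigned.
            top.reset(new Model);
        }
    }
    top->final();
    return completed;  // the last model is destroyed when top goes out of scope
}
```

After simulate_runs returns, live_models is back at zero, which is the property the raw new in the loop above does not have.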


9 Comments

Let me try that code. I should get back to you in an hour with an update. Thanks so much.
Hristo dude! It IS working. Hooray!!! First of all my heartfelt thanks to you and to the stackoverflow community for making such a marvelous and friendly platform available. I am going to put your name in the source code and make it opensource soon. I shall keep you posted. If it is possible for you to email me your contact, you can be posted about the open source code. Thanks again
I am noticing one thing in the simulation: it sometimes hangs. Also, when I re-run the simulation, the ciphertext appears at different times on every run. See above for an example. How can I remedy this?
The simulation hangs when I add randomization for the plaintext and key. If I don't add randomization and leave the key and plaintext at 0, as shown above in your solution, the simulation does not hang most of the time, but about once in 5 runs it still hangs. With a random plaintext and key, it hangs most of the time, about 8 in 10 runs.
Does it hang when run in serial mode, e.g. with num_threads(1) or with OpenMP disabled during compilation?
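
One plausible cause of the hangs, which is an assumption and not something confirmed in this thread, is that the restructured code still increments the global main_time from both threads, so a thread can miss the (main_time % 10) == 1 or == 6 phases it is waiting for. The question's final code lists main_time in firstprivate and has commented-out rand_r lines with a per-thread seed, which points the same way. A minimal sketch of both ideas follows; Result, run_threads, and the seed formula are hypothetical and only illustrate keeping the clock and the random generator private to each thread.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// What each thread reports back (illustration only).
struct Result {
    std::uint64_t end_time;  // final value of the thread's private clock
    std::uint32_t key;       // first random word drawn by the thread
};

// Every thread advances its own clock for `ticks` steps and draws one key
// word from its own generator.
std::vector<Result> run_threads(int num_threads, std::uint64_t ticks,
                                unsigned base_seed) {
    std::vector<Result> results(num_threads, Result{0, 0});
    #pragma omp parallel num_threads(num_threads)
    {
        int tid = 0;  // stays 0 when compiled without OpenMP
#ifdef _OPENMP
        tid = omp_get_thread_num();
#endif
        std::uint64_t local_time = 0;  // private clock, never shared
        // Private generator with a per-thread seed instead of the shared
        // hidden state behind rand().
        std::mt19937 rng(base_seed + 1337u * static_cast<unsigned>(tid));
        std::uint32_t key = static_cast<std::uint32_t>(rng());
        while (local_time < ticks) {
            ++local_time;  // no other thread can skip this thread's phases
        }
        results[tid] = Result{local_time, key};
    }
    return results;
}
```

With a private clock every thread sees each time step exactly once, and a fixed base_seed makes each thread's key reproducible from run to run.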