Inner product space of arrays with posix threads

Question

I want to read two tables A and B from the user, compute their inner product (a1b1 + a2b2 + … + anbn), save it in local_sum, and then add it to a total_sum variable. I am using the code below, but it produces a segmentation fault. For some reason tables A and B can't be passed to the function mul. Any help would be great, thank you!

    #include <stdio.h>
    #include <pthread.h>
    #include <stdlib.h>

    #define N 2

    int p;
    int A[N],B[N];
    int local_sum;

    void *mul(void *arg)
    {
        int lines, start, end, i, j;
        int id = *(int*)arg;

        lines = N / p;
        start = id * lines;
        end = start + lines;

        for (i = start; i < end; i++)
            local_sum = A[i] * B[i] + local_sum;

        return NULL;
    }

    int main (int argc, char *argv[])
    {
        int i;
        pthread_t *tid;

        if (argc != 2)
        {
            printf("Provide number of threads.\n");
            exit(1);
        }

        p = atoi(argv[1]);

        tid = (pthread_t *)malloc(p * sizeof(pthread_t));
        if (tid == NULL)
        {
            printf("Could not allocate memory.\n");
            exit(1);
        }

        printf("Give Table A\n");
        for (int i = 0; i < N; i++)
        {
            scanf("%d", &A[i]);
        }

        printf("Give Table B\n");
        for (int i = 0; i < N; i++)
        {
            scanf("%d", &B[i]);
        }

        for (i = 0; i < p; i++)
        {
            int *a;
            a = malloc(sizeof(int));
            *a = 0;
            pthread_create(&tid[i], NULL, mul, a);
        }

        for (i = 0; i < p; i++)
            pthread_join(tid[i], NULL);

        printf("%d", local_sum);

        return 0;
    }

Michael Beer · Accepted Answer · 2018-05-28 01:18:19Z

Let's see:

You want to have p threads, working on the vectors A and B.

Be aware that threads share the same memory and may be interrupted at any time.

You've got p threads, all trying to write to the one shared variable local_sum. This leads to unpredictable results, since one thread can overwrite the value another thread has just written there. On top of that, every thread is handed *a = 0 as its id, so all of them work on the same slice of A and B.

You can avoid the first problem either by giving one thread at a time exclusive access to this variable, using a mutex or the like, or by giving each thread its own variable: each thread produces an intermediate result, and after joining all threads you combine the intermediate results into the final one.

To do the latter, your main should look something like this (assuming your compiler supports C99 or later):

    #include <stdio.h>
    #include <pthread.h>
    #include <stdlib.h>

    #define N 2

    /* these are variables shared amongst all threads */
    int p;
    int A[N], B[N];

    /* array with one slot per thread to receive the partial result of each thread */
    int* partial_sum;

    /* prototype of thread function, just to be independent of the place mul
       will be placed in the source file... */
    void *mul(void *arg);

    int main (int argc, char** argv)
    {
        pthread_t* tid;

        /* argv[1] must exist and hold a positive thread count */
        if (argc != 2 || (p = atoi(argv[1])) < 1)
        {
            fprintf(stderr, "Provide a positive number of threads.\n");
            exit(EXIT_FAILURE);
        }

        const int n_by_p = N / p;
        if (n_by_p * p != N)
        {
            fprintf(stderr, "Number of threads must be an integral factor of N\n");
            exit(EXIT_FAILURE);
        }

        tid = calloc(p, sizeof(pthread_t));
        partial_sum = calloc(p, sizeof(int));

        printf("Give Table A\n");
        for (size_t i = 0; i < N; ++i)
        {
            scanf("%d", &A[i]);
        }

        printf("Give Table B\n");
        for (size_t i = 0; i < N; ++i)
        {
            scanf("%d", &B[i]);
        }

        for (int i = 0; i < p; ++i)
        {
            /* clumsy way to pass a thread its slot number, but works as a starter... */
            int *a;
            a = malloc(sizeof(int));
            *a = i;
            pthread_create(&tid[i], 0, mul, a);
        }

        for (int i = 0; i < p; ++i)
        {
            pthread_join(tid[i], 0);
        }

        free(tid);
        tid = 0;

        /* all threads are done, so the slots can be read without locking */
        int total_sum = 0;
        for (int i = 0; i < p; ++i)
        {
            total_sum += partial_sum[i];
        }

        free(partial_sum);
        partial_sum = 0;

        printf("%d", total_sum);

        return EXIT_SUCCESS;
    }

Your thread function mul should now write only to its own partial_sum slot:

    void *mul(void *arg)
    {
        int slot_num = *(int*)arg;
        free(arg);
        arg = 0;

        const size_t lines = N/p;
        const size_t start = slot_num * lines;
        const size_t end = start + lines;

        partial_sum[slot_num] = 0;

        for (size_t i = start; i < end; ++i)
        {
            partial_sum[slot_num] += A[i]*B[i];
        }

        return 0;
    }

Beware: this code only works correctly if N is an integral multiple of p. If this condition is not met, then due to truncation in N/p not all elements of the vectors would be processed. However, fixing these cases is not the core of this question IMHO.
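For completeness, the truncation case can be handled by handing the leftover N % p elements to the first threads, one each. The following is only a sketch under that assumption; chunk_bounds is a hypothetical helper name and not part of the code above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of the answer's code): computes the
 * half-open index range [*start, *end) that thread `slot` out of `p`
 * threads should process when there are `n` elements in total. */
void chunk_bounds(size_t n, size_t p, size_t slot, size_t *start, size_t *end)
{
    assert(p > 0 && slot < p);

    const size_t base  = n / p;   /* elements every thread gets */
    const size_t extra = n % p;   /* leftover elements          */

    /* The first `extra` threads take one additional element each,
     * so earlier slots shift the start by min(slot, extra). */
    *start = slot * base + (slot < extra ? slot : extra);
    *end   = *start + base + (slot < extra ? 1 : 0);
}
```

In mul, the three lines computing lines, start and end could then become a single call such as chunk_bounds(N, p, slot_num, &start, &end), and the divisibility check in main would no longer be required.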

I left out most error checking (for example on calloc, scanf and pthread_create), which you should add, should this code become part of some operational setup...
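As one example of such a check: atoi cannot report bad input, so the thread count could be parsed with strtol instead. This is a sketch only; parse_thread_count is a hypothetical name, and the policy of accepting only values from 1 to INT_MAX is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parses a positive thread count from `text`.
 * Returns 0 and stores the value in *out on success, -1 otherwise. */
int parse_thread_count(const char *text, int *out)
{
    assert(text != NULL && out != NULL);

    char *end = NULL;
    errno = 0;
    const long value = strtol(text, &end, 10);

    if (errno != 0 || end == text || *end != '\0')
        return -1;   /* overflow, no digits at all, or trailing junk */
    if (value < 1 || value > INT_MAX)
        return -1;   /* a thread count must be a positive int */

    *out = (int)value;
    return 0;
}
```

In main this would replace p = atoi(argv[1]) with a call like parse_thread_count(argv[1], &p), followed by an error message and exit if it returns -1.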

it works but I think the operation of local_sum is wrong... can you check it?
@laland sorry, what do you refer to? Do you mean your code works? My proposed solution does not contain a variable local_sum?
I mean your partial sum :P my fault... for example, if I give as input Table A(1,2) and B(1,2) with 2 processors, the result has to be 5, right?
Hmm, total_sum keeps the final result. If I run the program with p = 2, A = (1, 2), B = (1, 2), it produces the correct result 5. partial_sum is then an array of size 2, finally containing (1, 4).
oh ok, is there an alternative way with a mutex? If yes, can you suggest one?
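Regarding the mutex alternative asked about in the last comment, here is a minimal sketch. It is not taken from either answer; struct dot_task, dot_worker and dot_product_mutex are hypothetical names. Each thread first sums its slice into a private variable and takes the lock only once, to add that value to the shared total.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-thread argument block; all names are illustrative. */
struct dot_task {
    const int *a, *b;
    size_t start, end;       /* half-open slice [start, end)  */
    long *total;             /* shared accumulator            */
    pthread_mutex_t *lock;   /* protects *total               */
};

static void *dot_worker(void *arg)
{
    struct dot_task *t = arg;
    long local = 0;          /* private to this thread, needs no lock */

    for (size_t i = t->start; i < t->end; ++i)
        local += (long)t->a[i] * t->b[i];

    /* Only the update of the shared total is serialized. */
    pthread_mutex_lock(t->lock);
    *t->total += local;
    pthread_mutex_unlock(t->lock);
    return NULL;
}

/* Computes the inner product of a[0..n) and b[0..n) with p threads.
 * Returns 0 and stores the sum in *result on success, -1 on failure. */
int dot_product_mutex(const int *a, const int *b, size_t n, size_t p, long *result)
{
    assert(p > 0 && result != NULL);

    pthread_t *tid = calloc(p, sizeof *tid);
    struct dot_task *tasks = calloc(p, sizeof *tasks);
    pthread_mutex_t lock;
    if (tid == NULL || tasks == NULL || pthread_mutex_init(&lock, NULL) != 0) {
        free(tid);
        free(tasks);
        return -1;
    }

    long total = 0;
    size_t started = 0;
    int rc = 0;
    for (size_t i = 0; i < p; ++i) {
        /* Slice borders i*n/p also cover the case where p does not divide n. */
        tasks[i] = (struct dot_task){ a, b, i * n / p, (i + 1) * n / p, &total, &lock };
        if (pthread_create(&tid[i], NULL, dot_worker, &tasks[i]) != 0) {
            rc = -1;
            break;
        }
        ++started;
    }

    /* Join only the threads that were actually created. */
    for (size_t i = 0; i < started; ++i)
        pthread_join(tid[i], NULL);

    pthread_mutex_destroy(&lock);
    free(tid);
    free(tasks);

    if (rc == 0)
        *result = total;
    return rc;
}
```

Called as dot_product_mutex(A, B, N, p, &total_sum) after reading the tables, with total_sum declared as long, this would take the place of the partial_sum array. Locking inside the loop for every element would also be correct, but the threads would then spend most of their time waiting for each other. Compile with -pthread.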

purec · Accepted Answer · 2018-05-27 19:38:27Z

Replace

    if (tid=NULL)

with

    if (tid==NULL)

and instead of

    for (i=start;i<end;i++)

I suppose we need

    for (i=0;i<end-start;i++)
