I have written two programs implementing a simple algorithm for matrix multiplication, one in C++ and one in Java. Contrary to my expectations, the Java program runs about 2.5x faster than the C++ program. I am a novice at C++, and would like suggestions on what I can change in the C++ program to make it run faster.
My programs borrow code and data from this blog post http://martin-thoma.com/matrix-multiplication-python-java-cpp .
Here are the current compilation flags I am using:
g++ -O3 main.cc javac Main.java Here are the current compiler/runtime versions:
$ g++ --version g++.exe (GCC) 4.8.1 Copyright (C) 2013 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. $ java -version java version "1.8.0_05" Java(TM) SE Runtime Environment (build 1.8.0_05-b13) Java HotSpot(TM) 64-Bit Server VM (build 25.5-b02, mixed mode) My computer is a ~2012 era core i3 laptop running windows with MinGW. Here are the current performance results:
$ time ./a.exe < ../Testing/2000.in 507584919 real 0m36.469s user 0m0.031s sys 0m0.030s $ time java Main < ../Testing/2000.in 507584919 real 0m14.299s user 0m0.031s sys 0m0.015s Here is the C++ program:
#include <iostream> #include <cstdio> using namespace std; int *A; int *B; int height; int width; int * matMult(int A[], int B[]) { int * C = new int[height*width]; int n = height; for (int i = 0; i < n; i++) { for (int k = 0; k < n; k++) { for (int j = 0; j < n; j++) { C[width*i+j]+=A[width*i+k] * B[width*k+j]; } } } return C; } int main() { std::ios::sync_with_stdio(false); cin >> height; cin >> width; A = new int[width*height]; B = new int[width*height]; for (int i = 0; i < width*height; i++) { cin >> A[i]; } for (int i = 0; i < width*height; i++) { cin >> B[i]; } int *result = matMult(A,B); cout << result[2]; } Here is the java program:
import java.util.*; import java.io.*; public class Main { static int[] A; static int[] B; static int height; static int width; public static void main(String[] args) { try { BufferedReader reader = new BufferedReader(new InputStreamReader(System.in)); height = Integer.parseInt(reader.readLine()); width = Integer.parseInt(reader.readLine()); A=new int[width*height]; B=new int[width*height]; int index = 0; String thisLine; while ((thisLine = reader.readLine()) != null) { if (thisLine.trim().equals("")) { break; } else { String[] lineArray = thisLine.split("\t"); for (String number : lineArray) { A[index] = Integer.parseInt(number); index++; } } } index = 0; while ((thisLine = reader.readLine()) != null) { if (thisLine.trim().equals("")) { break; } else { String[] lineArray = thisLine.split("\t"); for (String number : lineArray) { B[index] = Integer.parseInt(number); index++; } } } int[] result = matMult(A,B); System.out.println(result[2]); reader.close(); } catch (Exception e) { e.printStackTrace(); } } public static int[] matMult(int[] A, int[] B) { int[] C = new int[height*width]; int n = height; for (int i = 0; i < n; i++) { for (int k = 0; k < n; k++) { for (int j = 0; j < n; j++) { C[width*i+j]+=A[width*i+k] * B[width*k+j]; } } } return C; } } Here is a link to a 2000x2000 test case: https://mega.nz/#!sglWxZqb!HBts_UlZnR4X9gZR7bG-ej3xf2A5vUv0wTDUW-kqFMA
Here is a link to a 2x2 test case: https://mega.nz/#!QwkV2SII!AtfGuxPV5bQeZtt9eHNNn36rnV4sGq0_sJzitjiFE8s
Any advice explaining what I am doing wrong in C++, or why my C++ implementation is running so much slower than Java here, would be much appreciated!
EDIT: As suggested, I modified the programs so that they do not actually perform a multiplication, but just read the arrays in and print out one number from each. Here are the performance results for that. The C++ program has slower IO. That only accounts for part of the difference however.
$ time ./IOonly.exe < ../Testing/2000.in 7 944 real 0m8.158s user 0m0.000s sys 0m0.046s $ time java IOOnly < ../Testing/2000.in 7 944 real 0m1.461s user 0m0.000s sys 0m0.047s