84

How to get the MD5 hash of a file in C++?

3
  • 1
    @silky - not really a helpful comment :) ..implementing MD5 from scratch is a really good way to get exposure to cryptographic algorithms and protocols, and since it's "known", you can instantly verify your code is right vs md5sum or similar Commented Oct 10, 2009 at 2:18
  • 1
    @Noon Silk I think for the purpose here of making a unique signature for a file md5 should be adequate. Commented Oct 17, 2010 at 18:14
  • @Noon Silk, with long recursive checks sha1 would be too slow! Commented Jul 26, 2011 at 10:27

8 Answers

56

Here's a straightforward implementation of the md5sum command that computes and displays the MD5 of the file specified on the command line. It needs to be linked against the OpenSSL library to work (gcc md5.c -o md5 -lssl; the MD5 functions live in libcrypto, so on some systems you need -lcrypto instead). It's pure C, but you should be able to adapt it to your C++ application easily enough.

    #include <sys/types.h>
    #include <sys/stat.h>
    #include <sys/mman.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <openssl/md5.h>

    unsigned char result[MD5_DIGEST_LENGTH];

    // Print the MD5 sum as hex-digits.
    void print_md5_sum(unsigned char* md) {
        int i;
        for(i=0; i <MD5_DIGEST_LENGTH; i++) {
            printf("%02x",md[i]);
        }
    }

    // Get the size of the file by its file descriptor
    unsigned long get_size_by_fd(int fd) {
        struct stat statbuf;
        if(fstat(fd, &statbuf) < 0) exit(-1);
        return statbuf.st_size;
    }

    int main(int argc, char *argv[]) {
        int file_descript;
        unsigned long file_size;
        char* file_buffer;

        if(argc != 2) {
            printf("Must specify the file\n");
            exit(-1);
        }
        printf("using file:\t%s\n", argv[1]);

        file_descript = open(argv[1], O_RDONLY);
        if(file_descript < 0) exit(-1);

        file_size = get_size_by_fd(file_descript);
        printf("file size:\t%lu\n", file_size);

        // Map the whole file read-only; mmap fails for an empty file (length 0)
        file_buffer = mmap(0, file_size, PROT_READ, MAP_SHARED, file_descript, 0);
        if(file_buffer == MAP_FAILED) exit(-1);

        MD5((unsigned char*) file_buffer, file_size, result);
        munmap(file_buffer, file_size);
        close(file_descript);

        print_md5_sum(result);
        printf("  %s\n", argv[1]);

        return 0;
    }

6 Comments

on 32bit platforms, your mmap has a limit as to how large the file can be, though it is an elegant solution to the problem. On 32bit Windows, for example, you couldn't MD5 a DVD with this code.
@ChrisKaminski you can slide 4GB window of memory-mapped file on 32-bit platform.
Excellent answer, it's helped me immensely. However, you don't call munmap afterward. It's no memory leak for you because the program ends immediately afterwards, but if some buffoon like myself copies the code and doesn't put in munmap, we get a memory leak in our program ;) The solution: munmap(file_buffer, file_size);
For me gcc md5.c -o md5 -lcrypto this worked instead of -lssl on Ubuntu 14.04
Depending on openssl - a huge and gnarly library - for something as simple as MD5 seems like a bad idea to me.
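
The sliding-window idea mentioned in the comments above can be sketched roughly as follows. This is only an illustration under some assumptions: the helper name for_each_mapped_window and its callback signature are invented for this example, it is POSIX-only, and on a 32-bit build it assumes off_t is 64 bits wide (for instance via -D_FILE_OFFSET_BITS=64). Each window is mapped, handed to the callback, and unmapped again, so only one window of address space is in use at a time.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: maps `path` one window at a time and passes each mapped
// region to `consume`, in file order. Returns false if the file cannot be
// opened, stat'ed or mapped, or if `window` is not a positive multiple of the
// page size (mmap offsets must be page-aligned).
bool for_each_mapped_window(
        const char* path, std::size_t window,
        const std::function<void(const unsigned char*, std::size_t)>& consume) {
    const long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || window == 0 || window % static_cast<std::size_t>(page) != 0)
        return false;

    const int fd = open(path, O_RDONLY);
    if (fd < 0) return false;

    struct stat st;
    bool ok = fstat(fd, &st) == 0;
    const std::uint64_t size = ok ? static_cast<std::uint64_t>(st.st_size) : 0;

    for (std::uint64_t off = 0; ok && off < size; off += window) {
        // The last window may be shorter than `window`.
        const std::size_t len = static_cast<std::size_t>(
            std::min<std::uint64_t>(window, size - off));
        void* p = mmap(nullptr, len, PROT_READ, MAP_PRIVATE, fd,
                       static_cast<off_t>(off));
        if (p == MAP_FAILED) { ok = false; break; }
        consume(static_cast<const unsigned char*>(p), len);
        munmap(p, len);
    }
    close(fd);
    return ok;
}
```

With OpenSSL, the callback would typically forward each region to MD5_Update on a context set up with MD5_Init, and MD5_Final would be called once the helper returns true.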
22

You can implement the MD5 algorithm yourself (examples are all over the web), or you can link against the OpenSSL libs and use OpenSSL's digest functions. Here's an example to get the MD5 of a byte array:

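    #include <openssl/md5.h>
    #include <QByteArray>

    QByteArray AESWrapper::md5(const QByteArray& data) {
        // Hash into a local buffer: passing NULL as the last argument makes
        // MD5() write to a shared static array, which is not thread-safe.
        unsigned char hash[MD5_DIGEST_LENGTH];
        MD5((const unsigned char*)data.constData(), data.length(), hash);
        return QByteArray((const char*)hash, MD5_DIGEST_LENGTH);
    }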
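
For the first option (implementing MD5 yourself), a compact sketch following the description in RFC 1321 might look like the code below. The function name md5_hex is made up for this example and is not part of OpenSSL, Qt or any other library mentioned here. It hashes one in-memory buffer, has no streaming interface, and is written for readability rather than speed, so treat it as a learning aid.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Illustrative MD5 of an in-memory buffer, returned as 32 lowercase hex digits.
std::string md5_hex(const std::string& input) {
    // Per-step left-rotation amounts (four rounds of 16 steps).
    static const int shift[64] = {
        7, 12, 17, 22, 7, 12, 17, 22, 7, 12, 17, 22, 7, 12, 17, 22,
        5,  9, 14, 20, 5,  9, 14, 20, 5,  9, 14, 20, 5,  9, 14, 20,
        4, 11, 16, 23, 4, 11, 16, 23, 4, 11, 16, 23, 4, 11, 16, 23,
        6, 10, 15, 21, 6, 10, 15, 21, 6, 10, 15, 21, 6, 10, 15, 21};

    // Step constants: floor(2^32 * |sin(i + 1)|), computed instead of hard-coded.
    std::uint32_t k[64];
    for (int i = 0; i < 64; ++i)
        k[i] = static_cast<std::uint32_t>(std::fabs(std::sin(i + 1.0)) * 4294967296.0);

    std::uint32_t a0 = 0x67452301, b0 = 0xefcdab89, c0 = 0x98badcfe, d0 = 0x10325476;

    // Padding: a 0x80 byte, zeros up to 56 mod 64, then the bit length
    // as a 64-bit little-endian integer.
    std::vector<unsigned char> msg(input.begin(), input.end());
    const std::uint64_t bit_len = static_cast<std::uint64_t>(input.size()) * 8;
    msg.push_back(0x80);
    while (msg.size() % 64 != 56) msg.push_back(0);
    for (int i = 0; i < 8; ++i)
        msg.push_back(static_cast<unsigned char>(bit_len >> (8 * i)));

    for (std::size_t off = 0; off < msg.size(); off += 64) {
        // Decode the 64-byte block into sixteen little-endian 32-bit words.
        std::uint32_t m[16];
        for (int i = 0; i < 16; ++i) {
            const unsigned char* p = &msg[off + 4 * i];
            m[i] = static_cast<std::uint32_t>(p[0]) |
                   (static_cast<std::uint32_t>(p[1]) << 8) |
                   (static_cast<std::uint32_t>(p[2]) << 16) |
                   (static_cast<std::uint32_t>(p[3]) << 24);
        }

        std::uint32_t a = a0, b = b0, c = c0, d = d0;
        for (int i = 0; i < 64; ++i) {
            std::uint32_t f;
            int g;
            if (i < 16)      { f = (b & c) | (~b & d); g = i; }
            else if (i < 32) { f = (d & b) | (~d & c); g = (5 * i + 1) % 16; }
            else if (i < 48) { f = b ^ c ^ d;          g = (3 * i + 5) % 16; }
            else             { f = c ^ (b | ~d);       g = (7 * i) % 16; }
            f = f + a + k[i] + m[g];
            a = d;
            d = c;
            c = b;
            b = b + ((f << shift[i]) | (f >> (32 - shift[i])));
        }
        a0 += a; b0 += b; c0 += c; d0 += d;
    }

    // The digest is the four state words written out in little-endian byte order.
    const std::uint32_t state[4] = {a0, b0, c0, d0};
    char out[33];
    for (int i = 0; i < 16; ++i)
        std::snprintf(out + 2 * i, 3, "%02x",
                      static_cast<unsigned>((state[i / 4] >> (8 * (i % 4))) & 0xff));
    return std::string(out, 32);
}
```

A quick sanity check is to compare against published values: md5_hex("") should be d41d8cd98f00b204e9800998ecf8427e, and md5_hex("The quick brown fox jumps over the lazy dog") should be 9e107d9d372bb6826bd81d3542a419d6, which is also what md5sum reports for the same bytes.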

8 Comments

when using Qt (as you do), i would rather just do return QCryptographicHash::hash(data, QCryptographicHash::Md5); as the body of the function...
When it comes to security-related stuff, never write your own implementation if the stuff out there on the net will suffice. And every single possible implementation of MD4/5 is out there, so there's really no reason to write your own.
@MahmoudAl-Qudsi Um yes there is, my professor doesn't let me plagiarize code.
@MahmoudAl-Qudsi When it comes to security-related stuff, never use MD5. MD5 is not a crypto-strength hash.
@uliwitness md5 was not my idea. It's OK to treat MD5 as a middling-fast non-crypto hash, but I agree that it is utterly broken as a crypto hash (and there are far better in terms of speed and hashing for non-crypto hashes).
16

For anyone redirected from "https://stackoverflow.com/questions/4393017/md5-implementation-in-c" because it's been incorrectly labelled a duplicate.

The example located here works:

http://www.zedwood.com/article/cpp-md5-function

If you are compiling in VC++2010 then you will need to change his main.cpp to this:

    #include <iostream> //for std::cout
    #include <string>   //for std::string
    #include "MD5.h"

    using std::cout;
    using std::endl;

    int main(int argc, char *argv[]) {
        std::string Temp = md5("The quick brown fox jumps over the lazy dog");
        cout << Temp.c_str() << endl;
        return 0;
    }

To answer the question on this page, you will have to change the MD5 class slightly so that it reads a char * array instead of a string.

EDIT:

Apparently it wasn't clear how to modify the MD5 library, so a full VC++2010 solution that handles char * input is here for your convenience:

https://github.com/alm4096/MD5-Hash-Example-VS

A bit of an explanation is here:

    #include <iostream> //for std::cout
    #include <string>   //for std::string
    #include <fstream>
    #include "MD5.h"

    using std::cout;
    using std::endl;

    int main(int argc, char *argv[]) {
        //Start opening your file
        std::ifstream inBigArrayfile;
        inBigArrayfile.open ("Data.dat", std::ios::binary | std::ios::in);

        //Find length of file
        inBigArrayfile.seekg (0, std::ios::end);
        long Length = inBigArrayfile.tellg();
        inBigArrayfile.seekg (0, std::ios::beg);

        //read in the data from your file
        char * InFileData = new char[Length];
        inBigArrayfile.read(InFileData,Length);

        //Calculate MD5 hash
        std::string Temp = md5(InFileData,Length);
        cout << Temp.c_str() << endl;

        //Clean up
        delete [] InFileData;
        return 0;
    }

I have simply added the following into the MD5 library:

MD5.cpp:

    MD5::MD5(char * Input, long length) {
        init();
        update(Input, length);
        finalize();
    }

MD5.h:

std::string md5(char * Input, long length); 

5 Comments

That is for a string, not a file
Answer modified to include a file
some of your links are broken
Can you please update the VC++2010 solution link.
links updated to a Git
12

I needed to do this just now and required a cross-platform solution that was suitable for c++11, boost and openssl. I took D'Nabre's solution as a starting point and boiled it down to the following:

    #include <openssl/md5.h>
    #include <iomanip>
    #include <sstream>
    #include <boost/iostreams/device/mapped_file.hpp>

    const std::string md5_from_file(const std::string& path) {
        unsigned char result[MD5_DIGEST_LENGTH];
        boost::iostreams::mapped_file_source src(path);
        MD5((unsigned char*)src.data(), src.size(), result);

        std::ostringstream sout;
        sout<<std::hex<<std::setfill('0');
        for(auto c: result) sout<<std::setw(2)<<(int)c;

        return sout.str();
    }

A quick test executable demonstrates:

    #include <cstdlib>   // for exit
    #include <iostream>

    int main(int argc, char *argv[]) {
        if(argc != 2) {
            std::cerr<<"Must specify the file\n";
            exit(-1);
        }
        std::cout<<md5_from_file(argv[1])<<"  "<<argv[1]<<std::endl;
        return 0;
    }

Some linking notes:

Linux: -lcrypto -lboost_iostreams
Windows: -DBOOST_ALL_DYN_LINK libeay32.lib ssleay32.lib

1 Comment

thank you. if(!exists(boost::filesystem::path(path))) {
10
    QFile file("bigimage.jpg");

    if (file.open(QIODevice::ReadOnly)) {
        QByteArray fileData = file.readAll();

        QByteArray hashData = QCryptographicHash::hash(fileData,QCryptographicHash::Md5); // or QCryptographicHash::Sha1
        qDebug() << hashData.toHex(); // 0e0c2180dfd784dd84423b00af86e2fc
    }

1 Comment

Not so good for files that are GB in size :)
7

md5.h also has the MD5_* functions, which are very useful for big files because they hash the data one chunk at a time:

    #include <openssl/md5.h>
    #include <fstream>
    .......

    std::ifstream file(filename, std::ifstream::binary);
    MD5_CTX md5Context;
    MD5_Init(&md5Context);
    char buf[1024 * 16];
    while (file.good()) {
        file.read(buf, sizeof(buf));
        MD5_Update(&md5Context, buf, file.gcount());
    }
    unsigned char result[MD5_DIGEST_LENGTH];
    MD5_Final(result, &md5Context);

Very simple, isn't it? Conversion to a string is also very simple:

    #include <sstream>
    #include <iomanip>
    .......

    std::stringstream md5string;
    md5string << std::hex << std::uppercase << std::setfill('0');
    for (const auto &byte: result)
        md5string << std::setw(2) << (int)byte;

    return md5string.str();

1 Comment

worked fine for me!
2

Using Crypto++, you could do the following:

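    #include <sha.h>      // Crypto++; the path may be <cryptopp/sha.h> on your system
    #include <fstream>
    #include <iomanip>
    #include <iostream>

    // f is assumed to be a std::ifstream already opened in binary mode
    CryptoPP::SHA256 sha;
    char buff[4096];
    // keep going while a read returns data; the last chunk may be short
    while ( f.read(buff, sizeof(buff)) || f.gcount() > 0 ) {
        sha.Update(reinterpret_cast<const unsigned char*>(buff), f.gcount());
    }
    unsigned char hash[CryptoPP::SHA256::DIGESTSIZE];
    sha.Final(hash);
    // the digest is raw bytes, so print it as hex
    for (unsigned char b : hash)
        std::cout << std::hex << std::setw(2) << std::setfill('0') << (int)b;
    std::cout << std::endl;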

I have a need for something very similar, because I can't read in multi-gigabyte files just to compute a hash. In theory I could memory map them, but I have to support 32bit platforms - that's still problematic for large files.
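
The chunked reading this relies on does not depend on any particular hash library. A minimal standard-library sketch is shown below; the helper name for_each_chunk and its callback signature are invented for this example, and the callback is where a call such as sha.Update or MD5_Update would go.

```cpp
#include <cstddef>
#include <fstream>
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper: reads `path` in pieces of at most `chunk_size` bytes
// and hands each piece to `consume`, so memory use stays bounded no matter
// how large the file is. Returns true only if the whole file was read.
bool for_each_chunk(const std::string& path, std::size_t chunk_size,
                    const std::function<void(const char*, std::size_t)>& consume) {
    std::ifstream in(path, std::ios::binary);
    if (!in || chunk_size == 0) return false;

    std::vector<char> buf(chunk_size);
    // read() fails on the final short chunk, but gcount() still reports how
    // many bytes arrived, so that last piece is processed as well.
    while (in.read(buf.data(), static_cast<std::streamsize>(buf.size())) ||
           in.gcount() > 0) {
        consume(buf.data(), static_cast<std::size_t>(in.gcount()));
    }
    return in.eof();  // false if the loop stopped on an I/O error instead of end of file
}
```

Checking gcount() after every read avoids the classic while (!f.eof()) mistake, where the loop body runs once more after the last byte has already been consumed.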

1 Comment

Just to note: sha256 != md5
2

A rework of the implementation by @D'Nabre for C++. Don't forget to compile with -lcrypto at the end: g++ md5.cpp -o md5 -lcrypto.

    #include <iostream>
    #include <iomanip>
    #include <fstream>
    #include <string>
    #include <openssl/md5.h>

    using namespace std;

    unsigned char result[MD5_DIGEST_LENGTH];

    // function to print MD5 correctly
    void printMD5(unsigned char* md, long size = MD5_DIGEST_LENGTH) {
        for (int i=0; i<size; i++) {
            cout<< hex << setw(2) << setfill('0') << (int) md[i];
        }
    }

    int main(int argc, char *argv[]) {

        if(argc != 2) {
            cout << "Specify the file..." << endl;
            return 0;
        }

        ifstream::pos_type fileSize;
        char * memBlock;

        // open in binary mode, positioned at the end so tellg() gives the size
        ifstream file (argv[1], ios::binary | ios::ate);

        //check if opened
        if (file.is_open() ) { cout<< "Using file\t"<< argv[1]<<endl; }
        else {
            cout<< "Unable to open\t"<< argv[1]<<endl;
            return 0;
        }

        //get file size & copy file to memory
        //~ file.seekg(-1,ios::end); // excludes EOF
        fileSize = file.tellg();
        cout << "File size \t"<< fileSize << endl;
        memBlock = new char[fileSize];
        file.seekg(0,ios::beg);
        file.read(memBlock, fileSize);
        file.close();

        //get md5 sum
        MD5((unsigned char*) memBlock, fileSize, result);
        delete[] memBlock;   // the buffer is no longer needed once hashed

        //~ cout << "MD5_DIGEST_LENGTH = "<< MD5_DIGEST_LENGTH << endl;

        printMD5(result);
        cout<<endl;

        return 0;
    }
