What is the proper way to hash the contents of a file in C? I'm not trying to hash the file as a whole, but rather each line of the file separately. My main goal is to create a program that searches for hash collisions. I've written the program in C, but it uses system() to hash each line with both MD5 and SHA256. I understand that using system() is unsafe and not the proper way to do this, so I'm reaching out to the community for the proper way to hash with MD5 and SHA256.
1 Answer
Use the OpenSSL C APIs:
    #include <openssl/md5.h>
    #include <openssl/sha.h>
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        unsigned char sha256_digest[SHA256_DIGEST_LENGTH];
        unsigned char md5_digest[MD5_DIGEST_LENGTH];
        const char *buffer = "Hello World!";
        size_t len = strlen(buffer);
        int i;

        /* One-shot digests of the same buffer. */
        SHA256((const unsigned char *)buffer, len, sha256_digest);
        MD5((const unsigned char *)buffer, len, md5_digest);

        /* Print each digest as lowercase hex, one per line. */
        for (i = 0; i < SHA256_DIGEST_LENGTH; i++) {
            printf("%02x", sha256_digest[i]);
        }
        printf("\n");
        for (i = 0; i < MD5_DIGEST_LENGTH; i++) {
            printf("%02x", md5_digest[i]);
        }
        printf("\n");
        return 0;
    }

To compile this code, you need to link it against the crypto library:
    gcc testmd5.c -lcrypto

Once you execute it, you will get this output:
    7f83b1657ff1fc53b92dc18148a1d65dfc2d4b1fa3d677284addd200126d9069
    ed076287532e86365e841e92bfc50d8c

1 Comment
Dickens A S
fixed the code, thanks for the review, also fixed the output
md5 and sha256. Two good general hash table links are "Coding up a Hash Table" and "Hash tables - eternally confuzzled". Essentially, you want to read and hash each line. A good test for collisions is the /usr/share/dict/words file, which will provide between 100,000 and 300,000 words (one per line).