Encrypting/decrypting words in C

Question

I came across this piece of C code that makes use of what looks like a decryption function, which I've put below. I'm new to encryption, so how does one go about comprehending what happens inside?

    char* decrypt(char* code) {
        int hash = ((strlen(code) - 3) / 3) + 2;
        char* decrypt = malloc(hash);
        char* toFree = decrypt;
        char* word = code;
        for (int ch = *code; ch != '\0'; ch = *(++code)) {
            if ((code - word + 2) % 3 == 1) {
                *(decrypt++) = ch - (word - code + 1) - hash;
            }
        }
        *decrypt = '\0';
        return toFree;
    }

These are the encrypted words that the function takes in

    char values[WORDS][WORDLEN] = {
        "N~mqOlJ^tZletXodeYgs",
        "gCnDIfFQe^CdP^^B{hZpeLA^hv",
        "7urtrtwQv{dt`>^}FaR]i]XUug^GI",
        "aSwfXsxOsWAlXScVQmjAWJG",
        "cruD=idduvUdr=gmcauCmg]",
        "BQt`zncypFVjvIaTl]u=_?Aa}F",
        "iLvkKdT`yu~mWj[^gcO|",
        "jSiLyzJ=vPmnv^`N]^>ViAC^z_",
        "xo|RqqhO|nNstjmzfiuoiFfhwtdh~",
        "OHkttvxdp|[nnW]Drgaomdq"
    };

How do I go about making an encryption function which converts normal words into their encrypted version?

Encryption and hashing are two different cryptographic operations. Encryption algorithms such as AES typically convert plaintext into encrypted bytes of similar length and can be reversed. Hashing algorithms like SHA or MD5 convert plaintext into a small fixed-size set of bytes and are typically one-way, irreversible operations. I've updated your question with the appropriate terminology.
– John Kugelman, Commented Nov 20, 2020 at 2:36
This looks like a decryption function. There should be a matching encryption function that encodes data using some algorithm, and this one should be doing the opposite to restore it. If you compare it with the encrypt function, you can see this.
– user13387119, Commented Nov 20, 2020 at 3:05
Note that there's what might be regarded as a 'steganographic' component to the encoding algorithm: only the 3rd, 6th, 9th, … 3Nth characters (counting from 1) of the encrypted data are used in the decryption; the others are ignored. That means that the encrypting process can use any values at all for the other characters. The obfuscation for the characters that are not ignored is hardly complex. Unfortunately, all the messages are so short that they avoid a number of problems waiting to cause trouble for long strings. The variable name hash is misleading; it is a string length.
– Jonathan Leffler, Commented Nov 20, 2020 at 3:40
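The long-string problem mentioned in the last comment can be made concrete with a rough sketch. Rearranging the assignment in decrypt, the character stored at index i for a plaintext character p has to be p - i + 1 + hash. With i = 3d + 2 for the d-th plaintext character and hash = m + 1 for an m-character word, that is p + m - 3d. The helper names below are hypothetical and not part of the original code.

```c
/* Value an encoder must store for plaintext character p at position d
 * (counting from 0) in a word of m characters. Derived by inverting
 * "ch - (word - code + 1) - hash" from the question; an assumption-level
 * sketch, not code from the original program. */
int encoded_value(int p, int d, int m)
{
    return p + m - 3 * d;
}

/* Returns 1 if v is a printable, non-space ASCII code (33..126). */
int is_printable_code(int v)
{
    return v >= 33 && v <= 126;
}
```

For the six-letter words in the question every stored value stays printable, but in a 100-character word a 'b' at position 66 would be stored as 0, which would end the C string early, and a 'z' at position 0 of a six-letter word already lands outside the printable range.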

Jerry Jeremiah · Accepted Answer · 2020-11-22 21:54:51Z

The problem is that the decrypt function is written to make it hard to understand. Here is a simpler but equivalent version (see the edit below for details):

    char* decrypt(char* code) {
        int length = strlen(code) / 3 + 1;
        char* decrypt = malloc(length);
        int d = 0;
        for (int c = 2; c < strlen(code); c += 3) {
            decrypt[d++] = code[c] + c - length - 1;
        }
        decrypt[d] = '\0';
        return decrypt;
    }

As you can see, only every third character matters. For each of them, we add its index, then subtract length (the decoded length plus one for the terminator) and one more. So to encrypt a string using the same method:

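    char *encrypt(char* decrypt) {
        int length = strlen(decrypt) + 1;
        char* code = malloc(length * 3);
        int c = -1;
        for (int d = 0; d < length; d++) {
            code[++c] = '*'; // or rand()%94+33;
            code[++c] = '*'; // or rand()%94+33;
            ++c; // incremented on its own line: using ++c and c in one expression is undefined behavior
            code[c] = decrypt[d] - c + length + 1;
        }
        code[c] = '\0'; // replaces the last character written with the terminator
        return code;
    }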

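We fill in every third character, subtracting its index and adding the length so that the arithmetic in decrypt cancels out; the other two characters of each group are filler. The loop runs one extra time for the terminator, so the output ends with two filler characters and strlen(code) / 3 + 1 in decrypt matches length here.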
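As a hedged illustration of the same idea, here is a minimal sketch that uses random printable filler (the rand()%94+33 suggestion from the comments in the code) and pairs it with the simplified decoder so the round trip can be checked. The names filler, encrypt_random and decrypt_simple are hypothetical; this is one way to write it, not the only one.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: a printable filler character in '!'..'~'. */
static char filler(void)
{
    return (char)(rand() % 94 + 33);
}

/* Sketch of an encoder with random filler. The caller frees the result. */
char *encrypt_random(const char *plain)
{
    size_t m = strlen(plain);
    int length = (int)m + 1;          /* the value decrypt will derive */
    char *code = malloc(3 * m + 3);   /* 3m + 2 characters plus terminator */
    if (code == NULL)
        return NULL;

    size_t c = 0;
    for (size_t d = 0; d < m; d++) {
        code[c++] = filler();
        code[c++] = filler();
        /* c is now the index of the significant character: 2, 5, 8, ... */
        code[c] = (char)(plain[d] - (int)c + length + 1);
        c++;
    }
    /* Two trailing fillers keep strlen(code) / 3 + 1 equal to length. */
    code[c++] = filler();
    code[c++] = filler();
    code[c] = '\0';
    return code;
}

/* The simplified decoder from this answer, restated with const input. */
char *decrypt_simple(const char *code)
{
    int n = (int)strlen(code);
    int length = n / 3 + 1;
    char *out = malloc((size_t)length);
    if (out == NULL)
        return NULL;

    int d = 0;
    for (int c = 2; c < n; c += 3)
        out[d++] = (char)(code[c] + c - length - 1);
    out[d] = '\0';
    return out;
}
```

Calling decrypt_simple(encrypt_random("github")) should give back "github", and the encoded string is 20 characters long, the same length as the first entry in values.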

Here is an entire sample program:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    char* original_decrypt(char* code) {
        int hash = ((strlen(code) - 3) / 3) + 2;
        char* decrypt = malloc(hash);
        char* toFree = decrypt;
        char* word = code;
        for (int ch = *code; ch != '\0'; ch = *(++code)) {
            if ((code - word + 2) % 3 == 1) {
                *(decrypt++) = ch - (word - code + 1) - hash;
            }
        }
        *decrypt = '\0';
        return toFree;
    }

    char* decrypt(char* code) {
        int length = strlen(code) / 3 + 1;
        char* decrypt = malloc(length);
        int d = 0;
        for (int c = 2; c < strlen(code); c += 3) {
            decrypt[d++] = code[c] + c - length - 1;
        }
        decrypt[d] = '\0';
        return decrypt;
    }

    char *encrypt(char* decrypt) {
        int length = strlen(decrypt) + 1;
        char* code = malloc(length * 3);
        int c = -1;
        for (int d = 0; d < length; d++) {
            code[++c] = '*'; // or rand()%94+33;
            code[++c] = '*'; // or rand()%94+33;
            ++c; // incremented on its own line: using ++c and c in one expression is undefined behavior
            code[c] = decrypt[d] - c + length + 1;
        }
        code[c] = '\0'; // replaces the last character written with the terminator
        return code;
    }

    char values[10][256] = {
        "N~mqOlJ^tZletXodeYgs",
        "gCnDIfFQe^CdP^^B{hZpeLA^hv",
        "7urtrtwQv{dt`>^}FaR]i]XUug^GI",
        "aSwfXsxOsWAlXScVQmjAWJG",
        "cruD=idduvUdr=gmcauCmg]",
        "BQt`zncypFVjvIaTl]u=_?Aa}F",
        "iLvkKdT`yu~mWj[^gcO|",
        "jSiLyzJ=vPmnv^`N]^>ViAC^z_",
        "xo|RqqhO|nNstjmzfiuoiFfhwtdh~",
        "OHkttvxdp|[nnW]Drgaomdq"
    };

    int main(void) {
        for (int i = 0; i < 10; i++) {
            char *decrypted = decrypt(values[i]);
            char *encrypted = encrypt(decrypted);
            printf("%-30s %-10s %-30s\n", values[i], decrypted, encrypted);
            free(decrypted);
            free(encrypted);
        }
        return 0;
    }

Try it here: https://onlinegdb.com/Bk7lonV9w

So, how did I transform the decrypt function in the original question into the equivalent function I used in my answer?

Here is the function we are interested in:

    char* decrypt(char* code) {
        int hash = ((strlen(code) - 3) / 3) + 2;
        char* decrypt = malloc(hash);
        char* toFree = decrypt;
        char* word = code;
        for (int ch = *code; ch != '\0'; ch = *(++code)) {
            if ((code - word + 2) % 3 == 1) {
                *(decrypt++) = ch - (word - code + 1) - hash;
            }
        }
        *decrypt = '\0';
        return toFree;
    }

#1 Looking at int hash = ((strlen(code) - 3) / 3) + 2; algebraically:

    H = (X-3)/3 + 2
    H = X/3 - 3/3 + 2
    H = X/3 - 1 + 2
    H = X/3 + 1

So that statement is equivalent to int hash = strlen(code) / 3 + 1;
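One caveat worth checking is that C uses integer division here. A small sketch, with hypothetical function names, suggests the two forms still agree for every length of 3 or more, because subtracting 3 before dividing by 3 lowers the quotient by exactly one. For lengths below 3 the subtraction wraps around, since strlen returns an unsigned size_t, so the original expression misbehaves there while the simplified one does not.

```c
#include <stddef.h>

/* The expression as written in the question, for a string of length n. */
size_t hash_original(size_t n)
{
    return ((n - 3) / 3) + 2;
}

/* The simplified form derived in step #1. */
size_t hash_simplified(size_t n)
{
    return n / 3 + 1;
}

/* Returns 1 if both forms agree for every length in [3, limit]. */
int forms_agree_up_to(size_t limit)
{
    for (size_t n = 3; n <= limit; n++) {
        if (hash_original(n) != hash_simplified(n))
            return 0;
    }
    return 1;
}
```

All of the sample words are far longer than 3 characters, so the difference for very short inputs does not show up with the data in the question.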

#2 hash is a misleading name for the variable because it actually holds the length of the decoded string (plus one for the terminator), so we should write int length = strlen(code) / 3 + 1; and char* decrypt = malloc(length);

#3 The variable toFree is initialized to point to the start of the allocated string because the code needs to return that string, but the variable decrypt is incremented inside the loop, so a second variable is needed. If we switch to array indexing, we don't need toFree at all, because we increment an index (which I will call d) instead of the pointer. So *decrypt becomes decrypt[d], char* toFree = decrypt; becomes int d = 0; and the function just returns the variable decrypt.

#4 Both word and code are pointers into the same character string: word stays at the start while code advances. The expression code - word is the distance between the two pointers (and word - code is its negative). But that's convoluted compared to just using an index into the character array, so code - word is just the index into the string (which I will call c).
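To illustrate step #4, the following sketch walks a string with a moving pointer, as the original function does, and checks at each step that code - word equals a separately maintained index. The function name is hypothetical.

```c
#include <stddef.h>

/* Returns 1 if, while walking text with a moving pointer, code - word
 * always equals the running index c and word - code equals -c. */
int pointer_offset_matches_index(const char *text)
{
    const char *word = text;   /* stays at the start of the string */
    const char *code = text;   /* advances one character per step */
    ptrdiff_t c = 0;           /* the index the simplified version uses */

    while (*code != '\0') {
        if (code - word != c)
            return 0;
        if (word - code != -c)
            return 0;
        code++;
        c++;
    }
    return 1;
}
```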

#5 If code - word is just the loop index c then (code - word + 2) % 3 == 1 is (c+2) % 3 == 1. What that means is that the body of the if statement only executes for every third character, starting with c=2:

    c=0: (0+2) % 3 = 2 (not equal to 1)
    c=1: (1+2) % 3 = 0 (not equal to 1)
    c=2: (2+2) % 3 = 1 (equals 1 so c=2 will be the start of the loop)
    c=3: (3+2) % 3 = 2 (not equal to 1)
    c=4: (4+2) % 3 = 0 (not equal to 1)
    c=5: (5+2) % 3 = 1 (equals 1 so the loop increment will be c+=3)

So the for loop should be for(int c = 2; c < strlen(code); c+=3) and we don't need char* word = code; or the if statement at all.
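The claim in step #5 can be checked mechanically. This sketch, using hypothetical function names, collects the indices accepted by the original if test and the indices visited by the strided loop so that the two lists can be compared. The out array is assumed to have room for at least n / 3 entries.

```c
#include <stddef.h>

/* Stores every index in [0, n) that the original "if" accepts.
 * Returns how many indices were stored. */
size_t indices_by_filter(size_t n, size_t *out)
{
    size_t count = 0;
    for (size_t c = 0; c < n; c++) {
        if ((c + 2) % 3 == 1)
            out[count++] = c;
    }
    return count;
}

/* Stores every index visited by the simplified loop: 2, 5, 8, ... */
size_t indices_by_stride(size_t n, size_t *out)
{
    size_t count = 0;
    for (size_t c = 2; c < n; c += 3)
        out[count++] = c;
    return count;
}
```

For a 20-character input such as the first sample word, both functions produce the six indices 2, 5, 8, 11, 14 and 17.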

#6 The expression - (word - code + 1) is algebraically equivalent to:

    - (word - code + 1)
    -word + code - 1
    code - word - 1

But if code - word is just the loop index c then - (word - code + 1) is just + c - 1

#7 We can simplify *(decrypt++) = ch - (word - code + 1) - hash; using:

*(decrypt++) becomes decrypt[d++] because d is the index into the string,
- (word - code + 1) is just + c - 1
hash is the renamed variable length
ch just becomes code[c]

So *(decrypt++) = ch - (word - code + 1) - hash; can be simplified to decrypt[d++] = code[c] + c - 1 - length;

If we put all that together, we end up with a function that looks like:

    char* decrypt(char* code) {
        int length = strlen(code) / 3 + 1;              // #1 #2
        char* decrypt = malloc(length);                 // #2
        int d = 0;                                      // #3
        for (int c = 2; c < strlen(code); c += 3)       // #4 #5
        {
            decrypt[d++] = code[c] + c - length - 1;    // #6 #7
        }
        decrypt[d] = '\0';                              // #3
        return decrypt;                                 // #3
    }

Of course, not all those changes needed to be made - I just thought the code looked easier to understand that way.
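One way to gain confidence that the transformation preserved behavior is to run both versions on the same input and compare the results. The sketch below restates both functions (with const added to the parameters) and adds a hypothetical decoders_agree helper. It assumes inputs of at least 3 characters, since the original misbehaves on shorter ones.

```c
#include <stdlib.h>
#include <string.h>

/* The function from the question; only const was added. */
char *original_decrypt(const char *code)
{
    int hash = ((strlen(code) - 3) / 3) + 2;
    char *decrypt = malloc(hash);
    char *toFree = decrypt;
    const char *word = code;
    for (int ch = *code; ch != '\0'; ch = *(++code)) {
        if ((code - word + 2) % 3 == 1) {
            *(decrypt++) = ch - (word - code + 1) - hash;
        }
    }
    *decrypt = '\0';
    return toFree;
}

/* The simplified function from this answer; only const was added. */
char *simple_decrypt(const char *code)
{
    int length = strlen(code) / 3 + 1;
    char *decrypt = malloc(length);
    int d = 0;
    for (int c = 2; c < (int)strlen(code); c += 3) {
        decrypt[d++] = code[c] + c - length - 1;
    }
    decrypt[d] = '\0';
    return decrypt;
}

/* Hypothetical helper: returns 1 if both decoders give the same text.
 * Assumes strlen(code) >= 3 and that both allocations succeed. */
int decoders_agree(const char *code)
{
    char *a = original_decrypt(code);
    char *b = simple_decrypt(code);
    int same = (strcmp(a, b) == 0);
    free(a);
    free(b);
    return same;
}
```

Running decoders_agree over each entry of the values array is a quick regression check to repeat after any further simplification.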

That's a great answer. But could you also explain how you figured that out? The original decrypt function looked so convoluted I couldn't start to wrap my head around it. I mean to ask, is there a specific set of steps I could follow when trying to understand code like this, or does it simply come with practice?
@ArunParolikkal I have edited the answer to explain my rationale for transforming the code into the "simpler" version I used. I write software professionally and I believe that making code simple to understand is the main priority of developers. If it is simple to understand then it is faster to diagnose when there are bugs (an extra minute of downtime is hundreds or even thousands of dollars of lost profit), and if it isn't simple to understand then making enhancements is almost impossible (which means a much longer time before the changes are pushed to production - more lost profit).
