2

I have written a code that basically asks the user to input 2 strings which it compares to find if they are anagrams or not. Im new to C and I know the code works fine but I don't know if it works well. I know its a relatively small code so the difference in performance would be negligible but I still want to learn

Someways I have thought of minimizing memory usage was by using data types like int8_t instead of int for a number that will never reach above 127. My while loops were a bunch of if statements before so that's one way I've faster. Also used && and II instead of extra if conditions. How else could I make the code work better?

Is it better for me to use put("...") instead of print("...\n") for the long string lines since they have no other purpose than displaying only strings? Is there a way to make the size of the array 26 instead of 128 since I am only concerned about the lower case alphabet? Are there any unwritten universal laws of coding that I need to know about to make my code more efficient?

/* This code takes two input strings and detects whether they are anagrams. This is done by defining two functions, 'is_length() and 'is_anagram()'. is_length makes sure both strings are a maximum of 10 strings. is_anagram() scans each character on both strings and adds 1 to the elements corresponding to the character's equivalent ASCII number. This produces two arrays of 128 elements which it compares to check if the two words are anagrams */ // Libraries used in the code #include <stdio.h> #include <stdint.h> #include <string.h> void is_length(char str1[], char str2[]) { while (strlen(str1) > 10 || strlen(str2) > 10) { printf("\nMake sure the words are a maximum 10 characters\n"); printf("Input two words:\n"); gets(str1); gets(str2); } } int8_t is_anagram(char str1[], char str2[]) { // Defining a function detects anagrams int8_t size1 = strlen(str1); // Finding the sizes of the strings int8_t size2 = strlen(str2); if (size1 == size2) { // Checking if they are the same length int16_t str1Arr[128] = {}; // Defining an array with 128 elements int16_t str2Arr[128] = {}; while (*str1) { str1Arr[(int16_t)*str1]++; // Scanning each character in the str1++; // string and adding 1 to the element } // corresponding their ASCII equivalent while (*str2) { // number str2Arr[(int16_t)*str2]++; str2++; } uint8_t i; for (i = 128; i--;) { // Comparing the string array elements if (str1Arr[i] != str2Arr[i]) { return 0; } } return 1; } else { printf("The words are not the same length so... \n"); return 0; } } int main(void) { char word1[10] = {}; // Defining variables for the strings char word2[10] = {}; printf("Input two words: \n"); // Aquiring input strings from user gets(word1); gets(word2); is_length(word1, word2); // Recalling is_length() if ((is_anagram(strlwr(word1), strlwr(word2))) == 1) { // Converting to lower case printf("YES! The words are anagrams \n"); // and recalling is_anagram() } else { // to compare the two strings printf("NO, The words are not anagrams \n"); } system("pause"); } 
14
  • 3
    Don't use gets(), or your program might run forever... Commented Feb 18, 2021 at 17:15
  • 2
    Using smaller data types for local variables (that are allocated on the stack) is more likely than not to actually reduce performance (because the compiler needs to insert casts or conversions in a lot of operations) Commented Feb 18, 2021 at 17:15
  • 1
    A omment on the algorithm, use just one array, increment it for one string, decrement it for the other string, check it's all zeros at the end. Less memory and a faster check. Commented Feb 18, 2021 at 17:17
  • 2
    gets() dont use it. ever. en.wikipedia.org/wiki/Morris_worm Commented Feb 18, 2021 at 17:22
  • 1
    @Jeban gets an evil, maniacal, mess of a function, drafted into the standard library at a time when people naively believed computers would always be programmed with competence, and used without malice. It is a dinosaur, and a recipe for a buffer overflow exploit by its very nature. It is so bad it has been removed from modern C standard libraries. Commented Feb 18, 2021 at 17:22

1 Answer 1

4

How can I maximize the performance of this C code?

  • Instead of recomputing string length many times, compute once and pass around the length.

  • No need to call strlwr() until after string length equal test. With long strings, consider skipping strlwr() on original strings and instead, after all ++, -- on a2i[], add together the letter sums.

When strings are short, sort and compare. Research qsort().

When strings are long, proceed as originally coded with some changes:

  • Use a single 256 int array to handle all char values.

     // int16_t str1Arr[128] = {}; // int16_t str2Arr[128] = {}; int s2A[UCHAR_MAX + 1] = { 0 }; while (*str1) { s2A[(unsigned char)*str1++]++; s2A[(unsigned char)*str2++]--; } for (int i = 0; i <= UCHAR_MAX; i++) { if (s2A[i]) { // Non-zero? return 0; } } return 1; 

Replace gets(). Research fgets().

Sign up to request clarification or add additional context in comments.

1 Comment

A couple ideas: 1) Doing separate filling loops may or may not be faster than a single loop, depends on the size of the string and how the caching works out 2) to check the final buffer against zero, you can reinterpret it as the widest type you have and do less comparisons (e.g. a 64bit type will save 7/8th of the iterations). It's a cute toy problem.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.