2

I need to compare some char * (which I know the length of) with some string literals. Right now I am doing it like this:

    void do_something(char * str, int len) {
        if (len == 2 && str[0] == 'O' && str[1] == 'K' && str[2] == '\0') {
            // do something...
        }
    }

The problem is that I have many comparisons like this to make, and it's quite tedious to break apart and type each of them. Code written like this is also hard to maintain and makes it easy to introduce bugs.

My question is whether there is a shorthand for typing this (maybe a macro).

I know there is strncmp and I have seen that GCC optimizes it. So, if the shorthand is to use strncmp, like this:

    void do_something(char * str, int len) {
        if (len == 2 && strncmp(str, "OK", len) == 0) {
            // do something...
        }
    }

Then, I would like to know if the second example has the same (or better) performance as the first one.

14
  • In the second, if you need to check the length then you can use strcmp (instead of strncmp) which would implicitly do that without the need for len == 2 test. Commented Jul 9, 2020 at 17:07
  • I think you should just do strcmp(str, "OK"); and let the compiler do any possible optimization. Commented Jul 9, 2020 at 17:08
  • 1
    @TertulianoMáximoAfonso If you already know the length (and are testing that it equals 2), then in your first example, why do you bother to also test the '\0' at the end? Commented Jul 9, 2020 at 17:11
  • 2
    Only use strcmp if you know both strings are null-terminated. Commented Jul 9, 2020 at 17:12
  • 1
    @TertulianoMáximoAfonso But the two examples are not equivalent. In the first case you are testing that the string does not continue with more (non-null) characters after the declared length of 2, in the case with strncmp you do not test this. Commented Jul 9, 2020 at 17:21
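
The difference pointed out in the last comment can be made concrete with a small sketch. The function names check_manual and check_strncmp are hypothetical; the bodies are the two conditions from the question. It assumes the buffer has at least three readable bytes, since the hand-written version reads str[2].

```c
#include <string.h>

/* Hand-written check from the question: besides 'O' and 'K',
   it also requires a NUL byte right after them. */
int check_manual(const char *str, int len) {
    return len == 2 && str[0] == 'O' && str[1] == 'K' && str[2] == '\0';
}

/* strncmp-based check from the question: it only looks at the
   first len bytes and never inspects what follows them. */
int check_strncmp(const char *str, int len) {
    return len == 2 && strncmp(str, "OK", (size_t)len) == 0;
}
```

If the buffer holds "OKAY" but len is passed as 2, check_manual returns 0 while check_strncmp returns 1. When len is trusted to be the real length, the two agree.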

2 Answers

1

Yes it will. However, your code is not comparing a char * to a string literal. It is comparing two string literals. The compiler is smart enough to spot this and optimize all the code away. Only the code inside the if block remains.

We can see this by looking at the assembly code generated by the compiler:

cc -S -std=c11 -pedantic -O3 test.c 

First with your original code...

    #include <stdio.h>
    #include <string.h>

    int main() {
        unsigned int len = 2;
        char * str = "OK";

        if (len == 2 && strncmp(str, "OK", len) == 0) {
            puts("Match");
        }
    }

Then with just the puts.

    #include <stdio.h>
    #include <string.h>

    int main() {
        //unsigned int len = 2;
        //char * str = "OK";

        //if (len == 2 && strncmp(str, "OK", len) == 0) {
            puts("Match");
        //}
    }

The two assembly files are practically the same. No trace of the strings remains, only the puts.

        .section    __TEXT,__text,regular,pure_instructions
        .build_version macos, 10, 14    sdk_version 10, 14
        .globl  _main                   ## -- Begin function main
        .p2align    4, 0x90
    _main:                              ## @main
        .cfi_startproc
    ## %bb.0:
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset %rbp, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register %rbp
        leaq    L_.str(%rip), %rdi
        callq   _puts
        xorl    %eax, %eax
        popq    %rbp
        retq
        .cfi_endproc
                                        ## -- End function
        .section    __TEXT,__cstring,cstring_literals
    L_.str:                             ## @.str
        .asciz  "Match"

    .subsections_via_symbols

This is a poor place to focus on optimization. String comparison against small strings is very unlikely to be a performance problem.

Furthermore, your proposed optimization is likely slower. You need to get the length of the input string, and that requires walking the full length of the input string. Maybe you need the length for other reasons, but that's increasingly an edge case.

In contrast, strncmp can stop as soon as it sees unequal characters, and it never has to read past the end of the shorter string.


12 Comments

That's because you are assuming that I know str at compile time. The example serves only to illustrate the problem. str is not set at compile time. It comes from a user. Then I "parse" it to know the appropriate action.
@TertulianoMáximoAfonso Please provide a clearer example.
I think it was pretty clear. You are the one who thought that the example was my whole program and compiled the whole thing. That is a really bad assumption. An example is an example.
Anyway, I added a note to the question.
@TertulianoMáximoAfonso If you want good answers, you have to provide a clear example of what we're working with. You have to put in the work to make your question as clear as possible. For example, instead of a note, you could replace char * str = "OK"; with a call to fgets. Then there is no confusion. It's important that your question is clear to other people.
0

Your example implies that your strings are always NUL terminated. In that case, don't bother getting their length ahead of time, since that involves searching for the NUL. Instead, you can do

memcmp(str, "OK", 3); 

This way, the NULs get compared too. If str begins with "OK" but is longer than 2 characters, the result will be > 0, and if str is a shorter prefix of "OK", the result will be < 0.

This is a single function call, and memcmp is virtually guaranteed to be better optimized than your hand-written code. At the same time, don't bother optimizing unless you find this code to be a bottleneck. Keep in mind also that any benchmark I run on my machine will not necessarily apply to yours.

The only real reason to make this change is for readability.
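
As for the macro shorthand the question asks about, one possible sketch wraps the memcmp call so the byte count is taken from the literal itself. STR_IS and is_ok are hypothetical names, not part of any library. The sketch assumes that lit is always a string literal (so sizeof yields its size including the NUL) and that str points to at least sizeof(lit) readable bytes, because memcmp is allowed to read all of them.

```c
#include <string.h>

/* Hypothetical helper macro. sizeof(lit) counts the literal's
   terminating NUL, so that byte takes part in the comparison.
   Assumes lit is a string literal and str has at least
   sizeof(lit) readable bytes. */
#define STR_IS(str, lit) (memcmp((str), (lit), sizeof(lit)) == 0)

/* Example use: nonzero only when str is exactly "OK". */
int is_ok(const char *str) {
    return STR_IS(str, "OK");
}
```

Because the count comes from sizeof, changing the literal cannot leave a stale hard-coded 3 behind, which addresses the maintenance concern raised in the question.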

3 Comments

Also, about getting the length ahead of time: I am reading these chars from a serial device, so I do have a counter for them and thus their length. Would it be faster if I did something like if (length == 2 && memcmp(str, "OK", 3) == 0) { ... }?
@TertulianoMáximoAfonso. It would be slower if the lengths usually match, faster if they don't.
Makes sense. Since I have lots of these comparisons and only one of them matches I will be doing this check. Thank you ;)
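
For the case described in these comments, where the length is already known and many literals are tried, a length-first variant could look like the sketch below. LIT_EQ, parse_command, and the command names are all hypothetical, chosen only for illustration. Unlike the memcmp(str, "OK", 3) form, this variant compares only len bytes, so it does not rely on the buffer being NUL-terminated.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical macro: true when the len bytes at str equal the literal lit.
   sizeof(lit) - 1 is the literal's length without its terminating NUL.
   The cheap length test runs first, so memcmp is skipped on a mismatch.
   Assumes lit is a string literal. */
#define LIT_EQ(str, len, lit) \
    ((size_t)(len) == sizeof(lit) - 1 && memcmp((str), (lit), sizeof(lit) - 1) == 0)

/* Hypothetical command set, for illustration only. */
enum command { CMD_UNKNOWN, CMD_OK, CMD_ERROR, CMD_READY };

/* Tries each known literal in turn and reports which one matched. */
enum command parse_command(const char *str, size_t len) {
    if (LIT_EQ(str, len, "OK"))    return CMD_OK;
    if (LIT_EQ(str, len, "ERROR")) return CMD_ERROR;
    if (LIT_EQ(str, len, "READY")) return CMD_READY;
    return CMD_UNKNOWN;
}
```

With this shape, parse_command("OK", 2) yields CMD_OK, while an input of a different length is rejected by the length test alone for every literal.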