0

As I understand it, you can't know for certain the length of a string in C unless it is declared as a char[] in local scope. If it is passed as an argument, it's a char * pointer, and you can't know the length without using strlen(). Many other answers on Stack Overflow describe this, including this answer to "How to get the string size in bytes?".

But sometimes strings aren't null terminated, and if they aren't, you can end up looking into some other memory while trying to find the end of a string. In my own code, I should always pass the length of the string around so that I know for sure how long it is, but what about arguments to main()?

What if bash has a bug and passes a string that is truncated or isn't null terminated? Or what if my program is called by something other than a shell, like another program that isn't as mature as the most common shells? Could my program segfault? Could I expose the memory of whatever happens to be adjacent to argv?

16
  • Step 1: Create a situation where you successfully pass in a NUL byte as a command-line argument and find out what happens. In practice this is almost never done, for the reasons you've outlined. It is, however, quite common on streams like STDIN, because NUL is often a better delimiter than other characters (e.g. find -print0 for xargs -0). Commented Aug 16, 2018 at 4:24
  • I'm going to speculate that passing in either a NUL byte early, or passing in data without a trailing NUL character is going to be impossible by design because that would create severe buffer-overflow exploitation opportunities for a wide range of programs. Commented Aug 16, 2018 at 4:28
  • The system requires the strings to be null-terminated in the call to execve() or equivalent. If they aren't, the call will fail. The strings passed to main() will be OK barring a catastrophic and incredibly improbable bug in the OS. Commented Aug 16, 2018 at 4:37
  • @tadman -- OP seems to have some misconceptions about strings in C: "you can't know for certain the length of a string in C unless it is declared as a char[] in local scope," "sometimes strings aren't null terminated." I think that it is good to attempt to clarify in these situations. Yes, there are valid reasons to handle such malformed input, but "in my own code, I should always pass the length of the string around..." is not right; you should make sure that you pass around valid values in your own code, including valid strings when applicable. Commented Aug 16, 2018 at 4:41
  • Thanks for the discussion about C strings and execve(). I'm a C novice, so it was helpful. @DavidBowling I think I understand how C strings work, but I'm definitely new. How could I clear up the misconceptions in my question? Regarding the "passing the length around", I got that idea from my research about strncpy in this email. Commented Aug 16, 2018 at 6:39

3 Answers

4

Simple answer: no.

You can assume that the arguments to main(int argc, char *argv[]) are always valid.


3

The software that starts a C program is responsible for creating proper contents for argv:

  • Per C 2018 5.1.2.2.1 2, “If the value of argc is greater than zero, the array members argv[0] through argv[argc-1] inclusive shall contain pointers to strings, which are given implementation-defined values by the host environment prior to program startup.”

  • Per 7.1.1 1, “A string is a contiguous sequence of characters terminated by and including the first null character.”

Thus, it is not legal according to the C standard for argv to point to sequences of bytes that are not null-terminated. Is it possible? Yes, if there is a bug in the software, it is possible. A bug in bash cannot cause this, because bash works through the operating system and cannot pass arguments to your program that the operating system does not process. Nor could other user-mode programs cause this, as they have to work through the operating system in the same way. It would require a bug in the code that loads and executes programs and/or in the startup code inside a C program that runs before main is called.

0

No. argv is an array of pointers to strings. The shell splits the command line on whitespace before your program starts, so each argument you receive is a valid string. You should still check the contents of each argument before you use it.

4 Comments

And how would you check the input without accessing memory that is not part of the strings in argv in the case they are not null-terminated?
What do you mean? Everything on the command line is passed as parameters in argv. If a parameter is NULL, it means that you've tried to access a parameter you haven't entered (i > argc).
If you access past the end of an array, you don't get NULL in C. You access memory that wasn't a part of your array, which at best causes you to get nonsense data and at worst causes a segmentation fault. Take a look at this example: repl.it/repls/DemandingWhimsicalMice
@xordspar0 -- argv[] is an array of pointers to strings, and it is guaranteed that these pointers are followed by a null pointer in the array. So you can always check for a null pointer to know when you have reached the end of argv[].
