7

I understand that the end of a string is indicated by a null character, but I cannot understand the output of the following code.

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        char s[] = "Hello\0Hi";
        printf("%d %d", strlen(s), sizeof(s));
    }

OUTPUT: 5 9

If strlen() detects the end of the string at the end of o, then why doesn't sizeof() do the same thing? Even if it doesn't do the same thing, isn't '\0' a null character (i.e., only one character), so shouldn't the answer be 8?

2
  • 1
    main() -> int main(void) please. Commented Jan 25, 2016 at 10:24
  • @PW Apparently the C standard doesn't use NUL, despite the many ASCII tables where I have seen it. In the C standard the lowercase word null is used. Commented Jan 25, 2016 at 10:28

6 Answers

22

The sizeof operator does not give you the length of a string but instead the size of the type of its operand. Since in your code the operand is an array, sizeof gives you the size of the array, including both null characters.

If it were like this

    const char *string = "This is a large text\0This is another string";
    printf("%zu %zu\n", strlen(string), sizeof(string));

the result will be very different because string is a pointer and not an array.

Note: Use the "%zu" specifier for size_t which is what strlen() returns, and is the type of the value given by sizeof.
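The array-versus-pointer difference described above can be checked with a minimal sketch like the following. The function names are hypothetical and chosen only for illustration; the pointer size is platform-dependent, so it is compared against sizeof(char *) rather than a fixed number.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: sizeof applied to an array initialized from the
   literal counts every byte, including both null characters. */
size_t size_as_array(void)
{
    char s[] = "Hello\0Hi";
    return sizeof s;            /* 8 written characters + implicit '\0' */
}

/* Hypothetical helper: sizeof applied to a pointer to the same literal
   yields the size of the pointer itself, whatever it points at. */
size_t size_as_pointer(void)
{
    const char *p = "Hello\0Hi";
    return sizeof p;            /* sizeof(char *), platform-dependent */
}

/* strlen behaves the same either way: it stops at the first '\0'. */
size_t length_via_pointer(void)
{
    const char *p = "Hello\0Hi";
    return strlen(p);
}
```

Printing these results needs "%zu", since all three functions return size_t.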


1 Comment

Good answer, but add explanation about %zu operand due to size_t sizeof return type;
10

strlen() doesn't care about the actual size of the string. It looks for a null byte and stops when it sees the first null byte.

But the sizeof operator knows the total size. It doesn't care what bytes are in the string literal. You could just as well have all null bytes in the string and sizeof would still give the correct size of the array (strlen() would return 0 in that case).

They are not comparable; they do different things.
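As a small sketch of the all-null-bytes case mentioned above (the function names are made up for this illustration), an array initialized from a literal that contains nothing but null bytes still has its full size, while its string length is zero:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: three written null bytes plus the implicit
   terminator make a 4-byte array. */
size_t all_null_size(void)
{
    char z[] = "\0\0\0";
    return sizeof z;
}

/* Hypothetical helper: strlen sees a null byte at index 0 and stops
   immediately. */
size_t all_null_length(void)
{
    char z[] = "\0\0\0";
    return strlen(z);
}
```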

3 Comments

Actually, "strlen" does care about the actual size of the string, because the string size is defined by being "nul terminated". It is sizeof which doesn't care about strings at all, only about types.
@Daniel because the string size is defined by being "nul terminated" -- No, that (null termination) is neither the string size nor does strlen() return the size of the string. It's called "string length" as opposed to "string size" for a reason. strlen(), by definition, doesn't know or care about the size. It starts from the given address and looks for the first null byte. I stand by the answer, and please don't downvote because it's not clear to you.
Ah, I'm sorry, "string size" doesn't have a definition then, strlen cares about the length of the string, and sizeof cares about the number of bytes of a type! There is no "size of a string".
5

If strlen() detects the end of the string at the end of o, then why doesn't sizeof() do the same thing?

strlen only works on strings (null-terminated character arrays), whereas sizeof works for every data type. sizeof yields the exact amount of memory occupied by a given type or object, whereas strlen gives the length of a string (NOT including the null terminator \0). So in the normal case, the following holds for a typical character array s:

    char s[] = "Hello";
    strlen( s ) + 1 == sizeof( s ); // true: +1 for the \0

In your case it's different because you have a null terminator in the middle of the character array s:

char s[] = "Hello\0Hi"; 

Here, strlen detects the first \0 and gives the length as 5. sizeof, however, yields the total number of bytes needed to hold the character array, including both \0 characters, which is why it gives 9 as the second output.
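The relationship between the two can be sketched as a pair of checks. The function names are hypothetical; each returns 1 when strlen(s) + 1 equals sizeof(s) and 0 otherwise.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical check: no embedded null, so 5 + 1 == 6 holds. */
int plain_array_matches(void)
{
    char s[] = "Hello";
    return strlen(s) + 1 == sizeof s;
}

/* Hypothetical check: embedded null, so 5 + 1 is compared with 9
   and the equality fails. */
int embedded_null_matches(void)
{
    char s[] = "Hello\0Hi";
    return strlen(s) + 1 == sizeof s;
}
```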


5

strlen() computes the length of the string. It does so by returning the number of characters before (and not including) the first '\0' character. (See the manual page below.)

sizeof() returns the number of bytes occupied by the given variable (or data type). Note that your example "Hello\0Hi" has 9 characters, and from your question it seems unclear where the ninth character comes from. Let me explain the given string first. Your example string is:

"Hello\0Hi" 

This can be written as the following array:

['H', 'e', 'l', 'l', 'o', '\0', 'H', 'i', '\0'] 

Note the last '\0' character. When you use string quotes, the compiler ends the string with a '\0' character. This means "" is also ['\0'] and thus has 1 element.

BEWARE that sizeof() does NOT return the number of elements in the array. It returns the number of bytes. A char is 1 byte, and therefore sizeof() here does return the number of elements. With any other data type the two differ: for example, calling sizeof() on an int array holding [1, 2, 3, 4] would return 16 on a platform where int is 4 bytes, since the array has 4 elements.
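A minimal sketch of the bytes-versus-elements distinction, using hypothetical function names. Because the size of int varies between platforms, the byte count is expressed as 4 * sizeof(int) rather than a hard-coded 16; dividing by the size of one element is the usual way to recover the element count.

```c
#include <stddef.h>

/* Hypothetical helper: total bytes occupied by a 4-element int array
   (16 where int is 4 bytes). */
size_t int_array_bytes(void)
{
    int a[] = {1, 2, 3, 4};
    return sizeof a;
}

/* Hypothetical helper: element count, obtained by dividing the total
   size by the size of a single element. */
size_t int_array_elements(void)
{
    int a[] = {1, 2, 3, 4};
    return sizeof a / sizeof a[0];
}
```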

BEWARE that passing an array as a parameter only passes a pointer. If you pass s to another function and call sizeof() on the parameter there, it returns the size of the pointer, which is the same as sizeof(void *). This is a fixed size, independent of the array.
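The loss of size information across a function call can be sketched as follows. All names are hypothetical. The parameter is declared as a pointer here; a parameter written as char s[] is adjusted by the compiler to the same pointer type.

```c
#include <stddef.h>

/* Hypothetical callee: only a pointer arrives here, so sizeof reports
   the pointer's size, not the caller's array size. */
size_t size_seen_by_callee(const char *s)
{
    return sizeof s;
}

/* In the scope where the array is defined, sizeof still sees all 9 bytes. */
size_t size_in_defining_scope(void)
{
    char s[] = "Hello\0Hi";
    return sizeof s;
}

/* Passing the same array to a function converts it to a pointer. */
size_t size_after_passing(void)
{
    char s[] = "Hello\0Hi";
    return size_seen_by_callee(s);
}
```

If the callee needs the array size, the usual approach is to pass it as a separate size_t argument.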

STRLEN(3)                BSD Library Functions Manual                STRLEN(3)

NAME
     strlen, strnlen -- find length of string

LIBRARY
     Standard C Library (libc, -lc)

SYNOPSIS
     #include <string.h>

     size_t strlen(const char *s);
     size_t strnlen(const char *s, size_t maxlen);

DESCRIPTION
     The strlen() function computes the length of the string s. The strnlen()
     function attempts to compute the length of s, but never scans beyond the
     first maxlen bytes of s.

RETURN VALUES
     The strlen() function returns the number of characters that precede the
     terminating NUL character. The strnlen() function returns either the
     same result as strlen() or maxlen, whichever is smaller.

SEE ALSO
     string(3), wcslen(3), wcswidth(3)

STANDARDS
     The strlen() function conforms to ISO/IEC 9899:1990 (``ISO C90''). The
     strnlen() function conforms to IEEE Std 1003.1-2008 (``POSIX.1'').

BSD                            February 28, 2009                           BSD

2 Comments

"BEWARE that sizeof() does NOT return the size of the array" - uh, actually it does. You seem to mean "sizeof does not give the number of elements in the array", but in this particular case it does, because the elements have size 1.
Good point I changed "size of" to "number of elements", but my answer already stated that in this particular case it is true. So you should read the whole answer before commenting half way through next time ;-)
2

As name literal itself implies string literal is a sequence of characters enclosed in double quotes. Implicitly this sequence of characters is appended by a terminating zero.

So any character enclosed in the double quotes is a part of the string literal.

When a string literal is used to initialize a character array all its characters including the terminating zero serve as initializers of the corresponding elements of the character array.

Each string literal in turn has type of a character array.

For example this string literal "Hello\0Hi" in C has type char[9]: 8 characters enclosed in the quotes plus the implicit terminating zero.

So in memory this string literal is stored like

{ 'H', 'e', 'l', 'l', 'o', '\0', 'H', 'i', '\0' } 

The sizeof operator returns the number of bytes occupied by an object. So for the string literal above, sizeof returns 9, which is the number of bytes the literal occupies in memory.
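As a rough sketch of the memory layout described above (function names are hypothetical), sizeof can be applied to the literal directly, and walking all sizeof bytes of an array initialized from it finds both null characters:

```c
#include <stddef.h>

/* Hypothetical helper: the literal itself has array type char[9]. */
size_t literal_size(void)
{
    return sizeof "Hello\0Hi";
}

/* Hypothetical helper: scan every byte of the array, using sizeof as the
   bound, and count the null bytes (the written one and the implicit one). */
size_t count_null_bytes(void)
{
    char s[] = "Hello\0Hi";
    size_t n = 0;

    for (size_t i = 0; i < sizeof s; ++i) {
        if (s[i] == '\0')
            ++n;
    }
    return n;
}
```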

If you write "Hello\0Hi", the compiler may not simply drop the Hi part from the literal. It has to store it in memory along with the other characters enclosed in the quotes.

The sizeof operator returns the size in bytes of any object in C not only of character arrays.

In general, character arrays can store any raw data, for example binary data read from a binary file. In that case neither the user nor the program treats the data as strings, and as a result it is processed differently than strings are.

The standard C function strlen is written specifically to find the length of a string stored in a character array. It does not know what data is stored in the array or how it was written there. All it does is search for the first zero character in the array and return the number of characters before it.
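The search described above can be sketched as a hand-written loop. This is an illustrative version under the hypothetical name my_strlen, not the actual library implementation, which is typically more optimized.

```c
#include <stddef.h>

/* Hypothetical re-implementation of the idea behind strlen: advance from
   the start address until the first zero character, then return how many
   characters were passed. It never learns the size of the array. */
size_t my_strlen(const char *s)
{
    const char *p = s;

    while (*p != '\0')
        ++p;
    return (size_t)(p - s);
}
```

Like strlen, this only has defined behavior when a zero character actually exists within the array.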

You can store in one character array several strings sequentially. For example

    char s[12];

    strcpy( s, "Hello" );
    strcpy( s + sizeof( "Hello" ), "World" );

    puts( s );                      // outputs "Hello"
    puts( s + sizeof( "Hello" ) );  // outputs "World"

If you would define a two dimensional array like this

char t[2][6] = { "Hello", "World" }; 

then in memory it will be stored the same way as the one-dimensional array above. So you can write

    char *s = ( char * )t;

    puts( s );                      // outputs "Hello"
    puts( s + sizeof( "Hello" ) );  // outputs "World"

Another example: the standard C function strtok can split one string stored in a character array into several strings by replacing the user-specified delimiters with zero bytes. As a result, the character array will contain several strings.

For example

    char s[] = "Hello World";

    printf( "%zu\n", sizeof( s ) ); // outputs 12

    strtok( s, " " );

    puts( s );                      // outputs "Hello"
    puts( s + sizeof( "Hello" ) );  // outputs "World"

    printf( "%zu\n", sizeof( s ) ); // outputs 12

The last printf statement will output the same value equal to 12 because the array occupies the same number of bytes. Simply one byte in the memory allocated for the array was changed from ' ' to '\0'.
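To make the idea of several strings living in one array concrete, here is a hedged sketch of a helper that counts the null-terminated strings packed into a buffer of known size. The name count_packed_strings is an assumption introduced for this illustration; it relies only on the standard memchr function.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count the null-terminated strings stored back to
   back in the first `size` bytes of `buf`. The caller supplies the size
   (for example via sizeof on the array), because the zero bytes alone
   cannot tell where the array ends. */
size_t count_packed_strings(const char *buf, size_t size)
{
    size_t count = 0;
    size_t pos = 0;

    while (pos < size) {
        /* Find the terminator of the string starting at pos. */
        const char *end = memchr(buf + pos, '\0', size - pos);
        if (end == NULL)
            break;                      /* unterminated tail: not a string */
        ++count;
        pos = (size_t)(end - buf) + 1;  /* continue after the terminator */
    }
    return count;
}
```

Note that consecutive zero bytes are counted as empty strings by this sketch.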

4 Comments

Note: Re: "As name literal itself implies string literal ...", a literal need not be a string. (int []){2, 4} is a literal (compound) and not a string.
@chux Nobody said that a literal is only the string literal.
Reading "As name literal itself implies string literal ..." indicates that literal implies string literal.
@chux You should read the meaning of the word literal. I hope this will help you.
1

Character arrays in C and pointers to character arrays are not the same thing, even though printing their addresses gives the same value. An array in C is made up of the following things:

  1. Size of array
  2. Its address / pointer
  3. Homogeneous type of elements

A pointer, in contrast, is made up of just:

  1. Address
  2. Type information

    char s[] = "Hello\0Hi"; printf("%d %d", strlen(s), sizeof(s));

Here you are calculating the size of the array (the variable s) using sizeof(), which is 9.

But if you treat this character array as a string, then the array (now a string) loses its size information and becomes just a pointer to a character. The same thing happens when you print a character array using %s.

So strlen() and %s treat the character array as a string and use only its address information. As you can guess, strlen() keeps incrementing the pointer to calculate the length up to the first null character. When it encounters a null character, you get the length up to that point.

So strlen() gives you 5 and does not count the null character.

The sizeof() operator, on the other hand, tells you only the size of its operand. If you give it an array variable, then it uses the array's size information and reports the size regardless of where the null characters are.

But if you give sizeof() a pointer to an array of characters, then it finds a pointer without the size information and reports the size of the pointer, which is usually 8 bytes (64 bits) on 64-bit systems or 4 bytes (32 bits) on 32-bit systems.

One more thing: if you initialize a character array using double quotes, like "Hello", then C adds a null character; it does not do so for a brace initializer like {'H','e','l','l','o'}.
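A small sketch of that difference, with hypothetical function names: the quoted initializer produces a 6-byte array, the brace initializer a 5-byte one with no terminator.

```c
#include <stddef.h>

/* Hypothetical helper: five characters plus the implicit '\0'. */
size_t quoted_size(void)
{
    char a[] = "Hello";
    return sizeof a;
}

/* Hypothetical helper: exactly five characters, no terminator added.
   This array is not a string, so passing it to strlen() would read past
   its end, which is undefined behavior. */
size_t braced_size(void)
{
    char b[] = {'H', 'e', 'l', 'l', 'o'};
    return sizeof b;
}
```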

I am using the gcc compiler. I hope this helps you understand.

1 Comment

All objects (including arrays and pointers) have an address, and type information. Neither of those things require storage. In the case of arrays, part of the type information is how many elements are in the array.
