1

I am trying to convert a char element from a char *argv[] array from uppercase to lowercase without using a function. I want to add 32 to the ASCII integer value.

When I try to pass the variable as an argument, it will show the integer sum, but not the new lowercase character. Instead it shows the following output:

letter h, 104
tolower: 136, �

Code:

int i = 0;
for(i = 0; argv[1][i] != '\0'; i++) {
    char letter = argv[1][i];
    printf("letter %c, %i\n", letter, letter);
    char tolower = argv[1][i];
    int lowercase = tolower + 32;
    printf("tolower: %i, %c\n", lowercase, lowercase);
}

Why is it printing a "?" char?

3
  • 2
    Do not call a variable tolower; it's the name of a standard function. Commented Mar 6, 2015 at 22:31
  • 3
    h is already lowercase. Commented Mar 6, 2015 at 22:32
  • 3
    You have to check that letter >= 'A' && letter <= 'Z' Commented Mar 6, 2015 at 22:33

4 Answers

3

First, don't assume you are using an ASCII character set (i.e., 32 is the wrong additive value in some character sets). The problem you are having is that 'h' + 32 is not a lowercase "h". 'h' is already lowercase, so you want to add 0. Check first; something like:

if( tolower >= 'A' && tolower <= 'Z' ) tolower += 'a' - 'A'; 
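
To tie this back to the loop in the question, here is a minimal sketch of the same range check applied to a whole string. The helper name lowercase_in_place is made up for illustration, and the code assumes that 'A' through 'Z' are contiguous in the execution character set (true for ASCII, not guaranteed elsewhere).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: lowercases s in place, touching only 'A'..'Z'.
   Assumes the uppercase letters are contiguous, as they are in ASCII. */
void lowercase_in_place(char *s)
{
    assert(s != NULL);
    for (size_t i = 0; s[i] != '\0'; i++) {
        if (s[i] >= 'A' && s[i] <= 'Z') {
            /* 'a' - 'A' is the offset between the cases (32 in ASCII). */
            s[i] = (char)(s[i] + ('a' - 'A'));
        }
    }
}
```

Called on a writable buffer such as char buf[] = "Hello, World";, this should only shift the 'H' and the 'W'; the letters that are already lowercase and the punctuation are left alone.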

7 Comments

Oops. && was intended.
@MattMcNabb if & was "correct" but && "idiomatic", why does MSVC issue the warning C4554: '&' : check operator precedence for possible error; use parentheses to clarify precedence? It wasn't correct, it was wrong.
@WeatherVane don't ask me to explain why MSVC does anything
@MattMcNabb I am asking you to explain your comment. Why is it right to mix bitwise with boolean?
@WeatherVane The code with & complied with the C standard and correctly implemented the programmer's intent. However it is more common practice and better style to use && instead.
2

I will not point out the problems that the other answerers did; instead I will show a neat trick to swap between upper and lower case. For letters, the difference between the lowercase and the uppercase form is bit 5 of the ASCII code. So to make a letter lowercase you need to set this bit:

lower = 0x20 | letter; 

For uppercase reset the bit:

upper = (~0x20) & letter; 

And to swap the case you can use XOR:

swapped = 0x20 ^ letter; 

The good thing here is that you don't have to check whether or not the letter is already in the case you need.

Of course the assumption here is that your system is using ASCII encoding.
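
As a rough sketch of the three operations above, wrapped in functions: the names ascii_lower, ascii_upper, ascii_swap and is_ascii_letter are invented for this example, and everything here assumes ASCII. The assert documents the one precondition the trick does have, namely that the input is a letter; for other characters the bit flip produces something unrelated (in ASCII, '[' | 0x20 is '{').

```c
#include <assert.h>

/* Hypothetical helper: 1 if c is an ASCII letter, 0 otherwise. */
static int is_ascii_letter(int c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}

/* Set bit 5: the result is lowercase whatever the input case was. */
int ascii_lower(int c)
{
    assert(is_ascii_letter(c));
    return c | 0x20;
}

/* Clear bit 5: the result is uppercase whatever the input case was. */
int ascii_upper(int c)
{
    assert(is_ascii_letter(c));
    return c & ~0x20;
}

/* Toggle bit 5: upper becomes lower and lower becomes upper. */
int ascii_swap(int c)
{
    assert(is_ascii_letter(c));
    return c ^ 0x20;
}
```

No case check is needed before calling ascii_lower or ascii_upper, which is the point of the trick, but a letter check is still required if the input can contain digits or punctuation.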


1

This is from my own C++ library, but it works for C as well.

Self-Optimized (Library Function)

// return the specified letter in lowercase
char tolower(int c)
{
    // (int)a = 97, (int)A = 65
    // (a)97 - (A)65 = 32
    // therefore 32 + 65 = a
    return c > 64 && c < 91 ? c + 32 : c;
}

// return the specified letter in uppercase
char toupper(int c)
{
    // (int)a = 97, (int)A = 65
    // (a)97 - (A)65 = 32
    // therefore 97 - 32 = A
    return c > 96 && c < 123 ? c - 32 : c;
}

The return value is a printable char. However, some modern compilers may already optimize the readable version below into this form.

Readable

char tolower(int c)
{
    return c >= 'A' && c <= 'Z' ? c + ('a' - 'A') : c;
}

char toupper(int c)
{
    return c >= 'a' && c <= 'z' ? c - ('a' - 'A') : c;
}

Note that the difference between 'a' and 'A' is the constant 32 in ASCII.

9 Comments

Do not hard code ASCII values: if you are going to use this method, at least make it more readable as return (c >= 'A' && c <= 'Z') ? c + 'a' - 'A' : c; etc.
@chqrlie Alright, you've got a point there since this should be a readable answer
Why do you insist on 32 instead of 'a' - 'A' ?
I'm afraid your reasoning is incorrect. Unless you are compiling with optimisations disabled, 'a' - 'A' will not generate an actual subtraction; it is a constant expression that will be evaluated at compile time, so there is no performance penalty at all. The reason it is advisable to use 'a' - 'A' instead of hardcoding 32 is both portability and readability. 32 is a constant that works for ASCII but not for some other encodings, and it is magical in the sense that the reader must know this detail about ASCII, which is not required for this algorithm.
The code return (c >= 'A' && c <= 'Z') ? c + 'a' - 'A' : c; does make some assumptions about the encoding of characters: they are encoded in alphabetical order and form a continuous sequence, and therefore the offset between an uppercase letter and its lowercase equivalent is constant. The first 2 assumptions imply the third and are correct for ASCII, but the second is false for EBCDIC, although the third still holds. Using this function for EBCDIC will lowercase all uppercase letters correctly and leave lowercase letters unchanged, but it will change some other character values too.
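
One way to probe the assumptions described in the comment above is to test them directly on the target system. This is only a sketch; the function name letters_are_contiguous is invented for the example. It should return 1 on an ASCII system, and would be expected to return 0 under EBCDIC, where the letters are split into several runs.

```c
/* Hypothetical check: returns 1 if 'A'..'Z' and 'a'..'z' each form one
   contiguous, alphabetically ordered run in the execution character set. */
int letters_are_contiguous(void)
{
    const char upper[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    const char lower[] = "abcdefghijklmnopqrstuvwxyz";

    for (int i = 0; i < 26; i++) {
        /* Each letter must sit exactly i positions after the first one. */
        if (upper[i] != 'A' + i || lower[i] != 'a' + i)
            return 0;
    }
    return 1;
}
```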
0

ASCII 136 is not a printable character. You should do something like lowercase = tolower - 'A' + 'a';, but only if tolower is uppercase for sure (it's not in your example).

1 Comment

ASCII 136 isn't a character at all. ASCII has 128 codepoints, numbered 0-127, encoded in one byte, 0x00 to 0x7F. But, @JAS is probably not using ASCII, anyway.
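
To make the arithmetic from the question concrete: 'h' is 104 in ASCII, so adding 32 gives 136, which is past the last ASCII code point (127). The sketch below uses an invented helper, is_ascii_codepoint, to check that range. Why the terminal shows a replacement glyph depends on its encoding; on a UTF-8 terminal, a lone byte with value 136 is not a valid sequence, which is an assumption about the asker's setup rather than something stated in the question.

```c
/* Hypothetical helper: 1 if c is one of the 128 ASCII code points
   (0..127), 0 otherwise. */
int is_ascii_codepoint(int c)
{
    return c >= 0 && c <= 127;
}
```

With this, is_ascii_codepoint('h') holds on an ASCII system while is_ascii_codepoint('h' + 32) does not, which is why the sum prints as an integer but not as a usable character.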
