Let's assume I have a function like this:

template<typename charT>
void fun(std::basic_ostream<charT>& out, std::basic_fstream<charT>& file)
{
    std::basic_string<charT> str;
    file >> str;
    out << str;
}

Note: the file is encoded as UTF-8.

I am not knowledgeable about Unicode. Can I use this function for both ASCII and Unicode, or build a class using basic_type so that the class can be used for both Unicode and ASCII?

My question is: is there any difference between ASCII and Unicode at the processing level?

Edit:

By "processing level" I mean performing operations on the strings, such as appending, printing, and reading a string from a file.

The reason I am asking is that std::string and std::wstring are typedefs of basic_string instantiated with char and wchar_t,

and std::cout and std::wcout are instances of std::ostream and std::wostream, which are typedefs of std::basic_ostream instantiated with char and wchar_t, yet the code is the same for both.

In both cases, the only difference is memory usage.

So I built a class using basic_type so that the class can be used for both ASCII and Unicode.

  • Yes. ASCII and the Unicode encodings are different. I don't know what you mean by "processing level". Commented Jul 2, 2020 at 14:46
  • There are several tens of thousands of Unicode characters that can be encoded as UTF-8 but not as ASCII. What exactly do you expect to happen if your file contains even one of those? Commented Jul 2, 2020 at 14:49
  • @eerorika I mean operations on the string (like appending), printing it, or reading a string from a file. Commented Jul 2, 2020 at 14:49
  • Isn't the issue here the interpretation of whitespace characters? operator>> depends on knowing what a whitespace character is, and some Unicode whitespace characters have multi-byte UTF-8 encodings. Commented Jul 2, 2020 at 14:53
  • C++ doesn't really have support for Unicode. If you want to use Unicode, I'd suggest a library like ICU. Commented Jul 2, 2020 at 14:53

1 Answer

Is there any difference between ASCII and Unicode?

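Yes. ASCII is a 7-bit encoding that covers 128 characters, while Unicode is a far larger character set with several encodings (UTF-8, UTF-16, UTF-32). They are not identical, so there are differences. UTF-8 is backward compatible with ASCII: any ASCII text is also valid UTF-8, but not the other way around.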
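One difference that shows up quickly at the "processing level" is string length. The sketch below is only an illustration, and the names count_code_points and check_examples are made up for it. It assumes the input is well-formed UTF-8 held in a std::string, and it counts code points by skipping continuation bytes (those of the form 10xxxxxx). For pure ASCII the byte count and the code point count agree; for other text they generally do not.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: counts Unicode code points in a string that is
// assumed to contain well-formed UTF-8. Continuation bytes (10xxxxxx)
// are skipped, so each code point is counted once, at its lead byte.
std::size_t count_code_points(const std::string& utf8)
{
    std::size_t count = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}

// Small self-check illustrating the difference between bytes and code points.
void check_examples()
{
    const std::string ascii = "hello";          // 5 bytes, 5 code points
    const std::string accented = "h\xC3\xA9llo"; // U+00E9 takes 2 bytes in UTF-8

    assert(ascii.size() == 5);
    assert(count_code_points(ascii) == 5);

    assert(accented.size() == 6);               // size() counts bytes
    assert(count_code_points(accented) == 5);   // but there are 5 code points
}
```

Note that std::string::size(), indexing, and substr() all work on bytes, so cutting a UTF-8 string at an arbitrary index can split a character in the middle.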

Can I use this function for both ASCII and Unicode?

Yes, for UTF-8 (assuming that the size of a byte is 8 bits).

The function does nothing that would require different handling between those encodings.
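To make that concrete, here is a minimal sketch. It uses a hypothetical variant of the function, copy_token, which takes a std::basic_istream instead of a std::basic_fstream so that it can be fed from an in-memory std::istringstream; that change and the name check_pass_through are my assumptions, not part of the question. With charT = char and the default "C" locale, the UTF-8 bytes are copied through unchanged. The second case illustrates the whitespace point raised in the comments: operator>> only splits on the whitespace that the stream's locale knows about, so a multi-byte Unicode space such as U+3000 is not treated as a separator here.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical variant of fun(): same body, but it accepts any input
// stream so the example can run without a file on disk.
template<typename charT>
void copy_token(std::basic_ostream<charT>& out, std::basic_istream<charT>& in)
{
    std::basic_string<charT> str;
    in >> str;   // reads up to the next whitespace known to the stream's locale
    out << str;  // writes the same code units back out, unmodified
}

// Self-check, assuming the global locale is still the default "C" locale.
void check_pass_through()
{
    // "héllo wörld" encoded as UTF-8; the ASCII space ends the first token.
    std::istringstream in1("h\xC3\xA9llo w\xC3\xB6rld");
    std::ostringstream out1;
    copy_token(out1, in1);
    assert(out1.str() == "h\xC3\xA9llo");

    // 'a', U+3000 IDEOGRAPHIC SPACE (3 bytes in UTF-8), 'b'.
    // The "C" locale does not classify those bytes as whitespace,
    // so the whole input is read as a single token.
    std::istringstream in2(std::string("a") + "\xE3\x80\x80" + "b");
    std::ostringstream out2;
    copy_token(out2, in2);
    assert(out2.str().size() == 5);
}
```

For charT = wchar_t the situation is different: the stream would first have to convert the UTF-8 bytes into wide characters, and whether that works depends on the locale imbued on the stream, which is not shown here.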

However, if you want to read the output in a terminal, the encoding used to display it depends on the capabilities and configuration of that terminal. If it doesn't match the encoding of what you're printing, the output may be misinterpreted.


1 Comment

Thanks for your answer. Can you take a look at the edited question?
