We have a function that handles special characters. For example, it converts "He shouldn’t be allowed" to "He shouldn�t be allowed".

    import io.circe._, io.circe.generic.auto._, io.circe.syntax._, io.circe.parser._, io.circe.optics.JsonPath._

    private def bytesToJsonCirce(value: Array[Byte]): Json = {
      parse(encodeCharacters(value.map(_.toChar).mkString)) match {
        case Right(x: Json) => x
        case Left(err: ParsingFailure) =>
          logger.error(err.getLocalizedMessage)
          Json.Null
      }
    }

    private def encodeCharacters(x: String): String = {
      val encodeChar = '�'
      (x.toCharArray map {
        case y if y.isControl => encodeChar
        case y => y
      }).mkString
    }

We are now trying to consume this API response on the client side (Python) and get the error UnicodeEncodeError: 'ascii' codec can't encode character u'\ufffd' in position 11: ordinal not in range(128). If we apply .encode("utf8") on the client, it returns "He shouldn�t be allowed", but we lose the original form, "He shouldn’t be allowed".

We would also like to avoid this UnicodeEncodeError without having to call .encode("utf8"). How can this be achieved on the Scala side?

  • I am a bit confused, what side is in Scala - server or client? Commented Sep 1, 2021 at 19:38
  • Server actually Commented Sep 1, 2021 at 19:40
  • Does your API produce JSON responses? Where does the value: Array[Byte], which I understand is JSON, come from? Commented Sep 1, 2021 at 19:45
  • If value: Array[Byte] is a UTF-8 encoded string, then treating each element of the array as a separate character is wrong, because some characters take more than one byte. Commented Sep 1, 2021 at 19:54
  • ’ is not a control character, so the example you give will be unchanged. Commented Sep 2, 2021 at 6:19
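To illustrate the comment about multi-byte characters, here is a minimal sketch using only the Java standard library. It assumes the incoming bytes are UTF-8, which the question does not confirm, and the object name DecodeDemo is made up for this example.

```scala
import java.nio.charset.StandardCharsets

object DecodeDemo {
  // Assumption: the payload is UTF-8. The apostrophe ’ (U+2019) takes three bytes in UTF-8.
  val bytes: Array[Byte] =
    "He shouldn\u2019t be allowed".getBytes(StandardCharsets.UTF_8)

  // Per-byte conversion, as in the question: every byte becomes its own Char,
  // so the three bytes of ’ turn into three unrelated characters.
  val perByte: String = bytes.map(_.toChar).mkString

  // Decoding the whole array with an explicit charset keeps ’ as one character.
  val decoded: String = new String(bytes, StandardCharsets.UTF_8)
}
```

The decoded string could then be passed to parse in place of value.map(_.toChar).mkString.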

1 Answer

The error message says that \ufffd is not a valid ASCII character:

'ascii' codec can't encode character u'\ufffd'

\ufffd is the Unicode code point of the replacement character (�), so the problem is that the replacement value encodeChar is not an ASCII character. Try changing encodeChar to a valid ASCII character.
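A minimal sketch of that change, assuming '?' is an acceptable stand-in (the choice of '?' and the object name AsciiReplacement are mine, not from the question):

```scala
object AsciiReplacement {
  // Assumption: '?' is used as the ASCII replacement; any ASCII character would do.
  val encodeChar: Char = '?'

  // Same logic as the question's encodeCharacters, but the replacement is ASCII.
  def encodeCharacters(x: String): String =
    x.map(c => if (c.isControl) encodeChar else c)
}
```

Note that this only replaces control characters. Other non-ASCII characters such as ’ pass through unchanged, so an ASCII-only client could still fail on them.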

1 Comment

I was looking more for an escape mechanism that retains the actual value, rather than replacing every such character with the same replacement character.
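One possible escape mechanism, sketched here with the standard library only, is to write each non-ASCII character as a JSON-style escape (a backslash, the letter u, and four hex digits). The output is then pure ASCII, and a JSON parser on the client decodes the escape back to the original character. The object and method names below are hypothetical, and the sketch assumes the input is already a correctly decoded string.

```scala
object AsciiEscape {
  // Hypothetical helper: replaces every character above 127 with its
  // JSON-style escape. Each UTF-16 code unit is escaped separately,
  // which matches how JSON represents characters outside the BMP.
  def escapeNonAscii(s: String): String =
    s.map { c =>
      if (c > 127) "\\u" + "%04x".format(c.toInt) else c.toString
    }.mkString
}
```

For example, AsciiEscape.escapeNonAscii applied to "He shouldn’t be allowed" yields the text He shouldn, then a backslash followed by u2019, then t be allowed. In valid JSON, non-ASCII characters only occur inside strings, so applying this to the serialized JSON text should be safe. ASCII control characters are deliberately left alone here because they may be whitespace between tokens. circe's Printer also appears to offer an escapeNonAscii option, which may be worth checking before hand-rolling this.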
