I will only address the issue of performance.
You might want to use `strchr` to find occurrences of `'\\'` and just `memcpy` everything in between. It's highly probable that these are optimized with SSE or AVX.

If you can (I suspect you cannot), don't allocate memory for each string separately, and if you do, don't reallocate; it's probably not worth the overhead.
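As a rough illustration of the `strchr`/`memcpy` idea, here is a minimal sketch. The names `unescape_copy` and `decode_simple_escape` are made up for this example and are not from your code, and it only handles a few single-character escapes. It assumes the unescaped output is never longer than the input, so a single allocation of `strlen(src) + 1` bytes is enough and no reallocation is needed.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: maps the character after a backslash to its value.
   Only a few single-character escapes are handled in this sketch. */
static char decode_simple_escape(char c)
{
    switch (c) {
    case 'n':  return '\n';
    case 't':  return '\t';
    case '\\': return '\\';
    default:   return c;   /* unknown escape: keep the character as-is */
    }
}

/* Hypothetical function: returns a newly allocated unescaped copy of src,
   or NULL on allocation failure. The caller frees the result. */
static char *unescape_copy(const char *src)
{
    /* Assumption: output is never longer than input, so allocate once. */
    char *out = malloc(strlen(src) + 1);
    if (out == NULL)
        return NULL;

    char *dst = out;
    const char *run = src;   /* start of the current plain-text run */
    const char *bs;          /* position of the next backslash */

    while ((bs = strchr(run, '\\')) != NULL) {
        size_t len = (size_t)(bs - run);
        memcpy(dst, run, len);          /* copy the run before the backslash */
        dst += len;

        if (bs[1] == '\0') {            /* lone trailing backslash: keep it */
            *dst++ = '\\';
            run = bs + 1;
            break;
        }
        *dst++ = decode_simple_escape(bs[1]);
        run = bs + 2;                   /* skip the backslash and its code */
    }

    /* Copy whatever is left, including the terminating '\0'. */
    memcpy(dst, run, strlen(run) + 1);
    return out;
}
```

Calling `unescape_copy("a\\tb")` would return a heap-allocated string containing `a`, a tab and `b`, which the caller releases with `free`. Whether this actually beats a byte-by-byte loop depends on how long the plain runs are, so it is worth measuring on your own data.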
To kill two birds with one stone, you can allocate an array where you save the positions of `'\\'` in the string. Then you allocate exactly as much memory as needed, and do the `memcpy` and parsing of escape sequences.

EDIT: To deal with escape sequences of variable length, you can parse and store the escape sequences as you scan the string for `'\\'`s. Store them in another array, along with their positions, and then `memcpy` the plain text and copy the parsed characters individually.

Preferably put the most common branch first, e.g. `if (*pos != '\\')`, although the branch predictor will probably alleviate the negative effects of doing it the way you do it now. You can take a look at the GCC builtin `__builtin_expect` and the `likely`/`unlikely` macros built on it.

In function `tto_escape_hex`, save `*pos` into a variable instead of using it directly. The way you do it now, you dereference a pointer on every access, which would be slow if your compiler didn't optimize it. Allocating an extra variable is worth it, and if you have optimizations turned on (or maybe even if you don't), the compiler probably stores the value in a register anyway.

If you are serious about this, you can take inspiration from some reference-grade compilers such as GCC (although that one might be a bit too heavyweight).
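To make the last few points concrete, here is a hedged sketch. I haven't reproduced your `tto_escape_hex`; `parse_hex_escape`, `hex_value` and `unescaped_length` are hypothetical stand-ins, and I am assuming the parser receives a pointer to the cursor and that a hex escape is `\x` followed by at most two hex digits. The `likely`/`unlikely` definitions follow the common GCC/Clang pattern and fall back to no-ops on other compilers.

```c
#include <stddef.h>

/* Branch hints: GCC/Clang builtin where available, no-ops elsewhere. */
#if defined(__GNUC__)
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)
#else
#define likely(x)   (x)
#define unlikely(x) (x)
#endif

/* Hypothetical helper: value of one hex digit, or -1 if c is not one. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical stand-in for the hex parser: reads up to two hex digits
   starting at *pos, advances *pos past them, and returns their value
   (0 if no hex digit is present). */
static unsigned char parse_hex_escape(const char **pos)
{
    const char *p = *pos;           /* read the cursor once into a local */
    unsigned value = 0;
    for (int i = 0; i < 2; i++) {
        int digit = hex_value(*p);
        if (digit < 0)
            break;
        value = value * 16 + (unsigned)digit;
        p++;
    }
    *pos = p;                       /* write the cursor back once */
    return (unsigned char)value;
}

/* Sizing pass: counts the bytes the unescaped string will need
   (excluding the terminator), with the common case tested first. */
static size_t unescaped_length(const char *s)
{
    size_t n = 0;
    while (*s != '\0') {
        char c = *s;                /* dereference once, reuse below */
        if (likely(c != '\\')) {
            s++;                    /* plain character */
        } else if (s[1] == 'x') {
            s += 2;                 /* skip "\x", then the digits */
            (void)parse_hex_escape(&s);
        } else if (s[1] != '\0') {
            s += 2;                 /* single-character escape */
        } else {
            s++;                    /* lone trailing backslash */
        }
        n++;
    }
    return n;
}
```

A sizing pass like `unescaped_length` is one way to know exactly how much memory to allocate before copying. With optimizations on, the local copies and the hints may make no measurable difference, so treat them as something to benchmark, not as a guaranteed win.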