
I am using the Unicode version of NSIS to make an installer. I will be appending lines to both ANSI and Unicode files. Before I write a line to a file, I need to know whether the file is ANSI or Unicode encoded so I know whether to use FileWrite or FileWriteUTF16LE.

How can I find out the encoding type of a file?

The Unicode plugin, which can report the encoding of a file, doesn't work with NSIS Unicode: the function unicode::UnicodeType always returns 6.

Any advice would be extremely helpful.

2 Answers


If you want to keep using that plugin, you could recompile it yourself as Unicode or try the CallAnsiPlugin plugin.

You can also perform the check yourself:

    !include LogicLib.nsh

    !define ByHandleIsFileUTF16LE "'' ByHandleIsFileUTF16LE "
    !macro _ByHandleIsFileUTF16LE a b t f
        !insertmacro _LOGICLIB_TEMP
        FileReadByte ${b} $_LOGICLIB_TEMP
        IntCmpU $_LOGICLIB_TEMP 0xFF "" `${f}`
        FileReadByte ${b} $_LOGICLIB_TEMP
        IntCmpU $_LOGICLIB_TEMP 0xFE `${t}` `${f}`
    !macroend

    !define IsFileUTF16LE "'' IsFileUTF16LE "
    !macro _IsFileUTF16LE a b t f
        !insertmacro _LOGICLIB_TEMP
        Push $0
        FileOpen $0 "${b}" r
        !define _IsFileUTF16LE _IsFileUTF16LE${__LINE__}
        !insertmacro _ByHandleIsFileUTF16LE '' $0 ${_IsFileUTF16LE}t ${_IsFileUTF16LE}f
        ${_IsFileUTF16LE}f:
        StrCpy $_LOGICLIB_TEMP ""
        ${_IsFileUTF16LE}t:
        !undef _IsFileUTF16LE
        FileClose $0
        Pop $0
        StrCmp "" $_LOGICLIB_TEMP `${f}` `${t}`
    !macroend

    section
        !macro testutf16detection file
            ${If} ${IsFileUTF16LE} "${file}"
                DetailPrint "${file} is UTF16LE"
            ${Else}
                DetailPrint "${file} is NOT UTF16LE"
            ${EndIf}
        !macroend
        !insertmacro testutf16detection "$temp\test1.txt"
        !insertmacro testutf16detection "$temp\test2.txt"
    sectionend
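To tie this back to the question, here is a rough sketch of how the test could drive the choice between FileWrite and FileWriteUTF16LE when appending. It assumes the IsFileUTF16LE test from the snippet above is already defined; the file path, the registers ($R0 to $R2) and the appended text are placeholders picked for illustration only.

```nsis
; Sketch only: assumes LogicLib.nsh and the IsFileUTF16LE test are already defined.
Section
    StrCpy $R0 "$TEMP\test1.txt"        ; placeholder path

    ; Detect the encoding first, while no write handle is open on the file.
    ${If} ${IsFileUTF16LE} "$R0"
        StrCpy $R2 1
    ${Else}
        StrCpy $R2 0
    ${EndIf}

    ClearErrors
    FileOpen $R1 "$R0" a                ; "a" keeps the existing contents
    ${IfNot} ${Errors}
        FileSeek $R1 0 END              ; append mode starts at offset 0, so move to the end
        ${If} $R2 == 1
            FileWriteUTF16LE $R1 "new line$\r$\n"
        ${Else}
            FileWrite $R1 "new line$\r$\n"
        ${EndIf}
        FileClose $R1
    ${EndIf}
SectionEnd
```

The detection runs before the file is opened for appending so that the two handles never overlap.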

One potential solution is to check for the BOM (byte order mark). Here's how you could check whether a file uses the UTF-16LE encoding, whose BOM is the byte pair FF FE:

    !define fileIsUTF16LE "!insertmacro FileIsUTF16LE"
    !macro FileIsUTF16LE file result
        Push $0
        Push $1
        FileOpen $0 "${file}" r
        FileReadByte $0 $1
        IntCmpU $1 0xFF "" FileIsUTF16LE_ItsNot FileIsUTF16LE_ItsNot
        FileReadByte $0 $1
        IntCmpU $1 0xFE FileIsUTF16LE_ItIs FileIsUTF16LE_ItsNot FileIsUTF16LE_ItsNot
        FileIsUTF16LE_ItIs:
        StrCpy ${result} 1
        Goto FileIsUTF16LE_Done
        FileIsUTF16LE_ItsNot:
        StrCpy ${result} 0
        FileIsUTF16LE_Done:
        FileClose $0
        Pop $1
        Pop $0
    !macroend

Usage:

    ${fileIsUTF16LE} "$R0" $3
    ${If} $3 == 1
        ; the file starts with a UTF-16LE BOM, so write with FileWriteUTF16LE
    ${EndIf}

Note that this will not work in all cases, since not all UTF encodings require a BOM. You could easily modify this macro to check for other BOMs; however, definitively determining a file's encoding is nontrivial. One approach is to check for each of the known BOMs and, if the file has none, assume it is not Unicode.
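As a rough illustration of checking several BOMs, something along these lines might work. DetectBOM is a hypothetical helper name, not part of NSIS or of the macro above, and it only recognises the UTF-16LE (FF FE), UTF-16BE (FE FF) and UTF-8 (EF BB BF) marks. It is written as a function rather than a macro so its labels are only emitted once.

```nsis
; Hypothetical helper, for illustration only.
; Input:  file path on the stack.
; Output: "UTF-16LE", "UTF-16BE", "UTF-8", or "" (no recognised BOM) on the stack.
Function DetectBOM
    Exch $0                     ; $0 = file path, caller's $0 saved on the stack
    Push $1
    Push $2
    Push $3
    StrCpy $3 ""                ; default result: no BOM found
    ClearErrors
    FileOpen $1 $0 r
    IfErrors done
    FileReadByte $1 $2          ; byte value arrives as a decimal number (0-255)
    StrCmp $2 255 first_ff      ; 0xFF
    StrCmp $2 254 first_fe      ; 0xFE
    StrCmp $2 239 first_ef      ; 0xEF
    Goto close
first_ff:
    FileReadByte $1 $2
    StrCmp $2 254 0 close       ; FF FE
    StrCpy $3 "UTF-16LE"
    Goto close
first_fe:
    FileReadByte $1 $2
    StrCmp $2 255 0 close       ; FE FF
    StrCpy $3 "UTF-16BE"
    Goto close
first_ef:
    FileReadByte $1 $2
    StrCmp $2 187 0 close       ; 0xBB
    FileReadByte $1 $2
    StrCmp $2 191 0 close       ; 0xBF
    StrCpy $3 "UTF-8"
close:
    FileClose $1
done:
    StrCpy $0 $3
    Pop $3
    Pop $2
    Pop $1
    Exch $0                     ; result on the stack, caller's $0 restored
FunctionEnd

Section
    Push "$TEMP\test1.txt"      ; placeholder path
    Call DetectBOM
    Pop $R1
    DetailPrint "Detected BOM: '$R1'"
SectionEnd
```

An empty result only means no BOM was found; the file could still be BOM-less UTF-8 or UTF-16, so treating it as ANSI remains a guess.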
