I want to process a long SVG file, but when I read it into a buffer and then print the buffer out, a lot of extra text appears after the end of the file's contents. That extra text turns out to be a copy of data from shortly before the end of the file. The file ends with the closing svg tag, so it is easy to see where the real end is; for a regular text file this would be less obvious. When I try to open the resulting file in an image application or a browser, the application gets confused by the trailing text.
Eventually I discovered that for some reason the function ftell() returns a file length that is too large. Here is a simplified version of my function:
    int datReadFileToBuf(char* fn, BYTE** buf)
    {
        int fileLen = 0;
        if (*buf != NULL)
            return -1;  // ERROR: The buffer should be NULL.
        FILE* fp = NULL;
        if (fopen_s(&fp, fn, "r") != 0)
            return -2;
        fseek(fp, 0, SEEK_END);
        fileLen = ftell(fp);
        rewind(fp);
        if ((*buf = (BYTE*)calloc(fileLen, 1)) == NULL)
            return -3;
        size_t sizetLen = fread(*buf, 1, fileLen, fp);
        // fileLen == 483553
        // sizetLen == 481976
        fclose(fp);
        // return fileLen;  // Bad
        return sizetLen;    // Good
    }

Negative return values indicate errors. fn is the filename and buf is the buffer, which is declared as BYTE* datInBuf = NULL in main() and passed to datReadFileToBuf() together with fn. Returning sizetLen as the length of the buffer lets the rest of the program work, because the remainder of the buffer is ignored; returning fileLen causes problems because it is larger.
This seems to be a problem with long text files. I have not checked if this happens with binary files. I've searched for problems with ftell() online, but found no explanation, so would be grateful to have some information on this issue. Incidentally, I'm using Visual Studio 2022 on a Windows 10 platform. The workaround I'm using fixes the problem, but there might be a better way.
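fseek()/ftell() often gets taught as a way to get the size of a file, but for a text stream the value ftell() returns is not a byte or character count. C11 7.21.9.4p2: "For a text stream, its file position indicator contains unspecified information, usable by the fseek function for returning the file position indicator for the stream to its position at the time of the ftell call; the difference between two such return values is not necessarily a meaningful measure of the number of characters written or read."

The file in the question is opened in text mode ("r"). On Windows, text mode translates each CR-LF pair to a single LF on input, so fread() can return fewer characters than the end-of-file position that ftell() reported, which is consistent with the two numbers shown in the question.

On Windows, use _fileno() and _filelength() to get the size of the underlying file given a FILE pointer. Or skip stdio and use Win32 file functions including GetFileSizeEx() and ReadFile().

The return type of ftell() is long, not int, which, in addition to the other reasons, is one more reason you may get incorrect results. See man 3 fseek, or on Windows the documentation for ftell, _ftelli64.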
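As a rough illustration of the points above, here is a minimal sketch that opens the file in binary mode ("rb"), keeps the result of ftell() as a long, and treats the count returned by fread() as the real length. The name read_file_to_buf and its error codes are hypothetical and are not the asker's function. It uses fopen() rather than fopen_s() for portability, and unsigned char instead of BYTE. It also assumes the platform supports SEEK_END on binary streams, which the C standard does not strictly guarantee, though Windows and POSIX systems do.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not the asker's original function.
 * Reads the whole file at `path` into a newly allocated buffer stored in *out.
 * Returns the number of bytes read, or a negative value on error. */
long read_file_to_buf(const char *path, unsigned char **out)
{
    assert(path != NULL && out != NULL);
    *out = NULL;

    /* "rb": binary mode, so no CR-LF translation happens on Windows. */
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    /* ftell() returns long; keep that type instead of narrowing to int. */
    if (fseek(fp, 0L, SEEK_END) != 0) { fclose(fp); return -2; }
    long size = ftell(fp);
    if (size < 0) { fclose(fp); return -2; }
    rewind(fp);

    /* One extra zeroed byte so text content can also be used as a C string. */
    unsigned char *buf = calloc((size_t)size + 1, 1);
    if (buf == NULL) { fclose(fp); return -3; }

    /* Trust the count fread() reports, not the size computed beforehand. */
    size_t got = fread(buf, 1, (size_t)size, fp);
    int failed = ferror(fp);
    fclose(fp);
    if (failed) { free(buf); return -4; }

    *out = buf;
    return (long)got;
}
```

The caller passes the address of a pointer, for example unsigned char *data = NULL; long n = read_file_to_buf("image.svg", &data);, and frees data when done. In binary mode the buffer keeps any CR-LF line endings as they are stored on disk, which is normally fine for SVG content.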