1

An .exe file can be renamed to .txt to bypass file type validation. I am looking for a way to determine the actual file type by reading the file's header, without using DLLs such as urlmon.dll.

MimeMapping.GetMimeMapping doesn't solve the problem; it just derives the MIME type from the file extension.

Is there a dictionary that says which byte combinations represent at least the very common file types, such as txt, doc, docx, pdf, xls, xlsx, and exe?

  • Do you know the type of file in particular that you're looking for? It'd be a lot easier to just look for a unique header signature for one file type than to build a generic library to guess them all. One issue in particular with your list is that docx and xlsx will both appear to be zip files on cursory inspection and would need deeper analysis to tell apart. That could get expensive even if it is feasible. Commented Mar 21, 2015 at 22:50
  • @DanField - I am looking for txt, doc, docx, pdf, xls and xlsx for now. Commented Mar 22, 2015 at 2:24
  • @DanField - There may be new types that I may have to support later in the project. But the ones I have mentioned are the bare minimum. Commented Mar 22, 2015 at 2:40
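
The "deeper analysis" for docx versus xlsx mentioned in the first comment can be fairly cheap in practice. The following is a minimal sketch, assuming the files follow the conventional Office layout in which a Word document contains the entry `word/document.xml` and an Excel workbook contains `xl/workbook.xml`. The class and method names are hypothetical, and a file that merely contains one of these entries is not guaranteed to be a valid document.

```csharp
using System.IO;
using System.IO.Compression;

static class OoxmlSniffer
{
    // Hypothetical helper: returns "docx", "xlsx", or null when the stream is
    // not a readable ZIP archive or contains neither conventional main part.
    public static string GuessOoxmlKind(Stream stream)
    {
        try
        {
            // leaveOpen: true so the caller keeps ownership of the stream.
            using (var zip = new ZipArchive(stream, ZipArchiveMode.Read, leaveOpen: true))
            {
                // Assumption: the main part sits at its conventional path.
                if (zip.GetEntry("word/document.xml") != null) return "docx";
                if (zip.GetEntry("xl/workbook.xml") != null) return "xlsx";
                return null;
            }
        }
        catch (InvalidDataException)
        {
            // The stream could not be interpreted as a ZIP archive.
            return null;
        }
    }
}
```

Only the archive's directory is read to look up the entries, so the cost stays small even for large files.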

2 Answers

1

I think you sort of answered your own question.

This is a bit of a pickle. Read the file's header signature, using a FileStream or similar, and check whether it matches the signature expected for its extension.

Combine this with Tommy DDD's answer, and I think you are set.
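
As a rough illustration of the header check, here is a minimal sketch that compares the first bytes of a stream against a small table of well-known signatures. The class name, method names, and the labels used as dictionary keys are hypothetical. Note that plain text has no signature at all, that doc and xls share the same OLE2 compound-file signature, and that docx and xlsx both start with the ZIP signature, so this check alone cannot tell those pairs apart.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class FileSignatures
{
    // Well-known leading bytes ("magic numbers"). The keys are labels chosen
    // for this sketch, not official type names.
    static readonly Dictionary<string, byte[]> Known = new Dictionary<string, byte[]>
    {
        { "pdf",  new byte[] { 0x25, 0x50, 0x44, 0x46 } },                         // "%PDF"
        { "zip",  new byte[] { 0x50, 0x4B, 0x03, 0x04 } },                         // docx, xlsx, plain zip
        { "ole2", new byte[] { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 } }, // legacy doc, xls
        { "exe",  new byte[] { 0x4D, 0x5A } },                                     // "MZ"
    };

    // Returns the label of the first matching signature, or null if none match.
    public static string Detect(Stream stream)
    {
        var header = new byte[8];
        int read = 0, n;
        // Stream.Read may return fewer bytes than requested, so loop until
        // the buffer is full or the stream ends.
        while (read < header.Length &&
               (n = stream.Read(header, read, header.Length - read)) > 0)
        {
            read += n;
        }

        foreach (var pair in Known)
        {
            byte[] sig = pair.Value;
            if (read >= sig.Length && header.Take(sig.Length).SequenceEqual(sig))
                return pair.Key;
        }
        return null;
    }

    public static string Detect(string path)
    {
        using (var fs = File.OpenRead(path))
            return Detect(fs);
    }
}
```

A caller could then reject an upload when, for example, the extension is .txt but `Detect` returns "exe". A null result only means that none of the listed signatures matched; it does not prove the file is text.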

0

This isn't the most elegant solution, but check out this answer: How can I determine if a file is binary or text in c#? You can use it as a pseudo-check for whether the file is binary or text.

In the comments there, someone checks for four zero bytes in a row (\0\0\0\0), which tends to indicate a binary file because text rarely contains NUL characters.
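
The four-zero-bytes check could be sketched as follows. This is only a heuristic, not the code from the linked answer: the names, the 8192-byte scan limit, and the default threshold are assumptions made for illustration. A binary file with no such run in the scanned range will still pass as text.

```csharp
using System.IO;

static class BinaryHeuristic
{
    // Hypothetical helper: returns true if a run of `requiredNulls`
    // consecutive zero bytes appears within the first `maxBytes` bytes.
    public static bool LooksBinary(Stream stream, int maxBytes = 8192, int requiredNulls = 4)
    {
        int consecutive = 0;
        for (int i = 0; i < maxBytes; i++)
        {
            int b = stream.ReadByte();   // returns -1 at end of stream
            if (b < 0) break;

            // Extend the current run on a zero byte, otherwise reset it.
            consecutive = (b == 0) ? consecutive + 1 : 0;
            if (consecutive >= requiredNulls) return true;
        }
        return false;
    }
}
```

Requiring a run of several zero bytes, rather than a single one, avoids flagging UTF-16 text, where ordinary characters are encoded with isolated zero bytes.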
