60

I'm loading an image from a file, and I want to know how to validate the image before it is fully read from the file.

    string filePath = "image.jpg";
    Image newImage = Image.FromFile(filePath);

The problem occurs when image.jpg isn't really a JPEG. For example, if I create an empty text file and rename it to image.jpg, an OutOfMemoryException is thrown when image.jpg is loaded.

I'm looking for a function that will validate an image given a stream or a file path of the image.

Example function prototype

    bool IsValidImage(string fileName);
    bool IsValidImage(Stream imageStream);
  • Why not wrap that code in a try...catch block, and if it throws this exception, you can consider it "invalid"? Granted, this is a naive heuristic, but it does the job. Anything else will still have to open the file, so you aren't going to save a significant amount performance-wise regardless, IMO. Commented Oct 16, 2008 at 23:41
  • See also: stackoverflow.com/questions/9354747/… Commented Mar 5, 2013 at 21:41
  • See also, for an alternative method: stackoverflow.com/q/2053662/2181514 Commented Mar 6, 2019 at 9:02

15 Answers

91

Here is my image check. I cannot rely on file extensions and have to check the format on my own. I am loading BitmapImages in WPF from byte arrays and don't know the format up front. WPF detects the format fine but does not tell you the image format of BitmapImage objects (at least I am not aware of a property for this). And I don't want to load the image again with System.Drawing only to detect the format. This solution is fast and works fine for me.

    public enum ImageFormat
    {
        bmp,
        jpeg,
        gif,
        tiff,
        png,
        unknown
    }

    public static ImageFormat GetImageFormat(byte[] bytes)
    {
        // see http://www.mikekunz.com/image_file_header.html
        var bmp = Encoding.ASCII.GetBytes("BM");          // BMP
        var gif = Encoding.ASCII.GetBytes("GIF");         // GIF
        var png = new byte[] { 137, 80, 78, 71 };         // PNG
        var tiff = new byte[] { 73, 73, 42 };             // TIFF
        var tiff2 = new byte[] { 77, 77, 42 };            // TIFF
        var jpeg = new byte[] { 255, 216, 255, 224 };     // jpeg
        var jpeg2 = new byte[] { 255, 216, 255, 225 };    // jpeg canon

        if (bmp.SequenceEqual(bytes.Take(bmp.Length)))
            return ImageFormat.bmp;

        if (gif.SequenceEqual(bytes.Take(gif.Length)))
            return ImageFormat.gif;

        if (png.SequenceEqual(bytes.Take(png.Length)))
            return ImageFormat.png;

        if (tiff.SequenceEqual(bytes.Take(tiff.Length)))
            return ImageFormat.tiff;

        if (tiff2.SequenceEqual(bytes.Take(tiff2.Length)))
            return ImageFormat.tiff;

        if (jpeg.SequenceEqual(bytes.Take(jpeg.Length)))
            return ImageFormat.jpeg;

        if (jpeg2.SequenceEqual(bytes.Take(jpeg2.Length)))
            return ImageFormat.jpeg;

        return ImageFormat.unknown;
    }
8 Comments

The above code was failing for a particular PNG file. When I checked, the first 4 bytes contained {80, 75, 3, 4} instead of the sequence you've mentioned. The image can be opened by normal viewers/editors. What's going on?
I have a JPEG with 255,216,255,237, so this doesn't work.
Just add this sequence of bytes to the code when it is valid for a JPEG, and the code will work fine.
Old but gold :) I updated it a little; see my answer below.
This is now a NuGet package for checking the 'magic bytes' - with over 120K downloads, I'd assume it works - nuget.org/packages/File.TypeChecker
35

Using Windows Forms:

    bool IsValidImage(string filename)
    {
        try
        {
            using (Image newImage = Image.FromFile(filename))
            {
            }
        }
        catch (OutOfMemoryException)
        {
            // The file does not have a valid image format.
            // -or- GDI+ does not support the pixel format of the file
            return false;
        }
        return true;
    }

Otherwise if you're using WPF you can do the following:

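    bool IsValidImage(string filename)
    {
        try
        {
            // BitmapImage takes a Uri rather than a path string and is not
            // IDisposable, so there is nothing to wrap in a using block.
            BitmapImage newImage = new BitmapImage(new Uri(Path.GetFullPath(filename)));
        }
        catch (NotSupportedException)
        {
            // System.NotSupportedException:
            // No imaging component suitable to complete this operation was found.
            return false;
        }
        return true;
    }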

You must release the image that was created. Otherwise, when you call this function a large number of times, it will throw OutOfMemoryException because the system ran out of resources, not because the image is corrupt. That yields an incorrect result, and if you delete images after this step, you could be deleting good ones.

15 Comments

Thanks :) . I was thinking about doing that, but I was wondering if there was a way to do this that is already built into the .NET framework. Since no one else mentioned any built-in functions in the .NET framework to do this, I believe that this would be a good solution.
You should probably catch OutOfMemoryException, which is the documented exception thrown if the file format is invalid. This means you would let FileNotFoundException propagate to the caller.
@dbkk: the VB reference really hurt. :)
@Ervin: the question asker didn't think so, but I do, obviously. In the context of programming, you're not trying to determine if a file is some sort of Platonic ideal of a JPEG; you're trying to determine whether your program can open it and display it. I think the best way is to let .Net try to open it and tell you if it can or can't do that.
OutOfMemoryException is indeed the correct exception to trap according to MSDN!!! msdn.microsoft.com/en-us/library/stf701f5.aspx Microsoft, you never cease to amaze and baffle.
23

JPEGs don't have a formal header definition, but they do have a small amount of metadata you can use.

  • Offset 0 (Two Bytes): JPEG SOI marker (FFD8 hex)
  • Offset 2 (Two Bytes): Image width in pixels
  • Offset 4 (Two Bytes): Image height in pixels
  • Offset 6 (Byte): Number of components (1 = grayscale, 3 = RGB)

There are a couple other things after that, but those aren't important.

You can open the file using a binary stream, read this initial data, and make sure that offset 0 holds the SOI marker (FFD8) and that offset 6 is a plausible component count (1 or 3).

That would at least give you slightly more precision.

Or you can just trap the exception and move on, but I thought you wanted a challenge :)
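As a minimal sketch of a marker check, assuming the whole file is already in a byte array: the code below verifies the SOI marker (FF D8) at the start and, as a variant, the End of Image marker (FF D9) at the end. The class and method names are hypothetical. Some otherwise valid JPEGs carry trailing bytes after the EOI marker, so treat a failed end check as a hint rather than proof.

```csharp
using System;

// Hypothetical helper for illustration; not part of any library.
public static class JpegMarkers
{
    // True when the buffer starts with SOI (FF D8).
    public static bool HasSoi(byte[] data)
    {
        return data != null && data.Length >= 2
            && data[0] == 0xFF && data[1] == 0xD8;
    }

    // True when the buffer starts with SOI (FF D8) and ends with EOI (FF D9).
    public static bool HasSoiAndEoi(byte[] data)
    {
        if (!HasSoi(data) || data.Length < 4)
            return false;

        return data[data.Length - 2] == 0xFF && data[data.Length - 1] == 0xD9;
    }
}
```

Neither check proves that the image decodes; it only filters out files that cannot be JPEGs at all.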

4 Comments

I would have gone ahead and read the header for the file and compared it to an array of .NET supported images' file headers. Eventually, I'll code that up and post it as a solution for anyone that would need it in the future.
Just reading the headers will not guarantee that the file is valid and won't throw an exception when opened in Image.FromFile().
No, but I didn't claim it would.
Please update JPEG format en.wikipedia.org/wiki/JPEG_File_Interchange_Format I will look for the first 2 bytes FFD8 and the last 2 bytes FFD9. What you say is offset2 and offset 4 is not valid or may not apply to all JPEG formats
21

Well, I went ahead and coded a set of functions to solve the problem. It checks the header first, then attempts to load the image in a try/catch block. It only checks for GIF, BMP, JPG, and PNG files. You can easily add more types by adding a header to imageHeaders.

    static bool IsValidImage(string filePath)
    {
        return File.Exists(filePath) &&
               IsValidImage(new FileStream(filePath, FileMode.Open, FileAccess.Read));
    }

    static bool IsValidImage(Stream imageStream)
    {
        if (imageStream.Length > 0)
        {
            byte[] header = new byte[4]; // Change size if needed.

            // Compare raw bytes: Encoding.ASCII maps bytes above 127 to '?',
            // so a string comparison can never match the JPEG marker FF D8.
            byte[][] imageHeaders = new[]
            {
                new byte[] { 0xFF, 0xD8 },         // JPEG
                Encoding.ASCII.GetBytes("BM"),     // BMP
                Encoding.ASCII.GetBytes("GIF"),    // GIF
                new byte[] { 137, 80, 78, 71 }     // PNG
            };

            imageStream.Read(header, 0, header.Length);

            bool isImageHeader = imageHeaders.Any(h => h.SequenceEqual(header.Take(h.Length)));
            if (isImageHeader)
            {
                try
                {
                    imageStream.Position = 0; // Let the decoder see the file from the start.
                    Image.FromStream(imageStream).Dispose();
                    imageStream.Close();
                    return true;
                }
                catch
                {
                }
            }
        }

        imageStream.Close();
        return false;
    }

3 Comments

Not quite. If imageStream.Read throws an exception, you still don't close it. Best to put a using statement around the stream instantiation.
@Joe I must disagree. He should not be closing or disposing of the stream in this function. This function didn't create the stream, and so should not perform unexpected behaviours. Also.. In case of success, Image.FromStream will consume the stream (which might be readonly, and can't be reset) meaning that a subsequent read of the stream later would fail since the stream had already been consumed. Also, upon success the image is loaded (very costly) and then disposed of immediately. If this method return true, it's likely the caller will load the image on the next line. So that's double work.
@Troy, I agree. It would be better for this method to take a byte array or some similar object that isn't affected by the method, especially since it's static.
14

You can do a rough typing by sniffing the header.

This means that each file format you implement will need to have an identifiable header...

JPEG: First 4 bytes are FF D8 FF E0 (actually just the first two bytes would do it for non jfif jpeg, more info here).

GIF: First 6 bytes are either "GIF87a" or "GIF89a" (more info here)

PNG: First 8 bytes are: 89 50 4E 47 0D 0A 1A 0A (more info here)

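TIFF: First 4 bytes are "II" or "MM" followed by the number 42 as a 16-bit value in the matching byte order, that is, 49 49 2A 00 or 4D 4D 00 2A (more info here)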

etc... you can find header/format information for just about any graphics format you care about and add to the things it handles as needed. What this won't do, is tell you if the file is a valid version of that type, but it will give you a hint about "image not image?". It could still be a corrupt or incomplete image, and thus crash when opening, so a try catch around the .FromFile call is still needed.
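The signatures listed above can be collected into a small lookup table. This is a rough sketch, not a definitive implementation: `ImageSniffer` and `Sniff` are hypothetical names, and the JPEG entry uses the three-byte prefix FF D8 FF so that non-JFIF files (Exif, for example) also match.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration; not part of any library.
public static class ImageSniffer
{
    // Format name paired with the leading bytes that identify it.
    private static readonly (string Name, byte[] Magic)[] Signatures =
    {
        ("jpeg", new byte[] { 0xFF, 0xD8, 0xFF }),
        ("gif",  new byte[] { 0x47, 0x49, 0x46, 0x38, 0x37, 0x61 }),             // "GIF87a"
        ("gif",  new byte[] { 0x47, 0x49, 0x46, 0x38, 0x39, 0x61 }),             // "GIF89a"
        ("png",  new byte[] { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A }),
        ("tiff", new byte[] { 0x49, 0x49, 0x2A, 0x00 }),                         // "II", little-endian 42
        ("tiff", new byte[] { 0x4D, 0x4D, 0x00, 0x2A }),                         // "MM", big-endian 42
    };

    // Returns the format name, or null when no signature matches.
    public static string Sniff(byte[] data)
    {
        if (data == null)
            return null;

        foreach (var (name, magic) in Signatures)
        {
            if (data.Length >= magic.Length && data.Take(magic.Length).SequenceEqual(magic))
                return name;
        }
        return null;
    }
}
```

A non-null result only says the file claims to be that format; loading it can still fail, so the try/catch around the actual load stays.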

3 Comments

hmm.. four people answered while I was typing that and collecting links. Busy place.
Please correct: for TIFF, the first 4 bytes are II* (49 49 2A 00) or MM* (4D 4D 00 2A).
For JPEG the first 3 bytes will do it, FFD8 is a SOI marker and FF?? is the APP marker where ?? usually is E0. So for non jfif jpeg 3 bytes FFD8FF will do it.
7

2019 here, .NET Core 3.1. I took Alex's answer and updated it a little.

    public static bool IsImage(this byte[] fileBytes)
    {
        var headers = new List<byte[]>
        {
            Encoding.ASCII.GetBytes("BM"),       // BMP
            Encoding.ASCII.GetBytes("GIF"),      // GIF
            new byte[] { 137, 80, 78, 71 },      // PNG
            new byte[] { 73, 73, 42 },           // TIFF
            new byte[] { 77, 77, 42 },           // TIFF
            new byte[] { 255, 216, 255, 224 },   // JPEG
            new byte[] { 255, 216, 255, 225 }    // JPEG CANON
        };

        return headers.Any(x => x.SequenceEqual(fileBytes.Take(x.Length)));
    }

Usage :

    public async Task UploadImage(Stream file)
    {
        using (MemoryStream ms = new MemoryStream())
        {
            await file.CopyToAsync(ms);

            byte[] bytes = ms.ToArray();

            if (!bytes.IsImage())
                throw new ArgumentException("Not an image", nameof(file));

            // Upload your file
        }
    }

1 Comment

Note that this doesn't handle all image formats, e.g. it doesn't handle .webp files.
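For WebP specifically, a check could look like the sketch below. It relies on the RIFF container layout: the ASCII bytes "RIFF", a four-byte size field, then the ASCII bytes "WEBP". `WebPCheck` and `IsWebP` are hypothetical names chosen for illustration.

```csharp
using System;

// Hypothetical helper for illustration; not part of any library.
public static class WebPCheck
{
    // Layout: "RIFF" (4 bytes), file size (4 bytes, ignored here), "WEBP" (4 bytes).
    public static bool IsWebP(byte[] data)
    {
        return data != null && data.Length >= 12
            && data[0] == 'R' && data[1] == 'I' && data[2] == 'F' && data[3] == 'F'
            && data[8] == 'W' && data[9] == 'E' && data[10] == 'B' && data[11] == 'P';
    }
}
```

Because the signature is split around the size field, it does not fit a plain "starts with these bytes" list and needs its own branch.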
6

This should do the trick - you don't have to read raw bytes out of the header:

    using (Image test = Image.FromFile(filePath))
    {
        bool isJpeg = (test.RawFormat.Equals(ImageFormat.Jpeg));
    }

Of course, you should trap the OutOfMemoryException too, which will save you if the file isn't an image at all.

And, ImageFormat has pre-set items for all the other major image types that GDI+ supports.

Note, you must use .Equals() and not == on ImageFormat objects (it is not an enumeration) because the operator == isn't overloaded to call the Equals method.

4

A method that supports Tiff and Jpeg also

    private bool IsValidImage(string filename)
    {
        Stream imageStream = null;
        try
        {
            imageStream = new FileStream(filename, FileMode.Open);

            if (imageStream.Length > 0)
            {
                byte[] header = new byte[30]; // Change size if needed.
                string[] imageHeaders = new[]
                {
                    "BM",       // BMP
                    "GIF",      // GIF
                    Encoding.ASCII.GetString(new byte[] { 137, 80, 78, 71 }), // PNG
                    "MM\x00\x2a", // TIFF
                    "II\x2a\x00"  // TIFF
                };

                imageStream.Read(header, 0, header.Length);

                bool isImageHeader = imageHeaders.Count(str => Encoding.ASCII.GetString(header).StartsWith(str)) > 0;
                if (imageStream != null)
                {
                    imageStream.Close();
                    imageStream.Dispose();
                    imageStream = null;
                }

                if (isImageHeader == false)
                {
                    // Verify if is jpeg
                    using (BinaryReader br = new BinaryReader(File.Open(filename, FileMode.Open)))
                    {
                        UInt16 soi = br.ReadUInt16();  // Start of Image (SOI) marker (FFD8)
                        UInt16 jfif = br.ReadUInt16(); // JFIF marker

                        return soi == 0xd8ff && (jfif == 0xe0ff || jfif == 57855);
                    }
                }

                return isImageHeader;
            }

            return false;
        }
        catch
        {
            return false;
        }
        finally
        {
            if (imageStream != null)
            {
                imageStream.Close();
                imageStream.Dispose();
            }
        }
    }

1 Comment

I tried this. It worked for most test cases but it failed for a particular valid jpg. The soi value matched but jfif for the jpg was 58111. I looked at the header and it had ICC_PROFILE and some other stuff in the header where JFIF was expected. JFIF was after that, much further down.
3

I noticed a couple of problems with all of the functions above. First, Image.FromFile keeps the image file open, so anyone who later tries to open that file for any reason gets a file-in-use error, even the application itself. So I switched to Image.FromStream.

After you switch APIs, the exception type changes from OutOfMemoryException to ArgumentException, for a reason that is unclear to me. (Probably a .NET Framework bug?)

Also, .NET may add support for more image file formats than our function currently checks, so it makes sense to try loading the image first and to report an error only if that fails.

So my code looks now like this:

    try
    {
        using (FileStream stream = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            Image im = Image.FromStream(stream);
            // Do something with image if needed.
        }
    }
    catch (ArgumentException)
    {
        if (!IsValidImageFormat(path))
            return SetLastError("File '" + fileName + "' is not a valid image");

        throw;
    }

Where:

    /// <summary>
    /// Check if we have valid Image file format.
    /// </summary>
    /// <param name="path"></param>
    /// <returns>true if it's image file</returns>
    public static bool IsValidImageFormat(String path)
    {
        using (FileStream fs = File.OpenRead(path))
        {
            byte[] header = new byte[10];
            fs.Read(header, 0, 10);

            foreach (var pattern in new byte[][]
            {
                Encoding.ASCII.GetBytes("BM"),
                Encoding.ASCII.GetBytes("GIF"),
                new byte[] { 137, 80, 78, 71 },     // PNG
                new byte[] { 73, 73, 42 },          // TIFF
                new byte[] { 77, 77, 42 },          // TIFF
                new byte[] { 255, 216, 255, 224 },  // jpeg
                new byte[] { 255, 216, 255, 225 }   // jpeg canon
            })
            {
                if (pattern.SequenceEqual(header.Take(pattern.Length)))
                    return true;
            }
        }

        return false;
    } // IsValidImageFormat

1

I took Semicolon's answer and converted to VB:

    Private Function IsValidImage(imageStream As System.IO.Stream) As Boolean
        If (imageStream.Length = 0) Then
            IsValidImage = False
            Exit Function
        End If

        Dim pngByte() As Byte = New Byte() {137, 80, 78, 71}
        Dim pngHeader As String = System.Text.Encoding.ASCII.GetString(pngByte)

        Dim jpgByte() As Byte = New Byte() {255, 216}
        Dim jpgHeader As String = System.Text.Encoding.ASCII.GetString(jpgByte)

        Dim bmpHeader As String = "BM"
        Dim gifHeader As String = "GIF"

        Dim header(3) As Byte

        Dim imageHeaders As String() = New String() {jpgHeader, bmpHeader, gifHeader, pngHeader}
        imageStream.Read(header, 0, header.Length)

        Dim isImageHeader As Boolean = imageHeaders.Count(Function(str) System.Text.Encoding.ASCII.GetString(header).StartsWith(str)) > 0

        If (isImageHeader) Then
            Try
                System.Drawing.Image.FromStream(imageStream).Dispose()
                imageStream.Close()
                IsValidImage = True
                Exit Function
            Catch ex As Exception
                System.Diagnostics.Debug.WriteLine("Not an image")
            End Try
        Else
            System.Diagnostics.Debug.WriteLine("Not an image")
        End If

        imageStream.Close()
        IsValidImage = False
    End Function

0

I would create a method like:

Image openImage(string filename); 

in which I handle the exception. If the returned value is Null, there is an invalid file name / type.

4 Comments

LOL, I must've been writing that as a comment when you posted this. I agree with this answer, it's simple enough to get the job done.
This method is just kind of wrong. You should not control program flow using exceptions. Also.. The exceptions returned from that particular call can be very misleading and ambiguous.
I don't see what's wrong with this. The person who wrote openImage chose to throw an exception if the image is invalid instead of providing a return value. So it seems to me that catching and handling the exception is the way they intended for you to deal with that situation.
Exceptions take resources. We all know that. It is kind of wrong! Do not be lazy!
0

You could read the first few bytes of the Stream and compare them to the magic header bytes for JPEG.
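A minimal sketch of that idea, assuming a seekable stream: read the leading bytes, compare them with a signature, and restore the stream position so the caller can still decode the image afterwards. `StreamPeek`, `PeekHeader`, and `StartsWith` are hypothetical names.

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical helper for illustration; not part of any library.
public static class StreamPeek
{
    // Reads up to `count` bytes from the start of a seekable stream,
    // then puts the stream back where it was.
    public static byte[] PeekHeader(Stream stream, int count)
    {
        if (!stream.CanSeek)
            throw new NotSupportedException("This sketch needs a seekable stream.");

        long saved = stream.Position;
        try
        {
            stream.Position = 0;
            var buffer = new byte[count];
            int total = 0;

            // Stream.Read may return fewer bytes than requested, so loop.
            while (total < count)
            {
                int read = stream.Read(buffer, total, count - total);
                if (read == 0)
                    break; // end of stream
                total += read;
            }

            Array.Resize(ref buffer, total);
            return buffer;
        }
        finally
        {
            stream.Position = saved;
        }
    }

    // True when the stream begins with the given signature bytes.
    public static bool StartsWith(Stream stream, byte[] signature)
    {
        return PeekHeader(stream, signature.Length).SequenceEqual(signature);
    }
}
```

For JPEG, the call would be `StreamPeek.StartsWith(stream, new byte[] { 0xFF, 0xD8, 0xFF })`. A non-seekable stream (a network stream, for example) would have to be buffered first.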

0

Here is my approach using multiple validations.

    public class ImageValidator
    {
        private readonly Dictionary<string, byte[]> _validBytes = new Dictionary<string, byte[]>()
        {
            { ".bmp", new byte[] { 66, 77 } },
            { ".gif", new byte[] { 71, 73, 70, 56 } },
            { ".ico", new byte[] { 0, 0, 1, 0 } },
            { ".jpg", new byte[] { 255, 216, 255 } },
            { ".png", new byte[] { 137, 80, 78, 71, 13, 10, 26, 10, 0, 0, 0, 13, 73, 72, 68, 82 } },
            { ".tiff", new byte[] { 73, 73, 42, 0 } },
        };

        /// <summary>
        /// image formats to validate using Guids from ImageFormat.
        /// </summary>
        private readonly Dictionary<Guid, string> _validGuids = new Dictionary<Guid, string>()
        {
            { ImageFormat.Jpeg.Guid, ".jpg" },
            { ImageFormat.Png.Guid, ".png" },
            { ImageFormat.Bmp.Guid, ".bmp" },
            { ImageFormat.Gif.Guid, ".gif" },
            { ImageFormat.Tiff.Guid, ".tiff" },
            { ImageFormat.Icon.Guid, ".ico" }
        };

        /// <summary>
        /// Supported extensions: .jpg,.png,.bmp,.gif,.tiff,.ico
        /// </summary>
        /// <param name="allowedExtensions"></param>
        public ImageValidator(string allowedExtensions = ".jpg;.png")
        {
            var exts = allowedExtensions.Split(';');

            foreach (var pair in _validGuids.ToArray())
            {
                if (!exts.Contains(pair.Value))
                {
                    _validGuids.Remove(pair.Key);
                }
            }

            foreach (var pair in _validBytes.ToArray())
            {
                if (!exts.Contains(pair.Key))
                {
                    _validBytes.Remove(pair.Key);
                }
            }
        }

        [System.Diagnostics.CodeAnalysis.SuppressMessage("Style", "IDE0063:Use simple 'using' statement", Justification = "<Pending>")]
        [System.Diagnostics.CodeAnalysis.SuppressMessage("Design", "CA1031:Do not catch general exception types", Justification = "<Pending>")]
        public async Task<bool> IsValidAsync(Stream imageStream, string filePath)
        {
            if (imageStream == null || imageStream.Length == 0)
            {
                return false;
            }

            // First validate using file extension
            string ext = Path.GetExtension(filePath).ToLower();
            if (!_validGuids.ContainsValue(ext))
            {
                return false;
            }

            // Check mimetype by content
            if (!await IsImageBySigAsync(imageStream, ext))
            {
                return false;
            }

            try
            {
                // Validate file using Guid.
                using (var image = Image.FromStream(imageStream))
                {
                    imageStream.Position = 0;
                    var imgGuid = image.RawFormat.Guid;
                    if (!_validGuids.ContainsKey(imgGuid))
                    {
                        return false;
                    }

                    var validExtension = _validGuids[imgGuid];
                    if (validExtension != ext)
                    {
                        return false;
                    }
                }
            }
            catch (OutOfMemoryException)
            {
                return false;
            }
            catch (ArgumentException)
            {
                // Image.FromStream reports an invalid image format this way.
                return false;
            }

            return true;
        }

        /// <summary>
        /// Validate the mimetype using byte and file extension.
        /// </summary>
        /// <param name="imageStream"></param>
        /// <param name="extension"></param>
        /// <returns></returns>
        private async Task<bool> IsImageBySigAsync(Stream imageStream, string extension)
        {
            var length = _validBytes.Max(x => x.Value.Length);
            byte[] imgByte = new byte[length];
            await imageStream.ReadAsync(imgByte, 0, length);
            imageStream.Position = 0;

            if (_validBytes.ContainsKey(extension))
            {
                var validImgByte = _validBytes[extension];
                if (imgByte.Take(validImgByte.Length).SequenceEqual(validImgByte))
                {
                    return true;
                }
            }

            return false;
        }
    }

0
    public enum ImageFormat
    {
        Bmp,
        Jpeg,
        Gif,
        Tiff,
        Png,
        Unknown
    }

    public static ImageFormat GetImageFormat(byte[] bytes)
    {
        if (bytes.Length >= 2 && bytes[0] == 0x42 && bytes[1] == 0x4D)
        {
            return ImageFormat.Bmp; // BMP
        }

        if (bytes.Length >= 3 && bytes[0] == 0x47 && bytes[1] == 0x49 && bytes[2] == 0x46)
        {
            return ImageFormat.Gif; // GIF
        }

        if (bytes.Length >= 8 && bytes[0] == 0x89 && bytes[1] == 0x50 && bytes[2] == 0x4E && bytes[3] == 0x47 &&
            bytes[4] == 0x0D && bytes[5] == 0x0A && bytes[6] == 0x1A && bytes[7] == 0x0A)
        {
            return ImageFormat.Png; // PNG
        }

        if (bytes.Length >= 4 && bytes[0] == 0x49 && bytes[1] == 0x49 && bytes[2] == 0x2A && bytes[3] == 0x00)
        {
            return ImageFormat.Tiff; // TIFF
        }

        if (bytes.Length >= 4 && bytes[0] == 0x4D && bytes[1] == 0x4D && bytes[2] == 0x00 && bytes[3] == 0x2A)
        {
            return ImageFormat.Tiff; // TIFF
        }

        if (bytes.Length >= 2 && bytes[0] == 0xFF && bytes[1] == 0xD8)
        {
            return ImageFormat.Jpeg; // JPEG
        }

        return ImageFormat.Unknown;
    }

1 Comment

You should add explanation.
-1

In case you need that data for other operations and/or for other file types (PSD, for example) later on, using the Image.FromStream function is not necessarily a good idea.
