The problem has been resolved in the question "GDAL does not ignore NoData value". With GetStatistics() you get:
    from osgeo import gdal

    f = gdal.Open("a.tif")
    bands = f.RasterCount
    print(bands)  # 3

    for j in range(bands):
        band = f.GetRasterBand(j + 1)  # GDAL band indices start at 1
        stats = band.GetStatistics(True, True)
        print("[ STATS ] = Minimum=%.3f, Maximum=%.3f, Mean=%.3f, StdDev=%.3f" % (
            stats[0], stats[1], stats[2], stats[3]))

Output:

    [ STATS ] = Minimum=17.000, Maximum=255.000, Mean=220.586, StdDev=39.705
    [ STATS ] = Minimum=64.000, Maximum=255.000, Mean=214.975, StdDev=36.926
    [ STATS ] = Minimum=45.000, Maximum=255.000, Mean=179.029, StdDev=68.234
But if you read each band with band.ReadAsArray() (which returns a NumPy array) and compute the statistics from the array, the results differ:
    for j in range(bands):
        band = f.GetRasterBand(j + 1)
        data = band.ReadAsArray()  # whole band as a NumPy array
        print("[ Numpy ] = Minimum=%.3f, Maximum=%.3f, Mean=%.3f, StdDev=%.3f" % (
            data.min(), data.max(), data.mean(), data.std()))

Output:

    [ Numpy ] = Minimum=0.000, Maximum=255.000, Mean=220.477, StdDev=42.584
    [ Numpy ] = Minimum=31.000, Maximum=255.000, Mean=214.955, StdDev=39.558
    [ Numpy ] = Minimum=0.000, Maximum=255.000, Mean=178.856, StdDev=69.535
Why? The explanation is given in "GDAL does not ignore NoData value":
GetStatistics() reuses previously computed statistics if they exist (i.e. statistics computed before you set the NoData value). To force the statistics to be recomputed, use stats = band.ComputeStatistics(0) instead of GetStatistics(). The argument 0 is approx_ok, so exact statistics are computed from all pixels:
    for j in range(bands):
        band = f.GetRasterBand(j + 1)
        stats = band.ComputeStatistics(0)  # recompute instead of reusing stored values
        print("[ STATS ] = Minimum=%.3f, Maximum=%.3f, Mean=%.3f, StdDev=%.3f" % (
            stats[0], stats[1], stats[2], stats[3]))

Output:

    [ STATS ] = Minimum=0.000, Maximum=255.000, Mean=220.477, StdDev=42.584
    [ STATS ] = Minimum=31.000, Maximum=255.000, Mean=214.955, StdDev=39.558
    [ STATS ] = Minimum=0.000, Maximum=255.000, Mean=178.856, StdDev=69.535
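If you want to check by hand how much the NoData cells influence the NumPy figures, one option is to drop those cells before computing the statistics. The following is only a minimal sketch: the helper name stats_ignoring_nodata and the small sample array are made up for illustration and are not part of GDAL or of the original data.

```python
import numpy as np


def stats_ignoring_nodata(data, nodata=None):
    """Return (min, max, mean, std) of data, skipping cells equal to nodata.

    Hypothetical helper for illustration; std is the population
    standard deviation (NumPy's default, ddof=0).
    """
    arr = np.asarray(data, dtype=float)
    if nodata is not None:
        arr = arr[arr != nodata]  # keep only the valid cells (flattens the array)
    if arr.size == 0:
        raise ValueError("no valid cells left after removing NoData")
    return arr.min(), arr.max(), arr.mean(), arr.std()


# Stand-in for band.ReadAsArray(); 0 plays the role of the NoData value here.
sample = np.array([[0, 10, 20], [30, 0, 40]])

print(stats_ignoring_nodata(sample))            # every cell counted
print(stats_ignoring_nodata(sample, nodata=0))  # NoData cells excluded
```

With a real band you would probably pass band.ReadAsArray() as data and band.GetNoDataValue() as nodata. GetNoDataValue() returns None when no NoData value is set, in which case the helper counts every cell.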