3

I am trying to find the maximum value of a column of data using gawk:

gawk 'BEGIN{max=0} {if($1>0+max) max=$1} END {print max}' dataset.dat 

where dataset.dat looks like this:

2.0
2.0e-318

The output of the command is

2.0e-318

which is clearly smaller than 2.

Where is my mistake?

Edit

Interestingly enough, if you swap the rows of the input file, the output becomes

2.0

Edit 2

My gawk version is GNU Awk 4.2.1, API: 2.0 (GNU MPFR 4.0.2, GNU MP 6.1.2).

2 Answers

4

There are several issues with processing such small numbers (2e-318) in awk.

  • First, the input needs to be converted to a number before using it. That is usually done by adding 0. So, you need something like:

    val=0+$1 
  • Second, normal double-precision floats (53-bit mantissa, 11-bit exponent) can only represent magnitudes between roughly 2.2e-308 and 1.8e308. A value such as 2e-318 lies below that range, so it cannot be stored as a normal float:

    $ echo '1e-307 1e-308' | awk '{print $1,$1+0,$2,$2+0}'
    1e-307 1e-307 1e-308 0

    Default GNU awk (without -M) does not accept values below the normal range; they convert to 0, as 1e-308 does in the output above.

  • Third, the default conversion formats for awk (both CONVFMT and OFMT) are set to "%.6g", so numbers with more than 6 significant figures are rounded when printed or converted to a string. To get more significant figures, ask for them, for example %.15g for 15 (don't ask for more than 17 with a 53-bit mantissa; the extra digits could lie).

  • Fourth, it is better to initialize max with the first input value. Starting from max=0 gives the wrong answer when every input is negative, because the true maximum is then below 0.
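As a minimal sketch of the third and fourth points only, the following works with any awk and does not need -M. The helper name column_max is hypothetical (it is not part of awk), and the sketch assumes one number per line in the first column. It does not address the tiny-exponent problem from the second point.

```shell
# column_max: hypothetical helper that prints the largest value in column 1.
column_max() {
    awk '
        BEGIN     { OFMT = "%.15g" }   # print up to 15 significant digits
                  { val = $1 + 0 }     # force numeric conversion of the field
        NR == 1   { max = val }        # seed max with the first input value
        val > max { max = val }        # keep the larger value
        END       { print max }
    ' "$@"
}

# All-negative input: seeding from the first line avoids reporting 0.
printf '%s\n' -5 -2 -9 | column_max            # prints -2

# More than 6 significant figures survive because OFMT was widened.
printf '%s\n' 1.23456789012 1.1 | column_max   # prints 1.23456789012
```

With the default OFMT the second call would print 1.23457 instead.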


If you are using GNU awk and it has been compiled with arbitrary precision you can use:

    $ printf '%s\n' 2e-318 2e-317 2e-307 2e-308 2e-319 | awk -M -v PREC=100 'BEGIN{OFMT="%.15g"}; {val=0+$1}; NR==1{max=val}; {print($1,val,max)}; val>max{max=val} END{print max}'
    2e-318 2e-318 2e-318
    2e-317 2e-317 2e-318
    2e-307 2e-307 2e-317
    2e-308 2e-308 2e-307
    2e-319 2e-319 2e-307
    2e-307

Or simplified to your use case:

    awk -M -v PREC=100 '
        BEGIN{OFMT="%.15g"};    # allow more than 6 figures
        {val=0+$1};             # convert input to a (float) number.
        NR==1{max=val};         # On the first line, set the max value.
        val>max{max=val}        # On every entry keep track of the max.
        END{print max}          # At the end, print the max.
    ' file                      # file with input (one per line).
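The -M option only helps if gawk was built with arbitrary-precision support. One rough way to check, assuming the version banner mentions MPFR in that case (as the banner quoted in the question does), is sketched below. The helper name has_mpfr is hypothetical.

```shell
# has_mpfr: hypothetical helper, succeeds if the gawk on PATH mentions MPFR
# on the first line of its --version banner.
has_mpfr() {
    command -v gawk >/dev/null 2>&1 || return 1
    gawk --version 2>/dev/null | head -n 1 | grep -q 'MPFR'
}

if has_mpfr; then
    echo "gawk advertises MPFR support; -M should be usable"
else
    echo "no MPFR support detected; -M may be unavailable or ignored"
fi
```

This only inspects the banner text, so treat a positive result as a hint rather than a guarantee.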
  • Excellent answer -- whole new area of gawk for me. OP's posted awk version confirms it was built with arbitrary precision (MPFR and MP libraries). Mild surprise that huge numbers return 0.0, and not NaN or Inf, though. Commented Jan 29, 2020 at 9:17
2

The 0+ needs to be prefixed to each $1 to force a numeric conversion. max does not need 0+ -- it is already cast to numeric when it is stored.

    Paul--) AWK='
    > BEGIN { max = 0; }
    > 0+$1 > max { max = 0 + $1; }
    > END { print max; }
    > '
    Paul--) awk "${AWK}" <<[][]
    > 2.0
    > 2.0e-318
    > [][]
    2
    Paul--) awk "${AWK}" <<[][]
    > 2.0e-318
    > 2.0
    > [][]
    2
