awk - numeric comparison fails

Original Message
Name: swascelot
Date: March 31, 2008 at 17:36:56 Pacific
Subject: awk - numeric comparison fails
OS: Mac OS X 10.4
CPU/Ram: 2 GB
Model/Manufacturer: MacBook Pro
Comment:

I'm trying to write what I expected to be a reasonably
simple awk routine to generate a list of numbers as part
of a larger bash shell script (since I'll need this done for
many different files).

For example:

echo | awk '{
x = 0.0; y = 0.0;
do {
printf("%.1f\n",x)
if ( x < 3.0 ) { y = 0.3 }
else if ( x < 6.0 ) { y = 1.0 }
else if ( x < 15.0 ) { y = 3.0 }
else if ( x < 25.0 ) { y = 5.0 }
else { y = 10.0 }
x = x + y;
} while ( x < 55.0 )
}'

I expected this to produce this list of numbers:
0 0.3 0.6 ... 2.4 2.7 3.0 4.0 5.0 6.0 9.0 12 15 20 25 35 45
55

what I get is:
0 0.3 0.6 ... 2.7 3.0 3.3 4.3 5.3 ...

When x = 3 in the loop (I printed x after each "if"
conditional), the comparison seems to fail: ( x < 3 )
returns true even when x = 3, and y remains equal to 0.3
for one iteration longer than it should (as if I had used "<="
instead of "<"). I manage to get the right results if I
initialize x = 0.0001 and use printf("%.1f\n",x). Fine, but
why is this necessary? (It's terribly unsatisfying.)

I found post 8067
(http://www.computing.net/unix/wwwboard/forum/8067.html),
which suggests it might be because awk treats the
value of x as a string, but I tried adding 0 to x ( x += 0; or
x = x + 0 ) and the results are the same.

I've tried every variation I can think of and searched a
number of forums (but my question is so general I'm not
sure I know how to search for it). I think I even had a case
where ( x < 3 ) did not work inside the "do" loop but
( x < 55 ) did work as the while condition... but I haven't
managed to reproduce that.

Can anyone explain why my " x < 3 " is acting like
" x <= 3 "? I would greatly appreciate your help (and I might be
able to move on and get more work done).

Thank you!




Response Number 1
Name: ghostdog
Date: March 31, 2008 at 18:19:05 Pacific
Reply:

try to print out your variables at each stage and see what's wrong with the logic.



Response Number 2
Name: swascelot
Date: March 31, 2008 at 20:09:23 Pacific
Reply:


$ echo | awk '{
> x = 0.0; y = 0.0;
> do {
> printf("%.1f\t",x)
> if (x < 3) {y = 0.3; print "(x<3) x=" x "; y=" y }
> else if (x < 6) {y = 1; print "(x<6) x=" x "; y=" y }
> else if (x < 15) {y = 3; print "(x<15) x=" x "; y=" y }
> x += y;
> } while ( x <= 15 )
> }'
0.0 (x < 3) x = 0; y = 0.3
0.3 (x < 3) x = 0.3; y = 0.3
0.6 (x < 3) x = 0.6; y = 0.3
0.9 (x < 3) x = 0.9; y = 0.3
1.2 (x < 3) x = 1.2; y = 0.3
1.5 (x < 3) x = 1.5; y = 0.3
1.8 (x < 3) x = 1.8; y = 0.3
2.1 (x < 3) x = 2.1; y = 0.3
2.4 (x < 3) x = 2.4; y = 0.3
2.7 (x < 3) x = 2.7; y = 0.3
3.0 (x < 3) x = 3; y = 0.3 <-- ( 3 < 3 ) --> TRUE?!?
3.3 (x < 6) x = 3.3; y = 1
4.3 (x < 6) x = 4.3; y = 1
5.3 (x < 6) x = 5.3; y = 1
6.3 (x < 15) x = 6.3; y = 3
9.3 (x < 15) x = 9.3; y = 3
12.3 (x < 15) x = 12.3; y = 3
$

When x = 3.0, the statement "if (x < 3)" evaluates true(?)
and sets y = 0.3 ... It looks like it's acting as "<=" instead
of "<". Should I be initializing the variables differently to
force awk to see that 3.0 == 3.0 and is not less than 3.0?
I tried inserting "x += 0" right before "if (x < 3.0)..."
and this didn't work either.

Thank you



Response Number 3
Name: ghostdog
Date: March 31, 2008 at 21:11:34 Pacific
Reply:

floating point numbers are approximations only. so a value that prints as 3.0 may not be exactly 3. it may be 2.99999 for example, which is still less than 3. you might want to think of a workaround for your particular problem.
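
To see the approximation directly, print the running sum with more digits than "%.1f" shows. This is a minimal sketch, assuming an awk that stores numbers as IEEE 754 doubles and hands "%.17g" to the C library's printf (the common implementations do, as far as I know). The variable names are only for illustration.

```shell
# Add 0.3 ten times, then show the sum at one decimal place and at
# 17 significant digits, plus the result of the (x < 3) test.
out=$(awk 'BEGIN {
    x = 0
    for (i = 1; i <= 10; i++) x += 0.3    # same accumulation as the loop in the question
    printf("%.1f %.17g %s\n", x, x, (x < 3) ? "less" : "not-less")
}')
echo "$out"    # with IEEE 754 doubles: 3.0 2.9999999999999996 less
```

The value that prints as 3.0 is in fact slightly below 3, which is why ( x < 3 ) is still true on that pass.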



Response Number 4
Name: swascelot
Date: April 1, 2008 at 07:41:50 Pacific
Reply:

Thank you for the sanity check. initializing x = 0.00001 is a
rough workaround. but (x < 3) does work sometimes. maybe
just when dealing with integers? perhaps bc or perl would
work better for anything more complicated than a list of
numbers.

Thanks again.
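
On the guess about integers: whole numbers of moderate size (up to 2^53) are stored exactly in a double, so sums of whole-number steps stay exact, while sums of most decimal fractions do not. A small sketch of the difference, again assuming IEEE 754 doubles; the names n and f are made up for the example.

```shell
# Compare ten whole-number steps with ten fractional steps.
ints=$(awk 'BEGIN {
    n = 0; for (i = 1; i <= 10; i++) n += 1      # whole-number steps
    f = 0; for (i = 1; i <= 10; i++) f += 0.1    # fractional steps
    printf("%s %s\n", (n == 10) ? "exact" : "inexact", (f == 1) ? "exact" : "inexact")
}')
echo "$ints"    # with IEEE 754 doubles: exact inexact
```

That would explain why a test like ( x < 55 ) can behave as expected once x is only ever changed by whole-number amounts from an exact starting value.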



Response Number 5
Name: James Boothe
Date: April 1, 2008 at 13:16:41 Pacific
Reply:

I observed the same thing.  I put in checks for x in the range of 0.3 to 3.3. The 0.9 did not test exact, nor anything past 2.4.

x is now exactly 0.3
0.300000
x is now exactly 0.6
0.600000
0.900000
x is now exactly 1.2
1.200000
x is now exactly 1.5
1.500000
x is now exactly 1.8
1.800000
x is now exactly 2.1
2.100000
x is now exactly 2.4
2.400000
2.700000
3.000000
3.300000

My workaround for this is to avoid the decimal points.  I just multiplied everything by 10, and changed the printf to print x/10.  But it also works to do:
xfinal=x/10
printf xfinal

(and 30/10 did yield an exactly 3.0)

echo | awk '{
x = 0; y = 0;
do {
printf("%.1f\n",x/10)
if ( x < 30 ) { y = 3 }
else if ( x < 60 ) { y = 10 }
else if ( x < 150 ) { y = 30 }
else if ( x < 250 ) { y = 50 }
else { y = 100 }
x = x + y;
} while ( x < 550 )
}'

0.0
0.3
0.6
0.9
1.2
1.5
1.8
2.1
2.4
2.7
3.0
4.0
5.0
6.0
9.0
12.0
15.0
20.0
25.0
35.0
45.0
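
Another option, not the one used in this thread, is to keep the decimal values and compare against each threshold with a small tolerance. The sketch below covers only the first two step sizes, and eps = 1e-9 is an arbitrary assumption chosen to be far smaller than the 0.3 step; it is not a general-purpose value.

```shell
# Same stepping as the first part of the original script, but each
# threshold is compared with a small tolerance (eps) subtracted.
list=$(awk 'BEGIN {
    eps = 1e-9                      # hypothetical tolerance for this example
    x = 0; sep = ""
    while (x < 6 - eps) {
        printf("%s%.1f", sep, x); sep = " "
        if (x < 3 - eps) { y = 0.3 }   # a sum just under 3 no longer counts as "below 3"
        else             { y = 1.0 }
        x += y
    }
    print ""
}')
echo "$list"    # with IEEE 754 doubles: 0.0 0.3 0.6 0.9 1.2 1.5 1.8 2.1 2.4 2.7 3.0 4.0 5.0
```

The integer scaling above avoids the rounding altogether, whereas a tolerance only hides it, so scaling is probably the safer choice when the steps are simple decimals.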




Response Number 6
Name: swascelot
Date: April 1, 2008 at 14:45:01 Pacific
Reply:

Thank you, I've used your 10x trick before to keep track of
decimals as integers in bash and avoid using awk or bc.

It's strange that the exact test fails for x = 0.9 and x > 2.4
( and 30/10 == 3 ).

$ a=0; b=0.3;
$ echo $a $b | awk '{
for (x=0; x<=12; ++x) {
printf("a =%18.15f b =%18.15f a + b =%23.20f\n",$1,$2,$1 + $2)
$1 = $1 + $2;
} }'
...
## I don't know how to format the output nicely for a
## post, so here's an excerpt:
...
a = 0.000000000000000 \
b = 0.300000000000000 \
a + b = 0.29999999999999998890
a = 0.300000000000000 \
b = 0.300000000000000 \
a + b = 0.59999999999999997780
...
a = 1.200000000000000 \
b = 0.300000000000000 \
a + b = 1.50000000000000000000
a = 1.500000000000000 \
b = 0.300000000000000 \
a + b = 1.80000000000000004441


Right off the bat, a = 0.0 and a+b = 0.2999..., but the
next a is 0.3000. The only exception is when a =
1.20000... and a+b = 1.500000... (maybe I just need
more decimal places). The "exactly" test fails at
0.9 and above 2.4 again. Hmmm... something in the
subtleties of floating point values, I guess. (I read
somewhere that the decimal 0.1 can't be represented
exactly as a floating point number in base 2.)

I'll have to stick with your 10x suggestion and get back to
work.
Thank you!
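
The remark about 0.1 can be checked by printing a few constants with 20 decimal places. A short sketch, assuming IEEE 754 doubles and a C library printf that prints the stored value's digits rather than padding with zeros:

```shell
# Print stored values with more digits than the default output shows.
digits=$(awk 'BEGIN {
    printf("%.20f\n", 0.1)    # not a finite binary fraction, so it is rounded when stored
    printf("%.20f\n", 0.3)    # likewise rounded
    printf("%.20f\n", 0.5)    # 1/2 is a power of two and is stored exactly
    printf("%.15f\n", 0.3)    # 15 decimals round the error away, as in the "a =" column
}')
echo "$digits"
```

With only 15 decimals the stored 0.3 prints as 0.300000000000000, which is why each "a" in the excerpt looks exact while the 20-decimal "a + b" column shows the error.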







