What makes this more complex is the need to match up multiple lines of the same acct/amt such as:
000681047 -600.00
000681047 -600.00
000681047 600.00
000681047 600.00
000681047 600.00
The BEGIN block reads the file once and stores every negative line in a sequentially numbered array (negrecs, with the matching acct/amt key in negkeys). At the same time it builds negindexes, a lookup keyed on acct/amt, so that I can tell immediately whether a positive line has any offsetting line stored without scanning the whole array. If I store three lines for 000681047-600.00, negindexes holds a single entry representing all of them.
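As a small illustration of why a single negindexes entry can stand for several stored lines, here is a sketch that builds the key both ways. The two-field layout (acct in $1, amt in $2) and the name show_keys are assumptions made for this demo only; the real script takes the amount from $6.

```shell
# Hypothetical demo: print the lookup key awk builds for each input line.
show_keys() {
    awk '{
        if ($2 < 0) key = $1 $2         # negative: the sign supplies the "-"
        else        key = $1 "-" $2     # positive: add the "-" explicitly
        print key
    }'
}

printf '%s\n' '000681047 -600.00' '000681047 600.00' | show_keys
```

Both lines come out as 000681047-600.00, which is what lets a positive line find its offsetting negatives with one lookup.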
Awk then processes all the positive lines. For those whose acct/amt has negative lines stored, check_for_offset is called. It scans the array in an attempt to find an offsetting negative.
If one is found, it does NOT print the positive line, and it DOES delete the negative line from the array so that it cannot be matched again.
If one is NOT found, all the negatives for this acct/amt have already been consumed (matched and deleted). In that case check_for_offset DOES print the unmatched positive line, and ALSO deletes thiskey from negindexes, since it is no longer true that this acct/amt has one or more negative lines stored.
Any negative lines still stored once the file has been processed were never matched, and the END block prints them.
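awk '
# Pass 1 (BEGIN): read the whole file once and store every negative line.
BEGIN {
    while ((getline < "cam.in") > 0)
        if ($6 < 0) {
            negmax++
            negrecs[negmax] = $0        # the full negative line
            negkeys[negmax] = $1 $6     # acct followed by amt, e.g. 000681047-600.00
            negindexes[$1 $6] = 1       # one entry per acct/amt, however many lines
        }
    close("cam.in")
}

# Called for a positive line whose acct/amt has at least one negative stored.
function check_for_offset(    i) {
    for (i = 1; i <= negmax; i++)
        if (thiskey == negkeys[i]) {
            # Offset found: consume the negative and suppress the positive.
            delete negrecs[i]
            delete negkeys[i]
            return
        }
    # Every negative for this key is already consumed: keep the positive
    # and stop flagging the key.
    print $0
    delete negindexes[thiskey]
}

# Pass 2: positive (and zero) lines.
$6 >= 0 {
    thiskey = $1 "-" $6     # same form as the stored negative key
    if (thiskey in negindexes)
        check_for_offset()
    else
        print
}

# Negatives that were never matched are printed last.
END {
    for (i in negrecs)
        print negrecs[i]
}' cam.in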
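To try the logic on the five sample lines above, here is a sketch that wraps the same approach in a shell function. The name offset_filter and its second argument (the amount column, defaulting to 6) are hypothetical additions so that the two-column sample can be used as is; they are not part of the script above.

```shell
# Sketch only: offset_filter and its column argument are hypothetical.
offset_filter() {
    awk -v file="$1" -v col="${2:-6}" '
    BEGIN {
        # Store every negative line, keyed by acct followed by amt.
        while ((getline < file) > 0)
            if ($col < 0) {
                negrecs[++negmax] = $0
                negkeys[negmax] = $1 $col
                negindexes[$1 $col] = 1
            }
        close(file)
    }
    function check_for_offset(    i) {
        for (i = 1; i <= negmax; i++)
            if (thiskey == negkeys[i]) {
                delete negrecs[i]       # consume the matched negative
                delete negkeys[i]
                return
            }
        print $0                        # no negative left for this key
        delete negindexes[thiskey]
    }
    $col >= 0 {
        thiskey = $1 "-" $col
        if (thiskey in negindexes)
            check_for_offset()
        else
            print
    }
    END {
        for (i in negrecs)              # negatives that were never matched
            print negrecs[i]
    }' "$1"
}

sample=$(mktemp)
printf '%s\n' '000681047 -600.00' '000681047 -600.00' \
    '000681047 600.00' '000681047 600.00' '000681047 600.00' > "$sample"
offset_filter "$sample" 2
rm -f "$sample"
```

With two negatives and three positives, the first two positives each consume one negative, so only the third positive line is printed.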