Cancel Out lines in a file

Original Message
Name: CamTu
Date: March 10, 2005 at 10:01:43 Pacific
Subject: Cancel Out lines in a file
OS: UNIX
CPU/Ram: N/A
Comment:

Hi all,
I have a text file with some duplicate lines, and I want to remove lines that share the same ID where one is a positive charge and the other is a negative charge. Is there a way I can accomplish this?
Here is my input file:

[code]
000681047 Allison Ancel 200530 EASY -600.00 3/9/05 0:00 Wrong Term
000681047 Allison Ancel 200530 EASY 600.00 3/9/05 0:00 SCAD Debit Card Charge
000681047 Allison Ancel 200530 EASY 600.00 3/9/05 0:00 SCAD Debit Card Charge
000699678 Ashley Willem 200520 EASY 650.00 3/9/05 0:00 SCAD Debit Card Charge
000702927 Charla Peters 200520 EASY 260.00 3/9/05 0:00 SCAD Debit Card Charge
000703541 Dana Williams 200520 EASY -20.00 3/9/05 0:00 Payment for ID
000672954 Emily Voegtlin 200520 EASY 5.00 3/9/05 0:00 SCAD Debit Card Charge
000657138 Jimmy Gilbert 200530 EASY 600.00 3/9/05 0:00 SCAD Debit Card Charge
000692343 Paul Martine 200530 EASY -200.00 3/9/05 0:00 Not deposit
000692343 Paul Martine 200530 EASY 200.00 3/9/05 0:00 SCAD Debit Card Charge
[/code]

I want the output file to look like this:
[code]
000681047 Allison Ancel 200530 EASY 600.00 3/9/05 0:00 SCAD Debit Card Charge
000699678 Ashley Willem 200520 EASY 650.00 3/9/05 0:00 SCAD Debit Card Charge
000702927 Charla Peters 200520 EASY 260.00 3/9/05 0:00 SCAD Debit Card Charge
000703541 Dana Williams 200520 EASY -20.00 3/9/05 0:00 Payment for ID
000672954 Emily Voegtlin 200520 EASY 5.00 3/9/05 0:00 SCAD Debit Card Charge
000657138 Jimmy Gilbert 200530 EASY 600.00 3/9/05 0:00 SCAD Debit Card Charge
[/code]

Thanks


Response Number 1
Name: nails
Date: March 10, 2005 at 11:07:21 Pacific
Subject: Cancel Out lines in a file
Reply:

Hi:

Here is a way:

#!/bin/ksh

sed -n '/Debit Card Charge$/p' data.file|uniq

You seem to have left Paul Martine out of your final data file.



Response Number 2
Name: CamTu
Date: March 10, 2005 at 11:49:56 Pacific
Subject: Cancel Out lines in a file
Reply:

hi,

Your code only removes the lines with negative charges. I also want to remove the positive charges for the same ID number.
There are 3 lines for Allison Ancel and 2 lines for Paul Martine. So in the final output file I only want Allison Ancel, but not Paul Martine, because those two lines should cancel each other out.
Thanks

CT
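
To see the gap concretely, the one-liner from Response 1 can be run against the sample data. This is only a sketch: sample_data is a hypothetical helper that holds the ten input lines from the original message, so nothing needs to exist on disk.

```shell
#!/bin/sh
# sample_data: hypothetical helper that prints the input lines from the
# original message, so the pipeline can be tried without a data file.
sample_data() {
    cat <<'EOF'
000681047 Allison Ancel 200530 EASY -600.00 3/9/05 0:00 Wrong Term
000681047 Allison Ancel 200530 EASY 600.00 3/9/05 0:00 SCAD Debit Card Charge
000681047 Allison Ancel 200530 EASY 600.00 3/9/05 0:00 SCAD Debit Card Charge
000699678 Ashley Willem 200520 EASY 650.00 3/9/05 0:00 SCAD Debit Card Charge
000702927 Charla Peters 200520 EASY 260.00 3/9/05 0:00 SCAD Debit Card Charge
000703541 Dana Williams 200520 EASY -20.00 3/9/05 0:00 Payment for ID
000672954 Emily Voegtlin 200520 EASY 5.00 3/9/05 0:00 SCAD Debit Card Charge
000657138 Jimmy Gilbert 200530 EASY 600.00 3/9/05 0:00 SCAD Debit Card Charge
000692343 Paul Martine 200530 EASY -200.00 3/9/05 0:00 Not deposit
000692343 Paul Martine 200530 EASY 200.00 3/9/05 0:00 SCAD Debit Card Charge
EOF
}

# Keep only lines ending in "Debit Card Charge", then collapse adjacent duplicates.
sample_data | sed -n '/Debit Card Charge$/p' | uniq
```

On this sample the pipeline keeps Paul Martine's 200.00 charge and drops Dana Williams's -20.00 payment, so it does not match the wanted output on either count.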


Response Number 3
Name: Jim Boothe
Date: March 11, 2005 at 07:15:26 Pacific
Subject: Cancel Out lines in a file
Reply:

What makes this more complex is the need to match up multiple lines of the same acct/amt such as:

000681047 -600.00
000681047 -600.00
000681047  600.00
000681047  600.00
000681047  600.00

The BEGIN procedure stores all the negative lines in a serialized array.  I also create negindexes at this time to allow for efficiency in processing.  This allows me to quickly know if a positive line has a stored offsetting line.  If I store three lines for 000681047-600.00, negindexes will have a single entry representing these.

Awk then processes all the positive lines.  For those that had negative lines stored, check_for_offset will be called, which will:

Scan the array in an attempt to find an offsetting negative.  If one is found, it will NOT print the positive line, and WILL delete the negative line from the array to prevent it from being matched again.  If an offsetting negative is NOT found, this means that all of them (for this acct/amt) have already been consumed (matched and deleted).  In that case, check_for_offset WILL print the unmatched positive line, and WILL ALSO delete thiskey from negindexes, since it is no longer true that this acct/amt has one or more negative lines stored.

Finally, the END procedure prints any negative lines that are still in the array, that is, those that never found an offsetting positive.

awk 'BEGIN \
{while ((getline < "cam.in") > 0)
   if ($6<0)
      {negmax+=1
       negrecs[negmax] = $0
       negkeys[negmax] = $1 $6
       negindexes[$1 $6] = 1}
}

function check_for_offset() \
{for (i=1;i<=negmax;i++)
    if (thiskey==negkeys[i])
        {delete negrecs[i]
         delete negkeys[i]
         return}
 print $0
 delete negindexes[thiskey]
}

$6>=0 \
{thiskey=$1 "-" $6
 if (thiskey in negindexes)
    check_for_offset()
 else
    print
}

END \
{for (i in negrecs)
    print negrecs[i]
}' cam.in

Hard work never killed anybody, but why take a chance?
        -- Charlie McCarthy
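
For comparison, here is a rough sketch of the same matching done with per-key counts instead of an array scan. It is not the code posted above, just one possible variant: cancel_pairs is a hypothetical helper name, and the sketch assumes the ID is in field 1 and the amount in field 6, as in the sample data. It holds the whole input in memory, which should be fine for files of this size.

```shell
#!/bin/sh
# cancel_pairs: hypothetical helper; reads the named files (or standard input).
# Assumes whitespace-separated fields with the ID in $1 and the amount in $6,
# as in the sample data.
cancel_pairs() {
    awk '
    {
        amt = $6
        sign[NR] = (amt ~ /^-/) ? "neg" : "pos"
        sub(/^-/, "", amt)            # compare amounts as text, sign removed
        key[NR]  = $1 SUBSEP amt      # same ID and same absolute amount
        line[NR] = $0
        count[sign[NR], key[NR]]++
    }
    END {
        for (n = 1; n <= NR; n++) {
            k = key[n]
            p = count["pos", k] + 0
            q = count["neg", k] + 0
            pairs = (p < q) ? p : q   # how many positive/negative pairs cancel
            # The first "pairs" lines of each sign are dropped; the rest print.
            if (++seen[sign[n], k] > pairs)
                print line[n]
        }
    }' "$@"
}

# Usage: cancel_pairs cam.in > output.txt
```

Because every surviving line is printed in its original position, an unmatched negative such as Dana Williams's -20.00 stays where it was instead of moving to the end of the output. When an ID has more charges than offsetting credits, the earliest charges are the ones dropped.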


Response Number 4
Name: Jim Boothe
Date: March 11, 2005 at 07:18:20 Pacific
Subject: Cancel Out lines in a file
Reply:

OK, I had a posting error above. The line below the "while getline" line should be:

if ($6<0)


Response Number 5
Name: CamTu
Date: March 11, 2005 at 07:44:34 Pacific
Subject: Cancel Out lines in a file
Reply:

Jim,

Thanks for your solution. It works if I run it from the command line; however, when I put it in a file, run it as a script, and redirect the output to another file, it doesn't work.

I put the code in a file like this:

#!/bin/awk -f

BEGIN \
{while ((getline < "cam.in") > 0)
if ($6<0)
{negmax+=1
negrecs[negmax] = $0
negkeys[negmax] = $1 $6
negindexes[$1 $6] = 1}
}

function check_for_offset() \
{for (i=1;i<=negmax;i++)
if (thiskey==negkeys[i])
{delete negrecs[i]
delete negkeys[i]
return}
print $0
delete negindexes[thiskey]
}

$6>=0 \
{thiskey=$1 "-" $6
if (thiskey in negindexes)
check_for_offset()
else
print
}

END \
{for (i in negrecs)
print negrecs[i]
} cam.in


And I run the script with this command:

./scriptname > output.txt

It doesn't give me back the command prompt. The output.txt file is created, but nothing is printed to it.

Thanks

CT



Response Number 6
Name: Jim Boothe
Date: March 11, 2005 at 08:03:30 Pacific
Subject: Cancel Out lines in a file
Reply:

I never specify awk on the shell header line. For now, just start your script like this:

#!/bin/sh
awk 'BEGIN \

(and remember to fix the "if ($6<0)" line)
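
Spelled out in full, the wrapper might look like the sketch below. It assumes the awk program from Response 3 with the corrected "if ($6<0)" line, the input file cam.in in the current directory, and the file name scriptname used in the run command earlier in the thread. The here-document is only a convenient way to create the file; pasting the lines between the EOF markers into an editor does the same thing.

```shell
# Write the wrapper script to a file named "scriptname".
cat > scriptname <<'EOF'
#!/bin/sh
# The awk program is one quoted shell argument; cam.in after the closing
# quote is the input file operand that awk reads.
awk 'BEGIN \
{while ((getline < "cam.in") > 0)
   if ($6<0)
      {negmax+=1
       negrecs[negmax] = $0
       negkeys[negmax] = $1 $6
       negindexes[$1 $6] = 1}
}

function check_for_offset() \
{for (i=1;i<=negmax;i++)
    if (thiskey==negkeys[i])
        {delete negrecs[i]
         delete negkeys[i]
         return}
 print $0
 delete negindexes[thiskey]
}

$6>=0 \
{thiskey=$1 "-" $6
 if (thiskey in negindexes)
    check_for_offset()
 else
    print
}

END \
{for (i in negrecs)
    print negrecs[i]
}' cam.in
EOF
chmod +x scriptname
```

It can then be run from the directory that holds cam.in with ./scriptname > output.txt. In this form the shell starts awk with cam.in as a file operand; awk started with no file operand reads standard input instead.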


Response Number 7
Name: CamTu
Date: March 11, 2005 at 08:36:20 Pacific
Subject: Cancel Out lines in a file
Reply:

Jim,

It works !!!!
Thanks....;)

CT

