Combine multiple files using awk

Original Message
Name: hp_admin20
Date: October 4, 2006 at 07:17:34 Pacific
Subject: Combine multiple files using awk
OS: HP-UX 11.0
CPU/Ram: B2600
Model/Manufacturer: HP B2600
Comment:

Can anyone point me in the right direction to compare two files and combine them using awk?

Example.

File1 Data
2442,Sep,16,2006,ACTIVE
2068,Aug,17,2006,ACTIVE
2245,Sep,20,2006,ACTIVE
2044,Jun,29,2006,INACTIVE

File2 Data
a024831,2068
a636090,2069
a639911,2044

What I want is to compare the two files and, wherever an ID such as 2068 appears in both file1 and file2, write a line to file3 so it looks like this:

a024831,Aug,17,2006,ACTIVE

I hope I haven't confused anyone...HELP!



Response Number 1
Name: James Boothe
Date: October 4, 2006 at 08:10:57 Pacific
Subject: Combine multiple files using awk
Reply:

At start up, awk loads file2 in an array so that it can reference it while processing each line in file1.

awk -F, 'BEGIN {
while ((getline < "file2") > 0)
f2array[$2] = $1
OFS=","}
{if (f2array[$1])
print f2array[$1],$2,$3,$4,$5
#else
# print $1 " not listed in file2" > "unmatched"
}' file1

You can uncomment the else branch if you want the unmatched lines written to a file named unmatched. If this were a shell script processing the input line by line, you would need >> instead of > to append each line, but awk keeps each output file open until it is explicitly closed or the program ends, so a single > is enough (unless you want to append unmatched lines across multiple awk runs).
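
The point about > can be checked with a small sketch. The scratch directory, the awk variable out, and the sample input are made up for illustration; only the behavior of > inside awk comes from the explanation above.

```shell
# Hypothetical scratch directory and output file, for illustration only.
tmp=$(mktemp -d)

# Three input lines, each printed with a single ">" inside awk.
# awk opens the file once (truncating it) and keeps it open, so all
# three lines end up in the file rather than just the last one.
printf '2442\n2245\n9999\n' |
  awk -v out="$tmp/unmatched" '{ print $1 " not listed in file2" > out }'

# Count the lines that were written.
count=$(awk 'END { print NR }' "$tmp/unmatched")
echo "$count"
```

Running the same command a second time would start the file over, which is the case where >> inside awk becomes useful.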


Response Number 2
Name: James Boothe
Date: October 4, 2006 at 08:16:57 Pacific
Subject: Combine multiple files using awk
Reply:

I am reposting with the spacing preserved for easier reading.

awk -F, 'BEGIN {
while ((getline < "file2") > 0)
   f2array[$2] = $1
OFS=","}

{if (f2array[$1])
   print f2array[$1],$2,$3,$4,$5
#else
#  print $1 " not listed in file2" > "unmatched"
}' file1


Response Number 3
Name: hp_admin20
Date: October 4, 2006 at 10:14:55 Pacific
Subject: Combine multiple files using awk
Reply:

It's not displaying any data in the new file I'm creating. Here is how I have the script set up:

awk -F, 'BEGIN {
while ((getline < $DATA2) > 0)
f2array[$2] = $1
OFS=","}

{if (f2array[$1])
print f2array[$1],$2,$3,$4,$5
}' $DATA1


Response Number 4
Name: James Boothe
Date: October 4, 2006 at 11:29:38 Pacific
Subject: Combine multiple files using awk
Reply:

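Shell variables are not expanded inside the single-quoted awk program. There, awk reads $DATA2 as a field reference indexed by an unset awk variable named DATA2, which is $0, and $0 is empty in the BEGIN block, so the getline fails and the array is never loaded. There are a handful of ways to pass shell variables to awk, and I post two solutions below. The first closes and reopens the single quotes around the variable so that the shell expands it, and the second passes the variable on the command line with -v. The names data2 and DATA2 could be the same, but I show them in different case to make clear which is the awk variable and which is the shell variable.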

DATA1=file1
DATA2=file2

awk -F, 'BEGIN {
while ((getline < "'"$DATA2"'") > 0)
   f2array[$2] = $1
OFS=","}

{if (f2array[$1])
   print f2array[$1],$2,$3,$4,$5
}' "$DATA1"

DATA1=file1
DATA2=file2

awk -F, -v data2="$DATA2" 'BEGIN {
while ((getline < data2) > 0)
   f2array[$2] = $1
OFS=","}

{if (f2array[$1])
   print f2array[$1],$2,$3,$4,$5
}' "$DATA1"
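
Another common approach, not one of the two above, is to skip getline and hand both file names to awk as ordinary arguments, so that no shell variable has to appear inside the program at all. A minimal sketch, assuming the sample data from the original message; the scratch directory and the array name are made up here:

```shell
# Recreate the sample data from the original message in a scratch directory.
tmp=$(mktemp -d)
DATA1="$tmp/file1"
DATA2="$tmp/file2"
printf '%s\n' 2442,Sep,16,2006,ACTIVE 2068,Aug,17,2006,ACTIVE \
  2245,Sep,20,2006,ACTIVE 2044,Jun,29,2006,INACTIVE > "$DATA1"
printf '%s\n' a024831,2068 a636090,2069 a639911,2044 > "$DATA2"

# While awk reads its first file argument (file2), NR equals FNR, so the
# first rule loads the lookup array and skips to the next line.
# The second rule then runs on file1 and prints only the matching lines.
result=$(awk -F, -v OFS=, '
  NR == FNR { name[$2] = $1; next }
  $1 in name { print name[$1], $2, $3, $4, $5 }
' "$DATA2" "$DATA1")
echo "$result"
```

The test $1 in name checks for the key without creating an empty array entry for every unmatched line. One caveat: if file2 is empty, NR == FNR stays true while file1 is read, so nothing is printed.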


Response Number 5
Name: hp_admin20
Date: October 4, 2006 at 12:26:22 Pacific
Subject: Combine multiple files using awk
Reply:

I figured it out....thank you for the help/info! It worked like a champ!



Response Number 6
Name: ghostdog
Date: October 4, 2006 at 21:27:48 Pacific
Subject: Combine multiple files using awk
Reply:

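How about this in Python:

# Map each number in file2 (second field) to its name (first field).
names = {}
with open("file2.txt") as f2:
    for line in f2:
        name, _, num = line.strip().partition(",")
        names[num] = name

# Replace the leading number of each file1 line with the matching name.
with open("file1.txt") as f1:
    for line in f1:
        num, _, rest = line.strip().partition(",")
        if num in names:
            print(names[num] + "," + rest)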



Response Number 7
Name: WilliamRobertson
Date: October 11, 2006 at 10:12:26 Pacific
Subject: Combine multiple files using awk
Reply:

Or if you can sort them first,

$ join -t, -1 1 -2 2 file1 file2
2044,Jun,29,2006,INACTIVE,a639911
2068,Aug,17,2006,ACTIVE,a024831
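
join expects both inputs to be sorted on the join field, so a sketch of the full pipeline might look like this. The scratch directory and the .sorted file names are made up for illustration, and the -o list is an addition that reorders the output into the layout asked for in the original message:

```shell
# Recreate the sample data from the original message in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' 2442,Sep,16,2006,ACTIVE 2068,Aug,17,2006,ACTIVE \
  2245,Sep,20,2006,ACTIVE 2044,Jun,29,2006,INACTIVE > "$tmp/file1"
printf '%s\n' a024831,2068 a636090,2069 a639911,2044 > "$tmp/file2"

# Sort each file on its own join field. LC_ALL=C keeps sort and join
# agreeing on the collation order.
LC_ALL=C sort -t, -k1,1 "$tmp/file1" > "$tmp/file1.sorted"
LC_ALL=C sort -t, -k2,2 "$tmp/file2" > "$tmp/file2.sorted"

# -o selects the output fields: the name from file2 (2.1), then
# fields 2 through 5 of file1.
joined=$(LC_ALL=C join -t, -1 1 -2 2 -o 2.1,1.2,1.3,1.4,1.5 \
  "$tmp/file1.sorted" "$tmp/file2.sorted")
echo "$joined"
```

Unlike the awk versions, the output comes out in sorted key order rather than in the original order of file1.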

