trouble merging two files with awk

Name: pda976
Date: October 14, 2007 at 18:01:10 Pacific
OS: OSF1
CPU/Ram: alpha 1Gb ram
Product: compaq
Comment:

Hi All,

I've seen many questions and answers about combining or merging two files using awk, but I can't quite get them to work for my situation. It's possible there is an answer out there that I just haven't stumbled upon yet. My understanding of awk arrays is letting me down.

Firstly, I need an awk solution; sort, join or paste will not work, because the two files may not have the same key records.

I have two files; for ease, call them file1 and file2.
I need to append fields $2 and $3 of file2 to the end of the matching line in file1, where $1 in file2 is equal to $2 in file1.

If a record exists in file1 but not in file2, I still need to output separators so the number of fields stays consistent.

The files are not sorted. They could be sorted, but as mentioned above, there can be records in file1 that do not exist in file2.

There will be around 1 million records.

file1
27896370|10411223311|BLABLABLA 0411223311|27896370|1||3|
27896381|10411223322|BLABLABLA 0411223322|27896381|1||3|
64979764|10311223333|BLABLABLA|64979764|2||3|

file2
10411223311|0|2|
10311223333|0|1|
10411223322|1|0|

expected file3 output
27896370|10411223311|BLABLABLA 0411223311|27896370|1||3|0|2|
27896381|10411223322|BLABLABLA 0411223322|27896381|1||3|1|0|
64979764|10311223333|BLABLABLA|64979764|2||3|0|1|


Thanks in advance to anyone who can assist.
Cheers



Response Number 1
Name: James Boothe
Date: October 16, 2007 at 08:23:01 Pacific
Reply:


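awk -F'|' 'BEGIN {
    # Load file2 into an array keyed on its first field.
    while ((getline < "file2") > 0)
        f2data[$1] = $2 "|" $3
}
{
    c2 = f2data[$2]           # empty string when the file1 key is not in file2
    if (c2 == "") c2 = "|"    # pad with an empty pair of fields
    print $0 c2 "|"
}' file1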

And here it is coded a slightly different way.  In the above solution, I pulled the array entry into a variable, then checked my variable to see if it was null.  In the below solution, I check to see if the array entry exists, and if so, pull that entry.  I would be surprised if you can see any difference in run time between the two versions.

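awk -F'|' 'BEGIN {
    # Load file2 into an array keyed on its first field.
    while ((getline < "file2") > 0)
        f2data[$1] = $2 "|" $3
}
{
    if ($2 in f2data)         # key from file1 exists in file2
        c2 = f2data[$2]
    else
        c2 = "|"              # pad with an empty pair of fields
    print $0 c2 "|"
}' file1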
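
For anyone who wants to try this on a small sample first, here is a rough sketch of a third variant. It is not from the posts above: it swaps the getline loop for the common NR == FNR idiom, where both files are passed as arguments and the first one is loaded into the array. The function name merge_files, the temporary directory, and the third file1 record (a made-up key with no match in file2) are assumptions for illustration only, and it needs an awk that supports the `in` operator and `?:`.

```shell
#!/bin/sh
# Sketch only: merge_files is a hypothetical helper name, not from the thread.
# Usage: merge_files LOOKUP_FILE MAIN_FILE
merge_files() {
    awk -F'|' '
        # First file (NR == FNR): remember fields 2 and 3, keyed on field 1.
        NR == FNR { f2data[$1] = $2 "|" $3; next }
        # Second file: append the stored pair, or an empty pair if the key is missing.
        { print $0 (($2 in f2data) ? f2data[$2] : "|") "|" }
    ' "$1" "$2"
}

# Sample run: two records from the question plus one made-up key
# (19999999999) that has no match in file2.
tmpdir=$(mktemp -d)
printf '%s\n' '10411223311|0|2|' '10311223333|0|1|' > "$tmpdir/file2"
printf '%s\n' \
    '27896370|10411223311|BLABLABLA 0411223311|27896370|1||3|' \
    '64979764|10311223333|BLABLABLA|64979764|2||3|' \
    '11111111|19999999999|NOMATCH|11111111|1||3|' > "$tmpdir/file1"
merge_files "$tmpdir/file2" "$tmpdir/file1" > "$tmpdir/file3"
cat "$tmpdir/file3"
```

The unmatched record comes out ending in `3|||`, so every output line has the same number of fields. Two caveats: the NR == FNR test misfires if the lookup file is empty, which the getline version does not suffer from; and the `in` test is used here because a plain reference such as f2data[$2] creates an empty array entry for every key that is not found.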


0

Response Number 2
Name: pda976
Date: October 16, 2007 at 23:02:04 Pacific
Reply:

Thanks heaps James. I was trying things similar but couldn't quite crack it.


0
