Hi All,
I've seen many questions and answers about combining or merging two files using awk, but I can't quite get them to work for my situation. It's possible there is an answer out there that I just haven't stumbled upon yet. My understanding of awk arrays is letting me down.
Firstly, I need an awk solution; sort, join or paste will not work. The two files may not have the same key records.
I have two files; for ease, file1 and file2.
I need to append fields $2 and $3 of file2 to the end of the line in file1 where $1 in file2 is equal to $2 in file1. If a record exists in file1 but not in file2, I still need to output the separators so the number of fields is consistent.
The files are not sorted. They could be sorted, but as previously mentioned there can be records in file1 that do not exist in file2.
There will be around 1 million records.
file1
27896370|10411223311|BLABLABLA 0411223311|27896370|1||3|
27896381|10411223322|BLABLABLA 0411223322|27896381|1||3|
64979764|10311223333|BLABLABLA|64979764|2||3|

file2
10411223311|0|2|
10311223333|0|1|
10411223322|1|0|

expected file3 output
27896370|10411223311|BLABLABLA 0411223311|27896370|1||3|0|2|
27896381|10411223322|BLABLABLA 0411223322|27896381|1||3|1|0|
64979764|10311223333|BLABLABLA|64979764|2||3|0|1|
Thanks in advance to anyone who can assist.
Cheers

awk -F\| 'BEGIN {
    # Load file2 into an array keyed on its first field
    while ((getline < "file2") > 0)
        f2data[$1] = $2 "|" $3
}
{
    # Look up field 2 of the file1 record; pad with empty fields if absent
    c2 = f2data[$2]
    if (c2 == "") c2 = "|"
    print $0 c2 "|"
}' file1

And here it is coded a slightly different way. In the above solution, I pull the array entry into a variable, then check whether that variable is empty. In the solution below, I check whether the array entry exists and, if so, pull that entry. I would be surprised if you could see any difference in run time between the two versions.
awk -F\| 'BEGIN {
    while ((getline < "file2") > 0)
        f2data[$1] = $2 "|" $3
}
{
    if ($2 in f2data)
        c2 = f2data[$2]
    else
        c2 = "|"
    print $0 c2 "|"
}' file1
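
For anyone who wants to try this end to end, here is a rough sketch of a third way to write it, using the common NR==FNR two-file idiom instead of getline. It is not from the posts above, just an alternative under a few assumptions: the fourth file1 record (key 10999999999) is made-up data added only to show the padding for a key missing from file2, and the names workdir and extra are my own.

```shell
#!/bin/sh
# Work in a scratch directory so nothing existing is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample data from the thread, plus one hypothetical file1 record
# (key 10999999999) that has no match in file2.
cat > file1 <<'EOF'
27896370|10411223311|BLABLABLA 0411223311|27896370|1||3|
27896381|10411223322|BLABLABLA 0411223322|27896381|1||3|
64979764|10311223333|BLABLABLA|64979764|2||3|
11111111|10999999999|NOMATCH|11111111|1||3|
EOF

cat > file2 <<'EOF'
10411223311|0|2|
10311223333|0|1|
10411223322|1|0|
EOF

# NR == FNR holds only while the first named file (file2) is being read,
# so that rule fills the lookup array and "next" skips the main rule.
awk -F'|' '
NR == FNR { f2data[$1] = $2 "|" $3; next }
{
    # "in" tests for the key without creating an empty array entry
    extra = ($2 in f2data) ? f2data[$2] : "|"
    print $0 extra "|"
}' file2 file1 > file3

cat file3
```

The unmatched record comes out with two empty trailing fields, so every line of file3 has the same number of separators. One caveat with this idiom: if file2 is empty, NR == FNR stays true while file1 is read, so the getline versions above are safer in that case.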
