awk scripting using 2 input files

Name: telcomark
Date: February 18, 2009 at 18:36:06 Pacific
OS: unix
Subcategory: General
Comment:

Does anyone know how to read a pattern from one file (file1) and then search for that pattern in another file (file2)? The output should be each line in file2 that contains a pattern from file1. file1 and file2 will have different numbers of lines.

file1
abcd 1
bcde 2
abcd 3

file2
abcd 2 aa bb cc
bcde 2 bb cc dd

output
bcde 2 bb cc dd

any assistance would be greatly appreciated,

Thanks




Response Number 1
Name: nails
Date: February 19, 2009 at 10:15:02 Pacific
Reply:

Two things:

1) I'm using solaris, so I'm using nawk.

2) Since I'm reading file2.txt into an array, if file2.txt is large, you could have a problem with performance.

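nawk 'BEGIN {
   # Load every line of file2.txt into the array n
   cnt = 0
   while ((getline line < "file2.txt") > 0)
      n[++cnt] = line
}
{
   # For each pattern line of file1.txt, print the file2.txt lines it matches
   for (i = 1; i <= cnt; i++)
      if (match(n[i], $0) > 0)
         print n[i]
}' file1.txt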



Response Number 2
Name: telcomark
Date: February 20, 2009 at 07:49:54 Pacific
Reply:

Thanks for the code that you provided. It appears to be partially working, but it is also returning incorrect information.

To give a little more background, file1 has 1193 records and file2 has 29300 records. I am trying to use the first two columns of file1 to index into file2. The code that was provided appears to work for the first 212 records in file1, but after that something happens and incorrect data starts to be written periodically.

I was able to get another awk script to run, but it is very inefficient: it does not use an array, it opens and closes both files numerous times, and it takes an hour to run. Below is an excerpt of an sdiff between a known good file and the output of the code that was provided. The output of the script is on the left-hand side, and the good data is on the right.

sdiff
NRFLVAJTDS002B1B 12 0 MGW 119 28 NRFLVAJTDS002B1B 12 0 MGW 119 28
NRFLVAJTDS002B1B 120 0 MGW 104 17 <
NRFLVAJTDS002B1B 121 0 MGW 104 17 <
NRFLVAJTDS002B1B 122 0 MGW 104 17 <
NRFLVAJTDS002B1B 123 0 MGW 104 17 <
NRFLVAJTDS002B1B 124 0 MGW 104 17 <
NRFLVAJTDS002B1B 125 0 MGW 127 29 <
NRFLVAJTDS002B1B 126 0 MGW 127 29 <
NRFLVAJTDS002B1B 127 0 MGW 127 29 <
NRFLVAJTDS002B1B 128 0 MGW 127 29 <
NRFLVAJTDS002B1B 129 0 MGW 127 29 <
NRFLVAJTDS002B1B 22 0 MGW 119 28 NRFLVAJTDS002B1B 22 0 MGW 119 28

Any information that you could provide on why the script is returning false data would be greatly appreciated.

Thanks
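
A possible explanation, going only by the sdiff excerpt above: match() does an unanchored regular-expression search, so a file1 key such as "NRFLVAJTDS002B1B 12" would also be found inside "NRFLVAJTDS002B1B 120 ...", "NRFLVAJTDS002B1B 121 ..." and so on. A minimal sketch of one alternative is below. It assumes the first two whitespace-separated columns form the key, as described in the post, and compares them exactly instead of searching for them. The function name match_keys is made up for illustration, and the sample files are the ones from the original question. On Solaris, nawk would take the place of awk.

```shell
# Sample data copied from the original question.
cat > file1.txt <<'EOF'
abcd 1
bcde 2
abcd 3
EOF
cat > file2.txt <<'EOF'
abcd 2 aa bb cc
bcde 2 bb cc dd
EOF

# match_keys KEYFILE DATAFILE (hypothetical helper name)
# While reading KEYFILE (NR == FNR), remember each (column 1, column 2) pair.
# While reading DATAFILE, print only lines whose first two columns are a
# remembered pair. The comparison is exact, so "12" does not match "120".
match_keys() {
  awk 'NR == FNR { want[$1, $2]; next }
       ($1, $2) in want' "$1" "$2"
}

match_keys file1.txt file2.txt
```

This keeps only the file1 keys in memory (1193 entries in the case described) and reads each file once, so it should avoid both the nested loop over 29300 lines and the repeated opening and closing of files.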


