Read two files with AWK
|
Original Message
|
Name: abd73fr
Date: December 17, 2004 at 09:35:35 Pacific
Subject: Read two files with AWK
OS: Macintosh
CPU/Ram: PowerMac G4
Comment: Hi,

I have two files:

tst1
    1 1 1 2 2 2 2 2
    2 2 2 2 2 2 2 2
    0 1 1 1 1 1 1 1

tst2
    5 6 5 8 8 9 8 8
    8 8 9 8 8 8 8 9
    0 5 6 5 5 6 5 5

The two files have the same dimensions (i.e. the same NF and NR). I want to read tst1 and get the coordinates (x, y), i.e. (field number, NR) in AWK, of each distinct value in it. Then, for each value, I want the sum of the elements in tst2 that sit at the same coordinates.

More concretely: for the value 1 in tst1, I can print the field and record numbers as follows:

    x=`awk '{for (i=1; i<=NF; i++) {if ($i==1) printf (i" ") }}' tst1`
    y=`awk '{for (i=1; i<=NF; i++) {if ($i==1) printf (NR" ") }}' tst1`

==>
    NFs: 1 2 3 2 3 4 5 6 7 8
    NRs: 1 1 1 3 3 3 3 3 3 3

In tst2, I want to sum the entries that have the same (x, y) as the value 1 in tst1. For this example, I expect:

    sum (for 1 in tst1) = (5+6+5+5+6+5+5+6+5+5) = 53
    sum (for 2 in tst1) = (8+8+9+8+8+8+8+9+8+8+8+8+9) = 107

I hope this is clear and feasible :)

Thanks
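
The coordinate lookup described in the question can also be done in a single pass that prints each (record, field) pair together. This is only a minimal sketch: it recreates the sample data in a scratch directory, and the names workdir, coords and want are illustrative, not part of the original post.

```shell
# Recreate the sample file from the question in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' '1 1 1 2 2 2 2 2' '2 2 2 2 2 2 2 2' '0 1 1 1 1 1 1 1' > "$workdir/tst1"

# Print "record field" for every field of tst1 equal to the wanted value.
# Fields are numbered from 1; $0 is the whole line, so the loop starts at 1.
coords=$(awk -v want=1 '{
    for (i = 1; i <= NF; i++)
        if ($i == want) print NR, i
}' "$workdir/tst1")
echo "$coords"

rm -rf "$workdir"
```

Each output line holds one pair, so the x and y lists from the question correspond to the second and first columns of this output.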
|
|
Response Number 1
|
Name: Jim Boothe
Date: December 17, 2004 at 12:12:20 Pacific
|
Reply:

    F1=tst1
    F2=tst2
    if [ $(wc -l < $F1) -ne $(wc -l < $F2) ] ; then
       echo 'Files must have same # lines'
       exit 1
    fi
    awk -v F1=$F1 '{
    if (FILENAME==F1)
       {t1line[NR] = $0
       next}
    else
       if (nbrrecs==0)
          nbrrecs=NR-1
    origNR=NR-nbrrecs
    split(t1line[origNR],t1words)
    for (i=1;i<=NF;i++)
       {t1word=t1words[i]
       totals[t1word]=totals[t1word]+$i}
    }
    END {
    for (i in totals)
       print i, totals[i]
    }' $F1 $F2
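
For comparison, a widely used alternative to testing FILENAME is the NR==FNR condition, which is true only while awk is reading the first file named on the command line. The sketch below is not the script from this reply. It recreates the sample data in a scratch directory, the names workdir, key, k and sum are illustrative, and it assumes the first file is not empty (otherwise NR==FNR would also match the second file).

```shell
# Recreate the sample files from the original message.
workdir=$(mktemp -d)
printf '%s\n' '1 1 1 2 2 2 2 2' '2 2 2 2 2 2 2 2' '0 1 1 1 1 1 1 1' > "$workdir/tst1"
printf '%s\n' '5 6 5 8 8 9 8 8' '8 8 9 8 8 8 8 9' '0 5 6 5 5 6 5 5' > "$workdir/tst2"

totals=$(awk '
    NR == FNR { key[FNR] = $0; next }   # first file: remember each line
    {
        n = split(key[FNR], k)          # values on the same line of the first file
        for (i = 1; i <= NF && i <= n; i++)
            sum[k[i]] += $i             # accumulate by the value found in tst1
    }
    END { for (v in sum) print v, sum[v] }
' "$workdir/tst1" "$workdir/tst2" | sort -n)
echo "$totals"

rm -rf "$workdir"
```

Because FNR restarts at 1 for each file, no adjusted record number is needed. The output is piped through sort -n because the order of a for-in loop over an awk array is unspecified.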
|
|
Response Number 2
|
Name: Jim Boothe
Date: December 17, 2004 at 12:30:41 Pacific
|
Reply: I like this version better. I store file1 into the array in the BEGIN statement (I have to create my own NR at this stage). Then the processing of file2 can utilize NR directly instead of an adjusted NR.

    F1=tst1
    F2=tst2
    if [ $(wc -l < $F1) -ne $(wc -l < $F2) ] ; then
       echo 'Files must have same # lines'
       exit 1
    fi
    awk -v F1=$F1 '\
    BEGIN {
    while ((getline < F1) > 0)
       {nr++
       t1line[nr] = $0}
    }
    {split(t1line[NR],t1words)
    for (i=1;i<=NF;i++)
       {t1word=t1words[i]
       totals[t1word]=totals[t1word]+$i}
    }
    END {
    for (i in totals)
       print i, totals[i]
    }' $F2
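
The reason a hand-made counter is needed in the BEGIN block is that reading with getline from a named file does not update the built-in NR. A minimal sketch of that behaviour, using a hypothetical three-line file called sample and illustrative variable names:

```shell
# A throwaway three-line file (hypothetical data, not from the thread).
workdir=$(mktemp -d)
printf '%s\n' 'a b' 'c d' 'e f' > "$workdir/sample"

report=$(awk -v F1="$workdir/sample" '
    BEGIN {
        # getline var < file returns 1 per line read, 0 at end of file, -1 on error
        while ((getline line < F1) > 0)
            nr++
        close(F1)
        printf "manual count=%d, built-in NR=%d\n", nr, NR
    }')
echo "$report"

rm -rf "$workdir"
```

The loop reads all three lines, yet NR is still 0 afterwards, so the main block that later processes the second file sees NR start at 1 for that file's first line.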
|
|
Response Number 3
|
Name: abd73fr
Date: December 20, 2004 at 07:22:03 Pacific
|
Reply: It is superb... Thanks a lot, Jim... :)

In fact, the files tst1 and tst2 are just test files. In reality, I have two files, each with 22200 lines and 146 fields, and your program will help me a lot :)

But I have one more question, please: to verify the sum, how can I print the values? I.e., for my example, for (1) in tst1, if I'd like to print

    5 6 5 5 6 5 5 6 5 5

what do I have to do?

Thanks again :))
|
|
Response Number 4
|
Name: Jim Boothe
Date: December 20, 2004 at 10:38:10 Pacific
|
Reply: The added lines (green in the original post) will display all values accumulated for the value being checked for:

    F1=tst1
    F2=tst2
    if [ $(wc -l < $F1) -ne $(wc -l < $F2) ] ; then
       echo 'Files must have same # lines'
       exit 1
    fi
    awk -v F1=$F1 '\
    BEGIN {
    while ((getline < F1) > 0)
       {nr++
       t1line[nr] = $0}
    }
    {split(t1line[NR],t1words)
    hdrprinted=0
    for (i=1;i<=NF;i++)
       {t1word=t1words[i]
       totals[t1word]=totals[t1word]+$i
       if (t1word==1)
          if (hdrprinted==0)
             {printf "Line%3d: word%3d: %s\n",NR,i,$i
             hdrprinted=1}
          else
             printf "%13s%3d: %s\n"," ",i,$i
       }
    }
    END {
    print "Summary totals:"
    for (i in totals)
       print i, totals[i]
    }' $F2

    ./sumit.sh
    Line  1: word  1: 5
    &nbp 2: 6
    &nbp 3: 5
    Line  3: word  2: 5
    &nbp 3: 6
    &nbp 4: 5
    &nbp 5: 5
    &nbp 6: 6
    &nbp 7: 5
    &nbp 8: 5
    Summary totals:
    2 107
    1 53
|
|
Response Number 5
|
Name: Jim Boothe
Date: December 20, 2004 at 10:46:57 Pacific
|
Reply: Had a little posting error with the spaces - will try again ...

The added lines (green in the original post) will display all values accumulated for the value being checked for:

    F1=tst1
    F2=tst2
    if [ $(wc -l < $F1) -ne $(wc -l < $F2) ] ; then
       echo 'Files must have same # lines'
       exit 1
    fi
    awk -v F1=$F1 '\
    BEGIN {
    while ((getline < F1) > 0)
       {nr++
       t1line[nr] = $0}
    }
    {split(t1line[NR],t1words)
    hdrprinted=0
    for (i=1;i<=NF;i++)
       {t1word=t1words[i]
       totals[t1word]=totals[t1word]+$i
       if (t1word==1)
          if (hdrprinted==0)
             {printf "Line%3d: word%3d: %s\n",NR,i,$i
             hdrprinted=1}
          else
             printf "%13s%3d: %s\n"," ",i,$i
       }
    }
    END {
    print "Summary totals:"
    for (i in totals)
       print i, totals[i]
    }' $F2

    ./sumit.sh
    Line  1: word  1: 5
                   2: 6
                   3: 5
    Line  3: word  2: 5
                   3: 6
                   4: 5
                   5: 5
                   6: 6
                   7: 5
                   8: 5
    Summary totals:
    2 107
    1 53
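
If the values are wanted on a single line, as in the question, one possible variation is to collect them in a string and print it in the END block. This is a hedged sketch rather than a replacement for the script above: it recreates the sample data in a scratch directory, and the names workdir, want, list, sep and sum are illustrative.

```shell
# Recreate the sample files from the original message.
workdir=$(mktemp -d)
printf '%s\n' '1 1 1 2 2 2 2 2' '2 2 2 2 2 2 2 2' '0 1 1 1 1 1 1 1' > "$workdir/tst1"
printf '%s\n' '5 6 5 8 8 9 8 8' '8 8 9 8 8 8 8 9' '0 5 6 5 5 6 5 5' > "$workdir/tst2"

values=$(awk -v F1="$workdir/tst1" -v want=1 '
    BEGIN {
        # Load every line of the first file, counting records by hand.
        while ((getline line < F1) > 0)
            t1line[++nr] = line
        close(F1)
    }
    {
        n = split(t1line[NR], k)
        for (i = 1; i <= NF && i <= n; i++)
            if (k[i] == want) {
                list = list sep $i      # append the matching tst2 value
                sep = " "
                sum += $i
            }
    }
    END { print list; print "sum:", sum }
' "$workdir/tst2")
echo "$values"

rm -rf "$workdir"
```

With the sample data this prints 5 6 5 5 6 5 5 6 5 5 followed by sum: 53, which agrees with the total expected in the original message. For files with many matches the single line can get very long, so the per-line layout of the script above may be easier to check by eye.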
|
|
Response Number 6
|
Name: abd73fr
Date: December 21, 2004 at 08:12:34 Pacific
|
Reply: Very nice :) Thank you very much, Jim... your code works fine... I wish you a very Merry Christmas and a Happy New Year :) See you later!