awk question
Original Message
Name: narsman
Date: January 25, 2005 at 18:09:34 Pacific
Subject: awk question
OS: Windows NT
CPU/Ram: Intel Pentium 2.0 Ghz 51
Comment: Hi All,

I just started taking Unix and need help with my assignment.

    /home% cat file1
    1file08
    2file08
    1file03
    1file05
    2file05
    1file09
    2file09

I need to group all lines that start with 1, and do the same for lines that start with 2. My output should be:

    GROUP1
    (1,1file08)
    (2,1file03)
    (3,1file05)
    (4,1file09)
    GROUP2
    (1,2file08)
    (3,2file05)
    (4,2file09)

I can use grep to group the files, but the problem is the "number" before each one. Note that GROUP2 does not have level 2 because there is no 2file03.

Thanks for any input you may have.
~Narsman
Response Number 1
Name: thepubba
Date: January 26, 2005 at 05:53:48 Pacific
Subject: awk question
Reply: Here is a simple Korn shell solution. With some work, you can make it better. I don't have time.

    #!/bin/ksh
    group1=
    group2=

    # Open the input file on file descriptor 3 and sort each line into a group.
    exec 3< ./junk999.dat
    while read -u3 line
    do
        case $line in
            1* ) group1="$group1$line " ;;
            2* ) group2="$group2$line " ;;
        esac
    done

    /usr/bin/tput clear

    # Print each group with a running counter.
    counter=1
    print "\n\nGROUP1"
    for name in $group1
    do
        print "($counter,$name)"
        (( counter += 1 ))
    done

    counter=1
    print "\nGROUP2"
    for name in $group2
    do
        print "($counter,$name)"
        (( counter += 1 ))
    done
    print "\n"

Jerry
Response Number 2
Name: vgersh99
Date: January 26, 2005 at 07:20:14 Pacific
Subject: awk question
Reply:

    nawk -f nars.awk file1

here's nars.awk:

    -------------------
    {
        if (!match($0, /^[0-9][0-9]*/))
            next;
        else
            grpID = (substr($0, RSTART, RLENGTH));
        arr[grpID] = (grpID in arr) ? arr[grpID] SUBSEP $0 : $0;
    }
    END {
        for (i in arr) {
            print "GROUP" i;
            n = split(arr[i], cellA, SUBSEP);
            for (j = 1; j <= n; j++)
                printf("\t(%d,%s)\n", j, cellA[j]);
        }
    }
    ---------------

vlad
#include<disclaimer.h>
Response Number 4
Name: narsman
Date: January 26, 2005 at 13:54:27 Pacific
Subject: awk question
Reply: Hi vgersh99,

I tried your script and got this output:

    GROUP2,
    (1,2file08)
    (2,2file05) -> this should say (3,2file05)
    (3,2file09) -> this should say (4,2file09)
    GROUP2,
    (1,1file08)
    (2,1file04)
    (3,1file05)
    (4,1file09)
    -> because there is no 2file04.

Thanks,
Response Number 5
Name: vgersh99
Date: January 26, 2005 at 14:52:39 Pacific
Subject: awk question
Reply: hm.... this is confusing.... could you explain AGAIN how you derive the LEADING numbers before the ',' in:

    GROUP2
    (1,2file08)
    (3,2file05)
    (4,2file09)

I thought they were just sequence numbers...

vlad
#include<disclaimer.h>
Response Number 6
Name: narsman
Date: January 26, 2005 at 16:40:44 Pacific
Subject: awk question
Reply: Hi vgersh99,

Here is a copy of my script...

    ===============================
    #! /bin/csh -f
    set pcount1 = 1
    set pcount2 = 1
    set grp1 = `grep '^1' file2`
    set grp2 = `grep '^2' file2`
    echo "GROUP1"
    foreach PLOOP1 ($grp1)
        setenv G1 $grp1[$pcount1]
        echo "($pcount1,$G1)"
        @ pcount1 = $pcount1 + 1
    end
    echo ""
    echo "GROUP2"
    foreach PLOOP2 ($grp2)
        setenv G2 $grp2[$pcount2]
        echo "($pcount2,$G2)"
        @ pcount2 = $pcount2 + 1
    end
    ===============================

I really appreciate your help on this. Thanks.
Response Number 7
Name: thepubba
Date: January 26, 2005 at 17:13:42 Pacific
Subject: awk question
Reply: I'm sure Vlad will solve your awk problem. However, I'd recommend you forget csh for writing shell scripts. The csh is lacking in too many areas. Look at using bash, ksh or even sh. This is not just a personal opinion; it is shared by many system administrators. Ask around, you'll find the csh is not used much for writing shell scripts.

A good example is my shell script. I simply use the file descriptor to read the file. I don't need to use grep at all.

I see what you are after with the second group. It would be easy to modify my script to compare the values in each group and number them accordingly.
Response Number 8
Name: narsman
Date: January 26, 2005 at 17:58:09 Pacific
Subject: awk question
Reply: Hi Jerry. Yeah, I was told to get away from csh and consider learning perl. As I said in my first post, I'm a newbie, so right now I'm just trying to concentrate on getting my feet wet. Thanks for the input.
Response Number 9
Name: vgersh99
Date: January 27, 2005 at 05:54:11 Pacific
Subject: awk question
Reply: sorry, I don't DO csh ;) You posted your script, but I asked you to explain the algorithm.

vlad
#include<disclaimer.h>
Response Number 10
Name: narsman
Date: January 28, 2005 at 11:38:53 Pacific
Subject: awk question
Reply: Here's what goes on in the script... The results of grep are assigned to grp1 and grp2.

    $grp1 is 1file08[1] 1file04[2] 1file05[3] 1file09[4]
    $grp2 is 2file08[1] 2file05[2] 2file09[3]

I put them in a loop (similar to {for (i=1; i<=NF; i++)} in awk).

The problem here is 2file05[2] in $grp2: I'm trying to make this 2file05[3]. I hope this makes sense, as I'm not really good at explanation. Thanks for trying though. Actually I have another awk question which I will post in another topic. TGIF!
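For the two-group case described here, one way to get those numbers is to give each group 2 entry the position of the matching group 1 entry. The following is only a minimal sketch under an assumption not stated in the thread: every name in group 2 (the part after the leading digit) also appears in group 1. The function name renumber_group2 is made up for illustration.

```shell
#!/bin/sh
# renumber_group2 FILE  (hypothetical helper, not from the thread)
# Prints the group 2 lines, numbered by the position of the matching
# name within group 1. Assumes each group 2 name also occurs in group 1;
# a name with no match would print with an empty number.
renumber_group2() {
    awk '
    /^1/ { pos[substr($0, 2)] = ++n }   # position of each name within group 1
    /^2/ { g2[++m] = $0 }               # keep group 2 lines in input order
    END  {
        for (i = 1; i <= m; i++)
            print "(" pos[substr(g2[i], 2)] "," g2[i] ")"
    }
    ' "$1"
}
```

Run against the file1 from the original message, this should print (1,2file08), (3,2file05) and (4,2file09), one per line.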
Response Number 11
Name: Jim Boothe
Date: January 28, 2005 at 14:28:52 Pacific
Subject: awk question
Reply: My understanding is that column position 1 defines the "group", and columns 2-n are a control-break field. We need to number the control breaks and generate a sequencing number based on that.

The first phase sequences the control breaks and formats the lines. It also puts a group number at the beginning of the line to make sorting easy. The output from the first phase (prior to sorting) would be:

    1 (1,1file08)
    2 (1,2file08)
    1 (2,1file03)
    1 (3,1file05)
    2 (3,2file05)
    1 (4,1file09)
    2 (4,2file09)

After sorting, the final pass control breaks on the group number and inserts the GROUP header lines. The temporary group number at the front of the line gets removed in this phase.

    #!/bin/ksh
    seq=0
    holdkey=
    holdgroup=
    while read line
    do
        key=${line#?}            # everything after the first character
        prefix=${line%$key}      # the first character
        group=$prefix
        if test "$key" != "$holdkey" ; then
            holdkey=$key
            ((seq=seq+1))
        fi
        echo "$group ($seq,$line)"
    done < junk.txt |
    sort -k 1n -k2 |
    while read group line
    do
        if test "$group" != "$holdgroup" ; then
            holdgroup=$group
            echo "\nGROUP$group"
        fi
        echo $line
    done

Output:

    GROUP1
    (1,1file08)
    (2,1file03)
    (3,1file05)
    (4,1file09)

    GROUP2
    (1,2file08)
    (3,2file05)
    (4,2file09)

And by the way, this solution assumes any number of groups. If I knew that there would always be just groups 1 and 2, I would have taken an entirely different (simpler) approach.
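Since the thread asks about awk, the same numbering can also be sketched in a single awk pass without the sort. This is not Jim's script. It numbers each distinct name by first appearance rather than by control break, so equal names do not have to be adjacent, and it prints groups in order of first appearance rather than sorted. It also assumes the group is exactly one character. The function name number_groups is hypothetical.

```shell
#!/bin/sh
# number_groups FILE  (hypothetical helper, not from the thread)
# Column 1 is the group; the rest of the line is the key that gets numbered.
number_groups() {
    awk '
    {
        group = substr($0, 1, 1)
        key   = substr($0, 2)
        if (!(key in seq))              # first time this key is seen
            seq[key] = ++nkeys
        if (!(group in count))          # first time this group is seen
            order[++ngroups] = group
        n = ++count[group]
        line[group, n] = "(" seq[key] "," $0 ")"
    }
    END {
        for (g = 1; g <= ngroups; g++) {
            grp = order[g]
            print "GROUP" grp
            for (j = 1; j <= count[grp]; j++)
                print line[grp, j]
        }
    }
    ' "$1"
}
```

With the seven-line file1 from the original message, this should print GROUP1 with entries numbered 1 to 4 and GROUP2 with entries numbered 1, 3 and 4, which is the output narsman asked for.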