Hi All, I just started taking Unix and need help with my assignment.
/home% cat file1
1file08
2file08
1file03
1file05
2file05
1file09
2file09

I need to group all lines that start with 1, and do the same for lines that start with 2. My output should be:
GROUP1
(1,1file08)
(2,1file03)
(3,1file05)
(4,1file09)

GROUP2
(1,2file08)
(3,2file05)
(4,2file09)

I can use grep to group the lines, but the problem is the number in front of each one. Note that GROUP2 has no entry numbered 2 because there is no 2file03.
Thanks for any input you may have.
~Narsman

Here is a simple Korn shell solution. With some work, you can make it better. I don't have time.
#!/bin/ksh
group1=
group2=

# Open the input file on file descriptor 3 and sort each line into its group
exec 3< ./junk999.dat
while read -u3 line
do
    case $line in
        1* ) group1="$group1$line "
            ;;
        2* ) group2="$group2$line "
            ;;
    esac
done

/usr/bin/tput clear

# Print each group with a running counter
counter=1
print "\n\nGROUP1"
for name in $group1
do
    print "($counter,$name)"
    (( counter += 1 ))
done

counter=1
print "\nGROUP2"
for name in $group2
do
    print "($counter,$name)"
    (( counter += 1 ))
done

print "\n"
Jerry

nawk -f nars.awk file1
here's nars.awk:
-------------------
{
    # skip lines that do not start with a digit
    if (!match($0, /^[0-9][0-9]*/))
        next;
    # the leading digits are the group ID; collect the lines of each group
    grpID = substr($0, RSTART, RLENGTH);
    arr[grpID] = (grpID in arr) ? arr[grpID] SUBSEP $0 : $0;
}
END {
    for (i in arr) {
        print "GROUP" i;
        n = split(arr[i], cellA, SUBSEP);
        for (j = 1; j <= n; j++)
            printf("\t(%d,%s)\n", j, cellA[j]);
    }
}
---------------
vlad
#include<disclaimer.h>

Hi vgersh99,
I tried your script and got this output:
GROUP2,
(1,2file08)
(2,2file05) -> this should say (3,2file05)
(3,2file09) -> this should say (4,2file09)
GROUP2,
(1,1file08)
(2,1file04)
(3,1file05)
(4,1file09)
-> because there is no 2file04.
Thanks,

hm.... this is confusing....
could you explain AGAIN how you derive the LEADING numbers before the ',' in:
GROUP2
(1,2file08)
(3,2file05)
(4,2file09)

I thought they were just sequence numbers...
vlad
#include<disclaimer.h>

Hi vgersh99,
Here is a copy of my script...
===============================
#! /bin/csh -f
set pcount1 = 1
set pcount2 = 1
set grp1 = `grep '^1' file2`
set grp2 = `grep '^2' file2`

echo "GROUP1"
foreach PLOOP1 ($grp1)
    setenv G1 $grp1[$pcount1]
    echo "($pcount1,$G1)"
    @ pcount1 = $pcount1 + 1
end

echo ""
echo "GROUP2"
foreach PLOOP2 ($grp2)
    setenv G2 $grp2[$pcount2]
    echo "($pcount2,$G2)"
    @ pcount2 = $pcount2 + 1
end
===============================

I really appreciate your help on this.
Thanks.

I'm sure Vlad will solve your awk problem. However, I'd recommend you forget csh for writing shell scripts. The csh is lacking in too many areas. Look at using bash, ksh or even sh. This is not just a personal opinion; it is shared by many system administrators. Ask around, you'll find the csh is not used much for writing shell scripts.
A good example is my shell script. I simply use the file descriptor to read the file. I don't need to use grep at all.
I see what you are after with the second group. It would be easy to modify my script to compare the values in each group and number them accordingly.
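For what it's worth, here is one way that modification might look. This is only a sketch, not the poster's actual code: it is written for plain POSIX sh rather than ksh, it reads standard input instead of file descriptor 3, and the function name group_lines is made up for illustration. It assumes exactly two groups (1 and 2) and names with no whitespace or wildcard characters. Each line is numbered by the position of its suffix (everything after the first character) in order of first appearance, so both groups share the same numbering.

```shell
#!/bin/sh
# Sketch only: group_lines is a hypothetical helper, not from the thread.
# Assumes two groups (1 and 2) and names without whitespace or wildcards.
group_lines() {
    keys=""      # suffixes seen so far, space separated, in order of appearance
    group1=""
    group2=""
    while IFS= read -r line
    do
        key=${line#?}            # the line without its first character
        seq=0
        n=0
        # look the suffix up in the list of suffixes already seen
        for k in $keys
        do
            n=$((n + 1))
            if [ "$k" = "$key" ]; then
                seq=$n
            fi
        done
        # a new suffix gets the next number
        if [ "$seq" -eq 0 ]; then
            keys="$keys $key"
            seq=$((n + 1))
        fi
        case $line in
            1*) group1="$group1 ($seq,$line)" ;;
            2*) group2="$group2 ($seq,$line)" ;;
        esac
    done
    echo "GROUP1"
    for entry in $group1; do printf '%s\n' "$entry"; done
    echo "GROUP2"
    for entry in $group2; do printf '%s\n' "$entry"; done
}
```

It would be called as `group_lines < file1`. The lookup loop rescans the list for every line, which is fine for a handful of names but would be slow for large files.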

Hi Jerry. Yeah, I was told to get away from csh and consider learning perl. As I said in my first post, I'm a newbie, so right now I'm just trying to get my feet wet. Thanks for the input.

sorry, I don't DO csh ;)
You posted your script, but I asked you to explain the algorithm.
vlad
#include<disclaimer.h>

Here's what goes on in the script...
The results of grep are assigned to grp1 and grp2.
$grp1 is 1file08[1] 1file04[2] 1file05[3] 1file09[4]
$grp2 is 2file08[1] 2file05[2] 2file09[3]
I put them in a loop (similar to {for (i=1; i<=NF; i++)} in awk).

The problem is 2file05, which is element [2] of $grp2; I'm trying to make it 2file05[3].
I hope this makes sense, as I'm not very good at explanations.
Thanks for trying though.
Actually I have another awk question which I will post in another topic.
TGIF!
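The numbering described in this post (2file05 should get 3, not 2, because file05 is the third distinct name) could be sketched in awk along the following lines. This is an illustration under assumptions, not the script from earlier in the thread: the wrapper name number_by_suffix is invented, the group ID is taken to be the leading digits of each line, and the rest of the line is taken to be the name that drives the numbering.

```shell
#!/bin/sh
# Sketch only: number_by_suffix is a hypothetical wrapper, not from the thread.
# Assumes the leading digits are the group ID and the rest is the name.
number_by_suffix() {
    awk '
    match($0, /^[0-9][0-9]*/) {
        grp = substr($0, 1, RLENGTH)       # leading digits: the group ID
        key = substr($0, RLENGTH + 1)      # remainder: the name being numbered
        if (!(key in seq))
            seq[key] = ++nkeys             # first time this name is seen
        if (!(grp in out))
            order[++ngrps] = grp           # remember groups in order of appearance
        out[grp] = out[grp] sprintf("(%d,%s)\n", seq[key], $0)
    }
    END {
        for (g = 1; g <= ngrps; g++)
            printf "GROUP%s\n%s", order[g], out[order[g]]
    }' "$@"
}
```

It could be run as `number_by_suffix file1`. Because the number is looked up per name rather than counted per group, GROUP2 skips any number whose name it does not contain.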

My understanding is that column position 1 defines the "group", and columns 2-n are a control break field. We need to number the control breaks and generate a sequencing number based on that.
The first phase sequences the control breaks and formats the lines. It puts a group number at the beginning of the line also to make sorting easy. The output from the first phase (prior to sorting) would be:
1 (1,1file08)
2 (1,2file08)
1 (2,1file03)
1 (3,1file05)
2 (3,2file05)
1 (4,1file09)
2 (4,2file09)

After sorting, the final pass does a control break on the group number and inserts the GROUP header lines. The temporary group number at the front of the line is removed in this phase.
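#!/bin/ksh
# Phase 1 (first loop): number the control breaks and prefix each line
# with its group. Phase 2 (after the sort): print a GROUP header at each
# change of group and drop the temporary prefix.
seq=0
holdkey=
holdgroup=

while read line
do
    key=${line#?}            # everything after the first character
    prefix=${line%$key}      # the first character
    group=$prefix
    if test "$key" != "$holdkey" ; then
        holdkey=$key
        ((seq=seq+1))
    fi
    echo "$group ($seq,$line)"
done < junk.txt |
sort -k 1n -k2 |
while read group line
do
    if test "$group" != "$holdgroup" ; then
        holdgroup=$group
        echo "\nGROUP$group"
    fi
    echo $line
done

GROUP1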
(1,1file08)
(2,1file03)
(3,1file05)
(4,1file09)

GROUP2
(1,2file08)
(3,2file05)
(4,2file09)

And by the way, this solution allows for any number of groups. If I knew that there would always be just groups 1 and 2, I would have taken an entirely different (simpler) approach.
