Example data:
aaa20040110
aaa20040110
aaa20040110
aaa20040111
aaa20040111
The file may contain data for 2 dates or for a single day. I need to find out whether it has data for two days or one.

#!/bin/ksh
# Print every line of junk.data that contains exactly 2 words.
exec 3< ./junk.data
while read -u3 line
do
    if [[ $(print -r -- "$line" | wc -w) -eq 2 ]]
    then
        print -r -- "$line"
    fi
done

Jerry's solution prints each line containing 2 words.
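The same filter can probably be written as a single awk command, since awk's built-in NF variable holds the number of whitespace-separated fields on each line. A minimal sketch; the sample lines below are made up for illustration and are not from the original file:

```shell
# Hypothetical sample data: only two of the lines have exactly 2 words.
tmp=$(mktemp)
printf '%s\n' 'aaa 20040110' 'header' 'bbb 20040111' 'one two three' > "$tmp"

# NF is the field count of the current line; a pattern with no action prints the line.
two_word_lines=$(awk 'NF == 2' "$tmp")
printf '%s\n' "$two_word_lines"

rm -f "$tmp"
```

This avoids starting one wc process per input line, which matters on large files.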
To determine how many unique dates your file contains (keying on word1 in each line), you can isolate that word (with awk or cut, for example), and then do a unique sort which will eliminate duplicates. And this is korn shell syntax, but can be converted for other shells.
k=$(awk '{print $1}' myfile.txt | sort -u | wc -l)
echo "number of unique dates:" $k
The following script will summarize your file on word1:
awk '\
{date[$1]++}
END {for (i in date)
print i, date[i]}
' myfile.txt
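To tie the two scripts back to the original question (one day or two?), here is a small end-to-end sketch that runs both against the sample data from the first post. The temporary file and the variable names verdict and summary are assumptions for illustration; the syntax is plain POSIX shell, so it should run under ksh or bash.

```shell
# Sample data copied from the question.
data=$(mktemp)
printf '%s\n' aaa20040110 aaa20040110 aaa20040110 aaa20040111 aaa20040111 > "$data"

# Count distinct values of word 1. tr strips the padding some wc versions add.
k=$(awk '{print $1}' "$data" | sort -u | wc -l | tr -d ' ')

if [ "$k" -eq 1 ]; then
    verdict="single day"
else
    verdict="$k days"
fi
echo "file contains data for: $verdict"

# Per-date record counts, sorted because awk's for-in order is unspecified.
summary=$(awk '{date[$1]++} END {for (i in date) print i, date[i]}' "$data" | sort)
printf '%s\n' "$summary"

rm -f "$data"
```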

Thanks a lot. The next hurdle: if there are more than 2 dates, I want to know how many records there are for each date.
In the above example:
20040110 - 3
20040111 - 2
Once again, thanks a lot for replying so fast.

I'm ahead of you - my first reply provides those answers.
The first script gives you a count ($k) of how many unique dates you have.
The second script provides the summary you are asking for.

Thanks,
It's working great. It's a dumb question, but how can I get the summary data into a Unix shell variable (or array) and manipulate it?

It is very easy to waste the first line. awk could do an initial getline to waste it, or tail +2 (tail -n +2 in the POSIX spelling) would do it. But the last line is much more of a pain. awk could do that too, but only awkwardly (sorry about that) by holding each line and processing it on a delayed basis. I took the easy way below and just let sed strip those.
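For comparison, the delayed-processing awk approach mentioned above could be sketched roughly like this. The HEADER and TRAILER lines and the variable names f, prev, and body are assumptions for illustration, not part of the original file:

```shell
# Hypothetical file: a header line, three data lines, and a trailer line.
f=$(mktemp)
printf '%s\n' HEADER aaa20040110 aaa20040110 aaa20040111 TRAILER > "$f"

# NR == 1: skip the header.
# NR > 2:  print the line held back from the previous iteration.
# Every remaining line is then held in prev. The last line is held but
# never printed, so the trailer is dropped.
body=$(awk 'NR == 1 {next} NR > 2 {print prev} {prev = $0}' "$f")
printf '%s\n' "$body"

rm -f "$f"
```

On this input it should produce the same lines as sed '1d;$d', just with more bookkeeping.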
There are several ways to get output from awk (or from any command) back to the shell. But when the output is any number of lines, you need a construct that can process multiple lines. Below, the awk output is piped into a while-loop, where each iteration will process one line.
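By the way, in some shells (bash, and the pdksh that many Linux systems install as ksh), a while-loop at the end of a pipeline runs in a subshell, so any variables set within that while-loop will go away after the loop ends.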
sed '1d;$d' myfile.txt |
awk '\
{date[$1]++}
END {for (i in date)
print i, date[i]}' |
while read date k
do
echo "date=$date k=$k"
done
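If the values set inside the loop need to survive after it, one common workaround is to send the awk output to a temporary file and redirect the loop's input from that file, so the loop is not part of a pipeline. A sketch under that assumption; the HEADER/TRAILER lines and the names ndates, total, and last_date are hypothetical:

```shell
# Sample data from the question, wrapped in hypothetical header/trailer lines.
data=$(mktemp)
summary=$(mktemp)
printf '%s\n' HEADER aaa20040110 aaa20040110 aaa20040110 \
    aaa20040111 aaa20040111 TRAILER > "$data"

# Write the per-date summary to a temporary file instead of piping it into the loop.
sed '1d;$d' "$data" |
awk '{date[$1]++} END {for (i in date) print i, date[i]}' |
sort > "$summary"

ndates=0
total=0
# The loop reads from a redirect, so in POSIX-style shells it runs in the current shell.
while read -r date k
do
    ndates=$((ndates + 1))
    total=$((total + k))
    last_date=$date
done < "$summary"

# These variables are still set here, after the loop.
echo "dates=$ndates records=$total last=$last_date"
rm -f "$data" "$summary"
```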

I used
set -A record_count `awk 'BEGIN{FS = "|"}{ if(NF!=1) date[$1]++}END {for (i in date) print date[i]}' $fposted`
and then the array record_count has the data I require. Also, if(NF!=1) got rid of the head and tail lines, as they do not have the delimiter (which I did not mention in my post, sorry).
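Note that set -A is a ksh feature that bash does not provide. Where portability matters, a similar effect can be had with the positional parameters, which every POSIX shell supports. This is only a sketch: the pipe-delimited sample records are invented for illustration, and the counts are sorted by date because awk's for-in order is unspecified.

```shell
# Hypothetical pipe-delimited file; the header and trailer have no delimiter.
fposted=$(mktemp)
printf '%s\n' HEADER '20040110|a' '20040110|b' '20040110|c' \
    '20040111|d' '20040111|e' TRAILER > "$fposted"

# NF != 1 skips the header and trailer. Sort by date, then keep only the counts.
set -- $(awk -F'|' 'NF != 1 {date[$1]++} END {for (i in date) print i, date[i]}' "$fposted" |
         sort | awk '{print $2}')

# $# is the number of dates; $1, $2, ... are the per-date record counts.
ndates=$#
first_count=$1
echo "number of dates: $ndates"
for count in "$@"
do
    echo "records: $count"
done

rm -f "$fposted"
```

The trade-off is that set -- overwrites the script's own arguments, so save them first if they are still needed.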
Thanks for all your help
I'd appreciate it if you could give me some useful links to code samples and articles on Unix shell scripting.

Excellent! Looks like before long, you will be replying to the questions instead of asking them.
I do not have a good set of links (sorry) except for sed. But I think the other regulars on this board have some good recommendations.
But I would suggest that you post your question in a new thread because it might not be seen at the bottom of this thread.
