How do you count and display the frequency of occurrence of each letter of the alphabet (ignoring case)? For example, with abc as the input file.
This is a script I found in the nawk section that displays the frequencies of words.
awk '
# Print list of word frequencies
{
    for (i = 1; i <= NF; i++)
        freq[$i]++
}
END {
    for (word in freq)
        printf "%s\t%d\n", word, freq[word]
}'

How do I modify the script to count/display frequencies of characters instead? Any help will be greatly appreciated.

Good grief!
BEGIN { FS = "" }
{
    $0 = tolower($0)
    gsub(/[^[:alpha:]]/, "", $0)
    for (i = 1; i <= NF; i++)
        freq[$i]++
}
END {
    for (word in freq)
        printf "%s\t%d\n", word, freq[word]
}

HYHN.

As far as I can tell, you changed the [^a-z0-9_ \t] to [:alpha:]. What does this alpha function do? Does it count letters?
You could have told me earlier if you wanted a piece of code; I can easily provide some. I've got some code in my unix folder. Besides, I'm not the type of person to copy/paste somebody else's homework. Not before I can fully understand it.

You seem to have the documentation to hand, but can you not run the examples to see what they do? If for some reason you're stuck with WinXP and can't put a proper OS on your computer, I heard that MS has provided something called "Windows Services for UNIX" which might help.
From your earlier posts it did look to me like you just wanted the answer but if that is not the case...
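[:alpha:] is roughly the same as a-zA-Z inside a bracket expression, so [[:alpha:]] matches about what [a-zA-Z] does. It is not a function but a character class used in a regexp pattern for matching against the input records.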
BEGIN { FS = "" } is the salient part of the modification of the word counting programme because it means "set the field separator to the null string" instead of whitespace. This is what makes it count individual characters instead of words, and it has the same effect as the command-line option I was on about earlier.
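One caveat: POSIX leaves the behaviour unspecified when FS is the null string, although the common awks (gawk, mawk, the one-true-awk) do split every character into its own field. If that matters, a minimal sketch of a variant that does not rely on it walks each line with substr() instead. The function name count_letters is made up for illustration, and it assumes plain ASCII letters.

```shell
# count_letters is a hypothetical helper name, not something from the thread.
# Reads text on stdin and prints "letter<TAB>count", sorted by letter.
count_letters() {
    awk '
    {
        line = tolower($0)
        n = length(line)
        for (i = 1; i <= n; i++) {
            c = substr(line, i, 1)
            # Assumption: only the 26 ASCII letters are counted.
            if (index("abcdefghijklmnopqrstuvwxyz", c) > 0)
                freq[c]++
        }
    }
    END {
        for (c in freq)
            printf "%s\t%d\n", c, freq[c]
    }' | sort
}

printf 'Hello, World!\n' | count_letters
```

The trailing sort is only there because "for (c in freq)" visits the array in no particular order.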
I'm sure I don't need to tell you what the tolower() function does, so that just leaves the gsub() function. You can look it up in your documentation but basically it just replaces any text matching the regexp in its first argument with its second argument. So in this case we are matching anything that is NOT an alphabetic character and replacing it with "" - the null string.
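To see those two functions on their own, something along these lines can be run against a sample string. The function name letters_only and the sample text are arbitrary, and it assumes an awk that understands POSIX character classes (very old mawk builds do not).

```shell
# letters_only is a hypothetical name used only for this demonstration.
# Lower-cases each input line, then deletes everything that is not a letter.
letters_only() {
    awk '{
        $0 = tolower($0)
        gsub(/[^[:alpha:]]/, "")   # no third argument: gsub works on $0
        print
    }'
}

printf 'Abc 123, DEF_g!\n' | letters_only
```

What is left after these two steps is exactly the string that the FS = "" splitting then breaks into single-character fields.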

As this is the unix forum, on *nix the following will do the job:
sed 's/./&\
/g' file | sort -f | uniq -ci

(a carriage-return/enter follows &\)
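Note that the sed pipeline counts every character, spaces and punctuation included. A rough sketch of a letters-only variant built from tr and fold follows. The name letter_freq is made up, and single-byte (ASCII) input is assumed.

```shell
# letter_freq is a hypothetical helper name, not something from the thread.
# Reads text on stdin and prints "count letter" lines, ignoring case.
letter_freq() {
    tr -cd '[:alpha:]' |              # delete everything except letters
        tr '[:upper:]' '[:lower:]' |  # fold upper case to lower case
        fold -w 1 |                   # put one character on each line
        sort |
        uniq -c
}

printf 'Hello, World!\n' | letter_freq
```

The amount of leading padding in front of each count depends on the uniq implementation, so pipe the result through awk if exact columns are needed.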
