Word count in unix

Name: srikanth_onlinein
Date: September 21, 2005 at 05:15:30 Pacific
OS: Sun Solaris
CPU/Ram: 2gb
Comment:

Hi Friends,

What is the fastest way to calculate the word count of a Unix flat file?
My flat file contains about 150 million records.
The conventional way, wc -l, is taking a huge amount of time to give the word count. Is there any faster way of doing it?

Please reply soon, it's urgent for me.

Regards,
Srikanth



Response Number 1
Name: nails
Date: September 21, 2005 at 09:19:06 Pacific
Reply:

In order to count words using unix tools other than wc, you'd probably have to write an awk or perl script. Without running tests, I can't be precise, but I don't think you can improve much on 'wc's speed.
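
A minimal sketch of what such an awk script might look like, compared against wc -w on a tiny sample file. The sample file and the variable names are made up for illustration. Whether awk is any faster than wc on a 150-million-record file would have to be measured (for example with time).

```shell
# Hypothetical sample file, created only for this illustration.
sample=$(mktemp)
printf 'one two three\nfour  five\n\nsix\n' > "$sample"

# awk splits each line into whitespace-separated fields; NF is the number
# of fields on the current line, so summing NF gives the word count.
awk_words=$(awk '{ n += NF } END { print n + 0 }' "$sample")

# wc -w counts whitespace-delimited words; tr strips the padding that
# some wc implementations put in front of the number.
wc_words=$(wc -w < "$sample" | tr -d ' ')

echo "awk: $awk_words  wc: $wc_words"
rm -f "$sample"
```

Both commands should report the same number for ordinary text, so the choice comes down to speed on the real data.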



Response Number 2
Name: Jim Boothe
Date: September 21, 2005 at 11:30:05 Pacific
Reply:

If your large file happens to be a permanent file (the same file day in and day out) that grows, you might get some mileage out of establishing a baseline. This approach ran 17% faster for me on a file with 7 million lines. That's not a lot, but "your results may vary".

In the example below, I am using a much smaller file.  The counts below show the entire file, then a "base count" for the first 10000 lines:

wc bigfile
11284 88935 539877 bigfile

basecount=10000
head -$basecount bigfile | wc
10000 79311 481362

Now, each day I just have to process the lines beyond the base:

tail +$((basecount+1)) bigfile | wc
1284 9624 58515

Of course, the tail command still has to scan the file, but it appears to do that a bit faster than wc.
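
The baseline idea above can be sketched as a small shell function. This is only an illustration under assumptions: the function name count_with_baseline and the demo file are hypothetical, the first lines of the file must never change once the baseline is recorded, and it uses the POSIX spelling tail -n +N rather than the tail +N form shown above.

```shell
# count_with_baseline FILE BASE_LINES BASE_WORDS   (hypothetical helper)
# BASE_LINES and BASE_WORDS are the line and word counts recorded earlier
# for the first BASE_LINES lines of FILE. Prints "total_lines total_words".
count_with_baseline() {
    file=$1
    base_lines=$2
    base_words=$3
    # Count only the part beyond the baseline; wc -lw prints "lines words".
    # The unquoted substitution is deliberate so the two numbers split
    # into $1 and $2.
    set -- $(tail -n +"$((base_lines + 1))" "$file" | wc -lw)
    echo "$((base_lines + $1)) $((base_words + $2))"
}

# Demo on a made-up four-line file whose first 2 lines hold 5 words.
demo=$(mktemp)
printf 'a b\nc d e\nf\ng h\n' > "$demo"
result=$(count_with_baseline "$demo" 2 5)
echo "$result"
rm -f "$demo"
```

In daily use the baseline counts would be stored somewhere (a small file, say) and refreshed now and then so the tail portion stays short.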


Response Number 3
Name: Dlonra
Date: September 23, 2005 at 07:06:13 Pacific
Reply:

You say "wordcount" but also say "wc -l".
Do you want words or lines?

If you want words, is there a different number of words in each line of the file?

