Shell script for Text file parsing

Name: nani123nani
Date: December 21, 2002 at 19:10:23 Pacific
OS: Unix
CPU/Ram: ??
Comment:

Hi guys,
I am very new to shell scripting ....
I have a text file with 5 columns, and all columns are separated by some space ...
I need to capture all the individual user information attributes separately, like Phone number, UserId, FNAME etc.
I am able to parse the few lines in which the phone number is a single string without spaces in between. But I am not able to parse phone numbers which have spaces ...

My text file is like below ...
++++++++++++++++++++++++++++++++++++++
UserId FNAME LNAME Phone ORG
------
PZ44GK Gery Kissel 8-353-4149 ATV
FZW2WV Phillip Louey +61 3 9647 5520 HOLDENS
KZ20FH James Davies +61 3 9647 1420 HOLDENS
BZYB7C Andrew Brenz 8-(810) 236-0598 CLCD
++++++++++++++++++++++++++++++++++++++++++

All the fields except the phone number are going to be a single word. I need to read the file line by line and capture all the individual entities separately.

Any suggestions ..

Thanks ...




Response Number 1
Name: James Boothe
Date: December 23, 2002 at 10:57:57 Pacific
Reply:

The best choice would be for that file to use something besides spaces as the field delimiter (comma, semicolon, bar character). Your only problem right now is with spaces in the phone number, but what if a first name needs to be "John Jr." or ORG needs to be "ABC Inc."?
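
To illustrate the point about delimiters, here is a minimal sketch that assumes the file has already been converted to bar-delimited records. The function name show_fields and the sample record are made up for illustration; they are not from the original post.

```shell
# Sketch only: assumes bar-delimited input, one record per line.
# show_fields is a hypothetical helper name.
show_fields() {
    # -F'|' makes the bar the only field separator, so spaces inside
    # a field ("John Jr.", "ABC Inc.") stay part of that field.
    awk -F'|' '{ print "FNAME=[" $2 "] Phone=[" $4 "] ORG=[" $5 "]" }' "$@"
}
```

Called as `show_fields users.bar` (or with the data on standard input), each field comes out whole regardless of how many spaces it contains, so no special handling of the phone number is needed.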

The following script will isolate the phone number as starting with word 4 and all additional words on the line up to but not including the last word. It then prints all fields separated by bar characters. This output could be directed to a new file instead of the screen. It bypasses the first two lines of the input file on the assumption that they are header lines, but the bypass logic could of course be made more intelligent if needed.

#!/bin/ksh
awk '
BEGIN { getline; getline }    # bypass two header lines
{
    phone = $4
    # append words 5 .. NF-1 to the phone; the last word is ORG
    for (w = 5; w < NF; w++)
        phone = phone " " $w
    print $1 "|" $2 "|" $3 "|" phone "|" $NF
}' myfile
exit 0
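
One possible way to make the header bypass more intelligent is to skip lines by content instead of by position. The sketch below rests on two assumptions that are not stated in the original post: a data line always has at least 5 words, and the column-title line starts with the word "UserId". The function name parse_users is also just for illustration.

```shell
# Sketch only. Assumptions: data lines have at least 5 words, and the
# column-title line begins with "UserId". parse_users is a made-up name.
parse_users() {
    awk '
    # Skip blank lines, ruler lines ("------", "++++") and the column titles.
    NF < 5 || $1 == "UserId" { next }
    {
        phone = $4
        for (w = 5; w < NF; w++)      # words 5 .. NF-1 belong to the phone
            phone = phone " " $w
        print $1 "|" $2 "|" $3 "|" phone "|" $NF
    }' "$1"
}
```

With this, `parse_users myfile` would also cope with a file that really contains the "+++++" lines shown in the question, since those are one-word lines and are skipped wherever they appear.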

You did not say what you want to do with these fields once they are parsed and isolated. The following script expands on the one above by piping the awk output into a while loop, where each record can be processed with whatever Unix commands you need:

#!/bin/ksh
awk '
BEGIN { getline; getline }    # bypass two header lines
{
    phone = $4
    # append words 5 .. NF-1 to the phone; the last word is ORG
    for (w = 5; w < NF; w++)
        phone = phone " " $w
    print $1 "|" $2 "|" $3 "|" phone "|" $NF
}' myfile |
while IFS=\| read UserId FNAME LNAME Phone ORG
do
    print "UserId=$UserId FNAME=$FNAME LNAME=$LNAME Phone=$Phone ORG=$ORG"
done
exit 0
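
The print command in the loop is a ksh built-in. If the script has to run under plain sh or bash, a rough equivalent can be sketched with printf. In the version below, the function name show_users is made up, NR <= 2 stands in for the two getline calls, and read -r is added so that backslashes in the data are left alone; none of that comes from the script above.

```shell
# Sketch only, for shells without the ksh "print" built-in.
# show_users is a hypothetical name; NR <= 2 replaces the two getline calls.
show_users() {
    awk '
    NR <= 2 { next }                  # bypass two header lines
    {
        phone = $4
        for (w = 5; w < NF; w++)
            phone = phone " " $w
        print $1 "|" $2 "|" $3 "|" phone "|" $NF
    }' "$1" |
    while IFS='|' read -r UserId FNAME LNAME Phone ORG
    do
        printf 'UserId=%s FNAME=%s LNAME=%s Phone=%s ORG=%s\n' \
            "$UserId" "$FNAME" "$LNAME" "$Phone" "$ORG"
    done
}
```

One thing to keep in mind: in some shells (bash by default, for example) a loop at the end of a pipeline runs in a subshell, so variables set inside it are gone once the loop ends. Do the work inside the loop, or have awk write to a temporary file and read that file in the loop instead.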

