Shell script for Text file parsing

Original Message
Name: nani123nani
Date: December 21, 2002 at 19:10:23 Pacific
Subject: Shell script for Text file parsing
OS: Unix
CPU/Ram: ??
Comment:

Hi guys,
I am very new to shell scripting ...
I have a text file with 5 columns, and all columns are separated by some space ...
I need to capture all individual user information attributes separately, like Phone number, UserId, FNAME etc.
I am able to parse the few lines whose phone number is a single string without spaces in between. But I am not able to parse phone numbers which have spaces ...

My text file is like below ...
++++++++++++++++++++++++++++++++++++++
UserId FNAME LNAME Phone ORG
------
PZ44GK Gery Kissel 8-353-4149 ATV
FZW2WV Phillip Louey +61 3 9647 5520 HOLDENS
KZ20FH James Davies +61 3 9647 1420 HOLDENS
BZYB7C Andrew Brenz 8-(810) 236-0598 CLCD
++++++++++++++++++++++++++++++++++++++++++

All the fields except the phone number are going to be a single word. I need to read line by line and capture all individual entities separately.

Any suggestions ..

Thanks ...


Response Number 1
Name: James Boothe
Date: December 23, 2002 at 10:57:57 Pacific
Subject: Shell script for Text file parsing
Reply:

The best choice would be for that file to use something besides spaces as the field delimiter (comma, semicolon, bar character). Your only problem right now is with spaces in the phone number, but what if the first name needs to be "John Jr." or ORG needs to be "ABC Inc."?
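To illustrate why a non-space delimiter helps, here is a minimal sketch. It assumes the file could be produced with bar characters between fields; the sample records are the poster's data re-delimited by hand, and the function name show_phones is made up for this example. With an unambiguous delimiter, every field is addressable directly, embedded spaces and all.

```shell
#!/bin/sh
# Hypothetical helper: print "UserId: Phone" for bar-delimited records on stdin.
show_phones() {
    awk -F'|' '{ print $1 ": " $4 }'
}

# Demo data (assumption): the sample records, re-delimited with bar characters.
show_phones <<'EOF'
FZW2WV|Phillip|Louey|+61 3 9647 5520|HOLDENS
BZYB7C|Andrew|Brenz|8-(810) 236-0598|CLCD
EOF
```

No guessing about which words belong to the phone number is needed here, and a name such as "John Jr." would survive intact as well.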

The following script will isolate the phone number as starting with word 4 and all additional words on the line up to but not including the last word. It then prints all fields separated by bar characters. This output could be directed to a new file instead of the screen. It bypasses the first two lines of the input file on the assumption that they are header lines, but the bypass logic could of course be made more intelligent if needed.

#!/bin/ksh
awk '
BEGIN { getline; getline }    # bypass two header lines
{
    phone = $4
    for (w = 5; w < NF; w++)  # append words 5 through NF-1
        phone = phone " " $w
    print $1 "|" $2 "|" $3 "|" phone "|" $NF
}' myfile
exit 0
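
One way the header bypass could be made "more intelligent" is to skip lines by content rather than by position. The following is only a sketch under assumptions: that the column-heading line always starts with the word UserId, and that any line with fewer than five words (the dashed ruler, blank lines) is not a data record. The function name parse_users is made up for this example.

```shell
#!/bin/sh
# Hypothetical variant: skip header lines by content instead of by position.
parse_users() {
    awk '
    $1 == "UserId" { next }    # column-heading line (assumed to start with UserId)
    NF < 5         { next }    # rulers, blank lines, incomplete records
    {
        phone = $4
        for (w = 5; w < NF; w++)   # words 5 through NF-1 belong to the phone number
            phone = phone " " $w
        print $1 "|" $2 "|" $3 "|" phone "|" $NF
    }' "$@"
}

# Demo: the first two data rows from the original post, headers included.
parse_users <<'EOF'
UserId FNAME LNAME Phone ORG
------
PZ44GK Gery Kissel 8-353-4149 ATV
FZW2WV Phillip Louey +61 3 9647 5520 HOLDENS
EOF
```

Called as parse_users myfile it reads the named file; with no arguments it reads standard input. Because the filter no longer depends on line position, it keeps working if the header is missing or repeated.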

You did not say what you wanted to do with these fields after you have them parsed and isolated. The following script expands on the above script by piping the awk output into a while loop for processing by whatever unix commands:

#!/bin/ksh
awk '
BEGIN { getline; getline }    # bypass two header lines
{
    phone = $4
    for (w = 5; w < NF; w++)  # append words 5 through NF-1
        phone = phone " " $w
    print $1 "|" $2 "|" $3 "|" phone "|" $NF
}' myfile |
while IFS=\| read UserId FNAME LNAME Phone ORG
do
    print "UserId=$UserId FNAME=$FNAME LNAME=$LNAME Phone=$Phone ORG=$ORG"
done
exit 0
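
If awk is not wanted at all, the same split can be sketched with shell built-ins only. This is an alternative technique, not the script above: read puts everything after the third word into one variable, and parameter expansion then peels the last word off as ORG. It assumes ORG is a single word separated from the phone number by exactly one space, and that header lines start with UserId, a dash, or a plus sign. The function name split_users is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: split space-delimited user records from stdin
# using only shell built-ins.
split_users() {
    while read -r UserId FNAME LNAME rest
    do
        # Skip the heading line, ruler lines, and blank lines.
        case $UserId in
            UserId|[-+]*|'') continue ;;
        esac
        ORG=${rest##* }      # last word of the remainder
        Phone=${rest% *}     # everything before that last word
        printf 'UserId=%s FNAME=%s LNAME=%s Phone=%s ORG=%s\n' \
            "$UserId" "$FNAME" "$LNAME" "$Phone" "$ORG"
    done
}

# Demo: two rows from the original post, headers included.
split_users <<'EOF'
UserId FNAME LNAME Phone ORG
------
KZ20FH James Davies +61 3 9647 1420 HOLDENS
BZYB7C Andrew Brenz 8-(810) 236-0598 CLCD
EOF
```

For a real file, redirect it into the function, for example split_users < myfile.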

