Parsing Question

Original Message
Name: ld123
Date: February 20, 2008 at 23:26:31 Pacific
Subject: Parsing Question
OS: Win XP
CPU/Ram: Dual Core
Model/Manufacturer: Pavilion DV6000
Comment:
I have a directory with .txt files in it. I'm trying to parse out a value in the files and output to a .csv file. The value will always be a letter, V, and 7 digits, example V1234567. The letter will always be the same, the digits will always be different, but always 7 in number.

I've tried some of the examples I've found on the board, but I am stumped.

When in doubt, format.



Response Number 1
Name: klint
Date: February 21, 2008 at 03:24:08 Pacific
Subject: Parsing Question
Reply:
Can you paste the first few lines of the file so we can see what format it's in?


Response Number 2
Name: ld123
Date: February 21, 2008 at 09:09:41 Pacific
Subject: Parsing Question
Reply:
Here are the first three lines of a sample. This will be the general format for all the files. I have added the 1.), 2.), and 3.) in front of the lines:

1.)testrun|^~\&|ITDEPT|SAMPLE|ENGINE||||TRVB||2|68|
2.)TID||||||||||||||||||V1234567||
3.)CID||||||||||||||||||||||||||||||||||||||||||||08 13500|

I hope that helps. The V1234567 is the value that I'm trying to extract.


When in doubt, format.



Response Number 3
Name: ld123
Date: February 21, 2008 at 09:26:29 Pacific
Subject: Parsing Question
Reply:
As an FYI, this is a custom report we run on workstations. TID is the terminal ID. I have no idea what CID is; it looks like some date field that I don't really need to parse out.

When in doubt, format.



Response Number 4
Name: Mechanix2Go
Date: February 21, 2008 at 14:52:21 Pacific
Subject: Parsing Question
Reply:
Will the string of interest always be in the line[s] starting with:

TID

?

If so, things get easier.



=====================================
If at first you don't succeed, you're about average.

M2




Response Number 5
Name: klint
Date: February 21, 2008 at 15:31:08 Pacific
Subject: Parsing Question
Reply:
Also, in addition to M2's question, on the TID line is there always exactly one instance of the letter V (i.e. the one that's followed by the number)? If so, things get easier still.


Response Number 6
Name: ld123
Date: February 21, 2008 at 18:23:27 Pacific
Subject: Parsing Question
Reply:
Yes, the TID line is the only one of interest, and there is only one instance of the letter V, followed by the numbers.

When in doubt, format.



Response Number 7
Name: ghostdog
Date: February 22, 2008 at 07:31:47 Pacific
Subject: Parsing Question
Reply:
get GNU awk for Windows from http://gnuwin32.sourceforge.net/pac...
then

c:\> gawk -F "|" "/TID/{print $(NF-2)}" file
V1234567

you can incorporate that command in batch if you wish



Response Number 8
Name: ld123
Date: February 22, 2008 at 08:39:05 Pacific
Subject: Parsing Question
Reply:
I'm downloading a copy of Gawk for Windows now. I'll install it on a VM server I have and test it out. With the example you gave:

c:\> gawk -F "|" "/TID/{print $(NF-2)}" file
V1234567

wouldn't that only extract V1234567? The next file may have the value V0987654.

I haven't had a chance to look at the software yet, but how would you handle wildcards?

When in doubt, format.



Response Number 9
Name: ld123
Date: February 22, 2008 at 10:18:41 Pacific
Subject: Parsing Question
Reply:
Well, a whole different thing to learn now. I'm going through the PDFs that came with the program. Could you explain the command you wrote? I can't find any mention of a print command with NF-2 and so on.

When in doubt, format.



Response Number 10
Name: FishMonger
Date: February 22, 2008 at 10:31:43 Pacific
Subject: Parsing Question
Reply:
Your original post indicated that you needed to output to a csv file, which presumably means that you need to incorporate this parsed value with other data. Is that correct? If so, the gawk command may not do everything you need.

Here's a Perl command that accomplishes the same thing as the gawk command, but my assumption is that we need to expand the logic to handle the additional csv data.

perl -ne "print $1 if /TID.+(V\d{7})/" file.txt
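
For the directory-to-CSV part of the original question, a rough Python sketch could look like the one below. It is only an illustration under a few assumptions: the value sits on a line that starts with TID, it is V followed by exactly seven digits, and one CSV row per file is what is wanted. The function names and the CSV column names are made up for this example.

```python
import csv
import re
from pathlib import Path

# "V" followed by exactly seven digits (an eighth digit makes it a non-match).
TID_VALUE = re.compile(r"V\d{7}(?!\d)")


def find_tid_value(text):
    """Return the first V1234567-style value found on a TID line, or None."""
    for line in text.splitlines():
        if line.startswith("TID"):
            match = TID_VALUE.search(line)
            if match:
                return match.group(0)
    return None


def write_csv(folder, out_path):
    """Write one (file name, value) row for every .txt file in folder."""
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["file", "tid_value"])
        for path in sorted(Path(folder).glob("*.txt")):
            # errors="replace" keeps odd bytes from stopping the whole run.
            text = path.read_text(errors="replace")
            writer.writerow([path.name, find_tid_value(text) or ""])
```

Calling write_csv with the report folder and an output file name would produce the CSV; a file with no matching TID line gets an empty value rather than being skipped, so missing values are easy to spot.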



Response Number 11
Name: ld123
Date: February 22, 2008 at 10:41:39 Pacific
Subject: Parsing Question
Reply:
I'm trying these ideas out on a 2003 Server VMware machine. What would I need to run Perl on that?

When in doubt, format.



Response Number 12
Name: FishMonger
Date: February 22, 2008 at 11:14:36 Pacific
Subject: Parsing Question
Reply:
You need to install Perl, which is free.

http://activestate.com/Products/act...



Response Number 13
Name: ld123
Date: February 22, 2008 at 11:16:52 Pacific
Subject: Parsing Question
Reply:
ok, I'll give it a shot. At this point, I will try anything.

When in doubt, format.



Response Number 14
Name: ghostdog
Date: February 22, 2008 at 21:10:25 Pacific
Subject: Parsing Question
Reply:
@ld123

NF means number of fields in gawk. -F means field delimiter.

gawk -F "|" means set the field delimiter to "|".


If you try this on the command line:
gawk -F "|" "{print NF}" file

it shows you the number of fields on each line, separated by "|". You can verify by counting.

/TID/ simply means to match lines that contain TID.


$(NF-2) means to get the VALUE of the field that is 2 places before the final field. That would be your V1234567, and it's assumed that VXXXXXXX is always at that field.

$NF means to get the VALUE of the last field.

NF by itself is the number of fields, which is also the position of the last field.

To extract that VXXXXXXX value to a new file, just redirect the output.

gawk -F .... > newfile
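
For anyone who finds the NF notation easier to follow in another language, here is a small Python sketch of the same field arithmetic. It is only an analogy, not how gawk is implemented; split_fields is a made-up helper name, and the comparison is meant for non-empty lines with a single-character delimiter.

```python
def split_fields(line, delimiter="|"):
    """Split a non-empty line on one delimiter character, keeping empty fields."""
    return line.rstrip("\r\n").split(delimiter)


line = "TID||||||||||||||||||V1234567||"
fields = split_fields(line)

nf = len(fields)         # like gawk's NF: the number of fields
last_field = fields[-1]  # like gawk's $NF: the value of the last field
target = fields[-3]      # like gawk's $(NF-2): two places before the last field

print(nf, repr(last_field), target)
```

The last field here is empty because the line ends with a delimiter, which is why the value of interest is two places before the end rather than at the end.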




Response Number 15
Name: Mechanix2Go
Date: February 22, 2008 at 23:05:49 Pacific
Subject: Parsing Question
Reply:
If the Vxxxxxxx is in the TID line and the intervening chars are |, this may do it.


::==
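@echo off
setLocal EnableDelayedExpansion

rem Read each line of parse.txt that contains "TID"
for /f "tokens=* delims= " %%a in ('find "TID" ^< parse.txt') do (
  set str=%%a
  rem Remove every "|" character
  set str=!str:^|=!
  rem Remove the TID label, leaving only the value
  set str=!str:TID=!
  echo !str!
)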
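
The same strip-everything-else idea can be sketched in Python for comparison. This is an approximation of the batch logic, not a line-for-line port (for one thing, batch string substitution ignores case and this does not), and both function names are invented for the example.

```python
def strip_tid_line(line):
    """Drop every "|" and the "TID" label, leaving whatever else is on the line."""
    return line.strip().replace("|", "").replace("TID", "")


def values_from_text(text):
    """Apply strip_tid_line to each line that contains "TID"."""
    return [strip_tid_line(line) for line in text.splitlines() if "TID" in line]
```

Like the batch file, this only gives a clean result when the TID line holds nothing but the label, the delimiters and the value.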


=====================================
If at first you don't succeed, you're about average.

M2




Response Number 16
Name: klint
Date: February 23, 2008 at 03:00:52 Pacific
Subject: Parsing Question
Reply:
Why not use the letter V as the field delimiter? Then all you have to do is look at the first seven characters of the second field.
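
A quick Python sketch of that idea, assuming (as confirmed earlier in the thread) that the TID line contains exactly one V. The function name is made up, and the digit check is an extra safeguard that is not part of the suggestion itself.

```python
def value_after_v(line):
    """Split on "V" and take the first seven characters of the second field."""
    parts = line.split("V")
    if len(parts) < 2 or len(parts[1]) < 7:
        return None
    digits = parts[1][:7]
    # Only accept the result if those seven characters are plain digits.
    if digits.isascii() and digits.isdigit():
        return "V" + digits
    return None
```

It should only be applied to the TID line: the first line of the sample also contains a V (in TRVB), which is why the digit check returns None there instead of a bogus value.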


Response Number 17
Name: ld123
Date: February 25, 2008 at 03:59:26 Pacific
Subject: Parsing Question
Reply:
Mech,

I've run the bat against a sample file, but no dice. The file runs but no output, even though the sample file has the format and value mentioned above. Any ideas?

Thank you for the help.

When in doubt, format.



Response Number 18
Name: ld123
Date: February 25, 2008 at 04:00:29 Pacific
Subject: Parsing Question
Reply:
Forgot to mention, the bat runs against the file without any errors, just no output.

When in doubt, format.



Response Number 19
Name: Mechanix2Go
Date: February 26, 2008 at 03:38:52 Pacific
Subject: Parsing Question
Reply:
Did you change parse.txt to whatever yours is?

Is the file STRICTLY text?

Try this:

type parse.txt

If it doesn't display 'clean' it may have some non-standard line breaks.
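
If the type output is hard to judge by eye, a small Python check along these lines could count the line-ending styles directly. It is a guess at what "non-standard" might turn out to be (bare LF or CR endings, or a UTF-16 file), not a diagnosis; describe_line_endings is a hypothetical helper name.

```python
def describe_line_endings(data):
    """Count line-ending styles in raw bytes and flag signs of non-ASCII text."""
    crlf = data.count(b"\r\n")
    return {
        "crlf": crlf,                               # Windows-style endings
        "lf_only": data.count(b"\n") - crlf,        # Unix-style endings
        "cr_only": data.count(b"\r") - crlf,        # old Mac-style endings
        "has_nul_bytes": b"\x00" in data,           # common in UTF-16 files
        "has_utf16_bom": data[:2] in (b"\xff\xfe", b"\xfe\xff"),
    }
```

The file has to be read in binary mode for this to work, for example by passing in open("parse.txt", "rb").read(). For a UTF-16 file the three counts are not meaningful, but the two flags will show up as True.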


=====================================
If at first you don't succeed, you're about average.

M2




All content ©1996-2007 Computing.Net, LLC