UNIX script for Extracting data

Original Message
Name: arung
Date: January 21, 2007 at 04:12:50 Pacific
Subject: UNIX script for Extracting data
OS: Windows Professional
CPU/Ram: P4/512MB
Model/Manufacturer: Intel
Comment:

Hi ALL,
I have a txt file with contents like:
(CLASS "MO_INP_PFTC0308"
(Name "MO_INP_PFTC0308")
(MESSAGE_TYPE "M")
(CLASSID "156332")
(PERSISTENCE_TYPE "T")
(MODULE_PREFIX "PF")
)
I need to place each of these in an MS Excel sheet against the class name. There are thousands of such classes. Can somebody suggest a UNIX script to extract this information from the .txt file so that I can place it in MS Excel? (A UNIX script is not a must; you can also suggest some other, easier way to do this.)

Arun




Response Number 1
Name: Mechanix2Go
Date: January 21, 2007 at 08:30:33 Pacific
Reply:

Are there several 'chunks' ['blocks'] starting with:

(CLASS

and ending with:

(MODULE_PREFIX

?


=====================================
If at first you don't succeed, you're about average.

M2




Response Number 2
Name: FishMonger
Date: January 21, 2007 at 08:47:55 Pacific
Reply:

A Unix (shell) script can't write to an Excel file, but it can write to a CSV file, which can then be imported into Excel. M2 is good with batch scripting, but batch has the same limitation when it comes to writing Excel files. Perl, however, can read and write an Excel file without the need for the CSV file.

I'm a little busy at the moment so I can't work up an example, but if you want to go with Perl, I'll work up something when I have more time.
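
For reference, the shell-to-CSV route mentioned above could look roughly like the sketch below. It is only an illustration under assumptions the original poster has not confirmed: every field sits on its own line as (KEY "value"), and a line holding only ")" closes each record. The function name classes_to_csv and the file names are made up for the example.

```shell
#!/bin/sh
# Sketch only: turn (CLASS ...) blocks into CSV rows with awk.
# Assumptions: one (KEY "value") field per line, and a line that
# holds only ")" ends a record. classes_to_csv is a hypothetical name.
classes_to_csv() {
    awk '
        function q(s) { return "\"" s "\"" }   # wrap a value in double quotes
        BEGIN { print "Class,Name,Message,ID,Persistence,Module" }
        { sub(/\r$/, "") }                     # drop a trailing CR from Windows files
        match($0, /"[^"]*"/) {                 # this line carries a quoted value
            key = $1
            sub(/^\(/, "", key)                # strip the leading paren from the key
            rec[key] = substr($0, RSTART + 1, RLENGTH - 2)
        }
        /^[ \t]*\)[ \t]*$/ {                   # a lone closing paren ends the record
            printf "%s,%s,%s,%s,%s,%s\n",
                q(rec["CLASS"]), q(rec["Name"]), q(rec["MESSAGE_TYPE"]),
                q(rec["CLASSID"]), q(rec["PERSISTENCE_TYPE"]), q(rec["MODULE_PREFIX"])
            split("", rec)                     # clear the array for the next record
        }
    ' "$1"
}

# Usage: classes_to_csv class.txt > class.csv
```

Because the end of a record is detected by the closing line rather than by a blank line, this should work whether or not the records are separated by blank lines. Fields missing from a record come out as empty quoted cells.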



Response Number 3
Name: Mechanix2Go
Date: January 21, 2007 at 08:58:46 Pacific
Reply:

Hi FM,

Is there a free util to emulate unix shell within DOS/NT?


=====================================
If at first you don't succeed, you're about average.

M2




Response Number 4
Name: FishMonger
Date: January 21, 2007 at 10:30:57 Pacific
Reply:

Hi M2,

Yes there is, it's called cygwin. It's not a full unix shell, but it allows you to run a good number of the most used unix commands.

http://www.cygwin.com/



Response Number 5
Name: FishMonger
Date: January 21, 2007 at 10:45:03 Pacific
Reply:

arung,

Exactly how do you want the data formatted in the spreadsheet? Because of the way this site processes our posts, the formatting often gets mangled. Can you post a link to your sample text file so we can see its exact format?




Response Number 6
Name: arung
Date: January 21, 2007 at 23:39:30 Pacific
Reply:

Hi all,
The text file looks similar to what is displayed above.

Arun



Response Number 7
Name: arung
Date: January 21, 2007 at 23:41:06 Pacific
Reply:

Hi FishMonger,
I can also go with Perl.
Thanks in advance

Arun



Response Number 8
Name: arung
Date: January 21, 2007 at 23:43:22 Pacific
Reply:

Hi Mechanix2Go,
Yes, there are several 'chunks' ['blocks'] starting with:
(CLASS

and ending with:

(MODULE_PREFIX

Thanks in advance

Arun



Response Number 9
Name: Mechanix2Go
Date: January 22, 2007 at 02:00:44 Pacific
Reply:

I figured out how to do it with an NT batch script. Maybe there's a util that lets Unix run a batch file.

FM,

I got Cygwin. As soon as I back up, I'll give it a whirl.


=====================================
If at first you don't succeed, you're about average.

M2




Response Number 10
Name: FishMonger
Date: January 22, 2007 at 09:20:22 Pacific
Reply:

Here's a Perl script that extracts the data and creates the spreadsheet. However, it's still unclear which fields you want to extract, how the records are separated (i.e., is there a blank line between each record?), or how you want the spreadsheet data formatted.

I assumed that each record is separated by a blank line and that you want to extract every field. The script does not set the cell format (size, color, etc.), but that could be added.

I used Spreadsheet::WriteExcel, but there are lots of other Perl modules for dealing with spreadsheets.
http://search.cpan.org/search?query...

=======================================================================
#!perl

use warnings;
use strict;
use Spreadsheet::WriteExcel;

# create an empty spreadsheet with 1 worksheet
my $workbook = Spreadsheet::WriteExcel->new('class.xls');
my $worksheet = $workbook->add_worksheet();

# initialize the row and col positions
# (the parentheses on the right are required; without them only $row is assigned)
my ($row, $col) = (0, 0);

# write the header row (using an anonymous array)
$worksheet->write_row($row, $col, ['Class', 'Name', 'Message', 'ID', 'Persistence', 'Module']);

# set the input record separator to a blank line
$/ = "\n\n";

# open a read filehandle to the raw data file
open (CLASS, "<", "class.txt") or die "can't open class.txt $!";

while (<CLASS>) {
    my @data; # initialize an empty array to hold the record

    # use multiple regex's to extract the data
    # this can also be done with a single regex (regular expression)
    ($data[0]) = /CLASS "([^"]+)"/;
    ($data[1]) = /Name "([^"]+)"/;
    ($data[2]) = /MESSAGE_TYPE "([^"]+)"/;
    ($data[3]) = /CLASSID "([^"]+)"/;
    ($data[4]) = /PERSISTENCE_TYPE "([^"]+)"/;
    ($data[5]) = /MODULE_PREFIX "([^"]+)"/;

    # create a reference to the data array
    my $data = \@data;

    # increment the row and write the data
    $row++;
    $worksheet->write_row($row, $col, $data);
}

$workbook->close() or die "Error closing spreadsheet file: $!";
close CLASS;



Response Number 11
Name: Mechanix2Go
Date: January 22, 2007 at 09:39:42 Pacific
Reply:

Hi FM,

I didn't assume a blank line between records; I used "MODULE" as the end-of-record marker instead.

::== m.bat
@echo off > my.csv
setLocal EnableDelayedExpansion

for /f "tokens=* delims= " %%A in (mytxt) do (
set /a T+=1
set S!T!=%%A
echo %%A | find "MODULE" > nul
if not errorlevel 1 set /a T=0
if !T! equ 0 echo !S1!, !S2!, !S3!, !S4!, !S5!, !S6! >> my.csv
)
goto :eof
:: DONE


=====================================
If at first you don't succeed, you're about average.

M2




Response Number 12
Name: FishMonger
Date: January 22, 2007 at 10:18:26 Pacific
Reply:

Update:

If there are no blank lines separating the records, then we would need to change the input record separator to this:

$/ = ")\n)\n";
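
Which of the two separators applies depends on the actual file, which was never posted. A rough way to check is to count the record openers and the blank lines, as in this sketch. It assumes every record starts with a line beginning "(CLASS "; describe_layout is a made-up helper name.

```shell
#!/bin/sh
# Sketch only: report how many records and blank lines a file has, to
# help choose between the "\n\n" and ")\n)\n" record separators.
# Assumption: every record starts with a line beginning "(CLASS ".
# describe_layout is a hypothetical helper name.
describe_layout() {
    # grep -c prints 0 (and exits non-zero) when nothing matches
    classes=$(grep -c '^[[:space:]]*(CLASS ' "$1")
    blanks=$(grep -c '^[[:space:]]*$' "$1")
    echo "classes=$classes blank_lines=$blanks"
}

# Usage: describe_layout class.txt
```

If blank_lines is 0, the records are not separated by blank lines and the ")\n)\n" separator is the one to try. If it is close to the number of classes, the blank-line separator is the likelier fit.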

