Perl help: What does a ^_ represent

Name: jb60606
Date: May 21, 2009 at 20:03:01 Pacific
OS: Mac OS X 10.5.6
CPU/Ram: Octo 2.8Ghz Xeon
Product: Apple / MAC PRO
Subcategory: General
Comment:

Using Perl, I'm trying to parse a very large (1GB) file containing
several different values per line. I'd like to put each line in an array so
that I can append certain text to specific elements.

This file was written by a shell/Perl script, which appears to have
inserted the characters "^_" between several fields/elements (at least,
that is what I see when I view the file in UNIX; in a GUI they're not
visible, and the two fields that would normally surround the character
appear joined together). Is there any way to 'split' each line into an
array using those characters or whatever they represent? I've tried
splitting by the characters themselves (or \s+), but it doesn't work.

The following is a single line in the file:

AACA^_wOption^_AA       wDeleteDate^_string^_2009/03/21 
wExpirationDate^_string^_2009/03/21     wStrikePrice^_price^_12.5       
wExerciseStyle^_string^_A       wPremiumCurrency^_string^_USD   
wStrikeCurrency^_string^_USD    wIssueSymbol^_string^_AACA      
wPutCall^_string^_C     wPosLimitNearTerm^_int^_0       
wSettleOnOpenInd^_string^_N     wPosLimit^_int^_25000000        
wUnderlyingSymbol^_string^_AA


The following is a small sampling of my code. This portion uses regex
to search for a specific symbol range on each line:

open (OCC_RAW, $rawOccFile) || die $!;
while (my $line=<OCC_RAW>){
	my @line = split /wOption/, $line;
	if ($line[0] =~ /^A\.[AWYXBNQ]$/ or $line[0] =~ /^A[A-O][A-Z]*\.[AWYXBNQ]/) {print BBO_1 "$line";}
	elsif ($line[0] =~ /^A[P-Z][A-Z]*\.[AWYXBNQ]/ or $line[0] =~ /^B\.[AWYXBNQ]$/ or $line[0] =~ /^B[A-M][A-Z]*\.[AWYXBNQ]/) {print BBO_2 "$line";}
	else {open (NaE, ">> $notFoundLog") || die print NaE "$line[0]\.NaEwOption$line[1]";}
}


What I'm trying to append is "NaE" to the symbols (AACA, in this example). As you can see, I'm splitting the line by "wOption" (and consequently adding wOption back later in the script, after appending NaE to the symbol) to get the symbol on its own. The problem is that the symbol appears once more, near the middle of the line.

I'm just trying to determine if it is possible to break each line into an array using whatever that is between the fields. Does anyone have any recommendations, or a smarter way of doing this? I'm sure that my chosen method is a reach, though it got the job done before I needed to address this issue.

Thanks in advance.



Response Number 1
Name: klint
Date: May 22, 2009 at 03:06:13 Pacific
Reply:

^_ is probably the Unit Separator ASCII control character, shown in Caret Notation.
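
To illustrate the caret notation: ^X stands for the control character whose code is that of X with bit 0x40 flipped, so ^_ is code 31 (octal 037, hex 1F). The following is only a minimal sketch, assuming plain ASCII; the variable names are illustrative and do not come from the original script.

```perl
use strict;
use warnings;

# Caret notation: ^X denotes the control character chr(ord('X') ^ 0x40).
my $us = chr(ord('_') ^ 0x40);

# Each of these spells the same single byte, the ASCII Unit Separator.
my @spellings = ("\x1F", "\037", "\c_", chr(31));

for my $s (@spellings) {
    print $s eq $us ? "same\n" : "different\n";
}
printf "code point: %d (octal %03o, hex %02X)\n", ord($us), ord($us), ord($us);
```

Any of the four spellings can be used in a string or regex, which avoids having to embed the raw control character in the source file.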



Response Number 2
Name: FishMonger
Date: May 22, 2009 at 03:50:37 Pacific
Reply:

I'd probably start by looking at the script that created the file to see how it constructed the lines.

Or I'd use od to dump a portion of the file.

head filename | od -c
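
As a rough Perl-side counterpart to the od dump, the sketch below makes control characters visible by rewriting them in caret notation. The helper name show_controls is hypothetical, and the sample line is assembled from the fields quoted in the question, assuming the wide gaps between groups are tabs.

```perl
use strict;
use warnings;

# Hypothetical helper: render control characters (other than newline) in
# caret notation so they become visible, e.g. chr(0x1F) is shown as ^_.
sub show_controls {
    my ($text) = @_;
    $text =~ s/([\x00-\x09\x0B-\x1F\x7F])/'^' . chr(ord($1) ^ 0x40)/ge;
    return $text;
}

# Sample built from the fields in the post; real input would come from the file.
my $sample = "AACA\x1FwOption\x1FAA\twDeleteDate\x1Fstring\x1F2009/03/21\n";
print show_controls($sample);
```

With this sample, the Unit Separator shows up as ^_ and the assumed tab as ^I.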



Response Number 3
Name: FishMonger
Date: May 22, 2009 at 03:59:41 Pacific
Reply:

On a side note, this line has a problem.

else {open (NaE, ">> $notFoundLog") || die print NaE "$line[0]\.NaEwOption$line[1]";}

If the open call fails, you try to print the error message to the filehandle that failed to open? And, die and print should not be used in the same statement. die sends its output to STDERR and print sends it to the currently selected (or specified) filehandle.
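
One way to separate the two concerns might look like the sketch below: die reports the failed open, and print writes to the handle only once it is known to be open. The temporary log file and the sample fields are assumptions made so the example runs on its own; they are not taken from the poster's actual script.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical log path and record fields, for illustration only.
my ($tmp_fh, $notFoundLog) = tempfile(UNLINK => 1);
my @line = ("AACA\x1F", "\x1FAA\n");

# Report an open failure with die (which writes to STDERR), then print to
# the handle separately, only after open has succeeded.
open(my $nae, '>>', $notFoundLog)
    or die "Cannot open '$notFoundLog' for appending: $!";
print {$nae} "$line[0].NaEwOption$line[1]";
close($nae) or die "Cannot close '$notFoundLog': $!";
```

The three-argument open with a lexical handle is a stylistic choice here; the key point is that the die message never depends on the handle that failed to open.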

For info on why you should not do this:
"$line"

read:
perldoc -q quoting
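
A small sketch of the kind of problem that FAQ entry describes: double quotes force stringification, which is harmless but redundant for a plain string like $line, and destructive for something like a reference. The variable names below are illustrative only.

```perl
use strict;
use warnings;

my @fields = ('AACA', 'wOption', 'AA');
my $ref    = \@fields;

# Quoting forces stringification: the copy is a plain string such as
# "ARRAY(0x...)", no longer a usable reference.
my $quoted   = "$ref";
my $unquoted = $ref;

print ref($quoted)   ? "quoted copy is a reference\n"   : "quoted copy is a plain string\n";
print ref($unquoted) ? "unquoted copy is a reference\n" : "unquoted copy is a plain string\n";
```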



Response Number 4
Name: jb60606
Date: May 22, 2009 at 18:03:14 Pacific
Reply:

Thanks Klint/Fishmonger

'od' indeed revealed that it is a Unit Separator (octal 037). Do you
guys know if it's possible to split the line up using this as a delimiter?

Thanks

FishMonger:

thanks for the note on the 'open' call. I just threw that in there
to simulate what was being done with the line, for this post.
The actual code is different.



Response Number 5
Name: jb60606
Date: May 30, 2009 at 13:46:33 Pacific
Reply:

If it helps anyone else, I found that the control character can
be typed as follows:

-hold down the <ctrl> key
-press the letter "v" (a caret '^' should appear)
-while continuing to hold down the <ctrl> key, press the letter or
symbol you wish to use; in this case, the underscore

In short: <ctrl> + v, then <ctrl> + <shift> + <_>

Didn't know it was that simple. I was chasing down articles
on Perl and unicode.
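
As an alternative to typing the literal control character, the same byte can be written as the escape \x1F in a split pattern. The sketch below is hedged accordingly: the sample line is assembled from fields in the question, the tab between groups is an assumption, and the ".NaE" suffix and the choice to tag every whole-field occurrence of the symbol are illustrative guesses at the intended result.

```perl
use strict;
use warnings;

# Sample line assembled from the post; the tab between groups is an assumption.
my $line = "AACA\x1FwOption\x1FAA\twIssueSymbol\x1Fstring\x1FAACA\n";
chomp $line;

# Split on the Unit Separator (and on tabs between groups) without having
# to type the literal control character into the source.
my @fields = split /[\x1F\t]/, $line;

# Append ".NaE" to the symbol wherever it occurs as a whole field.
my $symbol = $fields[0];
my @tagged = map { $_ eq $symbol ? "$_.NaE" : $_ } @fields;

print join('|', @tagged), "\n";
```

The '|' join is only for display. When writing records back out, the fields would need to be rejoined with whatever separators the downstream consumer expects.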

