Get string using multichar delimiters & Unix

October 1, 2010 at 20:33:58
Specs: Windows XP
Please help! This should be simple to do with GNU sed or GNU awk, but I'm totally stumped.

I need to extract delimited ascii strings that are only in some lines of a big data file.
The other lines (that don't have delimiters), I need to ignore.

The data I need is separated by multiple special-character delimiters.

The start delimiter string is 4 special characters shown here in quotes so you can read it:
"  " (it looks different in text editors, depending on which font is used).

By using several different hex editors I found that
the string is: STX NUL SOH NUL
or in octal: 002 000 001 000
or as control keys: ^B ^@ ^A ^@

End delimiter is just one special character shown here in quotes: ""
which is SOH or 001 or ^A
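
To make the two delimiters unambiguous, here is a small sketch that writes them as byte strings and prints them back in caret notation. It uses Python rather than sed or awk, purely for illustration; the names START, END and caret are made up for this example and are not part of any tool mentioned above.

```python
# Delimiters as described in the post, written as raw bytes.
START = b"\x02\x00\x01\x00"  # STX NUL SOH NUL (octal 002 000 001 000)
END = b"\x01"                # SOH (octal 001)


def caret(data: bytes) -> str:
    """Render control bytes (values below 32) in caret notation, e.g. 0x02 -> ^B."""
    return " ".join("^" + chr(b + 64) for b in data)


print(caret(START))  # ^B ^@ ^A ^@
print(caret(END))    # ^A
```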

The text between the strings that I need is just plain ascii.
An example data file with just 4 lines is:

this is line one that has no text to extract
slrnaielnseinasine  this is text I am trying to extract in line 2 ezmein349an((349sknf
this is line three that has no text to extract
ienax.mrienalie9a  this is text I am trying to extract in line 4 ei;slkdirndia;liensis;as

So from the example file above, I'm trying to pull out the delimited strings only:

this is text I am trying to extract in line 2
this is text I am trying to extract in line 4
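
For reference, one possible way to get that result is sketched below. It is written in Python instead of sed or awk, since a bytes regex treats the NUL bytes in the start delimiter like any other byte. It assumes the delimiters are exactly the bytes listed above and that each delimited string sits on a single line. The sample data is rebuilt by hand from the example lines (the real file will differ), and the names START, END, PATTERN, extract and sample are hypothetical.

```python
import re

START = b"\x02\x00\x01\x00"  # STX NUL SOH NUL
END = b"\x01"                # SOH

# Non-greedy capture between the delimiters. "." does not match a newline,
# so a match can never run across two lines.
PATTERN = re.compile(re.escape(START) + rb"(.*?)" + re.escape(END))


def extract(data: bytes) -> list:
    """Return every delimited string in data, decoded as ASCII text."""
    return [m.decode("ascii") for m in PATTERN.findall(data)]


# Sample data rebuilt from the example lines, with the delimiters inserted
# explicitly (an assumption about where they sit in the real file).
sample = b"\n".join([
    b"this is line one that has no text to extract",
    b"slrnaielnseinasine" + START
    + b"this is text I am trying to extract in line 2"
    + END + b"ezmein349an((349sknf",
    b"this is line three that has no text to extract",
    b"ienax.mrienalie9a" + START
    + b"this is text I am trying to extract in line 4"
    + END + b"ei;slkdirndia;liensis;as",
])

for text in extract(sample):
    print(text)
```

For a real file, the bytes would be read in binary mode, for example open("data.bin", "rb").read(), where data.bin is a placeholder name. Binary mode matters on Windows, because text mode translates line endings before the pattern sees the data.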

Any insights would be greatly appreciated!!!



#1
October 1, 2010 at 20:41:33
Here is a better sample data file that doesn't have wrap around:

this is line one that has no text to extract
na si ne  this is text I am trying to extract in line 2 ezmei n34 9an(34
this is line three that has no text to extract
iea xr e9a  this is text I am trying to extract in line 4 ei;slkdi r ndia;lie

