Interesting Perl problem to extract file cont.

Hewlett-packard / Kj315aa-acj a6340in
December 8, 2010 at 05:35:22
Specs: Windows Vista, 2.2 GHz / 1014 MB
Hi everybody. Here's an interesting problem to solve. I have a text file like this:
>first
TTCCCAAAAAAGACCTACTAAGTCAAGCGGATGCGTTTTGTGTCTTATGG
AAAGTCCCTGACGGATACGAGGCTTTGGGTGATTCGGTACGAATGATTCG
GTTACCAGAACTTACCGAAGAAGAAATGGGACGAACCGAGGTTTCTCGTT
CGTGTGCTAATCCTACATTCAAACATCGATTTCGATCAGAGTTTGTTTTT
CATGAAGAACAGACATTCGTATTACGTGTTTACGATGAAGATTTGAGGTA
>firsta
TTCCCAAAAAAGACCTACTAAGTCAAGCGGATGCGTTTTGTGTCTTATGG
AAAGTCCCTGACGGATACGAGGCTTTGG----------------------
-----------------AAGAAGAAATGGGACGAACCGAGGTTTCTCGTT
CGTGTGCTAATCCTACATTCAAACATCGATTTCGATCAGAGTTT------
CATGAAGAACAGACATTCGTATTACGTGTTTACGATGAAGATTTGAGGTA

Both >first and >firsta contain the same characters except for the part with hyphens. Is it possible to write a Perl script that extracts, for each line after >firsta, the text that comes before the start of the hyphens? Also, would it be possible to extract the unmatched text from >first?
Please note that both >first and >firsta are in the same text file, and other similar text files that I am using might contain more entries like these. (I am using Windows Vista.)
Thanks a lot in advance.
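
One way to approach the first part of the question, shown only as a minimal sketch: read the file into records keyed by the name after ">", then take the part of each >firsta line that precedes its first hyphen. The helper names read_records and prefix_before_gap are hypothetical, and the inline sample stands in for the real input file; the sketch also assumes that every sequence line follows a ">" header line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse FASTA-like text into a hash: header name => array ref of lines.
# (read_records is a hypothetical helper name, not from the thread.)
sub read_records {
    my ($fh) = @_;
    my (%records, $name);
    while (my $line = <$fh>) {
        $line =~ s/\r?\n\z//;          # strip Unix or Windows line endings
        next if $line =~ /^\s*$/;      # skip blank lines
        if ($line =~ /^>(\S+)/) {      # header line such as ">firsta"
            $name = $1;
            $records{$name} = [];
        }
        elsif (defined $name) {
            push @{ $records{$name} }, $line;
        }
    }
    return \%records;
}

# Return, for each line, the text that comes before its first hyphen.
# A line that starts with a hyphen yields an empty string.
sub prefix_before_gap {
    my ($lines) = @_;
    return map { my ($prefix) = /^([^-]*)/; $prefix } @$lines;
}

# Small inline sample so the sketch runs without an input file.
my $sample = <<'END';
>first
ACCGG
ATGTTG
GCCTAA
>firsta
ACCGG
A--TTG
--CTAA
END

open my $fh, '<', \$sample or die "Cannot open sample: $!";
my $records = read_records($fh);
close $fh;

my @prefixes = prefix_before_gap($records->{firsta});
print "$_\n" for @prefixes;
```

To run it on a real file, the in-memory handle could be replaced with open my $fh, '<', $path, where $path is whatever file name is being processed.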

#1
December 8, 2010 at 06:11:25
Not clear. Are first and firsta two text files? Or what?


#2
December 8, 2010 at 12:24:22
No, they are in the same text file.

#3
December 8, 2010 at 12:32:20
They are in the same text file. What I would like to see is a program that first prints the matched portion of >first and >firsta and then prints the unmatched portion of >first and >firsta.
I am already getting the matched part, but I do not know how to extract the unmatched part.
You'll see ---- in >firsta. These hyphens indicate that this portion of >firsta does not match >first.
For example:
>first
ACCGG
ATGTTG
GCCTAA
>firsta
ACCGG
A--TTG
--CTAA

Here, everything in >first and >firsta is the same except for the ---- part. So Perl should extract:
TG
GC
from >first
Thanks
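
A minimal sketch of the unmatched-part extraction, assuming the lines of >first and >firsta are aligned position by position (line 1 with line 1, column 1 with column 1): for each run of hyphens in a >firsta line, take the characters at the same positions in the corresponding >first line. The function name unmatched_segments is hypothetical, and the two arrays are typed in from the small example rather than read from a file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# For one pair of aligned lines, return the stretches of $full that sit
# at the positions where $gapped has a run of hyphens.
# (unmatched_segments is a hypothetical name, not from the thread.)
sub unmatched_segments {
    my ($full, $gapped) = @_;
    my @segments;
    while ($gapped =~ /(-+)/g) {
        my $start = $-[1];                   # offset where the hyphen run begins
        my $len   = length $1;               # length of the hyphen run
        next if $start >= length($full);     # guard against a shorter line
        push @segments, substr($full, $start, $len);
    }
    return @segments;
}

# Lines copied from the small example; in a real script they would come
# from the parsed input file.
my @first  = qw(ACCGG ATGTTG GCCTAA);
my @firsta = qw(ACCGG A--TTG --CTAA);

my @unmatched;
for my $i (0 .. $#first) {
    last if $i > $#firsta;                   # stop if >firsta has fewer lines
    push @unmatched, unmatched_segments($first[$i], $firsta[$i]);
}
print "$_\n" for @unmatched;                 # prints TG, then GC
```

Because the comparison is done line by line, a hyphen run that continues from the end of one line onto the start of the next is reported as two pieces. If it should be reported as one, joining the lines of each record into a single string before comparing would be one option.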