Counting pattern matches with sed

Name: MikeO4723
Date: June 23, 2006 at 15:36:15 Pacific
OS: XP (cygwin)
CPU/Ram: 3gHz/1gB
Product: Dell/Inspiron5160
Comment:

Dear all,

I am trying to write a simple script that takes a list of identifiers as input, searches a large text file for each identifier, and replaces each occurrence it finds with ID_1, ID_2, ID_3, etc. The entries in the large text file look like this:
>probe:HG_U95Av2:1138_at:395:301; Interrogation_Position=2631; Antisense;
TGGCTCCTGCTGAGGTCCCCTTTCC
>probe:HG_U95Av2:1138_at:322:441; Interrogation_Position=2661; Antisense;
GGCTGTGAATTCCTGTACATATTTC
>probe:HG_U95Av2:1138_at:213:419; Interrogation_Position=2703; Antisense;
GCTTCAATTCCATTATGTTTTAATG

where "1138_at" is the identifier. I would like the output file to be identical to the input file, except that 1138_at would be replaced by 1138_at_1, 1138_at_2, 1138_at_3, etc. The code below finds and replaces all instances of 1138_at with 1138_at_1; that is, the counter does not increment. Does anyone have any suggestions? Thanks very much.

~ MikeO

CODE:

exec<"All_IDs.txt"
i="1"
while read line; do
echo $line
sed -e "s/:$line/&_$i:/" Large_file.txt >>Output.txt
i=$[$i+1]
done




Response Number 1
Name: nails
Date: June 25, 2006 at 14:36:43 Pacific
Reply:

There is no simple way of performing this substitution. Basically, I count the number of substitutions that must be made, then use sed inside a loop to perform one substitution at a time:

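#!/bin/ksh
# Zero-pad the counter to two digits (01, 02, ...)
typeset -Z2 x

# Work on a copy so the original data file is left untouched
cp data.file tmp.data.file
# Number of lines that contain the identifier
cnt=$(grep -c 1138_at tmp.data.file)

x=0
while [ $x -lt $cnt ]
do
x=$((x+1))

# Replace the first remaining 1138_at only. The result (1138_01, 1138_02, ...)
# no longer matches the pattern, so the next pass finds the next occurrence.
sed -e '1s/1138_at/1138_'"$x"'/;t' -e '1,/1138_at/s//1138_'"$x"'/' tmp.data.file > /tmp/mytmp.file
mv /tmp/mytmp.file tmp.data.file
done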

The final solution will be in tmp.data.file.
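
For comparison, here is a minimal single-pass sketch that uses awk instead of repeated sed runs. It is only an illustration under some assumptions: the identifier is always the third colon-separated field of a header line starting with ">" (as in the sample entries), the ID list holds one identifier per line, and the function name number_ids is made up for this example. It keeps the "_at" part, producing 1138_at_1, 1138_at_2, and so on.

```shell
#!/bin/sh
# number_ids IDS_FILE DATA_FILE   (hypothetical helper, not from the thread)
# Appends _1, _2, ... to successive occurrences of each listed identifier,
# reading the data file only once and writing the result to stdout.
number_ids() {
    awk '
        # First file: remember each identifier (one per line, blanks skipped).
        FILENAME == ARGV[1] { if ($0 != "") ids[$0] = 1; next }

        # Header lines: the identifier is assumed to be the third ":" field.
        /^>/ {
            n = split($0, f, ":")
            if (n >= 3 && (f[3] in ids)) {
                # Keep a separate running count for every identifier.
                f[3] = f[3] "_" (++count[f[3]])
                line = f[1]
                for (i = 2; i <= n; i++) line = line ":" f[i]
                print line
                next
            }
        }

        # Everything else (sequence lines, unlisted identifiers) passes through.
        { print }
    ' "$1" "$2"
}
```

With the file names from the original question, this could be run as: number_ids All_IDs.txt Large_file.txt > Output.txt. Because each identifier has its own counter, the whole ID list is handled in one pass rather than one sed run per substitution.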

