Hi,
On Unix, I have a large file containing text (not always separated by spaces). I would like to search for a word and then extract that word together with either the next n characters or everything up to a certain character (such as a / or a ]), without extracting the whole line. Any thoughts or help would be great!
Jama

Just to add to it, the specific word that I want to find may not always be in the same place on each line.

I think a script is required. This is my take. Let me know if you have any questions:
#!/bin/ksh
word=myword
while IFS= read -r line # -r and IFS= keep backslashes and leading spaces intact
do
if echo "$line"|grep "$word" > /dev/null
then # if word exists in line
str=""
len=${#line} # length of line
mypos=$(echo "$line"|awk ' { print index($0, "'"$word"'")} ')
while [[ $mypos -le $len ]]
do # stop at the end of the line
# grab the character
charpos=$(echo "$line"| cut -c$mypos)
# get the next char position
mypos=$((mypos+1))
if [[ $charpos = "/" || $charpos == "]" ]]
then # stop at / or ]
break
fi
# build the string
str="$str"${charpos}
done
echo "$str"
fi
done < myfile
If efficiency is an issue, you might consider grepping your original file first, placing the results in a temp file, and running the script against the temp file:
#!/bin/ksh
word=myword
grep "$word" myfile > tmpmyfile
while IFS= read -r line # -r and IFS= keep backslashes and leading spaces intact
do
str=""
len=${#line} # length of line
mypos=$(echo "$line"|awk ' { print index($0, "'"$word"'") } ')
while [[ $mypos -le $len ]]
do # stop at the end of the line
# grab the character
charpos=$(echo "$line"| cut -c$mypos)
# get the next char position
mypos=$((mypos+1))
if [[ $charpos = "/" || $charpos == "]" ]]
then # stop at / or ]
break
fi
# build the string
str="$str"${charpos}
done
echo "$str"
done < tmpmyfile
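
The character-by-character loop above starts a cut process for every character, which gets slow on a large file. Here is a minimal sketch of the same idea using only shell parameter expansion, assuming a POSIX-style shell (ksh, bash, dash). The function name extract_until and the sample lines are made up for illustration; they are not from the scripts above.

```shell
#!/bin/sh
# extract_until WORD LINE   (hypothetical helper name)
# Prints WORD plus the text that follows its first occurrence in LINE,
# stopping before the first "/" or "]" (or at the end of the line).
# Prints nothing if WORD does not occur in LINE.
extract_until() {
    word=$1
    line=$2
    case $line in
        *"$word"*) ;;            # word present: carry on
        *) return 0 ;;           # word absent: nothing to print
    esac
    rest=${line#*"$word"}        # text after the first occurrence of word
    tail=${rest%%[]/]*}          # drop everything from the first / or ] onward
    printf '%s\n' "$word$tail"
}

# Demo on made-up sample lines; replace the printf with "< myfile" on the
# loop to read a real file instead.
printf '%s\n' 'path=/usr/myword-extra/bin' 'no match here' '[mywordABC] tail' |
while IFS= read -r line; do
    extract_until "myword" "$line"
done
```

On the three sample lines this prints myword-extra and mywordABC. Unlike the loop version, it never stops inside the search word itself, so it behaves differently if the word contains a / or a ].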

If I am understanding this right, you can probably do this with a one-liner:
grep beginword infile | sed -e "s/.*beginword/beginword/g" -e "s/endword.*/endword/g"
This looks for any line containing your beginning word, cuts everything before it, then cuts everything after your ending word. With the following file, using one as my beginning word and three as my ending word:
nine one two three asdf asdf asdf
one four seven three asdf asdf asdf
four six nine asdf asdf asdf
seventeen one five thirteen six three asdf asdf asdf asdf
eleven twelve asdf asdf asdf asdf asdf a
one six six six six five three xxsd asdfa asdf asdf asdf
it produces:
one two three
one four seven three
one five thirteen six three
one six six six six five three
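
For anyone who wants to reproduce that result, here is a small sketch that rebuilds the sample file in a temp file and does the filtering and both substitutions in a single sed call, so the separate grep is not needed. The variable names are placeholders of mine; the data and the words one and three are from the example above.

```shell
#!/bin/sh
# Recreate the sample input from the post in a temporary file.
infile=$(mktemp)
cat > "$infile" <<'EOF'
nine one two three asdf asdf asdf
one four seven three asdf asdf asdf
four six nine asdf asdf asdf
seventeen one five thirteen six three asdf asdf asdf asdf
eleven twelve asdf asdf asdf asdf asdf a
one six six six six five three xxsd asdfa asdf asdf asdf
EOF

# /one/!d       drop lines that do not contain the beginning word
# s/.*one/one/  cut everything before the beginning word
# s/three.*/three/  cut everything after the ending word
result=$(sed -e '/one/!d' -e 's/.*one/one/' -e 's/three.*/three/' "$infile")

printf '%s\n' "$result"
rm -f "$infile"
```

Two things to keep in mind with this approach: the patterns are plain substrings, so one would also match inside a word like money, and because .* is greedy the output starts at the last occurrence of the beginning word on the line, not the first.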

Thank you both for your replies.
lankrypt0, I have not had a chance to try your one-liner out, but it looks good!

grep -E
There's actually a link to this option as 'egrep', because it's so common. Adding -o prints only the part of each line that matches, not the whole line:
egrep -o '(match|this|or|this)' file.txt
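
Applied to the original question, the same -o idea could look something like the sketch below. It assumes a grep that supports -o (GNU and BSD grep do, but it is not required by POSIX). The word myword, the count of 5 characters, and the sample lines are placeholders of mine.

```shell
#!/bin/sh
# Made-up sample data in a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
abc myword12345678901234 tail
xx[mywordZZ]yy
no match on this line
path/myword-extra/bin
EOF

# The word plus up to the next 5 characters (fewer if the line ends sooner).
fixed=$(grep -oE 'myword.{0,5}' "$sample")

# The word plus everything up to, but not including, the next / or ].
# Inside a bracket expression, a ] placed first is taken literally.
delim=$(grep -oE 'myword[^]/]*' "$sample")

printf 'next 5 characters:\n%s\n' "$fixed"
printf 'up to / or ]:\n%s\n' "$delim"
rm -f "$sample"
```

If the word occurs more than once on a line, grep -o prints each match on its own line.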
