Solved Awk selecting multiple lines ranging between specific patterns

December 23, 2011 at 16:51:41
Specs: Linux x86_64
I've got an HTML file like this:

garbage html code
garbage html code IDENTIFIER
juicy line 1
juicy line 2
juicy line 3
IDENTIFIER garbage html code
garbage html code

The lines I need contain only a-z, A-Z and "."; they sit at different line numbers every time, and there can be one or more of them.

I tried to select just the lines starting with alphabetic characters using "cat file.html | grep \<[[:alpha:]]\" but a lot of other garbage is output as well.

I also need to store those lines in separate variables. So far I've only been able to store all of that data in a single variable; I then have to work around it by separating the elements with tabs and treating them separately again with "awk '{print $x}' ", which is pretty ugly.

I have no idea what to do now, apart from desperately reading awk, sed, and "bash strings manipulation" guides.


December 23, 2011 at 23:15:31
✔ Best Answer
If I am interpreting what you want correctly, the following code grabs only the lines that end in an upper- or lowercase letter. Your test data has 4 such lines, so this script should create 4 variables, each named a followed by a counter:


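i=0
re='[a-zA-Z]+$'   # expanded unquoted below; bash 3.2+ matches a quoted pattern literally
while read line
do
   if [[ $line =~ $re ]]
   then
       # create a dynamic variable a0, a1, ...
       eval a${i}="\$line"
       i=$((i + 1))
   fi
done < file.html

# display variables
h=0
while ((h < i))
do
   eval echo \"\$a$h\"
   h=$((h + 1))
done

# end script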
Let me know if you have any questions.
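
If the lines you want are always the ones sitting between the two lines that contain IDENTIFIER, an awk selection on those markers may be a closer fit than a character-class test. Here is a minimal sketch, assuming the marker appears exactly as in your sample. The temp file and the array name juicy are made up for illustration, and a bash array is used in place of numbered variables:

```shell
#!/bin/bash
# Sample input copied from the question; the temp file stands in for file.html.
sample=$(mktemp)
cat > "$sample" <<'EOF'
garbage html code
garbage html code IDENTIFIER
juicy line 1
juicy line 2
juicy line 3
IDENTIFIER garbage html code
garbage html code
EOF

# Toggle a flag on every line containing IDENTIFIER and skip that line itself;
# while the flag is set, the bare pattern "inside" prints the current line.
juicy=()
while IFS= read -r line
do
    juicy+=("$line")
done < <(awk '/IDENTIFIER/ { inside = !inside; next } inside' "$sample")
rm -f "$sample"

# Each selected line is now its own array element.
echo "found ${#juicy[@]} lines"
for idx in "${!juicy[@]}"
do
    echo "juicy[$idx] = ${juicy[$idx]}"
done
```

The elements are then available as ${juicy[0]}, ${juicy[1]} and so on, with no eval needed.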


December 24, 2011 at 08:58:39
This is the second time you've saved my butt this week. I owe you a favour, mate.
I haven't yet checked whether it fits my case, but it looks like it should work fine.

Thanks again and merry Christmas, nails!


December 24, 2011 at 10:55:39

Thank you for the kind words. Merry Christmas to you.


December 25, 2011 at 12:46:00
Well, I checked the script and it seems I am doing something wrong.

When I run the script I get no output from it.
I read about the eval command but didn't understand its behaviour. What is meant by "evaluating" one or more expressions? Does it mean it returns 0 if they are correct?

Plus, I haven't found the meaning of "=~" as a logical operator. Is it "different than"? I tried changing it to != but then got every line of the dummy HTML file as output, so I don't think so.


December 25, 2011 at 15:22:44
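"=~" is the matching operator used to evaluate bash regular expressions. In this program, it matches every line that ends with one or more upper- or lowercase letters; anchoring the pattern at the start as well (^[a-zA-Z]+$) would require the whole line to consist of letters. Here is a discussion:

The matching operator was first introduced in bash 3, so you might check your bash version. Also note that since bash 3.2 a quoted pattern on the right-hand side of =~ is matched as a literal string, so the regular expression has to be left unquoted; a quoted pattern is a likely reason for getting no output.

The eval command joins all of its arguments into a single string, executes that string as a command, and returns that command's exit status. It's commonly used to execute dynamically created commands: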
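
As a small illustration of the operator, here is a sketch that tests a few strings against a pattern allowing only letters and dots. The function name matches and the sample strings are invented for the example; the pattern is kept in a variable so that it stays unquoted on the right of =~:

```shell
#!/bin/bash
# Keep the pattern in a variable and expand it unquoted on the right of =~.
re='^[a-zA-Z.]+$'

# Succeeds (exit status 0) when the whole argument is made of letters and dots.
matches() {
    [[ $1 =~ $re ]]
}

for candidate in "example.com" "juicy line 1" "<td>garbage</td>"
do
    if matches "$candidate"
    then
        echo "match:    $candidate"
    else
        echo "no match: $candidate"
    fi
done

# For comparison: a test with the pattern in quotes. On bash 3.2 and later the
# quoted text is compared literally, so this one is expected not to match.
if [[ "abc" =~ "[a-z]+" ]]
then
    echo "quoted pattern: match"
else
    echo "quoted pattern: no match"
fi
```

The operator only sets an exit status, which is why it is used inside if; it does not print or return the matching text by itself.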

This example executes a string determining how many lines are in /etc/passwd:

mycommand="wc -l < /etc/passwd"
eval $mycommand

Another common use is creating shell variables such as used in this program:

eval a${x}="\$myline"
echo $a5 # new variable a5
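
If eval feels too opaque, the same kind of numbered variables can be created without it. This is only a sketch, assuming a bash recent enough to support printf -v; the lines array is hypothetical sample data standing in for whatever you read from your file:

```shell
#!/bin/bash
# Hypothetical input lines standing in for the lines read from file.html.
lines=("first.example" "second.example" "third.example")

# printf -v assigns to the variable whose name is given as a string, so the
# value is never re-parsed as shell code the way it would be with eval.
i=0
for myline in "${lines[@]}"
do
    printf -v "a$i" '%s' "$myline"
    i=$((i + 1))
done

# ${!name} (indirect expansion) reads the variable whose name is stored in name.
h=0
while ((h < i))
do
    name="a$h"
    echo "$name = ${!name}"
    h=$((h + 1))
done
```

Because the line contents are never executed, this variant also stays safe if a line happens to contain quotes, backticks or dollar signs.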

Here is an explanation of the eval command:
