Struggling with this one...
I have an HTML file, e.g.:
<html>
link
link2
link3
</html>
and I wish to create an awk script that will take this HTML file and print a list of the links contained in it, as well as the frequency of each link (the number of times it occurred).
Example output:
www.google.com 2
http://computing.net 1
How would I go about starting this script, presumably making the '
Response Number 1 Reply:
In the main body, for each line that is not empty (NF, the number of fields, is greater than zero), the script accumulates in an array called "kount" the number of occurrences of the first word on that line.
At the END, it prints the contents of the array, and pipes the printed output into a sort command.
You will need to beef up the logic concerning which lines should be accumulated. As coded, it will create entries in the array for the html headers and footers also.
To print the links left justified in a minimum 44-character field, use %-44s instead of %44s.
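awk '
# Count the first word of every non-empty line
NF > 0 { kount[$1]++ }
END {
    # Print each entry right-justified in 44 columns, followed by its count
    for (i in kount)
        printf "%44s %5d\n", i, kount[i]
}
' my.html | sort # -n
Response Number 2 Reply: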
You are on the right track, but a typical HTML file doesn't really look like that. Links may appear anywhere on a line, not just at $1. You might want to show an example HTML file.
Response Number 3 Reply:
Does the regular expression need to be something like...
awk '/a href/{ sub (/.*a href = "/, ""); sub(/".*/,""); print }'
to allow for links appearing anywhere on a line?
Remember, if I have <\a href = "..."\> link <\/a>, I want to print the link itself, e.g. "...", and the number of times that particular link occurs. (N.B. I used '\' to show the anchor tag because this forum turns the tag into a hyperlink; I'm trying to mimic what awk will see, which is just the source code.)
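One possible way to combine the two ideas in this thread (pulling out the href value wherever it sits on the line, then counting it in an array) is sketched below. It is only a sketch under some assumptions: count_links is a made-up helper name, attribute values are assumed to be in double quotes, "href" is assumed to be lower case, and a tag is assumed not to be split across lines. It loops with match() so that several links on one line are all counted.

```shell
#!/bin/sh
# Sketch: count how often each href target appears in the given HTML file(s).
# count_links is a hypothetical helper name, not something from the thread.
count_links() {
    awk '
    {
        line = $0
        # Repeatedly find the next href = "..." on the line, wherever it is
        while (match(line, /href[ \t]*=[ \t]*"[^"]*"/)) {
            attr = substr(line, RSTART, RLENGTH)
            sub(/^href[ \t]*=[ \t]*"/, "", attr)   # drop everything up to the opening quote
            sub(/"$/, "", attr)                    # drop the closing quote
            kount[attr]++
            line = substr(line, RSTART + RLENGTH)  # carry on after this match
        }
    }
    END {
        # Left-justified link in a 44-column field, then the count
        for (link in kount)
            printf "%-44s %5d\n", link, kount[link]
    }
    ' "$@" | sort
}
```

It would be called as count_links my.html. The plain sort orders the output by link; to order by frequency instead, something like sort -k2,2nr should work as long as the links themselves contain no spaces.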