OK... I am not sure I can detail this more than I have, but here goes.

We must take the following file from the web, located here:
http://www.lcra.org/water/temperature.html
and save it as a .txt file. That much I can do!
Next, we must locate specific stations on the list by looking down the leftmost column. For example, we need the current temperature at:
Lampasas 10 WSW
Mason 3 NNE
Fredericksburg 10 NNE
Marble Falls 6 ENE

You will note that the file gives you the station name, then date, then time, then current temperature. Good so far?
Now... the numeric temperature value for each of the above must be SUBSTITUTED into another file, as follows:
"xxx" is the placeholder for Lampasas 10 WSW, so we need to replace "xxx" with the corresponding temperature.
"yyy" gets replaced by the temperature at Mason 3 NNE.
"zzz" gets replaced by the temperature at Fredericksburg 10 NNE.
"qqq" gets replaced by the temperature at Marble Falls 6 ENE.

The strings xxx, yyy, zzz, qqq are all located within a rather large and voluminous file. Needless to say, if we OPEN that file within our company's program, it currently lists the temperature at those various locations as: xxx, yyy, zzz, qqq.

By substituting the correct values, we will be able to correctly list the real temperature (rather than xxx, yyy, zzz, qqq). Then we would save the file that has just had its strings replaced by the numeric values... and open it in the program!
In a nutshell... the numbers (temps) are on the web.
We save that file hourly, as a text file.
We need to "look up" the needed values and have those values replace the placeholder strings (xxx, yyy, etc.) in a different file.
Finally, we save that new file (now with numbers instead of xxx, yyy, zzz, qqq...) and voila! We have it!!

I thank you all for all your help, and appreciate your time. I hope this makes sense...
sean a

The page that you refer to is blank tonight. I saw data there the other day, but it's gone now. Without that file, I cannot make this completely correct. For instance, it appears that the location names vary in their number of words, so word 6 on the "Lampasas" line will not hold the same type of data as word 6 on the "Marble Falls" line. So, some sort of data pull based on column position may be in order. Without seeing it, I can't be sure. And, without at least a sample of your output file (is it a trade secret?), the script will be, at the very least, inefficient, because the substitution will run over the entire file rather than over a subset of lines. Also, in one place you say that you want to substitute the values (temps for strings) in the "big" file, yet in another you say that you want to make the changes and save them in a new file. Just to get you something workable, I will assume that you mean the former. Also, do the xxx, yyy, ... patterns ever occur more than once in the big file? Or on the same lines? If so, and you are substituting for all lines multiple times, you've got a bigger problem.
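To illustrate the word-count problem, here is a rough sketch that strips the station name off the front of the line first, so the remaining fields sit at the same positions for every station. The sample layout (name, date, time, temperature, separated by spaces), the sample values, and the station_temp helper are all assumptions for illustration; the real page may be laid out differently.

```shell
#!/bin/sh
# Sketch only: the sample layout (name, date, time, temp) is assumed,
# not taken from the real LCRA page.

# station_temp FILE "STATION NAME"
# Prints the third field that follows the station name on the first
# line that starts with that name.
station_temp() {
    awk -v name="$2" '
        index($0, name) == 1 {
            rest = substr($0, length(name) + 1)   # text after the station name
            split(rest, f, " ")                   # f[1]=date f[2]=time f[3]=temp
            print f[3]
            exit
        }' "$1"
}

# Hypothetical sample data for a quick check.
sample="${TMPDIR:-/tmp}/webtemps.sample.$$"
cat > "$sample" <<'EOF'
Lampasas 10 WSW 11/09/2001 06:00 58
Marble Falls 6 ENE 11/09/2001 06:00 61
EOF

xxxtemp=`station_temp "$sample" "Lampasas 10 WSW"`
qqqtemp=`station_temp "$sample" "Marble Falls 6 ENE"`
echo "xxx=$xxxtemp qqq=$qqqtemp"
```

Because the name is removed before the fields are counted, a three-word station and a four-word station are handled by the same field number.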
The rough basics on this...
- Get the value from the input file (I'm calling it webtemps.txt):
xxxtemp=`grep -i Lampasas webtemps.txt | awk '{print $4}'`
qqqtemp=`grep -i "Marble Falls" webtemps.txt | awk '{print $5}'`
etc...
- Make the replacement, but dump the output to standard out:
sed -n "1,\$s/xxx/"$xxxtemp"/p" bigfile.txt
etc...
This sed command will need to be reworked so it will display all of the other non-matching lines.
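One possible rework, as a sketch: leaving out -n and the p flag makes sed print every line, whether or not it matched, and several -e expressions can be applied in a single pass. The file names, the sample contents and the temperatures below are made-up stand-ins, not the real big file.

```shell
#!/bin/sh
# Sketch: bigfile.sample and the temperatures are made-up stand-ins.
work="${TMPDIR:-/tmp}/tempsub.$$"
mkdir -p "$work"

cat > "$work/bigfile.sample" <<'EOF'
header line with no placeholders
Lampasas: xxx
Mason: yyy
trailer line
EOF

xxxtemp=58
yyytemp=60

# Without -n and the p flag, sed prints every line, changed or not.
# The g flag replaces every occurrence on a line, not just the first.
sed -e "s/xxx/$xxxtemp/g" \
    -e "s/yyy/$yyytemp/g" \
    "$work/bigfile.sample" > "$work/bigfile.new"

cat "$work/bigfile.new"
```

Writing to a second file rather than over the original keeps the placeholders intact for the next hourly run.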
I hope this provides a good start. :-)
Martin

Here is my solution, with caveats at the end. This code works on HP-UX in the Bourne or Korn shell. I tested with an actual download of the web file, and a small tokenized file to represent your big file with tokens.
The files are named sean.* but this can be changed to something appropriate:
sean.sh      the script
sean.base    your big file with tokens
sean.curr    the big file with tokens replaced
sean.htm     the downloaded weather file
sean.tokens  your defined token associations
sean.hist    one line for each hourly run

sean.tokens example:
xxx Lampasas 10 WSW
yyy Mason 3 NNE
zzz Fredericksburg 10 NNE
qqq Marble Falls 6 ENE

My solution requires the downloaded weather file to be saved not as a text file but rather as an htm file. As a text file, the data is all jammed together and it is hard to isolate the separate columns. If I have 8 digits for the four temperature columns, such as 60605255, then I could assume four 2-digit numbers. But with 1-digit, 2-digit and 3-digit numbers jammed together, forget it. As an htm file, the data columns are isolated.
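As a small illustration of why the htm form is easier to work with, the sketch below pulls the text out of a table cell by taking whatever sits between the first ">" and the next "<" on a line. It assumes each cell is on its own line; the td markup, the sample values and the cell_text helper are hypothetical, since the real page source is not shown here.

```shell
#!/bin/sh
# cell_text: print the text between the first ">" and the next "<"
# on each input line. Assumes one table cell per line.
cell_text() {
    awk '{
        cut1 = substr($0, index($0, ">") + 1)        # drop the opening tag
        print substr(cut1, 1, index(cut1, "<") - 1)  # stop at the closing tag
    }'
}

# Hypothetical cells: station name, date-time, current temperature.
printf '%s\n' '<td>Lampasas 10 WSW</td>' \
              '<td>11/09 06:00</td>' \
              '<td>58</td>' | cell_text
```

Each value comes out on its own line, so 1-digit, 2-digit and 3-digit temperatures can no longer run together.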
NOTE: If I post a message that contains the "less than" sign, a lot of the data that follows it is lost. The script relies on four "less than" signs (two getline redirects from sean.tokens and two index() searches for the start of the closing tag), so check that all four are present if you copy it.
sean.sh:
#!/bin/sh
# Build a one-line sed command (sean.sed) from the tokens and the weather page.
awk 'BEGIN {
    # Load sean.tokens: loctbl["Lampasas 10 WSW"] = "xxx", and so on.
    rc = getline < "sean.tokens"
    while (rc == 1) {
        loc = substr($0, index($0, $2))
        loctbl[loc] = $1
        rc = getline < "sean.tokens"
    }
    printf "sed"
}
{
    # Cell text is whatever sits between the first ">" and the next "<".
    cut1 = substr($0, index($0, ">") + 1)
    data = substr(cut1, 1, index(cut1, "<") - 1)
    if (data in loctbl) {
        getline    # date-time
        getline    # current temp
        cut1 = substr($0, index($0, ">") + 1)
        ctmp = substr(cut1, 1, index(cut1, "<") - 1)
        printf " -e \042s/%s/%s/g\042", loctbl[data], ctmp
        sed++
    }
}
END {
    printf " sean.base > sean.curr\n"
    # sed + 0 writes 0 rather than an empty line when nothing matched.
    print sed + 0 > "sean.sedcnt"
}' sean.htm > sean.sed

tokcnt=`cat sean.tokens|wc -l`
sedcnt=`cat sean.sedcnt`

if test "$sedcnt" -gt 0 ; then
    echo 'Build of sean.curr starting ...'
    rm -f sean.curr > /dev/null
    /bin/sh sean.sed
    if test -s sean.curr ; then
        echo 'Build of sean.curr finished'
        stat=
    else
        echo 'Build of sean.curr unsuccessful !!!'
        stat=FAILED
    fi
fi

echo "`date` tok=$tokcnt sed=$sedcnt \
$stat \c" >> sean.hist
cat sean.sed >> sean.hist
# =========== end of script ===============

awk begins by storing the tokenized locations (sean.tokens) in an array. It then processes the downloaded htm file. For each location of interest that it finds, it gets the associated current temperature, then outputs a sed argument such as:
-e "s/xxx/69/g"
When awk finishes, if at least one sed argument was output, the script processes the constructed sed command, which will input sean.base and create sean.curr.

The script then verifies that sean.curr is there and not an empty file. One line is appended to the history file, and includes the number of lines in the token file, the number of locations found in the htm file, and the constructed sed command that was used. Since the sed command has the token/temp associations, this history file is capturing hourly current temperatures for these locations, and a report could be generated. A history line looks like:
Fri Nov 9 06:12:50 tok=4 sed=4 sed -e ...

One concern is passing the huge binary file through sed. I have seen it drop data, maybe null bytes? To test this:
sed -e "s/xxx/xxx/g" origfile > newfile
ls -l origfile newfile
rm newfile
The newfile should be the same number of bytes as origfile.

The script assumes that the 6 columns are:
location,date-time,curr,max,min,avg
As long as the downloaded file has location as col1 and current temperature as col3 then it will work. With a little extra coding, the script could confirm this against the column headers.

Unfortunately, the nice alignment and indentation in my script may go away when I post this.
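The column header check mentioned above could look something like the sketch below. It assumes the header cell texts have already been pulled out of the htm file, one per line in column order. The header labels shown, the patterns and the check_headers helper are hypothetical, since the actual header text on the page is not known here.

```shell
#!/bin/sh
# check_headers FILE COL1_PATTERN COL3_PATTERN
# FILE holds one header cell text per line, in column order.
# Succeeds only if cell 1 and cell 3 match the given patterns
# (compared case-insensitively).
check_headers() {
    awk -v p1="$2" -v p3="$3" '
        NR == 1 && tolower($0) !~ tolower(p1) { bad = 1 }
        NR == 3 && tolower($0) !~ tolower(p3) { bad = 1 }
        END { if (NR < 3 || bad) exit 1 }
    ' "$1"
}

# Hypothetical header labels for a quick check.
hdr="${TMPDIR:-/tmp}/headers.$$"
printf '%s\n' 'Location' 'Date-Time' 'Current' 'Max' 'Min' 'Avg' > "$hdr"

if check_headers "$hdr" 'location' 'curr' ; then
    echo 'column layout looks as expected'
else
    echo 'column layout changed, skipping substitution'
fi
```

Running a check like this before the substitution would keep a rearranged page from silently putting the wrong column into sean.curr.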
James
