
Batch Find and Extract Regex String


Name: Defects
Date: September 27, 2008 at 22:20:38 Pacific
OS: XP
CPU/Ram: NA
Product: NA
Comment:

I'm trying to create a batch file to find and extract text to a new file.

The string is "<.b>Album:*<.br>" (where * can be any number of any characters)

The objective is to take an RSS feed of recent album releases as input and extract the albums to a new text file. From there, I want to use an SQL query in Greasemonkey to compare those albums against the ones the user already has, and keep only the albums the user doesn't have.

However, I'm having real issues with findstr... Any help would be much appreciated.

EX:

From:
[CDATA[<.b>Artist:<./b> Artist Name<.br><.b>Album:<./b> Album Name<.br><.b>Release Date:<./b> Date<.br>[CDATA[<.b>Artist:<./b> Artist Name<.br><.b>Album:<./b> Album Name<.br><.b>Release Date:<./b> Date<.br>

to:
<.b>Album:<./b> Album Name<.br><.b>Album:<./b> Album Name<.br>

(There are no periods in the html tags, I just used them to stop the forums from recognizing them as such :D)




Response Number 1
Name: Judago
Date: September 28, 2008 at 02:58:48 Pacific
Reply:

OK, I've given it a go and this is what I came up with:


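@ECHO OFF
SETLOCAL ENABLEDELAYEDEXPANSION
REM For each line that contains an album entry, swap the angle brackets for braces,
REM mark every album tag with ## and hand the line to STRIP1.
FOR /F "DELIMS=" %%G IN ('FINDSTR /I /R "<B.>ALBUM:.*</.B>.*<.BR>" YOURRSS.HTML') DO (
SET TEMP="%%G"
SET TEMP=!TEMP:^>=}!
SET TEMP=!TEMP:^<={!
SET TEMP=!TEMP:{B}ALBUM:=##!
SET TEMP=!TEMP:~1,-1!
CALL :STRIP1
)
GOTO OTHERSTUFF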


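:STRIP1
REM Drop leading characters until TEMP starts with the ## marker.
IF NOT DEFINED TEMP GOTO :EOF
IF NOT "!TEMP:~0,2!"=="##" SET TEMP=!TEMP:~1!&&GOTO STRIP1
SET SVR=!TEMP!
SET CNT=0
:STRIP2
REM Count characters up to the next BR tag, which ends the album entry.
IF NOT DEFINED SVR GOTO :EOF
SET /A CNT+=1
IF /I NOT "!SVR:~0,3!"=="{BR" SET SVR=!SVR:~1!&GOTO STRIP2
REM Cut the entry out of TEMP, restore the tags and append it to the output file.
SET SVR=!TEMP:~0,%CNT%!
SET SVR="!SVR:##=<.B>Album:!"
SET SVR=!SVR:{/B}=^</B^>!
SET SVR=!SVR:{=^<BR^>!
ECHO !SVR:~1,-1!>>NEWTEXTFILE.TXT
REM Continue with whatever follows the entry just written.
SET TEMP=!TEMP:~%CNT%!
GOTO STRIP1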

:OTHERSTUFF

There is one thing to watch out for with the script above: it won't tolerate any changes (except for case) to "<.b>Album:". It will still work with whatever comes after the colon.

I used dots like you did above. There are four of them in the script and five in the whole post.
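
For comparison, here is a minimal sketch of the same extraction in Python, assuming the real tags (without the dots) and that every album entry runs from "<b>Album:" to the next "<br>". The names ALBUM_RE, extract_albums and SAMPLE are made up for this illustration and are not part of the batch script above.

```python
import re

# Assumption: an album entry starts at "<b>Album:" and ends at the next "<br>".
# The non-greedy ".*?" stops at the first "<br>", so several entries on one
# line are matched separately. Case is ignored, like FINDSTR /I.
ALBUM_RE = re.compile(r"<b>Album:.*?<br>", re.IGNORECASE | re.DOTALL)

# Hypothetical sample, modelled on the example in the question.
SAMPLE = (
    "[CDATA[<b>Artist:</b> Artist Name<br><b>Album:</b> Album Name<br>"
    "<b>Release Date:</b> Date<br>"
    "[CDATA[<b>Artist:</b> Artist Name<br><b>Album:</b> Album Name<br>"
    "<b>Release Date:</b> Date<br>"
)


def extract_albums(text):
    """Return every '<b>Album:...<br>' fragment found in text, in order."""
    return ALBUM_RE.findall(text)


if __name__ == "__main__":
    # Print one extracted fragment per line.
    for fragment in extract_albums(SAMPLE):
        print(fragment)
```

Like the batch version, this only looks for that one exact tag, so it breaks as soon as the markup around "Album:" changes.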



Response Number 2
Name: FishMonger
Date: September 28, 2008 at 08:54:13 Pacific
Reply:

You cannot properly parse HTML or XML (RSS is an XML application) with a batch script. You need a language with parsing libraries that understand HTML and XML, such as Perl, Python, or VB.
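
As a rough illustration of that approach, here is a sketch using Python's standard xml.etree.ElementTree module. It assumes the feed is RSS 2.0 with the artist/album markup inside a CDATA section of each item's description element; that layout, the SAMPLE_FEED text, and the names ALBUM_RE and album_names are assumptions for the example, not details taken from the actual feed.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: inside each description the album is written as
# "<b>Album:</b> Some Name<br>". The group captures only the name.
ALBUM_RE = re.compile(r"<b>Album:</b>\s*(.*?)\s*<br>", re.IGNORECASE | re.DOTALL)

# Hypothetical feed used only to exercise the function.
SAMPLE_FEED = """<rss version="2.0"><channel><title>Releases</title>
<item><description><![CDATA[<b>Artist:</b> Artist One<br><b>Album:</b> First Album<br><b>Release Date:</b> Date<br>]]></description></item>
<item><description><![CDATA[<b>Artist:</b> Artist Two<br><b>Album:</b> Second Album<br><b>Release Date:</b> Date<br>]]></description></item>
</channel></rss>"""


def album_names(rss_xml):
    """Parse the feed as XML and return the album names found in descriptions."""
    root = ET.fromstring(rss_xml)
    names = []
    # The XML parser unwraps CDATA, so desc.text is the raw HTML snippet.
    for desc in root.iter("description"):
        names.extend(ALBUM_RE.findall(desc.text or ""))
    return names


if __name__ == "__main__":
    # Print one album name per line.
    for name in album_names(SAMPLE_FEED):
        print(name)
```

The XML library handles the feed structure and the CDATA wrapper; the regular expression is only applied to the small HTML snippet inside each description. If that snippet were more complicated, an HTML parser would be the safer choice there as well.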

