Batch search multiple lines in text file.

April 24, 2009 at 09:53:15
Specs: Windows XP
I've seen a few .bat files with the capability of searching for a single line of text. Many of them give me a headache to look at. I'm still somewhat of a newbie ^^'

I do have a problem, though, that I need to use a .bat file for, and that's not searching for a single string on a single line, but searching for a specific set of strings across a series of lines.

Basically, I have a text file with a lot of stuff in it (a .dxf file, for those of you familiar with AutoCAD) and I need to see if this bit of text:

{ACAD_REACTORS
330
C
102
}
330
C
100
AcDbDictionary
281
1
3

or any similar series of lines appears exactly -once- (case-sensitive) in the file. Each string individually appears multiple times, so I need to search for the entire set.

For those of you familiar with AutoCAD, I'm trying to create a .bat file to pull the list of layout tabs from the .dxf. I know I can do it once I find a pattern that I can use to locate the layout tab names in the .dxf; I just don't know how to find that pattern. I'm guessing it has to do with that series of lines.

Anything you good people can do to help would be most helpful, and please, if it wouldn't be too much trouble, if you find a solution, can you explain it? I am actively trying to learn this. ^^




#1
April 24, 2009 at 11:28:13
For those who are not familiar with AutoCAD, like myself: I don't understand your requirement. Show sample input and describe your output. Also show where in your sample the pattern to find starts and where it ends.

Unix Win32 tools | Gawk for Windows



#2
April 24, 2009 at 12:17:30
Sorry, I'll try to be clearer this time around.

The input is a text file with many lines (thousands, in some cases, if not tens of thousands) each describing some element of a complex file, which can be represented as vectors and objects. In terms of raw input, it can be assumed that each file in itself is random, but shares common elements with similar files.

Here is a sample of the file:

...
100
AcDbDictionary
281
1
0
DICTIONARY
5
1A
102
{ACAD_REACTORS
330
C
102
}
330
C
100
AcDbDictionary
281
1
3
Layout1
350
59
3
Layout2
350
5E
3
Model
350
22
0
DICTIONARY
5
72
102
{ACAD_REACTORS
330
C
102
}
330
C
100
AcDbDictionary
281
...

Note that the ellipses are there to indicate that it continues in this fashion.

The lines I'm trying to extract, in this example, are "Layout1" , "Layout2" and "Model". "Model" is optional, and "Layout1" and "Layout2" are just examples; they can be any string whatsoever. Furthermore, they all appear twice within the file.

What I'm trying to do is find a pattern between these files, since they all do share some commonalities, and using those commonalities as a sort of datum, extract the lines I need. I believe one method for doing this is as I posted: find the series of lines that I originally posted, and if it only appears once within each file, then I know that it appears just before the set of strings I want to extract.

In a nutshell..

Suppose you had a file. Within this file there exists a set of lines, which appears only once within this file and all similar files, but never in the same location, though it does contain the exact same data. If you know what the set of lines contains, how would you determine the line number of the first line in the set?

I believe I can accomplish actually extracting the lines once I can locate where they are.

Thank you for your time and patience, and I apologize again for being unclear. ^^
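
One way the "find the line number of the first line in the set" question could be approached in plain batch is a small state machine that walks the file once and tracks how many marker lines have matched in a row. This is only a sketch under assumptions: the path C:\dwg.dxf and all variable names (pat1..pat12, want, hits, first, start) are made up for illustration, lines are assumed to carry no leading or trailing whitespace (as in the sample above), and the simple restart on a mismatch is only safe because "{ACAD_REACTORS" occurs nowhere else inside the marker block.

```bat
@echo off
setlocal EnableDelayedExpansion

rem Hypothetical input path; change to suit.
set "file=C:\dwg.dxf"

rem The marker block, one variable per line, in order.
set "pat1={ACAD_REACTORS"
set "pat2=330"
set "pat3=C"
set "pat4=102"
set "pat5=}"
set "pat6=330"
set "pat7=C"
set "pat8=100"
set "pat9=AcDbDictionary"
set "pat10=281"
set "pat11=1"
set "pat12=3"
set "count=12"

rem want  = index of the marker line expected next
rem hits  = number of complete matches seen so far
rem first = line number where the most recent complete match began
set /a want=1, hits=0, first=0

rem findstr /n "^" prefixes every line with "N:", so blank lines are not skipped by for /f.
for /f "tokens=1* delims=:" %%n in ('findstr /n "^" "%file%"') do (
    set "line=%%o"
    rem Look up the marker line we are currently waiting for.
    for %%w in (!want!) do set "expect=!pat%%w!"
    if "!line!"=="!expect!" (
        if !want! equ 1 set "start=%%n"
        if !want! equ %count% (
            set /a hits+=1
            set "first=!start!"
            set "want=1"
        ) else (
            set /a want+=1
        )
    ) else (
        rem Mismatch: start over, but this line may itself open a new match.
        set "want=1"
        if "!line!"=="%pat1%" (
            set "start=%%n"
            set "want=2"
        )
    )
)

echo Complete matches: %hits%
if %hits% gtr 0 echo Last match starts at line %first%
```

If the count comes back as 1, the block is unique and the reported line number can serve as the datum. The string comparison in IF is case-sensitive by default. Known limits of this sketch: lines containing "!" or double quotes will be mangled by delayed expansion, and a file with tens of thousands of lines will be slow to walk this way.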



#3
April 24, 2009 at 23:29:00
"Many of them give me a headache to look at."

I'm glad it's not just me.

In order to process a file with a bat it must be strictly text.

Try opening it in NOTEPAD. If it looks junked up, likely it's not plain text.


=====================================
If at first you don't succeed, you're about average.

M2



#4
April 25, 2009 at 10:00:59
Oh, I can open and manipulate .bat files.. I've made a few already.. but see, I'm used to HTML and LISP. Batch has a vastly different set of standards and annotations and syntax.. I mean, switches are completely new to me. And having to use "eol= tokens=1 delims=" in a FOR loop in order to get data from a single text file instead of multiple files.. @.@

Anyway, I've been toying around with it and here's my logic: search through and pull out some of the text I'm looking for, get the number of the line that it shows up in, compare the numbers to see if there's an increment. I've gotten up to the first check, but the output confuses me. Here's my code:

@echo off
REM Clear existing files
if exist C:\fr1.txt del C:\fr1.txt
if exist C:\fr2.txt del C:\fr2.txt
if exist C:\fr3.txt del C:\fr3.txt
if exist C:\fr4.txt del C:\fr4.txt
if exist C:\fr1a.txt del C:\fr1a.txt
if exist C:\fr2a.txt del C:\fr2a.txt
if exist C:\fr3a.txt del C:\fr3a.txt
if exist C:\fr4a.txt del C:\fr4a.txt
if exist C:\newfile.txt del C:\newfile.txt
set success=0

REM Export selected data to files
findstr /I /N /X "{ACAD_REACTORS" C:\dwg.dxf > C:\fr1.txt
findstr /I /N /X "330" C:\dwg.dxf > C:\fr2.txt
findstr /I /N /X "C" C:\dwg.dxf > C:\fr3.txt
findstr /I /N /X "100" C:\dwg.dxf > C:\fr4.txt

REM Obtain the first number from the first file. May need a loop.
for /f "eol= tokens=1 delims=:" %%i in (C:\fr1.txt) do @echo ^%%i>>C:\fr1a.txt
for /f "eol= tokens=1 delims=:" %%i in (C:\fr2.txt) do @echo ^%%i>>C:\fr2a.txt
for /f "eol= tokens=1 delims=:" %%i in (C:\fr3.txt) do @echo ^%%i>>C:\fr3a.txt
for /f "eol= tokens=1 delims=:" %%i in (C:\fr4.txt) do @echo ^%%i>>C:\fr4a.txt

REM Grab the first string in the first file
for /f "eol= tokens=1 delims=" %%a in (c:\fr1a.txt) do (
set /A you=%%a+1
for /f "eol= tokens=1 delims=" %%b in (c:\fr2a.txt) do (
set /A lost=%%b
if "%you%"=="%lost%" set you >> c:\newfile.txt
)
)

REM if "%lost%"=="%the%" (
REM set success=1
REM set the >> C:\newfile.txt

set success
pause

I manually compared "newfile.txt" with "fr1a.txt" and "fr2a.txt" and determined that the string "330" immediately follows the string "{ACAD_REACTORS" 90 times in the file "dwg.dxf". The problem is, "newfile.txt" should have only 90 entries of "you", one for each situation in which "you" and "lost" are equal, and if I were to open "newfile.txt" I should see simply a list of everywhere that this pairing occurs. Instead, it has 90 sets of "you", one for each loop through "lost". Basically, if there were 100 entries in "fr2a.txt", there would be 100 entries for a single instance of a pairing, then 100 entries for the next instance, 100 entries for the next instance, and so on. I think there's something I lost in my loop, or maybe my check isn't correct. Can anyone help?
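
A likely explanation for the repeated entries, offered as a sketch and not a confirmed diagnosis: cmd expands %you% and %lost% once, when it parses the whole parenthesized FOR block, not on each pass through the loop. On a fresh run both are still empty at that moment, so the IF compares "" with "" and succeeds on every inner iteration. Enabling delayed expansion and reading the variable as !you! makes the comparison use the current value. The version below assumes the same fr1a.txt and fr2a.txt files produced by the script above; it compares against %%b directly, so the "lost" variable is no longer needed.

```bat
@echo off
setlocal EnableDelayedExpansion

if exist C:\newfile.txt del C:\newfile.txt

rem Outer loop: line numbers where "{ACAD_REACTORS" was found.
for /f "delims=" %%a in (C:\fr1a.txt) do (
    rem The line we hope holds "330" is the one right after it.
    set /a you=%%a+1
    rem Inner loop: line numbers where "330" was found.
    for /f "delims=" %%b in (C:\fr2a.txt) do (
        rem !you! is read at run time, on every pass; %you% would be frozen at parse time.
        if !you! equ %%b >>C:\newfile.txt echo !you!
    )
)
```

With this change, newfile.txt should receive one line number per pairing instead of one entry per inner iteration. The redirection is written before ECHO so that a number sitting next to ">>" is never read as a file handle.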

