del blank lines, remove white space

Original Message
Name: edman747
Date: May 2, 2007 at 02:20:59 Pacific
Subject: del blank lines, remove white space
OS: xp home
CPU/Ram: 2.8ghz p4/1gb
Model/Manufacturer: hp/pavilion zv5160us
Comment:

Hello,
I often have ASCII text files that need to be edited. Rather than do the same task over and over, I want to make a batch file.

Here is an idea given to me by another poster. Say I have 100 or 1000 files. They could be txt, log, cfg, etc.

I need to delete the blank lines in all the files in the current directory tree and, as an option, remove extra whitespace. (What is extra whitespace? I will consider it to be: one or more tabs, two or more spaces between tokens, or trailing spaces after the last token on a line.)

The following will single space the files in the current tree. Option to remove extra whitespace.
:: single_space.bat
:: single space all txt files in the current tree...
:: Usage single_space [nowhitespace]
@echo off
SetLocal EnableDelayedExpansion
If exist *.new del /s *.new >nul
If not %%1==() Set Option=%1

:: backup original files. cp *txt *old
For /F %%E in ('dir/b/s sample*.txt') Do (
copy /y %%E %%~dpnE.old >nul)

:: parse txt files, create new temp files.
For /F %%J in ('dir/s/b sample*.txt') Do (
For /F "tokens=*" %%A in (%%J) Do (
Set Row=%%A
If !Option!==nowhitespace Call :Sub1
Echo !Row!>>%%~dpnJ.new))

:: rename temp files. cp *new *txt, rm *new
For /F %%M in ('dir/b/s sample*.new') Do (
copy /y %%M %%~dpnM.txt >nul && del %%M)
:: stop here so the main script does not fall through into :Sub1
GoTo :EOF

:: tab to space, two spaces to one, zero trailing spaces.
:: note: the first Set replaces a literal TAB with a space;
:: find and the second Set match TWO spaces.
:Sub1
Set Row=!Row:	= !
Echo !Row!|find/I "  " >nul
If errorlevel 1 (break) else (Set Row=!Row:  = !) & GoTo :Sub1
If "!Row:~-1,1!"==" " (Set Row=!Row:~0,-1!) & GoTo :Sub1
GoTo :EOF

Some of this I understand. There are four parts and a subroutine.
part1:
1 ::single space all txt files in the current tree...
2 @echo off
3 SetLocal EnableDelayedExpansion
4
5 If not %%1==() Set Option=%1
6
Not sure why it must be %%1.
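
On the %%1 question, a minimal sketch rather than a definitive answer: inside a batch file, %% is an escaped percent sign, so %%1 is the literal text %1 and not the first argument. The test on line 5 therefore compares the text %1 with () every time, is always true, and the Set always runs (with no argument it simply clears the variable). A plain %1 with no argument would leave nothing to the left of ==, which typically fails with a syntax error. A commonly used alternative is to quote both sides. The file name arg_check.bat and the echoed messages below are made up for illustration:

```batch
@echo off
:: arg_check.bat (hypothetical name): test for an optional first argument.
:: %~1 is the first argument with any surrounding quotes removed.
:: Quoting both sides keeps the comparison valid when no argument is given.
set "Option="
if not "%~1"=="" set "Option=%~1"

if defined Option (
    echo Option is %Option%
) else (
    echo No option given
)
```

Run as arg_check nowhitespace to take the first branch, or with no argument to take the second.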

part2:
7 :: backup original files. cp *txt *old
8 For /F %%E in ('dir/b/s sample*.txt') Do (
9 copy /y %%E %%~dpnE.old >nul)
10

part3:
11 :: parse txt files, create new temp files.
12 For /F %%J in ('dir/s/b sample*.txt') Do (
13 For /F "tokens=*" %%A in (%%J) Do (
14 Set Row=%%A
15 If !Option!==nowhitespace Call :Sub1
16 Echo !Row!>>%%~dpnJ.new))
17

part4:
18 :: rename temp files. cp *new *txt, rm *new
19 For /F %%M in ('dir/b/s sample*.new') Do (
20 copy /y %%M %%~dpnM.txt >nul && del %%M)
21

subroutine:
22 :: tab to space, two spaces to one, zero trailing.
23 :Sub1
24 Set Row=!Row:	= !
25 Echo !Row!|find/I "  " >nul
26 If errorlevel 1 (break) else (Set Row=!Row:  = !) & GoTo :Sub1
27 If "!Row:~-1,1!"==" " (Set Row=!Row:~0,-1!) & GoTo :Sub1
28 GoTo :EOF

Ok,
line 24 change a tab to a space.
line 25 find two spaces.
line 26 not found, quit.
found, change two spaces to one space.
line 27 not sure, maybe check for trailing space.
Can anyone explain this?
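
For line 27, a small sketch of the substring syntax it relies on: !Row:~-1,1! is the last character of Row, and !Row:~0,-1! is Row without its last character, so the line strips one trailing space and loops until none is left. The sample value and the :trim label are made up for illustration:

```batch
@echo off
setlocal EnableDelayedExpansion
:: Hypothetical sample value with two trailing spaces.
set "Row=part 16  "

:trim
:: !Row:~-1,1!  = the last character of Row
:: !Row:~0,-1!  = Row without its last character
if "!Row:~-1,1!"==" " (
    set "Row=!Row:~0,-1!"
    goto trim
)

:: The brackets make any remaining trailing space visible.
echo [!Row!]
```

With the sample value this prints [part 16].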

Also, I had to use the subroutine. Otherwise the GoTo on lines 26 and 27 will mess up the For loop.
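
The reason is that a GoTo executed inside a For block abandons the rest of the loop, while a Call to a label returns to the loop when the subroutine ends. A minimal sketch of that pattern, assuming made-up sample strings and a hypothetical :Squeeze label (it compares the row with a squeezed copy instead of piping to find, which is a swapped-in technique, not the method used above):

```batch
@echo off
setlocal EnableDelayedExpansion

:: Two hypothetical rows containing runs of spaces.
for %%W in ("a  b" "c   d") do (
    set "Row=%%~W"
    rem Call returns here afterwards, so the For loop keeps going.
    call :Squeeze
    echo [!Row!]
)
goto :eof

:Squeeze
:: The GoTo loop lives outside the For block, so it cannot cancel it.
set "Squeezed=!Row:  = !"
if not "!Row!"=="!Squeezed!" (
    set "Row=!Squeezed!"
    goto Squeeze
)
goto :eof
```

Each pass replaces every pair of spaces with one space, and the subroutine repeats until a pass changes nothing.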

If I was real smart, I'd figure out how to change the font for this site, or get it to support code blocks that begin with [code] and end with [/code].
These posts are so hard to read. I can't tell one space from two spaces, or one space from a tab.
Maybe even expand the margins, to make posts more readable?

enjoy,



Response Number 1
Name: wizard-fred
Date: May 2, 2007 at 03:54:31 Pacific
Subject: del blank lines, remove white space
Reply:

What you want is doable, but when carried to such an extreme it greatly reduces the readability of a file.
Removing every blank line produces a big blob with no logical breaks.
Reducing white space to a single space loses vertical alignment of columns, tables, and numbers.
I've done what you want in converting HTML pages to plain ASCII text. Sometimes multiple passes are needed.

With regard to changing the forum page layout: widening the margins will make it more unreadable. Not everyone has browsers that can display longer lines. I still use machines with 640x480 resolution. Even with 1024x768 resolution, there are occasional pages on some sites that require horizontal scrolling.


Response Number 2
Name: edman747
Date: May 2, 2007 at 10:24:30 Pacific
Subject: del blank lines, remove white space
Reply:

Hello, Thanks for your reply. I have a brother named Fred.
Maybe if I include a sample data file it will help to demonstrate removal of extra blank lines. Not all cr/lf are removed. Each line is terminated with cr/lf. Also, extra white space is removed.

sample1.txt with tabs shown as <tab>, two spaces shown as <ss> and multiple blank lines.

before:
Subject: this is the <ss> last
1. part <tab> 16


2. part <tab><tab> 17


3. part 18


after:
Subject: this is the last
1. part 16
2. part 17
3. part 18

Perhaps, the option should be named lesswhitespace

sample4.txt with multiple blank lines and trailing spaces.

before:
Subject: yet again even just <ss>more<ss>
1. part 13

how to tame a wild elephant

in the wild you would do this but at a zoo you do this
2. part 14

walking up the street with your friends is fun

you can kick over trash cans and look at speeding cars

3. part 15

after:
Subject: yet again even just more
1. part 13
how to tame a wild elephant
in the wild you would do this but at a zoo you do this
2. part 14
walking up the street with your friends is fun
you can kick over trash cans and look at speeding cars
3. part 15

So as you can see this comes out very nice and more readable. The text files are smaller. Extra white space reduction is an option. I would not recommend using that option on a file or set of files that has columns of data. It would affect the spacing!

Let's see how many characters will fit on one line.
52 numeric character margin spreader.
abcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabca
1234567890123456789012345678901234567890123456789012
ABCABCABCABCABCABCABCABCABCABCABCABCAB

[code]
:: single_space.bat
:: single space all txt files in the current tree.
:: And optionally remove extra whitespace.
:: Usage single_space [lesswhitespace]
@echo off
SetLocal EnableDelayedExpansion

If not %%1==() Set Option=%1

:: backup original files. cp *txt *old
For /F %%E in ('dir/b/s sample*.txt') Do (
copy /y %%E %%~dpnE.old >nul)

:: parse txt files, create new temp files.
For /F %%J in ('dir/s/b sample*.txt') Do (
For /F "tokens=*" %%A in (%%J) Do (
Set Row=%%A
If !Option!==lesswhitespace Call :Sub1
Echo !Row!>>%%~dpnJ.new))

:: rename temp files. cp *new *txt, rm *new
For /F %%M in ('dir/b/s sample*.new') Do (
copy /y %%M %%~dpnM.txt >nul && del %%M)
:: stop here so the main script does not fall through into :Sub1
GoTo :EOF

:: tab to space, two spaces to one, zero trailing spaces.
:: note: the first Set replaces a literal TAB with a space;
:: find and the second Set match TWO spaces.
:Sub1
Set Row=!Row:	= !
Echo !Row!|find/I "  " >nul
If errorlevel 1 (break) else (Set Row=!Row:  = !) & GoTo :Sub1
If "!Row:~-1,1!"==" " (Set Row=!Row:~0,-1!) & GoTo :Sub1
GoTo :EOF
[/code]

Please, any help on my questions regarding line 5 and line 27?

Hope this post is helpful.


Response Number 3
Name: Mechanix2Go
Date: May 2, 2007 at 12:16:09 Pacific
Subject: del blank lines, remove white space
Reply:

The long lines make it hard to read. Maybe start a new thread? For now, this may help:

::== white9.bat
:: get out extra white space
:: lesson learned: a tab in VDE becomes two spaces

@echo off
setLocal EnableDelayedExpansion
if exist *.new del *.new

for %%T in (*.txt) do (
set outname=%%~nT.new
for /f "tokens=*" %%a in (%%T) do (
set row=%%a
rem the next chunk changes [tab] to [space]
rem and triple and double [space] to single [space]
set row=!row:	= !
set row=!row:   = !
set row=!row:  = !
call :sub1 !row!
))
goto :eof

:sub1
echo %*>>!outname!
goto :eof
::==



=====================================
If at first you don't succeed, you're about average.

M2



Response Number 4
Name: edman747
Date: May 2, 2007 at 19:14:58 Pacific
Subject: del blank lines, remove white space
Reply:

Thank You,
It seems to do the job, and fast. With only minor changes, it will recurse the subdirectories to get all the other text files in the tree. Another change backs up the original files and renames the temp files.
I see that calling the subroutine with the one line (row) removes the last trailing space.
I have reduced the margin in my previous post from 80 to 52 numeric characters.
My brother was an aircraft mechanic in the Gulf. He got me a shiny new Zippo with some Arabic writing on it.
He says waiting in line at the bank was strange, because everyone in the line had a gun.

What is VDE? Wow, you went to the VDE institute in Germany? That must have been cool. Or do you mean Virtual Distributed Ethernet, or possibly the VDE DOS editor?

:: white9a.bat
:: get out extra white space
:: lesson learned: a tab in VDE becomes two spaces

@echo off
setLocal EnableDelayedExpansion
if exist *.new del *.new

:: backup original files. cp *txt *old
For /F %%E in ('dir/b/s sample*.txt') Do (
copy /y %%E %%~dpnE.old >nul)

:: parse txt files, create new temp files.
for /F %%T in ('dir/b/s sample*.txt') do (
set outname=%%~dpnT.new
echo !outname!
for /f "tokens=*" %%a in (%%T) do (
set row=%%a
rem the next chunk changes [tab] to [space]
rem and triple and double [space] to single [space]
set row=!row:	= !
set row=!row:   = !
set row=!row:  = !
call :sub1 !row!
))
:: rename temp files. cp *new *txt, rm *new
For /F %%M in ('dir/b/s sample*.new') Do (
copy /y %%M %%~dpnM.txt >nul && del %%M)
goto :eof

:sub1
echo %*>>!outname!
goto :EOF

Thanks Again, I learn something new everyday.


Response Number 5
Name: Mechanix2Go
Date: May 3, 2007 at 02:13:34 Pacific
Subject: del blank lines, remove white space
Reply:

the VDE dos editor


=====================================
If at first you don't succeed, you're about average.

M2


