Splitting large text files

May 7, 2009 at 10:43:31
Specs: Windows XP
I'm trying to write a script that runs through a bunch of large text files and splits each one in half. I've got some of the code figured out, but I can't work out how to copy everything after the midpoint of a file to a new file.

I just need to figure out how to count the lines, divide the count by two, and append the second half to a new file (>> newfile.txt).

Any help would be greatly appreciated.


#1
May 7, 2009 at 14:47:29
:: SPLIT.BAT Usage: split Filename
@echo off & setlocal EnableDelayedExpansion

set param=%*
set param=%param:"=%
for %%j in ("%param%") do set file=%%~nj

for /F %%j in ('type "%param%" ^| find /V /C ""') do set Half=%%j
set /A Half/=2

type nul > "%file%_A.txt"
set N=1
for /F "tokens=1* delims=]" %%j in ('type "%param%" ^| find /V /N ""') do (
  if !N! gtr %Half% goto :SPLIT
  set /A N+=1
  echo.%%k>> "%file%_A.txt"
)
:SPLIT
type nul > "%file%_B.txt"
for /F "skip=%Half% tokens=1* delims=]" %%j in ('type "%param%" ^| find /V /N ""') do (
  echo.%%k>> "%file%_B.txt"
)
:: End_Of_Batch

If the filename contains spaces, enclose it in double quotes, e.g.

split "split file.txt"
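
The counting step used in the script above can be tried on its own. The following is only a minimal sketch: the name COUNT.BAT and the variables file, Lines and Half are made up for illustration, and unlike SPLIT.BAT it reads the first argument only, so a name with spaces has to be quoted.

```bat
:: COUNT.BAT Usage: count "Filename"   (hypothetical helper, for illustration)
@echo off & setlocal

set "file=%~1"
if not exist "%file%" (
  echo.  File "%file%" not found
  goto :EOF
)
:: find /V /C "" counts the lines that do not contain the empty string,
:: which is every line, blank ones included
for /F %%j in ('type "%file%" ^| find /V /C ""') do set Lines=%%j
:: set /A uses integer division, so an odd count is rounded down
set /A Half=Lines/2
echo.  Lines: %Lines%  Half: %Half%
```

Because the division rounds down, a file with an odd number of lines ends up with the extra line in the second half.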


#2
May 7, 2009 at 17:21:13
Can I put this in a loop
for %%a in (*.*) do (

)
in order to run through multiple files?

Thanks very much in advance for your help!!!!


#3
May 7, 2009 at 17:29:39
If you can download and use split (for Windows), the most basic command is
c:\test> split file
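
To cut a file into exactly two halves with such a tool, the chunk size has to be passed explicitly. The sketch below assumes the downloaded program is a port of GNU split, where -l sets the number of lines per piece and the last argument is the output prefix; other ports may use different options. The name HALVE.BAT and the variables are hypothetical.

```bat
:: HALVE.BAT Usage: halve "Filename"   (hypothetical wrapper, for illustration)
:: Assumes a GNU-style split.exe is on the PATH
@echo off & setlocal

set "file=%~1"
for /F %%j in ('type "%file%" ^| find /V /C ""') do set Lines=%%j
:: Round up so that an odd line count still gives two pieces, not three
set /A Chunk=(Lines+1)/2
:: Call split.exe by its full name: a SPLIT.BAT in the current directory
:: could otherwise be picked up first
split.exe -l %Chunk% "%file%" "%~n1_"
```

With GNU split the pieces would be named after the prefix plus aa and ab, e.g. data_aa and data_ab for data.txt.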


#4
May 7, 2009 at 18:06:19
Hey,

Thanks for the suggestion, but I'm also trying to learn/understand the coding process of scripts while I do this...

I've already downloaded that app if all else fails...


#5
May 7, 2009 at 19:08:44
That's all right, since you are still learning. But in reality, I can tell you that you will seldom need to reinvent the wheel like that.

#6
May 8, 2009 at 03:48:45
First of all, replace the Split.bat code with the following revised version, which is more efficient and fully checks the file parameter:

:: SPLIT.BAT Usage: split [device:][pathname]filename
@echo off & setlocal EnableDelayedExpansion

set param=%*
if not defined param (
  echo.
  echo.  Usage: split [device:][pathname]filename
  goto :EOF
)
set param=%param:"=%
if not exist "%param%" (
  echo.
  echo. File "%param%" not found
  goto :EOF
)
for %%j in ("%param%") do (
  set name=%%~dpnj
  set ext=%%~xj
)
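:: Count the lines: find /V /C "" counts every line, including blank ones
for /F %%j in ('type "%param%" ^| find /V /C ""') do set Half=%%j
set /A Half/=2

:: Create both output files empty so that a rerun does not append to old data
type nul > "%name%_A%ext%"
type nul > "%name%_B%ext%"
:: X selects the current output file; N is the number of the next line to write
set X=A
set N=1
:: find /V /N "" prefixes each line with [n], so blank lines survive for /F;
:: the text after the first "]" is what gets written out. Note that with
:: delayed expansion enabled, lines containing "!" are not copied faithfully.
for /F "tokens=1* delims=]" %%j in ('type "%param%" ^| find /V /N ""') do (
  set /A N+=1
  echo.%%k>> "%name%_!X!%ext%"
  if !N! gtr %Half% set X=B
)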
:: End_Of_Batch

Then the easiest way to perform mass splitting is to call the above batch from a main one, i.e.

:: MASSPLIT.BAT Usage: massplit Folder_Name
@echo off
pushd %*
echo.  Splitting, please wait...
for /F "delims=" %%j in ('dir /B *.txt') do call "%~dp0split.bat" "%%j"
echo.  DONE
popd
:: End_Of_Batch

Store both scripts in the same directory and then type, e.g.

  massplit C:\My Dir\test

The process may take minutes, so be patient. I tested the script under Win 2K/XP and it worked fine.

You can directly put it into a For loop, but that requires a bit of code rearranging and so I selected the shortest way.
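
For comparison, the rearranged single-file form could look roughly like the sketch below. It is not the tested script from this post: the name SPLITALL.BAT and the label :HALVE are invented here, the parameter checks are reduced to a minimum, and the folder name has to be quoted if it contains spaces. The splitting logic is moved into a subroutine that the For loop calls once per file.

```bat
:: SPLITALL.BAT Usage: splitall "Folder_Name"   (hypothetical single-file variant)
@echo off & setlocal EnableDelayedExpansion

if "%~1"=="" (
  echo.  Usage: splitall "Folder_Name"
  goto :EOF
)
pushd "%~1"
if errorlevel 1 goto :EOF
:: dir /B finishes before the loop starts, so the new _A/_B files
:: are not picked up in the same run
for /F "delims=" %%f in ('dir /B *.txt') do call :HALVE "%%f"
popd
goto :EOF

:HALVE
:: %1 is the quoted file name passed by the loop above
set "name=%~n1"
set "ext=%~x1"
for /F %%j in ('type %1 ^| find /V /C ""') do set /A Half=%%j/2
type nul > "%name%_A%ext%"
type nul > "%name%_B%ext%"
set X=A
set N=1
for /F "tokens=1* delims=]" %%j in ('type %1 ^| find /V /N ""') do (
  set /A N+=1
  echo.%%k>> "%name%_!X!%ext%"
  if !N! gtr %Half% set X=B
)
goto :EOF
```

Running it a second time on the same folder would split the _A and _B files again, so the results are best moved elsewhere first. File names containing "!" are not handled, because delayed expansion is enabled for the whole script.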

To ghostdog

The scientist Enrico Fermi who built the first nuclear reactor (and then the atomic bomb in Los Alamos) loved to reinvent the wheel and he discovered new ways to build wheels.

Anyway, sometimes it is better to be able to perform a calculation with your own hands, even if you can't operate a calculator or a computer.


#7
May 8, 2009 at 04:25:02
to ivo
>>The scientist Enrico Fermi who built the first nuclear reactor (and then the atomic bomb in Los Alamos) loved to reinvent the wheel and he discovered new ways to build wheels.

I don't think you would want to use this as an analogy in this case, because what the batch file does can already be done with the split command, i.e., no new ways of doing things :)

>> Anyway, sometimes it is better to be able to perform a calculation with your own hands, even if you can't operate a calculator or a computer.

Not in a world where time is precious and tools that perform specific tasks well are already available. :)
I agree that, for educational purposes, coding from scratch as in this case is fun, but not in reality, when you are in a sysadmin job and time is mostly not on your side. At least not for me. I would not want to rewrite a whole bunch of batch commands when I can just use split.

