Script to copy header rows to multiple other files

April 13, 2018 at 04:53:37
Specs: Windows 10
1. I have a script which splits a huge file (>500 MB) into smaller files (30-40 MB each) with 10000 lines in each file, e.g. ER_4465_20180118_021100_2 is split into ER_4465_20180118_021100_21, ER_4465_20180118_021100_22, ER_4465_20180118_021100_23, etc.

@echo off
setLocal EnableDelayedExpansion

set limit=100000
set file=ER_4465_20180118_021100_2.csv
set lineCounter=1
set filenameCounter=1

set name=
set extension=
rem Split the input file name into base name and extension
for %%a in (%file%) do (
  set "name=%%~na"
  set "extension=%%~xa"
)

rem Append each line to the current split file, moving on after %limit% lines
for /f "tokens=*" %%a in (%file%) do (
  if !lineCounter! gtr !limit! (
    set /a filenameCounter=!filenameCounter! + 1
    set lineCounter=1
    echo Created !splitFile!.
  )
  set splitFile=!name!!filenameCounter!!extension!
  echo %%a>> !splitFile!
  set /a lineCounter=!lineCounter! + 1
)


2. The script copies the first 2 header rows from the original file to the first split file only, e.g. ER_4465_20180118_021100_21, and not to the other files.

3. Currently we copy the first two lines to a separate file and use that file to add the header rows to the other split files.

4. Can we modify the script in step 1 to include a condition that copies the first two rows of the original file to all the split files, e.g. ER_4465_20180118_021100_22, ER_4465_20180118_021100_23?
Please let me know if you need any more input.



#1
April 13, 2018 at 11:25:16
If you're going to do this in Win10 and/or Server 2012, you should probably use its official scripting language: PowerShell.
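$inputPath = 'C:\temp\ER_4465_20180118_021100_2.csv'
$lineLimit = 100000
$inInfo = Get-Item $inputPath
# StreamWriter resolves relative names against the .NET current directory,
# so point it at the input file's folder
[IO.Directory]::SetCurrentDirectory($inInfo.DirectoryName)
$inFile = New-Object IO.StreamReader $inputPath
# The first two lines are the header that every piece repeats
$header = $inFile.ReadLine() + "`r`n" + $inFile.ReadLine()
$line = $null
$lineCount = -1
$fileCount = 0
# Placeholder with a no-op Dispose, so the first rollover has something to close
$outFile = New-Object PSObject |
            Add-Member -MemberType ScriptMethod -Name "Dispose" -Value {} -PassThru

try {
  while (($line = $inFile.ReadLine()) -ne $null) {
    # The counter wraps to 0 every $lineLimit data lines: start a new piece
    if (($lineCount = ($lineCount + 1) % $lineLimit) -eq 0) {
      $outFile.Dispose()
      $outFile = New-Object IO.StreamWriter ("{0}_{1}{2}" -f $inInfo.BaseName,
                                             ++$fileCount, $inInfo.Extension)
      $outFile.WriteLine($header)
    }
    $outFile.WriteLine($line)
  }
} finally {
  # Close both files even if the loop fails part-way
  $inFile.Dispose()
  $outFile.Dispose()
}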
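
The rollover test in the loop packs an increment, a wrap-around and a comparison into one expression. Here is a minimal sketch of just that counter, assuming a made-up limit of 3 and seven data lines (both values are for illustration only and are not part of the script):

```powershell
# Hypothetical demo values, chosen only to keep the output short
$lineLimit = 3
$lineCount = -1

$newFileAt = foreach ($i in 1..7) {
  # Same expression as in the script: advance the counter, wrap at the limit,
  # and treat 0 as "time to open the next piece"
  if (($lineCount = ($lineCount + 1) % $lineLimit) -eq 0) { $i }
}

# Data lines 1, 4 and 7 each start a new piece
$newFileAt -join ','
```

Starting the counter at -1 makes the very first data line land on 0, which is why the first piece is opened (and gets its header) without any special case.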

How To Ask Questions The Smart Way

message edited by Razor2.3



#2
April 15, 2018 at 01:05:32
Many thanks. This works great :)

New files with different names arrive in the folder each time. How can I pass the input path as a pattern such as 'C:\temp\ER_*.csv'?



#3
April 16, 2018 at 06:40:46
I generally leave such actions as exercises for the reader, but sure. I'll make the configurable bits parameters, modify the script to allow multiple files, make 'C:\temp\ER_*.csv' the default, and add pipeline support.
[cmdletbinding(SupportsShouldProcess=$false)]
Param (
  # One or more paths or wildcard patterns
  [Parameter(ValueFromPipeline=$True,ValueFromPipeLineByPropertyName=$True)]
   [Alias('FullName')]
   [String[]]$Path = 'C:\temp\ER_*.csv',
  [int]$lineLimit = 100000,
  [int]$headerCount = 2
)
Begin { $files = @() }
# Expand each path or pattern into the matching files
Process { $files += Get-ChildItem $Path }
End {
  foreach($f in $files)  {
    Write-Verbose "Splitting $f . . . . "
    $inFile = New-Object IO.StreamReader $f.FullName
    # Read the header rows once; every piece starts with them
    $header = (1..$headerCount | ForEach-Object { $inFile.ReadLine() }) -join "`r`n"
    $line = $null
    $lineCount = -1
    $fileCount = 0
    # Placeholder with a no-op Dispose, so the first rollover has something to close
    $outFile = New-Object PSObject |
                Add-Member -MemberType ScriptMethod -Name "Dispose" -Value {} -PassThru
    try {
      while (($line = $inFile.ReadLine()) -ne $null) {
        # The counter wraps to 0 every $lineLimit data lines: start a new piece
        if (($lineCount = ($lineCount + 1) % $lineLimit) -eq 0) {
          $outFile.Dispose()
          $outFile = New-Object IO.StreamWriter ("{0}\{1}_{2}{3}" -f $f.Directory,
                                               $f.BaseName, ++$fileCount, $f.Extension)
          $outFile.WriteLine($header)
        }
        $outFile.WriteLine($line)
      }
    } finally {
      $inFile.Dispose()
      $outFile.Dispose()
    }
    Write-Verbose "Done, $fileCount pieces."
  }
}
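
The header step is the one piece that changed shape from the first version: two explicit ReadLine calls became a loop over $headerCount. Here is a minimal, self-contained sketch of that step, assuming an in-memory StringReader with made-up sample rows in place of a real ER_*.csv file:

```powershell
$headerCount = 2
# Hypothetical sample content standing in for a real input file
$reader = New-Object IO.StringReader "col_a,col_b`r`nunits_a,units_b`r`n1,2`r`n3,4"
try {
  # Read $headerCount lines and rejoin them with CRLF;
  # in the script, WriteLine($header) supplies the final line break
  $header = (1..$headerCount | ForEach-Object { $reader.ReadLine() }) -join "`r`n"
  # The reader is now positioned on the first data row
  $firstDataRow = $reader.ReadLine()
} finally {
  $reader.Dispose()
}
$header
$firstDataRow
```

If the script is saved under a name of your choosing, say Split-Csv.ps1 (a hypothetical name), it can be run with the defaults, or with overrides such as -Path 'C:\temp\ER_*.csv' -lineLimit 50000 -Verbose. One caveat: the range 1..$headerCount counts down when $headerCount is 0, so passing 0 would still consume two lines as a header.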


#4
April 17, 2018 at 04:08:44
Thanks a lot. It works as expected.
