Batch Merge CSV Without Duplicate Headers

September 23, 2011 at 06:41:39
Specs: Windows XP
Hi everyone,

Some background:

- I am completely brand new to DOS batch programming
- I am trying to batch merge multiple CSV files in a single folder without having the header/column name/etc duplicated.
- Each day, a new CSV file will be added to the folder.
- This batch would be used each day to create a master CSV with all data in folder.

After spending 2+ hours desperately searching for an answer on this forum, I haven't yet found anything that works! While users have posted the exact same question, each user seems to get a difference code. After trying all the codes, I still get duplicate headers in my master file, and sometimes duplicate records!

Here is what I'm looking for (seems repetitive from other posts, but for clarity's sake):

data1.csv:
Name Date
Sam 8/12/11
John 7/15/11
Bob 6/15/11

data2.csv
Name Date
Billy 7/14/10
Sara 9/8/11
Joey 6/2/10

data3.csv
Name Date
Hank 3/5/09
Sally 4/28/11
Gina 1/2/11

FinalData.csv
Name Date
Sam 8/12/11
John 7/15/11
Bob 6/15/11
Billy 7/14/10
Sara 9/8/11
Joey 6/2/10
Hank 3/5/09
Sally 4/28/11
Gina 1/2/11

Any help would be GREATLY appreciated!

Thanks so much,
Kevin


See More: Batch Merge CSV Without Duplicate Headers

Report •


#1
September 23, 2011 at 07:14:45
This should be close, but it needs testing.

@echo off

REM Creating FinalData.csv with Name Date at the top if it doesn't exist already.
if not exist FinalData.csv echo Name Date > FinalData.csv

REM First for loop drops the top row of data1.csv
REM and appends the rest of them to FinalData.csv
for /F "tokens=*" %%A in (data1.csv) do (
more + 1 > data1mod.csv
data1mod.csv >> FinalData.csv
)

for /F "tokens=*" %%A in (data2.csv) do (
more + 1 > data2mod.csv
data2mod.csv >> FinalData.csv
)

for /F "tokens=*" %%A in (data3.csv) do (
more + 1 > data3mod.csv
data2mod.csv >> FinalData.csv
)
REM Deleting the files used to drop the top row
del /f /q data1mod.csv
del /f /q data2mod.csv
del /f /q data3mod.csv


Report •

#2
September 23, 2011 at 07:24:21
@jowah

About to test, but is there a way to have the code account for all CSV files and not just data1-3? Each day, a new CSV file will be added to the folder, so I will run the batch everyday, and since this batch is for a client, I won't be able to edit the code to account for each new CSV file. This means the code has to account for merging all CSV files in a single batch code.


Report •

#3
September 23, 2011 at 07:43:47
Yes there is. You would need nested loops. The first step would be to use dir to get a list of the csv's into a file like this:

dir /b c:\directory\*.csv > csvlist.tmp

Then you would need a loop that will do what my first loop did for every file in the list. Here... I'll just modify the original script:

@echo off

REM Creating FinalData.csv with Name Date at the top if it doesn't exist already.
if not exist FinalData.csv echo Name Date > FinalData.csv

REM Creating list of csv files
dir /b c:\directory\*.csv > csvlist.tmp

for /F "tokens=*" %%A in (csvlist.tmp) do (
for /F "tokens=*" %%B in (%%A) do (
more + 1 > mod%%B
mod%%B >> FinalData.csv
)
)

REM Deleting the files used to drop the top row
del /f /q mod*


Report •

Related Solutions

#4
September 23, 2011 at 08:00:08
Hi again @jowah,

So, I used the following code in the batch file, which is located in the same folder as all the csv files:
=====================

@echo off

dir /b P:\SCCIP\DB Test*.csv > csvlist.tmp

REM Creating FinalData.csv with Name Date at the top if it doesn't exist already.
if not exist FinalData.csv echo Name Date > FinalData.csv

REM Creating list of csv files
dir /b c:\directory\*.csv > csvlist.tmp

for /F "tokens=*" %%A in (csvlist.tmp) do (
for /F "tokens=*" %%B in (%%A) do (
more + 1 > mod%%B
mod%%B >> FinalData.csv
)
)

REM Deleting the files used to drop the top row
del /f /q mod*

===============

The result is a .temp file, not a FinalData.csv file. Any idea why?


Report •

#5
September 23, 2011 at 08:23:25
You should put quotes around any path strings and/or file names that have spaces in them. Like P:\SCCIP\DB Test*.csv should be expressed as

"P:\SCCIP\DB Test*.csv"

or

P:\SCCIP\DB_Test*.csv

I'm in the habit of not using spaces in a Windows environment, but you can use quotes if you want to use spaces. I'm not sure if this is the reason it failed, but try changing that and we'll see.


Report •

#6
September 23, 2011 at 08:38:18
Hi again,

Put quotes around it, same issue. Only a temp file.

Is there anything to this code that would work (see below). Found it on another thread. Gives me a single file with all data, but no header removal.

===========
@echo off & setLocal EnableDELAYedExpansion

if exist all.csv del all.csv
set /p head=<1.csv
> newfile echo !head!

for /f "tokens=* delims= " %%a in ('dir/b *.csv') do (
for /f "skip=1 tokens=* delims= " %%i in (%%a) do (
>> newfile echo %%i
)
)
ren newfile all.csv
=================

Also, "Echo off" appears in Line 1. Not sure why....


Report •

#7
September 23, 2011 at 08:44:35
Maybe so. The more +1 line in our script is what removes the header. You could test this new one with that line and see.

Report •

#8
September 23, 2011 at 08:52:31
@jowah,

I'll try it out, though I don't know my way around DOS programming enough to know how to paste that line in there, and what part of the line would go where, etc. I actually have to head out for the weekend, but I'll be back on Monday trying to make this work. Hopefully, you'll still be around!

Thanks for everything so far,
Kevin


Report •

#9
September 23, 2011 at 09:01:10
Hey Kevin,

Yep I'll be around. I usually check my messages on here every morning, so just send me a new one and this thread will pop up.


Report •

#10
September 26, 2011 at 05:12:38
@jowah,

I'm out sick today, so I won't be able to come to the office to work on the .bat file we've been talking about. Definitely will be in tomorrow, so I'll update you then.

Thanks for the help,
Kevin


Report •

#11
September 26, 2011 at 06:17:51
Sounds good Kevin. I hope you feel better.

Report •

#12
September 30, 2011 at 13:06:55
@jowah,

I was sicker than I thought! Didn't get back to work until today, and I've been catching up like crazy. Chances are I won't get to this project until mid-next week. I'll message again then. Didn't want you thinking I flaked!

Take care,
Kevin


Report •

#13
September 30, 2011 at 13:20:46
Kevin - I recently posted in another thread dealing with almost the same query as yours. Maybe there might be some benefit for you in reading the thread which is here http://www.computing.net/answers/pr...

Good luck.


Please come back & tell us if your problem is resolved.


Report •

#14
October 1, 2011 at 20:20:15
No problem Kevin. I'll be around whenever you have time. Also, Wahine is a veritable ninja of batch scripting :) and the link he posted above is relevant to your needs.

Report •

#15
October 11, 2011 at 06:17:56
Hi @jowah and @wahine,

Wow - got wiped out at work trying to catch up from being out for a few days. I'll be moving on to this project later this week, so I'll apply both methods and see which works. You can bet I'll be posting again to let you know how it goes!

Take care,
Kevin


Report •

Ask Question