Merge CSV-Files and remove duplicates columns

March 13, 2012 at 21:41:05
Specs: Windows XP
Hi everyone,

Here is some background for my question: Unfortunately, I am not really good at batch programming. However, I think this might be the best way to merge csv files and I hope that someone is able to help me. I have about 2000 csv files in one folder: Basically, there a three different file names :
19921231_Assets.csv
19921231_Cash.csv
19921231_Liabilities.csv
19930331_Assets.csv
19930331_Cash.csv
19930331_Liabilities.csv
19930630_Assets.csv
19930630_Cash.csv
19930630_Liabilities.csv

The syntax of the csv filename varies just by the date (quarterly data). I want to create for each quarter one merged file, e.g. 19921231.csv. This file should be a joined file of the three individual files. However, the three csv files of one period have some identical columns like name or address, so that in the final files these duplicate columns are unwanted. All files have the identical number and order of rows. In addition, I would like to join the files, so that the files should not be appended below. They should rather be juxtaposed to each other.

To clarify my intention, here is an example what I am looking for:

19921231_Assets.csv
cert,name,city,value1,value2,value3
"26979","Paul”,”New York”,”1”,”23”,”332”
"435","Bob”,”Boston”,”5”,”4”,”3”

19921231_Cash.csv
cert,name,city,value4,value5,
"26979","Paul”,”New York”,”7”,”88”
"435","Bob”,”Boston”,”34”,”43”

19921231_Liabilities.csv
name, cert,value6,city,
"Paul”,” 26979”, "78", New York”
“Bob”, "435"," 67”, ”Boston”

19921231.csv
cert,name,city,value1,value2,value3, value4,value5, value6,
"26979","Paul”,”New York”,”1”,”23”,”332”, ”7”,”88”,”78”
"435","Bob”,”Boston”,”5”,”4”,”3” ,”34”,”43””,”67”

Any help would be really helpful and is greatly appreciated!

Thank you very much in advance.

Brian


See More: Merge CSV-Files and remove duplicates columns

Report •


#1
March 14, 2012 at 02:13:00
An interesting problem for a batch script but before committing time to it please confirm your use of double quotes, you show three different characters being used. Also spaces appear where I doubt spaces should be.

Please review the data examples shown.


Report •

#2
March 14, 2012 at 02:33:13
T.C., thank you very much for your interest and your effort.

Yes, you are right! I messed something about with the example data; there shouldn't be any spaces nor quotes! Here is the revised example:

19921231_Assets.csv

cert,name,city,value1,value2,value3

26979,Paul,NewYork,1,23,332

435,Bob,Boston,5,4,3

19921231_Cash.csv

cert,name,city,value4,value5

26979,Paul,NewYork,7,88

435,Bob,Boston,34,43

19921231_Liabilities.csv

name,cert,value6,city

Paul,26979,78,NewYork

Bob,435,67,Boston

19921231.csv

cert,name,city,value1,value2,value3,value4,value5,value6
26979,Paul,NewYork,1,23,332,7,88,78

435,Bob,Boston,5,4,3,34,43,67


Report •

#3
March 14, 2012 at 14:28:13
Having copied/pasted the new data supplied I find what I think are spurious question marks in both the input and required output (see below), the presence of these question marks is confirmed using a hex editor to view the copied/pasted file. I will generate what I think your input files should contain and do a bit of work on that. Could take a couple of days (perhaps longer) for me to get back at you.

    19921231_Assets.csv?
    cert,name,city,value1,value2,value3?
    26979,Paul,NewYork,1,23,332
    ?435,Bob,Boston,5,4,3

    19921231_Cash.csv
    ?cert,name,city,value4,value5
    ?26979,Paul,NewYork,7,88?
    435,Bob,Boston,34,43

    19921231_Liabilities.csv
    ?name,cert,value6,city?
    Paul,26979,78,NewYork
    ?Bob,435,67,Boston

    19921231.csv?
    cert,name,city,value1,value2,value3,value4,value5,value6?26979,Paul,NewYork,1,23,332,7,88,78?
    435,Bob,Boston,5,4,3,34,43,67



Report •

Related Solutions


Ask Question