Combining CSV Files with different column data on one column

Question

A sample dataset is structured as follows

Home_HeatSensor_AA.CSV
Office_HeatSensor_BB.CSV
Ship_ElevationSensor_XXYY.CSV

AA.CSV has the following columns, with a sample row

   Time  AA  AB  BB  Site  Type
0  1:00   5   4   5  Home  Heat

BB.CSV is formatted similarly

   Time  AA  AB  BB    Site  Type
0  1:00   6   2   4  Office  Heat

However, XXYY.CSV has a much different format

   Time     XX       XY     YY  Site       Type
0  1:00  1.332  12.1123  4.212  Ship  Elevation

I need to join these three CSV files into a master CSV file formatted as follows

   Time  AA  AB  BB     XX       XY     YY    Site       Type
0  1:00   5   4   5                           Home       Heat
0  1:00   6   2   4                         Office       Heat
0  1:00              1.332  12.1123  4.212    Ship  Elevation

I've experimented with pandas, but the results have been mixed. The code below joins the data but changes the column order of Time, Site, and Type. Ideally these would stay fixed, with Time as the first column and Site and Type as the last two.

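import pandas as pd

li = []  # one DataFrame per input file
for filename in filepaths:  # filepaths: list of CSV paths to combine
    df = pd.read_csv(filename, index_col=None, header=0, parse_dates=True, infer_datetime_format=True)
    li.append(df)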

piRSquared · Accepted Answer · 2019-07-09 16:08:27Z

`pd.concat`

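import pandas as pd

def read_csv(fn):
    # skipinitialspace drops any spaces that follow a delimiter
    return pd.read_csv(fn, skipinitialspace=True)

files = ['Home_HeatSensor_AA.CSV', 'Office_HeatSensor_BB.CSV', 'Ship_ElevationSensor_XXYY.CSV']
cols = ['Time', 'AA', 'AB', 'BB', 'XX', 'XY', 'YY', 'Site', 'Type']

# Stack the files row-wise, then select the columns in the desired order
pd.concat(map(read_csv, files), sort=False)[cols].to_csv('MASTER.CSV', index=False)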

Then confirm

cat MASTER.CSV

Time,AA,AB,BB,XX,XY,YY,Site,Type
1:00,5.0,4.0,5.0,,,,Home,Heat
1:00,6.0,2.0,4.0,,,,Office,Heat
1:00,,,,1.3319999999999999,12.1123,4.212,Ship,Elevation
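The integers come back as `5.0` and `4.0` because the Ship row has no values for those columns, so pandas stores them as floats with `NaN`. If whole numbers matter in the output, one possible workaround is to cast those columns to the nullable `Int64` dtype before writing. This is only a sketch: the in-memory CSV strings stand in for the real files, and it assumes you know which columns should hold integers.

```python
import io
import pandas as pd

# In-memory stand-ins for two of the sample files (contents taken from the
# sample rows in the question; real code would read the files from disk).
home = io.StringIO("Time,AA,AB,BB,Site,Type\n1:00,5,4,5,Home,Heat\n")
ship = io.StringIO("Time,XX,XY,YY,Site,Type\n1:00,1.332,12.1123,4.212,Ship,Elevation\n")

combined = pd.concat([pd.read_csv(home), pd.read_csv(ship)],
                     sort=False, ignore_index=True)

# Assumption: these are the columns that should stay integer-valued.
int_cols = ['AA', 'AB', 'BB']

# Int64 (capital I) is the nullable integer dtype, so missing values
# no longer force the whole column to float.
combined = combined.astype({col: 'Int64' for col in int_cols})

csv_text = combined.to_csv(index=False)
print(csv_text)
```

Missing values are still written as empty fields, so the layout of MASTER.CSV does not change.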

If you won't know the column names in advance:

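def read_csv(fn):
    return pd.read_csv(fn, skipinitialspace=True)

files = ['Home_HeatSensor_AA.CSV', 'Office_HeatSensor_BB.CSV', 'Ship_ElevationSensor_XXYY.CSV']

# No column selection: pandas keeps the columns in order of first appearance
pd.concat(map(read_csv, files), sort=False).to_csv('MASTER.CSV', index=False)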

edited Jul 9, 2019 at 16:08

answered Jul 9, 2019 at 15:07

piRSquared


2 Comments

ronald mcdolittle Over a year ago

It's a good answer, but I won't know the column names in advance. Users could upload a new file with a new set of column names, and the code needs to account for that.

piRSquared Over a year ago

Without the column names, pandas places the columns in some order. I used the column names to present the result in the order you specified. If you won't know the column names, leave the column selection out.
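If the sensor columns are unknown but Time, Site, and Type are always present, a middle ground is to compute the order after concatenating: pin the known columns to the front and back and leave everything else in between. The following is a minimal sketch, not part of the original answer. `order_columns` is a hypothetical helper, and the in-memory CSV strings stand in for uploaded files.

```python
import io
import pandas as pd

def order_columns(df, first=('Time',), last=('Site', 'Type')):
    """Return df with `first` columns leading, `last` columns trailing,
    and all remaining columns in between, in their existing order."""
    head = [c for c in first if c in df.columns]
    tail = [c for c in last if c in df.columns]
    middle = [c for c in df.columns if c not in head and c not in tail]
    return df[head + middle + tail]

# Stand-ins for two uploaded files with different sensor columns.
home = io.StringIO("Time,AA,AB,BB,Site,Type\n1:00,5,4,5,Home,Heat\n")
ship = io.StringIO("Time,XX,XY,YY,Site,Type\n1:00,1.332,12.1123,4.212,Ship,Elevation\n")

combined = pd.concat([pd.read_csv(home), pd.read_csv(ship)],
                     sort=False, ignore_index=True)
ordered = order_columns(combined)
print(list(ordered.columns))
```

With `sort=False`, the middle columns keep the order in which they first appear across the files, so a new file with new column names simply adds its columns before Site and Type.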
