I have about 5600 directories structured as follows:
I need to merge all A files into one file, all B files into another file, and so on.
How can I do this?
IIUC, this should work for your case. I tested it with a RootDir containing two subdirectories, Dir1 and Dir2, each holding two files, A.csv and B.csv. Change the value of rootdir to match your use case:
import os
import pandas as pd

rootdir = 'RootDir'  # Change to your root directory

# Paths of all .csv files found anywhere under rootdir
files = [os.path.join(dp, f) for dp, dn, filenames in os.walk(rootdir) for f in filenames if os.path.splitext(f)[1] == '.csv']

def name_of(path):
    # File name without directory or extension, e.g. 'RootDir/Dir1/A.csv' -> 'A'.
    # str.rstrip('.csv') strips characters, not a suffix, so it is avoided here.
    return os.path.splitext(os.path.basename(path))[0]

names = set(name_of(x) for x in files)
df_dict = {key: pd.DataFrame() for key in names}
for file in files:
    key = name_of(file)
    df = pd.read_csv(file)
    df_dict[key] = pd.concat([df_dict[key], df])
The output is a dictionary of DataFrames, df_dict, with A and B as keys.
Use df_dict['A'] to access DataFrame A, and so on.
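Since the question asks for one merged file per name, here is a minimal sketch of how the dictionary approach could be extended to write the results to disk. The function name merge_csvs_by_name, the outdir parameter, and the "_merged.csv" suffix are my own assumptions, not part of the question. It also collects the DataFrames in lists and concatenates once per name, which should scale better over ~5600 directories than concatenating inside the loop.

```python
import os
import pandas as pd

def merge_csvs_by_name(rootdir, outdir):
    """Merge same-named CSV files under rootdir into one CSV per name in outdir.

    Hypothetical helper: the name, the outdir argument and the '_merged.csv'
    suffix are illustrative choices. Returns a dict mapping each name
    (e.g. 'A') to the path of the merged file.
    """
    frames = {}  # name without extension -> list of DataFrames
    for dirpath, _dirnames, filenames in os.walk(rootdir):
        for fname in filenames:
            stem, ext = os.path.splitext(fname)
            if ext == '.csv':
                path = os.path.join(dirpath, fname)
                frames.setdefault(stem, []).append(pd.read_csv(path))

    os.makedirs(outdir, exist_ok=True)
    written = {}
    for stem, dfs in frames.items():
        out_path = os.path.join(outdir, stem + '_merged.csv')
        # Concatenate once per name; ignore_index gives a clean 0..n-1 index
        pd.concat(dfs, ignore_index=True).to_csv(out_path, index=False)
        written[stem] = out_path
    return written
```

Call it as merge_csvs_by_name('RootDir', 'Merged'). Keep outdir outside rootdir, otherwise a second run would pick up the merged files as input. This assumes all files with the same name share the same columns; if they don't, pd.concat fills the missing columns with NaN.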