My DataFrame has an index of SubjectID values, and each SubjectID has its own directory. Each subject directory contains a .csv file with info that I want to put into my DataFrame. Using my SubjectID index, I want to read in the header of that .csv file for every subject and put it into a new column in my DataFrame.
Every subject directory has the same path except for the individual subject number.
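Conceptually, something like this is what I'm after (just a sketch; the file name info.csv and the sample index values are placeholders for my real data):

import pandas as pd

def read_header(subject_id):
    # info.csv is a placeholder for whatever each subject's file is actually called
    path = '/home/mydirectory/{}/info.csv'.format(subject_id)
    with open(path) as fh:
        return fh.readline().strip()

# df stands in for my existing DataFrame indexed by SubjectID
df = pd.DataFrame(index=['001', '002'])
df['csv_header'] = [read_header(sid) for sid in df.index]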
I have found ways to read multiple .csv files from a single target directory into a pandas DataFrame, but not from multiple directories. Here is some code I have for importing multiple .csv files from a target directory:
import os
import glob
import pandas as pd

subject_path = '/home/mydirectory/SubjectID/'
filelist = []
os.chdir(subject_path)          # pass the variable, not the string 'subject_path'
for f in glob.glob('*.csv'):
    filelist.append(f)

# read each csv file into a single dataframe and add a filename reference column
columns = range(1, 100)
frames = []
for c, f in enumerate(filelist):
    key = 'file%i' % c
    frame = pd.read_csv(subject_path + f, skiprows=1, index_col=0, names=columns)
    frame['key'] = key
    frames.append(frame)
df = pd.concat(frames, ignore_index=True)   # DataFrame.append is deprecated; concat once at the end
I want to do something similar, but iterate through the different subject directories instead of reading from a single target directory.
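For example, I imagine something like this could work, with one wildcard level for the subject directories (the pattern /home/mydirectory/*/*.csv is an assumption about my layout, and names=range(1, 100) is carried over from the code above):

import glob
import pandas as pd

frames = []
for path in glob.glob('/home/mydirectory/*/*.csv'):       # one * per subject directory
    frame = pd.read_csv(path, skiprows=1, index_col=0, names=range(1, 100))
    frame['key'] = path                                    # keep the full path as the reference
    frames.append(frame)
df = pd.concat(frames, ignore_index=True)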
Edit:
I think I want to do this using os rather than pandas. Is there a way to use a loop to search through multiple directories with os?
(It doesn't look like this can be accomplished in pandas alone, so filelist should hold the full path rather than just the filename.)
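Something along these lines is what I'm picturing (a sketch; base_dir is my assumption for the parent directory that holds all the subject folders):

import os

base_dir = '/home/mydirectory/'
filelist = []
for root, dirs, files in os.walk(base_dir):                # walk every subject directory under base_dir
    for name in files:
        if name.endswith('.csv'):
            filelist.append(os.path.join(root, name))      # full path, not just the filename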