
I am inside a directory with a series of .csv files, each of which I would like to assign to its own variable.

The idea is that I want to tidy up each dataframe on its own first within a loop, then concatenate everything at the end. (My non-"loopified" code is a series of dropping, renaming, and group-by/pivot commands; I wrote these commands out once because all the .csv files look the same.)

The last step to writing my loop is to iteratively read the set of .csv files in a for loop. The csv files are named:

  1. 100001_t0.csv
  2. 100001_t1.csv
  3. 100001_t2.csv
  4. 100002_t0.csv

... and so on until 100009_t2.csv

In the loop below, filename is the name of the .csv file and subjid is the alphanumeric ID before the .csv extension.

I have tried exec("{0}_df = pd.read_csv(filename)".format(subjid)), but I get an invalid token error. Is there a way to change the format portion of this line so that each dataframe is assigned to its own variable named after its subjid?

Thanks!

for filename in os.listdir(volume_statistics_directory):
    f = os.path.join(volume_statistics_directory, filename)
    if os.path.isfile(f):
        subjid = filename[0:9]
        #print(subjid)
        #print(f)
        print(filename, "being read in...")
        print("\n")
        exec("{0}_df = pd.read_csv(filename)".format(subjid))
        df = pd.read_csv(filename)


100001_t0.csv being read in...


Traceback (most recent call last):

  File "C:\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3326, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)

  File "<ipython-input-109-ceed2fd80975>", line 9, in <module>
    exec("{0}_df = pd.read_csv(filename)".format(subjid))

  File "<string>", line 1
    100001_t0_df = pd.read_csv(filename)
          ^
SyntaxError: invalid token
  • Constructing variable names from strings is usually a bad idea. Use a dict instead. Commented Nov 8, 2022 at 23:13

1 Answer


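The error here happens because it's not legal for a variable name to start with a digit. The generated name 100001_t0_df does, so the parser tries to read it as a numeric literal and rejects it. Your code would have worked otherwise.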
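As a quick check of that rule, here is a small sketch using the built-in str.isidentifier() method. The second name, with the prefix moved to the front, is only an illustration and not something from the question.

```python
# str.isidentifier() reports whether a string is a valid Python identifier.
names = ["100001_t0_df", "df_100001_t0"]

for name in names:
    # The first name starts with a digit, so it is not a valid identifier.
    print(name, name.isidentifier())
```

The first name is reported as invalid and the second as valid, which is why the exec line fails on the very first file.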

However, constructing variable names from strings is usually a bad idea. Use a dict instead:

dfs = {}
for f in files:  # files: an iterable of paths to the .csv files
    dfs[f] = pd.read_csv(f)  # each DataFrame is stored under its file path
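For the directory layout in the question, a fuller sketch might look like the following. It assumes every file of interest ends in .csv and that the subject ID is the file name without its extension. The helper names read_csvs and combine are made up for this example.

```python
import os

import pandas as pd


def read_csvs(directory):
    """Read every .csv file in `directory` into a dict keyed by subject ID."""
    dfs = {}
    for filename in sorted(os.listdir(directory)):
        path = os.path.join(directory, filename)
        if os.path.isfile(path) and filename.endswith(".csv"):
            # "100001_t0.csv" -> "100001_t0"
            subjid = os.path.splitext(filename)[0]
            # Read from the full path so the current directory does not matter.
            dfs[subjid] = pd.read_csv(path)
    return dfs


def combine(dfs):
    """Stack the per-subject frames, keeping the subject ID as a column."""
    # Passing a dict to pd.concat uses its keys as an outer index level,
    # which is then moved into a regular column named "subjid".
    return pd.concat(dfs, names=["subjid"]).reset_index(level="subjid")
```

With this, dfs = read_csvs(volume_statistics_directory) gives access to one frame as dfs["100001_t0"]. The per-file tidying steps can be applied to each value of the dict before calling combine(dfs).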