I have a dataframe df and want to create a new dataframe df_b from it, keeping only the rows where the value in the df['id'] column is in my list list_of_ids.
Both df['id'] and list_of_ids contain string values.
I thought of using a regex, but it would be huge, since list_of_ids has more than 20 elements. I would need a generator over list_of_ids, but I don't know how to apply that.
I was thinking something like:
list_of_ids = ["thing1", "thing2", "thing3"]
df_b = df[df["id"].apply(lambda x: x in list_of_ids)]
Or I could use the .str.contains() method and pass it a string built by joining all the elements of list_of_ids with a pipe '|', but that doesn't seem "clean".
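To make the pipe-joined idea concrete, here is a minimal sketch. The sample dataframe and its "value" column are made up for illustration and are not part of my real data. It also shows a pitfall I would have to watch for: .str.contains() does substring matching, so an unanchored pattern built from "thing1" also selects "thing10".

```python
import re

import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame(
    {"id": ["thing1", "thing10", "thing2", "other"], "value": [1, 2, 3, 4]}
)
list_of_ids = ["thing1", "thing2", "thing3"]

# Escape each id so regex metacharacters are treated literally, then join with "|".
alternatives = "|".join(re.escape(s) for s in list_of_ids)

# Unanchored pattern: matches substrings, so "thing10" is selected as well.
df_substring = df[df["id"].str.contains(alternatives)]

# Anchored pattern: only ids equal to one of the list elements are selected.
df_exact = df[df["id"].str.contains(f"^(?:{alternatives})$")]

print(df_substring["id"].tolist())
print(df_exact["id"].tolist())
```

The anchoring and escaping are what make this approach feel unclean to me for plain membership tests.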
df[df['id'].isin(list_of_ids)]
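For completeness, a minimal runnable sketch of the membership filter using Series.isin, which is a method on the Series itself rather than on the .str accessor. The sample dataframe and its "value" column are made up for illustration. The result is compared against the apply/lambda version to check that both select the same rows.

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame(
    {"id": ["thing1", "thing10", "thing2", "other"], "value": [1, 2, 3, 4]}
)
list_of_ids = ["thing1", "thing2", "thing3"]

# Boolean mask: True where the row's id is exactly one of the listed ids.
mask = df["id"].isin(list_of_ids)

# .copy() makes df_b an independent dataframe rather than a slice of df.
df_b = df[mask].copy()

# Same selection via apply, for comparison.
df_apply = df[df["id"].apply(lambda x: x in list_of_ids)]

print(df_b)
```

Because isin tests exact membership, "thing10" is not selected here, and no regex needs to be built however long list_of_ids gets.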