
I have 4 CSV files. I am able to merge all 4 of them, but I have run into a problem.

This is one file, named services.csv:

ServiceID   Service
1   General Practitioner
2   Pathology
3   Radiology
4   Psychiatry
5   Chiropratic

and the other file (I have named it test.csv):

ClinicServiceID ClinicID    ServiceID   Name    Suburb  State   Postcode    Email   Lat Lon
1   1   1   Hurstville Clinic   Hurstville  NSW 1493    [email protected]  -33.975869  151.088939
2   1   2   Hurstville Clinic   Hurstville  NSW 1493    [email protected]  -33.975869  151.088939
3   2   1   Sydney Centre Clinic    Sydney  NSW 2000    [email protected]  -33.867139  151.207114
4   2   2   Sydney Centre Clinic    Sydney  NSW 2000    [email protected]  -33.867139  151.207114
5   2   3   Sydney Centre Clinic    Sydney  NSW 2000    [email protected]  -33.867139  151.207114

Now I have to add the Service column from the services.csv file, matched on the ServiceID in the test file.

I am able to merge all the files, but I don't know how to perform the above operation.

Things that I have to achieve:
1) Add a Service column to the test.csv file
2) Fill the Service column using the data from the services.csv file

Can anyone please help me? I don't know how to resolve this problem.

1 Answer


I think you need read_csv to load both files as DataFrames, then map to look up each Service by ServiceID and insert to place the new column right after ServiceID:

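import pandas as pd

# load the lookup table (df1) and the table to extend (df2)
df1 = pd.read_csv('services.csv')
df2 = pd.read_csv('test.csv')

# get position of ServiceID column; the new column goes right after it
pos = df2.columns.get_loc('ServiceID') + 1

# map each ServiceID to its Service name via a Series indexed by ServiceID
df2.insert(pos, 'Service', df2['ServiceID'].map(df1.set_index('ServiceID')['Service']))
print(df2)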
   ClinicServiceID  ClinicID  ServiceID               Service  \
0                1         1          1  General Practitioner   
1                2         1          2             Pathology   
2                3         2          1  General Practitioner   
3                4         2          2             Pathology   
4                5         2          3             Radiology   

                   Name      Suburb State  Postcode  \
0     Hurstville Clinic  Hurstville   NSW      1493   
1     Hurstville Clinic  Hurstville   NSW      1493   
2  Sydney Centre Clinic      Sydney   NSW      2000   
3  Sydney Centre Clinic      Sydney   NSW      2000   
4  Sydney Centre Clinic      Sydney   NSW      2000   

                        Email        Lat         Lon  
0  [email protected] -33.975869  151.088939  
1  [email protected] -33.975869  151.088939  
2      [email protected] -33.867139  151.207114  
3      [email protected] -33.867139  151.207114  
4      [email protected] -33.867139  151.207114  
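
As an alternative to map, the same lookup can probably be written as a left join with merge. This is only a sketch, not part of the answer above: it uses small in-memory stand-ins for services.csv and test.csv (a subset of the columns, comma-separated by assumption), and the names services, clinics and merged are made up for the illustration.

```python
import io
import pandas as pd

# In-memory stand-ins for services.csv and test.csv (assumed comma-separated,
# only a few of the original columns are reproduced here).
services_csv = io.StringIO(
    "ServiceID,Service\n"
    "1,General Practitioner\n"
    "2,Pathology\n"
    "3,Radiology\n"
)
clinics_csv = io.StringIO(
    "ClinicServiceID,ClinicID,ServiceID,Name\n"
    "1,1,1,Hurstville Clinic\n"
    "2,1,2,Hurstville Clinic\n"
    "3,2,3,Sydney Centre Clinic\n"
)

services = pd.read_csv(services_csv)
clinics = pd.read_csv(clinics_csv)

# A left join keeps every row of clinics; a ServiceID with no match gets NaN.
merged = clinics.merge(services, on="ServiceID", how="left")

# merge appends Service as the last column, so move it right after ServiceID.
cols = list(merged.columns)
cols.remove("Service")
cols.insert(cols.index("ServiceID") + 1, "Service")
merged = merged[cols]
print(merged)
```

Compared with map plus insert, merge needs the extra reordering step, but it extends naturally if more than one column has to be pulled in from the lookup file.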

5 Comments

Thanks for the help. But I need the Service column right after ServiceID; also, there's an extra column.
It worked, thanks. Will index=False remove the extra column?
@Damian - Yes, I think df.to_csv(file, index=False)
Hi @jezrael, in which case will we get the IndexError? Could you explain?
@pyd - Thank you for the comment. I tested it and found that min is not necessary, because pos = df2.columns.get_loc('Lon') + 1 works fine: it adds the new column after the last column.
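
The two points from the comments (the extra column in the saved file, and inserting after the last column) can be checked with a small sketch. The two-row DataFrame here is made up for the illustration, and to_csv is called without a path so that it returns the CSV text instead of writing a file.

```python
import pandas as pd

# Tiny stand-in frame; 'Lon' plays the role of the last column in test.csv.
df = pd.DataFrame({"ServiceID": [1, 2], "Lon": [151.088939, 151.207114]})

# One past the last column: insert() accepts a position equal to len(df.columns).
pos = df.columns.get_loc("Lon") + 1
df.insert(pos, "Service", ["General Practitioner", "Pathology"])

# By default to_csv writes the row index as an unnamed first column,
# which is the "extra column" seen in the saved file.
with_index = df.to_csv()

# index=False leaves the row index out.
without_index = df.to_csv(index=False)

print(with_index.splitlines()[0])
print(without_index.splitlines()[0])
```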
