Greetings Beautiful People!
I'm putting together a visualization for some customer survey data. Unfortunately, there is no data modeling or end-to-end process behind it.
I have multiple columns as follows:
   What Role : Teacher, What Role : Engineer, What Role : Doctor
1  Yes, Yes, No
2  No, No, Yes
3  Yes, No, Yes
So what I want to do is create a new column and convert each Yes into a value that matches its header. For example, if Doctor is Yes, then Doctor would be entered into a new column:
What Role?
1 Teacher, Engineer
2 Doctor
3 Teacher, Doctor
Could this be done by creating a dictionary and then a for loop?
For example:
import pandas as pd
df = pd.read_csv("file.csv")
Dictionary_File = {'What Role?' : 'What Role : Teacher',
'What Role?': 'What Role : Engineer', 'What Role?' : 'What Role : Doctor'}
for k,v in Dictionary_File.items():
(df[k] = df[k] == 'Yes', 'Unsure here' + df[v])
df = df.drop(list(Dictionary_File.values()), axis=1)
When it comes to the for loop, I couldn't think of or find a way to merge the values into something new (other than manually changing every Yes in each column into a new value and then merging?).
Any help would be much appreciated!
Cheers,
You first need to strip the What Role : prefix from the column names with split.
Then use the boolean mask df == 'Yes' to build the joined values with numpy.where:
import numpy as np
# keep only the last word of each column name: Teacher, Engineer, Doctor
c = df.columns.str.split().str[-1]
s = np.where(df == 'Yes', ['{}, '.format(x) for x in c], '')
print (s)
[['Teacher, ' 'Engineer, ' '']
['' '' 'Doctor, ']
['Teacher, ' '' 'Doctor, ']]
df['new'] = pd.Series([''.join(x).strip(', ') for x in s], index=df.index)
print (df)
What Role : Teacher What Role : Engineer What Role : Doctor \
1 Yes Yes No
2 No No Yes
3 Yes No Yes
new
1 Teacher, Engineer
2 Doctor
3 Teacher, Doctor
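For anyone who wants to run this end to end, here is a minimal sketch of the same steps under a few assumptions: the sample frame is typed in by hand in place of file.csv, the names role_cols, labels and picked are mine, and the new column is called What Role? as in the question. It also drops the original columns afterwards, which is what the attempt in the question seemed to be going for.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (assumption: stands in for file.csv)
df = pd.DataFrame(
    {
        "What Role : Teacher": ["Yes", "No", "Yes"],
        "What Role : Engineer": ["Yes", "No", "No"],
        "What Role : Doctor": ["No", "Yes", "Yes"],
    },
    index=[1, 2, 3],
)

# Remember the source columns before a new column is added
role_cols = list(df.columns)

# Last whitespace-separated word of each header: Teacher, Engineer, Doctor
labels = df[role_cols].columns.str.split().str[-1].to_numpy()

# Keep the label where the cell is 'Yes', an empty string elsewhere
mask = (df[role_cols] == "Yes").to_numpy()
picked = np.where(mask, labels, "")

# Join the non-empty labels of each row with ', '
df["What Role?"] = [", ".join(v for v in row if v) for row in picked]

# Drop the original Yes/No columns, keeping only the combined one
df = df.drop(columns=role_cols)
print(df)
```

Joining only the non-empty labels avoids having to strip a trailing ', ' afterwards; otherwise this is the same numpy.where idea as above.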