How do I turn an array of column names into a pandas DataFrame?


I have a bunch of data that represents choices from a large collection, and a classification. Something like:

pizzas = [

I want to turn this into a DataFrame with one column for each topping type and one row for each pizza. Something like:

ham  cheese  ...  feta
  1       1  ...     0
  0       1  ...     1
  0       0  ...     1

(Obviously there will be a lot more columns and rows, but you get the general idea.)

What is the best way to do this?


You can first create a DataFrame with the constructor, then apply get_dummies, and finally group the result by column name and sum, so that duplicate topping columns are merged:

import pandas as pd

pizzas = [

df = pd.DataFrame(pizzas)
print(df)
           0       1          2
0        ham  cheese  pineapple
1      bacon    feta     cheese
2  mushrooms    feta        ham

# dtype=int keeps the indicators as 0/1 (newer pandas versions default to booleans)
df = pd.get_dummies(df, prefix_sep='', prefix='', dtype=int)
print(df)
   bacon  ham  mushrooms  cheese  feta  cheese  ham  pineapple
0      0    1          0       1     0       0    0          1
1      1    0          0       0     1       1    0          0
2      0    0          1       0     1       0    1          0

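# groupby(..., axis=1) is deprecated in recent pandas; grouping the transposed frame by label is equivalent
print(df.T.groupby(level=0).sum().T)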
   bacon  cheese  feta  ham  mushrooms  pineapple
0      0       1     0    1          0          1
1      1       1     1    0          0          0
2      0       0     1    1          1          0
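
For reference, the three steps above can be put together as one self-contained sketch. The contents of pizzas here are an assumption, reconstructed from the printed DataFrame in this answer, and the helper name toppings_to_indicators is hypothetical, used only for illustration. It groups the transposed frame instead of passing axis=1 to groupby, since axis=1 is deprecated in recent pandas.

```python
import pandas as pd

# Assumed sample data, reconstructed from the printed DataFrame above.
pizzas = [
    ["ham", "cheese", "pineapple"],
    ["bacon", "feta", "cheese"],
    ["mushrooms", "feta", "ham"],
]


def toppings_to_indicators(rows):
    """Return a 0/1 DataFrame with one column per distinct topping.

    Hypothetical helper: one row per input list, one column per topping.
    """
    # One row per pizza; columns are topping positions 0, 1, 2, ...
    df = pd.DataFrame(rows)
    # One indicator column per (position, topping) pair, named by topping only.
    dummies = pd.get_dummies(df, prefix="", prefix_sep="")
    # The same topping can appear at several positions, so column labels
    # repeat. Transpose, group by label, sum, and transpose back.
    combined = dummies.T.groupby(level=0).sum().T
    return combined.astype(int)


result = toppings_to_indicators(pizzas)
print(result)
```

If a topping could be listed twice on the same pizza, the sum would give 2 for that cell; clipping the result with result.clip(upper=1) would restore a strict 0/1 indicator.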



