Apache Spark with Python

Devesh

I want to read a Spark DataFrame using Python, convert it to a pandas DataFrame, and then, after doing some data analysis, convert the pandas DataFrame back to a Spark DataFrame. Please suggest how to do this.

Alberto Bonsanto

I strongly recommend taking the time to read the Spark documentation carefully, focusing on the PySpark implementation, because it has more examples than the others.

This is easy: if you read the documentation for SQLContext.createDataFrame, you can see that it accepts the following structures as data:

createDataFrame(data, schema=None, samplingRatio=None)

data – an RDD of Row/tuple/list/dict, list, or pandas.DataFrame.

Besides, if you read the documentation for DataFrames, you will notice a method called toPandas, which converts a Spark DataFrame into a pandas DataFrame.
