PySpark: java.lang.OutOfMemoryError: Java heap space

pg2455

I have recently been using PySpark with IPython on a server with 24 CPUs and 32 GB of RAM. It runs on a single machine only. In my process, I want to collect a huge amount of data, as shown in the code below:

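# getTagsAndText returns a 3-tuple (x, text, tags), as the flatMap step implies
train_dataRDD = (train.map(getTagsAndText)
                 .filter(lambda rec: rec[-1] != [])  # drop records with no tags
                 # emit one (tag, (x, text)) pair per tag; indexing replaces the
                 # tuple-unpacking lambda arguments, which Python 3 no longer allows
                 .flatMap(lambda rec: [(tag, (rec[0], rec[1])) for tag in rec[2]])
                 .groupByKey()
                 .mapValues(list))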

When I do

training_data = train_dataRDD.collectAsMap()

It fails with java.lang.OutOfMemoryError: Java heap space. After this error I cannot perform any further operations on Spark, because it loses its connection to the JVM and reports Py4JNetworkError: Cannot connect to the java server.

It looks like the heap space is too small. How can I set a larger limit?
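
For a sense of what collectAsMap() has to hold on the driver, here is a minimal plain-Python sketch (no Spark involved) of the same map/filter/flatMap/groupByKey chain. The helper get_tags_and_text and the sample records are hypothetical stand-ins, since getTagsAndText is not shown in the question; the only assumption taken from the question is that it yields an (id, text, tags) triple.

```python
from collections import defaultdict


def get_tags_and_text(record):
    """Hypothetical stand-in for getTagsAndText: returns (id, text, tags)."""
    return record["id"], record["text"], record["tags"]


def collect_by_tag(records):
    """Plain-Python equivalent of the pipeline followed by collectAsMap()."""
    by_tag = defaultdict(list)
    for rec_id, text, tags in map(get_tags_and_text, records):
        # A record with an empty tag list contributes nothing (the filter step).
        for tag in tags:
            # One (id, text) entry per tag (the flatMap and groupByKey steps).
            by_tag[tag].append((rec_id, text))
    return dict(by_tag)


# Made-up sample data, for illustration only.
sample = [
    {"id": 1, "text": "spark heap error", "tags": ["spark", "java"]},
    {"id": 2, "text": "untagged post", "tags": []},
    {"id": 3, "text": "py4j gateway", "tags": ["spark"]},
]
training_data = collect_by_tag(sample)
```

Each text appears once for every tag it carries, and the whole dictionary ends up in a single process. In Spark that process is the driver, so the size of this dictionary is what the driver heap has to accommodate.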

EDIT:

Things I tried before running:

sc._conf.set('spark.executor.memory','32g').set('spark.driver.memory','32g').set('spark.driver.maxResultSize','0')

I changed the Spark options as per the documentation here (search the page for spark.executor.extraJavaOptions): http://spark.apache.org/docs/1.2.1/configuration.html

It says that I can avoid OOMs by setting the spark.executor.memory option. I did that, but it does not seem to be working.

pg2455

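After trying out loads of configuration parameters, I found that only one needs to be changed to enable more heap space: spark.driver.memory. collectAsMap() brings the whole result back to the driver, so it is the driver's heap that overflows. Setting the value through sc._conf.set(...) has no effect because the driver JVM is already running at that point; it has to be set before the JVM starts, for example in the defaults file: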

sudo vim $SPARK_HOME/conf/spark-defaults.conf
# uncomment the spark.driver.memory line and adjust it to your needs; I changed it to:
spark.driver.memory 15g
# press Esc, then type :wq! to save and exit vim

Close your existing Spark application and rerun it. You should not encounter this error again, as long as the collected data fits in the new driver heap. :)
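
If editing spark-defaults.conf is not convenient, the same setting can usually be passed on the command line (pyspark --driver-memory 15g) or, when the SparkContext is created from a plain Python or IPython session, through the PYSPARK_SUBMIT_ARGS environment variable. The following is a sketch under that assumption, not something tested against Spark 1.2.1: the helper names driver_memory_submit_args and make_context, the app name, and the local[*] master are all illustrative, and the trailing pyspark-shell token is what more recent PySpark releases expect.

```python
import os


def driver_memory_submit_args(memory="15g"):
    """Build a PYSPARK_SUBMIT_ARGS value that requests the given driver heap."""
    # Assumption: "pyspark-shell" must be the last token so that PySpark
    # launches its gateway JVM (true for more recent releases).
    return "--driver-memory {} pyspark-shell".format(memory)


# This must run before the first SparkContext is created, because the driver
# JVM is started at that moment and its heap size cannot be changed afterwards.
os.environ["PYSPARK_SUBMIT_ARGS"] = driver_memory_submit_args("15g")


def make_context():
    """Create a SparkContext; pyspark is imported lazily so the lines above
    can run on a machine without Spark installed."""
    from pyspark import SparkContext
    return SparkContext("local[*]", "driver-memory-demo")
```

Calling make_context() after the environment variable is set should start the JVM with the requested heap. If a SparkContext already exists in the session, it has to be stopped and the Python process restarted for the new value to apply.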
