I'm running Spark 1.4 on Databricks Cloud. I loaded a file into my S3 bucket and mounted it. Mounting worked:

dbutils.fs.mount("s3n://%s:%s@%s" % (ACCESS_KEY, SECRET_KEY, AWS_BUCKET_NAME), "/mnt/%s" % MOUNT_NAME)

But I'm having trouble creating an RDD:

sc.parallelize([1,2,3])
rdd = sc.textFiles("/mnt/GDELT_2014_EVENTS/GDELT_2014.csv")

Any ideas?
You've done a great job getting your data mounted into DBFS, and it looks like you just have a small typo: I suspect you want sc.textFile rather than sc.textFiles. Best of luck with your adventures with Spark.
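For reference, here is a minimal sketch of the corrected call, reusing the mount point and file name from the question (adjust these to your own setup):

# PySpark's method is sc.textFile (singular); sc.textFiles does not exist.
rdd = sc.textFile("/mnt/GDELT_2014_EVENTS/GDELT_2014.csv")

# Quick sanity checks to confirm the file was actually read.
print(rdd.count())   # number of lines in the CSV
print(rdd.first())   # first line of the file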