I have a JavaRDD containing some JSON documents, and I want to filter it based on a list of IDs held in an ArrayList. Basically, I want to get all the documents in the JavaRDD whose ID is in the ArrayList. I know this can be done easily with a Dataset, but I'm not sure how to do it with a JavaRDD.
javaRdd.filter(json -> arrayList.contains(json.get("id")))
That's a high-level snippet, where json is whatever is stored in each row of your RDD (I'm not sure what kind of structure is there or how the JSON is represented), arrayList is your list of IDs, and json.get("id") just denotes some way of obtaining the ID from your JSON. Again, without more info it's hard to be more specific.
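To make that a bit more concrete, here is a minimal sketch of the predicate on its own. It assumes each document has already been parsed into a Map<String, Object> with an "id" key, which may not match how your RDD actually stores the JSON. The IdFilter class and its matches method are hypothetical names made up for this illustration. The sketch also copies the IDs into a HashSet, which is a swap from the ArrayList in the question, so that each lookup does not scan the whole list.

```java
import java.io.Serializable;
import java.util.Collection;
import java.util.HashSet;
import java.util.Map;

// Hypothetical helper: holds the wanted IDs and tests one parsed document.
// It is Serializable so Spark can ship it to executors inside a closure.
final class IdFilter implements Serializable {
    private static final long serialVersionUID = 1L;

    // HashSet gives constant-time lookups; ArrayList.contains is linear.
    private final HashSet<String> ids;

    IdFilter(Collection<String> ids) {
        this.ids = new HashSet<>(ids);
    }

    // Assumption: the document is a Map with an "id" entry.
    // The ID is compared by its string form, so 42 matches "42".
    boolean matches(Map<String, Object> doc) {
        Object id = doc.get("id");
        return id != null && ids.contains(id.toString());
    }
}
```

If javaRdd is a JavaRDD<Map<String, Object>>, this could be plugged in as javaRdd.filter(new IdFilter(arrayList)::matches). For a very large ID list, wrapping the set in a broadcast variable via JavaSparkContext.broadcast is worth considering, so it is sent to each executor once rather than with every task.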