How can we replace an element in the Spark Scala shell?
For example: val t = sc.parallelize(Seq(("100", List("2", "-4", "NA", "6", "8", "2"))))
I want to replace NA with 0.
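You can replace NA with 0 by mapping over the list inside each tuple. Because RDDs are immutable, this gives you a new RDD rather than modifying t. Note that mixing the Int 0 with the remaining String elements widens the element type to List[Any].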
scala> val t= sc.parallelize(Seq(("100",List("2","-4","NA","6","8","2"))))
t: org.apache.spark.rdd.RDD[(String, List[String])] = ParallelCollectionRDD[0] at parallelize at <console>:21
scala> val newRDD = t.map( x => (x._1,x._2.map{case "NA" => 0; case x => x }))
newRDD: org.apache.spark.rdd.RDD[(String, List[Any])] = MapPartitionsRDD[3] at map at <console>:23
scala> newRDD.collect
res5: Array[(String, List[Any])] = Array((100,List(2, -4, 0, 6, 8, 2)))
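If the List[Any] element type in the result above is a problem, one option is to replace "NA" with the string "0" so the list stays a List[String], and then parse to Int if numbers are needed. The following is a minimal sketch in plain Scala so it can be tried without a cluster. The names ReplaceNA, replaceNA and toInts are hypothetical, and toInts assumes every entry other than "NA" is a valid integer string.

```scala
// Hypothetical helper object; the names are not from the original answer.
object ReplaceNA {
  // Replace every "NA" with the string "0", keeping the element type as String.
  def replaceNA(xs: List[String]): List[String] =
    xs.map(x => if (x == "NA") "0" else x)

  // Replace "NA" and then parse each entry to Int.
  // Assumption: all remaining entries are valid integer strings.
  def toInts(xs: List[String]): List[Int] =
    replaceNA(xs).map(_.toInt)

  def main(args: Array[String]): Unit = {
    val row = ("100", List("2", "-4", "NA", "6", "8", "2"))
    println(replaceNA(row._2))
    println(toInts(row._2))
    // In spark-shell the same logic could be applied to the RDD t, for example:
    //   val cleaned = t.mapValues(_.map(x => if (x == "NA") "0" else x))
    //   val asInts  = cleaned.mapValues(_.map(_.toInt))
  }
}
```

Using mapValues instead of map leaves the key untouched and only transforms the list, which avoids rebuilding the tuple by hand.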