Add identical rows to a Spark Dataframe using an integer

Mamaf Published at Java

Mamaf

Assume the following DataFrame df1:

df1:
+---------+--------+-------+
|A        |B       |C      |
+---------+--------+-------+
|toto     |tata    |titi   |
+---------+--------+-------+

I have an integer N = 3, and I want to use it to build a DataFrame df2 that contains N copies of each row of df1:

df2:
+---------+--------+-------+
|A        |B       |C      |
+---------+--------+-------+
|toto     |tata    |titi   |
|toto     |tata    |titi   |
|toto     |tata    |titi   |
+---------+--------+-------+

Any ideas?

Shu

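From Spark 2.4 onward, you can use the arrays_zip, array_repeat and explode functions for this: array_repeat builds an array holding N copies of a value, and explode emits one output row per array element.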

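import org.apache.spark.sql.functions._
import spark.implicits._ // needed for toDF; already in scope in spark-shell

val df = Seq(("toto", "tata", "titi")).toDF("A", "B", "C")

// Repeat a zipped value 3 times, explode it into 3 rows, then drop the helper column.
df.withColumn("arr", explode(array_repeat(arrays_zip(array("A"), array("B"), array("C")), 3))).
  drop("arr").
  show(false)

// or the dynamic way: take the column list from the DataFrame itself
// (array(...) requires the columns to share a type, as they do here)
val cols = df.columns.map(x => col(x))
df.withColumn("arr", explode(array_repeat(arrays_zip(array(cols: _*)), 3))).
  drop("arr").
  show(false)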

//+----+----+----+
//|A   |B   |C   |
//+----+----+----+
//|toto|tata|titi|
//|toto|tata|titi|
//|toto|tata|titi|
//+----+----+----+
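
Since the exploded column is dropped right away, its contents do not matter, only its length. As a minimal sketch under that observation, the same idea can be wrapped in a small helper that repeats a literal instead of zipping the real columns. The helper name repeatRows and the temporary column name _rep are assumptions made for illustration, not part of the original answer, and the sketch assumes df has no existing column called _rep.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{array_repeat, explode, lit}

// Local session for illustration only; in spark-shell a `spark` value already exists.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("repeat-rows")
  .getOrCreate()
import spark.implicits._

// Hypothetical helper (the name is ours): emit every row of `df` n times.
// array_repeat(lit(0), n) builds an array of n dummy values, explode turns it
// into n rows, and the dummy column is dropped afterwards.
// Assumes `df` has no column named "_rep".
def repeatRows(df: DataFrame, n: Int): DataFrame =
  df.withColumn("_rep", explode(array_repeat(lit(0), n))).drop("_rep")

val df1 = Seq(("toto", "tata", "titi")).toDF("A", "B", "C")
val df2 = repeatRows(df1, 3)
df2.show(false)
```

Because the repeated value is a literal, this variant does not require the columns of df to share a type. Note that explode drops rows whose array is empty, so n = 0 should give an empty DataFrame rather than the original rows.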

Edited at 2020-10-19
