Add a new calculated column from 2 values in RDD

Userrrrrrrr

I have 2 paired RDDs that I joined on the same key, and I now want to add a new calculated column using 2 columns from the values part. The type of the joined RDD is:

RDD[((String, Int), Iterable[((String, DateTime, Int,Int), (String, DateTime, String, String))])]

I want to add another field to the new RDD that shows the delta between the 2 DateTime fields.

How can I do this?

DNA

You should be able to do this using map to extend the 2-tuples into 3-tuples, roughly as follows:

joined.map{ case (key, values) =>
  val delta = computeDelta(values)
  (key, values, delta)
}

Or, more concisely:

joined.map{ case (k, vs) => (k, vs, computeDelta(vs)) }

Then your computeDelta function can take a pair from values, whose first and second elements are of type (String, DateTime, Int, Int) and (String, DateTime, String, String), get the second item (the DateTime) from each, and compute the delta using whatever DateTime functions are convenient.
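The computeDelta helper described above might look roughly like the sketch below. It is only an illustration under a few assumptions: java.time.Instant stands in for the question's DateTime so that the example runs without extra dependencies, the names DeltaSketch, LeftRow, RightRow and pairDelta are made up for this example, and it assumes you want a single delta per key, taken from the first pair in the Iterable.

```scala
import java.time.{Duration, Instant}

object DeltaSketch {
  // Assumption: Instant stands in for the question's DateTime type.
  type DateTime = Instant

  // Hypothetical aliases for the two sides of each joined pair.
  type LeftRow  = (String, DateTime, Int, Int)
  type RightRow = (String, DateTime, String, String)

  // Delta in milliseconds between the DateTime fields (the second
  // item of each tuple) of one joined pair.
  def pairDelta(pair: (LeftRow, RightRow)): Long = {
    val (left, right) = pair
    Duration.between(left._2, right._2).toMillis
  }

  // Assumption: one delta per key, computed from the first pair.
  // Returns 0L for an empty Iterable.
  def computeDelta(values: Iterable[(LeftRow, RightRow)]): Long =
    values.headOption.map(pairDelta).getOrElse(0L)

  def main(args: Array[String]): Unit = {
    val t0 = Instant.parse("2015-01-01T00:00:00Z")
    val t1 = t0.plusSeconds(90)
    val values: Iterable[(LeftRow, RightRow)] =
      List((("a", t0, 1, 2), ("b", t1, "x", "y")))
    println(computeDelta(values)) // 90 seconds apart, prints 90000
  }
}
```

If you would rather have one delta per joined pair, values.map(pairDelta) gives an Iterable[Long] instead of a single Long.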

If you want your output RDD to still be a paired RDD, then you will need to wrap the new delta field into a tuple, roughly as follows:

joined.mapValues{ values =>
  val delta = computeDelta(values)
  (values, delta)
}

which will preserve the original pair RDD keys, and give you values of type (Iterable[((String, DateTime, Int, Int), (String, DateTime, String, String))], Long)

(assuming you are calculating deltas of type Long)
