Delete Duplicates in Array based on matching id. Rails

Tinus Wagner

So I've got a .csv file that I've imported into an array. The fields are all comma separated, so I've gone ahead and made a nice array for them.

Now I'm trying to find records with matching IDs so I can remove duplicates and only keep the last one encountered for each ID.

I've imported the file into an array, but for some reason I can't get a tool like uniq to display the new unique list, even though when I call .length on it, it returns the right number of rows.

Any help would be greatly appreciated.

CODE

    lines = []
    i = 0

    file = File.open("./properties.csv", "r")

    elements = Array[]
    element2 = Array[]
    output = Array[]

    while (line = file.gets)
      i += 1
      # use split to break each line up on commas
      arr = line.split(',')
      elements.push({ id: arr[0], streetAddress: arr[1], town: arr[2], valuationDate: arr[3], value: arr[4] })
    end

    file.close

    # Group rows by id and keep only the groups that contain more than one row
    element2 = elements.group_by { |c| c[:id] }.values.select { |group| group.size > 1 }


    output = (element2.uniq)
    puts output

    puts element2.length
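
A note on what the snippet above computes, as a minimal sketch rather than a definitive fix: `group_by` followed by `select` returns an array of groups (arrays of hashes) for ids that occur more than once, so `uniq` compares whole groups, which are already distinct, and ids that occur only once are dropped. The `rows` data and the `last_per_id` name below are assumptions made up for illustration. One possible way to keep the last row per id in a plain array is to reverse it, call `uniq` with a block, and reverse back.

```ruby
# Illustrative rows (hypothetical data shaped like the hashes built above).
rows = [
  { id: "1", value: "280000" },
  { id: "2", value: "280000" },
  { id: "1", value: "330000" },
  { id: "4", value: "215000" }
]

# What the original snippet builds: one array per id that occurs more than once.
groups = rows.group_by { |row| row[:id] }.values.select { |group| group.size > 1 }
# groups holds whole groups, so groups.uniq returns the same groups unchanged,
# and ids that occur only once ("2" and "4" here) are no longer present.

# uniq with a block keeps the first row seen for each id, so reversing first
# keeps the last row per id. The final reverse leaves the rows in the order
# of their last occurrence in the input.
last_per_id = rows.reverse.uniq { |row| row[:id] }.reverse

last_per_id.each { |row| puts row }
```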

SAMPLE .CSV FILE

    ID,Street address,Town,Valuation date,Value
    1,1 Northburn RD,WANAKA,1/1/2015,280000
    2,1 Mount Ida PL,WANAKA,1/1/2015,280000
    3,1 Mount Linton AVE,WANAKA,1/1/2015,780000
    1,1 Northburn RD,WANAKA,1/1/2015,330000
    2,1 Mount Ida PL,WANAKA,1/1/2015,330000
    3,1 Mount Linton AVE,WANAKA,1/1/2015,830000
    1,1 Northburn RD,WANAKA,1/1/2016,340000
    2,1 Mount Ida PL,WANAKA,1/1/2016,340000
    3,1 Mount Linton AVE,WANAKA,1/1/2016,840000
    4,1 Kamahi ST,WANAKA,1/1/2016,215000
    5,1 Kapuka LANE,WANAKA,1/1/2016,209000
    6,1 Mohua MEWS,WANAKA,1/1/2016,620000
    7,1 Kakapo CT,WANAKA,1/1/2016,490000
    8,1 Mt Gold PL,WANAKA,1/1/2016,1320000
    9,1 Penrith Park DR,WANAKA,1/1/2016,1310000

Tinus Wagner

So I've actually swapped my approach to using hashes, which seems to automatically remove duplicates and leave the last encountered record intact. Can anyone shed some light here?

    require 'csv'

    element = {}

    CSV.foreach("properties.csv", :headers => true, :header_converters => :symbol) do |row|
        element[row.fields[0]] = Hash[row.headers[1..-1].zip(row.fields[1..-1])]
    end

    puts element["1"]

    element.each do |key, value|
        puts key 
        puts value
    end

    puts "#{element.length} records returned" 
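
To shed some light on why this works: a Ruby Hash holds one value per key, so assigning to a key that already exists replaces the earlier value, and the last row read for each id is what remains. A minimal sketch with inline CSV data follows. The `data` string and the `records` name are assumptions for illustration, standing in for properties.csv and the `element` hash.

```ruby
require 'csv'

# Inline sample standing in for properties.csv (illustrative data only).
data = <<~ROWS
  ID,Street address,Town,Valuation date,Value
  1,1 Northburn RD,WANAKA,1/1/2015,280000
  2,1 Mount Ida PL,WANAKA,1/1/2015,280000
  1,1 Northburn RD,WANAKA,1/1/2016,340000
ROWS

records = {}

CSV.parse(data, headers: true, header_converters: :symbol).each do |row|
  # Assigning to an existing key overwrites its value, so a later row
  # with the same id replaces the earlier one.
  records[row[:id]] = row.to_h
end

puts records.length
puts records["1"][:value]
```

Three data rows go in, but only two keys come out, and the entry for id "1" holds the values from the later row.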

To keep the first matching element, instead of the last, you can do a key existence check before assigning the value. This can be done like so:

    CSV.foreach("properties.csv", :headers => true, :header_converters => :symbol) do |row|
      key = row.fields[0]
      if !element.key?(key)
        element[key] = Hash[row.headers[1..-1].zip(row.fields[1..-1])]
      end
    end

which can also be written more concisely like this:

    CSV.foreach("properties.csv", :headers => true, :header_converters => :symbol) do |row|
      element[row.fields[0]] ||= Hash[row.headers[1..-1].zip(row.fields[1..-1])]
    end

Note that these approaches, which preserve the first record found for a key, can perform better than the version that preserves the last record found for a key. This is because of work avoidance: when a key is already present, the hash value is never built, so the slice and zip calls in this code are skipped for every duplicate row.
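
The work avoidance can be observed directly: `hash[key] ||= expr` only evaluates `expr` when the key is missing (or holds nil or false). A small sketch follows, where `build_value`, `first_seen`, and the `calls` counter are hypothetical names introduced purely to count how often the value is built.

```ruby
# Counts how many times the value hash is built.
calls = 0

# Hypothetical helper standing in for the slice-and-zip step.
build_value = lambda do |fields|
  calls += 1
  { town: fields[1], value: fields[2] }
end

# Illustrative rows: id, town, value.
rows = [
  ["1", "WANAKA", "280000"],
  ["2", "WANAKA", "280000"],
  ["1", "WANAKA", "340000"]
]

first_seen = {}
rows.each do |fields|
  # The right-hand side runs only when first_seen[fields[0]] is nil or false.
  first_seen[fields[0]] ||= build_value.call(fields)
end

puts first_seen["1"][:value]
puts calls
```

How much this saves in practice depends on how many duplicate rows the file contains, since every row is still parsed by CSV either way.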
