SQL Query: Deleting rows from PostgreSQL with same values

Zeliax

I have a database in which I have managed to get some duplicate rows.

The database contains the following values:

--------------------------------------------------------------
| id  | did    | sensorid | timestamp | data  | db_timestamp |
--------------------------------------------------------------
| int | string | int      | bigint    | jsonb | bigint       |
--------------------------------------------------------------

The timestamp and db_timestamp are unix timestamps (milliseconds since January 1, 1970 12:00:00 AM)

I have managed to get a lot of duplicate values and I need a fast method for "discarding" them. The values that show me that they are duplicates are the did, sensorid and the timestamp. This means that if I find a rows where these are the same, then they are duplicates.

I have made the following query to find the duplicates, now I just need to find out how to delete them.

SELECT did, sensorid, timestamp, COUNT(*)
FROM <db_name>
GROUP BY did, sensorid, timestamp
HAVING COUNT(*) > 1
ORDER BY sensorid, timestamp;

My db contains almost 1.200.000.000 rows and know I have way too many of these duplicates and therefore I have created a python script that will delete them for me. I have however managed to find out that using my python script doesn't go as fast as I need it to, so therefore I hope that using an SQL query to delete my duplicates can make do.

Here is a print out of the output from my above query:

----------------------------------------------------
|       did       | sensorid |  timestamp  | count |
|"358188056470108"|    910   |1492487704000|  61   |
|"358188056470108"|    910   |1492487705000|  61   |
----------------------------------------------------

I know that the did could be a bigint or alike to improve performance, but there's a reason for not doing so.

I want to delete all of the duplicates, however it is important that I keep the one of the rows, ideally the row with the lowest row id which will then be the "original" row.

I hope that someone here can help me create such a query.

JohnHC

Use a CTE with a row_number() to identify the id's to delete, then delete them

with CTE as
(
select t1.*, row_number() over(partition by did, sensorid order by id) as rn
from MyTable t1
)
delete
from MyTable 
where id in (select id from CTE where rn > 1)

Collected from the Internet

Please contact [email protected] to delete if infringement.

edited at
0

Comments

0 comments
Login to comment

Related

From Dev

SQL query to join rows with same values

From Dev

SQL Query for finding rows with same set of values

From Dev

Query performance increase from deleting rows in SQL database?

From Dev

SQL query: same rows

From Dev

SQL query: same rows

From Dev

Find rows with same values PostgreSQL

From Dev

Find rows with same values PostgreSQL

From Dev

PostgreSQL: deleting rows referenced from another table

From Dev

SQL PIVOT query is repeating the same values for all pivoted rows

From Dev

Sql query for deleting rows before a certain year

From Dev

SQL Delete rows based on query from same table?

From Dev

Count rows from another SQL table in the same query

From Dev

SQL Delete rows based on query from same table?

From Dev

Deleting all same values from Generic List

From Dev

SQL Server - How to SELECT values from different rows but in the same table

From Dev

SQL query to get unique values from repeating rows

From Dev

SQL Query to get differences between the values from two rows

From Dev

Merging sql rows with same values

From Dev

Merging sql rows with same values

From Java

Update multiple rows in same query using PostgreSQL

From Java

PostgreSQL select rows having same column values

From Dev

Deleting rows that contain a the same a values as a different row but in different columns

From Dev

sql query get multiple values from same column for one row

From Dev

SQL query to get computed values from the same column in one row

From Dev

MySql Query rows that same values in 2 columns

From Dev

Oracle: sql query for deleting duplicate rows based on a group

From Dev

Deleting all rows from table that share the same ID

From Dev

How can I stop SQL Server query from returning multile rows for the same item ID in this query

From Dev

Deleting rows from Pandas dataframe based on groupby values

Related Related

  1. 1

    SQL query to join rows with same values

  2. 2

    SQL Query for finding rows with same set of values

  3. 3

    Query performance increase from deleting rows in SQL database?

  4. 4

    SQL query: same rows

  5. 5

    SQL query: same rows

  6. 6

    Find rows with same values PostgreSQL

  7. 7

    Find rows with same values PostgreSQL

  8. 8

    PostgreSQL: deleting rows referenced from another table

  9. 9

    SQL PIVOT query is repeating the same values for all pivoted rows

  10. 10

    Sql query for deleting rows before a certain year

  11. 11

    SQL Delete rows based on query from same table?

  12. 12

    Count rows from another SQL table in the same query

  13. 13

    SQL Delete rows based on query from same table?

  14. 14

    Deleting all same values from Generic List

  15. 15

    SQL Server - How to SELECT values from different rows but in the same table

  16. 16

    SQL query to get unique values from repeating rows

  17. 17

    SQL Query to get differences between the values from two rows

  18. 18

    Merging sql rows with same values

  19. 19

    Merging sql rows with same values

  20. 20

    Update multiple rows in same query using PostgreSQL

  21. 21

    PostgreSQL select rows having same column values

  22. 22

    Deleting rows that contain a the same a values as a different row but in different columns

  23. 23

    sql query get multiple values from same column for one row

  24. 24

    SQL query to get computed values from the same column in one row

  25. 25

    MySql Query rows that same values in 2 columns

  26. 26

    Oracle: sql query for deleting duplicate rows based on a group

  27. 27

    Deleting all rows from table that share the same ID

  28. 28

    How can I stop SQL Server query from returning multile rows for the same item ID in this query

  29. 29

    Deleting rows from Pandas dataframe based on groupby values

HotTag

Archive