I have a single SQL table that contains multiple entries for each customerID (some customerID's only have one entry which I want to keep). I need to remove all but the most recent entry per customerID, using the invoiceDate field as my marker.
So I need to go from this:
+------------+-------------+-----------+
| customerID | invoiceDate | invoiceID |
+------------+-------------+-----------+
| 1 | 1393995600 | xx |
| 1 | 1373688000 | xx |
| 1 | 1365220800 | xx |
| 2 | 1265220800 | xx |
| 2 | 1173688000 | xx |
| 3 | 1325330800 | xx |
+------------+-------------+-----------+
To this:
+------------+-------------+-----------+
| customerID | invoiceDate | invoiceID |
+------------+-------------+-----------+
| 1 | 1393995600 | xx |
| 2 | 1265220800 | xx |
| 3 | 1325330800 | xx |
+------------+-------------+-----------+
Any guidance would be greatly appreciated!
SELECT * FROM t
WHERE invoiceDate NOT IN (
SELECT MAX(invoiceDate)
-- "FROM t AS t2" isn't supported by MySQL, see http://stackoverflow.com/a/14302701/227576
FROM (SELECT * FROM t) AS t2
WHERE t2.customerId = t.customerId
GROUP BY t2.customerId
)
This may take a long time on a big database.
DELETE FROM t
WHERE invoiceDate NOT IN (
SELECT MAX(invoiceDate)
-- "FROM t AS t2" isn't supported by MySQL, see http://stackoverflow.com/a/14302701/227576
FROM (SELECT * FROM t) AS t2
WHERE t2.customerId = t.customerId
GROUP BY t2.customerId
)
See http://sqlfiddle.com/#!9/6e031/1
If you have multiple rows whose date is the most recent for the same customer, you would have to look for duplicates and decide which one you want to keep yourself. For instance, look at customerId 2 on the SQL fiddle link above.
Collected from the Internet
Please contact [email protected] to delete if infringement.
Comments