Select last record from data table for each device in devices table

debugcn Published at Dev

Binakot

I have a problem with the executing speed of my sql query to postgres database.

I have 2 tables:

table 1: DEVICES

ID | NAME
------------------
1  | first device
2  | second device

table 2: DATA

ID | DEVICE_ID | TIME                | DATA
--------------------------------------------
1  | 1         | 2016-07-14 2:00:00  | data1
2  | 1         | 2016-07-14 1:00:00  | data2
3  | 2         | 2016-07-14 4:00:00  | data3
4  | 1         | 2016-07-14 3:00:00  | data4
5  | 2         | 2016-07-14 6:00:00  | data5
6  | 2         | 2016-07-14 5:00:00  | data6

I need get this select's result table:

ID | DEVICE_ID | TIME               | DATA
-------------------------------------------
4  | 1         | 2016-07-14 3:00:00 | data4
5  | 2         | 2016-07-14 6:00:00 | data5

i.e. for each device in devices table I need to get only one data record with the last TIME value.

This is my sql query:

SELECT * FROM db.data d 
    WHERE d.time = (
        SELECT MAX(d2.time) FROM db.data d2 
             WHERE d2.device_id = d.device_id);

This is HQL query equivalent:

SELECT d FROM Data d 
    WHERE d.time = (
        SELECT MAX(d2.time) FROM Data d2 
            WHERE d2.device.id = t2.device.id)

Yes, I use Hibernate ORM in my project - may this info will be useful for someone.

I got correct answer on my queries, BUT it's too long - about 5-10 seconds on 10k records in data table and only 2 devices in devices table. It's terrible.

First of all, I thought that problem is in Hibernate. But native sql query from psql in linux terminal execute the same time as through hibernate.

How can I optimize my query? This query is too complexity:

 O(device_count * data_count^2)

Chitharanjan Das

Since you're using Postgres, you could use window functions to achieve this, like so:

select
    sq.id,
    sq.device_id,
    sq.time,
    sq.data
from (
    select
        data.*,
        row_number() over (partition by data.device_id order by data.time desc) as rnk
    from
        data
) sq
where
    sq.rnk = 1

The row_number() window function first ranks the rows in the data table on the basis of the device_id and time columns, and the outer query then picks the highest-ranked rows.

Collected from the Internet

Please contact [email protected] to delete if infringement.