Question title isn't the best. I have two parent tables A and B and their association table AB. I need columns from A and B, none from AB. For a given row in A, the multiple matching rows in B will always have identical values for the columns I want. Currently, I join the 3 tables and do a DISTINCT on B, but it is incredibly time-consuming. Here's example data:
Table A
PK_A col1
1 a
2 b
3 b
TAble B
PK_B col1
5 R
6 S
7 T
8 R
9 R
Table AB
FK_A FK_B
1 5
1 8
1 9
select
A.col1,
B.col1
from A
join AB
on AB.FK_A = A.PK_A
and AB,FK_B = B.PK_B
join B
on B.PK_B = AB.FK_B
returns
a R
a R
a R
I add DISTINCT to get what I want.
Is there a speedier way to do this by forcing a join using only a single matching row in AB? Sort of like where FK_B = MAX(FK_B)?
It will always be the case that the B.col1 values I want for a given value of A.col1 are identical, the unique parameters in the association table aren't of interest in this query.
Here is how you would do what you ask, I'm not sure if it would be faster than distinct but it will use just 1 row from the AB table.
select
A.col1,
B.col1
from A
join (select
FK_A,
FK_B,
row_number() OVER (partition by FK_A ORDER BY FK_B) as rn
from AB
) jtable ON A.PK_A = jtalbe.FK_A AND rn = 1
join B ON B.PK_B = jtable.FK_B
How this works:
I use the row_number() OVER
to "pull out" each row we want to join on from the joining table in a sub-query.
This may be faster than distinct but probably is dependent on what indexes you have defined relative size of tables etc.
Collected from the Internet
Please contact [email protected] to delete if infringement.
Comments