I have a SQL question from one of the well known IT company couple month ago when they were interviewing me, and I never got it figured out.
An order can have multiple lines - For ex., if a customer ordered cookies,
chocolates, and bread, this would count as 3 lines in one order. The question
is to find the number of orders in each line count. The output of this query
would be something like 100 orders had 1 line, 70 orders had 2 lines, 30 had 3
lines, and so on. This table has two columns - order_id and line_id
Sample Data:
order_id line_id
1 cookies
1 chocolates
1 bread
2 cookies
2 bread
3 chocolates
3 cookies
4 milk
desired output:
orders line
1 1
2 2
1 3
So generally speaking, we have a very large data set, and the line_id per order_id can be ranging from 1 to infinite(Theoretically speaking).
The desired output for the general case is:
orders line
100 1
70 2
30 3
etc..
How can I write a query to find the total number of orders per line count=1,2,3... etc
My thought on this problem is to first subquery the count of line_id per order_id.
And then select the subquery along with a list of values as the second column ranging from 1 to max(lines_id per order)
Test Data:
create table data
(
order_id int,
line_id char(50)
);
insert into data
values(1, 'cookies'),
(1, 'chocolates'),
(1, 'bread'),
(2, 'bread'),
(2, 'cookies'),
(3, 'chocolates'),
(3, 'cookies'),
(4, 'milk');
Since order_id=1 has 3 lines,
order_id=2 has 2 lines,
order_id=3 has 2 lines,
order_id=4 has 1 line.
Thus it yield our solution:
orders line
1 1
2 2
1 3
This is because both order_id = 2 and 3 has 2 lines. So it would mean 2 orders has line = 2.
So far, I have:
select lines,
sum(case when orders_per_line = '1' then 1 else 0),
sum(case when orders_per_line = '2' then 1 else 0),
sum(case when orders_per_line = '3' then 1 else 0)
from(
select lines, order_id, count(*) as orders_per_line from data
where lines in ('1, '2', '3')
group by order_id, lines
)
group by lines
My query is wrong, as I only want 2 columns, and also creating a sequence of numbers ranging from 1 to max(lines per order) is also wrong.
Any suggestions?
Thanks in advance!
Try this:
Select Count(*) as Orders, Lines from (
Select order_id, Count(*) as Lines from data group by order_id
)query group by Lines
For exmaple, look at this sqlfiddle
Collected from the Internet
Please contact [email protected] to delete if infringement.
Comments