The definition (taken from O'Reilly's "Learning SQL on SQL Server 2005") of a correlated subquery:
A correlated subquery is an inner subquery whose information is referenced by the main, outer query such that the inner query may be thought of as being executed repeatedly.
And the definition of an uncorrelated subquery:
A noncorrelated subquery is a subquery that is independent of the outer query. In other words, the subquery could be executed on its own.
This is all fine and dandy, but what's the use?
I'll put up two simple examples of usage, which should make clear which kind of questions can be answered with (un)correlated subqueries:
Example 1: get the number of "open" orders, the number of "waiting" orders and the open vs. waiting ratio
SELECT COUNT(*) AS o_open, (SELECT COUNT(*) FROM sales WHERE order_status = 'waiting') AS o_waiting, (COUNT(*))/ (SELECT COUNT(*) FROM sales WHERE order_status = 'waiting') AS open_waiting_ratio FROM SALES WHERE order_status = 'open';
You can see that the queries are uncorrelated, and they better be, as the where conditions are different.
Example 2: we want to see the customer number, the amount of the customer's last order and the total amount ordered by each customer
SELECT s.cust_no, s.total_value AS order_value, (SELECT SUM(i_s.total_value) FROM sales i_s WHERE i_s.cust_no = s.cust_no) total_ordered FROM sales s WHERE s.order_date = (SELECT MAX(ii_s.order_date) FROM sales ii_s WHERE ii_s.cust_no = s.cust_no);
You can see that both subqueries are correlated, the first one gets total amount ordered for each customer, the second one gets the last order date (field is datetime) and helps determine the last order in the main query.
This should make you aware that some kind of set based logic is often disguised as procedural and usually leads to ugly cursor usage in stored procedures instead of direct queries by the inexperienced.
Another example of a correlated subquery to display customers and one randomly selected purchase order for each of them:
SELECT a.CUST_NO, a.CUSTOMER, b.PO_NUMBER FROM CUSTOMER a INNER JOIN sales b ON a.CUST_NO = b.CUST_NO WHERE b.PO_NUMBER = ( SELECT first 1 c.PO_NUMBER FROM sales c WHERE c.CUST_NO = b.CUST_NO ORDER BY rand() )
Both examples are run against Firebird's standard EMPLOYEE.FDB.
Correlated subqueries are also useful to specify updates and deletes on a table conditioned to values of another table (instead of joining them) see this example:
UPDATE stops s SET s.inservice=1 WHERE EXISTS ( SELECT ls.stopid FROM linestop ls WHERE s.stopid=ls.stopid AND ls.signid=79 );
No comments:
Post a Comment