本小说介绍朝气蓬勃篇关于sql中IN与EXISTS,T-SQL是亟需优化的

图片 1

题目:来自Madrid且订单数少于3的消费者

T-SQL是需要优化的。而优化的前提是你对它的执行步骤有清楚的认识。我经常去给讲课或者咨询辅导的时候,就发现很多朋友对此了解甚少。下面截取了我的一个PPT,给大家参考一下

这2张表可以很清晰的看出外链接

代码如下复制代码 SELECT pub_name FROM publishers WHERE NOT EXISTS
(SELECT * FROM titles WHERE pub_id = publishers.pub_id AND type =
‘business’)

建表:复制代码 代码如下: set nocount
on –当 SET NOCOUNT 为 ON 时,不返回计数。当 SET NOCOUNT 为 OFF
时,返回计数 use SY GO if object_Id(‘dbo.Orders’) is not null drop
table dbo.Orders GO if object_Id(‘dbo.Customers’) is not null drop
table dbo.Customers GO create table dbo.Customers ( customerid char(5)
not null primary key , city varchar(10) not null ); insert into
dbo.Customers values(‘FISSA’,’Madrid’); insert into dbo.Customers
values(‘FRNDO’,’Madrid’); insert into dbo.Customers
values(‘KRLOS’,’Madrid’); insert into dbo.Customers
values(‘MRPHS’,’Zion’); create table dbo.Orders ( orderid int not null
primary key , customerid char(5) null references customers(customerid) )
insert into dbo.Orders values(1,’FRNDO’); insert into dbo.Orders
values(2,’FRNDO’); insert into dbo.Orders values(3,’KRLOS’); insert into
dbo.Orders values(4,’KRLOS’); insert into dbo.Orders values(5,’KRLOS’);
insert into dbo.Orders values(6,’MRPHS’); insert into dbo.Orders

–因为没有WITH Rollup和WITH
Cube语句,所以跳过第六步,进入HAVING子句的处理


–所以在含有NULL值的列的时候,就要小心了,not exists与not
in在逻辑上是不等价的

values(7,null);

做题分析:复制代码 代码如下: select
customerid as 消费者,count(customerid) as 订单数 from dbo.Orders where
customerid in ( select customerid from dbo.Customers where city =
‘Madrid’) group by customerid having count(customerid) 3
结果如图所示:

–第一次想到的答案,突然发现少了一个来自Madrid的FISSA订单,FISSA订单数量为0,所以在Orders表中没有出现,所以上面的写法会少一个.–推翻了上面的答案,又想到了用表的连接,而用内连接出现的情况会和上面的一样,所以我选择了左连接,如下:复制代码 代码如下: select C.customerid as
消费者,count(O.customerid) as 订单数 from dbo.Customers as C left join
dbo.Orders as O on C.customerid = O.customerid where C.city= ‘Madrid’
group by C.customerid having count(C.customerid) 3 结果如图所示:


–查询发现是正确的。–分析查看不带条件的左连接复制代码 代码如下: select * from dbo.Customers
as C left join dbo.Orders as O on C.customerid = O.customerid 复制代码 代码如下: select * from dbo.Customers
as C left join dbo.Orders as O on C.customerid = O.customerid
结果如图所示:


–书中给的标准答案是:复制代码
代码如下: select C.customerid , count(O.orderid) as numorders from
dbo.Customers as C left join dbo.Orders as O on C.customerid =
O.customerid where C.city= ‘Madrid’ group by C.customerid having
count(O.orderid) 3 order by numorders 结果如图所示:


书中给的只是多了一个order by
进行定义了排序方式(以numorders这一列的升序进行排序)

–第二步:处理ON子句,只把那些两个表的customerid匹配的行找出来,我把它们放到VT2中去
SELECT temp.* INTO #VT2 FROM (SELECT * FROM #VT1 WHERE
Customer=customerid) temp
SELECT * FROM #VT2
–返回6行数据


代码如下复制代码 SELECT pub_name FROM publishers WHERE pub_id NOT IN
(SELECT pub_id FROM titles WHERE type = ‘business’)

CREATE TABLE dbo.Orders
(
  orderid    INT        NOT NULL PRIMARY KEY,
  customerid CHAR(5)    NULL     REFERENCES Customers(customerid)
);


下面的查询查找已经不销售的书的名称:

/*
准备数据架构和数据
*/
SET NOCOUNT ON;
USE tempdb;
GO
IF OBJECT_ID(‘dbo.Orders’) IS NOT NULL
  DROP TABLE dbo.Orders;
GO
IF OBJECT_ID(‘dbo.Customers’) IS NOT NULL
  DROP TABLE dbo.Customers;
GO
CREATE TABLE dbo.Customers
(
  customerid  CHAR(5)     NOT NULL PRIMARY KEY,
  city        VARCHAR(10) NOT NULL
);

(7 行受影响)

–我们再用IN方法

–第七步:使用SELECT 的字段列表过滤结果集
SELECT #VT6.Customer,#VT6.orderidcount as  numorders  INTO #VT7 FROM
#VT6
SELECT * FROM #VT7
–还是2行数据,只不过只有两个列了

customerid city       orderid    
customerid

接着看

INSERT INTO dbo.Customers(customerid, city) VALUES(‘FISSA’, ‘Madrid’);
INSERT INTO dbo.Customers(customerid, city) VALUES(‘FRNDO’, ‘Madrid’);
INSERT INTO dbo.Customers(customerid, city) VALUES(‘KRLOS’, ‘Madrid’);
INSERT INTO dbo.Customers(customerid, city) VALUES(‘MRPHS’, ‘Zion’);


set identity_insert Orders off;

/*第一步:处理FROM子句,把来源的表进行CROSS JOIN(笛卡尔乘积)
我这里把结果做一个生成表查询,写到一个临时表(VT1)中去
*/
SELECT C.customerid as Customer,c.city,o.*  INTO #VT1 FROM
dbo.Customers C,dbo.Orders  O
SELECT * FROM #VT1
–返回28行数据(4*7)

customerid city       orderid    
customerid

代码如下复制代码 select custid,companynamefrom Customers as Cwhere
country=N’中国’and exists (select * from Orders as O where
O.custid=C.custid);–返回–custid companyname–1 大众–3 奔驰

–第五步:处理GROUP子句,进行分类汇总
SELECT temp.* INTO #VT5 FROM
(SELECT Customer,COUNT(OrderID)  as orderidcount,COUNT(city) as
citycount,Count(customerid)  as customeridcount FROM #VT4 GROUP BY
Customer) temp
SELECT * FROM #VT5
–返回3行数据,根据客户分组,统计了订单的个数
–这里会不会去统计其他列的汇总呢

CREATE TABLE dbo.Customers
(
  customerid  CHAR(5)     NOT NULL PRIMARY KEY,
  city        VARCHAR(10) NOT NULL
);

create table t1 (c1 number,c2 number);create table t2 (c1 number,c2
number);

–第三步:根据JOIN语句的类型,决定是否要添加行到VT2中去,例如如果是LEFT 
JOIN的话,那么就要检查坐边的表(我们这里是customers表)的连接键值是否都存在,如果不存在就要去添加到VT2中
SELECT temp.* INTO #VT3 FROM
(SELECT * FROM #VT2
UNION ALL
SELECT CustomerID,City,NULL,NULL FROM dbo.Customers c WHERE NOT
EXISTS(SELECT DISTINCT Customer FROM #VT2 WHERE Customer=c.CustomerID))
temp
SELECT * FROM #VT3
–返回7行数据,其中有一个客户,因为没有订单,这一步中被添加进来。它的Orders的记录被标记为NULL

(8 行受影响)

两者的区别: EXISTS:后面可以是整句的查询语句如:SELECT * FROM titles
IN:后面只能是对单列:SELECT pub_id FROM titles NOT EXISTS:
例如,要查找不出版商业书籍的出版商的名称:

–第八步:跳过DISTINCT,进行OrderBy操作
SELECT #VT7.* INTO #VT8 FROM #VT7 Order By #VT7.Customer
SELECT * FROM #VT8
–返回2行数据,经过排序

FRNDO      Madrid     1           FRNDO
FRNDO      Madrid     2           FRNDO
KRLOS      Madrid     3           KRLOS
KRLOS      Madrid     4           KRLOS
KRLOS      Madrid     5           KRLOS
MRPHS      Zion       6           MRPHS
NULL       NULL       7          
NULL

–下面的知识点我们需要认识到:–当列表中有NULL时,in实际会产生一个UNKNOWN的结果,例如
a in(d,b,null)的结果是UNKNOWN,而a not in (d,b,null)返回的是not
unknowd仍然是unknowd–而not in与not exists则结果会很不同,例如a
in(a,b,null)返回的是TRUE,而a not in(a,b,null)返回的肯定是not
true即为false–有了上面的认识,好继续开工了….–我们现在向Orders表插入一行数据

–第四步:处理WHERE
子句,对VT3的结果集进行过滤,我们的条件是city=Madid
SELECT temp.* INTO #VT4 FROM
(SELECT * FROM #VT3 WHERE city=’Madrid’) temp
SELECT * FROM #VT4
–返回6行数据,因为有一个客户不是这个城市的

表如下

上面涉及的3个表的记录数都不小,均在百万左右。根据这种情况,我想到了前不久看的tom的一篇文章,说的是exists和in的区别,in
是把外表和那表作hash
join,而exists是对外表作loop,每次loop再对那表进行查询。这样的话,in适合内外表都很大的情况,exists适合外表结果集很小的情况。

下面这个脚本可以解释这个过程

FRNDO      Madrid     1          
FRNDO
FRNDO      Madrid     2          
FRNDO
KRLOS      Madrid     3          
KRLOS
KRLOS      Madrid     4          
KRLOS
KRLOS      Madrid     5          
KRLOS
MRPHS      Zion       6          
MRPHS

代码如下复制代码 select custid,companyname from Customers as Cwhere
country=N’美国’and custid not in(select custid from Orders);

图片 1

其结果如下

然而,如果使用 IN,少键入一些字符也可以得到同样的结果:

–这个查询结束演示。完整的8个步骤

INSERT INTO dbo.Orders(orderid, customerid) VALUES(1, ‘FRNDO’);
INSERT INTO dbo.Orders(orderid, customerid) VALUES(2, ‘FRNDO’);
INSERT INTO dbo.Orders(orderid, customerid) VALUES(3, ‘KRLOS’);
INSERT INTO dbo.Orders(orderid, customerid) VALUES(4, ‘KRLOS’);
INSERT INTO dbo.Orders(orderid, customerid) VALUES(5, ‘KRLOS’);
INSERT INTO dbo.Orders(orderid, customerid) VALUES(6, ‘MRPHS’);
INSERT INTO dbo.Orders(orderid, customerid) VALUES(7, NULL);

INSERT INTO Customers(custid, companyname,country) VALUES(1, N’大众’,
N’中国’); INSERT INTO Customers(custid, companyname,country) VALUES(2,
N’宝马’, N’美国’); INSERT INTO Customers(custid, companyname,country)
VALUES(3, N’奔驰’, N’中国’); INSERT INTO Customers(custid,
companyname,country) VALUES(4, N’奇瑞’, N’德国’); INSERT INTO
Customers(custid, companyname,country) VALUES(5, N’福特’, N’美国’); set
identity_insert Customers off; set identity_insert Orders
on;–custid代表员工号 INSERT INTO Orders(orderid, custid) VALUES(1,1);
INSERT INTO Orders(orderid, custid) VALUES(2,2); INSERT INTO
Orders(orderid, custid) VALUES(3,3); INSERT INTO Orders(orderid, custid)
VALUES(4,4); INSERT INTO Orders(orderid, custid)
VALUES(5,5);–查看表的数据select custid,companyname,country from
Customers;select orderid,custid from Orders;–插入数据成功

/*
一个测试查询,检索那些订单个数小于3的客户,并且按订单总数排序(升序)
*/
SELECT C.customerid, COUNT(O.orderid) AS numorders
FROM dbo.Customers AS C
  LEFT OUTER JOIN dbo.Orders AS O
    ON C.customerid = O.customerid
WHERE C.city = ‘Madrid’
GROUP BY C.customerid
HAVING COUNT(O.orderid) < 3
ORDER BY numorders;

 

–查询来自中国,而且下过订单的所有客户

相关文章

You can leave a response, or trackback from your own site.

Leave a Reply

网站地图xml地图