sql实现寻找中位数(使用sign、case、自定义变量等)
- 作者: 互相伤害啊-
- 来源: 51数据库
- 2021-07-11
目录
- sql实现寻找中位数
- 思路1
- 算法
- case
- sign():
- 思路二:排序后再找中位数
sql实现寻找中位数
思路1
对于一个 奇数 长度数组中的 中位数,大于这个数的数值个数等于小于这个数的数值个数。
算法
根据上述的定义,我们来找一下 [1, 3, 2] 中的中位数。首先 1 不是中位数,因为这个数组有三个元素,却有两个元素 (3,2) 大于 1。3 也不是中位数,因为有两个元素小于 3。对于最后一个 2 来说,大于 2 和 小于 2 的元素数量是相等的,因此 2 是当前数组的中位数。
当数组长度为 偶数,且元素唯一时,中位数等于排序后 中间两个数 的平均值。对这两个数来说,大于当前数的数值个数跟小于当前数的数值个数绝对值之差为 1,恰好等于这个数出现的频率。
总的来说,不管是数组长度是奇是偶,也不管元素是不是唯一,中位数出现的频率一定大于等于大于它的数和小于它的数的绝对值之差。这个规律是这道题的关键,可以通过下面这个搜索条件来过滤。
--架构
Create table If Not Exists Employee (Id int, Company varchar(255), Salary int)
Truncate table Employee
insert into Employee (Id, Company, Salary) values ('1', 'A', '2341')
insert into Employee (Id, Company, Salary) values ('2', 'A', '341')
insert into Employee (Id, Company, Salary) values ('3', 'A', '15')
insert into Employee (Id, Company, Salary) values ('4', 'A', '15314')
insert into Employee (Id, Company, Salary) values ('5', 'A', '451')
insert into Employee (Id, Company, Salary) values ('6', 'A', '513')
insert into Employee (Id, Company, Salary) values ('7', 'B', '15')
insert into Employee (Id, Company, Salary) values ('8', 'B', '13')
insert into Employee (Id, Company, Salary) values ('9', 'B', '1154')
insert into Employee (Id, Company, Salary) values ('10', 'B', '1345')
insert into Employee (Id, Company, Salary) values ('11', 'B', '1221')
insert into Employee (Id, Company, Salary) values ('12', 'B', '234')
insert into Employee (Id, Company, Salary) values ('13', 'C', '2345')
insert into Employee (Id, Company, Salary) values ('14', 'C', '2645')
insert into Employee (Id, Company, Salary) values ('15', 'C', '2645')
insert into Employee (Id, Company, Salary) values ('16', 'C', '2652')
insert into Employee (Id, Company, Salary) values ('17', 'C', '65')
SELECT
Employee.Id, Employee.Company, Employee.Salary
FROM
Employee,
Employee alias
WHERE
Employee.Company = alias.Company
GROUP BY Employee.Company , Employee.Salary
HAVING SUM(CASE
WHEN Employee.Salary = alias.Salary THEN 1
ELSE 0
END) >= ABS(SUM(SIGN(Employee.Salary - alias.Salary)))
ORDER BY Employee.Id;
--alias是别名
1.运用CASE表达式,非等值自连接和HAVING子句来找中位数
2.通过 WHERE e1.Company = e2.Company 进行分组
3.最后通过GROUP BY 去重
--更好理解一些:
select Id, Company, Salary
from Employee
where Id in (select e1.Id
from Employee e1, Employee e2
WHERE e1.Company = e2.Company
GROUP BY e1.Id
HAVING SUM(CASE WHEN e1.Salary >= e2.Salary THEN 1 ELSE 0 END) >= COUNT(*)/2 AND SUM(CASE WHEN e1.Salary <= e2.Salary THEN 1 ELSE 0 END) >= COUNT(*)/2)
GROUP BY Company, Salary
ORDER BY Company
case
The CASE statement goes through conditions and returns a value when the first condition is met (like an IF-THEN-ELSE statement). So, once a condition is true, it will stop reading and return the result. If no conditions are true, it returns the value in the ELSE clause.If there is no ELSE part and no conditions are true, it returns NULL.
--用例
CASE
WHEN condition1 THEN result1
WHEN condition2 THEN result2
WHEN conditionN THEN resultN
ELSE result
END;
sign():
这里是引用
ps:注意:在 MySQL 5.6 中,这个代码是可以运行的,但如果你用的是 MySQL 5.7+,就需要在 SELECT 语句中 把 Employee.id 改成 ANY_VALUE(Employee.Id)。
思路二:排序后再找中位数
根据 salary 排序记录,利用会话变量计算排名。由于不需要级联表,这个方法要比方法一更高效。
SELECT
Id, Company, Salary
FROM
(SELECT
e.Id,
e.Salary,
e.Company,
IF(@prev = e.Company, @Rank:=@Rank + 1, @Rank:=1) AS rank,
@prev:=e.Company
FROM
Employee e, (SELECT @Rank:=0, @prev:=0) AS temp
ORDER BY e.Company , e.Salary , e.Id) Ranking
INNER JOIN
(SELECT
COUNT(*) AS totalcount, Company AS name
FROM
Employee e2
GROUP BY e2.Company) companycount ON companycount.name = Ranking.Company
WHERE
Rank = FLOOR((totalcount + 1) / 2)
OR Rank = FLOOR((totalcount + 2) / 2)
;
自定义变量
- SQLite数据库操作:原生操作,GreenDao操作讲解
- 数据库SQL实战题:获取员工其当前的薪水比其manager当前薪水还高的相关信息(教程)
- SQLSERVER查询区分大小写的写法分析
- SQL学习总结之SQL的分类介绍
- 阶梯到高级T-SQL 1级:高级T-SQL介绍交叉连接
- 高级T-SQL级别1的Stairway:使用CROSS JOIN引入高级T-SQL分析
- 什么是SQL隔离级别?四个SQL隔离级别定义介绍
- Sql递归介绍之用with实现递归查询
- SQLSERVER查询时日期格式化的实例讲解
- 数据库SQL实战:从titles表获取按照title进行分组,注意对于重复的emp_no进行忽略(题解)


