mysql中有用的去重sql分享

数据库使用的时候,经常会遇到一些去重的需求,怎样才能正确的使用最快的方式去重,使我们要考虑的问题
自己总结出来的去重sql往往就是锦囊,以下是我经常用的去重sql,和大家分享:

1、删重,保留id最小的一条

DELETE FROM table_test WHERE id NOT IN (
SELECT id FROM (
SELECT t1.id, t1.uid FROM table_test t1,
(
SELECT uid,MIN(id) AS minid FROM table_test GROUP BY uid
) t2
WHERE t1.uid = t2.uid AND t1.id = t2.minid) t3);

解释:
主体A是删除表中记录,排除所列id:DELETE FROM table_test WHERE id NOT IN(t3);
构建表t3,为排除ID:SELECT id FROM(t1,t2);
构建t1 & t2,这里是该表的自连接,分两部分,t2通过group by分组得到重复记录里的最小id

上面有重复的部分,主要是考虑到查询速度,以上sql还可以优化为以下这样:

DELETE FROM table_test WHERE id NOT IN (
SELECT id FROM
(SELECT MIN(id) AS id FROM table_test AS t GROUP BY uid)
t1);

2、删重,保留id最小的一条
删除重复记录里的id最大的记录 如果重复记录大于2的话 可能需要执行多次来删除 比如重复值最大为5条,则需要执行5-2三次,每次只能删除重复记录中的一条。

DELETE FROM table_test WHERE id IN (SELECT id FROM (
SELECT t1.id FROM table_test t1, (
SELECT uid, MAX(id) AS maxid FROM table_test GROUP BY uid HAVING COUNT(uid) > 1
)t2
WHERE t1.uid = t2.uid AND t1.id = t2.maxid
)t3);

3、删重,保留id最小的一条

delete a FROM table_test a LEFT JOIN
(SELECT MIN(id) AS minid FROM table_test GROUP BY uid )
b ON a.id=b.minid WHERE b.minid IS NULL;

解释:注意这里面的left join,即保留左表的所有记录, b.minid IS NULL即是非id最小的记录集

4、删重,保留id最小的一条

DELETE FROM table_test
WHERE uid IN (
SELECT uid FROM (SELECT uid FROM table_test a GROUP BY uid HAVING COUNT(*)>1) t1
)  AND id NOT IN (
SELECT minid FROM (SELECT MIN(id) AS minid FROM table_test b GROUP BY uid HAVING COUNT(*)>1) t2
);

解释:筛选出重复的uid&最小id,where来排除

5、删重,保留id最小的一条

DELETE a FROM table_test a,
(SELECT uid ,MIN(id) AS minid FROM table_test GROUP BY uid HAVING COUNT(*)>1) b
WHERE a.uid =b.uid AND a.id > b.minid;

发表评论

您的电子邮箱地址不会被公开。 必填项已用*标注