大部分数据库都提供了窗口函数,比如RANK,ROW_NUMBER等等。 MySQL 这方面没有直接提供,但是可以变相的实现,我以前写了row_number 的实现,今天有时间把 rank 的实现贴出来。
这里,我用MySQL 以及Python 分别实现了rank 窗口函数。
原始表信息:
[sql] view plaincopyprint?01.t_girl=# \d group_concat;
02. Table "ytt.group_concat"
03. Column | Type | Modifiers
04.----------+-----------------------+-----------
05. rank | integer |
06. username | character varying(20) |
t_girl=# \d group_concat;
Table "ytt.group_concat"
Column | Type | Modifiers
----------+-----------------------+-----------
rank | integer |
username | character varying(20) |
表数据
[sql] view plaincopyprint?01.t_girl=# select * from group_concat;
02. rank | username
03.------+----------
04. 100 | Lucy
05. 127 | Lucy
06. 146 | Lucy
07. 137 | Lucy
08. 104 | Lucy
09. 121 | Lucy
10. 136 | Lily
11. 100 | Lily
12. 100 | Lily
13. 105 | Lily
14. 136 | Lily
15. 149 | ytt
16. 116 | ytt
17. 116 | ytt
18. 149 | ytt
19. 106 | ytt
20. 117 | ytt
21.(17 rows)
22.
23.
24.Time: 0.638 ms
t_girl=# select * from group_concat;
rank | username
------+----------
100 | Lucy
127 | Lucy
146 | Lucy
137 | Lucy
104 | Lucy
121 | Lucy
136 | Lily
100 | Lily
100 | Lily
105 | Lily
136 | Lily
149 | ytt
116 | ytt
116 | ytt
149 | ytt
106 | ytt
117 | ytt
(17 rows)
Time: 0.638 ms
PostgreSQL 的rank 窗口函数示例:
[sql] view plaincopyprint?01.t_girl=# select username,rank,rank() over(partition by username order by rank desc) as rank_cnt from group_concat;
02. username | rank | rank_cnt
03.----------+------+----------
04. Lily | 136 | 1
05. Lily | 136 | 1
06. Lily | 105 | 3
07. Lily | 100 | 4
08. Lily | 100 | 4
09. Lucy | 146 | 1
10. Lucy | 137 | 2
11. Lucy | 127 | 3
12. Lucy | 121 | 4
13. Lucy | 104 | 5
14. Lucy | 100 | 6
15. ytt | 149 | 1
16. ytt | 149 | 1
17. ytt | 117 | 3
18. ytt | 116 | 4
19. ytt | 116 | 4
20. ytt | 106 | 6
21.(17 rows)
22.
23.
24.Time: 131.150 ms
t_girl=# select username,rank,rank() over(partition by username order by rank desc) as rank_cnt from group_concat;
username | rank | rank_cnt
----------+------+----------
Lily | 136 | 1
Lily | 136 | 1
Lily | 105 | 3
Lily | 100 | 4
Lily | 100 | 4
Lucy | 146 | 1
Lucy | 137 | 2
Lucy | 127 | 3
Lucy | 121 | 4
Lucy | 104 | 5
Lucy | 100 | 6
ytt | 149 | 1
ytt | 149 | 1
ytt | 117 | 3
ytt | 116 | 4
ytt | 116 | 4
ytt | 106 | 6
(17 rows)
Time: 131.150 ms
MySQL 提供了group_concat 聚合函数可以变相的实现:
[sql] view plaincopyprint?01.mysql>
02.select a.username, a.rank, find_in_set(a.rank,b.rank_gp) as rank_cnt
03.from group_concat as a ,
04.(select username,group_concat(rank order by rank desc separator ',') as rank_gp from group_concat group by username
05.) b
06.where a.username = b.username order by a.username asc,a.rank desc;
07.+----------+------+----------+
08.| username | rank | rank_cnt |
09.+----------+------+----------+
10.| Lily | 136 | 1 |
11.| Lily | 136 | 1 |
12.| Lily | 105 | 3 |
13.| Lily | 100 | 4 |
14.| Lily | 100 | 4 |
15.| Lucy | 146 | 1 |
16.| Lucy | 137 | 2 |
17.| Lucy | 127 | 3 |
18.| Lucy | 121 | 4 |
19.| Lucy | 104 | 5 |
20.| Lucy | 100 | 6 |
21.| ytt | 149 | 1 |
22.| ytt | 149 | 1 |
23.| ytt | 117 | 3 |
24.| ytt | 116 | 4 |
25.| ytt | 116 | 4 |
26.| ytt | 106 | 6 |
27.+----------+------+----------+
28.17 rows in set (0.02 sec)
mysql>
select a.username, a.rank, find_in_set(a.rank,b.rank_gp) as rank_cnt
from group_concat as a ,
(select username,group_concat(rank order by rank desc separator ',') as rank_gp from group_concat group by username
) b
where a.username = b.username order by a.username asc,a.rank desc;
+----------+------+----------+
| username | rank | rank_cnt |
+----------+------+----------+
| Lily | 136 | 1 |
| Lily | 136 | 1 |
| Lily | 105 | 3 |
| Lily | 100 | 4 |
| Lily | 100 | 4 |
| Lucy | 146 | 1 |
| Lucy | 137 | 2 |
| Lucy | 127 | 3 |
| Lucy | 121 | 4 |
| Lucy | 104 | 5 |
| Lucy | 100 | 6 |
| ytt | 149 | 1 |
| ytt | 149 | 1 |
| ytt | 117 | 3 |
| ytt | 116 | 4 |
| ytt | 116 | 4 |
| ytt | 106 | 6 |
+----------+------+----------+
17 rows in set (0.02 sec)
当然了,如果MySQL SQL不太熟悉,可以用程序来处理,比如我下面用python 实现了rank 函数,执行结果如下:(脚本源代码最后)
[sql] view plaincopyprint?01.>>> ================================ RESTART ================================
02.>>>
03. username | rank | rank_cnt
04.--------------------------------
05.ytt |149 |1
06.ytt |149 |1
07.ytt |117 |3
08.ytt |116 |4
09.ytt |116 |4
10.ytt |106 |6
11.Lucy |146 |1
12.Lucy |137 |2
13.Lucy |127 |3
14.Lucy |121 |4
15.Lucy |104 |5
16.Lucy |100 |6
17.Lily |136 |1
18.Lily |136 |2
19.Lily |105 |3
20.Lily |100 |4
21.Lily |100 |4
22.(17 Rows.)
23.Time: 0.162 Seconds.
>>> ================================ RESTART ================================
>>>
username | rank | rank_cnt
--------------------------------
ytt |149 |1
ytt |149 |1
ytt |117 |3
ytt |116 |4
ytt |116 |4
ytt |106 |6
Lucy |146 |1
Lucy |137 |2
Lucy |127 |3
Lucy |121 |4
Lucy |104 |5
Lucy |100 |6
Lily |136 |1
Lily |136 |2
Lily |105 |3
Lily |100 |4
Lily |100 |4
(17 Rows.)
Time: 0.162 Seconds.
附上脚本代码:
[python] view plaincopyprint?01.from __future__ import print_function
02.from datetime import date, datetime, timedelta
03.import mysql.connector
04.import time
05.# Created by ytt 2014/5/14.
06.# Rank function implement.
07.def db_connect(is_true):
08. cnx = mysql.connector.connect(host='192.168.1.131',port='3306',user='python_user', password='python_user',database='t_girl',autocommit=is_true)
09. return cnx
10.def db_rs_rank(c1 ='username desc' ,c2 = ' rank desc'):
11. # c1: partition column.
12. # c2: sort column.
13. time_start = time.time()
14. cnx = db_connect(True)
15. rs = cnx.cursor()
16. query0 = "select username,rank from group_concat order by " + c1 + ", " + c2
17. rs.execute(query0,multi=False)
18. if rs.with_rows:
19. rows = rs.fetchall()
20. else:
21. return "No rows affected."
22. i = 0
23. j = 0
24. k = 1
25. result = []
26. field1_compare = rows[0][0]
27. field2_compare = rows[0][1]
28. while i < len(rows):
29. if field1_compare == rows[i][0]:
30. j += 1
31. if field2_compare != rows[i][1]:
32. field2_compare =rows[i][1]
33. k = j
34. result.append((rows[i][0],rows[i][1],k))
35. else:
36. j = 1
37. k = 1
38. field1_compare = rows[i][0]
39. result.append((rows[i][0],rows[i][1],k))
40. i += 1
41. i = 0
42. rows_header = list(rs.column_names)
43. rows_header.append('rank_cnt')
44. print (rows_header[0].center(10,' ') + '|' + rows_header[1].center(10,' ') + '|' + rows_header[2].center(10,' '))
45. print ('-'.center(32,'-'))
46. while i < len(result):
47. print (result[i][0].ljust(10,' ') + '|' + str(result[i][1]).ljust(10,' ') + '|' + str(result[i][2]).ljust(10,' '))
48. i += 1
49. rs.close()
50. cnx.close()
51. time_end = time.time()
52. print ('(' + str(len(rows))+ ' Rows.)')
53. print ("Time:" + str(round((time_end-time_start),3)).rjust(10,' ') + ' Seconds.')
54.if __name__=='__main__':
55. db_rs_rank()
56.