When I use the GROUP BY clause in mysql It starts grouping by the first different row in the resultset so for example if I have this table:
mysql> select * from grouptest; |
+----+---------+-------+------------+ |
| id | city | value | date | |
+----+---------+-------+------------+ |
| 1 | tijuana | 10 | 2010-02-02 | |
| 2 | tijuana | 11 | 2010-02-09 | |
| 3 | tijuana | 15 | 2010-02-12 | |
| 4 | tijuana | 17 | 2010-02-17 | |
| 5 | tecate | 12 | 2010-02-10 | |
| 6 | tecate | 1 | 2010-02-14 | |
| 7 | tecate | 29 | 2010-02-17 | |
+----+---------+-------+------------+ |
7 rows in set (0.00 sec) |
If I group by city this is what I get:
mysql> SELECT * FROM grouptest g GROUP BY city; |
+----+---------+-------+------------+ |
| id | city | value | date | |
+----+---------+-------+------------+ |
| 5 | tecate | 12 | 2010-02-10 | |
| 1 | tijuana | 10 | 2010-02-02 | |
+----+---------+-------+------------+ |
2 rows in set (0.00 sec) |
But I want to get the last inserted rows, so could do a sub query like this:
mysql> SELECT * FROM grouptest g |
-> WHERE g.date = (SELECT MAX(date) FROM grouptest gt WHERE g.city = gt.city) |
-> GROUP BY g.city; |
+----+---------+-------+------------+ |
| id | city | value | date | |
+----+---------+-------+------------+ |
| 7 | tecate | 29 | 2010-02-17 | |
| 4 | tijuana | 17 | 2010-02-17 | |
+----+---------+-------+------------+ |
2 rows in set (0.00 sec) |
But if the db is large making subqueries could be expensive, so a cleaner solution might be using the GROUP_CONCAT function, so for example the above query could be written like this:
mysql> SELECT id, city, value, date, GROUP_CONCAT(date) AS all_dates |
-> FROM grouptest g |
-> GROUP BY city; |
+----+---------+-------+------------+---------------------------------------------+ |
| id | city | value | date | all_dates | |
+----+---------+-------+------------+---------------------------------------------+ |
| 5 | tecate | 12 | 2010-02-10 | 2010-02-10,2010-02-14,2010-02-17 | |
| 1 | tijuana | 10 | 2010-02-02 | 2010-02-02,2010-02-09,2010-02-12,2010-02-17 | |
+----+---------+-------+------------+---------------------------------------------+ |
2 rows in set (0.00 sec) |
At this point I can explode the all_dates column in php or split them in perl to get the last date for each city, but a little more work on our query could do our job simpler, something like this for instance:
mysql> SELECT id, city, value, SUBSTRING_INDEX(GROUP_CONCAT(date),',',-1) AS date |
-> FROM grouptest g |
-> GROUP BY city; |
+----+---------+-------+------------+ |
| id | city | value | date | |
+----+---------+-------+------------+ |
| 5 | tecate | 12 | 2010-02-17 | |
| 1 | tijuana | 10 | 2010-02-17 | |
+----+---------+-------+------------+ |
2 rows in set (0.00 sec) |
|
Or if I need to get the first and last rows for each group set I just add the date column like this:
mysql> SELECT id, city, value, date as first_date, SUBSTRING_INDEX(GROUP_CONCAT(date),',',-1) AS last_date |
-> FROM grouptest g |
-> GROUP BY city; |
+----+---------+-------+------------+------------+ |
| id | city | value | first_date | last_date | |
+----+---------+-------+------------+------------+ |
| 5 | tecate | 12 | 2010-02-10 | 2010-02-17 | |
| 1 | tijuana | 10 | 2010-02-02 | 2010-02-17 | |
+----+---------+-------+------------+------------+ |
2 rows in set (0.00 sec) |
|
There may be several other ways around this, but for now this approach is enough, I’m not sure about the impact this 2 functions may have on the server but I’m sure it is less than executing a subquery.