Archive

Posts Tagged ‘tips’

Getting first and last row from a group set without sub queries in mysql

February 18th, 2010 Ivan Villareal No comments

When I use the GROUP BY clause in mysql It starts grouping by the first different row in the resultset so for example if I have this table:

 

mysql> select * from grouptest;
+----+---------+-------+------------+
| id | city    | value | date       |
+----+---------+-------+------------+
|  1 | tijuana |    10 | 2010-02-02 | 
|  2 | tijuana |    11 | 2010-02-09 | 
|  3 | tijuana |    15 | 2010-02-12 | 
|  4 | tijuana |    17 | 2010-02-17 | 
|  5 | tecate  |    12 | 2010-02-10 | 
|  6 | tecate  |     1 | 2010-02-14 | 
|  7 | tecate  |    29 | 2010-02-17 | 
+----+---------+-------+------------+
7 rows in set (0.00 sec)

 

If I group by city this is what I get:

mysql> SELECT * FROM grouptest g GROUP BY city;
+----+---------+-------+------------+
| id | city    | value | date       |
+----+---------+-------+------------+
|  5 | tecate  |    12 | 2010-02-10 | 
|  1 | tijuana |    10 | 2010-02-02 | 
+----+---------+-------+------------+
2 rows in set (0.00 sec)

 

But I want to get the last inserted rows, so could do a sub query like this:

mysql> SELECT * FROM grouptest g
    -> WHERE g.date = (SELECT MAX(date) FROM grouptest gt WHERE g.city = gt.city)
    -> GROUP BY g.city;
+----+---------+-------+------------+
| id | city    | value | date       |
+----+---------+-------+------------+
|  7 | tecate  |    29 | 2010-02-17 | 
|  4 | tijuana |    17 | 2010-02-17 | 
+----+---------+-------+------------+
2 rows in set (0.00 sec)

 

But if the db is large making subqueries could be expensive, so a cleaner solution might be using the GROUP_CONCAT function, so for example the above query could be written like this:

mysql> SELECT id, city, value, date, GROUP_CONCAT(date) AS all_dates
    -> FROM grouptest g
    -> GROUP BY city;
+----+---------+-------+------------+---------------------------------------------+
| id | city    | value | date       | all_dates                                   |
+----+---------+-------+------------+---------------------------------------------+
|  5 | tecate  |    12 | 2010-02-10 | 2010-02-10,2010-02-14,2010-02-17            | 
|  1 | tijuana |    10 | 2010-02-02 | 2010-02-02,2010-02-09,2010-02-12,2010-02-17 | 
+----+---------+-------+------------+---------------------------------------------+
2 rows in set (0.00 sec)

 
At this point I can explode the all_dates column in php or split them in perl to get the last date for each city, but a little more work on our query could do our job simpler, something like this for instance:

mysql> SELECT id, city, value, SUBSTRING_INDEX(GROUP_CONCAT(date),',',-1) AS date
    -> FROM grouptest g
    -> GROUP BY city;
+----+---------+-------+------------+
| id | city    | value | date       |
+----+---------+-------+------------+
|  5 | tecate  |    12 | 2010-02-17 | 
|  1 | tijuana |    10 | 2010-02-17 | 
+----+---------+-------+------------+
2 rows in set (0.00 sec)

 
Or if I need to get the first and last rows for each group set I just add the date column like this:

mysql> SELECT id, city, value, date as first_date, SUBSTRING_INDEX(GROUP_CONCAT(date),',',-1) AS last_date
    -> FROM grouptest g
    -> GROUP BY city;
+----+---------+-------+------------+------------+
| id | city    | value | first_date | last_date  |
+----+---------+-------+------------+------------+
|  5 | tecate  |    12 | 2010-02-10 | 2010-02-17 | 
|  1 | tijuana |    10 | 2010-02-02 | 2010-02-17 | 
+----+---------+-------+------------+------------+
2 rows in set (0.00 sec)

 

There may be several other ways around this, but for now this approach is enough, I’m not sure about the impact this 2 functions may have on the server but I’m sure it is less than executing a subquery.

Categories: Mysql Tags: , , , ,