Archive

Archive for the ‘Mysql’ Category

Partial Db backups with mysqldump

May 5th, 2010 Ivan Villareal No comments

I was updating my db dev instance with a set of changes, and when I was restoring the dump for a table it contained a drop db clause :( it was a simple table and it took a about 15 seconds to finish the operation, when I looked at the db schema all the tables were gone!

This db has a very large table about 16Gb of binary data, so getting a new dump would require a lot of time, so as workaround I’ve dumped only sets of data that I need to my tests.

First I did a dump of the db schema with no data:

mysqldump -h host-u ${USER} --password=${PASS} --no-data --add-drop-database --databases largeDb> schema_largeDb.sql

then I created a dump for the tables that I need, for this I have to include a list of tables that I don’t want, I’ve created the list of unwanted tables manually.

mysqldump -h host-u ${USER} --password=${PASS} --no-create-info --ignore-table=largeDb.unwantedTable1 --ignore-table=largeDb.unwantedTable2 --ignore-table=largeDb.unwantedTable3 > smallData.sql

And because I need data from the big table, I’ve used the –where clause to limit the rows returned in

mysqldump -h host-u ${USER} --password=${PASS} "--where=brand_id='2496'" largeDb bigTable > binaryData.sql

With this one I was able to get only the binaryData that I need to make my tests, instead of waiting to get for the entire dbDump,

Categories: Development, Mysql Tags:

Getting first and last row from a group set without sub queries in mysql

February 18th, 2010 Ivan Villareal No comments

When I use the GROUP BY clause in mysql It starts grouping by the first different row in the resultset so for example if I have this table:

 

mysql> select * from grouptest;
+----+---------+-------+------------+
| id | city    | value | date       |
+----+---------+-------+------------+
|  1 | tijuana |    10 | 2010-02-02 | 
|  2 | tijuana |    11 | 2010-02-09 | 
|  3 | tijuana |    15 | 2010-02-12 | 
|  4 | tijuana |    17 | 2010-02-17 | 
|  5 | tecate  |    12 | 2010-02-10 | 
|  6 | tecate  |     1 | 2010-02-14 | 
|  7 | tecate  |    29 | 2010-02-17 | 
+----+---------+-------+------------+
7 rows in set (0.00 sec)

 

If I group by city this is what I get:

mysql> SELECT * FROM grouptest g GROUP BY city;
+----+---------+-------+------------+
| id | city    | value | date       |
+----+---------+-------+------------+
|  5 | tecate  |    12 | 2010-02-10 | 
|  1 | tijuana |    10 | 2010-02-02 | 
+----+---------+-------+------------+
2 rows in set (0.00 sec)

 

But I want to get the last inserted rows, so could do a sub query like this:

mysql> SELECT * FROM grouptest g
    -> WHERE g.date = (SELECT MAX(date) FROM grouptest gt WHERE g.city = gt.city)
    -> GROUP BY g.city;
+----+---------+-------+------------+
| id | city    | value | date       |
+----+---------+-------+------------+
|  7 | tecate  |    29 | 2010-02-17 | 
|  4 | tijuana |    17 | 2010-02-17 | 
+----+---------+-------+------------+
2 rows in set (0.00 sec)

 

But if the db is large making subqueries could be expensive, so a cleaner solution might be using the GROUP_CONCAT function, so for example the above query could be written like this:

mysql> SELECT id, city, value, date, GROUP_CONCAT(date) AS all_dates
    -> FROM grouptest g
    -> GROUP BY city;
+----+---------+-------+------------+---------------------------------------------+
| id | city    | value | date       | all_dates                                   |
+----+---------+-------+------------+---------------------------------------------+
|  5 | tecate  |    12 | 2010-02-10 | 2010-02-10,2010-02-14,2010-02-17            | 
|  1 | tijuana |    10 | 2010-02-02 | 2010-02-02,2010-02-09,2010-02-12,2010-02-17 | 
+----+---------+-------+------------+---------------------------------------------+
2 rows in set (0.00 sec)

 
At this point I can explode the all_dates column in php or split them in perl to get the last date for each city, but a little more work on our query could do our job simpler, something like this for instance:

mysql> SELECT id, city, value, SUBSTRING_INDEX(GROUP_CONCAT(date),',',-1) AS date
    -> FROM grouptest g
    -> GROUP BY city;
+----+---------+-------+------------+
| id | city    | value | date       |
+----+---------+-------+------------+
|  5 | tecate  |    12 | 2010-02-17 | 
|  1 | tijuana |    10 | 2010-02-17 | 
+----+---------+-------+------------+
2 rows in set (0.00 sec)

 
Or if I need to get the first and last rows for each group set I just add the date column like this:

mysql> SELECT id, city, value, date as first_date, SUBSTRING_INDEX(GROUP_CONCAT(date),',',-1) AS last_date
    -> FROM grouptest g
    -> GROUP BY city;
+----+---------+-------+------------+------------+
| id | city    | value | first_date | last_date  |
+----+---------+-------+------------+------------+
|  5 | tecate  |    12 | 2010-02-10 | 2010-02-17 | 
|  1 | tijuana |    10 | 2010-02-02 | 2010-02-17 | 
+----+---------+-------+------------+------------+
2 rows in set (0.00 sec)

 

There may be several other ways around this, but for now this approach is enough, I’m not sure about the impact this 2 functions may have on the server but I’m sure it is less than executing a subquery.

Categories: Mysql Tags: , , , ,