The function group_concat in MySQL will let you combine the values from multiple rows together in an aggregate function.
It’s a little hard to picture, so here’s a hypothetical aggregate query from the Netflix prize dataset:
select movie_id, customer_id, max(date) as latest, group_concat(rating) as all_ratings, count(*) as total_ratings
from ratings
group by movie_id,customer_id
having count(*)>1;
And we get back* **:
+----------+-------------+------------+-------------+---------------+
| MOVIE_ID | CUSTOMER_ID | LATEST | ALL_RATINGS | TOTAL_RATINGS |
+----------+-------------+------------+-------------+---------------+
| 1 | 5 | 2007-12-02 | 3,2,4,5 | 4 |
| 54 | 3 | 2007-12-04 | 1,3,1 | 3 |
| 23 | 34 | 2007-12-07 | 5,5 | 2 |
+----------+-------------+------------+-------------+---------------+
As you can see for each set of rows grouped by the same movie_id and customer_id, it took each individual rating value and concatenated them all together in the new aggregate row.
* My actual query returned 0 rows so I thought I’d make up a quick example to illustrate the point.
** I made that pretty table*** with this ASCII table generating utility. Yay!
*** Pretty table may not be pretty in RSS readers or Blackberries.
Posted by Greg Pinero (Primary Searcher) as SQL at 9:49 PM MST
No Comments »
These are links associated with recent searches I’ve done. They’re not difficult enough to warrant to their own posts but still super useful.
Tags: Apparmor, Backup, Debug, Download, Forum_I_Posted_To, Grub, Linux, Mirror, Mysql, Pfam, Programming, Python, Security, Sql, Ubuntu, Wget
Posted by Greg Pinero (Primary Searcher) as Uncategorized at 5:00 AM MST
No Comments »
Sometimes I’ll want to run a query against a MySQL database from Python that returns a large result set. So large in fact that Python hits a memory error.
One way around this is to have the Python MySQL library (MySQLdb) store the results from a query on the database server and bring them back to you one at a time as you request them.
Here’s how you set that up:
import MySQLdb
import MySQLdb.cursors #Make sure to import this seperately
#build your connection object normally but pass it a cursorclass keyword
db=MySQLdb.connect(host=HOSTNAME,user=USERNAME,passwd=PASSWORD,db=DATABASE,
cursorclass=MySQLdb.cursors.SSCursor)
#Usage
cursor=db.cursor()
cursor.execute('select huge_column from huge_table')
#iterate over the results pulling them down from the server on each iteration
for row in cursor:
print row
cursor.close()
Posted by Greg Pinero (Primary Searcher) as Python at 8:49 PM MST
1 Comment »
These are links associated with recent searches I’ve done. They’re not difficult enough to warrant to their own posts but still super useful.
FlexigridLightweight but rich data grid with resizable columns and a scrolling data to match the headers, plus an ability to connect to an xml based data source using Ajax to load the content. Supposedly it can convert existing tables too?
Tags: Address, Airport, Ajax, Api, Census, Distance, Forum_I_Posted_To, Geolocation, Geotagging, Gis, Googleearth, Gps, Groupby, Javascript, Jquery, Kayak, Linux, Local, Lvm, Maps, Mysql, Optimization, Performance, Pfam, Query, Sql, Storage, Table, Taxi, Ubuntu, Visualization, Webservices, Wikipedia
Posted by Greg Pinero (Primary Searcher) as Uncategorized at 11:10 PM MST
Comments Off