<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: How to produce random rows from a table</title>
	<atom:link href="http://www.mysqldiary.com/how-to-produce-random-rows-from-a-table/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.mysqldiary.com/how-to-produce-random-rows-from-a-table/</link>
	<description>Exploring, Sharing and Discussing MySQL practice</description>
	<lastBuildDate>Tue, 31 Aug 2010 09:36:07 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
	<item>
		<title>By: Bruce</title>
		<link>http://www.mysqldiary.com/how-to-produce-random-rows-from-a-table/comment-page-1/#comment-916</link>
		<dc:creator>Bruce</dc:creator>
		<pubDate>Fri, 21 May 2010 03:26:18 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqldiary.com/?p=20#comment-916</guid>
		<description>Wow... interesting information.</description>
		<content:encoded><![CDATA[<p>Wow&#8230; interesting information.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Hazan Ilan</title>
		<link>http://www.mysqldiary.com/how-to-produce-random-rows-from-a-table/comment-page-1/#comment-688</link>
		<dc:creator>Hazan Ilan</dc:creator>
		<pubDate>Mon, 10 May 2010 08:09:03 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqldiary.com/?p=20#comment-688</guid>
		<description>Hi bdwankhede,
1. As far as I know (correct me if I&#039;m wrong), there is no &quot;FIRST&quot; aggregate function in MySQL, so you need to check the performance of it.
2. Please send me the EXPLAIN output.
3. If the “inspections” table is large, ORDER BY RAND() is known to be slow. Please tell me how many rows it has.
4. You can try selecting 1000 random records from “inspections” first and LEFT JOIN only on the results.
 
Waiting for your response
Ilan.</description>
		<content:encoded><![CDATA[<p>Hi bdwankhede,<br />
1. As far as I know (correct me if I&#8217;m wrong), there is no &#8220;FIRST&#8221; aggregate function in MySQL, so you need to check the performance of it.<br />
2. Please send me the EXPLAIN output.<br />
3. If the “inspections” table is large, ORDER BY RAND() is known to be slow. Please tell me how many rows it has.<br />
4. You can try selecting 1000 random records from “inspections” first and LEFT JOIN only on the results.</p>
<p>Waiting for your response<br />
Ilan.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: bdwankhede</title>
		<link>http://www.mysqldiary.com/how-to-produce-random-rows-from-a-table/comment-page-1/#comment-614</link>
		<dc:creator>bdwankhede</dc:creator>
		<pubDate>Tue, 04 May 2010 03:28:33 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqldiary.com/?p=20#comment-614</guid>
		<description>I have a query on Firebird SQL:

SELECT FIRST 1000 i.&quot;id&quot;, i.&quot;slug&quot;, i.&quot;what_they_do&quot;, i.&quot;does_well&quot;, i.&quot;last_improves&quot; FROM &quot;inspections&quot; i LEFT JOIN &quot;services&quot; s on i.&quot;service_id&quot; = s.&quot;id&quot; WHERE s.&quot;active&quot; = 1 AND s.&quot;suspended&quot; = 0 AND i.&quot;report_date&quot; &gt; &#039;2005-04-01&#039; ORDER BY RAND() ASC;

and it executes very slowly. Please help...</description>
		<content:encoded><![CDATA[<p>I have a query on Firebird SQL:</p>
<p>SELECT FIRST 1000 i.&quot;id&quot;, i.&quot;slug&quot;, i.&quot;what_they_do&quot;, i.&quot;does_well&quot;, i.&quot;last_improves&quot; FROM &quot;inspections&quot; i LEFT JOIN &quot;services&quot; s on i.&quot;service_id&quot; = s.&quot;id&quot; WHERE s.&quot;active&quot; = 1 AND s.&quot;suspended&quot; = 0 AND i.&quot;report_date&quot; &gt; &#039;2005-04-01&#039; ORDER BY RAND() ASC;</p>
<p>and it executes very slowly. Please help&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Michael J</title>
		<link>http://www.mysqldiary.com/how-to-produce-random-rows-from-a-table/comment-page-1/#comment-380</link>
		<dc:creator>Michael J</dc:creator>
		<pubDate>Tue, 02 Mar 2010 06:09:31 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqldiary.com/?p=20#comment-380</guid>
		<description>Perfect solution. Thank you</description>
		<content:encoded><![CDATA[<p>Perfect solution. Thank you</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Richard Lynch</title>
		<link>http://www.mysqldiary.com/how-to-produce-random-rows-from-a-table/comment-page-1/#comment-9</link>
		<dc:creator>Richard Lynch</dc:creator>
		<pubDate>Tue, 01 Dec 2009 03:27:47 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqldiary.com/?p=20#comment-9</guid>
		<description>What I find simplest is to add an indexed column:
alter table myTable add static_rand int(11) default null;
create index myTable_rand_index on myTable(static_rand);

Then I initialize the values to random numbers in batches of, say, 10000:
update myTable set static_rand = 1000000 * rand() where static_rand is null limit 10000;

Repeat until there are no NULL values.

I can then do:
select id from myTable order by static_rand limit 10;
//get the id values in $VALUES
update myTable set static_rand = 1000000 * rand() where id in ($VALUES)

So basically, it &quot;shuffles&quot; the records to start with, and then pops the required set off the top of the deck, and moves the &quot;used&quot; part of the deck back to random locations within the deck.

This works well for quite large datasets, as the static_rand index fits into RAM and the record set being altered is generally quite small and fast to update.

Odds are quite good, actually, that you&#039;ll be doing something else with the randomly-selected records anyway, so they&#039;ll be in RAM as a working set usually.

Invariably, this is all wrapped up in application logic with a single entry point anyway, and the random constraint &quot;feels&quot; more like an application business requirement than a DB-layer concern, so I have no problem doing this in the application layer...

But I&#039;m sure a MySQL guru could re-work it easily enough into an SPROC.

I personally find this to fit KISS (Keep it simple, stupid) best of all the solutions I&#039;ve tried over the years.

ymmv</description>
		<content:encoded><![CDATA[<p>What I find simplest is to add an indexed column:<br />
alter table myTable add static_rand int(11) default null;<br />
create index myTable_rand_index on myTable(static_rand);</p>
<p>Then I initialize the values to random numbers in batches of, say, 10000:<br />
update myTable set static_rand = 1000000 * rand() where static_rand is null limit 10000;</p>
<p>Repeat until there are no NULL values.</p>
<p>I can then do:<br />
select id from myTable order by static_rand limit 10;<br />
//get the id values in $VALUES<br />
update myTable set static_rand = 1000000 * rand() where id in ($VALUES)</p>
<p>So basically, it &#8220;shuffles&#8221; the records to start with, and then pops the required set off the top of the deck, and moves the &#8220;used&#8221; part of the deck back to random locations within the deck.</p>
<p>This works well for quite large datasets, as the static_rand index fits into RAM and the record set being altered is generally quite small and fast to update.</p>
<p>Odds are quite good, actually, that you&#8217;ll be doing something else with the randomly-selected records anyway, so they&#8217;ll be in RAM as a working set usually.</p>
<p>Invariably, this is all wrapped up in application logic with a single entry point anyway, and the random constraint &#8220;feels&#8221; more like an application business requirement than a DB-layer concern, so I have no problem doing this in the application layer&#8230;</p>
<p>But I&#8217;m sure a MySQL guru could re-work it easily enough into an SPROC.</p>
<p>I personally find this to fit KISS (Keep it simple, stupid) best of all the solutions I&#8217;ve tried over the years.</p>
<p>ymmv</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Bill Karwin</title>
		<link>http://www.mysqldiary.com/how-to-produce-random-rows-from-a-table/comment-page-1/#comment-8</link>
		<dc:creator>Bill Karwin</dc:creator>
		<pubDate>Mon, 30 Nov 2009 21:04:31 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqldiary.com/?p=20#comment-8</guid>
		<description>The solution to account for gaps produces results weighted toward id values that follow gaps.  In other words, if no rows with id 400-405 exist, then id 406 will be returned for any random number between 400 and 406.
 
Also you assume that id starts at 0.  For example, if you have 1000 rows but the least id value is &gt;1000, no random value between 0 and COUNT(*) will ever match any id in your table.

Also I&#039;m not sure why you are setting a User Variable at all.  You make no use of the variable in your query.  I tried a test with your query and it still functions if I omit the expressions that initialize and increment @num.</description>
		<content:encoded><![CDATA[<p>The solution to account for gaps produces results weighted toward id values that follow gaps.  In other words, if no rows with id 400-405 exist, then id 406 will be returned for any random number between 400 and 406.</p>
<p>Also you assume that id starts at 0.  For example, if you have 1000 rows but the least id value is &gt;1000, no random value between 0 and COUNT(*) will ever match any id in your table.</p>
<p>Also I&#8217;m not sure why you are setting a User Variable at all.  You make no use of the variable in your query.  I tried a test with your query and it still functions if I omit the expressions that initialize and increment @num.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Hazan Ilan</title>
		<link>http://www.mysqldiary.com/how-to-produce-random-rows-from-a-table/comment-page-1/#comment-7</link>
		<dc:creator>Hazan Ilan</dc:creator>
		<pubDate>Mon, 30 Nov 2009 19:32:07 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqldiary.com/?p=20#comment-7</guid>
		<description>If there are some holes in the table you can fix it by:
 
SELECT (SELECT myTable2.id FROM myTable myTable2 WHERE myTable2.id &gt;= b.num LIMIT 1) c FROM (SELECT FLOOR(RAND() * (SELECT COUNT(*) FROM myTable)) num, @num:=@num+1 FROM (SELECT @num:=0) a, myTable LIMIT X) b;
 
To improve performance you can combine this solution with a left join: 

SELECT IF(c.id IS NULL, (SELECT c2.id FROM myTable c2 WHERE c2.id &gt; b.num LIMIT 1), c.id) d FROM (SELECT FLOOR(RAND() * (SELECT COUNT(*) FROM myTable)) num, @num:=@num+1 FROM (SELECT @num:=0) a, myTable LIMIT X) b LEFT JOIN myTable c ON (c.id = b.num);</description>
		<content:encoded><![CDATA[<p>If there are some holes in the table you can fix it by:</p>
<p>SELECT (SELECT myTable2.id FROM myTable myTable2 WHERE myTable2.id &gt;= b.num LIMIT 1) c FROM (SELECT FLOOR(RAND() * (SELECT COUNT(*) FROM myTable)) num, @num:=@num+1 FROM (SELECT @num:=0) a, myTable LIMIT X) b;</p>
<p>To improve performance you can combine this solution with a left join: </p>
<p>SELECT IF(c.id IS NULL, (SELECT c2.id FROM myTable c2 WHERE c2.id &gt; b.num LIMIT 1), c.id) d FROM (SELECT FLOOR(RAND() * (SELECT COUNT(*) FROM myTable)) num, @num:=@num+1 FROM (SELECT @num:=0) a, myTable LIMIT X) b LEFT JOIN myTable c ON (c.id = b.num);</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jason</title>
		<link>http://www.mysqldiary.com/how-to-produce-random-rows-from-a-table/comment-page-1/#comment-6</link>
		<dc:creator>Jason</dc:creator>
		<pubDate>Mon, 30 Nov 2009 14:49:47 +0000</pubDate>
		<guid isPermaLink="false">http://www.mysqldiary.com/?p=20#comment-6</guid>
		<description>Very interesting idea.  I wonder how the performance is compared with the more common implementation of code + query.</description>
		<content:encoded><![CDATA[<p>Very interesting idea.  I wonder how the performance is compared with the more common implementation of code + query.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
