Sunday, November 19, 2006

PHP, MySQL Stored Procedures, and Recursion

A current project which now includes the use of PHP, MySQL Stored Procedures, recursion, temporary tables, etc., and some of the issues encountered along the way, leads me to write up these notes to share with others. Hierarchical datasets have been discussed before, and the means to traverse them in SQL. This current dataset is not hierarchical, not even a partial ordering, but a network, somewhat along the lines of a social network. In this instance it is musicians who play for bands, and creating an extended network by following up on other bands that musicians played for, and also researching all members of a particular band that a musician is a member of, ad infinitum or at least to maximum recursion depth, currently set at 25.

Right now the stored procedures consist of two: a non-recursive function that creates a temporary table and makes the initial call to the recursive procedure. No resultsets are returned; the PHP script that calls the procedures performs a query against the temporary table. I wanted to write a stored function which would call the stored procedures, query the number of rows in the temporary table and return that value, but I ran into all sorts of issues there and am currently stymied. More on that in a moment.

Anyway the current set of procedures is performing nicely, creating a temporary table (engine=Memory) and populating it with all of the bands and musicians along with degree of separation as indicated by recursion depth. My notion is ultimately when researching a musician, a group, or a song, to be able to display related groups, musicians, works, etc. Currently I am not crawling the network created by linking songs to artists but I hope to add that feature in a future version.

The idea of creating a function to return the row count in the temporary table was to then run a query of all artists and this number, ordered by the count, thus to find the largest networks in order to explore them. The major issue that is blocking this is that functions can be called from statements, and statements can be part of transactions, and therefore stored functions execute in a different environment than stored procedures call directly. This environment is then imposed on any stored procedures that may be called, rather like the restrictions imposed by calling stored routines from triggers. I can understand the rationale for conducting execution in this fashion, but it is not less frustrating to be so stymied.

MySQL allows the CREATE TABLE function to be executed from this limited environment, however either the actual creation of the table is delayed until the function completes, or the function is not allowed to do the COMMIT, implicitly or explicitly, required to make the new table visible during the remainder of the execution of the function. So the CREATE TABLE completes successfully, but the new table either does not exist yet or is not permitted to be visible, thus guaranteeing an error when the table is subsequently referenced.

Future attempts at implementing this function will undoubtedly center around having the table already exist, and possibly doing a TRUNCATE operation, but these solutions are not as elegant in my mind, necessitating a prior query from PHP to perform the setup work.

3 Comments:

Anonymous Anonymous said...

Hi Bob,


I was wondering, if you are already using a temp table, why use recursion? You can solve it using iteration without running into limits.

just a thought

kind regards,

Roland Bouman

8:07 AM, November 20, 2006  
Blogger Bob Field said...

Actually I will probably want to eliminate the recursion since it could be an open-ended search process, but I also like to be able to impose a limit on how far I crawl the network.

10:06 AM, November 20, 2006  
Blogger Ocramius Aethril said...

You can limit the crawler actions without recursion too :)
It's just messy code with better performance, or better 'should be'...

3:19 AM, October 09, 2009  

Post a Comment

<< Home