Generating prime numbers in PHP using multiprocessing - Gearman with PHP client Part I

Please take a look at the other two tutorials to install Gearman job server and Gearman PHP PECL extension on Linux (including a simple example) in order to run the examples below.

This tutorial generates a list of prime numbers, using Gearman as a job server to spread the work across multiple processes and speed it up. I'll present the code first, and then we'll look at how it uses multiple processors to do the actual work.

The worker code that does the actual work of prime number generation is as follows (please refer to the Gearman PHP documentation for details on each call):

<?php
$gmworker= new GearmanWorker();
$gmworker->addServer();
$gmworker->addFunction("prime_nos", "print_primes");

while ($gmworker->work())
{
  if ($gmworker->returnCode() != GEARMAN_SUCCESS)
  {
    echo "return_code: ".$gmworker->returnCode()."\n";
    break;
  }
}

function print_primes($job)
{
  $load    = unserialize($job->workload());
  $start   = $load['start'];
  $end     = $load['end'];
  $result  = '';
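  for ($i = $start; $i <= $end; $i++)
  {
    # only odd numbers are tested, so 2 is skipped and 1 is let through
    if (($i % 2) != 1) continue;
    # trial division by odd divisors 3, 5, 7, ... up to sqrt($i)
    $d = 3;
    $x = sqrt($i);
    while ($i % $d != 0 && $d < $x) $d += 2;
    # $i is composite only if $d divides it and $d is not $i itself
    if (!(($i % $d) == 0 && $i != $d)) $result .= $i.',';
  }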
  return $result;
}
?>
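
The worker's loop only tests odd numbers, so it never reports 2 and lets 1 through. As a minimal sketch of a stricter check, the function below uses the same trial-division idea but starts at 2 and treats 2 as a special case. The name `primes_in_range` is hypothetical and not part of the original worker, and the type hints assume PHP 7 or later.

```php
<?php
// Hypothetical helper (not in the original worker): returns the primes in
// [$start, $end] using trial division by odd divisors up to sqrt($n).
function primes_in_range(int $start, int $end): array
{
    $primes = [];
    for ($n = max(2, $start); $n <= $end; $n++) {
        if ($n === 2) {            // 2 is the only even prime
            $primes[] = 2;
            continue;
        }
        if ($n % 2 === 0) {        // skip every other even number
            continue;
        }
        $isPrime = true;
        for ($d = 3; $d * $d <= $n; $d += 2) {
            if ($n % $d === 0) {   // found an odd divisor, so $n is composite
                $isPrime = false;
                break;
            }
        }
        if ($isPrime) {
            $primes[] = $n;
        }
    }
    return $primes;
}

echo implode(',', primes_in_range(0, 30)), "\n";   // prints 2,3,5,7,11,13,17,19,23,29
```

To use something like this inside the worker, the job function could return `implode(',', primes_in_range($start, $end))` in place of the string it builds by hand.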

Next is the client, which submits tasks to the Gearman server.

<?php
# create the gearman client
$gmc= new GearmanClient();

# add the default server (localhost)
$gmc->addServer();

# register some callbacks
$gmc->setCreatedCallback ("reverse_created");
$gmc->setDataCallback    ("reverse_data");
$gmc->setStatusCallback  ("reverse_status");
$gmc->setCompleteCallback("reverse_complete");
$gmc->setFailCallback    ("reverse_fail");

# add a single task covering the range 0 to 30
$tasks = array();
$data['foo'] = "bar";
$tasks[] = $gmc->addTask("prime_nos", 
                         serialize(array("start" => 0, "end" => 30)),$data);
# run the tasks in parallel (assuming multiple workers)
if(!$gmc->runTasks()){echo "ERROR ".$gmc->error()."\n";   exit;}

echo "DONE\n";
function reverse_created($task){echo "CREATED: ".$task->jobHandle()."\n";}
function reverse_status($task){ echo "STATUS: ".$task->jobHandle()." - ".$task->taskNumerator()."/".$task->taskDenominator()."\n";}
function reverse_complete($task){ echo "COMPLETE: ".$task->jobHandle().", ".$task->data()."\n";}
function reverse_fail($task){echo "FAILED: ".$task->jobHandle()."\n";}
function reverse_data($task){ echo "DATA: ".$task->data()."\n";}
?>

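Here is the output of the first complete run. Note that the list starts with 1, which is not prime, and omits 2, because the worker only tests odd numbers.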

CREATED: H:debian-dba:251197
COMPLETE: H:debian-dba:251197, 1,3,5,7,11,13,17,19,23,29,
DONE

Proof that it uses multiple processes:

To show that Gearman can keep multiple processors busy, I'll make slight modifications to the worker and client code. In the worker, wrap the prime-generation code in an extra loop: add the for line and its opening curly brace between the $end = $load['end']; and $result = ''; lines. Here is what it looks like after the change:

$end = $load['end'];
for($j = 0; $j < 999999; $j++)
{
$result = '';

Also, add a closing curly brace just before return $result; to end the new loop:

}
return $result;

This adds a loop of roughly a million iterations that repeats the same computation, so the job runs long enough for us to check whether more than one processor is in use. You also need to run more than one worker. Assuming the worker's filename is worker_primes.php, start it twice in the background like this:

$ /opt/php/bin/php worker_primes.php &
$ /opt/php/bin/php worker_primes.php &

The client also needs a change so that it sends multiple tasks to the Gearman server, which can then hand them to different workers and hence different processors. Replace the following:

$tasks[] = $gmc->addTask("prime_nos", 
                         serialize(array("start" => 0, "end" => 30)),$data);

with

for($i = 0; $i < 2; $i++)
{
 $tasks[] = $gmc->addTask("prime_nos", serialize(array("start" => ($i * 30), "end" => (($i+1)*30) )),$data);
}
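
The loop above hard-codes two ranges, and both of them include 30. As a sketch of how an arbitrary interval could be split into contiguous, non-overlapping chunks for any number of tasks, consider the helper below. The name `split_range` is hypothetical and not part of the original client, and `intdiv` assumes PHP 7 or later.

```php
<?php
// Hypothetical helper (not in the original client): split [$start, $end]
// into at most $parts contiguous, non-overlapping sub-ranges of near-equal size.
function split_range(int $start, int $end, int $parts): array
{
    $total = $end - $start + 1;
    if ($total <= 0 || $parts <= 0) {
        return [];
    }
    $parts  = min($parts, $total);       // never create empty chunks
    $base   = intdiv($total, $parts);    // minimum size of each chunk
    $extra  = $total % $parts;           // the first $extra chunks get one more number
    $chunks = [];
    $lo     = $start;
    for ($i = 0; $i < $parts; $i++) {
        $size     = $base + ($i < $extra ? 1 : 0);
        $chunks[] = ['start' => $lo, 'end' => $lo + $size - 1];
        $lo      += $size;
    }
    return $chunks;
}

foreach (split_range(0, 59, 2) as $chunk) {
    // Each serialized chunk has the same shape as the workload passed to addTask().
    echo serialize($chunk), "\n";
}
```

With two parts over 0 to 59 this yields the ranges 0 to 29 and 30 to 59, so no number is sent to two workers.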

The job will take a while to finish. In the meantime, run the following in another session/screen to watch the activity of your processors.

mpstat -P ALL 1

Note: If mpstat is not installed, you can install it on Ubuntu/Debian with apt-get install sysstat.

Here is the output of the command above (while the job is running).

Linux 2.6.26-2-686 (debian-dba)         09/12/2010      _i686_

05:36:49 AM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
05:36:50 AM  all  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      1.00
05:36:50 AM    0  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
05:36:50 AM    1  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      1.00

05:36:50 AM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
05:36:51 AM  all  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      4.00
05:36:51 AM    0  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      2.00
05:36:51 AM    1   99.01    0.00    0.00    0.00    0.00    0.99    0.00    0.00      2.00

05:36:51 AM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
05:36:52 AM  all   99.50    0.00    0.00    0.00    0.50    0.00    0.00    0.00      4.00
05:36:52 AM    0  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      2.00
05:36:52 AM    1   99.00    0.00    0.00    0.00    1.00    0.00    0.00    0.00      2.00

05:36:52 AM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
05:36:53 AM  all   99.50    0.00    0.00    0.00    0.00    0.50    0.00    0.00     23.00
05:36:53 AM    0  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00     21.00
05:36:53 AM    1  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      2.00

Here is the client's output once the job finishes:

CREATED: H:debian-dba:251204
CREATED: H:debian-dba:251205
COMPLETE: H:debian-dba:251205, 1,3,5,7,11,13,17,19,23,29,
COMPLETE: H:debian-dba:251204, 31,37,41,43,47,53,59,
DONE
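
The COMPLETE lines can arrive in any order, and each payload is a comma-terminated string. As a minimal sketch of combining such payloads into one sorted list, consider the helper below. The name `merge_results` is hypothetical and not part of the original client.

```php
<?php
// Hypothetical helper (not in the original client): turn the comma-terminated
// strings returned by the workers (e.g. "31,37,41,") into one sorted list of integers.
function merge_results(array $payloads): array
{
    $numbers = [];
    foreach ($payloads as $payload) {
        foreach (explode(',', $payload) as $piece) {
            if ($piece !== '') {          // skip the empty piece after the trailing comma
                $numbers[] = (int) $piece;
            }
        }
    }
    sort($numbers);
    return $numbers;
}

// Sample payloads copied from the output above, deliberately listed out of order.
$payloads = ['31,37,41,43,47,53,59,', '1,3,5,7,11,13,17,19,23,29,'];
echo implode(',', merge_results($payloads)), "\n";
```

In the client, the complete callback could append `$task->data()` to an array like `$payloads`, and the merge could run after `runTasks()` returns.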

As the processor activity shows, both CPUs stay at about 100% user time while the two workers run, so the jobs are being processed in parallel. Gearman makes multiprocessing in PHP easy and efficient.

Please feel free to use the comments form below if you have any questions or need more explanation on anything. I recommend testing thoroughly on a production-like test system before moving to production.