Tuesday, November 9, 2010

Matlab DCS + Torque : What's missing?

Matlab Distributed Computing Server directly supports Torque scheduler with minimum configuration. However, when it passes parameter such as number of CPU to Torque, it only allows passing one parameter using '^N^' keyword, which is defined in 'ResourceTemplate' at 'Scheduler Configuration Properties' window.

Unfortunately, Torque, in general, needs two parameters to specify number of CPU in this format to submit multi-nodes job.

nodes=X:ppn=Y

where X denotes number of node, Y denotes processor per node, respectively.

Torque set ppn to 1 by default if it is not explicitly specified. In other words, 4 way job result in

nodes=4:ppn=1

instead of

nodes=1:ppn=4

which is ideal for system having node with 4 cores each.

I needed a Matlab function that can handle this conversion between Matlab user and Matlab Distributed Computing Server.

TASKLAYOUT(X,Y) takes one parameter and claculates optimal way to define X, Y values and constructs right string which can be used by 'psub' command internally.

For example, assume each node has 8 cores, then TASKLAYOUT(X,Y) generate these output;

4 -> nodes=1:ppn=4
8 -> nodes=1:ppn=8
12 -> nodes=
2:ppn=6
13 -> nodes=2:ppn=7
16 -> nodes=2:ppn=8
18 -> nodes=3:ppn=6
23 -> nodes=3:ppn=8
24 -> nodes=3:ppn=8
32 -> nodes=4:ppn=8

In case of 12,13 and 18 way job, it does not requests full number of cores per node to minimized waste of unused cores. But, there is still sort of internal fragmentation in case of 13, which requests 14 cores even though it only needs 13 cores. In other words, requesting 13 cores are not really efficient at all.

Now it's time to install and test it.

Follow these steps to install tasklayout.m.

1. Define 'ResourceTemplate' in TORQUE Scheduler Configuration Properties

Click 'Parallel/Manage Configurations' on Matlab window.
Click 'File/Import' on 'Configurations Manager window'.
Double click configuration you want to modify.
Set 'ResourceTemplate' to '-l nodes=^N^,walltime=24:00:00'.
*Probably, most of you have done this already.

2. Copy tasklayout.m to Matlab directory.

cp tasklayout.m $MATLAB/toolbox/local


3. Modify $MATLAB/toolbox/distcomp/@distcomp/@pbsscheduler/pSubmitParallelJob.m. Assume each node has 8 cores.

selectStr=strrep(pbs.ResourceTemplate,'^N^',num2str(length(job.Tasks)))

to

selectStr=strrep(pbs.ResourceTemplate,'^N^',tasklayout(length(job.Tasks),8))

4. Done

Please, let me know if it is not compatible in certain environment or have bugs or anything is missing in this function.

Thanks




function taskStr = tasklayout( numTasks,numPpn )
%TASKLAYOUT Construct CPU requirement string for Torque Scheduler
% TASKLAYOUT(X,Y) construct CPU requirement string for X tasks
% with each node has Y cores.
%
% Notes:
%
% Matlab Distributed Computing Server directly supports Torque scheduler
% with minimum configuration. However, when it passes parameter such as
% number of CPU to Torque, it only allows passing one parameter using '^N^'
% keyword, which is defined in 'ResourceTempalte' at 'Scheduler
% Configuration Properties' window.
%
% Unfortunately, Torque, in general, needs two parameters in this format
% to submit multi nodes job.
%
% nodes=X:ppn=Y
%
% where X denotes number of node, Y denotes processor per node,
% respectively.
%
% TASKLAYOUT takes one parameter and claculates optimal way to define X, Y
% values and constructs right string which can be used by 'psub' command
% internally.
%
% Example 1:
% % Run 16 way job, each node has 8 cores.
% reqStr = tasklayout(16,8)
%
% ans =
%
% nodes=2:ppn=8
%
% returns
% taskStr
%
% See also matlabpool

% For Matlab Distributed Computing Server
% (c) Brian Kim @ Texas A&M University

if ( numTasks <= numPpn )
taskStr = strcat('1:ppn=',num2str(numTasks)) ;
elseif ( numTasks > numPpn )
if ( rem(numTasks,numPpn) == 0 )
taskStr = strcat(num2str(ceil(numTasks/numPpn)), ... ,
':ppn=',num2str(numPpn));
else
taskStr = strcat(num2str(ceil(numTasks/numPpn)),':ppn=', ... ,
num2str(ceil(numTasks/ceil(numTasks/numPpn))));
end
end


2 comments:

  1. another way to do it is to modify the pSubmitParallelJob.m like this

    NumofNodes = int16(length( job.Tasks )/4 +0.49);
    if ~isempty( pbs.ResourceTemplate )
    % selectStr = strrep( pbs.ResourceTemplate, '^N^', num2str( length( job.Tasks ) ) );
    selectStr = strrep( pbs.ResourceTemplate, '^N^', num2str( NumofNodes ) );
    else
    error( 'distcomp:pbsscheduler:noResourceTemplate', ...
    ['Please fill out a ResourceTemplate for the PBS scheduler, using the ', ...
    'syntax "^N^" to represent the number of tasks'] );
    end

    then we can have configuration like:

    "-l ppn=^N^:cpus=4"

    ReplyDelete
  2. @Winston. Thanks for comment. One of main point of this function is that it does not use constant value for cpus. For example, 6 tasks can be '-l nodes=2:ppn3' instead of '-l nodes=2:ppn4'. By doing this way, 2 cores could be used for other jobs, otherwise would have been wasted. Of course, it depends on whether each node is configured to share with multiple jobs or not.

    ReplyDelete