capsul.process

Inheritance diagram of capsul.process, capsul.process.process, capsul.process.nipype_process, capsul.process.runprocess, capsul.process.xml, capsul.pipeline.pipeline, capsul.pipeline.process_iteration

capsul.process module

capsul.process.process submodule

Process main class and infrastructure

Classes

Process

FileCopyProcess

InteractiveProcess

NipypeProcess

ProcessMeta

ProcessResult

class capsul.process.process.FileCopyProcess(activate_copy=True, inputs_to_copy=None, inputs_to_clean=None, destination=None, inputs_to_symlink=None, use_temp_output_dir=False)[source]

A specific process that copies all the input files.

copied_inputs

the list of copied file parameters {param: dst_value}

Type:

dict

copied_files

copied files {param: [dst_value1, …]}

Type:

dict

__call__()
_update_input_traits()[source]
_get_process_arguments()[source]
_copy_input_files()[source]

Note

  • Type ‘FileCopyProcess.help()’ for a full description of this process parameters.

  • Type ‘<FileCopyProcess>.get_input_spec()’ for a full description of this process input trait types.

  • Type ‘<FileCopyProcess>.get_output_spec()’ for a full description of this process output trait types.

Initialize the FileCopyProcess class.

Parameters:
  • activate_copy (bool (default True)) – if False this class is transparent and behaves as a Process class.

  • inputs_to_copy (list of str (optional, default None)) – the list of inputs to copy. If None, all the input files are copied.

  • inputs_to_clean (list of str (optional, default None)) – some copied inputs that can be deleted at the end of the processing. If None, all copied files will be cleaned.

  • destination (str (optional default None)) – where the files are copied. If None, the output directory will be used, unless use_temp_output_dir is set.

  • inputs_to_symlink (list of str (optional, default None)) – as inputs_to_copy, but for files which should be symlinked

  • use_temp_output_dir (bool) – if True, the output_directory parameter is set to a temp one during execution, then outputs are copied / moved / hardlinked to the final location. This is useful when several parallel jobs are working in the same directory and may write the same intermediate files (SPM does this a lot).
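The copy step described by these parameters can be pictured with a minimal standard-library sketch. The helper name `copy_inputs` and the flat `{param: path}` dictionary are assumptions made for illustration; this is not the FileCopyProcess implementation, and symlinking and cleaning are left out.

```python
import os
import shutil
import tempfile


def copy_inputs(inputs, destination, inputs_to_copy=None):
    """Hypothetical helper: copy selected input files into destination.

    inputs is a {param: source_path} dict. If inputs_to_copy is None,
    every input is copied, mirroring the documented default.
    Returns {param: dst_value}, like the copied_inputs attribute.
    """
    if inputs_to_copy is None:
        inputs_to_copy = list(inputs)
    copied = {}
    for param in inputs_to_copy:
        src = inputs[param]
        dst = os.path.join(destination, os.path.basename(src))
        shutil.copy2(src, dst)
        copied[param] = dst
    return copied


# Small demonstration with throwaway directories.
src_dir = tempfile.mkdtemp()
dst_dir = tempfile.mkdtemp()
src_file = os.path.join(src_dir, 'image.txt')
with open(src_file, 'w') as f:
    f.write('data')

copied_inputs = copy_inputs({'image': src_file}, dst_dir)
print(copied_inputs)
```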

class capsul.process.process.InteractiveProcess(**kwargs)[source]

Base class for interactive processes. The value of the is_interactive parameter determines whether the process can be run in the background (possibly remotely) as a standard process (is_interactive = False) or must be executed interactively in the user environment (is_interactive = True).

Note

  • Type ‘InteractiveProcess.help()’ for a full description of this process parameters.

  • Type ‘<InteractiveProcess>.get_input_spec()’ for a full description of this process input trait types.

  • Type ‘<InteractiveProcess>.get_output_spec()’ for a full description of this process output trait types.

Initialize the Process class.

class capsul.process.process.NipypeProcess(*args, **kwargs)[source]

Base class used to wrap nipype interfaces.

Note

  • Type ‘NipypeProcess.help()’ for a full description of this process parameters.

  • Type ‘<NipypeProcess>.get_input_spec()’ for a full description of this process input trait types.

  • Type ‘<NipypeProcess>.get_output_spec()’ for a full description of this process output trait types.

Initialize the NipypeProcess class.

A NipypeProcess instance automatically gets an additional user trait ‘output_directory’.

This class also works around some shortcomings of nipype version ‘0.10.0’.

NipypeProcess is normally not instantiated directly, but through the CapsulEngine factory, using a nipype interface name:

ce = capsul_engine()
npproc = ce.get_process_instance('nipype.interfaces.spm.Smooth')

However, it is still possible to instantiate it directly, using a nipype interface class or instance:

npproc = NipypeProcess(nipype.interfaces.spm.Smooth)

NipypeProcess may be subclassed for specialized interfaces. In such a case, the subclass may provide:

  • (optionally) a class attribute _nipype_class_type to specify the nipype interface class. If present, the nipype interface class or instance will not be specified in the constructor call.

  • (optionally) a __postinit__() method which will be called in addition to the constructor, but later, once the instance is correctly set up. This __postinit__ method allows customizing the new class instance.

  • (optionally) a class attribute _nipype_trait_mapping: a dict specifying a translation table between nipype trait names and the names they will get in the Process instance. By default, inputs get the same name as in their nipype interface, and outputs are prefixed with an underscore (‘_’) to avoid name collisions when a trait exists both in inputs and outputs in nipype. A special trait name _spm_script_file is also used in SPM interfaces to write the matlab script. It can also be translated to a different name in this dict.
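The default naming rule and the effect of a translation table can be sketched in plain Python. The function `process_trait_name` is hypothetical and only mimics the rule stated above; it does not call into Capsul or nipype.

```python
def process_trait_name(nipype_name, is_output, mapping=None):
    """Hypothetical helper mimicking the documented naming rule.

    An explicit entry in the mapping wins; otherwise inputs keep their
    nipype name and outputs get an underscore prefix.
    """
    mapping = mapping or {}
    if nipype_name in mapping:
        return mapping[nipype_name]
    return '_' + nipype_name if is_output else nipype_name


# Default behavior: the input keeps its name, the output is prefixed.
print(process_trait_name('fwhm', is_output=False))
print(process_trait_name('smoothed_files', is_output=True))

# With a translation table, the output keeps its nipype name.
trait_mapping = {'smoothed_files': 'smoothed_files'}
print(process_trait_name('smoothed_files', True, trait_mapping))
```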

Subclasses should preferably not define an __init__ method, because it may be called twice if no precaution is taken to avoid it (a __np_init_done__ instance attribute is set once init is done the first time).

Ex:

class Smooth(NipypeProcess):
    _nipype_class_type = spm.Smooth
    _nipype_trait_mapping = {
        'smoothed_files': 'smoothed_files',
        '_spm_script_file': 'spm_script_file'}

smooth = Smooth()

Parameters:
  • nipype_instance (nipype interface (mandatory, except from internals)) – the nipype interface we want to wrap in capsul.

  • use_temp_output_dir (bool or None) – use a temp working directory during processing

_nipype_interface

private attribute to store the nipype interface

Type:

Interface

_nipype_module

private attribute to store the nipype module name

Type:

str

_nipype_class

private attribute to store the nipype class name

Type:

str

_nipype_interface_name

private attribute to store the nipype interface name

Type:

str

classmethod help(nipype_interface, returnhelp=False)[source]

Method to print the full wrapped nipype interface help.

Parameters:
  • cls (process class (mandatory)) – a nipype process class

  • nipype_instance (nipype interface (mandatory)) – a nipype interface object that will be documented.

  • returnhelp (bool (optional, default False)) – if True return the help string message, otherwise display it on the console.

requirements()[source]

Requirements needed to run the process. It is a dictionary whose keys are config/settings modules and whose values are requests for them.

The default implementation returns an empty dict (no requirements), and should be overloaded by processes which actually have requirements.

Ex:

{'spm': 'version >= "12" and standalone == "True"'}

set_output_directory(out_dir)[source]

Set the process output directory.

Parameters:

out_dir (str (mandatory)) – the output directory

set_usedefault(parameter, value)[source]

Set the value of the usedefault attribute on a given parameter.

Parameters:
  • parameter (str (mandatory)) – name of the parameter to modify.

  • value (bool (mandatory)) – value set to the usedefault attribute

class capsul.process.process.Process(**kwargs)[source]

A process is an atomic component that contains a processing.

A process is typically an object with typed parameters and an execution function. Parameters are described using Enthought traits through the Soma-Base Controller base class.

In addition to describing its parameters, a Process must implement its execution function, either through a python method, by overloading _run_process(), or through a commandline execution, by overloading get_commandline(). The second way allows running on a remote processing machine which does not necessarily have capsul, or even python, installed.

Parameters are declared or queried using the traits API, and their values are in the process instance variables:

from __future__ import print_function
from capsul.api import Process
import traits.api as traits

class MyProcess(Process):

    # a class trait
    param1 = traits.Str('def_param1')

    def __init__(self):
        super(MyProcess, self).__init__()
        # declare an input param
        self.add_trait('param2', traits.Int())
        # declare an output param
        self.add_trait('out_param', traits.File(output=True))

    def _run_process(self):
        with open(self.out_param, 'w') as f:
            print('param1:', self.param1, file=f)
            print('param2:', self.param2, file=f)

# run it with parameters
MyProcess()(param2=12, out_param='/tmp/log.txt')

Note about the File and Directory traits

The File trait type represents a file parameter. A file is actually two things: a filename (string), and the file itself (on the filesystem). For an input it is OK not to distinguish them, but for an output, there are two different cases:

  • the file (on the filesystem) is an output, but the filename (string) is given as an input: this is the classical “commandline” behavior, when we tell the program where it should write its output file.

  • the file is an output, and the filename is also an output: this is rather a “function return value” behavior: the process determines internally where it should write the file, and tells as an output where it did.

To distinguish these two cases, in Capsul we normally add in the File or Directory trait a property input_filename which is True when the filename is an input, and False when the filename is an output:

self.add_trait('out_file',
               traits.File(output=True, input_filename=False))

However, as most of our processes are based on the “commandline behavior” (the filename is an input) and we often forget to specify the input_filename parameter, the default is the “filename is an input” behavior, when not specified.
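The two output-file behaviors can be illustrated without any Capsul code. The sketch below uses two hypothetical functions, `write_report` and `write_report_auto`, to contrast a filename chosen by the caller with a filename chosen by the function itself.

```python
import os
import tempfile


def write_report(text, out_file):
    """'Commandline' behavior: the filename is an input chosen by the caller."""
    with open(out_file, 'w') as f:
        f.write(text)


def write_report_auto(text):
    """'Return value' behavior: the function picks the filename and returns it."""
    fd, out_file = tempfile.mkstemp(suffix='.txt')
    with os.fdopen(fd, 'w') as f:
        f.write(text)
    return out_file


# Case 1: the caller decides where the file goes.
chosen = os.path.join(tempfile.mkdtemp(), 'report.txt')
write_report('hello', chosen)

# Case 2: the caller learns the location only after the call.
produced = write_report_auto('hello')
print(chosen, produced)
```

In trait terms, the first function corresponds to the default (the filename is an input), and the second to input_filename=False.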

Attributes

name

the class name.

Type:

str

id

the string description of the class location (i.e., module.class).

Type:

str

log_file

if None, the log will be generated in the current directory; otherwise it will be written to the log_file path.

Type:

str (default None)

Note

  • Type ‘Process.help()’ for a full description of this process parameters.

  • Type ‘<Process>.get_input_spec()’ for a full description of this process input trait types.

  • Type ‘<Process>.get_output_spec()’ for a full description of this process output trait types.

Initialize the Process class.

add_trait(name, trait)[source]

Ensure that trait.output and trait.optional are set to a boolean value before calling parent class add_trait.

check_requirements(environment='global', message_list=None)[source]

Checks the process requirements against configuration settings values in the attached CapsulEngine. This makes use of the requirements() method and checks that there is one matching config value for each required module.

Parameters:
  • environment (str) – config environment id. Normally corresponds to the computing resource name, and defaults to “global”.

  • message_list (list) – if not None, this list will be updated with messages for unsatisfied requirements, in order to present the user with an understandable error.

Returns:

config – if None is returned, requirements are not met: the process cannot run. If a dict is returned, it corresponds to the matching config values. When no requirements are needed, an empty dict is returned. A pipeline, if its requirements are met, will return a list of configuration values, because different nodes may require different config values.

Return type:

dict, list, or None
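The return convention (None, an empty dict, or a dict of matching configs) can be sketched as follows. This is only an illustration under strong assumptions: the function `match_requirements`, the `settings` layout, and the use of a restricted `eval` to test a request string are all stand-ins, not the way Capsul queries its CapsulEngine settings.

```python
def match_requirements(requirements, settings, message_list=None):
    """Hypothetical stand-in for requirement matching.

    requirements: {module: request string}, as returned by requirements().
    settings: {module: [config dict, ...]} (assumed layout).
    Returns {module: matching config}, {} if nothing is required,
    or None if some requirement cannot be satisfied.
    """
    matched = {}
    for module, request in requirements.items():
        found = None
        for config in settings.get(module, []):
            try:
                # Evaluate the request with config values as variables.
                ok = eval(request, {'__builtins__': {}}, dict(config))
            except NameError:
                ok = False
            if ok:
                found = config
                break
        if found is None:
            if message_list is not None:
                message_list.append(
                    'requirement not met: %s: %s' % (module, request))
            return None
        matched[module] = found
    return matched


settings = {'spm': [{'version': '12', 'standalone': 'False'},
                    {'version': '12', 'standalone': 'True'}]}
request = {'spm': 'version >= "12" and standalone == "True"'}

messages = []
print(match_requirements(request, settings))
print(match_requirements({}, settings))
print(match_requirements({'fsl': 'version >= "5"'}, settings, messages))
print(messages)
```

Note that the version test here is a plain string comparison, which is one more reason to treat this as a sketch only.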

get_commandline()[source]

Method to generate a commandline representation of the process.

If not implemented, it will generate a commandline running python, instantiating the current process, and calling its _run_process() method.

Returns:

commandline – Arguments are in separate elements of the list.

Return type:

list of strings

get_help(returnhelp=False, use_labels=False)[source]

Generate a description of the process parameters.

Parameters:
  • returnhelp (bool (optional, default False)) – if True return the help string message formatted in rst, otherwise display the raw help string message on the console.

  • use_labels (bool) – if True, input and output sections will get a RestructuredText label to avoid ambiguities.

get_input_help(rst_formating=False)[source]

Generate description for process input parameters.

Parameters:

rst_formating (bool (optional, default False)) – if True generate a rst table with the input descriptions.

Returns:

helpstr – the class input traits help

Return type:

str

get_input_spec()[source]

Method to access the process input specifications.

Returns:

outputs – a string representation of all the input trait specifications.

Return type:

str

get_inputs()[source]

Method to access the process inputs.

Returns:

outputs – a dictionary with all the input trait names and values.

Return type:

dict

get_log()[source]

Load the logging file.

Returns:

log – the content of the log file.

Return type:

dict

get_missing_mandatory_parameters()[source]

Returns a list of parameters which are not optional, and whose value is Undefined or None, or an empty string for a File or Directory parameter.

get_output_help(rst_formating=False)[source]

Generate description for process output parameters.

Parameters:

rst_formating (bool (optional, default False)) – if True generate a rst table with the output descriptions.

Returns:

helpstr – the trait output help descriptions

Return type:

str

get_output_spec()[source]

Method to access the process output specifications.

Returns:

outputs – a string representation of all the output trait specifications.

Return type:

str

get_outputs()[source]

Method to access the process outputs.

Returns:

outputs – a dictionary with all the output trait names and values.

Return type:

dict

get_parameter(name)[source]

Method to access a process instance trait value.

Parameters:

name (str (mandatory)) – the trait name we want to access

Returns:

value – the trait value we want to access

Return type:

object

get_study_config()[source]

Get (or create) the StudyConfig this process belongs to

classmethod help(returnhelp=False)[source]

Method to print the full help.

Parameters:
  • cls (process class (mandatory)) – a process class

  • returnhelp (bool (optional, default False)) – if True return the help string message, otherwise display it on the console.

make_commandline_argument(*args)[source]

This helper function may be used to build non-trivial commandline arguments in get_commandline implementations. Basically it concatenates arguments, but it also takes care of keeping track of temporary file objects (if any), and converts non-string arguments to strings (using repr()).

Ex:

>>> process.make_commandline_argument('param=', self.param)

will return the same as:

>>> 'param=' + self.param

if self.param is a string (file name) or a temporary path.
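The concatenation rule can be sketched in a few lines. The function `concat_argument` is hypothetical: it only reproduces the documented string handling (strings kept as they are, other values passed through repr()) and leaves out the tracking of temporary file objects.

```python
def concat_argument(*args):
    """Hypothetical sketch: join pieces, using repr() for non-strings."""
    return ''.join(a if isinstance(a, str) else repr(a) for a in args)


# A string value is appended unchanged.
print(concat_argument('param=', '/tmp/in.nii'))
# A non-string value is converted with repr() first.
print(concat_argument('threshold=', 0.5))
```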

params_to_command()[source]

Generates a commandline representation of the process.

If not implemented, it will generate a commandline running python, instantiating the current process, and calling its _run_process() method.

This method is new in Capsul v3 and is a replacement for get_commandline().

It can be overridden by custom Process subclasses. Actually each process should override either params_to_command() or _run_process().

The returned commandline is a list whose first element is a “method”; the others are the actual commandline with arguments. There are several methods; the process is free to use any of the supported ones, depending on how the execution is implemented.

Methods:

capsul_job: Capsul process run in python

The command will run the _run_process() execution method of the process, after loading input parameters from a JSON dictionary file. The second, and only other, element in the commandline list is the process identifier (module/class as in get_process_instance()). The location of the JSON file will be passed to the job execution through an environment variable SOMAWF_INPUT_PARAMS:

return ['capsul_job', 'morphologist.capsul.morphologist']

format_string: free commandline with replacements for parameters

Command arguments can be, or contain, format strings in the shape ‘%(param)s’, where param is a parameter of the process. This way we can map values correctly, and call a foreign command:

return ['format_string', 'ls', '%(input_dir)s']

json_job: free commandline with JSON file for input parameters

A bit like capsul_job but without the automatic wrapper:

return ['json_job', 'python', '-m', 'my_module']

Returns:

commandline – Arguments are in separate elements of the list.

Return type:

list of strings
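For the format_string method, the substitution of ‘%(param)s’ placeholders can be sketched with Python's own % formatting. The helper `expand_format_string` is hypothetical; how Capsul and Soma-Workflow actually perform the replacement is not shown in this page.

```python
def expand_format_string(command, params):
    """Hypothetical sketch: expand '%(param)s' placeholders in a command.

    command is a list as returned by params_to_command(), whose first
    element must be the 'format_string' method.
    """
    method, args = command[0], command[1:]
    if method != 'format_string':
        raise ValueError('unsupported method: %r' % method)
    # With a dict on the right-hand side, arguments without any
    # placeholder are returned unchanged.
    return [arg % params for arg in args]


command = ['format_string', 'ls', '%(input_dir)s']
expanded = expand_format_string(command, {'input_dir': '/tmp'})
print(expanded)
```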

requirements()[source]

Requirements needed to run the process. It is a dictionary whose keys are config/settings modules and whose values are requests for them.

The default implementation returns an empty dict (no requirements), and should be overloaded by processes which actually have requirements.

Ex:

{'spm': 'version >= "12" and standalone == "True"'}

run(**kwargs)[source]

Obsolete: use self.__call__ instead

static run_from_commandline(process_definition)[source]

Run a process from a commandline call. The process name (with module) is given as an argument; input parameters should be passed through a JSON file whose location is in the SOMAWF_INPUT_PARAMS environment variable.

If the process has outputs, the SOMAWF_OUTPUT_PARAMS environment variable should contain the location of an output file which will be written with a dict containing output parameter values.
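The input side of this mechanism can be sketched with the standard library alone. The helper `load_input_params` is hypothetical, and the flat `{name: value}` JSON layout is an assumption; the actual file layout written by Soma-Workflow may differ.

```python
import json
import os
import tempfile


def load_input_params(environ=os.environ):
    """Hypothetical sketch: read parameters from the JSON file named
    by the SOMAWF_INPUT_PARAMS environment variable."""
    path = environ.get('SOMAWF_INPUT_PARAMS')
    if path is None:
        return {}
    with open(path) as f:
        return json.load(f)


# Demonstration: write a parameters file, then read it back through
# a fake environment dictionary.
fd, params_path = tempfile.mkstemp(suffix='.json')
with os.fdopen(fd, 'w') as f:
    json.dump({'param2': 12, 'out_param': '/tmp/log.txt'}, f)

params = load_input_params({'SOMAWF_INPUT_PARAMS': params_path})
print(params)
```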

save_log(returncode)[source]

Method to save process execution information in json format.

If the class attribute log_file is not set, a log.json output file is generated in the process call current working directory.

Parameters:

returncode (ProcessResult) – the process result return code.

set_parameter(name, value, protected=None)[source]

Method to set a process instance trait value.

For File and Directory traits the None value is replaced by the special Undefined trait value.

Parameters:
  • name (str (mandatory)) – the trait name we want to modify

  • value (object (mandatory)) – the trait value we want to set

  • protected (None or bool (tristate)) – if True or False, force the “protected” status of the plug. If None, keep it as is.

set_study_config(study_config)[source]

Set a StudyConfig for the process. Note that it can only be done once: once a non-null StudyConfig has been assigned to the process, it should not change.

class capsul.process.process.ProcessMeta(name, bases, attrs)[source]

Class used to complete a process docstring

Use a class and not a function for inheritance.

Method to print the full help.

Parameters:
  • mcls (meta class (mandatory)) – a meta class.

  • name (str (mandatory)) – the process class name.

  • bases (tuple (mandatory)) – the direct base classes.

  • attrs (dict (mandatory)) – a dictionary with the class attributes.

static complement_doc(name, docstr)[source]

complement the process docstring

class capsul.process.process.ProcessResult(process, runtime, returncode, inputs=None, outputs=None)[source]

Object that contains running information about a particular Process.

Parameters:
  • process (Process class (mandatory)) – A copy of the Process class that was called.

  • runtime (dict (mandatory)) – Execution attributes.

  • returncode (dict (mandatory)) – Execution raw attributes

  • inputs (dict (optional)) – Representation of the process inputs.

  • outputs (dict (optional)) – Representation of the process outputs.

Initialize the ProcessResult class.

capsul.process.nipype_process submodule

Utilities to link Capsul and NiPype interfaces

Functions

nipype_factory()

capsul.process.nipype_process.nipype_factory(nipype_instance, base_class=<class 'capsul.process.process.NipypeProcess'>)[source]

From a nipype class instance, dynamically generate a process instance that encapsulates the nipype instance.

This function clones the nipype traits (also converting special traits) and connects the process and nipype instance traits.

A new ‘output_directory’ nipype input trait is created.

Since nipype inputs and outputs are separated and thus can have the same names, the nipype process outputs are prefixed with ‘_’.

It also monkey-patches some nipype functions in order to execute the process in a specific directory: the monkey patching has been written for Nipype version ‘0.10.0’.

Parameters:

nipype_instance (instance (mandatory)) – a nipype interface instance.

Returns:

process_instance – a process instance.

Return type:

instance

See also

_run_interface, _list_outputs, _gen_filename, _parse_inputs, sync_nypipe_traits, sync_process_output_traits, clone_nipype_trait

capsul.process.runprocess submodule

capsul.process.runprocess is not a real python module, but rather an executable script with commandline arguments and options parsing. It is provided as a module just to be easily called via the python command in a portable way:

python -m capsul.process.runprocess <process name> <process arguments>

Classes

ProcessParamError

Functions

set_process_param_from_str()

get_process_with_params()

run_process_with_distribution()

convert_commandline_parameter()

main()

exception capsul.process.runprocess.ProcessParamError[source]

Exception used in the runprocess module

capsul.process.runprocess.get_process_with_params(process_name, study_config, iterated_params=[], attributes={}, *args, **kwargs)[source]

Instantiate a process, or an iteration over processes, and fill in its parameters.

Parameters:
  • process_name (string) – name (module and class) of the process to instantiate

  • study_config (StudyConfig instance)

  • iterated_params (list (optional)) – parameters names which should be iterated on. If this list is not empty, an iteration process is built. All parameters values corresponding to the selected names should be lists with the same size.

  • attributes (dict (optional)) – dictionary of attributes for completion system.

  • *args – sequential parameters for the process. In iteration, “normal” parameters are set with the same value for all iterations, and iterated parameters dispatch their values to each iteration.

  • **kwargs – named parameters for the process. Same as above for iterations.

Returns:

process

Return type:

Process instance
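The way parameter values are spread over iterations can be sketched independently of Capsul. The function `dispatch_iterations` is hypothetical and works on plain dictionaries; it only illustrates the rule that normal parameters are shared while iterated ones are split, one value per iteration.

```python
def dispatch_iterations(params, iterated_params):
    """Hypothetical sketch: build one parameter dict per iteration.

    Values of iterated parameters must be lists of the same size;
    all other values are repeated unchanged in every iteration.
    """
    if not iterated_params:
        return [dict(params)]
    sizes = {len(params[name]) for name in iterated_params}
    if len(sizes) != 1:
        raise ValueError('iterated parameters must be lists of the same size')
    count = sizes.pop()
    return [
        {name: (value[i] if name in iterated_params else value)
         for name, value in params.items()}
        for i in range(count)
    ]


iterations = dispatch_iterations(
    {'fwhm': 6, 'in_file': ['a.nii', 'b.nii']}, ['in_file'])
print(iterations)
```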

capsul.process.runprocess.main()[source]

Run the capsul.process.runprocess module as a commandline

capsul.process.runprocess.run_process_with_distribution(study_config, process, use_soma_workflow=False, resource_id=None, password=None, config=None, rsa_key_pass=None, queue=None, input_file_processing=None, output_file_processing=None, keep_workflow=False, keep_failed_workflow=False, write_workflow_only=None)[source]

Run the given process, either sequentially or distributed through Soma-Workflow.

Parameters:
  • study_config (StudyConfig instance)

  • process (Process instance) – the process to execute (or pipeline, or iteration…)

  • use_soma_workflow (bool (default=False)) – if False, run sequentially, otherwise use Soma-Workflow. Its configuration has to be set up and valid for non-local execution, and additional file transfer options may be used.

  • resource_id (string (default=None)) – soma-workflow resource ID, defaults to localhost

  • password (string) – password to access the remote computing resource. Do not specify it if using a ssh key.

  • config (dict (optional)) – Soma-Workflow config: Not used for now…

  • rsa_key_pass (string) – RSA key password, for ssh key access

  • queue (string) – Queue to use on the computing resource. If not specified, use the default queue.

  • input_file_processing (brainvisa.workflow.ProcessToSomaWorkflow processing code) – Input files processing: local_path (NO_FILE_PROCESSING), transfer (FILE_TRANSFER), translate (SHARED_RESOURCE_PATH), or translate_shared (BV_DB_SHARED_PATH).

  • output_file_processing (same as for input_file_processing) – Output files processing: local_path (NO_FILE_PROCESSING), transfer (FILE_TRANSFER), or translate (SHARED_RESOURCE_PATH). The default is local_path.

  • keep_workflow (bool) – keep the workflow in the computing resource database after execution. By default it is removed.

  • keep_failed_workflow (bool) – keep the workflow in the computing resource database after execution, if it has failed. By default it is removed.

  • write_workflow_only (str) – if specified, this is an output filename where the workflow file will be written. The workflow will not actually be run, because in this situation the user probably wants to use the workflow on their own.

capsul.process.runprocess.set_process_param_from_str(process, k, arg)[source]

Set a process parameter from a string representation.

capsul.process.xml submodule

Read and write a Process as an XML file.

Classes

XMLProcess

Functions

string_to_value()

trait_from_xml()

create_xml_process()

Decorator

xml_process()

class capsul.process.xml.XMLProcess(**kwargs)[source]

Base class of all generated classes for processes defined as a Python function decorated with an XML string.

Note

  • Type ‘XMLProcess.help()’ for a full description of this process parameters.

  • Type ‘<XMLProcess>.get_input_spec()’ for a full description of this process input trait types.

  • Type ‘<XMLProcess>.get_output_spec()’ for a full description of this process output trait types.

Initialize the Process class.

capsul.process.xml.create_xml_process(module, name, function, xml)[source]

Create a new process class given a Python function and a string containing the corresponding Capsul XML 2.0 definition.

Parameters:
  • module (str (mandatory)) – name of the module for the created Process class (the Python module is not modified).

  • name (str (mandatory)) – name of the new process class

  • function (callable (mandatory)) – function to call to execute the process.

  • xml (str (mandatory)) – XML definition of the function.

Returns:

results – created process class.

Return type:

XMLProcess subclass

capsul.process.xml.string_to_value(string)[source]

Converts a string into a Python value without executing code.
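One common way to get this kind of safe conversion is `ast.literal_eval` from the standard library, which parses literals without running arbitrary code. The sketch below uses a hypothetical function `parse_value`; falling back to the raw string when parsing fails is an assumption, and string_to_value() itself may behave differently.

```python
import ast


def parse_value(string):
    """Hypothetical sketch: parse a Python literal without executing code.

    Falls back to the unchanged string when the text is not a literal
    (assumed behavior, chosen for illustration).
    """
    try:
        return ast.literal_eval(string)
    except (ValueError, SyntaxError):
        return string


print(parse_value('12'))
print(parse_value("[1, 'a']"))
print(parse_value('/tmp/image.nii'))
```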

capsul.process.xml.trait_from_xml(element, order=None)[source]

Creates a trait from an XML element type (<input>, <output> or <return>).

capsul.process.xml.xml_process(xml)[source]

Decorator used to associate a Python function to its Process XML representation.