capsul.process¶
capsul.process module
capsul.process.process submodule¶
Process main class and infrastructure
Classes¶
Process
FileCopyProcess
InteractiveProcess
NipypeProcess
ProcessMeta
ProcessResult
- class capsul.process.process.FileCopyProcess(activate_copy=True, inputs_to_copy=None, inputs_to_clean=None, destination=None, inputs_to_symlink=None, use_temp_output_dir=False)[source]¶
A specific process that copies all the input files.
- __call__()¶
Note
Type ‘FileCopyProcess.help()’ for a full description of this process parameters.
Type ‘<FileCopyProcess>.get_input_spec()’ for a full description of this process input trait types.
Type ‘<FileCopyProcess>.get_output_spec()’ for a full description of this process output trait types.
Initialize the FileCopyProcess class.
- Parameters:
activate_copy (bool (default True)) – if False this class is transparent and behaves as a Process class.
inputs_to_copy (list of str (optional, default None)) – the list of inputs to copy. If None, all the input files are copied.
inputs_to_clean (list of str (optional, default None)) – some copied inputs that can be deleted at the end of the processing. If None, all copied files will be cleaned.
destination (str (optional default None)) – where the files are copied. If None, the output directory will be used, unless use_temp_output_dir is set.
inputs_to_symlink (list of str (optional, default None)) – as inputs_to_copy, but for files which should be symlinked
use_temp_output_dir (bool) – if True, the output_directory parameter is set to a temp one during execution, then outputs are copied / moved / hardlinked to the final location. This is useful when several parallel jobs are working in the same directory and may write the same intermediate files (SPM does this a lot).
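The temporary output directory behaviour described for use_temp_output_dir can be pictured with a small standard-library sketch. This is not Capsul's implementation: run_in_temp_dir, write_outputs and fake_job are hypothetical names, and only the "move" variant of "copied / moved / hardlinked" is shown.

```python
import os
import shutil
import tempfile


def run_in_temp_dir(write_outputs, final_dir):
    """Run write_outputs(work_dir) in a scratch directory, then move results.

    Hypothetical helper: it only illustrates the idea of isolating
    intermediate files so that parallel jobs do not overwrite each other.
    """
    os.makedirs(final_dir, exist_ok=True)
    moved = []
    with tempfile.TemporaryDirectory() as work_dir:
        # the job only ever sees its private working directory
        write_outputs(work_dir)
        for name in sorted(os.listdir(work_dir)):
            shutil.move(os.path.join(work_dir, name),
                        os.path.join(final_dir, name))
            moved.append(name)
    return moved


def fake_job(work_dir):
    # stands in for a tool that writes files with fixed names
    with open(os.path.join(work_dir, 'result.txt'), 'w') as f:
        f.write('done\n')


final_dir = tempfile.mkdtemp()
moved_files = run_in_temp_dir(fake_job, final_dir)
```

Each job gets its own scratch directory, so two jobs writing a file with the same intermediate name never collide. Only the final move touches the shared location.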
- class capsul.process.process.InteractiveProcess(**kwargs)[source]¶
Base class for interactive processes. The value of the is_interactive parameter determines whether the process can be run in the background (possibly remotely) as a standard process (is_interactive = False) or must be executed interactively in the user environment (is_interactive = True).
Note
Type ‘InteractiveProcess.help()’ for a full description of this process parameters.
Type ‘<InteractiveProcess>.get_input_spec()’ for a full description of this process input trait types.
Type ‘<InteractiveProcess>.get_output_spec()’ for a full description of this process output trait types.
Initialize the Process class.
- class capsul.process.process.NipypeProcess(*args, **kwargs)[source]¶
Base class used to wrap nipype interfaces.
Note
Type ‘NipypeProcess.help()’ for a full description of this process parameters.
Type ‘<NipypeProcess>.get_input_spec()’ for a full description of this process input trait types.
Type ‘<NipypeProcess>.get_output_spec()’ for a full description of this process output trait types.
Initialize the NipypeProcess class.
A NipypeProcess instance automatically gets an additional user trait ‘output_directory’.
This class also works around some shortcomings of nipype version ‘0.10.0’.
NipypeProcess is normally not instantiated directly, but through the CapsulEngine factory, using a nipype interface name:
ce = capsul_engine()
npproc = ce.get_process_instance('nipype.interfaces.spm.Smooth')
However, it is still possible to instantiate it directly, using a nipype interface class or instance:
npproc = NipypeProcess(nipype.interfaces.spm.Smooth)
NipypeProcess may be subclassed for specialized interfaces. In such a case, the subclass may provide:
* (optionally) a class attribute _nipype_class_type to specify the nipype interface class. If present, the nipype interface class or instance is not specified in the constructor call.
* (optionally) a __postinit__() method which is called in addition to the constructor, but later, once the instance is correctly set up. This __postinit__ method allows customizing the new class instance.
* (optionally) a class attribute _nipype_trait_mapping: a dict specifying a translation table between nipype trait names and the names they get in the Process instance. By default, inputs get the same name as in their nipype interface, and outputs are prefixed with an underscore (‘_’) to avoid name collisions when a trait exists both in inputs and outputs in nipype. A special trait name _spm_script_file is also used in SPM interfaces to write the matlab script. It can also be translated to a different name in this dict.
Subclasses should preferably not define an __init__ method, because it may be called twice if no precaution is taken to avoid it (a __np_init_done__ instance attribute is set once init is done the first time).
Ex:
class Smooth(NipypeProcess):
    _nipype_class_type = spm.Smooth
    _nipype_trait_mapping = {
        'smoothed_files': 'smoothed_files',
        '_spm_script_file': 'spm_script_file'}

smooth = Smooth()
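The naming rule for wrapped traits (inputs keep their name, outputs get a leading underscore, and _nipype_trait_mapping overrides both) can be sketched without nipype. The helper process_trait_name is hypothetical and is not part of Capsul. The trait name 'fwhm' is only an illustrative input name.

```python
def process_trait_name(nipype_name, is_output, mapping=None):
    """Return the name a nipype trait would get on the process side.

    Hypothetical helper mirroring the documented rule:
    an explicit mapping entry wins, otherwise outputs are prefixed with '_'.
    """
    mapping = mapping or {}
    if nipype_name in mapping:
        return mapping[nipype_name]
    return '_' + nipype_name if is_output else nipype_name


mapping = {'smoothed_files': 'smoothed_files',
           '_spm_script_file': 'spm_script_file'}

names = [
    process_trait_name('fwhm', False, mapping),              # plain input
    process_trait_name('smoothed_files', True, mapping),     # mapped output
    process_trait_name('smoothed_files', True),              # default output
    process_trait_name('_spm_script_file', False, mapping),  # special trait
]
```

With the mapping from the example, the output keeps the readable name smoothed_files. Without it, the default rule would give _smoothed_files.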
- Parameters:
nipype_instance (nipype interface (mandatory, except from internals)) – the nipype interface we want to wrap in capsul.
use_temp_output_dir (bool or None) – use a temp working directory during processing
- _nipype_interface¶
private attribute to store the nipype interface
- Type:
Interface
- classmethod help(nipype_interface, returnhelp=False)[source]¶
Method to print the full wrapped nipype interface help.
- Parameters:
cls (process class (mandatory)) – a nipype process class
nipype_interface (nipype interface (mandatory)) – a nipype interface object that will be documented.
returnhelp (bool (optional, default False)) – if True return the help string message, otherwise display it on the console.
- requirements()[source]¶
Requirements needed to run the process. It is a dictionary whose keys are config/settings modules and whose values are requests for them.
The default implementation returns an empty dict (no requirements), and should be overloaded by processes which actually have requirements.
Ex:
{'spm': 'version >= "12" and standalone == "True"'}
- class capsul.process.process.Process(**kwargs)[source]¶
A process is an atomic component that performs a processing step.
A process is typically an object with typed parameters, and an execution function. Parameters are described using Enthought traits through Soma-Base Controller base class.
In addition to describing its parameters, a Process must implement its execution function, either through a python method, by overloading _run_process(), or through a commandline execution, by overloading get_commandline(). The second way allows running on a remote processing machine which does not necessarily have capsul, or python, installed.
Parameters are declared or queried using the traits API, and their values are in the process instance variables:
from __future__ import print_function
from capsul.api import Process
import traits.api as traits

class MyProcess(Process):

    # a class trait
    param1 = traits.Str('def_param1')

    def __init__(self):
        super(MyProcess, self).__init__()
        # declare an input param
        self.add_trait('param2', traits.Int())
        # declare an output param
        self.add_trait('out_param', traits.File(output=True))

    def _run_process(self):
        with open(self.out_param, 'w') as f:
            print('param1:', self.param1, file=f)
            print('param2:', self.param2, file=f)

# run it with parameters
MyProcess()(param2=12, out_param='/tmp/log.txt')
Note about the File and Directory traits
The File trait type represents a file parameter. A file is actually two things: a filename (string), and the file itself (on the filesystem). For an input it is OK not to distinguish them, but for an output, there are two different cases:
* the file (on the filesystem) is an output, but the filename (string) is given as an input: this is the classical “commandline” behavior, when we tell the program where it should write its output file.
* the file is an output, and the filename is also an output: this is rather a “function return value” behavior: the process determines internally where it should write the file, and tells as an output where it did.
To distinguish these two cases, in Capsul we normally add in the File or Directory trait a property input_filename which is True when the filename is an input, and False when the filename is an output:
self.add_trait('out_file', traits.File(output=True, input_filename=False))
However, as most of our processes are based on the “commandline behavior” (the filename is an input) and we often forget to specify the input_filename parameter, the default is the “filename is an input” behavior, when not specified.
Attributes
- log_file¶
if None, the log is generated in the current directory; otherwise it is written to the log_file path.
- Type:
str (default None)
Note
Type ‘Process.help()’ for a full description of this process parameters.
Type ‘<Process>.get_input_spec()’ for a full description of this process input trait types.
Type ‘<Process>.get_output_spec()’ for a full description of this process output trait types.
Initialize the Process class.
- add_trait(name, trait)[source]¶
Ensure that trait.output and trait.optional are set to a boolean value before calling parent class add_trait.
- check_requirements(environment='global', message_list=None)[source]¶
Checks the process requirements against configuration settings values in the attached CapsulEngine. This makes use of the requirements() method and checks that there is one matching config value for each required module.
- Parameters:
- Returns:
config – if None is returned, requirements are not met: the process cannot run. If a dict is returned, it corresponds to the matching config values. When no requirements are needed, an empty dict is returned. A pipeline, if its requirements are met, returns a list of configuration values, because different nodes may require different config values.
- Return type:
- get_commandline()[source]¶
Method to generate a commandline representation of the process.
If not implemented, it will generate a commandline running python, instantiating the current process, and calling its _run_process() method.
- Returns:
commandline – Arguments are in separate elements of the list.
- Return type:
list of strings
- get_input_spec()[source]¶
Method to access the process input specifications.
- Returns:
outputs – a string representation of all the input trait specifications.
- Return type:
- get_inputs()[source]¶
Method to access the process inputs.
- Returns:
outputs – a dictionary with all the input trait names and values.
- Return type:
- get_missing_mandatory_parameters()[source]¶
Returns a list of parameters which are not optional, and whose value is Undefined or None, or an empty string for a File or Directory parameter.
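The rule above can be restated as a small standalone sketch. It does not use traits: UNDEFINED is a stand-in for the traits Undefined value, and the missing_mandatory helper and the 'value' / 'optional' / 'is_path' keys are hypothetical names chosen for illustration.

```python
UNDEFINED = object()  # stand-in for the traits "Undefined" sentinel


def missing_mandatory(params):
    """Return names of mandatory parameters that have no usable value.

    params maps a name to a dict with hypothetical keys:
    'value', 'optional' (bool) and 'is_path' (True for File/Directory).
    """
    missing = []
    for name, spec in params.items():
        if spec['optional']:
            continue  # optional parameters are never reported
        value = spec['value']
        unset = value is UNDEFINED or value is None
        # an empty string only counts as missing for file-like parameters
        empty_path = spec['is_path'] and value == ''
        if unset or empty_path:
            missing.append(name)
    return missing


params = {
    'in_file': {'value': '', 'optional': False, 'is_path': True},
    'threshold': {'value': UNDEFINED, 'optional': False, 'is_path': False},
    'label': {'value': '', 'optional': False, 'is_path': False},
    'mask': {'value': None, 'optional': True, 'is_path': True},
}
missing = missing_mandatory(params)
```

Here in_file and threshold are reported. label is not, because an empty string is a valid value for a non-file parameter, and mask is skipped because it is optional.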
- get_output_spec()[source]¶
Method to access the process output specifications.
- Returns:
outputs – a string representation of all the output trait specifications.
- Return type:
- get_outputs()[source]¶
Method to access the process outputs.
- Returns:
outputs – a dictionary with all the output trait names and values.
- Return type:
- classmethod help(returnhelp=False)[source]¶
Method to print the full help.
- Parameters:
cls (process class (mandatory)) – a process class
returnhelp (bool (optional, default False)) – if True return the help string message, otherwise display it on the console.
- make_commandline_argument(*args)[source]¶
This helper function may be used to build non-trivial commandline arguments in get_commandline implementations. Basically it concatenates arguments, but it also takes care of keeping track of temporary file objects (if any), and converts non-string arguments to strings (using repr()).
Ex:
>>> process.make_commandline_argument('param=', self.param)
will return the same as:
>>> 'param=' + self.param
if self.param is a string (file name) or a temporary path.
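The concatenation rule can be sketched in a few lines. This is a simplified stand-in, not the Capsul method: concat_argument is a hypothetical name, and the tracking of temporary file objects mentioned above is deliberately left out.

```python
def concat_argument(*args):
    """Join argument pieces, using repr() for anything that is not a string.

    Hypothetical, simplified helper: temporary file handling is omitted.
    """
    return ''.join(a if isinstance(a, str) else repr(a) for a in args)


as_path = concat_argument('param=', '/tmp/in.nii')
as_number = concat_argument('n=', 12)
```

String pieces are joined unchanged, so the first call gives the same result as 'param=' + '/tmp/in.nii'. The integer in the second call goes through repr() first.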
- params_to_command()[source]¶
Generates a commandline representation of the process.
If not implemented, it will generate a commandline running python, instantiating the current process, and calling its _run_process() method.
This method is new in Capsul v3 and is a replacement for get_commandline().
It can be overridden by custom Process subclasses. Actually each process should override either params_to_command() or _run_process().
The returned commandline is a list whose first element is a “method”, and whose other elements are the actual commandline with arguments. There are several methods; the process is free to use any of the supported ones, depending on how the execution is implemented.
Methods:
- capsul_job: Capsul process run in python
The command will run the _run_process() execution method of the process, after loading input parameters from a JSON dictionary file. The second (and only other) element in the commandline list is the process identifier (module/class as in get_process_instance()). The location of the JSON file will be passed to the job execution through an environment variable SOMAWF_INPUT_PARAMS:
return ['capsul_job', 'morphologist.capsul.morphologist']
- format_string: free commandline with replacements for parameters
Command arguments can be, or contain, format strings in the shape ‘%(param)s’, where param is a parameter of the process. This way we can map values correctly, and call a foreign command:
return ['format_string', 'ls', '%(input_dir)s']
- json_job: free commandline with JSON file for input parameters
A bit like capsul_job but without the automatic wrapper:
return ['json_job', 'python', '-m', 'my_module']
- Returns:
commandline – Arguments are in separate elements of the list.
- Return type:
list of strings
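To make the format_string method more concrete, here is a minimal sketch of how such a list could be expanded into a real commandline using Python's printf-style formatting with a dict. The function expand_format_string is hypothetical. How Capsul or the workflow engine actually performs the substitution is not shown in this page.

```python
def expand_format_string(command, params):
    """Expand a ['format_string', ...] command using process parameter values.

    Hypothetical helper: each argument is formatted with '%(name)s'
    placeholders looked up in the params dict.
    """
    method, args = command[0], command[1:]
    if method != 'format_string':
        raise ValueError('unsupported method: %r' % method)
    # arguments without placeholders are returned unchanged
    return [arg % params for arg in args]


command = ['format_string', 'ls', '%(input_dir)s']
expanded = expand_format_string(command, {'input_dir': '/data'})
```

In this sketch a literal percent sign in an argument would have to be written as '%%', as usual with printf-style formatting.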
- requirements()[source]¶
Requirements needed to run the process. It is a dictionary whose keys are config/settings modules and whose values are requests for them.
The default implementation returns an empty dict (no requirements), and should be overloaded by processes which actually have requirements.
Ex:
{'spm': 'version >= "12" and standalone == "True"'}
- static run_from_commandline(process_definition)[source]¶
Run a process from a commandline call. The process name (with module) is given as an argument; input parameters should be passed through a JSON file whose location is in the SOMAWF_INPUT_PARAMS environment variable.
If the process has outputs, the SOMAWF_OUTPUT_PARAMS environment variable should contain the location of an output file which will be written with a dict containing output parameter values.
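The exchange of parameters through JSON files located by environment variables can be sketched with the standard library alone. This is an illustration under assumptions, not the Capsul code: run_with_json_params and add are hypothetical, the JSON layout (a flat dict of keyword arguments) is assumed, and the output variable is assumed to be spelled SOMAWF_OUTPUT_PARAMS. A plain dict stands in for os.environ so the sketch has no side effects.

```python
import json
import os
import tempfile


def run_with_json_params(func, environ,
                         input_var='SOMAWF_INPUT_PARAMS',
                         output_var='SOMAWF_OUTPUT_PARAMS'):
    """Call func with parameters read from the JSON file named in environ.

    Hypothetical helper: if func returns a non-empty dict and an output
    location is given, the dict is written there as JSON.
    """
    with open(environ[input_var]) as f:
        params = json.load(f)
    outputs = func(**params)
    output_path = environ.get(output_var)
    if outputs and output_path:
        with open(output_path, 'w') as f:
            json.dump(outputs, f)
    return outputs


def add(a, b):
    # stands in for a process execution function with one output
    return {'total': a + b}


tmp_dir = tempfile.mkdtemp()
in_path = os.path.join(tmp_dir, 'in.json')
out_path = os.path.join(tmp_dir, 'out.json')
with open(in_path, 'w') as f:
    json.dump({'a': 2, 'b': 3}, f)

env = {'SOMAWF_INPUT_PARAMS': in_path, 'SOMAWF_OUTPUT_PARAMS': out_path}
run_with_json_params(add, env)
with open(out_path) as f:
    written = json.load(f)
```

Passing file locations rather than values keeps the commandline short and avoids quoting problems with complex parameter values.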
- save_log(returncode)[source]¶
Method to save process execution information in json format.
If the class attribute log_file is not set, a log.json output file is generated in the process call current working directory.
- Parameters:
returncode (ProcessResult) – the process result return code.
- class capsul.process.process.ProcessMeta(name, bases, attrs)[source]¶
Class used to complete a process docstring
Use a class and not a function for inheritance.
Method to print the full help.
- Parameters:
- class capsul.process.process.ProcessResult(process, runtime, returncode, inputs=None, outputs=None)[source]¶
Object that contains running information about a particular Process.
- Parameters:
process (Process class (mandatory)) – A copy of the Process class that was called.
runtime (dict (mandatory)) – Execution attributes.
returncode (dict (mandatory)) – Execution raw attributes
inputs (dict (optional)) – Representation of the process inputs.
outputs (dict (optional)) – Representation of the process outputs.
Initialize the ProcessResult class.
capsul.process.nipype_process submodule¶
Utilities to link Capsul and NiPype interfaces
Functions¶
nipype_factory()
- capsul.process.nipype_process.nipype_factory(nipype_instance, base_class=<class 'capsul.process.process.NipypeProcess'>)[source]¶
From a nipype class instance, dynamically generate a process instance that encapsulates the nipype instance.
This function clones the nipype traits (also converting special traits) and connects the process and nipype instance traits.
A new ‘output_directory’ nipype input trait is created.
Since nipype inputs and outputs are separated and thus can have the same names, the nipype process outputs are prefixed with ‘_’.
It also monkey-patches some nipype functions in order to execute the process in a specific directory: the monkey patching has been written for Nipype version ‘0.10.0’.
- Parameters:
nipype_instance (instance (mandatory)) – a nipype interface instance.
- Returns:
process_instance – a process instance.
- Return type:
instance
See also
_run_interface, _list_outputs, _gen_filename, _parse_inputs, sync_nypipe_traits, sync_process_output_traits, clone_nipype_trait
capsul.process.runprocess submodule¶
capsul.process.xml submodule¶
Read and write a Process as an XML file.
Classes¶
XMLProcess
Functions¶
string_to_value()
trait_from_xml()
create_xml_process()
Decorator¶
xml_process()
- class capsul.process.xml.XMLProcess(**kwargs)[source]¶
Base class of all generated classes for processes defined as a Python function decorated with an XML string.
Note
Type ‘XMLProcess.help()’ for a full description of this process parameters.
Type ‘<XMLProcess>.get_input_spec()’ for a full description of this process input trait types.
Type ‘<XMLProcess>.get_output_spec()’ for a full description of this process output trait types.
Initialize the Process class.
- capsul.process.xml.create_xml_process(module, name, function, xml)[source]¶
Create a new process class given a Python function and a string containing the corresponding Capsul XML 2.0 definition.
- Parameters:
- Returns:
results – created process class.
- Return type:
XMLProcess subclass
- capsul.process.xml.string_to_value(string)[source]¶
Converts a string into a Python value without executing code.
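One standard-library way to turn a string into a Python value without executing code is ast.literal_eval. Whether string_to_value() is built on it is not stated here, so the sketch below only illustrates the general technique. parse_literal is a hypothetical name.

```python
import ast


def parse_literal(string):
    """Parse a Python literal (number, string, list, dict, ...) from text.

    Hypothetical helper: ast.literal_eval only accepts literal syntax,
    so function calls or attribute access are rejected instead of run.
    """
    return ast.literal_eval(string)


values = [parse_literal(s)
          for s in ("12", "[1, 2.5]", "{'a': None}", "'text'")]

try:
    parse_literal("__import__('os').getcwd()")
    rejected = False
except ValueError:
    # a call expression is not a literal, so nothing is executed
    rejected = True
```

Unlike eval(), this approach cannot trigger imports or function calls, which matters when the string comes from an XML file written by someone else.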