populse_mia.data_manager package

Module to handle the projects and their database.

Contains:
Module:
  • data_loader

  • database_mia

  • filter

  • project

  • project_properties

Submodules

populse_mia.data_manager.data_history_inspect module

This module is dedicated to pipeline history.

class populse_mia.data_manager.data_history_inspect.ProtoProcess(brick=None)[source]

Bases: object

Lightweight convenience class that stores a brick database entry, plus additional info (the used flag).

__init__(brick=None)[source]
populse_mia.data_manager.data_history_inspect.brick_to_process(brick, project)[source]

Converts a brick database entry (document) into a “fake process”: a direct capsul.process.process.Process instance (not subclassed) which cannot do any actual processing, but which represents its parameters with values (traits and values). The process gets a name and a uuid from the brick, and also an exec_time.

populse_mia.data_manager.data_history_inspect.data_history_pipeline(filename, project)[source]

Get the complete “useful” history of a file in the database, as a “fake pipeline”.

The pipeline contains fake processes (unspecialized, direct Process instances), with all parameters (all being of type Any). The pipeline has connections, and gets all upstream ancestors of the file, so it contains all processing used to produce the latest version of the file (it may have been modified several times during the processing), and gets as inputs all input files which were used to produce the final data.

Processing bricks which are not used, probably part of earlier runs which have been orphaned because the data file has been overwritten, are not listed in this history.

populse_mia.data_manager.data_history_inspect.data_in_value(value, filename, project)[source]

Checks whether the given filename is part of the given value. The value may be a list, a tuple, or a dict, and may include several nested layers, which are all parsed.

The input filename may be the temp value “<temp>”, or a filename in its “short” version (relative path to the project database data directory).
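The nested search described above can be sketched as follows. This is a minimal illustration, not the populse_mia implementation: `value_contains` and its `data_dir` argument are hypothetical names, and it assumes that only dict values (not keys) are searched and that absolute paths are compared in their "short" form.

```python
import posixpath


def value_contains(value, filename, data_dir):
    """Hypothetical stand-in for data_in_value(), for illustration only.

    filename is either "<temp>" or a path relative to data_dir.
    """
    if isinstance(value, str):
        if value == filename:
            return True
        if posixpath.isabs(value):
            # Compare absolute paths in their "short" (relative) form.
            return posixpath.relpath(value, data_dir) == filename
        return False
    if isinstance(value, dict):
        # Assumption: only dict values are inspected, keys are ignored.
        value = list(value.values())
    if isinstance(value, (list, tuple)):
        return any(value_contains(item, filename, data_dir) for item in value)
    return False


value = {
    "in": ["/proj/data/raw/a.nii", ("<temp>", 3)],
    "out": {"x": "derived/b.nii"},
}
found_short = value_contains(value, "raw/a.nii", "/proj/data")
found_temp = value_contains(value, "<temp>", "/proj/data")
found_missing = value_contains(value, "raw/c.nii", "/proj/data")
```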

populse_mia.data_manager.data_history_inspect.find_procs_with_output(procs, filename, project)[source]

Find the processes in the given list which have the given filename among their outputs.

Parameters

procs: iterable

each process in the list is a ProtoProcess instance

filename: str

a file name

project: Project instance

used only to get the database folder (base directory for data)

Returns

sprocs: dict

exec_time: [(process, param_name), …]
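To make the shape of the returned dict concrete, here is a small sketch that groups matching processes by execution time. `FakeProc` and `procs_with_output` are hypothetical stand-ins (the real function works on ProtoProcess instances and also takes the project to resolve the data directory), so treat this as an illustration of the result structure only.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class FakeProc:
    """Hypothetical stand-in for a ProtoProcess."""
    name: str
    exec_time: datetime
    outputs: dict = field(default_factory=dict)


def procs_with_output(procs, filename):
    """Return {exec_time: [(process, param_name), ...]} for matching outputs."""
    sprocs = defaultdict(list)
    for proc in procs:
        for param_name, value in proc.outputs.items():
            values = value if isinstance(value, (list, tuple)) else [value]
            if filename in values:
                sprocs[proc.exec_time].append((proc, param_name))
    return dict(sprocs)


t1 = datetime(2023, 1, 1, 10)
t2 = datetime(2023, 1, 1, 11)
procs = [
    FakeProc("smooth", t1, {"out_file": "derived/s.nii"}),
    FakeProc("resmooth", t2, {"out_files": ["derived/s.nii", "derived/x.nii"]}),
    FakeProc("other", t2, {"out": "derived/o.nii"}),
]
result = procs_with_output(procs, "derived/s.nii")
```

Several processes may share one exec_time key, which is why each value is a list.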

populse_mia.data_manager.data_history_inspect.get_data_history(filename, project)[source]

Get the processing history for the given data file. Based on get_data_history_processes().

The history dict contains several elements:

  • parent_files: set of other data used (directly or indirectly) to produce the data.

  • processes: processing bricks set from each ancestor data which lead to the given one. Elements are process (brick) UUIDs.

Returns

history: dict

populse_mia.data_manager.data_history_inspect.get_data_history_bricks(filename, project)[source]

Get the complete “useful” history of a file in the database, as a set of bricks.

This is just a filtered version of get_data_history_processes() (like data_history_pipeline() in another shape), which only returns the set of brick elements actually used in the “useful” history of the data.

populse_mia.data_manager.data_history_inspect.get_data_history_processes(filename, project)[source]

Get the complete “useful” history of a file in the database.

The function outputs a dict of processes (ProtoProcess instances) and a set of links between them. data_history_pipeline() is a higher-level function which is using this one, then converts its outputs into a Pipeline instance to represent the data history.

The processes output by this function may include extra processes that were examined during the history search but are finally not used. They are not filtered out, but they are distinguished by their used attribute: those actually used have it set to True.

Processing bricks which are not used, probably part of earlier runs which have been orphaned because the data file has been overwritten, are either not listed in this history, or have their used property set to False.

Returns

procs: dict

{uuid: ProtoProcess instance}

links: set

{(src_protoprocess, src_plug_name, dst_protoprocess, dst_plug_name)}. Links from/to the “exterior” are also given: in this case src_protoprocess or dst_protoprocess is None.
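The sketch below shows how a caller might post-filter the two return values using the used attribute. `FakeProto` and `keep_used` are hypothetical names, and the sample processes, plug names and links are invented for illustration; only the `{uuid: process}` dict and the 4-tuple link layout come from the description above.

```python
class FakeProto:
    """Hypothetical stand-in for ProtoProcess: a uuid plus the used flag."""

    def __init__(self, uuid, used):
        self.uuid = uuid
        self.used = used


def keep_used(procs, links):
    """Drop processes whose used flag is False, and the links touching them."""
    used_procs = {uuid: p for uuid, p in procs.items() if p.used}
    kept = set(used_procs.values())
    used_links = {
        link for link in links
        # link[0] is the source process, link[2] the destination process;
        # None stands for the "exterior".
        if all(end is None or end in kept for end in (link[0], link[2]))
    }
    return used_procs, used_links


smooth = FakeProto("uuid-1", used=True)
mask = FakeProto("uuid-2", used=True)
old_run = FakeProto("uuid-3", used=False)
procs = {p.uuid: p for p in (smooth, mask, old_run)}
links = {
    (None, "in_file", smooth, "in_file"),
    (smooth, "out_file", mask, "in_file"),
    (mask, "out_file", None, "out_file"),
    (old_run, "out_file", mask, "in_file"),
}
used_procs, used_links = keep_used(procs, links)
```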

populse_mia.data_manager.data_history_inspect.get_direct_proc_ancestors(filename, project, procs, before_exec_time=None, only_latest=True, org_proc=None)[source]

Retrieve processing bricks which are referenced in the direct filename history. It can get the latest before a given execution time. As exec time is ambiguous (several processes may have finished at exactly the same time), several processes may be kept for a given exec time.

The “origin” process, if given, is excluded from this exec time filtering (as we are looking for the one preceding it), but still included in the ancestors list.

This function manipulates processing bricks as ProtoProcess instances, a light wrapper for a brick database entry.

Parameters

filename: str

data filename to inspect

project: Project instance

used to access database

procs: dict

process dict, {uuid: ProtoProcess instance}, the dict is populated when bricks are retrieved from the database.

before_exec_time: datetime instance (optional)

if it is specified, only processing bricks not newer than this time are used.

only_latest: bool (optional, default: True)

if True, only the latest processes retrieved from the history are kept. If before_exec_time is also used, then it is the latest before this time.

org_proc: ProtoProcess instance (optional)

if filename is the output of a process, we can specify it here in order to exclude it from the time filtering (otherwise only this process will likely remain).

Returns

procs: dict

{brick uuid: ProtoProcess instance}
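The time filtering described above (keep the latest processes, allow ties, and exempt the origin process) can be sketched like this. It is a simplified model under stated assumptions: `latest_procs` is a hypothetical helper, processes are reduced to a `{uuid: exec_time}` mapping, and "not newer than" is read as `<=`.

```python
from datetime import datetime


def latest_procs(exec_times, before_exec_time=None, org_uuid=None):
    """Return the uuids kept by an only_latest-style filter.

    exec_times: {uuid: datetime}. The origin process (org_uuid) is excluded
    from the time filtering but still included in the result.
    """
    candidates = {
        uuid: t for uuid, t in exec_times.items()
        if uuid != org_uuid
        and (before_exec_time is None or t <= before_exec_time)
    }
    kept = set()
    if candidates:
        latest = max(candidates.values())
        # Several processes may share the same exec time: keep them all.
        kept = {uuid for uuid, t in candidates.items() if t == latest}
    if org_uuid in exec_times:
        kept.add(org_uuid)
    return kept


def at(hour):
    return datetime(2023, 1, 1, hour)


times = {"p1": at(9), "p2": at(10), "p3": at(10), "p4": at(12)}
no_limit = latest_procs(times)
before_11 = latest_procs(times, before_exec_time=at(11))
with_origin = latest_procs(times, before_exec_time=at(11), org_uuid="p4")
```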

populse_mia.data_manager.data_history_inspect.get_filenames_in_value(value, project, allow_temp=True)[source]

Parses value, which may be a nesting of lists, tuples and dicts, and gets all filenames referenced in it. Only filenames which are database entries are kept, plus the “<temp>” value if allow_temp is True (which is the default). Other non-indexed filenames are considered to be read-only static data (such as templates, atlases or other software-related data), and are not retained.

populse_mia.data_manager.data_history_inspect.get_history_brick_process(brick_id, project, before_exec_time=None)[source]

Get a brick from its uuid in the database, and return it as a ProtoProcess instance.

A brick is discarded if it has not been executed (its exec status is not "Done"), or if it is newer than before_exec_time when this parameter is given.

If the brick is discarded (or not found in the database), the return value is None.

populse_mia.data_manager.data_history_inspect.get_proc_ancestors_via_tmp(proc, project, procs)[source]

Normally an internal function used in get_data_history_processes() and data_history_pipeline(): it is not meant to be part of a public API.

Try to get upstream process(es) for proc, connected via a temp value (“<temp>”).

For this, try to match processes in the output files history bricks.

Bricks are looked for, first in the process input files direct histories. If no matching process is found, then the full database bricks history is searched, which may be much slower for large databases.

The matching is made by the “<temp>” filename and processing time, thus is error-prone, especially if searching the whole bricks database.

proc should be a ProtoProcess instance

Returns

new_procs: dict

{uuid: ProtoProcess instance}

links: set

{(src_protoprocess, src_plug_name, dst_protoprocess, dst_plug_name)}. Pipeline link from/to the pipeline main plugs are also given: in this case src_protoprocess or dst_protoprocess is None.

populse_mia.data_manager.data_history_inspect.is_data_entry(filename, project, allow_temp=True)[source]

Checks if the input filename is a database entry. The return value is either the relative path to the database data directory, or “<temp>” if filename is this value and allow_temp is True (which is the default), or None if it is not in the database.
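The three possible return values can be illustrated with a small sketch. `data_entry` is a hypothetical helper, and a plain set of relative paths (`indexed`) stands in for the project database; the real function queries the project instead.

```python
import posixpath


def data_entry(filename, data_dir, indexed, allow_temp=True):
    """Return the short path, "<temp>", or None (illustration only)."""
    if filename == "<temp>":
        return filename if allow_temp else None
    if posixpath.isabs(filename):
        short = posixpath.relpath(filename, data_dir)
    else:
        short = filename
    # Assumption: the database is modelled as a set of relative paths.
    return short if short in indexed else None


indexed = {"raw/a.nii"}
entry = data_entry("/proj/data/raw/a.nii", "/proj/data", indexed)
temp = data_entry("<temp>", "/proj/data", indexed)
temp_refused = data_entry("<temp>", "/proj/data", indexed, allow_temp=False)
static = data_entry("/usr/share/template.nii", "/proj/data", indexed)
```

Here a template outside the data directory yields None, matching the "read-only static data" case.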

populse_mia.data_manager.data_loader module

Module to handle imports from MRIFileManager and their progress

Contains:
Class:
  • ImportProgress : Inherits from QProgressDialog and handles the progress bar

  • ImportWorker : Inherits from QThread and manages the threads

Methods:
  • read_log : Show the evolution of the progress bar and returns its feedback

  • tags_from_file : Returns a list of [tag, value] contained in a Json file

  • verify_scans : Check if the project’s scans have been modified

class populse_mia.data_manager.data_loader.ImportProgress(project)[source]

Bases: QProgressDialog

Handle the progress bar.

Parameters:

project – A Project object

__init__(project)[source]
onProgress(i)[source]

Signal to set the import progressbar value

Parameters:

i – int, value of the progressbar

class populse_mia.data_manager.data_loader.ImportWorker(project, progress)[source]

Bases: QThread

Manage threads.

Parameters:
  • project – A Project object

  • progress – An ImportProgress object

__init__(project, progress)[source]
notifyProgress
run()[source]

Override the QThread run method. Executed when the worker is started, fills the database and updates the progress.

populse_mia.data_manager.data_loader.read_log(project, main_window)[source]

Show the evolution of the progress bar and returns its feedback, a list of the paths to each data file that was loaded.

Parameters:
  • project – current project in the software

  • main_window – software’s main window

Returns:

the scans that have been added

populse_mia.data_manager.data_loader.tags_from_file(file_path, path)[source]

Return a list of [tag, value] contained in a Json file.

Parameters:
  • file_path – file path of the Json file (without the extension)

  • path – project path

Returns:

a list of the Json tags of the file

populse_mia.data_manager.data_loader.verify_scans(project)[source]

Check if the project’s scans have been modified.

Parameters:

project – current project in the software

Returns:

the list of scans that have been modified

populse_mia.data_manager.database_mia module

Module that contains classes to override the default behaviour of populse_db and some of its methods

Contains:
Class:
  • DatabaseMIA

  • DatabaseSessionMIA

class populse_mia.data_manager.database_mia.DatabaseMIA(database_url, caches=None, list_tables=None, query_type=None)[source]

Bases: Database

Class overriding the default behavior of populse_db

database_session_class

alias of DatabaseSessionMIA

class populse_mia.data_manager.database_mia.DatabaseSessionMIA(database)[source]

Bases: DatabaseSession

Class overriding the database session of populse_db

add_collection(name, primary_key, visibility, origin, unit, default_value)[source]

Override the method adding a collection of populse_db.

Parameters:
  • name – New collection name

  • primary_key – New collection primary_key column

  • visibility – Primary key visibility

  • origin – Primary key origin

  • unit – Primary key unit

  • default_value – Primary key default value

add_field(collection, name, field_type, description, visibility, origin, unit, default_value, index=False, flush=True)[source]

Add a field to the database, if it does not already exist.

Parameters:
  • collection – field collection (str)

  • name – field name (str)

  • field_type – field type (string, int, float, boolean, date, datetime, time, list_string, list_int, list_float, list_boolean, list_date, list_datetime or list_time)

  • description – field description (str or None)

  • visibility – Bool to know if the field is visible in the databrowser

  • origin – To know the origin of a field, in [TAG_ORIGIN_BUILTIN, TAG_ORIGIN_USER]

  • unit – Unit of the field, in [TAG_UNIT_MS, TAG_UNIT_MM, TAG_UNIT_DEGREE, TAG_UNIT_HZPIXEL, TAG_UNIT_MHZ]

  • default_value – Default_value of the field, can be str or None

  • flush – bool to know if the table classes must be updated (put False if in the middle of filling fields) => True by default

add_field_attributes_collection()[source]

Blabla

add_fields(fields)[source]

Add the list of fields.

Parameters:

fields – list of fields (collection, name, type, description, visibility, origin, unit, default_value)

get_field(collection, name)[source]

Blabla

get_fields(collection)[source]

Blabla

get_shown_tags()[source]

Give the list of visible tags.

Returns:

the list of visible tags

remove_field(collection, fields)[source]

Removes a field in the collection

Parameters:
  • collection – Field collection (str, must be existing)

  • fields – Field name (str, must be existing), or list of fields (list of str, must all be existing)

Raises:

ValueError

  • If the collection does not exist

  • If the field does not exist

set_shown_tags(fields_shown)[source]

Set the list of visible tags.

Parameters:

fields_shown – list of visible tags

populse_mia.data_manager.filter module

Module that handles the Filter class, which contains the results of both rapid and advanced search

Contains:
Class:
  • Filter

class populse_mia.data_manager.filter.Filter(name, nots, values, fields, links, conditions, search_bar)[source]

Bases: object

Class that represents a Filter, containing the results of both rapid and advanced search.

The advanced search creates a complex query to the database. It is a combination of several “query lines”, linked with AND or OR, each composed of:

  • a negation, or none

  • a tag name, or all visible tags

  • a condition (==, !=, >, <, >=, <=, CONTAINS, IN, BETWEEN)

  • a value

Parameters:
  • name – filter’s name

  • nots – list of negations ("" or NOT)

  • values – list of values

  • fields – list of list of fields

  • links – list of links (AND/OR)

  • conditions – list of conditions (==, !=, <, >, <=, >=, IN, BETWEEN, CONTAINS, HAS VALUE, HAS NO VALUE)

  • search_bar – value in the rapid search bar
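As a rough illustration of how parallel lists of nots, values, fields, links and conditions can describe a query, the sketch below evaluates such lists against scans modelled as plain dicts. It is not the Filter implementation: `line_matches` and `scan_matches` are hypothetical, only a subset of conditions is covered, and the left-to-right combination of AND/OR links is an assumption.

```python
import operator

# Subset of the conditions listed above, mapped to Python predicates.
CONDITIONS = {
    "==": operator.eq, "!=": operator.ne,
    "<": operator.lt, ">": operator.gt,
    "<=": operator.le, ">=": operator.ge,
    "CONTAINS": lambda cell, value: value in cell,
    "IN": lambda cell, value: cell in value,
    "BETWEEN": lambda cell, value: value[0] <= cell <= value[1],
}


def line_matches(scan, fields, condition, value, negate):
    """Evaluate one query line against a scan (a dict of tag values)."""
    test = CONDITIONS[condition]
    hit = any(f in scan and test(scan[f], value) for f in fields)
    return not hit if negate == "NOT" else hit


def scan_matches(scan, nots, values, fields, links, conditions):
    """Combine query lines left to right (assumed precedence)."""
    result = line_matches(scan, fields[0], conditions[0], values[0], nots[0])
    for i, link in enumerate(links, start=1):
        nxt = line_matches(scan, fields[i], conditions[i], values[i], nots[i])
        result = (result and nxt) if link == "AND" else (result or nxt)
    return result


scans = [
    {"name": "a.nii", "TR": 2.0, "Modality": "T1"},
    {"name": "b.nii", "TR": 3.5, "Modality": "fMRI"},
]
query = dict(
    nots=["", "NOT"],
    values=[[1.0, 3.0], "fMRI"],
    fields=[["TR"], ["Modality"]],
    links=["AND"],
    conditions=["BETWEEN", "=="],
)
matches = [s["name"] for s in scans if scan_matches(s, **query)]
```

Each index across the lists describes one query line, and links has one element fewer than the other lists.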

__init__(name, nots, values, fields, links, conditions, search_bar)[source]

Initialization of the Filter class.

Parameters:
  • name – filter’s name

  • nots – list of negations ("" or NOT)

  • values – list of values

  • fields – list of list of fields

  • links – list of links (AND/OR)

  • conditions – list of conditions (==, !=, <, >, <=, >=, IN, BETWEEN, CONTAINS, HAS VALUE, HAS NO VALUE)

  • search_bar – value in the rapid search bar

generate_filter(current_project, scans, tags)[source]

Apply the filter to the given list of scans.

Parameters:
  • current_project – Current project

  • scans – List of scans to apply the filter into

  • tags – List of tags to search in

Returns:

The list of scans matching the filter

json_format()[source]

Return the filter as a dictionary.

Returns:

the filter as a dictionary

populse_mia.data_manager.project module

Module that handles the projects and their database.

Contains:
Class:
  • Project

class populse_mia.data_manager.project.Project(project_root_folder, new_project)[source]

Bases: object

Class that handles projects and their associated database.

Parameters:
  • project_root_folder – project’s path

  • new_project – project’s object

__init__(project_root_folder, new_project)[source]

Initialization of the project class.

Parameters:
  • project_root_folder – project’s path

  • new_project – project’s object

add_clinical_tags()[source]

Add new clinical tags to the project.

Returns:

list of clinical tags that were added

cleanup_orphan_bricks(bricks=None)[source]

Remove orphan bricks from the database

cleanup_orphan_history()[source]

Remove orphan bricks from the database

cleanup_orphan_nonexisting_files()[source]

Remove orphan files which do not exist from the database

del_clinical_tags()[source]

Remove clinical tags from the project.

Returns:

list of clinical tags that were removed

files_in_project(files)[source]

Return values in files that are file / directory names within the project folder.

files are walked recursively and can be, or contain, lists, tuples, sets, dicts (only dict values() are considered). Dict keys are dropped and all filenames are merged into a single set.

The returned value is a set of filenames (str).
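The recursive walk can be sketched as below. `files_in_folder` is a hypothetical stand-in that takes the folder explicitly, and it assumes matching paths are returned unchanged; the actual method may normalise or shorten them.

```python
import posixpath


def files_in_folder(files, folder):
    """Collect the str values in files that lie inside folder."""
    root = posixpath.normpath(folder)
    found = set()
    stack = [files]
    while stack:
        item = stack.pop()
        if isinstance(item, dict):
            # Only dict values are walked; keys are dropped.
            stack.extend(item.values())
        elif isinstance(item, (list, tuple, set)):
            stack.extend(item)
        elif isinstance(item, str):
            path = posixpath.normpath(item)
            if path == root or path.startswith(root + "/"):
                found.add(item)
    return found


files = {
    "/proj/raw/a.nii": ["/proj/data/b.nii", ("/tmp/c.nii", {"/proj/data/d.nii"})],
    "other": 3,
}
result = files_in_folder(files, "/proj")
```

The dict key "/proj/raw/a.nii" is inside the folder but is not returned, because only values are considered.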

finished_bricks(engine, pipeline=None, include_done=False)[source]

blabla

getDate()[source]

Return the date of creation of the project.

Returns:

string of the date of creation of the project if it’s not Unnamed project, otherwise empty string

getFilter(filter)[source]

Return a Filter object from its name.

Parameters:

filter – Filter name

Returns:

Filter object

getFilterName()[source]

Input box to type the name of the filter to save.

Returns:

the name typed

getName()[source]

Return the name of the project.

Returns:

string of the name of the project if it’s not Unnamed project, otherwise empty string

getSortOrder()[source]

Return the sort order of the project.

Returns:

string of the sort order of the project if it’s not Unnamed project, otherwise empty string

getSortedTag()[source]

Return the sorted tag of the project.

Returns:

string of the sorted tag of the project if it’s not Unnamed project, otherwise empty string

get_data_history(path)[source]

Get the processing history for the given data file.

The history dict contains several elements:

  • parent_files: set of other data used (directly or indirectly) to produce the data.

  • processes: processing bricks set from each ancestor data which lead to the given one. Elements are process (brick) UUIDs.

Returns:

history (dict)

get_finished_bricks_in_pipeline(engine, pipeline)[source]

blabla

get_finished_bricks_in_workflows(engine)[source]

blabla

get_orphan_bricks(bricks=None)[source]

blabla

get_orphan_history()[source]

blabla

get_orphan_nonexsiting_files()[source]

Get orphan files which do not exist from the database

hasUnsavedModifications()[source]

Return whether the project has unsaved modifications.

Returns:

boolean, True if the project has pending modifications, False otherwise

init_filters()[source]

Initialize the filters at project opening.

loadProperties()[source]

Load the properties file.

redo(table)[source]

Redo the last action made by the user on the project.

Parameters:

table – table on which to apply the modifications

Actions that can be redone:
  • add_tag

  • remove_tags

  • add_scans

  • modified_values

  • modified_visibilities

reput_values(values)[source]

Re-put the value objects in the database.

Parameters:

values – List of Value objects

saveConfig()[source]

Save the changes in the properties file.

saveModifications()[source]

Save the pending operations of the project (actions still not saved).

save_current_filter(custom_filters)[source]

Save the current filter.

Parameters:

custom_filters – The customized filter

setCurrentFilter(filter)[source]

Set the current filter of the project.

Parameters:

filter – new Filter object

setDate(date)[source]

Set the date of the project.

Parameters:

date – new date of the project

setName(name)[source]

Set the name of the project if it’s not Unnamed project, otherwise does nothing.

Parameters:

name – new name of the project

setSortOrder(order)[source]

Set the sort order of the project.

Parameters:

order – new sort order of the project (ascending or descending)

setSortedTag(tag)[source]

Set the sorted tag of the project.

Parameters:

tag – new sorted tag of the project

undo(table)[source]

Undo the last action made by the user on the project.

Parameters:

table – table on which to apply the modifications

Actions that can be undone:
  • add_tag

  • remove_tags

  • add_scans

  • modified_values

  • modified_visibilities
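For readers unfamiliar with the pattern, undo and redo are commonly built on two stacks of recorded actions. The sketch below shows that generic pattern only; the `History` class is hypothetical and does not claim to mirror how Project stores or replays its actions.

```python
class History:
    """Generic two-stack undo/redo bookkeeping (illustration only)."""

    def __init__(self):
        self.undos = []
        self.redos = []

    def record(self, action):
        # A new user action invalidates anything that could be redone.
        self.undos.append(action)
        self.redos.clear()

    def undo(self):
        if not self.undos:
            return None
        action = self.undos.pop()
        self.redos.append(action)
        return action

    def redo(self):
        if not self.redos:
            return None
        action = self.redos.pop()
        self.undos.append(action)
        return action


history = History()
history.record(("add_tag", "Age"))
history.record(("remove_tags", ["Sex"]))
undone = history.undo()
redone = history.redo()
```

Each recorded action would carry enough information to reverse it, for example the removed tags and their values.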

unsaveModifications()[source]

Unsave the pending operations of the project.

property unsavedModifications

Setter for _unsavedModifications.

update_data_history(data)[source]

Cleanup earlier history of given data by removing from their bricks list those which correspond to obsolete runs (data has been re-written by more recent runs). This function only updates data status (bricks list), it does not remove obsolete bricks from the database.

Returns

a set of obsolete bricks that might become orphan: they are not used any longer in input data history, and were in the previous ones. But they still can be used in other data.

update_db_for_paths(new_path=None)[source]

Update the history and brick tables with a new project file.

Necessary when a project is renamed or when a new project is loaded from outside.

populse_mia.data_manager.project_properties module

Module that contains the class to handle the projects saved in the software.

Contains:
Class:
  • SavedProjects

class populse_mia.data_manager.project_properties.SavedProjects[source]

Bases: object

Class that handles all the projects that have been saved in the software.

__init__()[source]
Initialise the savedProjects attribute from the saved_projects.yml file.

The pathsList attribute is initialised as the value corresponding to the “paths” key in the savedProjects dictionary.

addSavedProject(newPath)[source]

Add a new project to save in the savedProjects and pathsList attributes.

Finally, save the savedProjects attribute in the saved_projects.yml file.

Parameters:

newPath – new project’s path to add

Returns:

the new path’s list (pathsList attribute)

loadSavedProjects()[source]

Load the savedProjects dictionary from the saved_projects.yml file.

If the saved_projects.yml file does not exist, it is created with the “{paths: []}” value and the returned dictionary is {paths: []}.

Returns:

the dictionary

removeSavedProject(path)[source]

Removes a saved project from the saved_projects.yml file.

Parameters:

path – path of the saved project to remove

saveSavedProjects()[source]

Saves the savedProjects dictionary to the saved_projects.yml file.