populse_mia.data_manager package

Module to handle the projects and their database.

Contains:
Module:
  • data_history_inspect

  • data_loader

  • database_mia

  • filter

  • project

  • project_properties


Submodules

populse_mia.data_manager.data_history_inspect module

This module is dedicated to pipeline history.

Contains:
Class:
  • ProtoProcess: A lightweight convenience class that stores a brick database entry along with additional usage information.

Functions:
  • brick_to_process: Convert a brick database entry into a ‘fake process’.

  • data_history_pipeline: Retrieves the complete processing history of a file in the database.

  • data_in_value: Determine if the specified filename is present within the given value.

  • find_procs_with_output: Identify processes that have the specified filename as part of their outputs.

  • get_data_history: Retrieves the processing history for a given data file.

  • get_data_history_bricks: Retrieves the complete “useful” history of a file in the database.

  • get_data_history_processes: Retrieves the complete “useful” processing history of a file in the database.

  • get_direct_proc_ancestors: Retrieve processing bricks referenced in the direct filename history.

  • get_filenames_in_value: Extract filenames from a nested structure of lists, tuples, and dictionaries.

  • get_history_brick_process: Retrieve a brick from the database and return it as a ProtoProcess instance.

  • get_proc_ancestors_via_tmp: Retrieve upstream processes connected via a temporary value (“<temp>”).

  • is_data_entry: Determine if the given filename is a valid database entry.

class populse_mia.data_manager.data_history_inspect.ProtoProcess(brick=None)[source]

Bases: object

A lightweight convenience class that stores a brick database entry along with additional usage information.

Parameters:

brick – The brick database entry to store. Defaults to None.

__init__(brick=None)[source]
populse_mia.data_manager.data_history_inspect.brick_to_process(brick, project)[source]

Convert a brick database entry into a ‘fake process’.

This function transforms a brick database entry into a Process instance that represents its parameters and values. The process gets a name, uuid, and exec_time from the brick. This “fake process” cannot perform actual processing but serves as a representation of the brick’s traits and values.

Parameters:
  • brick (dict or str) – The brick database entry to convert. If a string is provided, it is treated as the brick’s unique ID, and the corresponding brick document is retrieved from the project’s database.

  • project (object) – The project object providing access to the database and its documents.

Return (Process or None):

A Process instance representing the brick’s parameters and values. Returns None if the brick is not found.

populse_mia.data_manager.data_history_inspect.data_history_pipeline(filename, project)[source]

Retrieves the complete processing history of a file in the database, formatted as a “fake pipeline”.

The generated pipeline consists of unspecialized (fake) processes, each representing a processing step with all parameters of type Any. The pipeline includes connections and traces all upstream ancestors of the file, capturing the entire processing path leading to the latest version of the file.

If the file was modified multiple times, the pipeline reflects only the relevant processing steps that contributed to the final output. Orphaned processing steps from overwritten versions are omitted.

Parameters:
  • filename (str) – The name of the file whose processing history is being retrieved.

  • project (Project) – The project object containing the database and relevant details.

Return (Pipeline | None):

A Pipeline object representing the processing history, or None if no relevant history is found.

populse_mia.data_manager.data_history_inspect.data_in_value(value, filename, project)[source]

Determine if the specified filename is present within the given value.

This function recursively searches through the value, which can be a string, list, tuple, or dictionary, to check if it contains the specified filename. The filename can be a special placeholder “<temp>” or a “short” filename, which is a relative path within the project’s database data directory.

Parameters:
  • value (str, list, tuple, or dict) –

    The data structure to search.

    It can be:

    • A string representing a file path.

    • A list or tuple containing multiple file paths.

    • A dictionary where file paths are stored as values.

  • filename (str) –

    The filename to search for.

    It can be:

    • The special placeholder “<temp>” indicating a temporary value.

    • A file path relative to the project database data directory.

  • project (object) – The project object containing the project’s folder path as an attribute (project.folder).

Return (bool):

True if the filename is found in the value, False otherwise.
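The recursive search described above can be sketched as follows. This is a simplified stand-in rather than the function's actual code: `contains_filename` is a hypothetical name, and it compares strings for equality instead of resolving paths against project.folder.

```python
def contains_filename(value, filename):
    """Return True if `filename` appears anywhere inside `value`.

    Hypothetical helper: strings are compared for equality only, whereas
    the documented function also deals with project-relative paths.
    """
    if isinstance(value, str):
        return value == filename
    if isinstance(value, (list, tuple)):
        return any(contains_filename(item, filename) for item in value)
    if isinstance(value, dict):
        # Only dictionary values are inspected, not keys.
        return any(contains_filename(item, filename) for item in value.values())
    return False


nested = {"in_files": ["data/raw/scan1.nii", ("data/raw/scan2.nii", "<temp>")]}
print(contains_filename(nested, "<temp>"))              # True
print(contains_filename(nested, "data/raw/scan3.nii"))  # False
```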

populse_mia.data_manager.data_history_inspect.find_procs_with_output(procs, filename, project)[source]

Identify processes in the given list that have the specified filename as part of their outputs.

This function searches through a list of processes to determine which ones have the specified filename in their output values. The results are organized by execution time.

Parameters:
  • procs (iterable of ProtoProcess) – A collection of ProtoProcess instances to search through.

  • filename (str) – The filename to search for within the processes’ outputs.

  • project (Project) – An instance of the project, used to access the database folder.

Return (dict):

A dictionary where keys are execution times and values are lists of tuples. Each tuple contains a process and the parameter name associated with the filename. Format: {exec_time: [(process, param_name), …]}.
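To make the returned structure concrete, here is a minimal sketch of the grouping step. `FakeProc` and `group_by_exec_time` are hypothetical names, and the attribute names (`exec_time`, `outputs`) are assumptions made for the example, not the documented ProtoProcess interface. Output values are matched by simple equality.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class FakeProc:
    """Hypothetical stand-in for ProtoProcess (attribute names are assumed)."""
    name: str
    exec_time: str
    outputs: dict = field(default_factory=dict)


def group_by_exec_time(procs, filename):
    """Map each execution time to the (process, param_name) pairs producing `filename`."""
    found = defaultdict(list)
    for proc in procs:
        for param, value in proc.outputs.items():
            if value == filename:
                found[proc.exec_time].append((proc, param))
    return dict(found)


procs = [
    FakeProc("smooth", "2024-01-01T10:00", {"out_file": "data/derived/s.nii"}),
    FakeProc("smooth_again", "2024-01-02T09:00", {"out_file": "data/derived/s.nii"}),
    FakeProc("mask", "2024-01-02T09:00", {"mask_file": "data/derived/m.nii"}),
]
result = group_by_exec_time(procs, "data/derived/s.nii")
for exec_time, pairs in sorted(result.items()):
    print(exec_time, [(proc.name, param) for proc, param in pairs])
```

Two processes wrote the same file at different times, so the result has two keys; the unrelated `mask` process is left out.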

populse_mia.data_manager.data_history_inspect.get_data_history(filename, project)[source]

Retrieves the processing history for a given data file, based on get_data_history_processes().

The returned dictionary contains:

  • “parent_files”: A set of filenames representing data (direct or indirect) used to produce the given file.

  • “processes”: A set of UUIDs of processing bricks that contributed to the file’s creation.

Parameters:
  • filename (str) – The name of the file whose processing history is being retrieved.

  • project (Project) – The project object containing the database and relevant details.

Return (dict):

A dictionary with the following keys:

  • “processes” (set): A set of UUIDs representing the processing bricks involved.

  • “parent_files” (set): A set of filenames that were used to produce the data.

populse_mia.data_manager.data_history_inspect.get_data_history_bricks(filename, project)[source]

Retrieves the complete “useful” history of a file in the database as a set of processing bricks.

This function is a filtered version of get_data_history_processes(), similar to data_history_pipeline(), but instead of constructing a pipeline, it returns only the set of brick elements that were actually used in the relevant processing history of the file.

Parameters:
  • filename (str) – The name of the file whose processing history is being retrieved.

  • project (Project) – The project object containing the database and relevant details.

Return (set):

A set of brick elements representing the “useful” processing steps that contributed to the final version of the given data file.

populse_mia.data_manager.data_history_inspect.get_data_history_processes(filename, project)[source]

Retrieves the complete “useful” processing history of a file in the database.

This function returns:

  • A dictionary of processes (ProtoProcess instances), where keys are process UUIDs.

  • A set of links between these processes, forming the processing graph.

Unlike data_history_pipeline(), which converts the history into a Pipeline, this function provides a lower-level representation. Some processes retrieved during history traversal may not be used; they are distinguished by their used attribute (set to True for relevant processes).

Processing bricks that are not used (possibly from earlier runs where the data file was overwritten) may either be absent from the history or have used = False.

Parameters:
  • filename (str) – The name of the file whose processing history is being retrieved.

  • project (Project) – The project object containing the database and relevant details.

Return (tuple):
  • procs (dict): {uuid: ProtoProcess instance} mapping.

  • links (set): {(src_protoprocess, src_plug_name, dst_protoprocess, dst_plug_name)}.

    External connections are represented with None as src_protoprocess or dst_protoprocess.
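As an illustration of how the links set can be consumed, the sketch below walks the 4-tuples backwards to collect everything upstream of one process. Plain strings stand in for ProtoProcess instances, and `upstream_of` is a hypothetical helper, not part of the module.

```python
def upstream_of(target, links):
    """Collect every process that feeds `target`, directly or indirectly."""
    ancestors = set()
    to_visit = [target]
    while to_visit:
        current = to_visit.pop()
        for src, _src_plug, dst, _dst_plug in links:
            # src is None for links that come from outside the graph.
            if dst == current and src is not None and src not in ancestors:
                ancestors.add(src)
                to_visit.append(src)
    return ancestors


links = {
    (None, "in_file", "realign", "in_file"),
    ("realign", "out_file", "smooth", "in_file"),
    ("smooth", "out_file", "stats", "in_file"),
    ("stats", "out_file", None, "out_file"),
}
print(sorted(upstream_of("stats", links)))  # ['realign', 'smooth']
```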

populse_mia.data_manager.data_history_inspect.get_direct_proc_ancestors(filename, project, procs, before_exec_time=None, only_latest=True, org_proc=None)[source]

Retrieve processing bricks referenced in the direct filename history.

This function identifies the most recent processing steps that generated the given filename. If multiple processes share the same execution time, they are all retained to account for ambiguity. The function also allows filtering by execution time and excluding a specified originating process.

Parameters:
  • filename (str) – The data filename to inspect.

  • project (Project) – The project instance used to access the database.

  • procs (dict) – Dictionary mapping process UUIDs to ProtoProcess instances. This dictionary is updated with newly retrieved processes.

  • before_exec_time (datetime) – If specified, only processing bricks executed before this time are considered.

  • only_latest (bool) – If True (default), keeps only the latest processes found in the history. If before_exec_time is specified, retains only the latest before that time.

  • org_proc (ProtoProcess) – The originating process, which is excluded from execution time filtering but included in the ancestor list.

Return (dict):

A dictionary mapping brick UUIDs to ProtoProcess instances.
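The "latest, with ties retained" selection can be sketched like this. `keep_latest` is a hypothetical helper operating on a plain {exec_time: [processes]} mapping, and treating before_exec_time as a strict upper bound is an assumption of the example.

```python
from datetime import datetime


def keep_latest(procs_by_time, before_exec_time=None):
    """Return the processes that share the latest execution time.

    `procs_by_time` maps an execution time to a list of processes.
    Times at or after `before_exec_time` are ignored (assumed strict bound).
    """
    times = [
        t for t in procs_by_time
        if before_exec_time is None or t < before_exec_time
    ]
    if not times:
        return []
    # All processes recorded at the latest time are kept, so ties survive.
    return list(procs_by_time[max(times)])


history = {
    datetime(2024, 1, 1, 10): ["realign_v1"],
    datetime(2024, 1, 3, 9): ["realign_v2", "coregister"],
    datetime(2024, 1, 5, 8): ["realign_v3"],
}
print(keep_latest(history))                                          # ['realign_v3']
print(keep_latest(history, before_exec_time=datetime(2024, 1, 4)))   # ['realign_v2', 'coregister']
```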

populse_mia.data_manager.data_history_inspect.get_filenames_in_value(value, project, allow_temp=True)[source]

Extract filenames from a nested structure of lists, tuples, and dictionaries.

This function parses the given value, which can be a nested combination of lists, tuples, and dictionaries, to retrieve all filenames referenced within it. Only filenames that are valid database entries or the special “<temp>” value (if allow_temp is True) are retained. Other filenames are considered read-only static data and are not included in the results.

Parameters:
  • value (object) – The value to parse. It can be a single string, a list, tuple, dictionary, or a nested combination of these types.

  • project (object) – The project object providing access to the database.

  • allow_temp (bool, optional) – If True, includes the temporary filename “<temp>” in the results. Defaults to True.

Return (set):

A set of filenames that are valid database entries or the temporary filename “<temp>” (if allowed).
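A minimal sketch of this kind of extraction is shown below. `collect_filenames` is a hypothetical name; the database check is replaced by a caller-supplied predicate `is_entry`, and scanning dictionary values only (not keys) is an assumption.

```python
def collect_filenames(value, is_entry, allow_temp=True):
    """Gather accepted filenames from nested lists, tuples and dictionaries."""
    if isinstance(value, str):
        if value == "<temp>":
            return {value} if allow_temp else set()
        return {value} if is_entry(value) else set()
    if isinstance(value, dict):
        # Assumption: only the dictionary values can hold filenames.
        value = list(value.values())
    found = set()
    if isinstance(value, (list, tuple)):
        for item in value:
            found |= collect_filenames(item, is_entry, allow_temp)
    return found


def in_data_dir(name):
    """Toy predicate standing in for the database lookup."""
    return name.startswith("data/")


value = [
    "data/raw/a.nii",
    {"mask": "data/derived/m.nii", "template": "/usr/share/tpl.nii"},
    ("<temp>", 3),
]
print(sorted(collect_filenames(value, in_data_dir)))
print(sorted(collect_filenames(value, in_data_dir, allow_temp=False)))
```

The template path is dropped in both calls because the predicate rejects it, which mirrors the "read-only static data" case described above.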

populse_mia.data_manager.data_history_inspect.get_history_brick_process(brick_id, project, before_exec_time=None)[source]

Retrieve a brick from the database using its UUID and return it as a ProtoProcess instance.

This function fetches a brick from the database using its unique identifier (UUID). It returns the brick as a ProtoProcess instance if the brick has been executed (its execution status is “Done”) and, if specified, its execution time is not later than before_exec_time. If the brick does not meet these criteria or is not found in the database, the function returns None.

Parameters:
  • brick_id (str) – The unique identifier (UUID) of the brick to retrieve.

  • project (object) – The project object providing access to the database.

  • before_exec_time (str) – An execution time filter. If provided, bricks executed after this timestamp are discarded.

Return (ProtoProcess or None):

A ProtoProcess instance representing the brick if it meets the criteria; otherwise, None.

populse_mia.data_manager.data_history_inspect.get_proc_ancestors_via_tmp(proc, project, procs)[source]

Retrieve upstream processes connected via a temporary value (“<temp>”).

This function is intended for internal use within get_data_history_processes and data_history_pipeline. It attempts to identify upstream processes connected to the given process (proc) through a temporary filename.

The function first searches the direct history of the process’s output files. If no matching process is found, it searches the entire database of bricks, which may be slower for large databases. Matching is based on the temporary filename and processing time, which can be error-prone.

Parameters:
  • proc (ProtoProcess) – The process whose ancestors need to be determined.

  • project (object) – The project object providing access to the session and other necessary functionalities for processing.

  • procs (dict) – A dictionary of processes, where keys are process IDs and values are ProtoProcess instances.

Returns:

  • new_procs (dict): A dictionary mapping process UUIDs to ProtoProcess instances.

  • links (set): A set of tuples representing pipeline links in the format (src_protoprocess, src_plug_name, dst_protoprocess, dst_plug_name). Links from/to the pipeline main plugs are also included, where src_protoprocess or dst_protoprocess may be None.

Contains:
Private function:
  • _get_tmp_param: Identifies a process parameter associated with a temporary value.

populse_mia.data_manager.data_history_inspect.is_data_entry(filename, project, allow_temp=True)[source]

Determine if the given filename is a valid database entry within the specified project.

This function checks whether the input filename is either a recognized temporary value (“<temp>”) or a file located within the project’s database data directory. If the filename is valid, it returns either the relative path to the database data directory or “<temp>” (if allowed). If the file is not found in the database, the function returns None.

Parameters:
  • filename (str) – The full path or special value “<temp>” to be checked.

  • project (object) – The project object providing access to the database and folder structure.

  • allow_temp (bool, optional) – If True, allows the special value “<temp>” to be considered a valid entry. Defaults to True.

Return (str or None):
  • The relative path to the project’s database data directory if the filename is a valid database entry.

  • “<temp>” if the input is “<temp>” and allow_temp is True.

  • None if the filename is not a valid database entry.
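The three return cases can be illustrated with a path-only sketch. `relative_entry` is a hypothetical helper: it takes the data directory explicitly instead of a project object, and it only tests whether the path lies inside that directory, without consulting the database.

```python
import os


def relative_entry(filename, data_dir, allow_temp=True):
    """Return `filename` relative to `data_dir`, "<temp>", or None."""
    if filename == "<temp>":
        return filename if allow_temp else None
    base = os.path.abspath(data_dir)
    path = os.path.abspath(filename)
    # commonpath equals base only when path lies inside base.
    if os.path.commonpath([base, path]) != base:
        return None
    return os.path.relpath(path, base)


print(relative_entry("/proj/data/raw/a.nii", "/proj/data"))
print(relative_entry("/elsewhere/a.nii", "/proj/data"))
print(relative_entry("<temp>", "/proj/data", allow_temp=False))
```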

populse_mia.data_manager.data_loader module

Module to handle imports from MRIFileManager and track their progress.

Contains:
Class:
  • ImportProgress: Inherits from QProgressDialog and handles the progress bar.

  • ImportWorker: Inherits from QThread and manages the threads.

Functions:
  • read_log: Shows the evolution of the progress bar and returns its feedback.

  • tags_from_file: Returns a list of [tag, value] pairs contained in a JSON file.

  • verify_scans: Checks if the project’s scans have been modified.

class populse_mia.data_manager.data_loader.ImportProgress(project)[source]

Bases: QProgressDialog

Displays a progress bar for the import process.

Parameters:

project (Project) – The project object being imported.

Methods:
  • onProgress: Updates the progress bar value.

__init__(project)[source]
onProgress(value)[source]

Updates the progress bar value.

Parameters:

value (int) – The current progress value.

class populse_mia.data_manager.data_loader.ImportWorker(project, progress)[source]

Bases: QThread

Worker thread for importing scans into the project database.

This class manages the import process, reading from export logs, processing scan files, and updating the database accordingly.

Attributes:

  • project: A Project object.

  • progress: An ImportProgress object.

  • notifyProgress: Signal emitted to update the progress bar.

__init__(project, progress)[source]
notifyProgress

Signal (pyqtSignal) emitted to update the progress bar.

run()[source]

Execute the import process.

This method overrides the QThread run method and is executed when the worker is started. It processes the export logs, imports scans, and updates the database.

property scans_added

Get a copy of the scans_added list in a thread-safe manner.

populse_mia.data_manager.data_loader.read_log(project, main_window)[source]

Display a progress bar for data loading and return loaded file paths.

This function shows the evolution of the progress bar while loading data files and returns a list of paths to each data file that was successfully loaded.

Parameters:
  • project (Project) – The current project instance in the software.

  • main_window (MainWindow) – The software’s main window instance used to display the progress bar.

Return (list):

A list of paths to the data files (scans) that were successfully added.

populse_mia.data_manager.data_loader.tags_from_file(file_path, path)[source]

Returns a list of [tag, value] pairs from a JSON file.

Parameters:
  • file_path (str) – File path of the JSON file (without the extension).

  • path (str) – Project path.

Return (List[List[Union[str, dict]]]):

A list of the JSON tags of the file.

populse_mia.data_manager.data_loader.verify_scans(project)[source]

Check if the project’s scans have been modified.

Parameters:

project (Project) – Current project in the software.

Return (List[str]):

The list of scans that have been modified or are missing.

populse_mia.data_manager.database_mia module

Module containing a class that provides tools adapted to Mia for interacting with the populse_db API.

Contains:
Class:
  • DatabaseMIA

class populse_mia.data_manager.database_mia.DatabaseMIA(database_engine)[source]

Bases: object

Class providing tools for interacting with a database, under the supervision of populse_db.

__init__(database_engine)[source]

Initializes a DatabaseMIA instance with the given database file.

Parameters:

database_engine (str) – Path to the database file (e.g., ‘/a/folder/path/file.db’).

close()[source]

Closes any open resources or connections held by the instance.

This method sets the storage attribute to None, effectively releasing any held references and cleaning up the object’s state.

data(write=None, create=None)[source]

Provides a context manager for accessing the database data layer.

This method allows safe read and write access to the database data, ensuring proper resource management.

Parameters:
  • write (bool) – If True, enables write mode.

  • create (bool) – If True, allows creating new records.

Yields (DatabaseMiaData):

The data interface for the database.

schema()[source]

Provides a context manager for accessing the database schema.

This method allows safe access to the database schema, ensuring proper resource management.

Yields (DatabaseMiaSchema):

The schema interface for the database.

class populse_mia.data_manager.database_mia.DatabaseMiaData(storage_data)[source]

Bases: object

Manages database interactions within the MIA framework.

This class provides methods to handle collections and documents within the database, allowing operations such as retrieving, adding, updating, and removing records.

__init__(storage_data)[source]

Initializes a new instance of the DatabaseMiaData class.

Parameters:

storage_data (populse_db.storage.Storage) – The data storage interface for the database.

add_document(collection_name, document)[source]

Adds a document to a specified collection in the storage.

If the specified collection exists, the document is added to it. The method assigns a primary key to the document based on the collection’s primary key configuration. The changes are saved to the storage.

Parameters:
  • collection_name (str) – The name of the collection where the document should be added.

  • document (str) – The document name to be added.

filter_documents(collection_name, filter_query)[source]

Retrieve documents from a specified collection that match a given filter.

This method searches for documents in the specified collection based on the provided filter query. It returns the results as a list of rows from the collection table.

The filter_query can either be:
  • The result of self.filter_query().

  • A string defining a filter.

Filter Query Format:
  • A filter condition must follow this syntax:

    {<field>} <operator> “<value>”.

  • Supported operators:

    ==, !=, <=, >=, <, >, IN, ILIKE, LIKE.

  • Multiple filter conditions can be combined using AND or OR.

  • Example:

    ` "((({BandWidth} == "50000")) AND (({FileName} LIKE "%G1%")))" `

Note: Due to potential database access issues such as “database already open.”, this implementation currently returns a list instead of using yield. However, using yield may be reconsidered in the future for better memory management.

Parameters:
  • collection_name (str) – The name of the collection to filter (must exist).

  • filter_query (str) – The filter query to apply.

Return (list):

A list of rows matching the filter criteria.
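Since the filter is a plain string, it can be assembled programmatically. The helpers below (`condition` and `combine`, both hypothetical names) are a small sketch that follows the documented `{<field>} <operator> "<value>"` form; they do no escaping of quotes inside values and do not check that the operator is one of the supported ones.

```python
def condition(field, operator, value):
    """Format one condition as ({field} operator "value")."""
    return f'({{{field}}} {operator} "{value}")'


def combine(link, *conditions):
    """Join conditions with AND or OR and wrap the result in parentheses."""
    if link not in ("AND", "OR"):
        raise ValueError(f"unsupported link: {link}")
    return "(" + f" {link} ".join(conditions) + ")"


query = combine(
    "AND",
    condition("BandWidth", "==", "50000"),
    condition("FileName", "LIKE", "%G1%"),
)
print(query)  # (({BandWidth} == "50000") AND ({FileName} LIKE "%G1%"))
```

The resulting string would then be passed as the filter_query argument; whether the reduced parenthesization is accepted exactly like the documented example should be checked against the populse_db filter grammar.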

get_collection_names()[source]

Retrieves a list of all collection names in the database.

Return (list):

All collection names in the database.

get_document(collection_name, primary_keys=None, fields=None)[source]

Retrieve documents from the specified collection with optional filtering.

This method checks if the specified collection exists. If it does, it retrieves documents from the collection, optionally filtering by primary keys and selecting specific fields. If the collection does not exist, an empty list is returned.

Parameters:
  • collection_name (str) – Name of the document collection. The collection must already exist in the database.

  • primary_keys (str | list[str], optional) – A single primary key or a list of primary keys to filter documents. If None, no filtering by primary keys is applied.

  • fields (str | list[str], optional) – A single field or a list of fields to include in the result. If None, all fields are included.

Return (list):

A list of documents matching the specified criteria, or an empty list if the collection does not exist.

get_document_names(collection_name)[source]

Retrieve a list of all document names in the specified collection.

Parameters:

collection_name (str) – The name of the collection to retrieve document names from. The collection must already exist.

Return (list[str]):

A list of document names if the collection exists, otherwise an empty list.

get_field_attributes(collection_name, field_name=None)[source]

Retrieve attributes of a specific field or all fields in a collection from the storage.

Parameters:
  • collection_name (str) – The name of the collection.

  • field_name (str, optional) – The name of a specific field within the collection. If not provided, attributes for all fields in the collection will be retrieved.

Return (dict | list[dict]):

Attributes of the specified field as a dictionary, or a list of dictionaries with attributes for all fields if field_name is not provided.

get_field_names(collection_name)[source]

Retrieve the list of all field names in the specified collection.

Parameters:

collection_name (str) – The name of the collection to retrieve field names from. The collection must exist in the database.

Return (list | None):

A list of all field names in the collection if it exists, or None if the collection has no fields or does not exist.

get_primary_key_name(collection_name)[source]

Retrieve the primary key of the specified collection.

This method returns the first key from the specified collection within the database.

Parameters:

collection_name (str) – The name of the collection to retrieve the primary key from.

Return (str):

The first key in the collection, representing the primary key.

get_shown_tags()[source]

Give the list of visible tags.

Return (list):

The list of visible tags.

get_value(collection_name, primary_key, field)[source]

Retrieves the current value of a specific field in a document from the specified collection.

This method accesses the underlying storage to fetch the value of a given field within a document, identified by its primary key, in the specified collection.

Parameters:
  • collection_name (str) – The name of the collection containing the document.

  • primary_key (str) – The unique identifier (primary key) of the document.

  • field (str) – The name of the field within the document to retrieve.

Return (Any):

The current value of the specified field.

has_collection(collection_name)[source]

Checks if a collection with the specified name exists in the database.

Parameters:

collection_name (str) – The name of the collection to check.

Return (bool):

True if the collection exists, otherwise False.

has_document(collection_name, primary_key)[source]

Checks if a document with the specified primary key exists in the given collection.

Parameters:
  • collection_name (str) – The name of the collection.

  • primary_key (str) – The primary key of the document to check.

Return (bool):

True if the document exists, False otherwise.

remove_document(collection_name, primary_key)[source]

Remove a document from a specified collection.

This method deletes the document identified by primary_key from the given collection in the storage.

Parameters:
  • collection_name (str) – The name of the collection containing the document.

  • primary_key (str) – The unique identifier of the document to be removed.

Raises:

KeyError – If the collection or the document does not exist.

remove_value(collection_name, primary_key, field)[source]

Removes the specified field from a document in the given collection, if it exists. Raises a KeyError if the field, collection, or document is not found.

Parameters:
  • collection_name (str) – The name of the collection containing the document.

  • primary_key (str) – The primary key of the document in the collection.

  • field (str) – The field to be removed from the document.

Raises:

KeyError – If the collection or document cannot be found.

set_shown_tags(fields_shown)[source]

Set the list of visible tags.

Parameters:

fields_shown (list) – A list of visible tags.

set_value(collection_name, primary_key, values_dict)[source]

Store or update a record in the specified collection.

This method either stores a new record or updates an existing record in the specified collection, using the provided primary key. The fields of the record are set according to the data in the provided dictionary.

Parameters:
  • collection_name (str) – The name of the collection where the record will be stored or updated.

  • primary_key (str) – The unique key used to identify the record.

  • values_dict (dict) – A dictionary containing the data to store or update in the record. Keys represent field names, and values represent the corresponding data.

class populse_mia.data_manager.database_mia.DatabaseMiaSchema(storage_schema)[source]

Bases: object

Provides tools for managing the schema of a MIA database.

This class allows users to manipulate database collections, fields, and field attributes under the supervision of populse_db.

__init__(storage_schema)[source]

Initializes the DatabaseMiaSchema instance.

Parameters:

storage_schema (populse_db.storage.Storage) – The schema storage interface for the database.

add_collection(collection_name, primary_key, visibility, origin, unit, default_value)[source]

Add a new collection to the storage database, if it does not already exist.

This method overrides the default behavior to add a collection with additional field attributes, ensuring proper schema updates and collection initialization.

Parameters:
  • collection_name (str) – The name of the new collection.

  • primary_key (str) – The primary key column for the collection.

  • visibility (bool) – Visibility of the primary key field.

  • origin (str) – Origin of the primary key field.

  • unit (str) – Unit of the primary key field.

  • default_value (Any) – Default value for the primary key field.

add_field(fields)[source]

Adds one or more fields to the collection.

Each field should be represented as a dictionary containing the following keys:

  • collection_name (str): The collection to which the field belongs.

  • field_name (str): The name of the field.

  • field_type (str): The data type of the field.

  • description (str): A brief description of the field.

  • visibility (bool): The visibility status of the field.

  • origin (str): The origin of the field.

  • unit (str): The unit associated with the field.

  • default_value (Any): The default value of the field.

Parameters:

fields (dict | list[dict]) – A dictionary representing a single field’s attributes, or a list of dictionaries representing multiple fields’ attributes.
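For illustration, a field dictionary with the eight documented keys could be assembled and checked as follows. `FIELD_KEYS` and `make_field` are hypothetical helpers, and the values used ("current", "float", "user", and so on) are placeholders rather than constants taken from populse_mia.

```python
FIELD_KEYS = (
    "collection_name", "field_name", "field_type", "description",
    "visibility", "origin", "unit", "default_value",
)


def make_field(**attributes):
    """Build a field dictionary, rejecting missing or unexpected keys."""
    missing = set(FIELD_KEYS) - set(attributes)
    extra = set(attributes) - set(FIELD_KEYS)
    if missing or extra:
        raise KeyError(f"missing={sorted(missing)}, unexpected={sorted(extra)}")
    # Return the keys in the documented order.
    return {key: attributes[key] for key in FIELD_KEYS}


fields = [
    make_field(
        collection_name="current",       # placeholder collection name
        field_name="BandWidth",
        field_type="float",              # placeholder type label
        description="Acquisition bandwidth",
        visibility=True,
        origin="user",                   # placeholder origin
        unit="Hz",
        default_value=None,
    ),
]
print(list(fields[0]))
```

A list built this way has the shape expected by the fields parameter; a single dictionary would be passed directly for one field.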

add_field_attributes_collection()[source]

Ensures that the FIELD_ATTRIBUTES_COLLECTION is available in the database.

If it does not exist, it creates the collection and adds specific fields to it such as ‘visibility’, ‘origin’, ‘unit’, and ‘default_value’.

data()[source]

Provides a context manager for accessing the database data.

This method ensures safe access to the database data layer, managing resources properly.

Yields (DatabaseMiaData):

The data interface for the database.

remove_field(collection_name, field_name)[source]

Removes the specified field from the given collection.

This method updates the schema to remove the specified field from the collection and handles associated attributes cleanup.

Parameters:
  • collection_name (str) – The name of the collection from which the field will be removed (must exist).

  • field_name (str) – The name of the field to remove (must exist).

Raises:

ValueError – If the collection_name does not exist or if the field_name does not exist.

remove_field_attributes(collection_name, field_name)[source]

Remove attributes associated with a specific field in a collection.

This method deletes the document storing metadata or attributes for the specified field in the given collection.

Parameters:
  • collection_name (str) – The name of the collection containing the field.

  • field_name (str) – The name of the field whose attributes are to be removed.

Raises:

ValueError – If the attributes document does not exist or cannot be removed.

update_field_attributes(collection_name, field_name, visibility, origin, unit, default_value, description, field_type)[source]

Updates the attributes of a field in the database for a specific collection.

This method constructs an index using the provided collection and field_name (‘collection|field_name’), and then updates the field’s attributes in the FIELD_ATTRIBUTES_COLLECTION.

Parameters:
  • collection_name (str) – The name of the collection the field belongs to.

  • field_name (str) – The name of the field to update.

  • visibility (bool) – The visibility status of the field.

  • origin (str) – The origin or source of the field.

  • unit (str) – The unit of measurement for the field.

  • default_value (Any) – The default value to assign to the field.

  • description (str) – The description of the field.

  • field_type (Any) – The type of the field.

populse_mia.data_manager.filter module

Module that handles the Filter class, which contains the results of both rapid and advanced searches.

Contains:
Class:
  • Filter

class populse_mia.data_manager.filter.Filter(name, nots, values, fields, links, conditions, search_bar)[source]

Bases: object

Class that represents a filter, containing the results of both rapid and advanced searches.

The advanced search creates a complex query to the database. It is a combination of several “query lines”, linked with AND or OR, each composed of:
  • An optional negation (NOT)

  • A tag name or all visible tags

  • A condition (==, !=, >, <, >=, <=, CONTAINS, IN, BETWEEN)

  • A value

Parameters:
  • name – filter’s name

  • nots – list of negations (“” or NOT)

  • values – list of values

  • fields – list of lists of fields

  • links – list of links (AND/OR)

  • conditions – list of conditions (==, !=, <, >, <=, >=, IN, BETWEEN, CONTAINS, HAS VALUE, HAS NO VALUE)

  • search_bar – value in the rapid search bar
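To make the parallel-list layout of these parameters concrete, here is a minimal sketch that zips the lists into readable query lines. It is not the query syntax used by populse_mia: the function describe_query, the sample tag names, and the assumption that links holds one entry between each pair of consecutive lines are all illustrative.

```python
# Illustrative values only: two query lines joined by a single link.
nots = ["", "NOT"]
fields = [["PatientName"], ["Type"]]
conditions = ["CONTAINS", "=="]
values = ["rat", "Scan"]
links = ["AND"]  # assumed: one link between consecutive query lines


def describe_query(nots, fields, conditions, values, links):
    """Render parallel filter lists as one readable string (sketch only)."""
    parts = []
    rows = zip(nots, fields, conditions, values)
    for i, (neg, flds, cond, val) in enumerate(rows):
        # An empty negation simply disappears after strip().
        line = f"{neg} {'/'.join(flds)} {cond} {val!r}".strip()
        parts.append(f"({line})")
        if i < len(links):
            parts.append(links[i])
    return " ".join(parts)


print(describe_query(nots, fields, conditions, values, links))
# (PatientName CONTAINS 'rat') AND (NOT Type == 'Scan')
```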

__init__(name, nots, values, fields, links, conditions, search_bar)[source]

Initialization of the Filter class.

Parameters:
  • name – filter’s name

  • nots – list of negations (“” or NOT)

  • values – list of values

  • fields – list of lists of fields

  • links – list of links (AND/OR)

  • conditions – list of conditions (==, !=, <, >, <=, >=, IN, BETWEEN, CONTAINS, HAS VALUE, HAS NO VALUE)

  • search_bar – value in the rapid search bar

generate_filter(current_project, scans, tags)[source]

Apply the filter to the given list of scans.

Parameters:
  • current_project – Current project.

  • scans – List of scans to apply the filter to.

  • tags – List of tags to search in.

Return (list):

The list of scans matching the filter.

json_format()[source]

Return the filter as a dictionary.

Return (dict):

The filter as a dictionary.

populse_mia.data_manager.project module

Module that handles the projects and their database.

Contains:
Class:
  • Project

class populse_mia.data_manager.project.Project(project_root_folder, new_project)[source]

Bases: object

Class that handles projects and their associated database.

Parameters:
  • project_root_folder – project’s path

  • new_project – project’s object

__init__(project_root_folder, new_project)[source]

Initialization of the project class.

Parameters:
  • project_root_folder – project’s path

  • new_project – project’s object

add_clinical_tags()[source]

Add new clinical tags to the project.

Returns:

list of clinical tags that were added.

cleanup_orphan_bricks(bricks=None)[source]

Remove orphan bricks and their associated files from the database.

This method performs the following cleanup operations:
  1. Removes obsolete brick documents from the brick collection

  2. Removes orphaned file documents from both current and initial collections

  3. Deletes the corresponding physical files from the filesystem

Parameters:

bricks – List of brick IDs (str) to check for orphans. If None, checks all bricks in the database.

cleanup_orphan_history()[source]

Remove orphan histories, their associated bricks, and files from the database.

This method performs three cleanup operations:
  1. Removes obsolete history documents from the history collection

  2. Removes orphaned brick documents from the brick collection

  3. Removes orphaned file documents from both current and initial collections, along with their corresponding physical files

cleanup_orphan_nonexisting_files()[source]

Remove database entries for files that no longer exist in the filesystem.

This method:
  1. Identifies files referenced in the database that are missing from disk

  2. Removes their entries from both current and initial collections

  3. Ensures any remaining physical files are deleted (defensive cleanup)

del_clinical_tags()[source]

Remove clinical tags from the project’s current and initial collections.

Iterates through predefined clinical tags and removes them from both collections if they exist in the current collection’s field names.

Return (list):

Clinical tags that were successfully removed.

files_in_project(files)[source]

Extract file/directory names from input that are within the project folder.

Recursively processes the input to find all file paths, handling nested data structures. Only paths within the project directory are included.

Parameters:

files – Input that may contain file paths. Can be:
  • str: A single file path

  • list/tuple/set: Collection of file paths or nested structures

  • dict: Only values are processed, keys are ignored

Return (set):

Relative file paths that exist within the project folder, with paths normalized and made relative to the project directory
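The recursive traversal described for files_in_project can be sketched roughly as follows. This is not the actual implementation: the function name files_in_folder is hypothetical, relative inputs are resolved against the current working directory, and the sketch only tests whether a path lies under the folder, not whether it exists on disk.

```python
import os


def files_in_folder(files, folder):
    """Collect paths from a nested structure that lie inside ``folder``.

    Sketch only: strings are treated as paths, list/tuple/set items are
    visited recursively, and only dictionary values are inspected.
    """
    root = os.path.normpath(os.path.abspath(folder))
    found = set()

    def visit(item):
        if isinstance(item, str):
            path = os.path.normpath(os.path.abspath(item))
            # Keep the path only if it is located under the root folder.
            if path.startswith(root + os.sep):
                found.add(os.path.relpath(path, root))
        elif isinstance(item, (list, tuple, set)):
            for sub in item:
                visit(sub)
        elif isinstance(item, dict):
            for sub in item.values():
                visit(sub)

    visit(files)
    return found


# Hypothetical project folder and inputs, for illustration only.
root = os.path.join(os.sep, "tmp", "my_project")
inputs = {
    "ignored_key": [os.path.join(root, "data", "a.nii"), ("/elsewhere/b.nii",)],
    "other": os.path.join(root, "data", "..", "data", "c.nii"),
}
print(sorted(files_in_folder(inputs, root)))
```

In this example only the two paths under the project folder are returned, relative to it and with the ‘..’ segment normalized away; the dictionary keys and the path outside the folder are ignored.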

finished_bricks(engine, pipeline=None, include_done=False)[source]

Retrieve and process finished bricks from workflows and pipelines.

This method:
  1. Gets finished bricks from workflows and optionally a specific pipeline

  2. Filters them based on their presence in the MIA database

  3. Updates brick metadata with execution status and outputs

  4. Collects all output files that are within the project directory

Parameters:
  • engine – Engine instance for retrieving finished bricks

  • pipeline – Optional pipeline object to filter specific bricks

  • include_done – If True, includes all bricks regardless of execution status. If False, only includes “Not Done” bricks.

Return (dict):

Dictionary containing:
  • ‘bricks’: Dict mapping brick IDs to their metadata

  • ‘outputs’: Set of output file paths relative to the project directory

Contains:
Private functions:
  • _update_dict: Merge two dictionaries by updating the first with the second

  • _collect_outputs: Recursively collects file paths from output values that are within the project directory.

getDate()[source]

Return the date of creation of the project.

Return (str):

The date of creation of the project if it’s not Unnamed project, otherwise empty string

getFilter(target_filter)[source]

Return a Filter object from its name.

Parameters:

target_filter (str) – Filter name

Return (Filter):

Filter object corresponding to the given name or None if not found

getFilterName()[source]

Input box to type the name of the filter to save.

Return (str):

The name typed by the user, or None if cancelled

getName()[source]

Return the name of the project.

Return (str):

The name of the project if it’s not Unnamed project, otherwise empty string

getSortOrder()[source]

Return the sort order of the project.

Return (str):

Sort order of the project if it’s not Unnamed project, otherwise empty string

getSortedTag()[source]

Return the sorted tag of the project.

Return (str):

Sorted tag of the project if it’s not Unnamed project, otherwise empty string

get_data_history(path)[source]

Get the processing history for the given data file.

The history dict contains several elements:
  • parent_files: set of other data used (directly or indirectly) to produce the data.

  • processes: set of processing bricks, from each ancestor data, which lead to the given one. Elements are process (brick) UUIDs.

Parameters:

path – Path to the data file

Returns:

history (dict)

get_finished_bricks_in_pipeline(pipeline)[source]

Retrieves a dictionary of finished processes (bricks) from a given pipeline, including nested pipelines, if any.

Parameters:

pipeline (Pipeline or Process) – The pipeline or single process to analyze. If a single process is provided, it will be treated as a minimal pipeline.

Return (dict):

A dictionary where keys are process UUIDs (brick IDs) and values are dictionaries containing the associated process instances.

get_finished_bricks_in_workflows(engine)[source]

Retrieves a dictionary of finished bricks (jobs) from Soma-Workflow workflows.

Parameters:

engine (object) – The engine instance used to interact with the study configuration and Soma-Workflow module.

Return (dict):

A dictionary where keys are brick IDs (UUIDs) and values are dictionaries containing metadata about each finished job, including:
  • workflow: The workflow ID in which the job is contained.

  • job: The Soma-Workflow job instance.

  • job_id: The ID of the job in Soma-Workflow.

  • swf_status: The status information for the job in Soma-Workflow.

get_orphan_bricks(bricks=None)[source]

Identifies orphan bricks and their associated weak files.

Parameters:

bricks (list or set) – A list or set of brick IDs to filter the search. If None, all bricks in the database are considered. Defaults to None.

Return (tuple):

A tuple containing two sets:
  • orphan (set): Brick IDs considered orphaned, meaning they have no valid or existing outputs linked to the current database.

  • orphan_weak_files (set): Paths to weak files associated with orphaned bricks, such as script files or files that no longer exist.

get_orphan_history()[source]

Identifies orphaned history entries, their associated orphan bricks, and weak files.

Return (tuple):

A tuple containing three sets:
  • orphan_hist (set): IDs of history entries that are no longer linked to any current document in the database.

  • orphan_bricks (set): IDs of bricks associated with orphaned history entries.

  • orphan_weak_files (set): Paths to weak files (e.g., script files or non-existent files) linked to orphaned history entries.

get_orphan_nonexisting_files()[source]

Retrieves orphaned files listed in the database that no longer exist on the filesystem.

Return (set):

A set of filenames from the database that are not found on the filesystem and are not associated with existing bricks.
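The filesystem half of this check can be pictured with a short sketch. It assumes the database stores filenames relative to the project folder and leaves out the brick association test entirely; the helper missing_files is hypothetical and is not part of the Project class.

```python
import os
import tempfile


def missing_files(filenames, project_folder):
    """Return the relative filenames not found under ``project_folder``."""
    return {
        name
        for name in filenames
        if not os.path.exists(os.path.join(project_folder, name))
    }


with tempfile.TemporaryDirectory() as folder:
    # Create one file on disk; the second name is deliberately left missing.
    open(os.path.join(folder, "kept.nii"), "w").close()
    print(missing_files(["kept.nii", "gone.nii"], folder))  # {'gone.nii'}
```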

hasUnsavedModifications()[source]

Return whether the project has unsaved modifications.

Return (bool):

True if the project has pending modifications, False otherwise

init_filters()[source]

Initializes project filters by loading them from stored JSON files.

This method sets the currentFilter to a default empty filter and populates the filters list with Filter objects created from the stored JSON files.

loadProperties()[source]

Loads the project properties from the ‘properties.yml’ file.

This method reads the project’s YAML properties file and returns its contents as a Python dictionary.

Return (dict):

A dictionary containing the project properties if successfully loaded, or None if an error occurs.

redo(table)[source]

Redo the last action made by the user on the project.

Parameters:

table (QTableWidget) – The table on which to apply the modifications.

Actions that can be redone:
  • add_tag

  • remove_tags

  • add_scans

  • modified_values

  • modified_visibilities

Raises:

ValueError – If an unknown action type is encountered.

reput_values(values)[source]

Re-put the value objects in the database.

Parameters:

values (list) – List of Value objects

saveConfig()[source]

Save the changes in the properties file.

saveModifications()[source]

Save the pending operations of the project (actions still not saved).

save_current_filter(custom_filters)[source]

Save the current filter.

Parameters:

custom_filters – The customized filter

setCurrentFilter(new_filter)[source]

Set the current filter of the project.

Parameters:

new_filter – New Filter object

setDate(date)[source]

Set the date of the project.

Parameters:

date – New date of the project

setName(name)[source]

Set the name of the project if it’s not Unnamed project, otherwise does nothing.

Parameters:

name (str) – New name of the project

setSortOrder(order)[source]

Set the sort order of the project.

Parameters:

order – New sort order of the project (ascending or descending)

setSortedTag(tag)[source]

Set the sorted tag of the project.

Parameters:

tag – New sorted tag of the project

undo(table)[source]

Undo the last action made by the user on the project.

Parameters:

table – Table on which to apply the modifications

Actions that can be undone:
  • add_tag

  • remove_tags

  • add_scans

  • modified_values

  • modified_visibilities

unsaveModifications()[source]

Unsave the pending operations of the project.

property unsavedModifications

Getter for _unsavedModifications.

update_db_for_paths(new_path=None)[source]

Update database paths when renaming or loading a project.

This method updates path references in the database when a project is renamed or loaded from a different location. It scans the HISTORY and BRICK collections to identify the old project path, then systematically replaces it with the new path.

The method looks for the old path in brick input/output fields and history pipeline XML data. If the old path contains ‘data/derived_data’, the method uses the portion before this segment as the base path.

Parameters:

new_path (str) – The new project path. If not provided, the current project folder path is used.

Contains:
Private method:
  • _update_json_data: Helper method to update paths in JSON data structures
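The path replacement described above can be approximated with the sketch below. It is written under stated assumptions and does not reproduce _update_json_data: the names replace_path_prefix and base_project_path, the sample paths, and the use of forward slashes as separators are all illustrative.

```python
def replace_path_prefix(data, old_path, new_path):
    """Return a copy of ``data`` with ``old_path`` replaced by ``new_path``.

    Sketch only: strings are rewritten, lists and dicts are walked
    recursively, and every other value is returned unchanged.
    """
    if isinstance(data, str):
        return data.replace(old_path, new_path)
    if isinstance(data, list):
        return [replace_path_prefix(item, old_path, new_path) for item in data]
    if isinstance(data, dict):
        return {
            key: replace_path_prefix(value, old_path, new_path)
            for key, value in data.items()
        }
    return data


def base_project_path(path, marker="data/derived_data"):
    """Return the part of ``path`` located before ``marker``, if present."""
    head, sep, _ = path.partition(marker)
    return head.rstrip("/") if sep else path


# Hypothetical brick outputs, for illustration only.
outputs = {"out": ["/old/place/proj/data/derived_data/out.nii"], "n": 3}
old_root = base_project_path(outputs["out"][0])
print(old_root)  # /old/place/proj
print(replace_path_prefix(outputs, old_root, "/new/proj"))
# {'out': ['/new/proj/data/derived_data/out.nii'], 'n': 3}
```

Returning a new structure instead of mutating in place keeps the sketch easy to test; the real helper may well update the data in place.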

populse_mia.data_manager.project_properties module

Module that contains the class to handle the projects saved in the software.

Contains:
Class:
  • SavedProjects

class populse_mia.data_manager.project_properties.SavedProjects[source]

Bases: object

Handles all saved projects in the software.

Methods:
  • addSavedProject: Adds a new saved project.

  • loadSavedProjects: Loads saved projects from ‘saved_projects.yml’.

  • removeSavedProject: Removes a project from the config file.

  • saveSavedProjects: Saves projects to ‘saved_projects.yml’.

__init__()[source]

Initializes the saved projects from ‘saved_projects.yml’.

Attributes:

  • savedProjects (dict): Dictionary containing saved project paths.

  • pathsList (list): List of saved project paths.

addSavedProject(newPath)[source]

Adds a project path or moves it to the front if it exists.

Parameters:

newPath (str) – Path of the new project.

Return (list):

Updated project paths list.
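The add-or-move-to-front behaviour can be illustrated with a plain list. This is a minimal sketch, assuming a new path is also inserted at the front; the function add_saved_path is hypothetical and does not touch ‘saved_projects.yml’.

```python
def add_saved_path(paths, new_path):
    """Insert ``new_path`` at the front of ``paths``, without duplicates."""
    if new_path in paths:
        paths.remove(new_path)
    paths.insert(0, new_path)
    return paths


recent = ["/projects/a", "/projects/b"]
add_saved_path(recent, "/projects/b")  # existing path moves to the front
add_saved_path(recent, "/projects/c")  # new path is added at the front
print(recent)  # ['/projects/c', '/projects/b', '/projects/a']
```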

loadSavedProjects()[source]

Loads saved projects from ‘saved_projects.yml’, or creates a default file if missing.

Return (dict):

Loaded project paths.

removeSavedProject(path)[source]

Removes a project path from pathsList and updates the file.

Parameters:

path (str) – Path to remove.

saveSavedProjects()[source]

Writes savedProjects to ‘saved_projects.yml’.