Platform Description
GLOBEM provides three major modules and a few utility functions:
- Feature Preparation Module
- Model Computation Module
- Configuration Module
Each algorithm (`DepressionDetectionAlgorithmBase`, defined in `algorithm/base.py`) consists of these three modules to form a complete pipeline that leads to one (or more) machine learning models: from feature preparation (as model input) to model computation (to obtain model output), with parameters controlled by the configuration module.
Input
After dataset preparation (as explained on the Setup page), an initial input data point will be a standard (`feature matrix`, `label`) pair.
- `label`: the ground truth (currently a binary label) indicating a subject's self-reported depressive symptom status on a certain date.
- `feature matrix`: given the date of the `label`, the feature matrix includes the daily feature vectors from the past four weeks, with dimension `(28, # of features)`.
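For concreteness, here is a minimal sketch of such a data point in NumPy; the feature count used below is a hypothetical placeholder and depends on the dataset:

```python
import numpy as np

# A hypothetical data point: 28 daily feature vectors leading up to the
# label date. The feature count (10 here) is a placeholder.
num_features = 10
feature_matrix = np.random.rand(28, num_features)  # shape: (28, # of features)
label = 1  # binary ground truth: 1 = depressive symptoms reported, 0 = not

data_point = (feature_matrix, label)
print(data_point[0].shape, data_point[1])  # (28, 10) 1
```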
Feature Preparation Module
This module defines the features used by an algorithm as its input.
The function `DepressionDetectionAlgorithmBase.prep_data_repo` determines this process for an algorithm.
For traditional machine learning algorithms, this can be basic feature selection, aggregation, and filtering (e.g., mean, std) along the feature matrix's temporal dimension (e.g., Canzian et al., Saeb et al.), or complex feature extraction (e.g., Xu et al., Chikersal et al.).
For deep learning algorithms, this is a definition of a feature data feeding process (i.e., a data generator) that prepares data for deep model training (e.g., ERM).
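To illustrate the traditional machine learning case mentioned above, the sketch below shows a hypothetical aggregation step that collapses the temporal dimension into per-feature statistics; it is an example of the idea, not GLOBEM's actual `prep_data_repo` implementation:

```python
import numpy as np

def aggregate_features(feature_matrix: np.ndarray) -> np.ndarray:
    """Hypothetical basic aggregation: collapse the (28, # of features)
    temporal dimension into per-feature mean and std (in the spirit of
    Canzian et al. and Saeb et al.)."""
    means = feature_matrix.mean(axis=0)  # mean of each feature over 28 days
    stds = feature_matrix.std(axis=0)    # std of each feature over 28 days
    return np.concatenate([means, stds])

feature_matrix = np.random.rand(28, 10)
x = aggregate_features(feature_matrix)
print(x.shape)  # (20,) -- a flat vector ready for a traditional ML model
```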
Model Computation Module
This module defines the model construction and training process.
The function `DepressionDetectionAlgorithmBase.prep_model` determines the prediction model generated by the algorithm.
The `prep_model` function returns a `DepressionDetectionClassifierBase` object that specifies the model design, training, and prediction process.
For traditional machine learning algorithms, this can be some off-the-shelf model such as an SVM (e.g., Farhan et al.), or some customized statistical model (e.g., Lu et al.) that is ready to be trained with input data.
For deep learning algorithms, this is a definition of the deep model architecture and training process (e.g., IRM), which builds a deep model that is ready to be trained with input data.
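For intuition, a traditional machine learning classifier object in the spirit of `DepressionDetectionClassifierBase` might simply wrap an off-the-shelf model. The sketch below is a hypothetical illustration, not the real interface (which is defined in `algorithm/base.py`):

```python
import numpy as np
from sklearn.svm import SVC

class ExampleSVMClassifier:
    """Hypothetical stand-in for a DepressionDetectionClassifierBase
    subclass: bundles model design, training, and prediction."""

    def __init__(self):
        # Model design: an off-the-shelf SVM (cf. Farhan et al.)
        self.model = SVC(kernel="rbf")

    def fit(self, X: np.ndarray, y: np.ndarray) -> None:
        # Training process
        self.model.fit(X, y)

    def predict(self, X: np.ndarray) -> np.ndarray:
        # Prediction process
        return self.model.predict(X)

# Usage with aggregated features like those in the previous sketch:
X = np.random.rand(50, 20)            # 50 samples x 20 aggregated features
y = np.random.randint(0, 2, size=50)  # binary depression labels
clf = ExampleSVMClassifier()
clf.fit(X, y)
print(clf.predict(X[:5]))
```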
Multiple Models from One Algorithm
It is worth noting that one algorithm can define multiple models.
For example, ERM can use different deep learning architectures such as ERM-1D-CNN, ERM-2D-CNN, ERM-Transformer; DANN can take each dataset as a domain (DANN-dataset as domain), or each person as a domain (DANN-person as domain).
This is controlled by the config files and the `algorithm_factory`.
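A factory of this kind commonly maps a model's unique config name to the corresponding algorithm setup. The sketch below illustrates the general registry pattern using the config names from this page; it is an assumption for illustration, not the actual `algorithm_factory` code:

```python
# Hypothetical registry-style factory: the unique config file name
# selects which algorithm variant (and thus which model) gets built.
ALGORITHM_REGISTRY = {
    "ml_chikersal": lambda config: ("Chikersal et al.", config),
    "dl_dann_ds_as_domain": lambda config: ("DANN, dataset as domain", config),
    "dl_dann_person_as_domain": lambda config: ("DANN, person as domain", config),
}

def algorithm_factory_sketch(config_name: str, config: dict):
    """Return the algorithm setup matching a config file's unique name."""
    return ALGORITHM_REGISTRY[config_name](config)

print(algorithm_factory_sketch("dl_dann_person_as_domain", {}))
```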
The next page introduces how these two parts work together.
Configuration Module
This module provides the flexibility of controlling different parameters in the Feature Preparation Module and the Model Computation Module.
Each algorithm has its own unique parameters that can be added to this module.
The platform employs a simple `yaml` file system.
Each model (NOT algorithm) has its own config `yaml` file in the `config` folder with a unique file name.
For example, Chikersal et al. can have one model, and its config file is `config/ml_chikersal.yaml`;
DANN can have two models, so it has two config files: `config/dl_dann_ds_as_domain.yaml` and `config/dl_dann_person_as_domain.yaml`, respectively.
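For instance, a model's config file can be loaded and inspected with standard PyYAML; the snippet below is a generic sketch (the keys inside each file are specific to its algorithm):

```python
import yaml

# Generic sketch: load one model's config by its unique file name.
# (Run from the repository root so the relative path resolves.)
with open("config/ml_chikersal.yaml") as f:
    config = yaml.safe_load(f)

# The resulting dict carries the parameters consumed by the Feature
# Preparation Module and the Model Computation Module for this model.
print(config)
```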