by BioSymetrics

A biomedical AI language designed to simplify and automate data science workflows

Augusta is a biomedical AI (Artificial Intelligence) and ML (Machine Learning) framework designed to shift time away from data pre-processing and integration and toward model building and interrogation, using familiar toolsets within Python. Augusta begins with diverse, raw medical data types (e.g. images, chemical structures, genomic data, tabular data) and operates across three modules:
  1. Augusta Pre-Processing
  2. Augusta ML (Machine Learning)
  3. Augusta Architect
Common Use Cases:
  • Drug discovery and development, incl. small molecule activity prediction
  • Diagnostics & precision medicine
  • Patient outcomes prediction and stratification
  • Preprocessing
  • Feature reduction & selection
  • Data Integration (e.g. combining genomics with clinical data)
  • Model creation
  • Model tuning
  • Model training
  • Model interrogation
  • Visualization
  • Faster, effective data pre-processing, directly integrated with model building
  • Seamless distributed computing
  • Flexible architecting of processing pipelines, changing as data type and volume requires
  • Adaptable to changing needs/preferences over time


  • Quickly standardize or normalize data
  • Permute over pre-processing options
  • Create workflows where critical biases can be reduced
  • Save, modify, and re-run workflows
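
As a rough illustration of standardizing, normalizing, and permuting over pre-processing options, the sketch below uses only the Python standard library. It is not Augusta's API: the function names, the option tables, and the toy values are all assumptions made for this example.

```python
import math
from itertools import product
from statistics import mean, pstdev

def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def min_max_normalize(values):
    """Rescale linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical option tables: one choice is taken from each per pipeline.
TRANSFORMS = {
    "none": lambda xs: list(xs),
    "log1p": lambda xs: [math.log1p(x) for x in xs],
}
SCALERS = {
    "standardize": standardize,
    "min_max": min_max_normalize,
}

def permute_preprocessing(values):
    """Run every transform/scaler combination and key results by option names."""
    results = {}
    for (t_name, transform), (s_name, scale) in product(
        TRANSFORMS.items(), SCALERS.items()
    ):
        results[(t_name, s_name)] = scale(transform(values))
    return results

# Toy, made-up measurements with one large value.
raw = [1.0, 2.0, 3.0, 4.0, 10.0]
variants = permute_preprocessing(raw)
for options, processed in variants.items():
    print(options, [round(v, 3) for v in processed])
```

Each variant could then be passed to the same downstream model, so that the effect of a pre-processing decision is compared on equal terms.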
Use with:
  • Any Data Source: BYOD (Bring Your Own Data) local, databases, cloud
  • Any Combination of Sources: Modular and customizable pipelines for processing raw data in any combination
Sample Data Pipelines:
  • MRI/fMRI and other imaging modalities
  • EEG
  • Genomics
  • Proteomics
  • EHR/EMR data
  • Custom data options available
Feature Optimization

Integrating data of various types (e.g. combining genomics with clinical data) enables the engineering of unique features, which can yield greater machine learning insights. Features can be easily grouped, sub-grouped, and archived, making them readily accessible to models, increasing the available tuning parameters, and enhancing interrogation capabilities.
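
One way to picture this kind of integration and feature grouping is a join on a shared patient identifier, with each feature tagged by the source it came from. The sketch below is a minimal standard-library illustration, not Augusta's implementation; the patient IDs, feature names, values, and the `integrate` and `select_group` helpers are hypothetical.

```python
# Toy, made-up records keyed by a hypothetical patient ID.
clinical = {
    "P001": {"age": 54, "systolic_bp": 130},
    "P002": {"age": 61, "systolic_bp": 145},
    "P003": {"age": 47, "systolic_bp": 118},
}
genomic = {
    "P001": {"variant_a": 1, "variant_b": 0},
    "P002": {"variant_a": 0, "variant_b": 1},
    "P004": {"variant_a": 1, "variant_b": 1},
}

def integrate(sources):
    """Inner-join the sources on patient ID, prefixing features with their group."""
    shared_ids = set.intersection(*(set(table) for table in sources.values()))
    merged = {}
    for pid in sorted(shared_ids):
        row = {}
        for group, table in sources.items():
            for feature, value in table[pid].items():
                row[f"{group}.{feature}"] = value
        merged[pid] = row
    return merged

def select_group(merged, group):
    """Return only the features that belong to one named group."""
    prefix = group + "."
    return {
        pid: {name: v for name, v in row.items() if name.startswith(prefix)}
        for pid, row in merged.items()
    }

merged = integrate({"clinical": clinical, "genomic": genomic})
genomic_only = select_group(merged, "genomic")
print(merged)
print(genomic_only)
```

Only patients present in every source survive the join here; a real pipeline would need an explicit policy for records that are missing from one source.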


Model Creation

Use machine learning models from TensorFlow and SciPy, with a unified syntax and output.

Model Tuning and Interrogation

Iterate over model-specific parameters and investigate how combinations of pre-processing decisions and model hyperparameters affect model performance.

  • Evaluate model performance via cross-validation, using metrics such as accuracy, precision/recall and AUC
  • Implement multiple feature reduction methods and evaluate impact on model performance
Visualization incorporating Seaborn and Matplotlib packages
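
The tuning loop described above (iterate over a hyperparameter, score each setting by cross-validation) can be sketched in plain Python. This is an assumption-laden illustration rather than Augusta's syntax: the one-dimensional k-nearest-neighbour classifier, the fold assignment, and the toy data are stand-ins chosen to keep the example self-contained, and AUC is omitted for brevity.

```python
def knn_predict(train_x, train_y, x, k):
    """Majority vote among the k training points closest to x (binary labels)."""
    nearest = sorted(zip(train_x, train_y), key=lambda pair: abs(pair[0] - x))[:k]
    positive_votes = sum(label for _, label in nearest)
    return 1 if 2 * positive_votes > len(nearest) else 0

def classification_scores(y_true, y_pred):
    """Accuracy, precision, and recall for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

def cross_validate(xs, ys, k, n_folds=5):
    """Hold out each fold once, predict it from the rest, score pooled predictions."""
    predictions = [None] * len(xs)
    for fold in range(n_folds):
        test_idx = [i for i in range(len(xs)) if i % n_folds == fold]
        train_idx = [i for i in range(len(xs)) if i % n_folds != fold]
        train_x = [xs[i] for i in train_idx]
        train_y = [ys[i] for i in train_idx]
        for i in test_idx:
            predictions[i] = knn_predict(train_x, train_y, xs[i], k)
    return classification_scores(ys, predictions)

# Toy, made-up data: one feature, two well-separated classes.
xs = [0.1, 0.2, 0.3, 0.4, 0.5, 1.1, 1.2, 1.3, 1.4, 1.5]
ys = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

# Iterate over the model-specific parameter k and collect metrics per setting.
results = {k: cross_validate(xs, ys, k) for k in (1, 3, 5)}
best_k = max(results, key=lambda k: results[k]["accuracy"])
for k, metrics in results.items():
    print(k, metrics)
```

The same outer loop could also range over pre-processing choices, so that each (pre-processing, hyperparameter) pair gets its own cross-validated score.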


Augusta Architect is a simple, Python-based syntax for processing and integrating multiple, diverse data types and for running and comparing multiple machine learning algorithms.

At a glance