GLEAM (Galaxy Learning and Modeling) is a suite of machine learning tools for the Galaxy platform. Developed by the Goecks Lab, GLEAM lets researchers train models, generate predictions, and produce reproducible reports from a user-friendly interface, without writing code.
- Modern best practices for machine learning
 - Reproducible and scalable workflows
 - Machine learning support for diverse data types: tabular, image, text, categorical, and more
 - Deep learning via Ludwig and automated ML via PyCaret
 - Easy installation in Galaxy via XML wrappers
 - Auto-generated visual reports
 
Machine learning for structured tabular datasets using PyCaret.
- Train classification and regression models
 - Evaluate performance and extract feature importance
 - Generate predictions on new datasets
 - Create interactive HTML reports
 
Deep learning-based image classification using Ludwig.
- Input files: a ZIP file of images and a CSV file of metadata
 - Task: classification
 - Models available: ResNet, EfficientNet, VGG, ShuffleNet, ViT, AlexNet, and more
 - Output: a Ludwig_model file; an HTML report (with learning curves, confusion matrices, and similar diagnostics); and a collection of CSV, JSON, and PNG files containing the predictions, experiment statistics, and visualizations
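
Before uploading, it can help to confirm that every image referenced in the metadata CSV is actually present in the ZIP archive. The following is a minimal sketch of such a check, not part of GLEAM itself. It assumes the CSV has a column of image paths, hypothetically named `image_path` here; the column names the tool actually expects are not stated in this README.

```python
import csv
import io
import zipfile


def find_missing_images(zip_bytes, csv_text, path_column="image_path"):
    """Return image paths listed in the CSV that are absent from the ZIP archive.

    `path_column` is a hypothetical column name used only for illustration.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        archived = set(archive.namelist())
    rows = csv.DictReader(io.StringIO(csv_text))
    return [row[path_column] for row in rows if row[path_column] not in archived]


def _demo_zip(names):
    """Build a small in-memory ZIP with placeholder file contents."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in names:
            archive.writestr(name, b"placeholder")
    return buffer.getvalue()


# Demo data: the CSV references one image that is not in the archive.
demo_zip = _demo_zip(["cat_01.png", "dog_01.png"])
demo_csv = "image_path,label\ncat_01.png,cat\ndog_01.png,dog\ndog_02.png,dog\n"
missing = find_missing_images(demo_zip, demo_csv)
print(missing)
```

An empty result means the two inputs are consistent with each other under this assumed layout.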
 
General-purpose interface to Ludwig's full machine learning capabilities.
- Train and evaluate models on structured input (tabular, image, text, etc.)
 - Expose Ludwig’s flexible configuration system
 - Ideal for users needing advanced model customization
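
For orientation, a Ludwig configuration declares input features, output features, and optional training settings. The sketch below builds such a configuration as a Python dictionary and prints it as JSON. The column names (`age`, `clinical_note`, `diagnosis`) are hypothetical, and how the Galaxy tool accepts a configuration is not described in this README, so treat this only as an illustration of the general shape.

```python
import json

# Hypothetical column names; replace them with the columns of your own dataset.
config = {
    "input_features": [
        {"name": "age", "type": "number"},
        {"name": "clinical_note", "type": "text"},
    ],
    "output_features": [
        {"name": "diagnosis", "type": "category"},
    ],
    # Optional training settings; 10 epochs is an arbitrary illustrative value.
    "trainer": {"epochs": 10},
}


def feature_names(cfg, section):
    """List the feature names declared in one section of the configuration."""
    return [feature["name"] for feature in cfg[section]]


config_json = json.dumps(config, indent=2)
print(config_json)
```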
 
A set of three specialized tools that transform large, raw pathology images into a structured format, so that models can be developed following best practices and the data are ready for robust, efficient training.
- Image Tiler: Accepts .svs image format, which is the most common proprietary format for digital pathology whole slide images.
 - Embedding Extractor: Uses pre-trained models from TorchVision for feature extraction (for example, ResNet50, EfficientNet_B0, DenseNet121).
 - Multiple Instance Learning (MIL) Bag Processor: Facilitates the aggregation of embeddings from individual image tiles into "bags" using various pooling techniques (such as Max Pooling or Attention Pooling).
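
To illustrate what pooling tile embeddings into a bag means, here is a simplified sketch in plain Python. It is not the tool's implementation. Max pooling takes the element-wise maximum over tiles. The attention variant shown here scores each tile by a dot product with a fixed `query` vector and averages tiles with softmax weights; in a trained attention-pooling model those scoring parameters would be learned, so the fixed query is an assumption made purely for illustration.

```python
import math


def max_pool(tile_embeddings):
    """Element-wise maximum across all tile embeddings in a bag."""
    return [max(values) for values in zip(*tile_embeddings)]


def attention_pool(tile_embeddings, query):
    """Softmax-weighted average of tiles, scored by dot product with `query`."""
    scores = [sum(e * q for e, q in zip(emb, query)) for emb in tile_embeddings]
    peak = max(scores)  # subtract the peak score for numerical stability
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    weights = [value / total for value in exps]
    pooled = [
        sum(weight * emb[i] for weight, emb in zip(weights, tile_embeddings))
        for i in range(len(query))
    ]
    return pooled, weights


# Three tiles with two-dimensional embeddings (toy values).
bag = [[0.1, 0.9], [0.8, 0.2], [0.4, 0.4]]
bag_max = max_pool(bag)
bag_attention, attention_weights = attention_pool(bag, query=[1.0, 0.0])
print(bag_max)
print(attention_weights)
```

Either way, a bag with any number of tiles is reduced to a single fixed-length vector, which is what makes slide-level training practical.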
 
GLEAM tools are available in the Galaxy ToolShed and can be installed directly into your Galaxy instance:
- Log in to your Galaxy instance as an administrator
 - Navigate to Admin → Install and Uninstall (or Manage Tools)
 - Search for the following tool suites under the goeckslab owner:
   - suite_tabular_learner: TabularLearner tools
   - suite_imagelearner: ImageLearner tools
   - suite_ludwig: Galaxy-Ludwig tools
   - suite_tiler: Image Tiler tool
   - suite_embedding_extractor: Embedding Extractor tool
   - suite_mil_bag: Multiple Instance Learning Bag Processor tool
 - Select the desired tool suites and click Install
 
Galaxy will automatically handle dependencies and configuration.
If you prefer to install from source or need to modify the tools:
- Clone the repository:
   git clone https://github.com/goeckslab/gleam.git
 - Add an entry for each tool to the tool_conf.xml of your Galaxy instance:
   <tool file="<path-to-your-local-tabularlearner/tabular_learner.xml>" />
   <tool file="<path-to-your-local-imagelearner/image_learner_train.xml>" />
   <tool file="<path-to-your-local-galaxy-ludwig/ludwig_train.xml>" />
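
If you maintain several wrappers, the `<tool>` entries can be generated rather than typed by hand. The sketch below is one possible way to do that. The checkout location and subdirectory names are hypothetical placeholders; only the three wrapper file names come from the entries above.

```python
import xml.etree.ElementTree as ET
from pathlib import PurePosixPath

# Hypothetical checkout location and subdirectories; adjust to your local clone.
CHECKOUT = PurePosixPath("/opt/galaxy-tools/gleam")
WRAPPERS = [
    "tabular_learner/tabular_learner.xml",
    "image_learner/image_learner_train.xml",
    "galaxy-ludwig/ludwig_train.xml",
]


def tool_entries(checkout, wrappers):
    """Render one <tool> element, with a `file` attribute, per wrapper path."""
    entries = []
    for relative in wrappers:
        element = ET.Element("tool", {"file": str(checkout / relative)})
        entries.append(ET.tostring(element, encoding="unicode"))
    return entries


entries = tool_entries(CHECKOUT, WRAPPERS)
print("\n".join(entries))
```

Paste the printed lines into tool_conf.xml, then restart Galaxy so the new tools are picked up.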
 
We welcome contributions. To propose new tools, report bugs, or suggest improvements:
- Fork the repository
 - Create a feature branch
 - Commit and test your changes
 - Submit a pull request