Platform in-development image-classification

Vision Foundry

Vision Foundry is a self-service platform for AI-powered image analysis. It is designed for researchers working with unlabeled or hard-to-label image datasets, especially in the biomedical domain, and helps them extract high-dimensional feature representations from that data.

At the core of Vision Foundry is DinoMX, a modular PyTorch-based training framework for self-supervised representation learning with Vision Transformers (ViTs). The pipeline builds on DINO (self-distillation with no labels) and its successor DINOv2, introduced by Meta AI researchers in 2021 and 2023, respectively.
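
As a rough illustration of the self-distillation objective that DINO-style training relies on, the sketch below computes the cross-entropy between a centered, sharpened teacher distribution and a student distribution, plus the exponential-moving-average (EMA) update used for the teacher center and weights. It is a dependency-free toy in plain Python, not DinoMX code: the function names, temperatures, and momentum value are assumptions chosen for illustration, and a real pipeline would operate on PyTorch tensors over many image crops.

```python
import math


def log_softmax(logits, temperature):
    """Numerically stable log-softmax of logits divided by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    log_total = math.log(sum(math.exp(x - peak) for x in scaled))
    return [x - peak - log_total for x in scaled]


def dino_loss(student_logits, teacher_logits, center,
              student_temp=0.1, teacher_temp=0.04):
    """Cross-entropy between teacher and student distributions (DINO-style).

    The teacher output is centered (to discourage collapse onto one
    dimension) and sharpened with a lower temperature than the student.
    The temperature defaults are illustrative assumptions.
    """
    centered = [t - c for t, c in zip(teacher_logits, center)]
    teacher_probs = [math.exp(v) for v in log_softmax(centered, teacher_temp)]
    student_log_probs = log_softmax(student_logits, student_temp)
    return -sum(p * lp for p, lp in zip(teacher_probs, student_log_probs))


def ema_update(old, new, momentum):
    """Exponential moving average, used for the center and the teacher weights."""
    return [momentum * o + (1.0 - momentum) * n for o, n in zip(old, new)]


# Toy example: the student agrees with the teacher in the first case only.
center = [0.0, 0.0, 0.0]
agree = dino_loss([2.0, 0.0, 0.0], [2.0, 0.0, 0.0], center)
disagree = dino_loss([0.0, 2.0, 0.0], [2.0, 0.0, 0.0], center)
new_center = ema_update(center, [2.0, 0.0, 0.0], momentum=0.9)
```

The loss is small when the student assigns high probability to the class the teacher prefers and grows when the two disagree. No gradient flows through the teacher in this scheme; it only changes through the EMA update.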

Key Capabilities

  • LoRA Fine-Tuning (via PEFT): Parameter-efficient adaptation without retraining the full backbone
  • Knowledge Distillation: Transfer representations from a large teacher model to a smaller student model
  • ClearML Integration: Experiment tracking, logging, and artifact storage
  • Model Standardization: Hugging Face-compatible checkpoints
  • Multi-Modal Development: Integration of vision encoders with LLMs for clinical AI
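
To make the "parameter-efficient" claim for LoRA concrete, the following sketch counts trainable parameters for a single weight matrix and shows how the low-rank update is merged into a frozen weight. It is plain Python arithmetic, not the PEFT API: the 768-wide layer (the hidden size of a ViT-Base model), the rank of 8, and all function names are assumptions for illustration. Which layers DinoMX actually adapts is not stated here.

```python
def lora_param_counts(d_out, d_in, rank):
    """Return (full, lora) trainable-parameter counts for one weight matrix.

    Full fine-tuning trains all d_out * d_in entries; LoRA trains only the
    factors B (d_out x rank) and A (rank x d_in).
    """
    return d_out * d_in, rank * (d_out + d_in)


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def merged_weight(w, b, a, alpha, rank):
    """Effective weight W + (alpha / rank) * B @ A, with W kept frozen.

    B is conventionally initialized to zeros, so training starts from
    exactly the pretrained weight.
    """
    scale = alpha / rank
    delta = matmul(b, a)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


# Assumed example: one 768 x 768 projection adapted at rank 8.
full, lora = lora_param_counts(768, 768, 8)
fraction = lora / full

# Tiny 2 x 2 example of merging a rank-1 update into a frozen weight.
w = [[1.0, 0.0], [0.0, 1.0]]
merged = merged_weight(w, b=[[1.0], [2.0]], a=[[3.0, 4.0]], alpha=2.0, rank=1)
```

Under these assumptions the adapter trains 12,288 parameters in place of 589,824 for that matrix, about 2.1% of the original count.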

The initial use case focused on neuropathology. NP-TEST-0, a Vision Transformer pretrained using DinoMX on real-world neuropathology data, was developed as part of the Federated Brain Digital Slide Archive project. It supports transfer learning, tissue segmentation, patch-level classification, and similarity search.
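
Similarity search over patch embeddings can be sketched as a cosine-similarity ranking. The example below is a minimal, dependency-free illustration and not the NP-TEST-0 interface: the patch names and three-dimensional vectors are made up, whereas real ViT embeddings have hundreds of dimensions and would normally be searched with a vectorized or approximate nearest-neighbor index.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(query, index, k=3):
    """Return the k (name, score) pairs most similar to the query embedding."""
    scored = [(name, cosine_similarity(query, emb)) for name, emb in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical patch embeddings keyed by made-up patch identifiers.
patch_index = {
    "patch_a": [1.0, 0.0, 0.0],
    "patch_b": [0.9, 0.1, 0.0],
    "patch_c": [0.0, 1.0, 0.0],
    "patch_d": [-1.0, 0.0, 0.0],
}
results = top_k_similar([1.0, 0.05, 0.0], patch_index, k=2)
```

Cosine similarity ignores vector magnitude, so patches are ranked by the direction of their embeddings alone. This is a common choice for self-supervised features, although the source does not specify which metric the platform uses.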