LLM Factory is a locally controlled platform that lets users interact with the latest open-source large language models (LLMs) through an intuitive web interface or programmatically via API calls. As the foundation for all internal LLM research and applications, the platform has proven its versatility: users can securely fine-tune their own models and query them on locally controlled infrastructure.
Cutting-edge models like DeepSeek R1, Llama 3.2 90B Vision, and Llama 3.0 8B are accessible via the chat interface or through an OpenAI-compatible API. The OpenAI client library, and other tools that speak the OpenAI interface, work with the API, giving users access to a range of advanced features, including chat, embeddings, transcription, function calling, and more.
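As a minimal sketch of programmatic access, the standard OpenAI Python client can be pointed at the platform's endpoint. The base URL, API key, and model name below are placeholder assumptions; substitute the values provided for your LLM Factory account.

```python
# Minimal sketch: querying an LLM Factory model through the OpenAI-compatible API.
from openai import OpenAI

client = OpenAI(
    base_url="https://llm-factory.example.edu/v1",  # placeholder endpoint URL
    api_key="YOUR_LLM_FACTORY_API_KEY",             # placeholder key
)

response = client.chat.completions.create(
    model="llama-3-8b",  # placeholder model identifier
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize the benefits of locally hosted LLMs."},
    ],
)
print(response.choices[0].message.content)
```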
LLM Factory leverages Parameter-Efficient Fine-Tuning (PEFT) methods such as Low-Rank Adaptation (LoRA) to make model customization efficient and scalable. Users can upload their own datasets, train adapters by layering data on a base model, configure and explore models privately, and expose models through an external, OpenAI-compatible API.
Layering additional trainable weights, called adapters, on top of a base model allows fine-tuning of only a small subset of parameters. Previously, custom model training required significant computational resources. With LoRA, performance comparable to traditional full fine-tuning can be achieved with significantly reduced memory usage and far fewer trainable parameters. The latest open-source models and adapter training are supported by the NVIDIA DGX computing cluster, with 3.2 TB of VRAM.
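The sketch below illustrates the general LoRA technique with the Hugging Face peft library, showing why only a small fraction of parameters become trainable. LLM Factory manages adapter training through its own interface, so the base model name and hyperparameters here are illustrative assumptions, not the platform's exact configuration.

```python
# Minimal sketch of attaching LoRA adapters to a base model with peft.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")  # example base model

lora_config = LoraConfig(
    r=16,                                 # rank of the low-rank update matrices
    lora_alpha=32,                        # scaling factor for the adapter weights
    target_modules=["q_proj", "v_proj"],  # attach adapters to attention projections
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # trainable params are a small fraction of the total
```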
A key advantage of this locally controlled platform is data security. User data, chat history, and API interactions are secure and private. Unlike commercial services with ambiguous data policies, the platform stores user interactions only locally. For projects with sensitive information, individual HIPAA-compliant instances are available upon request.
Importantly, the API endpoints are OpenAI compatible, which means that libraries, tools, and systems built against the OpenAI interface (the industry standard) can be integrated seamlessly.
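For example, other OpenAI-interface features such as embeddings work with the same client setup shown earlier. The endpoint and embedding model name below are again placeholder assumptions.

```python
# Sketch: requesting embeddings through the same OpenAI-compatible endpoint.
from openai import OpenAI

client = OpenAI(
    base_url="https://llm-factory.example.edu/v1",  # placeholder endpoint URL
    api_key="YOUR_LLM_FACTORY_API_KEY",             # placeholder key
)

embedding = client.embeddings.create(
    model="nomic-embed-text",  # placeholder embedding model identifier
    input="LLM Factory keeps data on locally controlled infrastructure.",
)
print(len(embedding.data[0].embedding))  # dimensionality of the returned vector
```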
An informational video on LLM Factory is available on YouTube.
Contact ai@uky.edu for more information.
Users of LLM Factory must agree to acknowledge the tool in their research, publications, and/or products. Read the paper on the Institutional Platform for Secure Self-Service Large Language Model Exploration in PubMed.
LLM Factory relies on local cyberinfrastructure, including inference servers, storage, and, in the case of adapter training, the DGX cluster.