Azure AI Foundry Labs | Early-Stage AI Experiments & Prototypes

Promptions

Promptions dynamically generates UI that helps users steer AI responses. It is simple and easily customizable, which makes it suitable for developers ranging from individual vibe-coders to enterprise software engineers.

Experiment

RosettaFold3

RosettaFold3 (RF3) is a unified biomolecular modeling system that predicts 3D structures of proteins, nucleic acids, and small molecules within a single framework.

Experiment

MMCT-Agent

MMCT-Agent performs complex visual reasoning over images and long-form videos.

Experiment

RetroChimera

RetroChimera is a model that takes a target molecule to be synthesized, encoded as a sequence of characters in SMILES notation, and proposes several candidate chemical reactions that could produce it. Each reaction is represented as a group of ingredients (reactant molecules), and those molecules…
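To make the input and output shapes concrete, here is a minimal sketch of a target SMILES string going in and a list of candidate reactions coming out. It is not RetroChimera's API: `Reaction`, `toy_predict`, and the hand-written lookup table are hypothetical stand-ins for the model, and serializing each candidate as reaction SMILES (`reactants>>product`) is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Reaction:
    """One candidate reaction: a group of reactant SMILES and the target product."""
    reactants: Tuple[str, ...]
    product: str

    def to_reaction_smiles(self) -> str:
        # Reaction SMILES convention: molecules joined by ".", sides separated by ">>".
        return ".".join(self.reactants) + ">>" + self.product


def toy_predict(target_smiles: str) -> List[Reaction]:
    """Hypothetical stand-in for the model: looks up hand-written candidates."""
    known = {
        # Ethyl acetate from acetic acid + ethanol, or acetyl chloride + ethanol.
        "CC(=O)OCC": [("CC(=O)O", "CCO"), ("CC(=O)Cl", "CCO")],
    }
    return [Reaction(r, target_smiles) for r in known.get(target_smiles, [])]


candidates = toy_predict("CC(=O)OCC")
for rxn in candidates:
    print(rxn.to_reaction_smiles())
```

A real model would rank many such candidates; the sketch only shows how one target maps to several reactant groups.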

Experiment

Data Formulator

Data Formulator blends natural language and visual interfaces to help analysts explore and visualize data with AI agents. Starting with any data (clean or messy, small or large), analysts can easily describe their goals to AI agents and collaboratively explore data in different directions in parallel to discover new insights.

Experiment

Magentic Marketplace

Magentic Marketplace is an open-source simulation environment for exploring agentic markets and their societal implications at scale. It provides a foundation for studying these markets and guiding them toward outcomes that benefit everyone.

Experiment

Skala

Skala is a deep-learning-based exchange-correlation functional that achieves experimental accuracy in density functional theory. Trained on the largest high-accuracy dataset of molecular energies, Skala advances computational chemistry with reliable, scalable predictions for molecules and materials.

Experiment

MatterGen

MatterGen is a diffusion-based generative model for inorganic materials design. It can propose stable, novel crystal structures and be guided by target properties like bulk modulus, band gap, or magnetic density, accelerating materials discovery.

Experiment

CalcLM

CalcLM is a prototype from Azure AI Foundry Labs that experiments with bringing agentic AI into the grid interface of Excel. It lets users express agent steps as formulas, chain tasks across cells, and quickly explore how simple agent workflows might run on a familiar surface.

Experiment

Debug-gym

Debug-gym is an open-source research environment for teaching AI coding agents to debug more like humans—interactively and iteratively. With tools like Python’s pdb, agents can set breakpoints, inspect code, and run tests, enabling smarter, more reliable coding workflows.

Experiment

Trellis

Trellis is a research system for generating editable 3D assets from simple text or image prompts. Using a novel latent representation, it produces meshes, radiance fields, and 3D Gaussians with rich texture and structure—accelerating workflows in gaming, AR/VR, digital twins, and industrial design.

Experiment

NextCoder

NextCoder is a research model series designed to improve code editing. Using a novel synthetic data pipeline and the SeleKT adaptation algorithm, it teaches language models to handle diverse edit requirements while retaining strong code generation skills, outperforming peers across multiple benchmarks.

Experiment

TypeAgent

TypeAgent is research sample code exploring how to build a single personal agent with natural language interfaces. By distilling language models into logical structures for actions, memory, and plans, it enables safer, faster, and lower-cost ways to map user requests into meaningful applications.

Experiment

Magentic-UI

Magentic-UI is a research platform for advancing human-in-the-loop AI experiences. It explores co-planning, co-tasking, and task learning. Features like action guards and safe sandboxes enable more trustworthy human-AI collaboration.

Experiment

MCP Server

The MCP Server for Azure AI Foundry Labs equips GitHub Copilot with custom tools for model discovery, integration guidance, and rapid prototyping. By streamlining workflows and reducing the idea-to-prototype cycle to under 10 minutes, it accelerates research adoption and boosts developer productivity.

Experiment

EvoDiff

EvoDiff is a diffusion framework for controllable protein generation in sequence space. Trained on evolutionary-scale data, it produces high-fidelity, diverse, and structurally plausible proteins, enabling novel designs beyond structure-based models and advancing sequence-first protein engineering.

Experiment

PEACE

PEACE enhances multimodal large language models with geologic expertise, enabling accurate interpretation of complex maps. By combining structured extraction, domain knowledge, and reasoning, it supports disaster risk assessment, resource exploration, and infrastructure planning—transforming general AI into a specialized geoscience tool.

Experiment

ReMe

ReMe is a web-based framework that helps researchers build AI chatbots for personalized memory and cognitive training. By combining puzzle tasks, life-logging, and multimodal interaction, it enables more engaging, adaptable interventions to advance digital health and non-pharmacological approaches to cognitive care.

Experiment

BitNet

BitNet is the first open-source, native 1-bit large language model, with every parameter represented as −1, 0, or 1 (a ternary value carries about 1.58 bits). Scaled to 2 billion parameters, it demonstrates how ternary LLMs can achieve strong performance while dramatically reducing memory, compute, and energy requirements for AI training and inference.
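As a rough illustration of what a ternary parameter means, the sketch below maps a handful of floating-point weights to −1, 0, or 1 plus one shared scale. The page does not describe BitNet's quantizer; this assumes the absmean scheme (divide by the mean absolute weight, round, clip to [−1, 1]), and the names `absmean_ternary` and `dequantize` are hypothetical. BitNet is trained natively in this format, so the example shows the representation only, not the training procedure.

```python
def absmean_ternary(weights, eps=1e-8):
    """Map floats to {-1, 0, 1} plus one shared scale (assumed absmean scheme)."""
    # Scale is the mean absolute weight; eps guards against division by zero.
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    # Round each scaled weight to the nearest integer, then clip to [-1, 1].
    ternary = [max(-1, min(1, round(w / scale))) for w in weights]
    return ternary, scale


def dequantize(ternary, scale):
    """Recover approximate float weights from ternary values and the scale."""
    return [t * scale for t in ternary]


weights = [0.9, -0.05, 0.4, -1.2, 0.02, -0.6]
ternary, scale = absmean_ternary(weights)
print(ternary)                     # [1, 0, 1, -1, 0, -1]
print(dequantize(ternary, scale))  # coarse approximation of the originals
```

Storing three possible values per weight instead of a 16-bit float is where the memory savings come from; multiplying by −1, 0, or 1 also reduces matrix products to additions and subtractions.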

Experiment

MatterSim

MatterSim is a deep learning model for accurate, efficient materials simulation and property prediction across a wide range of elements, temperatures, and pressures. It enables in silico materials design—supporting metals, oxides, sulfides, halides, and more across crystalline, amorphous, and liquid states.

Experiment

TamGen

TamGen is a transformer-based chemical language model for target-specific drug design. It can generate novel compounds or optimize existing ones by creating target-aware fragments—accelerating discovery, reducing costs, and highlighting the promise of generative AI in advancing new treatments.

Experiment

Aurora

Aurora is a large-scale foundation model for atmospheric forecasting. Trained on over a million hours of weather and climate simulations, it predicts variables like wind, temperature, and air quality with high resolution and speed—helping anticipate and mitigate extreme weather impacts.

Experiment

OmniParser V2

OmniParser V2 is a pioneering screen parsing module that turns user interfaces into actionable elements through visual input. By recognizing and mapping UI components, it enables more robust automation, helping agents interact with apps and workflows more effectively across platforms.

Experiment

Magentic-One

Magentic-One is a generalist multi-agent system for complex web and file-based tasks. With an Orchestrator coordinating specialized agents, it automates multi-step workflows across environments—using task and progress ledgers to plan, adapt, and optimize actions for efficient problem-solving.
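The orchestrator-and-ledgers pattern can be sketched in a few lines. This is not Magentic-One's code or API: `TaskLedger`, `ProgressLedger`, `orchestrate`, the `"agent: instruction"` step format, and the two toy agents are all hypothetical, chosen only to show a coordinator routing plan steps to specialized agents while recording facts and progress.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Agent = Callable[[str], str]  # takes an instruction, returns a result ("" on failure)


@dataclass
class TaskLedger:
    """Hypothetical task ledger: the goal, the plan, and facts gathered so far."""
    goal: str
    plan: List[str]
    facts: List[str] = field(default_factory=list)


@dataclass
class ProgressLedger:
    """Hypothetical progress ledger: completed steps and a stall counter."""
    completed: List[str] = field(default_factory=list)
    stalls: int = 0


def orchestrate(task: TaskLedger, agents: Dict[str, Agent], max_stalls: int = 2) -> ProgressLedger:
    progress = ProgressLedger()
    for step in task.plan:
        # Each step is written as "<agent name>: <instruction>".
        agent_name, _, instruction = step.partition(":")
        result = agents[agent_name.strip()](instruction.strip())
        if result:
            task.facts.append(result)
            progress.completed.append(step)
        else:
            progress.stalls += 1
            if progress.stalls > max_stalls:
                break  # a fuller system would revise the plan here instead of stopping
    return progress


agents = {
    "web": lambda instr: f"fetched page for '{instr}'",
    "file": lambda instr: f"wrote notes for '{instr}'",
}
task = TaskLedger(
    goal="Summarize a release",
    plan=["web: find release notes", "file: save summary"],
)
progress = orchestrate(task, agents)
print(progress.completed, progress.stalls)
```

Keeping the plan and the progress record in separate structures lets the coordinator detect stalls without rewriting the plan on every step.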

Experiment

ExACT

ExACT is an approach for teaching AI agents to explore more effectively, enabling them to intelligently navigate their environments, gather valuable information, evaluate options, and identify optimal decision-making and planning strategies.

Experiment

PromptWizard

PromptWizard is a self-evolving framework that automates prompt optimization. By iteratively refining instructions and examples with model feedback, it delivers task-aware, high-quality prompts in minutes—combining expert reasoning, joint optimization, and adaptability across diverse use cases.

Experiment

BioEmu-1

BioEmu-1 is a deep learning model that generates thousands of protein structures per hour on a single GPU—vastly more efficient than classical simulations. By modeling structural ensembles, it opens new insights into how proteins function and accelerates drug discovery and biomedical research.

Experiment

Magma

Magma is a multimodal foundation model that perceives text and visuals to generate actions in both digital and physical environments. From navigating UIs to manipulating real-world tools, it advances the vision of general-purpose AI assistants that integrate seamlessly across contexts.

PUBLIC PREVIEW

Microsoft Phi-4

Phi-4 expands Microsoft’s small language model family with two new releases: Phi-4-multimodal, integrating speech, vision, and text for richer interactions, and Phi-4-mini, a compact model excelling at reasoning, coding, and long-context tasks. Together, they enable efficient, versatile AI across apps and devices.

Experiment

Muse

Muse is a World and Human Action Model (WHAM) developed by Microsoft Research with Ninja Theory. Trained on the game Bleeding Edge, it can generate visuals, controller actions, or both—showcasing how generative AI could accelerate creativity and interactive design in gaming.
