Observe, Optimize, and Protect Your Hosted Agents in Microsoft Foundry - Microsoft Build 2026

AI agents are moving from prototypes into real enterprise workflows, but traditional monitoring tools were not built to explain why an agent behaved unpredictably across multi-step tasks.

Event Snapshot

Session: Observe, Optimize, and Protect Your Hosted Agents in Microsoft Foundry
Location: In-person at Microsoft Build 2026
Format: Hands-on technical lab with sandbox access
Focus: Observability, evaluation suites, and secure agent deployment
Audience: AI developers, platform engineers, architects, and technical leaders building hosted agents
Key Outcomes:
- Detect and diagnose agent failures traditional monitoring misses
- Build automated evaluation suites and test datasets for agent quality
- Implement continuous evaluation and adaptive red teaming for production reliability

The Challenge: Why AI Agents Fail in Production

Modern AI agents do not behave like traditional applications. They reason, call tools, use prompts, rely on context, and complete multi-step tasks, which means standard application monitoring often misses the failures that matter most.

Teams building hosted agents need a better way to answer questions like:

Why did the agent generate that response?
Which tool, prompt, or context contributed to the issue?
How do we measure whether the agent is improving over time?
How do we find vulnerabilities before users do?

Microsoft Foundry Observability introduces workflows designed for AI systems, including context-aware evaluation, trace-linked diagnostics, and continuous testing built into the development process.

What You'll Learn in the Lab

By the end of the session, you’ll understand how to:

Build context-specific evaluation suites using auto-generated evaluators and test datasets.
Integrate evaluation into developer pipelines using skills and MCP tooling.
Use trace-linked analysis to diagnose agent decisions and tool calls.
Run continuous evaluation to improve agent quality over time.
Apply adaptive red teaming to uncover vulnerabilities, edge cases, and security risks.
Move hosted agents from prototype to production using a practical evaluation and observability workflow.

Agenda at a Glance

1. The Observability Gap in AI Agents

Understand why traditional application monitoring and logging tools often fall short for hosted AI agents.

2. Building Evaluation Suites

Learn how to create evaluators and test datasets tailored to your agent’s context.

3. Embedding Observability into Developer Workflows

Explore how skills, MCP tooling, and traces can connect evaluation directly to agent decisions and tool usage.

4. Continuous Evaluation and Optimization

See how ongoing quality checks can help teams improve agent reliability and response quality over time.

5. Adaptive Red Teaming

Learn how adaptive red teaming can help surface vulnerabilities, edge cases, and security risks before they reach users.

6. Hands-On Sandbox Exploration

Explore the guided lab environment and continue experimenting after the session.

Who Should Attend?

This lab is built for technical professionals working with AI platforms and hosted agents.

Ideal attendees include:

AI developers building agents or Copilot extensions
Platform engineers responsible for AI infrastructure and pipelines
Solution architects designing with Microsoft AI services
Engineering teams taking AI systems from POC to production
Builders working with MCP-enabled tools, skills, or hosted agents

You'll get the most value if you:

Are deploying or planning to deploy hosted AI agents
Need better ways to evaluate agent behavior and reliability
Want to integrate observability into your AI development pipeline

Real-World Impact

Teams that adopt structured AI evaluation and observability are able to:

Detect agent failures earlier in development cycles
Improve response quality through automated evaluation pipelines
Reduce deployment risk when shipping agent-powered experiences into enterprise environments

This session is designed to translate emerging AI engineering practices into workflows you can apply immediately.

Get the Session Slide Deck

Get the slide deck from Kanwal's Microsoft Foundry Observability session and learn how technical teams can evaluate, monitor, and improve hosted AI agents using trace-linked diagnostics, continuous evaluation, and adaptive red teaming.

FAQs

What is Microsoft Foundry Observability?

A set of tools and workflows for monitoring, evaluating, and improving hosted AI agents covering behavior analysis, failure detection, and continuous quality improvement.

Is this a lecture or hands-on?

Fully hands-on. You'll work directly with evaluation tools, observability features, and a sandbox environment.

Do I need prior agent-building experience?

Basic familiarity with AI development or agent-based systems will help you get the most out of the lab.

Will there be a sandbox to keep?

Yes, you'll walk away with a sandbox to explore additional Foundry Observability features on your own.

Will the session be recorded?

Availability depends on the final event format and will be confirmed closer to the date.

Security & Compliance

Last updated on:

May 27, 2026

Published on:

May 27, 2026

Learn more

Observe, Optimize, and Protect Your Hosted Agents in Microsoft Foundry - Microsoft Build 2026

Event Snapshot

The Challenge: Why AI Agents Fail in Production

What You'll Learn in the Lab

Agenda at a Glance

1. The Observability Gap in AI Agents

2. Building Evaluation Suites

3. Embedding Observability into Developer Workflows

4. Continuous Evaluation and Optimization

5. Adaptive Red Teaming

6. Hands-On Sandbox Exploration

Who Should Attend?

Real-World Impact

Get the Session Slide Deck

FAQs

Send Me The Slidedeck

Similar resources

Webinar

Mastering Microsoft Technology Governance and Migration in the AI Era

E-Book

Most Common SharePoint & Teams Sprawl Issues and How to Solve Them

Event

The Agent PMO - Governance Without Gridlock