Building Multi-Agent Systems with Multi-Models in Semantic Kernel

Imagine running a restaurant. Each of your team members, from the chef to the server to the cashier, has specialised skills. While the chef prepares delicious meals, the server ensures that the food is delivered promptly and the cashier ensures smooth transactions. Together, they deliver a seamless dining experience.

Similarly, in a multi-agent system powered by Semantic Kernel, each AI agent specialises in its own task, such as generating travel plans, analysing budgets or creating marketing campaign banners. These agents communicate and collaborate, like a well-coordinated restaurant team, to tackle complex, multifaceted problems efficiently and achieve overarching goals.

Why do we need Agents?

Agents are autonomous or semi-autonomous software entities designed to perform tasks by processing inputs, making decisions and producing outputs. They introduce modularity and adaptability to AI systems, and they provide capabilities like collaboration, autonomy and human-agent interaction. For example, in a customer support system, one agent might analyse user queries while another fetches data from databases and a third generates a response.

Why Multi-Agent Systems?

While a single agent can tackle specific tasks effectively (see the Kitchen Assistant example below), complex problems often demand collaboration. Multi-agent systems introduce distributed intelligence, allowing agents to specialise and to work on tasks in parallel. For example, in a dynamic sales process, one agent could monitor social media trends, another could analyse customer behaviour and a third could turn their findings into actionable insights. This way, they work together towards a holistic strategy.
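To make the sales example concrete, here is a minimal sketch in plain Python using asyncio. It is not the Semantic Kernel API: the three functions are hypothetical stand-ins for agents, and a real system would call a model inside each of them. It only shows the shape of the idea, where two specialists run concurrently and a third combines their findings.

```python
import asyncio


async def monitor_trends(topic: str) -> str:
    # Hypothetical stand-in for an agent that watches social media.
    await asyncio.sleep(0)  # yield control, as a real network call would
    return f"trends for {topic}"


async def analyse_customers(topic: str) -> str:
    # Hypothetical stand-in for an agent that studies customer behaviour.
    await asyncio.sleep(0)
    return f"customer behaviour for {topic}"


async def summarise(findings: list[str]) -> str:
    # Hypothetical third agent that merges the specialists' findings.
    return "Insights: " + "; ".join(findings)


async def run_sales_pipeline(topic: str) -> str:
    # The two specialists run concurrently; gather keeps their results
    # in the order the coroutines were passed in.
    findings = await asyncio.gather(
        monitor_trends(topic),
        analyse_customers(topic),
    )
    return await summarise(list(findings))


if __name__ == "__main__":
    print(asyncio.run(run_sales_pipeline("running shoes")))
```

The point of the design is that each specialist stays small and replaceable, while the coordinating function decides what runs in parallel and what has to wait.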

Why Multi-Model Systems?

Integrating multiple models into such systems amplifies their versatility. A single agent, or even a group of agents, tied to one model can be limiting, particularly when diverse tasks demand specialised capabilities. With multi-model orchestration, agents can use different models for different jobs and get the most out of frontier models. For example, GPT-4 handles customer interactions, Gemini Ultra interprets visual data from a damaged product and Llama 3.2 provides real-time translations for global communication.

By integrating all of the above, enterprises can improve their operational efficiency and adaptability. These systems enable businesses to harness more of AI’s potential by delivering personalised customer experiences.
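One simple way to picture multi-model orchestration is a routing table from task type to model. The sketch below is plain Python and makes no calls to any model or SDK. The task labels ("chat", "vision", "translation") and the pick_model helper are assumptions made for illustration; only the model names come from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    kind: str     # hypothetical labels: "chat", "vision" or "translation"
    payload: str  # the text or file reference the agent should work on


# Which model serves which kind of task (mirrors the example in the text).
MODEL_FOR_TASK = {
    "chat": "GPT-4",
    "vision": "Gemini Ultra",
    "translation": "Llama 3.2",
}


def pick_model(task: Task, default: str = "GPT-4") -> str:
    # Unknown task kinds fall back to the default model.
    return MODEL_FOR_TASK.get(task.kind, default)


if __name__ == "__main__":
    for task in (Task("chat", "Where is my order?"),
                 Task("vision", "damaged_product.jpg"),
                 Task("translation", "Merci beaucoup")):
        print(task.kind, "->", pick_model(task))
```

Keeping the mapping in one place means swapping a model is a one-line change, which is the property the rest of this post relies on.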

Let’s talk about building these agents

Semantic Kernel provides an Agent Framework for building these agents. I will only summarise the basic concepts here, as I do not want to rehash the Microsoft Learn docs.

This framework is in an experimental state. Its architecture is designed to enable the creation and management of intelligent agents. It allows multiple agents to collaborate within a single conversation and can manage multiple concurrent conversations, with multi-model and multimodal capabilities. The framework includes core components such as the Agent class, Agent Chat and Agent Channel, which facilitate agent interactions and collaboration.

It builds on Semantic Kernel’s core principles, which keeps it consistent with the rest of the SDK and enables advanced, autonomous agent behaviours. Additionally, plugins and function calling extend the agents’ capabilities by allowing them to interact with external services and execute complex tasks.

This time, I took a different approach to help you, as a developer, learn the Agent Framework’s capabilities swiftly. As you may remember, I have created a getting started notebook for Semantic Kernel, which you can work through step by step by following the examples. I won’t repeat the details that are already covered in the latest Agents notebook. Instead, here is a brief overview of the agents I have created as part of that notebook.

Kitchen Assistant (Single Agent)

A single agent can be thought of as a smart assistant that specialises in a certain area. The diagram below illustrates how the Food Agent functions within a smart kitchen assistant to provide personalised meal recommendations. The process begins with the user’s query, such as “What should I eat right now?”, which is routed through InvokeAgentAsync to the Food Agent. Powered by the foodAgentKernel, the agent uses integrated plugins like the TimePlugin (to factor in the time of day) and FoodPlugin (to access recipes and ingredients) to generate a tailored meal suggestion. It also uses ChatHistory to maintain context, ensuring a seamless and personalised interaction.

The foodAgent creates meal recommendations that consider the weather, time of day, and dietary preferences (e.g., Halal or vegetarian). It provides recipes alongside its suggestions, making the experience user-friendly and interactive. This overall system showcases how a single, well-coordinated agent can dynamically adapt to user needs by delivering thoughtful and relevant results with minimal input.
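The flow described above can be sketched in a few lines of plain Python. This is an illustration only, not the notebook’s code and not the Semantic Kernel API: time_plugin, food_plugin and the FoodAgent class are hypothetical stand-ins named after the TimePlugin, FoodPlugin and ChatHistory mentioned above, and the menu entries are made up. A real agent would let the model decide when to call each plugin.

```python
from dataclasses import dataclass, field


def time_plugin(hour: int) -> str:
    """Map an hour of the day (0-23) to a meal slot."""
    if 5 <= hour < 11:
        return "breakfast"
    if 11 <= hour < 16:
        return "lunch"
    return "dinner"


# Made-up menu, keyed by (meal slot, dietary preference).
MENU = {
    ("breakfast", "vegetarian"): "vegetable omelette",
    ("lunch", "vegetarian"): "lentil soup",
    ("dinner", "vegetarian"): "mushroom risotto",
    ("breakfast", "halal"): "shakshuka",
    ("lunch", "halal"): "chicken shawarma wrap",
    ("dinner", "halal"): "lamb biryani",
}


def food_plugin(meal: str, diet: str) -> str:
    """Look up a dish, falling back to a generic suggestion."""
    return MENU.get((meal, diet), "seasonal salad")


@dataclass
class FoodAgent:
    diet: str
    # Plays the role of ChatHistory: a list of (role, message) pairs.
    history: list[tuple[str, str]] = field(default_factory=list)

    def invoke(self, query: str, hour: int) -> str:
        self.history.append(("user", query))
        meal = time_plugin(hour)             # consult the time plugin
        dish = food_plugin(meal, self.diet)  # consult the food plugin
        reply = f"For {meal}, try {dish}."
        self.history.append(("assistant", reply))
        return reply


if __name__ == "__main__":
    agent = FoodAgent(diet="halal")
    print(agent.invoke("What should I eat right now?", hour=13))
```

Passing the hour in as an argument, rather than reading the clock inside the plugin, keeps the sketch deterministic and easy to test.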

As you may notice at the start of the notebook, we’re aiming to use Azure OpenAI, Google Gemini and Meta’s Llama to take advantage of multi-model capabilities for better reasoning and outputs. It also doesn’t take much effort to switch models. For example, where I am using GPT-4 for the weather service, you can always change it to Google Gemini to see how the experience differs.

Here’s the output it generated:

Concierge Service (Multi Agents)

The following diagram showcases the workflow of an AI Concierge Service coordinated by an Agent Group Chat. The process begins with the User submitting a request for a travel plan. The Agent Group Chat directs the task to the Travel Planner Agent, which creates an itinerary and calculates the budget. This itinerary and budget are then sent to the Budget Advisor Agent, which checks the feasibility of the proposed plan within the specified budget.

The Budget Advisor Agent either approves the plan (✅) or rejects it (🚫). This decision is relayed back to the Agent Group Chat, which communicates the final outcome to the User. The Semantic Kernel Agent Framework manages the collaboration between these specialised agents, delivering a well-considered travel plan while adhering to financial constraints. It highlights how agents coordinate to meet user requirements. There’s a lot more to it, which you can explore in the agents-01 notebook.
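The approve-or-reject loop can be sketched in plain Python as follows. This is not the Agent Group Chat API; travel_planner, budget_advisor and run_group_chat are hypothetical stand-ins, and the itineraries and prices are invented. The sketch also assumes that a rejected plan is sent back to the planner for a cheaper revision, up to a maximum number of turns, which is one reasonable reading of the workflow rather than something the diagram guarantees.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Plan:
    itinerary: str
    cost: int


# Invented options, ordered from most to least expensive.
OPTIONS = [
    Plan("5 nights, 4-star hotel", 1800),
    Plan("4 nights, 3-star hotel", 1200),
    Plan("3 nights, hostel", 600),
]


def travel_planner(turn: int) -> Plan:
    # Each rejection pushes the planner to its next, cheaper option.
    return OPTIONS[min(turn, len(OPTIONS) - 1)]


def budget_advisor(plan: Plan, budget: int) -> bool:
    # Approve only when the plan fits within the user's budget.
    return plan.cost <= budget


def run_group_chat(budget: int, max_turns: int = 3) -> tuple[Optional[Plan], list[str]]:
    transcript: list[str] = []
    for turn in range(max_turns):
        plan = travel_planner(turn)
        transcript.append(f"TravelPlanner: {plan.itinerary} for {plan.cost}")
        approved = budget_advisor(plan, budget)
        transcript.append(f"BudgetAdvisor: {'approved' if approved else 'rejected'}")
        if approved:
            return plan, transcript  # approval ends the conversation
    return None, transcript          # no plan fitted within max_turns


if __name__ == "__main__":
    plan, transcript = run_group_chat(budget=1300)
    print("\n".join(transcript))
```

The max_turns limit matters: without it, two agents that never agree would keep talking indefinitely.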

Once again, .NET has received an AI boost in the shape of Semantic Kernel. Building multi-agent systems with multiple models has never been more unified or simpler than with this framework. You are not limited to Microsoft or OpenAI offerings; you can also use other frontier models to build productive agents and enhance their overall performance. This post, together with my Generative AI for Developers notebook, covers what a beginner or junior developer needs when stepping into the AI ecosystem.

However, there’s one thing I did not talk about… and that’s human-in-the-loop interactions. I’m aiming to cover it in the next post, along with some advanced features.

Until next time.
