Today, Microsoft’s productivity revolution reached a new milestone: the company is embedding AI into every possible corner and crevice.
Beyond the Copilot+ PC preview revealed just before the conference, Microsoft unveiled a wealth of announcements at Build:
- Continuous Upgrades to Copilot: Custom Copilots, Team Copilot, and Copilot Extensions are among the highlights. In summary, all developers can customize and extend Copilot, making team collaboration significantly more convenient.
- Commitment to Small Language Models (SLMs): Phi-3 runs on PCs and mobile devices, and the series gains its first multimodal model, Phi-3-vision.
Other notable updates include real-time intelligence capabilities for Microsoft Fabric, which analyzes and manages high-precision business data to aid decision-making; AI-powered real-time video translation in the Edge browser; and native PyTorch support on AI PCs…
At the opening ceremony, Satya Nadella immediately clarified the significance of today’s event.

For over thirty years, Microsoft has had two dreams: First, can computers understand us, rather than requiring us to understand computers? Second, in this world of ever-increasing information, can computers help us reason, plan, and act more effectively based on all that data?
This wave of AI is the answer to those dreams.
The diverse productivity scenarios showcased at Build represent the stage where Microsoft realizes these visions.
Towards the end of the opening ceremony, Sam Altman appeared on stage to respond to questions and tease details about new models.

The market responded positively: Microsoft’s stock price surged to $431.84 at one point, and the company has seen significant gains over the past two days.

Let’s start with the continuous upgrades to Copilot.
GitHub Copilot Extensions: Natural Language Interaction Across the Board
Targeting developers and teams, Microsoft has introduced GitHub Copilot Extensions, which let users customize their GitHub Copilot experience by calling on third-party services through natural language.

These extensions can be deployed immediately to Azure. Users can manage Azure resources through natural language, for example by asking where a web application is deployed and then troubleshooting the related code with a single click.

Any developer can create extensions for GitHub Copilot, incorporating various tools within the stack as well as internal proprietary tools.

By opening Copilot Workspace, developers can view the entire codebase and ask for suggested modifications, which Copilot then applies automatically.

Microsoft also introduced Copilot Connectors, enabling developers to customize Copilot using third-party business data, applications, and workflows.

Team Copilot: A Key Member of the Team
In addition, Microsoft continues to upgrade its Copilot offerings with the launch of Team Copilot. Copilot is no longer just a personal assistant; it can now become an integral member of the team.

It can be added to team chat groups to act as a meeting moderator. Copilot records the entire content of meetings in real time.

With one click, it organizes topics and takes notes based on the progress of team discussions. Other group members can also modify Copilot’s recorded content.

If issues arise during discussions, team members can ask Copilot directly.

When team members reach a consensus on a discussion point, Copilot automatically updates the earlier notes in real time.

Agents Can Be Customized
Meanwhile, Microsoft Copilot Studio has introduced a new feature for customizing Agents.

Developers can define the role of an Agent or choose from existing templates.

Developers can delegate permissions to Copilots assigned to different roles to automate business processes. If an Agent encounters a problem it does not understand or cannot handle, it will proactively present the issue and seek assistance.
Additionally, Agents have the ability to learn from user feedback.

Nadella stated on stage:
“I believe this is one of the key factors that will truly drive transformation in the coming year.”
Commitment to Small Language Models
Microsoft has also updated its Phi-3 model family, continuing its commitment to Small Language Models (SLMs).
The main models include:
- Phi-3-mini: 3.8 billion parameters, supporting context lengths of 128k and 4k.
- Phi-3-small: 7 billion parameters, supporting context lengths of 128k and 8k.
- Phi-3-medium: 14 billion parameters, supporting context lengths of 128k and 4k.
- Phi-3-vision: 4.2 billion parameters, supporting a 128k context length.
- Phi-3-Silica: 3.3 billion parameters.
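
To give a feel for why models of this size fit on PCs and phones, here is a minimal back-of-the-envelope sketch that converts the parameter counts listed above into approximate weight sizes. The storage precisions (`fp16`, `int8`, `int4`) are illustrative assumptions; the article does not say which format each model ships in, and the estimate ignores activations, the KV cache, and quantization metadata.

```python
# Parameter counts in billions, taken from the list above.
PHI3_PARAMS_BILLIONS = {
    "Phi-3-mini": 3.8,
    "Phi-3-small": 7.0,
    "Phi-3-medium": 14.0,
    "Phi-3-vision": 4.2,
    "Phi-3-Silica": 3.3,
}

# Bytes per parameter for some common storage formats.
# Assumption for illustration only: the article does not state
# which precision any of these models actually uses.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_footprint_gb(params_billions: float, precision: str) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    total_bytes = params_billions * 1e9 * BYTES_PER_PARAM[precision]
    return total_bytes / 1e9


if __name__ == "__main__":
    for name, size in PHI3_PARAMS_BILLIONS.items():
        row = ", ".join(
            f"{precision}: {weight_footprint_gb(size, precision):.1f} GB"
            for precision in BYTES_PER_PARAM
        )
        print(f"{name:13s} {row}")
```

Under these assumptions, the weights of a model with 3.8 billion parameters take roughly 7.6 GB at 16 bits per parameter and roughly 1.9 GB at 4 bits, which is the kind of gap that decides whether a model fits in a phone's memory.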

Phi-3-mini was first unveiled in April this year. At the time, it garnered significant attention for performing comparably to Llama 2 in benchmark tests. Phi-3-small and Phi-3-medium have now joined it, and the models can be accessed via Azure Machine Learning’s model catalog and collections.
Phi-3-Silica, the smallest model in the family at 3.3 billion parameters, will be embedded into Copilot+ PCs starting in June.
Microsoft claims that Phi-3-Silica reaches 650 tokens per second when processing the prompt to produce its first token, while consuming just 1.5 watts of power, so it does not interfere with normal workloads or memory usage. During sustained operation, token generation reuses the KV cache from the NPU and runs on the CPU, producing 27 tokens per second.
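
As a rough illustration of what the two throughput figures quoted above mean in practice, the sketch below estimates response latency for a given prompt and reply length. It assumes both rates stay constant and ignores any fixed start-up overhead; the function name and the example workload (a 1,300-token prompt and a 270-token reply) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
# Throughput figures quoted in the article for Phi-3-Silica.
PROMPT_TOKENS_PER_SECOND = 650.0      # first-token (prompt) stage
GENERATION_TOKENS_PER_SECOND = 27.0   # sustained token generation


def estimate_latency_seconds(prompt_tokens: int, output_tokens: int) -> tuple[float, float]:
    """Return (time to first token, total time) in seconds.

    Simplifying assumption: both stages run at a constant rate with
    no fixed overhead, which real hardware will not match exactly.
    """
    time_to_first_token = prompt_tokens / PROMPT_TOKENS_PER_SECOND
    generation_time = output_tokens / GENERATION_TOKENS_PER_SECOND
    return time_to_first_token, time_to_first_token + generation_time


if __name__ == "__main__":
    # Hypothetical workload, for illustration only.
    first_token, total = estimate_latency_seconds(prompt_tokens=1300, output_tokens=270)
    print(f"time to first token: {first_token:.1f} s, total: {total:.1f} s")
```

In this simple model the generation stage dominates: the 1,300-token prompt is consumed in about 2 seconds, while the 270-token reply takes about 10 seconds to produce.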
Phi-3-vision is the multimodal model within the Phi-3 family, designed to run directly on mobile devices.
Building upon the foundation of the Phi-2 model, Phi-3-vision can perform everyday visual reasoning tasks.
It has been specifically optimized for charts, enabling it to analyze information within graphs and answer user questions.
During the presentation, Nadella demonstrated a use case: feeding a chart into Phi-3-vision depicting AI tool usage among different age groups in the workplace.
Phi-3-vision accurately extracted data from each group in the chart, compared results across age demographics, and provided a detailed report.

However, unlike some larger multimodal models, Phi-3-vision can currently only read images; it cannot generate them.
In terms of evaluation scores, the text-only Small and Medium models outperformed other models of similar size overall.
Even Mini, with fewer than 4 billion parameters, surpassed Llama 3-8B, which has twice its parameter count.

Specifically, Small defeated GPT-3.5-Turbo in a series of tests involving various languages, reasoning, and mathematics. However, it lagged slightly behind in coding capabilities, with a more noticeable gap in knowledge retention.

The Medium version competes with Claude 3 Sonnet and Gemini 1.0 Pro. Its strengths mirror those of Small—language understanding, reasoning, and mathematics are strong suits, while knowledge retention remains a weakness.


Altman’s 9-Minute Surprise Appearance
Nadella reiterated that OpenAI remains Microsoft’s most important strategic partner. Two hours into the presentation, Sam Altman, currently at the center of a public relations storm, made his appearance to close out the keynote address.
However, this time he did not share the stage with Nadella; instead, he stood alongside Microsoft CTO Kevin Scott.

In a brief nine-minute speech, he discussed OpenAI’s next steps, GPT-4o, and advice for developers.
He began by addressing the launch of GPT-4o, describing it as part of a “crazy week.” He noted that he had never seen a technology adopted so rapidly in such a meaningful way.
Regarding the recent controversy surrounding OpenAI’s voice feature (the dispute with Scarlett Johansson, often referred to as the “Black Widow” incident after her film role), Altman did not explicitly mention the issue, but he did specifically highlight the voice mode’s capabilities.
He said that as AI speeds up and costs decrease, OpenAI has been able to introduce new modalities like speech:
“The speech modality was actually a real surprise for me.”
Finally, addressing the developers in attendance, Altman offered this advice:
He stated that we are currently at a unique moment and urged everyone to make full use of it rather than waiting to build what they want to create. He described this as perhaps the most exciting time since the advent of smartphones, or even the internet. However, he cautioned against expecting AI to do everything for you; while it is a powerful driver, it will not automatically break existing business rules.
Altman also teased that OpenAI’s latest and most powerful model is about to be released:
What can be revealed now may seem mundane but is crucial: new modalities, general intelligence, and unprecedented power.