What runs GPT-4o and Microsoft Copilot? | Largest AI supercomputer in the cloud | Mark Russinovich

15 minutes

Description

1 year ago

Microsoft has built the world’s largest cloud-based AI
supercomputer, already dramatically larger than it was just six
months ago, paving the way for a future with agentic systems.


For example, its AI infrastructure can train and run inference on
the most sophisticated large language models at massive scale on
Azure. In parallel, Microsoft is also developing some of the most
compact small language models with its Phi-3 family, capable of
running offline on your mobile phone.
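As a purely illustrative sketch of the SLM-versus-LLM trade-off the episode covers (the model labels, word-count threshold, and `choose_model` helper below are hypothetical, not a Microsoft API): a client might route short, simple prompts to an on-device small model and fall back to a cloud-hosted large model for anything that needs deeper reasoning or broad world knowledge.

```python
# Hypothetical router between a local SLM and a cloud-hosted LLM.
# Labels and thresholds are illustrative only.
LOCAL_SLM = "on-device SLM (Phi-3-class)"
CLOUD_LLM = "cloud-hosted LLM (Azure)"

def choose_model(prompt: str, needs_broad_knowledge: bool = False) -> str:
    """Pick a model for a request based on a naive complexity heuristic."""
    if needs_broad_knowledge:
        # Complex reasoning or wide world knowledge: use the large model.
        return CLOUD_LLM
    if len(prompt.split()) <= 50:
        # Short, simple prompt: the small model is cheap, private,
        # and works offline.
        return LOCAL_SLM
    return CLOUD_LLM

print(choose_model("Summarize this sentence."))
print(choose_model("Draft a detailed analysis...", needs_broad_knowledge=True))
```

The trade-off is the one the video walks through at 05:03: small models win on latency, privacy, and offline availability, while large models win on capability.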


Watch Azure CTO and Microsoft Technical Fellow Mark Russinovich
demonstrate this hands-on and go into the mechanics of how
Microsoft is able to optimize and deliver performance with its AI
infrastructure to run AI workloads of any size efficiently on a
global scale. 


This includes a look at how Microsoft:

• Designs its AI systems with a modular, scalable approach to
running a diverse set of hardware, including the latest GPUs from
industry leaders as well as Microsoft’s own silicon innovations.

• Develops a common interoperability layer for GPUs and AI
accelerators.

• Builds its own state-of-the-art, AI-optimized hardware and
software architecture to run its own commercial services like
Microsoft Copilot and more.


QUICK LINKS:
00:00 - AI Supercomputer
01:51 - Azure optimized for inference
02:41 - Small Language Models (SLMs)
03:31 - Phi-3 family of SLMs
05:03 - How to choose between SLM & LLM
06:04 - Large Language Models (LLMs)
07:47 - Our work with Maia
08:52 - Liquid cooled system for AI workloads
09:48 - Sustainability commitments 
10:15 - Move between GPUs without rewriting code or building custom kernels
11:22 - Run the same underlying models and code on Maia silicon
12:30 - Swap LLMs or specialized models with others
13:38 - Fine-tune an LLM
14:15 - Wrap up


 


Unfamiliar with Microsoft Mechanics? 


Microsoft Mechanics is Microsoft's official video series for IT.
Watch and share valuable content and demos of current and
upcoming tech from the people who build it at Microsoft.


• Subscribe to our YouTube:
https://www.youtube.com/c/MicrosoftMechanicsSeries


• Talk with other IT Pros, join us on the Microsoft Tech
Community:
https://techcommunity.microsoft.com/t5/microsoft-mechanics-blog/bg-p/MicrosoftMechanicsBlog


• Watch or listen from anywhere, subscribe to our podcast:
https://microsoftmechanics.libsyn.com/podcast


 


Keep getting this insider knowledge, join us on social:


• Follow us on Twitter: https://twitter.com/MSFTMechanics 


• Share knowledge on LinkedIn:
https://www.linkedin.com/company/microsoft-mechanics/


• Enjoy us on Instagram: https://www.instagram.com/msftmechanics/


• Loosen up with us on TikTok:
https://www.tiktok.com/@msftmechanics


 
