Co-authored a technical report on Azure ND GB300 v6 virtual machines achieving a record 1.1 million tokens/s on Llama2 70B inference.
How media typically covers Mark Gitau
Referenced in coverage
Microsoft Azure ND GB300 v6 virtual machines achieve a record 1.1 million tokens per second on Llama2 70B inference, a 27% improvement over the previous ND GB200 v6 record and 5× higher throughput per GPU than previous-generation H100 systems.