Graphcore claims its IPU-POD outperforms Nvidia A100 in model training

Bristol-headquartered Graphcore, a startup developing chips and systems to accelerate AI workloads, is taking on category leader Nvidia with significant improvements in performance and functionality.

In the latest MLPerf results, Graphcore stated that its IPU-POD16 server outperformed Nvidia's DGX-A100 640GB server. Specifically, when the systems were tested on training the computer vision model ResNet-50, the Graphcore unit finished about a minute faster: it trained the model in 28.3 minutes, while the DGX-A100 took 29.1 minutes.

Significant improvement in time-to-train

Graphcore said the numbers show a 24% jump over its previous MLPerf results, which it attributed directly to software optimization. For the IPU-POD64, which trained ResNet-50 in just 8.50 minutes, the performance increase was 41%. Meanwhile, the IPU-POD128 and IPU-POD256 – Graphcore's flagship scale-up systems – trained ResNet-50 in only 5.67 minutes and 3.79 minutes, respectively.

The MLPerf benchmark is maintained by the MLCommons Association, a consortium supported by Alibaba, Facebook AI, Google, Intel, Nvidia, and others, which acts as an independent governing body.

The results also detail the ability of the Graphcore systems to handle natural language processing (NLP) workloads. In MLPerf testing on the NLP model BERT, the IPU-POD16 posted a time-to-train of 26.05 minutes in the open category (which allows flexibility in model implementation), while the IPU-POD64 and IPU-POD128 took only 8.25 and 5.88 minutes, respectively.

However, compared to the last round of MLPerf benchmarks, the performance gains on BERT were not as large as those on ResNet-50.

Graphcore also tested its systems on other workloads to show how they handle the newer, more innovative models customers are exploring as they move beyond ResNet and BERT. Part of this was experimenting with EfficientNet-B4, a computer vision model that trained in just 1.8 hours on the company's IPU-POD256. On the IPU-POD16, the same model trained in 20.7 hours – three times faster than on the Nvidia DGX-A100.

The development positions Graphcore as a major competitor to Nvidia, which already ships machines to accelerate AI workloads and has a major footprint in the segment. Other players in the space include Google and Cerebras Systems. Google's systems also outperformed Nvidia's servers in MLPerf tests, although they were preview machines and not readily available on the market.

Graphcore has raised more than $700 million so far and was valued at $2.77 billion after its latest funding round.
