Hugging Face to make $10M worth of old Nvidia GPUs freely available to AI devs

Open source AI champion Hugging Face is making $10 million in GPU compute available to the public in a bid to ease the financial burden of model development faced by smaller dev teams.

The program, called ZeroGPU, was announced by Hugging Face CEO Clem Delangue via Xitter on Thursday.

"The open source community doesn't have the resources available to train and demo these models that the big tech have at their disposal, which is why ChatGPT remains the most used AI application today," he wrote.

"Hugging Face is fighting this by launching ZeroGPU, a shared infrastructure for indie and academic AI builders to run AI demos on Spaces, giving them the freedom to pursue their work without financial burden."

Founded in 2016, Hugging Face has become a go-to source of open source AI models optimized to run on a wide variety of hardware - thanks in part to close partnerships with the likes of Nvidia, Intel, and AMD.

Delangue regards open source as the way forward for AI innovation and adoption, so his biz is making a bounty of compute resources available to whoever needs it. ZeroGPU will be made available via its application hosting service and run atop Nvidia's older A100 accelerators - $10 million worth of them - on a shared basis.

This setup differs from the way many cloud providers rent GPU resources. Customers often have to make long-term commitments to get the best deals, which can be limiting for smaller players that can't predict the success of their models ahead of time. The Big Cloud model is also problematic for larger outfits trying to commercialize the models they already have.

Stability AI's GPU commitments were reportedly so large that the British model builder behind the wildly popular Stable Diffusion image generator actually defaulted on its AWS bills.

The shared nature of Hugging Face's approach means that - at first at least - it will be limited to AI inferencing, rather than training. Depending on the size of the dataset and model, training even small models can require thousands of GPUs running flat out for extended periods of time. Hugging Face's admittedly thin support docs state that GPU functions are limited to a maximum of 120 seconds, which is clearly not sufficient for training.
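
As a rough sketch of what one of those time-boxed "GPU functions" looks like from the developer's side - based on the decorator in Hugging Face's spaces package, with a placeholder model, so treat the details as indicative rather than gospel:

```python
import gradio as gr
import spaces  # Hugging Face's ZeroGPU helper package
import torch
from transformers import pipeline

# The model loads at startup; on ZeroGPU Spaces the actual GPU is only
# attached while a decorated function runs, then released. "gpt2" is
# just a placeholder here.
pipe = pipeline("text-generation", model="gpt2",
                torch_dtype=torch.float16, device="cuda")

@spaces.GPU(duration=120)  # the 120-second cap from the support docs
def generate(prompt: str) -> str:
    return pipe(prompt, max_new_tokens=64)[0]["generated_text"]

gr.Interface(fn=generate, inputs="text", outputs="text").launch()
```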

The Register contacted Hugging Face for clarification on the applications of ZeroGPU, and a spokesperson replied that it is "mostly inferencing, but we have exciting ideas for the others." So watch this space.

As for how Hugging Face gets around dedicating entire GPUs to individual users, there's no shortage of ways to achieve that, depending on the level of isolation required.

According to Delangue, the system is able to "efficiently hold and release GPUs as needed" - but how that actually plays out under the hood isn't clear.

Techniques like time slicing - alternating multiple workloads on the same silicon - and Nvidia's multi-instance GPU (MIG) tech - which allows the chip to be partitioned into as many as seven logical GPUs - have previously been employed by cloud providers like Vultr to make GPU compute more accessible to developers.
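
For illustration - and to be clear, Hugging Face hasn't said it's doing any of this - here's roughly how an A100 gets carved into seven MIG slices from the host, wrapped in Python. The profile ID is our assumption: 19 maps to the smallest 1g.5gb slice on a 40GB A100.

```python
# Illustrative only: partitioning an A100 into seven MIG instances via
# nvidia-smi. Requires root and the Nvidia drivers on the host.
import subprocess

def run(cmd: list[str]) -> None:
    print("$", " ".join(cmd))
    subprocess.run(cmd, check=True)

# Enable MIG mode on GPU 0 (triggers a GPU reset).
run(["nvidia-smi", "-i", "0", "-mig", "1"])

# Create seven 1g.5gb GPU instances (profile 19 on a 40GB A100),
# each with its own default compute instance (-C).
run(["nvidia-smi", "mig", "-i", "0", "-cgi", ",".join(["19"] * 7), "-C"])

# List the resulting logical GPUs as the scheduler would see them.
run(["nvidia-smi", "-L"])
```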

Another way of going about it is by running the workloads in GPU-accelerated containers orchestrated by Kubernetes. Or Hugging Face could be running serverless functions similar to how Cloudflare's GPU services work.
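
If it were the Kubernetes route, the building block would look something like this minimal sketch using the official Kubernetes Python client - the pod and image names are placeholders, and again, this is our speculation rather than Hugging Face's confirmed plumbing:

```python
# Sketch: scheduling a GPU-backed inference container on Kubernetes.
from kubernetes import client, config

config.load_kube_config()  # assumes a reachable cluster and kubeconfig

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="zerogpu-demo"),  # placeholder name
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="model-server",
                image="registry.example.com/model-server:latest",  # placeholder
                resources=client.V1ResourceRequirements(
                    # Nvidia's device plugin advertises GPUs to the
                    # scheduler as the "nvidia.com/gpu" resource.
                    limits={"nvidia.com/gpu": "1"},
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```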

However, it's worth noting there are practical limits to all of these approaches - the big one being memory. Based on the support docs, Hugging Face appears to be using the 40GB variant of the A100. Even running 4-bit quantized models, that's only enough room to hold a model of roughly 80 billion parameters - and with key-value cache overheads factored in, the practical limit will be lower.
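
The back-of-envelope arithmetic (ours, not Hugging Face's) goes like this:

```python
# At 4-bit quantization, the weights alone for an 80 billion parameter
# model exactly fill a 40GB A100 - leaving no headroom.
params = 80e9          # 80 billion parameters
bytes_per_param = 0.5  # 4 bits per weight
weights_gb = params * bytes_per_param / 1e9
print(f"{weights_gb:.0f} GB of weights on a 40 GB card")  # 40 GB

# The KV cache, activations, and runtime overhead all need space too,
# which is why the practical ceiling sits below 80B parameters.
```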

We've asked Hugging Face for clarification on how it's going about sharing those compute resources. We'll update if and when there's new information.

At a time when GPUs are a scarce resource - so much so that bit barns like Lambda and CoreWeave are using their hardware as collateral to acquire tens of thousands of additional accelerators - Hugging Face's offering may come as a relief for startups looking to build AI-accelerated apps based on popular models.

It probably doesn't hurt that Hugging Face raised $235 million in a Series D funding round led by all of the AI heavyweights you might expect - including Google, Amazon, Nvidia, AMD, Intel, IBM, and Qualcomm.

However, this is also somewhat ironic, in that several of Hugging Face's biggest supporters are the ones developing the kinds of proprietary models Delangue worries could end up squeezing out smaller AI startups.

ZeroGPU Spaces is in open beta now. ®
