Comparison of Cloud GPU Providers

in Cloud Computing, Machine Learning, MLOps, FinOps


Shifting focus from just video and gaming, GPUs are now being adopted in many fields such as finance, healthcare, machine learning, and data science, as well as newer areas like cryptomining. This makes cloud availability important for easy access, especially now that companies are moving away from on-premise infrastructure.

However, working out which cloud providers offer the best GPU service can be difficult, and this article tackles just that.


    While most people are familiar with CPUs, GPUs are famous for being a key component in producing great graphics in gaming systems and graphics-intensive software. In recent times, GPUs have been widely adopted in artificial intelligence and machine learning to process the extensive data used in training machine learning models.

    A Graphics Processing Unit (GPU), like a CPU, is a silicon-based microprocessor used to accelerate graphics creation by manipulating and altering memory. Unlike CPUs, GPUs perform their computation in parallel, enabling them to carry out many intensive calculations at once.

    This high computational speed is what is needed to render high-quality graphics, as in video games, and to process the huge numbers of data points required in data science and machine learning model training.

    In on-premise IT infrastructures, many organizations use GPUs for their compute-intensive applications and workloads. As with many technology stacks, GPU technology is fast-changing, with NVIDIA (one of the biggest GPU makers) releasing a new GPU almost every year. Keeping up with this fast-changing technology is not only tricky but also very expensive.

    But since many organizations are now moving their IT infrastructure to the cloud, cloud service providers can continually update their GPU technology and make it available to these companies at a lower cost than on-premise hardware.

    There are quite a few cloud GPU providers, including the major cloud service providers. They all offer GPUs that vary in model, compute power, storage capacity, and price.

    Let’s explore some of the best in this article.

    Amazon Web Services (AWS)


    As one of the first major cloud providers to offer a GPU cloud service, Amazon offers various GPUs in its P3 and G4 EC2 instances. Amazon's P3 instances offer the Tesla V100 GPU (one of the most popular NVIDIA GPUs provided by many cloud providers), which comes in 16GB and 32GB VRAM variants per GPU. The G4 instances come in two types: G4dn, which offers NVIDIA T4 GPUs with 16GB VRAM, and G4ad, powered by the more powerful AMD Radeon Pro V520 GPUs with up to 64 vCPUs.

    AWS allows clustering multiple GPU instances using the xlarge instance sizes, which are available in various locations, including the US East, US West, Europe, Asia Pacific, Middle East, Africa, and China regions, depending on your chosen GPU instance.

    Paperspace


    Paperspace is easily one of the best dedicated cloud GPU providers, with a virtual desktop that lets you launch your GPU servers quickly.

    It offers 4 GPU cards, starting with the P4000 (8GB VRAM) at $0.51 per GPU/hour, the P5000 (16GB VRAM) at $0.78 per hour, the P6000 dedicated GPU (30GB VRAM) at $1.10 per hour, and the powerful 16GB NVIDIA Tesla V100, ideal for various intensive tasks, at $2.30 per hour.

    Paperspace also offers multiple GPU clusters with its P5000 x 4 and P6000 x 4 GPUs offered at $3.12 and $4.40 per hour, respectively.
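    For a quick sense of what these hourly rates mean in practice, here is a minimal cost-estimation sketch using the Paperspace prices quoted above (the dictionary and function names are illustrative, and prices should always be checked against the provider's current pricing page):

```python
# Rough cost estimate for a single-GPU job on Paperspace,
# using the per-hour rates quoted in this article (snapshot, not live).
PAPERSPACE_HOURLY = {
    "P4000": 0.51,
    "P5000": 0.78,
    "P6000": 1.10,
    "V100": 2.30,
}

def estimate_cost(gpu: str, hours: float) -> float:
    """Return the estimated cost in USD for renting one GPU for `hours`."""
    return round(PAPERSPACE_HOURLY[gpu] * hours, 2)

# e.g. a 40-hour training run on a single V100:
print(estimate_cost("V100", 40))  # 92.0
```

    The same arithmetic applies to any per-hour provider; only the rate table changes.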

    Google Cloud Platform


    To allow you to run your intensive applications, Google Cloud offers a wide range of GPU servers in its cloud instances.

    It offers the popular NVIDIA V100 GPU (16GB GPU RAM, 900GBps bandwidth) at $2.48 per hour and the Tesla K80 (12GB VRAM, 240GBps bandwidth) at $0.45 per hour.

    Other GPUs available on Google Cloud include the NVIDIA Tesla P100 (16GB VRAM, 732GBps bandwidth @ $1.46 per GPU/hour), T4 (16GB VRAM, 320GBps bandwidth @ $0.35 per GPU/hour), and P4 (8GB VRAM, 192GBps bandwidth @ $0.60 per GPU/hour).

    Google Cloud's Tesla T4 is a high-bandwidth, highly efficient multipurpose GPU that can be used for various high-end workloads at a low hourly cost.

    The T4 and other GPUs are generally available in the US Central region (Iowa). Depending on your chosen model, Google Cloud GPUs are available in US West (Oregon, Los Angeles, Las Vegas, and Salt Lake City), US East, North America, South America, Europe, and Asia.

    Vast.ai


    Vast.ai is a marketplace that allows both companies and private individuals to rent out their unused GPU capacity.

    With various individual cloud GPUs available in different locations worldwide, you can get a Tesla V100 GPU with 16.2GB GPU RAM and 71.45GB/s bandwidth at just $0.85 per hour in the Texas US region.

    Other cloud GPU models available include GTX 1080, RTX 3090, and Quadro P5000, all at a relatively low price compared to major cloud providers.

    Oracle Cloud


    Oracle Cloud offers three NVIDIA GPU models: the Tesla P100 (16GB VRAM, 25GBps bandwidth) at $1.27 per hour, the popular Tesla V100 (16GB VRAM, 4GBps bandwidth) at $2.95 per hour, and the new, powerful NVIDIA A100 (40GB VRAM, 12.5GBps bandwidth) at $3.05 per GPU/hour. Oracle was the first to offer the A100 GPU with double the memory and a much larger local storage capacity.

    Oracle Cloud GPUs like the NVIDIA Tesla Volta V100 and P100 are also made available on virtual machines and can be used in the London (UK), Ashburn (US), and Frankfurt (Germany) regions.

    Microsoft Azure


    Microsoft Azure offers a wide range of GPUs across its cloud instance series. It offers the NVIDIA Tesla V100 at $2.95 per hour in its NCv3 instance series, as well as the T4 GPU paired with AMD EPYC2 processors.

    It also offers the Tesla M60, Volta V100, and the K80 GPU at $0.87 per GPU/hour. The AMD Radeon Instinct MI25 is one of the most powerful GPUs offered by Microsoft Azure, and it only operates on Windows OS.

    Microsoft Azure GPUs and virtual machines are generally available in South Central US, US West, and North Europe Azure regions.

    LeaderGPU


    LeaderGPU is a full-fledged platform for renting cloud GPUs. It makes a wide range of GPUs available depending on your use case and time commitment.

    It provides various GPU models, such as the NVIDIA Tesla Volta V100 (16GB GPU RAM, 900GBps bandwidth), Tesla P100 (16GB GPU RAM, 720GBps bandwidth), RTX 3090 (24GB GPU RAM, 936GBps bandwidth), Tesla T4 (16GB GPU RAM, 320GBps bandwidth), and GTX 1080 (8GB GPU RAM, 320GBps bandwidth).

    It offers these GPU servers primarily in multi-GPU configurations, such as 6 x Tesla T4 at €90.71 per day and 8 x GTX 1080 Ti at €108.30 per day.

    IBM Cloud


    IBM Cloud offers three NVIDIA T4 GPU configurations with 32GB GPU RAM but varying Intel Xeon processors in its GPU cloud instances. The T4 with a 20-core Intel Xeon is offered at $819/month, the 32-core Intel Xeon 5218 configuration at $934/month, and the 40-core Intel Xeon 6248 configuration at $1,704 per month.
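    Since IBM bills these configurations monthly while most providers in this article quote hourly rates, it helps to convert for comparison. A small sketch, assuming a 730-hour billing month (24 × 365 ÷ 12, a common approximation; actual billing terms may differ):

```python
# Convert IBM Cloud's monthly T4 prices (figures quoted above)
# into approximate hourly rates for comparison with per-hour providers.
HOURS_PER_MONTH = 730  # 24 * 365 / 12, an assumed billing approximation

def effective_hourly(monthly_usd: float) -> float:
    """Approximate per-hour cost of a monthly-billed instance."""
    return round(monthly_usd / HOURS_PER_MONTH, 2)

for label, monthly in [("20-core Xeon", 819),
                       ("32-core Xeon 5218", 934),
                       ("40-core Xeon 6248", 1704)]:
    print(f"{label}: ${monthly}/month is about ${effective_hourly(monthly)}/hour")
```

    At roughly $1.12-$2.33 per hour, the monthly plans are competitive if the instance runs continuously, but less so for short, bursty workloads.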

    IBM Cloud GPUs are available in various data centers, including the US, Canada, EU, and Asia regions.

    They also offer 4 variants of AC virtual GPU servers starting from $1.95/hour.

    Alibaba Cloud


    Alibaba Cloud offers GPUs in five instance variants: the GA1, GN4, GN5, GN5i, and GN6 instance types. The GA1 instances offer up to 4x AMD FirePro S7150 GPUs with 32GB GPU RAM, the GN4 instances offer the NVIDIA Tesla M40 GPU with 24GB GPU memory, and the GN5, GN5i, and GN6 instances offer the Tesla P100 (up to 8x, 128GB total GPU memory), the P4 (up to 2x, 16GB VRAM), and the Tesla V100 (128GB total GPU memory), respectively.

    Tencent Cloud


    Tencent Cloud is a cloud platform that provides various cloud solutions, including cloud GPU services. It offers various NVIDIA GPU instances dedicated to large computing needs. The Tencent Cloud GN10 instances offer the 32GB VRAM Tesla V100 with NVLink, the GN2 instances offer the Tesla M40 GPU, the GN6 instances offer the Tesla P40 GPU, and the GN7 and GN7vw instances are both powered by Tesla T4 GPUs.

    Tencent Cloud GPU instances are available in Guangzhou, Shanghai, Beijing, and Singapore in the Asian region, as well as in Silicon Valley (US).

    You might need to check the Tencent cloud website for the closest availability zone to your region.

    Which GPU Cloud Provider Should I Choose?

    To answer this question straight and point-blank: we cannot choose one cloud provider from this article for you, as such a decision depends on your use case and several other factors.

    However, as a rule of thumb, you should choose a cloud GPU provider based on your budget and availability in a nearby region (because this affects price). Other factors come into play when choosing the specific GPU instance or model you'll be using.

    The NVIDIA Tesla V100 is a really powerful GPU offered by nearly all cloud providers, and it's suitable for intensive computing, including machine learning, high-end graphics rendering, and 3D applications.

    If you're running a GPU-intensive application, the V100 is a good fit, but make sure to compare prices across cloud providers. Paperspace offers a relatively low price for the V100 with high speed and reliability.

    The Tesla K80 is another powerful GPU that is not as costly as the Tesla V100. It's ideal for training mid-level machine learning models, CAD programs, and high-quality video rendering.

    All GPU models are built for various use cases, and their pricing differs from one cloud platform to another.
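    As a worked example of this kind of comparison, here is a sketch that pulls together the per-hour V100 prices quoted earlier in this article and picks the cheapest. The numbers are a snapshot from this article, not live quotes, and Vast.ai's marketplace price in particular varies by host and region:

```python
# Per-hour V100 prices as quoted in this article (snapshot, not live quotes).
V100_HOURLY = {
    "Paperspace": 2.30,
    "Google Cloud": 2.48,
    "Oracle Cloud": 2.95,
    "Microsoft Azure": 2.95,
    "Vast.ai": 0.85,  # marketplace listing; varies by host and region
}

# Find the provider with the lowest listed V100 rate.
cheapest = min(V100_HOURLY, key=V100_HOURLY.get)
print(f"Cheapest listed V100: {cheapest} at ${V100_HOURLY[cheapest]:.2f}/hour")
```

    The cheapest listed price is not the whole story, of course: region, bandwidth, reliability, and data-transfer costs all factor into the real bill.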


    editorial
    The Chief I/O

    The team behind this website. We help IT leaders, decision-makers and IT professionals understand topics like Distributed Computing, AIOps & Cloud Native
