YellowDog Platform
With the YellowDog Platform, you can take full advantage of the cloud, managing your workloads across different geographical regions, machine shapes and instance lifecycles.

YellowDog Platform delivers
Maximises flexibility through support for Linux and Windows, container architectures and all xPU types (Intel, AMD, NVIDIA, AWS Trainium/Inferentia, etc.), including Graviton/ARM-based instances, which are more cost- and energy-efficient
Instance selection can be policy-based (e.g., waterfall) or attribute-based (e.g., based on price, powered by YellowDog Insights), automatically selecting the best source of compute to meet customer requirements and preferences. Compute clusters are composable, expanding and contracting as required to meet the specific needs of your workload
YellowDog is Spot pre-emption aware: it rapidly moves tasks to alternate nodes and re-provisions pre-empted instances to minimise downtime, enabling customers to use Spot instances with confidence
YellowDog was designed from the ground up to facilitate massive scale and, in partnership with AWS, built one of the world’s largest supercomputers in the cloud, hitting 1 million vCPUs within 7 minutes and reaching 3.2 million vCPUs in just over half an hour, as well as demonstrating throughput of 3,000 tasks per second. The YellowDog Platform is capable of scaling to 200,000 nodes and more than 30 million vCPUs
YellowDog helps customers to optimally locate their data for workload mobility in hybrid and multi-region deployments, and in doing so maximises cost efficiency whilst managing any constraints around data gravity, latency, and confidentiality
YellowDog helps customers become more cost efficient through the use of low-price Spot instances, high node utilisation, less compute wasted through over-provisioning, and a lower engineering burden in managing cloud resources across cloud service providers (CSPs), as well as minimising data transfer and storage costs by optimising workload mobility
User-friendly web-based interface (‘single pane of glass’) providing a real-time Dashboard for monitoring usage, including the ability to monitor performance down to the node level (CPU and memory utilisation), and tools for managing compute provisioning, object storage, work scheduling, and platform admin
YellowDog’s security-by-design approach is ISO 27001 certified and reflects our dedication to ensuring the confidentiality, integrity, and availability of our platform and your data
YellowDog Portal provides visibility of compute usage by user and time period, as well as the ability to monitor and manage access to resources at the user level with hard and soft limits, thereby enabling full FinOps control
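To illustrate the difference between policy-based (waterfall) and attribute-based (price) instance selection described above, here is a minimal Python sketch. It is not the YellowDog API: `ComputeSource`, `waterfall_select`, `price_select`, and all source names, prices and capacities are hypothetical and exist only for illustration. A waterfall policy fills demand from sources in a fixed priority order, while a price-based policy applies the same fill after ordering sources by price.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ComputeSource:
    """A hypothetical source of compute (not a YellowDog SDK type)."""
    name: str
    price_per_hour: float  # illustrative price per instance-hour
    available: int         # instances that can currently be obtained


def waterfall_select(sources: List[ComputeSource], needed: int) -> Dict[str, int]:
    """Policy-based selection: fill demand from sources in priority order."""
    allocation: Dict[str, int] = {}
    remaining = needed
    for source in sources:
        if remaining == 0:
            break
        take = min(source.available, remaining)
        if take > 0:
            allocation[source.name] = take
            remaining -= take
    return allocation


def price_select(sources: List[ComputeSource], needed: int) -> Dict[str, int]:
    """Attribute-based selection: the same fill, but cheapest source first."""
    by_price = sorted(sources, key=lambda s: s.price_per_hour)
    return waterfall_select(by_price, needed)


# Illustrative data only; list order is the waterfall priority order.
SOURCES = [
    ComputeSource("on-demand-eu", 0.40, 50),
    ComputeSource("spot-eu", 0.12, 30),
    ComputeSource("spot-us", 0.10, 100),
]

waterfall_result = waterfall_select(SOURCES, 120)
price_result = price_select(SOURCES, 120)
print(waterfall_result)
print(price_result)
```

With this sample data, the waterfall policy draws 50, 30 and 40 instances from the three sources in listed order, whereas the price-based policy draws 100 from `spot-us` and 20 from `spot-eu`.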
Integrations
Workflow tools
Integration with popular workflow tools, including Ray (distributed compute, ML), Apache Airflow (data engineering, ML) and Nextflow (life sciences), enables customers to keep their existing workload infrastructures and pipelines whilst taking advantage of YellowDog’s intelligent cloud provisioning
Moreover, the YellowDog Platform is configured and managed via a comprehensive set of REST APIs and SDKs (Python, Java and C++) for easy integration into a customer’s DevOps and CI/CD processes



Cloud providers
Fully integrated with all major cloud service providers (CSPs)
Uses the latest CSP APIs, such as the AWS EC2 Fleet and Spot Fleet APIs, to launch a fleet of thousands of Amazon EC2 instances in a single operation
Data/Storage
Out-of-the-box connectors to object storage services, both cloud provider (Azure Blob, Amazon S3, Google Cloud Storage, etc.) and third-party (VAST, Weka), as well as integration with HPC cluster file systems
Simplifies data access and data transfer across multiple cloud providers, and works with YellowDog’s Data Anywhere to facilitate easier workload mobility
Third-party schedulers
Through integrations with all popular schedulers (Slurm, IBM Symphony, LSF, Moab, PBS, Grid Engine, GridServer), YellowDog can act as a unified workload submission platform across both cloud and on-premises resources