YellowDog Platform Services

Software Architecture

Our software architecture consists of:

The YellowDog Platform: a collection of microservices that perform the core functions relating to workload management, compute orchestration and provisioning. These microservices are clusterable, scalable and support multi-tenanted deployments both on-premise and in the cloud.
The YellowDog Agent: installed on the computing resources available to the YellowDog Platform. The Agent sends a notification to confirm that it is alive and provides information on task status, resource utilisation and computing performance.

Account Service

Our Account Service provides secure Identity and Access Management for the YellowDog Platform. We use best-in-class, secure Keyrings technology for the storage of cloud credentials.

Sensitive access credentials are protected with AES-256 encryption, and the user retains full control over them. During automated operation, our system creates temporary, privilege- and time-constrained access that is limited to the task at hand. Full auditing of accounts, identities and actions ensures traceability and accountability.
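As an illustration of privilege- and time-constrained access, the sketch below models a grant that permits only a named set of actions until an expiry time. The names `TemporaryGrant`, `issue_grant` and the action strings are hypothetical and chosen for this example; they are not part of the YellowDog API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class TemporaryGrant:
    """Hypothetical grant: a fixed set of permitted actions plus an expiry time."""
    actions: FrozenSet[str]
    expires_at: datetime

    def permits(self, action: str, now: datetime) -> bool:
        # Access is allowed only for listed actions and only before expiry.
        return action in self.actions and now < self.expires_at


def issue_grant(actions: Iterable[str], lifetime: timedelta, now: datetime) -> TemporaryGrant:
    """Create a grant limited to the given actions for the given lifetime."""
    return TemporaryGrant(frozenset(actions), now + lifetime)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    grant = issue_grant({"provision_instance"}, timedelta(minutes=15), now)
    print(grant.permits("provision_instance", now))  # within scope and lifetime
    print(grant.permits("delete_bucket", now))       # outside the granted scope
```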

Object Store Service

Our Object Store Service combines multiple, distinct storage services (e.g. Azure Blob, Amazon S3, Google Cloud Storage), across multiple providers and regions, into one coherent data surface. This overcomes many of the data management constraints inherent in hybrid and multi-cloud deployments.

Our segmentation and distribution of data improves storage performance beyond that offered by a single object store. As our service distributes the data across multiple buckets, the impact of any bandwidth limits is dispersed. This means that a cloud object store can be fast enough to boot an operating system, or transfer assets at a speed that would otherwise require a high-end data transfer appliance.
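A minimal sketch of the general idea of segmenting data across several buckets follows. It assumes fixed-size chunks assigned round-robin, which is one simple distribution scheme and not necessarily the one the Object Store Service uses; the function names and bucket names are hypothetical.

```python
from typing import Dict, List, Tuple

Placement = Dict[str, List[Tuple[int, bytes]]]


def stripe_object(data: bytes, buckets: List[str], chunk_size: int) -> Placement:
    """Split data into fixed-size chunks and assign them round-robin to buckets."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not buckets:
        raise ValueError("at least one bucket is required")
    placement: Placement = {name: [] for name in buckets}
    for index, offset in enumerate(range(0, len(data), chunk_size)):
        bucket = buckets[index % len(buckets)]
        # Keep the chunk index so the object can be reassembled in order.
        placement[bucket].append((index, data[offset:offset + chunk_size]))
    return placement


def reassemble(placement: Placement) -> bytes:
    """Rebuild the original object by ordering chunks on their index."""
    chunks = sorted(chunk for parts in placement.values() for chunk in parts)
    return b"".join(payload for _, payload in chunks)


if __name__ == "__main__":
    layout = stripe_object(b"abcdefghij", ["bucket-eu", "bucket-us", "bucket-asia"], 3)
    print({name: len(parts) for name, parts in layout.items()})
    print(reassemble(layout))
```

Because each bucket holds only a share of the chunks, reads and writes can proceed against all buckets in parallel, so a per-bucket bandwidth limit applies to a fraction of the object instead of the whole.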

Our Object Store Service also verifies effective data transfer, allowing faster and connectionless protocols to be used.
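The verification mechanism is not described here. One common approach, shown purely as an assumption, is to compare a cryptographic digest computed by the sender with one computed over the bytes received, so that corruption or loss on an unreliable transport is detected. The function names are hypothetical.

```python
import hashlib


def sha256_digest(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_transfer(received: bytes, expected_digest: str) -> bool:
    """True when the received bytes match the digest published by the sender."""
    return sha256_digest(received) == expected_digest


if __name__ == "__main__":
    original = b"simulation-input-data"
    digest = sha256_digest(original)          # computed before sending
    print(verify_transfer(original, digest))  # intact transfer
    print(verify_transfer(original[:-1], digest))  # truncated transfer
```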

Image Service

Our Image Service is a virtual machine image catalogue, for images across all regions and cloud providers.

This ensures the right version of the image is matched to the right type of computing instance and operating system. This is imperative when the YellowDog Platform is automatically choosing the Best Source of Compute, as incompatibilities in the underlying instance type and operating system would mean applications and workloads underperform or fail. As a result, tasks are completed faster and at a lower cost than with any other third-party workload manager.

Compute Service

Our Compute Service is a common API to provision, manage and de-provision computing resources across multiple clouds and on-premise infrastructure. When combined with our SDKs, our Compute Service makes it quick and easy to orchestrate cloud and on-premise computing resources.

Out-of-the-box connectors are available for:

Amazon Web Services (AWS)
Microsoft Azure
Google Cloud Platform (GCP)
Oracle Cloud Infrastructure (OCI)
Alibaba Cloud

Multiple deployment strategies are supported, including managing “fleets” of AWS Spot instances and GCP preemptible instances.

Customer Driven Performance

With our Compute Service, strategies can be deployed to determine and provision the Best Source of Compute for workloads across multiple clouds. This could be the fastest deployment time, the lowest cost, the lowest environmental impact, or delivery to a deadline. It also means that business constraints can be adhered to, such as data sovereignty, security certifications, or environmental impact.

For intelligent resource management, our Compute Service can combine different strategies within a single Workload Requirement to achieve ultimate efficiency and performance.

Placement is made by understanding:

Cost
Resource availability
Utilisation
Latency
Time to provision
Performance between different node types
Environmental impact
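To illustrate how several placement factors might be combined into a single ranking, the sketch below normalises each metric and sorts candidates by a weighted sum. This is an assumed scoring scheme for illustration only, not the Compute Service's actual algorithm. `Candidate`, `rank_candidates` and the metric names are hypothetical, and every metric is assumed to be expressed so that lower is better.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Candidate:
    """Hypothetical compute source with metrics where lower is better."""
    name: str
    metrics: Dict[str, float]


def rank_candidates(candidates: List[Candidate], weights: Dict[str, float]) -> List[Candidate]:
    """Min-max normalise each weighted metric, then sort by weighted sum (best first)."""

    def normalised(metric: str, value: float) -> float:
        values = [c.metrics[metric] for c in candidates]
        low, high = min(values), max(values)
        # When all candidates tie on a metric, it contributes nothing.
        return 0.0 if high == low else (value - low) / (high - low)

    def score(candidate: Candidate) -> float:
        return sum(w * normalised(m, candidate.metrics[m]) for m, w in weights.items())

    return sorted(candidates, key=score)


if __name__ == "__main__":
    sources = [
        Candidate("spot-a", {"cost_per_hour": 0.10, "provision_seconds": 300}),
        Candidate("on-demand-b", {"cost_per_hour": 0.40, "provision_seconds": 60}),
    ]
    # Weight cost heavily for a lowest-cost strategy.
    print([c.name for c in rank_candidates(sources, {"cost_per_hour": 0.9, "provision_seconds": 0.1})])
```

Changing the weights changes the strategy: a cost-heavy weighting favours the cheaper source, while a weighting on time to provision favours the faster one.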

Scheduler Service

Our Scheduler Service increases utilisation levels by sharing computing resources, using fine-grained control and prioritisation of tasks.

Scheduling workloads in a hybrid and multi-cloud environment is significantly more complex than in a single system location. To provide robust and consistent performance, the scheduler must handle issues such as varying environmental characteristics (e.g. external factors impacting network or vCPU performance), greater asynchronicity of interactions, distributed resource management and ownership, dynamic resource availability, complex network topologies, and failures of both computing resources and network connections.

Our Scheduler Service is built from the ground up to handle these problems, resulting in efficient, reliable and well-utilised computing resources.

Our Scheduler Service is also designed to complement any existing schedulers, via Meta Scheduling.

Meta Scheduling provides a single workload submission system that can coordinate multiple third-party heterogeneous clusters. This is achieved by allowing users to create and manage dynamically scaling compute clusters, run by a variety of technologies and schedulers.

Integration with Object Store

Our Scheduler Service is fully integrated with our Object Store Service, so data is automatically supplied to tasks exactly when it is needed, and data output is captured, stored or passed on as a form of data pipeline between tasks. This ensures that any dependencies between tasks and task groups are mapped, tracked and synchronised, so workloads are delivered effectively, regardless of where processing takes place.
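Dependency tracking between tasks can be pictured as a directed graph in which a task becomes runnable once the tasks it depends on have finished. The sketch below uses Python's standard `graphlib` module (Python 3.9+) to group tasks into waves that can run in parallel. It is a generic illustration, not the Scheduler Service's implementation, and the task names are hypothetical.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def execution_waves(dependencies: Dict[str, Set[str]]) -> List[List[str]]:
    """Group tasks into waves; each wave depends only on earlier waves.

    `dependencies` maps a task to the set of tasks whose output it consumes.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if dependencies are circular
    waves: List[List[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks whose inputs are all available
        waves.append(ready)
        sorter.done(*ready)  # mark finished so dependants become ready
    return waves


if __name__ == "__main__":
    pipeline = {
        "render_a": {"prepare"},
        "render_b": {"prepare"},
        "composite": {"render_a", "render_b"},
    }
    for wave in execution_waves(pipeline):
        print(wave)
```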

Usage Service

Our Usage Service enables you to view and monitor how much cloud computing time the application has used. Set filters to isolate a particular project you want to evaluate, then download the results in CSV format.

The Usage Service also allows you to set allowances on compute at any level (e.g. account, user, namespace). These user-defined “pre-emptions” can be configured with hard limits (terminate all compute when an allowance is exceeded) and soft limits (maintain current compute requests but do not allow new requests). In addition, grace periods, ad-hoc boosts and reset periods can be configured.
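The difference between hard and soft limits can be sketched as a small decision function. The types and field names below are hypothetical, boosts are modelled simply as extra hours added to the limit, and grace and reset periods are left out to keep the example short.

```python
from dataclasses import dataclass
from enum import Enum


class LimitType(Enum):
    HARD = "hard"
    SOFT = "soft"


class Action(Enum):
    ALLOW = "allow"
    BLOCK_NEW_REQUESTS = "block_new_requests"  # soft limit exceeded
    TERMINATE_ALL = "terminate_all"            # hard limit exceeded


@dataclass(frozen=True)
class Allowance:
    """Hypothetical allowance measured in compute hours."""
    limit_hours: float
    limit_type: LimitType
    boost_hours: float = 0.0  # ad-hoc boost, assumed to extend the limit


def evaluate(allowance: Allowance, used_hours: float) -> Action:
    """Decide what happens to compute given the usage so far."""
    if used_hours <= allowance.limit_hours + allowance.boost_hours:
        return Action.ALLOW
    if allowance.limit_type is LimitType.HARD:
        return Action.TERMINATE_ALL
    return Action.BLOCK_NEW_REQUESTS


if __name__ == "__main__":
    print(evaluate(Allowance(100, LimitType.SOFT), 101))
    print(evaluate(Allowance(100, LimitType.HARD), 101))
```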

Logs Service

Our Logs Service gives you a detailed breakdown of what the other services in the YellowDog Platform are doing, or have done in the past.

Within the Logs Service, you can:

View live logs or logs within a certain timeframe.
Filter by specific services, e.g. Compute Service or Scheduler Service.
Filter by specific text.
View logs from individual sources.
Filter by ID within the log data.
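The filters listed above can be thought of as optional predicates applied to each log entry. The sketch below shows that idea over an in-memory list; `LogEntry` and `filter_logs` are hypothetical names for this example and do not reflect the Logs Service's actual interface.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class LogEntry:
    """Hypothetical log record."""
    timestamp: datetime
    service: str
    source: str
    message: str


def filter_logs(
    entries: Iterable[LogEntry],
    service: Optional[str] = None,
    source: Optional[str] = None,
    text: Optional[str] = None,
    start: Optional[datetime] = None,
    end: Optional[datetime] = None,
) -> List[LogEntry]:
    """Return entries that satisfy every filter that was supplied."""
    result = []
    for entry in entries:
        if service is not None and entry.service != service:
            continue
        if source is not None and entry.source != source:
            continue
        if text is not None and text not in entry.message:  # also covers IDs in the message
            continue
        if start is not None and entry.timestamp < start:
            continue
        if end is not None and entry.timestamp > end:
            continue
        result.append(entry)
    return result


if __name__ == "__main__":
    logs = [
        LogEntry(datetime(2024, 1, 1, 9, 0), "compute", "node-1", "provisioned instance i-123"),
        LogEntry(datetime(2024, 1, 1, 10, 0), "scheduler", "node-2", "task t-9 started"),
    ]
    for entry in filter_logs(logs, service="compute", text="i-123"):
        print(entry.timestamp.isoformat(), entry.message)
```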

Our technology benefits from a number of supporting third-party technologies. For more information, please contact our team.

Request a Demo Today

Contact the team today and book a demo of the YellowDog Platform.
