The power of choice in data-aware cluster scheduling

In this post we'll cover KMN, a scheduler that aims to improve the scheduling of I/O-intensive tasks in distributed compute frameworks like Spark or MapReduce. It differs from the schedulers we discussed previously in that it emphasizes data-aware scheduling, which we'll cover in this post. Background In today's batch computing frameworks … Continue reading The power of choice in data-aware cluster scheduling

Quasar: Resource-Efficient and QoS-Aware Cluster Management

In the last post I covered Paragon, a QoS-aware resource scheduler. In this paper, the same authors extend Paragon to improve cluster utilization, whether on-prem or in the cloud. Background It's a well-known fact that everyone using the cloud is wasting most of its capacity. In this paper, the authors analyzed a production cluster from … Continue reading Quasar: Resource-Efficient and QoS-Aware Cluster Management

Paragon: QoS-Aware Scheduling for Heterogeneous Datacenters

After a long pause (I blame it on starting a startup...), I'd like to continue the cluster scheduling series that I started in 2015! In today's post I'd like to cover Paragon, a Quality of Service-aware cluster scheduler that uses machine learning to inform its service placement decisions. This is work that was … Continue reading Paragon: QoS-Aware Scheduling for Heterogeneous Datacenters

Hierarchical Scheduling for Diverse Datacenter Workloads

In this post we’ll cover the paper that introduced HDRF (Hierarchical Dominant Resource Fairness), which builds upon the team's earlier work on DRF (Dominant Resource Fairness) and extends it to support hierarchical scheduling. Background The prior work, DRF, was an algorithm that decides how to allocate multi-dimensional resources … Continue reading Hierarchical Scheduling for Diverse Datacenter Workloads

Sparrow: Scalable Scheduling for Sub-Second Parallel Jobs

Background In the previous posts on datacenter scheduling, most of the focus has been on long-running services or batch jobs that run from minutes to days. Sparrow targets a different use case: the scheduling problem of placing jobs that run … Continue reading Sparrow: Scalable Scheduling for Sub-Second Parallel Jobs

Omega: flexible, scalable schedulers for large compute clusters

This post is part of the datacenter scheduling series. In it I’ll be covering Omega, a paper published by Google back in 2013 about their work to improve their internal container orchestrator. Background Google runs mixed workloads in production for better utilization and efficiency, and it is the Google’s … Continue reading Omega: flexible, scalable schedulers for large compute clusters

Tetrisched: Space-Time Scheduling for Heterogeneous Datacenters

In this post I’ll be covering Tetrisched, a scheduler based on alsched. To summarize, alsched is a scheduler that lets users supply soft constraints through utility functions. I'll skip the background, motivation, and details of alsched, as they're mostly covered in the previous post. … Continue reading Tetrisched: Space-Time Scheduling for Heterogeneous Datacenters