In this post I’ll be covering Tetrisched, a scheduler that builds on alsched. To summarize, alsched is a scheduler that lets users supply soft constraints expressed as utility functions. I’ll skip the background, motivation, and details of alsched, since they are mostly covered in the previous post.
Like alsched, Tetrisched tries to maximize total utility based on the supplied utility functions. However, it differs from alsched by considering not just space (in other words, the amount of resources) but also time (when those resources are consumed).
Utility operators in alsched described utility based only on resource values, whereas Tetrisched's expressions also include the time periods that influence utility. For example, a utility function could include a deadline and describe a choice between running on 2 GPU-enabled servers, which completes the job in 2 time units, or running anywhere with a slower runtime of 3 time units. The scheduler then combines all the expressions and turns each scheduling decision into a Mixed Integer Linear Program (MILP), which it hands to a solver configured to stop within 10% of the optimal solution (beyond that point, the extra solver work brings diminishing returns).
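To make the GPU-or-anywhere example concrete, here is a minimal sketch of how such a space-time choice could be evaluated. It is not Tetrisched's actual expression language or MILP formulation: the `Option` class, the step-shaped `utility` function, the assumption that the "anywhere" option also needs 2 servers, and the plain enumeration standing in for a solver are all my own illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Option:
    """One way to run the job (hypothetical structure, not from the paper)."""
    name: str
    servers: int      # how many servers the option needs
    needs_gpu: bool   # True if it must run on GPU-enabled servers
    duration: int     # runtime in time units


def utility(finish: int, deadline: int, value: float) -> float:
    """Step utility: full value if the job meets its deadline, else nothing."""
    return value if finish <= deadline else 0.0


def best_choice(
    options: Sequence[Option],
    start: int,
    deadline: int,
    value: float,
    free_gpu: int,
    free_total: int,
) -> Optional[Tuple[Option, float]]:
    """Pick the feasible option with the highest utility (ties: earlier finish)."""
    best = None
    for opt in options:
        available = free_gpu if opt.needs_gpu else free_total
        if opt.servers > available:
            continue  # not enough servers of the required kind
        finish = start + opt.duration
        u = utility(finish, deadline, value)
        if u <= 0:
            continue  # would miss the deadline
        key = (u, -finish)
        if best is None or key > best[0]:
            best = (key, opt, u)
    return None if best is None else (best[1], best[2])


# The example from the text: 2 GPU servers for 2 units, or anywhere for 3 units.
OPTIONS = [
    Option("gpu", servers=2, needs_gpu=True, duration=2),
    Option("anywhere", servers=2, needs_gpu=False, duration=3),  # server count assumed
]
```

With a deadline of 3 and the job starting at time 0, both options meet the deadline, so the sketch prefers the faster GPU option when 2 GPU servers are free and falls back to "anywhere" when they are not. If the job cannot start until time 1 and no GPU servers are free, neither option yields any utility.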
Another important aspect of Tetrisched is plan-ahead: it considers the soft constraints of multiple jobs along with possible future placements, and decides whether a job should wait for its preferred resources or take an alternative option now. This can be computationally intensive, so how far ahead it plans has to be limited, but according to the paper it can lead to 3x improvements. Without plan-ahead (as in alsched), Tetrisched can potentially perform worse than hard constraints at high load.
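A toy illustration of why looking ahead across jobs matters is sketched below. Everything in it is assumed for the example: the two-slot cluster, the two jobs and their deadlines, the greedy baseline standing in for "no plan-ahead", and the brute-force search standing in for the MILP solver.

```python
from itertools import product

# Hypothetical cluster: one "gpu" slot and one "any" slot, one job per slot at a time.
# Each job has a deadline, a value, and a runtime for every slot it can use.
JOBS = {
    "A": {"deadline": 3, "value": 10.0, "runtime": {"gpu": 2, "any": 3}},
    "B": {"deadline": 3, "value": 10.0, "runtime": {"gpu": 2}},  # GPU only
}


def total_utility(plan):
    """plan maps job -> (slot, start). Returns None if jobs overlap on a slot."""
    busy = {}
    total = 0.0
    for job, (slot, start) in plan.items():
        spec = JOBS[job]
        end = start + spec["runtime"][slot]
        for other_start, other_end in busy.get(slot, []):
            if start < other_end and other_start < end:
                return None  # two jobs on the same slot at the same time
        busy.setdefault(slot, []).append((start, end))
        if end <= spec["deadline"]:
            total += spec["value"]  # value is earned only if the deadline is met
    return total


def greedy_now():
    """Assumed baseline: each job in turn takes its fastest slot when it frees up."""
    free_at = {"gpu": 0, "any": 0}
    plan = {}
    for name, spec in JOBS.items():
        slot = min(spec["runtime"], key=lambda s: spec["runtime"][s])
        plan[name] = (slot, free_at[slot])
        free_at[slot] += spec["runtime"][slot]
    return plan, total_utility(plan)


def plan_ahead(horizon):
    """Try every (slot, start) per job up to `horizon`; keep the best valid plan."""
    names = list(JOBS)
    choices = [
        [(slot, start) for slot in JOBS[n]["runtime"] for start in range(horizon + 1)]
        for n in names
    ]
    best_plan, best_u = None, -1.0
    for combo in product(*choices):
        plan = dict(zip(names, combo))
        u = total_utility(plan)
        if u is not None and u > best_u:
            best_plan, best_u = plan, u
    return best_plan, best_u
```

In this setup the greedy baseline gives job A the GPU slot because it is faster, which pushes job B past its deadline and earns a total utility of 10. Planning over both jobs puts A on the slower slot, where it still just meets its deadline, and leaves the GPU slot to B, for a total of 20. The search space grows quickly with the horizon and the number of jobs, which is one way to see why the planning window has to be bounded.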
Tetrisched also introduces a wizard tool that helps translate user requirements into the utility functions that are fed into Tetrisched. The wizard supports various job types, and for each type it is configured to know how to compose a utility function. For example, for an HDFS-type job it automatically produces a utility function that values placement on HDFS storage nodes over non-HDFS nodes, and still grants some benefit when only part of the job lands on them. Each job also specifies a budget (which can be based on priority or other values), its sensitivity to delay, its desired times, and an optional penalty for dropping the job.
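As a rough sketch of what such a wizard might do, the snippet below turns a job request into a utility function. The field names, the linear delay penalty, and the rule that an HDFS job keeps at least half its value off HDFS nodes are all assumptions of mine, not the paper's actual wizard.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class JobRequest:
    """User-facing inputs (hypothetical field names)."""
    job_type: str             # e.g. "hdfs" or "generic"
    budget: float             # maximum utility the job can earn
    desired_time: int         # finishing by this time earns the full budget
    delay_sensitivity: float  # utility lost per time unit of lateness
    drop_penalty: float = 0.0 # optional cost if the job is never run


def make_utility(req: JobRequest) -> Callable[..., float]:
    """Compose a utility function from the request, based on its job type."""

    def locality_factor(tasks_on_hdfs: int, total_tasks: int) -> float:
        if req.job_type == "hdfs":
            # Assumed shape: half the value regardless of placement, the rest
            # proportional to the share of tasks on HDFS nodes (partial benefit).
            return 0.5 + 0.5 * tasks_on_hdfs / total_tasks
        return 1.0

    def utility(
        finish_time: Optional[int],
        tasks_on_hdfs: int = 0,
        total_tasks: int = 1,
    ) -> float:
        if finish_time is None:  # job was dropped
            return -req.drop_penalty
        lateness = max(0, finish_time - req.desired_time)
        base = max(0.0, req.budget - req.delay_sensitivity * lateness)
        return base * locality_factor(tasks_on_hdfs, total_tasks)

    return utility
```

For a request such as `JobRequest("hdfs", budget=100.0, desired_time=10, delay_sensitivity=5.0, drop_penalty=20.0)`, finishing on time with every task on HDFS nodes earns the full budget, finishing on time with only half the tasks on HDFS nodes earns less, and dropping the job costs the penalty.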
With plan-ahead and the wizard, Tetrisched brings better usability and better scheduling, especially when burstiness and load are high.
The most interesting aspect of this paper is that it brings in the time dimension, along with the ability to express how time affects overall utility, so the scheduler can take it into account. By understanding deadlines in addition to the amount of resources these batch jobs need, more utilization can be unlocked, especially in a shared cluster. This is still a fairly unexplored space for container schedulers in the wild.
It will also be interesting to see a scheduler that can take into account deadlines and batch jobs as well as long-running jobs and make tradeoffs between them, doing better than simply killing all batch jobs when long-running jobs need resources.