Probabilistic models are derived from analysing real workloads and using relationships therein to predict future behaviour. The two approaches presented are based on the same process lifetime probability distribution model in which older processes have a higher probability of remaining in the system than younger processes. The difference between the two models is the way in which they schedule the processes.
Leland and Ott [LO86] investigated the behaviour of 9.5 million UNIX processes and found that there was on average, an almost perfectly linear relationship between the current age of a process and its expected residual CPU time. This resulted in the rejection of any type of exponential distribution for process lifetimes, and the finding that it is better to distribute old rather than new processes due to a long thick tail on the actual distribution.
They consider the theoretical Spiral Assignment of processes to
hosts by age as the basis of a load distribution policy. Essentially,
if all L processes
in a system are sorted by age A, so that
at some time t:
Then the spiral assignment assigns the processes to hosts such that:
where N is the number of hosts in the system. This arrangement has a number of advantages:
The Spiral assignment policy as given, requires instantaneous and free
reshuffle of the set of processes over the set of hosts, and in this
sense is not practical. Thus Leland and Ott developed an
approximation to the spiral assignment based on the sum of the ages of
all processes younger than process
on the same host (
).
If the oldest process
on a host has a
greater than a
MINCRIT threshold, then that process is eligible for migration
to an idle host. This policy is implemented as receiver-initiated,
if a host is idle, it calls for bids and the location selection
policy selects the host h with the highest load that also has
some processes
above the MINCRIT threshold. The
candidate selection policy then selects the process
with the greatest
value from host h.
The work by Harchol-Balter and Downey [HBD95] is a logical continuation from the work by Leland and Ott. In particular, the empirical observation that migration of old processes is superior to that of migrating young processes, forms the basis of the policy developed by Harchol-Balter and Downey.
The primary departure is that Harchol-Balter and Downey use the
process lifetime distribution probabilities explicitly. Rather than
assigning processes by age in a spiral assignment, and always
migrating the oldest process, they use a two tier candidate selection
approach for selecting processes to transfer from the most loaded
to
least loaded
.
This approach is described in more detail in section 8.2, as it is one of the algorithms used in the experimental part of the thesis.