CmSc 335 Operating Systems
Chapter 10. Multiprocessor and Real-Time Scheduling
- Multiprocessor Scheduling
1.1. Classification of multiprocessor systems
- Loosely coupled or distributed multiprocessor or cluster:
- each processor has its own main memory and I/O channels.
- Functionally specialized processors:
- an example is an I/O processor
- controlled by a master processor
- Tightly coupled multiprocessing:
- processors share a common main memory
- controlled by the operating system
1.2. Synchronization granularity
Scheduling concurrent processes has to take into account the synchronization
Synchronization granularity means the frequency of synchronization
between processes in a system.
Types of synchronization granularity
- Fine parallelism inherent in a single instruction stream
- Medium parallel processing or multitasking within a single
- Coarse multiprocessing of concurrent processes in a
- Very coarse distributed processing across network nodes to form a
single computing environment
- Independent multiple unrelated processes
1. 2. 1. Independent parallelism
With independent parallelism, there is no explicit synchronization among
- Separate application or job
- No synchronization
- Same service as a multiprogrammed uniprocessor
Time-sharing systems exhibit this type of parallelism
1. 2. 2. Coarse and very coarse-grained parallelism
With coarse and very coarse-grained parallelism, there is synchronization
but at a very gross level. (e.g. at the beginning and at the
This kind of situation is easily handled as a set of concurrent processes
running on a multiprogrammed uniprocessor and can be supported on a
multiprocessor with little or no change to user software.
In general, any collection of concurrent processes that need to communicate
can benefit from the use of a multiprocessor architecture.
1. 2. 3. Medium-grained parallelism
Medium-grained parallelism is present in parallel processing or multitasking
within a single application.
A single application can be effectively implemented as a collection of
threads within a single process.
Because the various threads of an application interact so frequently,
scheduling decisions concerning one thread may affect the performance of the
1. 2. 4. Fine-grained parallelism
Fine-grained parallelism represents a much more complex use of parallelism
than is found in the use of threads.
- Highly parallel applications
- Specialized and fragmented area
1.3. Design issues
Scheduling on a multiprocessor involves three interrelated issues:
- The assignment of processes to processors
- The use of multiprogramming on individual processors
- The actual dispatching of a process
The scheduling depends on
- degree of granularity
- number of processors available
1.3.1. Assignment of processes to processors
The simplest scheduling approach is to treat the processors as a pooled
and assign processes to processors on demand.
1. Static or dynamic assignment of a process?
- Static assignment: a process is permanently assigned to one processor
from activation until its completion. A dedicated short-term queue is
maintained for each processor.
Advantages: less overhead in the scheduling.
Disadvantages: one processor can be idle, with an empty queue, while
another processor has a backlog.
- Dynamic assignment: All processes go into one global queue and are
scheduled to any available processor. Thus, over the life of a process, the
process may be executed on different processors at different times.
Advantages: better processor utilization.
Disadvantages: inefficient use of cache memory,
more difficult for the processors to communicate.
2. Type of architecture - master/slave or peer?
- Master/slave architecture: key kernel functions of the
operating system always run on a particular processor. The master is
responsible for scheduling jobs.
Advantages: simple approach, requires little enhancement to a
uniprocessor multiprogramming operating system
a failure of the master brings down the whole system,
the master can become a performance bottleneck.
- Peer architecture: the operating system can execute on any processor,
and each processor does self-scheduling from the pool of available processes.
Problems: the operating system becomes complicated
Techniques must be employed to resolve and synchronize competing claims to
1.3.2. The use of multiprogramming on
Depends on the synchronization granularity
A. Course-grained parallelism:
Pprocessor utilization considerations:
Each individual processor should be able
to switch among a number of
B. Medium-grained parallelism:
Application performance considerations:
An application that consists of a number
of threads may run poorly unless
all of its threads are available to run simultaneously
1.4. Process dispatching
In Uniprocessor Systems: sophisticated scheduling algorithms
In Multiprocessor Systems:
Processes: simple algorithms work best (see Fig.10.1 in the textbook)
Threads: the main consideration is the synchronization between threads
1.4.1. Process scheduling
Basic method used: dynamic assignment
- Single queue for all processes
- Multiple queues are used for priorities
- All queues feed to the common pool of processors
Specific scheduling discipline is much less important
with multiprocessors than
with one processor
(see Fig.10.1 in the textbook)
1.4.2. Thread scheduling
Key features of threads:
- An application can be a set of threads that cooperate and execute
concurrently in the same address space
- Threads running on separate processors yield a dramatic gain in
General approaches to thread scheduling:
- Load sharing: processes are not assigned to a particular processor. A
global queue of ready threads is maintained, and each processor, when idle,
selects a thread from the queue.
- Gang scheduling: a set of related threads is scheduled to run on a set
of processors at the same time, on a one-to-one basis.
- Dedicated processor assignment: each program is allocated a number of
processors equal to the number of threads in the program, for the duration of
the program execution (this is the opposite of the load-sharing
- Dynamic scheduling: the application is responsible for assigning
its threads to processors. It may alter the number of threads
during the course of execution.
22.214.171.124. Load Sharing
One of the most commonly used schemes in current multiprocessors.
- The load is distributed evenly across the processors,
assuring that no processor is idle while work is available to do.
- No centralized scheduler is required; when a processor is available,
the scheduling routine of the operating system is run on that processor to
select the next thread.
- A global queue of ready threads is maintained. It can be organized and
accessed using any priority-based schemes and schemes that consider execution
history or anticipated processing demands.
Versions of Load Sharing:
- First come first served (FCFS): when a job arrives, each of its
threads is placed consecutively at the end of the shared queue.
- Smallest number of threads first: the shared ready queue is organized
as a priority queue, with highest priority given to threads from jobs with the
smallest number of unscheduled threads.
- Preemptive smallest number of threads first: highest priority is given
to jobs with the smallest number of unscheduled threads. An arriving job with
a smaller number of threads than an executing job will preempt threads
belonging to the scheduled job.
Disadvantages of Load Sharing
- The common queue needs mutual exclusion
- May be a bottleneck when more than one processor looks for work at the
- Preempted threads are unlikely to resume execution on the same processor
- Cache use is less efficient
- If all threads are in the global queue, all threads of a program will not
gain access to the processors at the same time
126.96.36.199. Gang Scheduling
Simultaneous scheduling of threads that make up a single
Useful for applications where performance severely degrades when any part of
the application is not running, since threads may often need to synchronize with
assume there are N processors and M applications, and each
application has not more than N threads.
The first option may result in processors staying idle,
as illustrated in Figure 10.2.
- each application is given 1/M of the available time on the N processors
- the given time slice is proportional to the number of threads in each
Co-scheduling is based on the concept of scheduling a related set of
called a task force. The related tasks can be treated as threads.
- synchronization blocking is reduced
- less time for resource allocation
188.8.131.52. Dedicated Processor Assignment
An extreme form of gang scheduling is to dedicate a group of processors to an
application for the duration of the application.
Two observations in defense of this strategy:
- In a highly parallel system, with tens or hundreds of processors, each of
which represents a small fraction of the cost of the system,
processor utilization is no longer so important as a metric for
effectiveness or performance.
- The total avoidance of process switching during the lifetime of a
program should result in a substantial speedup of that program.
Problem: if the number of the active threads is greater than the number
of the processors,
there would be greater frequency of thread preemption and
Comparison with gang scheduling:
Similarities - threads are assigned to processors at the same time
Differences - in dedicated processor assignment threads do not change
Gang scheduling and dedicated processor assignment are more similar to memory
assignment than to uniprocessor scheduling:
In memory scheduling pages are assigned to processes
In gang scheduling and dedicated processor assignment processors are assigned
184.108.40.206. Dynamic Scheduling
Both the operating system and the application are involved in making
- The operating system is responsible for partitioning the processors among
- Each job uses the processors currently in its partition to execute some
subset of its runnable tasks by mapping these tasks to threads.
On request for a processor, OS does the following:
- If there are idle processors, use them to satisfy the request.
- Otherwise, if the job making the request is a new arrival, allocate it a
single processor by taking one away from any job currently allocated more
than one processor.
- If any portion of the request cannot be satisfied, it remains outstanding
until either a processor becomes available for it or the job rescinds the
Upon release of one or more processors (including job departure), OS does the
- Scan the current queue of unsatisfied requests for processors.
- Assign a single processor to each job in the list that currently has no
processors (i.e., to all waiting new arrivals).
- Then scan the list again, allocating the rest of the processors on an FCFS
The overhead of this approach may negate this apparent performance advantage.
- Correctness of the system depends not only on the logical result of the
but also on the time at which the results are produced.
- Tasks or processes attempt to control or react to events that take place
in the outside world.
- These events occur in "real time" and processes must be able to
keep up with them.
- Control of laboratory experiments
- Process control plants
- Air traffic control
- Military command and control systems
2.2. Types of Tasks
A. With respect to urgency
A hard real-time task is one that must meet its deadline;
will cause undesirable damage or a fatal error to the system.
A soft real-time task has an associated deadline that is desirable but
it still makes sense to schedule and complete the task even if it
has passed its deadline.
B. With respect to execution
An non-periodic task has a deadline by which it must finish or start,
or it may have a constraint on both start and finish time.
A periodic task is one that executes once per period T or
exactly T units apart.
2.3. Characteristics of Real-time Operating
Real-time operating systems can be characterized as having unique
requirements in five general areas:
- User control
- Fail-soft operation
- Operations are performed at fixed, predetermined times or within
predetermined time intervals
- Concerned with how long the operating system delays before acknowledging
- How long, after acknowledgment, it takes the operating system to service
- Includes amount of time to begin execution of the interrupt
- Includes the amount of time to perform the interrupt
Determinism and responsiveness together make up the response time to
It is essential to allow the user fine-grained control over task priority.
The user should be able to distinguish between hard and soft tasks and to
specify relative priorities within each class.
- User specifies priority
- Specifies paging
- What processes must always reside in main memory
- Disks algorithms to use
- Rights of processes
Loss or degradation of performance may have catastrophic consequences
Fail-soft operation is a characteristic that refers to the ability of a
system to fail in such a way
as to preserve as much capability and data as
- attempt either to correct the problem or minimize its effects while
continuing to run.
Stability: A real-time system is stable if, in cases where it is
impossible to meet all task deadlines, the system will meet the deadlines of its
most critical, highest-priority tasks, even if some less critical task deadlines
are not always met.
2.4. Necessary features in order to meet the
- Small size (with its associated minimal functionality)
- Fast process or thread switch
- Ability to respond to external interrupts quickly. This includes:
- Multitasking with interprocess communication tools such as semaphores,
signals, and events
- Preemptive scheduling based on priority
- Minimization of intervals during which interrupts are disabled
- Primitives to delay tasks for a fixed amount of time and to pause/resume
- Special alarms and time-outs
- Use of special sequential files that can accumulate data at a fast rate
What is important is that all hard real-time tasks complete (or start) by
their deadline and that as many as possible soft real-time tasks also complete
(or start) by their deadline.
2.5. Real-Time Scheduling
Real-time scheduling is one of the most active areas of research in computer
The algorithms can be classified along three dimensions:
- When to dispatch
- How to schedule
(1) whether a system performs schedulability analysis,
(2) if it does, whether it is done statically or dynamically, and
(3) whether the result of the analysis itself produces a schedule or plan
according to which tasks are dispatched at run time.
- Types of tasks: periodic/non-periodic
deadline scheduling - periodic and non-periodic
rate monotonic scheduling - periodic tasks
2.5.1. When to dispatch
The problem here concerns how often the operating system will
interfere to make a scheduling decision. Examples of different policies
are listed below:
- Round-robin preemptive scheduler
- Priority-driven non-preemptive scheduler
- Priority-driven preemptive scheduler on preemption points
- Immediate preemptive scheduler
They are illustrated in Figure 10.4.
2.5.2. How to schedule
Classes of algorithms
- Static table-driven approaches: these perform a static analysis of
feasible schedules of dispatching. The result of the analysis is a schedule
that determines, at run time, when a task must begin execution.
- Applicable to periodic tasks
- Inflexible approach - any change requires the schedule to be redone
- Static priority-driven preemptive approaches: again, a static analysis
but no schedule is drawn up. Rather, the analysis is used to
assign priorities to tasks,
so that a traditional priority-driven preemptive
scheduler can be used.
- Dynamic planning-based approaches: feasibility is determined at run
An arriving task is accepted for execution only if it is
feasible to meet its time constraints.
- Dynamic best effort approaches: no feasibility analysis is performed.
The system tries to meet all deadlines and aborts any started process whose
deadline is missed.
Used for non-periodic tasks.
2.5.3. Scheduling periodic and non-periodic tasks
220.127.116.11. Deadline Scheduling
Real-time applications are generally not concerned with sheer speed but
rather with completing (or starting) tasks at the most valuable times, neither
too early nor too late, despite dynamic resource demands and conflicts,
processing overloads, and hardware or software faults.
Information about each task, needed for deadline scheduling:
- Ready time: time at which task becomes ready for execution.
- Starting deadline: time by which a task must begin.
- Completion deadline: time by which a task must be completed.
typical real-time application will either have starting deadlines
completion deadlines, but not both.
- Processing time: time required to execute the task to completion.
- Resource requirements: set of resources (other than the processor)
required by the task while it is executing.
- Priority: measures relative importance of the task.
- Subtask structure: a task may be decomposed into a mandatory subtask
and an optional subtask.
Earliest deadline strategy: Scheduling tasks
with the earliest deadline minimizes the fraction of tasks that miss their
Earliest deadline strategy with unforced idle times:
Always schedule the eligible task with the earliest deadline and let
that task run to completion.
- Ending deadlines - earliest deadline strategy with preemptive
See an example.
- Starting deadlines - non-preemptive scheduling with unforced
See an example.
Requirement: we need to know in advance the processes to be executed
18.104.22.168. Rate Monotonic Scheduling
Rate monotonic scheduling (RMS) assigns priorities to tasks on the basis
of their periods.
The tasks period T is the amount of
time between the arrival of one instance
of the task and the arrival of the next
instance of the task.
Figure 10.7 illustrates the concept.
For RMS, the highest-priority task is the one with the shortest period, the
second highest-priority task is the one with the second shortest period, and so
To ensure that the deadlines of n tasks are met, the following
inequality should hold:
Cn/Tn £ 1
where Ci is the execution time of the i-th task,
Ti is its period.
Reasons RMS has been adopted in industrial applications:
- Utilization as high as 90% can be achieved
- Most hard real-time systems also have soft real-time components, such as
certain noncritical displays and built-in self tests that can execute at
lower priority levels to absorb the processor time that is not used with RMS
scheduling of hard real-time tasks.
- Stability is easier to achieve with RMS, priorities can be modified so
that essential tasks always complete.
- List and briefly define five different categories of synchronization
- Which are the main issues concerning multiprocessor scheduling?
- Which are the issues concerning assignment of processes to processors?
- What is the disadvantage of static assignment of processes to processors?
- When is it reasonable to have multiprogramming on a single processor in
multiprocessor system? Why?
- What is the basic method used for assignment of processes to processors?
Describe its key features.
- List and briefly define four techniques for thread scheduling.
- List and briefly define three versions of load sharing
- What is the difference between hard and soft real-time tasks?
- Which tasks are said to be periodic?
- What is fail-soft operation?
- When do we say that a real-time system is stable?
- Describe deadline scheduling.
- Describe rate monotonic scheduling.
Back to Lecture Notes page
Back to CmSc 335 web page
Created 08/08/01 by Lydia Sinapova