International Journal of Production Research
Vol. 47, No. 21, 1 November 2009, 6145–6158
Real time production improvement through bottleneck control
L. Li*, Qing Chang, Jun Ni and Stephan Biller
Department of Mechanical Engineering, University of Michigan, Ann Arbor, Michigan, USA;
Manufacturing Systems Research Lab, General Motors R&D Center, Warren, Michigan, USA
(Received 24 January 2008; final version received 23 May 2008)
Variability is a key characteristic for evaluating the performance of a process.
Small variability for a bottleneck machine can generate high production
variability. Short-term production analysis and bottleneck identification are
imperative for enabling optimal response to dynamic changes within the system.
In comparison to the rich and abundant literature available on long-term
analysis, only a small section of the literature addresses the dynamic bottleneck
control policies, which may be used to maximise sustainable benefits. In this
paper, a real time bottleneck control method is introduced to efficiently utilise the
finite manufacturing resources and to mitigate the short-term production
constraints by using two practical approaches: initial buffer adjustment and
maintenance task prioritisation. The objective for real time bottleneck control is
to obtain a continuous production improvement towards a balanced-line status to
increase the throughput efficiently. The benefits of this method are presented by
considering an industrial case study of an automotive assembly line. The results
obtained from this case study show significant production improvements as
compared to traditional approaches.
Keywords: real time bottleneck control; short-term; balanced-line status;
bottleneck inertia phenomena; initial buffer adjustment
1. Introduction
Throughput is an important parameter to evaluate production performance. Extensive
work has been done in the area of throughput analysis (Dallery and Gershwin 1992,
Gershwin 1994, Govil and Fu 1999, Hopp and Spearman 2000). A machine is defined as a
throughput bottleneck if the overall performance of the manufacturing system is more
sensitive to the performance of that machine than to that of any other machine.
The existing work in bottleneck detection
can be categorised into analytical methods (Gershwin 1994, Wang et al. 1999, Blumenfeld
and Li 2005) and simulation-based methods (Law and McComas 1998, Bonder and
McGinnis 2002). Most of the bottleneck studies using analytical methods are restricted to
long-term steady-state bottleneck detection because of their statistical and probability
distribution assumptions for machine performance. Also, developing analytical closed
form solutions for complex lines is difficult. In contrast, discrete
event simulation can be used to model complex layouts and study their dynamic
performance. The major drawbacks of the simulation approach are the need for system-specific
knowledge, limited flexibility when the layout changes, long development time and possible
misinterpretation of simulation results. These factors greatly impede the wide application
of simulation-based methods.
*Corresponding author. Email: [email protected]
ISSN 0020–7543 print/ISSN 1366–588X online
© 2009 Taylor & Francis
Different approaches have been developed for long-term steady-state bottleneck
control. Adams et al. (1988) described an approximation method for solving the minimum
makespan problem of job shop scheduling through the shifting bottleneck procedure.
Computational testing shows that the proposed approach yields consistently better results
than other procedures discussed in the literature. In Pourbabai (1993), an optimal
operational strategy is used to optimise the system utilisation while controlling the
bottleneck problem in a finite capacity integrated assembly line system. Lawrence and
Buss (1995) critically analysed production bottlenecks from an economic perspective,
addressing important facilities-design and demand-planning problems. Queueing theory
has been used to demonstrate that production bottlenecks are inevitable when there are
differences in job arrival rates, processing rates, or costs of productive resources.
In Banaszak (1997), a bottleneck control problem for general periodic job shops with
blocking where each machine has an input buffer of finite capacity is investigated.
A distributed buffer control policy that restricts a job from entering an input buffer of a
local machine in a specific sequence is proposed to schedule periodic job shops. A typical
control model for manufacturing systems, the production planning and control (PPC)
system, models manufacturing systems using block diagrams and dynamic equations in
continuous or discrete time (Fandel 1994, Towill et al. 1997, Duffie and Falu 2002,
Ratering and Duffie 2003). The ‘planning’ subtasks usually consist of materials
requirements planning, throughput scheduling, and capacity collation while the ‘control’
subtasks include job release, fine scheduling, sequence planning, and operational data
acquisition (Fandel 1994). In the PPC approach, manufacturing systems are modelled in
closed loops with feedback control algorithms and disturbances are modelled as uncertain
stochastic processes. As an important part of PPC control, work-in-process (WIP) control
has been studied to improve system performance in the long term (Wiendahl and
Breithaupt 2000, Lawley and Sulistyono 2002, Ioannidis et al. 2004). The WIP in the
manufacturing system increases inventory cost and the system’s cycle time, which lead to
higher cost and lower responsiveness, respectively. Therefore, reducing fluctuations in
production and maintaining low WIP while maintaining the required throughput is the
purpose of WIP control. It is observed that these bottleneck control policies focus on
steady-state production control while ignoring real time bottleneck control to obtain a
continuous production improvement towards an efficiently balanced-line status.
In comparison to the rich and abundant literature available for the long-term analysis,
only a small section of the literature addresses the dynamic bottleneck control policies that
may be used to maximise sustainable profits. A possible reason is that in the long term,
the system performance can be modelled statistically, while in the short term the system
performance is difficult to monitor and follows no fixed pattern or distribution.
The short term refers to an operational period that is not long enough for
machines’ failure behaviour to be described by a statistical distribution. In a mass
production environment, this short-term period could be hours, shifts, or days.
Nakata et al. (1999) introduced a workflow control system for semiconductor
manufacturing called ‘JUSTICE/MORAL’ (just time process control system/method of
optimum-buffer restriction and adjustment logic) which dynamically detects a
machine causing a bottleneck and feeds work to that machine at an appropriate time.
Chang et al. (2007) proposed a simulation-based method to control a production line
through mitigation of short-term bottlenecks in order to obtain an optimal control policy.
The drawbacks of the simulation approach impede its wide application.
For manufacturing systems with unreliable machines and finite internal buffers, there is
a need for a control policy capable of providing short-term real time control in order to
satisfy various performance levels. Real time data analysis can provide sustainable
benefits or opportunities that may not be recognised during long-term analysis.
In practical situations, it is desirable to make real time decisions based on bottleneck
identification and mitigation. However, both analytical and simulation methods are
limited in their ability to perform real time bottleneck control, which leads to lost maintenance
opportunities and possible loss of production. In this paper, a real time bottleneck control
method is developed using online measurable data such as production line blockage and
starvation information to monitor system performance in real time and to obtain
sustainable production benefits based on continuous production improvement. Two
practical methods for short-term bottleneck mitigation, initial buffer adjustment and
maintenance task prioritisation, are developed to continuously improve system
performance towards balanced-line production status. The benefits of this approach are
illustrated using an industrial case study of an automotive assembly line.
The rest of this paper is organised as follows. Section 2 details the framework and
approach for the real time bottleneck control. Section 3 presents industrial case studies.
Finally, Section 4 provides conclusions and future work.
2.1 Bottleneck control framework
A manufacturing system can most accurately be described as a discrete, dynamic, and
nonlinear system. Continuous improvement is an important route to improving
production efficiency. This continuous improvement process can be obtained by providing
a control framework as illustrated in Figure 1. Here, ‘control’ is defined as an action that
assists the personnel on the plant floor, based on on-line feedback information from the
system, to improve the system performance consistently over time.
The desired performance for a manufacturing system includes a throughput target and
an ideal balanced-line status described by blockage time and starvation time of all stations.
If the actual throughput deviates from the target, bottleneck detection measures the
performance variation from a balanced-line production status. The controller makes
decisions on how to mitigate the bottleneck to reduce variation of production and improve
the system performance. Generally, the controllable parameters in a real production line
include machine repair time and cycle time. As cycle time is difficult to adjust for a paced
assembly line, the focus of our research is the reduction of downtimes.
Figure 1. On-line bottleneck control framework.
In this research, a new data driven method for throughput bottleneck detection is
used (Li et al. 2008). This method utilises production line blockage and starvation
information to identify production constraints. Compared to traditional bottleneck
detection methods, the data driven method identifies bottleneck locations in both the
short term and the long term based on the online data without building a simulation or
analytical model. The main advantage of this method is that it can be adapted easily to
different production lines.
The disturbances to the system usually include random failures and lack of workforce
due to absenteeism. Random failures cannot be completely eliminated, but efforts can be
made to reduce them by devising a good maintenance scheduling and control policy.
The objective of the control mechanism for this research is to maintain the production
line at a relatively balanced status. Since the research focus is a production transfer line,
the balanced-line status is preferable for improving production efficiency (Gershwin 1994).
For a tandem line, if all stations have equal capacity, the line is balanced (Hopp and
Spearman 2000). For an ideal balanced line, all machines may be regarded as bottlenecks
(Hopp and Spearman 2000). Jacobs and Meerkov (1993) defined a balanced line in terms
of improvability. If a production line is unimprovable, then the production line is said to
be well designed or optimally balanced. Unimprovable means that any improvement in the
productivity of any individual machine will not improve the overall system throughput.
This situation can be represented as:
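$$\frac{\Delta TP_{sys,i}}{\Delta TP_i} \le \varepsilon, \quad \forall i \in (1, \ldots, m),$$
where $\varepsilon \ll 1$, $\Delta TP_{sys,i}$ is the system throughput increment due to a performance change
by reducing downtime of machine $i$, and $\Delta TP_i$ is the standalone throughput increment of
machine $i$.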
Furthermore, Jacobs and Meerkov (1993) indicated that for an unimprovable balanced
line, each intermediate machine is blocked and starved with equal frequency. The
frequency of blockage of each preceding machine is equal to the frequency of starvation of
the succeeding machine. Since a balanced line implies better production efficiency, the
bottleneck control goal in every time period is to bring the throughput as close as possible
to its target and thereby obtain the highest possible efficiency.
To make an effective control operation, the control frequency needs to be set carefully.
Based on the feedback control framework, the latest performance of the system is
measured and corrective action is applied to improve production efficiency. The impact of
the current bottleneck will last until the system dynamics stabilise and the balancing
situation has changed so that the bottlenecks switch to other locations. This phenomenon
is called the ‘bottleneck inertia’, which is often observed in a designed balanced
system with finite buffers. Besides the parameters describing the machine performance,
such as mean time to repair (MTTR), mean time between failures (MTBF) and cycle
time, bottleneck inertia phenomena affect the control frequency as well. Generally,
different production lines have different control frequencies, so a line-specific study is needed.
2.2 Bottleneck control strategy
Based on the analysis of bottleneck inertia phenomena, it is observed that the bottleneck
location will change under different operating conditions. The amount of downtime to be
reduced directly affects the variation at the bottleneck location. Although reducing all the
unplanned downtime is one of the most important purposes of the maintenance operation,
limited maintenance resources (e.g., maintenance personnel) on the plant floor do not
allow all the maintenance work-orders to be performed at the same time. Therefore, high
priority should be given to the maintenance work-order that has the greatest effect on system
performance improvement. This means that downtime reduction should be performed on the
bottleneck machine until a new bottleneck is detected, as shown in the case of bottleneck
inertia phenomena. Reducing the downtime that causes the most production loss is a
much more meaningful and practical goal. In this way, the finite maintenance resources are
efficiently utilised for throughput improvement. The proposed bottleneck control strategy
is developed to study this control process.
To realise real time control, the threshold value for downtime reduction on bottleneck
machine $i$, which is defined as the amount of downtime to be reduced until machine $i$
becomes a non-bottleneck, is calculated as $\Delta TD_i$ based on real data. This value is obtained
using a simplified three-machine-no-buffer model as pictorially represented in Figure 2.
The assumptions and simplifications in the calculation include:
(1) The first machine $M_{i-1}$ is never starved, and the last machine $M_{i+1}$ is never
blocked.
(2) The cycle time for each machine is the same.
(3) Machine $M_i$ is the turning point.
The following notation is used throughout this paper:
$TB_j$ – blockage time for machine $j$, $j = i-1, i$
$TS_j$ – starvation time for machine $j$, $j = i, i+1$
$TD_j$ – downtime for machine $j$, $j = i-1, i, i+1$
$TW_j$ – working time for machine $j$, $j = i-1, i, i+1$
$T$ – sampling time (e.g., one shift)
$TC$ – cycle time of machines
$TP_{sys}$ – overall system throughput
$TP_j$ – standalone throughput for individual machine $j$, $j = i-1, i, i+1$
$\cap$ – intersection of time ranges on the time axis.
$TB_j$, $TS_j$, $TD_j$ and $TW_j$ are not cumulative numbers but series of time ranges on the
time axis when performing intersection. As illustrated in Figure 3, $TD_j$ occurs from 10 to
15 and from 25 to 35, while $TD_{j+1}$ occurs from 12 to 14 and from 20 to 30. As a result, their
intersection is equal to [12,14] and [25,30].
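The intersection of time ranges can be illustrated with a short Python sketch. It is not part of the original method: the function names `intersect` and `total_length` are hypothetical, and each downtime record is assumed to be a sorted list of non-overlapping (start, end) ranges. The data reproduce the example above.

```python
from typing import List, Tuple

Interval = Tuple[float, float]


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Intersect two sorted lists of non-overlapping (start, end) time ranges."""
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:  # the two current ranges overlap
            result.append((start, end))
        # Advance the list whose current range ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return result


def total_length(ranges: List[Interval]) -> float:
    """Cumulative duration of a list of time ranges."""
    return sum(end - start for start, end in ranges)


td_j = [(10, 15), (25, 35)]    # downtime ranges of machine j
td_j1 = [(12, 14), (20, 30)]   # downtime ranges of machine j+1
overlap = intersect(td_j, td_j1)
```

Here `overlap` holds the two ranges [12,14] and [25,30], and `total_length(overlap)` gives their cumulative duration of 7 time units.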
Figure 2. Selected segment for analytical calculation.
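Mathematically, the definition of bottleneck can be formulated as (Chang et al. 2007,
Li et al. 2008): if
$$\frac{\Delta TP_{sys,k}}{\Delta TP_k} > \frac{\Delta TP_{sys,j}}{\Delta TP_j}, \quad \forall j \ne k,$$
then machine $k$ is defined as the bottleneck machine.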
In an $n$-machine serial production line with $n-1$ buffers, the overall system throughput
$TP_{sys}$ over a time period is a function of each individual standalone throughput and
buffer content variation: $TP_{sys}(t) = f(TP_1(t), \ldots, TP_n(t), B_1(t), \ldots, B_{n-1}(t))$. Therefore,
the sensitivity value of each machine $i$, $\Delta TP_{sys,i}/\Delta TP_i$, decides the location of the
bottleneck.
Since the status of each machine has only four possibilities (blocked, starved, down,
and operating), the equations based on time summation for each machine can be obtained:
$$TB_{i-1} + TD_{i-1} + TW_{i-1} = T \tag{1}$$
$$TB_i + TS_i + TD_i + TW_i = T \tag{2}$$
$$TS_{i+1} + TD_{i+1} + TW_{i+1} = T. \tag{3}$$
It is clear that $TB_i$ is mainly caused by $TD_{i+1}$, and $TS_i$ is mainly caused by $TD_{i-1}$.
Furthermore, $TB_{i-1}$ is caused by $TD_i$ and $TD_{i+1}$, and $TS_{i+1}$ is caused by $TD_{i-1}$ and $TD_i$.
Therefore, four additional equations according to these conditions can be obtained as:
$$TB_i = TD_{i+1} - TD_i \cap TD_{i+1} - \alpha \left( TD_{i-1} \cap TD_{i+1} - TD_{i-1} \cap TD_i \cap TD_{i+1} \right) \tag{4}$$
$$TS_i = TD_{i-1} - TD_{i-1} \cap TD_i - (1 - \alpha) \left( TD_{i-1} \cap TD_{i+1} - TD_{i-1} \cap TD_i \cap TD_{i+1} \right) \tag{5}$$
$$TB_{i-1} = TD_i + TD_{i+1} - TD_i \cap TD_{i+1} - TD_{i-1} \cap TD_i - TD_{i-1} \cap TD_{i+1} + TD_{i-1} \cap TD_i \cap TD_{i+1} \tag{6}$$
$$TS_{i+1} = TD_{i-1} + TD_i - TD_{i-1} \cap TD_i - TD_{i-1} \cap TD_{i+1} - TD_i \cap TD_{i+1} + TD_{i-1} \cap TD_i \cap TD_{i+1}, \tag{7}$$
where $0 \le \alpha \le 1$. When both machines $M_{i-1}$ and $M_{i+1}$ break down, and if there is a part on
machine $M_i$, then machine $M_i$ is said to be blocked during the failure; otherwise, if there is no
part on machine $M_i$ when both machine $M_{i-1}$ and machine $M_{i+1}$ have failed, then machine
$M_i$ is said to be starved. Therefore, parameter $\alpha$ is a normalised index between 0 and 1
describing this type of uncertainty in Equations (4) and (5).
Figure 3. Notation illustrations.
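As a minimal numerical sketch of Equations (4) to (7), the Python code below evaluates the blockage and starvation times of the three-machine-no-buffer segment. Several things here are assumptions made for illustration and are not taken from the paper: time is discretised into unit slots, each downtime is stored as a Python set of slot indices so that the intersection becomes a set intersection, the helper names `slots` and `blockage_starvation` are hypothetical, and the value 0.5 for the normalised index is arbitrary.

```python
def slots(*ranges):
    """Expand (start, end) ranges into a set of unit time slots [start, end)."""
    return {t for start, end in ranges for t in range(start, end)}


def blockage_starvation(td_prev, td_i, td_next, alpha=0.5):
    """Evaluate Eqs. (4)-(7) for machines i-1, i, i+1 given their downtime slots.

    alpha is the normalised index between 0 and 1 from Eqs. (4) and (5).
    Returns (TB_{i-1}, TB_i, TS_i, TS_{i+1}) as durations.
    """
    n = len
    triple = n(td_prev & td_i & td_next)
    # Time when both neighbours are down but machine i itself is up.
    both_neighbours = n(td_prev & td_next) - triple
    tb_i = n(td_next) - n(td_i & td_next) - alpha * both_neighbours        # Eq. (4)
    ts_i = n(td_prev) - n(td_prev & td_i) - (1 - alpha) * both_neighbours  # Eq. (5)
    tb_prev = (n(td_i) + n(td_next) - n(td_i & td_next)
               - n(td_prev & td_i) - n(td_prev & td_next) + triple)        # Eq. (6)
    ts_next = (n(td_prev) + n(td_i) - n(td_prev & td_i)
               - n(td_prev & td_next) - n(td_i & td_next) + triple)        # Eq. (7)
    return tb_prev, tb_i, ts_i, ts_next


# Illustrative data only (not from the case study).
td_prev = slots((0, 10))
td_i = slots((5, 20))
td_next = slots((8, 12), (30, 35))
tb_prev, tb_i, ts_i, ts_next = blockage_starvation(td_prev, td_i, td_next)

# Working times from the time balances, Eqs. (1)-(3), with T = 40 slots.
T = 40
tw_prev = T - tb_prev - len(td_prev)
tw_i = T - tb_i - ts_i - len(td_i)
tw_next = T - ts_next - len(td_next)
```

For these data the three working times coincide, which is what one would expect of a segment without buffers: a machine can only work while its neighbours are also up.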
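The sensitivity value for each machine has been obtained (Li et al. 2008) as:
$$\frac{\Delta TP_{sys,i-1}}{\Delta TP_{i-1}} = \Delta TD - \Delta TD \cap TD_i - \Delta TD \cap TD_{i+1} + \Delta TD \cap TD_i \cap TD_{i+1} \tag{8}$$
$$\frac{\Delta TP_{sys,i}}{\Delta TP_i} = \Delta TD - \Delta TD \cap TD_{i-1} - \Delta TD \cap TD_{i+1} + \Delta TD \cap TD_{i-1} \cap TD_{i+1} \tag{9}$$
$$\frac{\Delta TP_{sys,i+1}}{\Delta TP_{i+1}} = \Delta TD - \Delta TD \cap TD_{i-1} - \Delta TD \cap TD_i + \Delta TD \cap TD_{i-1} \cap TD_i \tag{10}$$
According to Equations (8), (9), and (10), when $TD_i$ is reduced, $\Delta TP_{sys,i-1}/\Delta TP_{i-1}$ and
$\Delta TP_{sys,i+1}/\Delta TP_{i+1}$ both increase, while $\Delta TP_{sys,i}/\Delta TP_i$ remains the same. We therefore
conclude that the sensitivity of non-bottleneck machines approaches the sensitivity of the
bottleneck machine as the downtime of the bottleneck machine is reduced. Therefore,
when $TD_i$ is reduced below a certain value, $\Delta TP_{sys,i-1}/\Delta TP_{i-1}$ or $\Delta TP_{sys,i+1}/\Delta TP_{i+1}$ will
become higher than $\Delta TP_{sys,i}/\Delta TP_i$. At this time, the bottleneck will switch to machine
$i-1$ or $i+1$.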
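The effect of reducing the bottleneck downtime on the three sensitivity values can be checked numerically. The sketch below is an illustration under stated assumptions rather than the authors' implementation: time is discretised into unit slots, the downtime-reduction term in Equations (8) to (10) is interpreted as a set of slots called `window`, and the names `slots`, `sensitivity` and `all_sensitivities` are hypothetical.

```python
def slots(start, end):
    """Unit time slots in [start, end)."""
    return set(range(start, end))


def sensitivity(window, td_a, td_b):
    """Right-hand side of Eqs. (8)-(10) for one machine.

    td_a and td_b are the downtime slots of the two OTHER machines;
    the result is the part of `window` in which neither of them is down.
    """
    return (len(window) - len(window & td_a) - len(window & td_b)
            + len(window & td_a & td_b))


def all_sensitivities(window, td_prev, td_i, td_next):
    return (
        sensitivity(window, td_i, td_next),     # machine i-1, Eq. (8)
        sensitivity(window, td_prev, td_next),  # machine i,   Eq. (9)
        sensitivity(window, td_prev, td_i),     # machine i+1, Eq. (10)
    )


# Illustrative data only: machine i has the longest downtime.
window = slots(0, 24)
td_prev, td_next = slots(0, 6), slots(18, 21)
before = all_sensitivities(window, td_prev, slots(6, 18), td_next)
# Shorten the downtime of machine i by three slots.
after = all_sensitivities(window, td_prev, slots(6, 15), td_next)
```

With these data, the values for machines $i-1$ and $i+1$ rise after the reduction while the value for machine $i$ is unchanged, in line with the observation above.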
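Assuming the threshold value for downtime reduction is $\Delta TD_i$, according to Equations
(8), (9), and (10), the new sensitivity values after this reduction can be obtained as:
$$\frac{\Delta TP_{sys,i-1,new}}{\Delta TP_{i-1,new}} = \Delta TD - \Delta TD \cap (TD_i - \Delta TD_i) - \Delta TD \cap TD_{i+1} + \Delta TD \cap (TD_i - \Delta TD_i) \cap TD_{i+1} \tag{8a}$$
$$\frac{\Delta TP_{sys,i,new}}{\Delta TP_{i,new}} = \Delta TD - \Delta TD \cap TD_{i-1} - \Delta TD \cap TD_{i+1} + \Delta TD \cap TD_{i-1} \cap TD_{i+1} \tag{9a}$$
$$\frac{\Delta TP_{sys,i+1,new}}{\Delta TP_{i+1,new}} = \Delta TD - \Delta TD \cap TD_{i-1} - \Delta TD \cap (TD_i - \Delta TD_i) + \Delta TD \cap TD_{i-1} \cap (TD_i - \Delta TD_i) \tag{10a}$$
If $TD_{i-1} > TD_{i+1}$, then $\Delta TP_{sys,i-1}/\Delta TP_{i-1} > \Delta TP_{sys,i+1}/\Delta TP_{i+1}$. Setting Equation
(8a) equal to Equation (9a) gives
$$\Rightarrow \Delta TD \cap (TD_i - \Delta TD_i - TD_{i-1}) = \Delta TD \cap (TD_i - \Delta TD_i - TD_{i-1}) \cap TD_{i+1}. \tag{11}$$
The relation $TD_i - \Delta TD_i - TD_{i-1} = 0$ makes Equation (11) always true. Therefore,
$\Delta TD_i = TD_i - TD_{i-1}$ leads to $\Delta TP_{sys,i,new}/\Delta TP_{i,new} = \Delta TP_{sys,i-1,new}/\Delta TP_{i-1,new}$
and $TD'_i = TD_{i-1} = TD_i - \Delta TD_i$, where $TD'_i$ is the updated $TD_i$ in this new stage.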
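Then, we substitute $\Delta TD_i = TD_i - TD_{i-1}$ into Equations (8a), (9a), and (10a). These
three equations become:
$$\frac{\Delta TP_{sys,i-1,new}}{\Delta TP_{i-1,new}} = \Delta TD - \Delta TD \cap TD_{i-1} - \Delta TD \cap TD_{i+1} + \Delta TD \cap TD_{i-1} \cap TD_{i+1} \tag{8b}$$
$$\frac{\Delta TP_{sys,i,new}}{\Delta TP_{i,new}} = \Delta TD - \Delta TD \cap TD_{i-1} - \Delta TD \cap TD_{i+1} + \Delta TD \cap TD_{i-1} \cap TD_{i+1} \tag{9b}$$
$$\frac{\Delta TP_{sys,i+1,new}}{\Delta TP_{i+1,new}} = \Delta TD - \Delta TD \cap TD_{i-1}, \tag{10b}$$
where (10b) follows from (10a) because $TD_{i-1} \cap TD_{i-1} = TD_{i-1}$.
Next, make (8b) = (9b) = (10b) so that the sensitivity value of each machine is equal.
Assuming the new threshold value to be $\Delta TD_{i-1}$ on machines $i-1$ and $i$, replace
$TD_{i-1}$ with $TD_{i-1} - \Delta TD_{i-1}$ in (8b), (9b), and (10b) and equate these three expressions:
$$\Rightarrow \Delta TD \cap TD_{i+1} = \Delta TD \cap (TD_{i-1} - \Delta TD_{i-1}) \cap TD_{i+1}. \tag{12}$$
As a result, the condition $TD_{i-1} - \Delta TD_{i-1} = TD_{i+1}$ makes Equation (12) always true,
and we can conclude that when $\Delta TD_{i-1} = TD_{i-1} - TD_{i+1}$, the following relations are
obtained:
$$\frac{\Delta TP_{sys,i-1,new}}{\Delta TP_{i-1,new}} = \frac{\Delta TP_{sys,i,new}}{\Delta TP_{i,new}} = \frac{\Delta TP_{sys,i+1,new}}{\Delta TP_{i+1,new}}$$
$$TD'_{i-1} = TD'_i = TD_{i+1} = TD_{i-1} - \Delta TD_{i-1}.$$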
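The two stages of the derivation can be summarised in a small Python sketch that works on cumulative downtime durations. The function name `balancing_stages` is hypothetical, and the ordering in which the bottleneck has the longest downtime and the upstream machine the second longest is an assumption that mirrors the case treated above.

```python
def balancing_stages(td_prev, td_i, td_next):
    """Two-stage downtime reduction for a three-machine-no-buffer segment.

    Arguments are cumulative downtimes of machines i-1, i and i+1, and the
    sketch assumes td_i >= td_prev >= td_next (machine i is the bottleneck).
    Returns (stage 1 threshold, stage 2 threshold, final downtimes).
    """
    if not td_i >= td_prev >= td_next:
        raise ValueError("expected td_i >= td_prev >= td_next")
    # Stage 1: reduce machine i until its downtime equals that of machine i-1.
    stage1 = td_i - td_prev
    # Stage 2: reduce machines i-1 and i together down to the level of machine i+1.
    stage2 = td_prev - td_next
    final = (td_prev - stage2, td_i - stage1 - stage2, td_next)
    return stage1, stage2, final


# Illustrative durations only (e.g., minutes per shift).
stage1, stage2, final = balancing_stages(td_prev=6, td_i=12, td_next=3)
```

After both stages the three downtimes are equal, which corresponds to the equal-sensitivity condition reached at the end of the derivation.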
In the extreme case when the downtimes of all the machines are reduced to zero,
the sensitivity values of all the machines will become zero according to Equations (8), (9), and (10).
The threshold value for downtime reduction obtained above is under the assumption
that there is no buffer within the three-machine segment. In real manufacturing systems,
buffers play an important role to balance production line and alleviate bottleneck
problems. Although the threshold value can theoretically identify the boundary value for
bottleneck change, the presence of buffers causes the bottleneck location change to be
Furthermore, based on the relationships:
$$TB_{i-1} + TD_{i-1} + TW_{i-1} = T$$
$$TB_i + TS_i + TD_i + TW_i = T$$
where $TW_{i-1} = TW_i$, we also obtain $\Delta TD_i = TD_i - TD_{i-1} = TB_{i-1} - (TB_i + TS_i)$, which
implies that the control strategy for downtime reduction is equivalent to reducing the
blockage and starvation time of non-bottleneck machines to make it equal to the blockage
and starvation time of the bottleneck machine. For production lines without buffers, the
bottleneck machine usually has a higher downtime. On the other hand, for production
lines with buffers, the bottleneck machines may not have higher downtime than
non-bottleneck machines, but they will still have smaller blockage plus starvation
time than adjacent non-bottleneck machines as mentioned in Li et al. (2008). In
this case, although the term ‘downtime reduction’ is still used, the outcome of control
is equivalent to ‘reduction’ of the blockage and starvation of the non-bottleneck stations
towards balanced-line status. The threshold value for downtime reduction becomes:
$$\Delta TD_i = \min\bigl(TB_{i-1} - (TB_i + TS_i),\; TS_{i+1} - (TB_i + TS_i)\bigr).$$
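A minimal sketch of this threshold calculation from measured blockage and starvation times is given below. The function name `downtime_reduction_threshold` and the sample values are hypothetical; only the formula itself is taken from the text.

```python
def downtime_reduction_threshold(tb_prev, tb_i, ts_i, ts_next):
    """Threshold downtime reduction for bottleneck machine i.

    tb_prev: blockage time of the upstream machine i-1
    tb_i, ts_i: blockage and starvation time of the bottleneck machine i
    ts_next: starvation time of the downstream machine i+1
    All values are cumulative durations over the same sampling time T.
    """
    own = tb_i + ts_i  # blockage plus starvation of the bottleneck itself
    return min(tb_prev - own, ts_next - own)


# Illustrative measurements only (e.g., minutes over one shift).
threshold = downtime_reduction_threshold(tb_prev=15, tb_i=5, ts_i=5, ts_next=16)
```

For these values the upstream gap (15 - 10 = 5) is smaller than the downstream gap (16 - 10 = 6), so the threshold is set by the upstream machine.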
Based on the calculation of threshold value for downtime reduction, the real time
bottleneck control algorithm is developed to consistently improve system performance
towards balanced-line status as shown in Figure 4.
The practical actions to realise real time bottleneck control in the short term include
performing reactive maintenance task prioritisation and initialising the buffer content.
With maintenance task prioritisation, the waiting time for machine repair is reduced and
finite maintenance resources are utilised efficiently. In the proposed bottleneck control
strategy, high priority is given to all the bottlenecks detected rather than to only
one bottleneck. For buffer initialisation, the threshold downtime to be reduced can be
translated into initial buffer contents, and this research explores a data driven approach to
initial buffer adjustment to mitigate bottlenecks in the short term.
Two realistic assumptions are made for this approach:
• Buffer capacity is relatively large (>10).
• Initial buffer level is adjustable at the beginning and end of every production shift.
  The practical method is to run certain machines for a longer time.
The initial buffer contents are adjusted around the bottleneck machine as demonstrated
in Figure 5, in which $M_3$ is the bottleneck. ‘Bottleneck gain’ is defined as $\delta_i$, which
is the number of parts adjusted between two buffers.
As a result, the final buffer content after initial buffer adjustment is equal to:
$$B_{i,final} = B_{i,original} + \delta_{i,in} - \delta_{i,out},$$
where $B_{i,original}$ and $B_{i,final}$ are the content for buffer $i$ before and after adjustment,
respectively, $\delta_{i,in}$ is the bottleneck gain into buffer $i$, and $\delta_{i,out}$ is the gain out of
buffer $i$.
Figure 4. Bottleneck control algorithm.
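The buffer update can be written as a one-line calculation. In the sketch below the function name `final_buffer_content` is hypothetical, and the check against the buffer capacity is an added assumption that is not stated in the text.

```python
def final_buffer_content(original, gain_in, gain_out, capacity):
    """Buffer content after initial buffer adjustment.

    original: content of buffer i before adjustment (parts)
    gain_in:  bottleneck gain into buffer i (parts)
    gain_out: gain out of buffer i (parts)
    capacity: buffer capacity; the bounds check is an assumption of this sketch
    """
    final = original + gain_in - gain_out
    if not 0 <= final <= capacity:
        raise ValueError("adjusted content must lie between 0 and the capacity")
    return final


# Illustrative numbers only.
adjusted = final_buffer_content(original=8, gain_in=4, gain_out=1, capacity=20)
```

With these numbers the buffer ends the adjustment with 11 parts.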
3. Industrial case study
3.1 Case study for maintenance task prioritisation
An industrial case study of an automotive assembly line is used to illustrate the
implementation of the proposed bottleneck control strategy. In this section, the method of
reactive maintenance task prioritisation for bottleneck control is used. The schematic
layout is shown in Figure 6. This production line starts from station S1 and ends at
station S17. The parameters for the stations are shown in Table 1.
The blockage and starvation information after one day of production is recorded.
It can be observed that stations S1, S4, and S14 are the three bottlenecks. The
prioritisation policy for bottleneck control in this research is that all three
bottlenecks have high priority to be maintained, rather than only considering one or
two of them.
To verify this conclusion, we compare our proposed prioritisation policy with other
policies using simulation of this real production line. The assumptions and conditions for
this simulation include:
• Only one maintenance engineer is available and the effort on each policy is the
  same.
• Statistical results of replications are used.
• Only reactive maintenance is considered. That means the repair or replacement is
  triggered only when a machine breaks down.
• For the baseline, the policy for performing reactive maintenance tasks is first come
  first served (FCFS), which means the maintenance engineer will work on the
  machine which fails first without considering bottleneck locations. For the
  prioritisation policy with bottleneck detection considered, reactive maintenance is
  performed on the bottleneck machine first.
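The difference between the two dispatching rules can be sketched as follows. This is not the simulation model used in the case study: the `WorkOrder` class, the function names and the sample queue are hypothetical, and breaking ties among bottleneck stations by failure time is an assumption.

```python
from dataclasses import dataclass


@dataclass
class WorkOrder:
    station: str
    failed_at: float  # time at which the failure was reported


def next_fcfs(queue):
    """Baseline policy: repair the station that failed first."""
    return min(queue, key=lambda order: order.failed_at)


def next_bottleneck_first(queue, bottlenecks):
    """Proposed policy: bottleneck stations first, then by failure time."""
    return min(queue, key=lambda order: (order.station not in bottlenecks,
                                         order.failed_at))


# Illustrative queue of open work-orders for the single maintenance engineer.
queue = [WorkOrder("S7", 10.0), WorkOrder("S4", 12.5), WorkOrder("S14", 11.0)]
bottlenecks = {"S1", "S4", "S14"}
fcfs_choice = next_fcfs(queue).station
priority_choice = next_bottleneck_first(queue, bottlenecks).station
```

In this queue FCFS sends the engineer to S7, whereas the bottleneck-first rule sends the engineer to S14, the bottleneck station that has been down longest.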
The production comparisons for different prioritisation policies are summarised in
Table 2. It is observed that the proposed control strategy on three bottlenecks S1, S4, and
S14 results in higher throughput and efficiency than other policies. Hence, we state that the
finite maintenance resources are better utilised.
Figure 5. Adjustment of initial buffer levels around the bottleneck.