The 90-Day Unplanned Downtime Playbook for Mid-Market Manufacturers

A 90-day plan that starts with diagnosis and moves on to routing, aimed at reducing unplanned downtime in mid-market plants: what gets fixed in days 1-30, why most programs stall, and how to lock in the gains.

Key takeaways

  • Most plants do not have an unplanned-downtime problem. They have a downtime-data problem: events are logged in three or four places, none of which agree, so the same losses get fixed twice and the biggest ones never get fixed at all.
  • A 90-day program can close that gap without ripping up the existing stack. The point of the first 90 days is not "more uptime"; it is one trusted downtime number that every meeting in the plant can work from.
  • Days 1-30 are diagnostic: instrument the top three lines, define what counts as downtime, and consolidate every loss into one register. Days 31-60 are about routing: every recurring loss type maps to a single named owner and a single response loop. Days 61-90 are about compounding: the top three recurring causes get a permanent fix and the metrics move from lagging to leading.
  • Plants that finish the 90 days typically see unplanned downtime drop by a double-digit percentage on the instrumented lines, but the bigger win is structural: the plant now operates from one set of numbers instead of arguing about which spreadsheet is right.

Why most downtime-reduction programs stall

The reason most "downtime reduction" initiatives fail is not effort. It is that they start from the wrong place. A new dashboard goes up, an OEE number is published, a kaizen team is assembled, and within two months the production manager, the maintenance manager and the planner are each looking at a different downtime number and quietly assuming the other two are wrong.

That gap is real, and it is structural. Downtime is recorded by operators on the line in one system, captured by maintenance in a CMMS or paper log in another, and inferred by planning from missed production targets in a third. No two of those views see the same thing. The operator counts only stops they had time to log. Maintenance counts only stops that escalated to a work order. Planning counts only stops that broke the schedule. The totals never match, and nobody trusts the average.

A 90-day playbook works because it does not try to reorganize the plant. It builds one trusted downtime register, routes every loss type to a single owner, and lets the volume of fixes do the rest.

Days 1-30: get one number everyone trusts

1. Pick three lines, not the whole plant

The fastest way to lose the next 90 days is to instrument every line at once. Pick the three lines that account for the largest share of unplanned downtime, or, if you do not know that yet, the three lines whose operators complain most. Three lines is enough to see the pattern and small enough to fix in one quarter.
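If some downtime history already exists, the selection can be made from totals rather than from complaints. The following is a minimal sketch, assuming stops are available as hypothetical (line, minutes) pairs; the function name and the sample data are illustrative, not part of any particular tool.

```python
from collections import defaultdict


def top_lines_by_downtime(stops, n=3):
    """Return the n lines with the most unplanned downtime, with their share.

    `stops` is an iterable of (line_name, minutes) pairs. This shape is an
    assumption made for the sketch.
    """
    totals = defaultdict(float)
    for line, minutes in stops:
        totals[line] += minutes

    grand_total = sum(totals.values())
    if grand_total == 0:
        return []

    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    # Keep each line's share of the total so the choice can be defended later.
    return [(line, minutes / grand_total) for line, minutes in ranked[:n]]


# Hypothetical sample data, in minutes of unplanned downtime.
sample_stops = [
    ("Line A", 120), ("Line B", 45), ("Line C", 300),
    ("Line D", 15), ("Line A", 60), ("Line E", 60),
]
selected = top_lines_by_downtime(sample_stops)
```

In this sample, three of five lines carry 90% of the recorded downtime, which is the kind of concentration that makes a three-line scope defensible.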

2. Define what counts as downtime, in writing

Every plant has an unwritten threshold below which a stop "does not count." On most lines that threshold is anywhere from 30 seconds to 5 minutes. Pick one, write it down, and apply it the same way on all three lines. The number matters less than the consistency. A 90-second threshold applied to every line beats a 0-second threshold applied selectively. For a deeper view on how those micro-stops add up, see the article on production loss analysis.
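Writing the threshold down can be as literal as putting it in one shared constant. The sketch below assumes a 90-second threshold and an "at or above" comparison; both are illustrative choices, since the only requirement is that one rule applies to every line.

```python
# Assumed value for illustration: the playbook only asks that one threshold
# be chosen, written down, and applied identically on every line.
DOWNTIME_THRESHOLD_S = 90


def counts_as_downtime(duration_s, threshold_s=DOWNTIME_THRESHOLD_S):
    """A stop counts if it lasts at least the threshold (assumed: inclusive)."""
    return duration_s >= threshold_s


# Hypothetical raw stop durations in seconds for one shift.
raw_stops_s = [12, 95, 90, 300, 45]

counted = [d for d in raw_stops_s if counts_as_downtime(d)]
# Time spent in stops below the threshold; worth tracking even if not counted.
below_threshold_total_s = sum(d for d in raw_stops_s if not counts_as_downtime(d))
```

Keeping the below-threshold total alongside the counted stops makes it easy to see later how much the micro-stops add up to.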

3. Capture every stop in one register

The register can live in a CMMS, in an OEE tool, or in a spreadsheet. What matters is that maintenance, production and planning all read from it. Each entry needs four fields: line, start time, duration, and a free-text reason. Categorisation comes later. Trying to define the perfect 32-reason taxonomy on day one is how the project dies in week three. Start with free text; cluster after week four.
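A register entry with those four fields can be sketched in a few lines. The class and function names below are hypothetical; the same shape maps directly onto four spreadsheet columns.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class StopEntry:
    """One row in the downtime register: the four fields and nothing else."""
    line: str
    start: datetime
    duration: timedelta
    reason: str  # free text on purpose; categories are assigned later

    @property
    def end(self):
        return self.start + self.duration


def log_stop(register, line, start, duration_min, reason):
    """Append a stop to the register (a plain list in this sketch)."""
    entry = StopEntry(line, start, timedelta(minutes=duration_min), reason.strip())
    register.append(entry)
    return entry


register = []
first = log_stop(register, "Line A", datetime(2024, 1, 8, 6, 30), 12, " jam at infeed ")
```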

4. Reconcile, then publish

At the end of day 30, reconcile the register against the maintenance log and the planning gap report for the same three lines. Expect them to disagree by 20-40%. Walk the floor and ask the operators which of the three is closest. Publish the reconciled number as the downtime number for those lines. Do not republish a "corrected" number a week later. Trust beats accuracy in this phase.
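One way to put a number on the disagreement is the spread between the highest and lowest source total, expressed as a percentage of the highest. That definition is an assumption made for this sketch, and the figures are hypothetical.

```python
def disagreement_pct(totals_min):
    """Spread between the highest and lowest source total, as % of the highest.

    `totals_min` maps a source name to its downtime total in minutes for the
    same line and the same period.
    """
    values = list(totals_min.values())
    high, low = max(values), min(values)
    if high == 0:
        return 0.0
    return 100.0 * (high - low) / high


# Hypothetical 30-day totals for one line, in minutes.
line_a = {"register": 410, "maintenance_log": 290, "planning_gap": 365}
spread = disagreement_pct(line_a)
```

Here the three sources differ by roughly 29%, inside the range the reconciliation step should expect.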

Days 31-60: route every loss type to one owner

5. Cluster the free-text reasons

By day 30 the register typically has between 300 and 1,500 entries. Cluster the free-text reasons into seven to ten loss types, for example: changeover, micro-stops, minor mechanical, electrical/sensor, material starvation, quality reject, planned-but-overran, operator-related, and an "other" bucket. More than ten types is too granular to act on; fewer than seven hides the real distribution.
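A first pass at clustering does not need anything sophisticated. The sketch below uses simple keyword matching; the keyword lists are hypothetical and would be built from the plant's own free-text entries, and only four loss types are shown to keep it short.

```python
from collections import Counter

# Hypothetical keyword lists; in practice these come from reading the register.
KEYWORDS = {
    "changeover": ["changeover", "format change"],
    "electrical/sensor": ["sensor", "photo eye", "fuse"],
    "material starvation": ["no material", "waiting for", "starved"],
    "minor mechanical": ["jam", "belt", "guide rail"],
}


def classify(reason):
    """Return the first loss type whose keywords appear in the free text."""
    text = reason.lower()
    for loss_type, words in KEYWORDS.items():
        if any(word in text for word in words):
            return loss_type
    return "other"


sample_reasons = [
    "Sensor false trip at infeed",
    "Jam at guide rail",
    "Waiting for film rolls",
    "changeover to 500 ml",
    "operator on break",
]
counts = Counter(classify(r) for r in sample_reasons)
other_share = counts["other"] / len(sample_reasons)
```

Watching `other_share` over time is a cheap check on whether the loss types still describe what the floor is writing down.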

6. Assign one owner per loss type

This is the single biggest leverage point in the 90 days. Each loss type gets one named owner: a person, not a department. Changeover goes to the production manager. Minor mechanical and electrical go to the maintenance manager. Material starvation goes to the planner. The "other" bucket is the one deliberate exception: it has no owner, stays in the register, and does not get worked until it shrinks below 5%. The point is not perfection; it is making sure no loss type has two owners (which means none) or zero owners (which means none).
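The one-owner rule is easy to check mechanically. A minimal sketch, assuming assignments are kept as hypothetical (loss type, person) pairs:

```python
def ownership_gaps(loss_types, assignments, unowned_ok=("other",)):
    """Return loss types that do not have exactly one named owner.

    `assignments` is a list of (loss_type, person) pairs. Loss types listed
    in `unowned_ok` are deliberately left without an owner.
    """
    owners = {loss_type: set() for loss_type in loss_types}
    for loss_type, person in assignments:
        owners.setdefault(loss_type, set()).add(person)

    problems = {}
    for loss_type, people in owners.items():
        if loss_type in unowned_ok:
            continue
        if len(people) != 1:
            # Zero owners and two owners are reported the same way.
            problems[loss_type] = sorted(people)
    return problems


# Hypothetical assignments with one double-owned and one unowned loss type.
loss_types = ["changeover", "minor mechanical", "material starvation", "other"]
assignments = [
    ("changeover", "Production Manager"),
    ("minor mechanical", "Maintenance Manager"),
    ("minor mechanical", "Shift Lead"),
]
gaps = ownership_gaps(loss_types, assignments)
```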

7. Create a single response loop per loss type

Each owner defines one response loop. Maintenance gets an automatic work order when any minor-mechanical stop exceeds five minutes. Production gets a daily review of changeovers longer than the standard. Planning gets an exception report on starvation. These loops should be boring and repeatable. The goal is not novelty; it is making sure every recurring loss has a path to a fix that does not depend on anyone remembering to look.
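The maintenance loop described above reduces to a single rule. In this sketch, `open_work_order` is a hypothetical callback standing in for whatever the plant's CMMS or ticketing process provides; it is not a real API.

```python
from dataclasses import dataclass


@dataclass
class Stop:
    line: str
    loss_type: str
    duration_min: float


def needs_work_order(stop, limit_min=5.0):
    """Rule from the text: a minor-mechanical stop longer than five minutes."""
    return stop.loss_type == "minor mechanical" and stop.duration_min > limit_min


def run_response_loop(stops, open_work_order):
    """Call the (hypothetical) work-order callback for every qualifying stop."""
    return [open_work_order(stop) for stop in stops if needs_work_order(stop)]


# Hypothetical stops from one shift.
shift_stops = [
    Stop("Line A", "minor mechanical", 7.5),
    Stop("Line A", "minor mechanical", 3.0),
    Stop("Line B", "changeover", 40.0),
]
opened = run_response_loop(shift_stops, lambda stop: f"WO for {stop.line}")
```

Because the rule lives in one function, nobody has to remember to look: the loop fires on the data, not on attention.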

8. The first wave of fixes

In weeks 7 and 8, the top three recurring causes start showing up as fix candidates. These are usually unglamorous: a sensor that triggers a false stop, a guide rail that drifts after the night-shift changeover, a feed rate setting that gets reset every Monday. Fix three of them. Do not pick the most interesting ones; pick the most frequent. For more on how to identify and prioritise the right ones, see our guide on root cause analysis in manufacturing.
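Ranking by frequency rather than by duration is a one-liner once the register exists. A minimal sketch with hypothetical (cause, minutes) entries:

```python
from collections import Counter


def top_causes_by_frequency(entries, n=3):
    """Rank causes by how often they occur, ignoring how long each stop lasted.

    `entries` is an iterable of (cause, minutes) pairs.
    """
    counts = Counter(cause for cause, _minutes in entries)
    return counts.most_common(n)


# Hypothetical register extract: one long breakdown, many short repeats.
entries = (
    [("sensor false stop", 2)] * 4
    + [("guide rail drift", 6)] * 3
    + [("feed rate reset", 10)] * 2
    + [("gearbox failure", 240)]
)
fix_candidates = top_causes_by_frequency(entries)
```

The single 240-minute gearbox failure does not make the list, which is the intent: the first wave targets what keeps coming back.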

Days 61-90: shift from lagging to leading metrics

9. Move the conversation from "how much" to "how often"

By day 60, downtime is being measured the same way by everyone. Days 61-90 are about changing what gets discussed in the morning meeting. Instead of "we lost 47 minutes to micro-stops yesterday," the question becomes "how many of those micro-stops were the same root cause we saw on Friday?" That shift is what turns the program from reactive to proactive.
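The "how often" question can be answered directly from the register. This sketch assumes each stop already carries a root-cause label; the labels and data are hypothetical.

```python
from collections import Counter


def repeat_causes(yesterday, earlier):
    """Count yesterday's stops whose root cause was already seen earlier."""
    seen_before = set(earlier)
    return Counter(cause for cause in yesterday if cause in seen_before)


# Hypothetical root-cause labels for yesterday's and Friday's micro-stops.
yesterday = ["infeed sensor", "infeed sensor", "label jam", "infeed sensor"]
friday = ["infeed sensor", "film splice"]

repeats = repeat_causes(yesterday, friday)
repeat_share = sum(repeats.values()) / len(yesterday)
```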

10. Add leading indicators for the top three loss types

For each of the top three recurring causes, define one leading indicator. For minor mechanical, it might be the number of false-stop sensor events per shift. For changeover, it might be the average deviation of changeover time from the standard. For material starvation, it might be hours of inventory cover at the bottleneck. Track these daily. When a leading indicator drifts, the response loop fires before the downtime happens. For a structured way to choose these, the piece on manufacturing KPIs is a useful reference.
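"Drift" needs a definition before a loop can fire on it. One simple option, assumed here purely for illustration, is to compare the latest daily value against the mean of the preceding days with a tolerance band; the window length and tolerance are hypothetical defaults.

```python
from statistics import mean


def has_drifted(daily_values, baseline_days=5, tolerance=0.25):
    """True if the latest value exceeds the recent baseline by more than tolerance.

    The baseline is the mean of the `baseline_days` values before the latest
    one. Both defaults are assumptions, not recommended settings.
    """
    if len(daily_values) <= baseline_days:
        return False  # not enough history to form a baseline
    baseline = mean(daily_values[-baseline_days - 1:-1])
    latest = daily_values[-1]
    return latest > baseline * (1 + tolerance)


# Hypothetical false-stop sensor events per day on one line.
steady_week = [4, 5, 4, 6, 5, 5]
drifting_week = [4, 5, 4, 6, 5, 9]

steady_alert = has_drifted(steady_week)
drift_alert = has_drifted(drifting_week)
```

This version treats higher values as worse. For an indicator such as hours of inventory cover, where lower is worse, the comparison would need to be reversed.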

11. Lock in the gains

The last two weeks are not about new fixes; they are about making sure the existing ones do not erode. A meaningful share of "fixed" downtime causes comes back if the change is not codified into a standard operating procedure or a preventive task. Every fix from days 31-75 gets a written standard, a check, and an owner. This is also where the program connects to the wider preventive maintenance schedule: a confirmed root cause should drive a PM task, not just a one-time repair.

What "done" looks like at day 90

By the end of the 90 days, the three instrumented lines have:

  • One downtime register that production, maintenance and planning all trust.
  • Seven to ten loss types, each with a single owner and a defined response loop.
  • Three permanent fixes on the top recurring causes.
  • Three leading indicators reviewed daily.
  • A floor that argues about how to reduce the next loss type, not about whose number is right.

The first three lines become the template for the next ten. The plant has not bought a new system, hired a new team, or rolled out a new methodology. It has just made the existing data agree with itself, which, for most mid-market plants, is the only thing standing between them and a step-change in OEE.

How Fabrico fits

The playbook above is deliberately tool-agnostic, but it works best when the downtime register, the maintenance work orders and the OEE calculations all live in one system rather than in three. Fabrico is a manufacturing platform built specifically for that: real-time OEE monitoring and a field-ready CMMS share the same downtime events, the same asset hierarchy and the same loss taxonomy, so the reconciliation step at day 30 is automatic instead of manual. If you want to see how that looks on your own three lines, book a demo and we can walk through your live data.

Frequently asked questions

How long should a downtime reduction program take?

The honest answer is that the structural part, getting one trusted downtime number, one owner per loss type, and a working response loop, fits comfortably in 90 days. The continuous improvement part never ends. Plants that try to do both at once usually do neither.

What is a realistic unplanned-downtime reduction in the first 90 days?

On the three instrumented lines, a double-digit reduction in unplanned downtime is realistic if the top three recurring causes get a permanent fix in that window. The plant-wide number moves more slowly because most plants only instrument three lines in the first quarter.

Do we need a new OEE or CMMS system for this?

No. The playbook works with whatever is already in place, including spreadsheets. A unified OEE + CMMS platform removes the reconciliation step and shortens the cycle from "stop happens" to "work order opens", but the methodology does not depend on it.

Why limit to three lines instead of the whole plant?

Because what kills downtime programs is the cost of the data argument, not the cost of the fixes. Three lines is enough to surface the recurring loss types and small enough to fix the data layer in 30 days. Whole-plant rollouts almost always stall in the reconciliation phase.

What is the single biggest mistake in the first 30 days?

Trying to define the perfect loss-type taxonomy before any data is in the register. Start with free-text reasons and cluster them after week four. Taxonomies designed in a conference room never survive contact with the floor.
