WHY OEE IS NOT A MAINTENANCE KPI
OEE is a useful metric for understanding how well a piece of equipment is performing. It combines availability, performance, and quality into a single number that captures all three types of equipment loss at once. We covered the calculation and its use in our guide on how to calculate manufacturing OEE properly.
But OEE is not a maintenance KPI. OEE tells you what is happening with the equipment. Maintenance KPIs tell you whether your maintenance program is causing it, or preventing it.
A plant with high OEE can have a failing maintenance program if equipment reliability is declining, PM compliance is dropping, and the maintenance team is spending 80 percent of their time on reactive repairs. A plant with average OEE can have a strong maintenance program if it is systematically improving mean time between failures, recovering faster when equipment goes down, and shifting its labor mix toward planned work.
The maintenance metrics that matter are not about the equipment's output. They are about the maintenance program's health. These are the six we use to evaluate a plant's maintenance function, and the ones we recommend every plant track and review weekly.
PM COMPLIANCE: THE FOUNDATION METRIC
Preventive maintenance compliance is the percentage of scheduled PMs completed on time. A PM that was due Tuesday and ran Friday is not compliant. A PM skipped entirely and caught up two weeks later is not compliant. Either way, the equipment was exposed to a failure risk it was not supposed to have.
Most plants that struggle with equipment reliability have PM compliance problems before they have technical problems. The maintenance team has a schedule. The PMs get deferred when production is behind schedule. The equipment that was supposed to be protected runs unprotected for days or weeks. When something eventually fails, the failure gets called a reliability problem when it was actually a compliance problem.
Target for PM compliance: 90 percent or above on the monthly count of PMs completed on schedule. Anything below 80 percent signals that the PM program is under-resourced, poorly scheduled, or systematically deprioritized in favor of production output. The preventive maintenance guide covers how to structure PM schedules that are realistic to comply with, including how to divide ownership between operators and technicians.
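As a rough illustration of the calculation, here is a minimal Python sketch. The function name, the data layout, and the sample dates are hypothetical, and it assumes a PM counts as on time only if it is completed on or before its due date.

```python
from datetime import date

def pm_compliance(pms):
    """Percentage of scheduled PMs completed on or before their due date.

    `pms` is a list of (due_date, completed_date) pairs. completed_date is
    None for a PM that was skipped or is still open.
    """
    if not pms:
        raise ValueError("no PMs scheduled in the period")
    on_time = sum(1 for due, done in pms if done is not None and done <= due)
    return 100.0 * on_time / len(pms)

# Hypothetical month with four scheduled PMs.
schedule = [
    (date(2024, 3, 5), date(2024, 3, 5)),    # done on the due date: compliant
    (date(2024, 3, 5), date(2024, 3, 8)),    # due Tuesday, ran Friday: late
    (date(2024, 3, 12), None),               # skipped: not compliant
    (date(2024, 3, 19), date(2024, 3, 18)),  # done a day early: compliant
]

compliance = pm_compliance(schedule)
print(f"PM compliance: {compliance:.0f}%")
```

With two of four PMs on time, this sample month comes out at 50 percent, well under the 80 percent warning line.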
MTBF: TRACKING RELIABILITY OVER TIME
Mean Time Between Failures is the average time between unplanned breakdowns on a piece of equipment. Calculate it by taking total operating time over a period and dividing by the number of failures during that period.
If a machine ran 600 hours last quarter and had 6 unplanned breakdowns, MTBF was 100 hours. If your PM program is working, MTBF on critical equipment should trend upward over time. If MTBF is flat or declining despite a PM program running, the PMs are either not addressing the actual failure modes or are not being done correctly.
Track MTBF per machine, not as a plant-wide average. A plant average hides the fact that one critical machine is failing every two weeks while everything else runs reliably. You need machine-level visibility to act on it.
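The sketch below shows the MTBF calculation per machine next to a plant-wide figure, to make the masking effect concrete. The machine names and the second machine's numbers are hypothetical. The first machine reuses the 600 hours and 6 breakdowns from the example above.

```python
def mtbf_hours(operating_hours, failures):
    """Total operating time divided by the number of unplanned failures."""
    if failures == 0:
        return None  # no failures in the period, so MTBF is undefined
    return operating_hours / failures

# Hypothetical quarter: (operating hours, unplanned breakdowns) per machine.
quarter = {
    "Press 3": (600, 6),
    "Lathe 1": (580, 1),
}

per_machine = {
    name: mtbf_hours(hours, fails) for name, (hours, fails) in quarter.items()
}

# Plant-wide figure: pooled hours divided by pooled failures.
total_hours = sum(hours for hours, _ in quarter.values())
total_failures = sum(fails for _, fails in quarter.values())
plant_average = mtbf_hours(total_hours, total_failures)

for name, value in per_machine.items():
    print(f"{name}: MTBF {value:.0f} h")
print(f"Plant-wide: MTBF {plant_average:.0f} h")
```

The pooled number lands near 169 hours, which looks acceptable while one machine is failing every 100 hours.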
Taking a machine with a history of frequent failures from an MTBF of 80 hours to 200 hours over 12 months, with a structured PM program running, is a reasonable target. World-class MTBF on well-maintained critical equipment in mid-market manufacturing is typically 500 to 1,000-plus operating hours between unplanned failures.
MTTR: MEASURING RECOVERY SPEED
Mean Time To Repair is the average time from when a machine goes down to when it is producing again. It includes diagnosis, parts retrieval, repair, and verification. Calculate it by averaging the downtime duration across all unplanned breakdown events over a period.
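A minimal sketch of that average, assuming each breakdown event is logged with the time the machine went down and the time it was producing again. The timestamps are hypothetical.

```python
from datetime import datetime

def mttr_hours(events):
    """Average downtime in hours across unplanned breakdown events.

    `events` is a list of (went_down, producing_again) datetime pairs.
    """
    if not events:
        raise ValueError("no breakdown events in the period")
    durations = [(up - down).total_seconds() / 3600 for down, up in events]
    return sum(durations) / len(durations)

# Hypothetical month of breakdowns on one machine.
breakdowns = [
    (datetime(2024, 3, 4, 8, 0), datetime(2024, 3, 4, 9, 30)),     # 1.5 h
    (datetime(2024, 3, 11, 13, 0), datetime(2024, 3, 11, 17, 0)),  # 4.0 h
    (datetime(2024, 3, 20, 6, 15), datetime(2024, 3, 20, 8, 15)),  # 2.0 h
]

mttr = mttr_hours(breakdowns)
print(f"MTTR: {mttr:.1f} h")
```

Because the clock runs from machine down to producing again, time spent on diagnosis and parts retrieval is included in each duration.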
MTTR is a measure of your maintenance organization's capability, not just the equipment's condition. A plant where the maintenance technician spends 45 minutes hunting for the correct part before the repair starts has a spare parts and documentation problem. A plant where diagnosis takes three hours because nobody documented what fixed the same failure last time has a maintenance records problem.
We have seen plants cut MTTR from four hours to under 90 minutes through two changes: a spare parts strategy for the top five constraint machines, and a maintenance history log that lets technicians see what fixed the problem the previous three times it occurred. The spare parts strategy guide covers how to structure parts stocking for critical assets specifically.
Target MTTR for critical equipment in a mid-market plant: under two hours for most electrical and mechanical failures, under four hours for complex failures that require parts to be ordered in. An average MTTR that sits consistently above four hours points to a documentation or parts availability gap worth addressing directly.
UNPLANNED DOWNTIME PERCENTAGE
Unplanned downtime as a percentage of total scheduled production time is the most direct measure of how well the maintenance program is protecting equipment availability. It answers the question: what share of the time the equipment was supposed to be running did it spend down due to an unexpected failure?
Calculation: total unplanned downtime minutes divided by total scheduled production time minutes, expressed as a percentage.
A plant with 10 percent unplanned downtime on its constraint line is losing one hour in every ten to unexpected failures. A plant at 3 percent has a functioning PM program. A plant at 15 percent or above is in reactive mode, and the maintenance team is spending its days responding to the same machines failing the same ways.
This metric is distinct from OEE availability loss because it isolates the unplanned component. OEE availability drops for multiple reasons: long changeovers, material shortages, planned maintenance windows. Unplanned downtime percentage filters to breakdowns only, which is the specific territory of the maintenance program's effectiveness.
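The filtering step can be sketched as follows. The reason codes and the sample week are hypothetical. Your own downtime log will have its own code list, and which codes count as unplanned breakdowns is an assumption you need to set.

```python
# Hypothetical reason codes that count as unplanned breakdowns.
UNPLANNED_CODES = {"breakdown"}

def unplanned_downtime_pct(events, scheduled_minutes):
    """Unplanned downtime minutes as a percentage of scheduled production minutes.

    `events` is a list of (reason_code, minutes) pairs from the downtime log.
    """
    unplanned = sum(
        minutes for code, minutes in events if code in UNPLANNED_CODES
    )
    return 100.0 * unplanned / scheduled_minutes

# Hypothetical week on one line: five 8-hour shifts.
scheduled = 5 * 8 * 60
week_events = [
    ("breakdown", 150),
    ("changeover", 90),         # hurts OEE availability, not a breakdown
    ("breakdown", 90),
    ("material_shortage", 60),  # hurts OEE availability, not a breakdown
]

pct = unplanned_downtime_pct(week_events, scheduled)
print(f"Unplanned downtime: {pct:.1f}%")
```

Only the 240 breakdown minutes count against the 2,400 scheduled minutes, so this week comes out at 10 percent even though total stoppage time was higher.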
Tracking this metric by machine and by week gives the maintenance team the clearest ongoing signal about where the PM program is protecting the operation and where it is not. The downtime tracking guide covers the event capture system and reason code structure needed to generate this metric reliably from floor-level data.
MAINTENANCE BACKLOG: THE LEADING INDICATOR
The maintenance backlog is all work orders that have been identified and approved but not yet completed, measured in hours of labor required to finish the outstanding work.
A healthy backlog is one to two weeks of available maintenance labor. A backlog at four to six weeks means the team cannot keep up with identified needs. Work is being identified and approved but not executed. That outstanding work represents future breakdowns: the things the team knows are wrong or degrading that are not getting fixed before they fail.
The backlog trend over time can reveal two things. Consistent backlog growth signals that the maintenance team is under-resourced relative to the plant's actual maintenance needs. A rapidly shrinking backlog can mean the team is working through it effectively, or that work orders are being declined without proper evaluation. Both cases merit attention from the maintenance lead.
Review the backlog in the weekly maintenance planning meeting. Items aging past 60 days without action should either be escalated to a capital project, rescheduled with a firm date, or removed from the list with a documented decision. A backlog that is never reviewed is just a list of intentions, not a management tool.
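For the weekly review, a sketch like the one below converts backlog hours into weeks of labor and lists the items past 60 days. The work order IDs, dates, and crew size are hypothetical, and it assumes age is counted from the approval date.

```python
from datetime import date

def backlog_weeks(work_orders, weekly_labor_hours):
    """Outstanding labor hours divided by available maintenance hours per week."""
    total_hours = sum(hours for _, _, hours in work_orders)
    return total_hours / weekly_labor_hours

def aged_items(work_orders, today, max_age_days=60):
    """IDs of open work orders approved more than `max_age_days` ago."""
    return [
        wo_id
        for wo_id, approved_on, _ in work_orders
        if (today - approved_on).days > max_age_days
    ]

# Hypothetical open work orders: (id, approval date, estimated labor hours).
open_orders = [
    ("WO-101", date(2024, 3, 15), 40),
    ("WO-117", date(2024, 4, 20), 24),
    ("WO-130", date(2024, 5, 28), 56),
]

weeks = backlog_weeks(open_orders, weekly_labor_hours=2 * 40)  # two technicians
stale = aged_items(open_orders, today=date(2024, 6, 1))

print(f"Backlog: {weeks:.1f} weeks of labor")
print(f"Past 60 days: {stale}")
```

Here the backlog is 1.5 weeks, inside the healthy range, but one work order still needs an escalate, reschedule, or remove decision.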
PLANNED VERSUS REACTIVE MAINTENANCE RATIO
The ratio of planned maintenance hours to reactive maintenance hours is a direct measure of program maturity. Reactive maintenance is everything unscheduled: breakdown response, emergency repairs, unplanned parts runs. Planned maintenance is scheduled PM work, autonomous maintenance, and scheduled replacement of predictably wearing components.
A plant early in building a maintenance program typically runs 20 to 30 percent planned, 70 to 80 percent reactive. Most of the team's time is spent responding to failures. A plant with a strong PM program runs 60 to 70 percent planned and 30 to 40 percent reactive. World-class maintenance organizations run above 80 percent planned.
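The ratio itself is simple arithmetic on logged labor hours. A minimal sketch, with hypothetical hour totals for one month:

```python
def planned_share_pct(planned_hours, reactive_hours):
    """Planned maintenance hours as a percentage of all maintenance hours."""
    total = planned_hours + reactive_hours
    if total == 0:
        raise ValueError("no maintenance hours logged in the period")
    return 100.0 * planned_hours / total

# Hypothetical month: hours logged by the maintenance team.
planned = 176   # scheduled PMs, autonomous maintenance, scheduled replacements
reactive = 224  # breakdown response, emergency repairs, unplanned parts runs

share = planned_share_pct(planned, reactive)
print(f"{share:.0f}% planned, {100 - share:.0f}% reactive")
```

This sample month is 44 percent planned, which sits between the early-program range and the 60 to 70 percent of a strong PM program.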
The shift from reactive to planned is not automatic. It requires PM compliance to be high enough that scheduled work is actually running, and equipment reliability to be improving enough that reactive demand is declining. The two work together: higher PM compliance reduces breakdowns, which frees up technician time for more PM work, which further reduces breakdowns.
Track this ratio monthly. If it is not moving in the planned direction over six months of PM program operation, the PM tasks are not addressing the actual failure modes and the program needs a technical review.
BUILDING THE MAINTENANCE KPI DASHBOARD
These six KPIs belong on a one-page dashboard reviewed weekly by the maintenance lead and monthly by plant leadership. The format does not need to be complex.
| KPI | Target | This Month | Trend |
|---|---|---|---|
| PM Compliance | 90%+ | 84% | Improving |
| MTBF (critical 5) | Trending up | 142 hrs | Up from 108 |
| MTTR | Under 2 hrs avg | 2.4 hrs | Flat |
| Unplanned downtime % | Under 5% | 7.3% | Improving |
| Backlog (weeks of labor) | Under 2 weeks | 3.2 weeks | Growing |
| Planned vs reactive | 60%+ planned | 44% planned | Improving |
Mark any metric that misses its target in red. A shared spreadsheet that the maintenance lead fills in weekly is sufficient for most plants. The discipline of filling it in consistently is more valuable than the platform it lives on.
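If the spreadsheet is ever scripted, the red flagging could look something like this sketch. It uses the sample values from the table above. The row layout is hypothetical, and treating "trending up" as beating the previous period's MTBF is an assumption. Note that some targets are floors and others are ceilings, so each row carries its own comparison.

```python
import operator

# (KPI, this month's value, comparison that means "target met", target)
rows = [
    ("PM compliance %", 84, operator.ge, 90),
    ("MTBF hrs (critical 5)", 142, operator.gt, 108),  # assumed: beat last period
    ("MTTR hrs", 2.4, operator.lt, 2),
    ("Unplanned downtime %", 7.3, operator.lt, 5),
    ("Backlog weeks", 3.2, operator.lt, 2),
    ("Planned %", 44, operator.ge, 60),
]

statuses = {
    name: ("green" if meets(value, target) else "red")
    for name, value, meets, target in rows
}

for name, status in statuses.items():
    print(f"{name}: {status}")
```

With the sample numbers, every row is red except MTBF, even though four of the six trends are improving. That is why the dashboard shows trend alongside status.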
Post the dashboard on the maintenance area board and reference it in the daily production meeting when it is relevant. When OEE drops unexpectedly, the maintenance dashboard tells you whether the maintenance program is a contributing cause or the equipment is performing as expected given its current PM status.
The P7 downtime log and PM schedule templates in the Sharpen templates library give you the data capture structure these KPIs depend on. The downtime log template generates the unplanned downtime and reason-code data that feeds MTBF, MTTR, and unplanned downtime percentage. The PM schedule template is the tracking tool for PM compliance.
USING KPIS TO PRIORITIZE MAINTENANCE INVESTMENT
Once you are tracking these six metrics consistently, they become the basis for a rational maintenance investment conversation, not just a reporting exercise.
PM compliance below 80 percent is almost always a resource or scheduling problem. The conversation to have is: do we have enough technician hours to run the scheduled PMs, or are we chronically pulling them into reactive work? The answer determines whether the fix is staffing, scheduling discipline, or a reduction in PM scope to match available capacity.
MTBF that is flat after six months of PM compliance above 90 percent points to a PM design problem. The tasks are being done but the failures are not reducing. The right response is to review which failure modes are actually occurring and whether the current PM tasks would prevent them. Often the answer is no: the tasks were inherited from OEM manuals and have not been validated against the specific failure history of those machines.
MTTR that stays above target despite adequate parts availability usually points to a skills or documentation gap. The technician can find the failure but takes too long to resolve it. A maintenance history database showing how past failures were fixed cuts diagnostic time significantly. The work order and maintenance documentation guide covers how to structure that history in a way that is fast to use under time pressure.
The planned versus reactive ratio trend, more than any single month's number, tells you whether the program is actually changing the operation's underlying maintenance culture. A program that is moving consistently from 30 to 40 to 50 percent planned over six months is working. A program that has been flat at 35 percent for a year needs a structural review.
WHAT TO DO NEXT
These maintenance KPIs are the complement to the downtime cost analysis covered in our guide on how to calculate the true cost of manufacturing downtime. Downtime cost tells you what poor maintenance is costing in dollars. These KPIs tell you what the maintenance program is or is not doing to prevent it. Used together, the two give you both the burning platform and the measurement system to track improvement.
For a structured assessment of your plant's equipment and maintenance maturity, the free Sharpen diagnostic at /intake covers P7 (Equipment and Maintenance) across all four maturity stages in the 10 pillars framework: Stage 1 reactive, Stage 2 early PM, Stage 3 planned maintenance with metrics, Stage 4 predictive maintenance and high reliability. Most plants find their biggest maintenance leverage point in the first 20 minutes of the assessment.