Key Takeaways
  • What Is OCAP?
  • When Does OCAP Activate?
  • The 5-Step OCAP Process
  • Why Most OCAPs Are Slow — And How AI Fixes It
  • OCAP Metrics That Matter
Key Takeaway: An OCAP (Out of Control Action Plan) is the standardized procedure that semiconductor fabs follow when SPC charts detect an out-of-control (OOC) condition. A well-designed OCAP can reduce mean time to resolution from hours to minutes, and AI-assisted root cause recommendation can cut OCAP response time by as much as 60%.

What Is OCAP?

OCAP stands for Out of Control Action Plan. It is a predefined, step-by-step procedure that production engineers follow when a Statistical Process Control (SPC) chart signals an out-of-control (OOC) condition — meaning a process parameter has violated control limits or triggered a pattern rule (Western Electric rules, Nelson rules, etc.).

Every semiconductor fab maintains OCAPs for critical process steps. Without a structured OCAP, engineers waste time debating what to check first, leading to extended tool downtime and potential lot contamination.

When Does OCAP Activate?

OCAP is triggered when SPC monitoring detects any of the following OOC conditions:

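  • Point beyond control limits: A single measurement falls more than 3σ from the center line (the process mean)
  • Run rule violations: 7+ consecutive points on one side of the center line (the exact threshold varies by rule set: the Western Electric rules use 8 and the Nelson rules use 9)
  • Trend patterns: 6+ consecutive points steadily increasing or decreasing
  • Stratification: 15+ consecutive points within ±1σ of the center line (can indicate control limits that are too wide, mixed data streams within a subgroup, or measurement system issues)
  • Mixture patterns: Points consistently near the control limits with few near the center (the Nelson rules flag 8 consecutive points on both sides of the center line with none within ±1σ)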
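
As a rough illustration, the first three checks in the list above could be sketched in Python as follows. The function name ooc_signals, its parameters, and the default thresholds are assumptions made for this example and do not come from any particular SPC package. A production SPC system would implement the full Western Electric or Nelson rule sets and compute the center line and sigma from a qualified baseline.

```python
def ooc_signals(values, center, sigma, run_length=7, trend_length=6):
    """Return (index, rule) tuples for three simple SPC rule violations.

    center and sigma are assumed to come from an in-control baseline.
    run_length defaults to 7 to match the text; Western Electric uses 8
    and Nelson uses 9, so treat it as a site-specific setting.
    """
    signals = []
    for i, x in enumerate(values):
        # Single point more than 3 sigma from the center line.
        if abs(x - center) > 3 * sigma:
            signals.append((i, "beyond_3_sigma"))

        # run_length consecutive points on one side of the center line.
        if i + 1 >= run_length:
            window = values[i + 1 - run_length : i + 1]
            if all(v > center for v in window) or all(v < center for v in window):
                signals.append((i, "run_one_side"))

        # trend_length consecutive points strictly rising or strictly falling.
        if i + 1 >= trend_length:
            window = values[i + 1 - trend_length : i + 1]
            diffs = [b - a for a, b in zip(window, window[1:])]
            if all(d > 0 for d in diffs) or all(d < 0 for d in diffs):
                signals.append((i, "trend"))
    return signals


# Example: the last point sits 3.5 sigma above the center line.
print(ooc_signals([10.0, 10.1, 9.9, 10.0, 13.5], center=10.0, sigma=1.0))
# [(4, 'beyond_3_sigma')]
```

Each signal carries the index of the point that completed the pattern, which is the natural moment to raise the OOC alarm and start the OCAP clock.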

The 5-Step OCAP Process

Step 1: Alert Acknowledgment and Lot Hold

When SPC triggers an OOC alarm, the responsible engineer must acknowledge within the defined time window (typically 15-30 minutes). Affected lots are placed on hold to prevent potentially defective wafers from advancing to the next process step.
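
The acknowledgment window can be tracked with a very small helper. The sketch below is only illustrative: the function name ack_status, the status strings, and the 30-minute default are assumptions, and a real fab would take these from its MES or alarm management configuration.

```python
from datetime import datetime, timedelta


def ack_status(alarm_time, ack_time, now, window_minutes=30):
    """Classify an OOC alarm against its acknowledgment window.

    ack_time is None while nobody has acknowledged the alarm yet.
    """
    deadline = alarm_time + timedelta(minutes=window_minutes)
    if ack_time is not None:
        return "acknowledged_on_time" if ack_time <= deadline else "acknowledged_late"
    # Not acknowledged yet: either still inside the window or overdue.
    return "pending" if now <= deadline else "escalate"


alarm = datetime(2024, 1, 1, 8, 0)
print(ack_status(alarm, None, now=datetime(2024, 1, 1, 8, 45)))  # escalate
```

An "escalate" result would typically page the next person in the on-call chain, while the lot hold stays in place regardless of acknowledgment status.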

Step 2: Initial Assessment

The engineer performs a quick triage:

  • Is this a measurement error? (Re-measure if suspect)
  • Is it a known pattern? (Check for recent preventive maintenance (PM), repairs, or part replacements)
  • Is it an isolated point or part of a trend?
  • Which lots are potentially affected?

Step 3: Root Cause Analysis

This is the most time-consuming step. Engineers use fishbone diagrams (Ishikawa), 5-Why analysis, and equipment logs to identify the root cause. Common root causes include:

  • Equipment drift: Consumable wear, gas flow degradation, temperature controller drift
  • Material variation: Incoming wafer quality, chemical lot changes
  • Recipe errors: Unintended parameter changes, software updates
  • Environmental factors: Cleanroom temperature/humidity excursions
  • Human error: Incorrect wafer loading, wrong recipe selection

Step 4: Corrective Action and Verification

Once the root cause is identified, the corrective action is implemented and verified:

  • Adjust the affected process parameter, or repair or replace the component identified as the root cause
  • Run qualification wafers to confirm the fix
  • Review SPC data to verify the process is back in control
  • Release held lots if downstream metrology confirms they are within spec

Step 5: Documentation and SOP Update

Every OCAP event is documented with root cause, corrective action, and time-to-resolution. If the root cause reveals a gap in the existing OCAP, the procedure is updated to include the new failure mode.

Why Most OCAPs Are Slow — And How AI Fixes It

The bottleneck in traditional OCAP is Step 3 — root cause analysis. Engineers must manually correlate SPC data with equipment logs, maintenance records, and upstream process data. In a modern fab with 500+ tools, this can take hours.

AI-assisted OCAP accelerates root cause identification by:

  • Automated correlation: ML models correlate the OOC event with equipment sensor patterns and rank the most likely root causes within seconds
  • Historical pattern matching: Search past OCAP records for similar OOC signatures and their confirmed root causes
  • Predictive alerts: Detect subtle trends before they trigger OOC, enabling preventive action
  • Cross-tool analysis: Identify whether the issue is tool-specific or recipe-specific by comparing data across all tools running the same process
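
The cross-tool idea in the last bullet does not need a trained model to illustrate. The sketch below is a simple heuristic under stated assumptions: the function name classify_excursion, the 1.5σ shift threshold, the majority rule, and the tool IDs are all hypothetical, and it stands in for whatever comparison an AI-assisted system would actually perform.

```python
from statistics import mean


def classify_excursion(recent_by_tool, center, sigma, shift_threshold=1.5):
    """Guess whether an excursion is tool-specific or common to the process.

    recent_by_tool maps a tool ID to its recent measurements for one
    process step. A tool counts as shifted when its recent mean is more
    than shift_threshold * sigma away from the baseline center line.
    """
    shifted = sorted(
        tool
        for tool, vals in recent_by_tool.items()
        if abs(mean(vals) - center) > shift_threshold * sigma
    )
    if not shifted:
        return "no_shift", shifted
    # If most tools moved together, suspect the recipe or an upstream step.
    if len(shifted) > len(recent_by_tool) / 2:
        return "recipe_or_upstream", shifted
    return "tool_specific", shifted


data = {
    "ETCH01": [50.1, 49.9, 50.0],
    "ETCH02": [53.2, 53.0, 53.4],  # only this tool has drifted
    "ETCH03": [50.2, 49.8, 50.1],
}
print(classify_excursion(data, center=50.0, sigma=1.0))
# ('tool_specific', ['ETCH02'])
```

The result only narrows the search. A "tool_specific" verdict points the engineer at that tool's consumables and maintenance history, while "recipe_or_upstream" points at recipe changes, incoming material, or an earlier process step.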

OCAP Metrics That Matter

  • MTTR (Mean Time to Resolution): Average time from OOC alarm to verified fix. A typical benchmark is 2-4 hours; AI-assisted fabs can achieve under 30 minutes
  • OCAP closure rate: Percentage of OCAPs resolved within the target time window
  • Repeat OCAP rate: Share of OCAPs whose root cause has already triggered an earlier OCAP. If the same root cause keeps recurring, the corrective action was insufficient
  • Lots scrapped per OCAP: The ultimate cost metric — effective OCAPs minimize wafer loss
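
These four metrics can be computed directly from the OCAP records kept in Step 5. The following is a minimal sketch, assuming each record stores a root cause label, the time to resolution in minutes, and the number of lots scrapped. The OcapEvent class, the field names, and the 240-minute target are hypothetical, and the repeat rate here counts every recurrence of an already-seen root cause.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class OcapEvent:
    root_cause: str
    minutes_to_resolve: float
    lots_scrapped: int


def ocap_metrics(events, target_minutes=240):
    """Summarize a list of closed OCAP events into the four metrics."""
    n = len(events)
    if n == 0:
        raise ValueError("at least one OCAP event is required")
    cause_counts = Counter(e.root_cause for e in events)
    # Every occurrence of a root cause after its first counts as a repeat.
    repeats = sum(count - 1 for count in cause_counts.values())
    return {
        "mttr_minutes": sum(e.minutes_to_resolve for e in events) / n,
        "closure_rate": sum(e.minutes_to_resolve <= target_minutes for e in events) / n,
        "repeat_rate": repeats / n,
        "lots_scrapped_per_ocap": sum(e.lots_scrapped for e in events) / n,
    }


events = [
    OcapEvent("chamber_drift", 60, 0),
    OcapEvent("chamber_drift", 300, 2),
    OcapEvent("recipe_error", 120, 1),
    OcapEvent("gas_flow", 120, 1),
]
print(ocap_metrics(events))
# {'mttr_minutes': 150.0, 'closure_rate': 0.75, 'repeat_rate': 0.25, 'lots_scrapped_per_ocap': 1.0}
```

Tracking the metrics per process step or per tool group, rather than fab-wide, makes it easier to see which OCAPs need their procedure revised.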

Building a Better OCAP System

Modern fabs are moving from static OCAP flowcharts to dynamic, AI-powered OCAP systems that learn from every event. The goal is not just faster response, but fewer OOC events in the first place — through tighter process control, better equipment maintenance scheduling, and continuous optimization of process windows.

MST Technical Team
Written by the engineering team at Moore Solution Technology (MST), a Singapore-headquartered AI infrastructure company. Our team includes semiconductor process engineers, AI/ML researchers, and equipment automation specialists with 50+ years of combined fab experience across Singapore, China, Taiwan, and the US.