Data science for semiconductor process reliability
Link to Kaggle notebook for this project: https://www.kaggle.com/code/adrianacosta0/data-science-for-semiconductor-process-reliability
I set up this mini-tutorial to simulate a chemical vapor deposition (CVD) SiO2 growth process on wafers, focusing on process reliability and chamber-to-chamber variation. The key steps are:
1) Process Simulation: Created a simplified model of deposition rate as a function of process parameters such as pressure, temperature, and gas flow rates.
2) Data Generation: Simulated data for 10 chambers processing 10 wafers each, including realistic hardware/sensor variations and an intentional outlier chamber.
3) Visualization: Plotted deposition rates across wafers, revealing chamber-to-chamber variations.
4) Principal Component Analysis (PCA): Applied PCA to visualize chamber clustering based on process conditions, effectively identifying the outlier chamber.
5) Machine Learning for Variation Source: Implemented a Random Forest classifier to identify key input parameters explaining the variation between normal and outlier chambers.
6) Drift Analysis: Introduced an artificial drift in deposition rate and analyzed its impact using PCA and sensor data visualization.
7) Predictive Modeling: Used a Random Forest regressor to predict deposition rates and identify the most influential process parameters.
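
As a rough illustration of steps 1, 2, 4, and 5 above, here is a minimal Python sketch using NumPy and scikit-learn. It is not the notebook's actual code. The rate formula, the units, the numeric constants, the outlier chamber index, and the `heater_power` sensor are all assumptions made up for this example; only the overall workflow (simulate 10 chambers x 10 wafers, apply PCA, fit a Random Forest classifier) comes from the description above.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler

# Feature names and the outlier chamber index are hypothetical choices.
FEATURES = ["pressure", "temperature", "gas_flow", "heater_power"]
N_CHAMBERS, N_WAFERS, OUTLIER = 10, 10, 7


def deposition_rate(pressure, temperature, gas_flow):
    """Toy Arrhenius-like rate model (assumed form, arbitrary units)."""
    return 1e3 * gas_flow * np.sqrt(pressure) * np.exp(-2000.0 / temperature)


def simulate(seed=0):
    """Return (chamber ids, sensor matrix, deposition rates) for all wafers."""
    rng = np.random.default_rng(seed)
    rows = []
    for chamber in range(N_CHAMBERS):
        # Small per-chamber hardware offset; the outlier chamber runs hot.
        temp_offset = rng.normal(0.0, 1.0) + (25.0 if chamber == OUTLIER else 0.0)
        for _ in range(N_WAFERS):
            p = rng.normal(2.0, 0.02)                 # pressure (assumed Torr)
            t = rng.normal(673.0 + temp_offset, 1.0)  # temperature (assumed K)
            f = rng.normal(100.0, 1.0)                # gas flow (assumed sccm)
            # Hypothetical heater sensor that tracks temperature, plus noise.
            h = 1500.0 + 5.0 * (t - 673.0) + rng.normal(0.0, 5.0)
            rows.append((chamber, p, t, f, h, deposition_rate(p, t, f)))
    data = np.array(rows)
    return data[:, 0].astype(int), data[:, 1:5], data[:, 5]


chambers, X, rate = simulate()

# PCA on standardized sensor data, then one centroid per chamber in PC space.
scores = PCA(n_components=2).fit_transform(StandardScaler().fit_transform(X))
centroids = np.array([scores[chambers == c].mean(axis=0) for c in range(N_CHAMBERS)])

# Flag the chamber whose centroid lies farthest from the median centroid.
dist = np.linalg.norm(centroids - np.median(centroids, axis=0), axis=1)
flagged = int(np.argmax(dist))

# Random Forest: which inputs separate the outlier chamber from the rest?
labels = (chambers == OUTLIER).astype(int)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, labels)
top_feature = FEATURES[int(np.argmax(clf.feature_importances_))]

print("flagged chamber:", flagged)
print("most important feature:", top_feature)
```

Because the hypothetical heater sensor is strongly correlated with temperature, the two share the first principal component, and the Random Forest importance is split between them. Either one may come out on top.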