What Is Statistical Process Control?

The third leg of the DataOps stool is statistical process control. So what is it?

I went to undergrad around the turn of the millennium. At that time, lean manufacturing was all the rage. We were all learning about just-in-time manufacturing and how Japanese auto manufacturers operate, and folks were going hard in the paint on getting their belts in Six Sigma.

This is where statistical process control comes from. It’s the idea that you measure a process and model it with statistics, then keep measuring to determine whether the process is operating within normal parameters.
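To make that concrete, here is a minimal sketch of the classic control-limit check: model a baseline with its mean and standard deviation, then flag any new measurement that lands more than three standard deviations from the mean. The metric (daily row counts for a load job), the numbers, and the function names are all made up for illustration, and the three-sigma threshold is just the conventional starting point, not a rule.

```python
from statistics import mean, stdev


def control_limits(baseline, sigmas=3.0):
    """Return (lower, upper) limits: baseline mean +/- sigmas * sample std dev."""
    center = mean(baseline)
    spread = stdev(baseline)
    return center - sigmas * spread, center + sigmas * spread


def out_of_control(value, limits):
    """True when a measurement falls outside the control limits."""
    lower, upper = limits
    return value < lower or value > upper


# Hypothetical baseline: row counts from a week of healthy loads.
baseline_row_counts = [1000, 1020, 980, 1010, 990, 1005, 995]
limits = control_limits(baseline_row_counts)

# Hypothetical new measurements; keep measuring and compare against the model.
new_counts = [1003, 640, 1012]
flagged = [count for count in new_counts if out_of_control(count, limits)]
print(flagged)  # [640]
```

The 640-row load is the one that should wake somebody up. In practice you would probably recompute the baseline on a rolling window so the limits follow slow, legitimate changes in the process.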

Your data ecosystem is a series of processes. Those processes have to be monitored. We live in a messy universe where nothing runs with 100% uptime. Your EDW is going to fall over. It is going to blow up. At a minimum, especially when it’s new and still a baby, it’s gonna barf on your shoulder when you try to burp it. You need to know when your EDW isn’t doing what it’s supposed to be doing so you can take corrective action. Ideally, the system itself will take corrective action, but that requires some knowledge of calculus and we’re not there yet.
