Key Features
MSA is a key element of the Quality Management System (QMS) and the Six Sigma methodology. It is also a mandatory element of the Production Part Approval Process (PPAP) in the automotive industry.
As mentioned before, every measured value has two components: one is the true value (part-to-part variation), which is determined by our manufacturing process; the other is the variation added by our measurement system.
Sources of variation (Source: qMindset.com; AIAG MSA Manual)
During an MSA study we focus on the influencing factors of the measurement system. A proper MSA helps us
to:
- Evaluate our measuring system and the extent to which it influences our data set.
- Set up proper measuring systems before serial production.
- Maintain proper measuring systems during production.
- Identify the intervention point in the measuring system (in case of unacceptable variation).
To perform a proper MSA study, we have to carry out the following actions:
- Identify the correct measurement method and approach, so that the technology is fit for purpose.
- Assess the device, the operators, the procedures and the measurement interactions.
- Calculate the measurement uncertainty of the individual devices and of the complete test system (e.g. GR&R).
- Study the accuracy (bias, linearity, stability) and the precision (repeatability, reproducibility, sensitivity, etc.) of the system.
- Pay attention to handling, environment and objectivity.
Several basic terms are used in the context of measurement, such as:
- Resolution: the smallest scale unit of a measuring device (gage). It is the smallest difference we can detect, so selecting the right resolution is indispensable. Hint: use the 10 to 1 rule, which is commonly stated as: the resolution should be at most one tenth of the tolerance (or of the process variation). In practice the gage should read one digit more than the values you report: if you are measuring 10.23 mm, your measurement system must have a resolution of microns, so that it can return values such as 10.234 mm.
- Reference value: the value defined in the specification (e.g. 10.0 mm +/- 0.2 mm).
- True value: the real value of the produced part (measurement process target). Important: we never know the true value. We would only know it if our measurement were perfectly accurate and precise, with no measurement uncertainty.
Effects of variation sources on the bell curve (Source: qMindset.com)
An MSA study consists of calculations that describe the characteristics of the measuring system mathematically. When we analyse our measurement system, we aim to determine the location variation (accuracy) and the width variation (precision) of the measured distribution, and the overall measurement system variation. The more accurate and precise the system is, the more trustworthy the measured values are.
Accuracy and precision (Source: qMindset.com)
Variation elements of MSA
| Variation | MSA Variable / Feature | Meaning | Comment |
| --- | --- | --- | --- |
| Location variation (accuracy): "closeness to the true value" | Bias | Difference between the average of measured values and the reference value. | |
| Location variation (accuracy) | Linearity | The change in Bias throughout the operating range (size). | |
| Location variation (accuracy) | Stability (drift) | The change in Bias over time. | |
| Width variation (precision): "closeness of repeated readings" | Repeatability (within appraiser) | Equipment variation (EV) | GR&R (combined variation of the two "R"s) |
| Width variation (precision) | Reproducibility (between appraisers) | Appraiser variation (AV) | GR&R (combined variation of the two "R"s) |
| Width variation (precision) | Sensitivity | Smallest input that results in a detectable output signal. | |
| Width variation (precision) | Consistency | The change of Repeatability over time. | |
| Width variation (precision) | Uniformity | The change of Repeatability throughout the operating range (size). | |
| Measurement system variation | Capability | Variation coming from Repeatability and Reproducibility, plus Bias (short-term). | Note: type-1 study's Capability contains Bias and Repeatability |
| Measurement system variation | Performance | Represents the combined effect of Capability, Stability and Consistency. | |
| Measurement system variation | Uncertainty | Range about the measured value that may contain the true value with a given confidence level. | |
Bias: the difference between the average of the measured values of a characteristic and its true value (reference value). Zero bias means that the measured average is exactly the same as the reference value. In real life zero bias is unreachable, but the bias of our measurement system has to be as low as possible, i.e. not statistically significantly different from zero. Bias always comes from a systematic error of the measurement system (e.g. worn gage or calibration problem, worn master samples, incorrect measurement procedure, improper use, different temperature or other environmental conditions, etc.).
Minitab: "Bias examines the difference between the observed average measurement and a reference value. Bias
indicates how accurate the gage is when compared to a reference value".
Visualization of bias (Source: qMindset.com; AIAG MSA Manual)
Example: your device measures the height of one master part multiple times, and the readings are higher than the real height. In the following case the bias is 0.28 mm, so the system overestimates.
Calculation of Bias
| Measurement | Reference value (mm) | Measured value (mm) | Difference (mm) |
| --- | --- | --- | --- |
| Trial 1 | 10.0 | 10.2 | +0.2 |
| Trial 2 | 10.0 | 10.1 | +0.1 |
| Trial 3 | 10.0 | 10.4 | +0.4 |
| Trial 4 | 10.0 | 10.5 | +0.5 |
| Trial 5 | 10.0 | 10.2 | +0.2 |
| Average | 10.0 | 10.28 | 0.28 |
Bias = 10.28 – 10.00 = 0.28 mm
Formulae:
Bias = Average of measured values – Reference value
Bias = x̄ – Reference value
The sign shows the direction of the error (positive: the system overestimates).
Comparison of the bias with the tolerance range:
Bias (%) = | Bias | / Feature tolerance * 100, or Bias (%) = | Bias | / T * 100
Another commonly used calculation compares the bias with the process variation:
Bias (%) = | Bias | / Process variation * 100, or Bias (%) = | Bias | / (6 * sigma) * 100
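The bias formulas can be checked with a few lines of Python. This is only a sketch: the readings come from the bias table, while the tolerance width of 0.4 mm (10.0 +/- 0.2 mm) is an assumption made for illustration. The bias is kept signed here so that the direction of the error stays visible.

```python
from statistics import mean

# Readings of one master part, taken from the bias table (mm).
reference = 10.0
readings = [10.2, 10.1, 10.4, 10.5, 10.2]

# Assumption for illustration: total tolerance width T = 0.4 mm (10.0 +/- 0.2 mm).
tolerance = 0.4

bias = mean(readings) - reference             # signed: positive = overestimates
bias_pct_tol = abs(bias) / tolerance * 100    # bias as a percentage of the tolerance

print(f"bias = {bias:+.2f} mm")
print(f"bias = {bias_pct_tol:.0f} % of the tolerance")
```

With the assumed tolerance, a bias of 0.28 mm would use up a large share of the tolerance range, which is why the bias is usually judged relative to T or to the process variation rather than in absolute terms.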
Remark: a very good way to evaluate whether the bias is statistically acceptable is to run a one-sample t-test on it (H0: the bias is zero, i.e. the measured mean equals the reference value; Ha: the measured mean differs from the reference value).
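A minimal sketch of such a one-sample t-test, using the five readings of the bias example. To stay within the standard library, the test statistic is computed by hand and compared with the two-sided critical value for alpha = 0.05 and 4 degrees of freedom (2.776, from a standard t-table); a statistics package would normally report a p-value instead.

```python
from math import sqrt
from statistics import mean, stdev

reference = 10.0
readings = [10.2, 10.1, 10.4, 10.5, 10.2]   # from the bias example (mm)

n = len(readings)
bias = mean(readings) - reference
std_error = stdev(readings) / sqrt(n)       # standard error of the mean
t_stat = bias / std_error

# Two-sided critical value of Student's t for alpha = 0.05 and n - 1 = 4
# degrees of freedom, taken from a standard t-table.
T_CRIT = 2.776

bias_is_significant = abs(t_stat) > T_CRIT
print(f"t = {t_stat:.2f}, significant bias: {bias_is_significant}")
```

For these readings the statistic exceeds the critical value, so H0 is rejected: the bias of 0.28 mm is statistically different from zero.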
Linearity: the change in bias over the measurement range, or in other words, how uniform the bias of the measurement system is across that range. If the bias changes across the measurement scale or size, it is not constant, i.e. the linearity is poor. In everyday life zero linearity (a perfectly constant bias) is unreachable. Possible causes of poor linearity: measurement method, calibration, worn master samples, temperature, humidity, vibration, dust, corrosion, etc.
Minitab: "Linearity examines how accurate your measurements are through the expected range of the
measurements. Linearity indicates whether the gage has the same accuracy across all reference values".
Visualization of linearity (Source: qMindset.com; AIAG MSA Manual)
Example: you measure various master parts with different reference values (e.g. 5.0 mm, 10.0 mm and 15.0 mm) with the same device. You find that the larger the measured size, the bigger the bias gets; an inconsistent bias means high linearity. Aim: low linearity (nearly constant bias) with the smallest possible bias.
Linearity table
| Measurement | Reference value (mm) | Measured value (mm) | Difference (mm) |
| --- | --- | --- | --- |
| Trial 1 | 5.0 | 4.9 | -0.1 |
| Trial 2 | 5.0 | 5.1 | +0.1 |
| Trial 3 | 5.0 | 5.0 | 0.0 |
| Bias at 5.0 mm | | | 0 mm |
| Trial 4 | 10.0 | 10.2 | +0.2 |
| Trial 5 | 10.0 | 10.1 | +0.1 |
| Trial 6 | 10.0 | 9.8 | -0.2 |
| Bias at 10.0 mm | | | 0.03 mm |
| Trial 7 | 15.0 | 15.3 | +0.3 |
| Trial 8 | 15.0 | 15.2 | +0.2 |
| Trial 9 | 15.0 | 15.2 | +0.2 |
| Bias at 15.0 mm | | | 0.23 mm |
Linearity is calculated from the individual bias values (measurement results against the various reference values) across the measurement scale. Statistical software can easily calculate the best-fit bias line by regression, using the following formula:
y = ax + b
(where "y" is the bias, "a" is the slope of the line, "x" is the reference value, and "b" is the intercept)
Both the slope and the intercept should be statistically zero, which means low and consistent bias.
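As an illustration, the best-fit bias line for the nine trials of the linearity table can be computed with ordinary least squares. The sketch below only estimates the slope "a" and the intercept "b"; deciding whether they are statistically zero needs an additional significance test, which is not shown here.

```python
from statistics import mean

# (reference value, measured value) pairs from the linearity table, in mm.
data = [(5.0, 4.9), (5.0, 5.1), (5.0, 5.0),
        (10.0, 10.2), (10.0, 10.1), (10.0, 9.8),
        (15.0, 15.3), (15.0, 15.2), (15.0, 15.2)]

x = [ref for ref, _ in data]
y = [measured - ref for ref, measured in data]   # individual bias values

# Ordinary least squares for y = a * x + b.
x_mean, y_mean = mean(x), mean(y)
sxx = sum((xi - x_mean) ** 2 for xi in x)
sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))

a = sxy / sxx               # slope: change of bias per mm of reference size
b = y_mean - a * x_mean     # intercept: extrapolated bias at reference 0

print(f"bias = {a:.4f} * reference {b:+.4f}")
```

A positive slope confirms what the table suggests: the bias grows with the measured size.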
Stability: the change in bias over time. Aim: a bias that stays constant over time. In real life absolute stability is unreachable, but a statistically good stability is what we aim for.
Minitab: "Measurement stability is the change in bias over time. It represents the total variation in
measurements of the same part measured over time. This variation over time is called drift".
Visualization of stability (Source: qMindset.com; AIAG MSA Manual)
Example: stability can be examined with SPC control charts. In this case each operator measures the same parts regularly, in the same way and with the same device, over a longer period of time. The control chart makes the "drift" visible, which indicates stability issues. Collect master samples with reference values from the complete expected measuring range (e.g. if you expect a characteristic to be between 10.0 mm and 20.0 mm, use master parts that have this characteristic at 10.0 mm, 15.0 mm and 20.0 mm). Measure each master part five to ten times per session (you define the number of measurements, a.k.a. the subgroup size, and the time-frame as well), and repeat the sessions periodically. The measurements of one session give the mean and the range of the subgroup. Then plot the subgroup means on an X-bar chart and the subgroup ranges on an R-chart, which will show trends and special cause effects (i.e. how high your bias is, how far your means are from the given reference).
X-Bar / R-Chart for stability (Source: qMindset.com; AIAG MSA Manual)
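The control limits of such an X-bar / R-chart can be sketched as follows. The readings are hypothetical (one master part with a reference of 10.00 mm, four sessions of five measurements), and four subgroups are far too few for real control limits; the constants A2, D3 and D4 are the standard Shewhart values for a subgroup size of 5.

```python
from statistics import mean

# Hypothetical data: one master part (reference 10.00 mm) measured five
# times in each of four periodic sessions.
subgroups = [
    [10.01, 9.99, 10.02, 10.00, 9.98],
    [10.00, 10.02, 10.01, 9.99, 10.00],
    [10.02, 10.01, 10.03, 10.00, 10.02],
    [10.03, 10.02, 10.04, 10.01, 10.03],
]

# Standard Shewhart control chart constants for subgroup size 5.
A2, D3, D4 = 0.577, 0.0, 2.114

means = [mean(s) for s in subgroups]            # points of the X-bar chart
ranges = [max(s) - min(s) for s in subgroups]   # points of the R-chart
x_bar_bar = mean(means)                         # centre line of the X-bar chart
r_bar = mean(ranges)                            # centre line of the R-chart

ucl_x, lcl_x = x_bar_bar + A2 * r_bar, x_bar_bar - A2 * r_bar
ucl_r, lcl_r = D4 * r_bar, D3 * r_bar

# Sessions whose mean falls outside the X-bar control limits.
out_of_control = [i for i, m in enumerate(means, start=1)
                  if not lcl_x <= m <= ucl_x]

print(f"X-bar chart: LCL={lcl_x:.4f}  CL={x_bar_bar:.4f}  UCL={ucl_x:.4f}")
print(f"R-chart:     LCL={lcl_r:.4f}  CL={r_bar:.4f}  UCL={ucl_r:.4f}")
print("out-of-control sessions:", out_of_control)
```

In this made-up data set the subgroup means rise steadily from session to session. No point is outside the limits yet, but such a run is exactly the kind of drift a stability chart is meant to reveal.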
Repeatability: the variation that comes from the measuring device itself as an influencing factor of the measurement results. It is commonly referred to as Equipment Variation (EV) or "Within appraiser variability". In this case the same operator performs the measurements repeatedly with the same device in short-term trials. Possible causes of poor repeatability: improper method, form or surface of samples, worn measurement equipment, incorrect operator technique, temperature, vibration, humidity and other environmental conditions.
Minitab: "The ability of an operator to consistently repeat the same measurement of the same part, using the
same gage, under the same conditions."
Visualization of repeatability (Source: qMindset.com; AIAG MSA Manual)
Type-1 study: capability of the measurement system, assessed through the "short-term" Repeatability of the measurement system (the bias is also included in the Cgk calculation). This study type is not contained in the AIAG MSA manual, but it is contained in VDA 5 and has become widespread among automotive companies in recent decades.
To assess our measurement equipment, we need to evaluate its capability using the following capability indices: Cg and Cgk. During the study, one appraiser measures one reference standard (etalon) repeatedly (usually 20 to 50 times) under the same conditions. The calculation formulas are as follows:
Measurement capability index Cg, used for the assessment of repeatability:
Calculation of Cg (Source: qMindset.com; AIAG MSA Manual)
Cg = (0.2 * T) / (n * Sg)
T = feature tolerance, n = the number of standard deviations (usually 6, giving the full spread), and Sg = the standard deviation of the measured results. Remark: the numerator uses a percentage of the feature tolerance; the multiplier of T (0.2 above) can vary with the strictness of the analyst (usual values are 0.1, 0.15, 0.2, etc.).
Example:
Cg calculation logic
| Variable / Feature | Value |
| --- | --- |
| Xref (reference value) | 40.00 mm |
| USL (upper specification limit) | 40.20 mm |
| LSL (lower specification limit) | 39.80 mm |
| T (tolerance range = USL – LSL) | 0.40 mm |
| n (number of standard deviations, used as a multiplier) | 6 |
| Sg (standard deviation of measured values) | 0.004 mm |
| Cg | Cg = (0.2 * T) / (n * Sg) = (0.2 * 0.4) / (6 * 0.004) = 3.33 |
| Result | The measurement device is acceptable, as the result is over 1.33 (it is able to reproduce the measurement). |
Measurement capability index Cgk, used for the combined assessment of repeatability and accuracy (bias):
Calculation of Cgk (Source: qMindset.com; AIAG MSA Manual)
Cgk = (0.1 * T - |x̄ - Xref|) / (n * Sg)
T = feature tolerance, x̄ = the mean of measured values, Xref = the reference value of the part, n = the number of standard deviations (usually 3, giving half the spread), and Sg = the standard deviation of the measured results. Remark: the numerator uses a percentage of the feature tolerance; the multiplier of T (0.1 above) can vary with the strictness of the analyst (usual values are 0.1, 0.15, 0.2, etc.). In addition, the percentage multiplier of Cgk is usually half of the Cg multiplier.
Example:
Cgk calculation logic
| Variable / Feature | Value |
| --- | --- |
| Xref (reference value) | 40.00 mm |
| USL (upper specification limit) | 40.20 mm |
| LSL (lower specification limit) | 39.80 mm |
| T (tolerance range = USL – LSL) | 0.40 mm |
| x̄ (mean of measured values) | 40.002 mm |
| n (number of standard deviations, used as a multiplier) | 3 |
| Sg (standard deviation of measured values) | 0.004 mm |
| Cgk | Cgk = (0.1 * T - \|x̄ - Xref\|) / (n * Sg) = (0.1 * 0.4 - \|40.002 - 40.0\|) / (3 * 0.004) = (0.04 - 0.002) / 0.012 = 0.038 / 0.012 = 3.17 |
| Result | The measurement device is acceptable, as the result is over 1.33 (it is able to reproduce the measurement with a low bias). |
The Cgk value will be higher (better) when:
- The standard deviation (Sg) of the measurement is low, i.e. the measurement variation is low.
- The measurement bias is low, in other words the mean of the measured values is close to the reference value.
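Both worked examples can be reproduced with two small helper functions. The function names `cg` and `cgk` and their parameter names are chosen here for illustration only; the tolerance multipliers (0.2 and 0.1) and the sigma multipliers (6 and 3) are the ones used in the examples and may differ in a company-specific procedure.

```python
def cg(tolerance, s_g, k=0.2, n_sigma=6):
    """Cg: share k of the tolerance relative to n_sigma gage standard deviations."""
    return (k * tolerance) / (n_sigma * s_g)


def cgk(tolerance, s_g, x_mean, x_ref, k=0.1, n_sigma=3):
    """Cgk: like Cg, but the numerator is reduced by the absolute bias."""
    return (k * tolerance - abs(x_mean - x_ref)) / (n_sigma * s_g)


# Values from the worked examples (mm).
tolerance = 40.20 - 39.80          # USL - LSL
cg_value = cg(tolerance, s_g=0.004)
cgk_value = cgk(tolerance, s_g=0.004, x_mean=40.002, x_ref=40.00)

print(f"Cg  = {cg_value:.2f}")
print(f"Cgk = {cgk_value:.2f}")
print("acceptable:", cg_value > 1.33 and cgk_value > 1.33)
```

Because the bias is subtracted in the numerator, Cgk can never exceed the value it would have with zero bias, and it drops quickly once the bias approaches the allowed share of the tolerance.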
Reproducibility: describes the variation that arises due to the operators and the interaction between the
operators and the same device. It is commonly referred to as Appraiser Variation (AV), "Between-system variation", or "Between-conditions
variation". In this case different operators measure the identical characteristic of the same part with the same device. Possible causes of
poor reproducibility: appraiser training, technique, skill, observation error, environment, design of measurement instrument, etc.
Minitab: "The ability of a gage, used by multiple operators, to consistently reproduce the same measurement
of the same part, under the same conditions."
Visualization of reproducibility (Source: qMindset.com; AIAG MSA Manual)
Type-2 study: Gage Repeatability and Reproducibility (GR&R) with the Average and Range Method. The GR&R study is a key element of the AIAG MSA manual. However, the ANOVA method has become the preferred approach, as it can distinguish all contributing factors, and its computational requirements are no longer a problem in the era of computers (the Average and Range method could be done on paper).
A widespread and major part of MSA is the Gage Repeatability and Reproducibility (GR&R) study. It is a method for analysing measurement systems that focuses on the repeatability and reproducibility of the measurement system and on its contribution to the overall variation. GR&R is also a statistical tool that quantifies the variation originating from the device itself and from the appraisers who use it. The AIAG MSA manual has been published for the automotive industry and contains comprehensive guidance on performing MSA studies, including GR&R. During the study, several appraisers measure several parts, each more than once (e.g. 3 appraisers measure 10 parts, 3 times each).
As we can see, repeatability analysis mainly focuses on the measuring device, while reproducibility covers the appraisers and the appraiser-device interactions. Measuring a single part is not enough for a sound evaluation. In practice, when we conduct a GR&R study, we use several (usually n >= 10) parts from the expected range, so that they represent the true variation of the manufacturing process.
To calculate the combined gage variation (repeatability and reproducibility), we need to perform a Gage Repeatability and Reproducibility (GR&R) analysis, as follows.
Visualization of GRR (Source: qMindset.com; AIAG MSA Manual)
Example: we conduct the GR&R study of a device with 3 appraisers measuring 10 different parts, each part 3 times. To summarize the results properly, use a data collection sheet and fill in your data set (in this case 3 x 3 x 10 = 90 measurement results).
GRR data collection sheet (Source: qMindset.com; AIAG MSA Manual)
Calculation of Repeatability and Reproducibility: the first section of the study calculates the basic variables (averages, ranges, etc.) that the further calculations build on.
Step 1: for each appraiser, calculate the average of the three trials for each measured part (see rows 4, 9 and 14). Then calculate the range of the same three trials by subtracting the lowest value from the highest value (see rows 5, 10 and 15). Now you have 6 x 10 values.
Step 2: from these 6 x 10 values (in rows 4, 5, 9, 10, 14 and 15), calculate for each appraiser the average of the ranges and the average of the averages (see the average column on the right side of the table). Now you have: R̄a, R̄b, R̄c and X̄a, X̄b, X̄c.
We will use these 6 values for further calculations.
Step 3: Calculate the overall average of all ranges: R̿ = (R̄a + R̄b + R̄c) / 3.
Step 4: Calculate the difference between the largest and the smallest appraiser average:
X̄ Diff = (max X̄) - (min X̄)
Step 5: For each part, sum the measured values of all trials and all appraisers, and divide the sum by the number of measurements (in this case 9). Write the results into the cells of row 16. Select the highest and the lowest part average and subtract the lowest from the highest. The result is Rp, the range of the part averages.
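Steps 1 to 5 can be sketched in Python as follows. The readings are hypothetical, and only five parts are used instead of ten to keep the example short; the layout `data[appraiser][trial][part]` and all variable names are assumptions of this sketch, not part of the data collection sheet.

```python
from statistics import mean

# Hypothetical readings (mm): data[appraiser][trial][part].
# Five parts instead of ten, only to keep the sketch short.
data = {
    "A": [[10.1, 10.3, 9.9, 10.2, 10.0],
          [10.2, 10.3, 9.8, 10.2, 10.1],
          [10.1, 10.2, 9.9, 10.3, 10.0]],
    "B": [[10.2, 10.4, 10.0, 10.3, 10.1],
          [10.2, 10.3, 9.9, 10.3, 10.1],
          [10.3, 10.4, 10.0, 10.2, 10.2]],
    "C": [[10.0, 10.2, 9.8, 10.1, 9.9],
          [10.1, 10.3, 9.9, 10.2, 10.0],
          [10.0, 10.2, 9.8, 10.2, 10.0]],
}
n_parts = 5

# Step 1: per appraiser, average and range of the trials for each part.
avgs, ranges = {}, {}
for appraiser, trials in data.items():
    per_part = list(zip(*trials))      # regroup: one tuple of trials per part
    avgs[appraiser] = [mean(p) for p in per_part]
    ranges[appraiser] = [max(p) - min(p) for p in per_part]

# Step 2: average range and average of averages for each appraiser.
r_bar = {a: mean(r) for a, r in ranges.items()}
x_bar = {a: mean(v) for a, v in avgs.items()}

# Step 3: overall average range.
r_double_bar = mean(list(r_bar.values()))

# Step 4: difference between the largest and smallest appraiser average.
x_diff = max(x_bar.values()) - min(x_bar.values())

# Step 5: average of all readings per part, then the range of these averages.
part_avgs = [mean([t[i] for trials in data.values() for t in trials])
             for i in range(n_parts)]
r_p = max(part_avgs) - min(part_avgs)

print(f"R-double-bar = {r_double_bar:.4f}, X-bar diff = {x_diff:.4f}, Rp = {r_p:.4f}")
```

The three results (average range, appraiser average difference and part range) are the only inputs the remaining steps need.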
In the second section of the study, we determine the EV (Equipment Variation), the AV (Appraiser Variation), the GRR, the PV (Part Variation) and the TV (Total Variation) from the previously calculated variables.
GRR report sheet (Source: qMindset.com; AIAG MSA Manual)
Step 6: Calculate the Repeatability (Equipment Variation) by multiplying the average of all ranges (R̿) by a pre-defined constant K1:
EV = R̿ * K1
In our case the K1 value is 0.5908, as the appraisers perform 3 trials on each part.
Step 7: Calculate the Reproducibility (Appraiser Variation) with the following formula:
AV = sqrt( (X̄ Diff * K2)² - EV² / (n * r) )
where n = the number of parts, r = the number of trials, and K2 is a constant that depends on the number of appraisers. If the value under the square root is negative, AV is taken as zero.
In our case the K2 value is 0.5231, as we conduct the study with 3 appraisers.
Step 8: Using the EV and AV numbers, we can calculate the combined variation of the measurement system (Repeatability and Reproducibility, or GRR):
GRR = sqrt( EV² + AV² )
Step 9: To calculate PV (Part Variation or σP), we use the Rp number calculated in step 5 and a constant K3, which depends on the number of parts:
PV = Rp * K3
Step 10: Having both the GRR (variation of the measurement system) and PV (variation of the parts), we are able to calculate the total variation:
TV = sqrt( GRR² + PV² )
Now we have all major variables: the variation of the measurement system, the part variation and the total variation. In the third section of the study, we check the contribution of each variation to the total variation. It is calculated as a percentage and gives us the numbers we can use for decision making (either to accept or to reject the measurement system).
Step 11: calculate the ratio of EV, AV, GRR and PV to the Total Variation (TV) as %EV, %AV, %GRR and %PV. Remark: the ratios do not add up to 100%, because they are ratios of standard deviations, and only variances are additive.
Variation ratios (component of variation vs total variation)
| Ratio | Formulae |
| --- | --- |
| %EV | 100 * ( EV / TV ) |
| %AV | 100 * ( AV / TV ) |
| %GRR | 100 * ( GRR / TV ) |
| %PV | 100 * ( PV / TV ) |
| %GRRtol | 100 * ( GRR / Tolerance ) |
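Steps 6 to 11 of the Average and Range method can be put together in one function. This is a sketch under stated assumptions: K1 and K2 are the values quoted in the text (3 trials, 3 appraisers), K3 = 0.3146 is assumed here as the value for 10 parts from the constants table of the AIAG GRR report form, and the three input values are hypothetical results of steps 3 to 5. The function name `grr_summary` is made up for this example.

```python
from math import sqrt

K1 = 0.5908   # 3 trials (quoted in the text)
K2 = 0.5231   # 3 appraisers (quoted in the text)
K3 = 0.3146   # assumption: value for 10 parts from the AIAG constants table


def grr_summary(r_double_bar, x_diff, r_p, n_parts=10, n_trials=3, tolerance=None):
    """Average and Range method: EV, AV, GRR, PV, TV and their ratios to TV."""
    ev = r_double_bar * K1
    av_squared = (x_diff * K2) ** 2 - ev ** 2 / (n_parts * n_trials)
    av = sqrt(av_squared) if av_squared > 0 else 0.0   # negative radicand -> AV = 0
    grr = sqrt(ev ** 2 + av ** 2)
    pv = r_p * K3
    tv = sqrt(grr ** 2 + pv ** 2)

    result = {"EV": ev, "AV": av, "GRR": grr, "PV": pv, "TV": tv}
    for key in ("EV", "AV", "GRR", "PV"):
        result[f"%{key}"] = 100 * result[key] / tv
    if tolerance is not None:
        result["%GRRtol"] = 100 * grr / tolerance
    return result


# Hypothetical results of steps 3 to 5 (mm).
summary = grr_summary(r_double_bar=0.05, x_diff=0.03, r_p=0.60)
for name, value in summary.items():
    print(f"{name:>5} = {value:.4f}")
```

In the printed result, %GRR and %PV add up to more than 100, while their squares add up to 100², which illustrates why the percentage ratios cannot simply be summed.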
The resulting GRR value strongly affects the difference between the observed process capability and the actual (real) process capability, as the measurement system distorts the measured results.
Correlation of real and observed process capability (Source: qMindset.com; AIAG MSA Manual)
In addition to the previous MSA metrics, we calculate the ndc (Number of Distinct Categories), which is the number of separate categories that the measurement system is able to distinguish.
ndc = 1.41 * (PV / GRR)
Step 12: with the contribution values and the ndc number, we can decide whether the measurement system is able to measure properly. The following key requirements are used in the automotive industry to judge the measurement system:
Acceptance levels of GRR and ndc
| Value | Evaluation |
| --- | --- |
| %GRR < 10% | Measurement system is acceptable |
| 10% < %GRR < 30% | Measurement system is conditionally acceptable for given applications, improvement is necessary |
| %GRR > 30% | Measurement system is unacceptable |
| ndc > 10 | Favored measurement system |
| ndc ≥ 5 | Measurement system is acceptable |
| ndc < 5 | Measurement system is unacceptable |
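The decision rules of the table can be expressed as a small sketch. Two details are assumptions that the table itself does not settle: the ndc is truncated to a whole number, and a %GRR of exactly 10% or 30% is treated as conditionally acceptable. The function names and the example values for PV and GRR are hypothetical.

```python
def ndc(pv, grr):
    """Number of distinct categories, truncated to a whole number (assumption)."""
    return int(1.41 * pv / grr)


def classify_grr(pct_grr):
    """Judge %GRR; the boundary values 10 and 30 are an assumption of this sketch."""
    if pct_grr < 10:
        return "acceptable"
    if pct_grr <= 30:
        return "conditionally acceptable"
    return "unacceptable"


def classify_ndc(n_categories):
    """Judge the number of distinct categories."""
    if n_categories > 10:
        return "favored"
    if n_categories >= 5:
        return "acceptable"
    return "unacceptable"


# Hypothetical study result (mm): PV = 0.18876, GRR = 0.033012, %GRR = 17.2.
categories = ndc(pv=0.18876, grr=0.033012)
print(categories, classify_ndc(categories), classify_grr(17.2))
```

Both criteria should be checked together: a system can reach an acceptable %GRR and still distinguish too few categories if the parts in the study do not cover the real process spread.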
Source: qMindset.com; AIAG MSA Manual; Minitab.com