## Process Watch: Having confidence in your confidence level

*By Douglas G. Sutherland and David W. Price*

**Author’s Note:** The Process Watch series explores key concepts about process control (defect inspection and metrology) for the semiconductor industry. Following the previous installments, which examined the 10 fundamental truths of process control, this new series of articles highlights additional trends in process control, including successful implementation strategies and the benefits for IC manufacturing.

While working at the Guinness® brewing company in Dublin, Ireland in the early 1900s, William Sealy Gosset developed a statistical algorithm called the T-test^{1}. Gosset used this algorithm to determine the best-yielding varieties of barley to minimize costs for his employer, but to help protect Guinness’ intellectual property he published his work under the pen name “Student.” The version of the T-test that we use today is a refinement made by Sir Ronald Fisher, a statistician who corresponded with Gosset, but it is still commonly referred to as Student’s T-test. This paper does not address the mathematical nature of the T-test itself but rather looks at the amount of data required to *consistently achieve* the 95% confidence level in the T-test result.

A T-test is a statistical algorithm used to determine if two samples are part of the same parent population. It does not resolve the question unequivocally but rather calculates the probability that the two samples are part of the same parent population. As an example, if we developed a new methodology for cleaning an etch chamber, we would want to show that it resulted in fewer fall-on particles. Using a wafer inspection system, we could measure the particle count on wafers in the chamber following the old cleaning process and then measure the particle count again following the new cleaning process. We could then use a T-test to tell if the difference was statistically significant or just the result of random fluctuations. The T-test answers the question: what is the probability that two samples are part of the same population?

However, as shown in Figure 1, there are two ways that a T-test can give a false result: a false positive or a false negative. To confirm that the experimental data is actually different from the baseline, the T-test usually has to score less than 5% (i.e., less than a 5% probability of a false positive). However, if the T-test scores greater than 5% (a negative result), it doesn’t tell you anything about the probability of that result being false. The probability of false negatives is governed by the number of measurements. So there are always two questions: (1) Did my experiment pass or fail the T-test? (2) Did I take enough measurements to be confident in the result? It is that last question that we try to address in this paper.
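As a minimal sketch of the first question (pass or fail), the Python below runs a pooled-variance two-sample T-test using only the standard library. The particle counts are hypothetical, the function names are illustrative, and the p-value is obtained by numerically integrating the t distribution rather than by calling a statistics package.

```python
import math
from statistics import mean, variance

def t_pdf(x, dof):
    """Probability density of Student's t distribution with `dof` degrees of freedom."""
    log_c = math.lgamma((dof + 1) / 2) - math.lgamma(dof / 2)
    return math.exp(log_c) / math.sqrt(dof * math.pi) * (1 + x * x / dof) ** (-(dof + 1) / 2)

def two_sided_p_value(t_stat, dof, steps=2000):
    """Two-sided p-value, integrating the density on [0, |t|] with Simpson's rule."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, dof) + t_pdf(upper, dof)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, dof)
    central_area = total * h / 3
    return max(0.0, 1.0 - 2.0 * central_area)

def student_t_test(a, b):
    """Pooled-variance (equal variance) two-sample T-test: returns (t, p)."""
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t_stat = (mean(a) - mean(b)) / std_err
    return t_stat, two_sided_p_value(t_stat, n_a + n_b - 2)

# Hypothetical particle counts per wafer (not data from the article)
old_clean = [52, 48, 55, 60, 47, 51, 58, 49]
new_clean = [41, 45, 39, 47, 43, 40, 46, 42]

t_stat, p = student_t_test(old_clean, new_clean)
print(f"t = {t_stat:.2f}, p = {p:.4f}, significant at 5%: {p < 0.05}")
```

A p-value below 0.05 here addresses only the false-positive criterion; it says nothing about whether eight wafers per condition would have been enough to catch a smaller improvement.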

Changes to the semiconductor manufacturing process are expensive propositions. Implementing a change that doesn’t do anything (false positive) is not only a waste of time but potentially harmful. Not implementing a change that could have been beneficial (false negative) could cost tens of millions of dollars in lost opportunity. It is important to have the appropriate degree of confidence in your results, and doing so requires a sample size that is appropriate for the size of the change you are trying to effect. In the example of the etch cleaning procedure, this means that inspection data from a sufficient number of wafers needs to be collected in order to determine whether or not the new clean procedure truly reduces particle count.

In general, the bigger the difference between two things, the easier it is to tell them apart. It is easier to tell red from blue than it is to distinguish between two different shades of red or between two different shades of blue. Similarly, the less variability there is in a sample, the easier it is to see a change^{2}. In statistics the variability (sometimes referred to as noise) is usually measured in units of standard deviation (σ). It is often convenient to also express the difference in the means of two samples in units of σ (e.g., the mean of the experimental results was 1σ below the mean of the baseline). The advantage of this is that it normalizes the results to a common unit of measure (σ). Simply stating that two means are separated by some absolute value is not very informative (e.g., the average of A is greater than the average of B by 42). However, if we can express that absolute number in units of standard deviations, then it immediately puts the problem in context and instantly provides an understanding of how far apart these two values are in relative terms (e.g., the average of A is greater than the average of B by 1 standard deviation).
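A minimal sketch of this normalization in Python is shown below. The function name `delta_in_sigma` and the particle counts are illustrative assumptions, and the pooled standard deviation is just one reasonable choice of noise estimate when both samples have similar spread.

```python
import math
from statistics import mean, variance

def delta_in_sigma(baseline, experiment):
    """Difference in means (experiment - baseline) in units of the pooled standard deviation."""
    n_b, n_e = len(baseline), len(experiment)
    pooled_var = ((n_b - 1) * variance(baseline) + (n_e - 1) * variance(experiment)) / (n_b + n_e - 2)
    return (mean(experiment) - mean(baseline)) / math.sqrt(pooled_var)

# Hypothetical particle counts per wafer; every experimental value is 42 lower
baseline = [100, 142, 58, 121, 79]
experiment = [58, 100, 16, 79, 37]

print(f"absolute change: {mean(experiment) - mean(baseline):.0f} counts")
print(f"normalized change: {delta_in_sigma(baseline, experiment):.2f} standard deviations")
```

With this made-up data the absolute gap of 42 counts works out to roughly 1.26 standard deviations, which is the number that matters for sample-size planning.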

Figure 2 shows two examples of data sets, before and after a change. These can be thought of in terms of the etch chamber cleaning experiment we discussed earlier. The baseline data is the particle count per wafer before the new clean process and the results data is the particle count per wafer after the new clean procedure. Figure 2A shows the results of a small change in the mean of a data set with high standard deviation and Figure 2B shows the results of the same-sized change in the mean but with less noisy data (lower standard deviation). You will require more data (e.g., more wafers inspected) to confirm the change in Figure 2A than in Figure 2B simply because the signal-to-noise ratio is lower in 2A even though the *absolute change* is the same in both cases.

Figure 2. Both charts show the same absolute change, before and after, but 2B (right) has much lower standard deviation. When the change is small relative to the standard deviation as in 2A (left) it will require more data to confirm it.

The question is: how much data do we need to confidently tell the difference? Visually, we can see this when we plot the data in terms of the Standard Error (SE). The SE can be thought of as the error in calculating the average (e.g., the average was X ± SE). The SE is proportional to σ/√n, where σ is the standard deviation and n is the sample size. Figure 3 shows the SE for two different samples as a function of the number of measurements, n.

Figure 3. The Standard Error (SE) in the average of two samples with different means. In this case the standard deviation is the same in both data sets but that need not be the case. With greater than x measurements the error bars no longer overlap and one can state with 95% confidence that the two populations are distinct.

For a given difference in the means and a given standard deviation we can calculate the number of measurements, x, required to eliminate the overlap in the Standard Errors of these two measurements (at a given confidence level).
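The shrinking error bars can be sketched in a few lines of Python. This illustrates the overlap picture only, under the assumptions that both samples have the same standard deviation and size and that each error bar is drawn at ±1.96 SE (a 95% interval on each mean); the function names are hypothetical. It is a simpler criterion than a full sample-size calculation that also budgets for false negatives, so it gives smaller numbers.

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean for n measurements with standard deviation sigma."""
    return sigma / math.sqrt(n)

def min_n_without_overlap(delta, sigma, z=1.96, n_max=100_000):
    """Smallest n at which two +/- z*SE error bars, centered delta apart, stop overlapping."""
    for n in range(2, n_max + 1):
        if delta > 2 * z * standard_error(sigma, n):
            return n
    return None  # not reached within n_max measurements

for delta in (2.0, 1.0, 0.5):
    print(f"delta = {delta} sigma -> n = {min_n_without_overlap(delta, sigma=1.0)}")
```

Halving the separation between the means roughly quadruples the number of measurements needed, because the SE falls only as the square root of n.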

The equation for the required sample size in the T-test is

$$n = \frac{2\,\left(Z_{1-\alpha/2} + Z_{\beta}\right)^{2}}{\Delta^{2}} \qquad \text{(Eq 1)}$$

where n is the required sample size for each of the two samples, “Delta” (Δ) is the difference between the two means measured in units of standard deviation (σ), and *Z*_{x} is the value of the standard normal variable below which a fraction x of the distribution lies. For α=0.05 (5% chance of a false positive) and β=0.95 (a 95% chance of detecting a real change, i.e., a 5% chance of a false negative), Z_{1-α/2} and Z_{β} are equal to 1.960 and 1.645 respectively (Z values for other values of α and β are available in most statistics textbooks, Microsoft® Excel® or on the web). As seen in Figure 3 and shown mathematically in Eq 1, as the difference between the two populations (Delta) becomes smaller, the number of measurements required to tell them apart grows rapidly, as the inverse square of Delta. Figure 4 shows the required sample size as a function of the Delta between the means expressed in units of σ. As expected, for large changes, greater than 3σ, one can confirm the T-test 95% of the time with very little data. As Delta gets smaller, more measurements are required to consistently confirm the change. A change of only one standard deviation requires 26 measurements before and after, but a change of 0.5σ requires over 100 measurements before and after.
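The sample-size calculation can be sketched in Python as follows. This assumes the standard normal-approximation form n = 2(Z_{1-α/2} + Z_β)²/Delta², which reproduces the 26 and 100-plus figures quoted above; the function name and the `power` parameter (the article's β=0.95) are illustrative.

```python
import math
from statistics import NormalDist

def required_sample_size(delta_sigma, alpha=0.05, power=0.95):
    """Measurements needed per sample (before and after) to detect a shift of
    delta_sigma standard deviations, using the normal approximation."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.960 for alpha = 0.05
    z_power = NormalDist().inv_cdf(power)          # about 1.645 for power = 0.95
    return math.ceil(2 * (z_alpha + z_power) ** 2 / delta_sigma ** 2)

for delta in (3.0, 2.0, 1.0, 0.5):
    print(f"Delta = {delta} sigma -> {required_sample_size(delta)} measurements per sample")
# Delta = 3.0 -> 3, 2.0 -> 7, 1.0 -> 26, 0.5 -> 104
```

Because the result scales as 1/Delta², halving the detectable change multiplies the required data by four.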

Figure 4. Sample size required to confirm a given change in the mean of two populations with 5% false positives and 5% false negatives

The relationship between the size of the change and the minimum number of measurements required to detect it has ramifications for the type of metrology or inspection tool that can be employed to confirm a given change. Figure 5 uses the results from Figure 4 to show the time it would take to confirm a given change with different tool types. In this example the sample size is measured in number of wafers. For fast tools (high throughput, such as laser scanning wafer inspection systems) it is feasible to confirm relatively small improvements (around 0.5σ) in the process because they can make the roughly 200 required measurements (about 100 before and 100 after) in a relatively short time. Slower tools such as e-beam inspection systems are limited to detecting only gross changes in the process, where the improvement is greater than 2σ. Even here the measurement time alone means that it can be weeks before one can confirm a positive result. For the etch chamber cleaning example, it would be necessary to quickly determine the results of the change in clean procedure so that the etch tool could be put back into production. Thus, the best inspection system to determine the change in particle counts would be a high-throughput system that can detect the particles of interest with low wafer-to-wafer variability.

Figure 5. The measurement time required to determine a given change for process control tools with four different throughputs (e-Beam, Broadband Plasma, Laser Scattering and Metrology)
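The link between tool throughput and time-to-answer can be sketched as below. The throughput numbers are purely hypothetical placeholders (the article does not give them), and the sample-size function assumes the normal-approximation formula n = 2(Z_{1-α/2} + Z_β)²/Delta² per sample.

```python
import math
from statistics import NormalDist

def wafers_needed(delta_sigma, alpha=0.05, power=0.95):
    """Total wafers (before plus after) needed to confirm a shift of delta_sigma."""
    z_sum = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    per_sample = math.ceil(2 * z_sum ** 2 / delta_sigma ** 2)
    return 2 * per_sample

def hours_to_confirm(delta_sigma, wafers_per_hour):
    """Pure measurement time, ignoring queueing and wafer availability."""
    return wafers_needed(delta_sigma) / wafers_per_hour

# Hypothetical throughputs in wafers per hour, for illustration only
tools = {"slow tool": 0.1, "fast tool": 50.0}

for name, throughput in tools.items():
    for delta in (2.0, 0.5):
        print(f"{name}: Delta = {delta} sigma -> {hours_to_confirm(delta, throughput):.1f} hours")
```

Under these assumed throughputs, the slow tool needs 140 hours of measurement to confirm even a 2σ change, while the fast tool can confirm a 0.5σ change in about four hours.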

Experiments are expensive to run. They can be a waste of time and resources if they result in a false positive and can result in millions of dollars of unrealized opportunity if they result in a false negative. To have the appropriate degree of confidence in your results you must use the correct sample size (and thus the appropriate tools) that corresponds to the size of the change you are trying to effect.

*References:*

1. https://en.wikipedia.org/wiki/William_Sealy_Gosset
2. Process Watch: Know Your Enemy, *Solid State Technology*, March 2015

*About the Authors*:

Dr. David W. Price is a Senior Director at KLA-Tencor Corp. Dr. Douglas Sutherland is a Principal Scientist at KLA-Tencor Corp. Over the last 10 years, Dr. Price and Dr. Sutherland have worked directly with more than 50 semiconductor IC manufacturers to help them optimize their overall inspection strategy to achieve the lowest total cost. This series of articles attempts to summarize some of the universal lessons they have observed through these engagements.

## New ClassOne chamber cuts copper plating costs 95%

ClassOne Technology (classone.com), manufacturer of budget-friendly Solstice plating systems, announced its new CopperMax chamber, a design that is demonstrating major copper plating cost reductions for users of ≤200mm wafers.

ClassOne cited actual performance data from a CopperMax pilot installation on a Solstice tool at a Fortune 100 customer. Over a six-month period the customer tracked their actual production operating costs while using the new chamber for copper TSV, Damascene and high-rate copper plating. For the three processes with CopperMax they reported that operating costs were reduced between 95.8% and 98.4% compared with previously used conventional plating chambers.

“Many of our emerging market customers are starting to do copper plating,” said Kevin Witt, President of ClassOne Technology. “So we’ve spent a lot of time on the process, working to reduce customer costs and also increase performance. And the new CopperMax chamber is proving to do both.”

ClassOne pointed out that consumables are the largest cost factor in copper plating. Optimizing copper plating generally requires the use of expensive organic additives — which are consumed very rapidly and need to be replenished frequently.

“We learned, however, that over 97% of those expensive additives were *not* being consumed by the actual plating process,” said Witt. “Most were being used up simply by contact with the anode throughout the process! So, we designed our new copper chamber specifically to keep additives *away* from the anode — and the results are pretty dramatic. Significant savings can be realized by high- and medium-volume users with high throughputs as well as by lower-volume and R&D users that have long idle times.”

The company explained that the CopperMax chamber employs a cation-exchange semipermeable membrane to divide the copper bath into two sections. The upper section contains all of the additives, and it actively plates the wafer. The lower section of the bath contains the anode that supplies elemental copper — which is able to travel through the membrane and into the upper section to ultimately plate the wafer. However, the membrane prevents additives from traveling down to the anode, where they would break down and form process-damaging waste products.

As a result, the CopperMax bath remains much cleaner, and bath life is extended by over 20x. This increases uptime, enables higher-quality, higher-rate Cu plating, and it reduces cost of ownership very substantially.

For example, a customer using a Solstice system with six CopperMax chambers and running TSV and high-rate copper plating will save over $300,000 per year just from additive use reductions.

In addition, the CopperMax also reduces Cu anode expenses. The chamber is designed to use inexpensive bulk anode *pellets* instead of solid machined Cu material, which cuts anode costs by over 50%. And since the pellets have 10x greater surface area they also increase the allowable plating rates.

“Like the rest of our equipment, this new chamber aims to serve all those smaller wafer users who have limited budgets,” said Witt. “Simply stated, CopperMax is going to give them a lot more copper plating performance for a lot less.”

## Global semiconductor wafer-level equipment revenue to grow 11% in 2016

Worldwide semiconductor wafer-level manufacturing equipment (WFE) revenue totaled $37.4 billion in 2016, an 11.3 percent increase from 2015, according to final results by Gartner, Inc. The top 10 vendors accounted for 79 percent of the market, up 2 percent from 2015.

“Spending on 3D NAND and leading-edge logic process drove growth in the market in 2016,” said Takashi Ogawa, research vice president at Gartner. “This spending was driven by momentum for high-end services in data centers and requirements for faster processors and high-volume memory for mobile devices.”

Applied Materials continued to lead the WFE market with 20.5 percent growth in 2016 (see Table 1). Active investment in 3D device manufacturing provided significant momentum in Applied’s etch revenue, specifically in the conductor etch segment. Screen Semiconductor Solutions experienced the highest growth in the market, at 41.5 percent. This was due to a combination of two factors: the appreciation of the Japanese yen against the U.S. dollar, which elevated dollar-based sales estimates, and demand from premium smartphones and data center servers for big data analysis, which drove investment in 3D NAND capacity and leading-edge technology in foundries.

**Table 1**

**Top 10 Companies’ Revenue From Shipments of Total Wafer-Level Manufacturing Equipment, Worldwide (Millions of U.S. Dollars)**

| Rank 2015 | Rank 2014 | Vendor | 2016 Revenue | 2016 Market Share (%) | 2015 Revenue | 2015 Market Share (%) | 2015-2016 Growth (%) |
|---|---|---|---|---|---|---|---|
| 1 | 1 | Applied Materials | 7,736.9 | 20.7 | 6,420.2 | 19.1 | 20.5 |
| 2 | 4 | Lam Research | 5,213.0 | 13.9 | 4,808.3 | 14.3 | 8.4 |
| 3 | 2 | ASML | 5,090.6 | 13.6 | 4,730.9 | 14.1 | 7.6 |
| 4 | 3 | Tokyo Electron | 4,861.0 | 13.0 | 4,325.0 | 12.9 | 12.4 |
| 5 | 5 | KLA-Tencor | 2,406.0 | 6.4 | 2,043.2 | 6.1 | 17.8 |
| 6 | 6 | Screen Semiconductor Solutions | 1,374.9 | 3.7 | 971.5 | 2.9 | 41.5 |
| 7 | 7 | Hitachi High-Technologies | 980.2 | 2.6 | 788.3 | 2.3 | 24.3 |
| 8 | 8 | Nikon | 731.5 | 2.0 | 724.2 | 2.2 | 1.0 |
| 9 | 9 | Hitachi Kokusai | 528.4 | 1.4 | 633.8 | 1.9 | -16.6 |
| 10 | 13 | ASM International | 496.9 | 1.3 | 582.5 | 1.7 | -14.7 |
| | | Others | 7,988.0 | 21.4 | 7,586.2 | 22.6 | 5.3 |
| | | Total Market | 37,407.3 | 100.0 | 33,613.7 | 100 | 11.3 |

*Source: Gartner (April 2017)*
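For readers who want to check the derived columns, the growth and market-share percentages follow directly from the revenue figures. The short Python sketch below recomputes a few of them from the Table 1 revenues; the helper names are illustrative.

```python
# Revenue in millions of U.S. dollars, copied from Table 1: (2016, 2015)
revenue = {
    "Applied Materials": (7736.9, 6420.2),
    "Screen Semiconductor Solutions": (1374.9, 971.5),
    "Hitachi Kokusai": (528.4, 633.8),
    "Total Market": (37407.3, 33613.7),
}

def growth_pct(new, old):
    """Year-over-year growth in percent, rounded to one decimal place."""
    return round((new - old) / old * 100, 1)

def share_pct(part, total):
    """Share of the total market in percent, rounded to one decimal place."""
    return round(part / total * 100, 1)

total_2016 = revenue["Total Market"][0]
for vendor, (rev_2016, rev_2015) in revenue.items():
    print(f"{vendor}: growth {growth_pct(rev_2016, rev_2015)}%, "
          f"2016 share {share_pct(rev_2016, total_2016)}%")
```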

Additional information is provided in the Gartner report “Market Share: Semiconductor Wafer Fab Equipment, Worldwide, 2016.” The report provides rankings and market share for the top 10 vendors. In 2015, Gartner changed the segment reporting to focus on wafer-level manufacturing and is no longer providing segment details for die-level packaging or automatic test. This report is limited to wafer-level manufacturing equipment.