Well, folks, it’s my time to shine—or, at a minimum, be less dull. This is the reliability issue, and that apparently is my quest, according to the title of my column. I’m starting to wish my column were titled “Quest to Eat All the Pizza you Want and Not Gain a Pound,” but reliability it is.
For those who might be new to my column, I work for an independent electronics laboratory that deals with root-cause failure analysis and product qualification of electronic assemblies. That includes all the parts and materials that go into that process. It also means that, on a regular basis, we see failed electronics discovered at in-circuit testing all the way to a product that has been in the field for many years.
In a nutshell, I can tell you it is much cheaper to perform product-specific reliability testing before the product goes into the field. If you find a problem after release, you have to work backward to discover the issue and determine whether everything in the field is at risk of failure and recall, and then you still have to go back and do the testing that should have been done in the first place. The monetary cost of a recall can be more than the project was worth in the first place if you look at repairing or replacing products, and that doesn’t even consider the other costs associated with a recall, like the loss of possible future business with that customer. Most important of all, your product may be something that people need to stay alive.
One of the best examples of a recall being detrimental in all aspects is Takata automotive airbags. While it was not directly related to what we do in the world of electronic hardware, it speaks to the need for extensive reliability testing before release. The biggest cost associated with that recall is, of course, the loss of human life, but in the business sense, it cost Takata more than $24 billion and—in the end—the company itself. If the PCBA that controls your pizza oven goes out, the stakes are much lower (debatable) but most likely still could have been rooted out with proper upfront reliability testing.
This month, I plan to share some testing recommendations based on failure analysis, as well as lessons learned from a few of our customers over the years using case studies and data on failed units. Make no mistake, I will focus a lot on cleanliness and how it relates to reliability. “Write what you know,” they said, so that’s my plan.
What does reliability even mean? According to the all-knowing internet, reliability is “the quality of being trustworthy or of performing consistently well.” I think that pretty much sums it up from the 50,000-foot view. When we get a little closer to the ground, we need to expand that to refer more specifically to the class of product being manufactured. Reliability and Class 1 don’t really overlap in the big Venn diagram of quality; it will most likely work when it goes out the door. That’s about it in a lot of cases.
When looking at Class 2 and Class 3 hardware, there is most certainly a need to focus on reliability. According to IPC-A-610, “Class 2 Dedicated Service Electronic Products include products where continued performance and extended life is required and for which uninterrupted service is desired but not critical. Typically, the end-use environment would not cause failures” and “Class 3 High-Performance Electronic Products include products where continued high performance or performance-on-demand is critical, equipment downtime cannot be tolerated, end-use environment may be uncommonly harsh, and the equipment must function when required, such as life support or other critical systems.”
What this tells me is that not all reliability is equal. When it comes down to it, there are minor differences between these two classes of electronics. Outside of some high-end exotic assemblies, most parts and assembly processes are used for both classes. The biggest difference is what happens if it fails. It’s literally a matter of life and death in some cases. Sorry, I didn’t mean to bring you down there, but it is important to remember that. The good news is that most companies building those types of electronics are on top of it with testing that would not be required for many Class 2 assemblies.
Enough of the pseudo-philosophical electronics talk; let’s get down to it. Approving a new supplier for any part of your process is a major key to reliability because you need to know that the bare board and components aren’t going to also supply a surprise down the road. Let’s start at the bare board level. When it comes to guidance, anything that is agreed to between the user and the supplier will dominate any requirement from any other source.
In lieu of any internal guidance, most companies lean on IPC-6012: Qualification and Performance Specification for Rigid Printed Boards. Looking at the applicable documents specific to PCB manufacturing, there are 23 test methods within the TM-650, 35 related documents, and another 18 joint industry and other association documents. That is a lot of information for those who need it and should cover pretty much every conceivable combination of materials.
In no way am I suggesting you need to review each and every one of these documents, but they are there either way. If you start with IPC-6012, you can go pretty much anywhere in the testing realm, but not all tests are required—or even necessary—for new supplier approval. Some of the parameters to test for include plating thickness on PTH barrels and pads, solder mask cure, conductive anodic filament (CAF) resistance, and cleanliness, among others. Let’s look at what some of those tests are looking for and the possible reliability issues tied to those.
I’m going to start at layer one of the PCB fab process. Quality really does start there, and each subsequent step adds another opportunity to screw it up. The CAF test is used on bare fabs to determine whether process chemistries are present on the inner layer of the PCB that will produce electrochemical migration. This is the same as dendrite growth found on a fully populated PCBA. No matter where it occurs, if you have conductive residue, moisture, and potential, you run an elevated risk for electrical leakage and dendrite growth.
The CAF condition is greatly aided by poor resin flow, creating dry weave that will absorb plating chemistries and allow them to bridge anode and cathode. The IPC test for CAF is found in TM-650, 2.6.25. This is an environmental test that is normally done on test coupons manufactured by your PCB supplier using the same materials you plan to use for normal production. The test boards are subjected to elevated heat and humidity for at least 596 hours under bias. In Figures 1 and 2, you can see dry weave facilitating CAF that will render any subsequent processing steps meaningless.
Scanning electron microscopy (SEM) and energy-dispersive X-ray spectroscopy (EDS) are among the best analytical tools for investigating CAF. By using SEM/EDS, you can determine the composition of the material and compare it to the base metals being used for barrel plating. Figures 3 and 4 show SEM and EDS examples. If it matches, you have CAF and need to work with your PCB fabrication supplier to optimize the process.
In your effort to optimize the bare fab process, you need to know which contaminants are facilitating the CAF. For that, you want to use ion chromatography (IC). That will tell you exactly what ions are present and at what concentrations. Those results can be matched back to chemistries used in the plating process, and then the optimization is focused and can happen a lot faster in most cases. The IC data in Table 1 shows typical ionic content from an inner layer cleanliness issue: high levels of acetate, sulfate, and sodium residues. These ions are normally found in plating chemistries and suggest that the final rinse is insufficient to completely remove all the residues.
Ion chromatography should also be used on normal production PCBs to determine the level of cleanliness on the outside surface. If IC is to be used for process monitoring, you will want to perform global extractions for baseline data. Localized extractions over dense groupings of plated through-holes, over plated pads, and over bare solder mask areas should all be considered to get the clearest idea of just how clean each of those parts of the process is.
Solder mask cure is another critical parameter that should be examined. When a mask is properly cured, it will exhibit a continuous smooth texture, like a marble countertop. If the solder mask is under cured, the surface will be rough with nooks and crannies, like an English muffin. The same way that muffin will hold delicious butter and jam, the solder mask will hold flux, wash chemistry, and other processing residues.
The IPC test methods related to solder mask cure are 2.3.23B and 2.3.23.1A. These are chemical tests that use drops of methylene chloride or methyl chloroform on the solder mask, followed by using a wooden spudger to see if you can scratch the mask. If it easily scratches, give it a “cure bump” with either UV or thermal exposure and then repeat the test. If the mask is then unaffected, you can go back to your supplier and have them adjust their cure profiles. Uneven solder mask coverage can expose the base metals to less than optimal environments, and that alone can be enough to cause issues like corrosion (shown in Figure 5). There are many different tests specific to bare boards, so it’s a good idea to consider the end-use environment, warranty period, and any other product-specific details to determine which test is most applicable for your product.
Many of the same processes used for the plating of bare boards are also used for component leads. Both processes use chemistries that can increase the risk of corrosion or issues related to electrochemical migration if not fully removed. This happens with components when those chemistries find a way up into the die area, causing corrosion and dendrite growth. This can easily happen when there is a small gap at the overmold/lead frame interface (Figures 6 and 7). Even if the residues don’t make it all the way to the die, they can be present on the edge of the package body between the leads and propagate electrochemical migration (Figure 8).
If you are using a fully no-clean assembly process, you can’t rely on an end-of-assembly wash process to remove any of those residues. This is when you can use IC analysis to determine the effectiveness of the component wash process. Standard bag extractions will detect any elevated levels of ionics on the outer surfaces. For internal surfaces, Parr Bomb analysis uses a pressurized extraction to harvest possible plating residues that have been absorbed into the component overmold material, down to the lead frame. This can be done without any physical damage to the component. Much like the bare boards, components bring their own inherent reliability risk before the first part is soldered.
Now that you know how those raw parts can impact your reliability, let’s put all those pieces together. Per IPC J-STD-001 Section 8, you need to qualify an assembly process using surface insulation resistance (SIR) testing per TM-650 2.6.3.7 on test boards to show how well the CM is processing the proposed set of materials and what impact elevated heat and humidity have on electrical resistance. This test is the bare minimum that needs to be completed to verify the assembly process. Contract manufacturers need to do this testing to generate objective evidence that can be applied to process monitoring analysis. As most people know by now, the historical acceptance criterion of 1.56 µg NaCl equivalence per square centimeter has been given the old heave-ho, and rightfully so. If you want some details on how and why that criterion was removed, I recommend reading IPC-WP-019: An Overview on Global Change in Ionic Cleanliness Requirement. (Spoiler alert: It should never have been used as it has been.)
Here is an example of how a CM can generate objective evidence and use it for process monitoring. Suppose a company needs to qualify a product with a new customer, and the plan needs to include monitoring the approved assembly process. They choose test coupons that are most representative of their final product based on the mix of SMT and PTH components. They then assemble boards using the proposed combination of materials and equipment to be used for the final product. Along with two bare reference samples, the assembled test boards are tested per IPC-TM-650 2.6.3.7. If they pass that test, they are tested with ion chromatography to determine the average levels of specific anions, cations, and weak organic acids (WOA) to create baseline data.
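The baseline step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `ic_baseline`, the ion names, and every number below are made up for the example, and treating an ion that is missing from a report as zero is my assumption, not something your lab report will necessarily imply.

```python
from statistics import mean


def ic_baseline(results):
    """Average each ion's level across all tested boards.

    `results` is a list of dicts, one per board, mapping an ion name to
    its measured level (in whatever units the lab reports).
    """
    ions = sorted({ion for board in results for ion in board})
    # Assumption: an ion absent from a board's report counts as 0.0.
    return {ion: mean(board.get(ion, 0.0) for board in results) for ion in ions}


# Hypothetical readings, for illustration only.
boards = [
    {"chloride": 0.5, "sulfate": 1.0, "sodium": 2.0},
    {"chloride": 1.5, "sulfate": 3.0, "sodium": 4.0},
]
baseline = ic_baseline(boards)
print(baseline)
```

Keeping the baseline per ion, rather than as one combined number, is what lets later results be traced back to a specific chemistry.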
Next, they build a set of 20 samples of the actual product. One set of 10 boards is tested using ion chromatography with global extraction. The second set of 10 is tested in a ROSE tester. The average of the ROSE test results becomes the acceptance criterion used on a per-shift basis. Remember that the number is being derived from your ROSE tester and can differ from another machine of the same make and model. It doesn’t really matter if that number is 1 or 101 µg NaCl equivalence per square centimeter, because it has been verified with other testing. Often, IC testing is done on a quarterly basis for further evidence of process control. The quarterly test results are compared to the baseline.
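As a rough sketch of that monitoring arithmetic, the Python below derives the ROSE number from ten readings, checks a per-shift reading against it, and reports how quarterly IC results differ from the baseline. The function names and all values are hypothetical. Treating the average as an upper limit, so that a reading at or below it passes, is my assumption about how the criterion would be applied.

```python
from statistics import mean


def rose_acceptance_limit(readings):
    """Mean of the qualification ROSE readings (ug NaCl eq. per cm^2)."""
    if not readings:
        raise ValueError("need at least one ROSE reading")
    return mean(readings)


def shift_passes(reading, limit):
    """Assumption: a per-shift reading at or below the limit passes."""
    return reading <= limit


def compare_to_baseline(baseline, quarterly):
    """Quarterly level minus baseline level for each baseline ion."""
    return {ion: quarterly.get(ion, 0.0) - level for ion, level in baseline.items()}


# Hypothetical readings from the 10 ROSE-tested boards.
rose_readings = [1.0, 2.0, 1.5, 1.5, 1.0, 2.0, 1.5, 1.5, 1.0, 2.0]
limit = rose_acceptance_limit(rose_readings)

# Hypothetical quarterly IC check against a hypothetical baseline.
drift = compare_to_baseline(
    {"chloride": 1.0, "sulfate": 2.0},
    {"chloride": 1.5, "sulfate": 1.0},
)
print(limit, shift_passes(1.2, limit), drift)
```

A positive value in `drift` means that ion has risen since qualification. How much of a rise should trigger action is a product-specific decision that this sketch deliberately leaves out.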
Some customers will also perform elevated heat and humidity exposure testing with normal operating voltages to further validate the acceptance criteria. This is known as temperature-humidity-bias (THB) testing and is similar to SIR testing. THB testing is done on actual products using normal operating voltages and duty cycles. This is one of the most important tests to consider because while the test coupons are considered predictors of performance, a lot of things change when it’s the real deal.
A large percentage of reliability failures we see are tied to cleanliness. In this column, I have addressed bare board, raw component, and test board assembly cleanliness, but those are only three sources for contamination out of a much larger number of options. Anything that can come into contact with the PCBA, either directly or indirectly, is a possible source of contamination. You must consider testing everything around the PCBAs, such as housings, large connector bodies, and any other number of materials.
We see a lot of failures from companies that have good objective evidence of their assembly process, but because they were only testing the PCBAs, they don’t see the full picture. Materials like mold release on metal and plastic housings can be very ionic. If enough atmospheric moisture is available, it will collect at a low point and drip down on the board. That moisture can contain high levels of ionic content from the housing interior surface.
We have seen vibration-dampening foam that was extremely high in ionic content pressed directly against the surface of the PCBA, without any cleanliness testing having been done on the material (Figure 9). The assembly was used in an under-hood application and was not hermetically sealed. This was done on purpose by someone getting paid to make those types of decisions. It can happen to the best of us. With any luck, someone reading this right now will start to think about every part of their product outside of just PCBA manufacturing that can impact their product reliability.
I have barely scratched the surface on reliability testing, as so many tests are product-specific. Some products require a lot of vibration or extreme temperature exposure testing, but what I have covered applies to every product foundation. The title of my column this month is “Reliability Starts at the Bottom,” and by that, I mean at the base of every electronic product. At a minimum, you must have components, bare boards, and assembly materials to build an assembly. You must be able to confirm that you are building on a reliable base.
I reference IPC test and guidance documents frequently for a good reason: they are compiled by industry experts and, when followed, will more than likely yield a reliable product. I also frequently say that IPC has no idea what you are building specifically, so it’s imperative that you own your product. By that, I mean you need to decide which tests are required for both initial acceptance and ongoing process monitoring. Being product-specific with your requirements might go above and beyond what IPC, or any other industry association, recommends, but it’s the best thing you can do for your product’s reliability. And isn’t that what it’s all about? Well, that, and pizza, of course.
This column originally appeared in the September issue of SMT007 Magazine.