Ideas for Validating Production Software

The FDA requires medical-device manufacturers to validate software that is part of production or a quality system. But what does that mean? Those charged with the job are often not software engineers, and many have no background in validation or regulatory issues. As a result, some manufacturers fall short of validating quality-system software for its intended use.


Production software is part of the “quality system.” It is not embedded in a medical device. It is software that runs in production tools and in process-monitoring and control applications.


The Food and Drug Administration (FDA) suggests that the rigor of validation be commensurate with the risk the software poses. This doesn’t necessarily mean subjecting high-risk software to elaborate and expensive testing programs.


Regulatory background


The federal regulation that applies to quality-system software is found in 21 CFR 820.70 (i) on Production and Process Controls. It reads: “(i) Automated processes. When computers or automated data processing systems are used as part of production or the quality system, the manufacturer shall validate computer software for its intended use according to an established protocol. All software changes shall be validated before approval and issuance. These validation activities and results shall be documented.”


Section 6 of the FDA’s document, General Principles of Software Validation; Final Guidance for Industry and FDA Staff, January 11, 2002 (commonly referred to as GPSV), gives some advice for applying validation techniques to production and quality-system software.


Of course, 21 CFR 820.70 (i) covers more than production software. Medical-device manufacturers should inventory all software that automates any aspect of medical-device production. The GPSV, for instance, mentions programmable logic controllers, digital function controllers, statistical process control, supervisory control and data acquisition, robotics, human-machine interfaces, input/output devices, and computer operating systems.


Validation is more than testing


Validation is not synonymous with testing. The GPSV makes it clear that validation includes testing but might also include other actions that lead to the conclusion that the software is “fit for its intended use.” Some or all of these components might be appropriate for a given program, depending on the intended use and associated risks. Validation includes:


Determine the software lifecycle


The lifecycle of software developed in-house by a medical-device manufacturer differs from that of software embedded in a production tool purchased by the manufacturer. Determine what the lifecycle will look like so that validation activities can be planned for each phase. The accompanying waterfall charts provide two examples.


Document intended use


Itemize each function or command the software provides, and document each item in an Intended Use document along with how it influences the production process. Understanding how each command or function affects the overall production process, prioritized by risk, lets users take advantage of the process-level validation techniques described in the rest of this article.


For example, consider a Visual Basic program written by an ambitious production engineer that controls a grinding machine for sharpening cutting edges onto surgical scalpels. Assume the software has three commands: one to select the grinding profile (the shape of the blade), one to select the grinding pitch (how fast the edge tapers), and one to initiate a grinding sequence on a scalpel blank. A failure of any of these functions will have a serious effect on the finished product. The intended use of the software is to control the grinding machine to sharpen scalpel blanks to an accuracy of x degrees of specified pitch, and y mils of specified shape. The pitch-set command takes a numeric entry from the machine operator to define the scalpel-edge pitch ranging from a to b degrees in increments of c degrees. The profile-select command selects from files on disk that define the shape on a grid of resolution d mils. 
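One way to make an intended-use statement like the pitch-set command testable is to write its limits down as an executable check. The following is a minimal Python sketch, not the grinder program itself (which the example describes as Visual Basic). The function name `is_valid_pitch` is hypothetical, and the limits a, b, and c from the text are passed in as `low`, `high`, and `step` because the article leaves them unspecified.

```python
from decimal import Decimal, InvalidOperation


def is_valid_pitch(entry: str, low: Decimal, high: Decimal, step: Decimal) -> bool:
    """Return True if the operator entry is a number from low to high
    (inclusive) that falls on an increment of step, counted from low."""
    try:
        value = Decimal(entry)
    except InvalidOperation:
        return False  # not a number at all
    if not value.is_finite():
        return False  # rejects NaN and infinities
    if value < low or value > high:
        return False  # outside the documented range
    # Decimal arithmetic avoids binary floating-point rounding on the grid check.
    return (value - low) % step == 0
```

With purely illustrative limits of 15 to 30 degrees in 0.5-degree increments, an entry of "20.5" would be accepted and entries of "20.3", "31", or "abc" would be rejected. Each rejection case corresponds to a line in the Intended Use document that a test can trace back to.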


A low-cost line of scalpels manufactured this way goes from grinder to sterilizer to packaging without inspection, so production depends heavily on the proper operation of the grinder. A second, higher-cost line manufactured on the same grinding machine subjects each scalpel to an individual inspection to assure the proper profile and pitch. Because each scalpel is inspected and rejected if a defect is found, a failure of the software will have less impact on the production process.


Risk analysis and management


Consider how the failure of a software item would impact the medical device in production. Do this by considering a failure of each of the functions listed in the Intended Use document. If the impact on the device could adversely affect patient safety, that is an indication you must do more to validate that function than you would for a command whose failure has a less severe impact.


Continuing with the scalpel-grinding example, the risk of failure for the low-cost scalpels not inspected after production is high relative to the higher-cost inspected scalpels. We have not defined any other risk-control measures for the low-cost line. Therefore, we depend more on the correct operation of the software-controlled grinding machine.


Managing risk includes identifying and implementing risk-control measures. These might include manual or automated checking for failures of the software item. Verifying the output of a software-governed process is a form of validation, and is preferable to simply testing the software once. Putting enough risk-control measures in place to significantly reduce residual risk can reduce the reliance on remaining validation tasks.


Knowing when enough risk-control measures are in place is a matter of understanding what level of risk is acceptable to your end users (and to your business), and being able to quantify the residual risk. Since risk is a combination of severity and probability, one can reduce risk by reducing either the severity or the probability of failure. Unfortunately, the probability of software failure is difficult if not impossible to quantify. Therefore, it is better, where possible, to focus efforts on reducing the severity of a failure, or on reducing the probability that the overall process fails. We can tolerate software errors by detecting and correcting them downstream in the manufacturing process.
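The trade-off between severity and probability can be illustrated with a simple ordinal risk index. This is only a sketch under stated assumptions: the 1-to-5 rating scales, the function name `risk_score`, and the ratings assigned to the two scalpel lines are all hypothetical and do not come from the FDA, the GPSV, or any standard. The likelihood rating here stands for the chance that a defective part reaches the user, not the probability that the software itself fails, since the latter is the quantity that is so hard to estimate.

```python
def risk_score(severity: int, likelihood: int) -> int:
    """Ordinal risk index: the product of a 1-5 severity rating and a
    1-5 rating of how likely a defective part is to reach the user."""
    for rating in (severity, likelihood):
        if not 1 <= rating <= 5:
            raise ValueError("ratings must be between 1 and 5")
    return severity * likelihood


# Hypothetical ratings for the scalpel-grinding example.
SEVERITY_BAD_EDGE = 4  # assumed: a badly ground edge could harm a patient

uninspected_line = risk_score(SEVERITY_BAD_EDGE, likelihood=3)  # no downstream check
inspected_line = risk_score(SEVERITY_BAD_EDGE, likelihood=1)    # every blade inspected
```

The software is the same in both cases. Only the downstream inspection differs, and that alone lowers the index for the inspected line.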


This is key to using automation to reduce software testing activities. Verifying 100% of the output of a software-controlled process validates the correct behavior of the software. For this logic to work, the failure modes that result in a severe failure must have corresponding verification tests on the finished product. 
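As a rough illustration of 100% output verification, the sketch below checks every ground blade against the accuracy limits stated in the intended use and sorts the lot into accepted and rejected parts. It assumes an inspection station that reports a measured pitch and a worst-case profile deviation for each blade. The class `BladeMeasurement`, the functions `blade_passes` and `sort_lot`, and all field names are hypothetical. The tolerances (x degrees and y mils in the example) are parameters because the article leaves them unspecified.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class BladeMeasurement:
    serial: str
    pitch_deg: float               # measured edge pitch, in degrees
    profile_deviation_mils: float  # largest deviation from the selected profile


def blade_passes(m: BladeMeasurement, target_pitch_deg: float,
                 pitch_tol_deg: float, profile_tol_mils: float) -> bool:
    """True if both pitch and profile are within the stated tolerances."""
    pitch_ok = abs(m.pitch_deg - target_pitch_deg) <= pitch_tol_deg
    profile_ok = abs(m.profile_deviation_mils) <= profile_tol_mils
    return pitch_ok and profile_ok


def sort_lot(lot: Iterable[BladeMeasurement], target_pitch_deg: float,
             pitch_tol_deg: float, profile_tol_mils: float
             ) -> Tuple[List[str], List[str]]:
    """Check every blade in the lot; return (accepted, rejected) serials."""
    accepted: List[str] = []
    rejected: List[str] = []
    for m in lot:
        if blade_passes(m, target_pitch_deg, pitch_tol_deg, profile_tol_mils):
            accepted.append(m.serial)
        else:
            rejected.append(m.serial)
    return accepted, rejected
```

A check like this covers only the failure modes it measures. A software failure that affects something other than pitch or profile would pass unnoticed, which is why each severe failure mode needs its own verification on the finished product.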


Configuration management and version control


A configuration-management plan should be in place for all software. This plan should identify who is responsible for decisions to upgrade the software, who supplies the upgrades, who installs the software, and who takes responsibility for re-validating before it goes online. Consider also other items in your process that might be incompatible with new versions of the software, and how that incompatibility will be determined and resolved. The plan should also require a new risk assessment at the installation of each software upgrade. New capability in upgraded software could result in new usage modes that would require validation.


Planning


Quality plans and software verification and validation plans detail tasks and deliverables related to an individual software item or the collection of programs that make up an automated process. Plans should cover activities, roles, responsibilities, and resources for each phase of the software lifecycle.


Technical evaluations and management reviews


In technical evaluations, determine whether or not the software is technically up to the intended use before putting it online. Management reviews should examine the technical evaluation records and consider whether the residual risk of failure is acceptable before the software is deployed. 


Consider this imaginary management review. It’s conducted for a piece of software critical to the operation of a production line. This is management’s opportunity to ensure that everything possible has been considered in making sure the software will operate properly when deployed. Managers may note that several versions have been tested with little decline in the defect rate. Production engineers have noted in evaluations that new releases of the software are being produced at a rapid rate, and they have concerns about its stability. Several high-severity errors were detected in the last few versions. This is a case where management should be concerned that severity of failure has not been controlled, and that the probability of failure is high because the software has not matured, as evidenced by the rapid release rate.


Management should also consider whether the software’s users are sufficiently trained to successfully deploy the software. The management review could be considered the final decision point before deployment.


Testing


Focus testing on risk-control measures in place, then on failures that could lead to severe consequences. There is little value in testing software functions that have little effect on a manufactured device. For electronic-record systems, focus testing on functions whose failure could result in record loss, alteration, or loss of security. 


Traceability


Some documented tracking is needed to show that each capability detailed in the intended-use document was implemented or acquired. Further, any risk-control measure identified in the risk-management report must be traced to where it was implemented and where its ability to mitigate the risk was tested. Finally, functional software tests should be traceable back to software requirements or intended-use capabilities.
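Traceability of this kind is often kept in a spreadsheet, but the underlying check is simple enough to sketch in a few lines of Python. The example below assumes that every intended-use item and risk-control measure has been given an identifier, and that each test records the identifiers it covers. The function name `trace_gaps` and the identifier scheme (IU-1, RC-1, T-01) are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple


def trace_gaps(required_ids: Iterable[str],
               tests: Dict[str, Set[str]]) -> Tuple[List[str], List[str]]:
    """Compare required items against test coverage.

    required_ids: intended-use items and risk-control measures to be traced.
    tests: maps each test id to the set of item ids that test covers.
    Returns (untested_items, orphan_tests), each sorted by id.
    """
    required = set(required_ids)
    covered: Set[str] = set().union(*tests.values())
    untested = sorted(required - covered)
    # An orphan test covers nothing that is actually required.
    orphans = sorted(t for t, ids in tests.items() if not (set(ids) & required))
    return untested, orphans
```

For instance, with required items IU-1, IU-2, IU-3, and RC-1, and tests T-01 covering IU-1, T-02 covering IU-2 and RC-1, and T-03 covering only an unknown IU-9, the function would report IU-3 as untested and T-03 as an orphan test. Both findings point to a gap in the trace that should be resolved before the validation is considered complete.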


Let the validation effort reflect risk


The GPSV says, “The level of validation effort should be commensurate with the risk posed by the automated operation. In addition to risk, other factors such as the complexity of the process software and the degree to which the device manufacturer is dependent upon that automated process to produce a safe and effective device, determine the nature and extent of testing needed as part of the validation effort.” That’s a long way of saying the greater the risk of operation, the more you must test to assure that failure will not occur. 


Unfortunately, the FDA does not further distinguish which validation components can be part of a reduced validation effort from those that should be part of a maximum validation effort.


It’s beyond the scope of this article to detail how a validation effort can be scaled so it is commensurate with risk. It’s usually hard to justify skimping on the documentation of intended use, risk assessment and risk management, configuration management, planning, and management review activities. The thought, discussion, and debate required by these activities may well lead a validation team to the conclusion that a software item is of low risk. In fact, these are the activities that are necessary to come to that conclusion, and that is why it would be difficult to reduce or eliminate them. The main activity that can be scaled to risk is the testing effort.


How to test


Not surprisingly, there are different ways to test different types of software. Software developed in-house, with full knowledge of the specifications, design, and implementation, could be tested at the unit, integration, and system levels much like device software. Off-the-shelf software may come with few specifications, design documents, or source code. Some medical-device manufacturers have embarked on elaborate and expensive exercises to reverse-engineer requirements, specifications, and designs to test against. In some cases, the software vendor is “audited” by the medical-device manufacturer in an attempt to document that the software was designed and developed in a controlled environment. The rationale is that process-driven development is less subject to error than chaotic development.


While satisfying regulatory intent, these activities do little to “prove” that the software is safe for its intended use. Reverse engineering probably guesses much less than 50% of the capability of a piece of software, and even less about design and implementation. Although we advocate process-driven development, defects occasionally slip through. The bottom line is that “validation activities” for off-the-shelf software are time-consuming and expensive, weak at best, and do little to add assurance about the software’s safe operation in the production process.


Unfortunately, much of the software used in a production process is off-the-shelf, or embedded in production machine tools. The software can control critical elements of production. If the software fails, it could be disastrous for the product and, ultimately, the product’s end user. So, how can you assure the proper functioning of the software?


An alternative to testing software once against a set of real or assumed requirements is to test its outputs for each medical part it is responsible for creating. This is what we referred to as 100% verification of outputs. It is not testing to provide a level of assurance that the software is error-free. It is testing each output to detect and correct software malfunctions. This type of verification adds value to the production process and can be automated. Here’s how.


Medical device manufacturers can benefit from the experience of other high-reliability, regulated industries such as automotive, defense, and aerospace. Military protocols for the assembly of mission-critical devices are as demanding as those required under FDA’s 21 CFR validation requirements. Errors cannot be tolerated when building miniaturized, high-tolerance, low-failure devices, such as those used in guidance or life-support systems. Assembly processes must be monitored and periodic quality evaluations recorded at various steps.