Fault injection


Fault injection is a testing technique that helps in understanding how a system behaves when it is stressed in unusual ways. Because it relies on the results of simulations or experiments, it may be more valid than statistical methods.
In software testing, fault injection is a technique for improving the coverage of a test by introducing faults to test code paths, in particular error handling code paths, that might otherwise rarely be followed. It is often used with stress testing and is widely considered to be an important part of developing robust software. Robustness testing is a type of fault injection commonly used to test for vulnerabilities in communication interfaces such as protocols, command line parameters, or APIs.
The propagation of a fault through to an observable failure follows a well-defined cycle. When executed, a fault may cause an error, which is an invalid state within a system boundary. An error may cause further errors within the system boundary, in which case each new error acts as a fault, or it may propagate to the system boundary and become observable. When error states are observed at the system boundary, they are termed failures. This mechanism is termed the fault-error-failure cycle and is a key mechanism in dependability.

History

The technique of fault injection dates back to the 1970s, when it was first used to induce faults at the hardware level. This type of fault injection is called Hardware Implemented Fault Injection (HWIFI) and attempts to simulate hardware failures within a system. The first experiments in hardware fault injection involved nothing more than shorting connections on circuit boards and observing the effect on the system. It was used primarily as a test of the dependability of the hardware system. Later, specialised hardware was developed to extend this technique, such as devices to bombard specific areas of a circuit board with heavy radiation. It was soon found that faults could be induced by software techniques and that aspects of this technique could be useful for assessing software systems. Collectively, these techniques are known as Software Implemented Fault Injection (SWIFI).

Model implemented fault injection

As cyber-physical systems grow more complex, traditional fault injection methods are no longer efficient, so testers try to apply fault injection at the model level.

Software implemented fault injection

SWIFI techniques for software fault injection can be categorized into two types: compile-time injection and runtime injection.
Compile-time injection is an injection technique where source code is modified to inject simulated faults into a system. One method is called mutation testing, which changes existing lines of code so that they contain faults. A simple example of this technique would be changing a = a + 1 to a = a - 1.
Code mutation produces faults which are very similar to those unintentionally added by programmers.
A refinement of code mutation is Code Insertion Fault Injection, which adds code rather than modifying existing code. This is usually done through perturbation functions: simple functions that take an existing value and perturb it via some logic into another value, for example:

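int pFunc(int value) { return value + 20; }      /* perturbation function (illustrative offset) */
int main(int argc, char *argv[]) {
    int a = pFunc(aFunction(atoi(argv[1])));     /* aFunction stands for the function under test */
}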

In this case, pFunc is the perturbation function; it is applied to the return value of the function that has been called, introducing a fault into the system.
Runtime injection techniques use a software trigger to inject a fault into a running software system. Faults can be injected via a number of physical methods, and triggers can be implemented in a number of ways, such as time-based triggers and interrupt-based triggers.
Runtime injection can use several different techniques to insert faults into a system via a trigger.
These techniques are often based around the debugging facilities provided by computer processor architectures.

Protocol software fault injection

Complex software systems, especially multi-vendor distributed systems based on open standards, perform input/output operations to exchange data via stateful, structured exchanges known as "protocols." One kind of fault injection that is particularly useful for testing protocol implementations is fuzzing. Fuzzing is an especially useful form of black-box testing, since the invalid inputs submitted to the software system do not depend on, and are not created from knowledge of, the details of the code running inside the system.

Hardware implemented fault injection

This technique is applied to a hardware prototype. Testers inject faults by changing the voltage of parts of a circuit, raising or lowering the temperature, bombarding the board with high-energy radiation, and so on.

Efficient fault injection

Faults have three main parameters.
These parameters define the fault space realm, which grows exponentially as system complexity increases. Traditional fault injection methods are therefore poorly suited to modern cyber-physical systems: they are slow and find only a small number of faults. Testers need an efficient algorithm for choosing the critical faults that have the greatest impact on system behavior. The main research question is thus how to find the critical faults in the fault space realm that have a catastrophic effect on system behavior. Several methods have been proposed to help fault injection explore the fault space efficiently and reach higher fault coverage in less simulation time.
Although these types of faults can be injected by hand, the possibility of introducing an unintended fault is high, so tools exist to parse a program automatically and insert faults.

Research tools

A number of SWIFI tools have been developed, and a selection of these tools is given here. Six commonly used fault injection tools are Ferrari, FTAPE, Doctor, Orchestra, Xception and Grid-FIT.
In contrast to traditional mutation testing, where mutant faults are generated and injected into the code description of the model, the application of a series of newly defined mutation operators directly to the model properties rather than to the model code has also been investigated. Mutant properties that are generated from the initial properties and validated by the model checker should be considered new properties that were missed during the initial verification procedure. Adding these newly identified properties to the existing list of properties therefore improves the coverage metric of the formal verification and consequently leads to a more reliable design.

Application of fault injection

Fault injection can take many forms. In the testing of operating systems, for example, fault injection is often performed by a driver that intercepts system calls and randomly returns a failure for some of the calls. This type of fault injection is useful for testing low-level user-mode software. For higher-level software, various methods are used to inject faults. In managed code, it is common to use instrumentation. Although fault injection can be undertaken by hand, a number of fault injection tools exist to automate the process.
Depending on the complexity of the API for the level where faults are injected, fault injection tests often must be carefully designed to minimize the number of false positives. Even a well-designed fault injection test can sometimes produce situations that are impossible in the normal operation of the software. For example, imagine there are two API functions, Commit and PrepareForCommit, such that each of these functions can fail on its own, but if PrepareForCommit is called and succeeds, a subsequent call to Commit is guaranteed to succeed. Now consider the following code:

error = PrepareForCommit();
if (error == SUCCESS) {
    error = Commit();
    assert(error == SUCCESS);
}

Often, it will be infeasible for the fault injection implementation to keep track of enough state to make the guarantee that the API functions make. In this example, a fault injection test of the above code might hit the assert, whereas this would never happen in normal operation.
Fault injection can also be used at testing time, during the execution of test cases. For example, the short-circuit testing algorithm injects exceptions during test suite execution so as to simulate unanticipated errors. This algorithm collects data for verifying two resilience properties.