Error detection and diagnosis for networked embedded systems remain challenging and tedious due to issues such as a large number of computing entities, hardware resource constraints, and non-deterministic behaviors. The run-time checking is often necessitated by the fact that the static verification fails whenever there exist conditions unknown prior to execution.
Complexities in hardware, software and even the operating environments can also defeat the static analysis and simulations. Record and-replay has long been proposed for distributed systems error diagnosis. Under this method, assertions are inserted in the target program for run-time error detection. At run-time, the violation of any asserted property triggers actions for reporting an error and saving an execution trace for error replay.
This dissertation takes wireless sensor networks, a special but representative type of networked embedded systems, as an example to propose a dependence-based source-level tracing-and-replay methodology for detecting and reproducing errors. This work makes three main contributions towards making error detection and replay automatic. First, SensorC, a domain-specific language for wireless sensor networks, is proposed to specify properties at a high level.
This property specification approach can be not only used in our record-replay methodology but also integrated with other verification analysis app roaches, such as model checking. Second, a greedy heuristic method is developed to decompose global properties into a set of local ones with the goal of minimizing the communication traffic for state information exchanges. Each local property is check ed by a certain sensor node. Third, a dependence-based multi-level method for memory-efficient tracing and replay is proposed.
In the interest of portability across different hardware platforms, this method is implemented as a source-level tracing and replaying tool. To test our methodology, we have built different wireless sensor networks by using TelosB motes and Zolertia Z1 motes separately. The experiments’ results show that our work has made it possible to instrument several test programs on wireless sensor networks under the stringent program memory constraint, reduce the data transferring required for error detection, and find and diagnose realistic errors.
In this chapter, we define the error detection and diagnosis problem discussed in this dissertation, and then we propose the methodology and system framework at a high level. The types of errors targeted by our scheme go beyond the system crash. In order to verify if a program is implemented correctly at run-time, the application programmer may specify a set of correctness properties, e.g., sensor data must be reported from each mote to the base station within a certain time limit.
GLOBAL PROPERTY VIOLATION DETECTION
In Section 3.1, we design a domain specific language, called Sensor C, to specify global properties for WSN applications. Section 3.2 discusses the design of a source-to-source compiler, simply called the Sensor C compiler to decompose global properties into conjunctions of local properties. The framework is illustrated in Figure 3.1.
DEPENDENCE-BASED TRACING AND REPLAY METHODOLOGY FOR A SINGLE NODE
After we determine the set of functions to instrument, we insert operations into the source code of these functions to record the following pieces of information. Figure 4.1 gives an example of a function after instrumentation for recording. We shall prove in this section that this set of information is sufficient for accurate replay.
Figure 4.2 shows a diagram for our replay scheme. The Preprocessor reorganizes Recorded Log File, which contains the raw log information recorded from the motes, into Reconstructed Log File. The latter file consists of a data section, which is to be fed to the replay program later, and an interrupt table, which is used for the creation of the replay program.
DEPENDENCE-BASED TRACING AND REPLAY METHODOLOGY FOR THE ENTIRE WSN SYSTEM
Illustrated in Figure 5.1(a), after replaying the node’s execution and generating the replayed trace, the error source can be researched by programmers tracing back along with the use /def chain from where the error is detected.
ERROR#2 (shown in Figure 5.5) was detected on Node 0. The replay for Node 0 alone showed that the error source was also located on Node 0: the memory address of the received message was incorrectly changed and the message type was read in an incorrect position.
Program analysis is widely used for wireless sensor network error detection. Model checking is one of the main approaches for error detection in distributed systems. The essential idea is that given a system design/implementation and a property which should be satisfied by that system, by systematically checking all possible execution paths, a model checker either outputs YES if the property can be maintained by the system or generates a counterexample other wise.
CONCLUSION AND FUTURE WORK
Error detection and diagnosis for network embedded systems remain challenging tasks due to their large number of computing entities, hardware resource constrains, and inherited nondeterminism. In this dissertation, we take wireless sensor networks, a special but representative type of network embedded systems, as a concrete example, to investigate error detection and dignosis.
We have presented a domain specific language SensorC to specify properties and a Global Property Decomposition (GDP) algorithm intergraded with our developed SensorC compiler, which is responsible for decomposing the given properties and for finding nodes to detect those properties locally. As our experiments have illustrated, the approach can (a) reduce the error detection time; (b) reduce the communication traffic for state information exchanges used in centralized error detection; and (c) narrow down the range of collected trace used for off-line replay.
In addition, the global specification can also be intergraded with other verification approaches such as model checking. To help programmers reproduce and diagnosis errors, we have presented a dependence-based source-level method for memory-efficient tracing and replay. The tool developed based on our method is independent of the hardware platforms and the cross compiler (except for a system library call to make certain memory accesses atomic) and has been applied on WSNs consisting of TelosB motes and Z1 motes separately.
The experiments results show that our work has advanced a way to instrument several test programs on WSN under the stringent program memory constraints by using this proposed method, and we found and diagnosed realistic errors. Our current experiments are performed on TinyOS-based WSN applications; however, the proposed methodology and tool can be applied to other networked embedded systems if (a) the system domain information is acquired; and (b) the applications satisfy the assumptions made in Chapter 2.
Despite the above efforts, there are still several problems to be solved in the future research. First, in our current work, the routing information is useful for property decomposition only if the WSN is stationary, and we have yet to define the decomposition rules in a fine-gained for dynamic WSN. In addition, although we have detected and located some realistic errors, we need to conduct more experiments so as to apply the algorithm to additional WSN applications and other types of networked embedded systems.
Source: Man Wang
Author: Purdue University