Improving Wireless Simulation Through Noise Modeling

HyungJune Lee

Èñòî÷íèê: http://portal.acm.org/citation.cfm?id=1236364


Abstract

We propose modeling environmental noise in order to efficiently and accurately simulate wireless packet delivery. We measure noise traces in many different environments and propose three algorithms to simulate noise from these traces. We evaluate applying these algorithms to signal-to-noise curves in comparison to existing simulation approaches used in EmStar, TOSSIM, and ns2. We measure simulation accuracy using the Kantorovich-Wasserstein distance on conditional packet delivery functions. We demonstrate that using a closest-fit pattern matching (CPM) noise model can capture complex temporal dynamics which existing approaches do not, increasing packet simulation fidelity by a factor of 2 for good links, a factor of 1.5 for bad links, and a factor of 5 for intermediate links. As our models are derived from real-world traces, they can be generated for many different environments.

1. INTRODUCTION

Simulation is a critical part of developing, testing, and evaluating sensornet protocols and systems. Having complete control of the simulated environment allows us to run reproducible experiments, explore parameter spaces, and disambiguate causes of error or undesirable behavior. The inherent difficulty in developing robust sensornet codes has led many tools to focus on system dynamics through real-code simulation [1, 10, 15, 22]. Very accurate system simulation allows users to test code paths. It does not, however, promise a representative execution environment. First and foremost, low-power wireless networks have many complex, rare, and difficult behaviors that protocols must address properly in order to be effective in practice [5, 6, 9, 21, 24]. Early studies noted that packet delivery rates are highly variable over distance [9, 24]. Many existing simulators have used the high-level packet delivery data from these experiments in their network models [10, 15]. This approach allows simulators such as TOSSIM and EmStar to have packet delivery behavior similar to the real world. However, as these simulators simulate loss rather than its causes, they are unable to easily or accurately model novel environments, concurrent transmissions, or variable packet sizes. Recent investigations into low-cost radio hardware have distinguished how many different factors, such as hardware calibration, interference, and orientation affect packet delivery [20]. In particular, these and other results [16, 21] have verified that packet delivery follows a simple SNR curve. Furthermore, these studies have shown that the RSSI of received packets (the S of the SNR) is often very stable over long periods. Taken together, these observations point at the causes of temporal variations in packet loss and bursty connectivity. Hardware variations cause node pairs to have different SNR curves, but for any given pair the curve is precise. As RSSI is generally stable over short periods, it is reasonable to conclude that the missing piece of the RF simulation puzzle is the environmental noise. With that in mind, in the context of this paper, we term any RF energy generated by sources outside the control of a protocol designer noise, while interference is RF energy that can be controlled. Unfortunately, simulating environmental noise is hard. Unlike hardware-based noise, which is typically modeled as additive white Gaussian noise (AWGN), environmental noise is often from packetbased devices. Section 2 shows how packet based noise appears as brief, strong, short-lived noise spikes which can be temporally correlated. To simulate this noise, we gather 1kHz noise traces using current 802.15.4 sensor node platforms and use these traces to generate statistical models of noise using three techniques, presented in Section 3: probabilistic sampling, closest-fit pattern matching (CPM), and a non-Gaussian random process. We simulate radio packet delivery with these noise models using an SNR/PRR curve. Whenever a simulated node receives a packet, it samples a noise reading from its model to determine the SNR and computes the packet delivery probability. We implemented these three techniques as replacements of the TinyOS 2.0 TOSSIM simulator radio model. Section 4 evaluates how well the algorithms as well as wireless protocol simulators such as EmStar [10], TOSSIM 1.x [15], TOSSIM 2.x [4], and ns2 [2] simulate packet delivery dynamics for good, intermediate, and bad links. To capture temporal packet dynamics, we evaluate simulation accuracy using conditional packet delivery functions (CPDFs), which describe the probability a packet will be delivered successfully after n consecutive failures or successes. We compare CPDFs using the Kantorovich-Wasserstein distance [11]. Our results indicate that existing techniques are sufficient for environments with little noise from external transmitters, but for noisy environments CPM significantly outperforms all other approaches.

2. BACKGROUND

Accurately simulating wireless packet delivery is a long-standing challenge in sensornet research. Early studies used a unit-disc model, which defines transmission range as a simple disc of binary connectivity; nodes within a range r successfully receive packets, while those outside r do not. This model, while simple to implement and reason about, has little basis in reality. Experimental studies have shown that connectivity varies tremendously over distance [9, 24] and that many links fall into a “grey region” of intermediate quality. In response to the observation that connectivity is more complex than what simple disc or RF propagation models (such as those used in ns2 [2]) can express, sensornet simulators have for the most part adopted an empirical approach. Rather than trying to model the underlying causes of RF connectivity, such as interference, noise, and RF propagation, an empirical approach merely recreates packetlevel behavior. For example, TOSSIM takes inter-node distances and samples from a packet reception rate (PRR) distribution to determine the connectivity between a pair of nodes [15]. This simple approach can capture a large number of real-world complexities, such as link asymmetries and highly variable spatial connectivity. However, it also makes simplifying assumptions that do not hold in practice. First and foremost, this approach assumes that every link is independent (they are sampled independently from the distance distribution), while real networks tend to have “bad” nodes with poor connectivity. This simplification causes discrepancies between simulation and testbed experiments. The EmStar system [10] avoids the independence problems of TOSSIM by having one of its radio models using PRR values measured in real-world networks [5]. This has the benefit of capturing effects such as poor receivers. The cost is that it can only simulate networks for which PRR has been measured. The EmStar and TOSSIM approaches assume that packet losses are independent (PRR does not change), but experimental results have shown that PRR varies significantly over time [5, 6]. Recent studies have begun to shed light on the underlying causes of the complex packet delivery behavior of real networks [20]. One important observation from these studies is that for a given node pair, there is a crisp SNR/PRR curve. Effects such as the reception grey region are caused by different pairs having different curves and signal strength variations. A hardware covariance matrix can capture these effects with reasonable accuracy [25]. Experimental studies of current sensornet platforms, such as the micaZ and telosB, have shown that signal strength is stable over short periods of time, but can have longer-term variations due to environmental conditions [16, 21]. However, computing PRR from an SNR curve requires the noise as well as the signal. Historically, the RF community has explored how to simulate signal propagation in tremendous detail [12, 19]. The underlying assumption in all of this work, however, is that the noise encountered is all additive white Gaussian noise (AWGN). If the spectrum is not shared with any other RF sources, then the signal propagation of simulated transmitters and AWGN can describe the entire channel. As sensornets often operate in unlicensed ISM bands, their spectrum is crowded with many conflicting transmitters. 2.4 GHz, the band used by micaz, telos, and imote2 nodes, is particularly crowded, as it is also occupied by 2.4 GHz phones, 802.11b/g, microwave ovens, and Bluetooth, all of which interfere significantly. Without these considerations, SNR-based simulation models are fundamentally limited in accuracy.

3. NOISE CHARACTERIZATION

This section describes three approaches to statistically characterize noise traces: naive sampling, closest-pattern matching and correlation distortion. In this paper we broadly define environmental noise as the RF interference produced by any unsimulated RF sources in the node’s spectrum in addition to the thermal agitation of charge carriers in the electronic circuits and devices [14, 17]. The first approach, naive sampling, generates a probability distribution of a noise trace and simply samples from this distribution. Naive sampling is fast and simple, but makes the assumption that noise samples are independent. The second approach, closest-fit pattern matching (CPM), computes the conditional probability distribution of noise values given k previous noise readings. It generates a noise value based on the matching series and defaults to the mode when no measured series matches. The third approach uses a non-Gaussian random model with the correlation distortion method in order to describe noise as a random process. This can capture temporal dynamics, but is computationally expensive and has difficulty with signals that are highly non-Gaussian.

4. EXPERIMENTAL METHODOLOGY

We measure simulation accuracy by comparing conditional packet delivery functions (CPDFs). A conditional packet delivery function describes the probability that a packet will be received successfully given n previous failures or successes. For example, the CPDF of node A to node B, cAB, of 5 (cAB(5)) is the probability that B will receive a packet from A after 5 consecutive failures, while secutive successes. If packet losses are independent, then the CPDF is for the most part uniform; if packet losses are bursty, then the CPDF is non-uniform. We compare CPDFs using a rigorous theoretical measure, the Kantorovich-Wasserstein distance [11]. The Kantorovich-Wasserstein distance has been widely used in theoretical statistics and image signal processing applications to show the similarity of probability distributions. To calculate the Kantorovich-Wasserstein (KW) distance as our evaluation metric, we used open-source codes for the Earth Mover’s Distance [18], which is equivalent to KW distance. Both quantify how much elements of two distributions would have to be shifted to make the two distributions equal. We do not use the Chi-squared test because CPDF values are not independent, and do not use the Kolmogorov-Smirnov test because CPDFs are not continuous functions. We use the Kantorovich-Wasserstein distance of CPDFs rather than measuring the noise itself because of the difficulty of comparing noise traces. Because our goal is to generate a representative and reusable model of an environment’s noise, rather than simply replay it, simulated noise will inherently differ from the measured noise. We found that comparing mathematical properties of simulated and real noise gave some indications that they might lead to similar packet behavior, but for almost every similarity measure between noise traces it is simple to create a degenerate case that is mathematically similar but behaves completely differently. We therefore measure similarity in terms of the behavior we seek to recreate: packet delivery.

5. EVALUATION

We generated many traces with a variety of signal strengths in order to measure packet delivery behavior for good, bad, and intermediate links. For the most part, low-rate traffic and quiet environments behave in a simple fashion: packet losses due to noise are independent. For example, packet losses from the Grand Canyon trace would be due to AWGN and the SNR curve, both of which cause independent losses rather temporally correlated bursts of loss. In low-rate conflicting traffic environments, clock skew as well as differing intervals between conflicting sources and sensor nodes make periodic losses possible but highly unlikely. Figure 5 shows that for an intermediate link, packet losses are independent with respect to consecutive packet losses. This means that low 802.11b traffic does not lead to bursty packet errors and the temporal effects are negligible in a low-traffic 802.11b environment. Therefore, other simulation methods do not capture the differences between these two types of environments.

6. DISCUSSION AND CONCLUSION

This paper takes a step forward in simulating packet delivery by modeling difficult noise signatures from measurements. Rather than depend on a simplified and abstract view of an environment, the models strive to recreate the behavior of a real network. This allows us to simulate a particular network, rather than a fictional one. However, modeling noise as we have presented here has three simplifying assumptions; relaxing each assumption is in and of itself a complete research topic which we plan to explore in the future. First, by modeling each node’s noise traces independently, these models ignore the fact that noise is spatially dependent. If node A hears a noise spike, nearby node B will hear it as well. In one formulation, this means that node A’s noise value not only depends on its prior noise values but also the noise values of its nearby neighbors. Capturing these dependencies requires information on where the noise sources are. Another formulation is infer noise sources from correlated measurements and simulate those sources. Second, while packet-based noise changes are abrupt and therefore contribute to short-term changes in SNR and correlated losses, there are also longer-term changes due to gradual RSSI trends [16, 21]. Concurrently simulating both phenomena – brief noise spikes and long-term RSSI swings – would allow simulation to accurately capture both long-term and short-term dynamics. Furthermore, CPM only handles short-term noise bursts; characterizing longer-term noise trends (busy and quiet periods) would allow longer-running simulations that address another level of dynamism. Finally, all of our results are based on a single (albeit dominant) low-power radio technology, and we have not observed all forms of 2.4GHz interference. Microwave ovens and analog 2.4GHz devices, for example, produce relatively long (seconds-minutes) periods of high interference, while Bluetooth’s frequency hopping undoubtedly has complex and interesting dynamics. Evaluating our approaches in other ISM bands (e.g., the 433 and 915 MHz CC1000 radio on the mica2 platform) would better establish whether or not they are general or particular to the crowded 2.4GHz band. Our experimental results demonstrate that using an SNR curve with a closest-fit pattern matching noise model can significantly increase wireless packet delivery simulation accuracy. Furthermore, we can easily generate CPM models from real noise traces, allowing tools to effectively represent real-world environments in simulation. This shifts the focus of simulation from hypothetical network configurations to capturing real-world behavior based on real-world data. This allows us to quantify simulation accuracy, allowing us to avoid the pitfall of simulation results which do not reflect the real systems we are trying to improve.

REFERENCES
  1. A. Cerpa, N. Busek, and D. Estrin. Scale: A tool for simple connectivity assessment in lossy environments. Technical Report 0021, Sept. 2003.
  2. A. Cerpa, J. L. Wong, M. Potkonjak, and D. Estrin. Temporal properties of low power wireless links: Modeling and implications on multi-hop routing. In Proceedings of the Sixth ACM International Symposium on Mobile Ad Hoc Networking and Computing (MOBIHOC’05), 2005.
  3. D. Conner and J. Hammond. Modeling of stochastic system inputs having prescribed distribution and covariance functions. In Applied Mathematical Modeling, volume 3, 1979.
  4. R. Deutsch. Nonlinear Transformations of Random Processes. Prentice-Hall, 1962.
  5. D. Ganesan, B. Krishnamachari, A. Woo, D. Culler, D. Estrin, and S. Wicker. An empirical study of epidemic algorithms in large scale multihop wireless networks. UCLA Computer Science Technical Report UCLA/CSD-TR 02-0013, 2002.
  6. L. Girod, T. Stathopoulos, N. Ramanathan, J. Elson, D. Estrin, E. Osterweil, and T. Schoellhammer. A system for simulation, emulation, and deployment of heterogeneous sensor networks. In Proceedings of the 2nd international conference on Embedded networked sensor systems (SenSys), pages 201–213, New York, NY, USA, 2004. ACM Press.
  7. C. Givens and R. Shortt. A class of wasserstein metrics for probability distributions. In Michigan Math. J., volume 31, pages 231–240, 1884.
  8. H. Hashemi. The Indoor Radio Propagation Channel. Proceedings of the IEEE., 81(7), July 1993.
  9. G. Johnson. Constructions of particular random process. In Proceedings of the IEEE, volume 82, pages 270–285, 1994.
  10. J. Johnson. Thermal agitation of electricity in conductors. Physics Review, 32(97), 1928.