Using treemaps for variable selection in
spatio-temporal visualisation
Aidan Slingsby, Jason Dykes, Jo Wood
Source: http://ivi.sagepub.com/content/7/3-4/210.full.pdf
Abstract
We demonstrate and reflect upon the use of enhanced treemaps that incorporate spatial and temporal ordering for exploring a large multivariate spatio-temporal data set. The resulting data-dense views summarise and simultaneously present hundreds of space-, time-, and variable-constrained subsets of a large multivariate data set in a structure that facilitates their meaningful comparison and supports visual analysis. Interactive techniques allow localised patterns to be explored and subsets of interest selected and compared with the spatial aggregate. Spatial variation is considered through interactive raster maps and high-resolution local road maps. The techniques are developed in the context of 42.2 million records of vehicular activity in a 98 km² area of central London and informally evaluated through a design used in the exploratory visualisation of this data set. The main advantages of our technique are the means to simultaneously display hundreds of summaries of the data and to interactively browse hundreds of variable combinations with ordering and symbolism that are consistent and appropriate for space- and time-based variables. These capabilities are difficult to achieve in the case of spatio-temporal data with categorical attributes using existing geovisualisation methods. We acknowledge limitations in the treemap representation but enhance the cognitive plausibility of this popular layout through our two-dimensional ordering algorithm and interactions. Patterns that are expected (e.g. more traffic in central London), interesting (e.g. the spatial and temporal distribution of particular vehicle types) and anomalous (e.g. low speeds on particular road sections) are detected at various scales and locations using the approach. In many cases, anomalies identify biases that may have implications for future use of the data set for analyses and applications. Ordered treemaps appear to have potential as interactive interfaces for variable selection in spatio-temporal visualisation.
Introduction
Large multivariate spatio-temporal data sets – for example, traffic flow data or mobile telephone logs – are likely to contain structure and patterns that provide useful information about characteristics of the measured phenomena. The identification and comparison of such patterns may assist in understanding these phenomena and may be used for a number of purposes including research, gaining competitive advantage, planning and other operational tasks. The complexity that arises from the interactions among the spatial, temporal and attribute aspects in such data sets [1,2] and the imprecise goals that are often associated with their initial exploration [3] make the identification of patterns and structure challenging [1–3]. Information visualisation and geovisualisation techniques are an increasingly
Data
The London-based courier company eCourier [9] collected 42.2 million GPS points from delivery vehicles (an average of 48 vehicles per day) at approximately 10-second intervals between June 2006 and May 2007 in a 98 km² area of Central London (Figure 1). Each record contains the vehicle’s position, speed, vehicle type (van, large van, motorbike, large motorbike or bicycle) and the time at which it was collected.
This data set is interesting for a number of reasons. Firstly, its analysis may be of specific value to the courier company concerned [10] to help optimise vehicle allocation, scheduling and routing. Secondly, the techniques may also be of more general interest; for example, transport authorities assess patterns of traffic flow to help set policy to reduce congestion [11,12]. Thirdly, it typifies a trend whereby large data sets such as those that are volunteered or derived from computer logs are released through open APIs [13]. This is increasing opportunities for (geo)visual analysis [14] and is fuelling the emerging field of visual analytics [8,15]. In this context, visual data exploration techniques can be used to identify patterns of interest that may relate to significant characteristics of the phenomenon under study. Importantly, they may also help draw attention to biases and data quality issues in large informal data sets of unknown quality.
Visualisation challenges
Large spatio-temporal multivariate data sets pose substantial visualisation challenges in terms of data complexity – both the interactions among spatial, temporal and attribute aspects [2] and the complex relationships between the data collected and the phenomena under consideration. Large data sets (millions of records) are prone to the trade-off between the potential for visual occlusion caused by overplotting and the loss of resolution inherent in highly aggregated summaries [1]. Memory and computation overheads are significant when processing large data sets, which can make it difficult to achieve the ‘instant’ response times that interactive querying requires to support visualisation [2]. This applies to the eCourier data set through the following characteristics:
• Large size: The 42.2 million records pose challenges for providing interactivity.
• High spatial density: The 1 km grid squares contain between 0.3 and 227 points per metre of road (mean = 20) positioned on 1831 km of road network comprising 28,838 road segments. Visual occlusion is a problem when displaying such data, making the use of colour symbolism on the road geometry ineffective at the global scale.
• Multivariate nature: A number of variables can be derived from the data. We use hour of the day, day of the week and 1 km² grid cells as categorical spatial and temporal variables in our analysis, alongside the recorded vehicle type. Ignoring space for the moment, we can select from 1199 possible combinations of single values of the three categorised variables (24 hours of the day, 7 days of the week, 5 vehicle types; 5 + 7 + 24 + (5 × 7) + (5 × 24) + (7 × 24) + (5 × 7 × 24)). This number rises markedly when we consider geographic subsets – the 98 possible grid squares for example. Existing techniques such as small multiples or animation are not able to give access to all these subsets simultaneously.
• Spatio-temporal aspects: Spatial and temporal data have inherent ordering essential for comparison, interpretation and assimilation. These aspects must be graphically represented so that subsets can be compared in their spatial and temporal contexts.
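The figure of 1199 combinations quoted in the list above can be checked with a few lines of Python. This is only an illustrative sketch: the function name `count_subsets` and the dictionary keys are ours, and the counting rule (fix one value for each variable in every non-empty combination of variables) is our reading of the sum given in the text.

```python
from itertools import combinations
from math import prod

# Number of categories in each non-spatial variable, as stated in the text.
sizes = {"vehicle_type": 5, "day_of_week": 7, "hour_of_day": 24}

def count_subsets(sizes):
    """Count subsets obtained by fixing a single value for each variable
    in every non-empty combination of the variables."""
    names = list(sizes)
    total = 0
    for r in range(1, len(names) + 1):
        for chosen in combinations(names, r):
            total += prod(sizes[name] for name in chosen)
    return total

print(count_subsets(sizes))  # 1199

# Equivalent closed form: each variable is either left free or set to one
# of its values; subtract 1 for the case where nothing is fixed.
print(prod(n + 1 for n in sizes.values()) - 1)  # 1199
```

Under the same counting rule, adding the 98 grid squares as a fourth variable would give (98 + 1) × 1200 − 1 = 118,799 subsets, which illustrates how quickly the number grows once space is included. That last figure is our own arithmetic, not one reported in the text.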
Approach
The challenge is to develop views and interactions that provide access to information about the relationships between these subsets and their spatio-temporal characteristics in a manner that aids comparison and assimilation. Our approach has two elements – broadly following Shneiderman’s [16] ‘information-seeking mantra’. Some innovation was required in order to address the challenges outlined above in our efforts to explore the eCourier data set.
• Overview – Treemaps with spatial and temporal ordering simultaneously provide rich data-dense summaries of hundreds or thousands of subsets of the data set, using consistent ordering that reflects its spatio-temporal nature, without visual occlusion.
• Zoom, filter and details on demand – An interactive design for exploration through which variable-constrained subsets of interest can be selected for inspection as: (a) raster maps for the entire area – no visual occlusion; (b) road maps for individual grid squares – an appropriate scale for displaying road segments such that the symbology is discernible; (c) treemaps for comparing local data with global summaries.
We use four visual techniques in our design and develop links between them. The first is used for an overview and the latter three for zooming, filtering and obtaining details on demand. Figure 5 shows a screenshot of our design – an interactive prototype – that contains:
• Spatial treemaps, with fixed-size nodes and spatial and temporal ordering – coloured by traffic volume and by speed. These show overall spatio-temporal patterns in vehicle use. We term these layouts ‘maptrees’ where the top-level of the hierarchy is a spatial unit (Figure 4).
• Interactive attribute treemaps (top left of Figure 5) sized by global traffic volume, coloured by local (1 km² grid squares) traffic volume or speed, allowing the comparison of global characteristics with local characteristics of a subset. Interesting subsets of data represented by nodes and leaves in the tree can be selected (e.g. vans on Monday) for display as raster maps (to show the spatial distribution of the subset’s traffic volume or average speed) and road maps (for individual grid squares).
• Interactive raster maps (bottom of Figure 5) show the spatial variation of subsets selected in the interactive attribute treemap. They facilitate the selection of individual grid squares, allowing the subset summary to be compared with the global traffic volume (in the interactive treemaps) and mapped as a road map.
• Road maps (top right of Figure 5) are shown for the 1 km² grid cells selected in the raster map and map the traffic volume or average speed for each road segment. Brushing the road segments numerically displays details on demand – the number of data points and average speed for that segment.
This approach attempts to address the challenges posed by large multivariate spatio-temporal data sets using visual encodings that do not exhibit visual occlusion (treemaps and raster maps), give access to coarse and fine-grained aggregates (subsets) and support their visual comparison. It is scalable due to the use of pregenerated summaries at hundreds or thousands of levels of aggregation. Our treemap algorithm is implemented as a Java application [20] with output in SVG and a number of image formats. The designs are encoded in SVG and interaction is provided using the DOM with JavaScript.
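To make the idea of pregenerated summaries concrete, the following Python sketch builds a lookup table of traffic volume and average speed for every combination of fixed variables in a single pass over the records. It is a minimal illustration under our own assumptions: the record layout, the names `FIELDS`, `precompute` and `summary`, and the sample values are all hypothetical, and the text does not describe how the authors' summaries were actually generated or stored.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical records: (vehicle_type, day_of_week, hour_of_day, speed).
records = [
    ("van", "Mon", 9, 20.0),
    ("van", "Mon", 9, 30.0),
    ("van", "Tue", 14, 10.0),
    ("bicycle", "Mon", 9, 12.0),
]
FIELDS = ("vehicle_type", "day_of_week", "hour_of_day")

def precompute(records):
    """Return {((field, value), ...): [count, speed_sum]} for every
    combination of fixed variables, including the empty (global) key."""
    table = defaultdict(lambda: [0, 0.0])
    for *values, speed in records:
        pairs = list(zip(FIELDS, values))
        # Each record contributes to 2**3 = 8 keys.
        for r in range(len(pairs) + 1):
            for key in combinations(pairs, r):
                cell = table[key]
                cell[0] += 1
                cell[1] += speed
    return table

def summary(table, **fixed):
    """Look up (count, mean speed) for a subset such as vans on Monday."""
    key = tuple((f, fixed[f]) for f in FIELDS if f in fixed)
    count, speed_sum = table.get(key, (0, 0.0))
    return count, (speed_sum / count if count else None)

table = precompute(records)
print(summary(table, vehicle_type="van", day_of_week="Mon"))  # (2, 25.0)
```

Because each lookup is a single dictionary access, selecting a node in an interface only needs to fetch a stored value rather than rescan the point data. The cost is the reduced flexibility that comes with fixing the set of subsets in advance.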
Treemaps for showing multivariate data
Treemaps display hierarchical data [17] by recursively and exhaustively subdividing space at each level in a
Treemap layout and order
Information visualisation techniques project data onto Euclidean two-dimensional planes for display, a process known as spatialisation [24]. Although geovisualisation techniques can use well-established cartographic coordinate systems, the way in which non-spatial data should be spatialised is less clear-cut. Skupin and Fabrikant [25] reasonably argue for the use of consistent spatial metaphors in layouts, such as Tobler’s so-called ‘First Law of Geography’ [26] where proximity can be associated with relatedness, in order to improve cognitive plausibility.
The recursive subdivision of space at each level in the treemap hierarchy has the effect of isolating layouts within each level, leading to discontinuities between hierarchies and abrupt contraventions of Tobler’s ‘First Law’. For example, the relative position of ‘Friday vans’ and ‘Friday motorbikes’ is arbitrary in Figure 2 because they are ordered independently of each other, making comparison of Friday for different vehicle types difficult. Concerns about the cognitive load imposed on the user that such discontinuities and inconsistencies may have are expressed in the literature [25]. We acknowledge and address some of these concerns with our enhanced treemaps that use consistent layout, appropriate ordering and interaction, and also argue that some concerns relating to treemap usability [27–29] involve very different tasks and contexts to those under consideration here.
Node size is an appropriate ordering criterion for comparing magnitudes (Figure 2). However, where data sets contain spatial and temporal structure, categories and category combinations may have inherent orderings in time and space. Where categories are ordered in one dimension (e.g. ‘day of week’) we apply an ‘ordered squarified’ algorithm [20], ordering from top left to bottom right. Where categories are ordered in two dimensions (e.g. spatial subsets) we use spatial ordering [20]. Both these techniques produce consistent ordering within and between hierarchies and thus more cognitively plausible layouts that support the comparison of overall patterns across and within hierarchies. These techniques mean that while hierarchical information is maintained in the layout, spatial relationships in the treemap relate more closely to one or two-dimensional relationships in our variables. We argue that these enhancements reduce the cognitive load and increase the plausibility of this particular spatialisation.
Temporal ordering
For temporally consistent ordering, we use treemap leaves of constant size. While we lose one information carrying dimension, by using size consistently to give every subset equal prominence, we gain another – order. Additionally, low volumes of traffic – which may be as worthy of further exploration as high volumes – are more easily detected. Figure 3 shows the same treemaps presented in Figure 2 but with fixed leaf sizes and temporal ordering (midnight is at the top left; vehicle type ordering is arbitrary but consistent). Since we have lost the property of size for conveying numerical values we need a second treemap – coloured by traffic volume (purple).
The traffic volume treemap shows striking temporal patterns. The repeated diagonals are expected, showing that most traffic occurs during daylight hours on weekdays; however, some patterns are perhaps less expected. Van traffic appears to be heavy at all times, with a large increase in daytime traffic in comparison to night traffic (see logarithmic scale bar). The speed treemap shows that patterns of average speed are not so strongly correlated with time, but that van traffic, motorbike traffic and large motorbike Saturday traffic tend to be slower at night, a surprising finding that may be worthy of further investigation – are there spatial patterns to this trend, for example? While adjacencies of leaf nodes at the boundaries between branches are fairly arbitrary, temporal ordering within branches introduces an ordering consistency that enables temporal patterns to be identified even if the detail of the leaves is not visible.
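The effect of fixed leaf size and consistent temporal ordering can be sketched in Python as follows. This is a simplified stand-in, assuming a plain regular grid filled in reading order; it is not the ‘ordered squarified’ algorithm cited in the text, and the function names and the choice of 6 columns for the hours are ours.

```python
import math

def grid_layout(n, x, y, w, h, cols=None):
    """Place n equal-sized cells in reading order (left to right, top to
    bottom) inside the rectangle (x, y, w, h)."""
    cols = cols or math.ceil(math.sqrt(n))
    rows = math.ceil(n / cols)
    cw, ch = w / cols, h / rows
    return [(x + (i % cols) * cw, y + (i // cols) * ch, cw, ch)
            for i in range(n)]

def day_hour_layout(x, y, w, h):
    """Lay out 7 days, each holding 24 hourly leaves of constant size,
    with hour 0 (midnight) at the top left of every day."""
    leaves = {}
    for day, (dx, dy, dw, dh) in enumerate(grid_layout(7, x, y, w, h)):
        for hour, rect in enumerate(grid_layout(24, dx, dy, dw, dh, cols=6)):
            leaves[(day, hour)] = rect
    return leaves

leaves = day_hour_layout(0, 0, 180, 180)
print(len(leaves))        # 168 leaves: 7 days x 24 hours
print(leaves[(0, 0)])     # (0.0, 0.0, 10.0, 15.0)
```

Because every branch uses the same layout function, a given hour occupies the same relative position within each day, and a given day the same relative position within each vehicle type. That positional consistency is what allows repeated patterns such as the weekday daytime diagonals to be read across branches.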
Spatial ordering
Treemaps usually employ one-dimensional ordering in two-dimensional space. This is the case in Figures 2 and 3 for the levels of the hierarchy relating to day and hour. Where spatial data are involved, two-dimensional ordering can be used, resulting in spatial treemaps [20]. By subsetting the data set into geographic units (such as the 1 km² square shown in Figure 1), inserting the spatial subsets into the base of the variable hierarchy (grid square name and location), fixing the node size and using the correct aspect ratio, a spatially ordered treemap can be produced. This might be termed a ‘maptree’, because as shown in Figure 4, this is effectively a (geographical) map of localised versions of the treemap shown in Figure 3.
The consistent ordering at the appropriate levels of the hierarchy can be used to draw attention to spatial and temporal patterns across the entire data set. For example, it is not surprising to note that the highest traffic volumes are around the centre, but there are also high volumes of traffic at certain times of day in the east. Temporal patterns for each grid square can be seen; for example, grid squares in the centre and towards the southwest have higher volumes of van traffic at all times (upper left of each grid square) and high daytime volumes of large motorbike use (lower right of each spatial square), but nighttime and weekend van traffic is much lower in the east. Bicycle traffic (top right of each spatial square) is only found in the centre and towards the northwest and motorbike traffic is almost non-existent in the northeast.
The speed treemap shows lower speeds (as expected) in the centre, but isolated grid squares can be picked out containing consistently high average speeds. The fast squares in northeast and west London have high speeds associated with vans (top left of each spatial square) and large motorbikes (bottom right of each spatial square). These high speeds are associated with main roads (M40 in west; A12 in northeast; see Figure 1). In the south and just north of the centre, it is only large motorbike traffic with particularly high average speeds. The initial visualisation suggests that these combinations of space, time and attribute may warrant further investigation.
Although we focus on false attribute hierarchies here, inherent hierarchies relating to different granularities of space and time are a common consideration (e.g. counties within countries). The effect of spatial granularity on statistical aggregates can be explored using spatial treemaps as hierarchical cartograms [20], by creating hierarchies from coarse to fine granularities; for example, where colour would show values of leaf nodes (finest granularity) and size represents relative values within and across whole hierarchies (if the value is additive through spatial granularities – as is traffic volume). However, because we have chosen to fix node size, little benefit would be gained from looking at different spatial granularities in our ‘maptrees’, other than through interactively changing the hierarchy depth. Instead, we choose the two spatial granularities, 1 km² grid squares (we found this spatial resolution to be helpful for the maptrees and raster maps and they correspond to National Grid mapping squares) and road segments.
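The placement step behind a ‘maptree’ can be sketched in Python as follows: each grid square is drawn at its geographic position, and a local treemap is then nested inside it. This is a minimal sketch under our own assumptions. The 14 × 7 arrangement of the 98 grid squares, the canvas size and the function names are hypothetical; the text gives only the number of squares, not their arrangement.

```python
def maptree_cell(col, row, n_cols, n_rows, width, height):
    """Rectangle (x, y, w, h) for grid square (col, row).

    Row 0 is taken to be the southernmost row, while screen y grows
    downward, so rows are flipped to keep north at the top.
    """
    w, h = width / n_cols, height / n_rows
    return (col * w, (n_rows - 1 - row) * h, w, h)

def nest(parent, child):
    """Map a child rectangle given in unit coordinates (0..1) into the
    parent rectangle, e.g. one leaf of a local treemap."""
    px, py, pw, ph = parent
    cx, cy, cw, ch = child
    return (px + cx * pw, py + cy * ph, cw * pw, ch * ph)

# Hypothetical 14 x 7 grid on a 1400 x 700 canvas; equal cell width and
# height preserves the aspect ratio of the square grid cells.
south_west = maptree_cell(0, 0, 14, 7, 1400, 700)
print(south_west)                              # (0.0, 600.0, 100.0, 100.0)
print(nest(south_west, (0.5, 0.5, 0.5, 0.5)))  # (50.0, 650.0, 50.0, 50.0)
```

The local layout passed to `nest` would be the same for every grid square, so that, for instance, the lower-right quarter of each square always refers to the same branch of the attribute hierarchy.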
Interactive methods
The overview techniques have resulted in the identification of patterns and related ideas that warrant further exploration. To support this activity through an iterative process of overview, zoom, filter and details-on-demand [16] we use a series of interactive techniques (listed in the next subsection) to link treemaps, spatial treemaps, raster maps and road maps. Doing so in response to ideas generated through the overview treemaps and maptrees allows us to examine interesting subsets of data in detail and at a higher resolution.
A screenshot of the iteratively developing design used in our analysis is shown in Figure 5. It employs open technologies for high-level scripting, including HTML, SVG, JavaScript and CSS in a manner that has evolved as our data exploration has progressed. It contains a series of novel aspects and reveals structure in the eCourier data set for various variable combination subsets at a number of scales.
Hierarchy switching and changing hierarchy depth
We have drawn attention to the need to switch hierarchy and change the hierarchy depth. Our design allows different levels of the attribute hierarchy to be selected. In Figure 5, the interactive treemap is shown at level 1 – aggregating by transport type and in this example enabling all van traffic to be selected. In Figure 6 (top), all three levels of the hierarchy are shown, each with a local colour scheme showing the noise associated with
Treemaps for comparing global and local patterns
Treemaps offer the properties of colour, size, labelling and order to convey data values and other information. Although the dependency between size and order is problematic when using treemaps for global summaries (positional inconsistencies between hierarchies), size and colour can be usefully employed for comparing global with local patterns. In Figure 6 (bottom), we use size and colour to represent global and local traffic volume, respectively, for a variable subset and a grid square. Local colouring is selected by clicking cells in the raster map. Where large nodes have dark shading or small nodes have light shading, the global (all traffic for the whole area) and local (traffic volume for the selected subset in the selected grid square) patterns are most similar. The treemaps in Figure 6 (top) are coloured by average speed, so that large dark nodes represent the situation where both global traffic
Raster and road maps
Raster and road maps are used to consider spatial patterns of particular variable combinations. The high number (28,838) of road segments in central London, their variation in length and dense geographical arrangement make it difficult to discern all roads in a single overview, let alone represent additional attribute information for visualisation through colour. Broad spatial patterns in filtered subsets of large data sets can be considered and compared using traditional raster maps. Where we need to inspect the geography of a grid square in detail, we use generalised road maps that summarise traffic volume and speed on particular segments. A 38% random sample of GPS points was snapped to nearest road segments in the examples provided here.
Maps for any variable combination can be interactively selected by clicking nodes in the treemaps in our design in Figure 5; so for example, clicking the ‘van’ in the treemap will shade the raster map according to ‘van’ traffic. Changing the depth of the treemap hierarchy would enable us to generate raster maps based on the values of more than one variable, for example motorbikes on Thursday at 13:00–14:00 is shown in Figure 7D (left). Selecting ‘GrnwdDk’ will produce a road map (Figure 7D, right) and colour the treemap according to the local situation there. These views allow us to study the spatial structure of this subset and compare with the global situation.
Figure 7 (A, B and C; left) shows raster maps of all van traffic (the top-level node in our treemap). Blackfriars dominates in terms of van numbers and has an unusually low average speed. Our interactive techniques allow us to subsequently select the Blackfriars grid square and view its local treemap and map its traffic summarised by road segment (Figure 7A, B and C; right). The logarithmic scale hides the large variation at the upper end of the scale, but when a local linear scale is used (Figure 7B, right), it becomes clear that the majority of vehicles are found on one no-through road. Since this clearly does not represent through traffic, this is consistent with the anomalously low speed observed.
Inspection of individual values and comparison with global trends
Our design provides important details on demand. Values associated with individual symbols are displayed when the symbols are touched, enabling us to compare average speed with traffic volume – effectively the sample size. This is important where outliers may be caused by particularly small samples. Figure 7 (bottom) shows the average speeds for motorbikes on Sunday at 13:00–14:00, representing a small subset of the data. The raster map shows that traffic in the ‘GrnwdDk’ (6,12) grid square has a particularly high average speed. Inspecting and querying the road map shows that this high speed is associated with one road segment with a single GPS point. This outlier is likely to have a significant effect on the graphics.
Maptrees can be compared with the raster map so that the data-dense overviews can inform the visual data exploration process (see semi-opaque maptree and ‘opacity’ control in Figure 5). Various images of these overviews can be loaded (see ‘image’ control in Figure 5) including maptrees of standard deviation and coefficient of variation of speed, enabling us to account for outliers such as the ‘GrnwdDk’ road segment.
These linked views and coordinated interactive techniques allow us to identify specific instances and broad trends through aggregated summaries at various levels as well as detailed information about data points at different resolutions.
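The check described above, reading an average speed alongside its sample size and variability, can be sketched in Python as follows. The function name, the returned field names and the `min_points` threshold of 30 are our own illustrative assumptions; the text does not specify a threshold.

```python
from statistics import mean, pstdev

def speed_summary(speeds, min_points=30):
    """Summarise the speeds recorded for one subset or road segment.

    Returns the sample size, mean speed, coefficient of variation
    (standard deviation divided by the mean) and a flag saying whether
    the sample reaches the hypothetical min_points threshold.
    """
    n = len(speeds)
    if n == 0:
        return {"n": 0, "mean": None, "cv": None, "reliable": False}
    m = mean(speeds)
    # A single point has no spread, so no coefficient of variation.
    cv = pstdev(speeds) / m if n > 1 and m else None
    return {"n": n, "mean": m, "cv": cv, "reliable": n >= min_points}

# A segment with a single fast GPS point: the mean alone looks striking,
# but the sample size shows it should not be trusted.
print(speed_summary([55.0]))
# {'n': 1, 'mean': 55.0, 'cv': None, 'reliable': False}
```

Reporting the count next to the mean mirrors the brushing behaviour described in the text, where the number of data points and the average speed are shown together for a road segment.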
Discussion
Visualisation challenges
Large, multivariate spatio-temporal data sets pose problems for the design and implementation of effective visual data exploration systems. Keim et al. [1] noted that many visual techniques either aggregate and summarise the data to a high degree (e.g. barcharts) or suffer from the visual occlusion of overlapping data points. We try to address these potential weaknesses with our combination of treemaps (that simultaneously show multiple aggregates ranging in aggregation from high to low, exhaustively tessellating space), raster maps (spatial aggregates that exhaustively tessellate space), maptrees (that combine these two approaches) and localised generalised road maps (aggregated to a finer spatial resolution at an appropriate spatial scale for the density of the road network).
If the only means of visual encoding were the traditional cartographic techniques of raster maps for large areas and road maps for localised areas, it would be difficult to compare summaries of so many data subsets. Techniques such as small multiples, animation or interactive selection could be used, but these would significantly restrict the number of comparisons that could be made. The simultaneous display of summaries of all these subsets through false hierarchies with spatial and temporal ordering is a novel use of the treemap that enables broad comparisons between subsets to be made and has enabled us to identify characteristics that are both expected and surprising. Adding interactivity to link treemap overviews to the more traditional techniques of raster maps and road maps allows interesting subsets to be inspected in detail. This provides particularly rich overviews and the opportunity to zoom, filter and obtain details on demand.
Treemap issues
Treemaps are data-dense and efficient data visualisation techniques for hierarchical data. Their popularity is due in part to the intuitive recursive division used to represent
Technologies
The technological challenge of working with such a large data set was considerable and this had an impact on some of our design decisions. We used a PostgreSQL [30] database to store, maintain and query the data and generate output for visualisation. The PostGIS spatial extensions [31] enabled us to snap GPS points to their closest road segment. The high computational overhead associated with this operation forced us to use a randomly sampled 16 million point subset (38%). Initial experiments showed even small random samples appeared to be representative of the data set as a whole (we tried 1 million, 3 million and 16 million point samples). The main problem is that some subsets have sample sizes that are too small for meaningful summaries to be generated (as apparent in Figure 7D).
The large size of the data set necessitated the precomputation of all the numerical summaries used to generate graphics and for incorporation into the interactive design (this can be scripted and generated without intervention), reducing the flexibility of the approach somewhat. We chose to use the open technologies of SVG and JavaScript to build our interactive system and found these to perform adequately, even though thousands of pregenerated numerical summaries need to be provided to the user on demand. A significant limitation of the technologies used is that although they are based on well-documented standards, they are inconsistently implemented by browsers. They work well in Microsoft Internet Explorer under Windows with Adobe’s SVG plugin and adequately in Safari on Mac OS X, but more implementation work is required for consistent interpretation by more browsers.
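The snapping operation mentioned above can be illustrated with a small Python sketch that assigns a point to its nearest road segment in planar coordinates. This is not the PostGIS query used by the authors, which the text does not show; the function names, the brute-force scan and the toy segments are our own assumptions.

```python
import math

def point_segment_distance(p, a, b):
    """Shortest distance from point p to the line segment a-b (planar)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def snap(p, segments):
    """Return the id of the segment nearest to p.

    segments maps a segment id to its two end points (a, b).
    """
    return min(segments,
               key=lambda sid: point_segment_distance(p, *segments[sid]))

# Two hypothetical parallel roads, 5 units apart.
segments = {"A": ((0, 0), (10, 0)), "B": ((0, 5), (10, 5))}
print(snap((3, 1), segments))  # A
print(snap((3, 4), segments))  # B
```

A naive scan like this one compares every point with every segment. For 42.2 million points and 28,838 segments that would be on the order of 10^12 distance evaluations, which gives a feel for why the operation is expensive and why sampling was attractive, although the actual cost depends on how the database query and any spatial indexing were set up.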
Further work
The large size of our data set has had an impact on the design and functionality of our system, but we are looking at ways in which we can add more interactive filtering. Combining some of the relatively arbitrary subsets used here (e.g. vans and large vans, 10:00–11:00 and 11:00–12:00 or Angel and Farringdon) would not require a large computational overhead and could be achieved using the technologies and methods we employ – allowing the data set to be studied at different levels of granularity. Filtering that involves re-aggregating the original point data – such as filtering out subsets that contain few data points or considering trajectories with particular characteristics – is technically more challenging. Such queries are likely to be useful, and work to explore some of these possibilities is ongoing.
We are also looking at techniques that measure similarities and differences between subsets, rather than relying entirely on visual inspection. This might help alert the user to potentially interesting combinations and would draw this work closer to that of the geovisual analytics research agenda [8].
We have used established theory, our own ideas and experience, and the published techniques and suggestions of others to develop these methods. In so doing we have developed our knowledge of the eCourier data set. We are engaging with other users of the eCourier and similar data sets to establish whether such techniques can help meet the goals of their specific exploratory analysis tasks.
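The reason that combining existing subsets is cheap can be shown with a short Python sketch: if each subset is stored as a count and a mean speed, the combined summary follows from those two numbers alone, without returning to the point data. The function name `merge` and the figures used are hypothetical, and the sketch assumes that the subsets being combined are disjoint.

```python
def merge(*summaries):
    """Combine (count, mean_speed) summaries of disjoint subsets.

    Counts add directly; the combined mean is the count-weighted
    average of the subset means.
    """
    total = sum(count for count, _ in summaries)
    if total == 0:
        return (0, None)
    weighted = sum(count * mean for count, mean in summaries if count)
    return (total, weighted / total)

# Hypothetical summaries for vans and large vans in one grid square.
vans = (100, 20.0)
large_vans = (300, 40.0)
print(merge(vans, large_vans))  # (400, 35.0)
```

Statistics that depend on the individual points, such as a median or a filter on trajectory characteristics, cannot be recovered in this way, which is consistent with the text's observation that re-aggregating the original point data is the harder case.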
Conclusion
This work was motivated by the desire to interpret and evaluate a large data set representing the characteristics of eCourier vehicle traffic through an open API [9]. The data set has the challenging properties of being large and multivariate with a dense spatial structure but likely to contain strong spatio-temporal patterns relating to traffic usage in London. We have designed, developed and reflected upon novel techniques for the visual exploration of this large and complex multivariate spatio-temporal data set that has posed substantial challenges in terms of generating meaningful aggregated summaries, providing access to specific and selected detailed information and interactively linking these two processes to support exploration.
Our approach involves providing a rich overview of the data set through new treemap techniques that visualise thousands of variable-constrained subsets simultaneously in a single data-dense graphic in which appropriate and consistent ordering is used to facilitate the identification of patterns through space, time and attributes. Our interactive design allows us to select hundreds of subsets of interest from these rich graphics for further investigation. This is achieved through linked views through which we can compare the summaries of selected subsets with global traffic volume (using interactive treemaps), explore their variation across space (using raster maps) and consider the detailed distribution on the road network in localised areas (using road maps). Our enhancements to the ordering mechanisms used in treemaps extend their suitability to geovisualisation techniques. We acknowledge some of the widely cited weaknesses of treemaps and address some of these by using appropriate and consistent node ordering, hierarchy switching and interactive techniques. These support hierarchy depth changing, brushing to highlight equivalent nodes across the hierarchy, interactive linking and techniques for zooming, filtering and providing details on demand (Figure 7).
We have shown examples of how these techniques have enabled us to find structure and patterns in the data, some of which may be difficult to identify with alternative methods. The most important benefit of the treemap technique is the multifaceted overview of the entire data set that can be generated, such as those shown in Figure 4, due to the consistent spatial and temporal ordering, which effectively provide visual signatures of the traffic characteristics. Although the larger traffic volumes and slower speeds in the central area are expected patterns, some of the differences in the traffic composition are not so expected. For example, the central grid squares in Figure 4 (top) show similar signatures of high van traffic (upper left) at all times of the day, that motorbikes (lower left) and large motorbikes (lower right) show strong diurnal variation and that their use varies spatially.
The interactive design provides an interface that makes it possible to rapidly switch from an aggregated overview to a detailed interactive view in which the statistics of individual road segments can be retrieved and, in our example in Figure 7D, used to assess the appropriateness of the statistics. The detailed visual analysis in Blackfriars suggests that much of the high volume of ‘van traffic’ does not, in fact, represent through traffic. This type of finding suggests spatial bias that has implications for alternative uses of the data set.
We recommend further work in using treemaps and other information visualisation techniques for representing variable selection combinations that include time and space as more and larger spatio-temporal data sets become available through similar APIs to that provided and maintained by eCourier.9 There is also scope for empirical cognitive studies that examine the effectiveness of combining different spatial, temporal and thematic layouts of treemap nodes in the same representation.
Acknowledgments
We are grateful for the comments received from the organisers and participants of the GeoVisualisation of Dynamics, Movement and Change workshop at the AGILE 2008 conference, where this work was presented. We are also grateful to all the reviewers for their constructive and thorough reviews. These have been a great help in producing this paper and have made a significant contribution to the work. Finally, we thank eCourier for providing public access to this large and interesting data set.
References
1 Keim D, Hao MC, Ladisch J, Hsu M, Dayal U. Pixel bar charts: a new technique for visualizing large multi-attribute data sets without aggregation. In: Symposium on Information Visualisation 2001 (San Diego, CA), IEEE Computer Society: Silver Spring, MD, 2001; 113–122.
2 Chen J, MacEachren AM, Guo D. Supporting the process of exploring and interpreting spacetime multivariate patterns: the visual inquiry toolkit. Cartography and Geographic Information Science 2008; 35: 33–50.
3 Keim DA. Information visualization and visual data mining. IEEE Transactions on Visualization and Computer Graphics 2002; 8: 1–8.
4 MacEachren AM, Wachowicz M, Edsall R, Haug D, Masters R. Constructing knowledge from multivariate spatiotemporal data: integrating geographical visualization with knowledge discovery in database methods. International Journal of Geographical Information Science 1999; 13: 311–334.
5 Gahegan M. Beyond tools: Visual support for the entire process of GIScience. In: Dykes J, MacEachren AM, Kraak M-J (Eds). Exploring Geovisualization, Elsevier Ltd: Amsterdam, 2005; 83–99.
6 Robinson AC, Chen J, Lengerich EJ, Meyer HG, MacEachren AM. Combining usability techniques to design geovisualization tools for epidemiology. Cartography and Geographic Information Science 2005; 32: 243–255.
7 MacEachren AM, Kraak M-J. Research challenges in geovisualization. Cartography and Geographic Information Science 2001; 28: 3–12.
8 Andrienko G, Andrienko N, Jankowski P, Keim D, Kraak M-J, MacEachren AM, Wrobel S. Geovisual analytics for spatial decision support: setting the research agenda. International Journal of Geographical Information Science 2007; 21: 839–857.
9 eCourier. eCourier API [WWW document] http://api.ecourier.co.uk/ (accessed 12 June 2008).
10 eCourier. Ecourier News [WWW document] http://www.ecourier.co.uk/news.php (accessed 12 June 2008).
11 Department for Transport. DfT public service targets: technical note – PSA target 4 [PDF document] http://www.dft.gov.uk/pdf/about/howthedftworks/psa/spendingreview2004psatargets2 (accessed 12 June 2008).
12 Department for Transport. Tackling congestion on our roads [PDF document] http://www.dft.gov.uk/pdf/pgr/roads/roadcongestion/ (accessed 12 June 2008).
13 Goodchild MF. Citizens as sensors: the world of volunteered geography. GeoJournal 2007; 69: 211–221.
14 Dykes J, Purves RS, Edwardes A, Wood J. Exploring volunteered geographic information to describe place: visualization of the ‘Geograph British Isles’ collection. In: Proceedings of GIS Research UK (Manchester, UK), 2008; 256–267.
15 Thomas JJ, Cook KA. Illuminating the Path: The Research and Development Agenda for Visual Analytics. National Visualization and Analytics Center: 2005; 190pp, ISBN 0-7695-2323-4. http://nvac.pnl.gov/.
16 Shneiderman B. The eyes have it: a task by data type taxonomy for information visualizations. Symposium on Visual Languages 1996 (Boulder, CO), IEEE Computer Society: Washington, DC, USA, 1996; 336–343.
17 Shneiderman B. Tree visualization with tree-maps: a 2D space-filling approach. ACM Transactions on Graphics 1992; 11: 92–99.
18 LeBlanc J, Ward MO, Wittels N. Exploring N-dimensional databases. In: Proceedings of the First Conference on Visualization ’90: 1990 (San Francisco, CA), IEEE Computer Society: Silver Spring, MD, 1990; 230–237.
19 Feiner S, Beshers C. Visualizing n-dimensional virtual worlds with n-vision. SIGGRAPH Computer Graphics 1990; 24: 37–38.
20 Wood J, Dykes J. Spatially ordered treemaps. IEEE Transactions on Visualization and Computer Graphics 2008; 14 (6): in press.
21 Bruls M, Huizing K, Wijk JV. Squarified Treemaps. 2000 [PDF document] http://www.win.tue.nl/~vanwijk/stm.pdf (accessed 12 June 2008).
22 Harrower M, Brewer CA. ColorBrewer.org: an online tool for selecting colour schemes for maps. The Cartographic Journal 2003; 40: 27–37.
23 Beshers C, Feiner S. AutoVisual: rule-based design of interactive multivariate visualizations. IEEE Computer Graphics and Applications 1993; 13: 41–49.
24 Fabrikant SI. Visualizing region and scale in semantic spaces. In: The 20th International Cartographic Conference: 2001 (Beijing, China), 2001; 2522–2529.
25 Skupin A, Fabrikant SI. Spatialization methods: a cartographic research agenda for nongeographic information visualization. Cartography and Geographic Information Science 2003; 30: 99–119.
26 Tobler W. A computer movie simulating urban growth in the Detroit region. Economic Geography 1970; 46: 234–240.
27 Andrews K, Kasanick JA. Comparative study of four hierarchy browsers using the hierarchical visualisation testing environment (HVTE). In: Proceedings of the 11th International Conference Information Visualization: 2007 (Zurich, Switzerland), IEEE Computer Society: Los Alamitos, CA, 2007; 81–86.
28 Cawthorn N, Moere AV. The effect of aesthetic on the usability of data visualization. In: Proceedings of the 11th International Conference Information Visualization: 2007 (Zurich, Switzerland), IEEE Computer Society: Los Alamitos, CA, 2007; 637–648.
29 Blanch R, Lecolinet E. Browsing zoomable treemaps: structure-aware multi-scale navigation techniques. IEEE Transactions on Visualization and Computer Graphics 2007; 13: 1248–1253.
30 PostgreSQL [WWW document] http://www.postgresql.org/ (accessed 12 June 2008).
31 PostGIS [WWW document] http://www.postgis.org/ (accessed 12 June 2008).