Raw data comes in many formats. We need to process or transform it into a form that D3 can consume and turn into meaningful visual elements. In this post we will cover the following topics related to data in D3:
- Data array and array manipulation (d3-array)
- Collections (e.g. Object, Map, Set) (d3-collection)
- Creating nested data (d3-collection)
- Loading data from external files (d3-dsv)
- Data binding
Data array and array manipulation
Arrays are the most common data structure in D3, so it is worth looking at the methods the library provides for working with them. The array-related APIs live in the d3-array module.
The following are the most commonly used methods in d3-array:
- d3.min(array[, accessor]) : Returns the minimum value in the given array using natural order. If the array is empty, returns undefined.
- d3.max(array[, accessor]) : Returns the maximum value in the given array using natural order. If the array is empty, returns undefined.
- d3.extent(array[, accessor]) : Returns the minimum and maximum value in the given array using natural order. If the array is empty, returns [undefined, undefined].
- d3.sum(array[, accessor]) : Returns the sum of the given array of numbers. If the array is empty, returns 0. This method ignores undefined and NaN values.
- d3.mean(array[, accessor]) : Returns the mean of the given array of numbers. If the array is empty, returns undefined.
- d3.merge(arrays) : Merges the specified arrays into a single array.
- d3.ticks(start, stop, count) : Returns an array of approximately count + 1 uniformly-spaced, nicely-rounded values between start and stop (inclusive).
- d3.tickStep(start, stop, count) : Returns the difference between adjacent tick values that d3.ticks would generate for the same arguments.
These methods should be enough for most array-related operations. For a comprehensive list of the methods in d3-array, check the API Reference.
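To make the return values described above concrete, here is a plain-JavaScript sketch that mirrors the documented behaviour of d3.extent and d3.sum, including the optional accessor and the handling of missing values. The helpers `extentOf` and `sumOf` and the sample data are hypothetical stand-ins written for this post, not D3's implementation. With d3-array loaded you would simply call d3.extent(likesData, d => d.likes) and d3.sum(likesData, d => d.likes).

```javascript
// Hypothetical sample data: likes per tweet, with one missing value.
const likesData = [
  { user: "ann", likes: 12 },
  { user: "bob", likes: undefined },
  { user: "cat", likes: 30 },
  { user: "ann", likes: 5 },
];

// Stand-in for d3.extent(array, accessor): returns [min, max], skipping
// null, undefined and NaN, or [undefined, undefined] if nothing is left.
function extentOf(array, accessor = (d) => d) {
  let min;
  let max;
  for (const item of array) {
    const value = accessor(item);
    if (value == null || Number.isNaN(value)) continue;
    if (min === undefined || value < min) min = value;
    if (max === undefined || value > max) max = value;
  }
  return [min, max];
}

// Stand-in for d3.sum(array, accessor): adds up the numeric values,
// ignoring undefined and NaN, and returns 0 for an empty array.
function sumOf(array, accessor = (d) => d) {
  let total = 0;
  for (const item of array) {
    const value = +accessor(item);
    if (!Number.isNaN(value)) total += value;
  }
  return total;
}

const likesExtent = extentOf(likesData, (d) => d.likes);
const likesTotal = sumOf(likesData, (d) => d.likes);

// Logs the extent (5 to 30) and the total (47); the undefined entry is skipped.
console.log(likesExtent, likesTotal);
```

The accessor keeps the original objects intact: there is no need to map the array down to plain numbers before summarizing it.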
Collections
D3 provides APIs for working with data structures such as Objects, Maps, Sets and Nests in a microlibrary called d3-collection. So let’s start by installing d3-collection (for example with npm install d3-collection).
Let’s look at some of the APIs provided for inspecting plain objects:
- d3.keys(object) : Returns an array containing the property names of the specified object. The order of the returned array is undefined.
- d3.values(object) : Returns an array containing the property values of the specified object. The order of the returned array is undefined.
- d3.entries(object) : Returns an array containing the property keys and values of the specified object. Each entry is an object with a key and a value property.
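As a rough illustration of the shapes these three methods return, the sketch below produces the same results for a plain object using only built-in JavaScript. The object `chartSettings` and the helper `toEntries` are hypothetical names chosen for this example. The comparison assumes a plain object with no inherited enumerable properties.

```javascript
// Hypothetical object to inspect.
const chartSettings = { width: 640, height: 480, title: "Tweets" };

// Same result shape as d3.keys(object): an array of property names.
const settingKeys = Object.keys(chartSettings);

// Same result shape as d3.values(object): an array of property values.
const settingValues = Object.values(chartSettings);

// Same result shape as d3.entries(object): one { key, value } object per property.
function toEntries(object) {
  return Object.entries(object).map(([key, value]) => ({ key, value }));
}
const settingEntries = toEntries(chartSettings);

console.log(settingKeys, settingValues, settingEntries);
```

The { key, value } form is convenient when each property should become one visual element, since every entry carries both its label and its value.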
Creating nested data
It is sometimes necessary to transform flat data into a hierarchical tree structure. For example, let’s say we have an array of tweet objects as shown below:
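A flat array of this kind might look like the following sketch. The records and the field names (user, topic, text) are made up for illustration.

```javascript
// Hypothetical sample data; the field names (user, topic, text) are assumptions.
const tweets = [
  { user: "ann", topic: "d3", text: "Scales finally make sense." },
  { user: "bob", topic: "svg", text: "Paths are just strings." },
  { user: "ann", topic: "svg", text: "viewBox to the rescue." },
  { user: "cat", topic: "d3", text: "Joins, joins, joins." },
  { user: "ann", topic: "d3", text: "Axes in three lines." },
];

// Every object sits at the same level, so nothing groups the tweets by user yet.
const distinctUsers = new Set(tweets.map((t) => t.user)).size;
console.log(tweets.length, "tweets by", distinctUsers, "users");
```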
It would make more sense to group the objects by user, so that we can draw a bar chart showing the number of tweets per user. To achieve this we have to transform the flat data into nested data.
D3 provides a method called d3.nest() to do just that. It supports multiple levels of grouping (e.g. group by user name and then by topic), with each level created by specifying a key function. Let’s apply nesting to our tweets example:
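As a sketch of what this grouping produces, the plain-JavaScript stand-in below builds the same { key, values } entries that d3.nest().key(d => d.user).entries(tweets) returns for a single level of nesting. The helper `nestBy` and the sample data are hypothetical. It is not D3's implementation and it handles only one key function.

```javascript
// Hypothetical sample data with the same shape as a flat list of tweets.
const sampleTweets = [
  { user: "ann", topic: "d3" },
  { user: "bob", topic: "svg" },
  { user: "ann", topic: "svg" },
  { user: "cat", topic: "d3" },
  { user: "ann", topic: "d3" },
];

// Stand-in for d3.nest().key(keyFn).entries(array), one level deep:
// groups the items by key and returns [{ key, values }, ...].
function nestBy(array, keyFn) {
  const groups = new Map();
  for (const item of array) {
    const key = String(keyFn(item)); // nest keys are strings
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(item);
  }
  return Array.from(groups, ([key, values]) => ({ key, values }));
}

const tweetsByUser = nestBy(sampleTweets, (d) => d.user);

// Tweets per user, ready to feed a bar chart.
const tweetCounts = tweetsByUser.map(({ key, values }) => ({
  user: key,
  count: values.length,
}));
console.log(tweetCounts);
```

Each entry keeps the original tweet objects under values, so deeper levels (for example by topic) could be produced by grouping each values array again.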
You can see the power of d3.nest() here: with a few lines of code, our data is transformed into a more intuitive tree representation.
Loading data from external files
We can load data from an external file such as a .csv or .tsv file, or any other delimiter-separated file, and then parse it for use in our D3 application. The d3-dsv module provides a parser and formatter for delimiter-separated values. To use it, let’s install the d3-dsv module first.
Since CSV (comma-separated values) and TSV (tab-separated values) are so common, there are dedicated methods for parsing and formatting them: d3.csvParse() and d3.csvFormat() for CSV, and d3.tsvParse() and d3.tsvFormat() for TSV.
For other delimiters, use d3.dsvFormat(), which returns a parser and formatter for the given delimiter. For example, if we want to use the pipe “|” as the delimiter, the data can be parsed as follows:
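To illustrate what such a parser returns, here is a deliberately naive plain-JavaScript stand-in: the first line supplies the column names, every following line becomes one object, and all values stay strings. The helper `parseDelimited` and the input text are hypothetical. Unlike d3-dsv, it ignores quoted fields and delimiters inside values. With d3-dsv loaded, the real call would be d3.dsvFormat("|").parse(text).

```javascript
// Naive stand-in for d3.dsvFormat(delimiter).parse(text).
// Assumption: no quoted fields and no delimiters inside values.
function parseDelimited(text, delimiter) {
  const [header, ...lines] = text.trim().split(/\r?\n/);
  const columns = header.split(delimiter);
  const rows = lines.map((line) => {
    const cells = line.split(delimiter);
    const row = {};
    columns.forEach((name, i) => {
      row[name] = cells[i]; // values stay strings
    });
    return row;
  });
  rows.columns = columns; // d3-dsv also exposes the header names on the result
  return rows;
}

// Hypothetical pipe-delimited input.
const pipeText = "user|likes\nann|12\nbob|30";
const pipeRows = parseDelimited(pipeText, "|");

// Convert numeric columns explicitly, since parsing yields strings.
const likesByUser = pipeRows.map((d) => ({ user: d.user, likes: +d.likes }));
console.log(likesByUser);
```

The explicit conversion step matters in practice: without it, numeric columns remain strings and would be compared in string order rather than numerically.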
Data binding
Data binding is the process of attaching data to specific elements of the DOM. Once data is attached to a DOM element, we can use it to change the element’s visual appearance. For example, a rectangle in a bar chart can be made taller when the data associated with it has a higher value. Without bound data, DOM elements have very little meaning in a data visualization.
So how do we bind data to DOM elements? D3’s selection has a method called data() that does this. But before binding any data we need to select the target elements in the DOM.
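Create the target DOM elements in your HTML page, for example an svg element containing two circle elements.
Then execute the binding code in your browser’s console, for example d3.selectAll("circle").data([30, 40]).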
The above code binds the value 30 to the first circle and 40 to the second. The bound data is stored in a property called __data__ on each element. Note that __data__ is not an attribute in the DOM markup, as is evident from the following screenshot:
The __data__ property lives in memory on the node object; to see it we need to print the nodes in the console:
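In the browser, the nodes can be printed with console.log(d3.selectAll("circle").nodes()). To show the mechanism without a browser, the sketch below models it in plain JavaScript. The node objects and the helper `bindByIndex` are hypothetical stand-ins and not D3's implementation. They only mimic how selection.data() pairs the i-th datum with the i-th element and stores it as a property.

```javascript
// Hypothetical stand-ins for the two <circle> elements. In a browser these
// would be the DOM nodes returned by d3.selectAll("circle").nodes().
const circleNodes = [
  { tagName: "circle", attributes: {} },
  { tagName: "circle", attributes: {} },
];

// Sketch of index-based binding: the i-th datum is stored on the i-th node
// as an ordinary JavaScript property named __data__.
function bindByIndex(nodes, data) {
  nodes.forEach((node, i) => {
    if (i < data.length) node.__data__ = data[i];
  });
  return nodes;
}

bindByIndex(circleNodes, [30, 40]);

// The data is readable from the node objects...
const boundData = circleNodes.map((node) => node.__data__);
// ...but no markup attribute was added, so an element inspector shows nothing.
const attributeCounts = circleNodes.map(
  (node) => Object.keys(node.attributes).length
);

console.log(boundData, attributeCounts);
```

Because the datum is a property rather than an attribute, it can hold any JavaScript value, including whole objects, without being serialized into the markup.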
With this we have covered the essential concepts of data formats and data binding. Next up: The Enter-Update-Exit Pattern in D3.