Data visualization is the study of the visual representation of data, meaning "information that has been abstracted in some schematic form, including attributes or variables for the units of information". In the end everything we do, needs to be somehow presented to the users. Fortunately, we humans are intensely visual creatures. Few of us can detect patterns among rows of numbers, but even young children can interpret bar charts, extracting meaning from those numbers’ visual representations. For that reason, data visualization is a powerful exercise and is the fastest way to communicate it to others.
What is D3?
D3 is a JavaScript library, which helps in manipulating documents based on data. It uses HTML, SVG and CSS to create visualizations. With D3.js, the complete capabilities of new-age browsers can be used without being constrained to a framework. Fundamentally, D3 is an elegant piece of software that facilitates generation and manipulation of web documents with data. It does this by:
• Loading data into the browser’s memory
• Binding data to elements within the document, creating new elements as needed
• Transforming those elements by interpreting each element’s bound datum and setting its visual
properties accordingly
• Transitioning elements between states in response to user input
• Binding data to elements within the document, creating new elements as needed
• Transforming those elements by interpreting each element’s bound datum and setting its visual
properties accordingly
• Transitioning elements between states in response to user input
Learning to use D3 is simply a process of learning the syntax used to tell it how you want it to load and bind data, and transform and transition elements.
Basics of D3
Naturally the first thing we need to do is to install it, which can be easily done using Bower. If you're not familiar with the tool, please read the article what is Bower is and why you need it:
bower install d3
I myself am made entirely of flaws, stitched together with good intentionsThis Augusten Burroughs quote fits me well, when I try to explain someone a new technology. The reason behind this is trying to demonstrate live examples, instead of diving into theory, like many others do. Or maybe write a series of tutorials and gradually increasing the complexity of material. Today will be no different :) On D3 GitHub page, there is a whole gallery demonstrating it's capabilities. Even more examples can be found on Christophe Viau's site. We'll be picking on of them.
Force Directed Graph
I decided to choose Force Directed Graph example, cause it seemed cool enough to get your attention and the implementation wasn't too intimidating. The code can be found here - I've rewritten a bit the example to make it more readable and concise. Let's see what we're building first:
Pull and let go one of the nodes with the mouse and see what happens. Your mind should be blown away instantly, as it cannot comprehend the amount of coolness contained in a single example.
Once you look at the implementation, you'll be even more amazed how little code is needed to make things working and we shall see how the magic happens right away.
var width = 960, height = 500, svg = d3.select("body").append("svg") .attr("width", width) .attr("height", height), graph = miserables; force = d3.layout.force() .charge(-120) .linkDistance(30) .size([width, height]); force.nodes(graph.nodes) .links(graph.links) .start();Firstly we create a svg element with specified dimensions - no news here, maybe only the d3.select method, which acts almost as you'd expect. That is selecting the first element, that matches the specified selector string, returning a single-element selection, even if selector matches several elements. If no elements in the current document match the specified selector, returns the empty selection.
Then we create our force layout, a flexible force-directed graph layout implementation using position Verlet integration. Force layout supports many behaviours, at this time we'll be focusing on the ones needed for our example. For broader description about the layout, please refer the it's API page.
In our snippet we override the default charge, linkDistance and size attributes. Size parameter is quite self explanatory. Now, what is charge? Charge is a force, that a node can exhibit, where it can either attract (positive values) or repel (negative values). In our case, repelling. The bigger the value, in it's absolute form, the sparser our graph will be. The last parameter, linkDistance, is the distance we desire between connected nodes. Most often this property is set to a constant value for an entire visualization, but D3 also lets us define it as a function. When we do that, we can set a different value for each link.
After configuring our layout, we initiate it with our nodes and links from the famous Les Misérables novel, formatted in the JSON format. start method starts the simulation; this method must be called when the layout is first created, after assigning the nodes and links.
var miserables = { "nodes":[ {"name":"Myriel","group":1} ... ], "links":[ {"source":1,"target":0,"value":1} ... ] }Nodes can except any information, we might find useful in the visualization process. In our nodes we have the name of character and group to which one belongs. The group will be used for dyеing our nodes later. Links, on the other hand, must have source and target attributes, which serve for connecting the directed graph. The objects may have additional fields that you specify; this data can be used to compute link strength and distance on a per-link basis using an accessor function. In our case there is a value, which will be used later to calculate the stroke width. Moving forward:
var color = d3.scale.category20(), node = svg.selectAll(".node") .data(graph.nodes) .enter().append("circle") .attr("class", "node") .attr("r", 5) .style("fill", function(d) { return color(d.group); }) .call(force.drag); node.append("title").text(function(d) { return d.name; }); var link = svg.selectAll(".link") .data(graph.links) .enter() .append("line") .attr("class", "link") .style("stroke-width", function(d) { return Math.sqrt(d.value); });We'll be using built-in category20 method, which constructs a new ordinal scale with a range of twenty categorical colors. There are several variations of this method with b and c suffix, returning different palette of colors. You can choose the one you like from library's wiki.
Next we select all nodes using selectAll method, which works similar to its single variant. Interesting fact you may notice, that we don't have any nodes yet. So what will be selected? The way D3 works is returning pseudo-array - a placeholders for our future elements. Once marked, nodes data is applied using data method. Then enter method actually does the job of entering selection: placeholder nodes for each data element for which no corresponding existing DOM element was found in the current selection. Note that the enter method merely returns a reference to the entering selection, and it is up to you to add the new nodes, which we do with append method, inserting the circle element on each node. Then we assign CSS class, radius and fill color. Notice the dynamic nature of fill color, which is retrieved according to the node's group using color alias to category20 method. In the end we invoke force.drag method, using call method, which is mere helper method, made for chaining the calls on the current selection. drag method binds a behavior to nodes to allow interactive dragging, either using the mouse or touch. Finally we add character's name to each node, by adding title element inside our circles.
Next we add the links to connect our nodes. Nothing interesting here, besides dynamically setting the stroke-width CSS attribute according to value attribute. Not sure what good it does, probably just for demonstration purposes. Running the example up until now, will generate all the elements positioned in the top left corner. The next section is what makes everything dance.
force.on("tick", function() { node.attr("cx", function(d) { return d.x; }) .attr("cy", function(d) { return d.y; }); link.attr("x1", function(d) { return d.source.x; }) .attr("y1", function(d) { return d.source.y; }) .attr("x2", function(d) { return d.target.x; }) .attr("y2", function(d) { return d.target.y; }); });From the moment we call start method of our force layout, tick events are dispatched for each tick of the simulation. The this section, we listen to the events to update the displayed positions of nodes and links. The event handler is executed at each iteration of the layout. When it does, the force layout calculations have been updated and would have set various properties in our nodes and linked objects, which we could use to position them within the SVG container. First let's reposition the nodes. As the force layout runs it updates the x and y properties, which define where the node should be centered. To move the node, we set the appropriate SVG attributes to their new values. Later we update our links by setting the appropriate start and end points. That's it!
The beauty of D3 library is how it empowers us with tools to visualize nearly for data in the way we like. Some say it is less efficient than it's competitors like Sigma.js and Processing.js, however performance issues, like in any other computer science domain, should be managed and here is a good post to get your started.
If you felt I was moving too fast or wanted to read more detailed tutorial about the library, have a peek at Dashing D3js site - it walks though the material quite thoroughly.