Zoomgraph version 0.3 (alpha)
Eytan Adar & Joshua Tyler
This tool is:
This tool isn’t:
You’re going to need 3 things:
Zoomgraph is launched from the command line, so start up a command prompt. You’re going to want to set the CLASSPATH so that it includes all the JAR files. We’ve included a file called setvarbs.bat which you can run to set everything for you.
To build a database you can run:
java com.hp.hpl.zoomgraph.DBServer db_name db_definition_file
To run the browser do:
java com.hp.hpl.zoomgraph.ZoomableGraph db db_name
There are a few other ways to run zoomgraph. You can type:
java com.hp.hpl.zoomgraph.ZoomableGraph db edge_def_file
This will take the edge definition file: edge_def_file (see Section 5 for formatting instructions) and will generate a temporary database (overwriting the previous temporary database). You can also work without a database by doing:
java com.hp.hpl.zoomgraph.ZoomableGraph edge_def_file
Finally, you can also run zoomgraph as an applet (see Section 5).
Let’s start with a simple example. There is a sample database (sample.database) in the zip file. It includes about 400 nodes and 700 edges. Take a look at it to get a sense of what goes into a data definition file. Don’t be intimidated, though: almost none of it is required.
After setting your classpath (see above), type the following to load the text file into an actual SQL database:
java com.hp.hpl.zoomgraph.DBServer sample sample.database
then do:
java com.hp.hpl.zoomgraph.ZoomableGraph db sample
You’ll see something that looks like Figure 1. (If you ever run into an out-of-memory error, you can try raising the Java heap size with the -Xmx and -Xms options, e.g. java -Xmx300m -Xms128m com.hp.hpl.zoomgraph.ZoomableGraph db sample).
The graph that popped up represents a corporate communication network. Each node represents an employee (with a department property), and each edge represents communication between two employees (with a frequency property on the edge indicating the number of undirected communications).

Figure 1
Try moving around in this space. If you hover over a node or edge you can see some details pop up. If you click on the node it will center in the display. Clicking on an edge will bring both end points into view dynamically. Left clicking and dragging on the background will allow you to move the display around. Right clicking and moving the mouse will zoom you in and out of the display.
Type “node1” into the search box and hit enter. The display will automatically shift to center on that node.
OK, now back in the command prompt where you started Zoomgraph, you should see a prompt that looks like this: “>”. Here you can type commands to manipulate the graph. Type “center” and hit enter. The display will automatically center to include all the nodes. Your display should look like Figure 2. (Note: type “quit” at any time to exit, or just close the display window. Don’t use ctrl-c, as you may corrupt your database.)

Figure 2
Let’s make the nodes a little bigger so we can see them better. Type “nodesize 10” to make all nodes 10 pixels. If things don’t immediately change on the display for any of these commands, just type “redraw”.
Nodes can be selected either by name or through a SQL query. For example, try typing: “nodecolor red node5,node6”. This will make nodes 5 and 6 red. Our sample database has other properties on nodes. Specifically, nodes here have a department. To select nodes by a SQL query, just type what you would put after the WHERE keyword. For example, “nodecolor black dept = 'dept5'” will set all the people in department 5 to black. Some commands assume you want all the nodes if you don’t enter a list; otherwise the character ‘*’ means all nodes (or all edges if it’s an edge command).
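To make the selection mechanism a bit more concrete, here is a minimal sketch of how a WHERE fragment could be turned into a list of nodes. It uses Python’s built-in sqlite3 module as a stand-in for the real SQL server, and the table layout (a nodes table with name and dept columns) and the helper select_nodes are assumptions made for illustration, not Zoomgraph’s actual code.

```python
import sqlite3

def select_nodes(conn, where_fragment):
    """Return the names of nodes matching a WHERE fragment ('*' means all nodes)."""
    sql = "SELECT name FROM nodes"
    if where_fragment.strip() != "*":
        # The fragment is appended verbatim, the way it is typed after the command.
        sql += " WHERE " + where_fragment
    return [row[0] for row in conn.execute(sql + " ORDER BY name")]

# Hypothetical miniature of the sample database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO nodes VALUES (?, ?)",
    [("node1", "dept5"), ("node2", "dept4"), ("node3", "dept5")],
)

dept5 = select_nodes(conn, "dept = 'dept5'")
everyone = select_nodes(conn, "*")
print(dept5)
print(everyone)
```

Note that the quotes around the department name must be plain single quotes, since the fragment ends up inside a SQL statement.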
Edges are accessed in a slightly different way. Edges have names made up of their start and end nodes. For example, “edgecolor red node67-node76” changes the edge between person 67 and 76 to red. You can also access edges by SQL queries. As mentioned earlier, edges in this case have an attribute called freq (frequency). So if we wanted to hide edges where the communication frequency was under 100 we would type: “hideedges freq < 100”. The connector between the two node names also expresses directionality. If the database indicated directions (which this one doesn’t) you could talk about node67->node76, node67<-node76, or node67<->node76.
The last mechanism for accessing edges is by defining node sets. Let’s say we only care about communications between departments 4 and 9. First hide everything: “hideall”. Then show only the nodes in departments 4 or 9: “shownodes dept = 'dept4' OR dept = 'dept9'”. Finally, we can change the color of the inter-departmental edges by typing: “edgecolor red {dept = 'dept9'}-{dept like 'dept4'}”. This command tells Zoomgraph to find all nodes in department 4 and all nodes in department 9, and through some SQL magic in the background it finds all edges between them (in this case only one). We can also do “edgecolor blue {dept = 'dept9'}-{dept = 'dept9'}” to turn just the intra-departmental links of department 9 blue. You should see something like Figure 3.
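The node-set idea can be sketched in a few lines of Python. This is only an illustration of the logic (pick two sets of nodes, keep the edges that run between them). The function edges_between and the toy data are hypothetical, and the real lookup happens in SQL.

```python
def edges_between(edges, set_a, set_b):
    """Return undirected edges with one end point in set_a and the other in set_b."""
    result = []
    for n1, n2 in edges:
        if (n1 in set_a and n2 in set_b) or (n1 in set_b and n2 in set_a):
            result.append((n1, n2))
    return result

# Toy data: node name -> department, plus a few undirected edges.
dept = {"node1": "dept4", "node2": "dept9", "node3": "dept9", "node4": "dept1"}
edges = [("node1", "node2"), ("node2", "node3"), ("node3", "node4")]

dept4 = {n for n, d in dept.items() if d == "dept4"}
dept9 = {n for n, d in dept.items() if d == "dept9"}

inter = edges_between(edges, dept4, dept9)  # like {dept4}-{dept9}
intra = edges_between(edges, dept9, dept9)  # like {dept9}-{dept9}
print(inter)
print(intra)
```

Using the same set on both sides is what yields the intra-departmental edges.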

Figure 3
The Zoomable system also contains a number of analysis modules to simplify basic tasks (calculating graph metrics, etc.). These are described in much more detail elsewhere, but just to give you a flavor try this. First, reset the graph to its starting state. Type: showall. Then type: edgecolor green and finally nodecolor blue (you should see the same thing as before). We do this because when nodes and edges are invisible it is the same as if they weren’t there, and they are not counted in various calculations. Type “analysis density *”. This should calculate the density of the graph (.00827…).
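For reference, here is a small sketch of the usual density definition for an undirected graph without self-loops: the number of edges divided by the number of possible node pairs. Whether Zoomgraph uses exactly this formula is an assumption. The function name density is ours.

```python
def density(num_nodes, num_edges):
    """Density of an undirected simple graph: edges / possible node pairs."""
    if num_nodes < 2:
        return 0.0  # no pairs exist, so define the density as zero
    possible_pairs = num_nodes * (num_nodes - 1) / 2.0
    return num_edges / possible_pairs

triangle = density(3, 3)  # every pair connected
path = density(4, 3)      # 3 edges out of 6 possible pairs
print(triangle, path)
```

Plugging in the rough figures of 400 nodes and 700 edges gives about 0.0088, close to but not exactly the value Zoomgraph reports, since those counts are approximate.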
Other analysis modules do more interesting things. For example, colorize will color nodes and edges by different features. Try typing “analysis colorize dept”. Each department’s nodes will now be a different (random) color. The colorize function will also generate a bunch of subgraphs. Then try “analysis colorize freq [table=edges,linear=true]”, which will assign a color over a linear range (from blue to red) based on the frequency of communications.
Subgraphs let us bundle related edges and nodes together and give them a name. You can type “sg list” to see a list of named subgraphs. Running the colorize command gave us a different subgraph for each department and one main one, “AUTO_nodes_dept_all”, that holds all the subgraphs (yes, subgraphs can be nested). Operations that work on nodes and edges will also work on named subgraphs in the same way. You can type “hidenodes AUTO_nodes_dept_dept1” to hide all the nodes in that subgraph. “sg details subgraphname” will tell you which nodes, edges, and nested subgraphs are part of the named subgraph. Nested subgraphs are referenceable by using a .* at the end of the subgraph name. For example “hideall AUTO_nodes_dept_all.*” will hide each individual dept subgraph contained in AUTO_nodes_dept_all (this command isn’t all that interesting since you could do without the .*, but try “sg details AUTO_nodes_dept_all.*” and “sg details AUTO_nodes_dept_all” to see the difference).
Our last example combines everything and adds a new twist. Let’s visualize all the pairwise connections between departments, one at a time. We can do this by using the foreach loop. The foreach loop takes two arguments: a variable and a set. For each element in the set, the variable is set to that element and the loop body is run. So try this:
> foreach sg1 AUTO_nodes_dept_all.*
> foreach sg2 AUTO_nodes_dept_all.*
> hideall
> shownodes sg1
> shownodes sg2
> hideedges *
> showedges {sg1}-{sg2}
> redraw
> pause 2000
> .
> .
What we’re doing is looping over all the colorized subsets. The variables “sg1” and “sg2” are set by the foreach loop. The system knows that you’re talking about subgraphs and not nodes or edges because of the naming convention. If you wanted all the edges in the subgraph you would do “foreach edge1 AUTO_nodes_dept_all”; similarly, all nodes are “foreach node1 …”.
As of release 0.3 of Zoomgraph, we support “states” and smooth transitions between them (caution: this is still a little rough). Here’s a simple example:
Once you start a movie anything you do to the display will be saved out (moving nodes by hand, changing the display, whatever…). The morph and movie commands support a great many features (tracking nodes with the camera, how fast things go, controlling which frames get saved, etc.). You should look at the documentation for these commands below.
Most of the commands you’ll see below allow you to enter a list of objects that you are applying the command to. The tutorial gave you some examples of how to refer to nodes and edges, but a quick summary may be helpful.
Nodes can be referred to by their name.
· You can submit a list of comma delimited nodes when you have the option to list. For example “node5,node6,node7” is a valid list.
Edges are more complex.
· If you had an edge between node5 and node6 you could refer to it as node5-node6 or node6-node5. If you have multiple edges between node5 and node6, edges also have a numerical id property that you can use as the name. You can list edges the same way as nodes: “node5-node6,node8-node9”
The DBServer program will initialize your database for you. It takes as input a database name and a database description file (see above on the command line). The description file has two components: a list of nodes and a list of edges.
The node definition section starts with the line: “nodedef> name …”
The only required column for nodes is a name, which needs to be a string. So a valid node definition line would be: “nodedef> name VARCHAR(256)” After that you can simply put down a list of nodes, one per line that match the specification in the nodedef line (in this case all you need is a name). Similarly, edges are defined through an “edgedef>” line. An edgedef line must define two columns n1 and n2 (the start and end nodes of the edge). A valid database description file is then:
nodedef> name VARCHAR(256)
A
B
C
edgedef> n1 VARCHAR(256),n2 VARCHAR(256)
A,B
B,C
A,C
This represents an undirected graph with three nodes and three edges.
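If your graph already lives in some other program, it is easy to generate a description file like this one. Here is a minimal sketch. The function name build_description is made up for this example, and it only emits the two required definitions described above.

```python
def build_description(nodes, edges):
    """Build the text of a minimal database description file."""
    lines = ["nodedef> name VARCHAR(256)"]
    lines.extend(nodes)  # one node name per line
    lines.append("edgedef> n1 VARCHAR(256),n2 VARCHAR(256)")
    lines.extend(f"{n1},{n2}" for n1, n2 in edges)  # one edge per line
    return "\n".join(lines) + "\n"

text = build_description(["A", "B", "C"], [("A", "B"), ("B", "C"), ("A", "C")])
print(text, end="")
```

Save the output to a file and hand it to DBServer as the db_definition_file argument.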
In addition to the required columns there are a number of optional ones for both nodes and edges. These are created for you by the DBServer if you don’t do it yourself. Note that the def describes what comes next. You have to have the same number of columns in each of your node and edge lines as you did in your definition lines.
You may choose different defaults, but try to use the same types – DOUBLE, INT, etc. that are described here. The Zoomgraph makes certain assumptions about what’s in a database. Node definition lines may include:
Edges can have the following properties:
Beyond these basics everything is fair game. You can add whatever columns and properties you want and then use them to control your visualization. For example, let’s extend our basic definition above to indicate node size and a new column called city, and edges will have a number representing the number of planes (totally fake):
nodedef> name VARCHAR(256), SIZE DOUBLE DEFAULT 2, CITY VARCHAR(256)
A,10,new york
B,6,boston
C,4,san jose
edgedef> n1 VARCHAR(256),n2 VARCHAR(256),ROUTES INT DEFAULT 0
A,B,40
B,C,30
A,C,20
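Since every data line has to have the same number of columns as its definition line, a quick check before running DBServer can save some frustration. The sketch below is a hypothetical helper (check_description is not part of Zoomgraph) and it assumes columns are separated by plain commas, so it would miscount a column type that itself contains a comma.

```python
def check_description(text):
    """Return (line_number, message) pairs for rows that do not match their def line."""
    problems = []
    expected = None  # column count announced by the most recent def line
    for number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue
        if line.startswith("nodedef>") or line.startswith("edgedef>"):
            expected = len(line.split(">", 1)[1].split(","))
        elif expected is None:
            problems.append((number, "data row before any def line"))
        else:
            found = len(line.split(","))
            if found != expected:
                problems.append((number, f"expected {expected} columns, found {found}"))
    return problems

sample = (
    "nodedef> name VARCHAR(256), SIZE DOUBLE DEFAULT 2, CITY VARCHAR(256)\n"
    "A,10,new york\n"
    "B,6\n"
    "edgedef> n1 VARCHAR(256),n2 VARCHAR(256),ROUTES INT DEFAULT 0\n"
    "A,B,40\n"
)
problems = check_description(sample)
print(problems)
```

Here the row for node B is missing its city, so it gets flagged.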
center <list>
Centers the display to include all nodes (if no argument is given), or only those that are in the list. This will only center on visible nodes.
Note: You may notice an exception sometimes when you do this even though the display does the right thing. There seems to be some race condition in Jazz.
centerall <list>
Centers the display to include nodes (all or in list), visible or not.
freeze
Freezes the display. Changes will not appear on the screen until you unfreeze it.
unfreeze
Unfreezes the display
rq <low|medium|high>
Sets the render quality of the visualization to one of three states. If you just type “rq” it will tell you what the current state is. The quality only applies to a display that has stopped changing. Moving around may cause the display to shift to a different rendering quality (see rqi). Default: low
rqi <low|medium|high>
Same as rq, but sets the interactive state. This sets the rendering quality when the display is changing. Default is also low. Changing this may degrade performance in display related features since rendering will take longer.
background color
Sets the background color to color (see section 5 for more information about colors).
redraw
Re-renders the display. Sometimes this is necessary to get the display to sync up with certain commands.
iw+
Opens the information window. As you mouse over nodes and edges the information window reflects details about the objects. The information is the same as the node/edge details command.
iw-
Hides the information window
hullson
Turns on the convex hull rendering for subgraphs on a global scale. Default: off
hullsoff
Turns off the convex hull rendering for subgraphs on a global scale. Default: off
refresh
Redraws the screen
hideall <list>
Hides all graph objects if no argument is specified. Otherwise only hides the objects in the list. This can be a mix of edges and nodes.
showall <list>
Shows all graph objects if no argument is specified. Otherwise only shows the objects in the list. This can be a mix of edges and nodes.
muteall <list>
Mutes all graph objects if no argument is specified. Muted nodes are shown in the muted color (default gray). If a list is supplied only mutes the objects in the list. This can be a mix of edges and nodes.
unmuteall <list>
Unmutes all graph objects if no argument is specified. If a list is supplied only unmutes the objects in the list. This can be a mix of edges and nodes.
mutecolor color
Sets the muted color for muted nodes/edges
directed
Sets the graph mode to directed. If your edges are undirected you won’t see anything different. If there is a direction, you should see some arrows (this only sort of works)
undirected
Sets the graph mode to undirected.
commitoff
Turns off database commits. Changes made to nodes/edges will not be committed to the database. Default: on
commiton
Turns on database commits. Changes made to nodes/edges will be committed to the database. Default: on
savejpg file_name
Outputs the current visualization to the specified jpeg file
savesvg file_name
Outputs the current visualization in svg format to the specified file
savelog log_name
All commands that are typed will be saved to the log specified by log_name. Using this you can save what you do and re-run or modify it later to make macros.
loadlog log_name
All commands in the log file (log_name) will be executed against the current environment. This can be used to replay macros that you have saved earlier.
stoplog
Stops logging commands
savecsv file_name SQL_query
Will generate a file (specified by file name) in CSV format for the database columns you are interested in
Example: “savecsv foo.csv select name,degree from nodes”
Will save the name and degree columns to the file foo.csv
exppajek filename [nodes=list,edges=list]
A basic export of a graph to a Pajek file. The default behavior is to save out all nodes and edges that are visible. To change this you can specify a query (this is a little broken).
pause milliseconds
Pauses the system for some number of milliseconds. This is useful when running a script and you don’t want the display to refresh too quickly.
foreach variable list
The foreach loop lets you loop over a list of objects. The variable naming convention is to start the name with “node” if you want the nodes, “edge” if you want the edges, and “sg” if you want the subgraphs. So you could do: “foreach node1 somesubgraph” to talk about the nodes in the subgraph or “foreach edgefoo somesubgraph” to reference the edges. Any instance of the variable name gets replaced by an element in the list each time the foreach loops. To close a foreach statement just type a period (“.”) on its own line.
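The replacement behavior is easy to picture with a small sketch. This is not Zoomgraph’s implementation. It assumes the substitution is a plain textual replace, and expand_foreach and the subgraph names are invented for the example.

```python
def expand_foreach(variable, items, body):
    """Expand a loop body once per item, replacing the variable name textually."""
    commands = []
    for item in items:
        for command in body:
            commands.append(command.replace(variable, item))
    return commands

subgraphs = ["deptA", "deptB"]  # hypothetical subgraph names

# Nested loops: expand the inner loop first, then the outer one.
inner = expand_foreach("sg2", subgraphs, ["showedges {sg1}-{sg2}"])
outer = expand_foreach("sg1", subgraphs, inner)
for command in outer:
    print(command)
```

With two subgraphs this produces four showedges commands, one per ordered pair, which is the order a nested foreach would visit them in.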
node details <list>
Gives you some details about the node (same as you would see in the information window).
hidenode(s) <list>
Hides all nodes if no argument is specified. Otherwise only hides the nodes in the list.
shownode(s) <list>
Shows all nodes if no argument is specified. Otherwise only shows the nodes in the list.
mutenode(s) <list>
Mutes all nodes if no argument is specified. Otherwise only mutes the nodes in the list.
unmutenode(s) <list>
Unmutes all nodes if no argument is specified. Otherwise only unmutes the nodes in the list.
fixnode(s) <list>
Fixes all nodes if no argument is specified. Otherwise only fixes the nodes in the list. Fixed nodes will not be moved by the layout algorithm.
unfixnode(s) <list>
Unfixes all nodes if no argument is specified. Otherwise only unfixes the nodes in the list.
nodecolor color <list>
Sets the node color of all nodes (if no argument is given) or just those listed.
hidedis <list>
Hides nodes (all if no argument, or from the subset of list) that do not have any visible edge going to them.
nodesize size <list>
Sets the node size of nodes (or all if no argument) to size
labelnodes <list>
Shows a node label next to the nodes in the list
unlabelnodes <list>
Hides node labels.
edge details <list>
Gives you some details about the edge (same as you would see in the information window).
edgecolor color <list>
Sets the edge color of edges (or all if no argument) to color. Sometimes this requires an explicit redraw to be called.
edgewidth width <list>
Sets the edge width of edges (or all if no argument) to width. Sometimes this requires an explicit redraw to be called.
hideedge(s) <list>
Hides all edges if no argument is specified. Otherwise only hides the edges in the list.
showedge(s) <list>
Shows all edges if no argument is specified. Otherwise only shows the edges in the list.
muteedge(s) <list>
Mutes all edges if no argument is specified. Otherwise only mutes the edges in the list.
unmuteedge(s) <list>
Unmutes all edges if no argument is specified. Otherwise only unmutes the edges in the list.
unkink
Some layout routines cause bends in the edges, you’ll want to run this to restore the straight line mode.
edgeaverage <list>
Sets the edge color of one or more edges named in the list to the average of the colors of the two nodes that the edge connects
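As a rough picture of what averaging two node colors means, here is a sketch that averages each of the red, green, and blue channels. The per-channel integer average is our assumption about how the blend is computed, and average_color is a hypothetical name.

```python
def average_color(color_a, color_b):
    """Average two (red, green, blue) tuples channel by channel."""
    return tuple((a + b) // 2 for a, b in zip(color_a, color_b))

red = (255, 0, 0)
blue = (0, 0, 255)
edge_color = average_color(red, blue)
print(edge_color)
```

An edge between a red node and a blue node would come out purple under this assumption.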
Subgraphs are hierarchically structured objects that contain nodes, edges, and references to other subgraphs. Figure 4 is a representation of three subgraphs, sgA, sgB, and sgC. The subgraph sgA directly contains nodes A and B, and edge A-B. It also contains pointers to subgraphs sgB and sgC. All commands and lookups applied to sgA will be recursively applied to these embedded subgraphs. For example, setting the node color to red for sgA will cause not only nodes A and B to become red but also nodes D and E. This is usually the expected behavior, but be careful when applying operations like delete, as they will be applied to lower-level subgraphs as well. For example, deleting node D from sgA will actually cause its removal from sgB. The commands for subgraph manipulations are as follows:
sg subgraph_command <optional arguments>
Performs a subgraph command
analysis analysis_command <optional arguments>
Performs an analysis_command. Valid commands include:
The following are R commands, so they require Rserve to be running and Zoomgraph to be connected to it. They use the SNA library to calculate various centrality measures. The results will be placed in a new column corresponding to the name. The syntax is still “analysis command_name <list>”. See the R SNA documentation for a full explanation of these commands.
layout layout_command <extra>
These are various layout commands. Nodes that are fixed will not be moved.
Saves the current state of the graph to the state “statename” (which can be any set of alphanumeric characters)
Loads the state statename and changes the displayed nodes and edges (not subgraphs yet!) to the values of the saved state.
Smoothly transitions the display from the current state to statename moving edges, nodes, and changing colors and visibility. You can list any number of states. Each transition takes 3 seconds by default. Advanced options are:
Tells Zoomgraph to start saving the display to a (QuickTime) movie file. The fps option allows you to control the frames per second. Playing with this as well as the morph times can make for much smoother animations. The auto option (by default true) tells Zoomgraph if you want all display changes automatically to be saved.
Saves the current view as a frame (you can only do this if you started a movie in the non-auto mode). The current view will only be one frame unless you set the loop to something else (e.g. if loop=5 the current display will be saved as 5 frames). You can also force Zoomgraph to pause between each frame renderings to allow certain repaint commands to happen.
Stops recording the movie.
Typing “backdoor” puts you in direct contact with the SQL server. Your prompt will change to “b>” and any commands typed after that will be routed directly to the server as SQL. You should be careful if you choose to use this. The results of select commands will be dumped to the screen. Typing backdoor again when in this mode will put you back in the regular mode. Typing quit will exit Zoomgraph, not just the backdoor mode.
select/SELECT sql_stuff
Does a SQL select and dumps the results.
If you’re into doing some heavy-duty table merges and selects you can use this. It will route this type of command to the SQL server. You should remember to drop this if you don’t want it to be persistent.
Lets you drop the view you created. Routed directly to the SQL server.
temp sql_stuff
This will run a full query on the database and place the resulting nodes and/or edges into the subgraph __temp. This comes in handy if you’re doing weird selects and joins. You can then copy the nodes/edges out of the temporary subgraph. For example: the command “temp SELECT * from nodes where xloc < 500” will load up the __temp with nodes that are on the left side of the display. The routine basically sees if the table you’re querying on has a column called “name” which indicates that you want nodes, and/or columns named n1 and n2. You can get both through different select and view operations.
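The column check described here can be sketched as follows. The function classify_result is hypothetical, and treating column names case-insensitively is an assumption on our part.

```python
def classify_result(column_names):
    """Guess whether a query result holds nodes, edges, both, or neither."""
    names = {column.lower() for column in column_names}
    kinds = []
    if "name" in names:
        kinds.append("nodes")  # a name column signals node rows
    if {"n1", "n2"} <= names:
        kinds.append("edges")  # n1 and n2 together signal edge rows
    return kinds

print(classify_result(["name", "xloc"]))
print(classify_result(["n1", "n2", "freq"]))
```

A result with none of these columns would add nothing to the temporary subgraph under this reading.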
Colors are defined by comma-delimited numbers representing red, green, and blue. There can be no spaces between them. For example, the color red is “255,0,0”. There are also a number of predefined named colors: black, blue, cyan, darkGray, gray, green, lightGray, magenta, orange, pink, red, white, and yellow.
Saved subgraphs are pretty straightforward. A subgraph is named on a line preceded by the “>” character. Any items found until the next subgraph line are taken to be either nodes, edges, or other subgraphs. Other subgraph names should be prefixed with two “_” characters. For example, the following file:
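If you generate commands from a script, it can help to validate color strings first. The sketch below is one way to do it. The named colors have the same names as the constants in Java’s java.awt.Color class, so the table uses those RGB values. That Zoomgraph resolves the names this way is an assumption, and parse_color is a hypothetical helper.

```python
# RGB values of the java.awt.Color constants with the same names.
NAMED_COLORS = {
    "black": (0, 0, 0), "blue": (0, 0, 255), "cyan": (0, 255, 255),
    "darkGray": (64, 64, 64), "gray": (128, 128, 128), "green": (0, 255, 0),
    "lightGray": (192, 192, 192), "magenta": (255, 0, 255),
    "orange": (255, 200, 0), "pink": (255, 175, 175), "red": (255, 0, 0),
    "white": (255, 255, 255), "yellow": (255, 255, 0),
}

def parse_color(spec):
    """Turn a color name or an 'r,g,b' string into an (r, g, b) tuple."""
    if spec in NAMED_COLORS:
        return NAMED_COLORS[spec]
    if any(ch.isspace() for ch in spec):
        raise ValueError("no spaces allowed in a color: " + repr(spec))
    parts = spec.split(",")
    if len(parts) != 3:
        raise ValueError("expected three comma-delimited numbers: " + repr(spec))
    rgb = tuple(int(part) for part in parts)  # int() raises ValueError on junk
    if any(value < 0 or value > 255 for value in rgb):
        raise ValueError("each component must be between 0 and 255")
    return rgb

print(parse_color("255,0,0"))
print(parse_color("orange"))
```

A string such as “255, 0, 0” is rejected because of the spaces.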
>subgraphA
a
b
a-b
__subgraphB
>subgraphB
c
d
c-d
describes two subgraphs, subgraphA which contains two nodes (a and b) and one edge (a-b) as well as the nodes and edges defined by subgraphB (c,d, and c-d).
If the subgraph definition file includes an edge or node that is not defined in the database you will see an error.
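To show how such a file breaks down, here is a minimal parsing sketch. It is not the loader Zoomgraph uses. The function parse_subgraphs is hypothetical, and it assumes that any item containing a “-” is an edge, so a node whose name contains a hyphen would be misread.

```python
def parse_subgraphs(text):
    """Parse a saved-subgraph file into {name: {'nodes', 'edges', 'subgraphs'}}."""
    subgraphs = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = {"nodes": [], "edges": [], "subgraphs": []}
            subgraphs[line[1:].strip()] = current
        elif current is None:
            raise ValueError("item before the first subgraph line: " + line)
        elif line.startswith("__"):
            current["subgraphs"].append(line[2:])  # reference to another subgraph
        elif "-" in line:
            current["edges"].append(line)
        else:
            current["nodes"].append(line)
    return subgraphs

example = ">subgraphA\na\nb\na-b\n__subgraphB\n>subgraphB\nc\nd\nc-d\n"
parsed = parse_subgraphs(example)
print(parsed["subgraphA"])
```

For the example file above, subgraphA ends up with two nodes, one edge, and one reference to subgraphB.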
Wherever SQL can be used we more or less allow anything that HSQLDB allows. Please see the hsqlSyntax.html file included in the distribution. It should give you a sense of what you can (most things) and can’t do.
If you want to work without a database, or just have Zoomgraph make a temporary database for you, you can create an edge definition file. Each line is:
node1-node2 <node1x,node1y node2x,node2y>
See the sample files sampleedgefile_nocoordinates.txt and sampleedgefile_withcoordinates.txt for examples.
You can run the Zoomable Graph as an applet without the database features. Look at applet.html for an example. You can simply paste the contents of the edge definition file into the INITEDGES parameter. This is in the same format as the edge definition file above but with a semicolon at the end.
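Here is a small sketch of reading one line of this format. It reads the angle brackets as marking the coordinates as optional (the same way <list> is used in the command reference), with the fields separated by whitespace. That reading, and the helper parse_edge_line, are assumptions, so check them against the sample files.

```python
def parse_edge_line(line):
    """Split an edge definition line into (node1, node2, coordinates or None)."""
    fields = line.split()
    n1, n2 = fields[0].split("-", 1)  # assumes node names contain no hyphen
    if len(fields) == 1:
        return n1, n2, None           # no coordinates given
    if len(fields) != 3:
        raise ValueError("expected an edge and two optional x,y pairs: " + line)
    coords = tuple(
        tuple(float(value) for value in field.split(","))
        for field in fields[1:]
    )
    return n1, n2, coords

plain = parse_edge_line("a-b")
placed = parse_edge_line("a-b 10,20 30,40")
print(plain)
print(placed)
```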
Zoomgraph currently contains a couple of R-based functions… specifically, we pass the matrix representing the graph to R and can calculate betweenness and node degree. The values are deposited into a column of the database.
To use this, you will need to:
· Install R from http://www.r-project.org (I use 1.6.2),
· Install Rserve from http://stats.math.uni-augsburg.de/Rserve/.
· You will also need to install the SNA library (available at the CRAN site on the r-project.org page above). If you’re using windows, you can just do it through the Package menu (download package from CRAN…).
· Finally, run Rserve as specified for your platform (Windows: type “Rserve”, Unix: type “R CMD Rserve”)
When running Zoomgraph you will need to tell it the machine on which you have R installed. You do this by typing:
analysis rservehost hostname
Where hostname is the machine you are running Rserve on. If you’re running Zoomgraph and R on the same machine you probably don’t need to do anything, since the default is “localhost”. You can then test the connection by typing:
analysis nodedegree *
In version 0.3 we have also added a command, “rmode”, which will send any commands you enter afterwards to R and then display the results. For example, if you type “rmode” you will see that the prompt changes to “r>”. Typing “2+2” will result in “4.0”. Type rmode again to exit this mode. There are other things you can do here, but they’re still broken in this release.