Understanding Data Trees

I’ve been on holiday for a while now and haven’t been able to update my tutorials for the last few months, but I still get plenty of questions about the content that I produce.

One of the most common questions I get is about data trees. They seem to be a very tough concept to grasp, and rightly so, so I thought I would have a go at explaining them in this post.

For starters, data trees are the way that information is passed around in Grasshopper. They are a useful way to create and manipulate hierarchies, and everything that you do in Grasshopper relies on data trees, whether you are passing through one single data point, a list of data points, or lists of lists of data points.

Panels and Param Viewers

Two very useful tools for understanding data trees are the panel (yellow) and the param viewer (grey; double-click to toggle between its two views). The panel tells us exactly what is in our data tree, and the param viewer gives us an idea of how that data is structured.

I’ve been racking my brain for the best way to explain data structures, and I think the best way to understand them is to think of a street address. We can find out exactly where an item is located (much like how we find out where people live) based on its index, which is broken down into two parts:

  • path index (think of these numbers as the country, ZIP code, suburb, street name, etc.)
  • item index (think of this as the exact street address)

The path index is the string of numbers inside the curly braces {}, separated by semicolons. Each distinct path identifies its own list (or branch), so when we see that we have a collection of paths, we call this a list of lists.

The item index is the number at the far left of every line in a panel, and it always starts at 0. This is a common convention in programming: counting starts at 0, so item 0 is the first item in the list.

So if we wanted to find the first item in this data structure,
we would be looking for {0;0;0}(0).
The second item would be {0;0;0}(1), etc.
The seventh item in this collection, however, would be {0;0;1}(0), the first item in the second list. This is because every list in this data structure has 6 items, so the seventh item is found inside the second list.
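If it helps to see the address idea outside of Grasshopper, here is a rough sketch in plain Python. It is only a model, not how Grasshopper stores things internally: I'm assuming a tree can be pictured as a dictionary from paths to lists, and the names `tree`, `locate` and `address` are made up for this illustration.

```python
# A data tree pictured as a dict: path (tuple of ints) -> list of items.
# Assumed layout: eleven lists of six items at paths {0;0;0} .. {0;0;10}.
tree = {(0, 0, k): [f"pt{k * 6 + i}" for i in range(6)] for k in range(11)}

def locate(tree, n):
    """Return (path, item_index) of the n-th item overall (0-based)."""
    for path in sorted(tree):
        items = tree[path]
        if n < len(items):
            return path, n
        n -= len(items)  # skip this whole list and keep counting
    raise IndexError("tree has fewer items than requested")

def address(path, index):
    """Format an address the way a panel shows it, e.g. {0;0;1}(0)."""
    return "{" + ";".join(str(p) for p in path) + "}" + f"({index})"

first = address(*locate(tree, 0))
second = address(*locate(tree, 1))
seventh = address(*locate(tree, 6))
print(first, second, seventh)
```

The seventh item lands in the second list because the first list only holds items 0 to 5.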

Try using more panels and param viewers, and pay close attention to the path indices. As you can see, they appear in both the panel and the param viewer, which makes it easier to debug your scripts.

Graft, Flatten, and Simplify

So what do these functions do? These are all ways to manipulate our data structure very simply.

Flattening our data structure strips out the hierarchy. As you can see, the first collection of data is organised into eleven lists, each with six items in them, but when we flatten it, it becomes one list of 66 items. Note that our path index also changes. When you flatten a tree, everything is moved into a single new path, which is now {0}. This would be like moving every single house onto one main street.
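As a rough sketch of what flatten does, using the same dictionary picture of a tree (my own simplified model, with a made-up `flatten` function, not the Grasshopper component itself):

```python
def flatten(tree):
    """Collapse every list into one single list stored at path {0}."""
    merged = []
    for path in sorted(tree):
        merged.extend(tree[path])
    return {(0,): merged}

# Assumed starting point: eleven lists of six items each.
grid = {(0, 0, k): list(range(k * 6, k * 6 + 6)) for k in range(11)}
flat = flatten(grid)

print(len(grid), "lists before,", len(flat), "list after")
print(len(flat[(0,)]), "items on the one main street")
```

The items keep their order; only the paths that separated them are gone.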

Grafting, on the other hand, takes every item and puts it in its own unique list. This would be like putting every single house in a neighbourhood on its own street. If we look at our param viewer now, we can see that there are 66 lists, and each of them has a single item. Also note what happens to the path indices: we keep the initial {0}, and a new index is added after a semicolon,
so our first item is located at {0;0}(0)
and our second item is located at {0;1}(0), etc.
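Graft can be sketched the same way. Again this is just a simplified model with a hypothetical `graft` function: each item's old item index gets appended to its path, and the item ends up alone in its own list.

```python
def graft(tree):
    """Give every item its own list by appending its item index to its path."""
    grafted = {}
    for path in sorted(tree):
        for i, item in enumerate(tree[path]):
            grafted[path + (i,)] = [item]
    return grafted

# Assumed starting point: one flat list of 66 items at path {0}.
flat = {(0,): list(range(66))}
grafted = graft(flat)

print(len(grafted), "lists, each holding", len(grafted[(0, 0)]), "item")
```

So the item that used to be {0}(1) now lives at {0;1}(0).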

So when is it useful to flatten and graft?

When we flatten data, in essence, we could say that it becomes easier to access. It would be a lot easier for eleven separate posties to deliver their junk mail to everyone if they all lived on the same street. Or in Grasshopper, it becomes a lot easier to connect eleven grafted points to each of the 66 points which exist in our hierarchy if that collection of points is flattened.

But if we wanted each of our posties to only deliver mail to their assigned street, we would graft our eleven posties and connect them to our original rectangular grid, and because that is already broken up into eleven lists, each with six items, we have corresponding data structures.
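Here is a loose sketch of why the two set-ups behave differently. I'm assuming the default matching works like this: lists are paired up in order, items within a pair of lists are paired up in order, and whichever side runs out first repeats its last entry. The `match` function and the postie and house names are invented for the example.

```python
def match(tree_a, tree_b):
    """Pair items of two trees, repeating the last list/item of the shorter side."""
    lists_a = [tree_a[p] for p in sorted(tree_a)]
    lists_b = [tree_b[p] for p in sorted(tree_b)]
    pairs = []
    for i in range(max(len(lists_a), len(lists_b))):
        a = lists_a[min(i, len(lists_a) - 1)]
        b = lists_b[min(i, len(lists_b) - 1)]
        for j in range(max(len(a), len(b))):
            pairs.append((a[min(j, len(a) - 1)], b[min(j, len(b) - 1)]))
    return pairs

# Eleven grafted posties, one per list.
posties = {(0, k): [f"postie{k}"] for k in range(11)}
# The grid: eleven streets with six houses each.
grid = {(0, 0, k): [f"house{k}_{i}" for i in range(6)] for k in range(11)}
# The same houses flattened onto one main street.
street = {(0,): [h for p in sorted(grid) for h in grid[p]]}

to_street = match(posties, street)  # every postie visits every house
to_grid = match(posties, grid)      # each postie visits only their own street

print(len(to_street), "deliveries when the houses are flattened")
print(len(to_grid), "deliveries when the structures correspond")
```

With the flattened street, each of the eleven posties is paired with all 66 houses; with the grid left as eleven lists, postie 0 only ever meets the six houses on street 0.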

And what about simplify…

Simplify is probably a function you won’t use for a while at first. It’s a way to tidy up your data structure. Sometimes when you’ve strung a lot of components together, they will add placeholder indices to the paths; this is something David Rutten talks about in a more detailed blog post if you are interested. So simplify basically eliminates the placeholder indices that have accumulated through your definition.

The second image shows when this might be useful. If you try to combine two data sets whose paths are different, they will not merge together into lists. When we simplify the data coming into the merge component, however, we can see that it outputs eleven lists, each with the twelve items we wanted in them.
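To make that concrete, here is a simplified model of the idea. I'm assuming simplify only strips the leading path indices that every list shares, and that merge combines lists whose paths are identical; the real components handle more cases than this, and the paths chosen for `a` and `b` are my own example rather than the ones in the image.

```python
def simplify(tree):
    """Strip the leading path indices shared by every list (always keep one index)."""
    paths = sorted(tree)
    shared = 0
    while (all(len(p) > shared + 1 for p in paths)
           and len({p[shared] for p in paths}) == 1):
        shared += 1
    return {p[shared:]: tree[p] for p in paths}

def merge(tree_a, tree_b):
    """Combine two trees; lists with identical paths are joined end to end."""
    merged = {p: list(items) for p, items in tree_a.items()}
    for p, items in tree_b.items():
        merged.setdefault(p, []).extend(items)
    return merged

# Two data sets with the same shape but different paths (assumed for illustration).
a = {(0, 0, k): [f"a{k}_{i}" for i in range(6)] for k in range(11)}
b = {(0, k): [f"b{k}_{i}" for i in range(6)] for k in range(11)}

raw = merge(a, b)                       # paths differ, so nothing lines up
tidy = merge(simplify(a), simplify(b))  # paths now match: {0} .. {10}

print(len(raw), "lists of", len(raw[(0, 0, 0)]), "items without simplify")
print(len(tidy), "lists of", len(tidy[(0,)]), "items with simplify")
```

Without simplify the two sets sit side by side as 22 separate lists; with it, each pair of matching lists joins into one list of twelve.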

Additional Resources

Modelab Grasshopper Primer info on data trees

David Rutten’s master class on data trees

One of my early tutorials on data structures

The why and how of data trees – a comprehensive explanation of why we use data trees in grasshopper, by David Rutten

