Advanced datasets

The sections from Introduction to datasets up to and including Using the dataset III discuss datasets and explain their basic applications. This section deals with datasets that have a more complicated structure, so it is advisable to have a good understanding of those earlier sections first. Reading the model structure page also helps.

Datasets with multiple nodes

Up to now, we have discussed datasets that had only one node: data. Therefore, all our data could be found in dataset.data. However, we might want to structure our data a bit more. For example, consider the following data from Microsoft Excel:

Advanced data in Excel

If we put all this data in the dataset.data node, we might run into trouble when we need to do something with the data (for example, change the departments or loop through the birth dates). In this case we need a different kind of dataset type. For this, the dataset type editor has an ‘expert mode’ option, reached via [Actions > Dataset > add icon > Expert mode], as the following image illustrates.

Creating a more complicated dataset type

Doing this is fairly easy: you add a node for each ‘table’ in the Excel file. You can then add variables (‘columns’) to each node. For example, in the image we created a node (‘table’) for each table in the Excel file: one for basic data, one for address and one for function. Next, we added a variable to each node for every column of its table.

Filling in the data in your newly created dataset type is done in the same manner as before. However, as the structure of the dataset can no longer be viewed as a single spreadsheet, the interface will look significantly different:

Entering data in your new dataset

While this might look more technical and more complicated, it is in fact exactly the same as before. Because Berkeley Studio can no longer lay the data out as simple columns and rows, it shows the structure we gave it instead. Every field starts with ds_employee: this is the dataset itself. Next comes the row counter, [0], followed by one of the nodes: basic_data, address or function. The node determines which variables are accessible: ds_employee.basic_data contains first_name, last_name and so on, while ds_employee.function contains department and branche.
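The addressing scheme described above (dataset, then row, then node, then variable) can be pictured with plain Python containers. This is only an analogy, not how Berkeley Studio stores or exposes a dataset. The helper functions, the sample values, the `city` variable and the placement of `birth_date` under `basic_data` are all assumptions made for the illustration.

```python
# Analogy only: plain Python containers standing in for a multi-node dataset.
# The placement of "birth_date", the "city" variable and all sample values
# are assumptions for this sketch.
ds_employee = [
    {  # row [0]
        "basic_data": {"first_name": "Ann", "last_name": "Smith", "birth_date": "1990-04-12"},
        "address": {"city": "Utrecht"},
        "function": {"department": "Sales", "branche": "Retail"},
    },
    {  # row [1]
        "basic_data": {"first_name": "Bob", "last_name": "Jones", "birth_date": "1985-11-03"},
        "address": {"city": "Delft"},
        "function": {"department": "Sales", "branche": "Wholesale"},
    },
]


def get_field(dataset, row, node, variable):
    """Read one field by following dataset -> row -> node -> variable."""
    return dataset[row][node][variable]


def rename_department(dataset, old, new):
    """Change a department in every row; only the function node is touched."""
    changed = 0
    for row in dataset:
        if row["function"]["department"] == old:
            row["function"]["department"] = new
            changed += 1
    return changed


def birth_dates(dataset):
    """Loop through the rows and collect the birth dates."""
    return [row["basic_data"]["birth_date"] for row in dataset]
```

Because each ‘table’ lives in its own node, changing the departments or looping through the birth dates only touches the node concerned, which is the kind of operation that becomes awkward when everything sits in a single data node.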

Looking this dataset up in the data inspector gives a clear overview of its contents:

Checking data in your new dataset

Saving graphs as datasets

Sometimes you will need to save an entire (sub)graph as a dataset. For example, you ask the user for information on several people and then need to make a selection from those people. In this case, you can ‘fill’ a dataset with the information present in the graph. The studio will automatically use that graph as the dataset type for your dataset. The function that does this is graphtodataset(). The accompanying model can be found at the graphtodataset() example.

The structure of creating a dataset from a graph

As you can see, we loop through several of the ds_dataset graphs. After the user has entered the information in each graph, we use the function graphtodataset(ds_dataset) and we’re done! If you now open the data inspector on the model, you’ll get a result like this:

The resulting dataset in the data inspector
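The loop described above can be pictured as follows. This is a Python stand-in, not the studio's graphtodataset() itself, whose behaviour beyond the call graphtodataset(ds_dataset) is not described here. The function `graph_to_dataset`, the variables `name` and `age`, and the sample answers are hypothetical.

```python
# Analogy only: a Python stand-in for "save the current graph as a new row".
# graph_to_dataset, "name", "age" and the sample answers are hypothetical.
def graph_to_dataset(dataset, graph_values):
    """Append a copy of the values currently in the graph as a new row."""
    dataset.append(dict(graph_values))
    return len(dataset) - 1  # index of the row that was just added


ds_dataset = []

# Each entry represents what the user entered during one pass through the graph.
answers_per_pass = [
    {"name": "Person A", "age": 34},
    {"name": "Person B", "age": 27},
]

for answers in answers_per_pass:
    graph_to_dataset(ds_dataset, answers)

# Once every pass is stored as a row, making a selection is a simple filter.
over_30 = [row for row in ds_dataset if row["age"] > 30]
```

The sketch copies the values on each pass, so later changes to the graph do not alter rows that were already saved. Whether the studio behaves the same way is an assumption of this illustration.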

Datasets in datasets

You may encounter situations in which you need a dataset within another dataset. This is technically possible, though rarely useful. Again, it is quite easily done (as shown below), but keep in mind that you can quickly lose track of where information is stored and how and when you can edit it. Most situations don’t need a solution like this, and we generally don’t recommend this option.

That said, actually creating a dataset within a dataset is quite simple. As shown below, we created a new dataset type dt_complicated (via [Actions > Dataset > add icon > Expert mode], within the model used above under ‘Datasets with multiple nodes’). This dataset type has a dataset within the node data:

The new dataset type with a dataset

Pressing OK brings us to the dataset screen. In this screen, you’ll see that the resulting dataset includes its own dataset. So every time you create a new iteration (‘row’) of your dataset, an entirely new dataset is created within that row. This is visualized in the screenshot below:

The new dataset with some data

Again, this is quite easy to set up, but hard to maintain and keep track of.
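To see why nesting is hard to keep track of, the structure can be pictured in plain Python. Again, this is only an analogy. The dataset name `ds_complicated` and the variables `label`, `inner` and `value` are hypothetical; the source only states that the node data contains a dataset.

```python
# Analogy only: every row of the outer dataset carries its own inner dataset.
# ds_complicated, "label", "inner" and "value" are hypothetical names.
def new_row(label):
    """Create an outer row whose data node holds a fresh, empty inner dataset."""
    return {"data": {"label": label, "inner": []}}


ds_complicated = [new_row("first"), new_row("second")]

# Adding inner rows to outer row [0] leaves outer row [1] untouched.
ds_complicated[0]["data"]["inner"].append({"value": 1})
ds_complicated[0]["data"]["inner"].append({"value": 2})


def count_inner_rows(dataset):
    """Return the number of inner rows held by each outer row."""
    return [len(row["data"]["inner"]) for row in dataset]
```

Reaching a single value now takes two row counters, one for the outer dataset and one for the inner dataset, which is where it becomes easy to lose track of where information is stored.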

Other material on datasets

More advanced operations on datasets are discussed separately. Please see copying and deleting parts of a dataset, purging datasets to save memory and merging (combining) datasets. Furthermore, examples of functions can be found in the example models.