Grouping assets in Dagster
Grouping assets using the group_name property in the asset decorator
By default, all assets belong to a group called "default". We can organize assets into specific groups by supplying the group_name property in the asset decorator. For example, suppose we have the following main.py file:
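from dagster import Definitions, asset
import pandas as pd

# No group_name is given, so this asset lands in the "default" group
@asset(name="iris_data")
def get_iris_data():
    df = pd.read_csv("https://raw.githubusercontent.com/SkyTowner/sample_data/main/iris_data.csv")
    return df

@asset(name="setosa", group_name="flower")
def get_setosa(iris_data):
    return iris_data.query("species == 0")

@asset(name="versicolor", group_name="flower")
def get_versicolor(iris_data):
    return iris_data.query("species == 1")

defs = Definitions(assets=[get_iris_data, get_setosa, get_versicolor])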
Launch the Dagster UI using the following command:
dagster dev -f main.py
Our data lineage should look like the following:
Here, we are currently focused on the default group, which is why the assets setosa and versicolor are minimized. The top-left corner shows which group is currently focused:
If we click on the setosa asset, the focused group becomes flower, and all assets belonging to the flower group are shown like so:
Notice how the focused group is also updated in the top-left corner:
To see all our groups, click on the hamburger menu icon in the top-left corner:
To see all our assets, click on the following button in the top-right corner:
We should now see all our assets:
Grouping assets when loading assets from modules
Instead of specifying the group_name property for every asset individually, we can assign a group to all assets in a module. For instance, suppose we have the following two files:
my_assets.py
main.py
Suppose my_assets.py is:
from dagster import asset

@asset(name="setosa")
def get_setosa(iris_data):
    return iris_data.query("species == 0")

@asset(name="versicolor")
def get_versicolor(iris_data):
    return iris_data.query("species == 1")
Notice how we did not specify the group_name property for these assets.
Now, suppose our main.py file is as follows:
from dagster import Definitions, load_assets_from_modules, asset
import pandas as pd
import my_assets

@asset(name="iris_data")
def get_iris_data():
    df = pd.read_csv("https://raw.githubusercontent.com/SkyTowner/sample_data/main/iris_data.csv")
    return df

flower_assets = load_assets_from_modules(
    [my_assets],
    group_name="flower",
)

defs = Definitions(assets=([get_iris_data] + flower_assets))
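Here, we assign every asset in my_assets a group_name of flower when we load them using load_assets_from_modules(). The iris_data asset is defined in main.py rather than in the loaded module, so it stays in the default group.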
If we assign the group_name in load_assets_from_modules(), then an error will be thrown if any of the assets in the imported module already set the group_name property within the asset decorator.