
Datasets

A dataset is a container that can store objects of any type, including other datasets and dataframes.

Creating datasets

You can create such a container as follows:

```python
from research_sdk import Dataset, DataStorageInterface, DataStorageType

with DataStorageInterface.create(DataStorageType.Datacat) as storage:
    with Dataset(name="Test dataset", storage=storage) as ds:
        # Assumes data_object already holds an existing object to add
        ds.add(data_object)
```

A dataset provides the following methods:

  • add
  • get
  • enumerate

Adding an object to a dataset

The add function adds an object to the dataset and immediately flushes it to the datacatalog.

```python
def add(self, obj: DataObject) -> DataObjectInfo
```

The returned DataObjectInfo contains information about the added object.

```python
from dataclasses import dataclass
from typing import Type

DataObjectType = Type["DataObject"]

@dataclass
class DataObjectInfo:
    id: str
    name: str
    type: DataObjectType
```

where

  • id is the identifier of the added object.
  • name is the object name.
  • type is the Python type of the object.
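
As a rough illustration of the return value described above, the sketch below uses a small in-memory stand-in rather than the real research_sdk classes. The `name` attribute on `DataObject`, the `FakeDataset` class, and the UUID-based id are assumptions made for this example only; the SDK may derive ids and names differently.

```python
from dataclasses import dataclass
from typing import List, Type
import uuid


@dataclass
class DataObject:
    # Hypothetical stand-in: the real DataObject is defined by research_sdk.
    name: str


DataObjectType = Type["DataObject"]


@dataclass
class DataObjectInfo:
    # Same three fields as documented above.
    id: str
    name: str
    type: DataObjectType


class FakeDataset:
    """In-memory stand-in that imitates only the documented add() signature."""

    def __init__(self) -> None:
        self._objects: List[DataObject] = []

    def add(self, obj: DataObject) -> DataObjectInfo:
        self._objects.append(obj)
        # Assumption: ids are opaque strings; a random UUID is used here.
        return DataObjectInfo(id=uuid.uuid4().hex, name=obj.name, type=type(obj))


info = FakeDataset().add(DataObject(name="measurements"))
print(info.name, info.type.__name__)
```

With the real SDK, the same three fields would be read from the value returned by `ds.add(...)`.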

Getting an object from the dataset

```python
def get(self, name: str) -> Optional["DataObject"]:
```

Not implemented.

Enumerating objects in a dataset

```python
def enumerate(self) -> List["DataObjectInfo"]:
```

Not implemented.
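
Because get and enumerate are not implemented, one possible workaround is to keep the DataObjectInfo values returned by add in a local index. The sketch below is only an illustration under stated assumptions and is not part of research_sdk: `TrackedDataset`, `StubDataset`, and `Sample` are hypothetical names, and the stub imitates just the documented `add` signature so that the example runs on its own.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class DataObjectInfo:
    # Simplified copy of the documented fields.
    id: str
    name: str
    type: type


class StubDataset:
    """Hypothetical stand-in for Dataset; only add() is imitated."""

    def __init__(self) -> None:
        self._count = 0

    def add(self, obj: object) -> DataObjectInfo:
        self._count += 1
        # Assumption: the added object exposes a `name` attribute.
        return DataObjectInfo(id=str(self._count), name=getattr(obj, "name"), type=type(obj))


class TrackedDataset:
    """Wraps a dataset and remembers what was added during this session."""

    def __init__(self, dataset) -> None:
        self._dataset = dataset
        self._infos: Dict[str, DataObjectInfo] = {}

    def add(self, obj) -> DataObjectInfo:
        info = self._dataset.add(obj)
        self._infos[info.name] = info  # a later add with the same name overwrites
        return info

    def find(self, name: str) -> Optional[DataObjectInfo]:
        # Returns the recorded info, or None if nothing was added under this name.
        return self._infos.get(name)

    def list_infos(self) -> List[DataObjectInfo]:
        return list(self._infos.values())


@dataclass
class Sample:
    # Hypothetical object type used only to exercise the wrapper.
    name: str


tracked = TrackedDataset(StubDataset())
tracked.add(Sample("a"))
tracked.add(Sample("b"))
print([info.name for info in tracked.list_infos()])
```

This index only covers objects added through the wrapper in the current process, and it returns DataObjectInfo records rather than the stored objects themselves.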