Datasets
A dataset is a container that can store objects of any type, including other datasets and dataframes.
Creating datasets
You can create such a container as follows:
```python
from research_sdk import Dataset, DataStorageInterface, DataStorageType

with DataStorageInterface.create(DataStorageType.Datacat) as storage:
    with Dataset(name="Test dataset", storage=storage) as ds:
        # Assumes an existing object is already available in the data_object variable
        ds.add(data_object)
```
A dataset provides the following functions:
- add
- get
- enumerate
Adding an object to a dataset
The add function adds an object and flushes it to datacatalog immediately.
```python
def add(self, obj: DataObject) -> DataObjectInfo:
```
The returned DataObjectInfo contains information about the added object.
```python
from dataclasses import dataclass
from typing import Type

DataObjectType = Type["DataObject"]

@dataclass
class DataObjectInfo:
    id: str
    name: str
    type: DataObjectType
```
where
- id is the identifier of the added object.
- name is the object name.
- type is the Python type of the object.
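To illustrate how the fields of DataObjectInfo might be read after a call to add, here is a minimal, self-contained sketch. It does not import research_sdk: `InMemoryDataset` is a hypothetical stand-in for Dataset, the `DataObject` class with a `name` attribute is a placeholder, and the UUID-based identifier is an assumption, since the source does not say how IDs are assigned.

```python
from dataclasses import dataclass
from typing import Dict, Type
import uuid


class DataObject:
    """Hypothetical placeholder for the SDK's DataObject class."""

    def __init__(self, name: str) -> None:
        self.name = name


DataObjectType = Type["DataObject"]


@dataclass
class DataObjectInfo:
    id: str
    name: str
    type: DataObjectType


class InMemoryDataset:
    """Hypothetical stand-in for Dataset that keeps objects in a dict."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._objects: Dict[str, DataObject] = {}

    def add(self, obj: DataObject) -> DataObjectInfo:
        # Assumption: the ID is a random UUID; the real SDK may assign IDs differently.
        object_id = uuid.uuid4().hex
        self._objects[object_id] = obj
        return DataObjectInfo(id=object_id, name=obj.name, type=type(obj))


ds = InMemoryDataset(name="Test dataset")
info = ds.add(DataObject(name="measurements"))

# Read the three fields described above.
print(info.id, info.name, info.type.__name__)
```

With the real SDK, the same field access should apply to the value returned by `ds.add(data_object)`.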
Getting an object from the dataset
```python
def get(self, name: str) -> Optional["DataObject"]:
```
Not implemented.
Enumerating objects in a dataset
```python
def enumerate(self) -> List["DataObjectInfo"]:
```
Not implemented.
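Since get and enumerate are not implemented, the sketch below only illustrates what their signatures suggest to calling code: get may return None, and enumerate returns a list of DataObjectInfo. Everything here is hypothetical. `InMemoryDataset` and the `DataObject` placeholder are stand-ins, not part of research_sdk, and using the object name as its ID is an assumption made for brevity.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Type


class DataObject:
    """Hypothetical placeholder for the SDK's DataObject class."""

    def __init__(self, name: str) -> None:
        self.name = name


@dataclass
class DataObjectInfo:
    id: str
    name: str
    type: Type[DataObject]


class InMemoryDataset:
    """Hypothetical stand-in that follows the get and enumerate signatures."""

    def __init__(self) -> None:
        self._objects: Dict[str, DataObject] = {}

    def add(self, obj: DataObject) -> DataObjectInfo:
        # Assumption: names are unique, so the name doubles as the ID here.
        self._objects[obj.name] = obj
        return DataObjectInfo(id=obj.name, name=obj.name, type=type(obj))

    def get(self, name: str) -> Optional[DataObject]:
        # Returns None when no object with this name is stored.
        return self._objects.get(name)

    def enumerate(self) -> List[DataObjectInfo]:
        return [
            DataObjectInfo(id=name, name=name, type=type(obj))
            for name, obj in self._objects.items()
        ]


ds = InMemoryDataset()
ds.add(DataObject(name="measurements"))

found = ds.get("measurements")
missing = ds.get("no such object")
names = [info.name for info in ds.enumerate()]

# Callers should check for None before using the result of get.
if missing is None:
    print("object not found")
print(names)
```

The actual behavior of the SDK functions may differ once they are implemented.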