duckdb
mainsequence.client.data_sources_interfaces.duckdb
DuckDBInterface
Persists and serves (time_index, unique_identifier, …) DataFrames in a DuckDB file.
__init__(db_path=None)
Initializes the interface with the path to the DuckDB database file.
Args: db_path (Optional[str | Path]): Path to the database file. Defaults to the value of the DUCKDB_PATH environment variable or 'analytics.duckdb' in the current directory if the variable is not set.
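The default described above can be sketched as a small lookup. This is a minimal illustration, not the library's code: `resolve_db_path` is a hypothetical helper name, and only the `DUCKDB_PATH` variable and the `analytics.duckdb` fallback come from the description.

```python
import os
from pathlib import Path


def resolve_db_path(db_path=None):
    """Hypothetical helper mirroring the documented default for db_path."""
    if db_path is not None:
        # An explicit argument is used as given.
        return Path(db_path)
    # Otherwise use DUCKDB_PATH, or analytics.duckdb (relative to the
    # current directory) when the variable is not set.
    return Path(os.environ.get("DUCKDB_PATH", "analytics.duckdb"))
```

Returning a relative `Path` for the fallback means the file is resolved against the current working directory, which matches the "in the current directory" wording.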
drop_table(table)
Drops the specified table from the database.
Args: table (str): The name of the table to drop.
list_tables()
Lists all user-defined tables in the main schema of the database.
Returns: List[str]: A list of table names. Returns an empty list if the database file does not exist or if an error occurs.
read(table, *, start=None, end=None, great_or_equal=True, less_or_equal=True, ids=None, columns=None, unique_identifier_range_map=None)
Reads data from the specified table, with optional filtering. Handles missing tables by returning an empty DataFrame.
Args:
table (str): The name of the table to read from.
start (Optional[datetime.datetime]): Minimum time_index filter.
end (Optional[datetime.datetime]): Maximum time_index filter.
great_or_equal (bool): If True, use >= for start date comparison. Defaults to True.
less_or_equal (bool): If True, use <= for end date comparison. Defaults to True.
ids (Optional[List[str]]): List of specific unique_identifiers to include.
columns (Optional[List[str]]): Specific columns to select. Reads all if None.
unique_identifier_range_map (Optional[UniqueIdentifierRangeMap]): A map where keys are unique_identifiers and values are dicts specifying date ranges (start_date, end_date, start_date_operand, end_date_operand) for that identifier. Mutually exclusive with 'ids'.
Returns: pd.DataFrame: The queried data, or an empty DataFrame if the table doesn't exist.
Raises:
ValueError: If both ids and unique_identifier_range_map are provided.
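The filtering rules for read can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: `time_filter` and `check_id_arguments` are hypothetical helpers, the `?` placeholder style is assumed, and the identifier `"AAPL"` and the operand spellings in the example range map are made up for the example. Only the parameter names, the >= / <= behavior, the range-map keys, and the ValueError rule come from the description above.

```python
from datetime import datetime


def time_filter(start=None, end=None, great_or_equal=True, less_or_equal=True):
    """Return a (where_fragment, params) pair for the time_index bounds."""
    clauses, params = [], []
    if start is not None:
        # great_or_equal=True keeps rows at exactly `start`; False excludes them.
        clauses.append(f"time_index {'>=' if great_or_equal else '>'} ?")
        params.append(start)
    if end is not None:
        # less_or_equal=True keeps rows at exactly `end`; False excludes them.
        clauses.append(f"time_index {'<=' if less_or_equal else '<'} ?")
        params.append(end)
    return " AND ".join(clauses), params


def check_id_arguments(ids=None, unique_identifier_range_map=None):
    """Mirror the documented rule: the two selectors are mutually exclusive."""
    if ids is not None and unique_identifier_range_map is not None:
        raise ValueError("ids and unique_identifier_range_map are mutually exclusive")


# A range map using the documented keys; the operand values are assumed.
example_range_map = {
    "AAPL": {
        "start_date": datetime(2024, 1, 1),
        "end_date": datetime(2024, 6, 30),
        "start_date_operand": ">=",
        "end_date_operand": "<",
    },
}

fragment, params = time_filter(
    datetime(2024, 1, 1), datetime(2024, 2, 1), less_or_equal=False
)
print(fragment)  # time_index >= ? AND time_index < ?
```

Setting `less_or_equal=False` gives a half-open interval, which is convenient when reading consecutive windows without returning the boundary row twice.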
upsert(df, table)
Idempotently writes a DataFrame into table, using (time_index, unique_identifier) as the primary key. Columns present in the DataFrame but missing from the table are added to the table automatically.
Args: df (pd.DataFrame): DataFrame to upsert. table (str): Target table name.
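What "idempotent" means here can be shown with a small in-memory model. This sketch deliberately uses a plain dict instead of DuckDB or pandas, so it only illustrates the keying behavior; `upsert_rows` and `known_columns` are hypothetical names, the sample values are invented, and overwriting the whole stored row on a key match is an assumption, since the description does not say how columns absent from the incoming frame are treated.

```python
from datetime import datetime


def upsert_rows(store, rows):
    """Insert or overwrite rows keyed by (time_index, unique_identifier)."""
    for row in rows:
        key = (row["time_index"], row["unique_identifier"])
        # Same key: the stored row is overwritten. New key: the row is inserted.
        store[key] = dict(row)
    return store


def known_columns(store):
    """Union of all column names seen so far, like a table that gains columns."""
    return sorted({name for row in store.values() for name in row})


store = {}
batch = [
    {"time_index": datetime(2024, 1, 1), "unique_identifier": "AAPL", "close": 185.0},
]
upsert_rows(store, batch)
upsert_rows(store, batch)  # repeating the same write changes nothing
print(len(store))  # 1

wider = [
    {"time_index": datetime(2024, 1, 2), "unique_identifier": "AAPL",
     "close": 186.0, "volume": 1000},
]
upsert_rows(store, wider)  # introduces a column the first batch did not have
print(known_columns(store))  # ['close', 'time_index', 'unique_identifier', 'volume']
```

Because the key is the pair (time_index, unique_identifier), re-running a job that rewrites the same rows leaves the row count unchanged, while a later batch with an extra column simply widens the set of known columns.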