local_data_lake

mainsequence.client.data_sources_interfaces.local_data_lake

DataLakeInterface

build_time_and_symbol_filter(start_date=None, great_or_equal=True, less_or_equal=True, end_date=None, unique_identifier_list=None, unique_identifier_range_map=None) staticmethod

Build hashable parquet filters based on the parameters.

Args:
    start_date (datetime.datetime, optional): Start date for filtering.
    great_or_equal (bool): Whether the start-date condition is >= rather than >.
    less_or_equal (bool): Whether the end-date condition is <= rather than <.
    end_date (datetime.datetime, optional): End date for filtering.
    unique_identifier_list (list, optional): Unique identifiers to filter on.
    unique_identifier_range_map (dict, optional): Per-identifier date ranges to filter on.

Returns:
    tuple: Hashable parquet filters for use with pandas or pyarrow.
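A minimal usage sketch, assuming the returned tuple is compatible with the `filters` argument of `pandas.read_parquet` / pyarrow; the file path is illustrative:

```python
import datetime

import pandas as pd

from mainsequence.client.data_sources_interfaces.local_data_lake import (
    DataLakeInterface,
)

# Filter: time >= 2024-01-01 UTC and unique_identifier in the given list.
filters = DataLakeInterface.build_time_and_symbol_filter(
    start_date=datetime.datetime(2024, 1, 1, tzinfo=datetime.timezone.utc),
    great_or_equal=True,
    unique_identifier_list=["AAPL", "MSFT"],
)

# Assumed: the tuple converts to the list-of-tuples form pyarrow expects.
df = pd.read_parquet("data/prices.parquet", filters=list(filters))
```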

filter_by_assets_ranges(table_name, asset_ranges_map)

Args:
    table_name: Name of the table to filter.
    asset_ranges_map: Mapping of asset unique identifiers to the date range to load for each.

Returns:
    The rows matching each asset's range.
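A sketch of a call, assuming one (start, end) window per identifier; the exact structure expected by `asset_ranges_map` and the constructor arguments of `DataLakeInterface` are assumptions:

```python
import datetime

from mainsequence.client.data_sources_interfaces.local_data_lake import (
    DataLakeInterface,
)

interface = DataLakeInterface()  # constructor arguments, if any, omitted here

# Hypothetical shape: one date window per unique identifier.
asset_ranges_map = {
    "AAPL": {
        "start_date": datetime.datetime(2024, 1, 1, tzinfo=datetime.timezone.utc),
        "end_date": datetime.datetime(2024, 6, 30, tzinfo=datetime.timezone.utc),
    },
}

df = interface.filter_by_assets_ranges("prices", asset_ranges_map)
```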

persist_datalake(data, overwrite, table_name, time_index_name, index_names)

Persists data to the data lake, partitioned per week. Data is not partitioned per asset_symbol because the system only allows 1024 partitions.

Args:
    data: Frame to persist.
    overwrite: Whether to overwrite existing data for the table.
    table_name: Target table name.
    time_index_name: Name of the time index column.
    index_names: Names of the index columns.
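A minimal sketch of persisting a small frame; the column names, index layout, and the semantics of `overwrite` are assumptions inferred from the signature:

```python
import pandas as pd

from mainsequence.client.data_sources_interfaces.local_data_lake import (
    DataLakeInterface,
)

interface = DataLakeInterface()  # constructor arguments, if any, omitted here

# Small illustrative frame with a UTC time index and an identifier level.
data = pd.DataFrame(
    {
        "time_index": pd.date_range("2024-01-01", periods=5, freq="D", tz="UTC"),
        "unique_identifier": ["AAPL"] * 5,
        "close": [187.1, 188.0, 186.5, 189.2, 190.4],
    }
).set_index(["time_index", "unique_identifier"])

interface.persist_datalake(
    data=data,
    overwrite=True,  # assumed to replace existing data for the table
    table_name="prices",
    time_index_name="time_index",
    index_names=["time_index", "unique_identifier"],
)
```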

query_datalake(table_name, filters=None)

Queries the data lake for time series data.

Args:
    table_name: Name of the table to query.
    filters: Optional parquet filters, e.g. built with build_time_and_symbol_filter.

Returns:
    pd.DataFrame: The queried data.
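A usage sketch that combines the two methods above; the table name is illustrative:

```python
import datetime

from mainsequence.client.data_sources_interfaces.local_data_lake import (
    DataLakeInterface,
)

interface = DataLakeInterface()  # constructor arguments, if any, omitted here

# Restrict the query to Q1 2024 for a single identifier.
filters = DataLakeInterface.build_time_and_symbol_filter(
    start_date=datetime.datetime(2024, 1, 1, tzinfo=datetime.timezone.utc),
    end_date=datetime.datetime(2024, 3, 31, tzinfo=datetime.timezone.utc),
    unique_identifier_list=["AAPL"],
)

df = interface.query_datalake("prices", filters=filters)
```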

memory_usage_exceeds_limit(max_usage_percentage)

Checks whether current memory usage exceeds the given percentage of total system memory.

Args:
    max_usage_percentage: Threshold expressed as a percentage of total memory (e.g. 80).

Returns:
    bool: True if current usage exceeds the threshold.
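A minimal sketch of an equivalent check using psutil; this is an illustration of the idea, not necessarily the library's own implementation:

```python
import psutil


def memory_usage_exceeds_limit(max_usage_percentage: float) -> bool:
    """Return True when system-wide memory usage is above the threshold."""
    return psutil.virtual_memory().percent > max_usage_percentage


if memory_usage_exceeds_limit(80):
    print("Memory pressure high; avoid loading the full file.")
```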

read_full_data(file_path, filters=None, use_s3_if_available=False, max_memory_usage=80) cached

Cached access to a static data lake file.

Args:
    file_path: Path to the file to read.
    filters: Optional parquet filters applied on read.
    use_s3_if_available: Whether to read from S3 when an S3 source is configured.
    max_memory_usage: Memory-usage percentage above which the read is guarded.
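A usage sketch; the path is illustrative and the argument semantics are inferred from the names. Because the call is cached, repeated reads with the same arguments reuse the result, which is why the filters built by build_time_and_symbol_filter are hashable:

```python
from mainsequence.client.data_sources_interfaces.local_data_lake import (
    DataLakeInterface,
)

interface = DataLakeInterface()  # constructor arguments, if any, omitted here

df = interface.read_full_data(
    "data/prices.parquet",      # illustrative path to a static file
    filters=None,               # or a filter tuple from build_time_and_symbol_filter
    use_s3_if_available=False,  # prefer the local copy
    max_memory_usage=80,        # guard threshold as a percentage of total memory
)
```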