crypto-lake / lake-api

Python API for accessing Lake high frequency tick trades & order book data

Home Page: https://crypto-lake.com/

License: Apache License 2.0

Languages: Python 98.70%, Makefile 1.30%
Topics: cryptocurrency, cryptocurrency-api, data-provider, market-data, market-data-service, trading-strategies, crypto-trading-strategies, dataset, orderbook, orderbook-data

lake-api's Introduction

Lake API

API for accessing Lake crypto market data.

Lake is a service providing highly detailed historical cryptocurrency market data, including order book data, tick trades and 1-minute trade candles. It is tuned for quant and machine-learning workflows, offering high performance, caching and parallelization.

Usage

If you don't have a paid plan with AWS credentials, you can access sample data:

import lakeapi

lakeapi.use_sample_data(anonymous_access=True)

df = lakeapi.load_data(
    table="book",
    start=None,
    end=None,
    symbols=["BTC-USDT"],
    exchanges=["BINANCE"],
)

With paid access, you can query any data:

import datetime

import lakeapi

# Downloads SOL-USDT trades for the last 2 days from the Kucoin exchange
df = lakeapi.load_data(
    table="trades",
    start=datetime.datetime.now() - datetime.timedelta(days=2),
    end=None,
    symbols=["SOL-USDT"],
    exchanges=["KUCOIN"],
)

We recommend adding the .lake_cache directory to .gitignore, because Lake API stores its cache in this directory inside the working directory.
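One way to apply this recommendation from a setup script is sketched below. The helper name `ensure_gitignored` is hypothetical and not part of Lake API; it only appends the entry when it is not already listed.

```python
from pathlib import Path


def ensure_gitignored(repo_dir, entry=".lake_cache/"):
    """Append `entry` to <repo_dir>/.gitignore unless it is already listed.

    Returns True if the file was modified. Hypothetical helper, not part of lakeapi.
    """
    gitignore = Path(repo_dir) / ".gitignore"
    text = gitignore.read_text() if gitignore.exists() else ""
    # Treat ".lake_cache" and ".lake_cache/" as the same entry.
    wanted = entry.rstrip("/")
    if any(line.strip().rstrip("/") == wanted for line in text.splitlines()):
        return False
    # Start on a fresh line if the existing file lacks a trailing newline.
    separator = "" if not text or text.endswith("\n") else "\n"
    gitignore.write_text(text + separator + entry + "\n")
    return True
```

Calling `ensure_gitignored(".")` from the project root is idempotent, so it can run on every setup without duplicating the entry.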

lake-api's Issues

Corrupted files

  • Lake API version: 0.8.0
  • Python version: 3.10
  • Operating System: macOS

Description

I want to report corrupted trades for all symbols on Coinbase on 2023-03-04.

What I Did

I tried to download the aforementioned data. The returned DataFrame is empty and the following log message is printed.

WARNING  [_read_parquet.py:59] Ignoring corrupted file = s3://qnt.data/market-data/cryptofeed/trades/exchange=COINBASE/symbol=ADA-USD/dt=2023-03-04/1.snappy.parquet
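When reporting several such warnings, it can help to extract the table, exchange, symbol and date from each path. The sketch below infers the path layout from the single path shown in the warning above; the helper name `parse_partition_path` is hypothetical and the layout may differ for other tables.

```python
import re

# Layout inferred from the one path in the warning above (an assumption).
_PARTITION_RE = re.compile(
    r"/(?P<table>[^/]+)"
    r"/exchange=(?P<exchange>[^/]+)"
    r"/symbol=(?P<symbol>[^/]+)"
    r"/dt=(?P<dt>\d{4}-\d{2}-\d{2})/"
)


def parse_partition_path(path):
    """Return table, exchange, symbol and date parsed from a path, or None."""
    match = _PARTITION_RE.search(path)
    return match.groupdict() if match else None


path = (
    "s3://qnt.data/market-data/cryptofeed/trades/"
    "exchange=COINBASE/symbol=ADA-USD/dt=2023-03-04/1.snappy.parquet"
)
print(parse_partition_path(path))
# {'table': 'trades', 'exchange': 'COINBASE', 'symbol': 'ADA-USD', 'dt': '2023-03-04'}
```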

At what time of the day is the data from the previous day available?

Question:
I would like to know:

  • How many times per day is data consolidated and made available? (e.g. once a day, once per hour, ...)
  • At what hour of the day is the data expected to be complete?

The use case is reading new data as it arrives, by scheduling ingestion of the last day's or hour's data at a fixed time.
I checked the docs and didn't find this.

Thanks
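For the ingestion use case described above, a scheduled job would need the bounds of the previous day. The sketch below is only an illustration: it assumes daily partitions follow UTC days and that naive datetimes are passed as in the README examples, neither of which the source confirms. It also does not answer when the data becomes available. The helper name `previous_day_window` is hypothetical.

```python
import datetime


def previous_day_window(now=None):
    """Return (start, end) naive datetimes spanning the full day before `now`.

    When `now` is omitted, the current UTC time is used (an assumption).
    """
    if now is None:
        now = datetime.datetime.now(datetime.timezone.utc).replace(tzinfo=None)
    today = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return today - datetime.timedelta(days=1), today


start, end = previous_day_window(datetime.datetime(2023, 3, 5, 7, 30))
print(start, end)
# The pair could then be passed as start=start, end=end to lakeapi.load_data(...).
```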

How to access Lake pro with credentials

  • Lake API version: 0.8.0
  • Python version: 3.10.12
  • Operating System: macOS

Description

I've just bought a subscription plan on Crypto Lake, and there's no documentation on how to access Lake pro or where to enter the API credentials.

What I Did

I've tried this command on Google Colab, with no luck:

from google.colab import userdata
userdata.get('secretName')
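The snippet above only reads a secret and does not hand it to anything. A possible direction is sketched below, under two assumptions the source does not confirm: that the paid plan's credentials are a standard AWS access key pair (the README mentions "AWS credentials"), and that lakeapi resolves credentials through boto3's default chain, which reads the environment variables used here. The helper name and the Colab secret names are hypothetical.

```python
import os


def export_aws_credentials(access_key_id, secret_access_key, region=None, env=os.environ):
    """Expose an AWS key pair through the environment variables boto3 reads."""
    env["AWS_ACCESS_KEY_ID"] = access_key_id
    env["AWS_SECRET_ACCESS_KEY"] = secret_access_key
    if region is not None:
        env["AWS_DEFAULT_REGION"] = region


# In Colab, with hypothetical secret names, this might look like:
# from google.colab import userdata
# export_aws_credentials(
#     userdata.get("aws_access_key_id"),
#     userdata.get("aws_secret_access_key"),
# )
```

The variables must be set before the first `lakeapi.load_data` call so that any session created afterwards can see them.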

Free API layer not working as defined in docs

  • Lake API version: latest
  • Python version: 3.8.10
  • Operating System: Ubuntu 20.04

Description

Trying to get data through the Free Data API, with anonymous_access=True, fails to provide the data.

botocore.exceptions.UnauthorizedSSOTokenError: The SSO session associated with this profile has expired or is otherwise invalid. To refresh this SSO session run aws sso login with the corresponding profile.

What I Did

Running this code as suggested in the docs:

import lakeapi
import datetime

def save_currency_rates_crypto_lake(symbols=["BTC-USDT"]):
    lakeapi.use_sample_data(anonymous_access=True)

    df = lakeapi.load_data(
        table="book",
        start=datetime.datetime(2022, 10, 1),
        end=datetime.datetime(2022, 10, 2),
        symbols=symbols,
        exchanges=["BINANCE"],
    )
    print(df)

if __name__ == '__main__':
    save_currency_rates_crypto_lake()

Output:

$ python3 dags/utils/crypto_lake.py 
Traceback (most recent call last):
  File "/home/user/.local/lib/python3.8/site-packages/botocore/credentials.py", line 2137, in _get_credentials
    response = client.get_role_credentials(**kwargs)
  File "/home/user/.local/lib/python3.8/site-packages/botocore/client.py", line 535, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/home/user/.local/lib/python3.8/site-packages/botocache/botocache.py", line 54, in _make_api_call
    return super()._make_api_call(operation_name, api_params)
  File "/home/user/.local/lib/python3.8/site-packages/botocore/client.py", line 983, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.errorfactory.UnauthorizedException: An error occurred (UnauthorizedException) when calling the GetRoleCredentials operation: Session token not found or invalid

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "dags/utils/crypto_lake.py", line 18, in <module>
    save_currency_rates_crypto_lake()
  File "dags/utils/crypto_lake.py", line 7, in save_currency_rates_crypto_lake
    df = lakeapi.load_data(
  File "/home/user/.local/lib/python3.8/site-packages/lakeapi/main.py", line 161, in load_data
    df = lakeapi._read_parquet.read_parquet(
  File "/home/user/.local/lib/python3.8/site-packages/lakeapi/_read_parquet.py", line 611, in read_parquet
    dfs=_read_dfs_from_multiple_paths(
  File "/home/user/.local/lib/python3.8/site-packages/lakeapi/_read.py", line 145, in _read_dfs_from_multiple_paths
    kwargs["boto3_session"] = boto3_to_primitives(kwargs["boto3_session"])
  File "/home/user/.local/lib/python3.8/site-packages/lakeapi/_utils.py", line 44, in boto3_to_primitives
    "aws_access_key_id": getattr(credentials, "access_key", None),
  File "/home/user/.local/lib/python3.8/site-packages/botocore/credentials.py", line 406, in access_key
    self._refresh()
  File "/home/user/.local/lib/python3.8/site-packages/botocore/credentials.py", line 499, in _refresh
    self._protected_refresh(is_mandatory=is_mandatory_refresh)
  File "/home/user/.local/lib/python3.8/site-packages/botocore/credentials.py", line 515, in _protected_refresh
    metadata = self._refresh_using()
  File "/home/user/.local/lib/python3.8/site-packages/botocore/credentials.py", line 662, in fetch_credentials
    return self._get_cached_credentials()
  File "/home/user/.local/lib/python3.8/site-packages/botocore/credentials.py", line 672, in _get_cached_credentials
    response = self._get_credentials()
  File "/home/user/.local/lib/python3.8/site-packages/botocore/credentials.py", line 2139, in _get_credentials
    raise UnauthorizedSSOTokenError()
botocore.exceptions.UnauthorizedSSOTokenError: The SSO session associated with this profile has expired or is otherwise invalid. To refresh this SSO session run aws sso login with the corresponding profile.
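The traceback shows botocore refreshing SSO credentials, which suggests that a local AWS profile was picked up even though anonymous access was requested. This is an interpretation of the traceback, not a confirmed diagnosis. A small diagnostic sketch (the helper name is hypothetical) lists which AWS-related environment variables are set, without printing their values:

```python
import os

# Environment variables that botocore consults when resolving a profile
# or credentials.
AWS_ENV_VARS = (
    "AWS_PROFILE",
    "AWS_DEFAULT_PROFILE",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_SESSION_TOKEN",
    "AWS_CONFIG_FILE",
    "AWS_SHARED_CREDENTIALS_FILE",
)


def aws_env_overview(env=os.environ):
    """Report which AWS-related variables are set, without revealing values."""
    return {name: name in env for name in AWS_ENV_VARS}


for name, is_set in aws_env_overview().items():
    print(f"{name}: {'set' if is_set else 'not set'}")
```

If none of these are set, an SSO-based default profile in the shared AWS config file is another place a session could come from.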

Can't request candles data for cnSOL-USDC and tSOL-USDC

  • Lake API version: Latest
  • Python version: 3.11.4
  • Operating System: macOS

Description

I can't download candles data for cnSOL-USDC and tSOL-USDC.

What I Did

trades = lakeapi.load_data(
    table="candles",
    start=datetime.datetime.now() - datetime.timedelta(days=2),
    end=None,
    symbols=["tSOL-USDC"],
    exchanges=None,
)

trades = lakeapi.load_data(
    table="candles",
    start=datetime.datetime.now() - datetime.timedelta(days=2),
    end=None,
    symbols=["cnSOL-USDC"],
    exchanges=None,
)

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/IPython/core/interactiveshell.py", line 3548, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "", line 1, in
    trades = lakeapi.load_data(
             ^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/lakeapi/main.py", line 124, in load_data
    assert symbols[0].upper() == symbols[0]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError
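The traceback ends at `assert symbols[0].upper() == symbols[0]`, so the call is rejected whenever the first symbol contains lower-case characters, as `tSOL-USDC` and `cnSOL-USDC` do. The sketch below applies the same comparison to every symbol before calling the API; the helper name `non_uppercase_symbols` is hypothetical.

```python
def non_uppercase_symbols(symbols):
    """Return the symbols that differ from their upper-cased form."""
    return [s for s in symbols if s.upper() != s]


print(non_uppercase_symbols(["tSOL-USDC", "BTC-USDT", "cnSOL-USDC"]))
# ['tSOL-USDC', 'cnSOL-USDC']
```

Whether an upper-cased variant of these symbols refers to the same instrument is not stated in the source, so upper-casing them blindly is not necessarily a fix.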

Error when downloading funding data - "cannot convert input with unit 's'"

  • Lake API version: 0.12.0
  • Python version: 3.11.7
  • Operating System: Ubuntu version: 20.04.3 LTS

Description

An error occurs when trying to download PERP funding data: the API tries to convert the column "next_funding_time" to a pandas datetime, which fails because the data is not given in Unix format.

Reproduce Error

from datetime import datetime

import lakeapi

table = "funding"
exchange = "BINANCE_FUTURES"
trading_pair = "BTC-USDT-PERP"

start_date = datetime(2023, 1, 1, 0, 0)
end_date = datetime(2023, 12, 31, 0, 0)

df = lakeapi.load_data(
    table=table,
    start=start_date,
    end=end_date,
    symbols=[trading_pair],
    exchanges=[exchange],
    drop_partition_cols=True,
)

Error Message

cannot convert input with unit 's'

Cause of Trouble

lake-api/main.py line 216

if "next_funding_time" in df.columns:
    df["next_funding_time"] = pd.to_datetime(df["next_funding_time"], unit="s", cache=True)

Problem

The content of the column "next_funding_time" is presumably not given in Unix format but is rather the absolute number of nanoseconds until the next funding time, so it is a time difference rather than a timestamp. I have not read the Binance API documentation; this is just the first explanation that came to my mind.

Potential Solution

Just leave "next_funding_time" in its plain format or optionally rename it to something like "ns_to_next_funding_time".

if "next_funding_time" in df.columns:
    df.rename(columns={"next_funding_time": "ns_to_next_funding_time"}, inplace=True)

Alternatively, "next_funding_time" could be added to origin_time to obtain a timestamp column. However, at least in the limited samples I have checked, next_funding_time does not match the actual time difference to the next funding data point anyway, so I don't think this would add useful information.
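The error message is consistent with values that are too large to be interpreted as seconds: pandas stores nanosecond-resolution datetimes as signed 64-bit nanosecond counts, so a value read with unit="s" must stay below roughly 9.2 billion. The sketch below checks that bound with plain Python. Whether the column really holds nanoseconds remains the guess made above, and the helper name `fits_as_seconds` is hypothetical.

```python
# Largest signed 64-bit integer, the upper limit of a nanosecond count.
INT64_MAX = 2**63 - 1

# Largest whole number of seconds that still fits when scaled to nanoseconds.
MAX_SECONDS = INT64_MAX // 10**9  # 9_223_372_036, a date in the year 2262


def fits_as_seconds(value):
    """True if `value` seconds since the epoch fit into an int64 of nanoseconds."""
    return abs(value) <= MAX_SECONDS


print(fits_as_seconds(1_672_531_200))                  # 2023-01-01 in seconds
print(fits_as_seconds(1_672_531_200_000_000_000))      # same instant in nanoseconds
```

The first call prints True and the second False, which is the situation in which a conversion with unit="s" cannot succeed.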

Why is Pandas <2 required?

  • Lake API version: 0.8.0
  • Python version: 3.10
  • Operating System: macOS, Ubuntu

Description

I want to install Lake API alongside a pandas version of 2.0.0 or later.
Why does Lake API restrict the pandas version to <2?

What I Did

Just installing requirements as follows:

pandas >= 2.1.0
lakeapi
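The source does not give the reason for the pin. To see the exact constraint that an installed release declares, the package metadata can be inspected with the standard library. This is a sketch with hypothetical helper names; the distribution name `lakeapi` is taken from the requirements listing above.

```python
import re
from importlib import metadata


def filter_requirements(requires, dependency):
    """Keep the requirement strings whose project name equals `dependency`."""
    wanted = dependency.lower()
    matches = []
    for req in requires:
        # The project name is the leading run of name characters.
        name = re.match(r"[A-Za-z0-9._-]+", req)
        if name and name.group(0).lower() == wanted:
            matches.append(req)
    return matches


def declared_requirements(dist_name, dependency):
    """Return what the installed `dist_name` declares about `dependency`."""
    try:
        requires = metadata.requires(dist_name) or []
    except metadata.PackageNotFoundError:
        return []  # distribution is not installed in this environment
    return filter_requirements(requires, dependency)


print(declared_requirements("lakeapi", "pandas"))
```

An empty list means either that the distribution is not installed or that it declares no such dependency.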
