Comments (1)
I can't reproduce your issue, but I get something quite similar: after an append, the index gets out of order. The data is OK-ish, but the index is not.
On a small file this is not an issue, as the results can be sorted, but on large files sorting is very slow.
import pandas as pd
import pystore
# create new store
pystore.set_path(path='/tmp/store')
store = pystore.store(datastore='datastore', engine='pyarrow')
test_eod = store.collection(collection='TEST.EOD')
# generate sample data
df = pd.DataFrame(
list(range(10)), index=pd.date_range(start='2020-1-1', periods=10), columns=['data']
)
df
data
2020-01-01 0
2020-01-02 1
2020-01-03 2
2020-01-04 3
2020-01-05 4
2020-01-06 5
2020-01-07 6
2020-01-08 7
2020-01-09 8
2020-01-10 9
# generate 2 overlapping sets of data
dfa = df[:-3].copy()
dfb = df[-5:].copy()
#
dfa
data
2020-01-01 0
2020-01-02 1
2020-01-03 2
2020-01-04 3
2020-01-05 4
2020-01-06 5
2020-01-07 6
dfb
data
2020-01-06 5
2020-01-07 6
2020-01-08 7
2020-01-09 8
2020-01-10 9
# write data
test_eod.write('STOCK',data=dfa)
# query data
test_eod.item('STOCK').to_pandas()
data
2020-01-01 0
2020-01-02 1
2020-01-03 2
2020-01-04 3
2020-01-05 4
2020-01-06 5
2020-01-07 6
# append data
test_eod.append('STOCK',data=dfb)
# query again
test_eod.item('STOCK').to_pandas()
data
2020-01-03 2
2020-01-04 3
2020-01-08 7
2020-01-01 0
2020-01-02 1
2020-01-05 4
2020-01-06 5
2020-01-07 6
2020-01-09 8
2020-01-10 9
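For comparison, here is a rough pandas-only sketch of the result I would expect from the append. It only illustrates the expected output and is not how pystore implements append; the helper name `expected_append` is made up for this example, and keeping the last of each duplicated index label is an assumption.

```python
import pandas as pd

# Same sample data as above
df = pd.DataFrame(
    list(range(10)),
    index=pd.date_range(start="2020-01-01", periods=10),
    columns=["data"],
)
dfa = df[:-3].copy()  # 2020-01-01 .. 2020-01-07
dfb = df[-5:].copy()  # 2020-01-06 .. 2020-01-10 (overlaps dfa on two days)


def expected_append(existing, new):
    """Hypothetical helper: combine two frames, drop repeated index labels, sort by index."""
    combined = pd.concat([existing, new])
    # Assumption: when an index label appears in both frames, keep the newer row
    combined = combined[~combined.index.duplicated(keep="last")]
    return combined.sort_index()


result = expected_append(dfa, dfb)
```

Here `result` has the same ten rows as the original `df`, with the index in ascending order.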
pystore handles overlapping data, so I'm quite sure that's not the issue. The issue seems to come from dask; however, there's no "sort_index" in dask, only sort_values and set_index, which is not what we need here.
I wonder if this is caused by dask dropping support for fastparquet.
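In the meantime, the sorting workaround mentioned at the top can be sketched like this. The frame below just recreates the scrambled order from the output above so the snippet runs without pystore; with a real store it would be whatever `.to_pandas()` returns. The name `ensure_sorted` is made up for this example, and the sort happens in memory, so it does not help with the large-file slowness.

```python
import pandas as pd

# Recreate the out-of-order result shown after the append
dates = pd.to_datetime([
    "2020-01-03", "2020-01-04", "2020-01-08", "2020-01-01", "2020-01-02",
    "2020-01-05", "2020-01-06", "2020-01-07", "2020-01-09", "2020-01-10",
])
scrambled = pd.DataFrame({"data": [2, 3, 7, 0, 1, 4, 5, 6, 8, 9]}, index=dates)


def ensure_sorted(frame):
    """Hypothetical helper: sort by index only when the index is out of order."""
    if frame.index.is_monotonic_increasing:
        return frame  # already ordered, skip the sort
    return frame.sort_index()


fixed = ensure_sorted(scrambled)
```

Checking `is_monotonic_increasing` first avoids a needless sort when the stored data comes back in order.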
from pystore.