Comments (2)
Hello @sheraztariq007, thanks for the issue.
The memory issue is not a bug in tslearn; it is due to the fact that a large matrix of shape (n_ts, n_ts) is created to store the pairwise distances.
I also run into a memory error on my local computer when I run the following code:
import numpy as np
n_ts = 66000
a = np.zeros((n_ts, n_ts))
print(a.shape)
I then get the following error message:
numpy.core._exceptions.MemoryError: Unable to allocate 32.5 GiB for an array with shape (66000, 66000) and data type float64
However, everything works fine when I use slightly fewer time series:
import numpy as np
n_ts = 65000
a = np.zeros((n_ts, n_ts))
print(a.shape)
Then the following message is printed:
(65000, 65000)
When I look at my computer's specifications, I see that it has 31.0 GiB of memory.
A first possibility to solve your issue is to use a computer or server with more memory.
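The sizes above follow from simple arithmetic: a dense float64 matrix of shape (n_ts, n_ts) needs n_ts * n_ts * 8 bytes. The helper below is a minimal sketch of that calculation; the function name pairwise_matrix_gib is hypothetical and is not part of tslearn or NumPy.

```python
def pairwise_matrix_gib(n_ts, itemsize=8):
    """Memory in GiB for a dense (n_ts, n_ts) matrix.

    itemsize is the number of bytes per element (8 for float64).
    Hypothetical helper for illustration only.
    """
    return n_ts * n_ts * itemsize / 2**30


for n_ts in (65000, 66000):
    # Rounded to one decimal, as in the NumPy error message.
    print(n_ts, round(pairwise_matrix_gib(n_ts), 1))
```

For n_ts = 66000 this gives 32.5 GiB, matching the error message. For n_ts = 65000 it gives about 31.5 GiB, which is still slightly above the 31.0 GiB reported; np.zeros can succeed there presumably because the operating system only commits zero-filled pages once they are written to.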
Also, the else branch of the function silhouette_score is reachable when the input parameter metric is callable.
It allows the user to use customized metrics.
However, I found an error in this else branch, which I fixed in PR #508.
In the next tslearn version (the current version is 0.6.3), the following code:
import numpy as np
from tslearn.clustering.utils import silhouette_score
from tslearn.metrics import dtw
from tslearn.metrics import cdist_dtw
np.random.seed(0)
n_ts = 200
sz = 3
d = 2
X = np.random.randn(n_ts, sz, d)
labels = [0] * (n_ts // 2) + [1] * (n_ts // 2)
# First method: pass a precomputed DTW distance matrix
score = silhouette_score(
    X=cdist_dtw(X),
    labels=labels,
    metric="precomputed")
print(score)
# Second method: pass the DTW function as a callable metric
score = silhouette_score(
    X=X,
    labels=labels,
    metric=dtw)
print(score)
will print two equal scores:
0.0018015434161644396
0.0018015434161644396
In the current tslearn version, the second method raises a recursion error.
This second method of computing the silhouette score might solve your memory issue, since tslearn's silhouette_score function then calls scikit-learn's silhouette_score function, which does not create a pairwise distance matrix of shape (n_ts, n_ts) when n_ts is too large. Instead, it creates several partial matrices of shape (n_chunk_samples, n_samples), computes the scores for each chunk, and then concatenates the scores.
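To illustrate the chunking idea, here is a minimal pure-Python sketch. It is not scikit-learn's implementation: the function chunked_row_means, the toy metric abs_diff and the reduction (a plain mean per row rather than a silhouette score) are all assumptions chosen for brevity. The point is only that each block of shape (n_chunk_samples, n_samples) is reduced and discarded before the next one is built.

```python
def chunked_row_means(X, metric, chunk_size):
    """Mean distance from each sample to all samples, chunk by chunk.

    Only a (chunk_size, n_samples) block of distances is held at a time,
    never the full (n_samples, n_samples) matrix.
    Hypothetical helper for illustration only.
    """
    n_samples = len(X)
    means = []
    for start in range(0, n_samples, chunk_size):
        chunk = X[start:start + chunk_size]
        # Partial distance matrix of shape (len(chunk), n_samples).
        block = [[metric(a, b) for b in X] for a in chunk]
        # Reduce the block immediately, then append (concatenate) the results.
        means.extend(sum(row) / n_samples for row in block)
    return means


def abs_diff(a, b):
    """Toy metric standing in for a time series distance such as DTW."""
    return abs(a - b)


X = [0.0, 1.0, 3.0, 6.0]
print(chunked_row_means(X, abs_diff, chunk_size=2))
print(chunked_row_means(X, abs_diff, chunk_size=len(X)))
```

Both calls print the same values, since the chunk size only changes how much memory is held at once, not the result.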