gaprieto / multitaper Goto Github PK
View Code? Open in Web Editor NEWMultitaper codes translated into Python.
License: MIT License
Multitaper codes translated into Python.
License: MIT License
Hi!
First of all many thanks for the great library! It saves me a lot of work, not having to implement the multitapering myself!
However, I have a proposal to speed up the jackknife confidence interval estimation by calling scipy.stats.t.ppf
vectorized. Replacing the part (here) by a version without for-loop
#------------------------------------------------------
# Use the degrees of freedom
#------------------------------------------------------
smaller1 = se[:nfft] < 1.0
if np.any(smaller1):
indices = np.where(smaller1)[0]
print('DOF < 1 at ', indices[0], 'th frequency: ', se[indices[0]])
raise ValueError("Jackknife - DOF are wrong")
qt = scipy.stats.t(df = se).ppf(0.95)
var[:, 0] = np.exp(qt) * np.sqrt(var[:,0])
#-----------------------------------------------------------------
...
makes a tremendous speedup, if you work with larger data sets points.
Many thanks!
EDIT (6th Dec 2022):
I also found here, that a bias estimate is calculated, which is never used by the jackspec
function. Additionally, the line consumes an enormous amount of RAM for large input arrays, as the difference of a (nfft, 1)
-shaped array and a (nfft,)
-shaped array is of shape (nfft, nfft)
. I am not sure, what the idea behind this line was...
Hi,
I got this error:
psd.reshape()
Doing F test
Traceback (most recent call last):
File "/var/folders/q5/dgt5lxr10m9ct0v4v6tbbxxwwnnz8p/T/ipykernel_73077/1107857592.py", line 1, in <cell line: 1>
psd.reshape()
File "/Users/aa/opt/anaconda3/envs/newenv/lib/python3.8/site-packages/multitaper/mtspec.py", line 404, in reshape
yk,sline = utils.yk_reshape(self.yk,self.vn,fcrit=fcrit)
File "/Users/aa/opt/anaconda3/envs/newenv/lib/python3.8/site-packages/multitaper/utils.py", line 1420, in yk_reshape
p = utils.ftest(vn,yk)[1]
NameError: name 'utils' is not defined
line 1420 in utils.py needs to change to :
p = ftest(vn,yk)[1]
...
Thanks for the code...
Hi Germán,
I noticed some drastic performance decrease switching from the FORTRAN to the Python codes (some decrease would be expected, but not 100 / 1000x).
For MTSpec
the most time is needed for the execution of xint
and set_xint
(utils.py). One could solve the performance issue without rewriting a lot of code by adding numba
(JIT-compiler) with caching.
That simple dependency would increase the performance drastically (see example) without changing a lot of code. I only checked that function, but it could probably also used for other functions.
I used one of your example files to test the execution time:
import sys
sys.path.append("../..")
import multitaper.utils as utils
from multitaper import MTSpec
import timeit
def mtspec(x,nw,dt):
psd = MTSpec(x,nw=nw,dt=dt)
freq, spec = psd.rspec()
return freq[:,0], spec[:,0]
fname = 'v22_174_series.dat'
nw = 4.0
x = utils.get_data(fname)
npts = np.shape(x)[0]
dt = 4930
t = np.arange(npts)*dt
reps = 10
time = timeit.timeit(lambda: mtspec(x,nw,dt), number=reps) / reps
# time: 0.33, unmodified
# time: 0.13, numba
# time: 0.004, numba + cache
I opened a pull-request #10.
Best,
Kilian
When generating an MTSpec object, and subsequently calling df_spec() results in a ValueError: too many values to unpack
Example code:
import multitaper
import numpy as np
psd = multitaper.mtspec.MTSpec(np.random.randn(2**7),nw=4,dt=1)
psd.df_spec()
It seems that df_spec() is returning an additional frequency attribute.
When trying to use mtspec.spectrogram function for sub-second window, it throws a zero division error:
ZeroDivisionError Traceback (most recent call last)
...
... site-packages\multitaper\mtspec.py in spectrogram(data, dt, twin, olap, nw, kspec, fmin, fmax, iadapt)
973 npts = np.size(data)
974 nmax = npts-nwin
--> 975 nvec = np.arange(0,nmax,njump)
976 t = nvec*dt
977 nspec = len(nvec)
Turns out that the code at fault is line #971: njump = int(np.round(twin*(1.0-olap))/dt)
which produces 0 whenever twin
is a fraction. From the documentation it seems twin
parameter is the length of (overlapping) segments on which multitaper estimation is carried out. It is not clear why it should not be less than 1 second. Or may be this is a bug, and the round function should have included dt
.
I use data from automated weather stations. Due to power failures and other reasons gaps are present in the observations. I am wanting to use multitaper spectiral estimates such as those described in this paper https://core.ac.uk/download/pdf/237709478.pdf That code is in matlab however and I am looking for a pythonic version.
Can I use your package ?
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.