
detex's People

Contributors

d-chambers


detex's Issues

using subSamp = True in createCluster causes waveform misalignment

If a SubSpace instance is created from a ClusterStream instance that was initialized with subSamp = True, the subsample extrapolation can cause misalignment in the multiplexed waveforms. This results in a higher-dimensional representation than necessary. The following examples from the intro tutorial illustrate the point. The first is a plot from a SubSpace instance whose Cluster instance used subSamp = False, and the second used subSamp = True.

image

image

Can no longer get hypoDD times

I used a standard TemplateKey from an earthquake catalog. I first ran
cl = detex.createCluster(CCreq=0.68,trim=[5,30],fetch_arg='../EventWaveForms',fileName='clustDD.pkl',enforceOrigin=True)
and then

cl.writeSimpleHypoDDInput(minCC=0.70)
Traceback (most recent call last):
  File "", line 1, in
    cl.writeSimpleHypoDDInput(minCC=0.70)
  File "/home/pankow/anaconda/lib/python2.7/site-packages/detex/subspace.py", line 110, in writeSimpleHypoDDInput
    trdf = self.TRDF[self.TRDF.Station == sta].iloc[0]
AttributeError: 'ClusterStream' object has no attribute 'TRDF'

KML function to represent clusters

A method of the detex.subspace.Cluster class to output KML files (for Google Earth) showing the cluster structure at each station would be useful for visualization.
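Such a method might be sketched with only the standard library (simplekml, already in the project's dependency list, would work equally well). Everything below is hypothetical, not detex code: `clusters_to_kml` and its `{cluster_id: [(name, lat, lon), ...]}` input layout are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"

def clusters_to_kml(clusters, path):
    """Hypothetical helper (not part of detex): write one KML Folder per
    cluster, with a Placemark per event, so a station's cluster structure
    can be browsed in Google Earth.

    `clusters` maps a cluster id to a list of (name, lat, lon) tuples.
    """
    ET.register_namespace("", KML_NS)  # serialize without namespace prefixes
    kml = ET.Element("{%s}kml" % KML_NS)
    doc = ET.SubElement(kml, "{%s}Document" % KML_NS)
    for cid, events in clusters.items():
        folder = ET.SubElement(doc, "{%s}Folder" % KML_NS)
        ET.SubElement(folder, "{%s}name" % KML_NS).text = "cluster_%s" % cid
        for name, lat, lon in events:
            pm = ET.SubElement(folder, "{%s}Placemark" % KML_NS)
            ET.SubElement(pm, "{%s}name" % KML_NS).text = name
            point = ET.SubElement(pm, "{%s}Point" % KML_NS)
            # KML coordinate order is lon,lat
            coords = ET.SubElement(point, "{%s}coordinates" % KML_NS)
            coords.text = "%.6f,%.6f" % (lon, lat)
    ET.ElementTree(kml).write(path, xml_declaration=True, encoding="UTF-8")
    return path
```

One folder per cluster keeps Google Earth's sidebar navigable even for stations with many clusters.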

No events survived pre-processing, check DataFetcher and event quality

Hi Chambers,

I recently installed the code and am trying to run the intro example provided on the website. However, I am having some issues when I call the function createCluster():

Cannot remove response without a valid inventoryArg, setting removeResponse to False
Starting IO operations and data checks
/opt/anaconda/lib/python3.5/site-packages/scipy/linalg/basic.py:1226: RuntimeWarning: internal gelsd driver lwork query error, required iwork dimension not returned. This is likely the result of LAPACK bug 0038, fixed in LAPACK 3.2.2 (released July 21, 2010). Falling back to 'gelss' driver.
  warnings.warn(mesg, RuntimeWarning)
Traceback (most recent call last):
  File "Intro_detex.py", line 5, in <module>
    cl = detex.createCluster(CCreq=0)
  File "/opt/anaconda/lib/python3.5/site-packages/detex-1.0.8-py3.5.egg/detex/construct.py", line 122, in createCluster
    detex.log(__name__, msg, level='error')
  File "/opt/anaconda/lib/python3.5/site-packages/detex-1.0.8-py3.5.egg/detex/__init__.py", line 138, in log
    raise e(msg)
Exception: No events survived pre-processing, check DataFetcher and event quality

These are lines I am using so far:

import detex
detex.getdata.makeDataDirectories()
cl = detex.createCluster(CCreq=0.4)

Have you seen this before? Thanks in advance for the help.

too many columns, fail to write .index.db

detex fails to write the indkey table in .index.db. This happens with the latest version of detex I have; I also tried an older version. I deleted and re-indexed, both through clustering and with getdata, and hit the same issue: too many columns. When I do get it to write a .index.db, I then get an error because there is no 'indkey' table, so it still seems to be a related error.

Traceback and associated files below.

Traceback (most recent call last):
  File "", line 1, in
    detex.getdata.makeDataDirectories(getContinuous=False)
  File "/home/linville/Applications/anaconda/lib/python2.7/site-packages/detex/getdata.py", line 174, in makeDataDirectories
  File "/home/linville/Applications/anaconda/lib/python2.7/site-packages/detex/getdata.py", line 202, in _getTemData
  File "/home/linville/Applications/anaconda/lib/python2.7/site-packages/detex/getdata.py", line 936, in indexDirectory
  File "/home/linville/Applications/anaconda/lib/python2.7/site-packages/detex/util.py", line 880, in saveSQLite
    DF, Tablename, con=conn, flavor='sqlite', if_exists='append')
  File "/home/linville/Applications/anaconda/lib/python2.7/site-packages/detex/pandas_dbms.py", line 83, in write_frame
    cur.execute(schema)
OperationalError: too many columns on indkey

Archive.zip
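For reference, this error almost certainly comes from SQLite itself rather than from any particular detex version: SQLite's default SQLITE_MAX_COLUMN limit is 2000 columns per table, so a sufficiently wide indkey DataFrame will fail at CREATE TABLE time no matter what writes it. A minimal reproduction:

```python
import sqlite3

# SQLite is compiled with SQLITE_MAX_COLUMN = 2000 by default, so a
# CREATE TABLE with more columns fails exactly like the traceback above.
conn = sqlite3.connect(":memory:")
cols = ", ".join("c%d INTEGER" % i for i in range(2001))
try:
    conn.execute("CREATE TABLE indkey (%s)" % cols)
    failed = False
    message = ""
except sqlite3.OperationalError as exc:
    failed = True
    message = str(exc)  # "too many columns on indkey"
```

If the DataFrame really needs that many columns, splitting it across several tables (or pivoting wide data to long format) sidesteps the limit.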

Detex compilation warnings with Anaconda3 install.

Derrick,
After installing the newer Anaconda3-5.0.0-MacOSX-x86_64 and using the configuration below,

conda config --add channels conda-forge
conda create -n detex python=2.7
source activate detex
conda install pyqt=4 (Had to use an older version)
conda install joblib
conda install simplekml
conda install basemap
conda install obspy

we get the warning

/home/blycker/anaconda3/envs/detex/lib/python2.7/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:15:2: warning: #warning "Using deprecated NumPy API, disable it by " "#defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
#warning "Using deprecated NumPy API, disable it by " \

I didn't know if you were still supporting this, but I thought I should drop you a line.

Bill

Events that occur too close in time link events to clusters that don't belong

Some events that occur close in time to other events can both be included in the data window pulled when each event is correlated against every other event to determine event clusters.

For example, in the following 2 plots:

image

the first contains 2 events (although only one is in the catalog).

Most of the events in the cluster associate with the second event; one event, however, associates better with the first.

image

This results in the events in the first plot being aligned to the second event in that plot. Consequently, the waveform gets poorly aligned, as in this plot of a waveform group:

one_event_doesnt_fit

put detex on pypi

Putting detex on PyPI will allow easier updates and installation through pip/easy_install.

Failed alignment due to similar end effects

The current function in the construct model for calculating waveform correlation coefficients (_CCX2)
works by taking one of the two waveforms (each of length n) and zero-padding it with n elements at the beginning and n elements at the end. Conceptually, the other waveform is then slid over the zero-padded waveform and the CC is calculated at every time step. This can allow similar parts of the end of one waveform and the beginning of the other (such as filter effects) to produce the highest correlation coefficient in the CC trace, even though only a few samples are actually similar, as is the case in the plots shown in issue 19. When this happens it breaks the alignment algorithm in the createSubSpace call.

To remedy this, the waveform to be padded will only receive n/2 zero elements at the beginning and n/2 zero elements at the end.

edit: The padding had to remain n on both sides or else the normalization gets thrown off, but the correlation coefficient vector is now sliced to the n/2 bound before determining its max, which accomplishes essentially the same thing.
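The fix described in the edit can be illustrated in plain Python (this is an un-normalized sketch of the idea, not detex's actual _CCX2; `best_lag` is a hypothetical helper): the full set of 2n+1 lags is still computed against n zeros of padding on each side, but the argmax is searched only within ±n/2 lags, so end-effect matches at extreme lags can no longer win.

```python
def best_lag(a, b):
    """Hypothetical illustration of the fix described above (not detex's
    _CCX2). Returns the lag (in samples) such that b[j] best matches
    a[j + lag], searched only within +/- n/2 samples."""
    n = len(a)
    padded = [0.0] * n + list(a) + [0.0] * n
    # cc[k] corresponds to lag (k - n); there are 2n + 1 lags in total
    cc = [sum(padded[k + j] * b[j] for j in range(n)) for k in range(2 * n + 1)]
    # slice the search to the +/- n/2 bound before taking the max
    lo, hi = n - n // 2, n + n // 2 + 1
    k = max(range(lo, hi), key=lambda i: cc[i])
    return k - n
```

In practice detex computes normalized coefficients in the frequency domain; the point here is only the restricted argmax.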

Error messages when running ss.SVD and the final ss.detex steps

I have been able to run detex subspace successfully on my stations for the year 2003; however, when I try to add another year (in this case 2004, though it has happened with other years as well), I get the following message when running ss.SVD():
2003-2004ss svderror
To try to continue with the program, I set the threshold (ss.SVD(Threshold=0.25)) and run ss.detex(useSingles=True, fillZeros=True), but then get the following error message, which looks very similar to the previous one:
2003-2004ss detexerror
I tried attaching my Station and Template Keys to this posting, but it wouldn't let me (it said "Attaching documents requires write permission to this repository"). I can email them to you if needed. All the continuous data I used for these runs was filtered using the script you and Kris wrote a few weeks ago.

Missing information in SQLite database

I ran the detex command and got all the detections for each station. When I went to use the results module, I discovered that the only table that made it into the SQLite database is ss_df, the one containing the detection results for the subspaces. After producing the detections for the final station, it printed this before stopping:

sta lta req of 8 failing on station IMU, dropping sta/lta requirement
sta lta req of 8 failing on station IMU, dropping sta/lta requirement
sta lta req of 8 failing on station TCRU, dropping sta/lta requirement
UU.TCRU starting on 2018-11-20T15:23:00 is shorter than expected
sta lta req of 8 failing on station TCRU, dropping sta/lta requirement
UU.TCRU starting on 2019-05-01T01:00:00 is shorter than expected
sta lta req of 8 failing on station TCRU, dropping sta/lta requirement
Traceback (most recent call last):
  File "", line 1, in
  File "/home/arecord/Subspace/detex/detex/subspace.py", line 1873, in detex
    self.setSinglesThresholds()
  File "/home/arecord/Subspace/detex/detex/subspace.py", line 1089, in setSinglesThresholds
    useSubSpaces=False, **kwargs)
  File "/home/arecord/Subspace/detex/detex/subspace.py", line 1743, in getFAS
    issubspace=False)
  File "/home/arecord/Subspace/detex/detex/fas.py", line 59, in _initFAS
    ssArrayTD, ssArrayFD, staltalimit)
  File "/home/arecord/Subspace/detex/detex/fas.py", line 119, in _getDSVect
    detex.log(name, msg, level='error')
  File "/home/arecord/Subspace/detex/detex/__init__.py", line 138, in log
    raise e(msg)
Exception: Could not get any data for ECUT

createCluster removes templates if not exactly equal to the median length

In trying to cluster older data, ~75% of the events were being removed. Relu found that there is an if statement that deletes templates whose length is not exactly equal to the median template length. While this makes sense for removing lower-quality data, it is also removing templates that differ by a single point.

This one-point difference in template length is introduced by the obspy trim function. We think it would be best to keep all waveforms that differ by only a single point, but use only the median template length in the calculations.
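The proposed behavior could be sketched like this (a hypothetical helper under the assumptions above, not detex's actual filter): keep any template within one sample of the median length, then trim the survivors to a common length rather than discarding them outright.

```python
from statistics import median

def keep_near_median(templates, tol=1):
    """Hypothetical sketch of the proposed fix: instead of deleting every
    template whose length differs from the median, keep those within
    `tol` samples of it, then trim all survivors to their common minimum
    length so downstream calculations see equal-length arrays."""
    med = int(median(len(t) for t in templates))
    kept = [t for t in templates if abs(len(t) - med) <= tol]
    if not kept:
        return []
    m = min(len(t) for t in kept)
    return [t[:m] for t in kept]
```

Trimming to the shortest kept length avoids having to pad the one-sample-short traces produced by obspy's trim.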

writing new detections to template key

When using res.writeDetections(eventDir='DetectedEvents',updateTemKey=True)
to update the TemplateKey, the 'TIME' format is wrong. Looking at the bottom of the TemplateKey (head and tail shown below), TIME appears to be in epoch seconds instead of a date-time stamp. When I later tried to use this template key to get lag times with
cl = detex.createCluster(CCreq=0.68,trim=[5,30],fetch_arg='../EventWaveForms',fileName='clustDD.pkl',enforceOrigin=True)
memory fills up and python crashes. I can, however, run clustering with
cl = detex.createCluster(CCreq=0.68,trim=[5,30],fetch_arg='../EventWaveForms')
and get cluster results.

[brewster:DATA/CIRCLEVILLE/Detections_1wk] pankow% head TemplateKey.csv_det
,Unnamed: 0,TIME,NAME,LAT,LON,MAG,DEPTH,STMP
0,0.0,2010-09-29T15-48-59.63,2010-09-29T15-48-59.63,38.202,-112.251833333,1.29,4.32,1285775339.63
1,1.0,2011-01-03T12-06-36.88,2011-01-03T12-06-36.88,38.2473333333,-112.33983333299999,4.56,5.4,1294056396.88
2,2.0,2011-01-03T12-10-08.66,2011-01-03T12-10-08.66,38.2491666667,-112.30616666700001,2.92,2.03,1294056608.66
3,3.0,2011-01-03T12-23-19.05,2011-01-03T12-23-19.05,38.248666666700004,-112.320333333,0.96,1.68,1294057399.05
[brewster:DATA/CIRCLEVILLE/Detections_1wk] pankow% tail TemplateKey.csv_det
397,,1294325117.3600001,d2011-01-06T14-45-17,,,0.12130792350080775,,
398,,1294328589.025,d2011-01-06T15-43-09,,,0.42995299945604737,,
399,,1294330616.4850001,d2011-01-06T16-16-56,,,0.05488828309071697,,
400,,1294331688.385,d2011-01-06T16-34-48,,,-0.1258472051222964,,
401,,1294332071.1999998,d2011-01-06T16-41-11,,,-0.06266347145329888,,

Station UU.NMU Makes No Detections When It Should

UU.NMU is a close station (~30 km) to my swarm of interest, with good template event waveforms. However, it is making no detections even though it looks like there are good detections to be made, ones that were detected by my other 3 stations: UU.MSU, UU.IMU, and UU.DWU.

Here is an example of a detection UU.NMU is missing.
2003-11-28t16-48-00

I have tried running just the vertical component, in case there was an issue mixing one 3-component station with single-component stations, but again got no detections. Any thoughts on why this would be happening? It would be useful for me and others to be able to use this station.

SVD ObsPyDeprecation warning

Just wanted to give you a heads up in case you have not seen this:
/home/pankow/anaconda/lib/python2.7/site-packages/obspy/__init__.py:159: ObsPyDeprecationWarning: Module 'obspy.station' is deprecated and will stop working with the next ObsPy version. Please import module 'obspy.core.inventory' instead.
  ObsPyDeprecationWarning)
/home/pankow/anaconda/lib/python2.7/site-packages/obspy/__init__.py:159: ObsPyDeprecationWarning: Module 'obspy.station.response' is deprecated and will stop working with the next ObsPy version. Please import module 'obspy.core.inventory.response' instead.
  ObsPyDeprecationWarning)
/home/pankow/anaconda/lib/python2.7/site-packages/obspy/__init__.py:159: ObsPyDeprecationWarning: Module 'obspy.station.util' is deprecated and will stop working with the next ObsPy version. Please import module 'obspy.core.inventory.util' instead.
  ObsPyDeprecationWarning)

'list' object has no attribute 'tolist' error; fails to write ss_info to SubSpace.db

When I run ss.detex after ss.SVD(threshold=0.15), it fails to write ss_info to SubSpace.db and reports the error: 'list' object has no attribute 'tolist'. I checked the code, and it seems that self.histSubSpace[sta][skey] returns a list, not an array.
image
image
image
I ran ss.detex before with a whole year of continuous data and the SVD threshold set to None; it worked well and didn't report this error. This time I ran ss.detex on just April, a month in which station US.HILD has no continuous data, and the error occurred.
Now I'm running ss.detex on the April data with the SVD threshold set to None to see whether this error occurs again.
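If the diagnosis above is right, a small defensive coercion would make the write path indifferent to whether the histogram is a list or an ndarray. This is a sketch only; `as_plain_list` is a hypothetical helper, not detex code:

```python
import numpy as np

def as_plain_list(hist):
    """Hypothetical guard for the bug described above: self.histSubSpace
    values are sometimes plain lists (e.g. when a station has no continuous
    data for the period), so coerce to ndarray before calling .tolist()."""
    return np.asarray(hist).tolist()
```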

Detections for only subset of data in Continuous Waveforms

The options UTCstart and UTCend do not seem to work when creating detections from continuous data: even with these values set, detex looks through all the continuous data. Stephen suggested that the search window be tied to the date strings in the StationKey.
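Stephen's suggestion amounts to intersecting each continuous-data chunk with the requested window before scanning it. A minimal sketch, with `clamp_to_window` as a hypothetical helper and datetime objects standing in for the UTC date strings:

```python
from datetime import datetime

def clamp_to_window(chunk_start, chunk_end, utc_start, utc_end):
    """Hypothetical sketch of tying the detection scan to UTCstart/UTCend
    (or the StationKey date strings): return the overlap of a continuous
    chunk with the requested window, or None if there is no overlap, so
    out-of-window chunks can be skipped entirely."""
    start = max(chunk_start, utc_start)
    end = min(chunk_end, utc_end)
    return (start, end) if start < end else None
```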

Detex not reading continuous waveform data

Trying to run SVD or detections has resulted in Detex being unable to read the continuous data files.

I have tried the data in both pickle and mseed format. Neither has worked properly.

Using data that was pulled by previous versions of Detex I received this error (file format pickle):

screenshot 2016-03-02 11 37 22


When trying to run SVD on the older data, it skips all waveforms and ends up with no data to run SVD.

screenshot 2016-03-02 11 59 43


Thinking that there might be something wrong with the continuous waveform data, I downloaded a month's worth of continuous data to a new directory. This was completed in both pickle and mseed format using Detex 1.0.6.
After the download completed, Detex began to auto-index the ContinuousWaveform directory with this result:

screenshot 2016-03-02 15 33 30


I tried running SVD (after terminating the auto indexing and Detex tried to index again), with this result:
screenshot 2016-03-02 15 36 47


Just to try it, I created a subspace with the new data:

screenshot 2016-03-02 15 37 02


I have been able to successfully use detex.pickTimes() and have been able to see those waveforms. Detex has also had no problem reading either pickle or mseed format EventWaveforms. The clusters have been produced without error. I am using fillZeros=True, but all other parameters (minus directory location variables) have remained unchanged.

My TemplateKey:
TemplateKey.txt

My Station Key:
StationKey.txt

*Both have been switched to ".txt" for uploading.

*If you choose to download the data, the four stations will be about 9 GB of data for the single month. The event waveforms take up approximately 50 MB of disk space.

Clustering returning empty link matrix

I've been trying to create clusters and haven't been having any luck. I keep getting this error:


screen shot 2015-10-13 at 1 58 47 pm


I've attached the Template Key and the Station Key files that I'm using. I did try running the tutorial again to be sure it wasn't Detex and the tutorial ran correctly. I tried just running the clustering on one station (IMU) and I still received the same error. I'm going to keep trying to figure out why I'm having this trouble, but any insight into the issue would be great.

Oh, I can pull data; it's just getting the cluster to form that I'm having issues with.

If it helps narrow down the issue, I have also had trouble creating a subspace. I thought that maybe it was my previous cluster, which was created correctly just a few days ago (with more stations in the Station Key and more events in the Template Key), so I tried remaking the cluster and started receiving the above error.

StationKey.txt
TemplateKey.txt

Detex not loading subspace

I have been trying to get Detex to load the subspace that I created earlier without success.
The error that I am receiving is regarding "no Subspace stream":

screenshot 2016-03-02 11 33 46

I have attempted to load subspaces created with pickle and mseed formats (thinking there may possibly be some loss of data between formats). Neither one will load properly.

Below is a snippet of the output created by a subspace that failed to load once written out to the drive.

screenshot 2016-03-02 15 37 02

This subspace was created with newly downloaded EventWaveform and Continuous data, attempting to look over one month of continuous data.

_attachResponse function of getdata raises HTTPError when used with "uuss" setting

This happens because the chan parameter passed to _attachResponse is a list of channels, while obspy's FDSN get_stations function requires a string that may use wildcards. A quick solution is to loop over the channels and add each inventory to an empty inventory, like so:

def _attachResponse(fet, st, start, end, net, sta, loc, chan):
    """
    Function to attach response from inventory or client
    """
    if not fet.removeResponse or fet.inventory is None:
        return st
    if isinstance(fet.inventory, obspy.station.inventory.Inventory):
        st.attach_response(fet.inventory)
    else:
        inv = obspy.station.Inventory([], 'detex')
        for cha in chan:
            inv += fet.inventory.get_stations(starttime=start,
                                              endtime=end,
                                              network=net,
                                              station=sta,
                                              location=loc,  # get_stations takes 'location', not 'loc'
                                              channel=cha,
                                              level="response")
        st.attach_response(inv)
    return st

createSubspace routine failing

For select datasets, the createSubspace module fails. Screenshot of the output below.
screen shot 2015-10-30 at 4 06 48 pm

Relu has researched this error, and it seems that the program fails if there are no singletons. We think this is unwanted behavior. It would be nice to be able to treat all events in an area as one subspace, or to look at small datasets in which all events are correlated.

exclude temporally coincident detections from unassociated templates?

It doesn't seem robust to report a new event from temporally correlated detections on 2 stations with uncorrelated templates. So if you have templates A-C and you look at which templates are linked in a detection requiring 2 stations, they should look like [A,A], [B,B], or [C,C]. If A and C made a subspace, they could look like [[A,C],C] or [[A,C],A], etc.

worth adding a new filter in results?

If helpful, I'm using the following code to verify template continuity. It's a hack, so check it if you follow this route.

screen shot 2016-09-21 at 3 29 17 pm

# %%

sgdb = loadSQLite('SubSpace.db', 'sg_info')
ssdb = loadSQLite('SubSpace.db', 'ss_info')

"""I think Kris makes this with res = detex.results.detResults(blah,blah)
res.Dets.to_pickle('detections_2RS.pkl')"""

with open('detections_2RS.pkl', 'rb') as f:
    detections = pickle.load(f)

templates = readKey('TemplateKey.csv')

# %%

yams = []
for i in range(len(detections.Dets)):
    tempyam = []
    for j in range(len(detections.Dets[i])):
        each = detections.Dets[i].reset_index(drop=True)
        try:
            tempyam.append([templates[templates['NAME'] == sgdb[(sgdb['Name'] == each.Name[j]) & (sgdb['Sta'] == each.Sta[j])].Events.iloc[0]].index[0]])
        except:
            junkyam = list(np.empty(len(ssdb[(ssdb['Name'] == each.Name[j]) & (ssdb['Sta'] == each.Sta[j])].Events.iloc[0].split(','))))
            for k in range(len(ssdb[(ssdb['Name'] == each.Name[j]) & (ssdb['Sta'] == each.Sta[j])].Events.iloc[0].split(','))):
                junkyam[k] = templates[templates['NAME'] == ssdb[(ssdb['Name'] == each.Name[j]) & (ssdb['Sta'] == each.Sta[j])].Events.iloc[0].split(',')[k]].index[0]
            tempyam.append(junkyam)
    yams.append(tempyam)

idx = [min(len(set(x).intersection(set(yams[i][0]))) for x in yams[i]) for i in range(len(yams))]

where idx == 0 is not a valid detection
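The continuity test the script above encodes can be stated more directly: a multi-station detection is plausible only if the template (or subspace member) sets on the contributing stations share at least one event. A minimal sketch of that rule (`templates_consistent` is a hypothetical name, not part of detex or the script):

```python
def templates_consistent(per_station_events):
    """Hypothetical sketch of the proposed filter: given, for each station
    contributing to a detection, the collection of template events behind
    the matching template or subspace, keep the detection only if all the
    collections share at least one event (e.g. [A,A] or [[A,C],A] pass,
    while [A,B] fails)."""
    sets = [set(s) for s in per_station_events]
    return bool(set.intersection(*sets))
```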

writeSimpleHypoDDInput

When calculating differences in lag times, this function sometimes writes 'nan' into the dt.cc file. I have not investigated further.
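Until the root cause is found, the writer could simply skip pairs whose differential time or correlation is nan rather than emitting them into dt.cc. A hedged sketch; `write_dt_line` and its line format are hypothetical, not detex's actual writer:

```python
import math

def write_dt_line(fh, sta, dt, cc, phase="P"):
    """Hypothetical guard for the issue above: refuse to write a dt.cc
    observation line when the differential time or weight is nan,
    returning False so the caller can count skipped pairs."""
    if math.isnan(dt) or math.isnan(cc):
        return False
    fh.write("%s %9.4f %6.4f %s\n" % (sta, dt, cc, phase))
    return True
```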
