
pycsep's People

Contributors

bayonato89, fabiolsilva, gen2, hanbao-ucla, kennygraham1, khawajasim, kirstybayliss, levin422, mherrmann3, pabloitu, pjmaechling, wsavran


pycsep's Issues

cumulative poisson evaluations

we need to support the cumulative consistency tests from csep1. the function should accept a list of forecasts and return a cumulative test result.

it might be possible to create a single function that could support every poisson test. effectively, the cumulative forecast introduces the time dimension. some care must be taken to handle missing time windows.

right now, this can be implemented by users, but should be included as a standard function along with associated testing.
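a rough sketch of the idea, using the N-test as the simplest case (cumulative_number_test is a hypothetical name, not an existing pycsep function):

import numpy
from scipy.stats import poisson

def cumulative_number_test(forecasts, observed_counts):
    """Cumulative N-test over consecutive time windows.

    forecasts: list of per-window gridded rate arrays, in chronological order.
    observed_counts: list of per-window observed event counts.
    Missing time windows should be dropped from both lists before calling.
    """
    results = []
    cum_rate, cum_obs = 0.0, 0
    for rates, n_obs in zip(forecasts, observed_counts):
        cum_rate += numpy.sum(rates)  # total expected events so far
        cum_obs += n_obs              # total observed events so far
        # two one-sided Poisson probabilities, as in the csep1 N-test
        delta_1 = 1.0 - poisson.cdf(cum_obs - 1, cum_rate)  # P(X >= n_obs)
        delta_2 = poisson.cdf(cum_obs, cum_rate)            # P(X <= n_obs)
        results.append((delta_1, delta_2))
    return results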

accumulating floating point errors for magnitude bins

magnitude bins that are created using a function like numpy.arange can cause issues when computing offsets into the bins. this problem is exacerbated by the fact that most magnitudes are reported on bin edges, which can cause representation issues.

first, magnitude bins should not be created by adding decimal bin widths to an offset. they should be created using integers and then converted to decimal.

second, we can add a tolerance to the reported magnitude units and account for the abs_err in the floating point numbers.
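for illustration, a small sketch of both suggestions (the bin width, edges and tolerance are example values, not the library's defaults):

import numpy

dh = 0.1                      # bin width in magnitude units
m_min, m_max = 4.0, 9.0

# float accumulation: repeatedly adding 0.1 drifts off the exact bin edges
bins_drifting = numpy.arange(m_min, m_max + dh, dh)

# integer-based: exact integer steps, converted to decimal once at the end
bins_exact = numpy.arange(int(m_min * 10), int(m_max * 10) + 1) / 10.0

# offsets computed with a small tolerance, so magnitudes reported exactly
# on a bin edge (e.g. 5.3) land in the intended bin
magnitude = 5.3
tol = 1e-8
idx = int((magnitude - m_min + tol) / dh)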

errors installing pycsep from --channel conda-forge

I tried installing pycsep from the conda-forge channel on USC's new cluster, Discovery. I started by installing the latest Anaconda3 distribution, which seemed to install correctly.

(base) [maechlin@discovery ~]$ conda info

 active environment : base
active env location : /home1/maechlin/anaconda3
        shell level : 1
   user config file : /home1/maechlin/.condarc

populated config files :
conda version : 4.8.5
conda-build version : 3.18.11
python version : 3.8.3.final.0
virtual packages : __glibc=2.17
base environment : /home1/maechlin/anaconda3 (writable)
channel URLs : https://repo.anaconda.com/pkgs/main/linux-64
https://repo.anaconda.com/pkgs/main/noarch
https://repo.anaconda.com/pkgs/r/linux-64
https://repo.anaconda.com/pkgs/r/noarch
package cache : /home1/maechlin/anaconda3/pkgs
/home1/maechlin/.conda/pkgs
envs directories : /home1/maechlin/anaconda3/envs
/home1/maechlin/.conda/envs
platform : linux-64
user-agent : conda/4.8.5 requests/2.24.0 CPython/3.8.3 Linux/3.10.0-1062.4.1.el7.x86_64 centos/7.7.1908 glibc/2.17
UID:GID : 14364:30916
netrc file : None
offline mode : False

(base) [maechlin@discovery ~]$ conda env list

conda environments:

base * /home1/maechlin/anaconda3

When I tried to install pycsep from the conda-forge channel, the process worked for a long time, then output an error report that starts with the following information. I have not included the full error report because it is quite lengthy. Any suggestions on why this install didn't work using Anaconda3?

(base) [maechlin@discovery ~]$ conda install --channel conda-forge pycsep
Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: |
Found conflicts! Looking for incompatible packages.
This can take several minutes. Press CTRL-C to abort.
Examining conflict for jinja2 anaconda conda-build jupyterlab bokeh jupyterlab_server conda-verify sphinx markupsafe notebook nbconv...
Examining conflict for anaconda xlsxwriter: : 327it [50:36, 2.11s/it]
Examining conflict for fontconfig lxml dbus pyodbc pango glib conda-build harfbuzz libarchive gst-plugins-base python-libarchive-c c...
Examining conflict for qt cairo: : 342it [50:51, 1.65s/it] failed
UnsatisfiableError: The following specifications were found
to be incompatible with the existing python installation in your environment:
Specifications:

  • alabaster -> python[version='2.7.*|3.5.*|3.6.*|>=2.7,<2.8.0a0|>=3.5,<3.6.0a0|>=3.6,<3.7.0a0|3.4.*|>=3.7,<3.8.0a0']
  • anaconda-navigator -> python[version='2.7.*|3.5.*|3.6.*|3.4.*']
  • anaconda-project -> python[version='2.7.*|3.5.*|3.6.*|>=3.5,<3.6.0a0|>=3.7,<3.8.0a0|>=2.7,<2.8.0a0|>=3.6,<3.7.0a0']
  • atomicwrites -> python[version='2.7.*|3.5.*|3.6.*|>=3.6,<3.7.0a0|>=3.7,<3.8.0a0|>=2.7,<2.8.0a0|>=3.5,<3.6.0a0']
  • attrs -> python[version='2.7.*|3.5.*|3.6.*|>=3.7,<3.8.0a0|>=3.6,<3.7.0a0|>=2.7,<2.8.0a0|>=3.5,<3.6.0a0']
  • autopep8 -> python[version='2.7.*|3.5.*|3.6.*|3.4.*|>=3.6,<3.7.0a0|>=2.7,<2.8.0a0|>=3.7,<3.8.0a0|>=3.5,<3.6.0a0']

Catalog gridding - Difference in the outcome of gridding the test catalog

I was trying to reproduce the results of the global forecast testing experiment for a forecast from a previous paper. I was using my own catalog gridding function, and in the end I used a mock class to interface the data (gridded catalog, gridded forecast) with the toolkit to run the Poisson consistency tests. I was unable to get the same results.

So I dug a bit deeper. I found that the spatial counts (spatial gridding) obtained from the toolkit differed from those yielded by my function. I tried to locate the difference, but apparently everything was working fine on my side. I also tried to look into the pycsep code, but was unable to follow it. However, I am fairly certain that this issue is due to boundary events.

I am sharing a link with the files and code that produce the differing gridded catalogs. Can you please help me locate where the error is coming from? Is it due to some unnoticed typo on my side, or some rounding error in the toolkit? I am looking for an external opinion on this.

The files at the link are:
1- Testing catalog (2014-2019).
2- Toolkit_grid: global testing grid obtained from the toolkit.
3- Toolkit_gridded_cat: spatial gridded counts obtained from the toolkit for the above grid.
4- The Python script that I used to acquire my gridded spatial rates for the grid of (2) and to compare them with the spatial counts of (3).

https://drive.google.com/drive/folders/1rnIuX9Bqwg95BMhwKIQ23sP8ld4q95PU?usp=sharing

Plot for W-test

Hi Bill,

I am following up here on the discussion of W-test results. For forecasting areas with a small number of earthquakes, it might be important to run W-tests and be able to visualize the results.

Thanks,

Toño

masking of cells in forecast file

the current implementation masks an entire spatial cell and doesn't provide granular access to mask individual space-magnitude bins. the csep1 forecast file allows individual space-magnitude cells to be masked, so this capability should be included.
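a sketch of the granularity being requested, using a stand-in rates array and a hypothetical boolean mask of the same shape:

import numpy

rates = numpy.random.rand(10, 5)           # space-magnitude rates (cells x bins)
mask = numpy.ones_like(rates, dtype=bool)  # True = bin participates in tests

mask[3, :] = False   # current behavior: mask an entire spatial cell
mask[7, 2] = False   # requested behavior: mask a single space-magnitude bin

masked_rates = numpy.where(mask, rates, 0.0)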

creating plots that span a magnitude range

the markedgriddeddataset and the griddedforecast classes should have an easy API to return the summed (or non-summed) rates over a particular magnitude range. this will allow the user to select the portion of the forecast they would like to use and/or plot.

a potential working example exists in the plotting examples notebook on the project documentation site.
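a sketch of what such an accessor could look like, assuming .data is an (n_cells, n_mag_bins) array and .magnitudes holds the left bin edges (the function name is illustrative, not a final API):

import numpy

def rates_in_magnitude_range(data, magnitudes, m_low, m_high, summed=True):
    """Rates for all bins with m_low <= magnitude < m_high."""
    selected = (magnitudes >= m_low) & (magnitudes < m_high)
    subset = data[:, selected]
    return subset.sum(axis=1) if summed else subset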

single catalog class?

Right now we have different classes to represent different types of catalogs. Should we simplify this and provide a single class with different functions for loading each catalog format?

dependency management needs to be addressed

overall dependency management of the package needs to be addressed. there are some less-than-lightweight packages that are required.

for example, obspy and cartopy are required but only used within isolated areas of the code.
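one common way to soften heavy dependencies like these is a lazy import inside the functions that need them; a sketch:

def plot_basemap(*args, **kwargs):
    try:
        import cartopy.crs as ccrs  # heavy dependency, imported only on demand
    except ImportError as exc:
        raise ImportError(
            "cartopy is required for map plotting; install it via "
            "'conda install -c conda-forge cartopy'") from exc
    # ... plotting code using ccrs goes here ...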

cl test incorrectly normalizes the observed likelihoods

I just found a bug in the _poisson_likelihood_test function. For the CL-test, this function normalizes the observed likelihood score instead of comparing the score with the distribution of scores that would be expected if the model were correct, conditional on the number of observed earthquakes. Here I am submitting a corrected version of the function that makes this distinction for the CL-test, using an extra input variable CL=True.
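for reference, the distinction being described, in a heavily simplified sketch (this is not the submitted patch): the CL-test scales the forecast to the observed event count and compares the observed likelihood against likelihoods of simulated catalogs that also contain exactly that many events. forecast_rates is assumed to be a flattened 1D array.

import numpy
from scipy.stats import poisson

def conditional_likelihood_score(forecast_rates, observed_counts,
                                 num_simulations=1000, seed=None):
    rng = numpy.random.default_rng(seed)
    n_obs = int(observed_counts.sum())
    # condition on N: scale the forecast so it expects exactly n_obs events
    cond_rates = forecast_rates * (n_obs / forecast_rates.sum())
    obs_ll = poisson.logpmf(observed_counts, cond_rates).sum()
    # simulate catalogs that also contain exactly n_obs events
    probs = forecast_rates / forecast_rates.sum()
    sim_ll = numpy.empty(num_simulations)
    for i in range(num_simulations):
        sim_counts = rng.multinomial(n_obs, probs)
        sim_ll[i] = poisson.logpmf(sim_counts, cond_rates).sum()
    # quantile of the observed score within the simulated distribution
    quantile = (sim_ll <= obs_ll).mean()
    return quantile, obs_ll, sim_ll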

Dependency problem (mpl_toolkits.basemap)

The module mpl_toolkits.basemap is not found, at least with the matplotlib version specified in the requirements file. Apparently this is a problem with pip3 (reproduced both from pycharm and using the csep2 installation instructions); it appears the package is only available from the conda repository. Could we consider choosing another library for basemap plotting?

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/pciturri/Desktop/Forecasting/csep2_test/csep2/csep/__init__.py", line 1, in <module>
    from csep.core import forecasts
  File "/home/pciturri/Desktop/Forecasting/csep2_test/csep2/csep/core/forecasts.py", line 13, in <module>
    from csep.utils.plots import plot_spatial_dataset
  File "/home/pciturri/Desktop/Forecasting/csep2_test/csep2/csep/utils/plots.py", line 5, in <module>
    from mpl_toolkits.basemap import Basemap
ModuleNotFoundError: No module named 'mpl_toolkits.basemap'

warnings with numpy 1.20.1

the test suite emits numerous warnings when running with numpy 1.20.1. this issue is a reminder that we need to keep an eye on them, because the deprecated behavior will eventually be removed.

comparing forecasts with different regions

comparisons should be able to be made between two forecasts A and B where the region associated with forecast A is different from the region associated with forecast B. this can naturally occur when forecasts choose to mask certain cells.

the main condition for two forecasts to be comparable is that the intersection of regionA and regionB must not be empty: they must have overlapping cells.
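a sketch of that comparability check, assuming each region exposes its cell origins as (lon, lat) pairs (illustrative names):

def common_cells(origins_a, origins_b):
    """Cell origins shared by two regions, given as (lon, lat) pairs."""
    shared = set(map(tuple, origins_a)) & set(map(tuple, origins_b))
    if not shared:
        raise ValueError("forecasts are not comparable: regions do not overlap")
    return shared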

magnitude bins on L and CL test

In the L and CL poisson tests, the likelihood calculation throws an error when the catalog or catalog.region has no magnitude binning.
Should catalog.spatial_magnitude_counts() take gridded_forecast.mag_bins() as an argument? Or must the user first modify catalog.region to define its magnitude bins?

gridded_catalog_data = observed_catalog.spatial_magnitude_counts()

Error when plotting a forecast model

Below I show the error output when trying to plot the example forecast model "helmstetter_mainshock", as shown on the PyCSEP documentation page.

Before I switched to the most recent version of PyCSEP, this example ran with no problem.
[Screenshot: error output from plotting the forecast, taken 2020-11-06]

+Han

Writing a mock class to directly plug gridded datasets into pycsep tests

Hi,

I am integrating gridded catalogs and gridded forecasts directly into the toolkit through mock classes, in order to use the tests built into the toolkit. Here are the mock classes. My datasets are mostly spatially gridded, without magnitude bins. It appears to be working fine, but I would like an external opinion on whether the mock classes look reasonable.

import numpy

class MyForecast:
    """Mock forecast exposing the attributes the pycsep tests expect."""

    def __init__(self, data, name='my_forecast_class'):
        self.data = data
        self.name = name
        self.magnitudes = 5.95  # single magnitude value; data has no magnitude bins
        # kept as an attribute: the original sketch also defined an
        # event_count() method, but this assignment would shadow it anyway
        self.event_count = numpy.sum(data)

    def spatial_counts(self):
        """Return the spatial counts, or the data itself if it is already 1D."""
        if len(numpy.shape(self.data)) > 1:
            return numpy.sum(self.data, axis=1)
        return self.data


class MyCatalog:
    """Mock observed catalog with the same gridded interface."""

    def __init__(self, data, name='my_observed_catalog'):
        self.name = name
        self.data = data
        self.event_count = numpy.sum(data)

    def spatial_counts(self):
        """Return the spatial counts, or the data itself if it is already 1D."""
        if len(numpy.shape(self.data)) > 1:
            return numpy.sum(self.data, axis=1)
        return self.data

    def spatial_magnitude_counts(self):
        """Return the same 2D numpy array."""
        return self.data

Cannot create a virtual environment as described in the README

tb@tokaido:/tmp$ git clone https://github.com/SCECcode/csep2
Cloning into 'csep2'...
remote: Enumerating objects: 262, done.
remote: Counting objects: 100% (262/262), done.
remote: Compressing objects: 100% (171/171), done.
remote: Total 1717 (delta 178), reused 156 (delta 91), pack-reused 1455
Receiving objects: 100% (1717/1717), 4.19 MiB | 6.34 MiB/s, done.
Resolving deltas: 100% (1244/1244), done.
tb@tokaido:/tmp$ cd csep2/
tb@tokaido:/tmp/csep2$ mkdir venv ; cd venv
tb@tokaido:/tmp/csep2/venv$ python3 -m venv csep
tb@tokaido:/tmp/csep2/venv$ cd ..
tb@tokaido:/tmp/csep2$ source venv/csep/bin/activate
(csep) tb@tokaido:/tmp/csep2$ pip3 install -r requirements.txt
Collecting python~=3.6 (from -r requirements.txt (line 1))
  Could not find a version that satisfies the requirement python~=3.6 (from -r requirements.txt (line 1)) (from versions: )
No matching distribution found for python~=3.6 (from -r requirements.txt (line 1))
(csep) tb@tokaido:/tmp/csep2$ 

tests/test_evaluations.py:152: AssertionError

running the unit tests in my current local branch produces a failure:

tb@gurke:~/devel/csep2$ time pytest-3 
============================================================= test session starts ==============================================================
platform linux -- Python 3.6.9, pytest-3.3.2, py-1.5.2, pluggy-0.6.0
rootdir: /home/tb/devel/csep2, inifile:
collected 96 items                                                                                                                             

tests/test_JmaCsvCatalog.py .                                                                                                            [  1%]
tests/test_UCERF3Forecast.py ....                                                                                                        [  5%]
tests/test_acceptance_ucerf3_experiment.py .                                                                                             [  6%]
tests/test_adaptiveHistogram.py ......                                                                                                   [ 12%]
tests/test_basic_types_utils.py .                                                                                                        [ 13%]
tests/test_calc.py ......                                                                                                                [ 19%]
tests/test_comcat.py ......                                                                                                              [ 26%]
tests/test_create_catalog.py ..                                                                                                          [ 28%]
tests/test_csep1_evaluations.py ....                                                                                                     [ 32%]
tests/test_evaluations.py ....F.                                                                                                         [ 38%]
tests/test_file_system.py ..                                                                                                             [ 40%]
tests/test_math.py ...............                                                                                                       [ 56%]
tests/test_spatial.py ..............                                                                                                     [ 70%]
tests/test_stats.py ................                                                                                                     [ 87%]
tests/test_system.py ....                                                                                                                [ 91%]
tests/test_time_utilities.py ......                                                                                                      [ 97%]
tests/test_workflow.py ..                                                                                                                [100%]

=================================================================== FAILURES ===================================================================
____________________________________________________ TestPoissonLikelihood.test_likelihood _____________________________________________________

self = <tests.test_evaluations.TestPoissonLikelihood testMethod=test_likelihood>

    def test_likelihood(self):
        qs, obs_ll, simulated_ll = poisson_likelihood_test(self.forecast_data, self.observed_data, num_simulations=1,
                                                           random_numbers=self.random_matrix, use_observed_counts=True)
    
        # very basic result to pass "laugh" test
        self.assertEqual(qs, 1)
    
        # forecast and observation are the same, sum(np.log(poisson(1, 1))) = -4
        self.assertEqual(obs_ll, -4)
    
        # calculated by hand given the expected catalog, see explanation in zechar et al., 2010.
>       self.assertEqual(simulated_ll[0], -7.178053830347945)
E       AssertionError: (-7.1780538303479444+0j) != -7.178053830347945

tests/test_evaluations.py:152: AssertionError
==================================================== 1 failed, 95 passed in 247.23 seconds =====================================================

real	4m11.951s
user	3m47.120s
sys	0m4.756s
tb@gurke:~/devel/csep2$

also here:

tb@gurke:~/devel/csep2$ time python3 -m unittest
..../usr/lib/python3.6/tempfile.py:939: ResourceWarning: Implicitly cleaning up <TemporaryDirectory '/tmp/tmp2x12eohy'>
  _warnings.warn(warn_message, ResourceWarning)
....[ 2.  2.  2.  0.  0.  0.  0.  0.  0.  0.  2.  0.]
............/home/tb/devel/csep2/tests/artifacts/Testing/example_csep1_forecasts/Forecast/EEPAS-0F_12_1_2007.dat
.N Test: Running Unit Test
.T Test: Running Unit Test
.W Test: Running Unit Test
.F.......................................
======================================================================
FAIL: test_likelihood (tests.test_evaluations.TestPoissonLikelihood)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/tb/devel/csep2/tests/test_evaluations.py", line 152, in test_likelihood
    self.assertEqual(simulated_ll[0], -7.178053830347945)
AssertionError: (-7.1780538303479444+0j) != -7.178053830347945

----------------------------------------------------------------------
Ran 64 tests in 185.042s

FAILED (failures=1)

real	3m24.705s
user	2m59.451s
sys	0m4.978s
tb@gurke:~/devel/csep2$ 

It's a numpy floating-point issue (note that the simulated value even comes back as a complex number), and I suggest dealing with it by using cmath.isclose(a, b, rel_tol=1e-09, abs_tol=0.0).
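inside the test method, the assertion could then become (sketch):

import cmath

# tolerant comparison that also accepts a complex result with ~zero imaginary part
self.assertTrue(cmath.isclose(simulated_ll[0], -7.178053830347945, rel_tol=1e-9))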

graphical comparison between forecast likelihood scores

the package should include some type of function that can graphically compare the likelihood scores of two forecasts, using something along the lines of a deviance-residual plot.

initially the forecasts should be defined over the same region, but this should be upgraded in the future to allow the comparison of two forecasts that share only a subset of cells.
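a minimal sketch of the underlying computation, assuming two forecasts defined on the same cells with Poisson rates and a common gridded catalog (names are illustrative):

import numpy
import matplotlib.pyplot as plt
from scipy.stats import poisson

def plot_deviance_residuals(rates_a, rates_b, observed_counts):
    """Plot per-cell deviance residuals between two Poisson forecasts."""
    ll_a = poisson.logpmf(observed_counts, rates_a)
    ll_b = poisson.logpmf(observed_counts, rates_b)
    residuals = 2.0 * (ll_a - ll_b)  # positive values favor forecast A
    fig, ax = plt.subplots()
    ax.plot(residuals, '.', markersize=2)
    ax.axhline(0.0, color='black', linewidth=0.5)
    ax.set_xlabel('cell index')
    ax.set_ylabel('deviance residual')
    return ax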

Error when loading GCMT 'ndk' catalog.

Hi all,

I would like to ask a small question about the csep.load_catalog() function. I'm trying to load a GCMT catalog but encountered the problem shown in the screenshot below. A copy of the 'ndk' file that I'm trying to load is also attached (a '.txt' suffix was added to the file name for upload, so make sure you delete it when trying to reproduce the error).
GCMT_temp.ndk.txt

[Screenshot: traceback raised while loading the 'ndk' catalog, taken 2020-11-04]

It looks like the error occurs when _get_catalog_as_ndarray(self) in .../csep/core/catalogs.py tries to convert a line of data/strings but hits a length mismatch.

Any kind of help is much appreciated.

Han

Problem with multiple filtering & datetime filters

Hello! When filtering a catalog using a list of filters (e.g. the catalog temporal filtering from the tutorial), the catalog is only filtered by the last member of the list. The problem can be reproduced as follows:

Input

import csep
import copy
from csep.utils.time_utils import strptime_to_utc_epoch, strptime_to_utc_datetime
from csep.core.catalogs import CSEPCatalog
# define dummy cat
date1 = strptime_to_utc_epoch('2009-01-01 00:00:00.0000')
date2 = strptime_to_utc_epoch('2010-01-01 00:00:00.0000')
date3 = strptime_to_utc_epoch('2011-01-01 00:00:00.0000')
Catalog = [(b'1', date1, 1.0, 1.0, 1.0, 1.0),
           (b'2', date2, 2.0, 2.0, 2.0, 2.0),
           (b'3', date3, 3.0, 3.0, 3.0, 3.0)]
test_cat1 = CSEPCatalog()
test_cat1.catalog = Catalog
test_cat2 = copy.deepcopy(test_cat1)
# Filter together
start_epoch = strptime_to_utc_epoch('2009-07-01 00:00:00.0')
end_epoch = strptime_to_utc_epoch('2010-07-01 00:00:00.0')
filters = [f'origin_time >= {start_epoch}', f'origin_time < {end_epoch}']    # should return only event 2
test_cat1.filter(filters)
print('Events after filter by list:', test_cat1.get_event_ids())
# Filter separately
for i in filters:
    test_cat2.filter(i)
print('Events after filter by two operators:', test_cat2.get_event_ids(), '\n')

Output

Events after filter by list: [b'1' b'2']
Events after filter by two operators: [b'2'] 

I think this is due to reassigning, in each iteration over the filters list, the values of the original catalog object to the new filtered catalog, in this line:

filtered = self.catalog[operators[oper](self.catalog[name], float(value))]

A solution would perhaps be to replace line 521 with line 522.

Also, the filter gives an error with datetime objects:

Input

start_time = strptime_to_utc_datetime('2010-01-01 00:00:00.0000')
end_time = strptime_to_utc_datetime('2010-01-04 00:00:00.0000')
test_cat = csep.query_comcat(start_time, end_time, verbose=False)

# Filter together
start_epoch = strptime_to_utc_datetime('2010-01-02 00:00:00.0')
filter = f'origin_time >= {start_epoch}'
test_cat.filter(filter)

Output

 line 499, in filter
    name, oper, value = statements.split(' ')

ValueError: too many values to unpack (expected 3)

I believe this is because f-string formatting a datetime object produces a space between the date and the time, so the resulting filter string is:

'origin_time >= 2010-01-02 00:00:00+00:00'

Thank you!

adding plotting functionality for catalogs

catalogs should have an associated .plot() method that provides a simple interface for plotting catalogs. the plot can be simple in nature, but needs to handle arbitrary event locations.

for example, a catalog of global events should be plotted on a different basemap than italian seismicity. eventually, the function should accept an arbitrary image layer such as a gridded forecast.
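a sketch of what a minimal implementation might do with cartopy, assuming the catalog exposes longitude/latitude arrays (names are illustrative):

import matplotlib.pyplot as plt
import cartopy.crs as ccrs

def plot_catalog(lons, lats, pad=1.0):
    """Scatter catalog epicenters on a basemap sized to the events."""
    ax = plt.axes(projection=ccrs.PlateCarree())
    ax.coastlines()
    # derive the extent from the data so global and regional catalogs both work
    ax.set_extent([lons.min() - pad, lons.max() + pad,
                   lats.min() - pad, lats.max() + pad],
                  crs=ccrs.PlateCarree())
    ax.scatter(lons, lats, s=2, transform=ccrs.PlateCarree())
    return ax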

time aware or not time aware

that is the question....

all times in pycsep are treated as timezone aware, with the timezone set to UTC. before we adopt a wider user base we should rethink this decision. the major con is that timezones can complicate things and force people to work with them directly. on the other hand, certain regions share their catalogs in local time coordinates instead of UTC. any thoughts on this? @pabloitu

misleading catalog statistics of catalogs overlapping the IDL

By loading a semi-artificial catalog overlapping the International Date Line, I discovered very misleading stats:

$ ipython
Python 3.6.9 (default, Nov  7 2019, 10:44:02) 
Type 'copyright', 'credits' or 'license' for more information
IPython 7.13.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: from csep.core.catalogs import JmaCsvCatalog

In [2]: c = JmaCsvCatalog(filename = '../../Documents/JMA/JMA-hypocenter-timestamp-shifted-area-only.csv')

In [3]: c.load_catalog()
Out[3]: <csep.core.catalogs.JmaCsvCatalog at 0x7fd4a61e8cc0>

In [4]: print(c)

        Name: None

        Start Date: 1923-01-08 04:46:29.170000+00:00
        End Date: 2014-05-31 14:58:40.910000+00:00

        Latitude: (17.4093, 52.5082)
        Longitude: (-179.9999, 179.9999)

        Min Mw: -1.6
        Max Mw: 9.0

        Event Count: 2508992
        

In [5]: 

[Image: JMA catalog stats, timestamp-shifted, study area only]

Shouldn't the longitude be something like [165 .. 180), [-180 .. -165] instead of (-179.9999, 179.9999)? Or should we report here just the centroid of the epicenter cloud and the maximum distance of the events?
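one possible way to report a meaningful longitude range across the date line is to work modulo 360; a sketch:

import numpy

def lon_extent(lons):
    """Tightest longitude interval, allowing a wrap across the date line."""
    shifted = numpy.where(lons < 0, lons + 360.0, lons)  # map to [0, 360)
    if shifted.max() - shifted.min() < lons.max() - lons.min():
        # the catalog is narrower when viewed across the date line
        lo, hi = shifted.min(), shifted.max()
        return (lo if lo <= 180 else lo - 360, hi if hi <= 180 else hi - 360)
    return (lons.min(), lons.max())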

read basic catalog using csv.reader

if the user has a csv file named 'my_catalog.csv' with the form

event_id, time, lon, lat, depth, mag

they should be able to read their catalog in using the following command. right now, loaders that are generator functions are not supported.

import csv
import csep

cat = csep.load_catalog('my_catalog.csv', loader=csv.reader)
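until generator loaders are supported, one workaround is a small wrapper that materializes the rows (sketch; column parsing is left to the caller):

import csv

def csv_list_loader(filename):
    """Loader that returns a list of rows instead of a generator."""
    with open(filename, newline='') as f:
        rows = list(csv.reader(f))
    return rows[1:]  # drop the header line

# cat = csep.load_catalog('my_catalog.csv', loader=csv_list_loader)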

wrapping and running external executables

this is the start of a thread to discuss implementing standardized ways of running external executables, which will typically be forecast models. we need this feature in order to autonomously run forecasting experiments in testing centers.

this task will require defining and preparing configurations for executables and managing any files produced from the program.

@gen2

catalog pre-processing

time-dependent forecasts require special catalog processing in order to compute t- and w-tests, along with cumulative versions of the consistency tests. this will need to be incorporated in order to evaluate forecasts consistently with what was done in csep1.

metadata for evaluation results

we should have an open discussion about which quantities from the forecast should appear as metadata in an evaluation result file. this comes from the fact that i'm working with ~1750 forecasts, all having different start times, and it made me wonder how much of the forecast metadata should be associated with the result .json file.

Read flagged cells from forecast file

The flagged-cell column (the last one in the csep1 gridded-forecast ascii format) should be read when creating a forecast and properly applied to the mask of that region.
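for reference, reading the flag column from a csep1-format ascii file might look like this (sketch; the flag convention should be verified against the format spec):

import numpy

data = numpy.loadtxt('forecast.dat')   # csep1 gridded-forecast ascii format
rates = data[:, :-1]                   # cell bounds and rates
flags = data[:, -1].astype(bool)       # last column: the flagged-cell bit
# whether 1 means "valid" or "masked" should be checked against the csep1
# format specification before applying the flags to the region mask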

S-test differences for same test catalog

Using the Helmstetter example gridded forecasts and one of my own, I tried to compare S-test results and was surprised that the Helmstetter examples did poorly. This seems to be related to the spatial filtering step and the region associated with the test catalog, even when the regions have the same midpoints and the evaluation catalog contains the same events.

Comparing the midpoints for the two regions with np.testing.assert_allclose flags that they are different:

Not equal to tolerance rtol=1e-07, atol=0

Mismatched elements: 15287 / 15364 (99.5%)
Max absolute difference: 11.600000000000009
Max relative difference: 0.34850863422291994
x: array([[-125.35, 40.15],
[-125.35, 40.25],
[-125.35, 40.35],...
y: array([[-117.15, 31.55],
[-117.05, 31.55],
[-116.95, 31.55],...

I sorted the midpoint xs and ys and they are identical. Essentially my test forecasts and the example have different regions, though the sorted midpoints are the same. The filtered catalogs for the evaluation are also identical, but the S-test results are very different.

[Image: comparison of S-test results for the Helmstetter example forecast and my test forecast]

Do the regions maybe need some sorting somewhere for the spatial test?

My test forecast:
CSEP_test_forecast.txt

Greetings

Hello everyone. I am now working at IEFCEA and am also a member of the China CSEP2 team. I have been following the CSEP2 program for a while, and the progress you have made recently looks good. What kind of help or example data from China could I supply for the tests in this project? I would also like to cooperate with your team, although my coding ability may not satisfy your needs; I can plot figures or do some data analysis if you need it. BTW, can PyCSEP be installed on my Ubuntu PC to run some forecasts and evaluations? Do the scripts contain the forecast models or algorithms? Sorry, I have not yet looked at the details of the scripts. Thanks for your guidance.

italian testing region

we need to implement an italian testing region from the csep1 xml polygon file. the region definition should be included in the repository as well.

documentation

the code base is seriously lacking in documentation. this needs to be addressed before we can work towards an initial release.

comparisons of region information

we should implement the ability to compare regions and test for equality. this will be useful for defensive programming when gridded catalogs are compared against gridded forecasts.
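a sketch of what the equality check could cover, assuming CartesianGrid2D-like objects with a dh attribute and an origins() accessor (illustrative, not the actual API):

import numpy

def regions_equal(region_a, region_b):
    """Hypothetical equality check for two Cartesian grid regions."""
    return (region_a.dh == region_b.dh
            and numpy.array_equal(region_a.origins(), region_b.origins()))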

load_gridded_forecast leads to strange forecast plots

I had some interesting results using load_gridded_forecast, where the forecast loads and the region midpoint locations match the examples, but when plotted it is clear that the intensity values are mapped to the wrong grid cells. I think this is related to forecast.region, which maybe assumes or internally sorts the longitudes of the grid cells in a certain order, hence the matching region midpoints. The plotting function takes the lons/lats from the region, but the intensities are assigned based on the arrangement of the input file, which I think is what leads to this:

[Image: forecast plot with intensity values mapped to the wrong grid cells]

Sorting my gridded forecast file so the longitudes are in ascending order before loading gave me a forecast with the expected spatial pattern (more smoothed seismicity than the above!).
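for reference, the sorting workaround only takes a few lines (sketch; assumes the csep1 ascii layout with lon_min and lat_min in the first and third columns):

import numpy

data = numpy.genfromtxt('my_forecast.dat')
# sort rows by latitude, then longitude; the primary/secondary order may
# need swapping to match whatever ordering the region expects
order = numpy.lexsort((data[:, 0], data[:, 2]))
numpy.savetxt('my_forecast_sorted.dat', data[order])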

bounding box grid forecast

The bounding box of the class CartesianGrid2D (get_bbox) returns the extent of the region's lower-left cell corners, rather than the extent of the forecast (including the top-right corner). Should its behavior be like that? To reproduce:

import csep
from csep.utils import datasets
import numpy

forecast = csep.load_gridded_forecast(datasets.helmstetter_mainshock_fname,
                                      name='helmstetter_mainshock')
bbox_forecast = forecast.region.get_bbox()

array_ = numpy.genfromtxt(datasets.helmstetter_mainshock_fname) 
bbox_dat = (array_[:,0].min(), array_[:,1].max(), array_[:,2].min(), array_[:,3].max()) 

print(bbox_forecast)
print(bbox_dat)

Tracked to here

pycsep/csep/core/regions.py

Lines 630 to 632 in 1fa6b26

def get_bbox(self):
""" Returns rectangular bounding box around region. """
return (self.xs.min(), self.xs.max(), self.ys.min(), self.ys.max())

But when I look at how CartesianGrid2D.xs and ys are defined in L730,

xs = self.dh * numpy.arange(nx + 1) + bbox[0][0]

I see that it was intended that the grid extend one cell beyond the number of cells. For instance, this gives the same result:

forecast.spatial_counts(cartesian=True).shape == (forecast.region.ys.shape[0], forecast.region.xs.shape[0])

Would it be a precision issue?

Also, plots.plot_spatial_dataset() does not plot the last lat/lon of the CartesianGrid. This could be quickly bypassed inline in that plotting function, but I prefer to make sure first.

plots.plot_poisson_consistency_test is not capturing inf statistic

For an EvaluationResult object whose observed_statistic attribute equals inf (such as the S-test of a forecast with bin rates of 0.0), the plotting function does not handle this case correctly, but rather lets ax.set_xlim() throw the error:

ax.set_xlim(*_get_axis_limits(xlims))

To reproduce

input

import csep
import numpy
result_dict = {'name': 'Poisson S-Test',
               'sim_name': 'Dummy model',
               'obs_name': None,
               'obs_catalog_repr': '',
               'quantile': 0.0,
               'observed_statistic': -numpy.inf,
               'test_distribution': [-150., -145.],
               'status': 'normal',
               'min_mw': 4.95,
               'type': 'EvaluationResult'}
Result = csep.core.poisson_evaluations.EvaluationResult.from_dict(result_dict)
csep.utils.plots.plot_poisson_consistency_test(Result)

output

  File ".../pycsep/venv/lib/python3.7/site-packages/matplotlib/axes/_base.py", line 3213, in _validate_converted_limits
    raise ValueError("Axis limits cannot be NaN or Inf")
ValueError: Axis limits cannot be NaN or Inf

Do you think we can add a try/except block that indicates that the model is diverging, rather than letting matplotlib throw the error? Or plot something for this model that indicates that it is diverging, for example something like:

[Image: mock-up of a consistency-test plot flagging the diverging model]
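a sketch of the kind of guard this could be, slotted in where the limits are set (illustrative, not the actual plots.py code):

import numpy

# ... inside plot_poisson_consistency_test, before setting the axis limits:
if numpy.all(numpy.isfinite(xlims)):
    ax.set_xlim(*_get_axis_limits(xlims))
else:
    # flag the diverging model instead of letting matplotlib raise
    ax.annotate('diverging (infinite statistic)', xy=(0.05, 0.5),
                xycoords='axes fraction', color='red')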

Let me know

ModuleNotFoundError: No module named 'obspy'

I am trying to test my code, but run into a nasty error that is not resolvable by my OS package manager:

/usr/bin/python3.6 /snap/pycharm-community/179/plugins/python-ce/helpers/pydev/pydevconsole.py --mode=client --port=39395

import sys; print('Python %s on %s' % (sys.version, sys.platform))
sys.path.extend(['/home/tb/PycharmProjects/csep2'])

Python 3.6.9 (default, Nov  7 2019, 10:44:02) 
Type "copyright", "credits" or "license" for more information.

IPython 5.5.0 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.
PyDev console: using IPython 5.5.0

Python 3.6.9 (default, Nov  7 2019, 10:44:02) 
[GCC 8.3.0] on linux
In[2]: from csep.core.catalogs import JmaCsvCatalog
Backend TkAgg is interactive backend. Turning interactive mode on.
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/IPython/core/interactiveshell.py", line 2882, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-2-8f4b98ac0bf6>", line 1, in <module>
    from csep.core.catalogs import JmaCsvCatalog
  File "/snap/pycharm-community/179/plugins/python-ce/helpers/pydev/_pydev_bundle/pydev_import_hook.py", line 21, in do_import
    module = self._system_import(name, *args, **kwargs)
  File "/home/tb/PycharmProjects/csep2/csep/__init__.py", line 2, in <module>
    from csep.core.catalogs import UCERF3Catalog, CSEPCatalog, ComcatCatalog
  File "/snap/pycharm-community/179/plugins/python-ce/helpers/pydev/_pydev_bundle/pydev_import_hook.py", line 21, in do_import
    module = self._system_import(name, *args, **kwargs)
  File "/home/tb/PycharmProjects/csep2/csep/core/catalogs.py", line 13, in <module>
    from csep.utils.comcat import search
  File "/snap/pycharm-community/179/plugins/python-ce/helpers/pydev/_pydev_bundle/pydev_import_hook.py", line 21, in do_import
    module = self._system_import(name, *args, **kwargs)
  File "/home/tb/PycharmProjects/csep2/csep/utils/comcat.py", line 26, in <module>
    from obspy.core.event import read_events
  File "/snap/pycharm-community/179/plugins/python-ce/helpers/pydev/_pydev_bundle/pydev_import_hook.py", line 21, in do_import
    module = self._system_import(name, *args, **kwargs)
ModuleNotFoundError: No module named 'obspy'

If obspy is a hard dependency, shouldn't a specific version be a submodule of this project?

one unit test fails

tb@tokaido:~/PycharmProjects/csep2$ python3 -m unittest
..../usr/lib/python3.6/tempfile.py:939: ResourceWarning: Implicitly cleaning up <TemporaryDirectory '/tmp/tmpfnbtlb0q'>
  _warnings.warn(warn_message, ResourceWarning)
....[ 2.  2.  2.  0.  0.  0.  0.  0.  0.  0.  2.  0.]
.......E..........................
======================================================================
ERROR: test_catalog_create_with_failure (tests.test_create_catalog.TestCreateCatalog)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/tb/PycharmProjects/csep2/tests/test_create_catalog.py", line 16, in test_catalog_create_with_failure
    AbstractBaseCatalog(catalog=catalog)
  File "/home/tb/PycharmProjects/csep2/csep/core/catalogs.py", line 49, in __init__
    self.catalog = catalog
  File "/home/tb/PycharmProjects/csep2/csep/core/catalogs.py", line 190, in catalog
    self._catalog = self._get_catalog_as_ndarray()
  File "/home/tb/PycharmProjects/csep2/csep/core/catalogs.py", line 501, in _get_catalog_as_ndarray
    catalog = numpy.array(n, dtype=self.dtype)
TypeError: a bytes-like object is required, not 'int'

----------------------------------------------------------------------
Ran 42 tests in 0.013s

FAILED (errors=1)
tb@tokaido:~/PycharmProjects/csep2$ 

poisson_evaluations likelihood_test out of bounds error

I can run the magnitude, number and spatial tests from poisson_evaluations without issue, but the likelihood_test and conditional_likelihood_test both give the same error when I try to run them with the example (gridded) forecast (called forecast_ex here).

IndexError                                Traceback (most recent call last)
<ipython-input-85-fdd092fbc693> in <module>
----> 1 llhood_test_result_ex = poisson.conditional_likelihood_test(forecast_ex, catalog)
      
c:\users\kirst\pycsep\csep\core\poisson_evaluations.py in conditional_likelihood_test(gridded_forecast, observed_catalog, num_simulations, seed, random_numbers, verbose)
    175 
    176     # simply call likelihood test on catalog data and forecast
--> 177     qs, obs_ll, simulated_ll = _poisson_likelihood_test(gridded_forecast.data, gridded_catalog_data,
    178                                                         num_simulations=num_simulations, seed=seed, random_numbers=random_numbers,
    179                                                         use_observed_counts=True,

c:\users\kirst\pycsep\csep\core\poisson_evaluations.py in _poisson_likelihood_test(forecast_data, observed_data, num_simulations, random_numbers, seed, use_observed_counts, verbose)
    511     # these operations perform copies
    512     observed_data_nonzero = observed_data.ravel()[target_idx]
--> 513     target_event_forecast = log_bin_expectations[target_idx] * observed_data_nonzero
    514 
    515     # main simulation step in this loop

IndexError: index 350244 is out of bounds for axis 0 with size 314962

I loaded the catalog using the example in the documentation. I filtered the catalog using catalog.filter_spatial(forecast_ex.region) and confirmed that the number of spatial grid cells was correct. I think there are too many magnitude bins in the gridded_catalog_data, which seems to have 65 judging by the shape of the array returned by catalog.spatial_magnitude_counts().
From a bit of digging, this seems to come from forecast_ex.region.magnitudes, which starts at 2.5, giving 65 bins, while the actual data in the example start from magnitude 4.95, giving 41 magnitude bins. There are 7682 spatial cells, which lines up with the point at which the error occurs (7682 × 41 = 314962, the array size in the IndexError).
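if this diagnosis is right, a possible workaround is to align the catalog's magnitude binning with the forecast before gridding (assuming region.magnitudes is writable, which I have not verified):

# align the catalog's magnitude binning with the forecast before gridding
catalog.region = forecast_ex.region
catalog.region.magnitudes = forecast_ex.magnitudes
gridded_catalog_data = catalog.spatial_magnitude_counts()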
