Comments (10)
Thank you for reporting this. I'll look into it
from bfast.
made some minor edits to the initial comment that do not affect the gist of the issue.
In the R version I was wrongly assuming that calendars are 0 based, so I was miscalculating the day of year by 1.
from bfast.
after some more tests, in which it happened that there was a break in the last day of the monitoring period, I noticed that the breaks property is a 1 based index. Is this correct?
By this I mean that the first date in the monitoring period must have an index of 1, in order for the breaks property to be used correctly as an index. The implication being that my above script is wrong by 1 date. Unfortunately it moves the detected break even further, not closer to the R version.
Am I doing something wrong with the indexing?
I am editing the comment to correct this bug in my script, but please check the edits to better understand what I mean.
from bfast.
Good evening,
I subscribed to this issue since I'm dealing with the same problem for some of my currents tests, both in Python and OpenCL.
I just was reading the paper (https://arxiv.org/pdf/1807.01751.pdf), and take some time seeing the R files implementation (https://www.rdocumentation.org/packages/strucchange/versions/1.5-2/source - monitoring.R), focusing on breaks calculation.
From the paper, I could understand that (among other things) the calculation of the breaks is dependent on the calculation of bounds values. In the paper the bounds are computed with the following expression:
with:
being t the monitoring period.
After inspecting the R files I found the "OLS-MOSUM" method the most identical to what is done in Python. In the method, the bounds are calculated as:
The formula is identical to what is done in Python since 'lam' (critval) is previously multiplied by the sqrt of 2. However, I think that in Python the division value inside the log is calculated differently from the R code. 't', as stated in the paper should be the monitoring period and 'n' the last value of the history, but in Python, you do this:
I think you divide by the last mapped_indice of all time serie. What do you think about:
Can you give me some feedback about this practice?
Thank you Dmitry!
from bfast.
Hi @Carolina710. Python is 0-indexed, in comparison to R, which is 1-indexed. Furthermore, in Python, you can index with -1 to get the last value in the array, hence array[-1] and array[n-1] are equivalent. Se here for more information
I am currently looking into the issue
from bfast.
Hi, thanks for your reply.
I understand that logic, I just thought that "n" or "self.n" (depending on your version) was the number of data images from the period of history, right? Like: "self.n = self._compute_end_history (dates)". I also thought that the mapped indices were calculated for all time series, including the history and the monitoring periods, with a total of "N" data images.
So, in this scenario, you take the last mapped index of all time series, that is also the last data from the monitoring period [-1] or [N-1], not the last one in history [n-1], I think ...? And my question was, shouldn't it be the last mapped index in history? Like, "histsize" in R.
Thank you very much!
from bfast.
Hi @mirt001 and @Carolina710
This particular bug should be fixed now. However, I need to look more thoroughly into the boundary computation soon (including the issue @Carolina710 mentioned).
Thanks everyone!
from bfast.
I think that I have misunderstood you answer earlier @Carolina710. You are right, it should be divided by the last index in the history period. I have pushed it to the develop branch. Nice catch!
from bfast.
Thanks @mortvest and also thanks @Carolina710
I just checked the results using the latest develop branch, on the above Python test, and while the results are closer, they are still different in the sense that the break is detected at a different date.
The new detected date for python is 2018 146 , or 2018-05-26, which is one date too early, when compared to the R version, which detects the break on 2018 162, or 2018-05-26. Are you getting different results?
I will test this more to see if Python is always 1 date away, or if there are more inconsistencies.
from bfast.
The returned breakpoint is 0-indexed, so you don't need to subtract 1 from it. I get 2018 163 this way. The small error is due to the way the dates are converted, which is outside of the scope of this implementation. The output breakpoint index is the same as in R version.
from bfast.
Related Issues (20)
- The error message of level does not reflect its actual behaviour HOT 1
- performance benchmark between the different backend ? HOT 11
- drop usage of sklearn ? HOT 2
- pyhton backend n_chunk is never used HOT 3
- why _model is reset each time fit is called ? HOT 3
- OpenCL backend is way too slow compared to futhark bench HOT 1
- Write an example that demonstrates usage of multiple GPUs
- GPU data transfer overhead amortization
- BFAST cannot be install from git source
- Version 0.7 fails to install on google colab HOT 2
- opencl backend only supports a number of harmonics (k) in range(3,11) HOT 4
- use bfast for one dimentional time series data HOT 10
- numpy should not be referenced in the setup.py HOT 1
- Stable period determination not implemented HOT 2
- How to get full list of breaks for one pixel HOT 2
- Error in normal terminal and Google colab with Get Started example HOT 1
- IndexError when datetime is used with hour != 0
- Support for multiple detected breaks
- Data not enough for pixel
- Installation fails due to obsolete numpy code HOT 3
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from bfast.