
fhmm's Issues

estimation output

  • sorting: put the state with the highest mu first, i.e. order the states by mu in descending order
  • design the estimation result output in the txt-file (names, elements, order)
  • Hessian computation
  • AIC and BIC computation (see the sketch below)
  • check whether iterlim was exceeded; if so, increase it
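
A minimal sketch of the AIC/BIC and iterlim steps, assuming the negative log-likelihood nLL_hmm is minimised with nlm() and that theta_start, observations, npar and nobs are provided as shown (the names are illustrative):

fit <- nlm(f = nLL_hmm, p = theta_start, observations = observations,
           hessian = TRUE, iterlim = 200)
if (fit$iterations >= 200) warning("iterlim reached, consider increasing it")
ll  <- -fit$minimum                 # maximised log-likelihood
aic <- -2 * ll + 2 * npar           # npar = number of estimated parameters
bic <- -2 * ll + log(nobs) * npar   # nobs = number of observations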

Documentation issues

  • set_controls() should come second, since it is important for prepare_data(). Clarify: an object of class RprobitB_controls or fHMM_controls?
  • prepare_data() should come third, because it is where the class fHMM_data is introduced, which is needed for the plot function that currently comes second. This could be confusing.
  • decode_states: I would write "the most likely" state sequence. The reference to Rprobit_B is unclear.
  • fHMM_events(): it is not entirely clear to me what this function does exactly. Given events, does it check that they are suitable for being read in?
  • fHMM_parameters: "A tpm of dimension controls$states[1]." – I would spell out tpm (transition probability matrix). The difference between mu and mus_star does not become clear.

Plot of SDDs for simulated HHMM gives odd x-scale

Try

controls = list(
  id            = "test", 
  sdds          = c("normal","normal"),
  states        = c(3,2),
  time_horizon  = c(100,30),
  at_true       = TRUE,
  overwrite     = TRUE,
  seed          = 4
)

and see that this gives an odd x-scale in sdds.pdf. The x-axis limits should instead be set based on the limits of the state-dependent distributions.
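
A possible fix, sketched here under the assumption of normal state-dependent distributions with estimated means mus and standard deviations sigmas: derive the x-limits from quantiles of the fitted distributions instead of from the data range.

# choose plotting limits from the 1% and 99% quantiles of the fitted SDDs
lower <- qnorm(0.01, mean = mus, sd = sigmas)
upper <- qnorm(0.99, mean = mus, sd = sigmas)
xlim  <- c(min(lower), max(upper))
x     <- seq(xlim[1], xlim[2], length.out = 200)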

Is there a built-in function to graph the simulated data?

First of all - I love the package! I struggled with some not-so-user-friendly packages in the past, but this is really something else!

To my question: is there an existing function in the package to visualize/plot only the simulated data, but with the same structure, i.e. visualizing the state scales underneath?

If not - is the simulated data bundled together with the data I fitted the model on in the data.rds file?

[screenshot of the loaded data object showing logReturns and dataRaw]

In the picture above, logReturns and dataRaw each have 2714 elements. I only modelled a year, so this has to be all of the data, right? So I only need to fetch the last 365 elements and then I'm fine?

I do hope I was clear enough. If there is any confusion, just leave a quick comment and I will try to explain further.

Thanks in advance,

Carlos

Incorporate covariates

Incorporate covariates into the state process(es) to determine which factors affect the probabilities of switching to bearish and bullish markets, respectively (just an idea, perhaps something for later versions of the package!).
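
One common way to realise this (a sketch only, not part of the package): link the transition probabilities to a covariate via a logit link, so that the transition probability matrix becomes time-varying. The coefficient values and names below are hypothetical.

# 2-state transition probability matrix driven by a covariate z via a logit link
tpm_t <- function(z, beta12 = c(-2, 0.5), beta21 = c(-2, -0.5)) {
  p12 <- plogis(beta12[1] + beta12[2] * z)  # P(switch from state 1 to state 2)
  p21 <- plogis(beta21[1] + beta21[2] * z)  # P(switch from state 2 to state 1)
  matrix(c(1 - p12, p12,
           p21,     1 - p21), nrow = 2, byrow = TRUE)
}
tpm_t(z = 0.3)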

visualization

  • make visualization more flexible for any number of states
  • additional parameter: a vector of dates and labels to highlight in the plot (e.g. the Lehman bankruptcy)
  • check that plots don't get overwritten

likelihood computation

  • check that nLL_hmm works
  • check that nLL_hhmm works
  • transformation of thetaUncon to thetaCon with a dedicated function (see the sketch after this list)
  • nLL_hmm can also be called from nlm
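
A minimal sketch of such a transformation for a 2-state Gaussian HMM; the parameter layout of thetaUncon (two tpm entries on the logit scale, two means, two log standard deviations) is an assumption for illustration only.

thetaUncon2thetaCon <- function(thetaUncon) {
  gamma_12 <- plogis(thetaUncon[1])   # unconstrained -> (0, 1)
  gamma_21 <- plogis(thetaUncon[2])
  list(
    Gamma = matrix(c(1 - gamma_12, gamma_12,
                     gamma_21,     1 - gamma_21), nrow = 2, byrow = TRUE),
    mu    = thetaUncon[3:4],
    sigma = exp(thetaUncon[5:6])      # unconstrained -> positive
  )
}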

check_controls

Write a function "check_controls" that checks the control parameters and prints information about the model formulation.

Data on coarse scale

Problem

Averaging log-returns on the coarse scale does not seem to be the best idea. It is hard for the code to detect different states / state switches in this type of data.

Idea

Include a parameter in controls to select the type of coarse-scale data (e.g. sum of absolute values, mean, average of absolute values). Plot the coarse-scale data in ts.pdf to see if this yields better data.
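
A sketch of how such a control could work; the control name data_cs_type appears in the "Flexible FS time horizon" issue further below, the aggregation function itself is illustrative.

# aggregate the fine-scale log-returns x of one coarse-scale period
aggregate_cs <- function(x, type = c("mean", "mean_abs", "sum_abs")) {
  type <- match.arg(type)
  switch(type,
         mean     = mean(x),
         mean_abs = mean(abs(x)),
         sum_abs  = sum(abs(x)))
}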

Calculation of Hessian

Use the option hessian = FALSE in nlm and only hessian = TRUE in the final estimation run. This should give a speed improvement.
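
A sketch of the idea, assuming several optimization runs over different starting values (all names are illustrative):

# skip the Hessian during the search runs ...
runs <- lapply(start_values, function(theta0)
  nlm(f = nLL_hmm, p = theta0, observations = observations, hessian = FALSE))
best <- runs[[which.min(sapply(runs, `[[`, "minimum"))]]
# ... and compute it only once, for the best run
final <- nlm(f = nLL_hmm, p = best$estimate, observations = observations,
             hessian = TRUE)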

Create R-package

  • Add a folder called "R" that contains all .R files and a folder called "src" that contains all .cpp files (this is the folder structure required for an R package). Here is a cheat sheet on creating packages that could be useful for the development: https://github.com/rstudio/cheatsheets/raw/master/package-development.pdf.
  • Create a separate .R file for each function.
  • Write documentation for each function using roxygen tags, see comment below.
  • Where functions from other packages are used, use packageName::functionName() instead of functionName() to avoid conflicts.
  • Choose a name for the package: fHMM
  • Create an R package hex sticker.
  • Write the DESCRIPTION file.

control "data"

Give controls parameter "data" which is a list containing all parameters related to data processing/simulation. Update documentation.

Wrong state-dependent distribution with 3 states

When running this code:

# simulated HMM -----------------------------------------------------------
seed = 1
controls = list(
  states  = 3,
  sdds    = "gamma",
  horizon = 500,
  fit     = list("runs" = 100)
)
controls %<>% set_controls
data = prepare_data(controls, seed = seed)
data %>% summary
data %>% plot
model = fit_model(data, ncluster = 1, seed = seed) %>%
  decode_states %>%
  compute_residuals
summary(model)
model %<>% reorder_states(state_order = 1:3)
compare(model)
model %>% plot("ll")
model %>% plot("sdds")

[plot of the fitted state-dependent distributions]

The third state is unfortunately not identified correctly. I also ran the same code once with 1000 runs, but that did not change the result at all.

Odd behaviour for fixed dfs

Try the fixed-dfs model

controls = list(
  id = "test",
  sdds = c("t(Inf)",NA),
  states = c(2,0),
  time_horizon = c(100,NA),
  seed = 1
)

and see that two states cannot be identified. However, the dfs-flexible model

controls = list(
  id = "test",
  sdds = c("t",NA),
  states = c(2,0),
  time_horizon = c(100,NA),
  seed = 1
)

works.

Improve numerical optimization

  • Early stopping of non-promising optimization runs.
  • Parallelise numerical optimization runs.
    • Set the number of cores in controls via ncores.
    • In check_controls, read out the available number of cores; give a warning if the requested number is not (all - 1) and an error if too many (>= all) cores are used.
    • Divide all runs into ncores batches: the last one has ceiling(runs/ncores) runs, all others floor(runs/ncores) runs. Implement a progress bar for the last batch. ncores must not exceed runs. See the sketch below.
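
A sketch of the batching and parallelisation logic with the parallel package, assuming runs and ncores are read from controls; run_single_optimization() is a hypothetical helper that performs one optimization run (in practice it and nLL_hmm would have to be exported to the workers via parallel::clusterExport):

stopifnot(ncores <= runs)
# all batches get floor(runs/ncores) runs, the last one takes the remainder
base    <- floor(runs / ncores)
sizes   <- c(rep(base, ncores - 1), runs - base * (ncores - 1))
batches <- split(seq_len(runs), rep(seq_len(ncores), times = sizes))
cl      <- parallel::makeCluster(ncores)
results <- parallel::parLapply(cl, batches, function(ids)
  lapply(ids, run_single_optimization))
parallel::stopCluster(cl)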

Ideas

A collection of ideas on how to further extend the code:

  • Functionality to loop over different numbers of states.
  • How to deal with NA values in empirical data ("Close" may not exist for every time point, and two data sets may not share all closing days)?
  • Include a comparison between the true states and the predicted states of simulated data in a contingency table.
  • Possibility to extract any column from the dataset, not only "Close".
  • Extend to allow fixing the degrees of freedom on one scale only.
  • Show the progress bar before the first iteration.
  • Give an error if any number of states equals 1.
  • Download new data automatically from https://finance.yahoo.com/. Write a function download_data in 'data.R' (see the sketch below).
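
For the last point, a sketch of what download_data() could look like; the Yahoo Finance CSV endpoint used below is an assumption and may change or require authentication:

download_data <- function(symbol, from, to, file = paste0(symbol, ".csv")) {
  p1  <- as.integer(as.POSIXct(from, tz = "UTC"))   # start date as Unix time
  p2  <- as.integer(as.POSIXct(to, tz = "UTC"))     # end date as Unix time
  url <- paste0("https://query1.finance.yahoo.com/v7/finance/download/", symbol,
                "?period1=", p1, "&period2=", p2, "&interval=1d&events=history")
  utils::download.file(url, destfile = file, quiet = TRUE)
  utils::read.csv(file)
}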

sim

simulate an HMM or an HHMM (depending on whether N = 0 or N != 0)

Flexible FS time horizon

For empirical data, implement that the fine-scale horizon can be monthly / quarterly. This leads to different fine-scale chunk sizes. In this case, give a warning if controls[["data_cs_type"]] is not in c("mean", "mean_abs").
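
A sketch of how the dates of the empirical data could be split into monthly or quarterly fine-scale chunks (function and argument names are illustrative):

fs_chunks <- function(dates, horizon = c("monthly", "quarterly")) {
  horizon <- match.arg(horizon)
  key <- if (horizon == "monthly") {
    format(dates, "%Y-%m")
  } else {
    paste0(format(dates, "%Y"), "-", quarters(dates))
  }
  split(dates, key)   # list of fine-scale chunks with (generally) unequal sizes
}
# example: fs_chunks(as.Date(c("2020-01-02", "2020-02-03", "2020-04-01")), "quarterly")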

control "nlm"

Give controls parameter "nlm" which is a list containing all parameters that can be passed to nlm. Update documentation.

update on readData

  • process two different sources of data
  • truncation: find the nearest date if the truncation point does not exist; allow NA for no truncation
  • print out which data is used and how
  • check in check_controls whether the correct empirical data is supplied for the HMM and the HHMM

PRS of simulated HHMM with normal SDDs on FS look odd

Try

controls = list(
  id            = "test", 
  sdds          = c("normal","normal"),
  states        = c(3,2),
  time_horizon  = c(100,30),
  at_true       = TRUE,
  overwrite     = TRUE,
  seed          = 4
)

and see that the pseudo-residuals of the fine scale are not normal.

Extend for other SDDs

Extend code for other state-dependent distributions:

  • t: t-distribution
  • t(x): t-distribution with x fixed degrees of freedom (this replaces fix_dfs in controls; see the parsing sketch below)
  • norm: normal distribution, i.e. t(Inf)
  • gamma: gamma distribution

Include control sdd (character vector of length two).
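
A sketch of how a fixed-dfs specification like "t(5)" could be parsed (the return structure is an assumption for illustration):

parse_sdd <- function(sdd) {
  if (grepl("^t\\(", sdd)) {
    dfs <- as.numeric(sub("^t\\((.*)\\)$", "\\1", sdd))  # e.g. 5 or Inf
    list(name = "t", fixed_dfs = dfs)
  } else {
    list(name = sdd, fixed_dfs = NULL)                   # dfs are estimated
  }
}
parse_sdd("t(Inf)")   # fixed dfs = Inf, i.e. the normal case
parse_sdd("gamma")    # no fixed dfs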

Access the predicted values in a list/array

I'm doing an analysis of our predictions and I would like to access the output in a list/array so that I can compute MAPE, MSE and some other indicators. I can't seem to find it - where is it?

Examples in .Rd-files

Add small executable examples to the main .Rd files to illustrate the use of the exported functions, but also to enable automatic testing.

Simulated values

I'm raising the issue again: I can't find the simulated data in the provided model files, just the graphical representation. I've looked through the output files and I can't find it.

@return for each .Rd file

  • Add @return to the roxygen tags and explain the functions' results in the documentation (see the example after this list).
  • Write about the structure of the output (class) and also what the output means. (If a function does not return a value, document that too).
  • Missing @return tag in
    • apply_viterbi.Rd
    • check_decoding.Rd
    • create_visuals.Rd
    • download_data.Rd
    • fit_hmm.Rd
    • plot_ll.Rd
    • plot_sdd.Rd
    • plot_ts.Rd
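
A sketch of what such a @return tag could look like; the described return value is illustrative, not taken from the package's actual documentation:

#' Fit a hidden Markov model
#'
#' @return
#' A list with the estimated parameters, the log-likelihood value at the
#' optimum, and the Hessian.
fit_hmm <- function(data, controls) {
  # ...
}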

Warnings when using other datasets

download_data("dax", "^GDAXI", path=".")

download_data("hk", "HEN3.DE", path=".")

horizon: 2020-01-02 to 2021-03-01

Warnings:

  1. possibly unidentified states (C.6)
  2. events ignored (V.2)
