
hadron's Issues

HDF5 correlator reading is a bottleneck

For the sigma project I want to do some statistical analysis of the raw correlators. With the current set of momenta and operators and diagrams there are roughly 50k correlators (4 GB). My plan was to compute a few quantities per raw correlator:

  • Maximum {integrated autocorrelation time per time slice}
  • Maximum {relative error per time slice}
  • Argmax of the above
  • Median of {relative error per time slice for a few slices around t = 10}

This way we can see whether the blocking is chosen large enough and how the signals are.

I cannot load all the data into memory on my laptop (8 GB available) and I do not want to write a monolithic analysis that can only be run on qbig until it grows out of that machine as well. Therefore I am trying to do one correlator at a time.

The problem is that with the current data, which is not even half of one ensemble and does not include displacements, it would take days on my laptop. A four-point diagram takes roughly 20 hours on four cores, and we have four of these. The two- and three-point diagrams take half a minute and five minutes each, respectively.

From my system monitor the bottleneck seems to be the CPU:

[screenshot: system monitor]

Most importantly, the SATA3 SSD does not seem to be the bottleneck.

What I do is an h5ls on the first configuration of each diagram and then load the available statistic for that correlator:

        names <- h5ls(files[1])$name

        diagram_agg <- lapply(names, function (name) {
            cf <- readbinarycf(files = files,
                               T = param$extent_time,
                               hdf5format = TRUE,
                               hdf5name = name,
                               hdf5index = c(1, 2))

            shifted <- takeTimeDiff.cf(cf)
            uw <- uwerr.cf(shifted)
            ## ... the per-correlator summary quantities are computed here ...
            uw
        })

Running this through R's sampling profiler shows that almost all time is spent in the reading routine; the computation accounts for 2% of the time:

[screenshot: profiler output]

Digging deeper into the readbinarycf function shows that most time is spent in h5read and h5ls, which are not our functions:

[screenshot: profiler output for readbinarycf]

The data is stored such that for each diagram and configuration there is one HDF5 file. Extracting one correlator from all configurations is maximally inefficient because each time we have to open every file and extract just a sliver from it. As the h5ls calls take up half the time in readbinarycf, this seems like a huge waste. Also, within h5read only 35% of the time is spent on actual reading and the remainder on IO-related overhead. I would think that this overhead is incurred per file open, and therefore opening fewer files would give a speed-up.

For the autocorrelation time we do need the whole statistic, but reading all diagrams at once does not fit into RAM. I thought about “transposing” the files: one file per diagram with one HDF5 group per flavor-momentum-displacement-gamma combination, which then contains the whole statistic. The disadvantage is that these files need to be regenerated whenever the statistics are increased, but it might make loading much easier. This could be done once with a short Python script which loads and stores the data.
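
For illustration, the same one-off transpose sketched in R with rhdf5 (the paths, file layout, and data shapes here are assumptions):

    library(rhdf5)

    ## All per-configuration files of one diagram (hypothetical paths).
    files <- list.files('data', pattern = '^diagram1_.*\\.h5$', full.names = TRUE)
    names <- h5ls(files[1])$name

    ## One output file per diagram, one dataset per correlator holding the
    ## whole statistic (one row per configuration).
    h5createFile('diagram1_transposed.h5')
    for (name in names) {
        rows <- lapply(files, function (f) h5read(f, name))
        h5write(do.call(rbind, rows), 'diagram1_transposed.h5', name)
    }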

When building the GEVP later on, I will have to load certain correlators with their full statistic as well. The same issue will occur there, though perhaps not every single correlator is actually read.

Is the performance problem a real problem or do I just have to accept that this will take two days once? Is this “transpose” option sensible? Do you have a better idea?

interactive() is problematic for "my" typical use case

The way I use R makes the usage of interactive() a bit problematic. My analysis drivers have loads of parameters in order to be suitable for different analyses (as far as that is possible), and as a result I generally run them from an interactive session. It would be helpful for me if interactivity, as the reason for calling X11() to get new windows, could be an internal state rather than whatever interactive() reports. What do you think?

uwerrprimary does not return the error on Gamma

The uwerrprimary function returns a list with lots of variables, but the error on Gamma is missing. It is computed as needed in the plot.uwerr function. I'd like to extract the data and plot it with a different package; therefore I need this data directly.

I have looked into the code and in uwerrprimary there is a GammaFbb which seems different from the GammaFbb that is used in plot.uwerrprimary.

Could the error on Gamma also be included in the return value of this function?

Update: It seems that I cannot use the gammaerror function in my own script because it does not get exported into the package namespace. I suppose it is a function with file scope only, then.

Update: I managed to source the uw.R file and then just copied the needed code from the plotting routine. I now have the data that I want, so this issue can wait.

Allow fitting with “matrixfit” using only a subset of points

With highly autocorrelated correlators, such as those from the sigma project, the inversion of the covariance matrix becomes unstable. Therefore one should be able to use only a subset of points for the fit. This reduces the number of elements in the covariance matrix significantly and therefore improves the signal-to-noise ratio.

Absolute value not applied consistently in two-state fit model

The two-state/pc model applies the absolute value to the energy and energy-difference parameters to force them to be positive. In deriv.pcModel, however, this absolute value is missing. The functions pcChi, pcChisqr, dpcChi, and dpcChisqr seem to apply it consistently. If the former function is actually used, this might lead to inconsistencies.

analysis_online should be rewritten to use `cf` class

Currently, the analysis of online measurements relies on a number of legacy routines which work, but may produce somewhat inconsistent results. For instance, the PCAC effective mass is computed from the symmetric effective mass definition, while the pion effective mass is computed using invcosh. The return values of the various functions are also highly inconsistent.

This will require rewriting the uwerr functions for these observables, with the benefit that these can then be used elsewhere to check the results of blocked-bootstrap-based fits.

readutils: extract.obs fails for pre-filtered cmicor and any(vec.obs>1)

The consistency check in extract.obs prevents extracting observables with CMI ID > 1 when the correlator files have been read with the obs argument.

expected behaviour: A cmicor object created by, say, readcmidatafiles with the argument obs=c(5) contains only one observable with ID 5. extract.obs should allow the extraction of this observable.

actual behaviour: extract.obs quits with the error message `extract.obs: data inconsistent, nrObs in the data does not match to input data length`

I don't fully understand the reason for this consistency check; it seems illogical, since a cmicor object can very well contain only one observable with any ID imaginable. I would just remove the consistency check, but there may be a valid reason for it being there that I simply cannot see.

removeTemporal.cf sets weight.factor to 1.0

The removeTemporal.cf function sets the weight.factor field of the cf object to 1.0. This makes sense for the plain shifted case, but when actual temporal states are removed I would think that setting it to mass2$t0 - mass1$t0 would make more sense.

Would it make sense to change this, or is it a misunderstanding of mine?

Add possibility of bootstrapped auxiliary parameters in bootstrap.nlsfit

The bootstrap.nlsfit function currently accepts bootstrap samples for the independent and dependent variables as well as a single closure which describes the model. This is becoming limiting.

For the Breit-Wigner fit in the ρ-project we have the phase shift as the dependent and the CM energy as the independent variable. The model function, however, also depends on the pion mass, which we currently have to bake into the closure. This means that the pion mass uncertainty is neglected during this fit. As this uncertainty is small, the damage is likely small.

I see four ways to go forward:

  1. Pass the ID r of the bootstrap sample (and 0 or NA for the original data) to the closure. In the concrete example it would then pick pion_mass_val or pion_mass_boot[r]. This is the most flexible option.

  2. Pass an additional matrix with boot.R or 1 + boot.R rows to bootstrap.nlsfit. The function then slices this matrix and passes the appropriate row to the fit closure. The disadvantage is that the closure has to unpack this row, and additionally the matrix has to be of homogeneous data type.

  3. Somehow cram the additional parameters in as extra independent or dependent variables, fill up the other one with NA, and then carefully unpack this in the closure.

  4. Do not alter bootstrap.nlsfit at all. Instead let the closure also have a copy of the bootstrap samples of the independent variable. The closure then searches for the current set of independent variables in the matrix of bootstrap samples, extracts the row number, and figures out the sample number that way.

I prefer option 1, though.
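
A minimal sketch of option 1 (all names here are made up, and the model body is a placeholder, not the Breit-Wigner form):

    pion_mass_val <- 0.065                      # central value (made up)
    pion_mass_boot <- rnorm(400, 0.065, 0.001)  # bootstrap samples (made up)

    ## The fit closure receives the sample index r; r = 0 means original data.
    fn <- function (par, x, r) {
        mpi <- if (r == 0) pion_mass_val else pion_mass_boot[r]
        par[1] + par[2] * x + mpi               # placeholder model
    }

    fn(c(1, 2), x = 0.5, r = 0)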

Does not work with GSL 2.1

Fedora ships with GSL 2.1. The configure.ac contains a snippet that only accepts version 1.x; I have generalized this in my fork. Version 2.x has some API changes, so hadron does not compile any more:

gcc -m64 -I/usr/include/R -DNDEBUG   -I/usr/local/include  -O3 -Wall -fpic  -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -m64 -mtune=generic  -c gsl_fit.c -o gsl_fit.o
gsl_fit.c: In function 'multifit_cor':
gsl_fit.c:569:28: error: 'gsl_multifit_fdfsolver {aka struct <anonymous>}' has no member named 'J'
   gsl_multifit_covar(solver->J, 0.0, covar);
                            ^~

I guess support for 1.x and 2.x at the same time is not feasible?

removeTemporal.cf overwrites c

In removeTemporal.cf, there is c <- 1.0 and later c(…) is used for concatenation. I initially thought that this would be a problem, but it isn't: you can overwrite c in your local environment and c() will still be a call to the base c function, because R only considers function bindings when a name is used in call position.
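
A minimal illustration:

    c <- 1.0   # masks the value binding only
    c(c, 2.0)  # still calls base::c and returns 1 2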

But perhaps it would be better to just rename this factor and have less confusion.

Define the cf class as properly as possible

Let's continue the previous discussion here.

On Stack Overflow one suggestion was to look into the R6 class system. It does not look very hard to use, and it might even be possible to use it as a drop-in replacement for the current S3 class. This would give us private members and functions. I am not sure that this is an improvement, as it would be the only R6 class in the library. Also, I am not sure how we can prevent adding too many member functions.

When a correlation function gets shifted, we need to keep track of that. Currently a flag is set and functions like matrixfit act on it. If we want to prevent setting the shifted member from outside, then we either need a setter (which achieves nothing) or have to make the shifting function a public member function. But the same goes for most other things, which would turn this into a god class.

So perhaps sticking with the S3 system is okay, we just need to be a bit more careful about it.

What do you think about making it an R6 class?
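
For concreteness, a minimal sketch of what an R6 drop-in might look like (all names are made up and the shift is only indicative):

    library(R6)

    CorrelatorCF <- R6Class("CorrelatorCF",
        public = list(
            initialize = function (data, Time) {
                private$data <- data
                private$Time <- Time
            },
            shift = function () {
                ## cf(t) -> cf(t+1) - cf(t), with periodic wrap-around.
                private$data <- cbind(private$data[, -1], private$data[, 1]) - private$data
                private$shifted <- TRUE
                invisible(self)
            },
            is_shifted = function () private$shifted
        ),
        private = list(data = NULL, Time = NULL, shifted = FALSE)
    )

    cc <- CorrelatorCF$new(matrix(rnorm(48), nrow = 4), Time = 12)
    cc$shift()$is_shifted()  # TRUE; `shifted` cannot be set from outside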

invertCovMatrix blocking

When invertCovMatrix is run on bootstrap samples, there should be no further blocking done! The bootstrap samples are already from a blocked bootstrap.

Move to Roxygen

The documentation in R consists of the Rd files in the man directory. These had to be written by hand; now there is Roxygen, which can generate them. The documentation then sits directly above the functions, just like in C/C++ (Doxygen), Python (Epydoc, Sphinx), PHP (phpdoc), or Java (JavaDoc).

Let's discuss whether converting all the documentation to Roxygen (using Rd2roxygen) makes sense. Please feel free to edit this post below the line so that we have a concise collection of arguments for both cases.


Pro:

  • The function signature including default values is automatically copied to the “usage” part of the documentation.
  • Makes it easier to see whether all function arguments are documented.
  • Updates to the source are more likely to be reflected in the documentation as well.
  • Once we have marked all functions with @export, Roxygen can create the NAMESPACE file for us.
  • Features like @family make it very easy to cross-link a bunch of related functions together. Otherwise one would have to manually insert \link{…} commands in each of the files, an O(N²) problem.
  • Undocumented functions stand out.

Con:

  • The source code files become longer due to all the documentation, especially when there are lots of examples.
  • One has to call devtools::document() and check those files into git after each change to the documentation.
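
For illustration, a Roxygen header on a hypothetical function looks like this; devtools::document() then generates the Rd file and the NAMESPACE entry:

    #' Shift a correlator vector by one time slice.
    #'
    #' @param x Numeric vector holding the correlator.
    #'
    #' @return The cyclically shifted correlator.
    #' @family correlator utilities
    #' @export
    shift_one <- function (x) {
        c(x[-1], x[1])
    }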

bootstrap.nlsfit returns boot.R+1 elements

In the bulk of the hadron library there are distinct t and t0 fields for the bootstrap samples and the actual value. The bootstrap.nlsfit function also has a t0, but its t field has boot.R + 1 elements and contains a copy of t0 at t[1]. I find this very inconsistent and cumbersome to work with.

Just having it of length boot.R like the remainder of the code would be better, I believe. Please make the change in a new major version, as it breaks the API.

invCovMatrix blocks already blocked data again

If bootstrap samples are used, via boot.samples = TRUE, then the bootstrap samples will be blocked again: if one passes cf$cf.tsboot$t[, ii] as cf, the inversion routine will call block.ts again. This cannot be what we want in this case.

removeTemporal.cf does too many things

On B55.32 we also see some thermal state (though perhaps just a constant) in the two-particle scattering:

[plot: two-particle scattering correlator]

Perhaps physically this can only be a constant shift, and trying to remove M_\pi from it would not make any sense. But even if it does make sense, we cannot do it with removeTemporal.cf, because that function can only subtract the energy difference between two fitted states, and here we only have one state.

I propose to refactor removeTemporal.cf into several smaller more focused functions:

  • A dispersion relation, which we likely already have in hadron or LuescherAnalysis.
  • Perhaps a utility which extracts the energy level and bootstrap samples from a fit.
  • A weighting function which takes a correlation function, an energy and the energy bootstrap samples.
  • Reduce removeTemporal.cf to take a correlation function, an energy, and bootstrap samples, and let it call the weighting, the shift, and the weighting again (sketched below).
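
A hedged sketch of that decomposition (the names and the exact weight are made up; takeTimeDiff.cf is assumed to do the shift):

    weight.cf <- function (cf, energy) {
        tt <- seq(0, ncol(cf$cf) - 1)
        w <- exp(energy * tt) + exp(energy * (cf$Time - tt))  # hypothetical weight
        cf$cf <- sweep(cf$cf, 2, w, `*`)
        cf
    }

    removeTemporal2.cf <- function (cf, energy) {
        cf <- weight.cf(cf, energy)    # weight
        cf <- takeTimeDiff.cf(cf)      # shift
        weight.cf(cf, -energy)         # weight back
    }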

cleanup commit in urbach/hadron breaks onlinemeas

The removal of getCor (cfunction.R) breaks some functionality. While I use this functionality regularly, maybe it would be better if I rewrote onlinemeas (or moved away from it in my analysis routines). What do you think?

issues with and questions about invertCovMatrix

I think invertCovMatrix does not reproduce hep-lat/9412087 exactly. I have been looking into the issue of the undulating effective mass plateaus and the corresponding "bad" ChiSqr of the effective mass fit and hence had to look in a bit more detail at what is going on under the hood.

  1. The eigenvalue replacement seems to not follow the prescription in the paper exactly. In particular (although I may be misunderstanding what happens in the construction of the inverse), it seems that the eigenvalues are not rescaled by the factor K.

  2. The decision to retain an eigenvalue is not taken as in the paper; instead, all n - sqrt(N) smallest eigenvalues are replaced by their average, even if some of them could be larger than the average (see the sketch after this list).

  3. I'm not sure it is reasonable to do blocking of the bootstrapped data for the construction of the inverse, as is done for the correlated effective mass fit. Also, the number of eigenvalues which are replaced has no meaning here. Rather, in my opinion, the number of measurements should be passed as a function argument and then divided by boot.l (currently it is the number of bootstrap samples divided by boot.l).

  4. Was the inverse covariance matrix ever studied in depth for B55 where we probably have enough data to build the real thing? Is the same reduction in ChiSqr seen? (compared to using the "real" inverse covariance matrix) What about fits with explicit excited state contributions and GEVPs?
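
For reference, a minimal sketch of the replacement scheme as I read the paper (points 1 and 2 above), not of hadron's implementation:

    invert_cov_replaced <- function (C, n_meas) {
        e <- eigen(C, symmetric = TRUE)   # eigenvalues sorted in decreasing order
        keep <- floor(sqrt(n_meas))       # retain only the sqrt(N) largest modes
        lambda <- e$values
        lambda[-seq_len(keep)] <- mean(lambda[-seq_len(keep)])
        e$vectors %*% diag(1 / lambda) %*% t(e$vectors)
    }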

Finally: I can see that we have a bit of an issue because we are fitting exponentially decaying data which has similarly decaying variance (although of course the relative variance grows with t) and thus an exponentially growing inverse (co-)variance. And although multiplication with the squared distances should renormalize the resulting contribution to ChiSqr (since the squared distances are themselves exponentially decaying functions of t), it seems to me that this eigenvalue replacement puts more emphasis on early timeslices. That is perhaps not a bad thing, but I find it hard to really understand the effect this has on the resulting fit.

In particular, when I have 3x170 (170 blocked measurements; ll, ls, ss) well-separated measurements and I want to fit 3x20 timeslices or so, I would expect the real inverse covariance matrix to give a good ChiSqr if the model is justified, since presumably all correlations are then taken into account. However, the ChiSqr that results when I use a "real" inverse covariance matrix as the weight is generally about a factor of 3 too large, suggesting to me that perhaps our model is wrong and we absolutely should add some excited-state contribution or just do a GEVP instead... On the other hand, using the inverse covariance matrix provided by invertCovMatrix with eigenvalue replacement results in a ChiSqr which seems to justify the model.

Comments with # and ##

In the codebase I have seen comments starting with # and with ##. What is the meaning of these two styles, and is this a general R style rule?

Get rid of warnings from zero length arrows

plotwitherror produces warnings like

Warning messages:
1: In replayPlot(x) : zero-length arrow is of indeterminate angle and so skipped

heavily. Can we get rid of those?
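
A possible fix (a sketch, not necessarily how plotwitherror should do it): draw error bars only where their length is non-zero, so arrows() never sees a zero-length arrow.

    x <- 1:10
    y <- sin(x)
    dy <- c(0, rep(0.1, 9))   # the first error bar has zero length
    plot(x, y)
    nonzero <- dy > 0
    arrows(x[nonzero], y[nonzero] - dy[nonzero],
           x[nonzero], y[nonzero] + dy[nonzero],
           angle = 90, code = 3, length = 0.02)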

discussion: correlated plateau fit is biased to 'visually' wrong value

For some reason, a correlated chisq plateau fit to the pion vector three-point function produces a fit result which is visually offset from the apparently correct result.

The covariance matrix for this (theta=0, cA2.60.24) indicates strong correlation of all time slices:

[figure: 3pt covariance matrix]

with the matrix elements distributed like

[figure: histogram of 3pt covariance matrix elements]

which results in this inverse, using 'solve' (the result is the same if the border is omitted in the inversion):

[figure: 3pt inverse covariance matrix]

with matrix elements distributed like:

[figure: histogram of 3pt inverse covariance matrix elements]

It is clear that this might be a rather problematic situation...

In hadron, the resulting invCovMatrix looks as follows (considering all time slices):

[figure: 3pt inverse covariance matrix from hadron]

[figure: histogram of 3pt inverse covariance matrix elements from hadron]

So I'm a bit unsure how to proceed from here...

Value of correlation function is redundant, inconsistencies

A correlation function has the field cf, which is a matrix containing the original samples. Then there is cf0, which contains the mean of this matrix over the samples, that is, the actual value. The bootstrap samples are stored in cf.tsboot$t, but there is also a copy of the value as cf.tsboot$t0.

The function subtract.excitedstates modifies cf, cf0, and cf.tsboot$t, but it does not modify cf.tsboot$t0 to match cf0. The function therefore breaks an unwritten invariant of the cf class.

I see two possible solutions:

  • Fix the function such that cf.tsboot$t0 is updated accordingly.
  • Remove the redundant field from the cf class everywhere.

What would make more sense?

two-state fits to principal correlators

Following Phys. Rev. D82, 034508 (2010) [section V], principal correlators can be fitted with a (single) discarded excited state by noting that the overlap factors are fully determined by the eigenvectors, the mass of the state, and the value of the correlation function at the diagonalisation time t_0.

The principal correlator of state n has amplitude 1 at t_0. As a result, one can fit the functional form

(1-A_n) e^{-m_n(t-t_0)} + A_n e^{-m^\ast_n (t - t_0) }

where m^\ast_n is an effective excited state contribution which is discarded.
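
The model above as an R function (the parameter packing is my choice):

    principal_correlator <- function (par, t, t0) {
        m <- par[1]; mstar <- par[2]; A <- par[3]
        (1 - A) * exp(-m * (t - t0)) + A * exp(-mstar * (t - t0))
    }

    principal_correlator(c(0.2, 0.8, 0.1), t = 0:24, t0 = 3)  # amplitude 1 at t = t0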

Termination reason of nls.lm opt.res$info: -1

I found this output:

Termination reason of nls.lm opt.res$info: -1

The documentation of minpack.lm::nls.lm only explains the termination reasons 0 to 9. Do you have an idea what -1 could be?

gevp default parameters are almost always wrong

The gevp function has the parameters element.order and matrix.size. These are set up such that a GEVP of size 2×2 can be solved by default. The matrix.size parameter is redundant, because the first block in the function is

  if(matrix.size^2 != length(element.order)) {
    stop("matrix.size^2 must be equal to length of element.order. Aborting...\n")
  }

One could just use matrix.size = as.integer(round(sqrt(length(element.order)))); the requirement would then be satisfied automatically and the user would not be burdened with this parameter.

Also, the correlation function has the nrObs field, which already contains this information. Therefore a better default might be element.order = 1:cf$nrObs.

Changing this will of course break existing analysis scripts. However, I just managed to do a wrong GEVP which took only the first four matrix elements and interpreted them as a 2×2 problem; luckily it was not positive definite, so I got a hard error.
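
A sketch of the proposed default (not the current gevp signature):

    derive_matrix_size <- function (element.order) {
        size <- as.integer(round(sqrt(length(element.order))))
        stopifnot(size^2 == length(element.order))
        size
    }

    derive_matrix_size(1:4)  # 2, i.e. a 2x2 GEVP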

segfault in LuescherZeta

I want to generate a plot of the w_lm versus q. For this I first calculated w_00 in the P² = 0 frame, which worked fine. Now, with various P², l, and m, I face R crashes coming from the LuescherZeta function.

[screenshot: RStudio session crash]

Before you blame RStudio: I have also tried this with a plain Rscript. The output (using pbmclapply) is this:

 *** caught segfault ***
address 0x5630fc94f000, cause 'memory not mapped'

Traceback:
 1: LuescherZeta(qsq = rel_momentum_q^2, l = spin_l, m = spin_m,     dvec = total_momentum_d_vec, gamma = boost_gamma, ...)
 2: scattering_w_lm(q, total_momentum_d_vec, grid$gamma[i], 1, grid$l[i],     grid$m[i])
 3: data.frame(spin_l = grid$l[i], spin_m = grid$m[i], boost_gamma = grid$gamma[i],     w = scattering_w_lm(q, total_momentum_d_vec, grid$gamma[i],         1, grid$l[i], grid$m[i]))
 4: FUN(X[[i]], ...)
 5: lapply(1:nrow(grid), function(i) {    print(q)    print(grid[i, ])    print(total_momentum_d_vec)    data.frame(spin_l = grid$l[i], spin_m = grid$m[i], boost_gamma = grid$gamma[i],         w = scattering_w_lm(q, total_momentum_d_vec, grid$gamma[i],             1, grid$l[i], grid$m[i]))})
 6: func(param_row, value_row)
 7: FUN(X[[i]], ...)
 8: lapply(X = S, FUN = FUN, ...)
 9: doTryCatch(return(expr), name, parentenv, handler)
10: tryCatchOne(expr, names, parentenv, handlers[[1L]])
11: tryCatchList(expr, classes, parentenv, handlers)
12: tryCatch(expr, error = function(e) {    call <- conditionCall(e)    if (!is.null(call)) {        if (identical(call[[1L]], quote(doTryCatch)))             call <- sys.call(-4L)        dcall <- deparse(call)[1L]        prefix <- paste("Error in", dcall, ": ")        LONG <- 75L        sm <- strsplit(conditionMessage(e), "\n")[[1L]]        w <- 14L + nchar(dcall, type = "w") + nchar(sm[1L], type = "w")        if (is.na(w))             w <- 14L + nchar(dcall, type = "b") + nchar(sm[1L],                 type = "b")        if (w > LONG)             prefix <- paste0(prefix, "\n  ")    }    else prefix <- "Error : "    msg <- paste0(prefix, conditionMessage(e), "\n")    .Internal(seterrmessage(msg[1L]))    if (!silent && isTRUE(getOption("show.error.messages"))) {        cat(msg, file = outFile)        .Internal(printDeferredWarnings())    }    invisible(structure(msg, class = "try-error", condition = e))})
13: try(lapply(X = S, FUN = FUN, ...), silent = TRUE)
14: sendMaster(try(lapply(X = S, FUN = FUN, ...), silent = TRUE))
15: FUN(X[[i]], ...)
16: lapply(seq_len(cores), inner.do)
17: parallel::mclapply(1:nrow(joined$param), closure)
18: pvcall(.func, rho_kinematics)
19: as.data.frame(pv$param)
20: make_summary(pvcall(.func, rho_kinematics))
An irrecoverable exception occurred. R is aborting now ...

And this also occurs with just one thread (using lapply):

# q:
 [1] 0.0000000 0.6666667 1.3333333 2.0000000 2.6666667 3.3333333 4.0000000
 [8] 4.6666667 5.3333333 6.0000000
# more parameters
  l  m gamma
1 2 -2     1
# total_momentum_d_vec
[1] 0 0 0

 *** caught segfault ***
address 0x5653abf6c000, cause 'memory not mapped'

Traceback:
 1: LuescherZeta(qsq = rel_momentum_q^2, l = spin_l, m = spin_m,     dvec = total_momentum_d_vec, gamma = boost_gamma, ...)
 2: scattering_w_lm(q, total_momentum_d_vec, grid$gamma[i], 1, grid$l[i],     grid$m[i])
 3: data.frame(spin_l = grid$l[i], spin_m = grid$m[i], boost_gamma = grid$gamma[i],     w = scattering_w_lm(q, total_momentum_d_vec, grid$gamma[i],         1, grid$l[i], grid$m[i]))
 4: FUN(X[[i]], ...)
 5: lapply(1:nrow(grid), function(i) {    print(q)    print(grid[i, ])    print(total_momentum_d_vec)    data.frame(spin_l = grid$l[i], spin_m = grid$m[i], boost_gamma = grid$gamma[i],         w = scattering_w_lm(q, total_momentum_d_vec, grid$gamma[i],             1, grid$l[i], grid$m[i]))})
 6: func(param_row, value_row)
 7: FUN(X[[i]], ...)
 8: lapply(1:nrow(joined$param), closure)
 9: pvcall(.func, rho_kinematics, serial = TRUE)
10: as.data.frame(pv$param)
11: make_summary(pvcall(.func, rho_kinematics, serial = TRUE))
An irrecoverable exception occurred. R is aborting now ...

Likely I am calling the function incorrectly, since it at least does not crash for any of the earlier inputs. Perhaps we can add more precondition checks to the function to make it safer to use?

Plot routine error looks wrong

On the jackknife-as-bootstrap-plus-nlsfit branch the errors in the plots seem to be wrong. On B55.32 I see only six timeslices significantly different from zero. Looking into the data, I think there is a bug in the error calculation of matrixfit and effectivemass. Probably it was introduced with the cov_fn.

Do you have some ideas what might be happening there?

matrixModel and subtract.excitedstates are incompatible

The function subtract.excitedstates in R/matrixfit.R calls matrixModel with five parameters, whereas that function requires a sixth parameter, ov.sign.vec. This overall sign seems to have been added later without a default value.

How should this be resolved? By adding a default value of +1 in matrixModel and leaving subtract.excitedstates unchanged, or by exposing the argument in subtract.excitedstates and letting the user decide whether it needs to be passed?

Can one fit a negative correlator with matrixfit anyway? If not, then setting the default argument to +1 seems to be the easiest solution.

Matrixfit: fit.formatrixboot, optim does not converge

@urbach
For some reason, even after updating to your latest master, optim in fit.formatrixboot does not converge and hence the errors on the parameters that I get from the bootstrap samples are gibberish.

 ** Result of one state exponential fit **

based on 138 measurements
time range from 10  to  23 
mass:
m   =    0.12241 
dm  =    9.15615 

Amplitudes:
P 1     =    1.965518 
dP 1    =    182.3369 
P 2     =    2.525797 
dP 2    =    217.474 

boot.R  =    400 
boot.l  =    1 
useCov  =    FALSE 
chisqr  =    6.356022 
dof =    53 
chisqr/dof=  0.1199249 
Quality of the fit (p-value): 1 

The original optimization converges just fine. I've checked that the input correlators look perfectly fine when plotted, and the initial parameters passed to the function are the results of the previous fit. For testing I disabled the gradient function, but this made no difference. I'm a bit at a loss at this point...

Publishing hadron on CRAN

Via email there was the idea to publish hadron on CRAN. I am not sure whether the optional dependency on rhdf5 from the Bioconductor repository makes this impossible. Once we have figured that out, we can discuss further.


Collection of tasks:

  • The license is stated as GPL 3 or later, which is not automatically recognized. We should either just write GPL-3 and state in some accompanying text that later versions are fine, or live with this.
  • Using library within package code is something that should not be done, because it alters the search path for the user of the package. It is recommended to instead use requireNamespace('foo') and then call foo::bar (see the sketch after this list).
  • The dependency on rhdf5 is not declared in the DESCRIPTION. I had tried to fix that, but then Travis CI would fail. The dependency should be in there, though.
  • There is boot:::ts.array somewhere in the code, which is a no-go: we must not depend on implementation details that are not exported.
  • Fix warnings about probable errors.
  • Fix warnings about S3 naming inconsistencies.
  • Fix inconsistencies between function signatures and their hand-written documentation pages.
  • Document all parameters of already documented functions.
  • Document all functions.
  • Document all data sets.
  • The compilation flags in PKG_CFLAGS, -O3 -Wall, are not portable. This needs to be changed; perhaps R already passes sufficient optimization flags such that this is not needed?
  • The hadron.so exports some symbols from the standard library; this is likely a mistake and needs fixing.
  • Use only ASCII characters in R code.
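
As a sketch of the requireNamespace pattern mentioned above (the wrapper function is hypothetical):

    read_h5_names <- function (file) {
        if (!requireNamespace('rhdf5', quietly = TRUE)) {
            stop('The optional package rhdf5 is needed for HDF5 support.')
        }
        rhdf5::h5ls(file)$name
    }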

Please specify license version

The file DESCRIPTION states that the license is GPL, but it does not state which version. So it could be any of

  • GPL 2
  • GPL 2 or later
  • GPL 3
  • GPL 3 or later

I'd like to make a Fedora package from this R package, but I need to know the exact license in order to package it correctly.

unify bootstrap procedures

I think we should unify all bootstrap methods. Currently it's a bit of a mess, and it's never clear what the bootstrap data field is called. Maybe we should have $t and $t0 everywhere, like in the boot package?

Move to list-based intermediate format (medium- to long-term)

I have recently started moving a substantial part of my own scripts to a list-based container format. This has the substantial benefit that one can use mclapply to significantly speed up analyses as well as the reading of files using multiple processes. Since the work is pre-scheduled, the ordering of the output list is as expected.

I have taken the approach of storing each bootstrap sample (of a correlator, say) as a list element, and one could think about storing our bootstrap samples in the same way. Then the following works:

temp <- mclapply(X = data, FUN = function (x) { fit(x, ...) })

The result can then relatively easily be transformed into an array by using unlist(temp) as the data for an array with specific dimensions. Sometimes one has to transpose in addition.
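
A minimal, self-contained sketch of this pattern (the sample data and dimensions are made up):

    library(parallel)

    ## One bootstrap sample per list element, fitted in parallel.
    data <- lapply(seq_len(400), function (r) rnorm(48))
    temp <- mclapply(X = data, FUN = function (x) c(mean(x), sd(x)), mc.cores = 2)

    ## Flatten the list and reshape into a boot.R x n_par matrix.
    result <- matrix(unlist(temp), ncol = 2, byrow = TRUE)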

Doing a correlated fit using matrixfit

Hi,

I always get the error

"[matrixfit] inversion of variance covariance matrix failed!\n"

when I try to use matrixfit with the option useCov=TRUE, actually independently of the fit range.

Could it be that the parameter boot.l was not completely removed? Then it might help to use

try(invertCovMatrix(cf$cf.tsboot$t[, ii], boot.l = cf$boot.l, boot.samples = TRUE), silent = TRUE)

instead of

try(invertCovMatrix(cf$cf.tsboot$t[, ii], boot.l = boot.l, boot.samples = TRUE), silent = TRUE)

Cheers

Feri

Merge fork from urbach

Is there a reason Carsten's fork hasn't been merged into the etmc hadron code? It looks like the merge would be a fast-forward, and at least two new people have cloned the four-year-old version of the code.

symmetrise.cf is not used in readtextcf

In readtextcf I noticed that the symmetrization is done “by hand”, and I thought that we did not have a function for symmetrization. Just now I found symmetrise.cf; it should be used in the IO functions.
