
babette's Introduction

rOpenSci

Project Status: Abandoned

This repository has been archived. The former README is now in README-NOT.md.

babette's People

Contributors

giappo, jeroen, kant, richelbilderbeek


babette's Issues

Process last things

From here:

Approved! Thanks @bjoelle and @dwinter for your thorough reviews and follow-ups, and @richelbilderbeek for your diligent work. It's been a long road and a lot of code to go through and I'm glad to have this suite as part of rOpenSci.

To-dos:

  • Wrap up richelbilderbeek/babette#38

  • Transfer the repos to rOpenSci's "ropensci" GitHub organization under "Settings" in your repo. I have invited you to a team that should allow you to do so. You'll be made admin once you do.

  • Add the rOpenSci badge to top and footer to the bottom of your READMEs
    " [![Peer Review Status](https://badges.ropensci.org/209_status.svg)](https://github.com/ropensci/onboarding/issues/209)"
    " [![ropensci_footer](https://ropensci.org/public_images/ropensci_footer.png)](https://ropensci.org)"

  • Fix any links in badges for CI and coverage to point to the ropensci URL. We no longer transfer Appveyor projects to ropensci Appveyor account so after transfer of your repo to rOpenSci's "ropensci" GitHub organization the badge should be [![AppVeyor Build Status](https://ci.appveyor.com/api/projects/status/github/ropensci/pkgname?branch=master&svg=true)](https://ci.appveyor.com/project/individualaccount/pkgname).

  • We're starting to roll out software metadata files to all ropensci packages via the Codemeta initiative. See https://github.com/ropensci/codemetar/#codemetar for how to include it in your package. After installing the package, it should be as easy as running codemetar::write_codemeta() in the root of your package.

  • Should you want to acknowledge your reviewers in your package DESCRIPTION, you can do so by making them "rev"-type contributors in the Authors@R field (with their consent). More info on this here.

  • Welcome aboard! We'd also love a blog post about your packages, either a short-form intro to them (https://ropensci.org/tech-notes/) or a long-form post with more narrative about their development (https://ropensci.org/blog/). If you are interested, @stefaniebutland will be in touch about content and timing.

  • We've started putting together a gitbook with our best practices and tips; this chapter starts the 3rd section, which is about guidance for after onboarding. Please tell us what could be improved, the corresponding repo is here.

Feature request: shifted gamma distribution

From @dwinter:

I don't think it is possible to set a shifted gamma distribution (i.e. a gamma distribution with an additional "offset" value) yet. I didn't check the others, but these shifted distributions are often helpful for setting node dates.
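
To make the request concrete: a shifted gamma distribution is an ordinary gamma distribution moved along the x-axis by an offset, so its density is zero below the offset. A minimal sketch in base R, not babette code; the function name dshifted_gamma and its parameter names are hypothetical:

```r
# Density of a gamma distribution shifted by 'offset'.
# Hypothetical helper for illustration; not part of babette or beautier.
dshifted_gamma <- function(x, shape, rate, offset = 0) {
  # stats::dgamma returns 0 for negative arguments,
  # so the density is 0 for all x below the offset
  stats::dgamma(x - offset, shape = shape, rate = rate)
}

# A node-age prior that puts no density on ages younger than 10
ages <- c(5, 10.5, 12, 20)
densities <- dshifted_gamma(ages, shape = 2, rate = 1, offset = 10)
print(densities)
```

How such an offset would be exposed in a babette distribution object is left open here.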

Thank rOpenSci reviewers in DESCRIPTION

From dwright I found here the best way to thank rOpenSci reviewers:

Example:

person("Bea", "Hernández", role = "rev",
       comment = "Bea reviewed the package for rOpenSci, see 
                  https://github.com/ropensci/onboarding/issues/116")

Investigate report Axel Hille

# babette


library(devtools)
#devtools::install_github("richelbilderbeek/beautier")
#devtools::install_github("richelbilderbeek/tracerer")
#devtools::install_github("richelbilderbeek/beastier")
#devtools::install_github("richelbilderbeek/mauricer")
#devtools::install_github("richelbilderbeek/babette")




library(beautier)
library(tracerer)

library(beastier)
install_beast2()

library(mauricer)
library(babette)



session_info()
#Session info --------------------------------------------------------------------------------------------------------------
#setting  value
#version  R version 3.4.1 (2017-06-30)
#system   x86_64, linux-gnu
#ui       RStudio (1.1.419)
#language (EN)
#collate  de_DE.UTF-8
#tz       Europe/Berlin
#date     2018-12-24

#Packages ------------------------------------------------------------------------------------------------------------------
#  package     * version date       source
#babette     * 1.3     2018-12-24 Github (richelbilderbeek/babette@4d28d36)
#base        * 3.4.1   2017-07-08 local
#beastier    * 1.5.2   2018-12-24 Github (richelbilderbeek/beastier@bfcd6ef)
#beautier    * 1.15    2018-12-24 Github (richelbilderbeek/beautier@949c01a)
#compiler      3.4.1   2017-07-08 local
#curl          3.2     2018-03-28 cran (@3.2)
#datasets    * 3.4.1   2017-07-08 local
#devtools    * 1.13.3  2017-08-02 CRAN (R 3.4.1)
#digest        0.6.18  2018-10-10 cran (@0.6.18)
#git2r         0.19.0  2017-07-19 CRAN (R 3.4.1)
#graphics    * 3.4.1   2017-07-08 local
#grDevices   * 3.4.1   2017-07-08 local
#httr          1.3.1   2017-08-20 CRAN (R 3.4.1)
#knitr         1.17    2017-08-10 CRAN (R 3.4.1)
#mauricer    * 1.1.1   2018-12-24 Github (richelbilderbeek/mauricer@3cd6bd7)
#memoise       1.1.0   2017-04-21 CRAN (R 3.4.1)
#methods     * 3.4.1   2017-07-08 local
#R.cache       0.13.0  2018-01-04 CRAN (R 3.4.1)
#R.methodsS3   1.7.1   2016-02-16 cran (@1.7.1)
#R.oo          1.21.0  2016-11-01 cran (@1.21.0)
#R.rsp         0.42.0  2018-01-10 CRAN (R 3.4.1)
#R.utils       2.6.0   2017-11-05 cran (@2.6.0)
#R6            2.3.0   2018-10-04 cran (@2.3.0)
#rappdirs      0.3.1   2016-03-28 cran (@0.3.1)
#Rcpp          1.0.0   2018-11-07 cran (@1.0.0)
#rstudioapi    0.6     2016-06-27 CRAN (R 3.4.1)
#stats       * 3.4.1   2017-07-08 local
#testit        0.9     2018-12-05 cran (@0.9)
#tools         3.4.1   2017-07-08 local
#tracerer    * 1.5.2   2018-12-24 Github (richelbilderbeek/tracerer@f345fc3)
#utils       * 3.4.1   2017-07-08 local
#withr         2.1.2   2018-03-15 CRAN (R 3.4.1)
#yaml          2.1.16  2017-12-12 cran (@2.1.16)


#https://github.com/richelbilderbeek/babette/blob/master/vignettes/tutorial.R

require(babette)

## ------------------------------------------------------------------------
fasta_filename <- get_babette_path("anthus_aco.fas")
testit::assert(file.exists(fasta_filename))

## ------------------------------------------------------------------------
mcmc <- create_mcmc(chain_length = 2000, store_every = 1000)

## ------------------------------------------------------------------------
site_model <- create_site_model_jc69()
site_model <- create_jc69_site_model()

## ------------------------------------------------------------------------
clock_model <- create_clock_model_strict()
clock_model <- create_strict_clock_model()

## ------------------------------------------------------------------------
tree_prior <- create_tree_prior_yule()
tree_prior <- create_yule_tree_prior()

## ------------------------------------------------------------------------
mrca_prior <- create_mrca_prior(
  alignment_id = get_alignment_id(fasta_filename = fasta_filename),
  taxa_names = get_taxa_names(filename = fasta_filename)[1:2],
  is_monophyletic = TRUE
)

## ------------------------------------------------------------------------
mrca_distr <- create_normal_distr(
  mean = 15.0,
  sigma = 1.0
)

## ------------------------------------------------------------------------
mrca_prior <- create_mrca_prior(
  alignment_id = get_alignment_id(fasta_filename = fasta_filename),
  taxa_names = get_taxa_names(filename = fasta_filename),
  mrca_distr = mrca_distr
)

## ------------------------------------------------------------------------
if (1 == 2) {
  beast2_input_filename <- "beast_input.xml"
  beast2_output_log_filename <- "beast_output.log"
  beast2_output_trees_filenames <- "beast_output.trees"
  beast2_output_state_filename <- "beast_state.xml.state"
  all_files <- c(
    beast2_input_filename,
    beast2_output_log_filename,
    beast2_output_trees_filenames,
    beast2_output_state_filename
  )

  out <- bbt_run(
    fasta_filename = fasta_filename,
    mcmc = mcmc,
    beast2_input_filename = beast2_input_filename,
    beast2_output_log_filename = beast2_output_log_filename,
    beast2_output_trees_filenames = beast2_output_trees_filenames,
    beast2_output_state_filename = beast2_output_state_filename
  )
  testit::assert(all(file.exists(all_files)))
  file.remove(all_files)
}

## ------------------------------------------------------------------------
beast2_path <- beastier::get_default_beast2_path()
print(beast2_path)

## ------------------------------------------------------------------------
if (file.exists(beast2_path)) {
  out <- bbt_run(
    fasta_filename = fasta_filename,
    mcmc = mcmc,
    beast2_path = beast2_path
  )
}

#Error in check_input_filename_validity(input_filename = input_filename,  :
#'input_filename' must be a valid BEAST2 XML file. File '/tmp/RtmpAYPTAj/beast2_2eaf0e5f70.xml' is not a valid BEAST2 file. FALSE
#In addition: Warning messages:
#1: running command ''/usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java' -jar "/home/axel/.local/share/beast/lib/beast.jar" -validate "/tmp/RtmpAYPTAj/beast2_2eaf0e5f70.xml" 2>&1' had status 1
#2: running command ''/usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java' -jar "/home/axel/.local/share/beast/lib/beast.jar" -validate "/tmp/RtmpAYPTAj/beast2_2eaf0e5f70.xml" 2>&1' had status 1

Request: TPM site model

      Model        f(a)   f(c)   f(g)   f(t)   kappa  titv   Ra     Rb     Rc     Rd     Re     Rf     pInv   gamma
-------------------------------------------------------------------------------------------------------------------
AIC   TPM2uf+I+G   0.41   0.20   0.13   0.25   0.00   0.00   0.488  7.746  0.488  1.000  7.746  1.000  0.61   3.08

So now I would like to use a TPM2uf+I+G model for my COI alignment.
I did not know this model yet (previously I simply used a general GTR model), but I found this about it: http://www.iqtree.org/doc/Substitution-Models.
Typical of this model is that certain frequencies are set equal to each other: AC=AT, AG=CT, CG=GT and equal base freq.
Is it possible to do this using babette? Or in BEAUti at all, as far as you know?
Do I start with "create_gtr_site_model()" and then specify the details of the substitution frequencies there?
I assume that if I do that by simply copying the values from the bold row above, these are only starting values for the MCMC run?

ESS for combined chains correct?

From @thijsjanzen:

Babette is really very handy for calculating the ESS, and also seems to be faster than Tracer. What I did notice is that the ESS values match Tracer's exactly, except for 'combined'.

I often run 10 independent BEAST chains and then concatenate them into one large chain. That way you can be more certain that you are not stuck in a local optimum. To check that the chains all converge to the same result, it is useful to check the ESS of the combined chain. However, I noticed that these ESS values do not match those of Tracer.

Ideally I would of course provide a use case, but the log files are a bit too large to demonstrate here. I suspect you can easily replicate this situation yourself.
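
The concatenation step described above can be sketched in base R. This only shows one way to build a combined chain; it does not compute an ESS, and it is not how babette, tracerer or Tracer do it. The helper name combine_chains and the burn_in_fraction parameter are hypothetical, and the mock data frames stand in for parsed BEAST2 log files:

```r
# Concatenate the post-burn-in samples of several independent chains.
# Hypothetical helper for illustration; not part of babette or tracerer.
combine_chains <- function(chains, burn_in_fraction = 0.1) {
  trimmed <- lapply(chains, function(chain) {
    n_burn_in <- floor(burn_in_fraction * nrow(chain))
    # Drop the first rows of each chain before combining
    chain[seq_len(nrow(chain)) > n_burn_in, , drop = FALSE]
  })
  do.call(rbind, trimmed)
}

# Two mock chains standing in for parsed BEAST2 log files
set.seed(42)
chain_1 <- data.frame(Sample = seq(0, 9000, by = 1000), posterior = rnorm(10))
chain_2 <- data.frame(Sample = seq(0, 9000, by = 1000), posterior = rnorm(10))
combined <- combine_chains(list(chain_1, chain_2), burn_in_fraction = 0.1)
print(nrow(combined))
```

Differences in how the burn-in is removed before combining are one possible source of differing ESS values, so any comparison should state that choice explicitly.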

Feature request: multiple alignments

Moved from there:

Feedback dwinter:

As far as I can tell it is not possible to estimate one tree from multiple alignments. (i.e. one tree prior, multiple alignments possibly each with its own site model). I suspect this is a common use-case (either specified this way or through a concatenated alignment with partitions). If I have this wrong then a partitioned dataset should be added to the examples vignette. If this is not currently possible it should be a priority for future development

Trees returned by tracerer have names

test_that("use, one alignment, plot with nLTT", {

  skip("Expose bug")
  out <- run(
    fasta_filenames = get_babette_path("anthus_aco.fas"),
    mcmc = create_mcmc(chain_length = 1000, store_every = 1000)
  )
  testit::assert(
    all(grepl(pattern = "STATE_", x = names(out$anthus_aco_trees)) == FALSE)
  )
  testthat::expect_silent(
    nLTT::nltts_plot(out$anthus_aco_trees)
  )

})

Instead, the names should be stripped to have a pure multiPhylo.
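
A minimal sketch of the stripping step in base R. The helper name strip_tree_names is hypothetical, and the mock object only imitates the structure of a named multiPhylo; real elements would be ape phylo objects:

```r
# Remove element names such as "STATE_0" from a list of trees,
# keeping the class attribute intact.
# Hypothetical helper for illustration; not part of babette or tracerer.
strip_tree_names <- function(trees) {
  unname(trees)
}

# Mock object standing in for the trees returned by a run;
# real elements would be ape 'phylo' objects
trees <- structure(
  list(STATE_0 = "tree 1", STATE_1000 = "tree 2"),
  class = "multiPhylo"
)
stripped <- strip_tree_names(trees)
print(is.null(names(stripped)))
```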

Change babette logo

From Huw Ogilvie, one of the article reviewers:

Continuing the grand tradition of picking alliterative puns for BEAST program names, the authors have gone with "babette". I guess that this is in reference to one of the maids that cleans up after The Beast in the famous fairy tale. I think the name is logical and clever. However, in the older Disney animated versions of the fairy tale the maid is a "sexy maid" stereotype, and the authors seem to have directly copied the babette logo from the animation. The authors should reconsider this, not only because of the notorious litigiousness of Disney lawyers, but also because the optics of using a "sexy maid" logo for scientific software are bad, especially in 2018, especially from two male authors. Sorry if this point gets interpreted as not having a sense of humor, but I think it should be addressed.

Will change logo, sent out email to research group for contributions.

standard deviation must be positive

[65] "Error 110 parsing the xml input file"                                            
[66] ""                                                                                
[67] "validate and intialize error: standard deviation must be positive (-4.554)"      
[68] ""                                                                                
[69] "Error detected about here:"                                                      
[70] "  <beast>"                                                                       
[71] "      <run id='mcmc' spec='MCMC'>"                                               
[72] "          <distribution id='posterior' spec='util.CompoundDistribution'>"        
[73] "              <distribution id='prior' spec='util.CompoundDistribution'>"        
[74] "                  <prior id='GammaShapePrior.s:anthus_aco' name='distribution'>" 
[75] "                      <LogNormal id='LogNormalDistributionModel.0' name='distr'>"

Invalid BEAST2 file when using RLN and MRCA prior with MRCA distribution

Relaxed lognormal clock model does not work properly

# Reproduce the bug

# Load data
fasta_filename <- "primates_long.fas"
testit::assert(file.exists(fasta_filename))

# Create MCMC chain
sample_interval <- 1000
mcmc <- create_mcmc(chain_length = 2000, store_every = sample_interval)

# Other priors
site_model <- create_site_model_jc69()
clock_model <- create_clock_model_rln()
tree_prior <- create_tree_prior_yule()

# Set MRCA prior
mrca_distr <- create_normal_distr(mean = 15.0, sigma = 1.0)
mrca_prior <- create_mrca_prior(
  alignment_id = get_alignment_id(fasta_filename = fasta_filename),
  taxa_names = get_taxa_names(filename = fasta_filename),
  mrca_distr = mrca_distr
)

# Run babette
out <- bbt_run(
  fasta_filenames = fasta_filename,
  mcmc = mcmc,
  site_models = site_model,
  clock_models = clock_model,
  tree_priors = tree_prior,
  mrca_priors = mrca_prior,
  verbose = FALSE
)

A nested-sampling run should only do a nested-sampling run

I misunderstood the software machinery of how the Nested-Sampling marginal likelihood estimation worked:

  • A normal run returns a posterior with trees and model parameter estimates
  • A nested-sampling run returns a posterior with trees and nested sampling estimates

This should be separated in the documentation and code.

BEAUti reports an error with two alignments option from babette: "IDs should be unique"

Dear Richel,

I created a BEAST2 input file as follows:
create_beast2_input_file(input_filenames = c(input_Plectostoma_COI,input_Plectostoma_ITS1),
output_filename = output_Plectostoma_all_markers_combined)

No error message.

When I open it in BEAUti, I get this error message, suggesting a duplicate sequence name. But this 'duplicate' comes from two different alignments.

[image: screenshot of the BEAUti error message]

What can I have done wrong?
Thanks!

beta must be positive

[65] "Error 110 parsing the xml input file"                                           
[66] ""                                                                               
[67] "validate and intialize error: beta must be positive (-87.108)"                  
[68] ""                                                                               
[69] "Error detected about here:"                                                     
[70] "  <beast>"                                                                      
[71] "      <run id='mcmc' spec='MCMC'>"                                              
[72] "          <distribution id='posterior' spec='util.CompoundDistribution'>"       
[73] "              <distribution id='prior' spec='util.CompoundDistribution'>"       
[74] "                  <prior id='GammaShapePrior.s:anthus_aco' name='distribution'>"
[75] "                      <Gamma id='Gamma.0' name='distr'>"       

alpha must be positive

[65] "Error 110 parsing the xml input file"                                           
[66] ""                                                                               
[67] "validate and intialize error: alpha must be positive (-9.486)"                  
[68] ""                                                                               
[69] "Error detected about here:"                                                     
[70] "  <beast>"                                                                      
[71] "      <run id='mcmc' spec='MCMC'>"                                              
[72] "          <distribution id='posterior' spec='util.CompoundDistribution'>"       
[73] "              <distribution id='prior' spec='util.CompoundDistribution'>"       
[74] "                  <prior id='GammaShapePrior.s:anthus_aco' name='distribution'>"
[75] "                      <Gamma id='Gamma.0' name='distr'>"                     

Add overwrite parameter to bbt_run

From here:

One additional thing I would suggest would be to include the overwrite parameter in babette::bbt_run (it's only present in beastier::run_beast2 at present), and to set it to FALSE by default (as it is in BEAST2).
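
The suggested behaviour could look roughly like the check below, written in base R. This is a sketch of the idea only; check_can_write is a hypothetical name, and the actual handling in beastier::run_beast2 may differ:

```r
# Stop before a run if an output file exists and overwriting is not allowed.
# Hypothetical helper for illustration; not the babette or beastier code.
check_can_write <- function(output_filenames, overwrite = FALSE) {
  existing <- output_filenames[file.exists(output_filenames)]
  if (length(existing) > 0 && !overwrite) {
    stop(
      "Output file(s) already exist and 'overwrite' is FALSE: ",
      paste(existing, collapse = ", ")
    )
  }
  invisible(TRUE)
}

# An existing file is refused by default, accepted with overwrite = TRUE
log_filename <- tempfile(fileext = ".log")
file.create(log_filename)
refused <- inherits(try(check_can_write(log_filename), silent = TRUE), "try-error")
allowed <- check_can_write(log_filename, overwrite = TRUE)
file.remove(log_filename)
```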

Process last things mauricer

From #48:

  • Transfer the repos to rOpenSci's "ropensci" GitHub organization under "Settings" in your repo. I have invited you to a team that should allow you to do so. You'll be made admin once you do.

  • Add the rOpenSci badge to top and footer to the bottom of your READMEs
    " [![Peer Review Status](https://badges.ropensci.org/209_status.svg)](https://github.com/ropensci/onboarding/issues/209)"
    " [![ropensci_footer](https://ropensci.org/public_images/ropensci_footer.png)](https://ropensci.org)"

  • Fix any links in badges for CI and coverage to point to the ropensci URL. We no longer transfer Appveyor projects to ropensci Appveyor account so after transfer of your repo to rOpenSci's "ropensci" GitHub organization the badge should be [![AppVeyor Build Status](https://ci.appveyor.com/api/projects/status/github/ropensci/pkgname?branch=master&svg=true)](https://ci.appveyor.com/project/individualaccount/pkgname).

  • We're starting to roll out software metadata files to all ropensci packages via the Codemeta initiative. See https://github.com/ropensci/codemetar/#codemetar for how to include it in your package. After installing the package, it should be as easy as running codemetar::write_codemeta() in the root of your package.

  • Should you want to acknowledge your reviewers in your package DESCRIPTION, you can do so by making them "rev"-type contributors in the Authors@R field (with their consent). More info on this here.

As mauricer has not been reviewed, I cannot add names to DESCRIPTION.

Problem with writing permissions on CentOS

Hi,

Thank you for your effort, it is a great initiative. I would love to try how babette can fit in some automated pipeline.

I was just trying out babette with some example data, but I got the following issue:

library(babette)

fasta_name <- "anthus_aco.fasta"

mrca_priors <- create_mrca_prior(
  taxa_names = get_taxa_names(fasta_name),
  alignment_id = get_alignment_id(fasta_name),
  mrca_distr = create_normal_distr(mean = 15.0, sigma = 0.025)
)
 
mcmc <- create_mcmc(chain_length = 2000, store_every = 1000)

out <- bbt_run(fasta_name, mcmc = mcmc, mrca_priors = mrca_priors)
Error: file.exists(output_log_filename) is not TRUE
In addition: Warning messages:
1: In file.rename(from = from, to = output_log_filename) :
  cannot rename file 'anthus_aco.log' to '/tmp/RtmpI5Egd6/beast2_258b458753999log', reason 'Invalid cross-device link'
2: In file.rename(from = from, to = to) :
  cannot rename file 'anthus_aco.trees' to '/tmp/RtmpI5Egd6/beast2_anthus_aco_258b42309a56a.trees', reason 'Invalid cross-device link'

I could not find any workaround for this issue. Do you have any idea how to solve it?

Thank you in advance!

Best,
Gergo
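
The 'Invalid cross-device link' warning above is what file.rename reports when the source and destination are on different file systems, as a rename cannot cross that boundary. One possible workaround, sketched in base R under the assumption that copying and then deleting the file is acceptable; move_file is a hypothetical name, not a babette function:

```r
# Move a file, falling back to copy-and-delete when renaming fails,
# for example when 'from' and 'to' are on different file systems.
# Hypothetical helper for illustration; not part of babette.
move_file <- function(from, to) {
  renamed <- suppressWarnings(file.rename(from = from, to = to))
  if (!renamed) {
    copied <- file.copy(from = from, to = to, overwrite = TRUE)
    if (!copied) stop("Cannot move '", from, "' to '", to, "'")
    file.remove(from)
  }
  invisible(file.exists(to))
}

# Usage with two temporary files
from <- tempfile(fileext = ".log")
to <- tempfile(fileext = ".log")
writeLines("Sample\tposterior", from)
moved <- move_file(from, to)
```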

Detect if babette is run on CentOS

This is a problem with CentOS and the temporary files babette creates, as reported in #31. AFAIK, there is no continuous integration system to create a CentOS environment. As long as I cannot support CentOS, it would be helpful if the users would know there will be problems.
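
One way such a detection might work is to read the ID field of /etc/os-release. This is a sketch under the assumption that the file exists and follows the os-release format, which may not hold on older CentOS releases; is_centos is a hypothetical name and the warning text is only an example:

```r
# Guess whether the operating system is CentOS by reading the 'ID' field
# of an os-release file. Hypothetical helper; not part of babette.
is_centos <- function(os_release_filename = "/etc/os-release") {
  if (!file.exists(os_release_filename)) return(FALSE)
  lines <- readLines(os_release_filename, warn = FALSE)
  # The field looks like ID="centos" or ID=centos
  any(grepl("^ID=\"?centos\"?$", lines))
}

if (is_centos()) {
  warning("babette may fail to move its temporary files on CentOS")
}
```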

How to work on multiple subsets/partitions?

From Axel Hille:

I have another issue, it is more informal concerning the use of babette. How can I use alignments which are split into various subsets == partitions, for example split genes into introns and exons? In BEAST you have the possibility to load a nexus-file, where the partitions are defined in the "BEGIN CODONS;" - block. I wonder if the "deprecated two and more alignments" solution in babette was for this issue?
Any hints are appreciated.

Better reporting of using binary BEAST2 exe when using Nested Sampling

When one uses a Nested Sampling setup with the BEAST2 jar file, one gets this error:

 'input_filename' must be a valid BEAST2 XML file. File '/tmp/RtmpfHeHcJ/beast2_138222bbda78.xml' is not a valid BEAST2 file. FALSE
   Execution halted

It should be something like:

When using Nested Sampling, one must use the binary BEAST2 executable
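
A sketch of how such a check could be placed before the XML validation, in base R. The function name, its arguments and the test on the ".jar" extension are assumptions made for illustration; this is not the babette or beastier implementation:

```r
# Give a specific error when Nested Sampling is combined with a jar file.
# Hypothetical helper for illustration; not the babette or beastier code.
check_beast2_path_for_ns <- function(beast2_path, uses_nested_sampling) {
  is_jar <- grepl("\\.jar$", beast2_path)
  if (uses_nested_sampling && is_jar) {
    stop("When using Nested Sampling, one must use the binary BEAST2 executable")
  }
  invisible(TRUE)
}

# Example paths are placeholders
result <- try(
  check_beast2_path_for_ns(
    beast2_path = "path/to/beast.jar",
    uses_nested_sampling = TRUE
  ),
  silent = TRUE
)
```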

Upper value should be higher than lower value

[65] "Error 110 parsing the xml input file"                                           
[66] ""                                                                               
[67] "validate and intialize error: Upper value should be higher than lower value"    
[68] ""                                                                               
[69] "Error detected about here:"                                                     
[70] "  <beast>"                                                                      
[71] "      <run id='mcmc' spec='MCMC'>"                                              
[72] "          <distribution id='posterior' spec='util.CompoundDistribution'>"       
[73] "              <distribution id='prior' spec='util.CompoundDistribution'>"       
[74] "                  <prior id='GammaShapePrior.s:anthus_aco' name='distribution'>"
[75] "                      <Uniform id='Uniform.0' name='distr'>"                    
