
siccuracy's People

Contributors

stefanedwards

siccuracy's Issues

Make cbind SNP chip

New name for rowconcatenate: cbind_SNP.

Do not use cbind.SNP as the name: R's S3 method dispatch would then call this function whenever cbind is applied to an object of class 'SNP'.
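The naming problem can be reproduced in a few lines of base R. This is only a sketch: both function bodies are placeholders and do not reflect what rowconcatenate actually does.

```r
# Placeholder bodies; only the function names matter for this illustration.
cbind.SNP <- function(...) "cbind.SNP was dispatched"
cbind_SNP <- function(...) "cbind_SNP was called explicitly"

# Any object carrying the class attribute 'SNP' triggers the dispatch.
x <- structure(list(), class = "SNP")

via_generic <- cbind(x)      # cbind() finds cbind.SNP through S3 dispatch
via_name    <- cbind_SNP(x)  # the underscore name is never picked up by dispatch
```

With the underscore name, cbind() on an 'SNP' object would fall through to the default behaviour, and the file-based function only runs when called by name.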

Reading integers vs reals, and writing integers vs reals

I have noticed the following:

1: Writing with an integer format descriptor (Iw) does not work if the variable is a real.

2: Fortran raises an error when asked to read a real-formatted number into an integer, so that offers no workaround.

Integer to integer is easy, and so is real to real. But what do we do about crossovers, i.e. reading integers and outputting reals (easy), and reading reals and outputting integers?

Add correct call % to imputation accuracy

imputation_accuracy should also count correctly called genotypes (column-wise and row-wise), as well as entries where the genotype is missing in true, in imputed, or in both.
Allow a tolerance for the comparison (e.g. tol = 0.1), so that genotypes can be compared with gene dosages.

Update return value of imputation_accuracy to have:

List of 2
 $ snps : data.frame(means, sds, cors, correct, true.na, imputed.na, both.na)
 $ samples : data.frame(rowID, cors, correct, true.na, imputed.na, both.na)
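The counting part of this proposal could look roughly like the sketch below. It works on in-memory matrices rather than files and covers only the correct, true.na, imputed.na and both.na columns. The helper name count_calls is hypothetical, and treating true.na and imputed.na as excluding entries missing in both is an assumption.

```r
# Hypothetical helper, not the package's implementation.
# true, imputed: numeric matrices of equal dimensions (rows = samples, columns = SNPs).
count_calls <- function(true, imputed, tol = 0.1) {
  true.na    <- is.na(true)
  imputed.na <- is.na(imputed)
  # A call is correct when both values are present and agree within `tol`.
  correct <- (!true.na) & (!imputed.na) & (abs(true - imputed) <= tol)
  # Apply the same tally column-wise (SNPs) or row-wise (samples).
  tally <- function(f) data.frame(
    correct    = f(correct),
    true.na    = f(true.na & !imputed.na),   # missing in true only (assumption)
    imputed.na = f(imputed.na & !true.na),   # missing in imputed only (assumption)
    both.na    = f(true.na & imputed.na)
  )
  list(snps = tally(colSums), samples = tally(rowSums))
}

true    <- matrix(c(0, 1, 2, NA, 1, 2), nrow = 2, byrow = TRUE)
imputed <- matrix(c(0.05, 1, 0, 1, NA, 1.95), nrow = 2, byrow = TRUE)
res <- count_calls(true, imputed, tol = 0.1)
```

Dividing the correct column by the number of non-missing comparisons would give the correct call percentage.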

Make convert_plinkA

Use plink --bfile <name stem> --recode A to recode a plink binary file to a text file, coding genotypes as 0, 1, and 2. Two issues exist:

  1. The recoded file has a leading line with column names.
  2. The recoded file has six leading columns: family ID, sample ID, paternal ID, maternal ID, sex, and phenotype. Columns 3-6 need to be stripped. Columns 1-2 need to be converted to an integer ID.

Options:

  • Provide arguments that control how familyID and sampleID are converted into an integer.
  • Decide the conversion automatically: if familyID is the same throughout, drop it. If sampleID is the same throughout, drop it. If familyID == sampleID, drop one. Otherwise combine the two, e.g. with familyIDs as thousands and sampleIDs as units; count the maximum number of sampleIDs per familyID to determine the minimum radix for familyID.

Return:

  • A list with n, the number of rows converted; m, the number of columns; and a data.frame with the ID mapping.

This method will also work for converting files for DMU (although with argument --recode 12 [?]).
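The automatic conversion option might be sketched as follows. The helper name combine_ids is hypothetical, the IDs are assumed to be positive and already numeric, and the radix is derived from the largest sampleID rather than from a per-family count, which is a simplification of the rule described above.

```r
# Hypothetical helper: combine family and sample IDs into a single numeric ID.
# Assumes both vectors hold positive whole numbers.
combine_ids <- function(fid, iid) {
  fid <- as.integer(fid)
  iid <- as.integer(iid)
  # Drop familyID when it is constant or identical to sampleID.
  if (length(unique(fid)) == 1L || all(fid == iid)) return(iid)
  # Drop sampleID when it is constant.
  if (length(unique(iid)) == 1L) return(fid)
  # Smallest power of ten that is strictly larger than the largest sampleID.
  radix <- 10^(floor(log10(max(iid))) + 1)
  fid * radix + iid
}

ids_combined <- combine_ids(c(1, 1, 2), c(1, 2, 1))  # families 1 and 2
ids_dropped  <- combine_ids(c(5, 5, 5), c(1, 2, 3))  # constant familyID
```

Returning the original familyID and sampleID next to the new ID in a data.frame would give the mapping mentioned under Return.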

Row with no variance drops returned element

In the adaptive routine, when a row has no variance, the corresponding element of rowcors is dropped from the result. The fast routine does not do this.

Standardization must be FALSE to reproduce the problem; otherwise each element in the row is scaled and shifted separately, which adds variance to the row.

library(testthat)
library(Siccuracy)

ts <- Siccuracy:::make.test(15, 21)
true <- ts$true
true[2,] <- 2  # Row 2 is now constant, i.e. it has no variance.
write.snps(true, ts$truefn)

# No standardization, as this changes each element of row 2 -- and it gets variance!
imputed <- ts$imputed
# Reference correlations computed in plain R: whole matrix, per row, per column.
mat1 <- cor(as.vector(true), as.vector(imputed), use = 'complete.obs')
suppressWarnings(row1 <- sapply(1:nrow(true), function(i) cor(true[i,], imputed[i,], use='na.or.complete')))
suppressWarnings(col1 <- sapply(1:ncol(true), function(i) cor(true[,i], imputed[,i], use='na.or.complete')))

# Adaptive routine: the element of rowcors for row 2 goes missing.
res <- imputation_accuracy(ts$truefn, ts$imputedfn, standardized = FALSE, adaptive = TRUE)
expect_equal(res$matcor, mat1, tolerance=1e-9)
expect_equal(res$rowcors, row1, tolerance=1e-9)
expect_equal(res$colcors, col1, tolerance=1e-9)

# Fast routine: behaves as expected.
res <- imputation_accuracy(ts$truefn, ts$imputedfn, standardized = FALSE, adaptive = FALSE)
expect_equal(res$matcor, mat1, tolerance=1e-9)
expect_equal(res$rowcors, row1, tolerance=1e-9)
expect_equal(res$colcors, col1, tolerance=1e-9)
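The reference behaviour that both routines are expected to match can be seen in base R alone, without the package. This sketch uses randomly generated matrices as stand-ins for the test data: a constant row yields NA, but the element stays in place.

```r
set.seed(1)
# Stand-in genotype matrices: 15 samples by 21 SNPs, coded 0, 1, 2.
true    <- matrix(sample(0:2, 15 * 21, replace = TRUE), nrow = 15)
imputed <- matrix(sample(0:2, 15 * 21, replace = TRUE), nrow = 15)
true[2, ] <- 2  # row 2 has zero variance

# cor() warns about the zero standard deviation and returns NA for that row.
rowcors <- suppressWarnings(
  sapply(seq_len(nrow(true)), function(i) cor(true[i, ], imputed[i, ]))
)

length(rowcors)     # still one element per row
is.na(rowcors[2])   # the constant row is reported as NA, not removed
```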

Streamline parameter descriptions and names.

ncol, nSNPs, nAnimals, naval, NAval, numeric.format, etc. are just some of the different names used for the same concepts. These need to be streamlined.

This task also requires removing nSNPs and ncol, as setting them to anything other than the actual number of columns leads to incorrect results.
