klkeys / plink.jl
Julia module to handle PLINK BED files
License: Other
PLINK.jl implements some linear algebra facilities, but they are currently limited to the minimum that the IHT.jl package needs to perform penalized linear least squares regression. If the package is to become useful, it requires some direction.
From an engineering standpoint, the best thing to do is to make `BEDFile` objects behave more like actual arrays. Some proposals include:

- House means and precisions in the `BEDFile` object as additional fields, which entails adding methods (e.g. `addmeans!`, `addinvstds!`) to add the requisite vectors to a `BEDFile`, as well as constructor methods that can read the means and precisions from file.
- Change `xb!` and `xty!` to instead overload `A_mul_B!` and `At_mul_B!` with methods targeting the `BEDFile`. Housing means and precisions in a `BEDFile` goes hand-in-glove with simplifying calls to `xb!` and `xty!`. Matrix-vector multiplication methods currently take means and precisions as optional keyword arguments, but the proposed revamped `A_mul_B!`, with means and precisions stored in the `BEDFile` itself, would spare the user from having to pass means and precisions into every method call.
- Change `xb` and `xty` to instead overload `*()`.
- Extend `A_mul_B!` to enable iterative linear solves via the conjugate gradient method (CG). Currently `xb!` assumes that the vector multiplicand is sparse. Given the computational burden associated with dense linear algebra with PLINK BED files, this operation should also have both a CPU and a GPU-accelerated variant.
- `sumsq` should become an overloaded `sumabs2` and should work on both row and column margins. If the need is present, then `sumabs` and related functions should be added.

Issues pertaining to parallelism:
- `SharedArray` parallelism, which is slightly different. Should PLINK.jl abandon `SharedArray` parallelism in favor of multithreading?
- `SharedArray` operations. This matters especially for matrix-vector operations in which the compressed data and the vectors may require cores to share data at the boundaries of their local indices. A fully optimized parallel model with `SharedArray`s would ensure that each core operates completely independently on its portions of `x.x`, `x.xt`, and the vector multiplicand.
- `xb!` is not parallelized since it is optimized for sparse vector multiplicands, but that would change if support for dense vectors is added.

Issues pertaining to compression:
- `getindex` currently requires ~1.5-2x the memory and compute time of normal array indexing. Even small improvements in `getindex` could yield dramatic reductions in compute time, since array indexing is a fundamental operation in the linear algebra routines.

Issues pertaining to design, code base maintenance:
- `xty!` as an example. It may improve matters to build a `PLINKOptions` structure with all default arguments included whenever the module is loaded. All functions can then use the parameter defaults via `PLINKOptions`. Users can modify the entries in `PLINKOptions` as they see fit.
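To make the `PLINKOptions` idea concrete, here is a minimal sketch of how such a structure might look. It is written in Julia 1.x syntax and is only an illustration: the field names, the `DEFAULT_OPTIONS` instance, and the `toy_xty!` function are hypothetical stand-ins, and the sketch operates on a dense matrix, not a compressed `BEDFile`.

```julia
# Hypothetical options container. The field names are illustrative assumptions,
# not taken from the PLINK.jl source.
Base.@kwdef mutable struct PLINKOptions
    standardize::Bool = true   # center and scale each column on the fly
    use_gpu::Bool     = false  # placeholder flag, unused in this sketch
end

# A single instance built at load time carries the defaults.
const DEFAULT_OPTIONS = PLINKOptions()

# Hypothetical stand-in for a routine such as `xty!`: computes X'y on a dense
# matrix and takes its settings from an options object, not keyword arguments.
function toy_xty!(out::Vector{Float64}, X::Matrix{Float64}, y::Vector{Float64},
                  opts::PLINKOptions = DEFAULT_OPTIONS)
    n, p = size(X)
    for j in 1:p
        col = view(X, :, j)
        if opts.standardize
            m = sum(col) / n                    # column mean
            s = sqrt(sum(abs2, col .- m) / n)   # column standard deviation
            out[j] = s > 0 ? sum((col .- m) .* y) / s : 0.0
        else
            out[j] = sum(col .* y)
        end
    end
    return out
end

X = [0.0 1.0; 1.0 1.0; 2.0 1.0]
y = [1.0, 2.0, 3.0]

toy_xty!(zeros(2), X, y)              # uses the defaults (standardized columns)
DEFAULT_OPTIONS.standardize = false   # a user changes a default once...
toy_xty!(zeros(2), X, y)              # ...and later calls pick it up: [8.0, 6.0]
```

With this layout a function signature carries only its data arguments plus one optional options object. A caller who needs a one-off setting can pass `PLINKOptions(standardize = false)` explicitly without touching the shared defaults.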