labdao / plex
Platform for running comp bio applications on distributed compute and storage infrastructure
Home Page: https://lab.bio
License: MIT License
Waiting to hear back from GCP re: quota increase by EOD 23 Jan
Proteins selected are as follows
examine bacalhau install script
go build cross
for building operating systems
#TODO: validate instructions if it is a dict
https://github.com/labdao/ganglia/blob/3eda40cee46e80942edaad1e65ed45e2ba1bedd3/process.py#L21
This will set up a compute instance with the requirements for running a containerized DiffDock (e.g., Docker and nvidia-docker). This will be a starting point for more general management of node setup.
This common starting point will make it easier to reproducibly fix issues such as ticket #5, an error running DiffDock on an A100 GPU.
When using the web3.storage CLI, I receive different CIDs (bafys) for a given file compared with kubo. This can be reproduced with this repo:
labdao % w3 put ganglia/README.md
# Packed 1 file (0.0MB)
# bafybeiaj7fh6mh2yfzlpis4aozddm2lcsicthekdav4b3lwvfjpvtn2dty
⁂ Stored 1 file
⁂ https://dweb.link/ipfs/bafybeiaj7fh6mh2yfzlpis4aozddm2lcsicthekdav4b3lwvfjpvtn2dty
labdao % ipfs add -r --cid-version=1 ganglia/README.md
added bafkreibhdu66zohpk33uraa4p5u4zfwbv2rqq7edt43xnhke4wwaulwmoq README.md
1.46 KiB / 1.46 KiB [===========================================================================================================================] 100.00%
labdao % ipfs pin remote add --background --service=pinata-api bafkreibhdu66zohpk33uraa4p5u4zfwbv2rqq7edt43xnhke4wwaulwmoq
CID: bafkreibhdu66zohpk33uraa4p5u4zfwbv2rqq7edt43xnhke4wwaulwmoq
Name:
Status: queued
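The two CIDs above start with different prefixes (bafybei vs bafkrei), which suggests that the two tools hash different blocks, not that one of them hashes the file incorrectly. A minimal sketch, using only the Go standard library, that unpacks the leading fields of a base32 CIDv1 (version, content codec, multihash type, digest length). The names decodeCIDv1Prefix and cidPrefix are hypothetical helpers written for this note and are not part of go-cid or plex.

```go
package main

import (
	"encoding/base32"
	"encoding/binary"
	"errors"
	"fmt"
)

// base32Lower is the lowercase, unpadded RFC 4648 alphabet used by the
// multibase prefix 'b'.
var base32Lower = base32.NewEncoding("abcdefghijklmnopqrstuvwxyz234567").WithPadding(base32.NoPadding)

// cidPrefix holds the leading fields of a CIDv1 (hypothetical helper type).
type cidPrefix struct {
	Version uint64 // CID version, expected to be 1
	Codec   uint64 // content codec, e.g. 0x70 dag-pb or 0x55 raw
	MhType  uint64 // multihash function code, e.g. 0x12 sha2-256
	MhLen   uint64 // digest length in bytes
}

// decodeCIDv1Prefix parses a base32 ("b...") CIDv1 string and returns its
// leading varint fields. It does not verify the digest itself.
func decodeCIDv1Prefix(s string) (cidPrefix, error) {
	var p cidPrefix
	if len(s) < 2 || s[0] != 'b' {
		return p, errors.New("expected a base32 multibase string starting with 'b'")
	}
	raw, err := base32Lower.DecodeString(s[1:])
	if err != nil {
		return p, fmt.Errorf("base32 decode: %w", err)
	}
	// The four header fields are unsigned varints, in this order.
	for _, f := range []*uint64{&p.Version, &p.Codec, &p.MhType, &p.MhLen} {
		v, n := binary.Uvarint(raw)
		if n <= 0 {
			return p, errors.New("truncated or malformed varint")
		}
		*f = v
		raw = raw[n:]
	}
	if uint64(len(raw)) != p.MhLen {
		return p, errors.New("digest length does not match multihash header")
	}
	return p, nil
}

func main() {
	for _, c := range []string{
		"bafybeiaj7fh6mh2yfzlpis4aozddm2lcsicthekdav4b3lwvfjpvtn2dty", // from w3 put
		"bafkreibhdu66zohpk33uraa4p5u4zfwbv2rqq7edt43xnhke4wwaulwmoq", // from ipfs add
	} {
		p, err := decodeCIDv1Prefix(c)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s codec=0x%x mh=0x%x len=%d\n", c, p.Codec, p.MhType, p.MhLen)
	}
}
```

For the two CIDs in the log above, this decodes to codec 0x70 (dag-pb) for the w3 output and 0x55 (raw) for the kubo output. A likely contributor, stated here as an assumption, is that w3 wraps the upload in a directory node while kubo with --cid-version=1 stores a small file as a single raw leaf.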
What we currently have
<LabDAO uuid>/
index.csv
protein1.pdb
ligand1.sdf
Where we want to go (generally)
<LabDAO uuid>/
index.csv
protein1.pdb
ligand1.sdf
job-<bacal-id>
outputs
complexes.pdb
stdout
stderr
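The target layout above can be sketched as a small path helper. This is only an illustration under assumptions: the flattened listing does not show nesting, so the sketch assumes that outputs, stdout, and stderr sit directly under job-<bacal-id>, and that complexes.pdb sits under outputs. The names jobLayout and layoutFor are hypothetical and not taken from the repository.

```go
package main

import (
	"fmt"
	"path"
)

// jobLayout lists the paths expected for one job (hypothetical helper type).
type jobLayout struct {
	Index     string // <uuid>/index.csv
	JobDir    string // <uuid>/job-<bacal-id>
	Outputs   string // assumed: <uuid>/job-<bacal-id>/outputs
	Complexes string // assumed: <uuid>/job-<bacal-id>/outputs/complexes.pdb
	Stdout    string // assumed: <uuid>/job-<bacal-id>/stdout
	Stderr    string // assumed: <uuid>/job-<bacal-id>/stderr
}

// layoutFor builds slash-separated paths for a LabDAO uuid and a job id.
func layoutFor(uuid, jobID string) jobLayout {
	jobDir := path.Join(uuid, "job-"+jobID)
	outputs := path.Join(jobDir, "outputs")
	return jobLayout{
		Index:     path.Join(uuid, "index.csv"),
		JobDir:    jobDir,
		Outputs:   outputs,
		Complexes: path.Join(outputs, "complexes.pdb"),
		Stdout:    path.Join(jobDir, "stdout"),
		Stderr:    path.Join(jobDir, "stderr"),
	}
}

func main() {
	l := layoutFor("example-uuid", "example-job")
	fmt.Println(l.Index)
	fmt.Println(l.Complexes)
	fmt.Println(l.Stdout)
	fmt.Println(l.Stderr)
}
```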
This repo will hold code for managing and deploying our compute node infrastructure.
This will most likely end up including code for:
The first code will be the install.sh script defined in ticket 17.
Possible names:
sudo apt install python3-pip
Right now we use protein_path but only ligand, and not ligand_path, to specify the directories in which input data is located.
#TODO: add vina scoring
out_file is a temporary file generated in the ESM1 embedding step. I propose to remove it from the argument list to reduce the surface area for user errors.
https://github.com/labdao/ganglia/blob/8ffa5fdbe79f8f1581ca9fe39199473989a39772/client.py#L7
The PDBbind protein-ligand core data set is a high-quality and relatively small subset (285 complexes) of the complete PDBbind database (19,433 protein-ligand complexes in total). This data set makes up the Comparative Assessment of Scoring Functions (CASF) benchmark for docking/scoring methods.
I successfully downloaded both the core set and the general data set from http://www.pdbbind.org.cn/casf.php.
batch mode?
#TODO: introduce guidance on volume mounting, especially mounting multiple volumes
https://github.com/labdao/ganglia/blob/d2d3e0ae38c1e2b34b115870db0812e7c313c7e5/client.py#L33
testing on gpu-prototyping instance
Testing Nvidia Container Toolkit Install
docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: nvml error: driver/library version mismatch: unknown.
ERRO[0000] error waiting for container: context canceled
The following code is a draft. It generates a random file, calculates the expected CID, and compares it with the CID returned by putFile. The comparison fails, indicating that the expected and actual CIDs differ.
package main

import (
	"crypto/rand"
	"fmt"
	"io/ioutil"
	"os"
	"testing"

	cid "github.com/ipfs/go-cid"
	mh "github.com/multiformats/go-multihash"
	"github.com/web3-storage/go-w3s-client"
)

func TestPutFile(t *testing.T) {
	client, err := w3s.NewClient(
		w3s.WithEndpoint("https://api.web3.storage"),
		w3s.WithToken(testToken),
	)
	if err != nil {
		t.Fatalf("error creating client: %v", err)
	}

	fileContent := make([]byte, 1024)
	if _, err := rand.Read(fileContent); err != nil {
		t.Fatalf("error generating random bytes: %v", err)
	}

	file, err := ioutil.TempFile("", "")
	if err != nil {
		t.Fatalf("error creating temp file: %v", err)
	}
	defer file.Close()
	defer os.Remove(file.Name())

	if _, err := file.Write(fileContent); err != nil {
		t.Fatalf("error writing to file: %v", err)
	}

	fileBytes, err := ioutil.ReadFile(file.Name())
	if err != nil {
		t.Fatalf("error reading file: %v", err)
	}

	// calculate predicted CID
	// codec 0x70 is dag-pb
	pref := cid.Prefix{
		Version:  1,
		Codec:    uint64(0x70),
		MhType:   mh.SHA2_256,
		MhLength: -1,
	}
	calculatedCid, err := pref.Sum(fileContent)
	if err != nil {
		t.Fatalf("error calculating CID: %v", err)
	}

	// validation that file content and file bytes are the same
	fmt.Printf("fileContent length: %d, first few elements: % x\n", len(fileContent), fileContent[:10])
	fmt.Printf("file length: %d, first few elements: % x\n", len(fileBytes), fileBytes[:10])

	// putFile takes in args of type fs.File, NOT the bytes of the file
	actualCid := putFile(client, file)
	if !calculatedCid.Equals(actualCid) {
		t.Fatalf(`unmatching cids
expected CID: %s
actual CID: %s`, calculatedCid, actualCid,
		)
	}
}
Example error logs
fileContent length: 1024, first few elements: 27 25 bc be 8c f5 66 15 a8 6b
file length: 1024, first few elements: 27 25 bc be 8c f5 66 15 a8 6b
Uploading to IPFS via web3.storage...
CID: bafybeidk2alhga7fztbtiyult7hsawnfhawtspy5a6xzzvjtaxavvfn2fm
--- FAIL: TestPutFile (1.09s)
ipfs_test.go:77: unmatching cids
expected CID: bafybeieqzyhpmacugzcn7mrh6pth6xakrxxudvbn3l5kvdtderadua27wy
actual CID: bafybeidk2alhga7fztbtiyult7hsawnfhawtspy5a6xzzvjtaxavvfn2fm
The current hypothesis is that the w3s.Put function modifies the file before pinning. The CID references show that the multihash digests differ even though the file contents are identical, as shown above.
Look into CAR wrapping and generating the CAR manually (library from web3.storage: https://github.com/web3-storage/ipfs-car).
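One more factor may be worth ruling out alongside the hypothesis above: pref.Sum(fileContent) hashes the raw file bytes and then labels the digest with codec 0x70 (dag-pb). A real dag-pb CID is the hash of the encoded dag-pb block, which is not the same bytes as the file, so the expected value would differ from an uploaded dag-pb CID even if the upload leaves the file untouched. The standard-library sketch below builds a CIDv1 string by hand to show that changing only the codec label keeps the digest identical. The function name cidV1 is a hypothetical helper, not a go-cid API.

```go
package main

import (
	"crypto/sha256"
	"encoding/base32"
	"fmt"
)

// base32Lower is the lowercase, unpadded RFC 4648 alphabet used by the
// multibase prefix 'b'.
var base32Lower = base32.NewEncoding("abcdefghijklmnopqrstuvwxyz234567").WithPadding(base32.NoPadding)

// cidV1 builds a base32 CIDv1 string whose multihash is the sha2-256 digest
// of data, tagged with the given single-byte codec (hypothetical helper).
// It hashes data as-is; it does NOT encode a dag-pb or UnixFS node.
func cidV1(codec byte, data []byte) string {
	digest := sha256.Sum256(data)
	buf := []byte{
		0x01,  // CID version 1
		codec, // content codec (only codes below 0x80 fit in one byte)
		0x12,  // multihash code for sha2-256
		0x20,  // digest length: 32 bytes
	}
	buf = append(buf, digest[:]...)
	return "b" + base32Lower.EncodeToString(buf)
}

func main() {
	data := []byte("example file contents")

	// Digest of the bytes labelled as a raw block (codec 0x55).
	fmt.Println("raw   :", cidV1(0x55, data))

	// Same digest labelled as dag-pb (codec 0x70), mirroring what the test's
	// pref.Sum(fileContent) computes. Only the codec label differs.
	fmt.Println("dag-pb:", cidV1(0x70, data))
}
```

Under this reading, a closer expected value for a small single-block file would be the raw-codec CID, which is what the kubo output with --cid-version=1 in the earlier log looks like. Whether w3s.Put additionally wraps the file in a directory is an assumption that still needs checking against the client source.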