geohot / corona
Reverse engineering SARS-CoV-2
What's your opinion on group testing? Here's a math analysis on it:
Infection rate ► Initial mini-pool size ▼ | 0.5% | 1% | 2% | 3% | 5% | 8% | 10% | 20% | 50% |
---|---|---|---|---|---|---|---|---|---|
2 | 0.51 | 0.51 | 0.53 | 0.54 | 0.57 | 0.62 | 0.64 | 0.78 | 1.12 |
4 | 0.26 | 0.28 | 0.31 | 0.34 | 0.39 | 0.48 | 0.53 | 0.77 | 1.30 |
8 | 0.15 | 0.17 | 0.21 | 0.25 | 0.33 | 0.45 | 0.52 | 0.82 | 1.41 |
16 | 0.09 | 0.12 | 0.18 | 0.23 | 0.33 | 0.46 | 0.54 | 0.87 | 1.47 |
32 | 0.07 | 0.10 | 0.17 | 0.23 | 0.34 | 0.48 | 0.57 | 0.90 | 1.51 |
My concerns are:
max SampleCnt(Patient_i) = N
As a Python programmer who is interested in helping out but doesn't have much background in bioinformatics, I'd like to know if there are any low-level grunt tasks I can help out with.
Glaunsinger explains the evolution, genetics, and virulence of coronaviruses
(skip the first 10 minutes/introduction):
https://www.youtube.com/watch?v=8_bOhZd6ieM
Timestamps: https://www.youtube.com/watch?v=8_bOhZd6ieM&lc=UgyZB35CfH70Xwgrtb14AaABAg
Slides: slides.pdf
For instance, she talks about CoVs being unusually large (~30kb) compared to other viruses – in fact, CoVs are above the theoretical limit – because they proofread the replicated RNA with an exonuclease.
I am wondering whether I could translate the whole thing into a Chinese doc (though I'm not sure what to do about the papers in the links) in order to attract more Chinese people to this.
And I can do some propaganda lol
This guy, David Icke, claims that there is no COVID-19 and that what the world calls COVID-19 is basically exosomes. Vimeo Source
I have little understanding of bio, but would it be possible to prove/disprove this statement?
Sorry it is a news site. But it actually looks decent for a simple explanation of most of the proteins...
This study used hydroxychloroquine and azithromycin as a treatment for COVID-19.
It says that the mean duration of viral shedding in patients suffering from COVID-19 in China was 20 days, while this combination was able to clear the virus in 6 days.
"At day 6 post-inclusion, 100% of patients treated with hydroxychloroquine and azithromycin combination were virologicaly cured comparing with 57.1% in patients treated with hydroxychloroquine only, and 12.5% in the control group"
This chart is taken from the same paper
Released 29 August 2020:
https://www.sciencedirect.com/science/article/pii/S0960076020302764
Sample size of 76 (50 treated, 26 untreated) is relatively small. Promising though, and probably worth supplementing if you don't already.
Results
Of 50 patients treated with calcifediol, one required admission to the ICU (2%), while of 26 untreated patients, 13 required admission (50%) p value X2 Fischer test p < 0.001. Univariate Risk Estimate Odds Ratio for ICU in patients with Calcifediol treatment versus without Calcifediol treatment: 0.02 (95%CI 0.002-0.17). Multivariate Risk Estimate Odds Ratio for ICU in patients with Calcifediol treatment vs Without Calcifediol treatment ICU (adjusting by Hypertension and T2DM): 0.03 (95%CI: 0.003-0.25). Of the patients treated with calcifediol, none died, and all were discharged, without complications. The 13 patients not treated with calcifediol, who were not admitted to the ICU, were discharged. Of the 13 patients admitted to the ICU, two died and the remaining 11 were discharged.
Conclusion
Our pilot study demonstrated that administration of a high dose of Calcifediol or 25-hydroxyvitamin D, a main metabolite of vitamin D endocrine system, significantly reduced the need for ICU treatment of patients requiring hospitalization due to proven COVID-19. Calcifediol seems to be able to reduce severity of the disease, but larger trials with groups properly matched will be required to show a definitive answer.
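For anyone who wants to check the univariate odds ratio quoted in the Results, here is a minimal sketch that recomputes it from the counts given there (1 of 50 treated and 13 of 26 untreated patients admitted to the ICU). The confidence interval uses the standard log-odds (Woolf) approximation; whether the paper used that exact method is an assumption, and the function name is made up for illustration.

```python
import math


def odds_ratio_with_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio and approximate 95% CI for a 2x2 table.

    a: treated, ICU        b: treated, no ICU
    c: untreated, ICU      d: untreated, no ICU
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf (log-odds) approximation.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Counts taken from the Results paragraph above.
    or_, lo, hi = odds_ratio_with_ci(a=1, b=49, c=13, d=13)
    print(f"OR = {or_:.3f}, 95% CI = ({lo:.3f}, {hi:.2f})")
```

With only one ICU admission in the treated group, the interval is wide, which fits the comment above about the small sample size.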
Hi Geohotz,
Found something that may interest you.
I watched your video from yesterday on YouTube. I assume this guy is trying to achieve the same thing you are: protein folding.
Have a look.
https://github.com/bionicles/coronavirus
Thanks for your work to help people in need! Your site has been added! I currently maintain the Open-Source-COVID-19 page, which collects all open source projects related to COVID-19, including maps, data, news, APIs, analysis, medical and supply information, etc. Please share it with anyone who might need the information in the list, or who might contribute to some of those projects. You are also welcome to recommend more projects.
http://open-source-covid-19.weileizeng.com/
Cheers!
If you manage to make a vaccine, I think a fitting name would be coronara1n.
Try adding covid-19 as a topic on this repo, so it can get on the "Explore" page on GitHub.
Look at the top comment here -
https://www.reddit.com/r/siacoin/comments/fi8gc6/important_siasky_files_uploaded_coronope_a/
Great article that categorizes each protein encoded by the SARS-CoV-2 genome: https://www.nytimes.com/interactive/2020/04/03/science/coronavirus-genome-bad-news-wrapped-in-protein.html.
Glad I stumbled upon this project - was working on a theory using the same base dataset.
Since proteins and genes are essentially sequences of letters, it led me to the idea of using Transformer models like BERT to map sequences to their structures. If that theory were valid, I'd want to try a multi-task approach that pairs a valid treatment sequence with the virus sequence, and look at whether the model can predict the treatment sequence given the input virus sequence.
I haven't studied the structure as much as you guys probably have - so I'd defer to you on whether this would be plausible/feasible given what we know so far.
Here are a few other starting points I've looked at:
ReSimNet: Drug Response Similarity Prediction using Siamese Neural Networks
Jeon and Park et al., 2018
https://github.com/dmis-lab/ReSimNet
BERN is a BioBERT-based multi-type NER tool that also supports normalization of extracted entities.
Here are two suggestions to help your workflow, a.k.a. power-user tools.
Pick any of the four options to look up a word (e.g. angiotensin):
command ⌘ + control ⌃ + D keys.
Look up.
command ⌘ + L keys.
Adding Wikipedia: in the preferences of the Dictionary app, select "Wikipedia".
Unfortunately, the first three options do not work for words that are part of a hyperlink.
Works best if DuckDuckGo is your default search engine, since this turns your address bar into a CLI for looking up or searching any site. (Relax, you can still search Google!)
Try typing any of these search term examples into DuckDuckGo
!w Human coronavirus NL63
!gene ORF1AB
!protein QHD43418
!a Molecular Biology Cell
!gsch Molecular structure nucleic acids
!alpha AAGCTAGCTAGC
!pubchem n aminoethyl aziridineethanamine
!g regular ass google search
covid-19 positive reddit
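If you would rather script these lookups than type them, here is a minimal sketch that turns a bang query into a DuckDuckGo search URL. It assumes the usual `https://duckduckgo.com/?q=` query form, and the helper name `ddg_url` is made up for illustration.

```python
from urllib.parse import quote_plus


def ddg_url(query: str) -> str:
    """Build a DuckDuckGo search URL for a query such as '!gene ORF1AB'."""
    # quote_plus percent-encodes '!' and turns spaces into '+'.
    return "https://duckduckgo.com/?q=" + quote_plus(query)


if __name__ == "__main__":
    for q in ("!w Human coronavirus NL63", "!gene ORF1AB", "!protein QHD43418"):
        print(ddg_url(q))
```

Passing the resulting URL to `webbrowser.open` from the standard library would open it in your default browser.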
I'll add more suggestions as they come to mind.
We know the cleavage sites in the protein, as explained here.
SARS-CoV-2 Cell Entry Depends on ACE2 and TMPRSS2 and Is Blocked by a Clinically Proven Protease Inhibitor
Published in Cell, https://doi.org/10.1016/j.cell.2020.02.052
Besides that, as a biotechnologist I would recommend that you stop thinking this approach could work in nature. Most of your questions could be answered by an undergraduate with some knowledge and the ability to read and understand scientific papers.
We're able to engineer some organisms, sure, but we're still far from pure "reverse engineering", because chemical interactions mean that no protein or molecule inside a cell can be treated as a standalone thing.
Rough idea, might not be precise:
Haven't looked at it in depth yet, but might be interesting: https://www.longdom.org/open-access/d-llysine-acetylsalicylate--glycine-impairs-coronavirus-replication-jaa-1000151.pdf
About 6000 participants
Study:
https://clinicaltrials.gov/ct2/show/NCT04337541?term=NCT04337541&draw=2&rank=1
Findings:
https://www.acpjournals.org/doi/10.7326/M20-6817
Good? or too small? or not controlled enough?
I saw the "Open Questions" section and decided to answer them in case anyone is still interested. There are definitely more avenues to explore in this project, and they could be of huge benefit to researchers.
SARS-CoV-2_Reverse-Engineering_Open_Qs.pdf
https://github.com/kavrakilab/sars-arena
SARS-Arena: A Pipeline for Selection and Structural HLA Modeling of Conserved Peptides of SARS-relate
Hey, I'm using the offsets and comments of corona.py in my own project.
Is it okay if that file is under MIT like the rest of the project or should I add a note about it having no license because this project has none?
Btw this project is so cool. Thank you!
Hi. I'm researching the coronavirus.
I created a Twitter account that exposes the truth.
Here is the proof.
I predicted all of this: https://www.youtube.com/watch?v=X_ALQs4aUJg
(In this video, I talk about the coronavirus in Japanese.)
The symptoms are unknown because "they" mixed [HIV/AIDS + Ebola + mumps + other grave stuff], and the governments can't handle it because they're panicking.
I'm working on my stuff on this account:
https://twitter.com/0x904f40349
As for the symptoms, you can see the reality on Liveleak and other gore websites.
We can do everything using deep learning.
About the deep learning stuff:
if I assume that people make mistakes, then all people have the same patterns.
For example:
32 x 64 x 32 = 65,536 colors (limited)
Languages (limited)
Speaking, facial expressions, psychology, preferences,
This is all a matter of human tendencies.
The only way to decrease deaths is for people to follow the right information.
I hope the world is peaceful. Good luck Mr. Geohot.
Thank you.
Work to be done section:
-Multi-sequence compare tools from the broad institute.
IGV: https://software.broadinstitute.org/software/igv/download
Some good command-line tools you will inevitably run into (conda/pip installable):
Bamtools, Samtools, clustalo, blast
-Secondary structure prediction:
This should be done using the RNA transcript. Protein prediction is still in its infancy because no one has taken post-translational modifications into account. Some papers to get you started:
PTMs in coronavirus:
https://www.futuremedicine.com/doi/full/10.2217/fvl-2018-0008
Ponti, R. D., et al. (2020). "CROSSalive: a web server for predicting the in vivo structure of RNA molecules." Bioinformatics 36(3): 940-941.
Wang, F. Q., et al. "Comparison of Pseudoknotted RNA Secondary Structures by Topological Centroid Identification and Tree Edit Distance." Journal of Computational Biology.
Zhang, Z., et al. (2020). "Accurate inference of the full base-pairing structure of RNA by deep mutational scanning and covariation-induced deviation of activity." Nucleic Acids Research 48(3): 1451-1465.
If you are certain you want to stay in protein prediction:
PDB and ExPASy PROSITE can be helpful databases. The PRATT function on ExPASy is super useful.
Have fun!!!!
MS
KeyError occurs when trying to use "write_unfolded" on the /proteins/villin/1vii.fasta
There is one issue in your write_unfolded function: some amino acids can be represented in different forms (residues). Histidine (HIS - H) is not directly included in the "amber99sb.xml" residues. It only includes "HID", "HIE", "HIP", "HIN".
I see a few solutions:
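One possible workaround, sketched here in plain Python and not tied to the repo's actual write_unfolded code: map each one-letter code to a three-letter residue name and substitute one of the histidine variants listed above wherever HIS appears. The helper name residue_names and the default choice of "HIE" are assumptions for illustration; the right protonation state depends on the system being simulated.

```python
# One-letter to three-letter names for the 20 standard amino acids.
THREE_LETTER = {
    "A": "ALA", "R": "ARG", "N": "ASN", "D": "ASP", "C": "CYS",
    "Q": "GLN", "E": "GLU", "G": "GLY", "H": "HIS", "I": "ILE",
    "L": "LEU", "K": "LYS", "M": "MET", "F": "PHE", "P": "PRO",
    "S": "SER", "T": "THR", "W": "TRP", "Y": "TYR", "V": "VAL",
}

# Histidine variants named in the issue text above.
HISTIDINE_VARIANTS = {"HID", "HIE", "HIP", "HIN"}


def residue_names(sequence: str, histidine: str = "HIE") -> list:
    """Translate a one-letter sequence, replacing HIS with a chosen variant."""
    if histidine not in HISTIDINE_VARIANTS:
        raise ValueError(f"unknown histidine variant: {histidine}")
    names = []
    for letter in sequence.upper():
        if letter not in THREE_LETTER:
            raise ValueError(f"unsupported residue letter: {letter}")
        name = THREE_LETTER[letter]
        # Swap the generic HIS for the variant the force field defines.
        names.append(histidine if name == "HIS" else name)
    return names


if __name__ == "__main__":
    # Made-up four-residue sequence, not taken from 1vii.fasta.
    print(residue_names("MKHL"))
```

Raising an explicit error for unknown letters makes it easier to see which residue is unsupported than a bare KeyError from a later lookup.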
Links:
dltql
Use blastp (or blastx if you start from a nucleotide sequence) to predict protein function from an amino acid sequence by comparing it to other known proteins using homology and conserved domains. I'm pretty sure you can also install it as a module in Python (https://blast.ncbi.nlm.nih.gov/Blast.cgi)
Use Clustal Ω to align multiple amino acid or DNA sequences (https://www.ebi.ac.uk/Tools/msa/clustalo/)
You can also use JPred to try to infer protein secondary structure (http://www.compbio.dundee.ac.uk/jpred/)