Comments (4)
Hi prihoda. Thank you for the comment! I have encountered similar issues caused by inconsistent random number generation across different environments. Since we were also processing the sequences one by one during the testing stage, we failed to notice this bug. I will try to fix it and get back to you soon.
from peds2019.
In the eval() function, I accidentally made the dataloader shuffle the sequences. Thank you for pointing this out. We would also greatly appreciate it if you could test the code again to see whether the issue has been resolved.
from peds2019.
Hi @xf3227, thanks for the quick fix. I am now getting the same result when running one by one as when running the whole file 👍 You can close this issue.
Btw, a side note: in terms of usability, I think users might find it useful to have some instructions on producing the AHo-aligned input files. You could even include a script, since it takes a few steps (running ANARCI to produce an aligned CSV and then converting that CSV to txt while making sure that the same positions as in your input files are present).
ANARCI will only include positions that exist within your processed set of sequences, so here's what I got from the ANARCI CSV for my set of sequences:
QVQLKES-GPGLVAPSQSLSITCTVSG-FSVTN-----YGVHWVRQPPGKGLEWLGVIWA----GGITNYNSAFMSRLSISKDNSKSQVFLKMNSLQIDDTAMYYCASRGGHY-------------------GYALDYWGQGTSVTVSS
I then needed to insert the gaps at the correct positions:
-QVQLKES-GPGLVAPSQSLSITCTVSG-FSVTN-----YGVHWVRQPPGKGLEWLGVIWA----GGITNYNSAFMSRLSISKDNSKSQVFLKMNSLQIDDTAMYYCASRGGHY-------------------GYALDYWGQGTSVTVSS
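The gap insertion described above could be sketched roughly as follows. This is a minimal illustration, not the repository's own tooling: it assumes the ANARCI CSV has one column per alignment position, and that you have a list of the position labels the model's input files expect. The names `reindex_row`, `csv_to_aligned`, and `reference_columns` are hypothetical, and the real CSV will also contain metadata columns, which are simply ignored here.

```python
import csv
import io


def reindex_row(row, reference_columns, gap="-"):
    """Return one character per reference column, using `gap` where the
    CSV row has no such column (or an empty cell) for that position."""
    return "".join((row.get(col) or gap) for col in reference_columns)


def csv_to_aligned(csv_text, reference_columns, gap="-"):
    """Convert every row of an aligned CSV into a fixed-width aligned string.

    `reference_columns` is an assumed, user-supplied list of position labels
    in the order the downstream input files use.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [reindex_row(row, reference_columns, gap) for row in reader]


# Toy example: the CSV lacks positions "1" and "4", so gaps are filled in.
toy_csv = "Id,2,3,5\nseq1,Q,V,L\n"
toy_reference = ["1", "2", "3", "4", "5"]
aligned = csv_to_aligned(toy_csv, toy_reference)
```

Every output string then has the same length as the reference list, so the columns line up with the existing input files regardless of which positions happened to be populated in a particular batch.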
from peds2019.
Hi @prihoda, thank you for locating this bug. I just closed this thread.
As for sequence alignment, I'm sorry, but I was not the one who handled that part, and I am not experienced with sequence alignment tools. Two possible solutions could be:
- Simply remove the gaps from all sequences. The model can run in two modes, one of which handles unaligned sequences, although performance may be somewhat poorer.
- Create your own training dataset, aligned in whatever format you choose.
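The first option could look something like the sketch below. It assumes gaps are written as `-` (or `.`, which some alignment tools emit); the helper name `strip_gaps` is hypothetical and not part of the repository.

```python
def strip_gaps(aligned, gap_chars="-."):
    """Remove gap characters from an aligned sequence string."""
    return "".join(ch for ch in aligned if ch not in gap_chars)


# Toy example using the start of the aligned sequence quoted earlier in the thread.
unaligned = strip_gaps("-QVQLKES-GPGLVAPS")
```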
In any case, thank you for bringing this up! I hope this repo helps with your research and projects!
from peds2019.