Comments (3)
Thanks @al-dunstan,
This library, just like the original kseq, is meant to parse a mixture of FASTA and FASTQ records in one file; the format of each record is determined by its leading character (@ for FASTQ, > for FASTA). The same holds when writing a FASTA/Q record: the format is determined by the quality string, i.e. if a quality string is present, the record is treated as a FASTQ record. This makes sense because a sequence with quality information cannot be written as a FASTA record without data loss, and a FASTQ record cannot have an empty quality string. A record with an empty sequence is the exception: it is ambiguous whether it should be treated as a FASTA record or a FASTQ one. The current version treats it as a FASTA record. So this behaviour is independent of the output file/stream and of the wrap length parameter.
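To make the rule above concrete, here is a minimal sketch of choosing the output format from the record alone. The Record struct and the to_text function are hypothetical stand-ins written for this example, not the library's own types or code, and line wrapping is ignored.

```cpp
#include <sstream>
#include <string>

// Hypothetical stand-in for a parsed record; not the library's own type.
struct Record {
  std::string name;
  std::string seq;
  std::string qual;  // empty means "no quality information"
};

// Pick the format from the record itself: a non-empty quality string
// gives FASTQ, anything else (including an empty record) gives FASTA.
std::string to_text(const Record& r) {
  std::ostringstream os;
  if (!r.qual.empty()) {
    os << '@' << r.name << '\n' << r.seq << "\n+\n" << r.qual << '\n';
  } else {
    os << '>' << r.name << '\n' << r.seq << '\n';
  }
  return os.str();
}
```

With this rule, an empty record has no quality string and therefore falls through to the FASTA branch, which matches the behaviour described above.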
Regarding this issue, it is possible to provide a mechanism to force a specific format. For example, defining some modifiers:
out << klibpp::format::fastq << record; // write `record` in FASTQ
// or
out << klibpp::format::fasta << record; // write `record` in FASTA
But this only makes sense when the record is empty. Otherwise, the modifier should be ignored, as I explained above (consistency issue). In addition, handling this special case makes the writing procedure slightly more expensive (performance issue). So, this suggestion is neither consistent nor efficient.
Considering that empty records are usually supposed to be removed later anyway, I think it would be better to check whether a record is empty before attempting to write it into the stream/file. Any thoughts? Do you have any suggestions or expectations about how to resolve the ambiguity of an empty record?
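The check suggested above could look roughly like the following on the caller's side. This is only a sketch: Record, is_empty and write_non_empty are hypothetical names introduced for illustration, and the actual write is passed in as a callable so that no particular output API is assumed.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a parsed record; not the library's own type.
struct Record {
  std::string name;
  std::string seq;
  std::string qual;
};

// A record with no sequence is the ambiguous case discussed above.
inline bool is_empty(const Record& r) { return r.seq.empty(); }

// Passes every non-empty record to `write` and returns how many
// records were skipped because they were empty.
template <typename Writer>
std::size_t write_non_empty(const std::vector<Record>& recs, Writer write) {
  std::size_t skipped = 0;
  for (const Record& r : recs) {
    if (is_empty(r)) {
      ++skipped;
      continue;
    }
    write(r);
  }
  return skipped;
}
```

The callable would typically be a small lambda that forwards the record to whatever output stream is in use, so the ambiguity never reaches the writer.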
from kseqpp.
Not at all. Thanks a lot for the suggestion. I introduced the format types format::fastq, format::fasta, and format::mix (default). With the default value, it behaves as before. You can force a specific format either by calling the set_format(format::Format) method, as you suggested, or by passing the format through operator<< (similar to 'iomanip' from the standard library):
out.set_format(format::fastq);
out << record;
Or
out << format::fastq << record;
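For readers curious how an iomanip-style modifier like this can work, here is a self-contained toy sketch. It is not the library's implementation: the sketch namespace, the Writer class and its members are all hypothetical, and how a forced format should treat a mismatching record (for example forced FASTQ without qualities) is left open; this sketch simply writes the fields as they are.

```cpp
#include <ostream>
#include <sstream>
#include <string>

namespace sketch {  // hypothetical names, for illustration only

enum class Format { mix, fasta, fastq };

struct Record {
  std::string name, seq, qual;
};

class Writer {
 public:
  explicit Writer(std::ostream& os) : os_(os) {}

  void set_format(Format f) { fmt_ = f; }

  // Streaming a Format only changes state, much like std::hex does.
  Writer& operator<<(Format f) {
    set_format(f);
    return *this;
  }

  // In `mix` mode the quality string decides; otherwise the forced format wins.
  Writer& operator<<(const Record& r) {
    const bool fastq =
        fmt_ == Format::fastq || (fmt_ == Format::mix && !r.qual.empty());
    if (fastq) {
      os_ << '@' << r.name << '\n' << r.seq << "\n+\n" << r.qual << '\n';
    } else {
      os_ << '>' << r.name << '\n' << r.seq << '\n';
    }
    return *this;
  }

 private:
  std::ostream& os_;
  Format fmt_ = Format::mix;
};

}  // namespace sketch
```

The format is sticky in this sketch: once set, it applies to every following record until it is changed again.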
You can find it in the develop branch. I will merge it into the master branch when I find some time to update the test cases.
I am closing this issue. Feel free to re-open if the issue still exists.