
Comments (3)

cartoonist avatar cartoonist commented on May 30, 2024

Thanks @al-dunstan,

This library, like the original kseq, is meant to parse a mixture of FASTA and FASTQ records in one file; the format of each record is determined by its leading character (@ for FASTQ, > for FASTA). The same holds when writing a FASTA/Q record: the format is determined by the quality string, i.e. if the quality string is present, the record is written as FASTQ. This makes sense because a sequence with quality information cannot be written as a FASTA record without data loss, and a FASTQ record with a non-empty sequence cannot have an empty quality string. A record with an empty sequence is the exception: it is ambiguous whether to treat it as a FASTA or a FASTQ record. The current version treats it as a FASTA record. So this behaviour is independent of the output file/stream and of the wrap length parameter.
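A minimal sketch of the writing rule described above, not the library's actual code. The `Record` struct and `write_record` function are hypothetical names introduced for illustration, and line wrapping is left out:

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical record type for illustration; not kseq++'s own record struct.
struct Record {
    std::string name;
    std::string seq;
    std::string qual;  // empty means "no quality information"
};

// Write the record as FASTQ if it carries a quality string, otherwise as FASTA.
// A record with empty sequence and empty quality therefore falls through to FASTA.
inline void write_record(std::ostream& out, const Record& r) {
    if (!r.qual.empty()) {
        out << '@' << r.name << '\n' << r.seq << '\n' << "+\n" << r.qual << '\n';
    } else {
        out << '>' << r.name << '\n' << r.seq << '\n';
    }
}
```

Under this rule the caller never chooses the format; it follows from the record's content, which is why the empty record is the only ambiguous case.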

Regarding this issue, it is possible to provide a mechanism to force a specific format. For example, defining some modifiers:

out << klibpp::format::fastq << record;  // write `record` in FASTQ
// or
out << klibpp::format::fasta << record;  // write `record` in FASTA

But this only makes sense when the record is empty. Otherwise, the modifier would have to be ignored, as I explained above (a consistency issue). It would also make the writing procedure slightly more expensive in order to handle this special case (a performance issue). So this suggestion is neither consistent nor efficient.

Given that empty records are usually removed later anyway, I think it would be better to check whether a record is empty before attempting to write it to the stream/file. Any thoughts? Do you have any suggestions or expectations about how to resolve the ambiguity of an empty record?
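A minimal sketch of such a pre-write check, assuming "empty" means that both the sequence and the quality string are empty. `Record`, `is_empty`, and `write_non_empty` are hypothetical names, not part of kseq++:

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type for illustration.
struct Record {
    std::string name, seq, qual;
};

// Assumption: a record counts as empty when it has neither sequence nor quality.
inline bool is_empty(const Record& r) {
    return r.seq.empty() && r.qual.empty();
}

// Write only non-empty records; return how many records were skipped.
inline std::size_t write_non_empty(std::ostream& out,
                                   const std::vector<Record>& records) {
    std::size_t skipped = 0;
    for (const auto& r : records) {
        if (is_empty(r)) {  // ambiguous FASTA/FASTQ case: skip it entirely
            ++skipped;
            continue;
        }
        if (!r.qual.empty())
            out << '@' << r.name << '\n' << r.seq << "\n+\n" << r.qual << '\n';
        else
            out << '>' << r.name << '\n' << r.seq << '\n';
    }
    return skipped;
}
```

Filtering on the caller's side keeps the writer free of special cases, at the cost of dropping empty records from the output.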

from kseqpp.

al-dunstan avatar al-dunstan commented on May 30, 2024


cartoonist avatar cartoonist commented on May 30, 2024

Not at all. Thanks a lot for the suggestion. I introduced the format types format::fastq, format::fasta, and format::mix (the default). With the default value, it behaves as before. You can force a specific format either by calling the set_format(format::Format) method, as you suggested, or by passing the format via operator<< (similar to the I/O manipulators in 'iomanip' from the standard library):

out.set_format(format::fastq);
out << record;

Or

out << format::fastq << record;
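A minimal sketch of how a sticky, manipulator-style format selection could be structured. This is not the kseq++ implementation: the `sketch` namespace, `Writer`, and `Record` are hypothetical. Two points are assumptions rather than documented library behaviour: the selected format persists for later writes, and a forced FASTQ write of a record without quality emits an empty quality line.

```cpp
#include <cassert>
#include <sstream>
#include <string>

namespace sketch {

// Mirrors the three format names mentioned above; mix is the default.
enum class Format { mix, fasta, fastq };

struct Record {
    std::string name, seq, qual;
};

class Writer {
public:
    explicit Writer(std::ostream& os) : os_(os) {}

    void set_format(Format f) { fmt_ = f; }

    // Manipulator-style overload: `w << Format::fastq` only stores the format.
    Writer& operator<<(Format f) {
        set_format(f);
        return *this;
    }

    Writer& operator<<(const Record& r) {
        // In mix mode the quality string decides; otherwise the forced format wins.
        const bool as_fastq = (fmt_ == Format::fastq) ||
                              (fmt_ == Format::mix && !r.qual.empty());
        if (as_fastq)
            os_ << '@' << r.name << '\n' << r.seq << "\n+\n" << r.qual << '\n';
        else
            os_ << '>' << r.name << '\n' << r.seq << '\n';  // quality is dropped
        return *this;
    }

private:
    std::ostream& os_;
    Format fmt_ = Format::mix;
};

}  // namespace sketch
```

Overloading operator<< on the format type is the same idea the standard library uses for stream manipulators: the "value" written to the stream changes its state instead of producing output.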

You can find it in the develop branch. I will merge it into the master branch when I find some time to update the test cases.

I am closing this issue. Feel free to re-open if the issue still exists.

