hupo-psi / mzqc Goto Github PK


Reporting and exchange format for mass spectrometry quality control data

Home Page: https://hupo-psi.github.io/mzQC/

License: Creative Commons Attribution 4.0 International

Dockerfile 100.00%

mass-spectrometry quality-control metrics

mzqc's People

Contributors

mwalzer marc-sturm rsalek eisenachm dmccloskey julianu cbielow aspirincode eralpdogu xchen98 poshul nilshoffmann kozo2

mzqc's Issues

[CV] define a way for users to request a new CV without a GitHub account

We should provide an e-mail address, or some other channel, through which anybody can request a new CV parameter.

[CV] Why is ambient humidity an n-tuple?

The term QC:4000068 "Ambient humidity" is currently an n-tuple.
Is there any reason for this? If so, the description (and name) should reflect this.

XML namespace

Add a namespace to the XML schema.

Original issue reported on code.google.com by [email protected] on 6 Sep 2013 at 12:04

[CV] Request for new CV entries

Dear all,

Julian told me that I can request new CV entries here. Ideally we would like to have all the metrics that we are taking into account in the QCloud (see Table 1 of our paper http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0189209). By order of priority, we would like to have the Peak Area, Mass accuracy, Retention Time, Median Injection Time for MS2, Chromatographic Resolution, Peak Capacity and Total Ion Current.

Thank you!

Roger

Provenance

The qcML file should specify the entire bioinformatics processing cycle up to generating the QC metrics.
As remarked previously, this can be done through the general cvParam elements, but it might be useful to make this a bit more explicit and enforceable. At the very least we should carefully document this.

Additionally, information related to a contact person should be included. This has already been specified in the MIAPE list, so we should ensure that all MIAPE information can be found in the qcML file as well.

No read me, no wiki

Run/Set Elements don't have a xsd:ID attribute

Run/Set Elements don't have a xsd:ID attribute

Original issue reported on code.google.com by [email protected] on 6 May 2013 at 12:43

I think the following basic cv terms are needed

id-free:

Number of MS1
Number of MS2
Number of chromatograms
MZ acquisition range (MZ(m/z) boundaries)
RT acquisition range (RT(s) boundaries)
TIC (linked list of RT(s) and intensities)
TIC outliers (list of outlier RT(s))
MS2 injection times (linked list of precursor m/z and injection time(ms))
List of S/N of MS2 spectra (linked list of precursor m/z and S/N)
List of S/N of MS1 spectra (linked list of RT and S/N)

id:

total number of PSM
total number of identified peptides
total number of identified proteins
total number of proteins uniquely identified (unique peptides)
precursor / MC / delta ppm / TD / charge table (table of precursor m/z and delta ppm and missed cleavages and target/decoy distinction and charge)
final ID/MS2 ratio

quant:

Number of features (isotopic envelope)
Number of features identified
Number of features with id conflict

new relation:

not-for-final-report (filter mzqc for volume reduction)

Threshold doesn't have an ID

Because the ID for parameters is stored in QualityParameter and 
AttachmentParameter, and not in AbstractParameter or CvParameter, threshold has 
no ID. Threshold extends CvParameter, but doesn't define its own ID.

Is this deliberate, i.e. a threshold doesn't need an ID because it is defined 
by its parent qualityParameter? Or should threshold still have an ID, possibly 
for easy conversion to a database?

Original issue reported on code.google.com by [email protected] on 14 Oct 2013 at 8:28

Intra-document references

We need to keep element names unique within a document so that items in a JSON document can be referenced reliably, because we got rid of the 'id' attribute.
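
A minimal sketch of how such a uniqueness check could look, assuming the referencable name is stored under a key called `label` (a hypothetical key chosen for illustration; the issue does not say which key is meant):

```python
import json
from collections import Counter


def find_duplicate_labels(document, key="label"):
    """Return all values of `key` that occur more than once in the document."""
    seen = Counter()

    def walk(node):
        # Recurse through nested dicts and lists, counting string values of `key`.
        if isinstance(node, dict):
            if isinstance(node.get(key), str):
                seen[node[key]] += 1
            for child in node.values():
                walk(child)
        elif isinstance(node, list):
            for child in node:
                walk(child)

    walk(document)
    return sorted(value for value, count in seen.items() if count > 1)


# Hypothetical document fragment; "runs" and "label" are illustrative names.
doc = json.loads('{"runs": [{"label": "run_A"}, {"label": "run_B"}, {"label": "run_A"}]}')
print(find_duplicate_labels(doc))
```

Any label reported here could not serve as an unambiguous reference target.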

reference dataset

Provide a reference dataset(s) to show QCML in action.

Original issue reported on code.google.com by [email protected] on 5 Feb 2013 at 2:49

[CV] Chromatography related metric suggestions

Category: Chromatography

synonym: C-1A
name: Fraction of repeat peptide IDs with -4 min
def: Fraction of all peptides identified at least 4 min earlier than max MS1 for ID
comment: Estimates very early peak broadening
synonym: C-1B
name: Fraction of repeat peptide IDs with +4 min
def: Fraction of all peptides identified at least 4 min later than max MS1 for ID
comment: Estimates very late peak broadening
synonym: C-2A
name: Interquartile retention time period Period (min)
def: Time period over which 50% of peptides were identified
comment: Longer times indicate better chromatographic separation
synonym: C-2B
name: Interquartile retention time period Pep ID rate
def: Rate of peptide identification during C-2A
comment: Higher rates indicate efficient sampling and identification
synonym: C-3A
name: Peak width at half-height for IDs Median value
def: Median peak widths for all identified unique peptides (s)
comment: Sharper peak widths indicate better chromatographic resolution
synonym: C-3B
name: Interquartile distance of peak widths at half-height for IDs
def: Interquartile distance FWHM
comment: Tighter distributions indicate more peak width uniformity; measure of the distribution of the peak widths; small values indicate consistency
synonym: C-4A
name: Peak widths at half-max over RT deciles for IDs First decile
def: Median peak width for identified peptides in first RT decile (early)
comment: Estimates peak widths at the beginning of the gradient
synonym: C-4B
name: Peak widths at half-max over RT deciles for IDs Last decile
def: Median peak width for identified peptides in last RT decile (end)
comment: Estimates peak widths at the end of the gradient
synonym: C-4C
name: Peak widths at half-max over RT deciles for IDs Median decile
def: Median peak width for identified peptides in median RT decile (middle)
comment: Estimates peak widths at the middle of the gradient
synonym: C-6A
name: Peptide elution order trailing
def: max diff in first N(a, b)/10 peptides; Fraction of extra early (hydrophilic) eluters; Peptide elution order can be used to measure elution differences early (hydrophilic) and late (hydrophobic) in the chromatographic gradient.
comment: set metric for replicates
synonym: C-6B
name: Peptide elution order tailing
def: max diff in last N(a, b)/10 peptides; Fraction of extra late (hydrophobic) eluters
comment: set metric for replicates
synonym: C-7A
name: Interval Acquisition range RT
def: What is the lowest and the highest sampled RT set to in the machine settings?
comment: ?
synonym: C-7B
name: Interval RT-Duration
def: What is the highest scan time observed minus the lowest scan time observed?
comment: ?
synonym: C-8A
name: total ion current chromatogram; Intensity collection
def: the chromatogram in total
note: TIC is already found in psi-ms.obo (MS:1000235); how should a comment be added here?
comment: ?
synonym: C-8B
name: area under total ion current chromatogram Total intensity
def: TIC AUC
comment: ?
synonym: C-8C
name: area under total ion current chromatogram quartiles Intensity Q1-Q4
def: TIC AUC quartiles
comment: ?
synonym: C-8D
name: TIC quartile in relation to RT duration RT-TIC Q1-Q4
def: The interval when the first, .., last percentile of TIC accumulates divided by RT-Duration
comment: ?
synonym: C-9
name: pump pressure chromatogram
def: list of pressure measurements ordered by RT
is_a: UO:0000110 ! pascal
comment: ?
Metric for number of drops in chromatograms (/histograms)
synonym: C-10A
name: MS1-TIC-Change Q2-Q4
def: The log ratio for second, .., last percentile of TIC changes over first percentile of TIC changes ? log ratio
comment: ?
synonym: C-10B
name: MS1-TIC Q2-Q4
def: The log ratio for second, .., last percentile of TIC over first percentile of TIC ? log ratio
comment: ?
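
To make two of the simpler definitions above concrete, here is a hedged sketch of C-7B (RT-Duration) and C-8B (TIC AUC). It assumes the TIC is available as parallel lists of scan times and intensities and that the area is taken with the trapezoidal rule; neither assumption is stated in the suggestions themselves.

```python
def rt_duration(scan_times):
    """C-7B: highest observed scan time minus lowest observed scan time."""
    return max(scan_times) - min(scan_times)


def tic_auc(scan_times, intensities):
    """C-8B: area under the TIC, here approximated with the trapezoidal rule (an assumption)."""
    points = sorted(zip(scan_times, intensities))
    area = 0.0
    for (t0, i0), (t1, i1) in zip(points, points[1:]):
        area += (t1 - t0) * (i0 + i1) / 2.0
    return area


# Toy values, not real data.
times = [0.0, 1.0, 2.0]
tic = [0.0, 2.0, 0.0]
print(rt_duration(times), tic_auc(times, tic))
```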

Refine qcML schema changes (0.0.8 to now ) avoiding ambiguities and bloat

This issue is to collect discussions and points of improvements for the current (0.0.8 to now) schema changes, some of which have already been identified in the monthly telcos by @bittremieux and @mwalzer.

New batch of CV term requests, category: Spectrum acquisition

Name:
"Number of chromatograms"

Definition:
"Number of chromatograms"

Comment: A lower number of chromatograms acquired during one sample run compared to similar runs can indicate mismatched instrument settings or issues with the instrumentation.

Proposed value type:

single value
{STATO:0000047}

Name:
"Number of MS1 spectra"

Definition:
"Number of MS1 spectra"

Comment: A lower number of MS1 spectra acquired during one sample run compared to similar runs can indicate mismatched instrument settings or issues with the instrumentation.

Proposed value type:

single value
{STATO:0000047}

Name:
"Number of MS2 spectra"

Definition:
"Number of MS2 spectra"

Comment: A lower number of MS2 spectra acquired during one sample run compared to similar runs can indicate mismatched instrument settings, issues with the instrumentation, or unusually low levels of ions collectable for MS/MS.

Proposed value type:

single value
{STATO:0000047}

Name:
"MZ acquisition range"

Definition:
"Upper and lower limit of m/z values at which spectra are recorded."

Comment: Acquisition levels can be used as a criterion to assess the comparability of instrument settings between runs.

Proposed value type:

n-tuple
{MS:1000040,MS:1000040}

Name:
"RT acquisition range"

Definition:
"Upper and lower limit of time at which spectra are recorded."

Comment: Acquisition levels can be used as a criterion to assess the comparability of instrument settings between runs.

Proposed value type:

n-tuple
{UO:0000010,UO:0000010}

Name:
"Fastest frequency for MS level 1 collection"

Definition:
"Fastest frequency for MS level 1 collection"

Comment: Spectrum acquisition frequency can be used to gauge the suitability of used instrument settings for the sample content used.

Proposed value type:

single value
{UO:0000106}

Name:
"Fastest frequency for MS level 2 collection"

Definition:
"Fastest frequency for MS level 2 collection"

Comment: Spectrum acquisition frequency can be used to gauge the suitability of used instrument settings for the sample content used.

Proposed value type:

single value
{UO:0000106}

Name:
"Slowest frequency for MS level 1 collection"

Definition:
"Slowest frequency for MS level 1 collection"

Comment: Spectrum acquisition frequency can be used to gauge the suitability of used instrument settings for the sample content used.

Proposed value type:

single value
{UO:0000106}

Name:
"Slowest frequency for MS level 2 collection"

Definition:
"Slowest frequency for MS level 2 collection"

Comment: Spectrum acquisition frequency can be used to gauge the suitability of used instrument settings for the sample content used.

Proposed value type:

single value
{UO:0000106}

Name:
"Precursor intensity range"

Definition:
"Minimum and maximum precursor intensity recorded."

Comment: The intensity range of the precursors informs about the dynamic range of the acquisition.

Proposed value type:

n-tuple
{MS:1003085,MS:1003085}
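
As an illustration of how the counts and ranges requested above might be derived, the following sketch assumes each spectrum is a dictionary with the hypothetical keys `ms_level`, `rt`, `mz_min` and `mz_max`. It reports observed ranges; whether the terms should hold observed values or instrument settings is left open by the definitions.

```python
def acquisition_summary(spectra):
    """Summarise spectrum counts and observed acquisition ranges for one run."""
    rts = [s["rt"] for s in spectra]
    return {
        "number_of_ms1_spectra": sum(1 for s in spectra if s["ms_level"] == 1),
        "number_of_ms2_spectra": sum(1 for s in spectra if s["ms_level"] == 2),
        # Ranges are (lower, upper) tuples, matching the proposed n-tuple value type.
        "rt_acquisition_range": (min(rts), max(rts)),
        "mz_acquisition_range": (
            min(s["mz_min"] for s in spectra),
            max(s["mz_max"] for s in spectra),
        ),
    }


# Toy spectra, not real data.
example = [
    {"ms_level": 1, "rt": 1.0, "mz_min": 300.0, "mz_max": 1500.0},
    {"ms_level": 2, "rt": 1.2, "mz_min": 100.0, "mz_max": 1200.0},
    {"ms_level": 2, "rt": 1.4, "mz_min": 110.0, "mz_max": 1300.0},
]
print(acquisition_summary(example))
```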

New batch of CV term requests, category: Identified signal

Name:
Explained precursor intensity first quarter

Definition:
"Fraction of identified MS2 in the first quarter of precursor intensity."

Comment: Higher fractions of identified MS2 spectra indicate the efficiency of detection and sampling

Proposed value type:

single value

Synonym: MS2-4A

Name:
Explained precursor intensity second quarter

Definition:
"Fraction of identified MS2 in the second quarter of precursor intensity."

Comment: Higher fractions of identified MS2 spectra indicate the efficiency of detection and sampling

Proposed value type:

single value

Synonym: MS2-4B

Name:
Explained precursor intensity third quarter

Definition:
"Fraction of identified MS2 in the third quarter of precursor intensity."

Comment: Higher fractions of identified MS2 spectra indicate the efficiency of detection and sampling

Proposed value type:

single value

Synonym: MS2-4C

Name:
Explained precursor intensity fourth quarter

Definition:
"Fraction of identified MS2 in the fourth quarter of precursor intensity."

Comment: Higher fractions of identified MS2 spectra indicate the efficiency of detection and sampling

Proposed value type:

single value

Synonym: MS2-4D

Name:
Extent of identified precursor intensity

Definition:
"Ratio of 95th over 5th percentile of precursor intensity for identified peptides"

Comment: Can be used to approximate the dynamic range of signal

Proposed value type:

single value
{STATO:0000300}

Synonym: MS1-3A

Name:
"Precursor intensity range of identified MS2"

Definition:
"Minimum and maximum precursor intensity recorded and identified."

Comment: The intensity range of the identified precursors informs about the dynamic range of the acquisition.

Proposed value type:

n-tuple
{MS:1003085,MS:1003085}

Name:
Precursor intensity of identified MS2 Q1, Q2, Q3

Definition:
"From the distribution of precursor intensity of identified MS2, the quartiles Q1, Q2, Q3"

Comment: The (un)identified precursor intensity distribution can aid the interpretation of overall identification success.

Proposed value type:

n-tuple
{STATO:0000167, STATO:0000574, STATO:0000170}

Name:
Precursor intensity of unidentified MS2 Q1, Q2, Q3

Definition:
"From the distribution of precursor intensity of unidentified MS2, the quartiles Q1, Q2, Q3"

Comment: The (un)identified precursor intensity distribution can aid the interpretation of overall identification success.

Proposed value type:

n-tuple
{STATO:0000167, STATO:0000574, STATO:0000170}

Name:
Median S/N for MS1 spectra in RT range in which half of the peptides are identified

Definition:
"Median S/N for MS1 spectra in RT range in which half of the peptides are identified"

Comment: Higher MS1 S/N may correlate with higher signal discrimination

Proposed value type:

single value
{STATO:0000300}

Synonym: MS1-2A

Name:
Median of TIC values in the RT range in which the first half of peptides are identified

Definition:
"Median of TIC values in the RT range in which the first half of peptides are identified"

Comment: Estimates the total absolute signal for peptides (may vary significantly between instruments)

Proposed value type:

single value

Synonym: MS1-2B

Name:
Median S/N for MS1 spectra in the shortest RT range in which half of the peptides are identified

Definition:
"Median S/N for MS1 spectra in the shortest RT range in which half of the peptides are identified"

Proposed value type:

single value

Name:
Median of TIC values in the shortest RT range in which half of the peptides are identified

Definition:
"Median of TIC values in the shortest RT range in which half of the peptides are identified"

Proposed value type:

single value

Name:
Explained base peak intensity median

Definition:
"Median of the ratio of 'max survey scan intensity' over 'sampled precursor intensity' for all peptides identified"

Comment: Gives insight into the amount of overall explained signal and whether the amount of signal could be increased by a better sampling strategy.

Proposed value type:

single value
{STATO:0000574}

Name:
Explained base peak intensity median from least intense 50% base peaks

Definition:
"Ratios of 'max survey scan intensity' over sampled precursor intensity for the bottom half (by MS1 max) of MS2"

Comment: Gives insight into the amount of explained signal and whether the sampling strategy is interfering with the sampling of low abundant peaks.

Proposed value type:

single value
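
The quartile and percentile-ratio terms in this batch could be computed roughly as sketched below. The sketch uses Python's `statistics.quantiles` with the inclusive method; the requests do not prescribe an interpolation method, so that choice is an assumption, as are the function names.

```python
import statistics


def intensity_quartiles(intensities):
    """Return (Q1, Q2, Q3) of a precursor intensity distribution."""
    q1, q2, q3 = statistics.quantiles(intensities, n=4, method="inclusive")
    return q1, q2, q3


def intensity_extent(intensities):
    """Ratio of the 95th over the 5th percentile of precursor intensity."""
    # n=20 yields 19 cut points at 5%, 10%, ..., 95%.
    cuts = statistics.quantiles(intensities, n=20, method="inclusive")
    return cuts[-1] / cuts[0]


# Toy values, not real data.
print(intensity_quartiles([1, 2, 3, 4, 5]))
print(intensity_extent(list(range(1, 22))))
```

Different tools use different quantile interpolation rules, so the CV definition would need to fix one for the values to be comparable.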

[Schema] Terminology: should mzQC use "Accession" or "Term" to describe something in a controlled vocabulary?

In the terminology world, unique elements in controlled vocabularies are generally referred to as terms. The word "accession" is used in the bio and library community, generally for libraries and collections that change as the world being studied changes (new books are published, genes are identified, proteins are identified).

At present, mzQC uses "accession" in many places where, as a terminologist, I'd expect to see "term". I'm raising this issue to flag that we may want a discussion around which term (sorry!) to adopt.

Raw file information

What information should be stored about the raw file(s) used to generate the QC data?

Currently the RawFile element supports the following attributes:

A mandatory id and name attribute to identify the file. Should name be the original filename? Should it include the file extension or not? We should probably specify this in the documentation.
A mandatory location attribute as a URI.
An optional methodFile_ref reference to a methods file. The example given is a TraML file used for SRM analysis. When would this reference need to be used instead of only having an entry for the file in question as another external file?

And the following child elements:

An optional ExternalFormatDocumentation with a URI pointing to additional documentation. As remarked previously it was unclear what this should contain and we decided to remove it (db7c7de).
FileFormat as a cvParam. This imposes barely any restrictions in the XML schema, so probably external validation should happen to verify that a specific CV item denoting a file format is used here?
Optional userParam and cvParam elements that can contain any other information.

What is the basis to put information as an attribute or an element? For example, both location and ExternalFormatDocumentation are simple URIs. There is of course a difference in importance between these 2 URIs, partially indicated by the mandatory/optional status.

What information is still missing here?

As identified in previous teleconferences (579a614, 2607c93), we should include a checksum to allow verification that the file hasn't changed.
A RawFile can contain information about any type of input file. A flag should be included to specify what kind of file this is. (579a614)
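
A checksum of the kind proposed above could be produced as in the following sketch. SHA-256 is used purely as an example; the teleconference notes referenced here do not name an algorithm, and the function names are hypothetical.

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def file_is_unchanged(path, expected_hex):
    """Compare a file against the checksum recorded alongside its RawFile entry."""
    return file_sha256(path) == expected_hex.lower()
```

Reading in chunks keeps memory use flat even for multi-gigabyte raw files.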

Other issues previously mentioned

In November's teleconference it was mentioned that we need to be able to reference multiple raw files, for example for multiple files from different fractions. This is possible already as the maximum number of RawFile elements is unbounded.

[CV] units and more detailed description of how to compute metrics

After going through some metrics in the current OBO, I found some descriptions vague (at best) and/or ambiguous.
E.g. https://github.com/HUPO-PSI/mzQC/blob/master/cv/v0_1_0/qc-cv.obo#L194 says "Log ratio of ...". Log to what base? There is no universal agreement IMHO. Writing LogN or Log2 would be desirable and would make QC values more comparable.

Also, most metrics could use a unit (e.g. for RT etc...). Is there a way to restrict (or even fix) the unit to be used?

Also, can we put restrictions on the values? E.g. fractions must lie between 0 and 1, and multiple reported fractions must sum to 1.

The main benefit of providing a tighter description is a) better interpretability of the data and b) better comparability across tools which compute these metrics.
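
To illustrate both points, the sketch below shows how much the log base changes a value and how the proposed restrictions on fractions could be checked. The function names and the tolerance are assumptions made for this example.

```python
import math


def check_fraction(value):
    """A fraction must lie in the closed interval [0, 1]."""
    return 0.0 <= value <= 1.0


def check_fractions_sum_to_one(values, tolerance=1e-9):
    """Multiple reported fractions must each be valid and sum to 1 (within a tolerance)."""
    return all(check_fraction(v) for v in values) and math.isclose(
        sum(values), 1.0, abs_tol=tolerance
    )


# The same ratio gives different numbers depending on the (unstated) log base.
ratio = 8.0
print(math.log2(ratio), math.log(ratio), math.log10(ratio))
print(check_fractions_sum_to_one([0.25, 0.25, 0.5]))
```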

Rename "cv" to "controlledVocabulary" in the JSON

The term "CV" is already well-known in the bioinformatics community as Coefficient of Variation, which is also a term associated with quality control. I would strongly recommend not overloading the abbreviation, and I would strongly recommend not using abbreviations in a standard exchange format (we studiously avoided abbreviations when standardising OWL, for example).

May I request renaming the "cv" element in the JSON to "controlledVocabulary" to avoid this collision and consequent cognitive dissonance?

changelog from 0.0.8

We need a changelog for the schema changes from 0.0.8

Multiple values encoding

Several ways have been suggested to encode a QC metric containing multiple values (i.e. quartiles, deciles, ...).

Option 1: JSON list

<qualityParameter ID="METRIC003" cvRef="PSI-QC-CV" accession="C-8C" name="area under total ion current chromatogram quartiles Intensity Q1-Q3">
    <content cvRef="PSI-QC-CV" accession="list" value='3'>{'UO:0000189':[11235567,49344566,98047696]}</content>
</qualityParameter>

Option 2: key-value pairs

<qualityParameter ID="METRIC002_1" cvRef="PSI-QC-CV" accession="C-8D" name="TIC quartile in relation to RT duration RT-TIC Q1-Q3">
    <content cvRef="PSI-QC-CV" accession="C-8D-json" value='4'>{Q1:0.27,Q2:0.4,Q3:0.74}</content>
</qualityParameter>

The previous consensus was that JSON lists are preferred.

As JSON is very flexible we should carefully specify what is and isn't allowed.

For example: how will we specify the number of elements in a list? Will the CV terms explicitly state "quartile" to indicate that 4 values are required or not? Previously we mentioned that the CV terms won't explicitly contain tags such as "Q1", "Q2", ... for quartiles etc., but that the number of quantiles will be derived from the number of values the metric takes. (0b134a6) In this case it will be crucial that this is stored consistently so that a correct interpretation is possible.

This is also important for file validation, which will need to be done externally.
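
A rough sketch of external validation for the list encoding (Option 1) follows. It assumes the content is strict JSON with double-quoted keys (the example above uses single quotes, which a strict JSON parser rejects) and that the payload is a single unit accession mapped to a list of numbers. The number of quantiles is then derived from the list length, as described above.

```python
import json


def parse_metric_list(content):
    """Parse '{"<unit accession>": [v1, v2, ...]}' and return (unit, values)."""
    payload = json.loads(content)
    if not isinstance(payload, dict) or len(payload) != 1:
        raise ValueError("expected an object with exactly one unit key")
    (unit, values), = payload.items()
    if not isinstance(values, list) or not values:
        raise ValueError("expected a non-empty list of values")
    for value in values:
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError("expected only numeric values")
    return unit, values


unit, values = parse_metric_list('{"UO:0000189": [11235567, 49344566, 98047696]}')
print(unit, len(values))
```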

Style-guide for our documents (incl. ontology)

Having a reference for how each variable, name, and element is stylised in each context will help with consistency and with understanding context when getting familiar with the format, implementing a reader/writer, adding new metrics, and spotting or discussing issues.

Add new metric: peak width

FWHM or what?

Original issue reported on code.google.com by [email protected] on 3 Sep 2013 at 3:20

New metrics

Add new metrics. I.e. define according CV terms.
Check back http://code.google.com/p/qcml/wiki/PotentialCVterms for missing 
parameters from NIST-CPTAC/QuaMeter, SYMPATICQO, ...

Original issue reported on code.google.com by [email protected] on 5 Feb 2013 at 2:52

Introducing comments to the CV entries

To better understand the interpretation of the CV entries, or rather the values in the mzQC file itself, I would like to introduce comments to the obo files.
This will be in the next pull request (containing the entries from issue #52 ) and needs to be done for all metrics. Help is very welcome for this.

[CV] CV term id prefixes

The id prefix for the first three terms in the CV is MS: rather than QC:. Is this intentional @julianu? I guess there's no rule against that, but to be consistent it seems logical to prefix all ids with QC: instead.

[CV] PSI-MS relations to our entries (see #53)

@dtabb73 added PSI-MS relations to the ID free metrics, e.g. relationship: has_relation MS:1000041 ! charge state.
attached file from #53

Duplicate information

As discussed in our recent teleconference (f52d365) we should compile a list of minimum information that is necessary for QC analyses, even though that information might already be stored in secondary files.

Duplicating that information directly into the qcML files will facilitate (re)running QC analyses and simplify pipelines, as they no longer have to keep track of (all) prior files.

Current information that would be useful as identified during our teleconference is:

Identification information: database, species, source (Uniprot), date.

Please discuss and specify further useful information.

Sync examples and CV, spot in mzQC

We still need to add many CV terms, keep our examples in-line, and have a way to spot offending entries in mzQC documents with a semantic validator. (Also, elaborate on what is 'offending'.)

[CV] Issue with cv-term values and the use of cv in mzQC json

For example, corresponding lists are defined as "lists with n elements each, n specified by the CV param value".
If the CV value is n, then where do the lists go? This contradicts how we use, e.g., id: QC:4000050
name: XIC-WideFrac

retention time deviation

Add new metric: retention time deviation for a specific feature (sequence)

Original issue reported on code.google.com by [email protected] on 3 Sep 2013 at 2:34

Attachment is derived from CVParamType

Attachment is derived from CVParamType,
but I'm not sure this is really needed.

Maybe for values that are not that important and have no cv.
This would be described in a specification doc.

Original issue reported on code.google.com by [email protected] on 6 May 2013 at 12:43

[CV] new CV column / attribute: type of MS experiment

In the current CV term collection (the separator-delimited file now suffixed '.md') it may also be interesting to report for which MS experiment types (e.g. DDA high-throughput MS/MS or SRM) a quality metric is useful (as Wout pointed out in the QC announcement manuscript, the total number of acquired MS/MS spectra is not so helpful for SRM). Adding such a column / attribute does not yet decide whether this will also be included in a final CV collection / ontology, stated as text in a QC format specification document, or modeled by a rule in a semantic validator.

New version and structure of repository

In #44 I created a pull request for the version 0.1.0 for our CV.
Besides starting all terms from scratch, the CV imports the 0.0.8-legacy version for backward compatibility. We should highlight somewhere (at least on the 1.0.0 release) that all terms below QC:4...... are deprecated and should no longer be used. This could also easily be reflected in the legacy-cv later on.

As we should also have a stable link position for the current version of the CV, the file "qcML-development/cv/qc-cv.obo" was created, which should in the future always represent the newest active version. The legacy version resides alongside it.

I think this also reflects your ideas @mwalzer, right?

Please comment or merge the PR.

Request for new CV entries (part 2)

Oops sorry, I closed the previous discussion by mistake! I was asking if I could create a branch to add the new QCCV entries we are adding to the new QCloud website.

Thanks!

Semantic validation

Things that have to be validated after the syntactic validation outside of the JSON Schema:

Warnings:

If a non-public controlled vocabulary is specified.
Metric names should match the definition in the CV.

Please add any other checks that I might have missed at the moment.
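
The two warnings listed above could be implemented along these lines. The sketch assumes metrics are dictionaries carrying the `cvRef`, `accession` and `name` members, and that the CV has already been loaded into a plain accession-to-name mapping; the loading step and the set of public CVs are placeholders.

```python
def semantic_warnings(metrics, cv_names, public_cvs):
    """Collect warnings that syntactic (JSON Schema) validation cannot express."""
    warnings = []
    for metric in metrics:
        accession = metric["accession"]
        # Warning 1: a non-public controlled vocabulary is referenced.
        if metric["cvRef"] not in public_cvs:
            warnings.append(f"{accession}: CV '{metric['cvRef']}' is not a known public CV")
        # Warning 2: the metric name differs from the name defined in the CV.
        expected = cv_names.get(accession)
        if expected is not None and expected != metric["name"]:
            warnings.append(
                f"{accession}: name '{metric['name']}' does not match CV name '{expected}'"
            )
    return warnings


# Illustrative inputs; "LOCAL" stands in for a hypothetical private CV.
cv_names = {"QC:4000068": "Ambient humidity"}
metrics = [
    {"cvRef": "QC", "accession": "QC:4000068", "name": "Ambient humidity"},
    {"cvRef": "LOCAL", "accession": "QC:4000068", "name": "Humidity"},
]
for line in semantic_warnings(metrics, cv_names, public_cvs={"QC", "MS", "UO"}):
    print(line)
```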

extend sample files and datasets

An extended sample file (or files) covering all features of qcML
is needed as an implementation aid and for testing.

Original issue reported on code.google.com by [email protected] on 6 Sep 2013 at 8:08

metrics spanning multiple files

Decide on the way to represent metrics spanning multiple files

related to #35

[CV] 'QC Metric type' category is confusing

I fear the parent term (and CV term category) QC Metric type might be confusing. All subterms describe value types of metrics, not metric types.

[CV] Need for new value-types in json

Until now, each term in the CV has had an xref such as xref: value-type:xsd:int "The allowed value-type for this CV term."
These value types are still XML-specific.
Do we need to specify them in JSON, and if so, which value types do we have there?

MS1-Count and MS2-Count are missing! URGENT!

I was looking for the CV codes for my IDFree MS1-Count and MS2-Count metrics, but the CV seems to lack these. Please help! I am only a PI and cannot fix this myself.

Encapsulate table in an Attachment

Adjust Attachment so that a table is encapsulated by a table element. For example:

<Attachment id=...>
<table>
<TableColumnTypes>table row values</TableColumnTypes>
<TableRowValues>1 2 3</TableRowValues>
...
<TableRowValues>n-2 n-1 n</TableRowValues>
</table>
</Attachment>

Original issue reported on code.google.com by [email protected] on 6 Sep 2013 at 11:54
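
A reader for the proposed layout could look roughly like this. It assumes whitespace-separated column names and row values, as the example above suggests; the attachment id and the column names in the sample are made up.

```python
import xml.etree.ElementTree as ET

# Sample following the proposed structure; id and column names are illustrative.
SAMPLE = """<Attachment id="att1">
<table>
<TableColumnTypes>a b c</TableColumnTypes>
<TableRowValues>1 2 3</TableRowValues>
<TableRowValues>4 5 6</TableRowValues>
</table>
</Attachment>"""


def read_table(xml_text):
    """Return (columns, rows) from an Attachment that wraps a table element."""
    root = ET.fromstring(xml_text)
    table = root.find("table")
    if table is None:
        raise ValueError("Attachment contains no table element")
    columns = table.findtext("TableColumnTypes", default="").split()
    rows = [(row.text or "").split() for row in table.findall("TableRowValues")]
    for row in rows:
        if len(row) != len(columns):
            raise ValueError("row length does not match the number of columns")
    return columns, rows


print(read_table(SAMPLE))
```

With the wrapping element in place, a parser can locate the table directly instead of guessing which children of the Attachment belong to it.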

qcML logo

We need a logo! (for the files and the website)

ProteomeXchange interfacing

Investigate ability of PRIDE to handle QCML as part 
of ProteomeXchange submissions. 
Which would be feasible integrations?

Original issue reported on code.google.com by [email protected] on 5 Feb 2013 at 2:48

database implementation availability

Make a reference database implementation available.
Potentially in /trunk/[reference DB implementation name]/ ?

Original issue reported on code.google.com by [email protected] on 5 Feb 2013 at 2:40

Add new metric: sequence coverage

certain Proteins?

Original issue reported on code.google.com by [email protected] on 3 Sep 2013 at 3:17

availability in software packages

Provide documentation on QCML availability in various software packages.
(Bioconductor proteomics mass spec package?)

Original issue reported on code.google.com by [email protected] on 5 Feb 2013 at 2:43

[CV] Potential MS2 Acquisition QC Metric

We can determine how well dynamic exclusion settings are set for an instrument if we can measure where an MS2 spectrum is acquired relative to the apex of the feature it belongs to. This can be measured as a percent of max intensity or distance to apex.

Poor performance can be denoted as greatest distance from apex and could indicate a change of dynamic exclusion parameters.
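
Both proposed measures can be sketched as follows, assuming a feature's elution profile is given as a list of (RT, intensity) points and that the intensity at the MS2 trigger time is approximated by the nearest profile point. The function and key names are hypothetical.

```python
def apex_sampling(feature_trace, ms2_rt):
    """Describe where an MS2 spectrum was acquired relative to its feature's apex."""
    apex_rt, apex_intensity = max(feature_trace, key=lambda point: point[1])
    # Approximate the feature intensity at the MS2 time by the nearest trace point.
    _, nearest_intensity = min(feature_trace, key=lambda point: abs(point[0] - ms2_rt))
    return {
        "distance_to_apex": ms2_rt - apex_rt,  # negative means sampled before the apex
        "percent_of_max": 100.0 * nearest_intensity / apex_intensity,
    }


# Toy elution profile, not real data.
trace = [(10.0, 100.0), (11.0, 400.0), (12.0, 200.0)]
print(apex_sampling(trace, ms2_rt=12.0))
```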

[CV] description broken for "MS2 density per quantile"

The description for the term QC:4000062 "MS2 density per quantile" is broken.
Could you please provide the correct description @dtabb73 ?

[Schema] cvParameter definition is referencing itself for a member definition (unit)

Should we really do this? Maybe a cvTerm accession would do.

	"cvParameter": {
		"description": "Base element containing a reference to a controlled vocabulary.",
		"type": "object",
		"properties": {
			"cvRef": {
				"description": "Reference to the CV that contains the parameter definition.",
				"type": "string",
				"pattern": "^[A-Z]+$"
			},
			"accession": {
				"description": "Accession number identifying the parameter within its CV.",
				"type": "string",
				"pattern": "^[A-Z]+:[0-9]{7}$"
			},
			"name": {
				"description": "Name of the CV element describing the parameter.",
				"type": "string"
			},
			"description": {
				"description": "Description of the parameter.",
				"type": "string"
			},
			"value": {
				"description": "Value of the parameter."
			},
			"unit": {
				"description": "A CV element describing the unit of the parameter value.",
				"anyOf": [
					{
						"$ref": "#/definitions/cvParameter"
					},
					{
						"type": "array",
						"minItems": 1,
						"items": {
							"$ref": "#/definitions/cvParameter"
						}
					}
				]
			}
		},
		"required": ["cvRef", "accession", "name"]
	},
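
To show what the self-reference implies for an implementation, here is a minimal hand-written check that mirrors the definition above without any schema library. Because `unit` is itself a cvParameter (or a list of them), the check has to recurse. The function name is an assumption; the patterns are copied from the schema.

```python
import re

CV_REF = re.compile(r"^[A-Z]+$")
ACCESSION = re.compile(r"^[A-Z]+:[0-9]{7}$")


def is_valid_cv_parameter(param):
    """Check required members, patterns, and (recursively) the optional unit."""
    if not isinstance(param, dict):
        return False
    for key in ("cvRef", "accession", "name"):
        if not isinstance(param.get(key), str):
            return False
    if not CV_REF.fullmatch(param["cvRef"]) or not ACCESSION.fullmatch(param["accession"]):
        return False
    if "unit" not in param:
        return True
    unit = param["unit"]
    units = unit if isinstance(unit, list) else [unit]
    # An array of units needs at least one item (minItems: 1); each unit is a cvParameter.
    return len(units) >= 1 and all(is_valid_cv_parameter(u) for u in units)


second = {"cvRef": "UO", "accession": "UO:0000010", "name": "second"}
metric = {"cvRef": "QC", "accession": "QC:4000068", "name": "Ambient humidity", "unit": [second]}
print(is_valid_cv_parameter(metric))
```

Nothing in the schema stops a unit from carrying its own unit, which is the kind of nesting a plain accession reference would rule out.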
