ncs_mdes's Introduction

NCS Navigator MDES Module

This gem provides a consistent computable interface to the National Children's Study Master Data Element Specification. Most of the data it exposes is derived at runtime from the documents provided by the National Children's Study Program Office, which are embedded in the distribution package. The balance of the data is also derived from NCS PO documentation, but is preprocessed to reduce the footprint of this library.

Use

require 'ncs_navigator/mdes'
require 'pp'

mdes = NcsNavigator::Mdes('3.0')
pp mdes.transmission_tables.collect(&:name)

For more details see the API documentation, starting with {NcsNavigator::Mdes::Specification}. (If you're not looking at this document in the API documentation, try looking at rubydoc.info.)

As of ncs_mdes 0.12.1, Ruby 1.9.3+ and JRuby 1.7+ are supported. (Ruby 1.8.x is not supported as of 0.12.0. Use 0.11.0 or earlier for Ruby 1.8.7.)

Examine

This gem includes a console for interactively analyzing and randomly poking at the MDES. It is called mdes-console:

$ mdes-console
Documents are expected to be in the default location.
$mdesNM is a Specification for N.M.
Available specifications are $mdes12, $mdes20, $mdes21, $mdes22, and $mdes30.
:001 >

It is based on Ruby's IRB. Use it to examine the loaded MDES data without repeated edit-save-run cycles:

:001 > $mdes20.transmission_tables.first.name
 => "study_center"

E.g., find all the variables of a particular XML schema type:

:002 > $mdes20.transmission_tables.collect { |t| t.variables }.flatten.select { |v| v.type.base_type == :decimal }.collect(&:name)
  => ["correction_factor_temp", "current_temp", "maximum_temp", "minimum_temp", "precision_term_temp", "trh_temp", "salts_moist", "s_33rh_reading", "s_75rh_reading", "s_33rh_reading_calib", "s_75rh_reading_calib", "precision_term_temp", "rf_temp", "correction_factor_temp", "sample_receipt_temp"]

Or the labels for a particular code list:

:003 > $mdes20.types.find { |t| t.name == 'confirm_type_cl7' }.code_list.collect(&:label)
 => ["Yes", "No", "Refused", "Don't Know", "Legitimate Skip", "Missing in Error"]

Or the number of code lists that include "Yes" as an option:

:004 > $mdes20.types.select { |t| t.code_list && t.code_list.collect(&:label).include?('Yes') }.size
 => 23

Or which variables in a table are different between two versions:

:005 > pp $mdes20['staff'].diff($mdes32['staff'])[:variables]; nil
#<NcsNavigator::Mdes::Differences::Collection:0x007fdd8bb4dff0
 @entry_differences={},
 @left_only=[],
 @right_only=["ncs_active_date", "ncs_inactive_date"]>

Develop

Writing API docs

Run bundle exec yard server --reload to get a dynamically-refreshing view of the API docs on localhost:8808.

ncs_mdes's People

Contributors

rsutphin, hannahwhy


ncs_mdes's Issues

Support tracking "omittable" vs. "nillable"

As noted in #2, there are (at least) two ways that the VDR XSD indicates whether a value is required. In semantic terms, these seem the same, but in terms of producing the VDR XML, they need to be distinguished.

Add separate predicates for these two conditions and make a variable required? only if both are false.
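A minimal sketch of how the two predicates could combine, using a stand-in `SketchVariable` struct rather than the gem's own variable class. The attribute names are hypothetical, and the mapping to the XSD (`minOccurs="0"` for omittable, `nillable="true"` for nillable) is an assumption, not something this issue confirms.

```ruby
# Stand-in for the gem's variable class; all names here are illustrative.
SketchVariable = Struct.new(:name, :omittable, :nillable) do
  # True if the element may be left out of the XML entirely
  # (assumed to correspond to minOccurs="0" in the XSD).
  def omittable?
    !!omittable
  end

  # True if the element may be present but empty
  # (assumed to correspond to nillable="true" in the XSD).
  def nillable?
    !!nillable
  end

  # Required only when the value can neither be omitted nor be nil.
  def required?
    !omittable? && !nillable?
  end
end

strict   = SketchVariable.new('var_a', false, false)
optional = SketchVariable.new('var_b', true, false)
blank_ok = SketchVariable.new('var_c', false, true)
```

Keeping the two flags separate lets an XML emitter decide between skipping an element and writing an empty one, while `required?` still answers the semantic question.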

Improve startup performance

ncs_mdes for a particular version loads all of its information from a single giant XML Schema that is parsed all at once. This means there's a large startup cost incurred when using it in a webapp. Perhaps it could serialize the information loaded from the schema into one or more easier-to-load representations, distributing those with the gem in addition to the XSDs.
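One possible shape for this, sketched with Ruby's built-in `Marshal`. The helper name `load_with_cache`, the cache file name, and the sample data are all hypothetical; the block stands in for the expensive XSD parse.

```ruby
require 'tmpdir'

# Hypothetical cache helper: loads a previously serialized structure if one
# exists, otherwise runs the (expensive) block and stores its result.
def load_with_cache(cache_path)
  if File.exist?(cache_path)
    File.open(cache_path, 'rb') { |f| Marshal.load(f) }
  else
    data = yield
    File.open(cache_path, 'wb') { |f| Marshal.dump(data, f) }
    data
  end
end

parse_count = 0
tables = nil
Dir.mktmpdir do |dir|
  path = File.join(dir, 'mdes-sketch.bin')
  2.times do
    tables = load_with_cache(path) do
      parse_count += 1 # stands in for parsing the full XSD
      { 'table_a' => %w(var_1 var_2) } # made-up data
    end
  end
end
```

Marshal output is tied to the Ruby version that wrote it and should only be loaded from trusted files, so a format such as JSON or YAML may suit files shipped inside the gem better.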

spec_blood.equip_id is not a foreign key

According to the adult blood collection instrument, equip_id is a manually entered field, not a reference to spec_equipment. Correct the FK reference mapping to reflect this.

Provide a mechanism to compare MDES versions

Add a differ that can report on changes between the computable representations of two MDES versions. It should consider:

  • New/removed tables
  • New/changed/removed variables
  • New/changed/removed types, including code lists
    • Code list changes must compare code/label pairs in both directions
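A rough sketch of the comparison for one level of the structure, here a code list held as a code-to-label hash. The result keys echo the `left_only`, `right_only` and `entry_differences` fields visible in the console output earlier in this document, but `sketch_diff` is a hypothetical helper and the sample codes are made up.

```ruby
# Compares two hashes keyed by name (or code). Not the gem's implementation.
def sketch_diff(left, right)
  changed = {}
  (left.keys & right.keys).each do |key|
    changed[key] = [left[key], right[key]] unless left[key] == right[key]
  end
  {
    :left_only => left.keys - right.keys,   # removed on the right
    :right_only => right.keys - left.keys,  # added on the right
    :entry_differences => changed           # present in both, but different
  }
end

# Code lists as code => label, so pairs are compared in both directions.
old_list = { '1' => 'Yes', '2' => 'No', '-4' => 'Missing in Error' }
new_list = { '1' => 'Yes', '2' => 'No.', '-1' => 'Refused' }
result = sketch_diff(old_list, new_list)
```

The same function could be applied to tables keyed by name and to variables within a table, nesting the results.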

Annotate instrument data tables with for-mother or for-child information

A key concept for appropriate collection and reporting of instrument data is, for those tables which include a p_id variable, understanding whose p_id is being requested. As far as we are aware, it is always either a parent's p_id or a child's p_id.

This information is not consistently available in any machine-readable form, but it is critical enough that we should have a single place to document/access it. Add a mechanism to annotate TransmissionTable records with the appropriate information. API proposal:

class TransmissionTable
  ##
  # Is this a child instrument data table? (As opposed to a parent instrument data table.)
  #
  # This determines the type of participant whose p_id goes in this table's p_id
  # variable.
  #
  # Return values:
  # * `true`: The p_id should be a child's p_id.
  # * `false`: The p_id should be a parent's p_id.
  # * `nil`: The table isn't an instrument data table, or it doesn't have a p_id
  #   variable, or the childness of the p_id isn't known.
  #
  # @return [true,false,nil]
  def child_instrument_table?
  end

  ##
  # Is this a parent instrument data table? (As opposed to a child instrument data table.)
  #
  # This determines the type of participant whose p_id goes in this table's p_id
  # variable.
  #
  # Return values:
  # * `true`: The p_id should be a parent's p_id.
  # * `false`: The p_id should be a child's p_id.
  # * `nil`: The table isn't an instrument data table, or it doesn't have a p_id
  #   variable, or the childness of the p_id isn't known.
  #
  # @return [true,false,nil]
  def parent_instrument_table?
    child = child_instrument_table?
    child.nil? ? nil : !child
  end
end

Add support for MDES 3.0 draft

We have drafts of MDES 3.0. Add support for the latest to ncs_mdes. We will upgrade to the final version once it's out.

Support MDES 3.1

MDES 3.1 has been published with an initial XSD. Add 3.1 support to this library.

Expose the specification version separate from the version

The version exposed by SourceDocuments and Specification is the common version: the version we use to talk about the MDES. VDR submissions require a more specific version, called the "specification_version". Make that available also.

NcsNavigator::Mdes shortcut does not support options

NcsNavigator::Mdes(version) is a shortcut for NcsNavigator::Mdes::Specification.new(version). The Specification constructor can also take an options hash. Allow this to be passed through from the shortcut also.
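A sketch of the pass-through, using a stand-in `SketchNavigator` namespace so it runs without the gem. The `:log` option key is an assumption made for illustration.

```ruby
require 'logger'

module SketchNavigator
  module Mdes
    # Stand-in for the real Specification; it only records its arguments.
    class Specification
      attr_reader :version, :options

      def initialize(version, options = {})
        @version = version
        @options = options
      end
    end
  end

  # Module-level method sharing the nested module's name, so that
  # SketchNavigator::Mdes('3.0', ...) reads like a constructor call.
  def self.Mdes(version, options = {})
    Mdes::Specification.new(version, options)
  end
end

spec = SketchNavigator::Mdes('3.0', :log => Logger.new($stderr))
plain = SketchNavigator::Mdes('2.0')
```

Defaulting `options` to an empty hash keeps existing one-argument calls working unchanged.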

Provide a human-readable difference report

Specification#diff and friends (from #10) provide a machine-readable difference structure. Design a human-readable corresponding report and implement a way to generate it for two arbitrary specifications.

Handle extended characters in enumerated value lists correctly

Some of the enums in the source XML have typographic dashes. We need to ensure that they are read and exposed correctly. This may mean correcting the encoding for strings provided by this library, or it may mean filtering out the extended characters.
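The two options mentioned above can be sketched with Ruby 1.9+ string encoding calls. The sample label and the choice of `-` as the replacement character are illustrative assumptions.

```ruby
# Bytes as they might arrive if the source were read without an encoding:
# "1", an en dash (E2 80 93 in UTF-8), then "2".
raw = "1\xE2\x80\x932".dup.force_encoding('ASCII-8BIT')

# Option 1: correct the encoding. The bytes are already valid UTF-8, so
# relabeling them is enough; no bytes change.
relabeled = raw.dup.force_encoding('UTF-8')

# Option 2: filter the extended characters by transcoding to ASCII and
# substituting anything that has no ASCII equivalent.
filtered = relabeled.encode('US-ASCII', :undef => :replace, :replace => '-')
```

Relabeling preserves the original typography; filtering loses it but is safer for consumers that cannot handle non-ASCII text.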

VariableType.from_xsd_simple_type does not pass options to CodeListEntry.from_xsd_enumeration

The various factory methods for the components of a Specification have a set of options that get passed from hand to hand. Currently this only contains a logger. When VariableType.from_xsd_simple_type invokes CodeListEntry.from_xsd_enumeration, it does not pass the options. This means that CodeListEntry.from_xsd_enumeration always tries to use a default logger, which fails when $stderr is not compatible with Logger.new.
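The hand-off being described can be sketched with stand-in classes. The class names, method names, and the `:log` option key below are simplified assumptions, not the gem's actual signatures.

```ruby
require 'logger'
require 'stringio'

# Stand-ins for the gem's classes, reduced to the option hand-off.
class SketchCodeListEntry
  attr_reader :value

  def self.from_enumeration(value, options = {})
    # Fall back to a default logger only when none was supplied.
    log = options[:log] || Logger.new($stderr)
    log.debug("building entry #{value}")
    new(value)
  end

  def initialize(value)
    @value = value
  end
end

class SketchVariableType
  attr_reader :entries

  def self.from_simple_type(values, options = {})
    # Forwarding options here is the fix: the entry factory sees the
    # caller's logger instead of constructing its own.
    new(values.map { |v| SketchCodeListEntry.from_enumeration(v, options) })
  end

  def initialize(entries)
    @entries = entries
  end
end

sink = StringIO.new
type = SketchVariableType.from_simple_type(%w(1 2), :log => Logger.new(sink))
```

With the options forwarded, the default `Logger.new($stderr)` is never constructed when the caller supplies a logger.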

Retrieve last-modified time of disposition code list

NCS Navigator Cases' GET /api/v1/code_lists service supports conditional get. It will eventually have to return serialized versions of NcsNavigator::Mdes::DispositionCode objects.

To satisfy the DispositionCode requirement whilst continuing to support conditional get, I'd like to have a way to determine the last-modified time of the disposition code collection. (Modification date of the entire MDES would be fine, but might be more difficult than necessary -- AFAICT, the MDES data is spread across several files.)

I was thinking along these lines:

module NcsNavigator::Mdes
  class DispositionCode
    def self.last_modified
      # returns a Time
    end
  end
end

That'd lead to a convenient expression in Cases:

class FieldCodeCollection
  include NcsNavigator::Mdes

  def last_modified
    [NcsCode.last_modified, DispositionCode.last_modified].max
  end
end

Thoughts?

ncs_mdes does not work on JRuby

Several specs fail under JRuby 1.6.4. All of them appear to be related to extracting namespaced attribute values, so I'm guessing it's an incompatibility with Nokogiri. Determine who's at fault and fix it.

/var/lib/hudson/runner-home/.rvm/rubies/jruby-1.6.4/bin/jruby -S bundle exec rspec spec/ncs_navigator/mdes_spec.rb spec/ncs_navigator/mdes/source_documents_spec.rb spec/ncs_navigator/mdes/transmission_table_spec.rb spec/ncs_navigator/mdes/variable_type_spec.rb spec/ncs_navigator/mdes/variable_spec.rb spec/ncs_navigator/mdes/specification_spec.rb spec/ncs_navigator/mdes/version_spec.rb
............................................F.......FFF.......................FFFF.FFFFF................................................................................................

Failures:

 1) NcsNavigator::Mdes::VariableType.from_xsd_simple_type#code_list when there are enumerated values has the description
    Failure/Error: subject.code_list.description.should == "Eq"
      expected: "Eq"
           got: nil (using ==)
    # org/jruby/RubyProc.java:274:in `call'
    # ./spec/ncs_navigator/mdes/variable_type_spec.rb:119:in `Mdes'
    # org/jruby/RubyKernel.java:2061:in `instance_eval'
    # org/jruby/RubyArray.java:2336:in `collect'
    # org/jruby/RubyArray.java:2336:in `collect'
    # org/jruby/RubyArray.java:2336:in `collect'
    # org/jruby/RubyArray.java:2336:in `collect'
    # org/jruby/RubyArray.java:2336:in `collect'

 2) NcsNavigator::Mdes::VariableType::CodeListEntry.from_xsd_enumeration#label is set
    Failure/Error: missing.label.should == "Missing in Error"
      expected: "Missing in Error"
           got: nil (using ==)
    # org/jruby/RubyProc.java:274:in `call'
    # ./spec/ncs_navigator/mdes/variable_type_spec.rb:179:in `Mdes'
    # org/jruby/RubyKernel.java:2061:in `instance_eval'
    # org/jruby/RubyArray.java:2336:in `collect'
    # org/jruby/RubyArray.java:2336:in `collect'
    # org/jruby/RubyArray.java:2336:in `collect'
    # org/jruby/RubyArray.java:2336:in `collect'

 3) NcsNavigator::Mdes::VariableType::CodeListEntry.from_xsd_enumeration#global_value is set
    Failure/Error: missing.global_value.should == '99-4'
      expected: "99-4"
           got: nil (using ==)
    # org/jruby/RubyProc.java:274:in `call'
    # ./spec/ncs_navigator/mdes/variable_type_spec.rb:185:in `Mdes'
    # org/jruby/RubyKernel.java:2061:in `instance_eval'
    # org/jruby/RubyArray.java:2336:in `collect'
    # org/jruby/RubyArray.java:2336:in `collect'
    # org/jruby/RubyArray.java:2336:in `collect'
    # org/jruby/RubyArray.java:2336:in `collect'

 4) NcsNavigator::Mdes::VariableType::CodeListEntry.from_xsd_enumeration#master_cl is set
    Failure/Error: missing.master_cl.should == 'missing_data'
      expected: "missing_data"
           got: nil (using ==)
    # org/jruby/RubyProc.java:274:in `call'
    # ./spec/ncs_navigator/mdes/variable_type_spec.rb:191:in `Mdes'
    # org/jruby/RubyKernel.java:2061:in `instance_eval'
    # org/jruby/RubyArray.java:2336:in `collect'
    # org/jruby/RubyArray.java:2336:in `collect'
    # org/jruby/RubyArray.java:2336:in `collect'
    # org/jruby/RubyArray.java:2336:in `collect'

 5) NcsNavigator::Mdes::Variable.from_element#pii is false when blank
    Failure/Error: variable('<xs:element ncsdoc:pii=""/>').pii.should == false
      expected: false
           got: :unknown (using ==)
    # org/jruby/RubyProc.java:274:in `call'
    # ./spec/ncs_navigator/mdes/variable_spec.rb:140:in `Mdes'
    # org/jruby/RubyKernel.java:2061:in `instance_eval'
    # org/jruby/RubyArray.java:2336:in `collect'
    # org/jruby/RubyArray.java:2336:in `collect'
    # org/jruby/RubyArray.java:2336:in `collect'
    # org/jruby/RubyArray.java:2336:in `collect'

 6) NcsNavigator::Mdes::Variable.from_element#pii is true when "Y"
    Failure/Error: variable('<xs:element ncsdoc:pii="Y"/>').pii.should == true
      expected: true
           got: :unknown (using ==)
      Diff:
      @@ -1,2 +1,2 @@
      -true
      +:unknown
    # org/jruby/RubyProc.java:274:in `call'
    # ./spec/ncs_navigator/mdes/variable_spec.rb:144:in `Mdes'
    # org/jruby/RubyKernel.java:2061:in `instance_eval'
    # org/jruby/RubyArray.java:2336:in `collect'
    # org/jruby/RubyArray.java:2336:in `collect'
    # org/jruby/RubyArray.java:2336:in `collect'
    # org/jruby/RubyArray.java:2336:in `collect'

 7) NcsNavigator::Mdes::Variable.from_element#pii is :possible when "P"
    Failure/Error: variable('<xs:element ncsdoc:pii="P"/>').pii.should == :possible
      expected: :possible
           got: :unknown (using ==)
      Diff:
      @@ -1,2 +1,2 @@
      -:possible
      +:unknown
    # org/jruby/RubyProc.java:274:in `call'
    # ./spec/ncs_navigator/mdes/variable_spec.rb:148:in `Mdes'
    # org/jruby/RubyKernel.java:2061:in `instance_eval'
    # org/jruby/RubyArray.java:2336:in `collect'
    # org/jruby/RubyArray.java:2336:in `collect'
    # org/jruby/RubyArray.java:2336:in `collect'
    # org/jruby/RubyArray.java:2336:in `collect'

 8) NcsNavigator::Mdes::Variable.from_element#pii is the literal value when some other value
    Failure/Error: variable('<xs:element ncsdoc:pii="7"/>').pii.should == '7'
      expected: "7"
           got: :unknown (using ==)
      Diff:
      @@ -1,2 +1,2 @@
      -7
      +:unknown
    # org/jruby/RubyProc.java:274:in `call'
    # ./spec/ncs_navigator/mdes/variable_spec.rb:152:in `Mdes'
    # org/jruby/RubyKernel.java:2061:in `instance_eval'
    # org/jruby/RubyArray.java:2336:in `collect'
    # org/jruby/RubyArray.java:2336:in `collect'
    # org/jruby/RubyArray.java:2336:in `collect'
    # org/jruby/RubyArray.java:2336:in `collect'

 9) NcsNavigator::Mdes::Variable.from_element#status is :active for 1
    Failure/Error: variable('<xs:element ncsdoc:status="1"/>').status.should == :active
      expected: :active
           got: nil (using ==)
    # org/jruby/RubyProc.java:274:in `call'
    # ./spec/ncs_navigator/mdes/variable_spec.rb:162:in `Mdes'
    # org/jruby/RubyKernel.java:2061:in `instance_eval'
    # org/jruby/RubyArray.java:2336:in `collect'
    # org/jruby/RubyArray.java:2336:in `collect'
    # org/jruby/RubyArray.java:2336:in `collect'
    # org/jruby/RubyArray.java:2336:in `collect'

 10) NcsNavigator::Mdes::Variable.from_element#status is :new for 2
    Failure/Error: variable('<xs:element ncsdoc:status="2"/>').status.should == :new
      expected: :new
           got: nil (using ==)
    # org/jruby/RubyProc.java:274:in `call'
    # ./spec/ncs_navigator/mdes/variable_spec.rb:166:in `Mdes'
    # org/jruby/RubyKernel.java:2061:in `instance_eval'
    # org/jruby/RubyArray.java:2336:in `collect'
    # org/jruby/RubyArray.java:2336:in `collect'
    # org/jruby/RubyArray.java:2336:in `collect'
    # org/jruby/RubyArray.java:2336:in `collect'

 11) NcsNavigator::Mdes::Variable.from_element#status is :modified for 3
    Failure/Error: variable('<xs:element ncsdoc:status="3"/>').status.should == :modified
      expected: :modified
           got: nil (using ==)
    # org/jruby/RubyProc.java:274:in `call'
    # ./spec/ncs_navigator/mdes/variable_spec.rb:170:in `Mdes'
    # org/jruby/RubyKernel.java:2061:in `instance_eval'
    # org/jruby/RubyArray.java:2336:in `collect'
    # org/jruby/RubyArray.java:2336:in `collect'
    # org/jruby/RubyArray.java:2336:in `collect'
    # org/jruby/RubyArray.java:2336:in `collect'

 12) NcsNavigator::Mdes::Variable.from_element#status is :retired for 4
    Failure/Error: variable('<xs:element ncsdoc:status="4"/>').status.should == :retired
      expected: :retired
           got: nil (using ==)
    # org/jruby/RubyProc.java:274:in `call'
    # ./spec/ncs_navigator/mdes/variable_spec.rb:174:in `Mdes'
    # org/jruby/RubyKernel.java:2061:in `instance_eval'
    # org/jruby/RubyArray.java:2336:in `collect'
    # org/jruby/RubyArray.java:2336:in `collect'
    # org/jruby/RubyArray.java:2336:in `collect'
    # org/jruby/RubyArray.java:2336:in `collect'

 13) NcsNavigator::Mdes::Variable.from_element#status is the literal value for some other value
    Failure/Error: variable('<xs:element ncsdoc:status="P4"/>').status.should == 'P4'
      expected: "P4"
           got: nil (using ==)
    # org/jruby/RubyProc.java:274:in `call'
    # ./spec/ncs_navigator/mdes/variable_spec.rb:178:in `Mdes'
    # org/jruby/RubyKernel.java:2061:in `instance_eval'
    # org/jruby/RubyArray.java:2336:in `collect'
    # org/jruby/RubyArray.java:2336:in `collect'
    # org/jruby/RubyArray.java:2336:in `collect'
    # org/jruby/RubyArray.java:2336:in `collect'

When it is working, enable JRuby in the CI build.

Add table reference (foreign key) information

To ensure referential integrity in the MDES export, we need to be able to verify that subtables refer to existing parents. Unfortunately, MDES 2.0 does not contain much in the way of explicit associations between tables. All it has are some variables tagged as primary keys and other variables tagged as foreign keys. You can infer that a foreign key with the same name as the primary key in another table is a reference to that other table, but there are PKs with the same name in different tables, so the inference is not always unambiguous.

Implement a heuristic that infers table associations in this way when they are unambiguous and complains when it finds ones that are ambiguous. Provide a manual mapping mechanism to resolve the ambiguous cases.
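The heuristic could look roughly like the following. The input shape, the function name `infer_references`, and the table and variable names in the example are all made up for illustration.

```ruby
# tables:    table name => { :pks => [...], :fks => [...] }
# overrides: [table, fk] => target table, for cases the heuristic cannot decide
def infer_references(tables, overrides = {})
  # Index each primary key name to the tables that declare it.
  pk_owners = Hash.new { |h, k| h[k] = [] }
  tables.each { |name, t| t[:pks].each { |pk| pk_owners[pk] << name } }

  refs = {}
  ambiguous = []
  tables.each do |name, t|
    t[:fks].each do |fk|
      key = [name, fk]
      candidates = pk_owners.fetch(fk, []) - [name]
      if overrides.key?(key)
        refs[key] = overrides[key]        # manual mapping wins
      elsif candidates.size == 1
        refs[key] = candidates.first      # unambiguous inference
      elsif candidates.size > 1
        ambiguous << key                  # report instead of guessing
      end
    end
  end
  [refs, ambiguous]
end

tables = {
  'person'  => { :pks => ['person_id'], :fks => [] },
  'event_a' => { :pks => ['event_id'],  :fks => [] },
  'event_b' => { :pks => ['event_id'],  :fks => [] },
  'visit'   => { :pks => ['visit_id'],  :fks => ['person_id', 'event_id'] }
}

refs, ambiguous = infer_references(tables)
fixed_refs, fixed_ambiguous =
  infer_references(tables, ['visit', 'event_id'] => 'event_a')
```

Returning the ambiguous keys rather than raising lets the caller list every unresolved case at once and then add the matching overrides.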
