Giter VIP home page Giter VIP logo

criteria2query's Introduction

Criteria2Query

[In Development] An application to parse freetext inclusion criteria and produce a structured cohort definition that can be executed against OMOP CDM

Criteria2Query Introduction

Criteria2Query instruction

Information Extraction

Name Entity Recognition

We implemented our NER methods based on a sequence labeling method, Condition Random Fields (CRF), in CoreNLP with an empirical feature set. After NER, all entities were extracted from free-text criteria with predicted categories assigned automatically.

Relation Extraction

Our pipeline implements binary relation extraction with two relationships: has_temp (temporal) and has_value (Table 2). Relations between entities are determined by reachability according to enhanced++ English universal dependency parsing results.

Logic Detection

We developed a logic detection step following the information extraction pipeline to resolve the logic operators connecting clinical entities. Our heuristic method uses the conjunct tags in enhanced English universal dependency parsing results to group the entities and decompose the logic relations between entities and groups.

Query Formulation

Entity Normalization

We wrapped a lucene-based OMOP mapping tool called Usagi as a web service that queries entity terms and their domains to map terms to OMOP standard concepts. Using OHDSI APIs , we leverage the rich hierarchical relations among concepts in the OMOP CDM to include all descendants for condition concepts and all drugs sharing the same ingredient for drug concepts.

Logic Translation

We developed a logic translation component in Criteria2Query to translate logic within structured criteria to the target data model. In cohort definitions in the OMOP CDM, the logic relations of “And” and “Or” are represented by the templates “have all of the following criteria” and “have any of the following criteria”, respectively. Exclusion criteria are represented by “with exactly 0 using all occurrences”.

Attribute Normalization

We adapted a library for recognizing and normalizing time expressions, SUTime, to standardized temporal expressions into TIMEX3 format first. We then use regular expressions to transform temporal information in TIMEX3 format into the target CDM format. We also developed a heuristic method for the numeric normalization using regular expressions to fill the results in the target format. Both temporal and numeric attributes are linked to their related criteria based on relation extraction results.

criteria2query's People

Contributors

chiyuan1126 avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.