eurekaclinical / aiw-i2b2-etl
Protempa i2b2 tools
License: Apache License 2.0
When doing a data load, the log contains "WARNING: No metadata for concept" messages about properties without a value set, apparently because the ETL does not know the specific value of the property ahead of time. The warning is wrong: those properties are loaded anyway. Here is an example:
WARNING: No metadata for concept edu.emory.cci.aiw.i2b2etl.dest.metadata.conceptid.PropDefConceptId@38a75842[propId=PatientDetails,propertyName=dateOfDeath,value=org.protempa.proposition.value.DateValue@512a5cc6[date=xxxx-xx-xx xx:xx:xx.x],conceptCode=<null>,metadata=edu.emory.cci.aiw.i2b2etl.dest.metadata.Metadata@347180d0]; this data will not be loaded
For example, every file states that you need an i2b2 instance to run the tests, which is false. There are throws lines for exceptions that are not thrown, and so on.
Is it actually doing anything?
On a QA (and lower-powered) Oracle instance, the following query fails with ORA-01652: unable to extend temp segment by 128 in tablespace TEMP:
SELECT A1.C_NAME, A1.VALUETYPE_CD, A1.EK_UNIQUE_ID, A3.EK_UNIQUE_ID,
  (SELECT EK_UNIQUE_ID FROM NCATS_MEDS
   WHERE C_FULLNAME = CASE WHEN SUBSTR(A1.M_APPLIED_PATH, LENGTH(A1.M_APPLIED_PATH), 1) = '%' THEN SUBSTR(A1.M_APPLIED_PATH, 1, LENGTH(A1.M_APPLIED_PATH) - 1) ELSE A1.M_APPLIED_PATH END
   AND C_SYNONYM_CD ='N' AND M_APPLIED_PATH ='@'),
  A1.C_METADATAXML
FROM NCATS_MEDS A3
JOIN EK_TEMP_UNIQUE_IDS A4 ON (A3.EK_UNIQUE_ID=A4.UNIQUE_ID) AND A3.C_SYNONYM_CD ='N' AND A3.M_APPLIED_PATH='@'
JOIN NCATS_MEDS A1 ON (A3.C_FULLNAME LIKE A1.M_APPLIED_PATH AND A1.C_SYNONYM_CD ='N' AND A1.M_APPLIED_PATH<>'@' AND A1.C_BASECODE IS NOT NULL)
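The CASE expression in the subquery strips a trailing '%' wildcard from M_APPLIED_PATH before comparing it to C_FULLNAME. As a reading aid only, the same normalization might be sketched in Java as follows. The class and method names are hypothetical and do not exist in aiw-i2b2-etl, and the sample path in the comment is made up for illustration.

```java
// Hypothetical helper, not part of aiw-i2b2-etl: mirrors the CASE expression
// in the query, which drops a single trailing '%' from an applied path.
final class AppliedPaths {
    private AppliedPaths() {}

    /**
     * Returns the applied path without its trailing '%' wildcard, if any.
     * For example, a made-up path such as "\\Meds\\Orders\\%" becomes
     * "\\Meds\\Orders\\"; a path without the wildcard is returned unchanged.
     */
    static String stripTrailingWildcard(String appliedPath) {
        if (appliedPath != null && appliedPath.endsWith("%")) {
            return appliedPath.substring(0, appliedPath.length() - 1);
        }
        return appliedPath;
    }
}
```

Only the last character is removed, matching the SUBSTR call in the query; a '%' elsewhere in the path is left alone.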
This doesn't seem to impact query results, but modifier facts for lab test results have the units of the result. This doesn't seem right.
It should have the same modifiers as the ICD-9 diagnosis ontology.
I believe what is happening is that the download date is computed by Protempa, and the update date can be computed by the data source backend. In this case, the update date ends up getting computed before the download date. This is a cosmetic issue, at least for now.
It appears that the breakdowns in the Run Query dialog in i2b2 do not work unless there is a folder with one or more child values. We need to change Vital Status in the Eureka ontology to fit this constraint. It should be:
Demographics
  Vital Status
    Known Deceased
This bug in our ontology breaks several of the breakdown options in the Run Query dialog. The JBoss log complains about Vital Status in all cases.
The project, called eurekaclinical-ontology, should generate an H2 database so that the eureka project's embedded Tomcat can just copy the database file into place rather than generate it from scratch every time.
aiw-i2b2-etl should not depend on this new ontology project.
The logic should be to delete any fact whose DELETE_DATE attribute is not null and not in the future. We need to determine the policy for a dimension record whose DELETE_DATE is not null and not in the future (cascade or not).
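The rule for facts can be written as a small predicate. This is a minimal sketch under assumptions: the class and method names are hypothetical, the delete date is modeled as a java.time.Instant, and a delete date equal to the current time is treated as "not in the future". It does not address the open question of cascading from dimension records.

```java
import java.time.Instant;

// Hypothetical helper, not part of aiw-i2b2-etl.
final class DeleteDatePolicy {
    private DeleteDatePolicy() {}

    /**
     * Returns true when a record should be deleted: its delete date is set
     * (not null) and is not after the reference time.
     */
    static boolean shouldDelete(Instant deleteDate, Instant now) {
        return deleteDate != null && !deleteDate.isAfter(now);
    }
}
```

Passing the reference time in as a parameter, rather than calling Instant.now() inside the method, keeps the check deterministic within one ETL run and easy to test.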
The ETL process loads a special "not recorded" provider record in the event that provider data is not loaded, because i2b2 requires that every fact have an associated provider. This special record is loaded with an update date, even when it is loaded the first time. This is inconsistent with the rest of the ETL, which does not set an update date unless a record has actually been updated.
[INFO] [ERROR] Failed to execute goal org.liquibase:liquibase-maven-plugin:3.4.2:update (liquibase-populate-eureka-ontology-EK_ICD10CM) on project eureka-protempa-etl: Error setting up or running Liquibase: liquibase.exception.SetupException: Error parsing line 5893 column 14 of /Users/arpost/NetBeansProjects/eureka/eureka-protempa-etl/target/eureka-config/dbmigration/eureka-ontology-EK_ICD10CM-changelog.6.xml: cvc-complex-type.2.3: Element 'insert' cannot have character [children], because the type's content type is element-only. -> [Help 1]
I saw the error while building eureka with current aiw-i2b2-etl.
Depends on #1.
in src/main/resources/dbmigration/eureka-ontology-EK_VITALS-changelog.xml. Needed to make i2b2 pop up the value threshold dialog. See how it's done in eureka-ontology-EK_LABS-changelog.xml for an example.
The main file where work is needed is src/main/resources/i2b2-data-schema-changelog.xml. Work may be needed in the Java code if SQL queries fail due to the added column, or in the stored procedure code if any stored procedures fail due to the added column. The stored procedures for Oracle and PostgreSQL are in src/main/resources/sql.
This is inconsistent with the other dimensions and how we treat observation_fact.
It walks all metadata, not just the metadata we know is derived (the Phenotypes subtree).
This functionality will set the delete date column with the value of Proposition.getDeleteDate().
This should probably be in a separate Liquibase changelog so that we can optionally insert the record if we want to.
We should support the order type modifier from the ACT ontology. This affects src/main/resources/dbmigration/eureka-ontology-EK_MED_ORDERS*.
We currently use the BIRTHDATE value for death date, because there is no deathdate attribute value in Protempa.