srdc / tofhir
Mapping toolset to migrate/transform existing datasets to HL7 FHIR
License: Apache License 2.0
For example, mvn test detects 6 test cases for the FhirMappingFolderRepositoryTest class, runs 2 of them, and prints the "All tests passed" message, ignoring the other 4 test cases.
When the log file grows, old logs are archived as a zip file. When there are logs belonging to the same execution both in the archived logs and in the last log file read, the logs could be split and the log-server gives an error.
The reload command of tofhir-engine sometimes works and sometimes does not. I simply update a mapping and run the reload command; then, when I try to run it, the old mapping is executed.
- Implement for non-stream data sources (file and SQL).
- Implement for stream data sources (Kafka).
- When the csv file is not found, an appropriate error should be logged.
- When the sink url is not reachable, an appropriate error should be logged.
- When the csv columns and schema columns do not match, an appropriate error should be logged.
When a project is deleted, we only delete the folders in the repo.
They are not deleted from the caches. When a project is deleted, we need to clear the corresponding in-memory caches as well.
Resource: https://hl7.org/fhir/R4/QuestionnaireResponse.html
Endpoint:
http://localhost:8085/tofhir/fhir-definitions?q=elements&profile=http://hl7.org/fhir/StructureDefinition/QuestionnaireResponse
onFHIR: Default configurations without common data model
toFHIR Mapping Repo: https://gitlab.srdc.com.tr/medic/coverchild-integrations
The item.answer object does not exist in the response from the toFHIR API. Likewise, item.answer.item does not exist. Paths under item.item are wrong; for example, the marked path should be item.item.linkId instead of item.linkId.
If tofhir-server has definitions-root-urls = ["http://hl7.org/fhir/"] in the application.conf file, we get the following exception:
Exception in thread "main" io.onfhir.exception.InitializationException: Some of the given infrastructure resources (http://hl7.org/fhir/StructureDefinition/clinicaldocument,http://hl7.org/fhir/StructureDefinition/Composition,http://hl7.org/fhir/StructureDefinition/catalog) of type StructureDefinition does not conform to base FHIR specification! http://hl7.org/fhir/StructureDefinition/clinicaldocument :: JObject(List((severity,JString(error)), (code,JString(invalid)), (diagnostics,JString(Invalid value 'http://terminology.hl7.org/ValueSet/v3-ConfidentialityClassification|2014-03-26' for FHIR primitive type 'canonical'!)), (expression,JArray(List(JString(snapshot.element[19].binding.valueSet)))))),JObject(List((severity,JString(warning)), (code,JString(invalid)), (diagnostics,JString(Constraint 'sdf-0' is not satisfied for the given value! Constraint Description: 'Name should be usable as an identifier for the module by machine processing applications such as code generation'. FHIR Path expression: 'name.matches('[A-Z]([A-Za-z0-9_]){0,254}')')), (expression,JArray(List(JString($this))))))
http://hl7.org/fhir/StructureDefinition/Composition :: JObject(List((severity,JString(error)), (code,JString(invalid)), (diagnostics,JString(Invalid value 'http://terminology.hl7.org/ValueSet/v3-ConfidentialityClassification|2014-03-26' for FHIR primitive type 'canonical'!)), (expression,JArray(List(JString(snapshot.element[18].binding.valueSet)))))),JObject(List((severity,JString(error)), (code,JString(invalid)), (diagnostics,JString(Invalid value 'http://terminology.hl7.org/ValueSet/v3-ConfidentialityClassification|2014-03-26' for FHIR primitive type 'canonical'!)), (expression,JArray(List(JString(differential.element[10].binding.valueSet))))))
http://hl7.org/fhir/StructureDefinition/catalog :: JObject(List((severity,JString(error)), (code,JString(invalid)), (diagnostics,JString(Invalid value 'http://terminology.hl7.org/ValueSet/v3-ConfidentialityClassification|2014-03-26' for FHIR primitive type 'canonical'!)), (expression,JArray(List(JString(snapshot.element[19].binding.valueSet)))))),JObject(List((severity,JString(warning)), (code,JString(invalid)), (diagnostics,JString(Constraint 'sdf-0' is not satisfied for the given value! Constraint Description: 'Name should be usable as an identifier for the module by machine processing applications such as code generation'. FHIR Path expression: 'name.matches('[A-Z]([A-Za-z0-9_]){0,254}')')), (expression,JArray(List(JString($this))))))
at io.onfhir.config.BaseFhirConfigurator.validateGivenInfrastructureResources(BaseFhirConfigurator.scala:190)
at io.onfhir.config.BaseFhirConfigurator.initializePlatform(BaseFhirConfigurator.scala:85)
at io.tofhir.server.service.FhirDefinitionsService.<init>(FhirDefinitionsService.scala:55)
at io.tofhir.server.endpoint.FhirDefinitionsEndpoint.<init>(FhirDefinitionsEndpoint.scala:16)
at io.tofhir.server.endpoint.ToFhirServerEndpoint.<init>(ToFhirServerEndpoint.scala:36)
at io.tofhir.server.ToFhirServer$.start(ToFhirServer.scala:15)
at io.tofhir.server.Boot$.delayedEndpoint$io$tofhir$server$Boot$1(Boot.scala:4)
at io.tofhir.server.Boot$delayedInit$body.apply(Boot.scala:3)
Currently, we have the following exceptions:
We'll implement an umbrella exception class for the whole system, and the others will extend it for specific cases, for example: FhirMappingException (exceptions during mapping execution), FhirWriteException (exceptions while communicating with the FHIR server), FhirSourceReadException (exceptions while reading source data), etc.
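A minimal sketch of such a hierarchy (in Python for illustration; the actual classes would be Scala, and the names mirror those proposed above):

```python
# Hypothetical umbrella exception hierarchy, mirroring the names proposed above.
class ToFhirException(Exception):
    """Umbrella exception for the whole system."""

class FhirMappingException(ToFhirException):
    """Raised for errors during mapping execution."""

class FhirWriteException(ToFhirException):
    """Raised for errors while communicating with the FHIR server."""

class FhirSourceReadException(ToFhirException):
    """Raised for errors while reading source data."""

# A single handler can then catch any system-level error uniformly:
try:
    raise FhirWriteException("sink URL is not reachable")
except ToFhirException as e:
    print(type(e).__name__, e)
```

The benefit is that callers can catch the umbrella type when they do not care about the specific failure mode, while logging code can still branch on the concrete subclasses.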
Profile: https://aiccelerate.eu/fhir/StructureDefinition/AIC-Practitioner
Search for the qualification element in the JSON response returned by SimpleStructureDefinitionService. It is sliced and has two elements (slices) as children: mainQualification and No Slice.
There is a problem with the No Slice section. No Slice should have 4 elements as its children, but it has 1 instead. It seems the service creates an extra element between No Slice and its children. See the image below:
As the title indicates, details of custom function libraries should be passed to the frontend to be able to provide suggestions on them.
In case of time-series source data, multiple rows should be mapped to a single FHIR resource.
As in the example below, each number in the data field may represent a different row in the source.
Note: Is this really a requirement?
...
"valueSampledData": {
  "origin": {
    "value": "0.0",
    "unit": "mg/dl",
    "system": "http://unitsofmeasure.org",
    "code": "mg/dl"
  },
  "period": "512.0",
  "dimensions": "1",
  "data": "99 103 108 114 121 128 132 137 142 148 157 192 197 201 205 208 206 198 207 171 157 143 128 115 106 103 107 103 110 122 138 154 165 170 176 184 188 188 194 198 208 211 215 212 213 216 220 225 228 231 238 239 240 244 249 252 255 256 257 257 257 254 255 258 259 260 254 244 230 214 198 185 177 180 173 174 174 176 177 176 176 174 172 170 167 165 164 162 162 161 162 159 156 153 152 148 141 143 144 147 148 146 144 144 142 142 142 141 139 137 132 130 130 125 121 105 102 100 97 95 92 90 84 84 84 82 79 77 76 74 74 75 73 73 76 77 78 78 79 79 80"
},
...
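As a sketch of the idea (in Python for illustration, with hypothetical column names time and value), the rows of a time series could be collapsed into a single SampledData payload like this:

```python
# Hypothetical source rows: one glucose measurement per row.
rows = [
    {"time": 0.0, "value": 99},
    {"time": 0.512, "value": 103},
    {"time": 1.024, "value": 108},
]

# Collapse all rows into one SampledData payload; FHIR expects the
# samples as a single space-separated string in the "data" field.
sampled_data = {
    "origin": {"value": 0.0, "unit": "mg/dl",
               "system": "http://unitsofmeasure.org", "code": "mg/dl"},
    "period": 512.0,   # assumed sampling period between rows
    "dimensions": 1,
    "data": " ".join(str(r["value"]) for r in rows),
}
print(sampled_data["data"])  # 99 103 108
```

The grouping key (which rows belong to which Observation) would have to come from the mapping definition; that part is out of scope for this sketch.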
Enable users to map certain information to FHIR Path (http://hl7.org/fhir/fhirpatch.html) or JSON Patch (https://tools.ietf.org/html/rfc6902) content, which can then be used to patch a specific existing record with the supplied values by executing the FHIR Patch interaction.
e.g. Add a condition to EpisodeOfCare as the main diagnosis as a reference via FHIR Patch
{
  "expression": {
    "name": "result",
    "language": "application/fhir-template+json",
    "value": [
      {
        "op": "add",
        "path": "/diagnosis/-",
        "value": {
          "condition": {
            "reference": "Condition/{{conditionId}}"
          }
        }
      }
    ]
  },
  "interaction": "json-patch",
  "rid": "{{episodeId}}"
}
If a user wants to delete a running job, either
Profile: http://hl7.org/fhir/StructureDefinition/bp
Search for the component element in the JSON response returned by SimpleStructureDefinitionService. It is sliced and has 5 elements (slices) as children: No Slice, SystolicBP, SBPCode, DiastolicBP, DBPCode.
The SBPCode and DBPCode slices are missing the sliceName property.
It would be nice if the mappings referenced by a mapping job were re-fetched from the file system when the mapping job is loaded again.
It is necessary to display a clear error message when there are no project folders for all definitions (e.g. schemas, mappings).
If the validation of some resources fails while executing a mapping with sinkSettings.errorHandling = halt, we do not get the FhirMappingJobResult log.
However, if we use sinkSettings.errorHandling = continue, we get it.
This issue describes how each execution of mapping jobs will be managed.
Scenarios:
Technical specs:
Sub-issues:
Available Bugs
While running a streaming mapping job, even if it does not process anything (e.g. a file), it consumes too much CPU and memory. Is there any Spark configuration to resolve this problem?
We can configure the error handling settings of a mapping job using mappingErrorHandling and sinkSettings.errorHandling options.
This gives us four combinations: (Continue, continue), (Continue, halt), (Halt, continue), and (Halt, halt).
Further, we should test each combination with both batch and streaming jobs; in total, it should work for eight different use cases.
Currently, it does not work if mappingErrorHandling is set to Halt.
Finally, we should add some tests for this functionality.
Resource: https://hl7.org/fhir/R4/Patient.html
Endpoint:
http://localhost:8085/tofhir/fhir-definitions?q=elements&profile=https://www.medizininformatik-initiative.de/fhir/core/modul-person/StructureDefinition/PatientPseudonymisiert
onFHIR: REDCap common data model
toFHIR Mapping Repo: https://gitlab.srdc.com.tr/medic/redcap-integration-mapping
Mapping: Patient mapping in erker-mapping
Path: address.extension
There is missing data on sliced extensions:
Response:
The difference is in the isArray and dataTypes fields.
We can use entries of the form (* -> <source_unit>), (<target_unit> -> <conversion_function>) to define unit conversions regardless of the source code.
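A sketch of such a lookup (in Python for illustration; the table layout and codes are assumptions), where * matches any source code:

```python
# Hypothetical conversion table:
# (source_code, source_unit) -> (target_unit, conversion_function).
# "*" as the source code means the entry applies regardless of the code.
conversions = {
    ("*", "mg/dl"): ("mmol/l", lambda v: v / 18.0),
    ("718-7", "g/dl"): ("g/l", lambda v: v * 10.0),
}

def convert(code, unit, value):
    # Prefer a code-specific entry; fall back to the wildcard entry.
    entry = conversions.get((code, unit)) or conversions.get(("*", unit))
    if entry is None:
        return unit, value  # no conversion defined
    target_unit, fn = entry
    return target_unit, fn(value)

print(convert("2345-7", "mg/dl", 90.0))  # ('mmol/l', 5.0)
```

The wildcard fallback is what makes a single (* -> mg/dl) entry cover every observation reported in mg/dl, while still allowing a code-specific entry to take precedence.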
Profile Url: http://localhost:8085/tofhir/fhir-definitions?q=elements&profile=https://aiccelerate.eu/fhir/StructureDefinition/AIC-LabResultWithinSurgicalWorkflow
code.coding is sliced. There are duplicate slices with the same name, labResultLoincCode, among its elements.
Instead of transferring the whole csv file during updates, we should be using pagination/chunking methods. Large csv files cause problems such as unnecessary data transfer, slow responses, etc.
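A sketch of the client-side chunking (in Python for illustration; the chunk size is an assumption, and sending each chunk to a server endpoint is omitted):

```python
import io

# Hypothetical chunk size; a real deployment would use something like 1 MiB.
CHUNK_SIZE = 4

def iter_chunks(stream, chunk_size=CHUNK_SIZE):
    # Yield successive fixed-size pieces of the CSV instead of
    # transferring the whole file in one request body.
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk

csv_bytes = io.BytesIO(b"id,name\n1,a\n2,b\n")
chunks = list(iter_chunks(csv_bytes))
print(len(chunks))  # 16 bytes split into 4 chunks of 4 bytes
```

Each chunk would then be uploaded in its own request, so memory use on both sides stays bounded by the chunk size rather than the file size.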
Extend data source types with a FHIR Server option. It is particularly beneficial while adapting FHIR Resources to a new version.
e.g. In MIMIC-IV, there is no date information in the diagnoses table to use as the diagnosis date, but we know in which encounter the diagnosis was made. So during the 'admissions' to Encounter mapping, if we can store the admission times per admission in a cache, we can use them during the diagnosis mapping as the date of diagnosis.
The cache mechanism will be implemented based on Redis.
Sometimes, there is no unique id in the data source. In this case, we can generate an id by combining and hashing the columns of that row.
In some scenarios, even if we combine the columns, there may still be duplicate rows in the data source. For those situations, we can add a check for duplicate ids within each batch (if the duplicates are not in the same batch, the resource with the same id is simply updated, because we are using PUT) and eliminate the duplicates.
And we can log the duplicate rows for better identification.
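A sketch of the idea (in Python for illustration, with hypothetical column names), using a SHA-256 hash over the identifying columns:

```python
import hashlib

def row_id(row, key_columns):
    # Join the identifying column values with a separator unlikely to
    # appear in the data, then hash to get a deterministic resource id.
    raw = "|".join(str(row[c]) for c in key_columns)
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()

rows = [
    {"pid": "p1", "code": "718-7", "date": "2023-01-01"},
    {"pid": "p1", "code": "718-7", "date": "2023-01-01"},  # duplicate row
]
ids = [row_id(r, ["pid", "code", "date"]) for r in rows]

# Duplicates within a batch can be detected, logged, and eliminated;
# across batches, PUT with the same id simply updates the same resource.
has_duplicates = len(ids) != len(set(ids))
print(has_duplicates)  # True
```

Because the hash is deterministic, re-running the job over the same source rows produces the same ids, which is what makes the PUT-based upsert behavior work.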
Return the list of mappings and mapping jobs that use a specific schema.
This is rather a question/suggestion, and not really an issue.
Does toFHIR read HL7 messages?
If so, how to identify the fields and sub-fields in the Schema?
If not, is it planned in the future?
It would be nice to add textual comments and descriptions to the mappings.
The writeErroneousDataset method currently infers the schema by taking a sample from erroneous records, assuming the schema of all records is the same.
Since our mapping models have only a URL field for the schema, a getSchemaByUrl service is needed.
Make sure the same URL is not created more than once.
Logs of scheduled jobs do not have all the fields that other job logs have. This causes errors while reading the execution logs, because filtering in the log-server needs fields like jobId, projectId, etc.
toFHIR throws an error when the path field contains a space character.
toFHIR supports additional Spark options exclusively for file system data sources.
For example, you can find Spark data source options for various formats, such as CSV, at https://spark.apache.org/docs/latest/sql-data-sources-csv.html#data-source-option.
This capability should be expanded to include other data source types, such as SQL and Kafka.
Custom function libraries are required for some mappings. It would be nice to provide configurations to inject required libraries while launching the toFHIR server. It could be a kind of packing configuration as the libraries are expected to be provided as external dependencies. To be decided...
ExecutionService initializes a ToFhirEngine and a FhirMappingJobManager only once by retrieving and caching the resources (mappings, schemas, etc.) from a pre-configured location. This means that it is not in sync with server repositories making updates on the resources.
Although tofhir-engine allows us to use environment variables such as DATA_FOLDER_PATH and FHIR_REPO_URL inside a mapping job definition, we cannot run such jobs using tofhir-server.
We can have a single configuration parameter such as clearCheckpointDirectory, which could be used to implement job and mapping level configurations to clear corresponding folders. The implementation could be as follows:
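A possible shape for this configuration, sketched in HOCON (the keys below are assumptions for illustration, not existing toFHIR options):

```hocon
# Job-level default (hypothetical key): clear the checkpoint
# directory before the job starts.
clearCheckpointDirectory = true

mappings = [
  {
    name = "patient-mapping"
    # Mapping-level override (hypothetical): keep this mapping's checkpoints.
    clearCheckpointDirectory = false
  }
]
```

With this layout, the mapping-level value, when present, would override the job-level default for that mapping only.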
Relates to #84
When this setting is set to halt, an error log is generated for a resource that cannot be mapped. However, there is no such error log in the case of continue.
Base Resource: http://hl7.org/fhir/StructureDefinition/Observation
Profile: https://github.com/DataTools4Heart/common-data-model/blob/main/profiles/DT4H-Electrocardiograph.StructureDefinition.json
Endpoint:
http://localhost:8085/tofhir/fhir-definitions?q=elements&profile=https://datatools4heart.eu/fhir/StructureDefinition/DT4H-Electrocardiograph
onFHIR: DT4H common data model
toFHIR Mapping Repo: https://github.com/DataTools4Heart/data-ingestion-suite
Something wrong with slices on Observation.component. Error from toFHIR server:
Currently, the frontend directly calls onfhir using the FHIR repository URL in the job definition, but this causes problems in the production environment. For example, the validation of a resource on the mapping testing page cannot call onfhir. To address this issue, we need to add proxy logic to tofhir-server that redirects the request to onfhir and returns the response to the frontend as it is.
The following has been reported while using toFHIR to load considerably large files:
"While loading files into the labresults_csv folder, we've noticed a possible improvement. It might be beneficial to include an initial log entry, such as: "#timestamp #log_level ... #file_name file has been successfully loaded for ingestion." This could help us ascertain the success of the file loading process and provide assurance during the waiting period.
Recently, we encountered a situation where we were ingesting a considerably large file. The logging process took over 30 minutes to initiate, which led to a moment of uncertainty regarding the status of the data load."
In tests, we need to include test jobs to exemplify all available config parameters.
In application.conf, we need to ensure that all possible config parameters are written into the file.
README needs to be updated to describe the use of all configuration options.
In some cases, there may be a need for update/delete operations on two different files in the same transaction. For example, when updating a terminology system file, the job files using that terminology service should be updated as well.
Since we use the file system as the repository, we have to handle transactional operations and implement some kind of rollback mechanism ourselves.
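A minimal sketch of such a rollback (in Python for illustration; the real implementation would be Scala): back up each file before overwriting it, and restore the backups if any write fails.

```python
import os
import shutil
import tempfile

def transactional_update(updates):
    """Apply (path, new_content) pairs that must succeed together."""
    backups = {}
    try:
        for path, content in updates:
            if os.path.exists(path):
                # Copy the original aside before touching it.
                backup = tempfile.NamedTemporaryFile(delete=False).name
                shutil.copy2(path, backup)
                backups[path] = backup
            with open(path, "w") as f:
                f.write(content)
    except Exception:
        # Roll back every file we had backed up, then re-raise.
        for path, backup in backups.items():
            shutil.copy2(backup, path)
        raise
    finally:
        for backup in backups.values():
            os.remove(backup)
```

This sketch does not remove files that were newly created before the failure, and it is not safe against concurrent writers; a real implementation would also need file locking or a staging-directory-and-rename scheme.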
When multiple schemas are added to a mapping and their source contexts are defined, removing a schema from the mapping does not automatically remove the associated source context in the job. This leads to conflicts between the remaining source contexts and schemas in the mapping. To resolve this, the related source context should be deleted from all jobs that include the mapping whenever a schema is removed from that mapping.
A similar issue is encountered on schema alias change .
Possible scenarios: