ccdh-terminology-service's People

Contributors

joeflack4, wdduncan

ccdh-terminology-service's Issues

Support versions

Implement version support:

CRDC-H model
GDC data dictionary
PDC data dictionary
Mappings
caDSR
NCIT

Bug: setup issue: `ModuleNotFoundError`: `linkml_model`, `linkml.utils.slot`

Description

When following the instructions in quick_start.md, I encounter ModuleNotFoundErrors when running the docker-compose up command.

Short errors

  1. ModuleNotFoundError: No module named 'linkml_model'
  2. ModuleNotFoundError: No module named 'linkml.utils.slot'

Long errors

1: linkml_model

Shell error:
ERROR: for ccdh-api Container "8b435e56d182" is unhealthy.

(cd docker && docker-compose up)
Creating network "docker_default" with the default driver
Creating volume "docker_ccdh-neo4j" with default driver
Pulling ccdh-neo4j (neo4j:4.1.3)...
4.1.3: Pulling from library/neo4j
b4d181a07f80: Pull complete
3ee45ae97306: Pull complete
567d410fadc4: Pull complete
ad7fd4930617: Pull complete
7d320042d02a: Pull complete
be5f9d589606: Pull complete
973098f6f4e2: Pull complete
603e26c5e3e5: Pull complete
Digest: sha256:4d0d4bc3e8a636f74900b0817c95921bb398efcffc646ecf87658cb9f1e9723b
Status: Downloaded newer image for neo4j:4.1.3
Creating neo4j ... done
ERROR: for ccdh-api  Container "8b435e56d182" is unhealthy.
ERROR: Encountered errors while bringing up the project.

Docker logs:

Creating a virtualenv for this project...
Pipfile: /app/Pipfile
Using /usr/local/bin/python3.8 (3.8.11) to create virtualenv...
⠋ Creating virtual environment...created virtual environment CPython3.8.11.final.0-64 in 1315ms
creator CPython3Posix(dest=/root/.local/share/virtualenvs/app-4PlAip0Q, clear=False, no_vcs_ignore=False, global=False)
seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=/root/.local/share/virtualenv)
added seed packages: pip==21.1.2, setuptools==57.0.0, wheel==0.36.2
activators BashActivator,CShellActivator,FishActivator,PowerShellActivator,PythonActivator,XonshActivator
✔ Successfully created virtual environment!
Virtualenv location: /root/.local/share/virtualenvs/app-4PlAip0Q
Loading .env environment variables...
Traceback (most recent call last):
File "/usr/local/bin/uvicorn", line 8, in <module>
sys.exit(main())
File "/usr/local/lib/python3.8/site-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/lib/python3.8/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/uvicorn/main.py", line 371, in main
run(app, **kwargs)
File "/usr/local/lib/python3.8/site-packages/uvicorn/main.py", line 393, in run
server.run()
File "/usr/local/lib/python3.8/site-packages/uvicorn/server.py", line 50, in run
loop.run_until_complete(self.serve(sockets=sockets))
File "/usr/local/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
return future.result()
File "/usr/local/lib/python3.8/site-packages/uvicorn/server.py", line 57, in serve
config.load()
File "/usr/local/lib/python3.8/site-packages/uvicorn/config.py", line 318, in load
self.loaded_app = import_from_string(self.app)
File "/usr/local/lib/python3.8/site-packages/uvicorn/importer.py", line 25, in import_from_string
raise exc from None
File "/usr/local/lib/python3.8/site-packages/uvicorn/importer.py", line 22, in import_from_string
module = importlib.import_module(module_str)
File "/usr/local/lib/python3.8/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 843, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/app/./ccdh/api/app.py", line 7, in <module>
from ccdh.api.routers import mappings, permissible_values, enumerations, models, ccdh_concept_references
File "/app/./ccdh/api/routers/mappings.py", line 1, in <module>
from sssom.sssom_datamodel import Mapping as SssomMapping
File "/src/sssom/sssom/__init__.py", line 1, in <module>
from .util import parse, collapse, dataframe_to_ptable, filter_redundant_rows, group_mappings, compare_dataframes
File "/src/sssom/sssom/util.py", line 11, in <module>
from sssom.datamodel_util import MappingSetDiff, EntityPair
File "/src/sssom/sssom/datamodel_util.py", line 5, in <module>
from sssom.sssom_document import MappingSetDocument
File "/src/sssom/sssom/sssom_document.py", line 1, in <module>
from .sssom_datamodel import MappingSet, Mapping, Entity
File "/src/sssom/sssom/sssom_datamodel.py", line 14, in <module>
from linkml_model.meta import EnumDefinition, PermissibleValue, PvFormulaOptions
ModuleNotFoundError: No module named 'linkml_model'

2: linkml.utils.slot

Shell error:

ERROR: for ccdh-api  Container "1b132121e048" is unhealthy.
ERROR: Encountered errors while bringing up the project.

Docker error logs:

Creating a virtualenv for this project...
Pipfile: /app/Pipfile
Using /usr/local/bin/python3.8 (3.8.11) to create virtualenv...
⠇ Creating virtual environment...created virtual environment CPython3.8.11.final.0-64 in 1417ms
creator CPython3Posix(dest=/root/.local/share/virtualenvs/app-4PlAip0Q, clear=False, no_vcs_ignore=False, global=False)
seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=/root/.local/share/virtualenv)
added seed packages: pip==21.1.2, setuptools==57.0.0, wheel==0.36.2
activators BashActivator,CShellActivator,FishActivator,PowerShellActivator,PythonActivator,XonshActivator
✔ Successfully created virtual environment!
Virtualenv location: /root/.local/share/virtualenvs/app-4PlAip0Q
Loading .env environment variables...
Traceback (most recent call last):
File "/usr/local/bin/uvicorn", line 8, in <module>
sys.exit(main())
File "/usr/local/lib/python3.8/site-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/lib/python3.8/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/uvicorn/main.py", line 371, in main
run(app, **kwargs)
File "/usr/local/lib/python3.8/site-packages/uvicorn/main.py", line 393, in run
server.run()
File "/usr/local/lib/python3.8/site-packages/uvicorn/server.py", line 50, in run
loop.run_until_complete(self.serve(sockets=sockets))
File "/usr/local/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
return future.result()
File "/usr/local/lib/python3.8/site-packages/uvicorn/server.py", line 57, in serve
config.load()
File "/usr/local/lib/python3.8/site-packages/uvicorn/config.py", line 318, in load
self.loaded_app = import_from_string(self.app)
File "/usr/local/lib/python3.8/site-packages/uvicorn/importer.py", line 25, in import_from_string
raise exc from None
File "/usr/local/lib/python3.8/site-packages/uvicorn/importer.py", line 22, in import_from_string
module = importlib.import_module(module_str)
File "/usr/local/lib/python3.8/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 843, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/app/./ccdh/api/app.py", line 7, in <module>
from ccdh.api.routers import mappings, permissible_values, enumerations, models, ccdh_concept_references
File "/app/./ccdh/api/routers/mappings.py", line 1, in <module>
from sssom.sssom_datamodel import Mapping as SssomMapping
File "/src/sssom/sssom/__init__.py", line 1, in <module>
from .util import parse, collapse, dataframe_to_ptable, filter_redundant_rows, group_mappings, compare_dataframes
File "/src/sssom/sssom/util.py", line 11, in <module>
from sssom.datamodel_util import MappingSetDiff, EntityPair
File "/src/sssom/sssom/datamodel_util.py", line 5, in <module>
from sssom.sssom_document import MappingSetDocument
File "/src/sssom/sssom/sssom_document.py", line 1, in <module>
from .sssom_datamodel import MappingSet, Mapping, Entity
File "/src/sssom/sssom/sssom_datamodel.py", line 16, in <module>
from linkml.utils.slot import Slot
ModuleNotFoundError: No module named 'linkml.utils.slot'

Things I tried

  1. Different branch "improvement-batch": I checked and this branch was closed. I decided not to try and dig up an old tag. I also merged the latest code, but no changes had been made since I forked.
  2. Installing without docker: I tried this and I got the same results.
  3. Updating Pipfile w/ missing/bad requirements: I decided to add the line linkml-model = "*". This fixed the first linkml_model ModuleNotFoundError! But when running docker-compose up again, I got the linkml.utils.slot error.

Possible solutions

Similar to what I tried in "things I tried" (3), this may just be an issue of other linkml modules not having been added to the Pipfile. @jiaola Let's discuss when you return!
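One way to enumerate which modules are still missing before another docker-compose up cycle is to probe the import names from the tracebacks above. This is a minimal sketch, not part of the repository; the helper name missing_modules is hypothetical, and the module list is copied from the two errors in this issue.

```python
import importlib.util


def missing_modules(names):
    """Return the dotted module names that cannot be located."""
    missing = []
    for name in names:
        try:
            # find_spec imports parent packages, so a missing parent raises.
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Module names taken from the ModuleNotFoundErrors in this issue.
    print(missing_modules(["linkml_model", "linkml.utils.slot"]))
```

Run inside the same virtualenv the API uses; any name printed still needs a matching Pipfile entry (or a compatible version of the package that provides it).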

(1) PDC test server importer broken, (2) git submodule configuration ok?

Tasks

  • 1. PDC: test server importer broken
  • 2. git submodule configuration ok?

Description

Issue (2) is an issue in and of itself, but it may also be the cause of (1), which is why I'm grouping them together.

1. PDC: test server importer broken

Action: Boot the docker container, run python -m ccdh.importers.importer, then check https://test.terminology.ccdh.io/models
Expected behavior: should show PDC
Actual behavior: doesn't show PDC

2. git submodule configuration ok?

Production env

cd ccdh-terminology-service; git submodule

 5c3ac5d167786ed2d94ec672ddf67adfad4e7d45 crdc-nodes/HTAN-data-pipeline (remotes/origin/main)
+091ffa25f73f19ee0151e0b1179d3ae864ef7d32 crdc-nodes/PDC-Public (heads/master)
+6457a448dd1cb583cb88f8323b72a57f41b48be5 crdc-nodes/gdcdictionary (2.1.1-rc.2-47-g6457a44)
+bfad07fba992d580b8cd4dc49665866aecb50d57 crdc-nodes/icdc-model-tool (v2.0.0-7-gbfad07f)

git status

HEAD detached at origin/master
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
  (commit or discard the untracked or modified content in submodules)
	modified:   crdc-nodes/PDC-Public (new commits)
	modified:   crdc-nodes/gdcdictionary (new commits)
	modified:   crdc-nodes/icdc-model-tool (new commits, modified content)

no changes added to commit (use "git add" and/or "git commit -a")

cd crdc-nodes/PDC-Public/; git branch

* (HEAD detached at 091ffa2)
  master

git log

commit 091ffa25f73f19ee0151e0b1179d3ae864ef7d32 (HEAD, master)
Author: Ngoc Nguyen <[email protected]>
Date:   Tue May 18 23:03:45 2021 -0400

    Checked in release v1.1.5

git remote -v

origin	https://github.com/esacinc/PDC-Public (fetch)
origin	https://github.com/esacinc/PDC-Public (push)

Test env

cd ccdh-terminology-service-test; git submodule

-5c3ac5d167786ed2d94ec672ddf67adfad4e7d45 crdc-nodes/HTAN-data-pipeline
-c00e5e1e145d71e0be008f0bd816a955bf96bd3d crdc-nodes/PDC-Public
-92a77bc7319f55e4ad263c4b1d24c7a0e11186ee crdc-nodes/gdcdictionary
-b3e078a1407299800dc45edf2850eb3a0674ea8f crdc-nodes/icdc-model-tool

git status

HEAD detached at origin/master
nothing to commit, working tree clean

cd crdc-nodes/PDC-Public/; git branch

* (HEAD detached at origin/master)
  issue_40-test_server

git log

Author: joeflack4 <[email protected]>
Date:   Mon Aug 23 17:31:19 2021 -0400

    Updates
    - Updated build script & config.py with some .env changes in order to get build to pass. Needed to also update .env file locations on the server and create some symlinks in order to get things to work correctly server side.

git remote -v

joeflack4	[email protected]:joeflack4/ccdh-terminology-service.git (fetch)
joeflack4	[email protected]:joeflack4/ccdh-terminology-service.git (push)
origin	[email protected]:cancerDHC/ccdh-terminology-service.git (fetch)
origin	[email protected]:cancerDHC/ccdh-terminology-service.git (push)

Discussion

Questions

  • Why does the test env list the PDC submodule, while git commands run inside its folder show git info for ccdh-terminology-service?
  • Is this setup the reason that issue (1) is happening for test env?
  • Should we be updating the submodules in the production env?
  • Anything else 'git submodule related' in production env that should change?
  • Anything else 'git submodule related' in test env that should change?
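The leading character on each line of git submodule output carries state, which bears on the first question. Per the git-submodule documentation, "-" marks a submodule that is not initialized, "+" marks a checked-out commit that differs from the one recorded in the superproject, and "U" marks merge conflicts. All four test env entries show "-", so crdc-nodes/PDC-Public there is presumably an empty, uninitialized directory, and git commands run inside it fall through to the parent ccdh-terminology-service repository. A small sketch that decodes the output (the function name is hypothetical):

```python
# Meaning of the first character of each `git submodule` status line.
STATES = {
    " ": "in sync",
    "-": "not initialized",
    "+": "commit differs from superproject",
    "U": "merge conflicts",
}


def parse_submodule_status(output):
    """Parse `git submodule` output into dicts of sha, path and state.

    Assumes submodule paths contain no spaces.
    """
    rows = []
    for line in output.splitlines():
        if not line.strip():
            continue
        # A line whose leading space was stripped starts with the SHA itself.
        if line[0] in STATES:
            flag, rest = line[0], line[1:]
        else:
            flag, rest = " ", line
        sha, path = rest.split()[:2]
        rows.append({"sha": sha, "path": path, "state": STATES[flag]})
    return rows
```

If that reading is right, running git submodule update --init in the test env would be the usual way to populate the directories.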

PDC importer issues: git-lfs failure

Description

While running the importer.py and trying to import PDC data (Importer(neo4j_graph()).import_node_attributes(PdcImporter.read_data_dictionary())), there was an error.

Related issues

This issue largely stems from incorrect hashes / unfound files. An issue has been created in the PDC-Public repository: esacinc/PDC-Public#4

Errors

Importer.py runtime error

INFO:ccdh.importers.pdc:Loading study.json
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Error using git lfs

pwd
/Users/joeflack4/projects/ccdh-terminology-service/crdc-nodes/PDC-Public/documentation/prod/json
ls
aliquot.json            demographic.json        exposure.json           gene.json               project.json            sample.json             workflowmetadata.json
aliquotrunmetadata.json diagnosis.json          familyhistory.json      geneabundance.json      proteinAbundance.json   study.json
analyte.json            dictionary.json         file.json               portion.json            protocol.json           studyrunmetadata.json
case.json               dictionary_item.json    followup.json           program.json            publication.json        treatment.json
git lfs pull
[59c0032a5454b14bc1f96dc8a1ea5a37b8843b6a89a00d26dc09e7a0ab1a64b9] Object does not exist on the server: [404] Object does not exist on the server
[ffdb0df86a73d045ee5fd9258019633f442d99555fe803748a3fd31229700a81] Object does not exist on the server: [404] Object does not exist on the server
[39d50a5df2b920b54d372be2dc50f2f801732afeba395264af1d13c281a594e6] Object does not exist on the server: [404] Object does not exist on the server
[e8791d255b74d39cef698d724c01a84e42bbc2a1bfa44c0363447ab39ab510ea] Object does not exist on the server: [404] Object does not exist on the server
[8b892cb483a915e902a39c383353223f7d7a702e7c5e504b99f96a5200969ca2] Object does not exist on the server: [404] Object does not exist on the server
[c6d1f666a99596721e635021350b94d92d4616256d813c87b5151305537240ae] Object does not exist on the server: [404] Object does not exist on the server
[9ab643a8adde68f30d9b9fce407fa435baf814d82e4dda55d4b6811451efea0d] Object does not exist on the server: [404] Object does not exist on the server
[8c6f1e133504269009c73b62ed847dc0d37d2f7d2ee20f287ff122d9ea176a1e] Object does not exist on the server: [404] Object does not exist on the server
[d515ec7b7d18ac96eb06997d6a82bed84d93b3387633ac4a98a4d450aaced531] Object does not exist on the server: [404] Object does not exist on the server
[047022e4b1ad4daaefa3e0d18ba3ceb9346803267ff5c83521e9b2634890d648] Object does not exist on the server: [404] Object does not exist on the server
[154b86b0c6c7ac871ed2bc1f42100029589e4bb3dede2e183199b3642466ddcb] Object does not exist on the server: [404] Object does not exist on the server
[85c95df0c0bd785afaea83af48e12bc3f7b15f0139693681c60142d85e60bffc] Object does not exist on the server: [404] Object does not exist on the server
[d9be6cddfa546bc99ef9c6aff8ece29b281ecdcf41b6547c52f5ff0e65ef6577] Object does not exist on the server: [404] Object does not exist on the server
[51e29ee8b69b4a1bbdebacb468f43b59be95f94ab0ac5c033cbd682b3362fc97] Object does not exist on the server: [404] Object does not exist on the server
[79a2f89f4118827b1d58b89b0f274e5befb7a73ff2b88006538656dc8b07743d] Object does not exist on the server: [404] Object does not exist on the server
[13df1f37e9e52ae391ae8e7691843b989c7f349b4f30537b7d304c2f17901b13] Object does not exist on the server: [404] Object does not exist on the server
[6f5fe58e6fad90fad587f26a0f90b9b2b4a4b11e5abc972a650efd860cdb1355] Object does not exist on the server: [404] Object does not exist on the server
[aa56e742ac3d30a4e3c57ce5c7dbc1041428d193ccc8587274ae755fe2b0b568] Object does not exist on the server: [404] Object does not exist on the server
[69fd4dd9d64c2ff9f37fbc3ef6b11b43b27dff4dfe6b544712c0c9734dd22741] Object does not exist on the server: [404] Object does not exist on the server
[bd8f86533aadab1bb2ec6790d94e6a8bad7ef10bc4c49c4dc93b85bffdd382c6] Object does not exist on the server: [404] Object does not exist on the server
[9a0d7401f826be8e07da75c52681c682abb7022af1546025cc3fb668a7b4e459] Object does not exist on the server: [404] Object does not exist on the server
[5b31f04d95b828fb159ba73ddde80292b9d63f97150fd8f6a644056a195098d4] Object does not exist on the server: [404] Object does not exist on the server
[211180c807de80c3ce5a0db290e821b7b360b74fc19af0b412df9475494be099] Object does not exist on the server: [404] Object does not exist on the server
[a9f9ae322c5ca287d2df93621df840db89c96d9fcfabd7f4ac32031c1ca8f587] Object does not exist on the server: [404] Object does not exist on the server
[190006249a1d6ef2bc5159d0b35e6aee42559631847770b5bff326a0d09b8f0e] Object does not exist on the server: [404] Object does not exist on the server
[57a2c5d285a440de10d8be20cf3d0119f6b7746c98be5caac35e7e38c43225f0] Object does not exist on the server: [404] Object does not exist on the server
[b000a2cda149194d5a3aaea6c36fae527b50c790725038dbd228ab3ef7f423f9] Object does not exist on the server: [404] Object does not exist on the server
[8c9e2e5f1f3c2d746cf60a280b408e215b0d9b84f618f442bb1431e620134604] Object does not exist on the server: [404] Object does not exist on the server
[f3111531658ddfa410474b7098ad9ab320397fba18fb199d74dffba6cea90d53] Object does not exist on the server: [404] Object does not exist on the server
[057f86b4b92aff9118695f293a003fff86ee2415d8f2a01a43a37d93f52d77b0] Object does not exist on the server: [404] Object does not exist on the server
[c15767f788106eb41ec6fee3517354e767160a190485fd1e0f835120c353cf84] Object does not exist on the server: [404] Object does not exist on the server
[8879a4587c5fb01dbc9501e6cf23305a06aaf6b0a06bdf9b64812f853432edc4] Object does not exist on the server: [404] Object does not exist on the server
[33f23851aefcf94d63dc5b6bb36ec2aa60233086ecac9def12aa1472cbce0c4a] Object does not exist on the server: [404] Object does not exist on the server
[eb3fd33194428a1459f2501ef18413da6a366bf1e7fb6904eda5b36cd876ab97] Object does not exist on the server: [404] Object does not exist on the server
[6a41ed64bc0f22598a969ed8d275a209c60c03507cf7efb2b18612a57a6dc0f5] Object does not exist on the server: [404] Object does not exist on the server
[34aa8c021c9464d19ac9fbf77f383c605daf1b64d2df8fb1a99fa5292a0df4f4] Object does not exist on the server: [404] Object does not exist on the server
[bc07c2bbfc593a443019b553d69096622f689823749ef7776978aca52a4fea14] Object does not exist on the server: [404] Object does not exist on the server
error: failed to fetch some objects from 'https://github.com/esacinc/PDC-Public.git/info/lfs'

Possible solutions

I think there are two things we need to do to resolve this:

  1. They need to fix this issue so that the files can be located and downloaded.
  2. Once they do, our import will still fail, because the importer expects the .json files to be present inside a certain directory, and those files will still need to be pulled down first. We'll have to figure out when and how we pull them. I'm thinking we'll probably want to do this during the importer.py runtime, perhaps using this Python library: https://pypi.org/project/git-lfs/
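The JSONDecodeError at line 1 column 1 is consistent with study.json being a Git LFS pointer file rather than real JSON: a pointer is a small text file whose first line is "version https://git-lfs.github.com/spec/v1". A minimal sketch of a pre-flight check the importer could run before parsing; the function names is_lfs_pointer and load_json_or_explain are hypothetical and not part of the repository.

```python
import json
from pathlib import Path

# First line of every Git LFS pointer file, per the LFS pointer spec.
LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """True if the file starts with the Git LFS pointer header."""
    with open(path, "rb") as fh:
        return fh.read(len(LFS_HEADER)) == LFS_HEADER


def load_json_or_explain(path):
    """Load JSON, but fail with a clearer message on an unfetched LFS file."""
    if is_lfs_pointer(path):
        raise RuntimeError(f"{path} is a Git LFS pointer; run 'git lfs pull' first")
    return json.loads(Path(path).read_text(encoding="utf-8"))
```

This does not fix the 404s from the LFS server; it only turns the opaque decode error into one that names the actual cause.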

/enumerations endpoint(s): return value format changes

Description

Example query: CRDC-H.Specimen.general_tissue_morphology

What we're currently returning:

name: CRDC-H.Specimen.general_tissue_morphology
description: Autogenerated Enumeration for CRDC-H Specimen general_tissue_morphology
permissible_values:
- text: C119010
  description: Peritumoral
  meaning: NCIT:C119010
...

What we're thinking about returning:

name: CRDC-H.Specimen.general_tissue_morphology
description: Autogenerated Enumeration for CRDC-H Specimen general_tissue_morphology
permissible_values:
  - text: Peritumoral
    meaning: NCIT:C119010
...

Edit: 2021/11/01
@wdduncan Basically we're changing the 'description' field to say 'text' instead, and we're dropping the field that shows the plain code.
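The change can be sketched as a small transformation over the current payload. This is an illustration only, assuming the payload is available as a plain dict shaped like the YAML above; reshape_enumeration is a hypothetical name, not an existing function in the service.

```python
def reshape_enumeration(enum):
    """Convert the current /enumerations payload into the proposed one."""
    return {
        "name": enum["name"],
        "description": enum["description"],
        "permissible_values": [
            # The old 'description' becomes 'text'; the bare code is dropped.
            {"text": pv["description"], "meaning": pv["meaning"]}
            for pv in enum.get("permissible_values", [])
        ],
    }
```

Applied to the example above, the entry with text C119010 becomes text Peritumoral with meaning NCIT:C119010 unchanged.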

Make the api more restful

Based on Sean Davis's suggestion: make the API more RESTful so users can navigate the endpoints more easily.

GDC Importer errors & warnings

Description

While running importer.py, the GDC importer (Importer(neo4j_graph()).import_node_attributes(GdcImporter.read_data_dictionary())) produces warnings and errors.

Warnings

INFO:ccdh.importers.gdc:Loading /Users/joeflack4/projects/ccdh-terminology-service/data/data_dictionary/gdc/current.json
WARNING:ccdh.importers.gdc:CDE ID contains no values: 7050286
WARNING:ccdh.importers.gdc:CDE ID contains no values: 3226275
WARNING:ccdh.importers.gdc:CDE ID contains no values: 6161034
WARNING:ccdh.importers.gdc:CDE ID contains no values: 5432604
WARNING:ccdh.importers.gdc:CDE ID contains no values: 2975232
WARNING:ccdh.importers.gdc:CDE ID contains no values: 7068995
INFO:ccdh.importers.importer:Importing NodeAttribute GDC.Aggregated Somatic Mutation.batch_id ...

Errors

Short error message

KeyError: 'permissible_values'

Long error message

INFO:ccdh.importers.gdc:Loading /Users/joeflack4/projects/ccdh-terminology-service/data/data_dictionary/gdc/current.json
...
INFO:ccdh.importers.importer:Importing NodeAttribute GDC.Aggregated Somatic Mutation.batch_id ...
Traceback (most recent call last):
  File "/Applications/PyCharm.app/Contents/plugins/python/helpers/pydev/pydevd.py", line 1483, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/Applications/PyCharm.app/Contents/plugins/python/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/Users/joeflack4/projects/ccdh-terminology-service/ccdh/importers/importer.py", line 222, in <module>
    Importer.import_all()
  File "/Users/joeflack4/projects/ccdh-terminology-service/ccdh/importers/importer.py", line 215, in import_all
    Importer(neo4j_graph()).import_node_attributes(GdcImporter.read_data_dictionary())
  File "/Users/joeflack4/projects/ccdh-terminology-service/ccdh/importers/importer.py", line 60, in import_node_attributes
    self.import_node_attribute(node_attribute)
  File "/Users/joeflack4/projects/ccdh-terminology-service/ccdh/importers/importer.py", line 42, in import_node_attribute
    permissible_values = node_attribute['permissible_values']
KeyError: 'permissible_values'
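The traceback shows importer.py indexing node_attribute['permissible_values'] directly, which fails for any attribute that defines no value list (plausibly the case for batch_id). One possible guard, sketched under the assumption that such attributes should be imported with an empty list rather than skipped or rejected; the helper name is hypothetical:

```python
def permissible_values_of(node_attribute):
    """Return the attribute's permissible values, or [] when none are defined."""
    # .get avoids the KeyError; `or []` also covers an explicit None value.
    return node_attribute.get("permissible_values") or []
```

Whether an empty list is the right behavior depends on how the importer treats attributes without enumerations, which this issue does not settle.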

Add versioning for: (i) API, (ii) specific endpoints

Requirements

I'm not sure if we want to do Requirement (A), (B), or both.

Requirement A: API Service versioning

Description

We have a version for the entire API, e.g.

Implementation

What constitutes a new version? Every new deployment in which any change has been made to the underlying data model?

What kind of versioning system to use? Simply incrementing integers?

Requirement B: Individual versioning to all endpoints

Description

Add versioning param to all endpoints. If a given resolved endpoint changes, it should be versioned up.

Implementation

For every endpoint, we would be able to tack on a /versions and a /versions/<NUM> suffix. Without either suffix, the endpoint would default to the latest version.

Examples

Example 1

Before
https://terminology.ccdh.io/models/CRDC-H/entities/Specimen/attributes/tumor_status_at_collection

After
https://terminology.ccdh.io/models/CRDC-H/entities/Specimen/attributes/tumor_status_at_collection

  • Shows latest version

https://terminology.ccdh.io/models/CRDC-H/entities/Specimen/attributes/tumor_status_at_collection/versions

  • Shows all versions available

https://terminology.ccdh.io/models/CRDC-H/entities/Specimen/attributes/tumor_status_at_collection/versions/2

  • Shows version 2
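The routing rule in Example 1 can be sketched independently of the web framework. This assumes, purely for illustration, that each resolved endpoint has its versions stored in a dict keyed by integer version number; resolve_version and that storage layout are hypothetical, not the service's actual design.

```python
import re

# Matches a trailing "/versions" or "/versions/<NUM>" on a request path.
_VERSION_SUFFIX = re.compile(r"/versions(?:/(\d+))?/?$")


def resolve_version(path, available):
    """Pick what to return for a path, given {version_number: payload}."""
    match = _VERSION_SUFFIX.search(path)
    if match is None:
        # No suffix: default to the latest version.
        return available[max(available)]
    if match.group(1) is None:
        # ".../versions": list all available version numbers.
        return sorted(available)
    # ".../versions/<NUM>": raises KeyError if that version does not exist.
    return available[int(match.group(1))]
```

A real handler would translate the KeyError for an unknown version into a 404 response.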

Not found: https://raw.githubusercontent.com/cancerDHC/ccdhmodel/main/src/schema/ccdhmodel.yaml

Description

When importing, the YAML file is not found.

This can be verified also by just trying to personally navigate to it in the browser: https://raw.githubusercontent.com/cancerDHC/ccdhmodel/main/src/schema/ccdhmodel.yaml

Possible solutions

A. Get the file back in that location

For that, just need to talk to the ccdhmodel team and figure out what's going on.

B. Point to new location

We'd have to update in two locations:
[Screenshot: Screen Shot 2021-09-16 at 6 07 34 PM]

Fix the build errors in the github action

The build is failing. See:

https://github.com/cancerDHC/ccdh-terminology-service/runs/3019480561

============================= test session starts ==============================
platform linux -- Python 3.9.5, pytest-6.2.4, py-1.10.0, pluggy-0.13.1
rootdir: /home/runner/work/ccdh-terminology-service/ccdh-terminology-service, configfile: pytest.ini, testpaths: tests
plugins: Faker-8.10.0, anyio-3.2.1, docker-0.10.3
collected 6 items / 3 errors / 3 selected

==================================== ERRORS ====================================
_____________ ERROR collecting tests/ccdh/importers/test_crdc_h.py _____________
ImportError while importing test module '/home/runner/work/ccdh-terminology-service/ccdh-terminology-service/tests/ccdh/importers/test_crdc_h.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/opt/hostedtoolcache/Python/3.9.5/x64/lib/python3.9/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/ccdh/importers/test_crdc_h.py:1: in <module>
    from ccdh.importers.crdc_h import CrdcHImporter
ccdh/importers/crdc_h.py:4: in <module>
    from linkml.loaders import yaml_loader
E   ModuleNotFoundError: No module named 'linkml.loaders'
______________ ERROR collecting tests/ccdh/mapping/test_sssom.py _______________
ImportError while importing test module '/home/runner/work/ccdh-terminology-service/ccdh-terminology-service/tests/ccdh/mapping/test_sssom.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/opt/hostedtoolcache/Python/3.9.5/x64/lib/python3.9/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/ccdh/mapping/test_sssom.py:1: in <module>
    from sssom.sssom_datamodel import Mapping, MappingSet
src/sssom/sssom/__init__.py:1: in <module>
    from .sssom_datamodel import Mapping, MappingSet
src/sssom/sssom/sssom_datamodel.py:13: in <module>
    from linkml.utils.curienamespace import CurieNamespace
E   ModuleNotFoundError: No module named 'linkml.utils.curienamespace'
_______________ ERROR collecting tests/tccm/test_enumeration.py ________________
ImportError while importing test module '/home/runner/work/ccdh-terminology-service/ccdh-terminology-service/tests/tccm/test_enumeration.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/opt/hostedtoolcache/Python/3.9.5/x64/lib/python3.9/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/tccm/test_enumeration.py:2: in <module>
    from linkml.utils.compile_python import compile_python
E   ModuleNotFoundError: No module named 'linkml.utils.compile_python'
=========================== short test summary info ============================
ERROR tests/ccdh/importers/test_crdc_h.py
ERROR tests/ccdh/mapping/test_sssom.py
ERROR tests/tccm/test_enumeration.py
!!!!!!!!!!!!!!!!!!! Interrupted: 3 errors during collection !!!!!!!!!!!!!!!!!!!!
============================== 3 errors in 16.19s ==============================

Bug: GDC Importer fhir.hotecosystem.org SSL error

Description

I haven't looked deeply into this error yet, but it is a new failure that comes up both (1) during tests / build checks and (2) when using the GDC importer. I'm not 100% sure yet, but it looks to me like their SSL certificate has expired.

[Screenshot: Screen Shot 2021-07-25 at 7 57 34 PM]

Error messages

Short error

requests.exceptions.SSLError: HTTPSConnectionPool(host='fhir.hotecosystem.org', port=443): Max retries exceeded with url: /terminology/cadsr/ValueSet/2513915 (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1129)')))

Long error

/Users/joeflack4/virtualenvs/ccdh-terminology-service/bin/python3 /Applications/PyCharm.app/Contents/helpers/pydev/pydevd.py --multiproc --qt-support=auto --client 127.0.0.1 --port 63472 --file /Users/joeflack4/projects/ccdh-terminology-service/ccdh/importers/importer.py
/Applications/PyCharm.app/Contents/helpers/pydev/_pydevd_bundle/pydevd_resolver.py:127: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  if found.get(name) is not 1:
pydev debugger: process 4252 is connecting

Connected to pydev debugger (build 181.5540.34)
INFO:ccdh.importers.crcd_h:Retrieving CCDH Model YAML: https://raw.githubusercontent.com/cancerDHC/ccdhmodel/main/src/schema/ccdhmodel.yaml
INFO:ccdh.importers.gdc:Loading /Users/joeflack4/projects/ccdh-terminology-service/data/data_dictionary/gdc/current.json
Traceback (most recent call last):
  File "/Users/joeflack4/virtualenvs/ccdh-terminology-service/lib/python3.9/site-packages/urllib3/connectionpool.py", line 699, in urlopen
    httplib_response = self._make_request(
  File "/Users/joeflack4/virtualenvs/ccdh-terminology-service/lib/python3.9/site-packages/urllib3/connectionpool.py", line 382, in _make_request
    self._validate_conn(conn)
  File "/Users/joeflack4/virtualenvs/ccdh-terminology-service/lib/python3.9/site-packages/urllib3/connectionpool.py", line 1010, in _validate_conn
    conn.connect()
  File "/Users/joeflack4/virtualenvs/ccdh-terminology-service/lib/python3.9/site-packages/urllib3/connection.py", line 411, in connect
    self.sock = ssl_wrap_socket(
  File "/Users/joeflack4/virtualenvs/ccdh-terminology-service/lib/python3.9/site-packages/urllib3/util/ssl_.py", line 449, in ssl_wrap_socket
    ssl_sock = _ssl_wrap_socket_impl(
  File "/Users/joeflack4/virtualenvs/ccdh-terminology-service/lib/python3.9/site-packages/urllib3/util/ssl_.py", line 493, in _ssl_wrap_socket_impl
    return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/ssl.py", line 500, in wrap_socket
    return self.sslsocket_class._create(
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/ssl.py", line 1040, in _create
    self.do_handshake()
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/ssl.py", line 1309, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1129)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/joeflack4/virtualenvs/ccdh-terminology-service/lib/python3.9/site-packages/requests/adapters.py", line 439, in send
    resp = conn.urlopen(
  File "/Users/joeflack4/virtualenvs/ccdh-terminology-service/lib/python3.9/site-packages/urllib3/connectionpool.py", line 755, in urlopen
    retries = retries.increment(
  File "/Users/joeflack4/virtualenvs/ccdh-terminology-service/lib/python3.9/site-packages/urllib3/util/retry.py", line 574, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='fhir.hotecosystem.org', port=443): Max retries exceeded with url: /terminology/cadsr/ValueSet/2513915 (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1129)')))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Applications/PyCharm.app/Contents/helpers/pydev/pydevd.py", line 1664, in <module>
    main()
  File "/Applications/PyCharm.app/Contents/helpers/pydev/pydevd.py", line 1658, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/Applications/PyCharm.app/Contents/helpers/pydev/pydevd.py", line 1068, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/Applications/PyCharm.app/Contents/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/Users/joeflack4/projects/ccdh-terminology-service/ccdh/importers/importer.py", line 218, in <module>
    Importer(neo4j_graph()).import_node_attributes(GdcImporter.read_data_dictionary())
  File "/Users/joeflack4/projects/ccdh-terminology-service/ccdh/importers/gdc.py", line 72, in read_data_dictionary
    value_desc = GdcImporter.\
  File "/Users/joeflack4/projects/ccdh-terminology-service/ccdh/importers/gdc.py", line 33, in get_value_descriptions_from_cadsr
    values = get_cadsr_values(cde_id)
  File "/Users/joeflack4/projects/ccdh-terminology-service/ccdh/importers/cadsr.py", line 13, in get_cadsr_values
    value_set = ValueSet.read(cde_id, smart.server)
  File "/Users/joeflack4/virtualenvs/ccdh-terminology-service/lib/python3.9/site-packages/fhirclient/models/fhirabstractresource.py", line 83, in read
    instance = cls.read_from(path, server)
  File "/Users/joeflack4/virtualenvs/ccdh-terminology-service/lib/python3.9/site-packages/fhirclient/models/fhirabstractresource.py", line 102, in read_from
    ret = server.request_json(path)
  File "/Users/joeflack4/virtualenvs/ccdh-terminology-service/lib/python3.9/site-packages/fhirclient/server.py", line 163, in request_json
    res = self._get(path, headers, nosign)
  File "/Users/joeflack4/virtualenvs/ccdh-terminology-service/lib/python3.9/site-packages/fhirclient/server.py", line 190, in _get
    res = self.session.get(url, headers=headers)
  File "/Users/joeflack4/virtualenvs/ccdh-terminology-service/lib/python3.9/site-packages/requests/sessions.py", line 555, in get
    return self.request('GET', url, **kwargs)
  File "/Users/joeflack4/virtualenvs/ccdh-terminology-service/lib/python3.9/site-packages/requests/sessions.py", line 542, in request
    resp = self.send(prep, **send_kwargs)
  File "/Users/joeflack4/virtualenvs/ccdh-terminology-service/lib/python3.9/site-packages/requests/sessions.py", line 655, in send
    r = adapter.send(request, **kwargs)
  File "/Users/joeflack4/virtualenvs/ccdh-terminology-service/lib/python3.9/site-packages/requests/adapters.py", line 514, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='fhir.hotecosystem.org', port=443): Max retries exceeded with url: /terminology/cadsr/ValueSet/2513915 (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1129)')))

Possible solutions

A. The maintainers of fhir.hotecosystem.org renew the certificate, if the issue is indeed entirely on their end.

B. We ignore the SSL verification error and fetch the data we need from this resource programmatically anyway.

C. ???
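
For reference, option B could look roughly like the sketch below. It is only an illustration using the standard library; the importer itself goes through fhirclient and requests (per the traceback), where the rough equivalent would be setting `verify = False` on the session. The names `make_context`, `fetch_json`, and `allow_insecure_fallback` are hypothetical. Skipping verification removes protection against tampering, so this would at most be a temporary stopgap.

```python
import json
import ssl
import urllib.error
import urllib.request
import warnings


def make_context(verify: bool = True) -> ssl.SSLContext:
    """Return a TLS context; with verify=False, certificate checks are skipped."""
    context = ssl.create_default_context()
    if not verify:
        # Order matters: hostname checking must be off before CERT_NONE is set.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context


def fetch_json(url: str, allow_insecure_fallback: bool = False) -> dict:
    """GET a JSON document, optionally retrying without certificate verification."""
    try:
        with urllib.request.urlopen(url, context=make_context(), timeout=30) as resp:
            return json.load(resp)
    except urllib.error.URLError as err:
        # urllib wraps the underlying SSL failure in URLError.reason.
        cert_problem = isinstance(err.reason, ssl.SSLCertVerificationError)
        if not (cert_problem and allow_insecure_fallback):
            raise
        warnings.warn(f"Certificate verification failed for {url}; retrying unverified")
        insecure = make_context(verify=False)
        with urllib.request.urlopen(url, context=insecure, timeout=30) as resp:
            return json.load(resp)
```

Keeping the fallback opt-in means a normal run still fails loudly on a bad certificate, and the insecure path is only taken when someone asks for it explicitly.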

PIP install error: `OrderedSet`

Description

OrderedSet fails to install in both my local environment and the server environment. In every case the failure occurs while compiling the package's C extension: gcc is missing on the server, and locally gcc-6 either cannot find limits.h (Python 3.9) or does not support arm64 (Python 3.8.10).

Server error

Short error
unable to execute 'gcc': No such file or directory

Long error

[docker@ip-172-31-44-92 testEnv]$ python3 -m pip install orderedset
Collecting orderedset
  Downloading https://files.pythonhosted.org/packages/1d/b0/d85c1893d227ed20f2e446e16006aeab7ca698e721f7c607b647894efc63/orderedset-2.0.3.tar.gz (101kB)
    100% |████████████████████████████████| 102kB 4.1MB/s
Installing collected packages: orderedset
  Running setup.py install for orderedset ... error
    Complete output from command /usr/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-xynelq68/orderedset/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-5ct6petu-record/install-record.txt --single-version-externally-managed --compile:
    running install
    running build
    running build_py
    creating build
    creating build/lib.linux-x86_64-3.7
    creating build/lib.linux-x86_64-3.7/orderedset
    copying lib/orderedset/__init__.py -> build/lib.linux-x86_64-3.7/orderedset
    running build_ext
    building 'orderedset._orderedset' extension
    creating build/temp.linux-x86_64-3.7
    creating build/temp.linux-x86_64-3.7/lib
    creating build/temp.linux-x86_64-3.7/lib/orderedset
    gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -fPIC -I/usr/include/python3.7m -c lib/orderedset/_orderedset.c -o build/temp.linux-x86_64-3.7/lib/orderedset/_orderedset.o
    unable to execute 'gcc': No such file or directory
    error: command 'gcc' failed with exit status 1

    ----------------------------------------
Command "/usr/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-xynelq68/orderedset/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-5ct6petu-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-build-xynelq68/orderedset/
[docker@ip-172-31-44-92 testEnv]$

Local errors (Python 3.9)

Short error
[pipenv.exceptions.InstallError]: /usr/local/Cellar/gcc@6/6.5.0_2/lib/gcc/6/gcc/x86_64-apple-darwin17.7.0/6.5.0/include-fixed/limits.h:168:61: fatal error: limits.h: No such file or directory

Long error

pipenv install --dev
Courtesy Notice: Pipenv found itself running within a virtual environment, so it will automatically use that environment, instead of creating its own for any project. You can set PIPENV_IGNORE_VIRTUALENVS=1 to force pipenv to ignore that environment and create its own instead. You can set PIPENV_VERBOSITY=-1 to suppress this warning.
Pipfile.lock (dd1d2d) out of date, updating to (306567)...
Locking [dev-packages] dependencies...
Building requirements...
Resolving dependencies...
✔ Success!
Locking [packages] dependencies...
Building requirements...
Resolving dependencies...
✔ Success!
Updated Pipfile.lock (306567)!
Installing dependencies from Pipfile.lock (306567)...
WARNING: Ignoring invalid distribution -dna (/Users/joeflack4/virtualenvs/ccdh-terminology-service/lib/python3.9/site-packages)
WARNING: Ignoring invalid distribution -dna (/Users/joeflack4/virtualenvs/ccdh-terminology-service/lib/python3.9/site-packages)
Collecting pluggy==1.0.0.dev0
  Using cached pluggy-1.0.0.dev0-py2.py3-none-any.whl (17 kB)
Installing collected packages: pluggy
Successfully installed pluggy-1.0.0.dev0
Collecting pytest==6.2.4
  Using cached pytest-6.2.4-py3-none-any.whl (280 kB)
Installing collected packages: pytest
WARNING: Ignoring invalid distribution -ython-dateutil (/Users/joeflack4/virtualenvs/ccdh-terminology-service/lib/python3.9/site-packages)
Successfully installed pytest-6.2.4
WARNING: Ignoring invalid distribution -equests (/Users/joeflack4/virtualenvs/ccdh-terminology-service/lib/python3.9/site-packages)
WARNING: Ignoring invalid distribution -equests (/Users/joeflack4/virtualenvs/ccdh-terminology-service/lib/python3.9/site-packages)
Collecting toml==0.10.2
  Using cached toml-0.10.2-py2.py3-none-any.whl (16 kB)
Installing collected packages: toml
Successfully installed toml-0.10.2
Collecting fastjsonschema==2.15.1
  Using cached fastjsonschema-2.15.1-py3-none-any.whl (21 kB)
Installing collected packages: fastjsonschema
WARNING: Ignoring invalid distribution -astapi (/Users/joeflack4/virtualenvs/ccdh-terminology-service/lib/python3.9/site-packages)
Successfully installed fastjsonschema-2.15.1
WARNING: Ignoring invalid distribution -astapi (/Users/joeflack4/virtualenvs/ccdh-terminology-service/lib/python3.9/site-packages)
WARNING: Ignoring invalid distribution -astapi (/Users/joeflack4/virtualenvs/ccdh-terminology-service/lib/python3.9/site-packages)
Collecting graphviz==0.17
  Using cached graphviz-0.17-py3-none-any.whl (18 kB)
Installing collected packages: graphviz
  Attempting uninstall: graphviz
    WARNING: Ignoring invalid distribution -astapi (/Users/joeflack4/virtualenvs/ccdh-terminology-service/lib/python3.9/site-packages)
    Found existing installation: graphviz 0.16
    Uninstalling graphviz-0.16:
      Successfully uninstalled graphviz-0.16
Successfully installed graphviz-0.17
Collecting networkx==2.6rc2▉▉▉▉▉▉▉▉▉▉▉▉ 31/36 — 00:00:01
  Using cached networkx-2.6rc2-py3-none-any.whl (1.9 MB)
Installing collected packages: networkx
WARNING: Ignoring invalid distribution -illow (/Users/joeflack4/virtualenvs/ccdh-terminology-service/lib/python3.9/site-packages)
Successfully installed networkx-2.6rc2
An error occurred while installing orderedset==2.0.3 --hash=sha256:b2f5ccfb5a86e7b3b3ddf18b29779cc18b24653abf9d6da4bebecf33780a6e29! Will try again.
  🐍   ▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉ 36/36 — 00:00:24
Installing initially failed dependencies...
[InstallError]:   File "/Users/joeflack4/virtualenvs/ccdh-terminology-service/lib/python3.9/site-packages/pipenv/cli/command.py", line 233, in install
[InstallError]:       retcode = do_install(
[InstallError]:   File "/Users/joeflack4/virtualenvs/ccdh-terminology-service/lib/python3.9/site-packages/pipenv/core.py", line 2052, in do_install
[InstallError]:       do_init(
[InstallError]:   File "/Users/joeflack4/virtualenvs/ccdh-terminology-service/lib/python3.9/site-packages/pipenv/core.py", line 1304, in do_init
[InstallError]:       do_install_dependencies(
[InstallError]:   File "/Users/joeflack4/virtualenvs/ccdh-terminology-service/lib/python3.9/site-packages/pipenv/core.py", line 899, in do_install_dependencies
[InstallError]:       batch_install(
[InstallError]:   File "/Users/joeflack4/virtualenvs/ccdh-terminology-service/lib/python3.9/site-packages/pipenv/core.py", line 796, in batch_install
[InstallError]:       _cleanup_procs(procs, failed_deps_queue, retry=retry)
[InstallError]:   File "/Users/joeflack4/virtualenvs/ccdh-terminology-service/lib/python3.9/site-packages/pipenv/core.py", line 703, in _cleanup_procs
[InstallError]:       raise exceptions.InstallError(c.dep.name, extra=err_lines)
[pipenv.exceptions.InstallError]: Collecting orderedset==2.0.3
[pipenv.exceptions.InstallError]:   Using cached orderedset-2.0.3.tar.gz (101 kB)
[pipenv.exceptions.InstallError]: Building wheels for collected packages: orderedset
[pipenv.exceptions.InstallError]:   Building wheel for orderedset (setup.py): started
[pipenv.exceptions.InstallError]:   Building wheel for orderedset (setup.py): finished with status 'error'
[pipenv.exceptions.InstallError]:   ERROR: Command errored out with exit status 1:
[pipenv.exceptions.InstallError]:    command: /Users/joeflack4/virtualenvs/ccdh-terminology-service/bin/python -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/private/var/folders/_0/hsvm3gjx1q7br2grg3gx901c0000gn/T/pip-install-1kj3xkg_/orderedset_e5f3b1e0909f4546bed29781911ee2e8/setup.py'"'"'; __file__='"'"'/private/var/folders/_0/hsvm3gjx1q7br2grg3gx901c0000gn/T/pip-install-1kj3xkg_/orderedset_e5f3b1e0909f4546bed29781911ee2e8/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /private/var/folders/_0/hsvm3gjx1q7br2grg3gx901c0000gn/T/pip-wheel-j9pqof33
[pipenv.exceptions.InstallError]:        cwd: /private/var/folders/_0/hsvm3gjx1q7br2grg3gx901c0000gn/T/pip-install-1kj3xkg_/orderedset_e5f3b1e0909f4546bed29781911ee2e8/
[pipenv.exceptions.InstallError]:   Complete output (25 lines):
[pipenv.exceptions.InstallError]:   running bdist_wheel
[pipenv.exceptions.InstallError]:   The [wheel] section is deprecated. Use [bdist_wheel] instead.
[pipenv.exceptions.InstallError]:   running build
[pipenv.exceptions.InstallError]:   running build_py
[pipenv.exceptions.InstallError]:   creating build
[pipenv.exceptions.InstallError]:   creating build/lib.macosx-10.9-x86_64-3.9
[pipenv.exceptions.InstallError]:   creating build/lib.macosx-10.9-x86_64-3.9/orderedset
[pipenv.exceptions.InstallError]:   copying lib/orderedset/__init__.py -> build/lib.macosx-10.9-x86_64-3.9/orderedset
[pipenv.exceptions.InstallError]:   warning: build_py: byte-compiling is disabled, skipping.
[pipenv.exceptions.InstallError]:
[pipenv.exceptions.InstallError]:   running build_ext
[pipenv.exceptions.InstallError]:   building 'orderedset._orderedset' extension
[pipenv.exceptions.InstallError]:   creating build/temp.macosx-10.9-x86_64-3.9
[pipenv.exceptions.InstallError]:   creating build/temp.macosx-10.9-x86_64-3.9/lib
[pipenv.exceptions.InstallError]:   creating build/temp.macosx-10.9-x86_64-3.9/lib/orderedset
[pipenv.exceptions.InstallError]:   /usr/local/bin/gcc-6 -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -arch x86_64 -g -I/Users/joeflack4/virtualenvs/ccdh-terminology-service/include -I/Library/Frameworks/Python.framework/Versions/3.9/include/python3.9 -c lib/orderedset/_orderedset.c -o build/temp.macosx-10.9-x86_64-3.9/lib/orderedset/_orderedset.o
[pipenv.exceptions.InstallError]:   In file included from /usr/local/Cellar/gcc@6/6.5.0_2/lib/gcc/6/gcc/x86_64-apple-darwin17.7.0/6.5.0/include-fixed/syslimits.h:7:0,
[pipenv.exceptions.InstallError]:                    from /usr/local/Cellar/gcc@6/6.5.0_2/lib/gcc/6/gcc/x86_64-apple-darwin17.7.0/6.5.0/include-fixed/limits.h:34,
[pipenv.exceptions.InstallError]:                    from /Library/Frameworks/Python.framework/Versions/3.9/include/python3.9/Python.h:11,
[pipenv.exceptions.InstallError]:                    from lib/orderedset/_orderedset.c:17:
[pipenv.exceptions.InstallError]:   /usr/local/Cellar/gcc@6/6.5.0_2/lib/gcc/6/gcc/x86_64-apple-darwin17.7.0/6.5.0/include-fixed/limits.h:168:61: fatal error: limits.h: No such file or directory
[pipenv.exceptions.InstallError]:    #include_next <limits.h>  /* recurse down to the real one */
[pipenv.exceptions.InstallError]:                                                                ^
[pipenv.exceptions.InstallError]:   compilation terminated.
[pipenv.exceptions.InstallError]:   error: command '/usr/local/bin/gcc-6' failed with exit code 1
[pipenv.exceptions.InstallError]:   ----------------------------------------
[pipenv.exceptions.InstallError]:   ERROR: Failed building wheel for orderedset
[pipenv.exceptions.InstallError]:   Running setup.py clean for orderedset
[pipenv.exceptions.InstallError]: Failed to build orderedset
[pipenv.exceptions.InstallError]: Installing collected packages: orderedset
[pipenv.exceptions.InstallError]:     Running setup.py install for orderedset: started
[pipenv.exceptions.InstallError]:     Running setup.py install for orderedset: finished with status 'error'
[pipenv.exceptions.InstallError]:     ERROR: Command errored out with exit status 1:
[pipenv.exceptions.InstallError]:      command: /Users/joeflack4/virtualenvs/ccdh-terminology-service/bin/python -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/private/var/folders/_0/hsvm3gjx1q7br2grg3gx901c0000gn/T/pip-install-1kj3xkg_/orderedset_e5f3b1e0909f4546bed29781911ee2e8/setup.py'"'"'; __file__='"'"'/private/var/folders/_0/hsvm3gjx1q7br2grg3gx901c0000gn/T/pip-install-1kj3xkg_/orderedset_e5f3b1e0909f4546bed29781911ee2e8/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /private/var/folders/_0/hsvm3gjx1q7br2grg3gx901c0000gn/T/pip-record-v1356qvt/install-record.txt --single-version-externally-managed --compile --install-headers /Users/joeflack4/virtualenvs/ccdh-terminology-service/include/site/python3.9/orderedset
[pipenv.exceptions.InstallError]:          cwd: /private/var/folders/_0/hsvm3gjx1q7br2grg3gx901c0000gn/T/pip-install-1kj3xkg_/orderedset_e5f3b1e0909f4546bed29781911ee2e8/
[pipenv.exceptions.InstallError]:     Complete output (24 lines):
[pipenv.exceptions.InstallError]:     running install
[pipenv.exceptions.InstallError]:     running build
[pipenv.exceptions.InstallError]:     running build_py
[pipenv.exceptions.InstallError]:     creating build
[pipenv.exceptions.InstallError]:     creating build/lib.macosx-10.9-x86_64-3.9
[pipenv.exceptions.InstallError]:     creating build/lib.macosx-10.9-x86_64-3.9/orderedset
[pipenv.exceptions.InstallError]:     copying lib/orderedset/__init__.py -> build/lib.macosx-10.9-x86_64-3.9/orderedset
[pipenv.exceptions.InstallError]:     warning: build_py: byte-compiling is disabled, skipping.
[pipenv.exceptions.InstallError]:
[pipenv.exceptions.InstallError]:     running build_ext
[pipenv.exceptions.InstallError]:     building 'orderedset._orderedset' extension
[pipenv.exceptions.InstallError]:     creating build/temp.macosx-10.9-x86_64-3.9
[pipenv.exceptions.InstallError]:     creating build/temp.macosx-10.9-x86_64-3.9/lib
[pipenv.exceptions.InstallError]:     creating build/temp.macosx-10.9-x86_64-3.9/lib/orderedset
[pipenv.exceptions.InstallError]:     /usr/local/bin/gcc-6 -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -arch x86_64 -g -I/Users/joeflack4/virtualenvs/ccdh-terminology-service/include -I/Library/Frameworks/Python.framework/Versions/3.9/include/python3.9 -c lib/orderedset/_orderedset.c -o build/temp.macosx-10.9-x86_64-3.9/lib/orderedset/_orderedset.o
[pipenv.exceptions.InstallError]:     In file included from /usr/local/Cellar/gcc@6/6.5.0_2/lib/gcc/6/gcc/x86_64-apple-darwin17.7.0/6.5.0/include-fixed/syslimits.h:7:0,
[pipenv.exceptions.InstallError]:                      from /usr/local/Cellar/gcc@6/6.5.0_2/lib/gcc/6/gcc/x86_64-apple-darwin17.7.0/6.5.0/include-fixed/limits.h:34,
[pipenv.exceptions.InstallError]:                      from /Library/Frameworks/Python.framework/Versions/3.9/include/python3.9/Python.h:11,
[pipenv.exceptions.InstallError]:                      from lib/orderedset/_orderedset.c:17:
[pipenv.exceptions.InstallError]:     /usr/local/Cellar/gcc@6/6.5.0_2/lib/gcc/6/gcc/x86_64-apple-darwin17.7.0/6.5.0/include-fixed/limits.h:168:61: fatal error: limits.h: No such file or directory
[pipenv.exceptions.InstallError]:      #include_next <limits.h>  /* recurse down to the real one */
[pipenv.exceptions.InstallError]:                                                                  ^
[pipenv.exceptions.InstallError]:     compilation terminated.
[pipenv.exceptions.InstallError]:     error: command '/usr/local/bin/gcc-6' failed with exit code 1
[pipenv.exceptions.InstallError]:     ----------------------------------------
[pipenv.exceptions.InstallError]: ERROR: Command errored out with exit status 1: /Users/joeflack4/virtualenvs/ccdh-terminology-service/bin/python -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/private/var/folders/_0/hsvm3gjx1q7br2grg3gx901c0000gn/T/pip-install-1kj3xkg_/orderedset_e5f3b1e0909f4546bed29781911ee2e8/setup.py'"'"'; __file__='"'"'/private/var/folders/_0/hsvm3gjx1q7br2grg3gx901c0000gn/T/pip-install-1kj3xkg_/orderedset_e5f3b1e0909f4546bed29781911ee2e8/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /private/var/folders/_0/hsvm3gjx1q7br2grg3gx901c0000gn/T/pip-record-v1356qvt/install-record.txt --single-version-externally-managed --compile --install-headers /Users/joeflack4/virtualenvs/ccdh-terminology-service/include/site/python3.9/orderedset Check the logs for full command output.
ERROR: Couldn't install package: orderedset
 Package installation failed...

Local errors (Python 3.8.10)

Short error
[pipenv.exceptions.InstallError]: gcc-6: error: this compiler does not support arm64

Long error

(ccdh-terminology-service-3.8) pipenv sync
Courtesy Notice: Pipenv found itself running within a virtual environment, so it will automatically use that environment, instead of creating its own for any project. You can set PIPENV_IGNORE_VIRTUALENVS=1 to force pipenv to ignore that environment and create its own instead. You can set PIPENV_VERBOSITY=-1 to suppress this warning.
Installing dependencies from Pipfile.lock (dd1d2d)...
An error occurred while installing orderedset==2.0.3 --hash=sha256:b2f5ccfb5a86e7b3b3ddf18b29779cc18b24653abf9d6da4bebecf33780a6e29! Will try again.
An error occurred while installing -e git+ssh://git@github.com/HOT-Ecosystem/tccm-api.git@f1f6814d4b13c96f087d979c53c7793d497bc142#egg=tccm-api! Will try again.
An error occurred while installing -e git+ssh://git@github.com/HOT-Ecosystem/tccm-model.git@45fd80f702e27f778f0a2612624b168efacf1ad7#egg=tccm-model! Will try again.
  🐍   ▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉ 113/113 — 00:01:08
Installing initially failed dependencies...
[InstallError]:   File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pipenv/cli/command.py", line 696, in sync
[InstallError]:       system=state.system
[InstallError]:   File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pipenv/core.py", line 2892, in do_sync
[InstallError]:       system=system,
[InstallError]:   File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pipenv/core.py", line 1312, in do_init
[InstallError]:       pypi_mirror=pypi_mirror,
[InstallError]:   File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pipenv/core.py", line 900, in do_install_dependencies
[InstallError]:       retry_list, procs, failed_deps_queue, requirements_dir, **install_kwargs
[InstallError]:   File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pipenv/core.py", line 796, in batch_install
[InstallError]:       _cleanup_procs(procs, failed_deps_queue, retry=retry)
[InstallError]:   File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pipenv/core.py", line 703, in _cleanup_procs
[InstallError]:       raise exceptions.InstallError(c.dep.name, extra=err_lines)
[pipenv.exceptions.InstallError]: Collecting orderedset==2.0.3
[pipenv.exceptions.InstallError]:   Using cached orderedset-2.0.3.tar.gz (101 kB)
[pipenv.exceptions.InstallError]: Building wheels for collected packages: orderedset
[pipenv.exceptions.InstallError]:   Building wheel for orderedset (setup.py): started
[pipenv.exceptions.InstallError]:   Building wheel for orderedset (setup.py): finished with status 'error'
[pipenv.exceptions.InstallError]:   ERROR: Command errored out with exit status 1:
[pipenv.exceptions.InstallError]:    command: /Users/joeflack4/virtualenvs/ccdh-terminology-service-3.8/bin/python -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/private/var/folders/_0/hsvm3gjx1q7br2grg3gx901c0000gn/T/pip-install-3z8wx78x/orderedset_e9d54768e2504e649876c996c1a73597/setup.py'"'"'; __file__='"'"'/private/var/folders/_0/hsvm3gjx1q7br2grg3gx901c0000gn/T/pip-install-3z8wx78x/orderedset_e9d54768e2504e649876c996c1a73597/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /private/var/folders/_0/hsvm3gjx1q7br2grg3gx901c0000gn/T/pip-wheel-9mnlfoak
[pipenv.exceptions.InstallError]:        cwd: /private/var/folders/_0/hsvm3gjx1q7br2grg3gx901c0000gn/T/pip-install-3z8wx78x/orderedset_e9d54768e2504e649876c996c1a73597/
[pipenv.exceptions.InstallError]:   Complete output (18 lines):
[pipenv.exceptions.InstallError]:   running bdist_wheel
[pipenv.exceptions.InstallError]:   The [wheel] section is deprecated. Use [bdist_wheel] instead.
[pipenv.exceptions.InstallError]:   running build
[pipenv.exceptions.InstallError]:   running build_py
[pipenv.exceptions.InstallError]:   creating build
[pipenv.exceptions.InstallError]:   creating build/lib.macosx-11-universal2-3.8
[pipenv.exceptions.InstallError]:   creating build/lib.macosx-11-universal2-3.8/orderedset
[pipenv.exceptions.InstallError]:   copying lib/orderedset/__init__.py -> build/lib.macosx-11-universal2-3.8/orderedset
[pipenv.exceptions.InstallError]:   warning: build_py: byte-compiling is disabled, skipping.
[pipenv.exceptions.InstallError]:
[pipenv.exceptions.InstallError]:   running build_ext
[pipenv.exceptions.InstallError]:   building 'orderedset._orderedset' extension
[pipenv.exceptions.InstallError]:   creating build/temp.macosx-11-universal2-3.8
[pipenv.exceptions.InstallError]:   creating build/temp.macosx-11-universal2-3.8/lib
[pipenv.exceptions.InstallError]:   creating build/temp.macosx-11-universal2-3.8/lib/orderedset
[pipenv.exceptions.InstallError]:   /usr/local/bin/gcc-6 -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -arch arm64 -arch x86_64 -g -I/Users/joeflack4/virtualenvs/ccdh-terminology-service-3.8/include -I/Library/Frameworks/Python.framework/Versions/3.8/include/python3.8 -c lib/orderedset/_orderedset.c -o build/temp.macosx-11-universal2-3.8/lib/orderedset/_orderedset.o
[pipenv.exceptions.InstallError]:   gcc-6: error: this compiler does not support arm64
[pipenv.exceptions.InstallError]:   error: command '/usr/local/bin/gcc-6' failed with exit status 1
[pipenv.exceptions.InstallError]:   ----------------------------------------
[pipenv.exceptions.InstallError]:   ERROR: Failed building wheel for orderedset
[pipenv.exceptions.InstallError]:   Running setup.py clean for orderedset
[pipenv.exceptions.InstallError]: Failed to build orderedset
[pipenv.exceptions.InstallError]: Installing collected packages: orderedset
[pipenv.exceptions.InstallError]:     Running setup.py install for orderedset: started
[pipenv.exceptions.InstallError]:     Running setup.py install for orderedset: finished with status 'error'
[pipenv.exceptions.InstallError]:     ERROR: Command errored out with exit status 1:
[pipenv.exceptions.InstallError]:      command: /Users/joeflack4/virtualenvs/ccdh-terminology-service-3.8/bin/python -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/private/var/folders/_0/hsvm3gjx1q7br2grg3gx901c0000gn/T/pip-install-3z8wx78x/orderedset_e9d54768e2504e649876c996c1a73597/setup.py'"'"'; __file__='"'"'/private/var/folders/_0/hsvm3gjx1q7br2grg3gx901c0000gn/T/pip-install-3z8wx78x/orderedset_e9d54768e2504e649876c996c1a73597/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /private/var/folders/_0/hsvm3gjx1q7br2grg3gx901c0000gn/T/pip-record-l4xz0zag/install-record.txt --single-version-externally-managed --compile --install-headers /Users/joeflack4/virtualenvs/ccdh-terminology-service-3.8/include/site/python3.8/orderedset
[pipenv.exceptions.InstallError]:          cwd: /private/var/folders/_0/hsvm3gjx1q7br2grg3gx901c0000gn/T/pip-install-3z8wx78x/orderedset_e9d54768e2504e649876c996c1a73597/
[pipenv.exceptions.InstallError]:     Complete output (17 lines):
[pipenv.exceptions.InstallError]:     running install
[pipenv.exceptions.InstallError]:     running build
[pipenv.exceptions.InstallError]:     running build_py
[pipenv.exceptions.InstallError]:     creating build
[pipenv.exceptions.InstallError]:     creating build/lib.macosx-11-universal2-3.8
[pipenv.exceptions.InstallError]:     creating build/lib.macosx-11-universal2-3.8/orderedset
[pipenv.exceptions.InstallError]:     copying lib/orderedset/__init__.py -> build/lib.macosx-11-universal2-3.8/orderedset
[pipenv.exceptions.InstallError]:     warning: build_py: byte-compiling is disabled, skipping.
[pipenv.exceptions.InstallError]:
[pipenv.exceptions.InstallError]:     running build_ext
[pipenv.exceptions.InstallError]:     building 'orderedset._orderedset' extension
[pipenv.exceptions.InstallError]:     creating build/temp.macosx-11-universal2-3.8
[pipenv.exceptions.InstallError]:     creating build/temp.macosx-11-universal2-3.8/lib
[pipenv.exceptions.InstallError]:     creating build/temp.macosx-11-universal2-3.8/lib/orderedset
[pipenv.exceptions.InstallError]:     /usr/local/bin/gcc-6 -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -arch arm64 -arch x86_64 -g -I/Users/joeflack4/virtualenvs/ccdh-terminology-service-3.8/include -I/Library/Frameworks/Python.framework/Versions/3.8/include/python3.8 -c lib/orderedset/_orderedset.c -o build/temp.macosx-11-universal2-3.8/lib/orderedset/_orderedset.o
[pipenv.exceptions.InstallError]:     gcc-6: error: this compiler does not support arm64
[pipenv.exceptions.InstallError]:     error: command '/usr/local/bin/gcc-6' failed with exit status 1
[pipenv.exceptions.InstallError]:     ----------------------------------------
[pipenv.exceptions.InstallError]: ERROR: Command errored out with exit status 1: /Users/joeflack4/virtualenvs/ccdh-terminology-service-3.8/bin/python -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/private/var/folders/_0/hsvm3gjx1q7br2grg3gx901c0000gn/T/pip-install-3z8wx78x/orderedset_e9d54768e2504e649876c996c1a73597/setup.py'"'"'; __file__='"'"'/private/var/folders/_0/hsvm3gjx1q7br2grg3gx901c0000gn/T/pip-install-3z8wx78x/orderedset_e9d54768e2504e649876c996c1a73597/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /private/var/folders/_0/hsvm3gjx1q7br2grg3gx901c0000gn/T/pip-record-l4xz0zag/install-record.txt --single-version-externally-managed --compile --install-headers /Users/joeflack4/virtualenvs/ccdh-terminology-service-3.8/include/site/python3.8/orderedset Check the logs for full command output.
ERROR: Couldn't install package: orderedset
 Package installation failed...

Things I've tried

  1. Tried installing directly from the pipenv lockfile using version 2.0.3.
  2. Tried a local environment using both Python 3.8 and 3.9.

Ideas

Idea 1: Post a GH issue on OrderedSet repo

I could post a GitHub issue and have them take a look at it.

I'm not sure, but this may be related to similar C compiler related issues previously fixed: simonpercivall/orderedset#27

It may just be that they have rigid requirements for GCC setup. But hopefully this is something that they can maintain/fix, as opposed to anyone who wants to use this dependency.

Issue mirrors

Copy of: "monarch-initiative/omim/pull/7": "Label/synonym formatting: lowercase except for acronyms": Questions

[This is a copy paste from a private repository so that I can share these questions w/ original markdown formatting]


Assignment

Lowercase all text in labels and synonyms, except for any abbreviations / acronyms.

Questions

1. Did you mean acronyms rather than abbreviations?

AFAIK, acronyms are a subset of abbreviations. I'm guessing you meant acronyms because (i) this makes more sense to me, and (ii) online examples showed that where OMIM says "abbreviation", it means acronym.

Answers
Nico: Yes :) If you look at mondo, it uses the ABBREVIATION synonym type to flag these (I made a related issue here: information-artifact-ontology/ontology-metadata#70).

2. By acronyms/abbreviations, do you mean the ones that appear after the trailing ;, or also ones in the string before it?

Perhaps this is just part of the JSON file we have, and not usually the formatting we'll be ingesting, but most of the labels/synonyms are originally formatted like so:

SMALL NUCLEOLAR RNA, C/D BOX, 43; SNORD43

Most of the strings have a semicolon with an acronym at the end, but also acronyms within the label, such as "RNA" in this case (and maybe C/D?).

Answers
(None)

3. I assume non-acronym abbreviations do not need to retain original capitalization and can be converted to full lowercase. Is that correct?

Example: "Prof." for "Professor".

Answers
(None)

4. Do you want (; trailing) abbreviations/acronyms to be removed from labels and synonyms?

Before I picked up this project, Dazhi was already removing the "; trailing acronyms" from the labels. But they are retained in the graph. They're just placed elsewhere. See code snippet below.

        # Labels
        abbrev = label.split(';')[1].strip() if ';' in label else None  #  <-------------- extracted

        if omim_type == OmimType.HERITABLE_PHENOTYPIC_MARKER:  # %
            graph.add((omim_uri, RDFS.label, Literal(cleanup_label(label))))
            graph.add((omim_uri, BIOLINK['category'], BIOLINK['Disease']))
        elif omim_type == OmimType.GENE or omim_type == OmimType.HAS_AFFECTED_FEATURE:  # * or +
            omim_type = OmimType.GENE
            graph.add((omim_uri, RDFS.label, Literal(abbrev))) # <------------- but retained
            graph.add((omim_uri, RDFS.subClassOf, SO['0000704']))
            graph.add((omim_uri, BIOLINK['category'], BIOLINK['Gene']))
        elif omim_type == OmimType.PHENOTYPE:  # #
            graph.add((omim_uri, RDFS.label, Literal(cleanup_label(label))))
            graph.add((omim_uri, BIOLINK['category'], BIOLINK['Disease']))
        else:
            graph.add((omim_uri, RDFS.label, Literal(cleanup_label(label))))

4.a. Should I remove "; trailing acronyms" from the synonym label as well?
4.b. If so, should I place the "; trailing acronyms" somewhere else in the graph? I'm not yet sure where they would go or precisely how to add them. The code snippet below shows how synonyms are currently being added to the graph.

        graph.add((omim_uri, oboInOwl.hasExactSynonym, Literal(cleanup_synonym(label))))
        for label in other_labels:  # < ------ "other_labels" defined elsewhere
            graph.add((omim_uri, oboInOwl.hasRelatedSynonym, Literal(cleanup_synonym(label))))

Answers
Nico: Whatever Mondo does. There are many OMIM labels in Mondo! (Edit, Joe: I haven't yet found any ;-delimited labels in the online examples that I've looked at. Maybe I haven't searched hard enough. Or maybe the JSON file we're ingesting is formatted differently than other/canonical label formats for Mondo?)

5. Do my TODOs look OK? Are they all necessary?

I have added comments to one of my files with a list of TODOs. I have 2 that I know I need to do, and 5 that are a bit trickier that I should prioritize later, or that maybe we don't need to do at all. I feel pretty confident about the 2 at the top. But could you check the "TODO laters" and let me know if you want me to do those as well?

# TODO: 1. Maintain capitalization for any acronym in a label, defined as having
# ...1+ A-Z && 1+ 0-9. I'm already capturing acronyms with periods, e.g. "A.B.C", although
# ...our source file does not appear to have many of those.
# TODO 2. Do we want (; trailing) abbreviations/acronyms to be removed from synonyms as well?

# TODO Laters:
# 1: Find a pattern for hyphenated types, and maintain acronym capitalization
# ...e.g. MITF-related melanoma and renal cell carcinoma predisposition syndrome
# ...e.g. ATP1A3-associated neurological disorder
# 2. Make pattern for chromosomes
# ...agonadism, 46,XY, with intellectual disability, short stature, retarded bone age, and multiple extragenital malformations
# 3. How to find acronym if it is capitalized but only includes char [A-Z], and
# ... every other char in the string is also capitalized? I don't see a way unless
# ... checking every word against an explicit dictionary of terms, though there are sure
# ... to also be (i) acronyms in that dictionary, and (ii) non-acronyms missing from
# ... that dictionary. And also concern (iii), where to get such an extensive dictionary?
# 4. Add "special character" inclusion into acronym regexp. But which special
# ... chars to include, and which not to include?
# 5. Acronym capture extension: case where at least 1 word is not capitalized:
# ... any word that is fully capitalized might as well be acronym, so long
# ...as at least 1 other word in the label is not all caps. Maybe not a good rule,
# ...because there could be some violations, and this probably would not happen that often anyway
# 6. Eponyms: re-capitalize first char?
# ...e.g.: Balint syndrome, Barre-Lieou syndrome, Wallerian degeneration, etc.
# ...How to do this? Simply get/create a list of known eponyms? Is this feasible?
# 7. Chromosome special formatting capitalization?
# ...There seems to be special formatting for chromosome refs; they have a comma in the middle, but with no space 
# ...after the comma, though some places I saw on the internet contained a space.
# ...e.g. "46,XY" in: agonadism, 46,XY, with intellectual disability, short stature, retarded bone age, and multiple extragenital malformations
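
For discussion, here is a minimal sketch of the rule in TODO 1, assuming tokens are split on whitespace and that a token counts as an acronym when it mixes at least one A-Z with at least one digit, or is a dotted form such as "A.B.C". The names `is_acronym` and `lowercase_label` are hypothetical and are not taken from the project code.

```python
import re

# Dotted forms such as "A.B.C" or "A.B.C." (assumed pattern, per TODO 1).
DOTTED = re.compile(r"^[A-Z](?:\.[A-Z])+\.?$")


def is_acronym(token: str) -> bool:
    """Return True if the token matches the TODO 1 definition of an acronym."""
    core = token.strip(",;:()")  # ignore surrounding punctuation
    has_upper = re.search(r"[A-Z]", core) is not None
    has_digit = re.search(r"[0-9]", core) is not None
    return (has_upper and has_digit) or DOTTED.match(core) is not None


def lowercase_label(label: str) -> str:
    """Lowercase every whitespace-separated token except detected acronyms.

    Note: runs of whitespace are collapsed to single spaces.
    """
    return " ".join(
        tok if is_acronym(tok) else tok.lower() for tok in label.split()
    )


print(lowercase_label("SMALL NUCLEOLAR RNA, C/D BOX, 43; SNORD43"))
```

With this rule, "SNORD43" keeps its capitalization but "RNA" is lowercased, which is exactly the limitation raised in "TODO later" 3: letters-only acronyms inside an all-caps label cannot be detected without some external word list.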

Answers
(None)

6. Are there capitalization mistakes in the existing mondo hierarchy linked earlier?

I checked this out, thanks. But I noticed what looked to me like a lot of mistakes at the top of the "Mendelian diseases" hierarchy (attached). Some of these are eponyms, etc., but a lot of them look like mistakes to me. I also saw that "Achoo" syndrome should actually be an acronym (ACHOO), but it isn't capitalized as one in this Mondo hierarchy. This isn't related to my requirements, but I wanted to draw it to your attention.
[Attached image: Capitalized medallion diseases]

Answers
(None)

Remove duplicates in value_only mode

Original Post

The Terminology Service currently returns duplicate permissible values in value-only mode. For example, looking up the enumerations for CRDC-H.Treatment.treatment_effect returns two values for "Unknown", one of which has a description. This duplicate doesn't appear if value_only is set to true. So this problem might go away once we move over to concept identifiers, but I wanted to report it now so we remember to check it later.

Edit from Joe, 2021/09/22

Example yaml

Here is the result from the link that Gaurav posted:

name: CRDC-H.Treatment.treatment_effect
description: Autogenerated Enumeration for CRDC-H Treatment treatment_effect
permissible_values:
- text: 'Yes'
- text: Not Reported  # <---1
- text: Unknown  # <---2
- text: 'No'
- text: Unknown  # <---2
  description: Unknown
- text: Incomplete Necrosis (Viable Tumor Present)
- text: No Necrosis
- text: No Known Treatment Effect
- text: Complete Necrosis (No Viable Tumor)
- text: Not Reported  # <---1
  description: Not Reported

Related issues

linkml/linkml#1068

Tasks

  • 1. Fix this in the terminology service for now
  • 2. Fix this in linkml-runtime (I'm thinking)
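
As a starting point for task 1, here is a rough sketch of de-duplicating permissible values by their `text`, assuming they are available as a list of plain dicts shaped like the YAML above. The function name `dedupe_permissible_values` is hypothetical, and keeping the entry that carries a description is an assumption about the desired behavior, not something decided in this issue.

```python
from typing import Dict, List


def dedupe_permissible_values(values: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep one entry per `text`, preserving first-seen order.

    When duplicates exist, the surviving entry picks up a description
    from a later duplicate if it did not already have one.
    """
    merged: Dict[str, Dict[str, str]] = {}
    for value in values:
        text = value["text"]
        kept = merged.get(text)
        if kept is None:
            merged[text] = dict(value)  # copy so the input is not mutated
        elif "description" not in kept and "description" in value:
            kept["description"] = value["description"]
    return list(merged.values())


example = [
    {"text": "Yes"},
    {"text": "Not Reported"},
    {"text": "Unknown"},
    {"text": "No"},
    {"text": "Unknown", "description": "Unknown"},
]
print(dedupe_permissible_values(example))
```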

Better control over API url case sensitivity

Description

Correct endpoint: https://terminology.ccdh.io/models/CRDC-H/entities
Problematic endpoint example: https://terminology.ccdh.io/models/crdc-h/entities

The problem occurs when a request does not match the case-sensitive path segments of the API.

Perhaps the API could first run the query using the exact casing passed by the user. If that finds nothing, it could retry with everything lowercased, then with everything title-cased, and return the first results it finds.

Optionally, it would also be useful to return some metadata explaining to the user what happened.

Current behavior

Returns an empty list [ ].

Expected behavior

404 not found or alternative.

Possible solutions

1. In a decorator

Perhaps this functionality could live in a decorator. Decorators usually run code before the function they wrap is called, though. In our case, we'd want to run the function first and then, if it returned nothing, change the casing and see if we could get a result.
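
For what it's worth, a decorator's wrapper can also act after the wrapped call returns, so the retry idea seems feasible in plain Python. Below is a framework-independent sketch; `with_case_fallback`, `list_entities`, and the `_MODELS` data are hypothetical, and trying an all-uppercase variant is an added assumption (title-casing "crdc-h" gives "Crdc-H", which would not match "CRDC-H").

```python
from functools import wraps
from typing import Callable, Dict, List


def with_case_fallback(
    lookup: Callable[[str], List[str]]
) -> Callable[[str], Dict[str, object]]:
    """Call `lookup` with the exact name first, then retry other casings."""

    @wraps(lookup)
    def wrapper(name: str) -> Dict[str, object]:
        tried: List[str] = []
        for candidate in (name, name.lower(), name.title(), name.upper()):
            if candidate in tried:
                continue  # skip casings that are identical to an earlier try
            tried.append(candidate)
            results = lookup(candidate)
            if results:
                # Metadata tells the caller which casing actually matched.
                return {"results": results, "matched_as": candidate, "tried": tried}
        return {"results": [], "matched_as": None, "tried": tried}

    return wrapper


# Hypothetical stand-in for the real database query.
_MODELS = {"CRDC-H": ["Subject", "Treatment"]}


@with_case_fallback
def list_entities(model: str) -> List[str]:
    return _MODELS.get(model, [])


print(list_entities("crdc-h"))
```

When `matched_as` is `None`, the route could respond with a 404 instead of an empty list, which would also cover the expected behavior described above.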

2. Modify all routes

This seems like it would be a lot of work, and probably too cumbersome.

3. Change the nature of routes? From functions to classes?

Not sure how this would be done, or if it is possible from an architectural standpoint. Does Flask or FastAPI support routes as classes instead of functions?

4. Swagger-only approach

It might be possible for Swagger to support case insensitivity:
go-swagger/go-swagger#303
https://community.smartbear.com/t5/Swagger-Open-Source-Tools/Case-Insensitive-String-parameter-in-schema-of-openApi/td-p/199061

Incorrect `creator_id` in `models/PDC/entities/<ENTITY>/attributes/<ATTR>/mappings`

Description

I'm not sure if this issue applies only to PDC, and/or only to the models/<MODEL>/entities/<ENTITY>/attributes/<ATTR>/mappings endpoint, but I think the creator_id shown is incorrect.

Example query

https://terminology.ccdh.io/models/PDC/entities/Demographic/attributes/race/mappings

{
  "creator_id": "https://orcid.org/0000-0000-0000-0000",
  "license": "https://creativecommons.org/publicdomain/zero/1.0/",
  "mapping_provider": "https://ccdh.cancer.gov/",
  "mappings": [
    {
      "subject_match_field": "PDC.Demographic.race",
      "subject_label": "not allowed to collect",
      "predicate_id": "SKOS:exactMatch",
      "object_id": "NCIT:C141478",
      "object_label": "Not Allowed To Collect",
      "object_match_field": "CRDC-H.Subject.race",
      "creator_id": "https://gdc.cancer.gov",
      "comment": null,
      "mapping_date": null
    },
    ...
  ]
}
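
One way to find other records like this could be a small consistency check over the response. The sketch below assumes that the first dotted segment of `subject_match_field` names the source model and that the creator of a GDC or PDC mapping should be `https://<model>.cancer.gov`; that convention is inferred from this one example and may not hold for other models. Both function names are hypothetical.

```python
from typing import Dict, List, Optional


def expected_creator_id(subject_match_field: str) -> str:
    """Derive the assumed creator URL from e.g. 'PDC.Demographic.race'."""
    model = subject_match_field.split(".", 1)[0]
    return f"https://{model.lower()}.cancer.gov"  # assumed convention


def mismatched_creator_ids(
    mappings: List[Dict[str, Optional[str]]]
) -> List[Dict[str, Optional[str]]]:
    """Return the mappings whose creator_id differs from the assumed one."""
    return [
        m
        for m in mappings
        if m.get("creator_id") != expected_creator_id(m["subject_match_field"] or "")
    ]


sample = [
    {
        "subject_match_field": "PDC.Demographic.race",
        "object_id": "NCIT:C141478",
        "creator_id": "https://gdc.cancer.gov",
    }
]
print(mismatched_creator_ids(sample))
```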

Expected result

I expected 'pdc' for subdomain.

      "creator_id": "https://pdc.cancer.gov",

Actual result

Received 'gdc' subdomain.

      "creator_id": "https://gdc.cancer.gov",

Implement an ICDC data dictionary importer

Description

Need to import ICDC data into our database.

Questions

Tasks

Check here when tasks are adequately answered

  • 1. Where is the data dictionary, if it exists, in whatever form it exists?
  • 2. What are the limitations to the current data dictionary, if it exists?
  • 3. What are the timelines for this data dictionary to be upgraded into something that would be more useful for us?
  • 4. Who is the primary ICDC contact for this data dict?
  • 5. What work is needed on our end to import, given what we know now?

Discussion

1. Where is the data dictionary, if it exists, in whatever form it exists?

So far, this repo dir contains the model in 2 files (though there is a 3rd .mdf schema file):

2. What are the limitations to the current data dictionary, if it exists?

  • Structure: From what I can see so far, structure is a bit different, but that may be about it.
  • Completeness: It doesn't seem like a very big model. Is this it?

3. What are the timelines for this data dictionary to be upgraded into something that would be more useful for us?

Enumerations will either (a) be added where localhost:refs appear, or (b), more likely, be resolvable via queries against a server that they will publish.

From what I can gather, this will probably be done in 2022, probably within the first half of the year.

4. Who is the primary ICDC contact for this data dict?

Mark Jensen & Philip Musk

5. What work is needed on our end to import, given what we know now?

TODO

Test server

Description

We need a test server

Steps

We can run the test server as a separate process on our current server, at test.terminology.server.io. This means setting up a second Docker container that uses a different port (any free port will do) and editing /etc/nginx/nginx.conf.

Backlog

We'll have to contact CRDC about budget / funding. Once approved, we can set up a test server, if necessary.
