ctti-clinicaltrials / aact Goto Github PK

Improving Public Access to Aggregate Content of ClinicalTrials.gov

Home Page: http://aact.ctti-clinicaltrials.org

License: MIT License

Ruby 40.63% HTML 1.42% Shell 0.05% PLpgSQL 57.85% Dockerfile 0.04% JavaScript 0.01%

aact's Introduction

AACT

AACT or Aggregated Analysis of ClinicalTrials.gov is an application for retrieving all of the public data from clinicaltrials.gov (via their API) and make the information available in a relational database. We do this to make this valuable body of information accessible to the public as a complete aggregated set of data.

If you need a copy of the database, but don't want to bother installing & running this app, copies of the database are available for download from the AACT website (Download page). We use pg_dump to create a snapshot of the database after each nightly update, so a version is always available with the most current info from clinicaltrials.gov.

Database Explanation:

Below you'll find an image that illustrates the different AACT databases and schemas, while briefly describing their purposes.

Requireqments

Install zip
- For mac brew install zip
- For windows gem install rubyzip
- For linux sudo apt install ruby-zip
Install graphviz
Install wget
- For mac brew install wget
- For windows
- For Ubuntu and Debian sudo apt install wget
- For CentOS and Fedora sudo yum install wget

Before Start

Common for Core and Admin apps setting up problems/issues:

Ubuntu versions 22.04 and newer have no built-in openssl 1.1.1 and it causes problems with the installation of ruby versions older than 3.02 (information actual 6/20/2023).
Keep in mind that these 2 apps will interact with each other, the purpose of the Core app is for populating DB only. AACT-admin uses the same db as AACT-core. Do not try to "rails s" AACT-core.
In those 2 lines in the environment variables file: export AACT_CORE_DATABASE_URL=postgres://username:passw@localhost:5432/aact export AACT_CORE_TEST_DATABASE_URL=postgres://username:passw@localhost:5432/aact_test username - your Postgresql DB username passw - your postgres DB password

Getting Started

Install PostgreSQL brew install postgresql
https://www.postgresql.org/download/linux/ubuntu/
https://www.postgresql.org/download/macosx/
Create the roles you need for running AACT.
Enter psql postgres command in terminal to open postgres console and enter commands below to set up postgres database:
postgres=# create role <your_aact_superuser> login password <your_superuser_password>;
postgres=# alter user <your_aact_superuser> with superuser;
postgres=# create role read_only;
postgres=# create database aact_alt;
postgres=# \q this exits out of psql
Verify your new user can login to psql with command: psql -U <your_aact_superuser> -d postgres OR use command sudo -u postgres psql for it. You can exit the shell once you see you can log in.
If your terminal asks for a password when logging in to psql you can give it the password automatically by adding it to the “.pgpass” file. If you haven’t been asked for a password, you can skip this step. The “.pgpass” should be at your root.
echo 'localhost:5432:aact:<superuser_name>:<superuser_password>' >> ~/.pgpass
Now check that you can login to psql without giving a password
psql -U <username> -d postgres You can exit the shell once you see you can log in without a password.

Here is a document about the “.pgpass” file https://www.postgresql.org/docs/current/libpq-pgpass.html.

Now we want to store the environmental variables for the superuser that you created in the previous step. That is the user you will be using within AACT to make database changes. You can store these wherever is appropriate for your system. On a Mac you can store it in your “.zshrc”. On all systems you can also store it in your “.bash_profile” or “.bashrc”.
For the following commands I’m storing variables in the “.zshrc” file, change out that file in the commands for the one you use for storing variables.

echo 'export AACT_DB_SUPER_USERNAME=<your_aact_superuser>' >> ~/.zshrc
echo 'export AACT_PASSWORD=<your_superuser_password>'  >> ~/.zshrc
echo 'export PUBLIC_DB_USER=<your_aact_superuser>'  >> ~/.zshrc
echo 'export PUBLIC_DB_PASS=<your_superuser_password>'  >> ~/.zshrc
# for development
echo 'export AACT_USERNAME=<your_superuser_name>'  >> ~/.zshrc 
echo 'export AACT_PASSWORD=<your_superuser_password>'  >> ~/.zshrc
echo 'export AACT_ADMIN_USERNAME=<username for aact_admin database>'  >> ~/.zshrc    
echo 'export AACT_ADMINPASSWORD=<password for aact_admin database>'  >> ~/.zshrc   
echo 'export AACT_PUBLIC_USERNAME=<username for aact_public database>'  >> ~/.zshrc    
echo 'export AACT_PUBLIC_PASSWORD=<password for aact_public database>'  >> ~/.zshrc    

source ~/.zshrc # to load the variables into the terminal session.

Depending on where you store the variables you may need to call source on that file each time you open a new terminal. This is not necessary for “.zshrc”. When you're done check your “.zshrc” or ".bashrc" file. It should look like the example below. If it doesn't, edit it accordingly.

Example of a set of environment variables:

export AACT_PASSWORD=54104754
export AACT_USERNAME=user_name
export PGPASSWORD=passw
export TEST_PUBLIC_DB_USER=user_name
export TEST_PUBLIC_DB_PASS=passw
export PUBLIC_DB_USER=user_name
export PUBLIC_DB_PASS=passw
export AACT_DB_SUPER_USERNAME=user_name
export PATH=$PATH:/lib/postgresql/15/bin
export AACT_CORE_DATABASE_URL=postgres://DB_USER_NAME:DB_PASSW@localhost:5432/aact
export AACT_CORE_TEST_DATABASE_URL=postgres://DB_USER_NAME:DB_PASSW@localhost:5432/aact_test

Add RVM to PATH for scripting. Make sure this is the last PATH variable change.

export PATH="$PATH:$HOME/.rvm/bin"

Ask teamleader for keys instead of 'XXXXXX...'

export DIGITALOCEAN_ACCESS_KEY_ID=XXXXXXXXXXXXXXXXXXXXXXXX
export DIGITALOCEAN_SECRET_ACCESS_KEY=XXXXXXXXXXXXXXXXXXXXX 
export DIGITALOCEAN_REGION=nyc3
export DIGITALOCEAN_BUCKET=aact-dev

Clone this repo: git clone [email protected]:ctti-clinicaltrials/aact.git
Note: Cloning with a ssh url requires that your local ssh key is saved to Github. The key verifies your permission to push and pull so you won't have to log in. If you haven't saved your ssh key to Github use the html url instead when cloning.
cd aact to enter the directory
Install a ruby version manager like rbenv, then install Ruby 2.7.7
Bundle install
The pg gem (used by AACT) may have trouble finding your PostgreSQL installation. If not, skip this step.
For Mac you can direct it to the right location location by adding
echo ‘export PATH=$PATH:/Library/PostgreSQL/x.y/bin’ >> ~/.zshrc change x.y to the version number of your PostgreSQL installation.
Example: export PATH=$PATH:/Library/PostgreSQL/13/bin
https://wikimatze.de/installing-postgresql-gem-under-ubuntu-and-mac/
In the database.yml file, which is in config folder, lines 1-32 should look like:

default: &default
  encoding: utf8
  adapter: postgresql
  min_messages: warning
  pool: <%= [Integer(ENV.fetch("MAX_THREADS", 50)), Integer(ENV.fetch("DB_POOL", 50))].max %>
  timeout: 5000
  
  //  Make sure you've created a database account for the AACT_DB_SUPER_USERNAME (default: aact) with permission to create databases.//
 
  // Also, add an entry for this user in the .pgpass file in the root directory of the user who run the rails apps so the app knows its password.//
  
  username: <%= ENV.fetch("AACT_DB_SUPER_USERNAME", 'your_db_username') %>

development:
  primary:
    <<: *default
    host: localhost
    port: 5432
    username: <%= ENV.fetch("AACT_USERNAME", 'your_db_username') %>
    password: <%= ENV.fetch("AACT_PASSWORD", 'your_password') %>
    database: aact
  admin:
    <<: *default
    host: localhost
    port: 5432
    username: <%= ENV.fetch("AACT_USERNAME", 'your_db_username') %>
    password: <%= ENV.fetch("AACT_PASSWORD", 'your_password') %>
    database: aact_admin
  public:
    <<: *default
    host: localhost
    port: 5432
    username: <%= ENV.fetch("AACT_USERNAME", 'your_db_username') %>
    password: <%= ENV.fetch("AACT_PASSWORD", 'your_password') %>
    database: aact_public

Create databases and run migrations
bin/rake db:create
bin/rake db:create RAILS_ENV=test
bin/rake db:migrate
bin/rake db:migrate RAILS_ENV=test

Populating the Database

The seed files are out of date so DO NOT call db:seed. Instead use the custom rake tasks. These are your options:

bin/rake db:restore_from_file[<path_to_file>,<database_name>]
For this option go to https://aact.ctti-clinicaltrials.org/snapshots and download a copy of the database. Unzip the snapshot folder.
The file path will likely look like: ~/Downloads/<unzipped_snapshot_folder>/postgres_data.dmp
Example: ~/Downloads/20210906_clinical_trials/postgres_data.dmp.
Give this task the path to the postgres_data.dmp file and it will use it to populate the database.
Example: bin/rake "db:restore_from_file[~/Downloads/20210906_clinical_trials/postgres_data.dmp,aact]"
bin/rake db:restore_from_url[<url>,<database_name>]
For this option go to https://aact.ctti-clinicaltrials.org/snapshots and copy the link for one of the database copies. Give this task the url you copied and it will download the file, unzip it, and use it to populate the database.
Note: the rake tasks below take a very long time to run. You should not set full_featured to true if working locally.
bin/rake db:load[<days_back>,<event_type>,<full_featured>]
The days back is an integer, the event_type only takes "full" or "incremental", full_featured is a boolean. You do not have to give it any parameters. If you have no studies it will populate your database with all the studies.
bin/rake db:load_study[<nct_id>] this will update only one study from the clinicaltrials.gov api

Workflow

Branches:

master - This is the stable production branch, anything merged to master is meant to be propagated to production. Hot fixes will be merged directly to master, then pulled into dev. All other Pull Requests (PRs) will be merged into dev first.
dev - This branch contains the changes for the sprint. It is an accumulation of everything that we believe is working and ready for the next release.
feat/AACT-NUM-description - "AACT-Num" refers to the number of the card on Jira. Description is the name of the feature. This is the naming conventions for a feature that you are working on that eventually will be merged to dev once the PR is approved.
fix/AACT-NUM-description - This is the naming conventions for a bug fix. The PR will be merged into dev when approved.
hotfix/AACT-220-description - This is the naming conventions for an emergency fix. This branches off of master and gets merged into master when the PR is approved because it is a fix that needs to be deployed ASAP.

Treat dev as the main branch. Only branch off of master if you need to do a hotfix.

Normal Process

Pick a ticket to work on
Branch off of dev using the naming convention mentioned above to name your branch
Work on the feature or bug fix
Run tests and make sure they pass before creating a PR
Once complete create a PR to dev
Request review for the PR from two people
If there are change requests, makes the changes, run tests and request a review. If not continue to the next step.
The PR will be approved and merged to dev
At the end of the sprint the dev will be merged to master (we will add a semantic tag, this is where we will decide which version number to pick)
Deploy master to production

Hotfix Process

Branch off of master using the naming convention mentioned above to name your branch
Work on the bug fix
Run tests and make sure they pass
Create PR to master
Request review for the PR from two people. PR review could be expedited depending on the emergency
Merge PR to master
Deploy master to production
Bring changes into dev (once things stabilize)

API V2 Comparison Tool

When finishing a model, run the comparison using the following command:

`rake compare:v2_model[,]``

for example:

  rake compare:v2_model[NCT06225401,study]
  rake compare:v2_model[NCT06225401,design]

This tool will tell you where there is a mismatch between version 1 and version 2 of the api

aact's People

Contributors

Stargazers

Watchers

aact's Issues

pg_restore issue - ctti role does not exist

Since 4.4.0 has been pushed, I'm having trouble loading the data. Usually, I can run the following without any problems:

psql -a -c ""DROP SCHEMA IF EXISTS ctgov CASCADE""
pg_restore -e -v -x -O -d mydb dumpfile

But now, I'm getting errors when I do that:

pg_restore: [archiver (db)] could not execute query: FATAL:  role "ctti" does not exist
SSL connection has been closed unexpectedly
     Command was: ALTER ROLE ctti IN DATABASE aact SET search_path TO 'ctgov', 'support', 'public';

It seems that the command mentioned in the error was not previously in the daily (or monthly) dump files. Was it added on purpose? I do not have a ctti user (and cannot create one).

Issue with installing aact on postgresql

I have follewed the instructions to install the static copy of aact on postgres v11.4 windows. While restoring the dump file, there were repeating warning like "pg_restore: [archiver (db)] Error from TOC entry ", or "pg_restore: [archiver (db)] could not execute query".

When the process finished, it seemed that I could query aact on psql. However, when I try this:

aact=# SELECT COUNT(*) FROM studies;
count

945720
(1 record)

This number is much higher than the result on the instruction page. So I was wondering whether I have a corret copy of the database, or I need to reinstall it? What might be the issue resulting these warnings?

THX!

Can we reproduce advanced search with new database?

For example: in old version we have: https://clinicaltrials.gov/ct2/results?term=psoriasis&recrs=e&age_v=&gndr=&type=&rslt=With&Search=Apply
It leads to : "276 Studies found for: psoriasis | Completed Studies | Studies With Results"
I mean, I applied 2 filters for one term. In database I'm not quite sure how to build a query: for example:
"SELECT nct_id FROM studies NATURAL JOIN browse_conditions WHERE (overall_status = 'Completed' AND (downcase_mesh_term = 'psoriasis' OR mesh_term = 'Psoriasis'))"
leads to 713 results but not mentioned "with result" filter. Additionally, "overall_status" seems to not be equal with "completed" filter, because for some NCTs they are not paired.

Please tell me some suggestions, maybe I look into wrong tables.

Incorrect dispersion values, dispersion upper limits and dispersion lower limits in table outcome_measurements

Hello,

There is a problem with the dispersion measurements in the table outcome_measurements. In fact, in some cases the column dispersion_value is filled with data coming from a different outcome (it should be Null). In other cases, the two columns dispersion upper limit and dispersion lower limit have this same problem.

An example trial is https://clinicaltrials.gov/ct2/show/results/NCT01597388, in the 23rd outcome entitled 'Time to AZD2014 Peak Plasma Concentration at Steady State (Tmax,ss) on Cycle 1 Day 22, Continuous Dosing, With Fulvestrant' for the first arm, we have normally a dispersion_lower_limit equaling 0.5, a dispersion_upper_limit equaling 4 and no dispersion_value.

However, for this same outcome in the table outcome_measurements we find a dispersion_value equals to 447.9 instead of Null as it should be. The value 447.9 corresponds to the 22nd outcome. Hence, it shouldn't be put in 23rd one's measurements.

I joined two figures for the example i mentioned, the first one is the data in clinicaltrials.gov and the second corresponds to what we find in the table outcome_measurements.

Thank you

RSS reader should support pagination

The clinicaltrials.gov RSS feed will return a maximum of 1000 records at a time. Currently, the code is set to request 10000 records, but this will not work. Instead, pagination should be used, by adding the start query parameter.

See https://clinicaltrials.gov/ct2/results/rss.xml?rcv_d=&lup_d=14&lupd_s=12%2F05%2F2019&lupd_e=12%2F06%2F2019&count=1000&start=0 for an example. If you change the count to 10000, there are still only 1000 records. But if you change start to 1000, the first trial in the results is different (as are all the others).

setup script fails due to missing rake task

./bin/setup fails with the following error output:

rake aborted!
Don't know how to build task 'dev:prime' (see --tasks)

the 'dev:prime' task is not listed in the output of

bin/rake --tasks

Table mesh_terms is empty after AACT-Migration

Since the 15.11.2021 the new AACT-Database is already Live.

The Migration-Documentation gave the impression that it would not go Live until 01.01.2022
see: https://aact.ctti-clinicaltrials.org/beta/migration

In the AACT-Database the table mesh_terms is now completely Empty.

Could you populate this table again with all the mesh_terms, like it was before the 15.11.2021 ?

Thanks in advance and best regards!

don't strip out <li> tags and line returns in criteria

studies.limitations_and_caveats missing

Recently i noticed the field limitations_and_caveats of the studies table not being filled any more.

In the pipe-delimited-export from september 2021 there is:
NCT02799667 -> "Early termination secondary to slow recruitment."

Which is still available on clinicaltrials.gov online:
https://clinicaltrials.gov/ct2/show/results/NCT02799667?view=results#limit

However, in the latest downloads and also cloud database the fields are always empty:
SELECT nct_id, limitations_and_caveats FROM ctgov.studies WHERE nct_id='NCT02799667';

Trials Missing Conditions

Could something be wrong with the nightly snapshot job? I'm seeing that some trials are missing conditions in the PG restore as well as pipe-delineated files (https://aact.ctti-clinicaltrials.org/snapshots) Here are some example NCTIDs:

NCT03600701
NCT03604978
NCT03606967
NCT03639948
NCT03641313
NCT03646123
NCT03657043
NCT03698994
NCT03724084
NCT03738228
NCT03739814
NCT03745352
NCT03770260
NCT03772925
NCT03778957
NCT03783403
NCT03786783
NCT03798678
NCT03800134
NCT03801902
NCT03803774
NCT03813147
NCT03816319
NCT03816332
NCT03816345
NCT03816358
NCT03831698
NCT03831932
NCT03842228
NCT03854474
NCT03878095
NCT03896503
NCT03907475
NCT03914300
NCT03939897
NCT03941860
NCT03953898
NCT03965689
NCT03968406
NCT03975647
NCT03983824
NCT04022746
NCT04032704
NCT04034927
NCT04052555
NCT04068194
NCT04071223
NCT04079712
NCT04090398
NCT04092270
NCT04108858
NCT04123418
NCT04145115
NCT04157088
NCT04166487
NCT04172532
NCT04175912
NCT04181060
NCT04190550
NCT04197713
NCT04214249
NCT04216290
NCT04216316
NCT04237649
NCT04250545
NCT04252859
NCT04266301
NCT04277442
NCT04284787
NCT04294576
NCT04294628
NCT04317105
NCT04319198
NCT04323436
NCT04338685
NCT04340843
NCT04345913
NCT04384484
NCT04396860
NCT04406857
NCT04458909
NCT04458922
NCT04460937
NCT04494113
NCT04498117
NCT04499924
NCT04500548
NCT04505722
NCT04514484
NCT04514497
NCT04527328
NCT04533750
NCT04539366
NCT04541017
NCT04550481
NCT04550494
NCT04572633
NCT04576104
NCT04592913
NCT04595747
NCT04607421
NCT04609046
NCT04616534
NCT04616547
NCT04616560
NCT04620733
NCT04628767
NCT04631029
NCT04633239
NCT04649359
NCT04652960
NCT04660344
NCT04665739
NCT04671667
NCT04672460
NCT04704661
NCT04708418
NCT04739800
NCT04745832
NCT04750954
NCT04751383
NCT04767984
NCT04799275
NCT04812197
NCT04817241
NCT04840589
NCT04840602
NCT04854499
NCT04870112
NCT04897880
NCT04929028
NCT04929041
NCT04931017
NCT04936230
NCT04941287
NCT04963153
NCT04965753
NCT04971226
NCT04972097
NCT04974008
NCT04976634
NCT04981509
NCT04989283
NCT04998747
NCT04998812
NCT04998851
NCT05021263
NCT05023967
NCT05053971
NCT05059262
NCT05075993
NCT05077423
NCT05081609
NCT05090566
NCT05098210
NCT05099003
NCT05112601
NCT05121051
NCT05123534
NCT05134740
NCT05135650
NCT05136196
NCT05142241
NCT05157841
NCT05172245
NCT05183048
NCT05192889
NCT05199272
NCT05204927
NCT05211323
NCT05223036
NCT05256225
NCT05272384
NCT05276973
NCT05286801
NCT05295589
NCT05315687
NCT05327010
NCT05333458
NCT05343013
NCT05358249
NCT05372640
NCT05387915
NCT05404919

role "aact" does not exist

I'm trying to perform the db migrate with the following command:
docker-compose -f docker-compose.yml -f docker-compose.dev.yml run rake db:migrate

I should not that the docker-compose.dev.yml was modified to replace 'aact2' with 'aact'.

I get this error:
(1.9ms) ROLLBACK
rake aborted!
StandardError: An error has occurred, this and all later migrations canceled:

PG::UndefinedObject: ERROR: role "aact" does not exist
: DROP USER aact;
CREATE USER aact WITH PASSWORD 'aact';
GRANT SELECT ON ALL TABLES IN SCHEMA public TO aact;
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/connection_adapters/postgresql/database_statements.rb:155:in async_exec' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/connection_adapters/postgresql/database_statements.rb:155:in block in execute'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/connection_adapters/abstract_adapter.rb:484:in block in log' /usr/local/bundle/gems/activesupport-4.2.7.1/lib/active_support/notifications/instrumenter.rb:20:in instrument'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/connection_adapters/abstract_adapter.rb:478:in log' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/connection_adapters/postgresql/database_statements.rb:154:in execute'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:665:in block in method_missing' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:634:in block in say_with_time'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:634:in say_with_time' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:654:in method_missing'
/var/www/app/db/migrate/20160912000000_create_admin_tables.rb:68:in change' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:608:in exec_migration'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:592:in block (2 levels) in migrate' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:591:in block in migrate'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:292:in with_connection' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:590:in migrate'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:768:in migrate' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:998:in block in execute_migration_in_transaction'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:1044:in block in ddl_transaction' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/connection_adapters/abstract/database_statements.rb:213:in block in transaction'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/connection_adapters/abstract/transaction.rb:184:in within_new_transaction' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/connection_adapters/abstract/database_statements.rb:213:in transaction'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/transactions.rb:220:in transaction' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:1044:in ddl_transaction'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:997:in execute_migration_in_transaction' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:959:in block in migrate'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:955:in each' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:955:in migrate'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:823:in up' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:801:in migrate'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/tasks/database_tasks.rb:137:in migrate' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/railties/databases.rake:44:in block (2 levels) in <top (required)>'
/usr/local/bundle/gems/rake-11.2.2/exe/rake:27:in <top (required)>' ActiveRecord::StatementInvalid: PG::UndefinedObject: ERROR: role "aact" does not exist : DROP USER aact; CREATE USER aact WITH PASSWORD 'aact'; GRANT SELECT ON ALL TABLES IN SCHEMA public TO aact; /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/connection_adapters/postgresql/database_statements.rb:155:in async_exec'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/connection_adapters/postgresql/database_statements.rb:155:in block in execute' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/connection_adapters/abstract_adapter.rb:484:in block in log'
/usr/local/bundle/gems/activesupport-4.2.7.1/lib/active_support/notifications/instrumenter.rb:20:in instrument' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/connection_adapters/abstract_adapter.rb:478:in log'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/connection_adapters/postgresql/database_statements.rb:154:in execute' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:665:in block in method_missing'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:634:in block in say_with_time' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:634:in say_with_time'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:654:in method_missing' /var/www/app/db/migrate/20160912000000_create_admin_tables.rb:68:in change'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:608:in exec_migration' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:592:in block (2 levels) in migrate'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:591:in block in migrate' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:292:in with_connection'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:590:in migrate' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:768:in migrate'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:998:in block in execute_migration_in_transaction' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:1044:in block in ddl_transaction'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/connection_adapters/abstract/database_statements.rb:213:in block in transaction' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/connection_adapters/abstract/transaction.rb:184:in within_new_transaction'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/connection_adapters/abstract/database_statements.rb:213:in transaction' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/transactions.rb:220:in transaction'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:1044:in ddl_transaction' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:997:in execute_migration_in_transaction'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:959:in block in migrate' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:955:in each'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:955:in migrate' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:823:in up'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:801:in migrate' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/tasks/database_tasks.rb:137:in migrate'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/railties/databases.rake:44:in block (2 levels) in <top (required)>' /usr/local/bundle/gems/rake-11.2.2/exe/rake:27:in <top (required)>'
PG::UndefinedObject: ERROR: role "aact" does not exist
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/connection_adapters/postgresql/database_statements.rb:155:in async_exec' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/connection_adapters/postgresql/database_statements.rb:155:in block in execute'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/connection_adapters/abstract_adapter.rb:484:in block in log' /usr/local/bundle/gems/activesupport-4.2.7.1/lib/active_support/notifications/instrumenter.rb:20:in instrument'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/connection_adapters/abstract_adapter.rb:478:in log' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/connection_adapters/postgresql/database_statements.rb:154:in execute'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:665:in block in method_missing' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:634:in block in say_with_time'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:634:in say_with_time' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:654:in method_missing'
/var/www/app/db/migrate/20160912000000_create_admin_tables.rb:68:in change' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:608:in exec_migration'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:592:in block (2 levels) in migrate' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:591:in block in migrate'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:292:in with_connection' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:590:in migrate'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:768:in migrate' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:998:in block in execute_migration_in_transaction'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:1044:in block in ddl_transaction' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/connection_adapters/abstract/database_statements.rb:213:in block in transaction'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/connection_adapters/abstract/transaction.rb:184:in within_new_transaction' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/connection_adapters/abstract/database_statements.rb:213:in transaction'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/transactions.rb:220:in transaction' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:1044:in ddl_transaction'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:997:in execute_migration_in_transaction' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:959:in block in migrate'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:955:in each' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:955:in migrate'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:823:in up' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/migration.rb:801:in migrate'
/usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/tasks/database_tasks.rb:137:in migrate' /usr/local/bundle/gems/activerecord-4.2.7.1/lib/active_record/railties/databases.rake:44:in block (2 levels) in <top (required)>'
/usr/local/bundle/gems/rake-11.2.2/exe/rake:27:in `<top (required)>'
Tasks: TOP => db:migrate
(See full trace by running task with --trace)

any assistance is appreciated

[Question] Legal issues - what is the Licence of Data? Is it public domain?

Hello!

I want to use AACT database in my online course on Coursera. But I can't find appropriate information on the official site.
I put citation mark and links to official site, but I am not sure, that it is enough in this situation.

Is it absolutely public data? Or are there some limitations?
And if there are some legal issues, how can I get licence or solve this problem in other way?

Thanks in advance

How to install AACT on windows (PowerShell)

Hi, I am wondering is there any readme.md file I can follow to install ATCC on the windows? or PowerShell?

Missing trials again

Similar to issue #922 it seems that there are a bunch of trials missing in the nightly snapshots. Is the process working properly? You can see that the size of the snapshot is small ever since things stopped working on 11/9/22. On that date the snapshot 1.42 GB and now the process is generating snapshots that are various sizes that are all smaller than the snapshot on 11/9/22

Inconsistencies in documentation

The relationship between studies and eligibilities is 1-to-n in the schema diagram (https://github.com/ctti-clinicaltrials/aact/blob/master/public/documentation/aact_schema.png), however in the tables excel sheet (https://github.com/ctti-clinicaltrials/aact/blob/master/public/documentation/aact_tables.xlsx) the relationship is 1-to-1.

What is the relationship?

Snapshots not generating

The last nightly snapshot was generated on 12/22. Can someone investigate and get this process working again? Thanks!

digitalocean key an secret

Can anyone please explain the role of digitalocean space. Do I have to create space with name aact-dev?
Is it compulsory to create digitalocean space ?

pinned versions too restrictive

even swapping ruby 2.4.5 out for 2.4.0 leads to a cascade of conflicts. Can you provide a Dockerfile for this?

Is there a way to a specific to freeze a static copy of the database at specific date?

First of all, thanks so much for providing this incredible resource!

I was wondering if there is a way to a freeze a query/a static copy of the database at specific date (for reproducibility reasons)?

For example, I would like to only query the version of the database as it was on XYZ date.

Thanks!

intervention_other_names cross join issue

Intervention_other_names structure needs to be able to be joined with interventions table.

See sql below where the gemzar and abraxane are cross joined with the interventions table values. the intervention ID maybe should be in both tables? Similar issue maybe with browse_interventions - not sure why pancrealipase is in the study below? Not part of the intervention being studied by the trial. The other mesh terms are right but need to be associated with interventions and other_names also.

SELECT s.nct_id,s.last_changed_date,s.brief_title,s.overall_status
,i.intervention_type,i.name,i.description
, ion.name as "other name"
--, bi.mesh_term as "mesh term"
FROM studies s
LEFT JOIN
interventions i
ON s.nct_id = i.nct_id
LEFT JOIN intervention_other_names ion
ON s.nct_id = ion.nct_id
--LEFT JOIN browse_interventions bi
--ON s.nct_id = bi.nct_id
WHERE s.nct_id IN ('NCT00844649')

Unique Facility IDs

Would be great to produce a unique identifier for each facility that would allow users to do facility-level calculations without relying on the current unstandardized 'name' field in facilities table.

I'm currently working to tackle this for a subset of trials using a combination of geocoding and a fuzzy string search algorithm, but it incurs payment for use of the Google Geocoding API (if you geocode more than 2500 facilities per day) and requires a bit of work to tune fuzzy matching approach.

I've learned a lot, but it would be great to just have a unique ID that connects all facility names/ids that refer to the same actual facility - e.g. all the Mayo Clinics in Rochester (although thats a much easier example than most).

How to find keys for groups with variant names?

In this table:
result_groups <- dbGetQuery(con, "SELECT * FROM result_groups WHERE nct_id = 'NCT01087788')
there are similar names of titles between result types. Is it possible to get some "core" name to join between corresponding tables or find their ids? Additionally, they are not corresponding (by intuition) with "ctgov_group_code" numbers between result types. I know it's taken directly from tables, but is there any way to get id of the drug name valid for the whole study?

For example:

id  drug_core_name
1   CZP 400 mg Q4W
2   CZP 200 mg Q2W
3   Placebo

The table 'result_groups' for 'NCT01087788':


       id      nct_id ctgov_group_code      result_type                                              title
1  3896098 NCT01087788               B4         Baseline                                        Total Title
2  3896099 NCT01087788               B3         Baseline                                     CZP 400 mg Q4W
3  3896100 NCT01087788               B2         Baseline                                     CZP 200 mg Q2W
4  3896101 NCT01087788               B1         Baseline                                            Placebo
5  3896102 NCT01087788               P3 Participant Flow                                     CZP 400 mg Q4W
6  3896103 NCT01087788               P2 Participant Flow                                     CZP 200 mg Q2W
7  3896104 NCT01087788               P1 Participant Flow                                            Placebo
8  3896105 NCT01087788               O4          Outcome CZP 200 mg Q2W and CZP 400 mg Q4W (Randomized Set)
9  3896106 NCT01087788               O3          Outcome                    CZP 400 mg Q4W (Randomized Set)
10 3896107 NCT01087788               O2          Outcome                    CZP 200 mg Q2W (Randomized Set)
11 3896108 NCT01087788               O1          Outcome                           Placebo (Randomized Set)
12 3896109 NCT01087788               O4          Outcome CZP 200 mg Q2W and CZP 400 mg Q4W (Randomized Set)
13 3896110 NCT01087788               O3          Outcome                    CZP 400 mg Q4W (Randomized Set)
14 3896111 NCT01087788               O2          Outcome                    CZP 200 mg Q2W (Randomized Set)
15 3896112 NCT01087788               O1          Outcome                           Placebo (Randomized Set)
16 3896113 NCT01087788               O4          Outcome CZP 200 mg Q2W and CZP 400 mg Q4W (Randomized Set)
17 3896114 NCT01087788               O3          Outcome                    CZP 400 mg Q4W (Randomized Set)
18 3896115 NCT01087788               O2          Outcome                    CZP 200 mg Q2W (Randomized Set)
19 3896116 NCT01087788               O1          Outcome                           Placebo (Randomized Set)
20 3896117 NCT01087788               O3          Outcome                    CZP 400 mg Q4W (Randomized Set)
21 3896118 NCT01087788               O2          Outcome                    CZP 200 mg Q2W (Randomized Set)
22 3896119 NCT01087788               O1          Outcome                           Placebo (Randomized Set)
23 3896120 NCT01087788               O4          Outcome CZP 200 mg Q2W and CZP 400 mg Q4W (Randomized Set)
24 3896121 NCT01087788               O3          Outcome                    CZP 400 mg Q4W (Randomized Set)
25 3896122 NCT01087788               O2          Outcome                    CZP 200 mg Q2W (Randomized Set)
26 3896123 NCT01087788               O1          Outcome                           Placebo (Randomized Set)
27 3896124 NCT01087788               O3          Outcome                    CZP 400 mg Q4W (Randomized Set)
28 3896125 NCT01087788               O2          Outcome                    CZP 200 mg Q2W (Randomized Set)
29 3896126 NCT01087788               O1          Outcome                           Placebo (Randomized Set)
30 3896127 NCT01087788               E3   Reported Event                            All CZP 200 mg + 400 mg
31 3896128 NCT01087788               E2   Reported Event                                 All CZP 400 mg Q4W
32 3896129 NCT01087788               E1   Reported Event                                 All CZP 200 mg Q2W

For now I need to extract names with regex.

rake aborted Error

Hi I am new to this environment and I am getting the following error when I try to get the full import.

bundle exec rake import:full:run
rake aborted!
LoadError: cannot load such file -- kaminari/grape

I am not sure what I am doing wrong.

Thanks!

Create a parsed 1 to many table for "criteria"

Create a table that parses all

tagged criteria.

Often there is a header section splitting "inclusion" and "exclusion" so model probably needs this.

The format across trials isn't uniform so there may be a couple patterns that need to be recognized and addressed. Perhaps a quality control step to mark trials that don't parse reliably

snapshots not being generated

Seems like the reliability of the snapshots being generated has gone down since the beginning of April. Can you see if there are any problems? I noticed that the snapshot dump file version changed also - maybe it is related?

Missing trials

I'm seeing that some trials are missing in the PG restore as well as pipe-delineated files (https://aact.ctti-clinicaltrials.org/snapshots), like:
NCT02654990
NCT03089606
NCT02501096
NCT03568461
NCT03072238
NCT03775486
NCT03511664
NCT02967692
NCT04075396
NCT03427814
NCT02811679
NCT02598557
NCT02346526
NCT02718300
NCT05173623
NCT04816565
NCT04816578
NCT04821791
NCT04789109
NCT02052778
NCT02784288
NCT02913612
NCT01625234
NCT02907359
NCT03004976
NCT02513303
NCT03428217
NCT02858895
NCT03328507
NCT02561234
NCT03499899
NCT00002663
NCT04241315
NCT03180684
NCT02564744
NCT03660878
NCT03682770
NCT03123783
NCT03499795
NCT03321929
NCT02606305
NCT01994382
NCT02770391
NCT03340974
NCT03052127
NCT03200717
NCT02509598
NCT03185013
NCT03588884
NCT03834220
NCT03996447
NCT03626545
NCT02771626
NCT03558503
NCT04203667
NCT03200704
NCT03639116
NCT04937686
NCT05023330
NCT05112822
NCT01553461
NCT03483441
NCT03793829
NCT04063280
NCT03265249
NCT02917629
NCT03699475
NCT03352427
NCT01998035
NCT03713021
NCT02813447
NCT02477124
NCT03022578
NCT03050060
NCT03712371
NCT03430843
NCT03459846
NCT03141177
NCT03662659
NCT03690388

However, I see that the trial returns when hitting the API directly, e.g. https://www.clinicaltrials.gov/api/query/study_fields?expr=NCT02654990&fields=NCTId%2C+OverallStatus&min_rnk=1&max_rnk=&fmt=json

Is it possible to investigate why some studies are not getting into the snapshots?

db_admin

For the db_admin script did you intend for there to be a separate database for these tables or for them to be in the application database aact or whatever you want to call it? If you clone your repo and follow the instructions the load complains that the tables in db_admin don't exist. That and the issue with aact role not existing as the previous issue.

Request: add foreign keys (includes SQL to do so)

This is an incredible resource - thanks!

The database at the moment lacks a primary key on the studies table, and has no foreign keys defined which makes it more challenging to use certain tooling against (for example database visualisations). Also most of the columns are nullable, when many need not be.

I include below SQL required to convert the studies unique index into a primary key, set the referencing columns non-null, and create the missing foreign keys. I've tested this locally against the latest dump 20181111. The SQL was mostly auto-generated, so I've included the SQL I used to generate the statements to make it easy to update in future.

I would be happy to send a pull request creating the relevant migration file in ruby if this is something that's likely to be merged?

BEGIN;
SET search_path TO ctgov, public;

-- Set nct_id as the primary key on studies, using the existing unique index
ALTER TABLE studies ADD CONSTRAINT pk_studies PRIMARY KEY USING INDEX index_studies_on_nct_id;

-- Set referencing columns non-null. (Optional)
-- select 'ALTER TABLE ' || relname || ' ALTER COLUMN ' || attname || ' SET NOT NULL;' from pg_class inner join pg_attribute on (pg_class.oid = pg_attribute.attrelid) where relnamespace = (select oid from pg_namespace where nspname = 'ctgov') and relkind='r' and attname LIKE '%\_id' and not attnotnull;
ALTER TABLE browse_conditions ALTER COLUMN nct_id SET NOT NULL;
ALTER TABLE design_outcomes ALTER COLUMN nct_id SET NOT NULL;
ALTER TABLE id_information ALTER COLUMN nct_id SET NOT NULL;
ALTER TABLE interventions ALTER COLUMN nct_id SET NOT NULL;
ALTER TABLE sponsors ALTER COLUMN nct_id SET NOT NULL;
ALTER TABLE baseline_counts ALTER COLUMN nct_id SET NOT NULL;
ALTER TABLE baseline_counts ALTER COLUMN result_group_id SET NOT NULL;
ALTER TABLE baseline_measurements ALTER COLUMN nct_id SET NOT NULL;
ALTER TABLE baseline_measurements ALTER COLUMN result_group_id SET NOT NULL;
ALTER TABLE brief_summaries ALTER COLUMN nct_id SET NOT NULL;
ALTER TABLE browse_interventions ALTER COLUMN nct_id SET NOT NULL;
ALTER TABLE calculated_values ALTER COLUMN nct_id SET NOT NULL;
ALTER TABLE central_contacts ALTER COLUMN nct_id SET NOT NULL;
ALTER TABLE conditions ALTER COLUMN nct_id SET NOT NULL;
ALTER TABLE countries ALTER COLUMN nct_id SET NOT NULL;
ALTER TABLE design_group_interventions ALTER COLUMN nct_id SET NOT NULL;
ALTER TABLE design_group_interventions ALTER COLUMN design_group_id SET NOT NULL;
ALTER TABLE design_group_interventions ALTER COLUMN intervention_id SET NOT NULL;
ALTER TABLE design_groups ALTER COLUMN nct_id SET NOT NULL;
ALTER TABLE designs ALTER COLUMN nct_id SET NOT NULL;
ALTER TABLE detailed_descriptions ALTER COLUMN nct_id SET NOT NULL;
ALTER TABLE documents ALTER COLUMN nct_id SET NOT NULL;
-- ALTER TABLE documents ALTER COLUMN document_id SET NOT NULL; -- Disabled because isn't a FK reference
ALTER TABLE drop_withdrawals ALTER COLUMN nct_id SET NOT NULL;
ALTER TABLE drop_withdrawals ALTER COLUMN result_group_id SET NOT NULL;
ALTER TABLE eligibilities ALTER COLUMN nct_id SET NOT NULL;
ALTER TABLE facilities ALTER COLUMN nct_id SET NOT NULL;
ALTER TABLE facility_contacts ALTER COLUMN nct_id SET NOT NULL;
ALTER TABLE facility_contacts ALTER COLUMN facility_id SET NOT NULL;
ALTER TABLE facility_investigators ALTER COLUMN nct_id SET NOT NULL;
ALTER TABLE facility_investigators ALTER COLUMN facility_id SET NOT NULL;
ALTER TABLE intervention_other_names ALTER COLUMN nct_id SET NOT NULL;
ALTER TABLE intervention_other_names ALTER COLUMN intervention_id SET NOT NULL;
ALTER TABLE ipd_information_types ALTER COLUMN nct_id SET NOT NULL;
ALTER TABLE keywords ALTER COLUMN nct_id SET NOT NULL;
ALTER TABLE links ALTER COLUMN nct_id SET NOT NULL;
ALTER TABLE milestones ALTER COLUMN nct_id SET NOT NULL;
ALTER TABLE milestones ALTER COLUMN result_group_id SET NOT NULL;
ALTER TABLE outcome_analyses ALTER COLUMN nct_id SET NOT NULL;
ALTER TABLE outcome_analyses ALTER COLUMN outcome_id SET NOT NULL;
ALTER TABLE outcome_analysis_groups ALTER COLUMN nct_id SET NOT NULL;
ALTER TABLE outcome_analysis_groups ALTER COLUMN outcome_analysis_id SET NOT NULL;
ALTER TABLE outcome_analysis_groups ALTER COLUMN result_group_id SET NOT NULL;
ALTER TABLE outcome_counts ALTER COLUMN nct_id SET NOT NULL;
ALTER TABLE outcome_counts ALTER COLUMN outcome_id SET NOT NULL;
ALTER TABLE outcome_counts ALTER COLUMN result_group_id SET NOT NULL;
ALTER TABLE outcome_measurements ALTER COLUMN nct_id SET NOT NULL;
ALTER TABLE outcome_measurements ALTER COLUMN outcome_id SET NOT NULL;
ALTER TABLE outcome_measurements ALTER COLUMN result_group_id SET NOT NULL;
ALTER TABLE outcomes ALTER COLUMN nct_id SET NOT NULL;
ALTER TABLE overall_officials ALTER COLUMN nct_id SET NOT NULL;
ALTER TABLE participant_flows ALTER COLUMN nct_id SET NOT NULL;
ALTER TABLE pending_results ALTER COLUMN nct_id SET NOT NULL;
ALTER TABLE reported_events ALTER COLUMN nct_id SET NOT NULL;
ALTER TABLE reported_events ALTER COLUMN result_group_id SET NOT NULL;
ALTER TABLE responsible_parties ALTER COLUMN nct_id SET NOT NULL;
ALTER TABLE result_agreements ALTER COLUMN nct_id SET NOT NULL;
ALTER TABLE result_contacts ALTER COLUMN nct_id SET NOT NULL;
ALTER TABLE result_groups ALTER COLUMN nct_id SET NOT NULL;
ALTER TABLE studies ALTER COLUMN nct_id SET NOT NULL;
ALTER TABLE study_references ALTER COLUMN nct_id SET NOT NULL;


-- Create foreign key references for all relevant relations so they can be auto-discovered by various tools. ("ON DELETE CASCADE" optional, perhaps "ON DELETE RESTRICT" more appropriate?)
-- select 'ALTER TABLE ' || local.relname || ' ADD FOREIGN KEY (' || attname || ') REFERENCES ' || (case attname when 'nct_id' then 'studies' when 'result_group_id' then 'result_groups' when 'design_group_id' then 'design_groups' when 'intervention_id' then 'interventions' when 'document_id' then 'documents' when 'facility_id' then 'facilities' when 'outcome_id'  then 'outcomes' when 'outcome_analysis_id' then 'outcome_analyses' else '???' end)  || ' ON DELETE CASCADE;' from pg_class local inner join pg_attribute on (pg_attribute.attrelid = local.oid and pg_attribute.attnum >= 1) where relnamespace = (select oid from pg_namespace where nspname = 'ctgov') and relkind='r' and attname LIKE '%\_id';
ALTER TABLE browse_conditions ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE;
ALTER TABLE design_outcomes ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE;
ALTER TABLE id_information ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE;
ALTER TABLE interventions ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE;
ALTER TABLE sponsors ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE;
ALTER TABLE baseline_counts ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE;
ALTER TABLE baseline_counts ADD FOREIGN KEY (result_group_id) REFERENCES result_groups ON DELETE CASCADE;
ALTER TABLE baseline_measurements ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE;
ALTER TABLE baseline_measurements ADD FOREIGN KEY (result_group_id) REFERENCES result_groups ON DELETE CASCADE;
ALTER TABLE brief_summaries ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE;
ALTER TABLE browse_interventions ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE;
ALTER TABLE calculated_values ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE;
ALTER TABLE central_contacts ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE;
ALTER TABLE conditions ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE;
ALTER TABLE countries ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE;
ALTER TABLE design_group_interventions ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE;
ALTER TABLE design_group_interventions ADD FOREIGN KEY (design_group_id) REFERENCES design_groups ON DELETE CASCADE;
ALTER TABLE design_group_interventions ADD FOREIGN KEY (intervention_id) REFERENCES interventions ON DELETE CASCADE;
ALTER TABLE design_groups ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE;
ALTER TABLE designs ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE;
ALTER TABLE detailed_descriptions ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE;
ALTER TABLE documents ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE;
-- ALTER TABLE documents ADD FOREIGN KEY (document_id) REFERENCES documents ON DELETE CASCADE; -- Disabled because isn't a FK reference
ALTER TABLE drop_withdrawals ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE;
ALTER TABLE drop_withdrawals ADD FOREIGN KEY (result_group_id) REFERENCES result_groups ON DELETE CASCADE;
ALTER TABLE eligibilities ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE;
ALTER TABLE facilities ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE;
ALTER TABLE facility_contacts ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE;
ALTER TABLE facility_contacts ADD FOREIGN KEY (facility_id) REFERENCES facilities ON DELETE CASCADE;
ALTER TABLE facility_investigators ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE;
ALTER TABLE facility_investigators ADD FOREIGN KEY (facility_id) REFERENCES facilities ON DELETE CASCADE;
ALTER TABLE intervention_other_names ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE;
ALTER TABLE intervention_other_names ADD FOREIGN KEY (intervention_id) REFERENCES interventions ON DELETE CASCADE;
ALTER TABLE ipd_information_types ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE;
ALTER TABLE keywords ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE;
ALTER TABLE links ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE;
ALTER TABLE milestones ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE;
ALTER TABLE milestones ADD FOREIGN KEY (result_group_id) REFERENCES result_groups ON DELETE CASCADE;
ALTER TABLE outcome_analyses ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE;
ALTER TABLE outcome_analyses ADD FOREIGN KEY (outcome_id) REFERENCES outcomes ON DELETE CASCADE;
ALTER TABLE outcome_analysis_groups ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE;
ALTER TABLE outcome_analysis_groups ADD FOREIGN KEY (outcome_analysis_id) REFERENCES outcome_analyses ON DELETE CASCADE;
ALTER TABLE outcome_analysis_groups ADD FOREIGN KEY (result_group_id) REFERENCES result_groups ON DELETE CASCADE;
ALTER TABLE outcome_counts ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE;
ALTER TABLE outcome_counts ADD FOREIGN KEY (outcome_id) REFERENCES outcomes ON DELETE CASCADE;
ALTER TABLE outcome_counts ADD FOREIGN KEY (result_group_id) REFERENCES result_groups ON DELETE CASCADE;
ALTER TABLE outcome_measurements ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE;
ALTER TABLE outcome_measurements ADD FOREIGN KEY (outcome_id) REFERENCES outcomes ON DELETE CASCADE;
ALTER TABLE outcome_measurements ADD FOREIGN KEY (result_group_id) REFERENCES result_groups ON DELETE CASCADE;
ALTER TABLE outcomes ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE;
ALTER TABLE overall_officials ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE;
ALTER TABLE participant_flows ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE;
ALTER TABLE pending_results ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE;
ALTER TABLE reported_events ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE;
ALTER TABLE reported_events ADD FOREIGN KEY (result_group_id) REFERENCES result_groups ON DELETE CASCADE;
ALTER TABLE responsible_parties ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE;
ALTER TABLE result_agreements ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE;
ALTER TABLE result_contacts ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE;
ALTER TABLE result_groups ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE;
-- ALTER TABLE studies ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE; -- Disabled: shouldn't reference itself!
ALTER TABLE study_references ADD FOREIGN KEY (nct_id) REFERENCES studies ON DELETE CASCADE;

COMMIT;

Thanks again!

Recurrent mass ID change

Hi,

I am working with this DB for a while and to my surprise the IDs of some elements (like facilities or interventions) keep changing every once in a while, but there are none (or little) changes in any of the non-ID fields. I find quite annoying that the same element keeps changing IDs, it's really difficult to follow up the changes that way.

Is this behaviour expected?
What is the frequency of this mass ID change?

Thanks,

snapshot URLs all returning 404 errors

For example, https://aact.ctti-clinicaltrials.org/static/static_db_copies/monthly/20200101_clinical_trials.zip results in:

Where to find description of a clinical trial?

I think they used to be the column of brief_summaries, but now they are empty. Can you please point me to the new table if it's ever moved. Thank you!

Snapshots not generating

The last nightly snapshot was generated on 11/9/22. Can someone investigate and get this process working again? Thanks!

Restore of the static copy of the AACT database fails

Restoring the static copy of the database fails immediately with this error:

postgres@lnx$ pg_restore -e -v -O -x -d aact --no-owner /tmp/aact/postgres_data.dmp
pg_restore: connecting to database for restore
pg_restore: creating DATABASE PROPERTIES "aact"
pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 3686; 0 0 DATABASE PROPERTIES aact ctti
pg_restore: [archiver (db)] could not execute query: ERROR:  role "ctti" does not exist
    Command was: ALTER ROLE ctti IN DATABASE aact SET search_path TO 'ctgov', 'support', 'public';

update:
If I add the --clean argument as stated in the docs, I get a different error:

postgres@lnx$ pg_restore -e -v -O -x -d aact --clean --no-owner /tmp/aact/postgres_data.dmp
pg_restore: connecting to database for restore
pg_restore: dropping FK CONSTRAINT study_references study_references_nct_id_fkey
pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 3436; 2606 37207856 FK CONSTRAINT study_references study_references_nct_id_fkey ctti
pg_restore: [archiver (db)] could not execute query: ERROR:  schema "ctgov" does not exist
    Command was: ALTER TABLE ONLY ctgov.study_references DROP CONSTRAINT study_references_nct_id_fkey;

I used this static copy: https://aact.ctti-clinicaltrials.org/static/static_db_copies/monthly/20200101_clinical_trials.zip

Any idea of why it fails?

Thanks

snapshots not working starting on 12/22/2019

The snapshots haven't been generated for the past couple of days.

Thanks for all your hard work!

data missing from nightly dump

Data for some trials is missing in the nightly SQL dump. An example trial is https://clinicaltrials.gov/ct2/show/NCT02912559, which is active and recruiting, but the study (and related tables) are not in the dump.

This seems to happen for multiple trials.

Add Geoloc to facilities table

Potential solution:

Add lat/long fields to the facilities model/table
Add zip code to lat/lon table to the database/app (from https://gist.github.com/erichurst/7882666)
After completion of full/nightly load, run an update query that joins facilities to zip table that updates all blank or non-matching lat/lon values

Note, this will be lower accuracy than an address based geoloc using something like google API but to save api cost for now and until long term solution for "unique" and "persisted" facilities is implemented, this should be sufficient - a disclaimer may be necessary on the geoloc fields about accuracy to the center of the zip code vs specific address.

conditions.downcase_name not being populated

It seems that the downcase_name column isn't being populated in the conditions table of the nightly dump.

Unable to connect to server: Operation timed out

Apologies if this is a silly question - I've been out of the db game for some time and even then only had some experience.

I started connecting to the CCTI/AACT db via Postgres using pgAdmin 4.0 (on a Mac) a few months ago. Recently, I have been unable to connect. I have dropped the database and have tried to create a new Server using the instructions provided by CCTI (http://aact.ctti-clinicaltrials.org/pgadmin) to no avail.

The error I get is: "Unable to connect to server: could not receive data from server: Operation timed out".

I get a similar message when trying to connect through the Terminal.

Not sure if this is a permissions issue, or if I'm just missing something in general. Any help would be much appreciated.

Trying to install this app on ubuntu...

Having cleared a number of dependency hurdles I get this error:

Resolving dependencies...
The Gemfile's dependencies are satisfied
/usr/lib/ruby/vendor_ruby/bundler/runtime.rb:80:in `rescue in block (2 levels) in require': There was an error while trying to load the gem 'autoprefixer-rails'. (Bundler::GemRequireError)
        from /usr/lib/ruby/vendor_ruby/bundler/runtime.rb:76:in `block (2 levels) in require'
        from /usr/lib/ruby/vendor_ruby/bundler/runtime.rb:72:in `each'
        from /usr/lib/ruby/vendor_ruby/bundler/runtime.rb:72:in `block in require'
        from /usr/lib/ruby/vendor_ruby/bundler/runtime.rb:61:in `each'
        from /usr/lib/ruby/vendor_ruby/bundler/runtime.rb:61:in `require'
        from /usr/lib/ruby/vendor_ruby/bundler.rb:99:in `require'
        from /home/ubuntu/ctnet/aact/config/application.rb:12:in `<top (required)>'
        from /home/ubuntu/gems/gems/spring-1.7.1/lib/spring/application.rb:82:in `require'
        from /home/ubuntu/gems/gems/spring-1.7.1/lib/spring/application.rb:82:in `preload'
        from /home/ubuntu/gems/gems/spring-1.7.1/lib/spring/application.rb:143:in `serve'
        from /home/ubuntu/gems/gems/spring-1.7.1/lib/spring/application.rb:131:in `block in run'
        from /home/ubuntu/gems/gems/spring-1.7.1/lib/spring/application.rb:125:in `loop'
        from /home/ubuntu/gems/gems/spring-1.7.1/lib/spring/application.rb:125:in `run'
        from /home/ubuntu/gems/gems/spring-1.7.1/lib/spring/application/boot.rb:19:in `<top (required)>'
        from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
        from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
        from -e:1:in `<main>'

I'm not a ruby dev so could anyone cast any light on this please?

Fetching information of clinical trials for a specifc disease / gene

Hi Team,

I have a query pertaining to usage of aact.

I want to pull nct_id for all clinical trials based on disease (eg. any cancer) or pull clinical trials based on drug-gene targets.

Is there a specific way or some correlation to do the same?

Currently, I would need to search in ctgov.conditions for terms like cancer, malignancy, neoplasm, tumor, neoplasia, etc to collect all possible cancer studies. I want to know if there is a way to get this with one search term or maybe some specific column in the database ?

Can it be done similarly for genes ?
Eg. Pull all clinical trials where drug target is IDH gene or BRCA gene, etc

Thanks,
Namrata