Giter VIP home page Giter VIP logo

meddoplace_scoring_script's Introduction

MEDDOPLACE SCORER

This repository contains the official scorer for the MEDDOPLACE Shared Task. MEDDOPLACE is a shared task/challenge and set of resources for the detection of locations, clinical departments, and related types of information such as nationalities or patient movements, in medical documents in Spanish. For more information about the task, data, evaluation metrics, ... please visit the task's website.

Requirements

To use this scorer, you'll need to have Python 3 installed in your computer. Clone this repository, create a new virtual environment and then install the required packages:

git clone https://github.com/TeMU-BSC/meddoplace_scoring_script
cd meddoplace_scoring_script
python3 -m venv venv/
source venv/bin/activate
pip install -r requirements.txt

For Task 2.1 (Toponym Resolution using GeoNames), you will need the GeoNames allCountries.txt gazetteer that can be downloaded from here.

The MEDDOPLACE task data is available on Zenodo. Keep in mind that the reference test set won't be uploaded until the task has finished.

Usage Instructions

This program compares two .TSV files, with one being the reference file (i.e. Gold Standard data provided by the task organizers) and the other being the predictions or results file (i.e. the output of your system). Your .TSV file needs to have the same structure as the reference file, which is explained in the MEDDOPLACE Task Guide.

For every submission you want to evaluate, you will need to create two folders: ref/ and res/. Place the reference file in the former and the predictions file in the latter, as in the example below:

meddoplace_submission/
+-- submission_1/
|   +-- ref/
|       +-- meddoplace_test_complete.tsv
|   +-- res/
|       +-- teambsc_submission_1.tsv
+-- submission_2/
|   +-- ref/
|       +-- meddoplace_test_complete.tsv
|   +-- res/
|       +-- teambsc_submission_2.tsv     

Once your file structure is ready, you can run the program like this:

python meddoplace_normalization.py -i ./folder_with_ref_res -o ./folder_to_output_scores -d <dir/to/allCountries.txt> -p GN

The output will be a scores.txt file, similar to the one outputted by CodaLab, with your evaluation results.

These are the possible arguments:

  • -i: input folder where you've created your ref/ and res/ folders.
  • -o: output folder where you want to save the scores.txt file.
  • -d: path to the GeoNames gazetteer (only needed for task 2.1).
  • -p: used for the subtask, it may be GN, PC, SCTID, task1, task3 or all. Use GN for task 2.1, PC for task 2.2, SCTID for task 2.3 and all for task 4.

For the normalization task, keep in mind that:

  • The predictions file must include a column with the code source (GN, PC or SCTID) or else the program will fail.
  • If any of your codes are not valid, the program will print a warning and count it as a False Negative.
  • The scoring for task 2.1 will take longer than the others as the GeoNames gazetteer is quite big.

Contact

If you have any questions or suggestions, please contact <salvador.limalopez [at] bsc.es>.

meddoplace_scoring_script's People

Contributors

darrylestrada97 avatar s-lilo avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.