kili-technology / kili-python-sdk

Simplest and fastest image and text annotation tool.

Home Page: https://kili-technology.com

License: Apache License 2.0

Python 7.20% Jupyter Notebook 92.80%
image-annotation-tool text-annotation-tool document-annotation-tool annotation-tool-online annotation-tool-offline annotation labeling labeling-tool image-labeling text-labeling

kili-python-sdk's Introduction

Kili Python SDK


SDK Reference: https://python-sdk-docs.kili-technology.com/

Kili Documentation: https://docs.kili-technology.com/docs

App: https://cloud.kili-technology.com/label/

Website: https://kili-technology.com/


What is Kili?

Kili is a platform that empowers a data-centric approach to Machine Learning through quality training data creation. It provides collaborative data annotation tools and APIs that enable quick iterations between reliable dataset building and model training. More info here.

Annotation tools examples

  • Named entity extraction and relations
  • PDF classification and bounding boxes
  • Object detection (bounding boxes)

and many more.

What is Kili Python SDK?

Kili Python SDK is the Python client for the Kili platform. It lets you query and manipulate the main entities available in Kili, such as projects, assets, labels, and API keys.

It comes with several tutorials that demonstrate how to use it in the most frequent use cases.

Requirements

  • Python >= 3.8
  • Create and copy a Kili API key
  • Add the KILI_API_KEY variable in your bash environment (or in the settings of your favorite IDE) by pasting the API key value you copied above:
export KILI_API_KEY='<your api key value here>'
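
If the variable is missing, the failure can be hard to spot from inside a notebook or an IDE. The sketch below checks for it up front. It assumes only the KILI_API_KEY name given above; read_api_key is a hypothetical helper for illustration and is not part of the Kili SDK.

```python
import os


def read_api_key(environ=None):
    """Return the Kili API key from the environment, or raise a clear error.

    Hypothetical helper for illustration; not part of the Kili SDK.
    """
    environ = os.environ if environ is None else environ
    api_key = environ.get("KILI_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError(
            "KILI_API_KEY is not set; export it or pass api_key explicitly."
        )
    return api_key
```

The returned value can be passed as the api_key argument shown in the Usage section.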

Installation

Install the Kili client with pip:

pip install kili

If you want to contribute, here are the installation steps.

Usage

Instantiate the Kili client:

from kili.client import Kili
kili = Kili()
# You can now use the Kili client!

Note that you can also pass the API key as an argument of the Kili initialization:

kili = Kili(api_key='<your api key value here>')

For more details, read the SDK reference or the Kili documentation.

Tutorials

Check out our tutorials! They will guide you through the main features of the Kili client.

You can find several other recipes in this folder.

Examples

Here is a sample of the operations you can do with the Kili client:

Creating an annotation project

json_interface = {
    "jobs": {
        "CLASSIFICATION_JOB": {
            "mlTask": "CLASSIFICATION",
            "content": {
                "categories": {
                    "RED": {"name": "Red"},
                    "BLACK": {"name": "Black"},
                    "WHITE": {"name": "White"},
                    "GREY": {"name": "Grey"}},
                "input": "radio"
            },
            "instruction": "Color"
        }
    }
}
project_id = kili.create_project(
    title="Color classification",
    description="Project ",
    input_type="IMAGE",
    json_interface=json_interface
)["id"]
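
When the category list comes from data rather than being typed by hand, the same interface can be generated. This is a minimal sketch that reproduces only the structure shown above; build_classification_interface is a hypothetical helper, not an SDK function.

```python
def build_classification_interface(job_name, instruction, categories):
    """Build a single-job interface dict shaped like the example above.

    `categories` maps category keys (e.g. "RED") to display names (e.g. "Red").
    Hypothetical helper for illustration; not part of the Kili SDK.
    """
    return {
        "jobs": {
            job_name: {
                "mlTask": "CLASSIFICATION",
                "content": {
                    "categories": {
                        key: {"name": name} for key, name in categories.items()
                    },
                    "input": "radio",
                },
                "instruction": instruction,
            }
        }
    }


json_interface = build_classification_interface(
    "CLASSIFICATION_JOB",
    "Color",
    {"RED": "Red", "BLACK": "Black", "WHITE": "White", "GREY": "Grey"},
)
```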

Importing data to annotate

assets = [
    {
        "externalId": "example 1",
        "content": "https://images.caradisiac.com/logos/3/8/6/7/253867/S0-tesla-enregistre-d-importantes-pertes-au-premier-trimestre-175948.jpg",
    },
    {
        "externalId": "example 2",
        "content": "https://img.sportauto.fr/news/2018/11/28/1533574/1920%7C1280%7Cc096243e5460db3e5e70c773.jpg",
    },
    {
        "externalId": "example 3",
        "content": "./recipes/img/man_on_a_bike.jpeg",
    },
]

external_id_array = [a.get("externalId") for a in assets]
content_array = [a.get("content") for a in assets]

kili.append_many_to_dataset(
    project_id=project_id,
    content_array=content_array,
    external_id_array=external_id_array,
)

See the detailed example in this tutorial.
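
Before uploading, it can help to check the asset list locally. The sketch below builds the two parallel arrays used above and rejects entries with missing fields or repeated external IDs. Treating repeated external IDs as an error is an assumption made here for safety, and split_assets is a hypothetical helper, not an SDK function.

```python
from collections import Counter


def split_assets(assets):
    """Return (content_array, external_id_array) after basic sanity checks.

    Hypothetical helper for illustration; not part of the Kili SDK.
    """
    missing = [a for a in assets if not a.get("externalId") or not a.get("content")]
    if missing:
        raise ValueError(f"{len(missing)} asset(s) lack 'externalId' or 'content'")
    counts = Counter(a["externalId"] for a in assets)
    duplicates = sorted(ext_id for ext_id, n in counts.items() if n > 1)
    if duplicates:
        raise ValueError(f"duplicate externalId values: {duplicates}")
    return [a["content"] for a in assets], [a["externalId"] for a in assets]
```

The two returned lists can then be passed as content_array and external_id_array to kili.append_many_to_dataset, as in the call above.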

Importing predictions

prediction_examples = [
    {
        "external_id": "example 1",
        "json_response": {
            "CLASSIFICATION_JOB": {
                "categories": [{"name": "GREY", "confidence": 46}]
            }
        },
    },
    {
        "external_id": "example 2",
        "json_response": {
            "CLASSIFICATION_JOB": {
                "categories": [{"name": "WHITE", "confidence": 89}]
            }
        },
    }
]

kili.create_predictions(
    project_id=project_id,
    external_id_array=[p["external_id"] for p in prediction_examples],
    json_response_array=[p["json_response"] for p in prediction_examples],
    model_name="My SOTA model"
)

See detailed examples in this recipe.
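
A prediction that names a job or category missing from the project interface is easy to produce by accident. The sketch below checks a prediction list against a json_interface dict before upload. It assumes the shapes used in the examples above and that confidence is a percentage between 0 and 100, which the sample values suggest but the text does not state. check_predictions is a hypothetical helper, not an SDK function.

```python
def check_predictions(predictions, json_interface):
    """Return a list of problems found; an empty list means all checks passed.

    Hypothetical helper for illustration; not part of the Kili SDK.
    """
    problems = []
    jobs = json_interface["jobs"]
    for prediction in predictions:
        ext_id = prediction["external_id"]
        for job_name, job_response in prediction["json_response"].items():
            if job_name not in jobs:
                problems.append(f"{ext_id}: unknown job {job_name!r}")
                continue
            known = jobs[job_name]["content"]["categories"]
            for category in job_response.get("categories", []):
                if category["name"] not in known:
                    problems.append(f"{ext_id}: unknown category {category['name']!r}")
                # Assumption: confidence is expressed as a percentage.
                if not 0 <= category.get("confidence", 0) <= 100:
                    problems.append(f"{ext_id}: confidence outside 0-100")
    return problems
```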

Exporting labels

kili.export_labels("your_project_id", "export.zip", "yolo_v4")

See a detailed example in this tutorial.

kili-python-sdk's People

Contributors

aloysio-kili, antoine-detailleur, baptiste-olivier, bruceatkili, crsegerie, edarchimbaud, fannygaudin, florianlega, florianleroykili, frankfacundo, fxleduc, gheorghetutunaru, github-actions[bot], hguerni, hnicolas, hugo-mailfait-kili, hugodegeorges, jonas1312, marc-delpech, marcenacp, mduval1, p-desaintchamas, paul-godhouse, pierreleveau, rcourture, renovate[bot], theodu, tmarette-kili, xavierkl1, xczuba

kili-python-sdk's Issues

append_many_to_dataset: Mutation "data" failed with error: unknown field

I'm currently trying to add some data in an NLP project via the API with,

      playground.append_many_to_dataset(project_id=project_id,
                                        content_array=['test document 1'],
                                        external_id_array=['doc1'])

this fails with,

  File "kili\mutations\asset\__init__.py", line 61, in append_many_to_dataset
    projects = playground.projects(project_id)
  File "kili\queries\project\__init__.py", line 68, in projects
    return format_result('data', result)
  File "kili\helpers.py", line 17, in format_result
    raise GraphQLError(name, result['errors'])
kili.helpers.GraphQLError: Mutation "data" failed with error: "[{'message': 'unknown field', 'path': ['variable', 'where', 'updatedAtGte'], 'extensions': {'code': 'GRAPHQL_VALIDATION_FAILED'}}]"

It turns out that just running playground.projects(project_id) produces the same error. I also noticed that providing a random project_id produces the same error. So at this point I have two hypotheses:

  • either that error is just an indication that the project_id is incorrect. I'm using the ID from the URL e.g. /label/projects/<project_id>. If it is incorrect is there another way I can find it?
  • or there is an unrelated issue with this project somewhere.

Suggestions on how to debug this would be very much appreciated. Thanks!

How can I control the annotation queue?

Hello,

I would like to control precisely the order in which assets are presented for annotation.

To do so, I used the update_properties_in_asset() function, setting both the assignee via to_be_labeled_by_array and the priority via priorities, which I set to 1 (as I understand it, the default is 0).

However, when I go to the annotation Studio, pick a verbatim from the "Explore" table and annotate it, the next document I get has a priority of 0 and is not assigned to anyone!

I pick a document from the Explore interface:

image

After annotating it and clicking Submit, this is the document I am asked to annotate:

image

I go back to the Explore view to check the priority and the person this document is assigned to:

image

The problem does not come from the status: after pre-annotation, I made sure to set all documents back to the TODO status.

Which criteria are used to choose the next verbatim to annotate? How can this criterion be changed?

Warning: labelOf { externalId } must be an instance of <enum 'Label'>

I'm trying to get the externalId of the assets associated with a label when calling https://cloud.kili-technology.com/docs/python-graphql-api/python-api/#labels

Initially I tried,

>>> labels = playground.labels(project_id=project_id,  fields=['id', 'labelOf'])

which produces,

  File "lib\site-packages\kili\queries\label\__init__.py", line 126, in labels
    return format_result('data', result)
  File "lib\site-packages\kili\helpers.py", line 17, in format_result
    raise GraphQLError(name, result['errors'])
kili.helpers.GraphQLError: Mutation "data" failed with error: "[{'message': 'Field "labelOf" of type "Asset" must have a selection of subfields. Did you mean "labelOf { ... }"?', 'locations': [{'line': 4, 'column': 47}], 'extensions': {'code': 'GRAPHQL_VALIDATION_FAILED'}}]"

So indeed if I change as suggested,

>>> labels = playground.labels(project_id=project_id,  fields=['id', 'labelOf  { externalId }'])

this works as expected, however I now get the following message on stdout

labelOf { externalId } must be an instance of <enum 'Label'>

printed here. Because it's a print and not a warning, I can't silence it either.

The returned results are correct and include the "externalId", so it's not a big issue, but it would be nice to be able to silence this message.
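
A possible workaround on the caller's side, until the message can be silenced in the library, is to capture stdout around the call. This sketch uses only the standard library. call_quietly is a hypothetical helper name, and it assumes the message is written through sys.stdout, as a plain print would do.

```python
import contextlib
import io


def call_quietly(func, *args, **kwargs):
    """Call func while capturing anything it prints to stdout.

    Returns (result, captured_text). Only output written through
    sys.stdout (for example by print) is captured.
    """
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        result = func(*args, **kwargs)
    return result, buffer.getvalue()
```

With the call from this report, that would read: labels, _ = call_quietly(playground.labels, project_id=project_id, fields=['id', 'labelOf { externalId }']).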

Extract masks instead of poly

Hello,

I was wondering if it is possible to directly extract masks instead of the poly vertices for a semantic segmentation task (i.e. each pixel belongs to one class).
Here is an example where poly is a problem. The following image shows, from left to right, annotations I made in Kili:
Base Image | Plate annotation | Class 1 Annotations | Class 2 Annotations | Class 3 Annotations | Class 4 Annotations
CleanShot 2023-10-20 at 08 51 06@2x

Now if I extract the annotations in Kili format, the Plate Annotation will be:
CleanShot 2023-10-20 at 08 47 17@2x

Is there a way to solve this?
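
One way to get masks on the client side is to rasterize the exported vertices yourself. The sketch below is pure Python and assumes each polygon is a list of {"x", "y"} dicts with coordinates normalized to [0, 1]; that layout is an assumption about the export, so check it against your own file. polygon_to_mask is a hypothetical helper, not an SDK function.

```python
def polygon_to_mask(normalized_vertices, width, height):
    """Rasterize one polygon into a height x width grid of 0/1 values.

    `normalized_vertices` is a list of {"x": ..., "y": ...} dicts with
    coordinates in [0, 1] (an assumption about the export layout).
    A pixel is set when its center lies inside the polygon (even-odd rule).
    """
    points = [(v["x"] * width, v["y"] * height) for v in normalized_vertices]
    mask = [[0] * width for _ in range(height)]
    for row in range(height):
        y = row + 0.5
        # x positions where the polygon edges cross this scanline
        crossings = []
        for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
            if (y1 <= y) != (y2 <= y):
                crossings.append(x1 + (y - y1) * (x2 - x1) / (y2 - y1))
        crossings.sort()
        # Fill between successive pairs of crossings
        for left, right in zip(crossings[0::2], crossings[1::2]):
            for col in range(width):
                if left <= col + 0.5 < right:
                    mask[row][col] = 1
    return mask
```

For a per-pixel class map, one option is to rasterize each class in turn and write its class index into a shared grid, drawing the enclosing shape (here the plate) first so that the classes inside it overwrite it.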

Dependency Dashboard

This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.

Detected dependencies

github-actions
.github/workflows/bump_commit_release_branch.yml
  • actions/checkout v4
  • actions/setup-python v5
.github/workflows/ci.yml
  • actions/checkout v4
  • actions/setup-python v5
  • pre-commit/action v3.0.1
  • actions/checkout v4
  • actions/setup-python v5
  • actions/checkout v4
  • actions/setup-python v5
  • actions/checkout v4
  • actions/setup-python v5
  • actions/checkout v4
  • lycheeverse/lychee-action v1.9.3
  • actions/checkout v4
  • actions/setup-python v5
  • actions/checkout v4
  • actions/setup-python v5
  • actions/checkout v4
  • actions/setup-python v5
.github/workflows/create_draft_release.yml
  • actions/checkout v4
  • actions/setup-python v5
  • slackapi/slack-github-action v1.25.0
.github/workflows/datadog.yml
  • actions/setup-python v5
.github/workflows/deploy_doc.yml
  • actions/checkout v4
  • actions/setup-python v5
  • slackapi/slack-github-action v1.25.0
.github/workflows/e2e_tests.yml
  • actions/checkout v4
  • actions/setup-python v5
  • slackapi/slack-github-action v1.25.0
.github/workflows/pr.yml
  • amannn/action-semantic-pull-request v5
.github/workflows/publish.yml
  • actions/checkout v4
  • actions/setup-python v5
  • slackapi/slack-github-action v1.25.0
pep621
pyproject.toml
  • pandas >= 1.0.0, < 3.0.0
  • click >= 8.0.0, < 9.0.0
  • requests >= 2.0.0, < 3.0.0
  • tabulate >= 0.9.0, < 0.10.0
  • tenacity >= 8.0.0, < 9.0.0
  • tqdm >= 4.0.0, < 5.0.0
  • typeguard >= 4, < 5
  • typing-extensions >= 4.1.0, < 5.0.0
  • pyparsing >= 3.0.0, < 4.0.0
  • websocket-client >= 1.0.0, < 2.0.0
  • pyyaml >= 6.0, < 7.0
  • Pillow >=9.0.0, <11.0.0
  • cuid >= 0.4, < 0.5
  • urllib3 >= 1.26, < 3
  • ffmpeg-python >= 0.2.0, < 0.3.0
  • gql >= 3.5.0b5, < 4.0.0
  • filelock >= 3.0.0, < 4.0.0
  • pip-system-certs >= 4.0.0, < 5.0.0
  • pyrate-limiter >= 3, < 4
  • dev/pre-commit >= 3.3.0, < 4.0.0
  • dev/pylint ==3.0.3
  • dev/pyright ==1.1.347
  • dev/vulture ==2.11
  • dev/dead ==1.5.2
  • dev/opencv-python >= 4.0.0, < 5.0.0
  • dev/azure-storage-blob >= 12.0.0, < 13.0.0
  • image-utils/opencv-python >= 4.0.0, < 5.0.0
  • azure/azure-storage-blob >= 12.0.0, < 13.0.0
pre-commit
.pre-commit-config.yaml
  • pre-commit/pre-commit-hooks v4.5.0
  • PyCQA/docformatter v1.7.5
  • asottile/pyupgrade v3.15.2
  • srstevenson/nb-clean 3.2.0
  • astral-sh/ruff-pre-commit v0.1.15

export_labels() on Databricks returns "No such file or directory"

I run the following code on Azure Databricks:

from kili.client import Kili

kili = Kili(api_key=api_key['kili-secret'])

path_to_labels = "/databricks/driver/export.zip"

kili.export_labels(project_id=project_id,filename = path_to_labels,fmt="kili",layout = "merged",single_file=True,asset_ids=list("cld3aqa68078x0jw2ds463b3k"),with_assets=False)

It returns the following error:

FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpx9xpowaa/clbgd78ic00sw0k3a0y0dcbgv/data.json'

With the traceback:

FileNotFoundError Traceback (most recent call last)
in <cell line: 2>()
1 path_to_labels = "/databricks/driver/export.zip"
----> 2 kili.export_labels(project_id=project_id,filename = path_to_labels,fmt="kili",layout = "merged",single_file=True,asset_ids=list("cld3aqa68078x0jw2ds463b3k"),with_assets=False)

/databricks/python/lib/python3.9/site-packages/kili/queries/label/__init__.py in export_labels(self, project_id, filename, fmt, asset_ids, layout, single_file, disable_tqdm, with_assets, external_ids)
310
311 try:
--> 312 services.export_labels(
313 self,
314 asset_ids=asset_ids,

/databricks/python/lib/python3.9/site-packages/kili/services/export/__init__.py in export_labels(kili, asset_ids, project_id, export_type, label_format, split_option, single_file, output_file, disable_tqdm, log_level, with_assets)
82 ) # ensures full mapping
83 exporter_class = format_exporter_selector_mapping[label_format]
---> 84 exporter_class(
85 export_params, kili, logger, disable_tqdm, content_repository
86 ).export_project()

/databricks/python/lib/python3.9/site-packages/kili/services/export/format/base.py in export_project(self)
157 local_media_dir=str(self.images_folder),
158 )
--> 159 self.process_and_save(assets, self.output_file)
160
161 @property

/databricks/python/lib/python3.9/site-packages/kili/services/export/format/kili/__init__.py in process_and_save(self, assets, output_filename)
55 """
56 clean_assets = self.process_assets(assets, self.label_format)
---> 57 return self._save_assets_export(
58 clean_assets,
59 output_filename,

/databricks/python/lib/python3.9/site-packages/kili/services/export/format/kili/__init__.py in _save_assets_export(self, assets, output_filename)
34 if self.single_file:
35 project_json = json.dumps(assets, sort_keys=True, indent=4)
---> 36 with (self.base_folder / "data.json").open("wb") as output_file:
37 output_file.write(project_json.encode("utf-8"))
38 else:

/usr/lib/python3.9/pathlib.py in open(self, mode, buffering, encoding, errors, newline)
1240 the built-in open() function does.
1241 """
-> 1242 return io.open(self, mode, buffering, encoding, errors, newline,
1243 opener=self._opener)
1244

/usr/lib/python3.9/pathlib.py in _opener(self, name, flags, mode)
1108 def _opener(self, name, flags, mode=0o666):
1109 # A stub for the opener argument to built-in open()
-> 1110 return self._accessor.open(self, flags, mode)
1111
1112 def _raw_open(self, flags, mode=0o777):

FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpx9xpowaa/clbgd78ic00sw0k3a0y0dcbgv/data.json'
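
One detail of the call in this report is worth checking independently of the error: list() applied to a string splits it into single characters, so asset_ids receives a list of one-character strings rather than one asset ID. Whether this contributes to the FileNotFoundError is not established here. The snippet only illustrates the Python behavior.

```python
asset_id = "cld3aqa68078x0jw2ds463b3k"

as_characters = list(asset_id)  # splits the string into single characters
as_single_id = [asset_id]       # a list holding the one asset ID

print(len(as_characters), as_characters[:3])
print(len(as_single_id), as_single_id)
```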

More detailed changelog for breaking changes

It would be helpful to have a more detailed changelog (ideally as a file in this repository) that would in particular mention breaking changes.

For instance, with 2.0.4 I have a script that worked fine to create a new project with,

    json_interface = {
        "filetype": "TEXT",
        "jobs": {  .... }
    }

(adapted from a documentation example).
In 2.1.1 this same script produces the following error,

kili.helpers.GraphQLError: Mutation "data" failed with error: "[{'message': '[unsupportedType] Unsupported project input type "NA". Input type should be one of: IMAGE, PDF, TEXT, VIDEO', 'path': ['data']}]"

presumably because the input type now needs to be specified via the input_type parameter of update_properties_in_project (related to #160).

This is a breaking change, and it would help if it were explicitly mentioned in the release notes, so one wouldn't have to search through the source code or the commit history to find out what needs to be changed.

Make dataset splits possible during export

I am working on a project using Kili and exporting my datasets with different types of annotations.

I think it would be useful to add a new feature that allows me to split the datasets into train/validation/test sets directly during the export process.
Currently, I have to create another script after the Kili export, and I believe that every other data scientist has to do the same.

Is this feature being developed internally, and do you think it would be a valuable addition?
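
For reference, the kind of post-export script described above can be quite small. This sketch assigns each asset to a split by hashing its external ID, so the assignment stays stable across exports. assign_split and the 80/10/10 defaults are illustrative choices, not part of the Kili SDK.

```python
import hashlib


def assign_split(external_id, train=0.8, val=0.1):
    """Map an external ID to "train", "val" or "test" deterministically.

    The ID is hashed to a number in [0, 1); the remaining share
    (1 - train - val) goes to "test".
    """
    digest = hashlib.sha256(external_id.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 16**8
    if bucket < train:
        return "train"
    if bucket < train + val:
        return "val"
    return "test"
```

Hashing rather than shuffling means that adding new assets later does not move existing ones between splits.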

Spammed by Kili github actions

Hello everyone,

Due to one of my recent PRs, I'm getting spammed by failing integration tests in CI (see image below): I get about 5 emails per day.

It seems like there is a leak in one of the pipelines.

image

Deleting a label with the SDK

I am migrating a project started with a custom annotation tool to Kili.
To do so, I use the Python SDK.
On the one hand, I need to push the pre-annotations already produced; for this I found the create_predictions() method.
On the other hand, I need to push the existing annotations; for this I use the append_labels() method.

For some verbatims, I added both a pre-annotation and the existing annotation.
I would like to delete the pre-annotation. How can I do that?

I would also like to understand how append_labels() works: is it really an append, i.e. new labels added on top of the existing ones, or an overwrite, i.e. a replacement of the existing labels?

Max. length of text documents?

Hi,
I want to use your platform to annotate text documents. But after uploading the txt files, the documents are not displayed in full. Each document seems to be cut off after about 200 lines (give or take, depending on the file). Is there a specific limit for txt files?

Thanks,
Max
