
cognite-replicator's Introduction


Cognite Python Replicator


Cognite Replicator is a Python package for replicating data across Cognite Data Fusion (CDF) projects. This package is built on top of the Cognite Python SDK. This component is Community content and not officially supported by Cognite. Bugs and changes will be fixed on a best-effort basis. Feel free to open issues and pull requests; we will review them as soon as we can.

Copyright 2023 Cognite AS

Prerequisites

In order to start using the Replicator, you need:

  • Python3 (>= 3.6)
  • Credentials for both the source and destination projects:
    • CLIENT_ID ("Client ID from Azure")
    • CLIENT_SECRET ("Client secret from Azure", only if using authentication via secret)
    • CLUSTER ("Name of CDF cluster")
  • TENANT_ID ("Tenant ID from Azure")
    • PROJECT ("Name of CDF project")

This is how you set the client secret as an environment variable on Mac OS and Linux:

$ export SOURCE_CLIENT_SECRET=<your source client secret>
$ export DEST_CLIENT_SECRET=<your destination client secret>

Installation

The replicator is available on PyPI and can also be run as a Docker container.

To install it from the command line, run:

pip install cognite-replicator

Alternatively, build and run it as a Docker container. The image is also available on Docker Hub:

docker build -t cognite-replicator .

Usage

1. Run with a configuration file as a standalone script

Create a configuration file based on config/default.yml and update the values to match your environment. If no file is specified, the replicator will use config/default.yml.

via Python

python -m cognite.replicator config/filepath.yml

or alternatively via Docker. If you do not have access to a browser, use client secret authentication:

docker run -e SOURCE_CLIENT_SECRET -e DEST_CLIENT_SECRET -v /absolute/path/to/config/config.yml:/config.yml cognite-replicator /config.yml

2. Set up as a Python library

2.1 Without configuration file and interactive login

This will copy everything from the source project to the destination project, using your own credentials to run the code. You need the right permissions to read from the source project and write to the destination project.

import os
import yaml
from cognite.client.credentials import OAuthInteractive
from cognite.client import CogniteClient, ClientConfig
from cognite.replicator import assets, events, files, time_series, datapoints, sequences, sequence_rows

# SOURCE
SOURCE_TENANT_ID = "48d5043c-cf70-4c49-881c-c638f5796997"
SOURCE_CLIENT_ID = "1b90ede3-271e-401b-81a0-a4d52bea3273"
SOURCE_PROJECT = "publicdata"
SOURCE_CLUSTER = "api"

# DESTINATION
DEST_TENANT_ID = "d4febcbc-db24-4823-bffd-92fd05b9c6bc"
DEST_CLIENT_ID = "189e8b95-f1ce-47d2-aa66-4c2fe3567f91"
DEST_PROJECT = "sa-team"
DEST_CLUSTER = "bluefield"

### Autogenerated variables
SOURCE_SCOPES = [f"https://{SOURCE_CLUSTER}.cognitedata.com/.default"]
SOURCE_BASE_URL = f"https://{SOURCE_CLUSTER}.cognitedata.com"
SOURCE_AUTHORITY_URL = f"https://login.microsoftonline.com/{SOURCE_TENANT_ID}"
DEST_SCOPES = [f"https://{DEST_CLUSTER}.cognitedata.com/.default"]
DEST_BASE_URL = f"https://{DEST_CLUSTER}.cognitedata.com"
DEST_AUTHORITY_URL = f"https://login.microsoftonline.com/{DEST_TENANT_ID}"

# Config
BATCH_SIZE = 10000  # this is the max size of a batch to be posted
NUM_THREADS = 10  # this is the max number of threads to be used
TIMEOUT = 90
PORT = 53000

SOURCE_CLIENT = CogniteClient(
    ClientConfig(
        credentials=OAuthInteractive(
            authority_url=SOURCE_AUTHORITY_URL,
            client_id=SOURCE_CLIENT_ID,
            scopes=SOURCE_SCOPES,
        ),
        project=SOURCE_PROJECT,
        base_url=SOURCE_BASE_URL,
        client_name="cognite-replicator-source",
    )
)
DEST_CLIENT = CogniteClient(
    ClientConfig(
        credentials=OAuthInteractive(
            authority_url=DEST_AUTHORITY_URL,
            client_id=DEST_CLIENT_ID,
            scopes=DEST_SCOPES,
        ),
        project=DEST_PROJECT,
        base_url=DEST_BASE_URL,
        client_name="cognite-replicator-destination",
    )
)

if __name__ == "__main__":  # required because the replicator uses threading

    #### Uncomment the resources you would like to copy
    assets.replicate(SOURCE_CLIENT, DEST_CLIENT)
    #events.replicate(SOURCE_CLIENT, DEST_CLIENT, BATCH_SIZE, NUM_THREADS)
    #files.replicate(SOURCE_CLIENT, DEST_CLIENT, BATCH_SIZE, NUM_THREADS)
    #time_series.replicate(SOURCE_CLIENT, DEST_CLIENT, BATCH_SIZE, NUM_THREADS)
    #datapoints.replicate(SOURCE_CLIENT, DEST_CLIENT)
    #sequences.replicate(SOURCE_CLIENT, DEST_CLIENT, BATCH_SIZE, NUM_THREADS)
    #sequence_rows.replicate(SOURCE_CLIENT, DEST_CLIENT, BATCH_SIZE, NUM_THREADS)

2.2 Without configuration file and with client credentials authentication

This will copy everything from the source project to the destination project, authenticating with client credentials. The service principal needs the right permissions to read from the source project and write to the destination project. In the example below, the secrets are stored as environment variables.

import os
from cognite.client.credentials import OAuthClientCredentials
from cognite.client import CogniteClient, ClientConfig
from cognite.replicator import assets, events, files, time_series, datapoints, sequences, sequence_rows

# SOURCE
SOURCE_TENANT_ID = "48d5043c-cf70-4c49-881c-c638f5796997"
SOURCE_CLIENT_ID = "1b90ede3-271e-401b-81a0-a4d52bea3273"
SOURCE_CLIENT_SECRET = os.environ.get("SOURCE_CLIENT_SECRET")
SOURCE_PROJECT = "publicdata"
SOURCE_CLUSTER = "api"

# DESTINATION
DEST_TENANT_ID = "d4febcbc-db24-4823-bffd-92fd05b9c6bc"
DEST_CLIENT_ID = "189e8b95-f1ce-47d2-aa66-4c2fe3567f91"
DEST_CLIENT_SECRET = os.environ.get("DEST_CLIENT_SECRET")
DEST_PROJECT = "sa-team"
DEST_CLUSTER = "bluefield"
### Autogenerated variables
SOURCE_SCOPES = [f"https://{SOURCE_CLUSTER}.cognitedata.com/.default"]
SOURCE_BASE_URL = f"https://{SOURCE_CLUSTER}.cognitedata.com"
SOURCE_TOKEN_URL = f"https://login.microsoftonline.com/{SOURCE_TENANT_ID}/oauth2/v2.0/token"
DEST_SCOPES = [f"https://{DEST_CLUSTER}.cognitedata.com/.default"]
DEST_BASE_URL = f"https://{DEST_CLUSTER}.cognitedata.com"
DEST_TOKEN_URL = f"https://login.microsoftonline.com/{DEST_TENANT_ID}/oauth2/v2.0/token"
# Config
BATCH_SIZE = 10000  # this is the max size of a batch to be posted
NUM_THREADS = 10  # this is the max number of threads to be used
TIMEOUT = 90
PORT = 53000

SOURCE_CLIENT = CogniteClient(
    ClientConfig(
        credentials=OAuthClientCredentials(
            token_url=SOURCE_TOKEN_URL,
            client_id=SOURCE_CLIENT_ID,
            scopes=SOURCE_SCOPES,
            client_secret=SOURCE_CLIENT_SECRET,
        ),
        project=SOURCE_PROJECT,
        base_url=SOURCE_BASE_URL,
        client_name="cognite-replicator-source",
    )
)

DEST_CLIENT = CogniteClient(
    ClientConfig(
        credentials=OAuthClientCredentials(
            token_url=DEST_TOKEN_URL,
            client_id=DEST_CLIENT_ID,
            scopes=DEST_SCOPES,
            client_secret=DEST_CLIENT_SECRET,
        ),
        project=DEST_PROJECT,
        base_url=DEST_BASE_URL,
        client_name="cognite-replicator-destination",
    )
)

if __name__ == "__main__":  # required because the replicator uses threading

    #### Uncomment the resources you would like to copy
    assets.replicate(SOURCE_CLIENT, DEST_CLIENT)
    #events.replicate(SOURCE_CLIENT, DEST_CLIENT, BATCH_SIZE, NUM_THREADS)
    #files.replicate(SOURCE_CLIENT, DEST_CLIENT, BATCH_SIZE, NUM_THREADS)
    #time_series.replicate(SOURCE_CLIENT, DEST_CLIENT, BATCH_SIZE, NUM_THREADS)
    #datapoints.replicate(SOURCE_CLIENT, DEST_CLIENT)
    #sequences.replicate(SOURCE_CLIENT, DEST_CLIENT, BATCH_SIZE, NUM_THREADS)
    #sequence_rows.replicate(SOURCE_CLIENT, DEST_CLIENT, BATCH_SIZE, NUM_THREADS)

2.3 Alternative: pass selected configuration values as variables

Refer to the default configuration file or the example configuration file for all available keys. Start with the client creation from either step 2.1 or 2.2.

if __name__ == "__main__":  # required because the replicator uses threading
    config = {
        "timeseries_external_ids": ["pi:160670", "pi:160623"],
        "datapoints_start": "100d-ago",
        "datapoints_end": "now",
    }
    time_series.replicate(
        client_src=SOURCE_CLIENT,
        client_dst=DEST_CLIENT,
        batch_size=BATCH_SIZE,
        num_threads=NUM_THREADS,
        config=config,
    )
    datapoints.replicate(
        client_src=SOURCE_CLIENT,
        client_dst=DEST_CLIENT,
        external_ids=config.get("timeseries_external_ids"),
        start=config.get("datapoints_start"),
        end=config.get("datapoints_end"),
    )

3. With configuration file

The replicator will use the configuration file to determine what is copied. In this case there is no need to create the clients; they are created from what is in the configuration file.

import os
from cognite.replicator.__main__ import main

if __name__ == "__main__":  # required because the replicator uses threading
    # The environment variable holds the *path* to the file; do not parse it with yaml
    os.environ["COGNITE_CONFIG_FILE"] = "config/config.yml"
    main()

4. Local testing

This runs a local checkout of the replicator. It uses the configuration file to determine what is copied; there is no need to create the clients, as they are created from what is in the configuration file.

import os
import sys

# Path of the local version of the replicator. Importing from outside the current
# working directory requires sys.path, the list of directories Python searches for modules.
sys.path.append("cognite-replicator")
from cognite.replicator.__main__ import main

if __name__ == "__main__":  # required because the replicator uses threading
    # The environment variable holds the *path* to the file; do not parse it with yaml
    os.environ["COGNITE_CONFIG_FILE"] = "config/config.yml"
    main()
    sys.path.remove("cognite-replicator")  # Python keeps searching appended paths; remove when done

Development

Bump the version number in the relevant files (e.g. pyproject.toml) before releasing.

Changelog

Wondering about upcoming or previous changes? Take a look at the CHANGELOG.

Contributing

Want to contribute? Check out CONTRIBUTING.

cognite-replicator's People

Contributors

1991sig, cognite-bulldozer[bot], csabaalmasi, eventh, gaetan-h, janne123456789, jo-cognite, jorge-sanchez-2020, keepfloyding, krystele-uy, lint-action, magneei, maudeburkhalter, maur1, muradsater, nodegard, pcperera, pierrepernot, polyx, psalaberria002, renovate[bot], sanderland, sceniclife, sofiehaug-cognite, thomas-schoyen-cognite, torbjornopheim, tristan-cognite, vvemel


cognite-replicator's Issues

Dependency Dashboard

This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.

Rate-Limited

These updates are currently rate-limited. Click on a checkbox below to force their creation now.

  • Update dependency cognite-sdk to v7.46.1
  • Update codecov/codecov-action action to v4
  • Update dependency protobuf to v5
  • Update dependency pytest to v8
  • Update dependency pytest-cov to v5
  • Update dependency pytz to v2024
  • Update dependency sphinx to v7
  • Update dependency sphinx-rtd-theme to v2
  • Update dependency tox to v4
  • Update dependency twine to v5
  • Update docker/build-push-action action to v5
  • Update docker/login-action action to v3
  • Update docker/setup-buildx-action action to v3
  • 🔐 Create all rate-limited PRs at once 🔐

Edited/Blocked

These updates have been manually edited so Renovate will no longer make changes. To discard all commits and start over, click on a checkbox.

Open

These updates have all been created already. Click a checkbox below to force a retry/rebase of any.

Detected dependencies

dockerfile
Dockerfile
github-actions
.github/workflows/cd.yml
  • actions/checkout v3
  • docker/setup-buildx-action v2
  • docker/login-action v2
  • docker/build-push-action v3
.github/workflows/ci.yml
  • actions/checkout v3
  • actions/setup-python v4
  • cognitedata/lint-action v1.6.0
  • actions/checkout v3
  • actions/setup-python v4
  • codecov/codecov-action v3
.github/workflows/python-publish.yml
  • actions/checkout v3
  • actions/setup-python v4
pep621
pyproject.toml
  • poetry >=0.12
poetry
pyproject.toml
  • cognite-sdk ^7.13.8
  • google-cloud-logging ^1.12
  • python ^3.11
  • pyyaml ^6.0.1
  • protobuf ^4.0.0
  • black ^22.8
  • isort ^4.3
  • pre-commit ^1.18
  • pytest ^6.2.5
  • pytest-cov ^2.7.1
  • pytest-mock ^1.11.2
  • sphinx ^2.4.4
  • sphinx-rtd-theme ^0.4.3
  • toml ^0.10.0
  • tox ^3.14
  • tox-pyenv ^1.1
  • twine ^3.1.1
  • pytz *

  • Check this box to trigger a request for Renovate to run again on this repository

Should test against supported python versions

Currently tests are only run against Py3.7. In the readme you say you support 3.6, so you should run your tests against that version as well. In the SDK we solve this using tox, so you can take a look there.

Replication fails if timeseries is not found

The replication fails if one of the timeseries listed in the yaml file is not found in the source tenant

Replicate the issue:

  • Add a time series that does not exist in the yml config file
  • Run replication

Output:

2020-01-30 14:01:35,281 cognite-sdk DEBUG - HTTP Error 400 POST https://api.cognitedata.com/api/v1/projects/akerbp/timeseries/byids: timeseries ids not found: (id: null | externalId: VALI_23-PT-92532:X.Value)
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/cognite/client/utils/_concurrency.py", line 127, in execute_tasks_concurrently
    res = f.result()
  File "/usr/local/Cellar/python/3.7.6_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/concurrent/futures/_base.py", line 435, in result
    return self.__get_result()
  File "/usr/local/Cellar/python/3.7.6_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "/usr/local/Cellar/python/3.7.6_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.7/site-packages/cognite/client/_api_client.py", line 111, in _post
    "POST", url_path, json=json, headers=headers, params=params, timeout=self._config.timeout
  File "/usr/local/lib/python3.7/site-packages/cognite/client/_api_client.py", line 139, in _do_request
    self._raise_API_error(res, payload=json_payload)
  File "/usr/local/lib/python3.7/site-packages/cognite/client/_api_client.py", line 650, in _raise_API_error
    raise CogniteAPIError(msg, code, x_request_id, missing=missing, duplicated=duplicated, extra=extra)
cognite.client.exceptions.CogniteAPIError: timeseries ids not found: (id: null | externalId: VALI_23-PT-92532:X.Value) | code: 400 | X-Request-ID: ba8b983a-ab3c-92c8-927b-b507785bd232
Missing: [{'externalId': 'VALI_23-PT-92532:X.Value'}]

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/Cellar/python/3.7.6_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/local/Cellar/python/3.7.6_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.7/site-packages/cognite/replicator/__main__.py", line 177, in <module>
    main()
  File "/usr/local/lib/python3.7/site-packages/cognite/replicator/__main__.py", line 146, in main
    exclude_pattern=config.get("timeseries_exclude_pattern"),
  File "/usr/local/lib/python3.7/site-packages/cognite/replicator/time_series.py", line 161, in replicate
    ts_src = client_src.time_series.retrieve_multiple(external_ids=target_external_ids)
  File "/usr/local/lib/python3.7/site-packages/cognite/client/_api/time_series.py", line 142, in retrieve_multiple
    ids=ids, external_ids=external_ids, ignore_unknown_ids=ignore_unknown_ids, wrap_ids=True
  File "/usr/local/lib/python3.7/site-packages/cognite/client/_api_client.py", line 259, in _retrieve_multiple
    utils._concurrency.collect_exc_info_and_raise(tasks_summary.exceptions)
  File "/usr/local/lib/python3.7/site-packages/cognite/client/utils/_concurrency.py", line 102, in collect_exc_info_and_raise
    ) from missing_exc
cognite.client.exceptions.CogniteNotFoundError: Not found: [{'externalId': 'VALI_23-PT-92532:X.Value'}]
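Until the replicator handles this itself, a possible workaround is to drop external IDs that do not exist in the source project before replicating. A minimal sketch using the SDK's `ignore_unknown_ids` flag (the helper function is ours, not part of the replicator):

```python
def filter_existing_external_ids(client, external_ids):
    """Return only the external IDs that exist as time series in the project."""
    found = client.time_series.retrieve_multiple(
        external_ids=external_ids,
        ignore_unknown_ids=True,  # skip missing IDs instead of raising
    )
    return [ts.external_id for ts in found]
```

The filtered list can then be passed as timeseries_external_ids, so retrieve_multiple never raises CogniteNotFoundError.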

TypeError: copy_events() takes 6 positional arguments but 7 were given

Attempting to copy events will error with the following error:

File "/Users/viet/.pyenv/versions/3.8.3/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
TypeError: copy_events() takes 6 positional arguments but 7 were given

Environment:
[tool.poetry.dependencies]
python = "^3.8"
cognite-replicator = "^0.8.1"

Add support for relationships

Relationships are currently not in the scope of the replicator. Some use cases have relationships, and it would be great to have the ability to replicate them as well.

Replication fails for timeseries that already exists

This may not be a bug, but we need to see what we can do about it.

Replication fails for time series that already existed in the tenant.

For example, time series A already exists in the tenant, but does not have the metadata fields
_replicatedTime
_replicatedSource
_replicatedInternalId

If we now set up replication for time series A, the replication fails because of duplicates:

Duplicated: [{'legacyName': 'xxxxxxxxx-LDB_P'}, {'legacyName': 'xxxxxxxX.Value'}, {'legacyName': 'OilSample_xxxxB_K'}

Replicating Events without Assets causes Exception

Cause: replication.py:get_asset_ids may return an empty list
Possible solutions:

  • Filter out events without asset ids
  • Create event, with assetIds = None
  • Replicate required assets
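The first proposed fix could be sketched as follows (a hedged illustration; `asset_ids` is the SDK attribute on Event objects, but the helper itself is hypothetical):

```python
def drop_events_without_assets(events):
    """Keep only events that still reference at least one asset id."""
    # An empty list or None would otherwise be sent to the API and cause the failure
    return [event for event in events if getattr(event, "asset_ids", None)]
```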

Screenshot 2019-09-18 at 14 53 21

Sample API interaction:
With empty assetIds:
Screenshot 2019-09-18 at 14 56 24

With no assetIds key:
Screenshot 2019-09-18 at 14 58 19

Unnecessary batching of create time series/events?

I see that time series/events are split into batches of 10,000 and posted in parallel. The SDK already does this, so it shouldn't be necessary to do it here as well.

Also, the batch_size passed to replicate() is not respected. A dynamic batch size is calculated based on the number of threads allocated.

Continuous deployment pipeline

  • Publish package to PyPI
  • Publish docker image to docker hub
  • Publish code coverage
  • Publish documentation on readthedocs

pip install fails

pip install cognite-replicator
Collecting cognite-replicator
  Downloading cognite_replicator-1.2.6-py3-none-any.whl (45 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 45.9/45.9 kB 1.2 MB/s eta 0:00:00
Collecting cognite-sdk<6.0.0,>=5.4.4 (from cognite-replicator)
  Downloading cognite_sdk-5.12.0-py3-none-any.whl (291 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 291.7/291.7 kB 4.3 MB/s eta 0:00:00
Collecting google-cloud-logging<2.0,>=1.12 (from cognite-replicator)
  Downloading google_cloud_logging-1.15.3-py2.py3-none-any.whl (141 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 141.6/141.6 kB 5.8 MB/s eta 0:00:00
Requirement already satisfied: protobuf<5.0.0,>=4.0.0 in /Users/[email protected]/Library/Caches/pypoetry/virtualenvs/file-extractor-function-bVgch4Fw-py3.11/lib/python3.11/site-packages (from cognite-replicator) (4.24.3)
Collecting pyyaml<6.0.0,>=5.1.0 (from cognite-replicator)
  Downloading PyYAML-5.4.1.tar.gz (175 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 175.1/175.1 kB 4.6 MB/s eta 0:00:00
  Installing build dependencies ... done
  Getting requirements to build wheel ... error
  error: subprocess-exited-with-error

  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> [68 lines of output]
      /private/var/folders/f_/n9ywg_t948gg6y4_j7dm3n400000gn/T/pip-build-env-eijdcqyg/overlay/lib/python3.11/site-packages/setuptools/config/setupcfg.py:293: _DeprecatedConfig: Deprecated config in `setup.cfg`

Prepare repo for open source

Tasks needed to complete before we are ready to open source the repo.

  • Add CHANGELOG.md
  • Add CONTRIBUTING.md
  • Add Code of Conduct
  • Prepare to publish to PyPI in pyproject.toml
  • Setup githooks for black and isort
  • Setup CI pipeline for running unittests (or run them on githooks?)
  • Create CLI commands for running the replication
  • Fix logging

Bug when no datapoint start date

Datapoints replication fails when there is no start date and no datapoints already in the time series.
The code currently takes the time of the latest datapoint of the time series, if there is one; otherwise it takes the datapoints start parameter.
If both are missing, replication fails.
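One way to guard against this could be a fallback chain, sketched here under the assumption that timestamps are epoch milliseconds (the function and its default are illustrative, not the replicator's actual code):

```python
def resolve_replication_start(latest_dest_timestamp, configured_start, default_start=0):
    """Choose where datapoint replication should begin."""
    if latest_dest_timestamp is not None:
        return latest_dest_timestamp + 1  # resume just after the last copied point
    if configured_start is not None:
        return configured_start           # honour datapoints_start from the config
    return default_start                  # fall back to epoch 0 (1970-01-01) in ms
```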

Add Support For Mapping of Annotations

Annotations are stored in Cognite as events of type cognite_annotation. We have seen id values in the metadata fields CDF_ANNOTATION_resource_id and CDF_ANNOTATION_file_id. We depend on these ids being correct to show the list of files applicable to an asset or other entities. When these events are replicated to the target project, the ids are not updated to the target ids.

Please add support for contextualization annotations in the replicator. Thanks!

Datapoints replication produces Connection Error

Dear SA,

Please see the screenshot for the issue. Datapoints get replicated in a small handful of timeseries (my estimate is 30 out of 300 roughly). I am trying to replicate datapoints from publicdata to my personal tenant in greenfield. Replication of assets and timeseries was successful, with proper linking between ts and assets.

What I have tried: playing with batch_size. It produces the same warning/error.

This could be a wider problem with our API endpoints.

Screen Shot 2020-07-30 at 10 46 51 AM

Dataset Awareness

We have discovered that when using Cognite Replicator, our source project has data set ids set, but in the target project all data set ids are null. Please add support for replicating data sets and setting data set ids on all entities in the replicator.

The workaround we will be operating on in the meantime is to run a script afterward that copies the data sets and sets the data set ids.
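That workaround script could look roughly like this (a sketch, not the replicator's code; `dataset_id_map`, mapping source data set ids to destination ones, is assumed to be built separately):

```python
def apply_dataset_ids(src_resources, dst_resources, dataset_id_map):
    """Copy data set ids from source resources onto their replicated twins, keyed by external_id."""
    dst_by_xid = {r.external_id: r for r in dst_resources}
    updated = []
    for src in src_resources:
        dst = dst_by_xid.get(src.external_id)
        if dst is not None and src.data_set_id in dataset_id_map:
            dst.data_set_id = dataset_id_map[src.data_set_id]
            updated.append(dst)
    return updated  # these would then be passed to e.g. client.assets.update(...)
```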

datapoints.replicate fails to replicate datapoints from certain time series

I have tried to use the datapoints.replicate function as shown below:

datapoints.replicate(
    source_client,
    target_client,
    external_ids=["NO1_day_ahead_price_2022-07-13T08:58:28"],
    start=datetime.datetime(2000,1,1,1).timestamp()*1000,
    end=datetime.datetime(2040,1,1,1).timestamp()*1000,
)

Before running the function I ensured that a time series with the given external id existed in both the source and target projects. However, after running the replication, none of the datapoints from the time series in the source project had been replicated to the target project. Moreover, datapoints.replicate() did not return any error and simply finished as normal. What could be causing this issue?

Note: I have only experienced the missing datapoint replication with a subset of time series. The only link I can find between these time series is that they all have "Is Step" set to "True", and that they have datapoints with timestamps in the future.
