eclipse-edc / technology-azure

License: Apache License 2.0
Currently, the azure-eventgrid module does not integrate with the event framework by registering an EventSubscriber. To spread the use of the event framework, such a subscriber should be added.
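The registration pattern this asks for can be sketched as follows. The types below are simplified, hypothetical stand-ins, not the actual EDC event SPI (the real router and subscriber interfaces differ in their signatures); the sketch only shows where an Event Grid forwarder would hook in.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

public class EventGridSubscriberSketch {
    // Simplified stand-ins for the EDC event SPI types (hypothetical).
    interface Event {}
    static class AssetCreated implements Event {}

    static class EventRouter {
        private final Map<Class<?>, List<Consumer<Event>>> subscribers = new HashMap<>();

        // A subscriber registers for a concrete event type.
        <E extends Event> void register(Class<E> type, Consumer<Event> subscriber) {
            subscribers.computeIfAbsent(type, k -> new ArrayList<>()).add(subscriber);
        }

        // Every published event is dispatched to the subscribers of its type.
        void publish(Event event) {
            subscribers.getOrDefault(event.getClass(), List.of())
                       .forEach(s -> s.accept(event));
        }
    }

    static List<String> demo() {
        EventRouter router = new EventRouter();
        List<String> forwarded = new ArrayList<>();
        // The azure-eventgrid module would register a subscriber here that
        // forwards each EDC event to an Event Grid topic endpoint.
        router.register(AssetCreated.class, e -> forwarded.add(e.getClass().getSimpleName()));
        router.publish(new AssetCreated());
        return forwarded;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [AssetCreated]
    }
}
```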
The connector currently has the env var EDC_BLOBSTORE_ENDPOINT_TEMPLATE with the default value https://%s.blob.core.windows.net, where the placeholder %s gets replaced at runtime with the Azure account name.
Currently Azure Storage has a single domain (unlike S3), so this works. But in the future, if Azure supports multiple domains, a transfer from one domain to another will not work.
Let's say we want to test an Azure-to-Azure transfer between two participants, each of whom deploys their own Azurite locally. The deployments will have different domains, such as alice-azurite:10000 and bob-azurite:10000. Hence the single template doesn't work, which forces us to share one Azurite deployment between both participants.
Just like Technology-Aws, Technology-Azure should also support an endpoint_override option in the transfer_process API.
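A sketch of the requested behavior, assuming a hypothetical endpoint_override property on the data address (the property name is illustrative, not an actual API):

```java
import java.util.Map;

public class EndpointResolver {
    static final String TEMPLATE = "https://%s.blob.core.windows.net";

    // If the (hypothetical) endpoint_override property is present, use it
    // verbatim; otherwise fall back to the account-name template.
    static String resolve(Map<String, String> dataAddress, String accountName) {
        String override = dataAddress.get("endpoint_override");
        return override != null ? override : String.format(TEMPLATE, accountName);
    }

    public static void main(String[] args) {
        // An Azurite participant can now point at its own domain.
        System.out.println(resolve(Map.of("endpoint_override", "http://alice-azurite:10000"), "alice"));
        System.out.println(resolve(Map.of(), "alice"));
    }
}
```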
When the transfer request needs to handle several files, the extension does not use the sharedKey to list all the files under a specific prefix, even when that key exists. The method expects the credentials to be stored under a hard-coded key pattern that might not correspond to the value in keyName.

All authentication strategies need to be consistent; this implies that listContainerFolder should make use of the sharedKey if it exists, instead of trying to resolve the key again.

The behavior was reported in a test added by this tractusx-edc PR.
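The consistency asked for here amounts to a credential fallback; a minimal sketch, with a map standing in for the data address properties and a function standing in for the vault lookup (all names illustrative):

```java
import java.util.Map;
import java.util.function.Function;

public class CredentialFallback {
    // Prefer the sharedKey carried by the data address; only if it is absent,
    // resolve the secret from the vault under the keyName the address
    // declares, instead of a hard-coded key pattern.
    static String resolveCredential(Map<String, String> properties,
                                    Function<String, String> vault) {
        String sharedKey = properties.get("sharedKey");
        return sharedKey != null ? sharedKey : vault.apply(properties.get("keyName"));
    }

    public static void main(String[] args) {
        Function<String, String> vault = name -> "secret-for-" + name;
        System.out.println(resolveCredential(Map.of("sharedKey", "abc"), vault));     // prints abc
        System.out.println(resolveCredential(Map.of("keyName", "acct1-key"), vault)); // prints secret-for-acct1-key
    }
}
```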
Improve the AzureVault#sanitize method so that it replaces all disallowed characters with allowed ones. Please check the official documentation for the AzureVault secret naming rules.

Users could attempt to store secrets with disallowed characters, so this feature would improve fault tolerance.

Use String.replaceAll() in conjunction with a regex to replace every character that does not match [a-zA-Z0-9].
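A sketch of the suggested sanitization. It slightly widens the character class above to also keep dashes, since Key Vault object names permit alphanumerics and dashes; the choice of '-' as the replacement character is an assumption:

```java
public class VaultNameSanitizer {
    // Replace every character outside [a-zA-Z0-9-] with a dash, so that
    // user-supplied names become valid Key Vault secret names.
    static String sanitize(String name) {
        return name.replaceAll("[^a-zA-Z0-9-]", "-");
    }

    public static void main(String[] args) {
        System.out.println(sanitize("my_secret.name")); // prints my-secret-name
    }
}
```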
name: Adoption Request
about: Submit your feature to the project
title: 'Adoption Request'
labels: 'adoption'
assignees: ''
Thank you for wanting to contribute to the project! We are very happy to see the functionalities of the EDC being extended. Providing this as open source is a great opportunity for others with similar requirements and helps to avoid duplicated work.
For any details about the guidelines for submitting features, please take a look at the contribution categories.
Please provide some information about your project or code contribution.
If you choose to be referenced as a "friend", these will be added to the known friends list.
If you choose to add your feature as a core EDC component, links to your current code and correlated issues, discussions, and pull requests are of great importance.
Title | Description | Contact | Links |
---|---|---|---|
Integration of Azure Data Lake | Support Directories in Azure Data Lake | [email protected] | #77 |
Next, please tell us what level of adoption you intend to provide. (pick only one)
If you chose to add your feature as a core EDC component, please answer the following questions.
Please argue why this feature must be hosted upstream and be maintained by the EDC core team.
The foundation or core feature is already maintained by the EDC core team, but it is focused on the old Azure Storage account; specifically, it does not support directories of the successor service, Azure Data Lake Gen2. This feature would most likely be of interest to all parties integrating the EDC on Azure, as it especially has a cost impact, due to the possibility to selectively scan the Azure Data Lake.
If there is any existing code that can achieve the same thing with little modification, that is usually the preferable way for the EDC core team. We aim to keep the code succinct and want to avoid similar/duplicate code. Make sure you understand the EDC code base well!
Yes, the change is by itself only a little modification to the existing Azure Blob Storage extension.
Basically, we want you to motivate who will use that feature and why, thereby arguing the fact that it is well-suited to be adopted in the core code base. One-off features are better suited to be maintained externally.
As this contribution will help every company integrating the EDC on Azure, it is most likely that multiple use cases and applications will benefit from this change.
EDC is a platform rather than an application; therefore we are extremely careful when it comes to introducing third-party libraries. The reasons are diverse: security, license issues and overall JAR weight, to mention just a few important ones.
No, it cannot be achieved without third-party dependencies, but these dependencies are already part of the official "Technology-Azure" repository.
Features that do not work well in clustered environments are difficult to adopt, since EDC is designed from the ground up to be stateless and clusterable. Similarly, features that depend on certain operating systems are difficult to argue for.
This question does not apply, as this change by its nature addresses a specific platform.
Features that have a large impact on the code base are very complex to thoroughly test, they have a high chance to destabilize the code and require careful inspection. Self-contained features on the other hand are easier to isolate and test.
The feature is self-contained as an extension.
Currently the Azure Vault hard-codes the DNS endpoint and does not allow setting it in the connector configuration. This makes it impossible to black-box test a connector image.
.vaultUrl("https://" + keyVaultName + ".vault.azure.net")
The request is to change this to something similar to HashicorpVault, which allows changing the entire endpoint.
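The requested change can be sketched as a config lookup with the current hard-coded URL as the default; the setting key below is hypothetical, and a plain map stands in for the connector configuration:

```java
import java.util.Map;

public class VaultUrlConfig {
    // Hypothetical setting name; HashicorpVault exposes a comparable
    // endpoint setting.
    static final String VAULT_URL_SETTING = "edc.vault.azure.url";

    // Use the configured endpoint if present, otherwise fall back to the
    // currently hard-coded public-cloud DNS pattern.
    static String vaultUrl(Map<String, String> config, String keyVaultName) {
        return config.getOrDefault(VAULT_URL_SETTING,
                "https://" + keyVaultName + ".vault.azure.net");
    }

    public static void main(String[] args) {
        System.out.println(vaultUrl(Map.of(), "myvault"));
        System.out.println(vaultUrl(Map.of(VAULT_URL_SETTING, "http://localhost:8200"), "myvault"));
    }
}
```

With this in place, a black-box test can point the vault client at a local stand-in instead of the real Azure DNS name.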
Any thoughts on this?
Matthias Binzer
Remove the status checker following the upstream eclipse-edc/Connector#3448
During a multi-object transfer process, when the DataSource is not AzureStorage, an issue arises if the Consumer defines a blobName in the destination. In this scenario, all objects are overwritten using the same file name, resulting in data loss or unintended overwriting. This occurs because the AzureStorageDataSinkFactory relies on DataSource specifics, particularly the blobPrefix property, which is further used to determine object naming: if the property is present, the transfer is assumed to be a multi-object transfer, but that holds only if the DataSource is AzureStorage.
Affected classes:
- AzureStorageDataSinkFactory
- AzureStorageDataSink
The behavior was reported in a test added by this tractusx-edc PR.
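One way to avoid the overwrite, sketched here independently of the actual factory code: derive the destination name from each part's own name whenever the transfer carries more than one part, and use the consumer-supplied blobName only for single-part transfers (all names illustrative):

```java
import java.util.List;

public class DestinationNaming {
    // For a multi-part transfer, a single consumer-supplied blobName would
    // make every part overwrite the previous one; fall back to the part's
    // own name in that case.
    static String destinationName(String blobName, String partName, int partCount) {
        return (partCount > 1 || blobName == null) ? partName : blobName;
    }

    public static void main(String[] args) {
        List<String> parts = List.of("a.csv", "b.csv");
        for (String part : parts) {
            System.out.println(destinationName("out.csv", part, parts.size()));
        }
        // prints a.csv then b.csv instead of out.csv twice
    }
}
```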
Caused by this bug: eclipse-edc/Connector#2923
We had to provide a nextForState query that is able to filter by type.
This was not implemented for CosmosDB due to time constraints.
CosmosContractNegotiationStore
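A sketch of what the missing query could look like, expressed as a Cosmos SQL string built in Java. The property paths (wrappedInstance.type, wrappedInstance.state, wrappedInstance.stateTimestamp) are assumptions about the document layout, not the actual schema used by CosmosContractNegotiationStore:

```java
public class NextForStateQuery {
    // Builds a Cosmos SQL query that selects negotiations of one type in a
    // given state, oldest first; names and paths are illustrative.
    static String nextForState(String type, int state) {
        return String.format(
            "SELECT * FROM c WHERE c.wrappedInstance.type = '%s' " +
            "AND c.wrappedInstance.state = %d " +
            "ORDER BY c.wrappedInstance.stateTimestamp",
            type, state);
    }

    public static void main(String[] args) {
        System.out.println(nextForState("CONSUMER", 800));
    }
}
```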
name: Adoption Request
about: Submit your feature to the project
title: 'Adoption Request'
labels: 'adoption'
assignees: ''
Thank you for wanting to contribute to the project! We are very happy to see the functionalities of the EDC being extended. Providing this as open source is a great opportunity for others with similar requirements and helps to avoid duplicated work.
For any details about the guidelines for submitting features, please take a look at the contribution categories.
Please provide some information about your project or code contribution.
If you choose to be referenced as a "friend", these will be added to the known friends list.
If you choose to add your feature as a core EDC component, links to your current code and correlated issues, discussions, and pull requests are of great importance.
Title | Description | Contact | Links |
---|---|---|---|
Integration of Azure Data Lake | Add support for federated identity | [email protected] | #77 |
Next, please tell us what level of adoption you intend to provide. (pick only one)
If you chose to add your feature as a core EDC component, please answer the following questions.
Please argue why this feature must be hosted upstream and be maintained by the EDC core team.
The foundation or core feature is already maintained by the EDC core team. Currently, the extension only supports authentication by providing a shared key. Explicit credential management in companies implies effort and cost, since it must adhere to security/compliance rules like key rotation. With a few lines of code, a large gain in security and operational efficiency can be achieved by supporting a standard Microsoft feature named "Federated Identity".
If there is any existing code that can achieve the same thing with little modification, that is usually the preferable way for the EDC core team. We aim to keep the code succinct and want to avoid similar/duplicate code. Make sure you understand the EDC code base well!
Yes, the change is by itself only a little modification to the existing Azure Blob Storage extension.
Basically, we want you to motivate who will use that feature and why, thereby arguing the fact that it is well-suited to be adopted in the core code base. One-off features are better suited to be maintained externally.
I strongly assume that this is the case, as the general topic of managing credentials on a platform like Azure is an issue for all companies.
EDC is a platform rather than an application; therefore we are extremely careful when it comes to introducing third-party libraries. The reasons are diverse: security, license issues and overall JAR weight, to mention just a few important ones.
No, it cannot be achieved without third-party dependencies; an Azure Identity library has to be added, but that would be just one additional Azure library amongst the many already present.
Features that do not work well in clustered environments are difficult to adopt, since EDC is designed from the ground up to be stateless and clusterable. Similarly, features that depend on certain operating systems are difficult to argue for.
This question does not apply, as this change by its nature addresses a specific platform.
Features that have a large impact on the code base are very complex to thoroughly test, they have a high chance to destabilize the code and require careful inspection. Self-contained features on the other hand are easier to isolate and test.
The feature is self-contained as an extension.
name: Adoption Request
about: Submit your feature to the project
title: 'Adoption Request'
labels: 'adoption'
assignees: ''
Thank you for wanting to contribute to the project! We are very happy to see the functionalities of the EDC being extended. Providing this as open source is a great opportunity for others with similar requirements and helps to avoid duplicated work.
For any details about the guidelines for submitting features, please take a look at the contribution categories.
Please provide some information about your project or code contribution.
If you choose to be referenced as a "friend", these will be added to the known friends list.
If you choose to add your feature as a core EDC component, links to your current code and correlated issues, discussions, and pull requests are of great importance.
Title | Description | Contact | Links |
---|---|---|---|
Integration of Azure Data Lake | Meta data to classify data | [email protected] | #77 |
Next, please tell us what level of adoption you intend to provide. (pick only one)
If you chose to add your feature as a core EDC component, please answer the following questions.
Please argue why this feature must be hosted upstream and be maintained by the EDC core team.
The foundation or core feature is already maintained by the EDC core team. The issue is that, as currently implemented, data received by the extension sort of "pops up" in the storage account and has to be somehow correlated to what it belongs to. Adding metadata is, in my opinion, the natural way to solve such topics in the context of this existing extension, especially adding e.g. the BPN of the sending business partner and a correlation id, both of which can be supplied in the data destination of the transfer request.
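The metadata described above could be assembled from the data destination's properties along these lines; the property names (bpn, correlationId) are illustrative, and the resulting map would then be attached to each blob as user-defined metadata on upload:

```java
import java.util.HashMap;
import java.util.Map;

public class BlobMetadata {
    // Copy selected properties from the transfer's data destination into the
    // metadata map attached to each uploaded blob, so received data can be
    // correlated with the request that produced it.
    static Map<String, String> metadataFrom(Map<String, String> destination) {
        Map<String, String> metadata = new HashMap<>();
        for (String key : new String[]{"bpn", "correlationId"}) {
            if (destination.containsKey(key)) {
                metadata.put(key, destination.get(key));
            }
        }
        return metadata;
    }

    public static void main(String[] args) {
        System.out.println(metadataFrom(Map.of("bpn", "BPNL000000000001",
                                               "correlationId", "42",
                                               "blobName", "data.csv")));
    }
}
```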
If there is any existing code that can achieve the same thing with little modification, that is usually the preferable way for the EDC core team. We aim to keep the code succinct and want to avoid similar/duplicate code. Make sure you understand the EDC code base well!
Yes, the change is by itself only a little modification to the existing Azure Blob Storage extension.
The only option without any change at all would be to "misuse" the file name of the received data to store a UUID as correlation id, etc., but that is not acceptable.
Basically, we want you to motivate who will use that feature and why, thereby arguing the fact that it is well-suited to be adopted in the core code base. One-off features are better suited to be maintained externally.
I strongly assume that this is the case, as the general topic of "correlation" was already raised by other companies in the CX weekly meeting. The very idea of correlating received data with the actual request is essential and part of every integration scenario within companies.
EDC is a platform rather than an application; therefore we are extremely careful when it comes to introducing third-party libraries. The reasons are diverse: security, license issues and overall JAR weight, to mention just a few important ones.
No, it cannot be achieved without third-party dependencies, but these dependencies are already part of the official "Technology-Azure" repository.
Features that do not work well in clustered environments are difficult to adopt, since EDC is designed from the ground up to be stateless and clusterable. Similarly, features that depend on certain operating systems are difficult to argue for.
This question does not apply, as this change by its nature addresses a specific platform.
Features that have a large impact on the code base are very complex to thoroughly test, they have a high chance to destabilize the code and require careful inspection. Self-contained features on the other hand are easier to isolate and test.
The feature is self-contained as an extension.
Thank you for wanting to contribute to the project! We are very happy to see the functionalities of the EDC being extended. Providing this as open source is a great opportunity for others with similar requirements and helps to avoid duplicated work.
For any details about the guidelines for submitting features, please take a look at the contribution categories.
Please provide some information about your project or code contribution.
If you choose to be referenced as a "friend", these will be added to the known friends list.
If you choose to add your feature as a core EDC component, links to your current code and correlated issues, discussions, and pull requests are of great importance.
Title | Description | Contact | Links |
---|---|---|---|
Folder provisioning | support for multiple blobs transfer | [email protected] | (I will add the link ASAP) |
Next, please tell us what level of adoption you intend to provide. (pick only one)
If you chose to add your feature as a core EDC component, please answer the following questions.
Please argue why this feature must be hosted upstream and be maintained by the EDC core team.
The foundation or core feature is already maintained by the EDC core team, but it only supports the transfer of a single blob. It does not support, as for example the AWS extension does, registering a source data address as a 'folder' by specifying a blob path prefix. Such a feature would most likely be of interest to all parties integrating the EDC on Azure.
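The 'folder' semantics described above reduce to prefix matching over blob names; a minimal sketch (the actual listing would use the Azure SDK's prefix-filtered list call rather than an in-memory filter):

```java
import java.util.List;
import java.util.stream.Collectors;

public class FolderPrefix {
    // Treat a blob path prefix as a folder: select every blob whose name
    // starts with that prefix.
    static List<String> blobsInFolder(List<String> allBlobs, String prefix) {
        return allBlobs.stream()
                       .filter(name -> name.startsWith(prefix))
                       .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> blobs = List.of("reports/2023/a.csv", "reports/2023/b.csv", "logs/x.txt");
        System.out.println(blobsInFolder(blobs, "reports/"));
        // prints [reports/2023/a.csv, reports/2023/b.csv]
    }
}
```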
If there is any existing code that can achieve the same thing with little modification, that is usually the preferable way for the EDC core team. We aim to keep the code succinct and want to avoid similar/duplicate code. Make sure you understand the EDC code base well!
Yes, the change is by itself only a little modification/addition to the existing Azure Blob Storage extension.
Basically, we want you to motivate who will use that feature and why, thereby arguing the fact that it is well-suited to be adopted in the core code base. One-off features are better suited to be maintained externally.
As this contribution will help every company integrating the EDC on Azure, it is most likely that multiple use cases and applications will benefit from this change.
EDC is a platform rather than an application; therefore we are extremely careful when it comes to introducing third-party libraries. The reasons are diverse: security, license issues and overall JAR weight, to mention just a few important ones.
No, it cannot be achieved without third-party dependencies, but these dependencies are already part of the official "Technology-Azure" repository.
Features that do not work well in clustered environments are difficult to adopt, since EDC is designed from the ground up to be stateless and clusterable. Similarly, features that depend on certain operating systems are difficult to argue for.
This question does not apply, as this change by design addresses a specific platform.
Features that have a large impact on the code base are very complex to thoroughly test, they have a high chance to destabilize the code and require careful inspection. Self-contained features on the other hand are easier to isolate and test.
The feature is self-contained as an extension.
Currently privateProperties for assets are only supported by the in-memory and PostgreSQL stores. The implementation for Cosmos DB is still missing.
asset-index-cosmos
To enable full support for privateProperties
in assets.