Comments (13)

aleksey-stukalov commented on June 15, 2024

We could develop the following behaviour:

  1. Pick a master entity, e.g. Car.class
  2. Map columns to this master entity:

     Destination        Source      Alias
     {E}.driver.name    column 5    firstName
     {E}.car.plate      column 1    regNumber
     ....

  3. Define a policy for detecting whether an entity already exists, and what to do if it does:

     Entity          Unique Index                  Policy if exists (leave, re-write, create new)
     Car.class       regNumber                     re-write if exists
     Driver.class    firstName, secondName,...     leave if exists
     ...

  4. Define the column for EOF detection (the import stops at the first empty cell in it)
  5. Define a map of default values
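
A minimal Java sketch of steps 2 and 4 above, assuming rows have already been read into string arrays. The names `ColumnMappingSketch`, `ColumnMapping`, `rowsUntilEof` and `toAliases` are hypothetical and not part of cuba-component-data-import, and the sample rows are made up. Column indexes are zero-based here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names come from cuba-component-data-import.
final class ColumnMappingSketch {

    // One row of the mapping table: destination path, zero-based source column, alias.
    record ColumnMapping(String destination, int sourceColumn, String alias) {}

    // Keeps rows until the first one whose EOF column is missing or blank.
    static List<String[]> rowsUntilEof(List<String[]> rows, int eofColumn) {
        List<String[]> result = new ArrayList<>();
        for (String[] row : rows) {
            if (eofColumn >= row.length || row[eofColumn] == null || row[eofColumn].isBlank()) {
                break;
            }
            result.add(row);
        }
        return result;
    }

    // Resolves one raw row into alias -> raw value, using the column mappings.
    static Map<String, String> toAliases(String[] row, List<ColumnMapping> mappings) {
        Map<String, String> values = new LinkedHashMap<>();
        for (ColumnMapping m : mappings) {
            values.put(m.alias(), m.sourceColumn() < row.length ? row[m.sourceColumn()] : null);
        }
        return values;
    }

    public static void main(String[] args) {
        List<ColumnMapping> mappings = List.of(
                new ColumnMapping("{E}.car.plate", 0, "regNumber"),
                new ColumnMapping("{E}.driver.name", 1, "firstName"));
        List<String[]> rows = List.of(
                new String[] {"AB-123", "Ann"},
                new String[] {"CD-456", "Bob"},
                new String[] {"", ""},           // first empty cell in column 0: stop here
                new String[] {"ignored", "row"});
        for (String[] row : rowsUntilEof(rows, 0)) {
            System.out.println(toAliases(row, mappings));
        }
    }
}
```

The alias gives later steps (unique-key checks, defaults) a stable name to refer to, independent of the column position in the file.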

from cuba-component-data-import.

mariodavid commented on June 15, 2024

Yeah, this sounds good.

  1. This could be done either by passing the master entity as a parameter or by making it the first step of the wizard. Probably both should be supported.
  2. Yeah, I don't know what the UI should look like. Additionally, we should start small with direct String / int attributes, then add enums, then associations and so on.
  3. I don't really know what you mean. Can you elaborate?
  4. This is Excel-specific, right? For CSV it would be options like "separator", for example.
  5. Might make sense. Perhaps not for 0.1.0, but it sounds good.

aleksey-stukalov commented on June 15, 2024

3 - When reading a line, you need to check whether this record already exists in the database. If it does, what should be done? Should the data be re-written? Should it be left as is, with no action? Should a new, similar row be created? E.g. in this case, if the customer exists, we update all of its fields to what is specified in the imported file.
4 - Yes, you are right.
5 - A map of defaults is extremely useful and is already developed (see here). So if you cannot parse something, or you have an empty value for a mandatory attribute, you simply apply the defaults.
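
The default-value fallback from point 5 could be sketched in Java roughly as follows. This is an illustration only: `DefaultValuesSketch`, `parseOrDefault` and the `seats` attribute are hypothetical names, not the already developed implementation referred to above.

```java
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch; the names below are illustrative, not the component's API.
final class DefaultValuesSketch {

    // Parses the raw cell; falls back to the default when the cell is empty or cannot be parsed.
    static <T> T parseOrDefault(String raw, Function<String, T> parser, T defaultValue) {
        if (raw == null || raw.isBlank()) {
            return defaultValue;
        }
        try {
            return parser.apply(raw.trim());
        } catch (RuntimeException e) {
            return defaultValue;
        }
    }

    public static void main(String[] args) {
        // Map of defaults keyed by attribute name ("seats" is a made-up attribute).
        Map<String, Integer> defaults = Map.of("seats", 4);
        Function<String, Integer> toInt = s -> Integer.valueOf(s);
        System.out.println(parseOrDefault("7", toInt, defaults.get("seats")));
        System.out.println(parseOrDefault("", toInt, defaults.get("seats")));
        System.out.println(parseOrDefault("n/a", toInt, defaults.get("seats")));
    }
}
```

Silently replacing unparseable values is convenient but can hide bad input, so a real importer would probably also want to log which rows fell back to a default.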

mariodavid commented on June 15, 2024

Alright - so I implemented the first version. It was mainly built with CSV in mind, but I have already thought about XLSX :)

[Image: import-wizard-draft]

The current wizard has the following steps:

  1. upload file
  2. select entity
  3. setup columns <-> attributes
  4. extended import configuration
  5. preview

The following features are not included in the first draft:

  • setup default values
  • support for boolean type
  • references to other entities
  • preview on the parsed data instead of the raw csv data
  • feedback if import was not successful
  • select attributes from drop down instead of text field
  • a lot of other stuff

Let's talk about it ;)

aleksey-stukalov commented on June 15, 2024

Looks great!

From my experience, it would also be great to be able to define some kind of transformation script for each column right at step 3 and remove step 4. So you can (or, for association attributes and enum types, must) define a Groovy script that receives value as a parameter and transforms it into the right output format.

What do you think about it? From my perspective, this would make the applicability of such importer much wider.

We can also have predefined scripts for Integers, Doubles, Dates and other widely used types to convert the value into the corresponding format.
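
A rough Java sketch of that idea, with plain functions standing in for the Groovy scripts: a predefined converter per attribute type, overridable by a custom transformer for a column. `PredefinedTransformersSketch`, `PREDEFINED` and `transform` are hypothetical names, and the ISO date format is an assumption.

```java
import java.time.LocalDate;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch; Java functions stand in for the per-column Groovy scripts.
final class PredefinedTransformersSketch {

    // Predefined converters keyed by the target attribute type.
    static final Map<Class<?>, Function<String, Object>> PREDEFINED = new HashMap<>();

    static {
        PREDEFINED.put(String.class, value -> value);
        PREDEFINED.put(Integer.class, value -> Integer.valueOf(value.trim()));
        PREDEFINED.put(Double.class, value -> Double.valueOf(value.trim()));
        // Assumes ISO-8601 dates such as 2018-04-05.
        PREDEFINED.put(LocalDate.class, value -> LocalDate.parse(value.trim()));
    }

    // Uses the column's custom transformer when one is given, else the predefined one for the type.
    static Object transform(String value, Class<?> targetType, Function<String, Object> custom) {
        Function<String, Object> transformer = custom != null ? custom : PREDEFINED.get(targetType);
        if (transformer == null) {
            throw new IllegalArgumentException("No transformer for " + targetType.getName());
        }
        return transformer.apply(value);
    }

    public static void main(String[] args) {
        System.out.println(transform("42", Integer.class, null));
        System.out.println(transform("2018-04-05", LocalDate.class, null));
        // Custom transformer for a type without a predefined converter.
        System.out.println(transform("yes", Boolean.class, value -> value.equalsIgnoreCase("yes")));
    }
}
```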

aleksey-stukalov commented on June 15, 2024

Another improvement I see is the ability to save an upload scenario and re-use it later.

mariodavid commented on June 15, 2024

> So, you can (or must for associations attributes or enum types) define a groovy script where you have value as the parameter, then you transform it into the right output format.

Ah yeah, I thought about that as well. This would basically be an escape hatch, so that you are not restricted by the pre-defined config options. That makes sense.

But it is questionable who will mainly use that wizard. Is it a developer? In this case, this feature makes total sense. Is it an end user / administrator? Then this feature will scare people away. Therefore I think we should try to build the main use cases (enum matcher, lookup of another entity by ID / another column etc.) into the wizard.

But we should also allow this kind of scripting. Perhaps this will not be part of the wizard but instead just part of the regular editor of the ImportConfiguration, which will then be prepared by the developers and saved. Afterwards the users will just use that config to import their file.

> We also can have prefined scripts for Integers, Doubles, Dates and other widely used types, to convert the value into the corresponding format.

This is what I hard-coded for now (and made configurable via step 4).

> Another improvement I see is an ability to save upload scenario and re-use it later.

That will be easily achievable. It already uses the ImportConfiguration under the covers...

aleksey-stukalov commented on June 15, 2024

> But it is questionable, which people will mainly use that wizard. Is it a developer? In this case, this feature makes total sense. Is it an end user / administrator? Than this feature will scare people away. Therefore i think we should try to build the main use cases (enum matcher, lookup of another entity by ID / another column etc.) that into the wizard.

Again, from my experience, users will mess everything up using such a tool. So I would say it will be an administrator. A user will only be using pre-configured scenarios.

As for searching by ID - this is never the case in the real world. Have you ever seen import tables that contain UIDs from your own system? :)

At the same time, writing a simple script is not a big problem for any reasonably educated administrator, e.g.

    if (value.equals("high")) return Priority.HIGH else return Priority.LOW

Plus, for all non-string attribute types, we can have default transformation scripts. For an int field it would be something like:

    Integer.valueOf(value)

mariodavid commented on June 15, 2024

> As for searching by id - this is never the case in the real world. Did you see tables for import with uids from your system in it :)?

Obviously not by ID in the case of a UUID. But sometimes it might be an auto-increment ID, and then it might actually be the case... Whatever - the main point here is to select an association by an attribute of the associated entity. Example:

MlbPlayer N:1 MlbTeam through the attribute team. Now MlbTeam has attributes name and code.

In the CSV file there is a reference to team and it contains the code of the MlbTeam entity.

This is a fairly easy case and it should just work out of the box. The equivalent in Excel is vlookup :)

Here's the full example:

[Screenshot, 2018-04-05 12:30:03]

[Screenshot, 2018-04-05 12:33:03]

I updated the MLB example in the code...
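
The lookup of an association by one of its attributes can be illustrated with a small Java sketch. It assumes the MlbTeam instances are already loaded into memory, whereas a real import would presumably query the database; `AssociationLookupSketch`, `indexBy` and `importPlayer` are hypothetical names, and the sample data is made up.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical sketch; an in-memory list stands in for the database.
final class AssociationLookupSketch {

    record MlbTeam(String code, String name) {}

    record MlbPlayer(String name, MlbTeam team) {}

    // Indexes the candidate entities by the attribute the file refers to.
    // Collectors.toMap throws IllegalStateException if two entities share the same value.
    static <E> Map<String, E> indexBy(List<E> entities, Function<E, String> attribute) {
        return entities.stream().collect(Collectors.toMap(attribute, Function.identity()));
    }

    // Resolves the team by its code; empty if the file references an unknown team.
    static Optional<MlbPlayer> importPlayer(String name, String teamCode, Map<String, MlbTeam> teamsByCode) {
        return Optional.ofNullable(teamsByCode.get(teamCode)).map(team -> new MlbPlayer(name, team));
    }

    public static void main(String[] args) {
        Map<String, MlbTeam> teamsByCode = indexBy(
                List.of(new MlbTeam("BOS", "Boston Red Sox"), new MlbTeam("NYY", "New York Yankees")),
                MlbTeam::code);
        System.out.println(importPlayer("Sample Player", "BOS", teamsByCode));
        System.out.println(importPlayer("Other Player", "???", teamsByCode));
    }
}
```

Returning an Optional leaves the decision about unknown team codes (skip the row, fail the import, apply a default) to the caller.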

mariodavid commented on June 15, 2024

> 3 - reading a line you need to see if you already have this record in the database. And if yes, what should be done? Should data be re-written? Should it be left as is with no action? Should be created a new similar row? E.g. in this case if customer exists we will update all the fields of it to what is specified in the importing file

@aleksey-stukalov, you called this "Unique Index" in the table above (#5 (comment)). But this is not an existing DB unique constraint that the application should introspect, but rather something that can be defined by the user, right?
So the user says "if Driver.firstname + Driver.lastname already exists in this combination, then do nothing / replace / update etc." - correct?

This would mean that every data row requires some number of queries to the DB. But it sounds reasonable...

aleksey-stukalov commented on June 15, 2024

> @aleksey-stukalov , you wrote that above that stated it Unique Index in the table (#5 (comment)). But this is not an already existing DB unique constraint that should get introspected by the application, but rather something that can be defined by the user, right?

Right, this is some kind of business key: a set of columns that defines the uniqueness of a record.

> This would require that actually every data row will some amount of queries to the DB. But it sounds reasonable...

Yes, but there is no other way...
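
To make the business-key policy concrete, here is a minimal Java sketch with an in-memory list standing in for the database, so each imported row triggers one scan instead of one query. `ExistingRecordPolicySketch`, `Policy`, `businessKey` and `importRow` are hypothetical names, and the `color` attribute is made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch; a list of attribute maps stands in for the database table.
final class ExistingRecordPolicySketch {

    enum Policy { LEAVE, REWRITE, CREATE_NEW }

    // Builds the business key from the attributes that define uniqueness.
    static List<String> businessKey(Map<String, String> entity, List<String> keyAttributes) {
        return keyAttributes.stream().map(entity::get).collect(Collectors.toList());
    }

    // Imports one row according to the policy for already existing records.
    static void importRow(List<Map<String, String>> store, Map<String, String> row,
                          List<String> keyAttributes, Policy policy) {
        List<String> key = businessKey(row, keyAttributes);
        int existing = -1;
        for (int i = 0; i < store.size(); i++) {
            if (businessKey(store.get(i), keyAttributes).equals(key)) {
                existing = i;
                break;
            }
        }
        if (existing < 0 || policy == Policy.CREATE_NEW) {
            store.add(row);
        } else if (policy == Policy.REWRITE) {
            store.set(existing, row);
        }
        // Policy.LEAVE: the stored record stays untouched.
    }

    public static void main(String[] args) {
        List<Map<String, String>> store = new ArrayList<>();
        List<String> key = List.of("regNumber");
        importRow(store, Map.of("regNumber", "AB-123", "color", "red"), key, Policy.LEAVE);
        importRow(store, Map.of("regNumber", "AB-123", "color", "blue"), key, Policy.REWRITE);
        importRow(store, Map.of("regNumber", "AB-123", "color", "green"), key, Policy.LEAVE);
        System.out.println(store.size() + " record(s), color = " + store.get(0).get("color"));
    }
}
```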

mariodavid commented on June 15, 2024

OK then - I understand. I created a story for that:
#20

mariodavid commented on June 15, 2024

I will close this issue now, as we have a solid understanding of what the wizard should look like.
Follow-up issues will come in order to improve it.
