ismailhammounou / db2ixf

db2ixf is a Python package with a CLI that simplifies the parsing and processing of IBM Integration eXchange Format (IXF) files.

Home Page: https://ismailhammounou.github.io/db2ixf/

License: GNU Affero General Public License v3.0

Languages: Python 89.42%, Makefile 8.88%, Shell 1.40%, Jinja 0.29%
Topics: conversion, converter, csv, db2, db2-database, ibm, ibm-cloud, ixf, json, parquet

db2ixf's Introduction

Hi There 👋

db2ixf's Issues

Improve CI/CD

Separate the jobs, and maybe try using multiple environments or workflows?

[Feature] Support for VARGRAPHIC data type

Hi @ismailhammounou,

I was trying to use your tool to convert some IXF files to CSV and got the error "The column {col_name} has unknown data type" for the VARGRAPHIC data type.

Considering that...

I forked your repo and made some changes so that the VARGRAPHIC '464' data type is parsed as a string. I have:

  • Added a new collector;
  • Added a 'VarGraphicLengthException' class;
  • Added the references for these additions;
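For readers unfamiliar with the type, here is a minimal sketch of what a VARGRAPHIC collector might look like. It is not the code from the fork. It assumes the field starts with a 2-byte length counting double-byte characters, that the length is little-endian, and that the characters are UTF-16 big-endian; the real byte order and code page depend on the file, and the function name and parameters are hypothetical.

```python
import struct


def collect_vargraphic_sketch(
    data: bytes,
    max_chars: int,
    byteorder: str = "<",          # assumption: little-endian length prefix
    encoding: str = "utf-16-be",   # assumption: depends on the column's code page
) -> str:
    """Decode a length-prefixed double-byte string (hypothetical helper)."""
    # The first two bytes hold the current length in double-byte characters.
    (n_chars,) = struct.unpack_from(byteorder + "H", data, 0)
    if n_chars > max_chars:
        # Plays the role of the 'VarGraphicLengthException' mentioned above.
        raise ValueError(f"length {n_chars} exceeds maximum {max_chars}")
    # Each character occupies two bytes, so the payload is 2 * n_chars bytes.
    payload = data[2:2 + 2 * n_chars]
    return payload.decode(encoding)
```

Any padding after the declared length is ignored, so only the characters the length prefix announces end up in the resulting string.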

I have also:

  • Applied .strip() to time, date and timestamp values while collecting the column data, because some of my IXF files had trailing spaces after the date/time/timestamp;
  • Added a fallback format to the timestamp collector's conversion, because I had an IXF file whose timestamps did not match the defined timestamp format;

You can see the mentioned changes through the following commit: rckmath@e60aba5
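The stripping and fallback-format idea can be sketched roughly as follows. This is an illustration only, not the code from the commit: the function name and both format strings are assumptions, and the formats db2ixf actually uses may differ.

```python
from datetime import datetime

# Hypothetical formats, chosen for illustration only.
PRIMARY_FORMAT = "%Y-%m-%d-%H.%M.%S.%f"
FALLBACK_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def parse_timestamp_sketch(raw: bytes, encoding: str = "utf-8") -> datetime:
    """Parse a timestamp field, tolerating padding and a second format."""
    # Strip surrounding whitespace, since some files pad the field with spaces.
    text = raw.decode(encoding).strip()
    # Try the primary format first, then the fallback.
    for fmt in (PRIMARY_FORMAT, FALLBACK_FORMAT):
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError(f"Unrecognized timestamp: {text!r}")
```

Trying the formats in order keeps the common case fast and only touches the fallback when the primary format fails.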


This is my first hands-on experience with Python and with contributing to open source projects, which is why I'm not opening a pull request directly against your repository.

I would appreciate any feedback.

IXF parser

As a data scientist/engineer

I want to be able to parse the IBM exchange format and output the result as JSON, CSV or Parquet

so that it can be analysed by other systems, such as Spark

Fix error with blob_collector

The blob collector for the BLOB data type does not take the code page into account, which affects encoding/decoding.

TypeError: 'type' object is not subscriptable

Package version 0.7.0 currently raises TypeError: 'type' object is not subscriptable on use.

# coding=utf-8
import pathlib
from db2ixf.ixf import IXFParser

path = pathlib.Path('Path/to/IXF/FILE/XXX.IXF')
with open(path, mode='rb') as f:
    parser = IXFParser(f)
    output_path = pathlib.Path('Path/To/Output/YYY.csv')
    with open(output_path, mode='w', encoding='utf-8') as output_file:
        parser.to_csv(output_file, sep='#')

fails with

Traceback (most recent call last):
  File "test.py", line 3, in <module>
    from db2ixf.ixf import IXFParser
  File "db2ixf\__init__.py", line 94, in <module>
    from db2ixf.ixf import IXFParser
  File "db2ixf\ixf.py", line 11, in <module>
    from db2ixf.collectors import (collect_bigint,
  File "db2ixf\collectors.py", line 9, in <module>
    from db2ixf.helpers import get_ccsid_from_column
  File "db2ixf\helpers.py", line 10, in <module>
    def get_pyarrow_schema(cols: list[dict]) -> dict[str, object]:
TypeError: 'type' object is not subscriptable

If you want to maintain compatibility with Python 3.8, you need to add "from typing import Dict, List" to helpers.py and change

def get_pyarrow_schema(cols: list[dict]) -> dict[str, object]:

to

def get_pyarrow_schema(cols: List[dict]) -> Dict[str, object]:

and

def get_pandas_schema(cols: list[dict]) -> dict[str, object]:

to

def get_pandas_schema(cols: List[dict]) -> Dict[str, object]:

and

def merge_dicts(dicts: list[dict]) -> dict[str, list]:

to

def merge_dicts(dicts: List[dict]) -> Dict[str, list]:
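As a self-contained illustration of the change, the sketch below uses the typing generics so that the module imports cleanly on Python 3.8. Only the signature comes from the traceback; the function body is an illustrative guess, not the package's implementation.

```python
from typing import Dict, List


def merge_dicts(dicts: List[dict]) -> Dict[str, list]:
    """Merge a list of dicts into one dict of lists (illustrative body)."""
    merged: Dict[str, list] = {}
    for d in dicts:
        for key, value in d.items():
            # Collect every value seen for a key, in input order.
            merged.setdefault(key, []).append(value)
    return merged
```

An alternative that leaves the signatures untouched is to put "from __future__ import annotations" at the top of the module, which defers evaluation of annotations; that only helps as long as nothing evaluates the annotations at runtime.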

Thank you for the package!
