This project forked from ismailhammounou/db2ixf

db2ixf is a Python package with a CLI that simplifies the parsing and processing of IBM Integration eXchange Format (IXF) files.

Home Page: https://ismailhammounou.github.io/db2ixf/

License: GNU Affero General Public License v3.0

DB2IXF Parser

DB2IXF Parser is an open-source Python package that simplifies the parsing and processing of IBM Integration eXchange Format (IXF) files. IXF is a file format used by IBM's DB2 database system for data import and export operations. The package extracts data from IXF files and converts it to JSON, CSV, Parquet, or Deltalake.

Features

  • Parse IXF files: The package parses IXF files and extracts the rows of data stored in them.
  • Convert to multiple formats: The parsed data can be converted to JSON, CSV, Parquet, or Deltalake format for further analysis and integration with other systems.
  • Support for file-like objects: The parser accepts file-like objects as input, so IXF data can be parsed directly from an open file object. This is convenient for large datasets because no intermediate file storage is needed.
  • Minimal dependencies: The package has few dependencies (ebcdic, pyarrow, deltalake, chardet, typer), which are installed automatically alongside the package.
  • CLI: A command-line tool called db2ixf comes with the package. It does not support the Deltalake format.
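As a minimal sketch of the file-like input mentioned in the list above, the snippet below wraps IXF bytes that are already in memory in an `io.BytesIO` stream and passes it to `IXFParser`. The helper names `as_binary_stream` and `rows_from_bytes` are hypothetical, and it is an assumption that `IXFParser` handles any readable binary stream the same way it handles a file opened in binary mode.

```python
import io
from typing import BinaryIO


def as_binary_stream(payload: bytes) -> BinaryIO:
    """Wrap in-memory IXF bytes in a readable, seekable binary stream."""
    return io.BytesIO(payload)


def rows_from_bytes(payload: bytes):
    """Parse IXF bytes without writing them to disk first (hypothetical helper)."""
    # Imported lazily so as_binary_stream stays usable without db2ixf installed.
    from db2ixf import IXFParser

    # Assumption: IXFParser treats this stream like a file opened with mode='rb'.
    parser = IXFParser(as_binary_stream(payload))
    return parser.parse()
```

This may be useful when the IXF payload comes from a network response or an object store rather than from a local file.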

Assumption

  • One IXF file contains one table.

Getting Started

Installation

You can install DB2 IXF Parser using pip:

pip install db2ixf

Usage

Here are some examples of how to use DB2 IXF Parser:

CLI

Start with this:

db2ixf --help

Result:

 Usage: db2ixf [OPTIONS] COMMAND [ARGS]...

 A command-line tool (CLI) for parsing and converting IXF (IBM DB2 Import/Export 
 Format) files to various formats such as JSON, CSV, and Parquet. Easily parse 
 and convert IXF files to meet your data processing needs.

+- Options -------------------------------------------------------------------+
| --version             -v        Show the version of the CLI.                |
| --install-completion            Install completion for the current shell.   |
| --show-completion               Show completion for the current shell, to   |
|                                 copy it or customize the installation.      |
| --help                          Show this message and exit.                 |
+-----------------------------------------------------------------------------+
+- Commands ------------------------------------------------------------------+
| csv      Parse ixf FILE and convert it to a csv OUTPUT.                     |
| json     Parse ixf FILE and convert it to a json OUTPUT.                    |
| parquet  Parse ixf FILE and convert it to a parquet OUTPUT.                 |
+-----------------------------------------------------------------------------+

 Made with heart :D

Parsing an IXF file

# coding=utf-8
from pathlib import Path
from db2ixf import IXFParser

path = Path('path/to/IXF/file.XXX.IXF')
with open(path, mode='rb') as f:
    parser = IXFParser(f)
    rows = parser.parse()
    for row in rows:
        print(row)
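Printing every row can be slow for a large file. The following sketch previews only the first few rows with `itertools.islice`. The names `preview` and `preview_ixf` are hypothetical helpers, not part of db2ixf, and the sketch only assumes that `parser.parse()` returns something iterable, as the loop above suggests.

```python
from itertools import islice
from typing import Any, Iterable, List


def preview(rows: Iterable[Any], limit: int = 5) -> List[Any]:
    """Return at most `limit` rows, reading no more of `rows` than needed."""
    return list(islice(rows, limit))


def preview_ixf(path: str, limit: int = 5) -> List[Any]:
    """Open an IXF file and return its first `limit` rows (hypothetical helper)."""
    # Imported lazily so preview() can be used and tested without db2ixf installed.
    from db2ixf import IXFParser

    with open(path, mode="rb") as f:
        # The rows are collected while the file is still open.
        return preview(IXFParser(f).parse(), limit)
```

For example, `preview_ixf('path/to/IXF/file.XXX.IXF', limit=3)` would return a list with up to three rows.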

Converting to JSON

# coding=utf-8
from pathlib import Path
from db2ixf import IXFParser

path = Path('path/to/IXF/file.XXX.IXF')
with open(path, mode='rb') as f:
    parser = IXFParser(f)
    output_path = Path('path/to/output/file.json')
    with open(output_path, mode='w', encoding='utf-8') as output_file:
        parser.to_json(output_file)

Converting to CSV

# coding=utf-8
from pathlib import Path
from db2ixf import IXFParser

path = Path('path/to/IXF/file.XXX.IXF')
with open(path, mode='rb') as f:
    parser = IXFParser(f)
    output_path = Path('path/to/output/file.csv')
    with open(output_path, mode='w', encoding='utf-8') as output_file:
        parser.to_csv(output_file)

Converting to Parquet

# coding=utf-8
from pathlib import Path
from db2ixf import IXFParser

path = Path('path/to/IXF/file.XXX.IXF')
with open(path, mode='rb') as f:
    parser = IXFParser(f)
    output_path = Path('path/to/output/file.parquet')
    with open(output_path, mode='wb') as output_file:
        parser.to_parquet(output_file)
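The JSON, CSV and Parquet examples above differ only in the parser method called and in the mode used to open the output file. The sketch below picks both from the output file extension. `WRITERS`, `writer_for` and `convert` are hypothetical names introduced for illustration; only `to_json`, `to_csv` and `to_parquet` come from the examples above.

```python
from pathlib import Path
from typing import Tuple

# Parser method and output file mode per extension,
# mirroring the JSON, CSV and Parquet examples above.
WRITERS = {
    ".json": ("to_json", "w"),
    ".csv": ("to_csv", "w"),
    ".parquet": ("to_parquet", "wb"),
}


def writer_for(output_path) -> Tuple[str, str]:
    """Return (method name, open mode) for the given output path."""
    suffix = Path(output_path).suffix.lower()
    try:
        return WRITERS[suffix]
    except KeyError:
        raise ValueError(f"unsupported output format: {suffix!r}") from None


def convert(ixf_path, output_path) -> None:
    """Convert one IXF file, choosing the format from the output extension."""
    # Imported lazily so writer_for() works without db2ixf installed.
    from db2ixf import IXFParser

    method_name, mode = writer_for(output_path)
    # Text outputs are written as UTF-8; binary outputs take no encoding.
    encoding = None if "b" in mode else "utf-8"
    with open(ixf_path, mode="rb") as src, \
            open(output_path, mode=mode, encoding=encoding) as dst:
        getattr(IXFParser(src), method_name)(dst)
```

Deltalake is left out on purpose, because `to_deltalake` takes an output path rather than an open file.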

Converting to Deltalake

# coding=utf-8
from pathlib import Path
from db2ixf import IXFParser

path = Path('path/to/IXF/file.XXX.IXF')
with open(path, mode='rb') as f:
    parser = IXFParser(f)
    output_path = 'path/to/output/'
    parser.to_deltalake(output_path)

For more details and usage examples, please refer to the documentation.

Contributing

IXF Parser is actively seeking contributions to enhance its features and reliability. Your participation is valuable in shaping the future of the project.

We appreciate your feedback, bug reports, and feature requests. If you encounter any issues or have ideas for improvement, please open an issue on the GitHub repository.

For any questions or assistance during the contribution process, feel free to reach out by opening an issue on the GitHub repository.

Thank you for considering contributing to IXF Parser. Let's work together to create a powerful and dependable tool for working with DB2's IXF files.

Todo

  • Search for contributors/maintainers/sponsors.
  • Add tests (manual testing was done, but unit tests still need to be written).
  • Add a new collector for the floating-point data type.
  • Add new collectors for other IXF data types (binary, etc.).
  • Improve documentation.
  • Add a CLI.
  • Improve CLI: make the output optional.
  • Improve CI/CD.
  • Improve Makefile.
  • Support multiprocessing.
  • Support archived inputs (Python API only, not the CLI?).
  • Add logging.
  • Add support for Deltalake.
  • Add support for PyArrow.

License

IXF Parser is released under the AGPL-3.0 License.

Support

If you encounter any issues or have questions about using IXF Parser, please open an issue on the GitHub repository. We will do our best to address them promptly.

Conclusion

IXF Parser offers a convenient solution for parsing and processing IBM DB2's IXF files. With its ease of use and support for various output formats, it provides a valuable tool for working with DB2 data. We hope that you find this package useful in your data analysis and integration workflows.

Give it a try and let us know your feedback. Happy parsing!

Contributors

ismailhammounou, rckmath
