
getml / getml-community


Fast, high-quality forecasts on relational and multivariate time-series data powered by new feature learning algorithms and automated ML.

Home Page: https://getml.com

License: Other

Dockerfile 0.03% Jupyter Notebook 22.93% Python 5.97% Shell 0.24% CMake 0.03% C++ 31.94% C 38.55% Go 0.27% Jinja 0.03%
data-science feature-engineering feature-learning machine-learning predictive-modeling python-library ai ml mlops relational-learning

getml-community's People

Contributors

liuzicheng1987


getml-community's Issues

select via boolean list or array on DataFrame

Currently we can only select rows from a DataFrame via a BooleanColumnView.

Support for selection via list of bools or numpy array of bools should be added.

target_table[list_or_numpy_array_of_bools]
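
A minimal sketch of the validation such a feature might need, assuming the mask is first reduced to a list of selected row indices. The helper name `normalize_mask` is hypothetical and not part of getml; numpy arrays are handled only through their `tolist()` method, so the sketch itself does not import numpy.

```python
def normalize_mask(mask, nrows):
    """Validate a boolean mask and return the indices of the selected rows.

    Hypothetical helper, not part of the getml API.
    """
    # numpy arrays expose tolist(), which yields plain Python bools;
    # ordinary lists and tuples skip this step.
    if hasattr(mask, "tolist"):
        mask = mask.tolist()
    mask = list(mask)

    if len(mask) != nrows:
        raise ValueError(
            f"mask has {len(mask)} entries, but the data frame has {nrows} rows"
        )

    # Reject ints such as 0/1 so that index lists are not mistaken for masks.
    for flag in mask:
        if not isinstance(flag, bool):
            raise TypeError("mask must contain only bools")

    return [index for index, flag in enumerate(mask) if flag]


selected = normalize_mask([True, False, True], nrows=3)
print(selected)
```

Rejecting non-bool entries keeps a list like [0, 2] from being silently treated as a mask.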

Change in dataframe columns leads to error when printing dataframe (try/except with refresh required)

When modifying a table via

table = table.drop(cols=["A", "B", "C"]).to_df(name=table.name)

no error is thrown. But a subsequent access to the table raises an error saying that column "A" does not exist (here table.name is "TEST"):

OSError: Data frame 'TEST' contains no column named 'A'!

The problem is solved by calling

table.refresh()

=> Possible solution: add a try/except that calls refresh()?
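
The proposed try/except could look roughly like the sketch below. It assumes the stale handle surfaces as an OSError (as in the message above) and that one call to refresh() is enough. `call_with_refresh` and `FakeTable` are hypothetical stand-ins so the example runs without a getml engine.

```python
def call_with_refresh(table, action):
    """Run action(table); on OSError, refresh the handle once and retry."""
    try:
        return action(table)
    except OSError:
        # The local handle may be out of date; refresh it and retry once.
        # A second failure propagates to the caller.
        table.refresh()
        return action(table)


class FakeTable:
    """Stand-in for a data frame handle whose column list is stale."""

    def __init__(self):
        self.stale = True
        self.refreshed = 0

    def refresh(self):
        self.stale = False
        self.refreshed += 1

    def describe(self):
        if self.stale:
            raise OSError("Data frame 'TEST' contains no column named 'A'!")
        return "ok"


table = FakeTable()
print(call_with_refresh(table, lambda t: t.describe()))
```

Retrying only once avoids masking errors that a refresh cannot fix.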

Support datetime.datetime (in general support standard lib types)

import datetime

now = datetime.datetime.now()
table[table["date"] < now]

does not work; an np.datetime64 is required:

import datetime
import numpy as np

now = datetime.datetime.now()
table[table["date"] < np.datetime64(now)]

In general, getml should support standard Python library types such as datetime.datetime.
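
One way such support might start is by normalizing standard library values before they are converted. The sketch below is an assumption about a possible approach, not getml code: the hypothetical helper `to_naive_utc` turns a `datetime.date` or `datetime.datetime` into a naive datetime, treating naive inputs as already being in UTC. The result can then be passed to np.datetime64 as in the workaround above.

```python
import datetime as dt


def to_naive_utc(value):
    """Coerce a stdlib date or datetime to a naive datetime in UTC.

    Hypothetical helper; naive datetimes are assumed to be in UTC already.
    """
    # Check datetime first: datetime is a subclass of date.
    if isinstance(value, dt.datetime):
        if value.tzinfo is not None:
            # Convert aware values to UTC, then drop the time zone info.
            value = value.astimezone(dt.timezone.utc).replace(tzinfo=None)
        return value
    if isinstance(value, dt.date):
        # A plain date becomes midnight of that day.
        return dt.datetime(value.year, value.month, value.day)
    raise TypeError(f"unsupported type: {type(value).__name__}")


aware = dt.datetime(2024, 1, 1, 12, 0, tzinfo=dt.timezone(dt.timedelta(hours=2)))
print(to_naive_utc(aware))
```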

Recursion error when printing pipelines

import getml
getml.engine.launch()
getml.engine.set_project("test")
names_pipelines = getml.pipeline.list_pipelines()
x = getml.pipeline.load(names_pipelines[0])
print(x)

leads to the error:

Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/ubuntu/.vscode-server/extensions/ms-python.python-2023.22.0/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/__main__.py", line 39, in <module>
    cli.main()
  File "/home/ubuntu/.vscode-server/extensions/ms-python.python-2023.22.0/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 430, in main
    run()
  File "/home/ubuntu/.vscode-server/extensions/ms-python.python-2023.22.0/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 284, in run_file
    runpy.run_path(target, run_name="__main__")
  File "/home/ubuntu/.vscode-server/extensions/ms-python.python-2023.22.0/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 321, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/home/ubuntu/.vscode-server/extensions/ms-python.python-2023.22.0/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 135, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/home/ubuntu/.vscode-server/extensions/ms-python.python-2023.22.0/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 124, in _run_code
    exec(code, run_globals)
  File "/home/ubuntu/.../.../models/test.py", line 10, in <module>
    print(x)
  File "/home/ubuntu/.../.../.venv/lib/python3.10/site-packages/getml/pipeline/pipeline.py", line 609, in __repr__
    repr_str = sig._format()
  File "/home/ubuntu/.../.../.venv/lib/python3.10/site-packages/getml/utilities/formatting/signature_formatter.py", line 194, in _format
    return getattr(self, "_format_" + style)(suppress_none)
  File "/home/ubuntu/.../.../.venv/lib/python3.10/site-packages/getml/utilities/formatting/signature_formatter.py", line 105, in _format_pep8
    list_split = _split_value(
  File "/home/ubuntu/.../.../.venv/lib/python3.10/site-packages/getml/utilities/formatting/signature_formatter.py", line 34, in _split_value
    split.extend(_split_value(remaining, start, max_width))
  File "/home/ubuntu/.../.../.venv/lib/python3.10/site-packages/getml/utilities/formatting/signature_formatter.py", line 34, in _split_value
    split.extend(_split_value(remaining, start, max_width))
  File "/home/ubuntu/.../.../.venv/lib/python3.10/site-packages/getml/utilities/formatting/signature_formatter.py", line 34, in _split_value
    split.extend(_split_value(remaining, start, max_width))
  [Previous line repeated 977 more times]
  File "/home/ubuntu/.../.../.venv/lib/python3.10/site-packages/getml/utilities/formatting/signature_formatter.py", line 29, in _split_value
    truncated = list(_truncate_line(value, 2, max_width - start, template="{!r}"))
  File "/home/ubuntu/.../.../.venv/lib/python3.10/site-packages/getml/utilities/formatting/formatter.py", line 152, in _truncate_line
    for cell, cum_width in zip(line, cum_widths):
  File "/home/ubuntu/.../.../.venv/lib/python3.10/site-packages/getml/utilities/formatting/formatter.py", line 148, in <genexpr>
    widths = (len(template.format(cell)) + margin for cell in line)
  File "_pydevd_bundle/pydevd_cython.pyx", line 1457, in _pydevd_bundle.pydevd_cython.SafeCallWrapper.__call__
RecursionError: maximum recursion depth exceeded while calling a Python object

The problem is probably that not all strings are allowed for "tag". The value in question:

"{'schema_name': 'DEMAND', 'country': 'IT', 'tables': 'w:60d', 'include_categorical': False, 'fast_prop': True, 'fast_prop_mode': 'minimal', 'relboost': False, 'mapping': False, 'share_selected_features': 1.0, 't': 'numerical', 'w': 'unused_float', 'solar_radiation': 'unused_float', 'max_depth_feature_selector': 3, 'max_depth_predictor': 3, 'max_depth_rel_boost': 3, 'columns': {'weather': ['date', 'country', 'hour', 'month', 'weekday', 't', 's', 'w']}}"

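The traceback shows `_split_value` calling itself about a thousand times. A plausible cause (an assumption, since the function's source is not shown here) is that a value wider than the available width never shrinks between calls. The sketch below is not getml's implementation. It shows a hypothetical iterative splitter, `split_to_width`, that always places at least one item per line, so it terminates even when a single item exceeds `max_width`.

```python
def split_to_width(items, max_width):
    """Greedily pack items into lines of at most max_width characters.

    Hypothetical sketch; an oversized item gets a line of its own.
    """
    lines = []
    current = []
    width = 0
    for item in items:
        # Width of the item's repr plus a ", " separator.
        item_width = len(repr(item)) + 2
        # Start a new line only if the current one already holds something,
        # which guarantees progress for items wider than max_width.
        if current and width + item_width > max_width:
            lines.append(current)
            current, width = [], 0
        current.append(item)
        width += item_width
    if current:
        lines.append(current)
    return lines


print(split_to_width(["a" * 100, "b"], max_width=10))
```

A loop instead of recursion also keeps the stack depth constant regardless of how long the value is.
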
Tetouan city is a Moroccan city

Hello, while reading the README.md, I noticed a mistake. The original text reads: "Tetouan | Ten-minute electricity consumption of three different zones in Tetouan City, Mexico." But Tetouan is a Moroccan city, not a Mexican one.

A small issue, I know ;)

Accept list in add column function of DataFrame

getml.DataFrame.add should accept a list for the col argument.

Type annotations should also be added.

table.add(col=[ele for ele in table.pet_names], name="test", role="unused_string")

leads to

TypeError: 'col' must be a getml.data.Column or a value!
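
A sketch of what typed list handling might look like, under the assumption that a list is accepted only when it is homogeneous. The helper `infer_unused_role` is hypothetical and not part of getml. It only reuses the role names "unused_string" and "unused_float" that appear in these issues.

```python
from numbers import Real
from typing import Sequence, Union

Value = Union[str, float, int]


def infer_unused_role(values: Sequence[Value]) -> str:
    """Return a default role for a homogeneous list of values.

    Hypothetical helper illustrating type annotations for list input.
    """
    if len(values) == 0:
        raise ValueError("cannot infer a role from an empty sequence")
    if all(isinstance(v, str) for v in values):
        return "unused_string"
    # bool is a subclass of int, so exclude it explicitly.
    if all(isinstance(v, Real) and not isinstance(v, bool) for v in values):
        return "unused_float"
    raise TypeError("values must be all strings or all numbers")


print(infer_unused_role(["Rex", "Tom"]))
```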

XGBoostClassifier multiclass objective

Currently, the "objective" parameter of the XGBoostClassifier is limited to "reg:squarederror", "reg:tweedie", "reg:linear", "reg:logistic", "binary:logistic", and "binary:logitraw". These values are even enforced by validation code:

        if kkey == "objective":
            if not isinstance(parameters["objective"], str):
                raise TypeError("'objective' must be of type str")
            if parameters["objective"] not in [
                "reg:squarederror",
                "reg:tweedie",
                "reg:linear",
                "reg:logistic",
                "binary:logistic",
                "binary:logitraw",
            ]:
                raise ValueError(
                    """'objective' must either be 'reg:squarederror', """
                    """'reg:tweedie', 'reg:linear', 'reg:logistic', """
                    """'binary:logistic', or 'binary:logitraw'"""
                )

This code was clearly added when XGBoost already supported multiclass classification. So why can't I use an objective like "multi:softmax"? Or maybe there is some workaround for the multiclass classification?
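
One generic workaround for multiclass problems when only binary objectives are available is one-vs-rest: train one binary model per class and pick the class with the highest predicted probability. Whether this fits a given getml pipeline is an assumption. The sketch below only shows the label encoding and the final argmax in plain Python, and both function names are hypothetical.

```python
def one_vs_rest_targets(labels):
    """Build one binary target column (1.0 or 0.0) per distinct class."""
    classes = sorted(set(labels))
    return {
        cls: [1.0 if label == cls else 0.0 for label in labels]
        for cls in classes
    }


def predict_class(probabilities):
    """Pick, for each row, the class whose binary model scored highest.

    probabilities maps each class to its per-row predicted probabilities.
    """
    classes = list(probabilities)
    nrows = len(probabilities[classes[0]])
    return [
        max(classes, key=lambda cls: probabilities[cls][row])
        for row in range(nrows)
    ]


targets = one_vs_rest_targets(["a", "b", "c", "a"])
print(targets["a"])
```

Each binary column can then be fitted with an objective such as "binary:logistic", which the validation code above already allows.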
