thieu1995 / intelelm

IntelELM: A Python Framework for Intelligent Metaheuristic-based Extreme Learning Machine

Home Page: https://intelelm.readthedocs.io

License: GNU General Public License v3.0

Python 21.21% Jupyter Notebook 78.79%

classification-models elm evolutionary-computation extreme-learning-machine genetic-algorithm machine-learning metaheuristic-algorithms nature-inspired-algorithms neural-networks particle-swarm-optimization

intelelm's Introduction

IntelELM (Intelligent Metaheuristic-based Extreme Learning Machine) is a Python library that implements a framework for training Extreme Learning Machine (ELM) networks using Metaheuristic Algorithms. It provides a comparable alternative to the traditional ELM network and is compatible with the Scikit-Learn library. With IntelELM, you can perform searches and hyperparameter tuning using the functionalities provided by the Scikit-Learn library.

Free software: GNU General Public License (GPL) V3 license
Provided Estimator: ElmRegressor, ElmClassifier, MhaElmRegressor, MhaElmClassifier
Total Optimization-based ELM Regression: > 200 Models
Total Optimization-based ELM Classification: > 200 Models
Supported datasets: 54 (47 classifications and 7 regressions)
Supported performance metrics: >= 67 (47 regressions and 20 classifications)
Supported objective functions (as fitness functions or loss functions): >= 67 (47 regressions and 20 classifications)
Documentation: https://intelelm.readthedocs.io/en/latest/
Python versions: >= 3.7.x
Dependencies: numpy, scipy, scikit-learn, pandas, mealpy, permetrics

Citation Request

To understand how metaheuristics are applied to the Extreme Learning Machine, read the paper "A new workload prediction model using extreme learning machine and enhanced tug of war optimization". The paper can be accessed via its DOI: 10.1016/j.procs.2020.03.063.

Please include these citations if you plan to use this library:

@software{nguyen_van_thieu_2023_8249046,
  author       = {Nguyen Van Thieu},
  title        = {IntelELM: A Python Framework for Intelligent Metaheuristic-based Extreme Learning Machine},
  month        = aug,
  year         = 2023,
  publisher    = {Zenodo},
  doi          = {10.5281/zenodo.8249045},
  url          = {https://github.com/thieu1995/IntelELM}
}

@article{nguyen2020new,
  title={A new workload prediction model using extreme learning machine and enhanced tug of war optimization},
  author={Nguyen, Thieu and Hoang, Bao and Nguyen, Giang and Nguyen, Binh Minh},
  journal={Procedia Computer Science},
  volume={170},
  pages={362--369},
  year={2020},
  publisher={Elsevier},
  doi={10.1016/j.procs.2020.03.063}
}

@article{van2023mealpy,
  title={MEALPY: An open-source library for latest meta-heuristic algorithms in Python},
  author={Van Thieu, Nguyen and Mirjalili, Seyedali},
  journal={Journal of Systems Architecture},
  year={2023},
  publisher={Elsevier},
  doi={10.1016/j.sysarc.2023.102871}
}

Installation

Install the current PyPI release:

$ pip install intelelm

After installation, you can check the currently installed version of IntelELM:

$ python
>>> import intelelm
>>> intelelm.__version__

Tutorials

Tutorials can be found in the tutorials directory.
- Building a handwritten digits classifier using IntelELM
- Building a house price predictor using IntelELM

Examples

In this section, we will explore the usage of the IntelELM model with the assistance of a dataset. While all the preprocessing steps mentioned below can be replicated using Scikit-Learn, we have implemented some utility functions to provide users with convenience and faster usage.

#### Step 1: Importing the libraries
from intelelm import ElmRegressor, ElmClassifier, MhaElmRegressor, MhaElmClassifier, get_dataset

#### Step 2: Reading the dataset
data = get_dataset("aniso")

#### Step 3: Next, split dataset into train and test set
data.split_train_test(test_size=0.2, shuffle=True, random_state=100)

#### Step 4: Feature Scaling
data.X_train, scaler_X = data.scale(data.X_train, scaling_methods=("standard", "minmax"))
data.X_test = scaler_X.transform(data.X_test)

data.y_train, scaler_y = data.encode_label(data.y_train)   # This is for classification problem only
data.y_test = scaler_y.transform(data.y_test)

#### Step 5: Fitting ELM-based model to the dataset

##### 5.1: Use standard ELM model for regression problem
regressor = ElmRegressor(hidden_size=10, act_name="relu", seed=42)
regressor.fit(data.X_train, data.y_train)

##### 5.2: Use standard ELM model for classification problem 
classifier = ElmClassifier(hidden_size=10, act_name="tanh", seed=42)
classifier.fit(data.X_train, data.y_train)

##### 5.3: Use Metaheuristic-based ELM model for regression problem
print(MhaElmRegressor.SUPPORTED_OPTIMIZERS)
print(MhaElmRegressor.SUPPORTED_REG_OBJECTIVES)
opt_paras = {"name": "GA", "epoch": 10, "pop_size": 30}
regressor = MhaElmRegressor(hidden_size=10, act_name="elu", obj_name="RMSE", 
                            optimizer="BaseGA", optimizer_paras=opt_paras, seed=42)
regressor.fit(data.X_train, data.y_train)

##### 5.4: Use Metaheuristic-based ELM model for classification problem
print(MhaElmClassifier.SUPPORTED_OPTIMIZERS)
print(MhaElmClassifier.SUPPORTED_CLS_OBJECTIVES)
opt_paras = {"name": "GA", "epoch": 10, "pop_size": 30}
classifier = MhaElmClassifier(hidden_size=10, act_name="elu", obj_name="KLDL", 
                              optimizer="BaseGA", optimizer_paras=opt_paras, seed=42)
classifier.fit(data.X_train, data.y_train)

#### Step 6: Predicting a new result
y_pred = regressor.predict(data.X_test)

y_pred_cls = classifier.predict(data.X_test)
y_pred_label = scaler_y.inverse_transform(y_pred_cls)

#### Step 7: Calculate metrics using score or scores functions.
print("Try my AS metric with score function")   # AS (accuracy) is a classification metric
print(classifier.score(data.X_test, data.y_test, method="AS"))

print("Try my multiple metrics with scores function")
print(classifier.scores(data.X_test, data.y_test, list_methods=["AS", "PS", "F1S", "CEL", "BSL"]))

print("Try my evaluate functions")
print(regressor.evaluate(data.y_test, y_pred, list_metrics=("RMSE", "MAE", "MAPE", "NSE", "R2")))

#### Save results
regressor.save_loss_train(save_path="history", filename="loss_train.csv")
regressor.save_metrics(data.y_test, y_pred, list_metrics=("R2", "MAPE", "MAE", "MSE"), save_path="history", filename="metrics.csv")

A real-world dataset contains features that vary in magnitudes, units, and range. We would suggest performing normalization when the scale of a feature is irrelevant or misleading. Feature Scaling basically helps to normalize the data within a particular range.
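As a rough illustration of what the "standard" and "minmax" methods named in Step 4 do, the standard-library sketch below standardizes a feature column and then rescales it to [0, 1]. The helper names `standardize` and `min_max_scale` are hypothetical and are not part of IntelELM; the library's own `data.scale` may differ in its details.

```python
from statistics import fmean, pstdev


def standardize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mean = fmean(values)
    std = pstdev(values)
    if std == 0:  # constant feature: avoid division by zero
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def min_max_scale(values):
    """Map values linearly onto the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant feature: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Two hypothetical features on very different scales end up in the same range.
area_m2 = [50.0, 80.0, 120.0, 200.0]
rooms = [1.0, 2.0, 3.0, 5.0]
for column in (area_m2, rooms):
    print(min_max_scale(standardize(column)))
```

In practice the scaler should be fitted on the training split only and then applied to the test split, as Step 4 does with `scaler_X.transform`.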

Where do I find the supported metrics such as ["AS", "PS", "RS"] used above, and what do they mean? They are documented at https://github.com/thieu1995/permetrics, or you can print the supported names with:

from intelelm import MhaElmClassifier, MhaElmRegressor

print(MhaElmRegressor.SUPPORTED_REG_OBJECTIVES)
print(MhaElmClassifier.SUPPORTED_CLS_OBJECTIVES)
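For orientation, "AS", "PS", and "RS" are short names for the accuracy, precision, and recall scores. The sketch below computes them by hand on a tiny multi-class example. It assumes macro averaging and is not the permetrics implementation; the function name `macro_scores` is hypothetical.

```python
def macro_scores(y_true, y_pred):
    """Return accuracy plus macro-averaged precision and recall."""
    labels = sorted(set(y_true) | set(y_pred))
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    precisions, recalls = [], []
    for label in labels:
        # True positives: samples of this class that were predicted as this class.
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        actual = sum(t == label for t in y_true)
        precisions.append(tp / predicted if predicted else 0.0)
        recalls.append(tp / actual if actual else 0.0)
    return {
        "AS": accuracy,
        "PS": sum(precisions) / len(labels),
        "RS": sum(recalls) / len(labels),
    }


y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
print(macro_scores(y_true, y_pred))
```

For real evaluations, prefer the `score`, `scores`, and `evaluate` methods shown in Step 7, which delegate to permetrics.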

I got this type of error

raise ValueError("Existed at least one new label in y_pred.")
ValueError: Existed at least one new label in y_pred.

How to solve this?

This occurs only when you are working on a classification problem with a small dataset that has many classes. For instance, the "Zoo" dataset contains only 101 samples, but it has 7 classes. If you split the dataset into a training and testing set with a ratio of around 80% - 20%, there is a chance that one or more classes may appear in the testing set but not in the training set. As a result, when you calculate the performance metrics, you may encounter this error. You cannot predict or assign new data to a new label because you have no knowledge about the new label. There are several solutions to this problem.
1st: Use the SMOTE method to address imbalanced data and ensure that all classes have the same number of samples.

import pandas as pd
from imblearn.over_sampling import SMOTE
from intelelm import Data

dataset = pd.read_csv('examples/dataset.csv', index_col=0).values
X, y = dataset[:, 0:-1], dataset[:, -1]

X_new, y_new = SMOTE().fit_resample(X, y)
data = Data(X_new, y_new)

2nd: Use different random_state numbers in split_train_test() function.

import pandas as pd
from intelelm import Data

dataset = pd.read_csv('examples/dataset.csv', index_col=0).values
X, y = dataset[:, 0:-1], dataset[:, -1]
data = Data(X, y)
data.split_train_test(test_size=0.2, random_state=10)  # Try different random_state value
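To see why the error appears and why trying another random_state can help, the following standard-library sketch checks a split for labels that occur only in the test set and searches for a seed that avoids them. The functions `split_indices`, `unseen_labels`, and `find_safe_seed` are hypothetical helpers for illustration; they do not reproduce how `split_train_test()` shuffles the data.

```python
import random


def split_indices(n_samples, test_size, seed):
    """Shuffle indices with a seeded RNG and cut off the last part as the test set."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(n_samples * test_size))
    return indices[:-n_test], indices[-n_test:]


def unseen_labels(y_train, y_test):
    """Labels that occur in the test set but never in the training set."""
    return sorted(set(y_test) - set(y_train))


def find_safe_seed(y, test_size=0.2, max_tries=100):
    """Return the first seed whose split leaves no unseen test labels, else None."""
    for seed in range(max_tries):
        train_idx, test_idx = split_indices(len(y), test_size, seed)
        y_train = [y[i] for i in train_idx]
        y_test = [y[i] for i in test_idx]
        if not unseen_labels(y_train, y_test):
            return seed
    return None


# Small, imbalanced toy labels: class 2 has only two samples.
y = [0] * 10 + [1] * 8 + [2] * 2
print(unseen_labels([0, 1, 0], [1, 2]))  # class 2 was never seen in training
print(find_safe_seed(y))
```

A check like `unseen_labels` is cheap to run right after splitting, before any model is trained.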

When testing several algorithms based on Extreme Learning Machines (ELM), they all produce the same results. Even during the training process, the global best solution remains unchanged.

This issue was identified in versions <= v1.0.2, when the default values for the lower bound (lb) and upper bound (ub) were set to the narrow range of (-1, 1). This range proved to be too small, causing all algorithms to converge to local optima. The problem has been addressed in versions > v1.0.3, where the default range has been extended to (-10., 10.). You can also define your own lb and ub ranges depending on your problem.
In a traditional neural network such as an MLP, the weights (weights + biases) are typically initialized within the range of (-1., 1.). However, during training with gradient-based methods, these values are updated, and there are no strict bounds on them.
Meanwhile, in metaheuristic optimization, it's necessary to set boundaries for decision variables (weights) each time a new search agent is formed. Therefore, if you define a narrow range, your optimizer may converge more quickly, but it's more likely to get stuck in local optima (which explains why the global best value remains unchanged during training). Moreover, in some cases, there might not even be a global optimum within that narrow range. Conversely, if you set a wider range, the optimization process may be slower, and the global best value may change more gradually. In such cases, you might need to increase the number of epochs, perhaps up to 1000, for the optimizer to explore the solution space thoroughly.

opt_paras = {"name": "GA", "epoch": 30, "pop_size": 30}
model = MhaElmClassifier(hidden_size=10, act_name="elu", obj_name="KLDL", optimizer="BaseGA", 
                         optimizer_paras=opt_paras, verbose=True, seed=42)
model.fit(X_train, y_train, lb=(-10., ), ub=(10., ))
y_pred = model.predict(X_test)
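The effect of the bounds can be illustrated with a toy problem. The sketch below uses plain random search (not GA, and not IntelELM) on a one-dimensional loss whose minimum lies outside (-1, 1); the objective and the function names are assumptions made up for this illustration.

```python
import random


def loss(w):
    """Toy objective whose minimum sits at w = 5, outside the range (-1, 1)."""
    return (w - 5.0) ** 2


def random_search(lb, ub, n_samples=1000, seed=42):
    """Sample candidates uniformly inside [lb, ub] and keep the best loss."""
    rng = random.Random(seed)
    return min(loss(rng.uniform(lb, ub)) for _ in range(n_samples))


best_narrow = random_search(-1.0, 1.0)
best_wide = random_search(-10.0, 10.0)
print(f"best loss with bounds (-1, 1):   {best_narrow:.4f}")
print(f"best loss with bounds (-10, 10): {best_wide:.4f}")
```

With the narrow bounds, no candidate can do better than the loss at w = 1, so every optimizer reports nearly the same best value. The wider range contains the optimum but needs more samples (or epochs) to cover it.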

Support (questions, problems)

Official Links

Official source code repo: https://github.com/thieu1995/intelelm
Official document: https://intelelm.readthedocs.io/
Download releases: https://pypi.org/project/intelelm/
Issue tracker: https://github.com/thieu1995/intelelm/issues
Notable changes log: https://github.com/thieu1995/intelelm/blob/master/ChangeLog.md
Official chat group: https://t.me/+fRVCJGuGJg1mNDg1
This project is also related to our other projects on "optimization" and "machine learning".

intelelm's Issues

[FEAT]: Best solution ( Best parameter value)

Description

I want to know how to get the best parameter value that has been optimized by the metaheuristic model. Can you give an example?

Thanks

Additional Information

For example, in mealpy:
print(f"Best agent: {model.g_best}")
print(f"Best solution: {model.g_best.solution}")
print(f"Best accuracy: {model.g_best.target.fitness}")
print(f"Best parameters: {model.problem.decode_solution(model.g_best.solution)}")

get something wrong

Description

Dear thieu, I want to compare your code with a Kaggle example. I revised the code, but something goes wrong:

`import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler, LabelEncoder
from sklearn.neighbors import KNeighborsClassifier
from permetrics import ClassificationMetric
from intelelm import MhaElmClassifier
from sklearn.model_selection import KFold #for K-fold cross validation
from sklearn.model_selection import cross_val_score #score evaluation
from sklearn.model_selection import cross_val_predict #prediction
from sklearn.metrics import confusion_matrix #for confusion matrix
import matplotlib.pyplot as plt
import seaborn as sns

#Loading data
hazel_df = pd.read_csv("hazelnuts.txt", sep="\t", header=None)
hazel_df = hazel_df.transpose()
hazel_df.columns = ["sample_id", "length", "width", "thickness", "surface_area", "mass", "compactness",
"hardness", "shell_top_radius", "water_content", "carbohydrate_content", "variety"]
print(hazel_df.head())

all_features = hazel_df.drop(["variety","sample_id"],axis=1)
target_feature = hazel_df["variety"]
print(all_features.head())

#Dataset preprocessing
X = all_features.values.astype(float) # returns a numpy array of type float
y = target_feature.values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

scaler_X = MinMaxScaler()
scaler_y = LabelEncoder()

X_train = scaler_X.fit_transform(X_train)
X_test = scaler_X.transform(X_test)
y_train = scaler_y.fit_transform(y_train)
y_test = scaler_y.transform(y_test)

print(np.unique(y_train))
print(len(np.unique(y_test)))

# KNN
# model = KNeighborsClassifier(n_neighbors = 25)
# model.fit(X_train, y_train)
# y_pred = model.predict(X_test)
# cm = ClassificationMetric(y_test, y_pred, decimal=6)
# print("Results of KNN!")
# print(cm.get_metrics_by_list_names(["AS", "RS", "PS", "F1S"]))

# elm_multi_class

from sklearn import preprocessing
x = all_features.values.astype(float) #returns a numpy array of type float
min_max_scaler = preprocessing.MinMaxScaler()
x_scaled = min_max_scaler.fit_transform(x)
scaled_features = pd.DataFrame(x_scaled)
scaled_features.head()

opt_paras = {"name": "GA", "epoch": 100, "pop_size": 30}

# opt_paras = {"epoch": 100, "pop_size": 30, "pc": 0.9, "pm" : 0.05}

model = MhaElmClassifier(hidden_size=10, act_name="elu", obj_name="BSL", optimizer="BaseGA", optimizer_paras=opt_paras, verbose=False)
model.fit(X_train, y_train)
dt_ga = model.predict(X_test)
kfold = KFold(n_splits=10, random_state=22, shuffle=True) # k=10, split the data into 10 equal parts
result_ga=cross_val_score(model,scaled_features,target_feature,cv=kfold,scoring='accuracy')

# print('The overall score for K Nearest Neighbors Classifier is:',round(result_knn.mean()*100,2))

y_pred = cross_val_predict(model,scaled_features,target_feature,cv=10)
sns.heatmap(confusion_matrix(dt_ga,y_test),annot=True,cmap='summer')
plt.title('KNN Confusion_matrix')`

/Library/Python/3.9/site-packages/sklearn/preprocessing/_encoders.py:975: FutureWarning: `sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.
  warnings.warn(
Traceback (most recent call last):
  File "/Users/tangling/Downloads/IntelELM-main/examples/test.py", line 75, in <module>
    result_ga=cross_val_score(model,scaled_features,target_feature,cv=kfold,scoring='accuracy')
  File "/Library/Python/3.9/site-packages/sklearn/model_selection/_validation.py", line 562, in cross_val_score
    cv_results = cross_validate(
  File "/Library/Python/3.9/site-packages/sklearn/utils/_param_validation.py", line 211, in wrapper
    return func(*args, **kwargs)
  File "/Library/Python/3.9/site-packages/sklearn/model_selection/_validation.py", line 309, in cross_validate
    results = parallel(
  File "/Library/Python/3.9/site-packages/sklearn/utils/parallel.py", line 65, in __call__
    return super().__call__(iterable_with_config)
  File "/Library/Python/3.9/site-packages/joblib/parallel.py", line 1863, in __call__
    return output if self.return_generator else list(output)
  File "/Library/Python/3.9/site-packages/joblib/parallel.py", line 1789, in _get_sequential_output
    for func, args, kwargs in iterable:
  File "/Library/Python/3.9/site-packages/sklearn/utils/parallel.py", line 61, in <genexpr>
    iterable_with_config = (
  File "/Library/Python/3.9/site-packages/sklearn/model_selection/_validation.py", line 311, in <genexpr>
    clone(estimator),
  File "/Library/Python/3.9/site-packages/sklearn/base.py", line 75, in clone
    return estimator.__sklearn_clone__()
  File "/Library/Python/3.9/site-packages/sklearn/base.py", line 268, in __sklearn_clone__
    return _clone_parametrized(self)
  File "/Library/Python/3.9/site-packages/sklearn/base.py", line 110, in _clone_parametrized
    new_object = klass(**new_object_params)
  File "/Library/Python/3.9/site-packages/intelelm/model/mha_elm.py", line 220, in __init__
    super().__init__(hidden_size=hidden_size, act_name=act_name, obj_name=obj_name, optimizer=optimizer, optimizer_paras=optimizer_paras, verbose=verbose)
  File "/Library/Python/3.9/site-packages/intelelm/base_elm.py", line 386, in __init__
    self.optimizer = self._set_optimizer(optimizer, optimizer_paras)
  File "/Library/Python/3.9/site-packages/intelelm/base_elm.py", line 400, in _set_optimizer
    return optimizer.set_parameters(optimizer_paras)
  File "/Users/tangling/Library/Python/3.9/lib/python/site-packages/mealpy/optimizer.py", line 92, in set_parameters
    raise ValueError(f"Invalid input parameters: {new_para_names} for {self.get_name()} optimizer. "
ValueError: Invalid input parameters: {'name', 'epoch', 'pop_size'} for GA optimizer. Valid parameters are: {'pc', 'epoch', 'pm', 'pop_size'}.

I followed the message and changed the parameters to opt_paras = {"epoch": 100, "pop_size": 30, "pc": 0.9, "pm" : 0.05},
but this raises another error:
/Library/Python/3.9/site-packages/sklearn/preprocessing/_encoders.py:975: FutureWarning: `sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.
  warnings.warn(
Traceback (most recent call last):
  File "/Users/tangling/Downloads/IntelELM-main/examples/test.py", line 75, in <module>
    result_ga=cross_val_score(model,scaled_features,target_feature,cv=kfold,scoring='accuracy')
  File "/Library/Python/3.9/site-packages/sklearn/model_selection/_validation.py", line 562, in cross_val_score
    cv_results = cross_validate(
  File "/Library/Python/3.9/site-packages/sklearn/utils/_param_validation.py", line 211, in wrapper
    return func(*args, **kwargs)
  File "/Library/Python/3.9/site-packages/sklearn/model_selection/_validation.py", line 309, in cross_validate
    results = parallel(
  File "/Library/Python/3.9/site-packages/sklearn/utils/parallel.py", line 65, in __call__
    return super().__call__(iterable_with_config)
  File "/Library/Python/3.9/site-packages/joblib/parallel.py", line 1863, in __call__
    return output if self.return_generator else list(output)
  File "/Library/Python/3.9/site-packages/joblib/parallel.py", line 1789, in _get_sequential_output
    for func, args, kwargs in iterable:
  File "/Library/Python/3.9/site-packages/sklearn/utils/parallel.py", line 61, in <genexpr>
    iterable_with_config = (
  File "/Library/Python/3.9/site-packages/sklearn/model_selection/_validation.py", line 311, in <genexpr>
    clone(estimator),
  File "/Library/Python/3.9/site-packages/sklearn/base.py", line 75, in clone
    return estimator.__sklearn_clone__()
  File "/Library/Python/3.9/site-packages/sklearn/base.py", line 268, in __sklearn_clone__
    return _clone_parametrized(self)
  File "/Library/Python/3.9/site-packages/sklearn/base.py", line 123, in _clone_parametrized
    raise RuntimeError(
RuntimeError: Cannot clone object MhaElmClassifier(obj_name='BSL', optimizer=<mealpy.evolutionary_based.GA.BaseGA object at 0x16aefc790>, optimizer_paras={'epoch': 100, 'pc': 0.9, 'pm': 0.05, 'pop_size': 30}), as the constructor either does not set or modifies parameter optimizer
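The last line of this traceback suggests that the estimator's constructor replaced the `optimizer` argument (the string "BaseGA") with an optimizer object, which breaks the cloning step that `cross_val_score` relies on. The sketch below is a simplified, standard-library imitation of that check, not scikit-learn's actual code; `BadEstimator`, `GoodEstimator`, and `simple_clone` are hypothetical names used only to show the pattern.

```python
import inspect


class BadEstimator:
    """Replaces the `optimizer` argument with a new object inside __init__."""

    def __init__(self, optimizer="BaseGA"):
        self.optimizer = {"name": optimizer}  # modified: no longer the value passed in


class GoodEstimator:
    """Stores the argument untouched and builds the real object later, in fit()."""

    def __init__(self, optimizer="BaseGA"):
        self.optimizer = optimizer

    def fit(self):
        self.optimizer_ = {"name": self.optimizer}  # derived state, built at fit time
        return self


def simple_clone(estimator):
    """Rebuild an estimator from its constructor parameters and verify them."""
    names = inspect.signature(type(estimator).__init__).parameters
    params = {n: getattr(estimator, n) for n in names if n != "self"}
    new_object = type(estimator)(**params)
    for name, value in params.items():
        # The new object must expose exactly the value that was passed in.
        if getattr(new_object, name) is not value:
            raise RuntimeError(f"constructor does not set or modifies parameter {name}")
    return new_object


simple_clone(GoodEstimator().fit())  # works
try:
    simple_clone(BadEstimator())
except RuntimeError as exc:
    print(f"clone failed: {exc}")
```

Whether a given IntelELM release is affected depends on its version, so checking the change log or upgrading is a reasonable first step before working around the error.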

Additional Information

No response

[BUG]: TypeError: __init__() missing 1 required positional argument: 'bounds'

Description of the bug

When I try fitting the MhaElmRegressor model, I keep getting this error: TypeError: __init__() missing 1 required positional argument: 'bounds'

Steps To Reproduce

opt_paras = {"name": "GA", "epoch": 10, "pop_size": 30}
regressor = MhaElmRegressor(hidden_size=10, act_name="elu", obj_name="RMSE", optimizer="BaseGA", optimizer_paras=opt_paras)
regressor.fit(X_train_scaled, y_train)

Additional Information

No response
