makgyver / gossipy
Python module for simulating gossip learning.
License: Apache License 2.0
Hi, I'm Parsa, a master's student at the University of Semnan.
I am trying to get gossipy (particularly main_onoszko_2021.py) to run in Google Colab. Although it runs successfully, training takes a long time and GPU utilization stays at 0, even though CUDA is available to PyTorch and the GPU is enabled in Colab. Let me know if you want to take a look at my code; I can email it to you.
By the way, I have already tried workarounds like numba's @jit(target_backend='cuda'). If you can point me in the right direction, perhaps by telling me how you managed to get it to use the GPU, I would be grateful.
I recently attempted to bring TPU support to gossipy. However, the TPU run estimates 4 hours and 30 minutes to complete my simulation, significantly slower than the GPU training time of approximately 45 minutes. I was wondering if you could make it work.
I have read that TPUs are supposed to be significantly faster, but my attempt does not reflect that!
Here is how I changed things.
First, install torch_xla:
!pip install https://storage.googleapis.com/tpu-pytorch/wheels/colab/torch_xla-2.0-cp310-cp310-linux_x86_64.whl
import torch
import torch_xla.core.xla_model as xm

class GlobalSettings(metaclass=Singleton):
    """Global settings for the library."""

    _device = 'cpu'

    def auto_device(self) -> torch.device:
        """Set device to TPU if available, otherwise CUDA if available, otherwise CPU.

        Returns
        -------
        torch.device
            The device.
        """
        try:
            # torch_xla exposes no "device exists" predicate;
            # xm.xla_device() raises if no XLA device is available.
            self._device = xm.xla_device()
        except RuntimeError:
            if torch.cuda.is_available():
                self._device = torch.device('cuda')
            else:
                self._device = torch.device('cpu')
        return self._device

    def set_device(self, device_name: str) -> torch.device:
        """Set the device.

        Parameters
        ----------
        device_name: name of the device to set (possible values are 'auto', 'tpu', 'cuda', and 'cpu').
            When device_name is 'auto', the best available device is chosen (TPU, then CUDA, then CPU).

        Returns
        -------
        torch.device
            The device.
        """
        if device_name == "auto":
            return GlobalSettings().auto_device()
        elif device_name == "tpu":
            self._device = xm.xla_device()
        else:
            self._device = torch.device(device_name)
        return self._device

    def get_device(self):
        """Get the device.

        Returns
        -------
        torch.device
            The device.
        """
        return self._device
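For readers unfamiliar with the metaclass=Singleton pattern used above, here is a minimal, library-free sketch (this Singleton metaclass is an assumption about how gossipy implements it, not a copy of its code):

```python
class Singleton(type):
    """Metaclass that makes every class using it return one shared instance."""
    _instances = {}

    def __call__(cls, *args, **kwargs):
        # Create the instance only on the first call; reuse it afterwards.
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]


class GlobalSettings(metaclass=Singleton):
    """Toy stand-in for gossipy's GlobalSettings."""
    def __init__(self):
        self._device = 'cpu'


a = GlobalSettings()
b = GlobalSettings()
print(a is b)  # True: GlobalSettings() always yields the same object
```

This is why mutating `self._device` through any handle to GlobalSettings is visible everywhere in the library.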
Hello,
I'm new to machine learning, but it seems to me that the implementation of LogisticRegression is the same as linear regression; it's simply a linear transformation: https://github.com/makgyver/gossipy/blob/main/gossipy/model/nn.py#L166
I would have expected a sigmoid transformation (with the torch.sigmoid
function), for example as in this tutorial: https://towardsdatascience.com/logistic-regression-with-pytorch-3c8bbea594be
Is it a bug or did I miss something?
Thanks,
Mohamed
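To make the difference concrete, here is a tiny library-free sketch (weights and inputs are made up for illustration): a linear model outputs a raw score z, while logistic regression passes z through a sigmoid to get a probability. One common reason the sigmoid is omitted from a model's forward pass is that the training loss (e.g. PyTorch's BCEWithLogitsLoss) applies it internally for numerical stability; whether that is the case in gossipy's nn.py is for the author to confirm.

```python
import math

def linear(w, b, x):
    # Raw linear score: w . x + b (this is all a linear model outputs)
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def sigmoid(z):
    # Logistic regression squashes the score into a probability in (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

w, b = [0.5, -0.25], 0.1   # illustrative parameters
x = [2.0, 4.0]             # illustrative input

z = linear(w, b, x)        # 0.5*2.0 - 0.25*4.0 + 0.1 = 0.1
p = sigmoid(z)             # ~0.525: probability of the positive class
print(z, p)
```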
Hello,
I was curious to know why you implemented all the machine learning models in PyTorch instead of another library like scikit-learn. Is it because it's easier to have all the models implemented in the same way? Do you have plans to be compatible with other machine learning libraries in the future?
Thanks,
Mohamed Amine
I know that it doesn't make sense to create a network with one node, but I did it to compare results with a centralized solution.
This is the code used (adapted from main_ormandi.py):
from gossipy import set_seed
from gossipy.core import AntiEntropyProtocol, CreateModelMode, StaticP2PNetwork, UniformDelay
from gossipy.node import GossipNode
from gossipy.model.handler import PegasosHandler
from gossipy.model.nn import AdaLine
from gossipy.data import load_classification_dataset, DataDispatcher
from gossipy.data.handler import ClassificationDataHandler
from gossipy.simul import GossipSimulator, SimulationReport
from gossipy.utils import plot_evaluation
set_seed(42)
X, y = load_classification_dataset("spambase", as_tensor=True)
y = 2*y - 1 #convert 0/1 labels to -1/1
data_handler = ClassificationDataHandler(X, y, test_size=.1)
data_dispatcher = DataDispatcher(data_handler, 1, eval_on_user=False, auto_assign=True)
topology = StaticP2PNetwork(1, None)
model_handler = PegasosHandler(net=AdaLine(data_handler.size(1)),
                               learning_rate=.01,
                               create_model_mode=CreateModelMode.MERGE_UPDATE)
# For loop to repeat the simulation
nodes = GossipNode.generate(data_dispatcher=data_dispatcher,
                            p2p_net=topology,
                            model_proto=model_handler,
                            round_len=100,
                            sync=False)
simulator = GossipSimulator(
    nodes=nodes,
    data_dispatcher=data_dispatcher,
    delta=100,
    protocol=AntiEntropyProtocol.PUSH,
    delay=UniformDelay(0, 10),
    online_prob=.2,  # Approximates the average online rate of the STUNner smartphone traces
    drop_prob=.1,    # Simulates the possibility of message dropping
    sampling_eval=.1
)
report = SimulationReport()
simulator.add_receiver(report)
simulator.init_nodes(seed=42)
simulator.start(n_rounds=100)
plot_evaluation([[ev for _, ev in report.get_evaluation(False)]], "Overall test results")
And the results I get back are the following:
# INFO # Sent messages: 95 simul.py:248
# INFO # Failed messages: 79 simul.py:249
# INFO Total size: 5415 simul.py:250
# INFO accuracy: 0.62 utils.py:181
# INFO precision: 0.62 utils.py:181
# INFO recall: 0.62 utils.py:181
# INFO f1_score: 0.61 utils.py:181
# INFO auc: 0.66 utils.py:181
In the algorithm from the paper (ormandi2013), a node waits to receive a model before updating it on its data, so with a single node the model never learns. This is a bit sad but expected. The part that is very strange to me is the number of "failed" messages:
I would have supposed that all messages would fail, since there are no other nodes in the network.
What do you think? Is this expected or a bug?
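For what it's worth, the failure count looks roughly consistent with the simulation parameters rather than with the topology. Assuming a message counts as failed when the receiver is offline (online_prob=.2) or the message is dropped in transit (drop_prob=.1) — an assumption about the simulator's semantics, not something verified in the code — the expected failure rate would be:

```python
online_prob = 0.2   # probability the receiver is online
drop_prob = 0.1     # probability a message is dropped in transit

# A send succeeds only if the receiver is online AND the message is not dropped.
p_fail = 1 - online_prob * (1 - drop_prob)
observed = 79 / 95  # failed / sent, from the log above

print(p_fail)              # 0.82
print(round(observed, 2))  # 0.83
```

Under that reading, 79 failures out of 95 sends would be expected even in a network with other nodes, so not all failures come from the single-node topology.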
Hello,
I'm a PhD student at Sorbonne University and I've started to look at gossip learning. I've just discovered your simulator and I think I will use it for my work.
If I add new protocols or algorithms to the simulator, would you be open to me contributing them to the project as pull requests?
Thanks,
Mohamed Amine LEGHERABA
Hi,
First thank you for your good work.
I have a question regarding the class TorchModelPartition implemented here:
gossipy/gossipy/model/sampling.py
Lines 110 to 234 in 5ae94ab
The method _check checks whether a tensor has dimension 3 or more, and _partition assumes that tensors have dimension at most 3 (since _check is called first). Why this limitation? It looks like the code of _partition could be extended to support tensors of dimension 4, but maybe I am missing something?
This limitation currently prevents us from using, for example, the LeNet-5 implementation here: https://github.com/lychengrex/LeNet-5-Implementation-Using-Pytorch/blob/master/LeNet-5%20Implementation%20Using%20Pytorch.ipynb
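As a sanity check on the idea, partitioning generalizes to any tensor rank once parameters are flattened to a single index range. Here is a minimal library-free sketch that splits a flat parameter index range into contiguous, near-equal shards (this illustrates the concept only; it is not gossipy's _partition logic):

```python
def partition_indices(n_params, n_parts):
    """Split range(n_params) into n_parts contiguous, near-equal shards."""
    base, extra = divmod(n_params, n_parts)
    shards, start = [], 0
    for i in range(n_parts):
        size = base + (1 if i < extra else 0)  # spread the remainder evenly
        shards.append(list(range(start, start + size)))
        start += size
    return shards

# e.g. 10 parameters over 3 shards -> sizes 4, 3, 3
shards = partition_indices(10, 3)
print([len(s) for s in shards])  # [4, 3, 3]
```

A 4-D convolution kernel flattened this way would shard like any other parameter vector, which is why the rank-3 cap looks like an implementation choice rather than a fundamental constraint.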
Sincerely,