
bittensor's Introduction

Bittensor



Internet-scale Neural Networks

Discord • Network • Research

Bittensor is a mining network, similar to Bitcoin, with built-in incentives designed to encourage computers to provide access to machine learning models in an efficient and censorship-resistant manner. These models can be queried by users seeking outputs from the network, for instance: generating text, audio, and images, or extracting numerical representations of these input types. Under the hood, Bittensor's economic market is facilitated by a blockchain token mechanism through which producers (miners) and the verifiers of the work done by those miners (validators) are rewarded. Miners host, train, or otherwise procure machine learning systems that serve the network, fulfilling the verification problems defined by the validators, such as the ability to generate responses to prompts, e.g. “What is the capital of Texas?”

The token-based mechanism under which miners are incentivized ensures that they are constantly driven to make their knowledge output more useful in terms of speed, intelligence, and diversity. The value generated by the network is distributed directly to the individuals producing that value, without intermediaries. Anyone can participate in this endeavour, extract value from the network, and govern Bittensor. The network is open to all participants, and no individual or group has full control over what is learned, who can profit from it, or who can access it.

To learn more about Bittensor, please read our paper.

Install

There are four ways to install Bittensor:

  1. Through the installer:
$ /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/opentensor/bittensor/master/scripts/install.sh)"
  2. With pip:
$ pip3 install bittensor
  3. From source:
$ git clone https://github.com/opentensor/bittensor.git
$ python3 -m pip install -e bittensor/
  4. Using Conda (recommended for Apple M1):
$ conda env create -f ~/.bittensor/bittensor/scripts/environments/apple_m1_environment.yml
$ conda activate bittensor

To test your installation, type:

$ btcli --help

or using Python:

import bittensor

CUDA

If you anticipate using PoW registration for subnets or the faucet (only available on staging), please also install cubit for your version of Python. You can find the Opentensor cubit implementation and instructions here.

For example, with Python 3.10:

pip install https://github.com/opentensor/cubit/releases/download/v1.1.2/cubit-1.1.2-cp310-cp310-linux_x86_64.whl

Wallets

Wallets are the core ownership and identity technology around which all functions on Bittensor are carried out. A Bittensor wallet consists of a coldkey and a hotkey, where the coldkey may contain many hotkeys, while each hotkey can belong to only a single coldkey. Coldkeys store funds securely and perform operations such as transfers and staking, while hotkeys are used for all online operations such as signing queries, running miners, and validating.

Wallets can be created in two ways.

  1. Using the Python API:
import bittensor
wallet = bittensor.wallet()
wallet.create_new_coldkey()
wallet.create_new_hotkey()
print (wallet)
"Wallet (default, default, ~/.bittensor/wallets/)"
  2. Using btcli:

Use the subcommand wallet or its alias w:

$ btcli wallet new_coldkey
    Enter wallet name (default):      

    IMPORTANT: Store this mnemonic in a secure (preferably offline place), as anyone who has possession of this mnemonic can use it to regenerate the key and access your tokens. 
    The mnemonic to the new coldkey is:
    **** *** **** **** ***** **** *** **** **** **** ***** *****
    You can use the mnemonic to recreate the key in case it gets lost. The command to use to regenerate the key using this mnemonic is:
    btcli w regen_coldkey --mnemonic post maid erode shy captain verify scan shoulder brisk mountain pelican elbow

$ btcli wallet new_hotkey
    Enter wallet name (default): d1
    Enter hotkey name (default): 

    IMPORTANT: Store this mnemonic in a secure (preferably offline place), as anyone who has possession of this mnemonic can use it to regenerate the key and access your tokens. 
    The mnemonic to the new hotkey is:
    **** *** **** **** ***** **** *** **** **** **** ***** *****
    You can use the mnemonic to recreate the key in case it gets lost. The command to use to regenerate the key using this mnemonic is:
    btcli w regen_hotkey --mnemonic total steak hour bird hedgehog trim timber can friend dry worry text

In both cases you can view your keys by navigating to ~/.bittensor/wallets or by running btcli wallet list:

$ tree ~/.bittensor/
    .bittensor/                 # Bittensor, root directory.
        wallets/                # The folder containing all bittensor wallets.
            default/            # The name of your wallet, "default"
                coldkey         # Your encrypted coldkey.
                coldkeypub.txt  # Your coldkey public address
                hotkeys/        # The folder containing all of your hotkeys.
                    default     # Your unencrypted hotkey information.

Your default wallet, Wallet (default, default, ~/.bittensor/wallets/), is always used unless you specify otherwise. Be sure to store your mnemonics safely. If you lose the password to your wallet, or access to the machine where the wallet is stored, you can always regenerate the coldkey using the mnemonic you saved above:

$ btcli wallet regen_coldkey --mnemonic **** *** **** **** ***** **** *** **** **** **** ***** *****

Using the CLI

The Bittensor command line interface (btcli) is the primary command line tool for interacting with the Bittensor network. It can be used to deploy nodes, manage wallets, stake/unstake, nominate, transfer tokens, and more.

Basic Usage

To get the list of all the available commands and their descriptions, you can use:

btcli --help

usage: btcli <command> <command args>

bittensor cli v{bittensor.__version__}

commands:
  subnets (s, subnet) - Commands for managing and viewing subnetworks.
  root (r, roots) - Commands for managing and viewing the root network.
  wallet (w, wallets) - Commands for managing and viewing wallets.
  stake (st, stakes) - Commands for staking and removing stake from hotkey accounts.
  sudo (su, sudos) - Commands for subnet management.
  legacy (l) - Miscellaneous commands.

Example Commands

Viewing Senate Proposals

btcli root proposals

Viewing Senate Members

btcli root list_delegates

Viewing Proposal Votes

btcli root senate_vote --proposal=[PROPOSAL_HASH]

Registering for Senate

btcli root register

Leaving Senate

btcli root undelegate

Voting in Senate

btcli root senate_vote --proposal=[PROPOSAL_HASH]

Miscellaneous Commands

btcli legacy update
btcli legacy faucet

Managing Subnets

btcli subnets list
btcli subnets create

Managing Wallets

btcli wallet list
btcli wallet transfer

Note

Please replace the subcommands and arguments as necessary to suit your needs, and always refer to btcli --help or btcli <command> --help for the most up-to-date and accurate information.

For example:

btcli subnets --help

usage: btcli <command> <command args> subnets [-h] {list,metagraph,lock_cost,create,register,pow_register,hyperparameters} ...

positional arguments:
  {list,metagraph,lock_cost,create,register,pow_register,hyperparameters}
                        Commands for managing and viewing subnetworks.
    list                List all subnets on the network.
    metagraph           View a subnet metagraph information.
    lock_cost           Return the lock cost to register a subnet.
    create              Create a new bittensor subnetwork on this chain.
    register            Register a wallet to a network.
    pow_register        Register a wallet to a network using PoW.
    hyperparameters     View subnet hyperparameters.

options:
  -h, --help            show this help message and exit

Post-Installation Steps

To enable autocompletion for Bittensor CLI, run the following commands:

btcli --print-completion bash >> ~/.bashrc  # For Bash
btcli --print-completion zsh >> ~/.zshrc    # For Zsh
source ~/.bashrc  # Reload Bash configuration to take effect

The Bittensor Package

The bittensor package contains data structures for interacting with the bittensor ecosystem, writing miners and validators, and querying the network. Additionally, it provides many utilities for efficient serialization of Tensors over the wire, performing data analysis of the network, and other useful tasks.

In the 7.0.0 release, we removed torch from the default dependencies. However, you can still use torch by setting the environment variable USE_TORCH=1 and making sure that you have installed the torch library. You can install torch by running pip install bittensor[torch] (if installing via PyPI), or pip install -e ".[torch]" (if installing from source). We will not be adding any new torch-based functionality.
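For example, to opt back into the torch code paths (a minimal sketch; the variable must be set before bittensor is imported):

import os
os.environ["USE_TORCH"] = "1"  # Must be set before the first bittensor import.
import bittensor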

Wallet: An interface over locally stored bittensor coldkey and hotkey wallets.

import bittensor
# Bittensor's wallet maintenance class.
wallet = bittensor.wallet()
# Access the hotkey.
wallet.hotkey
# Access the coldkey (requires decryption).
wallet.coldkey
# Sign data with the keypair.
data = b"a message to sign"
wallet.coldkey.sign( data )
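Continuing from the snippet above, a signature can also be checked against the signing key. A minimal sketch, assuming the keypairs expose the substrate-style verify method:

# Sign with the hotkey and verify the signature against its public key.
signature = wallet.hotkey.sign( data )
assert wallet.hotkey.verify( data, signature )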

Subtensor: Interfaces with bittensor's blockchain and can perform operations like extracting state information or sending transactions.

import bittensor
# Bittensor's chain interface.
subtensor = bittensor.subtensor() 
# Get the chain block
subtensor.get_current_block()
# Transfer Tao to a destination address.
subtensor.transfer( wallet = wallet, dest = "xxxxxxx..xxxxx", amount = 10.0)
# Register a wallet onto a subnetwork
subtensor.register( wallet = wallet, netuid = 1 )
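Reads of chain state follow the same pattern. A usage sketch; that get_balance takes an ss58 address is an assumption worth checking against your installed version:

# Read the free balance of the wallet's coldkey address.
balance = subtensor.get_balance( wallet.coldkeypub.ss58_address )
print ( balance )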

Metagraph: Encapsulates the chain state of a particular subnetwork at a specific block.

import bittensor
# Bittensor's chain state object.
metagraph = bittensor.metagraph( netuid = 1 ) 
# Resync the graph with the most recent chain state
metagraph.sync()
# Get the list of stake values
print ( metagraph.S )
# Get endpoint information for the entire subnetwork
print ( metagraph.axons )
# Get the hotkey information for the miner in the 10th slot
print ( metagraph.hotkeys[ 10 ] )
# Sync the metagraph at another block
metagraph.sync( block = 100000 )
# Save the metagraph
metagraph.save()
# Load the same
metagraph.load()
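Simple analyses on top of the synced state are just array operations. A sketch, assuming metagraph.S behaves like a numpy array (the default once torch is disabled):

import numpy as np
# Find the five highest-staked UIDs in the subnetwork.
stakes = np.asarray( metagraph.S )
top_uids = np.argsort( stakes )[ -5: ]
print ( top_uids, stakes[ top_uids ] )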

Synapse: Defines the protocol between axon servers and dendrite clients.

import pydantic
import torch
import bittensor

class Topk( bittensor.Synapse ):
    topk: int = 2  # Number of "top" elements to select
    input: bittensor.Tensor = pydantic.Field(..., frozen=True)  # Ensure that input cannot be set on the server side. 
    v: bittensor.Tensor = None
    i: bittensor.Tensor = None

def topk( synapse: Topk ) -> Topk:
    v, i = torch.topk( synapse.input.deserialize(), k = synapse.topk ) 
    synapse.v = bittensor.Tensor.serialize( v )
    synapse.i = bittensor.Tensor.serialize( i )
    return synapse

# Attach the forward function to the axon and start.
axon = bittensor.axon().attach( topk ).start()
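Querying this server from a client could look like the sketch below. It reuses the serialization call from the handler above; the await must run inside an event loop, and the exact serialization signature may differ between versions:

# Client-side sketch: send a Topk request to the axon started above.
d = bittensor.dendrite( wallet = bittensor.wallet() )
request = Topk( input = bittensor.Tensor.serialize( torch.rand( 10 ) ) )
response = await d( axon, request )
print ( response.v.deserialize(), response.i.deserialize() )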

Axon: Serves Synapse protocols with custom blacklist, priority and verify functions.

import bittensor

class MySynapse( bittensor.Synapse ):
    input: int = 1
    output: int = None

# Define a custom request forwarding function
def forward_my_synapse( synapse: MySynapse ) -> MySynapse:
    # Apply custom logic to synapse and return it
    synapse.output = 2
    return synapse

# Define a custom request verification function
def verify_my_synapse( synapse: MySynapse ):
    # Apply custom verification logic to synapse.
    # Optionally raise an Exception.
    pass

# Define a custom request blacklist function
def blacklist_my_synapse( synapse: MySynapse ) -> bool:
    # Apply custom blacklist logic.
    # Return True if blacklisted, False otherwise.
    return False

# Define a custom request priority function
def prioritize_my_synapse( synapse: MySynapse ) -> float:
    # Apply custom priority.
    return 1.0

# Initialize an Axon object with a custom configuration (my_config is a placeholder for your own bittensor.config).
my_wallet = bittensor.wallet()
my_axon = bittensor.axon(config=my_config, wallet=my_wallet, port=9090, ip="192.0.2.0", external_ip="203.0.113.0", external_port=7070)

# Attach the endpoint with the specified verification and forwarding functions  
my_axon.attach(
    forward_fn = forward_my_synapse,
    verify_fn = verify_my_synapse,
    blacklist_fn = blacklist_my_synapse,
    priority_fn = prioritize_my_synapse
).start()

Dendrite: A network client used to send requests to axon endpoints and receive their responses.

Example:

import bittensor

d = bittensor.dendrite( wallet = bittensor.wallet() )
# Ping an axon endpoint.
await d( <axon> )
# Ping multiple axon endpoints.
await d( [<axons>] )
# Send a custom synapse request to an axon.
await d( bittensor.axon(), bittensor.Synapse() )
# Query all axons in a metagraph.
await d( metagraph.axons, bittensor.Synapse() )
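Because dendrite calls are coroutines, they need a running event loop. A self-contained sketch that pings a default local axon:

import asyncio
import bittensor

async def main():
    d = bittensor.dendrite( wallet = bittensor.wallet() )
    # Ping a default axon with the default synapse.
    response = await d( bittensor.axon(), bittensor.Synapse() )
    print ( response )

asyncio.run( main() )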

Setting weights on the root network

Use the root subcommand to set weights on the network across subnets:

btcli root weights --wallet.name <coldname> --wallet.hotkey <hotname>
Enter netuids (e.g. 0, 1, 2 ...):
# Here enter your selected netuids to set weights on
1, 2

Enter weights (e.g. 0.09, 0.09, 0.09 ...):
# These do not need to sum to 1, we do normalization on the backend.
# Values must be > 0
0.5, 10

Normalized weights: 
        tensor([ 0.5000, 10.0000]) -> tensor([0.0476, 0.9524])

Do you want to set the following root weights?:
  weights: tensor([0.0476, 0.9524])
  uids: tensor([1, 2])? [y/n]: 
y

⠏ 📡 Setting root weights on test ...
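The backend normalization shown in the transcript is a straight division by the sum of the entered weights:

import numpy as np
weights = np.array( [ 0.5, 10.0 ] )
print ( weights / weights.sum() )  # -> [0.0476 0.9524], matching the transcript above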

Bittensor Subnets API

This guide provides instructions on how to extend the Bittensor Subnets API, a powerful interface for interacting with the Bittensor network across subnets. The Bittensor Subnets API facilitates querying across any subnet that has exposed API endpoints, unlocking the utility of the decentralized network.

The Bittensor Subnets API consists of abstract classes and a registry system to dynamically handle API interactions. It allows developers to implement custom logic for storing and retrieving data, while also providing a straightforward way for end users to interact with these functionalities.

Core Components

  • APIRegistry: A central registry that manages API handlers. It allows for dynamic retrieval of handlers based on keys.
  • SubnetsAPI (Abstract Base Class): Defines the structure for API implementations, including methods for querying the network and processing responses.
  • StoreUserAPI & RetrieveUserAPI: Concrete implementations of the SubnetsAPI for storing and retrieving user data.

Implementing Custom Subnet APIs

To integrate an API into your subnet, implement your own subclass of bittensor.SubnetsAPI:

  1. Inherit from SubnetsAPI: Your class should inherit from the SubnetsAPI abstract base class.

  2. Implement Required Methods: Implement the prepare_synapse and process_responses abstract methods with your custom logic.

That's it! For example:

import bittensor

class CustomSubnetAPI(bittensor.SubnetsAPI):
    def __init__(self, wallet: "bittensor.wallet"):
        super().__init__(wallet)
        # Custom initialization here

    def prepare_synapse(self, *args, **kwargs):
        # Custom synapse preparation logic
        pass

    def process_responses(self, responses):
        # Custom response processing logic
        pass
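Once defined, the subclass is instantiated like any other client. A hypothetical usage sketch; how the query is ultimately dispatched depends on the SubnetsAPI base class:

import bittensor

wallet = bittensor.wallet()
api = CustomSubnetAPI( wallet )
# prepare_synapse and process_responses are invoked by the base class around the network query.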

Release

The release manager should follow the instructions of the RELEASE_GUIDELINES.md document.

Contributions

Please review the contributing guide for more information before making a pull request.

License

The MIT License (MIT) Copyright © 2021 Yuma Rao

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Acknowledgments

learning-at-home/hivemind

bittensor's People

Contributors

0xshreyash, adriansmares, bansal19, brueningf, camfairchild, dgagn, eduardogr, eugene-hu, gus-opentensor, ibraheem-opentensor, ifrit98, isabella618033, ivanezeigbo, joeylegere, kmfoda, mjurbanski-reef, ndubuisx, olzhasar-reef, opendansor, opentaco, orriin, parall4x, rajkaramchedu, robertalanm, romanch-ot, saqib-codes-11, shibshib, thewhaleking, unconst, viktorthink


bittensor's Issues

Prepare expansion of PoA network

We currently have a validator set of 6 validators, running quite centrally. In accordance with the Roadmap, we will expand this to 18 validators, which will be spread out over different trusted entities.

This step requires some preparation and planning:

  • We need to find out technically how to add validators to the set.
  • We need to find trusted entities and coordinate with them to have them run a miner.
  • We need to coordinate a secure channel through which the validators' keys can be communicated.
  • A document should be drafted outlining the server config.
  • Ideally, an install script setting everything up should be made.

CIFAR-10 breaks after 8th epoch

It appears that CIFAR-10 breaks down after the 8th epoch with a generic error:

2020-09-24 15:53:07.614 | ERROR | __main__:main:322 - There were no tensor arguments to this function (e.g., you passed an empty list of Tensors), but no fallback function is registered for schema aten::_cat. This usually means that this function requires a non-empty list of Tensors. Available functions are [CPU, CUDA, QuantizedCPU, Autograd, Profiler, Tracer, Autocast]

Proof-of-work caching in metagraph

The metagraph needs to be extended with proof-of-work caching. Nodes solve a proof of work periodically and attach it to the synapses they gossip. Receiving nodes queue these synapses by proof of work and, based on the size of their queue, destroy synapses with low proof of work.
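A minimal sketch of such a bounded queue, keeping only the highest-proof-of-work synapses (the names are illustrative, not the actual metagraph API):

import heapq
import itertools

class PowQueue:
    """Bounded queue that evicts the lowest proof-of-work synapse when full."""
    def __init__( self, maxsize ):
        self.maxsize = maxsize
        self._heap = []                 # min-heap ordered by pow score
        self._tie = itertools.count()   # tie-breaker so synapses are never compared

    def push( self, pow_score, synapse ):
        entry = ( pow_score, next( self._tie ), synapse )
        if len( self._heap ) < self.maxsize:
            heapq.heappush( self._heap, entry )
        elif pow_score > self._heap[ 0 ][ 0 ]:
            heapq.heapreplace( self._heap, entry )  # drop the weakest synapse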

Abstract away miner

Describe the bug
Currently miners are a bit too hairy and a lot of logic is exposed. This should be abstracted away completely so folks can better understand how things work and only need to look under the hood if they need to. This should reduce confusion and create more streamlined experiences.

Speak to @shibshib or @unconst for more details before starting work on this issue.

Tensorboard needs to display more information

Is your feature request related to a problem? Please describe.
Tensorboard right now only displays some information about the Axon (receiving terminal), Dendrite (sending terminal), and the general state of the model (local loss, distillation loss, remote loss). This is all great, but it would be good to get more in-depth data, such as how much Tao the model has made so far, how much Tao it has staked, the iteration rate per second, learning rate changes, etc.

Describe the solution you'd like
Simply input this information into Tensorboard in the miner. Perhaps abstract it all away into a method that fills in all this information, instead of sporadically spreading a bunch of tensorboard calls all over the miner.

Metagraph visualizer

We no longer have a metagraph visualizer. We need to re-create this so we can start pulling data about all the nodes running.

Fix BERT node and check performance

Is your feature request related to a problem? Please describe.
Presently, BERT node is using huggingface API. We have found with GPT2, huggingface implementation doesn't do so well, even when training purely locally. This could be due to many reasons. We need to check if BERT trains well locally, and if it doesn't then we need to fix it so that it does and then push it onto the network, similarly to the workflow described below.

Describe the solution you'd like
When creating this model, the best flow is to:

  1. Create model locally, run it through the miner logic but using only local_forward.
  2. Once you're satisfied with performance of local_forward you can switch that to a remote_forward and let it train for a few days. Check stability of this miner and make sure it didn't die off for some reason.

Have the dist server maintain a chain version for validators

Currently our dist-server maintains a version of the kusanagi chain that can be used by FULL nodes only. As part of the preparation to expand the validator network, we'll need the dist server to also serve a version that can be used by validators.

Verify correct rollbar functionality, and fix


Too many open files bug

Describe the bug
The current huggingface tokenizer has an inherent bug in which it tries to open all the files all the time; this leads to OS errors because there are limits on how many files can be open at any given time.

Fix:
We need to either reduce the number of times we use a tokenizer in an epoch OR see if huggingface fixed the issue.

UPDATE: Huggingface fixed the issue and pushed to master recently; we simply need to update our version: huggingface/tokenizers#178

Have bittensor output more useful error messages

Is your feature request related to a problem? Please describe.
Right now, bittensor outputs the following when a subscription fails:

ERROR |bittensor.subtensor:_submit_and_check_extrinsic:322 - Error in extrinsic: {'code': 1002, 'message': 'Verification Error: Execution: Trap: Trap { kind: Unreachable }', 'data': 'RuntimeApi("Execution: Trap: Trap { kind: Unreachable }")'} Failed to subscribe

This is not very informative.

Bittensor should provide the user with clear reasons why a certain problem occurs, so they can take appropriate action.

Serve pre-trained models

Is your feature request related to a problem? Please describe.
Presently nodes are training a gpt2 model locally and then serving it. Instead, they should just serve a pre-trained version of GPT2 for maximal knowledge contribution to the network.

Describe the solution you'd like
When calling axon.serve, the model being passed to it should be a pre-trained model. This is easier said than done, however, as there are a lot of caveats here.

Create an ALBERT-powered mining node

Is your feature request related to a problem? Please describe.
Presently we have models like BERT, GPT, and XLM. We need to create an ALBERT node as well to add to this fleet of miners.

Describe the solution you'd like
When creating this model, the best flow is to:

  1. Create model locally, run it through the miner logic but using only local_forward.
  2. Once you're satisfied with performance of local_forward you can switch that to a remote_forward and let it train for a few days. Check stability of this miner and make sure it didn't die off for some reason.

Add SGMOE router

Sparsely Gated Mixtures of Experts.

Build a router object based on the SGMOE. It should use the same format as the PKM router.

Add a Transformer XL powered miner

Is your feature request related to a problem? Please describe.
New miner! We should investigate how to add a miner powered by transformer XL model.

Describe the solution you'd like
When creating this model, the best flow is to:

  1. Create model locally, run it through the miner logic but using only local_forward.
  2. Once you're satisfied with performance of local_forward you can switch that to a remote_forward and let it train for a few days. Check stability of this miner and make sure it didn't die off for some reason.

Peer drops, but synapse still counted

During a live experiment, a peer dropped, but its synapse was still counted. This is because the peer only got one try and was kicked out. We need to investigate how to keep the peer "alive" even if it doesn't respond within a specific time window.

Fix Docker ports

Multiple docker containers are currently not supported on the same machine because they try to bind the same port. This needs to be resolved.

Improper formatting to net.ip_to_net(ip) breaks miners.

Describe the bug
Sometimes the external IP service is down, so the formatting of net.ip_to_net breaks down since nothing is being passed, and this breaks the miners. We need to fix this so we're not reliant on that service.

To Reproduce
Steps to reproduce the behavior:
Run miner and point it to an incorrect IP service.

Expected behavior
Should retry, or detect the IP differently (see the sketch after this report).


Environment:

  • OS and Distro: Linux Ubuntu
  • Bittensor Version: 1.0.3
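One way to remove the single point of failure is to try several IP services in order. A sketch; the service URLs are examples:

import urllib.request

IP_SERVICES = [ "https://checkip.amazonaws.com", "https://api.ipify.org" ]

def external_ip() -> str:
    # Try each service in turn instead of relying on a single endpoint.
    for url in IP_SERVICES:
        try:
            return urllib.request.urlopen( url, timeout = 5 ).read().decode().strip()
        except Exception:
            continue
    raise RuntimeError( "All external IP services failed." )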

Cycle receptors on the dendrite

Is your feature request related to a problem? Please describe.
Receptors should cycle on the dendrite to ensure TCP connections don't explode. In essence, we need to recycle TCP connections on the dendrite for more efficient TCP handling and less compute/networking overhead.

Mnist Shaping error

Traceback (most recent call last):
File "examples/mnist/main.py", line 149, in main
train( model, epoch, global_step )
File "examples/mnist/main.py", line 84, in train
output = model(images, labels, query = True)
File "/Users/const/.pyenv/versions/3.7.3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/Users/const/Workspace/bittensor/bittensor/synapses/mnist/model.py", line 178, in forward
network = self.router.join( responses ) # Joins responses based on scores..
File "/Users/const/Workspace/bittensor/bittensor/utils/router.py", line 43, in join
return self.dispatcher.combine (responses, self.scores)
File "/Users/const/Workspace/bittensor/bittensor/utils/dispatcher.py", line 116, in combine
combined = combined.view(expert_out[0].shape)
RuntimeError: shape '[63, 1, 512]' is invalid for input of size 32768
ERROR |main:main:166 - shape '[63, 1, 512]' is invalid for input of size 32768

Add readthedocs documentation

The repository is integrated with readthedocs, but it still requires actual documentation to be placed in it before anything can be generated on readthedocs. The first thing we should start with is the NashTensor description, then a GRPC description; the full SDK documentation can follow in another ticket.

Extend Genesis dataset

Extend the genesis dataset with more files.

For instance, adding a dataset like the wiki corpus to the genesis dataloader.

Redo bittensor <> subtensor interface

Currently, we're using a substrate interface over websockets that uses autobahn asyncio under the hood. This setup is so unstable that the code is a mess, the code style is not compatible with the rest of the code, and nodes crap out because of connection issues.

We need to take this out and replace it with either pysubstrate native WS connections or plain HTTP RPC calls.

Miner YAML configs out of date.

Describe the bug

Presently, users can use the cmd line argument parser OR the YAML config files to set up their miner configuration. Whatever is not set explicitly by the user defaults to its default value.

The YAML configuration files of the miners are all out of date; they have lots of references to old, renamed variables (for example, session should be miner). These need to be fixed across the board by checking the parsers and making sure all of them match up with the configs.

To Reproduce
Steps to reproduce the behavior:

  1. Go to miners/TEXT(or image)/{miner name}/{config file name}.yaml
  2. See those configs and compare them to the miner's and other components' cmd line parser values and names.

Expected behavior
The YAML file variable names and values should match up with the cmd line parser.

Load previously-trained models

Presently, bittensor models are saved after each epoch if they perform better than the previous version. However, users cannot re-load those saved models and continue running them where they left off.

Release subtensor v1.1.0

Subtensor v1.1.0 is currently in the test phase, but should be released to kusanagi.

Todo:

  • Pick release date
  • Update akira nodes with v1.0.2
  • Runtime upgrade akira with v1.1.0
  • Communicate to people running subtensor they should install v1.0.2 before release date

Actual release:

  • Update all nodes in kusanagi network with v1.0.2
  • Verify operation
  • Perform runtime update

Queuing mechanism not GPU friendly

Describe the bug
Presently, the GPU runs into a segmentation fault when enqueuing processes; this needs to be investigated to further understand why it's happening.

Metagraph unit tests

Metagraph has no unit tests; this is problematic and can break easily. We need to implement unit tests for the metagraph.

Docker and python files take different cmd line arguments

Presently, a docker run of Bittensor takes different flags as cmd line arguments than the typical native Python application.

This is because the dockerized version "assumes" all the flags for running it in Python, when in reality it should take the same flags as the Python cmd line arguments and pass them along until it runs the model in Python.


If miners are slow, they run into 'nucleus full' error

Describe the bug
If miners are slow, they run into a 'nucleus full' error: they are trying to push work onto a processing queue that is already full.

We need to improve the queuing structure here and turn it into a proper priority queue (see the sketch below).
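For reference, the standard library's queue.PriorityQueue already gives the blocking behavior we want; a sketch, with the priority scheme left open:

import queue

requests = queue.PriorityQueue( maxsize = 100 )
# Lower numbers are served first; put() blocks when the queue is full,
# rather than failing with a 'nucleus full' style error.
requests.put( ( 1.0, "request payload" ) )
priority, item = requests.get()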

Bring back 'resume_training' to miners

Is your feature request related to a problem? Please describe.
Presently miners restart training from scratch every time they are restarted, making a new model and losing all progress on the last model being trained. Ideally, Bittensor should pick up the last trained model with the best loss and continue training it each time the miner is restarted.

Describe the solution you'd like
Add a flag to miners called 'resume_training' which will cause the miner to find the best trained model, load it, and continue training it.

Describe alternatives you've considered
Two possible approaches here:

  • Simply pick up the latest trained model and continue training it (naive solution)
  • Intelligently keep track of the best trained model so far and just pick it up on each restart (sketched below).
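A minimal torch checkpointing sketch of the second approach; the model and filenames are stand-ins:

import torch
import torch.nn as nn

model = nn.Linear( 10, 10 )  # stand-in for the miner's model
optimizer = torch.optim.Adam( model.parameters() )

# After a best-loss epoch, save everything needed to resume.
torch.save( { "epoch": 3, "state_dict": model.state_dict(),
              "optimizer": optimizer.state_dict() }, "model_best.pt" )

# On miner restart with resume_training, pick the training back up.
checkpoint = torch.load( "model_best.pt" )
model.load_state_dict( checkpoint[ "state_dict" ] )
optimizer.load_state_dict( checkpoint[ "optimizer" ] )
start_epoch = checkpoint[ "epoch" ] + 1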
