Making an issue for this. I'm busy at the moment and I'm fighting with getting the pyt

Comments (36)

nepeat commented on August 28, 2024 4

Just found the torrent you have hosted on the latest commit. I'm currently seeding it. For others reading this, the magnet link is below, so you do not have to download the model from S3 and burn bandwidth.

Torrent

model_v5.torrent.zip

Magnet

magnet:?xt=urn:btih:b343b83b35bff774dab13e0281ce13b3daf37d3e&dn=model%5Fv5&tr=udp%3A%2F%2Ftracker.coppersurfer.tk%3A6969%2Fannounce&tr=udp%3A%2F%2Ftracker.leechers-paradise.org%3A6969%2Fannounce&tr=udp%3A%2F%2Ftracker.opentrackr.org%3A1337%2Fannounce&tr=udp%3A%2F%2Ftracker.pomf.se%3A80%2Fannounce&tr=udp%3A%2F%2Ftracker.openbittorrent.com%3A80%2Fannounce&tr=udp%3A%2F%2Fp4p.arenabg.com%3A1337%2Fannounce&tr=udp%3A%2F%2F9.rarbg.me%3A2710%2Fannounce&tr=udp%3A%2F%2F9.rarbg.to%3A2710%2Fannounce&tr=udp%3A%2F%2Fexodus.desync.com%3A6969%2Fannounce&tr=udp%3A%2F%2Ftracker.tiny-vps.com%3A6969%2Fannounce&tr=udp%3A%2F%2Fopen.stealth.si%3A80%2Fannounce&tr=udp%3A%2F%2Fdenis.stalker.upeer.me%3A6969%2Fannounce&tr=udp%3A%2F%2Ftracker.torrent.eu.org%3A451%2Fannounce&tr=udp%3A%2F%2Ftracker.moeking.me%3A6969%2Fannounce&tr=udp%3A%2F%2Ftracker.cyberia.is%3A6969%2Fannounce&tr=udp%3A%2F%2Fopen.demonii.si%3A1337%2Fannounce&tr=udp%3A%2F%2Fipv4.tracker.harry.lu%3A80%2Fannounce&tr=udp%3A%2F%2Ftracker3.itzmx.com%3A6961%2Fannounce&tr=udp%3A%2F%2Fzephir.monocul.us%3A6969%2Fannounce&tr=udp%3A%2F%2Fxxxtor.com%3A2710%2Fannounce

from aidungeon.

nickwalton commented on August 28, 2024 3

We still owe Google $30k...

ZerxXxes commented on August 28, 2024 2

I added the v5 model to IPFS so that it can now be reached via all public IPFS gateways, such as Cloudflare's: https://cloudflare-ipfs.com/ipfs/QmRkuYGhAcNFz9FZq3xEFduXyihbUwmbhPbuakjhn9SRVQ

sbrichardson commented on August 28, 2024 1

Currently seeding the torrent files on a 1 Gb/s link. I saw about 25-50 MB/s while downloading. Appreciate your work!

nickwalton commented on August 28, 2024

It should only download it once. I think this issue should be mostly resolved though, as I set up Google Cloud CDN, which should (hopefully) get rid of the high international egress costs from North America to Colab servers in other countries.

nickwalton commented on August 28, 2024

Never mind. The costs are still super high. I'm going to have to shut off bucket access for now.

kylemiller3 commented on August 28, 2024

Would it be possible to host the file via BitTorrent and download with the tools available on Google's end?

Akababa commented on August 28, 2024

Is there a way to use "Mount drive" in Colab to help? https://stackoverflow.com/questions/53576555/share-a-part-of-google-drive-on-colab

nickwalton commented on August 28, 2024

Apparently it might be possible to download from my Drive to another person's without worrying about download fees. That might work as a temporary solution.

JeffreyBenjaminBrown commented on August 28, 2024

The first time it borked for me, I closed the page and reopened it. A few other times I force-restarted. Later I discovered that I could select "restart and run all" and it would go a lot faster, not having to re-download everything.

If that was in the README it might save you some money and/or your users some time.
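
The saving described above comes from reusing files that are already on disk instead of fetching them again. A minimal sketch of that guard follows, assuming a hypothetical `fetch` callable that stands in for the notebook's download step (the thread does not show the real one):

```python
import os
import tempfile


def ensure_model(path, fetch):
    """Call fetch(path) only when no file exists at path; return True if it ran."""
    if os.path.exists(path):
        return False
    fetch(path)
    return True


# Stand-in for the real download step, which the thread does not show.
calls = []


def fake_fetch(path):
    calls.append(path)
    with open(path, "wb") as f:
        f.write(b"model bytes")


target = os.path.join(tempfile.mkdtemp(), "model.bin")
print(ensure_model(target, fake_fetch))  # first run: nothing on disk yet
print(ensure_model(target, fake_fetch))  # second run: the existing file is reused
```

A check like this only helps while the runtime's disk survives, which is the "restart and run all" case rather than a fresh session.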

commented on August 28, 2024

I don't see anything in the GitHub TOS that says we can't just shard the file up and host each shard in its own repo.

synap5e commented on August 28, 2024

GitHub sets Content-Type: text/plain to prevent being used as a CDN. I know that doesn't prevent this use case, but I'd be careful about using them as a CDN regardless. Even if it's not in the TOS, they may not be happy with this use and could ask you to stop. There are sites set up to act as a CDN for GitHub, however, such as https://raw.githack.com/, although I believe these are intended for .js, .html, images, etc. and not bulk data.

If the gdrive option doesn't work out, the cheapest option (without running afoul of GH or other services) might be to run a bunch of VPSes or a dedi with lots of (ideally unlimited) bandwidth and set up a caching layer on that. It's not too hard to find servers under $50/mo with unlimited bandwidth.

commented on August 28, 2024

I was thinking of the sharding solution as being temporary. That said, this whole approach of running out of a Colab notebook might be temporary as well, as I'm sure Google didn't consider the use case of someone using it as a game engine.

I think putting this on BitTorrent makes a heck of a lot of sense and is a fantastic use case for it (legal file hosting... waaaaah?). Until there are enough seeds, though, it wouldn't hurt to have an alternative. Also, I'm sure many universities block BitTorrent, or at least they did when I was in college.

commented on August 28, 2024

Alright, I wrote a script to upload the shards to GitHub and it's running now. Here's the first shard:
https://github.com/JamesHutchison/aidungeon2-model-550-1-of-84

Just change 1 to whatever number to get the remaining shards. The upload is done once shard 84 has a file in it. You can recombine them by checking out all the repos in their own directories and running cat */x* > model-550.data-00000-of-00001 (assuming the only files and directories in the cwd are the repo directories). Note that this is only the model data; the other downloaded files could simply be committed to this repo, as they aren't that big.

I haven't tested the files to see if they get corrupted in the process. I don't think my git upload is configured to change line endings, but it's possible that might happen. I'm going to bed and will probably test sometime tomorrow, if someone doesn't beat me to it.

As of this post it's on shard 10 of 84.

louisgv commented on August 28, 2024

I think it should be fine to show a prompt telling users to grab the model via torrent instead of trying to download it in the installation script. It will also help with seeding, since volunteers will keep seeding it indefinitely. They can then make a copy of the Google Colab notebook, upload the model to their personal Drive and play from there. It would be a bit of work but more sustainable imo.

MrKrzYch00 commented on August 28, 2024

I will keep seeding the torrent on 200/200 Mbps. I know a lot of people may be turned off by the waiting time, so I hope it helps at least a bit. This masterpiece deserves people's attention.

EDIT: Almost 4 hours later, U/L ratio: 41.

ekmett commented on August 28, 2024

@JamesHutchison Your cat instructions will break because you didn't prefix the single digit numbers with a 0.

$ ls */x*
aidungeon2-model-550-1-of-84/xaa
aidungeon2-model-550-10-of-84/xaj
aidungeon2-model-550-11-of-84/xak
aidungeon2-model-550-12-of-84/xal
aidungeon2-model-550-13-of-84/xam
aidungeon2-model-550-14-of-84/xan
aidungeon2-model-550-15-of-84/xao
aidungeon2-model-550-16-of-84/xap
aidungeon2-model-550-17-of-84/xaq
aidungeon2-model-550-18-of-84/xar
aidungeon2-model-550-19-of-84/xas
aidungeon2-model-550-2-of-84/xab
aidungeon2-model-550-20-of-84/xat
aidungeon2-model-550-21-of-84/xau
aidungeon2-model-550-22-of-84/xav
...

means cat will assemble them in the wrong order.

Some judicious moving will fix the assembly order:

$ for f in aidungeon2-model-550-?-of-84; do mv "$f" "aidungeon2-model-550-0${f#aidungeon2-model-550-}"; done

(I'm sure there is a better syntax for that, but I'm tired.)
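
The ordering problem above can be reproduced without cloning anything. This is a minimal sketch: the directory names follow the pattern in the listing above, `shard_index` is a hypothetical helper, and Python's default string sort is used as a stand-in for the order shown in the `ls` output.

```python
import re

# Directory names follow the pattern quoted in the thread; only a few are listed.
dirs = [f"aidungeon2-model-550-{n}-of-84" for n in (1, 2, 3, 10, 11, 20)]


def shard_index(name):
    """Extract the shard number from a name like 'aidungeon2-model-550-7-of-84'."""
    match = re.search(r"-(\d+)-of-\d+$", name)
    if match is None:
        raise ValueError(f"unexpected directory name: {name}")
    return int(match.group(1))


lexicographic = sorted(dirs)             # plain string order: 10 sorts before 2
numeric = sorted(dirs, key=shard_index)  # the order the shards must be joined in

print([shard_index(d) for d in lexicographic])
print([shard_index(d) for d in numeric])
```

Zero-padding the shard number, as the `mv` loop does, makes the two orders agree, so a plain glob is then enough.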

commented on August 28, 2024

ah good point. I was just checking on this and the md5 wasn't matching, that might be why. Sorry I was throwing together instructions without the time to test them.

commented on August 28, 2024

and TIL that in github to rename a repo you have to click both the rename button and hit enter to actually rename

commented on August 28, 2024

Alright, here's a script that pulls from the repos and rebuilds the model file. I'm pretty busy today and can't spend time cleaning it up. It only generates the model file; it does not copy the file to the correct location, and it does not work on Windows. If you are a Windows user, you can either run it in Cygwin or the Ubuntu subsystem, or update the script so that the call to cat is replaced with what I imagine would be a glob followed by reading the files and writing them to an output file. The md5 calculation would also need to move to a pure Python implementation, using hashlib I would imagine.

Edit: updated to use a zip file now that contains all the model files

import os
import subprocess

FILENAME = 'model-550.zip'
# Shard count taken from the "-of-78" suffix in the repo names.
NUM_SHARDS = 78
repo_template = 'aidungeon2-model-550-zip-{shard}-of-{total}'


def clone_repos():
    # Shard numbers are zero-padded so the directories sort in assembly order.
    for i in range(NUM_SHARDS):
        shard = "%02d" % (i + 1)
        repo_name = repo_template.format(shard=shard, total=NUM_SHARDS)
        url = "https://github.com/JamesHutchison/{repo_name}".format(repo_name=repo_name)
        os.system('git clone %s' % url)


def rebuild_model():
    # Relies on the shell expanding the glob in sorted order.
    os.system('cat */x* > %s' % FILENAME)


def check_md5():
    expected_md5 = 'cb07f8fcecea5c3a418533296cbd088d'
    output = subprocess.check_output(['md5sum', FILENAME])
    # check_output returns bytes on Python 3, so decode before comparing.
    actual_md5 = output.decode().strip().split()[0]
    print("Expected md5 of %s got %s" % (expected_md5, actual_md5))
    assert actual_md5 == expected_md5


clone_repos()
rebuild_model()
check_md5()

I'm skeptical this is going to be the preferred method but at least we have another alternative if the other methods aren't working for some reason
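
For the Windows case mentioned above, the `cat` and `md5sum` calls could be replaced with something like the following. This is a minimal sketch, not a tested replacement for the script: `rebuild` and `md5_of` are hypothetical helper names, and the demo runs on two throwaway files rather than the real shards.

```python
import glob
import hashlib
import os
import shutil
import tempfile


def rebuild(pattern, output_path):
    """Concatenate every file matching pattern, in sorted order, into output_path."""
    with open(output_path, "wb") as out:
        for part in sorted(glob.glob(pattern)):
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)


def md5_of(path, chunk_size=1 << 20):
    """Hash the file in chunks so a multi-gigabyte model is never held in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demo on throwaway data laid out like the cloned shard repos.
workdir = tempfile.mkdtemp()
for repo, name, payload in [("shard-01", "xaa", b"hello "), ("shard-02", "xab", b"world")]:
    os.makedirs(os.path.join(workdir, repo))
    with open(os.path.join(workdir, repo, name), "wb") as f:
        f.write(payload)

# The output lives next to the shard directories, so the glob cannot match it.
output = os.path.join(workdir, "model.bin")
rebuild(os.path.join(workdir, "*", "x*"), output)
print(md5_of(output))
```

Sorting the glob results explicitly matters here because `glob.glob` makes no ordering guarantee, and the shards only reassemble correctly in name order.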

TheReal1604 commented on August 28, 2024

@nickwalton I'd also like to support this project with free bandwidth. Seeding the magnet link @nepeat posted at 500 Mbit/s.

EDIT: @MrKrzYch00 has a ratio of ~342 now. 😁

commented on August 28, 2024

When I get time later, I'm going to redo the GitHub sharded file as a zip containing the same files as the torrent download. Better to keep things consistent, and the file size will be a little smaller.

MrKrzYch00 commented on August 28, 2024

I think the situation with torrent overload has been resolved. Earlier there were almost 700 peers, which slowly dropped to ~400-450 before @TheReal1604's help (and that of anyone else on non-aria clients who kept seeding). Most likely the aria command-line update helped as well. We are down to ~80 peers and download speeds are very good, mostly ~5-25 MiB/s.

commented on August 28, 2024

Updated the code block above to point to the new zip repo that contains the same files as the torrent. Just need to unzip to the right place after getting the file.

arshem commented on August 28, 2024

Seeding on my seedbox. 1 Gbps link as well. This is absolutely awesome! Thanks for sharing!

szepeviktor commented on August 28, 2024

@nickwalton Please see https://www.feralhosting.com/pricing

exotime commented on August 28, 2024

Seeding on Gigabit for the foreseeable future too.

JKamsker commented on August 28, 2024

I have also been seeding at 1 Gbit for 8 hours straight now. (Ratio is x200 now :P)

If I'm allowed to ask: what were the costs of the hosted storage? @JamesHutchison @nickwalton

karibuTW commented on August 28, 2024

Hello guys,
Do you need to host those on different servers?
I have a few dedicated servers in France and in Vietnam. I'll be happy to provide space and bandwidth to support the project.

JKamsker commented on August 28, 2024

> Hello guys,
> Do you need to host those on different servers?
> I have a few dedicated servers in France and in Vietnam. I'll be happy to provide space and bandwidth to support the project.

Just seed the torrent, I think that might help the most :)

JKamsker commented on August 28, 2024

@JamesHutchison Another idea that came to my mind is to upload the model to Google Drive and share the link. I don't know exactly if it's possible via the API, but there is a function that copies publicly shared files to your own Drive. That bypasses the download limit on shared files.

karibuTW commented on August 28, 2024

>> Hello guys,
>> Do you need to host those on different servers?
>> I have a few dedicated servers in France and in Vietnam. I'll be happy to provide space and bandwidth to support the project.

> Just seed the torrent, I think that might help the most :)

Okay, I have added 3 dedicated servers to it.
200M in Vietnam, 100M in France and 1G in France, seeding 24 hours a day.

I am actually surprised to see my 1G server uploading at 50mb/s at the moment. Quite some demand indeed!
I have 2 more dedicated servers on 100M, but I feel users expect Gigabit connections now haha

louisgv commented on August 28, 2024

> @JamesHutchison Another idea that came to my mind is to upload the model to Google Drive and share the link. I don't know exactly if it's possible via the API, but there is a function that copies publicly shared files to your own Drive. That bypasses the download limit on shared files.

Yup, the notebook on the develop branch has utility functions that will allow you to do just that!

ben-bay commented on August 28, 2024

>> @JamesHutchison Another idea that came to my mind is to upload the model to Google Drive and share the link. I don't know exactly if it's possible via the API, but there is a function that copies publicly shared files to your own Drive. That bypasses the download limit on shared files.

> Yup, the notebook on the develop branch has utility functions that will allow you to do just that!

Hoping to move it all over to master soon!

taliptako commented on August 28, 2024

@JamesHutchison Why not just share it as a release? Releases don't have a bandwidth limit:
https://help.github.com/en/github/administering-a-repository/about-releases#limitations-on-binary-files

> We don't limit the total size of your binary release files, nor the bandwidth used to deliver them. However, each individual file must be under 2 GB in size.

We just have to split the model into parts and upload them; people who download the parts then only need to join them before use.
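
The split-and-join step described above can be sketched as follows. This is a minimal illustration, not the project's tooling: `split_file` and `join_files` are hypothetical helpers, and the demo uses a tiny part size in place of the 2 GB limit quoted above.

```python
import os
import shutil
import tempfile


def split_file(path, part_size, buffer_size=1 << 20):
    """Split path into zero-padded numbered parts of at most part_size bytes."""
    parts = []
    with open(path, "rb") as src:
        index = 0
        while True:
            part_path = f"{path}.part{index:03d}"
            written = 0
            with open(part_path, "wb") as dst:
                # Copy in small buffers so a large part is never held in memory.
                while written < part_size:
                    chunk = src.read(min(buffer_size, part_size - written))
                    if not chunk:
                        break
                    dst.write(chunk)
                    written += len(chunk)
            if written == 0:
                os.remove(part_path)  # input exhausted; drop the empty part
                break
            parts.append(part_path)
            index += 1
    return parts


def join_files(parts, output_path):
    """Concatenate the parts, in the order given, into output_path."""
    with open(output_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)


# Demo: a 10-byte file split into parts of at most 4 bytes, then rejoined.
workdir = tempfile.mkdtemp()
original = os.path.join(workdir, "model.bin")
with open(original, "wb") as f:
    f.write(b"0123456789")

parts = split_file(original, part_size=4)
rejoined = os.path.join(workdir, "rejoined.bin")
join_files(parts, rejoined)
print([os.path.basename(p) for p in parts])
```

The zero-padded part numbers keep the parts in the right order under a plain name sort, which avoids the shard-ordering problem seen earlier in the thread.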

MrKrzYch00 commented on August 28, 2024

I will keep the model hosted at http://virtual.4my.eu/AIdungeon2/ for people having trouble getting the torrent (blocked ports or other reasons). The uplink may not be that great, so the torrent is still recommended for speed.

High hosting cost - 5 GB model about aidungeon HOT 36 OPEN
