Comments (36)
Just found the torrent you have hosted on the latest commit. I'm currently seeding it. For others reading this, the magnet link is below, so you do not have to download the model from S3 and burn bandwidth.
Torrent
Magnet
magnet:?xt=urn:btih:b343b83b35bff774dab13e0281ce13b3daf37d3e&dn=model%5Fv5&tr=udp%3A%2F%2Ftracker.coppersurfer.tk%3A6969%2Fannounce&tr=udp%3A%2F%2Ftracker.leechers-paradise.org%3A6969%2Fannounce&tr=udp%3A%2F%2Ftracker.opentrackr.org%3A1337%2Fannounce&tr=udp%3A%2F%2Ftracker.pomf.se%3A80%2Fannounce&tr=udp%3A%2F%2Ftracker.openbittorrent.com%3A80%2Fannounce&tr=udp%3A%2F%2Fp4p.arenabg.com%3A1337%2Fannounce&tr=udp%3A%2F%2F9.rarbg.me%3A2710%2Fannounce&tr=udp%3A%2F%2F9.rarbg.to%3A2710%2Fannounce&tr=udp%3A%2F%2Fexodus.desync.com%3A6969%2Fannounce&tr=udp%3A%2F%2Ftracker.tiny-vps.com%3A6969%2Fannounce&tr=udp%3A%2F%2Fopen.stealth.si%3A80%2Fannounce&tr=udp%3A%2F%2Fdenis.stalker.upeer.me%3A6969%2Fannounce&tr=udp%3A%2F%2Ftracker.torrent.eu.org%3A451%2Fannounce&tr=udp%3A%2F%2Ftracker.moeking.me%3A6969%2Fannounce&tr=udp%3A%2F%2Ftracker.cyberia.is%3A6969%2Fannounce&tr=udp%3A%2F%2Fopen.demonii.si%3A1337%2Fannounce&tr=udp%3A%2F%2Fipv4.tracker.harry.lu%3A80%2Fannounce&tr=udp%3A%2F%2Ftracker3.itzmx.com%3A6961%2Fannounce&tr=udp%3A%2F%2Fzephir.monocul.us%3A6969%2Fannounce&tr=udp%3A%2F%2Fxxxtor.com%3A2710%2Fannounce
from aidungeon.
I added the v5 model to IPFS so that it can now be reached via all public IPFS gateways, such as Cloudflare's: https://cloudflare-ipfs.com/ipfs/QmRkuYGhAcNFz9FZq3xEFduXyihbUwmbhPbuakjhn9SRVQ
from aidungeon.
Seeding the torrent files currently. I saw about 25-50 MB/s download, and I'm seeding on a 1 Gb/s link. Appreciate your work!
from aidungeon.
It should only download it once. I think this issue should be mostly resolved though, as I set up Google Cloud CDN, which should (hopefully) get rid of the high international egress costs from North America to Colab servers in other countries.
from aidungeon.
Never mind. The costs are still super high. I'm going to have to shut off bucket access for now.
from aidungeon.
Would it be possible to host the file via BitTorrent and download with the tools available on Google's end?
from aidungeon.
Is there a way to use "Mount drive" in Colab to help? https://stackoverflow.com/questions/53576555/share-a-part-of-google-drive-on-colab
from aidungeon.
The first time it borked for me, I closed the page and reopened it. A few other times I force-restarted. Later I discovered that I could select "restart and run all" and it would go a lot faster, not having to re-download everything.
If that was in the README it might save you some money and/or your users some time.
from aidungeon.
I don't see anything in the github TOS that says we can't just shard the file up and host each shard on its own repo.
from aidungeon.
GitHub sets Content-Type: text/plain to prevent being used as a CDN. I know that doesn't prevent this use case, but I'd be careful of using them as a CDN regardless. Even if not in the TOS, they may not be happy with this use and could ask you to stop. There are sites set up to act as a CDN to GH however, such as https://raw.githack.com/, although I believe these are intended for .js, .html, images etc. and not bulk data.
If the gdrive option doesn't work out, the cheapest option (without running afoul of GH or other services) might be to run a bunch of VPSes or a dedi with lots of (ideally unlimited) bandwidth and set up a caching layer on that. It's not too hard to find servers under $50/mo with unlimited bandwidth.
from aidungeon.
I was thinking of the sharding solution as being temporary. That said, this whole thing with running out of a Colab notebook might be temporary as well, as I'm sure Google didn't consider the use case of someone using it as a game engine.
I think putting this on BitTorrent makes a heck of a lot of sense and is a fantastic use case for it (legal file hosting... waaaaah?). Until there's enough seeds though it wouldn't hurt to have an alternative. Also, I'm sure many universities block BitTorrent, or at least they did when I was in college.
from aidungeon.
Alright, I wrote a script to upload the shards to github and it's running now. Here's the first shard:
https://github.com/JamesHutchison/aidungeon2-model-550-1-of-84
Just change 1 to whatever number to get the remaining shards. The upload is done once shard 84 has a file in it. You can recombine them by checking out all the repos in their own directories and doing cat */x* > model-550.data-00000-of-00001 (assuming the only files / directories in the cwd are the repo directories). Note that this is only the model data; the other files that are downloaded could simply be committed to this repo, as they aren't that big.
I haven't tested the files to see if they get corrupted in the process. I don't think my git upload is configured to change line endings, but it's possible that might happen. I'm going to bed and will probably test sometime tomorrow, if someone doesn't beat me to it.
As of this post it's on shard 10 of 84.
from aidungeon.
I think it should be fine to show a prompt telling the user to grab the model via torrent instead of trying to download it in the installation script. It will also help with seeding, since volunteers will keep it seeding forever. They can then make a copy of the Google Colab notebook, upload the model to their personal drive and play it from there. It would be a bit of work but more sustainable imo.
from aidungeon.
I will keep seeding the torrent on 200/200mbps. I know a lot of people may be turned off by waiting time so I hope it helps at least a bit. This masterpiece deserves people's attention.
EDIT: Almost 4 hours later, U/L ratio: 41.
from aidungeon.
@JamesHutchison Your cat instructions will break because you didn't prefix the single digit numbers with a 0.
$ ls */x*
aidungeon2-model-550-1-of-84/xaa
aidungeon2-model-550-10-of-84/xaj
aidungeon2-model-550-11-of-84/xak
aidungeon2-model-550-12-of-84/xal
aidungeon2-model-550-13-of-84/xam
aidungeon2-model-550-14-of-84/xan
aidungeon2-model-550-15-of-84/xao
aidungeon2-model-550-16-of-84/xap
aidungeon2-model-550-17-of-84/xaq
aidungeon2-model-550-18-of-84/xar
aidungeon2-model-550-19-of-84/xas
aidungeon2-model-550-2-of-84/xab
aidungeon2-model-550-20-of-84/xat
aidungeon2-model-550-21-of-84/xau
aidungeon2-model-550-22-of-84/xav
...
This listing order means cat will assemble them in the wrong order.
Some judicious moving will fix the assembly order:
$ for f in aidungeon2-model-550-?-of-84; do mv $f aidungeon2-model-550-0${f#aidungeon2-model-550-}; done
(I'm sure there is a better syntax for that, but I'm tired.)
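As an alternative to renaming the directories, the shards could be sorted by their numeric shard index before concatenating. The following is only a sketch under the assumption that directories are named like the ones in the listing above; `shard_number`, `ordered_shard_dirs`, and `rebuild` are hypothetical helper names, not part of any existing script.

```python
import re
import shutil
from pathlib import Path

# Assumed directory naming, taken from the listing above:
# "aidungeon2-model-550-<N>-of-84"
SHARD_RE = re.compile(r"-(\d+)-of-\d+$")


def shard_number(dirname: str) -> int:
    """Extract the numeric shard index from a shard directory name."""
    match = SHARD_RE.search(dirname)
    if match is None:
        raise ValueError("not a shard directory: %s" % dirname)
    return int(match.group(1))


def ordered_shard_dirs(names):
    """Sort shard directory names by shard number instead of by string."""
    return sorted(names, key=shard_number)


def rebuild(root: Path, output: Path) -> None:
    """Concatenate the x* files of every shard directory in numeric order."""
    names = [p.name for p in root.iterdir()
             if p.is_dir() and SHARD_RE.search(p.name)]
    with output.open("wb") as out:
        for name in ordered_shard_dirs(names):
            for part in sorted((root / name).glob("x*")):
                with part.open("rb") as src:
                    shutil.copyfileobj(src, out)


names = [
    "aidungeon2-model-550-1-of-84",
    "aidungeon2-model-550-10-of-84",
    "aidungeon2-model-550-2-of-84",
]
print(sorted(names))              # string order: 1, 10, 2
print(ordered_shard_dirs(names))  # numeric order: 1, 2, 10
```

Sorting on the parsed integer sidesteps the zero-padding question entirely, at the cost of depending on the directory naming pattern.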
from aidungeon.
Ah, good point. I was just checking on this and the md5 wasn't matching; that might be why. Sorry, I was throwing together instructions without the time to test them.
from aidungeon.
and TIL that in github to rename a repo you have to click both the rename button and hit enter to actually rename
from aidungeon.
Alright, here's a script that pulls from the repos and rebuilds the model file. I'm pretty busy today and can't really spend time cleaning this up. It just generates the model file; copying it to the correct location, and Windows support, are missing. If you are a Windows user, you can either run this in Cygwin or the Ubuntu subsystem, or update the script so that the call to cat is replaced with what I would imagine would be a glob followed by reading the files and writing them to an output file. The md5 calculation would need to be moved to a pure Python implementation, using hashlib I would imagine.
Edit: updated to use a zip file now that contains all the model files
import os
import subprocess

FILENAME = 'model-550.zip'
# The repo names end in "-of-78", so there should be 78 shards, not 84.
NUM_SHARDS = 78
repo_template = 'aidungeon2-model-550-zip-{shard}-of-78'


def clone_repos():
    # Shard numbers are zero-padded so that `cat */x*` globs them in order.
    for i in range(NUM_SHARDS):
        shard = "%02d" % (i + 1)
        repo_name = repo_template.format(shard=shard)
        url = "https://github.com/JamesHutchison/{repo_name}".format(repo_name=repo_name)
        os.system('git clone %s' % url)


def rebuild_model():
    # Concatenate every shard file into a single archive.
    os.system('cat */x* > %s' % FILENAME)


def check_md5():
    expected_md5 = 'cb07f8fcecea5c3a418533296cbd088d'
    output = subprocess.check_output(['md5sum', FILENAME])
    # check_output returns bytes on Python 3, so decode before comparing.
    actual_md5 = output.decode().strip().split()[0]
    print("Expected md5 of %s got %s" % (expected_md5, actual_md5))
    assert actual_md5 == expected_md5


clone_repos()
rebuild_model()
check_md5()
I'm skeptical this is going to be the preferred method but at least we have another alternative if the other methods aren't working for some reason
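For Windows users, the glob-plus-hashlib replacement suggested in this comment might look roughly like the sketch below. It uses only the Python standard library and is untested against the real shards; the function names and the chunk size are assumptions made for illustration.

```python
import glob
import hashlib
import shutil


def rebuild_model(pattern, filename):
    """Concatenate all files matching pattern, in sorted order, into filename.

    Stand-in for `cat */x* > filename`; relies on zero-padded shard
    directory names so that sorted() yields the right order.
    """
    # Resolve the glob before opening the output file.
    parts = sorted(glob.glob(pattern))
    with open(filename, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)


def md5_of(filename, chunk_size=1 << 20):
    """Return the hex md5 of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(filename, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Possible usage, run from the directory that holds the cloned shard repos:
#   rebuild_model("*/x*", "model-550.zip")
#   assert md5_of("model-550.zip") == "cb07f8fcecea5c3a418533296cbd088d"
```

Reading in chunks keeps memory use flat even though the assembled file is several gigabytes.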
from aidungeon.
@nickwalton I would also like to support this project with free bandwidth. Seeding the magnet link @nepeat posted at 500 Mbit/s.
EDIT: @MrKrzYch00 got a ratio of 342~ now. 😁
from aidungeon.
when I get time later I'm going to redo the github sharded file to be a zip containing the same files as the torrent download. Better to keep things consistent and file size will be a little smaller.
from aidungeon.
I think the situation with torrent overload has been resolved. Back then there were almost 700 peers which was slowly reduced to ~400-450 before @TheReal1604's help (and anyone else from non-aria clients that kept seeding). Most likely aria command line update helped as well. We are down to ~80 peers and download speeds are very good, mostly ~5-25MiB/s.
from aidungeon.
Updated the code block above to point to the new zip repo that contains the same files as the torrent. Just need to unzip to the right place after getting the file.
from aidungeon.
Seeding on my seedbox. 1 Gbps link as well. This is absolutely awesome! Thanks for sharing!
from aidungeon.
@nickwalton Please see https://www.feralhosting.com/pricing
from aidungeon.
Seeding on Gigabit for the foreseeable future too.
from aidungeon.
I am also seeding with 1 Gbit for 8 hours straight now. (ratio is x200 now :P)
If I'm allowed to ask: what were the costs of the hosted storage? @JamesHutchison @nickwalton
from aidungeon.
Hello guys,
Do you need to host those on different servers?
I have a few dedicated servers in France and Vietnam. I'll be happy to provide space and bandwidth to support the project.
from aidungeon.
> Hello guys,
> Do you need to host those on different servers?
> I have a few dedicated servers in France and Vietnam. I'll be happy to provide space and bandwidth to support the project.
Just seed the torrent, I think that might help the most :)
from aidungeon.
@JamesHutchison Another idea that came to mind is to upload the model to Google Drive and share the link. I don't know exactly if it's possible via the API, but there is a function that copies publicly shared files to your own drive. That bypasses the download limit on shared files.
from aidungeon.
> > Hello guys,
> > Do you need to host those on different servers?
> > I have a few dedicated servers in France and Vietnam. I'll be happy to provide space and bandwidth to support the project.
> Just seed the torrent, I think that might help the most :)
Okay, I have added 3 dedicated servers to it.
200M in Vietnam, 100M in France and 1G in France, seeding 24/7.
I am actually surprised to see my 1G server uploading at 50mb/s at the moment. Quite some demand indeed!
I have 2 more dedicated servers on 100M, but I feel users expect Gigabit connections now haha
from aidungeon.
> @JamesHutchison Another idea that came to mind is to upload the model to Google Drive and share the link. I don't know exactly if it's possible via the API, but there is a function that copies publicly shared files to your own drive. That bypasses the download limit on shared files.
Yup, the notebook on the develop branch has utility functions that will allow you to do just that!
from aidungeon.
> > @JamesHutchison Another idea that came to mind is to upload the model to Google Drive and share the link. I don't know exactly if it's possible via the API, but there is a function that copies publicly shared files to your own drive. That bypasses the download limit on shared files.
> Yup, the notebook on the develop branch has utility functions that will allow you to do just that!
Hoping to move it all over to master soon!
from aidungeon.
@JamesHutchison Why not just share it as a release? Releases don't have a bandwidth limit:
https://help.github.com/en/github/administering-a-repository/about-releases#limitations-on-binary-files
"We don't limit the total size of your binary release files, nor the bandwidth used to deliver them. However, each individual file must be under 2 GB in size."
We would just have to split the file and upload the parts; people who download them would only need to join the parts back together to use the model.
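A rough sketch of that split-and-join step is below. It assumes nothing about the release upload itself and only shows cutting a file into parts below a size limit and stitching them back together. The function names, the `.partNNN` suffix, and the buffer size are illustrative assumptions; the caller would pick a chunk size safely under the 2 GB per-file limit quoted above.

```python
import shutil
from pathlib import Path
from typing import List


def split_file(source: Path, chunk_size: int, buffer_size: int = 1 << 20) -> List[Path]:
    """Split source into numbered parts of at most chunk_size bytes each."""
    parts = []
    index = 0
    with source.open("rb") as src:
        while True:
            # Hypothetical naming scheme: <name>.part000, <name>.part001, ...
            part = source.with_name("%s.part%03d" % (source.name, index))
            written = 0
            with part.open("wb") as out:
                while written < chunk_size:
                    block = src.read(min(buffer_size, chunk_size - written))
                    if not block:
                        break
                    out.write(block)
                    written += len(block)
            if written == 0:
                # Source exhausted: discard the empty trailing part.
                part.unlink()
                break
            parts.append(part)
            index += 1
    return parts


def join_files(parts: List[Path], dest: Path) -> None:
    """Concatenate the parts, sorted by name, back into a single file."""
    with dest.open("wb") as out:
        for part in sorted(parts):
            with part.open("rb") as src:
                shutil.copyfileobj(src, out)
```

Zero-padding the part index keeps the parts in the right order under a plain name sort, which avoids the ordering problem seen earlier in this thread with the unpadded shard repos.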
from aidungeon.
I will keep the model hosted at http://virtual.4my.eu/AIdungeon2/ for people having trouble getting the torrent (blocked ports or other reasons). The uplink may not be that great, so for speed the torrent is still recommended.
from aidungeon.