Comments (9)
Thank you! I noticed that it works now when I use the 1.0.5-dev version.
from pykeen.
When I manually split my data, I am able to use it with the following code:
from pykeen.triples import TriplesFactory
from pykeen.pipeline import pipeline

training = TriplesFactory(path=work_path + '/train.txt')
testing = TriplesFactory(
    path=work_path + '/test.txt',
    entity_to_id=training.entity_to_id,
    relation_to_id=training.relation_to_id,
)
pipeline_result = pipeline(
    training_triples_factory=training,
    testing_triples_factory=testing,
    model='TransE',
)
pipeline_result.save_to_directory('test_pre_stratified_transe')
There seems to be an issue with the split function built into TriplesFactory.
from pykeen.
@ChristopherMarais this might be an issue with the build of Python you're using. Are you on 32-bit? It would be helpful if you could report the version of the OS you're using, the version of Python, and the version of PyKEEN.
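A minimal sketch of how the requested details could be collected in one go, using only the standard library. The helper name `environment_report` is hypothetical and not part of PyKEEN; the PyKEEN version is read from the installed package metadata, if present.

```python
import platform
import struct
import sys
from importlib import metadata


def environment_report():
    """Collect OS, Python, and PyKEEN details for a bug report (hypothetical helper)."""
    try:
        pykeen_version = metadata.version("pykeen")
    except metadata.PackageNotFoundError:
        pykeen_version = "not installed"
    return {
        "os": platform.platform(),
        "python": sys.version.split()[0],
        # Size of a C pointer in bits: 32 on a 32-bit interpreter, 64 on a 64-bit one.
        "pointer_bits": struct.calcsize("P") * 8,
        "pykeen": pykeen_version,
    }


report = environment_report()
for key, value in report.items():
    print(f"{key}: {value}")
```

Pasting the printed lines into the issue answers the 32-bit question and the version questions at the same time.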
It could be the case that NumPy on Windows defaults to 32-bit integers (reference: dask/dask-ml#230 (comment)). In that case, the fix for this bug would be to specify the datatype for the random number generator explicitly.
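As a rough illustration of what specifying the datatype explicitly can look like, the sketch below draws random integers with `dtype=np.int64`. The bound `HIGH` and the seed are assumptions chosen for the example; this is not the code from PyKEEN's split function, and the actual fix may differ.

```python
import numpy as np

# Assumption for illustration: an upper bound that does not fit into a signed 32-bit integer.
HIGH = 2**32 - 1

rng = np.random.RandomState(seed=42)

# Without dtype, randint falls back to NumPy's default integer type. Where that
# type is only 32 bits wide, a bound like HIGH is rejected as out of range.
# Passing dtype=np.int64 makes the call behave the same on every platform.
samples = rng.randint(0, HIGH, size=5, dtype=np.int64)

# Width of NumPy's default integer on this machine, useful when reporting the bug.
print(np.dtype(np.int_).itemsize * 8)
print(samples.dtype)
```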
This might have slipped through the cracks because we haven't done any testing on Windows. Most people using PyKEEN would like to take advantage of GPUs, which are only available on Linux. However, we could set up CI on AppVeyor, since we aren't strict about requiring GPUs. Sorry, probably too much information! Your feedback is appreciated and I hope we can get this working for you and anyone else who might run into this issue.
from pykeen.
I am using:
Windows 10
Python 3.8.3
I wasn't aware that I would not be able to access my GPU from Windows. I assumed that if I got all the GPU-related packages, like cudatools etc., running in my Anaconda environment, it would be capable of using my GPU.
What would be the recommended system requirements for using PyKEEN?
from pykeen.
Hi @ChristopherMarais, thanks for letting me know. I'm not familiar with getting PyTorch up and running on Windows - would you mind sharing how you did it? For example, are you using conda? This would also be useful for us to share with other users of PyKEEN. Then, in #95, I will try to make sure we have AppVeyor running on each push.
from pykeen.
I am using conda, yes. I used the following command to install PyTorch in an environment:
conda install -c pytorch pytorch
I can't remember entirely whether I had to install cudatools separately too (I have made so many environments recently).
I do remember having to install some packages before being able to install PyKEEN.
I have attached my exported environment .yml file if you want to try and test it.
It should also show you which packages I have installed.
When I used the following command, I was able to copy the environment to another PC:
conda env create -f pykeen_env-yml.txt
When I run the examples in a Jupyter notebook through JupyterLab, not all of them work, and I can't see that it ends up using my GPU, so it might not fully work. However, I do end up creating embeddings for many of my own datasets; I just have to stratify them 'manually' before running them through the pre-stratified example in the docs.
As a side note (I know I should actually make a new issue for this), is there a way for me to control the size of the embeddings being created? I see that they all end up being vectors that are 50 elements long.
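One possible way to do such a manual split is sketched below. It is a plain random split, not a stratified one and not PyKEEN's own split function, and the function name `split_triples` and the toy triples are hypothetical. The sketch assumes triples are (head, relation, tail) tuples and keeps every entity and relation of the test set in the training set, since the testing factory reuses the training ID mappings.

```python
import random


def split_triples(triples, test_ratio=0.2, seed=0):
    """Randomly split triples so that test entities/relations also occur in training."""
    rng = random.Random(seed)
    shuffled = list(triples)
    rng.shuffle(shuffled)

    n_test = int(len(shuffled) * test_ratio)
    candidates, train = shuffled[:n_test], shuffled[n_test:]

    entities = {e for h, _, t in train for e in (h, t)}
    relations = {r for _, r, _ in train}

    test = []
    for h, r, t in candidates:
        if h in entities and t in entities and r in relations:
            test.append((h, r, t))
        else:
            # Unseen entity or relation: keep the triple in training instead.
            train.append((h, r, t))
            entities.update((h, t))
            relations.add(r)
    return train, test


# Hypothetical toy data for illustration.
triples = [
    ("a", "r1", "b"),
    ("b", "r1", "c"),
    ("c", "r2", "a"),
    ("a", "r2", "c"),
    ("b", "r2", "a"),
    ("d", "r1", "a"),
]
train, test = split_triples(triples, test_ratio=0.34)
print(len(train), len(test))
```

Each list can then be written out as a tab-separated file, one triple per line, to serve as `train.txt` and `test.txt`.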
I see here https://www.aclweb.org/anthology/I17-2006/ that embedding dimensions could possibly have an effect. I would like to test that through PyKEEN too.
from pykeen.
@ChristopherMarais @cthoyt I think I found a solution to the issue, cf. #98
from pykeen.
> As a side note, (I know I should actually make a new issue for this) is there a way for me to control the size of the embeddings being created. I see that they all end up being vectors that are 50 elements long.
> I see here https://www.aclweb.org/anthology/I17-2006/ that embedding dimensions could possibly have an effect. I would like to test that through PyKEEN too.
You can change the dimension(s) of the embeddings when creating the model instance. Depending on the interaction model, there may be more than one dimension (e.g. a separate dimension for relation embeddings). You can check the documentation of the individual models for more information, cf. https://pykeen.readthedocs.io/en/latest/reference/models.html
When using the `pipeline` function you can pass them via `model_kwargs`, e.g.
pipeline_result = pipeline(
    training_triples_factory=training,
    testing_triples_factory=testing,
    model='TransE',
    model_kwargs=dict(embedding_dim=64),
)
to have 64-dimensional entity and relation embeddings.
from pykeen.
@ChristopherMarais thanks for the help and motivation, you'll see that we've got testing working for Windows now as of #95!
from pykeen.