Comments (9)

ChristopherMarais avatar ChristopherMarais commented on May 24, 2024 1

Thank you! I noticed that it works now when I use the 1.0.5-dev version.

from pykeen.

ChristopherMarais avatar ChristopherMarais commented on May 24, 2024

When I manually split my data, I am able to use it with the following code:

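from pykeen.triples import TriplesFactory
from pykeen.pipeline import pipeline

# work_path is the directory that holds the manually split train.txt and test.txt
training = TriplesFactory(path=work_path + '/train.txt')
# Reuse the training mappings so both factories share the same entity and relation IDs
testing = TriplesFactory(
    path=work_path + '/test.txt',
    entity_to_id=training.entity_to_id,
    relation_to_id=training.relation_to_id,
)

# Train and evaluate TransE on the pre-split data, then save the results
pipeline_result = pipeline(
    training_triples_factory=training,
    testing_triples_factory=testing,
    model='TransE',
)
pipeline_result.save_to_directory('test_pre_stratified_transe')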

There seems to be an issue with the split function built into TriplesFactory.
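For anyone wanting to reproduce the manual split, here is a minimal sketch in plain Python. It is not PyKEEN's own split logic; the function name `split_triples` and its parameters are hypothetical. Because the testing factory above reuses the training `entity_to_id` and `relation_to_id` mappings, the sketch moves any test triple with an unseen entity or relation back into the training set.

```python
import random


def split_triples(triples, test_ratio=0.2, seed=0):
    """Split (head, relation, tail) triples into train and test lists.

    Hypothetical helper: every entity and relation in the test list is
    guaranteed to also occur in the train list.
    """
    rng = random.Random(seed)
    shuffled = list(triples)
    rng.shuffle(shuffled)

    n_test = int(len(shuffled) * test_ratio)
    candidates, train = shuffled[:n_test], shuffled[n_test:]

    # Everything the training set already knows about
    entities = {e for h, _, t in train for e in (h, t)}
    relations = {r for _, r, _ in train}

    test = []
    for h, r, t in candidates:
        if h in entities and t in entities and r in relations:
            test.append((h, r, t))
        else:
            # Unseen entity or relation: keep this triple for training instead
            train.append((h, r, t))
            entities.update((h, t))
            relations.add(r)
    return train, test
```

The two lists can then be written out as tab-separated lines to train.txt and test.txt and loaded as in the code above.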

cthoyt avatar cthoyt commented on May 24, 2024

@ChristopherMarais this might be an issue with the type of Python you're using. Are you on 32-bit? It would be helpful if you could report the version of the OS you're using, the version of Python, and also the version of PyKEEN.

It could be the case that NumPy on Windows defaults to 32-bit integers (reference: dask/dask-ml#230 (comment)). In that case, the fix for this bug would be to specify the datatype for the random number generator explicitly.
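As an illustration of what "specifying the datatype explicitly" could look like, here is a small NumPy sketch. It is not the actual PyKEEN code or the eventual fix; `split_indices` and its parameters are hypothetical names. The point is only that the index array is created with `dtype=np.int64` instead of relying on the platform's default integer type.

```python
import numpy as np


def split_indices(n, ratio=0.8, seed=0):
    """Return shuffled train/test index arrays with an explicit 64-bit dtype."""
    rng = np.random.RandomState(seed)
    # Explicit dtype: do not rely on the platform default integer width
    idx = np.arange(n, dtype=np.int64)
    rng.shuffle(idx)
    cut = int(n * ratio)
    return idx[:cut], idx[cut:]


train_idx, test_idx = split_indices(10)
print(train_idx.dtype, len(train_idx), len(test_idx))
```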

This might have slipped through the cracks because we haven't done any testing on Windows. Most people using PyKEEN would like to take advantage of GPUs, which are only available on Linux. However, we could set up CI on AppVeyor, since we aren't exactly pedantic towards the usage of GPUs. Sorry, probably too much information! Your feedback is appreciated and I hope we can get this working for you and anyone else who might run into this issue.

ChristopherMarais avatar ChristopherMarais commented on May 24, 2024

I am using:
Windows 10
Python 3.8.3

I wasn't aware that I would not be able to access my GPU from Windows. I assumed that if I got all the GPU-related packages, like cudatools etc., running in my Anaconda environment, it would be capable of using my GPU.

What would be the recommended system requirements for using PyKEEN?

cthoyt avatar cthoyt commented on May 24, 2024

Hi @ChristopherMarais, thanks for letting me know. I'm not familiar with getting PyTorch up and running on Windows - would you mind sharing how you did it? For example, are you using conda? This would also be useful for us to share with other users of PyKEEN. Then, in #95 I will try to make sure we have AppVeyor running on each push.

ChristopherMarais avatar ChristopherMarais commented on May 24, 2024

I am using conda, yes. I used the following command to install PyTorch in an environment:

conda install -c pytorch pytorch

I can't remember entirely whether I had to install cudatools separately too (I have made so many environments recently).
I do remember having to install some packages before being able to install PyKEEN.
I have attached my exported environment .yml file in case you want to try and test it.
It should also show you which packages I have installed.

pykeen_env-yml.txt

When I used the following command, I was able to copy the environment to another PC:

conda env create -f pykeen_env-yml.txt

When I run the examples in a Jupyter notebook through JupyterLab, not all of them work, and I can't see that it ends up using my GPU, so it might not fully work. However, I do end up creating embeddings for many of my own datasets; I just have to stratify them 'manually' before running them through the pre-stratified example in the docs.
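A quick way to see whether PyTorch can reach the GPU from a given conda environment is sketched below. The function name `describe_torch_device` is hypothetical; the `torch.cuda` calls are standard PyTorch. It only reports what PyTorch sees and does not tell you whether PyKEEN itself is configured to train on that device.

```python
import importlib.util


def describe_torch_device():
    """Report whether PyTorch is installed and whether it can see a CUDA GPU."""
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed in this environment"
    import torch

    if torch.cuda.is_available():
        # Name of the first visible CUDA device
        return f"CUDA is available: {torch.cuda.get_device_name(0)}"
    return "CUDA is not available; PyTorch will run on the CPU"


print(describe_torch_device())
```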

As a side note (I know I should actually make a new issue for this), is there a way for me to control the size of the embeddings being created? I see that they all end up being vectors that are 50 elements long.
I see here https://www.aclweb.org/anthology/I17-2006/ that embedding dimensions could possibly have an effect. I would like to test that through PyKEEN too.

mberr avatar mberr commented on May 24, 2024

@ChristopherMarais @cthoyt I think I found a solution to the issue, cf. #98

mberr avatar mberr commented on May 24, 2024

@ChristopherMarais

As a side note (I know I should actually make a new issue for this), is there a way for me to control the size of the embeddings being created? I see that they all end up being vectors that are 50 elements long.
I see here https://www.aclweb.org/anthology/I17-2006/ that embedding dimensions could possibly have an effect. I would like to test that through PyKEEN too.

You can change the dimension(s) of the embeddings when creating the model instance. Depending on the interaction model, there may be more than one dimension (e.g. a separate dimension for relation embeddings). You can check the documentation of the individual models for more information, cf. https://pykeen.readthedocs.io/en/latest/reference/models.html

When using the pipeline function you can pass them via model_kwargs, e.g.

pipeline_result = pipeline(
    training_triples_factory=training,
    testing_triples_factory=testing,
    model='TransE',
    model_kwargs=dict(embedding_dim=64),
)

to have 64-dimensional entity and relation embeddings.

cthoyt avatar cthoyt commented on May 24, 2024

@ChristopherMarais thanks for the help and motivation, you'll see that we've got testing working for Windows now as of #95!
