
grail's People

Contributors

kkteru

grail's Issues

Why is A_incidence += A_incidence.T performed?

Dear author,
May I ask a simple question: why is the graph made undirected in the function subgraph_extraction_labeling in grail/subgraph_extraction/graph_sampler.py, by adding A_incidence.T to the incidence matrix built by incidence_matrix(A_list)?
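
For reference, a minimal sketch (not the repo's exact code) of what I understand the symmetrization to do: with a directed adjacency matrix, row i only exposes i's outgoing neighbors, while A + A.T exposes neighbors in both directions, so the hop-by-hop enclosing-subgraph extraction effectively treats the KG as undirected.

    import numpy as np
    import scipy.sparse as ssp

    # Toy directed graph with edges 0 -> 1 and 1 -> 2.
    A = ssp.csr_matrix(np.array([[0, 1, 0],
                                 [0, 0, 1],
                                 [0, 0, 0]]))

    print(set(A[1].indices))        # {2}: only the outgoing neighbor of node 1
    A_sym = A + A.T                 # same effect as A_incidence += A_incidence.T
    print(set(A_sym[1].indices))    # {0, 2}: neighbors in both directions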

runtime error in _prepare_subgraphs for negative data

Have you encountered the following problem when running the code?

subgraphs_neg.append(self._prepare_subgraphs(nodes_neg, r_label_neg, n_labels_neg))

Exception has occurred: AssertionError
For return_array=False, there should be one and only one edge between u and v, but get 0 edges. Please use return_array=True instead
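
For what it's worth, the assertion appears to be raised by DGL's edge_id when the queried node pair has no edge between them, which can happen for negative (corrupted) triples. A hypothetical minimal reproduction, assuming the older DGL API (around 0.4.x) that the repo was written against:

    import dgl

    g = dgl.DGLGraph()
    g.add_nodes(3)
    g.add_edge(0, 1)

    # g.edge_id(1, 2)  # no such edge: raises the AssertionError quoted above
    eids = g.edge_id(1, 2, return_array=True)   # returns an empty array instead
    print(len(eids))                            # 0

    # Defensive pattern: check for existence before asking for the edge id.
    if g.has_edge_between(1, 2):
        print(g.edge_id(1, 2))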

How are the embeddings stored/how can one access them?

I really appreciate this work, but I am having trouble understanding how to access the embeddings produced by the algorithm. Are they in the .mdb files? I have been trying to read data out of them, and can read the key/value pairs, but the values seem to be binary-encoded, and I am not sure where they came from. Any insight is appreciated! Thanks again!
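
In case it helps, this is the snippet I used to peek at the databases with the lmdb package; the path is just an example from my setup, and I am guessing the values are pickle-serialized (they may be cached subgraph data rather than learned embeddings):

    import lmdb
    import pickle

    # Example path; the actual .mdb directory depends on the preprocessing run.
    env = lmdb.open('data/WN18RR_v1/subgraphs', readonly=True, lock=False)
    with env.begin() as txn:
        with txn.cursor() as cursor:
            for key, value in cursor:
                try:
                    print(key, pickle.loads(value))   # if values are pickled objects
                except Exception:
                    print(key, value[:32], b'...')    # otherwise show raw bytes
                break                                 # inspect only the first record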

Inductive Datasets

Hi, the paper says the following, but we observed several overlapping entities (e.g., /m/080knyg in fb237_v1_ind). Are we perhaps missing a step in the data processing, or could you please clarify what you mean by the inductive setting?

"F. Inductive Graph Generation
The inductive train and test graphs examined in this paper do not have overlapping entities."
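
For reference, this is roughly how we checked for overlap (the paths are assumptions following the repo's data layout):

    def entities(path):
        ents = set()
        with open(path) as f:
            for line in f:
                parts = line.split()
                if len(parts) == 3:
                    head, rel, tail = parts
                    ents.update((head, tail))
        return ents

    train_ents = entities('data/fb237_v1/train.txt')
    ind_ents = entities('data/fb237_v1_ind/train.txt')
    print(len(train_ents & ind_ents))   # > 0 suggests overlapping entities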

Thank you in advance!

return_array=False

There is an error: AssertionError: For return_array=False, there should be one and only one edge between u and v, but get 0 edges. Please use return_array=True instead

accuracy reproduction

Hi, I'm very interested in your work and I'm quite new to knowledge graphs. I was reproducing the results in the paper with the default code and the provided datasets. The WN18RR datasets seem to work well with the given command line, and the results are consistently a bit higher than the paper's.

But when I use nell_v1 for training and nell_v1_ind for testing, the Hits@10 and auc_pr are much lower than the paper's results. I want to confirm that this is the right way to run the code for this dataset (the given command line with the dataset name replaced). Should I tune other parameters? If so, could you give me a hint about which parameters influence performance the most?

Thanks for your kind response!

(Two screenshots of the reproduced results were attached.)

Is the data in the paper inconsistent with the data provided by the repository?

I computed statistics over all the dataset versions provided in the repository and found some inconsistencies with the numbers in Table 13 of the paper. Could you give a brief explanation?

Table 13 of the paper:
(screenshot attached)

My statistics table:
(screenshot attached)

The red and bold entries mark the inconsistent numbers.
Looking forward to your reply!

The statistics code I used is:

    # Count relations, entities, and triples across the train/valid/test files.
    root_path = 'data/WN18RR_v2_ind'
    file_list = [root_path + '/train.txt', root_path + '/valid.txt', root_path + '/test.txt']
    relation_set = set()
    entity_set = set()
    count = 0
    for file_path in file_list:
        with open(file_path) as f:
            triplets = [line.split() for line in f if line.strip()]
        count += len(triplets)
        for head, rel, tail in triplets:
            entity_set.update((head, tail))
            relation_set.add(rel)
    print(root_path[root_path.rfind('/') + 1:])  # dataset name
    print(len(relation_set))                     # number of relations
    print(len(entity_set))                       # number of entities
    print(count)                                 # number of triples

Would you mind sharing the code used to create the dataset splits?

Nice work! Thank you very much for the open-source code. In the original paper, the FB15k-237, NELL-995, and WN18RR datasets are divided into versions v1 to v4. For my own paper, I would like to split the FB15k dataset in the same way, generating four versions such as FB15k_v1, FB15k_v1_ind, etc. If it is convenient for you, would you be willing to open-source the code that produces these splits? Looking forward to your reply.
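
In case it is useful to others, here is a rough sketch of how I imagine the splits could be generated, in the spirit of the paper's description of disjoint train/test graphs. This is my own guess, not the authors' script, and n_roots and k are made-up parameters:

    import random
    from collections import defaultdict

    def inductive_split(triples, n_roots=100, k=2, seed=0):
        # Grow a k-hop ball around randomly chosen root entities; triples fully
        # inside the ball form the "_ind" graph, triples fully outside form the
        # train graph, so the two entity sets are disjoint by construction.
        random.seed(seed)
        nbrs = defaultdict(set)
        for h, r, t in triples:
            nbrs[h].add(t)
            nbrs[t].add(h)

        frontier = set(random.sample(list(nbrs), n_roots))
        ind_ents = set(frontier)
        for _ in range(k):
            frontier = {v for u in frontier for v in nbrs[u]} - ind_ents
            ind_ents |= frontier

        ind = [(h, r, t) for h, r, t in triples
               if h in ind_ents and t in ind_ents]
        train = [(h, r, t) for h, r, t in triples
                 if h not in ind_ents and t not in ind_ents]
        return train, ind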
