
Comments (4)

hugochan commented on September 3, 2024

You provided a link to your research data. I downloaded it and found a file called freebase_full.json, in which each element lists its neighbors and the paths leading to them. I compared it against the Freebase dump offered by Google, which I queried with SPARQL, and the paths recorded for each element in your JSON file are not complete: each element in freebase_full.json should be connected by more paths than are listed. How did you decide which paths or neighbors to include in each record, and how did you build the freebase_full.json file?

All the best, appreciate!
Siqi Lai

Hi Siqi,

Thank you for your interest in our work! The short answer is that freebase_full.json keeps only the 2-hop subgraphs surrounding the candidate topic entities that appear in the WebQuestions dataset. In other words, the file stores a small subset of Freebase, the minimum needed to answer the questions in WebQuestions. For more details, you might want to look at the following data preprocessing scripts.

  1. get the topic entity list
  2. get the 2-hop subgraphs for a given topic entity list

As for the candidate topic entities, please refer to #9.
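To make the 2-hop idea concrete, here is a minimal sketch of how such a subgraph could be collected from a set of triples. It is not the code from the preprocessing scripts mentioned above; the function names, the in-memory triple format, and the choice to follow only outgoing edges are all assumptions made for illustration.

```python
from collections import defaultdict


def build_adjacency(triples):
    """Index (subject, predicate, object) triples by subject."""
    adj = defaultdict(list)
    for s, p, o in triples:
        adj[s].append((p, o))
    return adj


def two_hop_subgraph(adj, topic_entities, max_hops=2):
    """Collect every triple reachable within max_hops outgoing edges
    of any topic entity (hypothetical helper, outgoing edges only)."""
    kept = set()
    frontier = set(topic_entities)
    visited = set(frontier)
    for _ in range(max_hops):
        next_frontier = set()
        for node in frontier:
            for p, o in adj.get(node, ()):
                kept.add((node, p, o))
                if o not in visited:
                    visited.add(o)
                    next_frontier.add(o)
        frontier = next_frontier
    return kept


# Toy data with made-up identifiers; real Freebase IDs look different.
toy_triples = [
    ("a", "r1", "b"),
    ("b", "r2", "c"),
    ("c", "r3", "d"),  # three hops away from "a", so it is dropped
    ("x", "r4", "y"),  # not connected to any topic entity
]
subgraph = two_hop_subgraph(build_adjacency(toy_triples), ["a"])
print(sorted(subgraph))  # [('a', 'r1', 'b'), ('b', 'r2', 'c')]
```

A subset built this way will naturally contain fewer paths per entity than the full Freebase dump, because anything further than two hops from a topic entity is discarded.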

from bamnet.

Gungnir2099 commented on September 3, 2024

Thank you. It helps a lot.



hugochan commented on September 3, 2024

Thanks for your reply. I'm also having problems loading the Freebase dump (the 2013-06-09 version, which is the one the WebQ dataset creators used). If a triple's object is empty, loading fails with an "Unrecognized: [DOT]" error; if a triple's predicate ID contains a "$" character, it fails with an "Unknown char : $" error, and so on. These errors interrupt the load process. What should I do in this situation? Do you have any code that processes the dump so that it can be loaded into Apache Jena or Virtuoso? Thanks a lot. Appreciate! Siqi Lai

I used the Freebase dump released with this ACL 2014 paper [1]. The authors provide the dumped results from the Freebase Search API. Here is the link to the data: http://cs.jhu.edu/~xuchen/packages/freebase-data.tar

[1] Yao, Xuchen, and Benjamin Van Durme. "Information extraction over structured data: Question answering with freebase." Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Vol. 1. 2014.
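For anyone who still wants to load the raw dump, one possible workaround is to drop the offending lines before handing the file to the loader. The sketch below is only an illustration and not something used in this project: it assumes one tab-separated triple per line ending with a "." terminator, and the set of characters allowed in a predicate is a guess that would need adjusting to whatever the loader actually accepts. Note that filtering this way silently discards data.

```python
import re

# Assumption: predicates consist only of these characters; anything else
# (for example "$") is treated as unloadable.
PREDICATE_OK = re.compile(r"^[A-Za-z0-9_.:/#<>-]+$")


def is_loadable(line):
    """Return True if the line has a non-empty subject, predicate and
    object, and the predicate contains only allowed characters."""
    body = line.strip()
    if body.endswith("."):
        # Remove the statement terminator and the separator before it.
        body = body[:-1].rstrip()
    parts = body.split("\t", 2)
    if len(parts) != 3:
        return False  # covers the empty-object case
    subject, predicate, obj = (part.strip() for part in parts)
    if not subject or not predicate or not obj:
        return False
    return PREDICATE_OK.match(predicate) is not None


def filter_triples(lines):
    """Yield only the lines that pass is_loadable."""
    for line in lines:
        if is_loadable(line):
            yield line


# Made-up sample lines in the assumed format.
sample = [
    'ns:m.1\tns:type.object.name\t"Foo"@en\t.\n',
    "ns:m.1\tns:type.object.name\t.\n",      # empty object
    "ns:m.1\tns:foo.$bar\tns:m.2\t.\n",      # "$" in the predicate
]
clean = list(filter_triples(sample))
print(len(clean))  # 1
```

Because `filter_triples` accepts any iterable of lines, it can be fed from a file object opened with `gzip.open(path, "rt", encoding="utf-8")` and the surviving lines written to a new file for loading.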

