wikibrain's Introduction

Setup WikiBrain - Atlasify

1. Environment

Maven, Java 1.7+, PostGIS

2. Clone the codebase

Clone the master branch of this repo:
git clone git@github.com:cheetah90/wikibrain.git

3. Setup Java Options

export JAVA_OPTS="-d64 -Xmx16000M -server"
Make sure the host has more than 16 GB of RAM. Set -Xmx higher if you have more RAM.
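To confirm that the heap setting actually reached the JVM, a small check like the one below can help. This is a minimal sketch, not part of WikiBrain: the class name HeapCheck and the 15000 MB threshold are assumptions. The threshold sits slightly below 16000 because some garbage collectors report a little less than the -Xmx value.

```java
// Hypothetical helper (not part of WikiBrain): reports the JVM's maximum heap.
final class HeapCheck {
    /** Maximum heap the JVM will attempt to use, in mebibytes. */
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    /** True when the actual heap limit meets the required size. */
    static boolean isEnough(long actualMb, long requiredMb) {
        return actualMb >= requiredMb;
    }

    public static void main(String[] args) {
        // Assumed threshold: a bit under -Xmx16000M, since the reported
        // maximum can be slightly smaller than the configured value.
        long requiredMb = 15000;
        long actualMb = maxHeapMb();
        System.out.println("Max heap: " + actualMb + " MB");
        System.out.println("Enough for ingestion: " + isEnough(actualMb, requiredMb));
    }
}
```

Run it with the same options you exported, for example java $JAVA_OPTS HeapCheck, and compare the printed value with the -Xmx setting.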

4. Configure database

Edit wikibrain-core/src/main/resources/reference.conf

dao:dataSource:default: psql
dao:dataSource:psql: put in the username and password for postgres
spatial:dao:dataSource:default: postgis
spatial:dao:dataSource:postgis: put in the username and password for postgres

5. Compile WikiBrain

At the project root (/wikibrain), run:
mvn -f wikibrain-utils/pom.xml clean compile exec:java -Dexec.mainClass=org.wikibrain.utils.ResourceInstaller

6. Tune postgres

Following https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server, set these values in postgresql.conf:

listen_addresses = '*'
max_connections = 500         # Must be at least 300
shared_buffers = 48GB         # Should be 1/4 of system memory
effective_cache_size = 96GB   # Should be 1/2 of system memory
fsync = off                 
synchronous_commit = off    
checkpoint_segments = 256     # PostgreSQL 9.4 and earlier; replaced by max_wal_size in 9.5+
checkpoint_completion_target = 0.9
autovacuum = off
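The two memory settings above follow simple ratios of system memory (1/4 and 1/2). The sketch below works the arithmetic for a given host size. The class name PgMemorySettings is hypothetical, and the 192 GB figure is an assumption chosen only because it reproduces the 48GB and 96GB values shown above.

```java
// Hypothetical helper: derives the two memory settings from total system RAM.
final class PgMemorySettings {
    /** shared_buffers: one quarter of system memory, in GB. */
    static long sharedBuffersGb(long systemMemoryGb) {
        return systemMemoryGb / 4;
    }

    /** effective_cache_size: one half of system memory, in GB. */
    static long effectiveCacheSizeGb(long systemMemoryGb) {
        return systemMemoryGb / 2;
    }

    /** Renders both settings as postgresql.conf lines. */
    static String render(long systemMemoryGb) {
        return "shared_buffers = " + sharedBuffersGb(systemMemoryGb) + "GB\n"
             + "effective_cache_size = " + effectiveCacheSizeGb(systemMemoryGb) + "GB\n";
    }

    public static void main(String[] args) {
        // Assumed host size: 192 GB of RAM.
        System.out.print(render(192));
    }
}
```

For a smaller host, pass its memory size instead; a 64 GB machine would get 16GB and 32GB.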

7. Start Data Ingestion

Minimal

./wb-java.sh org.wikibrain.Loader -l en -s wikidata -s spatial
(Running only the command above gets Atlasify working, but with limited functionality. Good for a feasibility test.)

Full

./wb-java.sh org.wikibrain.Loader -l en -s wikidata -s spatial -s sr
./wb-java.sh org.wikibrain.sr.SRBuilder -l simple -m ensemble -o both

Debug:

If an SSL certificate error occurs, add the certificate from dumps.wikimedia.org to the Java keystore.

To download the cert:
echo -n | openssl s_client -connect dumps.wikimedia.org:443 | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' > ~/dumpswikimedia.cert

To add it to the Java cacerts file, first locate cacerts, then run the following in that directory:
keytool -keystore cacerts -importcert -alias dumpswikimedia -file [dumpswikimedia.cert]

9. Configure the URL

Edit atlasify/src/main/java/org/wikibrain/atlasify/AtlasifyLauncher.java and set externalURL, portNo, and helloWorldUrl to match your host. These are the URL and port number of the WikiBrain back end. WikiBrain needs its own port, so make sure this port is open through the firewall.
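One way to check that the back-end port is reachable from another machine is a plain TCP connection attempt. The sketch below is an assumption-laden illustration, not part of WikiBrain: the class name PortCheck is hypothetical, and the default host and port (localhost, 8080) are placeholders to be replaced with your own externalURL host and portNo.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: tests whether a TCP port accepts connections.
final class PortCheck {
    /** Returns true if a TCP connection to host:port succeeds within the timeout. */
    static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or unresolvable host: treat as unreachable.
            return false;
        }
    }

    public static void main(String[] args) {
        // Placeholder defaults; pass the real host and port as arguments.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 8080;
        System.out.println(host + ":" + port + " reachable: " + canConnect(host, port, 2000));
    }
}
```

Run it from a machine outside the firewall once the server has been started; a false result while the server is running suggests the port is blocked.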

10. Host the front end

Host https://github.com/cheetah90/Atlasify with your favorite http server (e.g. Apache)

11. Configure the front end

Change baseURL and featureArticleURL in atlasify.js to match your host. At minimum, you only need to change the server name.

12. Start Server

Run ./wb-java.sh org.wikibrain.atlasify.AtlasifyLauncher

13. Test

Open index.html to check that everything works. Note: the back end only starts loading after the first query. Run one query, then wait until loading finishes before trying another.

Set up development environment

1. Connect to the postgres database

Option 1 (recommended): Local database
Ingest the Simple English edition of Wikipedia into your local database.

Option 2: SSH tunneling plus a local copy of the intermediary files
Open port 5432 on the server to receive requests, and copy the wikibrain root folder to your local machine.

2. Using IntelliJ

Issue 1: java.lang.NoClassDefFoundError
In IntelliJ, go to File->Project Structure->Modules->Dependencies. In the Dependencies tab, change the "scope" from "Provided" to "Compile".

3. Sync with remote server

Front-end

On the server, git clone the front-end repo and edit the js/atlasify.js file. Copy the repo to /var/www/html.

Back-end

On the server, git pull in the wikibrain directory.

wikibrain's People

Contributors

aaroniidx, alan502, bjhecht, cheetah90, derianders, huymai, lvonessen, math1man, mlesicko, rcharper, sam-gc, shilad, tobyli, yuropa

Forkers

floatkeera

wikibrain's Issues

Improve the discovery for small countries

Small countries, even those with high SR values, are hard to discover with the current interface. For example, in the results for the concept "Jesus", Israel is shaded very dark but is easy to overlook. Can we add a ranked list of the countries with the highest SRs?
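The ranked list proposed here could be produced by sorting countries by SR value in descending order and keeping the top few. The following is a minimal sketch only: the class name SrRanking, the method topCountries, and the SR values in main are made up for illustration and do not come from WikiBrain.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: ranks countries by semantic relatedness (SR).
final class SrRanking {
    /** Returns up to n entries with the highest SR values, highest first. */
    static List<Map.Entry<String, Double>> topCountries(Map<String, Double> srByCountry, int n) {
        List<Map.Entry<String, Double>> entries = new ArrayList<>(srByCountry.entrySet());
        Collections.sort(entries, new Comparator<Map.Entry<String, Double>>() {
            @Override
            public int compare(Map.Entry<String, Double> a, Map.Entry<String, Double> b) {
                return Double.compare(b.getValue(), a.getValue()); // descending
            }
        });
        int limit = Math.max(0, Math.min(n, entries.size()));
        return new ArrayList<>(entries.subList(0, limit));
    }

    public static void main(String[] args) {
        // Made-up SR values, for illustration only.
        Map<String, Double> sr = new HashMap<>();
        sr.put("Israel", 0.91);
        sr.put("Italy", 0.74);
        sr.put("Russia", 0.40);
        for (Map.Entry<String, Double> e : topCountries(sr, 2)) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

A list like this, shown next to the map, would surface a small country with a high SR regardless of its area on screen.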

Put hardcoded config into separate config files

Create an atlasify.config file in the wikibrain/atlasify/src/main/resources/ directory. This will be a text file that holds the following settings as key-value pairs (e.g. in JSON format):

  1. Config for wikibrain
    https://github.com/cheetah90/wikibrain#9-configure-the-url
  2. Config for frontend
    https://github.com/cheetah90/wikibrain#11-configure-the-front-end
  3. choice of SR https://github.com/cheetah90/wikibrain/blob/master/atlasify/src/main/java/org/wikibrain/atlasify/AtlasifyGameGenerator.java#L50

In the three pieces of code above, parse atlasify.config and read the corresponding settings.
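As a rough illustration of the parsing step, the sketch below reads key-value settings with java.util.Properties. Note the swap: it uses the key=value properties format rather than the JSON suggested above, purely to stay dependency-free. The class name AtlasifyConfigSketch and the key srMetric are hypothetical; externalURL and portNo are the names used in AtlasifyLauncher.java, and the sample values are placeholders.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical sketch: loads key=value settings such as those in atlasify.config.
final class AtlasifyConfigSketch {
    /** Parses key=value lines from the given reader. */
    static Properties load(Reader reader) throws IOException {
        Properties props = new Properties();
        props.load(reader);
        return props;
    }

    /** Returns the trimmed value for key, or fails loudly if it is missing. */
    static String require(Properties props, String key) {
        String value = props.getProperty(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing config key: " + key);
        }
        return value.trim();
    }

    public static void main(String[] args) throws IOException {
        // Placeholder contents; srMetric is an assumed key name.
        String sample = "externalURL=http://example.org\n"
                      + "portNo=8080\n"
                      + "srMetric=ensemble\n";
        Properties props = load(new StringReader(sample));
        System.out.println("externalURL = " + require(props, "externalURL"));
        System.out.println("portNo      = " + Integer.parseInt(require(props, "portNo")));
        System.out.println("srMetric    = " + require(props, "srMetric"));
    }
}
```

Failing on a missing key at startup makes a misconfigured host obvious immediately, instead of surfacing later as a broken request.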
