UMD Libraries bento-box search application, based on the NSCU Quick Search Rails engine (https://github.com/NCSU-Libraries/quick_search)
This application wraps the NCSU Quick Search engine.
Requires:
- Ruby 2.3.7
- Bundler
-
Clone this repository.
-
Install the dependencies:
> gem install bundler
> bundle install --without production
- Set up the database:
> rails db:reset
-
Copy the "env_example" file to ".env" and configure.
-
To run the web application:
> rails server
Some searchers used by this application require API keys to perform searches. To keep these keys secure, and out of the GitHub repository, these keys are should be configured through the environment.
The application uses the "dotenv" gem to configure the environment. The gem expects a ".env" file in the root directory to contain the environment variables that are provided to Rails. A sample "env_example" file has been provided to assist with this process. Simply copy the "env_example" file to ".env" and fill out the parameters as appropriate.
The configured .env file should not be checked into the Git repository, as it contains credential information.
The quick_search-library_website_searcher requires a Solr instance containing search results. To set up a Solr instance use one of the following two methods:
In this method, a Solr instance will be created and populated using Apache Nutch. Use this method if you don't have a Solr data backup.
- Create a Docker image "searchumd-solr:dev" using the Dockerfile-solr:
> docker build -t searchumd-solr:dev -f Dockerfile-solr .
- Create a Docker bridge network named "dev_network":
> docker network create dev_network
- Run a Docker container with the Solr image, naming it "solr_app", specifying the "dev_network", and making it accessible from http://localhost:8983/:
> docker run --rm -p 8983:8983 --network dev_network --name solr_app --mount source=solr-data,destination=/opt/solr/server/solr/nutch searchumd-solr:dev
The Solr data will be persisted in a Docker volume named "solr-data" and the Solr instance should now be available via a web browser at:
- To populate the Solr container, do the following:
a) Create a Docker image "searchumd-nutch:dev" using the Dockerfile-nutch:
> docker build -t searchumd-nutch:dev -f Dockerfile-nutch .
b) Run Nutch, using only two crawl iterations, placing the result in the local Solr "solr_app" instance, via the "dev_network" network:
> docker run --rm --network dev_network searchumd-nutch:dev bin/crawl -i -D solr.server.url=http://solr_app:8983/solr/nutch -s /root/nutch/urls/ LibCrawl/ 2
Note: If you want to preserve the Apache Nutch crawl data between container runs, use a Docker volume (named "nutch-data" in this example) by running the following command:
> docker run --rm --mount source=nutch-data,destination=/root/nutch/LibCrawl --network dev_network searchumd-nutch:dev bin/crawl -i -D solr.server.url=http://solr_app:8983/solr/nutch -s /root/nutch/urls/ LibCrawl/ 2
Use this method if you have a Solr data backup, or can retrieve one from some source.
See https://docs.docker.com/storage/volumes/#backup-restore-or-migrate-data-volumes for more information about backing up and restoring data volumes.
Note: This step can be skipped if you already have a Solr data backup.
In order to populate Solr from a backup file, we first need the data. To retrieve the data from a Docker data volume (named "solr-data" in this example) do the following:
- Run a "ubuntu" container with the "solr-data" volume mounted, and run the "tar" command:
> docker run --rm -v `pwd`:/backup --mount source=solr-data,destination=/root/solr ubuntu tar -cvf /backup/backup-solr.tar /root/solr
This will create a "backup-solr.tar" file in the current directory.
- Create a Docker volume named "solr-data":
> docker volume create solr-data
- Run a "ubuntu" container and place the data from the backup-solr.tar into the volume:
> docker run --rm -v `pwd`:/backup --mount source=solr-data,destination=/root/solr ubuntu bash -c "cd /root/solr && tar -xvf /backup/backup-solr.tar --strip 1"
- Create a Docker image "searchumd-solr:dev" using the Dockerfile-solr:
> docker build -t searchumd-solr:dev -f Dockerfile-solr .
- Create a Docker bridge network named "dev_network":
> docker network create dev_network
- Run a Docker container with the Solr image, naming it "solr_app", specifying the "dev_network", and making it accessible from http://localhost:8983/:
> docker run --rm -p 8983:8983 --network dev_network --name solr_app --mount source=solr-data,destination=/opt/solr/server/solr/nutch searchumd-solr:dev
The Solr container will now use and persist data in the "solr-data" Docker volume. The Solr instance should now be available via a web browser at:
This application provides the following Dockerfiles for generating Docker images for use in production:
- Dockerfile - Generates image for the searchumd Rails application
- Dockerfile-nginx - Generates image for the Nginx web server providing HTTPS and port redirection.
- Dockerfile-solr - Generates image for the Solr search application
- Dockerfile-nutch - Generates image for Apache Nutch application with UMD custom configuration
The "docker_config" directory contains files used by the Dockerfiles.
In order to generate "clean" Docker images, the Docker images should be built from a fresh clone of the GitHub repository.
In addition to the functionality provided by the NSCU Quick Search Rails engine, this application also provides a search page for the library website on the "website" path (i.e., http://localhost:3000/website.
This functionality uses the quick_search-library_website_searcher to generate the results, and so is also dependent on a running Solr instance.