opensemanticsearch / open-semantic-search-apps

Python/Django based webapps and web user interfaces for search, structure (meta data management like thesaurus, ontologies, annotations and named entities) and data import (ETL like text extraction, OCR and crawling filesystems or websites)

Home Page: https://opensemanticsearch.org/

License: GNU General Public License v3.0

Python 24.51% JavaScript 13.68% HTML 14.67% Shell 0.33% CSS 46.03% SCSS 0.77%
search search-interface research-tool research-data-management django django-application thesaurus skos solr solr-client

open-semantic-search-apps's Introduction

Open Semantic Search Apps

Python/Django based webapps and user interfaces for search and meta data management

open-semantic-search-apps's People

Contributors

bhelou-roivant, dralves, g-braeunlich, mandalka, mosea3, opensemanticsearch

open-semantic-search-apps's Issues

Migrate facet field names to Tika style

Migrate facet field names to Tika style + Solr 7 field type suffix:

'message_from_ss' to 'Message-From_ss'
'message_to_ss' to 'Message-To_ss'
'message_cc_ss' to 'Message-CC_ss'
'message_bcc_ss' to 'Message-BCC_ss'
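
The renaming listed above could be expressed as a lookup table. This is a minimal sketch, assuming documents are handled as plain dicts; the helper name `migrate_field_names` is hypothetical and not part of the project.

```python
# Old facet field names mapped to the Tika-style names from the list above.
FIELD_RENAMES = {
    "message_from_ss": "Message-From_ss",
    "message_to_ss": "Message-To_ss",
    "message_cc_ss": "Message-CC_ss",
    "message_bcc_ss": "Message-BCC_ss",
}


def migrate_field_names(doc):
    """Return a copy of doc with old facet field names replaced by the new ones."""
    return {FIELD_RENAMES.get(name, name): value for name, value in doc.items()}


# Fields that are not in the table are left untouched.
print(migrate_field_names({"id": "1", "message_to_ss": ["alice@example.org"]}))
```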

Move source code to github

Since there is more interest from external developers, move the code of the Django apps to GitHub after the new release this week. Until then, you can get the full source code from the Debian packages.

Error 500 when accessing search-apps from the LAN

Hi,

OSS is installed on a server on my LAN and I want my users to use it through their web browsers.
The search part works like a charm, but I can't access the search-app url and always get an ERROR 500.

I suspect something around Django's config; like ALLOWED_HOSTS.

Any hint on that?

Thank you!
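
If the suspicion about ALLOWED_HOSTS is right, the relevant Django setting would look roughly like the fragment below. This is a sketch only: the IP address and host name are placeholders, and the location of the project's settings file is not stated in this thread. Note that in recent Django versions a rejected Host header normally produces a 400 response rather than a 500, so the web server's error log is worth checking as well.

```python
# Fragment of a Django settings module (placeholder values, adjust to your LAN).
ALLOWED_HOSTS = [
    "localhost",
    "127.0.0.1",
    "192.168.1.10",        # hypothetical LAN address of the server
    "search.example.lan",  # hypothetical host name users type into the browser
]
```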

Import entities by SPARQL SELECT

Also support custom fields from SPARQL SELECT statement results for the import of named entities, instead of only labels in the RDF graph returned by CONSTRUCT or DESCRIBE statements.

Snippets to added ontologies, and enabling graphs for them

Hi,

I'm interested in adding snippets to an added ontology, and graph visualization (through Analyze->Connections (Graph)).

For example, let's say I've added the ontology human disease ontology.xrdf. To add snippets and graph visualization, I first find the name of the facet associated with the ontology by viewing /etc/solr-php-ui/config.facets.php. I get

image
(BTW, what does the _ss stand for? Semantic search?)

I could modify that config file directly but to avoid errors, I go to Django admin page and add the human_disease_ontology_xrdf_ss facet:

I can now see snippets

image

I can also find 'diseases' when I try to visualize graphs

I am not sure if this is how OSS is intended to be used. Is there a better way to add snippets and enable graph visualization?

Best regards,
Bassam

Can't delete a file via REST-API

Hi,

I can't delete a file via the REST API. It always throws an exception:
name 'delete' is not defined

URL was correct (http:///search-apps/api/delete?uri=file:///tmp/testfile)
For me the solution was to edit the views.py and add "from opensemanticetl.tasks import delete", like:

/var/lib/opensemanticsearch/api/views.py:
import json

from opensemanticetl.tasks import delete
from opensemanticetl.tasks import enrich
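
For reference, a call to the delete endpoint can be assembled as below. This is a sketch, assuming the server is reachable at `localhost` (the host is elided in the URL quoted above); the path and the `uri` parameter come from the report, while the helper name is hypothetical.

```python
from urllib.parse import urlencode


def build_delete_url(base_url, file_uri):
    """Build the search-apps delete URL for one document URI."""
    return base_url.rstrip("/") + "/search-apps/api/delete?" + urlencode({"uri": file_uri})


# Hypothetical host; urlencode percent-encodes the file URI so it survives as one parameter.
print(build_delete_url("http://localhost", "file:///tmp/testfile"))
```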

Search auto-complete and synonyms not working

Hi,

I'm using one of the latest versions of OSS (open-semantic-search_18.12.30.deb).

In the previous version (open-semantic-search_18.09.27.deb), the UI had auto-completion of searches (see screenshot below) and was using the synonyms defined at http://localhost:8983/solr/opensemanticsearch/schema/analysis/synonyms/skos/

It's not the case anymore. Could you please look into it or direct me to the part of the code that may be causing it?

Thanks for your help!

image

Problem with indexing web URLs

Hi,
I've noticed that URLs containing Persian or Arabic words fail to index through the REST API or the console.
For example:
Request:
http://192.168.1.154/search-apps/api/index-web?uri=https://www.nimrokh.tv/news/38938/دلال-بازی-تلگرامچی-بی-مخاطب-سینمایی
Response:
{"queue": "7a746810-7d03-4c97-9f78-980ae708cc88"}
So I know Open Semantic Search will try to crawl and index this URL, but as long as the URL contains Persian characters, the crawl fails.
Any hint on this?
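
One workaround worth trying is to percent-encode the URL before submitting it. Whether this resolves the failure inside Open Semantic Search is an assumption, not something confirmed in this thread. The sketch below uses only the standard library; both helper names are hypothetical.

```python
from urllib.parse import quote, urlencode, urlsplit, urlunsplit


def to_ascii_url(url):
    """Percent-encode non-ASCII characters in the path and query of a URL."""
    parts = urlsplit(url)
    # Keep "/" and already-encoded "%" sequences as they are.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%")
    # An internationalized host name would need IDNA encoding; that is not handled here.
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))


def build_index_web_url(api_base, page_url):
    """Build the index-web API call with the page URL passed as the uri parameter."""
    return api_base.rstrip("/") + "/index-web?" + urlencode({"uri": to_ascii_url(page_url)})


page = "https://www.nimrokh.tv/news/38938/دلال-بازی-تلگرامچی-بی-مخاطب-سینمایی"
print(to_ascii_url(page))
print(build_index_web_url("http://192.168.1.154/search-apps/api", page))
```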

Abstract/externalize/modularize named entities dictionary code

Clean up and modularize the code that generates and sets up the named entity extraction dictionaries, so it can be used more abstractly (for example, with Elasticsearch) and with a cleaner architecture than the current code spread across different views of the Thesaurus and Ontologies apps.

Integrate thesaurus and ontology manager with Elastic Search

Integrate autotagging by thesaurus or ontologies with Elasticsearch, too.

Therefore:
- implement update_by_query in the Elasticsearch ETL exporter
- abstract the connector call in the thesaurus view and the ontologies view
- use the config file instead of the default Solr URI, so connector, index and core can be switched

Separated facets for Dictionary based Entity Extraction

Use additional, separate facets for dictionary-based entity extraction and copy their values to the target facet. Other tools can then add values that are not in the dictionary to the same target facet, and we can see and diff which values were added automatically and which manually.

OperationalError at /thesaurus

I face the following problem: the page says "OperationalError at /thesaurus, unable to open database file".

Error during template rendering
In template /var/lib/opensemanticsearch/thesaurus/templates/thesaurus_base.html, error at line 19

I am using Ubuntu 18.04.1 LTS as a Hyper-V virtual machine and Django 2.1.2. The same "unable to open database file" error occurs for the thesaurus, graph, etc.
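
A frequent cause of SQLite's "unable to open database file" is that the web server user cannot access the database file or cannot write to the directory containing it (SQLite creates its journal files next to the database). The sketch below checks both. The default path is the database location mentioned in another report on this page and may differ on your installation; the helper name is hypothetical.

```python
import os


def check_sqlite_access(db_path="/var/lib/opensemanticsearch/db.sqlite3"):
    """Report whether the current user can use the SQLite file and its directory."""
    directory = os.path.dirname(db_path) or "."
    return {
        "file_exists": os.path.isfile(db_path),
        "file_writable": os.access(db_path, os.W_OK),
        "directory_writable": os.access(directory, os.W_OK),
    }


# Run this as the user the web server runs as to see what that user is allowed to do.
print(check_sqlite_access())
```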

No crawl.html template

Row 45 of /src/datasources/views.py has the following line:

return render(request, 'crawl.html',

But there is no crawl.html template

Use REST-API for Solr config

Use the REST API for the synonyms config instead of a config file, so we can use Solr on separate server(s) over the network without sharing files. This makes scaling, separation into Docker containers and integration with Elasticsearch easier.
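
Solr's managed resources API accepts synonym mappings as a JSON object sent by PUT to the synonyms endpoint. The sketch below only prepares such a request with the standard library and does not send it; the core URL and the `skos` resource name follow the synonyms URL quoted in another issue on this page, while the mapping itself and the helper name are made up for illustration.

```python
import json
import urllib.request


def build_synonyms_request(solr_core_url, resource, synonyms):
    """Prepare a PUT request that adds synonym mappings to a Solr managed resource."""
    url = "{}/schema/analysis/synonyms/{}".format(solr_core_url.rstrip("/"), resource)
    return urllib.request.Request(
        url,
        data=json.dumps(synonyms).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_synonyms_request(
    "http://localhost:8983/solr/opensemanticsearch",
    "skos",
    {"car": ["automobile", "vehicle"]},  # made-up example mapping
)
print(request.get_method(), request.full_url)
# urllib.request.urlopen(request) would send it; Solr applies managed synonyms after a core reload.
```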

Thesaurus/taxonomy

Could someone explain (or give an example of) how I can use taxonomy fields? I've created a simple thesaurus that contains a hierarchy (broader, narrower). Using the Solr admin panel I can query documents that contain the fields "tag_ss_taxonomy_X_ss", but how can I use those fields in search?

For example, I have 2 fields:
"tag_ss_taxonomy_0_ss":["Human", "Human", "Human", "Human"]
and
"tag_ss_taxonomy_1_ss":["Human\tPerson", "Human\tPerson", "Human\tPerson", "Human\tPerson"]
Thesaurus: Human is broader than Person; Person is narrower than Human.
My document contains the word "Person", but if I search for "Human" it does not find the document.
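
Going by the two example values, each level field appears to store the tab-separated path from the root down to that level, so a document tagged "Person" also carries "Human" in the level-0 field. Under that reading (an assumption drawn from the example, not from project documentation), a filter on the level-0 field, rather than a full-text search, would select everything below "Human". Both helpers below are hypothetical.

```python
def taxonomy_fields(path, prefix="tag_ss_taxonomy_", suffix="_ss"):
    """Map a concept path like ["Human", "Person"] to per-level field values."""
    return {
        "{}{}{}".format(prefix, level, suffix): "\t".join(path[: level + 1])
        for level in range(len(path))
    }


def taxonomy_filter(path):
    """Build a Solr filter query for documents tagged at or below the given path."""
    level = len(path) - 1
    return 'tag_ss_taxonomy_{}_ss:"{}"'.format(level, "\t".join(path))


print(taxonomy_fields(["Human", "Person"]))
print(taxonomy_filter(["Human"]))  # usable as an fq parameter of a Solr select request
```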

On deleting index reset import states

When the index is deleted with --empty, also reset the import states of the last imports, so that already imported data from RSS feeds or Hypothesis annotations will be indexed again.

Manual Tagging UI: Remove deleted tags from index

Deleted tags have to be removed from the index. This cannot yet be done by using set instead of add on the tag fields, since automatic tags would be overwritten; those will be stored in a separate field in the future.

OperationalError at /setup/

Hi All

I get a setup error when clicking on the config tab. The traceback details are below.

Environment:


Request Method: GET
Request URL: http://localhost/search-apps/setup/

Django Version: 1.10.7
Python Version: 3.5.3
Installed Applications:
('django.contrib.admin',
 'django.contrib.auth',
 'django.contrib.contenttypes',
 'django.contrib.sessions',
 'django.contrib.messages',
 'django.contrib.staticfiles',
 'setup',
 'thesaurus',
 'crawler',
 'files',
 'datasources',
 'annotate',
 'search_list',
 'csv_manager',
 'rss_manager',
 'ontologies',
 'querytagger',
 'morphology',
 'hypothesis',
 'search_entity',
 'visual_graph_explorer',
 'entity_rest_api',
 'import_export')
Installed Middleware:
('django.contrib.sessions.middleware.SessionMiddleware',
 'django.middleware.common.CommonMiddleware',
 'django.middleware.csrf.CsrfViewMiddleware',
 'django.contrib.auth.middleware.AuthenticationMiddleware',
 'django.contrib.auth.middleware.SessionAuthenticationMiddleware',
 'django.contrib.messages.middleware.MessageMiddleware',
 'django.middleware.clickjacking.XFrameOptionsMiddleware',
 'django.middleware.security.SecurityMiddleware')



Traceback:

File "/usr/lib/python3/dist-packages/django/db/backends/utils.py" in execute
  64.                 return self.cursor.execute(sql, params)

File "/usr/lib/python3/dist-packages/django/db/backends/sqlite3/base.py" in execute
  337.         return Database.Cursor.execute(self, query, params)

The above exception (no such column: setup_setup.segmentation_sentences) was the direct cause of the following exception:

File "/usr/lib/python3/dist-packages/django/core/handlers/exception.py" in inner
  42.             response = get_response(request)

File "/usr/lib/python3/dist-packages/django/core/handlers/base.py" in _legacy_get_response
  249.             response = self._get_response(request)

File "/usr/lib/python3/dist-packages/django/core/handlers/base.py" in _get_response
  187.                 response = self.process_exception_by_middleware(e, request)

File "/usr/lib/python3/dist-packages/django/core/handlers/base.py" in _get_response
  185.                 response = wrapped_callback(request, *callback_args, **callback_kwargs)

File "/var/lib/opensemanticsearch/setup/views.py" in update_setup
  473. 	setup = Setup.objects.get(pk=pk)

File "/usr/lib/python3/dist-packages/django/db/models/manager.py" in manager_method
  85.                 return getattr(self.get_queryset(), name)(*args, **kwargs)

File "/usr/lib/python3/dist-packages/django/db/models/query.py" in get
  379.         num = len(clone)

File "/usr/lib/python3/dist-packages/django/db/models/query.py" in __len__
  238.         self._fetch_all()

File "/usr/lib/python3/dist-packages/django/db/models/query.py" in _fetch_all
  1087.             self._result_cache = list(self.iterator())

File "/usr/lib/python3/dist-packages/django/db/models/query.py" in __iter__
  54.         results = compiler.execute_sql()

File "/usr/lib/python3/dist-packages/django/db/models/sql/compiler.py" in execute_sql
  835.             cursor.execute(sql, params)

File "/usr/lib/python3/dist-packages/django/db/backends/utils.py" in execute
  79.             return super(CursorDebugWrapper, self).execute(sql, params)

File "/usr/lib/python3/dist-packages/django/db/backends/utils.py" in execute
  64.                 return self.cursor.execute(sql, params)

File "/usr/lib/python3/dist-packages/django/db/utils.py" in __exit__
  94.                 six.reraise(dj_exc_type, dj_exc_value, traceback)

File "/usr/lib/python3/dist-packages/django/utils/six.py" in reraise
  685.             raise value.with_traceback(tb)

File "/usr/lib/python3/dist-packages/django/db/backends/utils.py" in execute
  64.                 return self.cursor.execute(sql, params)

File "/usr/lib/python3/dist-packages/django/db/backends/sqlite3/base.py" in execute
  337.         return Database.Cursor.execute(self, query, params)

Exception Type: OperationalError at /setup/
Exception Value: no such column: setup_setup.segmentation_sentences


Request Method: | GET
-- | --
Request URL: | http://localhost/search-apps/setup/
Django Version: | 1.10.7
Exception Type: | OperationalError
Exception Value: | no such column: setup_setup.segmentation_sentences
Exception Location: | /usr/lib/python3/dist-packages/django/db/backends/sqlite3/base.py in execute, line 337
Python Executable: | /usr/local/bin/python3
Python Version: | 3.5.3
Python Path: | ['/var/lib/opensemanticsearch',  '/usr/lib/python35.zip',  '/usr/lib/python3.5',  '/usr/lib/python3.5/plat-x86_64-linux-gnu',  '/usr/lib/python3.5/lib-dynload',  '/usr/local/lib/python3.5/dist-packages',  '/usr/lib/python3/dist-packages',  '/usr/lib/python3/dist-packages',  '/usr/lib/python3/dist-packages/opensemanticetl',  '/usr/local/lib/python3.5/dist-packages/odf',  '/usr/local/lib/python3.5/dist-packages/odf',  '/usr/local/lib/python3.5/dist-packages/odf',  '/usr/local/lib/python3.5/dist-packages/odf',  '/usr/local/lib/python3.5/dist-packages/odf',  '/usr/local/lib/python3.5/dist-packages/odf',  '/usr/local/lib/python3.5/dist-packages/odf',



Why am I facing this issue? Do I have to change any settings?
I appreciate your suggestions.
Regards
Swagat
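
The message "no such column" generally means the SQLite schema is older than the Django models, which is typically resolved by applying pending migrations (python3 manage.py migrate). Whether that is the cause here is an assumption. The sketch below shows one way to check for the column with the standard library; it runs against an in-memory database for illustration, and pointing it at the real database file is left to the reader.

```python
import sqlite3


def has_column(connection, table, column):
    """Return True if the given table has a column with that name."""
    rows = connection.execute("PRAGMA table_info({})".format(table)).fetchall()
    return any(row[1] == column for row in rows)  # row[1] holds the column name


# Illustration with an in-memory table that lacks the column from the traceback.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE setup_setup (id INTEGER PRIMARY KEY)")
print(has_column(conn, "setup_setup", "segmentation_sentences"))

conn.execute("ALTER TABLE setup_setup ADD COLUMN segmentation_sentences BOOLEAN")
print(has_column(conn, "setup_setup", "segmentation_sentences"))
```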

Upgrade to Python 3

Migrate the code from Python 2 to Python 3 so it is ready for the future and we get rid of many charset problems, because the default source and string encoding is UTF-8 instead of ASCII.

Initial database creation

Create the database with Django's manage.py syncdb instead of shipping a prefilled DB, so the package is more independent of the Linux distribution version, the database library version and file format, and the Django version.

Rest API format

I have downloaded and installed a VM on VirtualBox. The web UI works fine, so I can search through a web browser. What is the web API URL I can use in my own code?

The documentation at https://www.opensemanticsearch.org/doc/admin/rest-api says:
Search or read data
We won't reinvent all wheels, so use the Rest-API of Solr for searching or / and getting data in XML or JSON format or use Solr Client APIs to get data with your favorite programming language.

But I am new to this. Can you give me an example of the REST API URL?
Thank you so much
James
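
Since the quoted documentation points to Solr's own REST API, a search URL can be sketched as follows. The core URL is an assumption: it uses Solr's default port 8983 and the core name `opensemanticsearch` that appears in another issue on this page, and Solr inside the VM may only be reachable from the VM itself. The helper name is hypothetical.

```python
from urllib.parse import urlencode


def build_solr_search_url(query, rows=10,
                          core_url="http://localhost:8983/solr/opensemanticsearch"):
    """Build a Solr select URL that returns matching documents as JSON."""
    params = {"q": query, "rows": rows, "wt": "json"}
    return core_url + "/select?" + urlencode(params)


# The resulting URL can be opened in a browser or fetched with any HTTP client.
print(build_solr_search_url("semantic search"))
```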

Admin user name and password are not mentioned in the documentation.

Admin user name and password are not mentioned in the documentation.

Below is the transcript from the documentation.

The initial password for the Django admin interface (i.e. for adding tags that are usable for documents tagging) is live.

Tried the below password with following user names:
user
django
admin
root

Need some understanding

Hi All

First of all, congratulations on this awesome search engine tool. I was looking for an open source semantic search and came across it. I am trying to understand the workflow. I installed it on Debian, added a few websites through the Datasources section of the admin UI, and then set up the crawlers. When I type a keyword into the search, it retrieves the website link with the keywords highlighted. Is it possible to retrieve the abstract text based on a keyword search? For example, a keyword search on ScienceDirect returns the abstract along with links to the PDF/DOC files. Can this be achieved with OSS, and if so, how?

Example - http://nactem-copious.man.ac.uk/Thalia/

I appreciate your suggestions.

Tagging and annotation UI not working.

I have installed open-semantic-search_17.07.13.deb on Ubuntu 16.04.2 LTS.

After clicking the tagging & annotation link below the search result, it throws the following error.

Exception Value:	
module 'urllib' has no attribute 'quote_plus'
Exception Location:	/var/lib/opensemanticsearch/annotate/views.py in edit_annotation, line 127
Python Executable:	/usr/bin/python3
Python Version:	3.5.2

Fixed by modifying line 127 in /var/lib/opensemanticsearch/annotate/views.py to

return HttpResponseRedirect( "{}?uri={}".format( reverse('annotate:create'), urllib.parse.quote_plus( uri ) ) ) # Redirect after POST
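
The background to this fix: in Python 2, quote_plus lived directly in the urllib module, while Python 3 moved it to urllib.parse. A small standalone illustration, with a made-up URI:

```python
import urllib.parse  # a bare "import urllib" does not reliably expose urllib.parse

uri = "file:///tmp/test file.pdf"  # made-up example URI
encoded = urllib.parse.quote_plus(uri)
print(encoded)
```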

Thesaurus: Saving of hidden label fails

If there is no entry for misspellings yet and the hidden label is added through the edit mask rather than the Recommender (adding a misspelling works there), saving fails:

Error: 'HiddenForm' object has no attribute 'cleaned_data'

Web admin UI config option for showing ETL status

Option to switch on or off following new search UI feature:

Preview tab "Import & Analysis (ETL)" of search UI shows ETL status and error messages.

Listview / Search results show facet with / interactive filter for failed ETL plugins.

Commit after tagging & annotation

Do an explicit commit after tags or annotations are changed or saved, so users do not have to wait for the autocommit interval, which can take some seconds, before they can see or find the changes and new tags in the UI.
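
With Solr, an explicit commit can be requested through the `commit=true` parameter of the update handler. A minimal sketch of the URL involved, assuming the core URL below (default port and the core name used elsewhere on this page); the helper name is hypothetical.

```python
from urllib.parse import urlencode


def build_commit_url(core_url="http://localhost:8983/solr/opensemanticsearch"):
    """Build an update URL that asks Solr for an immediate commit."""
    return core_url.rstrip("/") + "/update?" + urlencode({"commit": "true"})


print(build_commit_url())
```

Using `softCommit=true` instead makes changes visible to searches without flushing them to disk, which is usually cheaper when commits are frequent.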

Absolute paths in FileField's 'upload_to' causes Django Migration failure

After installing the open-semantic-search_18.01.19.deb package on a freshly installed Ubuntu 17.10 VM, I was unable to access Datasources or Thesaurus. I tracked it back to /var/lib/opensemanticsearch/db.sqlite3 being empty because the Django migration had failed during installation.

Trying to manually run migrations resulted in:

# python3 manage.py migrate
SystemCheckError: System check identified some issues:

ERRORS:
csv_manager.CSV_Manager.file: (fields.E202) FileField's 'upload_to' argument must be a relative path, not an absolute path.
	HINT: Remove the leading slash.
ontologies.Ontologies.file: (fields.E202) FileField's 'upload_to' argument must be a relative path, not an absolute path.
	HINT: Remove the leading slash.

WARNINGS:
thesaurus.Concept.groups: (fields.W340) null has no effect on ManyToManyField.

file = models.FileField(upload_to='/var/opensemanticsearch/csv', blank = True)

file = models.FileField(upload_to='/var/opensemanticsearch/ontologies', blank = True)

Changing both of these to relative rather than absolute paths allowed the DB migration to complete and removed the UI errors when browsing to Datasources and Thesaurus.
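
Django stores uploaded files under MEDIA_ROOT joined with upload_to, so one way to keep the files in the same place is to set MEDIA_ROOT to the common parent directory and make upload_to relative to it. The sketch below derives the relative values with the standard library; the MEDIA_ROOT value is an assumption and the helper name is hypothetical.

```python
import os

MEDIA_ROOT = "/var/opensemanticsearch"  # assumed value for Django's MEDIA_ROOT setting


def relative_upload_dir(absolute_dir, media_root=MEDIA_ROOT):
    """Turn an absolute upload directory into a path relative to MEDIA_ROOT."""
    return os.path.relpath(absolute_dir, media_root)


print(relative_upload_dir("/var/opensemanticsearch/csv"))
print(relative_upload_dir("/var/opensemanticsearch/ontologies"))

# In the models this would then read, for example:
#   file = models.FileField(upload_to='csv', blank=True)
```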

Thesaurus recommender: Optimize results

Run the queries with stemming, prefixes, suffixes and lower edit distances before the query with the maximum edit distance, so false positives with a higher edit distance do not exclude better variants.
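
One possible reading of this ordering, sketched with Solr-style query syntax (`term*` for prefixes, `term~N` for edit distance): try the strictest query first and only fall back to fuzzier ones when nothing is found. The `search` callable is a hypothetical stand-in for the real Solr lookup, and stopping at the first hit is an assumption rather than the project's actual strategy.

```python
def candidate_queries(term, max_edit_distance=2):
    """Yield Solr-style queries for a term, from strictest to fuzziest."""
    yield term        # exact match (stemming, if any, is left to the field's analyzer)
    yield term + "*"  # prefix match
    yield "*" + term  # suffix match
    for distance in range(1, max_edit_distance + 1):
        yield "{}~{}".format(term, distance)  # fuzzy match with growing edit distance


def first_hits(term, search, max_edit_distance=2):
    """Return the first query that finds anything, together with its results."""
    for query in candidate_queries(term, max_edit_distance):
        results = search(query)
        if results:
            return query, results
    return None, []


print(list(candidate_queries("color")))
```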

Upgrade to Django 1.10

Migrate the views code (template rendering / request context) to Django 1.10 so it is compatible with Debian 9 Stretch and later Django releases.
