spamscope / spamscope
Fast Advanced Spam Analysis Tool
Home Page: https://pypi.python.org/pypi/SpamScope
License: Apache License 2.0
Errors appearing under the spamscope_debug worker.log
2018-10-11 08:54:22.519 phishing Thread-42 [INFO] /opt/spamscope/venv/local/lib/python2.7/site-packages/astropy/config/configuration.py:541: ConfigurationMissingWarning: Configuration defaults will be used due to OSError:Could not find unix home directory to search for astropy config dir on None
warn(ConfigurationMissingWarning(msg))
2018-10-11 08:54:22.520 phishing Thread-44 [INFO] /opt/spamscope/venv/local/lib/python2.7/site-packages/astropy/config/configuration.py:541: ConfigurationMissingWarning: Configuration defaults will be used due to OSError:Could not find unix home directory to search for astropy config dir on None
warn(ConfigurationMissingWarning(msg))
2018-10-11 08:54:22.520 phishing Thread-43 [INFO] /opt/spamscope/venv/local/lib/python2.7/site-packages/astropy/config/configuration.py:541: ConfigurationMissingWarning: Configuration defaults will be used due to OSError:Could not find unix home directory to search for astropy config dir on None
warn(ConfigurationMissingWarning(msg))
2018-10-11 08:54:22.520 phishing Thread-37 [INFO] /opt/spamscope/venv/local/lib/python2.7/site-packages/astropy/config/configuration.py:541: ConfigurationMissingWarning: Configuration defaults will be used due to OSError:Could not find unix home directory to search for astropy config dir on None
warn(ConfigurationMissingWarning(msg))
2018-10-11 08:54:22.522 phishing Thread-41 [INFO] /opt/spamscope/venv/local/lib/python2.7/site-packages/astropy/config/configuration.py:541: ConfigurationMissingWarning: Configuration defaults will be used due to OSError:Could not find unix home directory to search for astropy config dir on None
warn(ConfigurationMissingWarning(msg))
2018-10-11 08:54:22.525 phishing Thread-38 [INFO] /opt/spamscope/venv/local/lib/python2.7/site-packages/astropy/config/configuration.py:541: ConfigurationMissingWarning: Configuration defaults will be used due to OSError:Could not find unix home directory to search for astropy config dir on None
warn(ConfigurationMissingWarning(msg))
2018-10-11 08:54:23.509 phishing Thread-39 [INFO] /opt/spamscope/venv/local/lib/python2.7/site-packages/astropy/config/configuration.py:541: ConfigurationMissingWarning: Configuration defaults will be used due to OSError:Could not find unix home directory to search for astropy config dir on None
warn(ConfigurationMissingWarning(msg))
2018-10-11 08:54:23.509 phishing Thread-40 [INFO] /opt/spamscope/venv/local/lib/python2.7/site-packages/astropy/config/configuration.py:541: ConfigurationMissingWarning: Configuration defaults will be used due to OSError:Could not find unix home directory to search for astropy config dir on None
warn(ConfigurationMissingWarning(msg))
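The warning text suggests that the worker process runs without a HOME environment variable, so astropy cannot locate a config directory. One possible workaround, sketched here under that assumption, is to give the process a HOME before astropy is imported. The helper name `ensure_home` and the temporary-directory fallback are hypothetical and not part of SpamScope.

```python
import os
import tempfile


def ensure_home(fallback=None):
    """Set HOME if the process has none and return the effective value.

    Assumption: the warning is triggered by a missing HOME variable.
    """
    if not os.environ.get("HOME"):
        # Fall back to the system temp directory unless a path is given.
        os.environ["HOME"] = fallback or tempfile.gettempdir()
    return os.environ["HOME"]


# Call this before any module that imports astropy is loaded.
ensure_home()
```

An existing HOME is left untouched, so the call is harmless when the variable is already set.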
--
Question:
I have noticed that the sender IP is always null in the JSON output, even though the sender IP is in fact present in the original email. Is there a way I can change this, or is it expected behavior? I would like to add more lookups (other than VirusTotal and Shodan) but want to make sure I'm looking in the right place.
Sometimes it is also in the "Return-Path" header.
Thank you
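As an illustration of where a sender IP usually lives, the sketch below walks the Received headers with the Python 3 standard library and returns the oldest public IPv4 address. It is not SpamScope's implementation; the function name `guess_sender_ip` and the "oldest public address wins" rule are assumptions.

```python
import email
import ipaddress
import re

# Loose IPv4 pattern; candidates are validated with ipaddress below.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def guess_sender_ip(raw_mail):
    """Return the first public IPv4 address found when walking the
    Received headers from oldest to newest, or None."""
    msg = email.message_from_string(raw_mail)
    received = msg.get_all("Received") or []
    # Each hop prepends its header, so the last one is the oldest.
    for header in reversed(received):
        for candidate in IPV4_RE.findall(header):
            try:
                ip = ipaddress.ip_address(candidate)
            except ValueError:
                continue  # not a valid address, e.g. 999.1.1.1
            if ip.is_global:
                return str(ip)
    return None
```

Private and reserved addresses are skipped on purpose, since internal relays often appear in the oldest headers.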
From issue #14.
@bm1391 asked me:
I would like to add more lookups (other than VirusTotal and Shodan) but want to make sure I'm looking in the right place.
The Tika Python library uses the REST server (which is faster than CMD line calls in Java to Tika APP since the REST server doesn't need to reload Tika config and the JVM each time). In addition you don't need to worry about the location of the Tika jar file (and install it separately). It will manage all that for you.
Looks like you would just update requirements.txt to use pip install tika, and then make whatever necessary updates. If you want I can send a PR.
Add Shodan information to the sender IP address.
Hi, I'm a student and I wanted to try SpamScope, as it looks comprehensive at processing bulk mail and breaking it down. I'm using the latest version available on your Git repository.
I'm a bit confused about how to feed data into it, though. I have a Docker image with Apache Storm running, and I'm using the spamscope-debug topology to store the output on the file system.
From there, I'm not sure how to pass data in. I have some email headers that I want to submit for processing. I understand it has to do with Apache Storm spouts, but I've never used them before, and some guidance would be appreciated! Would it be possible for it to take in a set of email files located in a folder, for example?
Thank you in advance also! :)
A fresh Docker image of SpamScope produces this error.
org.apache.storm.multilang.NoOutputException: Pipe to subprocess seems to be broken! No output read. Serializer Exception: /usr/local/lib/python2.7/dist-packages/astropy/config/configuration.py
I have already tried both the debug and debug-iter topologies, and both produce this error. When I submit a mail for analysis, it is processed and moved to /tmp/failed, and the error persists. No output is received.
Help would be appreciated on this matter.
Unable to convert the value to float. I added a try and except around it to fix it. For your awareness.
Traceback (most recent call last):
File "/opt/spamscope/venv/local/lib/python2.7/site-packages/pystorm/component.py", line 488, in run
self._run()
File "/opt/spamscope/venv/local/lib/python2.7/site-packages/pystorm/bolt.py", line 197, in _run
self.process(tup)
File "/var/lib/storm/supervisor/stormdist/spamscope_debug-1-1539177259/resources/bolts/raw_mail.py", line 50, in process
p(self.conf[p.__name__], raw_mail, mail_type, results)
File "/var/lib/storm/supervisor/stormdist/spamscope_debug-1-1539177259/resources/modules/mails/post_processing.py", line 93, in spamassassin
results["spamassassin"] = spamassassin[mail_type](raw_mail)
File "/var/lib/storm/supervisor/stormdist/spamscope_debug-1-1539177259/resources/modules/mails/spamassassin_analysis.py", line 90, in report_from_file
return obj_report(mail)
File "/var/lib/storm/supervisor/stormdist/spamscope_debug-1-1539177259/resources/modules/mails/spamassassin_analysis.py", line 56, in obj_report
details = convert_ascii2json(t)
File "/var/lib/storm/supervisor/stormdist/spamscope_debug-1-1539177259/resources/modules/mails/spamassassin_analysis.py", line 141, in convert_ascii2json
"pts": float(row[0]),
ValueError: could not convert string to float: [SPF
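The try/except workaround mentioned above could look roughly like the following. This is only a sketch: the helper names `parse_points` and `rows_to_details`, and the assumed row layout (score in the first cell, rule name in the second), are hypothetical and do not reproduce the actual `convert_ascii2json` code.

```python
def parse_points(value):
    """Convert a SpamAssassin score cell to float, or None if not numeric."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def rows_to_details(rows):
    """Keep only rows whose first cell is a numeric score.

    Assumption: each row is a list like [score, rule_name, ...].
    """
    details = []
    for row in rows:
        if not row:
            continue
        pts = parse_points(row[0])
        if pts is None:
            # Probably a wrapped description line such as "[SPF ...".
            continue
        details.append({"pts": pts, "rule": row[1] if len(row) > 1 else ""})
    return details
```

Skipping non-numeric rows keeps the bolt alive; whether such rows should instead be appended to the previous rule's description is a separate design decision.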
When attempting to analyse a mail with multiple subject headers, such as the one in this gist (which is sort of invalid, but may happen anyway), with the phishing bolt bolts/phishing.py, the following exception occurs:
File "/usr/local/lib/python3.6/dist-packages/pystorm/component.py", line 488, in run
self._run()
File "/usr/local/lib/python3.6/dist-packages/pystorm/bolt.py", line 197, in _run
self.process(tup)
File "/data/supervisor/stormdist/spamscope_analysis-1-1617302134/resources/bolts/phishing.py", line 92, in process
self._mails.pop(sha256_random))
File "/data/supervisor/stormdist/spamscope_analysis-1-1617302134/resources/bolts/phishing.py", line 71, in _phishing
subject_keys=self.subject_keys)
File "/data/supervisor/stormdist/spamscope_analysis-1-1617302134/resources/modules/mails/phishing.py", line 147, in check_phishing
if swt(subject, subject_keys):
File "/data/supervisor/stormdist/spamscope_analysis-1-1617302134/resources/modules/utils.py", line 196, in search_words_in_text
text = text.lower()
AttributeError: 'list' object has no attribute 'lower'
The reason here being that in such a case the underlying mail-parser returns a list with all encountered subject values instead of a string:
>>> m = mailparser.parse_from_string("...")
>>> m.mail_partial.get('subject')
['195.133.49.168 e HMUth', 'Potenzmittel GRATIS testen ๐ฅ ๐ฅ ๐ฅ']
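One way to guard against this, sketched here as a suggestion rather than the project's actual fix, is to collapse the subject into a single string before any keyword matching. The helper name `subject_as_text` is hypothetical.

```python
def subject_as_text(subject):
    """Collapse a subject that may be None, a string, or a list of
    strings into one lowercase string."""
    if subject is None:
        return ""
    if isinstance(subject, (list, tuple)):
        # Multiple Subject headers: join them so every value is searched.
        subject = " ".join(str(part) for part in subject)
    return str(subject).lower()
```

Joining the values, rather than picking the first one, means a keyword in any of the duplicate headers is still found.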
I have installed and configured everything using docker-compose.
I have no errors in the storm UI.
I assume the problem is that no mail is being picked up, as the example email file remains in the folder, and the Kibana instance doesn't recognize the suggested index from my config file.
I have placed a raw email example (the one located here) in "/mnt/mails" on my host.
My .env file looks like this:
CLUSTER_NAME=spamscope-cluster
DOCKER_MAILS_FOLDER=/mnt/mails
ELASTIC_DATA=/usr/share/elasticsearch/data
ELASTIC_MEM_LIMIT=2g
ELK_BIND_IP=127.0.0.1
ELK_TAG=5.6.3
HEAP_SIZE=1024m
HOST_MAILS_FOLDER=/mnt/mails
HOST_SPAMSCOPE_CONF=/etc/spamscope/
KIBANA_MEM_LIMIT=2g
NET_NAME=esnet
NODE_NAME=spamscope
SPAMSCOPE_BIND_IP=127.0.0.1
SPAMSCOPE_IMAGE_NAME=fmantuano/spamscope-elasticsearch
SPAMSCOPE_MEM_LIMIT=4g
Where should I start troubleshooting?
Thanks for your time.
I noticed my issue was deleted, is this library not supported anymore?
Split the result into two parts:
Store only one sample per hash, and attach the Tika and VirusTotal analysis only once per hash.
Manage "Rejecting mapping update to" in bulk indexing:
2018-10-08 14:16:17.276 o.a.s.d.executor Thread-46 [ERROR]
java.lang.Exception: Shell Process Exception: Python BulkIndexError raised while processing Tuple Tuple(id=u'9016744204847506491', component=u'__system', stream=u'__tick', task=-1, values=(60,))
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/pystorm/component.py", line 488, in run
self._run()
File "/usr/local/lib/python2.7/dist-packages/pystorm/bolt.py", line 193, in _run
self.process_tick(tup)
File "/hadoop/storm/supervisor/stormdist/spamscope_elasticsearch-1-1538989492/resources/bolts/output_elasticsearch.py", line 106, in process_tick
self.flush()
File "/hadoop/storm/supervisor/stormdist/spamscope_elasticsearch-1-1538989492/resources/bolts/output_elasticsearch.py", line 60, in flush
helpers.bulk(self._es, self._mails)
File "/usr/local/lib/python2.7/dist-packages/elasticsearch/helpers/__init__.py", line 257, in bulk
for ok, item in streaming_bulk(client, actions, *args, **kwargs):
File "/usr/local/lib/python2.7/dist-packages/elasticsearch/helpers/__init__.py", line 192, in streaming_bulk
raise_on_error, *args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/elasticsearch/helpers/__init__.py", line 137, in _process_bulk_chunk
raise BulkIndexError('%i document(s) failed to index.' % len(errors), errors)
BulkIndexError: (u'2 document(s) failed to index.', [{u'index': {u'status': 400, u'_type': u'analysis', u'_index': u'spamscope_mails-2018.10.08', u'error': {u'reason': u'Rejecting mapping update to [spamscope_mails-2018.10.08] as the final mapping would have more than 1 type: [_doc, analysis]', u'type': u'illegal_argument_exception'}, u'_id': u'JVqbU2YBiKy7cYIvRinK', u'data': {u'return-path': u
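The error message indicates that the index already holds documents of type `_doc` while the bolt sends type `analysis`, and this Elasticsearch version allows only one type per index. A minimal sketch of one way to handle it, assuming the bulk actions are plain dicts, is to force a single `_type` on every action before they reach `helpers.bulk`. The helper name `with_single_doc_type` and the default value are hypothetical.

```python
def with_single_doc_type(actions, doc_type="_doc"):
    """Yield copies of bulk actions that all carry the same _type.

    Assumption: each action is a dict as passed to helpers.bulk.
    """
    for action in actions:
        fixed = dict(action)  # shallow copy, the input is left untouched
        fixed["_type"] = doc_type
        yield fixed
```

The generator would then wrap the mail list in the flush step, for example `helpers.bulk(self._es, with_single_doc_type(self._mails))`. Which type name is correct depends on the mapping or template already applied to the index.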
New sample processing module for oletools.
I am receiving these errors after completing the installation via Ansible. It looks like I didn't install something or missed a step in the installation.
java.lang.RuntimeException: org.apache.storm.multilang.NoOutputException: Pipe to subprocess seems to be broken! No output read. Serializer Exception: /opt/spamscope/venv/local/lib/python2.7/site-pack
--
Any help would be appreciated!
Change the json and phishing bolts following the instructions in this streamparse issue.
With certain emails, the output of SpamScope shows SpamAssassin as an empty dictionary. If I run the same email through the SpamAssassin CLI with spamassassin -t, it is parsed fine.
Is there any reason why SpamScope returns an empty dictionary when it should not?
Thank you