spamscope / spamscope

Fast Advanced Spam Analysis Tool

Home Page: https://pypi.python.org/pypi/SpamScope

License: Apache License 2.0

Languages: Python 95.27%, Clojure 0.19%, Shell 0.10%, Dockerfile 0.23%, Makefile 0.89%, Jinja 3.32%
Topics: ansible, ansible-playbook, apache-storm, application-security, dialect, docker, docker-image, mail-analyzer, outlook, python, security, smtp, spam-analyzer, spamscope, streamparse

spamscope's Issues

Configuration defaults will be used due to OSError

These errors appear in the spamscope_debug worker.log:

2018-10-11 08:54:22.519 phishing Thread-42 [INFO] /opt/spamscope/venv/local/lib/python2.7/site-packages/astropy/config/configuration.py:541: ConfigurationMissingWarning: Configuration defaults will be used due to OSError:Could not find unix home directory to search for astropy config dir on None
warn(ConfigurationMissingWarning(msg))

2018-10-11 08:54:22.520 phishing Thread-44 [INFO] /opt/spamscope/venv/local/lib/python2.7/site-packages/astropy/config/configuration.py:541: ConfigurationMissingWarning: Configuration defaults will be used due to OSError:Could not find unix home directory to search for astropy config dir on None
warn(ConfigurationMissingWarning(msg))

2018-10-11 08:54:22.520 phishing Thread-43 [INFO] /opt/spamscope/venv/local/lib/python2.7/site-packages/astropy/config/configuration.py:541: ConfigurationMissingWarning: Configuration defaults will be used due to OSError:Could not find unix home directory to search for astropy config dir on None
warn(ConfigurationMissingWarning(msg))

2018-10-11 08:54:22.520 phishing Thread-37 [INFO] /opt/spamscope/venv/local/lib/python2.7/site-packages/astropy/config/configuration.py:541: ConfigurationMissingWarning: Configuration defaults will be used due to OSError:Could not find unix home directory to search for astropy config dir on None
warn(ConfigurationMissingWarning(msg))

2018-10-11 08:54:22.522 phishing Thread-41 [INFO] /opt/spamscope/venv/local/lib/python2.7/site-packages/astropy/config/configuration.py:541: ConfigurationMissingWarning: Configuration defaults will be used due to OSError:Could not find unix home directory to search for astropy config dir on None
warn(ConfigurationMissingWarning(msg))

2018-10-11 08:54:22.525 phishing Thread-38 [INFO] /opt/spamscope/venv/local/lib/python2.7/site-packages/astropy/config/configuration.py:541: ConfigurationMissingWarning: Configuration defaults will be used due to OSError:Could not find unix home directory to search for astropy config dir on None
warn(ConfigurationMissingWarning(msg))

2018-10-11 08:54:23.509 phishing Thread-39 [INFO] /opt/spamscope/venv/local/lib/python2.7/site-packages/astropy/config/configuration.py:541: ConfigurationMissingWarning: Configuration defaults will be used due to OSError:Could not find unix home directory to search for astropy config dir on None
warn(ConfigurationMissingWarning(msg))

2018-10-11 08:54:23.509 phishing Thread-40 [INFO] /opt/spamscope/venv/local/lib/python2.7/site-packages/astropy/config/configuration.py:541: ConfigurationMissingWarning: Configuration defaults will be used due to OSError:Could not find unix home directory to search for astropy config dir on None
warn(ConfigurationMissingWarning(msg))

--
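
The "on None" suffix in the warning suggests that the worker process starts without a HOME environment variable, so astropy cannot locate a configuration directory and falls back to defaults. Below is a minimal sketch of one possible workaround, assuming it runs before astropy is first imported. The helper name `ensure_home` and the fallback directory are hypothetical and not part of SpamScope.

```python
import os
import tempfile


def ensure_home(environ=os.environ, fallback=None):
    """Set HOME if the process was started without one and return its value."""
    if not environ.get("HOME"):
        # Hypothetical fallback: the system temporary directory.
        environ["HOME"] = fallback or tempfile.gettempdir()
    return environ["HOME"]
```

Calling `ensure_home()` at the top of a bolt module, before any import that pulls in astropy, would be one place to try it. Setting HOME in the environment of the Storm supervisor may work equally well.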

Sender IP is always NULL

Question:

I have noticed that the sender IP is always null in the JSON output, even though it is present in the original email. Is there a way to change this, or is it expected behavior? I would like to add more lookups (other than VirusTotal and Shodan) but want to make sure I'm looking in the right place.

Sometimes it is also in the "Return-Path" header.

Thank you
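
For extra lookups outside SpamScope, candidate IP addresses can be pulled from the raw mail with the standard library alone. The sketch below is not how SpamScope derives its sender IP field. It assumes the addresses appear in square brackets inside Received headers, and the function name `candidate_sender_ips` is hypothetical.

```python
import re
from email import message_from_string

# Matches a dotted-quad IPv4 address inside square brackets, e.g. "[203.0.113.7]".
BRACKETED_IPV4 = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def candidate_sender_ips(raw_mail):
    """Return bracketed IPv4 addresses from Received headers, oldest hop first."""
    message = message_from_string(raw_mail)
    # Each relay prepends its own header, so the last one is the earliest hop.
    received = message.get_all("Received") or []
    ips = []
    for header in reversed(received):
        ips.extend(BRACKETED_IPV4.findall(header))
    return ips
```

Received headers can be forged by the sender, so only the hops added by servers you trust are reliable.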

Consider swapping out tika-app with tika-python

The Tika Python library uses the Tika REST server, which is faster than command-line calls to the Java Tika app because the REST server doesn't need to reload the Tika config and the JVM each time. In addition, you don't need to worry about the location of the Tika jar file or install it separately; the library manages all of that for you.

It looks like you would just update requirements.txt to include tika (pip install tika) and then make whatever other updates are necessary. If you want, I can send a PR.

Unsure about data input

Hi, I'm a student and I wanted to try SpamScope, as it looks comprehensive at processing mail in bulk and breaking it down. I'm using the latest version available in your Git repository.

I'm a bit confused about how to feed data into it, though. I have a Docker image running Apache Storm, and I'm using the spamscope-debug topology to store the output on the file system.

From there, I'm not sure how to get data into it. I have some email headers that I want to submit for processing. I understand it has to do with Apache Storm spouts, but I've never used Storm before, so some guidance would be appreciated. Would it be possible for it to take in a set of email files located in a folder, for example?

Thank you in advance also! :)

Serializer Exception & Pipe Broken

A fresh Docker image of SpamScope produces this error:

org.apache.storm.multilang.NoOutputException: Pipe to subprocess seems to be broken! No output read. Serializer Exception: /usr/local/lib/python2.7/dist-packages/astropy/config/configuration.py

I have tried both the debug and the debug-iter topologies, and both produce this error. When I submit a mail for analysis, it is analysed and put into /tmp/failed, but the error remains and no output is received.

Help would be appreciated on this matter.

Unable to convert Float

The value could not be converted to float. I wrapped the conversion in a try/except to work around it. For your awareness:

Traceback (most recent call last):
  File "/opt/spamscope/venv/local/lib/python2.7/site-packages/pystorm/component.py", line 488, in run
    self._run()
  File "/opt/spamscope/venv/local/lib/python2.7/site-packages/pystorm/bolt.py", line 197, in _run
    self.process(tup)
  File "/var/lib/storm/supervisor/stormdist/spamscope_debug-1-1539177259/resources/bolts/raw_mail.py", line 50, in process
    p(self.conf[p.__name__], raw_mail, mail_type, results)
  File "/var/lib/storm/supervisor/stormdist/spamscope_debug-1-1539177259/resources/modules/mails/post_processing.py", line 93, in spamassassin
    results["spamassassin"] = spamassassin[mail_type](raw_mail)
  File "/var/lib/storm/supervisor/stormdist/spamscope_debug-1-1539177259/resources/modules/mails/spamassassin_analysis.py", line 90, in report_from_file
    return obj_report(mail)
  File "/var/lib/storm/supervisor/stormdist/spamscope_debug-1-1539177259/resources/modules/mails/spamassassin_analysis.py", line 56, in obj_report
    details = convert_ascii2json(t)
  File "/var/lib/storm/supervisor/stormdist/spamscope_debug-1-1539177259/resources/modules/mails/spamassassin_analysis.py", line 141, in convert_ascii2json
    "pts": float(row[0]),
ValueError: could not convert string to float: [SPF
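
The try/except workaround described above could look roughly like the sketch below. It is not the reporter's actual patch, which is not shown in the issue, and the helper name `parse_points` is hypothetical. It returns None for tokens such as "[SPF" so the caller can skip or log the row instead of crashing the bolt.

```python
def parse_points(token):
    """Return the token as a float, or None when it is not numeric."""
    try:
        return float(token)
    except (TypeError, ValueError):
        # Non-numeric first column, e.g. a wrapped description line.
        return None
```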

Exception in phishing analysis for mail with multiple subject headers

When attempting to analyse a mail with multiple subject headers such as the one in this gist (which is sort of invalid, but may happen anyway) with the phishing bolt bolts/phishing.py, the following exception occurs:

  File "/usr/local/lib/python3.6/dist-packages/pystorm/component.py", line 488, in run
    self._run()
  File "/usr/local/lib/python3.6/dist-packages/pystorm/bolt.py", line 197, in _run
    self.process(tup)
  File "/data/supervisor/stormdist/spamscope_analysis-1-1617302134/resources/bolts/phishing.py", line 92, in process
    self._mails.pop(sha256_random))
  File "/data/supervisor/stormdist/spamscope_analysis-1-1617302134/resources/bolts/phishing.py", line 71, in _phishing
    subject_keys=self.subject_keys)
  File "/data/supervisor/stormdist/spamscope_analysis-1-1617302134/resources/modules/mails/phishing.py", line 147, in check_phishing
    if swt(subject, subject_keys):
  File "/data/supervisor/stormdist/spamscope_analysis-1-1617302134/resources/modules/utils.py", line 196, in search_words_in_text
    text = text.lower()
AttributeError: 'list' object has no attribute 'lower'

The reason is that in such a case the underlying mail-parser returns a list of all encountered subject values instead of a string:

>>> m = mailparser.parse_from_string("...")
>>> m.mail_partial.get('subject')
['195.133.49.168 e HMUth', 'Potenzmittel GRATIS testen   🔥    🔥    🔥']
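
One way to guard against this is to normalize the subject before any string method is called on it. The sketch below is a simplified stand-in, not SpamScope's `search_words_in_text`. The names `normalize_subject` and `contains_keyword` are hypothetical, and joining all subject values with a space is an assumption about the desired behavior.

```python
def normalize_subject(subject):
    """Collapse a subject that may be a string, a list of strings, or None."""
    if subject is None:
        return ""
    if isinstance(subject, (list, tuple)):
        # Join every Subject value so keywords in any of them still match.
        return " ".join(str(item) for item in subject)
    return str(subject)


def contains_keyword(subject, keywords):
    """Case-insensitive check for any keyword in the normalized subject."""
    text = normalize_subject(subject).lower()
    return any(keyword.lower() in text for keyword in keywords)
```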

Not picking up emails and index not recognized in Kibana

I have installed and configured everything using docker-compose.

I have no errors in the storm UI.

I assume the problem is that no mail is being picked up, since the example email file remains in the folder and the Kibana instance doesn't recognize the suggested index from my config file.

I have placed a raw example email (as located here) in "/mnt/mails" on my host.

My .env file looks like this:

CLUSTER_NAME=spamscope-cluster
DOCKER_MAILS_FOLDER=/mnt/mails
ELASTIC_DATA=/usr/share/elasticsearch/data
ELASTIC_MEM_LIMIT=2g
ELK_BIND_IP=127.0.0.1
ELK_TAG=5.6.3
HEAP_SIZE=1024m
HOST_MAILS_FOLDER=/mnt/mails
HOST_SPAMSCOPE_CONF=/etc/spamscope/
KIBANA_MEM_LIMIT=2g
NET_NAME=esnet
NODE_NAME=spamscope
SPAMSCOPE_BIND_IP=127.0.0.1
SPAMSCOPE_IMAGE_NAME=fmantuano/spamscope-elasticsearch
SPAMSCOPE_MEM_LIMIT=4g

Where should I start troubleshooting?

Thanks for your time.

Split actual output in: JSON mails and JSON attachments

Split result in two parts:

  • mail result with all fields except details of attachments (only hashes)
  • attachment result with all attachment details

Store only one sample per hash, and run the Tika and VirusTotal analysis only once per hash.
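
The proposed split can be sketched as follows. This is an illustration of the idea, not SpamScope's implementation. The field names "attachments" and "sha256" and the function name `split_result` are assumptions about the shape of the result.

```python
import copy


def split_result(result):
    """Split one analysis result into a mail record and attachment records."""
    mail = copy.deepcopy(result)
    attachments = mail.pop("attachments", [])
    # The mail record keeps only the hashes of its attachments.
    mail["attachments"] = [{"sha256": a["sha256"]} for a in attachments]
    # Keep one full record per hash so repeated samples are stored once.
    unique = {}
    for attachment in attachments:
        unique.setdefault(attachment["sha256"], attachment)
    return mail, list(unique.values())
```

Deduplicating by hash before the expensive steps means each distinct sample would be sent to Tika and VirusTotal once, however many mails carry it.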

Handle "Rejecting mapping update" errors in Elasticsearch bolt

Handle the "Rejecting mapping update to ..." error during bulk indexing:

2018-10-08 14:16:17.276 o.a.s.d.executor Thread-46 [ERROR]
java.lang.Exception: Shell Process Exception: Python BulkIndexError raised while processing Tuple Tuple(id=u'9016744204847506491', component=u'__system', stream=u'__tick', task=-1, values=(60,))
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/pystorm/component.py", line 488, in run
    self._run()
  File "/usr/local/lib/python2.7/dist-packages/pystorm/bolt.py", line 193, in _run
    self.process_tick(tup)
  File "/hadoop/storm/supervisor/stormdist/spamscope_elasticsearch-1-1538989492/resources/bolts/output_elasticsearch.py", line 106, in process_tick
    self.flush()
  File "/hadoop/storm/supervisor/stormdist/spamscope_elasticsearch-1-1538989492/resources/bolts/output_elasticsearch.py", line 60, in flush
    helpers.bulk(self._es, self._mails)
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch/helpers/__init__.py", line 257, in bulk
    for ok, item in streaming_bulk(client, actions, *args, **kwargs):
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch/helpers/__init__.py", line 192, in streaming_bulk
    raise_on_error, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch/helpers/__init__.py", line 137, in _process_bulk_chunk
    raise BulkIndexError('%i document(s) failed to index.' % len(errors), errors)
BulkIndexError: (u'2 document(s) failed to index.', [{u'index': {u'status': 400, u'_type': u'analysis', u'_index': u'spamscope_mails-2018.10.08', u'error': {u'reason': u'Rejecting mapping update to [spamscope_mails-2018.10.08] as the final mapping would have more than 1 type: [_doc, analysis]', u'type': u'illegal_argument_exception'}, u'_id': u'JVqbU2YBiKy7cYIvRinK', u'data': {u'return-path': u

Java errors in Storm UI after installation

I'm receiving these errors after completing the installation via Ansible. It looks like I didn't install something or missed an installation step.


java.lang.RuntimeException: org.apache.storm.multilang.NoOutputException: Pipe to subprocess seems to be broken! No output read. Serializer Exception: /opt/spamscope/venv/local/lib/python2.7/site-pack
--

Any help would be appreciated!

SpamAssassin returns empty dictionary

With certain emails, the SpamScope output shows SpamAssassin as an empty dictionary. If I run the same email through the SpamAssassin CLI with spamassassin -t, it is parsed fine.

Is there any reason why SpamScope returns an empty dictionary when it should not?

Thank you
