yelp / threat_intel


Threat Intelligence APIs

License: MIT License

Makefile 0.31% Python 99.69%

threat_intel's Issues

IBM X-Force Exchange integration

IBM X-Force Exchange provides an API to access the service.
This could be easily integrated with the Threat Intel library.

Pulsedive API Integration

Pulsedive has a lot of really useful threat intelligence data that could definitely benefit this project, and there's a free API that's really easy to integrate.

set + simplejson.dumps = error

Line 23 in threat_intel/opendns.py:
domains = set(domains) - set(all_responses)
makes domains a set.

Then, on Line 90, simplejson.dumps(domains) raises an error because domains is a set:
raise TypeError(repr(o) + " is not JSON serializable")
TypeError: set([...
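One possible fix is to convert the set to a list before serializing. The sketch below uses the standard-library json module as a stand-in for simplejson, and the sample data and the names uncached and payload are illustrative assumptions, not taken from opendns.py.

```python
import json

# Illustrative data; in opendns.py these would come from the caller and the cache.
domains = ["example.com", "yelp.com", "example.org"]
all_responses = {"yelp.com": {"status": 1}}

# The set difference yields a set, which cannot be serialized directly.
uncached = set(domains) - set(all_responses)

# Sorting into a list makes the payload serializable and deterministic.
payload = json.dumps(sorted(uncached))
print(payload)
```

Sorting is optional; list(uncached) would serialize just as well, but a stable order makes the request body reproducible.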

Add threat_intel to travis CI

We might have to wait till this is a public repo?

ConnectionError: Request to https://investigate.api.opendns.com/security/name/.hk.json had an empty response

The OpenDNS Investigate API sometimes seems to return an empty response, e.g. if a malformed domain is given (.hk.):

ConnectionError: Request to https://investigate.api.opendns.com/security/name/.hk.json had an empty response

It would make sense to throw a custom exception, i.e. OpenDnsEmptyResponseException instead.
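A minimal sketch of what such an exception could look like. Only the name OpenDnsEmptyResponseException comes from the suggestion above; the constructor argument and the check_response helper are hypothetical and not part of the library.

```python
class OpenDnsEmptyResponseException(Exception):
    """Hypothetical exception for an empty OpenDNS Investigate response."""

    def __init__(self, url):
        super().__init__("Request to {0} had an empty response".format(url))
        self.url = url


def check_response(url, body):
    # Hypothetical helper: raise the specific exception instead of a
    # generic ConnectionError when the response body is empty.
    if not body:
        raise OpenDnsEmptyResponseException(url)
    return body
```

Callers could then catch OpenDnsEmptyResponseException separately from genuine connection failures.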

Better error handling for HTTP responses

It seems that Threat Intel does not handle HTTP error responses, e.g. 403 Forbidden, very well.
They are rather ignored silently.
Some error handling, e.g. throwing an exception, might be a better way of dealing with these situations.

Better exception handling in threat_intel.opendns._cached_by_domain wrapper

The wrapper raises a very descriptive exception 'dang' when the response from OpenDNS Investigate API call is empty. We could do better than that.

In ApiCache class make call to _write_cache_to_file() in close() method optional

I have observed that when running make test in OSXCollector multiple times, the cache files used in the tests are overwritten. OSXCollector Output Filters use the cache mechanism from Threat Intel to mock responses from OpenDNS, VirusTotal etc. As a result, after each test run the contents of the cache change (e.g. items in the JSON file are reordered) and so the files change as well. This is quite annoying, as it makes you think that a code change was introduced just by running the tests.

It would be good to have an option to tell ApiCache not to write the cache back to the file, based on an initialization parameter, e.g. update_cache. The default value of this parameter should be True to ensure backward compatibility.
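The proposal could be sketched roughly as follows. This is a simplified stand-in for ApiCache, not the real class: only close(), _write_cache_to_file() and the update_cache parameter are named in this issue, and everything else (attribute names, JSON storage) is assumed for illustration.

```python
import json
import os


class ApiCache(object):
    """Simplified stand-in for the real ApiCache (illustration only)."""

    def __init__(self, cache_file_name, update_cache=True):
        self._cache_file_name = cache_file_name
        # Proposed parameter; True keeps the existing write-on-close behaviour.
        self._update_cache = update_cache
        self._cache = {}
        if os.path.exists(cache_file_name):
            with open(cache_file_name) as cache_file:
                self._cache = json.load(cache_file)

    def _write_cache_to_file(self):
        with open(self._cache_file_name, "w") as cache_file:
            json.dump(self._cache, cache_file)

    def close(self):
        # Persist the cache only when the caller asked for it.
        if self._update_cache:
            self._write_cache_to_file()
```

Tests that use the cache as a fixture would then construct it with update_cache=False and leave the file untouched.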

"AttributeError: 'NoneType' object has no attribute 'status_code'" in threat_intel/shadowserver.py", line 55, in get_bin_test

This is caused by an empty response appearing in the responses list.
An easy remediation would be to have a test against response being None.
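A minimal sketch of such a guard. FakeResponse and successful_responses are hypothetical names, with FakeResponse standing in for a real response object. The actual logic in shadowserver.py may differ.

```python
from collections import namedtuple

# Stand-in for a real response object; only status_code is needed here.
FakeResponse = namedtuple("FakeResponse", ["status_code"])


def successful_responses(responses):
    """Drop missing (None) responses before touching status_code."""
    return [r for r in responses if r is not None and r.status_code == 200]


responses = [FakeResponse(200), None, FakeResponse(500)]
print(successful_responses(responses))
```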

OpenDNS api wrapper for domain scores

This is the simplest piece of data, sometimes it's all we need, and the security call is more heavyweight

Invalid syntax in shadowserver.py

Traceback:

File "ticl/hash_checker.py", line 7, in <module>
from threat_intel.shadowserver import ShadowServerApi 
File "/nail/home/marina/pg/stuff/security-tools/ticl/.tox/py/lib/python2.6/site-packages/threat_intel/shadowserver.py", line 46
all_responses = {key: val for key, val in all_responses.iteritems() if len(val) >= 2}
^
SyntaxError: invalid syntax
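Dict comprehensions were only added in Python 2.7, which is why this line fails to parse under Python 2.6. One way to keep the same filtering while staying compatible is to pass a generator of pairs to dict(). The sample data below is illustrative, and the sketch is written for a current Python, so it uses .items() where the 2.6 code would keep .iteritems().

```python
# Illustrative data; the real values come from the ShadowServer responses.
all_responses = {"hash1": ["a", "b"], "hash2": ["a"]}

# dict() over a generator of (key, value) pairs is equivalent to the
# dict comprehension and also parses on Python 2.6.
filtered = dict((key, val) for key, val in all_responses.items() if len(val) >= 2)
print(filtered)
```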

Provide more effective and granular control over retry mechanisms

Services utilizing the Threat Intel API will sometimes choke for prolonged periods of time due to poorly managed retry mechanisms. For example, there have been multiple reports of Shadowserver requests hanging for minutes on end, repeatedly issuing warnings like "Request to http://bin-test.shadowserver.org/api failed with None response.". Additionally, the existing framework for retries excessively reattempts requests due to client-side errors when the focus should be on server-side issues instead.

Currently, the grequest wrapper defined in the util/http module implements its own retry handling, which retries any failed request with an error code of 400 or above. This means that obvious client-side problems suspend calls entirely until all requests have been needlessly re-issued. Note also that this is done per request batch, so although simply decreasing the value of max_retries per affected API would be sufficient in the short term, it is short-sighted: an API endpoint clearly experiencing service issues will still leave Threat Intel hanging indefinitely for the entire session. In other words, what is needed is finer-grained control over which types of requests are retried, and management of the retry mechanism on a per-session rather than per-batch basis.

There is a clear and effective fix for solving both these problems, which involves a refactoring of the current retry mechanism to better make use of the underlying transport adapter. The solution is to provide the max_retries argument of the alternative transport adapter a Retry() object, which provides granular control over total requests to be issued, types of requests to retry (i.e. based on status code), as well as a backoff_factor to apply between attempts, all on an individual session basis. Also, just in case there may be a reason to keep the existing per-batch retry mechanism, this improvement can be provided without modifying that at all. To do this, while providing backwards-compatibility, the default max_retries value should be decreased to a low value (e.g. 2-3 max) and a custom Retry() object should be provided with customizable arguments when mounting the adapter to each grequest wrapper session.
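The approach described above could look roughly like this. It is a sketch assuming the requests and urllib3 packages are installed. The function name make_session and the default values are illustrative, not the library's actual configuration.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def make_session(total=3, backoff_factor=0.5,
                 status_forcelist=(500, 502, 503, 504)):
    """Build a session that retries only on the listed server-side codes."""
    retry = Retry(
        total=total,                        # overall cap on retries per request
        backoff_factor=backoff_factor,      # increasing delay between attempts
        status_forcelist=status_forcelist,  # 4xx codes are deliberately absent
    )
    adapter = HTTPAdapter(max_retries=retry)
    session = requests.Session()
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    return session
```

Note that urllib3 by default retries on status codes only for idempotent methods, so POST requests would likely need additional configuration.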

Add __all__ to __init__.py

dir(threat_intel) is not showing all the public stuff in the module by default. Please add an __all__

OpenDNS Investigate Threat Grid integration

OpenDNS Investigate is now integrated with Threat Grid from their parent company, Cisco.
Threat Intel should allow also querying these new endpoints for file hashes and domains.

Hudson Rock Cybercrime Intelligence Free Integration

Consider adding Hudson Rock's complimentary data to receive additional intelligence about the domain, or email address that was compromised in global Infostealer attacks.

Email sample: https://cavalier.hudsonrock.com/api/json/v2/osint-tools/[email protected]
Domain sample: https://cavalier.hudsonrock.com/api/json/v2/osint-tools/search-by-domain?domain=tesla.com

Free API key and full documentation is available here: https://cavalier.hudsonrock.com/docs

Thank you.

Limit the number of domains sent to the OpenDNS Investigate API Domain Categorization endpoint in a single POST request

According to the official documentation, the OpenDNS Investigate API Domain Categorization endpoint takes up to 1000 domains in a single request.
So far Threat Intel has been sending all of the domains passed to the threat_intel.opendns.categorization() method in a single POST request. This caused the request to fail and escalate to a mysterious 'dang' exception seen in #59.
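Splitting the input into batches of at most 1000 domains could be sketched as below. The helper name chunked and the sample domains are hypothetical. The real fix may be structured differently.

```python
def chunked(items, size=1000):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Illustrative input: 2500 made-up domain names.
domains = ["domain{0}.example".format(i) for i in range(2500)]
batches = list(chunked(domains))
print([len(batch) for batch in batches])  # [1000, 1000, 500]
```

Each batch would then be sent in its own POST request.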

Move grequests and simplejson from requirements-dev.txt to setup.py

For packaging to work, grequests and simplejson need to be specified in setup.py rather than in requirements-dev.txt, which only includes the test requirements.

Merge all responses returned by opendns.categorization() and opendns.domain_score() methods

It seems that when change #62 introduced request splitting for the OpenDNS API, the responses were not merged back into one big dictionary following the format {<domain>: <response>, ..}. That broke the compatibility of these methods, as they no longer returned a dictionary but rather a list of dictionaries.

This was circumvented by changing the decorator method _cached_by_domain to understand the new type returned by the methods. However a more obvious place to fix it would be to keep it not in the decorator, but in the _multi_post method.
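Merging the per-batch dictionaries back into a single one could be as simple as the following sketch. The function name merge_responses and the sample data are assumptions. How _multi_post actually collects its responses is not shown in this issue.

```python
def merge_responses(responses):
    """Merge a list of {<domain>: <response>} dicts into one dict."""
    merged = {}
    for response in responses:
        if response:  # skip empty or missing batch responses
            merged.update(response)
    return merged


batch_responses = [{"yelp.com": {"status": 1}}, None, {"example.com": {"status": 0}}]
print(merge_responses(batch_responses))
```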

Improve output of help() for threat_intel

The output for help(threat_intel) isn't all that helpful. Some well placed doc comments could really spruce things up.

In [2]: help(threat_intel)
Help on package threat_intel:

NAME
    threat_intel - # -*- coding: utf-8 -*-

FILE
    /Users/ivanlei/venv_threat_intel/lib/python2.7/site-packages/threat_intel/__init__.py

PACKAGE CONTENTS
    exceptions
    opendns
    shadowserver
    util (package)
    virustotal

Domain Scores endpoint from OpenDNS Investigate API is deprecated

According to the official documentation for OpenDNS Investigate API the Domain Scores API endpoint is now deprecated and replaced by the Domain Status endpoint:

This endpoint has been deprecated and replaced by the Domain Status and Categorization endpoint above. Please use the Domain Status endpoint and update any API clients as quickly as possible.

Threat Intel should add some warning when using threat_intel.opendns.domain_score() method, to let the API users know that the endpoint is deprecated.
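One way to surface this is Python's standard warnings module. The function below is a placeholder with the same name as the library method; its body and the message wording are assumptions.

```python
import warnings


def domain_score(domains):
    """Placeholder for the deprecated call; the real method queries the API."""
    warnings.warn(
        "The OpenDNS Domain Scores endpoint is deprecated; "
        "use the Domain Status and Categorization endpoint instead.",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller's line
    )
    return {}  # the real implementation would return the API results
```

Because Python hides DeprecationWarning in many contexts by default, a more visible category or a log message may be preferable if the goal is to reach end users.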

Add long_description to setup.py

Following the instructions in Packaging Python Projects documentation let's add the long_description field to setup.py that will link the GitHub README.

Import of InvestigateApi is failing

$ virtualenv venv_threat_intel
$ source venv_threat_intel/bin/activate
$ pip install threat_intel
$ ipython
WARNING: Attempting to work in a virtualenv. If you encounter problems, please install IPython inside the virtualenv.
Python 2.7.6 (default, Sep  9 2014, 15:04:36)
Type "copyright", "credits" or "license" for more information.

IPython 2.3.1 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]: import threat_intel

In [2]: dir(threat_intel)
Out[2]: ['__builtins__', '__doc__', '__file__', '__name__', '__package__', '__path__']

In [3]: help(threat_intel)
Help on package threat_intel:

NAME
    threat_intel - # -*- coding: utf-8 -*-

FILE
    /Users/ivanlei/venv_threat_intel/lib/python2.7/site-packages/threat_intel/__init__.py

PACKAGE CONTENTS
    exceptions
    opendns
    shadowserver
    virustotal



In [4]: help(threat_intel.opendns)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-4-16955bf30c12> in <module>()
----> 1 help(threat_intel.opendns)

AttributeError: 'module' object has no attribute 'opendns'

In [5]: import threat_intel.opendns
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-5-5734f8471005> in <module>()
----> 1 import threat_intel.opendns

/Users/ivanlei/venv_threat_intel/lib/python2.7/site-packages/threat_intel/opendns.py in <module>()
      5 import simplejson
      6
----> 7 from threat_intel.util.api_cache import ApiCache
      8 from threat_intel.util.error_messages import write_error_message
      9 from threat_intel.util.error_messages import write_exception

ImportError: No module named util.api_cache

In [6]: from threat_intel.opendns import *
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-6-b1ce384838aa> in <module>()
----> 1 from threat_intel.opendns import *

/Users/ivanlei/venv_threat_intel/lib/python2.7/site-packages/threat_intel/opendns.py in <module>()
      5 import simplejson
      6
----> 7 from threat_intel.util.api_cache import ApiCache
      8 from threat_intel.util.error_messages import write_error_message
      9 from threat_intel.util.error_messages import write_exception

ImportError: No module named util.api_cache

In [7]: from threat_intel.opendns import InvestigateApi
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-7-f147080588b0> in <module>()
----> 1 from threat_intel.opendns import InvestigateApi

/Users/ivanlei/venv_threat_intel/lib/python2.7/site-packages/threat_intel/opendns.py in <module>()
      5 import simplejson
      6
----> 7 from threat_intel.util.api_cache import ApiCache
      8 from threat_intel.util.error_messages import write_error_message
      9 from threat_intel.util.error_messages import write_exception

ImportError: No module named util.api_cache

In [8]:

Upgrade requests to version 2.20.0 or later (CVE-2018-18074)

GitHub flagged the following security vulnerability affecting this repository.

https://nvd.nist.gov/vuln/detail/CVE-2018-18074

CVE-2018-18074
moderate severity
Vulnerable versions: <= 2.19.1
Patched version: 2.20.0
The Requests package through 2.19.1 before 2018-09-14 for Python sends an HTTP Authorization header to an http URI upon receiving a same-hostname https-to-http redirect, which makes it easier for remote attackers to discover credentials by sniffing the network.

Update README

Let's add:

Examples of how to use each API
Details of public vs private VT - tuning the rate limit
How to get tests to run

Include pattern searching functionality for the OpenDNS Umbrella API wrapper

Another useful feature that Yelp's threat intel platform could provide is the ability to query for potentially masquerading domains for a given domain name pattern. The OpenDNS Umbrella Investigate API actually allows querying of such data extending up to 30-days from the present with up to 1000 results. Incorporating this functionality into the existing OpenDNS wrapper would allow for wider insight on imposter domain discovery.

"Connection pool is full, discarding connection" warning

This warning pops up sometimes when running the OSXCollector Analyze Filter:

requests.packages.urllib3.connectionpool: WARNING  Connection pool is full, discarding connection: investigate.api.opendns.com

Not sure if it means that a particular connection is dropped and the results are never gonna be obtained for a certain domain.

threat_intel.util.config is missing

In virustotal:

from threat_intel.util.config import config_get_deep

Remove sitepackages=True from tox.ini

I think this might be a copy-paste error from OSXCollector, as in Threat Intel we are not using anything OSX-specific.

Add ShadowServer Support

ShadowServer is in https://github.com/Yelp/osxcollector/tree/master/osxcollector/output_filters/shadowserver and can move here instead.

SSLErrors Connecting to VirusTotal API

I've got python 2.7.10 and latest OpenSSL.

$ python --version
Python 2.7.10
$ python -c 'import ssl; print ssl._OPENSSL_API_VERSION'
(1, 0, 2, 3, 15)

But still I get this.

Traceback (most recent call last):
  File "/Users/ivanlei/virtual_envs/osxcollector/lib/python2.7/site-packages/gevent/greenlet.py", line 327, in run
    result = self._run(*self.args, **self.kwargs)
  File "/Users/ivanlei/virtual_envs/osxcollector/lib/python2.7/site-packages/grequests.py", line 71, in send
    self.url, **merged_kwargs)
  File "/Users/ivanlei/virtual_envs/osxcollector/lib/python2.7/site-packages/requests/sessions.py", line 465, in request
    resp = self.send(prep, **send_kwargs)
  File "/Users/ivanlei/virtual_envs/osxcollector/lib/python2.7/site-packages/requests/sessions.py", line 573, in send
    r = adapter.send(request, **kwargs)
  File "/Users/ivanlei/virtual_envs/osxcollector/lib/python2.7/site-packages/requests/adapters.py", line 431, in send
    raise SSLError(e, request=request)
SSLError: EOF occurred in violation of protocol (_ssl.c:590)
<Greenlet at 0x12ac6b050: <bound method AsyncRequest.send of <grequests.AsyncRequest object at 0x12b947790>>(stream=False)> failed with SSLError

I've got a branch where I've explored things a bit and I think I have a fix.

Create API module for aggregating Alexa domain rankings

It would be very useful to incorporate Alexa rankings as an alternate source for threat intelligence. The value here is to allow incident responders to identify globally rare domains potentially worthy of further investigation. Additionally, this could be used as a filtering mechanism that could be used for whitelisting if popularity is seen as a characteristic to help identify benign domains.

Incorporate new risk score endpoint to Investigate handler

From chatting with an Umbrella technical account manager, we have found out that Umbrella is proposing to incorporate a new, soon-to-be released endpoint for exposing Umbrella risk scores for domains, a non-authoritative cumulative aggregate of the security details of a particular domain. Previously, this has only been accessible through their UI. We should add this as an additional threat indicator for our Umbrella handler as it should prove invaluable to malicious domain identification.

Python 3 Support

This library can't be used with python3 code. Would be nice to support it.

Setup autodeploy to PyPI through Travis

Here is an example of how to do it from the PaaSTA GitHub repo.

Migrate from grequests to requests-futures for concurrent web requests

As of May 2015, the grequests module allowing for making concurrent request calls has been abandoned. It makes sense to move to the updated requests-futures package instead, which makes use of Python's concurrent.futures module for issuing asynchronous callables.
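To illustrate the concurrency model involved, the sketch below uses the standard-library concurrent.futures executor directly rather than requests-futures itself. The fetch function is a stub standing in for a real HTTP call, and all names here are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch(url):
    # Stub standing in for an HTTP call such as requests.get(url).
    return {"url": url, "status": 200}


def fetch_all(urls, max_workers=4):
    """Run fetch() for every URL on a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map returns results in the order of the input URLs.
        return list(executor.map(fetch, urls))


print(fetch_all(["https://example.com/a", "https://example.com/b"]))
```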

Retry only for the responses that have not finished successfully

The _wait_for_response method of the MultiRequest class in the http.py module will reissue, again and again, a batch of requests that is failing to complete. It also reissues the requests that already completed successfully, which seems like a waste of resources.

On subsequent retries, the method should reissue only the requests that have not completed successfully.
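The idea can be sketched independently of the HTTP layer. Here send is a hypothetical callable that returns a response, or None on failure, and fetch_with_retries is an illustrative name. Neither reflects the actual MultiRequest code.

```python
def fetch_with_retries(requests_to_send, send, max_retries=3):
    """Send every request, re-sending only those that have not succeeded yet."""
    responses = [None] * len(requests_to_send)
    pending = list(range(len(requests_to_send)))  # indexes still to complete
    for _ in range(max_retries):
        if not pending:
            break
        still_pending = []
        for index in pending:
            response = send(requests_to_send[index])
            if response is None:
                still_pending.append(index)  # failed, try again next round
            else:
                responses[index] = response  # succeeded, never re-sent
        pending = still_pending
    return responses
```

Requests that still fail after max_retries rounds are left as None in the result, so the caller can decide how to report them.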
