yelp / threat_intel
Threat Intelligence APIs
License: MIT License
IBM X-Force Exchange provides an API to access the service.
This could be easily integrated with the Threat Intel library.
Line 23 in threat_intel/opendns.py:
domains = set(domains) - set(all_responses)
makes domains a set.
Then, on line 90, simplejson.dumps(domains) raises an error because domains is a set:
raise TypeError(repr(o) + " is not JSON serializable")
TypeError: set([...
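A minimal sketch of one possible fix, assuming the payload only needs to be a JSON array: convert the set back to a list before serializing. The standard-library json module stands in for simplejson here, and the sample values are made up for illustration.

```python
import json

# Hypothetical sample data standing in for the real inputs.
domains = ["example.com", "example.org", "example.net"]
all_responses = {"example.org": {"cached": True}}

# The set difference drops the domains that already have a response...
pending = set(domains) - set(all_responses)

# ...but a set is not JSON serializable, so convert it to a sorted list first.
payload = json.dumps(sorted(pending))
```

Sorting is optional; it only makes the serialized output deterministic.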
We might have to wait till this is a public repo?
The OpenDNS Investigate API sometimes seems to return an empty response, e.g. if a malformed domain (.hk.) is given:
ConnectionError: Request to https://investigate.api.opendns.com/security/name/.hk.json had an empty response
It would make sense to throw a custom exception, i.e. OpenDnsEmptyResponseException, instead.
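As a rough sketch of what such an exception could look like: the class name comes from the suggestion above, while the ensure_not_empty helper and the message format are assumptions made for illustration, not the library's code.

```python
class OpenDnsEmptyResponseException(Exception):
    """Raised when the Investigate API returns an empty response body."""

    def __init__(self, url):
        super(OpenDnsEmptyResponseException, self).__init__(
            "Request to {0} had an empty response".format(url)
        )
        self.url = url


def ensure_not_empty(url, body):
    # Hypothetical helper: reject empty or missing bodies with the custom exception.
    if not body:
        raise OpenDnsEmptyResponseException(url)
    return body
```

Callers could then catch OpenDnsEmptyResponseException specifically instead of a generic ConnectionError.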
Let's use [email protected]
It seems that Threat Intel does not handle HTTP error responses, e.g. 403 Forbidden, very well.
They are silently ignored.
Some error handling, e.g. throwing an exception, might be a better way of dealing with these situations.
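One way this could look, as a sketch only: HttpStatusError and check_status are hypothetical names, and the status code is passed in directly so the example runs without any network access.

```python
class HttpStatusError(Exception):
    """Hypothetical exception carrying the failing status code and URL."""

    def __init__(self, status_code, url):
        super(HttpStatusError, self).__init__(
            "Request to {0} failed with HTTP {1}".format(url, status_code)
        )
        self.status_code = status_code
        self.url = url


def check_status(status_code, url):
    # Treat any 4xx or 5xx status as an error instead of silently ignoring it.
    if status_code >= 400:
        raise HttpStatusError(status_code, url)
    return status_code
```

When working with requests response objects directly, Response.raise_for_status() performs a comparable check.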
The wrapper raises a very descriptive exception, 'dang', when the response from an OpenDNS Investigate API call is empty. We could do better than that.
I have observed that when running make test in OSXCollector multiple times, the cache files used in the tests are overwritten. OSXCollector Output Filters use the cache mechanism from Threat Intel to mock responses from OpenDNS, VirusTotal etc. As a result, the values of the cache change after each test run (e.g. items in the JSON file are reordered), and so the files change as well. This is quite annoying, as it makes you think that a code change was introduced just by running the tests.
It would be good to have an option to tell ApiCache not to write the cache back to the file, based on an initialization parameter, e.g. update_cache. The default value of this parameter should be True to ensure backward compatibility.
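The proposal could be sketched roughly as follows. ApiCacheSketch is a simplified, hypothetical stand-in and not the library's ApiCache; only the update_cache parameter name comes from the suggestion above.

```python
import json
import os


class ApiCacheSketch(object):
    """Hypothetical, simplified stand-in for ApiCache (not the library's code)."""

    def __init__(self, cache_file_name, update_cache=True):
        self._cache_file_name = cache_file_name
        self._update_cache = update_cache  # True keeps the current behavior
        self._cache = {}
        if os.path.exists(cache_file_name):
            with open(cache_file_name) as cache_file:
                self._cache = json.load(cache_file)

    def set(self, key, value):
        self._cache[key] = value

    def close(self):
        # Only write back to disk when updates are enabled.
        if self._update_cache:
            with open(self._cache_file_name, "w") as cache_file:
                json.dump(self._cache, cache_file, sort_keys=True)
```

Tests would construct the cache with update_cache=False so fixture files stay untouched. Writing with sort_keys=True is an extra assumption here that would also keep the key order stable between runs.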
This is caused by an empty response appearing in the responses list.
An easy remediation would be to test whether the response is None.
This is the simplest piece of data, sometimes it's all we need, and the security call is more heavyweight
Traceback:
File "ticl/hash_checker.py", line 7, in <module>
from threat_intel.shadowserver import ShadowServerApi
File "/nail/home/marina/pg/stuff/security-tools/ticl/.tox/py/lib/python2.6/site-packages/threat_intel/shadowserver.py", line 46
all_responses = {key: val for key, val in all_responses.iteritems() if len(val) >= 2}
^
SyntaxError: invalid syntax
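The traceback points at a dict comprehension, a syntax that only exists from Python 2.7 onwards. A sketch of an equivalent expression that also parses on Python 2.6 is shown below; the sample data is made up, and .items() is used instead of .iteritems() so the snippet also runs on Python 3.

```python
# Hypothetical sample data standing in for the real responses.
all_responses = {"a": [1, 2], "b": [1], "c": [1, 2, 3]}

# dict() over a generator of pairs works on Python 2.6 and later,
# unlike the {key: val for ...} comprehension syntax added in Python 2.7.
all_responses = dict(
    (key, val) for key, val in all_responses.items() if len(val) >= 2
)
```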
Services utilizing the Threat Intel API will sometimes choke for prolonged periods of time due to poorly managed retry mechanisms. For example, there have been multiple reports of Shadowserver requests hanging for minutes on end, repeatedly issuing warnings like "Request to http://bin-test.shadowserver.org/api failed with None response." Additionally, the existing framework for retries excessively reattempts requests that failed due to client-side errors, when the focus should be on server-side issues instead.
Currently, the grequests wrapper defined in the util/http module defines its own method of handling retries, which ends up retrying any failed request with an error code of 400 or above. This means any obvious client-facing problems will suspend calls entirely until all requests have been erroneously re-issued. Note also that this is done per request batch, so although a simple fix of decreasing the value of max_retries per affected API would be sufficient, it is short-sighted in the sense that an API endpoint clearly experiencing service issues will result in Threat Intel hanging indefinitely for the entire session. In other words, what is needed is finer-grained control over which types of requests are retried, as well as management of the retry mechanism on a per-session rather than per-batch basis.
There is a clear and effective fix for both of these problems: refactoring the current retry mechanism to make better use of the underlying transport adapter. The solution is to pass a Retry() object as the max_retries argument of the transport adapter, which provides granular control over the total number of requests to issue, the types of requests to retry (i.e. based on status code), and a backoff_factor to apply between attempts, all on an individual session basis. Also, in case there is a reason to keep the existing per-batch retry mechanism, this improvement can be provided without modifying it at all. To do this while preserving backwards compatibility, the default max_retries value should be decreased to a low value (e.g. 2-3 at most) and a custom Retry() object with customizable arguments should be provided when mounting the adapter to each grequests wrapper session.
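A minimal sketch of the proposed adapter setup, assuming the requests and urllib3 packages are installed. The build_session helper, its default values, and the choice of status codes are illustrative assumptions, not the library's code.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def build_session(total=3, backoff_factor=0.5,
                  status_forcelist=(500, 502, 503, 504)):
    """Return a requests.Session whose adapters retry only on server-side errors."""
    retry = Retry(
        total=total,                        # overall cap on retries per request
        backoff_factor=backoff_factor,      # sleep grows between attempts
        status_forcelist=status_forcelist,  # 4xx codes are deliberately left out
    )
    adapter = HTTPAdapter(max_retries=retry)
    session = requests.Session()
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    return session
```

Note that Retry only retries idempotent HTTP methods on status codes by default, so POST-based endpoints would need the allowed methods adjusted; the name of that parameter differs between urllib3 versions.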
dir(threat_intel) is not showing all the public stuff in the module by default. Please add an __all__.
OpenDNS Investigate is now integrated with Threat Grid from their parent company, Cisco.
Threat Intel should also allow querying these new endpoints for file hashes and domains.
Consider adding Hudson Rock's complimentary data to receive additional intelligence about the domain, or email address that was compromised in global Infostealer attacks.
Email sample: https://cavalier.hudsonrock.com/api/json/v2/osint-tools/[email protected]
Domain sample: https://cavalier.hudsonrock.com/api/json/v2/osint-tools/search-by-domain?domain=tesla.com
Free API key and full documentation is available here: https://cavalier.hudsonrock.com/docs
Thank you.
According to the official documentation, the OpenDNS Investigate API Domain Categorization endpoint takes up to 1000 domains in a single request.
Until now, Threat Intel sent all of the domains passed to the threat_intel.opendns.categorization() method in a single POST request. This caused the request to fail and escalate to the mysterious 'dang' exception seen in #59.
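The batching itself can be sketched in a few lines. chunk_domains is a hypothetical helper name; the limit of 1000 comes from the documentation cited above.

```python
def chunk_domains(domains, max_per_request=1000):
    """Yield consecutive slices of at most max_per_request domains."""
    domains = list(domains)
    for start in range(0, len(domains), max_per_request):
        yield domains[start:start + max_per_request]
```

Each yielded slice would then be sent as its own POST request.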
For packaging to work, grequests and simplejson need to be specified in setup.py rather than in requirements-dev.txt, which only includes the test requirements.
It seems that in change #62, which introduced request splitting for the OpenDNS API, the responses were not merged back into one big dictionary following the format {<domain>: <response>, ..}. That broke the compatibility of these methods, as they no longer returned a dictionary but rather a list of dictionaries.
This was circumvented by changing the decorator method _cached_by_domain to understand the new type returned by the methods. However, a more obvious place for the fix would be not in the decorator but in the _multi_post method.
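The merge step described above could look roughly like this. merge_responses is a hypothetical helper, and skipping empty batches is an assumption about how failed batches are represented.

```python
def merge_responses(batched_responses):
    """Merge a list of {domain: response} dicts into a single dict."""
    merged = {}
    for batch in batched_responses:
        if batch:  # skip None or empty batches
            merged.update(batch)
    return merged
```

Doing this inside the method that splits the requests would keep the return type a plain dictionary for every caller.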
The output of help(threat_intel) isn't all that helpful. Some well-placed doc comments could really spruce things up.
In [2]: help(threat_intel)
Help on package threat_intel:
NAME
threat_intel - # -*- coding: utf-8 -*-
FILE
/Users/ivanlei/venv_threat_intel/lib/python2.7/site-packages/threat_intel/__init__.py
PACKAGE CONTENTS
exceptions
opendns
shadowserver
util (package)
virustotal
According to the official documentation for the OpenDNS Investigate API, the Domain Scores API endpoint is now deprecated and replaced by the Domain Status endpoint:
This endpoint has been deprecated and replaced by the Domain Status and Categorization endpoint above. Please use the Domain Status endpoint and update any API clients as quickly as possible.
Threat Intel should emit a warning when the threat_intel.opendns.domain_score() method is used, to let API users know that the endpoint is deprecated.
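A sketch of such a warning using the standard warnings module. The function below is a hypothetical stand-in with a placeholder body, not the library's method.

```python
import warnings


def domain_score(domains):
    """Hypothetical stand-in for the deprecated method; the body is a placeholder."""
    warnings.warn(
        "domain_score() uses the deprecated Domain Scores endpoint; "
        "use the Domain Status endpoint instead.",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller's line
    )
    return {}
```

Python hides DeprecationWarning in many contexts by default, so a more visible category or a log message may be preferable if the warning should reach end users.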
Following the instructions in the Packaging Python Projects documentation, let's add the long_description field to setup.py, which will link the GitHub README.
$ virtualenv venv_threat_intel
$ source venv_threat_intel/bin/activate
$ pip install threat_intel
$ ipython
WARNING: Attempting to work in a virtualenv. If you encounter problems, please install IPython inside the virtualenv.
Python 2.7.6 (default, Sep 9 2014, 15:04:36)
Type "copyright", "credits" or "license" for more information.
IPython 2.3.1 -- An enhanced Interactive Python.
? -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help -> Python's own help system.
object? -> Details about 'object', use 'object??' for extra details.
In [1]: import threat_intel
In [2]: dir(threat_intel)
Out[2]: ['__builtins__', '__doc__', '__file__', '__name__', '__package__', '__path__']
In [3]: help(threat_intel)
Help on package threat_intel:
NAME
threat_intel - # -*- coding: utf-8 -*-
FILE
/Users/ivanlei/venv_threat_intel/lib/python2.7/site-packages/threat_intel/__init__.py
PACKAGE CONTENTS
exceptions
opendns
shadowserver
virustotal
In [4]: help(threat_intel.opendns)
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-4-16955bf30c12> in <module>()
----> 1 help(threat_intel.opendns)
AttributeError: 'module' object has no attribute 'opendns'
In [5]: import threat_intel.opendns
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
<ipython-input-5-5734f8471005> in <module>()
----> 1 import threat_intel.opendns
/Users/ivanlei/venv_threat_intel/lib/python2.7/site-packages/threat_intel/opendns.py in <module>()
5 import simplejson
6
----> 7 from threat_intel.util.api_cache import ApiCache
8 from threat_intel.util.error_messages import write_error_message
9 from threat_intel.util.error_messages import write_exception
ImportError: No module named util.api_cache
In [6]: from threat_intel.opendns import *
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
<ipython-input-6-b1ce384838aa> in <module>()
----> 1 from threat_intel.opendns import *
/Users/ivanlei/venv_threat_intel/lib/python2.7/site-packages/threat_intel/opendns.py in <module>()
5 import simplejson
6
----> 7 from threat_intel.util.api_cache import ApiCache
8 from threat_intel.util.error_messages import write_error_message
9 from threat_intel.util.error_messages import write_exception
ImportError: No module named util.api_cache
In [7]: from threat_intel.opendns import InvestigateApi
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
<ipython-input-7-f147080588b0> in <module>()
----> 1 from threat_intel.opendns import InvestigateApi
/Users/ivanlei/venv_threat_intel/lib/python2.7/site-packages/threat_intel/opendns.py in <module>()
5 import simplejson
6
----> 7 from threat_intel.util.api_cache import ApiCache
8 from threat_intel.util.error_messages import write_error_message
9 from threat_intel.util.error_messages import write_exception
ImportError: No module named util.api_cache
In [8]:
GitHub flagged the following security vulnerability affecting this repository.
https://nvd.nist.gov/vuln/detail/CVE-2018-18074
CVE-2018-18074
moderate severity
Vulnerable versions: <= 2.19.1
Patched version: 2.20.0
The Requests package through 2.19.1 before 2018-09-14 for Python sends an HTTP Authorization header to an http URI upon receiving a same-hostname https-to-http redirect, which makes it easier for remote attackers to discover credentials by sniffing the network.
Let's add:
Another useful feature that Yelp's threat intel platform could provide is the ability to query for potentially masquerading domains for a given domain name pattern. The OpenDNS Umbrella Investigate API allows querying such data going back up to 30 days from the present, with up to 1000 results. Incorporating this functionality into the existing OpenDNS wrapper would give wider insight into imposter domain discovery.
This warning pops up sometimes when running the OSXCollector Analyze Filter:
requests.packages.urllib3.connectionpool: WARNING Connection pool is full, discarding connection: investigate.api.opendns.com
Not sure if it means that a particular connection is dropped and the results are never gonna be obtained for a certain domain.
In virustotal:
from threat_intel.util.config import config_get_deep
I think this might be a copy-paste error from OSXCollector, as in Threat Intel we are not using anything OSX-specific.
ShadowServer is in https://github.com/Yelp/osxcollector/tree/master/osxcollector/output_filters/shadowserver and can move here instead.
I've got python 2.7.10 and latest OpenSSL.
$ python --version
Python 2.7.10
$ python -c 'import ssl; print ssl._OPENSSL_API_VERSION'
(1, 0, 2, 3, 15)
But still I get this.
Traceback (most recent call last):
File "/Users/ivanlei/virtual_envs/osxcollector/lib/python2.7/site-packages/gevent/greenlet.py", line 327, in run
result = self._run(*self.args, **self.kwargs)
File "/Users/ivanlei/virtual_envs/osxcollector/lib/python2.7/site-packages/grequests.py", line 71, in send
self.url, **merged_kwargs)
File "/Users/ivanlei/virtual_envs/osxcollector/lib/python2.7/site-packages/requests/sessions.py", line 465, in request
resp = self.send(prep, **send_kwargs)
File "/Users/ivanlei/virtual_envs/osxcollector/lib/python2.7/site-packages/requests/sessions.py", line 573, in send
r = adapter.send(request, **kwargs)
File "/Users/ivanlei/virtual_envs/osxcollector/lib/python2.7/site-packages/requests/adapters.py", line 431, in send
raise SSLError(e, request=request)
SSLError: EOF occurred in violation of protocol (_ssl.c:590)
<Greenlet at 0x12ac6b050: <bound method AsyncRequest.send of <grequests.AsyncRequest object at 0x12b947790>>(stream=False)> failed with SSLError
I've got a branch where I've explored things a bit and I think I have a fix.
It would be very useful to incorporate Alexa rankings as an alternate source for threat intelligence. The value here is to allow incident responders to identify globally rare domains potentially worthy of further investigation. Additionally, this could be used as a filtering mechanism that could be used for whitelisting if popularity is seen as a characteristic to help identify benign domains.
From chatting with an Umbrella technical account manager, we have found out that Umbrella is proposing to incorporate a new, soon-to-be released endpoint for exposing Umbrella risk scores for domains, a non-authoritative cumulative aggregate of the security details of a particular domain. Previously, this has only been accessible through their UI. We should add this as an additional threat indicator for our Umbrella handler as it should prove invaluable to malicious domain identification.
This library can't be used with python3 code. Would be nice to support it.
Here is an example of how to do it from the PaaSTA GitHub repo.
As of May 2015, the grequests module, which allows making concurrent request calls, has been abandoned. It makes sense to move to the updated requests-futures package instead, which makes use of Python's concurrent.futures module for issuing asynchronous callables.
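To illustrate the underlying mechanism, here is a small sketch using only the standard-library concurrent.futures module. fetch is a placeholder for a real HTTP call and performs no network access; the requests-futures API itself is not shown.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch(url):
    # Placeholder for a real HTTP call (e.g. requests.get(url)); no network is used.
    return {"url": url, "status": 200}


def fetch_all(urls, max_workers=4):
    """Run fetch() concurrently and return results in the order of urls."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(fetch, urls))
```

executor.map preserves input order, which makes it straightforward to pair each response with the request that produced it.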
The _wait_for_response method of the MultiRequest class in the http.py module will try to reissue a batch of requests that is failing to complete again and again. It will also reissue the requests that already completed successfully, which seems like a waste of resources.
On consecutive retries, the method should only reissue the requests that have not completed successfully.
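The intended behavior can be sketched as follows. fetch_with_retries and the send_batch callable are hypothetical: send_batch is assumed to take a list of keys and return a {key: response} dict with None for requests that failed.

```python
def fetch_with_retries(keys, send_batch, max_retries=3):
    """Reissue only the requests that have not yet succeeded.

    Returns a (results, still_failed) tuple.
    """
    results = {}
    pending = list(keys)
    for _ in range(max_retries + 1):
        if not pending:
            break
        responses = send_batch(pending)
        for key in pending:
            if responses.get(key) is not None:
                results[key] = responses[key]
        # Keep only the keys that still have no successful response.
        pending = [key for key in pending if key not in results]
    return results, pending
```

Successful responses are kept between attempts, so each retry sends a strictly smaller batch until everything succeeds or the retry budget runs out.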