
sitereview's People

Contributors

gunstick, poorbillionaire

sitereview's Issues

Seems not to work

This site now uses anti-crawler measures, and I think the only way to crawl it is Selenium with a web driver. Even then, there is still a per-IP limit on requests within a given period.

API rate limit

Quick question: I need to query the API for about 100,000 domains. Do you know what delay is recommended between consecutive queries? I tried 5-10 seconds, but that is probably not enough, as I seem to have been blocked.
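For scale, 100,000 domains at a fixed 10-second gap is 1,000,000 seconds, or roughly 11.6 days of wall-clock time. The sketch below shows one way to pace such a run: a fixed pause between domains plus a growing pause after each failed attempt. It is a minimal sketch, not the script's own behavior. The `lookup` callable is hypothetical (anything that returns a result or raises when a request is rejected), and the delay values are placeholders, not limits documented by the service.

```python
import time


def query_with_backoff(domains, lookup, base_delay=10.0, max_delay=600.0,
                       max_attempts=5, sleep=time.sleep):
    """Call lookup(domain) for each domain, pausing between calls.

    lookup is a hypothetical callable supplied by the caller; it should
    return a result or raise an exception when the request is rejected.
    """
    results = {}
    for domain in domains:
        delay = base_delay
        for _ in range(max_attempts):
            try:
                results[domain] = lookup(domain)
                break
            except Exception:
                # Back off: double the pause after each failed attempt,
                # capped at max_delay.
                delay = min(delay * 2, max_delay)
                sleep(delay)
        else:
            # Every attempt failed; record the gap and move on.
            results[domain] = None
        # Fixed pause between consecutive domains.
        sleep(base_delay)
    return results
```

Passing `sleep` as a parameter keeps the function easy to exercise without actually waiting.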

Connection Time out

Hi,
I am getting a connection timeout while using this script. Can you help?

got an error

$ python sitereview.py http://brins.biz/
Traceback (most recent call last):
  File "sitereview.py", line 60, in <module>
    main(args.url)
  File "sitereview.py", line 42, in main
    response = s.sitereview(url)
  File "sitereview.py", line 28, in sitereview
    return json.loads(self.req.content.decode("UTF-8"))
  File "/usr/local/Cellar/python@2/2.7.15_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 339, in loads
    return _default_decoder.decode(s)
  File "/usr/local/Cellar/python@2/2.7.15_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 364, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/local/Cellar/python@2/2.7.15_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 382, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded

API seems to be down

When running the script, I get this error:

Traceback (most recent call last):
  File "sitereview.py", line 65, in <module>
    main(args.url)
  File "sitereview.py", line 47, in main
    response = s.sitereview(url)
  File "sitereview.py", line 28, in sitereview
    return json.loads(self.req.content.decode("UTF-8"))
  File "/usr/local/Cellar/python@2/2.7.14_3/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 339, in loads
    return _default_decoder.decode(s)
  File "/usr/local/Cellar/python@2/2.7.14_3/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 364, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/local/Cellar/python@2/2.7.14_3/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 382, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded

After altering the script to print the raw response, it looks like something is wrong with the endpoint:

<html><head><title>Apache Tomcat/7.0.52 (Ubuntu) - Error report</title><style><!--H1 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:22px;} H2 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:16px;} H3 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:14px;} BODY {font-family:Tahoma,Arial,sans-serif;color:black;background-color:white;} B {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;} P {font-family:Tahoma,Arial,sans-serif;background:white;color:black;font-size:12px;}A {color : black;}A.name {color : black;}HR {color : #525D76;}--></style> </head><body><h1>HTTP Status 403 - Security violation.</h1><HR size="1" noshade="noshade"><p><b>type</b> Status report</p><p><b>message</b> <u>Security violation.</u></p><p><b>description</b> <u>Access to the specified resource has been forbidden.</u></p><HR size="1" noshade="noshade"><h3>Apache Tomcat/7.0.52 (Ubuntu)</h3></body></html>

Am I using the script wrong, or is this an error with Site Review?
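The traceback above comes from handing an HTML error page to `json.loads`. A minimal sketch of a more informative failure path follows: check the HTTP status first, then report the start of the body if it still does not decode. The function name `parse_site_review_response` is an assumption made for illustration and is not part of the script.

```python
import json


def parse_site_review_response(status_code, body):
    """Return the decoded JSON object, or raise a descriptive error.

    Hypothetical helper: status_code and body stand in for the status
    and raw content of whatever HTTP response the caller received.
    """
    if status_code != 200:
        raise RuntimeError(
            "HTTP {0}: request was rejected".format(status_code))
    text = body.decode("UTF-8") if isinstance(body, bytes) else body
    try:
        return json.loads(text)
    except ValueError:
        # On Python 3, json.JSONDecodeError is a subclass of ValueError.
        raise RuntimeError(
            "Response is not JSON; it starts with: {0!r}".format(text[:60]))
```

With the 403 page shown above, this would raise on the status check instead of failing inside the JSON decoder.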

Input Multiple domains in a file

Is it possible to feed this a list of domains from a text file, maybe one per line, and have it write the results to a file?

E.g.:

URL: domain1.com | Last Time Rated/Reviewed: > 7 days | Category: Malicious Sources/Malnets

URL: http://domain2.com | Last Time Rated/Reviewed: > 7 days | Category: Malicious Sources/Malnets

URL: http://domain3.com | Last Time Rated/Reviewed: > 7 days | Category: Malicious Sources/Malnets
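One way the requested batch mode could look is sketched below. It reads one domain per line, skips blank lines, and writes one result line per domain in the format from the example above. The `lookup` callable is hypothetical: it is assumed to return a `(last_rated, category)` pair for a domain, which is not something the script is confirmed to expose.

```python
def review_domains(in_path, out_path, lookup):
    """Read one domain per line from in_path, write results to out_path.

    lookup is a hypothetical callable returning (last_rated, category).
    Returns the number of domains processed.
    """
    count = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            domain = line.strip()
            if not domain:
                continue  # ignore blank lines
            last_rated, category = lookup(domain)
            dst.write(
                "URL: {0} | Last Time Rated/Reviewed: {1} | Category: {2}\n"
                .format(domain, last_rated, category))
            count += 1
    return count
```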

xml.etree.ElementTree.ParseError: mismatched tag: line 10, column 10

Hi there,

Since yesterday, the script no longer works and generates this error.
It looks like the structure of the response might have changed:

Traceback (most recent call last):
  File "sitereview.py", line 61, in <module>
    main(args.url)
  File "sitereview.py", line 43, in main
    s.check_response(response)
  File "sitereview.py", line 28, in check_response
    root = ET.fromstring(self.req.content)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1311, in XML
    parser.feed(text)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1653, in feed
    self._raiseerror(v)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1517, in _raiseerror
    raise err
xml.etree.ElementTree.ParseError: mismatched tag: line 10, column 10

is the api broken?
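A `ParseError` like the one above is what `ElementTree` raises when it is given markup that is not well-formed XML, such as an HTML page. A minimal sketch of catching it follows, so the caller can inspect the raw response instead of crashing. The helper name `parse_xml_or_none` is an assumption for illustration.

```python
import xml.etree.ElementTree as ET


def parse_xml_or_none(content):
    """Return the parsed root element, or None if content is not XML.

    Hypothetical helper; the caller decides how to report a None result,
    for example by printing the raw response body.
    """
    try:
        return ET.fromstring(content)
    except ET.ParseError:
        return None
```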

I get this response, which says that automated use violates the terms of service:

<!DOCTYPE html>
<html lang="en">
    <head>
        <meta charset="UTF-8">
        <title>Site Review Acceptable Use Information</title>
        <script type="text/javascript" src="/analytics.js"></script>
        <link type="text/css" rel="stylesheet" media="all" href="/css/legacy.css" />
        <noscript>
            <meta HTTP-EQUIV="REFRESH" content="0; url=noJavascript.jsp">
        </noscript>
        <meta name="viewport" content="width=device-width, initial-scale=1">
        <meta http-equiv="content-type" content="text/html; charset=UTF-8">
        <meta charset="utf-8">

        <style id="antiClickjack">body{display:none !important;}</style>

        <style>
            section.site-review-content p {
                margin-bottom: 1em;
                color: #454545;
            }

            section.site-review-content h1 {
                margin-bottom: 1em;
            }

            section.site-review-content hr {
                border-color: black;
            }

            section.important-note {
                margin-left: 4em;
                margin-right: 4em;
                margin-bottom: 2em;
                background-color: white;
                border: solid 2px #fdbb30;
                padding: 0 1em;
            }

            section.important-note h3 {
                color: black;
                font-weight: bold;
                margin-top: 1em;
            }
        </style>
    </head>
    <body class="sym-theme">
        <div class="header-background" style="padding-bottom: 1em">
            <div class="container header_container">
                <div class="header_master">
                    <div class="app-logo"></div>
                </div>
            </div>
        </div>

        <section class="site-review-content b-content">
            <div class="center_section container">
                <h1>Site Review Acceptable Use Information</h1>
                <p>
                    It appears you are using Site Review in an automated fashion, which violates our <a href="https://www.symantec.com/about/legal/blue-coat-legal-archive/website-terms-of-use">Terms
                    of Use</a> and can result in loss of access to the service.
                </p>
                <p>
                    Please contact your Symantec representative for other options.
                </p>

            </div>
        </section>

        <div class="footer-background">
            <div class="footer_container container">
                <div class="footer_master">
                    <div class="copyright">
                        Copyright &copy; 1995-<span>2019</span> Symantec Corporation
                    </div>
                </div>
            </div>
        </div>

        <script type="text/javascript">
            if(self == top) {
                var antiClickjack = document.getElementById("antiClickjack");
                antiClickjack.parentNode.removeChild(antiClickjack);
            } else {
                top.location = self.location;
            }
        </script>
    </body>
</html>

Global variable not defined

sitereview.py https://www.google.com
Traceback (most recent call last):
  File "/usr/local/bin/sitereview.py", line 4, in <module>
    __import__('pkg_resources').run_script('sitereview==2.0', 'sitereview.py')
  File "/usr/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 739, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/usr/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 1501, in run_script
    exec(script_code, namespace, namespace)
  File "/usr/local/lib/python2.7/dist-packages/sitereview-2.0-py2.7.egg/EGG-INFO/scripts/sitereview.py", line 61, in <module>
  File "/usr/local/lib/python2.7/dist-packages/sitereview-2.0-py2.7.egg/EGG-INFO/scripts/sitereview.py", line 43, in main
  File "/usr/local/lib/python2.7/dist-packages/sitereview-2.0-py2.7.egg/EGG-INFO/scripts/sitereview.py", line 26, in check_response
NameError: global name 'req' is not defined

req called instead of self.req

Hey PoorBillionaire,

I was just playing with this and noticed that the if statement in the check_response function reads req.status_code. It needs to be self.req.status_code.

Just letting you know,

Thanks for the script!
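The difference can be seen in a stripped-down, hypothetical class (not the script's actual code): inside a method, a bare `req` is looked up as a local or global name, while the attribute stored on the instance has to be reached through `self`.

```python
class SiteReview(object):
    """Hypothetical minimal class illustrating the fix."""

    def __init__(self):
        # Assumed to be set by whichever method performs the request.
        self.req = None

    def check_response(self):
        # self.req is the instance attribute; a bare `req` here would
        # raise NameError because no such local or global name exists.
        if self.req.status_code != 200:
            raise RuntimeError("HTTP {0}".format(self.req.status_code))
        return True
```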

Error line 49: invalid syntax

When I try to run the code, it returns the following error whatever input I give it:

File "C:\Users\USER\Desktop\sitereview-master\sitereview.py", line 49
print "\n{0}\n{1}\n{0}\n".format(border, "Blue Coat Site Review")
^
SyntaxError: invalid syntax

Am I doing something wrong, or is there an error in the code?
PS: I have Python 3.6.2.
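The error comes from the Python 2 `print` statement, which Python 3 removed in favor of the `print()` function. A minimal sketch of the same line written so that Python 3 accepts it is shown below. The `banner` helper and the `border` value are assumptions for illustration, since the script's own definition of `border` is not shown in the error.

```python
def banner(border, title):
    """Build the header text that line 49 of the script prints."""
    return "\n{0}\n{1}\n{0}\n".format(border, title)


# border is a placeholder value; the script defines its own.
border = "=" * 41
print(banner(border, "Blue Coat Site Review"))
```

On Python 2, the call form also works if `from __future__ import print_function` is placed at the top of the file.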

Correct README

Update the README to reflect Symantec ownership and to describe the new API more accurately.
