
breach-tw / breach.tw


A service that tracks data breaches, like "Have I Been Pwned", but specific to Taiwan.

Home Page: https://breach.tw

License: MIT License

PHP 72.70% Hack 0.26% JavaScript 12.13% TSQL 10.24% Dockerfile 0.12% CSS 3.58% HTML 0.97%
breach breaches data-breach data-breaches experian-identityworks haveibeenpwned infosec osint taiwan web-security webservice

breach.tw's People

Contributors

coin3x, daisuke1230, gnehs, koru1130, lekoowo, ototot, sea-n, seadog007, t510599, yu-hsin-chen


breach.tw's Issues

Dockers for debugging

We should have a Docker Compose file that can automatically deploy the code for testing.

  • MySQL (needs the schema imported)
  • PHP & Nginx

Task Progress and Status

Tasks show no progress or status, so the user never knows whether a task has finished.
The status should also include the input line count and the output line count.

[Screenshot, 2019-10-13 3:57:47 PM]

Some suggestion for about page

In the "technical details" section, the description mentions "the right to legal prosecution". In the laws of Taiwan (R.O.C.), there is no such thing as "the right to legal prosecution". So I propose changing the wording to something like "will file a lawsuit" or "will take legal action against".

Need some example CLI clients for #15

We need some example CLI clients for #15.
Single-file scripts are ideal:

  • Python
  • Node.js
  • PHP
  • Shell Script

There will be a new orphan branch for this.

Below is example Python code for calculating the nonce:

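import hashlib

be = 'hash'   # challenge string
nonce = 0
diff = 5      # number of leading 'a' hex characters required

while True:
    x = be + str(nonce)
    # hashlib needs bytes, so encode the string before hashing
    digest = hashlib.sha1(x.encode()).hexdigest()
    if digest[:diff] == 'a' * diff:
        print("done", x, nonce, digest[:diff])
        break
    nonce += 1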

API keys

You could offer an API service like haveibeenpwned and sell API keys under two plans: a free plan with, for example, 1,000 requests per month, and a premium plan with unlimited requests or, for example, 1 million requests per month.

API for public, which require noCAPTCHA for now

Since the website already has APIs, we should publish them for public use, with some proof of work (PoW) required to prevent abuse.

Ideally the system should be stateless and consist of two functions:

  • Generate a challenge, sign it with the given key (written in config.php), then distribute it to users
  • Verify the challenge and the work submitted by the user

However, a stateless PoW needs some mechanism to prevent replay attacks.
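
A minimal Python sketch of the two functions, offered as one possible shape rather than the project's design. It assumes an HMAC-SHA256 signature, an expiry timestamp embedded in the challenge, and the "leading 'a' characters" SHA-1 rule from the nonce example in the CLI clients issue. SECRET_KEY, DIFFICULTY, TTL_SECONDS and all function names are hypothetical.

```python
import hashlib
import hmac
import secrets
import time

SECRET_KEY = b"change-me"  # hypothetical; the issue keeps the real key in config.php
DIFFICULTY = 3             # assumed: required number of leading 'a' hex characters
TTL_SECONDS = 120          # assumed challenge lifetime


def make_challenge(now=None):
    """Return 'prefix.expiry.signature'; nothing is stored on the server."""
    now = int(time.time() if now is None else now)
    body = f"{secrets.token_hex(8)}.{now + TTL_SECONDS}"
    sig = hmac.new(SECRET_KEY, body.encode(), hashlib.sha256).hexdigest()
    return f"{body}.{sig}"


def solve(challenge):
    """Client side: brute-force a nonce that satisfies the difficulty rule."""
    nonce = 0
    while True:
        digest = hashlib.sha1(f"{challenge}{nonce}".encode()).hexdigest()
        if digest.startswith("a" * DIFFICULTY):
            return nonce
        nonce += 1


def verify(challenge, nonce, now=None):
    """Server side: check the signature, the expiry, then the work itself."""
    now = int(time.time() if now is None else now)
    body, _, sig = challenge.rpartition(".")
    expected = hmac.new(SECRET_KEY, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return False
    expiry = int(body.split(".")[1])  # format is trusted once the signature matches
    if now > expiry:
        return False
    digest = hashlib.sha1(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("a" * DIFFICULTY)
```

The expiry only narrows the replay window. Rejecting a solved challenge that is reused before it expires would still need a small short-lived record of used challenges, so this sketch is not fully stateless on that point.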

Add reCAPTCHA for all forms

Currently only the subscribe form verifies the user to prevent abuse of the service.
We also need reCAPTCHA for the search forms (two of them).

Extract Javascript(s) into a file

There are too many separate script tags in the HTML, which makes maintenance difficult.
All JS should live in one file (something like main.js) to reduce duplicate code and make maintenance easier.

Email Verification Page Misleading

When users have already verified their email, the page shows an unknown error.
It should show something like "You have already verified this email address".
[Screenshot: photo_2019-08-03_10-19-12]

Tasks and Task page

Since there are APIs for fetching the status of a task, the task page should be able to show each task's progress.

/api/import/tasklist
/api/import/task?id=

"Frequently Asked Questions" page

This page should answer some common questions.

  • What should I do after a breach?
  • Is this a scam?
  • How do I use the API?
  • What is a hash?
  • What email address are notifications sent from?

Please feel free to suggest more questions under this issue.

Set font limit for index.php

Since the font size depends on the resolution, it becomes too large on high-resolution displays.
The original one:
[Screenshot, 2019-10-11 3:31:50 PM]

v2.0:
[Screenshot, 2019-10-11 3:32:08 PM]

search by hash

Design a way to search by hash alone.

Users can then hash their data on their own machine, so they do not have to worry about submitting the raw data.

Unsubscribe from the mailing list

As the title says.
Maybe add a subscription status page on the website (to edit the followed list or change the email address), or add an unsubscribe link to the emails.

IE Compatible

The site should be compatible with IE 9+ (or 10+), since some users still browse our website with IE.

UI Improvement

On Windows 10 at 1920x1080 resolution, you will see a blank line right beside the window in most browsers.

Example Image

Auto import script (or even a web interface)

Given a list (either a hash list or the original data list), the script should import it into the system automatically.
There are several steps:

  • (Optional) If it is the original data list, it must first be converted to a hashed list, which can be done with a simple Python script:
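from hashlib import sha1
import sys

iname = sys.argv[1]
oname = iname + '_hashed'

# Write the SHA-1 hex digest of each input line, one digest per output line
with open(iname, 'r', encoding='utf-8') as ifile, open(oname, 'w') as ofile:
    for line in ifile:
        line = line.rstrip()
        # hashlib needs bytes, so encode the line before hashing
        digest = sha1(line.encode('utf-8')).hexdigest()
        ofile.write('{0}\n'.format(digest))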
  • Add a row to the breach_source table, which includes
  1. name
  2. description (default Text("待補"), i.e. "to be filled in")
  3. round_k (default Int(0))
  4. comment
  5. time
  6. major (1 if round_k > 5)
  7. file (if the source is Google Hacking)
  8. type (currently only 4 types: Text(政府單位) "government agency", Text(教育機構) "educational institution", Text(民間企業) "private company", Text(其他) "other")
  • Add rows into source_item table, which contains
  1. source (primary id from breach_source table)
  2. item (primary id from breach_item table)
  • Import hashes into the breach_log table, which can be divided into a few steps
  1. Import the hashes into the breach_log table
  2. Update source to the specific id (primary id from the breach_source table)
    The SQL will look something like this:
LOAD DATA INFILE '/tmp/hash.txt' INTO TABLE `breach_log` CHARACTER SET utf8 FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' ESCAPED BY '"' (`hash`);
UPDATE `breach_log` SET `source`=999 WHERE `source` IS NULL;
  • Generate the email list from the subscribers table
    The SQL will look something like this:
SELECT `email` FROM `subscribers` WHERE `hash` IN (SELECT `hash` FROM `breach_log` WHERE `source` = 999) AND email_verify = 1;
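
The last two steps can be tried end to end with a small Python sketch. It uses an in-memory SQLite database in place of MySQL, and the two-table schema is reduced to the columns that appear in the SQL above. The function names and column types are assumptions for illustration only.

```python
import sqlite3


def create_schema(conn):
    # Assumed minimal schema: only the columns referenced in the issue's SQL
    conn.executescript("""
        CREATE TABLE breach_log (hash TEXT, source INTEGER);
        CREATE TABLE subscribers (email TEXT, hash TEXT, email_verify INTEGER);
    """)


def import_hashes(conn, hashes, source_id):
    """Insert the hashes, then tag every untagged row with the new source id."""
    conn.executemany("INSERT INTO breach_log (hash) VALUES (?)",
                     [(h,) for h in hashes])
    conn.execute("UPDATE breach_log SET source = ? WHERE source IS NULL",
                 (source_id,))


def emails_to_notify(conn, source_id):
    """Verified subscribers whose hash appears in the given breach source."""
    rows = conn.execute(
        "SELECT email FROM subscribers WHERE hash IN "
        "(SELECT hash FROM breach_log WHERE source = ?) "
        "AND email_verify = 1 ORDER BY email",
        (source_id,),
    )
    return [row[0] for row in rows]
```

Because the UPDATE matches every row whose source is still NULL, two imports running at the same time could tag each other's rows. Writing the source id in the INSERT itself would avoid that, at the cost of departing from the LOAD DATA step described above.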

PPS List

Filter (Validator): filters out a record.

Processor (Remover): does something to the record.

Stage 1 (Line Filter/Processor)

  • Question Mark Filter (question-mark.s1.js)
    If a line contains a question mark, something in the line is missing, so remove that line.

  • CSV Comma Remover
    Parses CSV to help the stage converter.

  • TSV Tab Remover
    Parses TSV to help the stage converter.

  • Space Remover
    Removes all kinds of weird whitespace, which is most likely to appear inside names that are only 2 Chinese characters long.
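
As a rough illustration of Stage 1, here is a Python sketch (the project's own filters are .js files, so this is not their code). Treating the full-width question mark and the zero-width space as cases to handle is an assumption, and all function names are hypothetical.

```python
import csv
import re


def question_mark_filter(line):
    """Keep a line only if it has no question mark (ASCII or full-width, assumed)."""
    return "?" not in line and "\uff1f" not in line


def remove_spaces(text):
    """Drop every whitespace character, including U+3000, plus zero-width spaces."""
    return re.sub(r"[\s\u200b]+", "", text)


def stage1(lines, delimiter=","):
    """Yield cleaned field lists; pass delimiter='\\t' for TSV input."""
    for line in lines:
        if not question_mark_filter(line):
            continue
        fields = next(csv.reader([line], delimiter=delimiter))
        yield [remove_spaces(field) for field in fields]
```

For example, `list(stage1(["王 明,A123456789"]))` gives one row with the space inside the name removed.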

Stage 2 ([name, id] Filter/Processor)

  • 10 Digit ID to 6 digit (10d26d.s2.js)
    Converts a 10-digit ID to its last 6 digits for hashing.

  • A123456789 Filter (dummy-id.s2.js)
    Too much test data uses this ID.

  • Name English Filter (name-english-filter.s2.js)
    If the name contains English characters, filter the record out.

  • ID Validator (id-validate.s2.js)
    Checks whether the ID is valid.

  • Unifier
    Unifies the data so that no record is repeated.
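
A Python sketch of how the Stage 2 steps might fit together. The checksum below is the commonly documented rule for Taiwan national ID numbers; whether id-validate.s2.js applies exactly this rule is an assumption. So is the ordering, which validates the full ID before shortening it. All names are hypothetical.

```python
import re

DUMMY_ID = "A123456789"
LETTERS = "ABCDEFGHJKLMNPQRSTUVXYWZIO"  # position + 10 is each letter's numeric code
WEIGHTS = (1, 9, 8, 7, 6, 5, 4, 3, 2, 1, 1)


def is_valid_id(id_):
    """Checksum test: one letter plus nine digits, weighted sum divisible by 10."""
    if not re.fullmatch(r"[A-Z][0-9]{9}", id_):
        return False
    code = LETTERS.index(id_[0]) + 10
    digits = [code // 10, code % 10] + [int(c) for c in id_[1:]]
    return sum(d * w for d, w in zip(digits, WEIGHTS)) % 10 == 0


def has_english(name):
    return re.search(r"[A-Za-z]", name) is not None


def stage2(records):
    """Filter (name, id) pairs, shorten the ID to 6 digits, and drop duplicates."""
    seen = set()
    for name, id_ in records:
        if id_ == DUMMY_ID or has_english(name) or not is_valid_id(id_):
            continue
        record = (name, id_[-6:])
        if record not in seen:
            seen.add(record)
            yield record
```

A123456789 passes the checksum, which may be why so much test data uses it and why it needs its own filter.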

Stage Converter

  • Stage 1 to Stage 2 Converter (s1-to-s2.js)
    Converts a line to [name, id].

  • Stage 2 to Final
    Hashes the name + id from stage 2.
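
The two converters could look roughly like the Python sketch below. How s1-to-s2.js actually picks the name and ID out of a line is not described here, so the rule used (first field shaped like an ID, first non-ASCII field as the name) is purely an assumption. The use of SHA-1 over the UTF-8 bytes of name + id follows the import script in the auto import issue, but is likewise assumed for this step.

```python
import hashlib
import re

ID_RE = re.compile(r"[A-Z][0-9]{9}")


def s1_to_s2(fields):
    """Pick [name, id] out of a parsed line, or None if either is missing."""
    id_ = next((f for f in fields if ID_RE.fullmatch(f)), None)
    # Assumed heuristic: the name is the first field with non-ASCII characters
    name = next((f for f in fields if f and f != id_ and not f.isascii()), None)
    if id_ is None or name is None:
        return None
    return [name, id_]


def to_final(name, short_id):
    """Hash name + id (the shortened ID from Stage 2) into a hex digest."""
    return hashlib.sha1((name + short_id).encode("utf-8")).hexdigest()
```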

Inconsistent rendering & design issues

  1. tocas.css does not seem to be fully optimized for the design that the website uses.
    • The homepage sometimes has rendering issues at unusual dimensions (depending on the dimensions, the header will either not render at all or render twice).
    • Additionally, this CSS follows a bad practice by using *{font-family:"Open Sans","Noto Sans TC","Meiryo","微軟正黑體","Microsoft JhengHei";box-sizing:border-box;letter-spacing:-0.02em}.
    • The issue is that no generic font family is set as a fallback after JhengHei. Furthermore, JhengHei may or may not be shipped in future versions of Windows, as Microsoft has begun to switch to Microsoft YaHei.
  2. There are a lot of stray closing tags (a random </h3> out of nowhere) and unnecessarily closed HTML tags (<input> is a void element, which does not require a closing tag) in some parts of the PHP code.
  3. CSS is scattered across the PHP files, which may lead to duplication and hurt readability and debuggability down the road.

Tag Support

The breaches page needs some tags.

User Interface

  • DB Schema
  • Backend Function Upgrade
  • Frontend

Admin Panel

  • DB Schema
  • Panel Backend
  • Panel Frontend
