breach-tw / breach.tw
A service that can track data breaches like "Have I Been Pwned", but specific to Taiwan.
Home Page: https://breach.tw
License: MIT License
We should have a Docker Compose file that can automatically deploy the code for testing.
Cobmo list: a compiled list drawn from multiple sources
Should be:
Combo list: a compiled list drawn from multiple sources
Just move this out
https://github.com/seadog007/breach.tw/blob/admin_panel/config.json#L9
In the "technical details" section, there is a phrase about "the right to legal prosecution". Under the laws of Taiwan (R.O.C.), there is no such thing as "the right to legal prosecution". So I propose changing the wording to something like "will file a lawsuit" or "will take legal action against".
Need some example CLI clients for #15
Single file scripts are ideal
There will be a new orphan branch for this
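Below is example code for calculating the nonce in Python:
import hashlib

be = 'hash'  # prefix that the nonce is appended to
nonce = 0
diff = 5  # difficulty: number of leading hex digits that must be 'a'
while True:
    x = be + str(nonce)
    # sha1() needs bytes in Python 3, so encode the string first
    digest = hashlib.sha1(x.encode()).hexdigest()
    if digest[:diff] == 'a' * diff:
        print("done", x, nonce, digest[:diff])
        break
    nonce += 1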
After #86, pressing Enter after typing the email no longer proceeds to the next step.
This
https://github.com/seadog007/breach.tw/blob/master/js/main.js#L119
Discovered by @t510599
You could offer an API service like Have I Been Pwned and sell API keys under two plans: a free plan with, for example, 1,000 requests per month, and a premium plan with unlimited requests or, for example, 1 million requests per month.
Since the website already has APIs, we should publish them for public use, and they should require some proof of work to prevent abuse.
The system should ideally be stateless and contain two functions:
… (config.php), then distribute to users
However, stateless PoW needs some mechanism to prevent replay attacks.
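One possible way to approach the replay problem while staying mostly stateless is to have the server sign each challenge together with an expiry time and the query it is meant for. The sketch below is only an illustration under assumptions: SECRET, TTL, DIFFICULTY, and all function names are hypothetical, and the "leading 'a' characters" rule is borrowed from the nonce example elsewhere on this page. A signed, expiring challenge only narrows replay to the same query within TTL seconds; rejecting every reuse would still need some server-side record of spent challenges.

```python
import hashlib
import hmac
import os
import time

SECRET = b"server-side-secret"  # hypothetical; would come from server config
DIFFICULTY = 3                  # assumed rule: leading hex digits that must be 'a'
TTL = 120                       # assumed lifetime of a challenge, in seconds


def _sign(challenge, expires, query):
    """HMAC over the challenge, its expiry, and the query it is bound to."""
    msg = "{}|{}|{}".format(challenge, expires, query).encode()
    return hmac.new(SECRET, msg, hashlib.sha256).hexdigest()


def issue_challenge(query, now=None):
    """Server side: create a challenge without storing anything."""
    now = int(time.time()) if now is None else now
    challenge = os.urandom(8).hex()
    expires = now + TTL
    return challenge, expires, _sign(challenge, expires, query)


def solve(challenge):
    """Client side: brute-force a nonce that meets the difficulty."""
    nonce = 0
    target = "a" * DIFFICULTY
    while not hashlib.sha1((challenge + str(nonce)).encode()).hexdigest().startswith(target):
        nonce += 1
    return nonce


def verify(challenge, expires, signature, query, nonce, now=None):
    """Server side: check expiry, signature, and the proof of work."""
    now = int(time.time()) if now is None else now
    if now > expires:
        return False  # expired challenges cannot be replayed
    if not hmac.compare_digest(signature, _sign(challenge, expires, query)):
        return False  # forged, or bound to a different query
    digest = hashlib.sha1((challenge + str(nonce)).encode()).hexdigest()
    return digest.startswith("a" * DIFFICULTY)
```

Binding the signature to the query means a solved challenge cannot be reused for a different lookup, which is the main abuse case a search API would care about.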
Currently only the subscribe form verifies the user and prevents people from abusing the service.
We also need reCAPTCHA for the search forms (two of them).
There are too many separate script tags in the HTML files, which causes maintenance issues.
All JS should be in one file (something like main.js) to reduce duplicate code and make maintenance easier.
Since there are APIs for fetching the status of a task, it should be possible to show the task's progress.
/api/import/tasklist
/api/import/task?id=
This page should contain some common questions.
Please feel free to provide more questions under this issue
As the title says, it would be great to encrypt mails, since a mail might contain security-related issues.
Design a way to search by hash alone.
Users can hash their data on their own machine, leaving no room for doubt.
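As a rough illustration of what local hashing could look like, here is a minimal sketch. It assumes the record is the name concatenated with the last 6 characters of the ID and hashed with SHA-1, which is inferred from other issues on this page; the function name local_hash and the exact concatenation format are hypothetical and may not match what the site expects.

```python
import hashlib


def local_hash(name, national_id):
    """Hash name + last 6 ID characters locally; only the digest leaves the machine."""
    # Assumed record format: name immediately followed by the last 6 ID characters.
    record = name.strip() + national_id.strip().upper()[-6:]
    return hashlib.sha1(record.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # The resulting 40-character hex digest is what would be sent to the search form.
    print(local_hash("王小明", "A123456789"))
```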
As title owo)/
as title.
Maybe add a subscription status page (to edit the followed list or change the email) to the website, or add an unsubscribe link to the mail.
It should be compatible with IE 9+ (or 10+), since some users still browse our website with IE.
Add permalinks for breaches, like https://breach.tw/breaches.php#combo, so people can easily share the narrative of a breach.
Just a doc
Given a list (either a hash list or an original data list), the system should be able to import it automatically.
There are several steps:
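from hashlib import sha1
import sys

iname = sys.argv[1]
oname = iname + '_hashed'
# Write the SHA-1 hex digest of each input line to <input>_hashed.
# UTF-8 input is assumed; sha1() needs bytes in Python 3.
with open(iname, 'r', encoding='utf-8') as ifile, open(oname, 'w') as ofile:
    for line in ifile:
        line = line.rstrip()
        digest = sha1(line.encode('utf-8')).hexdigest()
        ofile.write('{0}\n'.format(digest))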
… breach_source table, which includes
  … (Text("待補"), i.e. "to be filled in")
  … (Int(0))
  … (… round_k > 5 then 1)
  … (Text(政府單位), Text(教育機構), Text(民間企業), Text(其他), i.e. government agency, educational institution, private company, other)
… source_item table, which contains
  … (… breach_source table)
  … (… breach_item table)
… breach_log table, which can be divided into a few steps
  … breach_log table
  … (… breach_source table)
LOAD DATA INFILE '/tmp/hash.txt' INTO TABLE `breach_log` CHARACTER SET utf8 FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' ESCAPED BY '"' (`hash`);
UPDATE `breach_log` SET `source`=999 WHERE `source` IS NULL;
… subscribers table
SELECT `email` FROM `subscribers` WHERE `hash` IN (SELECT `hash` FROM `breach_log` WHERE `source` = 999) AND email_verify = 1;
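The statements above target MySQL. As a self-contained illustration of the same flow (load hashes, tag them with a source id, then look up verified subscribers), here is a sketch that swaps in Python's built-in sqlite3 as a stand-in. The table and column names and the placeholder id 999 come from the statements above; the two-table schema in demo() and the function names are assumptions, and the real tables likely have more columns.

```python
import sqlite3


def import_hashes(conn, hashes, source_id=999):
    """Insert new hashes without a source, then tag the untagged rows."""
    conn.executemany("INSERT INTO breach_log (hash) VALUES (?)", [(h,) for h in hashes])
    conn.execute("UPDATE breach_log SET source = ? WHERE source IS NULL", (source_id,))


def affected_subscribers(conn, source_id=999):
    """Return verified subscriber emails whose hash appears in the given source."""
    rows = conn.execute(
        "SELECT email FROM subscribers "
        "WHERE hash IN (SELECT hash FROM breach_log WHERE source = ?) "
        "AND email_verify = 1",
        (source_id,),
    )
    return [email for (email,) in rows]


def demo():
    conn = sqlite3.connect(":memory:")
    # Assumed minimal schema, for illustration only.
    conn.execute("CREATE TABLE breach_log (hash TEXT, source INTEGER)")
    conn.execute("CREATE TABLE subscribers (email TEXT, hash TEXT, email_verify INTEGER)")
    conn.executemany(
        "INSERT INTO subscribers VALUES (?, ?, ?)",
        [("a@example.com", "h1", 1), ("b@example.com", "h2", 0), ("c@example.com", "h3", 1)],
    )
    import_hashes(conn, ["h1", "h2"])
    return affected_subscribers(conn)


if __name__ == "__main__":
    print(demo())
```

In the demo only the verified subscriber whose hash was imported is selected; the unverified one and the one with an unrelated hash are skipped.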
Question Mark Filter (question-mark.s1.js)
If there is a question mark in a line, something is missing from that line, so remove it.
CSV Comma Remover
Parse CSV to help the stage converter.
TSV Tab Remover
Parse TSV to help the stage converter.
Space Remover
Remove all kinds of weird whitespace, which is highly likely to appear within names that are only two Chinese characters long.
10 Digit ID to 6 digit (10d26d.s2.js)
Convert a 10-digit ID to its last 6 digits for hashing.
A123456789 Filter (dummy-id.s2.js)
Too much test data uses this ID.
Name English Filter (name-english-filter.s2.js)
If there is an English character in the name, filter the record out.
ID Validator (id-validate.s2.js)
Check whether the ID is valid.
Unifier
Deduplicate the data so that no record repeats.
Stage 1 to Stage 2 Converter (s1-to-s2.js)
Convert line to [name, id]
Stage 2 to Final
By hashing the name + id from stage 2
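The filters above could be chained roughly as follows. This is a sketch in Python, not the project's JavaScript implementation: the function names are hypothetical, the input is assumed to be "name,id" CSV lines, the ID check is the standard Taiwan national ID checksum (assumed to be what the validator implements), and uppercasing the ID letter is an added assumption.

```python
import hashlib
import re

# Standard letter-to-number table for the Taiwan national ID checksum (A=10 ... O=35).
LETTER_CODES = {ch: 10 + i for i, ch in enumerate("ABCDEFGHJKLMNPQRSTUVXYWZIO")}
DUMMY_ID = "A123456789"


def is_valid_id(national_id):
    """ID Validator: one letter, nine digits, weighted sum divisible by 10."""
    if not re.fullmatch(r"[A-Z][0-9]{9}", national_id):
        return False
    code = LETTER_CODES[national_id[0]]
    digits = [code // 10, code % 10] + [int(c) for c in national_id[1:]]
    weights = [1, 9, 8, 7, 6, 5, 4, 3, 2, 1, 1]
    return sum(d * w for d, w in zip(digits, weights)) % 10 == 0


def clean_record(line):
    """Return (name, id) for a usable line, or None if any filter rejects it."""
    if "?" in line:  # question mark filter
        return None
    fields = [re.sub(r"\s+", "", f) for f in line.split(",")]  # space remover
    if len(fields) != 2:
        return None
    name, national_id = fields[0], fields[1].upper()
    if re.search(r"[A-Za-z]", name):  # name English filter
        return None
    if national_id == DUMMY_ID or not is_valid_id(national_id):
        return None
    return name, national_id


def to_final(lines):
    """Deduplicate records and yield SHA-1 of name + last 6 ID digits."""
    seen = set()
    for line in lines:
        record = clean_record(line)
        if record is None or record in seen:
            continue
        seen.add(record)
        name, national_id = record
        yield hashlib.sha1((name + national_id[-6:]).encode("utf-8")).hexdigest()
```

Validation runs on the full 10-character ID before it is truncated to 6 digits, since the checksum cannot be verified afterwards.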
After #71 is published
… *{font-family:"Open Sans","Noto Sans TC","Meiryo","微軟正黑體","Microsoft JhengHei";box-sizing:border-box;letter-spacing:-0.02em}.
… Microsoft YaHei.
… (</h3> out of nowhere) or unnecessarily closed HTML tags (<input> is a void element, which does not require a closing tag) present in some parts of the PHP code.
https://github.com/breach-tw/breach.tw/blob/master/verify.php#L14
Verification only checks whether the URL with a certain query string has been visited.
Since some companies' firewalls or mail gateways check the links in emails, verification might pass even if the email address does not exist.
Fix:
Verification should happen on a page POST.
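The idea behind this fix can be sketched as a handler that treats GET as read-only and only verifies on POST. This is an illustrative Python sketch, not the project's verify.php: handle_verify, the pending token map, and the verified set are hypothetical stand-ins for whatever storage the site actually uses.

```python
import html


def handle_verify(method, token, pending, verified):
    """Return (status, body). Only a POST consumes the token and verifies the email."""
    email = pending.get(token)
    if email is None:
        return 404, "Unknown or already used token."
    if method == "GET":
        # A link scanner that merely fetches the URL lands here and changes nothing.
        form = '<form method="post"><button>Confirm {}</button></form>'
        return 200, form.format(html.escape(email))
    if method == "POST":
        # An explicit form submission marks the address as verified.
        del pending[token]
        verified.add(email)
        return 200, "Email verified."
    return 405, "Method not allowed."
```

With this split, a mail gateway that follows the link only sees the confirmation form, and the address is marked as verified once someone actually submits it.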
breaches.php should list the leaked fields for each item
Need some tags for the breaches page
Mainly in main.js
https://github.com/seadog007/breach.tw/blob/admin_panel/server/file-preprocessor.js#L83
it doesn't convert lowercase to uppercase