
The Bot Detector Plugin Core Files


How does it work?

The project is broken into eight separate pieces:

  • API ← You are here
  • Database
  • Hiscore scraper
  • Machine Learning (ML)
  • Discord Bot
  • Twitter Bot
  • Website
  • Plugin

The API (core files) links all components with the database.


How can I request a new feature or report a bug?

To request a new feature or report a bug, you should open an issue on GitHub. This way we can track new and interesting features recommended by users and developers of the plugin.

How can I join the plugin community?

If you would like to join our community, get involved in development, join our clan, participate in events, and more -- you can join us on our Discord!

Can I get involved with development?

Yes. We're always welcoming new talent to the team. Many new faces like to join our Discord for a bit of guidance; however, if that's not your cup of tea -- we've listed below all of the steps necessary to set up a development environment and to help contribute to banning bots:

Core Files Setup

This guide will take you through the necessary steps to start contributing to the server-side components. This includes the following repositories: Bot-Detector-Core-Files and bot-detector-mysql.

You can find other relevant repositories on our organization's GitHub.

Install:

Set up pre-commit

pre-commit is used to format your code before you commit it.

pre-commit --version
pre-commit install

pre-commit runs automatically when you commit. If it does not, you can run it manually on all files:

pre-commit run --all-files
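
If the pre-commit command is not available yet, it can typically be installed with pip (a minimal sketch, assuming a standard Python environment; the project does not prescribe a specific install method here):

# install the pre-commit tool into your Python environment
pip install pre-commit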

Setup:

  1. Open a terminal (cmd).
  2. Navigate (cd) to where you want to save our code.
  3. The command below will create a folder bot-detector with two subfolders, remote and local, and clone the remote repositories into the remote folder.
    • To add the repositories in GitHub Desktop, select File in the top left, then click Add local repository, and navigate to the cloned repositories.

Windows

mkdir bot-detector\remote bot-detector\local && cd bot-detector\remote
git clone https://github.com/Bot-detector/Bot-Detector-Core-Files.git
git clone https://github.com/Bot-detector/bot-detector-mysql.git

Linux

mkdir -p bot-detector/{remote,local} && cd bot-detector/remote
git clone https://github.com/Bot-detector/Bot-Detector-Core-Files.git
git clone https://github.com/Bot-detector/bot-detector-mysql.git
  4. Now you can start the project. The command below will create the necessary Docker containers; the first time might take a couple of minutes. Make sure Docker Desktop is running!
cd Bot-Detector-Core-Files
docker-compose up --build
  5. In the terminal you will now see /usr/sbin/mysqld: ready for connections., which means the database is ready.
  6. Test the APIs:
    • Core API: http://localhost:5000/

Adding /docs at the end will return the Swagger documentation for the component: http://localhost:5000/docs
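
As a quick sanity check, you can also hit the root endpoint from another terminal (a minimal sketch, assuming curl is installed and the containers expose port 5000 as configured above):

# the API root should answer once the containers are up
curl http://localhost:5000/
# the interactive Swagger documentation is served at
# http://localhost:5000/docs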

What contributions are needed?

Features and bugs are documented as issues in each repository. The project owners review these and select some as part of a GitHub project.

Merging your changes

Changes to the project will have to be submitted through the process of Merge Requests (Pull Requests). GitHub has good documentation outlining this process and how it works, but to summarize it briefly:

  1. Go to our repository and click Fork.
  2. Clone your newly created repository to your local machine (into the bot-detector\local folder)
  3. Make your local changes. Test. Commit. And push to your own repo
  4. Open a Merge Request

The Development Workflow:

  1. Make sure you are working in your fork. This will be a copy of the repository.
    • In GitHub Desktop, in the top left, you can click Current repository and select the repository under your name.
  2. Create a branch with a descriptive name related to the issue.
    • In GitHub Desktop, at the top, click Branch (or Current branch), then New branch.
  3. Publish your branch.
    • In GitHub Desktop, click the blue Publish branch button in the middle of your screen.
  4. Create your commits (changes).
    • Small commits with a defined scope are preferred.
    • Descriptive commit messages are desired.
  5. Push your commits.
  6. Create a Pull Request (PR).
    • In GitHub Desktop, click the blue Create Pull Request button in the middle of your screen.
    • This will open your browser; make sure the base repository is Bot-detector/ and the base branch is develop.

What are the coding standards?

General

Code must be easy to understand for those reviewing it. Please add comments where necessary if you find that the method used may be difficult to decipher in the future.

Linting

Code must be linted prior to merging. We use black.

Tests

Tests must be written where applicable.

Naming conventions

  • File: snake_case
  • Class: camelCase
  • Function: snake_case
  • Variable: snake_case
  • Table: camelCase
  • Route: kebab-case

How is my code approved?

We have automated workflows set up for assigning approvers based on their knowledge of each repository; this person will be the owner of the Issue/Merge Request. If we have not reviewed your pull request within a 24-hour period, please notify us via our Discord or on GitHub.


bot-detector-core-files's Issues

Write test cases for routes

As a system, I need to test whether my routes work.

We should write test cases for our routes to verify that each release is consistent and works according to the requirements.
Test cases can be written in the tests folder.
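
A minimal sketch of such a test, assuming the FastAPI application object is importable as api.app.app and that the root route answers with a 200 (both the module path and the expected status are assumptions; adjust them to the actual code):

# tests/test_routes.py -- illustrative sketch, module path is an assumption
from fastapi.testclient import TestClient

from api.app import app  # assumption: this is where the FastAPI instance lives

client = TestClient(app)


def test_root_route_responds():
    # the exact status code and payload depend on the actual route definitions
    response = client.get("/")
    assert response.status_code == 200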

Add wiki for core

As a developer I would like to allow other developers to easily assist in this project by writing proper documentation of how it works, and how it can be expanded upon.

Handle deadlock

The API should handle deadlocks, for example by retrying the failed transaction (see the sketch after the traceback below).

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1779, in _execute_context
    self.dialect.do_executemany(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/dialects/mysql/mysqldb.py", line 197, in do_executemany
    rowcount = cursor.executemany(statement, parameters)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/dialects/mysql/aiomysql.py", line 100, in executemany
    return self.await_(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/util/_concurrency_py3k.py", line 76, in await_only
    return current.driver.switch(awaitable)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/util/_concurrency_py3k.py", line 129, in greenlet_spawn
    value = await result
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/dialects/mysql/aiomysql.py", line 121, in _executemany_async
    return await self._cursor.executemany(operation, seq_of_parameters)
  File "/usr/local/lib/python3.9/site-packages/aiomysql/cursors.py", line 281, in executemany
2021-11-22 09:08:01,122 - api.database.functions - ERROR - None
    return (await self._do_execute_many(
  File "/usr/local/lib/python3.9/site-packages/aiomysql/cursors.py", line 318, in _do_execute_many
    r = await self.execute(sql + postfix)
  File "/usr/local/lib/python3.9/site-packages/aiomysql/cursors.py", line 239, in execute
    await self._query(query)
  File "/usr/local/lib/python3.9/site-packages/aiomysql/cursors.py", line 457, in _query
    await conn.query(q)
  File "/usr/local/lib/python3.9/site-packages/aiomysql/connection.py", line 428, in query
    await self._read_query_result(unbuffered=unbuffered)
  File "/usr/local/lib/python3.9/site-packages/aiomysql/connection.py", line 622, in _read_query_result
    await result.read()
  File "/usr/local/lib/python3.9/site-packages/aiomysql/connection.py", line 1105, in read
    first_packet = await self.connection._read_packet()
  File "/usr/local/lib/python3.9/site-packages/aiomysql/connection.py", line 593, in _read_packet
    packet.check_error()
  File "/usr/local/lib/python3.9/site-packages/pymysql/protocol.py", line 220, in check_error
    err.raise_mysql_exception(self._data)
  File "/usr/local/lib/python3.9/site-packages/pymysql/err.py", line 109, in raise_mysql_exception
    raise errorclass(errno, errval)
pymysql.err.OperationalError: (1213, 'Deadlock found when trying to get lock; try restarting transaction')
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/code/./api/database/functions.py", line 46, in execute_sql
    rows = await session.execute(sql, param)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/ext/asyncio/session.py", line 212, in execute
    return await greenlet_spawn(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/util/_concurrency_py3k.py", line 134, in greenlet_spawn
    result = context.throw(*sys.exc_info())
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 1689, in execute
    result = conn._execute_20(statement, params or {}, execution_options)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1611, in _execute_20
    return meth(self, args_10style, kwargs_10style, execution_options)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/sql/elements.py", line 325, in _execute_on_connection
    return connection._execute_clauseelement(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1478, in _execute_clauseelement
    ret = self._execute_context(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1842, in _execute_context
    self._handle_dbapi_exception(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 2023, in _handle_dbapi_exception
    util.raise_(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/util/compat.py", line 207, in raise_
    raise exception
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1779, in _execute_context
    self.dialect.do_executemany(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/dialects/mysql/mysqldb.py", line 197, in do_executemany
    rowcount = cursor.executemany(statement, parameters)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/dialects/mysql/aiomysql.py", line 100, in executemany
    return self.await_(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/util/_concurrency_py3k.py", line 76, in await_only
    return current.driver.switch(awaitable)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/util/_concurrency_py3k.py", line 129, in greenlet_spawn
    value = await result
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/dialects/mysql/aiomysql.py", line 121, in _executemany_async
    return await self._cursor.executemany(operation, seq_of_parameters)
  File "/usr/local/lib/python3.9/site-packages/aiomysql/cursors.py", line 281, in executemany
    return (await self._do_execute_many(
  File "/usr/local/lib/python3.9/site-packages/aiomysql/cursors.py", line 318, in _do_execute_many
    r = await self.execute(sql + postfix)
  File "/usr/local/lib/python3.9/site-packages/aiomysql/cursors.py", line 239, in execute
    await self._query(query)
  File "/usr/local/lib/python3.9/site-packages/aiomysql/cursors.py", line 457, in _query
    await conn.query(q)
  File "/usr/local/lib/python3.9/site-packages/aiomysql/connection.py", line 428, in query
    await self._read_query_result(unbuffered=unbuffered)
  File "/usr/local/lib/python3.9/site-packages/aiomysql/connection.py", line 622, in _read_query_result
    await result.read()
  File "/usr/local/lib/python3.9/site-packages/aiomysql/connection.py", line 1105, in read
    first_packet = await self.connection._read_packet()
  File "/usr/local/lib/python3.9/site-packages/aiomysql/connection.py", line 593, in _read_packet
    packet.check_error()
  File "/usr/local/lib/python3.9/site-packages/pymysql/protocol.py", line 220, in check_error
    err.raise_mysql_exception(self._data)
  File "/usr/local/lib/python3.9/site-packages/pymysql/err.py", line 109, in raise_mysql_exception
    raise errorclass(errno, errval)
sqlalchemy.exc.OperationalError: (pymysql.err.OperationalError) (1213, 'Deadlock found when trying to get lock; try restarting transaction')
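
A minimal sketch of a retry wrapper, assuming it would sit around the session.execute call in api/database/functions.py seen in the traceback above (the retry count and backoff values are assumptions):

import asyncio

from sqlalchemy.exc import OperationalError

MYSQL_DEADLOCK = 1213  # the MySQL error code from the traceback above


async def execute_with_retry(session, sql, param, retries=3, backoff=0.5):
    """Retry a statement when MySQL reports a deadlock (error 1213)."""
    for attempt in range(1, retries + 1):
        try:
            return await session.execute(sql, param)
        except OperationalError as exc:
            deadlocked = exc.orig is not None and exc.orig.args[0] == MYSQL_DEADLOCK
            if not deadlocked or attempt == retries:
                raise
            await session.rollback()  # reset the failed transaction before retrying
            await asyncio.sleep(backoff * attempt)  # simple linear backoff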

Turn off debug for query

The following query has debug logging on:

2021-11-22 17:45:41,202 - api.database.functions - DEBUG - has_return=False
2021-11-22 17:45:41,203 - api.database.functions - DEBUG - sql=
        insert ignore into PredictionsFeedback (vote, prediction, confidence, subject_id, proposed_label, voter_id)
        values (%s, %s, %s, %s, %s, %s) 

There is also some other printing going on somewhere:

{'normalized_name': 'sir gareth'}
<api.database.functions.sql_cursor object at 0x7f42b434be20>
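
A minimal sketch of how the query debug output could be silenced, assuming the standard logging module and the api.database.functions logger seen in the lines above; the stray print output would need to be tracked down separately and replaced with logger.debug calls:

import logging

# raise the level of the noisy logger so DEBUG lines are no longer emitted
logging.getLogger("api.database.functions").setLevel(logging.INFO)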

Add an Auto-Train feature to the ML data training sequence

As a very efficient (lazy) ML dev I would like for the system to learn by itself at this stage, and do some of the heavy lifting for me. This would mean having the system auto-train on data with set parameters based upon a rule-based methodology.

Remove spam from the game please

Hi Ferrariic

Can you use your expertise to remove spam from the game? I am sure you can easily detect players with your algo who spam gold farming websites, trading and gambling scams etc

Your plugin can constantly learn to always stay up to date with the best spam filters

The work you are doing is amazing

Create tokenised routes for discord bot

As a best practice, the Discord bot should use the API and not the database directly.
This will limit the attack surface, and if changes are made on the data layer we only have to change them in one place.

missing data

Hey,

I fail to understand the data flow. I see you have many pickle files; how are they generated?

!kc/Get Contributions - Use normalized name

async def sql_get_contributions(contributors: List):

Change this to:

query = ("""
    SELECT
        rs.manual_detect as detect,
        rs.reportedID as reported_ids,
        ban.confirmed_ban as confirmed_ban,
        ban.possible_ban as possible_ban,
        ban.confirmed_player as confirmed_player
    FROM Reports as rs
    JOIN Players as pl on (pl.id = rs.reportingID)
    JOIN Players as ban on (ban.id = rs.reportedID)
    WHERE 1=1
        AND pl.normalized_name in :contributors
""")

We probably want the endpoints

@router.post('/stats/contributions/', tags=['legacy'])

(there is a GET and a POST) to normalize the names using the to_jagex_name method, so we don't have to trust clients to normalize the names the correct way.
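
For reference, a minimal sketch of what such a normalization could look like; the real to_jagex_name helper lives in the repository and its exact rules may differ (the replacements below are assumptions based on common OSRS name handling):

def to_jagex_name(name: str) -> str:
    """Normalize a display name: lowercase, and treat -, _ and non-breaking
    spaces as ordinary spaces (assumed rules; see the repository's own helper)."""
    return (
        name.lower()
        .replace("-", " ")
        .replace("_", " ")
        .replace("\u00a0", " ")
        .strip()
    )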

User reports

As a user I want to see the users that I have reported, more specifically the confirmed banned users.

Update Model V2: Add xp daily diff evaluation.

I would like to add, as a potential improvement, xp daily diff as a feature.

This feature would be >75% xp in skilldiff/totaldiff per day, averaged over a week (a sketch of the computation follows below).

This will require:

  • data cleaning
  • feature additions
  • dask in place of pandas for server ram
  • OVH server instead of linode
  • MOAR RAM!
  • Fix auto-train.

Improvements should be tangible. 0.8 correlation in testing.
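
A minimal sketch of the proposed computation in pandas, using daily hiscore snapshots per player (all column names and the tiny sample below are illustrative assumptions):

import pandas as pd

# hypothetical daily hiscore snapshots, one row per player per day
df = pd.DataFrame({
    "player_id": [1, 1, 1],
    "date": pd.to_datetime(["2021-11-20", "2021-11-21", "2021-11-22"]),
    "magic_xp": [1_000, 51_000, 120_000],
    "total_xp": [10_000, 62_000, 135_000],
})

df = df.sort_values(["player_id", "date"])
daily = df.groupby("player_id")[["magic_xp", "total_xp"]].diff()

# fraction of each day's total XP gain that came from this single skill
ratio = daily["magic_xp"] / daily["total_xp"]

# averaged over the window (the issue proposes a one-week window), flagged at >75%
feature = ratio.groupby(df["player_id"]).mean() > 0.75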

Hiscores scraper not inserting

Our last hiscores insert was on November 28th. Okay, I didn't realize what date it was. But our confirmed bans numbers are super low and I'm not sure what's going on.

Attempt to Create Prediction if One Doesn't Exist

  1. Check if prediction exists
  2. Ask ML repo app to make one via remote request
  3. If it can't create a prediction (no data on the account or whatever else) return a "blank" prediction so that the plugin doesn't freak out.

Blank prediction format:

predict_dict = {
    "player_id": -1,
    "player_name": player_name,
    "prediction_label": "Prediction not in database",
    "prediction_confidence": 0,
}
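
A minimal sketch of that flow, assuming an aiohttp call to the ML service; the database lookup, the ML service URL, and the helper names are assumptions for illustration:

import aiohttp


async def get_prediction(player_name: str):
    """Hypothetical database lookup; returns None when no prediction is stored."""
    return None  # placeholder for the real query against the Predictions table


async def get_or_create_prediction(player_name: str) -> dict:
    # 1. check if a prediction already exists
    prediction = await get_prediction(player_name)
    if prediction is not None:
        return prediction

    # 2. ask the ML app to make one via a remote request (URL is an assumption)
    async with aiohttp.ClientSession() as session:
        async with session.get(
            "http://machine-learning:8000/predict", params={"name": player_name}
        ) as resp:
            if resp.status == 200:
                return await resp.json()

    # 3. if one can't be created, return a "blank" prediction so the plugin doesn't freak out
    return {
        "player_id": -1,
        "player_name": player_name,
        "prediction_label": "Prediction not in database",
        "prediction_confidence": 0,
    }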

Well structured API routes

It makes more sense to develop API routes around the database structure.
There must be a possibility to batch and filter on the routes; some filters require a token in the headers (a sketch follows the list below).

Routes are needed for:

  • player
  • reports
  • hiscores
  • hiscoreslatest
  • labels
  • predictions
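
A minimal sketch of what one such route could look like, with page-style batching, an optional filter, and a token taken from the headers (the path, parameter names, and placeholder token check are assumptions, not the repository's actual code):

from typing import Optional

from fastapi import APIRouter, Header, HTTPException

router = APIRouter()


@router.get("/v1/player", tags=["player"])
async def get_players(
    token: str = Header(...),           # some filters require a token in the headers
    player_name: Optional[str] = None,  # optional filter
    row_count: int = 100,               # batching: page size
    page: int = 1,                      # batching: page number
):
    if token != "hypothetical-token":   # placeholder check; the real API validates tokens in the database
        raise HTTPException(status_code=401, detail="invalid token")
    # placeholder response; the real route would query the Players table with these filters
    return {"player_name": player_name, "row_count": row_count, "page": page}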

Website authentication

As a user/developer/admin I want to access information generated by the plugin in a secure way.

Setup & automate API test environment

As a developer, it is best practice to develop and test on a different environment than live (production).

We should make a replica of the database structure and fill it with a subset of data.

We should update the CICD to include the test environment.

Auto assign reviewer

As a project maintainer I want to auto-assign the organization group as reviewer on a pull request.

Detect route: List index out of range

future: <Task finished name='Task-491' coro=<detect() done, defined at /code/./api/routers/legacy_debug.py:92> exception=IndexError('list index out of range')>
Traceback (most recent call last):
  File "/code/./api/routers/legacy_debug.py", line 134, in detect
    df["reporter_id"]  = df_names.query(f"name == {df['reporter'].unique()}")['id'].to_list()[0]
IndexError: list index out of range

Port legacy routes

As a system I need to support the legacy routes from the Flask era.

Move all legacy routes from the deprecated folders to the routes/legacy.py file and make them work with FastAPI and pydantic models (a small porting sketch follows below).
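
A minimal sketch of the porting pattern, assuming a legacy POST route whose raw-JSON body is now validated by a pydantic model (the route path and the model fields are illustrative assumptions, not the actual legacy payload):

from typing import List

from fastapi import APIRouter
from pydantic import BaseModel

router = APIRouter()


class Detection(BaseModel):
    # illustrative fields; the real payload is defined by the legacy Flask route
    reporter: str
    reported: str
    region_id: int


@router.post("/plugin/detect", tags=["legacy"])
async def detect(detections: List[Detection]):
    # pydantic has already validated and parsed the request body at this point
    return {"received": len(detections)}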

Automate Tipoff Reports

As a developer, I'd like our tipoff reports to be automated in our proper format to [email protected].

This would allow for daily reports to be sent, over to the relevant accounts. We can use the official plugin email for this purpose.

Have the names column in the Predictions table contain normalized names.

I'd like for the POST /v1/prediction route in the FastAPI port to normalize the names in each list of predictions it receives from the ML app before persisting them to the database.

This method already exists in legacy.py, and is how I'd like the names to be normalized. This will make it much easier to fetch predictions for accounts with special characters (- and _ mostly) in their names.

In the Flask version of our API we select the account's player ID and use that to locate the prediction, which is why it hasn't been an issue up until now. As it is written now, the GET /v1/prediction route utilizes a name string and due to inconsistencies in our name formatting we may run into issues where the API cannot locate a prediction if we do not normalize the names in the Predictions table first.

Improve ML modularity

As a ML dev I would like to easily contribute to improving the ML code. For me, this would mean making the following modular in nature:

  • Pre-processing
  • Model used
  • Batch/No Batch
  • Addition of new features
  • Addition of new labels
  • Output [binary, categorical]

Rethink security model

Currently we have a bit column that sets the permissions for a user; to add or remove permission levels (not users), we have to update the table. I think it would be easier to have three tables: permissions, users, and user_permissions (a sketch follows below).
permissions would hold the unique permissions, users would hold the unique users (name, token), and user_permissions is the mapping between the user and permission tables; one user can have many permissions.
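
A minimal sketch of such a schema (table and column names are assumptions for illustration):

CREATE TABLE permissions (
    id INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(64) NOT NULL UNIQUE
);

CREATE TABLE users (
    id INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(64) NOT NULL UNIQUE,
    token VARCHAR(128) NOT NULL
);

-- mapping table: one user can have many permissions
CREATE TABLE user_permissions (
    user_id INT NOT NULL,
    permission_id INT NOT NULL,
    PRIMARY KEY (user_id, permission_id),
    FOREIGN KEY (user_id) REFERENCES users (id),
    FOREIGN KEY (permission_id) REFERENCES permissions (id)
);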

Latest hiscores slow

As an ML developer I want to get the latest hiscores fast.
Solution: create a trigger on insert into hiscoredata that does a replace into the hiscoredatalatest table (a sketch follows below).
Requesting the data from the table should be faster than from the view.
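
A minimal sketch of such a trigger, with assumed table and column names (the real hiscore tables have many more skill columns, and the latest table needs a primary key on player_id for REPLACE to work as intended):

DELIMITER $$

CREATE TRIGGER hiscore_latest_after_insert
AFTER INSERT ON hiscoreData
FOR EACH ROW
BEGIN
    -- keep only the newest row per player in the "latest" table
    REPLACE INTO hiscoreDataLatest (player_id, ts, total_xp)
    VALUES (NEW.player_id, NEW.ts, NEW.total_xp);
END$$

DELIMITER ;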

CICD steps to jobs

The CICD is currently one big build job; can we separate the steps into jobs?

Equipment Values NULL

Users are reporting that equipment values in sightings data are not correct. It appears all equipment fields may be getting set to NULL.

Update workflow

The workflow in the README is outdated; please update it to more accurately reflect the current situation:
plugin => names =>hiscores => ML => jmod => names (confirmed ban, confirmed player)
hiscores => names (potential ban)

Feedback route: list index out of range

Exception in ASGI application
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/uvicorn/protocols/http/h11_impl.py", line 373, in run_asgi
    result = await app(self.scope, self.receive, self.send)
  File "/usr/local/lib/python3.9/site-packages/uvicorn/middleware/proxy_headers.py", line 75, in __call__
    return await self.app(scope, receive, send)
  File "/usr/local/lib/python3.9/site-packages/fastapi/applications.py", line 208, in __call__
    await super().__call__(scope, receive, send)
  File "/usr/local/lib/python3.9/site-packages/starlette/applications.py", line 112, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/local/lib/python3.9/site-packages/starlette/middleware/errors.py", line 181, in __call__
    raise exc
  File "/usr/local/lib/python3.9/site-packages/starlette/middleware/errors.py", line 159, in __call__
    await self.app(scope, receive, _send)
  File "/usr/local/lib/python3.9/site-packages/starlette/exceptions.py", line 82, in __call__
    raise exc
  File "/usr/local/lib/python3.9/site-packages/starlette/exceptions.py", line 71, in __call__
    await self.app(scope, receive, sender)
  File "/usr/local/lib/python3.9/site-packages/starlette/routing.py", line 656, in __call__
    await route.handle(scope, receive, send)
  File "/usr/local/lib/python3.9/site-packages/starlette/routing.py", line 259, in handle
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.9/site-packages/starlette/routing.py", line 61, in app
    response = await func(request)
  File "/usr/local/lib/python3.9/site-packages/fastapi/routing.py", line 226, in app
    raw_response = await run_endpoint_function(
  File "/usr/local/lib/python3.9/site-packages/fastapi/routing.py", line 159, in run_endpoint_function
    return await dependant.call(**values)
  File "/code/./api/routers/legacy.py", line 789, in receive_plugin_feedback
    voter_data = voter_data.rows2dict()[0]
IndexError: list index out of range
