Comments (15)

extreme4all avatar extreme4all commented on July 3, 2024 1

I have some experience with the API limits :) I used to scrape the entire OSRS GE.
But first things first: some refactoring and a database would be beneficial. What data are we getting from the plugin?
It would be nice if we had the following information from the plugin:

  • player_reporter
  • player_reported
  • location

I suggest 2 endpoints: report_player and report_players.
Both endpoints insert into the database.
Tables:

  • player_reports
  • players (all unique players)

The difference between report_player and report_players is that we would set a column in player_reports, nearby_players, to True (1).

In the players table we keep track of when a player was created, whether they are banned, and the ban date (banned_date).
A ban is detected if a player is removed from the highscores.

For highscores we need some tables
Table: Highscores
Columns:

  • player_id (as defined in players)
  • Timestamp
  • stats (skill lvl & xp, minigame rank, score)

Table: Highscores_latest (don't know if needed)
Columns:

  • player_id (as defined in players)
  • stats (skill lvl & xp, minigame rank, score)
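The tables described above could be sketched roughly as follows. This is only a minimal illustration using SQLite from the Python standard library; the column names, the JSON-text `stats` column, the helper functions, and the choice that the multi-player endpoint is the one setting `nearby_players` are all assumptions, not something settled in this thread.

```python
import sqlite3

# Hypothetical schema; every column name here is an assumption.
SCHEMA = """
CREATE TABLE players (
    id          INTEGER PRIMARY KEY,
    name        TEXT NOT NULL UNIQUE,               -- all unique players
    created_at  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    banned      INTEGER NOT NULL DEFAULT 0,
    banned_date TEXT
);
CREATE TABLE player_reports (
    id             INTEGER PRIMARY KEY,
    reporter_id    INTEGER NOT NULL REFERENCES players(id),
    reported_id    INTEGER NOT NULL REFERENCES players(id),
    location       TEXT,
    nearby_players INTEGER NOT NULL DEFAULT 0,      -- 1 = True
    created_at     TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE highscores (
    player_id INTEGER NOT NULL REFERENCES players(id),
    timestamp TEXT NOT NULL,
    stats     TEXT NOT NULL                         -- e.g. JSON text
);
"""


def get_or_create_player(conn, name):
    """Insert the player if unseen and return its id."""
    conn.execute("INSERT OR IGNORE INTO players (name) VALUES (?)", (name,))
    row = conn.execute("SELECT id FROM players WHERE name = ?", (name,)).fetchone()
    return row[0]


def report_player(conn, reporter, reported, location, nearby=False):
    """Store one report; nearby=True marks it as a nearby-players report."""
    reporter_id = get_or_create_player(conn, reporter)
    reported_id = get_or_create_player(conn, reported)
    conn.execute(
        "INSERT INTO player_reports "
        "(reporter_id, reported_id, location, nearby_players) VALUES (?, ?, ?, ?)",
        (reporter_id, reported_id, location, int(nearby)),
    )
    return reported_id


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
report_player(conn, "alice", "bot123", "Lumbridge")
report_player(conn, "alice", "bot123", "Varrock", nearby=True)
```

A report_players endpoint could then simply loop over a list of names and call `report_player(..., nearby=True)` for each.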

We would need routes to request data from the database. i suggest:

  • /player?player_name=
  • /player_reports?reporter_player_id=&reported_player_id=
  • /highscores?player_id=&start_date=&end_date=
  • /highscores_latest?player_id=
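As a rough, framework-independent sketch of how one of these routes might translate its query parameters into a database query: the function name `build_highscores_query` and the `highscores` column names are assumptions for illustration. Empty parameters (as in the route template above) are simply ignored.

```python
from urllib.parse import parse_qs


def build_highscores_query(query_string):
    """Turn the /highscores query string into (sql, args) for a parameterized query."""
    # parse_qs drops parameters with blank values by default.
    params = {key: values[0] for key, values in parse_qs(query_string).items()}
    clauses, args = [], []
    if "player_id" in params:
        clauses.append("player_id = ?")
        args.append(int(params["player_id"]))
    if "start_date" in params:
        clauses.append("timestamp >= ?")
        args.append(params["start_date"])
    if "end_date" in params:
        clauses.append("timestamp <= ?")
        args.append(params["end_date"])
    sql = "SELECT player_id, timestamp, stats FROM highscores"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, args


sql, args = build_highscores_query(
    "player_id=7&start_date=2021-01-01&end_date=2021-01-31"
)
```

Using placeholders instead of string formatting keeps user-supplied values out of the SQL text itself.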

Can you set up the database side? I can set up the Flask API that will run on the server.
What I have described should be the basis for a nice website that can display our best bot detector :D
Additionally, it should be the basis for our AI ideas.

AI workflow will be the following:

  1. Request data
  2. Preprocessing (data cleaning & feature engineering)
  3. ai modelling
  4. model evaluation
  5. model deployment

from bot-detector-core-files.

extreme4all avatar extreme4all commented on July 3, 2024 1

it might also be useful to have the plugin push a user token, so we can stop abuse?
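One possible shape for such a token, purely as a sketch: the server signs a user id with a secret, and only accepts reports whose token verifies. The names `SERVER_SECRET`, `issue_token`, and `verify_token` are hypothetical, and how the plugin would obtain a token is left open.

```python
import hashlib
import hmac
import secrets

# Hypothetical server-side secret; in practice it would come from config (e.g. a .env file).
SERVER_SECRET = secrets.token_bytes(32)


def _sign(user_id: str) -> str:
    return hmac.new(SERVER_SECRET, user_id.encode(), hashlib.sha256).hexdigest()


def issue_token(user_id: str) -> str:
    """Return 'user_id:signature', which the plugin would send with each report."""
    return f"{user_id}:{_sign(user_id)}"


def verify_token(token: str) -> bool:
    """Accept the token only if its signature matches the user id."""
    user_id, _, signature = token.rpartition(":")
    if not user_id:
        return False
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(signature.encode(), _sign(user_id).encode())


token = issue_token("plugin-user-1")
```

Tokens that fail verification can be rejected, and a misbehaving user id can be blocked without affecting other users.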

Ferrariic avatar Ferrariic commented on July 3, 2024

Hey there - sorry about the data flow mess. I'll work on cleaning that up so it's much more readable.

  • "OSRS_KNN_V1" - This pickled file is the KNN classifier. To use it:
import pickle

with open("OSRS_KNN_V1", "rb") as f:
    osrsknn = pickle.load(f)  # the pickled KNN classifier
osrsknn_predict = osrsknn.predict(PLAYER_IND[2].reshape(1, -1))  # one sample as a single row
print(osrsknn_predict)

  • "ykmfile" - These are the labels produced by KMeans (n_clusters = 300). They are in the same order as the input data and correspond to the entries of "pnamefile", which are the player names. E.g. ykmfile[8] is the group label of pnamefile[8].

  • "pnamefile" - These are the player names.

  • "PIfile" - This is the raw data and needs to be reshaped with x = np.reshape(PIfile,(-1,78)) to produce an array of shape [ENTRIES,78], where ENTRIES is the total number of names added to the dataset and 78 is the number of features. This raw data has not been normalized or adjusted in any way.

  • "traindata" - This is just the section of the raw data used in the KNN classifier, and can largely be ignored.

I will focus on making a much more readable format shortly - which should help answer some of your questions.

extreme4all avatar extreme4all commented on July 3, 2024

the data you have are only player names?

extreme4all avatar extreme4all commented on July 3, 2024

FYI we are doing something very similar to:
https://www.youtube.com/watch?v=Dk4Yahv2lek&list=PLX9loFun2zNkqwEk3abeMzZnVlT0YPxkp
but in a cleaner way.

Ferrariic avatar Ferrariic commented on July 3, 2024

> the data you have are only player names?

No, the data are the stats from the hiscores, located in "PIfile" :)

So if loaded in,

ykmfile = generated labels
PIfile.reshape(-1,78) = hiscore data
pnamefile = names

So that:

ykmfile[4] is the label for player pnamefile[4], with features in row 4 of the reshaped PIfile
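To make the index alignment concrete, here is a small sketch with made-up stand-in data (the real files are not reproduced here). The helper name `lookup` and the sample values are assumptions; only the reshape to 78 columns comes from the description above.

```python
import numpy as np

N_FEATURES = 78  # number of hiscore features per player, as stated above

# Made-up stand-ins for the real files.
pnamefile = ["player_a", "player_b", "player_c"]
ykmfile = np.array([12, 12, 251])
PIfile = np.arange(len(pnamefile) * N_FEATURES, dtype=float)  # flat raw data

# One row per player, one column per feature.
x = np.reshape(PIfile, (-1, N_FEATURES))


def lookup(i):
    """Return (name, label, features) for the player at index i."""
    return pnamefile[i], int(ykmfile[i]), x[i]


name, label, features = lookup(2)
```

All three structures share the same ordering, so a single index is enough to join them.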

Ferrariic avatar Ferrariic commented on July 3, 2024

> FYI we are doing something very similar to:
> https://www.youtube.com/watch?v=Dk4Yahv2lek&list=PLX9loFun2zNkqwEk3abeMzZnVlT0YPxkp
> but in a cleaner way.

Really cool project!

extreme4all avatar extreme4all commented on July 3, 2024

I don't know how effective the raw player stats are for detecting bots.
With some data engineering, we can scrape the highscores every hour, every 6 hours, or every day to get the XP gains over time.

A hypothesis is that bots gain XP at a similar rate in one specific skill, whereas normal people gain XP in many skills.
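A minimal sketch of the kind of feature this hypothesis suggests: XP gained per hour for each skill between two scrapes, plus how concentrated the gains are in a single skill. The function names and the snapshot values are invented for illustration.

```python
def xp_rates(earlier, later, hours):
    """XP gained per hour for each skill between two hiscore snapshots."""
    return {skill: (later[skill] - earlier[skill]) / hours for skill in earlier}


def top_skill_share(rates):
    """Fraction of the total XP/hour that comes from the single fastest skill."""
    total = sum(rates.values())
    return max(rates.values()) / total if total > 0 else 0.0


# Made-up snapshots taken 6 hours apart.
snap_0h = {"woodcutting": 1_000_000, "fishing": 50_000, "attack": 200_000}
snap_6h = {"woodcutting": 1_300_000, "fishing": 50_000, "attack": 200_000}

rates = xp_rates(snap_0h, snap_6h, hours=6)
share = top_skill_share(rates)
```

Under the hypothesis, a share close to 1.0 sustained across many scrapes would be more typical of a bot than of a normal player, though that still has to be checked against real data.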

extreme4all avatar extreme4all commented on July 3, 2024

also gathering labeled data will make it way easier :D

Ferrariic avatar Ferrariic commented on July 3, 2024

> I don't know how effective the raw player stats are for detecting bots.
> With some data engineering, we can scrape the highscores every hour, every 6 hours, or every day to get the XP gains over time.
>
> A hypothesis is that bots gain XP at a similar rate in one specific skill, whereas normal people gain XP in many skills.

Yeah, I would love to scrape the hiscores every 6 hrs. Unfortunately, there is a rate limit of 2-3 seconds per name, so 100K names could take about 69 hrs to scrape (at 2.5 seconds per name).
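The arithmetic behind that estimate, as a quick sketch (the function names are just for illustration):

```python
def scrape_hours(n_names, seconds_per_name):
    """Total hours needed to scrape n_names at a fixed delay per name."""
    return n_names * seconds_per_name / 3600


def names_per_window(window_hours, seconds_per_name):
    """How many names fit into one scrape window at a fixed delay per name."""
    return int(window_hours * 3600 / seconds_per_name)


for delay in (2, 2.5, 3):
    print(f"{delay} s/name -> {scrape_hours(100_000, delay):.1f} h for 100K names")

print(names_per_window(6, 2.5), "names fit in a 6-hour window at 2.5 s/name")
```

So with a single scraper, a 6-hour cycle only covers a few thousand names, far fewer than 100K.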

Ferrariic avatar Ferrariic commented on July 3, 2024

> also gathering labeled data will make it way easier :D

It would be, but unfortunately we don't know the labels, since we can't easily trust the accuracy of labels sent in for individual players. So KMeans groups players by their stats and outputs labels for us. Those labels then go into the KNN classifier, which seems to work well so far at least.
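The KMeans-then-KNN idea can be sketched with scikit-learn on synthetic data. The two artificial groups, the cluster count of 2 (the real labels use 300 clusters), and `n_neighbors=5` are assumptions chosen only to keep the example small; this is not the project's actual training code.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for hiscore data: two well-separated groups, 78 features each.
X = np.vstack([
    rng.normal(0.0, 1.0, size=(50, 78)),
    rng.normal(10.0, 1.0, size=(50, 78)),
])

# Step 1: KMeans groups players by their stats and produces the labels.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

# Step 2: the KNN classifier is trained on those generated labels.
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X, labels)

# A new, unseen player resembling the second group.
new_player = rng.normal(10.0, 1.0, size=(1, 78))
predicted = knn.predict(new_player)
```

Note that the classifier can only reproduce the KMeans grouping; whether a given cluster actually corresponds to bots still has to be judged separately.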

Ferrariic avatar Ferrariic commented on July 3, 2024

It's a very tough situation due to the API ratelimit.

Ferrariic avatar Ferrariic commented on July 3, 2024

Excellent suggestions. I will work this week and weekend to make the code much easier to read. We will also try to include the reporting player's info, as well as the location of the found players. I can definitely set up the database side and properly reconfigure everything so that it is very clear and manageable. I have also recently set up a Flask app on a Linode server with gunicorn and nginx as a test for a switch from Google cloud app to Linode. However, my Flask app is very rudimentary, so changes are highly appreciated. I will let you know once the changes have been made and when a database becomes available. This will definitely help improve the workflow from this point onward.

As for the data we are getting from the plugin: only player names are being sent at this time. Those names are then processed on our end to retrieve the OSRS hiscore data values. Location and the reporting player were planned for later updates, but we can shift the schedule to include these values earlier.

extreme4all avatar extreme4all commented on July 3, 2024

Is the data in JSON format?

For a minimum viable product, the location and the reporting player would be really good, combined with a website. It gives people something to show; with a bit of luck sir pugger will pick it up :D

(Recently I've got myself a VPS for my tools as well, but I'm not experienced in any of that Linux stuff :p)

Maybe send me a message on Twitter so we can share a .env file: @3xtreme4all

Ferrariic avatar Ferrariic commented on July 3, 2024

Haha no. Embarrassingly, the data is in a text file format. I'm going to try to convert it all into JSON format from now on. Also, I'd be super excited to have a great-looking website where you can look up statistics, etc. That would really be remarkable to add in the future!

Also don't worry - I don't know anything about Linux either. I just followed a tutorial on YouTube (as with basically everything I've done so far; YouTube is the way to go).
