Comments (15)
I have some experience with the API limits :) since I used to scrape the entire OSRS GE.
But first things first: some refactoring. A database would be beneficial. What data are we getting from the plugin?
It would be nice if we had the following information from the plugin:
- player_reporter
- player_reported
- location
I suggest two endpoints: report_player & report_players.
Both endpoints do inserts in the database.
Tables:
- player_reports
- players (all unique players)
The difference between report_player & report_players is that for the latter we would set a column in player_reports, nearby_players, to True (1).
In the players table we keep track of when a player was created, whether they are banned, and the ban date.
A ban is detected when a player is removed from the hiscores.
For the hiscores we need some tables.
Table: Highscores
Columns:
- player_id (as defined in players)
- Timestamp
- stats (skill lvl & xp, minigame rank, score)
Table: Highscores_latest (not sure if this is needed)
Columns:
- player_id (as defined in players)
- stats (skill lvl & xp, minigame rank, score)
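The tables above can be sketched as a schema. This is a minimal illustration using SQLite; the exact column names (nearby_players, banned_date, etc.) and types are assumptions based on this discussion, not the project's actual schema.

```python
import sqlite3

# In-memory database purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE players (
    id          INTEGER PRIMARY KEY,
    name        TEXT UNIQUE NOT NULL,
    created_at  TEXT,
    banned      INTEGER DEFAULT 0,    -- set when the player drops off the hiscores
    banned_date TEXT
);
CREATE TABLE player_reports (
    id             INTEGER PRIMARY KEY,
    reporter_id    INTEGER REFERENCES players(id),
    reported_id    INTEGER REFERENCES players(id),
    location       TEXT,
    nearby_players INTEGER DEFAULT 0, -- 1 when inserted via report_players
    created_at     TEXT
);
CREATE TABLE highscores (
    player_id INTEGER REFERENCES players(id),
    ts        TEXT,
    stats     TEXT                    -- skill lvl & xp, minigame rank, score
);
CREATE TABLE highscores_latest (
    player_id INTEGER PRIMARY KEY REFERENCES players(id),
    stats     TEXT
);
""")
conn.execute("INSERT INTO players (name, created_at) VALUES ('example_player', '2021-03-01')")
row = conn.execute("SELECT id, name FROM players").fetchone()
```

A real deployment would likely use MySQL/Postgres with proper timestamp types, but the relationships stay the same.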
We would need routes to request data from the database. I suggest:
- /player?player_name=
- /player_reports?reporter_player_id=&reported_player_id=
- /highscores?player_id=&start_date=&end_date=
- /highscores_latest?player_id=
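Since a Flask API is mentioned below, the proposed routes could look roughly like this. The in-memory PLAYERS dict is a stand-in for the database, and all handler bodies are illustrative assumptions, not the project's actual implementation.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Illustrative stand-in for the players table.
PLAYERS = {"some_player": {"id": 1, "name": "some_player", "banned": False}}

@app.route("/player")
def get_player():
    name = request.args.get("player_name", "")
    player = PLAYERS.get(name)
    if player is None:
        return jsonify({"error": "player not found"}), 404
    return jsonify(player)

@app.route("/highscores_latest")
def highscores_latest():
    player_id = request.args.get("player_id", type=int)
    # A real implementation would query the highscores_latest table here.
    return jsonify({"player_id": player_id, "stats": {}})
```

The other routes (/player_reports, /highscores with start_date/end_date) would follow the same query-parameter pattern.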
Can you set up the database side? I can set up the Flask API that will run on the server.
What I have described should be the basis for a nice website that can showcase our bot detector :D.
Additionally, it should be the basis for our AI ideas.
The AI workflow will be the following:
- Request data
- pre processing (Data cleaning & Feature engineering )
- ai modelling
- model evaluation
- model deployment
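The workflow steps above can be sketched end to end with scikit-learn. Everything here is a stand-in: the data is random, the labels are synthetic, and the scaler/KNN choices are assumptions used only to illustrate the pipeline shape (78 features matches the hiscore data discussed later in this thread).

```python
import pickle

import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# 1. Request data (stand-in: random stats, 78 features per player)
X = rng.normal(size=(200, 78))
y = (X[:, 0] > 0).astype(int)  # synthetic bot / not-bot labels

# 2. Pre-processing: scaling as a simple cleaning/feature-engineering step
# 3. AI modelling: KNN, matching the classifier used elsewhere in this thread
model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))

# 4. Model evaluation on a held-out split
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model.fit(X_tr, y_tr)
acc = accuracy_score(y_te, model.predict(X_te))

# 5. Model deployment: pickle the fitted model, as with OSRS_KNN_V1 below
blob = pickle.dumps(model)
```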
from bot-detector-core-files.
It might also be useful to have the plugin push a user token, so we can stop abuse?
from bot-detector-core-files.
Hey there - sorry about the data flow mess. I'll work on cleaning that up so it's much more readable.
- "OSRS_KNN_V1" - This pickled file is the KNN classifier, you can use:
osrsknn = pickle.load(open("OSRS_KNN_V1","rb"))
osrsknn_predict = osrsknn.predict(PLAYER_IND[2].reshape(1,-1))
print(osrsknn_predict)
to use the classifier.
- "ykmfile" - These are the labels produced by KMeans (n_clusters = 300). They are in the order of the input data and correlate with "pnamefile", which holds the player names. E.g. ykmfile[8] is the group label of pnamefile[8].
- "pnamefile" - These are the player names.
"PIfile" - This is the raw dataform and needs to have
x = np.reshape(PIfile,(-1,78))
passed through it in order to produce an array with [ENTRIES,78]. ENTRIES = the number of total names added to the dataset, and 78 being the number of features. This raw dataform has not been normalized or adjusted in any way. -
"traindata" - This is just the section of the raw data used in the KNN classifier, and can largely be ignored.
I will focus on making a much more readable format shortly - which should help answer some of your questions.
from bot-detector-core-files.
The data you have are only player names?
from bot-detector-core-files.
FYI, we are doing something very similar to:
https://www.youtube.com/watch?v=Dk4Yahv2lek&list=PLX9loFun2zNkqwEk3abeMzZnVlT0YPxkp
but in a cleaner way.
from bot-detector-core-files.
The data you have are only player names?
No, the data are the stats from the hiscores, located in "PIfile" :)
So if loaded in:
ykmfile = generated labels
PIfile.reshape(-1, 78) = hiscore data
pnamefile = names
So that:
ykmfile[4] is the label for player pnamefile[4], with features PIfile[4]
from bot-detector-core-files.
FYI, we are doing something very similar to:
https://www.youtube.com/watch?v=Dk4Yahv2lek&list=PLX9loFun2zNkqwEk3abeMzZnVlT0YPxkp
but in a cleaner way.
Really cool project!
from bot-detector-core-files.
I don't know how effective the raw player stats are for detecting bots.
With some data engineering, we can scrape the hiscores every hour, 6 hours, or day to get the XP gains over time.
A hypothesis is that bots gain XP at a similar rate in one specific skill, compared to normal players gaining XP across many skills.
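The XP-gains-over-time idea can be sketched with numpy. The snapshot values below are invented, and "concentration" is one hypothetical feature for the hypothesis above, not an agreed-upon metric:

```python
import numpy as np

# Hypothetical XP snapshots for one player across 3 skills,
# one row per scrape (e.g. every 6 hours).
snapshots = np.array([
    [1000,  5000, 200],
    [1000,  9000, 200],
    [1000, 13000, 200],
])

gains = np.diff(snapshots, axis=0)  # XP gained between consecutive scrapes
total = gains.sum()

# Fraction of all gains concentrated in the single most-trained skill;
# a value near 1.0 would support the "bots train one skill" hypothesis.
concentration = gains.sum(axis=0).max() / total
```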
from bot-detector-core-files.
also gathering labeled data will make it way easier :D
from bot-detector-core-files.
I don't know how effective the raw player stats are for detecting bots. With some data engineering, we can scrape the hiscores every hour, 6 hours, or day to get the XP gains over time. A hypothesis is that bots gain XP at a similar rate in one specific skill, compared to normal players gaining XP across many skills.
Yeah, I would love to scrape the hiscores every 6 hrs; unfortunately there is a rate limit of 2-3 seconds per name, so 100K names could take ~69 hrs to scrape.
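As a back-of-the-envelope check of that estimate, assuming one request every ~2.5 seconds:

```python
# 100K names at ~2.5 s per hiscore lookup.
names = 100_000
seconds_per_name = 2.5
hours = names * seconds_per_name / 3600  # ~69.4 hours for a full pass
```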
from bot-detector-core-files.
also gathering labeled data will make it way easier :D
It would be, but we don't know the labels, unfortunately, since we can't easily trust the accuracy of labels sent in for individual players. So KMeans can group players on their stats and output labels for us. Those labels then go into the KNN classifier, which seems to work well so far, at least.
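That KMeans-then-KNN approach can be sketched with synthetic data. The real run used n_clusters = 300 on actual hiscore stats; the small random data and 5 clusters here are stand-ins to keep the demo fast:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(42)
X = rng.normal(size=(300, 78))  # stand-in for the (ENTRIES, 78) hiscore data

# Unsupervised step: KMeans groups players by stats and emits labels.
km = KMeans(n_clusters=5, n_init=10, random_state=42)
labels = km.fit_predict(X)

# Supervised step: a KNN classifier trained on those generated labels,
# so newly reported players can be assigned to an existing group.
knn = KNeighborsClassifier(n_neighbors=5).fit(X, labels)
new_player = rng.normal(size=(1, 78))
group = knn.predict(new_player)
```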
from bot-detector-core-files.
It's a very tough situation due to the API ratelimit.
from bot-detector-core-files.
Excellent suggestions. I will work this week and weekend to make the code much easier to read. We will also try to include the reporting player's info, as well as the location of the found players.
I can definitely set up the database side and properly reconfigure everything so that it is clear and manageable. I have also recently set up a Flask app on a Linode server with gunicorn and nginx, as a test for a switch from Google Cloud to Linode; however, my Flask app is very rudimentary, so changes are highly appreciated.
I will let you know once the changes have been made and when a database becomes available. This will definitely help improve the workflow from this point onward.
As for the data we are getting from the plugin: only player names are being sent at this time. Those names are then processed on our end to retrieve the OSRS hiscore data. The location and the reporting player were planned for later updates, but we can shift the schedule to include these values earlier.
from bot-detector-core-files.
Is the data in JSON format?
For a minimum viable product, the location and the reporting player would be really good, combined with a website. It gives people something to show; with a bit of luck, Sir Pugger will pick it up :D.
(Recently I got myself a VPS for my tools as well, but I'm not experienced in any of that Linux stuff :p)
Maybe send me a message on Twitter so we can share a .env file: @3xtreme4all
from bot-detector-core-files.
Haha, no. Embarrassingly, the data is in a text file format. I'm going to try to convert it all into JSON from now on. Also, I'd be super excited to have a great-looking website where you can look up statistics etc. That would really be remarkable to add in the future!
Also, don't worry - I don't know anything about Linux either. I just followed a tutorial on YouTube (as with basically everything I've done so far; YouTube is the way to go).
from bot-detector-core-files.