Giter VIP home page Giter VIP logo

Comments (23)

enov avatar enov commented on July 24, 2024

I think your solution is good.

from database.

 avatar commented on July 24, 2024

Ok, thanks. But, what can you say about this trouble? How community fixes this trouble? (With creating session nodes for each bot request)

from database.

enov avatar enov commented on July 24, 2024

@Coll3ctor there seems to be a garbage collector that runs after each 500 requests.

https://github.com/kohana/database/blob/3.3/master/classes/Kohana/Session/Database.php#L37

Once your sessions are considered as expired, the GC will clean them.

from database.

 avatar commented on July 24, 2024

@enov In one second, my server took 3-4 request, so it 3-4 new nodes in sessions table. Garabage collector can't help me, because, if my session expiring time is Date::DAY (for example, but, it is very small period), numberOfNodesPerDay = 24 * 60 * 60 * 3 = 300k nodes. With that number, my site can't work correctly.

Or i not understood correctly your answer ?

from database.

enov avatar enov commented on July 24, 2024

If robots are piling up ~300k rows in your database, then you need to do something about it :)

If we should to do something about it in the framework, then the same applies for native sessions and for other sessions as well. After all native sessions are saved as files by default and your case will pile up 300k files somewhere in session.save_path.

Now, I see that your solution is good and applies to the abstract class Session which will fix everything else, but still not 100% convinced that we should go for it. Are there any robots that store sessions? Do we break anything else if we don't write to the session in case of robots?

I wonder what others think about this.

from database.

 avatar commented on July 24, 2024

Ok, very big thanks, i understand!

You mean, what i can also (may be) divide sessions by directories (if i use 'native' driver), and i will not touch a overloading (exception overloading of my hard disk, ofc)?

As i saw, Yandex, Google, Yahoo, MailRu, Rambler bots - everyone not accepts cookies, so everyone of them fills up my database :( I am totally sure, what we will not break anything, using my solution (only if someone uses bots 'useragent', but, this guy, i think, is very fishily for my site, and this will be good, if i don't give him cookie & session node & authorization on my site). And i am totally sure, that trying (and saving) to save sessions for bots is totally useless and harmful actions.

Also, about checking of bot: checking is pretty strictly and i think, thats OK.

Another one question: Should i redeclare class Session_Database and not Session? Because, i am working with session just on "database" area (yep, i see your advices about native session).

from database.

acoulton avatar acoulton commented on July 24, 2024

I think this is something that needs to be fixed on a site-specific basis. In my experience it's highly unusual to get that volume of crawlers (on an ongoing basis) unless the site itself is also taking huge amounts of traffic (in which case you need to architect session storage for scale anyway).

I'm not aware of any other framework that automatically disables sessions for bots, so it would be quite unexpected. Some website monitoring services and potentially some desired crawlers will need to authenticate and use sessions - so this could break things for people.

I wouldn't object to having an option for this behaviour, but it shouldn't be default.

from database.

enov avatar enov commented on July 24, 2024

👍 for a configurable option, @acoulton, and yes, should not be default.

@Coll3ctor no need to redeclare Session_Database. Your Session extension looks good.

from database.

loonies avatar loonies commented on July 24, 2024

I really don't see how robot filtering has to do anything with session.
One can spoof this very easily. With no practical benefit and
maintenance overhead I think this should not be included at all.

from database.

enov avatar enov commented on July 24, 2024

@loonies

only if someone uses bots 'useragent', but, this guy, i think, is very fishily for my site, and this will be good, if i don't give him cookie & session node & authorization on my site

@Coll3ctor is arguing that he does not want to serve cookies to the one who's spoofing user-agent either.

from database.

loonies avatar loonies commented on July 24, 2024

Regardless of spoofing, this is not reliable. I can set my browser's UA
to a robot string, or vice-versa for whatever reason and my browsing
will fail, or a robot crawler will pass.

from database.

 avatar commented on July 24, 2024

@acoulton Ok, i understood you. And i argee with you. (and my site really accepts a large volume of traffic (it is a long time to describe this)).
@enov Ok, nice. And, can u say, pls, what about method, which i should redeclare? Is it public write() or protected _write()?
@loonies And what solution you can offer?

from database.

loonies avatar loonies commented on July 24, 2024

Handle bots on application level. Think about it as a CAPTCHA challenge.

from database.

acoulton avatar acoulton commented on July 24, 2024

@loonies I don't think a CAPTCHA works for this issue - a GET to any page will cause a session to be created. That said you're right, user agent sniffing is probably the least good option.

@Coll3ctor do you actually need to create sessions for people just browsing? Could you instead make it so only logged in users have a session? If you're only storing a small amount of data you could also use cookie sessions (or cookie sessions for guests and database sessions for logged in users if you need to store more about them) which wouldn't need any storage on your side.

More I think about it, I think we shouldn't fix it like this in core (even as an option). Would it help if we made it so the session is only written if there is data in it?

from database.

loonies avatar loonies commented on July 24, 2024

My point was that @Coll3ctor should think of solving this problem the
same way you would solve the problem with bots commenting, creating
accounts, etc. on your website, hence mention of CAPTCHA challenge.

from database.

acoulton avatar acoulton commented on July 24, 2024

@loonies I understand, but I genuinely have no idea how you would identify bot vs human on a GET request (other than user agent)? CAPTCHA/Honeypot/etc are only useful on form submission or other interaction like you mention.

from database.

loonies avatar loonies commented on July 24, 2024

Well, yeah, UA check, like @Coll3ctor did in the first post.

Just thinking out loud ... If this is ever going to be implemented in
the Session class, than we could make a session filter. Something like

$agents = array('Google', 'Yahoo');
$filter = new UASessionFilter($agents);
$session = Session::instance()->add_filter($filter);

Maybe split add_filter to add_read_filter, and add_write_filter.

Again, I think it's only maintenance overhead. This is the first use
case I remember of since I been using Kohana. Not sure how many people
would find this feature useful.

from database.

 avatar commented on July 24, 2024

@loonies, @acoulton I think, this is talking about CAPTCHA on conception level, right ?
@acoulton

  1. No, i don't need sessions for whole users, just for logged-in;
  2. "or cookie sessions for guests and database sessions for logged in users if you need to store more about them" - say, please, how i can make this? :)
  3. "Would it help if we made it so the session is only written if there is data in it?" - i think, yes

@loonies Did you mean, what UASessionFilter can inherited from Session_Filter, for example? (So, make it on Kohana level) But, is it too difficult, for making classes, etc. for this trouble?

PS
@acoulton What can you say about adding to session user_id? On Kohana level (not just by me). By this we can easy count online users, for example. Also we can detect for concrete user, is he online. Yes, i read disuccions about it in some Stackoverflow topics and offical Kohana forum.

from database.

loonies avatar loonies commented on July 24, 2024

@loonies, @acoulton I think, this is talking about CAPTCHA on
conception level, right ?

Yes

@loonies Did you mean, what UASessionFilter can inherited from
Session_Filter, for example? (So, make it on Kohana level)

Yes, extend abstract class or implement interface.

But, is it too difficult, for making classes, etc. for this trouble?

I don't get the question. What is difficult?

from database.

acoulton avatar acoulton commented on July 24, 2024

I understood that @loonies means CAPTCHA as concept, was just curious if he had a suggestion for identifying bot on a first GET request to the site as I couldn't think of anything (except user agent check).

Cookie sessions for guests and database sessions for logged in users if you need to store more about them

Simplest way probably in your bootstrap:

if (Cookie::get('authenticated')) {
  Session::$default = 'database';
} else {
  Session::$default = 'cookie';
}

And then when a user logs in or out, call Cookie::set('authenticated', TRUE|FALSE);.

Better approach would probably be a custom session driver that internally uses a cookie driver and a database driver to load merged session data, and then splits writes between the two types of storage as required - only writing the database session if there's stuff in it.

What can you say about adding to session user_id? On Kohana level (not just by me)

I can see the usecase, but user_id is quite an application-specific concept, could be issues trying to implement something that would suit all applications. Also if you want to eg load by user ID that means adding a method to the database driver that can't be implemented in cookie sessions at all and would be difficult in native.

IMHO would be better to produce a separate package with a custom database session driver that stores/retrieves by user ID etc rather than adding that in core.

from database.

 avatar commented on July 24, 2024

I don't get the question. What is difficult?
@loonies This is very simple and small trouble. And for this trouble we think about make a large module, a few classes, inherited from something and etc. etc. etc.
This is trouble just about overflowing of database table, but we think about so large work! :) (so, maybe I did not say this very true in english)

@acoulton Ok, thanks, i understood. So, can we (community) wait something from Kohana after our discussion?:)

from database.

acoulton avatar acoulton commented on July 24, 2024

@Coll3ctor you can wait if you like, but given there are very few of us with very little time, and nobody else is reporting real-world problems with this you are likely to be waiting a long time....

I would suggest building a custom session driver yourself and releasing it as a kohana module.

from database.

 avatar commented on July 24, 2024

@acoulton Current version of Kohana and current number of Kohana modules is pretty good, so, ofc, we wait :)

Main of that, that we solved trouble.Aalbeit with a little crutch, but solved!

So, thanks all, and close that issue!

from database.

Related Issues (11)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.