
Comments (6)

elandau avatar elandau commented on May 28, 2024

Nope. It might be possible to add, but we never had a need for multi-tenant column families. We have a shared cluster with a fairly large number (nearly 100) of separate column families, which works well for us. I believe previous versions of Cassandra had a memory-related issue with having several keyspaces and column families, but it was fixed in later versions, which makes multi-tenant column families unnecessary. Also, keep in mind that another drawback of multi-tenant column families is that they require all tenants to have the same comparator and default validation class, which is unlikely to satisfy all tenants. How many tenants do you have in your cluster?

from astyanax.

jianjin avatar jianjin commented on May 28, 2024

Thousands of tenants. We would like to use Cassandra as a distributed store for different users, with a way to group each user's data together so that it is easy to expire and delete.

We ran into problems doing this with a key scan (we have an index configured on userID) and removing the rows one by one: it is too slow and sometimes throws exceptions. So we want to leverage multi-tenancy: when we want to remove all data for a specific user, we would just create another virtual space for that user. I think this would be more efficient.


elandau avatar elandau commented on May 28, 2024

How many objects do you store per user? Would it be possible to store each object as a column on a single row key (userID)? That way you can delete the entire row when you want to remove all data.

With your current method it sounds like you never actually delete the data. You just make it inaccessible. Wouldn't that just cause uncontrolled growth of your storage?
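To make the single-row layout concrete, here is a minimal sketch of the idea, not Astyanax code. It models one column family as an in-memory dict, and the names `WideRowStore`, `put_object`, `get_objects`, and `delete_user` are hypothetical, chosen only for illustration. Each object is a column under the user's row key, so removing a user is one row-level delete.

```python
from collections import defaultdict


class WideRowStore:
    """In-memory stand-in for one column family: row key -> {column name: value}."""

    def __init__(self):
        self._rows = defaultdict(dict)

    def put_object(self, user_id, object_id, payload):
        # Each object becomes one column in the user's row.
        self._rows[user_id][object_id] = payload

    def get_objects(self, user_id):
        # Return a copy so callers cannot mutate the stored row.
        return dict(self._rows.get(user_id, {}))

    def delete_user(self, user_id):
        # One row-level delete removes every object the user owns.
        self._rows.pop(user_id, None)


store = WideRowStore()
store.put_object("user-1", "obj-a", b"first")
store.put_object("user-1", "obj-b", b"second")
store.put_object("user-2", "obj-c", b"third")
store.delete_user("user-1")  # user-2's row is untouched
```

With this layout no key scan or secondary index is needed to find a user's objects, because the row key already identifies them.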

Eran


jianjin avatar jianjin commented on May 28, 2024

There may be gigabytes of data per user.

We would like to delete old data periodically, though we do not yet know how to do it efficiently. Do you think this makes sense?


elandau avatar elandau commented on May 28, 2024

I suppose multi-tenancy would make sense here if you plan to provide a factory for Keyspace objects to some higher-level client and not have the client worry about adding their user ID to the row keys. Although, this could also be accomplished by providing a wrapper on top of Astyanax, which would further abstract the Cassandra API and data model from the client.

Regarding the deletes, you would probably want to set up an asynchronous purge mechanism to take care of them (a producer/consumer using SQS for messaging, for example). This will work well with your previous model of having an index of objects owned by the user. The purger will continue to perform deletes as long as there are columns in the index and would implement any retry and backoff logic when you get errors. Out of curiosity, what exceptions are you seeing with deletes?
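The purge mechanism described above could look roughly like the sketch below. It is only an illustration under stated assumptions: a standard-library `queue.Queue` stands in for SQS, a plain dict stands in for the per-user index, and `purge_user`, `run_purger`, and `flaky_delete` are hypothetical names, not part of Astyanax or Cassandra. The consumer keeps deleting while the index still lists objects, and retries failed deletes with exponential backoff.

```python
import queue
import time


def purge_user(index, user_id, delete_object,
               max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Delete every object listed in the user's index; return the number deleted.

    index: dict mapping user_id -> set of object ids (stand-in for the index row).
    delete_object: callable(user_id, object_id) that raises on failure.
    """
    deleted = 0
    while index.get(user_id):
        object_id = next(iter(index[user_id]))
        for attempt in range(max_attempts):
            try:
                delete_object(user_id, object_id)
                break
            except Exception:
                if attempt == max_attempts - 1:
                    raise  # give up; the index entry stays so a later run can retry
                sleep(base_delay * 2 ** attempt)  # exponential backoff
        # Remove the index entry only after the delete succeeded.
        index[user_id].discard(object_id)
        deleted += 1
    index.pop(user_id, None)
    return deleted


def run_purger(requests, index, delete_object, **kwargs):
    """Consumer loop: drain purge requests until the queue is empty."""
    total = 0
    while True:
        try:
            user_id = requests.get_nowait()
        except queue.Empty:
            return total
        total += purge_user(index, user_id, delete_object, **kwargs)


# Demo: deleting obj-a fails twice with a simulated error, then succeeds.
index = {"user-1": {"obj-a", "obj-b"}, "user-2": {"obj-c"}}
failures = {"obj-a": 2}
delays = []  # records backoff delays instead of actually sleeping


def flaky_delete(user_id, object_id):
    if failures.get(object_id, 0) > 0:
        failures[object_id] -= 1
        raise IOError("simulated timeout")


requests = queue.Queue()
requests.put("user-1")  # producer side: enqueue a purge request
purged = run_purger(requests, index, flaky_delete, sleep=delays.append)
```

Because an index entry is removed only after its delete succeeds, a purger that crashes or gives up can be restarted and will pick up the remaining objects.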


jianjin avatar jianjin commented on May 28, 2024

Hi,

I confirmed again: the exception was caused by our incorrect API usage.

I wonder whether Cassandra gives operations staff a way to delete old data efficiently. Could they remove the old data files directly? We can tolerate missing data at the application level.
