Cassandra table based query and primary key uniqueness

farhawa

I have read here that for a table like:

CREATE TABLE user (
    username text,
    password text,
    email text,
    company text,
    PRIMARY KEY (username)
);

We can create a table like:

CREATE TABLE user_by_company (
    company text,
    username text,
    email text,
    PRIMARY KEY (company)
);

This supports querying by company. But what about primary key uniqueness in the second table, where many users can share the same company?

Aaron

Modify your table's PRIMARY KEY definition and add username as a clustering key:

CREATE TABLE user_by_company (
    company text,
    username text,
    email text,
    PRIMARY KEY (company,username)
);

That makes each (company, username) pair unique, and lets you return all usernames for a particular company. Additionally, within each company the result set will be sorted in ascending order by username.
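As a rough sketch of how this table behaves, here are a few statements against the user_by_company table above. The usernames and example.com addresses are made up for illustration.

```cql
-- Two users at the same company become two rows in the same partition,
-- because username is part of the primary key.
INSERT INTO user_by_company (company, username, email)
  VALUES ('Acme', 'jayne', 'jayne@example.com');
INSERT INTO user_by_company (company, username, email)
  VALUES ('Acme', 'mal', 'mal@example.com');

-- Writing the same (company, username) pair again does not raise an error.
-- It overwrites the email of the existing row (an upsert).
INSERT INTO user_by_company (company, username, email)
  VALUES ('Acme', 'jayne', 'jayne.new@example.com');

-- Returns every user for one company, ordered by username.
SELECT username, email FROM user_by_company WHERE company = 'Acme';
```

Note that "unique" here means a second write to the same key replaces the first rather than being rejected. If you need the write to fail instead, INSERT ... IF NOT EXISTS is available, at the cost of a lightweight transaction.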

Data will be partitioned across nodes by company name. What if there are a lot of users from one company and fewer from another? The data will be partitioned in an unbalanced way.

That's the balance that you have to figure out on your own. PRIMARY KEY definition in Cassandra is a give-and-take between data distribution and query flexibility. And unless the cardinality of company is very low (like single digits), you shouldn't have to worry about creating hot spots in your cluster.

Also, if one particular company gets too big, you can use a modeling technique known as "bucketing." If I were going to "bucket" your user_by_company table, I would first add a company_bucket column, and then add it as an additional (composite) partition key:

CREATE TABLE user_by_company (
    company text,
    company_bucket text,
    username text,
    email text,
    PRIMARY KEY ((company,company_bucket),username)
);

As for what to put into that bucket, it's up to you. Maybe that particular company has East and West locations, so something like this might work:

INSERT INTO user_by_company (company,company_bucket,username,email)
  VALUES ('Acme','West','Jayne','[email protected]');

The drawback here is that you would then have to provide company_bucket whenever you query that table. But it is a solution that could help you if a company should get too big.
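To make that drawback concrete, here is a sketch of what reads against the bucketed table might look like. It assumes the East/West buckets from the example above; the 'East' value is only implied there.

```cql
-- Both partition key columns must be supplied to read one bucket.
SELECT username, email FROM user_by_company
  WHERE company = 'Acme' AND company_bucket = 'West';

-- To read the whole company, either run one query per bucket
-- or use IN on the last partition key column.
SELECT username, email FROM user_by_company
  WHERE company = 'Acme' AND company_bucket IN ('East', 'West');
```

The IN form touches several partitions in one request, so usernames are sorted within each bucket rather than across the whole company, and the query is likely to cost more than a single-partition read.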

