
I realize MySQL has different collations I can use to enforce a case sensitive unique index, but for my particular use case, I want to enforce that only on a subset of the data.

Most of the unique IDs in the data set aren't case sensitive, so making the entire column case sensitive is not ideal. I can also identify which rows have case-sensitive IDs.

It also seems MySQL doesn't have a way of creating an index with a condition either (correct me if I'm wrong).

Here is my schema

masterId - int
canonicalId - varchar
environment - int

canonicalId is the column I'm referring to, and it can be tied to different environments. Right now I have a unique index on canonicalId and environment. For example, environment 7 should have case-sensitive IDs; the rest should not.

What is a good way to handle this?

Also: I don't think making all the unique IDs case sensitive would be a good solution here. We get these IDs from different services, and the casing could very well change on them without representing different items. That's why I only want to enforce case sensitivity on a service that is known to use case-sensitive IDs.

I was thinking of creating a SHA-256 hash of the canonicalId for this particular environment. The actual data, along with the associated hash, would be stored in a separate table if needed for lookup purposes later. I only need to ensure uniqueness in this table, as the information in the masterId column is used everywhere else in my application. I'm not sure that would be the best way, though.

  • Why not just store all your critical data in lowercase (or maybe uppercase) form? That way, the collation does not matter, and you only need to compare against that column in a like-case manner. Commented Jan 22, 2019 at 20:54
  • I take it environment 7 could have different canonicalId values of Aa and AA, but if those values appeared in environment 6 they would be treated as the same value? Commented Jan 22, 2019 at 21:01
  • @O. Jones: Yes, that is correct Commented Jan 22, 2019 at 21:27
  • SHA-256 is always case sensitive, so I don't understand how it could help. Commented Jan 26, 2019 at 17:40
  • What's the charset? Just ASCII? Or something more involved, such as latin1, which does not have a perfect UPPER() and LOWER()? Commented Jan 26, 2019 at 17:43

1 Answer

Using a generated column, you can create:

ALTER TABLE t
ADD g VARCHAR(20) CHARSET utf8 COLLATE utf8_bin GENERATED ALWAYS AS (
   IF(environmentId=7, canonicalId, UPPER(canonicalId))
);

Note: MariaDB doesn't support explicit collation on generated columns (MDEV-12161).

The collation on g, whether set explicitly or inherited from the table default, needs to be case sensitive.

Then make (environmentId, g) unique:

ALTER TABLE t ADD UNIQUE INDEX gunique(environmentId, g);

https://dbfiddle.uk/?rdbms=mariadb_10.3&fiddle=bba30d04eb21cc60a1c1f8993e5be6e6
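
To illustrate the intended behavior, here is a minimal sketch that folds both statements into one table definition and tries a few inserts. It assumes MySQL 5.7 or later (per the note above, MariaDB would need the explicit collation removed), and the table layout, column sizes, and sample values are assumptions for illustration only, not the asker's real schema.

```sql
-- Hypothetical test table; column sizes and constraints are assumptions.
CREATE TABLE t (
  masterId      INT PRIMARY KEY,
  canonicalId   VARCHAR(20) NOT NULL,
  environmentId INT NOT NULL,
  -- Keep the original casing for environment 7, fold everything else to upper case.
  g VARCHAR(20) CHARSET utf8 COLLATE utf8_bin GENERATED ALWAYS AS (
     IF(environmentId = 7, canonicalId, UPPER(canonicalId))
  ),
  UNIQUE INDEX gunique (environmentId, g)
);

-- Environment 7: 'Aa' and 'AA' differ under utf8_bin, so both rows are accepted.
INSERT INTO t (masterId, canonicalId, environmentId) VALUES (1, 'Aa', 7);
INSERT INTO t (masterId, canonicalId, environmentId) VALUES (2, 'AA', 7);

-- Environment 6: both values become 'AA' in g, so the second insert
-- should fail with a duplicate-key error on gunique.
INSERT INTO t (masterId, canonicalId, environmentId) VALUES (3, 'Aa', 6);
INSERT INTO t (masterId, canonicalId, environmentId) VALUES (4, 'AA', 6);
```

The uniqueness check runs against g rather than canonicalId, so canonicalId keeps whatever casing the upstream service sent.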
