87

I've created the following table:

CREATE TABLE MMCompany ( CompanyUniqueID BIGSERIAL PRIMARY KEY NOT NULL, Name VARCHAR (150) NOT NULL, PhoneNumber VARCHAR(20) NOT NULL UNIQUE, Email VARCHAR(75) UNIQUE, CompanyLogo BYTEA ); 

The email column is unique and it causes a "bug" in my scenario since there could only be one record with null. I'm trying to achieve records of companies without the same email but at the same time allow a companies to have no email.

How can I achieve that?

0

7 Answers 7

181

This is a misunderstanding.
The UNIQUE constraint does exactly what you want. NULL values can coexist in multiple rows in a column defined UNIQUE.

The manual:

In general, a unique constraint is violated when there is more than one row in the table where the values of all of the columns included in the constraint are equal. However, two null values are not considered equal in this comparison. That means even in the presence of a unique constraint it is possible to store duplicate rows that contain a null value in at least one of the constrained columns. This behavior can be changed by adding the clause NULLS NOT DISTINCT [...]

Bold emphasis mine.

Be aware that character types allow an empty string (''), which is not a NULL value and would trigger a unique violation just like any other non-null value when entered in more than one row.

Postgres 15 allows the clause NULLS NOT DISTINCT to alter this behavior (the opposite of what you requested). I updated the above quote. See:

Sign up to request clarification or add additional context in comments.

12 Comments

@foibs: That is 100% correct and failsafe and conforming to the SQL standard and how it should be. NULL is defined that way for a reason.
@sqlvogel: To avoid the feature, define the column NOT NULL. Best of both worlds, nothing ugly about this. Useful as this very request goes to demonstrate. So, obviously, I don't quite agree with your verdict.
@sqlvogel: I am stressing the usefulness of the feature (while it can easily be switched off when not desirable). You write It is not useful to do so under a question where the OP requested exactly that, proving its usefulness? Is the irony lost on you?
I am all for normalization as long as it is half way sensible. But I see no "key dependencies" nor any violation of any normal form. You still didn't name one. Either way, this is purely academic. It would be unwise to create a separate table for the email attribute as long as there are no additional requirements like attributes per email address or multiple emails per company. The nullable column fits the requirements perfectly. I suggest we agree to disagree.
@sqlvogel and ErwinBrandstetter Looks like you've fallen into the age-old debate about the worthiness or dangerousness of NULL. For an exhaustive critique of NULL read Dr. Chris Date's book, Guide to the SQL Standard. You can also read academic articles explaining the multiple ambiguous meanings that NULL can take. While I'm in the camp that NULL is more trouble than help, other folks find practical use. I suggesting letting it go at that.
|
29

No Such Problem In Postgres

In Erwin Brandstetter's correct answer, he explains that you should indeed be seeing the behavior you want (multiple NULLs allowed in a Unique constraint). You should see this behavior in Postgres in particular as well as any SQL standard compliant database in general.

Workaround for Other Databases

However, the Postgres doc cautions about portability because some databases are known to be in violation of this feature. For such a non-compliant system I suggest replacing the use of a NULL value in such fields with a bogus value. The bogus value would be a string such as "unknown_" plus some arbitrary value that is virtually certain to be unique. That arbitrary value could be something like the current date-time plus a random number.

UUID

But, rather than roll your own arbitrary value, generate a UUID. The original Version 1 UUID is indeed a combination of the current date-time, a random number, and the computer's virtually unique MAC address.

A UUID presented as a hex string with canonical formatting using hyphens looks like this:

93e6f268-5c2d-4c63-9d9c-40e6ac034f88

So my suggestion is to combine an arbitrary string such as "unknown_" plus a UUID, to look like this:

unknown_93e6f268-5c2d-4c63-9d9c-40e6ac034f88

So my suggestion for non-compliant databases is to generate such a value and use it in place of NULL, use it where you do not yet have a known value in that column for a particular row. Instead of writing queries that look for rows that have (or do not have) a NULL value in that column, write queries that look for rows that have (or do not have) a value beginning with the arbitrary string, "unknown_" in this example. Each row would then satisfy the constraint of having a unique value.

Indeed, I would assign this "unknown_" + UUID value as the default for that column.

You could also add a NOT NULL constraint to this column.

Generating UUID Values

Postgres has built-in support for the data type of UUID, but that's irrelevant in this answer here. What you need is to generate a UUID value.

For generating UUIDs you need an extension (plugin) that adds this capability to Postgres. Most Postgres installers include such an extension. This extension is called uuid-ossp. Usually the extension is not activated by default. To do so in recent versions of Postgres, use the CREATE EXTENSION command. For instructions, see my blog post on installing in Postgres 9.1 and later or my other post on Postgres 9.0 and earlier. Both the new and old way of installation is easy provided the extension/plugin was compiled and bundled with your Postgres installation.

Summary

Let me be clear that for Postgres alone, there is no need for this workaround because Postgres complies with the SQL standard. But if:

  • You are concerned about portability of your code to some other non-compliant database system, or
  • You need to exchange data with a non-compliant database system, or
  • You agree with Dr. Chris Date that NULL is the work of the devil and should be avoided

…then a workaround such as this is necessary.

3 Comments

Thank you, I've seen it already, I'd like to reassure you I did bothered to search before posting, I usually do, unfortunately the answer offered is not the right one for me, anyhow than you for your help, try not to be so rude next time!
@liva (a) I was wrong about your lack of bothering to research. I was wrong in my original answer's link to a Question-Answer that failed to address your question. I stand corrected. For my rudeness and thoughtlessness, I apologize. (b) I replaced my original answer with a fresh one that may help you or anyone else who may need to work with databases other than Postgres, databases that fail to follow the SQL spec’s requirement that a Unique constraint should allow multiple NULL values.
@KamilKiełczewski No, that won't do. There is no row.id in Postgres. Same idea found in the ctid system column on every table, which uniquely identifies the row. But as a physical identifier, it can change such as during a VACUUM FULL, and possibly be re-assigned. As the doc says: “ctid is useless as a long-term row identifier”. When you need a universally unique identifier, use… a universally unique identifier (UUID).
12

Some databases do not allow multiple null values, for example the SQL Server documentation states that "multiple null values are considered duplicates". On databases that do not allow nullable UNIQUE constraints you could try this (from GuidoG's answer to another question):

CREATE UNIQUE NONCLUSTERED INDEX IDX_Email ON MMCompany (Email) WHERE Email IS NOT NULL; 

Comments

5

Drop the email column from the table. Put it in a new table where it can be NOT NULL and UNIQUE:

CREATE TABLE CompanyEmail ( CompanyUniqueID INT NOT NULL PRIMARY KEY REFERENCES MMCompany (CompanyUniqueID), Email VARCHAR(75) NOT NULL UNIQUE ); 

Avoid nullable UNIQUE constraints.

Comments

3

Unique and null don't get along much, since null is undefined by definition — you can't know if two nulls are the same unknown.

In this sense, your current unique constraint on email is the right thing to do and should work as is.


In case you ever need to make it otherwise, though, a partial index works:

create unique index on MMCompany((email is null)) where (email is null); 

Another approach is to define a constraint trigger. Something like:

create function email_chk() returns trigger as $$ begin if exists ( select 1 from mmcompany where email is null and companyuniqueid <> new.id ) then raise 'dup null found'; end if; return null; end; $$ language plpgsql; create constraint trigger after insert or update on mmcompany for each row when (new.email is null) execute procedure email_chk(); 

2 Comments

Actually, two nulls are always considered not equal. That is why you never do WHERE email = NULL but do WHERE email IS NULL instead. Sorry for replying to a 5year old comment. But that was true 5 years ago as much as it is now...
Well, "is null" means "null is a singleton" and any two null values should be equal.
0

on sqlite you can use

create unique index 

then use an expression in the indexed column eg.

... iif (field is null, 0, field), ... 

see https://www.sqlite.org/lang_createindex.html and https://www.sqlite.org/expridx.html

suggestion above to hive the field off into another table probably the correct way forward, but...

Comments

-4

And just in case you're generating your DB Tables using EF Code First, edit your Migration Class' Up method like the following to enforce your UNIQUE KEY constraint to ignore NULL.

migrationBuilder.Sql(@"CREATE UNIQUE NONCLUSTERED INDEX[IX_Employees_TaskId] ON[dbo].[Employees]([TaskId] ASC) WHERE [TaskId] IS NOT NULL" ); 

And then you could test your Unique Constraint by logging into your DB through SQL Server Management Studio or something similar. Like in this case Employee Table happily accepts 2 NULL values in TaskId although its an UNIQUE column.

enter image description here

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.