I have two columns which contain email info. Column A, Column B. Now, in these fields, emails SHOULD be as such:
Column A Column B
[email protected]; [email protected] [email protected]; [email protected]
[email protected];[email protected] [email protected];[email protected]
However, in an effort to do some data quality checks and such, it turns out that MANY entries are instead not following this format. I am trying to find all the outliers, and I have identified the outliers to take the form as such:
[email protected] and [email protected]
[email protected], [email protected] (uses comma so it is incorrect)
[email protected] or [email protected]
[email protected] / [email protected]
There could be other wrong characters or words that make the format incorrect. But I hope these examples pinpoint the issue.
What I am trying to do: Create a query that will pinpoint all instances that are NOT in the correct format, so that the problem points can be found and edited later, but that's a different topic :)
Here is a Query I have so far:
SELECT A_EMAIL, B_EMAIL, NAME, ID
FROM NAMES
WHERE A_EMAIL LIKE ('and %') OR A_EMAIL LIKE ('or %')
OR B_EMAIL LIKE ('and %') OR B LIKE ('or %')
This is using LIKE and the % is with a space in between. However, this returns no results, and I know such results definitely do exist. But I would like to build a logic that would bring me back everything that isn't in the proper format instead of trying to use LIKE 'XYZ' because even though I know most of the problem issues, I could still miss some.
However, if such a thing isn't possible via SQL. Then I would still like to get my current logic of using LIKE ('XYZ %') to work instead which even though not an optimal route, should still be able to help me in my goal someway.
Your query is fine, you just missed one %
in it. Instead of this
WHERE A_EMAIL LIKE ('and %') OR A_EMAIL LIKE ('or %')
OR B_EMAIL LIKE ('and %') OR B LIKE ('or %')
you should use this
WHERE A_EMAIL LIKE ('%and %') OR A_EMAIL LIKE ('%or %')
OR B_EMAIL LIKE ('%and %') OR B LIKE ('%or %')
Your original query looks for values that start with 'and ', while you are interested in cases where 'and ' appears anywhere inside the column value.
Of course, this is a one-off solution to your immediate problem. The permanent solution is not to store several e-mails in the same column in the first place.
Collected from the Internet
Please contact [email protected] to delete if infringement.
Comments