It can be used in a trigger to add data validation, or to check data already in your database that needs a clean up ... really a simple wrapper around a simple regexp query, but it can be helpful.
Here it is:
DELIMITER $$ DROP FUNCTION IF EXISTS `test`.`is_valid_email` $$ CREATE DEFINER=`root`@`localhost` FUNCTION `is_valid_email`(p_email varchar(64)) RETURNS tinyint(1) BEGIN CASE WHEN NOT (SELECT p_email REGEXP '^[A-Z0-9._%-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$') THEN -- bad data RETURN FALSE; ELSE -- good email RETURN TRUE; END CASE; END $$ DELIMITER ;
As per Mushu's comment, this is much cleaner, oops:
DELIMITER $$ DROP FUNCTION IF EXISTS `test`.`is_email_valid` $$ CREATE FUNCTION `test`.`is_email_valid` (p_email varchar(64)) RETURNS tinyint(1) BEGIN CASE WHEN p_email REGEXP '^[A-Z0-9._%-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$' THEN -- good email RETURN TRUE; ELSE -- bad data RETURN FALSE; END CASE; END $$ DELIMITER ;
Also, if you can move such checks into the application code, it may be a wise thing to do as executing regular expressions can easly consume all CPU resources.
Of course doing checks in the application code is cleaner, but the ability to add them later at database level, or to run them in the database without interaction with external apps is also handy ;-)
7 comments:
a common complaint about email validation on the web is that people forget the plus sign, '+', can be a valid character on the left-hand side. to really support the full range of valid email addresses, the regex is really quite complex. (see jeffrey friedl's mastering regular expressions.)
Also doesn't support .museum or IDN domains. To be honest the best form of validation is to check for an MX record.
You are both right, it's just a quick hack which uses a regexp taken from web.
It's by no means RFC compliant and I agree with Jim, the real regexp would be quite complex.
Checking for an MX record in another story as it would require an UDF.
I would rather put REGEXP into IF condition, which is more natural construction than doing CASE of SELECT negation.
Also, if you can move such checks into the application code, it may be a wise thing to do as executing regular expressions can easly consume all CPU resources.
mushu,
thanks for your comments, I enhanced the function a bit :-)
It also doesn't work for a mail address at a top level domain (they do exist).
Hi all,
thanks for your feedback, you'll notice that I cleaned up the function a lot but didn't change the regex, this is because I'm an regexp illiterate and there are many good ones available on the web.
Feel free to share your good one here, I'll add it to the original post.
Post a Comment