Home Uncategorized Validate a URL from SQL Server

    Validate a URL from SQL Server

    841
    25

    File this one in your folder of things you should probably never use — but maybe, some day, in an emergency, you’ll need this.

    I see posts requesting this functionality all the time. “How do I validate a URL in SQL Server?” Not just the string, but the URL itself — how can we find out if it’s valid?

    Thanks to the Object Automation extended stored procedures Microsoft includes with SQL Server, it’s quite easy…

    [sql]CREATE FUNCTION dbo.ValidateURL(@URL VARCHAR(300))
    RETURNS BIT
    AS
    BEGIN
    DECLARE @Object INT
    DECLARE @Return TINYINT
    DECLARE @Valid BIT SET @Valid = 0 –default to false

    –create the XMLHTTP object
    EXEC @Return = sp_oacreate ‘MSXML2.ServerXMLHTTP.3.0’, @Object OUTPUT
    IF @Return = 0
    BEGIN
    DECLARE @Method VARCHAR(350)

    –define setTimeouts method
    –Resolve, Connect, Send, Receive
    SET @Method = ‘setTimeouts(45000, 45000, 45000, 45000)’

    –set the timeouts
    EXEC @Return = sp_oamethod @Object, @Method

    IF @Return = 0
    BEGIN
    –define open method
    SET @Method = ‘open(“GET”, “‘ + @URL + ‘”, false)’

    –Open the connection
    EXEC @Return = sp_oamethod @Object, @Method
    END

    IF @Return = 0
    BEGIN
    –SEND the request
    EXEC @Return = sp_oamethod @Object, ‘send()’
    END

    IF @Return = 0
    BEGIN
    DECLARE @Output INT
    EXEC @Return = sp_oamethod @Object, ‘status’, @Output OUTPUT

    IF @Output = 200
    BEGIN
    SET @Valid = 1
    END
    END
    END

    –destroy the object
    EXEC sp_oadestroy @Object

    RETURN (@Valid)
    END
    [/sql]

    … And that is it …

    [sql]SELECT dbo.ValidateURL(‘http://www.microsoft.com/sql’)


    1

    SELECT dbo.ValidateURL(‘http://www.XMLisNOTaMAGICbullet.com/’)


    0
    [/sql]

    Note, you don’t want to run this thing against a big table. It runs synchronously and waits for the remote site to respond. That can definitely hold locks open a lot longer than you might want.


    UPDATE: Thanks to Marcus Tucker for pointing out that Microsoft.XMLHTTP (the XMLHTTP client object) was not the right choice here. I’ve updated the UDF to use MSXML2.ServerXMLHTTP.3.0, the XMLHTTP server object, instead.


    ANOTHER UPDATE: Added a call to the setTimeouts method, as I discovered that this wasn’t behaving the same on differente servers — apparently there is some default timeout set somewhere; I have no idea where, though. Anyway, the four timeout types have all been set to 45 seconds (45000 ms). Tweak them if you need to.

    25 COMMENTS

    1. Daren: This function was designed for SQL Server 2000.  In SQL Server 2005 you would have to turn on Object Automation using sp_configure in order to make it work. This is not recommended; rather, you should consider using a SQLCLR UDF to accomplish this, which is much safer and more stable than object automation.

    2. Adam,
      The OLE Automation method above seems to be entirely done inside of SQL Server using SQL, which is very attractive – no additional environments are necessary – just the SQL Server engine and its natual integration capabilities with the Windows environment.
      Although not as good as Oracle and DB2 — which have built-in support for this kind of thing (see below) — it is very similar to them and thus very attractive as it requires nothing else to implement.  
      Oracle:
      SELECT HTTPURITYPE.createuri (‘http://www.microsoft.com‘).getclob ( ) web_page FROM DUAL
      DB2:
      select db2xml.soaphttpv (‘http://10.16.1.222/85256D4B003C9567/002?openagent‘,”,'<polnum>HO123456</polnum>’) from sysibm.sysdummy1
      Now, from what I can tell in your response above, you do NOT recommended building solutions in SQL Server using OLE Automation, but rather, you recommend we should create a SQLCLR UDF.
      Three questions:
      1.  Doesn’t this require a whole separate IDE just to develop what SQL Server already supports via Automation (your example), and Oracle and DB2 already support via built-in functions?
      2.  Is there no possible way to call and return a web request natively without having to go and program in some other language than SQL?  
      3. One of the reasons you give above for not using your own example code is that it runs synchronously.  Does this mean only my connection to SQL Server is "hung" waiting, or, the entire SQL Server?  How does using CLR make it less synchronous?  
      Thanks!

    3. Greg:
      1. Yes, it requires a different programming environment (maybe not an IDE — you can do it from Notepad and compile using cs.exe if you want)
      2. No, no way.
      3. Only your connection, not the whole server.  The point is, on big sets of data this can be VERY slow.  CLR would not make it any less synchronous, but using SQLCLR has a lot of benefits over OLE automation including much less of a chance of causing a memory leak or crashing SQL Server (neither are at all likely with the above code), and much greater flexibility in terms of what you can do.

    4. Works like a charm when I run it against a 2000 row data set as part of a SSIS package in SQL Server 2005.  Takes a few minutes, but it works like a charm….

    5. In SQL Server 2005, I enabled "Ole Automation Procedures" to 1. Still the functions returns 0 for any valid URL. Is there something else I need to enable in SQL 2005 to make it work? Please advise!

    6. This works great for validating URLs.  Thanks Adam!  I was hoping to find something like this for email address validation.  Do you have a method for validating email addresses stored in a SQL table?

    7. Hi Anthony,
      Unfortunately, e-mail validation is quite different than URL validation. Whereas you can validate a URL by simply visiting it and checking for the presence of content (and lack of an error), e-mail validation requires sending an e-mail and waiting for a response — which may take days. That’s not something I think it well-suited to doing in SQL Server. Actually I don’t think URL validation is, either, but it’s definitely the lesser of the two evils 🙂
      –Adam

    8. Dear Adam,
      we are using the function quite heaviliy in order to monitor job execution. Now after migration to sql2k8 and to another machine (win2k3 server) we are experiencing memory leaks, so the function stops working after ~3 days…. Details can be found http://stackoverflow.com/q/8284498/833679 do you have an idea how to solve this?
      Ludwig

    9. Hi Ludwig,
      Sorry that you’re having problems; unfortunately I don’t have any idea. I wrote that function back in 2003 and haven’t touched it since that project.
      In SQL Server 2005 or 2008 I would implement this functionality in a SQLCLR UDF rather than use Object Automation. Do you have anyone on your team with C# experience? It should be very simple to put together.
      –Adam

    10. Ludwig,
      One other idea: Have you tried pulling the logic out of the function and putting it into a stored procedure instead? Then you could, at least, use exception handling and make sure that sp_oadestroy gets run if there is a problem. That should (in theory) prevent any memory leaks.

    11. I had tried this example and it worked proper but how to get response from that URL.
      For example,
      My URL is http://localhost/SampleSite/Default.aspx
      When i called from sql server i got the output "1". If URL was wrong then got the output "0". This mean URL is not proper.
      Now my requirement is that after call URL, URL will give return response like ("True" / "False") from URL side.

    12. Here is how you get the response:
      Exec sp_OAMethod @Object, ‘responseText’, @ResponseText OUTPUT
      Exec @hr = sp_OAGetProperty @Object, ‘status’, @status OUT
      Exec @hr = sp_OAGetProperty @Object, ‘responseText’, @ResponseText OUT
      @Status is integer

    13. Thanks for filling in the gaps, Mark!
      I’ll take this time to reiterate that this might not be the very -best- idea to do inside of SQL Server…

    14. Noticed when mutiple calls in a single string that it returns all zeros after roughly 200 instances. Have you guys had this experience, and is there a solution?
      Here is the query I am running to insert the return into a table (after 240 call it only returns 0’s):
      insert into dbo.URL_VALIDATOR (VALIDATION1, URL)
      select a.VALIDATION1,a.URL
      from
      (SELECT dbo.validateURL(‘http://www.mcafee.com‘) as VALIDATION1, (‘10548241969129’) as URL)a;
      insert into dbo.URL_VALIDATOR (VALIDATION1, URL)
      select a.VALIDATION1,a.URL
      from
      (SELECT dbo.validateURL(‘http://www.midwescofilter.com‘) as VALIDATION1, (‘10549497419218’) as URL)a;

    15. Do someone know how to get response from URLs that have Windows integrated activated?
      At the moment I only got response 0 from them.

    16. Very helpful code, thanks!
      I do have a question though.  Is there a way to modify the internals of the function to return a resulting URL if the URL you originally submit forwards to another URL?

    LEAVE A REPLY

    Please enter your comment!
    Please enter your name here