Strange behavior by design of the spatial function Filter in SQL Server 2008

In SQL Server 2008 there is a spatial function called Filter, documented here:
http://msdn.microsoft.com/en-us/library/cc645883.aspx
This function performs a fast, index-based scan for geometry intersection. It is guaranteed to return all intersecting cases, but it might return non-intersecting cases as well. So it is a first filtering step and, as I understand it, an internal part of STIntersects. It is not the same thing as the bounding box comparison PostGIS does as a first filter before the real ST_Intersects calculation; in SQL Server, Filter returns a more accurate answer to the intersection question. This discussion explains a lot about how it works:
social.msdn.microsoft.com/Forums/en/sqlspatial/thread/6e1d7af4-ecc2-4d82-b069-f2517c3276c2

The problem with this function is that, as the documentation says:

“In cases where an index is not available, or is not used, the method will return the same values as STIntersects() when called with the same parameters.”

Why is this a problem? Well, it means that the function gives different answers with and without an index for the exact same query and dataset. I think this is a little problematic in itself for a function that is not purely internal. But it can become nearly absurd in some cases. Look at this query:

Select a.id, b.id from table1 a, table2 b where a.geom.Filter(b.geom) = 1;
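For the index-based path to be available at all, the geometry columns need spatial indexes. A minimal sketch of how that could look, assuming the table and column names from the query above; the index names and the bounding box values are placeholders I made up, and SQL Server also requires each table to have a clustered primary key before a spatial index can be created:

```sql
-- Hypothetical index names; the bounding box must be adjusted to cover your data.
CREATE SPATIAL INDEX six_table1_geom
ON table1 (geom)
USING GEOMETRY_GRID
WITH (BOUNDING_BOX = (xmin = 0, ymin = 0, xmax = 1000, ymax = 1000));

CREATE SPATIAL INDEX six_table2_geom
ON table2 (geom)
USING GEOMETRY_GRID
WITH (BOUNDING_BOX = (xmin = 0, ymin = 0, xmax = 1000, ymax = 1000));
```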

If the geometries in table1 and table2 are indexed, we will get a fast answer, which might contain more geometry combinations than those actually intersecting each other. That is no problem; that is the whole idea. The problem shows up if we want to see the result of the Filter function in the select part of the query, like this:

Select a.geom.Filter(b.geom) as Filtered, a.id, b.id from table1 a, table2 b where a.geom.Filter(b.geom) = 1;

Then something interesting happens: since the index won’t be used in the select part (using an index makes no sense there), the Filter function will give the same result there as STIntersects, just as the documentation says. So even if we “filter away” all cases where Filter returns anything other than 1, we will get rows where the column “Filtered” returns 0.
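If what you actually want in the select part is the exact answer, it is probably clearer to ask for it explicitly. A small sketch, assuming the same table1, table2, id and geom names as above; the column aliases are my own:

```sql
-- Filter in the where part does the fast, index-based first pass.
-- STIntersects in the select part gives the exact answer per row,
-- so rows with ExactIntersects = 0 are the false positives of the first pass.
SELECT a.geom.STIntersects(b.geom) AS ExactIntersects,
       a.id AS a_id,
       b.id AS b_id
FROM table1 a, table2 b
WHERE a.geom.Filter(b.geom) = 1;
```

This does not change how Filter behaves; it only makes it obvious from the query text which of the two answers each part is expected to give.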

Here is a picture of my practical example from SQL Server Management Studio. As you see, I get 144 rows back. I use exactly the same function in the where part and in the select part, but apparently get different answers. Fully logical given how the function is designed, but I don’t like it.

I guess that the reason for this design is that it is difficult or impossible to get the same result without the index.

Please comment on this one. Have I misunderstood things?
Maybe it is common for functions to give different answers with and without indexes?

But to me it looks quite ugly.


5 Responses to “Strange behavior by design of the spatial function Filter in SQL Server 2008”

  1. Isaac Kunen says:

    Hi Nicklas,

    Well, the trackback didn’t work (I assume user error on my end) but I’ve posted a response here: http://blogs.msdn.com/isaac/archive/2010/03/04/filter-one-odd-duck.aspx

    The short version: yes it’s odd. But it’s a good thing.

    Cheers,
    -Isaac

  2. David says:

    You are correct, it may return false positives. However, the intended use is to get a smaller chunk of data out of a huge database really really fast. And THEN you use more precise filtering to narrow that smaller piece of data down to your final correct answer. If you were to only use the precise filters first, it would take far longer than this two step process.

    In a sense, it’s like looking something up in a [real, paper book] encyclopedia – you would grab the book with generally what you want in it, like articles starting with En-Ez, and then look in the index in the back of that book to find exactly the article you want.

  3. Nicklas Avén says:

    The point here is not the false positive answers from the Filter function. The point is that the Filter function in some situations gives different answers in the select clause and the where clause, because the where part gets an index-based answer and the select part does not.
    /Nicklas
