In SQL Server 2008 there is a spatial function called Filter, documented here:
This function makes a fast index-based scan for geometry intersection. It guarantees to return all intersecting cases, but might return cases not intersecting as well. So this is a first filtering and as I understand it, it is an internal part of STIntersects. This functionality has nothing to do with the bounding box comparison PostGIS does as a first filtering before the real ST_intersts calculation. In SQL Server Filter returns a more accurate answer to the intersection question. This discussion explains a lot of how it works:
The problem with this function is that, as the documentation says:
“In cases where an index is not available, or is not used, the method will return the same values as STIntersects() when called with the same parameters.”
Why is this a problem? Well, this means that the function gives different answers with and without an index from the exactly same query and dataset. This I think is a little problematic in itself for a function not being totally internal. But it might become nearly absurd in some cases. Look at this query:
Select a.id, b.id from table1 a , table2 b where a.geom.Filter(b.geom)=1;
If the geometries in table1 and table2 are indexed, we will get a fast answer, which might contain more geometry combinations than those actually intersecting with each other. That is no problem, that is the whole idea. The problem shows if we want to see the result of the Filter function in the select part of the query like this:
Select.geom.Filter(b.geom) asFiltered, a.id, b.id from table1 a , table2 b where a.geom.Filter(b.geom)=1;
Then the interesting thing happens that since the index won’t be used in the select part (using index makes no sense in select part) the Filter function here will give the same result as STIntersects, just as the documentation says. So, even if we “filter away” all cases where Filter returns anything else than 1 we will get rows where the column “Filtered” returns 0.
Here is a picture of my practical example from SQL Server Management console. As you see I get 144 rows back. I use the exactly same function in the where-part and in the select part but apparently get different answers. Fully logical from how the function is designed, but I don’t like it.
I guess that the reason for this design is that it is difficult or impossible to get the same result without the index
Please comment on this one. Have I misunderstood things?
Maybe it is common with functions giving different answers with and without indexes?
But to me it looks quite ugly.