Archive for the ‘PostGIS’ Category

What is behind PostGISonline.org

Tuesday, June 15th, 2010

If you are curious and want to know what is behind PostGIS online you can download the database, php-pages and others from:
http://download.jordogskog.no/pgo
There is two files, pgo.sql.tar.gz is the database as a dump and pgo.tar.gz is the rest of the files.

Just so it’s said:
Do whatever you want with this code, but don’t blame me if things goes wrong.
If you develop it I would appreciate to get a note, just of curiosity.

You will find no rocket science but maybe a few ideas that might be interesting. Now it is just presented as a quite raw copy of the site without much explanation so I will give a few notes here.

The basic idea behind the site is to take the sql-string the user gives and create a table from the result and then show that table through MapServer. In the beginning it created a view instead of a table to save some disc writing, but I found that the view had to be evaluted several times to find resulting geometry types,srid and finally to show the view. So my conclusion was that it is worth writing down the table. So what is actually happening when the sql-string is sent is that the “CREATE TABLE” and a table name is put in front.

The table gets the name map1, map2 or map3 depending on what button the user clicked. To make it working with many users the session is is also added after mapX, for example map139853657d34373592f51e5e8c45f06b7.

What also happens is that when a user sends his first sql-string the session id is stored in table adm.sessions with a timestamp. then the timestamp is updated for each new query the user runs. That makes it possible for a function in the database found in file functions.sql to drop all tables belonging to session id’s that has not been used for half an hour.

On the Mapserver side of the story there is a mapfile with definitions of 9 different layers. One for Map1 point, one for Map1 linestring, one for Map1 polygon, one for Map2 point and so on. To make it possible to run queries that return geometry collections that contains different types I use the new function in PostGIS 1.5, ST_CollectionExtract(geometry collection, integer type). It gives som overhead but makes the system working with results containing mixed types and as mentioned even mixed geometry collections. So each geometry is checked both for points, linestrings and polygons. The DATA definition for points in Map1 looks like this for instance:

DATA “the_geom FROM (
select ST_CollectionExtract(ST_Force_Collection(the_geom),1) as the_geom,
idmap1 from userdata.%relation%) as a
USING UNIQUE idmap1 USING SRID=%srid%”

As you see on the above example there is also an unique id added to the table. That id, here called idmap1, is never showed to the user and is just added to the table after it was created to make Mapserver happy.

What I think from this maybe is most noticeable is the last part, how to make Mapserver work with geometry collections. As said earlier it gives some overhead, but can be usable anyway.

That’s it for now. Any comments or suggestion for improvement is very welcome.

Testing PostGIS 2.0 on PostGIS online

Friday, June 11th, 2010

Now it is possible to try and test new functionality soon after it is added in PostGIS trunk. How up to date the version on the site is might vary because there will be changes on the road towards PostGIS 2.0 that can not be automatically updated on the site.

How to use the trunk version you can learn from this tiny tutorial.

To see what new functions is there take a look at the SVN documentation snapshot

Simple SQL tutorial

Saturday, April 3rd, 2010

Here comes an attempt to explain some basic SQL queries.

The tutorial is divided in 3 sections. You can find the all here:
http://www.postgisonline.org/tutorials/

PostGIS online

Sunday, March 28th, 2010

Maybe you have seen it, it has been up and running for a few weeks now, but now it is also starting to take shape as I want it. What?…
http://www.postgisonline.org, a site for testing, training and showing PostGIS.

It is a web site with the intention to make it possible to find the power in handling spatial data in a relational database environment. You can write any SELECT statement against the database that PostGreSQL/PostGIS latest releases support and get the answer back as a table or as a map if it includes spatial data in a field named “the_geom”.

It is also possible for anyone to write tutorials and run them through the site. The tutorial can present a text, a sql-code, a background picture and a background map for each page. This concept have to be further developed but I think it is a good start and I hope it will be used to show the capabilities of PostGIS. I will use it to continue writing about “How to use the new distance functions”. An instruction, how to write tutorials will show up in the documentation area of the site. So far I have written two of them that can be found here

The spatial functionality is done with MapServer showing tables created by the sql query written by the user.

So, this is a first version. I hope that it will be used and maybe it can become a project with more people involved. There is a lot of functionality that can be added like PL/R with some way to show diagrams and pgrouting and some way to show WKTraster.

Oh, yes the logo…. It is a creation made by my wife :-)

Welcome !

How to use the new distance related functions in PostGIS Part1

Sunday, February 7th, 2010

As described earlier there is also some new distance related functions in new PostGIS release. Since I am in some way responsible for their existance I feel a need to give some examples of how they can be used.

Example 1 Snap GPS-points to roads

If collecting GPS-points along a road you will get them placed on the side of the roadline because of the inaccurancy of the GPS and the road on the map. One way to get the points on the road is to use ST_Closestpoint.

We have a table with our roads called ….. roads (why not) and another table called gpspoints with pour GPS points. The gepspoints are as expected placed, not on the road, but next to the road. There is also a GPS point that probably have been collected when walking in the forest since it is to far away from the road.


roads and GPS points

Now lets do a first attempt to move the points to the roads. We do:

Select ST_Closestpoint(a.the_geom, b.the_geom) as the_geom
from roads a inner join gpspoints b
on ST_Dwithin(a.the_geom, b.the_geom, 100);

We count on that no GPS points should be more than 100 meters away from the road if it is collected from the road. That much error should never be from the GPS, but it is possible if it is just started and so on. The result from our query will be the green dots below:


Gps Points moved to road 1

But there is a problem. Not the point out in the forest. That one we have abandoned since it is more than 100 meters from the road. But the point between two road parcels a little bitdown from the middle of the map. That GPS point did get two new points insted of just one. What has happened is that every road feature closer than 100 meters of a GPS point has had a point from that GPS point.

This can be solved by collecting all road features that is within those 100 meters from the GPS point into a geometry collection which we put into the ST_Closestpoint function. Since grouping by geometries often is a bad idea we group by the id of the road features but also group by the geometriy just to avoid the need of aggregating also the GPS points. The query will look something like:

Select ST_Closestpoint(ST_Collect(a.the_geom), b.the_geom) as the_geom
from roads a inner join gpspoints b
on ST_Dwithin(a.the_geom, b.the_geom, 100)
group by b.gid, b.the_geom;

and the result will be, again like the green dots:


roads3

That’s better. Now it looks like expected.

PostGIS 1.5.0 released

Saturday, February 6th, 2010

I hardly believed my eyes the other day when I saw the headlines roll. PostGIS 1.5.0 is released and totally without any suffix like beta or rc :-)

With 1.5.0 the knowledge has reached PostGIS that the planet Tellus is not flat but a globe. You can now calculate the distance from Stockholm, Sweden to Oshakati in northern Namibia without the need to dig a tunnel to get the same answer from practical measuring. If you use the new geographical storage type you will get the distance around the globe from point 1 to point 2. 

But still most of the PostGIS functionality is only applicable to the planar way of looking at the earth, one part at a time.

The full release document can be found here:

http://postgis.org/news/20100204

ST_Distance, the faster edition or Birgers Boost

Wednesday, December 23rd, 2009

When I was working on the new functions described in previous post I found that the distance calculation in general is very heavy and slow. The distance function gets two geometries to find the shortest distance in between. The approach has been to calculate the distance between all possible combinations of vertex-vertex and vertex-edge between the two geometries. That means that two geometries with 1000 vertexes each causes one million iterations and even if computers are fast, that takes some time.

The ideas how to make it faster came to me by the time of the birth of my son. I guess you get some extra boost from something like that. I was home from job for 10 days to help my wife and son, and I did, I promise :-) But I also had time to try some ideas of getting distance calculations faster. Because of this I call  it Birgers Boost from my son Birger.

The idea was to find a way to not do this distance calculation between all and every vertexes. I thought that at least the ones behind the middle of the geometry must be possible to avoid. I imagined like a wall that I projected against the geometries and then I could sort the vertexes as they appear on the other side of the wall as I move it through the geometry. I guess it maybe doesn’t make sense but I thought it was a little fun to describe how the idea appeared. The resulting algorithm uses a line from the middle of the first geometry to the middle of the second geometry. Then it orders the vertexes along that line and calculates the distances in the order of how close they are along that line. The big difference from the old function is that the preparation here, giving the vertexes a value along this line only happens once per vertex. So in the example of 1000 vertexes per geometry it takes only 2000 calculations to get those values. Then, when the vertexes is ordered we can do the distance calculations in the right order. And when the distance between those abstract walls that I imagined is bigger than the smallest found distance, then we know that the shortest distance is found. How many distances we have to calculate before we know this will vary depending on how the geometries is related to each other.

From the testing we have done it seems like it in general gives a quite good increase in speed. For larger geometries it is between 10 and 100 times faster than the old algorithm. In some special cases it is not that fast and in some cases it is even faster.

This way of doing it will not work if the geometries overlap. The easiest way to be sure they don’t overlap is to check for overlapping bounding boxes. So, if there is overlapping bounding boxes the calculation is sent to the old hard way of doing it. The same is the situation if one of the geometries is a point because then there is no gain to get. Then it is done the same way as before

This is a problem but hopefully this will be solved. Paul Ramsey have come up with ideas that might make my way of doing it short lived, see his blog:
http://blog.cleverelephant.ca/2009/11/is-good-enough-good-enough.html
He is mostly discussing his new geography functions but probably it will be a good way of doing it for geometry too. So in PostGIS 2.0 the development will continue :-)

Those distance calculations enhancements might be quite important because it makes it possible to calculate directly with the geometries in nearest neighbor calculations and thing like that instead of using the centroids. Using points will still be faster bu sometimes it may be useful to be able to run on the whole geometry and before it was often more or less impossible because of too heavy calculations.

This will be in PostGIS 1.5. A Beta release will hopefully be out soon. For windows there is experimental builds already available here:
http://postgis.org/download/windows/experimental.php
And of course the source code is available to compile for other platforms.

I have wrote some lines in the wiki too, to describe this
http://trac.osgeo.org/postgis/wiki/NewDistCalcGeom2Geom

Shortest line and other new functionality in PostGIS 1.5

Tuesday, December 22nd, 2009

One and a half year ago I found PostGIS. I did fast become a fan. Handling spatial data with sql is a wonderful way of doing it. PostGIS also have a great amount of functionality and if something is missing no one will be stopped from creating that functionality. When I realized that I understood that I no longer could complain about a functionality I have missed in other GIS systems. I have done some avenue scripting in Arcview 3.x and solved a lot of tasks that way. But I have missed an easy way to get the information about between which points the distance-function gets that min distance.

Let’s say you are working with linestrings of rivers and you want to know how close a linestring that represents a road is to that river. Ok, the distance-function tells you that the minimum distance is 20 meters. Great, but the next question will be, where. Where is the road only 20 meters away from the river. In a couple of times I have wanted that information and I have always imagined that the information have to be somewhere in there, in the function. To find the minimum distance you first have to identify where to measure, was my thought. That was partly right I found.

That’s the great thing about open source, if you are wondering how it is done the code is there to read. Since I have never studied C before I didn’t have very high expectations of understanding anything. But from quite good commenting and clean structure I successes to put this together
http://www.jordogskog.no/distance.html
The minimum distance between to geometries have to be between two vertexes or between one vertex and one edge. The distance calculation iterated through the vertexes and edges defining the inputted geometries comparing their relations one by one. How to find the distance between two vertexes is just done with the Pythagorean theorem. Little bit worse is it to calculate the distance between one vertex and an edge. Search for “How do I find the distance from a point to a line?” in this link
http://www.faqs.org/faqs/graphics/algorithms-faq/
There is a description how to get the distance from the line to the point. That is the way it was done before. But there is also a description how to identify the point on the edge (line) from where the shortest distance is found. Time for copy and paste. When the overall shortest distance is found the points defining that distance is returned to the user as a line. I found a line being the best way of returning the information because than the user can get both first and last point from that and the distance from the length of the line. The use of this functionality will probably, as described in the beginning be to identify where the minimum distance is found. Let’s say you are sitting on an big Island with your laptop and asking yourself from where you should swim to get the shortest way to shore. Now that problem is solved. For convenience the first point of ST_Shortestline can also be found with function ST_Closestpoint.

From this rewriting a also successes to get maximum distance calculation working, ST_Maxdistance. Then it was natural to also add longest line function which relates to ST_Maxdistance as ST_Shortestline relates to ST_Distance.
To make the symmetry complete I also added ST_DFullywithin. That function returns true if the maxdistance between two geometries is smaller or the same as the inputted last parameter. Just like ST_DWithin but with maximum distance instead of minimum distance.

So as summary
the old functions working with minimum distance, ST_Distance and ST_DWithin has now got a new friend ST_Shortestline and there is also the corresponding functions for max distance, ST_Maxdistance, ST_Longestline and ST_DFullywithin.

I will get back soon and tell about how I found the maybe fastest distance calculation, included in 1.5