Importing Shapefiles

There’s a lot of interesting data out there that isn’t points, but regions. In Chapter 10 of the book, we showed you how to calculate an area inside a polygon (see this demo); unfortunately, there just wasn’t the space to cover more region-centric topics.

Texas Highlight

The current state of the API leaves you pretty much adrift when it comes to actually plotting region-data. Many mashups do a good job faking it with polylines, but polylines and polygons are really very different beasts.

  1. A polygon is a closed loop, so you should have a way to colour its interior, and
  2. Polygons representing things even the slightest bit complex will have vertex counts in the thousands, rather than dozens.

It’s that second reason, I expect, that has made the Maps team antsy about building polygon calls directly in the API… they don’t want to have to explain to Joe Mashup Author why he can’t pass the 179,354 points that define the outline of Alaska in through a URL querystring.

But enough of that; we’re here to show you how you can do regions to your heart’s content. School districts and election seats are just the start: there are mountains of data out there presented by country, or by state/province, just waiting for some map love (not to mention boundaries based on ecological, social, environmental, or economic divisions).

The Format Game

Between GML and the TIGER/Line data, you’d think there’d be enough variety already, but the format we’re looking at here is Shapefile. Shapefiles are binary collections of vector data, and as such, it would be extremely hairy to try working through one without some kind of preprocessing.

Fortunately, Bryce Nesbitt and Frank Warmerdam have done the heavy lifting already, having created a set of tools for converting from Shapefiles to rational text-based formats.

Nesbitt has included executables for several platforms in his download there, so you likely don’t even need to compile anything. If you’ve got shell access to your webspace, you can just download a shapefile directly to it, and then process it in-place. Alternatively, it’s just as easy to perform these operations on a local machine. Check it out:

wget http://edcftp.cr.usgs.gov/pub/data/nationalatlas/statesp020.tar.gz
tar xvfpz statesp020.tar.gz
wget http://www.obviously.com/gis/shp2text/shp2text.zip
unzip shp2text.zip
./shp2text --gpx statesp020.shp 3 4 > output.gpx

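What did I do? The first line downloads a compressed archive of US state outlines from this fantastic page. The second uses UNIX’s tar utility to unpack it. The third and fourth lines grab and unzip the shp2text program. And the final line sets to work on our shapefile, writing the result to output.gpx. The trailing 3 and 4 are attribute column numbers; judging by the output below, they select the fields that end up in each route’s name and cmt elements.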

Of the three options offered by shp2text, I thought GPX looked the most promising. As an XML format, I knew that it would be at least somewhat self-describing; hopefully I’d be able to just open it up and get a feel for what it was all about.

Sure enough, look at how output.gpx starts out:

<rte><number>0</number><name>Alaska</name><cmt>02</cmt>
    <rtept lat=" 70.95909119" lon="-157.47343445"></rtept>
    <rtept lat=" 70.96421051" lon="-157.46252441"></rtept>
    <rtept lat=" 70.97583771" lon="-157.42974854"></rtept>
    <rtept lat=" 70.98235321" lon="-157.41198730"></rtept>
    <rtept lat=" 70.98793793" lon="-157.39967346"></rtept>
    ...

That can’t be too bad: it’s just a bunch of rte blocks that wrap around lists of points. Of course, we can’t just point SimpleXML at a 23 MB file (it would try to load the whole document into memory), but by making use of PHP’s SAX processing, we can read the data as a stream and get it into a database.
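For a feel of what callback-based parsing looks like, here is a minimal sketch in Python. My importer is PHP, so treat this as an illustration of the approach rather than the importer itself. The RouteHandler class and the on_route callback are names made up for this example, and the gpx wrapper element in the sample is an assumption about how the file is framed.

```python
import xml.sax


class RouteHandler(xml.sax.ContentHandler):
    """Collects one <rte> block at a time and hands it to a callback."""

    def __init__(self, on_route):
        super().__init__()
        self.on_route = on_route  # hypothetical hook: on_route(name, cmt, points)
        self.text = []            # character data of the element being read
        self.name = None
        self.cmt = None
        self.points = []

    def startElement(self, tag, attrs):
        self.text = []
        if tag == "rte":
            self.name, self.cmt, self.points = None, None, []
        elif tag == "rtept":
            # float() tolerates the leading space in lat=" 70.95909119"
            self.points.append((float(attrs["lat"]), float(attrs["lon"])))

    def characters(self, content):
        self.text.append(content)

    def endElement(self, tag):
        value = "".join(self.text).strip()
        if tag == "name":
            self.name = value
        elif tag == "cmt":
            self.cmt = value
        elif tag == "rte":
            # Only one route is ever held in memory at a time.
            self.on_route(self.name, self.cmt, self.points)
            self.points = []


# Tiny sample in the shape of output.gpx (the <gpx> wrapper is assumed).
SAMPLE = b"""<gpx>
<rte><number>0</number><name>Alaska</name><cmt>02</cmt>
    <rtept lat=" 70.95909119" lon="-157.47343445"></rtept>
    <rtept lat=" 70.96421051" lon="-157.46252441"></rtept>
</rte>
</gpx>"""

routes = []
xml.sax.parseString(SAMPLE, RouteHandler(lambda n, c, p: routes.append((n, c, p))))
```

For the real file you would call xml.sax.parse("output.gpx", handler) instead, and have the callback write each route to the database rather than append it to a list.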

XML to SQL

I set up a new database in MySQL, and created two tables. One would represent the individual vertex points, with the other representing the polygons that group them. Technically, I probably could have gotten away with just the one, but it’s good to have those bounding-box values to query against. All but the simplest states (geometrically, of course) have many polygons in their construction. There are over 3000 records in my shape_polygons table, yet that’s just 50 states’ worth of points.
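To make the bounding-box idea concrete, here is a small Python sketch (function names are my own, purely for illustration). The box is just the min and max of each coordinate, and it gives you a cheap test for throwing out polygons that can’t possibly contain a point before you touch any vertices.

```python
def bounding_box(points):
    """Return (lat_min, lat_max, lon_min, lon_max) for (lat, lon) pairs."""
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    return min(lats), max(lats), min(lons), max(lons)


def box_contains(box, lat, lon):
    """Cheap pre-check: could this polygon possibly contain the point?"""
    lat_min, lat_max, lon_min, lon_max = box
    return lat_min <= lat <= lat_max and lon_min <= lon <= lon_max


box = bounding_box([(70.95909119, -157.47343445), (70.98793793, -157.39967346)])
inside = box_contains(box, 70.97, -157.42)   # point within the box
outside = box_contains(box, 30.0, -97.0)     # point far away from it
```

One caveat: a plain min/max on longitude goes wrong for any polygon that straddles the 180° meridian, so shapes out there would need special handling.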

Here are the table definitions:

CREATE TABLE `shape_polygons` (
  `id` int(11) NOT NULL auto_increment,
  `latitude_min` float NOT NULL default '-90',
  `latitude_max` float NOT NULL default '90',
  `longitude_min` float NOT NULL default '-180',
  `longitude_max` float NOT NULL default '180',
  `code` varchar(32) NOT NULL,
  `source` varchar(32) default NULL,
  PRIMARY KEY  (`id`),
  KEY `code` (`code`)
) ENGINE=MyISAM;

CREATE TABLE `shape_vertices` (
  `id` int(11) NOT NULL auto_increment,
  `polygon_id` int(11) NOT NULL,
  `ordering` int(11) NOT NULL,
  `latitude` float NOT NULL,
  `longitude` float NOT NULL,
  `elevation` float default NULL,
  PRIMARY KEY  (`id`),
  KEY `polygon_id` (`polygon_id`),
  KEY `ordering` (`ordering`)
) ENGINE=MyISAM;
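If you want to see how rows land in these two tables, here is a rough Python sketch. It uses an in-memory SQLite database as a stand-in for MySQL so that it runs anywhere, with the schema translated to SQLite types. The store_polygon function is a made-up name, and storing the cmt value ("02") in the code column is my assumption for the example, not something the schema dictates.

```python
import sqlite3

# SQLite translation of the two tables above (stand-in for MySQL).
SCHEMA = """
CREATE TABLE shape_polygons (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  latitude_min REAL NOT NULL DEFAULT -90,
  latitude_max REAL NOT NULL DEFAULT 90,
  longitude_min REAL NOT NULL DEFAULT -180,
  longitude_max REAL NOT NULL DEFAULT 180,
  code TEXT NOT NULL,
  source TEXT
);
CREATE TABLE shape_vertices (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  polygon_id INTEGER NOT NULL,
  ordering INTEGER NOT NULL,
  latitude REAL NOT NULL,
  longitude REAL NOT NULL,
  elevation REAL
);
"""


def store_polygon(db, code, points, source=None):
    """Insert one polygon row plus its ordered vertices; return the polygon id."""
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    cur = db.execute(
        "INSERT INTO shape_polygons (latitude_min, latitude_max,"
        " longitude_min, longitude_max, code, source) VALUES (?, ?, ?, ?, ?, ?)",
        (min(lats), max(lats), min(lons), max(lons), code, source),
    )
    polygon_id = cur.lastrowid
    # 'ordering' preserves the sequence of points around the outline.
    db.executemany(
        "INSERT INTO shape_vertices (polygon_id, ordering, latitude, longitude)"
        " VALUES (?, ?, ?, ?)",
        [(polygon_id, i, lat, lon) for i, (lat, lon) in enumerate(points)],
    )
    return polygon_id


db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
pid = store_polygon(
    db, "02",
    [(70.95909119, -157.47343445), (70.96421051, -157.46252441)],
    source="statesp020",
)

# Bounding-box lookup: which polygons might contain this point?
rows = db.execute(
    "SELECT id, code FROM shape_polygons"
    " WHERE ? BETWEEN latitude_min AND latitude_max"
    "   AND ? BETWEEN longitude_min AND longitude_max",
    (70.96, -157.47),
).fetchall()
```

The lookup at the end is the payoff for keeping the min/max columns: it narrows things down to a handful of candidate polygons without reading a single vertex.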

And here’s the source for my importer. If you need help following it, check out the documentation on SAX XML, since the callback-based approach can be a little less intuitive if you haven’t seen it before.

Anyhow, best of luck with this! Watch for upcoming articles explaining how to turn this information into swanky tilesets and overlays. (And as always, please report any sweet data sources you find, especially those with global scope!)

Importer source: Import.php

Shapefile tools: shp2text

U.S. State Outlines: statesp020.tar.gz


14 Responses to “Importing Shapefiles”  

  1. 1 Josh L

    Thanks for the interesting post.

    If not already tied to another database, some folks may be interested in the PostGIS - extensions to PostgreSQL. Easy to install, and very powerful for this kind of stuff.

    You can use the shp2pgsql utility that comes with it to quickly turn shapefiles to sql in one step. The SQL includes the geometry column, and once in PostGIS it’s really easy to do powerful spatial analyses via SQL queries.

    You can also simplify complicated polygons’ output on the fly, easily query for areas, things within a buffered distance, things that intersect sets of polygons, and lots of other stuff.

    A very simple example using postgis behind google maps is at http://eactive.org/svdp (you can click on the map to get the polygon that point resides in). Nothing complicated, but it’s a task made much easier using PostGIS

    Cheers,

    -Josh

    PS I think ogr now has kml output (it reads shapefiles), so that might be of interest as well.

  2. 2 Mike

    Josh: Thanks for the heads-up on that. For people who have control over their own servers, PostGIS is definitely a very powerful tool worth investigating. We used MySQL in the book; for these articles, I wanted to stick to tools that people are likely to have on their shared hosting environments. (In the next segment, I’ll be using GD for generating overlay images, but there will be a later demo of ImageMagick’s much superior output…)

  3. 3 Sami Azmi

    Hi people.

    I am looking for some help. I went thru ur posts.. they are great. I need some thing to create a web based GIS system capable of handling queries, and has normal GIS operations like zoom in out etc. It should be extendable to an extent that a small Web based Descision Support System can be made out of it. Can u ppl suggest ne thing.

    Waiting for ur response

    Sami

  4. 4 Karl Fallstrom

    I would think about Oracle Spatial…

    but I do work at a large firm. It has all the capabilities you require.

  5. 5 phani

    hi! i have seen ur article. its very interesting. i think u can help me in overlaying a shape file(polygons) which is converted into postgresql table onto google maps.
    please help me out of how to proceed with.

    thank u
    phani

  6. 6 Dan

    Thanks for this post; I’ve been tearing my hair out trying to work out how to make density maps with google or yahoo maps.

    There’s a vast amount of data at http://geodata.grid.unep.ch/: iirc, they have shapefiles of first and second-level administrative boundaries, and a lot of data on population, economics, etc, to go with it.

    I’ve been using server-side tools for this kind of thing (e.g. http://www.ohuiginn.net/mt/2006/11/westminsters_map.html), but the next time I have a free afternoon I’ll have a bash at doing the same thing as a google maps mashup

  7. 7 Justin Leider

    Wow, this documentation just made my day! Thanks for the great documentation. probably would have taken me a couple days worth of trial and error to get all my shp data imported into SQL. THANKS!

  8. 8 Max

    Thanks for the post Mike. It’s quite old, but came in really handy.

    I stuck with one problem though… and can’t seem to find an answer. I downloaded city planning shape files and ran them through the shp2text - everything came out just fine, but the coordinates in the State Plane system, and I can’t find anything on how to convert them or use them correctly with Google Maps API. Do you have any ideas about it?

    Thanks again for a great post!

    P.S. BTW, what are those numbers “3 4” standing for in the conversion line
    ./shp2text --gpx statesp020.shp 3 4 > output.gpx ?

  9. 9 Tom Friedel

    I am assisting with the site Cracids.org for the conservation of Cracids, which are rare South American birds. I would like to combine shape file information from Infonatura with Google Maps to show the distribution of about fifty species of birds. There is no funding for this project. I am hoping someone will volunteer time to help me get this running.

  10. 10 vocoder

    Someone ported that shp2pgsql to mysql - so there is now shp2mysql which works great for getting shapefile data into mysql. haven’t tried drawing the data back with GD yet though….anyone successful with that? It seems this site is slightly outdated.

  1. 1 Shapefile Tiles with PHP and GD at Beginning Google Maps Applications
  2. 2 EpiSPIDER » Google Maps Polygons