On this iteration I'm loading some geographical data, converting it to hexagons and exporting vector tiles with the resulting information.
Creating the loader process has been a highly empirical exercise, as I've experimented with tons of approaches, never quite happy with any of them.
The road so far
(Supernatural reference :)
Geographical data in the database
Initially, as described in one of my earlier blog posts, I simply loaded geographical data and stored it "as is" in a geography column in SQL Server. Then a WebAPI would run a geographical query for all the features included in a certain area (matching the viewport). Additional calculations were then made client-side to compute the intersection between the hexagons and the geographical features.
I even posted a video of this:
But, as I explained in the video, this was a sub-optimal approach: I was making tons of unnecessary calculations on every request, and ideally some of them would be pre-calculated and stored. Also, this approach wouldn't scale to the whole world, just to small areas created "on-demand".
Redis on Node
After some intermediate experiments I decided to store the pre-calculated hexagon data in a key-value store, and I opted for Redis. I even implemented it in Node, although I haven't blogged about it. The results were pretty interesting, although not easily scalable, mostly because the key-value nature of Redis didn't map well to the queries I needed to do. Also, since Redis loads everything into memory (despite having persistence options), this wasn't scalable from a financial point of view: RAM is very expensive in the Cloud and I would need tons of it to map the whole world with hexagons.
Azure Table Storage on Node
Azure Table Storage seemed a good candidate for the hexagon data: it scales well, has great performance, is cheap and lends itself well to this kind of data. I even created a proof-of-concept that used Azure Table Storage, but it was turning into a difficult task, particularly because the Node client is not perfect and the loading process took ages to complete, even for a small portion of the world. Regardless, Table Storage could still be a viable option; I just need to solve some challenges first.
Recently I decided to redo the loader in C#, and this turned out to be a good bet. Besides better performance, I ended up with a much better architecture, allowing me to experiment with different alternatives that might suit the problem better.
Currently I'm trying two different options in parallel:
- Setting up a CDN on Azure with its origin set to a Cloud Service hosting a WebAPI that fetches hexagon data from Azure Table Storage.
- Storing vector tiles in blob storage and serving them directly to the web clients, eventually through a CDN.
So, summing up, the main decision is between an origin-pull CDN and a push CDN. To narrow the scope of this post, I'm only going to discuss the vector tiles implementation, where I push the vector tile files to the Cloud.
So, what's a vector tile?
I'm quoting the definition found on the OSM wiki (http://wiki.openstreetmap.org/wiki/Vector_tiles):
"Vector tiles are a way to deliver geographic data in small chunks to a browser or other client app. Vector tiles are similar to raster tiles but instead of raster images the data returned is a vector representation of the features in the tile. For example a GeoJSON vector tile might include roads as LineStrings and bodies of water as Polygons. Some vector tile sources are clipped so that all geometry is bounded in the tiles, potentially chopping features in half. Other vector tile sources serve unclipped geometry so that a whole lake may be returned even if only a small part of it intersects the tile."
In my case, for each map tile I generate a JSON file that describes the types of hexagons present in that particular tile (like roads or forests) but without any style information (colours, widths and such). The important detail is that I won't (at least for now) generate raster tiles, only vector tiles. I can then generate the tile images on the fly, either on the server (as in my previous post using GDI+) or on the client (for example using the HTML5 Canvas, as I had done before).
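For illustration, a hexagon vector tile in this scheme could look something like the snippet below. The field names are made up for the example, not the actual format the loader emits:

```json
{
  "tile": { "x": 123, "y": 86, "zoom": 8 },
  "hexagons": [
    { "id": "h-001", "land": true, "forest": true, "water": false, "level": 2 },
    { "id": "h-002", "land": true, "urban": true, "water": false, "level": 0 }
  ]
}
```

Note that there's nothing presentational here: colours, stroke widths and so on are decided later, wherever the raster image is produced.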
How does the loader process work, and where does the vector tile generation take place?
The high-level loader process is like this:
- I have various sources of data. Currently I support GeoJSON and the XYZ format (basically an ASCII file with Lat, Lon and a numerical value per line).
- The data is loaded and converted to hexagon data in-memory, processing each type of information separately (roads, land, rivers, railroads, altitude, forests, etc).
- After loading all the info I run various post-processing commands (using a plugin architecture), as some values are extrapolated from various layers (more on this in future posts).
- The data is then exported to tiles.
- Iterating over each hexagon, it's very simple to determine which map tiles contain it (basically as simple as dividing the pixel coordinates of the hexagon's bounding box by 256).
- Vector Tiles are generated including all the related hexagon data.
- The files are copied to the target location, which could be local storage or blob storage.
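The hexagon-to-tile step above can be sketched in a few lines. Given a hexagon's bounding box in world pixel coordinates (at some zoom level), the covering tiles fall out of an integer division by the tile size; the function name is mine, not the loader's:

```javascript
// Which 256x256 map tiles does a hexagon's bounding box touch?
const TILE_SIZE = 256;

function tilesForBoundingBox(minX, minY, maxX, maxY) {
  const tiles = [];
  // Integer-divide the pixel extents by the tile size to get tile indices.
  const txMin = Math.floor(minX / TILE_SIZE);
  const tyMin = Math.floor(minY / TILE_SIZE);
  const txMax = Math.floor(maxX / TILE_SIZE);
  const tyMax = Math.floor(maxY / TILE_SIZE);
  for (let tx = txMin; tx <= txMax; tx++) {
    for (let ty = tyMin; ty <= tyMax; ty++) {
      tiles.push({ tx, ty });
    }
  }
  return tiles;
}

// A hexagon spanning pixels (500, 200)-(530, 250) straddles the
// boundary between tiles (1, 0) and (2, 0).
console.log(tilesForBoundingBox(500, 200, 530, 250));
```

A hexagon near a tile edge naturally lands in two (or more) tiles, which is exactly what we want: each tile gets every hexagon it needs to render, even partially visible ones.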
So, right now I've already exported some tiles to Azure Blob Storage, which you can validate using the following links:
Another detail worth mentioning is the fact that I've added support for the quadkey representation of tiles (used above), which lends itself better to blob storage, as subfolders are not supported.
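For reference, this is the standard quadkey scheme from Bing Maps: each character encodes one zoom level by interleaving one bit of the tile's X and Y coordinates, so the whole key is a single flat string that works nicely as a blob name:

```javascript
// Convert XYZ tile coordinates to a quadkey (the Bing Maps scheme).
// One character per zoom level, digits 0-3.
function tileXYToQuadkey(tx, ty, zoom) {
  let quadkey = "";
  for (let z = zoom; z > 0; z--) {
    let digit = 0;
    const mask = 1 << (z - 1);
    if ((tx & mask) !== 0) digit += 1; // X contributes bit 0
    if ((ty & mask) !== 0) digit += 2; // Y contributes bit 1
    quadkey += digit.toString();
  }
  return quadkey;
}

console.log(tileXYToQuadkey(3, 5, 3)); // "213"
```

A handy property is that a tile's quadkey starts with its parent's quadkey, so prefix queries over blob names give you all descendants of a tile for free.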
The data inside these vector tiles might include:
- Land (bool)
- Mask with 6 bits, one for each edge of the hexagon
- Urban (bool)
- Forest (bool)
- Water (bool)
- Level (int)
- Normalised altitude
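The 6-bit edge mask above is compact to work with in code. This is just my reading of the idea (one bit per hexagon edge, edges numbered 0-5), not the loader's exact encoding:

```javascript
// Build a 6-bit mask from a list of hexagon edge indices (0-5).
function edgeMask(edges) {
  return edges.reduce((mask, e) => mask | (1 << e), 0);
}

// Check whether a given edge is set in the mask.
function hasEdge(mask, e) {
  return (mask & (1 << e)) !== 0;
}

const mask = edgeMask([0, 2, 5]); // 0b100101 = 37
console.log(mask, hasEdge(mask, 2), hasEdge(mask, 3)); // 37 true false
```

A renderer can then walk the six edges of each hexagon and only draw the ones whose bit is set, for example to trace a road or river across the grid.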
That's it for now. Next step: generating image tiles based on this info.
For reference, other notable examples of Vector Tiles:
Mapnik Vector Tiles: http://openstreetmap.us/~migurski/vector-datasource/
MapBox Vector Tiles: https://www.mapbox.com/blog/vector-tiles/