λ Tony's Blog λ

Parsing map data using a lazy language

Posted on January 10, 2010

Haskell is a pure, lazy programming language. Its laziness allows certain performance improvements without sacrificing the compositional properties of programs. I have recently written parsers for XML map data formats that allow a user to “read a data file into a collection of immutable objects.” If I told you I used the parser library to “read in” a 140 GB map data file and you’re not familiar with a lazy language, you might ask how I did this within the constraints of available memory. Easy, of course: I used a lazy language, where data is evaluated only as it is demanded, so the whole file need not sit in memory at once. The implications of a lazy (and therefore, pure) language are widely misunderstood, so I say “easy” wishing it really were easy for all people, but I know it isn’t (keep practicing!).
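To make the idea of laziness a little more concrete, here is a minimal sketch using only the Prelude and Data.List. It is not how the GPX or OSM parsers work internally; the names mentionsWpt, firstWpts and countWptsInFile are made up for illustration, and matching on the text "<wpt" is a crude stand-in for real XML parsing.

```haskell
import Data.List (isInfixOf)

-- Hypothetical helper: does this line of text mention a waypoint element?
-- (A crude textual check, not real XML parsing.)
mentionsWpt :: String -> Bool
mentionsWpt = ("<wpt" `isInfixOf`)

-- Return the first n lines that mention a waypoint.
-- Lists are lazy, so only as much input as is needed to find
-- n matches is ever examined.
firstWpts :: Int -> String -> [String]
firstWpts n = take n . filter mentionsWpt . lines

-- Count waypoint lines in a file. readFile is lazy, so the text is
-- read on demand as the count consumes it.
countWptsInFile :: FilePath -> IO Int
countWptsInFile path = do
  contents <- readFile path
  return $! length (filter mentionsWpt (lines contents))
```

Because nothing is evaluated until it is needed, firstWpts 2 (cycle "<wpt/>\n<trk/>\n") returns ["<wpt/>","<wpt/>"] even though the input string never ends.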

HXT is a parsing library for XML that is based on Hughes’ arrows and allows a user to piece together their own specific XML parser. I used it to parse the GPS Exchange (GPX) and OpenStreetMap (OSM) data formats.

Here are some examples of parsing GPX files, and here are examples of parsing OSM files. My favourite is a very simple example (there are more complex ones) that removes waypoints from a GPX file. This question (“how to remove waypoints from gpx?”) was asked on the OSM mailing list quite a while ago; questions like these partially inspired me to write these libraries.

import Data.Geo.GPX

removeWpts :: FilePath -> FilePath -> IO ()
removeWpts = flip interactGpx (usingWpts (const []))

The implementation is very simple. The interactGpx function takes two file names and a function that transforms a Gpx data structure to a new Gpx. It reads the first file into a Gpx value, applies the given function to produce a new Gpx, then writes the result to the second file. Since the transformation is the middle argument of interactGpx, flip moves it to the front, leaving removeWpts as a function of just the two file names.

interactGpx :: FilePath -> (Gpx -> Gpx) -> FilePath -> IO ()
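The same read, transform, write shape can be sketched without the library. The interactFile function below is a hypothetical stand-in, not the implementation of interactGpx: it takes its parser and printer as the (assumed) parameters parse and render, where the real library reads and writes GPX XML itself.

```haskell
-- A generic stand-in for the read/transform/write pattern.
-- parse and render are hypothetical parameters; the real library
-- supplies its own XML reader and writer for Gpx values.
interactFile
  :: (String -> a)   -- parse the input text into a value
  -> (a -> String)   -- render a value back to text
  -> FilePath        -- file to read
  -> (a -> a)        -- pure transformation to apply
  -> FilePath        -- file to write
  -> IO ()
interactFile parse render from f to = do
  input <- readFile from
  writeFile to (render (f (parse input)))

-- Example: drop blank lines from a text file, written in the same
-- flip style as removeWpts.
dropBlankLines :: FilePath -> FilePath -> IO ()
dropBlankLines = flip (interactFile lines unlines) (filter (not . null))
```

One caveat with this sketch: readFile is lazy, so the input and output paths should be different files.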

The usingWpts function takes two arguments: a function that transforms a list of waypoints into a new list of waypoints, and a Gpx value. It returns a new Gpx value with the waypoints transformed.

usingWpts :: ([WptType] -> [WptType]) -> Gpx -> Gpx
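A function of this shape is easy to write for any record. Here is a minimal sketch on toy types; ToyWpt, ToyGpx, usingToyWpts and the two example transformations are all invented for illustration and are much simpler than the real Gpx and WptType.

```haskell
-- Toy stand-ins for the library's types (hypothetical, for illustration).
data ToyWpt = ToyWpt { lat :: Double, lon :: Double }
  deriving (Eq, Show)

data ToyGpx = ToyGpx
  { toyWpts   :: [ToyWpt]
  , toyTracks :: [String]
  } deriving (Eq, Show)

-- Apply a function to the waypoints, leaving everything else untouched.
usingToyWpts :: ([ToyWpt] -> [ToyWpt]) -> ToyGpx -> ToyGpx
usingToyWpts f gpx = gpx { toyWpts = f (toyWpts gpx) }

-- Ignore the existing waypoints and replace them with an empty list.
removeAllWpts :: ToyGpx -> ToyGpx
removeAllWpts = usingToyWpts (const [])

-- Keep only waypoints on or north of the equator.
keepNorthern :: ToyGpx -> ToyGpx
keepNorthern = usingToyWpts (filter ((>= 0) . lat))
```

Since the transformation is an ordinary function on lists, anything from Data.List (filter, map, take and so on) can be passed in the same way.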

Of course, since we want to remove all waypoints, we ignore the given list of waypoints and return an empty list, which is exactly what const [] does. Pretty neat, I reckon!

You can get either of these libraries from Hackage:

Here are their home pages: