
Since Martijn posted a Python answer and said Perl would turn to line noise, I felt there was a need for a Perl answer, too.

On CPAN, the Perl module directory, there is a module called Geo::Gpx. As Martijn already said, GPX is an XML format. But fortunately, someone has already made it into a module that handles the parsing for us. All we have to do is load that module.

There are several modules available for CSV handling, but the data in this XML file is rather simple, so we don't really need one. We can do it on our own with the built-in functionality.

Please consider the following script. I'll give an explanation in a minute.

use strict;
use warnings;
use Geo::Gpx;
use DateTime;
# Open the GPX file (die with the system error if that fails)
open my $fh_in, '<', 'fells_loop.gpx' or die "Cannot open fells_loop.gpx: $!";
# Parse GPX
my $gpx = Geo::Gpx->new( input => $fh_in );
# Close the GPX file
close $fh_in;

# Open an output file (die with the system error if that fails)
open my $fh_out, '>', 'fells_loop.csv' or die "Cannot open fells_loop.csv: $!";
# Print the header line to the file
print $fh_out "time,lat,lon,ele,name,sym,type,desc\n";

# The waypoints method of the Geo::Gpx object returns an array ref
# which we can iterate in a foreach loop
foreach my $wp ( @{ $gpx->waypoints() } ) {
  # Some fields seem to be optional so they are missing in the hash.
  # We have to add an empty string by iterating over all the possible
  # hash keys to put '' in them. Map is like a foreach loop.
  map { $wp->{$_} ||= '' } qw( time lat lon ele name sym type desc );

  # The time is a unix timestamp, which is hard to read.
  # We can make it an ISO8601 date with the DateTime module.
  # We only do it if there already is a time, though.
  if ($wp->{'time'}) {
    $wp->{'time'} = DateTime->from_epoch( epoch => $wp->{'time'} )
                             ->iso8601();
  }
  # Join the fields with a comma and print them to the output file
  print $fh_out join(',', (
    $wp->{'time'},
    $wp->{'lat'},
    $wp->{'lon'},
    $wp->{'ele'},
    $wp->{'name'},
    $wp->{'sym'},
    $wp->{'type'},
    $wp->{'desc'},
  )), "\n"; # Add a newline at the end
}
# Close the output file
close $fh_out;

Let's take this in steps:

  • use strict and use warnings enforce rules like declaring variables and tell you about common mistakes that are the hardest to find.
  • use Geo::Gpx and use DateTime are the modules we use. Geo::Gpx is going to handle the parsing for us. We need DateTime to make unix timestamps into readable dates and times.
  • The open function opens a file. $fh_in is the variable that holds the filehandle. The GPX file we want to read is fells_loop.gpx which I took the liberty of borrowing from topografix.com. You can find more info on open in perlopentut.
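  • We create a new Geo::Gpx object called $gpx and use our filehandle $fh_in to tell it where to read the XML data from. By convention, most Perl modules with an object-oriented interface provide a new method as their constructor.
  • close closes the input filehandle once the file has been parsed.
  • The second open, this time with '>', opens the output file fells_loop.csv for writing.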
  • We print to a filehandle by putting it as the first argument to print. Note that there is no comma after the filehandle. The \n is a newline character.

The foreach loop takes the return value of the waypoints method of the Geo::Gpx object. This value is an array reference. Think of it as a pointer to an array; here each element of that array is in turn a hash reference (see perlref if you want to know more about references). In each iteration of the loop, the next element of that array ref (which represents a waypoint in the GPX data) will be put into $wp. If printed with Data::Dumper, one of them looks like this:

$VAR1 = {
      'ele' => '64.008000',
      'lat' => '42.455956',
      'time' => 991452424,
      'name' => 'SOAPBOX',
      'sym' => 'Cemetery',
      'desc' => 'Soap Box Derby Track',
      'lon' => '-71.107483',
      'type' => 'Intersection'
    };

Now the map is a bit tricky. As we just saw, there are 8 keys in the hashref. Unfortunately, some of them are sometimes missing. Because we have use warnings, we will get a warning if we try to access one of these missing values. We have to create these keys and put an empty string '' in there.

map is great for that. It is very much like a foreach loop, but shorter. It takes a BLOCK of code that will be applied to each element of the list it is handed as the second argument. We use the qw operator to create that list. qw is short for quoted words and it does just that: it returns a list of the strings in it, but quoted. We could also have said ('time', 'lat', 'lon'... ).

In the BLOCK, we access each key of $wp. $_ is the loop variable. In the first iteration it will hold 'time', then 'lat' and so on. Since $wp is a hashref, we need the -> to access its keys. The curly braces tell Perl that it's a hashref. The ||= operator assigns a value to our hashref element only if its current value is false, which covers undefined (missing) values as well as the empty string and 0. On Perl 5.10 and later, //= tests for definedness only.

Now, if there is a time value (the empty string we just assigned when the date was not set counts as false, so it is regarded as 'there is none'), we replace the unix timestamp with a proper date. DateTime helps us to do that. The from_epoch method gets the unix timestamp as an argument. It returns a DateTime object, on which we can directly call the iso8601 method.

This is called chaining. Some modules can do it. It is similar to what jQuery's JavaScript objects do. The unix timestamp in our hashref is replaced with the result of the DateTime operation.

  • Now we print to our filehandle again. join is used to put commas between the values. We also put a newline at the end again.
  • Once we're done with the loop, we close the filehandle.
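Since the question was originally about Python, here is a rough Python 3 counterpart of the per-waypoint steps described above: default the missing fields to an empty string, turn the unix timestamp into an ISO 8601 string, and join everything with commas. It is only a sketch under assumptions: the function name waypoint_to_csv_line is made up for illustration, and the sample dictionary is copied from the Data::Dumper output above rather than read from a GPX file.

```python
from datetime import datetime, timezone

# Same column order as the header line printed by the Perl script.
FIELDS = ["time", "lat", "lon", "ele", "name", "sym", "type", "desc"]


def waypoint_to_csv_line(wp):
    """Return one comma-joined line for a waypoint dict (hypothetical helper)."""
    # dict.get() supplies the default, so missing keys need not be created first.
    row = {key: wp.get(key, "") for key in FIELDS}
    # Convert the unix timestamp only when one is present.
    if row["time"] != "":
        moment = datetime.fromtimestamp(row["time"], tz=timezone.utc)
        row["time"] = moment.strftime("%Y-%m-%dT%H:%M:%S")
    return ",".join(str(row[key]) for key in FIELDS)


# Sample waypoint copied from the Data::Dumper output shown earlier.
soapbox = {
    "ele": "64.008000", "lat": "42.455956", "time": 991452424,
    "name": "SOAPBOX", "sym": "Cemetery", "desc": "Soap Box Derby Track",
    "lon": "-71.107483", "type": "Intersection",
}
line = waypoint_to_csv_line(soapbox)
print(line)
```

Like the Perl script, this joins the values naively, so a field that itself contains a comma would need proper CSV quoting.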

All in all, I'd say this is pretty simple and also quite readable, isn't it? I tried to make it a healthy mix of overly verbose syntax and a perlish flavor.

Thanks for your script! I went to CPAN, looked at the README, and am having errors. The perl Makefile.PL command resulted in: Optional ExtUtils::MakeMaker::Coverage not available Argument "6.57_05" isn't numeric in numeric ge (>=) at Makefile.PL line 34. Checking if your kit is complete... Looks good Warning: prerequisite DateTime::Format::ISO8601 0 not found. Warning: prerequisite HTML::Entities 0 not found. Warning: prerequisite XML::Descent 1.01 not found. Writing Makefile for Geo::Gpx Writing MYMETA.yml. I proceeded with make test, and 8/10 tests and 3/3 subtests failed. I tried to run only lat, lon, elev, with no luck.

So I have 4 pages of errors from make test. I attempted to remove time and all other fields from the text aside from lat, lon, elev and run it anyway, with no luck. I read the first 3 chapters of the Beginning Perl book yesterday, so I'm hoping there's an easy fix. I also tried to reinstall under sudo, with no luck. The script makes sense and I appreciate the explanation portion as well. Being a novice, I am scratching my head at the moment.

Have you read a manual on how to install cpan modules? Or did you try to download it from the CPAN website? If you use the command line tool, it will install all the dependencies.

Ah, in Beginning Perl it should tell you all about cpan in Chapter 2. If you're on Windows with ActivePerl, there's also a program called ppm that will give you a nice GUI to install modules. You can use either to get the modules you need with all the dependencies in one go.

Sounds great, I will look into this. You are correct, I did get the download from the site; I will try installing from the command prompt in bash on Ubuntu instead.

perl - How to extract .gpx data with python - Stack Overflow

python perl extract gpx

Have a look at GPSBabel. It's not a Python module, but it can be used standalone without an installer. You can call it from Python through popen or similar calls.
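A minimal sketch of calling GPSBabel from Python 3 with the subprocess module might look like this. It assumes the gpsbabel executable is on your PATH; the helper names and the file names walk.gpx and walk.kml are placeholders made up for the example. The -i/-f and -o/-F options are GPSBabel's input and output format/file switches.

```python
import subprocess


def build_gpsbabel_command(gpx_path, kml_path, executable="gpsbabel"):
    """Build the argument list for a GPX-to-KML conversion (hypothetical helper)."""
    # -i/-f name the input format and file, -o/-F the output format and file.
    return [executable, "-i", "gpx", "-f", gpx_path, "-o", "kml", "-F", kml_path]


def convert_gpx_to_kml(gpx_path, kml_path):
    """Run GPSBabel; raises CalledProcessError if it exits with an error."""
    subprocess.check_call(build_gpsbabel_command(gpx_path, kml_path))


# Placeholder file names; only the command is printed here, nothing is executed.
command = build_gpsbabel_command("walk.gpx", "walk.kml")
print(" ".join(command))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.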

gps - Python module that can convert .gpx to .kml - Stack Overflow

gps kml python-module gpx

The GPXTrackPoint class has a method speed_between(another_gpx_track_point). Just iterate through all points and call this function for neighboring points.

The GPX file format also allows the speed to be saved in the GPX point directly. If that's the case for your track, you don't need to calculate it with speed_between(); just use the gpx_track_point.speed attribute.

I'm writing this from memory (ignore typos :), but this is more or less the code:

# gpx is the object returned by gpxpy.parse()
for track in gpx.tracks:
  for segment in track.segments:
    for point_no, point in enumerate(segment.points):
      if point.speed is not None:
        # The file stores the speed for this point directly
        print("Speed=", point.speed)
      elif point_no > 0:
        # Otherwise derive it from the previous point
        print("Calculated speed=", point.speed_between(segment.points[point_no - 1]))

BTW, you can calculate the speeds between the current point and the previous one and the current point and the next one and then average those values to have a better speed.
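The averaging idea can be sketched without gpxpy as follows (Python 3). This is an illustration under assumptions, not gpxpy's own implementation: points are plain (lat, lon, time) tuples, the distance is a 2D haversine distance on a sphere of radius 6371 km that ignores elevation, and the three fixes at the bottom are made-up values.

```python
import math
from datetime import datetime, timedelta


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres on a sphere (elevation is ignored)."""
    radius = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))


def speed_between(a, b):
    """Speed in m/s between two (lat, lon, time) fixes, or None if no time passed."""
    seconds = abs((b[2] - a[2]).total_seconds())
    if seconds == 0:
        return None
    return haversine_m(a[0], a[1], b[0], b[1]) / seconds


def smoothed_speeds(points):
    """Average the speed to the previous and to the next fix for every point."""
    result = []
    for i in range(len(points)):
        candidates = []
        if i > 0:
            candidates.append(speed_between(points[i - 1], points[i]))
        if i < len(points) - 1:
            candidates.append(speed_between(points[i], points[i + 1]))
        # The first and last point only have one neighbour.
        candidates = [s for s in candidates if s is not None]
        result.append(sum(candidates) / len(candidates) if candidates else None)
    return result


# Made-up fixes, 10 seconds apart, moving north along one meridian.
start = datetime(2014, 2, 27, 4, 32, 22)
fixes = [
    (0.000, 0.0, start),
    (0.001, 0.0, start + timedelta(seconds=10)),
    (0.003, 0.0, start + timedelta(seconds=20)),
]
speeds = smoothed_speeds(fixes)
print(speeds)
```

The middle point ends up with the mean of its two neighbouring segment speeds, which dampens the jitter of individual fixes.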

xml - GPX parsing. Calculate Speed. Python - Stack Overflow

python xml gps gpx

GPX is an XML format, so use a fitting module like lxml or the included ElementTree XML API to parse the data, then output to CSV using the python csv module.
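As a minimal sketch of that approach (Python 3, standard library only), the following reads waypoints with ElementTree and writes them with the csv module. It assumes a GPX 1.1 file, whose elements live in the http://www.topografix.com/GPX/1/1 namespace; the function name, the chosen columns, and the second sample waypoint are made up for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

# GPX 1.1 elements are namespaced, so lookups need the namespace mapping.
GPX_NS = {"gpx": "http://www.topografix.com/GPX/1/1"}
FIELDS = ["lat", "lon", "ele", "time", "name"]


def waypoints_to_csv(gpx_text, out):
    """Write one CSV row per <wpt> element to the file-like object `out`."""
    root = ET.fromstring(gpx_text)
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(FIELDS)
    count = 0
    for wpt in root.findall("gpx:wpt", GPX_NS):
        # lat and lon are attributes; the other fields are optional child elements.
        row = [wpt.get("lat", ""), wpt.get("lon", "")]
        for tag in ("ele", "time", "name"):
            row.append(wpt.findtext("gpx:" + tag, default="", namespaces=GPX_NS))
        writer.writerow(row)
        count += 1
    return count


# Small sample document; the second waypoint is invented to show CSV quoting.
SAMPLE = """<gpx xmlns="http://www.topografix.com/GPX/1/1" version="1.1" creator="example">
  <wpt lat="42.455956" lon="-71.107483">
    <ele>64.008000</ele>
    <name>SOAPBOX</name>
  </wpt>
  <wpt lat="42.5" lon="-71.2"><name>Track, north end</name></wpt>
</gpx>"""

buffer = io.StringIO()
written = waypoints_to_csv(SAMPLE, buffer)
print(buffer.getvalue())
```

Using csv.writer instead of joining strings by hand means values that contain commas or quotes are escaped correctly.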

I also found a python GPX parsing library called gpxpy that perhaps gives a higher-level interface to the data contained in GPX files.

Perl would be equally suited for the task; there are Perl XML parsers and CSV libraries, just like for python. However, you may find Python easier to learn; in my personal opinion Perl too easily devolves into line-noise.

perl - How to extract .gpx data with python - Stack Overflow

python perl extract gpx

I tried adding "of_" at the beginning of the filename, as in outfile = outputGdb + "\\" + "of_" + featureName + ".shp", but it did not work with that either; I got the same error.

I figured it out: since I am putting it into a geodatabase, ".shp" is not the correct extension to use. I just got rid of the ".shp" and it worked!

python - Using arcpy.gpxtofeature to convert gpx files en mass - Stack...

python arcgis arcpy gpx

ogr2ogr (part of GDAL) is a simple and straightforward Unix shell tool to load a GPX file into PostGIS.

ogr2ogr -append -f PostgreSQL PG:dbname=walks walk.gpx

ogr2ogr creates its own database tables in PostGIS with its own schema. The table tracks has one row per GPS track; tracks.wkb_geometry contains the GPS track itself as a MultiLineString. The table track_points contains individual location fixes (with timestamps).

Here is what the database walks looks like before the import:

walks=# \d
               List of relations
 Schema |       Name        | Type  |  Owner   
--------+-------------------+-------+----------
 public | geography_columns | view  | postgres
 public | geometry_columns  | view  | postgres
 public | raster_columns    | view  | postgres
 public | raster_overviews  | view  | postgres
 public | spatial_ref_sys   | table | postgres
(5 rows)

... and after the import:

walks=# \d
                    List of relations
 Schema |           Name           |   Type   |  Owner   
--------+--------------------------+----------+----------
 public | geography_columns        | view     | postgres
 public | geometry_columns         | view     | postgres
 public | raster_columns           | view     | postgres
 public | raster_overviews         | view     | postgres
 public | route_points             | table    | postgres
 public | route_points_ogc_fid_seq | sequence | postgres
 public | routes                   | table    | postgres
 public | routes_ogc_fid_seq       | sequence | postgres
 public | spatial_ref_sys          | table    | postgres
 public | track_points             | table    | postgres
 public | track_points_ogc_fid_seq | sequence | postgres
 public | tracks                   | table    | postgres
 public | tracks_ogc_fid_seq       | sequence | postgres
 public | waypoints                | table    | postgres
 public | waypoints_ogc_fid_seq    | sequence | postgres
(15 rows)
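To load several GPX files, the same ogr2ogr command can be repeated once per file, since -append adds to the existing tables. A small Python 3 wrapper might look like the sketch below. It assumes ogr2ogr is on your PATH and reuses the database name walks from above; the function names are made up, and the run parameter exists only so the loop can be tried without actually invoking ogr2ogr.

```python
import glob
import os
import subprocess


def ogr2ogr_command(gpx_path, dbname):
    """Argument list equivalent to the ogr2ogr command line shown above."""
    return ["ogr2ogr", "-append", "-f", "PostgreSQL", "PG:dbname=" + dbname, gpx_path]


def load_directory(directory, dbname, run=subprocess.check_call):
    """Append every *.gpx file in `directory` to the PostGIS database `dbname`."""
    paths = sorted(glob.glob(os.path.join(directory, "*.gpx")))
    for path in paths:
        # One ogr2ogr call per file; -append keeps the rows loaded earlier.
        run(ogr2ogr_command(path, dbname))
    return paths


# Only print the command for a single file here; nothing is executed.
print(" ".join(ogr2ogr_command("walk.gpx", "walks")))
```

Calling load_directory("/path/to/gpx", "walks") (a placeholder path) would then run the import for each file in turn and stop at the first failure.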

python - How can I load multiple gpx files into PostGIS? - Stack Overf...

python sql gis postgis gpx

If you are using Linux, you may try this:

  • Use a program to convert GPX to SHP: gpx2shp

sudo apt-get install gpx2shp
...
gpx2shp -o output_file.shp infile.gpx

  • Then load that file into a PostGIS-enabled database with shp2pgsql

sudo apt-get install postgis
...
shp2pgsql output_file.shp gis_table

You may of course use a pipe and do it all in one command line: shp2pgsql writes SQL to standard output, so its output can be piped straight into psql.

EDIT If you still want a python script, you may find help here http://pypi.python.org/pypi/gpxtools

thanks for pointers. would that solution give me separate shp file for each gpx input? or is it possible to combine all gpx files into one shp?

@radek You could import the gpx files into R, merge them, and export them to a shapefile using rgdal. Of course, once you have the gps files in R, you could also use SQL to import the data into a PostGIS DB.

@Roman: thanks. i did some work with R + SQL + Postgres so should be able to sort this part. is there any particular method/package you could recommend to work with gpx files tho?

There's a function to read gpx (readGPS I think) files in maptools (uses gpsbabel). I also wrote a gpx scraper function that uses only the XML package. Drop me a line if you'd be interested in giving it a try (looking for beta testers anyway :) ).

@Roman: i'm trying to find my way through this problem in R as well. afaik readGPS() is interface to GPSBabel software. i tried using system() call with manual definition of parameters, but had no success so far. would greatly appreciate if you could share your work :]

python - How can I load multiple gpx files into PostGIS? - Stack Overf...

python sql gis postgis gpx

It isn't really documented anywhere (in my opinion), so I'll post it here. Instead of letting the parser try to open the file, generate a file object first, and feed that to the parser, so:

import gpxpy
f = open(path_to_gpx_file, 'r')
p = gpxpy.parse(f)

python - Using GPXPY to parse gpx file results in not well-formed inva...

python xml gpx

Try to incorporate this into your code. The regular expression extracts all the numbers in each line:

import re

gpx = open('G:\\14022705.gpx', 'r')
gpx_list_out = open('G:\\Position_Data2.csv', 'w')

for line in gpx:
    if 'trkpt ' in line:
        # Pull every number out of the line
        numerical_value = re.findall(r"[-+]?\d*\.\d+|\d+", line)
        print(numerical_value)
        # End each record with a newline so every point gets its own row
        gpx_list_out.write(",".join(numerical_value) + "\n")

# Close both files so the CSV is flushed to disk
gpx.close()
gpx_list_out.close()

Thanks! I've tried that and got ['-42.6150634', '+147.4397831'] ['1.431'], which really tidies it up. Doesn't want to write to the csv file though, which is strange as it was before!

This is the first bit of the original file: <?xml version="1.0" encoding="UTF-8" standalone="no" ?> <gpx creator="NC" version="2.0" > <metadata> <link href="mikrokopter.de" <text>MikroKopter</text> </link> <desc>FC HW:2.1 SW:2.0a + NC HW:2.0 SW:2.0a</desc> </metadata> <trk> <name>Flight</name> <trkseg> <trkpt lat="-42.6150634" lon="+147.4397831"> <ele>1.431</ele> <time>2014-02-27T04:32:22Z</time>

Yes. I changed the code, but since you have an XML file I recommend that you use an XML parser, for instance BeautifulSoup.
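To illustrate the XML-parser route, here is a minimal Python 3 sketch. Note that it swaps in the standard library's ElementTree rather than BeautifulSoup, purely so it runs without extra installs. The sample is a shortened, repaired version of the file excerpt above (the metadata block is left out), and the helper names are made up. Tags are matched by their local name so the code works whether or not the file declares a GPX namespace.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag, if there is one."""
    return tag.rsplit("}", 1)[-1]


def track_points(gpx_text):
    """Return (lat, lon, ele, time) string tuples for every <trkpt> element."""
    rows = []
    for elem in ET.fromstring(gpx_text).iter():
        if local_name(elem.tag) != "trkpt":
            continue
        # Collect the text of the direct children, keyed by local tag name.
        children = {local_name(c.tag): (c.text or "").strip() for c in elem}
        rows.append((
            elem.get("lat", ""),
            elem.get("lon", ""),
            children.get("ele", ""),
            children.get("time", ""),
        ))
    return rows


# Shortened sample based on the excerpt quoted in the comment above.
SAMPLE = """<gpx creator="NC" version="2.0">
<trk><name>Flight</name><trkseg>
<trkpt lat="-42.6150634" lon="+147.4397831">
<ele>1.431</ele>
<time>2014-02-27T04:32:22Z</time>
</trkpt>
</trkseg></trk>
</gpx>"""

rows = track_points(SAMPLE)
for row in rows:
    print(",".join(row))
```

Unlike the line-by-line regular expression, this keeps latitude, longitude, elevation, and time of one track point together in a single row, even though they sit on different lines of the file.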

How to clean up .gpx data before writing to a .csv file in python - St...

python-2.7 csv gpx

register_namespace() controls the prefixes used when serializing XML, but it does not affect parsing.

from xml.etree import ElementTree as ET

tree = ET.parse("gpx.xml")
for elem in tree.findall("{http://www.topografix.com/GPX/1/1}wpt"):
    print elem
<Element '{http://www.topografix.com/GPX/1/1}wpt' at 0x201c550>
<Element '{http://www.topografix.com/GPX/1/1}wpt' at 0x201c730>

With lxml, you can also use this:

from lxml import etree

NSMAP = {"gpx": "http://www.topografix.com/GPX/1/1"}

tree = etree.parse("gpx.xml")
for elem in tree.findall("gpx:wpt", namespaces=NSMAP):
    print elem
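The difference between parsing and serializing can be seen in a short Python 3 sketch with the standard library's ElementTree. The one-waypoint document is a made-up sample; the point is only that register_namespace() changes the prefix used on output while lookups still need the namespace either way.

```python
import xml.etree.ElementTree as ET

GPX_URI = "http://www.topografix.com/GPX/1/1"
# Made-up one-waypoint document in the GPX 1.1 namespace.
SAMPLE = '<gpx xmlns="%s"><wpt lat="42.4" lon="-71.1"/></gpx>' % GPX_URI

root = ET.fromstring(SAMPLE)

# Parsing: a bare tag name finds nothing, the qualified name does.
bare = root.findall("wpt")
qualified = root.findall("{%s}wpt" % GPX_URI)

# Serializing before registration uses a generated prefix such as ns0.
before = ET.tostring(root, encoding="unicode")

# Registering the URI with an empty prefix makes it the default namespace on output.
ET.register_namespace("", GPX_URI)
after = ET.tostring(root, encoding="unicode")

# Lookups are unaffected by the registration.
still_bare = root.findall("wpt")

print(len(bare), len(qualified), len(still_bare))
print(before)
print(after)
```

Note that register_namespace() is global to the process, so it affects every later serialization in the same program.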

This is extremely useful, clarification of exactly what register_namespace() actually does. Serialization, if I am understanding correctly, only is useful when creating output. For what I am doing, it appears gpxpy's parser module may actually be a good starting point. Thank you for the help in clarifying this, though. Again, Stackoverflow shows me how I need to proceed!

Read GPX using Python ElementTree.register_namespace? - Stack Overflow

python-2.7 elementtree gpx

import gpxpy

gpx_sample = """...your GPX sample here..."""

gpx = gpxpy.parse(gpx_sample)

for wpt in gpx.waypoints:
    print wpt.latitude, wpt.longitude

Even if you don't want to use the library you can just check the code to see how it parses the XML file.

Yes, it is a shameless plug and a fair question. First, I want to understand how to use this, because it applies in another project I am working on. Much of the time, the doing is for the sake of learning. You used minidom and I would like to use ElementTree since I understand how to work with it...provided I can get over this hump. Second, I do not need everything in gpxpy. This will eventually be plugged in with ArcGIS, so I have everything I need for analysis and more in there. As a result, all I need is the ability to read these tags.

BTW, I'm using minidom only if lxml is not available (since lxml is way faster than minidom).

Read GPX using Python ElementTree.register_namespace? - Stack Overflow

python-2.7 elementtree gpx