I've been playing around with R to correlate GPS and heart rate monitor logs from when we were cycling around in Italy.
One of the problems that I've run into is that the GPS position accuracy degrades significantly when less than 4 satellites are used, altitude data likewise degrades significantly when less than 6 satellites are used.
These high error data points add noise to the data set and complicate analysis. One option would be to find a way to calculate the error in position for each point, and move that data around with the position data.
That's hard. An easier way is to simply convert the inaccurate points to NULL (NA in R).
To import the data from my heart rate monitor, I used s710. To import the data from the GPS NMEA logs, I created a gpsbabel filter:
DESCRIPTION GRASS GIS v.in.ascii
SHORTLEN 8
EXTENSION grass
# FILE LAYOUT DEFINITIIONS:
FIELD_DELIMITER PIPE
RECORD_DELIMITER NEWLINE
BADCHARS ,"
IFIELD LON_DECIMAL,"","%f"
IFIELD LAT_DECIMAL,"","%f"
IFIELD ALT_METERS,"", "%f"
IFIELD GMT_TIME,"","%m/%d/%Y %H:%M:%S"
IFIELD PATH_SPEED,"","%f"
IFIELD PATH_COURSE,"","%f"
IFIELD GPS_HDOP,"","%f"
IFIELD GPS_VDOP,"","%f"
IFIELD GPS_PDOP,"","%f"
IFIELD GPS_SAT,"","%d"
IFIELD GPS_FIX,"","%s"
I had originally created this format for easy import into GRASS GIS, but it works nicely for importing into R :
Lat|Long|Altitude|timestamp|speed|bearing|hdop|vdop|pdop|satellites|fix
11.128023|43.430142|143.300000|09/26/2006 07:42:34|0.092600|11.690000|1.600000|0.000000|0.000000|6|3d
11.128025|43.430142|140.400000|09/26/2006 07:42:35|0.174911|177.690002|1.600000|0.000000|0.000000|6|3d
...
Read the GPS and heart rate data into an R frame with read.table.
raw_gps<-read.table("060926_bike_colle_valdelsa_siena.grass", sep="|", header=TRUE)
raw_hr1<-read.table("060926.0947.txt", header=TRUE)
raw_hr2<-read.table("060926.1447.txt", header=TRUE)
Create vectors to mask off the bad data.
mask_4sat <- !((!(raw_gps$satellites > 4)) & NA)
mask_6sat <- !((!(raw_gps$satellites > 6)) & NA)
This uses the identity that FALSE & NA FALSE, but TRUE & NA NA.
More data massaging is in order
The way I choose to deal with that is to create time series using ts(), and extend and resample the time series with window().
speed <- ts( raw_gps$speed * mask_4sat, start=c(7, 2554), frequency=3600)
alt <- ts( raw_gps$Altitude * mask_6sat, start=c(7, 2554), frequency=3600)
hr1 <- ts(data=raw_hr1$HR, start=c(7,564), frequency=720)
hr2 <- ts(data=raw_hr2$HR, start=c(12,567), frequency=720)
I've chose my unit of time to be the hour. The start argument to ts is particularly important, to get all of the data to line up. The easiest way to get all of this is to ensure that the clock on the heart rate monitor is set to UTC, and the GPS logs are in UTC. The frequency argument reflects the sampling rate of the GPS and heart rate monitor.
Now, mash all of this data together into a time series matrix, padding and resampling the constituent vectors to be uniform.
bike_cole_valdelsa_to_sienna <- ts(
matrix(
c(
window(alt, start=7.5, end=13.5, frequency=720, extend=TRUE),
window(speed, start=7.5, end=13.5, frequency=720, extend=TRUE),
window(hr1, start=7.5, end=13.5, frequency=720, extend=TRUE),
window(hr2, start=7.5, end=13.5, frequency=720, extend=TRUE)
),
ncol=4),
start=7.5, end=13.5, frequency=720)
colnames(bike_cole_valdelsa_to_sienna) <- c("altitude", "speed", "HR1", "HR2")
Now the data is all referenced together, and ready for some analysis.

jack |
posted Mon Oct 30 13:15:44 2006 |
#
category:
projects/R
A weblog by Jack Cummings