Haar Wavelet Transform Calculation Primer

The discrete wavelet transform (DWT) calculation is carried out in detail for a signal vector containing 8 data elements. The first step is to compute the averages and differences (divided by two) for each pair of adjacent signal components. This is shown in Figures 1-4.

The next step involves computing the averages and differences (again divided by two) of the averages just computed. This is shown in Figures 5 and 6.

Finally, the last step is to compute the average and difference of the two averages from the previous step. This is shown in Figure 7.

The discrete wavelet transform vector for the signal supplied is given by the bottom row, shown in Figure A-8. The color-coding is included to indicate where each of the computed differences (and the final average) is placed in the vector.

As the number of signal data points increases, so does the number of computations of averages and differences. This process lends itself quite readily to automation.
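As a minimal sketch of that automation, the Python function below repeats the pairwise averaging and differencing until a single average remains. The function name `haar_dwt`, the sample values, and the sign convention for the difference (first value minus second, divided by two) are assumptions made for illustration; the figures may order or sign the differences differently.

```python
def haar_dwt(signal):
    """Return [overall average, coarsest difference, ..., finest differences]."""
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    averages = [float(v) for v in signal]
    details = []  # half-differences, coarsest level first
    while len(averages) > 1:
        pairs = list(zip(averages[0::2], averages[1::2]))
        # Differences from this coarser level are placed ahead of the finer ones.
        details = [(a - b) / 2 for a, b in pairs] + details
        averages = [(a + b) / 2 for a, b in pairs]
    return averages + details


# Hypothetical 8-element signal, not the one used in the figures.
sample = [4, 6, 10, 12, 8, 6, 5, 5]
coefficients = haar_dwt(sample)
print(coefficients)
```

For eight data points the loop runs three times, matching the three steps described above.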

Haar Wavelet Transform Calculations Using MS Excel

Background on Calculations with the DWT

The discrete wavelet transform provides a system for constructing or approximating a signal or function. Unlike Fourier series, which reflect only the frequency or spectral components of a signal, wavelets provide time and frequency localization of signal specifics, which is necessary to reconstruct time-varying, non-stationary processes [1, 2]. The discrete wavelet transform calculation is conducted with respect to a Haar basis function, in which individual averages and differences (or details, as they are sometimes referred to in the art) are computed with respect to the raw signal data. Let’s begin by considering a small sample signal of raw data collected from a patient, defined by Equation 1:

The process of computing wavelet coefficients from this vector is straightforward and is illustrated in Figure 1.

The signal is decomposed into a series of averages and differences, where the average is calculated according to normal convention, and the difference is actually half the difference between two adjacent raw signal values. Thus, from Equations 2 and 3:

The computations illustrated in Figure 1 proceed as follows: the average of each raw sample is computed with respect to its immediate neighbor, together with the difference (divided by 2). Once these are computed, the average and difference of these results are then computed. This process is continued until the complete ensemble (that is, the single value and difference) corresponding to the entire signal is determined. The first wavelet coefficient is given by the ensemble average corresponding to the longest scale value over the entire interval. The next wavelet coefficient corresponds to the size of the difference of the averages at the next scale up. The remaining coefficients follow the pattern of the differences between the averages at finer and finer scale (in general). Thus, the vector of wavelet coefficients given the data sample above appears as follows, represented as Equation 4:

We can represent this relationship between the wavelet coefficients and the raw signal as Equation 5:

Where H4 represents a 4 x 4 Haar matrix having the form:

Alternatively, given the raw signal, the wavelet coefficients may be found directly from Equation 7:

The Haar matrix may be inverted using standard methods. The creation of the Haar matrix follows a predictable pattern as the number of rows and columns increases. However, the size of the matrix grows as 2^n, where n is a positive integer. Thus, in the Haar basis, the number of data points must also be a power of two. We can expand this Haar basis to an H8 basis, illustrated here in Equation 8:

The number of rows and columns in a Haar basis is always a power of two, 2^n (so H4 corresponds to n = 2 and H8 to n = 3). We can consider an example problem now to illustrate the method. Let’s first expand the original signal from four to eight elements. This larger data quantity will help to illustrate some other features of the discrete wavelet transform and why it is being considered for the specific application. The vector for this data set is given by Equation 9:

The vector of wavelet coefficients associated with this signal, found using the H8, is as follows:
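A sketch of how such a matrix can be generated for any power-of-two size follows. It assumes the unnormalized form in which the signal equals the Haar matrix times the coefficient vector, with the first column carrying the overall average; the function names and the sample signal are hypothetical and are not the values of Equation 9. Because the columns of this matrix are mutually orthogonal, the coefficients can be recovered column by column rather than through a general matrix inversion.

```python
def haar_matrix(size):
    """Build the size x size unnormalized Haar matrix H, where signal = H * coefficients."""
    if size < 1 or size & (size - 1):
        raise ValueError("size must be a power of two")
    columns = [[1] * size]  # first column: overall average
    width = size
    while width > 1:
        half = width // 2
        for start in range(0, size, width):
            col = [0] * size
            col[start:start + half] = [1] * half
            col[start + half:start + width] = [-1] * half
            columns.append(col)
        width = half
    # Transpose the list of columns into a list of rows.
    return [list(row) for row in zip(*columns)]


def coefficients_from_signal(signal):
    """Solve signal = H * c by projecting the signal onto each orthogonal column."""
    h = haar_matrix(len(signal))
    coeffs = []
    for j in range(len(signal)):
        column = [row[j] for row in h]
        dot = sum(cj * xj for cj, xj in zip(column, signal))
        norm = sum(cj * cj for cj in column)
        coeffs.append(dot / norm)
    return coeffs


def signal_from_coefficients(coeffs):
    """Multiply H by the coefficient vector to rebuild the signal."""
    h = haar_matrix(len(coeffs))
    return [sum(hij * cj for hij, cj in zip(row, coeffs)) for row in h]


for row in haar_matrix(4):
    print(row)

sample = [4, 6, 10, 12, 8, 6, 5, 5]  # hypothetical signal
coefficients = coefficients_from_signal(sample)
print(coefficients)
print(signal_from_coefficients(coefficients))
```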

Now, one of the benefits of wavelet coefficients is that they establish the relative scale of the differences with respect to the overall signal average. This is important because, in terms of reproducing the signal, the values of these wavelet coefficients establish their relative impact on the overall signal. Thus, compression of the original signal can be achieved (at a loss) by discarding certain of these coefficients based on establishing a sensitivity threshold.

Defining the statistical significance level of this threshold can be done in accord with well-documented practices, especially relative to setting confidence intervals with respect to a known distribution [3, 4, 5]. However, the discarding of coefficients is not the objective of the wavelet transform in medical applications: indeed, removing potentially important information from the raw signal can be detrimental and will provide the clinician with incomplete data on the patient. Instead, wavelet transforms provide the capability to record all of the data and to automatically filter it so that (1) communication of all data elements between the clinical environment and the health enterprise will not be overwhelmed; and, (2) the ability to retrieve any amount of the data, from an ensemble to detailed temporal changes, can be mined at will by clinicians and researchers without requiring that all data be retrieved from the data repository in any one request.

One way to illustrate these concepts is by applying an exclusion threshold on the smallest values of coefficients: the magnitude of the wavelet coefficients provides insight into the level of contribution they make to the character of the overall raw signal. Hence, by omitting certain coefficients it becomes possible to exclude noise, artifact, or other components that are judged to be of minor influence to the overall raw data sample.

For instance, consider the table of wavelet coefficients (Table 1). The column on the left is the independent variable (time). Each subsequent set of columns defines the set of Haar-basis wavelet coefficients, and the resulting signal value, beginning with no applied threshold up to a value of 30% threshold. The threshold value is computed by multiplying the threshold percentage by the magnitude of the largest wavelet coefficient. For instance, a 10% threshold applied to a largest coefficient of -4 yields an absolute threshold value of 0.4. In this case, one wavelet coefficient is discarded, because the 10% threshold case requires the absolute value of every retained coefficient to be greater than 0.4.

At the 20% level, the threshold value is 0.8, but no additional coefficients fall below this threshold, so still only one coefficient is discarded (i.e., set to zero so that its contribution will be ignored for signal reconstruction). In comparing the reconstructed signals with thresholds of 10% and 20% to the original signal (no threshold applied), one can see that there are differences in the reconstructed signal. These differences have a maximum deviation of 0.25 between the reconstructed and the original signal.

In viewing the 30% threshold columns, three coefficients are discarded. Here, the deviation between the original and reconstructed signals is no larger than 1.25. Thus, by discarding wavelet coefficients from the basis, the reconstructed signal becomes an approximation of the original signal. As the discard threshold approaches zero, the difference between the reconstructed and original signals approaches zero. Figure 2 provides a comparative view of these data by displaying all of these signals on one overlay. To the casual observer, there does not appear to be much difference between the lossy and the lossless cases: the signal data points all appear to be close to one another.
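The thresholding procedure can be sketched in Python as follows. This is only an illustration under stated assumptions: the sample signal is hypothetical (it is not the data of Table 1), the threshold is taken relative to the largest coefficient magnitude including the overall average, and the helper names `haar_dwt`, `haar_idwt`, and `apply_threshold` are invented for this example.

```python
def haar_dwt(signal):
    """Forward Haar transform by repeated pairwise averaging and half-differencing."""
    averages = [float(v) for v in signal]
    details = []
    while len(averages) > 1:
        pairs = list(zip(averages[0::2], averages[1::2]))
        details = [(a - b) / 2 for a, b in pairs] + details
        averages = [(a + b) / 2 for a, b in pairs]
    return averages + details


def haar_idwt(coeffs):
    """Invert haar_dwt: each average and half-difference yields two finer values."""
    values = [coeffs[0]]
    pos = 1
    while pos < len(coeffs):
        details = coeffs[pos:pos + len(values)]
        rebuilt = []
        for avg, diff in zip(values, details):
            rebuilt.extend([avg + diff, avg - diff])
        pos += len(values)
        values = rebuilt
    return values


def apply_threshold(coeffs, fraction):
    """Keep only coefficients whose magnitude is greater than fraction * max magnitude."""
    limit = fraction * max(abs(c) for c in coeffs)
    return [c if abs(c) > limit else 0.0 for c in coeffs]


sample = [4, 6, 10, 12, 8, 6, 5, 5]  # hypothetical signal, length 8
coefficients = haar_dwt(sample)

for fraction in (0.0, 0.1, 0.2):
    kept = apply_threshold(coefficients, fraction)
    rebuilt = haar_idwt(kept)
    max_dev = max(abs(r - s) for r, s in zip(rebuilt, sample))
    print(fraction, kept, rebuilt, max_dev)
```

With no threshold the reconstruction is exact; as the threshold rises, more coefficients are zeroed and the maximum deviation from the original grows.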

Depending on the behavior of the original signal (that is, its shape, repetitiveness, noise content), the degree of loss vis-à-vis discarding of wavelet coefficients may or may not be acceptable to the end-user. However, in the case of a predictable or repetitive signal, the discarding of wavelet coefficients can have a trivial effect on the reconstruction of the original signal. This latter case can be illustrated effectively with the aid of a revised form of the signal data. The data are contained in Table 2 and plotted in Figure 3.

In this revised case, the raw signal data follow a series of three step functions: values of 8 for three time units, 3 for three time units and –4 for three time units. The wavelet coefficients show that three values are zero. Hence, applying thresholds of up to 30% removes only these zero-valued coefficients, which contribute nothing to the reconstruction. In this instance we are seeing another benefit of the discrete wavelet transform: the ability to “automatically” filter out repeated data. Therefore, all of the reconstructed data shown in Figure 3 overlay the raw signal data. The number of coefficients required to reconstruct this signal is three fewer than the total number of data points contained within the raw signal. Hence, the discrete wavelet transform provides a means for representing the original signal with fewer overall data points. This concept plays a role in the application of the discrete wavelet transform technique to patient vitals data.
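The effect of repeated values can be seen with a short Python sketch. The eight-element step signal below is an assumption loosely modeled on the description above (it is not copied from Table 2), and `haar_dwt` is a hypothetical helper using pairwise averages and half-differences.

```python
def haar_dwt(signal):
    """Forward Haar transform by repeated pairwise averaging and half-differencing."""
    averages = [float(v) for v in signal]
    details = []
    while len(averages) > 1:
        pairs = list(zip(averages[0::2], averages[1::2]))
        details = [(a - b) / 2 for a, b in pairs] + details
        averages = [(a + b) / 2 for a, b in pairs]
    return averages + details


# Assumed step signal: three 8s, three 3s, then -4 for the remaining samples.
step_signal = [8, 8, 8, 3, 3, 3, -4, -4]
coefficients = haar_dwt(step_signal)

# Store only the non-zero coefficients, keyed by their position in the vector.
sparse = {i: c for i, c in enumerate(coefficients) if c != 0}

print(coefficients)
print(len(step_signal), "samples,", len(sparse), "non-zero coefficients")
```

Wherever two neighboring samples are equal, their half-difference is zero, so only the positions and values of the non-zero coefficients need to be stored.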

[1]     C. Sidney Burrus, Ramesh A. Gopinath, Haitao Guo, Introduction to Wavelets and Wavelet Transforms—A Primer; Prentice-Hall, 1998; page 3.

[2]     Tommi Vuorenmaa, “The Discrete Wavelet Transform with Financial Time Series Applications”; Seminar on Learning Systems at the Rolf Nevanlinna Institute; University of Helsinki, April 9th 2003.

[3]     James F. Zolman, Biostatistics: Experimental Design and Statistical Inference; Oxford University Press, 1993; pp 77-99.

[4]     Christopher Torrence, Gilbert P. Compo, “A Practical Guide to Wavelet Analysis,” Bulletin of the American Meteorological Society; Vol. 79, No. 1: 69-71, January 1998.

[5]     Sheldon Ross, A First Course in Probability, 3d Ed.; Macmillan Publishing Company, 1988; pp 336-357.

Import Multiple CSV Files into a Single MS Excel Spreadsheet

The following routine reads multiple (potentially hundreds of) comma-separated-value (CSV) files and writes their content to a single MS Excel worksheet.

Original reference from ExtendOffice.com

```
Sub ImportCSVsWithReference()
'
' This macro reads in all CSV files in a directory and
' writes them to a single new worksheet ("Sheet1")
'
' in a new workbook (that is, the workbook in which this
' Macro is defined).
'
' The data from the separate CSV files are column-listed in
' the Sheet1.
'
' Found at this web site: https://www.extendoffice.com/documents/excel/3388-excel-import-multiple-text-csv-xml-files.html
'
' Developed in this Macro 2018-01-08
'
' John Zaleski
'
' -------------------------------------------
' What follows is directly from the web site:
'
' UpdatebyKutoolsforExcel 2015-12-14
    Dim xSht As Worksheet
    Dim xWb As Workbook
    Dim xStrPath As String
    Dim xFileDialog As FileDialog
    Dim xFile As String

    On Error GoTo ErrHandler

    ' Ask the user for the folder that holds the CSV files
    Set xFileDialog = Application.FileDialog(msoFileDialogFolderPicker)
    xFileDialog.AllowMultiSelect = False
    xFileDialog.Title = "Select a folder"
    If xFileDialog.Show = -1 Then
        xStrPath = xFileDialog.SelectedItems(1)
    End If
    If xStrPath = "" Then Exit Sub

    ' The destination is the active sheet of the workbook containing this macro
    Set xSht = ThisWorkbook.ActiveSheet
    If MsgBox("Clear the existing sheet before importing?", vbYesNo, "Kutools for Excel") = vbYes Then xSht.UsedRange.Clear
    Application.ScreenUpdating = False

    ' Loop over every CSV file in the chosen folder
    xFile = Dir(xStrPath & "\" & "*.csv")
    Do While xFile <> ""
        Set xWb = Workbooks.Open(xStrPath & "\" & xFile)
        ' Insert a new first column and fill it with the source sheet name
        Columns(1).Insert xlShiftToRight
        Columns(1).SpecialCells(xlBlanks).Value = ActiveSheet.Name
        ' Append the file's contents below the last used row of the destination
        ActiveSheet.UsedRange.Copy xSht.Range("A" & Rows.Count).End(xlUp).Offset(1)
        xWb.Close False
        xFile = Dir
    Loop

    Application.ScreenUpdating = True
    Exit Sub

ErrHandler:
    MsgBox "No CSV files found", , "Kutools for Excel"
End Sub
```

Creating Plots in MS Excel Using Visual Basic Code

I am frequently in need of generating scatter and line plots of measured physiological signals. To this end, I like to use many different types of software tools. The one I like the most, however, for rapid manipulation and visualization of data is Microsoft Excel.

Oftentimes I need to create charts from columnar data within Excel. Creating and formatting line plots, particularly if creating many, can be a time-consuming and tedious process. Therefore, I decided to search for ways to automate chart plotting in the format I was looking for. Naturally, I turned to the Internet and to my various MS Excel textbooks.

To my disappointment, however, after spending what was perhaps a week of evenings searching, I honestly could not find anything that fit the bill on the open Internet. That is to say, it was not as if I could not find any help… simply that I could not find anything that already provided me with exactly what I was looking for or even a close template that I could pilfer and customize.

Thus, it became necessary for me to slog through the process myself. The purpose of this entry is to simply communicate what I found and did so that another wayward traveler might be saved from some time and effort… although, I can imagine that what I provide below is not precisely what someone else is looking for, either.

Nevertheless, once you review my code, you may conclude “that really was not anything special… why did it take him so long to create that, and why so difficult to find comparable models that he could reuse?”

The answer to these questions is that oftentimes those pursuing the process need to educate themselves. Thus, it was not merely me taking another individual’s sample code… it was about me understanding that sample code. The understanding is often the hardest part of the challenge.

To begin…

The objective is to create an X-Y plot, wherein the X-data and Y-data are columns in an MS Excel Worksheet, as shown in the figure below:

These data can be plotted either one dependent variable at a time or any combination of dependent columns versus the independent column. Accomplishing this manually using MS Excel is a straightforward task. Yet, if one wishes to develop a standardized template or create a common format, particularly if the plotting is to be repeated many times, performing this manually becomes overwhelmingly tedious.

The routine I developed creates a chart with multiple data series displayed versus the common independent axis. The plot generated by the Visual Basic code from these data is as shown below:

The code follows:

```
Sub plotData()
' Purpose: Plots 3 functions versus time in Excel using
' Visual Basic programming.
' 2017-12-31.
' J. Zaleski

    ' how many rows? (the last data row number is stored in cell J9)
    Dim rCount As Long
    rCount = ActiveSheet.Cells(9, 10)

    ' axis dimensions
    Dim xaxis As Range
    Dim yaxis As Range
    Dim yaxis2 As Range
    Dim yaxis3 As Range

    Set xaxis = Range("$a$3", "$a" & rCount)
    Set yaxis = Range("$b$3", "$b" & rCount)
    Set yaxis2 = Range("$c$3", "$c" & rCount)
    Set yaxis3 = Range("$d$3", "$d" & rCount)

    ' dimension chart and embed it in worksheet "test"
    Dim c As Chart
    Set c = ActiveWorkbook.Charts.Add
    Set c = c.Location(Where:=xlLocationAsObject, Name:="test")

    With c
        .ChartType = xlXYScatterLines 'A scatter plot, not a line chart!
        ' set other chart properties
    End With

    ' add data series to chart
    With c
        ' assign x and y value ranges to series 1 (name taken from cell B2)
        .SeriesCollection.NewSeries
        .SeriesCollection(1).Name = Worksheets("test").Cells(2, 2)
        .SeriesCollection(1).Values = yaxis
        .SeriesCollection(1).XValues = xaxis
        .SeriesCollection(1).MarkerBackgroundColor = RGB(0, 0, 0)
        .SeriesCollection(1).MarkerForegroundColor = RGB(0, 0, 0)
        .SeriesCollection(1).MarkerSize = 2
        .SeriesCollection(1).MarkerStyle = 3
        .SeriesCollection(1).Format.Line.Weight = 1#
        .SeriesCollection(1).Format.Line.ForeColor.RGB = RGB(0, 0, 0)
        .SeriesCollection(1).Format.Line.BackColor.RGB = RGB(0, 0, 0)

        ' assign x and y value ranges to series 2 (name taken from cell C2)
        .SeriesCollection.NewSeries
        .SeriesCollection(2).Name = Worksheets("test").Cells(2, 3)
        .SeriesCollection(2).Values = yaxis2
        .SeriesCollection(2).XValues = xaxis
        .SeriesCollection(2).MarkerBackgroundColor = RGB(128, 0, 0)
        .SeriesCollection(2).MarkerForegroundColor = RGB(128, 0, 0)
        .SeriesCollection(2).MarkerSize = 2
        .SeriesCollection(2).MarkerStyle = 4
        .SeriesCollection(2).Format.Line.Weight = 1#
        .SeriesCollection(2).Format.Line.ForeColor.RGB = RGB(128, 0, 0)
        .SeriesCollection(2).Format.Line.BackColor.RGB = RGB(128, 0, 0)

        ' assign x and y value ranges to series 3 (name taken from cell D2)
        .SeriesCollection.NewSeries
        .SeriesCollection(3).Name = Worksheets("test").Cells(2, 4)
        .SeriesCollection(3).Values = yaxis3
        .SeriesCollection(3).XValues = xaxis
        .SeriesCollection(3).MarkerBackgroundColor = RGB(128, 128, 0)
        .SeriesCollection(3).MarkerForegroundColor = RGB(128, 128, 0)
        .SeriesCollection(3).MarkerStyle = 5
        .SeriesCollection(3).MarkerSize = 2
        .SeriesCollection(3).Format.Line.Weight = 1#
        .SeriesCollection(3).Format.Line.ForeColor.RGB = RGB(128, 128, 0)
        .SeriesCollection(3).Format.Line.BackColor.RGB = RGB(128, 128, 0)
    End With

    'adjust major unit on x axis (cell J8)
    With c.Axes(xlCategory)
        .MajorUnit = Worksheets("test").Cells(8, 10)
    End With

    'adjust major unit on y axis (cell J5)
    With c.Axes(xlValue)
        .MajorUnit = Worksheets("test").Cells(5, 10)
    End With

    ' read the axis limits from the worksheet; Double avoids
    ' rounding of non-integer limits
    Dim maxY As Double
    maxY = Worksheets("test").Cells(3, 10)   ' maximum y value

    Dim minY As Double
    minY = Worksheets("test").Cells(4, 10)   ' minimum y value

    Dim maxX As Double
    maxX = Worksheets("test").Cells(6, 10)   ' maximum x value

    Dim minX As Double
    minX = Worksheets("test").Cells(7, 10)   ' minimum x value

    With c
        'locate chart
        .ChartArea.Top = 50
        .ChartArea.Left = 400

        'adjust width and height
        .ChartArea.Width = 400
        .ChartArea.Height = 200

        .Axes(xlValue).MinimumScale = minY
        .Axes(xlValue).MaximumScale = maxY

        .Axes(xlCategory).MinimumScale = minX
        .Axes(xlCategory).MaximumScale = maxX

        .ChartArea.Format.TextFrame2.TextRange.Font.Size = 8
        .ChartArea.Format.TextFrame2.TextRange.Font.Name = "Arial Narrow"

        .Axes(xlCategory).TickLabels.Font.Color = RGB(0, 0, 0)
        .Axes(xlValue).TickLabels.Font.Color = RGB(0, 0, 0)

        'adjust decimal places on both axes
        .Axes(xlCategory).TickLabels.NumberFormat = "0.00"
        .Axes(xlValue).TickLabels.NumberFormat = "0.00"

        ' axis titles (x-axis title taken from cell A2) and legend
        .Axes(xlCategory, xlPrimary).HasTitle = True
        .Axes(xlCategory, xlPrimary).AxisTitle.Characters.Text = Worksheets("test").Cells(2, 1)
        .Axes(xlValue, xlPrimary).HasTitle = True
        .Axes(xlValue, xlPrimary).AxisTitle.Characters.Text = "Three Functions versus Time"
        .HasLegend = True
    End With

End Sub
```

The code is written as a macro and the data are maintained in a worksheet called “test”. The macro is associated with a button placed on the worksheet which, when pressed, creates a plot of the three dependent variables. The routine is not intended to be all-encompassing in terms of capability. But, to the uninitiated, the code serves as a building block for tailoring and customization. For instance, the routine requires three specific dependent columns of data, and the .SeriesCollection field is specified with three fixed indices. This can be generalized to any number of series by using a For loop with an indexing variable. Furthermore, the color schemes, tick labels, size, location, etc. are all customizable.

LinkedIn Article: “Another year, another 500,000 meters”

Chronicling the accomplishment of meeting the 500,000 meter goal for indoor Erg and the time to achieve this for 2017. Published on LinkedIn.

Concept 2 Rowing Challenge: November 23rd – December 24th 2017

Completed the Concept 2 Holiday Rowing Challenge with 201,580 meters. Thus far this season have logged 482,862 total meters… and the year is not done yet. Targeting 500,000 at least by year end 2017. Rowing the erg in winter primarily and water rowing Spring-Summer-Early Fall on the Chesapeake Bay is for the most part how the split goes. Goal is to target exceeding these goals by 20% in 2018.

“What Real-Time Data Could Have Done for These Patients” – Smart Alarm Web Log

A summary of how smart alarms could have assisted patients monitored while in intensive care is provided at this Bernoulli Health link. This relates my experience many years ago and references the use of real-time data to better identify the onset of potentially adverse events.

Heart Rate Measurement Using Garmin & Polar Wearables

A study was made of the Garmin Vivoactive HR and Polar H10 chest strap in terms of comparative heart rate assessments. The units are shown in Figure 1 below. The two units involved included a wrist-based sensor (Garmin Vivoactive HR) and a chest strap (Polar H10).

This follow-up focuses on 20 minutes of water rowing using both units in an effort to assess the heart rate measurement consistency and reliability. Both watch and chest strap were properly attached with no movement between these devices and the skin. Data were collected and then downloaded and processed through a Microsoft Excel spreadsheet. The data were time-synchronized so that corresponding data points from each device were associated in time. A summary of the analysis is provided here.

Time-Based Plots of Heart Rate

Overlay scatter plots of heart rate measurements versus time were made and are as shown in Figure 2.

A general observation from the data is that the heart rate measurements from the two devices seem to overlap reasonably well as viewed by the naked eye. But there are key drops in measurements, particularly with the wrist-based heart rate sensor, that show as deviations in the overlap of the two signals. This can be seen more readily via the correlation curve shown in Figure 3. A correlation coefficient of 0.91 was determined between the two sets of measurements. It should be noted that the wrist-based sensor was snug with no movement on the wrist. Ambient temperature was approximately 80 °F.

As I showed in a previous post, there was a serious issue with the wrist-based sensor in which there were data dropouts with some significant time lags between measurements. In the case of the wrist-based sensor for the associated measurements here, this was also experienced. For comparison, I show histogram plots of the time intervals between measurements for both the wrist-based sensor (Figure 4) and the chest strap (Figure 5). The wrist-based sensor experiences a significant number of events in which the time between actual measurements is greater than one second. Indeed, from the figure, only 83 measurements during this interval were obtained within one second of one another! There were a significant number of measurements in which the interval was > 1 second, with one as high as 40 seconds. The overall quantity of measurements was thus reduced to approximately 430 during the workout. On the other hand, the chest strap consistently measured at one-second intervals for a total of approximately 1320 measurements.
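A histogram of this kind can be tallied with a few lines of Python. The sketch below is not the spreadsheet procedure used for Figures 4 and 5; the timestamps are made-up values in seconds and `interval_histogram` is a hypothetical helper name.

```python
from collections import Counter


def interval_histogram(timestamps):
    """Count how often each gap (in seconds) occurs between consecutive samples."""
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    return Counter(gaps)


# Made-up sample times (seconds from the start of the workout).
wrist_times = [0, 1, 2, 14, 15, 27, 28, 68]  # irregular, with dropouts
strap_times = list(range(0, 69))             # one sample every second

wrist_hist = interval_histogram(wrist_times)
strap_hist = interval_histogram(strap_times)

for gap in sorted(wrist_hist):
    print("wrist gap of", gap, "s occurred", wrist_hist[gap], "times")
print("longest strap gap:", max(strap_hist), "s")
```

A device that samples steadily produces a single histogram bin, while dropouts show up as counts at the longer gaps.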

Conclusions

Chest straps are much more reliable for heart rate measurement than wrist-based sensors. Users of wrist-based sensors for heart rate measurement should be advised that measurements can be in question, as the results here illustrate. This is not to say that chest straps are the gold standard. Clearly, ECG measurements similar to those obtained through stress testing are of diagnostic quality. Yet, for rate measurement, chest straps are quite adequate and seemingly reliable.

Heart Rate Measurement Using Garmin & Polar Wearables

A study was made of the Garmin Vivoactive HR and Polar H10 chest strap in terms of comparative heart rate assessments. Three different types of tests were conducted while the author wore these devices. The units are shown in Figure 1. The Garmin unit can be used with a number of sports, including rowing, and provides measurements of heart rate, stroke rate, distance per stroke, and split times, as well as location tracking during the workout. Data can be uploaded to http://connect.garmin.com/ and are also available for download in TCX (an XML format) as well as splits downloads in CSV format. The Polar H10 is strapped around the chest just below the level of the breast bone. This unit, too, can upload data to the http://flow.polar.com/ web site, where data can be downloaded in TCX format as well. In order to provide some variety, I considered three different activities:

• General workout, involving weight lifting, sit-ups, squats;
• Walking for 1 mile; and,
• Indoor rowing for 15 minutes.

In all cases, both the Vivoactive HR and the H10 were attached, with the Vivoactive HR snugly affixed to the left wrist. Both watch and chest strap were properly attached with no movement between these devices and the skin. Data were collected and then downloaded and processed through a Microsoft Excel spreadsheet. The data were time-synchronized so that corresponding data points from each device were associated in time. Plots of the measurements were made.

Time-Based Plots of Heart Rate

Overlay scatter plots of heart rate measurements versus time were made of all three activities, shown in Figure 2 through Figure 4. Data were downloaded from the Garmin & Polar cloud sites and were uploaded into MS Excel. The data were then time-synchronized using Visual Basic to align the measurements.

Heart Rate Comparison: Walking

Measurements of heart rate were taken during a one-mile walk. The heart rates were plotted against one another and the correlation coefficient was computed between the two sets of measurements. In the case of the comparison shown in Figure 5, the correlation among measurements was rather poor: the correlation coefficient was determined to be -0.54. Perfect correlation is given by the diagonal line in the figure. Interesting to note is the time variance of the data points taken from the Vivoactive HR. In some instances, the time between its measurements was as high as 47 seconds, with 62 measurements in the 12-14 second interval range, whereas all Polar H10 measurements were at 1-second intervals. Thus, the measurements taken by the Polar H10 were far denser than those of the Vivoactive HR.
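For readers who want to reproduce this kind of comparison outside of Excel, the following Python sketch pairs readings from the two devices and computes the Pearson correlation coefficient. It assumes the simplest possible synchronization (keeping only timestamps present in both data sets), which may differ from the alignment used for the figures; the readings and the function names are hypothetical.

```python
import math


def align_by_time(series_a, series_b):
    """Pair readings from two {time_s: bpm} mappings that share a timestamp."""
    common = sorted(set(series_a) & set(series_b))
    return [series_a[t] for t in common], [series_b[t] for t in common]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up readings: the strap reports every second, the wrist unit skips some.
strap = {0: 100, 1: 102, 2: 104, 3: 106, 4: 108}
wrist = {0: 100, 2: 104, 4: 108}

strap_bpm, wrist_bpm = align_by_time(strap, wrist)
r = pearson(strap_bpm, wrist_bpm)
print(strap_bpm, wrist_bpm, r)
```

A coefficient near 1 means the paired readings rise and fall together; values near 0 or below, like the walking result above, indicate poor agreement.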

Heart Rate Comparison: General Activity

In the case of general activity, which included some weight lifting, sit-ups, leg raises and standing exercises, the heart rate comparison is as shown in Figure 6. The correlation coefficient among these measurements is a bit higher at 0.60. The variation in measurement collection time associated with the Garmin HR was even higher here, with one measurement interval as high as 88 seconds!

I have hypothesized that the wide variation in data collection time may be due to arm motion that is not experienced to the same degree in walking. I also have hypothesized that the improved correlation may be due to the higher heart rate, which is more easily detected by the Vivoactive HR. We will see some supporting evidence of this in the final section on indoor rowing.

Heart Rate Comparison: Indoor Rowing

Rowing on the Concept 2 PM5 unit while wearing both the Vivoactive HR and the Polar H10 produced the results as illustrated in Figure 7. The correlation between the Vivoactive HR and the Polar H10 is much higher here, with a correlation coefficient of 0.95. Several items of note: the variation in measurements with the Vivoactive HR is much lower, with only two measurements 19 seconds apart and most measurements having 1-2 second intervals. This complies much more closely with the 1-second measurement intervals of the Polar H10. Furthermore, heart rate measurements are much higher here: some measurements as high as 165 beats/minute (during sprints). In general, corroboration between the two units is better as heart rate measurement is higher. This could be due to more accurate peripheral measurement.

Conclusions

Based on the limited sampling and workouts thus far, the general conclusion regarding heart rate measurement “trust” is that the Polar H10 is more reliable based on several observations: (1) data collection time variation remains consistent at 1 second; and, (2) data density remains high with no dropouts in any of the workouts. This is not a surprise in general as the conventional wisdom is that chest straps are much more reliable. Yet, I wanted to quantify this reliability using some objective measures. It should be noted that while heart rate remains somewhat questionable with the Vivoactive HR, I have found that stroke rate measurement in comparison with the Concept 2 PM5 measurement is dead on accurate (at least based on the data I have observed).

Rowing Data Analytics: Reducing and Studying the Rowing Workout

In my last post (“Rowing Data…”) I discussed the steps associated with downloading the Garmin Vivoactive HR data from Garmin Connect to an Excel spreadsheet. In this post, I’m going to take the reader through the analysis of the data as a tutorial and guide for assessing certain elements of these data.

Raw data in Excel format are shown in Figure 1. I am going to focus on distance (column M), speed (column N), and heart rate (column O).

I normally like to study discrete, time-based data by translating the time component from the Zulu time (column L) into a relative time from the start of the workout. Furthermore, I like to translate these into units of seconds as the base unit.

To do so, we can take advantage of some powerful capabilities contained within formulas inside of Microsoft Excel. For example, the start time listed in column L begins with the entry:

2017-07-08T14:09:31.000Z

The next entry is:

2017-07-08T14:09:34.000Z

These are “Zulu” time or absolute time references. We wish all future times to be keyed or made in reference to the first time. In order to do so, we need to translate this entry into a time in seconds. We can do so by parsing each element of the entry. These entries are listed sequentially in cells L2 and L3, respectively.

Each element is translated into seconds by parsing the hours, minutes and seconds using the following formula:

=MID($A$2,12,2)*60*60+MID($A$2,15,2)*60+MID($A$2,18,2)

The first component extracts the time in hours and translates into seconds. The second component extracts the “minutes” and translates into seconds. The third component extracts the “seconds” element by itself. The total time is the superposition of all three individual components.

Thus, what I normally do is to copy the contents of the initial spreadsheet into a new sheet adjacent to the original and then begin working on the data. Presently, I am in the process of developing an application that will perform this function automatically. Yet, here I am “walking the track” associated with analyzing the data in order to chronicle the mathematics surrounding the process.

The hour, minute and second can be extracted as separate columns. Let us copy the contents of column L in the original spreadsheet into a new sheet within the existing workbook and place the time in column A of that new sheet. Thus, the entries in this sheet would appear as follows:

| ns1:Time | Absolute Time (seconds) | Relative Time (seconds) |
| --- | --- | --- |
| 2017-07-08T14:09:31.000Z | 50971 | 0 |
| 2017-07-08T14:09:34.000Z | 50974 | 3 |
| 2017-07-08T14:09:35.000Z | 50975 | 4 |

The Absolute time in the middle column is the time in seconds represented by the left-hand column relative to Midnight Zulu time. The right-hand column is the time relative to the first cell entry in the middle column. Thus, zero corresponds to 50971-50971. The entry for three seconds corresponds to the difference between 50974 (second entry) and 50971 (first entry), and so on.
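The same conversion can be sketched in Python by slicing the hours, minutes and seconds out of the time stamp, mirroring the character positions used by the MID formula. The function name `seconds_since_midnight` is hypothetical, and the sketch assumes that all stamps fall on the same day (it does not handle a workout that crosses midnight).

```python
def seconds_since_midnight(stamp):
    """Convert a stamp such as '2017-07-08T14:09:31.000Z' to seconds past midnight."""
    hours = int(stamp[11:13])    # same characters as MID(stamp, 12, 2)
    minutes = int(stamp[14:16])  # same characters as MID(stamp, 15, 2)
    seconds = int(stamp[17:19])  # same characters as MID(stamp, 18, 2)
    return hours * 60 * 60 + minutes * 60 + seconds


stamps = [
    "2017-07-08T14:09:31.000Z",
    "2017-07-08T14:09:34.000Z",
    "2017-07-08T14:09:35.000Z",
]

absolute = [seconds_since_midnight(s) for s in stamps]
relative = [t - absolute[0] for t in absolute]  # time since the first sample

print(absolute)
print(relative)
```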

I also created some columns to validate parameter entries. For instance, the reported total distance and speed (in units of meters and meters per second, respectively), in columns M and N, and the heart rate, in column O, are referred to next. I created a new column in the new spreadsheet to provide a derived estimate of total distance, which I computed as the integral of speed over time. The incremental distance, dS, is equal to the speed at that time, V, multiplied by the time differential between the current time and the previous time stamp, dt. Then, the total distance is the integral, or the summation of this incremental distance and all prior distances. I reflect this as column G in the new worksheet, shown in Figure 2.
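As a rough sketch of that running sum, the Python function below multiplies each reported speed by the time elapsed since the previous sample and accumulates the result. The sample values and the name `cumulative_distance` are assumptions for illustration, not data from the workout.

```python
def cumulative_distance(times, speeds):
    """Integrate speed over time: each step adds speed * (t_i - t_previous)."""
    total = 0.0
    distances = [total]  # no distance has been covered at the first sample
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        total += speeds[i] * dt  # dS = V * dt, using the speed at the current sample
        distances.append(total)
    return distances


# Hypothetical relative times (seconds) and speeds (meters per second).
times = [0, 3, 4]
speeds = [0.0, 2.0, 1.5]

derived = cumulative_distance(times, speeds)
print(derived)
```

Comparing the derived column against the reported total distance gives a quick sanity check on the downloaded data.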

What follows now are plots of the raw and derived data. First, the heart rate measurement over time is shown in Figure 3. Note that the resting rate is shown at first. Once the workout intensifies, heart rate increases and remains relatively high throughout the duration of the workout.

The total distance covered over time is shown in Figure 4. This tends to imply a relatively constant speed during the workout due to the linear behavior over the 8700+ meters.

The reported speed, as measured via GPS, shows variability but is typically centered about 1.85 meters per second. The speed over time is shown in Figure 5.

The GPS coordinates are also available through the Excel data. I have subtracted out the starting location in order to provide a relative longitude-latitude plot of the workout, shown in Figure 6.

In my next post I will focus on the athletic aspects of the workout related to training.