
Awesome Sensor Logger

This repository contains a collection of tools, resources and sample code to use alongside the Sensor Logger app.


The Sensor Logger App

Sensor Logger is a free, easy-to-use, cross-platform data logger that records readings from common motion-related sensors on smartphones. Once a recording is complete, it can be exported as zipped CSV files or JSON, or viewed within the app via interactive plots. The iOS version comes with a free companion watch app that allows you to log heart rate & wrist motion and control recording sessions remotely from your wrist. Supported sensors and measurements include:

  • Device Acceleration (Accelerometer; Raw & Calibrated)
  • Gravity Vector (Accelerometer)
  • Device Rotation Rate (Gyroscope; Raw & Calibrated)
  • Device Orientation (Gyroscope)
  • Magnetic Heading (Magnetometer; Raw & Calibrated)
  • Barometric Altitude (Barometer)
  • GPS Coordinate, Altitude, Speed & Heading
  • Proximity Sensor (Android Only)
  • Audio Recording (Microphone)
  • Ambient Sound Level (Microphone)
  • Heart Rate (via Companion Watch App)
  • Wrist Motion (via Companion Watch App)
  • Annotations (Timestamp and optional accompanying text comment)

Learn more and download Sensor Logger at www.tszheichoi.com/sensorlogger.


Getting Started with Data Analysis

Measurements made using the Sensor Logger can be exported in either .csv or .json formats. For data analysis, the former is recommended. See www.tszheichoi.com/sensorloggerhelp for more about how exporting works.
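If you export as zipped CSV, here is a minimal sketch for unpacking a recording with the Python standard library (the archive name is illustrative; the per-sensor CSV names, e.g. Accelerometer.csv, are the ones used throughout this guide):

import zipfile
from pathlib import Path

archive = Path("2022-06-14_03-15-05.zip")  # illustrative file name
out_dir = archive.with_suffix("")

with zipfile.ZipFile(archive) as zf:
    zf.extractall(out_dir)  # one CSV per sensor, e.g. Accelerometer.csv, Location.csv

print(sorted(p.name for p in out_dir.glob("*.csv")))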

Recommended Tools

Python is recommended for analysing the outputs from Sensor Logger. For interactive data exploration and visualisation, use Jupyter notebooks, which you can run for free with tools like Google Colab or Deepnote. The typical data science Python packages apply:

  • Pandas for CSV parsing and manipulation
  • NumPy for numerical computation
  • SciPy for time series analysis
  • Folium / Leaflet.js for geospatial visualisation
  • Dash / Flask for live data streaming

Understanding Timestamps

All exported data have synchronised timestamps, meaning they can be cross-referenced. However, samples do not necessarily align across sensors due to varied sampling rates. Note the following:

  • The time column is the UNIX epoch timestamp of the measurement as reported by the sensors, in nanoseconds. You can use tools like https://www.epochconverter.com/ to convert them to readable timestamps, or do so in pandas as sketched after this list. By definition, these are UTC times -- whereas the filenames use local time.
  • The seconds_elapsed column is the number of seconds since you tapped the Start Recording button. Note that some entries could be negative, meaning the measurements were made before the start of the recording, but are reported by your phone after the tap due to buffering or caching.
  • The accuracy of timestamps relies on your phone's system clock, so make sure your phone's time is set accurately to ensure physically correct timestamps. If your phone changes time zone mid-recording, it may also lead to unpredictable behaviour.
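A minimal sketch for converting the time column to readable UTC timestamps with pandas (Accelerometer.csv is used for illustration; any exported sensor file works the same way):

import pandas as pd

df = pd.read_csv('Accelerometer.csv')
# 'time' holds UNIX epoch nanoseconds; interpret it explicitly as UTC
df['datetime_utc'] = pd.to_datetime(df['time'], unit='ns', utc=True)
print(df[['time', 'seconds_elapsed', 'datetime_utc']].head())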

File Handling

Accessing Recording Metadata

The metadata.csv file contains information about the device that performed the logging; a sketch for reading it follows the list below.

  • version: The schema version of the exported data. This is different from the version of the app. When this version increments, you may have to update your data analysis scripts, as things such as column names, file names, or data structure may have changed.
  • device name: The name of the device used for recording.
  • recording time: The start time of the recording in UTC.
  • platform: Either iOS or Android.
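A minimal sketch for reading the metadata with pandas (the column names follow the list above; the expected schema version is an assumption for illustration):

import pandas as pd

meta = pd.read_csv('metadata.csv').iloc[0]  # single-row file
print(meta['platform'], meta['device name'], meta['recording time'])

# fail fast when the export schema changes under your analysis scripts
EXPECTED_SCHEMA_VERSION = 1  # assumption; set to the version your scripts were written against
assert int(meta['version']) <= EXPECTED_SCHEMA_VERSION, 'schema changed; re-check column and file names'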

When to Use Uncalibrated Data

Sensor Logger gives you the option to log raw, uncalibrated data from the accelerometer, gyroscope and magnetometer. Calibrated data is always logged. The raw stream is useful for performing lower-level post-processing or custom sensor fusion. If in doubt, always use the calibrated version unless you have a good reason not to.

Plotting Data

Use pandas to import the .csv files and convert timestamps. Use Plotly to visualise the data interactively. Here, we show the acceleration experienced by a user during a roller coaster ride.

import pandas as pd
import plotly.graph_objects as go

df = pd.read_csv('Accelerometer.csv')
df.index = pd.to_datetime(df['time'], unit='ns')

fig = go.Figure()

for axis in ['x', 'y', 'z']:
    fig.add_trace(go.Scatter(x=df.index, y=df[axis], name=axis))

fig.show()


Some sensors may also report uncertainties, which are important for analysis and multi-sensor fusion. For example, the speed from GPS has an associated error, stored in the speedAccuracy column.

import pandas as pd
import plotly.graph_objects as go

df = pd.read_csv('Location.csv')
df.index = pd.to_datetime(df['time'], unit='ns')

fig = go.Figure()

fig.add_trace(go.Scatter(x=df.index, y=df['speed'], mode='markers',
              error_y={'type': 'data', 'array': df['speedAccuracy']}))

fig.show()


Diagnosing Sampling Rates

Real-world measurements are never perfectly evenly sampled, so it is important to check the sampling behaviour before any analysis. For example, this particular GPS location data recorded on an iPhone has an avg_sampling_rate of 0.93 Hz, but the gap between consecutive measurements ranges from 0.99 to 12 seconds.

import pandas as pd
import numpy as np

df = pd.read_csv('Location.csv')
consecutive_deltas = df['seconds_elapsed'].diff()
avg_sampling_rate = 1 / np.mean(consecutive_deltas)  # in Hz
shortest_gap = np.min(consecutive_deltas)
longest_gap = np.max(consecutive_deltas)
total_num_of_samples = len(df.index)
print(f'{avg_sampling_rate:.2f} Hz average; gaps from {shortest_gap:.2f}s to {longest_gap:.2f}s over {total_num_of_samples} samples')

To resample your data, for example, to minute periods:

import pandas as pd

df = pd.read_csv('Location.csv')
df.index = pd.to_datetime(df['time'], unit='ns')
df_resampled = df.resample('1T').median()  # resample() returns a new DataFrame

Also read https://www.earthdatascience.org/courses/use-data-open-source-python/use-time-series-data-in-python/date-time-types-in-pandas-python/resample-time-series-data-pandas-python/

To understand the implications and robustness of resampling and dealing with missing data in motion-related time series, consult the dedicated literature on resampling and imputation.

Mapping GPS Tracks

Use tools like Folium, which is built on top of Leaflet.js, to overlay GPS tracks on a map:

import folium
import pandas as pd

df = pd.read_csv("Location.csv")
coords = [(row.latitude, row.longitude) for _, row in df.iterrows()]

my_map = folium.Map(location=[df.latitude.mean(), df.longitude.mean()], zoom_start=16)
folium.PolyLine(coords, color="blue", weight=5.0).add_to(my_map)
my_map.save("track.html")  # in a notebook, the map also renders by evaluating my_map


Alternatively, convert your exported data to GPX using https://github.com/mhaberler/sensorlogger-util, and then upload to Google Maps for visualisation, following, for example, https://www.alphr.com/gpx-google-maps/.

Aligning & Interpolating Measurements Across Sensors

Often, one has to align measurements across sensors -- for instance, the gyroscope and accelerometer, so that you can apply rotation matrices to the acceleration vectors.

  • Option 1: Perform an outer join and then interpolate missing values for both sensors. By default, pandas interpolates linearly, but see the documentation for more advanced options.
  • Option 2: Interpolate one sensor onto the timestamps of another -- probably better if the two sensors have wildly different sampling rates.

Both options are sketched below:

import pandas as pd
import numpy as np

df_gyro = pd.read_csv('Gyroscope.csv')
df_acce = pd.read_csv('Accelerometer.csv')

for df in [df_gyro, df_acce]:
    df.index = pd.to_datetime(df['time'], unit='ns')

# Option 1: outer join, then fill the gaps in both sensors by linear interpolation
df_joined = df_gyro.join(df_acce, lsuffix='_gyro', rsuffix='_acce', how='outer').interpolate()

# Option 2: interpolate the gyroscope onto the accelerometer's timestamps;
# np.interp needs numeric x values, so convert the DatetimeIndex to int64 nanoseconds
gyro_x_on_acce = np.interp(df_acce.index.astype('int64'), df_gyro.index.astype('int64'), df_gyro['x'])

Using Option 1 above, the altitude readings from both the GPS and the barometer can be resampled and aligned. When this is done, one finds that GPS is not great at characterising rapid altitude changes and exhibits biases during transient periods.


For more complex alignment needs -- such as aligning with measurements from sources other than Sensor Logger -- you may need techniques such as cross-correlation or dynamic time warping.

Smoothing & Denoising

Different applications require different smoothing and denoising strategies. The SciPy Cookbook (https://scipy-cookbook.readthedocs.io/) has some handy code you can borrow.

As discussed in the interpolation section above, downsampling with a suitable aggregator (e.g. the median) or interpolating with suitable spline or polynomial functions can also achieve smoothing. A low-pass filter is another common choice, as sketched below.
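A minimal sketch of a zero-phase Butterworth low-pass filter applied to one accelerometer axis, assuming roughly uniform sampling (the 100 Hz rate and 5 Hz cutoff are illustrative; estimate the real rate as in the sampling-rate section above):

import pandas as pd
from scipy.signal import butter, filtfilt

df = pd.read_csv('Accelerometer.csv')

SAMPLING_RATE_HZ = 100  # illustrative; derive it from the data
CUTOFF_HZ = 5           # keep motion below 5 Hz, attenuate higher-frequency noise

# 4th-order Butterworth low-pass; Wn is the cutoff relative to the Nyquist frequency
b, a = butter(4, CUTOFF_HZ / (SAMPLING_RATE_HZ / 2))
df['x_smooth'] = filtfilt(b, a, df['x'])  # filtfilt runs forwards and backwards: no phase lag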

Fourier Transforms

Use Fourier transforms to understand any periodicity in your logged values. For example, you can detect walking by simply thresholding the gravity vector measurements in frequency space. Here is a simple example of Fourier transforming the gravity measurements taken on Zodiac, a spinning ride at Thorpe Park. As you can see, there is a peak around 0.25 Hz (i.e. the ride completes one revolution every 4 seconds).

import pandas as pd
import numpy as np
import plotly.graph_objects as go

df = pd.read_csv('Gravity.csv')
ts = pd.to_datetime(df['time'], unit='ns')
period = np.nanmedian(ts.diff().dt.total_seconds())  # median sampling period in seconds

sp = np.fft.fft(df['x'])
freq = np.fft.fftfreq(len(df.index), period)

mask = (freq >= 0) & (freq < 1)  # keep only positive frequencies below 1 Hz

fig = go.Figure()
fig.add_trace(go.Scatter(x=freq[mask], y=np.abs(sp[mask])))  # magnitude spectrum
fig.show()


Track Simplification

You can use algorithms like Douglas-Peucker to simplify recorded GPS tracks and save storage space for long recordings. A Sensor Logger-specific implementation can be found here: https://github.com/mhaberler/sensorlogger-util/blob/master/simplify.py. A generic sketch follows.
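Alternatively, Shapely's simplify() implements Douglas-Peucker; a minimal sketch (the tolerance is in coordinate units, i.e. degrees for latitude/longitude, and the value here is illustrative):

import pandas as pd
from shapely.geometry import LineString

df = pd.read_csv('Location.csv')
track = LineString(list(zip(df['longitude'], df['latitude'])))

# tolerance is in degrees here; ~0.0001 deg is roughly 10 m of latitude
simplified = track.simplify(0.0001, preserve_topology=False)
print(len(track.coords), '->', len(simplified.coords))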

Peak Detection
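A common starting point is scipy.signal.find_peaks; here is a minimal sketch for detecting peaks in the total acceleration magnitude (the height and distance thresholds are illustrative and need tuning per application):

import pandas as pd
import numpy as np
from scipy.signal import find_peaks

df = pd.read_csv('Accelerometer.csv')
magnitude = np.sqrt(df['x']**2 + df['y']**2 + df['z']**2)

# height/distance are illustrative; note that distance is in samples, not seconds
peaks, properties = find_peaks(magnitude, height=2.0, distance=20)
print(f'{len(peaks)} peaks found')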

Activity Detection & Semantic Segmentation
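Building on the Fourier transform section above, a minimal sketch that flags windows as walking when the dominant gravity-vector frequency falls in a typical step band (the 1-3 Hz band and the window length are illustrative assumptions):

import pandas as pd
import numpy as np

df = pd.read_csv('Gravity.csv')
ts = pd.to_datetime(df['time'], unit='ns')
period = np.nanmedian(ts.diff().dt.total_seconds())

WINDOW = 256  # samples per analysis window; illustrative
for start in range(0, len(df) - WINDOW, WINDOW):
    segment = df['x'].iloc[start:start + WINDOW]
    sp = np.abs(np.fft.rfft(segment - segment.mean()))  # magnitude spectrum, DC removed
    freq = np.fft.rfftfreq(WINDOW, period)
    dominant = freq[np.argmax(sp)]
    is_walking = 1.0 <= dominant <= 3.0  # typical step-frequency band; assumption
    print(f'window at {start}: dominant {dominant:.2f} Hz, walking={is_walking}')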

Time Series Classification
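For supervised classification of labelled recordings, simple windowed statistics fed into a standard classifier go a long way; a minimal sketch using scikit-learn (the feature set, window length, and placeholder labels are illustrative; libraries like sktime or tsfresh offer richer features):

import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

def window_features(series: pd.Series, window: int = 128) -> np.ndarray:
    # one row of summary statistics per non-overlapping window; illustrative feature set
    windows = [series.iloc[i:i + window] for i in range(0, len(series) - window, window)]
    return np.array([[w.mean(), w.std(), w.min(), w.max()] for w in windows])

df = pd.read_csv('Accelerometer.csv')
X = window_features(df['x'])
y = np.zeros(len(X))  # placeholder; supply your own activity label per window

clf = RandomForestClassifier(n_estimators=100).fit(X, y)
print(clf.predict(X[:5]))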

Removing Duplicated Entries

Sometimes, a misbehaving sensor may report the same value twice with identical timestamps, likely due to caching. Try something like this to remove them:

import pandas as pd

def remove_duplicated_rows(df: pd.DataFrame) -> pd.DataFrame:
    # the exported CSVs store timestamps in the 'time' column;
    # diff() yields NaN for the first row, which the mask correctly keeps
    _df = df[~(df["time"].diff() <= 0)]  # drop rows whose timestamp did not strictly increase
    if len(_df.index) != len(df.index):
        print(f"duplicated rows detected in the input file ({len(df.index) - len(_df.index)})")
    return _df

Audio Analysis

ffmpeg-python (https://github.com/kkroening/ffmpeg-python) has some valuable examples of interesting audio analysis.

pyAudioAnalysis is also worth checking out for audio feature extraction, classification and segmentation: https://github.com/tyiannak/pyAudioAnalysis

Converting to GPX / InfluxDB

Michael Haberler has helpfully put together a command-line tool, sensorlogger-util, to take the JSON exported from Sensor Logger and convert it to GPX or InfluxDB formats: https://github.com/mhaberler/sensorlogger-util

python sensorlogger.py -g <json file>
python sensorlogger.py -2 [--bucket sensorlogger] --token xxx --org yyyy --url http://host:8086 2022-06-14_03-15-05.json

Live Data Streaming

As of version 1.10, Sensor Logger supports pushing live data via HTTP. This can be enabled by tapping the gear icon on the Logger page. All enabled sensors during a recording will be streamed every 200 ms to the specified URL. To display the streamed data, you will need to set up a web server on another computer.


The schema of the streamed data is a JSON string of the form {messageId: int, payload: List}. The payload is a list of {time: int, name: str, values: Dict}, where name is the name of the sensor and time is in UTC epoch nanoseconds. The messageId is useful because messages can be received out-of-order, which may need to be handled depending on your use case.
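For illustration, a hypothetical message matching that schema (all values are made up), parsed in Python:

import json

example = '''
{
  "messageId": 42,
  "payload": [
    {"name": "accelerometer", "time": 1657630000000000000,
     "values": {"x": 0.01, "y": -0.02, "z": 9.81}}
  ]
}
'''

message = json.loads(example)
for reading in message["payload"]:
    print(reading["name"], reading["time"], reading["values"])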

To simply consume and explore the data, you may want to use something like https://requestbin.com/. To plot the data in real-time, you may need something more custom. See https://github.com/mhaberler/sensorlogger-telegraf for a solution using telegraf.

Here is an example Python implementation using Plotly Dash to get you started. Dash is powered by Flask under the hood and provides an easy way to set up a web server for real-time, interactive data visualisation. This code listens on the /data endpoint, filters only the values from the accelerometer, and plots them. The update_graph() callback is triggered every UPDATE_FREQ_MS and updates the plot with any measurements accumulated so far. You will have to customise this script yourself if you want to plot measurements from other sensors.

import dash
from dash.dependencies import Output, Input
from dash import dcc, html
from datetime import datetime
import json
import plotly.graph_objs as go
from collections import deque
from flask import Flask, request

server = Flask(__name__)
app = dash.Dash(__name__, server=server)

MAX_DATA_POINTS = 1000
UPDATE_FREQ_MS = 100

time = deque(maxlen=MAX_DATA_POINTS)
accel_x = deque(maxlen=MAX_DATA_POINTS)
accel_y = deque(maxlen=MAX_DATA_POINTS)
accel_z = deque(maxlen=MAX_DATA_POINTS)

app.layout = html.Div(
	[
		dcc.Markdown(
			children="""
			# Live Sensor Readings
			Streamed from Sensor Logger: tszheichoi.com/sensorlogger
		"""
		),
		dcc.Graph(id="live_graph"),
		dcc.Interval(id="counter", interval=UPDATE_FREQ_MS),
	]
)


@app.callback(Output("live_graph", "figure"), Input("counter", "n_intervals"))
def update_graph(_counter):
	data = [
		go.Scatter(x=list(time), y=list(d), name=name)
		for d, name in zip([accel_x, accel_y, accel_z], ["X", "Y", "Z"])
	]

	graph = {
		"data": data,
		"layout": go.Layout(
			{
				"xaxis": {"type": "date"},
				"yaxis": {"title": "Acceleration ms<sup>-2</sup>"},
			}
		),
	}
	if (
		len(time) > 0
	):  #  cannot adjust plot ranges until there is at least one data point
		graph["layout"]["xaxis"]["range"] = [min(time), max(time)]
		graph["layout"]["yaxis"]["range"] = [
			min(accel_x + accel_y + accel_z),
			max(accel_x + accel_y + accel_z),
		]

	return graph


@server.route("/data", methods=["POST"])
def data():  # listens to the data streamed from the sensor logger
	if str(request.method) == "POST":
		print(f'received data: {request.data}')
		data = json.loads(request.data)
		for d in data['payload']:
			if (
				d.get("name", None) == "accelerometer"
			):  #  modify to access different sensors
				ts = datetime.fromtimestamp(d["time"] / 1000000000)
				if len(time) == 0 or ts > time[-1]:
					time.append(ts)
					# modify the following based on which sensor is accessed, log the raw json for guidance
					accel_x.append(d["values"]["x"])
					accel_y.append(d["values"]["y"])
					accel_z.append(d["values"]["z"])
	return "success"


if __name__ == "__main__":
	app.run_server(port=8000, host="0.0.0.0")

Run this Python script and visit http://localhost:8000/ on your computer. Then, enter the correct Push URL in Sensor Logger on your phone under the settings page. To find the local IP address of the device you are running the web server on, you can, for example, do something like this in Python:

import socket
hostname = socket.gethostname()
print(socket.gethostbyname(hostname))

For example, if it returns 192.168.1.168, then you want to enter http://192.168.1.168:8000/data in Sensor Logger. Use the "Tap to Test Pushing" button to test whether Sensor Logger can properly reach the endpoint. If you get a 200 response, then you are good to go! Start a recording as usual, and you should begin to see data being streamed in.


Further Use Cases & Applications

Based on user-submitted feedback, Sensor Logger is being used for a wide range of applications -- by researchers and hobbyists alike. Let me know, and I will feature your use case here as well!

Contribute

Please submit a PR if you have scripts or links that may be useful for other users. I will also feature any project that uses Sensor Logger, integrated as part of a larger workflow.

