
qs_ledger's Introduction

Quantified Self (QS) Ledger

A Personal Data Aggregator and Dashboard for Self-Trackers and Quantified Self Enthusiasts

Quantified Self (QS) Ledger aggregates and visualizes your personal data.

The project has two primary goals:

  1. Download all of your personal data from various tracking services (see below for the list of integrations) and store it locally.
  2. Provide a starting point for personal data analysis, data visualization, and a personal data dashboard.

At present, the main objective is to provide working data downloaders and simple data analysis for each of the integrated services.

Some initial work has begun on using these data streams for predictive analytics and forecasting with machine learning, with the intention to focus increasingly on modeling in future iterations.

Code / Dependencies:

  • The code is written in Python 3.
  • Shared and distributed via Jupyter Notebooks.
  • Most services depend on Pandas and NumPy for data manipulation and Matplotlib and Seaborn for data analysis and visualization.
  • To get started, we recommend downloading and using the Anaconda Distribution.
  • For initial installation and setup help, see documentation below.
  • For setup and usage of individual services, see documentation provided by each integration.

Current Integrations:

  • Apple Health: fitness and health tracking, data analysis and dashboard from iPhone or Apple Watch (includes an example Elasticsearch integration and Kibana health dashboard).
  • AutoSleep: iOS sleep tracking data analysis of sleep per night and rolling averages.
  • Fitbit: fitness and health tracking and analysis of Steps, Sleep, and Heart Rate from a Fitbit wearable.
  • GoodReads: book reading tracking and data analysis for GoodReads.
  • Google Calendar: past events, meetings and times for Google Calendar.
  • Google Sheets: get data from any Google Sheet, which is useful for pulling in data logged by IFTTT integrations.
  • Habitica: habit and task tracking with Habitica's gamified approach to task management.
  • Instapaper: articles read and highlighted passages from Instapaper.
  • Kindle Highlights: parser and highlight extractor for Kindle clippings, along with a sample data analysis and a tool to export highlights to separate Markdown files.
  • Last.fm: music tracking and analysis of music listening history from Last.fm.
  • Oura: Oura ring activity, sleep, and wellness data.
  • RescueTime: track computer usage and analysis of computer activities and time with RescueTime.
  • Pocket: articles read and read count from Pocket.
  • Strava: activities downloader (runs, cycling, swimming, etc.) and analysis from Strava.
  • Todoist: task tracking and analysis of todos and completed-task history from the Todoist app.
  • Toggl: time tracking and analysis of manual timelog entries from Toggl.
  • WordCounter: extract WordCounter app history and visualize word counts over recent periods.


How to use this project: Installation and Setup Locally

Until we provide a working version for Google Colab or other online Jupyter notebook setups, we recommend getting started by downloading and using the Anaconda Distribution, which is free and open source. This gives you a local working installation of NumPy, Pandas, Jupyter Notebook, and other Python data science tools.

After installation, we recommend creating and activating a virtual environment, either with Anaconda or manually:

python3 -m venv ~/.virtualenvs/qs_ledger

source ~/.virtualenvs/qs_ledger/bin/activate

Then clone the current github repo:

git clone https://github.com/markwk/qs_ledger.git

Using your activated virtual environment, install dependencies:

pip install -r requirements.txt

Then navigate into the project directory and launch an individual notebook or the full project with jupyter notebook or jupyter lab:

jupyter lab

Code Organization

Best practices and organization are still a work in progress, but in general:

  • Each integration has a NAME_downloader and a NAME_data_analysis notebook.
  • Some projects include a helper function for data pulling.
  • Optionally, some projects have useful notebooks for specific use cases, like weekly reviews.

Useful Shortcuts

You can run Jupyter notebooks directly from the command line and, in the case of Papermill, pass parameters:

With nbconvert:

  • pip install nbconvert
  • jupyter nbconvert --to notebook --execute --inplace rescuetime/rescuetime_downloader.ipynb

With Papermill:

  • pip install papermill

  • papermill rescuetime_downloader.ipynb data/output.ipynb -p start_date '2019-08-14' -p end_date '2019-10-14'

  • NOTE: You first need to parameterize your notebook in order to pass parameters in from the command line.
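To parameterize a notebook, tag one cell with the parameters tag and give its variables defaults; Papermill then injects a new cell right after it that overrides those values with the -p flags. A minimal sketch of such a cell (the variable names mirror the flags used above):

```python
# Contents of the cell tagged "parameters" (in Jupyter: View > Cell
# Toolbar > Tags, then add the tag "parameters"). These are defaults;
# papermill injects a new cell after this one overriding them.
start_date = '2019-08-14'
end_date = '2019-10-14'

print(start_date, end_date)
```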

Creators and Contributors:

Want to help? Fork the project and provide your own data analysis, integration, etc.

Questions? Bugs? Feature Requests? Need Support?

Post a ticket in the QS Ledger Issue Queue

qs_ledger's People

Contributors

markwk, mbbroberg, michalszczecinski


qs_ledger's Issues

How to get started?

Hey all, I'm hoping to get this project up and running, and I'm starting from a fresh macOS environment. I don't yet understand the toolchain necessary to do so.

Notes:

  • I'm running Python 3.7, with python aliased to python3.7 and pip aliased to pip3

I'm specifically:

  1. Beginning by installing Anaconda, which installed successfully
    • verification on the CLI using conda -V shows conda 4.6.11
    • from the GUI of Anaconda-Navigator.app
    • I also installed PyCharm with the Anaconda plugin
  2. Starting from the Todoist downloader, I run pip install todoist-python
    • no errors on installation
  3. I set up my credentials in credentials.json
  4. I then try to view todoist_downloader.ipynb
    • in PyCharm, it only shows a JSON object due to it being the community edition
    • through Anaconda-Navigator, I launch Jupyter Notebook, choose the todoist_downloader.ipynb file, then get stuck.
  5. Jupyter Notebook says ModuleNotFoundError: No module named 'todoist' when I attempt to run it.
    • conda install todoist doesn't work, but I did re-run pip install todoist-python to verify it was installed and used pip list | grep todoist to verify it was there.

What's the best path forward? I'm out of my Python depths 🐍 😄
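A common cause of this symptom (a guess, but a frequent one with Anaconda plus aliased pip) is that pip installed the package into a different Python environment than the one the Jupyter kernel runs in. A quick way to see which interpreter the kernel uses, and to install against it:

```python
import sys

# The interpreter the running kernel uses; if the `pip` on your shell
# PATH belongs to a different Python, packages it installs won't be
# importable here.
print(sys.executable)

# Inside a notebook cell, installing via the kernel's own interpreter
# sidesteps the mismatch (not executed here):
#     !{sys.executable} -m pip install todoist-python
```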

Error on running Todoist analysis

I'm working from the Jupyter Notebook launched by Anaconda Navigator and I'm running each section one-by-one. I pulled the data using your script at https://github.com/markwk/todoist_export. After importing and running the other earlier tasks, I then grab the data from that location.

tasks = pd.read_csv("/Users/mbbroberg/Develop/todoist_export/data/todost-tasks-completed.csv")
len(tasks)

Once I reach year_data = tasks['year'].value_counts().sort_index(), I receive the error below. When I look at the tasks object, I don't see a column for year (see screenshot). Is it possible I'm looking at the wrong data or am I missing something?

[Screenshot: the tasks DataFrame has no 'year' column]

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/anaconda3/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2656             try:
-> 2657                 return self._engine.get_loc(key)
   2658             except KeyError:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'year'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-24-a30ed33e8ded> in <module>
----> 1 year_data = tasks['year'].value_counts().sort_index()

/anaconda3/lib/python3.7/site-packages/pandas/core/frame.py in __getitem__(self, key)
   2925             if self.columns.nlevels > 1:
   2926                 return self._getitem_multilevel(key)
-> 2927             indexer = self.columns.get_loc(key)
   2928             if is_integer(indexer):
   2929                 indexer = [indexer]

/anaconda3/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2657                 return self._engine.get_loc(key)
   2658             except KeyError:
-> 2659                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2660         indexer = self.get_indexer([key], method=method, tolerance=tolerance)
   2661         if indexer.ndim > 1 or indexer.size > 1:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'year'
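For reference, the analysis expects a year column derived from a parsed date column; if that derivation step was skipped, the KeyError above follows. A hedged sketch of deriving it with pandas (the sample data and the column name completed_date are assumptions; substitute whatever date column your CSV actually has):

```python
import pandas as pd

# Hypothetical miniature of the completed-tasks export; the real file's
# date column may be named differently (e.g. 'completed_date' here).
tasks = pd.DataFrame({
    'content': ['task a', 'task b', 'task c'],
    'completed_date': ['2019-01-05', '2019-03-14', '2018-12-31'],
})

# Derive the 'year' column the analysis notebook expects.
tasks['completed_date'] = pd.to_datetime(tasks['completed_date'])
tasks['year'] = tasks['completed_date'].dt.year

year_data = tasks['year'].value_counts().sort_index()
print(year_data)
```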

strava - authorization flow change

hi Mark, thanks for sharing the Strava data downloader code. All works smoothly until I try to pull activities. I believe they have changed the API settings, and a newly generated general token is not able to access activities (it has only the read scope, whereas the read_all scope is required). You can read more about this here:

https://developers.strava.com/docs/oauth-updates/

and here is a Stack Overflow post with details of a potential solution:
https://stackoverflow.com/questions/52880434/problem-with-access-token-in-strava-api-v3-get-all-athlete-activities

Also, stravalib seems to include some code that should help with this:
https://github.com/hozn/stravalib#authentication

Thanks!
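For reference, a hedged sketch of the updated flow: request the activity:read_all scope during authorization, then exchange the returned code for an access token. CLIENT_ID and REDIRECT_URI below are placeholders for your own Strava API application's values, not values from this repo:

```python
from urllib.parse import urlencode

# Placeholders; use your own Strava API application's values.
CLIENT_ID = '12345'
REDIRECT_URI = 'http://localhost/exchange_token'

# Step 1: open this URL in a browser and approve access. The
# activity:read_all scope is what the updated API requires for
# listing activities.
auth_url = 'https://www.strava.com/oauth/authorize?' + urlencode({
    'client_id': CLIENT_ID,
    'response_type': 'code',
    'redirect_uri': REDIRECT_URI,
    'approval_prompt': 'auto',
    'scope': 'activity:read_all',
})
print(auth_url)

# Step 2: exchange the `code` from the redirect for an access token
# (requires the requests package; not executed here):
# resp = requests.post('https://www.strava.com/oauth/token', data={
#     'client_id': CLIENT_ID, 'client_secret': '<secret>',
#     'code': '<code>', 'grant_type': 'authorization_code'})
# access_token = resp.json()['access_token']
```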

interpreting apple watch timestamps

Hi!

I am using this awesome work to look at my health data, and I have a question about how to interpret the timestamps/datetimes.

If I open the CSV file I am interested in, I see that the creationDate of the first row is 2017-01-28 10:20:52 +0200, while if I read the file with pandas.read_csv(...) using the parse_dates argument, the corresponding value is 2017-01-28 08:20:52.

In the blog post it says that the data contain UTC timestamps, and that's ok; my problem is more about the timezone info. This summer I spent August in the US (I live in Europe), but I don't see this reflected in the data. Here is what the creationDate of a row on August 8th looks like: 2019-08-08 02:03:20 +0200.

Shall I assume that the timestamp (i.e., the pure date and time, discarding the timezone info) is still UTC, and that I have to figure out the daily time zone myself? Or what else?
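A small illustration of what pandas does here (the CSV snippet below is a made-up miniature of the export): parsing with utc=True turns each offset-stamped string into an unambiguous UTC instant, and any local-time view has to be supplied by you, since the export's fixed offset doesn't track travel:

```python
from io import StringIO

import pandas as pd

# Made-up miniature of the export: creationDate carries a fixed UTC
# offset, which does not change when you travel.
csv_text = (
    "creationDate,value\n"
    "2017-01-28 10:20:52 +0200,61\n"
    "2019-08-08 02:03:20 +0200,64\n"
)
df = pd.read_csv(StringIO(csv_text))

# utc=True converts every offset-stamped string to an unambiguous
# UTC instant.
df['creationDate'] = pd.to_datetime(df['creationDate'], utc=True)
print(df['creationDate'].iloc[0])  # 2017-01-28 08:20:52+00:00

# A local-time view has to be supplied explicitly, e.g. for the US trip:
df['local'] = df['creationDate'].dt.tz_convert('America/New_York')
```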

I wonder if there is a bug in apple-health-data-parser.py?

qs_ledger is really helpful. It would be hard for me to use the XML version

I wonder if I found a bug in apple-health-data-parser.py. I noticed that in HeartRate.csv the header had 8 fields while the data rows had 14; tools like Unix cut fail to parse it.

I looked at the XML, and I think this is what a heart rate record looks like. It looks like Apple includes ',' inside the value strings. Once I figured this out, I was able to work around it.

<Record type="HKQuantityTypeIdentifierHeartRate" sourceName="andrew e.’s Apple Watch" sourceVersion="5.1.3" device="&lt;&lt;HKDevice: 0x281e21c20&gt;, name:Apple Watch, manufacturer:Apple, model:Watch, hardware:Watch4,4, software:5.1.3&gt;" unit="count/min" creationDate="2019-07-29 16:13:30 -0800" startDate="2019-07-29 16:10:29 -0800" endDate="2019-07-29 16:10:29 -0800" value="53">

This is the Header from the HeartRate.csv file

sourceName,sourceVersion,device,type,unit,creationDate,startDate,endDate,value

This is a record from HeartRate.csv. I broke it up to figure out what was going on:

"andrew e.’s Apple Watch",
"5.1.3",
"<<HKDevice: 0x281e36490>, name:Apple Watch, manufacturer:Apple, model:Watch, hardware:Watch4,4, software:5.1.3>",
"HeartRate",
"count/min",
2019-07-29 15:28:48 -0800,
2019-07-29 15:26:47 -0800,
2019-07-29 15:26:47 -0800,
62

Andy
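For what it's worth, the extra commas sit inside a quoted field, so quote-aware CSV readers parse the row correctly while plain comma-splitting (e.g. unix cut) shreds it. A small demonstration with Python's csv module, using a record like the one above:

```python
import csv
from io import StringIO

# A record like the one above: the device field is quoted and
# legitimately contains commas.
row = ('"andrew e.’s Apple Watch","5.1.3",'
       '"<<HKDevice: 0x281e36490>, name:Apple Watch, manufacturer:Apple, '
       'model:Watch, hardware:Watch4,4, software:5.1.3>",'
       '"HeartRate","count/min",'
       '2019-07-29 15:28:48 -0800,'
       '2019-07-29 15:26:47 -0800,'
       '2019-07-29 15:26:47 -0800,'
       '62')

naive = row.split(',')                     # shreds the quoted device field
fields = next(csv.reader(StringIO(row)))   # respects the quoting
print(len(naive), len(fields))
```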

import dashboard to Kibana issue

Hi @markwk, nice work! I have an issue with importing apple_health/apple_health_elastic_dashboard into Kibana:
[Screenshot: Kibana import error dialog]

There is an error in the dev console.
Kibana version 6.8.15
Elastic: 7.12.0

Using iOS 16.6, health data extraction does not work

Having the discussed problems with iOS 16.2 in mind, and your ideas for a workaround, I've tried apple_health-extractor.ipynb.

During the reading/parsing procedure you get the error message "Unexpected node of type Correlation" after a few seconds.

None of the subsequent notebooks works; this error is the showstopper.

apple_health_data2elastic NOT WORKING :(

Hi everyone, I'd appreciate your help. I tried to run this code in JupyterLab, and it is not working. The main issue is in cell [18]:

# Create Customized Index Mappings     
es.indices.put_mapping(index=INDEX, doc_type=TYPE, body=d, include_type_name=True)  

Apparently elasticsearch 8.1 made some changes, and there are issues with the arguments to put_mapping.
I uninstalled 8.1 and installed Elasticsearch 7.1, and now there is a ConnectionError.

Dear Mark, could you please update the file? Thank you so much!
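Until the notebook is updated, a hedged sketch of what the call might look like against elasticsearch-py 8.x, where mapping types (doc_type) were removed and put_mapping takes the field definitions via properties. The index name and field mappings below are assumptions, not the notebook's actual values:

```python
# Sketch for elasticsearch-py 8.x, where mapping types (doc_type) were
# removed. INDEX and the field mappings here are assumptions.
INDEX = 'apple_health'

properties = {
    '@timestamp': {'type': 'date'},
    'type': {'type': 'keyword'},
    'value': {'type': 'float'},
}

# With a running cluster and an 8.x client (not executed here):
# from elasticsearch import Elasticsearch
# es = Elasticsearch('http://localhost:9200')
# es.indices.put_mapping(index=INDEX, properties=properties)
print(INDEX, sorted(properties))
```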

apple health: parse error after upgrading to iOS 16

Upgrading to iOS 16 apparently also updated the HealthKit protocol:
HealthKit Export Version: 11 to HealthKit Export Version: 12

Parsing export.xml now throws a parse error:

File "/.../qs_ledger/apple_health/apple-health-data-parser.py", line 118, in __init__
    self.data = ElementTree.parse(f)
File "/home/usr/.pyenv/versions/3.10.4/lib/python3.10/xml/etree/ElementTree.py", line 1229, in parse
    tree.parse(source, parser)
File "/home/usr/.pyenv/versions/3.10.4/lib/python3.10/xml/etree/ElementTree.py", line 580, in parse
    self._root = parser._parse_whole(source)
**xml.etree.ElementTree.ParseError: syntax error: line 156, column 0**

The problem has already been reported and seems to be ignored by Apple:
problem with import of XML Apple HealthKit Export Version: 12
Any thoughts on the suggested workaround with "patch.txt"?
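One workaround that has been suggested for the Export Version 12 problem (untested here, and hedged accordingly) is to strip the malformed inline DTD before handing the XML to ElementTree. A sketch:

```python
import re
import xml.etree.ElementTree as ElementTree
from io import StringIO

def strip_dtd(xml_text: str) -> str:
    # Drop the inline DOCTYPE declaration, internal subset included.
    return re.sub(r'<!DOCTYPE[^[]*\[.*?\]>', '', xml_text, flags=re.DOTALL)

# Tiny stand-in document with an internal DTD subset (the real export is
# far larger, and its DTD is reportedly what trips the parser):
sample = ('<?xml version="1.0"?>\n'
          '<!DOCTYPE HealthData [\n'
          '<!ELEMENT HealthData ANY>\n'
          ']>\n'
          '<HealthData/>')

tree = ElementTree.parse(StringIO(strip_dtd(sample)))
print(tree.getroot().tag)  # HealthData
```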

Apple Health Extractor not working?

Hi!

Might be a mistake I'm making, but the extractor doesn't seem to be working. I keep getting

FileNotFoundError Traceback (most recent call last)

whenever I try to run the extractor.

Not sure where I'm going wrong?

I did notice a note earlier in the file that says:

NOTE: Currently there are a few minor errors based on additional data from Apple Health that require some updates.

Any idea where I'm going wrong? I'm a bit of a noob so could be user error!

thanks!

Tom
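A FileNotFoundError at this stage usually just means the path the notebook expects doesn't point at your export file. A tiny pre-flight check; data/export.xml is an assumption, so match it to the path variable set near the top of the extractor notebook:

```python
from pathlib import Path

# 'data/export.xml' is an assumed location; match it to the path variable
# the extractor notebook actually uses.
export_path = Path('data/export.xml')
if not export_path.exists():
    print(f"Missing {export_path}: copy export.xml from your Apple Health "
          "export into place before running the extractor.")
```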
