This weekend I decided to switch my diet and glucose tracking from a combination of org-mode and Excel to org-mode, R, and ggplot (R graphics). Why? Because R is cool, it does far better statistics, and it has nicer graphics. R seems to have hit the sweet spot between ease of programming, good defaults, and rigorous engineering. It's a full and proper vector-oriented programming language that's easy to use and well organized for the kinds of problems that typically arise in statistics. It's impressive that this problem is solved with ten lines of code.
I already use org-mode's simple table editing system. I've got an ASCII table for carbs each day:
#+TBLNAME:carbs
| Date | Breakfast | Snack1 | Lunch | Snack2 | Dinner | Total |
|------------+-----------+--------+-------+--------+--------+-------|
| 2015-10-20 | 25 | | 45 | | 56 | 126 |
| 2015-10-21 | 45 | | 75 | | 20 | 140 |
| 2015-10-22 | 90 | | 50 | | 65 | 205 |
| 2015-10-23 | 95 | | 50 | | 75 | 220 |
| 2015-10-24 | 35 | | | | 62 | 97 |
| 2015-10-25 | 25 | | 95 | | 64 | 184 |
It's got a name, "carbs", then columns of data. Org-mode does various useful things that make it easy for me to update this. At the end of the file I put in this code:
#+begin_src R :file 1.png :results graphics :var data=carbs :width 800 :height 600
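library(ggplot2)
library(plyr)
library(reshape2)
# Reshape to long form: one row per (Date, column) pair, with Date kept as the id column
df <- melt( data, "Date")
# The dates arrive as text, so convert them to real Date values for the x axis
df$Date <- as.Date(df$Date, "%Y-%m-%d")
colnames( df) <- c( "Date", "Meal", "carbs")
# Scatter plot of carbs over time, one shape and color per meal
plot <- qplot( Date, carbs, data=df, shape=Meal, color=Meal, ylab="Carbohydrates (gm)")
print(plot)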
#+end_src
#+RESULTS: carbs_analysis
[[file:1.png]]
This means:
The first line starts the source of an R program. The R program will put its output in 1.png, which also shows up inline as graphics in the Emacs window. It assigns the R variable "data" the contents of the table named "carbs", and it sets the width and height of the PNG file. The interface from org-mode to R is this easy to use. It takes that table (one of mine is thousands of lines long) and converts it into an R data frame, R's table type, named "data". Part of why R is so popular is that it understands tables.
Then it specifies the three libraries that will be needed.
Then it creates a new data frame "df" that is melted from the original one. qplot and ggplot work best when there is one data point per row, so "melt" converts each individual row in data into multiple rows. This form of the call tells it to carry the "Date" column into every row. It then automatically puts the column name, e.g. Lunch, as the value in a second column and the cell contents into the third column. Then I rename the columns of "df" to be "Date", "Meal", and "carbs". Finally an almost uncustomized qplot(), and I get the resulting picture.
Not bad for pure default behavior.
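To see the reshaping step in isolation, here is a minimal sketch that runs outside org-mode. It assumes reshape2 is installed; the data frame "wide" is a hand-typed stand-in for the first two rows of the carbs table (empty snack columns and Total left out), and the names "wide" and "long" are only for illustration. It also names the new columns inside melt() instead of renaming them afterwards.

```r
library(reshape2)

# Hand-typed stand-in for two rows of the carbs table (illustration only).
wide <- data.frame(
  Date      = c("2015-10-20", "2015-10-21"),
  Breakfast = c(25, 45),
  Lunch     = c(45, 75),
  Dinner    = c(56, 20),
  stringsAsFactors = FALSE
)

# Keep Date as the id column; every other column becomes (Meal, carbs) rows.
long <- melt(wide, id.vars = "Date",
             variable.name = "Meal", value.name = "carbs")

# Convert the text dates to Date values, as in the block above.
long$Date <- as.Date(long$Date, "%Y-%m-%d")

print(long)
```

Two rows with three meal columns come out as six rows of Date, Meal, and carbs, which is the one-point-per-row shape that qplot wants.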
The next step was to deal with my four years of glucose measurements. I categorize these as before breakfast, before lunch, and before bed. It's always been hard to remember to measure before dinner, so those occasional readings got lumped into before lunch. I did a little ggplot adjusting, in particular to the plot command:
plot <- qplot( Date, glucose, data=df, shape=Time, color=Time, ylab="glucose mg/dl") +
  geom_smooth( n=80) +
  scale_color_manual( values= c( "blue", "green", "red"))
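For anyone who wants to try that command without a glucose table of their own, here is a rough, self-contained sketch. The readings are made-up placeholder numbers, not real measurements, and the column names Date, Time, and glucose are simply taken from the call above. It spells the plot out with ggplot() rather than qplot(), since qplot() is deprecated as of ggplot2 3.4.0; treat it as one possible equivalent, not the original code.

```r
library(ggplot2)

# Made-up placeholder readings, NOT real measurements: 30 days x 3 times of day.
set.seed(1)
times <- c("before breakfast", "before lunch", "before bed")
dates <- seq(as.Date("2015-10-01"), by = "day", length.out = 30)
df <- data.frame(
  Date = rep(dates, times = length(times)),
  Time = factor(rep(times, each = length(dates)), levels = times)
)
df$glucose <- round(rnorm(nrow(df), mean = 110, sd = 15))

# Same aesthetics as the qplot() call, written with ggplot().
p <- ggplot(df, aes(x = Date, y = glucose, shape = Time, color = Time)) +
  geom_point() +
  geom_smooth(n = 80) +   # one smoothed curve per Time group
  scale_color_manual(values = c("blue", "green", "red")) +
  ylab("glucose mg/dl")

print(p)
```

The colors in scale_color_manual() are handed out in the order of the factor levels of Time, so setting the levels explicitly, as the sketch does, controls which category gets which color.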