Fairhaven, The River


There's meteorology in everything?

A bunch of googling and Stack Exchange reading took me to the many discussions of using R for meteorological variables.  That's familiar territory.  It's also what is needed to transform the raw Fitbit 5-minute data into hourly and daily data.  After all that reading, the magic command is:

    daily <- aggregate( df["steps"], list( cut( df$Date, "1 day")), sum)

Now, having looked at various plots and statistical summaries of the result, I'm confident that I've got the right data, but I'm not sure what to do with it.  This means I let it sit for a while and think about it.

My first reaction was that this looked like a very inefficient solution to the problem.  It might be.  But it only took about a second to process 213K 5-minute entries into 740 daily entries.
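
The aggregation can be tried without any Fitbit data.  What follows is a minimal sketch on synthetic input: the column names "Date" and "steps" follow the command above, while the timestamps, the UTC time zone, and the step values are made up for illustration.  The hourly line is an assumed variation on the daily command, not something taken from the original script.

```r
# Synthetic stand-in for the Fitbit data: two days of 5-minute bins.
# Values are invented; only the column names follow the command above.
times <- seq(as.POSIXct("2016-01-01 00:00", tz = "UTC"),
             by = "5 min", length.out = 2 * 288)
df <- data.frame(Date = times,
                 steps = rep(c(0, 10), length.out = length(times)))

# cut() assigns each timestamp to a day (or hour) bin,
# and aggregate() sums the steps within each bin.
daily  <- aggregate(df["steps"], list(day  = cut(df$Date, "1 day")),  sum)
hourly <- aggregate(df["steps"], list(hour = cut(df$Date, "1 hour")), sum)

print(daily)
print(head(hourly))
```

The same pattern should work for other bin widths that cut() accepts for date-times, such as "1 week".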

January 03, 2016 | Permalink | Comments (0)

Tweaking the bar chart

I wanted the bar chart to be a combination of stacked bars for the individual meals, and then a smoothed trendline of the total carbs for the day.  The original data has columns for date, each meal, and a final total.  Using "subset" to control the columns, the following R code does this:

library(ggplot2)
library(plyr)
library(reshape2)   # provides melt()

# "data" is the wide carbs table: Date, one column per meal, and Total
df <- melt( data, "Date")
df$Date <- as.Date( df$Date, "%Y-%m-%d")
colnames( df) <- c( "Date", "Meal", "carbs")

meals <- c( "Breakfast", "Lunch", "Snack1", "Snack2", "Dinner")
plot <- ggplot( data=subset( df, Meal %in% meals), aes( x=Date, y=carbs, fill=Meal)) +
  geom_bar( stat="identity") +                                  # stacked bars, one segment per meal
  geom_smooth( data=subset( df, Meal %in% c("Total")), n=10) +  # smoothed trend of the daily total
  scale_fill_brewer( palette="Set1")
print( plot)

The result is:

[Image: stacked bar chart of carbs per meal, with a smoothed trendline of the daily total]

I had to do some reading to get this right, since googling found no examples to copy.

January 01, 2016 | Permalink | Comments (0)

Tracking carbs for diet management

At the end of the year, I've got a working carbs management process for my needs.  Data acquisition is primarily manual.  I may take pictures of meals.  Or I may record it immediately.  Most often I look at pictures or remember meals, manually figure out grams of carbohydrates, and create a simple table.  This table is processed through R, and graphically displayed.  I look at the graphics to decide whether I need to change something, and as a reward for seeing things stay on track.

The R code is small:

colnames( df) <- c( "Date", "Meal", "carbs")
plot <- qplot( Date, carbs, data=df, shape=Meal, color=Meal, ylab="Carbohydrates (gm)") + geom_bar( stat="identity")

which generates a nice picture:

[Image: plot of carbohydrates by date and meal]

From this you can see that I usually have the same thing every day for breakfast.  Day to day variability is pretty high, but I'm holding daily carbs satisfactorily.  The running average is somewhere between 100 and 150 gms daily.

December 31, 2015 | Permalink | Comments (0)

Fitbit and R

Using these tools for gathering data from my Fitbit I've now got a little over 2 years worth of CSV files to analyze.  I've only made a little slow progress due to other time commitments.  What I know so far:

  • The data is kept in a directory per day per category.  These CSV files lack a trailing newline.  A little bit of awk took care of the newlines and combining all that data into annual files.
  • Reading and plotting the two years of data (about 200,000 individual data points) took two seconds on my computer.  That's plenty fast enough.
  • 5 minute data isn't too useful without further processing.  I'm still experimenting with what form of averaging, summarization, etc. makes sense.  I'm considering two kinds of summarization as a first step.  First, create a CSV of daily results, and second eliminate all the "zero" step periods.  Those may reveal something useful.
  • It turns out that R has all sorts of interesting time series analysis work that comes out of the financial community.  I should have realized that they would be intensively involved in this.
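
The two summarization steps mentioned in the list can be sketched as follows.  This is an illustration on synthetic data, not the script actually used: the column names "Date" and "steps", the UTC time zone, the invented step values, and the temporary output file are all assumptions.

```r
# Synthetic 5-minute data for three days; "Date" and "steps" are assumed names.
times <- seq(as.POSIXct("2015-12-01 00:00", tz = "UTC"),
             by = "5 min", length.out = 3 * 288)
steps <- rep(0, length(times))
steps[c(100, 101, 400, 700)] <- c(50, 70, 30, 20)   # a few made-up active periods
df <- data.frame(Date = times, steps = steps)

# Step 1: daily totals, written out as a CSV (a temporary file here).
df$day <- as.Date(df$Date, tz = "UTC")
daily <- aggregate(steps ~ day, data = df, FUN = sum)
daily_file <- tempfile(fileext = ".csv")
write.csv(daily, daily_file, row.names = FALSE)

# Step 2: drop every 5-minute period that recorded zero steps.
active <- df[df$steps > 0, ]

print(daily)
print(nrow(active))
```

Keeping the time zone explicit matters for real data, since the day boundaries depend on it.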

December 19, 2015 | Permalink | Comments (0)

Orgmode tables vs CSV files for R

Org-mode tables may be very convenient for data entry and routine use, but the interface to R is very inefficient.  What I've found is that when org-mode converts a 1,000 line table into a data table variable for R it consumes 15 seconds on my computer.  Reading a 200,000 line table from a CSV file for the same purpose takes less than a second when done directly by R.
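
The CSV side of that comparison is easy to check on any machine.  Here is a rough sketch that assumes a made-up 200,000-row table with "Date" and "steps" columns written to a temporary file; the timing it prints will differ from computer to computer.

```r
# Build a made-up 200,000-row table shaped like 5-minute step data.
n <- 200000
fake <- data.frame(
  Date  = format(seq(as.POSIXct("2013-10-01 00:00", tz = "UTC"),
                     by = "5 min", length.out = n),
                 "%Y-%m-%d %H:%M:%S"),
  steps = rep(c(0, 5, 12, 0), length.out = n)
)
csv_file <- tempfile(fileext = ".csv")
write.csv(fake, csv_file, row.names = FALSE)

# Time the read; "elapsed" is wall-clock seconds and varies by machine.
timing <- system.time(df <- read.csv(csv_file, stringsAsFactors = FALSE))
print(timing["elapsed"])
```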

I don't plan to change my current processes.  I can easily tolerate a 15 second delay for data analysis in return for the ease of editing.  But when the data is gathered automatically, then it will be in the form of CSV files.  The 200,000 line table is my past 2+ years of fitbit data, preserved with 5 minute data binning.

Next step is figuring what analysis and presentation is useful for the fitbit data.  The simple data plot confirms that I've got all the raw data and it's all reasonable looking.  But it doesn't reveal much in terms of patterns or trends.  Summarization and other processing will be needed.

December 12, 2015 | Permalink | Comments (0)

Org-mode, R, and graphics

This weekend I decided to switch my diet and glucose tracking from a combination of org-mode and excel to org-mode, R, and ggplot (R-graphics).  Why?  Because R is cool, it does way superior statistics, and it has nicer graphics.  R seems to have hit the sweet spot between ease of programming, good defaults, and rigorous engineering.  It's a full and proper vector oriented programming language that's easy to use and well organized for the kinds of problems that typically arise in statistics.  It's impressive that this problem is solved with ten lines of code.

I already use org-mode's simple table editing system.  I've got an ascii table for carbs each day:

#+TBLNAME:carbs
  |       Date | Breakfast | Snack1 | Lunch | Snack2 | Dinner | Total |
  |------------+-----------+--------+-------+--------+--------+-------|
  | 2015-10-20 |        25 |        |    45 |        |     56 |   126 |
  | 2015-10-21 |        45 |        |    75 |        |     20 |   140 |
  | 2015-10-22 |        90 |        |    50 |        |     65 |   205 |
  | 2015-10-23 |        95 |        |    50 |        |     75 |   220 |
  | 2015-10-24 |        35 |        |       |        |     62 |    97 |
  | 2015-10-25 |        25 |        |    95 |        |     64 |   184 |

It's got a name, "carbs", then columns of data.  Org-mode does various useful things that make it easy for me to update this.  At the end of the file I put in this code:


#+begin_src R :file 1.png :results graphics :var data=carbs :width 800 :height 600

library(ggplot2)
library(plyr)
library(reshape2)
df <- melt( data, "Date")
df$Date <- as.Date(df$Date, "%Y-%m-%d")
colnames( df) <- c( "Date", "Meal", "carbs")
plot <- qplot( Date, carbs, data=df, shape=Meal, color=Meal, ylab="Carbohydrates (gm)")
print(plot)
#+end_src

#+RESULTS: carbs_analysis
[[file:1.png]]

This means:

It starts the source of an R program.  The R program will put its output in 1.png, also inline as graphics in the emacs window, it assigns the R variable "data" to have the contents of the table named "carbs", and it sets width and height of the PNG file.  The interface from org-mode to R is this easy to use.  It takes that table (one of which is thousands of lines long) and converts it into the R data table variable named "data".  Part of why R is so popular is that it understands tables.

Then it specifies the three libraries that will be needed.

Then it creates a new data table "df" that is melted from the original data table.  qplot and ggplot work best where there is one data point per line.  So "melt" converts each individual row in data into multiple rows.  This form tells it to preserve the "Date" column into every row.  It then automatically puts the column name, e.g. Lunch, as the value in a second column and the cell contents into the third column.  Then I rename the columns of "df" to be "Date", "Meal", and "carbs".  Finally an almost uncustomized qplot(), and I get the resulting picture.
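
To make the wide-to-long conversion concrete, here is a small base R sketch that builds by hand the same shape of table that melt produces.  It is an illustration of the idea, not the reshape2 implementation; it uses two rows from the carbs table above and leaves out the empty snack columns and the Total column.

```r
# Two rows taken from the carbs table above.
wide <- data.frame(
  Date      = as.Date(c("2015-10-20", "2015-10-21")),
  Breakfast = c(25, 45),
  Lunch     = c(45, 75),
  Dinner    = c(56, 20)
)

# One row per (Date, Meal) pair, with the cell value in a third column.
meals <- setdiff(names(wide), "Date")
long <- data.frame(
  Date  = rep(wide$Date, times = length(meals)),  # Date repeated for every meal column
  Meal  = rep(meals, each = nrow(wide)),          # former column name becomes a value
  carbs = unlist(wide[meals], use.names = FALSE)  # cell contents, column by column
)
print(long)
```

Each wide row with three meals becomes three long rows, which is the one-data-point-per-line layout that qplot and ggplot expect.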

[Image: scatter plot of carbohydrates by date, with shape and color by meal]

Not bad for pure default behavior.

The next step was to deal with my four years of glucose measurements.  These I categorize into before breakfast, before lunch, and before bed.  It's always been hard to remember before dinner, so those occasional ones got lumped into before lunch.  I did a little ggplot adjusting, in particular the plot command.

plot <- qplot( Date, glucose, data=df, shape=Time, color=Time, ylab="glucose mg/dl") + geom_smooth( n=80) +scale_color_manual( values= c( "blue", "green", "red"))

[Image: glucose measurements by date, colored by time of day, with smoothed trend lines]

December 10, 2015 | Permalink | Comments (0)

Org-mode dates and R dates

A small update.  Dealing with the past four years of glucose data I found another little trick needed when integrating with org-mode.

The date format in org-mode is [2015-12-03 Thu] for inactive dates, and <2015-12-03 Thu> for active dates.  R doesn't know what to do with the extra characters.  So an extra bit was needed:

    data$Date <- as.Date( gsub( "^.", "", data$Date), "%Y-%m-%d")

That strips off the leading character, and then the date extraction works.  It ignores the extra stuff at the end.
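
A quick way to see the trick working on both date styles is the following sketch.  The first sample string comes from the text above; the second is an assumed active-date example added for illustration.

```r
# One inactive and one active org-mode timestamp.
org_dates <- c("[2015-12-03 Thu]", "<2015-12-04 Fri>")

# gsub() removes the first character; as.Date() then parses the
# leading YYYY-MM-DD and ignores whatever follows it.
parsed <- as.Date(gsub("^.", "", org_dates), "%Y-%m-%d")
print(parsed)
```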

Now I've got four years of glucose data scatter plotted.  It needs some trend lines to eliminate the noise, but it's already made some things obvious.

December 08, 2015 in Web/Tech | Permalink | Comments (0)

Firefox 35 fixes a highly annoying bug introduced in Firefox 34. It had switched to the silly GNOME mouse scroll behavior. This is different than the behavior on Windows or Mac. A left click would jump to position rather than scroll down one page. A right click would scroll down a page. Windows and Mac scroll down a page with left click.

This was highly annoying. It was Firefox bug 803633. The fix is:

Create a $HOME/.themes/Adwaita/gtk-2.0/gtkrc file with the following content:

include "/usr/share/themes/Adwaita/gtk-2.0/gtkrc"
gtk-primary-button-warps-slider = 0

You need that include because the theme includes setting the warps variable. This changes the order of evaluation to include the theme then change the variable. Without the include you change the variable, then the theme is included undoing that work.

January 21, 2015 | Permalink | Comments (0) | TrackBack (0)

November Books Read

  • The World of Yesterday, Stefan Zweig.
    This is an easy read.  It's basically a lament and a memoir of sorts about a remembered lost world.  Zweig grew up in Austria from 1880.  He describes the decline of the world from the stable, safe, free societies of the 19th century, to the suicide of World War One, to the disintegrations of societies between the wars, with the end at the horrors of World War Two.  It's the viewpoint of a safe middle class European intellectual.  He did not get involved in any of the fighting or violence.  He was a successful public author.
    As an Austrian Jew he witnessed the devolution of anti-semitism from being a non-issue before WW I to forcing him to leave Austria as the Nazis gained power.  This is discussed, but is just one small aspect of the disintegration of Europe that he despairs over.
  • Confucious Says - There are no fortune cookies in china, Edward V. Yang.
    Fast fluff of various social do's and don'ts when in China, or dealing with Chinese.
  • China's Management Revolution, Charles Edouard Bouee.
    This is an interesting perspective on their different style of management.  It's a bit of a panegyric and sweeps things like the Cultural Revolution and Tiananmen under the rug as trivial issues.  This is a bit much to take at times.  The "crossing the river" meme is pervasive throughout.
  • Logistics Clusters, Yossi Sheffi.
    Not bad as an introductory book explaining what these are, with examples of how they work, strengths, weaknesses, etc.  I wish it had more in depth analysis of some of the issues.  But the concept is well presented.  Logistics clusters are as significant in the world economy as innovation clusters like Silicon Valley.
  • Rethinking Secularism, Craig Calhoun, et al, editors.
    This is a highly academic series of essays on society, government, religion, and their relationships, current and historical.  It's motivated by the changing assumptions that dominate society.  The belief that religion was a dying irrelevant consideration has changed.  Some of the comparative meanings of "secular" are also interesting.  France and the US are at one extreme of secularism, but even within that extreme they are very different.  The French laïcité expunges religion from public affairs, while the US welcomes religion and expunges religion from government.  Sweden and England are both far less religious yet both have an official state-funded religion, and in England religious leaders are part of the government simply because of their position in the church.
    Each of the essays has a different focus.  The quality varies, but all manage at least average readability.
  • The Future Declassified, Mathew Burrows.
    This discusses the "global trends" long range prediction efforts from the national intelligence community.  They are non-partisan analyses of major trends, significant alternatives, etc.  They present trends, etc. in isolation, and then present a few alternative world scenarios that highlight the combinations that might emerge.  He discusses the components and discusses his current list of trends and gives some scenarios.
    The official NIC documents are also available.  The most recent is Global Trends 2030.
  • Story Wars, Jonah Sachs.
    Really about marketing, advertising and the use of story in marketing and advertising.  Lightweight fast read.  Nothing new or exciting if you've read about Campbell, etc.

December 09, 2014 | Permalink | Comments (0) | TrackBack (0)

October Books Read

During the month of October I finished:

  • The Human side of Post Mortems, by Dave Zweibeck.
    This turns out to be about a couple of software project postmortems. It is not nearly as useful or informative as I expected.
  • Toxic Schools, by Paulle Brown.
    An interesting ethnographic study of two high schools, one in the Bronx and one in Amsterdam. Mr Brown spent a couple of years teaching in each one. These are each arguably the worst schools in their areas. Brown examines the social structures, lives of students, etc. The similarities are extremely strong despite all the cultural differences. Probably the strongest factor in failure for students is the pervasive fear and danger. Their lives are spent dealing with threats. The strongest indicator for escaping the dead end lives that so many face is having some protective adult continuity. A strong parent or strong relative is the usual source. This is not something that can be offered by a teacher. The students need a multi-year experience of having a healthy protective adult. It needs to have the continuity of a healthy person to person relationship.
  • The Incrementalists, Steven Brust
    SF. Disappointing.
  • Anarchy Unbound, Peter T. Leeson
    Interesting analysis of situations which were an anarchy, and how order develops. What were the major factors? What were the results? He argues that the claim "anarchy can't work" is easily disproved. He does not argue the claim that civil government is inferior to anarchy. He shows what anarchy does accomplish.
    The major factors in successful anarchies are the patience of the participants and the extent to which they must maintain long term relationships.
  • Learning From First Responders, Richard Dylan
    Actually about how the IT systems and organizations were structured and tested for the Obama campaign. The contrast between this level of care for organization, goals, testing, and reliability and the complete failure of similar efforts in the Obamacare launch is tremendous.
  • The Shifts and the Shocks, Martin Wolf
    I expected better given the high quality of his lectures and newspaper articles. The information content is very good. The book style and organization was mediocre. Still worth reading. History and commentary on the whole economic situation since 2005.

December 09, 2014 in Books | Permalink | Comments (0) | TrackBack (0)
