gedankenstuecke @ git

fed up with my S9Y-installation

Stepcounting: Fitbit vs. Fuelband

About two weeks ago I posted a comparison of the step data I’ve collected through the Fitbit and the Nike+ Fuelband over a timeframe of about two months on the QuantifiedSelf blog. For those of you who haven’t read it, the tl;dr is: On most days the Fuelband counts far fewer steps than the Fitbit does and, contrary to my initial belief, the recent firmware updates of the Fuelband didn’t help at all to bring down the difference between the two devices. Ernesto on the other hand found that both counts are quite close, while Austin’s results are quite similar to mine.

Reading those posts I noticed one difference in the way data was collected: While Austin & I wore the Fuelband on our dominant wrists, Ernesto wore his on the non-dominant arm. My theory was that the algorithms applied in the Fuelband might explain the difference. The Fuelband does not only count steps but also awards Fuel points for activities which are not steps. So it could be that the algorithm discards valid steps if you move your dominant arm around too much while walking. Fortunately there’s an easy way to test this: just wear the Fuelband on the other wrist for a couple of days.

So I put the Fuelband onto my non-dominant wrist for two weeks and did the same comparison as for the Quantified Self posting. The graph is quite simple: The x-axis gives the different days, the y-axis gives the difference between the Fitbit and the Fuelband. On days with positive values the Fitbit counted more steps, on days with negative values the Fuelband counted more steps. Blue bars are days on which the Fuelband was worn on the dominant wrist while orange bars are days on which the Fuelband was worn on the non-dominant wrist.

Just by looking at the graph it seems clear that this doesn’t really make a difference. And after doing a standard t-test it’s also clear that there’s no statistically significant difference between the two conditions. So: Nope, changing your wrist will not make your Fuelband count more steps. But…

There’s a clear difference in Fuel points awarded for each step between the two conditions. The y-axis now shows the ratio of steps to Fuel points. With the Fuelband on the non-dominant arm it takes more steps to earn a single Fuel point: worn on the dominant arm, the Fuelband on average awards a Fuel point after 3.2 steps, while on the non-dominant arm it takes 4.5 steps (according to the t-test this is a statistically significant difference).
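For anyone who wants to redo this kind of comparison, here is a minimal sketch of how the steps-per-Fuel-point ratio and a two-sample t statistic could be computed. It assumes one `(steps, fuel)` pair per day, uses Welch's variant of the t-test (the post only says "t-test", so that choice is an assumption), and all the numbers are invented for illustration. The function names are hypothetical too.

```python
from math import sqrt
from statistics import mean, variance

def steps_per_fuel(days):
    """Steps needed per Fuel point for each (steps, fuel) day."""
    return [steps / fuel for steps, fuel in days if fuel > 0]

def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of sample a
    vb = variance(b) / len(b)  # squared standard error of sample b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Made-up (steps, Fuel points) pairs for illustration, NOT the data from this post.
dominant = [(6400, 2000), (8000, 2500), (9900, 3000), (7000, 2200)]
non_dominant = [(9000, 2000), (11000, 2500), (8800, 2000), (9200, 2000)]

dom_ratio = steps_per_fuel(dominant)
non_ratio = steps_per_fuel(non_dominant)
t, df = welch_t(dom_ratio, non_ratio)

print(f"dominant: {mean(dom_ratio):.2f} steps per Fuel point")
print(f"non-dominant: {mean(non_ratio):.2f} steps per Fuel point")
print(f"t = {t:.2f}, df = {df:.1f}")
```

Turning the t statistic into a p-value needs the t distribution, which the Python standard library does not ship. A statistics package such as SciPy (`scipy.stats.ttest_ind` with `equal_var=False`) should give both in one call.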

That makes sense: As the non-dominant arm is used less during the day, fewer Fuel points are awarded overall. By combining both tests we can now also see that the algorithms applied by the Fuelband don’t drop valid steps. If you’re active without walking it will award you Fuel points without counting or discounting steps. So: If you want to get the most Fuel points you should wear the Fuelband on your dominant arm. If you want to challenge yourself and make it harder to earn points you should go with the non-dominant arm. Stepwise it doesn’t make a difference. If you want to play around with the data for yourself it can be found below.

On Getting Sleep

Some of you may have noticed that openSNP now has support for Fitbit. Using OAuth you can connect your Fitbit-account which allows us to mirror the data on openSNP as well. We also have an option for those who don’t feel comfortable sharing all their data: You can select which data categories should be mirrored. Right now those categories are activities (which includes your steps and floors), body (which includes your weight and body-mass-index) and sleep (including the number of minutes you slept, how often you awoke in a night and how long those wake-phases were).

The screenshot should give you a glimpse of how my data set shows up on openSNP. There’s a longer blogpost on the whole issue on the project-blog. What’s maybe even more interesting is what this data can tell us: Ms. Clarkzilla sent me a note that she believed there’s a trend towards longer sleep in my data. So I downloaded my data from openSNP and started playing around with it: I did a simple linear regression over the time series and could indeed find a trend towards more sleep. The regression came out as y = 0.5x + 417 (x in days, y in minutes of sleep), which more or less says that for each two days that pass I will sleep a minute longer, which also means that it will be about 2000 days (or 5.5 years) until I will sleep 24 hours a day.
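The extrapolation is just the regression line solved for the day on which y reaches 24 hours. A small sketch of that arithmetic, using the slope and intercept quoted above (the function name `days_until` is made up for this example):

```python
def days_until(target_minutes, slope, intercept):
    """Solve target = slope * day + intercept for day."""
    if slope <= 0:
        raise ValueError("no upward trend, the target is never reached")
    return (target_minutes - intercept) / slope

SLOPE = 0.5          # extra minutes of sleep per day, from the regression above
INTERCEPT = 417      # minutes of sleep on day 0, from the regression above
FULL_DAY = 24 * 60   # 1440 minutes

days = days_until(FULL_DAY, SLOPE, INTERCEPT)
years = days / 365.25
print(f"{days:.0f} days, about {years:.1f} years")
```

With these coefficients the line crosses 1440 minutes after 2046 days, roughly 5.6 years, which matches the rounded figures in the text.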

Well, fortunately such linear regressions aren’t the way to go here (No wonder Ms. Clarkzilla asked me whether I learned my skills in Statistics for Social Science). But looking at the data I wondered whether other factors could explain differences in my sleeping data, which not only consists of the total time asleep but also the times I’ve awoken each night and how many minutes I’ve been awake between those sleep-times. One of the factors I thought of first was whether I’ve slept alone or shared my bed. Crawling through my Google Calendar I identified the nights I’ve slept alone and came up with ~60 nights of solo-sleep and ~20 nights during which I shared a bed (95% of those with @senficon).

Above you see some of the average values as well as the p-values, as given by a normal two-tailed t-test. While the amount of sleep I get doesn’t differ between the two conditions, there are significant differences in other metrics which Fitbit uses to calculate the sleep quality/efficiency. At first glance it looks like (at least for me) a shared bed is associated with a longer time until I fall asleep, and at the same time I awake more often and for longer time frames. To make sure that this isn’t an artifact due to other factors which also could influence my sleep (maybe the distribution of weekdays and weekends between the two conditions is skewed?) I also had a look into this and binned each night into weekdays and weekends.

There’s no significant difference between the workdays for the alone/shared-conditions (p=0.68) but there are some other nice correlations: As one might expect I tend to get more sleep on weekends and to spend more time in bed in general. But I’m also less active on weekends, at least for the number of floors climbed. Which also makes sense, as workdays are full of using the subway and climbing the floors to my office.
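The weekday/weekend binning could look roughly like the sketch below. It assumes each night is stored as a `(date, minutes_asleep)` pair and that Saturday and Sunday count as weekend. Both the record layout and the sample values are made up, and which calendar date a night belongs to is a choice you have to make yourself.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

def bin_nights(nights):
    """Group (date, minutes) records into workday/weekend and average them."""
    bins = defaultdict(list)
    for night, minutes in nights:
        # date.weekday(): Monday is 0, Saturday is 5, Sunday is 6
        kind = "weekend" if night.weekday() >= 5 else "workday"
        bins[kind].append(minutes)
    return {kind: mean(values) for kind, values in bins.items()}

# Invented sample nights, NOT the data from this post.
nights = [
    (date(2012, 6, 2), 480),  # Saturday
    (date(2012, 6, 3), 500),  # Sunday
    (date(2012, 6, 4), 420),  # Monday
    (date(2012, 6, 5), 400),  # Tuesday
]

averages = bin_nights(nights)
print(averages)
```

The same grouping works for any other per-night metric (times awoken, minutes awake, floors climbed) by swapping the second element of the pair.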

I’ve had a quick look into the scientific literature but until now I couldn’t find any publications which dealt with this issue, although some work has been done on couples’ nighttime sleep efficiency and concordance and others have shown that the stable presence of a partner is an independent correlate of better sleep quality and continuity in women. So please let me know if you have some links on this.

So what are the preliminary results? Well, it looks like sharing a bed won’t help me to get a better sleep, although I’m not yet sure if this is a unique property of @senficon or a general pattern. So it looks like more research is desperately needed. Collecting more data should be easy enough (if I remember to wear the Fitbit each night), but who else will be willing to volunteer to share a bed with me? If you also want to play around with the data: If you sign up with openSNP you can download the complete data from my profile.

Latest Projects: RubyDAS & TwitterBackup

My latest job (also a combination of bioinformatics & web-design) makes me use Python and Django – let’s just say I still need to acclimate to the changed framework. But I did some Ruby and Rails-stuff in the weeks before I started this position:

The first project is RubyDAS, which Alex Kalderimis and I started to work on during this year’s DAS-Workshop at the EBI. While Philipp and I did some work on the DAS-protocol and its integration into Ruby while working on openSNP, we didn’t really bother to make it reusable and instead solved the problem in a quick and dirty way, by just making it work with the internal database-models of openSNP. RubyDAS should one day get around this limitation and be a stand-alone gem which can be used to easily set up DAS-sources with Ruby. Right now you can use the code to set up simple annotation and reference servers. The code includes parsers for fasta and GFF files, so you can simply deliver DAS-compliant sequences, as well as annotations for those sequences. Having said this: The project needs some more work. Not all DAS commands are implemented yet and up to now there hasn’t been a single test written. So if this project could be of use to you, you might need to invest some time and help us test the whole thing and implement the missing DAS-commands. So far RubyDAS uses BioRuby (http://bioruby.open-bio.org/) for the GFF and fasta-parsing, Sinatra for the webstuff and DataMapper as ORM.

The second project which I started is a small, Rails-based web-app to back up and visualize my own tweets. You can find the source code at GitHub in the twitter_backup-repository. As the search-feature of Twitter doesn’t allow you to search in old tweets and there is no way to scroll back in your own timeline this felt like a good idea. The tool grabs all my tweets and the retweets I’ve done and saves them locally. Using the web-frontend of twitter_backup you can scroll through your own timeline, see how much you interact with other users and find out on which days of the week and at which times you tweet most frequently. Additionally the locations of your tweets are saved as well and you can find out from which places you tweet most often. And the biggest advantage of your own tweet-backup is also implemented: A working search-feature, which is done using Solr. You can search for users, specific dates or just the content of a tweet, which makes it much easier to find that interesting link you are sure you’ve tweeted but can’t find again on Twitter itself.

The general visualization is done using the Google Chart Tools, while the maps of the locations and the heat-maps are rendered through the Google Maps API and heatmap.js. If you’re interested in trying the small tool you can find my own setup on The Phylomemetic Tree.

Using a Privacy Policy Generator

Since the start of openSNP we had a disturbing lack of a real privacy policy. Instead we just offered the disclaimer people had to read while registering and uploading their genetic information on the website. In order to keep things simple we decided that there shouldn’t be an elaborate privacy-management-system like e.g. Facebook provides. Not only to save us some work but also to minimize the damage done in any case of a programming/server access fuck-up. The worst case would’ve been to promise not to display/share data with the public and accidentally doing so (a thing somehow quite frequent with social networks et al.). So we settled on making virtually all information public from the beginning.

The only information which is not public, but is entered by users while registering and using openSNP, is their eMail-addresses and their passwords (which are also only saved in encrypted form). Everything else can be viewed on the website and downloaded using the APIs or the mass-download-features of the website itself. So even the worst-case scenario of some third party getting access to our servers shouldn’t result in much trouble for the users (the webserver also doesn’t even log the IPs used for access). Still: This doesn’t solve the lack of a privacy policy. Fortunately some months ago I found out about iubenda, an Italian startup-company which tries to transform the process of creating a privacy policy into a point-and-click adventure.

One could register for a closed beta around that time, but I missed out on joining it. Fortunately they just launched their service to the public this week. To create a privacy policy you just pick the different services you’ve implemented into your website out of different categories (advertising, analytics, social networks, commenting systems,…), enter your name & address as the data owner (as if somebody should own it) and then you are good to go. An example of how this looks with more standard services can be found in the footer of this page. I really like that – similarly to Creative Commons – they not only provide the legalese version of the policy but also a human-readable summary. They also have a quite reasonable business model: As long as you limit yourself to the standard services their service can be used for free. If you want to get some more flexibility you can opt in to pay them a yearly fee for a pro-policy. The fee will be $27/year, but currently they have a lifetime-discount where the price will stay at $13.50/year.

Surprisingly it isn’t a standard application to collect genetic and phenotypic information, so yesterday I purchased a policy for openSNP. I encountered some trouble with PayPal during the purchasing-process (The very unspecific error message that PayPal did provide: «The transaction cannot complete successfully. Instruct the customer to use an alternative payment method.», no matter if a credit card or a standard bank account was used), so I already had a chance to test their support and I have to say they are doing a good job. They instantly called PayPal for further information about this and in the end we could find a workaround which allowed me to get a pro-policy. If you are interested in what such a policy can look like: I already put the policy in the footer of openSNP. So if you are like me and tend to procrastinate doing the privacy policies because you can’t speak legalese and can’t wrap your head around this stuff you can give iubenda a try. They seem to be eager to get feedback.

DAS Workshop at the EBI

I made it to this year’s workshop about the Distributed Annotation System (DAS) at the European Bioinformatics Institute in Cambridge. Before we started working on openSNP I had never used the protocol but implementing one of the field’s standards for sharing genetic information was definitely a thing we wanted to do with openSNP. We’ve worked on implementing this before the workshop but ran into some problems. After the first day of the workshop this already looks much better. According to the validations performed by the DASRegistry we now pass the sources, features and unknown_segment-commands. And these are the only ones which are currently implemented in openSNP. This is a nice step forward. Today I’ll work on implementing the types-command as well, as this is mandatory for annotation-servers in the 1.6-specification of DAS (the 1.5[E]-specification doesn’t explicitly state the need for the types-command if you implement features). I’m also working on implementing a useful application for DAS right into openSNP: Biodalliance is a genome browser/viewer which runs directly in your browser using JavaScript and it can be used to easily embed genome views into web-pages. I’ll try to add the respective views of the genome right into the SNP-pages. The screenshot here is of the Biodalliance website and already includes my genotypes in the last track. So this should be straightforward to do.

Open All the Access

It’s great to see that not only the Open Access movement is gaining momentum. The traditional publishing system – which still hasn’t adapted to the digital age in many aspects – has been getting its fair share of criticism recently. Over 7000 researchers have signed the Cost of Knowledge petition and thereby pledged to refrain from publishing, refereeing or doing editorial work for Elsevier. The protest was partially started because of Elsevier’s support for the Research Works Act but there are many things one can criticize about the way Elsevier does business.

The open letter published by @FakeElsevier gives you a nice summary of all the questionable things done over the years. You should go on and read the comments as well. One of my favorite ones is «But still, you know, credit where credit’s due. Elsevier doesn’t kill babies. Directly. Any longer.» but at least one participant in the comments is allegedly an employee of Elsevier him/herself.

If you are interested in how the current process of scientific publishing works – and why scientists still submit their work to closed access journals – you should watch this video on YouTube.

What the H[a/e]ck Have You Done All the Time?

My ambition is handicapped by my laziness.

Charles Bukowski, Factotum

Okay, there are a couple of reasons why I didn’t really put the old S9Y-blog at gedankenstuecke.de to use during the last months. There were all those courses at university which required me to spend some time to actually learn the contents of the courses. Plus: I – and all the others involved with openSNP – invested much of the time at hand into this very same project. We launched the website just before the deadline for the Mendeley/PLoS Binary Battle ran out and the usual thing happened. Most of the bugfixing was done in the weeks following launch. Afterwards I gave two talks on openSNP and the future of genetic information together with Philipp. Both of which have been recorded, so you can watch them here. Additionally I gave two invited talks about openSNP – one at an institute in Strasbourg and another one at an institute in Tübingen – and two more are just coming up. Sorry, to my knowledge there are no recordings of those.

Between all the talks I also took some time to finally implement some API-features into openSNP. Due to this you can now grab the genetic and phenotypic information using JSON and – if you are interested in using a standard which is a bit more rooted in bioinformatics – you can use the Distributed Annotation System (DAS). There is a short How-To on the JSON-stuff on the openSNP blog and I’ve done some videos which also include the DAS-stuff, the videos can be found here. The DAS-integration is still in early alpha-testing so let me know if something breaks down. In order to get some expert help with implementing more DAS-features I will leave to visit a DAS workshop at the European Bioinformatics Institute in Cambridge on Sunday.

Just two days ago we also started the project to give out free genotypings with the money we’ve got from Wikimedia here in Germany. The little funding we got should get us ~20 - 30 genotypings, depending on the shipping costs involved. And it looks like we will have no problem getting enough participants. Instead I really should try to get more funding, but I’m kind of lacking ideas whom to bug about this, so tips are appreciated. I’m still thinking about using Kickstarter to do some fundraising so people who aren’t interested in genotyping and sharing their own data can help us out, but I don’t know if anyone would be willing to do so. What are your opinions on this? Would you give some money to help others get genotyped? And: What rewards would you like to see if we used Kickstarter for this?

@hello_world

Hello World – Every programmer, ever

I just can’t stand S9Y and the support for XMLRPC any longer. So I’ll just give octopress a try and see if writing markdown and publishing from the CLI suits me any better.

Are Twitter Clients the new Hello World app?

I don’t have specific plans for this blog, but maybe I’ll use it for

  • writing a bit about the bioinformatics-stuff
  • blogging a few more technical details about openSNP

Whatever, read you soon.