>FindPeaks 4.0

After much delay, and more statistics than I’d ever hoped to see in my entire life, I’m just about ready to tag FindPeaks 4.0.

There are some really cool things going on – better handling of control data, better compares, and even saturation analysis made it in. Overall, I’m really impressed with how well things are working on that front.

The one thing that I’m disappointed to have not been able to include was support for SAM/BAM files. No one is really pushing for it yet, here at the GSC, but it’s only a matter of time. Unfortunately, the integration is not trivial, and I made the executive decision to work on other things first.

Still, I can’t say I’m unhappy at all – there’s room for a fresh publication, if we decide to go down that route. If not, I have at least 2 possible publications in the novel applications of the code for the biology. That can’t be a bad thing.

Anyhow, I’m off to put a few final updates on the code before tagging the final 3.3.x release.

>Science Nightmares

I had friends over last week, and an interesting conversation came up where we were discussing nightmares. Apparently people who have braces have the nightmare of all their teeth falling out, undergrads have the “I missed an exam” nightmare (although I did that one in real life, so the nightmares weren’t that disturbing afterwards), and profs have the “I missed a talk” nightmare.

Well, if it’s a sign that my career is on its way forward, I had the “missed a talk” nightmare this morning. The ironic thing is that I’ve never been invited to give a talk at a conference, so it’s a bit premature.

Anyhow, it probably has more to do with the fact that I’m somewhat freaked about the huge changes in FindPeaks. We learn SO much every day about the biology behind the experiment that it’s really nerve-wracking to keep on top of it all. The development is going well, although bug testing is always a challenge.

At any rate, we’re finally getting to the point where there are very few arbitrary decisions – the data decides how to do the analysis. Quite the contrast to 3 months ago, when we thought we’d hit the end of what new things we could pull out of the data.

Anyhow, debugging calls. Back to work….

>2 weeks of neglect on my blog = great thesis progress.

I wonder if my blogging output is inversely proportional to my progress on my thesis. I stopped writing two weeks ago for a little break, and ended up making big steps forward. The vast majority of my work went into FindPeaks, which included the following:

  • A complete threaded Saturation analysis for next-gen libraries.
  • A method of comparing next-gen libraries to identify peaks that are statistically significant outliers. (It’s also symmetric, unlike linear-regression-based methods.)
  • A better control method
  • A whole new way of analysing WTSS data, which gives statistically valid expression differences
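
For anyone wondering why symmetry matters when comparing two libraries, here is a minimal sketch. It is not the FindPeaks method (which isn't described in this post), and the peak heights and function names are made up for illustration. It only shows that an ordinary least-squares fit gives a different answer depending on which library you call "x", while a total-least-squares (orthogonal) fit, used here purely as one example of a symmetric alternative, does not.

```python
import math

def _centered_sums(x, y):
    """Return (sxx, syy, sxy): centered sums of squares and cross-products."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxx, syy, sxy

def ols_slope(x, y):
    """Slope of the ordinary least-squares line predicting y from x."""
    sxx, _, sxy = _centered_sums(x, y)
    return sxy / sxx

def orthogonal_slope(x, y):
    """Slope of the total-least-squares line (minimizes perpendicular distance).

    Assumes the two inputs are correlated (sxy != 0).
    """
    sxx, syy, sxy = _centered_sums(x, y)
    diff = syy - sxx
    return (diff + math.sqrt(diff ** 2 + 4 * sxy ** 2)) / (2 * sxy)

# Hypothetical peak heights for the same five peaks in two libraries.
height_a = [10, 20, 30, 40, 50]
height_b = [12, 18, 33, 37, 55]

# OLS: swapping the libraries does NOT give the reciprocal slope,
# so the product of the two slopes is r^2, which is below 1 for noisy data.
ols_product = ols_slope(height_a, height_b) * ols_slope(height_b, height_a)

# Orthogonal fit: swapping the libraries gives exactly the reciprocal slope,
# so the product is 1 (up to floating-point error).
orth_product = (orthogonal_slope(height_a, height_b)
                * orthogonal_slope(height_b, height_a))

print(f"OLS slope product:        {ols_product:.4f}")
print(f"Orthogonal slope product: {orth_product:.4f}")
```

With a regression-based compare, which peaks look like outliers can depend on which library is treated as the reference; a symmetric method avoids that choice.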

And, of course, many, many other changes. Not everything is bug-free yet, but it’s getting there. All that’s left on my task list is debugging a couple of things in the compare mode, relating to peaks present in only one of the two libraries, and an upgrade to my FDR cutoff prediction methods. Once those are done, I think I’ll be ready to push out FindPeaks 4.0. YAY!

Actually, what was surprising to me was the sheer amount of work that I’ve done on this since January. I compiled the change list since my last “quarterly report” for a project that used FindPeaks (but doesn’t support it, ironically…. why am I doing reports for them again?) and came up with 24 pages of commit messages – over 575 commits. Considering the amount of work I’ve done on my actual projects, away from FindPeaks, I think I’ve been pretty darn productive.

Yes, I’m going to take this opportunity to pat myself on the back in public for a whole 2 seconds… ok, done.

So, overall, blogging may not have been distracting me from my work, as even at the height of my blogging (around AGBT), I was still getting lots done, but the past two weeks have really been a help. I’ll be back to blogging all the good stuff on Monday. And I’m looking forward to doing some writing now, on some of the cool things in FP4.0 that haven’t made it into the manual… yet.

Anyone want some fresh ChIP-Seq results? (-;