ChIP-Seq revisited

In the world of ChIP-Seq, things don’t seem to slow down. A collaborator of mine pointed out a new application called MACS, yet another peak finder, written in Python as an open source project. That makes two open source peak finders that I’m aware of: Useq and now MACS.

The interesting thing, to me, is the maturity of the code (in terms of features implemented). In neither case is it all that great: both are missing features I consider relatively basic, and both use fairly naive algorithms for peak detection. Though, I suppose I’ve been working with FindPeaks long enough that nearly everything else will seem relatively basic in comparison.

However, I’ll come back to a few more FindPeaks-related things in a moment. I wanted to jump to another ChIP-Seq related item that I’d noticed this week. The Wold lab merged their Peak Finder software into a larger development package for genomic and transcriptome work, which I think they’re calling ERANGE. I’ve long argued that the peak finding tools are really just a subset of the whole Illumina tool-set required, and it’s nice to see other people taking the same approach.

This is the development model I’ve been using, though I don’t know if the Wold lab does exactly the same thing. The high-level organization uses a core library set and a core object set, and then FindPeaks and other projects sit on top, using those shared layers. It’s a reasonably efficient model. In a blog post a while ago, I mentioned that I’d made a huge number of changes to my code after coming across a tool called “Enerjy”. I sat down to figure out how many lines were changed in the last two weeks: 26,000+ lines of code, comments and javadoc. That’s a startling figure, since my entire code base (grep -r " " * | wc -l) is only 22,884 lines, of which 15,022 contain semi-colons.
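For anyone who wants to reproduce that kind of count without grep, here is a minimal sketch in Python. The function name count_lines is made up for illustration, and it only approximates the grep pipeline: it walks every file under a directory (including hidden ones, which the shell's * would skip at the top level) and decodes everything as text, so the totals may differ slightly from what grep reports.

```python
import os


def count_lines(root, needle):
    """Count lines in files under `root` that contain `needle`.

    Hypothetical helper, roughly comparable to `grep -r needle root | wc -l`.
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # errors="replace" keeps odd encodings from aborting the walk
                with open(path, encoding="utf-8", errors="replace") as handle:
                    total += sum(1 for line in handle if needle in line)
            except OSError:
                # Unreadable files are skipped rather than treated as fatal
                continue
    return total
```

Calling count_lines(".", " ") gives the "lines containing a space" figure, and count_lines(".", ";") gives the semi-colon figure, which is a crude stand-in for the number of Java statements.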

Anyhow, I have several plans for the next couple of days:

  1. Try to get my SVN repository somewhere other people can work on it as well, rather than restricting it to GSC developers.
  2. Improve the threading I’ve got going.
  3. Clean up the documentation, where possible.
  4. Work on the Adaptive mode code.

Hopefully, that’ll clean things up a bit.

Back to FindPeaks itself, the latest news is that my Application note in Bioinformatics has been accepted. Actually, it was accepted about a week ago, but I’m still waiting to see it in the Advance Access section; hopefully it won’t be much longer. I also have a textbook chapter on ChIP-Seq coming out relatively soon (I’m absolutely honoured to have been given that opportunity!), assuming I can get my changes done by Monday.

I don’t think that’ll be a problem.
