I thought I’d switch gears a bit this morning. I keep hearing people say that the next project their company/institute/lab is going to tackle is a SNP calling application, which strikes me as odd. I’ve written at least 3 over the last several months, and they’re all trivial. They seem to perform as well as anyone else’s SNP callers, and if they take up more memory, I don’t think that’s too big a problem. We have machines with lots of RAM these days, and it’s relatively cheap.
What really strikes me as odd is that people think there’s money in this. I just can’t see it. The barrier to creating a new SNP calling program is incredibly low. I’d suggest it’s even lower than creating an aligner – and there are already 20 or so of those out there. There’s even an aligner being developed at the GSC (which I don’t care for in the slightest, I might add) that works reasonably well.
I think the big thing that everyone is missing is that it’s not the SNP calling that’s important – it’s SNP management. In order to do SNP filtering, I have a huge PostgreSQL database with SNPs from a variety of sources, in several large tables, which have to be compared against the SNPs and gene calls from my data set. Even then, I would have a very difficult time handing off my database to someone else – it’s scalable, but completely un-automated, and has nothing but the psql interface, which is clearly not the most user-friendly. If I were going to hire a grad student and allocate money to software development, I wouldn’t spend the money on a SNP caller and have the grad student write the database – I’d put the grad student to work on his own SNP caller and buy a SNP management tool. Unfortunately, SNP management is a big project, and I don’t think there’s a single tool out there that would begin to meet the needs of people managing output from massively-parallel sequencing efforts.
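To give a rough idea of what the filtering step looks like, here is a minimal sketch of comparing called SNPs against a table of known SNPs. It is not my actual schema: the table names (`known_snps`, `called_snps`), the columns, the sample rows and the function names are all made up for illustration, and it uses Python's built-in `sqlite3` with an in-memory database rather than PostgreSQL so that it runs on its own.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical tables and toy rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE known_snps (source TEXT, chrom TEXT, pos INTEGER, ref TEXT, alt TEXT);
        CREATE TABLE called_snps (sample TEXT, chrom TEXT, pos INTEGER, ref TEXT, alt TEXT);
        CREATE INDEX idx_known ON known_snps (chrom, pos, alt);
    """)
    # Made-up "known" SNPs, standing in for the external sources.
    conn.executemany("INSERT INTO known_snps VALUES (?, ?, ?, ?, ?)", [
        ("source_a", "chr1", 1000, "A", "G"),
        ("source_b", "chr2", 5000, "C", "T"),
    ])
    # Made-up SNP calls from one sample.
    conn.executemany("INSERT INTO called_snps VALUES (?, ?, ?, ?, ?)", [
        ("sample1", "chr1", 1000, "A", "G"),  # matches a known SNP
        ("sample1", "chr1", 2000, "T", "C"),  # position not in known_snps
        ("sample1", "chr2", 5000, "C", "A"),  # known position, different allele
    ])
    return conn


def novel_snps(conn, sample):
    """Return the sample's calls with no match (chrom, pos, alt) in known_snps."""
    rows = conn.execute("""
        SELECT c.chrom, c.pos, c.ref, c.alt
        FROM called_snps AS c
        LEFT JOIN known_snps AS k
               ON k.chrom = c.chrom AND k.pos = c.pos AND k.alt = c.alt
        WHERE c.sample = ? AND k.pos IS NULL
        ORDER BY c.chrom, c.pos
    """, (sample,))
    return rows.fetchall()


if __name__ == "__main__":
    db = build_demo_db()
    for snp in novel_snps(db, "sample1"):
        print(snp)
```

The query itself is the easy part. Everything around it (loading each source, keeping the tables current, and giving someone other than me a way to run this without typing SQL) is where the real work is.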
Anyhow, just some food for thought, while I write tools that manage SNPs this morning.