Replacing science publications in the 21st century

Yasset Perez-Riverol asked me to take a look at a post he wrote: a commentary on an article titled Beyond the Paper.  In fact, I suggest reading the original paper, as well as taking a look at Yasset’s wonderful summary image that’s being passed around.  There’s some merit to both of them in elucidating where the field is going, as well as how to capture the different forms of communication and the tools available to do so.

My first thought after reading both articles was “Wow… I’m not doing enough to engage in social media.”  And while that may be true, I’m not sure how many people have the time to do all of those things and still accomplish any real research.

Fortunately, as a bioinformatician, there are moments when you've sent all your jobs off and can take a blogging break.  (Come on statistics… find something good in this data set for me!)  And it doesn't hurt when Lex Nederbragt asks your opinion, either.

However, I think there's more to my initial reaction than just a glib feeling of under-accomplishment.  We really do need to consider streamlining the publication process, particularly for fast-moving fields.  Whereas the blog and the paper above show how the current process can make use of social media, I'd rather take the opposite tack: how can social media replace the current process?  Instead of a slow, grinding peer-review process, a more technologically oriented one might replace many of the tools we've built ourselves around.  Let me take you on a little thought experiment; I'm going to use my own field as an example, but I can see how it would apply to others as well. Imagine a multi-layered peer review process that goes like this:

  1. Alice has been working with a large data set that needs analysis.  Her first step is to put the raw data into an embargoed data repository.  She will have access to the data, perhaps even through the cloud, but now she has a backup copy, and one that can be released when she’s ready to share her data.  (A smart repository would release the data after 10 years, published or not, so that it can be used by others.)
  2. After a few months, she has a bunch of scripts that have cleaned up the data (normalization, trimming, whatever), yielding a nice clean data set.  These scripts end up in a source code repository, for instance github.
  3. Alice then creates a tool that allows her to find the best “hits” in her data set.  Not surprisingly, this goes to github as well.
  4. However, there's also a metadata set – all of the commands she has run in steps two and three.  This could become her electronic notebook, and if Alice is good, she could use this as her methods section: it's a clear, concise list of the commands needed to take her raw data to her best hits.
  5. Alice takes her best hits to her supervisor Bob to check over them.  Bob thinks this is worthy of dissemination – and decides they should draft a blog post, with links to the data (as an attached file, along with the file's hash – a sketch of how that hash might be computed follows this list), the github code and the electronic notebook.
  6. When Bob and Alice are happy with their draft, they publish it – and announce their blog post to a “publisher”, who lists their post as an “unreviewed” publication on their web page.  The data in the embargoed repository is now released to the public so that they can see and process it as well.
  7. Chris, Diane and Elaine notice the post on the “unreviewed” list, probably via an RSS feed or by visiting the “publisher’s” page and see that it is of interest to them.  They take the time to read and comment on the post, making a few suggestions to the authors.
  8. The authors make note of the comments and take the time to refine their scripts, which shows up on github, and add a few paragraphs to their blog post – perhaps citing a few missed blogs elsewhere.
  9. Alice and Bob think that the feedback they've gotten has been helpful, and they inform the publisher, who takes a few minutes to check that they have had comments and have addressed them, and consequently the post moves from the "unreviewed" list to the "reviewed" list.  Of course, checks such as ensuring that no data is supplied in the dreaded PDF format are performed!
  10. The publisher also keeps a copy of the text/links/figures of the blog post, so that a snapshot of the post exists. If future disputes over the reviewed status of the paper occur, or if the authors' blog disappears, the publisher can repost the blog. (A smart publisher would have hosted the blog post right from the start, instead of having to duplicate someone else's blog after the fact.)
  11. The publisher then sends out tweets with hashtags appropriate to the subject matter (perhaps even the key words attached to the article), and Alice’s and Bob’s peers are notified of the “reviewed” status of their blog post.  Chris, Diane and Elaine are given credit for having made contributions towards the review of the paper.
  12. Alice and Bob interact with the other reviewers via comments and tweets, with links kept from the article (trackbacks and pings). Authors from other fields can point out errors or other papers of interest in the comments below.
  13. Google notes all of this interaction, and updates the scholar page for Alice and Bob, noting the interactions, and number of tweets in which the blog post is mentioned.   This is held up next to some nice stats about the number of posts that Alice and Bob have authored, and the impact of their blogging – and of course – the number of posts that achieve the “peer reviewed” status.
  14. Reviews or longer comments can be done on other blog pages, which are then collected by the publisher and indexed on the “reviews” list, cross-linked from the original post.
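As a small aside on step 5: publishing a cryptographic hash alongside the data link is what lets readers confirm, once the embargo lifts, that the released file is the one Alice actually analyzed. Here's a minimal sketch of how that might be done in Python (the filename is made up, and any checksum tool would do the same job):

```python
import hashlib

def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Alice would publish this hash in the blog post; readers recompute it on the
# released data to confirm they have the same file she analyzed.
print(sha256_of_file("alice_raw_data.tar.gz"))  # filename is hypothetical
```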

Look – science just left the hands of the vested interests, and jumped back into the hands of the scientists!

Frankly, I don't see it as being entirely far-fetched.  The biggest issue is going to be harmonizing a publisher's blog with a personal blog – which means that personal blogs will probably shrink pretty rapidly, or move towards consortia of "publishing" groups.

To be clear, the publisher, in this case, doesn't have to be related whatsoever to the current publishers – they'll make their money off of targeted ads, subscriptions to premium services (advanced notice of papers? better searches for relevant posts?) and their reputation will encourage others to join.  Better blogging tools and integration will be the grounds on which the services compete, and more engagement in social media will benefit everyone.  Finally, because the bar for new publishers to enter the field will be relatively low, new players simply have to out-compete the old publishers to establish a good, profitable foothold.

In any case – this appears to be just a fantasy, but I can see it playing out successfully for those who have the time/vision/skills to grow a blogging network into something much more professional.  Anyone feel like doing this?

Feel free to comment below – although, alas, I don’t think your comments will ever make this publication count as “peer reviewed”, no matter how many of my peers review it. :(

I’ve landed.

So, I think it's time to return to blogging.  I've started in a new group and have begun feeling my way around in a new area – so, for those who followed me for Next Gen Sequencing in the past, you may be surprised that it's likely to play a diminished role in my new position.  I don't think I'm done in NGS, but it looks like I'll have a little break from it, until my new group completes a few upgrades, at least.

Are you curious?  I’m working at the CMMT in Vancouver, in the Kobor Lab.  I can’t say enough how awesome this group is, and how welcoming they’ve been (and I don’t even think they read my blog…).  I’ll also likely be collaborating with a few other groups here – but the extent of that is yet to be determined.

So what will feature prominently in my blog?  Well, that’s a good question.

It seems like ChIP-Seq will come back.  I don't think I'll be returning to FindPeaks – I've got better ideas and more interesting plans that I hope to move forward on. It seems likely that I have more to contribute in this particular area, so I expect I'll be starting a new code base that deals a bit more with the statistics of ChIP-Seq.  The FindPeaks code base has become a bit too big for rapid prototyping, so it's time to step out of it to move forward.

I’m sure that epigenetics will take a front row seat in my work. That’s a major focus in this group, both for histones and DNA methylation, so I can’t see it not playing a significant part.  (I’m looking forward to working with methylation, which I’ve never done before…)

I'll probably be working with Python – I've been thinking that it's time to move away from Java.  Not that there's anything wrong with Java, but I've heard really good things about Python, and I'm excited to pick up a language that seems to fit a little more naturally with the way I'd like to approach the problem.

I'm hoping to work with Open Source… well, that hasn't been discussed much yet, but I still believe strongly in the open source philosophy – particularly in the academic world.  I'd rather not work on closed source code in this environment.

I’ll also likely be working with a bit of Yeast genomics – it’s a great model system, and there’s still a lot to learn about regulation and epigenetics in that particular organism.  And there’s always the tie in to beer.  That doesn’t hurt either.

At any rate, things are still evolving, and I have a 500Mb stack of papers to read (yes, I’m saving paper), but I think that I’m back.  I may do a few reviews of the subjects I’ll have to read up on, which include the epigenetics of healthy ageing and childhood development.  Oddly enough, I think we can learn things at both ends of the human age spectrum, so why not?

And yes, I’ll try to keep the disparaging comments about Denmark to a minimum from now on, but I can’t promise there won’t be any.  Not, at least, till the lawyers finish working out who’s owed what.

3 year post doc? I hope not!

I started replying to a comment left on my blog the other day and then realized it warranted a little more than just a footnote on my last entry.

This comment was left by “Mikael”:

[…] you can still do a post-doc even if you don’t think you’ll continue in academia. I’ve noticed many life science companies (especially big pharmas) consider it a big plus if you’ve done say 3 years of post-doc.

I definitely agree that it's worth doing a post-doc, even if you decide you don't want to go on through the academic pathway. I'm beginning to think that the best time to make that decision (ivory tower vs indentured slavery) may actually be during your post-doc, since that will be the closest you come to being a professor before making the decision. As a graduate student, I'm not sure I am fully aware of the risks and rewards of the academic lifestyle. (I haven't yet taken a course on the subject, and one only gets so much of an idea through exposure to professors.)

However, at this point, I can't stand the idea of doing a 3 year post doc. After 6 years of undergrad, 2.5 years of masters, 3 years of (co-)running my own company, and about 3.5 years of doing a PhD by the time I'm done… well, 3 more years of school is about as appealing as going back to the wet lab. (No, glassware and I don't really get along.)

If I'm going to do a post-doc (and I probably will), it will be a short and sweet one – no more than a year and a half at the longest. I have friends who are stuck in 4-5 year post-docs and have heard of people doing 10-year post-docs. I know what it means to be a post-doc for that long: "Not a good career building move." If you're not getting publications out quickly in your post-doc, I can imagine it won't reflect well on your C.V., destroying your chances of moving into the limited number of faculty positions – and wreaking havoc on your chances of getting grants.

Still, it's more about what you're doing than how long you're doing it. I'd consider a longer post doc if it's in a great lab with the possibility of many good publications. If there's one thing I've learned from discussions with collaborators and friends who are years ahead of me, it's that getting into a lab where publications aren't forthcoming – and where you're not happy – can burn you out of science quickly.

Given that I’ve spent this long as a science student (and it’s probably far too late for me to change my mind on becoming a professional musician or photographer), I want to make sure that I end up somewhere where I’m happy with the work and can make reasonable progress: this is a search that I’m taking pretty seriously.

[And, just for the record, if a company needs me to do 3 years of post-doc at this point, I have to wonder just who it is I'm competing with for that job – and what it is that they think you learn in your 2nd and 3rd years as a postdoc.]

With that in mind, I’m also going to put my (somewhat redacted) resume up on the web in the next few days. It might be a little early – but as I said, I’m taking this seriously.

In the meantime, since I want to actually graduate soon, I’d better go see if my analyses were successful. (=

Depressing view of Academia

So I officially started going through available post-doc positions this week, now that I'm back from my vacation. I'm still trying to figure out what I want to do when I finish my PhD next year (assuming I do…), and of course, I came back to the academia vs. industry question.

In weighing the evidence, a friend pointed me to this article on the problems facing new scientists in academia. Somehow, it does a nice job of dissuading me from thinking about going down that route – although I’m not completely convinced industry is the way to go yet either.

Read for yourself: Real Lives and White Lies in the Funding of Scientific Research

What would you do with 10kbp reads?

I just caught a tweet about an article on the Pathogens blog (What can you do with 1000 base pair reads?), which is specifically about 454 reads. Personally, I'm not so interested in 454 reads – the technology is good, but I don't have access to 454 data, so it's somewhat irrelevant to me. (Not to say 1kbp reads aren't neat, but no one has volunteered to pass me 454 data in a long time…)

So, anyhow, I'm trying to think two steps ahead. 2010 is supposed to be the year that Pacific Biosciences (and other companies) release the next generation of sequencing technologies – which will undoubtedly be longer than 1k. (I seem to recall hearing that PacBio has 10k+ reads. UPDATE: I found a reference.) So to heck with 1kbp reads; this raises the real question: What would you do with a 10,000bp read? And, equally important, how do you work with a 10kbp read?

  • What software do you have now that can deal with 10k reads?
  • Will you align or assemble with a 10k read?
  • What experiments will you be able to do with a 10k read?

Frankly, I suspect that nothing we’re currently using will work well with them – we’ll all have to go back to the drawing board and rework the algorithms we use.
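To put a rough number on that, here's a toy back-of-the-envelope calculation (no particular aligner implied, and the numbers are purely illustrative): even the per-candidate alignment step, which costs roughly the square of the read length for a full dynamic-programming comparison against a similarly sized region, blows up by several orders of magnitude between a 36 bp read and a 10 kbp read.

```python
# Toy estimate only: cells in a full dynamic-programming matrix when aligning
# a read against a candidate region of comparable size.  Real aligners use
# heuristics and indexes, but the scaling is the point.
for read_len in (36, 1_000, 10_000):
    cells = read_len * read_len
    print(f"{read_len:>6} bp read: ~{cells:,} DP cells per candidate alignment")
```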

So, what do you think?

4 Freedoms of Research

I'm going to venture off the beaten track for a few minutes. Ever since the discussion about conference blogging started to take off, I've been thinking about what the rights of scientists really are – and then came to the conclusion that there really aren't any. There is no scientist's manifesto or equivalent oath that scientists take upon receiving their degree. We don't wear the iron ring like engineers, which signifies our commitment to integrity…

So, I figured I should do my little part to fix that. I'd like to propose the following 4 basic freedoms of research, without which science cannot flourish.

  1. Freedom to explore new areas
  2. Freedom to verify findings from other scientists
  3. Freedom to share your results
  4. Freedom to access findings from other scientists

Broadly, these rights should be self-evident. They are tightly intermingled, and cannot be separated from each other:

  • The right to explore new ideas depends on us being able to trust and verify the results of experiments upon which our exploration is based.
  • The right to share information is contingent upon other groups being able to access those results.
  • The purpose of exploring new research opportunities is to share those results with people who can use them to build upon them.
  • Being able to verify findings from other groups requires that we have access to their results.

In fact, they are so tightly mingled, that they are a direct consequence of the scientific method itself.

  1. Ask a question that explores a new area
  2. Use your prior knowledge, or access the literature to make a best guess as to what the answer is
  3. Test your result and confirm/verify if your guess matches the outcome
  4. Share your results with the community.

(I liked the phrasing on this site.) Of course, if your question in step 1 is not new, you're performing the verification step.

There are constraints on what we are allowed to do as scientists as well: we have to respect the ethics of the field in which we do our exploring, and we have to respect the fact that ultimately we are responsible to report to the people who fund the work.

However, that's where we start to see problems. To the best of my knowledge, funding sources define the directions science is able to explore. We saw the U.S. restrict funding to science in order to throttle research in various fields (violating Research Freedom #1) for the past 8 years, which effectively halted stem cell research, suppressed work on alternative fuel sources, and so on. In the long term, this technique won't work, because scientists migrate to where the funding is. As the U.S. restores funding to these areas, the science is returning. Unfortunately, it's Canada's turn, with the conservative government (featuring a science minister who doesn't believe in evolution) removing all funding from genomics research. The cycle of ignorance continues.

Moving along, and clearly in a related vein, Freedom #2 is also a problem of funding. Researchers who would like to verify other groups' findings (a key responsibility of the basic peer-review process) aren't funded to do this type of work. While admitting my lack of exposure to granting committees, I've never heard of a grant being given to verify someone else's findings. However, this is the basic way by which scientists are held accountable. If no one can repeat your work, you will have many questions to answer – and yet the funding for ensuring accountability is rarely present.

The real threat to an open scientific community occurs with the last two Freedoms: sharing and access. If we’re unable to discuss the developments in our field, or are not even able to gain information on the latest work done, then science will come grinding to a major halt. We’ll waste all of our time and money exploring areas that have been exhaustively covered, or worse yet, come to the wrong conclusions about what areas are worth exploring in our ignorance of what’s really going on.

Ironically, Freedoms 3 and 4 are the most eroded in the scientific community today. Even considering only the academic world, where these freedoms are taken for granted, our interaction with the forums for sharing (and accessing) information is horribly stunted:

  • We do not routinely share negative results (causing unnecessary duplication and wasting resources)
  • We must pay to have our results shared in journals (limiting what can be shared)
  • We must pay to access other scientists results in journals (limiting what can be accessed)

It’s trivial to think of other examples of how these two freedoms are being eroded. Unfortunately, it’s not so easy to think of how to restore these basic rights to science, although there are a few things we can all do to encourage collaboration and sharing of information:

  • Build open source scientific software and collaborate to improve it – reducing duplication of effort
  • Publish in open access journals to help disseminate knowledge and bring down the barriers to access
  • Maintain blogs to help disseminate knowledge that is not publishable

If all scientists took advantage of these tools and opportunities to further collaborative research, I think we’d find a shift away from conferences towards online collaboration and the development of tools favoring faster and more efficient communication. This, in turn, would provide a significant speed up in the generation of ideas and technologies, leading to more efficient and productive research – something I believe all scientists would like to achieve.

To close, I’d like to propose a hypothesis of my own:

By guaranteeing the four freedoms of research, we will be able to accomplish higher quality research, more efficient use of resources and more frequent breakthroughs in science.

Now, all I need to do is to get someone to fund the research to prove this, but first, I’ll have to see what I can find in the literature…

On the necessity of controls

I guess I've had this rant building up for a while, and it's finally time to write it up.

One of the fundamental pillars of science is the ability to isolate a specific action or event, and determine its effects on a particular closed system. The scientific method actually demands that we do it – hypothesize, isolate, test and report in an unbiased manner.

Unfortunately, for some reason, the field of genomics has kind of dropped that idea entirely. At the GSC, we just didn't bother with controls for ChIP-Seq for a long time. I can't say I've seen too many matched WTSS (RNA-Seq) experiments for cancer/normal pairs either. And that scares me, to some extent.

With all the statistics work I've put into the latest version of FindPeaks, I'm finally getting a good grasp of the importance of using controls well. The other software I've seen does a scaled comparison to calculate a P-value, and that is really only half of the story. It also comes down to normalization, to comparing peaks that are present in both sets… and to determining which peaks are truly valid. Without that, you may as well not be using a control.
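For readers who haven't seen one, here's roughly what I mean by a "scaled comparison". This is a minimal, generic sketch, not the FindPeaks implementation, and all function and variable names are just for illustration: scale the control library to the sample's depth and ask how surprising the peak height is under a Poisson model.

```python
from math import exp

def poisson_sf(k, lam):
    """P(X >= k) for a Poisson(lam) variable, via the lower tail."""
    if k <= 0:
        return 1.0
    term = exp(-lam)
    cdf = term
    for i in range(1, k):
        term *= lam / i
        cdf += term
    return max(0.0, 1.0 - cdf)

def scaled_peak_pvalue(sample_reads, control_reads, sample_total, control_total):
    """Naive "scaled comparison": scale the control to the sample's sequencing
    depth, then ask how surprising the sample's peak height is under Poisson."""
    expected = control_reads * (sample_total / control_total)
    return poisson_sf(sample_reads, max(expected, 1e-6))

# e.g. 45 reads in the ChIP peak vs. 20 in the control,
# with library sizes of 10M and 8M reads respectively
print(scaled_peak_pvalue(45, 20, 10_000_000, 8_000_000))
```

And that's exactly the half of the story it covers: nothing in it addresses normalizing the two libraries properly, handling peaks present in both sets, or deciding which peaks are truly valid.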

Anyhow, that's what prompted me to write this. As I look over the results from the new FindPeaks (3.3.3.1), both for ChIP-Seq and WTSS, I'm amazed at how much clearer my answers are, and how much better they validate compared to the non-control based runs. Of course, the tests are still not all in – but what a huge difference it makes. Real control handling (not just normalization or whatever everyone else is doing) vs. Monte Carlo shows results that aren't in the same league. The cutoffs are different, the false peak estimates are different, and the filtering is incredibly more accurate.

So, this week, as I look for insight in old transcription factor runs and old WTSS runs, I keep having to curse the lack of controls that exist for my own data sets. I’ve been pushing for a decent control for my WTSS lanes – and there is matched normal for one cell line – but it’s still two months away from having the reads land on my desk… and I’m getting impatient.

Now that I’m able to find all of the interesting differences with statistical significance between two samples, I want to get on with it and find them, but it’s so much more of a challenge without an appropriate control. Besides, who’d believe it when I write it up with all of the results relative to each other?

Anyhow, just to wrap this up, I’m going to make a suggestion: if you’re still doing experiments without a control, and you want to get them published, it’s going to get a LOT harder in the near future. After all, the scientific method has been pretty well accepted for a few hundred years, and genomics (despite some protests to the contrary) should never have felt exempt from it.

Decision time

Well, now that I've heard that there's a distinct possibility that I might be done my PhD in about a year, it's time to start making some decisions. Frankly, I didn't think I'd be done that quickly – although, really, I'm not done yet. I have a lot of publications to put together, and things to make sense of before I leave, but the clock for figuring out what to do next has officially started.

I suppose all of those post-doc blogs I’ve been reading for the last year have influenced me somewhat: I’m going to look for a lab where I’ll find a good mentor, a good environment, and a commitment to publishing and completing post-docs relatively quickly. Although that sounds simple, judging by other blogs I’ve been reading, it’s probably not all that easy to work out. Add to that the fact that my significant other isn’t interested in leaving Vancouver (and that I would prefer to stay here as well), and I think this will be a difficult process.

I do need to put together a timeline, however – and since I'm not yet entirely convinced which track I should follow (academic vs industry), it's going to be a somewhat complex timeline. Anyhow, the point of blogging this is that it's an excellent way to open communication channels with people you wouldn't be able to connect with in person – and the first channel I'd like to open is to ask readers if they have any suggestions.

Input at this time would be VERY welcome, both on the point of academia vs. industry, as well as what I should be looking for in a good post-doc position, if that ends up being the path I go down. (=

Anyhow, just to mention, I have another blog post coming, but I'll save it for tomorrow. I'd like to comment on another series of blog posts from John Hawks and Daniel McArthur. I'm sure the whole blogosphere has heard all about the subject of training bioinformatics students from both the biology and computer science paths by now, but I feel I have something unique to talk about on that issue. In the meantime, I'd better get back to debugging and testing code. FindPeaks has a very cool new method of comparing different samples – and I'd like to get the testing finished. (=

Epidemiology and next-generation(s) sequencing.

I had a very interesting conversation this morning with a co-worker, which ended up as a full-fledged discussion about how next generation sequencing will end up spilling out of the research labs and into the physician's office. My co-worker originally stated that it will take 20 years or so for it to happen, which seems kind of off to me. While most inventions take a long time to get going, I think next-gen sequencing will cascade over to general use a lot more quickly than people appreciate. Let me explain why.

The first thing we have to acknowledge is that pharmaceutical companies have a HUGE interest in making next gen sequencing work for them. In the past, pharma companies might spend millions of dollars getting a drug candidate to phase 2 trials, and it's in their best interest to get every drug as far as they can. Thus, any drug that can be "rescued" from failing at this stage will decrease the cost of getting drugs to market and increase revenues significantly for the company. With the price of genome sequencing falling to $5000/person, it wouldn't be unreasonable for a company to sequence 5,000-10,000 genomes for the phase 3 trial candidates, as insurance. If the drug seems to work well for a population associated with a particular set of traits, and not well for another group, it is a huge bonus for the company in getting the drug approved. If the drug causes adverse reactions in a small population of people which associate with a second set of traits, then it's even better – they'll be able to screen out adverse responders.

When it comes to getting FDA approval, any company that can clearly specify who the drug will work for – who it won't work for – and who shouldn't take it, will be miles ahead of the game, and able to fast track their application through the approval process. That's another major savings for the company.

(If you’re paying attention, you’ll also notice at least one new business model here: retesting old drugs that failed trials to see if you can find responsive sub-populations. Someone is going to make a fortune on this.)

Where does this meet epidemiology? Give it 5-7 years, and you'll start to see drugs appear on the shelf with warnings like "This drug is contraindicated for patients with CYP450 variant XXXX." Once that starts to happen, physicians will really have very little choice but to start sending their patients for routine genetic testing. We already have PCR screens in the labs for some diseases and tests, but it won't be long before a whole series of drugs appear with labels like this, and insurance companies will start insisting that patients have their genomes sequenced for $5000, rather than pay for 40-50 individual test kits that each cost $100.
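Mechanically, the prescribing check itself is trivial once the label and the patient's variant calls exist – something like this toy sketch, where the drug names, variants and the table itself are entirely made up for illustration:

```python
# Toy sketch (entirely hypothetical variants and drugs): once labels carry
# genetic contraindications, the prescribing check is just a set lookup
# against the patient's variant calls.
CONTRAINDICATIONS = {
    "drug_X": {"CYP450_variant_XXXX"},        # placeholder variant from the label text
    "drug_Y": {"variant_A", "variant_B"},
}

def contraindicated_variants(drug, patient_variants):
    """Return the contraindicated variants this patient carries, if any."""
    return CONTRAINDICATIONS.get(drug, set()) & set(patient_variants)

patient = {"variant_A", "some_other_variant"}  # made-up variant calls
print(contraindicated_variants("drug_Y", patient))  # -> {'variant_A'}: flag for the physician
```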

Really, though, what choice will physicians have? When drugs begin to show up that will help 99% of the patients for which they should be prescribed, but are contraindicated for certain genomic variants, no physician will be willing to accept the risk of prescribing without the accompanying test. (Malpractice insurance is good… but only gets you so far!) And as the tests get more complex, and our understanding of the underlying cause and effect of various SNPs starts to increase, this is going to quickly go beyond the treatment of single conditions.

I can only see one conclusion: every physician will have to start working closely with a genetic counselor of some sort, who can advise on the relative risk and reward of various drugs and treatment regimes. To do otherwise would be utterly reckless.

So, how long will it be until we see the effects of this transformation on our medical system? Well, give it 5 years to see the first genetic contraindications, but it won't take long after that for our medical systems (on both sides of the border in North America) to feel the full effects of the revolution. Just wait till we start sequencing the genomes of the flu bugs we've caught to figure out which anti-viral will work best.

Gone are the days when the physician will be able to eye up his or her patient and prescribe whatever drug he or she comes up with off the top of their head. Of course, the hospitals aren’t yet aware of this tsunami of information and change that’s coming at them. Somehow, we need to get the message to them that they’ll have to start re-thinking the way they treat people, instead of populations of people.

SNP callers.

I thought I'd switch gears a bit this morning. I keep hearing people say that the next project their company/institute/lab is going to tackle is a SNP calling application, which strikes me as odd. I've written at least 3 over the last several months, and they're all trivial. They seem to perform as well as anyone else's SNP calls, and if they take up more memory, I don't think that's too big a problem – we have machines with lots of RAM, and it's relatively cheap these days.
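To give a sense of what I mean by trivial – this isn't any of my actual callers, just a bare-bones illustration with made-up names – the simplest pileup-based SNP caller is a handful of lines: at each position, report any non-reference base that shows up in enough reads.

```python
from collections import Counter

def naive_snp_calls(pileup, min_depth=10, min_fraction=0.25):
    """Minimal pileup-based SNP caller: at each position, report any
    non-reference base seen in enough reads.  `pileup` is an iterable of
    (chrom, pos, ref_base, observed_bases) tuples."""
    for chrom, pos, ref, bases in pileup:
        depth = len(bases)
        if depth < min_depth:
            continue
        for base, count in Counter(bases).items():
            if base != ref and count / depth >= min_fraction:
                yield (chrom, pos, ref, base, count, depth)

# toy pileup: 12 reads covering one position, 5 of them showing a G over the reference A
example = [("chr1", 12345, "A", "AAAAAAAGGGGG")]
print(list(naive_snp_calls(example)))
```

Real callers layer base qualities and error models on top, of course, but the core logic really is about this small.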

What really strikes me as odd is that people think there’s money in this. I just can’t see it. The barrier to creating a new SNP calling program is incredibly low. I’d suggest it’s even lower than creating an aligner – and there are already 20 or so of those out there. There’s even an aligner being developed at the GSC (which I don’t care for in the slightest, I might add) that works reasonably well.

I think the big thing that everyone is missing is that it's not the SNP calling itself that's important – it's SNP management. In order to do SNP filtering, I have a huge postgresql database with SNPs from a variety of sources, in several large tables, which have to be compared against the SNPs and gene calls from my data set. Even then, I would have a very difficult time handing off my database to someone else – my database is scalable, but completely un-automated, and has nothing but the psql interface, which is clearly not the most user-friendly. If I were going to hire a grad student and allocate money to software development, I wouldn't spend the money on a SNP caller and have the grad student write the database – I'd put the grad student to work on his own SNP caller and buy a SNP management tool. Unfortunately, it's a big project, and I don't think there's a single tool out there that would begin to meet the needs of people managing output from massively-parallel sequencing efforts.
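For contrast, here's the flavour of the management side – the kind of query that gets run against a postgresql database like that to pull out SNPs that aren't already in the known sources. The schema, table names and connection string here are invented for illustration only; they're not the database described above.

```python
# A sketch of the kind of filtering query involved (table and column names are
# invented for illustration; this is not the schema described in the post).
import psycopg2

QUERY = """
    SELECT s.chrom, s.pos, s.ref, s.alt
    FROM   sample_snps AS s
    LEFT JOIN known_snps AS k
           ON k.chrom = s.chrom AND k.pos = s.pos AND k.alt = s.alt
    WHERE  s.sample_id = %s
      AND  k.pos IS NULL          -- keep only SNPs absent from the known sources
"""

def novel_snps(dsn, sample_id):
    """Return the sample's SNPs that don't appear in any of the known-SNP tables."""
    with psycopg2.connect(dsn) as conn:
        with conn.cursor() as cur:
            cur.execute(QUERY, (sample_id,))
            return cur.fetchall()

# usage (connection string and sample id are placeholders):
# print(novel_snps("dbname=snpdb user=alice", "sample_001"))
```

The query itself is simple; the hard part, and the part worth paying for, is automating the loading, versioning and cross-referencing of all the source tables so that someone other than the original author can actually use it.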

Anyhow, just some food for thought, while I write tools that manage SNPs this morning.

Cheers.