Wednesday, January 19, 2011

Re: How can cancer research be open-sourced?

Mark asked on Quora how cancer research can be open-sourced. So far, I have found Quora to be rather noisy, even after signing up only for science-related groups, themes, or whatever they are called. However, every now and then there is an interesting question like this one.

The question resonated with discussions I had earlier this week. During Peter's Symposium the discussion was restarted on why publishing data in databases is currently not rewarded. I think the answer is really simple: there is no independent organization counting citation statistics for data. What if Thomson did not calculate citation counts and impact factors? Would we be using them to judge the careers of fellow scientists? If FooBar calculated H-indices based on data citations, would we ignore that? I hardly think so. However, FooBar does not exist, and FooBar is not getting rich because of its citation counts.

From a scientist's perspective, we see people hold back data and source code, because releasing it reduces the time the scientist has to bring the idea to Nature and Science. Now, in cheminformatics this is hardly a problem, because Nature and Science generally do not recognize fundamental, methodological work from informatics and statistics, despite its now crucial role in many Nature and Science papers. However, for data this is different. By releasing your data Openly (think Panton Principles), you give up the intellectual property that gives you a nice list of co-authored papers for your publication list long tail. Mind you, this is not an argument I make up here, but actual practice: "Sure you can use my data/method, but I like to be co-author on your paper then."

Why is this actual practice? Even a paper in the long tail is rewarding. "Wow, he has 250 papers!" As Rich nicely characterizes it: game theory.

So, what if we replaced the papers in that publication list long tail with points for releasing Open Data and Open Source? I'm all in favor. And no worries about Handles and DOIs. Forget about them. We had Thomson calculating impact factors long before we had DOIs.

My reply to Mark's question?

    The first thing that needs to be changed is the academic reward system. At this moment, it is rewarding to hold back information, source code, etc., because if you do, you make yourself more competitive with respect to publishing in high-ranked journals. Now, if we would reward releasing data into public (Open) databases, that would change. Likewise for software. The new journal is an attempt at changing this situation (disclaimer: I'm on the editorial board). Of course, there are many kinds of rewards. BMC giving out awards for Open Data is another. Another important reward would be financial. If organizations, foundations, etc., would start giving out financial support for Open projects, that would be a great change too. We are starting to see this with a couple of national funding agencies in Europe that have dedicated funding for Open Access publishing.