Comments on chem-bla-ics: A typical QSAR study (cito:citesAsAuthority)
Blog author: Egon Willighagen (http://www.blogger.com/profile/07470952136305035540)
Feed updated: 2024-03-13T07:14:55.283+01:00
7 comments

Anonymous (https://www.blogger.com/profile/11363603927414414911), 2013-01-23T18:11:59.825+01:00

@Anonymous, since you mentioned "good" QSAR models, read this:

QSAR: All models are wrong, but some are useful.

http://eventheodd.blogspot.in/2012/12/qsar-all-models-are-wrong-but-some-are.html

Egon Willighagen (https://www.blogger.com/profile/07470952136305035540), 2012-04-29T10:01:43.108+02:00

Anonymous, that sounds like you are using too many variables relative to the number of objects. Given enough variables, you can fit anything. A commonly cited ratio is 4 to 5 objects per variable.

Anonymous, 2012-04-23T17:50:23.795+02:00

Thanks for the insight, and a few places to get started. I look forward to a future post. In my models, I've been getting good R^2 values (>0.9), but Q^2 is burning them to the ground (approx. 0.04). Frustrating.

Egon Willighagen (https://www.blogger.com/profile/07470952136305035540), 2012-04-19T12:38:30.743+02:00

Vladimir, you could have a look at the 'forensic bioinformatics' work by K. Baggerly:

http://odin.mdacc.tmc.edu/~kabaggerly/talks.html

Egon Willighagen (https://www.blogger.com/profile/07470952136305035540), 2012-04-19T12:31:51.708+02:00

Anonymous, a very fair point. There are a few good papers in this area, and I will soon aggregate and summarize them in a blog post. Validation is particularly important, and the most important thing to focus on. "Beware of Q^2!" is a good read, for example. Make sure to always visualize your predictions, e.g. in a y_pred versus y_measured scatter plot.

For the paper discussed in the blog post, I can recommend the following. First, with fewer than 100 compounds, there is a lot of freedom for your statistical method to come up with a model. Make sure to use a good independent test set, and to use cross-validation on the training set to find good parameter settings. You can use y-randomization and bootstrapping to estimate the predictive power that a random numerical model would give (that is, a model with no cause-effect relationship).

Another aspect of this paper was that they compared two methods, but confounded the regression method (SVR and PLS) with the kernel used to make a non-linear space linear. Keep in mind that SVR is a linear regression method, and that non-linear kernels can be applied to PLS too (not to be confused with kernel-PLS, a method in which the PLS algorithm is reformulated to be more efficient). I was alerted this morning to a book chapter by the Zell group, which you will probably like if you want to learn more about PLS versus SVR: http://dx.doi.org/10.1007/978-3-642-20389-3_12

Anonymous, 2012-04-18T19:40:09.403+02:00

With so many QSAR papers out there, could you point us to some citations that you would categorize as "good" QSAR papers? I'm a medicinal chemist in the middle of trying to build QSAR models, and a solid example would be quite helpful.

Vladimir Chupakhin (https://www.blogger.com/profile/14838130425318070954), 2012-04-09T03:49:31.009+02:00

For me, two questions remain open:
1. What to do with statistically badly written papers?
2. What to do when a paper merely repeats work already done 1, 2, 5, or 10 years ago and has no new results (better Q2, etc.)?

I once sent a letter to the editor about a really badly written paper (http://farmacokratia.blogspot.com/2011/10/nonsense-article.html), but did not receive any response.
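
The validation advice in this thread (a high R^2 alongside a poor Q^2 when there are too few objects per variable, and y-randomization as a baseline) can be illustrated with a small sketch. It rests on assumptions that are not in the comments: ordinary least squares stands in for the regression method, the descriptors and activities are random noise, Q^2 is computed by leave-one-out cross-validation, and every function name is hypothetical.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_ols(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    Z = [[1.0] + row for row in X]
    p = len(Z[0])
    A = [[sum(z[i] * z[j] for z in Z) for j in range(p)] for i in range(p)]
    b = [sum(z[i] * yi for z, yi in zip(Z, y)) for i in range(p)]
    return solve(A, b)

def predict(coef, row):
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))

def r2_score(y, y_pred):
    """1 - SS_res / SS_tot, with SS_tot taken around the mean of all y."""
    mean = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_pred))
    ss_tot = sum((a - mean) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

def r2_fit(X, y):
    """R^2 of the model on the same objects it was fitted to."""
    coef = fit_ols(X, y)
    return r2_score(y, [predict(coef, row) for row in X])

def q2_loo(X, y):
    """Q^2: each object is predicted by a model fitted without it."""
    preds = []
    for i in range(len(y)):
        coef = fit_ols(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        preds.append(predict(coef, X[i]))
    return r2_score(y, preds)

def y_randomization(X, y, rounds, rng):
    """R^2 values obtained after shuffling y, i.e. with any real relationship destroyed."""
    scores = []
    for _ in range(rounds):
        shuffled = y[:]
        rng.shuffle(shuffled)
        scores.append(r2_fit(X, shuffled))
    return scores

# Assumed toy data: 12 objects, 8 variables (1.5 objects per variable), pure noise.
rng = random.Random(0)
n_objects, n_variables = 12, 8
X = [[rng.gauss(0, 1) for _ in range(n_variables)] for _ in range(n_objects)]
y = [rng.gauss(0, 1) for _ in range(n_objects)]

r2 = r2_fit(X, y)
q2 = q2_loo(X, y)
random_r2 = y_randomization(X, y, 20, rng)
print(f"R^2 = {r2:.2f}, Q^2 (leave-one-out) = {q2:.2f}")
print(f"mean R^2 after y-randomization = {sum(random_r2) / len(random_r2):.2f}")
```

Because the data contain no relationship at all, any R^2 the fit achieves reflects only the freedom that the number of variables gives the model. If a real model's R^2 is no better than the y-randomized values, or its Q^2 collapses as in the comment above, that suggests overfitting rather than a cause-effect relationship.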