Comments on chem-bla-ics: Validating MDL SD files and Symyx molfiles with the CDK
Feed author: Egon Willighagen (http://www.blogger.com/profile/07470952136305035540)
3 comments; feed updated 2024-03-13T07:14:55.283+01:00

Egon Willighagen (2010-02-01T21:48:27.834+01:00):

Dear Anonymous,

perhaps you would also like to leave your reply at the Blue Obelisk Exchange?

http://blueobelisk.stackexchange.com/questions/202/proper-mdl-molfile-atom-block-line-format

Egon Willighagen (2010-02-01T21:42:00.902+01:00):

Dear Anonymous,

thanks for getting back to me!

Readers are expected to pad those lines with whitespace...

At some point I started marginal V3000 support, but this never got finished due to a lack of user requests...

Anonymous (2010-02-01T19:55:36.983+01:00):

"I have yet to make up my mind if the lack of those fields is a problem in the file, or allowed by the format."

For the V2000 format (fixed fields), the Symyx/MDL format document does indeed indicate the expected behavior:

"Note: A blank numerical entry on any line should be read as “0” (zero). Spaces are significant and correspond to one or more of the following:
• Absence of an entry
• Empty character positions within an entry
• Spaces between entries; single unless specifically noted otherwise
The FORTRAN format for coordinate information in the V2000 CTfile format is typically F10.4."

There is a historical reason for this: the V2000 (and pre-V2000) format was read using Fortran format statements in early readers. In more modern readers, lines are blank-padded to the total number of characters in the format, and "blank" fields are read as '0' (or '0.0000' for floating-point values).

The V3000 format uses explicit key=value tagging and avoids many/most of the issues with the fixed-field V2000 format.
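
To make the padding rule concrete, here is a minimal Python sketch of how a tolerant reader might handle a truncated V2000 atom line: pad it with blanks to the full width, then read every blank field as zero. This is not how the CDK reader is implemented. The function name `parse_atom_line`, the field names, and the column offsets are illustrative assumptions based on my reading of the CTfile atom block layout, so check them against the format document before relying on them.

```python
# Assumed width of a complete V2000 atom line (coordinates through the last
# integer field); verify against the CTfile documentation.
ATOM_LINE_WIDTH = 69

# Assumed (start, end) column slices for the integer fields after the symbol.
# The two unused fields at columns 54-60 are skipped on purpose.
INT_FIELDS = {
    "mass_diff": (34, 36),
    "charge_code": (36, 39),
    "stereo_parity": (39, 42),
    "hydrogen_count": (42, 45),
    "stereo_care": (45, 48),
    "valence": (48, 51),
    "h0_designator": (51, 54),
    "atom_mapping": (60, 63),
    "inversion_retention": (63, 66),
    "exact_change": (66, 69),
}


def read_int(field: str) -> int:
    """Read a fixed-width integer field; a blank field counts as 0."""
    text = field.strip()
    return int(text) if text else 0


def read_float(field: str) -> float:
    """Read a fixed-width F10.4 field; a blank field counts as 0.0."""
    text = field.strip()
    return float(text) if text else 0.0


def parse_atom_line(line: str) -> dict:
    """Parse one atom line, padding short lines with blanks first."""
    padded = line.rstrip("\r\n").ljust(ATOM_LINE_WIDTH)
    atom = {
        "x": read_float(padded[0:10]),
        "y": read_float(padded[10:20]),
        "z": read_float(padded[20:30]),
        "symbol": padded[31:34].strip(),
    }
    for name, (start, end) in INT_FIELDS.items():
        atom[name] = read_int(padded[start:end])
    return atom


# A line that stops right after the element symbol still parses.
atom = parse_atom_line("    0.0000    0.0000    0.0000 C")
print(atom["symbol"], atom["charge_code"])  # prints: C 0
```

Padding first keeps the slicing code identical for complete and truncated lines, which matches the "blank-padded, then read as zero" behavior described above.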