Thursday, March 20, 2008

Molecule puzzles

I have a bad habit of gaining skills and then losing them again because I forget to take notes. A few days ago I stumbled upon Felix's post about SMILES. I couldn't stop myself from commenting on it and refreshing my memory along the way. So... before I forget once again how to superimpose molecules with OpenBabel, I'll write a short tutorial here. All we need is:
  • OpenBabel for Linux
  • PyMOL (for visualization)
  • A cup of coffee (you could do without it, but it's just not the same)
Now we need some molecules to work with. Hmmm... I know - how about some drugs? ]:) Just save the following files to your disk in an empty folder called drugs.

amphetamine.mol

Amphetamine OpenBabel03200812553D 23 23 0 0 0 0 0 0 0 0999 V2000 -1.9463 5.1372 -0.0024 N 0 0 0 0 0 -6.8821 4.9631 -0.4940 C 0 0 0 0 0 -5.7209 4.3409 -0.0463 C 0 0 0 0 0 -5.8016 3.0620 0.5027 C 0 0 0 0 0 -7.0266 2.4098 0.5929 C 0 0 0 0 0 -8.1818 3.0351 0.1375 C 0 0 0 0 0 -8.1087 4.3132 -0.4052 C 0 0 0 0 0 -4.3953 5.1018 -0.1532 C 0 0 0 0 0 -3.1038 4.2371 0.0129 C 0 0 0 0 0 -2.9321 3.2306 -1.1605 C 0 0 0 0 0 -6.8346 5.8835 -0.8820 H 0 0 0 0 0 -4.9730 2.6105 0.8338 H 0 0 0 0 0 -7.0765 1.4913 0.9852 H 0 0 0 0 0 -9.0632 2.5669 0.2001 H 0 0 0 0 0 -8.9386 4.7655 -0.7319 H 0 0 0 0 0 -4.3614 5.5351 -1.0538 H 0 0 0 0 0 -4.3871 5.8061 0.5566 H 0 0 0 0 0 -3.1296 3.7039 0.9743 H 0 0 0 0 0 -1.0791 4.5924 0.0995 H 0 0 0 0 0 -2.0287 5.8241 0.7599 H 0 0 0 0 0 -2.0985 2.6958 -1.0222 H 0 0 0 0 0 -2.8666 3.7319 -2.0233 H 0 0 0 0 0 -3.7205 2.6162 -1.1913 H 0 0 0 0 0 1 9 1 0 0 0 1 19 1 0 0 0 1 20 1 0 0 0 2 3 1 0 0 0 2 11 1 0 0 0 7 2 2 0 0 0 3 4 2 0 0 0 8 3 1 0 0 0 4 5 1 0 0 0 4 12 1 0 0 0 5 6 2 0 0 0 5 13 1 0 0 0 6 7 1 0 0 0 6 14 1 0 0 0 7 15 1 0 0 0 8 16 1 0 0 0 8 17 1 0 0 0 9 8 1 0 0 0 9 18 1 0 0 0 10 9 1 0 0 0 10 21 1 0 0 0 10 22 1 0 0 0 10 23 1 0 0 0 M END
MDMA.mol
MDMA ChemPy 3D 0 29 30 0 0 1 0 0 0 0 0999 V2000 -0.4580 3.4318 -0.0340 N 0 0 0 0 0 0 0 0 0 0 0 0 -6.3305 2.0552 -0.9221 C 0 0 0 0 0 0 0 0 0 0 0 0 -6.1557 0.6746 -1.1459 C 0 0 0 0 0 0 0 0 0 0 0 0 -4.9418 0.0508 -0.9409 C 0 0 0 0 0 0 0 0 0 0 0 0 -3.8927 0.8684 -0.4973 C 0 0 0 0 0 0 0 0 0 0 0 0 -4.0606 2.2347 -0.2726 C 0 0 0 0 0 0 0 0 0 0 0 0 -5.3025 2.8639 -0.4853 C 0 0 0 0 0 0 0 0 0 0 0 0 -2.9092 3.0611 0.2079 C 0 0 0 0 0 0 0 0 0 0 0 0 -1.5646 2.4325 -0.1983 C 0 0 0 0 0 0 0 0 0 0 0 0 0.7167 3.1531 -0.8771 C 0 0 0 0 0 0 0 0 0 0 0 0 -8.3006 1.1993 -1.6224 C 0 0 0 0 0 0 0 0 0 0 0 0 -1.2697 1.2095 0.6628 C 0 0 0 0 0 0 0 0 0 0 0 0 -7.3529 0.1183 -1.5770 O 0 0 0 0 0 0 0 0 0 0 0 0 -7.6430 2.4094 -1.2065 O 0 0 0 0 0 0 0 0 0 0 0 0 -4.7995 -1.0196 -1.1127 H 0 0 0 0 0 0 0 0 0 0 0 0 -2.9095 0.4139 -0.3250 H 0 0 0 0 0 0 0 0 0 0 0 0 -5.4360 3.9361 -0.3110 H 0 0 0 0 0 0 0 0 0 0 0 0 -3.0093 4.0833 -0.2138 H 0 0 0 0 0 0 0 0 0 0 0 0 -2.9551 3.1858 1.3091 H 0 0 0 0 0 0 0 0 0 0 0 0 -1.6390 2.1105 -1.2715 H 0 0 0 0 0 0 0 0 0 0 0 0 -0.8018 4.3475 -0.2372 H 0 0 0 0 0 0 0 0 0 0 0 0 1.4798 3.9176 -0.6838 H 0 0 0 0 0 0 0 0 0 0 0 0 1.1377 2.1751 -0.6074 H 0 0 0 0 0 0 0 0 0 0 0 0 0.5044 3.1428 -1.9579 H 0 0 0 0 0 0 0 0 0 0 0 0 -9.1302 0.9825 -0.9309 H 0 0 0 0 0 0 0 0 0 0 0 0 -8.6617 1.3210 -2.6557 H 0 0 0 0 0 0 0 0 0 0 0 0 -0.3798 0.6687 0.3142 H 0 0 0 0 0 0 0 0 0 0 0 0 -1.1006 1.4768 1.7148 H 0 0 0 0 0 0 0 0 0 0 0 0 -2.1158 0.5082 0.6338 H 0 0 0 0 0 0 0 0 0 0 0 0 1 10 1 0 0 0 0 1 21 1 0 0 0 0 2 3 2 0 0 0 0 2 7 1 0 0 0 0 2 14 1 0 0 0 0 3 4 1 0 0 0 0 3 13 1 0 0 0 0 4 5 2 0 0 0 0 4 15 1 0 0 0 0 5 6 1 0 0 0 0 5 16 1 0 0 0 0 6 7 2 0 0 0 0 6 8 1 0 0 0 0 7 17 1 0 0 0 0 8 9 1 0 0 0 0 8 18 1 0 0 0 0 8 19 1 0 0 0 0 9 1 1 0 0 0 0 9 12 1 0 0 0 0 9 20 1 0 0 0 0 10 22 1 0 0 0 0 10 23 1 0 0 0 0 10 24 1 0 0 0 0 11 14 1 0 0 0 0 11 25 1 0 0 0 0 11 26 1 0 0 0 0 12 27 1 0 0 0 0 12 28 1 0 0 0 0 12 29 1 0 0 0 0 13 11 1 0 0 0 0 M END
mescaline.mol
Mescaline ChemPy 3D 0 32 32 0 0 1 0 0 0 0 0999 V2000 2.4594 3.8379 -1.2806 N 0 0 0 0 0 0 0 0 0 0 0 0 -0.9359 0.2906 -0.5320 C 0 0 0 0 0 0 0 0 0 0 0 0 -1.2592 -0.9122 0.1016 C 0 0 0 0 0 0 0 0 0 0 0 0 -0.3214 -1.5509 0.9473 C 0 0 0 0 0 0 0 0 0 0 0 0 0.9348 -0.9510 1.1379 C 0 0 0 0 0 0 0 0 0 0 0 0 1.2511 0.2625 0.5094 C 0 0 0 0 0 0 0 0 0 0 0 0 0.3200 0.8757 -0.3282 C 0 0 0 0 0 0 0 0 0 0 0 0 0.6478 2.1634 -1.0006 C 0 0 0 0 0 0 0 0 0 0 0 0 2.0423 2.6435 -0.5853 C 0 0 0 0 0 0 0 0 0 0 0 0 -3.4588 -0.8900 -0.8094 C 0 0 0 0 0 0 0 0 0 0 0 0 -0.8075 -3.8572 0.8750 C 0 0 0 0 0 0 0 0 0 0 0 0 1.6631 -1.7053 3.2582 C 0 0 0 0 0 0 0 0 0 0 0 0 -0.6525 -2.6819 1.6732 O 0 0 0 0 0 0 0 0 0 0 0 0 1.9404 -1.5704 1.8631 O 0 0 0 0 0 0 0 0 0 0 0 0 -2.4622 -1.5724 -0.0557 O 0 0 0 0 0 0 0 0 0 0 0 0 -1.6590 0.7913 -1.1915 H 0 0 0 0 0 0 0 0 0 0 0 0 2.2420 0.7083 0.6776 H 0 0 0 0 0 0 0 0 0 0 0 0 -0.1160 2.9416 -0.7347 H 0 0 0 0 0 0 0 0 0 0 0 0 0.6115 2.0242 -2.1145 H 0 0 0 0 0 0 0 0 0 0 0 0 2.7889 1.8339 -0.8293 H 0 0 0 0 0 0 0 0 0 0 0 0 2.0455 2.7701 0.5384 H 0 0 0 0 0 0 0 0 0 0 0 0 3.3254 4.1712 -0.9069 H 0 0 0 0 0 0 0 0 0 0 0 0 1.7696 4.5589 -1.2057 H 0 0 0 0 0 0 0 0 0 0 0 0 -4.3450 -1.5709 -0.7409 H 0 0 0 0 0 0 0 0 0 0 0 0 -3.1344 -0.7716 -1.8718 H 0 0 0 0 0 0 0 0 0 0 0 0 -3.6893 0.1042 -0.3560 H 0 0 0 0 0 0 0 0 0 0 0 0 -1.0613 -4.6489 1.6231 H 0 0 0 0 0 0 0 0 0 0 0 0 0.1473 -4.0975 0.3489 H 0 0 0 0 0 0 0 0 0 0 0 0 -1.6380 -3.7213 0.1407 H 0 0 0 0 0 0 0 0 0 0 0 0 2.5558 -2.2549 3.6488 H 0 0 0 0 0 0 0 0 0 0 0 0 0.7279 -2.2929 3.4220 H 0 0 0 0 0 0 0 0 0 0 0 0 1.5805 -0.6968 3.7307 H 0 0 0 0 0 0 0 0 0 0 0 0 1 22 1 6 0 0 0 1 23 1 1 0 0 0 2 3 2 0 0 0 0 2 7 1 0 0 0 0 2 16 1 6 0 0 0 3 4 1 1 0 0 0 3 15 1 0 0 0 0 4 5 2 0 0 0 0 4 13 1 1 0 0 0 5 6 1 6 0 0 0 5 14 1 1 0 0 0 6 7 2 0 0 0 0 6 17 1 0 0 0 0 7 8 1 6 0 0 0 8 9 1 6 0 0 0 8 18 1 1 0 0 0 8 19 1 0 0 0 0 9 1 1 6 0 0 0 9 20 1 0 0 0 0 9 21 1 0 0 0 0 10 24 1 0 0 0 0 10 25 1 6 0 0 0 10 26 1 0 0 0 0 11 27 1 1 0 0 0 11 28 1 6 0 0 0 11 29 1 6 0 0 0 12 30 1 
0 0 0 0 12 31 1 0 0 0 0 12 32 1 0 0 0 0 13 11 1 6 0 0 0 14 12 1 1 0 0 0 15 10 1 6 0 0 0 M END
OK. The first step is to join all the molecules into a single file while keeping the molecule records separate. The SDF file format serves best in this case.
[lightnir@developer drugs]$ ls
MDMA.mol amphetamine.mol mescaline.mol
[lightnir@developer drugs]$ babel -imol *.mol -osdf drugs.join.sdf
3 molecules converted
51 audit log messages
[lightnir@developer drugs]$
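For the curious: an SD file is little more than the molfile records written one after another, each closed by a line containing `$$$$`. The sketch below mimics the join in plain Python. It only illustrates the file layout and is not what babel does internally; the function name `join_mol_records` and the shortened toy records are made up for this example.

```python
SDF_DELIMITER = "$$$$"  # line that closes every record in an SD file


def join_mol_records(mol_texts):
    """Concatenate molfile texts into a single SDF-style string."""
    parts = []
    for text in mol_texts:
        # Normalise the trailing newline, then close the record.
        parts.append(text.rstrip("\n") + "\n" + SDF_DELIMITER + "\n")
    return "".join(parts)


# Hypothetical, heavily shortened stand-ins for the real molfiles.
toy_records = [
    "MDMA\n(atom and bond lines omitted)\nM  END\n",
    "Amphetamine\n(atom and bond lines omitted)\nM  END",
]
joined = join_mol_records(toy_records)
print(joined.count(SDF_DELIMITER))  # number of records written
```

For real work the babel command above is still the better choice, since it also validates each molecule while converting.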


Now let us load that file into Pymol and have a look at it.

At first glance it looks as if the file contains only the methylenedioxymethamphetamine (MDMA) molecule, but if we look more closely at the file structure (press the sequence button at the bottom right of the viewer window) we can see that all 3 of our molecules are there. To display all of them at once, run this command in PyMOL:
PYMOL> set state,0
It should look like this (I've changed the color scheme of every molecule for better contrast).
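If you would rather confirm outside PyMOL that the joined file really holds three records, a few lines of Python are enough. This is a minimal sketch that assumes the first line of every record is its title and that records end with a `$$$$` line; the helper name `sdf_titles` and the toy file content are hypothetical.

```python
def sdf_titles(sdf_text):
    """Return the title (first line) of every record in an SDF string."""
    titles = []
    at_record_start = True
    # Drop trailing whitespace so a final blank line is not read as a title.
    for line in sdf_text.rstrip().splitlines():
        if at_record_start:
            titles.append(line.strip())
            at_record_start = False
        elif line.strip() == "$$$$":
            # The delimiter closes a record; the next line starts a new one.
            at_record_start = True
    return titles


# Hypothetical, heavily shortened stand-in for drugs.join.sdf.
toy_sdf = (
    "MDMA\n(rest omitted)\nM  END\n$$$$\n"
    "Amphetamine\n(rest omitted)\nM  END\n$$$$\n"
    "Mescaline\n(rest omitted)\nM  END\n$$$$\n"
)
print(sdf_titles(toy_sdf))
```

With the real file you would pass in the text read from drugs.join.sdf instead of `toy_sdf`.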
Now it's time to superimpose these molecules. As you can see, these compounds are derivatives of phenethylamine, so let's fit them using this fragment. It's easy to do since the SMARTS representation of phenethylamine is simply c1ccccc1CCN. Let's use amphetamine as the molecule to which the others are aligned.
[lightnir@developer drugs]$ obfit 'c1ccccc1CCN' amphetamine.mol drugs.join.sdf > drugs.fit.sdf
RMSD: 0.161607
RMSD: 0.000000
RMSD: 0.147656
==============================
*** Open Babel Warning in ReadMolecule
WARNING: Problems reading a MDL file
Cannot read title line

[lightnir@developer drugs]$

Two things are worth noting here. First, obfit writes the fitted molecules to standard output, so you have to redirect it to a file. Second, the RMSD values tell you how well each molecule fits onto the reference molecule (amphetamine): the smaller the value, the better the fit. Obviously amphetamine fits onto amphetamine perfectly (RMSD: 0.000000). Looking at drugs.fit.sdf in PyMOL we should now see the desired effect.
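To make the RMSD numbers a bit less abstract, here is a minimal sketch of the root-mean-square deviation between two sets of already paired and already superimposed atom coordinates. It assumes that obfit reports this kind of value for the atoms matched by the SMARTS pattern; the alignment step itself is not shown, and the function name `rmsd` and the toy coordinates are made up for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two paired lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        # Squared distance between the two atoms of one pair.
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))


# Toy coordinates, not taken from the molecules above.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
shifted = [(x + 0.1, y, z) for x, y, z in reference]

print(rmsd(reference, reference))  # identical structures give zero
print(rmsd(reference, shifted))    # every atom is displaced by 0.1
```

A structure compared with itself gives zero, just like amphetamine against amphetamine, and a uniform displacement of every atom gives an RMSD equal to that displacement.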

The above example was rather simple. Here's a more complicated one:

[lightnir@developer fitting]$ obfit '[$(*CCC(=O)O);$(*~*[CH3]);$(*********[$(*CCC(=O)O);$(*~*[CH3])])]' cpg3.mol pharm.sdf > pharm.fit.sdf

In the image you can see the coproporphyrinogen III molecule (green carbons) and a pharmacophore (cyan carbons) of the coproporphyrinogen III oxidase fitted onto it. The SMARTS string used for superimposing the molecules describes carbons that have a neighboring propionic acid group and a neighboring atom connected to a methyl group. The third, recursive part of the pattern additionally requires a second such carbon at the end of a chain of eight further atoms. Piece of cake ]:)

Phew... After all those phenethylamines, let's relax with some "ambient" music.

6 comments:

Felix says...

pretty cool. right now I am on vacation :)
but when I am back I'll try it out

baoilleach says...

Nice use of OpenBabel. I wonder would it be possible to filter out the chemistry posts, so that you could be syndicated from Chemical Blogspace?

Lightnir says...

If you want to filter out the chemistry posts on my blog, use the myChemistry tag, although since I'm not a native English speaker some of them (the earlier ones) are written in Polish. If you are only interested in the ones written in English, use this little link.

baoilleach says...

I understand about the tags. The problem is that Chemical Blogspace needs an RSS feed, and Blogger only provides a single RSS feed which contains all of the posts. I have tried to use Yahoo Pipes to filter by tag but it doesn't seem to work very well.

Lightnir says...

It took me about 5 minutes on Google to find a solution. The Blogger API makes it possible to create tagged ATOM and RSS feeds. I'll spare you the technical details. Here are the RSS 2.0 feeds for posts tagged as:
en
myChemistry
Hope this is what you're looking for.

baoilleach says...

Excellent! My blog is on blogger too, but I didn't know about this :-) I will forward your myChemistry RSS feed to Egon (of Chemical Blogspace)...