Molecular graphics


Molecular graphics is the discipline and philosophy of studying molecules and their properties through graphical representation. IUPAC limits the definition to representations on a "graphical display device". Ever since Dalton's atoms and Kekulé's benzene, there has been a rich history of hand-drawn atoms and molecules, and these representations have had an important influence on modern molecular graphics. This article concentrates on the use of computers to create molecular graphics. Note, however, that many molecular graphics programs and systems have close coupling between the graphics and editing commands or calculations such as in molecular modelling.

Relation to molecular models

There has been a long tradition of creating molecular models from physical materials. Perhaps the best known is Crick and Watson's model of DNA built from rods and planar sheets, but the most widely used approach is to represent all atoms and bonds explicitly using the "ball and stick" approach. This can demonstrate a wide range of properties, such as shape, relative size, and flexibility. Many chemistry courses expect that students will have access to ball and stick models. One goal of mainstream molecular graphics has been to represent the "ball and stick" model as realistically as possible and to couple this with calculations of molecular properties.
Figure 1 shows a small molecule, as drawn by the Jmol program. It is important to realize that the colors and shapes are purely a convention, as individual atoms are not colored, nor do they have hard surfaces. Bonds between atoms are also not rod-shaped.

Comparison of physical models with molecular graphics

Physical models and computer models have partially complementary strengths and weaknesses. Physical models can be used by those without access to a computer and now can be made cheaply out of plastic materials. Their tactile and visual aspects cannot be easily reproduced by computers. On a computer screen, the flexibility of molecules is also difficult to appreciate; illustrating the pseudorotation of cyclohexane is a good example of the value of mechanical models.
However, it is difficult to build large physical molecules, and all-atom physical models of even simple proteins could take weeks or months to build. Moreover, physical models are not robust and they decay over time. Molecular graphics is particularly valuable for representing global and local properties of molecules, such as electrostatic potential. Graphics can also be animated to represent molecular processes and chemical reactions, a feat that is not easy to reproduce physically.

History

Initially the rendering was on early cathode-ray tube screens or through plotters drawing on paper. Molecular structures have always been an attractive choice for developing new computer graphics tools, since the input data are easy to create and the results are usually highly appealing. The first example of molecular graphics (MG) was a display of a protein molecule by Cyrus Levinthal and Robert Langridge. Among the milestones in high-performance MG was the work of Nelson Max in "realistic" rendering of macromolecules using reflecting spheres.
By about 1980 many laboratories both in academia and industry had recognized the power of the computer to analyse and predict the properties of molecules, especially in materials science and the pharmaceutical industry. The discipline was often called "molecular graphics" and in 1982 a group of academics and industrialists in the UK set up the Molecular Graphics Society (MGS). Initially much of the technology concentrated on high-performance 3D graphics, including interactive rotation and 3D rendering of atoms as spheres. During the 1980s a number of programs for calculating molecular properties became available and the term "molecular graphics" often included these. As a result, the MGS has now changed its name to the Molecular Graphics and Modelling Society.
The requirements of macromolecular crystallography also drove MG because the traditional techniques of physical model-building could not scale. The first two protein structures solved by molecular graphics without the aid of the Richards' Box were built with Stan Swanson's program FIT on the Vector General graphics display in the laboratory of Edgar Meyer at Texas A&M University: first Marge Legg in Al Cotton's lab at A&M solved a second, higher-resolution structure of staph. nuclease, and then Jim Hogle solved the structure of monoclinic lysozyme in 1976. A full year passed before other graphics systems were used to replace the Richards' Box for modelling into density in 3-D. Alwyn Jones' FRODO program was developed to overlay the molecular electron density determined from X-ray crystallography and the hypothetical molecular structure.
In 2009 BALLView became the first software to use real-time ray tracing for molecular graphics.

Art, science and technology in molecular graphics

Both computer technology and graphic arts have contributed to molecular graphics. The development of structural biology in the 1950s led to a requirement to display molecules with thousands of atoms. The existing computer technology was limited in power, and in any case a naive depiction of all atoms left viewers overwhelmed. Most systems therefore used conventions where information was implicit or stylistic. Two vectors meeting at a point implied an atom or a complete residue.
The macromolecular approach was popularized by Dickerson and Geis' presentation of proteins and the graphic work of Jane Richardson through high-quality hand-drawn diagrams such as the "ribbon" representation. In this they strove to capture the intrinsic 'meaning' of the molecule. This search for the "messages in the molecule" has always accompanied the increasing power of computer graphics processing. Typically the depiction would concentrate on specific areas of the molecule and this might have different colors or more detail in the number of explicit atoms or the type of depiction.
In some cases the limitations of technology have led to serendipitous methods for rendering. Most early graphics devices used vector graphics, which meant that rendering spheres and surfaces was impossible. Michael Connolly's program "MS" calculated points on the solvent-accessible surface of a molecule, and the points were rendered as dots with good visibility using the new vector graphics technology, such as the Evans and Sutherland PS300 series. Thin sections through the structural display showed very clearly the complementarity of the surfaces for molecules binding to active sites, and the "Connolly surface" became a universal metaphor.
The relationship between the art and science of molecular graphics is shown in the exhibitions sponsored by the Molecular Graphics Society. Some exhibits are created with molecular graphics programs alone, while others are collages, or involve physical materials. An example from Mike Hann, inspired by Magritte's painting Ceci n'est pas une pipe, uses an image of a salmeterol molecule. "Ceci n'est pas une molecule," writes Mike Hann, "serves to remind us that all of the graphics images presented here are not molecules, not even pictures of molecules, but pictures of icons which we believe represent some aspects of the molecule's properties."
Colour molecular graphics are often used artistically on the covers of chemistry journals.

Space-filling models

Fig. 4 is a "space-filling" representation of formic acid, where atoms are drawn as solid spheres to suggest the space they occupy. This and all space-filling models are necessarily icons or abstractions: atoms are nuclei with electron "clouds" of varying density surrounding them, and as such have no actual surfaces. For many years the size of atoms has been approximated by physical models in which the volumes of plastic balls describe where much of the electron density is to be found. That is, the surface of these models is meant to represent a specific level of density of the electron cloud, not any putative physical surface of the atom.
Since the atomic radii are only slightly less than the distance between bonded atoms, the iconic spheres intersect, and in the CPK models, this was achieved by planar truncations along the bonding directions, the section being circular. When raster graphics became affordable, one of the common approaches was to replicate CPK models in silico. It is relatively straightforward to calculate the circles of intersection, but more complex to represent a model with hidden surface removal. A useful side product is that a conventional value for the molecular volume can be calculated.
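The circle of intersection mentioned above follows from elementary geometry. The sketch below is illustrative only: the function name and the sample radii are assumptions, not values taken from any particular CPK set or program. For two spheres of radii r1 and r2 whose centers are a distance d apart, it returns the distance from the first center to the truncation plane and the radius of the circular section.

```python
import math


def intersection_circle(r1, r2, d):
    """Circle in which two spheres intersect.

    r1, r2: sphere radii; d: distance between the two centers.
    Returns (a, rho), where a is the distance from the center of sphere 1
    to the plane of intersection (measured along the bond direction) and
    rho is the radius of the circle. Returns None when the spheres are
    disjoint, tangent, or one lies entirely inside the other.
    """
    if d <= 0 or d >= r1 + r2 or d <= abs(r1 - r2):
        return None
    a = (d * d + r1 * r1 - r2 * r2) / (2 * d)
    rho = math.sqrt(r1 * r1 - a * a)
    return a, rho


# Illustrative values only (assumed radii of 1.7 and 1.5 units, 1.2 apart).
print(intersection_circle(1.7, 1.5, 1.2))
```

Truncating each sphere at its plane reproduces the flat circular faces of the physical CPK pieces.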
The use of spheres is often for convenience, being limited both by graphics libraries and the additional effort required to compute complete electronic density or other space-filling quantities. It is now relatively common to see images of surfaces that have been colored to show quantities such as electrostatic potential. Common surfaces in molecular visualization include solvent-accessible surfaces, solvent-excluded surfaces, and isosurfaces. The isosurface in Fig. 5 appears to show the electrostatic potential, with blue colors being negative and red/yellow positive. Opaque isosurfaces do not allow the atoms to be seen and identified and it is not easy to deduce them. Because of this, isosurfaces are often drawn with a degree of transparency.

Technology

Early interactive molecular computer graphics systems were vector graphics machines, which used stroke-writing vector monitors, sometimes even oscilloscopes. The electron beam does not sweep left-and-right as in a raster display. The display hardware followed a sequential list of digital drawing instructions, directly drawing at an angle one stroke for each molecular bond. When the list was complete, drawing would begin again from the top of the list, so if the list was long, the display would flicker heavily. Later vector displays could rotate complex structures with smooth motion, since the orientation of all of the coordinates in the display list could be changed by loading just a few numbers into rotation registers in the display unit, and the display unit would multiply all coordinates in the display list by the contents of these registers as the picture was drawn.
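The rotation scheme described above, in which a few numbers loaded into registers reorient every coordinate in the display list, amounts to multiplying each point by a 3×3 matrix as it is drawn. The following is a software sketch of that idea only; the function names are hypothetical and nothing here models the actual display hardware.

```python
import math


def rotation_z(angle_rad):
    """3x3 matrix for a rotation about the z axis (the 'register' contents)."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]


def transform(display_list, matrix):
    """Multiply every (x, y, z) point by the matrix.

    The stored display list is left unchanged; only the nine matrix
    entries need to be updated to show a new orientation.
    """
    return [
        tuple(sum(matrix[i][j] * p[j] for j in range(3)) for i in range(3))
        for p in display_list
    ]


# Hypothetical three-point display list, rotated by 10 degrees about z.
points = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
print(transform(points, rotation_z(math.radians(10.0))))
```

Because the coordinates themselves never change, smooth rotation costs the host only a handful of values per frame, which is the point the paragraph makes.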
The early black-and-white vector displays could, to some extent, distinguish a molecular structure from its surrounding electron density map in crystallographic structure solution work by drawing the molecule brighter than the map. Color display makes them easier to tell apart. During the 1970s two-color stroke-writing Penetron tubes were available, but not used in molecular computer graphics systems. In about 1980 Evans & Sutherland made the first practical full-color vector displays for molecular graphics, typically attached to an E&S PS-2 or MPS graphics processor. This early color display was expensive, because it was originally engineered to withstand the shaking of a flight-simulator motion base and because the vector scan was driven by a pair of 1 kW amplifiers. These systems required frequent maintenance and the wise user signed a flat-rate service contract with E&S. The newer E&S PS-300 series graphics processors used less expensive color displays with raster scan technology and the entire system could be purchased for less than the older CSM display alone.
Color raster graphics display of molecular models began around 1978, as seen in a paper by Porter on spherical shading of atomic models. Early raster molecular graphics systems displayed static images that could take around a minute to generate. Dynamically rotating color raster molecular display was phased in during 1982-1985 with the introduction of the Ikonas programmable raster display.
Molecular graphics has always pushed the limits of display technology, and has seen a number of cycles of integration and separation of compute-host and display. Early systems like Project MAC were bespoke and unique, but in the 1970s the MMS-X and similar systems used low-cost terminals, such as the Tektronix 4014 series, often over dial-up lines to multi-user hosts. The devices could only display static pictures but were able to evangelize MG. In the late 1970s, it was possible for departments to afford their own hosts and to attach a display directly to the bus. The display list was kept on the host, and interactivity was good since updates were rapidly reflected in the display—at the cost of reducing most machines to a single-user system.
In the early 1980s, Evans & Sutherland decoupled their PS300 graphics processor/display, which contained its own display information transformable through a dataflow architecture. Complex graphical objects could be downloaded over a serial line or Ethernet interface and then manipulated without impact on the host. The architecture was excellent for high-performance display but very inconvenient for domain-specific calculations, such as electron-density fitting and energy calculations. Many crystallographers and modellers spent arduous months trying to fit such activities into this architecture. In an attempt to simplify this process, E&S designed a card for the PS-300 that implemented several calculation algorithms using a 100-bit-wide finite state machine, but it was so difficult to program that it quickly became obsolete.
The benefits for MG were considerable, but by the later 1980s, UNIX workstations such as Sun-3 with raster graphics had started to appear. Computer-assisted drug design in particular required raster graphics for the display of computed properties such as atomic charge and electrostatic potential. Although E&S had a high-end range of raster graphics they failed to respond to the low-end market challenge where single users, rather than engineering departments, bought workstations. As a result, the market for MG displays passed to Silicon Graphics, coupled with the development of minisupercomputers which were affordable for well-supported MG laboratories. Silicon Graphics provided a graphics language, IrisGL, which was easier to use and more productive than the PS300 architecture. Commercial companies ported their code to Silicon Graphics, and by the early 1990s, this was the "industry standard". Dial boxes were often used as control devices.
Stereoscopic displays were developed based on liquid crystal polarized spectacles, and while this had been very expensive on the PS2, it now became a commodity item. A common alternative was to add a polarizable screen to the front of the display and to provide viewers with extremely cheap spectacles with orthogonal polarization for separate eyes. With projectors such as Barco, it was possible to project stereoscopic display onto special silvered screens and supply an audience of hundreds with spectacles. In this way molecular graphics became universally known within large sectors of chemical and biochemical science, especially in the pharmaceutical industry. Because the backgrounds of many displays were black by default, it was common for modelling sessions and lectures to be held with almost all lighting turned off.
In the last decade almost all of this technology has become commoditized. IrisGL evolved to OpenGL so that molecular graphics can be run on any machine. In 1992, Roger Sayle released his RasMol program into the public domain. RasMol contained a very high-performance molecular renderer that ran on Unix/X Window, and Sayle later ported this to the Windows and Macintosh platforms. The Richardsons developed kinemages and the Mage software, which was also multi-platform. By specifying the chemical MIME type, molecular models could be served over the Internet, so that for the first time MG could be distributed at zero cost regardless of platform. In 1995, Birkbeck College's crystallography department used this to run "Principles of Protein Structure", the first multimedia course on the Internet, which reached 100 to 200 scientists.
MG continues to see innovation that balances technology and art, and currently zero-cost or open source programs such as PyMOL and Jmol have very wide use and acceptance.
Recently the widespread diffusion of advanced graphics hardware has improved the rendering capabilities of the visualization tools. The capabilities of current shading languages allow the inclusion of advanced graphic effects in the interactive visualization of molecules. These graphic effects, besides being eye candy, can improve the comprehension of the three-dimensional shapes of the molecules. An example of the effects that can be achieved by exploiting recent graphics hardware can be seen in the simple open source visualization system QuteMol.

Algorithms

Simple

In early displays only vectors could be drawn, which is easy because no rendering or hidden surface removal is required.
On vector machines the lines would be smooth, but on raster devices Bresenham's algorithm is used to choose the pixels that approximate each line.
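For readers unfamiliar with it, Bresenham's algorithm picks the pixels closest to an ideal line using only integer additions and comparisons. The following is a minimal sketch of the common all-octant integer form; the function name and the list-of-pixels return value are choices made for this illustration rather than part of any particular molecular graphics program.

```python
def bresenham(x0, y0, x1, y1):
    """Return the integer pixels on the line from (x0, y0) to (x1, y1)."""
    pixels = []
    dx = abs(x1 - x0)
    dy = -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1   # step direction in x
    sy = 1 if y0 < y1 else -1   # step direction in y
    err = dx + dy               # running error term
    while True:
        pixels.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:            # error favors a step in x
            err += dy
            x0 += sx
        if e2 <= dx:            # error favors a step in y
            err += dx
            y0 += sy
    return pixels


print(bresenham(0, 0, 3, 1))  # [(0, 0), (1, 0), (2, 1), (3, 1)]
```

Each bond of a wireframe model would be passed through a routine of this kind once its end points have been converted to integer screen coordinates.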
Atoms can be drawn as circles, but these should be sorted so that those with the largest z-coordinates are drawn last. Although imperfect, this often gives a reasonably attractive display. Other simple tricks that avoid hidden surface algorithms include coloring each half of a bond to match the atom at that end, as the pseudocode for Fig. 7 does.
Typical pseudocode for creating Fig. 7:
// Assume:
// atoms with x, y, z coordinates and elementSymbol
// bonds with pointers/references to atoms at ends
// table of colors for elementTypes
// find limits of molecule in molecule coordinates as xMin, yMin, xMax, yMax
scale = min(xScreenMax / (xMax − xMin), yScreenMax / (yMax − yMin))
xOffset = −xMin × scale
yOffset = −yMin × scale
for each bond in bonds do
    atom0 = bond.getAtom(0)
    atom1 = bond.getAtom(1)
    x0 = xOffset + atom0.getX() × scale
    y0 = yOffset + atom0.getY() × scale // (1)
    x1 = xOffset + atom1.getX() × scale
    y1 = yOffset + atom1.getY() × scale // (2)
    // draw each half of the bond in the color of the atom at that end
    xMid = (x0 + x1) / 2
    yMid = (y0 + y1) / 2
    color0 = ColorTable.getColor(atom0.elementSymbol)
    drawLine(color0, x0, y0, xMid, yMid)
    color1 = ColorTable.getColor(atom1.elementSymbol)
    drawLine(color1, x1, y1, xMid, yMid)
Note that this assumes the origin is in the bottom left corner of the screen, with y up the screen. Many graphics systems have the origin at the top left, with y down the screen. In this case the lines (1) and (2) should generate the y coordinate as:
y0 = yScreenMax − (yOffset + atom0.getY() × scale) // (1)
y1 = yScreenMax − (yOffset + atom1.getY() × scale) // (2)
Changes of this sort change the handedness of the axes so it is easy to reverse the chirality of the displayed molecule unless care is taken.
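The ideas in this section can be collected into a small runnable sketch. It is an illustration under stated assumptions, not the method of any particular program: atoms are plain dictionaries with "x", "y", "z" and "element" keys, bonds are pairs of atom indices, and the function names are hypothetical. The origin_top_left flag applies the y flip described above, and painter_order gives the back-to-front order for drawing atoms as circles.

```python
def screen_transform(atoms, x_screen_max, y_screen_max, origin_top_left=False):
    """Return a function mapping molecule (x, y) to screen (x, y)."""
    xs = [a["x"] for a in atoms]
    ys = [a["y"] for a in atoms]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    # Guard against a zero extent (all atoms sharing one x or one y).
    x_span = (x_max - x_min) or 1.0
    y_span = (y_max - y_min) or 1.0
    scale = min(x_screen_max / x_span, y_screen_max / y_span)

    def to_screen(x, y):
        sx = (x - x_min) * scale
        sy = (y - y_min) * scale
        if origin_top_left:
            # Flip y so the molecule is not mirrored on a y-down screen.
            sy = y_screen_max - sy
        return sx, sy

    return to_screen


def half_bond_segments(atoms, bonds, colors, to_screen):
    """Return (color, x_start, y_start, x_mid, y_mid) for each half bond."""
    segments = []
    for i, j in bonds:
        x0, y0 = to_screen(atoms[i]["x"], atoms[i]["y"])
        x1, y1 = to_screen(atoms[j]["x"], atoms[j]["y"])
        x_mid, y_mid = (x0 + x1) / 2, (y0 + y1) / 2
        segments.append((colors[atoms[i]["element"]], x0, y0, x_mid, y_mid))
        segments.append((colors[atoms[j]["element"]], x1, y1, x_mid, y_mid))
    return segments


def painter_order(atoms):
    """Sort atoms so that those with the largest z are drawn last."""
    return sorted(atoms, key=lambda a: a["z"])
```

The segments would then be handed to whatever line-drawing call the graphics system provides. Keeping the flip inside to_screen means every coordinate passes through the same conversion, which reduces the risk of accidentally reversing the displayed chirality.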

Advanced

For greater realism and better comprehension of the 3D structure of a molecule many computer graphics algorithms can be used. For many years molecular graphics has stressed the capabilities of graphics hardware and has required hardware-specific approaches. With the increasing power of machines on the desktop, portability is more important and programs such as Jmol have advanced algorithms that do not rely on hardware. On the other hand, recent graphics hardware is able to interactively render very complex molecule shapes with a quality that would not be possible with standard software techniques.

Chronology

Developer | Approximate date | Technology | Comments
Crystallographers | < 1960 | Hand-drawn | Crystal structures, with hidden atom and bond removal. Often clinographic projections.
Johnson, Motherwell | ca. 1970 | Pen plotter | ORTEP, PLUTO. Very widely deployed for publishing crystal structures.
Cyrus Levinthal, Bob Langridge, Ward, Stots | 1966 | Project MAC display system, two-degree-of-freedom, spring-return velocity joystick for rotating the image. | First protein display on screen. System for interactively building protein structures.
Barry | 1969 | LINC 300 computer with a dual trace oscilloscope display. | Interactive molecular structure viewing system. Early examples of dynamic rotation, intensity depth-cueing, and side-by-side stereo. Early use of the small angle approximations to speed up graphical rotation calculations.
Ortony | 1971 | Designed a stereo viewer for molecular computer graphics. | Horizontal two-way mirror combines images drawn on the upper and lower halves of a CRT. Crossed polarizers isolate the images to each eye.
Ortony | 1971 | Light pen, knob. | Interactive molecular structure viewing system. Select bond by turning another knob until desired bond lights up in sequence, a technique later used on the MMS-4 system below, or by picking with the light pen. Points in space are specified with a 3-D "bug" under dynamic control.
Barry, Graesser, Marshall | 1971 | CHEMAST: LINC 300 computer driving an oscilloscope. Two-axis joystick, similar to one used later by GRIP-75. | Interactive molecular structure viewing system. Structures dynamically rotated using the joystick.
Tountas and Katz | 1971 | Adage AGT/50 display | Interactive molecular structure viewing system. Mathematics of nested rotation and for laboratory-space rotation.
Perkins, Piper, Tattam, White | 1971 | Honeywell DDP 516 computer, EAL TR48 analog computer, Lanelec oscilloscope, 7 linear potentiometers. Stereo. | Interactive molecular structure viewing system.
Wright | 1972 | GRIP-71 at UNC-CH: IBM System/360 Model 40 time-shared computer, IBM 2250 display, buttons, light pen, keyboard. | Discrete manipulation and energy relaxation of protein structures. Program code became the foundation of the GRIP-75 system below.
Barry and North | 1972 | Oxford Univ.: Ferranti Argus 500 computer, Ferranti model 30 display, keyboard, track ball, one knob. Stereo. | Prototype large-molecule crystallographic structure solution system. Track ball rotates a bond, knob brightens the molecule vs. electron density map.
North, Ford, Watson | Early 1970s | Leeds Univ.: DEC PDP-11/40 computer, Hewlett-Packard display. 16 knobs, keyboard, spring-return joystick. Stereo. | Prototype large-molecule crystallographic structure solution system. Six knobs rotate and translate a small molecule.
Barry, Bosshard, Ellis, Marshall, Fritch, Jacobi | 1974 | MMS-4: Washington Univ. at St. Louis, LINC 300 computer and an LDS-1 / LINC 300 display, custom display modules. Rotation joystick, knobs. Stereo. | Prototype large-molecule crystallographic structure solution system. Select bond to rotate by turning another knob until desired bond lights up in sequence.
Cohen and Feldmann | 1974 | DEC PDP-10 computer, Adage display, push buttons, keyboard, knobs | Prototype large-molecule crystallographic structure solution system.
Stellman | 1975 | Princeton: PDP-10 computer, LDS-1 display, knobs | Prototype large-molecule crystallographic structure solution system. Electron density map not shown; instead an "H Factor" figure of merit is updated as the molecular structure is manipulated.
Collins, Cotton, Hazen, Meyer, Morimoto | 1975 | CRYSNET, Texas A&M Univ. DEC PDP-11/40 computer, Vector General Series 3 display, knobs, keyboard. Stereo. | Prototype large-molecule crystallographic structure solution system. Variety of viewing modes: rocking, spinning, and several stereo display modes.
Cornelius and Kraut | 1976 | Univ. of Calif. at San Diego: DEC PDP-11/40 emulator, Evans and Sutherland Picture System display, keyboard, 6 knobs. Stereo. | Prototype large-molecule crystallographic structure solution system.
(not listed) | 1976 | PIGS: DEC PDP-11/70 computer, Evans and Sutherland Picture System 2 display, data tablet, knobs. | Prototype large-molecule crystallographic structure solution system. The tablet was used for most interactions.
Feldmann and Porter | 1976 | NIH: DEC PDP-11/70 computer. Evans and Sutherland Picture System 2 display, knobs. Stereo. | Interactive molecular structure viewing system. Intended to display interactively molecular data from the AMSOM – Atlas of Macromolecular Structure on Microfiche.
Rosenberger et al. | 1976 | MMS-X: Washington Univ. at St. Louis, TI 980B computer, Hewlett-Packard 1321A display, Beehive video terminal, custom display modules, pair of 3-D spring-return joysticks, knobs. | Prototype large-molecule crystallographic structure solution system. Successor to the MMS-4 system above. The 3-D spring-return joysticks either translate and rotate the molecular structure for viewing or a molecular substructure for fitting, mode controlled by a toggle switch.
Britton, Lipscomb, Pique, Wright, Brooks | 1977 | GRIP-75 at UNC-CH: Time-shared IBM System/360 Model 75 computer, DEC PDP 11/45 computer, Vector General Series 3 display, 3-D movement box from A.M. Noll and 3-D spring return joystick for substructure manipulation, Measurement Systems nested joystick, knobs, sliders, buttons, keyboard, light pen. | First large-molecule crystallographic structure solution.
Jones | 1978 | FRODO and RING Max Planck Inst., Germany, RING: DEC PDP-11/40 and Siemens 4004 computers, Vector General 3404 display, 6 knobs. | Large-molecule crystallographic structure solution. FRODO may have run on a DEC VAX-780 as a follow-on to RING.
Diamond | 1978 | Bilder Cambridge, England, DEC PDP-11/50 computer, Evans and Sutherland Picture System display, tablet. | Large-molecule crystallographic structure solution. All input is by data tablet. Molecular structures built on-line with ideal geometry. Later passes stretch bonds with idealization.
Langridge, White, Marshall | Late 1970s | Departmental systems | Mixture of commodity computing with early displays.
Davies, Hubbard | Mid-1980s | CHEM-X, HYDRA | Laboratory systems with multicolor, raster and vector devices.
Biosym, Tripos, Polygen | Mid-1980s | PS300 and lower cost dumb terminals | Commercial integrated modelling and display packages.
Silicon Graphics, Sun | Late 1980s | IRIS GL workstations | Commodity-priced single-user workstations with stereoscopic display.
EMBL - WHAT IF | 1989, 2000 | Machine independent | Nearly free, multifunctional, still fully supported, many free servers based on it.
Sayle, Richardson | 1992, 1993 | RasMol, Kinemage | Platform-independent MG.
MDL | 1995–1998 | Chime | Proprietary C++; free browser plugin for Mac and PCs.
(not listed) | 1997- | ICM-Browser | Proprietary; free download for Windows, Mac, and Linux.
(not listed) | 1998- | MarvinSketch & MarvinView. MarvinSpace | Proprietary Java applet or stand-alone application.
Community efforts | 2000- | Jmol, PyMol, Avogadro, PDB | Open-source Java applet or stand-alone application.
NOCH | 2002- | NOC | Open source code molecular structure explorer.
LION Bioscience / EMBL | 2004- | SRS 3D | Free, open-source system based on Java3D. Integrates 3D structures with sequence and feature data.
San Diego Supercomputer Center | 2006- | Sirius | Free for academic/non-profit institutions.
Community efforts | 2009- | HTML5/JavaScript viewers | All open-source. Require WebGL support in the browser.

Electronic Richards Box Systems

Before computer graphics could be employed, mechanical methods were used to fit large molecules to their electron density maps. Using techniques of X-ray crystallography, crystals of a substance were bombarded with X-rays, and the diffracted beams that came off were assembled by computer using a Fourier transform into a usually blurry 3-D image of the molecule, made visible by drawing contour circles around high electron density to produce a contoured electron density map.
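For reference, the Fourier transform step mentioned above is conventionally written as the following synthesis. The symbols are standard crystallographic notation rather than anything defined in this article: V is the unit-cell volume, (x, y, z) are fractional coordinates, and F(hkl) is the complex structure factor of the reflection with indices h, k, l. Only the amplitudes are measured directly, so the phases have to be estimated by other means.

```latex
\rho(x, y, z) = \frac{1}{V} \sum_{h} \sum_{k} \sum_{l}
    F(hkl) \, \exp\!\left[ -2\pi i \, (hx + ky + lz) \right],
\qquad
F(hkl) = \lvert F(hkl) \rvert \, e^{\, i \varphi(hkl)}
```

Contouring this density at a chosen level on each section through the unit cell produces the stacked contour lines of the map.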
In the earliest days, contoured electron density maps were hand drawn on large plastic sheets. Sometimes, bingo chips were placed on the plastic sheets where atoms were interpreted to be.
This was superseded by the Richards Box, in which an adjustable brass Kendrew molecular model was placed in front of a 2-way mirror, behind which were plastic sheets of the electron density map. This optically superimposed the molecular model and the electron density map. The model was moved to within the contour lines of the superimposed map. Then, atomic coordinates were recorded using a plumb bob and a meter stick.
Computer graphics held out the hope of vastly speeding up this process, as well as giving a clearer view in many ways.
A noteworthy attempt to overcome the low speed of graphics displays of the time took place at Washington University in St. Louis, USA. Dave Barry's group attempted to leapfrog the state of the art in graphics displays by making custom display hardware to display images complex enough for large-molecule crystallographic structure solution, fitting molecules to their electron-density maps. The MMS-4 display modules were slow and expensive, so a second generation of modules was produced for the MMS-X system.
The first large molecule whose atomic structure was partly determined on a molecular computer graphics system was transfer RNA, by Sung-Hou Kim's team in 1976, after initial fitting on a mechanical Richards Box. The first large molecule whose atomic structure was entirely determined on a molecular computer graphics system is said to be neurotoxin A from venom of the Philippines sea snake, by Tsernoglou, Petsko, and Tu, with a statement of being first in 1977. The Richardson group published partial atomic structure results of the protein superoxide dismutase the same year, in 1977. All of these were done using the GRIP-75 system.
Other structure fitting systems, such as FRODO, RING, Bilder, and MMS-X, succeeded as well within three years and became dominant.
The reason that most of these systems succeeded in just those years, not earlier or later, and within a short timespan had to do with the arrival of commercial hardware that was powerful enough. Two things were needed and arrived at about the same time. First, electron density maps are large and require either a computer with at least a 24-bit address space or a combination of a computer with a lesser 16-bit address space plus several years to overcome the difficulties of an address space that is smaller than the data. The second arrival was that of interactive computer graphics displays that were fast enough to display electron-density maps, whose contour circles require the display of numerous short vectors. The first such displays were the Vector General Series 3 and the Evans and Sutherland Picture System 2, MultiPicture System, and PS-300.
Nowadays, fitting of the molecular structure to the electron density map is largely automated by algorithms, with computer graphics as a guide to the process. An example is the XtalView XFit program.