English Pages 455 S [492] Year 2009
Coarse-Graining of Condensed Phase and Biomolecular Systems
Edited by
Gregory A. Voth
Boca Raton London New York
CRC Press is an imprint of the Taylor & Francis Group, an informa business
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742

© 2009 by Taylor & Francis Group, LLC
CRC Press is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government works
Printed in the United States of America on acid-free paper
10 9 8 7 6 5 4 3 2 1

International Standard Book Number-13: 978-1-4200-5955-7 (Hardcover)

This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Library of Congress Cataloging-in-Publication Data

Coarse-graining of condensed phase and biomolecular systems / editor, Gregory A. Voth.
p. ; cm.
Includes bibliographical references and index.
ISBN 978-1-4200-5955-7 (hardcover : alk. paper)
1. Molecular dynamics--Computer simulation. 2. Biomolecules--Computer simulation. 3. Condensed matter--Computer simulation. I. Voth, Gregory A. II. Title.
[DNLM: 1. Computer Simulation. 2. Computational Biology--methods. 3. Models, Molecular. 4. Models, Statistical. 5. Molecular Biology--methods. QA 76.9.C65 C652 2009]
QP517.M65C63 2009
541’.394--dc22
2008027690
Visit the Taylor & Francis Web site at http://www.taylorandfrancis.com and the CRC Press Web site at http://www.crcpress.com
Table of Contents

Acknowledgments
Editor
Contributors

Chapter 1  Introduction
    Gregory A. Voth

Chapter 2  The MARTINI Force Field
    Siewert J. Marrink, Marc Fuhrmans, H. Jelger Risselada, and Xavier Periole

Chapter 3  The Multiscale Coarse-Graining Method: A Systematic Approach to Coarse-Graining
    W. G. Noid, Gary S. Ayton, Sergei Izvekov, and Gregory A. Voth

Chapter 4  A Model for Lipid Bilayers in Implicit Solvent
    Grace Brannigan and Frank L.H. Brown

Chapter 5  Coarse-Grained Dynamics of Anisotropic Systems
    L. Paramonov, M.G. Burke, and S.N. Yaliraki

Chapter 6  State-Point Dependence and Transferability of Potentials in Systematic Structural Coarse-Graining
    Qi Sun, Jayeeta Ghosh, and Roland Faller

Chapter 7  Systematic Approach to Coarse-Graining of Molecular Descriptions and Interactions with Applications to Lipid Membranes
    Teemu Murtola, Ilpo Vattulainen, and Mikko Karttunen

Chapter 8  Simulation of Protein Structure and Dynamics with the Coarse-Grained UNRES Force Field
    Adam Liwo, Cezary Czaplewski, Stanisław Ołdziej, Ana V. Rojas, Rajmund Kaźmierkiewicz, Mariusz Makowski, Rajesh K. Murarka, and Harold A. Scheraga

Chapter 9  Coarse-Grained Structure-Based Simulations of Proteins and RNA
    Alexander Schug, Changbong Hyeon, and José N. Onuchic

Chapter 10  On the Development of Coarse-Grained Protein Models: Importance of Relative Side-Chain Orientations and Backbone Interactions
    N.V. Buchete, J.E. Straub, and D. Thirumalai

Chapter 11  Characterization of Protein-Folding Landscapes by Coarse-Grained Models Incorporating Experimental Data
    Silvina Matysiak and Cecilia Clementi

Chapter 12  Principles and Practicalities of Canonical Mixed-Resolution Sampling of Biomolecules
    Daniel M. Zuckerman

Chapter 13  Pathways of Conformational Transitions in Proteins
    Peter Májek, Ron Elber, and Harel Weinstein

Chapter 14  Insights into the Sequence-Dependent Macromolecular Properties of DNA from Base-Pair Level Modeling
    Wilma K. Olson, Andrew V. Colasanti, Luke Czapla, and Guohui Zheng

Chapter 15  Coarse-Grained Models for Nucleic Acids and Large Nucleoprotein Assemblies
    Robert K.Z. Tan, Anton S. Petrov, Batsal Devkota, and Stephen C. Harvey

Chapter 16  Elastic Network Models of Coarse-Grained Proteins Are Effective for Studying the Structural Control Exerted over Their Dynamics
    Robert L. Jernigan, Lei Yang, Guang Song, Ozge Kurkcuoglu, and Pemra Doruker

Chapter 17  Coarse-Grained Elastic Normal Mode Analysis and Its Applications in X-Ray Crystallographic Refinement at Moderate Resolutions
    Jianpeng Ma

Chapter 18  Coarse-Grained Normal Mode Analysis to Explore Large-Scale Dynamics of Biological Molecules
    Osamu Miyashita and Florence Tama

Chapter 19  One-Bead Coarse-Grained Models for Proteins
    Valentina Tozzini and J. Andrew McCammon

Chapter 20  Application of Residue-Based and Shape-Based Coarse-Graining to Biomolecular Simulations
    Peter L. Freddolino, Amy Y. Shih, Anton Arkhipov, Ying Ying, Zhongzhou Chen, and Klaus Schulten

Chapter 21  Coarse-Graining Protein Mechanics
    Richard Lavery and Sophie Sacquin-Mora

Chapter 22  Self-Assembly of Surfactants in Bulk Phases and at Interfaces Using Coarse-Grain Models
    Wataru Shinoda, Russell DeVane, and Michael L. Klein

Chapter 23  Coarse-Grained Simulations of Polyelectrolytes
    Mark J. Stevens

Chapter 24  Monte Carlo Simulations of a Coarse-Grain Model for Block Copolymer Systems
    F.A. Detcheverry, K.Ch. Daoulas, M. Müller, P.F. Nealey, and J.J. de Pablo

Chapter 25  Structure-Based Coarse- and Fine-Graining in Soft Matter Simulations
    Nico F.A. van der Vegt, Christine Peter, and Kurt Kremer

Chapter 26  From Atomistic Modeling of Macromolecules Toward Equations of State for Polymer Solutions and Melts: How Important Is the Accurate Description of the Local Structure?
    Kurt Binder, Wolfgang Paul, Peter Virnau, Leonid Yelash, Marcus Müller, and Luis González MacDowell

Chapter 27  Effective Interaction Potentials for Coarse-Grained Simulations of Polymer-Tethered Nanoparticle Self-Assembly in Solution
    Elaine R. Chan, Alberto Striolo, Clare McCabe, Peter T. Cummings, and Sharon C. Glotzer

Chapter 28  Coarse-Graining in Time: From Microscopics to Macroscopics
    Angela Violi

Index
Acknowledgments

My own research contributions to this book would not have been possible if it were not for the remarkable dedication, talent, and hard work of the members of my research group, both past and present. I thank my assistant, Shawna Derry, for her indispensable help and patience in the preparation of this book, and Lance Wobus of CRC Press/Taylor & Francis for his help in formulating the concept of the book and for his advice and guidance during its preparation. Most of all, I thank my two children, Michael and Carolyn, for supporting me through the many long hours I have worked in my career, my two brothers and mother for putting up with someone who tries to think “outside the box” a bit too much, and my father who, while he was living, taught me the value of loyalty, courage, and perseverance.
Editor

Gregory A. Voth is a distinguished professor of chemistry and the director of the Center for Biophysical Modeling and Simulation at the University of Utah. He received a PhD in theoretical chemistry from the California Institute of Technology in 1987. Selected honors and awards include: John Simon Guggenheim Memorial Fellowship, 2004–2005; Miller Professorship, University of California, Berkeley, 2003; Elected Fellow of the American Association for the Advancement of Science, 1999; Elected Fellow of the American Physical Society, 1998; IBM Faculty Research Award, 1997–99; Camille Dreyfus Teacher-Scholar Award, 1994–99; Alfred P. Sloan Foundation Research Fellow, 1992–94; National Science Foundation Presidential Young Investigator Award, 1991–96; David and Lucile Packard Foundation Fellowship in Science and Engineering, 1990–95; Camille and Henry Dreyfus Distinguished New Faculty Award, 1989; IBM Postdoctoral Fellowship, University of California, Berkeley, 1987–88; The Francis and Milton Clauser Doctoral Prize, California Institute of Technology, 1987; The Herbert Newby McCoy Award, California Institute of Technology, 1986; The Procter and Gamble Award for Outstanding Research in Physical Chemistry, American Chemical Society, 1985. Current professional affiliations include the American Chemical Society (ACS), the American Physical Society (APS), the Biophysical Society (BPS), and the American Association for the Advancement of Science (AAAS). Professor Voth is the author or coauthor of more than 300 peer-reviewed scientific articles and mentor to more than 100 postdoctoral fellows, graduate students, and undergraduate research assistants.

His research interests include multiscale simulation and theoretical modeling of biomolecular systems; proton transport processes in biological, material, and solution phase systems; computer simulation and modeling of soft materials; room-temperature ionic liquids; theory and simulation of solvation phenomena; structure and dynamics of interfaces; theory and simulation of condensed-phase quantum dynamical processes; and high-performance computing.
Contributors

Anton Arkhipov
Department of Physics; University of Illinois at Urbana-Champaign; Urbana, Illinois

Gary S. Ayton
Center for Biophysical Modeling and Simulation and Department of Chemistry; University of Utah; Salt Lake City, Utah

Kurt Binder
Institut für Physik; Johannes Gutenberg-Universität Mainz; Mainz, Germany

Grace Brannigan
Center for Molecular Modeling; Department of Chemistry; University of Pennsylvania; Philadelphia, Pennsylvania

Frank L. H. Brown
Department of Chemistry and Biochemistry and Department of Physics; University of California at Santa Barbara; Santa Barbara, California

N.V. Buchete
School of Physics; University College Dublin; Dublin, Ireland

M. G. Burke
Institute for Mathematical Sciences, and Department of Chemistry; Imperial College London; London, England, U.K.

Elaine R. Chan
Electronics and Electrical Engineering Laboratory; National Institute of Standards and Technology; Gaithersburg, Maryland

Zhongzhou Chen
Department of Physics; University of Illinois at Urbana-Champaign; Urbana, Illinois

Cecilia Clementi
Department of Chemistry; Rice University; Houston, Texas

Andrew V. Colasanti
Department of Chemistry & Chemical Biology; BioMaPS Institute for Quantitative Biology; Rutgers, The State University of New Jersey; Piscataway, New Jersey

Peter T. Cummings
Center for Nanophase Materials Sciences; Oak Ridge National Laboratory; Oak Ridge, Tennessee

Luke Czapla
Department of Chemistry & Chemical Biology; BioMaPS Institute for Quantitative Biology; Rutgers, The State University of New Jersey; Piscataway, New Jersey

Cezary Czaplewski
Baker Laboratory of Chemistry and Chemical Biology; Cornell University; Ithaca, New York; and Faculty of Chemistry; University of Gdańsk; Gdańsk, Poland

K. Ch. Daoulas
Institut für Theoretische Physik; Georg-August Universität; Göttingen, Germany

J. J. de Pablo
Department of Chemical and Biological Engineering; University of Wisconsin-Madison; Madison, Wisconsin

F. A. Detcheverry
Department of Chemical and Biological Engineering; University of Wisconsin-Madison; Madison, Wisconsin

Russell DeVane
The Laboratory for Research on the Structure of Matter; University of Pennsylvania; Philadelphia, Pennsylvania

Batsal Devkota
School of Biology; Georgia Institute of Technology; Atlanta, Georgia

Pemra Doruker
Department of Chemical Engineering and Polymer Research Center; Bogazici University; Bebek, Istanbul, Turkey

Ron Elber
Department of Computer Science; Cornell University; Ithaca, New York

Roland Faller
Department of Chemical Engineering and Materials Science; University of California, Davis; Davis, California

Peter L. Freddolino
Center for Biophysics and Computational Biology; University of Illinois at Urbana-Champaign; Urbana, Illinois

Marc Fuhrmans
Groningen Biomolecular Sciences and Biotechnology Institute and Zernike Institute for Advanced Materials; University of Groningen; Groningen, The Netherlands

Jayeeta Ghosh
Department of Chemical Engineering and Materials Science; University of California, Davis; Davis, California

Sharon C. Glotzer
Department of Chemical Engineering and Department of Materials Science and Engineering; University of Michigan; Ann Arbor, Michigan

Stephen C. Harvey
School of Biology; Georgia Institute of Technology; Atlanta, Georgia

Changbong Hyeon
Center for Theoretical Biological Physics; University of California, San Diego; La Jolla, California

Sergei Izvekov
Center for Biophysical Modeling and Simulation and Department of Chemistry; University of Utah; Salt Lake City, Utah

Robert L. Jernigan
LH Baker Center for Bioinformatics and Biological Statistics; Department of Biochemistry, Biophysics, and Molecular Biology; Iowa State University; Ames, Iowa

Mikko Karttunen
Department of Applied Mathematics; The University of Western Ontario; London, Ontario, Canada

Rajmund Kaźmierkiewicz
Baker Laboratory of Chemistry and Chemical Biology; Cornell University; Ithaca, New York; and University of Gdańsk; Gdańsk, Poland

Michael L. Klein
The Laboratory for Research on the Structure of Matter; University of Pennsylvania; Materials Research Science and Engineering Center; Philadelphia, Pennsylvania

Kurt Kremer
Max Planck Institute for Polymer Research; Mainz, Germany

Ozge Kurkcuoglu
Department of Chemical Engineering and Polymer Research Center; Bogazici University; Bebek, Istanbul, Turkey

Richard Lavery
Institut de Biologie et Chimie des Protéines; Université de Lyon; Lyon, France

Adam Liwo
Baker Laboratory of Chemistry and Chemical Biology; Cornell University; Ithaca, New York; and Faculty of Chemistry; University of Gdańsk; Gdańsk, Poland

Jianpeng Ma
Baylor College of Medicine; Verna and Marrs McLean Department of Biochemistry and Molecular Biology; Houston, Texas

Luis González MacDowell
Departamento de Química Física; Universidad Complutense de Madrid; Madrid, Spain

Peter Májek
Department of Computer Science; Cornell University; Ithaca, New York

Mariusz Makowski
Baker Laboratory of Chemistry and Chemical Biology; Cornell University; Ithaca, New York; and Faculty of Chemistry; University of Gdańsk; Gdańsk, Poland

Siewert J. Marrink
Groningen Biomolecular Sciences and Biotechnology Institute and Zernike Institute for Advanced Materials; University of Groningen; Groningen, The Netherlands

Silvina Matysiak
Institute for Computational Engineering and Science; The University of Texas at Austin; Austin, Texas

Clare McCabe
Department of Chemical Engineering; Vanderbilt University; Nashville, Tennessee

J. Andrew McCammon
Department of Chemistry and Biochemistry; Center for Theoretical Biological Physics; Howard Hughes Medical Institute; University of California, San Diego; La Jolla, California

Osamu Miyashita
Department of Biochemistry and Molecular Biophysics; The University of Arizona; Tucson, Arizona

Marcus Müller
Institut für Theoretische Physik; Georg-August Universität; Göttingen, Germany

Rajesh K. Murarka
Baker Laboratory of Chemistry and Chemical Biology; Cornell University; Ithaca, New York

Teemu Murtola
Laboratory of Physics and Helsinki Institute of Physics; Helsinki University of Technology; Espoo, Finland

P. F. Nealey
Department of Chemical and Biological Engineering; University of Wisconsin-Madison; Madison, Wisconsin

W. G. Noid
Department of Chemistry; Pennsylvania State University; University Park, Pennsylvania

Stanisław Ołdziej
Baker Laboratory of Chemistry and Chemical Biology; Cornell University; Ithaca, New York; and Faculty of Chemistry; University of Gdańsk; Gdańsk, Poland

Wilma K. Olson
Department of Chemistry & Chemical Biology; BioMaPS Institute for Quantitative Biology; Rutgers, The State University of New Jersey; Piscataway, New Jersey

José N. Onuchic
Center for Theoretical Biological Physics; University of California, San Diego; La Jolla, California

L. Paramonov
Institute for Mathematical Sciences, and Department of Chemistry; Imperial College London; London, England, U.K.

Wolfgang Paul
Institut für Physik; Johannes Gutenberg-Universität Mainz; Mainz, Germany

Xavier Periole
Groningen Biomolecular Sciences and Biotechnology Institute and Zernike Institute for Advanced Materials; University of Groningen; Groningen, The Netherlands

Christine Peter
Max Planck Institute for Polymer Research; Mainz, Germany

Anton S. Petrov
School of Biology; Georgia Institute of Technology; Atlanta, Georgia

H. Jelger Risselada
Groningen Biomolecular Sciences and Biotechnology Institute and Zernike Institute for Advanced Materials; University of Groningen; Groningen, The Netherlands

Ana V. Rojas
Baker Laboratory of Chemistry and Chemical Biology; Cornell University; Ithaca, New York; and Department of Physics and Astronomy; Louisiana State University; Baton Rouge, Louisiana; and Center for Computation and Technology; Louisiana State University; Baton Rouge, Louisiana

Sophie Sacquin-Mora
Laboratoire de Biochimie Théorique; Institut de Biologie Physico-Chimique; Paris, France

Harold A. Scheraga
Baker Laboratory of Chemistry and Chemical Biology; Cornell University; Ithaca, New York

Alexander Schug
Center for Theoretical Biological Physics; University of California, San Diego; La Jolla, California

Klaus Schulten
Department of Physics; University of Illinois at Urbana-Champaign; Urbana, Illinois

Amy Y. Shih
Center for Biophysics and Computational Biology; University of Illinois at Urbana-Champaign; Urbana, Illinois

Wataru Shinoda
Research Institute of Computational Science; National Institute of Advanced Industrial Science and Technology; Philadelphia, Pennsylvania

Guang Song
LH Baker Center for Bioinformatics and Biological Statistics; Department of Computer Science; Iowa State University; Ames, Iowa

Mark J. Stevens
Sandia National Laboratories; Albuquerque, New Mexico

J. E. Straub
Chemistry Department; Boston University; Boston, Massachusetts

Alberto Striolo
School of Chemical, Biological and Materials Engineering; The University of Oklahoma; Norman, Oklahoma

Qi Sun
Department of Chemical Engineering and Materials Science; University of California, Davis; Davis, California

Florence Tama
Department of Biochemistry and Molecular Biophysics; The University of Arizona; Tucson, Arizona

Robert K.Z. Tan
School of Biology; Georgia Institute of Technology; Atlanta, Georgia

D. Thirumalai
Biophysics Program; Institute for Physical Science and Technology; University of Maryland; College Park, Maryland

Valentina Tozzini
NEST-CNR-INFM; Scuola Normale Superiore; Pisa, Italy

Nico F. A. van der Vegt
Max Planck Institute for Polymer Research; Mainz, Germany

Ilpo Vattulainen
Department of Physics; Tampere University of Technology; Tampere, Finland

Angela Violi
Department of Mechanical Engineering; University of Michigan; Ann Arbor, Michigan

Peter Virnau
Institut für Physik; Johannes Gutenberg-Universität Mainz; Mainz, Germany

Gregory A. Voth
Center for Biophysical Modeling and Simulation and Department of Chemistry; University of Utah; Salt Lake City, Utah

Harel Weinstein
Department of Physiology and Biophysics and the HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine; Weill Medical College of Cornell University; New York, New York

S. N. Yaliraki
Institute for Mathematical Sciences, and Department of Chemistry; Imperial College London; London, England, U.K.

Lei Yang
LH Baker Center for Bioinformatics and Biological Statistics; Department of Biochemistry, Biophysics, and Molecular Biology; Iowa State University; Ames, Iowa

Leonid Yelash
Institut für Physik; Johannes Gutenberg-Universität Mainz; Mainz, Germany

Ying Ying
Department of Physics; University of Illinois at Urbana-Champaign; Urbana, Illinois

Guohui Zheng
Department of Chemistry & Chemical Biology; BioMaPS Institute for Quantitative Biology; Rutgers, The State University of New Jersey; Piscataway, New Jersey

Daniel M. Zuckerman
Department of Computational Biology; University of Pittsburgh School of Medicine; Pittsburgh, Pennsylvania
1 Introduction

Gregory A. Voth
Department of Chemistry, University of Utah
The computer simulation of condensed phases and biomolecular systems has resulted in profound new insight into the molecular-scale phenomena that occur in these complex systems. However, many processes in liquids, soft materials, and biomolecular systems occur over length and time scales that are well beyond the current capabilities of atomic-level simulation. As such, new approaches continue to be developed that can access phenomena on longer time and length scales. One such approach is coarse-grained (CG) simulation, the topic of this book.

In coarse-graining, groups of atoms are clustered into new CG “sites”. These CG sites then interact through more computationally efficient effective interactions. The combination of these efficient interactions with the reduction in the total number of degrees of freedom of the system allows for a significant jump in the accessible spatial/temporal scales. As such, coarse-graining is the reduction of molecular-scale information (structure and interactions) into lower-resolution models that seek to retain the key physical features of the system of interest but are also simplified in their form (sometimes even greatly simplified). Such CG models are then most often used in a molecular simulation context, usually molecular dynamics (MD) or Monte Carlo (MC) simulation, to obtain the target properties of the system of interest.

The key motivation for CG molecular modeling and simulation thus primarily derives from the need to bridge the atomistic and mesoscopic scales. Typically, there are two to three orders of magnitude in length and time separating these regimes. At the mesoscopic scale, one sees the emergence of critically important phenomena (e.g., self-assembly in biomolecular or soft matter systems).
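The clustering of atoms into CG sites described above can be illustrated with a minimal sketch. It assumes a center-of-mass mapping, which is one common choice and not the only one; the function name `map_to_cg_sites`, the four-atom fragment, and the site labels are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def map_to_cg_sites(positions, masses, site_of_atom):
    """Map atoms onto CG sites by taking the center of mass of each group.

    positions    -- list of (x, y, z) atomic coordinates
    masses       -- list of atomic masses, same order as positions
    site_of_atom -- list of CG site labels, one per atom (assumed mapping)
    """
    mass_sum = defaultdict(float)
    weighted = defaultdict(lambda: [0.0, 0.0, 0.0])
    for pos, mass, site in zip(positions, masses, site_of_atom):
        mass_sum[site] += mass
        for k in range(3):
            weighted[site][k] += mass * pos[k]
    # Divide the mass-weighted coordinate sums by the total mass of each site.
    return {s: tuple(w / mass_sum[s] for w in weighted[s]) for s in weighted}


# Hypothetical four-atom fragment grouped into two CG sites, "A" and "B".
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 2.0, 0.0)]
masses = [12.0, 12.0, 16.0, 16.0]
site_of_atom = ["A", "A", "B", "B"]

cg_sites = map_to_cg_sites(positions, masses, site_of_atom)
print(cg_sites)
```

In this toy case twelve atomic coordinates are reduced to six CG coordinates; the effective interactions between the sites still have to be specified separately.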
CG simulations, especially as they seek to make increasing contact with experimental results on complex systems, can therefore play a crucial role in the exploration of mesoscopic phenomena and, in turn, of the behavior of real biomolecular and materials systems.

Coarse-graining promises to provide a revolutionary advance for the scientific community, especially in the field of computer simulation. However, new challenges emerge when the CG approach is employed. These challenges are described in more detail below and in the chapters of this book. One of the challenges involves the establishment of a proper formal connection between the behavior of the CG representation of the system and the underlying all-atom (full atomic resolution) model. Additional challenges involve the degree of “believable” predictive power of CG models and their transferability between dissimilar systems.

The main current approaches to coarse-graining are represented in this book. These include highly “minimalist” CG models that are intended to reveal the essential physics of a given class of system. These models are usually very computationally efficient and qualitatively informative, but they do not necessarily provide quantitatively accurate predictions. Another approach is to develop CG models using experimental, thermodynamic, and/or average structural properties. This can be called the “inversion” approach to coarse-graining. Yet a third approach is to bridge atomistic information upward in scale to the CG level in a “multiscale” fashion. All of these approaches have their strengths and weaknesses, and they are certainly complementary to each other.

However, at some level coarse-graining must ultimately be understood within the context of statistical mechanics. This venerable and remarkable theoretical framework provides us with connections between the macroscopic world of thermodynamics and the atomistic world of molecular
interactions. In that vein, most CG methods are best cast within the context of the following formula:

\[
\begin{aligned}
\exp(-F / k_B T) &= (\text{const.}) \int d\mathbf{x} \, \exp[-V(\mathbf{x}) / k_B T] \\
&\approx (\text{const.}') \int d\mathbf{x}_{CG} \, \exp[-V_{CG}(\mathbf{x}_{CG}) / k_B T]
\end{aligned}
\tag{1.1}
\]

where in the first line $F$ is the (Helmholtz) free energy of the system, $V(\mathbf{x})$ is the system potential energy as a function of the coordinates $\mathbf{x}$ of all of the atoms of the system, $T$ is the thermodynamic temperature, $k_B$ is Boltzmann’s constant, and “const.” is a normalization constant (the prime in the second line denoting a different constant). Importantly, in the second line of Equation 1.1, the expression for the free energy is rewritten in terms of the CG variables $\mathbf{x}_{CG}$ and the CG effective potential $V_{CG}(\mathbf{x}_{CG})$. The CG variables, by virtue of the definition of coarse-graining, are fewer in number than the atomistic variables, such that the number of these variables satisfies $N_{x_{CG}} < N_x$.

It should be noted that Equation 1.1 is rarely solved directly. However, its underlying structure forms the basis for various distribution functions, equilibrium averages, properties, etc. Moreover, the equation clearly illustrates the principle of coarse-graining. It is well known that the evaluation of the integral in the first line of Equation 1.1 (and all integrals like it), using either MD or MC methods, is a great challenge for rugged multidimensional potential energy functions such as those for biomolecular systems. The promise of CG modeling is therefore to substantially reduce this computational challenge through a combination of fewer CG degrees of freedom and the likelihood that the CG effective potential $V_{CG}(\mathbf{x}_{CG})$ will be “smoother” than the full all-atom resolution one, $V(\mathbf{x})$. However, in this concept lie two of the main challenges in coarse-graining. First, one may not know beforehand the optimal choices of the CG sites, since one does not know the solution to Equation 1.1. Second, one does not know the CG effective potential, $V_{CG}(\mathbf{x}_{CG})$, so it must somehow be determined or modeled.
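As a rough numerical illustration of the second line of Equation 1.1, the sketch below integrates one coordinate out of a two-coordinate toy potential to obtain an effective potential for the remaining coordinate. Everything in it is an assumption made for illustration: the toy potential, the reduced units with $k_B T = 1$, the integration limits, and the function names. It is not a method taken from any chapter of this book.

```python
import math

KT = 1.0  # k_B * T in reduced units (assumed)


def potential(x, y):
    """Toy potential: a double well in x plus a spring in y whose stiffness depends on x."""
    return (x * x - 1.0) ** 2 + 0.5 * (1.0 + x * x) * y * y


def trapezoid(f, a, b, n):
    """Composite trapezoidal rule for f on [a, b] with n intervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


def v_cg(x, kt=KT, y_max=12.0, n=4000):
    """Effective CG potential for x: -kT * ln of the Boltzmann integral over y."""
    integral = trapezoid(lambda y: math.exp(-potential(x, y) / kt), -y_max, y_max, n)
    return -kt * math.log(integral)


def v_cg_exact(x, kt=KT):
    """Closed form for this toy model, from the Gaussian integral over y."""
    return (x * x - 1.0) ** 2 + 0.5 * kt * math.log((1.0 + x * x) / (2.0 * math.pi * kt))


for x in (0.0, 1.0, 2.0):
    print(x, v_cg(x), v_cg_exact(x))
```

In this toy model the effective potential differs from the bare double well by a term proportional to $k_B T$, so it changes with temperature even though `potential` itself does not.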
On this latter point, Equation 1.1 also reveals just how difficult this task may be, because the equation indicates that the CG effective “potential” should actually be a free energy surface (i.e., the so-called many-body potential of mean force) for the CG variables. This is because, in a formal sense, certain degrees of freedom have been integrated out in going from the first to the second line of Equation 1.1. As such, the effective CG potential must contain these “missing entropy” effects arising from the degrees of freedom that have been integrated over when transforming the equation to the CG variables. These entropic effects can be ill defined and hard to predict in their behavior, wherein lies the origin of one of the key challenges in coarse-graining.

At one level or another, most current coarse-graining schemes attempt to solve the problem embodied in Equation 1.1. Some methods may seek to only approximately satisfy this equation for a particular system and thermodynamic state, so that at the same time the CG model is transferable over a wider range of systems and conditions. Other coarse-graining methods, such as those being developed in my own research group, seek to provide a precise and systematic route to Equation 1.1 so that the approximation sign in the second line of the equation is as close to an equality as possible. This approach may, however, come at the expense of complete transferability of the CG model between disparate systems and thermodynamic conditions, so that additional formal methodology will need to be developed to enhance the model transferability.

Following in the spirit of the above discussion, the individual chapters of this book describe most of the important current developments in the field of CG simulation and modeling, with a focus on approaches that provide CG representations of complex systems such as liquids, polymers, lipid bilayers, peptides, proteins, nucleic acids, and protein complexes. Each chapter focuses on specific examples of evolving coarse-graining methodology and presents results for a variety of these complex systems. Each author was asked to carefully describe their own CG approach, its motivation, strengths, and weaknesses, and to give one or two important example applications. These individual contributions contain an excellent cross-section of much of the important work being undertaken at
the present time. For the reader the book represents the first time that most of the various current coarse-graining researchers have collated their work in such a fashion. Indeed, the field of coarse-graining is so new and so fluid at the present time that the format of the present book seems optimal, as it is difficult to imagine how a single-author book could capture the full diversity of this rapidly emerging field. For scientists interested in CG modeling, and also for those researchers interested in implementing such methods, the various chapters therefore provide a good overview of the current state of the art from a variety of different perspectives.

For example, Chapters 2 and 3 present two of the most successful current coarse-graining schemes. These two approaches are in fact quite different and complementary to one another. The work of Marrink and coworkers in Chapter 2 is a good example of the “inverse” approach to coarse-graining, wherein thermodynamic and other properties are used to parameterize CG force fields. Our own contribution in Chapter 3 presents the multiscale coarse-graining (MS-CG) approach, in which atomistic force information is utilized within a variational framework to systematically develop CG models from the “bottom up”. In this sense, the MS-CG method adheres closely and strictly to the concept of coarse-graining embodied in Equation 1.1, while the work of Marrink and coworkers is a looser interpretation of that equation. The latter approach, however, has the benefit of significant transferability between a variety of systems. Chapters 4 through 7 go on to present various other coarse-graining schemes, especially for lipid bilayers as a key example. Several of these schemes rely heavily on the so-called “reverse Monte Carlo” approach, and further develop it to help define the effective CG interactions based on an inversion of equilibrium structural (radial distribution function) data.
Chapters 8 through 11 discuss current CG model development for peptides and proteins at the amino acid level (i.e., amino acids in the primary sequence are coarse-grained into a single or a few CG sites). Here, these systems are very complex, so one can rightfully expect significant diversity in the coarse-graining approaches. There is presently no single “best way” to coarse-grain such systems, and there may never be. In addition, Chapters 12 and 13 describe special methods for “mixed-resolution” studies and for characterizing conformational transitions, respectively. At larger length scales, one typically utilizes more “aggressive” (lower-resolution) coarse-graining schemes. Here individual amino acids or base pairs in nucleic acids may not even be completely resolved at the CG level. Chapters 14 and 15 describe coarse-graining of nucleic acids (DNA) along these lines, while Chapters 16 through 21 provide various aggressive coarse-graining schemes for proteins, including elastic network models and normal-mode-based approaches.

The book concludes with Chapters 22 through 28, which present important coarse-graining (and multiscale) methods and applications in soft-matter materials science (polymers, surfactants, etc.) and in nanoscience. While there is a significant overlap in methodology with the earlier chapters, the materials science problems described in these chapters also present challenges and opportunities of their own for CG modeling. Chapter 28 concludes the book by describing an approach to the issue of “coarse-graining in time”, in which full atomic resolution is retained but the coarse-graining occurs in the dynamics so as to significantly extend the effective time scale of the simulation. The issue of time scale and realistic dynamics in CG modeling is clearly an important topic for the future.
Despite its promising future, coarse-graining faces a number of significant challenges before it can become widely utilized by the research community, especially by experimental researchers as a tool to help interpret their experiments. In order for such a broad degree of acceptance to occur, coarse-graining must become a systematic, fully predictive technique in molecular simulation. For example, at present it often seems there is a risk that such models could have bias built into them because one sometimes “knows” (or has an idea of) the answer one wants when building a CG model to study a particular system or class of systems. It is therefore absolutely essential that a clear set of standards be developed (albeit with an appropriate degree of latitude) so that one can fully trust the predictions of a CG model or simulation. Along these lines, it is critical that CG simulation researchers “push” their models and methods into unknown territory and not be afraid to report their failures along with the successes. We must also make our procedures, both their strengths and
weaknesses, clearly known to our audience, both in our written papers and in our oral presentations. Generating beautiful graphics and CG simulations of systems for which the end result is already largely known will not serve to advance the field and, in fact, could well undermine it. We can be rightfully optimistic, however, that this will not happen, but we should also be realistic that there are various impediments that must be surmounted.

In addition to providing a true predictive capability for CG modeling, there are also various immediate challenges faced by all CG methods. One essential challenge is the degree of transferability of CG models between various systems and from one set of thermodynamic conditions to another. In principle, a CG model cannot be completely transferable because it is a simplified (reduced degree of freedom) picture of a complex system and certain information has been effectively averaged away for those given conditions. On the other hand, many aspects of the CG model must certainly be transferable. A key goal then is both to define and to understand what is and what is not transferable in a given CG model and why. This is more than a technical issue. It is actually a very significant problem deeply rooted in the foundations of statistical mechanics, and a problem that has not yet been completely solved.

There is also the question of CG dynamics (i.e., time-dependent behavior), because CG models do not have the same dynamics as the real underlying atomistic MD. CG dynamics are often significantly faster than the real dynamics, and this is in fact a desirable feature of CG models if statistical sampling is their primary goal (i.e., the sampling is faster and probably more extensive). Some progress has been made on the CG dynamics problem. However, it also presents a paradox because if one were to develop a CG model with the correct (slower) dynamics, it would in turn undermine the efficient statistical sampling of the CG model.
Thus, such a dynamically correct CG model would need to be extremely efficient computationally in order to simultaneously achieve both objectives. This is clearly a challenge for the future.

Another important question to consider is whether coarse-graining will stand the test of time. As of this writing, it has become an explosively growing methodology in the field of molecular simulation. In addition to the fertile intellectual environment feeding this growth, the primary current driving force for coarse-graining is a desire among researchers to access the length and time scales in biomolecular and soft-matter systems that cannot be reached by present-day all-atom MD or MC methods. However, and this is an important point, one can certainly expect these all-atom simulation methods (especially biomolecular MD) to increase their power significantly for the foreseeable future, owing to new MD algorithms, increasing CPU speeds, and parallel execution on very large computing clusters. If this is the case, will CG modeling and simulation then become obsolete? While the relevant CG methods and problems studied by CG modeling will surely evolve with time, all facts suggest that the answer is clearly “no”. There are many orders of magnitude in length and time scale that must be bridged for molecular-inspired simulation to make contact with numerous real biological and materials phenomena. Moreover, and this is perhaps a key issue, it seems clear that coarse-graining will always remain a vital methodology for the interpretation of the behavior of complex systems, simply because the all-atom description is often “overkill” in that it contains too much detailed information, and hence a reduced CG picture offers great advantages as an interpretive tool. This aspect of coarse-graining will always be an important and valuable asset to many scientific researchers.
To sum up this introduction, it is thus very clear that coarse-graining is an exciting conceptual and algorithmic challenge in the field of computer simulation and statistical mechanics. It is an approach that is providing a great step forward in the molecular modeling and simulation of real, complex systems. This research effort continues to be very rewarding for all of the contributors to this book, and they will all certainly make an ongoing contribution to finding the solutions to the critical challenges facing CG modeling. Only time can tell if we have succeeded, but there is every reason to be optimistic about the future growth and impact of this revolutionary advance in molecular simulation methodology.
2 The MARTINI Force Field
Siewert J. Marrink, Marc Fuhrmans, H. Jelger Risselada, and Xavier Periole
Groningen Biomolecular Sciences and Biotechnology Institute and Zernike Institute for Advanced Materials, University of Groningen, The Netherlands
CONTENTS
2.1 Introduction
2.2 Method
2.2.1 Basic Parameterization
2.2.2 Reproducing Thermodynamic Data: Optimizing Nonbonded Parameters
2.2.3 Reproducing Structural Data: Optimizing Bonded Parameters
2.2.4 Coarse-Graining Recipe
2.2.5 Limitations
2.3 Applications
2.3.1 Vesicle Fusion
2.3.2 Domain Formation
2.3.3 Protein Aggregation
2.4 Outlook
Acknowledgments
References
2.1 INTRODUCTION
The use of coarse-grained (CG) models in a variety of simulation techniques has proven to be a valuable tool to probe the time and length scales of systems beyond what is feasible with traditional all-atom (AA) models. Applications to lipid systems in particular, pioneered by Smit et al.,1 have become widely used. A large diversity of coarse-graining approaches is available; they range from qualitative, solvent-free models, via more realistic models with explicit water, to models including chemical specificity (for recent reviews see Refs. 2–4). Models within this latter category are typically parameterized based on comparison to atomistic simulations, using inverted Monte Carlo schemes5–7 or force matching8 approaches. Our own model,9,10 coined the MARTINI force field, has also been developed in close connection with atomistic models; however, the philosophy of our coarse-graining approach is different. Instead of focusing on an accurate reproduction of structural details at a particular state point for a specific system, we aim for a broader range of applications without the need to reparameterize the model each time. We do so by extensive calibration of the chemical building blocks of the CG force field against thermodynamic data, in particular oil/water partitioning coefficients. This is similar in spirit to the recent development of the GROMOS force field.11 Processes such as lipid self-assembly, peptide membrane binding, and protein–protein recognition depend critically on the degree to which the constituents partition between polar and nonpolar environments. The use of a consistent strategy for the development of compatible CG and atomic-level force fields is of additional importance for its intended use in
multiscale applications.12 The overall aim of our coarse-graining approach is to provide a simple model that is computationally fast and easy to use, yet flexible enough to be applicable to a large range of biomolecular systems. Currently, the MARTINI force field provides parameters for a variety of biomolecules, including many different lipids, cholesterol, and all amino acids. A protocol for simulating peptides and proteins is also available. Extensive comparison of the performance of the MARTINI model with respect to a variety of experimental properties has revealed that the model performs generally quite well (“semiquantitatively”) for a broad range of systems and state points. Properties accurately reproduced include structural (e.g., liquid densities,9 area/lipid for many different lipid types,9 accessible lipid conformations,13 or the tilt angle of membrane spanning helices14), elastic (e.g., bilayer bending modulus,9 rupture tension10), dynamic (e.g., lipid lateral diffusion rates,9 water transmembrane (TM) permeation rate,9 time scales for lipid aggregation9,15), and thermodynamic (e.g., bilayer phase transition temperatures,16,17 propensity for interfacial versus TM peptide orientation,14 lipid desorption free energy10) data.

The remainder of this chapter is organized as follows. A detailed description of the CG methodology is presented in the next section, discussing both its abilities and its limitations. Subsequently, examples of three applications are given, namely the fusion of vesicles, the formation of membrane domains, and the aggregation of membrane proteins. A short look at the future prospects of the MARTINI force field concludes this chapter.
2.2 METHOD
2.2.1 BASIC PARAMETERIZATION
The mapping: The MARTINI model is based on a four-to-one mapping;10 that is, on average four heavy atoms are represented by a single interaction center, with an exception for ring-like molecules. To map the geometric specificity of small ring-like fragments or molecules (e.g., benzene, cholesterol, and several of the amino acids), the general four-to-one mapping rule is insufficient. Ring-like molecules are therefore mapped with higher resolution (up to two-to-one). The model considers four main types of interaction sites: polar (P), nonpolar (N), apolar (C), and charged (Q). Within a main type, subtypes are distinguished either by a letter denoting the hydrogen-bonding capabilities (d = donor, a = acceptor, da = both, 0 = none) or by a number indicating the degree of polarity (from 1 = low polarity to 5 = high polarity). The mapping of representative biomolecules is shown in Figure 2.1.

Nonbonded interactions: All particle pairs i and j at distance $r_{ij}$ interact via a Lennard–Jones (LJ) potential:

$$V_{\mathrm{LJ}} = 4\varepsilon_{ij}\left[\left(\frac{\sigma}{r_{ij}}\right)^{12} - \left(\frac{\sigma}{r_{ij}}\right)^{6}\right]. \tag{2.1}$$

The strength of the interaction, determined by the value of the well depth $\varepsilon_{ij}$, depends on the interacting particle types. The value of ε ranges from $\varepsilon_{ij}$ = 5.6 kJ/mol for interactions between strongly polar groups to $\varepsilon_{ij}$ = 2.0 kJ/mol for interactions between polar and apolar groups mimicking the hydrophobic effect. The effective size of the particles is governed by the LJ parameter σ = 0.47 nm for all normal particle types. For the special class of particles used for ring-like molecules, slightly reduced parameters are defined to model ring–ring interactions; σ = 0.43 nm, and $\varepsilon_{ij}$ is scaled to 75% of the standard value. The full interaction matrix can be found in the original publication.10 In addition to the LJ interaction, charged groups (type Q) bearing a charge q interact via a Coulombic energy function with a relative dielectric constant $\varepsilon_{\mathrm{rel}}$ = 15 for explicit screening:

$$V_{\mathrm{el}} = \frac{q_i q_j}{4\pi\varepsilon_0\varepsilon_{\mathrm{rel}} r_{ij}}. \tag{2.2}$$
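As an illustration of Equations 2.1 and 2.2, the following minimal Python sketch evaluates both pair energies with the parameter values quoted above. It is not taken from the MARTINI distribution: the function names are hypothetical, the Coulomb prefactor is the conventional value of 1/(4πε0) in kJ mol⁻¹ nm e⁻² used by MD codes, and the plain (unshifted) functional forms are assumed.

```python
# Hypothetical helper names; plain (unshifted) forms of Equations 2.1 and 2.2.

# Assumed prefactor 1/(4*pi*eps0) in kJ mol^-1 nm e^-2 (conventional MD unit value).
COULOMB_CONST = 138.935458


def lj_potential(r, epsilon, sigma=0.47):
    """Lennard-Jones energy in kJ/mol (Equation 2.1); r and sigma in nm."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 ** 2 - sr6)


def coulomb_potential(r, qi, qj, eps_rel=15.0):
    """Screened Coulomb energy in kJ/mol (Equation 2.2); charges in units of e."""
    return COULOMB_CONST * qi * qj / (eps_rel * r)


if __name__ == "__main__":
    r_min = 2 ** (1 / 6) * 0.47  # location of the LJ minimum for sigma = 0.47 nm
    print("LJ at minimum, strongly polar pair:", lj_potential(r_min, 5.6))
    print("Coulomb, +1/-1 pair at 1 nm:", coulomb_potential(1.0, 1.0, -1.0))
```

At the minimum of the LJ curve the energy equals minus the well depth, which is a convenient sanity check on any implementation of Equation 2.1.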
FIGURE 2.1 Mapping between the chemical structure and the coarse-grained model for DPPC, cholesterol, water, benzene, and a peptide fragment (with five amino acids highlighted). The coarse-grained bead types that determine their relative hydrophilicity are indicated, with more polar groups shown in lighter shades. The prefix “S” denotes a special class of CG sites used to model rings.
Note that the nonbonded potential energy functions are used in their shifted form. The nonbonded interactions are cut off at a distance $r_{\mathrm{cut}}$ = 1.2 nm. The LJ potential is shifted from $r_{\mathrm{shift}}$ = 0.9 nm to $r_{\mathrm{cut}}$. The electrostatic potential is shifted from $r_{\mathrm{shift}}$ = 0.0 nm to $r_{\mathrm{cut}}$. Shifting of the electrostatic potential in this manner mimics the effect of a distance-dependent screening.

Bonded interactions: Bonded interactions are described by the following set of potential energy functions:

$$V_b = \tfrac{1}{2} K_b \left(d_{ij} - d_b\right)^2, \tag{2.3}$$

$$V_a = \tfrac{1}{2} K_a \left[\cos(\varphi_{ijk}) - \cos(\varphi_a)\right]^2, \tag{2.4}$$

$$V_d = K_d \left[1 + \cos(\theta_{ijkl} - \theta_d)\right], \tag{2.5}$$

$$V_{id} = K_{id} \left(\theta_{ijkl} - \theta_{id}\right)^2, \tag{2.6}$$
acting between bonded sites i, j, k, l with equilibrium distance $d_b$, angle $\varphi_a$, and dihedral angles $\theta_d$ and $\theta_{id}$. The force constants K are generally weak, inducing flexibility of the molecule at the CG level resulting from the collective motions at the fine-grained level. The bonded potential $V_b$ is used for chemically bonded sites, and the angle potential $V_a$ to represent chain stiffness. Proper dihedrals $V_d$ are presently only used to impose secondary structure of the peptide backbone, and the improper dihedral angle potential $V_{id}$ is used to prevent out-of-plane distortions of planar groups. LJ interactions between nearest neighbors are excluded.

Implementation: The functional form of the CG force field was originally developed for convenient use with the GROMACS simulation software.15 Example input files for many systems can be downloaded from http://md.chem.rug.nl/~marrink/coarsegrain.html. The general form of the potential energy functions has allowed other groups to implement our CG model (with small modifications) also into other major simulation packages such as NAMD20 and GROMOS.13

Effective time scale: For reasons of computational efficiency, the mass of the CG beads is set to 72 amu (corresponding to four water molecules) for all beads, except for beads in ring structures, for which the mass is set to 45 amu. Using this setup, the systems described in this paper can be simulated
with an integration time step of 30–40 fs, which corresponds to an effective time of 120–160 fs. In the remainder of the paper, we will use an effective time rather than the actual simulation time unless specifically stated. The CG dynamics is faster than the AA dynamics because the CG interactions are much smoother than atomistic interactions: the effective friction caused by the fine-grained degrees of freedom is missing. Based on comparison of diffusion constants in the CG model and in atomistic models, the effective time sampled using CG interactions is 3- to 8-fold longer. When interpreting the simulation results with the CG model, a standard conversion factor of 4 is used, which is the effective speedup factor in the diffusion dynamics of CG water compared to real water. The same order of acceleration of the overall dynamics is also observed for a number of other processes, including the permeation rate of water across a membrane,9 the sampling of the local configurational space of a lipid,13 and the aggregation rate of lipids into bilayers9 or vesicles.15 However, the speedup factor might be quite different in other systems or for other processes. Particularly for protein systems, no extensive testing of the actual speedup due to the CG dynamics has been performed, although protein translational and rotational diffusion was found to be in good agreement with experimental data in simulations of CG rhodopsin.26 In general, however, the time scale of the simulations has to be interpreted with care.
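The conversion between simulation time and effective time described above amounts to a single multiplication. The short Python sketch below makes the bookkeeping explicit; the function names are hypothetical, and the default factor of 4 is the standard conversion factor quoted in the text, which may not hold for other systems or processes.

```python
import math

# Standard conversion factor quoted in the text; an assumption for any
# system other than those for which it was calibrated.
SPEEDUP = 4.0


def effective_time_fs(n_steps, dt_fs, speedup=SPEEDUP):
    """Effective time (fs) sampled by n_steps integration steps of length dt_fs."""
    return n_steps * dt_fs * speedup


def steps_for_effective_time(target_fs, dt_fs, speedup=SPEEDUP):
    """Number of integration steps needed to cover target_fs of effective time."""
    return math.ceil(target_fs / (dt_fs * speedup))


if __name__ == "__main__":
    # One step of 30 fs or 40 fs corresponds to 120 fs or 160 fs of effective time.
    print(effective_time_fs(1, 30.0), effective_time_fs(1, 40.0))
    # Steps required for one microsecond (1e9 fs) of effective time at dt = 40 fs.
    print(steps_for_effective_time(1e9, 40.0))
```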
2.2.2 REPRODUCING THERMODYNAMIC DATA: OPTIMIZING NONBONDED PARAMETERS
In order to parameterize the nonbonded interactions of the CG model, a systematic comparison to experimental thermodynamic data has been performed. Specifically, the free energy of hydration, the free energy of vaporization, and the partitioning free energies between water and a number of organic phases were calculated for each of the 18 different CG particle types. Concerning the free energies of hydration and vaporization, the CG model reproduces the correct trend.10 The actual values are systematically too high, however, implying that the CG condensed phase is not as stable with respect to the vapor phase as it should be. The same is true with respect to the solid phase. This is a known consequence of using a LJ 12-6 interaction potential, which has a limited fluid range. Switching to a different nonbonded interaction potential could, in principle, improve the relative stability of the fluid phase. As long as its applications are aimed at studying the condensed phase and not at reproducing gas/fluid or solid/fluid coexistence regions, the most important thermodynamic property is the partitioning free energy. Importantly, the water/oil partitioning behavior of a wide variety of compounds can be accurately reproduced with the current parameterization of the CG model. Table 2.1 shows results obtained for the partitioning between water and a range of organic phases of increasing polarity (hexadecane, chloroform, and octanol) for a selection of the 18 CG particle types. The free energy of partitioning between organic and aqueous phases, $\Delta G_{\mathrm{oil/aq}}$, was obtained from the equilibrium densities ρ of CG particles in both phases:

$$\Delta G_{\mathrm{oil/aq}} = kT \ln\left(\rho_{\mathrm{oil}}/\rho_{\mathrm{aq}}\right). \tag{2.7}$$
The equilibrium densities can be obtained directly from a long MD simulation of the two-phase system in which small amounts (around 0.01 mol fraction proved sufficient to be in the limit of infinite dilution) of the target substance are dissolved. With the CG model, simulations can easily be extended into the multi-microsecond range, enough to obtain statistically reliable results to within 1 kJ/mol for most particle types. As can be judged from Table 2.1, comparison to experimental data for small molecules containing four heavy atoms (the basic mapping of the CG model) reveals a close agreement to within 2 kT for almost all compounds and phases; indeed, agreement is within 1 kT for many of them. Expecting more accuracy of a CG model might be unrealistic. Note that the multiple nonbonded interaction levels allow for discrimination between chemically similar building blocks, such as saturated versus unsaturated alkanes or propanol versus butanol (which would be modeled as Nda) or ethanol (P2). A more extensive table including all particle types and many more building blocks can be found in the original publication.10
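Equation 2.7 can be evaluated directly once the equilibrium densities of the solute in the two phases are known. The Python sketch below is illustrative only: the function name is hypothetical, the density inputs stand in for values that would be measured from a two-phase simulation, and kT is expressed per mole using the molar gas constant so that the result is in kJ/mol.

```python
import math

# Molar gas constant in kJ mol^-1 K^-1, so that "kT" is a molar energy.
R_KJ_PER_MOL_K = 0.0083144626


def partition_free_energy(rho_oil, rho_aq, temperature=300.0):
    """Partitioning free energy in kJ/mol from Equation 2.7.

    rho_oil and rho_aq are the equilibrium solute densities in the organic
    and aqueous phases (same units; only their ratio matters).
    """
    return R_KJ_PER_MOL_K * temperature * math.log(rho_oil / rho_aq)


if __name__ == "__main__":
    kt = R_KJ_PER_MOL_K * 300.0
    print("kT at 300 K (kJ/mol):", kt)
    # Placeholder densities: a solute 1000 times more concentrated in the oil phase.
    print("dG for a density ratio of 1000:", partition_free_energy(1000.0, 1.0))
```

With the sign convention of Equation 2.7, a solute that prefers the organic phase gives a positive value, in line with the positive entries for the apolar C-type particles in Table 2.1.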
TABLE 2.1 Oil, Chloroform, and Octanol/Water Partitioning Free Energies for a Selection of the 18 CG Particle Types, Compared to Experimental Values of the Corresponding Chemical Building Blocks

                          Hexadecane/Water    Chloroform/Water    Octanol/Water
Building Block    Type    CG       Exp        CG       Exp        CG       Exp
Acetamide         P5      −28      −27        −18      −20        −10      −8
Water             P4      −23      −25        −14      −          −9       −8
Propanol          P1      −11      −10        −2       −2         −1       0
Propylamine       Nd      −7       −6         0        1          3        3
Methylformate     Na      −7       −6         0        4          3        0
Methoxyethane     N0      −2       1          6        −          5        3
Butadiene         C4      9        11         13       −          9        11
Chloropropane     C3      13       12         13       −          14       12
Butane            C1      18       18         18       −          17       16
The experimental data are compiled from various sources (see Ref. 10); the simulation data are obtained using Equation 2.7. All values are expressed in kilojoules per mole and obtained at T = 300 K.
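The level of agreement in Table 2.1 can be expressed in units of kT at 300 K. The Python sketch below does this for the hexadecane/water columns only, using the CG and experimental values as listed in the table; the dictionary layout and function name are hypothetical choices made for this illustration.

```python
# kT per mole at 300 K in kJ/mol (molar gas constant times temperature).
KT_300 = 0.0083144626 * 300.0

# Hexadecane/water partitioning free energies (kJ/mol) from Table 2.1,
# stored as particle type -> (CG value, experimental value).
HEXADECANE_WATER = {
    "P5": (-28, -27),
    "P4": (-23, -25),
    "P1": (-11, -10),
    "Nd": (-7, -6),
    "Na": (-7, -6),
    "N0": (-2, 1),
    "C4": (9, 11),
    "C3": (13, 12),
    "C1": (18, 18),
}


def deviations_in_kt(data, kt=KT_300):
    """Absolute CG-versus-experiment deviation for each particle type, in units of kT."""
    return {ptype: abs(cg - exp) / kt for ptype, (cg, exp) in data.items()}


if __name__ == "__main__":
    for ptype, dev in deviations_in_kt(HEXADECANE_WATER).items():
        print(f"{ptype}: {dev:.2f} kT")
```

For this column every deviation stays below 2 kT, with the largest one found for the N0 type (3 kJ/mol).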
To select particle types for the amino acids, systematic comparison to experimental partitioning free energies is also used. Table 2.2 shows the resulting assignment of the amino acid side chains and the associated partitioning free energies. The simulation data are calculated from equilibrium densities of low concentrations of CG beads dissolved in a water/butane two-phase system, using Equation 2.7. The experimental data refer to partitioning of side-chain analogues between water and cyclohexane.18 Both the simulation and the experimental data are obtained at 300 K. Where available, the experimental values are reproduced to within 2 kT, a level of accuracy that is difficult to obtain even with atomistic models. Most amino acids are mapped onto single standard particle types, similarly to the recent work of other groups.19,20 Figure 2.1 shows the mapping of a few of them. The apolar amino acids (Leu, Pro, Ile, Val, Cys, and Met) are represented as C-type particles, the polar uncharged amino acids (Thr, Ser, Asn, Gln) by the class of P-type particles, and the small negatively charged side chains (Glu, Asp) as Q type. The positively charged amino acids (Arg, Lys) are modeled by a combination of a Q-type and an N- or C-type particle. The bulkier ring-based side chains are modeled by three (His, Phe, Tyr) or four (Trp) beads of the special class of ring particles. The Gly and Ala residues are only represented by the backbone particle. The type of the backbone particle depends on its secondary structure; when free in solution or in a coil or bend, the backbone has a strong polar character (P type), while as part of a helix or beta strand the inter-backbone hydrogen bonds reduce the polar character significantly (N type). Details of the parameterization of the amino acids can be found elsewhere.14
2.2.3 REPRODUCING STRUCTURAL DATA: OPTIMIZING BONDED PARAMETERS
To parameterize the bonded interactions, we use structural data that are either directly derived from the underlying atomistic structure (such as bond lengths of rigid structures) or obtained from comparison to fine-grained simulations. In the latter procedure, the fine-grained simulations are first converted into a “mapped” CG (MCG) simulation by identifying the center of mass of the corresponding atoms as the MCG bead. Second, the distribution functions are calculated for the mapped simulation and compared to those obtained from a true CG simulation. Subsequently the CG parameters are systematically changed until satisfactory overlap of the distribution functions is
TABLE 2.2 Free Energy Based Mapping of the Amino Acids

                        Oil/Water
Side Chain    Type      CG       Exp
Leu           C1        22       22
Ile           C1        22       22
Val           C2        20       17
Pro           C2        20       −
Met           C5        9        10
Cys           C5        9        5
Ser           P1        −11      −14
Thr           P1        −11      −11
Asn           P5
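… κ(α), i ≤ κ(α), $\phi_\alpha \in \theta$,

$$\frac{\partial r_i}{\partial \phi_\alpha} = \frac{\partial r_i}{\partial \varphi_{\kappa(\alpha)}} = \begin{cases} e_{\kappa(\alpha)} \times r_{\alpha i}, & i > \kappa(\alpha) \\ 0, & i \le \kappa(\alpha) \end{cases} \qquad \phi_\alpha \in \varphi, \tag{17.5}$$

where $r_{\alpha i} = r_i - r_{\kappa(\alpha)}$, and $h_{ij}$ is the submatrix for the pair i, j in the Hessian matrix in CC. Once the generalized eigenvalue problem is solved for the Hessian matrix, we can convert the eigenvectors in IC to orthonormal vectors in CC by

$$\Delta r_i^{(k)} = \sum_{\alpha} \frac{\partial r_i}{\partial \phi_\alpha}\,\Delta\phi_\alpha^{(k)}, \qquad \sum_{i} \Delta r_i^{(k)} \cdot \Delta r_i^{(k')} = \delta_{k,k'}, \tag{17.6}$$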
where $\Delta r_i^{(k)}$ is the eigenvector component of the kth mode for the ith Cα atom in CC, $\Delta\phi_\alpha^{(k)}$ is the eigenvector component of the kth mode in IC, the first summation runs over all α, and δ is the Kronecker delta. Finally, to measure the “tip effect,” we define a quantitative localization factor, T,
$$T = \sum_{i} \left( \frac{\left|\Delta r_{i+1} - \Delta r_i\right|}{\left|r_{i+1} - r_i\right|} \right)^{3}, \tag{17.7}$$
where the larger the T, the more prominent the “tip effect.” To generalize this method to proteins with multiple chains, we create a virtual bond connecting the last Cα atom of the preceding chain to the first Cα atom of the following chain. This introduces six more degrees of freedom for each additional chain. Five of these degrees of freedom are internal and the sixth one is the virtual bond length, l, which is the only bond length that is flexible. We redefine $\phi_\alpha$ to contain these new degrees of freedom as

$$\phi_\alpha = \{\theta_2, \varphi_3, \theta_3, \varphi_4, \theta_4, \ldots, \varphi_{N_1}, \theta_{N_1}, l_{N_1+1}, \varphi_{N_1+1}, \theta_{N_1+1}, \varphi_{N_1+2}, \ldots\},$$

where $N_1$ is the number of Cα atoms in the first chain. For the Hessian matrix construction, the virtual bond is handled by

$$\frac{\partial r_i}{\partial l_{\kappa(\alpha)}} = e_{\kappa(\alpha)}, \qquad i > \kappa(\alpha). \tag{17.8}$$
Additionally, the index order has to be changed accordingly to account for the extra degrees of freedom.
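The localization factor of Equation 17.7 is straightforward to evaluate for a single mode once the Cα positions and the mode displacements are available. The Python sketch below is illustrative and rests on one assumption that should be stated plainly: the numerator and denominator of Equation 17.7 are read as vector norms, since a ratio of vectors is not otherwise defined. The function name and the toy chain are hypothetical.

```python
import math


def tip_localization(coords, mode):
    """Localization factor T of Equation 17.7 for one mode.

    coords: list of (x, y, z) positions of consecutive C-alpha atoms.
    mode:   list of (dx, dy, dz) displacements of the same atoms in the mode.
    Assumption: the ratio in Equation 17.7 is taken between vector norms.
    """
    total = 0.0
    for i in range(len(coords) - 1):
        d_mode = math.dist(mode[i + 1], mode[i])     # |dr_{i+1} - dr_i|
        d_pos = math.dist(coords[i + 1], coords[i])  # |r_{i+1} - r_i|
        total += (d_mode / d_pos) ** 3
    return total


if __name__ == "__main__":
    # Toy straight chain with 3.8 spacing between consecutive atoms.
    chain = [(3.8 * i, 0.0, 0.0) for i in range(5)]
    rigid_shift = [(0.0, 1.0, 0.0)] * 5                 # every atom moves together
    tip_only = [(0.0, 0.0, 0.0)] * 4 + [(0.0, 3.8, 0.0)]  # only the last atom moves
    print(tip_localization(chain, rigid_shift))
    print(tip_localization(chain, tip_only))
```

A collective displacement in which neighboring atoms move together contributes nothing to T, whereas a mode concentrated on a chain terminus gives a large value, which is the behavior the factor is meant to detect.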
17.2.2 X-RAY CRYSTALLOGRAPHIC REFINEMENT OF ANISOTROPIC THERMAL PARAMETERS USING NORMAL MODES

In X-ray crystallography, the diffraction pattern of a structure can be calculated by

$$F_{\mathrm{cal}}(q) = \sum_{j} f_j(q)\,\exp\!\left(i\,q^{T}\langle r_j\rangle\right)\exp\!\left(-\tfrac{1}{2}\left\langle\left(q^{T}\Delta r_j\right)^{2}\right\rangle\right), \qquad q = 2\pi\Theta^{T}h, \tag{17.9}$$

where $F_{\mathrm{cal}}(q)$ is the calculated structure factor, $\Theta = (a^*, b^*, c^*)^{T}$ is a 3 × 3 matrix that converts CC into fractional coordinates with $a^*$, $b^*$, and $c^*$ being the reciprocal unit cell vectors of the crystal, h is the Miller index for a lattice point in reciprocal space, $f_j$ is the scattering factor for atom j, and $r_j$ is the position for atom j. The second exponential is referred to as the Debye–Waller factor, D(q), and represents the thermal fluctuations in the position of the atom. This term can be rewritten and simplified as

$$D(q) = \exp\!\left(-\tfrac{1}{2}\left\langle\left(q^{T}\Delta r_j\right)^{2}\right\rangle\right) = \exp\!\left(-\tfrac{1}{2}\,q^{T}U_j\,q\right), \tag{17.10}$$
where $U_j$, the temperature factor, is a 3 × 3 symmetric matrix representing the mean square displacements for atom j. For full anisotropic refinement, the six independent parameters of $U_j$ are the thermal parameters. This matrix is positive definite and can be visualized as an ellipsoid in real space. In the isotropic limit, $U_j$ is a diagonal matrix where the three diagonal terms are identical, which reduces the number of thermal parameters to one. This special case for $U_j$ can be visualized as a sphere. Since a set of normal modes is an equivalent basis set for the system, we can write the displacement of atom j from its equilibrium position as a function of M modes by
$$\Delta r_j = E_j\,\sigma, \tag{17.11}$$
where Ej is a 3 × M matrix containing the components of the eigenvectors for atom j, and σ is a vector containing the weights that define the contributions of each eigenvector. Combining Equation 17.10 and Equation 17.11, we can express the Debye–Waller factor as a function of normal modes by
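$$D(\mathbf{q}) = \exp\left(-\tfrac{1}{2}\,\mathbf{q}^{T}\mathbf{U}_{nm}\,\mathbf{q}\right) = \exp\left(-\tfrac{1}{2}\,\mathbf{q}^{T}\mathbf{E}_j \langle \boldsymbol{\sigma}\boldsymbol{\sigma}^{T} \rangle \mathbf{E}_j^{T}\,\mathbf{q}\right), \qquad \boldsymbol{\Pi} \equiv \langle \boldsymbol{\sigma}\boldsymbol{\sigma}^{T} \rangle. \qquad (17.12)$$

In conventional crystallographic refinement, each atom in the structure has independent thermal parameters. However, as shown in Equation 17.12, with normal modes the thermal parameters are common across the entire structure and reduce to the variances and covariances in the M × M matrix, Π. To ensure that Uj remains positive-definite, Π is expressed in terms of a lower triangular matrix, Ω, such that

$$\boldsymbol{\Pi} = \boldsymbol{\Omega}\boldsymbol{\Omega}^{T}. \qquad (17.13)$$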
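The factorization in Equation 17.13 can be illustrated with a minimal Python/NumPy sketch. This is not the authors' refinement code: the mode count M = 4, the random parameter values, and the matrix E_j are illustrative assumptions. The sketch fills the lower triangle of Ω from a flat parameter vector, forms Π = ΩΩ^T, and checks that the resulting 3 × 3 matrix E_j Π E_j^T is symmetric with no negative eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(0)

M = 4  # number of normal modes (illustrative value, not from the chapter)

# Free parameters: the entries on and below the diagonal of Omega.
n_params = M * (M + 1) // 2
params = rng.normal(size=n_params)  # stand-ins for refined values

# Fill the lower triangle of Omega from the flat parameter vector.
Omega = np.zeros((M, M))
Omega[np.tril_indices(M)] = params

# Pi = Omega Omega^T is symmetric and cannot have negative eigenvalues.
Pi = Omega @ Omega.T

# Hypothetical 3 x M block of eigenvector components for one atom j.
E_j = rng.normal(size=(3, M))

# Normal-mode temperature factor for atom j (Equation 17.12).
U_j = E_j @ Pi @ E_j.T

eigvals = np.linalg.eigvalsh(U_j)
print("number of free parameters:", n_params)
print("eigenvalues of U_j:", eigvals)
```

Because the optimizer only ever touches the entries of Ω, any parameter values it proposes still yield a valid Π; no explicit constraint on Π is needed.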
Therefore, the number of thermal parameters for normal-mode-based refinement is M(M + 1)/2, which is the number of nonzero terms in Ω. These thermal parameters from normal modes are optimized according to a least-squares method by minimizing the function

$$\sum_{\mathbf{h}} w(\mathbf{h})\left(\left|F_{\mathrm{obs}}(\mathbf{h})\right| - \left|F_{\mathrm{cal}}(\mathbf{h})\right|\right)^{2}, \qquad (17.14)$$
where Fobs(h) is the diffraction data measured from experiment. Since only the magnitudes are measured, the phases cannot be used during the minimization process. Lastly, because the modified eNMA method only calculates the eigenvectors for the Cα atoms, we extrapolate the normal modes to all the atoms by assuming that all the atoms in a residue move in the same direction as its Cα atom.

While NMA is a powerful method that can describe the intrinsic motions of a structure, the external motions must be characterized as well for crystallographic refinement to be successful. Fortunately, the rigid body motion of the entire structure can be described by the Translation, Libration, and Screw (TLS) method [29]. Implemented in REFMAC5 [30] of the CCP4 suite of crystallographic software [31], the TLS method models the motion of a rigid body with three 3 × 3 tensors that describe the translation, libration, and screw motions, respectively. One final source of anisotropy comes from the crystal and not the atomic positions. We account for this by adding an overall anisotropic temperature factor. If we assume that the sources of fluctuations are independent of each other, we can construct the final Uj for each atom as

$$\mathbf{U}_j = \mathbf{U}_{nm} + s_{tls}\,\mathbf{U}_{tls} + \mathbf{U}_{overall}, \qquad (17.15)$$
where Unm is from Equation 17.12, Utls is from REFMAC5, stls is a scaling factor, and Uoverall is the overall anisotropic temperature factor. The scaling factor is included because the TLS parameters are determined by an external program and are independent of the minimization of the other parameters in Unm and Uoverall. With the theory in place to use normal modes to replace all the temperature factors of the protein atoms, we follow the standard procedure for model building where the temperature factors and atomic positions are updated iteratively. To track the progress of the refinement, the R factor
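$$R = \frac{\sum_{\mathbf{h}} \Bigl|\, \left|F_{\mathrm{obs}}(\mathbf{h})\right| - k\left|F_{\mathrm{cal}}(\mathbf{h})\right| \Bigr|}{\sum_{\mathbf{h}} \left|F_{\mathrm{obs}}(\mathbf{h})\right|}, \qquad k = \frac{\sum_{\mathbf{h}} \left|F_{\mathrm{obs}}(\mathbf{h})\right| \left|F_{\mathrm{cal}}(\mathbf{h})\right|}{\sum_{\mathbf{h}} \left|F_{\mathrm{cal}}(\mathbf{h})\right|^{2}} \qquad (17.16)$$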
is used. For validation purposes, a small percentage of the diffraction data, usually 5−10%, is set aside as the test set, while the rest of the data, the working set, is used for refinement. The R factor calculated from the working set is Rcryst and the R factor calculated from the test set is Rfree.
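As a hedged illustration of Equation 17.16 and the working/test split, the sketch below computes the scale factor k and the R factor for synthetic amplitudes. The function name `r_factor`, the synthetic data, the 5% noise level, and the choice to recompute k separately for each subset are all assumptions made for this example; they are not taken from REFMAC5 or from the study.

```python
import numpy as np

def r_factor(f_obs, f_cal):
    """Return (R, k) following Equation 17.16 for two sets of amplitudes."""
    f_obs = np.abs(np.asarray(f_obs))
    f_cal = np.abs(np.asarray(f_cal))
    # Least-squares scale factor between observed and calculated amplitudes.
    k = np.sum(f_obs * f_cal) / np.sum(f_cal ** 2)
    r = np.sum(np.abs(f_obs - k * f_cal)) / np.sum(f_obs)
    return r, k

# Synthetic amplitudes (illustrative only): observed = 2 x calculated + 5% noise.
rng = np.random.default_rng(1)
n_reflections = 1000
f_cal = rng.uniform(10.0, 100.0, size=n_reflections)
f_obs = 2.0 * f_cal * (1.0 + 0.05 * rng.normal(size=n_reflections))

# Set aside roughly 5% of the reflections as the test set.
test_mask = 0.05 > rng.random(n_reflections)

r_cryst, k_work = r_factor(f_obs[~test_mask], f_cal[~test_mask])
r_free, k_test = r_factor(f_obs[test_mask], f_cal[test_mask])
print(f"R_cryst = {r_cryst:.3f}, R_free = {r_free:.3f}")
```

In this toy setting the two values are similar because the model was never fitted to the working set; in a real refinement, a gap between Rcryst and Rfree indicates how much the model has been fitted to noise in the working set.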
17.3 RESULTS
17.3.1 MODIFIED ENMA IN INTERNAL COORDINATES WITHOUT TIP EFFECT
To verify that the modified eNMA method reproduces the subspace of the low-frequency eigenvectors with no contamination from the tip effect, we compared the eigenvectors from the new method to those from conventional eNMA and from CHARMM for a variety of systems [14], one of which was a multichain supramolecular complex, the molecular chaperonin GroEL [32]. The structure is composed of 14 monomers, each with 525 residues, organized into two stacked heptameric rings. The chaperonin utilizes ATP to help other proteins fold correctly. Its structure has been studied extensively and is known to undergo large conformational changes to open and close the chamber in which the folding occurs [33–36]. Without coarse-graining, the Hessian matrix for a system the size of GroEL could not be calculated because of the sheer number of atoms. With the conventional eNMA [8], the “tip effect” is quite severe, as the T values for the first 500 modes show in Figure 17.2b (solid squares). When the eigenvectors are calculated by the modified eNMA [14], the tip effect is dramatically reduced (empty circles). The motional patterns of the modes were also verified by comparing the low-frequency modes with those previously observed [36]. Figure 17.2a shows that the collective motion of the second mode is a stretching motion along the diagonals of the complex.
17.3.2 REFINEMENT OF X-RAY ANISOTROPIC THERMAL PARAMETERS USING NORMAL MODES
In a previous study [28], we showed that the normal-mode-based refinement protocol successfully improves an isotropically refined X-ray model. The target system was a 3.42 Å structure of mammalian formiminotransferase cyclodeaminase (FTCD) [37]. Biologically, this protein is involved in linking histidine catabolism and folate metabolism [38] and in integrating the Golgi complex with the vimentin intermediate filament cytoskeleton [39–41], and it is implicated in autoimmune hepatitis [42] and glutamate formiminotransferase deficiency [43]. The protein’s structure is similar to GroEL in that there are two stacked rings, but FTCD has eight monomers in two tetrameric rings. Each monomer is composed of two domains, the FT domain and the CD domain. The FT domain is further divided into the N subdomain and the C subdomain. Figure 17.3 shows the structures of FTCD in full complex and in various components. This 0.5 million Dalton (over 16,000 atoms) enzyme is also large enough that coarse-graining is required for the normal modes to be calculated on contemporary computers. The normal mode calculation was performed on the biologically relevant molecule, the full octamer, and only the portions of the eigenvectors corresponding to the structure in the asymmetric unit of the crystal, two subunits from two octamers, were kept.
FIGURE 17.2 Results on a multisubunit supramolecular complex, the molecular chaperonin GroEL. (a) Motional pattern of the second vibrational mode, which is a stretching mode along the diagonal line of the molecule. (b) Tip effect; the solid squares are for conventional eNMA, and the empty circles are for the new eNMA. Note that the vertical axis is on a logarithmic scale. (This figure is adapted from Figure 6 on page 469 in Lu, M., Poon, B., and Ma, J. J. Chem. Theor. Comp., 2, 464, 2006.)
FIGURE 17.3 (See color insert following page 238.) Structure and thermal ellipsoids of FTCD. (a) The square doughnut structure of an FTCD octamer. Two subunits are shown in red and blue, respectively. (b) The subunit structure of ligand-free FTCD. Backbone trace color ramped from the N-terminus to the C-terminus. (c) Superposition of the FT domain of human ligand-free FTCD (red) with the structure of the same domain in isolation (cyan) with the product analog, folinic acid (CPK mode), bound in the groove. (d) Rainbow-colored isotropic B-factor in the original model. The hotter the color, the larger the B-factors. The high flexibility of the N-subdomain, the linker region, and the lower half of the CD domain are evident. (This figure is adapted from Figure 1 on page 7870 in Poon, B. K., Chen, X., Lu, M., Vyas, N. K., Quiocho, F. A., Wang, Q., and Ma, J. Proc. Natl. Acad. Sci. U.S.A., 104, 7869, 2007.)
Due to the poorly diffracting crystal and size of the structure, it was very difficult to even build the original isotropic model. Only the Cα trace was deposited into the Protein Data Bank (PDB code, 1TT9). However, we were able to obtain the final all-atom, isotropic structure and apply our refinement method. At the start, the Rcryst and Rfree of the original structure refined in
CNS [44] were 24.6 and 28.8%, respectively [37]. After recalculating the initial values using REFMAC5, the Rcryst and Rfree became 23.5 and 28.7%, respectively. After several rounds of iteratively updating the normal-mode-based temperature factors and atomic coordinates, Rcryst and Rfree converged to 24.0 and 24.9%, respectively. This is a significant improvement because the Rfree is a more accurate measure of the quality of the model. For refinement, the first 50 modes were used, resulting in 50 × 51/2 = 1275 normal mode parameters. Compared with over 16,000 thermal parameters for the original isotropic model, there was an order of magnitude decrease in the number of thermal parameters while improving the model and providing an anisotropic description of the thermal fluctuations.

In addition to quantitatively improving the model through the R factors, our method also improves the electron density map, which gives crystallographers a more accurate picture of the structure. Figure 17.4a shows plots of the root mean square deviation (rmsd) of the main-chain atoms of one subunit between the original isotropic model and the final anisotropic model. The other three subunits in the asymmetric unit show the same trend. The peaks signify regions where the biggest changes were made to the original structure. The first spike occurs around residue 14. As shown in Figure 17.5a, this spike corresponds to a major shift in the main-chain coordinates. The 2Fo−Fc omit map for the isotropic model is fragmented, which can make the correct tracing of the backbone unclear. However, after performing normal-mode-based refinement on the structure, the same type of map disambiguates the placement of the main chain and the side chains. The second spike also corresponds to a shift in the main-chain atoms, but is less severe. In both the isotropic and anisotropic models, Figure 17.5b shows that the electron density is clear enough for atoms to be placed with confidence. The spike is a result of centering the atoms within the electron density.
FIGURE 17.4 (a) Structural shifts of the normal-mode-refined new model with respect to the original model. The rmsd (Å) along the chain of a single subunit is shown for the main chains. Three large spikes are evident in both graphs. (This figure is adapted from Figure 3a on page 7871 in Poon, B. K., Chen, X., Lu, M., Vyas, N. K., Quiocho, F. A., Wang, Q., and Ma, J. Proc. Natl. Acad. Sci. U.S.A., 104, 7869, 2007.) (b) Anisotropically refined thermal ellipsoids for a single subunit of FTCD, in the same view as in Figure 17.3d. It is evident that the N-terminal subdomain of the FT domain and the lower half of the CD domain are highly flexible. The results for other subunits are very similar due to the symmetry constraint. (This figure is adapted from Figure 5a on page 7873 in Poon, B. K., Chen, X., Lu, M., Vyas, N. K., Quiocho, F. A., Wang, Q., and Ma, J. Proc. Natl. Acad. Sci. U.S.A., 104, 7869, 2007.)
FIGURE 17.5 Examples of large structural adjustments in normal-mode refinement. The top panels are for the original model while the bottom panels are for the new normal-mode model. (a) and (a’) Region Glu13-Asn15 superimposed with omit 2Fo−Fc map contoured at 1.5σ. (b) and (b’) Region Glu147-Pro150 superimposed with omit 2Fo−Fc map contoured at 1.0σ. In both panels, the original model (uniform in color) and the new model (gray-scaled for chemical groups) are superimposed to highlight the structural shifts. (c) and (c’) Region Pro426-Lys427 superimposed with omit 2Fo−Fc map contoured at 1.0σ. (d) and (d’) Residue SeMet132 superimposed with omit 2Fo−Fc map contoured at 1.5σ. (This figure is adapted from Figure 4 on page 7872 in Poon, B. K., Chen, X., Lu, M., Vyas, N. K., Quiocho, F. A., Wang, Q., and Ma, J. Proc. Natl. Acad. Sci. U.S.A., 104, 7869, 2007.)
Lastly, the third spike represents a rotation of the side chain for residue 427, as shown in Figure 17.5c. Again, in both the isotropic and anisotropic models, the electron density allowed for the placement of atoms. However, after normal-mode-based refinement, the electron density map changed, which allowed adjustments to be made that lowered the R factors. While the three spikes showed large changes to the model, many of the improvements were smaller, and the sum total of these improvements allowed us to reach our final model. An example of a smaller but more common improvement is shown in Figure 17.5d. In this case, the electron density map of the original model did not show the positions of the atoms at the tip of the side chain, but after normal-mode-based refinement, the density became visible and allowed for correct placement. Overall, there were about 55 residues for each subunit where the improved electron density map allowed for more confident placement of the atoms.

In addition to the structure shift, Figure 17.4b shows the Cα trace and thermal ellipsoids of one subunit of the final model. It is clear that the distribution of the magnitudes of the ellipsoids is comparable with the original isotropic model (Figure 17.3d). Furthermore, the direction of motion shown by the thermal ellipsoids correlates well with the ligand-induced cleft-closing motion (Figure 17.3c). This figure is an example of how powerful normal-mode-based refinement can be. Traditionally, X-ray crystallography is often viewed as providing a snapshot, frozen in time, of the molecule of interest. However, as we have shown, diffraction data contain information about the dynamics of the protein, and only with anisotropic models can this information be elucidated.
ACKNOWLEDGMENTS
The author acknowledges support from an NIH grant (GM067801) and a grant from the Welch Foundation.
REFERENCES
1. Brooks, III, C. L., Karplus, M., and Pettitt, B. M. 1988. Proteins: A theoretical perspective of dynamics, structure, and thermodynamics. Adv. Chem. Phys. 71:1.
2. McCammon, J. A., and Harvey, S. 1987. Dynamics of Proteins and Nucleic Acids. Cambridge: Cambridge University Press.
3. Brooks, B. R., Janezic, D., and Karplus, M. 1995. Harmonic analysis of large systems. I. Methodology. J. Comput. Chem. 16:1522.
4. Levitt, M., Sander, C., and Stern, P. S. 1985. Protein normal-mode dynamics: Trypsin inhibitor, crambin, ribonuclease and lysozyme. J. Mol. Biol. 181:423.
5. Ma, J. 2004. New advances in normal mode analysis of supermolecular complexes and applications to structural refinement. Curr. Protein Pept. Sci. 5:119.
6. Ma, J. 2005. Usefulness and limitations of normal mode analysis in modeling dynamics of biomolecular complexes. Structure 13:373.
7. Tirion, M. M. 1996. Large amplitude elastic motions in proteins from a single-parameter, atomic analysis. Phys. Rev. Lett. 77:1905.
8. Atilgan, A. R., Durell, S. R., Jernigan, R. L., Demirel, M. C., Keskin, O., and Bahar, I. 2001. Anisotropy of fluctuation dynamics of proteins with an elastic network model. Biophys. J. 80:505.
9. Doruker, P., Jernigan, R. L., and Bahar, I. 2002. Dynamics of large proteins through hierarchical levels of coarse-grained structures. J. Comput. Chem. 23:119.
10. Doruker, P., and Jernigan, R. L. 2003. Functional motions can be extracted from on-lattice construction of protein structures. Proteins 53:174.
11. Ming, D., Kong, Y., Lambert, M., Huang, Z., and Ma, J. 2002. How to describe protein motion without amino-acid sequence and atomic coordinates. Proc. Natl. Acad. Sci. U.S.A. 99:8620.
12. Tama, F., Wriggers, W., and Brooks, C. L. 2002. Exploring global distortions of biological macromolecules and assemblies from low-resolution structural information and elastic network theory. J. Mol. Biol. 321:297.
13. Lu, M., and Ma, J. 2005. The role of shape in determining molecular motions. Biophys. J. 89:2395.
14. Lu, M., Poon, B., and Ma, J. 2006. A new method for coarse-grained elastic normal-mode analysis. J. Chem. Theor. Comp. 2:464.
15. Kamiya, K., Sugawara, Y., and Umeyama, H. 2003. Algorithm for normal mode analysis with general internal coordinates. J. Comput. Chem. 24:826.
16. Diamond, R. 1990. On the use of normal modes in thermal parameters refinement: Theory and application to the bovine pancreatic trypsin inhibitor. Acta Crystallogr. A 46:425.
17. Kidera, A., and Go, N. 1990. Refinement of protein dynamic structure: Normal mode refinement. Proc. Natl. Acad. Sci. U.S.A. 87:3718.
18. Kidera, A., and Go, N. 1992. Normal mode refinement: Crystallographic refinement of protein dynamic structure. I. Theory and test by simulated diffraction data. J. Mol. Biol. 225:457.
19. Kidera, A., Inaka, K., Matsushima, M., and Go, N. 1992. Normal mode refinement: Crystallographic refinement of protein dynamic structure. II. Application to human lysozyme. J. Mol. Biol. 225:477.
20. Kidera, A., Inaka, K., Matsushima, M., and Go, N. 1992. Normal mode refinement: Crystallographic refinement of protein dynamic structure applied to human lysozyme. Biopolymers 32:315.
21. Kidera, A., Matsushima, M., and Go, N. 1994. Dynamic structure of human lysozyme derived from X-ray crystallography: Normal mode refinement. Biophys. Chem. 50:25.
22. Suhre, K., and Sanejouand, Y. H. 2004. On the potential of normal-mode analysis for solving difficult molecular-replacement problems. Acta Crystallogr. D Biol. Crystallogr. 60:796.
23. Lindahl, E., Azuara, C., Koehl, P., and Delarue, M. 2006. NOMAD-Ref: Visualization, deformation and refinement of macromolecular structures based on all-atom normal mode analysis. Nucleic Acids Res. 34:W52.
24. Delarue, M., and Dumas, P. 2004. On the use of low-frequency normal modes to enforce collective movements in refining macromolecular structural models. Proc. Natl. Acad. Sci. U.S.A. 101:6957.
25. Kundu, S., Melton, J. S., Sorensen, D. C., and Phillips, Jr., G. N. 2002. Dynamics of proteins in crystals: Comparison of experiment with simple models. Biophys. J. 83:723.
26. Kondrashov, D. A., Cui, Q., and Phillips, Jr., G. N. 2006. Optimization and evaluation of a coarse-grained model of protein motion using x-ray crystal data. Biophys. J. 91:2760.
27. Kondrashov, D. A., Van Wynsberghe, A. W., Bannen, R. M., Cui, Q., and Phillips, Jr., G. N. 2007. Protein structural variation in computational models and crystallographic data. Structure 15:169.
28. Poon, B. K., Chen, X., Lu, M., Vyas, N. K., Quiocho, F. A., Wang, Q., and Ma, J. 2007. Normal mode refinement of anisotropic thermal parameters for a supramolecular complex at 3.42-Å crystallographic resolution. Proc. Natl. Acad. Sci. U.S.A. 104:7869.
29. Schomaker, V., and Trueblood, K. N. 1968. On the rigid-body motion of molecules in crystals. Acta Crystallogr. B 24:63.
30. Murshudov, G. N., Vagin, A. A., and Dodson, E. J. 1997. Refinement of macromolecular structures by the maximum-likelihood method. Acta Crystallogr. D Biol. Crystallogr. 53:240.
31. Collaborative Computational Project, Number 4. 1994. The CCP4 suite: Programs for protein crystallography. Acta Crystallogr. D Biol. Crystallogr. 50:760.
32. Xu, Z., and Sigler, P. B. 1998. GroEL/GroES: Structure and function of a two-stroke folding machine. J. Struct. Biol. 124:129.
33. Sigler, P. B., Xu, Z., Rye, H. S., Burston, S. G., Fenton, W. A., and Horwich, A. L. 1998. Structure and function in GroEL-mediated protein folding. Annu. Rev. Biochem. 67:581.
34. Ma, J., and Karplus, M. 1998. The allosteric mechanism of the chaperonin GroEL: A dynamic analysis. Proc. Natl. Acad. Sci. U.S.A. 95:8502.
35. Ma, J., Sigler, P. B., Xu, Z., and Karplus, M. 2000. A dynamic model for the allosteric mechanism of GroEL. J. Mol. Biol. 302:303.
36. Keskin, O., Bahar, I., Flatow, D., Covell, D. G., and Jernigan, R. L. 2002. Molecular mechanisms of chaperonin GroEL-GroES function. Biochemistry 41:491.
37. Mao, Y., Vyas, N. K., Vyas, M. N., Chen, D. H., Ludtke, S. J., Chiu, W., and Quiocho, F. A. 2004. Structure of the bifunctional and Golgi-associated formiminotransferase cyclodeaminase octamer. EMBO J. 23:2963.
38. Shane, B., and Stokstad, E. L. R. 1984. Folates in the synthesis and catabolism of histidine. In Folates and Pterins, vol. 1, ed. R. L. Blakley and S. J. Benkovic, 433–55. New York: Wiley.
39. Bashour, A. M., and Bloom, G. S. 1998. 58K, a microtubule-binding Golgi protein, is a formiminotransferase cyclodeaminase. J. Biol. Chem. 273:19612.
40. Gao, Y. S., Alvarez, C., Nelson, D. S., and Sztul, E. 1998. Molecular cloning, characterization, and dynamics of rat formiminotransferase cyclodeaminase, a Golgi-associated 58-kDa protein. J. Biol. Chem. 273:33825.
41. Gao, Y. S., Vrielink, A., MacKenzie, R., and Sztul, E. 2002. A novel type of regulation of the vimentin intermediate filament cytoskeleton by a Golgi protein. Eur. J. Cell Biol. 81:391.
42. Lapierre, P., Hajoui, O., Homberg, J. C., and Alvarez, F. 1999. Formiminotransferase cyclodeaminase is an organ-specific autoantigen recognized by sera of patients with autoimmune hepatitis. Gastroenterology 116:643.
43. Rosenblatt, D. 1995. Inherited disorders of folate transport and metabolism. In The Metabolic and Molecular Bases of Inherited Diseases, vol. 2, ed. C. Scriver, A. Beaudet, W. Sly, and D. Valle, 3111–28. New York: McGraw-Hill.
44. Brünger, A. T., Adams, P. D., Clore, G. M., DeLano, W. L., Gros, P., Grosse-Kunstleve, R. W., Jiang, J. S., Kuszewski, J., Nilges, M., Pannu, N. S., Read, R. J., Rice, L. M., Simonson, T., and Warren, G. L. 1998. Crystallography & NMR system: A new software suite for macromolecular structure determination. Acta Crystallogr. D Biol. Crystallogr. 54:905.
18 Coarse-Grained Normal Mode Analysis to Explore Large-Scale Dynamics of Biological Molecules
Osamu Miyashita and Florence Tama
Department of Biochemistry and Molecular Biophysics, The University of Arizona
CONTENTS
18.1 Introduction
18.2 Methods
18.2.1 Normal Mode Theory and Analysis
18.2.2 Rotation-Translation-Block (RTB) Method
18.2.3 Conformational Change Pathway
18.2.4 The Protein Elastic Model: Tirion Potential
18.2.5 Strain Energy Analysis
18.3 Applications
18.3.1 RTB Approach to Study Large Biological Systems
18.3.2 Strain Energy Analysis
18.3.2.1 The Linear Elastic Model
18.3.2.2 Nonlinear Elastic Models
18.3.2.3 Strain Energy is Localized
18.3.3 Flexible Fitting of Atomic Structures into Low-Resolution Electron Density Maps
18.4 Conclusion
Acknowledgments
References
18.1 INTRODUCTION
Biomolecular machines made of proteins and RNAs perform and sustain most functions in our bodies. To elucidate their functional mechanisms, there has been a tremendous effort to obtain structural information for these biological molecules. While structure provides important insights, a deeper understanding can be obtained by examining their dynamical properties and the physical interactions within the system. It is therefore beneficial to complement experimental work with theoretical and computational techniques that can directly examine physical interactions, explore the dynamics of biological molecules, and bring useful atomic-level insights into protein functions. To computationally study dynamical properties of biological molecules, several approaches can be considered. The most common is the use of molecular dynamics simulations in which the system
evolves as a function of time [Karplus and McCammon 2002]. Exploration of molecular motions of biological molecules and their assemblies by this approach has provided significant insights into structure-function relationships. This method can give very detailed information on the dynamics near the native state. However, even though computational techniques and processing power have been improving significantly, applications to large-scale macromolecular assemblies are limited by the computational cost of all-atom simulation methods, and reaching time scales corresponding to functional motions remains impractical. An example of such work is the 10 ns simulation of the satellite tobacco mosaic virus, which required 10 days of computer time using 256 processors. It would take years to reach longer time scales (ms), which are relevant for large-scale rearrangements of proteins [Freddolino et al. 2006]. An alternative approach to extend the time scale of molecular dynamics simulations is to use coarse-grained models, which enable microsecond time scales to be reached for small proteins [Tozzini 2005]. The simulations typically consider only the Cα and P atoms, strung as beads, which considerably reduces the number of particles necessary for simulation. Details of such models can be found in other chapters of this book. However, these calculations are still too computationally expensive to observe large functional motions of large macromolecular assemblies such as the ribosome. Also, the use of advanced sampling methods to explore long-time-scale and large-amplitude conformational changes (e.g., protein folding) is still far from routine. In order to simulate large and slow conformational rearrangements of large biological molecules, we need to employ alternative techniques. One of these techniques is normal mode analysis (NMA), which is commonly used in physics, and was introduced to structural biology about 20 years ago [Go, Noguti, and Nishikawa 1983; Brooks and Karplus 1983].
In NMA the energy surface is approximated, in other words coarse-grained, as harmonic. Exploration of the normal modes of a molecular system can yield insights, at the atomic level, into the mechanism of large-scale rearrangements of protein/protein complexes, which often occur upon ligand/protein binding. Biological studies employing NMA have generally focused on a few large-amplitude/low-frequency normal modes, which are expected to be relevant to function. Until recently, NMA applications were limited to small proteins (up to 300 residues). There were two reasons for this limitation. The first is related to the size of the biological system. The standard protein model used in the calculation consists of classical point masses, typically one point per atom. Interactions between these atoms are defined by semiempirical force fields. Using these force fields requires an all-atom description of the macromolecule, which becomes computationally difficult with increasing system size (see Methods). The second problem is related to the energy minimization (see also Methods), which is required before NMA when semiempirical force fields are used. Minimization is particularly detrimental because it distorts the protein conformation; it is also time consuming. The applicability of NMA has been extended by the development of new coarse-grained models. These coarse-grained models do not require an all-atom description to represent the mechanical properties of a system. Thus a subset of atoms can be used to perform NMA, and virtually any system size can be studied (at a coarse-grained level, of course). NMA and coarse-grained models are approximations.
Ideally, we would simulate biological molecules at full scale and in full detail; however, in order to study conformational changes of large macromolecules with the computational power available today, alternative approaches are necessary, and coarse-grained NMA has been quite successful in this respect. Coarse-grained methods at both the molecular and algorithmic levels provide us with tools to extend our work to larger systems. However, one has to be aware that using such coarse-grained models is an approximation, which means that there are limitations to the approach; therefore one needs to be careful in the interpretation of the data. Our philosophy is to take full advantage of the computational power available today and to adjust the level of coarse-graining accordingly in order to represent the system as precisely as possible.
18.2 METHODS
18.2.1 NORMAL MODE THEORY AND ANALYSIS
NMA is a relatively mature technique [Goldstein 1950], which has in recent years piqued the interest of researchers due to new algorithmic developments that enable applications to larger systems. In NMA, one approximately represents the dynamics of a molecule as a set of harmonic oscillators. This is beneficial because the motion of a harmonic oscillator can be analytically described. For a harmonic oscillator of mass m with coordinate x connected to a spring with the spring constant k, the Hamiltonian is:

$$H = \frac{1}{2} m \left(\frac{dx}{dt}\right)^{2} + \frac{1}{2} k x^{2}. \qquad (18.1)$$
The dynamics of the particle can then be derived by solving the equation of motion that follows from Equation 18.1, which gives $x = C\cos(\omega t + \varphi)$, where C and φ are the amplitude and the phase at time t = 0 and $\omega = (k/m)^{1/2}$ is the angular frequency associated with the vibrational mode.

Unlike a simple harmonic oscillator, the potential energy of biological molecules is complex, and thus the equation of motion cannot be solved analytically. However, if one focuses on the motions in the vicinity of a stable conformation, the potential function can be approximated by a simple form. We consider a molecule with N atoms and describe the coordinates of the atoms as $\mathbf{r} = (x_1, y_1, z_1, \ldots, z_N)$, where $(x_i, y_i, z_i)$ is the coordinate of atom i. Assuming that we analyze the motion around a stable conformation $\mathbf{r}^0$, where the superscript 0 indicates the energy minimum, a Taylor expansion of the potential energy function $U(\mathbf{r})$ around a minimum on the energy surface, $\mathbf{r}^0$, gives:

$$U(\mathbf{r}) = U(\mathbf{r}^0) + \sum_{i} \left.\frac{\partial U}{\partial r_i}\right|_{\mathbf{r}=\mathbf{r}^0} (r_i - r_i^0) + \frac{1}{2!} \sum_{ij} \left.\frac{\partial^2 U}{\partial r_i \partial r_j}\right|_{\mathbf{r}=\mathbf{r}^0} (r_i - r_i^0)(r_j - r_j^0) + \frac{1}{3!} \sum_{ijk} \left.\frac{\partial^3 U}{\partial r_i \partial r_j \partial r_k}\right|_{\mathbf{r}=\mathbf{r}^0} (r_i - r_i^0)(r_j - r_j^0)(r_k - r_k^0) + \ldots \qquad (18.2)$$
Since the reference structure $\mathbf{r}^0$ is a minimum of the energy function, $\partial U/\partial r_i = 0$ at $\mathbf{r} = \mathbf{r}^0$. In addition, the potential energy can be defined relative to this reference structure as $U(\mathbf{r}^0) = 0$. Finally, if one considers sufficiently small displacements, terms beyond the second order may be neglected (i.e., harmonic approximation). The approximate potential energy function is given as:

$$U(\mathbf{r}) \cong \frac{1}{2} \sum_{ij} \left.\frac{\partial^2 U}{\partial r_i \partial r_j}\right|_{\mathbf{r}=\mathbf{r}^0} (r_i - r_i^0)(r_j - r_j^0). \qquad (18.3)$$
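To make the harmonic approximation of Equation 18.3 concrete in one dimension, the following sketch compares a model potential with its second-order expansion around the minimum. The Morse potential and its parameter values are assumptions chosen purely for illustration (the chapter does not use them), and the second derivative is estimated by a central finite difference.

```python
import numpy as np

# Illustrative 1D potential (assumption: Morse form with arbitrary parameters).
D, a, x0 = 1.0, 1.5, 2.0  # well depth, width parameter, minimum position

def morse(x):
    return D * (1.0 - np.exp(-a * (x - x0))) ** 2

def second_derivative(f, x, h=1e-4):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h ** 2

# Curvature at the minimum; analytically this is 2 * D * a**2 = 4.5.
k = second_derivative(morse, x0)

def harmonic(x):
    # One-dimensional analogue of Equation 18.3 with U(x0) = 0.
    return 0.5 * k * (x - x0) ** 2

def rel_error(dx):
    exact = morse(x0 + dx)
    return abs(harmonic(x0 + dx) - exact) / exact

for dx in (0.01, 0.1, 0.5):
    print(f"dx = {dx:4.2f}  relative error of harmonic approx = {rel_error(dx):.3f}")
```

The relative error grows with the displacement, which is why the approximation is restricted to motions close to the reference structure.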
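Thus the Hamiltonian of the system is then given by:

$$H(\mathbf{r}) \cong K(\mathbf{r}) + U(\mathbf{r}) = \frac{1}{2} \sum_{i} m_i \left(\frac{dr_i}{dt}\right)^{2} + \frac{1}{2} \sum_{ij} \left.\frac{\partial^2 U}{\partial r_i \partial r_j}\right|_{\mathbf{r}=\mathbf{r}^0} (r_i - r_i^0)(r_j - r_j^0), \qquad (18.4)$$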
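where $K$ represents the kinetic energy, and $m_i$ represents the mass associated with the coordinate $r_i$. For convenience, we rewrite the equation using mass-weighted coordinates, $X_i = m_i^{1/2}(r_i - r_i^0)$:
$$ H(\mathbf{X}) \cong \frac{1}{2} \sum_i \left(\frac{dX_i}{dt}\right)^2 + \frac{1}{2} \sum_{ij} \left.\frac{\partial^2 U}{\partial X_i \partial X_j}\right|_{\mathbf{X}=\mathbf{X}^0} X_i X_j. \tag{18.5} $$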
As we already discussed, in normal mode theory we represent the dynamics of a biological molecule as a collection of harmonic oscillators. The dynamics is not expressed directly in Cartesian coordinates but in normal-mode coordinates $\mathbf{q}$. The two sets of coordinates are related by the transformation matrix $\mathbf{A}$ as follows: $\mathbf{X} = \mathbf{A}\mathbf{q}$. This relation might be more intuitive in vector form:
$$
\begin{pmatrix}
\sqrt{m_1}\,(x_1 - x_1^0) \\ \sqrt{m_1}\,(y_1 - y_1^0) \\ \sqrt{m_1}\,(z_1 - z_1^0) \\
\sqrt{m_2}\,(x_2 - x_2^0) \\ \sqrt{m_2}\,(y_2 - y_2^0) \\ \sqrt{m_2}\,(z_2 - z_2^0) \\
\vdots \\
\sqrt{m_N}\,(x_N - x_N^0) \\ \sqrt{m_N}\,(y_N - y_N^0) \\ \sqrt{m_N}\,(z_N - z_N^0)
\end{pmatrix}
=
\begin{pmatrix}
a_{x_1,1} \\ a_{y_1,1} \\ a_{z_1,1} \\ a_{x_2,1} \\ a_{y_2,1} \\ a_{z_2,1} \\ \vdots \\ a_{x_N,1} \\ a_{y_N,1} \\ a_{z_N,1}
\end{pmatrix} q_1
+
\begin{pmatrix}
a_{x_1,2} \\ a_{y_1,2} \\ a_{z_1,2} \\ a_{x_2,2} \\ a_{y_2,2} \\ a_{z_2,2} \\ \vdots \\ a_{x_N,2} \\ a_{y_N,2} \\ a_{z_N,2}
\end{pmatrix} q_2
+ \cdots +
\begin{pmatrix}
a_{x_1,n} \\ a_{y_1,n} \\ a_{z_1,n} \\ a_{x_2,n} \\ a_{y_2,n} \\ a_{z_2,n} \\ \vdots \\ a_{x_N,n} \\ a_{y_N,n} \\ a_{z_N,n}
\end{pmatrix} q_n
+ \cdots
\tag{18.6}
$$
Each column vector of $\mathbf{A}$ represents a normal mode. The matrix $\mathbf{A}$ needs to be defined in such a way that the new coordinates $\{q_n\}$ are independent of each other in the Hamiltonian. The second term of the original Hamiltonian can be converted by finding a matrix $\mathbf{A}$ that satisfies $\mathbf{A}^T \mathbf{H} \mathbf{A} = \mathbf{L}$, where $H_{ij} = \partial^2 U/\partial X_i \partial X_j$ (the Hessian matrix) and $\mathbf{L}$ is a diagonal matrix also to be determined. In addition, if $\mathbf{A}^T \mathbf{A} = \mathbf{I}$, the first term in Equation 18.5 retains the same kinetic-energy form. Using these expressions, the Hamiltonian can be converted into the following form:
$$ H(\mathbf{q}) \cong \frac{1}{2} \sum_{n=1}^{3N-6} \left(\frac{dq_n}{dt}\right)^2 + \frac{1}{2} \sum_{n=1}^{3N-6} \omega_n^2 q_n^2, \tag{18.7} $$
where $\omega_n^2$ is the $n$th diagonal element of the matrix $\mathbf{L}$. This Hamiltonian describes a set of independent harmonic oscillators $\{q_n\}$ with corresponding frequencies $\{\omega_n\}$, each of which can be solved analytically. In practice, the transformation matrix $\mathbf{A}$ and the diagonal matrix $\mathbf{L}$ are determined by solving the eigenvalue problem, which is to find a vector $\mathbf{a}$ and a value $\lambda$ that satisfy $\mathbf{H}\mathbf{a} = \lambda\mathbf{a}$. For $\mathbf{H}$, which is a $3N \times 3N$ matrix, we find $3N$ solutions $(\mathbf{a}_1, \lambda_1), (\mathbf{a}_2, \lambda_2), \ldots, (\mathbf{a}_{3N}, \lambda_{3N})$. Among them, $3N - 6$ normal modes are meaningful; the remaining six have an eigenvalue equal to 0 and correspond to rigid-body translations and rotations of the whole system. The solutions are normally sorted in ascending order of eigenvalue, providing the eigenvalue matrix $\mathbf{L} = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_{3N-6})$ and the associated eigenvector matrix $\mathbf{A} = (\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_{3N-6})$. In summary, NMA yields a set of normal-mode vectors $\{\mathbf{a}_n\}$ and corresponding frequencies $\{\omega_n\}$. The $n$th normal-mode variable, $q_n$, oscillates with the frequency $\omega_n$; that is, the $n$th eigenvector $(a_{1n}, a_{2n}, \ldots, a_{3N,n})^T$ gives the direction and relative amplitudes of the atomic displacements in Cartesian space, and all of those oscillatory displacements occur at the same frequency, $\omega_n$. Motions within the system are described as a superposition of those modes. In the case of a simple system such as a water molecule, the resulting normal-mode vectors reveal the three well-known motions of the molecule: bending, symmetric stretching, and asymmetric stretching. We should note that the frequencies obtained from computation with a detailed force field can be directly related to infrared experiments in which bond stretching can be observed.
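The eigenvalue problem described above can be sketched on a system small enough to check by hand. The example below is an illustration under stated assumptions, not the authors' implementation: it uses a hypothetical one-dimensional chain of three masses joined by two identical springs, so there is a single zero-frequency (rigid translation) mode instead of the six found for a three-dimensional molecule. `numpy.linalg.eigh` returns the eigenvalues in ascending order with orthonormal eigenvectors as columns, which matches the requirements $\mathbf{A}^T\mathbf{H}\mathbf{A} = \mathbf{L}$ and $\mathbf{A}^T\mathbf{A} = \mathbf{I}$.

```python
import numpy as np

def normal_modes(hessian, coordinate_masses):
    """Diagonalize the mass-weighted Hessian.

    hessian           : (n, n) second derivatives in Cartesian coordinates
    coordinate_masses : (n,) mass attached to each coordinate
    Returns eigenvalues (omega^2, ascending) and eigenvectors (columns of A).
    """
    inv_sqrt_m = 1.0 / np.sqrt(np.asarray(coordinate_masses, dtype=float))
    mass_weighted = hessian * np.outer(inv_sqrt_m, inv_sqrt_m)
    return np.linalg.eigh(mass_weighted)

# Hypothetical toy system: masses 1, 2, 1 connected by springs with k = 1.
k = 1.0
masses = np.array([1.0, 2.0, 1.0])
hessian = k * np.array([[1.0, -1.0, 0.0],
                        [-1.0, 2.0, -1.0],
                        [0.0, -1.0, 1.0]])

eigvals, modes = normal_modes(hessian, masses)
# Clip tiny negative round-off before taking the square root.
frequencies = np.sqrt(np.clip(eigvals, 0.0, None))
print("omega^2:", eigvals)
print("modes (columns):\n", modes)
```

For this choice of masses the squared frequencies are 0, $k/m$ and $2k/m$; the zero eigenvalue is the rigid translation that would be discarded before further analysis.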
From NMA, several dynamical properties can be calculated. As an example, B-factors (temperature factors) can be calculated as follows. If the system is in thermal equilibrium, the average potential energy of each mode, $\omega_n^2 \langle q_n^2 \rangle / 2$, is equal to $k_B T/2$, where $T$ is the absolute temperature and $k_B$ is the Boltzmann constant; thus
$$ \langle q_n^2 \rangle = \frac{k_B T}{\omega_n^2}. \tag{18.8} $$
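Using this relation, the B-factor of each atom is given as:
$$ B_i = \frac{8\pi^2}{3} \langle (\mathbf{r}_i - \mathbf{r}_i^0)^2 \rangle = \frac{8\pi^2}{3} \frac{1}{m_i} \sum_{n=1}^{3N-6} a_{in}^2 \langle q_n^2 \rangle = \frac{8\pi^2}{3} \frac{k_B T}{m_i} \sum_{n=1}^{3N-6} \frac{a_{in}^2}{\omega_n^2}. \tag{18.9} $$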
From this equation, it is evident that the largest contribution to the atomic displacement comes from the lowest-frequency normal modes (small $\omega$). For the same reason, the lowest-frequency modes are expected to be relevant to biological function, because large conformational changes along them can be induced by perturbations to the system such as ligand binding. In addition, the lowest-frequency eigenvectors represent the most globally distributed, or collective, motions; that is, a large number of atoms have significant components $(a_{x_i}, a_{y_i}, a_{z_i})^T$, while for high-frequency eigenvectors only a few atoms are involved in the motion. Studies employing NMA generally focus on a few large-amplitude, low-frequency normal modes, as they can be used to unveil large conformational changes of biological molecules.
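Equation 18.9 translates almost directly into array operations. The sketch below is one possible reading of it, with assumptions stated explicitly: the eigenvectors are taken to be those of the mass-weighted Hessian with the zero modes already removed, $a_{in}^2$ is interpreted as the squared norm of the three components of atom $i$ in mode $n$, and the function name `b_factors` and the numbers in the demonstration are hypothetical.

```python
import numpy as np

def b_factors(eigvals, eigvecs, masses, kT):
    """Per-atom B-factors following Equation 18.9.

    eigvals : (M,) nonzero eigenvalues omega_n^2
    eigvecs : (3N, M) mass-weighted eigenvectors, coordinates ordered x1, y1, z1, x2, ...
    masses  : (N,) atomic masses
    kT      : thermal energy k_B * T in units consistent with eigvals and masses
    """
    eigvals = np.asarray(eigvals, dtype=float)
    masses = np.asarray(masses, dtype=float)
    n_atoms = masses.size
    # Squared amplitude of atom i in mode n, summed over x, y, z: shape (N, M).
    amp2 = (np.asarray(eigvecs, dtype=float).reshape(n_atoms, 3, -1) ** 2).sum(axis=1)
    mean_square_fluct = kT / masses * (amp2 / eigvals).sum(axis=1)
    return 8.0 * np.pi**2 / 3.0 * mean_square_fluct

# Hypothetical two-atom illustration with two modes of equal amplitude on atom 0:
# the mode with omega^2 = 1 contributes 100 times more than the one with omega^2 = 100.
vecs = np.zeros((6, 2))
vecs[0, 0] = 1.0   # low-frequency mode moves atom 0 along x
vecs[1, 1] = 1.0   # high-frequency mode moves atom 0 along y
b_demo = b_factors([1.0, 100.0], vecs, masses=[1.0, 4.0], kT=0.6)
print(b_demo)
```

The $1/\omega_n^2$ weighting is what makes the low-frequency modes dominate the computed fluctuations.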
18.2.2 ROTATION-TRANSLATION-BLOCK (RTB) METHOD
The application of NMA critically depends on diagonalization of the Hessian, and this can be a limiting factor in applying NMA to interesting large molecular systems such as the ribosome, myosin, chaperones, or viruses, among others. The RTB method was introduced to reduce the size of the Hessian through a simple physical idea: a protein or nucleic acid chain may be viewed as being composed of rigid components linked together, such as residues/bases, groups of residues/bases, or more extensive segments of secondary structural elements (see Figure 18.1a) [Durand, Trinquier, and Sanejouand 1994; Tama et al. 2000]. The combination of rotations and translations of these rigid components should provide a good representation of the low-frequency normal modes of the biological system. Thus, in the RTB method, the molecular system is first divided into $n_b$ blocks, each consisting of one or a few consecutive residues or base pairs. Then, the lowest-frequency normal modes of the biological system are obtained as linear combinations of the rotations and translations of these blocks. In standard approaches, the normal modes of the system are calculated through the diagonalization of the mass-weighted Hessian matrix $\mathbf{H}$. In the RTB approach, $\mathbf{H}$ is first expressed in a basis set defined by the rotational and translational degrees of freedom of the $n_b$ blocks. The projected Hessian, $\mathbf{H}_b$, is given by:
$$ \mathbf{H}_b = \mathbf{P}^T \mathbf{H} \mathbf{P}, \tag{18.10} $$
where $\mathbf{P}$ is the orthogonal $3N \times 6n_b$ matrix built with the vectors associated with the local rotations and translations of each block. By diagonalizing $\mathbf{H}_b$, which is a $6n_b \times 6n_b$ matrix, the normal modes in the block basis, $\mathbf{A}_b$, are obtained. The corresponding ($3N$) atomic displacements are recovered as $\mathbf{A}_P = \mathbf{P}\mathbf{A}_b$.
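The projection of Equation 18.10 can be sketched as follows. This is an illustrative construction under assumptions that the text does not spell out: each block contributes three translation and three rotation vectors expressed in mass-weighted coordinates about the block's center of mass, these six vectors are orthonormalized with a QR decomposition, and every block contains at least three non-collinear atoms. The function names, the random coordinates, and the random positive semidefinite matrix standing in for a Hessian are all hypothetical.

```python
import numpy as np

def rtb_projector(coords, masses, blocks):
    """Build the 3N x 6*nb matrix P from rigid-body motions of each block.

    coords : (N, 3) reference coordinates
    masses : (N,) atomic masses
    blocks : list of integer index arrays, one per block (>= 3 non-collinear atoms)
    """
    n_atoms = coords.shape[0]
    P = np.zeros((3 * n_atoms, 6 * len(blocks)))
    for b, idx in enumerate(blocks):
        idx = np.asarray(idx)
        m = masses[idx]
        sqrt_m = np.sqrt(m)[:, None]
        center = (m[:, None] * coords[idx]).sum(axis=0) / m.sum()
        d = coords[idx] - center
        basis = np.zeros((3 * idx.size, 6))
        for axis in range(3):
            e = np.zeros(3)
            e[axis] = 1.0
            basis[:, axis] = (sqrt_m * e).ravel()                    # translation
            basis[:, 3 + axis] = (sqrt_m * np.cross(e, d)).ravel()   # rotation
        q, _ = np.linalg.qr(basis)          # orthonormalize the six vectors
        rows = (3 * idx[:, None] + np.arange(3)).ravel()
        P[rows, 6 * b:6 * b + 6] = q
    return P

def rtb_modes(hessian_mw, P):
    """Diagonalize H_b = P^T H P and map the modes back to atomic space."""
    H_b = P.T @ hessian_mw @ P
    eigvals, A_b = np.linalg.eigh(H_b)
    return eigvals, P @ A_b                  # A_P = P A_b

# Hypothetical demonstration: 6 atoms in 2 blocks, random symmetric stand-in Hessian.
rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 5.0, size=(6, 3))
masses = np.array([12.0, 14.0, 16.0, 12.0, 12.0, 16.0])
blocks = [np.array([0, 1, 2]), np.array([3, 4, 5])]
P = rtb_projector(coords, masses, blocks)
X = rng.normal(size=(18, 18))
H = X @ X.T
eigvals, modes = rtb_modes(H, P)
print(P.shape, modes.shape)
```

The matrix that is actually diagonalized has dimension $6n_b$ rather than $3N$, which is the source of the savings for large systems.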
FIGURE 18.1 (a) In the RTB approach, the polypeptidic chain is treated as a collection of rigid blocks, the blocks being made of one residue or more, and only the rotational and translational degrees of freedom of those blocks are considered. (b) Asymmetric unit of HK97, which contains seven copies of the same protein. HK97 is a T = 7 virus; that is, a total of 420 proteins. In the normal mode calculation, each protein is assigned to a block or divided into two blocks (840 total) to take into account the flexibility of the loop. (c, d) Difference in the shape of HK97 (T = 7) between its two known conformations. (Adapted from the Viperdb web site: Shepherd, C. M., I. A. Borelli, G. Lander, P. Natarajan, V. Siddavanahalli, C. Bajaj, J. E. Johnson, C. L. Brooks, and V. S. Reddy, Nucleic Acids Res., 34:D386, 2006.)
Following the above formalism, the actual computational procedure consists of three steps. In the first step, blocks of residues are defined and, for each block $\alpha$, the corresponding component of matrix $\mathbf{P}$, $\mathbf{U}_\alpha$, is determined and stored. These $6n_b$ vectors form a new basis of small dimension that corresponds to the projector $\mathbf{P}$. In the second step, the Hessian matrix is expressed in this RTB basis, separately for each coupling or diagonal block, $\mathbf{H}_{\alpha\beta}$:
$$ \mathbf{H}^b_{\alpha\beta} = \mathbf{U}_\alpha^T \mathbf{H}_{\alpha\beta} \mathbf{U}_\beta. \tag{18.11} $$
The set of $n_b^2$ block matrices $\mathbf{H}^b_{\alpha\beta}$ forms the matrix $\mathbf{H}_b$. The construction of $\mathbf{H}_b$ has minimal memory requirements, since the Hessian corresponding to each pair of blocks, $\mathbf{H}_{\alpha\beta}$, is calculated and immediately projected onto the rotation-translation basis. Therefore, during this step, the largest matrix kept in memory corresponds to the size of one block in 3D coordinates. The RTB method requires only the small-dimension vectors $\mathbf{U}_\alpha$ and the small $6n_b \times 6n_b$ matrix $\mathbf{H}_b$ to be stored. In the last step, $\mathbf{H}_b$ is diagonalized, as in standard methods. It has been demonstrated that this approach yields very accurate approximations of the low-frequency normal modes of proteins. Studies have also shown that the manner in which the protein is partitioned into blocks has minimal qualitative consequence for the description of the low-frequency normal modes of the system [Tama et al. 2000].
18.2.3 CONFORMATIONAL CHANGE PATHWAY
Identifying the pathways for conformational changes in macromolecular systems can be useful for understanding their functional mechanisms. In particular, atomic-level descriptions of the conformational transition process could help to elucidate the molecular basis of the motions. To generate these pathways, tentative models of intermediate structures between the two known conformations need to be built, which can be approached in several ways from the standpoint of modeling [Schlitter et al. 1993; Guilbert, Perahia, and Mouawad 1995]. NMA has also been applied to describe conformational change pathways [Mouawad and Perahia 1996; Xu, Tobi, and Bahar 2003]. These methods take advantage of the low-energy normal-mode directions of a system between the two endpoint states. Here we present an alternative iterative technique. Before introducing it, we should mention how a simple linear approach can be used to describe conformational changes of biological molecules. The displacement vector between two endpoint conformations, $\Delta\mathbf{r}$, can be expressed as the superposition of displacements along the normal-mode directions of the system:
$$ \Delta\mathbf{r} = \sum_n \mathbf{a}_n q_n, \tag{18.12} $$
since the normal-mode eigenvectors should span the conformational space. The normal-mode amplitudes $\{q_n\}$ are given as:
$$ q_n = \mathbf{a}_n \cdot \Delta\mathbf{r}. \tag{18.13} $$
By using some fraction of the normal-mode coordinates $q_n$ of Equation 18.13 for the deformation, intermediate structures can be generated (Equation 18.12). However, we should note that using all modes corresponds to simple Cartesian interpolation between the two endpoint structures, which often generates physically unrealistic structures. Generally, one finds that a small subset of modes $\{\mathbf{a}_n\}$ accounts for the majority of the conformational deformation between two endpoint conformations, and this subset serves as a basis for expressing the, possibly functional, dynamics of the conformational change. These modes coincide with a few of the lowest-frequency modes [Tama and Sanejouand 2001]. Although the linear interpolation approach just described is adequate in some instances, describing conformational changes between two conformationally distinct states generally requires a nonlinear description due to the anharmonic character of the energy landscape. Conformational change pathways are nonlinear, whereas normal modes provide only linear motions, and therefore a single set of modes cannot provide a pathway from one structure to another. Another critical aspect is that displacing too far along the direction given by the lowest-frequency modes, even though this is the globally preferred direction of the conformational change, can induce large distortions in the local structure, such as in bond distances. The problems arising from the harmonic approximation employed in NMA can be ameliorated by performing the normal-mode analyses and conformational deformations in an iterative manner [Miyashita, Onuchic, and Wolynes 2003; Tama, Miyashita, and Brooks 2004a,b; Miyashita, Wolynes, and Onuchic 2005]. Instead of moving the structure from the initial to the final form directly, the deformation is limited to a small amount, and normal modes are recalculated for the deformed structure. The procedure is as follows: the initial conformation is defined as $C_I = C_0$ and the final state is $C_F$.
At step $k$, NMA is performed on $C_k$ (starting from $k = 0$, that is, from $C_I$). The vector difference $\Delta\mathbf{r}^k$ between $C_k$ and $C_F$ is (re)evaluated. The structure $C_k$ is displaced along a linear combination of a few normal modes $\{\mathbf{a}_n^k\}$ toward the final state, leading to the next structure $C_{k+1}$. The amplitude, $q_n^k$, of the displacement along normal mode $n$ is given by:
$$ q_n^k = (\mathbf{a}_n^k \cdot \Delta\mathbf{r}^k)\, Q, \tag{18.14} $$
where $Q$ is a parameter that determines how far the structure is displaced: $Q = 0$ leaves the current coordinates unchanged, and $Q = 1$ applies the full projection of the displacement toward $C_F$ onto the current normal modes. A small value of $Q$, such as 0.01, may be used to generate pathways with small distortions. This procedure is repeated until the RMSD between the $k$th iterate and the final conformation can no longer be decreased.
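The iterative procedure of Equation 18.14 can be outlined in a few lines. The following is a schematic sketch rather than the published implementation: mass weighting is ignored, `modes_fn` is a hypothetical callback that would recompute the normal modes of the current structure (for example from an elastic network) and return them as orthonormal columns sorted by increasing frequency, and the stopping rule is a simple threshold on the RMSD improvement. In the demonstration a fixed identity basis stands in for a real normal-mode calculation.

```python
import numpy as np

def rmsd(a, b):
    """Root-mean-square deviation between two (N, 3) coordinate sets."""
    return np.sqrt(np.mean(np.sum((a - b) ** 2, axis=1)))

def iterative_pathway(start, target, modes_fn, n_modes=3, Q=0.01,
                      max_steps=10000, tol=1e-8):
    """Generate intermediate structures by repeated small normal-mode displacements."""
    current = start.copy()
    path = [current.copy()]
    best = rmsd(current, target)
    for _ in range(max_steps):
        modes = modes_fn(current)[:, :n_modes]        # a_n^k, shape (3N, n_modes)
        delta = (target - current).ravel()            # Delta r^k
        q = Q * (modes.T @ delta)                     # Equation 18.14
        trial = current + (modes @ q).reshape(current.shape)
        new = rmsd(trial, target)
        if best - new < tol:                          # RMSD can no longer be decreased
            break
        current, best = trial, new
        path.append(current.copy())
    return path

# Hypothetical demonstration with two atoms and a fixed basis in place of NMA.
start = np.zeros((2, 3))
target = np.array([[1.0, 0.5, 0.0], [0.0, 0.0, 0.0]])
path = iterative_pathway(start, target, modes_fn=lambda c: np.eye(6), Q=0.5)
print(len(path), rmsd(path[-1], target))
```

Recomputing the modes at every step is what allows the overall path to curve, even though each individual displacement is linear.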
18.2.4 THE PROTEIN ELASTIC MODEL: TIRION POTENTIAL
To perform NMA, a potential energy function needs to be defined. Models used in standard calculations to represent biological molecules consist of classical point masses, typically one per atom. The energy terms for interactions between these atoms are defined by semiempirical force fields. Using these force fields requires an all-atom description of the macromolecule, which becomes computationally difficult with increasing system size. Using such models also requires a minimization of the potential energy before NMA to ensure that the system is at an energy minimum. This step can be detrimental because the protein conformation changes during the minimization. Moreover, it is time consuming, and structures with missing residues are difficult to study. Instead, a simplified representation of the potential energy can be introduced for NMA of biological systems. In this representation, the elastic network model, the biological system is described as a three-dimensional elastic network based on the equilibrium distribution of atoms [Tirion 1996]. Amino acids or base pairs may be represented in full atomic detail, or at a more coarse-grained level. For example, one mass point per residue [Hinsen 1998], only Cα atoms [Bahar, Atilgan, and Erman 1997; Tama and Sanejouand 2001], or more coarse-grained particle-based models [Doruker, Jernigan, and Bahar 2002] may be used to identify the junctions of the network. These junctions are representative of the mass distribution of the system and are connected together via a simple harmonic restoring force:
$$ E(\mathbf{r}_a, \mathbf{r}_b) = \begin{cases} \dfrac{k}{2} \left( |\mathbf{r}_a - \mathbf{r}_b| - |\mathbf{r}_a^0 - \mathbf{r}_b^0| \right)^2 & \text{for } |\mathbf{r}_a^0 - \mathbf{r}_b^0| \le R_C, \\ 0 & \text{for } |\mathbf{r}_a^0 - \mathbf{r}_b^0| > R_C, \end{cases} \tag{18.15} $$
where $\mathbf{r}_a - \mathbf{r}_b$ denotes the vector connecting pseudoatoms $a$ and $b$, the zero superscript indicates the initial configuration of the pseudoatoms, and $R_C$ is a spatial cutoff for interconnections between the particles. The strength of the potential, $k$, is a phenomenological constant assumed to be the same for all interacting pairs. The total potential energy of the molecule is expressed as the sum of elastic strain energies:
$$ E_{\mathrm{System}} = \sum_{a,b} E(\mathbf{r}_a, \mathbf{r}_b). \tag{18.16} $$
Note that this energy function, $E_{\mathrm{System}}$, is at its minimum for whichever configuration is chosen as the reference, for any system, thus eliminating the need for minimization prior to NMA. Consequently, NMA can be performed directly on crystallographic or NMR structures [Tirion 1996]. Several studies have shown that this Hookean potential is sufficient to reproduce the low-frequency normal modes of proteins as produced by more complete potential energy functions [Tama and Sanejouand 2001]. The high degree of accord between the modes constructed with these methods suggests that low-frequency normal modes are predominantly a property of the shape of the molecular system [Tama et al. 2003; Tama, Wriggers, and Brooks 2002; Ming et al. 2002]. While this agreement tends to break down at high frequencies, there have been many cases showing that the collective motions found in the low-frequency modes characterize biologically relevant conformational changes well [Tama and Sanejouand 2001].
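Equations 18.15 and 18.16 lead to a Hessian that can be assembled directly from the reference coordinates. The sketch below is a minimal illustration with several assumptions that are not in the text: all pseudoatoms have unit mass, each pair is counted once in the sum of Equation 18.16, and the coordinates, cutoff, and function names are hypothetical. For a connected, rigid network the resulting matrix should have exactly six zero eigenvalues (the rigid-body motions), and the energy is zero at the reference structure by construction.

```python
import numpy as np

def enm_energy(coords, ref, k_spring=1.0, cutoff=8.0):
    """Elastic network energy (Equations 18.15 and 18.16), each pair counted once."""
    d0 = np.linalg.norm(ref[:, None, :] - ref[None, :, :], axis=-1)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    pairs = np.triu(d0 <= cutoff, k=1)      # upper triangle: a < b within the cutoff
    return 0.5 * k_spring * np.sum((d[pairs] - d0[pairs]) ** 2)

def enm_hessian(ref, k_spring=1.0, cutoff=8.0):
    """Second derivatives of the network energy evaluated at the reference structure."""
    n = ref.shape[0]
    hessian = np.zeros((3 * n, 3 * n))
    for a in range(n):
        for b in range(a + 1, n):
            d = ref[b] - ref[a]
            dist2 = d @ d
            if dist2 > cutoff**2:
                continue
            block = -k_spring * np.outer(d, d) / dist2
            hessian[3*a:3*a+3, 3*b:3*b+3] = block
            hessian[3*b:3*b+3, 3*a:3*a+3] = block
            hessian[3*a:3*a+3, 3*a:3*a+3] -= block
            hessian[3*b:3*b+3, 3*b:3*b+3] -= block
    return hessian

# Hypothetical reference structure: 10 pseudoatoms placed at random in a 5 A box,
# with a cutoff large enough that every pair is connected.
rng = np.random.default_rng(0)
ref = rng.uniform(0.0, 5.0, size=(10, 3))
hess = enm_hessian(ref, cutoff=10.0)
eigvals, eigvecs = np.linalg.eigh(hess)
n_zero = int(np.sum(np.abs(eigvals) < 1e-8))
perturbed = ref + 0.1 * rng.normal(size=ref.shape)
print("zero modes:", n_zero)
print("energy at reference:", enm_energy(ref, ref, cutoff=10.0))
print("energy after perturbation:", enm_energy(perturbed, ref, cutoff=10.0))
```

Because the spring constant multiplies the whole matrix, changing `k_spring` rescales the eigenvalues but leaves the eigenvectors unchanged.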
18.2.5 STRAIN ENERGY ANALYSIS
Originally, the Tirion potential was proposed for NMA. This potential is crude but adequate for low-frequency motions. Thus, it can be used to analyze the mechanical energy of structures along those low-frequency motions. In many respects it is the elastic counterpart of the Go model used in protein-folding simulations. In the strain energy analysis, we examine how a protein would be strained when it is deformed from its stable structure. Normally, the stable conformation is the one determined by X-ray crystallography. The network of the Tirion potential is built from this structure, which is therefore the most stable structure by construction of the potential (Equation 18.15 and Equation 18.16). Any deformation of the original structure causes an increase in the energy; that is, strain. To examine strain energy quantitatively, the spring constant of the Tirion potential, $k$, has to be chosen appropriately (note that the normal-mode vectors do not depend on this parameter, but the frequencies do). One of the simplest approaches is to adjust it so that the average atomic B-factors from X-ray crystallography and NMA coincide [Tirion 1996; Bahar, Atilgan, and Erman 1997]. It could also be determined from a systematic study of the X-ray crystallography structure database [Kundu et al. 2002]. The B-factor includes not only atomic fluctuations from protein dynamics but also crystal disorder. On the other hand, crystal contacts could also affect the B-factor. Thus estimation of the spring constant is not straightforward. There is also an approach that considers a protein as a plastic object [Maragakis and Karplus 2005]. Strain energy analysis can be used to estimate the energetic cost of deforming a protein structure from a stable one to others. In addition, examination of the local distribution of the strain energy provides information on the effects of conformational fluctuations on the local environment around each of the residues.
High strain energy indicates that the local environment of the residue is correlated to the global dynamics of the protein. The strain energy of an atom, i, is defined in Equation 18.17: k Ei = 4
ri , j 8 Å is smoother and could be approximated with an amino-acid-dependent potential. This includes mainly hydrophobicity and electrostatics. This problem of the "double nature" of the nonbonded interactions in CG models can only be partially solved by adding a separate bead for the side chain and additional beads for the backbone, as in the multiple-bead models the double-well structure of $U_{nb}(r)$ is still present [2], although less pronounced. Thus, remaining within the one-bead model, a possibility is to parameterize more accurate amino-acid-type-dependent $U_{nb}(r)$, separating it into local and nonlocal parts:
$$ U_{nb}(r) = U_{nb}^{\mathrm{loc}}(r) + U_{nb}^{\mathrm{nonloc}}(r), \tag{19.5} $$
the first including anisotropic potentials and the second including isotropic terms. Some very preliminary steps towards this completely unbiased and accurate/predictive model were made in Ref. 8. The minimal polypeptide model in Ref. 8 shows secondary structure transitions and quite accurate structures, but only for oligopeptides, and an amino-acid-based optimized parameterization is in progress. A possible way to get around this difficulty is to preserve a local bias in the model, as will be shown in detail in the "Results" section of this chapter.
19.2.3.1 The Inclusion of Electrostatics: Solvent Effects
In the parameterization of $U_{nb}^{\mathrm{nonloc}}(r)$ it should be borne in mind that, in order to maintain the advantage of saving computational cost, the solvent is treated implicitly. This term, however, is in principle easier to parameterize, since it can be treated as isotropic and basically includes only two effects: hydrophobicity and electrostatics. The two effects cannot be separated if the parameterization is based on $g(r)$; however, the first one can be conveniently represented as a Morse-like potential whose parameters depend on the hydrophobicity value of the interacting amino-acid pair. This problem was addressed by several authors (see for instance Ref. 20). Conversely, a proper treatment of electrostatics based on Boltzmann inversion is difficult, because it involves inverting $g(r)$ at medium and long range, where it is less well determined. An attractive direction for future work will be the development of hybrid models that combine the one-bead CG elements with implicit solvation models that have been developed for arbitrary types of molecular dynamics and molecular modeling simulations [21]. For long-ranged electrostatic interactions, these include new developments that use finite element or boundary element methods to solve the Poisson–Boltzmann equation very efficiently [22]. Apolar interactions have traditionally been accounted for separately from the electrostatic interactions, by means of effective surface tension models [21]. Newer solvation models are becoming available that allow for a more accurate, coupled treatment of the polar and apolar interactions [23,24]. To conclude this section, Figure 19.4 presents a schematic classification of the one-bead models for proteins mentioned above (highlighted in dark gray). The models are placed according to their predictivity-transferability and accuracy. As already remarked, the more transferable models are also in general less accurate in reproducing local structures. While the ultimate goal in one-bead model parameterization is to have good accuracy together with high predictivity, in the following section a compromise model will be presented, having both good accuracy and predictivity, which can be considered an intermediate temporary step on the path toward developing a truly unbiased model for proteins.
FIGURE 19.4 A qualitative predictivity-versus-accuracy diagram of the one-bead models. The main characteristics of the models are given.
19.3 RESULTS: INTERMEDIATE STEPS TOWARDS A COMPLETELY UNBIASED ONE-BEAD MODEL
As we have seen in the previous section, the main problem in the parameterization of one-bead models is the representation of local nonbonded interactions, which are varied in nature, highly anisotropic, and very specific. Keeping in mind that the final goal should be a sequence-based parameterization of these interactions, we report a possible way to get around this problem. The idea is to keep a local bias towards a known reference structure [17,25]. In practice, the nonbonded potential is separated into two parts, as in Equation 19.5, with the cutoff between the two conveniently located at ∼8 Å, and both parts are represented by Morse potentials, $v_M(r) = \varepsilon\{[\exp(-\alpha(r - r_0)) - 1]^2 - 1\}$. The local part, $U_{nb}^{\mathrm{loc}}(r)$, carries a bias: $r_0 = r_{0,ij}$, where $r_{0,ij}$ is taken from a reference structure. Additionally, the dissociation energy is made to decrease exponentially with the equilibrium distance, $\varepsilon(r_{0,ij}) = A\exp(-\lambda r_{0,ij})$, to account for the weakening of the bond as the equilibrium distance increases. $U_{nb}^{\mathrm{nonloc}}(r)$ has the same functional form, but $r_0$ and $\varepsilon$ are independent of the reference structure and are matched with those of the local part. The parameters $A$ and $\lambda$ were determined with the iterative Boltzmann inversion technique. The conformational terms (a double well for the bond angle and a simplified cosine form for the dihedral) are unbiased and parameterized by amino-acid type. This model was applied to the HIV-1 protease (HIV-1pr), a key enzyme in the HIV replication cycle. HIV-1pr binds to the viral polyproteins and cleaves them into functional pieces. The enzyme mechanism is thought to involve the opening of two beta-hairpin structures that protect the active site, called the flaps, but the opening occurs on the microsecond-to-millisecond time scale. Thus this was a good test case, since the multimicrosecond time scale is feasible with one-bead models, while not reachable
with all-atom simulations. The presence of the local bias towards the crystallographic apo-HIV-1pr structure maintains a very good accuracy of the secondary structure of the protein. However, this bias has also proven to be weak enough to allow large fluctuations from the reference structure: the protease flaps can open, leaving the active site completely exposed to the solvent (see Figure 19.5). Indeed, the comparison of the simulation with a recently crystallized semiopen structure shows very good accuracy (see Figure 19.5a). Since no bias towards this structure was included in the model, this is an a posteriori indicator of the predictivity of the model. Additionally, the flap opening was made possible by the particularly accurate form of the double-well bond angle term: the flap tip needs to curl for the flap to open. This local conformational transition involves three consecutive Cα atoms, and it does not occur with a harmonic potential. Multiple microsecond simulations were performed, showing that the flap opening fraction depends on the temperature and follows a sigmoid curve that is typical of phase transitions: the model describes a system stable in the closed state, but very near to the transition, as one would expect considering the mechanism. This is also in agreement with experimental association rates, if one assumes that the substrate association is mainly triggered by the flap opening. The flap-opening frequency, conversely, depends on the damping constant in the Langevin dynamics (or on the hydrodynamic radius in the Brownian dynamics). For physically reasonable values it reaches the microsecond time scale, indicating again a good accuracy of the model, even as far as the characteristic times are concerned, and demonstrating how the CG model can be combined with a stochastic approach in order to reproduce the correct dynamics. The study of the correlations between the flap opening and other principal modes has revealed the location of a possible allosteric inhibition site [27]. The influence of crowder molecules on the flap-opening dynamics was studied by representing the crowders as large soft spheres, showing that at certain concentrations the presence of crowders can hinder the flap opening [26] (see Figure 19.5b). The model was also applied to ligand binding [27,28] and to substrate binding-cleavage dynamics [27,29]. The effect of mutations on the binding affinity was evaluated and found to be in agreement with experiment. It was also shown that while small ligands can enter HIV-1pr from the sides with partial opening of the flaps, the opening must be complete for the substrate to enter (Figure 19.5c,d). The initial approach of the substrate is by free diffusion. Subsequently, the substrate explores different possible approach angles, interacts with the flaps, and modifies their dynamics, favoring the open state. When the flaps open, if the substrate is in the correct orientation it enters and positions itself correctly in the active site. The flaps close and the HIV-1pr–substrate complex is stable in the closed conformation.
FIGURE 19.5 (See color insert following page 238.) (a) Snapshots from the free protease simulation, showing the steps of the flap opening. For the first two steps experimental structures are available (in blue in the color figure) that superimpose very well on the simulated structures (in red). (b) A snapshot of the simulation in the presence of crowders. (c) Substrate approach (A, B), interaction with the flaps (C, D), substrate adjustment and flap closing (E, F), cleavage and release (G–I). (d) Ligand binding with closed flaps (for small ligands). (e) Coarse-graining of the nucleosome: the all-atom cartoon representation (left) and the one-bead model (right). (f) The one-bead model of the 70S bacterial ribosome.
When the substrate is cleaved, the products are released without flap opening. This is the first simulation of the entire process of capture, cleavage, and release for this kind of system. To conclude this section, the local bias retained in the model has the positive effect of correctly reproducing the local hydrogen-bond network and the shape effects of the side chains, without precluding the possibility of exploring conformations very far from the reference one, thanks to the other, unbiased terms of the potential. In other words, this is a good compromise between completely biased models (Go, networks) and completely unbiased models. In addition, depending on the case, the bias can be made weaker or stronger by tuning the cutoff between the local and nonlocal parts of the potential and/or by biasing other terms of the potential. For instance, in Ref. 7 the bias is stronger, to increase the stability of the structural features of the system, the 70S bacterial ribosome. This is a huge system (about 9000 residues) that was simulated on the microsecond time scale with the one-bead approach, revealing the slow motions responsible for the translocation process (see Figure 19.5f). In Ref. 18, in a model for the nucleosome (about 1300 residues, see Figure 19.5e), the bias is intermediate, and a very accurate parameterization leads to a particularly good comparison between CG and all-atom simulations on the nanosecond time scale. This model is designed to simulate the nucleosome unwrapping that precedes the transcription, replication, or chromosome condensation phases [18].
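The biased Morse form of the nonbonded term described earlier in this section can be written down compactly. The sketch below is only an illustration of the functional forms quoted in the text; all numerical parameter values are hypothetical placeholders rather than the published parameterization, and the rule used to decide whether a pair is "local" (its reference distance lies below the cutoff) is an assumption, since the text does not state how the ∼8 Å cutoff is applied.

```python
import numpy as np

def morse(r, eps, alpha, r0):
    """Morse potential v_M(r) = eps * {[exp(-alpha (r - r0)) - 1]^2 - 1}."""
    return eps * ((np.exp(-alpha * (r - r0)) - 1.0) ** 2 - 1.0)

def local_well_depth(r0_ij, A, lam):
    """Well depth decaying exponentially with the reference distance r0_ij."""
    return A * np.exp(-lam * r0_ij)

def nonbonded_pair_energy(r, r0_ref, A, lam, alpha,
                          eps_nonloc, r0_nonloc, cutoff=8.0):
    """Local (biased) Morse term for pairs close in the reference structure,
    nonlocal (unbiased) Morse term otherwise. Distances in angstroms."""
    if r0_ref <= cutoff:
        return morse(r, local_well_depth(r0_ref, A, lam), alpha, r0_ref)
    return morse(r, eps_nonloc, alpha, r0_nonloc)

# Hypothetical parameter values, chosen only to exercise the functions.
A, lam, alpha = 5.0, 0.5, 0.7
e_contact = nonbonded_pair_energy(5.0, 5.0, A, lam, alpha,
                                  eps_nonloc=0.1, r0_nonloc=9.5)
e_distant = nonbonded_pair_energy(12.0, 12.0, A, lam, alpha,
                                  eps_nonloc=0.1, r0_nonloc=9.5)
print(e_contact, e_distant)
```

At its reference distance a local pair sits at the bottom of a well of depth $A\exp(-\lambda r_{0,ij})$, so pairs that are closer in the reference structure are held more strongly, while pairs beyond the cutoff feel only the unbiased term.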
19.4 CONCLUDING REMARKS
In this chapter we have described some of the most representative one-bead CG models available in the literature. We focused on the one-bead models because they have several advantages with respect to other CG models. Their resolution matches that of cryo-electron microscopy, a fortunate circumstance that allows CG modeling and experiments to work synergistically to give a realistic and accurate view of a system's structure and internal dynamics. Additionally, this level of coarsening allows simulation of the largest sizes and longest time scales while preserving the possibility of explicitly describing complex structural transitions. This requires a preliminary study of the mapping between the all-atom and one-bead internal variables describing the backbone conformation, so that the secondary structure information available in the Ramachandran map can also be used in the one-bead representation. Finally, but maybe most obviously, the one-bead models are the simplest to implement and the most "natural" from the point of view of the hierarchical structural organization of proteins, since the amino acid is the basic unit of proteins. The one-bead models were reviewed and classified according to the functional forms and parameterization philosophy of their force fields. These determine the accuracy and transferability-predictivity of the models, which are usually competing factors in these approaches. This is basically because, if one wants to preserve the intrinsic simplicity of the model, it is very difficult to capture the many complex interactions that occur between amino acids in relatively few parameters. However, an accurate and yet transferable and predictive one-bead force field is the ultimate goal, and some steps forward have been taken, as described in this chapter. In particular, we indicate possible ways to make the best use of the conformational term (bond angles and dihedrals) of the force field, to properly introduce solvent effects into the electrostatic term, and to treat intermediate-range polar and apolar interactions. However, the most critical issue is the parameterization of the short-range nonbonded interactions, which must include many highly specific physicochemical effects in a few parameters. Good and simple recipes for the parameterization of this term are not yet available in the general case, and work is in progress. However, less general, "intermediate" models were presented. These are not yet completely independent of some a priori knowledge of the system, yet they combine a high degree of accuracy with predictivity and, as shown in the chapter, have proven capable of simulating very slow processes (such as the HIV-1 protease substrate capture) and very large systems (nucleosomes and ribosomes).
The ultimate goal remains to reduce to zero, if possible, the necessary a priori knowledge of the system and to accurately predict structures and internal dynamics, a hard task that hopefully will stimulate researchers from different fields, including biochemistry, biophysics, bioinformatics, and mathematics.
ACKNOWLEDGMENTS Work in VT’s group is supported in part by “INFMCNR parallel computing initiative 2005–2006” and by IIT. VT also wishes to thank Karine Voltz and Joanna Trylska for useful discussions and for having provided material for figures. Work in JAM’s group is supported in part by the NIH, NSF, HHMI, CTBP, NBCR, and Accelrys.
20 Application of Residue-Based and Shape-Based Coarse-Graining to Biomolecular Simulations
Peter L. Freddolino and Amy Y. Shih
Center for Biophysics and Computational Biology, University of Illinois at Urbana-Champaign
Anton Arkhipov, Ying Yin, Zhongzhou Chen, and Klaus Schulten
Department of Physics, University of Illinois at Urbana-Champaign
CONTENTS
20.1 Introduction
20.2 Residue-Based Coarse-Graining
20.2.1 Interaction Potentials for Residue-Based CG
20.2.2 Reverse Coarse-Graining and Resolution Switching
20.2.3 Application to Nanodiscs and HDL
20.2.4 Application to the BAR Domain
20.3 Shape-Based Coarse-Graining
20.3.1 Selection of Bead Arrangement and Potentials
20.4 Application to Structural Dynamics of Viruses
20.4.1 Application to the Bacterial Flagellum
20.5 Future Applications of Coarse-Graining
References
20.1 INTRODUCTION
A vast array of problems currently addressed by computer simulation, including those concerning biological systems, involve the analysis of properties on long time and length scales derived from simulations on relatively short time and length scales [Katsoulakis, Majda, and Vlachos 2003]. Although such simulations can provide a great deal of insight into the processes under study, they are limited in scope by their computational cost, which imposes an upper limit on the time scale that can be studied (currently in the nanosecond range for biological systems [Sastry et al. 2005]). This limitation has led to the development of a wide variety of techniques that attempt to provide longer time and length scale information than traditional (usually atomistic) simulations, many of which fall into the category of coarse-graining. In the broadest possible sense, the term “coarse-graining” (CG) can be used to refer to any simulation technique in which a simulated
system is simplified by clustering several of its subcomponents into one component, thus reducing the computational complexity by removing both degrees of freedom and interactions from the system. The fundamental assumption behind such techniques is that by eliminating insignificant degrees of freedom, one will be able to obtain physically correct data on the properties of a system over longer time scales than would otherwise be achievable [Schütte et al. 1999]. A wide variety of CG methods for biological systems currently exist, ranging from united-atom models to elastic network models. We focus on the principles and applications of two classes of biological CG, namely residue-based and shape-based CG. Residue-based CG is a broad family of methods in which clusters of 10–20 covalently bonded atoms are represented by one bead; it is a fairly natural and common method for CG when a speedup of 1–2 orders of magnitude over all-atom simulations is required. Shape-based CG is a method recently developed in our group that uses a neural network algorithm to assign CG beads to domains of a protein, efficiently reproducing the shape of the protein with a minimal number of particles. Interactions between beads are then parameterized from all-atom simulations of the bead components. In this chapter we present a summary of both methods, along with exemplary applications of residue-based CG to two lipid–protein systems involving large-scale conformational changes, and of shape-based CG to the mechanical properties of multiprotein systems.
20.2 RESIDUE-BASED COARSE-GRAINING The most natural (and frequently used) method for coarse-graining a biological system is to assign sections of each biological molecule (or monomer, in the case of a biopolymer) with similar chemical properties and spatial location to a “bead,” and then treat the CG system as an ensemble of beads. This type of description is henceforth referred to as “residue-based coarse-graining.” For example, in one possible description of a protein, each amino acid residue is represented by two beads, one representing the backbone atoms and a second (different for each residue type) representing the sidechain atoms [Shih et al. 2006, 2007b]. While similar in principle to the united-atom models common in the early stages of molecular dynamics (MD) [Leach 1996], modern residue-based CG methods are generally geared toward much longer time scales, and are thus coarser. The strategy of making a cluster of connected heavy atoms the unit particle, rather than single atoms or heavy atoms, permits a longer timestep and thereby yields a larger reduction in computational effort than united-atom models, but carries a commensurate loss of detail. Recent interest in residue-based CG has emerged in the field of lipid simulations, where several groups have developed CG lipid models either by attempting to reconstruct the forces observed in all-atom MD [Shelley et al. 2001; Stevens, Hoh, and Woolf 2003; Stevens 2004; Nielsen et al. 2004; Nielsen and McCammon 2003] or by using a constructed potential with parameters tuned to match experimental thermodynamic data [Marrink and Mark 2002, 2003, 2004; Marrink, de Vries, and Mark 2004; Marrink, Risselada, and Mark 2005; Baron et al. 2007].
In both of these cases, the CG process maps approximately 10 atoms to one coarse-grained particle (“bead”), and the resulting CG models reproduce both the physical properties and (to the extent that they are experimentally known) the assembly mechanisms of bilayers, micelles, and other lipid aggregates on microsecond time scales. Similar efforts have recently been extended to proteins, including simulation of protein–lipid assemblies [Shih et al. 2006; Bond and Sansom 2006] and protein folding [Das, Matysiak, and Clementi 2005].
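The mapping from atoms to beads described above can be sketched in a few lines. This is a minimal illustration only, assuming that each bead sits at the center of mass of its atom cluster and carries the cluster's total mass; the function name `map_to_beads` and the toy coordinates are hypothetical and not taken from any particular CG package.

```python
def map_to_beads(coords, masses, bead_of_atom):
    """Return {bead_id: (center_of_mass, total_mass)} for each atom cluster."""
    sums = {}
    for (x, y, z), m, bead in zip(coords, masses, bead_of_atom):
        sx, sy, sz, sm = sums.get(bead, (0.0, 0.0, 0.0, 0.0))
        # Accumulate mass-weighted coordinates and the total mass per bead.
        sums[bead] = (sx + m * x, sy + m * y, sz + m * z, sm + m)
    return {
        bead: ((sx / sm, sy / sm, sz / sm), sm)
        for bead, (sx, sy, sz, sm) in sums.items()
    }


# Toy residue (hypothetical numbers): two backbone atoms, two sidechain atoms.
coords = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (5.0, 0.0, 0.0), (5.0, 4.0, 0.0)]
masses = [12.0, 12.0, 12.0, 4.0]
bead_of_atom = ["backbone", "backbone", "sidechain", "sidechain"]

beads = map_to_beads(coords, masses, bead_of_atom)
```

In this toy case the residue is reduced from four atoms to two beads, in the spirit of the backbone-plus-sidechain description mentioned above; a realistic mapping would assign roughly 10 atoms to each bead.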
20.2.1 INTERACTION POTENTIALS FOR RESIDUE-BASED CG In the broadest sense, the force fields used in residue-based CG models fall into one of two categories, being derived either phenomenologically or through MD-based parameterization. The former approach, exemplified by the lipid–water force fields of Marrink and coworkers [Marrink and Mark 2003, 2004; Marrink, de Vries, and Mark 2004; Marrink, Risselada, and Mark 2005]
and by the more recent MARTINI force field [Marrink et al. 2007], involves partitioning clusters of atoms into abstract “types” based on their physical properties (for example, polarity and ability to hydrogen bond); the interactions between beads are then parameterized to reproduce experimental data such as partition energies [Marrink, de Vries, and Mark 2004]. The latter approach is a direct analogue of the parameterization of all-atom MD models from quantum mechanical calculations; here, all-atom simulations are performed on some system including the CG beads whose interactions are to be parameterized, and the results are used to construct an effective potential between the beads. Both approaches have been successfully applied to a number of systems, but potentials derived from all-atom MD simulations carry the added benefit of improved miscibility of all-atom and CG components, which is likely to become increasingly important as mixed all-atom/CG simulations [Shi, Izvekov, and Voth 2006; Praprotnik, Site, and Kremer 2005, 2006; Lyman, Ytreberg, and Zuckerman 2006] become more common. MD-based parameterization can be carried out in a variety of ways, depending on the scope and intended use of the parameter set in question. Given an all-atom simulation including the components whose interactions are to be parameterized, an effective interaction potential between CG beads can be constructed by attempting to match the forces present between the beads in the all-atom description as a function of distance [Izvekov and Voth 2005a, 2005b, 2006; Shi, Izvekov, and Voth 2006] or through a process such as Boltzmann inversion [Reith, Pütz, and Müller-Plathe 2003; Tozzini and McCammon 2005], which is described in more detail in the following sections. Note that although the example given below is for shape-based CG, the same techniques can be applied to determine interactions for residue-based CG models.
For both MD-based and phenomenological parameterization, the resulting potentials may either be fitted to an existing potential form (for example, the Lennard–Jones potential for nonbonded interactions) or used directly (for example, in the form of an energy/force lookup table). While making use of an existing potential form has long been preferred because it allows the use of existing MD packages without further modification, the use of tabulated potentials allows more control over the exact potential form being used, and is increasingly supported in common MD packages such as DL_POLY and NAMD.
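As a rough illustration of the Boltzmann inversion idea mentioned above, the sketch below converts a histogram of some sampled coordinate (for example, a bead–bead distance collected from an all-atom trajectory) into a tabulated potential via U = −k_BT ln P. It shows only the simplest, non-iterative form and is not the specific procedure of the cited works; the function name, the 300 K temperature, and the synthetic harmonic test data are assumptions made for the example.

```python
import math

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def boltzmann_invert(hist, temperature=300.0):
    """Turn a histogram of a sampled coordinate into a tabulated potential.

    The potential is shifted so that its minimum (the most populated bin)
    is zero. Empty bins yield None because the potential is undefined there.
    """
    kt = KB * temperature
    peak = max(hist)
    return [(-kt * math.log(h / peak) if h > 0 else None) for h in hist]


# Synthetic check (assumed data): populations generated from a harmonic
# potential should be mapped back onto that same potential.
kt = KB * 300.0
k_spring, x0 = 10.0, 4.0                       # kcal/(mol A^2), A
xs = [3.0 + 0.1 * i for i in range(21)]        # bin centers in A
true_u = [0.5 * k_spring * (x - x0) ** 2 for x in xs]
hist = [math.exp(-u / kt) for u in true_u]     # ideal Boltzmann populations
recovered = boltzmann_invert(hist)
```

The resulting list is already in the form of an energy lookup table, so it could either be fitted to an analytic form or passed, after suitable formatting, to an MD package that accepts tabulated potentials.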
20.2.2 REVERSE COARSE-GRAINING AND RESOLUTION SWITCHING Coarse-grained MD simulations have proven quite useful for obtaining data on the behavior of systems where the relevant time or length scales (or both) are inaccessible to all-atom MD. However, even heavier use of CG simulations could be made if CG could be used as an accelerator, with atomic detail either maintained in regions of interest or recoverable from snapshots in the CG trajectory. Recent progress has been made along both these fronts, in the form of mixed CG/all-atom simulations [Shi, Izvekov, and Voth 2006] and simulations involving dynamic switching of components between CG and all-atom descriptions [Praprotnik, Site, and Kremer 2006; Lyman, Ytreberg, and Zuckerman 2006]. The primary new challenges faced in either of these cases lie in deriving accurate potentials for interactions between CG and all-atom components, and in effectively mapping CG conformations to all-atom conformations. The latter challenge is particularly significant both because any given conformation of CG particles can be taken to represent an ensemble of conformations of the corresponding all-atom system (any set of states where the centers of mass of the component atoms for each bead correspond to the CG bead positions), and because switching to the all-atom system will almost certainly cause a change in the energy of the system due to the introduction of new interactions. Early efforts in switching of scales have focused on building a method allowing true mixed-scale dynamics, either by allowing particles to transition between all-atom and CG representations while passing through a specific region in space [Praprotnik, Site, and Kremer 2006] or by allowing exchange between low-resolution and high-resolution replicas of a system being simulated in parallel [Lyman, Ytreberg, and Zuckerman 2006]. Outgrowths of these methods will likely be quite useful in the future, although both face the difficulty that deterministically mapping a given CG
conformation to an all-atom conformation may be insufficient for more complex beads (such as beads representing an amino acid sidechain or a significant fraction thereof) and that the free-energy discontinuities experienced during scale-switching may become prohibitively high if a poor initial all-atom conformation is chosen during exchange. In some cases where a CG model is used to accelerate sampling, there is no need to repeatedly switch between CG and all-atom descriptions; it is sufficient to sample the conformational space of the system using the CG model and then analyze the results in terms of a consistent all-atom model. This is the case, for example, in the studies of nanodiscs presented below, where all-atom conformations had to be extracted from various snapshots of the CG simulation for comparison with experimental data. In this case, it proved sufficient to reverse coarse-grain the system by superimposing the all-atom components of the system on the CG structure such that the center of mass of each cluster of atoms is located on the corresponding CG bead, and then minimizing and annealing the resulting all-atom structure with the center of mass of each atom cluster constrained to the bead location. This can be conceptually interpreted as sampling the conformational space of the all-atom structure in the region consistent with the CG structure being converted. While this method is far too time-consuming to use when rapid switching of all-atom and CG representations is desired, and does not preserve the dynamic or thermodynamic properties of the CG system, it is sufficient for recovering an all-atom snapshot from a CG simulation, and some conformational sampling scheme similar to that used here is likely to become necessary in resolution exchange for cases where mapping the CG conformation to an all-atom conformation is nontrivial.
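The first stage of the reverse coarse-graining procedure described above, placing each atom cluster so that its center of mass coincides with its bead, can be sketched as follows. This is only an illustrative fragment under simplifying assumptions: each cluster is translated rigidly from a template geometry, and the subsequent minimization and annealing with center-of-mass constraints, which would be carried out in an MD package, are not shown. The function names and coordinates are hypothetical.

```python
def center_of_mass(coords, masses):
    """Mass-weighted mean position of a set of atoms."""
    total = sum(masses)
    return tuple(
        sum(m * c[k] for c, m in zip(coords, masses)) / total for k in range(3)
    )


def place_cluster_on_bead(template_coords, masses, bead_position):
    """Rigidly translate a template cluster so its center of mass lies on the bead."""
    com = center_of_mass(template_coords, masses)
    shift = tuple(b - c for b, c in zip(bead_position, com))
    return [tuple(x + s for x, s in zip(atom, shift)) for atom in template_coords]


# Toy cluster (hypothetical geometry) mapped onto a bead at (10, 10, 10).
template = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
masses = [1.0, 1.0, 2.0]
bead = (10.0, 10.0, 10.0)

placed = place_cluster_on_bead(template, masses, bead)
```

Because only a translation is applied, the internal geometry of the template is unchanged; the relaxation stage is what allows the cluster orientations and internal coordinates to adapt to their neighbors while remaining consistent with the CG snapshot.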
20.2.3 APPLICATION TO NANODISCS AND HDL High-density lipoproteins (HDL) are lipid–protein particles that function in the body to remove cholesterol from peripheral tissues and return it to the liver for processing. These particles, which occur in a wide variety of shapes and sizes in vivo, are known to play an important role in protecting the body from heart disease [Wang and Briggs 2004]. HDL particles are known to be composed of a disc-shaped patch of membrane enclosed by two or more copies of apolipoprotein A-I (Apo A-I). In addition to their medical importance, a truncated form of the protein component of HDL particles has recently been used to assemble homogeneous protein–lipid particles known as nanodiscs [Bayburt, Grinkova, and Sligar 2002; Sligar 2003], which can incorporate membrane proteins and thus be used to study them in an environment more realistic than micelles or liposomes [Seddon, Curnow, and Booth 2004; Davydov et al. 2005; Baas, Denisov, and Sligar 2004; Duan et al. 2004; Civjan et al. 2003; Boldog et al. 2006; Shih et al. 2005]. The conditions needed to cause nanodiscs to assemble around a protein, however, depend strongly on the protein itself, and different conditions are required to efficiently incorporate different proteins [Denisov et al. 2004; Bayburt, Grinkova, and Sligar 2006; Boldog et al. 2006]. Obtaining information on the structure and assembly of nanodiscs would thus be useful in the rational design of nanodisc assembly protocols, and would additionally provide data on HDL assembly and characteristics. Unfortunately, no high-resolution structure has been obtained for a complete HDL particle or nanodisc, although a consensus double-belt model is emerging for the general layout of the proteins and lipids in the particle [Koppaka et al. 1999; Panagotopulos et al. 2001; Li et al. 2000; Tricerri et al. 2001; Silva et al. 2005; Li et al. 2006; Gorshkova et al. 2006].
Moreover, nanodisc assembly takes place on a time scale of microseconds to milliseconds, far longer than can be treated using all-atom MD simulations. The type of data sought (relatively coarse data on important stages of nanodisc assembly and the factors affecting it) is in principle appropriate for a residue-based CG model. In addition, the fact that hydrophobic interactions and the properties of a lipid patch are the primary features likely to drive the simulation meant that the bulk of the force field in this case could be taken from the lipid–water model of Marrink and coworkers [Marrink, de Vries, and Mark 2004], a phenomenological model that had shown excellent results for the assembly and physical properties of micelles and bilayers. For the
protein component of the system, the bead types of Marrink’s force field were assigned to protein components according to their properties, with each amino acid residue represented by a backbone bead (the same type for each residue) and a sidechain bead [Shih et al. 2006]. A very similar model was proposed by Bond and coworkers in their simulations of the bacterial membrane protein OmpA [Bond and Sansom 2006]. The use of a CG model for the nanodisc provides a factor of 500 speedup compared with all-atom simulations, due to the use of 50 fs timesteps and the reduction in the number of particles by a factor of 10 [Shih et al. 2006]. Simulation of the components of a single nanodisc beginning from a random mixture with water, over a period of 10 μs, revealed a complete pathway for the assembly of nanodiscs from their components, as shown in Figure 20.1. Further simulations from other starting points showed similar assembly pathways and mechanisms [Shih et al. 2006, 2007a, 2007b]. Analysis of the energetics of assembly illustrated that it occurs as a three-step process. First, nucleation of assembly occurs as the lipids form pseudomicelles, which are roughly spherical in shape; at this point, the hydrophobic face of the Apo A-I proteins (each of which contains a set of amphipathic α-helices) binds to the pseudomicelle in a random conformation. After this initial aggregation, the proteins reorient along the surface to bring themselves into more favorable contact with each other, eventually forming a series of salt bridges that force the double-belt orientation to form. Although no high-resolution structural data on formed nanodiscs or HDL are available, the assembly mechanism and final structure obtained from CG simulations could still be compared to low-resolution information from SAXS studies [Shih et al. 2007c]. Theoretical SAXS curves can
FIGURE 20.1 (i) Snapshots from an assembly simulation in which 160 DPPC lipids and two Apo A-I proteins were assembled from a random mixture over 10 μs. CG water is present in all cases but omitted from the images for clarity. (ii) Comparison of SAXS curves for experimental results on DPPC nanodiscs (a) and DMPC nanodiscs (b), an ideal all-atom model of a double-belt nanodisc (c), and the final structure from a 10 μs CG assembly simulation (d). Note that the curves are separated vertically for clarity. (iii) Example of a CG conformation (left) mapped onto a corresponding all-atom conformation (right).
be calculated from an all-atom structure using the program CRYSOL [Svergun, Barberato, and Koch 1995]; however, obtaining a SAXS curve from CG simulations first requires reverse CG of the CG snapshots. Because there was no need to continue the simulations significantly after reverse CG in this case, a fairly simple scheme was used, in which the centers of mass of the all-atom components of each bead were aligned with this bead, and the system was then annealed with the center of mass of the components of each bead constrained, allowing the structure to relax while remaining consistent with the CG snapshot. A comparison of the SAXS curve obtained from the assembled CG nanodisc with experimental results is shown in panel (ii) of Figure 20.1, and a time course of the SAXS curve observed during the CG assembly process in panel (iii) of Figure 20.1. The excellent agreement between experimental and theoretical results illustrates both the success of the CG model in reproducing the nanodisc assembly process and structure, and the utility of even fairly simple reverse CG methods.
20.2.4 APPLICATION TO THE BAR DOMAIN BAR domains constitute a ubiquitous type of protein, found in many organisms, that drives the formation of tubulated and vesiculated membrane structures inside cells [Sakamuro et al. 1996]. BAR domains contain a conserved protein motif and are involved in a variety of cellular processes including fission of synaptic vesicles, endocytosis, and apoptosis [Ren et al. 2006]. Structurally, BAR domains form crescent-shaped dimers (see Figure 20.2) with a high density of positively charged residues on their concave face. The shape and charge distribution suggest that BAR domains induce membrane curvature by binding to negatively charged lipids [Peter et al. 2004; Blood and Voth 2006]. However, the common molecular mechanism underlying membrane sculpting by BAR domains remains largely unknown. Recently, all-atom simulations [Blood and Voth 2006] have demonstrated that a single BAR domain induces membrane curvature. The all-atom study required simulation of up to 700,000 atoms on time scales of 50 ns. The next pressing question, following the discovery of membrane bending by a single BAR domain, is how multiple BAR domains work together to bend membranes. All-atom simulations of this process are too challenging at present, since one would have to consider millions of atoms in each simulation. However, the residue-based CG method appears to be
FIGURE 20.2 Membrane curvature induced by BAR domains. Upper panel: top view of the initial arrangement (four periodic cells along the vertical axis); lower panel: side view after 50 ns.
a good option for this application, and we have therefore performed CG simulations of systems with multiple BAR domains in order to determine how their cooperative interaction with the membrane induces global membrane curvature. The residue-based CG model [Shih et al. 2006, 2007b] described above is well suited to describing membrane remodeling by BAR domains, since it has already demonstrated its power in tasks where lipids assemble, disassemble, and reshape membranes [Shih et al. 2006, 2007b; Marrink, de Vries, and Mark 2004]. The only difficulty is that the residue-based protein CG model has not been developed to work for proteins of arbitrary shapes. In particular, the model has not been designed to maintain the tertiary structure of proteins, which is determined not only by the protection of hydrophobic side groups in the protein amino acid sequence from solvent (well described by the residue-based CG model), but also, to a large extent, by atomic-level interactions that the residue-based CG model does not capture. Indeed, when the model was applied to the BAR domain, the tertiary structure was not preserved. Accordingly, we added harmonic bonds and angles connecting protein beads in order to conserve protein shape and flexibility. A minimal set of bonds and angles was selected for this purpose. The strength of these bonds and angles was chosen to reproduce the flexibility of the tertiary structure as observed in the all-atom simulations. As a result, the protein was not heavily constrained, but the tertiary structure (the BAR domain’s crescent shape) was maintained well. This feature has been implemented through a NAMD [Phillips et al. 2005] functionality that allows one to add extra bonded interactions to simulations. In our previous residue-based CG simulations [Shih et al. 2006, 2007b; Marrink, de Vries, and Mark 2004], a relative dielectric constant ε of 20 was employed. In the case of the BAR domain simulations we chose ε = 1.
Such a low value of ε is necessary for membrane curvature to be induced by BAR domains; the curvature is driven by short-range electrostatics, as charged groups on the protein’s concave surface interact at close range with charged lipid heads. Interactions at larger distances should be screened by water, requiring, in principle, higher values of ε. However, the electrostatic interactions at large distances appear to be relatively weak in the present case, such that ε = 1 has no adverse effect on long-range electrostatics in the BAR domain simulations. The rather rough CG model of the BAR domain and lipid membrane described above has been applied to study the behavior of multiple BAR domains [Arkhipov, Yin, and Schulten 2008], as shown in Figure 20.2. The all-atom simulations with a single BAR domain [Blood and Voth 2006], from other groups as well as our own, have been reproduced well by the residue-based CG simulations (not shown), in terms of both membrane curvature and protein structure. Six BAR domains interacting with a patch of membrane were then simulated. Two rows of three BAR domains each were placed in parallel (shifted with respect to each other) on top of a planar membrane composed of electrostatically neutral DOPC lipids mixed with negatively charged DOPS lipids (30% DOPS). The BAR domains produced a global bending mode [Arkhipov, Yin, and Schulten 2008], exhibiting a radius of curvature of 30 nm within 50 ns (comparable to experimental values for the curvature [Peter et al. 2004]). This result suggests how BAR domains quickly generate membrane curvature, as possibly occurs in cells during the formation of subcellular membrane structures [Ren et al. 2006].
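The effect of the choice of ε on short-range electrostatics can be made concrete with a small calculation. The sketch below evaluates a plain Coulomb pair energy with a uniform relative dielectric constant; the conversion factor of approximately 332.06 kcal Å/(mol e²) is the one conventionally used in biomolecular force fields, while the unit charges and the 5 Å separation are assumed values chosen purely for illustration.

```python
COULOMB = 332.0636  # approximate Coulomb constant in kcal*A/(mol*e^2)


def coulomb_energy(q1, q2, r, epsilon):
    """Coulomb energy (kcal/mol) of two point charges (in e) at distance r (in A)."""
    return COULOMB * q1 * q2 / (epsilon * r)


# Hypothetical contact between a positively charged protein bead and a
# negatively charged lipid head bead, 5 A apart.
e_unscreened = coulomb_energy(+1.0, -1.0, 5.0, epsilon=1.0)
e_screened = coulomb_energy(+1.0, -1.0, 5.0, epsilon=20.0)
ratio = e_unscreened / e_screened
```

For this pair the attraction is about −66.4 kcal/mol with ε = 1 and about −3.3 kcal/mol with ε = 20, a factor of 20, which gives a sense of how strongly the dielectric constant controls close-range protein–lipid attraction in such a model.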
20.3 SHAPE-BASED COARSE-GRAINING The shape-based CG method [Arkhipov, Freddolino, and Schulten 2006; Arkhipov et al. 2006] offers a higher degree of coarse-graining than the residue-based method, but at the price that the biopolymers described are restricted in their motion to elastic vibration around a given shape. The method is available through the molecular visualization software VMD [Humphrey, Dalke, and Schulten 1996].
20.3.1 SELECTION OF BEAD ARRANGEMENT AND POTENTIALS Biomolecules, and proteins in particular, assume a variety of shapes, often featuring both compact domains and elongated tails, the compact regions and tails often being equally important. To our
knowledge, all existing CG methods assign CG beads to represent a fixed group of atoms, but this is not efficient for the coarse-graining of molecules with complex shapes, because with such an approach either the tails are misrepresented or too many CG beads are used for the compact domains. With shape-based CG, one addresses the task of representing shapes with as few CG beads as possible by so-called topology-conserving maps [Martinetz and Schulten 1994]. Consider a molecule consisting of $N_a$ atoms with coordinates $\mathbf{r}_n$ and masses $m_n$, $n = 1, 2, \ldots, N_a$. One seeks to reproduce the shape of the molecule with $N$ CG beads. The mass distribution $p_n = m_n/M$ ($M = \sum_n m_n$) is used as a target probability distribution for the evolving map. The CG beads are assigned their initial positions randomly; then, the beads are considered as nodes of a network [Martinetz and Schulten 1994], on which $S$ adaptation steps are performed. At each step the following procedures are carried out. First, the $n$th atom is chosen randomly, according to the probability distribution $p_n$; its coordinates $\mathbf{v} = \mathbf{r}_n$ are used to adapt the neural network (see Equation 20.1). Second, for each CG bead $i$ ($i = 1, 2, \ldots, N$), one determines the number $k_i$ of CG beads $j$ obeying the condition $|\mathbf{v} - \mathbf{R}_j| < |\mathbf{v} - \mathbf{R}_i|$, where $\mathbf{R}_j$ is the position of the $j$th bead. Third, positions of the beads are updated ($i = 1, 2, \ldots, N$), according to the rule

$$\mathbf{R}_i^{\mathrm{new}} = \mathbf{R}_i^{\mathrm{old}} + \xi e^{-k_i/\lambda}\left(\mathbf{v} - \mathbf{R}_i^{\mathrm{old}}\right). \tag{20.1}$$
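The adaptation rule of Equation 20.1 can be sketched in a few lines of Python. This is a minimal illustration, not the VMD plugin's implementation: the function names `place_beads` and `assign_domains` are hypothetical, the initial bead positions are assumed to be drawn uniformly inside the bounding box of the atoms, and the schedule for ξ and λ and the nearest-bead domain assignment follow the description given in the text immediately after this equation.

```python
import numpy as np

def place_beads(coords, masses, n_beads, steps_per_bead=200, seed=0):
    """Place n_beads CG beads on an all-atom structure using Equation 20.1."""
    rng = np.random.default_rng(seed)
    coords = np.asarray(coords, dtype=float)
    p = np.asarray(masses, dtype=float)
    p = p / p.sum()                          # target distribution p_n = m_n / M
    n_steps = steps_per_bead * n_beads       # S = 200 N
    # Assumption: random initial positions inside the atoms' bounding box.
    lo, hi = coords.min(axis=0), coords.max(axis=0)
    beads = rng.uniform(lo, hi, size=(n_beads, coords.shape[1]))
    lam0, lam_end = 0.2 * n_beads, 0.01      # lambda_0 and lambda_S
    xi0, xi_end = 0.3, 0.05                  # xi_0 and xi_S
    picks = rng.choice(len(coords), size=n_steps, p=p)
    for s, n in enumerate(picks, start=1):
        frac = s / n_steps
        lam = lam0 * (lam_end / lam0) ** frac   # f_s = f_0 (f_S / f_0)^(s/S)
        xi = xi0 * (xi_end / xi0) ** frac
        v = coords[n]                           # randomly chosen atom position
        dist = np.linalg.norm(beads - v, axis=1)
        k = np.argsort(np.argsort(dist))        # k_i: number of beads closer to v than bead i
        beads += (xi * np.exp(-k / lam))[:, None] * (v - beads)
    return beads

def assign_domains(coords, masses, charges, beads):
    """Assign each atom to its nearest bead and sum mass and charge per bead."""
    coords = np.asarray(coords, dtype=float)
    d = np.linalg.norm(coords[:, None, :] - beads[None, :, :], axis=2)
    owner = d.argmin(axis=1)
    bead_mass = np.bincount(owner, weights=masses, minlength=len(beads))
    bead_charge = np.bincount(owner, weights=charges, minlength=len(beads))
    return owner, bead_mass, bead_charge
```

Because every bead is moved at every step, with a weight that decays with its distance rank, beads that start far from the molecule are still drawn toward it early on, while late in the schedule (small λ) essentially only the closest bead responds.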
Parameters $\xi$ and $\lambda$ are adapted at each step according to the functional form $f_s = f_0 (f_S/f_0)^{s/S}$ (with $f$ standing for either $\xi$ or $\lambda$), where $s$ is the current step, $\lambda_0 = 0.2N$, $\lambda_S = 0.01$, $\xi_0 = 0.3$, and $\xi_S = 0.05$. We use $S = 200N$; typical adaptation steps are shown in Figure 20.3. Once beads are placed, an all-atom “domain” is found for each bead (the domain includes all atoms closer to this bead than to any other bead). The total mass and charge of a domain are assigned to the respective bead. Since the shape of a molecule is reproduced by this
FIGURE 20.3 Shape-based coarse-graining algorithm assigning CG beads. The CG beads (spheres) are the nodes of the network; their positions are updated throughout the learning steps (3400 steps for 17 beads in this example). As a result, the shape of a protein (here, the capsid unit protein of the brome mosaic virus) is reproduced with a small number of beads (chosen prior to starting the algorithm). After the assignment has converged, the beads are connected by bonds. The algorithm is of the neural network type described in Martinetz and Schulten (1994).
CG model, the method is termed shape-based CG. The molecular graphics program VMD [Humphrey, Dalke, and Schulten 1996], through its shape-based CG plugin, can also build CG models from volumetric data, such as density maps obtained from cryo-electron microscopy. Currently, two ways of establishing bonds between CG beads are implemented. In one case, a bond is established if the distance between two beads is below a cutoff distance (chosen by the researcher). Another possibility is to establish a bond between two CG beads if their respective all-atom domains are connected by the protein or nucleic acid backbone trace; in the latter case, the topology of the molecular polymeric chain is reproduced better. Interactions between beads are described by a CHARMM-like force field [MacKerell et al. 1998]; that is, bonded interactions are represented by harmonic bond and angle potentials (no dihedral potentials). The nonbonded potentials include 6–12 Lennard-Jones (LJ) and Coulomb terms:

$$V = \sum_{\text{bonds } i} \frac{K_i}{2}\left(R_i - L_i\right)^2 + \sum_{\text{angles } k} \frac{M_k}{2}\left(\theta_k - \Theta_k\right)^2 + \sum_{m,n} 4E_{mn}\left[\left(\frac{\sigma_{mn}}{r_{mn}}\right)^{12} - \left(\frac{\sigma_{mn}}{r_{mn}}\right)^{6}\right] + \sum_{m,n} \frac{q_m q_n}{4\pi\varepsilon\varepsilon_0 r_{mn}}, \tag{20.2}$$
where $R_i$ and $\theta_k$ are the distance and angle for bond $i$ and angle $k$, $K_i$ and $M_k$ are the force constants, $L_i$ and $\Theta_k$ are the equilibrium bond length and angle; $r_{mn}$ is the distance between beads $m$ and $n$, $E_{mn}$ and $\sigma_{mn}$ are the LJ parameters, $q_m$ is the charge of the $m$th bead, and the sum over $m$ and $n$ runs over all pairs of CG beads. The constant $\varepsilon_0$ is the vacuum dielectric permittivity; $\varepsilon$ is a relative dielectric constant. Bonded parameters $K_i$, $L_i$, etc., can be extracted from all-atom MD simulations of the considered system. For each CG bond and angle, one follows the distances (and angles) between the centers of mass of the corresponding atomic domains; CG force-field parameters are chosen so that in the CG simulation of a protein unit, the mean distances (angles) and respective root mean square deviations (rmsd) reproduce those found in an all-atom simulation. This procedure can be illustrated by the simple example of a one-dimensional harmonic oscillator, with a particle moving along the $x$ coordinate in the potential $V(x) = f(x - x_0)^2/2$. With the system in equilibrium at temperature $T$, the average position $\langle x \rangle$ is equal to $x_0$, and the rmsd is given by $(k_B T/f)^{1/2}$ ($k_B$ is the Boltzmann constant). Using an MD simulation, one can compute $\langle x \rangle$ and the rmsd, thus obtaining $x_0$ and $f$. In all-atom simulations, the LJ radius $\sigma_{mn}$ for a pair $m,n$ is usually approximated by $\sigma_{mn} = (\sigma_m + \sigma_n)/2$, where $\sigma_m$ is the LJ radius of the $m$th atom. We use the same approach for CG beads; $\sigma_m$ for the $m$th bead is calculated as the radius of gyration of its all-atom domain, increased by 2 Å (an average LJ radius of an atom in the CHARMM force field). The LJ well depth $E_{mn}$ is set to a uniform value for all pairs $mn$; usually, we used $E_{mn} = 4$ kcal/mol. This choice for $\sigma_{mn}$ and $E_{mn}$ was supported by all-atom simulations of pairs of protein segments of about 500 atoms each (roughly representing a single CG bead in one of our applications). Several such simulations were performed, for about 10 ns each.
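The parameterization recipes in this paragraph can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the authors' code: the function names are hypothetical, energies are assumed to be in kcal/mol and lengths in Å, the radius of gyration is assumed to be mass-weighted, and `boltzmann_inversion` is a histogram-based generalization of the harmonic case that uses the relation V(x) = −k_B T ln ρ(x) + const discussed in the next paragraph.

```python
import numpy as np

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def fit_harmonic(samples, temperature):
    """Return (x0, f) of V(x) = f (x - x0)^2 / 2 from equilibrium samples."""
    samples = np.asarray(samples, dtype=float)
    x0 = samples.mean()                    # <x> = x0
    rmsd = samples.std()                   # rmsd = (kB T / f)^(1/2)
    f = KB * temperature / rmsd**2
    return x0, f

def boltzmann_inversion(samples, temperature, bins=50):
    """Return bin centers and V = -kB T ln(rho), shifted so that min(V) = 0."""
    hist, edges = np.histogram(samples, bins=bins, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    mask = hist > 0                        # skip empty bins, where ln(rho) diverges
    v = -KB * temperature * np.log(hist[mask])
    return centers[mask], v - v.min()

def lj_sigma(domain_coords, domain_masses, pad=2.0):
    """LJ radius of a bead: radius of gyration of its domain plus 2 A."""
    c = np.asarray(domain_coords, dtype=float)
    m = np.asarray(domain_masses, dtype=float)
    com = np.average(c, axis=0, weights=m)
    rg = np.sqrt(np.average(((c - com) ** 2).sum(axis=1), weights=m))
    return rg + pad

def lj_sigma_pair(sigma_m, sigma_n):
    """Combination rule sigma_mn = (sigma_m + sigma_n) / 2."""
    return 0.5 * (sigma_m + sigma_n)
```

For a bond, `samples` would be the time series of the distance between the centers of mass of two atomic domains taken from the all-atom trajectory; the returned `x0` and `f` then play the roles of $L_i$ and $K_i$ in Equation 20.2.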
The effective potential of interaction between two segments was obtained for every pair using the Boltzmann inversion method [Reith, Pütz, and Müller-Plathe 2003; Tozzini and McCammon 2005]: assuming that the distribution of the distance $x$ between the segments is given by $\rho(x) \propto e^{-V(x)/k_B T}$, where $V(x)$ is the potential, one computes $\rho(x)$ from the simulation and finds the potential as $V(x) = -k_B T \ln[\rho(x)] + \text{const}$. The potentials computed from all-atom simulations were similar to an LJ potential in shape, and for each pair the well depth was about 4 kcal/mol; the LJ radius was well represented using the procedure (radius of gyration + 2 Å) described above [Arkhipov et al. 2006]. The effect of the solvent is modeled implicitly, by reproducing three basic features of water, namely, viscosity, fluctuations due to Brownian motion, and dielectric permittivity. The relative dielectric constant $\varepsilon$ is set to 80 everywhere (the experimental value for liquid water). Frictional
and fluctuating forces are introduced through the Langevin equation that describes the time evolution of the CG system for each bead:

$$m\frac{\partial^2 \mathbf{r}}{\partial t^2} = \mathbf{F} - m\gamma\frac{\partial \mathbf{r}}{\partial t} + \chi\psi(t). \tag{20.3}$$
Here, $\mathbf{r}$ is the position of the bead, $\mathbf{F}$ is the force acting on the bead from other beads in the system, $\gamma$ is a damping coefficient, $\psi(t)$ is a univariate Gaussian random process, and $\chi$ is related to the frictional forces through the fluctuation-dissipation theorem, $\chi = (2\gamma k_B T/m)^{1/2}$, with $m$ being the bead’s mass. With $\mathbf{F} = 0$, Equation 20.3 describes free diffusion, where $\gamma$ is related to the diffusion constant $D$ by $D = k_B T/(m\gamma)$. In principle, $\gamma$ can be computed from all-atom simulations by calculating $D$ for the molecule under study (although the force fields used in such simulations might not be good enough to reproduce the water viscosity), but a much better approach is to use an experimental value of $D$ if available, for example, $D$ for a molecule of similar size. In contrast to the extraction of $D$ from all-atom simulations, which is often difficult due to insufficient sampling, $\gamma$ can be easily tuned in CG simulations to give the appropriate value of $D$ for a given molecule, since one achieves sampling of the center-of-mass displacements much faster in CG simulations than in all-atom simulations. Based on estimates from the all-atom simulations and experimental data for various proteins, the appropriate values of $\gamma$ for 500 atoms per CG bead should be in the range 3–15 ps$^{-1}$. The dynamics of the CG system is realized through MD simulations using NAMD [Phillips et al. 2005]. For the case of 500 atoms per CG bead, coarse-graining allows one to simulate systems 500 times larger than possible in all-atom representation. As water often accounts for 80% of atoms in biomolecular simulations, and since the solvent is treated implicitly, the real gain is even higher, typically 2000–3000 times. Due to slower motions of CG beads in comparison with atoms, one can use a timestep of 500 fs to integrate the equations of motion, instead of the 1 fs timestep common for all-atom simulations.
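As a rough check of these factors, suppose a fraction $w \approx 0.8$ of all atoms is water and the remaining solute atoms are grouped 500 to a bead. The symbols $N_{\text{atoms}}$, $N_{\text{beads}}$, and $w$ are introduced here only for this estimate, which assumes that all water is removed by the implicit-solvent treatment:

```latex
\[
N_{\text{beads}} = \frac{(1 - w)\,N_{\text{atoms}}}{500}
\quad\Longrightarrow\quad
\frac{N_{\text{atoms}}}{N_{\text{beads}}} = \frac{500}{1 - w} = \frac{500}{0.2} = 2500,
\qquad
\frac{\Delta t_{\text{CG}}}{\Delta t_{\text{all-atom}}} = \frac{500~\text{fs}}{1~\text{fs}} = 500 .
\]
```

The first ratio falls inside the quoted 2000–3000 range; water fractions somewhat below or above 80% shift it toward either end of that range.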
As a result, shape-based CG with a typical ratio of 500 atoms per bead allows one to simulate the dynamics of micrometer-sized objects on time scales of 100 μs using just one to three processors, while all-atom simulations even with 1000 processors are currently limited to ∼20 nm in size and 100 ns in time. Of course, this gain comes at the price of limited resolution.
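The free-diffusion limit of Equation 20.3 can be checked numerically. The sketch below is a minimal one-dimensional Euler-Maruyama integrator in reduced units; it is not NAMD's integrator, and the function names, the timestep, and the particle count are assumptions chosen for illustration. The equation is written per unit mass, so that the noise enters the velocity update with amplitude $(2\gamma k_B T/m)^{1/2}$, and the measured mean-square displacement is compared with $D = k_B T/(m\gamma)$.

```python
import numpy as np

def langevin_step(x, v, force, mass, gamma, kT, dt, rng):
    """One Euler-Maruyama step of the Langevin equation, written per unit mass."""
    chi = np.sqrt(2.0 * gamma * kT / mass)   # noise amplitude (2 gamma kB T / m)^(1/2)
    noise = rng.standard_normal(x.shape)     # unit-variance Gaussian increments
    v = v + (force / mass - gamma * v) * dt + chi * np.sqrt(dt) * noise
    x = x + v * dt
    return x, v

def estimate_diffusion(gamma, kT, mass, dt=0.01, n_steps=2000,
                       n_particles=5000, seed=0):
    """Simulate free diffusion (F = 0) and return (measured D, kT / (m gamma))."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n_particles)
    v = rng.normal(0.0, np.sqrt(kT / mass), n_particles)  # thermal initial velocities
    for _ in range(n_steps):
        x, v = langevin_step(x, v, 0.0, mass, gamma, kT, dt, rng)
    total_time = n_steps * dt
    d_measured = np.mean(x**2) / (2.0 * total_time)       # 1D: <x^2> ~ 2 D t for t >> 1/gamma
    return d_measured, kT / (mass * gamma)
```

Calling `estimate_diffusion(gamma=5.0, kT=1.0, mass=1.0)` returns a measured value close to the theoretical one; the residual difference comes from the short ballistic regime at times below $1/\gamma$ and from the finite number of particles. Tuning γ until the measured D matches a target value mirrors the procedure described in the text.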
20.4 APPLICATION TO STRUCTURAL DYNAMICS OF VIRUSES
Shape-based CG was successfully applied to study the structural dynamics of viruses. A virus [Levine 1991; Flint et al. 2004] is a macromolecular complex, normally 10–100 nm across, consisting of a genome enclosed in a protein coat (capsid); usually, the capsid is a symmetric assembly, often an icosahedron, formed by multiple copies of a few proteins. Other accessory molecules can be contained inside the capsid; additional proteins and a lipid bilayer envelope are also found on the surface of some viruses. The viral replication cycle starts with the delivery of the viral genome into a host cell, a step usually involving capsid disintegration. Then, the host cell replicates the viral genome and produces viral proteins, often at the cost of reducing the cell’s normal functionality. Finally, the newly produced parts of the virus assemble into viral particles and leave the host cell, which is usually destroyed as a result. Outside of the host cell a viral particle has to be stable and relatively rigid to protect the genome, but it also has to become unstable when virulence factors need to be released into the host cell. In order to determine the stability of viral capsids and transitions between stable and unstable structures, we performed MD simulations of several viruses, both in all-atom [Freddolino et al. 2006] and CG representations [Arkhipov, Freddolino, and Schulten 2006]. Employing the shape-based CG method [Arkhipov, Freddolino, and Schulten 2006], we were able to study large viral capsids (up to 75 nm in diameter, see Figure 20.4) on 1.5–25 μs time scales. Most of the simulations were performed on a single processor, but parallel simulations on up to 48 processors were also carried out; the latter exhibited good parallel scaling similar to that of all-atom simulations with NAMD [Phillips et al. 2005].
FIGURE 20.4 CG simulations of viral capsids. The initial and final structures for each simulation are shown (all particles are drawn to scale). The ratio of 200 atoms per CG particle is used. All capsids are simulated without gene content; that is, empty, except in the case of the satellite tobacco mosaic virus, in which case both empty and full capsids were simulated. From Arkhipov, Freddolino, and Schulten (2006).
First [Arkhipov, Freddolino, and Schulten 2006], we performed CG simulations of satellite tobacco mosaic virus (STMV), found to be in good agreement with previous all-atom simulations [Freddolino et al. 2006]. STMV is one of the smallest and simplest viruses, only 17 nm in diameter (Figure 20.4); yet describing it using all-atom simulations required dealing with a one-million-atom system. MD simulations of the complete STMV showed that it is perfectly stable on a time scale of 10 ns. The STMV capsid without genome, in contrast, was unstable, showing a remarkable collapse over the first 5–10 ns of simulation. The CG simulation of STMV reproduced the patterns and time scales of the collapse observed for the STMV capsid in all-atom simulations. For both complete STMV and the capsid alone, several other quantities computed in CG simulations, such as the average capsid radius, were within a few angstroms of those in the all-atom study. CG simulations of capsids of several more viruses were then carried out (Figure 20.4): the satellite panicum mosaic virus (SPMV), the satellite tobacco necrosis virus (STNV), the brome mosaic virus (BMV), the poliovirus, the bacteriophage φX174, and reovirus. In CG simulations, the empty capsids of STMV, SPMV, and STNV collapsed. The reovirus core, the bacteriophage φX174 procapsid, and the poliovirus capsid were stable, and indeed, it is known experimentally that these are stable even without their respective genetic material. For BMV, empty capsids have been observed experimentally, while a cleavage of the N-terminal tails of the unit proteins makes the capsid unstable [Lucas, Larson, and McPherson 2002]. In agreement with these observations, the BMV capsid was stable in our simulations, although very flexible, but when the N-terminal tails were removed, the capsid collapsed.
Thus, results of CG simulations agree with all-atom studies and experimental data, where available. The simulations also provide new quantitative information about viral dynamics. Perhaps the main finding in this regard is that some of the capsids (STMV, SPMV, and STNV) cannot maintain their structural integrity in the absence of the genome. This suggests a specific self-assembly pathway for these viruses: it must be the RNA, and not the protein, which nucleates assembly of the complete virus. Apparently, the RNA forms a spherical particle, and then capsid proteins attach to its surface. It is known for some viruses that they assemble “capsid first” [Flint et al. 2004], the genome being pulled into the preformed capsid. Our simulations and emerging experimental evidence [Lucas, Larson, and McPherson 2002; Kuznetsov et al. 2005] suggest that this might be different for some viruses. As to what determines the stability, we found that the stability and flexibility of viral capsids are closely correlated with the strength of interactions between capsid subunits. Larger capsids, such as the reovirus core, have proteins that intricately intertwine with each other, featuring even a “thread and needle” arrangement. For STMV, SPMV, and STNV, unit proteins only touch each other by the edges. With more contacts between the protein units, a capsid has more hydrogen bonds and salt bridges per unit area (reflected in the CG model by generalized nonbonded LJ and Coulomb forces), and the frictional force between capsid faces rises. These factors enhance capsid stability. Our simulations suggest that viruses like STMV, SPMV, and STNV have relatively few contacts between the capsid subunits and only their genomes render the capsids stable.
20.4.1 APPLICATION TO THE BACTERIAL FLAGELLUM
The shape-based CG method has recently been applied to study the molecular basis of bacterial swimming. Many types of bacteria propel themselves through liquid media using whip-like structures known as flagella. The bacterial flagellum is a huge (several micrometers long, 20 nm wide), multiprotein assembly built of three domains: a basal body, fixed in the cell body below the outer membrane and acting as a motor; a filament, which grows out of the cell, making up the bulk of the length of the flagellum and interacting with solvent to propel the bacterium; and a hook, connecting basal body and filament and acting as a joint transmitting the torque from the former to the latter. Depending on the direction of the torque applied by the basal body, the filament assumes different helical shapes. Under counterclockwise rotation (as viewed from the exterior of the cell), several flagella form a single helical bundle which propels the cell along a straight line (running mode) [Berg 2000]. Under clockwise rotation, the individual flagella dissociate from the bundle and form separate right-handed helices, causing the cell to tumble. Varying the duration of running and tumbling, bacteria can move up or down a gradient of an attractant or repellent by a biased random walk. One of the unresolved questions about the flagellum is how the reversal of torque applied by the motor results in a switching between the helical shapes of the flagellar filament. This switching is a result of polymorphic transitions in the filament, when individual protein units slide against each other [Samatey et al. 2001], but its molecular mechanism remains poorly understood. Trying to answer this question, we performed CG MD studies of the flagellar filament [Arkhipov et al. 2006], which is formed by thousands of copies of a single protein, flagellin. Flagellin was coarse-grained with 500 atoms per CG bead, as shown in Figure 20.5.
Segments of the filament (1100 flagellin units, or 0.5 μm long) were rotated clockwise and counterclockwise, with a constant rotation speed of one turn per 10 μs applied to 33 protein units at the bottom of the segment. The simulations covered 30 μs each. The filament is built by the helical arrangement of flagellin units, 11 per turn. A thread of units each separated by one turn is called a “protofilament” (see Figure 20.5); 11 protofilaments make up the filament. In the CG simulations, the filament segments remained stable when rotated, but protofilaments rearranged dramatically (though it must be noted that the torque applied to the model flagellum exceeded by far the one arising under native conditions). In the straight filament, which was the starting structure, the protofilaments form a right-handed helix with a large helical period. When the torque is applied counterclockwise (as viewed from the base to the tip), the protofilaments remain arranged in right-handed helices, but the pitch of the helices rises; when the torque is opposite, the
FIGURE 20.5 (See color insert following page 238.) CG of the flagellar filament. Unit proteins are represented by 15 CG beads (a). In (b), the flagellar filament viewed from the side and from the top is shown in all-atom (left) and CG (right) representations. A filament segment (1100 monomers) is shown in CG representation in (c). A single helix turn of 11 unit proteins is highlighted in black.
helices become left-handed. The filament also forms a helix as a whole. For the rotation corresponding to the running mode, the filament forms a left-handed helix, whereas for the tumbling mode it becomes a right-handed helix. The same difference in handedness between these helices is found in living bacteria [Turner, Ryu, and Berg 2000]. Running and tumbling modes of bacterial swimming are determined by structural transitions in the flagellar filament, depending on the direction of the applied torque. Clearly, interactions between protein units play an important role in enabling this transition. However, flagella act in solvent (water), and, curiously, the role of the solvent had not been analyzed much before. The effect of solvent was taken into account using Equation 20.3 [Arkhipov et al. 2006]. It was found that without friction due to solvent, flagella rotate as a rigid body; that is, the mutual positions of monomers are frozen, both for running and tumbling mode. With the solvent’s friction present, the protofilaments rearrange as explained above, in agreement with structural changes in the flagellum suggested by experimental studies. Thus, the solvent (friction) plays a crucial role in the switching between the arrangements of protofilaments and, consequently, in producing supercoiling along the entire filament, or running and tumbling modes of motion.
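The scales of the rotation protocol in this section can be made concrete with a short estimate based on the numbers quoted above (1100 units, 11 units per turn, 0.5 μm segment length, one turn per 10 μs, 30 μs per simulation). The last line additionally assumes the 500 fs timestep mentioned in Section 20.3.1 for 500 atoms per bead; the text does not state which timestep the flagellum simulations used, so that line is an assumption.

```latex
\[
\begin{aligned}
\text{turns of the driven base per simulation} &= \frac{30~\mu\text{s}}{10~\mu\text{s}} = 3, \\
\text{helical turns in the segment} &= \frac{1100~\text{units}}{11~\text{units per turn}} = 100, \\
\text{axial rise per helical turn} &\approx \frac{0.5~\mu\text{m}}{100} = 5~\text{nm}, \\
\text{rotation per timestep (assuming } \Delta t = 500~\text{fs)} &= 2\pi \times \frac{500~\text{fs}}{10~\mu\text{s}} \approx 3.1 \times 10^{-7}~\text{rad}.
\end{aligned}
\]
```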
20.5 FUTURE APPLICATIONS OF COARSE-GRAINING
Due to growing interest in large biomolecules and systems biology, coarse-grained simulations have grown increasingly common over the past few years as a means of accessing time and size scales that cannot be reached with all-atom MD. Recent advances such as more reliable force fields for residue-based CG [Marrink et al. 2007; Zhou et al. 2007], mixed CG and all-atom simulations [Shi, Izvekov, and Voth 2006; Praprotnik, Site, and Kremer 2006], and low-resolution shape-based CG models [Arkhipov et al. 2006; Arkhipov, Freddolino, and Schulten 2006] have improved the accuracy, flexibility, and potential scope of CG simulations. Since, however, coarse-grained simulations will never offer the same level of accuracy as all-atom simulations, it seems likely that CG simulations will naturally evolve in directions allowing closer links to atomistic descriptions. Both the aforementioned techniques of dynamic changes of scale and mixing CG and all-atom descriptions serve as useful and distinct models for how this can be accomplished, with the former using CG as an accelerant to improve sampling and then using all-atom simulations to flesh out the details of the sampled states, and the latter allowing less important parts of a system (such as bulk solvent) to be treated with a lower resolution than the regions of interest. The utility of further development and application of these techniques can be illustrated, for example, for the case of the bacterial flagellum. Coarse-grained simulations have been used to investigate both the large-scale behavior of the flagellar filament during supercoiling [Arkhipov et al. 2006] and solvent dynamics around the supercoiled flagellum [Gebremichael, Ayton, and Voth 2006]; at the same time, large-scale all-atom simulations have offered a potential atomic-scale
mechanism for differential supercoiling [Kitao et al. 2006]. The remaining challenge for theory is to fully link the CG and atomistic descriptions to provide a coherent and fully testable model for filament supercoiling; the most likely path for developing such a model is to use rotation of a shape-based CG filament to develop an ensemble of conformations at different points along the flagellum, which can then be simulated and perturbed in an all-atom representation to understand what interactions and structural transitions are important for the supercoiling process. A similar scale-switching approach could be applied to other systems, including viral capsids (allowing the study of assembly intermediates obtained from shape-based CG). The shape-based CG methods should be further developed in a few important directions. Our present shape-based CG methodology [Arkhipov et al. 2006; Arkhipov, Freddolino, and Schulten 2006] allows one to simulate proteins. Despite initial successes, the protein model remains relatively rough and needs to be further refined, in particular with respect to the interaction potentials employed. These potentials can be improved using systematic all-atom parameterizing simulations for target systems. The same is true for the solvent model, which should be further developed along the lines of a true implicit solvent model, such as the generalized Born approach [Dominy and Brooks 1999; Bashford and Case 2000; Mongan, Case, and McCammon 2004]. The CG method should also be extended to biomolecules other than proteins; to that end, we have recently started the development of a shape-based CG membrane model [Arkhipov, Yin, and Schulten 2008]. In this model, each leaflet of a lipid bilayer is represented by a collection of two-bead “molecules” (two beads connected by a spring), held together by nonbonded interactions tuned to mimic the bilayer stability, thickness, and area per lipid.
This approach is similar to previous attempts at CG membrane simulations, such as that of Reynwar et al. (2007). However, in our model each two-bead “molecule” represents a patch of a leaflet (not necessarily an integer number of lipid molecules), rather than a single lipid. Using the model, we have been able to simulate bilayer self-assembly and reproduce the results of all-atom and residue-based CG simulations of BAR domains (see above); much larger BAR domain simulations using the new model are under way. The shape-based CG model describing proteins and lipids will be very useful for simulations of subcellular processes, where multiple proteins interact with each other and with cellular membranes on long time scales. Future residue-based CG simulations of nanodiscs will continue to further our understanding of HDL assembly and maturation, as well as aiding in the use of synthetic nanodiscs as protein scaffolds. HDL particles acting in vivo absorb esterified cholesterol for transport [Wang and Briggs 2004]; understanding the structural transitions involved in this process will be a key step in the overall goal of characterizing HDL function. This absorption process can be studied through residue-based CG simulations designed to observe how the structure of a nanodisc adjusts to the presence of esterified cholesterol. Ongoing simulations of nanodiscs will also be used to refine reverse CG methods for residue-based CG models to move from the snapshot-only reversal described above to a thermodynamically correct method for changing from all-atom to residue-based CG models. The continued development and application of CG, along with ongoing improvements in generally available computational resources, promises to enable biomolecular simulations to treat many systems which were previously inaccessible.
The increasing application of all-atom and CG simulations to the same system should greatly increase the impact of CG by allowing the CG method to be thoroughly tested or replaced by all-atom calculations when desired. CG simulations will be useful for understanding the behavior of cell-scale systems over millisecond time scales, and their role will increase with continuing improvements to CG potentials.
REFERENCES
Arkhipov, A., P. L. Freddolino, K. Imada, K. Namba, and K. Schulten. 2006. Coarse-grained molecular dynamics simulations of a rotating bacterial flagellum. Biophys. J. 91:4589–97.
Arkhipov, A., P. L. Freddolino, and K. Schulten. 2006. Stability and dynamics of virus capsids described by coarse-grained modeling. Structure 14:1767–77.
Arkhipov, A., Y. Yin, and K. Schulten. 2008. Four-scale description of membrane sculpting by BAR domains. Biophys. J. In press.
Baas, B. J., I. G. Denisov, and S. G. Sligar. 2004. Homotropic cooperativity of monomeric cytochrome P450 3A4 in a nanoscale native bilayer environment. Arch. Biochem. Biophys. 430:218–28.
Baron, R., D. Trzesniak, A. H. de Vries, A. Elsener, S. J. Marrink, and W. F. van Gunsteren. 2007. Comparison of thermodynamic properties of coarse-grained and atomic-level simulation models. Chem. Phys. Chem. 8:452–61.
Bashford, D., and D. A. Case. 2000. Generalized Born models of macromolecular solvation effects. Annu. Rev. Phys. Chem. 51:129–52.
Bayburt, T. H., Y. V. Grinkova, and S. G. Sligar. 2002. Self-assembly of discoidal phospholipid bilayer nanoparticles with membrane scaffold proteins. Nano Lett. 2:853–56.
Bayburt, T. H., Y. V. Grinkova, and S. G. Sligar. 2006. Assembly of single bacteriorhodopsin trimers in bilayer nanodiscs. Arch. Biochem. Biophys. 450:215–22.
Berg, H. C. 2000. Motile behavior of bacteria. Phys. Today 53:24–29.
Blood, P. D., and G. A. Voth. 2006. Direct observation of Bin/amphiphysin/Rvs (BAR) domain-induced membrane curvature by means of molecular dynamics simulations. Proc. Natl. Acad. Sci. U.S.A. 103:15068–72.
Boldog, T., S. Grimme, M. Li, S. G. Sligar, and G. L. Hazelbauer. 2006. Nanodiscs separate chemoreceptor oligomeric states and reveal their signaling properties. Proc. Natl. Acad. Sci. U.S.A. 103:11509–14.
Bond, P. J., and M. S. P. Sansom. 2006. Insertion and assembly of membrane proteins via simulation. J. Am. Chem. Soc. 128:2697–704.
Civjan, N. R., T. H. Bayburt, M. A. Schuler, and S. G. Sligar. 2003. Direct solubilization of heterologously expressed membrane proteins by incorporation into nanoscale lipid bilayers. Biotechniques 35:556–60, 562–63.
Das, P., S. Matysiak, and C. Clementi. 2005. Balancing energy and entropy: A minimalist model for the characterization of protein folding landscapes. Proc. Natl. Acad. Sci. U.S.A. 102 (29):10141–46.
Davydov, D. R., H. Fernando, B. J. Baas, S. G. Sligar, and J. R. Halpert. 2005. Kinetics of dithionite-dependent reduction of cytochrome P450 3A4: Heterogeneity of the enzyme caused by its oligomerization. Biochemistry 44:13902–13.
Denisov, I. G., Y. V. Grinkova, A. A. Lazarides, and S. G. Sligar. 2004. Directed self-assembly of monodisperse phospholipid bilayer nanodiscs with controlled size. J. Am. Chem. Soc. 126:3477–87.
Dominy, B. N., and C. L. Brooks, III. 1999. Development of a generalized Born model parametrization for proteins and nucleic acids. J. Phys. Chem. B 103:3765–73.
Duan, H., N. R. Civjan, S. G. Sligar, and M. A. Schuler. 2004. Co-incorporation of heterologously expressed Arabidopsis cytochrome P450 and P450 reductase into soluble nanoscale lipid bilayers. Arch. Biochem. Biophys. 424:141–53.
Flint, S. J., L. W. Enquist, V. R. Racaniello, and A. M. Skalka. 2004. Principles of virology. 2nd ed. Washington, DC: ASM Press.
Freddolino, P. L., A. S. Arkhipov, S. B. Larson, A. McPherson, and K. Schulten. 2006. Molecular dynamics simulations of the complete satellite tobacco mosaic virus. Structure 14:437–49.
Gebremichael, Y., G. S. Ayton, and G. A. Voth. 2006. Mesoscopic modeling of bacterial flagellar microhydrodynamics. Biophys. J. 91:3640–52.
Gorshkova, I. N., T. Liu, H. Y. Kan, A. Chroni, V. I. Zannis, and D. Atkinson. 2006. Structure and stability of apolipoprotein A-I in solution and in discoidal high-density lipoprotein probed by double charge ablation and deletion mutation. Biochemistry 45:1242–54.
Humphrey, W., A. Dalke, and K. Schulten. 1996. VMD: Visual molecular dynamics. J. Mol. Graphics 14:33–38.
Izvekov, S., and G. A. Voth. 2005a. A multiscale coarse-graining method for biomolecular systems. J. Phys. Chem. B 109 (7):2469–73.
Izvekov, S., and G. A. Voth. 2005b. Multiscale coarse graining of liquid-state systems. J. Chem. Phys. 123:134105.
Izvekov, S., and G. A. Voth. 2006. Multiscale coarse-graining of mixed phospholipid/cholesterol bilayers. J. Chem. Theory Comput. 2:637–48.
Katsoulakis, M. A., A. J. Majda, and D. G. Vlachos. 2003. Coarse-grained stochastic processes for microscopic lattice systems. Proc. Natl. Acad. Sci. U.S.A. 100 (3):782–87.
Kitao, A., K. Yonekura, S. Maki-Yonekura, F. A. Samatey, K. Imada, K. Namba, and N. Go. 2006. Switch interactions control energy frustration and multiple flagellar filament structures. Proc. Natl. Acad. Sci. U.S.A. 103 (13):4894–99.
21 Coarse-Graining Protein Mechanics

Richard Lavery
Institut de Biologie et Chimie des Protéines, Université de Lyon

Sophie Sacquin-Mora
Laboratoire de Biochimie Théorique, Institut de Biologie Physico-Chimique
CONTENTS
21.1 Introduction
21.2 Methodology
21.3 Results and Discussion
21.3.1 Force Constant “Spectra”
21.3.2 Locating Active Sites
21.3.3 Conformational Versus Mechanical Changes
21.3.4 Architectural Fingerprints in the Force Constant Spectra
21.4 Conclusions
References
21.1 INTRODUCTION

Almost 50 years after the first protein structures were solved [1,2], structural databases now contain tens of thousands of structures, which have been extensively analyzed and classified. Despite these efforts, we still have relatively little understanding of how structure is related to the mechanical and dynamical properties of proteins, which are nevertheless indissociable features of protein function. This situation is beginning to change because of progress in both experimental and theoretical approaches.

Experimentally, both of the methods for determining high-resolution structures, X-ray crystallography and NMR spectroscopy, also provide some information on protein flexibility. First, it is possible to compare structures resolved with or without interacting species, or, in the case of enzymes, to capture intermediate conformational states using unreactive substrate analogs. Both methods can also provide finer data on the positional fluctuations of individual residues within proteins in terms of Debye–Waller temperature factors or order parameters. A new route to mechanical probing has recently arisen with the development of single-molecule experiments [3,4], which enable a protein to be pulled apart, either by tethers on its N- and C-termini or, in “triangulation” experiments, between other residue pairs [5,6]. The latter approach has convincingly demonstrated that, not surprisingly, proteins respond differently depending on the direction of the applied forces.

Theoretically, a number of different approaches have been applied to analyzing protein flexibility. First amongst these are all-atom molecular dynamics simulations, taking into account the
surrounding solvent (generally represented by explicit solvent molecules, but also, potentially, with simpler continuum representations). Such simulations are generally limited to the nanosecond time scale and are expensive in terms of computer resources. They are thus generally limited to studying specific cases, although this situation is changing today [7]. Dynamic trajectories can be analyzed to understand which parts of a protein are the most mobile, how domains move with respect to one another within multidomain structures or how much and how fast individual amino acid side chains can change their conformational substates [8,9]. Trajectories can also be biased in a number of ways to mimic external forces acting on proteins and thus to model single-molecule experiments (albeit on a very different and much faster time scale) [10] or environmental forces such as membrane tension [11].

Simpler methods, notably those based on elastic network models [12,13], can also provide valuable data on protein deformations, despite the fact that these models generally ignore the difference between individual amino acid residues and are guided only by the proximity of residues within the 3D structure of the protein. Thus, the so-called Gaussian network model (GNM), which extracts normal modes from an elastic network protein representation, has been shown to provide useful information on the slow, large-amplitude, collective motions which characterize domain movements, allosteric effects, and enzyme activity [14]. Elastic network models can also be used to calculate the atomic fluctuations. These can be converted to temperature factors (also termed B-factors), which generally show good overall correlations with those measured crystallographically. It has recently been shown that this correlation can be further improved by taking crystal-packing effects into account [15]. Good correlations have also been found with the conformational fluctuations represented by the multiple structures compatible with NMR data [16]. It has also been found that elastic network models are capable of reproducing the anisotropy of protein fluctuations to a surprisingly good extent [17]. Other coarse-grain approaches to protein flexibility include graph-theoretic models based on the concept of tensegrity (which determines the residual degrees of freedom in a mechanically linked system) [18]. These, along with elastic network approaches, have also become the basis of a variety of multiscale coarse-grain models [19–21].

We became interested in protein mechanics as a result of our earlier work on the mechanics of DNA [22,23] and on the importance of the associated base-sequence-dependent mechanical properties for understanding protein–DNA recognition [24,25]. From the beginning of our studies, we were interested in defining mechanical properties on the residue level since this seemed to be the easiest way of making comparisons with data on biological function, the impact of point mutations, differences between homologous proteins and so on. We were unsatisfied with the possibility of using temperature factors to answer these questions, notably because of the work of Halle [26], which showed convincingly that temperature factors basically reflect only local structure, and, in particular, local atomic packing densities. We consequently looked for a new measure. Although one obvious approach was to copy the single-molecule triangulation experiments cited above and test the resistance of all residue–residue (or atom–atom) vectors, this method has the disadvantage that it does not easily yield properties that can be associated with individual residues.
Tests on the ease of displacing residues with respect to the center of mass of the protein also turned out to be unsatisfactory because observed flexibility could again be attributed either to the probed residue or to the center of mass (for example, because of the movement of a flexible region on the distal side of the protein with respect to the probed residue) [27]. We finally found that testing the displacement of each residue with respect to the rest of the protein structure gave the most interesting results. This involved asking how much energy was necessary to change the mean distance d_i from residue i to all other residues j ≠ i in an N-residue protein:

$$ d_i = \frac{1}{N-1} \sum_{j=1,\, j \neq i}^{N} \left| \mathbf{r}_i - \mathbf{r}_j \right| . $$
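As an illustration, the mean distance defined above can be evaluated directly from a set of residue positions. The following is a minimal sketch and not the authors' code: the function name `mean_distances` and the toy coordinates are hypothetical, and one point per residue (for example the Cα atom) is assumed.

```python
import math


def mean_distances(coords):
    """Return d_i, the mean distance from point i to all other points.

    coords: list of (x, y, z) tuples in Å, one per residue (assumed here
    to be the Cα position).
    """
    n = len(coords)
    if n < 2:
        raise ValueError("at least two residues are needed")
    result = []
    for i, r_i in enumerate(coords):
        # Sum |r_i - r_j| over all j != i, then divide by N - 1.
        total = sum(math.dist(r_i, r_j) for j, r_j in enumerate(coords) if j != i)
        result.append(total / (n - 1))
    return result


# Hypothetical three-residue toy system (coordinates in Å).
toy = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
print(mean_distances(toy))
```

The double loop scales as N², which is acceptable for a sketch but would normally be vectorized for large proteins or long trajectories.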
Note that the position of each residue r_i can be characterized by a single atom, such as Cα. The mean distance d_i can alternatively be obtained by averaging over the mean distances for each atom in a given residue. If the mean distance was successively decreased and increased, we obtained an energy versus mean distance plot. For distance changes of the order of a few tenths of an angstrom, these plots turned out to be virtually quadratic and could thus be characterized by the second derivative at the energy minimum, or, in other words, an effective force constant (hereafter denoted k_i) for displacing a residue i within the whole protein structure. Note that d_i is a scalar quantity. Changes in d_i leave all residues free to move in their energetically optimal directions.

The studies we have subsequently carried out on a variety of proteins [27–29] show that the associated force constants are a very interesting guide to protein mechanics. They reveal the extent of the mechanical heterogeneity induced by the complex 3D shapes of proteins and suggest that this heterogeneity plays a significant role in preparing proteins for their biological functions. We have notably found that mechanical properties seem to be very useful in identifying active sites, which in turn provides valuable information for determining protein function [30], a major problem in our post-genomic era [31]. This chapter summarizes the approaches that we have used to obtain residue-by-residue force constants, gives an example of their application to a specific protein, and speculates on future developments.
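To make the idea of an effective force constant concrete, the sketch below estimates the second derivative of an energy versus mean distance scan at its minimum. It is only an illustration under stated assumptions: the five-point central-difference stencil is our choice (the chapter does not say how the second derivative was extracted), the scan is assumed to be sampled at equally spaced displacements −2h, −h, 0, +h, +2h around the minimum, and the function name and synthetic energies are hypothetical.

```python
def force_constant_from_scan(energies, step):
    """Estimate k = d2E/dd2 at the minimum from a five-point energy scan.

    energies: [E(-2h), E(-h), E(0), E(+h), E(+2h)] in kcal/mol, where the
              displacement is the imposed change in the mean distance.
    step:     spacing h between scan points in Å.
    Returns the force constant in kcal mol^-1 Å^-2.
    """
    if len(energies) != 5:
        raise ValueError("expected exactly five energies")
    e_m2, e_m1, e_0, e_p1, e_p2 = energies
    # Standard five-point central-difference formula for a second derivative.
    return (-e_m2 + 16.0 * e_m1 - 30.0 * e_0 + 16.0 * e_p1 - e_p2) / (12.0 * step ** 2)


# Synthetic, perfectly quadratic scan E = 0.5 * k * x^2 with an assumed k.
k_assumed = 32.0
scan = [0.5 * k_assumed * x * x for x in (-0.2, -0.1, 0.0, 0.1, 0.2)]
print(force_constant_from_scan(scan, 0.1))
```

For a truly quadratic energy profile the stencil recovers the input force constant up to rounding; for real scans a least-squares parabola fit would be an equally reasonable choice.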
21.2 METHODOLOGY

Our earliest studies in this field used all-atom protein representations and a conventional AMBER force field [32] combined with a generalized Born continuum solvent model [33]. An internal coordinate minimization program based on JUMNA [34] was used to relax the protein structure and then to perturb the Cα position of each residue in turn by constraining the mean distance to all other Cα atoms to increase or decrease. This approach was naturally slow since it typically required four energy minimizations (±0.1 Å, ±0.2 Å) for each residue. We thus looked for ways of speeding up the calculation. This was achieved in two steps. Firstly, we noted that rather than physically constraining each residue to move within the overall protein structure, we could simply analyze the fluctuations of the mean distance d_i from each residue (to the rest of the structure) occurring naturally within a molecular dynamics simulation [27]:

$$ k_i = \frac{3 k_B T}{\left\langle \left( d_i - \langle d_i \rangle \right)^2 \right\rangle} , $$

where d_i is the mean distance defined above, 〈 〉 denotes the average over the simulation, k_B is the Boltzmann constant and T is the temperature of the simulation. This implied that the N residue force constants could be obtained from a single dynamic trajectory rather than from 4N + 1 minimizations. The results obtained in this way were very similar to those derived from constrained energy minimization. However, since all-atom dynamics simulations generally require an explicit solvent representation to avoid deforming the initial protein structure, the resulting computational cost was still high. We consequently turned to simpler elastic network models [12–14] to gain time. Although we made initial trials with one-point-per-residue models, where each amino acid gives rise to a single node in the elastic network (positioned on the Cα atom), it was clear that a more refined model which could distinguish between the various types of amino acid would be necessary if we wanted to study the impact of sequence mutations. We consequently adopted the model proposed by Zacharias [35,36], which has two or three points per amino acid and has already proved effective in protein–protein docking studies. In this model, each amino acid has one pseudoatom at the Cα position. Small side
chains (excepting glycine) have a second pseudoatom at the geometric center of the heavy atoms of the side chain, while larger side chains (Arg, Gln, Glu, His, Lys, Met, Trp, Tyr) have a pseudoatom at the center of the Cβ–Cγ bond and a third pseudoatom at the geometric center of the side-chain heavy atoms beyond Cγ [35]. With this coarse-grain protein representation, the force field was also simplified to a set of quadratic springs placed between all pairs of pseudoatoms separated by less than a chosen cutoff distance, for which we chose 9 Å. All springs had identical force constants of 0.6 kcal mol−1 Å−2 (note that changing this value simply acts as an overall scale factor on the final results). With this type of representation, it is appropriate to replace Newtonian dynamics with stochastic Brownian dynamics (BD), which ignores inertial effects and treats solvent only through random forces and hydrodynamic drag. Full details of the BD simulation protocol we use can be found in one of our earlier publications [28].
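The two ingredients of this section, the elastic network and the fluctuation-based force constant, can be sketched as follows. This is a simplified illustration rather than the authors' implementation: the function names are hypothetical, each spring's rest length is assumed to be the distance in the reference structure (the usual elastic network convention, not stated explicitly above), the default temperature of 300 K is an assumption, and the frames would in practice come from a BD simulation, which is not reproduced here.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal mol^-1 K^-1


def build_springs(points, cutoff=9.0, k_spring=0.6):
    """List (i, j, rest_length, k_spring) for all pairs closer than cutoff (Å).

    The rest length is taken as the reference-structure distance (assumption).
    """
    springs = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            r0 = math.dist(points[i], points[j])
            if r0 < cutoff:
                springs.append((i, j, r0, k_spring))
    return springs


def mean_distances(points):
    """d_i: mean distance from point i to every other point."""
    n = len(points)
    return [
        sum(math.dist(p, q) for j, q in enumerate(points) if j != i) / (n - 1)
        for i, p in enumerate(points)
    ]


def force_constants(frames, temperature=300.0):
    """k_i = 3 k_B T / <(d_i - <d_i>)^2>, averaged over trajectory frames.

    frames: list of coordinate lists, one per saved simulation frame.
    Returns force constants in kcal mol^-1 Å^-2.
    """
    series = [mean_distances(frame) for frame in frames]
    n_frames = len(series)
    result = []
    for i in range(len(series[0])):
        values = [s[i] for s in series]
        mean = sum(values) / n_frames
        variance = sum((v - mean) ** 2 for v in values) / n_frames
        # A residue whose mean distance never changes is infinitely stiff.
        result.append(math.inf if variance == 0.0 else 3.0 * K_B * temperature / variance)
    return result


# Hypothetical two-point "trajectory" with two frames (coordinates in Å).
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.2, 0.0, 0.0)],
]
print(build_springs(frames[0]))
print(force_constants(frames))
```

Because the force constant is inversely proportional to the variance of d_i, short trajectories give noisy estimates; the number of frames needed for convergence is not addressed by this sketch.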
21.3 RESULTS AND DISCUSSION

21.3.1 FORCE CONSTANT “SPECTRA”
We have chosen to illustrate our force constant calculations using a soluble enolase [37]. The structure of this dimeric protein, PDB 2AL1 [38], has been solved to a resolution of 1.5 Å in the presence of its substrates, 2-phospho-D-glycerate (2PGA) and phosphoenolpyruvate (PEP), and two magnesium ions. Figure 21.1a shows a cartoon version of this α/β-fold protein with its two monomers colored dark and light gray. Each monomer consists of two domains and the substrates, in this case 2PGA (black), are bound within the C-terminal domains. The substrate-binding pocket shown in Figure 21.1b involves residues Ser39, His159, Glu168, Glu211, Lys345, His373, and Lys396, with Lys345 and Glu211 serving as acid/base catalysts in the interconversion of 2PGA and PEP [37]. Note that Ser39 has been excluded from Figure 21.1b for clarity. The coordination of the two magnesium ions in the enolase (black spheres) also involves residues Ser39, Asp246, Glu295, and Asp320 [39]. The force constants calculated for this protein, by analyzing the fluctuations from a BD simulation on a two- to three-point-per-residue representation, are shown in Figure 21.2. The bound ligand was not represented by elastic network points and consequently has no impact on the force constant calculation. The reader
FIGURE 21.1 (a) Cartoon representation of a yeast enolase dimer, PDB 2AL1 [37]. The two monomeric units are colored in light and dark gray and the 2PGA substrates are shown in black. All the molecular graphics in this article were prepared using VMD [50]. (b) Simplified representation of the active site and β-barrel of enolase (2AL1). Catalytic and magnesium-binding residues are in black, and the two magnesium ions and the 2PGA substrate are in dark gray. Ser39 has been omitted for clarity.
FIGURE 21.2 Force constant plot for enolase. The residues are numbered consecutively and the two monomeric units follow one another along the abscissa. Force constants in Figure 21.2, Figure 21.3, and Figure 21.5 are in units of kcal mol−1 Å−2.
is referred to our earlier publications, which show that very similar results are obtained whether the force constants are calculated by energy minimization or BD simulations and also that bound ligands generally have very little effect on the results [28,29]. Note that the residues have been numbered consecutively in the force constant plot shown in Figure 21.2. The first striking observation concerning these results is that the force constants are highly variable and often change sharply from one residue to the next. Here the values range from 3 to 507 kcal mol−1 Å−2 with a standard deviation of 48 around an average of 32 kcal mol−1 Å−2 (note: 1 kcal mol−1 Å−2 ≈ 0.07 nN Å−1). In Figure 21.2, the force constants for the two monomers follow one another, giving rise to the horizontally repeating pattern. Figure 21.3a shows the results for the first monomer in more detail. It can be seen that the largest force constants occur for residues in the core of the dimer. Their location is illustrated graphically by dark shading in Figure 21.4a for the residues in the right-hand monomer. (Note that these results can be seen better in the color version of Figure 21.4, where high force constant residues are shown in green.) It can be seen that the highest force constants occur for residues at the junction between the two monomers. In contrast, except for Glu211 and His373, no residues with high force constants are found in the active site pocket, as can be seen in Figure 21.3a, where the circles and triangles indicate the values corresponding respectively to the active site and the magnesium-ion-binding residues cited above. We have found that this behavior is common to most multidomain proteins and reflects the fact that domain movements leave the residues at the junctions virtually undisturbed [28,29]. This leads to high force constants in our approach, since the mainly rotational movements of the domains do not modify the distances of other residues to these hinge points. Similar findings have been obtained with normal mode analyses of elastic network models [40,41]. To avoid this effect dominating the force constant spectra, we have developed a so-called domain separation approach. This consists of calculating force constants for changing the mean distance for a given residue with respect to the subset of other residues belonging to the same domain. Note that this change does not influence the elastic network representation, which still includes all residues from all domains. The results of this procedure are shown for a single monomer in the plot in Figure 21.3b and illustrated graphically in Figure 21.4b. It is now seen that the residues with the highest force constants (black, or green in the color version of the figure) lie near the center of the C-terminal domain, shown for the right-hand monomer in Figure 21.4b, and close to the substrate-binding site. Five of
FIGURE 21.3 (a) Force constant plot for the first monomer of enolase. Circles indicate the active site residues (from left to right: Ser39, His159, Glu168, Glu211, Lys345, His373, and Lys396) and triangles indicate residues binding the magnesium ions (from left to right: Asp246, Glu295, and Asp320). (b) Force constant plot for the first monomer of enolase after domain separation. Circles indicate the active site residues (from left to right: Ser39, His159, Glu168, Glu211, Lys345, His373, and Lys396) and triangles indicate residues binding the magnesium ions (from left to right: Asp246, Glu295, and Asp320).
FIGURE 21.4 (See color insert following page 238.) (a) Backbone diagram of enolase. Residues with high force constants within the right-hand monomer are shown in black. (b) Following domain separation, residues with high force constants within the right-hand domain are shown in black. (c) Mechanical changes in passing from the monomeric to the dimeric form of enolase. Residues with significantly increased force constants are shown in black and those with significantly decreased force constants in gray (changes are only shown for the right-hand domain).
the seven key active-site residues, indicated by circles in Figure 21.3b, now lie within force constant peaks and, in particular, the catalytic residues Lys345, His373, and Lys396 represent three of only four residues having force constants above 300 kcal mol−1 Å−2 within the monomer. It is interesting to note that, after domain separation, rigidity peaks corresponding to the magnesium-binding residues also become visible. As shown by the triangles in Figure 21.3b, all three of these residues (Asp246, Glu295, and Asp320) are now in force constant peaks.
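The domain separation idea described above amounts to restricting the average in the mean distance to residues of the same domain, while leaving the simulated network untouched. A minimal sketch of that restricted average is given below; the function name, the domain labels and the toy coordinates are hypothetical, and how residues are assigned to domains is outside the scope of this example.

```python
import math


def domain_mean_distances(points, domain_of):
    """Mean distance from each residue to the other residues of its own domain.

    points:    list of (x, y, z) tuples in Å, one per residue.
    domain_of: list of domain labels, one per residue (hypothetical input).
    Only the averaging is restricted; the coordinates themselves would still
    come from a simulation of the full elastic network.
    """
    if len(points) != len(domain_of):
        raise ValueError("one domain label per residue is required")
    result = []
    for i, p in enumerate(points):
        partners = [
            q for j, q in enumerate(points)
            if j != i and domain_of[j] == domain_of[i]
        ]
        if not partners:
            raise ValueError(f"residue {i} is alone in its domain")
        result.append(sum(math.dist(p, q) for q in partners) / len(partners))
    return result


# Hypothetical four-residue system split into two domains.
coords = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (10.0, 0.0, 0.0), (14.0, 0.0, 0.0)]
labels = ["N", "N", "C", "C"]
print(domain_mean_distances(coords, labels))
```

Feeding the time series of these restricted mean distances into the fluctuation formula for k_i would then give domain-separated force constants.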
21.3.2 LOCATING ACTIVE SITES
The example of enolase illustrates the general behavior of the mechanical properties of enzymes. In a recent study, we looked at a group of almost 100 enzymes, with proteins belonging to all the main enzymatic families [29]. In the vast majority of the cases studied, the active-site residues of these enzymes, as defined in the Catalytic Site Atlas database [42] or in an earlier elastic network study [43], turned out to be amongst the most strongly fixed residues within the protein structures. During this study, bound ligands or inhibitors were again ignored and the domain separation approach was applied to proteins with nonsymmetric domains and more than one active site. Since the range of force constants varies with the size of each protein (being in general larger for larger proteins), we also normalized their values by converting them to Z-scores, that is, units of standard deviation σ(k) with respect to the mean 〈k〉:

$$ k' = \frac{k - \langle k \rangle}{\sigma(k)} , $$

where both σ(k) and 〈k〉 are calculated protein by protein. Using these values, it turns out that active site residues are generally associated with force constants well above the mean. With a cutoff at k′ = 0, the residues with force constants above the average represent only 28% of the total set (the overall distribution is highly skewed to lower values). This set is very highly enriched in active site residues, containing 78% of all such residues and only 25% of other residues. Consequently, rigidity within the overall protein structure seems to be a good guide to catalytic activity. This is a somewhat surprising result, given that active site residues are generally assumed to be amongst the most flexible, flexibility being necessary for them to carry out their catalytic functions [44]. However, the reverse has already been found by an analysis of temperature factors [43,45,46] and by looking at the residue fluctuations associated with the low-frequency normal modes representing collective motions [43]. These results are in line with our present findings.
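The normalization and the cutoff test described above can be sketched as follows. This is an illustration only: the function names and the example force constants are hypothetical, and the population standard deviation is assumed because the text does not say whether the population or the sample form was used.

```python
import statistics


def z_scores(force_constants):
    """Convert per-residue force constants of ONE protein to Z-scores k'."""
    mean = statistics.fmean(force_constants)
    sigma = statistics.pstdev(force_constants)  # population form (assumption)
    if sigma == 0.0:
        raise ValueError("all force constants are identical")
    return [(k - mean) / sigma for k in force_constants]


def fraction_above(scores, indices, cutoff=0.0):
    """Fraction of the residues listed in `indices` with k' above the cutoff."""
    indices = list(indices)
    if not indices:
        raise ValueError("no residues selected")
    return sum(1 for i in indices if scores[i] > cutoff) / len(indices)


# Hypothetical force constants (kcal mol^-1 Å^-2) for an eight-residue toy protein.
k_values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
scores = z_scores(k_values)

# Suppose, purely for illustration, that residues 6 and 7 form the active site.
active = [6, 7]
others = [i for i in range(len(k_values)) if i not in active]
print(fraction_above(scores, active), fraction_above(scores, others))
```

Comparing the two fractions is the toy analogue of the enrichment quoted in the text; the percentages reported there come from the authors' data set and are not reproduced by this example.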
21.3.3 CONFORMATIONAL VERSUS MECHANICAL CHANGES

We have used the increased resolution of the multipoint Zacharias amino acid representation to study the impact of conformational change within a given protein on its mechanical properties. In our study of hemoproteins [28], we were able to detect differences in the rigidity profile of the active and inactive forms of cytochrome c peroxidase, which correlated well with the known role of the active site residues in this enzyme’s function. Here we compare the flexibility of yeast enolase in its active form, complexed with two magnesium ions, PDB 2AL1, and in an inactive form, complexed with one calcium ion, PDB 5ENL [47]. These two structures are very similar to one another, with an average Cα RMSD of 1.2 Å. The main conformational change involves an important opening movement of the backbone loop between residues 36 and 44. The average Cα RMSD of these amino acids is 7.4 Å. The variations in residue rigidity when changing from the active to the inactive form of enolase are shown in the upper curve of Figure 21.5. Except for His373, all the residues involved in substrate and magnesium binding show a decrease in their force constants, thus suggesting a globally more flexible catalytic site in the inactive form of the protein.
FIGURE 21.5 Changes in the force constants when passing from the active to the inactive form of enolase. Upper curve: Force constants calculated using the Zacharias reduced multipoint amino acid representation. Lower curve: Force constants calculated using a single-point-per-residue representation (with a vertical offset of −120 kcal mol−1 Å−2 for clarity).
It is worth noting that these more detailed studies of protein mechanics require the improved resolution of the multipoint Zacharias representation. This is clearly shown in the lower plot in Figure 21.5, which was obtained using a one-point-per-residue protein representation. As seen, this cruder representation, which ignores the size and conformation of the amino acid side chains, shows little structure and does not single out any particular behavior for the active site or ion-binding residues.
21.3.4 ARCHITECTURAL FINGERPRINTS IN THE FORCE CONSTANT SPECTRA
Most of the proteins studied to date show high force constants for a number of residues other than those in the active sites. In some cases, these residues are simply close to the active site residues and presumably play a role in maintaining its overall rigidity. However, in other cases, the residues are far from the active site. One such example, seen in our study of hemoproteins [28], involved two pairs of highly conserved residues at the junction between two α-helices within proteins of the cytochrome c family. These residues have been identified as playing key roles in the folding of such proteins [48]. Another very preliminary study of cytochrome c (see the supplementary material of Ref. [28]) suggested that there might be some correlation between the folding units (“foldons”) identified by hydrogen exchange experiments [49] and our calculated force constants, the groups of highly rigid residues along the primary sequence being generally associated with early folding units. This suggests that mechanical properties may reflect to some extent the protein-folding pathways. More data are however needed to test this hypothesis. We have also observed that, in some cases, high force constants are a signature of the overall protein structure, as in the case of residues lying within each β-strand within β-barrel domains [28]. This behavior is seen in our enolase test case where the active site is located at the top of an eight-stranded β-barrel. The barrel fold of this protein is reflected in the force constant spectra as the series of peaks starting at residue Asn152. Obviously, much remains to be studied in this area. One possibility is that such “architectural fingerprints” can be defined for each family of protein
folds and then removed from the overall force constant spectra, making it still easier to detect active site residues. Finally, it is also possible to study the buildup of mechanical properties by taking a protein apart at the monomer or domain levels. This is illustrated for our example of enolase in Figure 21.6. In this case we have calculated the change in force constants (after normalization by conversion to Z-scores, see Methods) in passing from a single monomer to the full dimeric structure. Note that here, in contrast to the domain separation technique, we are actually changing the elastic network representation being studied (monomer or dimer). The plot in Figure 21.6 shows that moving from a monomer to a dimer does not simply lead to a general increase in force constants, since both increases and decreases are seen. The locations of the changes are illustrated in Figure 21.4c, where it is observed that force constants understandably increase at the junction between the two monomers but, more surprisingly, decrease in the C-terminal domain, in a region not far from the substrate-binding site. We have seen complex, and not easily predictable, changes such as this in other proteins, as a result of either conformational changes or point mutations.
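The monomer-to-dimer comparison described above can be sketched in a few lines. This is only an illustration under stated assumptions: the Z-score is taken to be the usual (value minus mean) divided by the standard deviation over all residues of one structure, the function names are hypothetical, and the numbers are made up rather than taken from the enolase calculation.

```python
from statistics import fmean, pstdev


def z_scores(force_constants):
    """Normalize per-residue force constants to zero mean and unit standard deviation."""
    mean = fmean(force_constants)
    std = pstdev(force_constants)
    if std == 0.0:
        raise ValueError("all force constants are identical; Z-scores are undefined")
    return [(k - mean) / std for k in force_constants]


def normalized_change(k_monomer, k_dimer):
    """Per-residue change in Z-score on passing from the monomer to the dimer.

    Both lists must describe the same residues in the same order.
    """
    if len(k_monomer) != len(k_dimer):
        raise ValueError("monomer and dimer lists must cover the same residues")
    z_mono = z_scores(k_monomer)
    z_dim = z_scores(k_dimer)
    return [zd - zm for zm, zd in zip(z_mono, z_dim)]


# Illustrative values only (hypothetical, arbitrary units).
k_mono = [10.0, 20.0, 30.0, 40.0]
k_dim = [15.0, 20.0, 25.0, 60.0]
delta_z = normalized_change(k_mono, k_dim)
```

Because each spectrum is normalized separately, the differences necessarily sum to zero, so a dimerization that stiffens some residues in relative terms must show others as relatively softer. This is worth keeping in mind when reading a plot of Z-score changes.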
21.4
CONCLUSIONS
The complex structures of proteins appear to lead to equally complex mechanical properties. The coarse-graining approach described here makes it possible to analyze such properties on a residue-by-residue basis. The results suggest that proteins are very heterogeneous in mechanical terms and that active sites, and possibly other functionally important residues, have unusual properties, generally being associated with above-average force constants. While a single-point-per-residue representation captures the main features of a protein’s mechanical properties, a finer representation, taking side-chain size and orientation into account, is necessary for analyzing the effects of point mutations or small conformational changes. We have also shown that the fluctuations of our mean-distance function observed using BD simulations enable residue force constants to be calculated quickly, while giving results very close to those obtained with all-atom minimization or molecular dynamics approaches. Although more work clearly remains to be done to understand how mechanical heterogeneity is actually generated and how it is related to the structural classes of proteins, this property seems well worth studying in a systematic way.
FIGURE 21.6 Changes in normalized force constants (units of standard deviation) in passing from a single monomer to the dimeric form of enolase.
REFERENCES 1. Kendrew, J. C., G. Bodo, H. M. Dintzis, R. G. Parrish, H. Wyckoff, and D. C. Phillips. 1958. A threedimensional model of the myoglobin molecule obtained by xray analysis. Nature 181:662–66. 2. Perutz, M. F. 1960. Structure of hemoglobin. Brookhaven Symp. Biol. 13:165–83. 3. Bustamante, C. 2004. Of torques, forces and protein machines. Protein Sci. 13:3061–65. 4. Lavery, R., A. Lebrun, J.F. Allemand, D. Bensimon, and V. Croquette. 2002. Structure and mechanics of single biomolecules: Experiment and simulation. J. Phys. Condens. Matter 14:R383–414. 5. Dietz, H., and M. Rief. 2006. Protein structure by mechanical triangulation. Proc. Natl. Acad. Sci. U.S.A. 103:1244–47. 6. Dietz, H., F. Berkemeier, M. Bertz, and M. Rief. 2006. Anisotropic deformation response of single protein molecules. Proc. Natl. Acad. Sci. U.S.A. 103:12724–28. 7. Rueda, M., C. FerrerCosta, T. Meyer, A. Perez, J. Camps, A. Hospital, J. L. Gelpi, and M. Orozco. 2007. A consensus view of protein dynamics. Proc. Natl. Acad. Sci. U.S.A. 104:796–801. 8. Norberg, J., and L. Nilsson. 2003. Advances in biomolecular simulations: Methodology and recent applications. Q. Rev. Biophys. 36:257–306. 9. Karplus, M., and J. Kuriyan. 2005. Molecular dynamics and protein function. Proc. Natl. Acad. Sci. U.S.A. 102:6679–85. 10. Gao, M., D. Craig, O. Lequin, I. D. Campbell, V. Vogel and K. Schulten. 2003. Structure and functional significance of mechanically unfolded fibronectin type III1 intermediates. Proc. Natl. Acad. Sci. U.S.A. 100:14784–89. 11. Gullingsrud, J., D. Kosztin, and K. Schulten. 2001. Structural determinants of MscL gating studied by molecular dynamics. Biophys. J. 80:2074–81. 12. Tirion, M. M. 1996. Large amplitude elastic motions in proteins from a singleparameter, atomic analysis. Phys. Rev. Lett. 77:1905–1908. 13. Tozzini, V. 2005. Coarsegrained models for proteins. Curr. Opin. Struct. Biol. 15:144–50. 14. Chennubhotla, C., A. J. Rader, L. W. Yang, and I. Bahar. 2005. 
Elastic network models for understanding biomolecular machinery: From enzymes to supramolecular assemblies. Phys. Biol. 2:S173–80. 15. Song, G., and R.L. Jernigan. 2007. vGNM: A better model for understanding the dynamics of proteins in crystals. J. Mol. Biol. 369:880–93. 16. Yang, L.W., E. Eyal, C. Chennubhotla, J. G. Jee, A. M. Gronenborn, and I. Bahar. 2007. Insights into equilibrium dynamics of proteins from comparison of NMR and Xray data with computational procedures. Structure 15:741–49. 17. Kondrashov, D. A., Q. A. Cui, and G. N. Phillips Jr. 2006. Optimization and evaluation of a coarsegrained model of protein motion using Xray crystal data. Biophys. J. 91:2760–67. 18. Jacobs, D. J., A. J. Rader, L. A. Kuhn, and M. F. Thorpe. 2001. Protein flexibility predictions using graph theory. Proteins 44:150–65. 19. Kurkcuoglu, O., R. L. Jernigan, and P. Doruker. 2005. Collective dynamics of large proteins from mixed coarsegrained elastic network model. QSAR Comb. Sci. 24:443–48. 20. Aqeel, A., and H. Gohlke. 2006. Multiscale modeling of macromolecular conformational changes combining concepts from rigidity and elastic network theory. Proteins 63:1038–51. 21. Zhao, Y., D. Stoffler, and M. Sanner. 2006. Hierarchical and multiresolution representation of protein flexibility. Bioinformatics 22:2768–74. 22. Cluzel, P., A. Lebrun, C. Heller, R. Lavery, J. L. Viovy, D. Chatenay, and F. Caron. 1996. DNA: An extensible molecule. Science 271:792–94. 23. Allemand, J. F., D. Bensimon, R. Lavery, and V. Croquette. 1998. Stretched and overwound DNA forms a Paulinglike structure with exposed bases. Proc. Natl. Acad. Sci. U.S.A. 95:14152–57. 24. Lebrun, A., Z. Shakked, and R. Lavery. 1997. Local DNA stretching mimics the distortion caused by the TATA boxbinding protein. Proc. Natl. Acad. Sci. U.S.A. 94:2993–98. 25. Paillard, G., and R. Lavery. 2004. Analyzing proteinDNA recognition mechanisms. Structure 12:113–22. 26. Halle, B. 2002. Flexibility and packing in proteins. Proc. 
Natl. Acad. Sci. U.S.A. 99:1274–79. 27. Navizet, I., F. Cailliez, and R. Lavery. 2004. Probing protein mechanics: Residuelevel properties and their use in defining domains. Biophys. J. 87:1426–35. 28. SacquinMora, S., and R. Lavery. 2006. Investigating the local flexibility of functional residues in hemoproteins. Biophys. J. 90:2706–17. 29. SacquinMora, S., E. Laforet, and R. Lavery. 2007. Locating the active sites of enzymes using mechanical properties. Proteins 67:350–59.
30. Glaser, F., R. J. Morris, R. J. Najmanovich, R. A. Laskowski, and J. M. Thornton. 2006. A method for localizing ligand binding pockets in protein structures. Proteins 62:479–88. 31. Soro, S., and A. Tramontano. 2005. The prediction of protein function at CASP6. Proteins 61 (Suppl. 7): 201–13. 32. Wang, J. M., P. Cieplak, and P. A. Kollman. 2000. How well does a restrained electrostatic potential (RESP) model perform in calculating conformational energies of organic and biological molecules? J. Comput. Chem. 21:1049–74. 33. Tsui, V., and D. A. Case. 2000. Molecular dynamics simulations of nucleic acids with a generalized born solvation model. J. Am. Chem. Soc. 122:2489–98. 34. Lavery, R., K. Zakrzewska, and H. Sklenar. 1995. JUMNA (junction minimization of nucleicacids). Comp. Phys. Commun. 91:135–58. 35. Zacharias, M. 2003. Proteinprotein docking with a reduced protein model accounting for sidechain flexibility. Protein Sci. 12:1271–82. 36. Bastard, K., C. Prevost, and M. Zacharias. 2006. Accounting for loop flexibility during protein–protein docking. Proteins 62:956–69. 37. Sims, P. A., A. L. Menefee, T. M. Larsen, S. O. Mansoorabadi, and G. H. Reed. 2006. Structure and catalytic properties of an engineered heterodimer of enolase composed of one active and one inactive subunit. J. Mol. Biol. 355:422–31. 38. Berman, H. M., J. Westbrook, Z. Feng, G. Gilliland, T. N. Bhat, H. Weissig, I. N. Shindyalov, and P. E. Bourne. 2000. The protein data bank. Nucleic Acids Res. 28:235–42. 39. Larsen, T. M., J. E. Wedekind, I. Rayment, and G. H. Reed. 1996. A carboxylate oxygen of the substrate bridges the magnesium ions at the active site of enolase: Structure of the yeast enzyme complexed with the equilibrium mixture of 2phosphoglycerate and phosphoenolpyruvate at 1.8 Å resolution. Biochemistry 35:4349–58. 40. Isin, B., P. Doruker, and I. Bahar. 2002. Functional motions of influenza virus hemagglutinin: A structurebased analytical approach. Biophys. J. 82:569–81. 41. 
Bahar, I., and R. L. Jernigan. 1999. Cooperative fluctuations and subunit communication in tryptophan synthase. Biochemistry 38:3478–90. 42. Porter, C. T., G. J. Bartlett, and J. M. Thornton. 2004. The catalytic site atlas: A resource of catalytic sites and residues identified in enzymes using structural data. Nucleic Acids Res. 32:D129–33. 43. Yang, L. W., and I. Bahar. 2005. Coupling between catalytic site and collective dynamics: A requirement for mechanochemical activity of enzymes. Structure 13:893–904. 44. Daniel, R. M., R. V. Dunn, J. L. Finney, and J. C. Smith. 2003. The role of dynamics in enzyme activity. Annu. Rev. Biophys. Biomol. Struct. 32:69–92. 45. Bartlett, G. J., C. T. Porter, N. Borkakoti, and J. M. Thornton. 2002. Analysis of catalytic residues in enzyme active sites. J. Mol. Biol. 324:105–21. 46. Yuan, Z., J. Zhao, and Z. X. Wang. 2003. Flexibility analysis of enzyme active sites by crystallographic temperature factors. Protein Eng. 16:109–14. 47. Lebioda, L., B. Stec, J. M. Brewer, and E. Tykarska. 1991. Inhibition of enolase: The crystal structures of enolaseCa2 + 2phosphoglycerate and enolaseZn2 + phosphoglycolate complexes at 2.2 Å resolution. Biochemistry 30:2823–27. 48. Ptitsyn, O. B. 1998. Protein folding and protein evolution: Common folding nucleus in different subfamilies of ctype cytochromes? J. Mol. Biol. 278:655–66. 49. Krishna, M. M. G., Y. Lin, L. Mayne, and S. W. Englander. 2003. Intimate view of a kinetic protein folding intermediate: Residueresolved structure, interactions, stability, folding and unfolding rates, homogeneity. J. Mol. Biol. 334:501–13. 50. Humphrey, W., A. Dalke, and K. Schulten. 1996. VMD: Visual molecular dynamics. J. Mol. Graph. 14:33–38, 27–28.
22
Self-Assembly of Surfactants in Bulk Phases and at Interfaces Using Coarse-Grain Models
Wataru Shinoda
Research Institute of Computational Science, National Institute of Advanced Industrial Science and Technology
Russell DeVane and Michael L. Klein
The Laboratory for Research on the Structure of Matter, University of Pennsylvania
CONTENTS
22.1 Introduction
22.2 Coarse-Grained Surfactant Model
22.2.1 Parameter Fitting for Pure Solvents
22.2.2 Parameters for Immiscible Solvents
22.2.3 Parameters for Solutes
22.3 Selected Applications
22.3.1 Lamellar Phase Formation
22.3.2 Monolayer at the Air/Water Interface
22.4 Future Perspectives
22.5 Conclusions
Acknowledgments
References
22.1
INTRODUCTION
The amphiphilic nature of surfactant molecules leads to their aggregation and self-assembly into a variety of morphologies when exposed to solvents. The observed morphology depends on a number of variables including the molecular structure of the specific surfactant, its solvophilicity, the concentration of the surfactant, the solvent properties, and finally the thermodynamic conditions. Understanding such a complex interplay of variables at the atomic level is a natural goal of molecular simulations using high-performance computing resources. However, even with generous access to multi-terascale machines, this goal is particularly challenging due to both the temporal and spatial scales involved. Simply put, the study of surfactant self-assembly is beyond the capabilities of current computational resources if one desires an all-atom representation. To overcome this difficulty, two approaches are commonly adopted: (1) use of enhanced sampling techniques and (2) simplified molecular representation of the surfactant molecules; that is, coarse-graining. With the relentless increase in available computer resources, some of the issues that arise in the investigation of complex phenomena will likely be resolved via currently available and recently enhanced sampling
techniques. However, for the foreseeable future many aspects of the timescale problem are likely to persist and remain beyond the scope of all-atom simulations. Coarse-grain (CG) models reduce computational demand by reducing the number of degrees of freedom for the molecules (i.e., number of atomic sites) that comprise the system of interest. Of course, with this reduction in the description of the system comes a reduction in the level of chemical detail that is retained. An early example of a coarse-grain approach is the molecular dynamics (MD) simulation of the folding of a small protein by Levitt and Warshel [1]. The polymer community has also adopted CG models with considerable success [2,3]. More recently the study of surfactants by Smit and coworkers [4] inspired our first-generation CG model for lipid bilayers [5]. Ultimately, any coarse-grain approach requires a selection of “key” molecular properties or attributes to be retained in advance of model development (parameterization). The inherent limitation of a typical CG model is illustrated by the work of Siepmann et al. [6,7], who presented a new approach to constructing an intermolecular potential, called the TraPPE force field, in which liquid–gas phase equilibrium data were used as a target property to be reproduced by the model. To parameterize and test the force field, simulations were performed using the configurational-bias Monte Carlo technique in the Gibbs ensemble. In a series of alkane models they changed the resolution from a united-atom (UA) to an all-atom (AA) description [6,7]. Both models perfectly reproduced the phase equilibrium diagram. However, a comparison of UA and AA models for several thermodynamic quantities at ambient conditions revealed deficiencies in the UA model. This simply implies that a reduction in the number of degrees of freedom yields a model with less adjustability and consequently a model with a more modest scope of applicability. 
This degradation is inevitable even with minimal coarse-graining, for example, AA to UA, and in general it is impossible to reproduce all of the properties that are obtained from the (original) AA model. Thus, CG models should be designed for a more specific purpose than finer-grained AA models. The primary motivation of CG modeling is a reduction of computational overhead, thus allowing larger system sizes and longer time scales to be accessed and explored. Accordingly, it is necessary to strike a balance between the complexity of the model and the increase in computational efficiency, such that the level of accuracy required to provide insight into the behavior of the system of interest is retained while still providing computational efficiency. That is to say, even with a reduction in the number of degrees of freedom (interaction sites), it is possible to maintain a high level of accuracy in the force field model by using a more complex description of the intermolecular interactions [8]. However, this typically comes at the expense of an increase in computational overhead, which in turn could easily offset all gains made by reducing the description of the system. Thus, if one wishes to reduce the computational cost significantly compared with an AA model, one is forced to focus on a selection of the properties to be reproduced in the CG model while keeping the force field as simple as possible to meet that goal. With this target in mind, the question arises as to what experimental properties should be retained and how to preserve those in the CG model. There is no unique answer, as is evident from the fact that readers will find several different approaches in the other chapters of this book [5,9–16]. Nonetheless, herein we will outline a systematic procedure to build a CG model for surfactant systems. As mentioned above, surfactant solutions exhibit a variety of morphologies depending on the thermodynamic conditions. 
These morphologies are mainly determined by the interfacial properties, so the surface/interfacial tension is one of the key properties that characterize the system. Thus, surface tension and density are used as target properties to fix the parameters for the nonbonded interaction of pure solvents. The specific functional form is selected to refine the structural data and compressibility (of water). Interfacial tension is used to parameterize the interaction between phase-separated fluids, while solvation free energy is employed for the interaction between soluble fluids. The systems used in the parameterization are examined at both the AA and CG levels. By keeping the number of unknown parameters smaller than the number of target properties at each fitting step, it is possible to find suitable parameters straightforwardly and unambiguously. The extensive use of many molecular systems for fitting is essential for a systematic parameterization.
As a result of this parameterization approach, we have several favorable features in the CG model. For example, the model guarantees the correct molecular partitioning and is applicable to systems having an air/solution interface. The former is guaranteed by requiring the model to predict the correct solvation (or transfer) free energy, and the latter is a result of using the surface tension and density as target values in the parameterization. Polyethylene glycol (PEG) surfactant solutions will be presented here to exemplify our strategy to build a CG model.
22.2
COARSEGRAINED SURFACTANT MODEL
The initial step of the approach is to systematically map the system into groups of atoms that will each be represented by a CG site (see Figure 22.1). The atomic groups needed to construct a CG PEG/water system with our level of mapping (roughly three to four heavy atoms with associated hydrogens per CG site) are W, CT, CM, CT2, EO, EOT, and OA, which represent (H2O)3, CH3–CH2–CH2–, –CH2–CH2–CH2–, CH3–CH2–, –CH2–O–CH2–, CH3–O–CH2–, and HO–CH2–, respectively. The CG water, W, is special because the site represents three molecules, while each of the other CG particles corresponds to a segment of a single molecule. With just seven CG sites, there are 28 pair interactions that have to be determined. For ease of implementation, a Lennard–Jones (LJ) function is used for the nonbonded interactions:

$$U_{\mathrm{LJ}}(r_{ij}) = B\,\varepsilon_{ij}\left\{\left(\frac{\sigma_{ij}}{r_{ij}}\right)^{m} - \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{n}\right\}.$$

Several pairs of the repulsive and attractive exponents, m and n, were tested to search for a suitable functional form to give the best structural and thermodynamic properties. Ultimately, the values of (m,n) chosen were (12,4) and (9,6). The choice depends on the type of interaction: the nonbonded interactions involving “W” are modeled with the LJ12-4 function, while all others employ the LJ9-6 functional form. The prefactor B, which is chosen such that $U_{\mathrm{LJ}}(\sigma) = 0$ and $\min(U_{\mathrm{LJ}}) = -\varepsilon$, is given by $3\sqrt{3}/2$ and $27/4$ for LJ12-4 and LJ9-6, respectively. The long-range force is simply truncated at 15 Å, so the cutoff distance should affect the calculated system properties. Note that here we have a nonionic system and use no ionic particles. For ionic systems, it may be necessary to employ alternative methods to handle the long-range interactions.
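As a quick check on the prefactor B quoted above, the following minimal sketch evaluates the m-n Lennard–Jones form numerically. It assumes plain truncation at the 15 Å cutoff with no shift or smoothing; the function names and the ε and σ values in the example are hypothetical and are not parameters from this chapter.

```python
import math


def lj_prefactor(m, n):
    """Prefactor B for U = B*eps*[(sigma/r)**m - (sigma/r)**n] so that min(U) = -eps.

    Setting dU/dr = 0 gives (sigma/r_min)**(m - n) = n/m.
    """
    x = (n / m) ** (1.0 / (m - n))  # x = sigma / r_min
    return 1.0 / (x ** n - x ** m)


def lj_r_min(sigma, m, n):
    """Separation at which the m-n potential reaches its minimum."""
    return sigma * (m / n) ** (1.0 / (m - n))


def lj_energy(r, eps, sigma, m, n, r_cut=15.0):
    """m-n Lennard-Jones energy with simple truncation (assumed) at r_cut.

    r, sigma and r_cut are in Angstrom; the result has the units of eps.
    """
    if r >= r_cut:
        return 0.0
    x = sigma / r
    return lj_prefactor(m, n) * eps * (x ** m - x ** n)


# Number of distinct pair interactions among seven CG site types
# (including like pairs): 7 * 8 / 2.
n_types = 7
n_pairs = n_types * (n_types + 1) // 2

# Hypothetical parameters, used only to exercise the functions.
eps_example, sigma_example = 0.9, 4.4
well_12_4 = lj_energy(lj_r_min(sigma_example, 12, 4), eps_example, sigma_example, 12, 4)
well_9_6 = lj_energy(lj_r_min(sigma_example, 9, 6), eps_example, sigma_example, 9, 6)
```

With these definitions, `lj_prefactor(12, 4)` and `lj_prefactor(9, 6)` reproduce the closed-form values stated in the text, and the energy at the minimum equals the negative of the well depth for either exponent pair.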
FIGURE 22.1 (See color insert following page 238.) Atomistic (a) and coarsegrained (b) representation of C12E2 molecule. The atomic groups, (HO – CH2 –), (– CH2 – O – CH2 –), (– CH2 – CH2 – CH2 –), and (CH3 – CH2 –), are referred to as OA, EO, CM, and CT2 segments, respectively.
For the bonded interactions, we employ simple harmonic potentials for 1-2 bond stretching and 1-2-3 angle bending, given by

$$U_{\mathrm{stretching}}(r_{ij}) = k_{b}\,(r_{ij} - r_{0})^{2}, \qquad U_{\mathrm{bending}}(\theta_{ijk}) = k_{\theta}\,(\theta_{ijk} - \theta_{0})^{2}.$$

Here the force constants and the zero-force distance and angle are fitted to reproduce the corresponding distribution functions from AA-MD trajectories. Although we sometimes find bimodal probability distributions in AA results, the above functions are used and the CG parameters are fit to give the average and dispersion of the AA distribution. In our experience, this simplification does not give significant error in the assembled morphologies. The bonded interactions exceeding three bodies, for example, torsions and dihedrals, are not treated with an internal potential. However, these CG sites do interact via the nonbonded pair potential, with no scaling of the potential introduced. Importantly, a target temperature of 30°C was selected for the parameterization presented herein, although it is possible to select any arbitrary temperature, with the only constraint being that the target molecules be in the liquid state in order to use the condensed phase surface tension and density data. The transferability of the CG model to a different temperature is not expected in principle, though a test with the CG water model showed good transferability with respect to surface tension and density within the liquid temperature range of water [9]. To optimize the CG force field, we have carried out a series of MD simulations for the systems shown below. The methods used for the simulations are briefly summarized here. The CHARMM PARAM27 force field was used for all AA-MD simulations except for the PEG headgroups [17]. The interaction parameters for the PEG headgroup were taken from Ref. [18]. 
The van der Waals interactions were truncated at 12 Å by applying the standard CHARMM smoothing function for the tail region of 10–12 Å, while the Coulomb interaction was calculated using the Ewald or particle mesh Ewald method [19]. The SHAKE/RATTLE (ROLL) method was used to fix the bond lengths involving hydrogen atoms and allowed the use of a 2 fs time step [20]. For CG-MD, two time-step sizes were used to solve the equations of motion by employing the rRESPA algorithm [20,21]: a 10 fs time step was used for updating long-range nonbonded forces (0.6–1.5 nm), and 2 fs was used for updating short-range nonbonded and bonded forces. These time steps can be extended to 40 and 5 fs, respectively, without changing the system properties.
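The statement earlier in this section that the harmonic bonded parameters are fit to the average and dispersion of the AA distribution can be illustrated with a simple estimate. The sketch below is not the authors' fitting procedure: it assumes that the sampled coordinate is Boltzmann distributed in U = k(x − x0)², so that its variance is k_BT/(2k), and it ignores Jacobian factors for distances and angles. The function name and the sample values are hypothetical.

```python
from statistics import fmean, pvariance

KB_KCAL_PER_MOL_K = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def harmonic_from_samples(samples, temperature=303.15):
    """Estimate (k, x0) of U = k*(x - x0)**2 from sampled distances or angles.

    For this form (no factor of 1/2), a Boltzmann-distributed coordinate is
    Gaussian with mean x0 and variance kB*T / (2*k), hence k = kB*T / (2*var).
    Returns k in kcal/mol per squared unit of x, and x0 in the units of x.
    """
    x0 = fmean(samples)
    var = pvariance(samples, mu=x0)
    if var == 0.0:
        raise ValueError("zero variance: cannot estimate a force constant")
    k = KB_KCAL_PER_MOL_K * temperature / (2.0 * var)
    return k, x0


# Made-up CG bond lengths in Angstrom, standing in for distances measured
# between mapped sites along an all-atom trajectory.
bond_samples = [3.4, 3.6, 3.5, 3.7, 3.3]
k_bond, r0_bond = harmonic_from_samples(bond_samples)
```

For a bimodal AA distribution this estimate returns the single Gaussian with the same mean and width, which is the simplification the text describes; a refinement against CG simulations would normally follow.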
22.2.1
PARAMETER FITTING FOR PURE SOLVENTS
For pure solvents, surface tension and density data were used to fix the LJ parameters, σ and ε. These parameters were fit by a trial-and-error approach. To do this efficiently, we employed the following technique. First, a cubic simulation box is prepared with an edge length of approximately 40 Å and the proper target density. To fix the pressure, short NVT-MD runs (typically 100 ps) are performed while adjusting the LJ parameters. After selecting the parameters to give zero pressure, the simulation box is elongated in the z-direction to 400 Å to create a system with a liquid/vacuum interface. Again, NVT-MD simulations are carried out on the elongated box to measure the surface tension, which is calculated by

$$\gamma = \frac{L_{z}}{2}\left\{P_{zz} - \frac{P_{xx} + P_{yy}}{2}\right\}.$$

Here, $L_{z}$ is the box length in the z-direction, the factor of 1/2 is included to account for the two interfaces in the simulation box, and $P_{ij}$ is the ij component of the averaged pressure tensor. To achieve convergence of the surface tension with a precision of 1 dyne/cm, 5–10 ns MD simulations are usually needed. Finally, to confirm the system density, NPT-MD is also carried out for 1 ns on the cubic simulation box.
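The pressure-tensor expression above translates directly into code once the units are fixed. The sketch below assumes time-averaged diagonal pressure components in bar and the box length in Å, which is a common MD output convention but an assumption here; it uses 1 bar·Å = 0.01 dyne/cm. The function name and the numerical inputs are hypothetical and chosen only for illustration.

```python
from statistics import fmean

BAR_ANGSTROM_TO_DYNE_PER_CM = 0.01  # 1 bar * 1 Angstrom = 1e-5 N/m = 0.01 dyne/cm


def surface_tension(p_xx, p_yy, p_zz, box_length_z):
    """Surface tension (dyne/cm) of a slab with two interfaces normal to z.

    p_xx, p_yy, p_zz: time-averaged diagonal pressure-tensor components in bar.
    box_length_z: box length along z in Angstrom.
    """
    anisotropy = p_zz - 0.5 * (p_xx + p_yy)
    return 0.5 * box_length_z * anisotropy * BAR_ANGSTROM_TO_DYNE_PER_CM


# Hypothetical per-frame pressure components (bar); in practice these would be
# averaged over several nanoseconds of NVT-MD on the elongated box.
p_xx_frames = [-36.0, -34.8, -35.4]
p_yy_frames = [-35.0, -35.8, -35.4]
p_zz_frames = [0.5, -0.5, 0.0]

gamma = surface_tension(
    fmean(p_xx_frames), fmean(p_yy_frames), fmean(p_zz_frames), box_length_z=400.0
)
```

Because the result scales with the box length, small statistical errors in the pressure anisotropy are amplified in a 400 Å box, which is consistent with the multi-nanosecond runs the text reports as necessary for a precision of 1 dyne/cm.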
TABLE 22.1 Comparison of CG-MD and Experiments for Surface Tension, γ (dyne/cm), and Density, ρ (g/cm3), at 303 K

System                               Molecular structure  Interaction^a    MD γ   MD ρ    Exp^b γ   Exp^b ρ
-----------------------------------  -------------------  ---------------  -----  ------  --------  -------
Water                                W                    W–W              70.8   0.9949  71.20     0.9957
Hexane                               CT–CT                CT–CT            17.5   0.6498  17.43     0.6518
Nonane                               CT–CM–CT             CM–CT, CM–CM     22.3   0.7129  21.94     0.7114
Dodecane                             CT–(CM)2–CT                           24.5   0.7422  24.48     0.7415
Pentadecane                          CT–(CM)3–CT                           25.9   0.7603  26.23     0.7616
Octadecane                           CT–(CM)4–CT                           27.6   0.7726  27.53     0.7722
Heptane                              CT2–CM–CT2           CT2–CT2, CT2–CM  19.4   0.6791  19.27     0.6773
Decane                               CT2–(CM)2–CT2                         22.3   0.7239  22.92     0.7247
Dimethoxyethane                      EOT–EOT              EOT–EOT          19.9   0.8617  19.45     0.8593
Diethylene glycol dimethyl ether     EOT–EO–EOT           EOT–EO, EO–EO    25.6   0.9374  28.60     0.9372
Triethylene glycol dimethyl ether    EOT–(EO)2–EOT                         29.8   0.9804  27.83     0.9735
Tetraethylene glycol dimethyl ether  EOT–(EO)3–EOT                         31.7   1.0060  32.88     1.0010
Ethylene glycol                      OA–OA                OA–OA            50.2   1.1060  49.01     1.1070
Diethylene glycol                    OA–EO–OA             EO–OA            44.8   1.0990  48.86     1.1100
Triethylene glycol                   OA–(EO)2–OA                           45.2   1.1150  45.80     1.1180
Tetraethylene glycol                 OA–(EO)3–OA                           45.0   1.1200  43.53     1.1170
Diethylene glycol di-n-butyl ether   CT–(EO)3–CT          EO–CT            26.3   0.8767  26.07     0.8774
Dipropylether                        CT2–EO–CT2           EO–CT2           19.1   0.7379  19.46     0.7366
Di-n-hexylether                      CT2–CM–EO–CM–CT2     EO–CM            24.9   0.7858  24.91     0.7860
1-Propanol                           CT2–OA               CT2–OA           22.8   0.7943  23.80     0.7950
1-Hexanol                            CT2–CM–OA            CM–OA            25.7   0.8121  25.48     0.8123

a Interaction column gives the CG particle pair parameterized using the system. A blank entry indicates that no new pair was parameterized with that system.
b Experimental data are taken from Ref. [22].
This approach was used to parameterize pure solvents; that is, water, alkanes, and ethylene glycols, which are listed in Table 22.1. The LJ12-4 function was used for the CG water model in order to maintain a liquid state from 0 to 100°C while simultaneously optimizing the model with respect to compressibility and interfacial properties (at the alkane–water interface) and obtaining the correct transfer free energy of alkane from its bulk to water (see the next subsection). The choice of LJ9-6 for chained molecules was made in order to preserve structural detail as much as possible. Figure 22.2 plots the radial distribution functions for a triethylene glycol dimethyl ether (EOT–EO–EO–EOT) system. Although a slightly higher first peak is observed for EOT–EOT with the CG model, the overall structure agrees reasonably well with the AA results. It is worth reiterating that this agreement is achieved with the simple interaction functions used here, and pointing out that we have only slight degradation of the structural properties when compared to the tabulated potentials based on the inverse Boltzmann method. It should be noted that, as shown in Table 22.1, the model is transferable to chains of various lengths. This is achieved by making use of segments of various lengths; that is, CT, CM, and CT2, that
FIGURE 22.2 Radial distribution functions, g, as a function of r [Å] from AA (all-atom) and CG (coarse-grained) simulations of triethylene glycol dimethyl ether (EOT–EO–EO–EOT), shown for the EO–EO, EOT–EO, and EOT–EOT pairs.
can be assembled into alkanes with a variety of lengths, all of which give reasonable surface tension, density, and pair-distribution functions.
22.2.2
PARAMETERS FOR IMMISCIBLE SOLVENTS
Next we discuss the parameterization of the interaction between the alkane-type CG sites (CT, CM, and CT2) and water. To fix the LJ parameter, ε, between alkane sites and water, the interfacial tension was used as a target property to be reproduced. As for σ, which represents a contact distance between the particles, the arithmetic average of the values for the alkane particle and W was used. The LJ12-4 function was used for water–alkane interactions in order to produce a more attenuated interfacial width. The broadening of the interfacial width is usually observed with CG models simply due to the larger size of the CG particles [12]. Choosing a steeper function (a more strongly repulsive term) gives better agreement in the interfacial width compared with the AA model [9]. Following this parameterization approach, the model was systematically built to have the correct interfacial tension for a series of alkane chains with water (Table 22.2). It should be noted here that our CG model reproduces the experimental transfer free energy accurately. We have carried out a series of steered MD simulations [24], which involve dragging an n-hexane molecule from the bulk n-hexane region to the bulk water region along the interface normal. The free energy cost for the transfer was calculated using Jarzynski’s equality based on 15 sets of steered MD calculations [25]. The transfer free energy is estimated to be ∼8 kcal/mol, which is in good agreement with the experimental value, 7.74 kcal/mol, as shown in Figure 22.3 [26]. We also confirmed the convergence of the free-energy profile by measuring the work with the reverse operation; that is, dragging an n-hexane molecule from the bulk water region to the bulk n-hexane region. We should emphasize that the accurate transfer free energy is not just a coincidence but a result of extensive exploration of suitable interaction functions and parameters.
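Two of the ingredients in this subsection lend themselves to a short sketch: the arithmetic combination rule for σ and the estimate of a free energy from repeated pulling simulations. The second function assumes the standard exponential-average form of Jarzynski's equality, ΔG = −k_BT ln⟨exp(−W/k_BT)⟩, evaluated at the end point of the pulls. The function names, the temperature of 303.15 K, and the work values are assumptions for illustration and are not data from the chapter.

```python
import math
from statistics import fmean

KB_KCAL_PER_MOL_K = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def arithmetic_sigma(sigma_i, sigma_j):
    """Arithmetic-mean combination rule for the LJ contact distance."""
    return 0.5 * (sigma_i + sigma_j)


def jarzynski_free_energy(work_values, temperature=303.15):
    """Free-energy difference (kcal/mol) from nonequilibrium work values (kcal/mol).

    Implements dG = -kB*T * ln( mean( exp(-W / (kB*T)) ) ), shifting by the
    smallest work value so that the exponentials cannot overflow.
    """
    beta = 1.0 / (KB_KCAL_PER_MOL_K * temperature)
    w_min = min(work_values)
    mean_exp = fmean(math.exp(-beta * (w - w_min)) for w in work_values)
    return w_min - math.log(mean_exp) / beta


# Hypothetical total work values (kcal/mol) from repeated steered-MD pulls.
works = [7.5, 8.0, 9.0, 8.4, 7.9]
delta_g = jarzynski_free_energy(works)
```

The exponential average is dominated by the lowest-work trajectories, so the estimate always lies between the minimum and the mean of the sampled work. With only a handful of pulls it can be biased, which is one reason a check against the reverse pulling direction, as described in the text, is useful.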
22.2.3
PARAMETERS FOR SOLUTES
For the interaction between miscible pairs, for example, PEG/water, values for the solvation free energy are used for fitting the LJ parameter, ε. Although the combination rule for σ can be used, a different approach is taken to estimate the σ value in this case because the effective size of a CG site in water will change depending on the hydrophilicity and may be different from that in bulk
TABLE 22.2 Comparison of CG-MD and Experiment for Interfacial Tension (dyne/cm) at the Alkane/Water Interface at 303 K

Mixture            Interaction  MD    Exp^a
-----------------  -----------  ----  -----
Water/hexane       CT–W         50.0  49.96
Water/nonane       CM–W         51.9  51.21
Water/dodecane                  52.9  52.14
Water/pentadecane               52.9  –
Water/heptane      CT2–W        50.1  50.30

a Experimental data are taken from Ref. [23].
FIGURE 22.3 Free-energy profile of an n-hexane molecule across the interface between n-hexane (z > 0 Å) and water (z < 0 Å), plotted as ΔG [kcal/mol] versus z [Å]. The solid line denotes the work needed to drag a hexane molecule from the bulk hexane region to the bulk water region and the dotted line gives the work along the inverse pathway. The experimental value, ΔG_exp = 7.74 kcal/mol (at 25°C), is marked for comparison.
solution. To estimate the effective size of a CG site, we use a potential of mean force (PMF) analysis of an AA-MD trajectory of a single molecule (or molecular fragment) corresponding to the CG site in bulk water. Details of this procedure are given in a previous publication [9]. After σ is fixed, a series of free-energy calculations is needed to find a suitable ε that reproduces the experimental hydration free energy. This approach is generally useful for a variety of molecules as long as experimental hydration free-energy data are available. Thus, a systematic parameterization for a series of CG segments is feasible. We chose the parameters for the OA–W interaction with this protocol using the experimental hydration free energy of ethylene glycol. At this point all parameters have been fixed except for the EO–W interaction. Since no experimental hydration free-energy data are available to parameterize this interaction, structural data for the lamellar phase of the C12E2/water system are used [27]. As mentioned above, σ is estimated from the effective size of the EO segments in water from AA-MD simulations. Since the lamellar spacing and the molecular area of C12E2 are available from X-ray diffraction measurements, ε is fixed using these quantities. A series of NPnAT-MD simulations of the lamellar system at a surfactant composition of 67 wt% was carried out with the cross-sectional area fixed to give the experimental molecular area of 30 Å². With these simulations, ε is fit to give zero surface tension. After the parameterization, 10 ns NPT-MD simulations of the C12E2/water system were performed to assess the membrane properties. The average molecular area agrees with the experimental value, while the lamellar spacing, 48.1 Å, is slightly overestimated compared with experiment (47.3 Å). Figure 22.4 plots the number density for each CG segment along the bilayer normal and
FIGURE 22.4 A snapshot of the C12E2 lamellar system with (a) AA and (b) CG models. Thick lines denote C12E2 molecules with the headgroup in dark gray. Water is depicted by solid lines and white particles in the AA and CG models, respectively. (c) The density profile ρ [Å⁻³] of each component (W, OA, EO, CM, CT2) of the C12E2 lamellar system along the bilayer normal z [Å] is shown for the AA and CG simulations.
compares it with the equivalent measurement from the AA-MD simulations. Considering that no structural details of the surfactant other than the molecular area were used in the parameterization, the agreement is remarkable. Because the AA force field is not guaranteed to predict the correct surface tension, it was necessary to perform the AA simulations in the fixed-area ensemble. For example, TIP3P water, which is the most widely used water model, gives a surface tension of about 52.7 dyne/cm, whereas the experimental value is 72.8 dyne/cm at ambient temperature [28]. In addition, it was reported that the CHARMM force field overestimates the surface tension of the DPPC bilayer system; consequently, a long MD simulation in the NPT ensemble will eventually generate a gel-like bilayer even under liquid-crystal conditions [29]. Thus, surface properties are subtle and can be difficult to reproduce even with a widely adopted AA description of the system. This point highlights the advantage of a model that is parameterized to reproduce experimental properties.
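As an illustration of the quantity being fit, the following minimal sketch evaluates the surface tension of a slab geometry from the diagonal pressure-tensor components using the standard mechanical relation γ = (L_z/n)[P_zz − (P_xx + P_yy)/2]. The function name, the choice of units (bar and Å), and the default of two interfaces per periodic box are assumptions made for this example, not details taken from the chapter.

```python
BAR_ANGSTROM_TO_DYNE_PER_CM = 0.01  # 1 bar * 1 Angstrom = 0.01 dyne/cm


def surface_tension(p_xx, p_yy, p_zz, box_z, n_interfaces=2):
    """Surface tension in dyne/cm for interfaces normal to z.

    p_xx, p_yy, p_zz: time-averaged pressure-tensor components in bar.
    box_z: box length along the interface normal in Angstrom.
    n_interfaces: number of interfaces in the periodic box (assumed 2).
    """
    p_normal = p_zz
    p_lateral = 0.5 * (p_xx + p_yy)
    return (BAR_ANGSTROM_TO_DYNE_PER_CM * box_z
            * (p_normal - p_lateral) / n_interfaces)


# Hypothetical averaged pressures for a 100 Angstrom box.
gamma = surface_tension(p_xx=-99.0, p_yy=-99.0, p_zz=1.0, box_z=100.0)
```

In a fit of the kind described above, ε would be adjusted until this quantity averages to zero at the fixed experimental area.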
22.3
SELECTED APPLICATIONS
22.3.1
LAMELLAR PHASE FORMATION
One of the goals of developing a CG force field is to be able to investigate nonequilibrium molecular processes that take place on time scales not accessible to AA-MD. The self-organization of amphiphiles into a mesostructure is one such example. We demonstrate here an application of our new CG potential to observe the self-assembly of C12E2 molecules in water into the lamellar phase. The initial configuration was made with 1296 C12E2 molecules and 3456 W particles randomly packed into a cubic simulation box with an edge of approximately 100 Å. Lamellar formation occurred in a 10 ns CG-MD run (Figure 22.5). The simulation time does not correspond straightforwardly to physical time, because the coarse-graining procedure yields a much simplified energy surface. Although we do not have a sophisticated measure of “real” time in the CG simulations, a comparison of the diffusion coefficients of surfactant molecules calculated for the CG and AA models suggests that the physical time is longer than the simulation time by at least two orders of magnitude. The initial stage of the structural reorganization was a local rearrangement to reduce the contact area between hydrophilic and hydrophobic components. This process proceeded in a short time period, 1 .

Examples include DNA and NaPSS. A weak polyelectrolyte such as polyacrylic acid has $\ell_B/a < 1$. More care must be taken in the treatment of charged systems, because many of the rules that apply to short-ranged interactions do not apply to Coulomb interactions. For most of the Coulomb pair interactions in Equation 23.1, the energies are less than $k_BT$, but the Coulomb interactions can sum to much more than $k_BT$. Consider a straight configuration of a polyelectrolyte and the monomer in its middle. The Coulomb energy for this monomer is

$$U = 2\sum_{j=1}^{N/2}\frac{e^2}{\varepsilon a j} = 2k_BT\,\frac{\ell_B}{a}\sum_{j=1}^{N/2}\frac{1}{j}, \tag{23.2}$$
which diverges as N → ∞. This divergence shows the long-ranged nature of the Coulomb interaction. The Coulomb energy of the total system is finite, but the order of summation matters, because the sum is conditionally convergent. In physical terms, the nature of the screening by the ions in solution is important. The screening can yield a net interaction that is short ranged and that can be treated in ways similar to other short-ranged interactions. For example, when the Debye–Hückel (DH) approximation is valid, the Yukawa potential can be substituted for the 1/r potential. On the other hand, many of the interesting phenomena occur for strong Coulomb interactions, which can demand explicit long-ranged evaluations. Fortunately, in the last decade the development and availability of fast Coulomb codes has greatly reduced the computational cost of treating such systems, and the long-ranged interaction can be treated at minimal extra cost in most cases. These codes use particle-mesh methods, which are discussed in the Methods section. The starting point for treating the Coulomb interactions in theoretical works is the DH approximation for the electrostatic interactions. Briefly, for a system containing added salt, the approximation is as follows. The Poisson equation in a uniform dielectric with constant ε is

$$\nabla^2\phi = -\frac{4\pi e}{\varepsilon}\rho(\mathbf{r}) = -\frac{4\pi e}{\varepsilon}\sum_{\alpha} z_\alpha\rho_\alpha(\mathbf{r}), \tag{23.3}$$

where φ is the electrostatic potential, $\rho_\alpha$ is the number density of mobile ion species α, and $z_\alpha$ is the valence of species α. Using the Boltzmann distribution for the ion densities, one finds

$$\rho(\mathbf{r}) \approx \sum_{\alpha} z_\alpha\rho_\alpha e^{-z_\alpha e\beta\phi}, \tag{23.4}$$

where $\beta = 1/k_BT$. The Poisson–Boltzmann (PB) approximation is the combination of Equation 23.3 and Equation 23.4 and is a mean-field approximation. The nonlinear PB equation can be solved only for selected geometries such as charged lines and cylinders [Lifson and Katchalsky 1954]. Linearizing Equation 23.4 and using the overall neutrality of the mobile ions, $\sum_\alpha z_\alpha\rho_\alpha = 0$, yields the DH approximation:

$$\nabla^2\phi \approx -\frac{4\pi e}{\varepsilon}\sum_{\alpha} z_\alpha\rho_\alpha\,(1 - z_\alpha e\beta\phi) = \frac{4\pi e^2}{\varepsilon k_BT}\sum_{\alpha} z_\alpha^2\rho_\alpha\,\phi = \kappa^2\phi, \tag{23.5}$$

where the Debye length is

$$\ell_D = \kappa^{-1} = \left(4\pi\ell_B\sum_{\alpha} z_\alpha^2\rho_\alpha\right)^{-1/2}. \tag{23.6}$$

This can be a rather severe approximation, especially for a fully charged polyelectrolyte and for r near the chain. The solution of the DH equation is

$$\phi(r) = \phi_0\,\frac{e^{-r/\ell_D}}{r}, \tag{23.7}$$
which is the screened Coulomb or Yukawa potential. Interactions beyond one or two Debye lengths can be neglected. Manning determined important physical aspects of polyelectrolytes from the solution of the DH equations for the simplest model of a polyelectrolyte, namely a charged line [Manning 1969]. One of the key concepts to arise from these calculations, and from the work of Oosawa (1971), is the idea of counterion condensation. The solution of the DH equations in terms of the Manning parameter
$\xi = \ell_B/a$ has a singularity at ξ = 1. The physical interpretation of the singularity is that for ξ > 1 a sufficient number of counterions condense onto the chain, neutralizing some of the monomer charge and effectively changing a such that the renormalized ξ is 1. Oosawa simultaneously pointed out that for such strong polyelectrolytes there are two types of counterions: free and condensed. The condensed counterions are localized (trapped) near the polymer chain by the strong Coulomb interactions. There is a simple charged model system that has a complete solution, in particular for the strong coupling regime. This system is the one-component plasma (OCP), which consists of charged point particles in a uniform neutralizing background [Brush, Sahlin, and Teller 1966; Stringfellow, DeWitt, and Slattery 1990]. The thermodynamics of the OCP are a function of just one parameter, $\Gamma = \ell_B/a$. Here a is the average spacing between the charged particles, defined in terms of the volume per particle or number density as $V/N = 1/\rho = 4\pi a^3/3$, which is similar to the definition of a for polyelectrolyte chains. Also, Γ is similar to the Manning parameter. The pressure as a function of Γ is shown in Figure 23.1. The plot shows the full OCP pressure [Stringfellow, DeWitt, and Slattery 1990] and the pressure in the PB approximation, which is the ideal gas pressure. This plot provides a basis for understanding some of the most interesting behavior of polyelectrolyte systems. At low Γ the PB approximation is accurate. In this regime entropy dominates the interaction, which is the regime of validity of the PB approximation. As Γ approaches 1, the PB approximation begins to break down and the two pressures diverge from each other. At larger Γ the OCP pressure exhibits some fundamental differences. First, there is a mechanical instability, dP/dV > 0 (negative compressibility), for Γ > 3.09. This is in the regime where Coulomb interactions dominate. The system wants to collapse to reduce the Coulomb energy.
Consider the case of a crystal with the NaCl structure of positive and negative point charges and lattice spacing a. Decreasing a lowers the Coulomb energy, since, among other things, it brings the nearest-neighbor ± pairs closer together. For point particles, decreasing a ultimately reduces the energy to −∞, since there is no steric repulsion to limit the contraction. Thus, as in the plot, the pressure becomes negative at sufficiently large Γ. We will see that these negative pressures do occur in more realistic polyelectrolyte systems. As in the OCP system, their origin is the Coulomb interaction being stronger than entropy. An important point to keep in mind is that the OCP is a fluid in the range shown in Figure 23.1. The solid phase (Wigner crystal) does not form until very large values of Γ (∼170) [Stringfellow, DeWitt, and Slattery 1990]. In the fluid phase, some degree of charge ordering does occur and is related to the structural origins of the instability. However, the degree of ordering is that of a liquid (small peaks in correlation functions) and not a solid (delta function peaks).
FIGURE 23.1 The pressure (solid line) of the one-component plasma (OCP) is given as a function of Γ = ℓ_B/a ~ ξ, the ratio of the Bjerrum length ℓ_B and the average interparticle spacing a. The pressure in the Poisson–Boltzmann (PB) approximation is shown by the dashed line. The square point denotes the instability point where dP/dV = 0. Entropy dominates at small Γ and electrostatics dominates at large Γ. (From Stevens, M.J. and Robbins, M.O., Europhys. Lett., 12, 81, 1990. With permission.)
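The coupling parameter of the OCP follows directly from the number density through the definition V/N = 1/ρ = 4πa³/3. The short sketch below evaluates Γ and compares it with the instability threshold quoted in the text; the function names and the sample inputs are chosen for illustration only.

```python
import math

GAMMA_INSTABILITY = 3.09  # threshold quoted in the text for the OCP


def ocp_spacing(number_density):
    """Average spacing a defined by 1/rho = 4*pi*a**3/3."""
    return (3.0 / (4.0 * math.pi * number_density)) ** (1.0 / 3.0)


def ocp_coupling(bjerrum_length, number_density):
    """Coupling parameter Gamma = l_B / a (same length units for both)."""
    return bjerrum_length / ocp_spacing(number_density)


def beyond_instability(bjerrum_length, number_density):
    """True when Gamma exceeds the quoted instability threshold."""
    return ocp_coupling(bjerrum_length, number_density) > GAMMA_INSTABILITY


# Illustrative inputs in LJ units (l_B in sigma, density in sigma**-3).
gamma_dilute = ocp_coupling(0.83, 0.001)
```

Since a scales as ρ^(-1/3), Γ grows only as the cube root of the density, so large changes in concentration are needed to move between the entropy-dominated and Coulomb-dominated regimes at fixed Bjerrum length.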
One of the hallmarks of polymer theory is scaling theory. In the Flory scaling argument for polyelectrolytes, the free energy F is the sum of the chain entropy and the Coulomb energy,

$$F = k_BT\,\frac{R^2}{Nb^2} + \frac{(efN)^2}{\varepsilon R}. \tag{23.8}$$

Minimizing with respect to the end-to-end distance R yields

$$R \sim Nbf^{2/3}\left(\frac{\ell_B}{b}\right)^{1/3}, \tag{23.9}$$

which gives a Flory exponent of ν = 1 ($R \sim N^{\nu}$). This is very different from neutral polymers, which have ν = 1/2 for ideal chains and ν = 3/5 in a good solvent. More details of scaling theory are given in the references [Odijk 1979; Dobrynin, Colby, and Rubinstein 1995].
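The minimization that leads from Equation 23.8 to Equation 23.9 can be written out explicitly. The worked steps below are a sketch that uses the Bjerrum length in the form $\ell_B = e^2/(\varepsilon k_BT)$, consistent with Equation 23.2; the numerical prefactor $2^{-1/3}$ is dropped in the scaling relation.

```latex
\begin{align}
\frac{F}{k_BT} &= \frac{R^2}{Nb^2} + \frac{\ell_B f^2 N^2}{R},
  \qquad \ell_B = \frac{e^2}{\varepsilon k_BT},\\
\frac{\partial}{\partial R}\,\frac{F}{k_BT}
  &= \frac{2R}{Nb^2} - \frac{\ell_B f^2 N^2}{R^2} = 0
  \quad\Longrightarrow\quad
  R^3 = \tfrac{1}{2}\,\ell_B f^2 N^3 b^2,\\
R &= 2^{-1/3}\, N b f^{2/3}\left(\frac{\ell_B}{b}\right)^{1/3}.
\end{align}
```

The linear dependence on N arises because the Coulomb term grows as N² while the entropic penalty for stretching falls as 1/N.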
23.2
METHODS
Coarse-grained models have been applied to polymers for a long time, especially flexible, neutral polymers [Binder 1995]. The coil diameter of flexible, neutral polymers is of the order of 100 Å, which is much larger than the bond length (∼1–2 Å) or the Kuhn length (∼10 Å), the length over which the intramolecular interactions keep the polymer locally stiff and straight. From the perspective of understanding the conformation of the polymer on the scale of the coil diameter, the local details of the structure on the atomic scale are secondary. The understanding of the physics of neutral polymers has come from realizing that the fundamental conformation is the random walk, and that this conformation determines many of the physical properties. That is, the dependence of the properties (e.g., viscosity as a function of concentration) is primarily due to the coarse-grained, random-walk structure. Many physical properties can be scaled so that plots of data for different polymers coincide, although the absolute magnitudes do depend on the chemical detail. Coarse-grained models are thus inherent in polymer physics, and analytic theories are based on them. Simulations have traditionally been a means of calculating polymer properties without the further approximations that are necessary in most analytic calculations and that often limit their range of validity. This is particularly true for polyelectrolytes. As noted above, the model of polyelectrolytes is an extension of the successful methods used for neutral polymers [Kremer and Grest 1995]. The basic model of the polymer is a bead-spring chain that can treat flexible polyelectrolytes like RNA and NaPSS and semiflexible polyelectrolytes like DNA and actin. The systems studied are composed of $N_p$ bead-chain polymers of N monomers and $N_c$ counterions. All particles are monovalent, and since the system is neutral, the total number of monomers, $N_pN$, equals the number of counterions.
In this chapter the focus is on salt-free solutions, as the extension to include salt is straightforward. Added salt primarily increases the screening without adding additional physical phenomena [Stevens and Plimpton 1998]. The counterion and monomer number densities are the same ($\rho_m = \rho_c$), so we drop the subscript and use ρ for either density. The interaction between beads is the Lennard-Jones (LJ) potential:

$$U_{LJ}(r) = \begin{cases} 4\varepsilon\left[\left(\dfrac{\sigma}{r}\right)^{12} - \left(\dfrac{\sigma}{r}\right)^{6} - \left(\dfrac{\sigma}{r_c}\right)^{12} + \left(\dfrac{\sigma}{r_c}\right)^{6}\right]; & r \le r_c \\ 0; & r > r_c. \end{cases} \tag{23.10}$$
For polyelectrolytes with a good solvent backbone, the cutoff is chosen to be $r_c = 2^{1/6}\sigma$, which yields a purely repulsive interaction. Poor solvent conditions can be treated by including the attractive part of the LJ interaction [Micka, Holm, and Kremer 1999]. However, the interaction of the backbone with water may require treatment with the implicit solvent models used in protein simulations [Reddy and Yethiraj 2006]. The monomers of a chain, each of which represents several atoms, are connected by a ‘bond’ potential. Here, we consider only the case where each monomer is charged; that is, b = a. For work treating poor solvent chains, generally not all the beads are charged (b < a). The attractive part of the bond potential (FENE) is given by

$$U_{FENE} = -\frac{1}{2}k_bR_0^2\ln\left(1 - \frac{r^2}{R_0^2}\right), \tag{23.11}$$
with $k_b$ being the spring constant and $R_0$ the maximum extent of the bond. The FENE bond potential has a singularity at $r = R_0$, which prevents the bond length from becoming larger than $R_0$. The repulsive part of the LJ potential is combined with the FENE potential to yield the total bond potential. A key physical characteristic of polymers is that the chains cannot cross. This requires the bond potential to prevent bonded beads from separating enough to allow chains to cross. The FENE potential inherently achieves this. A harmonic bond potential does not limit the bond length and may be problematic, although in many cases a sufficiently strong harmonic bond potential will work fine. For systems with entanglements that put large local stresses at the point where two chains intersect, one must be more careful, and the FENE bond potential is preferred. From a computational point of view the cost of the FENE potential, while larger than that of a harmonic potential, is negligible overall, since the computational cost is dominated by the nonbonded interactions. In either case, the spring constant is chosen primarily to maintain the polymer connectivity and is much weaker than that of a chemical bond, which allows a timestep equal to that used for the LJ potential. Biopolymers in particular have an intrinsic stiffness due to the intramolecular bonding of the polymer. We modify the angle bending potential in Equation 23.1 by including a quartic term,

$$U_{angle}(\theta) = k_{a2}(\theta - \theta_0)^2 + k_{a4}(\theta - \theta_0)^4, \tag{23.12}$$
where $k_{a2}$ and $k_{a4}$ are the bending constants, θ is the angle between three consecutive monomers on the chain, and $\theta_0$ is the equilibrium angle, which is typically 180°. The quartic term is included in Equation 23.12 to make sharp bends prohibitively expensive [Stevens 2001]. The persistence length $L_p$ is the quantity used to define the choice of the bending constants, since $L_p$ is a measured quantity. The persistence length is conceptually the length over which the chain is straight. For separations $s < L_p$ the tangent vectors of the chain are nearly parallel. For larger separations the tangent vectors become uncorrelated. The definition of $L_p$ is

$$\langle \mathbf{t}(s)\cdot\mathbf{t}(0)\rangle = e^{-s/L_p}, \tag{23.13}$$
where t(s) is the tangent vector at position s along the chain [Doi and Edwards 1986]. As noted above, the Coulomb interactions are long ranged and require special treatment. Not only must all ion pair interactions within the simulation cell be calculated, but also the interactions with the periodic images. This summation is the Ewald sum, which treats the Coulomb energy of a periodic system with boundary dielectric $\varepsilon_m$ at a radius much larger than the cell dimensions [Allen and Tildesley 1987]. The Ewald sum splits the calculation into two parts, a real space sum and a reciprocal space sum, such that each part converges rapidly. A parameter G is used to control the
convergence, or equivalently, the number of terms in the two sums required to achieve the desired accuracy. The real space sum for a system of $N_{tot}$ total charged particles of valence $z_i$ is

$$U_r = \sum_{i=1}^{N_{tot}-1}\sum_{j>i}^{N_{tot}} \frac{z_iz_j}{r_{ij}}\,\mathrm{erfc}\left(\frac{Gr_{ij}}{\sqrt{2}}\right) - \frac{G}{\sqrt{2\pi}}\sum_{i=1}^{N_{tot}} z_i^2 + \frac{2\pi}{(1+2\varepsilon_m)V}\,M^2, \tag{23.14}$$
where M is the simulation cell’s total dipole moment, and the sums involve only particles within the simulation cell, not image particles (the full Ewald sum includes these). The volume V is the simulation cell volume. The complementary error function limits the range of the first term. The last two terms are generally not used in simulations. The second term is a constant and can thus be neglected. In most cases the system has no net dipole moment and the last term is zero. However, this may not always be the case. An interesting example of the system dipole moment being relevant occurs in some dipolar systems [Wei and Patey 1992]. The reciprocal space contribution to the total system energy is

$$U_k = \frac{1}{\pi V}\sum_{i=1}^{N_{tot}-1}\sum_{j>i}^{N_{tot}}\sum_{\mathbf{k}\neq 0} z_iz_j\,\frac{4\pi^2}{k^2}\exp\left(-\frac{k^2}{2G^2}\right)\cos(\mathbf{k}\cdot\mathbf{r}_{ij}), \tag{23.15}$$
where the sum over k is over the reciprocal lattice vectors of the simulation cell lattice. The exponential limits the range of this sum in k-space. The double sum over i and j can be made into a single sum using trigonometric sum rules. However, this applies to the total energy and not to the forces on individual particles. Using cutoff methods instead of variants of the Ewald sum to evaluate the Coulomb interaction can lead to gross errors in some cases. For example, the solidification of the OCP is off by an order of magnitude when calculated using the minimum image cutoff [Brush, Sahlin, and Teller 1966]. One of the subtle aspects of the Coulomb interaction is that the energy can often be calculated relatively accurately, and sometimes even radial distribution functions are not so bad, but the orientational correlation functions are poor [Schreiber and Steinhauser 1992]. Given the speed of present particle-mesh algorithms, it is best to use them and to know that the calculation (with the right parameters) is accurate. The basic idea of fast particle-mesh calculations of the Coulomb interactions is to calculate the k-space sums using fast Fourier transforms (FFTs). The great advantage is that the algorithm scales as N log N, where N is the number of charged particles in the simulation. In addition, the algorithm is parallelizable [Plimpton, Pollock, and Stevens 1997]. Furthermore, the crossover above which the particle-mesh algorithms are faster than standard Ewald methods occurs at a small number of particles (∼100). The basic algorithm is to interpolate the charges onto a 3D mesh; solve Poisson’s equation on the mesh using FFTs; and interpolate the electric fields back to the atoms, from which the forces are calculated. There are various particle-mesh methods available [Hockney and Eastwood 1988; Darden, York, and Pederson 1993; Pollock and Glosli 1996; Deserno and Holm 1998].
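The splitting that underlies the Ewald sum can be illustrated for a single pair. The sketch below is not a particle-mesh implementation; it only shows, under the convention erfc(G r/√2) used in the real space sum above, how 1/r separates into a rapidly decaying short-ranged part and a smooth long-ranged remainder. The function name and the value G = 1 (in inverse length units) are assumptions made for this example.

```python
import math


def ewald_split(r, g):
    """Split 1/r into short- and long-ranged parts.

    short = erfc(G r / sqrt(2)) / r   (summed directly in real space)
    long  = erf(G r / sqrt(2)) / r    (smooth; handled in reciprocal space)
    """
    x = g * r / math.sqrt(2.0)
    short_part = math.erfc(x) / r
    long_part = math.erf(x) / r
    return short_part, long_part


# The two parts always add up to the bare Coulomb kernel 1/r.
short_near, long_near = ewald_split(0.5, 1.0)
short_far, long_far = ewald_split(5.0, 1.0)
```

A larger G makes the real space part decay faster, at the price of requiring more reciprocal lattice vectors, which is the trade-off that the parameter G controls.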
The advantages of particle-mesh methods, both computational and physical, are so significant that standard cutoff methods are not worth considering. A discussion of the parallel implementation of particle-mesh methods can be found in the references [Plimpton 1995; Pollock and Glosli 1996; Plimpton, Pollock, and Stevens 1997]. In addition, the LAMMPS molecular dynamics code is open source and available online [Plimpton]. A comment concerning the nature of dielectric screening is worthwhile, since the issue comes up particularly when comparing with analytic calculations. The models discussed here treat the solvent, typically water, as a uniform dielectric medium. The temperature dependence of this approximation is subtle and often neglected. The dielectric constant is temperature dependent. In thermodynamics the relevant coupling parameter is $\ell_B$, because the Boltzmann weighting involves $U/k_BT$. However, for water the temperature dependence of the Bjerrum length is small (15%) over the range from 0 to 100°C, because the temperature dependence of ε is largely canceled by the $k_BT$ in $\ell_B$.
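The weak temperature dependence of the Bjerrum length can be checked with a few lines of arithmetic. The sketch below uses the SI form ℓ_B = e²/(4πε₀ε k_BT), which is equivalent to the Gaussian-unit expression used in this chapter. The dielectric constants of water are approximate literature values inserted for illustration and are an assumption of this example, not data from the chapter.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge in C
EPS0 = 8.8541878128e-12      # vacuum permittivity in F/m
K_B = 1.380649e-23           # Boltzmann constant in J/K


def bjerrum_length_angstrom(temperature_k, rel_permittivity):
    """Bjerrum length l_B = e^2 / (4 pi eps0 eps k_B T), returned in Angstrom."""
    l_b_m = E_CHARGE ** 2 / (4.0 * math.pi * EPS0 * rel_permittivity
                             * K_B * temperature_k)
    return l_b_m * 1.0e10


# Approximate dielectric constants of water (assumed values for illustration).
WATER_EPS = {273.15: 87.9, 298.15: 78.4, 373.15: 55.5}

lengths = {t: bjerrum_length_angstrom(t, eps) for t, eps in WATER_EPS.items()}
relative_change = lengths[373.15] / lengths[273.15] - 1.0
```

With these inputs the room-temperature value comes out close to the 7.1 Å quoted for water in this chapter, and the change between 0 and 100°C is of the order of the 15% stated above, even though ε itself changes by more than a third.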
As an example of mapping the coarse-grained model to a real polyelectrolyte, we consider NaPSS. In NaPSS, every other carbon atom in the backbone has a sulfonate group, which is typically charged. While not all the sulfonate groups are charged, to a good approximation we can consider them as charged. The distance between charges along the backbone is then a = 2.5 Å. Using the FENE bond potential with $k_b = 7\varepsilon/\sigma^2$ and $R_0 = 2\sigma$, the average bond length is 1.1σ. Equating the values of a defines the LJ unit as σ = 2.2 Å. We would then also have $\ell_B$ = 7.1 Å = 3.2σ. To treat a polyelectrolyte with a fraction f of charged monomers, we equate a/f = 1.1σ. For NaPSS, f = 0.29 yields σ = 8.6 Å and $\ell_B$ = 0.83σ.
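The bond potential described in this section, the purely repulsive LJ term of Equation 23.10 plus the FENE term of Equation 23.11, can be evaluated directly. The following sketch works in LJ units (ε = σ = 1) with the quoted parameters $k_b = 7\varepsilon/\sigma^2$ and $R_0 = 2\sigma$, and estimates the bond length of an isolated bond at $k_BT = 1.2\varepsilon$ from a Boltzmann average with the three-dimensional r² weight. Treating the bond as isolated (no Coulomb repulsion, no neighboring beads) is an assumption of this example, so the result is only expected to be close to the average quoted in the text.

```python
import math

EPS, SIGMA = 1.0, 1.0              # LJ units
K_BOND, R0, KT = 7.0, 2.0, 1.2     # k_b, R_0 and k_B T from the text
R_CUT = 2.0 ** (1.0 / 6.0) * SIGMA


def u_lj(r, r_cut=R_CUT):
    """Shifted, truncated LJ potential of Equation 23.10."""
    if r > r_cut:
        return 0.0
    sr6 = (SIGMA / r) ** 6
    sc6 = (SIGMA / r_cut) ** 6
    return 4.0 * EPS * (sr6 * sr6 - sr6 - sc6 * sc6 + sc6)


def u_fene(r):
    """Attractive FENE term of Equation 23.11; diverges as r -> R0."""
    return -0.5 * K_BOND * R0 ** 2 * math.log(1.0 - (r / R0) ** 2)


def u_bond(r):
    return u_lj(r) + u_fene(r)


def bond_length_statistics(n_grid=20000, r_lo=0.7):
    """Return (minimum of u_bond, Boltzmann-averaged bond length)."""
    step = (R0 - r_lo) / n_grid
    radii = [r_lo + (i + 0.5) * step for i in range(n_grid)]  # midpoints < R0
    energies = [u_bond(r) for r in radii]
    u_min = min(energies)
    r_min = radii[energies.index(u_min)]
    # Weight r^2 exp(-U/kT): isolated bond in three dimensions (assumption).
    weights = [r * r * math.exp(-(u - u_min) / KT)
               for r, u in zip(radii, energies)]
    mean_r = sum(r * w for r, w in zip(radii, weights)) / sum(weights)
    return r_min, mean_r


r_min, mean_r = bond_length_statistics()
```

The thermal average lies above the potential minimum because the FENE attraction rises much more gently than the LJ repulsion, so the distribution of bond lengths is skewed toward longer bonds.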
23.2.1
DYNAMICS
The dynamics of the system are performed at constant temperature T = 1.2ε using the Langevin thermostat [Schneider and Stoll 1978]. The equations of motion with random noise term W are

$$m\ddot{\mathbf{r}}_i = \mathbf{F}_i - m\Gamma\dot{\mathbf{r}}_i + \mathbf{W}_i(t), \tag{23.16}$$

where $\mathbf{r}_i$ and $\mathbf{F}_i$ are the ith particle’s position and force, respectively, and Γ is the damping constant, such that

$$\langle \mathbf{W}_i(t)\cdot\mathbf{W}_j(t')\rangle = 6k_BTm\Gamma\,\delta_{ij}\,\delta(t - t'). \tag{23.17}$$
The two terms added to Newton’s equation couple the system to a heat bath that maintains a constant average temperature. To thermostat the polymer beads we use Γ = 1τ⁻¹. The timestep is 0.015τ. Typically about 3 × 10⁵ timesteps are used for N = 32 systems and 8 × 10⁵ timesteps for N = 64. However, some circumstances, such as multivalent ions, require longer simulations [Stevens 2001].
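How the friction and noise terms fix the temperature can be seen in a minimal sketch. The code below applies a simple Euler–Maruyama update to independent velocity components of free particles (F = 0) with the parameters quoted above; it is an illustrative scheme chosen for brevity, not the integrator used in the simulations described here. The per-component noise variance 2k_BT mΓ follows from the factor 6 in Equation 23.17 being shared among three Cartesian components. All function names and sample sizes are assumptions of this example.

```python
import math
import random


def langevin_velocity_step(v, force, mass, gamma, kt, dt, rng):
    """One Euler-Maruyama update of a single velocity component."""
    # Random kick: force-noise variance 2*kT*m*gamma*dt, divided by m**2.
    kick = math.sqrt(2.0 * kt * gamma * dt / mass) * rng.gauss(0.0, 1.0)
    return v + dt * (force / mass - gamma * v) + kick


def kinetic_temperature(n_components=600, n_equil=400, n_sample=600,
                        mass=1.0, gamma=1.0, kt=1.2, dt=0.015, seed=1):
    """Average of m*v**2 per component for free particles (force = 0)."""
    rng = random.Random(seed)
    velocities = [0.0] * n_components
    accumulated = 0.0
    for step in range(n_equil + n_sample):
        velocities = [langevin_velocity_step(v, 0.0, mass, gamma, kt, dt, rng)
                      for v in velocities]
        if step >= n_equil:
            accumulated += sum(mass * v * v for v in velocities) / n_components
    return accumulated / n_sample


measured_kt = kinetic_temperature()
```

Starting from zero velocity, the particles heat up over a time of order 1/Γ and then fluctuate around the target temperature, with a small systematic offset that vanishes as the timestep goes to zero.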
23.3
APPLICATIONS
23.3.1
POLYELECTROLYTES IN SALT-FREE SOLUTION
In polymer physics the structure of a single polymer in the low-density, noninteracting limit is the basis for the more complex calculations of polymer structure as a function of concentration. While this perspective remains true for polyelectrolytes, the dilute limit for polyelectrolytes is not as trivial as for neutral polymers. For neutral polymers, the single-chain structure is the same for all concentrations below the overlap concentration c*, since there are no chain–chain interactions below c*. In polyelectrolyte systems there is an interaction due to the long-ranged Coulomb potential and to the screening by counterions. Thus, to know that one has reached the dilute-limit structure, simulations have to be performed as a function of concentration (even if only one chain is treated in the simulation). For such simulations, we consider flexible polyelectrolyte chains of length N = 16, 32, and 64 in salt-free solution. Since the chains are flexible, there is no angle term in the potential. The interactions are just electrostatic and bond forces. In these simulations all the beads are singly charged and $\ell_B$ = 0.83σ, which corresponds to a = 8.6 Å. To characterize the structure of the single chain, we calculate the ratio $r = R^2/R_G^2$, where R is the average end-to-end distance of the polyelectrolyte chain and $R_G$ is the average radius of gyration. For a rod the ratio is 12 and for an ideal chain the ratio is 6. Thus, this ratio encompasses the two limits one expects for polyelectrolytes. Figure 23.2 shows the plot of r as a function of monomer density and chain length. At low densities the value of r obtains a limiting value that depends on the
FIGURE 23.2 The ratio r = R²/R_G² is plotted as a function of the monomer density (in units of σ⁻³) for salt-free solutions at chain lengths of N = 16 (circles), 32 (squares), 64 (diamonds), and 128 (triangles). The arrows denote the overlap density for N = 16, 32, and 64 going from high to low density. The straight line is a guide to the eye for the part of the curves that is independent of N. Inset: The ratio r is plotted as a function of Bjerrum length ℓ_B (in units of σ) for N = 32 at ρ = 0.001σ⁻³. (Modified from Stevens, M.J. and Kremer, K., J. Chem. Phys., 103, 1669, 1995. With permission.)
chain length. The increase in r with chain length is consistent with the Coulomb interaction being long ranged and longer chains having a larger net Coulomb repulsion among the monomers. The low-density limit of r is much greater than the neutral chain values, yet all the chain lengths have r below the ideal rod limit. There are still fluctuations within the chain structure; the chain entropy is not zero [Stevens and Kremer 1995]. Obtaining the r = 12 rodlike structure requires much larger N and lower densities; the rodlike limit is a double limit in N and ρ. The overlap densities ρ* for N = 16, 32, and 64 are marked by arrows in the figure. For ρ < ρ*, the value of r is still increasing. As noted above, this behavior is different from that of neutral polymers and from early theoretical work [de Gennes et al. 1976]. The screening of the monomer repulsion by the counterions is substantial at concentrations near ρ*. As the density decreases, this screening decreases and r increases. The saturation limit occurs when the local concentration of counterions becomes negligible. The density at which a single counterion occupies the volume of the chain, assuming a uniform counterion density, is a good approximation for the saturation density. Experimentally, the single-chain structure factor is the measurable quantity. For polyelectrolytes, measuring the dilute-limit structure factor is very difficult for reasons apparent from the discussion of Figure 23.2. Obtaining a structure factor that is independent of concentration requires going to very low concentrations, orders of magnitude lower than for neutral polymers, which greatly reduces the signal. This is a case where the simulations are much easier to perform than the experiments. The single-chain structure factor is

$$S(q) = \frac{1}{N}\left\langle\left|\sum_{j=1}^{N}\exp(i\mathbf{q}\cdot\mathbf{r}_j)\right|^2\right\rangle, \tag{23.18}$$
where the normalization is S(0) = N. The spherically averaged quantity S(q) is calculated for 2π/R < q < 2π/b. This range of q corresponds to structure on length scales between the bond length b and the end-to-end distance R. The concentration dependence is in the slope in the range −1 < log qσ < 0, which corresponds to lengths b < r < L. The slope is related to the Flory exponent ν, which defines the scaling relations $R \sim N^{\nu}$ and $S(q) \sim q^{-1/\nu}$. For the lowest densities the ν is
near 1.0, which is the rodlike limit [Stevens and Kremer 1993, 1995]. A finer examination gives ν = 0.93, which is consistent with the data for the ratio r. As the density increases, the screening of the monomer repulsion increases and consequently ν decreases, reaching the neutral value of 3/5 at the highest densities. Thus, the single-chain polyelectrolyte structure as a function of concentration spans the range of conformations from almost rodlike to self-avoiding random walk. Validation of the simulation results by comparison with experimental data is essential. There were two measurable quantities that could be compared with the simulation data at the time of the original work. The osmotic pressure of the polyelectrolyte solution shows a density dependence. The data from several groups are presented in a paper by Wang and Bloomfield (1990). They found that the osmotic pressure scales with concentration as $P \sim c^{\alpha}$, with α ≈ 1 at low concentrations and α = 9/4 at high concentrations, which is the neutral limit. The simulations reproduced these measured results and provided a refinement due to the strength of the Coulomb interactions [Stevens and Kremer 1995]. The other measured quantity is the peak in the monomer–monomer structure factor. While the single-chain structure factor is difficult to measure, the total monomer–monomer structure factor is measurable and also shows a density dependence. The position of the peak in the structure factor scales as $c^{1/3}$ at low concentrations and has a chain-length-dependent crossover to $c^{1/2}$ at high concentrations [Kaji et al. 1988]. This result was reproduced by the simulations [Stevens and Kremer 1995]. In addition, the relation between the crossover point and the overlap concentration could be directly calculated.
23.3.1.1
Counterion Condensation and the Strength of Coulomb Interactions
Counterion condensation is an important physical characteristic of polyelectrolyte systems. As noted above, for strong polyelectrolytes the total charge that resides on the chain is so large that some counterions are captured by the chain, much as a nucleus binds electrons. As a result, the effective net charge of the chain is reduced by the condensed counterions. The Debye–Hückel approximation breaks down when the interactions are strong enough to yield counterion condensation. With simulations we can perform calculations without making this approximation. A result that is very indicative of the nature of charged interactions and has broad implications is the connection between counterion condensation and chain conformation at dilute concentration as a function of B/a [Gonzales-Mozuelos and de la Cruz 1995; Stevens and Kremer 1995]. The inset of Figure 23.2 shows the ratio r calculated for N = 32 at ρ = 0.001σ^−3 for varying values of B [Stevens and Kremer 1995]. In the main plot of Figure 23.2, B is 0.83σ. As B decreases from this value, r decreases, as expected: the reduced Coulomb repulsion between the monomers yields a more coiled structure. The ratio r goes toward the correct neutral limit (∼6.3) as B → 0. Very interesting behavior occurs for B > 1σ. Instead of continuing to increase with larger B and stronger Coulomb interactions, r decreases. The reason lies in the strong interaction regime of the OCP pressure discussed above. For B > 1σ the Coulomb interaction begins to dominate the entropy. Thus, the counterion attraction to the polyelectrolyte chains becomes strong enough that counterions are captured by the chain, and the number of condensed counterions increases with B. Figure 23.3 shows a set of single-chain images with counterions within 2σ. The individual chains were chosen such that the eigenvalues of their radius of gyration tensor match the average values of the simulation for the given B.
The chains are shown oriented such that the largest eigenvector of the R_G tensor is along the width of the page and the second largest is along the height of the page. The set of images shows the variation in the size of the average configuration and the increase of counterion condensation as a function of B. The dual nature of the Coulomb interactions is evident from these images. While the Coulomb repulsion between monomers on the chain favors a more rod-like conformation, the attraction of the counterions screens the monomer repulsion and shrinks the chain. In fact, for the largest B in the inset figure, the chain size is smaller than the neutral chain size (r < 6).
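The analysis behind images of this kind can be sketched in a few lines. The following is a minimal illustration, not the code used in the original work: it builds the radius of gyration tensor of one chain, rotates the chain and its counterions onto the principal axes (largest eigenvalue along x, second largest along y), and flags counterions that lie within a cutoff distance of any monomer. The function names are hypothetical, beads are assumed to have equal mass, and coordinates are assumed to be unwrapped (no periodic images).

```python
import numpy as np

def gyration_tensor(chain_xyz):
    """3x3 radius-of-gyration tensor of one chain (equal-mass beads assumed)."""
    centered = chain_xyz - chain_xyz.mean(axis=0)
    return centered.T @ centered / len(chain_xyz)

def orient_to_principal_axes(chain_xyz, ions_xyz):
    """Rotate chain and ions so the largest principal axis is x, the second y."""
    center = chain_xyz.mean(axis=0)
    evals, evecs = np.linalg.eigh(gyration_tensor(chain_xyz))  # ascending order
    order = np.argsort(evals)[::-1]           # re-sort to descending eigenvalues
    axes = evecs[:, order]                    # columns are the principal axes
    if np.linalg.det(axes) < 0:               # keep a proper rotation, not a mirror
        axes[:, 2] *= -1.0
    return (chain_xyz - center) @ axes, (ions_xyz - center) @ axes, evals[order]

def condensed_mask(chain_xyz, ions_xyz, cutoff):
    """True for each ion whose nearest monomer is within `cutoff`."""
    diff = ions_xyz[:, None, :] - chain_xyz[None, :, :]
    nearest = np.linalg.norm(diff, axis=-1).min(axis=1)
    return nearest <= cutoff
```

The sum of the three eigenvalues is the squared radius of gyration, so the same tensor also gives the chain size. With a cutoff of 2σ, `condensed_mask(...).sum()` counts the counterions that would be drawn in a figure of this type.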
FIGURE 23.3 Images of single-chain conformations for (top to bottom) B = 0.0, 0.3, 1.0, 5.0, and 10.0σ for the system in Figure 23.2. The chains have been oriented such that the largest eigenvector of R_G is along the width of the page and the second largest eigenvector is along the height of the page. The light gray spheres are polyelectrolyte monomers and the dark gray spheres are counterions within 2σ of the chain.
In the simulations, B was varied by changing the dielectric constant. This is certainly possible, but there is a limit to the physically realizable values. This raises the question of what the relevant range of B/a is (the ratio is the relevant quantity; a was kept fixed in the discussion above). Nature gives us a guide. We have treated monovalent ions in the discussion to this point.
In the expression for the OCP Γ parameter, the valence z enters as z^2. Thus, a trivalent ion can increase the effective value of B/a by nearly a factor of 10 (z^2 = 9)! This brings us to the next section.

23.3.1.2 DNA Condensation

DNA is one of the prototypical polyelectrolytes and is one of the most highly charged polymers, with a charge every 1.7 Å along the axis. Thus, DNA is well into the counterion condensation regime with ξ = 4.2. A fundamental issue is packing DNA into cells. The contour length of the DNA can be larger than the cell. The simplest case is packing of DNA into viral capsids. For example, the λ bacteriophage has a capsid diameter of 60 nm. Its DNA has a contour length of 16 μm and yet is coiled up within the capsid. How is the electrostatic repulsion between the highly charged DNA segments overcome in order to pack the DNA into the capsid? Moreover, like many biopolymers, double-stranded DNA is intrinsically stiff (due to the double-stranded structure), with L_p = 500 Å. Bending DNA must overcome both the Coulomb interaction and the intrinsic mechanical stiffness of the polymer. We saw above that flexible polyelectrolytes can collapse for large values of B/a; equivalently, strong Coulomb interactions can dominate entropic interactions. With respect to DNA, the value of B/a is fixed. Recall from Figure 23.3 that the counterions become more attracted to the polyelectrolyte with increasing B/a, and the strong Coulomb interactions yield a more compact structure as the system tries to attain a charge-ordered structure. The strength of the Coulomb interaction between the DNA and counterions can be increased by increasing the valence of the counterion. In fact, it is well known that DNA will pack into toroidal structures in the presence of counterions with valences z ≥ 3 [Kleinschmidt et al. 1962; Widom and Baldwin 1980]. This effect is purely electrostatic in that it does not depend on the chemical structure of the counterion [Widom and Baldwin 1980].
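The value ξ = 4.2 quoted above can be checked with a short calculation. The sketch below assumes that ξ is the Manning parameter, the ratio of the Bjerrum length to the charge spacing along the chain, and that the solvent is water at 25 °C with a relative permittivity of 78.4; neither the solvent conditions nor the function names come from the text. It also tabulates the z^2 factor by which the counterion valence enters the coupling strength.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
K_B = 1.380649e-23           # Boltzmann constant, J/K

def bjerrum_length_angstrom(rel_permittivity=78.4, temperature=298.15):
    """Distance at which two unit charges interact with energy k_B T (assumed water, 25 C)."""
    length_m = E_CHARGE**2 / (
        4.0 * math.pi * EPS0 * rel_permittivity * K_B * temperature
    )
    return length_m * 1e10

def manning_parameter(charge_spacing_angstrom, **solvent):
    """Bjerrum length divided by the axial distance between charges."""
    return bjerrum_length_angstrom(**solvent) / charge_spacing_angstrom

xi_dna = manning_parameter(1.7)                      # one charge every 1.7 angstroms
coupling_boost = {z: z**2 for z in (1, 2, 3, 4)}     # valence enters as z squared

print(f"Bjerrum length: {bjerrum_length_angstrom():.2f} A, xi(DNA) = {xi_dna:.2f}")
print("relative coupling strength by valence:", coupling_boost)
```

Under these assumptions the Bjerrum length comes out near 7.1 Å, which with a 1.7 Å charge spacing reproduces ξ ≈ 4.2.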
This behavior of DNA packing into condensed structures is called DNA condensation. We can examine DNA condensation with coarse-grained polyelectrolyte simulations using the model described above. We now include the angle potential to produce an intrinsic mechanical stiffness in the polymer. We treat the DNA as a bead-spring polymer (i.e., no double strand) with every bead charged and b = a = 1.7 Å. The persistence length of DNA is prohibitively long to treat even in coarse-grained simulations. However, the issue is what happens to a semiflexible polyelectrolyte with L ≫ L_p ≫ a in the presence of counterions of different valence. We can perform simulations with this constraint. Simulations were performed with N = 256, k_a2 = 5 ε/rad^2, and k_a4 = 200 ε/rad^4. The bead diameters were chosen to be 4 Å, which corresponds to a typical ionic diameter, with σ = 1.5 Å [Stevens 2001]. First, the effect of divalent counterions was examined to see whether condensation can occur. Starting from random conformations, simulations performed with divalent ions do not form any condensed structures. This does not demonstrate that condensation cannot occur with divalent ions, because condensation might still occur if the simulation were run longer; that is, the nucleation event may occur on a time scale longer than that simulated. The result does show that there is a barrier to condensation of the polyelectrolyte. To address this computational issue, simulations were performed starting with initial conformations near the toroid structure to determine whether the structure is stable with divalent ions. The initial polyelectrolyte conformation was a spiral. The counterions were placed on a separate, translated spiral such that they lie between successive arcs of the polymer's spiral. The energy of the single conformation with counterions was calculated for varying spiral radii and pitches.
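A starting state of this kind can be sketched as follows. This is an illustration only and rests on several assumptions not stated in the text: the spiral is interpreted as a helix of constant radius, the counterions sit on a copy of that helix shifted by half a pitch along the axis, and they are spaced evenly along it. The defaults use the values reported for the minimum-energy state (40 beads per turn and a pitch of 2 × 2^(1/6) d) together with b = 1.7 Å and d = 4 Å; the function name and arguments are hypothetical.

```python
import numpy as np

def helix_start(n_beads=256, beads_per_turn=40, bond=1.7, bead_diam=4.0, valence=4):
    """Return (polymer_xyz, counterion_xyz) in angstroms for a helical start state."""
    pitch = 2.0 * 2.0 ** (1.0 / 6.0) * bead_diam
    dphi = 2.0 * np.pi / beads_per_turn       # angle advanced per bead
    rise = pitch / beads_per_turn             # advance along the axis per bead
    # Radius chosen so the straight-line distance between neighbours equals `bond`.
    radius = np.sqrt(bond**2 - rise**2) / (2.0 * np.sin(dphi / 2.0))

    def helix(phi, z_shift):
        return np.column_stack((
            radius * np.cos(phi),
            radius * np.sin(phi),
            pitch * phi / (2.0 * np.pi) + z_shift,
        ))

    polymer = helix(dphi * np.arange(n_beads), 0.0)
    # One counterion per `valence` monomers (overall neutrality), placed on a
    # helix translated by half a pitch so it lies between successive turns.
    n_ions = n_beads // valence
    phi_ion = dphi * valence * (np.arange(n_ions) + 0.5)
    counterions = helix(phi_ion, 0.5 * pitch)
    return polymer, counterions

polymer, counterions = helix_start()
bonds = np.linalg.norm(np.diff(polymer, axis=0), axis=1)
print("bond length:", bonds.mean(), "helix radius:", np.hypot(*polymer[0, :2]))
```

With these defaults the helix radius is about 10.7 Å and each counterion sits half a pitch from the turns on either side. The even spacing is a simplification: for lower valences it places neighbouring counterions closer together than one bead diameter, so a production setup would need a different placement or an energy minimization such as the one described in the text.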
For the above force-field parameters the minimum energy conformation was found and used as the initial state. In this minimum energy state, one turn of the spiral has 40 beads and the pitch is 2 × 2^(1/6) d, where d is the bead diameter. This value of the pitch puts the counterions and charged monomers as close as possible without overlap of the LJ spheres. The spiral structure should be able to evolve easily into a toroidal structure, which is a slightly more condensed structure with charge ordering in three dimensions. Figure 23.4a shows the conformations of the eight polyelectrolytes in the simulations with divalent counterions after 5 × 10^6 timesteps starting from an initial spiral conformation. Clearly,
FIGURE 23.4 Images of N = 256 chains (light gray) with counterions (dark gray) showing chain conformations. Each chain is oriented as in Figure 23.3 and placed on the figure individually. (a) Divalent counterion case showing that chains do not form toroidal structures starting from a spiral initial conformation. (b) Tetravalent counterions form toroidal structures. (Adapted from Stevens, M.J., Biophys. J., 80, 130, 2001. With permission.)
the chains unwind from the spiral structure and the toroidal structure is not stable for the divalent system. Some counterions are delocalized, and as a whole the counterions do not fully screen the monomeric charges. On average, 116 out of 128 counterions per chain condense to within 2d of the polyelectrolytes. Each chain in combination with these counterions therefore has a net negative charge (−256 + 2 × 116 = −24 in units of the elementary charge). The simulations show that this net charge results in a net repulsion within the molecule and an extended structure. For divalent ions, not enough of the counterion and chain entropy can be overcome by Coulomb interactions to yield DNA condensation.

For the same parameter set but with tetravalent counterions, toroidal structures form and are stable. Figure 23.4b shows the eight conformations that evolved to be toroids. Even starting from random polymer conformations, condensed structures form for z = 3 and 4. (Depending on the angle bend potential, kinked rod structures as well as toroids can form [Stevens 2001].) In general, for z = 4, all the counterions condense to the chains. While the counterions are condensed, they still move about in the volume near the polymer. In other words, the counterions are bound to the polyelectrolyte, not to individual monomers. As such, they do not lose all their entropy in becoming condensed. These results show the competition between entropy, particularly of the counterions, and the Coulomb free energy. Condensing the counterions reduces their entropy. This can occur only if the Coulomb free energy of condensing the counterions compensates for the entropy loss.
Thus, the Coulomb coupling strength must be large enough to achieve this compensation. Also, divalent counterions carry a larger entropic cost, since there are more of them than there are counterions of higher valence. In the condensation of single, semiflexible polyelectrolytes such as DNA, this competition requires z ≥ 3, in agreement with experimental data [Bloomfield 1996].

23.3.1.3 Bundle Formation

The competition between entropy and Coulomb interactions is further elucidated by the formation of bundles in stiff polyelectrolytes. DNA condensation is the collapse of a single polyelectrolyte whose length is greater than its persistence length. A set of polyelectrolyte chains can also collapse as a group. Of particular interest is the case when L < L_p. There are a variety of very stiff biopolymers that fall in this class, for example, F-actin, fd virus, and short DNA [Tang and Janmey 1996; Tang et al. 2002]. These polyelectrolytes are also highly charged and will form bundles in the presence of multivalent ions. The basic principle is the same: the Coulomb interaction dominates entropy and the system forms a charge-ordered structure. Simulations use the same basic model as the DNA model above, using just the harmonic angle bending potential. The chains have N = 8–64 monomers. The spring constant k_a = 60ε/σ^2 is large enough to make L_p ≥ L. The system density is chosen below the onset of the liquid crystal phase, but not so dilute that the chains do not interact. The systems start with the chains and counterions randomly placed without overlap. Simulations were performed with monovalent and divalent counterions. Figure 23.5 shows the interchain monomer–monomer radial distribution function g_mm(r) as a function of N for divalent ions and for monovalent ions with N = 32. For the monovalent ions the bundles do not form, which also verifies that the system would not form a liquid crystal phase in the neutral case.
There is a correlation hole in g_mm(r) for the monovalent ions, showing that the monomers on separate chains do not get close and form bundles. In contrast, for z = 2 the correlation function has a peak at r = 2σ for all N, which grows with chain length. The peak occurs at this location because two parallel chains with counterions packed between them have a separation of 2σ. This is an indication of the charge-ordered structure that exists within these Coulomb-dominated systems. The growth in the peak is due to the stronger ordering of the chains within the bundle for larger N. This is in part due to the stronger total electrostatic interactions with the longer chains (with larger total charge).
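An interchain radial distribution function of this kind can be computed from a single configuration roughly as follows. This is a minimal sketch, not the analysis code of the original work: it assumes a cubic periodic box, uses the minimum-image convention, counts only pairs of monomers on different chains, and normalizes by the count expected for uncorrelated particles. The function name and arguments are hypothetical, and `r_max` should not exceed half the box length.

```python
import numpy as np

def interchain_rdf(xyz, chain_id, box, r_max, n_bins=50):
    """Monomer-monomer g(r) restricted to pairs on different chains.

    xyz      : (n, 3) monomer positions
    chain_id : (n,) integer label of the chain each monomer belongs to
    box      : edge length of the cubic periodic box (assumption)
    """
    i, j = np.triu_indices(len(xyz), k=1)        # every pair counted once
    keep = chain_id[i] != chain_id[j]            # drop same-chain pairs
    i, j = i[keep], j[keep]
    diff = xyz[i] - xyz[j]
    diff -= box * np.round(diff / box)           # minimum-image convention
    dist = np.linalg.norm(diff, axis=1)
    counts, edges = np.histogram(dist, bins=n_bins, range=(0.0, r_max))
    shell_volume = 4.0 / 3.0 * np.pi * (edges[1:] ** 3 - edges[:-1] ** 3)
    ideal = len(i) * shell_volume / box**3       # expected counts if uncorrelated
    r = 0.5 * (edges[1:] + edges[:-1])
    return r, counts / ideal

# Usage on uncorrelated random points, for which g(r) should be close to 1.
rng = np.random.default_rng(0)
xyz = rng.uniform(0.0, 10.0, size=(600, 3))
chain_id = np.arange(600) // 8
r, g = interchain_rdf(xyz, chain_id, box=10.0, r_max=4.0, n_bins=20)
```

In practice the histogram would be averaged over many configurations. A correlation hole then appears as g(r) well below 1 at small r, and bundle formation as a peak near the interchain spacing.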
FIGURE 23.5 The monomer–monomer radial distribution function for the stiff polyelectrolytes at ρ = 0.01σ^−3. The solid lines are for divalent counterions. From top to bottom at the peak position r = 2σ, the lines are for N = 64, 32, 16, and 8. The dotted line is for monovalent counterions and N = 32. (From Stevens, M.J., Phys. Rev. Lett., 82, 101, 1999. With permission.)
Examination of the counterion dynamics reveals the connection between the valence necessary to form condensed structures and the competition between entropic and Coulomb interactions. The divalent counterions are localized to the whole bundle. In comparison to DNA condensation, the counterion entropy is larger in the bundle, because the counterions occupy a larger volume. For this reason, only divalent counterions are needed to form bundles. In general, the greater the loss of entropy in the system, the greater the Coulomb strength must be to compensate. DNA condensation is an example of self-attraction within a macromolecule due to charge ordering, and bundle formation is attraction between like-charged macromolecules, again due to charge ordering in strongly coupled Coulomb systems [Stevens and Robbins 1990].
23.3.2 GRAFTED POLYELECTROLYTES
Grafted polyelectrolyte systems are a common application of polyelectrolytes. A main use of synthetic, grafted polyelectrolytes is stabilization of colloidal suspensions [Napper 1983]. A more recent technological example is DNA microarrays, where DNA is grafted to a surface. Recent articles review much of the progress on grafted polyelectrolyte topics [Netz and Andelman 2003; Ruhe et al. 2004; Naji, Seidel, and Netz 2006]. In the last few years, simulations of the basic polyelectrolyte brush systems have been performed [Csajka and Seidel 2000; Hehmeyer and Stevens 2005; Kumar and Seidel 2005, 2007]. The model is an extension of the model described above for polyelectrolytes in solution. For grafted polyelectrolytes one end of each chain is bound to a flat surface. The geometry consists of a system periodic in x and y. The substrate is typically a repulsive wall at z = 0 that is treated as a z-dependent potential,

U_wall(z) = U_LJ(z). (23.19)
Typically the cutoff is chosen so that the interaction is purely repulsive. The polyelectrolyte chains are bound to the surface with an area per chain A. As one of the applications of grafted polyelectrolytes is the repulsion between two surfaces coated with the chains, simulations of two apposed surfaces are of fundamental interest and have been a focus of analytic work [Pincus 1991]. To treat such systems, the basic geometry described above is doubled, with the chains grafted on the inside of opposite walls with separation D. Each wall has N_p = 16 chains arranged in a triangular lattice. In all LJ pair interactions, the bead diameter is set to d = 4 Å, the value used in primitive model electrolytes. The polyelectrolytes are treated as flexible (no angle potential). One of the interesting results is the density profile as a function of surface separation distance D. Figure 23.6 shows the density profile for A = 77.4σ^2, N = 32, and separations defined by δ = D/L, the ratio of the separation to the chain contour length. For separations larger than the contour length, the chain profiles naturally do not overlap. As the separation shrinks, the peak in the density shifts from close to the substrate to the middle of the system. At these short separations, the density of the system is large and the screening is strong. In addition, much as in solution when the concentration goes from well below overlap toward overlap, the polyelectrolyte chains contract as they avoid each other and the counterion screening increases. Overall, the chains have conformations more like neutral coils at small δ.
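The wall potential of Equation 23.19 and a density profile of the type plotted in Figure 23.6 can be sketched as follows. The code assumes that U_LJ is the standard 12-6 Lennard-Jones form, truncated at its minimum z = 2^(1/6)σ and shifted to zero there so that it is purely repulsive; the exact form and cutoff used in the original simulations are not restated here, and all function names are hypothetical.

```python
import numpy as np

def wall_potential(z, epsilon=1.0, sigma=1.0):
    """Purely repulsive 12-6 LJ wall at z = 0 (assumed truncated-and-shifted form).

    Valid for z > 0; returns zero beyond the cutoff at 2**(1/6) * sigma.
    """
    z = np.asarray(z, dtype=float)
    z_cut = 2.0 ** (1.0 / 6.0) * sigma
    sr6 = (sigma / z) ** 6
    u = 4.0 * epsilon * (sr6**2 - sr6) + epsilon   # shifted so u(z_cut) = 0
    return np.where(z < z_cut, u, 0.0)

def two_wall_potential(z, wall_separation, **lj):
    """Two apposed walls at z = 0 and z = D."""
    z = np.asarray(z, dtype=float)
    return wall_potential(z, **lj) + wall_potential(wall_separation - z, **lj)

def density_profile(z, wall_separation, lateral_area, n_bins=50):
    """Number density versus z/D from the z coordinates of one particle species."""
    counts, edges = np.histogram(z, bins=n_bins, range=(0.0, wall_separation))
    bin_volume = lateral_area * wall_separation / n_bins
    centers = 0.5 * (edges[1:] + edges[:-1]) / wall_separation
    return centers, counts / bin_volume
```

For the system described above, the lateral area of the periodic cell would be N_p × A = 16 × 77.4σ^2, and the profile would be averaged over many configurations, separately for monomers and counterions.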
23.4 FUTURE DIRECTIONS

There are many future directions for coarse-grained modeling of polyelectrolytes, because of the variety of polyelectrolytes. Work to date has focused on the simplest cases. Future work will include more complex polyelectrolytes and systems of polyelectrolytes with other molecules. Two examples are given below that will likely have a big impact beyond just the polyelectrolyte field. Recently it has become understood that a large fraction of proteins (30% of the eukaryotic genome) are not natively folded. These “unstructured proteins” are polyampholytes strictly speaking, but
[Plot for Figure 23.6: density ρ (σ^−3), axis range 0 to 0.06, versus z/D, axis range 0 to 1.]
FIGURE 23.6 Density profiles of apposing polyelectrolyte brushes. Monomer density is indicated by a solid line with different point types to indicate the gap width. The series are for gap widths of δ = 1.42 (triangles), 1.14 (circles), 0.85 (diamonds), 0.57 (squares), 0.43 (open circles), and 0.28 (open squares). Each monomer density profile is paired with a counterion density profile that is indicated by a solid line. (From Hehmeyer, O. and Stevens, M.J., J. Chem. Phys., 122, 134909, 2005. With permission.)
they typically have a net charge and can behave on large length scales as polyelectrolytes. In some cases unstructured proteins are only partially unfolded; often the tail segments are unstructured. The unstructured proteins are biologically functional. Many bind to nucleic acids; thus, polyelectrolyte–polyelectrolyte binding occurs in this case. An interesting example of the function of unstructured proteins is the neurofilament fibers (NF) and microtubule-associated proteins (MAPs) [Bright, Woolf, and Hoh 2001; Weathers et al. 2004]. There are three NF polymers, called NFL, NFM, and NFH for low, medium, and high molecular weight. The NFM and NFH have long tails that are unstructured. The NFL, which has the same beginning amino acid sequence as NFM and NFH, forms a coiled-coil structure that binds together into 10 nm wide fibers. The long unstructured tails of the NFM and NFH form a polyelectrolyte brush bound to the fiber. In a related manner, MAPs form brushes in conjunction with microtubules. Part of the MAP is folded and binds to the microtubule. The unstructured part extends from the microtubule and forms the polymer brush. Functionally, these polymer brushes provide mechanical stability for neural axons. The polymer brush cores (coiled coils of NF and the microtubules) are oriented along the axis of the axon tube. The systems of NF and MAP-microtubules form a polymer brush that, much like colloids with grafted polymers, forms a liquid structure [Brown and Hoh 1997; Mukhopadhyay and Hoh 2001]. The grafted polyelectrolyte repulsion provides mechanical stability for the axon. This mechanical structure is quite flexible. A rigid, bonded network does not exist. This is just one example of unstructured proteins possessing interesting physical properties. This class of polymers is very broad and most likely contains a large number of interesting physical phenomena to study.
A strong basis has been provided for coarse-grained modeling by work using support vector machine algorithms to characterize unstructured proteins based on reduced amino acid groups, which found that only four groups are necessary [Weathers et al. 2004].

The combination of polyelectrolytes and other charged molecules is an area of growing interest. An example is the complexes formed by cationic lipid bilayers and DNA as well as other biopolymers. One application of such systems is packing of DNA as a delivery mechanism for gene therapy. The complexes self-assemble to form hierarchical structures. At one level there is the self-assembly of the lipids into a bilayer. The bilayers and the DNA form another level, which can have different structures depending on the lipid compositions. Lamellar and hexagonal phases have been observed [Wong et al. 2000; Liang, Harries, and Wong 2005]. The basic mechanism for forming the complexes is believed to be electrostatic interactions. With the advent of coarse-grained models for lipid
molecules [see Chapters 2 and 3], it is now possible to simulate these complexes. The connection between the fundamental interactions and the complex structure can be investigated in such systems. Because of the richness of possibilities of putting charges on polymers and mixing different polyelectrolytes, lipids, nanoparticles, etc., this is a very exciting research area. Over the last decade, major progress has been made in some of the basic polyelectrolyte systems. Future research is now possible on more complex assemblies that will naturally possess more complex properties.
REFERENCES

Allen, M. P., and D. J. Tildesley. 1987. Computer Simulation of Liquids. New York: Oxford University.
Barrat, J. L., and J. F. Joanny. 1996. Theory of polyelectrolyte solutions. Adv. Chem. Phys. 94:1–66.
Binder, K., ed. 1995. Monte Carlo and Molecular Dynamics Simulations in Polymer Science. New York: Oxford.
Bloomfield, V. A. 1996. DNA condensation. Curr. Op. in Str. Biol. 6:334–41.
Bright, J. N., T. B. Woolf, and J. H. Hoh. 2001. Predicting properties of intrinsically unstructured proteins. Prog. Biophys. Mol. Biol. 76:131–73.
Brown, H. G., and J. H. Hoh. 1997. Entropic exclusion by neurofilament sidearms: A mechanism for maintaining interfilament spacing. Biochemistry 36 (49):15035–40.
Brush, S. G., H. L. Sahlin, and E. Teller. 1966. Monte Carlo study of a one-component plasma. I. J. Chem. Phys. 45:2102.
Csajka, F. S., and C. Seidel. 2000. Strongly charged polyelectrolyte brushes: A molecular dynamics study. Macromolecules 33:2728–39.
Darden, T., D. York, and L. Pederson. 1993. Particle mesh Ewald: An N log(N) method for Ewald sums in large systems. J. Chem. Phys. 98:10089.
de Gennes, P. G. 1979. Scaling Concepts in Polymer Physics. Ithaca, NY: Cornell University.
de Gennes, P. G., P. Pincus, R. M. Valesco, and F. Brochard. 1976. Remarks on polyelectrolyte conformation. J. Physique 37:1461.
Deserno, M., and C. Holm. 1998. How to mesh up Ewald sums. I. A theoretical and numerical comparison of various particle mesh routines. J. Chem. Phys. 109:7678–793.
Dobrynin, A. V., R. H. Colby, and M. Rubinstein. 1995. Scaling theory of polyelectrolyte solutions. Macromolecules 28:1859–71.
Doi, M., and S. F. Edwards. 1986. The Theory of Polymer Dynamics. New York: Oxford University Press.
Gonzales-Mozuelos, P., and M. O. de la Cruz. 1995. Ion condensation in salt-free dilute polyelectrolyte solutions. J. Chem. Phys. 103:3145–57.
Grest, G. S., and K. Kremer. 1986. Molecular dynamics simulation of polymers in the presence of a heat bath. Phys. Rev. A 33:3628–31.
Hehmeyer, O. J., and M. J. Stevens. 2005. Molecular dynamics simulations of grafted polyelectrolytes on two apposing walls. J. Chem. Phys. 122:134909.
Hockney, R. W., and J. W. Eastwood. 1988. Computer Simulation Using Particles. New York: Adam Hilger.
Kaji, K., H. Urakawa, T. Kanaya, and R. Kitamaru. 1988. Phase diagram of polyelectrolyte solutions. J. Physique 49:993.
Kleinschmidt, A. K., D. Lang, D. Jacherts, and R. K. Zahn. 1962. Darstellung und Längenmessungen des gesamten Desoxyribonucleinsäure-Inhaltes von T2-Bakteriophagen. Biochim. Biophys. Acta 61:857–64.
Kremer, K., and G. S. Grest. 1990. Dynamics of entangled linear polymer melts: A molecular-dynamics simulation. J. Chem. Phys. 92:5057–86.
Kremer, K., and G. S. Grest. 1995. Entanglement effects in polymer melts and networks. In Monte Carlo and Molecular Dynamics Simulations in Polymer Science, ed. K. Binder, 194–262. New York: Oxford.
Kumar, N. A., and C. Seidel. 2005. Polyelectrolyte brushes with added salt. Macromolecules 38:9341–50.
Kumar, N. A., and C. Seidel. 2007. Interaction between two polyelectrolyte brushes. Phys. Rev. E 76:020801.
Liang, H., D. Harries, and G. C. L. Wong. 2005. Polymorphism of DNA-anionic liposome complexes reveals hierarchy of ion-mediated interactions. Proc. Natl. Acad. Sci. U.S.A. 102:11173–78.
Lifson, S., and A. Katchalsky. 1954. The electrostatic free energy of polyelectrolyte solutions. I. Fully stretched macromolecules. J. Polym. Sci. 13:43.
Manning, G. 1969. Limiting laws and counterion condensation in polyelectrolyte solutions I. Colligative properties. J. Chem. Phys. 51:924–33.
Micka, U., C. Holm, and K. Kremer. 1999. Strongly charged, flexible polyelectrolytes in poor solvents: Molecular dynamics simulations. Langmuir 15:4033–44.
Mukhopadhyay, R., and J. H. Hoh. 2001. AFM force measurements on microtubule-associated proteins: The projection domain exerts a long-range repulsive force. FEBS Lett. 505 (3):374–78.
Naji, A., C. Seidel, and R. R. Netz. 2006. Theoretical approaches to neutral and charged polymer brushes. Adv. Polym. Sci. 198:149–83.
Napper, H. 1983. Polymeric Stabilisation of Colloidal Dispersions. London: Academic Press.
Netz, R. R., and D. Andelman. 2003. Neutral and charged polymers at interfaces. Phys. Rep. 380:1.
Odijk, T. 1979. Possible scaling relations for semidilute polyelectrolyte solutions. Macromolecules 12:688.
Oosawa, F. 1971. Polyelectrolytes. New York: Marcel Dekker.
Pincus, P. 1991. Colloid stabilization with grafted polyelectrolytes. Macromolecules 24:2912–19.
Plimpton, S. J. n.d. LAMMPS. Code may be downloaded at lammps.sandia.gov.
Plimpton, S. J. 1995. Fast parallel algorithms for short-range molecular dynamics. J. Comput. Phys. 117:1–19.
Plimpton, S. J., E. L. Pollock, and M. Stevens. 1997. Particle-mesh Ewald and rRESPA for parallel molecular dynamics simulations. In Eighth SIAM Conference on Parallel Processing for Scientific Computing, Minneapolis, MN, SIAM.
Pollock, E. L., and J. Glosli. 1996. Comments on P3M, FMM and the Ewald method for large periodic coulombic systems. Comput. Phys. Commun. 95:93.
Reddy, G., and A. Yethiraj. 2006. Implicit and explicit solvent models for the simulations of dilute polymer solutions. Macromolecules 39:8536–42.
Ruhe, J., M. Ballauff, et al. 2004. Polyelectrolyte brushes. In Polyelectrolytes with Defined Molecular Architecture I, ed. M. Schmidt, 79–150. Berlin: Springer.
Schiessel, H. 2003. The physics of chromatin. J. Phys.: Condens. Matter 15 (19):R699–774.
Schneider, T., and E. Stoll. 1978. Molecular dynamics study of a three-dimensional one-component model for distortive transitions. Phys. Rev. B 17:1302–22.
Schreiber, H., and O. Steinhauser. 1992. Molecular dynamics studies of solvated polypeptides: Why the cutoff scheme does not work. Chem. Phys. 168:75–89.
Stevens, M. J. 2001. Simple simulations of DNA condensation. Biophys. J. 80:130–39.
Stevens, M. J., and K. Kremer. 1993. Form factor of salt-free linear polyelectrolytes. Macromolecules 26:4717–19.
Stevens, M. J., and K. Kremer. 1995. The nature of flexible linear polyelectrolytes in salt free solution: A molecular dynamics study. J. Chem. Phys. 103:1669–90.
Stevens, M. J., and S. J. Plimpton. 1998. The effect of added salt on polyelectrolyte structure. Euro. Phys. J. B 2:341.
Stevens, M. J., and M. O. Robbins. 1990. Density functional theory of ionic screening: When do like charges attract. Europhys. Lett. 12:81.
Stringfellow, G. S., H. E. DeWitt, and W. L. Slattery. 1990. Equation of state of the one-component plasma derived from precision Monte Carlo calculations. Phys. Rev. A 41:1105.
Tang, J. X., and P. Janmey. 1996. The polyelectrolyte nature of F-actin and the mechanism of actin bundle formation. J. Biol. Chem. 271:8556–63.
Tang, J. X., P. Janmey, A. Lyubartsev, and L. Nordenskiold. 2002. Metal ion-induced lateral aggregation of filamentous viruses fd and M13. Biophys. J. 83:566–81.
Wang, L., and V. A. Bloomfield. 1990. Osmotic pressure of polyelectrolytes without added salt. Macromolecules 23:804.
Weathers, E. A., M. E. Paulaitis, T. B. Woolf, and J. H. Hoh. 2004. Reduced amino acid alphabet is sufficient to accurately recognize intrinsically disordered protein. FEBS Lett. 576:348–52.
Wei, D., and G. N. Patey. 1992. Ferroelectric liquid-crystal and solid phases formed by strongly interacting dipolar soft spheres. Phys. Rev. A 46:7783–92.
Widom, J., and R. L. Baldwin. 1980. Cation-induced toroidal condensation of DNA. J. Mol. Biol. 144:431.
Williams, L. D. 2000. Electrostatic mechanisms of DNA deformation. Annu. Rev. Biophys. Biomol. Struct. 29:497–521.
Wong, G. C. L., J. X. Tang, A. Lin, Y. Li, P. A. Janmey, and C. R. Safinya. 2000. Hierarchical self-assembly of F-actin and cationic lipid complexes: Stacked three-layer tubule networks. Science 288:2035–39.
24 Monte Carlo Simulations of a Coarse-Grain Model for Block Copolymer Systems

F.A. Detcheverry
Department of Chemical and Biological Engineering, University of Wisconsin-Madison

K.Ch. Daoulas and M. Müller
Institut für Theoretische Physik, Georg-August Universität

P.F. Nealey and J.J. de Pablo
Department of Chemical and Biological Engineering, University of Wisconsin-Madison

CONTENTS
24.1 Introduction
24.2 Method
24.2.1 Model and Coarse-Grain Parameters
24.2.2 Definition of Local Densities
24.2.3 MC Simulations
24.2.4 Choice of Parameters
24.2.5 Stress Tensor and Variable Cell Shape Method
24.2.6 Soft Nanoparticles
24.3 Applications
24.3.1 Equilibrium Morphologies
24.3.2 Qualitative Description of the Dynamics
24.3.3 Nanoparticle-Induced Phase Transition (Soft Nanoparticles)
24.4 Conclusion
Acknowledgments
References
24.1 INTRODUCTION
Polymeric systems are characterized by a wide spectrum of length scales that range from short chemical bonds (Å) to chain dimensions (10 nm) and macroscopic behavior. The corresponding time scales associated with motions on such length scales are even broader; bond vibrations occur on sub-picosecond time scales ($10^{-13}$ s) and, depending on molecular weight, temperature, and density, chain relaxation and morphology formation can occur over seconds, minutes, or hours. Multiple
length scales are inherently linked through the connectivity of the chain molecules, and different levels of description are therefore coupled and cannot be treated independently. Molecular dynamics simulations using atomistic force fields are unable to access the time scales necessary to achieve chain relaxation for polymeric systems of intermediate or high molecular weights. Advanced Monte Carlo (MC) methods have been developed for the equilibration of dense polymeric systems with long chains. Nevertheless, the size of the systems that can be efficiently simulated is still limited by the performance of present-day computers [Binder 1995; Kotelyanskii and Theodorou 2004]. In order to study polymeric systems, particularly their ability to self-assemble over tens or hundreds of nanometers, it is necessary to reduce the number of degrees of freedom. In doing so, it is essential that one preserves a number of relevant key features that give rise to the characteristic behavior on mesoscopic and macroscopic length scales. This coarse-graining procedure leads to a hierarchy in degrees of freedom with increasing length scales: for polymers, we start with atoms, continue with monomers (groups of 10–100 atoms), and then have polymer chains (soft fluid) [Murat and Kremer 1998; Louis et al. 2000; Eurich and Maass 2001; Yatsenko et al. 2004; Pierleoni et al. 2006], or one integrates out the microscopic degrees of freedom of the molecular conformations and describes the system via local, spatially varying densities. The latter are the central object in self-consistent field (SCF) theoretic treatments [Fredrickson 2006]. On the largest length scale, phenomenological treatments of polymeric systems in terms of Ginzburg-Landau functionals utilize an even coarser description that ignores much of the spatially extended molecular architecture and is mainly based upon symmetry considerations.
At each level of coarse-graining some information is irreversibly lost: for instance, in the case of dynamics of polymeric melts, the reptation motion of individual molecules in highly entangled melts can no longer be captured when polymer chains are represented as collections of just a few beads, or as simple ellipsoids that can overlap with each other. There are two broad classes of coarse-grain models that are particularly relevant to our discussion. In the first of these approaches, denoted “systematic coarse-graining,” the model retains, even at the coarse-grain level, the specificity of the polymer under consideration. Such systematically coarse-grained models are useful to relate the macroscopic properties of a polymeric material to the chemical structure of the individual chains. The degrees of freedom are often effective segments composed of a small number of atoms whose characteristics (interaction parameters) are adjusted to match results obtained at a fully atomistic level. A coarse-grain model of that type provides access to properties that arise over longer time and length scales than those accessible to a fully atomistic model. In a process known as fine graining or reverse mapping, the details of the atomistic model can be reintroduced. On the other hand, minimally coarse-grained models only retain relevant features common to a class of systems; they assume that universal properties emerge on large length scales (an example is provided by the Gaussian nature of chain molecules in a polymeric melt). In contrast to “systematically coarse-grained” models, whose predictions are absolute quantities for a specific material, minimally coarse-grained models predict the mesoscopic and large-scale properties of a class of materials.
Their predictions can be quantitatively related to a specific material by matching a small number of coarse-grain parameters or invariants (e.g., the end-to-end distance or the Flory–Huggins parameter) that define the strength of the relevant interactions in the specific material and the minimally coarse-grained model. In this chapter we discuss a minimally coarse-grained model for a polymer melt. It is based on models introduced in the context of SCF theory [Muller and Schmid 2005; Matsen and Schick 1994; Fredrickson 2006; Edwards 1965; Helfand and Tagami 1972; Hong and Noolandi 1981], but it is viewed from a different perspective, since the fundamental degrees of freedom are not the local densities but the positions of polymer segments. This particle-based approach describes chain conformations explicitly. Perhaps more importantly, it facilitates description of complex chain architectures and nonpolymeric objects such as nanoparticles. We do not invoke the saddle-point approximation that is utilized in SCF theory; instead, we study the exact statistical mechanics of the particle-based Hamiltonian via MC simulations. The underlying idea has been explored previously in the context of polymer brushes [Laradji, Guo, and Zuckermann 1994; Soga, Zuckermann, and
Guo 1996; Soga, Guo, and Zuckermann 1995] and in recent studies of polymeric melts [Daoulas and Muller 2006; Kang et al. 2008; Detcheverry et al. 2008]. Here we examine various aspects of the proposed approach that turn out to have important consequences for the results, and we illustrate its promise by presenting applications to mixtures of copolymers and nanoparticles, systems whose description in the context of SCF theory remains particularly challenging. After describing the method and briefly discussing its relation to other approaches, we illustrate a few possible applications, such as the prediction of diblock copolymer morphologies and self-assembly in block copolymer/nanoparticle mixtures. We end with a few concluding remarks regarding the general applicability of the approach outlined in this work.
24.2 METHOD
24.2.1 MODEL AND COARSE-GRAIN PARAMETERS
For simplicity, the model is presented in the context of a diblock copolymer melt; extensions to multicomponent systems, including multiblock systems or copolymer–homopolymer blends, are straightforward. Consider $n$ copolymer molecules in a volume $V$ at temperature $T$. The polymer chains are assumed to be Gaussian and are represented by a bead-spring model. The chain contour is discretized with $N$ beads; $\mathbf{r}_i(s)$ denotes the position of the $s$th bead in the $i$th chain. For an isolated, noninteracting chain, the probability of adopting a given conformation $\mathbf{r}(s)$ is given, up to normalization, by:
\[
P[\mathbf{r}(s)] \propto \exp\left[-\frac{H_b[\mathbf{r}(s)]}{k_B T}\right],
\]
where $k_B$ is the Boltzmann constant. The bonded interactions $H_b[\mathbf{r}(s)]$ between the beads correspond to harmonic springs and are given by:
\[
\frac{H_b[\mathbf{r}(s)]}{k_B T} = \frac{3}{2b^2}\sum_{s=1}^{N-1}\left[\mathbf{r}(s+1)-\mathbf{r}(s)\right]^2, \tag{24.1}
\]
where $b^2$ is the mean squared bond length. Nonbonded interactions among the effective segments are taken into account through an interaction functional $F[\phi_A,\phi_B]$ that comprises enthalpic and entropic contributions due to the coarse-graining procedure. It depends on the local, normalized bead densities $\phi_A(\mathbf{r})$ and $\phi_B(\mathbf{r})$. In this work, $F[\phi_A,\phi_B]$ is given by the simple choice:
\[
\frac{F[\phi_A,\phi_B]}{k_B T} = \rho_0\int_V d^3r\left[\chi\phi_A\phi_B + \frac{\kappa}{2}\left(1-\phi_A-\phi_B\right)^2\right], \tag{24.2}
\]
where $\rho_0 = nN/V$ is the average bead number density. The first term in the sum accounts for the incompatibility between beads of different type, the strength of which is quantified by the Flory–Huggins parameter $\chi$. The second term enforces the finite compressibility of the melt, which is inversely proportional to $\kappa$. The so-called Helfand quadratic approximation [Helfand and Tagami 1972] does not aim to describe the liquid-like structure of the polymeric melt; rather, it is the simplest form that penalizes fluctuations of the local densities away from their average value, thereby enforcing near incompressibility of the melt on long length scales. The resulting Hamiltonian is given by:
\[
\frac{H[\{\mathbf{r}_i(s)\}]}{k_B T} = \frac{3}{2}\sum_{i=1}^{n}\sum_{s=1}^{N-1}\frac{N-1}{R_e^2}\left[\mathbf{r}_i(s+1)-\mathbf{r}_i(s)\right]^2 + \sqrt{\bar{N}}\int_V\frac{d^3r}{R_e^3}\left[\chi N\phi_A\phi_B + \frac{\kappa N}{2}\left(1-\phi_A-\phi_B\right)^2\right]. \tag{24.3}
\]
It incorporates the three relevant ingredients necessary to describe the physics of diblock copolymers on mesoscopic length scales: the chain structure and connectivity, the incompatibility between unlike molecules, and the small but finite compressibility of the melt that stems from the excluded volume of the beads. It should be apparent from Equation 24.3 that only a few coarse-grain parameters emerge from this model. The first is the mean squared end-to-end distance $R_e^2 = (N-1)b^2$ for a noninteracting chain; $R_e$ sets the length scale for the coarse-grain representation. The chain contour discretization $N$ (number of beads in the chain) is not a physical parameter in itself; only the products $\chi N$ and $\kappa N$ are meaningful. The last parameter, $\bar{N} = (\rho_0 R_e^3/N)^2$, controls the strength of fluctuations; $\bar{N}$ is referred to as the invariant degree of polymerization because, in a dense melt, $R_e \sim \sqrt{N}$ and $\bar{N} \sim N$; $\sqrt{\bar{N}}$ is a dimensionless density that measures the number of chains found in the typical volume of a single chain (estimated as $R_e^3$), and it also provides a rough estimate of the number of chains that a given molecule interacts with.
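As a short consistency check (a sketch added for the reader, not a derivation quoted from the cited literature), the prefactors of Equation 24.3 follow from Equations 24.1 and 24.2 once the bead density and the bond length are expressed through $\sqrt{\bar{N}}$ and $R_e$:

```latex
\begin{align*}
\sqrt{\bar{N}} = \frac{\rho_0 R_e^3}{N}
\quad &\Longrightarrow \quad
\rho_0 = \sqrt{\bar{N}}\,\frac{N}{R_e^3}, \\
\rho_0 \int_V d^3r \left[ \chi \phi_A \phi_B + \frac{\kappa}{2}\,(1-\phi_A-\phi_B)^2 \right]
&= \sqrt{\bar{N}} \int_V \frac{d^3r}{R_e^3}
\left[ \chi N \phi_A \phi_B + \frac{\kappa N}{2}\,(1-\phi_A-\phi_B)^2 \right], \\
\frac{3}{2b^2} &= \frac{3}{2}\,\frac{N-1}{R_e^2}
\qquad \text{since } R_e^2 = (N-1)\,b^2 .
\end{align*}
```

Written this way, the discretization $N$ enters the nonbonded term only through the products $\chi N$ and $\kappa N$, which is why those products, rather than $\chi$ and $\kappa$ separately, are the meaningful parameters.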
24.2.2 DEFINITION OF LOCAL DENSITIES
To completely describe the model, we need to specify how the local, normalized densities $\phi_A(\mathbf{r})$ and $\phi_B(\mathbf{r})$ are defined in terms of the bead positions $\{\mathbf{r}_i(s)\}$. Note that the densities are not given by the microscopic expression $\phi_A(\mathbf{r}) = \sum_{i\in A\,\mathrm{bead}}\delta(\mathbf{r}-\mathbf{r}_i)$, as in liquid state theory. Instead, as in SCF theory, they are defined after a coarse-graining procedure that introduces a microscopic cutoff and results in a continuous scalar field. There are at least two ways to define such densities from the bead positions. The first is to associate with each bead a cloud density, such as a Gaussian instead of the $\delta$-function in the microscopic expression [Laradji, Guo, and Zuckermann 1994]. The local densities are then unambiguously defined and the width of the Gaussian sets the microscopic cutoff. The second possibility is to use a particle-to-mesh technique [Soga, Zuckermann, and Guo 1996]. In the simplest scheme (zeroth-order interpolation), a regular, cubic mesh is introduced and, from the number $n_A^k$ of A beads in the cell $k$, the local, normalized density is given by:
\[
\phi_A^k = n_A^k / n_{\mathrm{cell}}, \tag{24.4}
\]
where $n_{\mathrm{cell}}$ is the average number of beads in a cell (see Figure 24.1). In this case, the grid spacing sets the microscopic cutoff. Alternatively, one can use other assignment functions between the particle positions and the grid. No matter which technique is chosen, cloud-density or particle-to-mesh, the definition of local densities requires the introduction of a new discretization parameter: the microscopic cutoff $\Delta L$. Physically, $\Delta L$ corresponds to the range of interaction between beads. Its choice must meet several constraints. On the one hand, $\Delta L$ cannot be too small: if, for example, $\Delta L$ were much smaller than the mean distance between neighboring beads, the beads would barely interact with each other. In the following we use $n_{\mathrm{cell}} \geq 10$, which enforces a minimal value for the grid spacing $\Delta L = (n_{\mathrm{cell}}/\rho_0)^{1/3}$. On the other hand, the grid spacing cannot be too large if one aims to spatially resolve the inhomogeneous density distribution. If the grid spacing were much larger than the smallest length scale over which the density exhibits significant variations (e.g., the width of an interface between A-rich and B-rich domains), one would observe an explicit dependence of the results on the grid spacing. Following those two constraints, $\Delta L$ is chosen to be the smallest distance over which it remains meaningful to define the local densities. The use of the cloud density is more computationally demanding than the particle-to-mesh technique; therefore, in what follows, we restrict our discussion to the particle-to-mesh approach with a zeroth-order interpolation (Equation 24.4). Higher-order interpolations [Deserno and Holm 1998] are straightforward to implement, but require longer computation times. To avoid any artifact associated with a fixed grid, the position of the grid is randomly chosen at each MC step.
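The zeroth-order assignment of Equation 24.4 and the resulting discretized form of Equation 24.2 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the representation of beads as `(position, type)` pairs, the cubic periodic box of side `box`, and the way the random grid origin is drawn are all hypothetical choices made here for clarity.

```python
import math
import random

def random_offset(box, n_side, rng=random):
    """Draw a random grid origin within one cell (hypothetical helper)."""
    spacing = box / n_side
    return tuple(rng.uniform(0.0, spacing) for _ in range(3))

def cell_index(pos, box, n_side, offset):
    """Flat index of the cell containing `pos` on a periodic n_side^3 grid."""
    idx = 0
    for x, o in zip(pos, offset):
        k = int(math.floor((x - o) / box * n_side)) % n_side
        idx = idx * n_side + k
    return idx

def local_densities(beads, box, n_side, offset=(0.0, 0.0, 0.0)):
    """Zeroth-order particle-to-mesh assignment (Equation 24.4).

    `beads` is a list of (position, type) pairs with type 'A' or 'B'.
    Returns per-cell normalized densities phi_A and phi_B.
    """
    n_cells = n_side ** 3
    n_cell_avg = len(beads) / n_cells  # average number of beads per cell
    phi_a = [0.0] * n_cells
    phi_b = [0.0] * n_cells
    for pos, kind in beads:
        k = cell_index(pos, box, n_side, offset)
        if kind == 'A':
            phi_a[k] += 1.0 / n_cell_avg
        else:
            phi_b[k] += 1.0 / n_cell_avg
    return phi_a, phi_b

def nonbonded_energy(phi_a, phi_b, chi, kappa, n_cell_avg):
    """Equation 24.2 as a sum over cells, in units of k_B T.

    rho_0 times the cell volume equals n_cell_avg, so the integral
    reduces to n_cell_avg times a sum of the integrand over cells.
    """
    total = 0.0
    for a, b in zip(phi_a, phi_b):
        total += chi * a * b + 0.5 * kappa * (1.0 - a - b) ** 2
    return n_cell_avg * total
```

In an actual simulation one would not recount all beads at every step; only the cells that gain or lose a bead need to be updated, which keeps the cost of a move independent of system size.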
FIGURE 24.1 Top: illustration of the coarse-grain MC simulations proposed in this work. Local densities are defined by using a regular grid and by counting the number of beads in each cell (more accurate interpolation schemes can be used). When an MC move is proposed, for one bead or more, the difference in energy comes from the change in bond lengths ($\Delta b_i$) and the change in local densities ($\Delta\phi_{A,B}$); these changes can be computed efficiently. Bottom: in the variable cell shape method, the simulation box changes its shape and size to accommodate the natural symmetry and periodicity of the mesophase.
24.2.3 MC SIMULATIONS
The equilibrium properties of the model defined above are determined by MC simulations. A realization of the system consists of many molecules interacting as described above. Distinct configurations are sampled according to their Boltzmann weight. An MC move consists of choosing at random a chain molecule or a subset of beads, proposing trial positions, and determining whether the trial positions should be accepted on the basis of the energy change. This difference in energy $\Delta E$ between the old and the trial configuration stems from changes in the bond lengths (bonded interactions) and changes in the local densities (nonbonded interactions). The move is then accepted according to the Metropolis criterion; that is, with probability $p_{\mathrm{acc}} = \min(1, \exp(-\Delta E/k_B T))$. The simplest MC move is the random displacement of a single bead; other types of move include reptation of individual or multiple beads, translation of entire chains, switching the order of blocks while keeping the same chain conformation, and deleting an entire chain and randomly rebuilding it at a different position. Drastic, global moves are particularly helpful for rapidly reaching the equilibrium morphology of the system.
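The simplest move described above, a single-bead displacement accepted with the Metropolis criterion, can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, energies are expressed in units of $k_B T$, and for brevity the energy change includes only the bonded term of Equation 24.1 (an ideal chain); a full implementation would add the change in the nonbonded energy from the affected grid cells.

```python
import math
import random

def bond_energy(chain, b2):
    """Equation 24.1 in units of k_B T: (3 / (2 b^2)) * sum of squared bonds."""
    total = 0.0
    for p, q in zip(chain[:-1], chain[1:]):
        total += sum((qi - pi) ** 2 for pi, qi in zip(p, q))
    return 1.5 * total / b2

def metropolis_accept(delta_e, rng=random):
    """Accept with probability min(1, exp(-delta_e)); delta_e in units of k_B T."""
    if delta_e <= 0.0:
        return True
    return rng.random() < math.exp(-delta_e)

def displace_bead(chain, b2, step, rng=random):
    """Attempt a random displacement of one bead; return True if accepted.

    `chain` is a list of 3-tuples and is modified in place on acceptance.
    The whole-chain energy is recomputed here for simplicity; only the
    one or two bonds attached to the moved bead actually change.
    """
    s = rng.randrange(len(chain))
    old = chain[s]
    e_old = bond_energy(chain, b2)
    chain[s] = tuple(x + rng.uniform(-step, step) for x in old)
    delta_e = bond_energy(chain, b2) - e_old
    if metropolis_accept(delta_e, rng):
        return True
    chain[s] = old  # rejected: restore the previous position
    return False
```

Repeated calls to `displace_bead` on every chain constitute one possible MC sweep; the maximum displacement `step` is a tuning parameter that is usually adjusted to give a reasonable acceptance rate.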
We now briefly discuss the present method in relation to several other approaches. In Equation 24.3, the basic variables are the bead positions; our approach is thus particle-based, and it describes the conformation of individual chains explicitly. Equation 24.3 has also been taken as the starting point for SCF theory [Fredrickson 2006]; note, however, that in SCF theory the local densities $\phi_A(\mathbf{r})$ and $\phi_B(\mathbf{r})$ are the fundamental variables and, as in other field-theoretic techniques, the configurational degrees of freedom of the chains have been integrated out. This tacitly assumes that local chain conformations are always in equilibrium with the density distribution. The SCF theory neglects fluctuations: the saddle-point approximation involved in SCF theory selects from all possible density distributions only the one that minimizes the functional $F[\phi_A,\phi_B]$. This treatment becomes valid in the limit $\bar{N} \to \infty$; it is important to emphasize that much of our current understanding of block copolymer behavior has been generated in the context of SCF theory. In the limit $\bar{N} \to \infty$, our proposed MC simulations recover the SCF solution; note, however, that they do not rely on the saddle-point approximation invoked in SCF theory [Fredrickson, Ganesan, and Drolet 2002; Fredrickson 2006]. As such, their results could be viewed as an exact solution of the Hamiltonian of Equation 24.3. The specific form of the Hamiltonian in Equation 24.3 is essential for efficient solution of the SCF equations. The Gaussian nature of the chain is required to integrate out the conformational degrees of freedom via the solution of a modified diffusion equation, and the quadratic approximation allows the decoupling of interacting fields via a Hubbard–Stratonovich transformation.
In contrast, virtually any kind of bonded interactions (bond length potentials, angular and torsional contributions, or chain branching) or interaction functional can be used in the MC simulations outlined above, without a significant computational overhead. From a different standpoint, our coarse-grain approach can also be viewed as a traditional MC simulation of a model defined by an unusual kind of interaction potential [Daoulas and Muller 2006]. The interaction between two beads depends not only on their relative positions, but also involves the grid. In the scheme presented here, only those beads that are found within the same cell interact with each other. The interaction is therefore discontinuous and anisotropic, and it is not translationally invariant. As we show later in this work, this simple approach suffices to capture the block copolymer properties on long length scales while drastically reducing the computational demands of energy calculations (vis-à-vis those encountered when conventional pairwise additive interactions are employed). Keeping track of the local densities in our simulations is, by comparison, relatively straightforward, and the computational time remains strictly proportional to the number of beads. The coarse-grain approach described here is closely related to single chain in mean field (SCMF) simulations. In SCMF simulations [Daoulas et al. 2006; Muller and Schmid 2005; Stoykovich et al. 2005; Daoulas et al. 2006], the free-energy functional is expressed as
\[
\frac{F[\phi_A,\phi_B]}{k_B T} = \frac{\rho_0}{2}\int d^3r\left[\phi_A w_A + \phi_B w_B\right]. \tag{24.5}
\]
As in SCF theory, the fields $w_A$ and $w_B$ are defined from the local densities, for instance $w_A = \chi\phi_B - \kappa(1-\phi_A-\phi_B)$, but they fluctuate in time instead of being self-consistently determined static quantities. SCMF simulations consist of the following two steps that are repeated until sufficient statistics are generated: (1) perform a short MC simulation of the chains placed in the given external fields $w_A(\mathbf{r})$ and $w_B(\mathbf{r})$, with the nonbonded energy given by Equation 24.5; (2) update the fields from the instantaneous value of the local densities. Due to a temporary decoupling between the field value and the chain conformations, the energy associated with an MC move is only an approximate form of the exact expression given by Equation 24.3. As discussed by Daoulas and Muller (2006), this “quasi-instantaneous approximation” is controlled by a small parameter $\varepsilon$ which plays a role similar to that of the Ginzburg parameter in SCF theory. This parameter $\varepsilon$ depends on the discretization of the chain contour $N$ and space $\Delta L$ and therefore it can be made small even if the Ginzburg parameter is large and fluctuations are important to capture the physics. The quasi-instantaneous approximation becomes accurate if the external fields mimic the instantaneous fluctuating interactions of a chain with its fluctuating environment. To this end, the fields have to
be frequently updated and the density should not change significantly between updates. The main advantage of SCMF simulations over MC simulations is that they can be parallelized in a straightforward manner since, for a given value of the fields, the chains evolve independently from one another. On the other hand, SCMF simulations are less appropriate for dilute systems, the use of global MC moves that facilitate rapid equilibration is somewhat restricted, and extension to systems such as polymer nanocomposites is more demanding.
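The two-step SCMF cycle described above can be sketched as a simple loop. The expression for $w_A$ is the one quoted in the text; the expression used for $w_B$ is obtained here by exchanging the roles of A and B and is an assumption of this sketch, as are the function names and the two callbacks (`relax_chains`, `measure_densities`), which stand in for a short MC run in frozen fields and for the density assignment, respectively.

```python
def scmf_fields(phi_a, phi_b, chi, kappa):
    """Fields from instantaneous per-cell densities.

    w_A = chi * phi_B - kappa * (1 - phi_A - phi_B) as in the text;
    w_B is written by A/B symmetry (an assumption of this sketch).
    """
    w_a, w_b = [], []
    for a, b in zip(phi_a, phi_b):
        compress = kappa * (1.0 - a - b)
        w_a.append(chi * b - compress)
        w_b.append(chi * a - compress)
    return w_a, w_b

def scmf_run(phi_a, phi_b, chi, kappa, n_cycles, relax_chains, measure_densities):
    """Alternate field updates and short MC runs of independent chains.

    relax_chains(w_a, w_b): hypothetical callback that moves the chains in
        the frozen fields (step 1 in the text).
    measure_densities(): hypothetical callback returning the new per-cell
        densities, from which the fields are recomputed (step 2).
    """
    for _ in range(n_cycles):
        w_a, w_b = scmf_fields(phi_a, phi_b, chi, kappa)
        relax_chains(w_a, w_b)
        phi_a, phi_b = measure_densities()
    return phi_a, phi_b
```

Because the chains only see the frozen fields inside `relax_chains`, that step can be distributed over processors chain by chain, which is the parallelization advantage noted in the text; how many MC steps to run between field updates is the choice that controls the quality of the quasi-instantaneous approximation.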
24.2.4 CHOICE OF PARAMETERS
We now explain our choice of parameters. From the melt density and the molecular weight of the diblock used in a specific experiment, one can deduce $\sqrt{\bar{N}}$; a typical order of magnitude for a polymer melt is $10^2$. The product $\chi N$ is determined by taking the value of $\chi$ extracted in experiments and, for $N$, the degree of polymerization (number of monomers). The inverse of $\kappa$ can be related to the isothermal compressibility through $1/\kappa = -(\rho_0 k_B T/V)(\partial V/\partial p)_T$, where $p$ is the pressure. In accord with previous studies we utilize $\kappa N = 50$, a value which is high enough to prevent fluctuations of the total density on length scales larger than a fraction of $R_e$. On the practical side, higher values are difficult to consider because they increase the equilibration time: most MC moves (particularly global moves) induce a local density fluctuation and are rejected in the nearly incompressible limit. The choice of $N$ is dictated by the properties one wants to study. For instance, an accurate description of the narrow width and the detailed density profile at a hard surface or a liquid–vapor interface would require high $N$ ($>10^3$). The high number of degrees of freedom per chain would then imply that only small systems (a few $R_e^3$) could easily be accessible with common computational resources. The spirit of the coarse-grain model is to study properties on length scales set by $R_e$, such as the morphologies formed by the copolymer. The simulation of large systems (many $R_e^3$) favors a choice of $N$ as low as possible, while still faithfully describing the chain architecture. We found that $N = 32$ provides a good compromise between those two requirements. Note, however, that triblock and other multiblock copolymers might require higher $N$, since each block must be represented by a sufficient number of beads. Taking $\bar{N} = 128^2$, $N = 32$, and $n_{\mathrm{cell}} \approx 15$ results in a grid spacing $\Delta L \approx 0.15 R_e$. With these parameters, systems containing more than a million beads can be simulated on a single-processor machine.
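The arithmetic behind these numbers is easy to reproduce. The short sketch below (function names are hypothetical; lengths are measured in units of $R_e$) evaluates the bead density from $\rho_0 = \sqrt{\bar{N}}\,N/R_e^3$, the grid spacing from $\Delta L = (n_{\mathrm{cell}}/\rho_0)^{1/3}$, and the box size that corresponds to one million beads.

```python
def bead_density(sqrt_nbar, n_beads_per_chain, r_e=1.0):
    """rho_0 = sqrt(Nbar) * N / R_e^3, from Nbar = (rho_0 R_e^3 / N)^2."""
    return sqrt_nbar * n_beads_per_chain / r_e ** 3

def grid_spacing(n_cell, rho0):
    """Delta L = (n_cell / rho_0)^(1/3)."""
    return (n_cell / rho0) ** (1.0 / 3.0)

rho0 = bead_density(128, 32)          # 4096 beads per R_e^3
delta_l = grid_spacing(15, rho0)      # about 0.154 R_e
n_total = 1_000_000                   # one million beads
n_chains = n_total // 32              # 31250 chains
box_side = (n_total / rho0) ** (1.0 / 3.0)   # about 6.25 R_e for a cubic box
```

A cubic box of roughly $6\,R_e$ per side therefore already contains about a million beads at these parameters, which gives a sense of the system sizes that the quoted single-processor limit corresponds to.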
24.2.5 STRESS TENSOR AND VARIABLE CELL SHAPE METHOD
Within the mean-field approximation the internal stress tensor [Doi and Edwards 1986; Maurits, Zvelindovsky, and Fraaije 1998; Tyler and Morse 2003; Barrat, Fredrickson, and Sides 2005] for vanishing interaction range is given by:
\[
\frac{\sigma_{\alpha\beta}}{k_B T/V} = -nN\,\delta_{\alpha\beta} + 3\sum_{i=1}^{n}\sum_{s=1}^{N-1}\frac{N-1}{R_e^2}\,b_{i,\alpha}(s)\,b_{i,\beta}(s), \tag{24.6}
\]
where $\mathbf{b}_i(s) = \mathbf{r}_i(s+1) - \mathbf{r}_i(s)$ is the bond vector joining two adjacent beads. This approximation for the internal stress tensor can be computed from a given configuration and averaged over several MC steps. When the block copolymer forms ordered microphases, such as lamellae or cylinders, the size and shape of the simulation box are bound to influence the geometric properties. In particular, it is important to avoid finite-size effects in order to determine the true periodicity of the microphase. A first possibility is to use a large cell calculation, with a simulation box as large as possible, thereby minimizing the influence of the finite-size constraints. The long computational times required by large system sizes are not the only difficulty; it is also desirable to obtain a defect-free microphase (i.e., perfect, long-range order of the domains), which is often a challenge. This is why unit cell calculations are usually more efficient. Assuming a particular symmetry for the microphase (that
can be deduced from a large cell calculation), the size and shape of the simulation box are then relaxed in order to minimize the free energy (SCF) or relieve the internal stress. Such variable cell shape methods were originally introduced by Parrinello, Ray, and Rahman [Parrinello and Rahman 1981; Ray and Rahman 1984] and used, for example, to describe crystalline structures in solids. More recently, variable shape methods have been implemented within the context of SCF theory [Barrat, Fredrickson, and Sides 2005; Tyler and Morse 2003]. Here we briefly describe how this technique can be applied to compute the natural periodicity of the microphase. In the following we utilize the notation of Barrat, Fredrickson, and Sides (2005). The geometry of the simulation box, assumed to be a parallelepiped (but not necessarily orthorhombic), is specified by three vectors $\mathbf{h}_1$, $\mathbf{h}_2$, and $\mathbf{h}_3$ that constitute the box sides. Let $\mathbf{H}$ be the matrix obtained by concatenating these three vectors: $\mathbf{H} = [\mathbf{h}_1, \mathbf{h}_2, \mathbf{h}_3]$. The $\mathbf{H}$ matrix evolves during an MC simulation according to the following equation:
\[
\frac{d\mathbf{H}}{dt} = -\lambda\,\mathcal{D}\left[\mathbf{H}^{-1}\,\boldsymbol{\Sigma}\,(\mathbf{H}^{T})^{-1}\right]. \tag{24.7}
\]
In the context of our MC approach, the time $t$ corresponds to the number of MC steps. $\mathbf{H}^{-1}$ is the inverse of $\mathbf{H}$ and $(\mathbf{H}^{T})^{-1}$ denotes the inverse of its transpose; $\mathcal{D}$ is a matrix operator defined as $\mathcal{D}\mathbf{A} = \mathbf{A} - (1/3)\,\mathrm{Tr}(\mathbf{A})\,\mathbf{I}$. This evolution equation drives a change in the box shape and dimensions until the system reaches a stress-free configuration, that is $\sigma = 0$, while keeping the volume of the box constant. In practice, the box shape is updated after a given number of MC steps; the amplitude of the shape changes can be tuned by the parameter $\lambda$. Because the box relaxation towards the natural periodicity of an ordered microphase can be slow, it is sometimes faster to compute the stress-strain relationship. For instance, lamellae can initially be formed with a nonequilibrium lamellar spacing imposed by the box dimensions and periodic boundary conditions; the stress tensor is then easily computed as a function of the lamellar spacing. The natural periodicity is reached when the stress tensor is isotropic.
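The stress estimate of Equation 24.6 and the traceless projection $\mathcal{D}$ are simple to evaluate from a configuration. The sketch below is an illustration under stated assumptions rather than the authors' code: function names are hypothetical, each chain is a list of 3-tuples with the same number of beads, bond vectors are taken directly from the stored coordinates (no minimum-image convention is applied), and the result is returned in units of $k_B T/V$. The box update of Equation 24.7 itself is not implemented here.

```python
def stress_tensor(chains, r_e2):
    """Equation 24.6: sigma_ab in units of k_B T / V.

    chains: list of chains, each a list of 3-tuples (same length N).
    r_e2:   mean squared end-to-end distance R_e^2 of a free chain.
    """
    n_chains = len(chains)
    n_beads = len(chains[0])
    pref = 3.0 * (n_beads - 1) / r_e2
    # Ideal contribution: -n N delta_ab
    sigma = [[-float(n_chains * n_beads) if a == b else 0.0
              for b in range(3)] for a in range(3)]
    # Bond contribution: 3 (N-1)/R_e^2 * sum over bonds of b_a b_b
    for chain in chains:
        for p, q in zip(chain[:-1], chain[1:]):
            bond = [qi - pi for pi, qi in zip(p, q)]
            for a in range(3):
                for b in range(3):
                    sigma[a][b] += pref * bond[a] * bond[b]
    return sigma

def traceless(m):
    """Operator D: subtract one third of the trace from the diagonal."""
    third = (m[0][0] + m[1][1] + m[2][2]) / 3.0
    return [[m[a][b] - (third if a == b else 0.0) for b in range(3)]
            for a in range(3)]
```

For an isotropic stress tensor `traceless` returns zero, which corresponds to the stationary condition of the box evolution noted at the end of the section: the natural periodicity is reached when the stress is isotropic.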
24.2.6 SOFT NANOPARTICLES
The model defined so far can describe bulk systems of pure copolymers (or blends and other multicomponent polymer systems). We now introduce an approximate but simple approach to include in our model nonpolymeric objects such as nanoparticles. Such nanoparticles often consist of a solid, metallic core to which short polymer chains are grafted so as to facilitate dispersion in the polymer melt [Mackay et al. 2006]. Nanoparticles have two effects on the surrounding chains: (1) they enforce an excluded volume and (2) the brush coating might exhibit a preferential interaction with one block of the copolymer. In the following, we propose to describe a nanoparticle as a cluster of beads, all attached together to form a rigid object of spherical shape. The density of beads inside the sphere is chosen equal to the average density outside so that the compressibility constraint partially prevents the chains from penetrating the nanoparticle. The interaction between the nanoparticle and the neighboring chains is controlled by changing its composition; that is, the proportion of A and B beads forming the nanoparticle. When dispersed in a diblock copolymer melt, an A-like nanoparticle interacts preferentially with the A block; this situation corresponds to a nanoparticle covered with chains that are chemically identical to one of the copolymer blocks. Taking for each bead a random position inside the sphere, a nanoparticle made of A and B beads in equal proportion would be nonselective, having no preference for either A or B domains. Alternatively, the nanoparticle beads could be of type C and, in that case, two additional parameters, $\chi_{AC}$ and $\chi_{BC}$, would be introduced to describe the nanoparticle-block interactions. Extensions to nonspherical nanoparticles (e.g., rods) or to systems having more elaborate brush structures do not present any additional difficulties.
In the following, we consider the case of Janus-like nanoparticles, which are spherical but are coated with two hemispheres having a different brush. Such nanoparticles have been recently studied by Kramer and coworkers [Kim et al. 2007], and it is therefore of interest to consider whether the
model proposed here can describe some of their main experimental observations. The simplest case is to have one hemisphere entirely made of A beads, and the other entirely made of B beads. This model of a nanoparticle is a crude representation that does not focus on the effect of an isolated nanoparticle. In particular, polymer chains are not strictly prevented from entering the nanoparticle (hence “soft nanoparticle”). Rather, the model is designed to reproduce the collective behavior arising when many nanoparticles are dispersed in a block copolymer melt, and their interplay with the copolymer morphology. A more accurate model would treat each nanoparticle as a potential that explicitly interacts with the polymer beads. The degrees of freedom associated with each nanoparticle include its position, and its orientation when anisotropic. Therefore, the only MC moves that are needed are translation and rotation; they are treated in the same way as the MC moves for the polymer.
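A Janus-like soft nanoparticle of the kind described above can be built as a rigid cluster of beads. The sketch below is one possible construction under stated assumptions: the function name, the rejection sampling used to place beads uniformly in the sphere, and the use of a unit vector `axis` to separate the A and B hemispheres are illustrative choices; the number of beads is set so that the bead density inside the sphere matches the melt density `rho0`, as the text prescribes.

```python
import math
import random

def janus_nanoparticle(radius, rho0, axis=(0.0, 0.0, 1.0), rng=random):
    """Return a list of (position, type) pairs relative to the particle center.

    Beads are placed at random inside a sphere of the given radius, with
    the same average number density rho0 as the surrounding melt. Beads in
    the hemisphere pointing along `axis` are type 'A', the others type 'B'.
    """
    n_beads = round(rho0 * 4.0 / 3.0 * math.pi * radius ** 3)
    beads = []
    while len(beads) < n_beads:
        # Rejection sampling: draw in the bounding cube, keep points in the sphere.
        p = tuple(rng.uniform(-radius, radius) for _ in range(3))
        if sum(x * x for x in p) > radius * radius:
            continue
        side = sum(x * u for x, u in zip(p, axis))
        beads.append((p, 'A' if side >= 0.0 else 'B'))
    return beads
```

With $R_p = 0.16\,R_e$ and $\rho_0 = 4096/R_e^3$ (the values corresponding to $\sqrt{\bar{N}} = 128$ and $N = 32$), this construction gives a cluster of 70 beads. Translating the particle shifts all bead positions by the same vector, and rotating it rotates both the bead positions and `axis`, so the cluster moves as a rigid object.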
24.3 APPLICATIONS
24.3.1 EQUILIBRIUM MORPHOLOGIES
We begin this section by examining the capability of the proposed coarse-grain model and method to predict the morphology of block copolymers. We consider the simple case of a linear diblock copolymer in the bulk, but more complex situations could be addressed, including thin films, patterned substrates, and linear or star triblocks. Figure 24.1 illustrates the variable cell shape method. The simulation box, initially cubic, deforms to accommodate the hexagonal cylindrical phase and adjusts its size to match the natural periodicity. Our coarse-grain description is primarily designed to describe properties on the length scale $R_e$. Nevertheless, it is interesting to consider the validity of our predictions on small length scales (small fractions of $R_e$). Figure 24.2 shows the average density profiles of A and B beads in the lamellar phase of a symmetric diblock copolymer. For comparison, the result of a one-dimensional SCF calculation is also included. The equilibrium lamellar spacings found in both cases are very close to each other: $L_o = 1.80\,R_e$ in MC simulations and $L_o = 1.83\,R_e$ in SCF calculations. The corresponding density profiles are nearly identical when plotted in units of the lamellar spacing. Compared to the one-dimensional mean-field calculation, the variations in total density observed in three-dimensional simulations are less pronounced and the density profile is not as steep, thereby resulting in a slightly wider interface. Considering the rather low contour and space discretization employed here ($N = 32$ and $\Delta L \approx 0.15\,R_e$), the MC method is surprisingly accurate. Also note that using higher discretization $N$ (and lower $\Delta L$) or a more accurate interpolation scheme to compute local densities from bead positions does not seem to yield significant changes in the MC profiles, suggesting that the main differences with SCF can be ascribed to interface fluctuations. These capillary waves in the three-dimensional MC simulations are expected to broaden the width of the interface.
FIGURE 24.2 Left: density profiles of A and B beads and total density in the lamellar phase of a symmetric diblock copolymer, computed with MC simulations (solid lines) and SCF theory (dashed lines). The unit length is the lamellar spacing. Parameters: $\chi N = 36.7$, $\kappa N = 50$, $\sqrt{\bar{N}} = 128$ ($N = 32$, $\Delta L = 0.15\,R_e$). Right: distribution of nanoparticles in the lamellar phase of a symmetric diblock copolymer. The nanoparticle composition (fraction of A beads) is 0.5, 0.8, and 1, from left to right. Each graph shows the distribution of nanoparticle centers (solid line), the density profile of A beads (continuous line), B beads (dashed line), and the total density profile (dashed-dotted line). Parameters: $R_p = 0.16\,R_e$, $\phi_p = 0.05$, $\chi N = 40$, $\kappa N = 50$, $\sqrt{\bar{N}} = 128$ ($N = 32$, $\Delta L = 0.19\,R_e$).
24.3.2 QUALITATIVE DESCRIPTION OF THE DYNAMICS
In addition to equilibrium properties, MC simulations can provide an approximate but reasonable account of the dynamics on long length scales at a qualitative level. To do so, it is important to avoid “drastic” nonlocal MC moves and to use instead MC moves that mimic those encountered in real polymeric systems and, in particular, give rise to a diffusive relaxation of the local densities. In what follows, only two types of MC moves are used: the random displacement of a single bead, which, when used alone, would lead to Rouse-type dynamics, and the slithering-snake move, which mimics the reptation of the chains in the “tube” created by the topological constraints imposed by neighboring chains. Since the simulated chains can cross each other, all entanglement effects are neglected and the dynamics cannot be realistic at the level of an individual chain. However, when the collective motion of many chains is of interest, such as during the formation of an ordered microphase or during the relaxation of a structural defect, this MC dynamics might become qualitatively correct on large time and length scales because it captures the diffusive relaxation kinetics of composition fluctuations. In this case, the time scale can be identified by matching the single-chain diffusion coefficient in the simulation to that of the experimental system. These assumptions are less restrictive than those involved in most dynamic approaches within the context of SCF theory. In such approaches, the time evolution is driven by the spatial variation of a local chemical potential, and chain conformations are assumed to be fast variables that adjust instantaneously to the slow variables (local densities and fields). Kinetic coefficients must be introduced that relate the time evolution of the slow variables to gradients of chemical potentials.
Exact expressions for those Onsager coefficients are not available for inhomogeneous systems, and assumptions on the chain structure must be made to obtain approximate parameters.

Figure 24.3 provides an example of simulated dynamics in a thin film of a symmetric diblock copolymer confined between two hard walls (with periodic boundary conditions in the other directions), so as to represent a thin film laid over a neutral substrate. The simulation starts with the chain positions and conformations chosen randomly (in experiments this would correspond to a quench from a high temperature). Very rapidly, the copolymer forms lamellae perpendicular to the substrate. The ordering remains only local, however, and the lamellae form the characteristic fingerprint pattern that is seen in experiments. From a series of snapshots one can analyze the types of defects that are formed and the mechanisms by which they annihilate each other and disappear. While the system shown in Figure 24.3 has been simulated using a single-processor machine, the use of SCMF simulations on a parallel computer cluster could permit study of the relaxation of defects on larger length scales and over longer times [Edwards et al. 2005]. The phenomenological approach that has been used so far to study such phenomena generally relies on a Landau expansion of the local order parameter (difference between A and B local densities), namely a coarser representation where information about the chain conformations is lost.

FIGURE 24.3 Simulation of a symmetric diblock copolymer confined between two hard walls, starting from a random initial configuration. The figures provide top-down snapshots of three configurations. The MC moves employed for these calculations were local. From left to right, configurations correspond to 50, 500, and 6000 MC steps, respectively. The natural lamellar spacing is L0 = 1.53 Re and the system size is 40 × 40 × 1.53 Re³. Parameters: χN = 18, κN = 50, N = 128 (N = 32, ΔL = 0.19 Re).
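The two MC moves discussed in this section can be sketched as follows. This is only an illustration under simplifying assumptions that are not taken from the chapter: a single ideal homopolymer chain with harmonic bonds and no nonbonded interactions, energies in units of kT, and hypothetical names (`K_BOND`, `displace_bead`, `slithering_snake`). In the actual model the acceptance criterion would also include the change in the nonbonded energy.

```python
import numpy as np

rng = np.random.default_rng(1)
K_BOND = 3.0  # assumed harmonic bond constant (kT per length^2); gives <b^2> = 1

def bond_energy(chain):
    """Harmonic bonded energy of one chain, in units of kT."""
    bonds = np.diff(chain, axis=0)
    return 0.5 * K_BOND * np.sum(bonds ** 2)

def displace_bead(chain, max_step=0.5):
    """Random displacement of a single bead with Metropolis acceptance."""
    trial = chain.copy()
    i = rng.integers(len(chain))
    trial[i] += rng.uniform(-max_step, max_step, size=3)
    delta = bond_energy(trial) - bond_energy(chain)
    if delta <= 0.0 or rng.random() < np.exp(-delta):
        return trial, True
    return chain, False

def slithering_snake(chain):
    """Remove a bead at one end and regrow it at the other end.

    The new bond is drawn from the Boltzmann distribution of the harmonic
    bond, so for this ideal chain the move is always accepted; with
    nonbonded interactions only their change would enter the acceptance.
    """
    sigma = 1.0 / np.sqrt(K_BOND)  # standard deviation of each bond component
    new_bond = rng.normal(0.0, sigma, size=3)
    if rng.random() < 0.5:
        return np.vstack([chain[1:], chain[-1] + new_bond])  # grow at the last bead
    return np.vstack([chain[0] + new_bond, chain[:-1]])      # grow at the first bead

# Usage: a random-walk chain of 32 beads, updated with a mix of both moves.
chain = np.cumsum(rng.normal(0.0, 1.0 / np.sqrt(K_BOND), size=(32, 3)), axis=0)
for step in range(1000):
    if rng.random() < 0.5:
        chain, _ = displace_bead(chain)
    else:
        chain = slithering_snake(chain)
```

To attach a physical time to the MC step count, one would measure the single-chain diffusion coefficient in such a run and match it to the experimental value, as described above.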
24.3.3 NANOPARTICLE-INDUCED PHASE TRANSITION (SOFT NANOPARTICLES)

Incorporating nanoparticles into diblock copolymers is of interest for the design of new functional materials [Bockstaller, Mickiewicz, and Thomas 2005]. Experiments with a low loading of nanoparticles [Chiu et al. 2005; Kim et al. 2006; Bockstaller et al. 2003] have shown that both the nanoparticle size and the type of brush that covers it are important in determining the location of the nanoparticle, be it in specific domains or at the interface between them. At high loadings of nanoparticles [Sides et al. 2006; Kim et al. 2005], it is possible to induce a change in morphology. For instance, the lamellar morphology of a symmetric diblock can be coaxed into forming a hexagonal morphology when the volume fraction of nanoparticles exceeds a critical threshold.

Considerable effort has been directed towards predicting the morphology of nanoparticle/block copolymer mixtures. In their initial studies, Balazs and coworkers combined the SCF theory for the polymer with a density functional theory for hard spheres to describe the nanoparticles (SCF-DFT method). A variety of systems, including nanoparticles in lamellae, bulk or confined, were considered by these authors [Thompson et al. 2001; Lee et al. 2002; Thompson et al. 2002; Lee, Shou, and Balazs 2003a, 2003b; Lin et al. 2005; Smith, Tyagi, and Balazs 2005; Balazs, Emrick, and Russell 2006]. One limitation of that approach is that the coupling between the nanoparticle and the melt is only approximate, since the correlations between nanoparticles are assumed to reproduce the structure of a hard-sphere fluid. The hybrid particle-field (HPF) method, recently introduced [Sides et al. 2006], does not involve such an approximation; the nanoparticle positions remain explicit degrees of freedom, and a Brownian dynamics technique is used to describe their evolution in time and space. This approach has been shown to reproduce experimentally observed nanoparticle-induced changes in morphology. However, both the SCF-DFT and HPF methods have so far been restricted to two-dimensional systems. MC simulations of many-body models for nanoparticle/diblock mixtures [Schultz, Hall, and Genzer 2005; Pryamitsyn and Ganesan 2006] are fully three-dimensional, but they remain computationally intensive, particularly for large systems.

FIGURE 24.4 Change in morphology induced by a high loading of selective (A-like) nanoparticles. In these cross-sections, the A beads are shown in dark gray, the B beads in black, and the nanoparticle beads in light gray. The system has been replicated once in each direction. Top row: symmetric copolymer (f = 0.5) with φp = 0.2 and φp = 0.4 (from left to right). Bottom row: asymmetric copolymer (f = 0.25) with φp = 0.1 and φp = 0.3. Parameters: Rp = 0.16 Re, χN = 25, κN = 50, N = 128 (N = 32, ΔL = 0.19 Re).
Recently, we have introduced a more tractable approach [Kang et al. 2008] that maintains a full coupling between the nanoparticles and the polymer chains. In the interest of brevity, here we only outline some results obtained with the simple “soft nanoparticle” model presented above. The nanoparticle radius (including the solid core and the brush) is chosen in the range Rp = 0.16–0.21 Re, intermediate between the protein limit (Rp ≪ Re) and the colloid limit (Rp ≫ Re). Computing the pair correlation function g(r) between the nanoparticle center and the polymer beads shows that the local density of polymer beads does not vanish inside the nanoparticle; however, it is significantly reduced. In the worst case considered here (A-like nanoparticle with Rp = 0.16 Re placed in an A homopolymer melt), g(r = 0) ≈ 0.4; larger nanoparticles lead to a stronger exclusion of the polymer chains. The fact that chains overlap with the nanoparticle can be expected, as the compressibility constraint is enforced only at the scale of the grid spacing, which is not much smaller than the nanoparticle’s radius. A higher discretization N (and therefore a smaller grid spacing) or a lower compressibility (higher κ) would improve exclusion effects, but it would also lead to longer computational times. The properties we focus on here are the collective effects induced by a high volume fraction of nanoparticles, not the influence of an isolated nanoparticle on the neighboring chains.

The first property we consider is the location of nanoparticles in the diblock. Figure 24.2 shows the density profile of spherically symmetric nanoparticles dispersed in the lamellar phase of a symmetric diblock, for various compositions of the nanoparticle (i.e., fraction of A beads). As expected, A-like nanoparticles are found exclusively in the A domains (with a preference for the center).
On the other hand, nonselective (neutral) nanoparticles are found at the interface between domains, where they screen contacts between A and B blocks and reduce the penalty associated with the decrease in total density. Given that the nanoparticles are rather small and soft, the entropic penalty they impose on the chains by restricting the possible conformations is not dominant here; enthalpic factors are therefore expected to be most important. As shown in Figure 24.5, Janus-like nanoparticles are found only at the interface, with each hemisphere located in its respective domain. When dispersed at high volume fractions, neutral nanoparticles tend to aggregate. Therefore, in what follows only the cases of selective and Janus-like nanoparticles are considered.

Figure 24.6 shows the predicted morphology when nanoparticles are dispersed in a symmetric block copolymer with a volume fraction φp ranging from 0.1 to 0.4. The simulation box has fixed dimensions Lx × Ly × Lz = 40 × 40 × 1.53 Re³ (here L0 = 1.53 Re). Choosing a small thickness Lz favors the ordering in the x–y plane and helps to identify the morphology. Note that the system is indeed three-dimensional and the nanoparticles are spheres, not disks or rods. As φp increases, the A lamellae form more T-junctions and rings until the B domain is finally fragmented into isolated cylinders. Even on a local scale, the hexagonal order is barely visible, since the cylinders vary widely in radius, but the change in morphology is clear. To better characterize the morphology, we used a smaller simulation box of variable shape, as shown in Figure 24.4. All simulation boxes have converged towards different stable dimensions to better accommodate the periodicity of the microphase.

Figure 24.4 shows a second example of a nanoparticle-induced change in morphology, where the cylindrical phase of an asymmetric block copolymer is converted into a lamellar phase. The mechanism driving the transition is the same: the A domains are swollen by nanoparticles and deform until the initial morphology becomes unstable. Note that replacing the nanoparticles with a homopolymer would not reproduce the same transition. Instead, depending on the ratio between the molecular weights of the homopolymer and the diblock copolymer, the lamellar morphology would be conserved, but with A domains swollen by the homopolymer and a larger lamellar spacing. Alternatively, a microemulsion could be formed or the homopolymers could macroscopically phase-separate from the diblock domains. The results presented here do not provide an estimate of the critical volume fraction at which the transition occurs, but the window is compatible with that observed in experiments. A high loading of Janus-like nanoparticles can also induce a change in morphology: for instance, above a critical volume fraction, the cylindrical phase of an asymmetric copolymer is replaced by a lamellar phase (not shown).

FIGURE 24.5 Morphology obtained with Janus-like nanoparticles dispersed in a symmetric AB block copolymer. The nanoparticle radius is Rp = 0.21 Re. In these cross-sections, the A beads are shown in dark gray, the B beads in black. The A-like hemisphere of the nanoparticle is light gray, the B-like hemisphere white. (a) φp = 0.05 yields a lamellar morphology. (b) Same as previously with only nanoparticles shown. (c) The morphology observed with φp = 0.35 suggests a bicontinuous phase. The system size is 7 × 7 × 6.9 Re³. Parameters: χN = 40, κN = 50, N = 128 (N = 32, ΔL = 0.19 Re).
The mechanism is different from that observed for selective nanoparticles: instead of swelling their preferred domain, Janus-like nanoparticles decrease the interfacial tension between domains (this difference is already reflected at low loading in the lamellar phase: selective nanoparticles lead to an increase of the lamellar spacing, and Janus-like nanoparticles to a decrease). They also might modify the spontaneous curvature and bending rigidity of the interface [Pryamitsyn and Ganesan 2006]. As the loading increases, it becomes favorable to increase the amount of interface between domains, which is a possible driving force for the transition. Figure 24.5c shows a mixture of symmetric copolymer and nanoparticles. Even at the local scale, the nature of the morphology is unclear; it does not seem to be lamellar or cylindrical. Besides, any cross-section of the system shows A and B domains interpenetrating each other and separated by an interface packed with nanoparticles. This suggests the possibility of a bicontinuous phase, in agreement with experimental observations [Kim et al. 2007].

FIGURE 24.6 (See color insert following page 238.) Morphology of nanoparticle/copolymer mixtures. In these cross-sections, the A beads are shown in red, the B beads in blue, and the nanoparticle beads in green. As the nanoparticle volume fraction increases, the morphology changes from lamellar to cylindrical. The diblock copolymer is symmetric; the nanoparticles are A-selective and have a radius Rp = 0.16 Re, and the volume fraction is φp = 0.1, 0.2, 0.3, and 0.4 in (a), (b), (c), and (d), respectively. Parameters: χN = 40, κN = 50, N = 128 (N = 32, ΔL = 0.19 Re).
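The pair correlation function g(r) between nanoparticle centers and polymer beads, used earlier in this section to quantify how strongly chains are excluded from a soft nanoparticle, could be estimated along the following lines. This is a minimal sketch assuming a cubic periodic box and the minimum-image convention; the function name `pair_correlation` and the ideal-gas test data are hypothetical and do not reproduce the values quoted in the text.

```python
import numpy as np

def pair_correlation(centers, beads, box, r_max, n_bins):
    """g(r) between particle centers and beads in a cubic periodic box.

    r_max should not exceed box / 2 for the minimum-image convention.
    """
    centers = np.asarray(centers, dtype=float)
    beads = np.asarray(beads, dtype=float)
    edges = np.linspace(0.0, r_max, n_bins + 1)
    counts = np.zeros(n_bins)
    for c in centers:
        d = beads - c
        d -= box * np.round(d / box)              # minimum-image convention
        r = np.linalg.norm(d, axis=1)
        counts += np.histogram(r, bins=edges)[0]  # distances beyond r_max are ignored
    shell_volume = 4.0 / 3.0 * np.pi * (edges[1:] ** 3 - edges[:-1] ** 3)
    bulk_density = len(beads) / box ** 3
    g = counts / (len(centers) * shell_volume * bulk_density)
    r_mid = 0.5 * (edges[:-1] + edges[1:])
    return r_mid, g

# Sanity check on uncorrelated (ideal-gas) positions, for which g(r) should be close to 1.
rng = np.random.default_rng(2)
box = 10.0
beads = rng.uniform(0.0, box, size=(20000, 3))
centers = rng.uniform(0.0, box, size=(50, 3))
r_mid, g = pair_correlation(centers, beads, box, r_max=4.0, n_bins=8)
```

For a soft nanoparticle, the first bins of g would fall below 1 without reaching 0, which is the partial exclusion described above.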
24.4 CONCLUSION

The SCF theory has been instrumental in understanding the properties of block copolymers. However, there are systems of considerable fundamental and technological interest, including complex multiblock materials and nanoparticle/copolymer mixtures, that continue to pose challenges for traditional SCF treatments. We have presented in this chapter an alternative numerical framework for the description of polymeric systems that exhibits several attractive features. Since it is a particle-based method, it treats the conformations of the chains in an explicit manner; it is therefore relatively straightforward to describe polymeric molecules of arbitrary architecture. When a rough description of the system suffices, nonpolymeric objects such as functionalized nanoparticles can be represented as a rigid cluster of beads, as was shown here. If a more accurate description is needed, nanoparticles can be represented through a potential energy function that interacts explicitly with the polymer beads. Such an approach has been used to predict the spatial distribution of nanoparticles for nanoparticle/copolymer thin films on nanopatterned substrates [Kang et al. 2008; Detcheverry et al. 2008].

The predictions of our MC simulations are not restricted to equilibrium properties but can be extended, at least at the qualitative level, to the dynamics. Since our approach does not rely on a saddle-point approximation, it should be able to describe fluctuation effects; such fluctuations, however, must still be characterized, and it remains to be seen if the MC simulations proposed here provide a simpler alternative to field-theoretic methods. This will require a better understanding of the conditions under which discretization effects are negligible. In contrast to SCF theory, the MC simulations described in this work do not directly provide the free energy of the system; it is therefore difficult to determine and trace precise phase boundaries. Methods that permit efficient calculation of the chemical potential or free energy of the system must be developed for systematic studies of phase behavior.
ACKNOWLEDGMENTS

This research was supported by the National Science Foundation through the Nanoscale Science and Engineering Center. Support from the Semiconductor Research Corporation is also gratefully acknowledged.
REFERENCES

Balazs, A. C., T. Emrick, and T. P. Russell. 2006. Nanoparticle polymer composites: Where two small worlds meet. Science 314 (5802):1107–10.
Barrat, J. L., G. H. Fredrickson, and S. W. Sides. 2005. Introducing variable cell shape methods in field theory simulations of polymers. J. Phys. Chem. B 109 (14):6694–6700.
Binder, K. 1995. Monte Carlo and Molecular Dynamics Simulations in Polymer Science. New York: Oxford University Press.
Bockstaller, M. R., Y. Lapetnikov, S. Margel, and E. L. Thomas. 2003. Size-selective organization of enthalpic compatibilized nanocrystals in ternary block copolymer/particle mixtures. J. Am. Chem. Soc. 125 (18):5276–77.
Bockstaller, M. R., R. A. Mickiewicz, and E. L. Thomas. 2005. Block copolymer nanocomposites: Perspectives for tailored functional materials. Adv. Mater. 17 (11):1331–49.
Chiu, J. J., B. J. Kim, E. J. Kramer, and D. J. Pine. 2005. Control of nanoparticle location in block copolymers. J. Am. Chem. Soc. 127 (14):5036–37.
Daoulas, K. C., and M. Muller. 2006. Single chain in mean field simulations: Quasi-instantaneous field approximation and quantitative comparison with Monte Carlo simulations. J. Chem. Phys. 125 (18):18.
Daoulas, K. C., M. Muller, J. J. de Pablo, P. F. Nealey, and G. D. Smith. 2006. Morphology of multicomponent polymer systems: Single chain in mean field simulation studies. Soft Matter 2 (7):573–83.
Daoulas, K. C., M. Muller, M. P. Stoykovich, S. M. Park, Y. J. Papakonstantopoulos, J. J. de Pablo, P. F. Nealey, and H. H. Solak. 2006. Fabrication of complex three-dimensional nanostructures from self-assembling block copolymer materials on two-dimensional chemically patterned templates with mismatched symmetry. Phys. Rev. Lett. 96 (3):4.
Deserno, M., and C. Holm. 1998. How to mesh up Ewald sums. I. A theoretical and numerical comparison of various particle mesh routines. J. Chem. Phys. 109 (18):7678–93.
Detcheverry, F. A., H. Kang, K. Ch. Daoulas, M. Muller, P. F. Nealey, and J. J. de Pablo. 2008. Monte Carlo simulations of a coarse grain model for block copolymers and nanocomposites. To appear in Macromolecules.
Doi, M., and S. F. Edwards. 1986. The Theory of Polymer Dynamics. Oxford: Oxford University Press.
Edwards, E. W., M. P. Stoykovich, M. Muller, H. H. Solak, J. J. de Pablo, and P. F. Nealey. 2005. Mechanism and kinetics of ordering in diblock copolymer thin films on chemically nanopatterned substrates. J. Polym. Sci. Part B: Polym. Phys. 43 (23):3444–59.
Edwards, S. F. 1965. Statistical mechanics of polymers with excluded volume. Proc. Phys. Soc. London 85 (546P):613.
Eurich, F., and P. Maass. 2001. Soft ellipsoid model for Gaussian polymer chains. J. Chem. Phys. 114 (17):7655–68.
Fredrickson, G. H. 2006. The Equilibrium Theory of Inhomogeneous Polymers. Oxford: Clarendon Press.
Fredrickson, G. H., V. Ganesan, and F. Drolet. 2002. Field-theoretic computer simulation methods for polymers and complex fluids. Macromolecules 35 (1):16–39.
Helfand, E., and Y. Tagami. 1972. Theory of interfaces between immiscible polymers. II. J. Chem. Phys. 56 (7):3592.
Hong, K. M., and J. Noolandi. 1981. Theory of inhomogeneous multicomponent polymer systems. Macromolecules 14 (3):727–36.
Kang, H., F. A. Detcheverry, A. N. Mangham, M. P. Stoykovich, K. Ch. Daoulas, R. J. Hamers, M. Müller, J. J. de Pablo, and P. F. Nealey. 2008. Hierarchical assembly of nanoparticle superstructures from block copolymer-nanoparticle composites. Phys. Rev. Lett. 100:148303.
Kim, B. J., J. Bang, C. J. Hawker, and E. J. Kramer. 2006. Effect of areal chain density on the location of polymer-modified gold nanoparticles in a block copolymer template. Macromolecules 39 (12):4108–14.
Kim, B. J., J. J. Chiu, G. R. Yi, D. J. Pine, and E. J. Kramer. 2005. Nanoparticle-induced phase transitions in diblock-copolymer films. Adv. Mater. 17 (21):2618.
Kim, B. J., G. H. Fredrickson, C. J. Hawker, and E. J. Kramer. 2007. Nanoparticle surfactants as a route to bicontinuous block copolymer morphologies. Langmuir 23 (14):7804.
Kotelyanskii, M., and D. N. Theodorou. 2004. Simulation Methods for Polymers. New York: Dekker.
Laradji, M., H. Guo, and M. J. Zuckermann. 1994. Off-lattice Monte-Carlo simulations of polymer brushes in good solvents. Phys. Rev. E 49 (4):3199–3206.
Lee, J. Y., Z. Shou, and A. C. Balazs. 2003a. Modeling the self-assembly of copolymer-nanoparticle mixtures confined between solid surfaces. Phys. Rev. Lett. 91 (13).
Lee, J. Y., Z. Y. Shou, and A. C. Balazs. 2003b. Predicting the morphologies of confined copolymer/nanoparticle mixtures. Macromolecules 36 (20):7730–39.
Lee, J. Y., R. B. Thompson, D. Jasnow, and A. C. Balazs. 2002. Entropically driven formation of hierarchically ordered nanocomposites. Phys. Rev. Lett. 89 (15).
Lin, Y., A. Boker, J. B. He, K. Sill, H. Q. Xiang, C. Abetz, X. F. Li, J. Wang, T. Emrick, S. Long, Q. Wang, A. Balazs, and T. P. Russell. 2005. Self-directed self-assembly of nanoparticle/copolymer mixtures. Nature 434 (7029):55–59.
Louis, A. A., P. G. Bolhuis, J. P. Hansen, and E. J. Meijer. 2000. Can polymer coils be modeled as “soft colloids”? Phys. Rev. Lett. 85 (12):2522–25.
Mackay, M. E., A. Tuteja, P. M. Duxbury, C. J. Hawker, B. Van Horn, Z. B. Guan, G. H. Chen, and R. S. Krishnan. 2006. General strategies for nanoparticle dispersion. Science 311 (5768):1740–43.
Matsen, M. W., and M. Schick. 1994. Stable and unstable phases of a diblock copolymer melt. Phys. Rev. Lett. 72 (16):2660–63.
Maurits, N. M., A. V. Zvelindovsky, and J. G. E. M. Fraaije. 1998. Equation of state and stress tensor in inhomogeneous compressible copolymer melts: Dynamic mean-field density functional approach. J. Chem. Phys. 108 (6):2638–50.
Muller, M., and F. Schmid. 2005. Incorporating fluctuations and dynamics in self-consistent field theories for polymer blends. Adv. Polym. Sci. 185:1–58.
Muller, M., and G. D. Smith. 2005. Phase separation in binary mixtures containing polymers: A quantitative comparison of single-chain-in-mean-field simulations and computer simulations of the corresponding multichain systems. J. Polym. Sci. Part B: Polym. Phys. 43 (8):934–58.
Murat, M., and K. Kremer. 1998. From many monomers to many polymers: Soft ellipsoid model for polymer melts and mixtures. J. Chem. Phys. 108 (10):4340–48.
Parrinello, M., and A. Rahman. 1981. Polymorphic transitions in single crystals: A new molecular dynamics method. J. Appl. Phys. 52 (12):7182–90.
Pierleoni, C., C. Addison, J. P. Hansen, and V. Krakoviack. 2006. Multiscale coarse graining of diblock copolymer self-assembly: From monomers to ordered micelles. Phys. Rev. Lett. 96 (12):4.
Pryamitsyn, V., and V. Ganesan. 2006. Strong segregation theory of block copolymer–nanoparticle composites. Macromolecules 39 (24):8499–8510.
Ray, J. R., and A. Rahman. 1984. Statistical ensemble and molecular-dynamics studies of anisotropic solids. J. Chem. Phys. 80 (9):4423–28.
Schultz, A. J., C. K. Hall, and J. Genzer. 2005. Computer simulation of block copolymer/nanoparticle composites. Macromolecules 38 (7):3007–16.
Sides, S. W., B. J. Kim, E. J. Kramer, and G. H. Fredrickson. 2006. Hybrid particle-field simulations of polymer nanocomposites. Phys. Rev. Lett. 96 (25):250601.
Smith, K. A., S. Tyagi, and A. C. Balazs. 2005. Healing surface defects with nanoparticle-filled polymer coatings: Effect of particle geometry. Macromolecules 38 (24):10138–47.
Soga, K. G., H. Guo, and M. J. Zuckermann. 1995. Polymer brushes in a poor solvent. Europhys. Lett. 29 (7):531–36.
Soga, K. G., M. J. Zuckermann, and H. Guo. 1996. Binary polymer brush in a solvent. Macromolecules 29 (6):1998–2005.
Stoykovich, M. P., M. Muller, S. O. Kim, H. H. Solak, E. W. Edwards, J. J. de Pablo, and P. F. Nealey. 2005. Directed assembly of block copolymer blends into nonregular device-oriented structures. Science 308 (5727):1442–46.
Thompson, R. B., V. V. Ginzburg, M. W. Matsen, and A. C. Balazs. 2001. Predicting the mesophases of copolymer-nanoparticle composites. Science 292 (5526):2469–72.
Thompson, R. B., V. V. Ginzburg, M. W. Matsen, and A. C. Balazs. 2002. Block copolymer-directed assembly of nanoparticles: Forming mesoscopically ordered hybrid materials. Macromolecules 35 (3):1060–71.
Tyler, C. A., and D. C. Morse. 2003. Stress in self-consistent-field theory. Macromolecules 36 (21):8184–88.
Yatsenko, G., E. J. Sambriski, M. A. Nemirovskaya, and M. Guenza. 2004. Analytical soft-core potentials for macromolecular fluids and mixtures. Phys. Rev. Lett. 93 (25):4.
25 Structure-Based Coarse- and Fine-Graining in Soft Matter Simulations

Nico F.A. van der Vegt, Christine Peter, and Kurt Kremer
Max Planck Institute for Polymer Research
CONTENTS

25.1 Introduction
25.2 Methods
25.2.1 General Concept
25.2.2 Mapping Scheme
25.2.3 Bonded Interaction Potentials
25.2.4 Nonbonded Interaction Potentials
25.2.5 Coarse-Grained Simulations: Equilibration of Mesoscale Structures
25.2.6 Reintroduction of Atomistic Details (“Inverse Mapping”)
25.3 Examples
25.3.1 Structure
25.3.1.1 Inverse-Mapped BPA-PC Melts
25.3.1.2 Two Mapping Schemes for Polystyrene
25.3.1.3 Azobenzene-Based Mesogens
25.3.2 Dynamics
25.3.2.1 Long-Time Atomistic BPA-PC Trajectories Obtained by Inverse Mapping
25.3.2.2 Dynamic Speedup: Additive Molecules in a Long-Chain Polystyrene Melt
25.4 Some Recent Developments and Future Perspectives
25.4.1 Adaptive Resolution MD
25.4.2 Surface Interactions of Biomolecules
25.4.3 Nonbonded Interactions
25.4.4 Perspectives
Acknowledgments
References
25.1 INTRODUCTION

Many physical phenomena in biology, chemistry, and materials science involve processes occurring on atomistic length and time scales, which affect structural and dynamical properties on mesoscopic scales far exceeding atomistic ones. Because it is infeasible (and most often undesirable) to run computer simulations of very large systems with atomically detailed models, mesoscale (coarse-grained) models are being developed through which structural relaxations can be studied at large length scales, allowing for full system equilibration on mesoscopic time scales [1–5]. Ideally, coarse-grained (CG) models stay reasonably close to the chemical structure of the material so that inverse-mapping (reintroduction of chemical details) procedures can be employed and atomically detailed processes can be studied in various windows of the CG trajectory where “something interesting happens.” Only with that possibility at hand can the corresponding CG models be used to describe chemically realistic systems over a wide range of length and time scales in a hierarchical, sequential set of simulations at multiple resolution levels, or in a single, multiscale simulation where the level of resolution can be changed at will, locally or adaptively (in the course of a simulation).

Linking chemical structure to properties and behavior of materials on different time and length scales can be achieved only if the various (high- and low-) resolution models involved are structurally consistent. Ideally, the structural agreement should hold down to the smallest possible length scale, which is the dimension of a CG unit. It is important to realize that, depending on the extent of coarse-graining, many all-atom (AT) states correspond to one CG configuration. Although a one-to-one correspondence between AT and CG configurations therefore does not exist, it is crucial that the conformational ensemble obtained with a CG model corresponds to that of the all-atom system, with the latter being analyzed in terms of the CG degrees of freedom. If we limit ourselves to the classical (non-quantum-mechanical) case, this means that the CG model must be parameterized such that the statistical weights of CG configurations are obtained from a (Boltzmann) weighted average over all corresponding AT states.
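The consistency requirement stated above can be written compactly. The notation below is introduced here for illustration only and is an assumption, not the chapter's own: $\mathbf{r}$ denotes the atomistic coordinates, $\mathbf{R}$ the CG coordinates, $M$ the mapping from atomistic to CG coordinates, $U_{\mathrm{AT}}$ the atomistic potential energy, and $\beta = 1/k_B T$.

```latex
P_{\mathrm{CG}}(\mathbf{R})
  = \frac{\int d\mathbf{r}\, e^{-\beta U_{\mathrm{AT}}(\mathbf{r})}\,
          \delta\!\big(M(\mathbf{r}) - \mathbf{R}\big)}
         {\int d\mathbf{r}\, e^{-\beta U_{\mathrm{AT}}(\mathbf{r})}},
\qquad
U_{\mathrm{CG}}(\mathbf{R}) = -k_B T \ln P_{\mathrm{CG}}(\mathbf{R}) + \mathrm{const}.
```

In this form, the effective CG interaction is a many-body potential of mean force; practical CG force fields approximate it with a limited set of bonded and nonbonded terms.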
Although for many systems we are still far from achieving this goal, it is clear that quantum mechanical (QM), classical atomistic (AT), and coarse-grained (CG) mesoscopic models should ideally be developed such that “scale-hopping” [1,6–8] is possible in both forward and backward directions. It is the purpose of this chapter to discuss some of these issues and to provide examples of CG models and multiscale modeling methods recently developed in our lab. We emphasize structure-based coarse-graining because it follows directly from the goal of allowing structure-based scale hopping as outlined above. In doing so, we follow a coarse-graining prescription that does not rely on ad hoc input to get the desired properties right. Alternative coarse-graining approaches (described elsewhere in this book) will not be discussed. Approaches that go much further and map the whole chain to one ellipsoidal particle [9] or just a soft sphere [10] are also not considered here.

Figure 25.1 shows the systems that are discussed in this chapter: bisphenol-A polycarbonate (BPA-PC) [11,12], polystyrene (PS) [13,14], and the liquid crystalline (LC) azobenzene derivative 8AB8 [15]. The CG representations are superimposed onto the chemical structures, illustrating the typical level of coarse-graining. In Section 25.2 we discuss the coarse-graining and inverse-mapping procedures employed. In Section 25.3, several aspects of the CG models representing the above molecules are discussed in terms of the structure (melt structure, chain conformations, LC order) and dynamics they predict. In that section, we focus on recent developments (what can be done nowadays with structure-based coarse-graining approaches and which pitfalls need to be avoided), inverse-mapped atomistic structures, and issues concerning the time-mapping procedure. In Section 25.4, an outlook on future developments and recent extensions to an adaptive scheme is presented.
25.2 METHODS
25.2.1 GENERAL CONCEPT
The following section is organized along the sequence of steps in deriving a CG model. First, we have to formulate a mapping scheme that relates the coordinates in the atomistic description to the centers of the CG particles. Second, one has to decide on a strategy concerning bonded and nonbonded interactions. In the coarse-graining procedure used by us, nonbonded and bonded interactions are strictly separated and derived sequentially. Such a clear separation makes it more likely that potentials can be transferred, and allows us to distinguish between effects due to inter- and intramolecular potentials. Consequently, we describe separately how bond stretching, bond angle bending, and dihedral torsion potentials in the CG scheme are derived based on an atomistically detailed simulation of the isolated molecule in vacuo. Next, nonbonded interaction potentials between CG beads are derived based on the liquid structure of polymer melts or low-molecular-weight fluids (i.e., fragments of the target molecule or chain). These interaction potentials are subsequently used to generate well-equilibrated mesoscale structures and long-time trajectories of the system of interest. A last step, which also belongs to the coarse-graining procedure in the sense that it is a crucial link between the atomistic and the CG level of resolution, is the procedure of reintroducing atomistic details into a CG simulation trajectory (“back-mapping” or “inverse mapping”).
FIGURE 25.1 Atomistic and coarse-grained models of bisphenol-A polycarbonate (BPAPC), polystyrene (PS), and 4,4′-dioctyloxyazobenzene (8AB8). The CG mapping points are indicated with black dots. The corresponding CG superatoms, centered on the CG mapping points, are represented by the dashed spheres. For PS, two mapping schemes are shown. For BPAPC, mapping points on the carbonate, phenyl, and isopropylidene groups are connected through a single CG bond.
25.2.2 MAPPING SCHEME
The mapping scheme relates the atomistic coordinates of a structure to the bead positions in the CG model. (Our models usually rely on CG centers with spherically isotropic potentials.) It is clear that there is no unique way to map a given set of atoms onto a coarser description. However, depending on the specific system and on the properties of the system that one wants to see reflected on the coarse level, one can define criteria to determine mapping points. Examples of such criteria are the requirement to retain the ability to account for the stereoregularity of chain molecules (e.g., PS [13,14]), or to capture certain geometry changes. For example, for azobenzene-containing LCs (8AB8) [15] one needs a clear distinction between the cis and trans geometry of the AB unit if one wants to investigate photoinduced phase transitions. There are other criteria that make a certain CG model more or less appealing. In the PS example, a mapping was chosen that avoids “branching off” dangling side groups; that is, all CG beads are linearly connected in the chain, which avoids complicated torsion and angle potentials [13,14]. When discussing the computational efficiency of a specific mapping scheme, one has to take several aspects into account. Trivially, one would assume that fewer CG beads per molecule result in higher computational efficiency. In addition to a reduction in the number of degrees of freedom
(DOFs), there is a speedup of the dynamics of the system due to a reduced molecular friction (larger beads, smoother potentials) of the CG model. In the case of chain molecules, there is, however, another aspect that should be kept in mind. Chain dynamics is faster if the envelope of the beads of the chain is tube-like, preventing optimized sphere packing and subsequent cage formation with correspondingly higher friction [16]. A measure of this commensurability is given by the ratio of mean bond length to bead diameter. This criterion was used to explain why, for BPAPC, the mapping scheme of Figure 25.1 with more beads is computationally more efficient than another one in which the phenyl rings were included in somewhat larger spheres at the carbonate and isopropylidene units [11]. Another criterion that needs to be accounted for when devising a mapping scheme relates to the statistical correlations of internal DOFs. The mapping should be chosen such that these correlations are as weak as possible, so that the intramolecular (bonded) potentials can be separated into bond stretching, bond angle bending, and torsion terms, as outlined in the next subsection.
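As an illustration of what a mapping scheme does in practice, the following minimal Python sketch places each CG bead at the weighted center of a group of atoms and evaluates the bond-length-to-diameter ratio mentioned above. The function names, the `bead_atoms` index layout, and the use of NumPy are assumptions made for this example; they are not taken from the chapter. Passing weights other than the atomic masses (for instance half masses for atoms shared between beads) is possible with the same function.

```python
import numpy as np

def map_to_cg(positions, weights, bead_atoms):
    """Return CG bead coordinates as weighted centers of atom groups.

    positions  : (n_atoms, 3) atomistic coordinates
    weights    : (n_atoms,) mapping weights, e.g. atomic masses
    bead_atoms : list of index lists, one per CG bead (hypothetical layout)
    """
    positions = np.asarray(positions, dtype=float)
    weights = np.asarray(weights, dtype=float)
    beads = np.empty((len(bead_atoms), 3))
    for k, idx in enumerate(bead_atoms):
        w = weights[idx]
        # Weighted average of the atom positions belonging to bead k
        beads[k] = (w[:, None] * positions[idx]).sum(axis=0) / w.sum()
    return beads

def commensurability(beads, bead_diameter):
    """Ratio of mean CG bond length to bead diameter for a linear chain."""
    bonds = np.linalg.norm(np.diff(beads, axis=0), axis=1)
    return bonds.mean() / bead_diameter
```

For a linear chain, `commensurability(map_to_cg(x, m, groups), d)` then gives one number per mapping scheme that can be compared between candidate schemes.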
25.2.3 BONDED INTERACTION POTENTIALS
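First of all, the determination of interaction potentials for the CG model is based on the assumption that the total potential energy $U^{\mathrm{CG}}$ can be separated into bonded/covalent ($U_{\mathrm{B}}^{\mathrm{CG}}$) and nonbonded ($U_{\mathrm{NB}}^{\mathrm{CG}}$) contributions [1]:
$$U^{\mathrm{CG}} = \sum U_{\mathrm{B}}^{\mathrm{CG}} + \sum U_{\mathrm{NB}}^{\mathrm{CG}}. \qquad (25.1)$$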
Intramolecular bonded/covalent interactions of the CG model are determined by sampling the distributions of (CG) conformational DOFs based on an atomically detailed simulation (Monte Carlo or molecular dynamics (MD) using a stochastic thermostat to ensure proper equilibration) of an isolated molecule in vacuo. These conformational distributions are in general characterized by CG bond lengths {r}, bond angles {θ}, and torsions {φ}; that is, $P^{\mathrm{CG}}(r, \theta, \phi, T)$, and are clearly temperature dependent (for simplicity we assume here that there is only one kind of bond, bond angle, or torsion). If one assumes that the different CG internal DOFs are uncorrelated, $P^{\mathrm{CG}}(r, \theta, \phi, T)$ factorizes into independent probability distributions of bond length, angle, and torsional DOFs:
$$P^{\mathrm{CG}}(r, \theta, \phi, T) = P^{\mathrm{CG}}(r, T)\, P^{\mathrm{CG}}(\theta, T)\, P^{\mathrm{CG}}(\phi, T). \qquad (25.2)$$
This assumption has to be carefully checked (it is not uncommon that CG DOFs are correlated, for example that certain combinations of CG bonds, angles, and torsions are “forbidden” in the distributions obtained from the “real” atomistic chain), and is an important test of the suitability of a mapping scheme [14], because a mapping scheme that requires complex multiparameter potentials is computationally rather inefficient. The individual probability distributions $P^{\mathrm{CG}}(r, T)$, $P^{\mathrm{CG}}(\theta, T)$, and $P^{\mathrm{CG}}(\phi, T)$ are then Boltzmann inverted to obtain the corresponding potentials and, by taking the derivatives, the forces:
$$U^{\mathrm{CG}}(r, T) = -k_B T \ln\left[P^{\mathrm{CG}}(r, T)/r^2\right] + C_r, \qquad (25.3)$$
$$U^{\mathrm{CG}}(\theta, T) = -k_B T \ln\left[P^{\mathrm{CG}}(\theta, T)/\sin\theta\right] + C_\theta, \qquad (25.4)$$
$$U^{\mathrm{CG}}(\phi, T) = -k_B T \ln P^{\mathrm{CG}}(\phi, T) + C_\phi. \qquad (25.5)$$
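The Boltzmann inversion of Equations 25.3 through 25.5 can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: the function name, the histogram-based estimate of the distributions, the unit choice (kJ/mol), and the convention of fixing the additive constant so that the minimum of each potential is zero are all assumptions made here.

```python
import numpy as np

KB = 0.0083144626  # Boltzmann constant in kJ/(mol K); unit choice is an assumption

def boltzmann_invert(samples, bins, temperature, kind):
    """Tabulated potential from sampled CG bond lengths, angles, or torsions.

    kind : "bond" (divide by r^2), "angle" (divide by sin(theta)),
           or "torsion" (no volume element). Angles are in radians.
    Bins that were never sampled are returned as NaN.
    """
    hist, edges = np.histogram(samples, bins=bins, density=True)
    x = 0.5 * (edges[:-1] + edges[1:])  # bin centers
    if kind == "bond":
        jacobian = x**2
    elif kind == "angle":
        jacobian = np.sin(x)
    elif kind == "torsion":
        jacobian = np.ones_like(x)
    else:
        raise ValueError("kind must be 'bond', 'angle', or 'torsion'")
    with np.errstate(divide="ignore", invalid="ignore"):
        u = -KB * temperature * np.log(hist / jacobian)
    u[~np.isfinite(u)] = np.nan
    # The additive constant C is chosen so that the minimum is zero
    return x, u - np.nanmin(u)
```

In a real workflow the tabulated potential would typically be smoothed and extrapolated into unsampled regions before forces are computed from its derivative; that step is omitted here.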
When deriving potentials from bond and angle distributions one has to account for the respective volume elements $r^2$ and $\sin\theta$. Using the inverted distributions as potentials means that these potentials are in fact potentials of mean force. They are therefore free energies and consequently temperature dependent. As mentioned before, this temperature dependence originates not only from the prefactor $k_B T$, but also from the distributions P themselves. Strictly speaking, the potentials can only be applied at the temperature (state point) at which they were derived. The approach outlined in this section is in contrast to other approaches, where the CG internal DOFs are determined based on the distributions obtained from an atomistic simulation of the liquid phase [3]. In the latter case one obtains potentials for bonded and nonbonded interactions simultaneously from the same liquid simulation; consequently they are potentially interdependent; that is, there is no clear separation between covalent and nonbonded interaction potentials. We achieve this separation by deriving CG bond length, bond angle, and torsional distributions from the atomically detailed conformations sampled by a single (chain) molecule in vacuo. In the atomistic simulation performed to generate the distributions of CG intramolecular DOFs, nonbonded interactions have to be included with care to avoid “double counting” of interactions. This means that long-range intrachain nonbonded interactions (beyond the distance between CG beads that is explicitly covered via bonded interaction potentials, for example, beyond the distance of three CG bonds if torsion potentials are used) should be excluded when the single chains are sampled. Instead, these long-range interactions should be treated equivalently to CG intermolecular nonbonded interactions.
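The bookkeeping behind the "double counting" rule can be made concrete with a small Python sketch. It lists, for a linear CG chain, the intrachain bead pairs that lie beyond the range of the bonded terms and are therefore handled by the nonbonded potential. The function name and the restriction to linear, unbranched chains are assumptions for illustration only.

```python
def intrachain_nonbonded_pairs(n_beads, n_bonds_covered=3):
    """Bead pairs (i, j) of a linear chain separated by more than
    n_bonds_covered CG bonds.

    With torsion potentials, pairs up to three bonds apart are covered by
    bonded terms, so only pairs with j - i > 3 are returned.
    """
    return [(i, j)
            for i in range(n_beads)
            for j in range(i + n_bonds_covered + 1, n_beads)]
```

The atomistic interactions between atoms belonging to these bead pairs are the ones that would be switched off in the single-chain sampling run.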
25.2.4 NONBONDED INTERACTION POTENTIALS
The general principle when deriving nonbonded interaction potentials is to reproduce structural properties; that is, radial distribution functions of (low-molecular-weight) liquids or polymer melts (experimentally known or obtained from atomistic simulations). Similarly to the above case of bonded interaction functions, one has two principal options: either (1) to use analytical potentials, in which case one would optimize the parameters of a chosen analytical function to reproduce the structure of the atomistic melt/liquid as accurately as possible (or to account for the excluded volume interaction only, in which case no further optimization is done, see BPAPC [1,11]); or (2) to use numerically derived tabulated potentials, which are designed such that the CG liquid reproduces the atomistic liquid structure when the latter is analyzed in terms of the overlaid CG structure that each microstate corresponds to. In the first case, analytical potentials of various types can be used. The “normal” Lennard–Jones 12-6 potential is frequently used; it has, however, been shown in many cases to be too steeply repulsive (that is, too “hard”) for CG particles, which are rather large and soft. In that case, softer Lennard–Jones-type (e.g., 9-6 or 7-6) [14], Buckingham, or Morse potentials [15] are employed. These potentials are usually made purely repulsive in the spirit of the WCA potential [17] by shifting them upwards and truncating them at the minimum. To search parameter space and optimize these analytical potentials to reproduce a given liquid or melt structure, a simplex algorithm can be used [18,19]. For the second option, namely to generate numerically a tabulated potential that closely reproduces a given melt structure (that is, a given radial distribution function g(r)), the iterative Boltzmann inversion method has been developed [20,21]. This method relies on an initial guess for a nonbonded potential $U_{\mathrm{NB},0}^{\mathrm{CG}}$.
Usually the Boltzmann inverse of the target $g_{\mathrm{target}}(r)$; that is, the potential of mean force,
$$U_{\mathrm{NB},0}^{\mathrm{CG}} = -k_B T \ln g_{\mathrm{target}}(r), \qquad (25.6)$$
is used, with which one then generates a CG simulation trajectory of the liquid. The resulting structure will not match the target structure since, due to multibody interactions, the potential of mean
force is a good estimate for the pair potential only at very high dilution. However, using the iteration scheme
$$U_{\mathrm{NB},i+1}^{\mathrm{CG}} = U_{\mathrm{NB},i}^{\mathrm{CG}} + k_B T \ln\left[\frac{g_i(r)}{g_{\mathrm{target}}(r)}\right], \qquad (25.7)$$
the original guess can be self-consistently refined until the desired structure is obtained. There can be limits to this approach because it is not always clear whether the iteration can converge to an optimal fit for the chosen CG mapping scheme. For complex molecules with a large number of different CG beads or, more importantly, in the case of molecules that form complex or anisotropic liquid or melt structures (for example, liquid crystals), the procedure to determine nonbonded interaction functions is more complicated. In these cases it is advantageous to split the target molecule into fragments so that the nonbonded interactions between different bead types can be determined based on the structure of isotropic liquids of these fragment molecules. One principal problem that arises if one uses smaller fragments to generate nonbonded interaction potentials for larger molecules is that different conformations may contribute to the structure of the liquid of the fragment molecules differently than in the (polymeric) melt [22]. One example where such an effect may play a role is the parameterization of phenyl rings based on the structure of liquid benzene: in that case the relative population of parallel and perpendicular arrangements of two phenyl rings that are part of longer chain molecules potentially differs from the arrangements in liquid benzene for steric reasons. Despite these potential problems, the procedure of parameterizing CG nonbonded interactions based on small molecules is a promising way to generate CG parameters for complex molecules. It also allows reuse of certain CG potentials for recurring building blocks (such as alkyl or phenyl groups), with the aim of creating some sort of building-block or LEGO set of molecule fragments for CG simulations.
Of course, this approach needs to be carefully tested and the transferability of the potentials generated from these fragments to (slightly) different conditions needs to be carefully evaluated (as will be further discussed in the Examples section).
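The update rule of Equation 25.7, together with the initial guess of Equation 25.6, can be sketched in a few lines of Python. The function names and the treatment of bins where g(r) vanishes (left unchanged in the update, set to infinity in the initial guess) are assumptions made for this example; production codes usually extrapolate the potential into the core region and often damp the correction. Running the CG simulation that produces each new g_i(r) is outside the scope of the sketch.

```python
import numpy as np

def initial_guess(g_target, kT, g_min=1e-8):
    """Potential of mean force, Eq. 25.6; bins with g ~ 0 are set to +inf."""
    g_target = np.asarray(g_target, dtype=float)
    u0 = np.full_like(g_target, np.inf)
    ok = g_target > g_min
    u0[ok] = -kT * np.log(g_target[ok])
    return u0

def ibi_update(u_current, g_current, g_target, kT, g_min=1e-8):
    """One iterative Boltzmann inversion step, Eq. 25.7, on a common r grid.

    Bins where either RDF is (numerically) zero are left unchanged.
    """
    u_next = np.array(u_current, dtype=float)
    g_current = np.asarray(g_current, dtype=float)
    g_target = np.asarray(g_target, dtype=float)
    ok = (g_current > g_min) & (g_target > g_min)
    u_next[ok] += kT * np.log(g_current[ok] / g_target[ok])
    return u_next
```

The sign of the correction is easy to check: where the CG liquid shows too much structure (g_i larger than the target), the potential is raised, which makes that distance less favorable in the next iteration.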
25.2.5 COARSE-GRAINED SIMULATIONS: EQUILIBRATION OF MESOSCALE STRUCTURES
Even with the dynamic speedup gained by CG models, it is not trivial to obtain well-equilibrated structures of mesoscale (polymeric) systems. In particular for long-chain molecules (beyond a few entanglement lengths), branched polymers, or polymers at interfaces, brute-force MD algorithms that follow the slow dynamics of the system will not easily lead to complete equilibration of the chains. Moreover, criteria are needed to judge whether a melt structure is really equilibrated, since local monomer packing and the statistics of end-to-end distances or radii of gyration are not sufficient. Auhl et al. [23] describe such criteria and investigate various methods to generate well-equilibrated polymer melts using MD simulations. Based on such CG structures and simulation trajectories, it is possible in the next step to reintroduce atomistic coordinates and to obtain equilibrated atomistic structures on the mesoscale or long-time atomistic trajectories.
25.2.6 REINTRODUCTION OF ATOMISTIC DETAILS (“INVERSE MAPPING”)
Inverse mapping; that is, reintroduction of atomistic detail, requires finding a set of atomistic coordinates that corresponds to a given CG structure. In general there is no unique solution to that problem since each CG structure corresponds to many all-atom configurations. Therefore, one needs to find one representative all-atom structure, with the correct statistical weight of those DOFs that are not resolved in the CG description. Several slightly different strategies to reintroduce atomistic detail into a CG structure have been presented [2,3,12,13,15,24].
If the (polymer) chain consists of reasonably rigid (all-atom) fragments, it is sufficient to fit these rigid all-atom units onto the corresponding CG chain segment coordinates. The atomistic fragments can be taken from a pool of structures that correctly reflect the statistical weight of those DOFs (certain torsions, ring flips, etc.) that are not resolved in the CG description and that relax too slowly to be properly equilibrated in a short equilibration run of the resulting atomistic structure. If the CG molecule/polymer chain consists of very flexible units (for example, alkyl tails), and in particular if the CG structure consists of small molecules (8AB8, a low-molecular-weight LC) for which the atomistic structure diffuses significantly away from the CG coordinates even in a very short equilibration step, a slightly different strategy was employed. Atomistic coordinates were inserted into the CG structure using fragments for the rigid units and random atomistic positions for the flexible units (with the constraint that the atomistic coordinates have to satisfy the “mapping” condition; that is, the atomistic coordinates have to correspond to the CG structure if one applies the mapping scheme). The resulting structure was then relaxed (energy minimized and equilibrated by MD simulations) while restraining the atomistic coordinates to the CG mapping points. This results in a perfectly equilibrated structure that (almost, depending on the strength of the restraining potential) exactly reproduces the CG structure.
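The restraining step can be illustrated with a short Python sketch that ties the weighted center of each atom group to its CG mapping point. The harmonic form of the restraint, the force constant `k`, and all function and argument names are assumptions made here for illustration; the chapter does not specify the functional form that was actually used.

```python
import numpy as np

def restraint_energy_forces(positions, weights, bead_atoms, cg_positions, k):
    """Harmonic restraint 0.5 * k * |R_map - R_cg|^2 per bead (assumed form).

    R_map is the weighted center of the atoms assigned to a bead.
    Returns the total restraint energy and the force on every atom.
    """
    positions = np.asarray(positions, dtype=float)
    weights = np.asarray(weights, dtype=float)
    cg_positions = np.asarray(cg_positions, dtype=float)
    energy = 0.0
    forces = np.zeros_like(positions)
    for idx, target in zip(bead_atoms, cg_positions):
        w = weights[idx] / weights[idx].sum()   # normalized mapping weights
        d = w @ positions[idx] - target         # deviation from mapping point
        energy += 0.5 * k * (d @ d)
        forces[idx] -= k * w[:, None] * d       # F_i = -k * w_i * d
    return energy, forces
```

Added to the atomistic force field during minimization and MD, a term of this kind keeps the mapped atomistic structure close to the CG configuration, with the residual deviation controlled by `k`.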
25.3 EXAMPLES
In this section, we discuss, on the basis of the three examples shown in Figure 25.1 (and Figure 25.6), various aspects of structure-based coarse-graining, focusing on recent developments, inverse-mapped atomistic structures, and dynamics. In Section 25.3.1 (“Structure”) we discuss experimental validation of inverse-mapped BPAPC and PS melt structures and the prospects that open up due to the resulting well-equilibrated long-time/large-scale atomistic trajectories; we illustrate the consequences of the choice of a CG mapping scheme using the example of PS, and we show the application of the present coarse-graining approach to LC molecules. In Section 25.3.2 (“Dynamics”) we discuss how the corresponding time scales are modified by the application of CG models. In that context we compare BPAPC chain dynamics in all-atom and CG molecular liquids as well as diffusion of low-molecular-weight additives in CG PS melts.
25.3.1 STRUCTURE
25.3.1.1 Inverse-Mapped BPAPC Melts
Although many aspects of, for example, polymer dynamics, overall chain conformations, or LC order can be well described with CG resolution, for many other questions chemical details need to be reintroduced by the inverse-mapping methods described in the previous section. We illustrate this here by discussing aspects of packing of BPAPC polymeric liquids [2,12] and the evaluation of interactions (chemical potentials) of small molecules inside polymeric microstructures [25]. To check the quality of BPAPC melt structures, we calculated neutron scattering functions of the (reintroduced) all-atom melts. Figure 25.2a shows the coherent neutron scattering function for a melt containing 100 chains of N = 20 chemical repeat units at two temperatures [12,26]. The simulated functions are compared with experiments performed at T = 1.5 K [26] and consequently, most probably, at a slightly higher density. The peak at 0.6 Å−1 corresponds to the intrachain sequential carbonate–carbonate distance of about 11 Å and not to interchain correlations. This could be concluded from the simulations in which the neutron scattering functions were calculated. For the “computer samples” one can vary the atomic scattering lengths in the analysis and delete or create scattering contrast for any correlation at will. The main peak (amorphous halo) corresponds to the typical interchain (packing) distance. The agreement between the experimental data and the simulations is close to perfect. The discrepancies are due to the higher temperature of the simulated melts, which causes the amorphous halo to broaden and to shift to slightly larger distances and the peak corresponding to intrachain
FIGURE 25.2 (a) Coherent neutron scattering function of two BPAPC melts (290 and 570 K) [12] in comparison with experiments on a sample that was cooled down and kept at a temperature of 1.5 K [26]. The solid and dashed curves were obtained by inverse mapping of chemical details for a system containing 100 chains of 20 repeat units each. (b) Radial distribution function of a simulated atactic polystyrene melt obtained by inverse mapping of chemical details [13] in comparison with the experimental RDF obtained from X-ray diffraction [27]. All atom–atom correlations are included except those between atoms within phenyl rings and atoms along the backbone separated by less than three chemical bonds.
carbonate–carbonate correlations to wash out. A comparison of simulated scattering curves with experimental data for partially deuterated BPAPC samples was also made [2,12], which further supported the overall agreement with experiments [26]. A similar comparison was made for a PS melt. Figure 25.2b shows the total radial distribution function obtained after reintroducing chemical details together with experimental data obtained by wide-angle X-ray diffraction measurements [27]. In both the simulation and the experimental data, intramolecular correlations due to 1-2 and 1-3 bonded neighbors (along the backbone) as well as all intra-ring correlations have been removed in order to emphasize the features deriving from the packing of nonbonded segments. Despite differences in temperature and chain length of the simulated and experimental samples, the overall agreement is very good. Moreover, in our analysis of the simulation trajectories we employed a united-atom model. Because of that, we assumed Q-independent atomic scattering functions, taking the carbon nuclear positions as scattering centers. This assumption gives rise to a more strongly developed peak slightly below 4 Å in the simulated data in comparison with the X-ray experiment. As a second example we mention a significant advantage of using inverse-mapped polymer microstructures in studying permeation of small molecules (so-called “penetrants”). The first application using this approach was a computational study of phenol in BPAPC [28]. The phenol diffusion process revealed a strong coupling between size and shape fluctuations of the pore space and the hopping of the penetrant. The pore structure was also analyzed in terms of the positronium annihilation time [29]. The resulting lifetime distribution functions compared very well to those from experiments, again supporting the overall consistency of the approach.
In addition to diffusion, the penetrant solubility or excess chemical potential inside the polymer microstructure is also of interest. With currently available methods, penetrant excess chemical potentials can only be computed with sufficient statistical accuracy for fairly small penetrants. These are usually pure substances, such as gases under ambient conditions. A polymeric simulation box with a typical linear dimension of 4–5 nm is usually large enough to contain a statistically meaningful number of pre-existing, empty cavities, which can host a small molecule without significantly modifying the matrix. Thus standard methods, such as test-particle insertion techniques, can be used to obtain reliable data. However, calculations of excess chemical potentials of larger penetrants, with equally
high statistical reliability, are extremely cumbersome for several reasons. Most importantly, larger penetrants (e.g., phenol, propane, chloroform) occupy larger cavities, which in microstructures with the above-mentioned linear dimensions occur very infrequently, albeit contributing significantly to the excess chemical potential. This problem can be resolved only if a large number of statistically uncorrelated microstructures can be generated at small computational expense. Obviously, reinserted all-atom microstructures generated from CG mesoscale simulations can be used to resolve this problem. Based on large systems generated in this way, we currently explore an alternative, nonequilibrium free-energy sampling technique, in order to resolve insertion problems usually encountered with large molecules in dense systems [25].
25.3.1.2 Two Mapping Schemes for Polystyrene
As discussed in Section 25.2, CG intramolecular potentials are developed assuming that the CG bond length, bond angle, and dihedral angle have no interdependencies. The validity of this assumption depends, however, on how we choose the CG mapping points. Figure 25.1 shows two CG representations for PS [14]. In the first scheme (I), the PS repeat unit is represented by a CG bead (type “A”) localized on the methylene position, and another CG bead (type “B”) is localized on the mass center position of the remaining atoms. A and B beads are connected by CG bonds, giving rise to bond angles θABA and θBAB, and dihedral torsions ϕABAB and ϕBABA. In the second scheme (II), bead A is positioned at the center of mass defined by the methylene group and the two adjacent CH groups (taking, however, half masses rather than the full CH masses in defining the CG bead mass center). Bead B corresponds to the phenyl group. The A and B beads in scheme (II) are also connected by CG bonds, giving rise to the same number of DOFs (see Figure 25.3a).
We note that the corresponding intramolecular potentials depend on the chain stereoregularity (i.e., the type of dyad [13]); hence the model can in principle be used in simulations of atactic, isotactic, and syndiotactic PS. The PS conformation on the left-hand side in Figure 25.3a is based on CG mapping scheme (I) and is shown to illustrate how the (θ,ϕ) CG angles are correlated. If the A bead on the left end of the picture is rotated about the indicated CG bond, the adjacent B bead will also be rotated because these two beads are directly connected through two underlying chemical bonds. This causes variations of ϕABAB and θBAB to be correlated. Whether, and to what degree, such correlations lead to erroneous conformational sampling in the CG simulations depends on the mapping scheme and needs to be tested to assess the quality of a mapping scheme. Figure 25.3b shows energy diagrams (defined as $-\ln[P^{\mathrm{CG}}(\theta, \phi)/\sin\theta]$) in a contour map representation for the racemic PS dyad [14]. The bond bending angle θ corresponds to BAB and the dihedral angle ϕ to ABAB. The diagrams presented in the left part of this figure were obtained from simulations of a single united-atom chain, and the diagrams on the right were obtained with the corresponding CG models. The upper panel corresponds to mapping scheme (I) and the lower panel to mapping scheme (II) (see Figure 25.3a). In the contour maps obtained with the CG models, the (θ,ϕ) correlation discussed above is lost to some extent. For example, CG scheme (I) has an energy minimum at θ ≈ 150° (upper panel, left), which is about 3 kBT deeper than the minimum at θ ≈ 100°. Therefore, CG model (I) predominantly samples θ ≈ 150°, independent of the torsion angle ϕ, which causes the energy basin at (θ,ϕ) ≈ (100°, 240°) observed with the united-atom model (upper panel, left) to shift to a region (θ,ϕ) ≈ (150°, 240°) (upper panel, right) hardly ever sampled by the united-atom model.
With mapping scheme (I), the CG model also samples parts of (θ,ϕ) space that are not at all accessible to the united-atom model (e.g., (80°, 300°) or (80°, 30°)). These “forbidden” regions include conformations with excluded volume violations of CG 1-4 interaction sites (methylene units partly overlapping with phenyl groups). These overlaps can be avoided by introducing a special 1-4 nonbonded interaction in the CG model [13]. Notably, CG model (II) clearly performs much better in this respect. Because special 1-4 nonbonded terms are not needed [14], it is also more consistent with
FIGURE 25.3 (a) PS conformation with CG mapping points based on schemes (I) and (II) (cf. Figure 25.1). The CG mapping points are indicated with black dots; CG bonds are indicated by thick gray lines. (b) (θ,ϕ) energy surfaces: I (AT), obtained by sampling the atomistic model, analyzed in terms of CG scheme (I); I (CG), obtained by sampling with the CG model, scheme (I); II (AT), obtained by sampling the atomistic model, analyzed in terms of CG scheme (II); II (CG), obtained by sampling with the CG model, scheme (II).
the general CG strategy outlined in the previous sections. In addition, there are certain advantages when studying dynamical properties compared to CG model (I). It is very important to be aware of correlations of internal DOFs in CG simulations, even though artifacts introduced by decoupling the bond-angle bending and dihedral torsion potentials in CG models have so far been shown to affect neither the overall chain conformations nor the ability to successfully perform the inverse mapping in polymer modeling [13,14]. Such decoupling is potentially more problematic in CG models for biomolecules. There, a similar decoupling of the bonded potentials is likely to be more tedious because specific (θ,ϕ) combinations may turn out to be needed for discriminating turns, helices, sheets, etc., which will be a significant criterion to distinguish “good” and “bad” mapping schemes [30].
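A simple numerical check of the (θ,ϕ) coupling discussed above is to build the two-dimensional energy surface from sampled angle pairs and to quantify how far the joint distribution departs from the product of its marginals. The Python sketch below does both; the function names, the bin counts, and the choice of mutual information as the correlation measure are assumptions introduced here and are not part of the analysis reported in [14].

```python
import numpy as np

def energy_surface(theta, phi, bins=(36, 72)):
    """-ln[P(theta, phi) / sin(theta)] in units of kBT; angles in degrees.

    Cells that were never sampled are returned as +inf.
    """
    hist, t_edges, p_edges = np.histogram2d(
        theta, phi, bins=bins, range=[[0, 180], [0, 360]], density=True)
    t_mid = 0.5 * (t_edges[:-1] + t_edges[1:])
    p_mid = 0.5 * (p_edges[:-1] + p_edges[1:])
    with np.errstate(divide="ignore"):
        surface = -np.log(hist / np.sin(np.radians(t_mid))[:, None])
    return t_mid, p_mid, surface

def mutual_information(theta, phi, bins=(18, 36)):
    """Mutual information (in nats) between binned theta and phi.

    Zero means the joint histogram equals the product of its marginals,
    i.e. the factorization assumed for the bonded potentials holds.
    """
    counts, _, _ = np.histogram2d(
        theta, phi, bins=bins, range=[[0, 180], [0, 360]])
    p = counts / counts.sum()
    product = p.sum(axis=1, keepdims=True) @ p.sum(axis=0, keepdims=True)
    mask = p > 0
    return float((p[mask] * np.log(p[mask] / product[mask])).sum())
```

Applied to the atomistic chain analyzed in terms of two candidate mapping schemes, a markedly larger value for one scheme would point to the kind of coupling described for scheme (I); for finite samples the estimate carries a small positive bias that grows with the number of bins.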
25.3.1.3 Azobenzene-Based Mesogens
In the previous two examples the coarse-graining procedure (Section 25.2) was applied to polymeric systems, where the behavior of the melt is very much determined by chain connectivity and excluded volume interactions of the polymeric beads. Consequently, it is often not essential to introduce attractive (nonbonded/intermolecular) interactions in order to correctly predict melt structure and dynamics on the mesoscale. It is, however, very interesting to explore how far the above coarse-graining scheme carries if one tries to apply it to systems where attractive nonbonded interactions are likely to be more important than in amorphous polymers; that is, where the balance of attractions between different chemical units may play a role in structure formation. Biopolymers, liquid crystals, and self-assembling systems in general are examples where this can be of importance. The compound 8AB8 (see Figure 25.1) is an LC compound that contains azobenzene as a mesogen and forms a thermotropic nematic phase (and a monotropic smectic phase). This system is used to study how the coarse-graining approach can be adapted to LC systems. It is of particular interest to build a CG model that is close to an atomistic description, not only in order to obtain as much chemical accuracy as possible but also because a close link between the coarse (mesoscale) and the atomistic level is important for multiscale simulation purposes. The reason for this is that azobenzene is a photoswitchable mesogen; that is, it undergoes a trans/cis photoisomerization, which goes along with a drastic shape change: in its trans form it is rod-shaped and functions as a mesogen; in its cis form, it is bent and does not induce a mesophase. Therefore, with 8AB8 a photoinduced nematic-to-isotropic phase transition is observed.
This LC phase change and the photoisomerization mechanism are interdependent: on the one hand, the LC phase change obviously depends on the degree of trans/cis isomerization, and on the other hand it is believed that the photoisomerization mechanism depends on the (anisotropic) environment or the mechanical pulling of the tails that are attached to the azobenzene group. Therefore, the LC photoswitching of azobenzene compounds is a true multiscale problem, since the photoisomerization mechanism can be studied using quantum-mechanical (QM) simulation techniques, whereas investigations of the LC phase change require much longer length and time scales that can only be reached by mesoscale (CG) techniques. In this situation it is important to be able to switch between the levels of resolution, where the atomistic description can function as a link; that is, the coarse model needs to be built on the atomistic description, and the inverse mapping from the CG to the all-atom level is essential to link to QM calculations of the transition. Reference [15] describes how a CG model for 8AB8 was developed using the CG techniques developed for polymers. It was shown how intramolecular (bonded) potentials were obtained from simulations of a single all-atom 8AB8 molecule, and how intermolecular potentials were developed based on all-atom simulations of isotropic liquids of fragments of the 8AB8 molecule. The isotropic liquids that were used in the parameterization process were liquid benzene, liquid azobenzene (in its trans and in its cis form), liquid octadecane, and various mixtures of these compounds. Based on the structure of these liquids (radial distribution functions), nonbonded interaction potentials were determined, using both analytical potential functions and the iterative Boltzmann inversion method as detailed in the Methods section (for the case of octadecane see Figure 25.4a).
The resulting interaction functions were then used for liquid (trans) 8AB8, where we tried to reproduce the experimentally observed LC phase behavior. In particular, we aimed at obtaining a stable nematic phase. The use of (soft) analytical potentials that are purely repulsive (in the spirit of the previous coarse-graining examples of polymeric systems) did not yield the correct mesophase behavior of 8AB8; in fact, no long-range ordering was observed for the model chosen (see Figure 25.4b), even with a rather wide scan of temperatures and pressures. With potentials generated by the iterative Boltzmann inversion method, that is, numerical (tabulated) potentials that are also partly attractive, it is however possible to generate nematic-like (and smectic) phases of 8AB8. Thus, for the given molecule, that is, the given size and shape of the mesogen and the given molecular flexibility of the alkoxy tails, it seems to be important to account for attractions between the different beads in the CG model in order to reproduce the ordered phase of 8AB8.

FIGURE 25.4 (a) Structure-based derivation of nonbonded interaction potentials: carbon–carbon radial distribution functions (RDF) of CG centers in an octadecane liquid at 400 K. Thin solid line: RDF obtained from atomistic simulation, mapped onto CG centers. Thin dashed line: RDF obtained in CG simulation after optimizing a purely repulsive Morse potential to reproduce the atomistic structure as well as possible. Thick dotted line: RDF obtained in CG simulation after determining a numerical potential through iterative Boltzmann inversion so that the atomistic structure is reproduced. (b) Order parameter of the 8AB8 system in coarse-grained simulations (initial setup fully ordered: four smectic layers). Black and light gray lines: simulations with potentials obtained through iterative Boltzmann inversion (partly attractive). Black line: the system remains ordered at T = 0.8 (corresponding to 320 K); nematic-like structures are observed. Light gray line: the system becomes isotropic at T = 0.95 (corresponding to 380 K). Dark gray line: simulation with purely repulsive Morse potentials; the system becomes disordered (over a wide range of temperatures and densities).

A snapshot of a structure that shows the alignment of the 8AB8 molecules in a nematic-like phase can be seen in Figure 25.6 (color insert in the center of the book). This structure was generated by MD simulations using the CG model; the atomistic coordinates that are also shown in the figure were obtained using the inverse-mapping procedure outlined above (restraining the atom coordinates during equilibration such that the “mapping criterion” is satisfied and the CG structure is therefore preserved). This shows that the structure-based coarse-graining approach originally developed in the polymer framework can be extended to LC systems, where mesoscale simulations (with both large length and long time scales) are essential to probe phase behavior and to generate well-equilibrated mesostructures. With the given approach the mesoscale simulations also maintain an important link to the chemical structure, and through the inverse-mapping procedure it is possible to obtain atomistic coordinates of the system. In the course of the parameterization of the nonbonded interactions, we also performed preliminary tests on the transferability of these fragment-based potentials. We tested, for example, the applicability of potentials derived for pure liquids to mixtures of various compositions, and of potentials derived for liquid benzene to liquid trans or cis azobenzene. Overall, the transfer of the nonbonded potentials worked surprisingly well; the limitations are discussed more thoroughly in Ref. 15, and these investigations will be extended in the future.
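The order parameter used to distinguish ordered from isotropic 8AB8 configurations (Figure 25.4b) can be illustrated with a short sketch. The exact definition used for the figure is not given here, so the code assumes the standard nematic order parameter S = (3λ_max − 1)/2, where λ_max is the largest eigenvalue of the averaged tensor ⟨u uᵀ⟩ of the molecular long axes u; the function name, the fixed starting vector, and the use of a simple power iteration are choices made only for this illustration.

```python
import math

def nematic_order_parameter(axes, iterations=200):
    """Nematic order parameter S = (3*lambda_max - 1)/2.

    axes: list of (x, y, z) molecular long-axis vectors (need not be unit).
    lambda_max is the largest eigenvalue of M = <u u^T>, found by power
    iteration. M is positive semidefinite with trace 1, so the iteration
    converges to lambda_max unless the start vector is exactly orthogonal
    to the director (not handled in this sketch).
    """
    n = len(axes)
    m = [[0.0] * 3 for _ in range(3)]
    for ax in axes:
        norm = math.sqrt(sum(c * c for c in ax))
        u = [c / norm for c in ax]
        for a in range(3):
            for b in range(3):
                m[a][b] += u[a] * u[b] / n
    v = [0.37, 0.53, 0.76]  # arbitrary, deliberately asymmetric start vector
    for _ in range(iterations):
        w = [sum(m[a][b] * v[b] for b in range(3)) for a in range(3)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    lam = sum(v[a] * m[a][b] * v[b] for a in range(3) for b in range(3))
    return 0.5 * (3.0 * lam - 1.0)

# Perfectly aligned rods (head-tail symmetric) versus three orthogonal rods.
print(nematic_order_parameter([(0, 0, 1), (0, 0, -1), (0, 0, 2)]))
print(nematic_order_parameter([(1, 0, 0), (0, 1, 0), (0, 0, 1)]))
```

Because the tensor is quadratic in u, the result does not depend on the head-tail orientation of the molecules, as appropriate for a nematic phase.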
25.3.2 DYNAMICS
Within CG models, length scales are usually well defined through the construction of the coarse-graining itself. In most dynamic CG simulations reported in the literature, however, little attention is paid to the corresponding “coarse-graining” of the time unit. From polymer simulations of both simple continuum and lattice models it is known that such simulations reproduce the essential generic features of polymer dynamics, that is, the crossover from the Rouse to the entangled reptation regime, qualitatively and to a certain extent quantitatively [31,32]. While such previous studies concern motion over distances well above a typical monomer extension and provide quantitative information on characteristic time ratios, this still leaves a number of open questions. These concern the predictive quantitative modeling of diffusion, viscosity, rates, correlation times, etc. of dynamic events, as well as the minimal time and length scales to which CG simulations apply. Particle mass, size, and energy scale, which are all well defined within a CG model, of course trivially fix a time scale, too, and it is indeed this time scale that is most often reported in MD simulations of CG systems. However, it does not usually correspond to the true physical time scale, because part of the friction experienced by a (sub)molecule in the atomistic (AT) representation is lost in the CG representation, causing the CG system to evolve faster. (Note that this is in principle also the case for atomistic simulations that make use of so-called united atoms, where aliphatic hydrogen atoms are incorporated into the carbon atoms.) In other words, the fluctuating random forces of the atomic degrees of freedom (DOFs), which are integrated out in the CG model, contribute to a “background friction” that must be considered in order to obtain a realistic time scale in the CG dynamics simulation.
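As an illustration of how particle mass, size, and energy scale fix the intrinsic time unit of a CG model, the following sketch evaluates the standard Lennard–Jones reduced time unit τ = σ(m/ε)^{1/2}. The function name and the numerical bead parameters are hypothetical placeholders and are not parameters of any model discussed in this chapter.

```python
import math

def lj_time_unit_ps(sigma_nm, mass_g_per_mol, epsilon_kj_per_mol):
    """Intrinsic time unit tau = sigma * sqrt(m / epsilon), in picoseconds.

    With sigma in nm, m in g/mol and epsilon in kJ/mol the result is in ps,
    because sqrt((g/mol) * nm^2 / (kJ/mol)) equals 1 ps.
    """
    return sigma_nm * math.sqrt(mass_g_per_mol / epsilon_kj_per_mol)

# Hypothetical bead: sigma = 0.5 nm, m = 100 g/mol, epsilon = 4 kJ/mol.
tau = lj_time_unit_ps(0.5, 100.0, 4.0)
print(f"intrinsic time unit: {tau:.2f} ps")
```

This intrinsic value is only a unit conversion; the physical time unit has to be determined in a separate mapping step, because the reduced friction in the CG model is not captured by it.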
In their study of CG blob dynamics in polyethylene melts, Padding and Briels [33] employed effective potentials, frictions, and random forces, all derived from detailed MD simulations. Izvekov and Voth [34] proposed a closely related recipe within the coarse-graining framework of force matching. Alternatively, CG dynamic quantities can in some cases be mapped directly onto the corresponding quantity obtained from detailed MD simulations or from experiments. For example, a diffusion coefficient $D_{CG}$ in units of m²/τ can be mapped onto the diffusion coefficient $D_{AT}$ in units of m²/s, which provides the time unit of the CG simulation, τ = x s. Alternatively, the CG mean squared displacement curve can be superimposed on the atomistic curve at times that are long for atomistic simulations [35]. This approach was used to study entangled polycarbonate (BPAPC) melts of up to 20 entanglement lengths. The CG simulations provided truly quantitative information on the different measures of the entanglement molecular weight (from displacements, scattering functions, modulus, and topological analysis) and on the ratios of the different crossover times.

25.3.2.1 Long-Time Atomistic BPAPC Trajectories Obtained by Inverse Mapping

All CG mapping schemes shown in Figure 25.1 stay close to the atomistic structure of the molecules. Therefore, the dynamics of the CG system is expected to follow that of the atomistic system quite closely down to small length and time scales. Moreover, owing to the significant dynamic speedup, the CG systems can be simulated up to times far beyond what is possible in brute-force detailed atomistic simulations, allowing for in silico experiments that look at exactly the same quantities as laboratory experiments. The idea is to reintroduce atomic details into long-time CG trajectories of the system (BPAPC in the present case) and to measure dynamic relaxations on time scales that altogether cover at least nine decades and overlap the experimental regime probed, for example, with spectroscopic techniques. Here we only discuss the dynamic chain scattering function S(Q, t) as obtained in neutron spin echo experiments:

$$S(Q,t) = \frac{1}{n} \left\langle \sum_{i,j} l_i l_j \exp\left[ \mathrm{i}\,\mathbf{Q} \cdot \left( \mathbf{r}_i(t) - \mathbf{r}_j(0) \right) \right] \right\rangle_Q \qquad (25.8)$$
The double sum runs over all n atoms in the chain. The term $\mathbf{r}_i$ is the position of atom i and $l_i$ is the neutron scattering length of atom i. The index Q indicates spherical averaging. For nonentangled melts, on time scales above the local fast oscillations and on length scales above the persistence length of the polymer, the Rouse model predicts $S(Q,t)/S(Q,0) \propto \exp(-W Q^2 t^{1/2})$, where W is related to the effective bead friction. The onset of this universal behavior typically occurs at times and distances that are small compared to the diffusion time and the chain extension. For larger times the overall diffusion takes over; that is, $S(Q,t)/S(Q,0) \propto \exp(-D Q^2 t)$. In the case of entangled polymers, S(Q, t) displays qualitatively different behavior due to the tube-like confinement of the monomer motion. On intermediate time scales the scatterer “sees” a smeared-out monomer density in the tube of diameter $d_T$, leading to an analog of a Debye–Waller factor with, in the simplest approximation, $S(Q,t)/S(Q,0) = 1 - Q^2 d_T^2/36$. CG and atomistic MD simulations of BPAPC melts were performed with N = 5 up to N = 120 repeat units [35] and used to analyze this property. The entanglement molecular weight of BPAPC (1200–1400 g/mol) corresponds to $N_e$ ≈ 5–6 repeat units. A time unit is obtained by superimposing the repeat-unit mean squared displacements of the CG and atomistic systems for N = 5 and N = 20 at long times. While the intrinsic time unit of the CG model (determined through conversion of Lennard–Jones reduced units, assuming the same mass for all beads) is τ ≈ 1.7 ps, the physical time unit of the underlying BPAPC is much larger, namely τ = 30 ps at the temperature studied here (T = 570 K) [35]. Note that the typical time step in a CG dynamic simulation is 0.01 τ, thus roughly 0.3 ps. For N = 20 the atomistic simulations only covered a bead motion up to about the monomer size.
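A minimal sketch of how Equation 25.8 could be evaluated for a single chain is given below. It assumes that the spherical average over the directions of Q is carried out analytically, which replaces exp[iQ·r] by sin(Qr)/(Qr) with r = |r_i(t) − r_j(0)|; the function name and the data layout (lists of coordinate tuples, one scattering length per atom) are illustrative assumptions and do not reproduce the analysis code of Ref. 12.

```python
import math

def chain_scattering_function(q, positions_t, positions_0, scattering_lengths):
    """Spherically averaged S(Q, t) of one chain, following Equation 25.8.

    q: magnitude of the scattering vector (inverse length units)
    positions_t, positions_0: lists of (x, y, z) tuples at times t and 0
    scattering_lengths: list of scattering lengths l_i, one per atom
    """
    n = len(scattering_lengths)
    total = 0.0
    for i in range(n):
        for j in range(n):
            x = q * math.dist(positions_t[i], positions_0[j])
            # sin(x)/x -> 1 for x -> 0 (same position)
            sinc = 1.0 if x < 1e-12 else math.sin(x) / x
            total += scattering_lengths[i] * scattering_lengths[j] * sinc
    return total / n

# Hypothetical two-atom "chain", 10 length units apart, evaluated at t = 0.
r0 = [(0.0, 0.0, 0.0), (0.0, 0.0, 10.0)]
s0 = chain_scattering_function(0.2, r0, r0, [1.0, 1.0])
print(f"S(Q, 0) = {s0:.4f}")
```

The normalized quantity S(Q, t)/S(Q, 0) discussed in the text is obtained by calling the function once with the positions at time t and once with the positions at time 0, and in a melt it would additionally be averaged over chains and time origins.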
This time mapping unit was used in Figure 25.5a, which shows $S(Q,t)/S(Q,0)$ for N = 5 and N = 20 BPAPC melts [12]. For each chain length two independent sets of data are shown: the first was obtained after reinsertion of chemical details into long-time CG trajectories (symbols); the second was obtained from separate detailed all-atom simulations (lines). Data are presented for Q = 0.2 Å⁻¹, which covers the typical chain extension. Remarkable agreement is observed between the data obtained from the CG trajectory and from the all-atom simulations. This agreement illustrates that the CG dynamic trajectories are physically meaningful down to very small length and time scales. It also shows that with such a time mapping between CG and atomistic simulations, absolute data for long-time and large-scale dynamic quantities can be obtained without calibrating simulation time scales against experimental data. Based on the above time-mapping and inverse-mapping methods, the largest all-atom system simulated consisted of 200 BPAPC chains of N = 120 (corresponding to roughly 800,000 atoms in a box with a linear dimension of 100 nm), followed up to 4 × 10⁻⁵ s.

FIGURE 25.5 (a) Dynamic scattering function S(Q, t)/S(Q, 0) of BPAPC chains in the melt (570 K), as measured in neutron spin echo experiments, versus the scaled time $Q^2 t^{1/2}$ for Q = 0.2 Å⁻¹ [12]. Data obtained from the original atomistic simulations are shown by the solid and dashed lines; data obtained from inverse-mapped conformations are shown by the symbols. (b) Arrhenius representation of the time mapping constant for the ethylbenzene motions in a PS melt [36].

FIGURE 25.6 (See color insert following page 238.) Snapshots of selected molecules from CG simulations of BPAPC, PS, and 8AB8 indicating both CG centers and atomistic coordinates obtained through inverse mapping.

25.3.2.2 Dynamic Speedup: Additive Molecules in a Long-Chain Polystyrene Melt

The above route to determining the physical time scale in a CG simulation has been applied to several systems. To better understand the physical origin of the dynamic speedup in comparison with all-atom models and real-life experimental systems, we discuss in this section a simulation study of the dynamics of CG ethylbenzene (EB) molecules dissolved in a CG PS microstructure. A physical time scale was obtained by mapping the simulated EB diffusion coefficients onto the corresponding experimental data obtained by pulsed field gradient NMR [36]. The time conversion unit $\tau = D_{CG}/D_{exp}$ (expressed in picoseconds) is presented in Figure 25.5b on a logarithmic scale versus the inverse temperature. The key observation is that τ depends exponentially on the inverse temperature; that is, $\tau = \tau_0 \exp(-A/T)$, where the constant A is positive. This observation originates from the fact that energy barriers for EB diffusional motions are lower in the CG system, where interparticle potentials are softer and vary more smoothly with distance. The time mapping τ(T) between the real and the CG system therefore follows an Arrhenius dependence with an “activation energy” $k_B A$ describing an average reduction of energy barriers in the CG system. It should be noted that $D_{CG}$ and $D_{exp}$ themselves do not follow an Arrhenius dependence. Because the time scale for migration of the relatively large EB molecules is coupled to chain rearrangements of the PS matrix, it is important that the CG model is capable of reproducing the non-Arrhenius (Vogel–Fulcher) type temperature dependence of structural relaxation of the melt.
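The Arrhenius analysis of the time conversion unit can be sketched as a linear least-squares fit of ln τ against 1/T, using the functional form and sign convention quoted in the text, τ = τ₀ exp(−A/T). The function name and the synthetic input values are assumptions made for illustration; they are not data from Ref. 36.

```python
import math

def fit_time_mapping(temperatures, taus):
    """Fit tau(T) = tau0 * exp(-A / T) by linear regression of ln(tau) on 1/T.

    Returns (tau0, A). The slope of ln(tau) versus 1/T equals -A.
    """
    xs = [1.0 / t for t in temperatures]
    ys = [math.log(tau) for tau in taus]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    tau0 = math.exp(y_mean - slope * x_mean)
    return tau0, -slope

# Synthetic example: data generated from tau0 = 2.0 and A = 1500 K.
temps = [400.0, 450.0, 500.0]
taus = [2.0 * math.exp(-1500.0 / t) for t in temps]
tau0, a = fit_time_mapping(temps, taus)
print(f"tau0 = {tau0:.3f}, A = {a:.1f} K")
```

Each input value of τ would itself come from the ratio of a simulated and a measured diffusion coefficient at the same temperature, as described above.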
25.4 SOME RECENT DEVELOPMENTS AND FUTURE PERSPECTIVES

25.4.1 ADAPTIVE RESOLUTION MD
In many systems the formation (e.g., self-assembly) and dynamics of large-scale structures and conformations cannot be decoupled from local chemical processes and specific intermolecular interactions. To perform computer simulations for those cases, dual-scale resolution schemes can be used [37–42]. One can, however, go beyond molecular models with fixed (single or dual) resolution and allow for a dynamic change of molecular resolution by changing the number of molecular DOF on the fly during the course of an MD simulation. Recently, such an adaptive resolution scheme (AdResS) has been introduced, in which molecules can freely exchange between a high-resolution and a low-resolution region [43–45]. A key ingredient of this method is a transition region in which a weighting function is applied that mixes the high-resolution and low-resolution pair forces, thereby gradually modifying the resolution of the molecules that move through it [46]. The “latent heat” associated with increasing or decreasing the number of molecular DOF is supplied or removed by a properly chosen thermostat. By these means thermodynamic equilibrium is maintained throughout the system. This method, which so far has been used for liquid water [44] and a polymer–solvent system [45], is of great interest for a much wider variety of systems. An example could be an active site on a protein where the biological function requires an explicit description of solvent molecules. It would clearly be beneficial if, far away from the active site, the system could be described at lower resolution to avoid spending 99% of computer time on moving water molecules around in regions that are not of primary interest.
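The force mixing in the transition region can be sketched as follows. The interpolation $F_{\alpha\beta} = w_\alpha w_\beta F^{\mathrm{atom}}_{\alpha\beta} + (1 - w_\alpha w_\beta) F^{\mathrm{cg}}_{\alpha\beta}$ and the cos² shape of the weighting function are assumed here in the spirit of Refs. 43–45; the function names, the one-dimensional geometry, and the use of scalar forces are simplifications introduced only for this illustration.

```python
import math

def resolution_weight(x, width):
    """Weight w of a molecule at penetration depth x into the transition region.

    w = 1 in the high-resolution region (x <= 0), w = 0 in the low-resolution
    region (x >= width), with a smooth cos^2 ramp in between (assumed shape).
    """
    if x <= 0.0:
        return 1.0
    if x >= width:
        return 0.0
    return math.cos(0.5 * math.pi * x / width) ** 2

def mixed_pair_force(w_a, w_b, f_atomistic, f_cg):
    """Interpolate between atomistic and CG pair forces for molecules a and b."""
    lam = w_a * w_b
    return lam * f_atomistic + (1.0 - lam) * f_cg

# Two molecules halfway through a transition region of width 1.0
# (hypothetical scalar force values).
w = resolution_weight(0.5, 1.0)
print(mixed_pair_force(w, w, f_atomistic=10.0, f_cg=2.0))
```

Because the product of the two weights multiplies the atomistic force, a pair interacts fully atomistically only if both molecules are in the high-resolution region, and purely at the CG level as soon as one of them is in the low-resolution region.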
25.4.2 SURFACE INTERACTIONS OF BIOMOLECULES
Interactions of biomolecules with metal and inorganic surfaces are becoming increasingly important in nanobiotechnology. Typical questions are how the functionality of a bio/inorganic hybrid device depends on the conformation of adsorbed biomolecules and how conformations are affected by the nature of the surface interactions involved. Multiscale modeling techniques that bridge between quantum, classical atomistic, and CG model descriptions are needed to approach such issues. Recently, initial steps have been made to bridge between the quantum and classical atomistic levels by performing quantum-classical modeling of statistical conformations and interactions of amino acids and water molecules with metal surfaces [47]. This work has provided a recipe for treating surface interactions of amino acid residues in a classical-level description through an interactive quantum-classical modeling approach that can in principle be applied to larger organic molecules. Further progress will rely on the development of dual-resolution or adaptive resolution models that can describe the system (solute and solvent) at high resolution close to the surface, combined with a description at lower resolution far away from the surface.
25.4.3 NONBONDED INTERACTIONS
Although the iterative Boltzmann inversion method (Equation 25.7) provides nonbonded interaction potentials for CG models, it is based on radial distribution functions, which usually do not precisely define the system. In addition, it can lead to very complicated and long-range potentials, which reduce the efficiency of the CG simulation significantly. Ideally one should aim to put as little prior information as possible into the model, because such information unavoidably leads to CG potentials that lack transferability and thus predictive power. With respect to CG models for biopolymers in solution (e.g., oligopeptides), one ideally develops the CG force field in a way that distinguishes the bonded and nonbonded parts of the interaction potentials (in analogy to the method described above). Whereas for the bonded part lessons learned from polymer coarse-graining can be applied, for the nonbonded part important challenges remain. Current developments include empirical parameterization against thermodynamic data [48] and force-matching approaches [49,50]. As an alternative to these approaches, intermolecular pair potentials of mean force obtained from atomistic MD simulations can be used. Based on this approach, CG potentials for aqueous electrolytes were recently reported [51,52]. This method has been extended to a wide range of electrolytes including, for example, alkylammonium salts, for which a realistic description of the ion pairing and dissociation equilibrium requires accounting for aspects of hydrophobicity that, in addition to standard electrostatics, give rise to an additional attraction between the ions [53].
25.4.4 PERSPECTIVES

Questions related to the specific systems discussed in this chapter lead automatically to another, almost philosophical question: how specific is specific? In polymer physics one knows which properties are universal and which are chemistry specific. The systems considered there, however, are in the end very simple, and the above question is rather easy to answer. In problems related to structure formation, self-assembly, and surface interactions in synthetic and biological systems, specific interactions are operating. In these cases it is far less well understood which chemistry-specific details should be kept in CG models and which can safely be ignored. Moreover, it is not clear at what length scales the various CG modeling approaches described throughout this book merge and describe these types of systems equally well. Especially for biological molecules or complex structures employed in organic electronics, we are still far away from such an understanding.
ACKNOWLEDGMENTS

We wish to acknowledge Berk Hess and Vagelis Harmandaris for providing data and figures. We wish to thank Berk Hess, Vagelis Harmandaris, Pim Schravendijk, Matej Praprotnik, and Luigi Delle Site for many stimulating discussions and fruitful collaborations. CP acknowledges financial support from the Volkswagen Foundation. Most atomistic simulations were carried out using the Gromacs simulation package [54]; CG simulations were mainly performed with the ESPResSo suite of programs [55].
REFERENCES

1. Tschöp, W., Kremer, K., Batoulis, J., Bürger, T., and Hahn, O. 1998. Simulation of polymer melts. I. Coarse-graining procedure for polycarbonates. Acta Polym. 49:61–74.
2. Tschöp, W., Kremer, K., Hahn, O., Batoulis, J., and Bürger, T. 1998. Simulation of polymer melts. II. From coarse-grained models back to atomistic description. Acta Polym. 49:75–79.
3. Müller-Plathe, F. 2002. Coarse-graining in polymer simulation: From the atomistic to the mesoscopic scale and back. ChemPhysChem 3:754–69.
4. Müller, M., Katsov, K., and Schick, M. 2006. Biological and synthetic membranes: What can be learned from a coarse-grained description? Phys. Rep. 434:113–76.
5. Ayton, G. S., Noid, W. G., and Voth, G. A. 2007. Multiscale modeling of biomolecular systems: In serial and in parallel. Curr. Opin. Struct. Biol. 17:192–98. 6. Kremer, K. 2000. Computer simulations in soft matter science. In Soft and Fragile Matter, Nonequilibrium Dynamics, Metastability and Flow, ed. M. E. Cates and M. R. Evans, 145–84. Bristol: Institute of Physics. 7. Baschnagel, J., Binder, K., Doruker, P., Gusev, A. A., Hahn, O., Kremer, K., Mattice, W. L., MüllerPlathe, F., Murat, M., Paul, W., Santos, S., Suter, U. W., and Tries, V. 2000. Bridging the gap between atomistic and coarsegrained models of polymers: Status and perspectives. Adv. Polym. Sci. 152:41–156. 8. MüllerPlathe, F. 2003. Scalehopping in computer simulations of polymers. Soft Mater. 1:1–31. 9. Murat, M., and Kremer, K. 1998. From many monomers to many polymers: Soft ellipsoid model for polymer melts and mixtures. J. Chem. Phys. 108:4340–48. 10. Bolhuis, P. G., Louis, A. A., Hansen, J. P., and Meijer, E. J. 2001. Accurate effective pair potentials for polymer solutions. J. Chem. Phys. 114:4296–311. 11. Abrams, K., and Kremer, K. 2003. Combined coarsegrained and atomistic simulation of liquid bisphenol Apolycarbonate: Liquid packing and intramolecular structure. Macromolecules 36:260–67. 12. Hess, B., León, S., Van der Vegt, N., and Kremer, K. 2006. Long time atomistic polymer trajectories from coarse grained simulations: BisphenolA polycarbonate. Soft Mater. 2:409–14. 13. Harmandaris, V. A., Adhikari, N. P., Van der Vegt, N. F. A., and Kremer, K. 2006. Hierarchical modeling of polystyrene: From atomistic to coarsegrained simulations. Macromolecules 39:6708–19. 14. Harmandaris, V. A., Reith, D., Van der Vegt, N. F. A., and Kremer, K. 2007. Comparison between coarsegraining models for polymer systems: Two mapping schemes for polystyrene. Macromol. Chem. Phys. 208:2109–20. 15. Peter, C., Delle Site, L., and Kremer, K. 2008. 
Classical simulations from the atomistic to the mesoscale and back: Coarse graining an azobenzene liquid crystal. Soft Matter 4:859–69. 16. Abrams, C. F., and Kremer, K. 2002. Effects of excluded volume and bond length on the dynamics of dense beadspring polymer melts. J. Chem. Phys. 116:3162–65. 17. Weeks, J. D., Chandler, D., and Andersen, H. C. 1971. Role of repulsive forces in determining equilibrium structure of simple liquids. J. Chem. Phys. 54:5237–47. 18. Meyer, H., Biermann, O., Faller, R., Reith, D., and MüllerPlathe, F. 2000. Coarse graining of nonbonded interparticle potentials using automatic simplex optimization to fit structural properties. J. Chem. Phys. 113:6264–75. 19. Press, W. H., Teukolsky, S. A., Vetterling, W. T., and Flannery, B. P. 1992. Numerical Recipes in C. The Art of Scientific Computing. Cambridge: Cambridge University Press. 20. Lyubartsev, A. P., and Laaksonen, A. 1995. Calculation of effective interaction potentials from radialdistribution functions: A reverse MonteCarlo approach. Phys. Rev. E 52:3730–37. 21. Reith, D., Pütz, M., and MüllerPlathe, F. 2003. Deriving effective mesoscale potentials from atomistic simulations. J. Comp. Chem. 24:1624–36. 22. McCoy, J. D. and Curro, J. G. 1998. Mapping of explicit atom onto united atom potentials. Macromolecules 31:9362–68. 23. Auhl, R., Everaers, R., Grest, G. S., Kremer, K., and Plimpton, S. J. 2003. Equilibration of long chain polymer melts in computer simulations. J. Chem. Phys. 119:12718–28. 24. Santangelo, G., Di Matteo, A., MüllerPlathe F., and Milano, G. 2007. From mesoscale back to atomistic models: A fast reversemapping procedure for vinyl polymer chains. J. Phys. Chem. B 111:2765–73. 25. Hess, B., Peter, C., Özal, T. A., Van der Vegt, N. F. A. 2008. Fastgrowth thermodynamic integration: Calculating excess chemical potentials of additive molecules in polymer microstructures. Macromolecules 41:2283–89. 26. 
Eilhard, J., Zirkel, A., Tschop, W., Hahn, O., Kremer, K., Scharpf, O., Richter, D., and Buchenau, U. 1999. Spatial correlations in polycarbonates: Neutron scattering and simulation. J. Chem. Phys. 110:1819–30. 27. Londono, J. D., Habenschuss, A., Curro, J. G., and Rajasekaran, J. J. 1996. Shortrange order in some polymer melts from Xray diffraction. J. Polym. Sci. B 34:3055–61. 28. Hahn, O., Mooney, D. A., MüllerPlathe, F., and Kremer, K. 1999. A new mechanism for penetrant diffusion in amorphous polymers: Molecular dynamics simulations of phenol diffusion in bisphenolApolycarbonate. J. Chem. Phys. 111:6061–68. 29. Schmitz, H. 1999. Computersimulation von positroniumannihilation in polymeren. PhD thesis, University of Mainz, Germany. 30. Tozzini, V., Rocchia, W., and McCammon, J. A. 2006. Mapping allatom models onto onebead coarsegrained models: General properties and applications to a minimal polypeptide model. J. Chem. Theory Comput. 2:667–73.
31. Kremer, K., and Grest, G. S. 1990. Dynamics of entangled linear polymer melts: A moleculardynamics simulation. J. Chem. Phys. 92:5057–86. 32. Kremer, K. 2006. Polymer dynamics: Long time simulations and topological constraints. In Computer Simulations in Condensed Matter: From Materials to Chemical Biology, vol. 2. ed. M. Ferrario, G. Cicotti, and K. Binder, 341–78. Lect. Notes. Phys., vol. 704. Berlin, Heidelberg: Springer. 33. Padding, J. T., and Briels, W. J. 2002. Time and length scales of polymer melts studied by coarsegrained molecular dynamics simulations. J. Chem. Phys. 117:925–43. 34. Izvekov, S., and Voth, G. A. 2006. Modeling real dynamics in the coarsegrained representation of condensed phase systems. J. Chem. Phys. 125:151101. 35. León, S., Van der Vegt, N., Delle Site, L., and Kremer, K. 2005. Bisphenol A polycarbonate: Entanglement analysis from coarsegrained MD simulations. Macromolecules 38:8078–92. 36. Harmandaris, V. A., Adhikari, N. P., Van der Vegt, N. F. A., Kremer, K., Mann, B. A., Voelkel, R., Weiss, H., and Liew, C. 2007. Ethylbenzene diffusion in polystyrene: United atom atomistic/coarse grained simulations and experiments. Macromolecules 40:7026–35. 37. Chun, H. M., Padilla, C. E., Chin, D. N., Watanabe, M., Karlov, V. I., Alper, H. E., Soosaar, K., Blair, K. B., Becker, O. M., Caves, L. S. D., Nagle, R., Haney, D. N., and Farmer, B. 2000. MBO(N)D: A multibody method for longtime molecular dynamics simulations. J. Comput. Chem. 21:159–84. 38. Malevanets, A., and Kapral, R. 2000. Solute molecular dynamics in a mesoscale solvent. J. Chem. Phys. 112:7260–69. 39. Abrams, C. F., Delle Site, L., and Kremer, K. 2003. Dualresolution coarsegrained simulation of the bisphenolApolycarbonate/nickel interface. Phys. Rev. E 67:021807. 40. Villa, E., Balaeff, A., Mahadevan, L., and Schulten, K. 2004. Multiscale method for simulating proteinDNA complexes. Multiscale Model. Simul. 2:527–53. 41. Delle Site, L., Leon, S., and Kremer, K. 2004. 
BPAPC on a Ni(111) surface: The interplay between adsorption energy and conformational entropy for different chainend modifications. J. Am. Chem. Soc. 126:2944–55. 42. Schravendijk, P., Van der Vegt, N., Delle Site, L., and Kremer, K. 2005. Dualscale modeling of benzene adsorption onto Ni(111) and Au(111) surfaces in explicit water. Chemphyschem 6:1866–71. 43. Praprotnik, M., Delle Site, L., and Kremer, K. 2005. Adaptive resolution moleculardynamics simulation: Changing the degrees of freedom on the fly. J. Chem. Phys. 123:224106. 44. Praprotnik, M., Matysiak, S., Delle Site, L., Kremer, K, and Clementi, C. 2007. Adaptive resolution simulation of liquid water. J. Phys. Condens. Mater 19:292201. 45. Praprotnik, M., Delle Site, L., and Kremer, K. 2007. A macromolecule in a solvent: Adaptive resolution molecular dynamics simulation. J. Chem. Phys. 126:134902. 46. Praprotnik, M., Kremer, K., and Delle Site, L. 2007. Fractional dimensions of phase space variables: A tool for varying the degrees of freedom of a system in a multiscale treatment. J. Phys. A: Math. Theor. 40:F281–88. 47. Schravendijk, P., Ghiringhelli, L., Delle Site, L., and Van der Vegt, N. F. A. 2007. Interaction of hydrated amino acids with metal surfaces: A multiscale modeling description. J. Phys. Chem. C 111:2631–42. 48. Marrink, S.J., de Vries, A. H., and Mark, A. E. 2004. Coarse grained model for semiquantitative lipid simulations. J. Phys. Chem. B 108:750–60. 49. Izvekov, S., Parrinello, M., Burnham, C. J., and Voth, G. A. 2004. Effective force fields for condensed phase systems from ab initio molecular dynamics simulation: A new method for forcematching. J. Chem. Phys. 120:10896–913. 50. Izvekov, S., and Voth, G. A. 2005. A multiscale coarsegraining method for biomolecular systems. J. Phys. Chem. B 109:2469–73. 51. Hess, B., Holm, C., and Van der Vegt, N. F. A. 2006. Modeling multibody effects in ionic solutions with a concentration dependent dielectric permittivity. Phys. Rev. Lett. 96:147801. 
52. Hess, B., Holm, C., and Van der Vegt, N. F. A. 2006. Osmotic coefficients of atomistic NaCl (aq) force fields. J. Chem. Phys. 124:164509. 53. Hess, B., and Van der Vegt, N. F. A. 2007. Solventaveraged potentials for alkali, earth alkali and alkylammonium halide aqueous solutions. J. Chem. Phys. 127:234508. 54. Van der Spoel, D., Lindahl, E., Hess, B., Groenhof, G., Mark, A. E., and Berendsen, H. J. C. 2005. GROMACS: Fast, flexible, and free. J. Comput. Chem. 26:1701–18. 55. Limbach, H.J., Arnold, A., Mann, B. A., and Holm, C. 2006. ESPResSo: An extensible simulation package for research on soft matter systems. Comput. Phys. Commun. 174:704–27.
26 From Atomistic Modeling of Macromolecules Toward Equations of State for Polymer Solutions and Melts: How Important Is the Accurate Description of the Local Structure?

Kurt Binder, Wolfgang Paul, Peter Virnau, and Leonid Yelash
Institut für Physik, Johannes Gutenberg-Universität Mainz

Marcus Müller
Institut für Theoretische Physik, Georg-August-Universität Göttingen

Luis González MacDowell
Departamento de Química Física, Universidad Complutense de Madrid
CONTENTS
26.1 Introduction
26.2 Methods
26.3 Applications
26.4 Concluding Remarks
Acknowledgments
References
26.1 INTRODUCTION
For designing the properties of polymeric materials one often uses multicomponent systems (polymer blends, copolymers of various architectures, etc.), and in the process of making them solvents play a key role. This is particularly true when nucleation processes are considered. Structure formation processes may occur, which start out at the nanometer scale but create nontrivial structures on mesoscopic scales up to 100 μm. A good example of high industrial relevance is the creation
of polymeric foam materials (by using polystyrene in supercritical carbon dioxide as a solvent, for instance). Clearly, a detailed theoretical understanding of these processes and the resulting structure–property relationships is a challenging problem, also in its own right, as a problem of the statistical thermodynamics and physical chemistry of condensed matter. Due to the complexity of this problem, any approach exclusively relying on analytical theory will be extremely limited, and developing approaches based on computer simulation is highly desirable. However, due to the range of length scales involved and the multiscale character of the problem, a straightforward chemically realistic all-atom approach is unfeasible. In addition, there is the problem that methods based on classical molecular dynamics need force fields that often contain parameters of doubtful accuracy, in particular with respect to intermolecular nonbonded interactions, which are often modeled in an ad hoc manner by Lennard–Jones parameters fitted to some experimental data. For a recent critical assessment of force fields, see Smith (2005). In view of these problems, it has been a very attractive and long-standing idea [Baschnagel et al. 1991, 1992; Batoulis et al. 1991; Paul et al. 1991; Paul and Pistoor 1994; Tries et al. 1997; Tschöp et al. 1998a, 1998b; Baschnagel et al. 2000; Hahn, delle Site, and Kremer 2001; Müller-Plathe 2002, 2003; Milano and Müller-Plathe 2005; Theodorou 2006; Bedrov, Ayyagari, and Smith 2006] to provide an explicit connection between a chemically realistic atomistic model and coarse-grained models, which describe only certain degrees of freedom on the mesoscopic scale. In fact, there is a wealth of coarse-grained models, both lattice models such as the simple self-avoiding walk model [Kremer and Binder 1988; Sokal 1995] and the bond fluctuation model [Carmesin and Kremer 1988; Deutsch and Binder 1991; Paul et al.
1991], and off-lattice models such as various types of bead-spring models [Grest and Kremer 1986; Kremer and Grest 1990; Gerroff et al. 1993; Milchev, Paul, and Binder 1993; Bennemann et al. 1998; Milchev and Binder 2002]. While a large variety of simulation methods exists for these models [Baumgärtner 1984, 1992; Binder 1995; Baschnagel, Wittmer, and Meyer 2004; Kotelyanskii and Theodorou 2004], in most cases studies lack any connection to specific systems exhibiting chemical detail, and rather address “universal” properties of polymers [de Gennes 1979]. First attempts to create such a connection have focused on an intramolecular mapping procedure from atomistic models of polycarbonate [Paul et al. 1991] or polyethylene [Baschnagel et al. 1991, 1992; Paul and Pistoor 1994; Tries et al. 1997] to the bond fluctuation model. These studies are based on the idea that n ≈ 3–5 successive chemical carbon–carbon bonds along the backbone of the chain are mapped into one bond of the bond fluctuation model (recall that the length of the bonds in this model may vary from 2 to √10 lattice spacings). The intrachain potentials of the atomistic model (potentials for the lengths of the chemical bonds and the angles between them, as well as the torsional potential) are then used to construct the distribution $P_n(\ell)$ of the length $\ell$ of an effective segment of the atomistic model containing n bonds, as well as the distribution $P_n(\theta)$ of the angle θ between two such (subsequent) effective segments. These distributions are then used to fit suitable effective potentials $U(\ell)$ and $V(\theta)$ controlling the length of the bonds in the bond fluctuation model and the angle θ between two such subsequent lattice bonds. In this way it is possible, for instance, to obtain the temperature dependence of the characteristic ratio $C_N$ for polyethylene (see Figure 26.1).
In the regime where real polyethylene is chemically stable and hence $C_N$ can be measured, the simulation results are in reasonable agreement with experimental data. For describing the dynamics, one needs to use a measure of the local mobility of the real chain determined by the barriers of the torsional potential to construct a hopping rate for the effective monomers of the lattice model. With the derivation of a time rescaling factor, which relates the time unit of the Monte Carlo simulation (1 Monte Carlo step per effective monomer) to the physical time, a self-consistent coarse-grained description of the statics and dynamics of the considered polymer melt (polyethylene, polycarbonate, etc.) is obtained [Paul et al. 1991; Tries et al. 1997]. Although this approach is surprisingly successful with respect to the prediction of glass transition temperatures [Paul et al. 1991], many problems remain: (i) The lattice structure limits the accuracy with which
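[Figure 26.1 plot: characteristic ratio C_N (vertical axis, 2 to 8) versus temperature T [K] (horizontal axis, 0 to 2000), with two curves labeled "2-bond" and "4-bond".]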
FIGURE 26.1 Characteristic ratio of polyethylene plotted vs. temperature, for N = 20 effective monomers. Two versions of the mapping procedure are shown: the two-bond approximation uses properties of two successive lattice bonds for the optimization procedure of the potential, while the four-bond approximation is believed to yield better results, but is more cumbersome to use. (From Tries et al., J. Chem. Phys. 1997, 106, 738–48, Copyright American Institute of Physics.)
structural properties can be predicted. (ii) Apart from excluded volume interactions (since each lattice site can be occupied only once) no intermolecular interactions are accounted for, and it is not at all straightforward to include them in a quantitatively meaningful manner. (iii) Due to the use of a discrete lattice model, only the NVT ensemble (both the volume V and the particle number N are fixed) can be straightforwardly simulated. However, from the point of view of experiments, an NpT ensemble, p being the pressure, would be preferable. (iv) Both the effective interactions and the effective monomeric jump rate are clearly state-dependent (i.e., depend both on temperature T and density ρ = N/V). Clearly, drawbacks (i) and (ii) can be mitigated by using off-lattice bead-spring-type models, onto which a mapping of the atomistic model is performed [Tschöp et al. 1998a, 1998b; Hahn, delle Site, and Kremer 2001; Reith, Meyer, and Müller-Plathe 2001; Müller-Plathe 2002, 2003; Milano and Müller-Plathe 2005]. Typically, these models involve a chain of spherically symmetric effective monomers bound together by stiff springs to model chain connectivity, a purely repulsive intermolecular potential (like the repulsive part of a Lennard–Jones-like potential, see Reith, Meyer, and Müller-Plathe (2001)), and a bond-angle potential. The latter is derived from the atomistic model in a rather direct and elegant way, from the angular distribution of the effective bonds, applying a Boltzmann inversion procedure [Tschöp et al. 1998a, 1998b; Müller-Plathe 2002, 2003]. While it is clearly an advantage that on the level of the coarse-grained model one no longer has to deal with a torsional potential, it must be noted that the angular potential is strongly state-dependent and often rather complicated. For example, in the case of poly(vinyl alcohol) studied by Reith, Meyer, and Müller-Plathe (2001) the angular potential has a complicated shape with three minima.
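The Boltzmann inversion step mentioned above can be illustrated with a minimal sketch. It assumes the common convention that the sampled bond-angle distribution is divided by the geometric weight sin θ before the logarithm is taken, V(θ) = −k_B T ln[P(θ)/sin θ]; this normalization, the function name `boltzmann_invert_angle`, and the synthetic test potential are illustrative assumptions and are not taken from the cited papers.

```python
import numpy as np

def boltzmann_invert_angle(theta, p_theta, kT=1.0):
    """Return V(theta) = -kT * ln[P(theta) / sin(theta)], shifted so that min(V) = 0.

    The sin(theta) weight is an assumed convention for bond-angle distributions.
    Bins with zero probability are assigned an infinite potential.
    """
    theta = np.asarray(theta, dtype=float)
    p_theta = np.asarray(p_theta, dtype=float)
    weight = np.sin(theta)
    ok = (p_theta > 0.0) & (weight > 0.0)
    v = np.full(theta.shape, np.inf)
    v[ok] = -kT * np.log(p_theta[ok] / weight[ok])
    return v - v[ok].min()

# Synthetic consistency check with a hypothetical potential (not from the chapter):
# build P(theta) from a known V(theta) and recover V by inversion.
theta = np.linspace(0.05, np.pi - 0.05, 400)
v_true = 2.0 * (1.0 + np.cos(theta))            # in units of kT
p_theta = np.sin(theta) * np.exp(-v_true)
p_theta /= p_theta.sum() * (theta[1] - theta[0])  # normalize on the grid
v_rec = boltzmann_invert_angle(theta, p_theta, kT=1.0)
max_error = float(np.max(np.abs(v_rec - (v_true - v_true.min()))))
print(max_error)
```

Because the inverted potential inherits the temperature and density at which P(θ) was sampled, a sketch like this also makes the state dependence noted in the text explicit: a distribution sampled at a different state point yields a different V(θ).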
Due to the lack of intermolecular attractive potentials, the models of Tschöp et al. (1998a, 1998b) and Reith, Meyer, and Müller-Plathe (2001) are unsuitable to include solvents. Only in more recent work [Reith, Pütz, and Müller-Plathe 2003; Milano and Müller-Plathe 2005; Bedrov, Ayyagari, and Smith 2006] are intermolecular attractive interactions extracted from Boltzmann inversion procedures as well. However, these effective potentials are again strongly state-dependent. In addition it
is doubtful to what extent effective interactions that are always assumed to be of a pairwise form are accurate at all. The atomistic foundation of effective potentials for the mesoscale modeling of complex binary fluids is a fundamental problem of statistical mechanics [Silbermann et al. 2006]. For instance, in the case of colloid–polymer mixtures it is well known that even in the framework of very simplified models, such as the Asakura–Oosawa (AO) model where the polymer–polymer interaction is ideal-gas-like, integrating out the polymers one creates multibody interactions among the colloids, and not just pairwise interactions, that become important for a polymer-to-colloid size ratio exceeding about 15% [Dijkstra, Brader, and Evans 1999]. Similar nonpairwise contributions to effective potentials between effective monomers (and solvent molecules) must be expected when one integrates out degrees of freedom of an atomistic model of a polymer plus solvent system as well. Thus it is clear that the task of systematically integrating out short-wavelength degrees of freedom to construct a coarse-grained model which contains only degrees of freedom on the nanoscale or even mesoscale but nevertheless provides a very accurate description of structure and dynamics is very difficult, if at all feasible. Therefore, we pursue a more modest approach in the present chapter: we no longer require our coarse-grained model to accurately describe the local geometric structure of the polymer chains, nor their dynamics faithfully, but we focus on thermodynamic properties. In particular, we ask what is the minimal coarse-grained model for polymer solutions and melts that is required to describe their equation of state with sufficient accuracy? In fact, the theoretical modeling of the equation of state of polymer solutions, melts, and blends has been a central topic of polymer science since the work of Flory (1941, 1953) and Huggins (1941).
It is now well known, however, that the predictive power of these descriptions, which are based on simple lattice models and their generalizations [Sanchez and Lacombe 1978], is somewhat limited [Binder 1994]. In the dilute and semidilute regime, the (osmotic) pressure exhibits universal behavior which can be described by scaling considerations [blob picture, de Gennes 1979] or renormalization group theory [Des Cloizeaux and Jannink 1990]. In this regime, minimal models are well suited to investigate the equation of state and have made significant contributions. In a dense melt, however, the pressure is dictated by the packing of the fluid of segments and the equation of state is expected to sensitively depend on the nonuniversal details of the chemical structure. It is this technically important regime of dense polymer melts that we focus on in the chapter. At present, state-of-the-art analytical theories of the equation of state of polymeric systems rely mostly on liquid-state theories known as “statistical associating fluid theory” (SAFT) [Chapman et al. 1989] and their various generalizations [see e.g., Müller and Gubbins 2001; Economou 2002 for reviews]. Using a “reference fluid” of unconnected monomers as a starting point, one treats the chain connectivity in the framework of a thermodynamic perturbation theory for chain molecules (TPT1). This perturbative treatment prevents the approach from capturing the power-law dependencies that characterize the semidilute regime, but it is justifiable in a dense melt. Particularly popular is the so-called perturbed-chain SAFT (PC-SAFT) method [Gross and Sadowski 2001, 2002], although it has recently been shown that this approach suffers from artificial multiple criticality in the predicted phase diagrams [Yelash et al. 2005a, 2005b].
It is based upon a hard-chain reference system, with attractive interactions being accounted for by a perturbation approach [Barker and Henderson 1967], and free parameters adjusted to experimental data. However, in view of the problems with PC-SAFT mentioned above [Yelash et al. 2005a, 2005b], an alternative approach [MacDowell et al. 2000, 2002] based on SAFT seems preferable: unlike PC-SAFT, which is based on a repulsive hard-sphere potential (with a temperature-dependent diameter, derived from the potential of Chen and Kreglewski (1977)) augmented by an attractive square-well interaction, this approach utilizes a Lennard–Jones fluid as a reference system, which is analytically describable within the mean spherical approximation (MSA). The extension to chain molecules is referred to as TPT1-MSA in the literature [MacDowell et al. 2000, 2002]. It is essentially a liquid-state theory based on the same type of coarse-grained bead-spring models that are commonly used in many computer simulations [Bennemann et al. 1998; Müller and MacDowell 2003; Binder, Baschnagel, and Paul 2003]. However, this model differs from the
coarse-grained models resulting from mapping procedures based on atomistic models in one very important aspect: it completely lacks an effective bond-angle potential! However, the PC-SAFT approach [Gross and Sadowski 2001, 2002] also lacks such a bond-angle potential, and moreover provides a poor description of both intramolecular and intermolecular pair correlation functions between effective monomers, since the steps of the potential lead to corresponding jumps in the correlation functions. This point is exemplified in Figure 26.2 and Figure 26.3, where Monte Carlo simulations of the Lennard–Jones bead-spring chains [Yelash et al. 2006] are compared with corresponding results of Chen–Kreglewski chains, both with 29 beads/molecule, and results obtained from a real coarse-graining of a united-atom (UA) model of polybutadiene [Krushev 2002]. More details on these simulations will be given in Section 26.2. One can see that the Chen–Kreglewski chains provide a rather poor representation of the data derived from the UA model, while the Lennard–Jones chains perform somewhat better. However, it is known from the literature [Gross and Sadowski 2001, 2002] that PC-SAFT does provide a rather
FIGURE 26.2 Intramolecular segment–segment correlation functions obtained from the Monte Carlo simulations of the Lennard–Jones (LJ) bead-spring chains (thin solid curve) and the Chen–Kreglewski chains (dashed curve) for chains with 29 beads/molecule at reduced pressure $p^* \equiv p\sigma^3/\varepsilon = 0.001$ and reduced temperatures $T^* = k_B T/\varepsilon = 0.9/1.3$; σ and ε being the parameters of the LJ potential. Bold curves are from the united-atom molecular dynamics simulations of polybutadiene at T = 240 K and T = 353 K [Krushev 2002]. The distance $r^* = r/\sigma$, with a choice of σ = 4.5 Å. (From Yelash et al., J. Chem. Theory Comput. 2, 588–597, 2006. Copyright 2006 American Chemical Society.)
FIGURE 26.3 Intermolecular pair correlation functions obtained from the Monte Carlo simulations of the bead-spring chains (thin solid curves) and Chen–Kreglewski chains (dashed curves) for the same systems as in Figure 26.2. For explanations of the simulated model see Figure 26.2 and Section 26.2. (From Yelash et al., J. Chem. Theory Comput. 2, 588–597, 2006. Copyright 2006 American Chemical Society.)
reasonable fit of a large body of equation of state data for a huge variety of polymer melts, solutions, and blends. In fact, in the case referred to in Figure 26.2 and Figure 26.3, one also finds that equation of state data of polybutadiene are described by PC-SAFT by a fit of fair quality (Figure 26.4), though some systematic deviations are noticeable, which arise from a spurious liquid–liquid unmixing predicted to occur by PC-SAFT at higher densities [Yelash et al. 2005a, 2005b], while the fit based on TPT1-MSA is quite perfect. Figure 26.4 thus suggests that equation of state data of polymeric systems can be described by a simple bead-spring model of the polymer, with state-independent parameters for the intermolecular Lennard–Jones interaction, over a wide range of temperatures and pressures, although the description of both intra- and intermolecular structure provided by the model (Figure 26.2 and Figure 26.3) is only in qualitative accord with the corresponding description based on an atomistic model. Quantitative distinctions can be seen clearly, and with respect to distributions of effective bond angles there is even qualitative disagreement [Yelash et al. 2006]. This can be expected, however, because our model does not include any effective bond-angle potential. Thus the concept followed in the present chapter is the idea that a much cruder model is sufficient, if the only goal of the modeling is the description of the equation of state at fairly elevated temperatures where the system is fluid, rather than describing structure and dynamics on nanoscopic scales. It is clear that for the latter goal a description in terms of simple potentials that are independent of temperature and pressure over a wide range of these variables cannot be expected: for example, as one can see from Figure 26.1 for alkane melts, the effective chain stiffness depends considerably on temperature.
While in the melt the mean-square end-to-end distance $\langle R^2 \rangle$ of a chain with N carbon–carbon bonds along the backbone varies as $\langle R^2 \rangle = C_N \ell_{cc}^2 N$, where $\ell_{cc} \approx 1.53$ Å is the length of a carbon–carbon covalent bond and $C_N$ the characteristic ratio shown in Figure 26.1, a rather different behavior applies for low pressures and densities where the vapor–liquid transition of the alkane chains occurs: in the vapor phase, the chains form collapsed globules for temperatures below the vapor–liquid critical point [de Gennes 1979], while far above the critical point they form swollen coils, with $\langle R^2 \rangle \propto N^{2\nu}$ with ν ≈ 0.59. Analogous changes occur in the single-chain structure when we consider the polymer–solvent equilibrium, where below the theta temperature of the solution [Flory 1953; de Gennes 1979] demixing into a solvent-rich and a polymer-rich phase occurs. Both structure and dynamics of the macromolecules in these various phases that are of interest will depend very much on the thermodynamic state of the system, and there would be little hope to describe the system accurately with state-independent
FIGURE 26.4 A comparison between experimental data for polybutadiene melts in the temperature range from 299 to 461 K (symbols) and calculations using PC-SAFT (dashed curves) and TPT1-MSA (solid curves) models. At high pressure, the PC-SAFT calculation predicts a much too large density as a result of the vicinity of the spurious “liquid–liquid” phase separation predicted by PC-SAFT, as discussed in detail by Yelash et al. (2005a, 2005b), from which papers the data for polybutadiene reanalyzed here are taken.
potentials under all these various conditions. However, for many applications this is not necessary, and one just wishes to describe the macroscopic thermodynamic properties of a polymer melt or polymer solution with reasonable accuracy. In the present chapter, we discuss such a description where the polymer is modeled by a simple bead-spring-type chain, and the solvent is modeled by spherical particles, interacting with each other and the effective monomers of the macromolecule. The effective Lennard–Jones potentials are suitably chosen with state-independent parameters. We suggest that a preferable choice of these parameters is made such that the critical points of the vapor–liquid phase diagrams of the solvent and polymer are correctly reproduced. Then we test to what extent the solution phase diagram can be predicted. This is a very nontrivial test, since in binary fluid mixtures a large variety of phase diagrams can be realized [Scott and van Konynenburg 1970]. In addition, the approach to use the critical points to fix the parameters of the coarse-grained models implies that analytical theories such as the variants of SAFT, including TPT1-MSA, should not be used to fix these parameters: all these theories describe criticality in terms of a mean-field-type approximation, similar to the van der Waals equation. The mean-field character of these theories implies that the extent over which liquid–vapor or liquid–liquid phase separation occurs in the parameter space of the model (temperature T, pressure p, mole fraction x in a binary system) is overestimated significantly (and the shape of the coexistence curves is described by mean-field exponents rather than those of the Ising model universality class; see Binder et al. (2005) for a more detailed discussion of this issue).
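The remark on the shape of the coexistence curves can be made concrete with a small sketch. It assumes the leading power law ρ_liquid − ρ_vapor ∝ (1 − T/Tc)^β, with β = 1/2 for mean-field theories and β ≈ 0.326 for the three-dimensional Ising universality class; these exponent values are standard but are not quoted in the chapter, and the equal, unit amplitudes are an arbitrary choice. The sketch only illustrates the difference in shape near Tc and does not reproduce the overestimate of the unmixing region discussed above, which concerns fitted model parameters.

```python
# Leading power-law shape of a coexistence curve near the critical point.
# Exponents are standard literature values (assumed, not taken from the chapter).
BETA_MEAN_FIELD = 0.5
BETA_ISING_3D = 0.326  # approximate value

def coexistence_width(reduced_distance, beta, amplitude=1.0):
    """Return amplitude * (1 - T/Tc)**beta for T < Tc, and 0 at or above Tc.

    `reduced_distance` is 1 - T/Tc; `amplitude` is a nonuniversal prefactor
    (set to 1 here purely for illustration).
    """
    if reduced_distance <= 0.0:
        return 0.0
    return amplitude * reduced_distance ** beta

for tau in (0.1, 0.01, 0.001):
    mf = coexistence_width(tau, BETA_MEAN_FIELD)
    ising = coexistence_width(tau, BETA_ISING_3D)
    print(f"1 - T/Tc = {tau:6.3f}   mean field: {mf:.4f}   Ising: {ising:.4f}")
```

With equal amplitudes, the Ising-type curve closes more slowly as T approaches Tc, i.e., it is flatter at the top than the parabolic mean-field curve.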
Thus, it is important to use computer simulation methods for the prediction of the phase diagrams of these coarse-grained models and the resulting adjustment of their parameters to critical point data of the real systems to be modeled. In the next section we summarize this methodology in more detail. In the third section we present applications to alkanes with carbon dioxide as a solvent, and the fourth section gives some concluding remarks and an outlook to unsolved problems.
26.2 METHODS
Having in mind that we wish to represent solvent particles (such as CO2 molecules, for instance) as spherical particles, and a macromolecule as a bead-spring chain without bond-angle or torsional potentials, the question arises how many carbon atoms along the backbone of the polymer should be integrated into one effective unit of the coarse-grained chain. Of course, there is neither a rigorous nor a general answer to this question. In the mapping of polyethylene to the bond fluctuation model it was found that n = 5 CH2 groups was a useful choice [Tries et al. 1997]. However, varying n systematically from n = 2 to n = 16 for polybutadiene it was found that n = 4 was the optimum choice [Yelash et al. 2006]. But with respect to the solvent–polymer mixing thermodynamics, it is also important to roughly preserve the geometrical size ratio between the solvent molecule and the effective polymer segment, which determines the intermolecular packing [Virnau et al. 2002, 2004a]. Having in mind an application to the system hexadecane (C16H34) plus CO2, it was decided that the most plausible choice was to replace the 15 covalent C–C bonds by four effective bonds in the bead-spring model; that is, we work with N = 5 effective beads. This means literally that n = 16/5 = 3.2 CH2 groups correspond to one effective segment. The reader may be bewildered by this choice for n, which is noninteger. However, since we disregard here the geometric structure of the polymer, this is not at all a problem. Note that in analytical models such as PC-SAFT even the number of effective beads N is treated as noninteger in the fitting to experimental data [Gross and Sadowski 2001, 2002]. For Monte Carlo simulations, however, N must be integer, while a noninteger n is no problem at all for the theory. A comment also deserves to be made on why a short polymer such as C16H34 and not a much larger macromolecule was chosen.
The answer is that for C16H34 experimental data on the properties of the vapor–liquid critical point of the pure polymer are still available. For much longer alkanes, such data do not exist, since the critical temperature Tc would be so high that the polymer is no longer
chemically stable. Of course, it is an interesting question to what extent the effective Lennard–Jones parameters extracted for C16H34 can be used for a reliable modeling of other alkanes as well. Note that in our description no account is made for the fact that the two chemical end groups (CH3) differ from the interior chemical monomers (CH2). We shall return to this important question of the transferability of a coarse-grained model description to a chemically similar system in the last section of this chapter. For the nonbonded interaction between the effective monomers, we use a truncated and shifted Lennard–Jones potential:

$$V_{LJ}(r) = \begin{cases} 4\varepsilon_{pp}\left[(\sigma_{pp}/r)^{12} - (\sigma_{pp}/r)^{6} + 127/16384\right], & \text{for } r < r_c, \\ 0, & \text{for } r \ge r_c, \end{cases} \qquad (26.1)$$
where the cutoff $r_c$ is twice the distance of the potential minimum from the origin, $r_c = 2 \cdot 2^{1/6}\sigma_{pp}$. The additive constant in Equation 26.1 is chosen such that $V_{LJ}(r)$ is continuous at $r_c$. Effective monomers along a chain also interact with this potential, and in addition are bonded together via FENE (finitely extensible nonlinear elastic) springs [Kremer and Grest 1990]:

$$V_{FENE}(r) = -33.75\,\varepsilon_{pp} \ln\left[1 - (r/R_{pp})^{2}\right], \qquad (26.2)$$
with $R_{pp} = 1.5\sigma_{pp}$. The solvent particles were described in Virnau et al. (2002, 2004b) by exactly the same type of potential as Equation 26.1, but with different parameters, namely $\sigma_{ss}$ and $\varepsilon_{ss}$. With current Monte Carlo techniques, which will be briefly characterized below, it is nowadays possible to predict critical temperatures and densities $T_c$, $\rho_c$ of models such as those introduced above with a relative accuracy of a few parts in a thousand (or better). Thus, $\varepsilon_{ss}$, $\sigma_{ss}$ have been adjusted such that the experimental $T_c$ and $\rho_c$ of the solvent are reproduced, and $\varepsilon_{pp}$, $\sigma_{pp}$ are chosen such that the experimental $T_c$ and $\rho_c$ of hexadecane are reproduced. This yields $\sigma_{pp} = 4.52 \times 10^{-10}$ m, $\varepsilon_{pp} = 5.79 \times 10^{-21}$ J, while $\sigma_{ss} = 0.816\,\sigma_{pp}$ and $\varepsilon_{ss} = 0.726\,\varepsilon_{pp}$. Given these values, our model for each of these materials no longer exhibits any adjustable parameter whatsoever. In view of this fact, it is rather remarkable that for both materials a rather good description of phase coexistence simultaneously in the temperature–density plane and in the pressure–temperature plane is obtained (Figure 26.5) [Virnau et al. 2002]. For CO2, one notes a slight systematic discrepancy on the liquid branch of the coexistence curve in the (T,ρ) plane. This discrepancy is mostly due
[Figure 26.5 plots: panel (a), temperature [K] (200 to 800) versus density ρ [g/cm3] (0 to 1), showing experimental data, the binodals of CO2 and C16H34, and the critical points; panel (b), pressure [bar] (0 to 80) versus temperature [K] (200 to 800), showing experimental data, the liquid–vapor coexistence curves of CO2 and C16H34, and their critical points (Tc, pc).]
FIGURE 26.5 (a) Phase diagrams of pure CO2 (lower two curves) and pure C16H34 (upper two curves) in the temperature–density plane. (b) Same as (a) but in the pressure–temperature plane. (From Virnau et al., Comput. Phys. Comm. 147, 378, 2002. Copyright 2002 Elsevier.)
to the neglect of the quadrupole moment, which is rather large for the CO2 molecule. If one takes the quadrupole–quadrupole interaction between CO2 molecules into account, using the experimental value of the quadrupole moment as a further input to the model, the agreement between the model results and experiment is improved significantly [Mognetti et al. 2008]. A similar improvement also occurs with respect to the description of the temperature dependence of the interfacial tension between the coexisting phases [Mognetti et al. 2008]. Note that no further parameter is available to be fitted for the interfacial tension, and hence the fact that it can be predicted so accurately [Virnau et al. 2002, 2004a; Mognetti et al. 2008] is very remarkable. In the following we summarize the methodological aspects relevant for the construction of phase diagrams such as shown in Figure 26.5 from Monte Carlo simulations. A key ingredient is the sampling of the density distribution function $P_L(\rho)$ using L × L × L boxes with periodic boundary conditions in the grand-canonical μVT ensemble [Virnau et al. 2002, 2004a; Landau and Binder 2005]. Varying the chemical potential μ for T < Tc, $P_L(\rho)$ exhibits a single maximum (of approximately Gaussian shape) deep in the one-phase region, but it adopts a double-peak shape when μ is close to $\mu_{coex}$, the chemical potential for which two-phase coexistence occurs. When μ varies through $\mu_{coex}$, the weights of the two peaks (one centered near the density $\rho_v$ of the vapor phase, the other centered near the density $\rho_\ell$ of the liquid phase) change gradually, and $\mu_{coex}$ can actually be located with high precision as the value of μ at which the weights of both peaks are equal [Binder and Landau 1984; Borgs and Kotecky 1990].
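The equal-weight criterion can be sketched numerically. The following is a minimal illustration, not the procedure of the cited papers: it assumes that a particle-number histogram P(N) sampled at one chemical potential is reweighted to a neighboring one via P(N; μ + Δμ) ∝ P(N; μ) exp(ΔμN/k_BT) (standard grand-canonical histogram reweighting), and that a fixed cut `n_cut` separates the vapor peak from the liquid peak. All names and the synthetic double-Gaussian histogram are hypothetical.

```python
import numpy as np

def reweight(log_p, n, dmu_over_kT):
    """Reweight ln P(N) from chemical potential mu to mu + dmu at fixed T and V."""
    log_q = log_p + dmu_over_kT * n
    log_q = log_q - log_q.max()      # guard against overflow in exp
    q = np.exp(log_q)
    return q / q.sum()

def vapor_weight(log_p, n, dmu_over_kT, n_cut):
    """Total probability of the low-density (vapor) peak, N < n_cut."""
    q = reweight(log_p, n, dmu_over_kT)
    return q[n < n_cut].sum()

def equal_weight_shift(log_p, n, n_cut, lo=-1.0, hi=1.0, tol=1e-12):
    """Bisect for dmu/kT at which both peaks carry weight 1/2.

    The vapor weight decreases monotonically with increasing mu; the bracket
    [lo, hi] is assumed to contain the equal-weight point.
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if vapor_weight(log_p, n, mid, n_cut) > 0.5:
            lo = mid                 # vapor still too heavy: raise mu
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Synthetic histogram: two Gaussian peaks of equal width with weights 0.7 / 0.3.
n = np.arange(0, 301)
sigma = 8.0
peak = lambda n0: np.exp(-0.5 * ((n - n0) / sigma) ** 2)
log_p = np.log(0.7 * peak(60.0) + 0.3 * peak(220.0))

shift = equal_weight_shift(log_p, n, n_cut=140)
# For equal-width Gaussians the analytic result is ln(0.7/0.3) / (220 - 60).
print(shift, np.log(7.0 / 3.0) / 160.0)
```

In a real simulation the histogram near coexistence is itself hard to sample because of the free-energy barrier between the peaks, so a sketch like this only addresses the final step of locating μ_coex once P_L(ρ) is known.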
Getting accurate data for the weights of both peaks of $P_L(\rho)$ for μ near $\mu_{coex}$ is not at all straightforward, however, since there is often a pronounced hysteresis: the two states with densities near $\rho_v$ and $\rho_\ell$ are separated by a high free-energy barrier in phase space (due to the interfacial free-energy cost of a mixed-phase configuration). This difficulty can be overcome by suitable biased sampling methods, such as “successive umbrella sampling” [Virnau and Müller 2004]. Another difficulty is that the acceptance rate for inserting a particle in a rather dense configuration (a move that is necessary in the grand-canonical ensemble simulation) may be negligibly small. This problem constrains the applicability of the μVT simulation approach to rather short polymer chains and not very low temperatures. Even then the particle insertions and deletions require the implementation of configurational bias Monte Carlo methods [Laso, de Pablo, and Suter 1992; Siepmann and Frenkel 1992; Siepmann, Karaborni, and Smit 1993]. In addition, the chain configurations in between the configurational bias moves are relaxed by local monomer displacements and slithering snake movements [Binder 1995; Kotelyanskii and Theodorou 2004]. From the methods mentioned above, one obtains $\mu_{coex}(T)$ and the associated estimates for the coexisting vapor and liquid densities, $\rho_v(T)$ and $\rho_\ell(T)$, as well as the coexistence diameter $\rho_d(T) = [\rho_v(T) + \rho_\ell(T)]/2$. It must be stressed, however, that the “naïve” estimates of $\rho_v(T)$ and $\rho_\ell(T)$ extracted from the peak positions of $P_L(\rho)$ are not at all reliable estimates of bulk behavior near the critical temperature, due to pronounced finite-size effects [Landau and Binder 2005]. Applying finite-size scaling methods [Binder 1992; Wilding 1996], a reliable extrapolation of such Monte Carlo data for finite box linear dimensions L to the thermodynamic limit (L → ∞) is, however, possible, and such techniques were in fact used by Virnau et al.
(2002, 2004a) to obtain the results shown in Figure 26.5. Having studied the phase behavior of both the pure solvent and the pure polymer melt, the next step is to study the phase behavior of the polymer solution. First of all, the interaction between the solvent molecules and the effective monomers needs to be specified. A simple and widely used approximation relies on the Lorentz–Berthelot mixing rules for the Lennard–Jones parameters $\varepsilon_{sp}$, $\sigma_{sp}$ for this solvent–polymer mixture [Maitland et al. 1987]:

$$\sigma_{sp} = (\sigma_{ss} + \sigma_{pp})/2, \qquad \varepsilon_{sp} = \sqrt{\varepsilon_{ss}\,\varepsilon_{pp}}. \qquad (26.3)$$
Since it is well known that in many cases of interest Equation 26.3 is not accurate enough, a parameter ξ is commonly introduced to describe deviations from the Lorentz–Berthelot mixing rule for the energy parameters:
$$\varepsilon_{sp} = \xi \sqrt{\varepsilon_{ss}\,\varepsilon_{pp}} . \tag{26.4}$$
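The two mixing rules can be written out as a small helper. What follows is only a minimal sketch: the function name `mixed_lj_parameters` and the reduced-unit input values are hypothetical and chosen for illustration, not taken from the chapter's fitted models. The ξ values in the loop are the three trial values quoted in Section 26.3.

```python
import math


def mixed_lj_parameters(sigma_ss, sigma_pp, eps_ss, eps_pp, xi=1.0):
    """Return (sigma_sp, eps_sp) following Equations 26.3 and 26.4.

    xi = 1 gives the plain Lorentz-Berthelot rule (Equation 26.3);
    xi != 1 rescales the geometric-mean energy (Equation 26.4).
    """
    sigma_sp = 0.5 * (sigma_ss + sigma_pp)    # arithmetic mean of the sizes
    eps_sp = xi * math.sqrt(eps_ss * eps_pp)  # scaled geometric mean of the energies
    return sigma_sp, eps_sp


# Hypothetical reduced-unit inputs, NOT the chapter's fitted parameters.
SIGMA_SS, SIGMA_PP = 1.0, 1.2
EPS_SS, EPS_PP = 1.0, 4.0

for xi in (1.0, 0.9, 0.886):  # trial values of xi quoted in Section 26.3
    sigma_sp, eps_sp = mixed_lj_parameters(SIGMA_SS, SIGMA_PP, EPS_SS, EPS_PP, xi)
    print(f"xi = {xi:5.3f}  sigma_sp = {sigma_sp:.3f}  eps_sp = {eps_sp:.3f}")
```

Note that ξ enters only the energy parameter; the size parameter σsp is unaffected by the correction.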
The simulation in the grand-canonical ensemble then amounts to the variation of two chemical potentials, μs for the solvent particles and μp for the polymers, and a distribution function PL(ρs, ρp) involving, correspondingly, the two densities ρs and ρp is recorded. This task is practically feasible when suitable reweighting methods are applied [see Virnau et al. 2002]. A correct description of the order parameter for the mixtures would in principle require a linear combination of the densities of the polymer and the solvent particles. In most cases, however, it is sufficient to consider a single density because one of the two usually exhibits only Gaussian fluctuations. This corresponds to a projection of the joint probability distribution PL(ρs, ρp) onto either the polymer or the solvent axis. Methods for determining the probability weight can still be applied with a one-dimensional weight function. Gaussian fluctuations in the second density do not constitute a barrier and need not be considered. Therefore, a single scalar order parameter (e.g., the polymer density) characterizes the phase transition, which then belongs to the Ising model universality class, as do the transitions of the pure systems. Thus, one can apply the same finite-size scaling techniques as for the pure systems.
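The projection described above is simply a marginalization of the two-dimensional histogram. The sketch below illustrates it on a synthetic joint distribution that is bimodal in the polymer density and Gaussian in the solvent density. The toy distribution, the bin grids, and the helper names (`project`, `count_peaks`) are assumptions made for illustration; they are not simulation data or code from the chapter.

```python
import math


def gaussian(x, mean, width):
    """Unnormalized Gaussian, used only to build the toy histogram."""
    return math.exp(-0.5 * ((x - mean) / width) ** 2)


def project(joint, keep):
    """Project joint[i][j] (solvent bin i, polymer bin j) onto one axis."""
    if keep == "polymer":   # sum over the solvent index i
        return [sum(row[j] for row in joint) for j in range(len(joint[0]))]
    if keep == "solvent":   # sum over the polymer index j
        return [sum(row) for row in joint]
    raise ValueError("keep must be 'polymer' or 'solvent'")


def count_peaks(hist):
    """Count strict interior local maxima of a one-dimensional histogram."""
    return sum(1 for k in range(1, len(hist) - 1)
               if hist[k - 1] < hist[k] > hist[k + 1])


# Toy joint distribution (hypothetical): two peaks along the polymer axis,
# a single Gaussian peak along the solvent axis.
rho_s = [i / 20 for i in range(21)]
rho_p = [j / 40 for j in range(41)]
joint = [[gaussian(s, 0.5, 0.1)
          * (gaussian(p, 0.2, 0.05) + gaussian(p, 0.8, 0.05))
          for p in rho_p] for s in rho_s]
total = sum(sum(row) for row in joint)
joint = [[value / total for value in row] for row in joint]  # normalize to 1

p_polymer = project(joint, "polymer")  # keeps the bimodal structure
p_solvent = project(joint, "solvent")  # single Gaussian-like peak

print("peaks along polymer axis:", count_peaks(p_polymer))
print("peaks along solvent axis:", count_peaks(p_solvent))
```

In this toy case the polymer-axis projection retains both peaks, so a one-dimensional weight function along that axis is enough to sample across the barrier.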
26.3 APPLICATIONS
In this section, we describe rather briefly the application of the concepts sketched in the previous section to the mixture of CO2 and C16H34. Note that no adjustable parameters whatsoever remain available for the models of the pure systems once we have required that their vapor–liquid critical temperatures and densities coincide with their experimental counterparts. However, no a priori information is available on the parameter ξ in Equation 26.4, describing the deviation from the Lorentz–Berthelot mixing rule. Thus, rather arbitrarily three choices were tried: ξ = 1, ξ = 0.9, and ξ = 0.886 [Virnau et al. 2002, 2004a,b]. Figure 26.6a shows the projection of the critical line of the vapor–liquid transition of the mixed system onto the (T,p) plane [Binder et al. 2005]. Along the critical line the molar fraction x of CO2 quickly rises as Tc(x) decreases from its maximum value Tc(0) for pure hexadecane. For ξ = 1 one can clearly see that x rises monotonically to x = 1 (Figure 26.6b) and the critical line pc(T,x) just connects smoothly the critical points of both pure substances. This is the simplest case among all possible scenarios of binary mixture phase diagrams, namely the “type I” diagram in the classification scheme of Scott and van Konynenburg (1970). It is well known, however, that the real hexadecane + carbon dioxide system does not belong to this class, but rather to “type III” in this classification. This implies that pc(T,x) does not decrease smoothly toward pc(Tc^CO2, x = 1) [Schneider et al. 1967] as x increases toward unity. Instead the critical line pc(T,x) reaches a minimum value pc^min at some x < 1, and this minimum value pc^min exceeds the critical pressure of pure carbon dioxide. For temperatures less than the associated temperature Tmin of this minimum the curve pc(T,x) rises sharply. It was empirically found [Virnau et al.
2004a] that the choice ξ = 0.886 for the parameter that characterizes the deviation from the Lorentz–Berthelot rule corresponds rather closely to the behavior of the real material. However, we add two caveats. First of all, even for this rather simple system (both CO2 and C16H34 are chemically very stable molecules, cheap and easy to handle in the laboratory) there is still a significant uncertainty about the phase diagram, as shown by the discrepancy between the data reported by Schneider et al. (1967) and by Amon, Martin, and Kobayashi (1986), which we have included in Figure 26.6a. This scarcity of accurate experimental data on the phase behavior of polymer solutions as a function of temperature, pressure, and molar fraction of solvent is an even more acute problem for less common materials, of course (in particular for solvents that are highly poisonous, chemically reactive, or even explosive). Secondly, the physical significance of the parameter ξ is open to doubt; its existence has no first-principles theoretical justification at all. The need to use such a parameter ξ may rather indicate that the description of the pure materials is too crude in certain respects. Indeed, including quadrupolar interactions in the description of carbon dioxide not only gives a much more accurate account of the properties of pure CO2 but also seems to provide a significant improvement of the description of the mixture behavior. Mognetti et al. (2008) demonstrated that such a model with ξ = 1 yields a phase diagram that almost coincides with results such as those shown in Figure 26.6 for ξ = 0.9. Clearly, it would be a significant improvement of the theoretical modeling of mixture phase behavior if Equation 26.3 held strictly, so that no need to fit such a ξ parameter would arise.
FIGURE 26.6 (a) Phase diagram of the model for the hexadecane–carbon dioxide mixture as a function of temperature and pressure for three different trial values of the parameter ξ. Squares correspond to ξ = 1, diamonds to ξ = 0.9, and triangles to ξ = 0.886. The simulation results for the liquid–vapor coexistence of the pure components are shown by circles. Thick lines mark two experimental observations of the critical lines in hexadecane and CO2 from Schneider et al. (1967) and Amon et al. (1986), respectively (from Binder et al. (2005)). (b) Molar fraction x of CO2 along the critical line plotted as a function of the critical temperature, for the same systems as in (a). (From Binder et al., Adv. Polym. Sci. 173, 1–110, 2005. Copyright 2005 Springer.)
Of course, the theoretical modeling is not at all restricted to a prediction of the vapor–liquid and liquid–liquid demixing critical lines; one can also study two-phase coexistence very nicely. As an example, Figure 26.7 presents an isothermal slice of the phase diagram at T = 486 K [Virnau et al. 2004b]. Here, corresponding results from the TPT1-MSA approach are included (assuming exactly the same interactions). One sees that the coexistence curves are in very good agreement, apart from the (expected) discrepancies close to the critical point. Even three-phase coexistence along the triple line, where solvent vapor plus solvent liquid plus a dense polymer-rich phase coexist, could be studied (in the ρs–ρp plane three peaks then grow, corresponding to the three coexisting phases; see Virnau et al. (2004a)). Thus, the simulations of such coarse-grained models can predict their phase behavior in impressive detail.
FIGURE 26.7 Isothermal slice of the phase diagram of CO2–C16H34 at T = 486 K as obtained from Monte Carlo simulation (thick solid line and open symbols) and the TPT1-MSA approach (long-dashed line). The spinodals obtained from the TPT1-MSA equation of state are indicated as short-dashed lines. The arrows indicate possible pressure quench experiments. The inset presents the interfacial tension between the coexisting phases as a function of pressure. (Main panel: pressure [bar] versus molar fraction x, with regions labeled “Nucleation” and “Spinodal decomposition”; inset: γ [mN/m] versus p [bar].) (From Virnau et al., New J. Phys. 6, 7, 2004. Copyright 2004 Institute of Physics.)
26.4 CONCLUDING REMARKS
In this chapter we have discussed an approach devoted to deriving a coarse-grained model of polymer plus solvent systems that is able to describe the equation of state of these systems with reasonable accuracy, even though no attempt is made to reproduce intra- and intermolecular correlations reliably. Note that this endeavor is a formidable task, since the interference of liquid–vapor and liquid–liquid phase separation in these systems leads to a very rich variety of phase diagrams in the space of the three relevant thermodynamic control parameters: temperature, pressure, and molar fraction. Using the system hexadecane plus carbon dioxide solvent as a test case, and implementing the idea of fixing the Lennard–Jones parameters of the pure materials in terms of their critical temperatures and densities, a surprisingly accurate description of surprisingly many physical quantities of interest (coexistence curves, associated pressure at phase coexistence in the p–T plane, interfacial tension between coexisting phases) is obtained. Unfortunately, it is less clear how one should determine the exact interaction potential between the polymer and the solvent. The simple Lorentz–Berthelot mixing rule does not seem to be accurate enough. However, with a slight modification of this mixing rule a rich variety of useful predictions for the full binary system can also be obtained. Note that in spite of the fact that short alkanes at low temperatures are rather stiff, with a persistence length (manifested in a characteristic ratio CN much larger than one) that distinctly grows as the temperature is lowered, we have used a fully flexible bead-spring model (as do the common analytical equations of state such as PC-SAFT and TPT1-MSA, although some of these analytical methods suffer from other problems). This observation leads to one of the main messages of this chapter, namely the suggestion that for a description of the equation of state of polymer plus solvent systems the variable local stiffness of the polymer chains is less important. To a first approximation, bond-angle potentials for the coarse-grained models can be disregarded. As a consequence, an accurate description of local intra- and intermolecular structure of the polymer solution or melt is no longer obtained. However, this does not seem to matter too much for the equation of state.
Of course, one should not overemphasize this conclusion: when one deals with rather stiff short chains, the possibility of nematic order in the polymer solution arises, and this new phase changes the phase diagram significantly. Such nematic order in polymer solutions is clearly beyond the realm of the present model. Thus, it would be very interesting to extend the present approach by including a bond-angle potential and to apply it to such a solution of stiff chains. Then one could also make contact with the traditional mapping approaches, where a bond-angle potential on the coarse-grained scale inevitably comes into play via Boltzmann inversion from an atomistic model.
Another important extension is the inclusion of electrostatic interactions. While it is clearly a long way from the present approach, where the solvent molecules are described as Lennard–Jones-type point particles, to systems such as biopolymers or synthetic polyelectrolytes in aqueous solution, a very desirable first step is the inclusion of dipole or quadrupole moments of the molecules. Current work has shown [Mognetti et al. 2008] that even for carbon dioxide the consideration of the quadrupolar interactions leads to a very significant improvement in agreement between the model results and the experimental data. For mixtures, deviations from the Lorentz–Berthelot rule are much reduced. Thus, the work reviewed in this chapter is only a small first step. The trend, however, is promising.
ACKNOWLEDGMENTS
Early stages of the research reviewed here were supported by the German Federal Ministry of Education and Research (BMBF), Bayer AG, and BASF AG. We thank J. Baschnagel, K. Kremer, and F. Müller-Plathe for many useful discussions, and V. Tries for the fruitful collaboration that led to Figure 26.1.
REFERENCES Amon, C. T., Martin, R. J., and Kobayashi, R. 1986. Application of a generalized multiproperty apparatus to measure phase equilibrium and vapor phase densities of supercritical carbon dioxide in nhexadecane systems up to 26 MPa. Fluid Phase Equil. 31:89–104. Barker, J. A., and Henderson, D. 1967. Perturbation theory and equation of state for fluids. II. A successful theory of liquids. J. Chem. Phys. 47:2856–61. Baschnagel, J., Binder, K., Doruker, P., Gusev, A. A., Hahn, O., Kremer, K., Mattice, W. L., MüllerPlathe, F., Murat, M., Paul, R., Santos, S., Suter, U. W., and Tries, V. 2000. Bridging the gap between atomistic and coarsegrained models of polymers: Status and perspectives. Adv. Polym. Sci. 152:41–156. Baschnagel, J., Binder, K., Paul, W., Laso, M., Suter, U. W., Batoulis, I., Jilge, W., and Bürger, T. 1991. On the construction of coarsegrained models for linear flexible polymer chains: Distribution functions for groups of consecutive monomers. J. Chem. Phys. 95:6014–25. Baschnagel, J., Qin, K., Paul, W., and Binder, K. 1992. Monte Carlo simulation of models for single polyethylene coils. Macromolecules 25:3117–24. Baschnagel, J., Wittmer, J. P., and Meyer, H. 2004. Monte Carlo simulation of polymers: Coarsegrained models. In Computational Soft Matter: From Synthetic Polymers to Proteins, ed. N. Attig, K. Binder, H. Grubmüller, and K. Kremer, 83–140. Juelich: John von Neumann Institute for Computing (NIC). Batoulis, J., Binder, K., Gentile, F. T., Heermann, D. W., Jilge, W., Kremer, K., Laso, M., Ludovice, P. J., Morbitzer, L., Paul, W., Pittel, B., Plaetschke, R., Reuter, K., Sommer, K., Suter, U. W., Timmermann, R., and Weymans, G. 1991. Correlation between primary chemical structure and property phenomena in polycondensates. Adv. Mater. 3:590–99. Baumgärtner, A. 1984. Simulations of polymer models. In Applications of the Monte Carlo Method in Statistical Physics. ed. K. Binder, 145–80. Berlin: Springer. . 1992. Simulations of macromolecules. 
In The Monte Carlo Method in Condensed Matter Physics, ed. K. Binder, 285–316. Berlin: Springer. Bedrov, D., Ayyagari, C., and Smith, G. D. 2006. Multiscale modeling of poly(ethylene oxide)poly(propylene oxide)poly(ethylene oxide) triblock copolymer micelles in aqueous solution. J. Chem. Theory Comput. 2:598–606. Bennemann, C., Paul, W., Binder, K., and Dünweg, B. 1998. Moleculardynamics simulations of the thermal glass transition in polymer melts: αRelaxation behavior. Phys. Rev. E 57:843–57. Binder, K. 1992. In Computational Methods in Field Theory, ed. C. B. Lang and H. Gausterer, 59–125. Berlin: Springer. . 1994. Phase transitions in polymer blends and blockcopolymer melts: Some recent developments. Adv. Polymer Sci. 112:181–99. . 1995. ed. Monte Carlo and Molecular Dynamics Simulations in Polymer Science. New York: Oxford University Press.
Binder, K., Baschnagel, J., and Paul, W. 2003. Glass transition of polymer melts: Test of theoretical concepts by computer simulation. Prog. Polym. Sci. 18:115–72. Binder, K., and Landau, D. P. 1984. Finitesize scaling at firstorder phase transitions. Phys. Rev. B 30:1477–85. Binder, K., Müller, M., Virnau, P., and MacDowell, L. G. 2005. Polymer+solvent systems: Phase diagrams, interface free energies, and nucleation. Adv. Polymer Sci. 173:1–110. Borgs, C., and Kotecky, R. 1990. A rigorous theory of finitesize scaling at firstorder phase transitions. J. Stat. Phys. 61:79–119. Carmesin, I., and Kremer, K. 1988. The bond fluctuation method: A new effective algorithm for the dynamics of polymers in all spatial dimensions. Macromolecules 21:2819–23. Chapman, W. G., Gubbins, K. E., Jackson, G., and Radosz, M. 1989. SAFT: Equationofstate solution model for associating fluids. Fluid Phase Equilibria 52:31–38. Chen, S. S., and Kreglewski, A. 1977. Applications of the augmented van der Waals theory of fluids. I. Pure fluids. Ber. Bunsenges. Phys. Chem. 81:1048–49. Des Cloizeaux, J., and Jannink, G. 1990. Polymers in Solution: Their Modeling and Structure. Oxford: Oxford University Press. Deutsch, H.P., and Binder, K. 1991. Interdiffusion and selfdiffusion in polymer mixtures: A Monte Carlo study. J. Chem. Phys. 94:2292–2304. Dijkstra, M., Brader, J. M., and Evans, R. 1999. Phase behaviour and structure of model colloid–polymer mixtures. J. Phys.: Condens. Matter 11:10079–106. Economou, I. G. 2002. Statistical associating fluid theory: A successful model for the calculation of thermodynamic and phase equilibrium properties of complex fluid mixtures. Ind. Eng. Chem. Res. 41:953–62. Flory, P. J. 1941. Thermodynamics of high polymer solutions. J. Chem. Phys. 9:660–61. . 1953. Principles of Polymer Chemistry. Ithaca: Cornell University Press. de Gennes, P. G. 1979. Scaling Concepts in Polymer Physics. Ithaca: Cornell University Press. 
Gerroff, I., Milchev, A., Binder, K., and Paul, W. 1993. A new offlattice Monte Carlo model for polymers: A comparison of static and dynamic properties with the bondfluctuation model and application to random media. J. Chem. Phys. 98:6526–39. Grest, G. S., and Kremer, K. 1986. Molecular dynamics simulation for polymers in the presence of a heat bath. Phys. Rev. A 33:3628–31. Gross, J., and Sadowski, G. 2001. Perturbedchain SAFT: An equation of state based on a perturbation theory for chain molecules. Ind. Eng. Chem. Res. 40:1244–60. . 2002. Modeling polymer systems using the perturbedchain statistical associating fluid theory equation of state. Ind. Eng. Chem. Res. 41:1084–93. Hahn, O., delle Site, L., and Kremer, K. 2001. Simulation of polymer melts: From spherical to ellipsoidal beads. Macromol. Theory Simul. 10:288–303. Huggins, M. J. 1941. Solutions of long chain compounds. J. Chem. Phys. 9:440. Kotelyanskii, M. J., and Theodorou, D. Y. eds. 2004. Simulation Method for Polymers. New York: Marcel Dekker. Kremer, K., and Binder, K. 1988. Monte Carlo simulation of lattice models for macromolecules. Computer Phys. Rep. 7:259–310. Kremer, K., and Grest, G. S. 1990. Dynamics of entangled linear polymer melts: A moleculardynamics simulation. J. Chem. Phys. 92:5057–86. Krushev, S. 2002. Computersimulationen zur Dynamik und Statik von Polybutadienschmelzen. Ph.D. Thesis (unpublished), Universität Mainz. Landau, D. P., and Binder, K. 2005. A Guide to Monte Carlo Simulation in Statistical Physics. 2nd ed. Cambridge: Cambridge University Press. Laso, M., de Pablo, J. J., and Suter, U. W. 1992. Simulation of phase equilibria for chain molecules. J. Chem. Phys. 97:2817–19. MacDowell, L. G., Müller, M., Vega, C., and Binder, K. 2000. Equation of state and critical behavior of polymer models: A quantitative comparison between Wertheim’s thermodynamic perturbation theory and computer simulations. J. Chem. Phys. 113:419–33. MacDowell, L. 
G., Virnau, P., Müller, M., and Binder, K. 2002. Critical lines and phase coexistence of polymer solutions: A quantitative comparison between Wertheim’s thermodynamic perturbation theory and computer simulations. J. Chem. Phys. 117:6360–71. Maitland, G. C., Rigby, M., Smith, E. B., and Wakeham, W. A. 1987. Intermolecular Forces. Oxford: Clarendon.
Milano, G., and MüllerPlathe, F. 2005. Mapping atomistic simulations to mesoscopic models: A systematic coarsegraining procedure for vinyl polymer chains. J. Phys. Chem. B 109:18609–19. Milchev, A., and Binder, K. 2002. Offlattice Monte Carlo methods for coarsegrained models of polymeric materials and selected applications. J. ComputerAided Mater. Des. 9:33–74. Milchev, A., Paul, W., Binder, K. 1993. Offlattice Monte Carlo simulation of dilute and concentrated polymer solutions under theta conditions. J. Chem. Phys. 99:4786–98. Mognetti, B. M., Yelash, L., Virnau, P., Paul, W., Binder, K., Müller, M., MacDowell, L. G. (2008). Efficient prediction of thermodynamic properties of quadrupolar fluids from simulation of a coarsegrained model: The case of carbon dioxide. J. Chem. Phys. 128:104501, 1–13. Müller, E. A., and Gubbins, K. E. 2001. Molecular based equations of state for associating fluids: A review of SAFT and related approaches. Ind. Eng. Chem. Res. 40:2198–2211. Müller, M., and MacDowell, L. G. 2003. Wetting of polymer liquids: Monte Carlo simulations and selfconsistent field calculations. J. Phys.: Condens. Matter 15:R609–53. MüllerPlathe, F. 2002. Coarsegraining in polymer simulation: From the atomistic to the mesoscopic scale and back. Chem. Phys. Chem. 3:754–69. . 2003. Scalehopping in computer simulations of polymers. Soft Mater. 1:1–31. Paul, W., Binder, K., Heermann, D. W., and Kremer, K. 1991. Crossover scaling in semidilute polymer solutions: A Monte Carlo test. J. Phys. (Paris) II 1:37–60. Paul, W., Binder, K., Kremer, K., and Heermann, D. W. 1991. Structure–property correlation of polymers, a Monte Carlo approach. Macromolecules 24:6332–34. Paul, W., and Pistoor, N. 1994. A mapping of realistic onto abstract polymer models and an application to two bisphenol polycarbonates. Macromolecules 27:1249–55. Reith, D., Meyer, H., and MüllerPlathe, F. 2001. 
Mapping atomistic to coarsegrained polymer models using automatic simplex optimization to fit structural properties. Macromolecules 34:2335–45. Reith, D., Pütz, M., and MüllerPlathe, F. 2003. Deriving effective mesoscale potentials from atomistic simulations. J. Comput. Chem. 24:1624–36. Sanchez, I. C., and Lacombe, R. H. 1978. Statistical thermodynamics of polymer solutions. Macromolecules 11:1145–56. Schneider, G., Alwani, Z., Heim, W., Horvath, E., and Franck, E. U. 1967. Phase equilibriums and critical phenomena in binary mixed systems to 1500 bars. Carbon dioxide with noctane, nundecane, ntridecane, and nhexadecane. Chem. Ingr. Tech. 39:649–56. Scott, R. L., and van Konynenburg, P. H. 1970. Van der Waals and related models for hydrocarbon mixtures. Discuss. Faraday Soc. 49:87–97. Siepmann, J. I., and Frenkel, D. 1992. Configurational bias Monte Carlo: A new sampling scheme for flexible chains. Mol. Phys. 75:59–70. Siepmann, J. I., Karaborni, S., and Smit, B. 1993. Vapor–liquid equilibria of model alkanes. J. Am. Chem. Soc. 115:6454–55. Silbermann, J. R. Klapp, S. H. K., Schoen, M., Channamsetty, N., Bock, H., and Gubbins, K. E. 2006. Mesoscale modeling of complex binary fluid mixtures: Towards an atomistic foundation of effective potentials. J. Chem. Phys. 124:074105. Smith, G. D. 2005. Atomistic potentials for polymers and organic materials. In Handbook of Materials Modeling, ed. S. Yip, 2561–71. Berlin: Springer. Sokal, A. D. 1995. Monte Carlo methods for the selfavoiding walk. In Monte Carlo and Molecular Dynamics Simulations in Polymer Science, ed. K. Binder, 47–124. New York: Oxford University Press. Theodorou, D. N. 2006. Equilibration and coarsegraining methods for polymers. In Computer Simulations in Condensed Matter: From Materials to Chemical Biology, vol. 2. ed. F. Ferrario, G. Ciccotti, and K. Binder, 419–48. Berlin: Springer Tries, V., Paul, W., Baschnagel, J., and Binder, K. 1997. Modeling polyethylene with the bond fluctuation model. J. Chem. 
Phys. 106:738–48. Tschöp, W., Kremer, K., Batoulis, J., Bürger, T., and Hahn, O. 1998a. Simulation of polymer melts I: Coarse graining procedure for polycarbonates. Acta Polym. 49:61–74. . 1998b. Simulation of polymer melts {II}: From coarse grained models back to atomistic description. Acta Polym. 49:75–79. Virnau, P., and Müller, M. 2004. Calculation of free energy through successive umbrella sampling. J. Chem. Phys. 120:10925–30. Virnau, P., Müller, M., MacDowell, L. G., and Binder, K. 2002. Phase diagrams of hexadecane–CO2 mixtures from histogramreweighting Monte Carlo. Comput. Phys. Comm. 147:378–81.
. 2004a. Phase behavior of nalkanes in supercritical solution: A Monte Carlo study. J. Chem. Phys. 121:2169–79. . 2004b. Phase separation kinetics in compressible polymer solutions: Computer simulation of the early stages. New J. Phys. 6:7. Wilding, N. B. 1996. Critical phenomena in simple and complex fluids. In Annual Reviews of Computational Physics, vol. 4, ed. D. Stauffer, 37–73. Singapore: World Scientific. Yelash, L., Müller, M., Paul, W., and Binder, K. 2005a. Artificial multiple criticality and phase equilibria: An investigation of the PCSAFT approach. Phys. Chem. Chem. Phys. 7:3728–32. . 2005b. A global investigation of phase equilibria using the perturbedchain statisticalassociatingfluidtheory approach. J. Chem. Phys. 123:014908. . 2006. How well can coarsegrained models of real polymers describe their structure? The case of polybutadiene. J. Chem. Theory Comput. 2:588–97.
27 Effective Interaction Potentials for Coarse-Grained Simulations of Polymer-Tethered Nanoparticle Self-Assembly in Solution
Elaine R. Chan
Semiconductor Electronics Division, Electronics and Electrical Engineering Laboratory, National Institute of Standards and Technology
Alberto Striolo
School of Chemical, Biological and Materials Engineering, The University of Oklahoma
Clare McCabe, Peter T. Cummings
Department of Chemical Engineering, Vanderbilt University
Sharon C. Glotzer
Department of Chemical Engineering and Department of Materials Science and Engineering, University of Michigan
CONTENTS
27.1 Introduction
27.2 Coarse-Graining Methodology
27.2.1 Physical Mapping of the Coarse-Grained Model
27.2.2 Derivation of Solvent-Mediated Effective Potentials
27.2.2.1 Approach
27.2.2.2 Alternative Routes
27.2.2.3 Simulation Details
27.3 Coarse-Grained Potentials for Bare POSS Molecules
27.4 Coarse-Grained Potentials for Monotethered POSS Molecules
27.5 Coarse-Grained Force Field Evaluation and Validation
27.5.1 Varying Initial Guesses for the Effective Potentials
27.5.2 Varying Numerical Iteration Algorithms
27.5.3 Validation from Atomistic Simulations
27.6 Conclusions and Outlook
Acknowledgments
References
27.1 INTRODUCTION
Self-assembly is a highly promising route for constructing new and enhanced nanoparticle-based materials and devices with unique properties. However, fabrication of these nanoscale materials and devices requires knowledge of the processes that occur during self-assembly at the relevant length and time scales. Theory and simulation are useful tools for probing self-assembly in nanoscale systems because they allow access to pertinent length and time scales and enable exploration of the vast parameter space efficiently and systematically. The development and application of multiscale modeling and simulation techniques are increasingly desirable for investigating assemblies of molecular nanoparticles having various geometries and/or functionalized with appropriate substituents. Polyhedral oligomeric silsesquioxane (POSS) molecules with the formula (RSiO1.5)8 [Lichtenhan 1995] are one example of such nanoparticles. These molecules resemble cubes with silicon atoms at the corners and oxygen atoms interspersed between them (Figure 27.1). The silicon atoms can be functionalized with nonreactive organic substituents R to render the molecules compatible with polymers and surfaces, or with reactive functional groups R that provide sites for polymerization, grafting, and surface bonding. POSS molecules are therefore attractive candidates for engineering precursor structures or assemblies to construct hybrid organic/inorganic nanostructured materials with enhanced properties. In particular, previous experiments have demonstrated that POSS molecules functionalized with polymer tethers can be synthesized, and that POSS/polymer pendant copolymers self-assemble into lamellar, cylindrical, and micellar structures in solution or melt states [Knischka et al. 1999; Kim and Mather 2002; Kim, Keum, and Chujo 2003; Cardoen and Coughlin 2004]. In conjunction with these experiments, molecular simulations have been performed to predict the types of structures that can arise from self-assembly of polymer-tethered POSS in solution when concentration and temperature are varied [Chan et al. 2005; Zhang, Chan, and Glotzer 2005; Chan, Ho, and Glotzer 2006]. These simulations utilized a minimal model of tethered POSS that was developed on the basis of structural and energetic insights from quantum mechanical calculations. To investigate self-assembly phenomena at the mesoscale, hundreds to thousands of minimal model molecules were considered simultaneously. Such simulations are presently computationally prohibitive at the explicit atom level because they involve hundreds of thousands of atoms. The inclusion of atomistic detail limits the possible simulation times compared with those achievable in mesoscale simulations, and thus self-assembled structures that form only on longer time scales may not be observed. Despite these limitations, progress has been made [McCabe et al. 2004]. It has been demonstrated that standard force fields are sufficiently accurate to describe systems of POSS monomers at the explicit atom level [Ionescu et al. 2006; Li et al. 2007].
FIGURE 27.1 (Top) Mapping of the CG tethered POSS molecule onto its atomistic counterpart. CG bead labels in parentheses denote beads in the background (not shown). (Bottom) C3–C7–C5 bond angle probability distribution (left) and bond length probability distributions for CG cube bead pairs (right) computed from AA simulations at T = 400 K.
Detailed all-atom (AA) molecular dynamics simulations have been conducted for POSS monomers dissolved in common organic solvents and provide insights on effective POSS–POSS interactions in solution under varying temperatures and solvent compositions [Striolo, McCabe, and Cummings 2005a, 2005b; Striolo et al. 2007]. Other groups have also reported additional AA simulation studies of systems containing POSS monomers [Bharadwaj, Berry, and Farmer 2000; Capaldi, Rutledge, and Boyce 2005; Capaldi, Boyce, and Rutledge 2006; Patel, Mohanraj, and Pittman 2006; Qi, Durandurdu, and Kieffer 2007; Zhou and Kieffer 2007; Zhou et al. 2007]. However, it remains generally unclear how to relate the parameters in minimal models of POSS monomers to the properties of these systems obtained from AA simulations. To accurately examine self-assembly of POSS monomers into bulk structures at long length and time scales, it is necessary to develop mapping schemes that relate coarse-grained (CG) models to their underlying AA representations. Presented herein is the development of a CG force field for accurately simulating monotethered POSS molecule self-assembly in an organic solvent. The force field consists of effective solvent-mediated interaction potentials that implicitly account for POSS–solvent molecule interactions. Hence, the solvent molecules do not need to be explicitly accounted for in the CG simulations, resulting in a reduced number of particles. Our effort builds upon recent results obtained for systems of linear molecules such as polymer melts [Müller-Plathe 2002; Ashbaugh et al. 2005; Milano and Müller-Plathe 2005] and phospholipids in water [Shelley et al. 2001; Lyubartsev 2005]. We extend those methods here to coarse-grain cubic molecules such as POSS monomers. Coarse-graining approaches aim to improve the computational efficiency of a simulation by reducing the number of degrees of freedom in the system in a systematic fashion [Baschnagel et al.
2000; Kremer and Müller-Plathe 2001; Glotzer and Paul 2002; Kremer and Müller-Plathe 2002; Müller-Plathe 2002, 2003; Nielsen et al. 2004; Lu and Kaxiras 2005]. These methods reduce the central processing unit (CPU) time by two to four orders of magnitude compared to the corresponding AA simulations. Currently, CG methodologies typically involve two steps: (1) mapping a detailed atomistic or molecular representation onto a CG representation, and (2) deriving the equivalent CG interaction potentials. The approach utilized in this work is to map specific groups of atoms onto CG particles and derive numerical CG effective potentials that reproduce, at the mesoscale, structural properties observed in the AA simulations. The mapping scheme preserves important molecular details, such as connectivity, in the CG representation as
well as relevant physical properties, such as intermolecular packing, which should be captured in the mesoscale simulations. With regard to the development of the effective potentials, two methodologies are often employed, namely, analytical potentials with tunable parameters or numerical potentials in tabulated form. Although analytical potentials are desirable because they can be parameterized according to experimental data or quantum mechanical calculations, the processes available to obtain the correct parameter values can be time-consuming, and in some cases, the data necessary for parameterization are unavailable. Hence, most current CG models utilize solely numerical potentials or combinations of numerical and simple analytical potentials to describe complex interactions. CG numerical potentials can be derived by requiring that the mesoscale simulations reproduce specific intra- and intermolecular probability distribution functions computed from the underlying AA simulations [Lyubartsev and Laaksonen 1995; Soper 1996; Tschöp et al. 1998; Eilhard et al. 1999; Lyubartsev et al. 2003; Reith, Putz, and Müller-Plathe 2003; Lyubartsev 2005]. These structural-based coarse-graining schemes require iterative numerical methods and are attractive because they can be automated. However, one caveat of the method is that the resulting effective potentials lack transferability across thermodynamic state space, as the CG Hamiltonians are only parameterized to reproduce structural correlations correctly [Ashbaugh et al. 2005]. It has been suggested that such transferability could be obtained if the effects of enthalpy and entropy are decoupled and the CG force fields account for the decoupling [Baron et al. 2006, 2007]. Another drawback is the nonuniqueness of the derived effective potentials; that is, different effective potentials exist that can each reproduce the target distribution functions from the AA simulations.
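To make the idea of a numerical potential in tabulated form concrete, the following is a minimal sketch of how such a table might be evaluated during a simulation. It assumes linear interpolation between grid points and a potential that is zero beyond the last grid point; the function name `make_tabulated_potential` and these conventions are illustrative assumptions, not details taken from the chapter.

```python
import bisect


def make_tabulated_potential(r_grid, u_grid):
    """Return (potential, force) callables for a tabulated pair potential.

    r_grid must be strictly increasing. Values are linearly interpolated;
    beyond the last grid point both potential and force are taken as zero
    (an assumed cutoff convention for this sketch).
    """
    if len(r_grid) != len(u_grid) or len(r_grid) < 2:
        raise ValueError("need matching grids with at least two points")

    def _segment(r):
        # Index k of the interval [r_grid[k], r_grid[k+1]] that holds r,
        # clamped so that out-of-range r reuses the nearest interval.
        k = bisect.bisect_right(r_grid, r) - 1
        return max(0, min(k, len(r_grid) - 2))

    def _slope(k):
        return (u_grid[k + 1] - u_grid[k]) / (r_grid[k + 1] - r_grid[k])

    def potential(r):
        if r > r_grid[-1]:
            return 0.0
        k = _segment(r)
        return u_grid[k] + _slope(k) * (r - r_grid[k])

    def force(r):
        # F = -dU/dr; piecewise constant for a piecewise-linear table
        if r > r_grid[-1]:
            return 0.0
        return -_slope(_segment(r))

    return potential, force
```

A finer grid or a smoother interpolant (for example, cubic splines) would give continuous forces; linear interpolation is used here only to keep the sketch short.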
The coarse-graining approach undertaken in the following examples is a structural-based one where effective numerical potentials are derived that reproduce in the CG simulations target structures in the underlying AA simulations. These target structural features are expected to influence the local intermolecular packing within self-assembled structures of polymer-tethered POSS molecules, and consequently the formation of specific types of bulk structures at longer length and time scales. In addition to obtaining the CG force field for simulating POSS molecule self-assembly, particular aspects of the coarse-graining approach, including nonuniqueness of the effective potentials and variations on the numerical iteration algorithm, are examined. The work presented herein is adapted from previous publications [Chan 2006; Chan et al. 2007; Striolo et al. 2007], which the reader can refer to for additional details and discussion.
27.2 COARSE-GRAINING METHODOLOGY
27.2.1 PHYSICAL MAPPING OF THE COARSE-GRAINED MODEL
We have developed a CG model of a POSS molecule functionalized with a single nonyl tether on one corner and nonreactive methyl groups on the remaining seven silicon atoms (Figure 27.1). The hydrocarbon substituents render the molecule soluble in chemically similar and common solvents such as hexane. Because the silsesquioxane core is symmetric, one starting point is to model the cage as a rigid cube with interaction sites on the corners, as in our previous minimal model [Chan et al. 2005]. Each of the resulting eight cube corner beads thus represents one silicon atom, the neighboring oxygen atoms, and the methyl (or methylene in the case of the nonyl tether) substituent attached to the silicon atom. The bead interaction sites are at the centers of the silicon–carbon bonds that connect each substituent to the cage. The beads are connected by rigid bonds. To examine the physical appropriateness of this CG model of the silsesquioxane cage, the bond length and bond angle probability distributions are compared with those computed in AA simulations of nonyl-tethered POSS molecules dissolved in hexane [Chan et al. 2007; Striolo et al. 2007]. Figure 27.1 shows one example of an AA simulated bond angle distribution that is sharply peaked at
about 90°, thereby indicating that the grouping of atoms on the silsesquioxane cage is commensurate with a rigid cube model having eight corner sites. The distances l between the centers of the silicon–carbon bonds that correspond to the interaction sites in the CG rigid cube model exhibit peaks centered at l = 4.2 Å. Mapping the AA simulation results to the CG model establishes a length scale in the CG simulations by specifying the edge of the CG cube equal to this value. Each cube corner bead in the model is thus assigned a diameter of σc = 4.2 Å. To model the nonyl tether, two methylene groups are assigned to each CG tether bead. Although this mapping is on a finer scale compared to previous CG models of hydrocarbon chains that employ groupings of three or more methylene groups per CG bead [Baschnagel et al. 1991; Marrink, de Vries, and Mark 2004; Ashbaugh et al. 2005; Depa and Maranas 2005], it is chosen in order to facilitate future efforts to bridge length and time scales in polymer-tethered POSS self-assembly via reverse mapping schemes where the CG model is mapped back onto its explicit atom counterpart. Note that the end tether bead actually represents a CH2–CH3 group in the model, and it is assumed that the behavior and physical properties of this group are not significantly different from those of a CH2–CH2 group. The interaction sites for the CG tether beads lie at the center of the corresponding carbon–carbon bond in the AA molecule. The bond length distributions between pairs of tether bead sites are computed from AA simulations [Striolo et al. 2007]. On the basis of these results [Chan et al. 2007], we assign a diameter of σt = 2.5 Å to each bead in the CG tether.
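The mapping just described can be sketched in a few lines of code. The example below is illustrative only: the function name `map_to_cg_sites`, the argument layout, and the atom ordering are hypothetical, and coordinates are assumed to be unwrapped so that bonded atoms are not separated by a periodic-image jump. It places each corner site at the center of a silicon–carbon bond and each tether site at the center of the carbon–carbon bond within a two-carbon group.

```python
def midpoint(a, b):
    """Midpoint of two 3D points given as (x, y, z) tuples."""
    return tuple(0.5 * (p + q) for p, q in zip(a, b))


def map_to_cg_sites(si_xyz, first_c_xyz, tether_c_xyz):
    """Map one nonyl-tethered POSS molecule onto 8 corner and 4 tether sites.

    si_xyz       : 8 silicon positions, one per cage corner
    first_c_xyz  : 8 positions of the carbon bonded to each silicon
                   (a methyl carbon, or the first carbon of the tether)
    tether_c_xyz : the remaining 8 tether carbons, in chain order
    The input layout is a hypothetical convention chosen for this sketch.
    """
    if len(si_xyz) != 8 or len(first_c_xyz) != 8 or len(tether_c_xyz) != 8:
        raise ValueError("expected 8 Si, 8 attached C, and 8 tether C atoms")
    # Corner sites: centers of the Si-C bonds
    corners = [midpoint(si, c) for si, c in zip(si_xyz, first_c_xyz)]
    # Tether sites: center of the C-C bond inside each two-carbon group
    tethers = [midpoint(tether_c_xyz[k], tether_c_xyz[k + 1])
               for k in range(0, 8, 2)]
    return corners, tethers
```

Applied frame by frame to an AA trajectory, such a mapping yields the site coordinates from which the target bond length, bond angle, and radial distribution functions are accumulated.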
27.2.2 DERIVATION OF SOLVENT-MEDIATED EFFECTIVE POTENTIALS
27.2.2.1 Approach
We seek to reproduce in mesoscale simulations a select set of target structural quantities computed from the underlying AA simulations that correspond to the bead interaction sites in the CG model. These quantities are the intramolecular bond length and bond angle probability distributions and the intermolecular radial distribution function (RDF) between cube corner beads on different molecules. The algorithm used to derive the effective potentials is a numerical iteration scheme that produces effective potentials via the following equation [Lyubartsev and Laaksonen 1995; Soper 1996; Reith, Putz, and Müller-Plathe 2003; Ashbaugh et al. 2005]:

$$U_{i+1}(x) = U_i(x) + \alpha k_B T \ln\left[\frac{P_i(x)}{P_{\mathrm{target}}(x)}\right], \qquad i = 0, 1, 2, \ldots, \tag{27.1}$$

where i is the iteration step number, kB is Boltzmann's constant, T is the temperature, x is the independent variable, and P(x) is a probability distribution function, such as an RDF, bond length probability distribution, or bond angle probability distribution. The algorithm updates trial effective potentials Ui(x) at each iteration step by adding a correction term based on the deviation between the trial CG-simulated probability distribution function and target AA-simulated distribution function. The term α is an arbitrary number that scales the magnitude of the correction term to ensure algorithm stability and convergence. A CG effective potential that reproduces the desired structural features in the underlying AA simulations is obtained when the trial CG-simulated and target AA-simulated distribution functions are sufficiently close according to some prescribed tolerance value. It is important to emphasize that Equation 27.1 has no theoretical basis [Chan et al. 2007] and is employed here with the understanding that it is simply one of many possible numerical algorithms that yield CG effective potentials that satisfactorily reproduce the target distribution functions in the AA simulations. Briefly, the concept of structural-based coarse-graining is motivated by the proof
that there is a unique mapping between the RDF and the intermolecular potential for simple pairwise additive and spherically symmetric potentials at a given thermodynamic state point [Henderson 1974]. The relationship between the potential of mean force (PMF) and the RDF at infinite dilution for molecular centers of mass is given by the following equation [McQuarrie 2000]:

$$U_{\mathrm{PMF}}(r) = -k_B T \ln[g(r)]. \tag{27.2}$$

In this limit, the PMF is precisely equal to the intermolecular pair potential between two point particles. Equation 27.2 is strictly applicable to particles or molecules described as single interaction sites and is invalid for molecules treated as collections of multiple interaction sites or beads, such as polymer chains and the CG tethered POSS molecules of interest here. These types of molecules exhibit orientational correlations that are not accounted for in Equation 27.2, as explained further in the Appendix of Chan et al. (2007). Instead, Equation 27.1 is merely a convenient algorithm to use here, as it satisfies the boundary condition that the trial CG effective potentials converge when the CG-simulated distribution functions match the target AA-simulated ones. We explore this point further in Section 27.2.2.2. To generate the initial guesses (i = 0) for the CG effective potentials, the target RDF, bond length probability distribution P(l), and bond angle probability distribution P(θ) computed from the AA simulations are Boltzmann inverted using the equations below, respectively. Note that these choices for the initial guesses are rather arbitrary, as discussed further in Section 27.2.2.2 and in the Appendix of Chan et al. (2007).

$$U_0(r) = -k_B T \ln[g_{\mathrm{target}}(r)], \tag{27.3}$$

$$U_0(l) = -k_B T \ln[P_{\mathrm{target}}(l)], \tag{27.4}$$

$$U_0(\theta) = -k_B T \ln\left[\frac{P_{\mathrm{target}}(\theta)}{\sin\theta}\right]. \tag{27.5}$$
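The initial guesses of Equations 27.3 through 27.5 and a single update step of Equation 27.1 can be sketched on a discrete grid as follows. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, energies are expressed through a user-supplied `kT`, and the treatment of empty histogram bins (returned as `None` or left unchanged) is an assumption of the sketch.

```python
import math


def boltzmann_invert(p_target, kT, jacobian=None):
    """Initial guess U_0 = -kT ln[P_target / J] on a grid (Eqs. 27.3-27.5).

    jacobian is None for g(r) and P(l); pass sin(theta) values for P(theta).
    Bins where the argument is not positive are returned as None because
    the logarithm is undefined there (filling them is left to the user).
    """
    if jacobian is None:
        jacobian = [1.0] * len(p_target)
    u0 = []
    for p, j in zip(p_target, jacobian):
        u0.append(-kT * math.log(p / j) if p > 0.0 and j > 0.0 else None)
    return u0


def update_potential(u_trial, p_trial, p_target, kT, alpha):
    """One step of Eq. 27.1: U_{i+1} = U_i + alpha kT ln[P_i / P_target].

    Bins with a non-positive distribution value or an undefined trial
    potential are left unchanged (an assumption of this sketch).
    """
    u_next = []
    for u, p, pt in zip(u_trial, p_trial, p_target):
        if u is not None and p > 0.0 and pt > 0.0:
            u = u + alpha * kT * math.log(p / pt)
        u_next.append(u)
    return u_next
```

In a full workflow, each call to `update_potential` would be followed by a new CG simulation with the updated table to obtain the next trial distribution; the correction vanishes wherever the trial and target distributions coincide.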
To assess convergence of the derived effective potentials, during each iteration step the following merit functions [Müller-Plathe 2002; Reith, Putz, and Müller-Plathe 2003] are computed for the intermolecular RDF between cube corner beads, intramolecular bond length probability distributions, and intramolecular bond angle probability distributions, respectively.

$$f_{\mathrm{merit,RDF}} = \int w(r)\,[g_i(r) - g_{\mathrm{target}}(r)]^2\,dr, \tag{27.6}$$

$$f_{\mathrm{merit,bond}} = \int w(l)\,[P_i(l) - P_{\mathrm{target}}(l)]^2\,dl, \tag{27.7}$$

$$f_{\mathrm{merit,angle}} = \int [P_i(\theta) - P_{\mathrm{target}}(\theta)]^2\,d\theta. \tag{27.8}$$

Optional nonnegative weighting functions w(r) = exp(−r/σc) and w(l) = exp(−l/σt) are also utilized to penalize deviations between the distribution functions in the CG and AA simulations at small separation distances.
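As an illustration of how these merit functions might be evaluated on tabulated distributions, the sketch below integrates the weighted squared deviation with the trapezoidal rule. The function name `merit_function`, the choice of quadrature, and the convention that `sigma=None` means no weighting are assumptions made for this example.

```python
import math


def merit_function(x, p_trial, p_target, sigma=None):
    """Approximate the integral of w(x) [p_trial(x) - p_target(x)]^2 dx.

    x, p_trial, p_target are equal-length lists on an increasing grid.
    If sigma is given, w(x) = exp(-x / sigma); otherwise w(x) = 1.
    The trapezoidal rule is an assumed quadrature for this sketch.
    """
    def integrand(k):
        w = math.exp(-x[k] / sigma) if sigma is not None else 1.0
        return w * (p_trial[k] - p_target[k]) ** 2

    total = 0.0
    for k in range(len(x) - 1):
        total += 0.5 * (integrand(k) + integrand(k + 1)) * (x[k + 1] - x[k])
    return total
```

For example, passing the cube corner bead diameter as `sigma` reproduces the weighting used for the RDF, while omitting it gives the unweighted form used for the bond angle distributions.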
On the basis of RDFs computed from the AA simulations for the CG tether bead sites [Striolo et al. 2007], a purely repulsive soft-sphere potential [Leach 2001] is used to capture the intermolecular excluded volume interactions between tether beads:

$$U(r) = \begin{cases} \dfrac{27\varepsilon}{4}\left[\left(\dfrac{\sigma}{r}\right)^{9} - \left(\dfrac{\sigma}{r}\right)^{6}\right] + \varepsilon, & r \le r_c, \\[2ex] 0, & r > r_c. \end{cases} \tag{27.9}$$

In this expression, $r_c = (3/2)^{1/3}\sigma$, the separation at which the unshifted 9–6 potential reaches its minimum value of $-\varepsilon$, and $\varepsilon = k_B T$. The choice of this potential is not expected to significantly affect the resulting CG probability distribution functions involving the tether beads or self-assembly of the molecules.
27.2.2.2 Alternative Routes
As the effective potentials obtained using the approach discussed in Section 27.2.2.1 are nonunique, one means to evaluate their accuracy is to derive them from different types of initial guesses using the same numerical iteration algorithm and compare the results. This exercise is helpful for corroborating an effective potential in cases where different initial guesses yield the same result or for assessing the best effective potential if different potentials result. The intermolecular POSS cube corner bead effective potentials are first obtained by deriving them using initial guesses generated by Equation 27.3; that is, Boltzmann inversions of the target AA-simulated RDFs. As there is no theoretical basis for using this expression to generate the initial guesses [Chan et al. 2007], we next derive effective potentials using a different initial guess; that is, the purely repulsive Weeks–Chandler–Andersen (WCA) [Allen and Tildesley 1987] interaction potential:

$$U(r) = \begin{cases} 4\varepsilon\left[\left(\dfrac{\sigma_c}{r}\right)^{12} - \left(\dfrac{\sigma_c}{r}\right)^{6}\right] + \varepsilon, & r \le r_c, \\[2ex] 0, & r > r_c, \end{cases} \tag{27.10}$$

where $r_c = 2^{1/6}\sigma_c$ and $\varepsilon = k_B T$. We also compare the effective potentials derived from a different numerical equation since the iterative scheme of Equation 27.1 has no theoretical basis [Chan et al. 2007]. Equation 27.1 is a successful algorithm for deriving CG effective potentials because the logarithmic term is able to change sign (+/−) accordingly so that the updated effective potential produces a CG distribution function that is in better agreement with the AA target distribution function. Thus, this property of the correction term functions as one criterion for devising alternative numerical algorithms that are equally or potentially superior to Equation 27.1. A simple correction term that takes the linear difference between the RDFs computed in the CG and AA simulations satisfies both the above criterion and the boundary condition that the effective potential converges when the CG and target RDFs are equal. We thus propose the following numerical equation for deriving effective potentials:

$$U_{i+1}(r) = U_i(r) + \alpha k_B T\,[g_i(r) - g_{\mathrm{target}}(r)], \tag{27.11}$$

where α is an arbitrary number used to scale the magnitude of the correction term. We compare below the effective potentials generated by Equation 27.1 and Equation 27.11 from identical initial guesses. The speed of each algorithm is also examined.
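The two ingredients of this comparison, the WCA initial guess of Equation 27.10 and the pointwise correction terms of Equations 27.1 and 27.11, can be written compactly as below. This is a hedged sketch with hypothetical function names; it operates on a single grid point and leaves the surrounding simulation loop to the reader.

```python
import math


def wca(r, sigma, epsilon):
    """Purely repulsive WCA potential of Eq. 27.10 at separation r."""
    r_c = 2.0 ** (1.0 / 6.0) * sigma
    if r > r_c:
        return 0.0
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6) + epsilon


def log_update(u, g_trial, g_target, kT, alpha):
    """Eq. 27.1 at one grid point; requires g_trial > 0 and g_target > 0."""
    return u + alpha * kT * math.log(g_trial / g_target)


def linear_update(u, g_trial, g_target, kT, alpha):
    """Eq. 27.11 at one grid point; also defined where an RDF value is zero."""
    return u + alpha * kT * (g_trial - g_target)
```

Both corrections are positive when the trial RDF exceeds the target (making the potential more repulsive there), negative in the opposite case, and zero when the two RDFs agree, which is the sign criterion and boundary condition described above.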
27.2.2.3 Simulation Details
Brownian dynamics, a stochastic molecular dynamics simulation method that samples the canonical ensemble, is utilized to conduct the CG simulations. Additional details on this method are presented elsewhere [van Gunsteren, Berendsen, and Rullmann 1981; Chan et al. 2005; Zhang, Chan, and Glotzer 2005; Chan, Ho, and Glotzer 2006]. Systems containing Nb = 5 and Nb = 20 CG nonyl-tethered POSS molecules (N = 40 and 240 total particles, respectively) are simulated at overall density ρ = 0.75 g/cm³ and temperatures T = 300 and 400 K. The simulations employ cubic boxes and periodic boundary conditions. The equations of motion are integrated using the leapfrog algorithm, and the rigid-body motion of the cubes is captured using the method of quaternions [Allen and Tildesley 1987]. Each system is first relaxed athermally to generate initial configurations. Self-assembly of the molecules over time is monitored by inspecting simulation snapshots of configurations. These configurations are subsequently compared to those in the corresponding AA molecular dynamics simulations having the same number of molecules and at the same temperature and density. AA simulations of Nb = 20 nonyl-tethered POSS molecules dissolved in 987 hexane solvent molecules (N = 6642 total atoms) are performed using the DL_POLY [Smith and Forester 1996] simulation package. The Frischknecht–Curro force field [Frischknecht and Curro 2003] is employed to describe the POSS cage, and the TRAPPE force field [Martin and Siepmann 1998] is used to describe the nonyl tether and hexane solvent. Further details of these simulations are reported in Striolo et al. (2007).
27.3 COARSE-GRAINED POTENTIALS FOR BARE POSS MOLECULES
Initially, an intermolecular CG effective potential that captures the interactions between cube corner beads is derived. This is a logical starting point since the addition of a single hydrocarbon tether on one corner of the silsesquioxane cage has little impact on cage behavior [Li et al. 2007]. Hence, the tether should have minimal impact on the intermolecular interactions between nonreactive "bare," or nontethered, POSS monomers. AA molecular dynamics simulations of Nb = 5 octamethyl-functionalized POSS monomers dissolved in hexane have been previously performed at overall density ρ = 0.75 g/cm³ and temperatures T = 300 and 400 K [Striolo et al. 2007]. Figure 27.2 displays target RDFs computed from these simulations that characterize the local structure among the CG cube corner bead sites from the underlying atomistic molecules. The RDFs exhibit pronounced peaks that occur primarily at integer multiples of the cube edge length. This behavior in the RDF was also previously observed in simulations of POSS monomers dissolved in hexadecane [Striolo, McCabe, and Cummings 2005a]. The tails in the RDFs at large separation distances fall below unity at both temperatures because of a combination of three factors: (1) small system size effects (Nb = 5 molecules or N = 40 particles), which are corrected by multiplying the RDF by the correction factor N/(N − 1) [Barker and Henderson 1971; McQuarrie 2000]; (2) not accounting for the close proximity of the cube corner beads that are rigidly bound together when normalizing the RDF; and (3) nonuniform clustering of the POSS monomers throughout the simulation box. The CG effective pair potentials derived on the basis of these RDFs and the initial guesses used in the iteration algorithm are also shown in Figure 27.2. The interaction potential cutoff value used in the CG simulations is rc = 28 Å.
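A minimal sketch of how an intermolecular site–site RDF with the N/(N − 1) finite-size correction might be accumulated from a single configuration is given below. The function name, the argument layout, and the normalization by the total number of sites are assumptions of this example; in particular, normalizing by all N sites while counting only pairs on different molecules mirrors factor (2) above and is not claimed to be the exact procedure used in the cited work.

```python
import math


def intermolecular_rdf(sites, mol_ids, box, dr, r_max,
                       finite_size_correction=True):
    """Histogram-based site-site RDF for one configuration in a cubic box.

    sites   : list of (x, y, z) site positions
    mol_ids : molecule index of each site; same-molecule pairs are skipped
    box     : cubic box edge length; r_max should not exceed box / 2
    With finite_size_correction=True the result is multiplied by
    N / (N - 1), where N is the total number of sites.
    """
    n = len(sites)
    n_bins = int(r_max / dr)
    counts = [0] * n_bins
    for i in range(n - 1):
        for j in range(i + 1, n):
            if mol_ids[i] == mol_ids[j]:
                continue  # sites rigidly bound on the same cube
            d2 = 0.0
            for a, b in zip(sites[i], sites[j]):
                d = a - b
                d -= box * round(d / box)  # minimum-image convention
                d2 += d * d
            k = int(math.sqrt(d2) / dr)
            if k < n_bins:
                counts[k] += 1
    volume = box ** 3
    # N*N normalization gives the uncorrected RDF; N*(N-1) applies N/(N-1)
    norm = n * (n - 1) if finite_size_correction else n * n
    g = []
    for k, c in enumerate(counts):
        shell = 4.0 / 3.0 * math.pi * (((k + 1) * dr) ** 3 - (k * dr) ** 3)
        g.append(2.0 * c * volume / (norm * shell))
    return g
```

In practice the histogram would be averaged over many configurations before normalization; a single frame is used here only to keep the example short.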
Small correction steps (α = 0.01–0.1) are required during numerical iteration to ensure algorithm stability and convergence of the potentials, most likely because explicit solvent molecules are absent in the CG model. Previous applications of Equation 27.1 to derive effective potentials for polymer melts report success with larger parameter values of α = 0.2 [Ashbaugh et al. 2005] and α = 1 [Reith, Putz, and Müller-Plathe 2003]. At T = 300 K, the effective potential consists of an alternating series of attractive wells and repulsive peaks that correspond to the peaks and valleys in the target RDF computed from the AA simulations,
FIGURE 27.2 Site–site CG effective potentials of bare POSS molecules at T = 300 K (top) and T = 400 K (bottom). The corresponding intermolecular radial distribution functions are shown in the insets.
respectively. This relationship between the shapes of the effective potential and the target RDF is absent at the higher temperature, T = 400 K. The effective pair potential here exhibits a steep attractive well at r = 8.3 Å followed by broader attractive wells and repulsive peaks compared to those observed at lower temperature. The latter behavior indicates loss of long-range structure with increasing temperature. The RDFs produced in the CG simulations by the effective potentials are shown in Figure 27.2. The agreement between the CG and AA target RDFs at T = 300 K is excellent and the merit function value is fmerit,RDF ≈ 10⁻⁵ when the iteration algorithm reaches convergence [Reith, Putz, and Müller-Plathe 2003]. The agreement between the two RDFs at T = 400 K is good, as indicated by fmerit,RDF ≈ 10⁻⁴ when convergence is attained.
27.4 COARSE-GRAINED POTENTIALS FOR MONOTETHERED POSS MOLECULES
We next build upon the model developed thus far for bare POSS cubes by considering the interactions introduced when a nonyl tether is attached to one cube corner (Figure 27.1). Effective potentials are derived to capture the bond stretching and bending interactions now present in this CG monotethered POSS molecule. Because the POSS cages are treated as rigid cubes, only four bonded interactions are considered, between the following pairs of beads: C8–T1, T1–T2, T2–T3, and T3–T4 (see Figure 27.1). Four bending interactions due to the angles defined by the bead triplets C6–C8–T1, C8–T1–T2, T1–T2–T3, and T2–T3–T4 are included in the model. Dihedral interactions are not incorporated to maintain model simplicity. An example of simulated probability distributions for the effective bond stretching and bending interactions, along with the corresponding effective potentials, is presented in Figure 27.3. The CG and target distributions for the C8–T1 bond match closely with fmerit,bond ≈ 10⁻⁵ when the iteration algorithm converges. The two distributions for the tether bonds T1–T2, T2–T3, and T3–T4 are in good agreement with fmerit,bond ≈ 10⁻⁴. The bond bending distributions display multiple peaks that can probably be attributed to dihedral transitions along the alkyl chain that are captured in the fine level of coarse-graining adopted here. The corresponding effective potentials exhibit peaks and valleys that mirror the shape and relative magnitude of these features in the target distribution functions. There is excellent agreement between the CG and target bond bending distributions for each of the four angles treated in the model. The merit function values are fmerit,angle