Reliable crystal structure predictions from first principles

Nikhar, Rahul; Szalewicz, Krzysztof

doi:10.1038/s41467-022-30692-y

Article
Open access
Published: 02 June 2022

Nature Communications volume 13, Article number: 3095 (2022)

Abstract

An inexpensive and reliable method for molecular crystal structure predictions (CSPs) has been developed. The new CSP protocol starts from a two-dimensional graph of crystal’s monomer(s) and utilizes no experimental information. Using results of quantum mechanical calculations for molecular dimers, an accurate two-body, rigid-monomer ab initio-based force field (aiFF) for the crystal is developed. Since CSPs with aiFFs are essentially as expensive as with empirical FFs, tens of thousands of plausible polymorphs generated by the crystal packing procedures can be optimized. Here we show the robustness of this protocol which found the experimental crystal within the 20 most stable predicted polymorphs for each of the 15 investigated molecules. The ranking was further refined by performing periodic density-functional theory (DFT) plus dispersion correction (pDFT+D) calculations for these 20 top-ranked polymorphs, resulting in the experimental crystal ranked as number one for all the systems studied (and the second polymorph, if known, ranked in the top few). Alternatively, the polymorphs generated can be used to improve aiFFs, which also leads to rank one predictions. The proposed CSP protocol should result in aiFFs replacing empirical FFs in CSP research.

Introduction

Properties of crystalline solids depend critically on the polymorphic form of a given substance, and many crystals can exist in several such forms^1,2. Knowledge of the possible stable polymorphic forms of a crystal is of particular importance for the pharmaceutical industry³. If a polymorph different from the one obtained in laboratories crystallizes during manufacturing of a drug, it will have different physicochemical properties and may lead to undesirable therapeutic effects; two examples are ritonavir^4,5 and rotigotine^6,7,8. Thus, in the drug development process, one would like to know whether the polymorph used is thermodynamically the most stable form under ambient conditions. In the defense industry, the development of energetic materials is costly and highly dangerous^9,10, and a priori knowledge of the crystal structures of notional materials would allow accelerated screening of such materials. The semiconductor industry can also benefit from such knowledge¹¹. CSP methods answer these needs by finding a set of the most stable crystalline polymorphs of a given molecule starting from its two-dimensional diagram and not using any experimental information specific to this molecule.

Reliable CSPs for molecular crystals starting from the knowledge of only two-dimensional diagrams of monomer(s) were nearly impossible for a long time. In 1988, Maddox¹² described the failure of CSPs as a continuing scandal in the physical sciences and stated that, in general, even the simplest crystalline solids posed a great challenge. In the mid-1990s, Gavezzotti¹³ asked the fundamental question: ‘Are crystal structures predictable?’, and his answer was ‘No’. In response to this criticism, the Cambridge Crystallographic Data Centre (CCDC) conducted a series of “blind” tests^{14,15,16,17,18,19}, providing only two-dimensional diagrams of monomers of crystals that had been measured but not published and asking research groups to submit their predictions; the results of the first test were published in 2000. While the field has advanced significantly since the first test, the results of the last, 6th test¹⁹ are still not completely satisfactory. The participating groups achieved success rates between 13% and 57% (not including polymorphs C and E of system XXIII), where success means that the experimental polymorph was found among the polymorphs on two lists containing 100 polymorphs each.

One should remark here that predictions of crystal structures are actually a difficult problem for physical science, contrary to what Maddox¹² implied. The reason is the high dimensionality of the conformational and crystallographic space, which results in thousands of plausible polymorphs produced by sampling this space within a relatively narrow window of lattice energies and densities. The energetic distances between consecutive polymorphs ordered by lattice energy are of the order of 1 kJ/mol at the low-energy end, which requires accuracies nearly impossible to achieve with empirical FFs. Also, for experimentally observed polymorphs, the differences between their computed lattice energies are of the same order²⁰.

While there are several variants of CSPs, including a recent use of deep neural networks²¹, the majority of groups participating in the 6th blind test used some form of FFs, mostly of empirical character. The most successful CSP protocol consisting of polymorph-space sampling plus lattice-energy minimization has been developed by Neumann et al.^22,23. This protocol uses a tailor-made FF which is obtained by refining parameters of an empirical FF to reproduce as close as possible pDFT+D lattice energies (and their derivatives). The initial polymorphs for pDFT+D calculations are obtained using the empirical FF. The method is included in the commercial software package GRACE (Generation Ranking and Characterization Engine)²⁴, but some of its details are not available. Recent reviews of the field of CSPs can be found in refs. ^{25,26,27,28,29,30,31}.

In the present work, a CSP protocol is proposed based entirely on first principles, i.e., not utilizing any experimental information. Since the main characteristic of this method is the use of aiFFs, we will refer to it as the CSP(aiFF) protocol. This protocol consists of several stages shown in Fig. 1. While aiFFs have been used in CSPs for some time^19,32,33,34, such predictions took a long time (several months at the minimum), required huge amounts of human effort, and were possible only for monomers with up to about 20 atoms. In the present work, four recent developments are combined to dramatically reduce costs and increase predictability of such CSPs: (a) the development of a very effective variant³⁵ of symmetry-adapted perturbation theory (SAPT)³⁶ for ab initio calculations of interaction energies; (b) the creation of autoPES^37,38, an automatic, effective, and reliable method for generation of potential energy surfaces (PESs) with minimal human involvement; (c) enabling the use of such aiFFs in the lattice-energy minimization stage of CSPs, a part of the present work; (d) the application of pDFT+D for a final refinement of polymorph rankings. Stage 3 of Fig. 1 can produce even millions of polymorphs at low cost, and past experience indicates that the experimentally relevant polymorphs are almost always among them. Thus, the essence of CSP protocols is to filter all relevant low lattice energy polymorphs out of this set. In the past few years, it has been demonstrated by several groups that pDFT+D geometry optimization of polymorphs places the experimental polymorph within the top few, often as number one^19,39,40,41. However, such calculations are so expensive that they can be afforded for only a hundred or so polymorphs. In contrast, if an FF is used in Stage 4, tens of thousands of polymorphs can be optimized. This FF has to be sufficiently accurate not to miss any important polymorphs.
Thus, both the ab initio method and the fit to interaction energies computed using this method must have sufficiently small uncertainties. In calculations of dimer interaction energies, the variant of SAPT used by us (see “Methods”) is nearly as accurate^35,42,43 as the coupled cluster method with single, double, and noniterative triple excitations, CCSD(T), the “gold-standard” method of electronic structure theory, but is significantly less expensive. To prevent loss of accuracy due to fitting, the form of the fitting function has to be significantly more involved than those of empirical FFs, which are typically built from Lennard-Jones plus Coulomb potentials (see “Methods”). The extended form can fit ab initio data with uncertainties of about 1 kJ/mol, which we will show to be sufficient for reliable CSPs. Such a form has never been used in lattice energy minimizations, and we had to modify CSP software to make it possible. Finally, to make Stage 2 affordable, the number of ab initio grid points needed to fit an aiFF has to be reasonably small. The autoPES method^37,38 reduces this number by two orders of magnitude compared to typical surface-fitting approaches, reducing the development costs by the same ratio. It also reduces the amount of human involvement almost to zero, as the whole process is completely automated. We show below that the proposed protocol found the experimental crystal ranked as number one for all 15 molecules studied (and the second polymorph, if known, ranked in the top few).

**Fig. 1: Overview of aiFF-based CSP protocol.**

Results and discussion

Performance of CSP(aiFF) protocol

To assess the performance of our method, we carried out CSPs for 15 molecules, including several systems from the CCDC blind tests^{14,15,16,17,18,19} (denoted by Roman numerals), as well as methanol, benzene, nitromethane, 5,5′-dinitro-2H,2′H-3,3′-bi-1,2,4-triazole (DNBT), 1,3,5-trinitrobenzene (TNB), deferiprone, and fluorouracil. The molecular graphs are shown in Supplementary Fig. 1. The results are summarized in Table 1. An extended version of this table is available in Supplementary Table 1.

Table 1 CSPs from SAPT(DFT)-based aiFFs minimizations followed by pDFT+D fixed-geometry calculations.

The CSP(aiFF) protocol ranked the experimental polymorph as number 1 in 5 cases, as number 2–6 in 7 cases, and as numbers 9, 9, and 16. We have also included a second experimentally identified polymorph in the cases of system I, benzene, and deferiprone, denoted as “Poly2” in Table 1, and these are ranked as numbers 8, 4, and 8, respectively. After pDFT+D calculations on top-ranked 20 polymorphs of each crystal, without any further geometry optimization, an experimental crystal became ranked as number 1 in each case. For deferiprone, it was Poly2 that became the rank 1 polymorph, while Poly1 remained at rank 2. For system I and benzene, Poly2 changed rank from 8 to 2 and from 4 to 3, respectively. RMSD₂₀’s between the calculated and experimental crystals vary between 0.09 and 0.67 Å, below the CCDC threshold of 0.8 Å. Also densities and cell parameters, shown in Supplementary Table 1, agree very closely. Supplementary Fig. 2 displays the percent deviations between the calculated and experimental lattice parameters. The average errors for the cell parameters a, b, c, and β amount to 4.3%, 2.6%, 4.3%, and 2.4%, respectively. Such level of predictivity is unprecedented for a completely first-principles CSP protocol. The overlaps of the experimental polymorphs with the closest calculated ones are shown in Fig. 2. This figure allows intuitive appreciation how close these structures are. This exceptional performance of CSP(aiFF) has been achieved despite the investigated systems exhibiting typical difficulties due to closeness of polymorphs’ lattice energies and despite using rigid-monomer approximation. The lattice energy vs. density landscapes from the aiFF minimizations for systems IV and XXII are shown in Supplementary Fig. 3. Analogous graphs for the other systems look similar. 
The 100 lowest-energy polymorphs span a range of about 5 kJ/mol for systems I, XII, XIII, benzene, and nitromethane, about 10 kJ/mol for systems II, IV, VIII, XVI, XXII, methanol, TNB, deferiprone, and fluorouracil, and about 20 kJ/mol for DNBT. At the low-energy end, the energy differences between consecutive polymorphs are less than 1 kJ/mol, i.e., comparable to the RMSEs of the fits over all dimer configurations with negative interaction energies, shown in Table 1.

**Fig. 2: Overlaps of crystal structures.**

Performance of a simplified aiFF form

The use of the extended functional form of aiFFs in the lattice energy minimizations instead of the simpler exp-6-1 form (not including a polynomial in front of exponential, damping functions, etc., see “Methods”) used in some empirical FFs leads to enormous improvements in rankings. To quantify such improvements, we performed lattice energy minimizations with the exp-6-1 form of aiFFs, fitted using the same level of theory as in the case of the extended form, for systems I, II, IV, and XXII, achieving rankings of 138, 2231, 49, and 60, respectively, while the rankings of the extended aiFF form for these systems are 1 or 2, see Table 1. The main reason for this improvement is that RMSEs for negative interaction energies are from 2.3 to 5.3 times smaller in the latter case (these ratios are correlated with the number of fit parameters: e.g., 30 and 270 for the exp-6-1 and the extended form, respectively, in the case of system IV).
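For reference, the two simple atom-atom forms named in “Methods” (Lennard-Jones 12-6-1 and Buckingham exp-6-1) can be written down in a few lines. The following is a minimal sketch for a single atom pair, not the authors' code; the function names and the toy parameter values are assumptions made for illustration, and the Coulomb term is written as q_aq_b/r exactly as in the text, so any unit conversion factor is assumed to be absorbed into the charges.

```python
import math

def lennard_jones_12_6_1(r, a12, c6, qa, qb):
    """A12/r^12 - C6/r^6 + qa*qb/r for one atom-atom pair at distance r."""
    return a12 / r**12 - c6 / r**6 + qa * qb / r

def buckingham_exp_6_1(r, a, beta, c6, qa, qb):
    """A*exp(-beta*r) - C6/r^6 + qa*qb/r for one atom-atom pair at distance r."""
    return a * math.exp(-beta * r) - c6 / r**6 + qa * qb / r

# Toy check with made-up parameters (not fitted values): for neutral atoms the
# 12-6 part has its minimum at r = (2*A12/C6)**(1/6) with depth -C6**2/(4*A12).
r_min = (2.0 * 1.0 / 1.0) ** (1.0 / 6.0)
well_depth = lennard_jones_12_6_1(r_min, a12=1.0, c6=1.0, qa=0.0, qb=0.0)
```

Each atom pair carries only a handful of adjustable parameters in these forms, which is consistent with the much smaller parameter count quoted above for the exp-6-1 fit of system IV.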

Performance of an empirical FF

In order to quantify better the predictive power of our approach, calculations analogous to those described above have been performed with an empirical FF. We have chosen the W99 FF⁴⁴ with point charges computed by us using the CHELPG method⁴⁵. For the 18 experimental polymorphs considered, the W99+charges FF found 33% of them at rank 10 or better, while the analogous result for aiFF (without the pDFT+D step) is 94%. This amounts to a qualitative difference for technological applications. For more details on CSPs with the W99+charges FF, see Supplementary Tables 1 and 2.
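The 94% figure can be reproduced from the ranks reported in the text. The sketch below is illustrative only: the helper name `success_rate` is hypothetical, and because the exact ranks of the seven polymorphs reported as “2–6” are not restated here, each is represented by its upper bound of 6, which does not affect any cutoff of 6 or larger.

```python
def success_rate(ranks, cutoff):
    """Fraction of experimental polymorphs found at rank <= cutoff."""
    hits = sum(1 for rank in ranks if rank <= cutoff)
    return hits / len(ranks)

# aiFF ranks before the pDFT+D step, as summarized in the text:
# 5 polymorphs at rank 1, 7 at ranks 2-6 (placeholder value 6, see above),
# ranks 9, 9, and 16, plus the second polymorphs at ranks 8, 4, and 8.
aiff_ranks = [1] * 5 + [6] * 7 + [9, 9, 16] + [8, 4, 8]

top10 = success_rate(aiff_ranks, cutoff=10)   # 17 of 18 polymorphs
top16 = success_rate(aiff_ranks, cutoff=16)   # all 18 polymorphs
```

With these inputs, 17 of the 18 polymorphs fall within rank 10 (about 94%), and all 18 fall within rank 16.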

Alternative CSP(aiFF) protocol

One may ask why pDFT+D calculations are needed to improve the rankings, while several comparisons on benchmark interaction energies, see, e.g., refs. ^42,43, show that SAPT(DFT) is nearly as accurate as CCSD(T) and more accurate than DFT+D methods. The main reason is that what is used in CSPs are aiFFs, and they include additional uncertainties due to fitting. Although the average fit error for negative interaction energies is only ~1 kJ/mol, errors may be larger at some configurations. If a configuration with a too negative interaction energy is important for a polymorph, this polymorph may become overly stable and therefore too highly ranked. Two other possible reasons, basis set size and neglect of many-body effects in CSP(aiFF), are discussed in Supplementary Information and found unlikely to be a reason. To improve the predictions from Stage 4, we have developed an alternative version of our method, alt-CSP(aiFF). After executing the CSP(aiFF) protocol less the pDFT+D stage, the geometries of 20 polymorphs with the lowest lattice energies are examined and consecutive nearest neighbor dimers identified. Then SAPT calculations are performed for these dimers, the aiFF is refitted, and lattice minimizations for the 20 polymorphs are performed with the new aiFF. This procedure is iterated until the energies of the 5x5x5 clusters extracted from each polymorph computed in two ways: just from the aiFF and in a hybrid way, replacing the aiFF interaction energies by the available SAPT ones, are the same to within some threshold. We have applied alt-CSP(aiFF) to two of the worst ranking crystals from Table 1: system XVI (rank 16) and fluorouracil (rank 9). In each case, alt-CSP(aiFF) resulted in the experimental polymorph at rank 1, while RMSD₂₀ was reduced from 0.29 to 0.15 Å and from 0.61 to 0.42 Å, respectively. Thus, alt-CSP(aiFF) can be used without the pDFT+D stage. 
However, the additional ab initio calculations are about as expensive as the pDFT+D ones, so there is no gain in terms of efficiency.
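The control flow of the alt-CSP(aiFF) iteration described above can be sketched as follows. This is an assumption-laden outline rather than the authors' implementation: every callable (`extract_dimers`, `run_sapt`, `refit`, `minimize`, `energy_ff`, `energy_hybrid`) is a hypothetical placeholder supplied by the caller, and the exact point in the loop at which convergence is tested is our reading of the text.

```python
def alt_csp_refine(ff, polymorphs, extract_dimers, run_sapt, refit, minimize,
                   energy_ff, energy_hybrid, threshold, max_iterations=20):
    """Iteratively refit a force field on dimers taken from predicted crystals.

    All callables are hypothetical placeholders supplied by the caller.
    Returns (force_field, polymorphs, iterations_used).
    """
    sapt_energies = {}  # dimer -> ab initio interaction energy computed so far
    for iteration in range(1, max_iterations + 1):
        # Identify nearest-neighbor dimers and compute the missing SAPT energies.
        for polymorph in polymorphs:
            for dimer in extract_dimers(polymorph):
                if dimer not in sapt_energies:
                    sapt_energies[dimer] = run_sapt(dimer)
        # Refit the force field and re-minimize the same polymorphs with it.
        ff = refit(ff, sapt_energies)
        polymorphs = [minimize(p, ff) for p in polymorphs]
        # Stop when pure-FF and hybrid (FF + SAPT) cluster energies agree.
        worst_gap = max(abs(energy_ff(p, ff) - energy_hybrid(p, ff, sapt_energies))
                        for p in polymorphs)
        if worst_gap <= threshold:
            return ff, polymorphs, iteration
    raise RuntimeError("alt-CSP refinement did not converge")

# Toy stand-ins: a "polymorph" is a number, its only dimer is itself, and the
# "force field" is a lookup table that the refit makes exact.
toy_ff, toy_polymorphs, n_iter = alt_csp_refine(
    ff={}, polymorphs=[1.0, 2.0],
    extract_dimers=lambda p: [p],
    run_sapt=lambda d: -2.0 * d,
    refit=lambda ff, data: dict(data),
    minimize=lambda p, ff: p,
    energy_ff=lambda p, ff: ff.get(p, 0.0),
    energy_hybrid=lambda p, ff, data: data.get(p, ff.get(p, 0.0)),
    threshold=1e-9,
)
```

The toy run converges in a single pass because the stand-in refit reproduces the stand-in SAPT energies exactly; a real fit would leave residual errors and typically need further iterations.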

Cost comparisons

The method proposed is not only highly reliable, as shown above, but also very efficient compared to alternative ways of combining FF-based CSPs with pDFT+D calculations. To demonstrate this, we show in Fig. 3 the costs of three possible CSP strategies in terms of single-core wall times, using system I as an example. Note that calculations of this type are typically performed on hundreds of cores, so the actual wall time is just a couple of hours for Strategy 1, the approach proposed here. The majority of time for Strategy 1, 7 core-days, is spent on the development of an aiFF, and most of this time is used to compute SAPT(DFT) interaction energies for 706 dimer configurations, with very little time spent on fitting these energies. The next stage, the packing and minimization (PACK+MIN) of hundreds of thousands of crystals, requires less than a third of a day. The final stage, pDFT+D calculations for the top 20 polymorphs at aiFF geometries, requires approximately one day. Hypothetical Strategy 2 differs from Strategy 1 by the use of an empirical FF in the PACK+MIN stage and by performing pDFT+D calculations for 100 polymorphs with reoptimization of geometries (this number of polymorphs was chosen as a trade-off between success rate and computational costs). The time required for the latter stage would be 70 core-days, so Strategy 2 is about an order of magnitude more expensive than Strategy 1. Moreover, if the W99+charges FF were used, the success rate of Strategy 2 on the set of 18 polymorphs examined here would be 72% (see Supplementary Table 2), while the success rate of Strategy 1 is 100% already with 16 top-ranked polymorphs. All the PACK+MIN bars appear to be of about the same height for aiFF and for the empirical FF. This is because the calculation of the lattice energy is only about two times more expensive in the former case. Hypothetical Strategy 3 performs pDFT+D calculations with geometry optimization for all 25,500 polymorphs produced by PACK+MIN.
This strategy would have a very high reliability (since practice indicates that the experimental polymorphs are almost always included in such a large pool of candidate structures), but it would be extremely costly, 49 single-core years, and hence not practical (although possible if a few thousand cores were used). With the use of an empirical FF, one can set the number of polymorphs included in the pDFT+D stage anywhere between 100 and 25,000, systematically increasing costs and reliability relative to Strategy 2. However, with W99+charges and our set of polymorphs, the success rate would remain at 72% until the number of polymorphs is at least 589. For Strategies 2 and 3, the PACK+MIN stage can be replaced by any other protocol producing the required number of candidate polymorphs, with insignificant effects on the total timings.
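The cost figures quoted for system I can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in the text; the variable names are ours, the PACK+MIN cost is taken at its quoted upper bound for all strategies, and the Strategy 3 estimate assumes that the cost of pDFT+D with geometry optimization grows linearly with the number of polymorphs. Perfect parallel scaling is also assumed when converting core-days to wall time.

```python
DAYS_PER_YEAR = 365.0

aiff_development = 7.0         # core-days, Strategy 1 (SAPT(DFT) grid plus fit)
pack_and_minimize = 1.0 / 3.0  # core-days, upper bound ("less than a third of a day")
dft_fixed_top20 = 1.0          # core-days, pDFT+D at fixed aiFF geometries
dft_optimized_100 = 70.0       # core-days, pDFT+D with reoptimization, 100 polymorphs

strategy1 = aiff_development + pack_and_minimize + dft_fixed_top20
strategy2 = pack_and_minimize + dft_optimized_100
cost_ratio = strategy2 / strategy1

# Assumption: optimized pDFT+D cost is linear in the number of polymorphs.
per_polymorph = dft_optimized_100 / 100
strategy3_years = (pack_and_minimize + 25_500 * per_polymorph) / DAYS_PER_YEAR

def wall_hours(core_days, cores):
    """Wall time in hours, assuming perfect parallel scaling over `cores`."""
    return core_days * 24.0 / cores
```

Under these assumptions Strategy 1 costs about 8.3 core-days (roughly 2 hours on 100 cores), Strategy 2 is about 8.4 times more expensive, and the linear extrapolation for Strategy 3 gives about 48.9 core-years, in line with the 49 single-core years quoted above.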

**Fig. 3: Computational cost of the considered CSP protocols.**

Neglected effects

Since aiFFs are sums of two-body interactions, they neglect the many-body effects mentioned earlier and discussed in Supplementary Information. While we show that these effects are not critical in CSPs for the crystals considered here, they may be significant for some other crystals^46,47,48. The most important many-body effect, the many-body polarization, can be accounted for using polarizable aiFFs, which can be developed using autoPES but are not yet implemented in our CSP codes. In Supplementary Information, we also explain why the relatively small basis set that we used is adequate for CSPs. A much more important neglected effect is the flexibility of monomers. Although the monomers considered by us were assumed to be rigid, the proposed CSP(aiFF) protocol can be applied to monomers with soft degrees of freedom. Such monomers may be significantly deformed in crystals compared to their equilibrium structures in the gas phase. The recent version of autoPES³⁸ has the capability of computing interaction energies accounting for all or selected intramonomer degrees of freedom, and most CSP codes can perform packing and minimization including all degrees of freedom; therefore, such predictions can still be made completely from first principles. However, the costs of such calculations increase steeply with the total number of degrees of freedom. One way around this problem is to assume separation of inter- and intramonomer degrees of freedom in Stage 2, as has been done in all biomolecular FFs and in all FFs used in flexible-monomer CSPs. Since our aiFFs depend only on separations between atoms of different monomers, interaction energies can be computed for arbitrary monomer configurations. Such a “flexibilized” intermonomer FF can replace the intermonomer component of current empirical FFs, while the intramonomer component can be kept unchanged. One can expect that such a replacement should lead to improved predictions in flexible-monomer CSPs.

Other effects neglected by the present version of CSP(aiFF) are thermal and entropic ones, as the results presented by us correspond to 0 K temperature. For some crystals, these effects can change the rankings of polymorphs, as pointed out by Brandenburg and Grimme³⁹ and recently investigated extensively by Hoja et al.⁴¹. The thermal and entropic effects can be routinely computed using pDFT+D, although such calculations are several times more expensive than pDFT+D calculations with static geometries. As a test, we have computed both effects for the 5 lowest lattice energy polymorphs of system XXII, leading to no change of rankings.

Concluding remarks

The first-principles CSP(aiFF) method developed here was applied to crystals of 15 rigid molecules with 18 known experimental polymorphs. When aiFFs are applied in CSPs for crystals of these molecules, 17, or 94%, of the polymorphs are ranked in the range 1–10, while the remaining one has rank 16. For comparison, analogous CSPs with the empirical W99+charges FF rank only 33% of polymorphs in the range 1–10, 3 experimental polymorphs are not found within 568 or more generated ones, and for two molecules predictions were not possible due to missing atom types. The ability of CSP(aiFF) to minimize tens of thousands of polymorphs is its key advantage over alternative approaches, which have to use low-accuracy methods at this stage, often erroneously discarding correct structures. Upon a subsequent reranking of the top 20 polymorphs with pDFT+D calculations at fixed aiFF geometries, for all 15 molecules an experimental polymorph became ranked as number 1, while the second polymorphs became ranked as numbers 2, 2, and 3. The pDFT+D step can be omitted if aiFFs are iteratively improved by performing ab initio calculations on dimers extracted from crystals predicted with the previous iteration of an aiFF [the alt-CSP(aiFF) protocol]. The proposed CSP protocol not only shows ultimate predictive power for the systems tested, but is also inexpensive compared to other highly predictive approaches. On about a hundred cores, complete predictions for any of the systems investigated here take less than a day, including the aiFF generation. The CSP(aiFF) protocol requires minimal human involvement, consisting only of input preparation for autoPES, UPACK, and pDFT+D calculations, and includes only free software with open source codes. Limitations of the current implementation of the CSP(aiFF) methodology have been discussed, in particular the neglect of many-body interactions and the rigid-monomer approximation.
Although the test set included only homogeneous crystals, there are no reasons to doubt that the method will work equally well for cocrystals including salts since the quality of aiFFs does not depend on dimers being homogeneous or heterogeneous (of course, for two-component cocrystals, three PESs have to be developed). Also, while the largest of the test molecules included 22 atoms, the method should apply equally well to larger molecules since the relative accuracy of SAPT(DFT) does not change with system size³⁵. Of course, calculations will be more expensive as the size increases, but molecules with about 100 atoms are within reach. The effectiveness of the proposed CSP protocol is due to the use of the SAPT(DFT) method which is computationally efficient relative to other accurate electronic structure methods and due to the use of the autoPES method for fitting aiFFs since this method not only cuts the costs of such fits by orders of magnitude, but also reduces human effort of this most difficult to automate step almost to zero. An important element of the CSP(aiFF) protocol is that it replaces simple potential forms used in all earlier CSP protocols by an extended form capable of fitting ab initio interaction energies with significantly decreased uncertainties. An advantage of the proposed protocol is that it constitutes a complete first-principles procedure for investigating crystal structures and properties. Such a protocol should work equally well for any type of monomer, in contrast to the protocols using empirical FFs, which are expected to work well only for systems similar to those used in fitting such FFs. We believe that the overall effect of the proposed CSP protocol will be that the field of CSPs will move from the use of empirical FFs to aiFFs. 
This should increase the reliability of predictions and therefore, while CSPs have so far played at best an advisory role in technology developments, they may become a leading element in the development of novel crystalline materials. More generally, aiFFs can be used in several types of computational material design.

Methods

Monomer geometry minimization

In Stage 1, monomer geometries were optimized using ORCA^49,50 with the PBE⁵¹ functional and D3 correction⁵² in the aug-cc-pVTZ⁵³ basis set.

Ab initio calculations of interaction energies

To make the CSP(aiFF) protocol practical, aiFFs have to be constructed in Stage 2 at reasonably low costs, but at the same time with small uncertainties, for monomers with dozens of atoms. This requires first that the ab initio method used to compute intermolecular interaction energies is inexpensive and accurate. It appears that the best current choice for such calculations is SAPT^36,54, an ab initio method that computes interaction energies directly, starting from isolated monomers and imposing the correct electron permutational symmetry. We applied the SAPT variant based on DFT, SAPT(DFT)^55,56; see ref. ³⁵ for a recent review of this method. SAPT(DFT) and CCSD(T) calculations scale as $\mathcal{O}(n^{5})$ and $\mathcal{O}(n^{7})$ with system size, respectively, where n is the number of electrons, and for dimers with a couple of dozen atoms, SAPT(DFT) calculations are about two orders of magnitude less expensive than CCSD(T) calculations. The recently developed SAPT(DFT) algorithms and effective computer codes^35,42 can be used to compute thousands of grid points for dimers with ~100-atom monomers using reasonable computer resources, and this can be achieved in a few days if a sufficient number of computer cores is available.
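As a rough illustration of what the two scaling laws imply, one can write the cost ratio with method-dependent prefactors $c_{5}$ and $c_{7}$. These prefactors are hypothetical symbols introduced here; their values are not given in the text, so only the trend with n follows from the stated scalings:

$$\frac{t_{\mathrm{CCSD(T)}}}{t_{\mathrm{SAPT(DFT)}}}\approx \frac{c_{7}\,n^{7}}{c_{5}\,n^{5}}=\frac{c_{7}}{c_{5}}\,n^{2}$$

Under this idealized model, doubling the number of electrons increases the cost advantage of SAPT(DFT) by a factor of four.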

The details of calculations of SAPT(DFT)^55,56,57,58 first- and second-order interaction energies are as follows. We used the density-fitting version^35,59,60 in the SAPT2020⁶¹ codes interfaced with the ORCA package^49,50 for calculations on monomers. The PBE⁵¹ functional was used in DFT calculations applying the gradient-regulated asymptotic correction (GRAC)^62,63. The aug-cc-pVDZ⁵³ basis set plus a set of 3s3p2d2f midbond functions (default of autoPES) was used in the monomer-centered plus basis set (MC⁺BS) format⁶⁴. The term accounting for higher-order induction and exchange-induction effects, denoted as $\delta E_{\mathrm{int,resp}}^{\mathrm{HF}}$ and obtained as the difference between Hartree–Fock (HF) interaction energies and the sum of appropriate SAPT(HF) first- and second-order corrections in their response (resp) versions, was included for all systems except system XIII, benzene, DNBT, and TNB. We use a short-hand notation for SAPT interaction energy components: “indx” is the sum of the second-order induction and exchange-induction components, as well as of the $\delta E_{\mathrm{int,resp}}^{\mathrm{HF}}$ contribution, “dispx” is the sum of the dispersion and exchange-dispersion components, “elst” is the electrostatic component, and “exch” is the first-order exchange component. The relative importance of attractive components is illustrated in Supplementary Fig. 4.

Generation of aiFFs

In all past CSPs, only simple FFs have been used at the lattice-energy minimization stage. The two most often used forms are the Lennard-Jones 12-6-1 potential, A₁₂/r¹² − C₆/r⁶ + q_aq_b/r, and the Buckingham exp-6-1 potential, Ae^{−βr} − C₆/r⁶ + q_aq_b/r, where r is an atom-atom distance and A₁₂, A, β, C₆, q_a, and q_b are adjustable parameters. SAPT(DFT)-based aiFFs have been used in CSPs, but always with the exp-6-1 potential form in the minimization stage. This form is not pliable enough to fit ab initio data well, leading to uncertainties of a few kJ/mol, too large for reliable CSPs. In contrast, the extended form used by us in the CSP(aiFF) protocol can fit ab initio data with uncertainties of about 1 kJ/mol, which we show to be sufficient for reliable CSPs. This functional form is^37,38

$$V=\sum_{a\in A,\,b\in B}\left\{\left[1+\sum_{i=1,2}a_{i}^{ab}(r_{ab})^{i}\right]e^{\alpha^{ab}-\beta^{ab}r_{ab}}+\frac{A_{12}^{ab}}{(r_{ab})^{12}}-\sum_{n=6,8}f_{n}(\delta_{n}^{ab},r_{ab})\frac{C_{n}^{ab}}{(r_{ab})^{n}}+f_{1}(\delta_{1}^{ab},r_{ab})\frac{q_{a}q_{b}}{r_{ab}}\right\}$$

(1)

where a (b) goes over the sets of atoms in monomer A (B), respectively, α^ab, β^ab, $a_{i}^{ab}$, $A_{12}^{ab}$ are repulsion-energy parameters, $C_{n}^{ab}$ are long-range dispersion plus induction energy parameters, q_x, x = a, b, are atomic partial charges, $\delta_{n}^{ab}$ are damping parameters, and f_n are the Tang-Toennies⁶⁵ damping functions: $f_{n}(\delta,r)=1-e^{-\delta r}\sum_{m=0}^{n}(\delta r)^{m}/m!$. Long-range interaction energies were computed using an ab initio-distributed approach. The damping parameters in the dispersion plus induction term were fitted separately to the sum of all close-range second-order components plus $\delta E_{\mathrm{int,resp}}^{\mathrm{HF}}$, while $\delta_{1}^{ab}$ were fitted to electrostatic energies. All PESs developed here are two-body, 6-dimensional PESs, i.e., assume rigid monomers. The aiFFs were constructed as sums of these two-body PESs. One should add that the extended form of FFs given by Eq. (1) has been used in some published CSPs, but only in molecular dynamics (MD) simulations that can replace the pDFT+D calculations of Stage 5. Note that MD calculations are about as expensive as pDFT+D ones and significantly more expensive than the minimizations of Stage 4. Graphs showing SAPT(DFT) interaction energy components and their fits as functions of the distance R between the centers of mass of monomers are included in Supplementary Fig. 5. One can see in particular that the ab initio electrostatic energies are reproduced very well for R’s larger than the van der Waals minimum distance R_vdW despite using only damped charge-charge interactions, i.e., omitting higher multipolar terms.
While the use of the latter terms in empirical FFs improves the predictions compared to the use of charges only^66,67, our results show that higher-rank multipoles are not needed if the electrostatic function includes damping and is fitted to ab initio electrostatic energies. The worsening of the agreement with ab initio values seen for R < R_vdW is inevitable and is due to the charge-overlap effects that are not proportional to inverse powers of R (ref. ³⁶). These effects are accounted for in the overall fit by the first term in Eq. (1). This is why the total fitted and ab initio interaction energies are in excellent agreement for all R.
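To make the structure of Eq. (1) concrete, the following minimal sketch evaluates the Tang-Toennies damping function and the energy of a single atom pair in the extended form. It is not the autoPES or modified WMIN/UPACK implementation; the function names and the dictionary keys used for the parameters are assumptions chosen for illustration, and units are left to the caller.

```python
import math

def tang_toennies(n, delta, r):
    """f_n(delta, r) = 1 - exp(-delta*r) * sum_{m=0..n} (delta*r)^m / m!"""
    x = delta * r
    partial_sum = sum(x**m / math.factorial(m) for m in range(n + 1))
    return 1.0 - math.exp(-x) * partial_sum

def extended_pair_energy(r, p):
    """One atom-atom term of Eq. (1); p is a dict of pair parameters (hypothetical keys)."""
    # Polynomial-times-exponential repulsion plus the A12/r^12 term.
    repulsion = ((1.0 + p["a1"] * r + p["a2"] * r**2)
                 * math.exp(p["alpha"] - p["beta"] * r)
                 + p["A12"] / r**12)
    # Damped dispersion plus induction, n = 6 and 8 (enters with a minus sign).
    dispersion = sum(tang_toennies(n, p[f"delta{n}"], r) * p[f"C{n}"] / r**n
                     for n in (6, 8))
    # Damped charge-charge electrostatics.
    electrostatics = tang_toennies(1, p["delta1"], r) * p["qa"] * p["qb"] / r
    return repulsion - dispersion + electrostatics

def dimer_energy(pairs):
    """Sum of pair terms; pairs is an iterable of (r_ab, pair_parameters)."""
    return sum(extended_pair_energy(r, p) for r, p in pairs)
```

The damping functions go to zero at short range and to one at long range, so the inverse-power terms are switched off where charge overlap matters and the exponential term takes over, as described above.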

Crystal packing and lattice-energy minimization

Since none of the available CSP packages is capable of using the form of aiFFs given by Eq. (1), we have modified two such packages, MOLPAK⁶⁸ and UPACK⁶⁹, to be applied in Stages 3 and 4. MOLPAK uses the concept of coordination geometry and by default searches in 26 space groups: P1, $P\bar{1}$, P2, Pm, Pc, P2₁, P2/c, P2₁/m, P2/m, P2₁/c, Cc, C2, C2/c, Pnn2, Pba2, Pnc2, P22₁, Pmn2₁, Pma2, P2₁2₁2, P2₁2₁2₁, Pca2₁, Pna2₁, Pnma, Fdd2, Pbcn, and Pbca. It generates polymorphs on a grid in a three-dimensional search space by systematically varying the orientation of the central molecule in steps of 10°. This generation is performed in all 51 coordination geometries. The packing in the unit cell is controlled by a simple repulsive 1/r¹² interaction between atoms: the molecules are brought together until an energy threshold is reached. This step provides an initial set of 6859 angle combinations × 51 coordination geometries = 349,809 hypothetical polymorphs. From this set, the 25,500 densest polymorphs, 500 from each coordination geometry, are minimized using the program WMIN⁷⁰. The default functional form of the FF in WMIN is exp-6-1. We have modified this code to include FFs of the form of Eq. (1).
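
The counts quoted above can be checked with a short Python sketch. Note that 6859 = 19³, which is consistent with 19 grid values for each of three rotation angles; the range of −90° to +90° used below is an assumption chosen to reproduce that number, and the `densest` helper with its `(density, label)` tuples is a hypothetical stand-in for MOLPAK's internal selection.

```python
import heapq
from itertools import product

ANGLE_VALUES_DEG = range(-90, 91, 10)   # assumed range: 19 values per angle in 10-degree steps
N_COORDINATION_GEOMETRIES = 51
KEPT_PER_GEOMETRY = 500

# All orientation triples of the central molecule on the grid.
orientations = list(product(ANGLE_VALUES_DEG, repeat=3))
n_hypothetical = len(orientations) * N_COORDINATION_GEOMETRIES
n_minimized = KEPT_PER_GEOMETRY * N_COORDINATION_GEOMETRIES


def densest(structures, keep=KEPT_PER_GEOMETRY):
    """Return the `keep` densest entries from an iterable of (density, label) tuples."""
    return heapq.nlargest(keep, structures, key=lambda s: s[0])


if __name__ == "__main__":
    print(len(orientations), n_hypothetical, n_minimized)
```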

UPACK generates random crystal structures in 13 default space groups: C2, C2/c, Cc, P1, $P\bar{1}$, P2₁, P2₁/c, P2₁2₁2₁, Pbca, Pc, Pbcn, Pca2₁, and Pna2₁. It can use any 12-6-1 potential, and we selected the OPLS-AA FF⁷¹. The packing stage in UPACK is divided into two steps. In the first step, only 500 reasonable structures per symmetry group are randomly generated in an unrestricted way and are then used to estimate cell dimensions. In the second step, the random generation is performed in a restricted coordinate space using this cell estimate. Most of the generated structures are immediately rejected using the criterion that the atom-atom 12-6-1 interaction energy is not allowed to be larger than 2000 kJ/mol for any pair. Generation and energy-criterion testing continue until 5000 polymorphs per symmetry group, i.e., a total of 65,000 polymorphs, are found. This second step also involves a rough optimization of lattice energies. The resulting list is subjected to clustering⁷² to remove duplicates. Clustering reduces the pool significantly. For example, for system XXII it is reduced to 13,014 polymorphs.
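
The pair-energy rejection criterion can be sketched in Python as follows. This is only an illustration of the idea, not UPACK's code: the Lennard-Jones plus Coulomb form, the function names, and the parameter values are assumptions (the numbers are not actual OPLS-AA parameters).

```python
COULOMB_KJ_MOL_ANGSTROM = 1389.35   # e^2 / (4 pi eps0) in kJ mol^-1 Angstrom
PAIR_THRESHOLD_KJ_MOL = 2000.0      # rejection threshold quoted in the text
N_SPACE_GROUPS = 13
ACCEPTED_PER_GROUP = 5000


def pair_energy(r, epsilon, sigma, q_a, q_b):
    """12-6-1 atom-atom energy in kJ/mol; r and sigma in Angstrom, charges in e."""
    sr6 = (sigma / r) ** 6
    lennard_jones = 4.0 * epsilon * (sr6 * sr6 - sr6)
    coulomb = COULOMB_KJ_MOL_ANGSTROM * q_a * q_b / r
    return lennard_jones + coulomb


def is_acceptable(pairs, threshold=PAIR_THRESHOLD_KJ_MOL):
    """Reject a trial structure as soon as any atom-atom pair exceeds the threshold."""
    return all(pair_energy(*p) <= threshold for p in pairs)


if __name__ == "__main__":
    relaxed = [(3.8, 0.3, 3.4, 0.1, -0.1), (4.5, 0.3, 3.4, 0.1, 0.1)]
    clashing = relaxed + [(1.0, 0.3, 3.4, 0.0, 0.0)]   # two atoms only 1 Angstrom apart
    print(is_acceptable(relaxed), is_acceptable(clashing))
    print(N_SPACE_GROUPS * ACCEPTED_PER_GROUP)          # total pool before clustering
```

Because `all` short-circuits, a clashing structure is discarded at the first offending pair, which is why such a test is cheap enough to apply to every randomly generated candidate.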

In Stage 4 of CSP(aiFF) realized with UPACK, all the polymorphs from the reduced set are minimized with tight thresholds. We have modified UPACK to be able to use FFs of the form of Eq. (1). We found that it is advantageous to perform Stage 4 first with the OPLS-AA FF, i.e., using the original UPACK path including clustering, and then minimize the reduced set using aiFF. This procedure was chosen not to save time, although it does result in minor savings, but to avoid minimizations ending up in “holes” of an FF, i.e., unphysical minima at very short intermonomer separations. By construction, 12-6-1 FFs do not have any holes, while exp-6-1 and our extended-form FFs almost always have holes (although behind barriers of about 100 kJ/mol, which is one of the constraints of the autoPES fitting). We found that aiFF minimizations starting from the OPLS-minimized structures almost never end up in holes. We could have easily avoided the use of OPLS by fitting a 12-6-1 FF to the ab initio data.
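
The origin of such holes can be seen in a one-dimensional Python sketch comparing an exp-6 pair potential with a 12-6 one. The parameter values below are illustrative assumptions, not fitted aiFF or OPLS-AA parameters, and `find_hole` is a hypothetical helper that simply scans a radial grid.

```python
import math


def exp6(r, a, b, c):
    """exp-6 pair energy: a*exp(-b*r) - c/r^6. The -c/r^6 term wins as r -> 0."""
    return a * math.exp(-b * r) - c / r ** 6


def lj126(r, epsilon, sigma):
    """12-6 pair energy: the r^-12 wall keeps rising as r -> 0."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def find_hole(potential, r_min=0.2, r_max=3.0, steps=2800):
    """Return (r_barrier, e_barrier) if the repulsive wall turns over inside
    [r_min, r_max], i.e. the highest energy is not at the shortest distance;
    return None when the wall rises monotonically toward r_min."""
    grid = [r_min + i * (r_max - r_min) / steps for i in range(steps + 1)]
    energies = [potential(r) for r in grid]
    i_max = max(range(len(grid)), key=energies.__getitem__)
    if i_max == 0:
        return None
    return grid[i_max], energies[i_max]


if __name__ == "__main__":
    # Illustrative parameters (kJ/mol, Angstrom).
    print(find_hole(lambda r: exp6(r, 300000.0, 3.6, 2400.0)))
    print(find_hole(lambda r: lj126(r, 0.3, 3.4)))
```

For the exp-6 form the scan finds a finite barrier, inside of which the energy falls without bound, whereas the 12-6 form returns `None`. A minimizer started from a structure already relaxed with a hole-free FF begins far outside any such barrier, which is consistent with the observation above.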

The two CSP packages modified by us produced almost identical predictions for the cases where we used both. MOLPAK was used for systems I, II, XII, XXII, nitromethane, and benzene. UPACK was used for the remaining systems, as well as for systems I, II, and XXII, which were also treated by MOLPAK. For these three systems, the rankings of the experimental crystal by the two packages were identical.

PLATON⁷³ was used for checking missed symmetries⁷⁴ and for space group transformations from non-standard to standard settings by assigning the target crystal the proper space group and cell parameters, leading to the data in Table 1. For example, for system II both MOLPAK and UPACK predicted the experimental crystal in P2₁/c symmetry, and PLATON transformed it to P2₁/n symmetry.

pDFT+D calculations

In Stage 5, periodic single-point DFT+D lattice energy calculations, i.e., without geometry optimizations, were performed for the 20 top-ranked polymorphs from aiFF minimizations using the PBE⁵¹ functional with pseudopotentials⁷⁵ plus the D3 dispersion correction⁵² with the Becke–Johnson (BJ) damping⁷⁶,⁷⁷. We used Quantum ESPRESSO (QE)⁷⁸,⁷⁹ codes, with plane-wave kinetic energy cutoffs of 340 and 3061 eV for the wave functions and charge densities, respectively.
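
Quantum ESPRESSO input files specify these cutoffs in Rydberg through the `ecutwfc` and `ecutrho` variables. The Python sketch below converts the quoted values; that they correspond to the round numbers 25 and 225 Ry is an inference from the arithmetic, not something stated in the text.

```python
RYDBERG_EV = 13.605693123   # 1 Ry in eV (CODATA 2018)


def ev_to_ry(energy_ev: float) -> float:
    """Convert an energy from eV to Rydberg."""
    return energy_ev / RYDBERG_EV


# Cutoffs quoted in the text, keyed by the corresponding QE input variable.
cutoffs_ev = {"ecutwfc": 340.0, "ecutrho": 3061.0}
cutoffs_ry = {name: round(ev_to_ry(value), 1) for name, value in cutoffs_ev.items()}

if __name__ == "__main__":
    print(cutoffs_ry)
```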

The zero-point vibrational energy (ZPVE) and thermal effects were calculated within the harmonic approximation using Phonopy 2.8.1⁸⁰ and VASP 5.4.4⁸¹,⁸²,⁸³,⁸⁴,⁸⁵ with the same DFT+D approach as applied in QE. The projector augmented-wave pseudopotentials⁸⁶,⁸⁷ were used. For the relaxation of the crystal, a cutoff of 1000 eV for the plane-wave basis set was used. The relaxation was stopped when the total-energy changes between two consecutive steps of the electronic and ionic loops were smaller than 10⁻⁵ and 0.5 × 10⁻² eV, respectively. Phonon calculations were performed at the Γ-point using a supercell of at least 10 Å length in each direction. As in the relaxation step, a cutoff of 1000 eV for the plane-wave basis set was used in the total energy calculation, together with a convergence threshold of 10⁻⁸ eV. Next, ZPVE and thermal effects were calculated on a mesh of 8 × 8 × 8 using the dynamical matrix built from the force constants of the displaced atoms in the supercell.
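
For orientation, the standard harmonic-approximation expressions behind these quantities can be written as a short Python sketch. It is not the Phonopy implementation: the function names are hypothetical, the frequencies are made-up values, and a real calculation averages over all q-points of the 8 × 8 × 8 mesh with their weights rather than summing a single list of modes.

```python
import math

PLANCK_EV_S = 4.135667696e-15      # Planck constant in eV s
BOLTZMANN_EV_K = 8.617333262e-5    # Boltzmann constant in eV/K
THZ = 1.0e12                       # Hz per THz


def zpve(freqs_thz):
    """Zero-point vibrational energy in eV: sum over modes of h*nu/2."""
    return sum(0.5 * PLANCK_EV_S * f * THZ for f in freqs_thz)


def vib_free_energy(freqs_thz, temperature_k):
    """Harmonic vibrational free energy in eV:
    F = sum_i [ h*nu_i/2 + kT * ln(1 - exp(-h*nu_i / kT)) ].
    Frequencies must be positive (no imaginary or zero modes)."""
    free_energy = zpve(freqs_thz)
    if temperature_k > 0.0:
        kt = BOLTZMANN_EV_K * temperature_k
        for f in freqs_thz:
            free_energy += kt * math.log1p(-math.exp(-PLANCK_EV_S * f * THZ / kt))
    return free_energy


if __name__ == "__main__":
    modes_thz = [1.5, 2.0, 3.2, 45.0, 90.0]   # illustrative frequencies only
    print(zpve(modes_thz), vib_free_energy(modes_thz, 300.0))
```

At T = 0 the free energy reduces to the ZPVE, and the thermal term is always negative, so low-frequency lattice modes dominate the temperature dependence of polymorph rankings.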

Data availability

The data that support the findings of this study are included within the Article and Supplementary Information. In particular, the .zip file contains coordinates and energies of all computed data points, parameters of the fits, and the crystallographic information files for a set of top-ranked polymorphs.

Code availability

The codes used for electronic structure calculations, fitting, CSPs, and pDFT+D calculations (SAPT, ORCA, autoPES, which is part of the SAPT package, MOLPAK, UPACK, Quantum ESPRESSO, and VASP) are available on the web, and the links are provided in the references of the main paper and in the Supplementary Information. A patch to UPACK is available on the SAPT web site. A FORTRAN program computing the fitted potentials is included in the Supplementary_Data_1.zip file.

References

Cruz-Cabeza, A. J., Reutzel-Edens, S. M. & Bernstein, J. Facts and fictions about polymorphism. Chem. Soc. Rev. 44, 8619–8635 (2015).
Bučar, D., Lancaster, R. W. & Bernstein, J. Disappearing Polymorphs Revisited. Angew. Chem. Int. Ed. 54, 6972–6993 (2015).
Hilfiker, R. & von Raumer, M. (eds.) Polymorphism in the Pharmaceutical Industry: Solid Form and Drug Development (Wiley-VCH, Weinheim, Germany, 2019).
Bauer, J. et al. Ritonavir: An Extraordinary Example of Conformational Polymorphism. Pharm. Res 18, 859–866 (2001).
Chemburkar, S. R. et al. Dealing with the Impact of Ritonavir Polymorphs on the Late Stages of Bulk Drug Process Development. Org. Process Res. Dev. 4, 413–417 (2000).
Waknine, Y. Rotigotine patch recalled due to drug crystallization. Medscape (2008).
Rietveld, I. B. & Ceolin, R. Rotigotine: Unexpected Polymorphism with Predictable Overall Monotropic Behavior. J. Pharm. Sci. 104, 4117–4122 (2015).
Mortazavi, M. et al. Computational polymorph screening reveals late-appearing and poorly-soluble form of rotigotine. Comm. Chem. 2, 70 (2019).
Badgujar, D. M., Talawar, M. B., Asthana, S. N. & Mahulikar, P. P. Advances in science and technology of modern energetic materials: An overview. J. Hazard. Mater. 151, 289–305 (2008).
Zhang, C. Origins of the Energy and Safety of Energetic Materials and of the Energy & Safety Contradiction. Propellants Explos. Pyrotech. 43, 855–856 (2018).
Jurchescu, O. D. et al. Effects of polymorphism on charge transport in organic semiconductors. Phys. Rev. B 80, 085201 (2009).
Maddox, J. Crystals from first principles. Nature 335, 201 (1988).
Gavezzotti, A. Are Crystal Structures Predictable? Acc. Chem. Res. 27, 309–314 (1994).
Lommerse, J. P. M. et al. A test of crystal structure prediction of small organic molecules. Acta Cryst. B 56, 697–714 (2000).
Motherwell, W. D. S. et al. Crystal structure prediction of small organic molecules: a second blind test. Acta Cryst. B 58, 647–661 (2002).
Day, G. M. et al. A third blind test of crystal structure prediction. Acta Cryst. B 61, 511–527 (2005).
Day, G. M. et al. Significant progress in predicting the crystal structures of small organic molecules - a report on the fourth blind test. Acta Cryst. B 65, 107–125 (2009).
Bardwell, D. A. et al. Towards crystal structure prediction of complex organic compounds - a report on the fifth blind test. Acta Cryst. B 67, 535–551 (2011).
Reilly, A. M. et al. Report on the sixth blind test of organic crystal-structure prediction methods. Acta Cryst. B 72, 439–459 (2016).
Nyman, J. & Day, G. M. Static and lattice vibrational energy differences between polymorphs. CrystEngComm 17, 5154–5165 (2015).
Ryan, K., Lengyel, J. & Shatruk, M. Crystal Structure Prediction via Deep Learning. J. Am. Chem. Soc. 140, 10158–10168 (2018).
Neumann, M. A., Leusen, F. J. J. & Kendrick, J. A Major Advance in Crystal Structure Prediction. Angew. Chem. Int. Ed. 47, 1–5 (2008).
Neumann, M. A. Tailor-Made Force Fields for Crystal-Structure Prediction. J. Phys. Chem. B 112, 9810–9829 (2008).
Neumann, M. A. GRACE; Avant-garde Materials Simulation: St-Germain-en-Laye, France. https://www.avmatsim.eu/ (2008).
Day, G. M. Current approaches to predicting molecular organic crystal structures. Crystallogr. Rev. 17, 3–52 (2011).
Price, S. L. Predicting crystal structures of organic compounds. Chem. Soc. Rev. 43, 2098–2111 (2014).
Szalewicz, K. Determination of Structure and Properties of Molecular Crystals from First Principles. Acc. Chem. Res. 47, 3266–3274 (2014).
Hoja, J., Reilly, A. M. & Tkatchenko, A. First-principles modeling of molecular crystals: structures and stabilities, temperature and pressure. Wiley Interdiscip. Rev.: Comput. Mol. Sci. 7, e1294 (2017).
Oganov, A. R. Crystal structure prediction: reflections on present status and challenges. Faraday Discuss. R. Soc. Chem. 211, 643–660 (2018).
Price, S. L. & Brandenburg, J. G. Molecular crystal structure prediction. In Non-Covalent Interactions in Quantum Chemistry and Physics, (eds de-la Roza, A. O. & DiLabio, G. A.) 333–363 (Elsevier, 2017).
Price, S. L. Is zeroth order crystal structure prediction (CSP_0) coming to maturity? What should we aim for in an ideal crystal structure prediction code? Faraday Discuss. R. Soc. Chem. 211, 9–30 (2018).
Podeszwa, R., Bukowski, R., Rice, B. M. & Szalewicz, K. Potential energy surface for cyclotrimethylene trinitramine dimer from symmetry-adapted perturbation theory. Phys. Chem. Chem. Phys. 9, 5561–5569 (2007).
Podeszwa, R., Rice, B. M. & Szalewicz, K. Predicting Structure of Molecular Crystals from First Principles. Phys. Rev. Lett. 101, 115503 (2008).
Misquitta, A. J., Welch, G. W. A., Stone, A. J. & Price, S. L. A first principles prediction of the crystal structure of C₆Br₂ClFH₂. Chem. Phys. Lett. 456, 105–109 (2008).
Garcia, J., Podeszwa, R. & Szalewicz, K. SAPT codes for calculations of intermolecular interaction energies. J. Chem. Phys. 152, 184109 (2020).
Jeziorski, B., Moszyński, R. & Szalewicz, K. Perturbation Theory Approach to Intermolecular Potential Energy Surfaces of van der Waals Complexes. Chem. Rev. 94, 1887–1930 (1994).
Metz, M. P., Piszczatowski, K. & Szalewicz, K. Automatic Generation of Intermolecular Potential Energy Surfaces. J. Chem. Theory Comput. 12, 5895–5919 (2016).
Metz, M. P. & Szalewicz, K. Automatic Generation of Flexible-Monomer Intermolecular Potential Energy Surfaces. J. Chem. Theory Comput. 16, 2317–2339 (2020).
Brandenburg, J. G. & Grimme, S. Organic crystal polymorphism: a benchmark for dispersion-corrected mean-field electronic structure methods. Acta Cryst. B 72, 502–513 (2016).
Whittleton, S. R., de-la Roza, A. O. & Johnson, E. R. Exchange-Hole Dipole Dispersion Model for Accurate Energy Ranking in Molecular Crystal Structure Prediction. J. Chem. Theory Comput. 13, 441–450 (2017).
Hoja, J. et al. Reliable and practical computational description of molecular crystal polymorphs. Sci. Adv. 5, eaau3338 (2019).
Garcia, J. & Szalewicz, K. Ab Initio Extended Hartree–Fock plus Dispersion Method Applied to Dimers with Hundreds of Atoms. J. Phys. Chem. A 124, 1196–1203 (2020).
Taylor, D. C. et al. Blind test of density-functional-based methods on intermolecular interaction energies. J. Chem. Phys. 145, 124105 (2016).
Williams, D. E. Improved intermolecular force field for molecules containing H, C, N, and O atoms, with application to nucleoside and peptide crystals. J. Comput. Chem. 22, 1154–1166 (2001).
Breneman, C. M. & Wiberg, K. B. Determining atom-centered monopoles from molecular electrostatic potentials. The need for high sampling density in formamide conformational analysis. J. Comp. Chem. 11, 361 (1990).
Welch, G. W. A., Karamertzanis, P. G., Misquitta, A. J., Stone, A. J. & Price, S. L. Is the Induction Energy Important for Modeling Organic Crystals? J. Chem. Theory Comput. 4, 522–532 (2008).
Karamertzanis, P. G. et al. Modeling the interplay of inter- and intramolecular hydrogen bonding in conformational polymorphs. J. Chem. Phys. 128, 244708 (2008).
Greenwell, C. et al. Overcoming the difficulties of predicting conformational polymorph energetics in molecular crystals via correlated wavefunction methods. Chem. Sci. 11, 2200–2214 (2020).
Neese, F. The ORCA program system. Wiley Interdiscip. Rev.: Comput. Mol. Sci. 2, 73–78 (2012).
Neese, F. ORCA, an ab initio, DFT, and semiempirical electronic structure package. with contributions from U. Becker, et al. https://orcaforum.kofo.mpg.de.
Perdew, J. P., Burke, K. & Ernzerhof, M. Generalized Gradient Approximation Made Simple. Phys. Rev. Lett. 77, 3865–3868 (1996).
Grimme, S., Antony, J., Ehrlich, S. & Krieg, H. A consistent and accurate ab initio parametrization of density functional dispersion correction (DFT-D) for the 94 elements H-Pu. J. Chem. Phys. 132, 154104 (2010).
Kendall, R. A., Dunning, T. H. Jr & Harrison, R. J. Electron-affinities of the first-row atoms revisited. Systematic basis sets and wave functions. J. Chem. Phys 96, 6796–6806 (1992).
Szalewicz, K. Symmetry-adapted perturbation theory of intermolecular forces. Wiley Interdisc. Rev. Comp. Mol. Sci. 2, 254–272 (2012).
Misquitta, A. J., Podeszwa, R., Jeziorski, B. & Szalewicz, K. Intermolecular potentials based on symmetry-adapted perturbation theory including dispersion energies from time-dependent density functional calculations. J. Chem. Phys. 123, 214103 (2005).
Hesselmann, A., Jansen, G. & Schütz, M. Density-functional theory-symmetry-adapted intermolecular perturbation theory with density fitting: A new efficient method to study intermolecular interaction energies. J. Chem. Phys. 122, 014103 (2005).
Misquitta, A. J., Jeziorski, B. & Szalewicz, K. Dispersion Energy from Density-Functional Theory Description of Monomers. Phys. Rev. Lett. 91, 033201 (2003).
Hesselmann, A. & Jansen, G. Intermolecular dispersion energies from time-dependent density functional theory. Chem. Phys. Lett. 367, 778–784 (2003).
Bukowski, R., Podeszwa, R. & Szalewicz, K. Efficient calculations of coupled Kohn-Sham dynamic susceptibility functions and dispersion energies with density fitting. Chem. Phys. Lett. 414, 111–116 (2005).
Podeszwa, R., Bukowski, R. & Szalewicz, K. Density-Fitting Method in Symmetry-Adapted Perturbation Theory Based on Kohn-Sham Description of Monomers. J. Chem. Theory Comput. 2, 400–412 (2006).
Bukowski, R. et al. SAPT2020: An ab initio program for many-body symmetry-adapted perturbation theory calculations of intermolecular interaction energies. http://www.physics.udel.edu/~szalewic/SAPT/SAPT.html (2020).
Grüning, M., Gritsenko, O. V., van Gisbergen, S. J. A. & Baerends, E. J. Shape corrections to exchange-correlation potentials by gradient-regulated seamless connection of model potentials for inner and outer region. J. Chem. Phys. 114, 652–660 (2001).
Cencek, W. & Szalewicz, K. On asymptotic behavior of density functional theory. J. Chem. Phys. 139, 024104–(1:27) (2013). Erratum: 140, 149902 (2014).
Williams, H. L., Mas, E. M., Szalewicz, K. & Jeziorski, B. On the effectiveness of monomer-, dimer-, and bond-centered basis functions in calculations of intermolecular interaction energies. J. Chem. Phys. 103, 7374–7391 (1995).
Tang, K. T. & Toennies, J. P. An improved simple-model for the van der Waals potential based on universal damping functions for the dispersion coefficients. J. Chem. Phys. 80, 3726–3741 (1984).
Day, G. M., Motherwell, W. D. S. & Jones, W. Beyond the Isotropic Atom Model in Crystal Structure Prediction of Rigid Molecules: Atomic Multipoles versus Point Charges. Cryst. Growth Des. 5, 1023–1033 (2005).
Price, S. L. Computational prediction of organic crystal structures and polymorphism. Int. Rev. Phys. Chem. 27, 541–568 (2008).
Holden, J. R., Du, Z. & Ammon, H. L. Prediction of possible crystal structures for C-, H-, N-, O-, and F- containing organic compounds. J. Comput. Chem. 14, 422–437 (1993).
van Eijck, B. P. & Kroon, J. UPACK program package for crystal structure prediction: Force fields and crystal structure generation for small carbohydrate molecules. J. Comput. Chem. 20, 799–812 (1999).
Busing, W. R. Report ORNL-5747. Oak Ridge National Laboratory, Oak Ridge, TN (1981).
Jorgensen, W. L. & Tirado-Rives, J. The OPLS [Optimized Potentials for Liquid Simulations] potential functions for proteins, energy minimizations for crystals of cyclic peptides and crambin. J. Am. Chem. Soc. 110, 1657–1666 (1988).
Van Eijck, B. P. & Kroon, J. Fast clustering of equivalent structures in crystal structure prediction. J. Comput. Chem. 18, 1036–1042 (1997).
Spek, A. L. Single-crystal structure validation with the program PLATON. J. Appl. Crystallogr. 36, 7–13 (2003).
Spek, A. L. Structure validation in chemical crystallography. Acta Crystallogr., Sect. D: Biol. Crystallogr. 65, 148–155 (2009).
Rappe, A. M., Rabe, K. M., Kaxiras, E. & Joannopoulos, J. D. Optimized pseudopotentials. Phys. Rev. B: Condens. Matter 41, 1227 (1990).
Becke, A. D. & Johnson, E. R. Exchange-hole dipole moment and the dispersion interaction. J. Chem. Phys. 122, 154104–(1:5) (2005).
Grimme, S., Ehrlich, S. & Goerigk, L. Effect of the damping function in dispersion corrected density functional theory. J. Comp. Chem. 32, 1456–1465 (2011).
Giannozzi, P. et al. Quantum ESPRESSO: a modular and open-source software project for quantum simulations of materials. J. Phys. Condens. Matter 21, 395502 http://www.quantum-espresso.org (2009).
Giannozzi, P. et al. Advanced capabilities for materials modelling with quantum ESPRESSO. J. Phys. Condens. Matter 29, 465901 (2017).
Togo, A. & Tanaka, I. First principles phonon calculations in materials science. Scr. Mater. 108, 1–5 (2015).
Kresse, G. & Hafner, J. Ab initio molecular dynamics for liquid metals. Phys. Rev. B 47, 558–561 (1993).
Kresse, G. & Hafner, J. Ab initio molecular-dynamics simulation of the liquid-metal–amorphous-semiconductor transition in germanium. Phys. Rev. B: Condens. Matter 49, 14251–14269 (1994).
Kresse, G. & Furthmüller, J. Efficiency of ab-initio total energy calculations for metals and semiconductors using a plane-wave basis set. Comput. Mater. Sci. 6, 15–50 (1996).
Kresse, G. & Furthmüller, J. Efficient iterative schemes for ab initio total-energy calculations using a plane-wave basis set. Phys. Rev. B: Condens. Matter 54, 11169–11186 (1996).
Kresse, G. et al. VASP: Vienna ab initio simulation package. http://www.vasp.at (2021).
Blöchl, P. E. Projector augmented-wave method. Phys. Rev. B: Condens. Matter 50, 17953–17979 (1994).
Kresse, G. & Joubert, D. From ultrasoft pseudopotentials to the projector augmented-wave method. Phys. Rev. B: Condens. Matter 59, 1758–1775 (1999).

Acknowledgements

This work was supported by the U.S. Army Research Laboratory and Army Research Office (Grant No. W911NF-19-1-0117) and the NSF (Grant No. CHE-1900551). We thank Rafał Podeszwa for comments on the manuscript.

Author information

Authors and Affiliations

Department of Physics and Astronomy, University of Delaware, Newark, DE, 19716, USA
Rahul Nikhar & Krzysztof Szalewicz

Contributions

R.N. and K.S. designed the method. R.N. coded it and performed numerical calculations. Both authors analyzed the results, wrote the manuscript, and revised it.

Corresponding author

Correspondence to Krzysztof Szalewicz.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks Gregory Beran, Graeme Day and the other, anonymous, reviewer for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplemental Information

Peer Review File

Supplementary Dataset 1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Nikhar, R., Szalewicz, K. Reliable crystal structure predictions from first principles. Nat Commun 13, 3095 (2022). https://doi.org/10.1038/s41467-022-30692-y

Received: 17 November 2021
Accepted: 10 May 2022
Published: 02 June 2022
DOI: https://doi.org/10.1038/s41467-022-30692-y
