Biomolecules such as proteins, nucleic acids, and lipids are traditionally modeled with force fields like AMBER, OPLS, etc. These force fields describe, in a consistent manner, a small number of functional groups: amino acids, nucleic bases, phosphates, etc. Whenever we need to model a new substance or functional group, the question of its parameterization arises. First of all, we need to decide what quality of parameterization is required and what resources can be spent on it.
AMBER-like force fields are parameterized as follows:
Consider three cases:
The main problem is the non-valence (non-bonded) interactions. Whichever approach we use, we have to calculate the partial charges on the atoms. In classic AMBER, partial charges are derived from quantum-chemical data obtained with the Hartree-Fock method in the 6-31G(d) basis set. Depending on the number of atoms, these calculations may take considerable time.
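To make the charge derivation step more concrete, here is a minimal sketch of fitting atomic point charges to an electrostatic potential by least squares with a fixed total charge. It is an illustration only: it assumes the potential values on a set of grid points have already been exported from a quantum-chemistry program, it works in atomic units, and it is a plain constrained fit, not the restrained (RESP) variant used in AMBER workflows. The function names `solve_linear` and `fit_esp_charges` and the demo geometry are hypothetical.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_esp_charges(atoms, grid, potential, total_charge=0.0):
    """Least-squares point charges reproducing `potential` on `grid`.

    atoms, grid: lists of (x, y, z) in bohr; potential: values in a.u.
    The total charge is enforced with a Lagrange multiplier.
    """
    n = len(atoms)
    # Design matrix: a[k][i] = 1 / |grid_k - atom_i|
    a = [[1.0 / math.dist(p, r) for r in atoms] for p in grid]
    mat = [[0.0] * (n + 1) for _ in range(n + 1)]
    rhs = [0.0] * (n + 1)
    for i in range(n):
        for j in range(n):
            mat[i][j] = sum(row[i] * row[j] for row in a)  # (A^T A)_ij
        mat[i][n] = 1.0  # Lagrange multiplier column
        mat[n][i] = 1.0  # constraint row: sum of charges
        rhs[i] = sum(row[i] * v for row, v in zip(a, potential))  # (A^T V)_i
    rhs[n] = total_charge
    return solve_linear(mat, rhs)[:n]

# Synthetic demo: the "reference" potential is generated from known charges,
# so the fit should simply recover them. Geometry is a rough water-like shape.
atoms = [(0.0, 0.0, 0.0), (0.0, 1.43, 1.11), (0.0, -1.43, 1.11)]
known = [-0.8, 0.4, 0.4]
grid = [(x, y, z)
        for x in (-5.0, 0.0, 5.0)
        for y in (-5.0, 0.0, 5.0)
        for z in (-5.0, 0.0, 5.0)
        if (x, y, z) != (0.0, 0.0, 0.0)]
potential = [sum(q / math.dist(p, r) for q, r in zip(known, atoms)) for p in grid]
fitted = fit_esp_charges(atoms, grid, potential, total_charge=0.0)
print([round(q, 4) for q in fitted])
```

In a real parameterization the grid points would lie on shells outside the van der Waals surface of the molecule, and the potential would come from the chosen quantum-chemical method rather than from synthetic charges.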
In case (2.) we proceed in the usual way. With some experience, one can parameterize 2-3 substances per day.
In case (1.), we can save on the electrostatics calculations by taking a smaller basis set. If we want to stay within ab initio calculations (which is desirable), we can use the MINI basis set. Parameterization of the valence interactions can be skipped. Instead, we can use a universal force field, for example DREIDING or UFF. The Abalone program can use a DREIDING-like force field. Of course, this will lower the quality of the calculations, but not by much. The exact values of the valence parameters are generally less significant than those of the non-valence parameters.
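For orientation, the split between valence and non-valence terms can be written out using the usual AMBER-type energy function. The form below is the standard textbook expression, given here as a reminder and not as the exact function of any particular program: the first three sums are the valence (bonded) terms, and the last sum contains the non-valence Lennard-Jones and Coulomb terms, where the partial charges $q_i$ enter.

```latex
E = \sum_{\text{bonds}} k_b \,(r - r_0)^2
  + \sum_{\text{angles}} k_\theta \,(\theta - \theta_0)^2
  + \sum_{\text{dihedrals}} \frac{V_n}{2}\,\bigl[1 + \cos(n\varphi - \gamma)\bigr]
  + \sum_{i<j} \left[ \frac{A_{ij}}{r_{ij}^{12}} - \frac{B_{ij}}{r_{ij}^{6}}
  + \frac{q_i q_j}{\varepsilon\, r_{ij}} \right]
```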
In case (3.) it is desirable to calculate the electrostatics as carefully as possible. First, we need a good basis set. The aug-cc-pVDZ basis set is suitable for this purpose 1). Second, we need to take electron correlation into account. This can be done using MP2 or, with slightly worse results, DFT functionals such as PBE0, B3LYP, or M06-2X. Third, if we are going to use the model in a high-dielectric environment such as water, we should increase the dipole moment of the model. This can be done using the COSMO method.
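One quick way to check how strongly a set of fitted charges is polarized is to compute the dipole moment of the point-charge model. The sketch below is illustrative only: the helper name `dipole_debye` is hypothetical, and the TIP3P-like water geometry and charges are used purely as a familiar example of a condensed-phase model whose dipole (about 2.35 D) is larger than the gas-phase value of water (about 1.85 D).

```python
import math

E_ANGSTROM_TO_DEBYE = 4.80320  # 1 e*Angstrom expressed in debye

def dipole_debye(charges, coords):
    """Dipole magnitude in debye for point charges (e) at coords (Angstrom).

    For a neutral molecule the result does not depend on the origin.
    """
    mu = [sum(q * r[k] for q, r in zip(charges, coords)) for k in range(3)]
    return math.sqrt(sum(c * c for c in mu)) * E_ANGSTROM_TO_DEBYE

# TIP3P-like water (illustrative): O-H 0.9572 Angstrom, H-O-H 104.52 degrees
r_oh = 0.9572
half = math.radians(104.52 / 2.0)
coords = [
    (0.0, 0.0, 0.0),                                      # O
    (r_oh * math.sin(half), r_oh * math.cos(half), 0.0),  # H
    (-r_oh * math.sin(half), r_oh * math.cos(half), 0.0), # H
]
charges = [-0.834, 0.417, 0.417]

print(f"model dipole: {dipole_debye(charges, coords):.2f} D")
```

Comparing this number for charges fitted with and without a continuum solvent model gives a simple measure of how much the solvent treatment has polarized the model.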
Thus the recommended method for calculating partial charges for the simulation of biomolecules is the PBE0/aug-cc-pVDZ model with COSMO.
Valence parameters should be derived either from spectroscopic data or from calculations in a basis set of at least 2d quality. A suitable basis set is 6-311+G(2d,p), which gives good conformational results. It can also be used for charge calculations if the molecule is too large to be computed with aug-cc-pVDZ.
These recommendations are only approximate. A really good parameterization requires the use of several different quantum-chemical methods. If we look for the most robust, yet sufficiently accurate, method, M06-2X/6-311+G(2d,p) can be recommended. It can be used for both electrostatic and conformational calculations, and it gives reasonable results for non-polar and unsaturated compounds, which are usually a problem.