Commit 03a821be authored by Bjoern Lange's avatar Bjoern Lange
Browse files

BLL Initial commit

parents
libOMM
======
Version
-------
0.1.0
Date
----
02.06.2015
Author
------
Fabiano Corsetti, CIC nanoGUNE, Donostia-San Sebastián, Spain
* Email: f.corsetti {at} nanogune.eu
* Homepage: <http://www.nanogune.eu/fabiano-corsetti/>
Description
-----------
libOMM solves the Kohn-Sham equation as a generalized eigenvalue problem for a fixed Hamiltonian. It implements the orbital minimization method (OMM), which works within a density matrix formalism. The basic strategy of the OMM is to find the set of Wannier functions (WFs) describing the occupied subspace by direct unconstrained minimization of an appropriately-constructed functional. The density matrix can then be calculated from the WFs. The solver is usually employed within an outer self-consistency (SCF) cycle. Therefore, the WFs resulting from one SCF iteration can be saved and then re-used as the initial guess for the next iteration.
Installation
------------
1. Enter the `src` directory.
2. Copy `make.inc.example` to `make.inc` and modify it to suit your needs. `MSLIBPATH` should point to the MatrixSwitch directory (default in `make.inc.example` is for the version included in the distribution). MatrixSwitch should be compiled with the `-DCONV` flag. Available options for `FPPFLAGS` are:
* `-DMPI`: enable MPI parallel routines
* `-DLAP`: enable LAPACK routines (currently necessary for preconditioning/Cholesky factorization)
* `-DSLAP`: enable ScaLAPACK routines (requires `-DMPI`)
* `-DNORAND`: fixed seed for the random number generator. Enable for testing purposes.
* `-DCBIND`: use ISO_C_BINDING for LOGICAL inputs in the wrapper interfaces. Enable for linking to C.
3. Type `make`.
Testing
-------
The `examples` directory contains a number of small programs that make use of libOMM with MatrixSwitch. These can be useful both for testing the installation and for learning how to use the library. To compile them:
1. Enter the `examples` directory.
2. Copy `make.inc.example` to `make.inc` and modify it to suit your needs. Be aware that `make.inc` in the `src` directory will also be used.
3. Type `make`.
Each example contains a header explaining what the program does and providing sample output to compare against.
Publications
------------
The algorithms and implementation are described in: F. Corsetti, *The orbital minimization method for electronic structure calculations with finite-range atomic basis sets*, Comput. Phys. Commun. **185**, 873 (2014). <http://dx.doi.org/10.1016/j.cpc.2013.12.008>
Documentation
-------------
A complete documentation is maintained at: <http://esl.cecam.org/libOMM>. Also see the examples in the `examples` directory.
License
-------
Copyright &copy; 2014, Fabiano Corsetti
All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#-include ../src/make.inc
#-include make.inc
ifeq ($(strip $(MPI)),yes)
FPPFLAGS += -DMPI
endif
ifeq ($(strip $(LAP)),yes)
FPPFLAGS += -DLAP
endif
ifeq ($(strip $(SLAP)),yes)
FPPFLAGS += -DSLAP
endif
.SUFFIXES:
.SUFFIXES: .x .f90 .F90
all : example1.x example2.x example3.x example4.x example5.x example6.x example7.x
example1.x : example1.F90
example2.x : example2.F90
example3.x : example3.F90
example4.x : example4.F90
example5.x : example5.F90
example6.x : example6.F90
example7.x : example7.F90
clean :
rm -f *.x
LINK_MACRO = $< -I../src -I../src/MatrixSwitch-0.1.2/src -L../src -lOMM -L../src/MatrixSwitch-0.1.2/src -lmatrixswitch $(SCALAPACK) -lblas -llapack -o $@
.F90.x :
$(MPIFC) $(FCFLAGS_O) $(FPPFLAGS) $(LINK_MACRO)
.f90.x :
$(MPIFC) $(FCFLAGS_O) $(LINK_MACRO)
!==================================================================================================!
! example1 : Gamma-point only, spin unpolarized !
! !
! This example demonstrates a typical calculation with an outer loop of two MD iterations, and an !
! inner loop of five SCF iterations per MD step. For convenience, the Hamiltonian and overlap !
! matrices at each point have been pre-generated and are read in from file; in a real code, !
! however, they should be calculated at each step based on the output density matrix given by !
! libOMM. !
! !
! This example is for a system with 832 basis orbitals and 128 occupied states (generated from a !
! 64-atom supercell of bulk Si). At the end of each SCF iteration, the Kohn-Sham energy 2*e_min is !
! printed out, together with the first element of the density matrix as an extra check. At the end !
! of each MD iteration, the energy-weighted density matrix is also calculated, and its first !
! element is printed out. !
! !
! Things to note: !
! 1. The eigenspectrum shift parameter eta has to be set larger than 0 for convergence, since !
! the occupied spectrum extends beyond 0 (this is typically a sign of a poor basis). Try !
! changing eta to see how the convergence speed is affected. However, also take into account: !
! 2. The Hamiltonian has to be provided to libOMM *already shifted by eta* (H -> H-eta*S) !
! 3. There are no optional arguments in the call to libOMM. Therefore, matrices which are not !
! needed should simpy be passed without having been allocated by MatrixSwitch (m_allocate !
! routine). Other variables which are not needed will be ignored by libOMM. !
! 4. Try enabling Cholesky factorization (precon=1) or preconditioning (precon=3) in the call to !
! libOMM to see how the convergence speed is affected. Preconditioning is even more effective !
! if a T matrix is provided (scale_T should be set around 10 Ry). Take care that, for !
! Cholesky factorization, S and H will be overwritten by U and U^(-T)*H*U^(-1). !
! 5. The dealloc variable should only be .true. for the very last call. This is because libOMM !
! stores and reuses internal information from one call to the next. !
! !
! Sample output can be found in example1.out and example1_libOMM.log !
!==================================================================================================!
program example1
use MatrixSwitch
implicit none
#ifdef MPI
include 'mpif.h'
#endif
!**** PARAMS **********************************!
integer, parameter :: dp=selected_real_kind(15,300)
complex(dp), parameter :: cmplx_1=(1.0_dp,0.0_dp)
complex(dp), parameter :: cmplx_i=(0.0_dp,1.0_dp)
complex(dp), parameter :: cmplx_0=(0.0_dp,0.0_dp)
!**** VARIABLES *******************************!
character(5) :: m_storage
character(3) :: m_operation
character(21) :: file_name
logical :: new_S, dealloc
integer :: mpi_err, mpi_size, mpi_rank
integer :: m, n, imd, iscf, i, j, k, l, iostat
real(dp) :: he, se, el, e_min, eta
type(matrix) :: H, S, D_min, ED_min, C_min, T
!**********************************************!
#ifdef MPI
call mpi_init(mpi_err)
call mpi_comm_size(mpi_comm_world,mpi_size,mpi_err)
call mpi_comm_rank(mpi_comm_world,mpi_rank,mpi_err)
call ms_scalapack_setup(mpi_size,1,'c',32)
m_storage='pddbc'
m_operation='lap'
#else
mpi_rank=0
m_storage='sdden'
m_operation='ref'
#endif
m=832
n=128
call m_allocate(H,m,m,m_storage)
call m_allocate(S,m,m,m_storage)
call m_allocate(D_min,m,m,m_storage)
call m_allocate(ED_min,m,m,m_storage)
call m_allocate(C_min,n,m,m_storage)
eta=0.36748994224532205_dp
do imd=1,2
do iscf=1,5
if (mpi_rank==0) then
print('(a)'), '//////////////////////////'
print('(a,1x,i1,a,1x,i1,a)'), '/ MD STEP', imd, ' - SCF STEP', iscf, ' /'
print('(a)'), '//////////////////////////'
print('()')
end if
write(file_name,'(a,i1,a,i1,a)') 'data/example1_', imd, '_', iscf, '.dat'
open(10,file=file_name)
call m_set(H,'a',0.0_dp,0.0_dp)
if (iscf==1) call m_set(S,'a',0.0_dp,0.0_dp)
do i=1,m*m
read(10,'(2(1x,i5),2(1x,f21.15))',iostat=iostat) k, l, he, se
if (iostat/=0) exit
call m_set_element(H,k,l,he-eta*se)
if (iscf==1) call m_set_element(S,k,l,se)
end do
close(10)
if (iscf==1) then
new_S=.true.
else
new_S=.false.
end if
call omm(m,n,H,S,new_S,e_min,D_min,.false.,eta,C_min,.false.,T,0.0_dp,0,1,1,-1.0_dp,.true.,.false.,m_storage,m_operation,&
mpi_rank)
if (mpi_rank==0) then
print('(a,f21.15)'), 'e_min : ', 2.0_dp*e_min
end if
call m_get_element(D_min,1,1,el)
if (mpi_rank==0) then
print('(a,f21.15)'), 'D_11 : ', el
print('()')
end if
end do
if (imd==2) then
dealloc=.true.
else
dealloc=.false.
end if
call omm(m,n,H,S,.false.,e_min,ED_min,.true.,eta,C_min,.false.,T,0.0_dp,0,1,1,-1.0_dp,.true.,dealloc,m_storage,m_operation,&
mpi_rank)
call m_get_element(ED_min,1,1,el)
if (mpi_rank==0) then
print('(a,f21.15)'), 'ED_11 : ', el
print('()')
end if
end do
call m_deallocate(C_min)
call m_deallocate(ED_min)
call m_deallocate(D_min)
call m_deallocate(S)
call m_deallocate(H)
#ifdef MPI
call mpi_finalize(mpi_err)
#endif
end program example1
!==================================================================================================!
! example2 : Gamma-point only, spin polarized !
! !
! This example demonstrates a typical calculation with an outer loop of two MD iterations, and an !
! inner loop of five SCF iterations per MD step. For convenience, the Hamiltonian and overlap !
! matrices at each point have been pre-generated and are read in from file; in a real code, !
! however, they should be calculated at each step based on the output density matrix given by !
! libOMM. !
! !
! This example is for the same system as example1, but with a (quite unphysical) fixed spin !
! polarization (131 occupied states for up, 125 for down). This is to demonstrate that n can !
! differ between up and down. At the end of each SCF iteration, the Kohn-Sham energy !
! e_min_up+e_min_down is printed out, together with the first element of the up and down density !
! matrices as an extra check. At the end of each MD iteration, the energy-weighted up and down !
! density matrices are also calculated, and their first elements are printed out. !
! !
! Things to note: !
! 1. The eigenspectrum shift parameter eta has to be set larger than 0 for convergence, since !
! the occupied spectrum extends beyond 0 (this is typically a sign of a poor basis). Try !
! changing eta to see how the convergence speed is affected. However, also take into account: !
! 2. The Hamiltonian has to be provided to libOMM *already shifted by eta* (H -> H-eta*S) !
! 3. There are no optional arguments in the call to libOMM. Therefore, matrices which are not !
! needed should simpy be passed without having been allocated by MatrixSwitch (m_allocate !
! routine). Other variables which are not needed will be ignored by libOMM. !
! 4. Try enabling Cholesky factorization (precon=1) or preconditioning (precon=3) in the call to !
! libOMM to see how the convergence speed is affected. Preconditioning is even more effective !
! if a T matrix is provided (scale_T should be set around 10 Ry). Take care that, for !
! Cholesky factorization, S and H will be overwritten by U and U^(-T)*H*U^(-1); this is !
! *particularly* important for spin polarized calculations using the same S for up/down, !
! since the second spin point should call libOMM with precon=2 (S has already been factorized !
! and returned as U by the first spin point). !
! 5. The dealloc variable should only be .true. for the very last call. This is because libOMM !
! stores and reuses internal information from one call to the next. !
! !
! Sample output can be found in example2.out and example2_libOMM.log !
!==================================================================================================!
program example2
use MatrixSwitch
implicit none
#ifdef MPI
include 'mpif.h'
#endif
!**** PARAMS **********************************!
integer, parameter :: dp=selected_real_kind(15,300)
complex(dp), parameter :: cmplx_1=(1.0_dp,0.0_dp)
complex(dp), parameter :: cmplx_i=(0.0_dp,1.0_dp)
complex(dp), parameter :: cmplx_0=(0.0_dp,0.0_dp)
!**** VARIABLES *******************************!
character(5) :: m_storage
character(3) :: m_operation
character(21) :: file_name
logical :: new_S, dealloc
integer :: mpi_err, mpi_size, mpi_rank
integer :: m, n_up, n_down, imd, iscf, i, j, k, l, iostat
real(dp) :: hue, hde, se, eul, edl, e_min_up, e_min_down, eta
type(matrix) :: H_up, H_down, S, D_min_up, D_min_down, ED_min_up, ED_min_down, C_min_up, C_min_down, T
!**********************************************!
#ifdef MPI
call mpi_init(mpi_err)
call mpi_comm_size(mpi_comm_world,mpi_size,mpi_err)
call mpi_comm_rank(mpi_comm_world,mpi_rank,mpi_err)
call ms_scalapack_setup(mpi_size,1,'c',32)
m_storage='pddbc'
m_operation='lap'
#else
mpi_rank=0
m_storage='sdden'
m_operation='ref'
#endif
m=832
n_up=131
n_down=125
call m_allocate(H_up,m,m,m_storage)
call m_allocate(H_down,m,m,m_storage)
call m_allocate(S,m,m,m_storage)
call m_allocate(D_min_up,m,m,m_storage)
call m_allocate(D_min_down,m,m,m_storage)
call m_allocate(ED_min_up,m,m,m_storage)
call m_allocate(ED_min_down,m,m,m_storage)
call m_allocate(C_min_up,n_up,m,m_storage)
call m_allocate(C_min_down,n_down,m,m_storage)
eta=0.36748994224532205_dp
do imd=1,2
do iscf=1,5
if (mpi_rank==0) then
print('(a)'), '//////////////////////////'
print('(a,1x,i1,a,1x,i1,a)'), '/ MD STEP', imd, ' - SCF STEP', iscf, ' /'
print('(a)'), '//////////////////////////'
print('()')
end if
write(file_name,'(a,i1,a,i1,a)') 'data/example2_', imd, '_', iscf, '.dat'
open(10,file=file_name)
call m_set(H_up,'a',0.0_dp,0.0_dp)
call m_set(H_down,'a',0.0_dp,0.0_dp)
if (iscf==1) call m_set(S,'a',0.0_dp,0.0_dp)
do i=1,m*m
read(10,'(2(1x,i5),3(1x,f21.15))',iostat=iostat) k, l, hue, hde, se
if (iostat/=0) exit
call m_set_element(H_up,k,l,hue-eta*se)
call m_set_element(H_down,k,l,hde-eta*se)
if (iscf==1) call m_set_element(S,k,l,se)
end do
close(10)
if (iscf==1) then
new_S=.true.
else
new_S=.false.
end if
if (mpi_rank==0) print('(a)'), 'up spin...'
call omm(m,n_up,H_up,S,new_S,e_min_up,D_min_up,.false.,eta,C_min_up,.false.,T,0.0_dp,0,2,1,-1.0_dp,.true.,.false.,m_storage,&
m_operation,mpi_rank)
if (mpi_rank==0) print('(a)'), 'down spin...'
call omm(m,n_down,H_down,S,new_S,e_min_down,D_min_down,.false.,eta,C_min_down,.false.,T,1.0_dp,0,2,2,-1.0_dp,.true.,.false.,&
m_storage,m_operation,mpi_rank)
if (mpi_rank==0) then
print('()')
print('(a,f21.15)'), 'e_min : ', e_min_up+e_min_down
end if
call m_get_element(D_min_up,1,1,eul)
call m_get_element(D_min_down,1,1,edl)
if (mpi_rank==0) then
print('(a,f21.15,a,f21.15,a)'), 'D_11 : ', eul, ' (up), ', edl, ' (down)'
print('()')
end if
end do
call omm(m,n_up,H_up,S,.false.,e_min_up,ED_min_up,.true.,eta,C_min_up,.false.,T,0.0_dp,0,2,1,-1.0_dp,.true.,.false.,m_storage,&
m_operation,mpi_rank)
if (imd==2) then
dealloc=.true.
else
dealloc=.false.
end if
call omm(m,n_down,H_down,S,.false.,e_min_down,ED_min_down,.true.,eta,C_min_down,.false.,T,0.0_dp,0,2,2,-1.0_dp,.true.,dealloc,&
m_storage,m_operation,mpi_rank)
call m_get_element(ED_min_up,1,1,eul)
call m_get_element(ED_min_down,1,1,edl)
if (mpi_rank==0) then
print('(a,f21.15,a,f21.15,a)'), 'ED_11 : ', eul, ' (up), ', edl, ' (down)'
print('()')
end if
end do
call m_deallocate(C_min_down)
call m_deallocate(C_min_up)
call m_deallocate(ED_min_down)
call m_deallocate(ED_min_up)
call m_deallocate(D_min_down)
call m_deallocate(D_min_up)
call m_deallocate(S)
call m_deallocate(H_down)
call m_deallocate(H_up)
#ifdef MPI
call mpi_finalize(mpi_err)
#endif
end program example2
!==================================================================================================!
! example3 : k points, spin unpolarized !
! !
! This example demonstrates a typical calculation with an outer loop of two MD iterations, and an !
! inner loop of five SCF iterations per MD step. For convenience, the Hamiltonian and overlap !
! matrices at each point have been pre-generated and are read in from file; in a real code, !
! however, they should be calculated at each step based on the output density matrix given by !
! libOMM. !
! !
! This example is for a system with 988 basis orbitals and 152 occupied states at each k point !
! (generated from a 76-atom C nanotube). There are four k points, weighted equally. At the end of !
! each SCF iteration, the Kohn-Sham energy 0.25*(e_min(1)+e_min(2)+e_min(3)+e_min(4)) is printed !
! out, together with the first element of the density matrix at each k point as an extra check. At !
! the end of each MD iteration, the energy-weighted density matrix at each k point is also !
! calculated, and its first element is printed out. !
! !
! Things to note: !
! 1. The eigenspectrum shift parameter eta in this case can be set to 0 (see example1/2). !
! 2. There are no optional arguments in the call to libOMM. Therefore, matrices which are not !
! needed should simpy be passed without having been allocated by MatrixSwitch (m_allocate !
! routine). Other variables which are not needed will be ignored by libOMM. !
! 3. Try enabling Cholesky factorization (precon=1) or preconditioning (precon=3) in the call to !
! libOMM to see how the convergence speed is affected. Preconditioning is even more effective !
! if a T matrix is provided (scale_T should be set around 10 Ry). Take care that, for !
! Cholesky factorization, S and H will be overwritten by U and U^(-T)*H*U^(-1). !
! 4. The dealloc variable should only be .true. for the very last call. This is because libOMM !
! stores and reuses internal information from one call to the next. !
! !
! Sample output can be found in example3.out and example3_libOMM.log !
!==================================================================================================!
program example3
use MatrixSwitch
implicit none
#ifdef MPI
include 'mpif.h'
#endif
!**** PARAMS **********************************!
integer, parameter :: dp=selected_real_kind(15,300)
complex(dp), parameter :: cmplx_1=(1.0_dp,0.0_dp)
complex(dp), parameter :: cmplx_i=(0.0_dp,1.0_dp)
complex(dp), parameter :: cmplx_0=(0.0_dp,0.0_dp)
!**** VARIABLES *******************************!
character(5) :: m_storage
character(3) :: m_operation
character(21) :: file_name
logical :: new_S, dealloc
integer :: mpi_err, mpi_size, mpi_rank
integer :: m, n, nk, imd, iscf, i, j, k, l, iostat
real(dp) :: eta
real(dp), allocatable :: he(:), se(:), e_min(:)
complex(dp) :: cmplx_he, cmplx_se, el
type(matrix) :: T
type(matrix), allocatable :: H(:), S(:), D_min(:), ED_min(:), C_min(:)
!**********************************************!
#ifdef MPI
call mpi_init(mpi_err)
call mpi_comm_size(mpi_comm_world,mpi_size,mpi_err)
call mpi_comm_rank(mpi_comm_world,mpi_rank,mpi_err)
call ms_scalapack_setup(mpi_size,1,'c',32)
m_storage='pzdbc'
m_operation='lap'
#else
mpi_rank=0
m_storage='szden'
m_operation='ref'
#endif
m=988
n=152
nk=4
allocate(e_min(nk))
allocate(he(2*nk))
allocate(se(2*nk))
allocate(H(nk))
allocate(S(nk))
allocate(D_min(nk))
allocate(ED_min(nk))
allocate(C_min(nk))
do i=1,nk
call m_allocate(H(i),m,m,m_storage)
call m_allocate(S(i),m,m,m_storage)
call m_allocate(D_min(i),m,m,m_storage)
call m_allocate(ED_min(i),m,m,m_storage)
call m_allocate(C_min(i),n,m,m_storage)
end do
eta=0.0_dp
do imd=1,2
do iscf=1,5
if (mpi_rank==0) then
print('(a)'), '//////////////////////////'
print('(a,1x,i1,a,1x,i1,a)'), '/ MD STEP', imd, ' - SCF STEP', iscf, ' /'
print('(a)'), '//////////////////////////'
print('()')
end if
write(file_name,'(a,i1,a,i1,a)') 'data/example3_', imd, '_', iscf, '.dat'
open(10,file=file_name)
do i=1,nk
call m_set(H(i),'a',0.0_dp,0.0_dp)
if (iscf==1) call m_set(S(i),'a',0.0_dp,0.0_dp)
end do
do i=1,m*m
read(10,'(2(1x,i5),16(1x,f21.15))',iostat=iostat) k, l, he(1:2), se(1:2), &
he(3:4), se(3:4), &
he(5:6), se(5:6), &
he(7:8), se(7:8)
if (iostat/=0) exit
do j=1,nk
cmplx_se=cmplx(se(2*j-1),se(2*j),dp)
cmplx_he