Commit b12ea40c authored by Christian Krekeler

Merge branch 'krekeler_master' into 'master'

Update Krekeler master

See merge request !9
parents 9cb0d2a5 6b2dd6be
Pipeline #4165 passed with stages
in 1 minute and 50 seconds
# compiled python files
*.pyc
# sphinx build folder
_build
_templates
*.pyc
*.DS_Store
# OS generated files #
######################
.DS_Store?
ehthumbs.db
Icon?
Thumbs.db
# Editor backup files #
#######################
*~
image: umbrellium/sphinx-doc
image: cloudcompass/docker-rtdsphinx
spelling:
script:
- pip3 install codespell
- codespell --skip=".git,_static,_build,Diff*,*.patch" --quiet-level=2 --ignore-words-list="adress"
only:
- master
- merge_requests
orphans:
script:
# Report all the orphans but ignore the exit code
- find ./ -name "*.rst"|xargs -i grep -H orphan {} || true
# Now handle the error code
- if [ $(find ./ -name "*.rst"|xargs -i grep -H orphan {}|wc -l) -gt "2" ]; then $(exit 1); else $(exit 0); fi
only:
- master
pages:
script:
- apt-get update
- apt-get -y install dvipng
- pip3 install pygments --upgrade
- pip3 install Sphinx --upgrade
- pip3 install sphinx-bootstrap-theme --upgrade
- READTHEDOCS=True sphinx-build -nWT -b html . _build/html
- mv _build/html/ public/
- echo -e "\n\n\e[1mYou can find your build of this documentation at \n\t\e[32m${CI_PAGES_URL}\e[0m\n\n"
artifacts:
paths:
- public
only:
- master
- merge_requests
Add a brief description here (which replaces this sentence); a line or two is usually enough.
## Module verification checklist (for reviewers)
*Checklist when the module is first submitted*
- [ ] Have the relevant labels been added to the MR
- [ ] If submitted on someone else's behalf, has the software author been referenced (if they have a GitLab account)
*Checklist when module is no longer "WIP"*
- [ ] Is the module documentation sufficiently detailed?
- [ ] Is it mergeable? (i.e., there should be no merge conflicts)
- [ ] Are the build instructions sufficient - source code locations, build instructions, etc.? (If not the MR should be updated)
- [ ] Did it pass the tests that were described? (Are there unit/regression tests? Do they pass?)
- [ ] Are the tests sufficient?
- [ ] If the module introduces new functionality, is it tested? (Unit/regression tests?)
- [ ] Is the associated source code well formatted? (typos, line length, brackets,...it should be consistent with existing source)
- [ ] Is all new source code sufficiently documented? (functions, their arguments,...)
- [ ] Is there a description of any applications the module has? (This is a hard requirement for E-CAM PDRAs)
*After Merging*
- [ ] Make sure the module appears in a toctree
- [ ] Add a link to the final result on https://e-cam.readthedocs.io
......@@ -184,6 +184,8 @@ The modules that are based on OPS, but remain separate, are:
./modules/ops_piggybacker/readme
./modules/contact_maps/readme
./modules/contact_maps_parallelization/readme
./modules/dw_dimer_testsystem/readme
./modules/lammps_ops/readme
Nine of these modules were part of
`E-CAM Deliverable 1.2 <https://www.e-cam2020.eu/deliverables/>`_. Those modules
......@@ -198,6 +200,21 @@ together with the partner and typically are to facilitate or improve the scope o
partner. The related code development for the pilot projects is open source (where the licence of the underlying
software allows this) and are described in the modules associated with the pilot projects.
More information on Classical MD pilot projects can be found on the main E-CAM website:
* `Project on binding kinetics <https://www.e-cam2020.eu/pilot-project-biki/>`_
* `Project on food and pharmaceutical proteins <https://www.e-cam2020.eu/pilot-project-food-proteins/>`_
The following modules were developed specifically for the Classical MD pilot projects.
.. toctree::
:glob:
:maxdepth: 1
./modules/contact_maps/readme
./modules/contact_maps_parallelization/readme
./modules/contact_concurrences/readme
Extended Software Development Workshops (ESDWs)
===============================================
......@@ -225,5 +242,15 @@ August 2017. The following modules have been produced:
./modules/OpenPathSampling/ops_sr_shooter/readme
./modules/OpenPathSampling/ops_web_throwing/readme
./modules/OpenPathSampling/ops_plumed_wrapper/readme
./modules/OpenPathSampling/ops_s_shooting/readme
The third ESDW for the Classical MD workpackage was held in Turin, Italy in July
2018. The following have been produced as a result:
.. toctree::
:glob:
:maxdepth: 1
./modules/HTC/decorators/readme
.. _E-CAM: https://www.e-cam2020.eu/
.. In ReStructured Text (ReST) indentation and spacing are very important (it is how ReST knows what to do with your
document). For ReST to understand what you intend and to render it correctly please keep the structure of this
template. Make sure that any time you use ReST syntax (such as for ".. sidebar::" below), it is preceded
and followed by white space (if you see warnings when this file is built then this is a common origin for problems).
.. Firstly, let's add technical info as a sidebar and allow text below to wrap around it. This list is a work in
progress, please help us improve it. We use *definition lists* of ReST_ to make this readable.
.. sidebar:: Software Technical Information
Name
``jobqueue_features``
Language
Python
Licence
`MIT <https://opensource.org/licenses/mit-license>`_
Documentation Tool
In-source documentation
Application Documentation
Not currently available. Example usage provided.
Relevant Training Material
Not currently available.
Software Module Developed by
Adam Włodarczyk (Wrocław Centre of Networking and Supercomputing),
Alan O'Cais (Juelich Supercomputing Centre)
.. In the next line you have the name of how this module will be referenced in the main documentation (which you can
reference, in this case, as ":ref:`example`"). You *MUST* change the reference below from "example" to something
unique otherwise you will cause cross-referencing errors. The reference must come right before the heading for the
reference to work (so don't insert a comment between).
.. _htc:
#######################################
E-CAM High Throughput Computing Library
#######################################
.. Let's add a local table of contents to help people navigate the page
.. contents:: :local:
.. Add an abstract for a *general* audience here. Write a few lines that explains the "helicopter view" of why you are
creating this module. For example, you might say that "This module is a stepping stone to incorporating XXXX effects
into YYYY process, which in turn should allow ZZZZ to be simulated. If successful, this could make it possible to
produce compound AAAA while avoiding expensive process BBBB and CCCC."
E-CAM is interested in the challenge
of bridging timescales. To study molecular dynamics with atomistic detail, timesteps on
the order of a femtosecond must be used. Many problems in biological chemistry, materials science, and other
fields involve events that only spontaneously occur after a millisecond or longer (for example,
biomolecular conformational changes, or nucleation processes). That means that around :math:`10^{12}` time
steps would be needed to see a single millisecond-scale event. This is the problem of "rare
events" in theoretical and computational chemistry.
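The step count quoted above follows directly from the ratio of the two timescales. A minimal sketch of the arithmetic, assuming a timestep of exactly one femtosecond and an event time of exactly one millisecond (both are round illustrative values):

```python
# Ratio of the event timescale to the integration timestep.
timestep_s = 1e-15    # one femtosecond (illustrative value)
event_time_s = 1e-3   # one millisecond (illustrative value)

# round() guards against floating-point error in the division.
steps_needed = round(event_time_s / timestep_s)
print(f"{steps_needed:.0e} time steps")
```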
Modern supercomputers are beginning to make it
possible to obtain trajectories long enough to observe some of these processes, but to fully
characterize a transition with proper statistics, many examples are needed. In order to obtain many
examples the same application must be run many thousands of times with varying inputs. To manage
this kind of computation a task scheduling high throughput computing (HTC) library is needed. The main elements of such a
scheduling library are: task definition, task scheduling and task execution.
While HTC workloads have traditionally been looked down upon in the HPC
space, scientific use cases for extreme-scale resources do exist, and algorithms that require a
coordinated approach make efficient libraries that implement
this approach increasingly important. The 5 Petaflop booster technology of `JURECA <http://www.fz-juelich.de/ias/jsc/EN/Expertise/Supercomputers/JURECA/JURECA_node.html>`_
is an interesting concept in this respect, since offloading heavy
computation fits well with the approach outlined here.
Purpose of Module
_________________
.. Keep the helper text below around in your module by just adding ".. " in front of it, which turns it into a comment
This module is the first in a sequence that will form the overall capabilities of the library. In particular, this module
provides a set of decorators that wrap the `Dask-Jobqueue <https://jobqueue.dask.org/en/latest/>`_
Python library, with the aim of lowering the development cost of leveraging it for our use cases.
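To illustrate the general idea of a decorator that turns an ordinary function into a submitted task, here is a minimal sketch. It does not use the ``jobqueue_features`` or Dask-Jobqueue API: the decorator name ``as_task`` is hypothetical, and a standard-library thread pool stands in for a cluster.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import wraps


def as_task(executor):
    """Hypothetical decorator: calls return a Future instead of running inline."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            # Hand the call to the executor; the caller gets a Future back.
            return executor.submit(func, *args, **kwargs)
        return wrapper
    return decorator


# A local thread pool stands in for a cluster (assumption of this sketch).
executor = ThreadPoolExecutor(max_workers=4)


@as_task(executor)
def square(x):
    return x * x


futures = [square(i) for i in range(5)]
results = [f.result() for f in futures]
executor.shutdown()
print(results)
```

The point of the pattern is that the function body stays unchanged while the decorator decides where and how it runs.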
Background Information
______________________
.. Keep the helper text below around in your module by just adding ".. " in front of it, which turns it into a comment
The initial motivation for this library is driven by the ensemble-type calculations that are required in many scientific
fields, and in particular in the materials science domain in which the E-CAM Centre of Excellence operates. The scope
for parallelisation is best contextualised by the `Dask <https://dask.org/>`_ documentation:
A common approach to parallel execution in user-space is task scheduling. In task scheduling we break our program
into many medium-sized tasks or units of computation, often a function call on a non-trivial amount of data. We
represent these tasks as nodes in a graph with edges between nodes if one task depends on data produced by another.
We call upon a task scheduler to execute this graph in a way that respects these data dependencies and leverages
parallelism where possible, multiple independent tasks can be run simultaneously.
Many solutions exist. This is a common approach in parallel execution frameworks. Often task scheduling logic hides
within other larger frameworks (Luigi, Storm, Spark, IPython Parallel, and so on) and so is often reinvented.
Dask is a specification that encodes task schedules with minimal incidental complexity using terms common to all
Python projects, namely dicts, tuples, and callables. Ideally this minimum solution is easy to adopt and understand
by a broad community.
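The "dicts, tuples, and callables" encoding mentioned in the quote can be sketched in plain Python. The evaluator below is a toy written for this illustration (it is serial and is not Dask's scheduler); the graph keys and the ``get`` helper are assumptions chosen for the example.

```python
from operator import add, mul

# A task graph: a dict whose values are either plain data or tuples of
# (callable, *arguments), where string arguments may name other keys.
graph = {
    "x": 1,
    "y": 2,
    "sum": (add, "x", "y"),
    "result": (mul, "sum", 10),
}


def get(graph, key):
    """Evaluate one key of the graph, resolving dependencies recursively."""
    value = graph[key]
    if isinstance(value, tuple) and value and callable(value[0]):
        func, *args = value
        resolved = [
            get(graph, arg) if isinstance(arg, str) and arg in graph else arg
            for arg in args
        ]
        return func(*resolved)
    return value


print(get(graph, "result"))
```

A real scheduler would additionally detect which tasks are independent and run them in parallel; this sketch only shows how the data dependencies are encoded and respected.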
While we were attracted by this approach, Dask did not support *task-level* parallelisation (in particular
multi-node tasks). We researched other options (including Celery, PyCOMPSs, IPyParallel and others) and organised a
workshop that explored some of these (see https://www.cecam.org/workshop-0-1650.html for further details).
Building and Testing
____________________
.. Keep the helper text below around in your module by just adding ".. " in front of it, which turns it into a comment
The library is a Python module and can be installed with
::
python setup.py install
More details about how to install a Python package can be found at, for example, `Install Python packages on the
research computing systems at IU <https://kb.iu.edu/d/acey>`_
To run the tests for the decorators within the library, you need the ``pytest`` Python package. You can run all the
relevant tests from the ``jobqueue_features`` directory with
::
pytest tests/test_decorators.py
Examples of usage can be found in the ``examples`` directory.
Source Code
___________
The latest version of the library is available on the `jobqueue_features GitHub repository
<https://github.com/E-CAM/jobqueue_features>`_; the file specific to this module
is `decorators.py <https://github.com/E-CAM/jobqueue_features/blob/master/jobqueue_features/decorators.py>`_.
(The code that was originally created for this module can be seen in the specific commit `4590a0e427112f
<https://gitlab.e-cam2020.eu/adam/jobqueue_features/tree/4590a0e427112fbf51edff6116e34c90e765baf0>`_
which can be found in the original private repository of the code.)
......@@ -99,7 +99,7 @@ This module supports (as listed in PLUMED documentation):
``UNITS``).
* CV Documentation: all CVs are created by calling ``PLUMEDCV(name,
PLUMEDInterface, definition)``. The returned function can be appied
PLUMEDInterface, definition)``. The returned function can be applied
to a ``Trajectory``. CVs with components should specify the
``components=["c1", "c2", "c3", ...]`` keyword and the corresponding
PLUMED keywords in the ``definition``.
......
.. _ost_s_shooting:
##############################
S-shooting in OpenPathSampling
##############################
.. sidebar:: Software Technical Information
The information in this section describes OpenPathSampling as a whole.
Information specific to the additions in this module is in subsequent
sections.
Language
Python (2.7)
Documentation Tool
Sphinx, numpydoc format (ReST)
Application Documentation
http://openpathsampling.org
Relevant Training Material
http://openpathsampling.org/latest/examples/
Licence
LGPL, v. 2.1 or later
.. contents:: :local:
Authors: Andreas Singraber
This module implements the S-shooting method [1]_ in OpenPathSampling.
Purpose of Module
_________________
S-shooting [1]_ is a recently developed method to determine rate constants of
rare events. It is similar in spirit to the reactive flux method but its
relaxed requirements help to overcome practical problems. The method is based
on a simple shooting algorithm where trajectories are propagated forward and
backward in time for a fixed number of timesteps. The starting points need to
be provided and must lie in the saddle point region. This so-called S region
(hence the name S-shooting) is defined via a suitable reaction coordinate and
must separate the stable states A and B in such a way that no trajectory can
connect A with B without visiting S. In contrast to the reactive flux method
the time derivative of the reaction coordinate is not required, which makes
this approach applicable to systems exhibiting diffusive dynamics along the
reaction coordinate. The S-shooting method can also be applied if the initial
shooting points are taken from a biased simulation. Thus, it is a natural
follow-up to free energy calculations like umbrella sampling and, in
combination with free energy curves, allows the computation of rate constants.
The implementation of the S-shooting method in OpenPathSampling (OPS) is split
into two main parts:
- Forward and backward trajectories started from initial snapshots are
harvested and glued together calling the ``SShootingSimulation`` class. The
user needs to provide the initial snapshots, a suitable definition of the
S region and the desired trajectory length.
- The S-shooting analysis is performed upon calling the ``SShootingAnalysis``
class. Mandatory arguments include the definition of the stable states (A and
B) and of the S region. In case the initial snapshots are taken from a biased
simulation a bias function may be provided as an optional argument.
This module also comes with an IPython example notebook demonstrating the
method by applying it to a one-dimensional system (a Brownian walker in a
double-well potential).
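The harvesting step described above (shoot forward and backward from points in the S region, then glue the segments) can be sketched for a one-dimensional Brownian walker without OPS. Everything here is an assumption made for illustration: the potential, the S region ``|x| < 0.2``, the parameters and the function names are not taken from the module, and the rate analysis itself is not shown.

```python
import math
import random


def force(x):
    """Force for the double-well potential V(x) = (x**2 - 1)**2."""
    return -4.0 * x * (x * x - 1.0)


def propagate(x0, n_steps, rng, dt=1e-3, kT=0.5):
    """Overdamped Langevin (Brownian) dynamics via Euler-Maruyama."""
    sigma = math.sqrt(2.0 * kT * dt)
    traj = []
    x = x0
    for _ in range(n_steps):
        x += force(x) * dt + sigma * rng.gauss(0.0, 1.0)
        traj.append(x)
    return traj


def shoot(x0, n_steps, rng):
    """Glue a backward and a forward segment around the shooting point."""
    forward = propagate(x0, n_steps, rng)
    # Assumption of this toy: for equilibrium overdamped dynamics the
    # backward segment is generated by ordinary propagation and reversed.
    backward = propagate(x0, n_steps, rng)
    return backward[::-1] + [x0] + forward


def in_s_region(x):
    """Hypothetical S region around the barrier top at x = 0."""
    return abs(x) < 0.2


rng = random.Random(42)
initial_points = [-0.1, 0.0, 0.1]
assert all(in_s_region(x) for x in initial_points)

n_steps = 500
trajectories = [shoot(x0, n_steps, rng) for x0 in initial_points]
for traj in trajectories:
    print(len(traj), traj[n_steps])
```

Each glued trajectory has ``2 * n_steps + 1`` frames with the shooting point in the middle, which is the fixed-length structure the method relies on.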
.. [1] Menzl, G., Singraber, A. & Dellago, C. S-shooting: a Bennett–Chandler-like method for the computation of rate constants from committor trajectories. Faraday Discuss. 195, 345–364 (2017), https://doi.org/10.1039/C6FD00124F
Background Information
______________________
This module builds on OpenPathSampling, a Python package for path sampling
simulations. To learn more about OpenPathSampling, you might be interested in
reading:
* OPS documentation: http://openpathsampling.org
* OPS source code: http://github.com/openpathsampling/openpathsampling
Testing
_______
Follow these steps to test the module:
1. Download and install OpenPathSampling (see http://openpathsampling.org/latest/install.html).
.. caution::
This module has been developed alongside a specific OPS version available at
that time. If incompatibilities arise as OPS is further enhanced, please use
version 0.9.5 available here:
https://github.com/openpathsampling/openpathsampling/releases/tag/v0.9.5 .
2. Install the `nose`_ package.
3. Download the source files of the module (see the `Source Code`_ section below).
4. Install the module: change to the ``S-Shooting`` directory and run ``python setup.py install``.
5. Run the tests: execute ``nosetests`` in the ``S-Shooting`` directory.
.. IF YOUR MODULE IS IN OPS CORE:
.. This module has been included in the OpenPathSampling core. Its tests can
.. be run by setting up a developer install of OpenPathSampling and running
.. the command ``nosetests`` from the root directory of the repository.
.. IF YOUR MODULE IS IN A SEPARATE REPOSITORY
.. The tests for this module can be run by downloading its source code,
.. installing its requirements, and running the command ``nosetests`` from the
.. root directory of the repository.
Examples
________
See the ``sshooting-example.ipynb`` IPython notebook in the source directory. Here is the direct link: https://gitlab.e-cam2020.eu/singraber/S-Shooting/blob/master/ops_s_shooting/sshooting-example.ipynb
To run the example execute ``jupyter notebook sshooting-example.ipynb`` in your terminal.
Source Code
___________
.. link the source code
.. IF YOUR MODULE IS IN OPS CORE
.. This module has been merged into OpenPathSampling. It is composed of the
.. following pull requests:
.. * link PRs
.. IF YOUR MODULE IS A SEPARATE REPOSITORY
.. The source code for this module can be found in: URL.
The source code for this module is located here:
https://gitlab.e-cam2020.eu/singraber/S-Shooting
.. tip::
Ultimately, this module will be merged into the official OPS code. Check
the status of the corresponding pull request here:
https://github.com/openpathsampling/openpathsampling/pull/787 .
.. CLOSING MATERIAL -------------------------------------------------------
.. Here are the URL references used
.. _nose: http://nose.readthedocs.io/en/latest/
......@@ -70,7 +70,7 @@ ________
| There are two `example jupyter notebooks <https://gitlab.e-cam2020.eu/hejung/sr_shooter/tree/master/examples>`_ in the example directory of the repository:
| One shows the `general setup of a two way shooting transition path sampling with a shooting range on a toy system <https://gitlab.e-cam2020.eu/hejung/sr_shooter/blob/master/examples/toy_example.ipynb>`_.
| The other is a `comparison between one way shooting and two way shooting from the shooting range <https://gitlab.e-cam2020.eu/hejung/sr_shooter/blob/master/examples/OneWayShooting_vs_TwoWayShooting.ipynb>`_ and shows that path space is explored faster with two way shooting when using a (well placed) shooting range. The reason beeing that the shots initiated at the barrier top have a high probability of success and two way shooting decorrelates faster (if using randomized velocities even faster).
| The other is a `comparison between one way shooting and two way shooting from the shooting range <https://gitlab.e-cam2020.eu/hejung/sr_shooter/blob/master/examples/OneWayShooting_vs_TwoWayShooting.ipynb>`_ and shows that path space is explored faster with two way shooting when using a (well placed) shooting range. The reason being that the shots initiated at the barrier top have a high probability of success and two way shooting decorrelates faster (if using randomized velocities even faster).
Source Code
___________
......
......@@ -49,7 +49,7 @@ This module tries to efficiently find a single transition state frame from each
trajectory. This is done by bisection of the trajectory, depending on the
current committor. For example, if the current committor is too high (too much
ends up in state B) the next index is selected halfway towards the left edge
and the current index is set as the new right edge. This is repeated untill a
and the current index is set as the new right edge. This is repeated until a
committor within a given range is reached or no new frame can be selected.
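The bisection described above can be sketched as follows. This is not the module's code: ``committor`` is a hypothetical callable that maps a frame index to an estimated probability of ending in state B, and the acceptance range of 0.4 to 0.6 is an assumed default.

```python
import math


def find_transition_frame(committor, n_frames, lower=0.4, upper=0.6):
    """Bisect frame indices until committor(index) lies in [lower, upper].

    Returns (index, value), or None if no new frame can be selected.
    """
    left, right = 0, n_frames - 1
    index = (left + right) // 2
    while True:
        p_b = committor(index)
        if lower <= p_b <= upper:
            return index, p_b
        if p_b > upper:
            right = index  # too much ends in B: move towards the left edge
        else:
            left = index   # too much ends in A: move towards the right edge
        new_index = (left + right) // 2
        if new_index == index:
            return None    # interval exhausted, no new frame available
        index = new_index


def toy_committor(i):
    """Smooth, monotonic stand-in for a committor estimate (illustrative)."""
    return 1.0 / (1.0 + math.exp(-(i - 70) / 5.0))


result = find_transition_frame(toy_committor, n_frames=100)
print(result)
```

In practice each committor evaluation is expensive (it requires shooting many trajectories from the frame), which is why halving the search interval is attractive.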
In the end this module returns a dictionary of shape ``{snapshot: comittor
......
.. In ReStructured Text (ReST) indentation and spacing are very important (it is how ReST knows what to do with your
document). For ReST to understand what you intend and to render it correctly please keep the structure of this
template. Make sure that any time you use ReST syntax (such as for ".. sidebar::" below), it is preceded
and followed by white space (if you see warnings when this file is built then this is a common origin for problems).
.. Firstly, let's add technical info as a sidebar and allow text below to wrap around it. This list is a work in
progress, please help us improve it. We use *definition lists* of ReST_ to make this readable.
.. sidebar:: Software Technical Information
This module extends the contact_maps project.
Name
contact_maps
Language
Python 2.7, 3.5, 3.6
Licence
LGPL 2.1+
Documentation Tool
Sphinx/RST
Application Documentation
http://contact-map.readthedocs.io/
Relevant Training Material
TODO
Software Module Developed by
David W.H. Swenson
.. _contact_concurrences:
####################
Contact Concurrences
####################
.. Let's add a local table of contents to help people navigate the page
.. contents:: :local:
.. Add an abstract for a *general* audience here. Write a few lines that
explains the "helicopter view" of why you are creating this module. For
example, you might say that "This module is a stepping stone to
incorporating XXXX effects into YYYY process, which in turn should allow
ZZZZ to be simulated. If successful, this could make it possible to
produce compound AAAA while avoiding expensive process BBBB and CCCC."
This module deals with the analysis of contacts between parts of
biomolecules based on "contact concurrences," i.e., what contacts occur
simultaneously during a trajectory. This is useful when using contacts as a
definition of a metastable state in a trajectory.
Purpose of Module
_________________
Contact frequencies, as developed in the module :ref:`contact-map`, are a
useful tool for studying biomolecular systems, such as binding/unbinding of
a ligand from a protein. However, they suffer from one problem when trying
to use them to define metastable states: since they are averaged over time,
they don't show time-dependent behavior. To identify a stable state,
time-dependent behavior must be considered.
For example, a particular contact pair might have a frequency of 0.1 during
a 100 ns trajectory. But this could be achieved in several ways. If the
contact events are randomly distributed through time, this contact probably
isn't characteristic of a metastable state. On the other hand, if the
contact is constantly present during the last 10 ns (and not otherwise
present), it might represent a metastable state. More importantly, there
might be multiple contacts that are *all* present during those last 10 ns.
Those concurrent contacts could be used to define a metastable state. This
module helps identify and analyze those concurrent contacts by providing a
tool to visualize them.
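The difference between the two scenarios can be made concrete with a small sketch. It does not use the ``contact_map`` API; the boolean lists, the frame spacing of 1 ns and the helper names are assumptions for illustration.

```python
def frequency(contact):
    """Fraction of frames in which the contact is present."""
    return sum(contact) / len(contact)


def longest_run(contact):
    """Length of the longest stretch of consecutive frames with the contact."""
    best = current = 0
    for present in contact:
        current = current + 1 if present else 0
        best = max(best, current)
    return best


n_frames = 100  # assume one frame per ns of a 100 ns trajectory

# Same contact frequency, very different time-dependent behaviour.
scattered = [i % 10 == 0 for i in range(n_frames)]   # one frame in every ten
persistent = [i >= 90 for i in range(n_frames)]      # only the last 10 ns

for name, contact in [("scattered", scattered), ("persistent", persistent)]:
    print(name, frequency(contact), longest_run(contact))
```

Both series have a frequency of 0.1, but only the second contains a long uninterrupted stretch, which is the kind of signal a time-averaged contact frequency cannot show.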
.. figure:: concurrences.png
:alt: Output of contact concurrence visualization
:figwidth: 50 %
:align: right
The figure shows the output of the contact concurrence visualization for the
contacts between an inhibitor (labelled YYG) and various residues of the
protein GSK3B. The plot shows when each contact occurred. The x-axis is
time. Each dot represents that a specific contact pair is present at that
time. The contact pairs are separated along the vertical axis.
This trajectory shows two groups of stable contacts between the protein and
the ligand; i.e. there is a change in the stable state. This allows us to
visually identify the contacts involved in each state. Both states involve
the ligand being in contact with Phe33, but the earlier state includes
contacts with Ile28, Gly29, etc., while the later state includes contacts
with Ser32 and Gly168.
This is an important tool for identifying stable states based on long-lived
groups of contacts, and is being used as part of the `E-CAM pilot project on
binding kinetics <https://www.e-cam2020.eu/pilot-project-biki/>`_. It has
also been used as part of a bachelor's thesis project to develop an automated
approach to identifying metastable intermediates during binding/unbinding
processes.
Classes implemented in this module include:
* ``Concurrence``: Superclass for contact concurrence objects, enabling
future custom concurrence types.
* ``AtomContactConcurrence``: Contact concurrences for atom-atom contacts.
* ``ResidueContactConcurrence``: Contact concurrences for residue-residue
contacts (based on minimum distance between constituent atoms).
* ``ConcurrencePlotter`` and ``plot_concurrences``: Class and convenience
function (respectively) for making plots of contact concurrence.
* ``ContactsDict``: Dict-like object giving access to atom or residue
contacts based on string keys. Also added ``ContactObject.contacts``
property, which returns a ``ContactsDict`` object for the
``ContactObject``.
.. * Who will use the module? in what area(s) and in what context?
.. * What kind of problems can be solved by the code?
.. * Are there any real-world applications for it?
.. * Has the module been interfaced with other packages?
.. * Was it used in a thesis, a scientific collaboration, or was it cited in
.. a publication?
.. * If there are published results obtained using this code, describe them
briefly in terms readable for non-expert users. If you have few
pictures/graphs illustrating the power or utility of the module, please
include them with corresponding explanatory captions.
Background Information
______________________
This module is part of the `contact_map
<http://contact-map.readthedocs.io>`_ project, which builds on tools from
`MDTraj <http://mdtraj.org>`_.
Building and Testing
____________________
.. Keep the helper text below around in your module by just adding ".. " in front of it, which turns it into a comment
This module will be included in the 0.4 release of ``contact_map``. After
that release, it can be easily installed with ``conda``, using ``conda
install -c conda-forge contact_map``, or ``conda install -c conda-forge
contact_map==0.4.0`` for the first version that includes this module. To see
the current release, go to https://pypi.org/project/contact-map/#history.
Until the release, this module can only be installed through a developer
install of ``contact_map``. This involves downloading the ``contact_map``
repository, installing the requirements, and then installing the
``contact_map`` package from source. Instructions can be found on the
`installation page
<http://contact-map.readthedocs.io/en/latest/installing.html#developer-installation>`_
of the ``contact_map`` documentation.
Once installed, tests are run using pytest. To check that the code has been
correctly installed, run ``python -c "import contact_map"`` from the command
line. To run the tests, install pytest and run the command ``py.test
--pyargs contact_map``.
Examples
--------
An example can be found in the documentation to the ``contact_map`` paper:
[`docs <https://contact-map.readthedocs.io/en/latest/examples/nb/concurrences.html>`_ | `GitHub <https://github.com/dwhswenson/contact_map/blob/master/examples/concurrences.ipynb>`_]
Source Code
___________
.. Notice the syntax of a URL reference below `Text <URL>`_ the backticks matter!
The source code for this module is contained in the following pull requests
in the ``contact_map`` repository:
* https://github.com/dwhswenson/contact_map/pull/28
* https://github.com/dwhswenson/contact_map/pull/47
......@@ -63,7 +63,7 @@ Testing
_______
This module can be installed with conda, using ``conda install -c
conda-forge contact_map``. To intall the specific version associated with
conda-forge contact_map``. To install the specific version associated with
this module, use ``conda install -c conda-forge contact_map==0.2.0``
Tests for this module can be run with pytest. Install pytest with ``pip
......
.. In ReStructured Text (ReST) indentation and spacing are very important (it is how ReST knows what to do with your
document). For ReST to understand what you intend and to render it correctly please keep the structure of this
template. Make sure that any time you use ReST syntax (such as for ".. sidebar::" below), it is preceded
and followed by white space (if you see warnings when this file is built then this is a common origin for problems).
.. Firstly, let's add technical info as a sidebar and allow text below to wrap around it. This list is a work in
progress, please help us improve it. We use *definition lists* of ReST_ to make this readable.
.. sidebar:: Software Technical Information
Name
OpenMMTools
Language
Python (3.6, 3.7)