From d2aba039b658485f9e2c0ea1416ba2e10f2f019c Mon Sep 17 00:00:00 2001 From: Salmenjoki Henri Date: Fri, 18 Oct 2019 09:42:14 +0300 Subject: [PATCH 1/3] new module on optimization of paradis running on amd rome --- .../paradis_precipitate_amd_rome/readme.rst | 121 ++++++++++++++++++ 1 file changed, 121 insertions(+) create mode 100644 Meso-Multi-Scale-Modelling-Modules/modules/paradis_precipitate/paradis_precipitate_amd_rome/readme.rst diff --git a/Meso-Multi-Scale-Modelling-Modules/modules/paradis_precipitate/paradis_precipitate_amd_rome/readme.rst b/Meso-Multi-Scale-Modelling-Modules/modules/paradis_precipitate/paradis_precipitate_amd_rome/readme.rst new file mode 100644 index 00000000..5fb55979 --- /dev/null +++ b/Meso-Multi-Scale-Modelling-Modules/modules/paradis_precipitate/paradis_precipitate_amd_rome/readme.rst @@ -0,0 +1,121 @@ +.. In ReStructured Text (ReST) indentation and spacing are very important (it is how ReST knows what to do with your + document). For ReST to understand what you intend and to render it correctly please to keep the structure of this + template. Make sure that any time you use ReST syntax (such as for ".. sidebar::" below), it needs to be preceded + and followed by white space (if you see warnings when this file is built they this is a common origin for problems). + + +.. Firstly, let's add technical info as a sidebar and allow text below to wrap around it. This list is a work in + progress, please help us improve it. We use *definition lists* of ReST_ to make this readable. + +.. sidebar:: Software Technical Information + + Name + ParaDiS_Precipitate_HPC + + + Language + C++ + + Licence + This is patch based on the ParaDIS version 2.5.1. The additions are GPL. + + Documentation Tool + Sphinx + + Application Documentation + http://paradis.stanford.edu/ + + Relevant Training Material + https://version.aalto.fi/gitlab/csm_open/paradis_version_diffs/tree/master/test_run + + +.. 
In the next line you have the name of how this module will be referenced in the main documentation (which you can + reference, in this case, as ":ref:`example`"). You *MUST* change the reference below from "example" to something + unique otherwise you will cause cross-referencing errors. The reference must come right before the heading for the + reference to work (so don't insert a comment between). + +.. _paradis_precipitate_hpc: + +###################################################### +ParaDiS with precipitates optimized to HPC environment +###################################################### + +.. Let's add a local table of contents to help people navigate the page + +.. contents:: :local: + +.. Add an abstract for a *general* audience here. Write a few lines that explains the "helicopter view" of why you are + creating this module. For example, you might say that "This module is a stepping stone to incorporating XXXX effects + into YYYY process, which in turn should allow ZZZZ to be simulated. If successful, this could make it possible to + produce compound AAAA while avoiding expensive process BBBB and CCCC." + +Discrete dislocation dynamics (DDD) simulations usually treat with "pure" crystals and dislocations in them. In reality, there is a need to look at more +complicated scenarios of impurities interacting with the dislocations and their motion. Effects on a single atom / vacancy level may be +incorporated by renormalizing the dislocation mobility but in many cases the dislocation dynamics is changed by the presence of clusters or precipitates, +that act as local pinning centers. The consequences of the impurities are multiple: the yield stress is changed, and in general the plastic deformation +process is greatly affected. 
Simulating these by DDD allows to look at a large number of issues from materials design to controlling the yield stress and +may be done in a multiscale manner by computing the dislocation-precipitate interactions from microscopic simulations or by coarse-graining the DDD +results for the stress-strain curves on the mesoscopic scale to more macroscopic Finite Element Method (the material model therein). + +This module provides +an extension of the ParaDIS DDD code (LLNL, http://paradis.stanford.edu/) where dislocation/precipitate interactions are included. The extension is for an HPC environment, in which the original code has been optimized for the Mahti cluster running on AMD Rome environment at CSC in Finland in mind. + +Purpose of Module +_________________ + +.. Keep the helper text below around in your module by just adding ".. " in front of it, which turns it into a comment + +The method is based on extending a recent version of ParaDIS to handle the presence of pinning centers. These work as localized Gaussian potentials that +interact with the near-by dislocations (see A. Lehtinen et al. Phys. Rev. E 93, 013309 (2016)). The "disorder field" is given as an input where the locations +of the precipates are given in 3D, and the interactions are parametrized by the impurity strength (which may vary from precipitate to another) and the range +of the Gaussian potential (which also may vary). The dislocation dynamics is handled as in ParaDIS in general with an additional force terms that accounts for +each dislocation segment for the nearby impurities (a cut-off is applied in the force). + +The Module thus allows to study various precipitate fields (density, geometry, strength, interaction range) as desired. In a typical ParaDIS simulation one +does a simulation of the response of a dislocation system to a strain/stress protocol. 
The starting point is a dislocation system, which has been obtained from +relaxing a random or patterned configuration under zero external stress until the evolution becomes negligible. In the presence of impurities the customary approach +is to do two relaxation steps: first follow the relaxation of dislocation configuration, then add the disorder field to that and re-relax. In the current version apart from HPC-related parallelization-relevant steps the subroutines SegSegForce (segment-to-segment force calculation) and FMSigma2Core2 (force multipole expansion) are well vectorized, and the code now also uses better multiple threads in their context. + +Background Information +______________________ + +.. Keep the helper text below around in your module by just adding ".. " in front of it, which turns it into a comment + +The module version is built on the ParaDIS version 2.5.1 which can be obtained from http://paradis.stanford.edu/ and +following the steps outlined there for obtaining the code. + +Building and Testing +____________________ + +.. Keep the helper text below around in your module by just adding ".. " in front of it, which turns it into a comment + +The version offered is built exactly like the normal ParaDIS; the makefiles etc. are for the local CSC system and should be +modified for the local environment. To test the ParaDiS build, an example case of a constant strain rate simulation of BCC iron with precipitates is included. +The input of the test simulation is in file ParaDiS_test.ctrl, where the output directories and the used number of computational domains need to be defined. +The initial dislocation structure is contained in the ParaDiS_test.data as usual and the structure of the file is identical to the files used by default ParaDiS. +In addition, the simulation has ~8500 precipitates which are included in the ParaDiS_test.pdata file. This .pdata file has first some domain variables defined similar to .data file, +and then the precipitates. 
These are presented one precipitate per line, and the data columns are as follows: [precipitate tag, position x, y and z, +impurity strength, interaction radius, boolean], where the boolean states if the precipitate is active. + +The used printing options defined in .ctrl file can be modified. Here, examples of the output property data and restart files are included in run_output folder +and the file called ParaDiS_test.out contains the standard output of the test when the simulation system is run for ~1.5e-9 seconds. The restart files are written +similarly as in unmodified ParaDiS, except that now the precipitates are also included in corresponding rsXXXX.pdata files. In addition to the property files produced +by original ParaDiS, the modified ParaDiS writes also files allepsdot and avalanche. Allepsdot contains columns [simulations time, strain rate tensor element 11, stress +tensor element 11,...], and avalanche columns [time, average velocity, plastic strain, applied stress, total dislocation length, integrated strain rate] where the average +velocity is calculated as a segment weighted average velocity of dislocations. + +The test case is illustrated with three files: ParaDiS_test.out, and two plots, which are: +aver_velocity_time.pdf (the resulting average dislocation velocity during the run) and stress_plastic_strain.pdf (yield strain versus applied stres during the run). + + + + +Source Code +___________ + +.. Notice the syntax of a URL reference below `Text `_ + + + +Due to licensing reasons, only the difference between ParaDiS version 2.5.1 files and modified files are submitted, and these files can be found in ``_. 
+ -- GitLab From e5e02b7c62bb08a3b2c23285e930839234dd5508 Mon Sep 17 00:00:00 2001 From: Salmenjoki Henri Date: Mon, 1 Jun 2020 17:03:35 +0300 Subject: [PATCH 2/3] updated readme and chart to now include the results of the optimization --- .../paradis_precipitate_amd_rome/chart.png | Bin 0 -> 8271 bytes .../paradis_precipitate_amd_rome/readme.rst | 280 ++++++++++-------- 2 files changed, 159 insertions(+), 121 deletions(-) create mode 100644 Meso-Multi-Scale-Modelling-Modules/modules/paradis_precipitate/paradis_precipitate_amd_rome/chart.png diff --git a/Meso-Multi-Scale-Modelling-Modules/modules/paradis_precipitate/paradis_precipitate_amd_rome/chart.png b/Meso-Multi-Scale-Modelling-Modules/modules/paradis_precipitate/paradis_precipitate_amd_rome/chart.png new file mode 100644 index 0000000000000000000000000000000000000000..79c24165767342d3e36bf177b9845ada852fcc78 Binary files /dev/null and b/Meso-Multi-Scale-Modelling-Modules/modules/paradis_precipitate/paradis_precipitate_amd_rome/chart.png differ diff --git a/Meso-Multi-Scale-Modelling-Modules/modules/paradis_precipitate/paradis_precipitate_amd_rome/readme.rst b/Meso-Multi-Scale-Modelling-Modules/modules/paradis_precipitate/paradis_precipitate_amd_rome/readme.rst index 5fb55979..1b8c8562 100644 --- a/Meso-Multi-Scale-Modelling-Modules/modules/paradis_precipitate/paradis_precipitate_amd_rome/readme.rst +++ b/Meso-Multi-Scale-Modelling-Modules/modules/paradis_precipitate/paradis_precipitate_amd_rome/readme.rst @@ -1,121 +1,159 @@ -.. In ReStructured Text (ReST) indentation and spacing are very important (it is how ReST knows what to do with your - document). For ReST to understand what you intend and to render it correctly please to keep the structure of this - template. 
Make sure that any time you use ReST syntax (such as for ".. sidebar::" below), it needs to be preceded - and followed by white space (if you see warnings when this file is built they this is a common origin for problems). - - -.. Firstly, let's add technical info as a sidebar and allow text below to wrap around it. This list is a work in - progress, please help us improve it. We use *definition lists* of ReST_ to make this readable. - -.. sidebar:: Software Technical Information - - Name - ParaDiS_Precipitate_HPC - - - Language - C++ - - Licence - This is patch based on the ParaDIS version 2.5.1. The additions are GPL. - - Documentation Tool - Sphinx - - Application Documentation - http://paradis.stanford.edu/ - - Relevant Training Material - https://version.aalto.fi/gitlab/csm_open/paradis_version_diffs/tree/master/test_run - - -.. In the next line you have the name of how this module will be referenced in the main documentation (which you can - reference, in this case, as ":ref:`example`"). You *MUST* change the reference below from "example" to something - unique otherwise you will cause cross-referencing errors. The reference must come right before the heading for the - reference to work (so don't insert a comment between). - -.. _paradis_precipitate_hpc: - -###################################################### -ParaDiS with precipitates optimized to HPC environment -###################################################### - -.. Let's add a local table of contents to help people navigate the page - -.. contents:: :local: - -.. Add an abstract for a *general* audience here. Write a few lines that explains the "helicopter view" of why you are - creating this module. For example, you might say that "This module is a stepping stone to incorporating XXXX effects - into YYYY process, which in turn should allow ZZZZ to be simulated. If successful, this could make it possible to - produce compound AAAA while avoiding expensive process BBBB and CCCC." 
- -Discrete dislocation dynamics (DDD) simulations usually treat with "pure" crystals and dislocations in them. In reality, there is a need to look at more -complicated scenarios of impurities interacting with the dislocations and their motion. Effects on a single atom / vacancy level may be -incorporated by renormalizing the dislocation mobility but in many cases the dislocation dynamics is changed by the presence of clusters or precipitates, -that act as local pinning centers. The consequences of the impurities are multiple: the yield stress is changed, and in general the plastic deformation -process is greatly affected. Simulating these by DDD allows to look at a large number of issues from materials design to controlling the yield stress and -may be done in a multiscale manner by computing the dislocation-precipitate interactions from microscopic simulations or by coarse-graining the DDD -results for the stress-strain curves on the mesoscopic scale to more macroscopic Finite Element Method (the material model therein). - -This module provides -an extension of the ParaDIS DDD code (LLNL, http://paradis.stanford.edu/) where dislocation/precipitate interactions are included. The extension is for an HPC environment, in which the original code has been optimized for the Mahti cluster running on AMD Rome environment at CSC in Finland in mind. - -Purpose of Module -_________________ - -.. Keep the helper text below around in your module by just adding ".. " in front of it, which turns it into a comment - -The method is based on extending a recent version of ParaDIS to handle the presence of pinning centers. These work as localized Gaussian potentials that -interact with the near-by dislocations (see A. Lehtinen et al. Phys. Rev. E 93, 013309 (2016)). 
The "disorder field" is given as an input where the locations -of the precipates are given in 3D, and the interactions are parametrized by the impurity strength (which may vary from precipitate to another) and the range -of the Gaussian potential (which also may vary). The dislocation dynamics is handled as in ParaDIS in general with an additional force terms that accounts for -each dislocation segment for the nearby impurities (a cut-off is applied in the force). - -The Module thus allows to study various precipitate fields (density, geometry, strength, interaction range) as desired. In a typical ParaDIS simulation one -does a simulation of the response of a dislocation system to a strain/stress protocol. The starting point is a dislocation system, which has been obtained from -relaxing a random or patterned configuration under zero external stress until the evolution becomes negligible. In the presence of impurities the customary approach -is to do two relaxation steps: first follow the relaxation of dislocation configuration, then add the disorder field to that and re-relax. In the current version apart from HPC-related parallelization-relevant steps the subroutines SegSegForce (segment-to-segment force calculation) and FMSigma2Core2 (force multipole expansion) are well vectorized, and the code now also uses better multiple threads in their context. - -Background Information -______________________ - -.. Keep the helper text below around in your module by just adding ".. " in front of it, which turns it into a comment - -The module version is built on the ParaDIS version 2.5.1 which can be obtained from http://paradis.stanford.edu/ and -following the steps outlined there for obtaining the code. - -Building and Testing -____________________ - -.. Keep the helper text below around in your module by just adding ".. " in front of it, which turns it into a comment - -The version offered is built exactly like the normal ParaDIS; the makefiles etc. 
are for the local CSC system and should be -modified for the local environment. To test the ParaDiS build, an example case of a constant strain rate simulation of BCC iron with precipitates is included. -The input of the test simulation is in file ParaDiS_test.ctrl, where the output directories and the used number of computational domains need to be defined. -The initial dislocation structure is contained in the ParaDiS_test.data as usual and the structure of the file is identical to the files used by default ParaDiS. -In addition, the simulation has ~8500 precipitates which are included in the ParaDiS_test.pdata file. This .pdata file has first some domain variables defined similar to .data file, -and then the precipitates. These are presented one precipitate per line, and the data columns are as follows: [precipitate tag, position x, y and z, -impurity strength, interaction radius, boolean], where the boolean states if the precipitate is active. - -The used printing options defined in .ctrl file can be modified. Here, examples of the output property data and restart files are included in run_output folder -and the file called ParaDiS_test.out contains the standard output of the test when the simulation system is run for ~1.5e-9 seconds. The restart files are written -similarly as in unmodified ParaDiS, except that now the precipitates are also included in corresponding rsXXXX.pdata files. In addition to the property files produced -by original ParaDiS, the modified ParaDiS writes also files allepsdot and avalanche. Allepsdot contains columns [simulations time, strain rate tensor element 11, stress -tensor element 11,...], and avalanche columns [time, average velocity, plastic strain, applied stress, total dislocation length, integrated strain rate] where the average -velocity is calculated as a segment weighted average velocity of dislocations. 
- -The test case is illustrated with three files: ParaDiS_test.out, and two plots, which are: -aver_velocity_time.pdf (the resulting average dislocation velocity during the run) and stress_plastic_strain.pdf (yield strain versus applied stres during the run). - - - - -Source Code -___________ - -.. Notice the syntax of a URL reference below `Text `_ - - - -Due to licensing reasons, only the difference between ParaDiS version 2.5.1 files and modified files are submitted, and these files can be found in ``_. - +:orphan: + +.. sidebar:: Software Technical Information + + Name + ParaDiS_Precipitate_GC optimized for AMD Zen2 + + Language + C++ + + Licence + Extension is based on ParaDIS version 2.5.1. The additions in the + extension are GPL. + + Documentation Tool + Sphinx + + Application Documentation + http://paradis.stanford.edu/ + + Relevant Training Material + https://version.aalto.fi/gitlab/csm_open/paradis_version_diffs/tree/master/test_run + + Software Module Developed by + Phuong Nguyen (phuong.nguyen@csc.fi) + +.. _paradis_rome: + +################################################ +ParaDiS with precipitates optimized for AMD Zen2 +################################################ + +.. contents:: :local: + +Discrete dislocation dynamics (DDD) simulations usually treat with “pure” +crystals and dislocations in them. An extension of the ParaDIS DDD code (LLNL, +http://paradis.stanford.edu/) that includes dislocation/precipitate +interactions has been developed (E-CAM module: `ParaDiS with precipitates`_). + +This module provides a guide for optimal porting of the +`ParaDiS with precipitates`_ to the AMD Rome CPUs, in preparation for the +`Mahti supercomputer`_ service at CSC, Finland. +Mahti is an Atos BullSequana XH2000 system consisting of 1404 nodes each with two 64-core AMD Zen2 CPUs (AMD EPYC 7H42, 2.6GHz). 
+Since Mahti is not ready for general access at this moment, the module was +prepared based on a single testing node which has 2 AMD EPYC 7742 @2.25GHz (128 cores in total). + +By choosing a suitable compiler and compiler optimization flags, the application works +more efficiently on the target platform. On the testing node, Intel compilers with either AVX or +AVX2 vector sets gives the best performance for *ParaDiS with precipitates*. Alternatively, GCC compilers with AVX2 vector works as competitive as the Intel ones. + + +Purpose of Module +_________________ + +This module helps to run simulations of the *ParaDiS with precipitates* more +efficiently. By using a suitable set of optimization flags for compilers, +especially the one determining vectorization type, the best library routines +can be chosen. + + +Background Information +______________________ + +The module is based on the ParaDiS (http://paradis.stanford.edu/) +extension `ParaDiS with precipitates`_. + + +Building and Testing +____________________ + +Build instructions for `ParaDiS with precipitates`_ are provided with the +extension. + +Different compilers and compiler options were tested to find the most optimal +ones for the Zen2 architecture. Figure 1 (below) shows a comparison +of normalized running times between different vectorization extensions and +compilers. On the testing platform, Intel compilers with either AVX or AVX2 helps the +application to achieve good performance. Alternatively, GCC compilers with AVX2 can be used to obtain the same performance as the Intel ones. + +Table 1 presents a comparison of different optimization flags +for the Intel and GCC compilers. For the Intel compilers, the optimal performance is reached with the compiler +options: ``-O3 -mavx2`` or ``-O3 -mavx``. For the GCC compilers, +``-O2 -march=znver2 -pipe -fomit-frame-pointer -ftree-vectorize`` compiler options +help the application to gain a good performance. + + +.. 
figure:: chart.png + :alt: Figure1 + :width: 600px + + Figure 1: Comparison of normalized times between different compilers and + vectorization extentions (smaller is better) + + +*Table 1: Comparison between different optimization flag options* + +.. list-table:: + :widths: 15 40 15 + :header-rows: 1 + + * - Compilers + - Flags + - Time (s) + * - Intel + - -O2 -axCORE-AVX2 + - 328 + * - + - -O2 -axHASWELL + - 360 + * - + - -O2 -mavx2 + - 309 + * - + - -O3 -mavx2 + - 295 + * - + - -O3 -mavx + - 298 + * - + - -Ofast -mavx2 + - 301 + * - + - -O3 -mavx2 -funroll-all-loops + - 317 + * - GCC + - -O2 -march=znver1 -pipe -fomit-frame-pointer -ftree-vectorize + - 352 + * - + - -O2 -march=znver2 -pipe -fomit-frame-pointer -ftree-vectorize + - 296 + * - + - -O3 -march=znver2 + - 382 + * - + - -O2 -march=haswell -pipe -fomit-frame-pointer -ftree-vectorize + - 352 + +``*`` *The input case in these tests is different to the one at* `ParaDiS with precipitates optimized for Puhti`_ . + + +Besides, in the `ParaDiS with precipitates optimized to HPC environment`_, +it's written that using multiple threads through a hybrid OpenMP and MPI model speeds up +the calculation up to 1.5 factors, especially for large-scale simulations. +However, this combination did not give an advantage of performance on the Zen2 +testing machine. Thus, using a single thread for each MPI process is recommended. + + +Source Code +___________ + +Source code modifications for the extension *ParaDiS with precipitates* are +available here: +https://version.aalto.fi/gitlab/csm_open/paradis_version_diffs.git. + + +.. _ParaDiS with precipitates: https://e-cam.readthedocs.io/en/latest/Meso-Multi-Scale-Modelling-Modules/modules/paradis_precipitate/paradis_precipitate_GC/readme.html +.. _ParaDiS with precipitates optimized to HPC environment: https://e-cam.readthedocs.io/en/latest/Meso-Multi-Scale-Modelling-Modules/modules/paradis_precipitate/paradis_precipitate_HPC/readme.html +.. 
_ParaDiS with precipitates optimized for Puhti: https://gitlab.csc.fi/hpc-support/e-cam-library/tree/paradis-rome/Meso-Multi-Scale-Modelling-Modules/modules/paradis_precipitate/paradis_optimized_puhti +.. _Mahti supercomputer: https://research.csc.fi/techspecs~Mahti -- GitLab From 5dd5de60406012dda7620609512cd832c06cc7e1 Mon Sep 17 00:00:00 2001 From: Alan O'Cais Date: Wed, 31 Mar 2021 09:09:34 +0000 Subject: [PATCH 3/3] Tidy up, add to ToC --- Meso-Multi-Scale-Modelling-Modules/index.rst | 1 + .../paradis_precipitate_amd_rome/readme.rst | 37 +++++++++++-------- 2 files changed, 23 insertions(+), 15 deletions(-) diff --git a/Meso-Multi-Scale-Modelling-Modules/index.rst b/Meso-Multi-Scale-Modelling-Modules/index.rst index 0036d60d..68ca9447 100644 --- a/Meso-Multi-Scale-Modelling-Modules/index.rst +++ b/Meso-Multi-Scale-Modelling-Modules/index.rst @@ -120,6 +120,7 @@ The following modules connected to the ParaDiS code have been produced so far: ./modules/paradis_precipitate/paradis_precipitate_GC/readme ./modules/paradis_precipitate/paradis_precipitate_HPC/readme + ./modules/paradis_precipitate/paradis_precipitate_amd_rome/readme GC-AdResS diff --git a/Meso-Multi-Scale-Modelling-Modules/modules/paradis_precipitate/paradis_precipitate_amd_rome/readme.rst b/Meso-Multi-Scale-Modelling-Modules/modules/paradis_precipitate/paradis_precipitate_amd_rome/readme.rst index 1b8c8562..f3786ed0 100644 --- a/Meso-Multi-Scale-Modelling-Modules/modules/paradis_precipitate/paradis_precipitate_amd_rome/readme.rst +++ b/Meso-Multi-Scale-Modelling-Modules/modules/paradis_precipitate/paradis_precipitate_amd_rome/readme.rst @@ -1,5 +1,3 @@ -:orphan: - .. sidebar:: Software Technical Information Name @@ -32,7 +30,7 @@ ParaDiS with precipitates optimized for AMD Zen2 .. contents:: :local: -Discrete dislocation dynamics (DDD) simulations usually treat with “pure” +Discrete dislocation dynamics (DDD) simulations usually deal with "pure" crystals and dislocations in them. 
An extension of the ParaDIS DDD code (LLNL, http://paradis.stanford.edu/) that includes dislocation/precipitate interactions has been developed (E-CAM module: `ParaDiS with precipitates`_). @@ -40,13 +38,19 @@ interactions has been developed (E-CAM module: `ParaDiS with precipitates`_). This module provides a guide for optimal porting of the `ParaDiS with precipitates`_ to the AMD Rome CPUs, in preparation for the `Mahti supercomputer`_ service at CSC, Finland. -Mahti is an Atos BullSequana XH2000 system consisting of 1404 nodes each with two 64-core AMD Zen2 CPUs (AMD EPYC 7H42, 2.6GHz). +Mahti is an Atos BullSequana XH2000 system consisting of 1404 nodes each with +two 64-core AMD Zen2 CPUs (AMD EPYC 7H12, 2.6GHz). Since Mahti is not ready for general access at this moment, the module was -prepared based on a single testing node which has 2 AMD EPYC 7742 @2.25GHz (128 cores in total). +prepared based on a single testing node which has 2 AMD EPYC 7742 @2.25GHz +(128 cores in total). -By choosing a suitable compiler and compiler optimization flags, the application works -more efficiently on the target platform. On the testing node, Intel compilers with either AVX or -AVX2 vector sets gives the best performance for *ParaDiS with precipitates*. Alternatively, GCC compilers with AVX2 vector works as competitive as the Intel ones. +By choosing a suitable compiler and compiler optimization flags, the application +works +more efficiently on the target platform. On the testing node, Intel compilers with +either AVX or +AVX2 vector sets give the best performance for *ParaDiS with precipitates*. +Alternatively, GCC compilers with AVX2 vector support are competitive with +the Intel compilers. Purpose of Module @@ -75,13 +79,15 @@ Different compilers and compiler options were tested to find the most optimal ones for the Zen2 architecture. Figure 1 (below) shows a comparison of normalized running times between different vectorization extensions and compilers. 
On the testing platform, Intel compilers with either AVX or AVX2 helps the -application to achieve good performance. Alternatively, GCC compilers with AVX2 can be used to obtain the same performance as the Intel ones. +application to achieve good performance. Alternatively, GCC compilers with AVX2 can +be used to obtain nearly the same performance as the Intel ones. Table 1 presents a comparison of different optimization flags -for the Intel and GCC compilers. For the Intel compilers, the optimal performance is reached with the compiler +for the Intel and GCC compilers. For the Intel compilers, the optimal performance +is reached with the compiler options: ``-O3 -mavx2`` or ``-O3 -mavx``. For the GCC compilers, ``-O2 -march=znver2 -pipe -fomit-frame-pointer -ftree-vectorize`` compiler options -help the application to gain a good performance. +help the application to achieve good performance. .. figure:: chart.png @@ -89,7 +95,7 @@ help the application to gain a good performance. :width: 600px Figure 1: Comparison of normalized times between different compilers and - vectorization extentions (smaller is better) + vectorization extensions (smaller is better) *Table 1: Comparison between different optimization flag options* @@ -135,12 +141,13 @@ help the application to gain a good performance. - -O2 -march=haswell -pipe -fomit-frame-pointer -ftree-vectorize - 352 -``*`` *The input case in these tests is different to the one at* `ParaDiS with precipitates optimized for Puhti`_ . +``*`` *The input case in these tests is different from the one at* +`ParaDiS with precipitates optimized for Puhti`_ . -Besides, in the `ParaDiS with precipitates optimized to HPC environment`_, +In the `ParaDiS with precipitates optimized to HPC environment`_, it's written that using multiple threads through a hybrid OpenMP and MPI model speeds up -the calculation up to 1.5 factors, especially for large-scale simulations. 
+the calculation up to a factor of 1.5, especially for large-scale simulations. However, this combination did not give an advantage of performance on the Zen2 testing machine. Thus, using a single thread for each MPI process is recommended. -- GitLab
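Note on Table 1 (not part of the patch): the flag comparison in the readme can be checked with a few lines of Python. This is only a minimal sketch. The times are copied from Table 1 as it appears in PATCH 2/3, while the names ``TABLE_1``, ``normalized_times`` and ``best_per_compiler`` are hypothetical helpers introduced here for illustration. Normalizing against the fastest run is an assumption; the reference used for Figure 1 is not stated in the readme.

```python
# Wall-clock times in seconds, copied from Table 1 of the readme.
# Each row is (compiler, flags, time_in_seconds).
TABLE_1 = [
    ("Intel", "-O2 -axCORE-AVX2", 328),
    ("Intel", "-O2 -axHASWELL", 360),
    ("Intel", "-O2 -mavx2", 309),
    ("Intel", "-O3 -mavx2", 295),
    ("Intel", "-O3 -mavx", 298),
    ("Intel", "-Ofast -mavx2", 301),
    ("Intel", "-O3 -mavx2 -funroll-all-loops", 317),
    ("GCC", "-O2 -march=znver1 -pipe -fomit-frame-pointer -ftree-vectorize", 352),
    ("GCC", "-O2 -march=znver2 -pipe -fomit-frame-pointer -ftree-vectorize", 296),
    ("GCC", "-O3 -march=znver2", 382),
    ("GCC", "-O2 -march=haswell -pipe -fomit-frame-pointer -ftree-vectorize", 352),
]


def normalized_times(rows):
    """Return (compiler, flags, time / fastest_time), fastest first.

    Assumption: the reference is the fastest run in the table, which may
    differ from the normalization used for Figure 1.
    """
    fastest = min(seconds for _, _, seconds in rows)
    ratios = [(c, f, seconds / fastest) for c, f, seconds in rows]
    return sorted(ratios, key=lambda row: row[2])


def best_per_compiler(rows):
    """Return {compiler: (flags, seconds)} for the fastest run of each compiler."""
    best = {}
    for compiler, flags, seconds in rows:
        if compiler not in best or seconds < best[compiler][1]:
            best[compiler] = (flags, seconds)
    return best


if __name__ == "__main__":
    for compiler, flags, ratio in normalized_times(TABLE_1):
        print(f"{compiler:5s} {ratio:5.2f}  {flags}")
    print(best_per_compiler(TABLE_1))
```

On these numbers the fastest Intel and GCC runs differ by one second (295 s versus 296 s), which is what the readme means when it calls the two compilers competitive.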