I tested the performance of PFC3D 7.00.153 with a simple model (1000 balls falling under gravity) while varying the number of threads. (This is very similar to the performance test suggested for YADE; see "Introduction" in the Yade 3rd ed. documentation.)
The model was solved to an equilibrium ratio of 1e-5, and the average number of cycles per second was calculated.
These are the performance results on my machine (AMD Ryzen Threadripper PRO 5975WX).
The results suggest that 7 threads may give the best performance for this model.
Questions
- Do I have to run a performance test for every model I make, or can a single test result for a specific machine be regarded as the general performance to expect?
- What is the reason for the poor performance when the number of threads increases? Is it a problem of overhead in parallel computing?
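
To make the second question concrete, here is a toy sketch of what I mean by overhead. It is Amdahl's law with an extra cost term that grows with the number of threads. This is only an illustration, not a model of how PFC actually schedules work: the function names, the parallel fraction (0.9) and the per-thread overhead (0.01) are assumptions I made up, not values measured from my runs.

```python
def modeled_speedup(threads, parallel_fraction=0.9, overhead_per_thread=0.01):
    """Toy Amdahl-style speedup with a linear per-thread overhead.

    Both parameters are assumed values for illustration only.
    Run time is normalized so that one thread takes 1.0.
    """
    serial = 1.0 - parallel_fraction                 # part that never speeds up
    parallel = parallel_fraction / threads           # part that is split across threads
    overhead = overhead_per_thread * (threads - 1)   # assumed synchronization cost
    return 1.0 / (serial + parallel + overhead)


def best_thread_count(max_threads, **kwargs):
    """Return the thread count with the highest modeled speedup."""
    return max(range(1, max_threads + 1),
               key=lambda n: modeled_speedup(n, **kwargs))


if __name__ == "__main__":
    for n in (1, 2, 4, 8, 16, 32, 64):
        print(n, round(modeled_speedup(n), 2))
    print("best thread count in this toy model:", best_thread_count(64))
```

In this toy model the speedup rises, peaks at a moderate thread count, and then falls again as the overhead term dominates, which is qualitatively the shape I would expect if overhead were the cause. With the overhead set to zero, the best thread count is simply the maximum.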
Main script file “C_T[1].py”
import importlib

import itasca as it
import ct2

# Keep the Python environment when 'model new' is issued
it.command("python-reset-state false")
importlib.reload(ct2)

# Run the same test with 1 to 64 threads
for i in range(1, 65):
    it.set_threads(i)
    ct2.falling_test()
External module “ct2.py”
import itasca as it
from datetime import datetime
def falling_test():
    # MODEL CONFIGURATION
    it.command("""model new
    model configure dynamic
    model deterministic on
    model large-strain on
    model energy mechanical off
    model orientation-tracking off
    model title 'core performance test'
    """)
    it.set_domain_min((-0.05, -0.05, -0.05))
    it.set_domain_max((0.05, 0.05, 0.05))

    # Box of walls, with 1000 balls generated in the upper part of the box
    it.command("""
    wall generate box -0.05 0.05
    ball generate box -0.05 0.05 -0.05 0.05 0.03 0.05 radius 0.002 number 1000
    """)

    # Contact models and ball density
    it.command("""
    contact cmat default type ball-ball model hertz property hz_shear 1e6 hz_poiss 0.2 dp_nratio 0.2 dp_sratio 0.2
    contact cmat default type ball-facet model linear property kn 1e6 ks 1e6 dp_nratio 0.2 dp_sratio 0.2
    ball attribute density 2000
    """)
    it.set_gravity((0.0, 0.0, -9.81))

    # Time only the solve step
    start_time = datetime.now()
    it.command("""
    model solve ratio 1e-5
    """)
    running_time = datetime.now() - start_time

    print('Thread = ', it.threads())
    print('Total cycle = ', it.cycle(), 'Cycle per second = ', int(it.cycle() / running_time.total_seconds()))
    print('Elapsed time = ', running_time)
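
To compare the runs more easily, the printed numbers could also be collected and summarized after the loop. Below is a minimal sketch, assuming falling_test() were changed to return a (threads, cycles_per_second) tuple that the main script appends to a list. The helper name summarize is hypothetical, and the sample numbers in the usage part are placeholders, not my measurements.

```python
def summarize(results):
    """Summarize a list of (threads, cycles_per_second) tuples.

    Returns the thread count with the highest rate and, if a
    1-thread run is present, the speedup relative to that run.
    """
    if not results:
        raise ValueError("no results to summarize")
    best_threads, best_rate = max(results, key=lambda r: r[1])
    summary = {"best_threads": best_threads, "best_rate": best_rate}
    baseline = dict(results).get(1)  # rate of the 1-thread run, if any
    if baseline:
        summary["speedup_vs_1_thread"] = best_rate / baseline
    return summary


if __name__ == "__main__":
    # Placeholder numbers for illustration only (not measured data)
    sample = [(1, 100.0), (2, 180.0), (4, 150.0)]
    print(summarize(sample))
```

Repeating this for a few different models on the same machine would show whether the best thread count stays the same or depends on the model.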