# Increasing the computational speed in UDEC

Greetings!
I am running a UCS test in UDEC for 0.1 s of model time, and it takes around 30 min to complete. Is there any way to allocate more CPU cores to solving the problem? Currently it uses only 5% of the CPU. I found that through File>Memory settings I can change the memory allocated to UDEC and GIIC. Are there any other settings?

Thank you.

Use program threads XX in UDEC7, or set process XX in UDEC6 and lower.

In the analysis, I have a rock specimen (54x108 mm) placed between two platens (10 mm thick). I apply a velocity of 0.01 m/s to the top of the upper platen and keep the bottom of the lower platen fixed (figure attached). The rock specimen is discretized using Voronoi blocks. The zones in the rock are elastic, and the contacts between the Voronoi blocks follow the Mohr-Coulomb model. The friction between the platens and the rock specimen is kept nearly zero. My aim is to perform a quasi-static analysis. The total time of the analysis is 0.1 s. Since UDEC uses a very small timestep (approx. 10^(-8) s), it takes around 1111533 steps (around half an hour) to solve the problem. Are these details sufficient? Let me know.
I am attaching the model photo, the stress vs. cycle plot and the y-velocity contour. I think I am going in the right direction. I just have a slight doubt: the y-velocity of the top platen is partially red and partially orange. Red stands for 0.01 m/s, which is what I applied, but orange is for 0 m/s, so I am not sure why the whole platen is not red.
Thank you.

You have only fixed the top two gridpoints with the applied velocity, so it is possible that the velocity of the lower gridpoints on the platen is slightly below 0.01 m/s at times. Please note that the legend does not say that "orange is for 0 m/s"; it has to be interpreted as "orange is between 0 and 0.01".

Other than that it looks fine. Since I assume you are using the default damping, your timescale is fictitious either way and does not represent a "real" time. Therefore you might get very similar (but faster) results with a higher applied velocity. For your assumption of a "quasi-static" test to hold, you just have to make sure that the stress increase has enough time to propagate through your model. I usually check this by measuring the interface stress at the top and bottom platens. If these curves are similar (and the stress in the upper interface is not, for example, much higher than in the lower interface), then your applied velocity should still be fine.

When in doubt, it is always worthwhile to run comparative simulations with different loading velocities for some exemplary cases, just to be sure.
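The trade-off between loading velocity and run time can be put into rough numbers. The following is a minimal Python sketch using only the figures quoted in this thread (108 mm specimen, 0.01 m/s, 0.1 s, 1111533 steps). The timestep is back-calculated from the reported step count rather than read from UDEC, and the sketch assumes the timestep does not change with the applied velocity, which is an assumption and not something the program guarantees.

```python
# Figures quoted in the thread; nothing here is read from UDEC itself.
specimen_height_m = 0.108      # 108 mm specimen
velocity_m_per_s = 0.01        # applied platen velocity
model_time_s = 0.1             # total model time
steps_reported = 1_111_533     # number of steps reported for the run

# Back-calculated average timestep (an estimate, not a UDEC output).
timestep_s = model_time_s / steps_reported

# Platen travel and nominal axial strain reached at the end of the run
# (platen compliance is ignored).
displacement_m = velocity_m_per_s * model_time_s
axial_strain = displacement_m / specimen_height_m


def steps_for_same_travel(new_velocity_m_per_s):
    """Steps needed to reach the same platen travel at another velocity,
    ASSUMING the timestep stays the same."""
    return displacement_m / new_velocity_m_per_s / timestep_s


for v in (0.01, 0.05, 0.1):
    print(f"v = {v} m/s -> roughly {steps_for_same_travel(v):,.0f} steps")
```

Under these assumptions the step count scales inversely with the applied velocity, which is why a comparative run at a higher velocity is cheap to try. Whether the faster run is still quasi-static has to be checked separately, for example with the interface-stress comparison described above.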

1. I want to clarify that I applied the gridpoint velocity to all the gridpoints on the top of the upper platen, not just two. The script is given below:
block gridpoint apply velocity-y -0.01 range pos-x -0.001 0.0645 pos-y 0.127 0.129
I ran an analysis with the respective boundary condition applied to the whole of the top and bottom platens. Velocity contours are shown in Fig1. I think this pretty much solves the issue of differential velocity observed in the upper platen.
2. What is the meaning of interface stress? Do you mean stress-yy at the top platen-specimen interface and the bottom platen-specimen interface? I plotted these two stresses using gridpoints near each of the two interfaces. Fig2 displays the comparison. The difference between the peak stress magnitudes is around 10 MPa, which is okay I think. The pre-peak slopes also match fairly well.

3. Finally, does the command program threads XX increase the CPU usage? I set XX to 500 and observed that around 18% of the CPU is being used. Is it really utilizing more CPU power? Similarly, memory usage is also very low. Fig3 attached for reference.

Thank you.

Maybe I misread your first image then. It looked like it had only two triangular elements for the whole platen and thus only two gridpoints in total on the top. But yes, if you apply the velocity for the whole block you will (by definition) see homogeneous velocities.

If you sum up the normal forces of all contacts between your specimen and e.g. the upper platen (used to be c_nforce(ci) in UDEC6) and divide by the total contact length (c_length(ci)), you will get the average normal stress on the interface between platen and specimen. That is what I typically used. When you use stress-yy at some point near the top or bottom you might get a similar result, but it might also differ or oscillate very much depending on the zone you are looking at.
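The arithmetic behind that average is simple enough to sketch. The Python below is only an illustration of the calculation, not UDEC or FISH code: the (normal force, length) pairs are made-up numbers standing in for whatever a loop over the interface contacts would return, and the function name is hypothetical.

```python
def average_interface_stress(contacts):
    """Average normal stress on one interface.

    contacts: iterable of (normal_force, length) pairs. In a 2D model the
    force is per unit thickness (N/m) and the length is in m, so the
    result is in Pa.
    """
    contacts = list(contacts)
    total_force = sum(force for force, _ in contacts)
    total_length = sum(length for _, length in contacts)
    if total_length == 0:
        raise ValueError("interface has no contact length")
    return total_force / total_length


# Made-up contact data for illustration only.
top_contacts = [(1.0e5, 0.018), (1.2e5, 0.018), (0.9e5, 0.018)]
bottom_contacts = [(1.0e5, 0.018), (1.1e5, 0.018), (0.9e5, 0.018)]

sigma_top = average_interface_stress(top_contacts)
sigma_bottom = average_interface_stress(bottom_contacts)

# Relative difference between the two interfaces; a small value suggests
# the load has had time to propagate through the specimen.
relative_gap = abs(sigma_top - sigma_bottom) / max(abs(sigma_top), abs(sigma_bottom))
print(f"top {sigma_top:.3e} Pa, bottom {sigma_bottom:.3e} Pa, gap {relative_gap:.1%}")
```

Recording both averages as histories and comparing the curves gives the same check over the whole run instead of at a single instant.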

Do you really have 500 cores/threads available? The program may allow you to specify more threads than your machine can actually have, but that does not mean that the speed will increase (if anything it might slow it down maybe). The task manager shows how many threads you can actually have at maximum.
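As a quick cross-check outside the task manager, the number of logical processors can also be queried from Python. This is a small sketch: `os.cpu_count()` is a standard-library call, while `sensible_thread_count` is a hypothetical helper that just caps a requested value at what the machine reports.

```python
import os


def sensible_thread_count(requested, available=None):
    """Cap a requested thread count at the number of logical processors."""
    if available is None:
        # os.cpu_count() reports logical processors and may return None.
        available = os.cpu_count() or 1
    return max(1, min(requested, available))


print("logical processors:", os.cpu_count())
print("capped request for 500 threads:", sensible_thread_count(500))
```

The capped value is only an upper bound on what is worth trying, not a recommendation for the fastest setting.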

And even then the multithreading capabilities of UDEC are - from my experience - somewhat limited and I don't think you'll see 50%+ utilization there. I think Mark once told me that introducing multithreading in UDEC was not giving a comparable speed-up to when this was introduced in 3DEC. I haven't checked UDEC7 in that respect though, so maybe some Itascan can give more info on that.

`Maybe I misread your first image then. It looked like it had only two triangular elements for the whole platen and thus only two gridpoints in total on the top. But yes, if you apply the velocity for the whole block you will (by definition) see homogeneous velocities.`
Actually, you interpreted it correctly. The platen has two triangular elements, and hence there are two gridpoints on the top. I have now applied the velocity to the whole of the platen, i.e., 4 gridpoints.

`If you sum up the normal forces of all contacts between your specimen and e.g. the upper platen (used to be c_nforce(ci) in UDEC6) and divide by the total contact length (c_length(ci)), you will get the average normal stress on the interface between platen and specimen.`
Can you please guide me here? I am a newbie in this field. If you can give me an example of how these commands are written in the script, that would be great. I am learning FISH and BLOCK commands from the help section. If there is any more documentation, then do let me know. Thanks in advance.

`Do you really have 500 cores/threads available?`
I have 48 cores. Probably what you said is right. Introducing multithreading is not working. Anyway if I find a way, I will let the community know.

`Can you please guide me here? I am a newbie in this field. If you can give me an example of how these commands are written in the script, that would be great. I am learning FISH and BLOCK commands from the help section. If there is any more documentation, then do let me know. Thanks in advance.`
I have found this link, which gives a step-by-step guide to FISH:
https://www.geomatlab.com/itasca/udec7.0help/common/docproject/source/manual/scripting/fish_scripting/fish_languagerules.html
I will read it and hopefully write suitable programs. If I have questions, I will post them. Thank you.

You don't have to do it like that; it was just a suggestion in case you are familiar with FISH. Alternatively, you could simply check histories of your unbalanced forces or solve ratio and make sure that they stay below a certain threshold. Or just occasionally run a comparative run with a much faster or slower loading rate to make sure you are still close to a steady-state-like experiment.

Oh I think multithreading is working in UDEC, just not as effective as you may have hoped. You can try program threads 1 to see the single-threaded speed in comparison.

Actually, another reason I asked about learning FISH is that I need to calculate the number and accumulated length of shear and tension cracks as the sample breaks. As the contacts are defined using the Mohr-Coulomb criterion, a tension crack develops when the tensile normal stress sigma_n exceeds the specified tensile strength, and a shear crack develops when the shear stress exceeds the shear strength (tau = c + sigma_n*tan phi). So for this I think I have to write a FISH function, maybe using an if ... end_if block or something else. I have gone through the tutorials about FISH. However, I need to learn more on this front. I will post a new question on this. Meanwhile, if you have any suggestions, please provide them. Thank you.
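The counting logic described above can be sketched independently of FISH. The Python below is a minimal illustration under stated assumptions: sigma_n is taken as positive in compression (so tension is negative), the contact data and strength values are made-up, and the function names are hypothetical. It evaluates a single snapshot of stresses; tracking cracks that formed earlier and then unloaded would need a stored state per contact.

```python
import math


def classify_contact(sigma_n, tau, cohesion, friction_deg, tensile_strength):
    """Return 'tension', 'shear' or 'intact' for one contact.

    ASSUMED sign convention: sigma_n is positive in compression.
    """
    if -sigma_n > tensile_strength:
        return "tension"
    shear_strength = cohesion + sigma_n * math.tan(math.radians(friction_deg))
    if abs(tau) > shear_strength:
        return "shear"
    return "intact"


def crack_summary(contacts, cohesion, friction_deg, tensile_strength):
    """Count cracks and accumulate crack length per failure mode.

    contacts: iterable of (sigma_n, tau, length) tuples.
    """
    summary = {
        "tension": {"count": 0, "length": 0.0},
        "shear": {"count": 0, "length": 0.0},
    }
    for sigma_n, tau, length in contacts:
        mode = classify_contact(sigma_n, tau, cohesion, friction_deg, tensile_strength)
        if mode != "intact":
            summary[mode]["count"] += 1
            summary[mode]["length"] += length
    return summary


# Made-up contact states (Pa, Pa, m) and strengths for illustration only.
sample_contacts = [
    (-3.0e6, 0.0, 0.002),    # tensile stress above the tensile strength
    (5.0e6, 20.0e6, 0.003),  # shear stress above c + sigma_n*tan(phi)
    (5.0e6, 1.0e6, 0.004),   # within both limits
]
result = crack_summary(sample_contacts, cohesion=10.0e6, friction_deg=30.0,
                       tensile_strength=2.0e6)
print(result)
```

The same if/else structure should carry over to a FISH loop over contacts, with the stress and length values read from the contact data instead of a list.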

`Oh I think multithreading is working in UDEC, just not as effective as you may have hoped. You can try **program threads 1** to see the single-threaded speed in comparison.`
Okay I will do that. Thank you.

Two things affect the time needed to calculate a solution on the computer: memory access and CPU utilization. The CPU usage only tells you what the CPU is doing. It turns out that in UDEC the time is controlled by memory access. You may have 8 or 16 or whatever number of threads, but the CPUs are fed by only three or four memory channels. Making the calculations more efficient actually reduces CPU utilization. Trading CPU efficiency for memory efficiency is difficult, and in UDEC this is the best we have been able to achieve to date.

Okay alright. Thank you.
If CPUs are fed with only three or four memory channels, then is there a way to increase that memory?
For example, by using block giic memory xx, I can increase memory usage for the giic interface.

Venkatesh,

You can allocate any amount of memory to UDEC. Most UDEC models do not require anything near the memory installed in most computers these days.

The total amount of memory allocated to UDEC is not related to the memory channels. The memory channels are part of the CPU hardware. The channels control how fast the CPU can access the memory. All of the threads must share this "pipeline". This is what is slowing down UDEC.

Note: It is best not to allocate more memory than the physical memory installed in the computer. If you exceed the physical memory, the computer will store some of the memory on the hard drive, and this gets very slow.

Mark

Dear Mark,
I have set the memory of UDEC to 5000 MB and GIIC to 1000 MB, whereas I have 128 GB of RAM and a 2 TB hard disk. So I think I am safe in terms of memory allocation. Thank you.

I have tried different memory allocations and numbers of threads, and from that I understand that it is best not to play around much and to let UDEC do its job. I kept the UDEC memory at 5000 MB and did not do anything else, such as changing the threads by command as I did earlier. I found that around 15-16% of the CPU and 7% of the memory is being utilized. This is the best I could get. The number of threads used for UDEC operation is 2078 in this case. The UDEC console shows *** as the number of threads in this case. Relevant figures are shown below.

I don't see the point of assigning more threads than are in the hardware. Typically at most 2 (with hyperthreading) per core. The *** means the output format never anticipated a number greater than 999.

The core utilization is misleading and cannot be used to determine how fast you will solve a model in UDEC. The only way to do this is to time the model solution with a clock.
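Timing by the clock can be as simple as recording wall-clock seconds for each setting and comparing them. The Python below is a generic sketch of that bookkeeping, not a way to drive UDEC: the helper names are hypothetical, the timed workload is a stand-in for a real model run, and the two durations passed to `speedup` are made-up values.

```python
import time


def wall_clock_seconds(run):
    """Time a callable with a monotonic high-resolution clock."""
    start = time.perf_counter()
    run()
    return time.perf_counter() - start


def speedup(baseline_seconds, candidate_seconds):
    """Ratio above 1 means the candidate setting finished faster."""
    return baseline_seconds / candidate_seconds


# Stand-in workload; in practice this would be one full model run.
elapsed = wall_clock_seconds(lambda: sum(i * i for i in range(100_000)))
print(f"stand-in workload took {elapsed:.4f} s")

# Made-up durations (seconds) for a single-threaded and a multithreaded run.
print(f"speed-up: {speedup(1800.0, 1200.0):.2f}x")
```

Comparing such timings for the same model at a few thread settings says more about actual solution speed than the CPU utilization figure does.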

Okay alright. Point noted. I am not assigning any number of threads by giving a command. I have just set UDEC and GIIC memory and let UDEC decide what is best. Thank you for all the support.