CPU Temperature Throttling
-
Hi @Rowan-Dempster
About 90C, +/- a few degrees depending on its prediction algo and hysteresis settings. The thermal daemon is very intelligent.
It is normal for the CPU to throttle. It cannot run at full speed indefinitely without turning down. This is by design and not necessarily a problem unless you need to do more. Any airflow will help reduce the thermal load, though, so you can get more performance out of the system.
-
@Vinny Thanks for the response. I realized that I did not specify that this is on a VOXL2 with default CPU/thermal daemon settings. So to confirm, we are good to push the CPU cores over 80C without seeing any drop in CPU performance?
-
Hi @Rowan-Dempster
The CPU will self-regulate. We cannot predict your usage, so we cannot offer a real answer other than that you need to test your config and check.
Most folks do not notice the throttling when it occurs, but if you have specific algorithms that are hampered by throttling, please post back more details, including your HW and SW config so we can assist later to further optimize if possible.
Again, the first step to mitigate throttling performance drops is to provide more airflow.
This may help: https://docs.modalai.com/voxl2-thermal-performance/#thermal-performance
-
@Vinny Yeah being able to check if a core is being throttled is actually exactly what I'm looking for! I was going to use temperature as a proxy for that, but if the core can just tell me explicitly how it is operating that would be ideal. Can you provide me with technical documentation on how to query that information from the cores?
Also regarding https://docs.modalai.com/voxl2-thermal-performance/#thermal-performance that document is a boon of information and very helpful to our company, thank you to whoever wrote it and to you for linking it!
-
Oh and to answer your other question we don't have a clue if throttling will affect our overall system performance, but we definitely want to monitor CPU core KPIs just as a good engineering practice.
-
@tom , how can @Rowan-Dempster learn about the Voxl 2 logging/monitoring/profiling features for temp, core speeds, etc.? Do we have a spot specifically on that?
@Rowan-Dempster, thanks for the compliments on the thermal/EMC page. That is my doing.
Once Tom responds, maybe I can update it with a relevant link to the SW pages explaining how to do the logging/profiling suggested by my notes.
-
@Vinny I personally am unsure about how throttling comes into play, but htop is on there by default if you want to monitor core usage over time.
-
@tom @Vinny No problem Tom and Vinny, I can do some research on my own and post back when I find a good method for future users. It's just Ubuntu, right, so I'm sure there are lots of open source tools for monitoring CPU core KPIs. Was just wondering if Modal had already written any "wrapper" software that packages up and exposes the CPU core KPIs that Modal knows are important. Perhaps in the cpu-monitor code? I'll make an MR if I find anything useful to add for CPU monitoring :). Specifically, I'm looking at exposing KPIs that indicate when a core is not performing at the level it could because of the thermal environment it is running in.
-
@Rowan-Dempster Take a look at the cpufreq documentation. For example https://docs.kernel.org/admin-guide/pm/cpufreq.html. You can use this information on VOXL 2 to see how CPU frequencies are being set.
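As a starting point, here is a minimal Python sketch of reading the per-core cpufreq policy files described in that kernel document. The attribute names (scaling_cur_freq, scaling_max_freq, cpuinfo_max_freq) and the /sys/devices/system/cpu layout come from the kernel documentation, and the values are in kHz. The function names read_cpufreq and capped_cores are made up for illustration. Treating "scaling_max_freq below cpuinfo_max_freq" as a sign of thermal capping is an assumption: it has not been verified on VOXL 2, and a user-set limit would look the same.

```python
from pathlib import Path

# Policy attributes documented in the kernel cpufreq admin guide (values in kHz).
CPUFREQ_FILES = ("scaling_cur_freq", "scaling_max_freq", "cpuinfo_max_freq")


def read_cpufreq(cpu_root="/sys/devices/system/cpu"):
    """Return {"cpuN": {attribute: kHz}} for every core that exposes cpufreq."""
    result = {}
    for cpufreq_dir in sorted(Path(cpu_root).glob("cpu[0-9]*/cpufreq")):
        values = {}
        for name in CPUFREQ_FILES:
            try:
                values[name] = int((cpufreq_dir / name).read_text().strip())
            except (OSError, ValueError):
                # Attribute missing or unreadable on this kernel: skip it.
                continue
        if values:
            result[cpufreq_dir.parent.name] = values
    return result


def capped_cores(freqs):
    """Cores whose current policy limit is below the hardware maximum.

    Assumption: a lowered scaling_max_freq may indicate thermal capping,
    but it can also be a limit set by the user or another service.
    """
    return sorted(
        cpu
        for cpu, v in freqs.items()
        if "scaling_max_freq" in v
        and "cpuinfo_max_freq" in v
        and v["scaling_max_freq"] < v["cpuinfo_max_freq"]
    )


if __name__ == "__main__":
    freqs = read_cpufreq()
    for cpu, values in freqs.items():
        print(cpu, values)
    print("capped:", capped_cores(freqs))
```

Running this in a loop (for example once per second) and logging the output would give a simple time series of core frequencies to line up against your own workload.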
-
About 95C is when the temperature control loop will kick in and start gradually reducing the maximum core frequencies. You can monitor the CPU usage and current core frequencies using voxl-inspect-cpu. Here are the maximum core frequencies for all cores:
cpu0 1804.8 MHz
cpu1 1804.8 MHz
cpu2 1804.8 MHz
cpu3 1804.8 MHz
cpu4 2419.2 MHz
cpu5 2419.2 MHz
cpu6 2419.2 MHz
cpu7 2841.6 MHz
If you set the CPU governor mode to perf (voxl-set-cpu-mode perf), it will pin all the cores to max frequency, and they will stay there unless they are being throttled due to temperature, which you can check using voxl-inspect-cpu.
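If you want to automate that check, the comparison could look something like the Python sketch below. It assumes the maximum frequencies listed above are in MHz, that the cores are in perf mode, and that you already have the current per-core frequencies in kHz (the unit the Linux cpufreq sysfs files use). The names NOMINAL_MAX_MHZ and throttled_cores and the 1 MHz tolerance are hypothetical, chosen for illustration; this is not something voxl-inspect-cpu provides.

```python
# Maximum core frequencies from the post above, assumed to be in MHz.
NOMINAL_MAX_MHZ = {
    "cpu0": 1804.8, "cpu1": 1804.8, "cpu2": 1804.8, "cpu3": 1804.8,
    "cpu4": 2419.2, "cpu5": 2419.2, "cpu6": 2419.2, "cpu7": 2841.6,
}


def throttled_cores(current_khz, nominal_mhz=NOMINAL_MAX_MHZ, tolerance_mhz=1.0):
    """Return {core: shortfall in MHz} for cores running below their nominal max.

    current_khz maps core names ("cpu0", ...) to the current frequency in kHz.
    Only meaningful in perf mode, where cores are expected to sit at max.
    """
    throttled = {}
    for cpu, nominal in nominal_mhz.items():
        if cpu not in current_khz:
            continue  # no reading for this core
        current_mhz = current_khz[cpu] / 1000.0
        if current_mhz < nominal - tolerance_mhz:
            throttled[cpu] = round(nominal - current_mhz, 1)
    return throttled


if __name__ == "__main__":
    # Example reading: every core at max except cpu7, which has dropped.
    sample = {cpu: int(round(mhz * 1000)) for cpu, mhz in NOMINAL_MAX_MHZ.items()}
    sample["cpu7"] = 2419200
    print(throttled_cores(sample))
```

An empty result means every core with a reading is at its listed maximum; any entry names a core that has been pulled below it and by how much.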