Setting Intel GPU limits in Ubuntu 17.04 no longer works

I was using Ubuntu MATE 16.04 before, but recently switched to 17.04 because it ships an updated thermald, and I assumed that would fix bugs like this for me. I had problems with thermald in Ubuntu 16.04 similar to the one described there. Since it was said to be fixed in thermald (1.5.4-3) and Ubuntu 17.04 ships the updated version by default, I assumed it would work better for me overall, possibly with fixes elsewhere in the system as well. So I installed 17.04 and tried it; everything worked well, so I migrated completely.
After a while, though, I ran into a very strange problem: the system completely ignores the CPU limits I set. In 16.04 this only happened when the Intel GPU tried to run at a frequency tied to a CPU frequency higher than the limit. For example, if I run:

sudo cat /sys/kernel/debug/dri/1/i915_ring_freq_table

This is my output:

GPU freq (MHz)  Effective CPU freq (MHz)  Effective Ring freq (MHz)
650             800                       0
700             800                       0
750             1400                      0
800             1500                      0
850             1600                      0
900             1600                      0
950             1700                      0
1000            1800                      0
1050            1900                      0
1100            2000                      0

So if I want my CPU to run at 1500MHz at most and never go higher, the GPU needs to be limited to 800MHz as well, because the two are tied together: the GPU is integrated into the CPU.
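
A minimal sketch of that lookup, assuming the table format shown above (one header line, then whitespace-separated numeric rows). The function name gpu_limit_for_cpu is made up for illustration; it reads the table on stdin and prints the highest GPU frequency whose effective CPU frequency does not exceed the given limit.

```shell
# gpu_limit_for_cpu CPU_MHZ
# Reads an i915_ring_freq_table-style table on stdin and prints the highest
# GPU frequency (MHz) whose "Effective CPU freq" does not exceed CPU_MHZ.
# Exits with status 1 if no row fits under the limit.
gpu_limit_for_cpu() {
    awk -v limit="$1" '
        # Data rows start with a number; this skips the header line.
        $1 ~ /^[0-9]+$/ && $2 + 0 <= limit + 0 && $1 + 0 > best { best = $1 + 0 }
        END { if (best) print best; else exit 1 }
    '
}
```

With the table above, sudo cat /sys/kernel/debug/dri/1/i915_ring_freq_table | gpu_limit_for_cpu 1500 prints 800.
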
In Ubuntu 16.04 I set the GPU limit manually by writing the maximum frequency I wanted the GPU to reach to /sys/kernel/debug/dri/1/i915_max_freq. When I limited the CPU to 1500MHz, I would also run:

echo 800 | sudo tee /sys/kernel/debug/dri/1/i915_max_freq 

And my GPU would stay within range, without messing up the CPU frequencies.
In Ubuntu 17.04, though, after setting the limit the GPU still goes all the way up to 1100MHz, which makes any CPU limit pointless and overheats the processor.

~$ sudo cat /sys/kernel/debug/dri/1/i915_max_freq
800

As you can see, the limit is set and in place. Now we check i915_frequency_info:

~$ sudo cat /sys/kernel/debug/dri/1/i915_frequency_info
PM IER=0x00000070 IMR=0xffffff8f ISR=0x00000000 IIR=0x00000000, MASK=0x0000002a
pm_intr_keep: 0x00000004
GT_PERF_STATUS: 0x000016cb
Render p-state ratio: 22
Render p-state VID: 203
Render p-state limit: 255
RPSTAT1: 0x00041610
RPMODECTL: 0x00000d92
RPINCLIMIT: 0x000019fa
RPDECLIMIT: 0x00003a98
RPNSWREQ: 1100MHz
CAGF: 1100MHz
RP CUR UP EI: 7165 (9171us)
RP CUR UP: 7006 (8967us)
RP PREV UP: 6725 (8608us)
Up threshold: 85%
RP CUR DOWN EI: 1314 (1681us)
RP CUR DOWN: 1315 (1683us)
RP PREV DOWN: 23741 (30388us)
Down threshold: 60%
Lowest (RPN) frequency: 650MHz
Nominal (RP1) frequency: 650MHz
Max non-overclocked (RP0) frequency: 1100MHz
Max overclocked frequency: 1100MHz
Current freq: 1100 MHz
Actual freq: 1100 MHz
Idle freq: 650 MHz
Min freq: 650 MHz
Boost freq: 1100 MHz
Max freq: 1100 MHz
efficient (RPe) frequency: 650 MHz
Current CD clock frequency: 400000 kHz
Max CD clock frequency: 400000 kHz
Max pixel clock frequency: 360000 kHz

We can see that both the current and the actual frequency are at the full maximum of 1100MHz.
This also pushes the CPU frequency past its limit, because the CPU can't go lower while the GPU runs that high:

~$ sudo cpufreq-info
cpufrequtils 008: cpufreq-info (C) Dominik Brodowski 2004-2009
Report errors and bugs to , please.
analyzing CPU 0:
  driver: intel_pstate
  CPUs which run at the same hardware frequency: 0
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency: 0.97 ms.
  hardware limits: 800 MHz - 2.00 GHz
  available cpufreq governors: performance, powersave
  current policy: frequency should be within 1.50 GHz and 1.50 GHz.
                  The governor "powersave" may decide which speed to use
                  within this range.
  current CPU frequency is 2.00 GHz (asserted by call to hardware).
analyzing CPU 1:
  driver: intel_pstate
  CPUs which run at the same hardware frequency: 1
  CPUs which need to have their frequency coordinated by software: 1
  maximum transition latency: 0.97 ms.
  hardware limits: 800 MHz - 2.00 GHz
  available cpufreq governors: performance, powersave
  current policy: frequency should be within 1.50 GHz and 1.50 GHz.
                  The governor "powersave" may decide which speed to use
                  within this range.
  current CPU frequency is 2.00 GHz (asserted by call to hardware).

As you can see, the policy range is 1.50 GHz to 1.50 GHz, but the CPU is pushed up to its maximum because of the GPU.

After closing the graphical application:

sudo cat /sys/kernel/debug/dri/1/i915_frequency_info
PM IER=0x00000070 IMR=0xffffff8f ISR=0x00000000 IIR=0x00000000,
[...]
CAGF: 650MHz
[...]
Lowest (RPN) frequency: 650MHz
Nominal (RP1) frequency: 650MHz
Max non-overclocked (RP0) frequency: 1100MHz
Max overclocked frequency: 1100MHz
Current freq: 650 MHz
Actual freq: 650 MHz
Idle freq: 650 MHz
Min freq: 650 MHz
Boost freq: 1100 MHz
Max freq: 1100 MHz
[...]

The GPU is back to its minimum and the CPU now stays within the assigned limits:

sudo cpufreq-info
cpufrequtils 008: cpufreq-info (C) Dominik Brodowski 2004-2009
Report errors and bugs to , please.
analyzing CPU 0:
  driver: intel_pstate
  CPUs which run at the same hardware frequency: 0
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency: 0.97 ms.
  hardware limits: 800 MHz - 2.00 GHz
  available cpufreq governors: performance, powersave
  current policy: frequency should be within 1.50 GHz and 1.50 GHz.
                  The governor "powersave" may decide which speed to use
                  within this range.
  current CPU frequency is 1.12 GHz (asserted by call to hardware).
analyzing CPU 1:
  driver: intel_pstate
  CPUs which run at the same hardware frequency: 1
  CPUs which need to have their frequency coordinated by software: 1
  maximum transition latency: 0.97 ms.
  hardware limits: 800 MHz - 2.00 GHz
  available cpufreq governors: performance, powersave
  current policy: frequency should be within 1.50 GHz and 1.50 GHz.
                  The governor "powersave" may decide which speed to use
                  within this range.
  current CPU frequency is 1.49 GHz (asserted by call to hardware).

Question: how do I make the Intel GPU respect the limit I set in Ubuntu 17.04, so that it stops interfering with my CPU limit, and why does it ignore limits that worked in 16.04?

Update: After poking around I found this thing:

sudo cat /sys/kernel/debug/dri/0/i915_rps_boost_info
RPS enabled? 1
GPU busy? yes [1 requests]
CPU waiting? 0
Frequency requested 650 min hard:650, soft:650; max soft:700, hard:1100 idle:650, efficient:650, boost:1100
Xorg [1221]: 591 boosts
Kernel (anonymous) boosts: 8
RPS Autotuning (current "low power" window): Avg. up: 0% [above threshold? 95%] Avg. down: 0% [below threshold? 85%] 

What is this "RPS", and could it be the reason why the GPU 'boosts' to maximum, ignoring the set limits?

2 Answers

I found the cause of my issue and a solution: RPS boost was ignoring the GPU frequency limit I had set.
Instead of setting the limit via /sys/kernel/debug/dri/1/i915_max_freq, I switched to setting it in /sys/class/drm/card1, through the parameters gt_max_freq_mhz and gt_boost_freq_mhz. Setting a limit in i915_max_freq does not limit the boost frequency, so when the system requests a boost, the GPU goes up to the value in gt_boost_freq_mhz, ignoring the limit you set.
By running:

echo 800 | sudo tee /sys/class/drm/card1/gt_max_freq_mhz
echo 800 | sudo tee /sys/class/drm/card1/gt_boost_freq_mhz 

I set limits on both the normal and the boosted frequency, and the system no longer pushes the GPU past the limit, which in my case means the CPU limit is no longer affected either.

sudo cat /sys/kernel/debug/dri/1/i915_rps_boost_info
RPS enabled? 1
GPU busy? yes [32 requests]
CPU waiting? 0
Frequency requested 800 min hard:650, soft:650; max soft:800, hard:1100 idle:650, efficient:650, boost:800
[...]
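
To pull just the two relevant numbers out of that dump, a small filter like the following could help. It is only a sketch: rps_limits is a made-up name, and it assumes the "Frequency requested" line has exactly the layout shown above, which may differ between kernel versions.

```shell
# rps_limits
# Reads i915_rps_boost_info-style text on stdin and prints
# "max_soft=<MHz> boost=<MHz>" taken from the "Frequency requested" line.
# Prints nothing if that line is missing or laid out differently.
rps_limits() {
    sed -n 's/^Frequency requested.*max soft:\([0-9]*\),.*boost:\([0-9]*\).*/max_soft=\1 boost=\2/p'
}
```

For the output above, sudo cat /sys/kernel/debug/dri/1/i915_rps_boost_info | rps_limits prints max_soft=800 boost=800.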

Steps to apply this solution:

1) Read the table at /sys/kernel/debug/dri/0/i915_ring_freq_table (or /sys/kernel/debug/dri/1/i915_ring_freq_table in some cases):

sudo cat /sys/kernel/debug/dri/0/i915_ring_freq_table 

Find the highest CPU frequency that is within the CPU limit you want and look up the GPU frequency tied to it; that is the limit you need to set on the GPU.

2) Set the GPU frequency limit by writing to gt_max_freq_mhz and gt_boost_freq_mhz, located in /sys/class/drm/card0 (it can be cardX depending on your system; check manually if needed):

echo [GPU_frequency_limit] | sudo tee /sys/class/drm/cardX/gt_max_freq_mhz /sys/class/drm/cardX/gt_boost_freq_mhz

For example:

echo 800 | sudo tee /sys/class/drm/card0/gt_max_freq_mhz /sys/class/drm/card0/gt_boost_freq_mhz

3) Check that the limits went through (change 0 to your X value if you used cardX):

sudo cat /sys/kernel/debug/dri/0/i915_rps_boost_info

Your max soft and boost values should now match what you set.
Be aware that limiting the GPU frequency can reduce your OpenGL performance.
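
Step 2 can also be wrapped in a small helper. This is a minimal sketch, assuming both files already exist under the card directory you pass in; set_gpu_limit is a hypothetical name, and it has to run as root to be able to write to sysfs.

```shell
# set_gpu_limit CARD_DIR MHZ
# Writes MHZ to both gt_max_freq_mhz and gt_boost_freq_mhz under CARD_DIR.
set_gpu_limit() {
    card_dir=$1
    mhz=$2
    # Check both files first so a wrong cardX does not leave a half-applied limit.
    for name in gt_max_freq_mhz gt_boost_freq_mhz; do
        if [ ! -w "$card_dir/$name" ]; then
            echo "cannot write $card_dir/$name (wrong cardX, or not root?)" >&2
            return 1
        fi
    done
    # Cap the normal maximum and the boost frequency at the same value.
    for name in gt_max_freq_mhz gt_boost_freq_mhz; do
        echo "$mhz" > "$card_dir/$name" || return 1
    done
}
```

For example, in a root shell: set_gpu_limit /sys/class/drm/card0 800. The values do not survive a reboot, so the call would have to be repeated at startup.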


If you don't want to use the first solution, there is an alternative. It does not work for me because of a BIOS limitation, but it may work for someone else: lowering the package power limit, as suggested by @spandruvada from Intel in the thermald issue thread on GitHub.

First, check the current value by reading /sys/class/powercap/intel-rapl/intel-rapl:0/constraint_0_power_limit_uw:

sudo cat /sys/class/powercap/intel-rapl/intel-rapl:0/constraint_0_power_limit_uw

Then try to change the limit by running:

echo [reduced_power_value] | sudo tee /sys/class/powercap/intel-rapl/intel-rapl:0/constraint_0_power_limit_uw

For example, in my case the initial value was 35000000 and I wanted to change it to 30000000:

echo 30000000 | sudo tee /sys/class/powercap/intel-rapl/intel-rapl:0/constraint_0_power_limit_uw

If you get "No data available" when writing to it, either it is simply disabled (check by reading /sys/class/powercap/intel-rapl/intel-rapl:0/enabled, which is 0 if it is disabled) or it is locked by the BIOS. If you can't enable it by writing 1 to "enabled", check dmesg for an error message (after trying to write to constraint_0_power_limit_uw):

dmesg | grep powercap
[29580.025164] powercap intel-rapl:0: package locked by BIOS, monitoring only

If you see "locked by BIOS", you will need to enable it in the BIOS manually. If you cannot do that, you can't control it and this method is not for you. From what I understand, if you have it enabled and working, thermald should adjust those values for you automatically, without any manual changes.
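
Before attempting the write, it can help to look at the current state in one go. A minimal sketch, assuming the powercap directory contains the enabled and constraint_0_power_limit_uw files mentioned above; rapl_status is a made-up name, and the value is in microwatts, so it is divided down to whole watts for readability.

```shell
# rapl_status RAPL_DIR
# Prints whether the RAPL zone is enabled and its constraint_0 power limit.
rapl_status() {
    rapl_dir=$1
    enabled=$(cat "$rapl_dir/enabled") || return 1
    limit_uw=$(cat "$rapl_dir/constraint_0_power_limit_uw") || return 1
    if [ "$enabled" = "1" ]; then
        state=enabled
    else
        state=disabled
    fi
    # The sysfs value is in microwatts; integer-divide to whole watts.
    echo "$state limit=$((limit_uw / 1000000))W"
}
```

For example, rapl_status /sys/class/powercap/intel-rapl/intel-rapl:0 prints "enabled limit=35W" when the flag is 1 and the limit is 35000000. It does not detect the BIOS lock; that still shows up only in dmesg after a failed write.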

Issue on github with this suggestion
If you want to use this method manually, some more details about it here.

I use intel_gpu_frequency, provided by intel-gpu-tools:

intel_gpu_frequency -i

to lock the frequency to the minimum, or, using 800MHz as an example,

intel_gpu_frequency -s 800

to lock the frequency to an absolute value (MHz).

Here are the Ubuntu man pages for the program:
