CHROMIUM: MALI: Transition up devfreq more aggressively

Back on rk3288 with the Chrome OS 3.14 kernel, we had this table to
help us do DVFS for Mali:

struct kbase_rk_dvfs_threshold {
       unsigned long freq;
       unsigned int min;
       unsigned int max;
};
/*
 * if current_utilisation > max
 * level--
 * if current_utilisation < min
 * level++
 */
static const struct kbase_rk_dvfs_threshold kbase_rk_dvfs_threshold_table[] = {
       { 600000000, 20, 100 },
       { 400000000, 20, 40 },
       { 300000000, 20, 40 },
       { 200000000, 20, 40 },
       { 100000000, 0, 50 },
};
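
As a rough illustration of how a table like this gets walked, here is a
minimal sketch.  The next_level() helper, the shortened struct and table
names, and the clamping at both ends of the table are assumptions made
for the example, not the original 3.14 driver code:

```c
#include <stddef.h>

/* Same layout as the table above; names shortened for the sketch. */
struct dvfs_threshold {
	unsigned long freq;	/* Hz */
	unsigned int min;	/* below this utilisation (%): slower level */
	unsigned int max;	/* above this utilisation (%): faster level */
};

static const struct dvfs_threshold dvfs_table[] = {
	{ 600000000, 20, 100 },
	{ 400000000, 20, 40 },
	{ 300000000, 20, 40 },
	{ 200000000, 20, 40 },
	{ 100000000, 0, 50 },
};

#define DVFS_NUM_LEVELS (sizeof(dvfs_table) / sizeof(dvfs_table[0]))

/*
 * Hypothetical helper: level 0 is the fastest entry, so "level--"
 * raises the frequency and "level++" lowers it.  Clamping at the
 * ends of the table is an assumption of this sketch.
 */
static size_t next_level(size_t level, unsigned int utilisation)
{
	if (utilisation > dvfs_table[level].max && level > 0)
		return level - 1;
	if (utilisation < dvfs_table[level].min &&
	    level + 1 < DVFS_NUM_LEVELS)
		return level + 1;
	return level;
}
```

With these numbers, anything above 50% utilisation at 100 MHz steps up
to 200 MHz, and anything above 40% keeps stepping up from there.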

Now we're using the "simple_ondemand" devfreq policy.  By default that
only moves up OPPs at 90% utilization.

When running glmark2 on veyron we find that we incorrectly stick at
100 MHz almost all the time, running at about 30-70% utilization.  We
almost never bump up the GPU frequency during this test, resulting in
terrible performance.

Comments in this CL discuss some theories about why this is the case.
The leading one is that some GPU operations (presumably many of the
ones tested by glmark2) carry a lot of CPU overhead, and that CPU
overhead isn't accounted for when calculating GPU utilization.

For now we'll hardcode some values that try to account for it.  I
tried a whole bunch of different values and a number of them worked
pretty well, but I settled on 35% / 5%.  This seemed to match the
utilization I was seeing in the test and also roughly matches the old
veyron numbers (the algorithm is different so we couldn't match
exactly).  I will assume that they are sane even for other boards with
Mali.
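
To show why the thresholds matter, here is a simplified userspace
paraphrase of the simple_ondemand decision as I understand it.  The
ondemand_target() name and signature are made up for the sketch, and
it assumes 35% / 5% map to the governor's upthreshold and
downdifferential; the real governor also leaves rounding the result to
an actual OPP to the devfreq core:

```c
#include <stdint.h>

/*
 * Hypothetical helper paraphrasing the simple_ondemand policy.
 * busy/total are the busy and total time over the last polling
 * window; up/down are upthreshold and downdifferential in percent.
 */
static uint64_t ondemand_target(uint64_t busy, uint64_t total,
				uint64_t cur_freq, uint64_t max_freq,
				unsigned int up, unsigned int down)
{
	uint64_t target;

	/* No statistics or unknown current frequency: go to max. */
	if (total == 0 || cur_freq == 0)
		return max_freq;

	/* Busy enough: jump to the maximum frequency. */
	if (busy * 100 > total * up)
		return max_freq;

	/* Inside the hysteresis band: keep the current frequency. */
	if (busy * 100 > total * (up - down))
		return cur_freq;

	/* Otherwise scale the frequency down in proportion to load. */
	target = busy * cur_freq / total;
	target = target * 100 / (up - down / 2);
	return target;
}
```

In this sketch, 50% utilization at 100 MHz asks for less than 100 MHz
with the default 90 / 5, but jumps to max with 35 / 5, which matches
the "stuck at 100 MHz" behavior described above.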

With this change glmark2 works at roughly the same speed as it did on
3.14 on veyron.

NOTE: I ran glmark2 manually when poking around things using parameters
from the autotest.  Specifically, I ran:

/usr/local/autotest/deps/glmark2/glmark2 \
  --data-path=/usr/local/autotest/deps/glmark2/data \
  --size=800x600 \
  --annotate \
  -b :duration=2

I will note that even with this tweak glmark2 isn't running as fast as
it could be.  This tweak gets us to about 96% of the results that we
get if we just pin the GPU frequency to max, but we can actually get
way better scores (>60% better) if we peg the CPU frequency to max
too.  Presumably the CPU frequency governor is just as confused by
this test.  I'm a little less worried about that, though, because I
think in most real world situations there will be other CPU things
going on.  Also, while there is almost certainly always CPU overhead
when there is GPU overhead, the inverse is not true.

BUG=chromium:941638
TEST=glmark2 on veyron and kukui; power load test

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/1992847
Tested-by: Todd Broch <tbroch@chromium.org>
Reviewed-by: Dominik Behr <dbehr@chromium.org>
Reviewed-by: Ilja H. Friedel <ihf@chromium.org>
Reviewed-by: Todd Broch <tbroch@chromium.org>
(cherry picked from commit 2f18d93d005c279cc03f22feed8ce192567d4269)
Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>

Change-Id: I809386915ce96468358c580203cd80002b225a6e
Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
1 file changed