Giter VIP home page Giter VIP logo

Comments (12)

alikates avatar alikates commented on August 18, 2024

While investigating the issue I found a code path that could lead to a null pointer dereference or similar in _add_opp_table_indexed. Maybe that's why we are having that kernel oops.

struct opp_table *_add_opp_table_indexed(struct device *dev, int index,
					 bool getclk)
{

...

	if (opp_table) {
		if (!_add_opp_dev(dev, opp_table)) {
			dev_pm_opp_put_opp_table(opp_table);
			opp_table = ERR_PTR(-ENOMEM);               <----
		}
	        mutex_lock(&opp_table_lock);
	} else {
		opp_table = _allocate_opp_table(dev, index);

		mutex_lock(&opp_table_lock);
		if (!IS_ERR(opp_table))
			list_add(&opp_table->node, &opp_tables);
	}

	opp_tables_busy = false;

unlock:
	mutex_unlock(&opp_table_lock);

	return _update_opp_table_clk(dev, opp_table, getclk);     <---- (this functions tries to access a struct field of opp_table without checking if the pointer is invalid)
}

from linux.

alikates avatar alikates commented on August 18, 2024

Nevermind, i didn't see the IS_ERR() in _update_opp_table_clk

from linux.

alikates avatar alikates commented on August 18, 2024

Now i got something

/mnt/linux # cat trace.txt | CROSS_COMPILE=aarch64-alpine-linux-musl- ./scripts/decode_stacktrace.sh .output/vmlinux ./
[    1.618458] Call trace:
[    1.618461] _of_add_table_indexed (/mnt/linux/.output/../drivers/opp/of.c:373 /mnt/linux/.output/../drivers/opp/of.c:1053 /mnt/linux/.output/../drivers/opp/of.c:1148)
[    1.618472] dev_pm_opp_of_add_table_indexed (/mnt/linux/.output/../drivers/opp/of.c:1235) 
[    1.618481] of_genpd_add_provider_onecell (/mnt/linux/.output/../drivers/base/power/domain.c:2380) 

Here's the issue:

linux/drivers/opp/of.c

Lines 369 to 375 in b265c06

for (i = 0; i < opp_table->required_opp_count; i++) {
required_opp_tables = opp_table->required_opp_tables;
/* Required opp-table is already parsed */
if (!IS_ERR(required_opp_tables[i]))
continue;

The size of the array that's being indexed and required_opp_count don't match because when initializing the opp_table if there's no required_tables it doesn't set the value of required_opp_count.

Here's the code (put_np is at the end of the function):

linux/drivers/opp/of.c

Lines 172 to 183 in b265c06

count = of_count_phandle_with_args(np, "required-opps", NULL);
if (count <= 0)
goto put_np;
required_opp_tables = kcalloc(count, sizeof(*required_opp_tables),
GFP_KERNEL);
if (!required_opp_tables)
goto put_np;
opp_table->required_opp_tables = required_opp_tables;
opp_table->required_opp_count = count;

from linux.

alikates avatar alikates commented on August 18, 2024

Lol I found what the error was...

cpu0_opp: cpu0-opp-table {
compatible = "operating-points-v2-qcom-cpu",
"operating-points-v2-kryo-cpu";

Compatible has to be operating-points-v2. No wonder it was reading a null opp table...

I just tested it and its working...

from linux.

alikates avatar alikates commented on August 18, 2024

Can you test changing the compatible and see if that's it?

from linux.

z3ntu avatar z3ntu commented on August 18, 2024

According to docs, standalone operating-points-v2-kryo-cpu compatible is a legit use case (and also used on msm8996 & qcs404, but operating-points-v2-qcom-cpu isn't documented anywhere. Removing the latter doesn't change anything though.

Are you sure that the cpufreq driver probes correctly when you set compatible to operating-points-v2?

from linux.

alikates avatar alikates commented on August 18, 2024

Okay right, it's failing with -ENOENT. But at least now it doesn't crash...
[ 1.296503] qcom-cpufreq-nvmem: probe of qcom-cpufreq-nvmem failed with error -2

Maybe something's missing in the DT?

from linux.

z3ntu avatar z3ntu commented on August 18, 2024

Yes, also commenting the nvmem-cells = <&cpu_speed_bin>; in dt makes probe fail and boot succeed.

I guess the bug lies in either b052d04 (which looks quite hacky tbh and looks like it might cause stuff like this) or 664d2ef and presumably doing things that shouldn't be done.

from linux.

z3ntu avatar z3ntu commented on August 18, 2024

Temporarily worked around with b265c06, I believe

from linux.

alikates avatar alikates commented on August 18, 2024

Fixed in 6.0.10 branch, the compatible for the spmi regulator that cpr uses was wrong so it didn't probe

from linux.

alikates avatar alikates commented on August 18, 2024

Can you test it in both sdm625 and 632?

from linux.

alikates avatar alikates commented on August 18, 2024

Is this still relevant?

from linux.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.