Python Performance: Why 'if not list' is 2x Faster Than Using len()

abhi9u@lemmy.world · 3 months ago

thebestaquaman@lemmy.world · 3 months ago

I write a lot of Python. I hate it when people use “X is more pythonic” as some kind of argument for what is a better solution to a problem. I also have a hang-up with people acting like Python has any form of type safety, instead of just embracing duck typing. This lands us at the following:

The article states that “you can check a list for emptiness in two ways: if not mylist or if len(mylist) == 0”. Already here, a fundamental mistake has been made: you don’t know (and shouldn’t care) whether mylist is a list. These are not two ways of doing the same thing, but two different checks altogether. The first checks whether the object is “falsy”, and the second checks whether the object has a well-defined length that is zero. The two often (but far from always) overlap. Embrace duck typing: type-safe Python is a myth.
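To make the difference concrete, here is a minimal sketch. The `Flag` class and the two helper functions are hypothetical names invented for illustration: `Flag` defines truthiness but has no length, so the two checks disagree on it.

```python
class Flag:
    """Hypothetical object that is falsy but has no length."""

    def __bool__(self):
        return False


def is_falsy(obj):
    # What `if not obj` checks: truthiness only.
    return not obj


def has_zero_length(obj):
    # What `if len(obj) == 0` checks: a defined length that equals zero.
    return len(obj) == 0


# Both checks agree on an empty list.
print(is_falsy([]), has_zero_length([]))

# 0 and Flag() are falsy, yet neither has a length.
print(is_falsy(0), is_falsy(Flag()))

try:
    has_zero_length(Flag())
except TypeError as exc:
    print("len() refused:", exc)
```

Which check is right depends on what the calling code actually cares about: "nothing useful here" or "a sized container with no elements".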

iAvicenna@lemmy.world · edit-2 3 months ago

Isn’t the expected behaviour exactly identical on any object that has __len__ defined:

“By default, an object is considered true unless its class defines either a __bool__() method that returns False or a __len__() method that returns zero, when called with the object.”

PS: I guess your objection is that we can’t know in advance whether the object has __len__ defined (i.e. whether it is a collection), so this question doesn’t really apply to your post.
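A small sketch of the quoted rule, using three made-up classes (`EmptyBag`, `AlwaysTrueBag` and `Plain` are assumptions for illustration only). It shows that `__len__` is consulted only when `__bool__` is absent, so the two checks can still differ on an object that defines both.

```python
class EmptyBag:
    """Hypothetical container: only __len__ is defined."""

    def __len__(self):
        return 0


class AlwaysTrueBag:
    """Hypothetical container: __bool__ takes precedence over __len__."""

    def __len__(self):
        return 0

    def __bool__(self):
        return True


class Plain:
    """Defines neither method, so instances are truthy by default."""


print(bool(EmptyBag()))       # False: falls back to __len__() == 0
print(bool(AlwaysTrueBag()))  # True: __bool__ wins even though len() is 0
print(bool(Plain()))          # True: the default
```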

thebestaquaman@lemmy.world · 3 months ago

Exactly as you said yourself: checking falsiness does not guarantee that the object has a length. There is considerable overlap between the two, and if it turns out that this check is a performance bottleneck (which I have a hard time imagining), it can be appropriate to check for falsiness instead of zero length. But in that case, don’t be surprised if you suddenly get an obscure bug because some custom object doesn’t behave the way you assumed it would.

I guess my primary point is that we should be checking for what we actually care about, because that makes intent clear and reduces the chance for obscure bugs.

PattyMcB@lemmy.world · 3 months ago

I know I’m gonna get downvoted to oblivion for this, but… Serious question: why use Python if you’re concerned about performance?

JustAnotherKay@lemmy.world · edit-2 3 months ago

Honestly most people use Python because it has fantastic libraries. They optimize it because the language is middling, but the libraries are gorgeous

ETA: This might double post because my Internet sucks right now, will fix when I have a chance

Takapapatapaka@lemmy.world · 3 months ago

You may want to benefit from a small performance boost even though you mostly don’t need it and still need Python’s advantages. Being interested in performance doesn’t always mean looking for the very best performance available in any language; it can also mean using little tricks to go a tiny bit faster when you can.

Randelung@lemmy.world · 3 months ago

It comes down to the question “Is YOUR C++ code faster than Python?” (and of course the reverse).

I’ve built a SCADA from scratch and performance requirements are low to begin with, seeing as it’s all network bound and real world objects take time to react, but I’m finding everything is very timely.

A colleague used SQLAlchemy for a similar task and got abysmal performance. No wonder, it’s constantly querying the DB for single results.

iAvicenna@lemmy.world · edit-2 3 months ago

Yeah, and then you use “not” with a variable name that doesn’t make it obvious that it is a list, and another person who reads the code thinks it is a bool. Hell, a couple of months later you won’t even remember yourself that it is a list. Moreover, “not” will not throw an error if the value isn’t a sequence/collection as it should be, but len will.

You should not sacrifice code readability and safety for over-optimization. This is Python after all; I don’t think list lengths will be your bottleneck.

acosmichippo@lemmy.world · 3 months ago

if you’re worried about readability you can leave a comment.

thebestaquaman@lemmy.world · 3 months ago

There is no guarantee that the comment is kept up to date with the code. “Self documenting code” is a meme, but clearly written code is pretty much always preferable to unclear code with a comment, largely because you can actually be sure that the code does what it says it does.

Note: You still need to comment your code kids.

iAvicenna@lemmy.world · 3 months ago

If there is an alternative that achieves the same intended effect and is a bit safer (because it verifies that the object has __len__ implemented), I would prefer that to commenting. Also, if I have to comment every use of “not” that stands in for a len check, that sounds quite redundant, as len checks are very common.

Artyom@lemm.ee · 3 months ago

In my experience, if you didn’t write the function that creates the list, there’s a solid chance it could be None too, and if you try to check the length of None, you get an error. This is also why returning None when a function fails is bad practice IMO, but that doesn’t seem to stop my coworkers.
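A minimal sketch of that situation, assuming a hypothetical `find_matches` function that returns None on failure (the name and its behaviour are made up for illustration):

```python
def find_matches(query):
    """Hypothetical lookup that returns None on failure instead of a list."""
    if not query:
        return None
    return [word for word in ("spam", "eggs", "ham") if query in word]


result = find_matches("")

# The truthiness check silently treats None like an empty list.
if not result:
    print("no matches (or the lookup failed, we cannot tell which)")

# The length check surfaces the None instead.
try:
    if len(result) == 0:
        print("no matches")
except TypeError as exc:
    print("len() raised:", exc)

# One option: normalise at the boundary so later code always sees a list.
matches = find_matches("") or []
print(len(matches))
```

Whether the silent behaviour or the TypeError is preferable depends on whether None is a legitimate value at that point in the code.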

LegoBrickOnFire@lemmy.world · 3 months ago

Passing None to a function expecting a list is the error…

iAvicenna@lemmy.world · edit-2 3 months ago

Good point. I try to initialize None collections to empty collections at the beginning, but that’s not always guaranteed, and len would catch it.

LegoBrickOnFire@lemmy.world · 3 months ago

I really dislike using boolean operators on anything that is not a boolean. I recently made an exception to my rule and got punished… Yeah, it was a skill issue on my part: I tried to check that a variable equal to 0 was not None using “if variable…”. But many programming rules are there to avoid bugs caused by this kind of inattention.
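The pitfall described above can be reproduced in a few lines. The function names here are hypothetical and only serve to contrast the two checks:

```python
def describe(value):
    """Buggy check: truthiness conflates 0 with None."""
    if value:
        return "set"
    return "missing"


def describe_fixed(value):
    """Explicit check: only None counts as missing."""
    if value is not None:
        return "set"
    return "missing"


print(describe(0))           # missing (the bug: 0 is a valid value)
print(describe_fixed(0))     # set
print(describe_fixed(None))  # missing
```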

Opisek@lemmy.world · 3 months ago

The graph makes no sense. Did a generative AI make it?

pyre@lemmy.world · 3 months ago

yeah I got angry just looking at it

uis@lemm.ee · 3 months ago

There are decades of articles on C++ optimizations that say “use empty() instead of size()”, which is the same as here.

ne0n@lemmy.world · 3 months ago

Isn’t “-2x faster” 2x slower?

Randelung@lemmy.world · 3 months ago

Maybe they mean up to?

Harvey656@lemmy.world · 3 months ago

I could have tripped, knocked over my keyboard, cried for 13 straight minutes on the floor, picked my keyboard back up, accidentally hit the enter key making a graph and it would have made more sense than this thing.

-2x faster. What does that even mean?

AnUnusualRelic@lemmy.world · 3 months ago

There’s probably an “import * from relativity” in there somewhere.

Archr@lemmy.world · 3 months ago

I haven’t read the article, but I’d assume this is for the same reason that not not string is faster than bool(string): it has to look up a global function rather than use a known keyword.
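One way to poke at that assumption is to disassemble both forms and time them. This is a rough sketch, not a benchmark: the function names are made up, the exact opcodes differ between CPython versions, and the timings depend on the machine and interpreter.

```python
import dis
import timeit


def uses_bool(s):
    return bool(s)


def uses_not_not(s):
    return not not s


def opnames(func):
    # Opcode names only; the exact list varies between CPython versions.
    return [ins.opname for ins in dis.get_instructions(func)]


# bool(s) has to load the name `bool` and call it; `not not s` does neither.
print(opnames(uses_bool))
print(opnames(uses_not_not))

# Rough timing; no particular ratio is guaranteed.
for stmt in ("bool(s)", "not not s"):
    seconds = timeit.timeit(stmt, globals={"s": "text"}, number=200_000)
    print(f"{stmt:10} {seconds:.4f}s")
```

The same reasoning would apply to len(): it is a builtin function that must be looked up and called, while `not` is an operator.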

AnUnusualRelic@lemmy.world · edit-2 3 months ago

From that little image, they’re happy it takes a tenth of a fucking second to check if a list is empty?

What kind of dorito chip is that code even running on?

borokov@lemmy.world · 3 months ago

Isn’t it because a list is a linked list, so to get the length it has to iterate over the whole list, whereas to check emptiness it just has to check whether there is a first element?

I’m too lazy to read the article BTW.

riodoro1@lemmy.world · edit-2 3 months ago

So… it has to iterate over the whole empty list, is what you’re saying? Like, once for each of the zero items in the list?

borokov@lemmy.world · 3 months ago

I don’t know how lists are implemented in Python. But in the dumb linked list implementation (like C++ std::list), each element has a “next” member that points to the next element. So, to get the list length, you have to do (pseudocode, not actual Python code):

len = 0
elt = list.first
while exist(elt):
    elt = elt.next
    len += 1
return len

Whereas to test if list is empty, you just have to:

return not exist(list.first)

riodoro1@lemmy.world · 3 months ago

That’s exactly what I was getting at. Getting the length of an empty list would not even enter the loop.
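For what it’s worth, the linked-list scenario can be modelled in runnable Python. `Node` and `LinkedList` are toy classes written for this sketch, not how Python stores lists: CPython’s built-in list is a dynamic array that keeps its element count, so len() does not traverse anything.

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next


class LinkedList:
    """Toy singly linked list that does not cache its length."""

    def __init__(self, values=()):
        self.first = None
        # Build back to front so the first value ends up at the head.
        for value in reversed(list(values)):
            self.first = Node(value, self.first)

    def is_empty(self):
        # Constant time: only looks at the head pointer.
        return self.first is None

    def length(self):
        # One loop iteration per element, so zero iterations when empty.
        count = 0
        node = self.first
        while node is not None:
            node = node.next
            count += 1
        return count


print(LinkedList().is_empty(), LinkedList().length())  # True 0
print(LinkedList([1, 2, 3]).length())                  # 3
```

Even in this model, counting an empty list costs nothing beyond the initial head check, which is the point made above.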