Why does AMD processor use sub instruction instead of xor to verify the stack canary?

Question

I've been working through chapter 12 of the picoCTF primer and noticed a difference between my disassembly of the program and picoCTF's at the end of the main function, where the stack canary is checked. Theirs is xor rdx,QWORD PTR fs:0x28 and mine is sub rdx,QWORD PTR fs:0x28.

I have an AMD processor, and my assembly uses the sub instruction to check for equality, whereas theirs uses xor. I understand that both do the same thing, but why the difference? Isn't xor more efficient, and does the choice even depend on the processor?

It's no more than personal preference. Both have been 1 cycle operations since the 486, and both affect all of the flags.
– Tim Roberts, Commented Aug 7, 2024 at 18:28
@TimRoberts okay, thanks for the explanation of the efficiency part!
– digitale, Commented Aug 7, 2024 at 19:51
xor instead of sub was a missed-optimization, fixed in GCC10 after I reported it. gcc.gnu.org/bugzilla/show_bug.cgi?id=90568 . (They're not necessarily equal, @TimRoberts, because we're not still using 486 CPUs. Intel since Sandybridge can macro-fuse sub/jcc into a single uop, but can't for xor. Recent AMD CPUs can do the same. But yes, sub is not worse anywhere.)
– Peter Cordes, Commented Aug 8, 2024 at 7:24

Peter Cordes · Accepted Answer · 2024-08-08 09:39:30Z

Older GCC used xor; GCC10 and later use sub, after I suggested that optimization: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90568
It doesn't depend on -mtune=znver3 or anything like that, just on the GCC version.

Intel Sandybridge-family can macro-fuse sub/jcc into a single uop, but can't for xor.

On other CPUs, sub and xor are equal in performance for this, so it's a win on that family of Intel CPUs with no downside anywhere else.

AMD Zen 3 and later can fuse sub or xor.
Earlier AMD can only fuse test and cmp.
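
To illustrate why the compiler is free to pick either instruction, here is a minimal C sketch of the two equality tests. The helper names (canary_ok_sub, canary_ok_xor, self_check) are hypothetical and exist only for this illustration; they are not part of GCC or glibc. The instruction sequences in the comment follow the two lines quoted in the question, and the je that follows is an assumption about the typical shape of the check; registers and offsets vary by build.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed shape of the check at the end of a protected function:
 *
 *   GCC10 and later:   sub rdx, QWORD PTR fs:0x28   ; ZF=1 iff values equal
 *                      je  <return path>
 *   older GCC:         xor rdx, QWORD PTR fs:0x28   ; ZF=1 iff values equal
 *                      je  <return path>
 *
 * Both leave zero in rdx (and set ZF) exactly when the saved canary
 * equals the reference value, so the branch behaves identically.
 */

/* Hypothetical helper: equality test expressed as a subtraction.
 * Unsigned subtraction wraps modulo 2^64, so the result is 0 only
 * when the operands are equal. */
static bool canary_ok_sub(uint64_t saved, uint64_t reference)
{
    return (saved - reference) == 0;
}

/* Hypothetical helper: equality test expressed as an exclusive OR.
 * XOR is 0 only when every bit matches. */
static bool canary_ok_xor(uint64_t saved, uint64_t reference)
{
    return (saved ^ reference) == 0;
}

/* Hypothetical self-check: both formulations agree on a few pairs. */
static void self_check(void)
{
    const uint64_t ref = 0x1122334455667700ULL; /* made-up canary value */

    assert(canary_ok_sub(ref, ref) && canary_ok_xor(ref, ref));
    assert(!canary_ok_sub(ref + 1, ref) && !canary_ok_xor(ref + 1, ref));
    assert(!canary_ok_sub(0, UINT64_MAX) && !canary_ok_xor(0, UINT64_MAX));
}
```

Since the results are interchangeable, the only difference is how the CPU decodes the pair: where sub can macro-fuse with the following jcc and xor cannot, sub saves a uop. To see which form your own GCC emits, compiling a function with a local array using something like gcc -O2 -fstack-protector-strong -S -masm=intel should show it.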

edited Aug 8, 2024 at 9:39

answered Aug 8, 2024 at 7:27

Peter Cordes
