This forum has been archived. All content is frozen. Please use KDE Discuss instead.

Signed vs. unsigned indexes

Tags: None
(comma "," separated)
djp
Registered Member
Posts
7
Karma
0

Signed vs. unsigned indexes

Tue May 11, 2010 12:15 pm
Hi everyone,

I just read the discussion on the mailing list with the topic "Indexes: why signed instead of unsigned?" and, since I don't have email access at the moment, would like to contribute to the discussion via this channel.

As an argument in favor of having signed indexes, the following for-loop has been mentioned:
Code: Select all
for (int col = cols() - 1; col >= 0; --col)

which of course would fail when replacing int by size_t.

However, I don't think that this is a very strong argument, since a for-loop with decreasing index can be written as
Code: Select all
const size_t stop(-1); // could be defined globally somewhere
for (size_t i = size() - 1; i != stop; --i) ...

The only thing to take care of is to check for inequality instead of greater-than.

On the pro-side, using size_t instead of int, has the advantages that
1) since it's unsigned, out-of-bounds can be checked with a single "if(i>=size())"; the check for i<0 is unnecessary.
2) on 64bit architectures, it's 64 bit wide. This might not sound like a big advantage right now, but with the ever increasing memory capacities, I'm pretty sure that it's only a matter of time until someone hits the 2^31-limit. I'm using Eigen for signal processing tasks, and matrices with many million elements are not uncommon in my programs already.
3) Also, using the same code with STL and Eigen containers becomes easier, removing the need of casting into either the signed or the unsigned domain.

Anyway, these are just my thoughts on this topic.

Thanks for the great library, and best wishes,
Martin
User avatar
bjacob
Registered Member
Posts
658
Karma
3

Re: Signed vs. unsigned indexes

Tue May 11, 2010 12:49 pm
Development discussion should happen on the mailing list, not here. There's already a thread for that on the list. There's also already an entry in the issue tracker. Now this discussion is happening at 3 different places!

djp wrote:Hi everyone,

I just read the discussion on the mailing list with the topic "Indexes: why signed instead of unsigned?" and, since I don't have email access at the moment, would like to contribute to the discussion via this channel.

As an argument in favor of having signed indexes, the following for-loop has been mentioned:
Code: Select all
for (int col = cols() - 1; col >= 0; --col)

which of course would fail when replacing int by size_t.

However, I don't think that this is a very strong argument, since a for-loop with decreasing index can be written as
Code: Select all
const size_t stop(-1); // could be defined globally somewhere
for (size_t i = size() - 1; i != stop; --i) ...

The only thing to take care of is to check for inequality instead of greater-than.


We would have to remember doing that everytime. In the past this has bitten me, and compilers don't always emit warnings "condition always true" when checking i>=0 with unsigned i.

Moreover, notice that this makes the check a bit more expensive. While conditionally jumping if zero is just 1 instruction on most CPUs (at least x86), conditionally jumping based on comparing to 'stop' would typically involve 2 or 3 instructions. I agree this isn't a big deal since such reversed loops typically have large enough bodies, just mentioning a tiny disadvantage.

On the pro-side, using size_t instead of int, has the advantages that
1) since it's unsigned, out-of-bounds can be checked with a single "if(i>=size())"; the check for i<0 is unnecessary.


These checks are made only in asserts, which can be disabled in "release mode"; moreover, even when left enable, measurements show that in most practical situations they only have a minimal performance impact; so this doesn't make a significant difference.

That said, I agree that what you describe is an advantage. It just isn't a huge one.

2) on 64bit architectures, it's 64 bit wide.


Ah yes, while int is only 32 bits. That we can consider changing. ptrdiff_t would be the signed variant of size_t. Write to the list, this is too serious to discuss on the forum.

This might not sound like a big advantage right now, but with the ever increasing memory capacities, I'm pretty sure that it's only a matter of time until someone hits the 2^31-limit. I'm using Eigen for signal processing tasks, and matrices with many million elements are not uncommon in my programs already.


I understand. This is an argument in favor of ptrdiff_t vs. int, not in favor of unsigned vs. signed. However, while millions is common, billions is not too common. int gives us 2 billions, which is quite a lot. Asking the list for opinions.

3) Also, using the same code with STL and Eigen containers becomes easier, removing the need of casting into either the signed or the unsigned domain.


Not trying to follow too closely the STL's design is precisely what allowed us to come up with the current one... that said, when possible, easing interaction with the STL is of course an advantage.

Note: continuing discussion on the list, won't reply again here.


Join us on Eigen's IRC channel: #eigen on irc.freenode.net
Have a serious interest in Eigen? Then join the mailing list!


Bookmarks



Who is online

Registered users: Bing [Bot], Google [Bot], Sogou [Bot]