This forum has been archived. All content is frozen. Please use KDE Discuss instead.

operation on Map crash when input is not aligned

Tags: None
(comma "," separated)
eranb
Registered Member
Posts
11
Karma
0
I have a problem, using Map of unaligned doubles (x86 compilation, x64 machine, windows7, visual studio 2013) :
Code: Select all
#include <Eigen/Eigen>
using namespace Eigen;
int main()
{
   // Create a buffer of doubles, that are not aligned to 8 bytes, but to 4
   char buf1[1024];
   char *b1(buf1);
   while ((int)b1 % 8 != 4) b1++;
   double *d1((double*)b1);

   // Create a buffer of doubles (may be aligned)
   double d2[100];
   for (int i=0;i<100;++i) d2[i]=i;

   // Create maps
   typedef Map<Matrix<double, Dynamic, Dynamic, RowMajor>, Unaligned > Map;
   Map map1(d1,10,10);
   Map map2(d2,10,10);
   
   // This line will crash
   map1 = map2;
   return 0;
}


The crash is happening in Assign.h line 465, when it's trying to copy the second line of the matrix.
When the first loop iteration is being run (copying the first line of the matrix), the alignedStart value is 10, so nothing is vectorized, which is right.
But, at the end of the for loop (line 471) the value of alignedStart for the next iteration is being evaluated, alignedStep value is 0 so (alignedStart+alignedStep)&packetSize = (10-0)%2 = 0, and alignedStart of the 2nd line becomes 0, which is wrong.
so at the second iteration (copying the 2nd line of the matrix), it's trying to vectorize the copy using aligned opcodes, and crashes.
The computation of 2nd alignedStart seems to be wrong. There will be no aligned elements in the whole matrix.
Is it a bug ?

Note :
This happened in my code when a variable on the stack was not aligned to 8 bytes but to 4, this example enforces that situation.
User avatar
ggael
Moderator
Posts
3447
Karma
19
OS
This is surprising because Windows impose double to be aligned on a 64 bits boundary, even for 32 bits windows. Therefore, Eigen also assumes that double are aligned a 64 bits boundary on that platform. Looks like your stack has been corrupted.
User avatar
ggael
Moderator
Posts
3447
Karma
19
OS
for the record, a possible workaround would be:
Code: Select all
diff --git a/Eigen/src/Core/Assign.h b/Eigen/src/Core/Assign.h
--- a/Eigen/src/Core/Assign.h
+++ b/Eigen/src/Core/Assign.h
@@ -434,29 +434,31 @@ struct assign_impl<Derived1, Derived2, L
 ***************************/
 
 template<typename Derived1, typename Derived2, int Version>
 struct assign_impl<Derived1, Derived2, SliceVectorizedTraversal, NoUnrolling, Version>
 {
   typedef typename Derived1::Index Index;
   static inline void run(Derived1 &dst, const Derived2 &src)
   {
-    typedef packet_traits<typename Derived1::Scalar> PacketTraits;
+    typedef typename Derived1::Scalar Scalar;
+    typedef packet_traits<Scalar> PacketTraits;
     enum {
       packetSize = PacketTraits::size,
       alignable = PacketTraits::AlignedOnScalar,
       dstAlignment = alignable ? Aligned : int(assign_traits<Derived1,Derived2>::DstIsAligned) ,
       srcAlignment = assign_traits<Derived1,Derived2>::JointAlignment
     };
-    const Index packetAlignedMask = packetSize - 1;
+    const Scalar *dst_ptr = &dst.coeffRef(0,0);
+    const Index packetAlignedMask = (dstAlignment && ((Index(dst_ptr) % sizeof(Scalar))==0)) ? packetSize - 1 : ~Index(0);
     const Index innerSize = dst.innerSize();
     const Index outerSize = dst.outerSize();
     const Index alignedStep = alignable ? (packetSize - dst.outerStride() % packetSize) & packetAlignedMask : 0;
     Index alignedStart = ((!alignable) || assign_traits<Derived1,Derived2>::DstIsAligned) ? 0
-                       : internal::first_aligned(&dst.coeffRef(0,0), innerSize);
+                       : internal::first_aligned(dst_ptr, innerSize);
 
     for(Index outer = 0; outer < outerSize; ++outer)
     {
       const Index alignedEnd = alignedStart + ((innerSize-alignedStart) & ~packetAlignedMask);
       // do the non-vectorizable part of the assignment
       for(Index inner = 0; inner<alignedStart ; ++inner)
         dst.copyCoeffByOuterInner(outer, inner, src);


but again, I'm surprised that such a situation occur.
eranb
Registered Member
Posts
11
Karma
0
Well... I'm compiling this code (Windows 7_64bit, Visual Studio 2013, compiled for x86 , Debug, created from default console application) :
Code: Select all
#include <iostream>
int main()
{
   double d;
   d=15.0;
   if ((long long)(&d) % 8) std::cout<<"not aligned";
}

Running gives "not aligned", sometimes (~20%).
This stack cannot be corrupted...

Tried in 3 different computers in our company...
eranb
Registered Member
Posts
11
Karma
0
From MSDN :
Stack Alignment
On both of the 64-bit platforms, the top of each stackframe is 16-byte aligned. Although this uses more space than is needed, it guarantees that the compiler can place all data on the stack in a way that all elements are aligned.
The x86 compiler uses a different method for aligning the stack. By default, the stack is 4-byte aligned. Although this is space efficient, you can see that there are some data types that need to be 8-byte aligned, and that, in order to get good performance, 16-byte alignment is sometimes needed. The compiler can determine, on some occasions, that dynamic 8-byte stack alignment would be beneficial—notably when there are double values on the stack.
The compiler does this in two ways. First, the compiler can use link-time code generation (LTCG), when specified by the user at compile and link time, to generate the call-tree for the complete program. With this, it can determine regions of the call-tree where 8-byte stack alignment would be beneficial, and it determines call-sites where the dynamic stack alignment gets the best payoff. The second way is used when the function has doubles on the stack, but, for whatever reason, has not yet been 8-byte aligned. The compiler applies a heuristic (which improves with each iteration of the compiler) to determine whether the function should be dynamically 8-byte aligned.

https://msdn.microsoft.com/en-us/library/aa290049%28v=vs.71%29.aspx

Well, according to my example the compiler can improve the heuristic... It seems that the compiler may fail to align doubles even with very simple cases.
Anyway, Eigen cannot assume that every double is aligned in windows platforms (with visual studio) - sometimes (that are not so rare) the heuristic fails, and doubles are not aligned.
User avatar
ggael
Moderator
Posts
3447
Karma
19
OS
What puzzle me is that you are the first one hitting this issue while there are several hundreds of Eigen users on windows. Could you try to run the unit tests? Here is what I use on windows:
Code: Select all
$ call "c:\Program Files\Microsoft Visual Studio 11.0\VC\vcvarsall.bat"
$ cd path/to/eigen/sources
$ mkdir build
$ cd build
$ cmake .. -DCMAKE_CXX_FLAGS="/MP2" -DEIGEN_TEST_NO_OPENGL=on -DEIGEN_ENABLE_LAPACK_TESTS=OFF  -DEIGEN_TEST_NOQT=ON
$ ctest -V -D Experimental

this might take about 2 hours.
eranb
Registered Member
Posts
11
Karma
0
Iv'e checked the tests. No test checks mapping a double variable on the stack with outer stride, that should use packed elements.
The next small test will crash in x86, debug (when the output is "4", about 20% of the runs) :

void foo()
{
typedef Matrix<double, Dynamic, Dynamic, RowMajor> MatrixType;
typedef Map<MatrixType, Unaligned, OuterStride<> > MapType;
const int rows(5), cols(5), stride(6);
double arrayUnaligned[cols*stride];

std::cout<< (int)arrayUnaligned % 8 << std::endl;

MapType map(arrayUnaligned,rows,cols,OuterStride<>(stride));
map *= 2.0;
}
User avatar
ggael
Moderator
Posts
3447
Karma
19
OS
OuterStride is tested in mapstride.cpp unit test, however, I agree that stack allocated buffers were not tested. Fixed:

https://bitbucket.org/eigen/eigen/commits/dcd97d2510ef/
https://bitbucket.org/eigen/eigen/commits/1990c04fc972/
User avatar
ggael
Moderator
Posts
3447
Karma
19
OS


Bookmarks



Who is online

Registered users: Bing [Bot], Google [Bot], Yahoo [Bot]