I have a piece of code that I time with a very simple script (I have a SWIG wrapper around my C++ class for Python).
The library compiled on Windows runs in 1s; the same library compiled on Linux runs in 40s. My code makes heavy use of big matrix multiplications, so I assume something is wrong at the compilation step, probably regarding SIMD extensions.

Systems:
- Windows 8, MSVC 2012 compiler
- Ubuntu 14.04, GCC 4.8.2

Everything is run through CMake, which gives me the following compilation commands (this is a sample so that you see the options; the names have been changed):

Windows:
C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\CL.exe /c /IC:\eigen_3 /nologo /W3 /WX- /O2 /Ob2 /Oy- /D WIN32 /D _WINDOWS /D NDEBUG /D EIGEN_NO_DEBUG /D "CMAKE_INTDIR=\"Release\"" /D _MBCS /Gm- /EHsc /MD /GS /fp:precise /Zc:wchar_t /Zc:forScope /GR /Fo"test.dir\Release\\" /Fd"test.dir\Release\vc120.pdb" /Gd /TP /analyze- /errorReport:queue -msse2 ..\..\test\test.cpp

Linux:
/usr/bin/c++ -DEIGEN_NO_DEBUG -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -I/usr/include/eigen3 -fPIC -o CMakeFiles/project.dir/test.cpp.o -c /home/flavian/test/test.cpp

Am I missing anything?
You are missing the most important flag to enable compiler optimizations: -O2
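
One way to confirm that the flag really reaches the compiler is to have the library report how it was built. The snippet below is only a sketch: the function name `describe_build` is made up for illustration and is not part of Eigen or of the original project. It relies on `__OPTIMIZE__`, which GCC and Clang predefine at -O1 and above, and on the `__SSE2__` / `__SSE4_2__` macros that GCC and Clang define when those instruction sets are enabled. MSVC defines none of these, so the function reports "unknown" there.

```cpp
#include <string>

// Hypothetical helper (not part of Eigen or the original code):
// summarizes the optimization and SIMD settings seen at compile time.
std::string describe_build() {
    std::string report;

    // GCC and Clang predefine __OPTIMIZE__ at -O1 and above.
#if defined(__OPTIMIZE__)
    report += "optimized=yes";
#elif defined(_MSC_VER)
    // MSVC has no equivalent macro, so /O2 cannot be detected this way.
    report += "optimized=unknown";
#else
    report += "optimized=no";
#endif

    // GCC and Clang define these when -msse4.2 / -msse2 are in effect.
#if defined(__SSE4_2__)
    report += " simd=sse4.2";
#elif defined(__SSE2__)
    report += " simd=sse2";
#else
    report += " simd=unknown";
#endif

    return report;
}
```

Exposing a function like this through the SWIG wrapper and printing its result from Python should show "optimized=no" for a Linux build made with the command line quoted in the question, and "optimized=yes" once -O2 is added.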
Oh, you're right. Two quick additional questions (that, to me, don't deserve a dedicated topic):
- What about the other -Ox levels, -O3 in particular?
- Does matrix multiplication take advantage of the dot product instruction in SSE4?

Thanks a lot for the quick reply!
