Doubles 15x slower than floats on BeagleBone Black

jack1qwertyuiop
I have been running tests on Eigen to decide parameters for an Ubuntu port to a BeagleBone Black, and the double-precision times seem disproportionately slow. Each test multiplies two 100x100 matrices of random values between 0 and 1.

Float time: 2.030674 ms
Double time: 30.932476 ms
Ratio double/float: 15.23262

Is there any reason for this that I am missing?

The Makefile is:

Code:
testeigen:
   g++ -I ~/Eigen/ testeigen.cpp -ffast-math -mfloat-abi=hard -O3 -funroll-loops -DNDEBUG -mfpu=neon -march=armv7 -o testeigen


The float test is:
Code:
#include <iostream>
#include <fstream>
#include <math.h>
#include <time.h>
#include <stdlib.h>
#include <stdio.h>
#include <Eigen/Dense>

#define ITERATIONS 10000
#define MAT_SIZE 100

using namespace std;

timespec difference(timespec start, timespec end) {
 timespec temp;
 if ((end.tv_nsec-start.tv_nsec)<0) {
   temp.tv_sec = end.tv_sec-start.tv_sec-1;
   temp.tv_nsec = 1000000000+end.tv_nsec-start.tv_nsec;
 } else {
   temp.tv_sec = end.tv_sec-start.tv_sec;
   temp.tv_nsec = end.tv_nsec-start.tv_nsec;
 }
 return temp;
}

int main ()
{
  srand(time(0));
  struct timespec start;
  struct timespec end;
  double eigentime = 0;

  int i;

  Eigen::MatrixXf e_x (MAT_SIZE, MAT_SIZE);
  Eigen::MatrixXf e_y (MAT_SIZE, MAT_SIZE);
  Eigen::MatrixXf e_result (MAT_SIZE, MAT_SIZE);


  for(int j = 0; j < ITERATIONS; j++) {
    for(int i = 0; i < MAT_SIZE*MAT_SIZE;i++) {
      e_x.data()[i] = (float)rand()/(float)RAND_MAX;
      e_y.data()[i] = (float)rand()/(float)RAND_MAX;
    }

    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &start);
    e_result = e_x*e_y;
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &end);
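    // Include the seconds field so a multiply longer than 1 s is not undercounted.
    timespec elapsed = difference(start,end);
    eigentime += elapsed.tv_sec*1000.0 + elapsed.tv_nsec/1000000.0;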

  }

  printf("eigen time:        %f ms\n",eigentime/ITERATIONS);

  return 0;
}


The double test is identical except for these lines:
Code:
  Eigen::MatrixXd e_x (MAT_SIZE, MAT_SIZE);
  Eigen::MatrixXd e_y (MAT_SIZE, MAT_SIZE);
  Eigen::MatrixXd e_result (MAT_SIZE, MAT_SIZE);

  for(int j = 0; j < ITERATIONS; j++) {
    for(int i = 0; i < MAT_SIZE*MAT_SIZE;i++) {
      e_x.data()[i] = (double)rand()/(double)RAND_MAX;
      e_y.data()[i] = (double)rand()/(double)RAND_MAX;
    }
ggael
As far as I know, the NEON unit on the ARM Cortex-A8 CPU does not support double precision, so double-precision arithmetic falls back to the much slower, non-pipelined VFP unit, while single precision is fully vectorized with the NEON instruction set. Therefore, a factor of 15 is not that high.

