Registered Member
|
I have written a number of proof-of-concept programs using Eigen, but I have a particular problem for which I have not yet found an answer. To illustrate, I have code that computes an efficient and correct moving mean and standard deviation. I adapted the single-pass algorithm so that the data is stored in either a std::deque or a boost::circular_buffer. In this algorithm I take the oldest value and remove its impact on the final result, then add the newest value (so the program can work with a live data feed). In the case of an STL queue I pop the oldest value and push the newest one; in the case of a boost::circular_buffer I just let the container overwrite the oldest value with the newest one. I have other variants of the same algorithm that simply work with iterators, so that as long as the container has begin() and end() member functions, the variant just iterates through the container. Of course, the first variant is faster, since it needs to deal with the flops relating to only the oldest and newest values instead of all the values in the container, but that is a different matter. Indeed, the second variant does not have to deal with a live feed, and just (ultimately) gets the data it will work with once from a DB. In a multivariate feed, each std::vector<double> would have a value for each variate (and the elements of the circular buffer would be these std::vector<double> objects).
I say this, in preface to my real question, to emphasize that I often have to deal with a live feed, and I am trying to produce code that is as efficient as possible. In my recent attempts to use Eigen (successfully, I might add; I get good results), I have not found a way to get rid of 'obscene' copying, a practice that in my view wastes CPU cycles.
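For illustration, the sliding-window update described earlier in this post might look roughly like the sketch below. It is not the poster's actual code: the class name `MovingStats` and its members are hypothetical, it uses a std::deque, and it keeps running sums of the values and their squares so that each update only touches the oldest and newest value.

```cpp
#include <cmath>
#include <cstddef>
#include <deque>

// Hypothetical helper (not from the original post): moving mean and sample
// standard deviation over a fixed-size window, updated in O(1) per new value.
class MovingStats {
public:
    // A window of 0 is treated as 1 so that push() never pops an empty deque.
    explicit MovingStats(std::size_t window)
        : window_(window == 0 ? 1 : window) {}

    // Add the newest value; once the window is full, first remove the
    // contribution of the oldest value from the running sums.
    void push(double x) {
        if (values_.size() == window_) {
            const double old = values_.front();
            values_.pop_front();
            sum_ -= old;
            sum_sq_ -= old * old;
        }
        values_.push_back(x);
        sum_ += x;
        sum_sq_ += x * x;
    }

    std::size_t size() const { return values_.size(); }

    double mean() const {
        return values_.empty() ? 0.0 : sum_ / values_.size();
    }

    // Sample standard deviation (n - 1 denominator); 0 for fewer than 2 values.
    double stddev() const {
        const std::size_t n = values_.size();
        if (n < 2) return 0.0;
        const double m = sum_ / n;
        const double var = (sum_sq_ - n * m * m) / (n - 1);
        // Rounding can push a tiny variance slightly below zero; clamp it.
        return var > 0.0 ? std::sqrt(var) : 0.0;
    }

private:
    std::size_t window_;
    std::deque<double> values_;
    double sum_ = 0.0;
    double sum_sq_ = 0.0;
};
```

The running sum of squares is the simplest form of the update, but it can lose precision when the mean is large relative to the spread; a Welford-style update is a common alternative in that case.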
When I use Eigen, I find myself having to initialize temporary objects in my classes' functions representing the Eigen classes I need to use, and then COPY all the data from whichever container I am using to the instances of those Eigen classes. What I would like to do is find a way to use these containers with Eigen while avoiding the wasteful copying that my recent attempts have entailed. Related to this is the fact that I get just a multivariate feed, and I have to specify which of the variates is the dependent variable in, say, an SVD regression. Thus, similarly, is there an efficient way to specify that in this feed variate 'i' is the dependent variable and all the rest are to be treated as regressors, regardless of whether I am using a vector of vectors or a circular_buffer of vectors to represent the matrix holding the feed data, and avoiding as far as possible copying the data from one container to another? NB: There is no alternative I am aware of to using the circular_buffer, as many of the functions I need to evaluate are themselves functions of the moving averages and standard deviations of the variates in the feed. But others are functions of regression coefficients relating the variates, and of various statistics of the residuals from the regressions, so I need both my highly efficient statistical functions and my regressions to be as fast as possible. Any guidance on efficient usage of Eigen, especially with the STL and boost containers, would be greatly appreciated. Thanks Ted |
Moderator
|
Hi,
regarding your first question, have you considered using pointers or some kind of shared/smart pointers? I'm not sure I understand your second question. Are you looking for a way to see a container of vectors as a 2-dimensional matrix compatible with Eigen? |
Registered Member
|
Yes, I have considered using pointers and smart pointers, but the question becomes one of how to do it right when one must use matrices and linear systems.
I know I can initialize a 1D array or an Eigen vector from a std::vector by using std::vector's data() member function. And indeed, I have been using C pointers and smart pointers as automatic variables in the functions in which I am using Eigen to do different types of regression. But that has been trial and error. I do not know whether Eigen expects data in column-major or row-major order, or whether row and column vectors are stored differently (i.e. if a row vector is equivalent to a single C-style array, is a column vector equivalent to an array of pointers to arrays with a single element?). I have not traced through Eigen to see how memory is managed, particularly in initialization, but I don't see how it could possibly handle direct use of either C pointers or containers of std::vectors unless you have written overloads to handle each case. But I don't see that documented, and I don't yet know how, or even if, it can be done without at least one copy. If I were working in C, I could represent a matrix using an array of pointers to doubles, each of which would point to a C array of the same length, or I could use a single C array holding the data for all rows, accessing each row by using the right offsets (as in fact your sample shows, initializing your arrays and matrices using a 1D C array). But one cannot use either of these to initialize the other without copying data, unless you initialize a 1D array in such a way that each of the rows is contiguous, and then initialize an array of pointers with the addresses in the first array corresponding to the first element in each row. And certainly one cannot readily move between C arrays and C++ matrices comprised of a container in which the elements are themselves std::vectors.
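The last arrangement described above, a single contiguous array with a table of row pointers into it, can be sketched as follows. The names `FlatMatrix`, `row_major_index` and `col_major_index` are hypothetical and exist only for this illustration; the two index helpers show how the same flat array would be addressed under either storage order.

```cpp
#include <cstddef>
#include <vector>

// Offset of element (r, c) when rows are stored one after another.
inline std::size_t row_major_index(std::size_t r, std::size_t c, std::size_t cols) {
    return r * cols + c;
}

// Offset of element (r, c) when columns are stored one after another.
inline std::size_t col_major_index(std::size_t r, std::size_t c, std::size_t rows) {
    return c * rows + r;
}

// Hypothetical helper: one contiguous row-major block of rows x cols doubles,
// plus a table of pointers to the first element of each row. Both views share
// the same storage, so nothing is copied when moving between them.
struct FlatMatrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<double> data;      // row r occupies data[r*cols .. r*cols + cols - 1]
    std::vector<double*> row_ptr;  // row_ptr[r] points at data[r * cols]

    FlatMatrix(std::size_t r, std::size_t c)
        : rows(r), cols(c), data(r * c, 0.0), row_ptr(r) {
        for (std::size_t i = 0; i < rows; ++i) {
            row_ptr[i] = data.data() + i * cols;
        }
    }

    // Copying would leave row_ptr pointing into the source object's buffer,
    // so copies are disabled in this sketch.
    FlatMatrix(const FlatMatrix&) = delete;
    FlatMatrix& operator=(const FlatMatrix&) = delete;
};
```

With this layout `m.row_ptr[i][j]` and `m.data[row_major_index(i, j, m.cols)]` refer to the same double, which is what lets the pointer-table view and the flat view coexist without a copy.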
And yes, in a perfect world, I would like to have either std::vector<std::vector<double> > or boost::circular_buffer<std::vector<double> >, and be able to use them directly with Eigen without having to copy the data. That is, I would like to be able to use them to create an Eigen matrix object, so that I can use that to do an appropriate factorization and use the result to solve a linear system (e.g. for a regression). If Eigen were written to work with iterators only, at least to read the input data, then it ought to make no difference whether one is using std::vector<std::vector<double> > or boost::circular_buffer<std::vector<double> >, as both support the same iterator concepts. And I'd like to have either of my preferred containers receiving data from a multivariate data feed, and be able to, for example, iterate through the variates, with each of the variates regressed against the others, without copying the data at each iteration (I can see doing this if the input data is never overwritten and one is creative in using indices, but I don't know if Eigen is written in a way that supports that). How much of what I am after is possible with Eigen? And if any of it is possible, how does one do it (or which parts of the documentation explain how to do it)? Thanks Ted |
Moderator
|
std::vector<std::vector<double> > or boost::circular_buffer<std::vector<double> > could be understood by Eigen by writing an expression wrapping them. However, this kind of object is pretty slow. Some related threads:
viewtopic.php?f=74&t=102354 viewtopic.php?f=74&t=97365 |
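To make the idea of a wrapper concrete, here is a minimal sketch of a read-only view over a container of rows. It is not an Eigen expression and not part of Eigen's API: `RegressionView` and its members are hypothetical names, and the container is assumed to offer size(), empty() and operator[] with equally sized, non-empty rows. The view treats one variate as the dependent variable and the rest as regressors without copying anything.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical read-only view (NOT an Eigen class). Rows is any container of
// equally sized rows with size(), empty() and operator[], for example
// std::vector<std::vector<double>>.
template <typename Rows>
class RegressionView {
public:
    // `dependent` is the index of the variate treated as the response.
    RegressionView(const Rows& rows, std::size_t dependent)
        : rows_(&rows), dep_(dependent) {}

    // Number of observations.
    std::size_t rows() const { return rows_->size(); }

    // Number of regressors: every variate except the dependent one.
    // Assumes each row holds at least one value.
    std::size_t cols() const {
        return rows_->empty() ? 0 : (*rows_)[0].size() - 1;
    }

    // Regressor j of observation i, skipping over the dependent column.
    double x(std::size_t i, std::size_t j) const {
        return (*rows_)[i][j < dep_ ? j : j + 1];
    }

    // Dependent variable of observation i.
    double y(std::size_t i) const { return (*rows_)[i][dep_]; }

private:
    const Rows* rows_;  // non-owning: the underlying container must outlive the view
    std::size_t dep_;
};
```

Every element access here goes through two levels of indexing, and consecutive rows are not adjacent in memory, which is one plausible reason wrappers of this kind are slow. When the feed can instead be kept in one contiguous buffer, Eigen's `Map` class is designed to view such a buffer in place.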