Efficiently shuffling data?

NSL
Registered Member
Posts
1
Karma
0

Wed Dec 01, 2010 11:47 pm
I'm working on a project right now that has a large number (~20,000) of row vectors, each of length N (~50,000). For each of these vectors, I need to sample it (with replacement) N times.

e.g.
Input:
[a, b, c, d, e]

Output:
[d, b, a, a, e]

I have "stacked" all the row vectors into a large matrix to make working with the data easier. Thus I need to figure out an efficient way to sample each row of a matrix (with replacement).

Input:
[a1, b1, c1]
[a2, b2, c2]
[a3, b3, c3]

Output:
[b1, c1, c1]
[c2, a2, b2]
[a3, a3, c3]


Is there a trick, hack, or shortcut to do this efficiently in Eigen? The naive method of

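Matrix input, output;   // same dimensions
for (int i = 0; i < input.rows(); ++i)
    for (int j = 0; j < input.cols(); ++j)
        output(i, j) = input(i, rand(0, N));   // rand(0, N): uniform integer in [0, N), N = input.cols()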

is O(n*m), which for my application is ~O(n^2). Normally I could live with this, but I have to repeat this sampling procedure many tens of thousands of times per run, so speed is a big issue. Any suggestions?
ggael
Moderator
Posts
3447
Karma
19
Re: Efficiently shuffling data?

Thu Dec 02, 2010 12:02 pm
Your problem really is O(n^2), so I can only think of low-level optimizations such as partial unrolling, multithreading, and using a custom fast RNG function that can be inlined. Also make sure you use a row-major matrix, Matrix<T,Dynamic,Dynamic,RowMajor>, to ensure coherent memory stores.
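To make two of these suggestions concrete (an inlinable RNG and row-major traversal), here is a rough sketch. It is not taken from Eigen itself: XorShift32 and resample_rows are hypothetical names, the generator is a plain xorshift32 chosen only because it is tiny, and the kernel works on raw pointers to contiguous row-major storage. With a Matrix<T,Dynamic,Dynamic,RowMajor> you would presumably pass input.data() and output.data(). It also assumes the number of columns fits in 32 bits.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: a small xorshift32 generator, cheap enough to inline.
// Statistical quality is modest; swap in a better generator if that matters.
struct XorShift32 {
    std::uint32_t state;

    // The state must never be zero, so a zero seed is replaced by 1.
    explicit XorShift32(std::uint32_t seed) : state(seed ? seed : 1u) {}

    inline std::uint32_t next() {
        state ^= state << 13;
        state ^= state >> 17;
        state ^= state << 5;
        return state;
    }

    // Map a 32-bit draw to [0, n) with a multiply and shift instead of a modulo.
    // This is very slightly biased, which is assumed to be acceptable here.
    inline std::size_t below(std::uint32_t n) {
        return static_cast<std::size_t>(
            (static_cast<std::uint64_t>(next()) * n) >> 32);
    }
};

// Hypothetical kernel: resample every row of a rows x cols row-major array
// with replacement. Writes to 'out' are sequential; reads stay within one row.
template <typename T>
void resample_rows(const T* in, T* out,
                   std::size_t rows, std::size_t cols, XorShift32& rng) {
    const std::uint32_t n = static_cast<std::uint32_t>(cols);
    for (std::size_t i = 0; i < rows; ++i) {
        const T* src = in + i * cols;   // start of input row i
        T* dst = out + i * cols;        // start of output row i
        for (std::size_t j = 0; j < cols; ++j)
            dst[j] = src[rng.below(n)];
    }
}
```

Rows are independent, so the outer loop is the natural place to split work across threads, as long as each thread gets its own generator with a different seed.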

