
I have a large collection (26,214,400, to be exact) of data sets that I want to perform a linear regression on, i.e. each of the 26,214,400 data sets consists of n x values and n y values, and I want to find y = m * x + b. For any single set of points I can use sklearn or numpy.linalg.lstsq, something like:

A = np.vstack([x, np.ones(len(x))]).T
m, b = np.linalg.lstsq(A, y, rcond=None)[0]

Is there a way to set up the matrices such that I can avoid a python loop through 26,214,400 items? Or do I have to use a loop and would be better served using something like Numba?
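One loop-free option (not from the thread itself, just a sketch): for simple linear regression y = m*x + b the least-squares solution has a closed form, so every fit reduces to row-wise sums that NumPy can vectorize across all sets at once. The function name `fit_batch` is my own for illustration.

```python
import numpy as np

def fit_batch(xx, yy):
    """Fit y = m*x + b to each row of xx/yy; returns arrays (m, b).

    Uses the closed-form solution m = cov(x, y) / var(x),
    b = mean(y) - m * mean(x), computed row-wise with no Python loop.
    """
    x_mean = xx.mean(axis=1, keepdims=True)
    y_mean = yy.mean(axis=1, keepdims=True)
    dx = xx - x_mean
    dy = yy - y_mean
    m = (dx * dy).sum(axis=1) / (dx * dx).sum(axis=1)
    b = y_mean[:, 0] - m * x_mean[:, 0]
    return m, b
```

For 26,214,400 sets this needs only a handful of passes over the data, though at that scale you may still want to process the rows in chunks to limit peak memory.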

  • Are all the x values in the 26,214,400 sets the same? Commented May 23, 2020 at 6:41
  • @amazon-ex, no, unfortunately not Commented May 23, 2020 at 16:58
  • 1
    If memory limitations are not a problem, I would imagine you could stack the matrices consisting of x-data and ones into a single (enormous) block-diagonal matrix, and your y-data into a single vector, then solve the resulting linear equation. Testing it out with some dummy data now, I'll come back if it works :) Commented May 25, 2020 at 10:05
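The block-diagonal idea from the comment above can be sketched like this (illustrative only; with 26,214,400 sets the stacked design matrix would be huge, so a sparse representation and an iterative solver such as SciPy's lsqr are assumed):

```python
import numpy as np
from scipy.sparse import block_diag
from scipy.sparse.linalg import lsqr

# Dummy data: 3 sets of 5 points each, all lying exactly on y = 2x + 3.
xx = np.random.randn(3, 5)
yy = 2.0 * xx + 3.0

# One design matrix [x, 1] per set, stacked into a sparse block-diagonal
# matrix of shape (3*5, 3*2), with all y values stacked into one vector.
blocks = [np.column_stack((x, np.ones(len(x)))) for x in xx]
A = block_diag(blocks)
sol = lsqr(A.tocsr(), yy.ravel())[0]  # interleaved [m0, b0, m1, b1, ...]
m, b = sol[0::2], sol[1::2]
```

Because the blocks are independent, this solves the same per-set problems as the loop, just expressed as one sparse system.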

1 Answer


I ended up going the numba route, which yielded a ~20x speedup on my laptop. It used all my cores, so I assume the more CPUs the better. The answer looked something like:

import numpy as np
from numpy.linalg import lstsq
import numba

@numba.jit(nopython=True, nogil=True, parallel=True)
def fit(XX, yy):
    """Fit a large set of points to a regression."""
    assert XX.shape == yy.shape, "Inputs mismatched"
    n_pnts, n_samples = XX.shape

    scale = np.empty(n_pnts)
    offset = np.empty(n_pnts)

    for i in numba.prange(n_pnts):
        X, y = XX[i], yy[i]
        A = np.vstack((np.ones_like(X), X)).T
        offset[i], scale[i] = lstsq(A, y)[0]

    return offset, scale

Running it:

XX, yy = np.random.randn(2, 1000, 10)                                   

offset, scale = fit(XX, yy)                                             

%timeit offset, scale = fit(XX, yy)                                     
1.87 ms ± 37.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

The non-jitted version has this timing:

41.7 ms ± 620 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

1 Comment

A bit late now, but check out fastats; they have an OLS implementation, e.g. from fastats.linear_algebra import ols
