I am trying to fit a piecewise linear regression in Python. The data looks like this:

[image: plot of the data]

I need to fit three lines, one for each section. Any idea how? I have the following code, but it produces the result shown below. Any help would be appreciated.

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy import optimize

    def piecewise(x, x0, x1, y0, y1, k0, k1, k2):
        # Three linear pieces with breakpoints x0 and x1
        return np.piecewise(
            x,
            [x <= x0, np.logical_and(x0 < x, x < x1), x > x1],
            [lambda x: k0*x + y0,
             lambda x: k1*(x - x0) + y1 + k0*x0,
             lambda x: k2*(x - x1) + y0 + y1 + k0*x0 + k1*(x1 - x0)])

    x1 = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10 ,11, 12, 13, 14, 15,16,17,18,19,20,21], dtype=float)
    y1 = np.array([5, 7, 9, 11, 13, 15, 28.92, 42.81, 56.7, 70.59, 84.47, 98.36, 112.25, 126.14, 140.03,145,147,149,151,153,155])
    y1 = np.flip(y1,0)
    x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10 ,11, 12, 13, 14, 15,16,17,18,19,20,21], dtype=float)
    y = np.array([5, 7, 9, 11, 13, 15, 28.92, 42.81, 56.7, 70.59, 84.47, 98.36, 112.25, 126.14, 140.03,145,147,149,151,153,155])
    y = np.flip(y,0)

    perr_min = np.inf
    p_best = None
    for n in range(100):
        k = np.random.rand(7)*20
        p , e = optimize.curve_fit(piecewise, x1, y1,p0=k)
        perr = np.sum(np.abs(y1-piecewise(x1, *p)))
        if(perr < perr_min):
            perr_min = perr
            p_best = p

    xd = np.linspace(0, 21, 100)
    plt.figure()
    plt.plot(x1, y1, "o")
    y_out = piecewise(xd, *p_best)
    plt.plot(xd, y_out)
    plt.show()

[image: data with fit]

Thanks.

  • Your images haven't shown up Commented Sep 7, 2017 at 4:39
  • I attempted to run the code you posted, and it seems to be missing import statements. Would you please post an entire working example? Commented Sep 7, 2017 at 14:23
  • Hi, I just added the import statements. Commented Sep 7, 2017 at 19:56
  • Question: can you use a single equation, or for this specific case are you set on a piecewise model? Commented Sep 10, 2017 at 1:38
  • I cannot use a single equation because this is scientific data and the 3 zones in the curve have different properties. So I am looking for a piecewise fit. Commented Sep 15, 2017 at 2:57

1 Answer

A very simple method (without iteration, without initial guess) can solve this problem.

The calculation method comes from page 30 of this paper: https://fr.scribd.com/document/380941024/Regression-par-morceaux-Piecewise-Regression-pdf (copy below).

[image: excerpt from page 30 of the paper]

The next figure shows the result:

[image: plot of the fitted result]

The equation of the fitted function is:

[image: equation of the fitted function]

Or equivalently:

[image: equivalent form of the equation]

H is the Heaviside function.

[image]
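Since the equation images are not reproduced here, the following is a minimal sketch of how a continuous three-segment line can be written with Heaviside steps in general. It is not a transcription of the equation in the images above; the function name `three_segment` and the parameters `a`, `b`, `c1`, `c2`, `x0`, `x1` are illustrative assumptions.

```python
import numpy as np

def three_segment(x, a, b, c1, c2, x0, x1):
    """Continuous piecewise-linear function with breakpoints x0 < x1.

    y = a + b*x + c1*(x - x0)*H(x - x0) + c2*(x - x1)*H(x - x1)
    The slope is b, then b + c1, then b + c1 + c2.
    """
    x = np.asarray(x, dtype=float)
    # np.heaviside(z, 0.5) is 0 for z < 0 and 1 for z > 0. Its value at
    # z == 0 does not matter here because it is multiplied by z itself.
    h0 = np.heaviside(x - x0, 0.5)
    h1 = np.heaviside(x - x1, 0.5)
    return a + b * x + c1 * (x - x0) * h0 + c2 * (x - x1) * h1

# Illustrative values: slopes 1, 3 and 0.5 with breaks at x = 2 and x = 5
xs = np.array([0.0, 2.0, 3.0, 5.0, 7.0])
ys = three_segment(xs, a=1.0, b=1.0, c1=2.0, c2=-2.5, x0=2.0, x1=5.0)
print(ys)
```

Writing the model this way makes continuity at the breakpoints automatic, because each hinge term is zero at its own breakpoint.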

In addition, the details of the numerical calculation are given below:

[image: details of the numerical calculation]
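For readers who want something runnable in Python, here is a rough sketch applied to the data from the question. Note that this is not the method from the paper: it swaps in a plain grid search over the two breakpoints, with an ordinary linear least-squares solve (`np.linalg.lstsq`) for each candidate pair. The helper names `design` and `fit_two_breaks` and the 0.25 grid step are assumptions made for illustration.

```python
import itertools
import numpy as np

# Data from the question
x = np.arange(1, 22, dtype=float)
y = np.flip(np.array([5, 7, 9, 11, 13, 15, 28.92, 42.81, 56.7, 70.59, 84.47,
                      98.36, 112.25, 126.14, 140.03, 145, 147, 149, 151, 153, 155]))

def design(x, x0, x1):
    # Columns: intercept, x, hinge at x0, hinge at x1.
    # With fixed breakpoints the model is linear in its coefficients.
    return np.column_stack([np.ones_like(x), x,
                            np.maximum(x - x0, 0.0),
                            np.maximum(x - x1, 0.0)])

def fit_two_breaks(x, y, candidates):
    best = None
    # combinations() of a sorted sequence yields pairs with x0 < x1
    for x0, x1 in itertools.combinations(candidates, 2):
        A = design(x, x0, x1)
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        sse = float(np.sum((A @ coef - y) ** 2))
        if best is None or sse < best[0]:
            best = (sse, x0, x1, coef)
    return best

# Interior grid points between min(x) and max(x), spaced 0.25 apart
candidates = np.linspace(x.min(), x.max(), 81)[1:-1]
sse, x0, x1, coef = fit_two_breaks(x, y, candidates)

# Hinge coefficients are slope changes, so the running sum gives each slope
slopes = np.cumsum(coef[1:])
print(f"breakpoints: {x0:.2f}, {x1:.2f}")
print("slopes:", np.round(slopes, 2))
```

The grid search needs no initial guess, but its cost grows quickly with the number of breakpoints and its resolution is limited by the grid step, so it is only practical for a small number of segments.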


6 Comments

Do you know if there's a Python package for it?
Probably not. But writing a computer program of this kind is not difficult.
Sounds right. Before I dive into the source you provided - does it generalize well to m (parametrized) segments?
Theoretically I would say yes. But as the number of segments increases, the matrices grow larger and the calculation becomes less accurate for scattered data. In practice, I am afraid the number of segments is strongly limited by numerical issues and by the increasing deviations that result from scatter in the experimental data.
Yes, it is just matrix inversion and finding the roots of a polynomial equation. But the more terms there are, the larger the deviation in the final result, due to the scatter in the initial data.