3

I am trying to fit some data using scipy.optimize.curve_fit. I have read the documentation and also this StackOverflow post, but neither seem to answer my question.

I have some data which is simple, 2D data which looks approximately like a trig function. I want to fit it with a general trig function using scipy.

My approach is as follows:

from __future__ import division import numpy as np from scipy.optimize import curve_fit #Load the data data = np.loadtxt('example_data.txt') t = data[:,0] y = data[:,1] #define the function to fit def func_cos(t,A,omega,dphi,C): # A is the amplitude, omega the frequency, dphi and C the horizontal/vertical shifts return A*np.cos(omega*t + dphi) + C #do a scipy fit popt, pcov = curve_fit(func_cos, t,y) #Plot fit data and original data fig = plt.figure(figsize=(14,10)) ax1 = plt.subplot2grid((1,1), (0,0)) ax1.plot(t,y) ax1.plot(t,func_cos(t,*popt)) 

This outputs:

enter image description here

where blue is the data orange is the fit. Clearly I am doing something wrong. Any pointers?

2
  • Please upload sample data for 'example_data.txt' otherwise it's difficult to reproduce. Commented Dec 11, 2019 at 19:54
  • 1
    It is accessible under the some_data hyperlink Commented Dec 11, 2019 at 19:55

1 Answer 1

5

If no values are provided for initial guess of the parameters p0 then a value of 1 is assumed for each of them. From the docs:

p0 : array_like, optional
Initial guess for the parameters (length N). If None, then the initial values will all be 1 (if the number of parameters for the function can be determined using introspection, otherwise a ValueError is raised).

Since your data has very large x-values and very small y-values an initial guess of 1 is far from the actual solution and hence the optimizer does not converge. You can help the optimizer by providing suitable initial parameter values that can be guessed / approximated from the data:

  • Amplitude: A = (y.max() - y.min()) / 2
  • Offset: C = (y.max() + y.min()) / 2
  • Frequency: Here we can estimate the number of zero crossing by multiplying consecutive y-values and check which products are smaller than zero. This number divided by the total x-range gives the frequency and in order to get it in units of pi we can multiply that number by pi: y_shifted = y - offset; oemga = np.pi * np.sum(y_shifted[:-1] * y_shifted[1:] < 0) / (t.max() - t.min())
  • Phase shift: can be set to zero, dphi = 0

So in summary, the following initial parameter guess can be used:

offset = (y.max() + y.min()) / 2 y_shifted = y - offset p0 = ( (y.max() - y.min()) / 2, np.pi * np.sum(y_shifted[:-1] * y_shifted[1:] < 0) / (t.max() - t.min()), 0, offset ) popt, pcov = curve_fit(func_cos, t, y, p0=p0) 

Which gives me the following fit function:

Fit

Sign up to request clarification or add additional context in comments.

2 Comments

That is perfect, thanks! Is it just me or does the fit seem slightly vertically displaced though?
@user1887919 It looks like but this is the version that minimizes the difference between the two functions. Around the peaks it seems the fit function is shifted but this has the benefit that the slopes are in very good agreement. Considering that the peaks make only a small portion of the x-range while the slopes make a great portion, this way the difference is minimized. You can try and set the initial guess for the offset to something like amplitude + offset but it will again converge to this solution.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.