$\begingroup$

I am aware that there are algorithms to fit, say, an ellipse to a set of given points in the plane. For instance, this SO question has answers which feature both literature on the algorithms and implementations in Python. However, suppose our data doesn't look like an ellipse, and we want to fit a more general convex shape instead. "Convex shape" on its own is too broad a category, so what I'm really looking for is a way to fit a convex shape that is, in an appropriate sense, smooth.

My question arose after trying to calculate the area enclosed by a hysteresis loop that was given as a large set of noisy data points, which weren't ordered along the curve either. If I could just fit that somewhat weird shape, then I could use one of the many algorithms available to calculate the area enclosed by the fitted curve. I guess one solution would be to find a way to do some kind of least squares fit of this specific shape, but I'm wondering whether there's some more general way.
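To make the area step concrete, here is a minimal sketch of what I have in mind for the simplest case. It assumes the loop is star-shaped about the centroid of the points (so sorting by angle recovers the order along the curve), which may well fail for a thin or S-shaped hysteresis loop. The function name `enclosed_area` is just illustrative. With noisy data the resulting polygon is jagged, so this is a rough estimate rather than a fit.

```python
import math

def enclosed_area(points):
    """Rough area estimate for unordered 2D points on a closed loop.

    Assumption: the loop is star-shaped about the centroid of the points,
    so ordering by polar angle gives a valid traversal of the curve.
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n

    # Order the points by angle around the centroid.
    ordered = sorted(points, key=lambda p: math.atan2(p[1] - cy, p[0] - cx))

    # Shoelace formula on the resulting closed polygon.
    twice_area = 0.0
    for (x1, y1), (x2, y2) in zip(ordered, ordered[1:] + ordered[:1]):
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0
```

For example, `enclosed_area([(1, 1), (0, 0), (1, 0), (0, 1)])` handles the four corners of the unit square given out of order.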

$\endgroup$
  • $\begingroup$ You could construct the convex hull then mollify to make it smooth. $\endgroup$ Commented Mar 30 at 14:35
  • $\begingroup$ Wouldn't the convex hull give somewhat weird results for outlier points? $\endgroup$ Commented Mar 31 at 14:53
  • $\begingroup$ @CyclotomicField what I mean is that if I had, say, 1000 points forming a ring, and then 10 points forming an outer ring way outside of the inner ring, the convex hull would return the area inside the outer ring, whereas this is not at all what I would like. That's why I thought I might need some kind of least squares-like method. $\endgroup$ Commented Mar 31 at 14:58
  • $\begingroup$ The convex hull is the minimal convex shape that contains all the data. So you'll either have to exclude some interior points or relax convexity. $\endgroup$ Commented Mar 31 at 19:12
  • $\begingroup$ Right, but if I were to exclude unsuitable points "by hand", then that kind of defeats the point of the algorithm. Therefore, I don't think finding a convex hull is an appropriate solution. $\endgroup$ Commented Mar 31 at 19:17

1 Answer

$\begingroup$

From what you are saying in the comments, the problem can be formulated as follows: find a subset of $k$ outliers and then form the convex hull of the remaining $N-k$ points where $k \ll N$. The objective function could be to minimize an expression like

$$\frac{\text{volume of convex hull of non-outlier points}}{\text{volume of convex hull of all points}} + \lambda \frac{k}{N}$$

where $\lambda$ determines the (subjective) tradeoff between reducing volume and reducing the number of outliers. I would not be surprised if this problem turns out to be NP-hard, since the somewhat related k-means clustering problem is known to be NP-hard. But heuristics such as starting with no outliers, then evaluating the effect of dropping one point at a time, and then of dropping a pair of points at a time, may give you a good solution. This requires only on the order of $N^2$ candidate evaluations (each one a convex hull computation), which is quite feasible.
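As a rough illustration, here is a minimal 2D sketch of the one-point-at-a-time part of that heuristic (it does not try pairs). It greedily drops whichever hull vertex lowers the objective most and stops when no single drop helps. The names `convex_hull`, `polygon_area` and `greedy_trim` are hypothetical helpers written for this example, and the points are assumed to be distinct `(x, y)` tuples that are not all collinear. Only hull vertices are tried, because for $\lambda \ge 0$ dropping an interior point leaves the hull unchanged and only adds to the penalty term.

```python
def cross(o, a, b):
    """z-component of (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def polygon_area(poly):
    """Shoelace formula for a polygon given as an ordered vertex list."""
    total = 0.0
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def greedy_trim(points, lam):
    """Greedily drop hull vertices while area ratio + lam * k / N decreases."""
    kept = sorted(set(points))          # assumption: duplicates carry no weight
    n_total = len(kept)
    full_area = polygon_area(convex_hull(kept))
    if full_area == 0.0:
        raise ValueError("points are collinear; the area ratio is undefined")

    removed = []
    best = 1.0                          # objective with k = 0 outliers
    while len(kept) > 3:
        best_vertex = None
        for v in convex_hull(kept):
            trial = [p for p in kept if p != v]
            area = polygon_area(convex_hull(trial))
            score = area / full_area + lam * (len(removed) + 1) / n_total
            if score < best:
                best, best_vertex = score, v
        if best_vertex is None:         # no single drop improves the objective
            break
        kept.remove(best_vertex)
        removed.append(best_vertex)
    return kept, removed, best
```

A quick way to try it: take a 5 by 5 grid on the unit square plus one far-away point such as `(10.0, 10.0)`, and call `greedy_trim(pts, lam=1.0)`. The result depends strongly on $\lambda$, so some experimentation would be needed for real data.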

$\endgroup$
  • $\begingroup$ That's a clever idea for formulating it. However, on second thought, I think it runs into some difficulties because this will be minimized for very non-representative non-outliers, where they're all packed very closely together... I'm actually wondering now if convexity just isn't a very good requirement in the first place, perhaps I have to look for something more general, and only try to minimize volume or maximize smoothness or something. $\endgroup$ Commented Apr 1 at 13:16
  • $\begingroup$ I had another idea: perhaps I can randomly select a few points and interpolate them. Then I can do the same for a different set of randomly selected points, and keep going until I have a bunch of curves interpolating different points, which are more or less well behaved because each one only interpolates a few points. Then I can somehow find the mean curve of all these curves, and that should approximate the shape well enough. It doesn't guarantee convexity, but perhaps convexity is not such a good idea either way. Of course I'd have to figure out how to define the "mean" of a family of curves... $\endgroup$ Commented Apr 1 at 13:26
