# RANSAC (RANdom SAmple Consensus)

## RANSAC (RANdom SAmple Consensus)

Each iteration performs the following steps:

1. Select `min_samples` random samples from the original data and check whether the set of data is valid (see `is_data_valid`).
2. Fit a model to the random subset (`base_estimator.fit`) and check whether the estimated model is valid (see `is_model_valid`).
3. Classify all data as inliers or outliers by calculating the residuals to the estimated model (`base_estimator.predict(X) - y`). All data samples with absolute residuals smaller than `residual_threshold` are considered inliers.
4. Save the fitted model as the best model if its number of inlier samples is the largest seen so far. If the current model has the same number of inliers as the best model, it replaces the best model only if it has a better score.
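
The four steps above can be sketched in plain Python for the simplest case, a straight line fitted to `(x, y)` pairs. This is a minimal illustration under stated assumptions, not the library's implementation: `fit_line` and `ransac_line` are hypothetical helper names, the validity check in step 2 only rejects subsets whose `x` values are all equal, and the tie-breaking score is assumed to be the negative sum of squared inlier residuals.

```python
import random


def fit_line(points):
    """Least-squares fit of y = a*x + b to (x, y) pairs; None if degenerate."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0.0:
        return None  # all x equal: no unique line, treat the model as invalid
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx
    return a, mean_y - a * mean_x


def ransac_line(points, min_samples=2, residual_threshold=1.0,
                max_trials=100, seed=0):
    """Return (best_model, best_inliers) following the four steps above."""
    rng = random.Random(seed)
    best_model, best_inliers, best_score = None, [], float("-inf")
    for _ in range(max_trials):
        # Step 1: draw a random subset of min_samples points.
        subset = rng.sample(points, min_samples)
        # Step 2: fit a model to the subset and check that it is valid.
        model = fit_line(subset)
        if model is None:
            continue
        a, b = model
        # Step 3: inliers are points with absolute residual below the threshold.
        inliers = [(x, y) for x, y in points
                   if abs(a * x + b - y) < residual_threshold]
        # Assumed score for tie-breaking: higher (closer to zero) is better.
        score = -sum((a * x + b - y) ** 2 for x, y in inliers)
        # Step 4: keep the model with the most inliers; break ties by score.
        if (len(inliers) > len(best_inliers)
                or (len(inliers) == len(best_inliers) and score > best_score)):
            best_model, best_inliers, best_score = model, inliers, score
    return best_model, best_inliers
```

Calling `ransac_line(points)` on data that lies mostly on one line plus a few gross outliers should return that line's slope and intercept along with the points classified as inliers.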

## Theil-Sen estimator: generalized-median-based estimator

The `TheilSenRegressor` estimator uses a generalization of the median to multiple dimensions, which makes it robust to multivariate outliers. Note, however, that its robustness decreases quickly with the dimensionality of the problem: in high dimensions it loses its robustness properties and becomes no better than ordinary least squares.
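
To make the median idea concrete, the sketch below implements the classical one-feature Theil-Sen estimator, which the multivariate version generalizes. It is an illustration only and not the `TheilSenRegressor` algorithm: `theil_sen_1d` is a hypothetical helper that takes the ordinary median of all pairwise slopes, then the median of the resulting intercepts.

```python
from itertools import combinations
from statistics import median


def theil_sen_1d(xs, ys):
    """Classical univariate Theil-Sen fit: returns (slope, intercept)."""
    # Slope of the line through every pair of points with distinct x values.
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2)
        if x2 != x1
    ]
    slope = median(slopes)
    # Intercept: median of the per-point offsets given that slope.
    intercept = median(y - slope * x for x, y in zip(xs, ys))
    return slope, intercept
```

Because both quantities are medians, a minority of corrupted points changes only a minority of the pairwise slopes and leaves the estimate unaffected, which is the robustness property described above.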

## Huber Regression

The loss function that `HuberRegressor` minimizes is given by

$\min_{w, \sigma} {\sum_{i=1}^n\left(\sigma + H_{\epsilon}\left(\frac{X_{i}w - y_{i}}{\sigma}\right)\sigma\right) + \alpha \|w\|_2^2}$

where

$H_{\epsilon}(z) = \begin{cases} z^2, & \text{if } |z| < \epsilon, \\ 2\epsilon|z| - \epsilon^2, & \text{otherwise} \end{cases}$
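
The objective can be evaluated directly from these two formulas. The following is a minimal sketch for checking values by hand, not the solver used by the library: `huber` and `huber_objective` are hypothetical helper names, the model has no intercept (as in the formula), and the default values of `epsilon` and `alpha` are assumptions chosen for illustration.

```python
def huber(z, epsilon=1.35):
    """H_epsilon(z): quadratic for small |z|, linear for large |z|."""
    if abs(z) < epsilon:
        return z * z
    return 2.0 * epsilon * abs(z) - epsilon ** 2


def huber_objective(w, sigma, X, y, epsilon=1.35, alpha=0.0001):
    """Value of the objective for weights w and scale sigma > 0.

    X is a list of feature rows, y a list of targets, w a list of weights.
    """
    total = 0.0
    for x_i, y_i in zip(X, y):
        # Residual X_i w - y_i, then rescaled by sigma inside H_epsilon.
        residual = sum(w_j * x_ij for w_j, x_ij in zip(w, x_i)) - y_i
        total += sigma + huber(residual / sigma, epsilon) * sigma
    # L2 penalty on the weights.
    return total + alpha * sum(w_j ** 2 for w_j in w)
```

The two branches of `huber` agree at `|z| = epsilon` (both give `epsilon ** 2`), so the loss is continuous, and beyond that point it grows only linearly, which limits the influence of large residuals.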

Last update : February 13, 2023
Created : February 13, 2023