[Feature] Feature selection
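Ref: 1.13. Feature selection
Outline
3.1 Filter
3.1.1 Variance threshold
3.1.2 Correlation coefficient
3.1.3 Chi-squared test
3.1.4 Mutual information
3.2 Wrapper
3.2.1 Recursive feature elimination
3.3 Embedded
3.3.1 Penalty-based feature selection
3.3.2 Tree-based feature selection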
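Class | Approach | Description |
VarianceThreshold | Filter | Variance threshold |
SelectKBest | Filter | Scores features with a selectable function: correlation coefficient, chi-squared test, or maximal information coefficient |
RFE | Wrapper | Recursively trains a base model and eliminates the features with the smallest weights |
SelectFromModel | Embedded | Trains a base model and keeps the features with the largest weights |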
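Selection criteria
Features are selected with two considerations in mind:
- Whether the feature varies: if a feature barely varies, for example its variance is close to 0, the samples are nearly identical on that feature and it does little to tell them apart.
- Relevance to the target: features that are highly correlated with the target should be preferred. Apart from the variance threshold, every method covered here is based on relevance.
By how the selection is carried out, the methods fall into three kinds:
- Filter: scores each feature by its variability or relevance, then selects features by a score threshold or by a target number of features.
- Wrapper: guided by an objective function (usually a predictive performance score), adds or removes several features in each round.
- Embedded: first trains a machine learning model to obtain a weight for each feature, then selects features in descending order of weight. This resembles the Filter approach, except that feature quality is determined through training.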
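Feature selection
Filter
1. Variance threshold
Suppose we have a dataset with Boolean features and want to remove every feature that is 1, or 0, in more than 80% of the samples. A Boolean feature is a Bernoulli variable with variance p(1 - p), so the threshold is 0.8 * (1 - 0.8) = 0.16.
Result: the first column is removed, because it is 0 in five of the six samples.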
>>> from sklearn.feature_selection import VarianceThreshold
>>> X = [[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 1], [0, 1, 0], [0, 1, 1]]
>>> sel = VarianceThreshold(threshold=(.8 * (1 - .8)))
>>> sel.fit_transform(X)
array([[0, 1],
       [1, 0],
       [0, 0],
       [1, 1],
       [1, 0],
       [1, 1]])
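To see why the first column falls below the threshold, the per-column variances can be inspected directly. This is a minimal sketch on the same toy matrix; `variances_` and `get_support()` are attributes of the fitted `VarianceThreshold`, while the variable names are chosen here for illustration.

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold

X = np.array([[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 1], [0, 1, 0], [0, 1, 1]])

# A Boolean feature is Bernoulli-distributed, so Var[X] = p * (1 - p).
threshold = 0.8 * (1 - 0.8)  # about 0.16

sel = VarianceThreshold(threshold=threshold).fit(X)

# Population variance of each column; only columns strictly above the threshold are kept.
print(sel.variances_)
# Boolean mask over the columns: True means the column is kept.
print(sel.get_support())
```

The first column has p = 1/6, so its variance is 5/36, roughly 0.139, which is below 0.16.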
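2. Chi-squared test
Supports sparse data. Two commonly used APIs:
(1) SelectKBest
Removes all but the k highest-scoring features.
(2) SelectPercentile
Removes all but the user-specified highest-scoring percentage of features.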
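from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest
from sklearn.feature_selection import chi2

iris = load_iris()
# Keep the 2 best features according to the chi-squared score
rst = SelectKBest(chi2, k=2).fit_transform(iris.data, iris.target)
print(rst[:5])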
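SelectPercentile works the same way but takes a percentage instead of a count. The sketch below is only an illustration: the value `percentile=50` is an assumption picked so that half of the four iris features survive.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectPercentile, chi2

iris = load_iris()

# Keep the top 50% of features ranked by chi-squared score (illustrative value).
selector = SelectPercentile(chi2, percentile=50).fit(iris.data, iris.target)
X_half = selector.transform(iris.data)

print(selector.scores_)                    # one chi-squared score per feature
print(selector.get_support(indices=True))  # indices of the kept columns
print(X_half.shape)
```

Note that chi2 expects non-negative feature values, which holds for the iris measurements.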
Parameter settings
Add noise columns (features) to see how the scoring behaves.
(1) For regression: f_regression
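For the regression case, a small synthetic sketch may help. The data, the coefficients, and the choice of informative columns (0 and 3) are all made up for illustration; only `SelectKBest` and `f_regression` come from scikit-learn.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression

rng = np.random.default_rng(0)

# Hypothetical data: 5 features, of which only columns 0 and 3 drive the target.
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + rng.normal(scale=0.1, size=200)

# f_regression scores each feature with a univariate linear F-test against y.
selector = SelectKBest(f_regression, k=2).fit(X, y)

print(selector.scores_)
print(selector.get_support(indices=True))
```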
#%%
print(__doc__)

import numpy as np
import matplotlib.pyplot as plt

from sklearn import datasets, svm
from sklearn.feature_selection import SelectPercentile, f_classif, chi2

###############################################################################
# import some data to play with

# The iris dataset
iris = datasets.load_iris()

# Some noisy data not correlated
E = np.random.uniform(0, 0.1, size=(len(iris.data), 20))

# Add the noisy data to the informative features
X = np.hstack((iris.data, E))
y = iris.target

###############################################################################
plt.figure(1)
plt.clf()

X_indices = np.arange(X.shape[-1])

###############################################################################
# Univariate feature selection with F-test for feature scoring
# We use the default selection function: the 10% most significant features
# selector = SelectPercentile(f_classif, percentile=10)
selector = SelectPercentile(chi2, percentile=10)
selector.fit(X, y)
scores = -np.log10(selector.pvalues_)
scores /= scores.max()
plt.bar(X_indices - .45, scores, width=.2,
        label=r'Univariate score ($-Log(p_{value})$)', color='g')
Result with f_classif: [figure: bar chart of the normalized univariate score, -log10(p-value), for each feature]
Result with chi2: [figure: bar chart of the normalized univariate score, -log10(p-value), for each feature]
3. Pearson correlation coefficient
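A minimal sketch of ranking features by Pearson correlation with the target, computed with plain NumPy. Treating the iris class label as a number is a simplifying assumption made only for this illustration; the correlation coefficient is better suited to a continuous target.

```python
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data, iris.target.astype(float)

# Pearson r between each feature column and the target.
r = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])

# Rank by absolute correlation, strongest first: the sign only gives the direction.
order = np.argsort(-np.abs(r))
for j in order:
    print(f"{iris.feature_names[j]}: r = {r[j]:+.3f}")
```

Pearson r only captures linear dependence, so a feature with a strong nonlinear relation to the target can still score close to 0.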
4. Mutual information
Link: https://www.zhihu.com/question/28641663/answer/41653367
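Mutual information can be plugged into SelectKBest like any other scoring function. The sketch below assumes the iris data and k=2; `functools.partial` is used only to fix `random_state`, since the estimator behind `mutual_info_classif` is randomized.

```python
from functools import partial

from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, mutual_info_classif

iris = load_iris()

# Fix the seed so repeated runs give the same estimates.
score_func = partial(mutual_info_classif, random_state=0)

# Estimated mutual information between each feature and the class label.
mi = score_func(iris.data, iris.target)
print(mi)

# Keep the 2 features with the highest estimated mutual information.
selector = SelectKBest(score_func, k=2)
X_new = selector.fit_transform(iris.data, iris.target)
print(X_new.shape)
```

Unlike the correlation coefficient, mutual information also picks up nonlinear dependence, at the cost of a noisier estimate on small samples.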
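Wrapper
1. Recursive feature elimination
The idea is to score every feature:
First, the predictive model is trained on the original features, and each feature is assigned a weight.
Next, the features with the smallest absolute weights are dropped from the feature set.
This repeats recursively until the number of remaining features reaches the desired count.
(1) Recursive feature elimination: an example showing the relevance of pixels in a digit classification task.
(2) Recursive feature elimination with cross-validation: an example that tunes the number of selected features automatically through cross-validation.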
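The elimination loop described above can be sketched by hand. `manual_rfe` is a hypothetical helper written for this illustration, not a scikit-learn function; it drops one feature per round and, as an assumption for the multi-class case, sums the squared weights over all class pairs.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.svm import SVC


def manual_rfe(X, y, n_features_to_select):
    """Illustrative helper: eliminate the weakest feature one round at a time."""
    remaining = list(range(X.shape[1]))  # column indices still in play
    while len(remaining) > n_features_to_select:
        # Train the model on the features that are left.
        model = SVC(kernel="linear", C=1).fit(X[:, remaining], y)
        # coef_ has one row per class pair, so sum the squared weights per feature.
        importance = (model.coef_ ** 2).sum(axis=0)
        # Drop the feature with the smallest aggregated weight.
        remaining.pop(int(np.argmin(importance)))
    return remaining


X, y = load_iris(return_X_y=True)
kept = manual_rfe(X, y, n_features_to_select=2)
print(kept)  # original column indices of the surviving features
```

In practice `sklearn.feature_selection.RFE` does this bookkeeping and also records the elimination order in `ranking_`.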
print(__doc__)

from sklearn.svm import SVC
from sklearn.datasets import load_digits
from sklearn.feature_selection import RFE
import matplotlib.pyplot as plt

# Load the digits dataset
digits = load_digits()
X = digits.images.reshape((len(digits.images), -1))
y = digits.target

########################################################
# Create the RFE object and rank each pixel
svc = SVC(kernel="linear", C=1)
rfe = RFE(estimator=svc, n_features_to_select=1, step=1)
rfe.fit(X, y)
ranking = rfe.ranking_.reshape(digits.images[0].shape)

# Plot pixel ranking
plt.matshow(ranking)
plt.colorbar()
plt.title("Ranking of pixels with RFE")
plt.show()
Plot of the importance ranking of the 64 features (pixels): [figure: 8x8 heat map with colorbar, titled "Ranking of pixels with RFE"]
print(ranking) prints these ranks as an 8x8 array, one entry per pixel; rank 1 is the pixel kept to the end.
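For the cross-validated variant, a minimal sketch with RFECV on the same digits data is shown here. The values `step=8` and 3 stratified folds are assumptions chosen to keep the run short, not settings from the original example.

```python
from sklearn.datasets import load_digits
from sklearn.feature_selection import RFECV
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)

# Remove 8 pixels per round and score each feature count by cross-validated accuracy.
rfecv = RFECV(
    estimator=SVC(kernel="linear", C=1),
    step=8,
    cv=StratifiedKFold(3),
    scoring="accuracy",
)
rfecv.fit(X, y)

print(rfecv.n_features_)       # number of features chosen by cross-validation
print(rfecv.support_.sum())    # same count, read from the boolean mask
```

The selected count depends on the folds and the step size, so it is best treated as a starting point rather than a fixed answer.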
Embedded
1. Penalty-based feature selection
2. Tree-based feature selection
This topic has its own chapter; see: [Feature] Feature selection - Embedded topic
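As a quick taste of the two embedded approaches, both can be driven through SelectFromModel. This is a rough sketch on iris; `C=0.01`, `n_estimators=50`, and the seeds are illustrative assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.svm import LinearSVC

X, y = load_iris(return_X_y=True)

# Penalty-based: an L1 penalty pushes some coefficients to exactly zero.
# penalty="l1" is only supported in the primal form, hence dual=False.
lsvc = LinearSVC(C=0.01, penalty="l1", dual=False, random_state=0).fit(X, y)
X_l1 = SelectFromModel(lsvc, prefit=True).transform(X)

# Tree-based: keep features whose impurity-based importance is at least the mean importance.
forest = ExtraTreesClassifier(n_estimators=50, random_state=0).fit(X, y)
X_tree = SelectFromModel(forest, prefit=True).transform(X)

print(X_l1.shape, X_tree.shape)
```

A smaller C makes the L1 penalty stronger, so fewer features survive.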
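Integrating into a pipeline
In the snippet below,
(1) sklearn.svm.LinearSVC and sklearn.feature_selection.SelectFromModel are combined to evaluate feature importance and select the most relevant features.
(2) A sklearn.ensemble.RandomForestClassifier is then trained on the transformed output, i.e. using only the selected features.
Ref: sklearn.pipeline.Pipeline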
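from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

clf = Pipeline([
    # penalty="l1" is only supported in the primal form, hence dual=False
    ('feature_selection', SelectFromModel(LinearSVC(penalty="l1", dual=False))),
    ('classification', RandomForestClassifier())
])
clf.fit(X, y)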
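One reason to wrap selection in a pipeline is that cross-validation then refits the selector inside every fold, so the held-out data never influences which features are chosen. A minimal sketch, assuming the iris data, 5 folds, and fixed seeds:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

X, y = load_iris(return_X_y=True)

clf = Pipeline([
    ("feature_selection",
     SelectFromModel(LinearSVC(penalty="l1", dual=False, random_state=0))),
    ("classification", RandomForestClassifier(random_state=0)),
])

# Each fold fits the selector and the classifier on its own training split only.
scores = cross_val_score(clf, X, y, cv=5)
print(scores.mean())
```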
Dimensionality reduction
1. Principal component analysis (PCA)
2. Linear discriminant analysis (LDA)
Goto: [Scikit-learn] 4.4 Dimensionality reduction - PCA
Ref: [Scikit-learn] 2.5 Dimensionality reduction - Probabilistic PCA & Factor Analysis
Ref: [Scikit-learn] 2.5 Dimensionality reduction - ICA
Goto: [Scikit-learn] 1.2 Dimensionality reduction - Linear and Quadratic Discriminant Analysis
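A brief sketch contrasting the two methods on iris, with `n_components=2` as an illustrative choice. PCA is unsupervised and ignores the labels, while LDA uses them to find directions that separate the classes.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)

# PCA: project onto the 2 directions of largest variance (labels not used).
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X)

# LDA: project onto at most (n_classes - 1) = 2 discriminant directions (labels required).
lda = LinearDiscriminantAnalysis(n_components=2)
X_lda = lda.fit_transform(X, y)

print(X_pca.shape, X_lda.shape)
print(pca.explained_variance_ratio_)  # share of variance captured by each component
```

Unlike feature selection, both methods build new features as combinations of the originals, so the output columns no longer correspond to individual input features.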
End.