sklearn Study Notes
Model validation methods
1. Learning curve (learning_curve)
2. Cross-validation score (cross_val_score)
3. Validation curve (validation_curve)

I. Learning curve
Computes the cross-validated training and test scores of a given estimator on training sets of varying size. First, a cross-validation generator splits the full dataset K times; each split yields a training set and a test set. Then, from the training set of each split, subsets of steadily increasing size are drawn, and the model is trained on each subset. The model's scores on the corresponding training subset and on the test set are computed. Finally, for each training-subset size, the K training scores and the K test scores are averaged separately. (Sketches for cross_val_score and validation_curve follow the learning-curve example below.)
import numpy as np
from sklearn.model_selection import learning_curve, ShuffleSplit
from sklearn.datasets import load_digits
from sklearn.naive_bayes import GaussianNB
from sklearn import svm
import matplotlib.pyplot as plt


def plot_learning_curve(estimator, title, X, y, ylim=None, cv=None,
                        n_jobs=1, train_size=np.linspace(.1, 1.0, 5)):
    plt.figure()
    plt.title(title)
    if ylim is not None:
        plt.ylim(*ylim)
    plt.xlabel("Training examples")
    plt.ylabel("Score")
    # Compute cross-validated train/test scores for increasing training-set sizes.
    train_sizes, train_scores, test_scores = learning_curve(
        estimator, X, y, cv=cv, n_jobs=n_jobs, train_sizes=train_size)
    train_scores_mean = np.mean(train_scores, axis=1)
    train_scores_std = np.std(train_scores, axis=1)
    test_scores_mean = np.mean(test_scores, axis=1)
    test_scores_std = np.std(test_scores, axis=1)
    plt.grid()  # draw the grid area
    plt.fill_between(train_sizes, train_scores_mean - train_scores_std,
                     train_scores_mean + train_scores_std, alpha=0.1, color="r")
    plt.fill_between(train_sizes, test_scores_mean - test_scores_std,
                     test_scores_mean + test_scores_std, alpha=0.1, color="g")
    plt.plot(train_sizes, train_scores_mean, "o-", color="r",
             label="Training score")
    plt.plot(train_sizes, test_scores_mean, "o-", color="g",
             label="Cross-validation score")
    plt.legend(loc="best")
    return plt


if __name__ == "__main__":
    digits = load_digits()
    X = digits.data
    y = digits.target
    cv = ShuffleSplit(n_splits=100, test_size=0.2, random_state=0)  # split 100 times
    estimator = GaussianNB()
    title = "Learning Curves (Naive Bayes)"
    plot_learning_curve(estimator, title, X, y, ylim=(0.7, 1.01), cv=cv, n_jobs=4)

    title = r"Learning Curves (SVM, RBF kernel, $\gamma=0.001$)"
    # Pass a custom cross-validation splitter instead of the default k-fold.
    cv = ShuffleSplit(n_splits=10, test_size=0.2, random_state=0)
    estimator = svm.SVC(gamma=0.001)
    plot_learning_curve(estimator, title, X, y, ylim=(0.7, 1.01), cv=cv, n_jobs=4)
    plt.show()
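The notes list cross_val_score as the second validation method but give no example. Below is a minimal sketch, assuming the same digits dataset and an SVC with gamma=0.001 as in the learning-curve code above; the 5-fold setting is an illustrative choice, not taken from the notes.

from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn import svm

digits = load_digits()
X, y = digits.data, digits.target

# Score the estimator with 5-fold cross-validation (fold count is an
# illustrative assumption, not from the original notes).
clf = svm.SVC(gamma=0.001)
scores = cross_val_score(clf, X, y, cv=5)
print("Fold scores:", scores)
print("Mean accuracy: %.3f (+/- %.3f)" % (scores.mean(), scores.std()))

cross_val_score returns one score per fold; averaging them gives the single cross-validated estimate that the learning-curve procedure above computes at each training-set size.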
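For the third method, validation_curve, here is a minimal sketch as well: it scores an estimator while one hyperparameter is varied. The swept parameter (the SVC's gamma) and its logarithmic range are illustrative assumptions, not values from the original notes.

import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import validation_curve
from sklearn import svm

digits = load_digits()
X, y = digits.data, digits.target

# Sweep the RBF kernel's gamma parameter and collect per-fold train/test scores
# (the parameter range below is an illustrative assumption).
param_range = np.logspace(-6, -1, 5)
train_scores, test_scores = validation_curve(
    svm.SVC(), X, y, param_name="gamma", param_range=param_range, cv=5)

# Average the per-fold scores for each value of gamma.
print("Mean train scores:", train_scores.mean(axis=1))
print("Mean test scores:", test_scores.mean(axis=1))

Whereas learning_curve varies the amount of training data, validation_curve varies a single hyperparameter, so it is the tool for spotting under- or over-fitting as that parameter changes.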