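What is Blending?

Blending is an ensemble method that takes a weighted average of the predictions produced by several models.

It is widely used in machine learning competitions such as Kaggle and has drawn attention because it can deliver high predictive accuracy.

If you are wondering what Blending and ensembles are in the first place, please start with the introductory references.

As those references show, there are plenty of clear explanations of the principle, but I found almost no site that follows the idea through to the end (that is, how to actually optimize the weights), so I tried it myself.

In particular, I found only one site that optimizes the weights by random search, and none that minimizes the score with black-box optimization using Optuna.

I figured the latter should also be possible in principle, so I gave it a try.

It can probably be done with GridSearchCV or similar tools as well, but I have not tried that. If you are interested, it may be worth testing.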

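Ensemble? Blending?

These are useful references on ensemble learning. ↓

An approachable site that helped me as a first introduction.

A Japanese-language blog post that explains the first site is here. ↓

To get a rough idea of Blending, this one is helpful. ↓

Put simply, Blending means averaging the models' predictions with weights. This post summarizes one example of how I determined those weights quantitatively.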
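As a minimal illustration of the weighted average described above, np.average can combine per-model predictions row by row. The numbers and the names preds and weights are made up for this sketch and are not taken from the experiment that follows.

```python
import numpy as np

# Hypothetical predictions: 3 samples (rows) x 2 models (columns)
preds = np.array([[0.2, 0.6],
                  [0.9, 0.5],
                  [0.4, 0.8]])

# Hypothetical weights: model 0 counts three times as much as model 1
weights = np.array([0.75, 0.25])

# Blended prediction per row = sum_i(w_i * pred_i) / sum_i(w_i)
blended = np.average(preds, axis=1, weights=weights)
print(blended)
```

The whole question of this post is how to choose the weights vector so that the blended prediction scores best.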

Code

Importing the modules

import numpy as np
import pandas as pd
import scipy
import sklearn
from scipy.optimize import minimize
from sklearn.metrics import mean_squared_error as mse
import matplotlib.pyplot as plt
from matplotlib import rcParams
from tqdm import tqdm

Checking the versions

modules = [np, pd, scipy, sklearn]
for s in map(lambda module: module.__name__ + ': ' + module.__version__, modules):
    print(s)

numpy: 1.17.3
pandas: 0.25.1
scipy: 1.3.1
sklearn: 0.21.3

Fix the seed so that the random numbers are reproducible

SEED = 334
np.random.seed(SEED)

Number of models to ensemble and number of samples

n_models = 5
n_samples = 100000

Generate random numbers between 0 and 1 (standing in for each model's predictions)

df_y_pred = pd.DataFrame(np.random.rand(n_samples, n_models), columns = ['model' + str(i) for i in range(n_models)])
df_y_pred.head()

Generate the ground truth (0 or 1)

df_y_true = pd.Series(np.random.randint(low = 0, high = 2, size = n_samples), name = 'y_true')
df_y_true.head()

0    0
1    1
2    0
3    0
4    0
Name: y_true, dtype: int64

Compute the prediction score of each model

Mean squared error (mean_squared_error) is used as the metric here.

for model_name, y_pred in df_y_pred.items():
    print('MSE ({0}): {1:.4f}'.format(model_name, mse(df_y_true, y_pred)))

MSE (model0): 0.3335
MSE (model1): 0.3341
MSE (model2): 0.3342
MSE (model3): 0.3336
MSE (model4): 0.3345
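Values near 0.333 are what a back-of-the-envelope calculation predicts for this setup. Assuming the predictions are independent Uniform(0, 1) draws and the labels are independent fair 0/1 draws, a blend whose weights sum to 1 has an expected MSE of 1/4 + (1/12) * sum(w_i^2). The helper expected_mse below is a hypothetical name introduced only for this sketch.

```python
import numpy as np

def expected_mse(weights):
    """Expected MSE of a blend of independent Uniform(0, 1) predictions
    against independent fair 0/1 labels (the assumption of this sketch)."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                   # normalize so the weights sum to 1
    var_blend = np.sum(w ** 2) / 12   # Var(Uniform(0, 1)) = 1/12 per model
    # Blend and label both have mean 1/2, so the squared bias is 0;
    # what remains is Var(label) = 1/4 plus the variance of the blend.
    return 0.25 + var_blend

print(expected_mse([1]))          # a single model
print(expected_mse(np.ones(5)))   # five models with equal weights
```

Under these assumptions a single model gives 1/3, and five equally weighted models give 1/4 + 1/60, roughly 0.2667. Equal weights minimize sum(w_i^2), so this is about the level any weight search can be expected to reach on purely random data like this.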

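Define a function that takes the weighted average and returns its score (MSE)

The function is defined so that, if no weights are passed, it falls back to equal weights, that is, the plain (unweighted) average.

def calc_score(weight = np.ones(n_models)):
    # Blend the per-model predictions row by row using the given weights
    y_pred_blended = np.average(df_y_pred, axis = 1, weights = weight)
    return mse(df_y_true, y_pred_blended)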

Variable for the maximum number of weight-optimization runs

max_iter = 100

Searching with random numbers (Random Search)

Define lists to hold the results

scores = []
weights = []

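Generate all initial weights at once (for reproducibility)

I would have preferred not to do it this way, because it puts pressure on memory as the amount of data grows, but I could not get reproducible results otherwise.
This is only a demo, so please bear with it.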

initial_weights = np.random.uniform(size = (max_iter, n_models))

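Optimization

Each iteration runs Nelder-Mead from one of the random initial weight vectors. The tqdm package is used to show a progress bar (handy, isn't it!).

for i in tqdm(range(max_iter)):
    # Search range for each weight.
    # Note: older SciPy releases such as the 1.3.1 used here do not support bounds
    # for Nelder-Mead; they emit a RuntimeWarning and ignore this argument.
    bounds = [(0,1)] * n_models

    result = minimize(calc_score, initial_weights[i], method = 'Nelder-Mead', bounds = bounds)
    score = result['fun']
    weight = result['x']
    # message = result['message']
    scores.append(score)
    weights.append(weight)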

100%|██████████| 100/100 [01:02<00:00,  1.60it/s]
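One alternative is to impose the sum-to-one constraint and the [0, 1] range directly in the optimizer. The following is only a sketch and not the method used in this post: it swaps Nelder-Mead for SciPy's SLSQP method, which accepts both bounds and equality constraints, and it runs on small synthetic data. The names y_pred, y_true and blend_mse are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.RandomState(0)
n_samples, n_models = 2000, 5
y_pred = rng.rand(n_samples, n_models)        # stand-in model predictions
y_true = rng.randint(0, 2, size=n_samples)    # stand-in 0/1 labels

def blend_mse(w):
    # Plain dot product: the sum-to-one condition is enforced by the constraint
    return np.mean((y_pred @ w - y_true) ** 2)

w0 = np.full(n_models, 1.0 / n_models)        # start from equal weights
result = minimize(
    blend_mse,
    w0,
    method='SLSQP',
    bounds=[(0, 1)] * n_models,                                # each weight in [0, 1]
    constraints={'type': 'eq', 'fun': lambda w: w.sum() - 1},  # weights sum to 1
)
print(result.x, result.fun)
```

With the constraint in place the returned weights are already normalized, so no rescaling step is needed afterwards.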

Get the run that reached the minimum

i_argmin = np.argmin(scores)
print('Run {} (0-based) is the best.'.format(i_argmin))

Run 44 (0-based) is the best.

Get the best weights and the corresponding score

To make the result easier to read, I "normalized" the weights so that they sum to 1.

best_score = scores[i_argmin]
best_weight = weights[i_argmin]
best_weight /= np.sum(best_weight)
print('best score: {0}\nbest weight: {1}'.format(best_score, best_weight))

best score: 0.2672625032243331
best weight: [0.2035174  0.19770572 0.1984565  0.20280714 0.19751324]
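Normalizing after the fact is safe because np.average divides by the sum of the weights: rescaling all weights by a positive constant leaves the blended prediction, and therefore the score, unchanged. A small check on made-up data (all names here are illustrative only):

```python
import numpy as np

rng = np.random.RandomState(0)
preds = rng.rand(1000, 5)            # stand-in predictions from 5 models
y = rng.randint(0, 2, size=1000)     # stand-in 0/1 labels

def score(w):
    blended = np.average(preds, axis=1, weights=w)
    return np.mean((blended - y) ** 2)   # MSE of the blended prediction

w_raw = np.array([0.9, 0.8, 0.85, 0.87, 0.81])   # unnormalized weights
w_norm = w_raw / w_raw.sum()                      # same ratios, sum to 1

print(np.isclose(score(w_raw), score(w_norm)))
```

It also means the five weights are not independent: only their ratios matter, so the search space has one redundant degree of freedom.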

Optimizing with Optuna

Importing the module

import optuna

Checking the version

print('optuna: {}'.format(optuna.__version__))

optuna: 1.3.0

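Define the function to minimize

class Objective:
    def __init__(self, n_models):
        self.n_models = n_models

    def __call__(self, trial):
        # Sample each weight from [0, 1]. Newer Optuna releases deprecate
        # suggest_uniform in favor of suggest_float.
        weight = [trial.suggest_uniform('weight' + str(n), 0, 1) for n in range(self.n_models)]
        return calc_score(weight)
objective = Objective(n_models)

sampler = optuna.samplers.TPESampler(seed=SEED)
study = optuna.create_study(sampler = sampler)
# n_jobs = -1 runs trials in parallel, so the trial order (and hence the result)
# can differ between runs even with a fixed seed; use n_jobs = 1 for reproducibility.
study.optimize(objective, n_trials = max_iter, n_jobs = -1)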

[I 2021-01-06 23:29:39,305] Finished trial#1 with value: 0.27037459705957706 with parameters: {'weight0': 0.8230466995543397, 'weight1': 0.7993373582992824, 'weight2': 0.10653088920510301, 'weight3': 0.8399510200707445, 'weight4': 0.596349542847271}. Best is trial#1 with value: 0.27037459705957706.
[I 2021-01-06 23:29:39,369] Finished trial#0 with value: 0.2717688873400324 with parameters: {'weight0': 0.23813656678623252, 'weight1': 0.03248528859266486, 'weight2': 0.124267595743489, 'weight3': 0.17054810624165562, 'weight4': 0.29020082724543383}. Best is trial#1 with value: 0.27037459705957706.
[I 2021-01-06 23:29:39,421] Finished trial#2 with value: 0.2720766037410089 with parameters: {'weight0': 0.8228984518791593, 'weight1': 0.6090536027403167, 'weight2': 0.761943003676073, 'weight3': 0.004309860316684011, 'weight4': 0.5225965547148727}. Best is trial#1 with value: 0.27037459705957706.
[I 2021-01-06 23:29:39,471] Finished trial#6 with value: 0.2688151846371434 with parameters: {'weight0': 0.9289911515391298, 'weight1': 0.4793736720436942, 'weight2': 0.953077941235539, 'weight3': 0.4391835069085309, 'weight4': 0.7654608015583195}. Best is trial#6 with value: 0.2688151846371434.
[I 2021-01-06 23:29:39,563] Finished trial#5 with value: 0.2731418924717805 with parameters: {'weight0': 0.07213189298225564, 'weight1': 0.29677673340107447, 'weight2': 0.6416607185565314, 'weight3': 0.21747157713996568, 'weight4': 0.4700618922100559}. Best is trial#6 with value: 0.2688151846371434.
[I 2021-01-06 23:29:39,563] Finished trial#3 with value: 0.27073554789494336 with parameters: {'weight0': 0.07754031666102801, 'weight1': 0.5263839238055392, 'weight2': 0.7301578524617401, 'weight3': 0.6086057523217057, 'weight4': 0.526219007790581}. Best is trial#6 with value: 0.2688151846371434.
[I 2021-01-06 23:29:39,660] Finished trial#7 with value: 0.2821068528583616 with parameters: {'weight0': 0.0843014570838424, 'weight1': 0.20721260332006242, 'weight2': 0.2800723453393109, 'weight3': 0.14935997132393752, 'weight4': 0.9387884328134978}. Best is trial#6 with value: 0.2688151846371434.
[I 2021-01-06 23:29:39,661] Finished trial#4 with value: 0.2740132175079263 with parameters: {'weight0': 0.9581094780142354, 'weight1': 0.31299042507635566, 'weight2': 0.5667018832511729, 'weight3': 0.031811330174331354, 'weight4': 0.5323058687135054}. Best is trial#6 with value: 0.2688151846371434.
[I 2021-01-06 23:29:39,725] Finished trial#8 with value: 0.28362997596037154 with parameters: {'weight0': 0.6577675716646878, 'weight1': 0.12131976056831906, 'weight2': 0.0580007648675398, 'weight3': 0.05108072994824564, 'weight4': 0.899032302641813}. Best is trial#6 with value: 0.2688151846371434.
[I 2021-01-06 23:29:39,788] Finished trial#9 with value: 0.27581556414619174 with parameters: {'weight0': 0.979163620441914, 'weight1': 0.1571263640300401, 'weight2': 0.21499682922214103, 'weight3': 0.2273331878371363, 'weight4': 0.7604341527001697}. Best is trial#6 with value: 0.2688151846371434.
[I 2021-01-06 23:29:39,842] Finished trial#10 with value: 0.27506008337176213 with parameters: {'weight0': 0.41385434728268056, 'weight1': 0.3715305268528317, 'weight2': 0.2693477281564991, 'weight3': 0.9868580176147957, 'weight4': 0.108190505876676}. Best is trial#6 with value: 0.2688151846371434.
[I 2021-01-06 23:29:39,930] Finished trial#11 with value: 0.2676485229810982 with parameters: {'weight0': 0.7448577547405768, 'weight1': 0.9192569469994657, 'weight2': 0.9099730536875623, 'weight3': 0.6250032661984313, 'weight4': 0.7310848825050558}. Best is trial#11 with value: 0.2676485229810982.
[I 2021-01-06 23:29:39,999] Finished trial#12 with value: 0.2680035634731865 with parameters: {'weight0': 0.677255640418053, 'weight1': 0.9515704814996009, 'weight2': 0.9904966045252914, 'weight3': 0.5805568555606693, 'weight4': 0.7365371966384726}. Best is trial#11 with value: 0.2676485229810982.
[I 2021-01-06 23:29:40,065] Finished trial#13 with value: 0.2681748032371323 with parameters: {'weight0': 0.6591820961438137, 'weight1': 0.8773662608884274, 'weight2': 0.95908794039708, 'weight3': 0.4816910600236472, 'weight4': 0.763029562508469}. Best is trial#11 with value: 0.2676485229810982.
[I 2021-01-06 23:29:40,134] Finished trial#14 with value: 0.2679624606426079 with parameters: {'weight0': 0.5636626125436909, 'weight1': 0.9855450800515324, 'weight2': 0.856743178688439, 'weight3': 0.6912266369704885, 'weight4': 0.6817326812307167}. Best is trial#11 with value: 0.2676485229810982.
[I 2021-01-06 23:29:40,220] Finished trial#15 with value: 0.2683269991334918 with parameters: {'weight0': 0.555024940273191, 'weight1': 0.7001690038430877, 'weight2': 0.8402602295376201, 'weight3': 0.7788047015054844, 'weight4': 0.3819638400745668}. Best is trial#11 with value: 0.2676485229810982.
[I 2021-01-06 23:29:40,282] Finished trial#16 with value: 0.26861292053775315 with parameters: {'weight0': 0.5134690867024803, 'weight1': 0.7587372347280825, 'weight2': 0.8275626212121452, 'weight3': 0.7773897478621431, 'weight4': 0.3500040906201397}. Best is trial#11 with value: 0.2676485229810982.
[I 2021-01-06 23:29:40,368] Finished trial#17 with value: 0.2692292887044751 with parameters: {'weight0': 0.3910053491676991, 'weight1': 0.9882656850114424, 'weight2': 0.4328809004543422, 'weight3': 0.6851845849060619, 'weight4': 0.6682277802887806}. Best is trial#11 with value: 0.2676485229810982.
[I 2021-01-06 23:29:40,439] Finished trial#18 with value: 0.2692634266077366 with parameters: {'weight0': 0.36961701522450785, 'weight1': 0.9692871072253219, 'weight2': 0.4392188068029355, 'weight3': 0.6672420101008462, 'weight4': 0.6299631274913214}. Best is trial#11 with value: 0.2676485229810982.
[I 2021-01-06 23:29:40,509] Finished trial#19 with value: 0.26817582300564186 with parameters: {'weight0': 0.824005336476274, 'weight1': 0.884466326426076, 'weight2': 0.4532242214502498, 'weight3': 0.961459381275927, 'weight4': 0.9928047062560115}. Best is trial#11 with value: 0.2676485229810982.
[I 2021-01-06 23:29:40,575] Finished trial#20 with value: 0.26853280819031977 with parameters: {'weight0': 0.8106394495049675, 'weight1': 0.8617170801083209, 'weight2': 0.8889216774058663, 'weight3': 0.3521475664717914, 'weight4': 0.8442099061460652}. Best is trial#11 with value: 0.2676485229810982.
[I 2021-01-06 23:29:40,648] Finished trial#21 with value: 0.26790752210640223 with parameters: {'weight0': 0.7241333134521839, 'weight1': 0.6912967247364353, 'weight2': 0.9900606733164599, 'weight3': 0.5630201593105135, 'weight4': 0.6939567637560837}. Best is trial#11 with value: 0.2676485229810982.
[I 2021-01-06 23:29:40,710] Finished trial#22 with value: 0.2681647450788221 with parameters: {'weight0': 0.6601128174297871, 'weight1': 0.9834540434863852, 'weight2': 0.96947425913965, 'weight3': 0.5577196355990724, 'weight4': 0.6882987354435012}. Best is trial#11 with value: 0.2676485229810982.
[I 2021-01-06 23:29:40,775] Finished trial#23 with value: 0.2685908495502832 with parameters: {'weight0': 0.7192468917061373, 'weight1': 0.6825669692466246, 'weight2': 0.9968049238927835, 'weight3': 0.39786574910841127, 'weight4': 0.8620893880966192}. Best is trial#11 with value: 0.2676485229810982.
[I 2021-01-06 23:29:40,847] Finished trial#24 with value: 0.2674139038816828 with parameters: {'weight0': 0.731982116840896, 'weight1': 0.6649519639841952, 'weight2': 0.7238659776096451, 'weight3': 0.878354959534872, 'weight4': 0.8207413840247507}. Best is trial#24 with value: 0.2674139038816828.
[I 2021-01-06 23:29:40,917] Finished trial#25 with value: 0.26768327176212625 with parameters: {'weight0': 0.5663574119655169, 'weight1': 0.7931777191424961, 'weight2': 0.7052915698315115, 'weight3': 0.9114171262399285, 'weight4': 0.8288087077780917}. Best is trial#24 with value: 0.2674139038816828.
[I 2021-01-06 23:29:40,976] Finished trial#26 with value: 0.26750013949660334 with parameters: {'weight0': 0.7467672426334147, 'weight1': 0.602679890565527, 'weight2': 0.7361640707210527, 'weight3': 0.8855259279652993, 'weight4': 0.819403259415255}. Best is trial#24 with value: 0.2674139038816828.
[I 2021-01-06 23:29:41,045] Finished trial#27 with value: 0.267965194991069 with parameters: {'weight0': 0.5924543976000108, 'weight1': 0.6029660924557253, 'weight2': 0.6043605554001529, 'weight3': 0.8716324107939495, 'weight4': 0.9237964521654963}. Best is trial#24 with value: 0.2674139038816828.
[I 2021-01-06 23:29:41,115] Finished trial#28 with value: 0.267867857484193 with parameters: {'weight0': 0.9049575778513137, 'weight1': 0.5580270210765395, 'weight2': 0.6667978323458303, 'weight3': 0.7831050975272783, 'weight4': 0.9705992738329132}. Best is trial#24 with value: 0.2674139038816828.
[I 2021-01-06 23:29:41,184] Finished trial#29 with value: 0.26817399430223415 with parameters: {'weight0': 0.8893947017790955, 'weight1': 0.46310608485763627, 'weight2': 0.6616596577744405, 'weight3': 0.99756455077938, 'weight4': 0.8255666703152719}. Best is trial#24 with value: 0.2674139038816828.
[I 2021-01-06 23:29:41,254] Finished trial#30 with value: 0.2682500822307292 with parameters: {'weight0': 0.7543093416817074, 'weight1': 0.4497559886846172, 'weight2': 0.542472638333292, 'weight3': 0.9241394398847848, 'weight4': 0.6050027476634486}. Best is trial#24 with value: 0.2674139038816828.
[I 2021-01-06 23:29:41,318] Finished trial#31 with value: 0.26827490309290575 with parameters: {'weight0': 0.7738540991112307, 'weight1': 0.4280091380508123, 'weight2': 0.5437209516360471, 'weight3': 0.8876868021640604, 'weight4': 0.8261702157951025}. Best is trial#24 with value: 0.2674139038816828.
[I 2021-01-06 23:29:41,394] Finished trial#32 with value: 0.2679590144528272 with parameters: {'weight0': 0.44197361827120707, 'weight1': 0.7859825993707753, 'weight2': 0.7527428316823135, 'weight3': 0.8138173294303866, 'weight4': 0.810466917161785}. Best is trial#24 with value: 0.2674139038816828.
[I 2021-01-06 23:29:41,463] Finished trial#33 with value: 0.2677838522810043 with parameters: {'weight0': 0.5819409617732625, 'weight1': 0.7853906577055223, 'weight2': 0.7480854728249826, 'weight3': 0.826126825158008, 'weight4': 0.993960187857404}. Best is trial#24 with value: 0.2674139038816828.
[I 2021-01-06 23:29:41,525] Finished trial#34 with value: 0.268963026803479 with parameters: {'weight0': 0.31846183481276236, 'weight1': 0.6303103987148541, 'weight2': 0.7233146181118278, 'weight3': 0.9236687859668061, 'weight4': 0.8977934627817077}. Best is trial#24 with value: 0.2674139038816828.
[I 2021-01-06 23:29:41,599] Finished trial#35 with value: 0.2673130955637655 with parameters: {'weight0': 0.839556733600322, 'weight1': 0.8389494152231414, 'weight2': 0.8015949525199835, 'weight3': 0.7298152118057439, 'weight4': 0.8004604007506304}. Best is trial#35 with value: 0.2673130955637655.
[I 2021-01-06 23:29:41,668] Finished trial#36 with value: 0.2673343486337821 with parameters: {'weight0': 0.8641919293582506, 'weight1': 0.7210264498534064, 'weight2': 0.7935399222782457, 'weight3': 0.7202424153984712, 'weight4': 0.7507778388656229}. Best is trial#35 with value: 0.2673130955637655.
[I 2021-01-06 23:29:41,736] Finished trial#37 with value: 0.26752559273354654 with parameters: {'weight0': 0.8560788902689748, 'weight1': 0.7286986828398364, 'weight2': 0.8052305061200856, 'weight3': 0.72203023447456, 'weight4': 0.5685636685506494}. Best is trial#35 with value: 0.2673130955637655.
[I 2021-01-06 23:29:41,812] Finished trial#38 with value: 0.2680040550159804 with parameters: {'weight0': 0.8770941591798119, 'weight1': 0.5654140415161075, 'weight2': 0.7895662850373227, 'weight3': 0.7316324912291621, 'weight4': 0.4614747590172548}. Best is trial#35 with value: 0.2673130955637655.
[I 2021-01-06 23:29:41,880] Finished trial#39 with value: 0.26774687228926375 with parameters: {'weight0': 0.9644098393552503, 'weight1': 0.635542589429606, 'weight2': 0.607365609446188, 'weight3': 0.8570607355556419, 'weight4': 0.8751017992569805}. Best is trial#35 with value: 0.2673130955637655.
[I 2021-01-06 23:29:41,940] Finished trial#40 with value: 0.26788748445990734 with parameters: {'weight0': 0.9766201057659329, 'weight1': 0.620655751259329, 'weight2': 0.5972343505148033, 'weight3': 0.9636890086232086, 'weight4': 0.7818904906115769}. Best is trial#35 with value: 0.2673130955637655.
[I 2021-01-06 23:29:42,028] Finished trial#41 with value: 0.2677594897753814 with parameters: {'weight0': 0.995047961420627, 'weight1': 0.7290218840787301, 'weight2': 0.8003083489575515, 'weight3': 0.7238341649107345, 'weight4': 0.5690331280295479}. Best is trial#35 with value: 0.2673130955637655.
[I 2021-01-06 23:29:42,192] Finished trial#43 with value: 0.26758532608511604 with parameters: {'weight0': 0.8429196626784382, 'weight1': 0.7351494667000512, 'weight2': 0.7736693468149759, 'weight3': 0.6344820956239468, 'weight4': 0.5567411086165208}. Best is trial#35 with value: 0.2673130955637655.
[I 2021-01-06 23:29:42,265] Finished trial#42 with value: 0.26930929742923027 with parameters: {'weight0': 0.855116716069477, 'weight1': 0.7303418249628146, 'weight2': 0.7954055765842496, 'weight3': 0.6261042931697729, 'weight4': 0.21157514012109974}. Best is trial#35 with value: 0.2673130955637655.
[I 2021-01-06 23:29:42,359] Finished trial#45 with value: 0.26816992503114906 with parameters: {'weight0': 0.8525962310112539, 'weight1': 0.5141650386758922, 'weight2': 0.9041512517948641, 'weight3': 0.7278405322200852, 'weight4': 0.49684545881896125}. Best is trial#35 with value: 0.2673130955637655.
[I 2021-01-06 23:29:42,456] Finished trial#46 with value: 0.26779824897749704 with parameters: {'weight0': 0.8111332624733504, 'weight1': 0.835490412871047, 'weight2': 0.8773643578590077, 'weight3': 0.7367767332686553, 'weight4': 0.4936738830787542}. Best is trial#35 with value: 0.2673130955637655.
[I 2021-01-06 23:29:42,457] Finished trial#44 with value: 0.26751838422704644 with parameters: {'weight0': 0.8503557517258473, 'weight1': 0.8179162968154635, 'weight2': 0.90538712314709, 'weight3': 0.7353600917370213, 'weight4': 0.62240402051544}. Best is trial#35 with value: 0.2673130955637655.
[I 2021-01-06 23:29:42,552] Finished trial#47 with value: 0.2678808950949392 with parameters: {'weight0': 0.8429428476841986, 'weight1': 0.8374810419209393, 'weight2': 0.9161729971677546, 'weight3': 0.741662910436919, 'weight4': 0.4874681983554006}. Best is trial#35 with value: 0.2673130955637655.
[I 2021-01-06 23:29:42,597] Finished trial#49 with value: 0.26750591170842436 with parameters: {'weight0': 0.8348536258068109, 'weight1': 0.8347395335905716, 'weight2': 0.9133365946981338, 'weight3': 0.7327356973612021, 'weight4': 0.6375043537198345}. Best is trial#35 with value: 0.2673130955637655.
[I 2021-01-06 23:29:42,685] Finished trial#48 with value: 0.2675109188443249 with parameters: {'weight0': 0.8464625771391234, 'weight1': 0.8456349471741207, 'weight2': 0.9057960281844415, 'weight3': 0.7400830983055122, 'weight4': 0.6304788504805545}. Best is trial#35 with value: 0.2673130955637655.
[I 2021-01-06 23:29:42,772] Finished trial#50 with value: 0.26778337850801265 with parameters: {'weight0': 0.858484763102481, 'weight1': 0.5125302664181783, 'weight2': 0.9166881120743993, 'weight3': 0.8169600367325729, 'weight4': 0.7370194738117408}. Best is trial#35 with value: 0.2673130955637655.
[I 2021-01-06 23:29:42,849] Finished trial#51 with value: 0.2674105195758934 with parameters: {'weight0': 0.9325046938977253, 'weight1': 0.8468210519584093, 'weight2': 0.6994257750105609, 'weight3': 0.8247563682969158, 'weight4': 0.7354010786211573}. Best is trial#35 with value: 0.2673130955637655.
[I 2021-01-06 23:29:42,947] Finished trial#52 with value: 0.26745890713791576 with parameters: {'weight0': 0.9159446698033081, 'weight1': 0.6573956029321824, 'weight2': 0.6911657637200685, 'weight3': 0.813404286579151, 'weight4': 0.7845355330605773}. Best is trial#35 with value: 0.2673130955637655.
[I 2021-01-06 23:29:43,006] Finished trial#53 with value: 0.26789122941303994 with parameters: {'weight0': 0.9288398983713013, 'weight1': 0.9252536797755607, 'weight2': 0.8437548238183917, 'weight3': 0.5301666872115143, 'weight4': 0.7179021542671471}. Best is trial#35 with value: 0.2673130955637655.
[I 2021-01-06 23:29:43,093] Finished trial#54 with value: 0.26739725359046734 with parameters: {'weight0': 0.9219656774294025, 'weight1': 0.9281896301457601, 'weight2': 0.7175479525095291, 'weight3': 0.8546508954545453, 'weight4': 0.7944381221545325}. Best is trial#35 with value: 0.2673130955637655.
[I 2021-01-06 23:29:43,173] Finished trial#55 with value: 0.2675947780774488 with parameters: {'weight0': 0.930006021272707, 'weight1': 0.6596398956586502, 'weight2': 0.6898759343531063, 'weight3': 0.8479185911443101, 'weight4': 0.9465326326743723}. Best is trial#35 with value: 0.2673130955637655.
[I 2021-01-06 23:29:43,238] Finished trial#56 with value: 0.26733780865968976 with parameters: {'weight0': 0.7733025487257451, 'weight1': 0.9142319994534733, 'weight2': 0.8477538428288482, 'weight3': 0.8556934036324038, 'weight4': 0.7847142283150113}. Best is trial#35 with value: 0.2673130955637655.
[I 2021-01-06 23:29:43,316] Finished trial#57 with value: 0.27174367761926854 with parameters: {'weight0': 0.0035143435623128383, 'weight1': 0.9138528950507578, 'weight2': 0.6959065653599401, 'weight3': 0.8575807756039191, 'weight4': 0.7833631298072337}. Best is trial#35 with value: 0.2673130955637655.
[I 2021-01-06 23:29:43,405] Finished trial#58 with value: 0.26743478395862624 with parameters: {'weight0': 0.9196927808617673, 'weight1': 0.9197662557898146, 'weight2': 0.6956404116065501, 'weight3': 0.799348550774481, 'weight4': 0.8815237783100661}. Best is trial#35 with value: 0.2673130955637655.
[I 2021-01-06 23:29:43,483] Finished trial#59 with value: 0.2676270117128387 with parameters: {'weight0': 0.9354786642696656, 'weight1': 0.9229282790739632, 'weight2': 0.6864892196152514, 'weight3': 0.6639892978750884, 'weight4': 0.8867511036240489}. Best is trial#35 with value: 0.2673130955637655.
[I 2021-01-06 23:29:43,585] Finished trial#60 with value: 0.2674105407167059 with parameters: {'weight0': 0.7805536394555318, 'weight1': 0.9181541898434236, 'weight2': 0.845522254663911, 'weight3': 0.9578032984179521, 'weight4': 0.7421609833383845}. Best is trial#35 with value: 0.2673130955637655.
[I 2021-01-06 23:29:43,636] Finished trial#61 with value: 0.267570800665921 with parameters: {'weight0': 0.9135498431979446, 'weight1': 0.909303347758205, 'weight2': 0.6242403062711512, 'weight3': 0.7820918335934113, 'weight4': 0.7227958968143522}. Best is trial#35 with value: 0.2673130955637655.
[I 2021-01-06 23:29:43,683] Finished trial#62 with value: 0.267518122764861 with parameters: {'weight0': 0.913337947318204, 'weight1': 0.8825743274153814, 'weight2': 0.6298635835508283, 'weight3': 0.7825585065045265, 'weight4': 0.7430552385422197}. Best is trial#35 with value: 0.2673130955637655.
[I 2021-01-06 23:29:43,793] Finished trial#63 with value: 0.26737342497530886 with parameters: {'weight0': 0.9316909363203258, 'weight1': 0.8905368350287046, 'weight2': 0.8324446074270275, 'weight3': 0.9640482219970068, 'weight4': 0.7386908775881381}. Best is trial#35 with value: 0.2673130955637655.
[I 2021-01-06 23:29:43,846] Finished trial#64 with value: 0.27198481938331737 with parameters: {'weight0': 0.004133975916775612, 'weight1': 0.896761139889776, 'weight2': 0.6424995821976365, 'weight3': 0.9311057601050954, 'weight4': 0.715145892160752}. Best is trial#35 with value: 0.2673130955637655.
[I 2021-01-06 23:29:43,970] Finished trial#65 with value: 0.2676288752169071 with parameters: {'weight0': 0.7854572848934234, 'weight1': 0.8786391494877094, 'weight2': 0.6397360987120921, 'weight3': 0.9616572705595383, 'weight4': 0.672053651236759}. Best is trial#35 with value: 0.2673130955637655.
[I 2021-01-06 23:29:44,060] Finished trial#67 with value: 0.26761258261490384 with parameters: {'weight0': 0.6909483688079733, 'weight1': 0.8864788595196493, 'weight2': 0.8283495634455613, 'weight3': 0.9683852097261844, 'weight4': 0.6565332793240821}. Best is trial#35 with value: 0.2673130955637655.
[I 2021-01-06 23:29:44,139] Finished trial#66 with value: 0.26749811238811727 with parameters: {'weight0': 0.7814442565081046, 'weight1': 0.8836910262609701, 'weight2': 0.8219680529322354, 'weight3': 0.9685308633241013, 'weight4': 0.6653280875517998}. Best is trial#35 with value: 0.2673130955637655.
[I 2021-01-06 23:29:44,188] Finished trial#68 with value: 0.2674229962679714 with parameters: {'weight0': 0.7891905858174492, 'weight1': 0.9457570304848986, 'weight2': 0.8281653564296149, 'weight3': 0.9762725402695297, 'weight4': 0.7627031261270378}. Best is trial#35 with value: 0.2673130955637655.
[I 2021-01-06 23:29:44,301] Finished trial#69 with value: 0.2674257455139057 with parameters: {'weight0': 0.7880728273617882, 'weight1': 0.9938177908032335, 'weight2': 0.8627660847919943, 'weight3': 0.9695556982884221, 'weight4': 0.7975671417664577}. Best is trial#35 with value: 0.2673130955637655.
[I 2021-01-06 23:29:44,379] Finished trial#70 with value: 0.2675322184515851 with parameters: {'weight0': 0.693427015472595, 'weight1': 0.9549726217879873, 'weight2': 0.8277831103384329, 'weight3': 0.9984126817652202, 'weight4': 0.8544759859819376}. Best is trial#35 with value: 0.2673130955637655.
[I 2021-01-06 23:29:44,432] Finished trial#72 with value: 0.26736413101037443 with parameters: {'weight0': 0.7819801220014053, 'weight1': 0.9484505184621222, 'weight2': 0.8246376619640329, 'weight3': 0.9037166340305343, 'weight4': 0.8012165333673955}. Best is trial#35 with value: 0.2673130955637655.
[I 2021-01-06 23:29:44,533] Finished trial#71 with value: 0.2674674542044638 with parameters: {'weight0': 0.690927501473842, 'weight1': 0.9535398919201115, 'weight2': 0.8537180679924683, 'weight3': 0.8906713543970723, 'weight4': 0.8576613143857031}. Best is trial#35 with value: 0.2673130955637655.
[I 2021-01-06 23:29:44,643] Finished trial#73 with value: 0.267592081326272 with parameters: {'weight0': 0.7059574700449676, 'weight1': 0.9530771003728853, 'weight2': 0.9466147908945307, 'weight3': 0.8889945251224582, 'weight4': 0.695928569779941}. Best is trial#35 with value: 0.2673130955637655.
[I 2021-01-06 23:29:44,730] Finished trial#74 with value: 0.2674842042782062 with parameters: {'weight0': 0.9942562962925943, 'weight1': 0.9763157266350175, 'weight2': 0.9565007061070516, 'weight3': 0.9124211127977665, 'weight4': 0.6959199430684125}. Best is trial#35 with value: 0.2673130955637655.
[I 2021-01-06 23:29:44,801] Finished trial#75 with value: 0.2674771284697885 with parameters: {'weight0': 0.9647305903024366, 'weight1': 0.9931559167364927, 'weight2': 0.9500853599720067, 'weight3': 0.9029962989716528, 'weight4': 0.6994216017870946}. Best is trial#35 with value: 0.2673130955637655.
[I 2021-01-06 23:29:44,881] Finished trial#76 with value: 0.2672914393770987 with parameters: {'weight0': 0.996616922706734, 'weight1': 0.9967338518097747, 'weight2': 0.9472288784473567, 'weight3': 0.9056035687253208, 'weight4': 0.9194559781765913}. Best is trial#76 with value: 0.2672914393770987.
[I 2021-01-06 23:29:44,971] Finished trial#77 with value: 0.2675340560173819 with parameters: {'weight0': 0.6309279558631289, 'weight1': 0.763927302620137, 'weight2': 0.7737045334335498, 'weight3': 0.9401064591072333, 'weight4': 0.7676728722772045}. Best is trial#76 with value: 0.2672914393770987.
[I 2021-01-06 23:29:45,054] Finished trial#78 with value: 0.2676484320159843 with parameters: {'weight0': 0.6329690693254586, 'weight1': 0.7674546632252873, 'weight2': 0.9507307592527804, 'weight3': 0.9244288043118619, 'weight4': 0.925509198011978}. Best is trial#76 with value: 0.2672914393770987.
[I 2021-01-06 23:29:45,131] Finished trial#79 with value: 0.2676617150955265 with parameters: {'weight0': 0.6136357950970266, 'weight1': 0.7643296907804845, 'weight2': 0.7624843777423427, 'weight3': 0.9270356483131729, 'weight4': 0.9361000346869195}. Best is trial#76 with value: 0.2672914393770987.
[I 2021-01-06 23:29:45,212] Finished trial#80 with value: 0.2675753732673019 with parameters: {'weight0': 0.6200770701384295, 'weight1': 0.8140392676462853, 'weight2': 0.7642414331383949, 'weight3': 0.8431084319882525, 'weight4': 0.9290468879258629}. Best is trial#76 with value: 0.2672914393770987.
[I 2021-01-06 23:29:45,312] Finished trial#81 with value: 0.2673920576486078 with parameters: {'weight0': 0.6395397084487257, 'weight1': 0.7717607156337072, 'weight2': 0.7640925207786573, 'weight3': 0.8324222025196645, 'weight4': 0.7474992326167009}. Best is trial#76 with value: 0.2672914393770987.
[I 2021-01-06 23:29:45,390] Finished trial#82 with value: 0.26730649245148913 with parameters: {'weight0': 0.885394184401573, 'weight1': 0.7708352123815012, 'weight2': 0.7519527700881053, 'weight3': 0.8610663146007206, 'weight4': 0.8110385294010098}. Best is trial#76 with value: 0.2672914393770987.
[I 2021-01-06 23:29:45,445] Finished trial#83 with value: 0.2674450456338052 with parameters: {'weight0': 0.6302250995696795, 'weight1': 0.7750631170368809, 'weight2': 0.758876474149294, 'weight3': 0.8396611870749794, 'weight4': 0.831232707421822}. Best is trial#76 with value: 0.2672914393770987.
[I 2021-01-06 23:29:45,551] Finished trial#84 with value: 0.267510492677328 with parameters: {'weight0': 0.6271972972094475, 'weight1': 0.8154307728828849, 'weight2': 0.8807623532158062, 'weight3': 0.8659897366200414, 'weight4': 0.8346658462561378}. Best is trial#76 with value: 0.2672914393770987.
[I 2021-01-06 23:29:45,628] Finished trial#87 with value: 0.26727410730384316 with parameters: {'weight0': 0.8920903395826805, 'weight1': 0.8541547654100914, 'weight2': 0.8758648405670642, 'weight3': 0.866784765114253, 'weight4': 0.807714690282262}. Best is trial#87 with value: 0.26727410730384316.
[I 2021-01-06 23:29:45,681] Finished trial#85 with value: 0.26728184821824386 with parameters: {'weight0': 0.887886796930126, 'weight1': 0.8545814774153462, 'weight2': 0.8766488136115718, 'weight3': 0.8324083108979399, 'weight4': 0.800834506578891}. Best is trial#87 with value: 0.26727410730384316.
[I 2021-01-06 23:29:45,743] Finished trial#86 with value: 0.26730607177375987 with parameters: {'weight0': 0.9512166903482634, 'weight1': 0.8622942903858979, 'weight2': 0.8756928726684532, 'weight3': 0.8727137498138966, 'weight4': 0.9725773764863187}. Best is trial#87 with value: 0.26727410730384316.
[I 2021-01-06 23:29:45,861] Finished trial#88 with value: 0.26743685875002926 with parameters: {'weight0': 0.8847933897097191, 'weight1': 0.8610565797347048, 'weight2': 0.8713912329161915, 'weight3': 0.6829794309249516, 'weight4': 0.9006780673315835}. Best is trial#87 with value: 0.26727410730384316.
[I 2021-01-06 23:29:45,953] Finished trial#89 with value: 0.26748178149756796 with parameters: {'weight0': 0.8844037712484429, 'weight1': 0.8047680377474696, 'weight2': 0.9995981441868145, 'weight3': 0.7562544965570968, 'weight4': 0.982123202119807}. Best is trial#87 with value: 0.26727410730384316.
[I 2021-01-06 23:29:46,041] Finished trial#91 with value: 0.26738825639690544 with parameters: {'weight0': 0.880829224936545, 'weight1': 0.8526903619941396, 'weight2': 0.9859117637579387, 'weight3': 0.8021911033164502, 'weight4': 0.972950289229117}. Best is trial#87 with value: 0.26727410730384316.
[I 2021-01-06 23:29:46,090] Finished trial#90 with value: 0.2673913037067631 with parameters: {'weight0': 0.8890546467660698, 'weight1': 0.8638816720951399, 'weight2': 0.9805872360042045, 'weight3': 0.7623738369458832, 'weight4': 0.9076934244532562}. Best is trial#87 with value: 0.26727410730384316.
[I 2021-01-06 23:29:46,187] Finished trial#92 with value: 0.2674363099121157 with parameters: {'weight0': 0.884441113268305, 'weight1': 0.8600085970985469, 'weight2': 0.9992815841827822, 'weight3': 0.7615581550071125, 'weight4': 0.9659018184628603}. Best is trial#87 with value: 0.26727410730384316.
[I 2021-01-06 23:29:46,263] Finished trial#93 with value: 0.26730788030631464 with parameters: {'weight0': 0.876169230215138, 'weight1': 0.8650628241616639, 'weight2': 0.7913980612915329, 'weight3': 0.7685516905585323, 'weight4': 0.8080009191096329}. Best is trial#87 with value: 0.26727410730384316.
[I 2021-01-06 23:29:46,319] Finished trial#94 with value: 0.2673271240594606 with parameters: {'weight0': 0.8749901171724956, 'weight1': 0.8568219991259982, 'weight2': 0.9994166806587024, 'weight3': 0.8741077195675947, 'weight4': 0.9043282139663664}. Best is trial#87 with value: 0.26727410730384316.
[I 2021-01-06 23:29:46,368] Finished trial#95 with value: 0.26740347664234915 with parameters: {'weight0': 0.8889811955237857, 'weight1': 0.8654343739354848, 'weight2': 0.9912665804797458, 'weight3': 0.7658657914173153, 'weight4': 0.8157055175839505}. Best is trial#87 with value: 0.26727410730384316.
[I 2021-01-06 23:29:46,466] Finished trial#96 with value: 0.2673500614084466 with parameters: {'weight0': 0.879952952653742, 'weight1': 0.7061804797007497, 'weight2': 0.7994881918935659, 'weight3': 0.8764903609800854, 'weight4': 0.863590829459318}. Best is trial#87 with value: 0.26727410730384316.
[I 2021-01-06 23:29:46,518] Finished trial#98 with value: 0.2728392811169022 with parameters: {'weight0': 0.9512513244035508, 'weight1': 0.02607578247001896, 'weight2': 0.36114159407298896, 'weight3': 0.8720788468254399, 'weight4': 0.867337614317206}. Best is trial#87 with value: 0.26727410730384316.
[I 2021-01-06 23:29:46,566] Finished trial#97 with value: 0.2690969519398476 with parameters: {'weight0': 0.8132355107809736, 'weight1': 0.26158480604129153, 'weight2': 0.9315631421583969, 'weight3': 0.7008782387925084, 'weight4': 0.8697112824214545}. Best is trial#87 with value: 0.26727410730384316.
[I 2021-01-06 23:29:46,615] Finished trial#99 with value: 0.2674800048367961 with parameters: {'weight0': 0.9993267988807075, 'weight1': 0.7060403119780618, 'weight2': 0.935324924042841, 'weight3': 0.7995934854113028, 'weight4': 0.8676883690808104}. Best is trial#87 with value: 0.26727410730384316.

Get the best weights and the corresponding score

To make the result easier to read, the weights are normalized so that they sum to 1.

best_weight = list(study.best_params.values())
best_weight = np.array(best_weight) / np.sum(best_weight)
best_score = study.best_value
print('best score: {0}\nbest weight: {1}'.format(best_score, best_weight))

best score: 0.26727410730384316
best weight: [0.20762659 0.1987974  0.20385024 0.20173692 0.18798886]

Plotting the learning curve

Build the data for the learning curve

df_optuna = pd.DataFrame([trial.value for trial in study.trials], columns = ['value'])
# Running minimum of the objective up to and including each trial
df_optuna['best_value'] = df_optuna['value'].cummin()
df_optuna.tail()
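One way to compute a running best is pandas' cummin. The toy values below are made up. The sketch also shows that label-based slicing with .loc[:i] includes row i itself, which makes an explicit loop easy to get off by one.

```python
import pandas as pd

values = pd.Series([0.30, 0.27, 0.29, 0.25, 0.26])   # made-up trial values

# Running minimum up to and including each trial
best_so_far = values.cummin()
print(best_so_far.tolist())

# Equivalent explicit loop: .loc[:i] is inclusive of label i,
# so the range must start at 0, not 1
best_loop = [values.loc[:i].min() for i in range(len(values))]
print(best_loop == best_so_far.tolist())
```

Starting the loop at 1 instead would let each entry peek at the following trial's value.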

Learning curve

rcParams['font.size'] = 13
rcParams['font.family'] = 'Helvetica'

fig = plt.figure(facecolor = 'white')
ax = fig.add_subplot(111)

ax.scatter(df_optuna.index, df_optuna['value'], c = 'blue', s = 10, label = 'Value')
ax.plot(df_optuna.index, df_optuna['best_value'], c = 'r', label = 'Best Value')

ax.legend(facecolor = '#f0f0f0', edgecolor = 'None')
None

Save the learning curve

fig.savefig('output.png', dpi = 300)

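For 100 runs, Optuna is faster in terms of execution time (roughly 7 seconds against about 1 minute according to the logs above). Note that each Nelder-Mead run evaluates the score many times, whereas each Optuna trial evaluates it once.

However, with Random Search every run descends to a minimum, and all of the runs ended up in the same place.
In my view, this does not mean that a single run is enough; it only serves as a check that the search is not getting stuck in a local minimum.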

[Blending] How to Optimize the Weights


Output of df_y_pred.head():

	model0	model1	model2	model3	model4
0	0.238137	0.032485	0.124268	0.170548	0.290201
1	0.823047	0.799337	0.106531	0.839951	0.596350
2	0.822898	0.609054	0.761943	0.004310	0.522597
3	0.077540	0.526384	0.730158	0.608606	0.526219
4	0.958109	0.312990	0.566702	0.031811	0.532306

Output of df_optuna.tail():

	value	best_value
95	0.267403	0.267274
96	0.267350	0.267274
97	0.269097	0.267274
98	0.272839	0.267274
99	0.267480	0.267274