Bayesian Modelling in Python (3)

Compiled by: 伯樂在線 - JLee

  • Section 0: Introduction
  • Section 1: Estimating model parameters
  • Section 2: Model checking
  • Section 3: Hierarchical models (this article)

Section 3: Hierarchical models




A core strength of Bayesian models is their simplicity and flexibility, which makes it straightforward to implement a hierarchical model. In this section we implement and compare a pooled model and a partially pooled model.





import itertools

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import pymc3 as pm
import scipy
import scipy.stats as stats
import seaborn.apionly as sns

from IPython.display import Image
from sklearn import preprocessing

%matplotlib inline

plt.style.use("bmh")
colors = ["#348ABD", "#A60628", "#7A68A6", "#467821", "#D55E00",
          "#CC79A7", "#56B4E9", "#009E73", "#F0E442", "#0072B2"]

messages = pd.read_csv("data/hangout_chat_data.csv")



Model pooling




Let's take a different approach to modelling my hangout chat response times. My intuition tells me that how quickly I respond depends on who I'm chatting with: I'm likely to respond to my girlfriend faster than to a distant friend. I can therefore model each conversation independently, estimating the parameters μi and αi for each conversation i, with priors μi ~ Uniform(0, 100) and αi ~ Uniform(0, 100).




One issue we must consider is that some conversations contain far fewer messages than others. As a result, our estimates of response time for conversations with few messages carry greater uncertainty than those for conversations with many messages. The plot below illustrates the disparity in sample size across conversations.





ax = messages.groupby("prev_sender")["conversation_id"].size().plot(
    kind="bar", figsize=(12, 3), title="Number of messages sent per recipient", color=colors[0])
_ = ax.set_xlabel("Previous Sender")
_ = ax.set_ylabel("Number of messages")
_ = plt.xticks(rotation=45)







For each message j in each conversation i, the model can be represented as:
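The equation image is missing from this copy; reconstructed from the code that follows, the per-conversation model is:

```latex
\begin{aligned}
y_{ji} &\sim \text{NegBinomial}(\mu_i, \alpha_i) \\
\mu_i &\sim \text{Uniform}(0, 100) \\
\alpha_i &\sim \text{Uniform}(0, 100)
\end{aligned}
```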






indiv_traces = {}

# Convert categorical variables to integer
le = preprocessing.LabelEncoder()
participants_idx = le.fit_transform(messages["prev_sender"])
participants = le.classes_
n_participants = len(participants)

for p in participants:
    with pm.Model() as model:
        alpha = pm.Uniform("alpha", lower=0, upper=100)
        mu = pm.Uniform("mu", lower=0, upper=100)

        data = messages[messages["prev_sender"] == p]["time_delay_seconds"].values
        y_est = pm.NegativeBinomial("y_est", mu=mu, alpha=alpha, observed=data)

        y_pred = pm.NegativeBinomial("y_pred", mu=mu, alpha=alpha)

        start = pm.find_MAP()
        step = pm.Metropolis()
        trace = pm.sample(20000, step, start=start, progressbar=True)

        indiv_traces[p] = trace




Applied interval-transform to alpha and added transformed alpha_interval to model.
Applied interval-transform to mu and added transformed mu_interval to model.
[-----------------100%-----------------] 20000 of 20000 complete in 9.7 sec
(…the same two transform messages and a progress bar repeat for each remaining participant…)
[-----------------100%-----------------] 20000 of 20000 complete in 13.6 sec





fig, axs = plt.subplots(3, 2, figsize=(12, 6))
axs = axs.ravel()
y_left_max = 2
y_right_max = 2000
x_lim = 60
ix = [3, 4, 6]

for i, j, p in zip([0, 1, 2], [0, 2, 4], participants[ix]):
    axs[j].set_title("Observed: %s" % p)
    axs[j].hist(messages[messages["prev_sender"] == p]["time_delay_seconds"].values,
                range=[0, x_lim], bins=x_lim, histtype="stepfilled")
    axs[j].set_ylim([0, y_left_max])

for i, j, p in zip([0, 1, 2], [1, 3, 5], participants[ix]):
    axs[j].set_title("Posterior predictive distribution: %s" % p)
    axs[j].hist(indiv_traces[p].get_values("y_pred"),
                range=[0, x_lim], bins=x_lim, histtype="stepfilled", color=colors[1])
    axs[j].set_ylim([0, y_right_max])

axs[4].set_xlabel("Response time (seconds)")
axs[5].set_xlabel("Response time (seconds)")

plt.tight_layout()







The plots above show the observed data distributions (left) and the posterior predictive distributions (right) for three example conversations. As you can see, the posterior predictive distributions differ considerably across conversations. This may accurately reflect the characteristics of each conversation, or it may be error caused by small sample sizes.




If we combine the posterior predictive distributions, we would hope the result approximates the distribution of the observed data. Let's perform this posterior predictive check.





combined_y_pred = np.concatenate([v.get_values("y_pred") for k, v in indiv_traces.items()])

x_lim = 60
y_pred = trace.get_values("y_pred")

fig = plt.figure(figsize=(12, 6))
fig.add_subplot(211)

_ = plt.hist(combined_y_pred, range=[0, x_lim], bins=x_lim, histtype="stepfilled", color=colors[1])
_ = plt.xlim(1, x_lim)
_ = plt.ylim(0, 20000)
_ = plt.ylabel("Frequency")
_ = plt.title("Posterior predictive distribution")

fig.add_subplot(212)

_ = plt.hist(messages["time_delay_seconds"].values, range=[0, x_lim], bins=x_lim, histtype="stepfilled")
_ = plt.xlim(0, x_lim)
_ = plt.xlabel("Response time in seconds")
_ = plt.ylim(0, 20)
_ = plt.ylabel("Frequency")
_ = plt.title("Distribution of observed data")

plt.tight_layout()







Yes, the combined posterior predictive distribution does approximate the distribution of the observed data. However, I'm still concerned about the conversations with little data, whose estimates may have high variance. One way to reduce this risk is to share information across conversations while still estimating μi for each conversation individually — this is known as partial pooling.




Partial pooling




As with the pooled model, the partially pooled model has separate parameter estimates for each conversation i. However, these parameters are tied together through pooling (hyper) parameters. This reflects the fact that my response_time behaves similarly across conversations: by nature I tend to respond either quickly or slowly.







Continuing from the example above, we estimate the negative binomial parameters μi and αi. Instead of a uniform prior, I use a Gamma distribution with two parameters (μ, σ); this lets me introduce more prior knowledge into the model when I have expectations for the values of μ and σ.




First, let's look at the Gamma distribution. As you can see below, it is very flexible.
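The original plot of Gamma densities is missing from this copy. A minimal sketch of the idea, converting the article's (μ, σ) parameterization into scipy's shape/scale form — the (μ, σ) pairs below are arbitrary choices for illustration:

```python
import numpy as np
import scipy.stats as stats

# A Gamma prior specified by mean mu and standard deviation sd (as in
# pm.Gamma(mu=..., sd=...)) maps to scipy's parameterization via:
#   shape = (mu / sd)^2,  scale = sd^2 / mu
def gamma_pdf(x, mu, sd):
    shape = (mu / sd) ** 2
    scale = sd ** 2 / mu
    return stats.gamma.pdf(x, a=shape, scale=scale)

x = np.linspace(0.1, 60, 300)
for mu, sd in [(5, 2), (20, 10), (40, 5)]:  # arbitrary illustrative values
    density = gamma_pdf(x, mu, sd)
    print("mu=%d sd=%d  peak density=%.3f" % (mu, sd, density.max()))
```

Plotting each `density` curve against `x` reproduces the kind of figure the article refers to: the same distribution family can be sharply peaked or widely spread depending on (μ, σ).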











The partially pooled model can be represented as:

Code:

Image("graphics/dag neg poisson gamma hyper.png", width=420)
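The DAG image aside, written out to match the code below, the hierarchical model is:

```latex
\begin{aligned}
y_{ji} &\sim \text{NegBinomial}(\mu_i, \alpha_i) \\
\mu_i &\sim \Gamma(\text{hyper}_{\mu_\mu}, \text{hyper}_{\mu_\sigma}) \\
\alpha_i &\sim \Gamma(\text{hyper}_{\alpha_\mu}, \text{hyper}_{\alpha_\sigma}) \\
\text{hyper}_{\mu_\mu} &\sim \text{Uniform}(0, 60), \quad \text{hyper}_{\mu_\sigma} \sim \text{Uniform}(0, 50) \\
\text{hyper}_{\alpha_\mu} &\sim \text{Uniform}(0, 10), \quad \text{hyper}_{\alpha_\sigma} \sim \text{Uniform}(0, 50)
\end{aligned}
```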








with pm.Model() as model:
    hyper_alpha_sd = pm.Uniform("hyper_alpha_sd", lower=0, upper=50)
    hyper_alpha_mu = pm.Uniform("hyper_alpha_mu", lower=0, upper=10)

    hyper_mu_sd = pm.Uniform("hyper_mu_sd", lower=0, upper=50)
    hyper_mu_mu = pm.Uniform("hyper_mu_mu", lower=0, upper=60)

    alpha = pm.Gamma("alpha", mu=hyper_alpha_mu, sd=hyper_alpha_sd, shape=n_participants)
    mu = pm.Gamma("mu", mu=hyper_mu_mu, sd=hyper_mu_sd, shape=n_participants)

    y_est = pm.NegativeBinomial("y_est",
                                mu=mu[participants_idx],
                                alpha=alpha[participants_idx],
                                observed=messages["time_delay_seconds"].values)

    y_pred = pm.NegativeBinomial("y_pred",
                                 mu=mu[participants_idx],
                                 alpha=alpha[participants_idx],
                                 shape=messages["prev_sender"].shape)

    start = pm.find_MAP()
    step = pm.Metropolis()
    hierarchical_trace = pm.sample(200000, step, progressbar=True)





Applied interval-transform to hyper_alpha_sd and added transformed hyper_alpha_sd_interval to model.
Applied interval-transform to hyper_alpha_mu and added transformed hyper_alpha_mu_interval to model.
Applied interval-transform to hyper_mu_sd and added transformed hyper_mu_sd_interval to model.
Applied interval-transform to hyper_mu_mu and added transformed hyper_mu_mu_interval to model.
Applied log-transform to alpha and added transformed alpha_log to model.
Applied log-transform to mu and added transformed mu_log to model.
[-----------------100%-----------------] 200000 of 200000 complete in 593.0 sec










There are multiple traces for the estimates of μ and α, one per conversation i. The difference between the pooled and partially pooled models is that in the partially pooled model, the parameters (μ and α) have pooling parameters shared by all conversations. This brings two benefits:






  1. Information is shared across conversations, so for conversations with limited sample sizes, the estimation "borrows" information from other conversations to reduce the variance of the estimate.

  2. We get estimates for each conversation individually, as well as estimates across all conversations as a whole.




Let's take a quick look at the posterior predictive distribution:





x_lim = 60
y_pred = hierarchical_trace.get_values("y_pred")[::1000].ravel()

fig = plt.figure(figsize=(12, 6))
fig.add_subplot(211)

_ = plt.hist(y_pred, range=[0, x_lim], bins=x_lim, histtype="stepfilled", color=colors[1])
_ = plt.xlim(1, x_lim)
_ = plt.ylabel("Frequency")
_ = plt.title("Posterior predictive distribution")

fig.add_subplot(212)

_ = plt.hist(messages["time_delay_seconds"].values, range=[0, x_lim], bins=x_lim, histtype="stepfilled")
_ = plt.xlabel("Response time in seconds")
_ = plt.ylabel("Frequency")
_ = plt.title("Distribution of observed data")

plt.tight_layout()







Shrinkage effect: pooled model vs. hierarchical model




As discussed, in the partially pooled model μ and α share a pooling parameter. By sharing information across conversations, it shrinks the parameter estimates closer together, especially for conversations with little data.




The plot below shows this shrinkage effect: you can see how the pooling parameter pulls the estimates of μ and α together.





hier_mu = hierarchical_trace["mu"][500:].mean(axis=0)
hier_alpha = hierarchical_trace["alpha"][500:].mean(axis=0)
indv_mu = [indiv_traces[p]["mu"][500:].mean() for p in participants]
indv_alpha = [indiv_traces[p]["alpha"][500:].mean() for p in participants]

fig = plt.figure(figsize=(8, 6))
ax = fig.add_subplot(111, xlabel="mu", ylabel="alpha",
                     title="Pooled vs. Partially Pooled Negative Binomial Model",
                     xlim=(5, 45), ylim=(0, 10))

ax.scatter(indv_mu, indv_alpha, c=colors[5], s=50, label="Pooled", zorder=3)
ax.scatter(hier_mu, hier_alpha, c=colors[6], s=50, label="Partially Pooled", zorder=4)
for i in range(len(indv_mu)):
    ax.arrow(indv_mu[i], indv_alpha[i],
             hier_mu[i] - indv_mu[i], hier_alpha[i] - indv_alpha[i],
             fc="grey", ec="grey", length_includes_head=True, alpha=.5, head_width=0)

_ = ax.legend()





Asking questions of the posterior




Now we get to use one of the best aspects of Bayesian statistics: the posterior distribution. Unlike frequentist methods, we obtain the full posterior distribution rather than a point estimate. In essence, we have many credible parameter values, which lets us ask questions in a fairly natural and intuitive way.




What are the chances I'll respond to my friend in less than 10 seconds?




To estimate this probability, we can look at the posterior predictive distributions of Timothy's and Andrew's response times and check how many samples are below 10 seconds. When I first heard of this method, I thought I must have misunderstood it, because it's so simple.
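The helper participant_y_pred used below is never defined in this copy of the article. A plausible sketch, assuming it pulls the y_pred draws belonging to one participant out of the hierarchical trace (the function name, the burn-in of 500, and the exact slicing are assumptions; the demo arrays are synthetic stand-ins for hierarchical_trace, participants, and participants_idx):

```python
import numpy as np

def participant_y_pred_from(trace_y_pred, participants, participants_idx, person, burnin=500):
    """Flatten the posterior-predictive draws belonging to one participant.

    trace_y_pred:     (n_samples, n_messages) array of y_pred draws
    participants_idx: per-message integer code of the previous sender
    """
    person_code = np.where(participants == person)[0][0]
    # Drop burn-in rows, keep only the columns (messages) sent by this person
    return trace_y_pred[burnin:, participants_idx == person_code].ravel()

# Synthetic demo: 2 participants, 4 messages, 1000 posterior draws
rng = np.random.RandomState(0)
participants_demo = np.array(["Andrew", "Timothy"])
participants_idx_demo = np.array([0, 1, 1, 0])
trace_demo = rng.poisson(20, size=(1000, 4))
samples = participant_y_pred_from(trace_demo, participants_demo,
                                  participants_idx_demo, "Timothy")
```

In the article's context this would presumably be wrapped as participant_y_pred(person), closing over hierarchical_trace, participants, and participants_idx.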








print("Here are some samples from Timothy's posterior predictive distribution: \n %s" % participant_y_pred("Timothy"))





Here are some samples from Timothy's posterior predictive distribution:
 [24 24 24 ..., 19 19 19]





def person_plotA(person_name):
    ix_check = participant_y_pred(person_name) > 10
    _ = plt.hist(participant_y_pred(person_name)[~ix_check], range=[0, x_lim], bins=x_lim,
                 histtype="stepfilled", label="<10 seconds")
    _ = plt.hist(participant_y_pred(person_name)[ix_check], range=[0, x_lim], bins=x_lim,
                 histtype="stepfilled", label=">10 seconds")
    _ = plt.title("Posterior predictive \ndistribution for %s" % person_name)
    _ = plt.xlabel("Response time")
    _ = plt.ylabel("Frequency")
    _ = plt.legend()

def person_plotB(person_name):
    x = np.linspace(1, 60, num=60)
    num_samples = float(len(participant_y_pred(person_name)))
    prob_lt_x = [100 * sum(participant_y_pred(person_name) < i) / num_samples for i in x]
    _ = plt.plot(x, prob_lt_x, color=colors[4])
    _ = plt.fill_between(x, prob_lt_x, color=colors[4], alpha=0.3)
    _ = plt.scatter(10, float(100 * sum(participant_y_pred(person_name) < 10)) / num_samples,
                    s=180, c=colors[4])
    _ = plt.title("Probability of responding \nto %s before time (t)" % person_name)
    _ = plt.xlabel("Response time (t)")
    _ = plt.ylabel("Cumulative probability \t")
    _ = plt.ylim(ymin=0, ymax=100)
    _ = plt.xlim(xmin=0, xmax=60)

fig = plt.figure(figsize=(11, 6))
_ = fig.add_subplot(221)
person_plotA("Timothy")
_ = fig.add_subplot(222)
person_plotB("Timothy")

_ = fig.add_subplot(223)
person_plotA("Andrew")
_ = fig.add_subplot(224)
person_plotB("Andrew")

plt.tight_layout()






participant_y_pred("Andrew")





array([13, 13, 13, ..., 28, 28, 28])




I find this method very intuitive and flexible. The plots on the left above split the posterior predictive samples into those below and above 10 seconds, and the probability is computed as the proportion of samples below 10. The plots on the right compute this probability for every value between 0 and 60 seconds. So it appears Timothy and Andrew have a 38% and 8% chance, respectively, of being responded to within 10 seconds.
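The probability computation above boils down to a simple sample proportion. For instance, with a toy array of posterior-predictive draws:

```python
import numpy as np

# Toy posterior-predictive samples of response time (seconds)
samples = np.array([3, 8, 12, 25, 6, 40, 9, 15])

# P(response < 10s) is estimated by the fraction of draws below 10
p_lt_10 = (samples < 10).mean()
print("P(response < 10s) = %.0f%%" % (100 * p_lt_10))  # → 50%
```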




How do my friends compare with each other, pairwise?





def prob_persona_faster(persona, personb):
    return sum(participant_y_pred(persona) < participant_y_pred(personb)) / len(participant_y_pred(persona))

print("Probability that Tom is responded to faster than Andrew: {:.2%}".format(prob_persona_faster("Tom", "Andrew")))





Probability that Tom is responded to faster than Andrew: 33.05%





# Create an empty dataframe
ab_dist_df = pd.DataFrame(index=participants, columns=participants, dtype=np.float)

# Populate each cell in the dataframe with prob_persona_faster()
for a, b in itertools.permutations(participants, 2):
    ab_dist_df.ix[a, b] = prob_persona_faster(a, b)

# Populate the diagonal
for a in participants:
    ab_dist_df.ix[a, a] = 0.5

# Plot heatmap
f, ax = plt.subplots(figsize=(12, 9))
cmap = plt.get_cmap("Spectral")
_ = sns.heatmap(ab_dist_df, square=True, cmap=cmap)
_ = plt.title("Probability that Person A will be responded to faster than Person B")
_ = plt.ylabel("Person A")
_ = plt.xlabel("Person B")






