## Mutual Information Similarity
The distributional similarity $P(i,j)$ between terms $i$ and $j$ is:
$$ P(i,j) = \frac{\sum_{k \neq i,j;\, M_{ik}>0} \min(M_{ik}, M_{jk})}{\sum_{k \neq i,j;\, M_{ik}>0} M_{ik}},
$$
where $M_{ij}$ is defined as
$$ M_{ij} = \log\left(\frac{C_{ij}}{E_{ij}}\right),
$$
with $C_{ij}$ the number of word bags containing cooccurrences of terms $i$ and $j$,
and $E_{ij}$ the number of cooccurrences expected under independence (given a map list of size $m$):
$$ E_{ij} = \frac {S_{i} S_{j}} {N_{m}} $$
with $S_i$ the total number of cooccurrences of term $i$,
$$S_{i} = \sum_{j=1, j \neq i}^{m} C_{ij}$$
and $N_m$ the total number of cooccurrences of terms with a map list of size $m$:
$$ N_{m} = \sum_{i=1}^m\,S_i = \sum_{i=1}^{m} \sum_{j=1, j \neq i}^{m} C_{ij}. $$
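The definitions above can be sketched directly from a cooccurrence count matrix. This is an illustrative implementation, not Gargantext's actual code: the function and variable names are hypothetical, and it assumes that an undefined or negative $M_{jk}$ contributes $0$ to the numerator (the formula leaves this case unspecified).

```python
import math

def mi_similarity(C, i, j):
    """Distributional similarity P(i, j) from a symmetric cooccurrence
    count matrix C (C[a][b] = C_ab, zero diagonal). Names are
    illustrative, not Gargantext's actual implementation."""
    m = len(C)
    # S_a = total cooccurrences of term a; N_m = grand total
    S = [sum(row) for row in C]
    N = sum(S)

    def M(a, b):
        # pointwise mutual information M_ab = log(C_ab / E_ab),
        # with E_ab = S_a * S_b / N_m; undefined (-inf) when C_ab = 0
        if C[a][b] == 0:
            return float("-inf")
        return math.log(C[a][b] * N / (S[a] * S[b]))

    num = den = 0.0
    for k in range(m):
        if k in (i, j):
            continue
        m_ik = M(i, k)
        if m_ik > 0:  # restriction M_ik > 0 from the definition of P
            # assumption: an undefined or negative M_jk counts as 0
            num += min(m_ik, max(M(j, k), 0.0))
            den += m_ik
    return num / den if den else 0.0
```

With that clamping assumption, $P(i,j)$ lies in $[0, 1]$: it approaches $1$ when every context $k$ strongly associated with $i$ is at least as strongly associated with $j$.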