I want to prove that the divergence between two probability distributions $p$ and $q$ defined on a set $\mathcal{U}$ is lower-bounded, for any subset $\mathcal{S}\subset\mathcal{U}$, by the following expression:
\begin{equation} D(p||q)\geq d_2(p(\mathcal{S})||q(\mathcal{S})), \end{equation}
where $p(\mathcal{S})=\sum_{u\in \mathcal{S}}p(u)$ (and similarly for $q(\mathcal{S})$), and $d_2(\alpha||\beta)=\alpha\log \frac{\alpha}{\beta}+(1-\alpha)\log \frac{1-\alpha}{1-\beta}$ is the binary divergence.
I suspect this can be proven with the data-processing theorem for divergence, but I don't know how to arrange the terms to arrive at the expression above.
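As a numerical sanity check (not a proof), here is a small Python sketch that computes both sides of the inequality for an arbitrary pair of distributions and an arbitrary subset $\mathcal{S}$; the distributions and the subset below are made-up illustrative choices:

```python
import math

def kl(p, q):
    """D(p||q) for distributions given as dicts over the same finite support."""
    return sum(p[u] * math.log(p[u] / q[u]) for u in p if p[u] > 0)

def d2(a, b):
    """Binary divergence d_2(a||b), with the convention 0*log(0/x) = 0."""
    total = 0.0
    if a > 0:
        total += a * math.log(a / b)
    if a < 1:
        total += (1 - a) * math.log((1 - a) / (1 - b))
    return total

# Arbitrary example: U = {0, 1, 2, 3}, q uniform, S = {1, 2}
p = {0: 0.1, 1: 0.4, 2: 0.3, 3: 0.2}
q = {0: 0.25, 1: 0.25, 2: 0.25, 3: 0.25}
S = {1, 2}

pS = sum(p[u] for u in S)  # p(S)
qS = sum(q[u] for u in S)  # q(S)

# The claimed lower bound: D(p||q) >= d_2(p(S)||q(S))
assert kl(p, q) >= d2(pS, qS)
```

This is consistent with viewing the map $u \mapsto \mathbf{1}\{u\in\mathcal{S}\}$ as a (deterministic) channel, which is presumably where the data-processing argument enters.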