Let us start with an example, Langevin paramagnetism, in which the magnetic moment is described classically as a vector in three dimensions. Call $\vec\mu$ this moment, $\vec B$ the magnetic induction and $\theta$ the angle between $\vec\mu$ and $\vec B$. The energy is $-\mu B\cos\theta$, so the probability density of the angle $\theta$ is $\rho(\theta)=\frac{1}{\mathcal Z}\exp(\beta\mu B\cos\theta)$, where $\mathcal Z$ is the partition function, which acts as a normalization constant.
Indeed we have (using $x=\cos\theta$)
$$\mathcal Z=\int_0^\pi\exp(\beta\mu B\cos\theta)\sin\theta\,\mathrm d\theta
=\int_{-1}^1\exp(\beta\mu B x)\,\mathrm dx=2\frac{\sinh(\beta\mu B)}{\beta\mu B}.$$
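The closed form can be checked numerically. The following is a minimal sketch, assuming the dimensionless combination $a=\beta\mu B$ and a simple midpoint rule; the function names and the number of cells are illustrative choices, not part of the derivation.

```python
import math

def partition_function_numeric(a, n=100_000):
    """Midpoint-rule estimate of Z = int_{-1}^{1} exp(a*x) dx, with a = beta*mu*B."""
    h = 2.0 / n  # width of each cell in x = cos(theta)
    return h * sum(math.exp(a * (-1.0 + (i + 0.5) * h)) for i in range(n))

def partition_function_closed(a):
    """Closed form Z = 2*sinh(a)/a, valid for a != 0."""
    return 2.0 * math.sinh(a) / a

if __name__ == "__main__":
    for a in (0.1, 1.0, 5.0):
        print(a, partition_function_numeric(a), partition_function_closed(a))
```

The two columns should agree to many digits for moderate values of $a$.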
The average energy is
$$E=-\frac{\partial \ln\mathcal Z}{\partial \beta}=-\mu B\left(\coth(\beta\mu B)-\frac1{\beta\mu B}\right).$$
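As a sanity check on the derivative, one can compare the closed form with a central finite difference of $\ln\mathcal Z$. This is only a sketch: the parameter `mu_B` (the product $\mu B$, set to 1 in arbitrary units) and the step `h` are assumptions made for illustration.

```python
import math

def ln_Z(beta, mu_B=1.0):
    """ln Z with Z = 2*sinh(beta*mu*B)/(beta*mu*B)."""
    a = beta * mu_B
    return math.log(2.0 * math.sinh(a) / a)

def energy_closed(beta, mu_B=1.0):
    """E = -mu*B*(coth(a) - 1/a), with a = beta*mu*B."""
    a = beta * mu_B
    return -mu_B * (1.0 / math.tanh(a) - 1.0 / a)

def energy_finite_difference(beta, mu_B=1.0, h=1e-5):
    """E = -d(ln Z)/d(beta), estimated by a central difference of step h."""
    return -(ln_Z(beta + h, mu_B) - ln_Z(beta - h, mu_B)) / (2.0 * h)

if __name__ == "__main__":
    for beta in (0.5, 1.0, 2.0):
        print(beta, energy_closed(beta), energy_finite_difference(beta))
```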
The entropy is
$$\begin{split}S&=-k_{\text B}\int_0^\pi \rho(\theta)\ln\rho(\theta)\,\sin\theta\,\mathrm d\theta=-k_{\text B}\frac1{\cal Z}
\int_{-1}^1\exp(\beta \mu B x)\left(\beta\mu Bx-\ln\mathcal Z\right)\mathrm dx\\
&=k_{\text B}\ln\mathcal Z-k_{\text B}\beta\mu B\left(\coth(\beta\mu B)-\frac{1}{\beta \mu B}\right)=k_{\text B}\ln {\cal Z}+\frac{E}{T}.
\end{split}$$
Defining, as usual, the free energy as $F=-k_{\text B}T\ln\mathcal Z$, the last expression reads $F=E-TS$. The integral expression $S=-k_{\text B}\int\rho\ln\rho$ therefore reproduces the standard thermodynamic relation, which confirms that it is a valid entropy formula.
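The identity $S=k_{\text B}\ln\mathcal Z+E/T$ can also be verified numerically. The sketch below works with $S/k_{\text B}$ and $a=\beta\mu B$, and evaluates $-\int\rho\ln\rho\,\mathrm dx$ with a midpoint rule; the function names and grid size are illustrative assumptions.

```python
import math

def entropy_over_kB_numeric(a, n=100_000):
    """Midpoint-rule estimate of -int_{-1}^{1} rho(x) ln rho(x) dx,
    with rho(x) = exp(a*x)/Z and a = beta*mu*B."""
    Z = 2.0 * math.sinh(a) / a
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        x = -1.0 + (i + 0.5) * h
        rho = math.exp(a * x) / Z
        total -= rho * math.log(rho) * h
    return total

def entropy_over_kB_closed(a):
    """S/k_B = ln Z + beta*E, with beta*E = -a*(coth(a) - 1/a)."""
    Z = 2.0 * math.sinh(a) / a
    beta_E = -a * (1.0 / math.tanh(a) - 1.0 / a)
    return math.log(Z) + beta_E

if __name__ == "__main__":
    for a in (0.5, 1.0, 3.0):
        print(a, entropy_over_kB_numeric(a), entropy_over_kB_closed(a))
```

Note that this integral entropy is not bounded below: for large $a$ the density becomes sharply peaked and the value turns negative, which is consistent with the entropy being defined only up to a constant.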
In this model no underlying discretization is needed. However, when $S$ is defined as an integral, the argument of the logarithm may carry a physical unit, which hints that the entropy is defined only up to an additive constant. To see this, discretize the values of $\theta$ into segments of width $\pi/N$ and define $p_n=\rho(n\frac\pi N)\frac\pi N$. The statistical entropy becomes
$$S=-k_{\text B}\sum_n p_n\ln p_n=-k_{\text B}\sum_n \rho\left(n\frac\pi N\right)\frac \pi N\left[\ln\rho\left(n\frac\pi N\right)+\ln\frac\pi N\right]$$
The first term is a Riemann sum. The second, since $\sum_n p_n\to1$, tends to $-k_{\text B}\ln\frac\pi N$, which goes to infinity when $N\to\infty$. The preceding formula becomes
$$S=-k_{\text B}\int\rho\ln\rho\;\;-k_{\text B}\lim_{N\to\infty}\ln\frac\pi N.$$
Since usually only variations of entropy are used, the infinite constant is irrelevant: one may drop it and use the entropy defined by the integral.
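The divergence can be seen numerically. The sketch below is an illustration under one stated assumption: the cells are taken of equal width $2/N$ in $x=\cos\theta$ rather than in $\theta$, so that the measure is uniform and the cell width $2/N$ plays the role of $\pi/N$ above. Function names are hypothetical.

```python
import math

def discrete_entropy_over_kB(a, n):
    """-sum_n p_n ln p_n for n equal cells in x = cos(theta),
    with p_n = rho(x_n) * (2/n) and rho(x) = exp(a*x)/Z."""
    Z = 2.0 * math.sinh(a) / a
    h = 2.0 / n
    s = 0.0
    for i in range(n):
        x = -1.0 + (i + 0.5) * h  # midpoint of cell i
        p = math.exp(a * x) / Z * h
        s -= p * math.log(p)
    return s

def integral_entropy_over_kB(a):
    """-int rho ln rho dx = ln Z - a*(coth(a) - 1/a)."""
    Z = 2.0 * math.sinh(a) / a
    return math.log(Z) - a * (1.0 / math.tanh(a) - 1.0 / a)

if __name__ == "__main__":
    a = 1.0
    print("integral entropy:", integral_entropy_over_kB(a))
    for n in (10, 100, 1000, 10000):
        s = discrete_entropy_over_kB(a, n)
        # s grows like ln(n); adding ln(cell width) removes the divergence
        print(n, s, s + math.log(2.0 / n))
```

The discrete entropy grows by about $\ln 10$ each time $N$ is multiplied by 10, while the shifted value settles on the integral entropy.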