I have run into this issue before and needed a concrete answer, because I had to document the methodology behind the profile generation for a client. I found that generating the profile data from an existing shapefile resulted in highly variable distances between elevation points, whereas using an interpolated shape (3D Analyst - Interpolate Shape - Generate Profile) gave more uniform distances, though not necessarily at the expected interval. I contacted ESRI tech support to clarify my results, and they sent me the following response:
To recap our phone conversation, the interval is determined depending on how the vertices were added to the line. The profile graph will read the vertices when creating the graphic. When the Interpolate Shape tool is used the vertices will be added for each cell within the raster, creating a smoother graphic. The interval will be reflected as this interpolation occurs. When the vertices of the original line are used, the graph will be much more coarse and the interval will reflect the surface length between vertices.
In summary, when using an existing shapefile, horizontal distance along the X-axis will be in increments determined by the spacing of the shapefile's vertices (units should match the horizontal unit defined for the shapefile). When using the Interpolate Shape approach, horizontal distance should roughly follow a one-vertex-per-DEM-cell distribution (i.e., one elevation point will be generated for each DEM pixel intersected by the profile line). In my experience, this interval is additionally affected by the horizontal tolerance of the shapefile, resulting in distance intervals that are close to the DEM cell size but not exact (e.g., for a 10 m resolution DEM, intervals between elevation points will be ~9.996 m).
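To make the difference concrete, here is a minimal pure-Python sketch of the two cases. It is not ESRI's implementation: the `densify` helper, the sample line, and the 10-unit cell size are all hypothetical, and the densification simply splits each segment into equal pieces no longer than the cell size. It only illustrates why a profile built from the original vertices has irregular intervals, while a densified line has intervals close to, but not always equal to, the cell size.

```python
import math

def intervals(vertices):
    """Planar distance between each pair of consecutive vertices."""
    return [math.hypot(x1 - x0, y1 - y0)
            for (x0, y0), (x1, y1) in zip(vertices, vertices[1:])]

def densify(vertices, cell_size):
    """Hypothetical stand-in for interpolation: add vertices along each
    segment so that no gap is longer than cell_size (assumed behaviour,
    not the actual Interpolate Shape algorithm)."""
    result = [vertices[0]]
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:]):
        seg_len = math.hypot(x1 - x0, y1 - y0)
        n = max(1, math.ceil(seg_len / cell_size))  # pieces in this segment
        for i in range(1, n + 1):
            t = i / n
            result.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return result

# Made-up profile line with unevenly spaced vertices (map units).
line = [(0.0, 0.0), (130.0, 0.0), (130.0, 45.0), (400.0, 45.0)]

# Case 1: original vertices only, so intervals follow the digitised shape.
print(intervals(line))  # [130.0, 45.0, 270.0]

# Case 2: densified at a 10-unit "cell size", so intervals are near 10.
dense = intervals(densify(line, 10.0))
print(round(min(dense), 3), round(max(dense), 3))  # 9.0 10.0
```

The 45-unit segment cannot be divided evenly into 10-unit steps, so its intervals come out at 9 units. That is a simplified analogue of the near-but-not-exact spacing described above, though the real tool's deviation comes from its own tolerance handling.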