Raster calculations are well suited to this problem, especially when you have travel time data.
The inputs are rasters of distance to the cities, one raster per city. If you're using Euclidean distance, compute a Euclidean distance grid for each city; otherwise, compute the "distance" however you can, perhaps with CostDistance or by rasterizing a network travel time calculation.
The area associated with a city consists of all points where that city's population, divided by the square of the distance, is larger than the comparable population/distance^2 values for all other cities. To prepare for this, use your distance grids to compute these population/distance^2 values: that is a simple "map algebra" operation. Now you're practically done: the Highest Position command creates a coded version of the result you want. Its argument is a list of your population/distance^2 grids; for the coded values, 1 is used for the first grid in the list, 2 for the second, and so on.
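For concreteness, here is a minimal sketch of that workflow in R with the raster package, where which.max plays the role of Highest Position. (The three cities, their coordinates, and their populations are invented for illustration.)

library(raster)
r <- raster(nrows=100, ncols=100, xmn=0, xmx=100, ymn=0, ymx=100)
cities <- cbind(x=c(20, 70, 50), y=c(30, 60, 80)) # Hypothetical locations
pop <- c(500, 200, 100)                           # Hypothetical populations
#
# One population/distance^2 grid per city: the "map algebra" step.
#
g <- stack(lapply(seq_len(nrow(cities)), function(i)
  pop[i] / distanceFromPoints(r, cities[i, , drop=FALSE])^2))
#
# The "Highest Position" step: code each cell by the index (1, 2, ...) of
# the grid attaining the cellwise maximum.
#
allocation <- which.max(g)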
Example
Comments indicate this may be a big problem: 2300 or so centers, perhaps covering a huge (country-wide?) area, thereby requiring a large grid. Don't try this with Spatial Analyst: as a general-purpose raster GIS, it's too slow.
Here is an example in R involving one-tenth as many centers on a 1000 by 1667 grid. It took ten seconds to compute. (That's still a little slow, but at least an order of magnitude faster than the same calculation would be in ArcGIS 10.x.)
[Figure: map of the gravity-based allocation for 230 centers, with the centers overlaid as black dots.]
The numbers on this map identify the centers, using whole values from 1 through 230. The black dots show the locations of those centers (with areas in proportion to their populations). Because this is based on Euclidean distance, the boundaries among "territories" are all portions of circular arcs.
The calculation proceeds by establishing two grids--one for the maximum "gravity" seen so far in each cell and another for the current "owner" of that cell--and computing a new grid for each center in turn. It identifies all cells where the new center's "gravity" exceeds the current maximum and updates the [gravity] grid with the new center's value and the [owner] grid with the identifier of the new owner.
The computation time scales linearly with the number of centers, so 2300 centers would require about two minutes of calculation. It also scales linearly with the number of rows and with the number of columns. A more detailed grid of, say, 10,000 rows by 16,667 columns would therefore require about 160 minutes, or nearly three hours.
To achieve these moderately quick speeds, the algorithm needs RAM to store three grids at once: [gravity], [owner], and the values for each center in turn (the last is re-used for each center). Assuming 16 bytes per value (double precision with some R overhead), the calculation on a 10,000 by 16,667 grid would require 8 GB, which I confirmed with a test. (That would cover the entire continental US at a 300 meter resolution.) At this resolution the calculation takes 5 seconds per center; extrapolating to 2300 centers still gives about three hours.
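These extrapolations are easy to check directly; here is a quick sanity check in R, using the timings and grid sizes quoted above:

#
# Time: 10 seconds for 230 centers on a 1000 by 1667 grid, scaling
# linearly in the number of centers, rows, and columns.
#
10 * (2300/230) * (10000*16667) / (1000*1667) / 3600 # Hours: about 2.8
#
# Memory: three double-precision grids at about 16 bytes per value.
#
3 * 10000 * 16667 * 16 / 2^30                        # GiB: about 7.5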
Be prepared for counter-intuitive results. (I'm not saying they will be wrong, but they might be surprising.) For instance, when just two cities are involved, the area associated with the smaller city will be a perfect disk, but it won't be centered on the city itself: the boundary, where pop1/d1^2 equals pop2/d2^2, is the locus where d1/d2 = sqrt(pop1/pop2), a circle of Apollonius. The disk will be surrounded by an (infinite) area associated with the larger city. That means that a plurality of people living sufficiently far away on the far side of the small city are predicted to travel to the large city--even when that involves passing right through the small city.
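You can see this with a quick two-city experiment (the coordinates and populations are made up; the "gravity" grids are computed exactly as in the code below):

#
# Two hypothetical cities on a horizontal line; the right one is four
# times as populous as the left one.
#
x.0 <- seq(0, 140, length.out=560)
y.0 <- seq(0, 60, length.out=240)
center <- rbind(c(50, 30), c(90, 30))
pop <- c(1, 4)
g1 <- pop[1] / outer((y.0 - center[1,2])^2, (x.0 - center[1,1])^2, "+")
g2 <- pop[2] / outer((y.0 - center[2,2])^2, (x.0 - center[2,1])^2, "+")
#
# The smaller city's territory plots as a disk whose center lies to its
# left, away from the larger city.
#
image(x.0, y.0, t(g1 >= g2) + 0, asp=1)
points(center, pch=19)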
Here is the R code used to produce the example. To use it in practice you would want to read the center coordinates (perhaps from a shapefile) and, afterwards, write the allocation grid to disk for further processing and better mapping; a minimal version of that last step is sketched after the code. Those straightforward procedures are illustrated (with working code) in other R-based threads here.
#
# Create population centers and their populations.
#
set.seed(17)
x.min <- 0; x.max <- 5000000 # Extent of easting coordinates
y.min <- 0; y.max <- 3000000 # Extent of northing coordinates
n <- 230 # Number of centers
center <- matrix(runif(n*2, min=c(x.min, y.min), max=c(x.max, y.max)),
                 ncol=2, nrow=n, byrow=TRUE)
pop <- rgamma(n, 5, 10^-2 / (4*n)) # Total population around 110M people
#
# Create row and column coordinate arrays.
#
n.rows <- 10^3
cellsize <- (y.max - y.min) / n.rows
n.cols <- ceiling((x.max - x.min) / cellsize)
x.max <- x.min + n.cols * cellsize # Assures square cells are used
y.0 <- seq(y.max-cellsize/2, y.min+cellsize/2, length.out=n.rows) # Cell-center northings (top row first)
x.0 <- seq(x.min+cellsize/2, x.max-cellsize/2, length.out=n.cols) # Cell-center eastings
#
# For each center, compute a "gravity" grid (based on population times inverse
# squared Euclidean distance) and accumulate the maxima.
#
system.time({
  #
  # It is slightly more efficient to process the larger
  # centers first.
  #
  i <- order(pop, decreasing=TRUE)
  pop <- pop[i]
  center <- center[i, , drop=FALSE]
  #
  # Initialize the output.
  #
  owner <- matrix(0, n.rows, n.cols)
  gravity.max <- matrix(0, n.rows, n.cols)
  #
  # Loop over the centers, updating the output.
  # (This is a "highest position" calculation.)
  #
  for (i in 1:n) {
    r <- pop[i] / outer((y.0 - center[i,2])^2, (x.0 - center[i,1])^2, "+")
    update <- which(r >= gravity.max)
    gravity.max[update] <- r[update]
    owner[update] <- i
  }
})
#
# Plot the results.
#
library(raster)
par(mfrow=c(1,1))
plot(raster(log(gravity.max), xmn=x.min, xmx=x.max, ymn=y.min, ymx=y.max),
     main="Log Gravity")
plot(raster(owner, xmn=x.min, xmx=x.max, ymn=y.min, ymx=y.max),
     main="Gravity-based allocation")
points(center, pch=19, cex=20*sqrt(pop/n/max(pop)))
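As noted above, in practice you would save the allocation grid for downstream processing. A minimal sketch, assuming a GeoTIFF output and a hypothetical filename:

#
# Persist the allocation grid (the filename is just an example).
#
writeRaster(raster(owner, xmn=x.min, xmx=x.max, ymn=y.min, ymx=y.max),
            "allocation.tif", overwrite=TRUE)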