In the Categorical Clustering problem, we are given a set of vectors (matrix) $A = \{a^1, \dots, a^n\}$ over $\Sigma^m$, where $\Sigma$ is a finite alphabet, and integers $k$ and $B$. The task is to partition $A$ into $k$ clusters such that the median objective of the clustering in the Hamming norm is at most $B$. That is, we seek a partition $\{I_1, \dots, I_k\}$ of $\{1, \dots, n\}$ and vectors $c^1, \dots, c^k \in \Sigma^m$ such that $\sum_{i=1}^{k} \sum_{j \in I_i} d_H(c^i, a^j) \le B$, where $d_H(a, b)$ is the Hamming distance between vectors $a$ and $b$. Fomin, Golovach, and Panolan [ICALP 2018] proved that the problem is fixed-parameter tractable for the binary case $\Sigma = \{0, 1\}$ by giving an algorithm that solves it in time $2^{O(B \log B)} \cdot (mn)^{O(1)}$. We extend this algorithmic result to a popular capacitated clustering model, in which the sizes of the clusters must additionally satisfy certain constraints. More precisely, in Capacitated Clustering we are also given two non-negative integers $p$ and $q$, and we seek a clustering with $p \le |I_i| \le q$ for all $i \in \{1, \dots, k\}$. Our main theorem is that Capacitated Clustering is solvable in time $2^{O(B \log B)} |\Sigma|^{B} \cdot (mn)^{O(1)}$. The theorem not only extends the previous algorithmic results to a significantly more general model, but also implies algorithms for several other variants of Categorical Clustering with constraints on cluster sizes.
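To make the objective concrete, the following minimal sketch evaluates the median cost of a given partition and checks the capacity bounds. It is not the fixed-parameter algorithm of the theorem; it only verifies a candidate solution. It relies on the standard fact that, for a fixed cluster, a center minimizing the sum of Hamming distances can be chosen column by column as a most frequent symbol. The function names (`hamming`, `best_center`, `clustering_cost`, `is_feasible`), the 0-based indices, and the requirement that the partition has exactly k parts are assumptions made for illustration.

```python
from collections import Counter

def hamming(a, b):
    """Hamming distance d_H(a, b) between two equal-length vectors."""
    return sum(x != y for x, y in zip(a, b))

def best_center(vectors):
    """Column-wise most frequent symbol: an optimal center for one cluster."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*vectors))

def clustering_cost(A, partition):
    """Sum over clusters of the distances from members to an optimal center."""
    total = 0
    for cluster in partition:
        members = [A[j] for j in cluster]
        c = best_center(members)
        total += sum(hamming(c, a) for a in members)
    return total

def is_feasible(A, partition, k, B, p, q):
    """Check that `partition` is a valid capacitated clustering of cost <= B."""
    covered = sorted(j for cluster in partition for j in cluster)
    if len(partition) != k or covered != list(range(len(A))):
        return False  # not a partition of {0, ..., n-1} into k parts
    if any(not (p <= len(cluster) <= q) for cluster in partition):
        return False  # some cluster violates p <= |I_i| <= q
    return clustering_cost(A, partition) <= B

# Illustrative instance over Sigma = {0, 1} with m = 4 and n = 5.
A = ["0011", "0010", "1100", "1101", "1111"]
partition = [[0, 1], [2, 3, 4]]
print(clustering_cost(A, partition))                    # 3
print(is_feasible(A, partition, k=2, B=3, p=2, q=3))    # True
```

In this instance the first cluster costs 1 and the second costs 2 (with center 1101), so the partition is a yes-certificate for B = 3 but not for B = 2, and it becomes infeasible if the upper capacity is lowered to q = 2.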