It is difficult to overstate the extent to which ANOVA-style linear modeling of group differences dominates the cognitive sciences. A recent methodological review of the psychology literature suggests that these analyses are used to test hypotheses in as many as 95% of studies (citation pending; I still need to track it down). There are alternatives in working memory research, but they often have more limited empirical support because of the dominance of the aforementioned analyses.
Hierarchical models
One alternative that doesn't veer too far from established practice is to use a hierarchical modeling approach, as you suggest. For example, Lee and Webb (2005) proposed a Bayesian hierarchical approach to modeling working memory that accounts for both individual differences and aggregated group data. The main benefit is that it exploits both individual- and group-level variability, but we ultimately still end up with a linear model. In their words:
Many evaluations of cognitive models rely on data that have been averaged or aggregated across all experimental subjects, and so fail to consider the possibility of important individual differences between subjects. Other evaluations are done at the single-subject level, and so fail to benefit from the reduction of noise that data averaging or aggregation potentially provides. To overcome these weaknesses, we have developed a general approach to modeling individual differences using families of cognitive models in which different groups of subjects are identified as having different psychological behavior. Separate models with separate parameterizations are applied to each group of subjects, and Bayesian model selection is used to determine the appropriate number of groups.
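To make the "Bayesian model selection over the number of groups" idea concrete, here is a minimal sketch. It is not Lee and Webb's actual model. It assumes each subject contributes a (correct, trials) count, that subjects in a group share one accuracy rate with a Beta(1, 1) prior, and that only splits of the accuracy-sorted subjects into one or two groups are considered, all equally likely a priori. The function names and the data are hypothetical.

```python
from math import comb, lgamma, log


def log_beta(a, b):
    # Log of the Beta function, computed via log-gamma for numerical stability.
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def group_log_evidence(group, a=1.0, b=1.0):
    # Log marginal likelihood of (correct, trials) pairs that share one
    # binomial rate with a Beta(a, b) prior (assumed prior, not from the paper).
    correct = sum(s for s, _ in group)
    trials = sum(t for _, t in group)
    log_coef = sum(log(comb(t, s)) for s, t in group)
    return log_coef + log_beta(a + correct, b + trials - correct) - log_beta(a, b)


def partition_log_evidence(groups):
    # Groups have independent rates, so their log evidences add.
    return sum(group_log_evidence(g) for g in groups)


def best_split(subjects):
    # Candidates: everyone in one group, or a two-group split at any point
    # of the accuracy-sorted subject list. Returns the highest-evidence one.
    ordered = sorted(subjects, key=lambda st: st[0] / st[1])
    candidates = [[ordered]]
    candidates += [[ordered[:i], ordered[i:]] for i in range(1, len(ordered))]
    return max(candidates, key=partition_log_evidence)


# Hypothetical data: (correct, trials) for six subjects.
subjects = [(18, 20), (17, 20), (19, 20), (8, 20), (7, 20), (9, 20)]
best = best_split(subjects)
print(len(best), "group(s):", best)
```

Because the marginal likelihood integrates over each group's rate, adding a group is penalized automatically unless the data really call for it, which is the sense in which the number of groups is "selected" rather than fixed in advance.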
Topology
An emerging, but currently more niche, approach is to use topology. Topology is making some exciting inroads in the psychological methodology literature (e.g., Butner et al., 2014), but so far it has not seen substantial use in the empirical WM literature. However, a few early birds have attempted to capture WM within a topological framework, notably Glassman's work on a "theory of relativity" for working memory (Glassman, 1999; Glassman, 2003).
In the following excerpt, he explains from a topological perspective why human verbal memory capacity is 3-4 items:
Thus, a moment’s WM is hypothesized here to reside in myriad activated cortical planar “patches,” each subdivided into up to four amoeboid “subpatches.” Two different lines of topological reasoning suggest orderly associations of such representations. (1) The four-color principle of map topology, and the related K4 is planar theorem of graph theory, imply that if a small cortical area is dynamically subdivided into no more than four, discretely bounded planar subareas, then each such segment has ample free access to each of the others. (2) A hypothetical alternative to such associative adjacency of simultaneously active cortical representations of chunk-attributes is associative overlap, whereby, in dense cortical neuropil, activated subpatches behave like Venn diagrams of intersecting sets. As the number of Venn-like coactive subpatches within a patch increases, maintaining ad hoc associativity among all combinations requires exponentially proliferating intersections. Beyond four, serpentine subpatch shapes are required, which could easily lead to pathologies of omission or commission.
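The two counting arguments in the excerpt are easy to check numerically. The sketch below is my own illustration, not anything from Glassman's papers. It uses the standard edge bound for simple planar graphs (E <= 3V - 6 for V >= 3), which is a necessary condition only: K5 failing it proves K5 is non-planar, whereas K4 passing it does not by itself prove planarity (K4 is planar, as a triangle with a fourth vertex in the middle shows). It also counts the 2^n - 1 non-empty combinations that n overlapping sets must be able to represent. The function names are hypothetical.

```python
from itertools import combinations


def complete_graph_edges(n):
    # Every pair of n items is directly connected ("each has access to each").
    return list(combinations(range(n), 2))


def passes_planar_edge_bound(num_vertices, num_edges):
    # Necessary (not sufficient) condition for a simple planar graph
    # with at least 3 vertices: E <= 3V - 6.
    return num_edges <= 3 * num_vertices - 6


def overlap_combinations(n):
    # Number of non-empty subsets of n overlapping sets, i.e. 2**n - 1,
    # counted explicitly to make the exponential growth visible.
    return sum(1 for r in range(1, n + 1) for _ in combinations(range(n), r))


for n in range(3, 7):
    edges = len(complete_graph_edges(n))
    print(
        f"n={n}: pairwise links={edges}, "
        f"passes planar bound={passes_planar_edge_bound(n, edges)}, "
        f"overlap combinations={overlap_combinations(n)}"
    )
```

Four is the largest number of items for which full pairwise adjacency is still compatible with a flat layout, and the overlap count roughly doubles with each added item, which is the "exponentially proliferating intersections" in the quote.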
Others
I am tangentially aware of related attempts to develop a theory of (working) memory as a form of foraging behavior, but I don't know that work well enough to summarize it. Hopefully, the link is instructive.
References
- Butner, J. E., Gagnon, K. T., Geuss, M. N., Lessard, D. A., & Story, T. N. (2014). Utilizing topology to generate and test theories of change. Psychological Methods.
- Lee, M. D., & Webb, M. R. (2005). Modeling individual differences in cognition. Psychonomic Bulletin & Review, 12(4), 605–621. doi:10.3758/BF03196751
- Glassman, R. B. (1999). A working memory “theory of relativity”: elasticity in temporal, spatial, and modality dimensions conserves item capacity in radial maze, verbal tasks, and other cognition. Brain Research Bulletin, 48(5), 475–489. doi:10.1016/S0361-9230(99)00026-X
- Glassman, R. B. (2003). Topology and graph theory applied to cortical anatomy may help explain working memory capacity for three or four simultaneous items. Brain Research Bulletin, 60(1-2), 25–42. doi:10.1016/S0361-9230(03)00030-3