“The four-bar phrase has had a bad press in our time,” writes Charles Rosen.1 But for all the denigration, four-bar phrases are ubiquitous.
Why? In themselves, individual bars of music aren’t that revealing. Identifying larger groupings doesn’t necessarily help. So instead of bar counts, let’s first think of music in terms of its relationships — its hierarchical repetitions and shapes.
There are two separate tasks here:
- Identifying a sound or a gesture
- Multiplying that sound to shape a larger structure.
Together these elements explain why musical phrases are often multiples of 2 bars long and how that construction isn’t necessarily boring.
Identifying a Sound or Gesture
Per James Tenney, sounds have three key dimensions: structure, shape, state. Structure refers to a sound’s internal relationships; shape refers to the trajectory of these relationships; and state refers to the sound’s overall impression (its statistical properties).
Sounds are also hierarchical. Like a Russian nesting doll, they can be subdivided into smaller, constituent sounds or concatenated into larger agglomerations (until you hit perceptual limits in either direction).
Let’s make some sense of these propositions using Bach. We will call this fugue subject our sound:
- Structurally, it is composed of smaller sounds called “motives” (each with its own structure, shape, and state) and is thus hierarchical. Most of its motives share the rhythm 16-16-8-8-8. Bach truncates the third repetition of this rhythm and contrasts it with 16-16-4-16-16-16.
- Its main shape is the descending voice leading G4-F4-Eb4.
- Statistically, one of the strongest impressions the sound gives is C5, which accounts for 37.5% of the pitches. This emphasis on C5, along with the repetition of the initial rhythmic motive, strengthens the contrast of the syncopation and lower register in the last half bar.
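The rhythmic shorthand in the list above can be made concrete with a little arithmetic. The following is a minimal sketch, not part of the original analysis: it assumes each number names a note value (16 = sixteenth, 8 = eighth, 4 = quarter), and the function name `motive_length` is hypothetical.

```python
from fractions import Fraction

def motive_length(note_values):
    """Total duration of a motive, as a fraction of a whole note.

    Each entry is a note-value denominator (assumed reading of the
    shorthand): 16 = sixteenth, 8 = eighth, 4 = quarter.
    """
    return sum(Fraction(1, value) for value in note_values)

opening_motive = [16, 16, 8, 8, 8]        # rhythm quoted in the text
closing_motive = [16, 16, 4, 16, 16, 16]  # contrasting rhythm quoted in the text

print(motive_length(opening_motive))  # 1/2
print(motive_length(closing_motive))  # 9/16
```

If a bar lasts one whole note (4/4, which is an assumption here, since the text does not state the time signature), the recurring motive fills exactly half a bar, and the contrasting figure runs one sixteenth longer.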
Over the course of the fugue, Bach repeats this sound (the fugue subject) at a variety of pitch levels and in various other transformations. The resulting structures and shapes compose successively larger sounds that give different impressions (i.e., convey different states) depending on which level of the hierarchy the listener focuses on.2
We could have started our analysis on any of these hierarchical levels. And just as we can analyze in these terms, we can also compose in them.
Multiplying that Sound
Whatever the sound or gesture I juxtapose with the one I initially created, some aspect of the first repeats in the second; some connection between them will be linear; and some other connection will be non-linear. As these juxtapositions multiply, eventually the constant things move and the moving things stay constant for a while.3
This process happens at various time scales. You can look at the connection between two chords, but you can also look at the connection between two phrases. The critical question is “What makes me equate these two sounds as part of the same timescale? And in this grouping, which of these sounds is stressed and which is unstressed?”
Twos and Threes
In poetry, the grouping of stressed and unstressed sounds is called meter. These patterns of stressed and unstressed sounds fall into groupings of twos and threes at various levels.
Music also composes sounds into nested stress patterns. A whole litany of reasons may explain why a sound is stressed or unstressed, but it’s critical to note that, unlike in poetry, the traditional conception of meter in music (and its notational aid, barlines) doesn’t actually group sounds but pulses.
Barlines don’t tell you anything about the music itself but about the pulse groupings that underlie it. I would speculate that the meter of pulse is an emergent property, whereas the meter of repetition is primal:4 The grouping of the sounds may align with or obscure the pulse groupings, but the grouping of the sounds produces that sense of pulse groupings.5
But Need It Be Boring?
Thus, music often falls into four- and eight-bar phrases because (1) stress patterns consist of twos and threes, (2) bars delineate stress patterns of pulse, (3) sounds are traditionally organized along these meters, and thus (4) higher-level groupings of twos or threes (e.g., four-bar phrases) are the inevitable result.
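The arithmetic behind this chain of reasoning can be sketched in a few lines. This is only an illustration under stated assumptions: the bar is taken as the lowest level being counted, every level above it groups its units in twos or threes, and the function name `phrase_lengths` is hypothetical.

```python
from itertools import product
from math import prod

def phrase_lengths(levels):
    """Map each way of stacking `levels` groupings of 2 or 3 to its length in bars."""
    return {combo: prod(combo) for combo in product((2, 3), repeat=levels)}

# Two levels of grouping above the bar: pairs (or triples) of pairs (or triples).
for combo, bars in phrase_lengths(2).items():
    print(combo, "->", bars, "bars")
```

Two nested levels of duple grouping give the four-bar phrase, and a third duple level gives the eight-bar phrase; mixing in a triple grouping produces the six- and nine-bar alternatives.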
That said, predictability is not inevitable, because the relationship between sounds6 can, and should, be varied. For instance, in the Mozart example at the start of this post, Mozart creates a textural instability and cadential ambiguity that subverts the predictability of his four-bar phrases and leads to the cadence in bar 12.
In other words, Mozart does not slavishly repeat the rhetorical function of each phrase. The three phrases exist within a larger hierarchical shape that subsumes their repeated duration to a larger direction.
Thus, while stress patterns will always come in groups of twos or threes, a composer who considers their larger groupings need not doom four-bar phrases (or whatever duration is being repeated) to predictability or directionlessness.
- The Romantic Generation, 258. Rosen gives a fascinating exploration of four-bar phrases in his book, but I want to articulate a different approach. ↩
- Tenney calls these various hierarchical levels “temporal gestalts” (TGs), building on earlier gestalt theory. ↩
- This is counterpoint 101: When you connect two chords, you almost always have a common tone, as well as a couple of voices moving linearly and one moving by leap. ↩
- The former doesn’t feel emergent in much Western music because it’s assumed: musicians begin with a background of metered pulses and arrange their sounds in relation to it. ↩
- There are a fair number of twentieth-century pieces in which meter is a notated performance aid rather than an audible phenomenon. ↩
- Which, as we’ve seen, includes four-bar phrases. ↩