Science of
Equal Temperament
Equal temperament is a musical temperament, or a system of tuning in which every pair of adjacent notes has an identical frequency ratio. In equal temperament tunings an interval — usually the octave — is divided into a series of equal steps (equal frequency ratios). For modern Western music, the most common tuning system is twelve-tone equal temperament.
Equal Temperament Scale
The equal tempered scale is the common musical scale used at present, used for the tuning of pianos and other instruments of relatively fixed scale. It divides the octave into 12 equal semitones. It is common practice to state musical intervals in cents, where 100¢ is defined as one equal tempered semitone. The cents notation provides a useful way to compare intervals in different temperaments and to decide whether those differences are musically significant. A useful parameter for comparison is the just noticeable difference in pitch which corresponds to about 5¢.
One of the advantages of the equal tempered scale is that it is the same in any musical "key", so that compositions may be freely transposed up or down without changing the musical intervals. This is such a major advantage that it has made equal temperament the standard temperament in western music for the past 200 years. The equal tempered intervals may be compared with Just and Pythagorean temperaments which maintain the exact-integer-ratio rule for the main intervals, while equal temperament departs from that standard. The piano keyboard is the standard example of the equal tempered scale, and the frets on a modern guitar are also placed to fix the instrument into the equal tempered scale.
Cents
Musical intervals are often expressed in cents, a unit of pitch based upon the equal tempered octave such that one equal tempered semitone is equal to 100 cents. An octave is then 1200¢ and the other equal tempered intervals can be obtained by adding semitones.
Just Temperament
Just temperament refers to a musical scale or musical intervals which maintain exact integer ratios between pitches.
For example, the ratio 3:2 is said to be a "just" musical fifth and is sometimes called a "perfect fifth".
Pythagorean temperament maintains just intervals for the fifth and fourth but departs for some other intervals. Equal temperament does not contain any just intervals except the octave itself. However, if five cents is taken as the just noticeable difference in pitch, the fifths and fourths of equal temperament are just within that 5¢ margin.
Unfortunately, Just Temperament produces 2 different sizes of whole steps, and consequently 2 sets of intervals for minor triads. Just Temperament offers perfect agreement with the Harmonic Series only in the tones of the I, IV and V triads and only in the Major key of the I chord.
In fact, none of the above temperaments offers consistent intonation for all keys. The only way to make intonation consistent is to make all of the half step intervals consistent - all the same size. This leads us to equal temperament.
Just Noticeable Difference in Pitch
The just noticeable difference in pitch must be expressed as a ratio or musical interval since the human ear tends to respond equally to equal ratios of frequencies. It is convenient to express the just noticeable difference in cents since that notation was developed to express musical intervals. Although research reveals variations, a reasonable estimate of the JND is about five cents. One of the advantages of the cents notation is that it expresses the same musical interval, regardless of the frequency range.
A 5¢ difference: "a nickel's worth of difference".
Advantages of Cents Notation
Examining the semitone A to B-flat at different points in the range of the piano will illustrate the fact that if expressed in cents, every equal tempered semitone is the same. Expressed in Hz difference, every semitone is different. The interval value in cents expresses the ratio of the frequencies, which is the same for every equal tempered semitone.

Included with the semitone intervals above is an evaluation of the deviation in Hz needed to equal 5¢, the nominal just-noticeable difference for these pitches. Note that the range represented by 5¢ increases from less than a tenth of a Hz at the low end of the piano to about 10 Hz at the top end of the piano.
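As a rough illustration of this point, the sketch below computes the width of one equal tempered semitone and of a 5¢ step in Hz at three A's on the piano. It assumes the A-440 standard (A0 = 27.5 Hz, A7 = 3520 Hz); the helper name `hz_for_cents` is made up for this example.

```python
def hz_for_cents(freq_hz, cents):
    """Return the change in Hz produced by raising freq_hz by the given number of cents."""
    return freq_hz * (2 ** (cents / 1200) - 1)

# Same interval in cents, very different spans in Hz.
for name, freq in [("A0", 27.5), ("A4", 440.0), ("A7", 3520.0)]:
    semitone_hz = hz_for_cents(freq, 100)  # A up to B-flat
    jnd_hz = hz_for_cents(freq, 5)         # nominal just noticeable difference
    print(f"{name}: semitone = {semitone_hz:.2f} Hz, 5 cents = {jnd_hz:.3f} Hz")
```

The 5¢ span comes out below a tenth of a Hz at A0 and at roughly 10 Hz at A7, in line with the figures quoted above.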
Calculating Cents
The fact that one octave is equal to 1200 cents leads one to the power of 2 relationship:

This is convenient for calculating the frequency corresponding to a certain number of cents. To calculate the number of cents for any two frequencies, the above relationship must be reversed. Taking the log of both sides gives:

Pitch Details Related to Cents
Evaluating the just noticeable difference in pitch by the "nickel's worth" rule is convenient, but as you might expect it is an oversimplification. Rossing describes measurements of pitch discrimination with pure tones at about 80 dB for frequencies between 1 and 4 kHz. The jnd is found to be about 0.5% of the pure tone frequency, which corresponds to about 8¢. He also states that the jnd has been found to depend upon the frequency, the sound level, the duration of the tone, the suddenness of the frequency change, the musical training of the listener, and the method of measurement. The real world can rarely be accurately characterized by simple rules.
Rossing also reports that the critical bandwidth (related to loudness perception) is about 30 times the jnd for pitch, suggesting that both are related to the regions of excitation along the basilar membrane of the inner ear.
Harmonic Intervals in Cents
Expressing the musical intervals between successive harmonics in cents notation helps to show the relationship between the harmonics and the equal tempered musical scale. The equal tempered intervals are all in multiples of 100 cents, so the departure from such a multiple is an indication of how much the just interval is out of tune with equal temperament. If any musical source produces a fundamental and a series of exact harmonics, then it is evident that the upper harmonics will be out of tune with the corresponding equal tempered notes. Examination of the illustration will make it evident that the 7th harmonic shows the most severe departure from any equal tempered interval. This is important in the design of brass instruments, since they use the upper resonances of the instrument as played notes in a harmonic sequence. Fortunately, the upper resonances of brass instruments can be tuned closer to equal temperament in the manufacturing process so that the problem is not so severe, but the seventh resonance is still troublesome.
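A short sketch can make the size of these departures concrete. It measures each of the first eight harmonics against the fundamental in cents and reports the distance to the nearest multiple of 100¢. Limiting the range to eight harmonics and the name `harmonic_deviation` are choices made for this example.

```python
import math

def harmonic_deviation(n):
    """Cents by which harmonic n differs from the nearest equal tempered note."""
    cents_above_fundamental = 1200 * math.log2(n)
    nearest_tempered = 100 * round(cents_above_fundamental / 100)
    return cents_above_fundamental - nearest_tempered

for n in range(1, 9):
    print(f"harmonic {n}: {1200 * math.log2(n):8.2f} cents, "
          f"deviation {harmonic_deviation(n):+6.2f} cents")
```

Within this range the 7th harmonic is the outlier, about 31¢ flat of the nearest equal tempered note, while the 3rd and 6th are within 2¢.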
Major and Minor Triads
The major and minor triads are used widely in western music. In terms of just musical intervals, these triads can be expressed as whole number ratios of frequencies. The actual frequencies vary slightly in other temperaments, but the labeling of the intervals is the same.

Note that the 4th, 5th and 6th harmonics of a harmonic sequence form a major triad: The pitch ratio 4:5:6 is the triad. The minor triad does not fit so neatly in the harmonic sequence, the smallest whole number ratios that produce a minor triad being 10:12:15.
Major Triad Differences
The differences between Just, Pythagorean, and Equal Tempered scales become evident in the tuning of a triad. The fifth is essentially the same for the three temperaments, but the major third differences are significant.
The accompanying figure shows the frequencies for the major triad A-C#-E. The three different major thirds are labeled in cents, and the differences can be seen to be larger than the just noticeable difference of 5¢.
The difference between the Just and Pythagorean major thirds is 22¢, which is called the "syntonic comma". Nine commas make approximately a whole tone.
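The three major thirds and the comma between them can be checked numerically. This is a minimal sketch using the ratios 5:4 (just) and 81:64 (Pythagorean) together with four equal tempered semitones; the variable names are illustrative.

```python
import math
from fractions import Fraction

def cents(ratio):
    """Size of a frequency ratio in cents."""
    return 1200 * math.log2(float(ratio))

just_third = Fraction(5, 4)
pythagorean_third = Fraction(81, 64)
equal_third = 2 ** (4 / 12)  # four equal tempered semitones

# The syntonic comma is the gap between the Pythagorean and just thirds.
syntonic_comma = pythagorean_third / just_third

print(f"just third:        {cents(just_third):.2f} cents")
print(f"equal tempered:    {cents(equal_third):.2f} cents")
print(f"Pythagorean third: {cents(pythagorean_third):.2f} cents")
print(f"syntonic comma:    {syntonic_comma} = {cents(syntonic_comma):.2f} cents")
```

The comma works out to 81:80, about 21.5¢, which rounds to the 22¢ quoted above.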
The Equal Tempered Octave


The middle octave on the piano is shown as a standard example of equal temperament. Since a musical interval is defined by a ratio, the division of an octave into 12 equal intervals (equal tempered semitones) involves finding the ratio by which you multiply the starting frequency f twelve times to get a frequency 2f. If this ratio is represented by a, then

Proceeding up the equal tempered scale, each note is about 6% higher than the previous note.
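The semitone ratio and the "about 6%" figure can be verified directly. The sketch below multiplies a starting frequency by the ratio twelve times; the starting value of 220 Hz is an arbitrary choice for illustration.

```python
# Ratio a such that twelve equal steps make one octave: a**12 == 2.
SEMITONE = 2 ** (1 / 12)

print(f"semitone ratio a = {SEMITONE:.5f}")
print(f"each note is {100 * (SEMITONE - 1):.2f}% higher than the previous one")

# Climb twelve semitones from an arbitrary starting frequency f.
start_hz = 220.0
freq = start_hz
for _ in range(12):
    freq *= SEMITONE
octave_above = freq
print(f"{start_hz} Hz after twelve semitones: {octave_above:.4f} Hz")
```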
Ideas associated with the equal tempered octave
The equal tempered scale is the common musical scale used at present, used for the tuning of pianos and other instruments of relatively fixed scale. The middle octave of the piano is illustrated below and used to point to some of the features of equal tempered intervals compared to Just and Pythagorean intervals.

Equal temperament divides the octave into 12 equal semitones. It is common practice to state musical intervals in cents, where 100¢ is defined as one equal tempered semitone. The cents notation provides a useful way to compare intervals in different temperaments and to decide whether those differences are musically significant. A useful parameter for comparison is the just noticeable difference (JND) in pitch which corresponds to about 5¢.
If the 5¢ JND in pitch is used as the criterion for acceptable tuning accuracy, then it can be seen that Equal Temperament holds the fifth, the fourth and the whole tone to acceptable accuracy. The breaking of the whole tone into semitones differs significantly from Pythagorean temperament as shown above. The equal tempered major and minor thirds differ significantly from the small whole number ratios of just temperament, and significantly affect the tuning of the major triad.
Musical Clefs

Semitones Compared

The differences shown between semitones in Pythagorean and Equal Temperament are representative of the classic difficulties encountered in the building up of musical scales. Pythagorean temperament has two different sized semitones which lead to differences in chromatic notes like A-flat and G-sharp. This means that you cannot freely change keys for a piece of music written for Pythagorean temperament. In equal temperament, A-flat and G-sharp are the same black key on the piano, so that you can freely transpose between keys. On the other hand, that convenience comes at the price of deviations from the small-whole-number ratio rule for consonance.
Comparison of Musical Intervals

The use of cents notation facilitates the comparison of musical intervals in the different systems of temperament. The small-integer-ratio rule for musical consonance is embodied in the just intervals above. The use of 5 cents as the just noticeable difference in pitch helps in the assessment of the magnitudes of the deviations. Using just temperament as the reference, it will be noted that major and minor thirds in equal temperament are the most out of tune with the just intervals. Another useful type of comparison of temperaments is that encountered in the tuning of a major triad.
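The comparison described here can be reproduced with a short calculation. This sketch takes the standard just ratios for five intervals, pairs each with its number of equal tempered semitones, and reports how far the tempered interval sits from the just one. The dictionary layout and function name are assumptions made for the example.

```python
import math
from fractions import Fraction

# interval name -> (just ratio, number of equal tempered semitones)
JUST_INTERVALS = {
    "minor third": (Fraction(6, 5), 3),
    "major third": (Fraction(5, 4), 4),
    "fourth": (Fraction(4, 3), 5),
    "fifth": (Fraction(3, 2), 7),
    "octave": (Fraction(2, 1), 12),
}

def deviation_from_just(name):
    """Equal tempered size minus just size, in cents."""
    ratio, semitones = JUST_INTERVALS[name]
    return 100 * semitones - 1200 * math.log2(float(ratio))

for name in JUST_INTERVALS:
    print(f"{name:12s} {deviation_from_just(name):+7.2f} cents")
```

The fourth and fifth land within 2¢ of just, while the major and minor thirds are off by roughly 14¢ and 16¢, well beyond the 5¢ criterion.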
Pitch and Harmonics
Pitch
Sounds may be generally characterized by pitch, loudness, and quality. The perceived pitch of a sound is just the ear's response to frequency, i.e., for most practical purposes the pitch is just the frequency. For example, middle C in equal temperament has a frequency of 261.6 Hz. The pitch perception of the human ear is understood to operate basically by the place theory, with some sharpening mechanism necessary to explain the remarkably high resolution of human pitch perception.
The place theory and its refinements provide plausible models for the perception of the relative pitch of two tones, but do not explain the phenomenon of perfect pitch.
The just noticeable difference in pitch is conveniently expressed in cents, and the standard figure for the human ear is 5 cents.
The A 440 Pitch Standard
In 1939, an International Conference met in London and unanimously adopted 440 Hz as the standard frequency for the pitch A4, and that is the almost universal standard at present. The National Institute of Standards and Technology (NIST) broadcasts a precise 440 Hz reference tone on its short wave radio station WWV.
Rossing reports Handel's tuning fork frequency to be 422.5 Hz for that A, and the eras of Haydn, Mozart, Bach and Beethoven had pitch standards around that frequency. This means that their compositions are now played about 70 cents sharper than the originals.
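The 70 cent figure follows from the cents relationship applied to the two pitch standards. A minimal check, using only the two frequencies quoted above:

```python
import math

old_a_hz = 422.5     # Handel's tuning fork, as reported by Rossing
modern_a_hz = 440.0  # standard adopted in 1939

shift_cents = 1200 * math.log2(modern_a_hz / old_a_hz)
print(f"A = 422.5 Hz to A = 440 Hz is a rise of {shift_cents:.1f} cents")
```

That is a little under three quarters of an equal tempered semitone.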
Note Frequency

This table is based upon the A-440 Hz frequency standard. The note letters have a number appended which is their octave number beginning at the bottom octave on the piano. The offset frequencies are the chromatic notes (sharps and flats).
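Entries of such a table can be regenerated from the A-440 standard and the equal tempered semitone ratio. The sketch below assumes the common convention in which middle C is C4 and each octave number starts at C; the names `NOTE_OFFSETS` and `note_frequency` are hypothetical, and only sharp spellings are included for brevity.

```python
# Semitone offset of each note name above the C of the same octave.
NOTE_OFFSETS = {"C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
                "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11}

def note_frequency(name, octave, a4_hz=440.0):
    """Equal tempered frequency of a note, counted in semitones from A4."""
    semitones_from_a4 = NOTE_OFFSETS[name] - NOTE_OFFSETS["A"] + 12 * (octave - 4)
    return a4_hz * 2 ** (semitones_from_a4 / 12)

for name, octave in [("A", 0), ("C", 4), ("A", 4), ("C", 8)]:
    print(f"{name}{octave}: {note_frequency(name, octave):.2f} Hz")
```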
Details About Pitch
Although for most practical purposes, the pitch of a sound can be said to be simply a measure of its frequency, there are circumstances in which a constant frequency sound can be perceived to be changing in pitch.
One of the most consistently observed "psychoacoustic" effects is that a sustained high frequency sound (>2kHz) which is increased steadily in intensity will be perceived to be rising in pitch, whereas a low frequency sound (<2kHz) will be perceived to be dropping in pitch.
The perception of the pitch of short pulses differs from that of sustained sounds of the same measured frequency. If a short pulse of a pure tone is decaying in amplitude, it will be perceived to be higher in pitch than an identical pulse which has steady amplitude. Interfering tones or noise can cause an apparent pitch shift.
Further discussion of these and other perceptual aspects of pitch may be found in Chapter 7 of Rossing, The Science of Sound, 2nd. Ed.
Effect of Loudness Changes on Perceived Pitch
A high pitch (>2kHz) will be perceived to be getting higher if its loudness is increased, whereas a low pitch (<2kHz) will be perceived to be going lower with increased loudness. Sometimes called "Stevens's rule" after an early investigator, this psychoacoustic effect has been extensively investigated.
With an increase of sound intensity from 60 to 90 decibels, Terhardt found that the pitch of a 6kHz pure tone was perceived to rise over 30 cents. A 200 Hz tone was found to drop about 20 cents in perceived pitch over the same intensity change.
Studies with the sounds of musical instruments show less perceived pitch change with increasing intensity. Rossing reports a perceived pitch change of around 17 cents for a change from 65 dB to 95 dB. This perceived change can be upward or downward, depending upon which harmonics are predominant. For example, if the majority of the intensity comes from harmonics which are above 2 kHz, the perceived pitch shift will be upward.
Perfect Pitch
"Perfect pitch" or "absolute pitch" refers to the ability of some persons to recognize the pitch of a musical note without any discernable pitch standard, as if the person can recognize a pitch like the eye discerns the color of an object. Most persons apparently have only a sense of relative pitch and can recognize a musical interval, but not an isolated pitch.
Rossing suggests that less than 0.01% of the population appear to be able to recognize absolute pitches, whereas over 98% of the population can do the corresponding visual task of recognizing colors with no color standard present.
Pitch Resolution
The extremely small size of the cochlea and the extremely high resolution of human pitch perception cast doubt on the sufficiency of the place theory to completely account for the human ear's pitch resolution. Some typical data:
This would require a separate detectable pitch for every 0.002 cm, which is physically unreasonable for a simple peaking action on the membrane.
The normal human ear can detect the difference between 440 Hz and 441 Hz. It is hard to believe it could attain such resolution from selective peaking of the membrane vibrations. Some pitch sharpening mechanism must be operating.
Sharpening of Pitch Perception
The high pitch resolution of the ear suggests that only about a dozen hair cells, or about three tiers from the four banks of cells are associated with each distinguishable pitch. It is hard to conceive of a mechanical resonance of the basilar membrane that sharp. So we look for enhancements of the basic place theory of pitch perception.
There must be some mechanism which sharpens the response curve of the organ of Corti, as suggested schematically in the diagram. Several such mechanisms have been suggested.
Mechanisms for Sharpening
Since it seems unlikely that the basic place theory for pitch perception can explain the extraordinary pitch resolution of the human ear, some sharpening mechanism must be operating. Several of the proposed mechanisms have the nature of lateral inhibition on the basilar membrane. One way to sharpen the pitch perception would be to bring the peak of the excitation pattern on the basilar membrane into greater relief by inhibiting the firing of those hair cells which are adjacent to the peak. Since nerve cells obey an "all-or-none" law, discharging when receiving the appropriate stimulus and then drawing energy from the metabolism to recharge before firing again, one form the lateral inhibition could take is the inhibition of the recharging process, since the cells at the peak of the response will be drawing energy from the surrounding fluid most rapidly. Inhibition of the lateral hair cells could also occur at the ganglia, with some kind of inhibitory gating which lets through only those pulses from the cells which are firing most rapidly. It is known that there are feedback signals from the brain to the hair cells, so the inhibition could occur by that means.
Harmonics
An ideal vibrating string will vibrate with its fundamental frequency and all harmonics of that frequency. The position of nodes and antinodes is just the opposite of those for an open air column. The fundamental frequency can be calculated from
$$f_1 = \frac{1}{2L}\sqrt{\frac{T}{m/L}}$$
where $T$ is the string tension, $m$ is the string mass, and $L$ is the string length,
and the harmonics are integer multiples, $f_n = n f_1$.
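As a worked illustration, the sketch below evaluates the fundamental of an ideal string and its first few harmonics. It assumes the standard ideal-string relation, in which the fundamental equals the wave speed sqrt(T / (m/L)) divided by twice the length L; the function names and the numerical values are hypothetical and chosen only to give round numbers.

```python
import math

def string_fundamental(length_m, tension_n, mass_kg):
    """Fundamental frequency (Hz) of an ideal string fixed at both ends."""
    mass_per_length = mass_kg / length_m                 # kg/m
    wave_speed = math.sqrt(tension_n / mass_per_length)  # m/s
    return wave_speed / (2 * length_m)

def string_harmonics(fundamental_hz, count):
    """First `count` harmonics: integer multiples of the fundamental."""
    return [n * fundamental_hz for n in range(1, count + 1)]

# Illustrative values only: a 1 m string of mass 10 g under 100 N of tension.
f1 = string_fundamental(length_m=1.0, tension_n=100.0, mass_kg=0.01)
print(f"fundamental: {f1:.1f} Hz")
print("harmonics:", [round(f, 1) for f in string_harmonics(f1, 5)])
```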
Overtones and Harmonics
The term harmonic has a precise meaning - that of an integer (whole number) multiple of the fundamental frequency of a vibrating object. The term overtone is used to refer to any resonant frequency above the fundamental frequency - an overtone may or may not be a harmonic. Many of the instruments of the orchestra, those utilizing strings or air columns, produce the fundamental frequency and harmonics. Their overtones can be said to be harmonic. Other sound sources such as the membranes or other percussive sources may have resonant frequencies which are not whole number multiples of their fundamental frequencies. They are said to have non-harmonic overtones.
- All harmonics are overtones for an open air column or a string.
- Closed air columns produce only odd harmonics.
- A rectangular membrane produces harmonics, but also some other overtones.
Fundamental and Harmonics
The lowest resonant frequency of a vibrating object is called its fundamental frequency. Most vibrating objects have more than one resonant frequency and those used in musical instruments typically vibrate at harmonics of the fundamental. A harmonic is defined as an integer (whole number) multiple of the fundamental frequency. Vibrating strings, open cylindrical air columns, and conical air columns will vibrate at all harmonics of the fundamental. Cylinders with one end closed will vibrate with only odd harmonics of the fundamental. Vibrating membranes typically produce vibrations at harmonics, but also have some resonant frequencies which are not harmonics. It is for this class of vibrators that the term overtone becomes useful - they are said to have some non-harmonic overtones.
Sinusoidal Waves
Sine waves can be represented mathematically, and it can be shown that any wave can be constructed from an appropriate combination of sine waves (Fourier synthesis).
Any single-frequency traveling wave will take the form of a sine wave. This transverse wave is typical of that caused by a small pebble dropped into a still pool. The position of an object vibrating in simple harmonic motion will trace out a sine wave as a function of time. (Or if a mass on a spring is carried at constant speed across a room, it will trace out a sine wave.)
Sinusoidal Node

Producing String Resonance

Phons
Saying that two sounds have equal intensity is not the same thing as saying that they have equal loudness. Since the human hearing sensitivity varies with frequency, it is useful to plot equal loudness curves which show that variation for the average human ear. If 1000 Hz is chosen as a standard frequency, then each equal loudness curve can be referenced to the decibel level at 1000 Hz. This is the basis for the measurement of loudness in phons. If a given sound is perceived to be as loud as a 60 dB sound at 1000 Hz, then it is said to have a loudness of 60 phons.
The loudness of complex sounds can be measured by comparison to 1000Hz test tones, and this type of measurement is useful for research, but for practical sound level measurement, the use of filter contours has been commonly adopted to approximate the variations of the human ear.
Sones
The use of the phon as a unit of loudness is an improvement over just quoting the level in decibels, but it is still not a measurement which is directly proportional to loudness. Using the rule of thumb for loudness (an increase of about 10 phons is perceived as a doubling of loudness), the sone scale was created to provide such a linear scale of loudness. It is usually presumed that the standard range for orchestral music is about 40 to 100 phons. If the lower end of that range is arbitrarily assigned a loudness of one sone, then 50 phons would have a loudness of 2 sones, 60 phons would be 4 sones, etc.
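The doubling pattern in this example (40 phons = 1 sone, 50 phons = 2 sones, 60 phons = 4 sones) can be written as a one-line conversion. This is a sketch of that pattern only, with a made-up function name; it is not meant to reproduce any measurement standard.

```python
def phons_to_sones(phons):
    """Loudness in sones, doubling for every 10 phons above 40 phons."""
    return 2 ** ((phons - 40) / 10)

for level in (40, 50, 60, 80, 100):
    print(f"{level} phons = {phons_to_sones(level):g} sones")
```

On this scale the presumed orchestral range of 40 to 100 phons spans 1 to 64 sones.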
Place Theory
High frequency sounds selectively vibrate the basilar membrane of the inner ear near the entrance port (the oval window). Lower frequencies travel further along the membrane before causing appreciable excitation of the membrane. The basic pitch determining mechanism is based on the location along the membrane where the hair cells are stimulated.
A schematic view of the place theory unrolls the cochlea and represents the distribution of sensitive hair cells on the organ of Corti. Pressure waves are sent through the fluid of the inner ear by force from the stirrup.
The place theory is the first step toward an understanding of pitch perception. But considering the extreme pitch sensitivity of the human ear, it is thought that there must be some additional "sharpening" mechanism to enhance the pitch resolution.
Timbre
Sounds may be generally characterized by pitch, loudness, and quality. Sound "quality" or "timbre" describes those characteristics of sound which allow the ear to distinguish sounds which have the same pitch and loudness. Timbre is then a general term for the distinguishable characteristics of a tone. Timbre is mainly determined by the harmonic content of a sound and the dynamic characteristics of the sound such as vibrato and the attack-decay envelope of the sound.
Some investigators report that it takes a duration of about 60 ms to recognize the timbre of a tone, and that any tone shorter than about 4 ms is perceived as an atonal click. It is suggested that it takes about a 4 dB change in mid or high harmonics to be perceived as a change in timbre, whereas about 10 dB of change in one of the lower harmonics is required.
Harmonic Content
The primary contributors to the quality or timbre of the sound of a musical instrument are harmonic content, attack and decay, and vibrato. For sustained tones, the most important of these is the harmonic content, the number and relative intensity of the upper harmonics present in the sound.
Some musical sound sources have overtones which are not harmonics of the fundamental. While there is some efficiency in characterizing such sources in terms of their overtones, it is always possible to characterize a periodic waveform in terms of harmonics - such an analysis is called Fourier analysis. It is common practice to characterize a sound waveform by the spectrum of harmonics necessary to reproduce the observed waveform.

The recognition of different vowel sounds of the human voice is largely accomplished by analysis of the harmonic content by the inner ear. Their distinctly different quality is attributed to vocal formants, frequency ranges where the harmonics are enhanced.
Attack and Decay
The primary contributors to the quality or timbre of the sound of a musical instrument are harmonic content, attack and decay, and vibrato.

The illustration above shows the attack and decay of a plucked guitar string. The plucking action gives it a sudden attack characterized by a rapid rise to its peak amplitude. The decay is long and gradual by comparison. The ear is sensitive to these attack and decay rates and may be able to use them to identify the instrument producing the sound.

This shows the sound envelope of striking a cymbal with a stick. The attack is almost instantaneous, but the decay envelope is very long. The time period shown is about half a second. The interval shown with the guitar string above is also about half a second, but since its frequency is much lower, you can resolve the individual periods in that sound envelope.
Vibrato/Tremolo
The primary contributors to the quality or timbre of the sound of a musical instrument are harmonic content, attack and decay, and vibrato. The ordinary definition of vibrato is "periodic changes in the pitch of the tone", and the term tremolo is used to indicate periodic changes in the amplitude or loudness of the tone. So vibrato could be called FM (frequency modulation) and tremolo could be called AM (amplitude modulation) of the tone. Actually, in the voice or the sound of a musical instrument both are usually present to some extent.
Vibrato is considered to be a desirable characteristic of the human voice if it is not excessive. It can be used for expression and adds a richness to the voice. If the harmonic content of a sustained sound from a voice or wind instrument is reproduced precisely, the ear can readily detect the difference in timbre because of the absence of vibrato. More realistic synthesized tones will add some type of vibrato and/or tremolo to produce a more realistic tone.

Above is an amplitude plot of a sustained "ee" vowel sound produced by a female voice. The periodic amplitude change would be described as tremolo by the ordinary definition of it. You could also hear pitch variation along with it, so vibrato was present as well. That is commonly the case. The period of the amplitude modulation is about 0.17 seconds, or a modulation frequency of about 5.8 Hz superimposed on a tone of frequency centered at about 395 Hz. Rough frequency measurements gave frequencies of 392 Hz when the amplitude was high and 399 Hz when the amplitude was low. It is not known whether or not this kind of variation is typical. Scaling the amplitude variation gives a range of about 7 dB in intensity associated with the amplitude modulation.
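The FM and AM descriptions of vibrato and tremolo can be sketched as a simple tone generator. The carrier (395 Hz) and modulation rate (5.8 Hz) echo the voice example above, but the modulation depths, the sample rate, and the function name `modulated_tone` are illustrative assumptions, and the sketch makes no attempt to match the phase relation between pitch and amplitude observed in the recording.

```python
import math

def modulated_tone(duration_s=1.0, sample_rate=8000, carrier_hz=395.0,
                   mod_hz=5.8, vibrato_depth_hz=3.5, tremolo_depth=0.3):
    """Return samples of a sine tone with vibrato (FM) and tremolo (AM)."""
    samples = []
    for i in range(int(duration_s * sample_rate)):
        t = i / sample_rate
        # Tremolo: the amplitude swings periodically around 1.0.
        amplitude = 1.0 + tremolo_depth * math.sin(2 * math.pi * mod_hz * t)
        # Vibrato: the instantaneous frequency is
        # carrier_hz + vibrato_depth_hz * sin(2*pi*mod_hz*t);
        # the phase below is the time integral of that frequency.
        phase = (2 * math.pi * carrier_hz * t
                 - (vibrato_depth_hz / mod_hz) * math.cos(2 * math.pi * mod_hz * t))
        samples.append(amplitude * math.sin(phase))
    return samples

tone = modulated_tone()
print(f"{len(tone)} samples, peak amplitude {max(abs(s) for s in tone):.2f}")
```

Setting `vibrato_depth_hz=0` leaves pure tremolo, and setting `tremolo_depth=0` leaves pure vibrato, which makes it easy to compare the two effects separately.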
Resonance
In sound applications, a resonant frequency is a natural frequency of vibration determined by the physical parameters of the vibrating object. This same basic idea of physically determined natural frequencies applies throughout physics in mechanics, electricity and magnetism, and even throughout the realm of modern physics. Some of the implications of resonant frequencies are:
1. It is easy to get an object to vibrate at its resonant frequencies, hard to get it to vibrate at other frequencies.
2. A vibrating object will pick out its resonant frequencies from a complex excitation and vibrate at those frequencies, essentially "filtering out" other frequencies present in the excitation.
3. Most vibrating objects have multiple resonant frequencies.
Ease of Excitation at Resonance
It is easy to get an object to vibrate at its resonant frequencies, hard at other frequencies. A child's playground swing is an example of a pendulum, a resonant system with only one resonant frequency. With a tiny push on the swing each time it comes back to you, you can continue to build up the amplitude of swing. If you try to force it to swing at twice that frequency, you will find it very difficult, and might even lose teeth in the process!
Swinging a child in a playground swing is an easy job because you are helped by its natural frequency. But can you swing it at some other frequency?
Picking out resonant frequencies
A vibrating object will pick out its resonant frequencies from a complex excitation and vibrate at those frequencies, essentially "filtering out" other frequencies present in the excitation.
If you just whack a mass on a spring with a stick, the initial motion may be complex, but the main response will be to bob up and down at its natural frequency. The blow with the stick is a complex excitation with many frequency components (as could be shown by Fourier analysis), but the spring picks out its natural frequency and responds to that.
Pythagorean
Pythagorean Temperament
A pentatonic musical scale can be devised with the use of only the octave, fifth and fourth. It produces three intervals with ratio 9/8 and two larger intervals. If a 9/8 (whole tone) interval is carved out of the larger ones, a smaller (semitone) interval is left: B-C and E-F. This creates a Pythagorean diatonic scale. If the semitone thus created is taken from the whole tone, a chromatic semitone of different size is left over. This leads to some of the difficulties of Pythagorean temperament and other temperaments - such difficulties ultimately led to the development of equal temperament.

Pythagorean Intervals
The Pythagorean temperament produces two different semitones. When expressed in cents, it becomes evident that their differences are well above the just noticeable difference in pitch.

This leads to certain difficulties which ultimately led to the development of equal temperament.
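The two Pythagorean semitones can be derived with exact fractions. The sketch below follows the construction described earlier: a 9/8 whole tone is carved out of the larger pentatonic step, taken here to be the Pythagorean minor third 32/27 (an assumption stated for this example), and the resulting semitone is then taken from the whole tone.

```python
import math
from fractions import Fraction

def cents(ratio):
    """Size of a frequency ratio in cents."""
    return 1200 * math.log2(float(ratio))

whole_tone = Fraction(9, 8)
larger_step = Fraction(32, 27)  # assumed size of the larger pentatonic interval

# Carving a whole tone out of the larger step leaves the diatonic semitone.
diatonic_semitone = larger_step / whole_tone
# Taking that semitone from the whole tone leaves the chromatic semitone.
chromatic_semitone = whole_tone / diatonic_semitone

print(f"diatonic semitone:  {diatonic_semitone} = {cents(diatonic_semitone):.1f} cents")
print(f"chromatic semitone: {chromatic_semitone} = {cents(chromatic_semitone):.1f} cents")
print(f"difference: {cents(chromatic_semitone) - cents(diatonic_semitone):.1f} cents")
```

The two semitones come out near 90¢ and 114¢, so they differ by well over the 5¢ just noticeable difference.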
Temperament Problems
For over two centuries the predominant musical scale used, at least for western music, has been the equal tempered scale. The ability to freely modulate between musical keys and the equivalence of all musical keys were strong enough features to overcome reservations about the compromises made with the small-integer-ratio rule. The kinds of problems which led to this compromise scale were:

Quarter-comma Meantone Tuning
In quarter-comma meantone temperament the just major third is divided into two equal whole tones. This forces some tampering with the fifths and fourths. The fifths are made smaller and the fourths larger by a quarter of a syntonic comma.
Expressed in terms of whole number ratios, the ordinary (syntonic or Ptolemaic) comma is the interval between a just major third (5:4) and a Pythagorean ditone or major third (81:64). Its ratio is 81:80 which is 22 cents. The compromise of the fifths and fourths is then about 5¢, the commonly accepted value for the just noticeable difference in pitch.
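The size of the compromise can be checked numerically. This sketch assumes the usual construction of the quarter-comma meantone fifth as the fourth root of 5, so that four such fifths, brought down two octaves, give exactly a just major third; the variable names are illustrative.

```python
import math

def cents(ratio):
    """Size of a frequency ratio in cents."""
    return 1200 * math.log2(ratio)

syntonic_comma = 81 / 80
just_fifth = 3 / 2
meantone_fifth = 5 ** 0.25  # four of these, two octaves down, give 5:4

# Two fifths up and one octave down give the meantone whole tone.
meantone_whole_tone = meantone_fifth ** 2 / 2

narrowing = cents(just_fifth) - cents(meantone_fifth)
print(f"just fifth:     {cents(just_fifth):.2f} cents")
print(f"meantone fifth: {cents(meantone_fifth):.2f} cents")
print(f"narrowing:      {narrowing:.2f} cents")
print(f"quarter comma:  {cents(syntonic_comma) / 4:.2f} cents")
```

Two of the resulting whole tones stack to exactly 5:4, which is the sense in which the just major third is divided into two equal whole tones.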
Music
Musical Intervals
The term musical interval refers to a step up or down in pitch which is specified by the ratio of the frequencies involved. For example, an octave is a musical interval defined by the ratio 2:1 regardless of the starting frequency. From 100 Hz to 200 Hz is an octave, as is the interval from 2000 Hz to 4000 Hz.
The intervals which are generally the most consonant to the human ear are intervals represented by small integer ratios. Intervals represented by exact integer ratios are said to be Just intervals, and the temperament which keeps all intervals at exact whole number ratios is Just temperament.
Examples of just musical intervals:
| Ratio | Interval |
| --- | --- |
| 2:1 | octave |
| 3:2 | fifth |
| 4:3 | fourth |
| 5:4 | major third |
| 6:5 | minor third |
Musical Scales
Background Material for Tuning and Temperament
Assumptions about the human hearing process:
- The ear is sensitive to ratios of frequencies (pitches) rather than to differences in establishing musical intervals.
- The intervals which are perceived to be most consonant are composed of small integer ratios of frequency.
The octave, fifth, and fourth are the intervals which have been considered to be consonant throughout history by essentially all cultures, so they form a logical base for the building up of musical scales. A typical strategy for using these universally consonant intervals is the circle of fifths.
Consonance and Dissonance
Two tones are said to be consonant if their combination is pleasing to the ear, and dissonant if displeasing. The simplest approach to quantifying consonance is to say that two tones are consonant if their frequencies are related by a small integer ratio. The ratio determines the musical interval.
The octave 2:1, fifth 3:2, and fourth 4:3 are presumed to be universally consonant musical intervals because most persons in any culture or period of history have considered them to be pleasing tone combinations and have built musical compositions around them.
For example, in the buildup of a pentatonic scale by a circle of fifths, a natural whole tone of ratio 9/8 emerges, satisfying the condition for consonance. A semitone like E-F also emerges, and the ratio 256/243 suggests dissonance.
When you define "consonance" as "pleasing to the ear", then of course you have to ask "whose ear?". You can get into such intense debate about what is "pleasing" that some have come to define music as "sounds organized by human beings" to accommodate the endless variety. The use of consonance here is limited to giving a suggestion of a simple rule that yields musical intervals that are pleasing to most people, i.e., "consonant".
Circle of Fifths
A full chromatic scale can be created by using just the perfect fourth and fifth musical intervals. This is characteristic of the Pythagorean temperament.
This process can be pictured on the circle of fifths. The outer circle visits all twelve notes on the chromatic scale by going up by fifths (or down by fourths).
The inner circle goes down by fifths (or up by fourths). To create all these notes in the same octave, you could drop down an octave when necessary to stay in the original octave.
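The walk around the two circles can be sketched in a few lines. This is only an illustration in terms of equal tempered semitone steps (a fifth taken as 7 semitones, a fourth as 5), with sharps chosen arbitrarily for the note names; the function name `circle_of_fifths` is hypothetical.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def circle_of_fifths(start="C", semitones_per_step=7):
    """Visit the 12 chromatic notes by repeatedly moving up a fixed interval.

    Moving up 7 semitones and dropping an octave when needed is the same
    as adding 7 modulo 12 to the note index.
    """
    first = NOTE_NAMES.index(start)
    return [NOTE_NAMES[(first + semitones_per_step * k) % 12] for k in range(12)]

outer = circle_of_fifths("C")                        # up by fifths
inner = circle_of_fifths("C", semitones_per_step=5)  # up by fourths (down by fifths)
print(" ".join(outer))
print(" ".join(inner))
```

Both lists contain every chromatic note exactly once, and the inner circle is the outer circle traversed in the opposite direction.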
Diatonic and Chromatic Scales
Building up a musical scale using a sequence or cycle of musical fifths and fourths leads first to the pentatonic scale, but this leaves two large intervals which in the illustration below would be labeled D-F and A-C. If you take a whole tone interval out of one of the large intervals, a smaller interval or semitone remains. In this way it could be said that the semitone inevitably arises from the pattern of buildup by fifths and fourths. The pattern which results in our example of the Pythagorean scale is the sequence WWHWWWH.

A scale with this sequence is called a diatonic scale. With our choice of C as the starting point, this example has no sharps or flats. When the whole tones of this diatonic scale are divided into semitones with additional notes, these are called chromatic notes and the scale where they are included is called a chromatic scale. In this particular example, all the chromatic notes added would be denoted by sharps or flats. An entire chromatic scale in current musical keys would consist of octaves with 12 semitones or half-steps.
Note that it is not the absence of sharps and flats which defines the diatonic scale - that would depend upon the starting point once you have associated letter names ABC... with the notes. The defining criterion for the diatonic scale is the sequence of whole and half-steps, WWHWWWH.
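The buildup by fifths and fourths can be reproduced with exact fractions. The sketch below is an illustration under stated assumptions: it starts on C, takes five fifths upward and one fifth downward (equivalent to a fourth upward), and folds every note back into one octave. The function names are hypothetical.

```python
from fractions import Fraction

WHOLE_TONE = Fraction(9, 8)
SEMITONE = Fraction(256, 243)

def fold_into_octave(ratio):
    """Multiply or divide by 2 until the ratio lies in [1, 2)."""
    while ratio >= 2:
        ratio /= 2
    while ratio < 1:
        ratio *= 2
    return ratio

def pythagorean_diatonic():
    """Seven notes from a chain of fifths (k = -1 .. 5), plus the octave."""
    fifth = Fraction(3, 2)
    notes = sorted(fold_into_octave(fifth ** k) for k in range(-1, 6))
    return notes + [Fraction(2)]

def step_pattern(notes):
    """Label each step between neighbors as W (9/8) or H (256/243)."""
    labels = {WHOLE_TONE: "W", SEMITONE: "H"}
    return "".join(labels[b / a] for a, b in zip(notes, notes[1:]))

scale = pythagorean_diatonic()
print([str(r) for r in scale])
print(step_pattern(scale))
```

Only two step sizes appear, 9/8 and 256/243, and they fall in the order WWHWWWH.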
On the piano keyboard, an example of the equal tempered scale, the pattern WWHWWWH is demonstrated by the pattern of the white keys if you start with middle-C as indicated. The white keys that have a black key between them are a whole tone apart, but the E-F and B-C white keys do not have a black key between and are a semitone apart.
Pentatonic Scale
A five-note pentatonic scale can be built up with a circle of fifths, a strategy based on our understanding of the human hearing response to small integer ratios of frequencies.
The Pythagorean scale can be produced by carving a natural whole tone out of the larger interval which naturally appears in the pentatonic scale.
The Whole Tone
In the buildup of a pentatonic scale using the musical intervals which have been found to be universally consonant (octave, fifth, and fourth), an interval of ratio 9/8 naturally emerges. This interval satisfies the basic condition for consonance and it occurs in the basic pentatonic scale. This suggests the carving of another such whole tone from the larger interval remaining, leaving the smaller 256/243 interval. This whole tone is used in the Just and Pythagorean temperaments. In devising the equal tempered scale it was important to maintain not only the octave, fifth, and fourth at close to their just values, but also to maintain the whole tone. Expressed in cents notation, the natural whole tone is 204¢, compared to 200¢ for the equal tempered whole tone, just within the accepted 5¢ just noticeable difference.

The natural whole tone that arises in Pythagorean temperament leads to certain difficulties when compared to equal temperament as shown below.

Temperament and musical scales

Guitar
Some Guitar Details
The strings of a guitar allow control of the pitch and harmonic content of the sound produced. The pitch is determined by the length, mass and tension of the strings. The produced frequencies are resonant frequencies of the strings, which depend inversely on the length (e.g., cutting the length in half by depressing the string at the 12th fret will double the frequency to the note one octave up from the fundamental frequency of the string). The mass and tension together determine the speed of the wave in the string. Since it is desirable to have about the same tension in each of the strings to keep from putting any distorting torque on the instrument, a matched set of strings will have carefully adjusted masses so that the strings are tuned to the correct intervals when the tensions are the same.
The bridge transfers the vibrational energy of the strings to the top plate of the instrument - the strings alone can't effectively move air to produce sound, but the vibrating top plate can do that quite efficiently. The air cavity resonance produced by the round hole can be demonstrated by singing a note somewhere between F#2 and A2 (depending on the guitar) while holding your ear close to the sound hole. You will hear the air in the body resonating. Another way to hear the effect of this resonance is to play the open A string while slipping a piece of paper or cardboard back and forth across the soundhole. Because this stops the resonance or shifts it to a lower frequency, you will notice the loss of bass response when you close the hole. The coupled resonance of the front and back plates produces a resonance about an octave above the main air resonance.
The Spaniard Antonio de Torres Jurado is credited with considerable enhancements of the modern classical guitar in the mid-1800s. The body and the sound hole were enlarged and the fretboard widened. Perhaps his most important contribution was the development of "fan strutting", a series of struts which diverge from the sound hole on the top plate of the instrument. This design gave it a considerably stronger, more sustained tone.
Fret Rule for Guitars
To provide for definite pitch relations between notes, metal inserts called frets are inset in a fretboard on the neck of guitars. The raised edges of the frets provide fixed lengths of string when the string is held down against them with a finger. The interval between successive frets is normally one equal tempered semitone.
Frets on guitars are placed by the fret rule "one-eighteenth the remaining length of the string". This makes them approximately a semitone apart. If the musical interval produced by this rule is expressed in cents, then a string of length 17/18 of its original length will be higher by an interval of 98.9 cents compared to a precise semitone of 100 cents. Since the just noticeable difference in pitch is about 5 cents, then the fret rule could be applied for one change of fret with no problems in intonation. One would have to be careful about cumulative errors if the rule were applied repeatedly, so it should be checked at intervals such as the musical fourth which should be precisely 3/4 of the length of the open string.
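The cumulative error mentioned above is easy to estimate. The following is a minimal sketch, assuming an ideal string whose frequency is inversely proportional to its vibrating length; the function name `fret_rule_error` is illustrative.

```python
import math

def fret_rule_error(n_frets, divisor=18):
    """Cents by which fret n, placed by the 'one-eighteenth' rule,
    falls short of n equal tempered semitones (positive = flat)."""
    remaining = (1 - 1 / divisor) ** n_frets        # fraction of the string still vibrating
    pitch_rise = 1200 * math.log2(1 / remaining)    # frequency goes as 1 / length
    return 100 * n_frets - pitch_rise

for n in (1, 5, 12):
    print(f"fret {n:2d}: {fret_rule_error(n):.1f} cents flat")
```

One fret is off by about 1¢, but by the fifth fret the accumulated error is already around 5¢, and by the twelfth fret it is over 12¢, so checking against fixed reference lengths is necessary.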
Tuning the 6-String Guitar
A standard tuning for the strings of the guitar would be E2, A2, D3, G3, B3, E4, referring to the equal tempered set of notes based on A = 440 Hz. They are usually numbered in descending order of pitch (i.e., the E4 is called string 1). The strings are tuned a fourth apart except for the major third interval between strings 2 and 3. Here is a basic procedure for tuning the guitar:
- Tune string 5 (A2) to 110 Hz using an electronic tuner or tuning fork, etc.
- Tune string 6 (E2) by depressing it at fret 5 (A2) and matching this pitch to the pitch of open string 5 (A2).
- Tune string 4 (D3) by depressing string 5 at fret 5 (D3) and matching these pitches.
- Tune string 3 (G3) by depressing string 4 at fret 5 (G3) and matching these pitches.
- Tune string 2 (B3) by depressing string 3 at fret 4 (B3) and matching these pitches.
- Tune string 1 (E4) by depressing string 2 at fret 5 (E4) and matching these pitches.
- If you are a fortunate person, your guitar now might be in tune.
If the frets of the guitar are set for equal temperament, then the 5th fret is 5 semitones or an equal tempered musical fourth. That frequency should equal the open string frequency of the next string up in pitch. They can be matched by "zero beating", adjusting the string so that no beat is heard. After tuning for the zero beat, you should be able to pluck the lower string (held at the fifth fret) and the higher string should start to vibrate, driven by the vibrations in the bridge.
The uncertain ending in step 7 above is caused by a number of "real world" complications. One of the complications is that the upper harmonics of the lower strings will not necessarily sound in tune with the upper strings, even when their fundamentals are tuned to the prescribed interval. One of the reasons is that perfect harmonics from the strings produce intonation problems with equal temperament. If you examine the intervals in cents for a harmonic sequence, you find that the upper harmonics are out of tune with equal temperament, with the 7th harmonic being notably bad. So one kind of problem arises if the harmonics are perfect, but another arises if they depart too far from perfect harmonics. Because of the stiffness of the bass strings, their upper resonances tend to be sharp compared to exact whole number multiples (harmonics). The upper harmonics of the lower strings may beat with the upper strings in such a way as to produce an unpleasant sound. This sometimes leads to a process which is called "octave stretching" in piano tuning. The upper notes of the instruments are tuned a little sharp to sound better when played with the lower strings. The measurements and calculations involved to optimize the tuning are unreasonable for a simple tuning process, so in practice you follow a basic procedure like that described above, and then adjust the tuning by ear to get the most pleasing blend.
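The mismatch between perfect harmonics and equal temperament can be tabulated directly. This sketch assumes ideal harmonics at exact integer multiples of the fundamental and compares each one with the nearest equal tempered note; the function name is hypothetical.

```python
import math

def harmonic_offset_cents(n):
    """Cents by which the n-th harmonic differs from the nearest equal
    tempered note above the fundamental (positive = sharp)."""
    cents_above = 1200 * math.log2(n)
    nearest_note = 100 * round(cents_above / 100)
    return cents_above - nearest_note

for n in range(2, 9):
    print(f"harmonic {n}: {harmonic_offset_cents(n):+6.1f} cents")
```

Harmonics 2, 4 and 8 are octaves and match exactly, harmonics 3 and 6 are about 2¢ sharp, harmonic 5 is about 14¢ flat, and harmonic 7 is about 31¢ flat, which is why it stands out as notably bad.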
After going through the procedure above for relative tuning of the strings, it should be mentioned that inexpensive digital tuning devices are readily available which allow you to tune each string individually. Some give an LED indicator when you are in tune and a readout of how many cents sharp or flat you are with respect to the standard pitch. Even after tuning to one of these devices, you might yet have to deal with the difficulties described above in getting the most pleasant sound out of the instrument. But once you found a mixture that you like, you could use the digital tuner to note how many cents sharp or flat you tuned each string to get that optimum sound.
Vibrating Strings
Vibrating String
The fundamental vibrational mode of a stretched string is such that the wavelength is twice the length of the string.

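Applying the basic wave relationship gives an expression for the fundamental frequency:
$$f_1 = \frac{v}{2L}$$
Since the wave velocity is given by $v = \sqrt{\frac{T}{m/L}}$, the frequency expression can be put in the form:
$$f_1 = \frac{\sqrt{\frac{T}{m/L}}}{2L}$$
where T is the string tension, m is the string mass, and L is the string length.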
The string will also vibrate at all harmonics of the fundamental. Each of these harmonics will form a standing wave on the string.

This shows a resonant standing wave on a string. It is driven by a vibrator at 120 Hz.

For strings of finite stiffness, the harmonic frequencies will depart progressively from the mathematical harmonics. To get the necessary mass for the strings of an electric bass as shown above, wire is wound around a solid core wire. This allows the addition of mass without producing excessive stiffness.
Vibrating String Frequencies
If you pluck your guitar string, you don't have to tell it what pitch to produce - it knows! That is, its pitch is its resonant frequency, which is determined by the length, mass, and tension of the string. The pitch varies in different ways with these different parameters, as illustrated by the examples below:
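A short numerical sketch shows how the pitch responds to each parameter. It assumes an ideal, perfectly flexible string, for which the fundamental is the wave speed divided by twice the length; the string values used (0.65 m, 70 N, 0.004 kg/m) are made up for illustration and do not describe any particular instrument.

```python
import math

def fundamental_frequency(length_m, tension_n, mass_per_length_kg_m):
    """Fundamental of an ideal string: f1 = sqrt(T / (m/L)) / (2 L)."""
    wave_speed = math.sqrt(tension_n / mass_per_length_kg_m)
    return wave_speed / (2 * length_m)

# Hypothetical reference string
base = fundamental_frequency(0.65, 70.0, 0.004)

half_length = fundamental_frequency(0.325, 70.0, 0.004)   # length halved
four_tension = fundamental_frequency(0.65, 280.0, 0.004)  # tension x 4
four_mass = fundamental_frequency(0.65, 70.0, 0.016)      # mass per length x 4

print(f"reference:        {base:.1f} Hz")
print(f"half the length:  {half_length / base:.2f} x reference")
print(f"4 x the tension:  {four_tension / base:.2f} x reference")
print(f"4 x the mass:     {four_mass / base:.2f} x reference")
```

Halving the length or quadrupling the tension raises the pitch by an octave, while quadrupling the mass per unit length lowers it by an octave.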
Wave Velocity in String
The velocity of a traveling wave in a stretched string is determined by the tension and the mass per unit length of the string.
The wave velocity is given by
$$v = \sqrt{\frac{T}{m/L}}$$
where T is the tension and m/L is the mass per unit length of the string.
When the wave relationship is applied to a stretched string, it is seen that resonant standing wave modes are produced. The lowest frequency mode for a stretched string is called the fundamental, and its frequency is given by
$$f_1 = \frac{\sqrt{\frac{T}{m/L}}}{2L}$$
The Physics of Sound
Loudness
Sound loudness is a subjective term describing the strength of the ear's perception of a sound. It is intimately related to sound intensity but can by no means be considered identical to intensity. The sound intensity must be factored by the ear's sensitivity to the particular frequencies contained in the sound. This is the kind of information contained in equal loudness curves for the human ear. It must also be considered that the ear's response to increasing sound intensity is a "power of ten" or logarithmic relationship. This is one of the motivations for using the decibel scale to measure sound intensity. A general "rule of thumb" for loudness is that the power must be increased by about a factor of ten to sound twice as loud. To more realistically assess sound loudness, the ear's sensitivity curves are factored in to produce a phon scale for loudness. The factor of ten rule of thumb can then be used to produce the sone scale of loudness. In practical sound level measurement, filter contours such as the A, B, and C contours are used to make the measuring instrument more nearly approximate the ear.
"Rule of Thumb" for Loudness
A widely used "rule of thumb" for the loudness of a particular sound is that the sound must be increased in intensity by a factor of ten for the sound to be perceived as twice as loud. A common way of stating it is that it takes 10 violins to sound twice as loud as one violin. Another way to state the rule is to say that the loudness doubles for every 10 phon increase in the sound loudness level. Although this rule is widely used, it must be emphasized that it is an approximate general statement based upon a great deal of investigation of average human hearing but it is not to be taken as a hard and fast rule.
Why is it that doubling the sound intensity to the ear does not produce a dramatic increase in loudness? We cannot give answers with complete confidence, but it appears that there are saturation effects. Nerve cells have maximum rates at which they can fire, and it appears that doubling the sound energy to the sensitive inner ear does not double the strength of the nerve signal to the brain. This is just a model, but it seems to correlate with the general observations which suggest that something like ten times the intensity is required to double the signal from the inner ear.
One difficulty with this "rule of thumb" for loudness is that it is applicable only to adding loudness for identical sounds. If a second sound is widely enough separated in frequency to be outside the critical band of the first, then this rule does not apply at all.
While not a precise rule even for the increase of the same sound, the rule has considerable utility along with the just noticeable difference in sound intensity when judging the significance of changes in sound level.
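The rule of thumb can be written as a one-line formula. The sketch below assumes that the "twice as loud per factor of ten" relation can be applied smoothly between decades, which is an extrapolation of the rule and only meaningful for increases of the same sound; the function name is hypothetical.

```python
import math

def loudness_ratio(intensity_ratio):
    """Approximate loudness ratio for the same sound, using the rule of
    thumb that each 10 dB (factor of 10 in intensity) doubles loudness."""
    level_change_db = 10 * math.log10(intensity_ratio)
    return 2 ** (level_change_db / 10)

for violins in (1, 2, 10, 100):
    print(f"{violins:3d} violins: about {loudness_ratio(violins):.2f} x as loud as one")
```

Under this assumption two violins sound only about 1.2 times as loud as one, ten violins twice as loud, and a hundred violins four times as loud.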
Adding Loudness
When one sound is produced and another sound is added, the increase in loudness perceived depends upon its frequency relation to the first sound. Insight into this process can be obtained from the place theory of pitch perception. If the second sound is widely separated in pitch from the first, then they do not compete for the same nerve endings on the basilar membrane of the inner ear. Adding a second sound of equal loudness yields a total sound about twice as loud. But if the two sounds are close together in frequency, within a critical band, then the saturation effects in the organ of Corti are such that the perceived combined loudness is only slightly greater than either sound alone. This is the condition which leads to the commonly used rule of thumb for loudness addition.

Sound Intensity
Sound intensity is defined as the sound power per unit area. The usual context is the measurement of sound intensity in the air at a listener's location. The basic units are watts/m² or watts/cm². Many sound intensity measurements are made relative to a standard threshold of hearing intensity I0:
$$I_0 = 10^{-12}\ \text{watts/m}^2 = 10^{-16}\ \text{watts/cm}^2$$
The most common approach to sound intensity measurement is to use the decibel scale:
$$I(\text{dB}) = 10 \log_{10}\left[\frac{I}{I_0}\right]$$
Decibels measure the ratio of a given intensity I to the threshold of hearing intensity I0, so that this threshold takes the value 0 decibels (0 dB). To assess sound loudness, as distinct from an objective intensity measurement, the sensitivity of the ear must be factored in.
Sound Pressure
Since audible sound consists of pressure waves, one of the ways to quantify the sound is to state the amount of pressure variation relative to atmospheric pressure caused by the sound. Because of the great sensitivity of human hearing, the threshold of hearing corresponds to a pressure variation less than a billionth of atmospheric pressure.
The standard threshold of hearing can be stated in terms of pressure and the sound intensity in decibels can be expressed in terms of the sound pressure:
$$I(\text{dB}) = 10 \log_{10}\left[\frac{P^2}{P_0^2}\right] = 20 \log_{10}\left[\frac{P}{P_0}\right]$$
The pressure P here is to be understood as the amplitude of the pressure wave, and P0 is the pressure amplitude at the threshold of hearing. The power carried by a traveling wave is proportional to the square of the amplitude. The factor of 20 comes from the fact that the logarithm of the square of a quantity is equal to 2 x the logarithm of the quantity. Since common microphones such as dynamic microphones produce a voltage which is proportional to the sound pressure, then changes in sound intensity incident on the microphone can be calculated from
$$\Delta I(\text{dB}) = 20 \log_{10}\left[\frac{V_2}{V_1}\right]$$
where V1 and V2 are the measured voltage amplitudes.
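As an illustration of the factor of 20, the sketch below converts a pair of microphone voltage amplitudes into a change in sound level. It assumes the microphone voltage is strictly proportional to sound pressure and that the change is measured going from V1 to V2; the function name is illustrative.

```python
import math

def level_change_db(v1, v2):
    """Change in sound level, in dB, when the microphone voltage
    amplitude goes from v1 to v2 (voltage proportional to pressure)."""
    return 20 * math.log10(v2 / v1)

print(f"voltage doubled: {level_change_db(1.0, 2.0):+.1f} dB")
print(f"voltage x 10:    {level_change_db(1.0, 10.0):+.1f} dB")
print(f"voltage halved:  {level_change_db(1.0, 0.5):+.1f} dB")
```

Doubling the voltage corresponds to about +6 dB, because the intensity goes as the square of the pressure amplitude and therefore increases by a factor of four.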
Threshold of Hearing
Sound level measurements in decibels are generally referenced to a standard threshold of hearing at 1000 Hz for the human ear which can be stated in terms of sound intensity:
$$I_0 = 10^{-12}\ \text{watts/m}^2 = 10^{-16}\ \text{watts/cm}^2$$
or in terms of sound pressure:
$$P_0 = 2 \times 10^{-5}\ \text{N/m}^2$$
This value has wide acceptance as a nominal standard threshold and corresponds to 0 decibels. It represents a pressure change of less than one billionth of standard atmospheric pressure. This is indicative of the incredible sensitivity of human hearing. The actual average threshold of hearing at 1000 Hz is more like 2.5 × 10⁻¹² watts/m² or about 4 decibels, but zero decibels is a convenient reference. The threshold of hearing varies with frequency, as illustrated by the measured hearing curves.
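The two numerical claims in this paragraph can be checked in a few lines. The sketch assumes the nominal reference values of 10⁻¹² watts/m² for intensity and 2 × 10⁻⁵ N/m² for pressure, and a standard atmosphere of 101,325 N/m²; the constant and function names are illustrative.

```python
import math

I0 = 1e-12             # W/m^2, nominal threshold of hearing intensity
P0 = 2e-5              # N/m^2, nominal threshold pressure (assumed standard value)
ATMOSPHERE = 101_325   # N/m^2, standard atmospheric pressure (assumed standard value)

def intensity_level_db(intensity):
    """Sound intensity level in dB relative to the nominal threshold I0."""
    return 10 * math.log10(intensity / I0)

measured_threshold_db = intensity_level_db(2.5e-12)
pressure_fraction = P0 / ATMOSPHERE

print(f"measured 1000 Hz threshold: {measured_threshold_db:.1f} dB")
print(f"threshold pressure / atmospheric pressure: {pressure_fraction:.1e}")
```

The measured threshold comes out at about 4 dB, and the threshold pressure is roughly 2 × 10⁻¹⁰ of atmospheric pressure, which is indeed less than one billionth.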
Threshold of Pain
The nominal dynamic range of human hearing is from the standard threshold of hearing to the threshold of pain. A nominal figure for the threshold of pain is 130 decibels, but that which may be considered painful for one may be welcomed as entertainment by others. Generally, younger persons are more tolerant of loud sounds than older persons because their protective mechanisms are more effective. This tolerance does not make them immune to the damage that loud sounds can produce.

Some sources quote 120 dB as the pain threshold and define the audible sound frequency range as ending at about 20,000 Hz where the threshold of hearing and the threshold of pain meet.
Decibels
The sound intensity I may be expressed in decibels above the standard threshold of hearing I0. The expression is
$$I(\text{dB}) = 10 \log_{10}\left[\frac{I}{I_0}\right]$$
The logarithm involved is just the power of ten of the sound intensity expressed as a multiple of the threshold of hearing intensity. Example: If I = 10,000 times the threshold, then the ratio of the intensity to the threshold intensity is 10⁴, the power of ten is 4, and the intensity is 40 dB:
$$I(\text{dB}) = 10 \log_{10}\left[10^{4}\right] = 10 \times 4 = 40\ \text{dB}$$
The factor of 10 multiplying the logarithm makes it decibels instead of Bels, and is included because about 1 decibel is the just noticeable difference (JND) in sound intensity for the normal human ear.
Decibels provide a relative measure of sound intensity. The unit is based on powers of 10 to give a manageable range of numbers to encompass the wide range of the human hearing response, from the standard threshold of hearing at 1000 Hz to the threshold of pain at some ten trillion times that intensity.
Another consideration which prompts the use of powers of 10 for sound measurement is the rule of thumb for loudness: it takes about 10 times the intensity to sound twice as loud.
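The conversion in both directions takes one line each. This is a minimal sketch of the decibel definition given above, with intensities expressed as multiples of the threshold intensity; the function names are illustrative.

```python
import math

def intensity_ratio_to_db(ratio):
    """Decibels for an intensity that is `ratio` times the threshold."""
    return 10 * math.log10(ratio)

def db_to_intensity_ratio(level_db):
    """Intensity, as a multiple of the threshold, for a level in dB."""
    return 10 ** (level_db / 10)

print(f"10,000 x threshold: {intensity_ratio_to_db(10_000):.0f} dB")
print(f"130 dB: {db_to_intensity_ratio(130):.0e} x threshold")
```

The 130 dB threshold of pain corresponds to 10¹³ times the threshold intensity, the "ten trillion" quoted above.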
Decibels and Logarithms
The decibel scale is a reflection of the logarithmic response of the human ear to changes in sound intensity:
$$I(\text{dB}) = 10 \log_{10}\left[\frac{I}{I_0}\right]$$
The logarithm to the base 10 used in this expression is just the power of 10 of the quantity in brackets according to the basic definition of the logarithm:
$$\log_{10}\left(10^{x}\right) = x$$
JND in Sound Intensity
A useful general reference is that the just noticeable difference in sound intensity for the human ear is about 1 decibel.
In fact, the use of the factor of 10 in the definition of the decibel is to create a unit which is about the least detectable change in sound intensity.
That having been established, it can be noted that there are some variations. The JND is about 1 dB for soft sounds around 30-40 dB at low and midrange frequencies. It may drop to 1/3 to 1/2 a decibel for loud sounds.
Caution must be used in applying the "one decibel" criterion. It presumes that you are increasing the same sound by one decibel. If you were adding a sound outside the critical band of frequency from this sound, you would be exciting fresh nerve endings, and the one decibel rule can't be presumed to apply. This causes some concern about the perceptual encoding schemes used with modern digital recording which might eliminate some significant audible content by the use of a "one decibel" criterion for dropping content.
Variations in Difference Threshold

Critical Band
When two sounds of equal loudness when sounded separately are close together in pitch, their combined loudness when sounded together will be only slightly louder than one of them alone. They may be said to be in the same critical band where they are competing for the same nerve endings on the basilar membrane of the inner ear. According to the place theory of pitch perception, sounds of a given frequency will excite the nerve cells of the organ of Corti only at a specific place. The available receptors show saturation effects which lead to the general rule of thumb for loudness by limiting the increase in neural response.
If the two sounds are widely separated in pitch, the perceived loudness of the combined tones will be considerably greater because they do not overlap on the basilar membrane and compete for the same hair cells. The phenomenon of the critical band has been widely investigated.
Backus reports that this critical band is about 90 Hz wide for sounds below 200 Hz and increases to about 900 Hz for frequencies around 5000 Hertz. It is suggested that this corresponds to a roughly constant length on the basilar membrane of length about 1.2 mm and involving some 1300 hair cells. If the tones are far apart in frequency (not within a critical band), the combined sound may be perceived as twice as loud as one alone.
Critical Band Measurement
For low frequencies the critical band is about 90 Hz wide. For higher frequencies, it is between a whole tone and 1/3 octave wide.
Sensitivity of Human Ear
The human ear can respond to minute pressure variations in the air if they are in the audible frequency range, roughly 20 Hz - 20 kHz.


It is capable of detecting pressure variations of less than one billionth of atmospheric pressure. The threshold of hearing corresponds to air vibrations on the order of a tenth of an atomic diameter. This incredible sensitivity is enhanced by an effective amplification of the sound signal by the outer and middle ear structures. Contributing to the wide dynamic range of human hearing are protective mechanisms that reduce the ear's response to very loud sounds. Sound intensities over this wide range are usually expressed in decibels.
Dynamic Range of Hearing
In addition to its remarkable sensitivity, the human ear is capable of responding to the widest range of stimuli of any of the senses. The practical dynamic range could be said to be from the threshold of hearing to the threshold of pain.
This remarkable dynamic range is enhanced by an effective amplification structure which extends its low end and by a protective mechanism which extends the high end.
Equal Loudness Curves

Annotated Equal Loudness Curves

Progressive Discrimination Against Low Frequencies
For very soft sounds, near the threshold of hearing, the ear strongly discriminates against low frequencies. For mid-range sounds around 60 phons, the discrimination is not so pronounced and for very loud sounds in the neighborhood of 120 phons, the hearing response is more nearly flat. One of the implications of this aspect of human hearing is that you will perceive a progressive loss of bass frequencies as a given sound becomes softer and softer. For example if you are listening to a recording of an orchestra and you turn the volume down, you will find that the bass instruments are less and less prominent. This is the purpose of the so-called "loudness contours" on audio amplifiers; they allow you to boost the bass frequencies when you are listening at low sound levels to give you a more realistic balance of the high and low frequencies in the music.
This aspect of human hearing has important implications for the design of auditoriums for music performance. As the sound gets softer toward the back of the auditorium, the listener will perceive a loss of bass frequencies, an undesirable condition. To overcome this "bass loss problem", auditorium designs must attempt to provide some reinforcement of the low frequencies.
Threshold of Hearing
The measured threshold of hearing curve shows that the sound intensity required to be heard is quite different for different frequencies. The standard threshold of hearing at 1000 Hz is nominally taken to be 0 dB, but the actual curves show the measured threshold at 1000 Hz to be about 4 dB. There is marked discrimination against low frequencies so that about 60 dB is required to be heard at 30 Hz. The maximum sensitivity at about 3500 to 4000 Hz is related to the resonance of the auditory canal.

Human Ear
Structures of the Ear
The structures of the outer and middle ear contribute to both the remarkable sensitivity and the wide dynamic range of human hearing. They can be considered to be both a pre-amplifier and a limiter for the human hearing process.
The outer ear (pinna) collects more sound energy than the ear canal would receive without it and thus contributes some area amplification. The numbers here are just representative ... not precise data.
The outer and middle ears contribute something like a factor of 100 or about 20 decibels of amplification under optimum conditions.
The Body's Microphone

Organ of Corti
The organ of Corti is the sensitive element in the inner ear and can be thought of as the body's microphone. It is situated on the basilar membrane in one of the three compartments of the cochlea. It contains four rows of hair cells which protrude from its surface. Above them is the tectorial membrane which can move in response to pressure variations in the fluid-filled tympanic and vestibular canals. There are some 16,000 - 20,000 of the hair cells distributed along the basilar membrane which follows the spiral of the cochlea.
The place along the basilar membrane where maximum excitation of the hair cells occurs determines the perception of pitch according to the place theory. The perception of loudness is also connected with this organ.
Tiny relative movements of the layers of the membrane are sufficient to trigger the hair cells. Like other nerve cells, their response to stimulus is to send a tiny voltage pulse called an "action potential" down the associated nerve fiber (axon). These impulses travel to the auditory areas of the brain for processing.
Arrangement of Hair Cells
The hair cells of the organ of Corti are arranged in four rows along the length of the basilar membrane. Individual hair cells have multiple strands called stereocilia. There may be 16,000 - 20,000 such cells. The place theory of pitch perception suggests that pitch is determined by the place along this collection at which excitation occurs. The pitch resolution of the ear suggests a collection of hair cells like this associated with each distinguishable pitch.
This is another conception of the arrangement of the outer three rows of hair cells, consistent with the above picture, but showing that a cluster of the cilia is associated with a single hair cell. It is drawn roughly from the work of McGutin. I think that the best that we know about the cilia arrangements comes from electron micrographs like those of Hudspeth.
Single Hair Cell Structure
The sensitive hair cells of the organ of Corti may have about 100 tiny stereocilia which in the resting state are leaning on each other in a conical bundle. In response to the pressure variations in the cochlea produced by sound, the stereocilia may dance about wildly and send electrical impulses to the brain.
The Cochlea
The inner ear structure called the cochlea is a snail-shell like structure divided into three fluid-filled parts. Two are canals for the transmission of pressure and in the third is the sensitive organ of Corti, which detects pressure impulses and responds with electrical impulses which travel along the auditory nerve to the brain.
Section of Cochlea
The cochlea has three fluid filled sections. The perilymph fluid in the canals differs from the endolymph fluid in the cochlear duct. The organ of Corti is the sensor of pressure variations.

The Fluid Filled Cochlea
The pressure changes in the cochlea caused by sound entering the ear travel down the tympanic and vestibular canals, which are filled with a fluid called perilymph. This perilymph is almost identical to spinal fluid and differs significantly from the endolymph which fills the cochlear duct and surrounds the sensitive organ of Corti. The fluids differ in terms of their electrolytes, and if the membranes are ruptured so that there is mixing of the fluids, the hearing is impaired.
The Auditory Nerve
Taking electrical impulses from the cochlea and the semicircular canals, the auditory nerve makes connections with both auditory areas of the brain.

Auditory Area of Brain
This schematic view of some of the auditory areas of the brain shows that information from both ears goes to both sides of the brain - in fact, binaural information is present in all of the major relay stations illustrated here. That is, when the auditory nerve from one ear takes information to the brain, that information is sent directly to the processing areas on both sides of the brain.

The Inner Ear

The small bone called the stirrup, one of the ossicles, exerts force on the thin membrane called the oval window, transmitting sound pressure information into the inner ear.
The Inner Ear Canals
The inner ear can be thought of as two organs: the semicircular canals, which serve as the body's balance organ, and the cochlea, which serves as the body's microphone, converting sound pressure impulses from the outer ear into electrical impulses which are passed on to the brain via the auditory nerve. The basilar membrane of the inner ear plays a critical role in the perception of pitch according to the place theory.
The Semicircular Canals
The semicircular canals are the body's balance organs, detecting acceleration in the three perpendicular planes. These accelerometers make use of hair cells similar to those on the organ of Corti, but these hair cells detect movements of the fluid in the canals caused by angular acceleration about an axis perpendicular to the plane of the canal. Tiny floating particles aid the process of stimulating the hair cells as they move with the fluid. The canals are connected to the auditory nerve.
The Ossicles
The three tiniest bones in the body form the coupling between the vibration of the eardrum and the forces exerted on the oval window of the inner ear. Formally named the malleus, incus, and stapes, they are commonly referred to in English as the hammer, anvil, and stirrup.
With a long enough lever, you can lift a big rock with a small applied force on the other end of the lever. The amplification of force can be changed by shifting the pivot point.
The ossicles can be thought of as a compound lever which achieves a multiplication of force. This lever action is thought to achieve an amplification by a factor of about three under optimum conditions, but can be adjusted by muscle action to actually attenuate the sound signal for protection against loud sounds.
A physiology book describes the ossicles as small enough to fit collectively on a U.S. dime. The image to the right actually makes the ossicles a bit too large - they may be half that large in some persons.
Ossicle Vibration
The vibration of the eardrum is transmitted to the oval window of the inner ear by means of the ossicles, which achieve an amplification by lever action. The lever is adjustable under muscle action and may actually attenuate loud sounds for protection of the ear.
Ear and Hearing

The Outer Ear
Sound energy spreads out from its sources. For a point source of sound, it spreads out according to the inverse square law. For a given sound intensity, a larger ear captures more of the wave and hence more sound energy.
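As a rough illustration of the two statements above, the following Python sketch computes the intensity at a distance from an idealized point source and the power intercepted by a collecting area facing the wave. The source power, distances, and collecting areas are illustrative assumptions, not values from the text.

```python
import math


def intensity_point_source(power_w, distance_m):
    """Intensity (W/m^2) at distance_m from an ideal point source.

    Inverse square law: the power spreads over a sphere of area 4*pi*r^2.
    """
    return power_w / (4.0 * math.pi * distance_m ** 2)


def power_captured(intensity_w_m2, area_m2):
    """Power (W) intercepted by a collecting area facing the wave."""
    return intensity_w_m2 * area_m2


if __name__ == "__main__":
    source_power = 1e-3                  # assumed 1 mW acoustic source
    small_ear, large_ear = 1e-3, 4e-3    # assumed collecting areas in m^2
    for r in (1.0, 2.0, 4.0):
        i = intensity_point_source(source_power, r)
        print(f"r = {r} m: I = {i:.3e} W/m^2, "
              f"small ear {power_captured(i, small_ear):.3e} W, "
              f"large ear {power_captured(i, large_ear):.3e} W")
```

Doubling the distance cuts the intensity to one quarter, while at a fixed intensity the captured power scales directly with the collecting area.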
The outer ear structures act as part of the ear's preamplifier to enhance the sensitivity of hearing.

The auditory canal acts as a closed tube resonator, enhancing sounds in the range 2-5 kHz.
The Tympanic Membrane
The tympanic membrane or "eardrum" receives vibrations traveling up the auditory canal and transfers them through the tiny ossicles to the oval window, the port into the inner ear.
The eardrum is some fifteen times larger in area than the oval window of the inner ear, giving an amplification of about fifteen compared to a case where the sound pressure interacted with the oval window alone. The tympanic membrane is very thin, about 0.1 mm, but it is resilient and strong (Zemlin).
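A back-of-the-envelope sketch of the combined middle-ear gain: it takes the area factor of about fifteen given here together with the lever factor of about three quoted for the ossicles under optimum conditions, and assumes the two factors simply multiply. The function names are hypothetical, and the result is an idealized upper estimate rather than a measured value.

```python
import math


def middle_ear_pressure_gain(area_ratio=15.0, lever_ratio=3.0):
    """Idealized pressure gain: area ratio times ossicle lever ratio.

    Assumes the two mechanisms act independently and multiply.
    """
    return area_ratio * lever_ratio


def gain_in_decibels(pressure_gain):
    """Express a pressure ratio in decibels (20 * log10 of the ratio)."""
    return 20.0 * math.log10(pressure_gain)


if __name__ == "__main__":
    gain = middle_ear_pressure_gain()
    print(f"pressure gain ~ {gain:.0f}x, about {gain_in_decibels(gain):.1f} dB")
```

With the default factors the pressure gain comes out near 45, or roughly 33 dB.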
The Eustachian Tube

The Eustachian tube is an open tube leading from the middle ear to the nasopharynx. It allows the middle ear pressure to equalize with the external air pressure. Its secondary function is to allow drainage of normal and diseased middle ear secretions from the middle ear cavity into the nasopharynx.
Human Voice
Phonation
The process of speech production by the human voice may be divided into phonation, resonation, and articulation. Phonation is the process by which energy from the lungs in the form of air pressure is converted into audible vibrations.
One method of phonation involves using the air pressure to set the elastic vocal folds into vibration, a process called voicing. The other involves allowing air to pass through the larynx into the vocal tract, where modifications of the airstream produce transient or aperiodic sound waves.
Aperiodic sounds can be combined with voiced sounds to create voiced consonants like /d/.
Aperiodic Phonation
Aperiodic phonation involves allowing air to pass through the larynx into the vocal tract, where modifications of the airstream produce transient or aperiodic sound waves.
/t/ | Sound produced by blocking the airstream and suddenly releasing the built up air pressure. Such a sound is called a stop or "plosive". |
/sh/ | A "continuous noise" type sound made by forcing air through a constricted space, like the sound /sh/. |
/ch/ | A combination of plosive and fricative, as in the sound /ch/ in "chair". |
/d/ | A plosive followed by a voiced sound. |
Voice Vibrato
Included in the distinguishing characteristics of a musical sound which determine its timbre is vibrato/tremolo. The term vibrato for the singing voice is more commonly used to describe the variations in the voice during a sustained note. When analyzed, it is found that both the pitch and amplitude change periodically so that both vibrato and tremolo are present. The presence of vibrato, within limits, adds richness and expression to the voice. It is often the case that the amount of vibrato increases as a given note is sustained, and many singers use variations of vibrato for expression in singing. Excessive vibrato gives an impression of instability to the tone, so control of the amount of vibrato is a matter of practice and musical judgement.

This recording of a sustained vocal sound shows the periodic change in the amplitude of the sound. You could also clearly hear the accompanying periodic pitch change, so both amplitude modulation and frequency modulation were present.
If the precise harmonic content of a sustained voice sound is reproduced and sounded without the vibrato, the ear can easily tell the difference. Stanley suggests that a good vibrato rate for the singing voice is about 6 pulses per second, and that the average pitch variation is about a semitone accompanied by about 3 decibels of intensity variation. The amount of vibrato tends to increase with loudness, reaching about a full tone for very loud vocalization (fortissimo).
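The figures quoted from Stanley can be turned into a small synthesis sketch. The code below generates samples of a sine tone with a 6 Hz modulation applied to both pitch and amplitude. Treating "about a semitone" and "about 3 decibels" as peak-to-peak spans is an assumption, as are the 220 Hz carrier, the sample rate, and the function name.

```python
import math


def vibrato_tone(f0=220.0, rate_hz=6.0, pitch_span_semitones=1.0,
                 level_span_db=3.0, duration_s=1.0, sample_rate=8000):
    """Return samples of a sine tone with combined vibrato and tremolo.

    The pitch swings by +/- half of pitch_span_semitones and the level by
    +/- half of level_span_db, both driven by the same sinusoidal modulator.
    """
    half_pitch = pitch_span_semitones / 2.0
    half_level = level_span_db / 2.0
    samples = []
    phase = 0.0
    for n in range(int(duration_s * sample_rate)):
        t = n / sample_rate
        lfo = math.sin(2.0 * math.pi * rate_hz * t)
        freq = f0 * 2.0 ** (half_pitch * lfo / 12.0)   # frequency modulation
        amp = 10.0 ** (half_level * lfo / 20.0)        # amplitude modulation
        samples.append(amp * math.sin(phase))
        # Accumulate phase so the instantaneous frequency follows freq.
        phase += 2.0 * math.pi * freq / sample_rate
    return samples


if __name__ == "__main__":
    tone = vibrato_tone()
    print(len(tone), "samples, peak amplitude", max(abs(s) for s in tone))
```

Setting both spans to zero yields a steady tone with the same carrier, which gives a simple way to compare a sound with and without vibrato.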
Vibrato with Musical Instruments
Included in the distinguishing characteristics of the sound of a musical instrument which determine its timbre is vibrato/tremolo. With a violin, almost pure vibrato (periodic pitch variations) can be produced with the finger on the string by a rocking motion which periodically changes the length and therefore the resonant frequency. Likewise a trombone player can produce almost pure vibrato by wiggling the slide in and out to change the pitch. On the other hand when a flute player uses the diaphragm to produce a tone variation, sometimes called "diaphragm vibrato", it is actually almost pure tremolo (periodic amplitude variation). For most tone variations which come under the general term vibrato, such as that produced with instruments employing a reed, the tone variation includes both true vibrato and tremolo since it is hard to produce a pitch variation without producing some amplitude variation as well. Voice vibrato also includes both pitch and amplitude variation.
Vocal Sound Production
Diaphragm action pushes air from the lungs through the vocal folds, producing a periodic train of air pulses. This pulse train is shaped by the resonances of the vocal tract. The basic resonances, called vocal formants, can be changed by the action of the articulators to produce distinguishable voice sounds, like the vowel sounds.
The Vocal Folds
Positioned at the base of the larynx in the vocal tract, these twin infoldings of mucous membrane act as the vibrator or "reed" during phonation. Open during breathing, the folds are closed by the pivoting of the arytenoid cartilages for speech or singing.
Positive air pressure from the lungs forces them open momentarily, but the high velocity air produces a lowered pressure by the Bernoulli effect which brings them back together. The folds themselves have a resonant frequency which determines voice pitch.
In an adult male, the vocal folds are usually 17-23 mm long, and 12.5-17 mm in an adult female (Kaplan). They may be stretched 3 or 4 mm by action of the muscles in the larynx.
The male speaking voice averages about 125 Hz, while the female voice averages about 210 Hz. Children's voices average over 300 Hz. The illustration below shows approximate pitches for speaking voices related to an equal tempered piano keyboard based on A4 = 440 Hz.
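To relate the average speaking frequencies above to the keyboard, the sketch below finds the nearest equal tempered note for a given frequency, using A4 = 440 Hz as stated. The note-name spelling with sharps and the MIDI-style octave numbering (A4 is note number 69) are conventions assumed for the example.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def nearest_note(freq_hz, a4_hz=440.0):
    """Return (note name, offset in cents) for the nearest equal tempered note."""
    semitones_from_a4 = 12.0 * math.log2(freq_hz / a4_hz)
    nearest = round(semitones_from_a4)
    midi_number = 69 + nearest                     # assumed MIDI numbering
    name = NOTE_NAMES[midi_number % 12] + str(midi_number // 12 - 1)
    cents_off = 100.0 * (semitones_from_a4 - nearest)
    return name, cents_off


if __name__ == "__main__":
    for label, f in (("male", 125.0), ("female", 210.0), ("child", 300.0)):
        name, cents = nearest_note(f)
        print(f"{label}: {f} Hz is nearest to {name} ({cents:+.0f} cents)")
```

Under these conventions 125 Hz falls nearest B2, 210 Hz nearest G#3, and 300 Hz nearest D4.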

The front end of the vocal folds is attached to the thyroid cartilage, the "Adam's apple". The back end is attached to the arytenoid cartilages, which move to separate the folds for breathing.
Vocal Folds in Phonation
The process of converting the air pressure from the lungs into audible vibrations is called phonation. When the air passes through the elastic vocal folds and causes them to vibrate, the type of phonation is called voicing. The vocal folds give the singer a wide range of control over the pitch of the sound produced. While "vocal folds" is more descriptive than "vocal cords", there is some similarity to a vibrating string in that the pitch produced depends upon the length, mass and tension of the vocal folds.
The excitation of the vocal folds is, however, very different from the excitation of a string in that it is caused by the passage of air through the opening between the folds. The muscles of the larynx change the elasticity and tension of the vocal folds to determine the pitch of the sound.
Vocal Tract Resonance
Sundberg models the vocal tract as a closed tube resonator, suggesting that the three prominent formants seen in vowel sounds correspond to the harmonics 1, 3, 5. These frequencies are then modified by the cavity resonance of the vocal tract as influenced by the articulators.
A typical length for the vocal tract is about 17-18 cm. This would give a fundamental frequency of about 500 Hz if it is treated as a closed cylinder. This would predict formant frequencies of 500, 1500 and 2500 Hz, which are in the range of observed frequencies. However, the articulators which provide differences in vowel sounds produce significant changes in these formant frequencies.
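The closed-tube estimate can be checked with a few lines of Python. A tube closed at one end resonates at odd multiples of v/(4L); the speed of sound of 343 m/s (room-temperature air) is an assumption, and the warm, humid air in a real vocal tract would shift the numbers slightly.

```python
def closed_tube_resonances(length_m, speed_of_sound=343.0, count=3):
    """Resonant frequencies (Hz) of a tube closed at one end.

    Only odd harmonics are present: f_n = (2n - 1) * v / (4 L).
    """
    fundamental = speed_of_sound / (4.0 * length_m)
    return [(2 * n - 1) * fundamental for n in range(1, count + 1)]


if __name__ == "__main__":
    for length_cm in (17.0, 17.5, 18.0):
        freqs = closed_tube_resonances(length_cm / 100.0)
        print(f"L = {length_cm} cm:", [round(f) for f in freqs], "Hz")
```

For a 17.5 cm tube this gives a fundamental of about 490 Hz, with the next two resonances at three and five times that value, close to the rounded 500, 1500 and 2500 Hz quoted above.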
Voice Articulators
In order to produce distinguishable voice sounds, like vowel sounds, the vocal mechanism must control the resonances of the vocal tract which produce the characteristic vocal formants. If the vocal tract is considered to be a cavity resonator, then it can be seen that the position of the tongue, the area of opening of the mouth, and any changes which affect the volume of the cavity will retune the resonance.
Voice Articulators Details
Voice articulation is seen as the changes in the resonances of the vocal tract, and the agents of such changes can be called articulators. Movement of the tongue, pharynx, palate, jaw, or lips can change the basic factors which determine the frequency of cavity resonance (volume of cavity, area of opening, and port length).
Voice articulation produces sounds which are called vowels, diphthongs, semivowels, and nasals. Such sounds can be considered to be modifications of the basic vocal tract resonance, a kind of filtering of the acoustic spectrum of the voice mechanism.
While the resonances for most voiced sounds are in the pharyngeal and oral cavities, the nasal sounds /m/, /n/, and /ng/ require added resonance in the nasal cavity.
The Voice Mechanism
The voice mechanism involves the lungs and diaphragm as the power source, and the larynx, pharynx, mouth and nose. At the base of the short tubular larynx are the vocal folds, commonly called the vocal cords. The larynx opens into the pharynx during speech or singing, and is covered by the epiglottis during swallowing. The vocal tract acts as a resonator with frequencies which can be modulated by the articulators, forming the vocal formants which make vowel sounds recognizable.
Forming the Vowel Sounds
The vocal resonances are altered by the articulators to form distinguishable vowel sounds. The peaks in the vowel spectra are called vocal formants. Note the prominent role of the tongue in this process. The jaw position and lips also play a major part.

The sketches at left above are adapted from Gunnar Fant's "Acoustic theory of speech production" and are reportedly sketches taken from x-rays of the head during the production of these sounds. These are the vowels classified as IPA [a], [i], and [u] and roughly correlate with the vowels represented in the spectra from Benade. The emphasis should be on "roughly" since I don't know how close the correlation is. The intent here is to illustrate the role of the articulators and to point to the fact that their action has a major influence on the harmonic content of the voiced sounds. The normal ear is able to clearly distinguish those differences.
The Speech Chain
Examples of the changing shapes of the vocal mechanism in a process called articulation as it forms different vowel sounds.
Vocal Formants
The term formant refers to peaks in the harmonic spectrum of a complex sound which arise from some sort of resonance of the source. Because of their resonant origin, they tend to stay essentially the same when the frequency of the fundamental is changed. Formants in the sound of the human voice are particularly important because they are essential components in the intelligibility of speech. For example, the distinguishability of the vowel sounds can be attributed to the differences in their first three formant frequencies. Producing different vowel sounds amounts to retuning these formants within a general range of frequencies. Benade suggests the following ranges of frequencies for the formants of a male voice:
1st formant: 150-850 Hz
2nd formant: 500-2500 Hz
3rd formant: 1500-3500 Hz
4th formant: 2500-4800 Hz
The process of articulation determines the frequencies of the vocal formants. Sundberg has identified portions of the vocal anatomy which he associates with the formant frequencies. The jaw opening, which constricts the vocal tract toward the glottal end and expands it toward the lip end, is the deciding factor for the first formant. This formant frequency rises as the jaw is opened wider. The second formant is most sensitive to the shape of the body of the tongue, and the third formant is most sensitive to the tip of the tongue.
This is another example of vowel sounds produced at a frequency of 325 Hz by a female voice. It shows the display vs time on the left to show the waveforms, and the Fast Fourier Transform on the right to show the distinctive harmonic content of the vowels. There is considerable variation, but the ear acts as a harmonic analyzer and can easily distinguish and recognize these sounds.
Representative Vowel Formant Frequencies
Ladefoged lists representative vowel formant frequencies, averages from several U.S. speakers. The symbols used represent standard English phonemes.
The formant frequencies are keys to the distinguishability of the vowel sounds.
Same Vowel, Different Pitch
To explain how the ear can recognize a vowel sound as the same vowel, even though it is sounded at different pitches, the idea of vocal formants is invoked. This is data from Benade showing that an "Ah" vowel involves a similar envelope of harmonics when sounded at different frequencies.
Stemple, et al., report a mean fundamental frequency for male voices of 106 Hz with a range from 77 Hz to 482 Hz. For female voices the mean was 193 Hz with a range from 137 Hz to 634 Hz. These averages were based on the production of a sustained vowel /a/.
Same Vowel, Different Pitch Details
To explain how the ear can recognize a vowel sound as the same vowel, even though it is sounded at different pitches, the idea of vocal formants is invoked. This is data from Backus showing that an "EE" vowel involves a similar envelope of harmonics when sounded at different frequencies. Formants occur at about 300 Hz and about 2300 Hz for each sound.
Fundamental Frequencies for Speech
A number of studies of fundamental frequencies for speech have been conducted. Such frequencies are different for men and women and change with age. There are also differences between ethnic groups.
An interesting study of the speaking pitch of a group of women over a 48-year time span was made by Russell, Penny and Pemberton. They had high quality recordings from 28 young women between the ages of 18 and 25, made in 1945. They were able to find 15 of them in 1993 and recorded them reading the same passages. They found that the group mean speaking fundamental frequency in 1945 was 229.0 Hz and in 1993 was 181.2 Hz. From Russell, A., Penny, L. and Pemberton, C., Speaking fundamental frequency changes over time in women: A longitudinal study, Journal of Speech and Hearing Research 38, 101-109 (1995).
A study of the speaking frequency of different groups of men of different ages was conducted by Hollien & Shipp. They collected 25 men from each decade of age and asked all of them to read a specified paragraph from the same book. They concluded that there was a lowering of the speaking pitch through early and middle adulthood and then a rising frequency into later life. Hollien, H. and Shipp, T., Speaking fundamental frequency and chronological age in males, Journal of Speech and Hearing Research 15, 155-159 (1972).
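The change reported by Russell, Penny and Pemberton can be expressed as a musical interval using the cents measure (1200 cents per octave, 100 cents per equal tempered semitone). This is a small sketch; only the two mean frequencies come from the study as quoted above, and the function name is hypothetical.

```python
import math


def interval_cents(f_from_hz, f_to_hz):
    """Signed size of the interval from f_from_hz to f_to_hz in cents."""
    return 1200.0 * math.log2(f_to_hz / f_from_hz)


if __name__ == "__main__":
    change = interval_cents(229.0, 181.2)
    print(f"1945 to 1993: {change:.0f} cents "
          f"({change / 100.0:.1f} equal tempered semitones)")
```

The drop works out to about 405 cents, or roughly four semitones.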
Distinguishing Vowel Sounds

To explain how the ear can recognize different vowel sounds, the idea of vocal formants is invoked. This is a conceptualization only; no scaling to the inner ear was done. The place theory suggests that the ear distinguishes pitches based on the location of maximum excitation along the basilar membrane of the inner ear. So the ear acts as a sound analyzer which can detect differences in harmonic content by the different amounts of excitation at different places along the basilar membrane. Since sustained vowel sounds differ primarily in their harmonic content, this offers a mechanism by which the ear can distinguish them.
Measuring Sound
Audible Sound
Usually "sound" is used to mean sound which can be perceived by the human ear, i.e., "sound" refers to audible sound unless otherwise classified. A reasonably standard definition of audible sound is that it is a pressure wave with frequency between 20 Hz and 20,000 Hz and with an intensity above the standard threshold of hearing. Since the ear is surrounded by air, or perhaps under water, the sound waves are constrained to be longitudinal waves. Normal ranges of sound pressure and sound intensity may also be specified.
Speed of Sound in Air
The speed of sound in dry air is given approximately by
$$v_{sound} \approx 331.4 + 0.6\,T_C \ \text{m/s}$$
where $T_C$ is the Celsius temperature.
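A short sketch of the temperature dependence, assuming the common linear approximation v = 331.4 + 0.6 T_C m/s for dry air. For comparison it also evaluates the ideal-gas scaling with the square root of absolute temperature, normalized to the same value at 0 °C; both function names are hypothetical.

```python
import math


def speed_of_sound_linear(temp_c):
    """Linear approximation for dry air near ordinary temperatures (m/s)."""
    return 331.4 + 0.6 * temp_c


def speed_of_sound_ideal_gas(temp_c, v0=331.4):
    """Ideal-gas scaling: v is proportional to sqrt(absolute temperature)."""
    return v0 * math.sqrt((temp_c + 273.15) / 273.15)


if __name__ == "__main__":
    for t in (-20.0, 0.0, 20.0, 40.0):
        print(f"{t:6.1f} C: linear {speed_of_sound_linear(t):.1f} m/s, "
              f"ideal gas {speed_of_sound_ideal_gas(t):.1f} m/s")
```

Over everyday temperatures the two expressions agree to within a fraction of a meter per second, which is why the linear form is convenient.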
Sound Speed in Helium
The speed of sound in helium at 0°C is about 972 m/s, compared to 331 m/s in air. This is consistent with the general relationship for sound speed in gases since the density of helium is so much less than that of air.
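The comparison can be reproduced from the ideal-gas expression v = sqrt(γRT/M), in which a lighter gas (lower molar mass, hence lower density at the same pressure and temperature) gives a higher sound speed. The heat-capacity ratios and molar masses below are standard textbook values supplied for the example, not figures from this page.

```python
import math

GAS_CONSTANT = 8.314462618  # J/(mol K)


def ideal_gas_sound_speed(gamma, molar_mass_kg_mol, temp_k=273.15):
    """Speed of sound (m/s) in an ideal gas: sqrt(gamma * R * T / M)."""
    return math.sqrt(gamma * GAS_CONSTANT * temp_k / molar_mass_kg_mol)


if __name__ == "__main__":
    # Assumed textbook values: monatomic helium, diatomic-dominated dry air.
    helium = ideal_gas_sound_speed(5.0 / 3.0, 4.0026e-3)
    air = ideal_gas_sound_speed(1.4, 28.964e-3)
    print(f"helium: {helium:.0f} m/s, air: {air:.0f} m/s, "
          f"ratio {helium / air:.2f}")
```

At 0 °C this gives values close to the 972 m/s and 331 m/s quoted above, a ratio of nearly three.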
The high speed of sound is responsible for the amusing "Donald Duck" voice which occurs when someone has breathed in helium from a balloon. Note that if the vibration frequency of the vocal folds does not change, the actual pitch of the voice is not higher. The cavity resonances which determine the vocal formants would be raised by the higher sound speed, so the timbre of the voice would be different. It is possible for the pitch of the voice to change since gas dynamics (i.e., the Bernoulli effect) is partially responsible for the closing frequency of the vocal folds, but I haven't been able to find any data which demonstrates such a change.
The Angle of Attack for an Airfoil

While an airplane wing is one of the most popular examples of the Bernoulli effect, many discussions allege that the Bernoulli lift is actually a small part of the lift force which allows the aircraft to fly. You can argue that the main lift comes from the fact that the wing is angled slightly upward so that air striking the underside of the wing is forced downward. The Newton's 3rd law reaction force upward on the wing provides the lift. Increasing the angle of attack can increase the lift, but it also increases drag so that you have to provide more thrust with the aircraft engines.
Some pilots have been known to get a bit testy about their lift being attributed to the Bernoulli effect, and reply "Then how do you suppose we can fly the plane upside down?". It looks a bit tricky, but you can adjust the attitude of the aircraft when upside down to give the proper angle of attack to get lift.
The discussions of "Bernoulli vs Newton" continue, but aerodynamicists such as Eastlake take the point of view that they are ultimately equivalent models and that neither is incorrect. In his wind tunnel testing at the Department of Aerospace Engineering, Embry-Riddle Aeronautical University, the Bernoulli approach is preferred because it can be tested more readily with the type of measurements which can be made in a wind tunnel. Making numerous point measurements around the airfoil and summing (integrating) them in the context of a Bernoulli model gives consistent modeling of observed lift forces.
Cavity Resonance
An air cavity will exhibit a single resonant frequency. If extra air is pushed into the volume and then released, the pressure will drive it out. But, acting somewhat like a mass on a spring which is pulled down and then released, it will overshoot and produce a slight vacuum in the cavity. The air will oscillate into and out of the container for a few cycles at a natural frequency. Qualitatively, the resonant frequency rises with a larger opening area and falls with a larger cavity volume or a longer neck.

Actually the frequency depends upon the square root of these factors and also upon the speed of sound, as you can see in the actual calculation of the frequency. But the above illustration shows the physical factors that are involved in determining the resonant frequency.
Cavity Oscillations
The single frequency cavity resonance suggests a parallel with the single resonant frequency of a mass on a spring. In fact, the term "acoustic mass" is sometimes used in connection with such oscillations.
Thinking of the cavity resonance in terms of an oscillating mass of air can give some intuition about why the physical properties of the cavity affect the resonant frequency as they do. You can visualize the process of pushing extra air into the cavity to produce an overpressure. That overpressure represents stored energy in a way analogous to lifting a mass on a spring above its equilibrium position. If the opening to the cavity is larger, the excess air can escape more rapidly to bring the pressure down to atmospheric, and this leads to a higher cavity resonant frequency. If the neck of the cavity is longer, there is more resistance to the flow of the excess air and the resonant frequency is lowered. If the cavity volume is increased, then it takes a greater excess mass of air to produce a given overpressure, and it therefore takes longer for that excess pressure to be relieved. The larger cavity will have a lower resonant frequency.
Again visualizing the mass on a spring, we know that if we lift it from equilibrium and allow it to fall, it will not stop when it reaches that equilibrium point but will overshoot it and oscillate about equilibrium because the work we did to lift the mass put energy into the elastic system. Likewise, when we have done work to increase the pressure in a cavity, we have given it energy and as the air rushes out, it will overshoot the equilibrium (atmospheric pressure) and produce a slight vacuum in the cavity. This elastic system produces the cavity resonance, but it is highly damped and will not continue to oscillate like a mass on a spring.
Cavity Resonant Frequency
A quantitative analysis of the cavity resonance gives the frequency expression
$$f = \frac{v}{2\pi}\sqrt{\frac{A}{VL}}$$
where v is the speed of sound, A is the area of the opening, V is the volume of the cavity, and L is the length of the neck.
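A minimal sketch of the cavity calculation, assuming the standard Helmholtz resonator relation f = (v / 2π) · sqrt(A / (V L)) with v the speed of sound, A the opening area, V the cavity volume, and L the neck length. The bottle dimensions in the example are made-up round numbers, not the experimental values mentioned in the text.

```python
import math


def cavity_resonant_frequency(opening_area_m2, volume_m3, neck_length_m,
                              speed_of_sound=343.0):
    """Helmholtz resonance frequency (Hz): (v / 2 pi) * sqrt(A / (V * L))."""
    return (speed_of_sound / (2.0 * math.pi)) * math.sqrt(
        opening_area_m2 / (volume_m3 * neck_length_m))


if __name__ == "__main__":
    # Hypothetical bottle: 1 liter volume, 1 cm neck radius, 5 cm neck length.
    area = math.pi * 0.01 ** 2
    f = cavity_resonant_frequency(area, 1.0e-3, 0.05)
    print(f"resonant frequency ~ {f:.0f} Hz")
    # A larger volume or a longer neck lowers the frequency.
    print(f"double volume: {cavity_resonant_frequency(area, 2.0e-3, 0.05):.0f} Hz")
    print(f"double neck:   {cavity_resonant_frequency(area, 1.0e-3, 0.10):.0f} Hz")
```

Because of the square root, quadrupling the opening area doubles the frequency, while quadrupling the volume or the neck length halves it.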

Simple Harmonic Motion Frequency
The frequency of simple harmonic motion like a mass on a spring is determined by the mass m and the stiffness of the spring expressed in terms of a spring constant k (see Hooke's Law):
$$f = \frac{1}{2\pi}\sqrt{\frac{k}{m}}$$

Mass on Spring Resonance
A mass on a spring has a single resonant frequency determined by its spring constant k and the mass m. Using Hooke's law and neglecting damping and the mass of the spring, Newton's second law gives the equation of motion (taking y as the downward displacement from the unstretched position):
$$m\frac{d^2y}{dt^2} = mg - ky$$
The solution to this differential equation is of the form:
$$y = A\sin\omega t + B$$
which when substituted into the motion equation gives:
$$-m\omega^2 A\sin\omega t = mg - kA\sin\omega t - kB$$
Collecting terms gives B=mg/k, which is just the stretch of the spring by the weight, and the expression for the resonant vibrational frequency:
$$\omega = \sqrt{\frac{k}{m}}, \qquad f = \frac{\omega}{2\pi} = \frac{1}{2\pi}\sqrt{\frac{k}{m}}$$
This kind of motion is called simple harmonic motion and the system a simple harmonic oscillator.
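The result can be checked numerically. The sketch below assumes the displacement y is measured downward from the unstretched position, so that the equation of motion is m y'' = mg − ky with solution y = A sin(ωt) + B. It computes the frequency and the static stretch B = mg/k, then evaluates how well the solution satisfies the equation using a finite-difference second derivative. The mass, spring constant, and amplitude are illustrative values.

```python
import math


def natural_frequency_hz(k, m):
    """Resonant frequency f = (1 / 2 pi) * sqrt(k / m)."""
    return math.sqrt(k / m) / (2.0 * math.pi)


def equilibrium_stretch(m, k, g=9.81):
    """Static stretch B = m g / k of the spring under the weight."""
    return m * g / k


def displacement(t, amplitude, k, m, g=9.81):
    """Assumed solution y(t) = A sin(omega t) + B."""
    omega = math.sqrt(k / m)
    return amplitude * math.sin(omega * t) + equilibrium_stretch(m, k, g)


def equation_residual(t, amplitude, k, m, g=9.81, dt=1e-4):
    """m*y'' - (m*g - k*y), with y'' from a central difference; ~0 if y solves it."""
    y_minus = displacement(t - dt, amplitude, k, m, g)
    y = displacement(t, amplitude, k, m, g)
    y_plus = displacement(t + dt, amplitude, k, m, g)
    accel = (y_plus - 2.0 * y + y_minus) / dt ** 2
    return m * accel - (m * g - k * y)


if __name__ == "__main__":
    k, m, amplitude = 20.0, 0.5, 0.05   # N/m, kg, m (illustrative)
    print(f"f = {natural_frequency_hz(k, m):.3f} Hz, "
          f"B = {equilibrium_stretch(m, k):.3f} m")
    print("residual at t = 0.3 s:", equation_residual(0.3, amplitude, k, m))
```

The residual stays near zero, limited only by the finite-difference approximation.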
Mass on Spring: Motion Sequence
A mass on a spring will trace out a sinusoidal pattern as a function of time, as will any object vibrating in simple harmonic motion. One way to visualize this pattern is to walk in a straight line at constant speed while carrying the vibrating mass. Then the mass will trace out a sinusoidal path in space as well as time.

Energy in Mass on Spring
The simple harmonic motion of a mass on a spring is an example of an energy transformation between potential energy and kinetic energy. In the example below, it is assumed that 2 joules of work has been done to set the mass in motion.
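The energy exchange can be tabulated with a short sketch. It assumes the 2 joules are stored by displacing the mass from equilibrium and releasing it from rest, with displacement measured from the equilibrium position so that gravity drops out. The mass and spring constant are illustrative values not given in the text.

```python
import math


def energy_split(t, total_energy_j=2.0, k=50.0, m=0.5):
    """Return (potential, kinetic) energy in joules at time t.

    Motion: x = A cos(omega t), released from rest at maximum displacement,
    where A follows from total_energy = (1/2) k A^2.
    """
    omega = math.sqrt(k / m)
    amplitude = math.sqrt(2.0 * total_energy_j / k)
    x = amplitude * math.cos(omega * t)
    v = -amplitude * omega * math.sin(omega * t)
    return 0.5 * k * x ** 2, 0.5 * m * v ** 2


if __name__ == "__main__":
    period = 2.0 * math.pi / math.sqrt(50.0 / 0.5)
    for fraction in (0.0, 0.125, 0.25, 0.375, 0.5):
        pe, ke = energy_split(fraction * period)
        print(f"t = {fraction:.3f} T: PE = {pe:.3f} J, KE = {ke:.3f} J, "
              f"total = {pe + ke:.3f} J")
```

The potential and kinetic energies trade places every quarter period while their sum stays at the 2 joules put in.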

Newton
Newton's Laws
|
Newton's First Law
Newton's First Law states that an object will remain at rest or in uniform motion in a straight line unless acted upon by an external force. It may be seen as a statement about inertia, that objects will remain in their state of motion unless a force acts to change the motion. Any change in motion involves an acceleration, and then Newton's Second Law applies; in fact, the First Law is just a special case of the Second Law for which the net external force is zero.
Newton's First Law contains implications about the fundamental symmetry of the universe in that a state of motion in a straight line must be just as "natural" as being at rest. If an object is at rest in one frame of reference, it will appear to be moving in a straight line to an observer in a reference frame which is moving by the object. There is no way to say which reference frame is "special", so all constant velocity reference frames must be equivalent.
Centripetal Force Example
The string must provide the necessary centripetal force to move the ball in a circle. If the string breaks, the ball will move off in a straight line. The straight line motion in the absence of the constraining force is an example of Newton's first law. The example here presumes that no other net forces are acting, such as horizontal motion on a frictionless surface. The vertical circle is more involved.

|
Newton's Second Law
Newton's Second Law, $F_{net} = ma$, applies to a wide range of physical phenomena, but it is not a fundamental principle like the Conservation Laws. It is applicable only if the force is the net external force. It does not apply directly to situations where the mass is changing, either from loss or gain of material, or because the object is traveling close to the speed of light where relativistic effects must be included. It does not apply directly on the very small scale of the atom where quantum mechanics must be used.

Newton's Second Law Illustration
Newton's 2nd Law enables us to compare the results of the same force exerted on objects of different mass.
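A small sketch of that comparison, assuming the constant-mass form F = ma: given any two of force, mass, and acceleration, it returns all three. The function name and the sample numbers are hypothetical.

```python
def solve_second_law(force=None, mass=None, acceleration=None):
    """Given exactly two of F (N), m (kg), a (m/s^2), return (F, m, a) with F = m a."""
    known = [force is not None, mass is not None, acceleration is not None]
    if sum(known) != 2:
        raise ValueError("specify exactly two of force, mass, acceleration")
    if force is None:
        force = mass * acceleration
    elif mass is None:
        mass = force / acceleration
    else:
        acceleration = force / mass
    return force, mass, acceleration


if __name__ == "__main__":
    # The same 10 N net force applied to objects of different mass.
    for m in (1.0, 2.0, 5.0):
        _, _, a = solve_second_law(force=10.0, mass=m)
        print(f"m = {m} kg: a = {a} m/s^2")
```

The same net force gives an acceleration inversely proportional to the mass.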
Newton's Third Law
Newton's third law: All forces in the universe occur in equal but oppositely directed pairs. There are no isolated forces; for every external force that acts on an object there is a force of equal magnitude but opposite direction which acts back on the object which exerted that external force. In the case of internal forces, a force on one part of a system will be countered by a reaction force on another part of the system so that an isolated system cannot by any means exert a net force on the system as a whole. A system cannot "bootstrap" itself into motion with purely internal forces - to achieve a net force and an acceleration, it must interact with an object external to itself.
Without specifying the nature or origin of the forces on the two masses, Newton's 3rd law states that if they arise from the two masses themselves, they must be equal in magnitude but opposite in direction so that no net force arises from purely internal forces.
Newton's third law is one of the fundamental symmetry principles of the universe. Since we have no examples of it being violated in nature, it is a useful tool for analyzing situations which are somewhat counter-intuitive. For example, when a small truck collides head-on with a large truck, your intuition might tell you that the force on the small truck is larger. Not so! The forces on the two trucks are equal in magnitude and opposite in direction; the small truck suffers more because its smaller mass gives it a larger acceleration.
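The truck collision can be put into numbers with a short sketch. The contact force and the two masses below are made-up illustrative values, and the function name `collision_accelerations` is hypothetical; the only physics used is the third-law pairing of the forces together with a = F/m.

```python
def collision_accelerations(contact_force_n, small_mass_kg, large_mass_kg):
    """Return (a_small, a_large) for equal and opposite contact forces.

    The small truck feels -F and the large truck feels +F (third-law pair);
    each acceleration then follows from a = F / m.
    """
    force_on_small = -contact_force_n
    force_on_large = +contact_force_n
    return force_on_small / small_mass_kg, force_on_large / large_mass_kg

# Illustrative values: 60 kN contact force, 2000 kg and 12000 kg trucks.
a_small, a_large = collision_accelerations(60000.0, 2000.0, 12000.0)
print(a_small, a_large)
```

The forces have the same magnitude, but the small truck's acceleration is larger by the ratio of the masses.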
Newton's Third Law Example
Newton's third law can be illustrated by identifying the pairs of forces which are involved in supporting the blocks on the spring scale.

Presuming that the blocks are supported and in equilibrium, the net force on the system is zero. All the forces occur in Newton's third law pairs.
Conservation Laws
If a system does not interact with its environment in any way, then certain mechanical properties of the system cannot change. They are sometimes called "constants of the motion". These quantities are said to be "conserved" and the conservation laws which result can be considered to be the most fundamental principles of mechanics. In mechanics, examples of conserved quantities are energy, momentum, and angular momentum. The conservation laws are exact for an isolated system.
Stated here as principles of mechanics, these conservation laws have far-reaching implications as symmetries of nature which we do not see violated. They serve as a strong constraint on any theory in any branch of science.
Conservation of Momentum
The momentum of an isolated system is a constant. The vector sum of the momenta mv of all the objects of a system cannot be changed by interactions within the system. This puts a strong constraint on the types of motions which can occur in an isolated system. If one part of the system is given a momentum in a given direction, then some other part or parts of the system must simultaneously be given exactly the same momentum in the opposite direction. As far as we can tell, conservation of momentum is an absolute symmetry of nature. That is, we do not know of anything in nature that violates it.
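A one-dimensional recoil example makes the constraint concrete. This is a sketch under the assumption that the system starts at rest and splits into two parts; the function name `recoil_velocity` and the numbers are illustrative only.

```python
def recoil_velocity(m1_kg, v1_ms, m2_kg):
    """Velocity of part 2 when a system at rest gives part 1 velocity v1.

    Total momentum stays zero, so m1*v1 + m2*v2 = 0.
    """
    return -m1_kg * v1_ms / m2_kg

m1, v1, m2 = 2.0, 3.0, 4.0          # illustrative masses (kg) and velocity (m/s)
v2 = recoil_velocity(m1, v1, m2)
total_momentum = m1 * v1 + m2 * v2  # should remain zero
print(v2, total_momentum)
```

In two or three dimensions the same bookkeeping applies to each component of the momentum vector.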
Conservation of Energy
Energy can be defined as the capacity for doing work. It may exist in a variety of forms and may be transformed from one type of energy to another. However, these energy transformations are constrained by a fundamental principle, the Conservation of Energy principle. One way to state this principle is "Energy can neither be created nor destroyed". Another approach is to say that the total energy of an isolated system remains constant.
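As a simple check of the principle, the following sketch follows an object dropped from rest and compares its gravitational potential energy at the top with its kinetic energy at the bottom. It assumes no air resistance and g = 9.8 m/s^2; the mass, height, and function name are illustrative.

```python
import math

G = 9.8  # m/s^2, assumed acceleration of gravity

def impact_speed(height_m):
    """Speed after falling from rest through height_m, ignoring air resistance."""
    return math.sqrt(2 * G * height_m)

mass = 2.0    # kg, illustrative
height = 5.0  # m, illustrative
potential_at_top = mass * G * height        # m g h
speed = impact_speed(height)
kinetic_at_bottom = 0.5 * mass * speed**2   # (1/2) m v^2
print(potential_at_top, kinetic_at_bottom)
```

The two energies agree to within floating-point rounding: the energy has changed form, not amount.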
Conservation of Energy as a Fundamental Principle
The conservation of energy principle is one of the foundation principles of all science disciplines. In varied areas of science there will be primary equations which can be seen to be just an appropriate reformulation of the principle of conservation of energy.
Fluids: Bernoulli equation
Electric circuits: Voltage law
Heat and thermodynamics: First law of thermodynamics
Conservation of Angular Momentum
The angular momentum of an isolated system remains constant in both magnitude and direction. The angular momentum is defined as the product of the moment of inertia I and the angular velocity. The angular momentum is a vector quantity and the vector sum of the angular momenta of the parts of an isolated system is constant. This puts a strong constraint on the types of rotational motions which can occur in an isolated system. If one part of the system is given an angular momentum in a given direction, then some other part or parts of the system must simultaneously be given exactly the same angular momentum in the opposite direction. As far as we can tell, conservation of angular momentum is an absolute symmetry of nature. That is, we do not know of anything in nature that violates it.
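A familiar illustration is a spinning object that changes its moment of inertia, such as a skater pulling in the arms. The sketch below assumes rotation about a fixed axis with no external torque, so that I1 ω1 = I2 ω2; the function name and the values are illustrative.

```python
def final_angular_velocity(i_initial, omega_initial, i_final):
    """Return omega_final from conservation of L = I * omega (fixed axis)."""
    if i_final <= 0:
        raise ValueError("moment of inertia must be positive")
    return i_initial * omega_initial / i_final

# Illustrative values: I drops from 4.0 to 1.0 kg m^2 at 2.0 rad/s.
omega_final = final_angular_velocity(4.0, 2.0, 1.0)
print(omega_final)
```

Reducing the moment of inertia by a factor of four raises the angular velocity by the same factor, leaving the angular momentum unchanged.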
An Isolated System
An isolated system implies a collection of matter which does not interact with the rest of the universe at all - and as far as we know there are really no such systems. There is no shield against gravity, and the electromagnetic force is infinite in range. But in order to focus on basic principles, it is useful to postulate such a system to clarify the nature of physical laws. In particular, the conservation laws can be presumed to be exact when referring to an isolated system:
Conservation of Energy: the total energy of the system is constant.
Conservation of Momentum: the mass times the velocity of the center of mass is constant.
Conservation of Angular Momentum: the total angular momentum of the system is constant.
Newton's Third Law: No net force can be generated within the system since all internal forces occur in opposing pairs. The acceleration of the center of mass is zero.
Physics
Bernoulli Equation
The Bernoulli Equation can be considered to be a statement of the conservation of energy principle appropriate for flowing fluids. The qualitative behavior that is usually labeled with the term "Bernoulli effect" is the lowering of fluid pressure in regions where the flow velocity is increased. This lowering of pressure in a constriction of a flow path may seem counterintuitive, but seems less so when you consider pressure to be energy density. In the high velocity flow through the constriction, kinetic energy must increase at the expense of pressure energy.
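The pressure drop in a constriction can be estimated with a short sketch. It assumes steady, incompressible, frictionless flow in a horizontal pipe, so that P1 + (1/2)ρv1^2 = P2 + (1/2)ρv2^2, and it uses the continuity condition A1 v1 = A2 v2, which is an added assumption not stated in the text. The function name and the numbers are illustrative.

```python
def constriction_flow(p1_pa, v1_ms, area1, area2, density_kg_m3):
    """Return (v2, p2) in a horizontal constriction.

    Continuity (assumed): area1 * v1 = area2 * v2, areas in any common unit.
    Bernoulli: p1 + 0.5*rho*v1**2 = p2 + 0.5*rho*v2**2.
    """
    v2 = v1_ms * area1 / area2
    p2 = p1_pa + 0.5 * density_kg_m3 * (v1_ms**2 - v2**2)
    return v2, p2

# Illustrative: water (1000 kg/m^3) at 2 m/s entering a section with 1/4 the area.
v2, p2 = constriction_flow(100000.0, 2.0, 4.0, 1.0, 1000.0)
print(v2, p2)
```

The flow speeds up in the narrow section and the pressure there is lower, as the energy-density picture suggests.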

Pressure
Pressure is defined as force per unit area. It is usually more convenient to use pressure rather than force to describe the influences upon fluid behavior. The standard unit for pressure is the pascal (Pa), which is one newton per square meter.
For an object sitting on a surface, the force pressing on the surface is the weight of the object, but in different orientations it might have a different area in contact with the surface and therefore exert a different pressure.
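The effect of orientation can be sketched for a rectangular block. The weight and the dimensions (0.2 m by 0.1 m by 0.05 m) are illustrative assumptions, as is the function name `pressure_pa`.

```python
def pressure_pa(force_n, area_m2):
    """Return pressure in pascals: force per unit area."""
    if area_m2 <= 0:
        raise ValueError("area must be positive")
    return force_n / area_m2

weight = 20.0  # N, illustrative weight of the block
faces = {
    "large face": 0.2 * 0.1,   # m^2
    "side": 0.2 * 0.05,        # m^2
    "end": 0.1 * 0.05,         # m^2
}
pressures = {name: pressure_pa(weight, area) for name, area in faces.items()}
for name, p in pressures.items():
    print(f"{name}: {p:.0f} Pa")
```

The force is the same in every orientation; only the contact area, and therefore the pressure, changes.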

Mass and Weight
The mass of an object is a fundamental property of the object; a numerical measure of its inertia; a fundamental measure of the amount of matter in the object. Definitions of mass often seem circular because it is such a fundamental quantity that it is hard to define in terms of something else. All mechanical quantities can be defined in terms of mass, length, and time. The usual symbol for mass is m and its SI unit is the kilogram. While the mass is normally considered to be an unchanging property of an object, at speeds approaching the speed of light one must consider the increase in the relativistic mass.
The weight of an object is the force of gravity on the object and may be defined as the mass times the acceleration of gravity, w = mg. Since the weight is a force, its SI unit is the newton. Density is mass/volume.

Weight
The weight of an object is defined as the force of gravity on the object and may be calculated as the mass times the acceleration of gravity, w = mg. Since the weight is a force, its SI unit is the newton.
For an object in free fall, gravity is the only force acting on it, so the expression for weight follows from Newton's second law: F = ma with a = g gives w = mg.

You might well ask, as many do, "Why do you multiply the mass times the free-fall acceleration of gravity when the mass is sitting at rest on the table?". The value of g allows you to determine the net gravity force that would act on the mass if it were in free fall, and that net gravity force is the weight. Another approach is to consider "g" to be the measure of the intensity of the gravity field in newtons/kg at your location. You can view the weight as the mass in kg times the intensity of the gravity field, 9.8 newtons/kg under standard conditions.
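Treating g as a field intensity in N/kg makes the calculation a one-liner. In this sketch the function name is hypothetical, and the lunar value of about 1.62 N/kg is an approximate figure brought in only for comparison.

```python
def weight_newtons(mass_kg, g_n_per_kg=9.8):
    """Return weight w = m * g, with g as gravity field intensity in N/kg."""
    return mass_kg * g_n_per_kg

w_earth = weight_newtons(10.0)                  # standard conditions, 9.8 N/kg
w_moon = weight_newtons(10.0, g_n_per_kg=1.62)  # approximate lunar surface value
print(w_earth, w_moon)
```

The mass is the same in both cases; the weight differs because the field intensity differs.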
Elastic Potential Energy
Elastic potential energy is potential energy stored as a result of deformation of an elastic object, such as the stretching of a spring. It is equal to the work done to stretch the spring, which depends upon the spring constant k as well as the distance stretched. According to Hooke's law, the force required to stretch the spring will be directly proportional to the amount of stretch.
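Since the spring force has the form
F = -kx
the work done against it to stretch the spring a distance x is
W = (1/2) k x^2
and this work is stored as the elastic potential energy of the spring.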

Spring Potential Energy
Since the change in potential energy of an object between two positions is equal to the work that must be done to move the object from one point to the other, the calculation of potential energy is equivalent to calculating the work. Since the force required to stretch a spring changes with distance, the calculation of the work involves an integral.
The work can also be visualized as the area under the force curve.
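The integral can be checked numerically by adding up the area under the force curve F = kx in thin strips and comparing it with the closed form (1/2) k x^2. This is a sketch using the midpoint rule; the function names, spring constant, and stretch are illustrative assumptions.

```python
def spring_work_numeric(k_n_per_m, stretch_m, steps=1000):
    """Approximate the work integral of k*x dx from 0 to stretch_m (midpoint rule)."""
    dx = stretch_m / steps
    return sum(k_n_per_m * (i + 0.5) * dx * dx for i in range(steps))

def spring_potential_energy(k_n_per_m, stretch_m):
    """Closed form: PE = (1/2) k x^2."""
    return 0.5 * k_n_per_m * stretch_m**2

k, x = 200.0, 0.1  # N/m and m, illustrative
print(spring_work_numeric(k, x), spring_potential_energy(k, x))
```

Because the force is linear in x, the strips add up to the area of a triangle, and the two results agree to within rounding.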
Kinetic Energy
Kinetic energy is energy of motion. The kinetic energy of an object is the energy it possesses because of its motion. The kinetic energy* of a point mass m moving with speed v is given by
KE = (1/2) m v^2
Kinetic energy is an expression of the fact that a moving object can do work on anything it hits; it quantifies the amount of work the object could do as a result of its motion. The total mechanical energy of an object is the sum of its kinetic energy and potential energy.
For an object of finite size, this kinetic energy is called the translational kinetic energy of the mass to distinguish it from any rotational kinetic energy it might possess - the total kinetic energy of a mass can be expressed as the sum of the translational kinetic energy of its center of mass plus the kinetic energy of rotation about its center of mass.
*This assumes that the speed is much less than the speed of light. If the speed is comparable with c, then the relativistic kinetic energy expression must be used.
Kinetic Energy Concept
Kinetic energy is energy of motion. The kinetic energy of an object is the energy it possesses because of its motion. The kinetic energy of a point mass m moving with speed v is given by
KE = (1/2) m v^2
Energy as the capacity for doing work is a convertible currency. To give something kinetic energy you must do work on it. This development uses the concept of work as well as Newton's second law and the motion equations. It is a special case of the work-energy principle, a powerful general principle of nature.
More Detail on Kinetic Energy Concept
Kinetic energy is energy of motion. The kinetic energy of an object is the energy it possesses because of its motion. The kinetic energy of a point mass m moving with speed v is given by
KE = (1/2) m v^2
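One way to see where the expression for the kinetic energy of a point mass comes from is to work out the special case of a constant net force. The following is a sketch under that assumption: a force F acts on a mass m that starts from rest and moves a distance d, reaching speed v. The symbols F, a, and d are introduced here for illustration. It uses the definition of work, Newton's second law, and the constant-acceleration motion equation v^2 = 2ad.

```latex
\begin{aligned}
W &= F d = m a d \\
v^2 &= 2 a d \quad\Rightarrow\quad a d = \tfrac{1}{2} v^2 \\
W &= \tfrac{1}{2} m v^2 = KE
\end{aligned}
```

The work done on the mass equals the kinetic energy it gains, which is the work-energy principle for this simple case.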

Center of Mass
The terms "center of mass" and "center of gravity" are used synonymously in a uniform gravity field to represent the unique point in an object or system which can be used to describe the system's response to external forces and torques. The concept of the center of mass is that of an average of the masses weighted by their distances from a reference point. In one plane, that is like the balancing of a seesaw about a pivot point with respect to the torques produced.
If you are making measurements from the center of mass point for a two-mass system, then the center of mass condition can be expressed as
m1 r1 = m2 r2
where r1 and r2 are the distances of the two masses from the center of mass. The center of mass lies on the line connecting the two masses.
Center of Mass for Particles
The center of mass is the point at which all the mass can be considered to be "concentrated" for the purpose of calculating the "first moment", i.e., mass times distance. For two masses this distance is calculated from
x_cm = (m1 x1 + m2 x2) / (m1 + m2)
For the more general collection of N particles this becomes
x_cm = (m1 x1 + m2 x2 + ... + mN xN) / M, where M = m1 + m2 + ... + mN is the total mass
and when extended to three dimensions:
r_cm = (m1 r1 + m2 r2 + ... + mN rN) / M, where ri is the position vector of the ith mass
This approach applies to discrete masses even if they are not point masses if the position xi is taken to be the position of the center of mass of the ith mass. It also points the way toward the calculation of the center of mass of an extended object.
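The mass-weighted average for discrete masses translates directly into code. This is a minimal sketch; the function name `center_of_mass` and the sample masses and positions are illustrative, and positions are given as (x, y, z) tuples.

```python
def center_of_mass(masses, positions):
    """Return the mass-weighted average position of a set of discrete masses.

    masses: sequence of masses; positions: sequence of equal-length coordinate tuples.
    """
    if len(masses) != len(positions):
        raise ValueError("need one position per mass")
    total_mass = sum(masses)
    dims = len(positions[0])
    return tuple(
        sum(m * p[d] for m, p in zip(masses, positions)) / total_mass
        for d in range(dims)
    )

# Illustrative: 1 kg at the origin and 3 kg at (4, 0, 8) metres.
cm = center_of_mass([1.0, 3.0], [(0.0, 0.0, 0.0), (4.0, 0.0, 8.0)])
print(cm)
```

The result lies on the line joining the two masses, three quarters of the way toward the heavier one.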
Center of Mass: Continuous
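For a continuous distribution of mass, the expression for the center of mass of a collection of particles,
x_cm = (m1 x1 + m2 x2 + ... + mN xN) / M
becomes an infinite sum and is expressed in the form of an integral
x_cm = (1/M) ∫ x dm
For the case of a uniform rod of mass M and length L, measured from one end with dm = (M/L) dx, this becomes
x_cm = (1/M) ∫ x (M/L) dx = L/2, with the integral taken from x = 0 to x = L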
This example of a uniform rod previews some common features about the process of finding the center of mass of a continuous body. Continuous mass distributions require calculus methods involving an integral over the mass of the object. Such integrals are typically transformed into spatial integrals by relating the mass to a distance, as with the linear density M/L of the rod. Exploiting symmetry can give much information: e.g., the center of mass will be on any rotational symmetry axis. The use of symmetry would tell you that the center of mass is at the geometric center of the rod without calculation.
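The transformation of the mass integral into a spatial integral can be sketched numerically for a rod with a given linear density. The function name `rod_center_of_mass` is hypothetical, and the second case (a density that grows in proportion to x) is an added illustration that is not part of the rod example in the text; its exact answer is 2L/3.

```python
def rod_center_of_mass(linear_density, length_m, steps=10000):
    """Approximate x_cm = (integral of x*lambda dx) / (integral of lambda dx).

    linear_density: function giving mass per unit length at position x.
    Uses the midpoint rule on the interval 0..length_m.
    """
    dx = length_m / steps
    moment = 0.0
    mass = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        dm = linear_density(x) * dx
        moment += x * dm
        mass += dm
    return moment / mass

L = 2.0  # m, illustrative rod length
uniform = rod_center_of_mass(lambda x: 3.0, L)       # constant M/L
tapered = rod_center_of_mass(lambda x: 3.0 * x, L)   # density proportional to x
print(uniform, tapered)
```

For the uniform rod the result is the geometric center, as symmetry requires; for the tapered rod the center of mass shifts toward the denser end.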














































