HISTORY
A good place to start examining the history of speech sound
analysis is a little more than one hundred years ago, with Alexander Melville Bell, who
developed a visual representation of the spoken word. This visual display of the spoken
word conveyed much more information about the pronunciation of that word than the
dictionary spelling could ever suggest. His depiction of speech sounds demonstrated the
subtle differences with which different people pronounced the same words. This system of
speech sound analysis developed by Bell is the phonetic alphabet which he called
"visible speech".5 His method of encoding the great variety of speech sounds was
by handwritten symbols and was language independent. This code produced a visual
representation of speech which could convey to the eye the subtle differences in how
words were spoken. This system was used by both Bell and his son, Alexander Graham Bell,
in helping deaf people learn to speak.6
It was in the early 1940's that a new method of speech
sound analysis was developed. Potter, Kopp & Green, working for Bell Laboratories in
Murray Hill, New Jersey, began work on a project to develop a visual representation of
speech using a sound spectrograph. This machine, an automatic sound wave analyzer,
produced a visual record of speech portraying three parameters: frequency, intensity and
time. This research was intensified during World War II when acoustic scientists suggested
that enemy radio voices could be identified by the spectrograms produced by the sound
spectrograph. The war ended before the technique could be perfected.
In 1947, Potter, Kopp and Green published their work in a
book, the title of which was borrowed from Alexander Melville Bell, Visible Speech. Their
work is a comprehensive study of speech spectrograms designed to linguistically interpret
visible speech sound patterns. This work was similar to Bell's in that speech
sounds were encoded into a visual form. The difference is that, instead of a pen, Potter, Kopp
and Green used a sound spectrograph to produce the visual patterns.
Research in the area of speaker identification slowed
dramatically with the end of World War II. It was not until the late 1950's and early
1960's that the research began again. It was at this time the New York City Police
Department was receiving a large number of telephone bomb threats to the airlines.7 At
that time Bell Laboratories was asked by law enforcement officers to provide assistance in
the apprehension of the individuals making the telephone calls. The task of developing a
reliable method of identification of a speaker's voice was given to Lawrence G. Kersta, a
physicist at Bell Laboratories who had worked on the early experiments using the sound
spectrograph. In two years Kersta had developed a method of identification for which he
reported a correct identification in 99.65% of all attempts.8
It was in 1966 that the Michigan State Police began the
practical application of the voice identification method in attempting to solve criminal
cases. A Voice Identification unit was established and the unit personnel received
training from Kersta and other speech scientists. During the first few years the voice
identification method was used only as an investigative aid.
The first published court opinion to rule on the
admissibility of voice identification analysis came in the case of United States v. Wright,
17 USCMA 183, 37 CMR 447 (1967). This was a court martial proceeding in which the
appellate court affirmed the admission of spectrographic voice identification evidence by
the board of review. The lengthy dissent by Judge Ferguson based on the requirements for
acceptance of scientific evidence spelled out in Frye v. United States, 293 Fed. 1013 (CA
DC Cir) (1923), was the beginning of a controversy which continues today.
The first non-military court to review the admissibility of
voice identification evidence was the New Jersey Supreme Court in State v. Cary.9 In this
case the court stated that "the physical properties of a person's voice are
identifying characteristics".10 The court also noted that trial courts in the states
of New York and California had admitted voice identification evidence but that these
admissions had not been the subject of appellate review.11 The court declined to rule on the
admissibility issue and remanded the case to determine if the equipment and technique were
sufficiently accurate to provide results admissible as evidence. The Superior Court of New
Jersey, on appeal from a denial of admission after remand, held that the majority of
evidence "indicates, not that the technique is not accurate and reliable, but rather
that it is just too early to tell and at this time lacks the required scientific
acceptance".12 The New Jersey Supreme Court reviewed this decision and once again
remanded for additional fact finding "in light of the far-reaching implications of
admission of voiceprint evidence".13 The State of New Jersey was unable "to
furnish any new and significant evidence" by the third time the New Jersey Supreme
Court reviewed this issue and as such affirmed the trial court's opinion excluding voice
identification evidence.14
California came to a similar holding when the issue first
reached the appellate level in People v. King.15 The State brought in Lawrence Kersta as
the voice identification expert to testify as to the reliability of the technique. The
defense brought in seven speech scientists and engineers to rebut Kersta's claims. The
court held that "Kersta's claims for the accuracy of the `voiceprint' process are
founded on theories and conclusions which are not yet substantiated by accepted methods of
scientific verification".16 The court cited the Frye test as the proper standard for
admissibility.17 The court also left the door open for future admission by saying that when
voice identification evidence has achieved the necessary degree of acceptance the court will
welcome its use.18
In State ex rel. Trimble v. Heldman 19, the Supreme Court
of Minnesota held that "spectrograms ought to be admissible at least for the purpose
of corroborating opinions as to identification by means of ear alone".20 The court
was impressed by the testimony of Dr. Oscar Tosi who had previously testified against the
use of spectrographic voice identification evidence in courtrooms, but after extensive
research and experimentation now described the technique as "extremely
reliable".21 The court made reference to the Frye test and to the scientific
community's acceptance of Dr. Tosi's study, but did not specifically apply the Frye test
as the standard for the admissibility of the voice identification evidence.22 In
discussing the issue of admissibility the court held that it was the job of the factfinder
to weigh the credibility of the evidence.
"The opinion of an expert is admissible, if at all,
for the purpose of aiding the jury or the factfinder in a field where he has no particular
knowledge or training. The weight and credibility to be given to the opinion of an expert
lies with the factfinder. It is no different in this field than in any other".23
In 1972 the third and fourth District Courts of Florida, in
separate opinions, held admissible the use of spectrographic voice identification
evidence.24 The court in Worley held that the voice identification evidence was admissible
to corroborate the defendant's identification by other means. The court stated that the
technique had attained the necessary level of scientific reliability required for
admission, but since it was only offered as corroborative evidence, the court refused to
comment as to whether such evidence alone would be sufficient to sustain the
identification and conviction.25
The Third District Court of Appeals of Florida did not
limit the admission of spectrograph evidence to corroborative status. In the Alea opinion
the court did not mention the Frye test as the standard to be used for admission, but
rather stated that "such testimony is admissible to establish the identity of a
suspect as direct and positive proof, although its probative value is a question for the
jury".26
In the case of State v. Andretta 27, the New Jersey Supreme
Court stated that there was much more support for the admission of spectrographic voice
identification evidence than at the time it decided Cary, but refused to address the
issue further since the only issue before it was whether the defendant should be
compelled to speak for a spectrographic voice analysis.28
In California the Court of Appeal affirmed the trial
court's admission of voice identification evidence in the case of Hodo v. Superior
Court.29 Here the court found the requirements of Frye had been met in that there was now
general acceptance of spectrographic voice identification by recognized experts in the
field. The court cited Dr. Tosi's testimony that "those who really are familiar with
spectrography, they are accepting the technique".30 Tosi also pointed out that the
general population of speech scientists is not familiar with this technique and thus
cannot form an opinion on it.31
The court in United States v. Samples 32 held that the Frye
test of general acceptance precludes too much relevant evidence for purposes of the fact
determining process at a revocation of probation hearing and the court allowed the use of
spectrographic voice identification evidence to corroborate other identification
evidence.33
In 1974 the case of United States v. Addison 34 rejected
the admission of voice identification evidence saying that such evidence "is not now
sufficiently accepted" and as such the requirements of the Frye test were not met.35
At the trial the court heard from two experts endorsing the technique, Dr. Tosi and a
recent convert to the reliability of the technique, Dr. Ladefoged. Only one expert, Dr.
Stuart, testified that he was still skeptical of the technique and thought that most of
the scientific community was also.36 Although the trial court's admission of spectrographic voice
identification evidence was held to be error, the appellate court
refused to overturn the conviction due to the overwhelming amount of other evidence supporting
the conviction.37
Attempted disguise or mimicry were the grounds the California
Court of Appeal used to reverse a conviction based in part on spectrographic voice
identification in the case of People v. Law.38 The court found that "with respect to
disguised and mimicked voices in particular, the prosecution did not carry out its burden
of proof to demonstrate that the scientific principles pertaining to spectrographic
identification were beyond the experimental and into the demonstrable stage or that the
procedure was sufficiently established to have gained general acceptance in the particular
field in which it belongs".39 The main concern of the court was that no
experimentation had been completed studying the effects of attempts to disguise or mimic
on the accuracy of the identification process. Without mentioning the Frye test, this court
used the standards set in Frye as the test of admissibility, although the court seemed to
be limiting the scope of the opinion to cases involving disguise or mimicry.
In United States v. Franks 40, the Sixth Circuit Court of
Appeals held spectrographic voice identification evidence to be admissible. The court said
it was "mindful of a considerable area of discretion on the part of the trial judge
in admitting or refusing to admit evidence based on scientific processes".41 Quoting
from United States v. Stifel 42, the court pointed out that "neither newness nor lack
of absolute certainty in a test suffices to render it inadmissible in court. Every useful
new development must have its first day in court. And court records are full of the
conflicting opinions of doctors, engineers and accountants...".43 The court in Franks
found that the qualifications of the experts had received extensive review and that there had been opportunity
to cross-examine the experts to determine the proper weight to be given such evidence.
The Massachusetts Supreme Court, in Commonwealth v. Lykus
44, allowed the admission of spectrographic voice identification evidence saying that the
opinions of a qualified expert should be received and the considerations similar to those
expressed in Frye should be for the fact finder as to the weight and value of the
opinions. The court gave greater weight to those experts who had had direct and empirical
experience in the field as opposed to those who had only performed a theoretical review of
that work.45 The court also stated that "neither infallibility nor unanimous
acceptance of the principle need be proved to justify its admission into evidence".46
The Massachusetts Supreme Court again, that same year, found no error in the use of
spectrographic voice identification evidence in the case of Commonwealth v. Vitello.47
The Fourth Circuit Court of Appeals, in the case of United
States v. Baller 48, allowed the admission of spectrographic voice identification evidence
saying unless it is prejudicial or misleading to the jury, it is better to admit relevant
scientific evidence in the same manner as other expert testimony and allow its weight to
be attacked by cross-examination and refutation.49 The court listed six reasons supporting
admission: the expert was a qualified practitioner, evidence in voir dire demonstrated
probative value, competent witnesses were available to expose limitations, the defense
demonstrated competent cross-examination, the tape recordings were played for the jury,
and the jury was told they could disregard the opinion of the voice identification
expert.50
Voice identification evidence was admitted by the Sixth
Circuit Court of Appeals in United States v. Jenkins 51 using the same logic as in Baller.
Here the court said that the issue of admissibility was within the discretion of the trial
judge and that once a proper foundation had been laid the trier of fact was able to assign
proper weight to the evidence.52
In 1976 the New York Supreme Court pointed out, in the case
of People v. Rogers 53, that fifty different trial courts had admitted spectrographic
voice identification evidence, as had fourteen out of fifteen U.S. District Court judges,
and only two out of thirty-seven states considering the issue had rejected admission.54
The Rogers court stated that this technique, when accompanied by aural examination and
conducted by a qualified examiner, had now reached the level of general scientific
acceptance by those who would be expected to be familiar with its use, and as such had
reached the level of scientific acceptance and reliability necessary for admission.55 The
court also pointed out that other scientific evidence processes are regularly admitted
which are as reliable as, or less reliable than, spectrographic voice identification: hair and fiber
analysis, ballistics, forensic chemistry and serology, and blood alcohol tests.56
The Supreme Court of California finally put an end to the
see-saw ride of admissibility in that state in People v. Kelly 57 by rejecting admission
because of insufficient showing of support. "Although voiceprint analysis may indeed
constitute a reliable and valuable tool in either identifying or eliminating suspects in
criminal cases, that fact was not satisfactorily demonstrated in this case".58 In
this case the court seemed to have the most trouble with the fact that the only expert provided
to lay the foundation for admission was the technician who performed the analysis, saying
that a single witness cannot attest to the views of the scientific community on this new
technique and that this witness, who may not be capable of a fair and impartial evaluation
of the technique since he has built a career on it, lacked the academic credentials to
express an opinion as to the acceptance of the technique by the scientific community.59
In United States v. McDaniel 60, it appears that the District
of Columbia Circuit Court of Appeals would have liked to admit the spectrographic voice
identification evidence but had to reject it because the shadow of the Addison decision of
two years past "looms over our consideration of this issue".61 The court held
the admission of the voice identification evidence to be harmless error in that the rest
of the evidence was overwhelming. The court did recognize the trend toward admissibility
and contemplated that it may be time to reexamine the holding of Addison "in light of
the apparently increased reliability and general acceptance in the scientific
community".62
The Supreme Court of Pennsylvania rejected admission in
Commonwealth v. Topa 63 holding that the technician's opinion alone will not suffice to
permit the introduction of scientific evidence into a court of law.64 This was the same
situation, in fact the same single expert, which confronted the Kelly court.
In People v. Tobey 65 the Michigan Supreme Court found, by
applying the Frye test, that the trial court erred in admitting spectrographic voice
identification evidence. The court found that neither of the two experts testifying in
favor of the technique could be called disinterested and impartial experts in that both
had built their reputations and careers on this type of work.66 The court pointed out that
not all courts require independent and impartial proof of general scientific acceptability
and was quick to add that this decision was not intended in any way to foreclose the
introduction of such evidence in future cases where there is demonstrated solid scientific
approval and support of this new method of identification.67
In admitting voice identification evidence, the United
States District Court for the Southern District of New York, in United States v. Williams
68, found that the requirements of the Frye test were met when the technique was performed
"by aural comparison and spectrographic analysis".69 The court stated that the
concerns of the defendant that this technique had a mystique of scientific precision which
may mask the ultimate subjectivity of spectrographic analysis, although they were valid
concerns, could be alleviated by action other than suppression of the evidence, such as
opposing expert opinion and jury instructions allowing the jury to determine the weight,
if any, of the evidence.70
In People v. Collins 71, the Supreme Court of New York
rejected admission of spectrographic voice identification evidence saying that the Frye
test alone was insufficient to determine admissibility and must be used in conjunction
with a test of reliability.72 The court found that the proponents of the technique were in
the minority and that the remainder of the relevant scientific community either expressed
opposition or expressed no opinion.73
In Brown v. United States 74, the District of Columbia
Court of Appeals rejected the use of voice identification evidence, but held the error to
be harmless and affirmed the conviction in light of overwhelming non-spectrographic
identification of the defendant as perpetrator of the crime. One of the main problems in
this case was the fact that the exemplar of the defendant's voice was recorded in a
defective manner but used anyway after the tape speed malfunction had been corrected in a
laboratory. Dr. Tosi, testifying as a proponent of the technique, stated that the
technician should not have used the defective recording as a basis of comparison.75 The
court held the technique was not shown to be sufficiently reliable and accepted within the
scientific community to permit its use in this criminal case, but that this decision did
not foreclose a future decision as to admissibility of the technique.76
In the civil case of D'Arc v. D'Arc 77, the court found
that the requirements of the Frye test had not been met and thus the evidence could not be
admitted. The court believed that even with proper instructions to the contrary, this type
of evidence "has the potentiality to be assumed by many jurors as being conclusive
and dispositive" and thus should be subject to strict standards of admission.78
The court in State v. Williams 79 refused to apply the Frye
standard citing instead the Maine Rules of Evidence, Rule 401, which states "all
relevant evidence is admissible", with relevant being described as evidence having
any tendency to make the existence of any fact that is of consequence to the determination
of the action more probable or less probable than it would be without the evidence.80
In Reed v. State 81 the court applied the Frye standard to
determine admissibility with a rather wide definition of the scientific community which
included "those whose scientific background and training are sufficient to allow them
to comprehend and understand the process and form a judgment about it".82 The court
said the trial court erred in using the more restricted definition of scientific
community, "those who are knowledgeable, directly knowledgeable through work,
utilization of the techniques, experimentation and so forth", rather than the
broad general scientific community of speech and hearing science.83
In a fifty-one page dissent to the Reed decision 84, Judge
Smith points out that the Frye standard is much criticized and has never been adopted in
the state of Maryland, that this decision is out of step with other courts on related
issues of fingerprints, ballistics, x-rays and the like, that this decision is out of step
with prior Maryland holdings on expert testimony, that the majority of reported opinions
have accepted such evidence, and that even if Frye were applicable it is satisfied.
In United States v. Williams 85 the court did not apply the
Frye standard but did note that acceptance of the technique appeared strong among
scientists who had worked with spectrograms and weak among those who had not.86 The court
then focused on the reliability of the technique and the tendency to mislead. As to the
reliability of the technique, the court noted the small error rate, 2.4% false
identification, the existence and maintenance of standards of analysis, and the
conservative manner in which the technique was applied.87 As to the tendency to mislead,
the court felt that adequate precautions were taken in that the jury could view the
spectrograms and listen to the recording; the expert's qualifications and the reliability
of the equipment and the technique were subject to scrutiny by the defense; and the jury
was instructed that they were free to disregard the testimony of the experts.88
In the case of People v. Bein 89 the court based
admissibility on a two-pronged test: general acceptance by the relevant scientific
community, and competent expert testimony establishing reliability of the process. The
court found that both tests had been met and allowed the admission of the evidence.90 The
court described the relevant scientific community "to be that group of scientists who
are concerned with the problems of voice identification for forensic and other
purposes".91 The court also suggested that "it is no different in this field of
expertise than in other fields, that where experts disagree, it is for the finder of fact
to determine which testimony is the more credible and therefore more acceptable".92
The Ohio Supreme Court, in State v. Williams 93, relied on
its own state rules of evidence, as did the Maine court in Williams, and rejected the
use of the Frye standard. The court refused "to engage in scientific nose counting
for the purpose of whether evidence based on newly ascertained or applied scientific
principles is admissible".94 The court noted, with approval, the playing of the
recordings to the jury and the fact that the jury was free to reject the testimony of the
expert.95
In that same year, right across the border in Indiana, the
court in Cornett v. State 96 rejected admission of voice identification evidence saying the
conditions set out in Frye had not been met. Here the court used a wide definition of the
scientific community which included linguists, psychologists and engineers who use voice
spectrography for identification purposes.97 Although the court held that the trial court
erred in admitting the evidence, the error was found to be harmless and the conviction
affirmed.98
Likewise the court in State v. Gortarez 99 rejected the
admission of voice identification evidence but affirmed the conviction holding such
admission to be harmless error. The court also used a wide definition of the scientific
community in applying the Frye standard including experts in the fields of acoustical
engineering, acoustics, communication electronics, linguistics, phonetics, physics and
speech communications and found that there was not general acceptance among these
scientists.100
In the case of United States v. Love 101, the admissibility
of spectrographic voice identification was not at issue. The Fourth Circuit Court of
Appeals was reviewing whether the trial judge's comments about a voice identification
expert constituted error. The trial judge told the jury that they, the jury, were to
assign whatever weight they wanted to the testimony of the expert and even disregard his
testimony if they "should conclude that his opinion was not based on adequate
education, training or experience, or that his professed science of voice print
identification was not sufficiently reliable, accurate, and dependable."102 The Court
of Appeals found no error in the judge's instruction to the jury.
In admitting spectrographic voice identification evidence,
the Supreme Court of Rhode Island, in State v. Wheeler 103, declined to apply the Frye
standard holding instead "the law and practice of this state on the use of expert
testimony has historically been based on the principle that helpfulness to the trier of
fact is the most critical consideration".104 The court reviewed the cases around the
country, both state and federal, and noted that the majority of circuit courts that have
considered admission of spectrographic evidence have decided in favor of its admission.105
The court pointed out that the defendant had all the proper safeguards such as
cross-examination, rebuttal experts, and the jury had the right to reject the evidence for
any one of a number of reasons.106
In State v. Free107 the Court of Appeals of the State of
Louisiana did not rely on the Frye test for guidance in determining the admissibility of
spectrographic voice identification evidence but instead applied a balancing test set
forth in State v. Catanese.108 One individual, accepted as an expert in voice
identification, testified as to the theoretical and technical aspects of the
spectrographic voice analysis method. No other witnesses were called to either support or
show fault with the admission of the voice identification testimony. The Court of Appeals
found that voice identification evidence, when offered by a competent expert and obtained
through proper procedures, "is as reliable as other kinds of scientific evidence
accepted routinely by courts" and "can be highly probative".109 Using the
Catanese balancing test, however, the Court of Appeals found that the trier of fact was likely to give
almost conclusive weight to the voice identification expert's opinion, which would consequently
mislead the jurors. The Court of Appeals was also concerned that there were not enough
experts available who could critically examine the validity of a voice identification
determination in a particular case. Nine rules were suggested as a basis on which voice
identification evidence could be accepted.110 The Court of Appeals held that Catanese
prohibits admission of the voice identification evidence at this time 111 and found the
admission of that evidence to be harmless error.
In 1987 the Supreme Court of New Jersey again addressed the
issue of admissibility of spectrographic evidence in the civil case of Windmere v.
International Insurance Company.112 In affirming the judgment of the Appellate Division,
the Supreme Court of New Jersey ruled that the Appellate court's affirmation of the
admission of the spectrographic evidence by the trial court was improper. The court stated
the admissibility of the spectrographic voice analysis is based on the scientific
technique having sufficient scientific basis to produce uniform and reasonably reliable
results and contribute materially to the ascertainment of the truth 113, a standard the
court admits bears "a close resemblance to the familiar Frye test".114 The court
relies upon the "general acceptance within the professional community" to
establish the scientific reliability of the voice identification process. In reaching a
determination of general acceptance, the court relied on a three-prong test which includes: (1)
the testimony of knowledgeable experts, (2) authoritative scientific literature, and (3)
persuasive judicial decisions which acknowledge such general acceptance of expert
testimony.115 The court found that none of the three prongs indicated that there was a
general acceptance of spectrographic voice identification in the professional community.
The court criticized the proponent experts as being too closely tied to the development of
this identification analysis to represent the opinions of the community.116 The court
found that the trial court did not undertake to resolve the issue of conflicting
scientific literature and that it would make no effort to resolve the conflict.117 The court
also reviewed the judicial decisions regarding admissibility and found a split among the
jurisdictions as to the reliability of the identification process.118
The New Jersey Supreme Court specifically limited its
decision in Windmere excluding spectrographic voice identification evidence to the present
case. The court stated that the future use of voice identification evidence "as a
reasonably reliable scientific method may not be precluded forever if more thorough proofs
as to reliability are introduced" 119 and they will "continue to await the more
conclusive evidence of scientific reliability".120
The Court of Appeals of Texas in the case of Pope v.
Texas 121 refused to address the issue of admissibility of voice identification evidence
stating that "the overwhelming evidence against appellant renders this error, if any,
harmless".122 Justice McClung in his dissenting opinion states that the trial court
did err in admitting the voice identification evidence and that the error was not
harmless.123 He suggests that the Frye test is the proper standard for assessing the
admissibility issue and that the "relevant scientific community" should be
defined broadly.124 When this aspect of the test is so defined the "general
acceptability" criterion is not met.
In February of 1989, the United States Court of Appeals for
the Seventh Circuit affirmed the decision of the United States District Court for the
Northern District of Illinois admitting spectrographic voice identification evidence in
the criminal case of United States of America v. Tamara Jo Smith.125 The Seventh Circuit
now joins the Second, Fourth and Sixth Circuits in affirming the use of spectrographic
voice identification evidence.126 The Appellate court used the Frye standard to hold
expert testimony concerning spectrographic voice analysis admissible in cases where the
proponent of the testimony has established a proper foundation.127 The court noted that
this technique was not one-hundred percent infallible and that the entire scientific
community does not support it; however, neither infallibility nor unanimity is a
precondition for general acceptance of scientific evidence.128 The Seventh Circuit found
that a proper foundation had been established in that the expert testified to the theory
and the technique, the accuracy of the analysis and the limitations of the process.129 The
court noted that variations from the norm result in an increase of false eliminations.130
The jury was not likely to be misled in that they had the opportunity to hear the
recordings, see the spectrograms, hear the limitations of the process, witness a
rigorous cross-examination of the expert and could reject the testimony of the expert.131
In United States v. Maivia,132 the United States District
Court admitted spectrographic evidence after a four-day hearing on the issue. The court
examined the various sub-tests of the Frye test and found that spectrographic voice
identification evidence met these tests. The court also noted that "inasmuch as the
admissibility of spectrographic evidence to identify voices has received judicial
recognition, it is no longer considered novel within the Frye test and consequently the
test is inapplicable".133 The court also looked to the Federal Rules of Evidence,
specifically Rule 403, in deciding the admissibility of spectrographic voice
identification evidence.
In affirming the order of the Appellate Division, the New
York Supreme Court, in the case of People v. Jeter 134, concluded that the trial court was
not able to properly determine that voice identification evidence is generally accepted as
reliable based on case law and existing literature. The Court stated that the trial court
should have held a preliminary inquiry into the reliability of voice spectrographic
evidence. In the light of the other evidence, the admission of the voice identification
evidence was held to be harmless error in this case.