Stephen Biss
- Aug 2, 2022

Reliability of Approved Instruments in Field Over Time is a Hypothesis: Where is Empirical Study?

Updated: Aug 9, 2022

Front page to Intoxilyzer 8000C evaluation

Linearity of two new Intoxilyzer 8000 instruments was evaluated in 2005, but has no one evaluated the linearity of the aging 8000C instruments out in the field? If no one has, single-point cal. checks do not establish reliability across the measuring interval.

Purposes:

To obtain the following admissions from the Crown's expert:

Calibration of an approved instrument out in the field does not come from anything that the qualified technician does on the day of the subject test
Calibration of an approved instrument out in the field comes from what the manufacturer did years before
Calibration of an approved instrument out in the field comes from the calibration curve created years before
Out in the field we are counting on the calibration curve, specifically the relationship between the response at the end of the detector and the indication; we are counting on that calibration curve not having changed
We [in Ontario] verify that calibration at only one data point on the day of the subject test
If we don’t have disclosure of the manufacturer's Certificate of Calibration from many years ago, when the instrument came from the manufacturer, we don’t have any information that permits us to conclude that there has been a reliable measurement result at, for example, 150 or 160 mg/100 mL, if the cal. check is only at 100 mg/100 mL at time of use
The CFS scientist will then rely upon the fact that, in general, the breath testing instruments used in Canada demonstrate that they have a linear calibration curve, so any change in calibration should be uniform across that curve, permitting a cal. check at a single point, 100 mg/100 mL
But: The meat of the real issue is that the calibration curve becomes linear as a result of the calibration, empirically, as the particular instrument learns at the factory [the software and procedure are sometimes called "the linearizer"]
It is a hypothesis relied upon by the CFS scientist that, generally speaking, the relationship between the response at the end of the detector and what comes off of the indication is a linear response
That hypothesis needs to be tested empirically
Where is the empirical evidence that tests that hypothesis?
The CFS scientist will then rely on the fact that during evaluation, the original two Intoxilyzer 8000s (sic) evaluated by the Alcohol Test Committee demonstrated a linear response
This particular CFS scientist will then rely upon the fact that during his own evaluation of the Intoxilyzer 5000EN for the Alcohol Test Committee, that original instrument demonstrated a linear response
But: Those were all new instruments
No one at the Centre of Forensic Sciences, no one in the Alcohol Test Committee, no member of the Canadian Society of Forensic Sciences has ever empirically tested the hypothesis in an aging instrument, one that is seven years old.
Acknowledgement of Hodgson's definition of reliability "over time"
Reliability relates to significant drift in accuracy and precision over time
If reliability relates to drift in accuracy and precision over time, one cannot reach any conclusion whatsoever with respect to one data point calibration check, without any knowledge of the original date and certification of calibration, the date of any re-calibration of the instrument, and without any information as to verification of calibration at any other data points other than 100
One cannot assume that the response is always linear over time
Reliability increases with frequency of calibration (a short calibration interval)
No published studies that have tested the length of time that an instrument keeps its calibration? No empirical studies in Canada, no empirical studies in the United States on that subject.
CFS hypothesis: any change in calibration should be uniform right across the measuring interval scale. Challenge the Crown/witness to produce studies that empirically support that hypothesis; there are none that the witness is aware of.
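
The logic of the points above can be illustrated with a small sketch. It is not a model of any real Intoxilyzer: the response function, the drift values, and the tolerance of plus or minus 10 mg/100 mL are all assumptions chosen for illustration, and every name in it is hypothetical. It compares a drift that is uniform across the curve with a drift that happens to be zero at 100 mg/100 mL but grows away from that point.

```python
# Illustrative only: every number below is an assumption, not measured data.

def indicated(true_conc, gain=1.0, curvature=0.0, pivot=100.0):
    """Hypothetical instrument indication in mg/100 mL.

    gain scales the whole curve (a uniform, proportional drift).
    curvature adds a quadratic error that is zero at `pivot`, so the
    reading at the pivot is unaffected by it (a non-uniform drift).
    """
    return gain * true_conc + curvature * (true_conc - pivot) ** 2


def passes_cal_check(reading, target=100.0, tolerance=10.0):
    """Single-point check: is the reading within the assumed tolerance of the target?"""
    return abs(reading - target) <= tolerance


# Case A: assumed uniform drift of +3 percent across the whole curve.
a_100 = indicated(100.0, gain=1.03)
a_160 = indicated(160.0, gain=1.03)

# Case B: assumed non-uniform drift that is exactly zero at 100.
b_100 = indicated(100.0, curvature=0.005)
b_160 = indicated(160.0, curvature=0.005)

for label, at_100, at_160 in [("A", a_100, a_160), ("B", b_100, b_160)]:
    print(
        f"Case {label}: check at 100 reads {at_100:.1f}, "
        f"passes={passes_cal_check(at_100)}, "
        f"true 160 reads {at_160:.1f}"
    )
```

Under these assumed numbers both cases pass the check at 100, yet Case B indicates 178 for a true 160. The sketch does not show that real instruments drift this way; it only shows that a check at one data point cannot, by itself, distinguish the two cases. That distinction rests on the linearity hypothesis.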

Cross-examination of a CFS scientist:

Q. And so that’s why you have specific recommended

standards of the Alcohol Test Committee. Let me put it this way,

if an Intoxilyzer 8000C is being used out in the field by a

particular police officer, its reliability, its calibration

doesn’t come from something that the qualified technician does...

A. That’s...

Q. ...On the day of the subject test. Its

calibration comes from what the manufacturer did at the date of

the calibration back a number of years before.

A. That’s correct.

Q. We are counting on the calibration curve, in

other words the relationship between the response at the end of

the detector and the indication; we are counting on that

calibration curve not having changed, right?

A. But we are also verifying the calibration,

albeit only at that...

Q. Only at one data point, namely 100 milligrams

per 100 mills.

A. That’s correct.

Q. So my question to you is, if an instrument has

not been calibrated properly in the first place, if we don’t ever

see the Certificate of Calibration from the manufacturer from

five years ago or six years ago or seven years ago when it came

from the manufacturer, we don’t have any of that information, how

can we ever make a determination that there has been a reliable

- that the indication on the instrument is reliable if the test

result is 150 or 160 if the only checking that we’ve done is at

100 milligrams per 100 mills?

A. Well in general breath-testing instruments that

have been used in Canada demonstrate that they have a linear

calibration curve and so any change in calibration of the

instrument should be uniform across that calibration curve. That

gives us the ability to put together a procedure that only uses

that single calibration checkpoint.

Q. All right, so now we get to the meat of the

real issue of all the questions that I’m asking you. You’re

saying that any change should be uniform because the calibration

curve essentially becomes linear...

A. Yes.

Q. ...As a result of the calibration empirically.

I’m asking you as a scientist. You know what I mean by

empirically?

A. Yes.

Q. You’ve put forward a hypothesis to suggest that

by and large, evidentiary breath test equipment that’s used in

Canada – the response, the relationship between response at the

end of the detector and what comes off of the indication is a

linear response. That’s the hypothesis that you’ve just put

forward?

A. Yes.

Q. Right? All right, let’s test that empirically.

I know, and I want to suggest to you the same Terry Martin that

we just talked about, when she did her evaluation for the

Intoxilyzer 8000C wrote a paper right after that evaluation where

she tested the instrument and made – reached the conclusion that

the response was linear, right?

A. Yes.

Q. That was a new instrument with a new

calibration certificate just like the one – like the 5000EN that

you received for evaluation, right?

A. Yes.

Q. No one at the Centre of Forensic Sciences, no

one in the Alcohol Test Committee, no member of the Canadian

Society of Forensic Sciences has ever empirically tested the

hypothesis that you’ve just put forward in an aging instrument,

in an instrument that’s five years, six years, seven years old,

right?

A. Not that I’m aware of or no one’s – I’m not

aware of a study that’s been published showing that but it

certainly has been tested and is part of our training and it’s

also been tested by manufacturers, which is why their internal

test only measures the calibration at one point, again at 100

milligrams of alcohol in 100 millilitres of blood unless by

statute a jurisdiction is going to use another point.

Q. Okay, now before we talk about the internal

test procedure, you said that you have this hypothesis that any

change over time - and let’s just go back to a paper by Brian

Hodgson, you know who he is?

A. Yes.

Q. You know that he wrote a paper that the Supreme

Court of Canada relied upon in a case called St. Onge Lamoureux?

A. Yes.

Q. And you’re very familiar with the paper in

which he defined what accuracy is, what precision is and he

defined what reliability is. He also defined specificity, right?

A. Yes.

Q. And he referred to reliability as referring to

significant drift – or I’m sorry, significant drift in accuracy

and precision over time, right? Have I got that roughly right?

A. Sounds like a good...

Q. Sounds like a good...

A. ...Paraphrase.

Q. ...Definition. You’d agree with that

definition with what reliability is?

A. Yep.

Q. So here’s the question, if reliability relates

to drift in accuracy and precision over time, how can a court

reach any conclusion whatsoever with respect to one data point

calibration check without any knowledge of the original date and

certification of calibration, the date of any re-calibration of

the instrument and without any information as to if anybody has

verified that calibration at any other data points other than

100? Why on earth would a court assume that the response is

linear?

A. Well, without having any evidence I suppose

that the court couldn’t. They would need the evidence of an

expert.

Q. The expert who...

THE COURT: Maybe we can frame this – I don’t think

we should be framing this in terms of what legal

conclusions the court might reach.

MR. BISS: All right.

THE COURT: I mean, if you want to ask him how, you

know, how he would understand it to be, that’s one

thing but I think we have to be careful about how

this is framed.

MR. BISS: All right. I’ll do better then, Your

Honour.

Q. You’d agree with me that reliability increases

with any kind of measuring instrument with frequency of

calibration or short calibration interval? That’s a general

concept across science.

A. Well, obviously yes but then relative to how

long that actual estimate keeps its calibration...

Q. And again, you’d agree with me no published

studies, certainly none that you’re aware of, that has tested

length of time that an instrument keeps its calibration? No

empirical studies in Canada, no empirical studies in the United

States on that subject. You’re not aware of any that are

published?

A. You know, none comes to mind but I haven’t

turned my mind to that for 20 years.

Q. Right. So here’s the problem, you say – you

draw the inference from the hypothesis that you’ve proposed, that

any change should be uniform right across the measuring interval

scale so therefore you put together a technical program, a

technical set of recommendations that says let’s have – and this

is what the Alcohol Test Committee has done, let’s have all of

the police services across Canada run at least one or more

control tests at one data point when they’re running an

evidentiary breath test. That’s the reason for that procedure?

A. See, I don’t know that it was – that the basis

was just the hypothesis. Obviously there must have been testing

of instruments' reliability over time and the change in

calibration.

Q. I wanna suggest to you that there are no such

empirical studies and I mean I’m challenging you, I’m challenging

the Crown to produce them but I wanna suggest to you there are no

such empirical studies. It’s an assumption that’s been made and

as a result of that, I think you said earlier, as a result of

that that they put together a program, a package, a set of norms,

a set of technical norms for qualified technicians to follow,

right?

A. Yes, I’d agree with that.

Q. It’s because of that assumption.

A. But I also can’t – I mean I can’t say that it’s

just an assumption. I can’t think of any published studies right

now, but even if there were no published studies I can’t discount

that – our understanding of the linearity of the instruments

wasn’t – hasn’t been, in fact, studied in any number of forensic

laboratories.

...

[after a break]

...

Q. And a control test at one data point is not a

check of the calibration of the instrument; it’s only a technical

procedure to try and help police officers to screen out

instruments that should be taken out of service?

A. I would disagree. Just as stated here, when

the Intoxilyzer 8000C is calibrated, a number of calibrators are

used...

Q. Yes.

A. ...So that fulfills that section. In my

opinion, the use of just one calibration check at a concentration

of 100 milligrams of alcohol in 100 millilitres of blood is

sufficient to determine if the instrument remains in calibration.

Q. I want to suggest to you that’s a technical

opinion based on a norm of your employer. It’s not a scientific

opinion and you don’t have an empirical...

A. No.

Q. ...Research to substantiate that.

A. Actually do, which I’ve completely forgotten

about because it actually relates to comparison analysis and I

know of several papers that – comparing blood tests to breath

testing and in those, while there is always going to be

difference between the breath test and the blood test...

Q. Yeah.

A. ...Based on the time that the tests occurred,

as well as the fact that breath testing in North America produces

results that are systemically low between 10 to 12 percent...

Q. Yes.

A. ...So in correcting those two factors, these

comparison studies have shown that there is good correlation

between blood and breath results from the same individual from

the same incident and moreover – more importantly is that there

was the difference between the two did not vary by concentration.

In other words, there was no evidence that breath-testing

instruments got worse as the blood concentration changed over the

occurrence and most if not all of these tests were done with

instruments that had been in the field for some time. Some could

have been just re-calibrated; others certainly would have been in

use for a variety of times...

Q. Do you have a copy of that study?

A. Not with me, no. It’s one by Jim Wigmore and I

forget who the other author was. Then there’s a paper by Hodgson

who also looked at the relationship and there is one by Cowan and

I don’t know the age of the instrument he used but he performed

simultaneous analyses, blood and breath, using the Intoxilyzer

8000C and showed that in 100 percent of cases the breath result

was lower than the blood result and it – then he wasn’t doing a

linearity analysis in that particular case.

Q. All right, let’s talk about Wigmore and

Hodgson. Do you have knowledge of whether the instruments in

that particular case, it may have been older instruments, but

when had they last been recalibrated?

A. That’s information – in the Centre for Forensic

Science study wasn’t available. I don’t know Mr. Hodgson’s study

makes any statement about when the instrument he was using was

last calibrated.

Q. So those studies don’t support the hypothesis

that it doesn’t matter the length of time between calibration and

the breath testing result doesn’t matter in terms of reliability

of the instrument. It does not support that hypothesis.

A. I would disagree.

[Next time I would have these studies at my fingertips to use them in cross-examination. My client in this case did not want an adjournment for that purpose.]

Q. Well, except that we have no information.

You’re saying that these studies support that hypothesis but we

have no information about how recently before the testing had

been done, the simultaneous testing had been done, of the length

of time before that instrument had been recalibrated. I mean,

instruments that are out in the field – I’m sure that your

instruments at the Centre of Forensic Sciences from time to time

go back for recalibration?

A. Very rarely, yes.

Q. But on occasion they do?

A. Yes.

Q. But the point is that nobody’s done an

empirical study to determine whether that linear relationship

lasts over a long period of time?

A. I can’t say with certainty whether there has

been a specific study of that or not.

Q. All right.

A. And I don’t have access here to my Alcohol Test

Committee files to see if in fact early on at some point if the

Committee didn’t actually perform that or other labs from which

those individuals came had in fact done such studies.
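
For readers wondering what an empirical linearity check on an aging instrument might look like in outline, here is a minimal sketch. It assumes that alcohol standards of known concentration are run at several points across the measuring interval and that a straight line is fitted to the readings. The reference values and both sets of readings are invented for illustration, the function names are hypothetical, and no acceptance limit is implied.

```python
# Illustrative only: reference values and readings are invented, not measured.

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def max_abs_residual(xs, ys):
    """Largest distance between any reading and the fitted straight line."""
    slope, intercept = fit_line(xs, ys)
    return max(abs(y - (slope * x + intercept)) for x, y in zip(xs, ys))


# Hypothetical standards in mg/100 mL spanning the measuring interval.
reference = [50.0, 100.0, 150.0, 200.0, 300.0]

# Hypothetical instrument 1: reads 3 percent high everywhere (still linear).
readings_linear = [51.5, 103.0, 154.5, 206.0, 309.0]

# Hypothetical instrument 2: correct at 100 but curving away elsewhere.
readings_curved = [62.5, 100.0, 162.5, 250.0, 500.0]

print("linear instrument, max residual:", max_abs_residual(reference, readings_linear))
print("curved instrument, max residual:", max_abs_residual(reference, readings_curved))
```

The design point is simply that departure from a straight line only becomes visible when more than one concentration is tested. A near-zero residual across several standards would support the linearity hypothesis for that instrument at that age; a large residual would not. A study of aging instruments would repeat such a test at known intervals after the last calibration.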

duimetrology.com

905-273-3322 or 1-877-273-3322
