
Article

Measurement in psychology:
A case of ignoramus et
ignorabimus? A rejoinder

Theory & Psychology
23(5) 591­–615
© The Author(s) 2013
Reprints and permissions:
sagepub.co.uk/journalsPermissions.nav
DOI: 10.1177/0959354313490451
tap.sagepub.com

Günter Trendler
Worms, Germany

Abstract
The Millean Quantity Objection—that is, the claim that the measurement of psychological attributes
is impossible (Trendler, 2009)—has been countered with partly vigorous opposition (Kyngdon,
2013; Markus & Borsboom, 2012; Saint-Mont, 2012). Kyngdon’s response is of particular interest,
since he asserts that measurement may already have been established. If correct, it would definitely
invalidate any quantity objection and end the century-long discussion about the measurability of
psychological attributes. Therefore the focus of the rejoinder will be on the question of when
measurement is reached. First the meaning of measurement is elaborated. On this basis, criteria
for the successful establishment of measurement are formulated and it is outlined how these
are satisfied in the case of intensive quantities. It is concluded that the evidence presented by
Kyngdon is insufficient and inadequate. The approach will also serve as background to discuss
objections raised against the Millean Quantity Objection.

Keywords
measurement theory, psychometrics, quantity objection, separability

The Millean Quantity Objection (Trendler, 2009)—that is, the statement I put forward
in this journal that psychological attributes are non-measurable—has been countered
with partly strong opposition. Markus and Borsboom (2012) find the argumentation to
be “flawed in several respects” (p. 452) and Saint-Mont (2012) believes that the “question whether an attribute (really) is ‘quantitative’ or not can hardly be answered in a
reasonable way” (p. 475). Kyngdon (2013), by contrast, concedes that the quantity
objection applies under certain circumstances, but he still believes that where descriptive theories of underlying cognitive processes exist and where quantitative features
can be empirically manipulated, the scientific measurement of the relevant attributes is
credible.
Corresponding author:
Günter Trendler, Robert-Schuman-Str. 15, 67549 Worms, Germany.
Email: guenter@trendler.org


However, Markus and Borsboom (2012) recognize that if the Millean Quantity
Objection is correct, then important consequences would ensue. They write: “[W]e need
not argue that these are many and varied (e.g., large parts of psychometrics, intelligence
research, and personality testing, but also of psychonomic research on memory, learning,
cognition, etc., would be discredited)” (p. 453). At the same time they concede that they
“do not aim to defend the thesis that currently existing tests and other observational practices can be defensibly interpreted as measurement instruments in the strong sense of
measurement intended by Trendler (2009) and Michell (1997, 1999)” (p. 464).
Furthermore, they see no urgency in resolving the problem, but rather prefer to stay
“agnostic about this issue” (p. 464) and withdraw to the position that they “are currently
in a state of ignorance with respect to this issue insofar as it concerns many cases of putative measurement in psychology” (p. 455).
This agnostic attitude is all the more astonishing since psychometrics is, as Joel
Michell (2000, 2008a) explains, a pathological science: that is, although psychometrics
is premised on the hypothesis that psychological attributes are quantitative, psychologists never seriously attempt to find out if they are really measurable (see also Barrett,
2003, 2008). Worse, there are reasons to believe that if psychological attributes are not
measurable, then the application of methods relying on the quantitative assumption is
systematically misleading. Under circumstances which are typical of psychology, even
such relatively simple mathematical operations as averaging are problematic. Friedrich
Sixtl (1982) observes: If psychological attributes were measurable, then, according to the
Gaussian theory of errors, averaging would be a useful tool to account for random error.
However, if this is not the case, the application of the mathematical operation is misleading, for, as Sixtl (1985) notes:
The arithmetic mean can indicate the real central value of a parameter only, if the basic data are
independent and identically distributed. The basic data obtained in behavioural sciences is,
however, seldom independent and identically distributed because every person represents a
unique individual. Consequently, in such cases, the obtained arithmetic mean depends upon the
distribution of the parameter among the subjects. As this distribution is bound to be different in
different experimental groups, the arithmetic means obtained in the same experiment conducted
among different experimental groups are different. (p. 338)

Hence, Tukey (1986), as quoted by Saint-Mont (2012, p. 473), may be right in pointing
out that the calculation of the arithmetic mean can be useful even if the requirements of
permissible statistics are not met. But this does not apply to situations which are typical
of psychology: that is, where the data are not identically distributed and where, consequently, “[t]he average response is not the response of any real person, but rather artificial” (Sixtl, 1998, p. 519).
It is therefore likely that the systematic connections Markus and Borsboom (2012)
believe they perceive are not an indication “that something is going on in the data” or that
“psychological scales are not random collections of items” (p. 464), but illusions created
by “[t]he tendency to see patterns in randomness” (Kahneman, 2011, p. 117). Critically
analyzed, even such popular constructs as the “Big Five personality factors,” mentioned
by Markus and Borsboom (2012, p. 464) as a supportive example for real knowledge, are
far from stable or coherent (Barrett & Rolland, 2009). It is also often misleadingly suggested that statistical analysis is in general “robust” against violations of the conditions of its applicability (e.g., Saint-Mont, 2011, p. 75).
Hence, if psychometrics is pathological, then almost all of psychology is pathological,
namely all psychological research predicated on the assumption that “the measurability
thesis” (Michell, 1999) is true. In consequence, if psychologists prefer to preserve the
status quo, they willingly choose to stay in a state of ignorance and delusion; which
hardly is a scientific attitude. This is the reason why—contrary to Saint-Mont’s (2012)
position that a “rigid or even ‘fundamental’ point of view is to no avail” (p. 473)—the
question of measurability is of crucial importance and why its resolution should not
lightly be deferred to the future (see also Barrett, 2008).
Furthermore, Markus and Borsboom (2012) are of the opinion that against the background of the long history of contesting the existence and possibility of measurement in
psychology, the debate will continue unabated in future. Hence they propose a “general
framework for evaluating arguments against measurement” (pp. 461–463) in order to
accommodate future discussions. One gets the impression that by attempting to anticipate even future criticism they want to take the wind out of the critics’ sails for good. What
they overlook is the obvious, namely the possibility of ending the debate by demonstrating
that at least one psychological attribute is measurable. In this regard I applaud Kyngdon’s
(2013) attempt to diffuse the objections by presenting evidence that psychological attributes may be measurable after all. Certainly, the best way to confront and finally silence
the sceptic is to demonstrate that measurement has been established.
To make a proper assessment of the evidence, the criteria for the attainment of measurement must be specified. First I will elaborate in some detail on the meaning of measurement. Based on this I will formulate criteria for measurement and exemplify how they
can be satisfied in the case of intensive quantities. The criteria will serve, so to speak, as
a yardstick against which we can judge if (or to what extent) the measurement task has
been solved. This way of proceeding will also function as the background for replying to
different objections raised by the critics.

The paradigmatic meaning of measurement
Contemporary measurement theory is dominated by two concepts of measurement.
Firstly, there is the classical concept as was originally developed by Euclid (for details
see Michell, 1999, pp. 25–33), according to which measurement is defined as the determination of the ratio of a magnitude of a quantity to another magnitude of the same
quantity called unit. More precisely, for any magnitudes a and b of a quantity A
the magnitude of a relative to b may always be expressed by a positive, real number, r, where
a = r.b. That is, the ratio of a to b (a positive, real number) is the measure of a in units of b.
(Michell, 1997, p. 358)

Another common way to express the same idea is by the formula “physical quantity =
numerical value × unit” (de Boer, 1994/1995, p. 405). The numerical value r is usually
called “measurement value.” As such the definition is still widely used in physics (see
Michell, 1999, p. 33).
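
To make the classical definition concrete, here is a minimal worked illustration (the numbers are invented, not taken from any source): measuring a length a against the unit b = 1 metre simply means determining the ratio r = a/b.

```latex
% Illustrative only: a length a measured in units of b = 1 m.
% If a spans the metre stick seven and a half times, then
%   a = r . b  with  r = a/b = 7.5,
% i.e., the measurement value of a in metres is 7.5:
\[
  a = r \cdot b, \qquad r = \frac{a}{b} = 7.5, \qquad a = 7.5\,\mathrm{m}.
\]
```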


Secondly, we have to consider the “homomorphism definition of measurement” (Luce
& Narens, 1994, p. 224) as is part of the representational theory of measurement (RTM;
Luce & Suppes, 2002); the presently dominant theory at least in the domain of philosophy of science. In its definite form the theory is outlined in the now classic Volume 1 of
Foundations of Measurement (Krantz, Luce, Suppes, & Tversky, 1971). For properly
understanding this definition, Tarski’s (1954) notions of a relational system or structure
and of an isomorphism between structures are essential. In basic terms, a relational structure is a set of entities together with one or more relations between the entities. An isomorphism is a mathematical one-to-one function in which one relational structure is
mapped onto another such that the relations among entities of the first structure are preserved by corresponding relations among entities in the second structure. In order to
employ these concepts within measurement theory, the distinction is made between
numerical and empirical relational systems or structures. An empirical relational structure contains “identifiable entities, such as weights, persons, attitude statements, or
sounds” (Suppes & Zinnes, 1963, p. 7), or more generally “concrete objects or events”
(Luce, 1979, p. 94). A numerical relational structure is a system consisting of abstract
objects (usually a set of real numbers) and relations between them (e.g., relations of
equivalence or order). In measurement, where the same number may be assigned to
more than one object (e.g., different objects may have an equal length), a many-to-one
function or a homomorphism is preferable. Accordingly, measurement is “defined as a
homomorphic mapping of an empirical system into a numerical system” (Zinnes,
1969, p. 454).
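
As a minimal sketch of the homomorphism definition (the rod objects and comparisons are invented, purely for illustration), the following checks that a many-to-one numerical assignment preserves an empirically given order relation:

```python
# Invented example: an empirical relational structure <rods, "at least as long as">
# is mapped into the numerical structure <reals, >=>. The assignment is
# many-to-one: two rods of equal length receive the same number.
at_least_as_long = {               # directly observable comparisons (illustrative)
    ("r1", "r2"): True, ("r2", "r1"): False,
    ("r2", "r3"): True, ("r3", "r2"): True,    # r2 and r3 are equal in length
    ("r1", "r3"): True, ("r3", "r1"): False,
    ("r1", "r4"): True, ("r4", "r1"): False,
    ("r2", "r4"): True, ("r4", "r2"): False,
    ("r3", "r4"): True, ("r4", "r3"): False,
}
phi = {"r1": 3.0, "r2": 2.0, "r3": 2.0, "r4": 1.0}   # candidate numerical assignment

# Homomorphism condition: a is at least as long as b  iff  phi(a) >= phi(b).
homomorphic = all((phi[a] >= phi[b]) == rel
                  for (a, b), rel in at_least_as_long.items())
print("assignment preserves the empirical order:", homomorphic)   # True
```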
What is the relation between the two theories? According to Hand (2004), “the classical theory is predicated on the condition that measurement is necessarily concerned with
‘quantitative’ attributes. Modern representational theory includes this as a special case”
(p. 51). Luce and Narens (1994) elaborate:
[E]ither directly or indirectly standard sequences are used to establish scales in almost all of the
major results of FM [Foundations of Measurement]. Thus, since the process of measurement
through standard sequences is usually taken as paradigmatic of ‘measurement processes’ almost
all of the representational results of FM are valid not only from the RTM viewpoint but from a
number of different perspectives about what measurement is. (p. 225)

The primary exceptions are, as Luce and Narens note, “purely ordinal cases including
variants such as interval orders and semiorders” (p. 242).
Hence, if the set of homomorphisms is restricted to those established by means of a
standard sequence, then the homomorphism definition is equivalent in meaning to the
classical definition of measurement (see below for an example of the construction of a
standard sequence). In other words, proponents of RTM do not doubt the actual measurement practice, as was developed in the natural sciences, but only choose to describe it
differently. Or, as William James may have put it, the two definitions practically mean
the same thing (for details see James, 1907/1955, pp. 41ff).
Following Luce and Narens (1994), I will call the common idea underlying the two
definitions the paradigmatic meaning of measurement. For simplicity I will always use
the word “measurement” subsequently in the meaning of “paradigmatic measurement.”


Also, to avoid misunderstandings, the term “quantification” will be used only to refer
to quantification by means of paradigmatic measurement. Hence, the answer to Saint-Mont’s (2012) question “what measurement is all about” is: measurement is about the
determination of ratios of magnitudes of a quantity, not more and not less. Cases of
nominal or ordinal measurement involving the concept of quantity (e.g., “associative
measurement” as described by Ellis, 1966, chap. 6) are considered as preliminary
stages to paradigmatic measurement. That they are useful in their own right is not
questioned. However, what matters in the context of the discussion is that most psychometric methods are based on the presumption that the conditions of paradigmatic
measurement can be met.
Non-paradigmatic cases of measurement will not concern us here (e.g., Beaufort’s
wind scale, Mohs’ hardness scale, etc.), though their usefulness is not doubted. It is also
not implied that quantitative methods should in general be replaced with qualitative
methods, as falsely suggested by Saint-Mont (2012, p. 467). In this connection Markus
and Borsboom (2012) allege that my argumentation “discounts the possibility that the
scientific community will change its conceptual scheme so as to alter the meaning of the
term measurement itself” (pp. 463–464). In my view this imputation is unjustified. If I
believe that measurement is impossible in psychology, then only in the sense of paradigmatic
measurement; the right to define measurement according to other criteria is certainly not
contested. However, on a critical note, one must understand that by changing the definition one does not change the facts. This may sound trivial, but, after the transition from
the classical definition of measurement to the representational conception, exactly this
fallacy prevailed in psychology. Since then, measurement has usually been explained as: “When
measuring some attribute of a class of objects or events, we associate numbers … with
the objects in such a way that the properties of the attribute are faithfully represented as
numerical properties” (Krantz et al., 1971, p. 1). Suddenly, when considered superficially, classifying people according to a presumably quantitative attribute (e.g., intelligence, ability) seemed just as easy as classifying them according to non-quantitative
attributes (e.g., nationality, gender, or profession). Thus through obfuscation the illusion
was created that measurement in psychology may be at least in principle an easy thing;
that, since psychology “possesses ordinal relations aplenty” (Michell, 1990, p. 68), conjoint measurement (Luce & Tukey, 1964) is the royal road to measurement. Actually, the
delusion has taken such deep root that, even when led close to it, some psychologists have
difficulties seeing the difference. Otherwise I cannot explain why Markus and Borsboom
(2012)—although they realize that the problem I pointed out (Trendler, 2009, pp. 583–
584) is not with classification per se, but with classification of people according to quantitative attributes—still claim that “if one can assess equality, one can construct nominal
representations, and if one can construct nominal representations, … then the problem
with psychological measurement must lie elsewhere in [the] axiomatization” (p. 460).

Measurement: The criterion
The criterion for the successful application of measurement theory is derivable from the
definition of measurement. That is, the measurability of a quantity is established if it can
be demonstrated empirically that there exists a ratio between some magnitudes of the quantity or, in other words, that the ratio between two magnitudes can be expressed by a
real number.
Though necessary, this criterion is not sufficient. Because of experimental and human
fallibility, as is the case for empirical sciences, it must be supplemented with corollary
requirements. Firstly: A reported discovery must be replicated independently of the original discoverer before it is accepted as valid; ideally with improved experimental apparatus and under varying experimental circumstances. As a rule the finding must in principle
be replicable any number of times. For example, Ohm’s (1826) discovery of the law of
electrical current was extensively verified by Fechner (1831) in numerous experiments.
Not only did he use other measurement procedures, but he also substantially extended the
validity of the law, for example, by investigating fluid conductors. Among others, decisive
experimental contributions to the corroboration of Ohm’s law were made by Jacobi (1839,
1842), Lenz (1837, 1839), and Wheatstone (1843). Interestingly, apparently unaware of
Ohm’s empirical work, the French physicist Pouillet (1837) rediscovered the law of electrical current. The last great effort to verify Ohm’s law was undertaken in 1874 when the
British Association for the Advancement of Science asked J. C. Maxwell to direct a committee that would perform the testing. The cause was a set of experiments by Arthur
Schuster which suggested “that Ohm’s law may only be a first order approximation, and
that the resistance and/or the self-inductance of a wire could be current dependent”
Davies & Marsh, 1985, p. 525). The result of the investigation was that the law was validated in its original form; rather, it was demonstrated that Schuster’s original observations were dependent on the use of tangent galvanometers in conjunction with alternating
currents. Since then the law has not been seriously contested.
Importantly, in physics, laws of nature are accepted as true despite their limitations
(cf. Kipnis, 2009). That is, laws of nature must be replicable under some but not necessarily under all conceivable circumstances. This does not imply that one should not try to extend their validity through corroboration as far as possible, but circumstances in
nature are not always favourable. For example, it has been found that in metals at high
current densities Ohm’s law reaches limits of validity (e.g., Bridgman, 1921). However,
such limitations are not necessarily a disadvantage, but may open new fields of investigation. Thus, for example, Ferdinand Braun’s (1877) investigations on the limits of
Ohm’s law led to the discovery of semi-conductors. Actually much of the task of replication comprises the systematic investigation of the limits of a law. In conclusion:
Quantification can be regarded as successful even if it is experimentally verified over
only a limited range of magnitudes of the quantity.
Although not a necessary criterion, it is worth mentioning in this context that an important incentive for accepting laws of nature as true is their practical applicability. It is certainly legitimate to say that the breakthrough for Ohm’s law
being generally accepted was a function of its practical usefulness, for example, in telegraphy (Wheatstone, 1843) or in the development of the electro-motor (Jacobi, 1837).
Finally, the reader may have noticed that I did not take into consideration what is usually considered the criterion for successful measurement, namely the statistical fit
between model and data (Sijtsma, 2011, 2012). The reason for this omission is that statistical significance is no guarantee for the replicability of measurement results; in other
words, it may very well be the case that a model fits the data, but neither are the observed ratios constant nor are the measurement values replicable (e.g., Andrich, 1988, pp. 32–34).
For example, as Barrett (2005) points out, it is possible that the outcomes of an experiment conform to the Rasch model, but that there is no relation to any substantive theory.
This was first noted by Wood (1978), who, by fitting the Rasch model to simulated coin-tossing data, demonstrated that the model shows fit even in the case of random (i.e.,
meaningless) data. It is therefore in the best interest of the psychologist investigating the
topic of measurability to always be guided by the following statement: significance is
insignificant; what matters is replicability, as the antidote counteracting the “obsession
with statistical tests of significance” (Lambdin, 2012, p. 84). As will be explained below,
what is needed most is to demonstrate the replicability of measurement values.
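
As an illustration of the criterion just stated (all numbers are simulated and purely hypothetical), one possible check is whether an estimated ratio of magnitudes stays constant over independent replications, within the limits of random error, rather than whether any test statistic is significant:

```python
# Illustrative sketch (all numbers invented): the check called for here is not
# a significance test but whether an estimated ratio of magnitudes is constant
# over independent replications, within the limits of random error.
import random
import statistics

def estimate_ratio(seed, true_ratio=1.6, noise=0.02, n=10):
    """One 'experiment': observe the two magnitudes with random error n times
    and return the mean estimated ratio."""
    rng = random.Random(seed)
    obs = [(true_ratio * (1 + rng.gauss(0, noise))) / (1 + rng.gauss(0, noise))
           for _ in range(n)]
    return statistics.mean(obs)

replications = [estimate_ratio(seed) for seed in range(8)]   # 8 independent runs
spread = statistics.stdev(replications) / statistics.mean(replications)
print("estimated ratios:", [round(r, 3) for r in replications])
print("relative spread over replications:", round(spread, 5))
```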

Indirect measurement
In An Essay on Quantity, Thomas Reid (1748) already made the distinction between
proper and improper quantities (or extensive and intensive quantities as they are usually
called today), which, according to him, goes back to Aristotle. The main difference is
that extensive quantities (e.g., length) allow the application of physical operations which
are structurally analogous to common arithmetic operations (e.g., addition, subtraction,
multiplication, division) while intensive quantities (e.g., temperature) do not. The apparent difficulty in the latter case is how to determine ratios if one cannot add and subtract
instances of magnitudes of the quantity. It represents one of the great scientific breakthroughs to have demonstrated that the measurement of intensive quantities is nevertheless possible, namely by means of associated factors in the manner explained below.
Accordingly the distinction is usually made between fundamental (direct) and derived
(indirect) measurement (Campbell, 1920; Fornasini, 2008). In short, the measurement of
intensive quantities requires methods of indirect measurement. Because psychological
attributes (e.g., abilities, traits, attitudes, etc.) can be conceived only as intensive quantities, indirect measurement is of particular interest in psychology.
However, first we must consider how RTM approaches the problem. To properly
understand the proposed two-step strategy, it must be taken into account that RTM is part
of the program of the axiomatization of the natural sciences as was started at the end of
the 19th century (Corry, 2004). Michell (1999) notes: “A theory is ‘axiomatized’ when it
is expressed as a set of (ideally, logically independent) propositions (called ‘axioms’)
from which the remainder of the theory deductively follows” (p. 194). In measurement
theory the fundamental problem of axiomatization is to deduce from an appropriate set
of axioms “that each magnitude, a, measured by any other magnitude, b, gives a definite
numerical magnitude or number” (Michell & Ernst, 1996, p. 241). Otto Hölder’s (1901)
merit is to have presented the first such complete axiomatization. Obviously if an attribute satisfies the axioms of measurement it is, in principle, measurable.
The first step towards quantification, or “the scientific task,” as Michell (1999) calls
it, comprises the task of empirically testing the axioms of measurement. Michell notes:
“The scientific task having been successfully completed, it is known that the relevant
attribute is quantitative and, so, it follows that it is measurable. That is, magnitudes of the
quantity sustain ratios” (p. 75). However, it must be emphasized that even if this task is
successfully completed, the attributes involved are not really measured. Or, in other words, we know at this stage that the attribute sustains ratios of magnitudes, but we cannot determine them numerically. To obtain measurement values it is necessary to solve
in the second step: the “scaling problem” (Orth, 1974, chap. 9) or “the instrumental
task.” Michell (1999) explains: “The business of the instrumental task is to contrive procedures whereby [the] ratios can be discovered or reliably estimated” (p. 75).
With critical regard to the two-step strategy it must be noted that no clear criteria are
specified for when to consider the scientific task accomplished. Actually to my knowledge no criteria at all are proposed. This represents in my view a serious flaw. Given that
the empirical structure potentially contains infinitely many objects, the instrumental task
can be deferred indefinitely. More importantly, since procedures for the determination of ratios must be implemented at some point in the process of quantification (otherwise measurement is never reached), it is rather reasonable to dispense with the two-step strategy altogether and start with the instrumental task directly, as is the rule in physics. As
Michell (1999) notes, the determination of ratios of magnitudes
is generally done by exploiting relationships between the attribute being quantified and another
already quantified. Consider for example the measurement of temperature using an ordinary
thermometer. Within a specific range of temperatures, it has been found that the temperature of
a liquid (say, the metal, mercury) is linearly related to its volume, if pressure is held constant.
Thus, in a sealed glass tube of uniform width, for a limited range, temperature varies linearly
with the height of the column of liquid. By this means, it is possible to measure temperature via
measurements of length. (pp. 75–76)

If such construction is indeed possible, then procedures of indirect measurement become
applicable, where the scientific and the instrumental task are solved simultaneously.
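
A minimal sketch of the thermometer logic in the quotation above, with invented calibration numbers: within a limited range the column height is assumed to vary linearly with temperature, so temperature is read off via length after calibration at two fixed points.

```python
# Sketch of indirect temperature measurement via length (numbers invented):
# calibrate the column height at two fixed points, then read temperature off
# the height by linear interpolation.

h_ice, t_ice = 12.0, 0.0        # column height (mm) at the ice point
h_steam, t_steam = 92.0, 100.0  # column height (mm) at the steam point

def temperature_from_height(h_mm):
    """Linear interpolation between the two calibration points."""
    return t_ice + (h_mm - h_ice) * (t_steam - t_ice) / (h_steam - h_ice)

print(temperature_from_height(32.0))   # 25.0 degrees C for a 32 mm column
```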
Before proceeding I would like to mention Markus and Borsboom’s (2012) critique of
the two-step approach. They point out that even though the ancient Egyptians “did not
have access to set theory, homomorphic mappings, the axioms of quantity, or any of the
other mathematical structures” (p. 455), they succeeded in establishing successful measurement practices. The point of this example is that the construction and appropriate use
of “measurement instruments was apparently possible in the absence of any success in the
‘scientific task’ (or even the awareness that there was one)” (p. 455). In a similar vein,
Saint-Mont (2012) notes, echoing Schönemann (1994), that a “formalization of the measurement process followed by invariance considerations does not come first. Rather, these
developments prove to be useful in structuring existing substantial results” (p. 473).
In my view it is not really reasonable to ignore modern measurement theory just
because our ancestors did manage quite well without it. On the contrary, the fact that we
have an axiomatic theory at our disposal is an invaluable advantage in helping us understand what
the conditions for measurement are, thus facilitating the gathering and interpretation of
substantial results. After all, axiomatic measurement theory can be viewed as making
explicit what is implicitly contained in measurement practice. As Krantz et al. (1971)
note, taking geometry as an example:
Geometry is a beautiful and far-reaching example of a foundational treatment of measurement.
The science of geometry (i.e., earth measurement) was probably first developed as a set of
practical procedures, either for the direct measurement of lengths and areas on the earth’s surface or in connection with the astronomy devised to serve astrology. Eventually, the tacit
assumptions of practice were formulated explicitly as theorems of geometry, and these were
systematically organized and deduced from a few axioms and postulates by Euclid. (p. 7)

In conclusion, yes, axioms of measurement must be tested, but not necessarily in the
manner prescribed by RTM. As will become evident below, the experimental testing for
measurability indeed does not require explicit knowledge of such mathematical concepts as, for example, a homomorphism, the Archimedean condition, or the condition of
solvability. Hence, concerning this matter, I’m partly in agreement with Markus and
Borsboom (2012).
To return to the main topic of this section, as the example above (as quoted from
Michell, 1999) illustrates, the origin of indirect measurement can be ascribed to the
empirical observation that a presumed quantity (or independent variable) A covariates
with a measurable quantity (or dependent variable) P. The idea derived from such observations is that we may indirectly measure A by means of the relation P = f (A). The first
problem encountered is how to determine values of A without knowing f. Secondly,
because P = f (A) represents an empirical relation, it must first be validated in experiment. This seems to lead into a circulus vitiosus; since, on the one hand, in order to
measure A we must know f and, on the other hand, in order to identify the form of the
function f we need to measure A. The obstacle can be overcome by incorporating a third
quantity X which is empirically associated with A and P so that P = f (A,X). That is, values of P are a function of the joint variation of A and X. What is decisive is that now the
quantitative hypothesis P = f (A,X ) can be tested by assuming ex hypothesi that certain
empirical laws connect the quantities. The simplest function that comes into consideration is the relation of (direct or inverse) proportionality. In this case the quantity P varies
(directly or inversely) proportional to the product of A and X. To test the quantitative
hypothesis it is necessary to combine magnitudes a, b, c… etc. of A with magnitudes x,
y, z … etc. of X and record the joint effect ax, ay, az, …, bx, by, bz, … cx, cy, cz … have
on P. The crucial point is that if the effect of A on P is empirically separable from the
effect of X on P, then the numerical value r of the corresponding ratios
ax/bx = ay/by = az/bz = ... = r
is constant. We can conclude that a/b = r, that is, a = r.b. The same logic applies in symmetry to X. Hence, although the operation of physical addition is not applicable to intensive quantities, they can nevertheless be measured according to the definition of measurement. Also note that the independent variables need not be measurable for the
procedure to be applicable. As a minimum requirement we must be able to identify equal
magnitudes of A and X.
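
The following sketch illustrates this ratio test with a simulated law of the simple proportional form P = A·X (all magnitudes invented); if the effects are separable, the ratio of the joint effects for two magnitudes of A is the same whichever magnitude of X they are combined with:

```python
# Sketch of the ratio test described above, using a simulated law of the
# simple proportional form P = A * X (all magnitudes invented). If the effect
# of A on P is separable from that of X, the ratio of P-values for two
# magnitudes a, b of A is constant across the magnitudes of X.

a, b = 3.0, 2.0                 # two (unknown) magnitudes of A
xs = [0.5, 1.0, 1.7, 4.2]       # several magnitudes of X

def P(a_mag, x_mag):
    """Observed joint effect; here the hypothesised law is direct proportionality."""
    return a_mag * x_mag

ratios = [P(a, x) / P(b, x) for x in xs]
print(ratios)                   # all equal to 1.5, i.e., a = 1.5 * b
```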
In conclusion, indirect measurement consists in the discovery of numerical constants
in an empirical law (Helmholtz, 1887/1998). The empirical discovery of constants in a
law is the most solid empirical evidence not only for the measurability of the independent variables, but simultaneously for the validity of the empirical law. It goes without
saying that such invariance was found by Ohm (1826) in his experiments on the laws of the electrical current. That the measurement of intensive quantities is about the determination of constants in an empirical law is self-evident to physicists (e.g., Jacobi, 1842;
Wheatstone, 1843).
At this point it is important to note that the dependent variable might not be a continuous quantity but can also be discrete. For example, following the discovery of the electron by J. J. Thomson in 1897, electric currents came to be conceived as a flow of clouds
of electrons driven through the conductor by an electromotive force (Home, 2005). As
Home notes, the electric charge “became a primitive term, a quality attributed to the
fundamental constituents of matter that was itself left unexplained” (p. 96). From that
date on it was at least theoretically possible to measure the intensity of the electrical current by counting electrons. Only recently has it become possible to construct measurement devices which measure the current in a microelectronic circuit by counting the
number of electrons per second (i.e., a frequency) as they pass through a detection circuit. The electric current I then equals this frequency f multiplied by the electron charge
e, a universal physical constant (Bylander, Duty, & Delsing, 2005). Similar attempts are
made in temperature measurement (e.g., Sanner et al., 2010).
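
As a purely illustrative piece of arithmetic (the counting rate is invented; the elementary charge is the standard constant):

```latex
% Illustrative arithmetic: if the detection circuit counts electrons at a rate
% f (a frequency), the current is I = e * f with e the elementary charge.
\[
  I = e \cdot f, \qquad
  e \approx 1.602 \times 10^{-19}\,\mathrm{C}, \quad
  f = 10^{9}\,\mathrm{s^{-1}} \;\Rightarrow\;
  I \approx 1.6 \times 10^{-10}\,\mathrm{A}.
\]
```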
Interestingly, most of the measurement laws in physics turned out to be of the simple
proportional form (e.g., Newton’s second law of motion, Boyle’s law, Hooke’s law,
Ohm’s law, etc.). However, this need not necessarily be the case, since, as Georg Rasch
(1960/1980) recognized, what is generally required is a special class of functions or, to
use Rasch’s term, “models for measurement,” namely those that allow an algebraic
separation of the effects of the independent variables on the dependent variable as illustrated above. Accordingly he called this property separability (Rasch, 1977). Which
class of functions exactly satisfies this condition is still a matter of debate. Sixtl (1982)
assumes that it is functions with an algebraic addition theorem (see also Fischer, 1989).
In principle, for his models of measurement, Rasch made use of Gauß’s (1821/1995)
discovery about the probability distribution of measurement values as observed in
repeated measurements. What constitutes Gauß’s revolutionary insight is the representation, as a probability function, of the distribution of measurement values resulting from
repeated observations of the same magnitude of a quantity. The distribution is now
known as the Gaussian or Normal distribution. Basically the Gaussian function makes a
connection between this distribution and the “true value” of the magnitude of quantity
underlying it. Hence it is a small step to go the other way round and estimate from a probability distribution the parameter of the distribution (i.e., the “true” value of the magnitude of the quantity). Thereby one assumes, metaphorically speaking, that measurement
has already taken place and that as a researcher one receives the measurement protocol
for data analysis (cf. Michell, 2008a). For example, the simple logistic model, as the
simplest model of the Rasch family, can be expressed as:
θ = π / (1 − π) = A / D

where π is the probability that a person solves an item correctly, A represents person ability, and D item difficulty. θ is the ratio of the probability that a person will solve an item correctly to the probability that she will not. The function f selected by Rasch to attain separability is a special case of the logistic function, namely

f(x) = x / (1 + x)

where f is an increasing monotonic function mapping positive real numbers into the real
number interval [0, 1] (for models relying on the Gaussian function see Gutjahr, 1977,
chap. 3). Hence the simple logistic model can be rewritten as:
π = (A / D) / (1 + A / D)
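
A small sketch of the separability this model provides (abilities and difficulties invented): under the simple logistic model the odds θ = A/D factor into the person and item parameters, so the odds ratio of two persons is the same for every item:

```python
# Sketch of Rasch-type separability (parameters invented): under the simple
# logistic model the odds theta = A / D factor the person and item effects,
# so the odds ratio of two persons is constant across items.

A1, A2 = 4.0, 1.0               # abilities of two persons
items = [0.5, 1.0, 2.0, 8.0]    # difficulties D of several items

def prob_correct(A, D):
    return (A / D) / (1 + A / D)

for D in items:
    odds1 = prob_correct(A1, D) / (1 - prob_correct(A1, D))
    odds2 = prob_correct(A2, D) / (1 - prob_correct(A2, D))
    print(D, round(odds1 / odds2, 6))   # always 4.0 = A1 / A2
```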
Finally, we must take into consideration how RTM deals with the measurement of
intensive quantities. First, it should be pointed out that to some extent the theory of conjoint measurement was specially developed “to show that a standard sequence can be
developed … without any prior extensive measurement” (Krantz et al., 1971, p. 138);
contrary to Norman Campbell’s claim that derived measurement necessarily depends on
the prior measurement of other attributes. The basic idea for such an alternative quantification is that if an increase from a to b within quantity A in conjunction with an increase
from x to y within quantity X has an equal effect on P, then we can conclude that (b – a)
= (y – x). If it is possible to adjoin such differences, then the interval (b – a) on A can be
used to trade off equal intervals (y – x) = (z – y) = … etc., on X. This procedure assumes
additivity in the effects of the two factors: that is, if the sum of a and y effects equals the
sum of b and x effects, then the difference between b and a effects must equal the difference between y and x effects. Put simply, “the entities that can be concatenated are intervals within one factor” (Krantz et al., 1971, p. 18). Krantz et al. (1971) summarize:
[T]he presence of a second factor together with the assumption of additivity of the effects of the
two factors allows us both to calibrate equal units on the first factor and to combine adjacent
equal units to form a standard sequence. (p. 18)

The method is illustrated as follows:
Consider the problem of measuring the electrical resistance of intervals along a piece of wire
(say, for purposes of calibrating a variable resistor). Suppose that the lengths of the intervals are
either unknowable (the wire is coiled or inaccessible) or irrelevant (the resistivity is nonconstant
if the thickness or composition of the wire varies from point to point). Moreover, suppose that
the only instruments available are incandescent light bulbs, a power source of suitable output,
other wires, and soldering tools. One solution is to connect one light bulb in series with an
interval along the to-be-measured wire (using a pair of variable contacts to attach two leads to
the endpoints of the interval) and a second light bulb to a fixed length of another (auxiliary)
wire. Plug both bulbs into the power source. If the auxiliary wire has sufficiently small
resistance, then the second bulb will have more current through it and be brighter than the first.
Move one of the variable contacts along the to-be-measured wire toward the other until the
resistance added by the interval is small enough so that the lights match in brightness. These
two endpoints can then be taken as points a1 and a2 in a standard sequence. One next searches
for a point a3 such that the resistance of interval a2a3 is just sufficient for the two bulbs to match, keeping the same auxiliary wire in series with the second bulb. In this way, a sequence
of intervals of equal resistance can be found. … To obtain a finer or coarser sequence, shorter
or longer pieces of auxiliary wire are used with the second bulb. (p. 138)

Hence, the minimum requirement for constructing a standard sequence by means of this
method is that it is possible to identify and compare magnitudes of the quantities for equivalence and order. Of course, what is still necessary is “to check the appropriate axioms or,
what amounts to the same thing, to show that the ratio of the resistances of any two intervals is invariant within error limits with the choice of standard sequence” (p. 138).
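
The following is a simulated sketch of the procedure quoted above (the wire’s resistance profile is invented): successive contact points are found by sliding a contact until each new interval matches the resistance of the fixed auxiliary wire, which stands in for the “equal brightness” criterion; lengths are never measured.

```python
# Simulation of the calibration procedure quoted above (profile invented).

def cumulative_resistance(x):
    """Resistance (in ohms) from the start of the wire to position x; the wire
    is deliberately non-uniform, so equal resistance is not equal length."""
    return 2.0 * x + 0.5 * x ** 2

def next_point(a_prev, r_aux, step=1e-4):
    """Slide the contact until the interval a_prev..x adds r_aux of resistance
    (the stand-in for 'until the two bulbs match in brightness')."""
    x = a_prev
    while cumulative_resistance(x) - cumulative_resistance(a_prev) < r_aux:
        x += step
    return x

r_aux = 1.0                     # resistance of the fixed auxiliary wire
points = [0.0]                  # a1
for _ in range(5):              # find a2 ... a6
    points.append(next_point(points[-1], r_aux))

print([round(p, 3) for p in points])
# The intervals are equal in resistance but unequal in length, which is what
# makes them usable as a standard sequence for resistance.
```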
However, seen in critical light, dispensing with extensive measurement is not necessarily an advantage, since what in theory seems rather simple proves quite laborious in
practice. Krantz et al. (1971) note that the example seems contrived because, “ordinarily, one
would obtain a calibrated electrical meter and measure resistance in terms of previous
extensive measurement” (p. 138). In my view this is not the only reason.
Firstly, it is proposed that equal intensity of electrical current can be determined by
visually comparing two incandescent light bulbs for equality in brightness. That is, if the
lights match in brightness, it is concluded that the electrical currents are equal in intensity. Obviously it is implicitly assumed that there is a direct proportional relation between
the physical intensity of light and the associated sensation of brightness. The prominent
role ascribed to humans as measurement instruments is no coincidence. As a matter of
fact the RTM approach in general deduces much of its plausibility, with regard to a possible application in psychology, from the hypothesis that humans function as measuring
instruments (i.e., that humans are able to identify unequivocally magnitudes of physical
or psychological attributes and compare them for equivalence and order; for an example
see the applications of conjoint measurement as presented in Michell, 1990). However,
it must be noted that humans have proven in general highly unreliable instruments for an
objective evaluation of reality. In particular, as we now know from psychophysics,
human “judgements of quantity are astonishingly poor” (Laming, 1987, p. 657; for
details see Laming, 1997). Actually in physics it has been known for a long time that
human senses can be quite deceiving even if confronted with such relatively simple tasks
as comparing the length of two straight rods (see Goodeve & Shelley, 1877, chap. 2).
Maxwell (1871/2001) summarizes the situation:
In fact, our sensations of every kind depend upon so many variable conditions, that for all
scientific purposes we prefer to form our estimate of the state of bodies from their observed
action on some apparatus whose conditions are more simple and less variable than those of our
own sense. (p. 3)

In conclusion, no viable scientific measurement practice can be established by relying on humans as measurement instruments, neither in physics nor in psychology.
Secondly, the application of the RTM procedure is extremely difficult even in physics
where we have maximum technological control through apparatus construction.
Therefore, the question arises: Is it at all possible to identify in any precise manner magnitudes of intensive quantities without the help of an associated extensive quantity and
without making unrealistic assumptions such as that humans can be used as a substitute for measurement instruments? However, as explained in detail elsewhere (Trendler, 2009), the problem is not with the non-availability of extensive measurement,
but with the impossibility of the Galilean revolution. Actually the RTM example gives
credence to the idea that under circumstances where systematic disturbances are active,
sophisticated manipulation and control through apparatus construction is necessary for
attaining measurement.
Before closing, I want to mention that Markus and Borsboom (2012) find fault with
what I described as the method of identifying equal magnitudes of an intensive quantity
(Trendler, 2009, pp. 583–584). They linger extensively over different interpretations of
the first axiom, although it is clear that the only relevant interpretation for the topic discussed is in terms of magnitudes of a quantity. As such, the axiom specifies the criteria
for the successful identification of equal magnitudes of a quantity. The method I describe
is not my method, as Markus and Borsboom allege, but is derivative of what J. S. Mill
(1843/1974) called the “Baconian rule of varying circumstances” (p. 381). In particular
I singled out the “method of concomitant variations” (pp. 398–403), which essentially
consists in varying causes and observing the effect of this manipulation on the dependent
variable. Furthermore, Markus and Borsboom (2012) comment: “[S]howing that one
cannot establish the scale of measurement using the test … described does not rule out
the possibility of establishing it by other means” (p. 461). They note:
In cases where experimental manipulation is impossible, measurement procedures typically
rest on triangulation of multiple measures that are aimed at assessing the same property. If one
has multivariate information, then one can check that they share the same scale without
resorting to manipulation. … More broadly, we think that this is precisely the method on which
general psychometric modeling procedures rest. These procedures test whether the association
structure in a multivariate space that arises from obtaining multiple measures of the same
property is in accordance with the hypothesis that the observed variables in fact measure the
same thing. (p. 461)

In reply I must first point out that, if there are laws of measurement in psychology, then
they are all of the general form R = f (S,O), where the dependent variable R stands for
reaction variables (e.g., test result) and the independent variables S and O for stimulus
variables (e.g., test item) respectively for organism variables (e.g., ability; for further
details see Sixtl, 1982, pp. 24–30; cf. also Rasch, 1977). Secondly, given the mathematics
of measurement, in particular the concept of separability, experimental investigation
always involves the combination of different magnitudes of a quantity S with different
magnitudes of a quantity O and the observation of the effect the combination has on the dependent
variable R while controlling systematic disturbances. The application of this method presupposes that we can actually identify equal magnitudes of the presumed quantities S and
O. We can of course also assume, as is usually done in psychology, that no systematic
disturbances distort the relation and test the quantitative hypothesis directly. However, if
the test fails (i.e., if the criterion of measurement is not satisfied), we must verify whether
indeed equal magnitudes of S and O have been experimentally combined.
Actually this course of action is in principle confirmed by the authors quoted by
Markus and Borsboom (2012, i.e., Hambleton & Swaminathan, 1985; Lord & Novick,
1968; Rasch, 1960/1980; van der Linden & Hambleton, 1997) as instances for an alternative methodological approach. As Hambleton, Swaminathan, and Rogers (1991) note with regard to item response theory (IRT): “An assumption common to the IRT models
most widely used is that only one ability is measured by the items that make up the test.
This is called the assumption of unidimensionality” (p. 9). They elaborate:
This assumption cannot be strictly met because several cognitive, personality, and test-taking
factors always affect test performance, at least to some extent. These factors might include level
of motivation, test anxiety, ability to work quickly, tendency to guess when in doubt about
answers, and cognitive skills in addition to the dominant one measured by the set of test items.
What is required for the unidimensionality assumption to be met adequately by a set of test data
is the presence of a “dominant” [emphasis added] component or factor that influences test
performance. (p. 9)

A second, related assumption is the requirement of local independence:
Local independence means that when the abilities influencing test performance are held
constant, examinees’ responses to any pair of items are statistically independent. In other
words, after taking examinees’ abilities into account, no relationship exists between examinees’
responses to different items. Simply put, this means that the abilities specified in the model are
the only factors [emphasis added] influencing examinees’ responses to test items. (p. 10)

Obviously if these requirements are not satisfied, then the separation of the variables
of interest will fail. Conversely, if other dominant factors (i.e., systematic disturbances)
influence test performance, the model will be unsuited for the purpose of measurement
(cf. Sijtsma, 2011, 2012). That other dominant factors are always active in psychological
experiments, and that their control through apparatus construction, at least to the extent
necessary for a successful application of measurement theory, is impossible, is precisely
my argument. Hence, the literature quoted by Markus and Borsboom (2012) does not
support their case.
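
To illustrate the point (with invented values), the following sketch adds a second, person-by-item influence to the Rasch odds; the comparison of the two persons then ceases to be invariant across items, that is, separation fails:

```python
# Illustrative sketch (values invented): if a second, person-by-item factor
# also influences performance, the odds no longer factor as A / D and the
# person comparison ceases to be invariant across items.

A1, A2 = 4.0, 1.0
items = [0.5, 1.0, 2.0, 8.0]
# Systematic disturbance, e.g., test anxiety that hits person 2 on harder items.
nuisance = {0.5: 1.0, 1.0: 1.0, 2.0: 0.7, 8.0: 0.4}

def odds(A, D, g=1.0):
    return g * A / D

for D in items:
    ratio = odds(A1, D) / odds(A2, D, nuisance[D])
    print(D, round(ratio, 3))   # 4.0, 4.0, 5.714, 10.0 -- no longer constant
```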
To conclude, the range of experimental methods applicable to empirical structures
(n.b., interpreted as relations between magnitudes of a quantity) is constrained by the
mathematics of the homomorphic numerical structures. This is the reason why the methods employed for achieving measurement in psychology (including the method for testing the first condition of quantity) are structurally analogous to the methods used in the
natural sciences. Or, in short, empirical separability structurally correlates with algebraic separability. That psychologists will, eventually, concoct new ways of achieving
their goals, which “may not be structurally analogous” (Markus & Borsboom, 2012, p.
463), is, I believe, for the reasons stated here and previously (Trendler, 2009), merely a
logical possibility.

Theory and measurement
According to Kyngdon (2013), “scientific measurement of psychological attributes
depends not only upon adequate stimulus control, but also upon descriptive theories of
psychological systems and the demonstration of pure differences in degree (magnitude)
within attributes hypothesized to be quantitative” (p. 227). Later he amplifies this point by
claiming that “the fundamental problem facing psychological measurement is not lack of experimental control, manipulation, and apparatus. It is the lack of extant quantities and
the lack of descriptive theories of psychological systems” (p. 230). Kyngdon uses two
theories to substantiate his claims, namely the cumulative prospect theory (Kahneman &
Tversky, 1979; Tversky & Kahneman, 1992) and the Lexile Framework for Reading
(Stenner, Burdick, Sanford, & Burdick, 2006). As I believe that the objections I will raise
apply equally to both, I will restrict my comments only to the Lexile theory.
The Lexile Framework for Reading is a modified Rasch model (for details see
Appendix B of Kyngdon, 2013). The basic idea is that reading-item response probabilities are a function of the reading ability of persons and the difficulty of the text read.
According to the criterion for measurement, we would expect that evidence is brought
forward which shows that the ratio between the reading abilities of two persons is constant
over item difficulty and vice versa (cf. Andrich, 1988, pp. 20–23). Kyngdon (2013),
however, presents measurement as a fait accompli (e.g., he reports on p. 233 that the
measurement value for a certain item is 1,210 L; the letter L standing for the unit of
measurement “Lexile”). As has been explained above, measurement values can be
accepted as valid only if they satisfy certain conditions. That is, in view of the fact that
the objective sought is measurement on an interval scale, what must be demonstrated is,
as pointed out by Krantz et al. (1971), that the ratio of reading ability (respectively item
difficulty) of any two intervals is invariant within error limits with the choice of the
standard unit.
In particular, what must be empirically proven is the constancy (within experiments)
and replicability (over experiments) of measurement values, ideally by other independent researchers. As Regnault (1847) notes with regard to temperature measurement:
[T]here is a prerequisite that all measuring equipment must meet; it is not only necessary that it
is always strictly comparable to itself, that is to say, it always marks the same level of degree
under identical circumstances, but it is necessary that we can reproduce it at will and still get
strictly comparable instruments. (pp. 164–165)

And Wittgenstein (1953) outlined the importance of this aspect for the practice of measurement as follows:
The procedure of putting a lump of cheese on a balance and fixing the price by the turn of the
scale would lose its point if it frequently happened for such lumps to suddenly grow or shrink
for no obvious reason. (p. 56e)

That is, if it is found, for example, that the ratio between the ability of two persons is r,
then the ratio must prove invariant over replications. Conversely, if the measurement
value of reading ability for a person varies (beyond the limits of random error) over
repeated measurements, it would be useless as an indicator for the level of ability.
Actually, Regnault’s condition of comparability corresponds to what I have called “the
first condition of quantity” (Trendler, 2009, p. 583). In conclusion, unless constancy and
reproducibility of measurement values is demonstrated, it is not even certain if the first
condition of quantity is satisfied.
However, Kyngdon (2013) is of the opinion that decisive evidence for measurability
consists rather in the demonstration that differences between degrees and pairs of degrees of the relevant, psychological quantity are homogeneous. He explains: “By homogeneous, it is meant that the degrees (magnitudes) of a continuous quantity must vary only
by amount, not kind” (p. 233) and indicates that the concept of homogeneity was recognized by Hölder (1901) and has been recently emphasized by Michell (2009).
Accordingly, his intention is to demonstrate that reading items “can be engineered
whilst holding the content and number of words the same, such that differences between
the Lexile difficulties of the texts are approximately equal” (Kyngdon, 2013, p. 236).
But again, it is not evident to what extent the reported measurement values have the
properties required. Admittedly, if such construction is possible it would represent a supplementary indication of the presence of quantitative structure. However, the demonstration of homogeneity is not a necessary criterion for the establishment of an attribute
as measurable. For example, in the case of temperature measurement by means of real
gases, matters are not that simple; in view of the fact that real gases behave only to a
very limited extent like ideal gases, a demonstration of homogeneity, in line with the
method described above, is not possible. Nevertheless temperature is considered to be
measurable (Schooley, 1986, chap. 2).
It should not go unmentioned that Kyngdon (2013) concedes that the Millean Quantity
Objection “appears to have force for psychological attributes where heterogeneous, qualitative differences between degrees can be deduced from the relevant stimuli” (p. 244).
But note that this conclusion does not rely on the failure to satisfy the criterion for measurement but is deduced from “the content and structure” (p. 238) of test items. For example, Kyngdon remarks with regard to an item utilized in the Trends in International
Mathematics and Science Study (TIMSS) that
[w]hat makes the algebra item harder than the decimals item (knowledge of algebra and “test
wiseness” in dealing with the red herring) is qualitatively different to what makes the
measurement item harder than the whole numbers item (knowledge of units of measurement of
time and correct unit conversions). (p. 240)

Unfortunately, in this regard I cannot recognize any substantial difference from the Lexile
Framework. That is, from the content and structure of the Lexile items we can just as
easily deduce that it is not only verbal working memory that influences outcome, but a
mixture of cognitive processes.
Furthermore, Kyngdon (2013) shares the view that, in particular, cognitive theories are
relevant for measurement, since if “sensations, utility, and cognitive abilities are indeed
measurable, descriptive theories of the response process which connect the hypothesized
psychological quantity to identifiable features of the stimuli must be developed” (p. 230;
for details see also Sijtsma, 2012). Accordingly, he criticizes the fact that I ignore the role
of theory. Indeed, I believe that what thwarts measurement attempts in psychology is not
so much the impossibility of meaningful theory, but the impossibility of the Galilean revolution: that is, the impossibility of making scientific progress through the construction of
experimental apparatuses. In essence the Millean Quantity Objection says nothing but that
the Galilean revolution is impossible in psychology. This does not imply that I deny the
role of theory in the process of quantification. However, by mentioning as an example
“descriptive theories of atomic physics” (Kyngdon, 2013, p. 229), or by pointing out that
the definition of the second is in atomic terms, Kyngdon gives the impression that sophisticated theory may be necessary from the start. This is all the more surprising since he is very
reserved with regard to the psychological theory underlying the Lexile Framework.
Theory describing the cognitive processes is almost completely missing from his account,
and where one would expect a firework of empirical studies about the cognitive structure
of “verbal working memory” (p. 232), not one single study is mentioned.
However, in the beginning there are always relatively simple observations and simple
theories. For example, Boyle “explained the ‘spring’ of the air by drawing an analogy
with a mass of fleece or wool which may be compressed or expanded according to the
pressure applied to it” (Neville, 1962, p. 357; cf. also Rasch, 1977, pp. 70–73). What is
characteristic of a "progressive research program" (Lakatos, 1978) is cumulative progress consisting to a large extent in the gradual refinement of theory and experimental
apparatus in interplay, so that refinement in experimental technique leads to more theory
and vice versa. That this process (i.e., the Galilean revolution) cannot be started in psychology is what makes it a degenerative research program. Hence, I certainly do not
question that qualitative laws are meaningful in psychology (e.g., that the higher the
motivation of a person, the faster she accomplishes some task, or that the more intelligent
a person is, the more difficult items she can solve, etc.; for a systematic overview of such
qualitative laws see Smedslund, 1997) and that relatively simple theories are heuristically useful, but I doubt that these laws can be expressed in quantitative form (not even
on the level of a nominal scale).
Finally, a critical note on Kyngdon's (2013) proposition that empirical studies in support of theories must be
based upon the presentation of stimuli to humans (such as test items) and inferring [emphasis
added] something of the relevant, unobservable attribute (such as a cognitive ability of some kind)
from the observable response made to the stimulus (such as a correct answer). (pp. 229–230)

Carefully examined, however, this method of inferring from the known (observable) to
the unknown (hypothetical construct), as commonly used in cognitive psychology, is an
instance of what H. M. Johnson (1945) identifies as a “spurious problem,” that is, a problem that is based on “a requirement that cannot be satisfied” (p. 189). Actually what we
are confronted with in cognitive psychology is a modern version of what Johnson calls
“the psychophysical requirement.” By that term he means “a requirement of ascertaining
a correspondence between classes of entities that we may call bodily, and other entities
that we may call mental” (p. 192). As he points out, in order to make such an inference,
we must first provide “the law of functional interdependence” (p. 208) between the two.
If we conceive the relation between observable stimuli (and responses) and unobservable
attributes as y = f(x), then “[t]o get the law by induction we must have the simultaneous
values of both terms in the relationship; whereas we can observe but one of the terms”
(p. 208). This, Johnson notes, “would virtually demand that we solve one equation for
two unknowns” (p. 201). Hence, he concludes, the problems resulting from deducing or
inferring from the known to the unknown “are not genuine, but spurious. In other words,
they are forever insoluble. The attempts to solve them have yielded only facts of assumption, which can never be tested by experience” (p. 211).
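Johnson's point can be illustrated with a deliberately simple example of my own (it is not taken from Johnson, 1945): if only y is observable and y = f(x) with both f and x unknown, then quite different hypothesized pairs (f, x) reproduce exactly the same observations, so the observations alone cannot decide between them:

observed_y = [2.0, 4.0, 8.0]        # all we ever get to see

# Hypothesis 1: a linear process acting on one set of latent values.
f1 = lambda x: 2.0 * x
x1 = [1.0, 2.0, 4.0]

# Hypothesis 2: an exponential process acting on a different set of latent values.
f2 = lambda x: 2.0 ** x
x2 = [1.0, 2.0, 3.0]

print([f1(x) for x in x1])          # [2.0, 4.0, 8.0]
print([f2(x) for x in x2])          # [2.0, 4.0, 8.0]
# Both hypotheses fit the observations perfectly; the observable side alone
# cannot tell them apart, just as one equation cannot fix two unknowns.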
The problem has not gone unnoticed in cognitive psychology. Of particular interest is
the argument about “representational indeterminacy” presented by Anderson (1978) in
the context of “the imagery debate” (Pylyshyn, 1981). The topic of this debate is mainly
about the question in what format information is mentally represented. On one side there
are those who argue that visual information is encoded “in terms of properties that are
quite spatial and modality specific” (Anderson, 1978, p. 249); on the other side of the
issue are those “who argue that imagery is encoded in an abstract propositional format”
(p. 250). After analysing the pro- and contra-imagery arguments, Anderson demonstrates
that "it is not possible for behavioral data to uniquely decide issues of internal representation" (pp. 262–263). More than this, he argues that models with different theories of representation are not discriminable on the basis of purely behavioral data. It is not even
possible to unequivocally decide whether the information is represented in an analogous
or in a digital format. Hence, if no behavior-independent access is found to “the mental,”
cognitive psychologists will forever only be scratching the surface of the cognitive apparatus. Or, in other words, the content of “the mental” as a black box is constrained only
by human imagination. In conclusion: From cognitive psychology we certainly cannot expect
any support in solving the measurement problem. It is struggling with a fundamental,
seemingly unsolvable issue all of its own.

The Newtonian curse: An epilogue
Previously I explained why in my view measurement in psychology is impossible
(Trendler, 2009); in the present rejoinder, in reply to the critique, I have focused on the criterion for measurement, that is, on the question of how we can decide whether measurement has been attained.
Measured against this criterion, we are forced to conclude that the idea that "descriptive theories of
behaviour may allow for the scientific measurement of psychological attributes”
(Kyngdon, 2013, p. 227) is far from evident. Unfortunately I must concur with Michell
(2008b), who remarked with regard to Kyngdon’s work that it is premature to “conclude from his results that reading comprehension ability is a quantitative attribute, for
much more research would need to be done before that hypothesis is made even plausible” (p. 132).
Actually, the Rasch hypothesis in its most general form claims that humans are
"Rasch-Machines" designed by nature to function according to the law θ = A · D⁻¹.
That this hypothesis can be established as true is, in my view, just as likely as Ohm's
law being establishable solely with the help of electric phenomena found in nature (e.g.,
by investigating electric fish), rather than, as actually happened, with the help of an experimental apparatus assembled out of processed objects. What must be taken into consideration is Hugo Dingler's (1932) remark: "Apparatus do not arise in nature, neither do
they grow on the trees, nor are they dug out of the ground somewhere” (p. 21).
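For readers unfamiliar with the formula, the following minimal sketch (my own illustration, using the multiplicative parameterization of the Rasch model; all numbers are invented) spells out what the "Rasch-Machine" law amounts to: the odds of a correct response equal ability divided by difficulty, so that the comparison of two persons via the odds ratio is independent of the item used, which is the "specific objectivity" emphasized by Rasch (1977) and Fischer (1989):

def odds(ability, difficulty):
    return ability / difficulty            # θ = A · D⁻¹

def p_correct(ability, difficulty):
    o = odds(ability, difficulty)
    return o / (1.0 + o)                   # P(correct) = (A/D) / (1 + A/D)

a1, a2 = 4.0, 2.0                          # abilities of two hypothetical persons
for d in (0.5, 1.0, 3.0):                  # three hypothetical item difficulties
    print(round(p_correct(a1, d), 3),
          round(p_correct(a2, d), 3),
          odds(a1, d) / odds(a2, d))       # the ratio is always 2.0, whatever the item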
In general, faced with the measurement problem, psychology is left with the following alternatives: discuss the topic from here to eternity, as Markus and Borsboom (2012)
suggest; seriously attempt to find out whether psychological attributes are measurable (e.g.,
Kyngdon, 2013); or look for other solutions (e.g., Grice, Barrett, Schlimgen, & Abramson,
2012; Michell, 2011; Smedslund, 1997, 2012). Obviously I would encourage the last two
alternatives. In view of the fact that the measurability debate has now gone on for
centuries without any progress, the resilience of the discussion is astonishing. Not even
the grandiose failure of two large-scale research programs, psychophysics (Laming,
1997) and behaviorism (Koch, 1954), has persuaded psychologists to give up the
quantitative imperative. In my view "the Rasch wars" (McNamara & Knoch, 2012) and
similar contemporary attempts (e.g., Kyngdon, 2011) will end the same way: namely, in
a total defeat.
Realistically viewed, however, I fear that, for reasons given below, the measurability
debate indeed has the potential to continue unabated for an indefinite time. As such it
may suffer the fate of other similar debates. For example, in his famous lecture Über die
Grenzen des Naturerkennens held before the German Association for the Advancement
of Science and Medicine in 1872, Emil du Bois-Reymond vigorously opposed the boundless optimism about the limitless possibilities of science that was widespread at the end of the
19th century (du Bois-Reymond, 1872/1967). In his lecture, he identified an instance of
the limits of scientific knowledge in the mind–body problem (i.e., in the problem of how
to explain consciousness from its material conditions). Not only did he argue that it is
unsolvable based on the present state of knowledge, but also that in principle it will never
be solved. To bring this thesis to a point, he coined the Latin slogan ignoramus et
ignorabimus (i.e., we do not know and will not know). So far, history has proven him to
be right: The mind–body debate goes on unabated without any indication of a resolution. In my view the Millean Quantity Objection is to be classified in the same category
of insoluble problems. As Markus and Borsboom (2012) correctly point out, what is
claimed is “the impossibility of epistemic access to the truth of the measurement claim;
not the impossibility of that claim holding true” (p. 455). Indeed, the issue with measurement in psychology is not ontological but epistemological in nature. That is, in reality
psychological attributes might be quantitative, but if they really are we do not know, and
we never will know.
Likewise, I think that mainstream psychology will ignore fundamental objections like
the Millean Quantity Objection. The ignorance will persist as long as psychologists cannot rid themselves of the obsession with emulating physics in the hope of
attaining a similar success in prediction and control. The idea of a Newton
of psychology has haunted psychology ever since David Hume modeled his A Treatise of
Human Nature (1739/2008) after Newton’s Principia Mathematica (1687/1999), hoping
that “his psychology would add as much to our knowledge of human nature as Newton”
(Dicker, 1998, p. 75). More recently and most forcefully, Clark L. Hull “sought to model
a scientific psychology after Newton’s physics” (Barone, Maddux, & Snyder, 1997,
p. 42), and many seriously thought that he would finally be the Newton of psychology. It should also be noted that Maxwell’s analysis of Newton’s second law of motion
(in Maxwell, 1876/1991) served Georg Rasch as an important source of inspiration for
his models of measurement (Rasch, 1960/1980, chap. 7).
Ironically, the result of the effort to make psychology more scientific by modeling it
after physics is that it has become a perfect pathological science, in the sense that it now
perfectly mimics the natural sciences. As is known from biology, a form of mimicry consists in
the similarity in appearance between the mimicking and the mimicked species (e.g., a
harmless snake mimics the external appearance of a dangerous species). Similarly, mainstream psychology mimics successful sciences like physics. Psychologists are seen to
perform experiments, analyze data, test hypotheses, publish results in journals, organize
conferences, establish psychological institutes, award prizes, and so on. That is, in
appearance they perform the same actions as physicists; but the essential difference is
that while physicists accumulate real, replicable knowledge, no such accumulation is to
be found in “Newtonian” psychology. This phenomenon has also been described as
“cargo cult science” (Feynman, 1985).
Thus the Newtonian model turned into a curse and the Newtons of psychology
“turned out rather to be the Newton[s] of Alchemy” (Eysenck, 1995, p. 45). However,
breaking the Newtonian curse may prove extremely difficult, since psychologists
would have to accept definite limitations to what they can do. In this connection I
would like to mention Smedslund (2009), who describes four obstacles facing the
search for general empirical laws along the lines specified by the natural sciences
(and which, by the way, are also excellent arguments against the measurability of
psychological attributes). Firstly, we must consider the openness of the person to an
indefinite number of possible influences which determine behavior. Secondly, there is
the phenomenon of irreversibility: that is, the fact that persons remember and learn
from their experiences, which means that psychological processes are irreversible.
Thirdly, persons are not isolated entities but part of a shared meaning system (i.e.,
system of families, cultural practices, etc.). And fourthly, we must always take into
account personal uniqueness: that is, fortuitous events and unique life situations make
the history of persons unique.
Hence, giving up the model of natural sciences would imply bidding farewell to
the long-cherished view that knowledge in psychology will ultimately be universal,
cumulative, and atemporal, and accepting that it is, and always will be, to a large extent
historically and culturally bound, dependent on circumstances, transitory, and
fragmentary.
Acknowledgements
I would like to express my special thanks to Paul Barrett for accompanying this work with constructive comments which, as always, are invaluable.

Funding
This research received no specific grant from any funding agency in the public, commercial, or
not-for-profit sectors.

References
Anderson, J. R. (1978). Arguments concerning representations for mental imagery. Psychological
Review, 85, 249–277.
Andrich, D. (1988). Rasch models for measurement. Newbury Park, CA: Sage.
Barone, D. F., Maddux, J. E., & Snyder, C. R. (1997). Social cognitive psychology: History and
current domains. New York, NY: Plenum.
Barrett, P. (2003). Beyond psychometrics: Measurement, non-quantitative structure, and applied
numerics. Journal of Managerial Psychology, 18, 421–439.
Barrett, P. (2005). What if there were no psychometrics? Constructs, complexity, and measurement. Journal of Personality Assessment, 85, 134–140.
Barrett, P. (2008). The consequence of sustaining a pathology: Scientific stagnation. A commentary on the target article ‘Is psychometrics a pathological science?’ by Joel Michell.
Measurement, 6, 78–123.
Barrett, P., & Rolland, J.-P. (2009). The meta-analytic correlation between the Big Five personality constructs of emotional stability and conscientiousness: Something is not quite right in the
woodshed. Retrieved from http://www.pbarrett.net/stratpapers/metacorr.pdf
Braun, F. (1877). Ueber Abweichungen vom Ohm’schen Gesetz in metallisch leitenden Körpern
[About the deviations from Ohm’s law in solids with metallic conductivity]. Annalen der
Physik, 237, 95–110.
Bridgman, P. W. (1921). Measurements of the deviation from Ohm’s law in metals at high current
densities. Proceedings of the National Academy of Sciences of the United States of America,
7, 299–303.
Bylander, J., Duty, T., & Delsing, P. (2005). Current measurement by real-time counting of single
electrons. Nature, 434, 361–364.
Campbell, N. (1920). Physics: The elements. Cambridge, UK: Cambridge University Press.
Corry, L. (2004). David Hilbert and the axiomatization of physics (1898–1918): From “Grundlagen
der Geometrie” to “Grundlagen der Physik”. Dordrecht, The Netherlands: Kluwer Academic.
Davies, P. J., & Marsh, J. O. (1985). Ohm’s law and the Schuster effect. Physical Science,
Measurement and Instrumentation, Management and Education – Reviews, IEE Proceedings
A, 132, 525–532.
de Boer, J. (1994/1995). On the history of quantity calculus and the international system.
Metrologia, 31, 405–429.
Dicker, G. (1998). Hume’s epistemology and metaphysics: An introduction. London, UK:
Routledge.
Dingler, H. (1932). Der Glaube an die Weltmaschine und seine Überwindung [Overcoming the
belief in the world-machine]. Stuttgart, Germany: Enke.
du Bois-Reymond, E. (1967). Über die Grenzen des Naturerkennens [On the limits of our understanding of nature]. Darmstadt, Germany: Wissenschaftliche Buchgesellschaft. (Original work
published 1872)
Ellis, B. (1966). Basic concepts of measurement. Cambridge, UK: Cambridge University Press.
Eysenck, H. (1995). Genius: The natural history of creativity. Cambridge, UK: Cambridge
University Press.
Fechner, G. T. (1831). Massbestimmungen über die galvanische Kette [Measurement determinations by use of the galvanic chain]. Leipzig, Germany: F. A. Brockhaus.
Feynman, R. (1985). “Surely you’re joking, Mr. Feynman!” Adventures of a curious character.
New York, NY: Norton.
Fischer, G. H. (1989). Spezifische Objektivität: Eine wissenschaftstheoretische Grundlage des
Rasch-Modells [Specific objectivity: A scientific foundation of the Rasch model]. In K. D.
Kubinger (Ed.), Moderne Testtheorie: Ein Abriß samt neuesten Beiträgen [Modern test theory:
An outline with most recent contributions] (2nd ed., pp. 87–111). Weinheim, Germany: Beltz.
Fornasini, P. (2008). The uncertainty in physical measurements: An introduction to data analysis
in the physics laboratory. New York, NY: Springer.
Gauß, C. F. (1995). Theoria combinationis observationum erroribus minimus obnoxiae [Theory of
the combination of observations least subject to error]. Philadelphia, PA: Society for Industrial
and Applied Mathematics. (Original work published 1821)
Goodeve, T. M., & Shelley, C. P. B. (1877). The Whitworth measuring machine. London, UK:
Longmans, Green, and Co.
Grice, J. W., Barrett, P. T., Schlimgen, L. A., & Abramson, C. I. (2012). Toward a brighter future
for psychology as an observation oriented science. Behavioral Sciences, 2, 1–22.
Gutjahr, W. (1977). Die Messung psychischer Eigenschaften [The measurement of psychological
attributes]. Cologne, Germany: Kiepenheuer & Witsch.
Hambleton, R. K., & Swaminathan, H. (1985). Item response theory: Principles and applications.
Boston, MA: Kluwer Nijhoff.
Hambleton, R. K., Swaminathan, H., & Rogers, H. J. (1991). Fundamentals of item response
theory. Newbury Park, CA: Sage.
Hand, D. J. (2004). Measurement theory and practice: The world through quantification. London,
UK: Edward Arnold.
Helmholtz, H. von. (1998). Zählen und Messen, erkenntnistheoretisch betrachtet [Numbering and
measuring from an epistemological viewpoint]. In E. Bonk (Ed.), Schriften zur Erkenntnistheorie
[Epistemological writings] (pp. 99–129). Vienna, Austria: Springer. (Original work published
1887)
Hölder, O. (1901). Die Axiome der Quantität und die Lehre vom Mass [The axioms of quantity
and the theory of measurement]. Berichte über die Verhandlungen der Königlich Sächsischen
Gesellschaft der Wissenschaften zu Leipzig, Mathematisch-Physische Classe, 53, 1–64.
Home, R. W. (2005). Electricity. In J. L. Heilbron (Ed.), The Oxford guide to the history of physics
and astronomy (pp. 94–96). Oxford, UK: Oxford University Press.
Hume, D. (2008). A treatise of human nature. Oxford, UK: Oxford University Press. (Original
work published 1739)
Jacobi, M. H. (1837). On the application of electro-magnetism to the movement of machines. In
R. Taylor (Ed.), Scientific memoirs, selected from the transactions of foreign academies of
science and learned societies, and from foreign journals (Vol. 1, pp. 503–531). London, UK:
Richard and John E. Taylor.
Jacobi, M. H. (1839). Ueber das chemische und das magnetische Galvanometer [On the chemical
and magnetical galvanometer]. Annalen der Physik, 124, 26–57.
Jacobi, M. H. (1842). Eine Methode, die Constanten der Volta’schen Ketten zu bestimmen
[A method for determining the constants in a voltaic chain]. Annalen der Physik, 133,
85–100.
James, W. (1955). Pragmatism and four essays from The meaning of truth. New York, NY:
Meridian Books. (Original work published 1907)
Johnson, H. M. (1945). Are psychophysical problems genuine or spurious? The American Journal
of Psychology, 58, 189–211.
Kahneman, D. (2011). Thinking, fast and slow. London, UK: Penguin.
Kahneman, D., & Tversky, A. (1979). Prospect theory: An analysis of decision under risk.
Econometrica, 47, 263–291.
Kipnis, N. (2009). A law of physics in the classroom: The case of Ohm’s law. Science & Education,
18, 349–382.
Koch, S. (1954). Clark L. Hull. In W. K. Estes (Ed.), Modern learning theory: A critical analysis
of five examples (pp. 1–176). New York, NY: Appleton.
Krantz, D. H., Luce, R. D., Suppes, P., & Tversky, A. (1971). Foundations of measurement: Vol.
1. Additive and polynomial representations. New York, NY: Academic Press.
Kyngdon, A. (2011). Plausible measurement analogies to some psychometric models of test performance. British Journal of Mathematical and Statistical Psychology, 64, 478–497.
Kyngdon, A. (2013). Descriptive theories of behaviour may allow for the scientific measurement
of psychological attributes. Theory & Psychology, 23, 227–250.
Lambdin, C. (2012). Significance tests as sorcery: Science is empirical—significance tests are not.
Theory & Psychology, 22, 67–90.
Lakatos, I. (1978). The methodology of scientific research programmes: Philosophical papers
(Vol. 1). Cambridge, UK: Cambridge University Press.
Laming, D. (1987). Psychophysics. In R. L. Gregory (Ed.), The Oxford companion to the mind (pp.
655–657). Oxford, UK: Oxford University Press.
Laming, D. (1997). The measurement of sensation. Oxford, UK: Oxford University Press.
Lenz, E. (1837). On the laws of the conducting powers of wires of different lengths and diameters
for electricity. In R. Taylor (Ed.), Scientific memoirs, selected from the transactions of foreign
academies of science and learned societies, and from foreign journals (Vol. 1, pp. 311–324).
London, UK: Richard and John E. Taylor.
Lenz, E. (1839). Ueber die Eigenschaften der magneto-elektrischen Ströme: Eine Berichtigung
des Aufsatzes von Hrn. De la Rive über denselben Gegenstand [On the properties of magnetoelectric currents: A rectification of De la Rive’s article about the same topic]. Annalen der
Physik und Chemie, 48, 385–423.
Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Reading, MA:
Addison Wesley.
Luce, R. D. (1979). Suppes’ contributions to the theory of measurement. In R. J. Bogdan (Ed.),
Patrick Suppes (pp. 93–110). Dordrecht, The Netherlands: D. Reidel.
Luce, R. D., & Narens, L. (1994). Fifteen problems concerning the representational theory of
measurement. In P. Humphreys (Ed.), Patrick Suppes: Scientific philosopher (Vol. 2, pp. 219–
249). Dordrecht, The Netherlands: Kluwer Academic.
Luce, R. D., & Suppes, P. (2002). Representational measurement theory. In H. Pashler & J. Wixted
(Eds.), Stevens’ handbook of experimental psychology (3rd ed., Vol. 4, pp. 1–41). New York,
NY: Wiley.
Luce, R. D., & Tukey, J. W. (1964). Simultaneous conjoint measurement: A new type of fundamental measurement. Journal of Mathematical Psychology, 1, 1–27.
Markus, K. A., & Borsboom, D. (2012). The cat came back: Evaluating arguments against psychological measurement. Theory & Psychology, 22, 452–466.
Maxwell, J. C. (1991). Matter and motion. New York, NY: Dover. (Original work published 1876)
Maxwell, J. C. (2001). Theory of heat. New York, NY: Dover. (Original work published 1871)
McNamara, T., & Knoch, U. (2012). The Rasch wars: The emergence of Rasch measurement in
language testing. Language Testing, 29, 555–576.
Michell, J. (1990). An introduction to the logic of psychological measurement. Hillsdale, NJ:
Erlbaum.
Michell, J. (1997). Quantitative science and the definition of measurement in psychology. British
Journal of Psychology, 88, 355–383.
Michell, J. (1999). Measurement in psychology. Cambridge, UK: Cambridge University Press.
Michell, J. (2000). Normal science, pathological science, and psychometrics. Theory & Psychology,
10, 639–667.
Michell, J. (2008a). Is psychometrics pathological science? Measurement, 6, 7–24.
Michell, J. (2008b). Rejoinder. Measurement, 6, 125–133.
Michell, J. (2009). Invalidity in validity. In R. W. Lissitz (Ed.), The concept of validity: Revisions,
new directions and applications (pp. 111–133). Charlotte, NC: Information Age Publishing, Inc.
Michell, J. (2011). Qualitative research meets the ghost of Pythagoras. Theory & Psychology, 21,
241–259.
Michell, J., & Ernst, C. (1996). The axioms of quantity and the theory of measurement: Translated
from Part I of Otto Hölder's German text "Die Axiome der Quantität und die Lehre vom Mass".
Journal of Mathematical Psychology, 40, 235–252.
Mill, J. S. (1974). A system of logic: Ratiocinative and inductive. Toronto, ON: University of
Toronto Press. (Original work published 1843)
Neville, R. G. (1962). The discovery of Boyle’s Law, 1661–62. Journal of Chemical Education,
39, 356–359.
Newton, I. (1999). Mathematical principles of natural philosophy. Berkeley: University of
California Press. (Original work published 1687)
Ohm, G. S. (1826). Bestimmung des Gesetzes, nach welchem Metalle die Contaktelektricität leiten,
nebst einem Entwurfe zu einer Theorie des voltaischen Apparats und des Schweiggerschen
Multiplicators [Determination of the law in accordance with which metals conduct contact
electricity, together with an outline of a theory of the voltaic apparatus and of Schweigger’s
multiplier]. Journal für Chemie und Physik, 46, 137–166.
Orth, B. (1974). Einführung in die Theorie des Messens [Introduction to measurement theory].
Stuttgart, Germany: Kohlhammer.
Pouillet, M. (1837). Mémoire sur la pile de Volta et sur la loi générale de l’intensité que prennent
les courants, soit qu’ils proviennent d’un seul élément, soit qu’ils proviennent d’une pile à
grande ou à petite tension [Memoir on the voltaic pile and the general law of intensity of currents from a simple chain or a pile of large and small tension]. Comptes Rendus Hebdomadaires
des Séances de l’Académie des Sciences, 4, 267–279.
Pylyshyn, Z. W. (1981). The imagery debate: Analog media versus tacit knowledge. Psychological
Review, 88, 16–45.
Rasch, G. (1977). On specific objectivity: An attempt at formalizing the request for generality and
validity of scientific statements. The Danish Yearbook of Philosophy, 14, 58–93.
Rasch, G. (1980). Probabilistic models for some intelligence and attainment tests. Copenhagen,
Denmark: Nielsen & Lydiche. (Original work published 1960)
Regnault, V. (1847). Relation des expériences entreprises par ordre de M. le Ministre des Travaux
Publics, et sur la proposition de la Commission Centrale des Machines à Vapeur, pour déterminer les principales lois physiques et les données numériques qui entrent dans le calcul des
machines à vapeur. [Report on experiments undertaken by order of the minister of public
works, and the proposal of the central committee of steam engines, to determine the main
physical laws and numerical data used in the calculation of steam engines]. Mémoires de
l’Académie Royale des Sciences de l’Institut de France, 21, 1-748.
Reid, T. (1748). An essay on quantity. Transactions of the Royal Society of London, 14,
505–520.
Saint-Mont, U. (2011). Statistik im Forschungsprozess [Statistics in the scientific process].
Heidelberg, Germany: Physika Verlag.
Saint-Mont, U. (2012). What measurement is all about. Theory & Psychology, 22, 467–485.
Sanner, C., Su, E. J., Keshet, A., Gommers, R., Shin, Y., Huang, W., & Ketterle, W. (2010).
Suppression of density fluctuations in a quantum degenerate Fermi gas. Physical Review
Letters, 105, 1–4.
Schönemann, P. H. (1994). Measurement: The reasonable ineffectiveness of mathematics in the
social sciences. In I. Borg & P. Mohler (Eds.), Trends and perspectives in empirical social
research (pp. 149–160). Berlin, Germany: Walter de Gruyter.
Schooley, J. F. (1986). Thermometry. Boca Raton, FL: CRC.
Sijtsma, K. (2011). Introduction to the measurement of psychological attributes. Measurement,
44, 1209–1219.
Sijtsma, K. (2012). Psychological measurement between physics and statistics. Theory &
Psychology, 22, 786–809.
Sixtl, F. (1982). Messmethoden der Psychologie: Theoretische Grundlagen und Probleme
[Measurement methods in psychology: Theoretical foundations and problems] (2nd ed.).
Weinheim, Germany: Beltz.
Sixtl, F. (1985). Notwendigkeit und Möglichkeit einer neuen Methodenlehre der Psychologie [The
necessity and possibility of a new methodology in psychology]. Zeitschrift für experimentelle
und angewandte Psychologie, 32, 320–339.
Sixtl, F. (1998). Der Abschied von Homme Moyen alias Average Person [Taking leave of the average person]. In W. Hacker & M. Rinck (Eds.), Bericht über den 41. Kongreß der Deutschen
Gesellschaft für Psychologie in Dresden 1998 [Report on the 41st Congress of the German
Psychological Association in Dresden, 1998] (pp. 519–526). Lengerich, Germany: Pabst
Science.
Smedslund, J. (1997). The structure of psychological common sense. Mahwah, NJ: Erlbaum.
Smedslund, J. (2009). The mismatch between current research methods and the nature of psychological phenomena: What researchers must learn from practitioners. Theory & Psychology,
19, 778–794.
Smedslund, J. (2012). The bricoleur model of psychological practice. Theory & Psychology, 22,
643–657.
Stenner, A. J., Burdick, H., Sanford, E. E., & Burdick, D. S. (2006). How accurate are Lexile text
measures? Journal of Applied Measurement, 7, 307–322.
Suppes, P., & Zinnes, J. L. (1963). Basic measurement theory. In R. D. Luce, R. R. Bush &
E. H. Galanter (Eds.), Handbook of mathematical psychology (Vol. 1, pp. 3–76). New York,
NY: Wiley.
Tarski, A. (1954). Contributions to the theory of models I. Indagationes Mathematicae, 16, 572–588.
Trendler, G. (2009). Measurement theory, psychology and the revolution that cannot happen.
Theory & Psychology, 19, 579–599.
Tukey, J. W. (1986). Data analysis and behavioral science or learning to bear the quantitative
man’s burden by shunning badmandments. In L.V. Jones (Ed.), The collected works of J. W.
Tukey: Philosophy and principles of data analysis: 1949–1964 (Vol. 3, pp. 187–390). London,
UK: Chapman & Hall.
Tversky, A., & Kahneman, D. (1992). Advances in prospect theory: Cumulative representation of
uncertainty. Journal of Risk and Uncertainty, 5, 297–323.
van der Linden, W. J., & Hambleton, R. K. (1997). Handbook of modern item response theory.
New York, NY: Springer.
Wheatstone, C. (1843). The Bakerian lecture: An account of several new instruments and processes for determining the constants of a voltaic circuit. Philosophical Transactions of the
Royal Society of London, 133, 303–327.
Wittgenstein, L. (1953). Philosophical investigations. Oxford, UK: Basil Blackwell.
Wood, R. (1978). Fitting the Rasch model: A heady tale. British Journal of Mathematical and
Statistical Psychology, 31, 27–32.
Zinnes, J. L. (1969). Scaling. Annual Review of Psychology, 20, 447–478.

Author biography
Günter Trendler is currently working as technician at a leading provider of technical solutions in
mechanized tunneling. Previously he was involved as a graduate research assistant in the
Schizophrenia Research Unit of the Central Institute of Mental Health Mannheim (CIMH) in
research to improve the early detection of the onset of psychosis. Address: Robert-Schuman-Str.
15, 67549 Worms, Germany. Email: guenter@trendler.org