Measurement in psychology: A case of ignoramus et ignorabimus? A rejoinder
Trendler, G.
Volume: 23
Language: English
Journal: Theory & Psychology
DOI: 10.1177/0959354313490451
Date: October, 2013
Theory & Psychology (http://tap.sagepub.com/)

Günter Trendler
Worms, Germany

Theory & Psychology 23(5), 591–615. © The Author(s) 2013. Originally published online 26 July 2013.
DOI: 10.1177/0959354313490451
The online version of this article can be found at: http://tap.sagepub.com/content/23/5/591
Published by: http://www.sagepublications.com
Version of Record: Oct 16, 2013. OnlineFirst versions: Aug 28, 2013; Jul 26, 2013.

Abstract

The Millean Quantity Objection, that is, the claim that the measurement of psychological attributes is impossible (Trendler, 2009), has been countered with partly vigorous opposition (Kyngdon, 2013; Markus & Borsboom, 2012; Saint-Mont, 2012). Kyngdon’s response is of particular interest, since he asserts that measurement may already have been established. If correct, it would definitely invalidate any quantity objection and end the century-long discussion about the measurability of psychological attributes. Therefore the focus of the rejoinder will be on the question of when measurement is reached. First the meaning of measurement is elaborated.
On this basis, criteria for the successful establishment of measurement are formulated and it is outlined how these are satisfied in the case of intensive quantities. It is concluded that the evidence presented by Kyngdon is insufficient and inadequate. The approach will also serve as background to discuss objections raised against the Millean Quantity Objection.

Keywords
measurement theory, psychometrics, quantity objection, separability

Corresponding author: Günter Trendler, Robert-Schuman-Str. 15, 67549 Worms, Germany. Email: guenter@trendler.org

The Millean Quantity Objection (Trendler, 2009), that is, the statement I put forward in this journal that psychological attributes are nonmeasurable, has been countered with partly strong opposition. Markus and Borsboom (2012) find the argumentation to be “flawed in several respects” (p. 452) and Saint-Mont (2012) believes that the “question whether an attribute (really) is ‘quantitative’ or not can hardly be answered in a reasonable way” (p. 475). Kyngdon (2013), by contrast, concedes that the quantity objection applies under certain circumstances, but he still believes that where descriptive theories of underlying cognitive processes exist and where quantitative features can be empirically manipulated, the scientific measurement of the relevant attributes is credible.

However, Markus and Borsboom (2012) recognize that if the Millean Quantity Objection is correct, then important consequences would ensue. They write: “[W]e need not argue that these are many and varied (e.g., large parts of psychometrics, intelligence research, and personality testing, but also of psychonomic research on memory, learning, cognition, etc., would be discredited)” (p. 453).
At the same time they concede that they “do not aim to defend the thesis that currently existing tests and other observational practices can be defensibly interpreted as measurement instruments in the strong sense of measurement intended by Trendler (2009) and Michell (1997, 1999)” (p. 464). Furthermore, they see no urgency in resolving the problem, but rather prefer to stay “agnostic about this issue” (p. 464) and withdraw to the position that they “are currently in a state of ignorance with respect to this issue insofar as it concerns many cases of putative measurement in psychology” (p. 455). This agnostic attitude is all the more astonishing since psychometrics is, as Joel Michell (2000, 2008a) explains, a pathological science: that is, although psychometrics is premised on the hypothesis that psychological attributes are quantitative, psychologists never seriously attempt to find out if they are really measurable (see also Barrett, 2003, 2008). Worse, there are reasons to believe that if psychological attributes are not measurable, then the application of methods relying on the quantitative assumption are systematically misleading. Under circumstances which are typical of psychology, even such relatively simple mathematical operations as averaging are problematic. Friedrich Sixtl (1982) observes: If psychological attributes were measurable, then, according to the Gaussian theory of errors, averaging would be a useful tool to account for random error. However, if this is not the case, the application of the mathematical operation is misleading, for, as Sixtl (1985) notes: The arithmetic mean can indicate the real central value of a parameter only, if the basic data are independent and identically distributed. The basic data obtained in behavioural sciences is, however, seldom independent and identically distributed because every person represents a unique individual. 
Consequently, in such cases, the obtained arithmetic mean depends upon the distribution of the parameter among the subjects. As this distribution is bound to be different in different experimental groups, the arithmetic means obtained in the same experiment conducted among different experimental groups are different. (p. 338)

Hence, Tukey (1986), as quoted by Saint-Mont (2012, p. 473), may be right in pointing out that the calculation of the arithmetic mean can be useful even if the requirements of permissible statistics are not met. But this does not apply to situations which are typical of psychology: that is, where the data are not identically distributed and where, consequently, “[t]he average response is not the response of any real person, but rather artificial” (Sixtl, 1998, p. 519). It is therefore likely that the systematic connections Markus and Borsboom (2012) believe they perceive are not an indication “that something is going on in the data” or that “psychological scales are not random collections of items” (p. 464), but illusions created by “[t]he tendency to see patterns in randomness” (Kahneman, 2011, p. 117). Critically analyzed, even such popular constructs as the “Big Five personality factors,” mentioned by Markus and Borsboom (2012, p. 464) as a supportive example for real knowledge, are far from stable or coherent (Barrett & Rolland, 2009). It is also often misleadingly suggested that statistical analysis is in general “robust” against violations of the conditions of its applicability (e.g., Saint-Mont, 2011, p. 75). Hence, if psychometrics is pathological, then almost all of psychology is pathological, namely all psychological research predicated on the assumption that “the measurability thesis” (Michell, 1999) is true.
In consequence, if psychologists prefer to preserve the status quo, they willingly choose to stay in a state of ignorance and delusion, which is hardly a scientific attitude. This is the reason why, contrary to Saint-Mont’s (2012) position that a “rigid or even ‘fundamental’ point of view is to no avail” (p. 473), the question of measurability is of crucial importance and why its resolution should not lightly be deferred to the future (see also Barrett, 2008).

Furthermore, Markus and Borsboom (2012) are of the opinion that, against the background of the long history of contesting the existence and possibility of measurement in psychology, the debate will continue unabated in the future. Hence they propose a “general framework for evaluating arguments against measurement” (pp. 461–463) in order to accommodate future discussions. One gets the impression that by attempting to anticipate even future criticism they want to take the wind out of the critics’ sails for good. What they overlook is the obvious, namely the possibility of ending the debate by demonstrating that at least one psychological attribute is measurable. In this regard I applaud Kyngdon’s (2013) attempt to defuse the objections by presenting evidence that psychological attributes may be measurable after all. Certainly, the best way to confront and finally silence the sceptic is to demonstrate that measurement has been established.

To make a proper assessment of the evidence, the criteria for the attainment of measurement must be specified. First I will elaborate in some detail on the meaning of measurement. Based on this I will formulate criteria for measurement and exemplify how they can be satisfied in the case of intensive quantities. The criteria will serve, so to speak, as a yardstick against which we can judge if (or to what extent) the measurement task has been solved. This way of proceeding will also function as the background for replying to different objections raised by the critics.
The paradigmatic meaning of measurement

Contemporary measurement theory is dominated by two concepts of measurement. Firstly, there is the classical concept as originally developed by Euclid (for details see Michell, 1999, pp. 25–33), according to which measurement is defined as the determination of the ratio of a magnitude of a quantity to another magnitude of the same quantity called the unit. More precisely,

for any magnitudes a and b of a quantity A the magnitude of a relative to b may always be expressed by a positive, real number, r, where a = r.b. That is, the ratio of a to b (a positive, real number) is the measure of a in units of b. (Michell, 1997, p. 358)

Another common way to express the same idea is by the formula “physical quantity = numerical value × unit” (de Boer, 1994/1995, p. 405). The numerical value r is usually called “measurement value.” As such the definition is still widely used in physics (see Michell, 1999, p. 33).

Secondly, we have to consider the “homomorphism definition of measurement” (Luce & Narens, 1994, p. 224), which is part of the representational theory of measurement (RTM; Luce & Suppes, 2002), presently the dominant theory, at least in the philosophy of science. In its definitive form the theory is outlined in the now classic Volume 1 of Foundations of Measurement (Krantz, Luce, Suppes, & Tversky, 1971). To understand this definition properly, Tarski’s (1954) notions of a relational system or structure and of an isomorphism between structures are essential. In basic terms, a relational structure is a set of entities together with one or more relations between the entities. An isomorphism is a mathematical one-to-one function in which one relational structure is mapped onto another such that the relations among entities of the first structure are preserved by corresponding relations among entities in the second structure.
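The idea of a structure-preserving mapping can be made concrete with a small sketch. The rods, the observed "at least as long as" comparisons, and the assigned numbers below are all hypothetical and are not taken from the article. The check covers only an order relation, so it illustrates what "preserving relations" means and does not by itself establish ratios of magnitudes.

```python
from itertools import product

# Hypothetical empirical relational structure: three rods and the pairwise
# outcomes of an "at least as long as" comparison (illustrative data only).
rods = ["r1", "r2", "r3"]
observed_at_least = {
    ("r1", "r1"), ("r2", "r2"), ("r3", "r3"),
    ("r2", "r1"), ("r3", "r1"), ("r3", "r2"),
}

# Hypothetical numerical assignment (the candidate mapping into the reals).
assignment = {"r1": 1.0, "r2": 2.5, "r3": 4.0}


def preserves_relation(objects, relation, mapping):
    """True if (x, y) is in the empirical relation exactly when mapping[x] >= mapping[y]."""
    return all(
        ((x, y) in relation) == (mapping[x] >= mapping[y])
        for x, y in product(objects, repeat=2)
    )


def is_one_to_one(mapping):
    """True if no two objects receive the same number."""
    return len(set(mapping.values())) == len(mapping)


print(preserves_relation(rods, observed_at_least, assignment))
print(is_one_to_one(assignment))
```

An assignment such as `{"r1": 3.0, "r2": 2.5, "r3": 4.0}` would fail the first check, because it reverses the observed order of `r1` and `r2`.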
In order to employ these concepts within measurement theory, the distinction is made between numerical and empirical relational systems or structures. An empirical relational structure contains “identifiable entities, such as weights, persons, attitude statements, or sounds” (Suppes & Zinnes, 1963, p. 7), or more generally “concrete objects or events” (Luce, 1979, p. 94). A numerical relational structure is a system consisting of abstract objects (usually a set of real numbers) and relations between them (e.g., relations of equivalence or order). In measurement, where the same number may be assigned to more than one object (e.g., different objects may have an equal length), a manytoone function or a homomorphism is preferable. Accordingly, measurement is “defined as a homomorphic mapping of an empirical system into a numerical system” (Zinnes, 1969, p. 454). What is the relation between the two theories? According to Hand (2004), “the classical theory is predicated on the condition that measurement is necessarily concerned with ‘quantitative’ attributes. Modern representational theory includes this as a special case” (p. 51). Luce and Narens (1994) elaborate: [E]ither directly or indirectly standard sequences are used to establish scales in almost all of the major results of FM [Foundations of Measurement]. Thus, since the process of measurement through standard sequences is usually taken as paradigmatic of ‘measurement processes’ almost all of the representational results of FM are valid not only from the RTM viewpoint but from a number of different perspectives about what measurement is. (p. 225) The primary exceptions are, as Luce and Narens note, “purely ordinal cases including variants such as interval orders and semiorders” (p. 242). 
Hence, if the set of homomorphisms is restricted to those established by means of a standard sequence, then the homomorphism definition is equivalent in meaning to the classical definition of measurement (see below for an example of the construction of a standard sequence). In other words, proponents of RTM do not doubt the actual measurement practice, as was developed in the natural sciences, but only choose to describe it differently. Or, as William James may have put it, the two definitions practically mean the same thing (for details see James, 1907/1955, pp. 41ff).

Following Luce and Narens (1994), I will call the common idea underlying the two definitions the paradigmatic meaning of measurement. For simplicity I will always use the word “measurement” subsequently in the meaning of “paradigmatic measurement.” Also, to avoid misunderstandings, the term “quantification” will be used only to refer to quantification by means of paradigmatic measurement. Hence, the answer to Saint-Mont’s (2012) question “what measurement is all about” is: measurement is about the determination of ratios of magnitudes of a quantity, no more and no less.

Cases of nominal or ordinal measurement involving the concept of quantity (e.g., “associative measurement” as described by Ellis, 1966, chap. 6) are considered as preliminary stages to paradigmatic measurement. That they are useful in their own right is not questioned. However, what matters in the context of the discussion is that most psychometric methods are based on the presumption that the conditions of paradigmatic measurement can be met. Nonparadigmatic cases of measurement will not concern us here (e.g., Beaufort’s wind scale, Mohs’ hardness scale, etc.), though their usefulness is not doubted. It is also not implied that quantitative methods should in general be replaced with qualitative methods, as falsely suggested by Saint-Mont (2012, p.
467). In this connection Markus and Borsboom (2012) allege that my argumentation “discounts the possibility that the scientific community will change its conceptual scheme so as to alter the meaning of the term measurement itself” (pp. 463–464). In my view this imputation is unjustified. If I believe that measurement is impossible in psychology, then, in the sense of paradigmatic measurement; the right to define measurement according to other criteria is certainly not contested. However, on a critical note, one must understand that by changing the definition one does not change the facts. This may sound trivial, but, after the transition from the classical definition of measurement to the representational conception, exactly this fallacy prevailed in psychology. Since then measurement is usually explained as: “When measuring some attribute of a class of objects or events, we associate numbers … with the objects in such a way that the properties of the attribute are faithfully represented as numerical properties” (Krantz et al., 1971, p. 1). Suddenly, when considered superficially, classifying people according to a presumably quantitative attribute (e.g., intelligence, ability) seemed just as easy as classifying them according to nonquantitative attributes (e.g., nationality, gender, or profession). Thus through obfuscation the illusion was created that measurement in psychology may be at least in principle an easy thing; that, since psychology “possesses ordinal relations aplenty” (Michell, 1990, p. 68), conjoint measurement (Luce & Tukey, 1964) is the royal road to measurement. Actually, the delusion has taken such deep roots that even if led close to it, some psychologists have difficulties seeing the difference. Otherwise I cannot explain why Markus and Borsboom (2012)—although they realize that the problem I pointed out (Trendler, 2009, pp. 
583–584) is not with classification per se, but with classification of people according to quantitative attributes—still claim that “if one can assess equality, one can construct nominal representations, and if one can construct nominal representations, … then the problem with psychological measurement must lie elsewhere in [the] axiomatization” (p. 460).

Measurement: The criterion

The criterion for the successful application of measurement theory is derivable from the definition of measurement. That is, the measurability of a quantity is established if it can be demonstrated empirically that there exists a ratio between some magnitudes of the quantity or, in other words, that the ratio between two magnitudes can be expressed by a real number. Though necessary, this criterion is not sufficient. Because of experimental and human fallibility, as is the case in all empirical sciences, it must be supplemented with corollary requirements. Firstly: A reported discovery must be replicated independently of the original discoverer before it is accepted as valid, ideally with improved experimental apparatus and under varying experimental circumstances. As a rule the finding must in principle be replicable any number of times.

For example, Ohm’s (1826) discovery of the law of electrical current was extensively verified by Fechner (1831) in numerous experiments. Not only did he use other measurement procedures, but he also substantially extended the validity of the law, for example, by investigating fluid conductors. Among others, decisive experimental contributions to the corroboration of Ohm’s law were made by Jacobi (1839, 1842), Lenz (1837, 1839), and Wheatstone (1843). Interestingly, apparently unaware of Ohm’s empirical work, the French physicist Pouillet (1837) rediscovered the law of electrical current.
The last great effort to verify Ohm’s law was undertaken in 1874 when the British Association for the Advancement of Science asked J. C. Maxwell to direct a committee that would perform the testing. The cause was a set of experiments by Arthur Schuster which suggested “that Ohm’s law may only be a first order approximation, and that the resistance and/or the selfinductance of a wire could be current dependent” (Davies & Marsh, 1985, p. 525). The result of the investigation was that the law was validated in its original form; respectively it was demonstrated that Schuster’s original observations were dependent on the use of tangent galvanometers in conjunction with alternating currents. Since then the law has not been seriously contested. Importantly, in physics, laws of nature are accepted as true despite their limitations (cf. Kipnis, 2009). That is, laws of nature must be replicable under some but not necessarily under all conceivable circumstances. This does not imply that their validity should fail to be extended through corroboration as much as possible, but circumstances in nature are not always favourable. For example, it has been found that in metals at high current densities Ohm’s law reaches limits of validity (e.g., Bridgman, 1921). However, such limitations are not necessarily a disadvantage, but may open new fields of investigation. Thus, for example, Ferdinand Braun’s (1877) investigations on the limits of Ohm’s law led to the discovery of semiconductors. Actually much of the task of replication comprises the systematic investigation of the limits of a law. In conclusion: Quantification can be regarded as successful even if it is experimentally verified over only a limited range of magnitudes of the quantity. Although not a necessary criterion, it is worth mentioning in this context that an important incentive for their acceptance as true laws of nature is these laws’ practical applicability. 
There certainly is legitimacy in saying that the breakthrough for Ohm’s law being generally accepted was a function of its practical usefulness, for example, in telegraphy (Wheatstone, 1843) or in the development of the electromotor (Jacobi, 1837).

Finally, the reader may have noticed that I did not take into consideration what is usually considered the criterion for successful measurement, namely the statistical fit between model and data (Sijtsma, 2011, 2012). The reason for this omission is that statistical significance is no guarantee for the replicability of measurement results; in other words, it may very well be the case that a model fits the data, but neither are the observed ratios constant nor are the measurement values replicable (e.g., Andrich, 1988, pp. 32–34). For example, as Barrett (2005) points out, it is possible that the outcomes of an experiment conform to the Rasch model, but that there is no relation to any substantive theory. This was first noted by Wood (1978), who, by fitting the Rasch model to simulated coin-tossing data, demonstrated that the model shows fit even in the case of random (i.e., meaningless) data. It is therefore in the best interest of the psychologist investigating the topic of measurability to always be guided by the following statement: significance is insignificant, what matters is replicability, as the antidote counteracting the “obsession with statistical tests of significance” (Lambdin, 2012, p. 84). As will be explained below, what is needed most is to demonstrate the replicability of measurement values.

Indirect measurement

In An Essay on Quantity, Thomas Reid (1748) already made the distinction between proper and improper quantities (or extensive and intensive quantities as they are usually called today), which, according to him, goes back to Aristotle.
The main difference is that extensive quantities (e.g., length) allow the application of physical operations which are structurally analogous to common arithmetic operations (e.g., addition, subtraction, multiplication, division) while intensive quantities (e.g., temperature) do not. The apparent difficulty in the latter case is how to determine ratios if one cannot add and subtract instances of magnitudes of the quantity. It represents one of the great scientific breakthroughs to have demonstrated that the measurement of intensive quantities is nevertheless possible, namely by means of associated factors in the manner explained below. Accordingly the distinction is usually made between fundamental (direct) and derived (indirect) measurement (Campbell, 1920; Fornasini, 2008). In short, the measurement of intensive quantities requires methods of indirect measurement. Because psychological attributes (e.g., abilities, traits, attitudes, etc.) can be conceived only as intensive quantities, indirect measurement is of particular interest in psychology. However, first we must consider how RTM approaches the problem. To properly understand the proposed twostep strategy, it must be taken into account that RTM is part of the program of the axiomatization of the natural sciences as was started at the end of the 19th century (Corry, 2004). Michell (1999) notes: “A theory is ‘axiomatized’ when it is expressed as a set of (ideally, logically independent) propositions (called ‘axioms’) from which the remainder of the theory deductively follows” (p. 194). In measurement theory the fundamental problem of axiomatization is to deduce from an appropriate set of axioms “that each magnitude, a, measured by any other magnitude, b, gives a definite numerical magnitude or number” (Michell & Ernst, 1996, p. 241). Otto Hölder’s (1901) merit is to have presented the first such complete axiomatization. 
Obviously if an attribute satisfies the axioms of measurement it is, in principle, measurable. The first step towards quantification, or “the scientific task,” as Michell (1999) calls it, comprises the task of empirically testing the axioms of measurement. Michell notes: “The scientific task having been successfully completed, it is known that the relevant attribute is quantitative and, so, it follows that it is measurable. That is, magnitudes of the quantity sustain ratios” (p. 75). However, it must be emphasized that even if this task is successfully completed, the attributes involved are not really measured. Or, in other words, we know at this stage that the attribute sustains ratios of magnitudes, but we cannot determine them numerically. To obtain measurement values it is necessary to solve, in the second step, the “scaling problem” (Orth, 1974, chap. 9) or “the instrumental task.” Michell (1999) explains: “The business of the instrumental task is to contrive procedures whereby [the] ratios can be discovered or reliably estimated” (p. 75).

With critical regard to the two-step strategy it must be noted that no clear criteria are specified for when to consider the scientific task accomplished. Actually, to my knowledge no criteria at all are proposed. This represents in my view a serious flaw. Given that the empirical structure potentially contains infinitely many objects, the instrumental task can be deferred indefinitely. More importantly, since procedures for the determination of ratios must be implemented at some point in the process of quantification (otherwise measurement is never reached), it is reasonable to dispense with the two-step strategy altogether and start with the instrumental task directly, as is the rule in physics.
As Michell (1999) notes, the determination of ratios of magnitudes is generally done by exploiting relationships between the attribute being quantified and another already quantified.

Consider for example the measurement of temperature using an ordinary thermometer. Within a specific range of temperatures, it has been found that the temperature of a liquid (say, the metal, mercury) is linearly related to its volume, if pressure is held constant. Thus, in a sealed glass tube of uniform width, for a limited range, temperature varies linearly with the height of the column of liquid. By this means, it is possible to measure temperature via measurements of length. (pp. 75–76)

If such construction is indeed possible, then procedures of indirect measurement become applicable, where the scientific and the instrumental task are solved simultaneously.

Before proceeding I would like to mention Markus and Borsboom’s (2012) critique of the two-step approach. They point out that even though the ancient Egyptians “did not have access to set theory, homomorphic mappings, the axioms of quantity, or any of the other mathematical structures” (p. 455), they succeeded in establishing successful measurement practices. The point of this example is that the construction and appropriate use of “measurement instruments was apparently possible in the absence of any success in the ‘scientific task’ (or even the awareness that there was one)” (p. 455). In a similar vein, Saint-Mont (2012) notes, echoing Schönemann (1994), that a “formalization of the measurement process followed by invariance considerations does not come first. Rather, these developments prove to be useful in structuring existing substantial results” (p. 473). In my view it is not really reasonable to ignore modern measurement theory just because our ancestors did manage quite well without it.
On the contrary, the fact that we have an axiomatic theory at our disposal is an invaluable advantage in helping us understand what the conditions for measurement are, thus facilitating the gathering and interpretation of substantial results. After all, axiomatic measurement theory can be viewed as making explicit what is implicitly contained in measurement practice. As Krantz et al. (1971) note, taking geometry as an example:

Geometry is a beautiful and far-reaching example of a foundational treatment of measurement. The science of geometry (i.e., earth measurement) was probably first developed as a set of practical procedures, either for the direct measurement of lengths and areas on the earth’s surface or in connection with the astronomy devised to serve astrology. Eventually, the tacit assumptions of practice were formulated explicitly as theorems of geometry, and these were systematically organized and deduced from a few axioms and postulates by Euclid. (p. 7)

In conclusion, yes, axioms of measurement must be tested, but not necessarily in the manner prescribed by RTM. As will become evident below, the experimental testing for measurability does indeed not involve the explicit knowledge of such mathematical concepts as, for example, a homomorphism, the Archimedean condition, or the condition of solvability. Hence, concerning this matter, I’m partly in agreement with Markus and Borsboom (2012).

To return to the main topic of this section, as the example above (as quoted from Michell, 1999) illustrates, the origin of indirect measurement can be ascribed to the empirical observation that a presumed quantity (or independent variable) A covaries with a measurable quantity (or dependent variable) P. The idea derived from such observations is that we may indirectly measure A by means of the relation P = f(A). The first problem encountered is how to determine values of A without knowing f.
Secondly, because P = f(A) represents an empirical relation, it must first be validated in experiment. This seems to lead into a circulus vitiosus, since, on the one hand, in order to measure A we must know f and, on the other hand, in order to identify the form of the function f we need to measure A. The obstacle can be overcome by incorporating a third quantity X which is empirically associated with A and P so that P = f(A, X). That is, values of P are a function of the joint variation of A and X. What is decisive is that now the quantitative hypothesis P = f(A, X) can be tested by assuming ex hypothesi that certain empirical laws connect the quantities. The simplest function that comes into consideration is the relation of (direct or inverse) proportionality. In this case the quantity P varies (directly or inversely) proportional to the product of A and X.

To test the quantitative hypothesis it is necessary to combine magnitudes a, b, c, … of A with magnitudes x, y, z, … of X and record the joint effect that ax, ay, az, …, bx, by, bz, …, cx, cy, cz, … have on P. The crucial point is that if the effect of A on P is empirically separable from the effect of X on P, then the numerical value r of the corresponding ratios

$$\frac{ax}{bx} = \frac{ay}{by} = \frac{az}{bz} = \dots = r$$

is constant. We can conclude that a/b = r, that is, a = r·b. The same logic applies symmetrically to X. Hence, although the operation of physical addition is not applicable to intensive quantities, they can nevertheless be measured according to the definition of measurement. Also note that the independent variables need not be measurable for the procedure to be applicable. As a minimum requirement we must be able to identify equal magnitudes of A and X. In conclusion, indirect measurement consists in the discovery of numerical constants in an empirical law (Helmholtz, 1887/1998).
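The ratio test described above can be sketched in a few lines. This is only an illustration under stated assumptions: the two tables of recorded P values are invented, the row labels a, b and column labels x, y, z follow the text, and the tolerance used to decide whether ratios count as "constant" is an arbitrary choice that a real experiment would have to justify from its error of measurement.

```python
import math

# Hypothetical recorded values of P. Rows: magnitudes a, b of A.
# Columns: magnitudes x, y, z of X. Invented data, not from the article.
proportional_table = {
    "a": [6.0, 12.0, 30.0],
    "b": [3.0, 6.0, 15.0],
}
irregular_table = {
    "a": [6.0, 12.0, 30.0],
    "b": [3.0, 8.0, 10.0],
}


def row_ratios(table, row_1, row_2):
    """Ratios P(row_1, column) / P(row_2, column), one per magnitude of X."""
    return [p1 / p2 for p1, p2 in zip(table[row_1], table[row_2])]


def ratios_constant(ratios, rel_tol=1e-9):
    """True if all ratios agree with the first one within rel_tol (assumed tolerance)."""
    return all(math.isclose(r, ratios[0], rel_tol=rel_tol) for r in ratios)


for name, table in [("proportional", proportional_table), ("irregular", irregular_table)]:
    ratios = row_ratios(table, "a", "b")
    print(name, ratios, ratios_constant(ratios))
```

In the first table the ratio of row a to row b is the same in every column, so it can serve as the measure of a in units of b. In the second table the ratios differ from column to column, so the proportional hypothesis is not supported for those data.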
The empirical discovery of constants in a law is the most solid empirical evidence not only for the measurability of the independent variables, but simultaneously for the validity of the empirical law. It goes without saying that such invariance was found by Ohm (1826) in his experiments on the laws of the electrical current. That the measurement of intensive quantities is about the determination of constants in an empirical law is self-evident to physicists (e.g., Jacobi, 1842; Wheatstone, 1843). At this point it is important to note that the dependent variable need not be a continuous quantity but can also be discrete. For example, following the discovery of the electron by J. J. Thomson in 1897, electric currents came to be conceived as a flow of clouds of electrons driven through the conductor by an electromotive force (Home, 2005). As Home notes, the electric charge “became a primitive term, a quality attributed to the fundamental constituents of matter that was itself left unexplained” (p. 96). From that date on it was at least theoretically possible to measure the intensity of the electrical current by counting electrons. Only recently has it become practically possible to construct measurement devices which measure the current in a microelectronic circuit by counting the number of electrons per second (i.e., a frequency) as they pass through a detection circuit. The electric current I then equals this frequency f multiplied by the electron charge e, a universal physical constant (Bylander, Duty, & Delsing, 2005). Similar attempts are made in temperature measurement (e.g., Sanner et al., 2010). Interestingly, most of the measurement laws in physics turned out to be of the simple proportional form (e.g., Newton’s second law of motion, Boyle’s law, Hooke’s law, Ohm’s law, etc.).
However, this need not be the case, since, as Georg Rasch (1960/1980) recognized, what is generally required is a special class of functions or, to use Rasch’s term, “models for measurement,” namely those that allow an algebraic separation of the effects of the independent variables on the dependent variable as illustrated above. Accordingly he called this property separability (Rasch, 1977). Which class of functions exactly satisfies this condition is still a matter of debate. Sixtl (1982) assumes that it is functions with an algebraic addition theorem (see also Fischer, 1989). In principle, for his models of measurement, Rasch made use of Gauß’s (1821/1995) discovery about the probability distribution of measurement values as observed in repeated measurements. What constitutes Gauß’s revolutionary insight is the representation, as a probability function, of the distribution of measurement values resulting from repeated observations of the same magnitude of a quantity. The distribution is now known as the Gaussian or Normal distribution. Basically the Gaussian function makes a connection between this distribution and the “true value” of the magnitude of the quantity underlying it. Hence it is a small step to go the other way round and estimate from a probability distribution the parameter of the distribution (i.e., the “true” value of the magnitude of the quantity). Thereby one assumes, metaphorically speaking, that measurement has already taken place and that as a researcher one receives the measurement protocol for data analysis (cf. Michell, 2008a). For example, the simple logistic model, as the simplest model of the Rasch family, can be expressed as:

$$\theta = \frac{\pi}{1 - \pi} = \frac{A}{D}$$

where π is the probability that a person solves an item correctly, A represents person ability and D item difficulty. θ is the ratio of the probability that a person will solve an item correctly to the probability that she will not.
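As a minimal sketch of what separability means for this model, the following Python fragment computes the odds θ = A/D and the corresponding probability π (obtained by solving θ = π/(1 − π) for π) for two persons over several items. The ability and difficulty values and the function names are hypothetical, chosen only for illustration.

```python
def odds(ability, difficulty):
    """Odds of a correct response under the simple logistic model: theta = A / D."""
    return ability / difficulty

def probability(ability, difficulty):
    """Solve theta = pi / (1 - pi) for pi, which gives pi = theta / (1 + theta)."""
    theta = odds(ability, difficulty)
    return theta / (1.0 + theta)

# Hypothetical person abilities and item difficulties (assumed values).
ability_1, ability_2 = 4.0, 1.0
difficulties = [0.5, 1.0, 2.0, 8.0]

odds_ratios = []
for d in difficulties:
    p1 = probability(ability_1, d)
    p2 = probability(ability_2, d)
    # Recover the odds from the response probabilities and compare the persons.
    odds_ratios.append((p1 / (1.0 - p1)) / (p2 / (1.0 - p2)))
    print(d, round(p1, 3), round(p2, 3), round(odds_ratios[-1], 6))
```

The response probabilities change from item to item, but the ratio of the two persons' odds stays at A₁/A₂ = 4 (up to floating-point rounding) whatever the item difficulty. The item parameter cancels, which is the algebraic separation the text refers to.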
The function f selected by Rasch to attain separability is a special case of the logistic function, namely

$$f(x) = \frac{x}{1 + x}$$

where f is an increasing monotonic function mapping positive real numbers into the real number interval [0, 1] (for models relying on the Gaussian function see Gutjahr, 1977, chap. 3). Hence the simple logistic model can be rewritten as:

$$\pi = \frac{A/D}{1 + A/D}$$

Finally, we must take into consideration how RTM deals with the measurement of intensive quantities. First, it should be pointed out that to some extent the theory of conjoint measurement was specially developed “to show that a standard sequence can be developed … without any prior extensive measurement” (Krantz et al., 1971, p. 138), contrary to Norman Campbell’s claim that derived measurement necessarily depends on the prior measurement of other attributes. The basic idea for such an alternative quantification is that if an increase from a to b within quantity A in conjunction with an increase from x to y within quantity X has an equal effect on P, then we can conclude that (b – a) = (y – x). If it is possible to adjoin such differences, then the interval (b – a) on A can be used to trade off equal intervals (y – x) = (z – y) = … on X. This procedure assumes additivity in the effects of the two factors: that is, if the sum of a and y effects equals the sum of b and x effects, then the difference between b and a effects must equal the difference between y and x effects. Put simply, “the entities that can be concatenated are intervals within one factor” (Krantz et al., 1971, p. 18). Krantz et al. (1971) summarize: [T]he presence of a second factor together with the assumption of additivity of the effects of the two factors allows us both to calibrate equal units on the first factor and to combine adjacent equal units to form a standard sequence. (p.
18) The method is illustrated as follows: Consider the problem of measuring the electrical resistance of intervals along a piece of wire (say, for purposes of calibrating a variable resistor). Suppose that the lengths of the intervals are either unknowable (the wire is coiled or inaccessible) or irrelevant (the resistivity is nonconstant if the thickness or composition of the wire varies from point to point). Moreover, suppose that the only instruments available are incandescent light bulbs, a power source of suitable output, other wires, and soldering tools. One solution is to connect one light bulb in series with an interval along the to-be-measured wire (using a pair of variable contacts to attach two leads to the endpoints of the interval) and a second light bulb to a fixed length of another (auxiliary) wire. Plug both bulbs into the power source. If the auxiliary wire has sufficiently small resistance, then the second bulb will have more current through it and be brighter than the first. Move one of the variable contacts along the to-be-measured wire toward the other until the resistance added by the interval is small enough so that the lights match in brightness. These two endpoints can then be taken as points a1 and a2 in a standard sequence. One next searches for a point a3 such that the resistance of interval a2a3 is just sufficient for the two bulbs to match, keeping the same auxiliary wire in series with the second bulb. In this way, a sequence of intervals of equal resistance can be found. … To obtain a finer or coarser sequence, shorter or longer pieces of auxiliary wire are used with the second bulb. (p. 138) Hence, the minimum requirement for constructing a standard sequence by means of this method is that it is possible to identify and compare magnitudes of the quantities for equivalence and order.
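The construction of a standard sequence in the wire example can be simulated as follows. This is a sketch under strong assumptions, not the procedure of Krantz et al. (1971) itself: the wire's cumulative resistance `R(s) = s + s²/2` is an invented ground truth that the simulated experimenter never reads directly, the bulb comparison is idealized as an error-free oracle (`bulbs_compare`), and the contact is moved by bisection. All names and numbers are hypothetical.

```python
def cumulative_resistance(s):
    """Assumed ground truth, hidden from the experimenter: R(s) = s + s**2 / 2,
    i.e. a wire whose resistivity (1 + s) grows along its length."""
    return s + 0.5 * s * s

def bulbs_compare(start, end, aux_resistance):
    """Idealized bulb comparison: -1, 0 or 1 if the interval [start, end] has
    less, equal or more resistance than the auxiliary wire."""
    diff = cumulative_resistance(end) - cumulative_resistance(start) - aux_resistance
    return (diff > 1e-12) - (diff < -1e-12)

def next_point(start, aux_resistance, wire_length, steps=200):
    """Move the second contact (by bisection) until the two bulbs match."""
    lo, hi = start, wire_length
    if bulbs_compare(start, hi, aux_resistance) < 0:
        return None  # the remaining wire is too short for another interval
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        outcome = bulbs_compare(start, mid, aux_resistance)
        if outcome == 0:
            return mid
        if outcome < 0:
            lo = mid  # interval too small: move the contact outward
        else:
            hi = mid  # interval too large: move the contact inward
    return 0.5 * (lo + hi)

def standard_sequence(aux_resistance, wire_length):
    """Points a1, a2, a3, ... marking off intervals of equal resistance."""
    points = [0.0]
    while True:
        nxt = next_point(points[-1], aux_resistance, wire_length)
        if nxt is None:
            return points
        points.append(nxt)

def count_units(end, points):
    """Number of whole standard intervals that fit between 0 and `end`."""
    return sum(1 for p in points[1:] if p <= end)

coarse = standard_sequence(0.5, 2.1)   # longer auxiliary wire, coarser sequence
fine = standard_sequence(0.25, 2.1)    # shorter auxiliary wire, finer sequence
for seq in (coarse, fine):
    print(count_units(1.28, seq) / count_units(0.76, seq))
```

With these assumed values the ratio of the two test intervals comes out the same for the coarse and the fine sequence, which is the kind of invariance "with the choice of standard sequence" that the axioms require. The sketch presupposes a perfectly reliable comparison of brightness.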
Of course, what is still necessary is “to check the appropriate axioms or, what amounts to the same thing, to show that the ratio of the resistances of any two intervals is invariant within error limits with the choice of standard sequence” (p. 138). However, seen in critical light, dispensing with extensive measurement is not necessarily an advantage, since what in theory seems rather simple proves quite laborious in practice. Krantz et al. (1971) note the example seems contrived because, “ordinarily, one would obtain a calibrated electrical meter and measure resistance in terms of previous extensive measurement” (p. 138). In my view this is not the only reason. Firstly, it is proposed that equal intensity of electrical current can be determined by visually comparing two incandescent light bulbs for equality in brightness. That is, if the lights match in brightness, it is concluded that the electrical currents are equal in intensity. Obviously it is implicitly assumed that there is a direct proportional relation between the physical intensity of light and the associated sensation of brightness. The prominent role ascribed to humans as measurement instruments is no coincidence. As a matter of fact the RTM approach in general deduces much of its plausibility, with regard to a possible application in psychology, from the hypothesis that humans function as measuring instruments (i.e., that humans are able to identify unequivocally magnitudes of physical or psychological attributes and compare them for equivalence and order; for an example see the applications of conjoint measurement as presented in Michell, 1990). However, it must be noted that humans have proven in general highly unreliable instruments for an objective evaluation of reality. In particular, as we now know from psychophysics, human “judgements of quantity are astonishingly poor” (Laming, 1987, p. 657; for details see Laming, 1997). 
Actually in physics it has been known for a long time that human senses can be quite deceiving even when confronted with such relatively simple tasks as comparing the length of two straight rods (see Goodeve & Shelley, 1877, chap. 2). Maxwell (1871/2001) summarizes the situation: In fact, our sensations of every kind depend upon so many variable conditions, that for all scientific purposes we prefer to form our estimate of the state of bodies from their observed action on some apparatus whose conditions are more simple and less variable than those of our own sense. (p. 3) In conclusion, no viable scientific measurement practice can be established by relying on humans as measurement instruments, neither in physics nor in psychology. Secondly, the application of the RTM procedure is extremely difficult even in physics where we have maximum technological control through apparatus construction. Therefore, the question arises: Is it at all possible to identify in any precise manner magnitudes of intensive quantities without the help of an associated extensive quantity and without making unrealistic assumptions such as that humans can be used as a substitute for measurement instruments? However, as explained in detail elsewhere (Trendler, 2009), the problem is not with the non-availability of extensive measurement, but with the impossibility of the Galilean revolution. Actually the RTM example gives credence to the idea that under circumstances where systematic disturbances are active, sophisticated manipulation and control through apparatus construction is necessary for attaining measurement. Before closing, I want to mention that Markus and Borsboom (2012) find fault with what I described as the method of identifying equal magnitudes of an intensive quantity (Trendler, 2009, pp. 583–584).
They linger extensively over different interpretations of the first axiom, although it is clear that the only relevant interpretation for the topic discussed is in terms of magnitudes of a quantity. As such, the axiom specifies the criteria for the successful identification of equal magnitudes of a quantity. The method I describe is not my method, as Markus and Borsboom allege, but is derivative of what J. S. Mill (1843/1974) called the “Baconian rule of varying circumstances” (p. 381). In particular I singled out the “method of concomitant variations” (pp. 398–403), which essentially consists in varying causes and observing the effect of this manipulation on the dependent variable. Furthermore, Markus and Borsboom (2012) comment: “[S]howing that one cannot establish the scale of measurement using the test … described does not rule out the possibility of establishing it by other means” (p. 461). They note: In cases where experimental manipulation is impossible, measurement procedures typically rest on triangulation of multiple measures that are aimed at assessing the same property. If one has multivariate information, then one can check that they share the same scale without resorting to manipulation. … More broadly, we think that this is precisely the method on which general psychometric modeling procedures rest. These procedures test whether the association structure in a multivariate space that arises from obtaining multiple measures of the same property is in accordance with the hypothesis that the observed variables in fact measure the same thing. (p. 461) In reply I must first point out that, if there are laws of measurement in psychology, then they are all of the general form R = f (S,O), where the dependent variable R stands for reaction variables (e.g., test result) and the independent variables S and O for stimulus variables (e.g., test item) respectively for organism variables (e.g., ability; for further details see Sixtl, 1982, pp. 24–30; cf. 
also Rasch, 1977). Secondly, given the mathematics of measurement, in particular the concept of separability, experimental investigation always involves the combination of different magnitudes of a quantity S with different magnitudes of a quantity O and the observation of the effect the combination has on the dependent variable R while controlling systematic disturbances. The application of this method presupposes that we can actually identify equal magnitudes of the presumed quantities S and O. We can of course also assume, as is usually done in psychology, that no systematic disturbances distort the relation and test the quantitative hypothesis directly. However, if the test fails (i.e., if the criterion of measurement is not satisfied), we must verify whether indeed equal magnitudes of S and O have been experimentally combined. Actually this course of action is in principle confirmed by the authors quoted by Markus and Borsboom (2012; i.e., Hambleton & Swaminathan, 1985; Lord & Novick, 1968; Rasch, 1960/1980; van der Linden & Hambleton, 1997) as instances of an alternative methodological approach. As Hambleton, Swaminathan, and Rogers (1991) note with regard to item response theory (IRT): “An assumption common to the IRT models most widely used is that only one ability is measured by the items that make up the test. This is called the assumption of unidimensionality” (p. 9). They elaborate: This assumption cannot be strictly met because several cognitive, personality, and test-taking factors always affect test performance, at least to some extent. These factors might include level of motivation, test anxiety, ability to work quickly, tendency to guess when in doubt about answers, and cognitive skills in addition to the dominant one measured by the set of test items.
What is required for the unidimensionality assumption to be met adequately by a set of test data is the presence of a “dominant” [emphasis added] component or factor that influences test performance. (p. 9) A second, related assumption is the requirement of local independence: Local independence means that when the abilities influencing test performance are held constant, examinees’ responses to any pair of items are statistically independent. In other words, after taking examinees’ abilities into account, no relationship exists between examinees’ responses to different items. Simply put, this means that the abilities specified in the model are the only factors [emphasis added] influencing examinees’ responses to test items. (p. 10) Obviously if these requirements are not satisfied, then the separation of the variables of interest will fail. Conversely, if other dominant factors (i.e., systematic disturbances) influence test performance, the model will be unsuited for the purpose of measurement (cf. Sijtsma, 2011, 2012). That other dominant factors are always active in psychological experiments, and that their control through apparatus construction, at least to the extent necessary for a successful application of measurement theory, is impossible, is precisely my argument. Hence, the literature quoted by Markus and Borsboom (2012) does not support their case. To conclude, the range of experimental methods applicable to empirical structures (n.b., interpreted as relations between magnitudes of a quantity) is constrained by the mathematics of the homomorphic numerical structures. This is the reason why the methods employed for achieving measurement in psychology (including the method for testing the first condition of quantity) are structurally analogous to the methods used in the natural sciences. Or, in short, empirical separability structurally correlates with algebraic separability. 
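How an additional dominant factor undermines separation can be illustrated with a toy calculation. The sketch assumes the odds form θ = A/D of the simple logistic model and adds a purely hypothetical multiplicative disturbance (for instance, test anxiety that affects one person on the hardest item only). Neither the multiplicative form nor the numbers are taken from the cited literature.

```python
def odds(ability, difficulty, disturbance=1.0):
    """theta = A / D, optionally scaled by a hypothetical disturbing factor."""
    return ability * disturbance / difficulty

# Hypothetical abilities and item difficulties (assumed values).
ability_1, ability_2 = 4.0, 1.0
difficulties = [0.5, 2.0, 8.0]
# Assumed disturbance: acts on person 1 only, and only on the hardest item.
disturbance_1 = [1.0, 1.0, 0.5]

clean = [odds(ability_1, d) / odds(ability_2, d) for d in difficulties]
disturbed = [odds(ability_1, d, m) / odds(ability_2, d)
             for d, m in zip(difficulties, disturbance_1)]

print(clean)      # [4.0, 4.0, 4.0]
print(disturbed)  # [4.0, 4.0, 2.0]
```

Without the disturbance the ratio of the two persons' odds is the same for every item. Once the extra factor is active, the ratio depends on the item, so no single number can stand for the relation between the two abilities.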
That psychologists will, eventually, concoct new ways of achieving their goals, which “may not be structurally analogous” (Markus & Borsboom, 2012, p. 463), is, I believe, for the reasons stated here and previously (Trendler, 2009), merely a logical possibility.

Theory and measurement

According to Kyngdon (2013), “scientific measurement of psychological attributes depends not only upon adequate stimulus control, but also upon descriptive theories of psychological systems and the demonstration of pure differences in degree (magnitude) within attributes hypothesized to be quantitative” (p. 227). Later he amplifies this point by claiming that “the fundamental problem facing psychological measurement is not lack of experimental control, manipulation, and apparatus. It is the lack of extant quantities and the lack of descriptive theories of psychological systems” (p. 230). Kyngdon uses two theories to substantiate his claims, namely the cumulative prospect theory (Kahneman & Tversky, 1979; Tversky & Kahneman, 1992) and the Lexile Framework for Reading (Stenner, Burdick, Sanford, & Burdick, 2006). As I believe that the objections I will raise apply equally to both, I will restrict my comments to the Lexile theory. The Lexile Framework for Reading is a modified Rasch model (for details see Appendix B of Kyngdon, 2013). The basic idea is that reading-item response probabilities are a function of the reading ability of persons and the difficulty of the text read. According to the criterion for measurement, we would expect that evidence is brought forward which shows that the ratio between the reading abilities of two persons is constant over item difficulty and vice versa (cf. Andrich, 1988, pp. 20–23). Kyngdon (2013), however, presents measurement as a fait accompli (e.g., he reports on p.
233 that the measurement value for a certain item is 1,210 L; the letter L standing for the unit of measurement “Lexile”). As has been explained above, measurement values can be accepted as valid only if they satisfy certain conditions. That is, in view of the fact that the objective sought is measurement on an interval scale, what must be demonstrated is, as pointed out by Krantz et al. (1971), that the ratio of reading ability (respectively item difficulty) of any two intervals is invariant within error limits with the choice of the standard unit. In particular, what must be empirically proven is the constancy (within experiments) and replicability (over experiments) of measurement values, ideally by other independent researchers. As Regnault (1847) notes with regard to temperature measurement: [T]here is a prerequisite that all measuring equipment must meet; it is not only necessary that it is always strictly comparable to itself, that is to say, it always marks the same level of degree under identical circumstances, but it is necessary that we can reproduce it at will and still get strictly comparable instruments. (pp. 164–165) And Wittgenstein (1953) outlined the importance of this aspect for the practice of measurement as follows: The procedure of putting a lump of cheese on a balance and fixing the price by the turn of the scale would lose its point if it frequently happened for such lumps to suddenly grow or shrink for no obvious reason. (p. 56e) That is, if it is found, for example, that the ratio between the ability of two persons is r, then the ratio must prove invariant over replications. Conversely, if the measurement value of reading ability for a person varies (beyond the limits of random error) over repeated measurements, it would be useless as an indicator for the level of ability. Actually, Regnault’s condition of comparability corresponds to what I have called “the first condition of quantity” (Trendler, 2009, p. 583). 
In conclusion, unless constancy and reproducibility of measurement values are demonstrated, it is not even certain whether the first condition of quantity is satisfied. However, Kyngdon (2013) is of the opinion that decisive evidence for measurability consists rather in the demonstration that differences between degrees and pairs of degrees of the relevant, psychological quantity are homogeneous. He explains: “By homogeneous, it is meant that the degrees (magnitudes) of a continuous quantity must vary only by amount, not kind” (p. 233) and indicates that the concept of homogeneity was recognized by Hölder (1901) and has been recently emphasized by Michell (2009). Accordingly, his intention is to demonstrate that reading items “can be engineered whilst holding the content and number of words the same, such that differences between the Lexile difficulties of the texts are approximately equal” (Kyngdon, 2013, p. 236). But again, it is not evident to what extent the reported measurement values have the properties required. Admittedly, if such construction is possible it would represent a supplementary indication of the presence of quantitative structure. However, the demonstration of homogeneity is not a necessary criterion for the establishment of an attribute as measurable. For example, in the case of temperature measurement by means of real gases, matters are not that simple; in view of the fact that real gases behave only to a very limited extent like ideal gases, a demonstration of homogeneity, in line with the method described above, is not possible. Nevertheless temperature is considered to be measurable (Schooley, 1986, chap. 2).
It should not go unmentioned that Kyngdon (2013) concedes that the Millean Quantity Objection “appears to have force for psychological attributes where heterogeneous, qualitative differences between degrees can be deduced from the relevant stimuli” (p. 244). But note that this conclusion does not rely on the failure to satisfy the criterion for measurement but is deduced from “the content and structure” (p. 238) of test items. For example, Kyngdon remarks with regard to an item utilized in the Trends in International Mathematics and Science Study (TIMSS) that [w]hat makes the algebra item harder than the decimals item (knowledge of algebra and “test wiseness” in dealing with the red herring) is qualitatively different to what makes the measurement item harder than the whole numbers item (knowledge of units of measurement of time and correct unit conversions). (p. 240) Unfortunately, in this regard I cannot recognize any substantial difference to the Lexile Framework. That is, from the content and structure of the Lexile items we can just as easily deduce that it is not only verbal working memory that influences outcome, but a mixture of cognitive processes. Furthermore, Kyngdon (2013) shares the view that, in particular, cognitive theories are relevant for measurement, since if “sensations, utility, and cognitive abilities are indeed measurable, descriptive theories of the response process which connect the hypothesized psychological quantity to identifiable features of the stimuli must be developed” (p. 230; for details see also Sijtsma, 2012). Accordingly, he criticizes the fact that I ignore the role of theory. Indeed, I believe that what thwarts measurement attempts in psychology is not so much the impossibility of meaningful theory, but the impossibility of the Galilean revolution: that is, the impossibility to make scientific progress through the construction of experimental apparatuses. 
In essence the Millean Quantity Objection says nothing but that the Galilean revolution is impossible in psychology. This does not imply that I deny the role of theory in the process of quantification. However, by mentioning as an example “descriptive theories of atomic physics” (Kyngdon, 2013, p. 229), or by pointing out that the definition of the second is in atomic terms, Kyngdon gives the impression that sophisticated theory may be necessary from the start. This is all the more surprising since he is very reserved with regard to the psychological theory underlying the Lexile Framework. Theory describing the cognitive processes is almost completely missing from his account, and where one would expect a firework of empirical studies about the cognitive structure of “verbal working memory” (p. 232), not one single study is mentioned. However, in the beginning there always are relatively simple observations and simple theories. For example, Boyle “explained the ‘spring’ of the air by drawing an analogy with a mass of fleece or wool which may be compressed or expanded according to the pressure applied to it” (Neville, 1962, p. 357; cf. also Rasch, 1977, pp. 70–73). What is characteristic for a “progressive research program” (Lakatos, 1978) is a cumulative progress consisting to a large extent in the gradual refinement of theory and experimental apparatus in interplay, so that refinement in experimental technique leads to more theory and vice versa. That this process (i.e., the Galilean revolution) cannot be started in psychology is what makes it a degenerative research program.
Hence, I certainly do not question that qualitative laws are meaningful in psychology (e.g., that the higher the motivation of a person, the faster she accomplishes some task, or that the more intelligent a person is, the more difficult items she can solve, etc.; for a systematic overview of such qualitative laws see Smedslund, 1997) and that relatively simple theories are heuristically useful, but I doubt that these laws can be expressed in quantitative form (not even on the level of a nominal scale). Finally, a critical note on Kyngdon’s (2013) proposition that empirical studies in support of theories must be based upon the presentation of stimuli to humans (such as test items) and inferring [emphasis added] something of the relevant, unobservable attribute (such as a cognitive ability of some kind) from the observable response made to the stimulus (such as a correct answer). (pp. 229–230) Carefully examined, however, this method of inferring from the known (observable) to the unknown (hypothetical construct) as is commonly used in cognitive psychology is an instance of what H. M. Johnson (1945) identifies as a “spurious problem,” that is, a problem that is based on “a requirement that cannot be satisfied” (p. 189). Actually what we are confronted with in cognitive psychology is a modern version of what Johnson calls “the psychophysical requirement.” By that term he means “a requirement of ascertaining a correspondence between classes of entities that we may call bodily, and other entities that we may call mental” (p. 192). As he points out, in order to make such an inference, we must first provide “the law of functional interdependence” (p. 208) between the two. If we conceive the relation between observable stimuli (and responses) and unobservable attributes as y = f(x), then “[t]o get the law by induction we must have the simultaneous values of both terms in the relationship; whereas we can observe but one of the terms” (p. 208).
This, Johnson notes, “would virtually demand that we solve one equation for two unknowns” (p. 201). Hence, he concludes, the problems resulting from deducing or inferring from the known to the unknown “are not genuine, but spurious. In other words, they are forever insoluble. The attempts to solve them have yielded only facts of assumption, which can never be tested by experience” (p. 211). The problem has not gone unnoticed in cognitive psychology. Of particular interest is the argument about “representational indeterminacy” presented by Anderson (1978) in the context of “the imagery debate” (Pylyshyn, 1981). This debate mainly concerns the question of the format in which information is mentally represented. On one side there are those who argue that visual information is encoded “in terms of properties that are quite spatial and modality specific” (Anderson, 1978, p. 249); on the other side of the issue are those “who argue that imagery is encoded in an abstract propositional format” (p. 250). After analysing the pro- and contra-imagery arguments, Anderson demonstrates that “it is not possible for behavioral data to uniquely decide issues of internal representation” (pp. 262–263). Even more, he argues that models with different theories of representation are not discriminable on the basis of purely behavioral data. It is not even possible to unequivocally decide whether the information is represented in an analogous or in a digital format. Hence, if no behavior-independent access is found to “the mental,” cognitive psychologists will forever only be scratching the surface of the cognitive apparatus. Or, in other words, the content of “the mental” as a black box is constrained only by human fantasy. In conclusion: From cognitive psychology we certainly cannot expect any support in solving the measurement problem.
It is struggling with a fundamental, seemingly unsolvable issue all of its own.

The Newtonian curse: An epilogue

Previously I explained why in my view measurement in psychology is impossible (Trendler, 2009); presently, in reply to the critique, I focus on the criterion for measurement, that is, on the question of how we can decide if measurement has been attained. Measured against this criterion, we are forced to conclude that the idea that “descriptive theories of behaviour may allow for the scientific measurement of psychological attributes” (Kyngdon, 2013, p. 227) is far from evident. Unfortunately I must concur with Michell (2008b), who remarked with regard to Kyngdon’s work that it is premature to “conclude from his results that reading comprehension ability is a quantitative attribute, for much more research would need to be done before that hypothesis is made even plausible” (p. 132). Actually the Rasch hypothesis in its most general form claims that humans are “Rasch-Machines” designed by nature to function according to the law $\theta = A \cdot D^{-1}$. That this hypothesis can be established as true is in my view just as likely as that Ohm’s law can be established only with the help of electric phenomena as found in nature (e.g., by investigating electric fish) and not, as actually happened, with the help of an experimental apparatus assembled out of processed objects. What must be taken into consideration is Hugo Dingler’s (1932) remark: “Apparatus do not arise in nature, neither do they grow on the trees, nor are they dug out of the ground somewhere” (p. 21). In general, faced with the measurement problem, psychology is left with the following alternatives: discuss the topic from here to eternity, as Markus and Borsboom (2012) suggest, seriously attempt to find out if psychological attributes are measurable (e.g., Kyngdon, 2013), or look for other solutions (e.g., Grice, Barrett, Schlimgen, & Abramson, 2012; Michell, 2011; Smedslund, 1997, 2012).
Obviously I would encourage the last two alternatives. In view of the fact that the measurability debate has now gone on for centuries without any progress, the resilience of the discussion is astonishing. Not even the grandiose failure of two large-scale research programs, namely psychophysics (Laming, 1997) and behaviorism (Koch, 1954), has persuaded psychologists to give up the quantitative imperative. In my view “the Rasch wars” (McNamara & Knoch, 2012) and similar contemporary attempts (e.g., Kyngdon, 2011) will end the same way: namely, in total defeat.

Realistically viewed, however, I fear that, for reasons given below, the measurability debate indeed has the potential to continue unabated for an indefinite time. As such it may suffer the fate of other similar debates. For example, in his famous lecture Über die Grenzen des Naturerkennens, held before the German Association for the Advancement of Science and Medicine in 1872, Emil du Bois-Reymond vigorously opposed the boundless optimism about the possibilities of science that was widespread at the end of the 19th century (du Bois-Reymond, 1872/1967). In his lecture, he identified an instance of the limits of scientific knowledge in the mind–body problem (i.e., in the problem of how to explain consciousness from its material conditions). He argued not only that it is not solvable on the basis of the present state of knowledge, but that in principle it will never be solved. To sum up this thesis he coined the Latin slogan ignoramus et ignorabimus (i.e., we do not know and will not know). So far, history has proven him right: The mind–body debate goes on unabated without any sign of a resolution. In my view the Millean Quantity Objection is to be classified in the same category of insoluble problems.
As Markus and Borsboom (2012) correctly point out, what is claimed is “the impossibility of epistemic access to the truth of the measurement claim; not the impossibility of that claim holding true” (p. 455). Indeed, the issue with measurement in psychology is not ontological but epistemological in nature. That is, in reality psychological attributes might be quantitative, but whether they really are we do not know, and we never will know.

Likewise, I think that mainstream psychology will ignore fundamental objections like the Millean Quantity Objection. The ignorance will persist as long as psychologists cannot rid themselves of the obsession with emulating physics in the hope of attaining a similar success in prediction and control. The idea of a Newton of psychology has haunted psychology ever since David Hume modeled his A Treatise of Human Nature (1739/2008) after Newton’s Principia Mathematica (1687/1999), hoping that “his psychology would add as much to our knowledge of human nature as Newton” (Dicker, 1998, p. 75). More recently and most forcefully, Clark L. Hull “sought to model a scientific psychology after Newton’s physics” (Barone, Maddux, & Snyder, 1997, p. 42), and many seriously thought that he would finally be the Newton of psychology. It should also be noted that Maxwell’s analysis of Newton’s second law of motion (in Maxwell, 1876/1991) served Georg Rasch as an important source of inspiration for his models of measurement (Rasch, 1960/1980, chap. 7).

Ironically, the result of the effort to make psychology more scientific by modeling it after physics is that it has become a perfect pathological science, in the sense that it now perfectly mimics the natural sciences. As is known from biology, one form of mimicry consists in the similarity in appearance between the mimicking and the mimicked species (e.g., a harmless snake mimics the external appearance of a dangerous species).
Similarly, mainstream psychology mimics successful sciences like physics. Psychologists are seen to perform experiments, analyze data, test hypotheses, publish results in journals, organize conferences, establish psychological institutes, award prizes, and so on. That is, in appearance they perform the same actions as physicists; but the essential difference is that while physicists accumulate real, replicable knowledge, no such accumulation is to be found in “Newtonian” psychology. This phenomenon has also been described as “cargo cult science” (Feynman, 1985). Thus the Newtonian model turned into a curse and the Newtons of psychology “turned out rather to be the Newton[s] of Alchemy” (Eysenck, 1995, p. 45).

However, breaking the Newtonian curse may prove extremely difficult, since psychologists would have to accept definite limitations to what they can do. In this connection I would like to mention Smedslund (2009), who describes four obstacles facing the search for general empirical laws along the lines specified by the natural sciences (and which, by the way, are also excellent arguments against the measurability of psychological attributes). Firstly, we must consider the openness of the person to an indefinite number of possible influences which determine behavior. Secondly, there is the phenomenon of irreversibility: that is, the fact that persons remember and learn from their experiences, which means that psychological processes are irreversible. Thirdly, persons are not isolated entities but part of a shared meaning system (i.e., system of families, cultural practices, etc.). And fourthly, we must always take into account personal uniqueness: that is, fortuitous events and unique life situations make the history of persons unique.
Hence, giving up the model of the natural sciences would imply bidding farewell to the long-cherished view that knowledge in psychology will ultimately be universal, cumulative, and atemporal, and accepting that it is and always will be to a large extent historically and culturally bound, dependent on circumstances, transitory, and fragmentary.

Acknowledgements

I would like to express my special thanks to Paul Barrett for accompanying this work with constructive comments which, as always, are invaluable.

Funding

This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

References

Anderson, J. R. (1978). Arguments concerning representations for mental imagery. Psychological Review, 85, 249–277.
Andrich, D. (1988). Rasch models for measurement. Newbury Park, CA: Sage.
Barone, D. F., Maddux, J. E., & Snyder, C. R. (1997). Social cognitive psychology: History and current domains. New York, NY: Plenum.
Barrett, P. (2003). Beyond psychometrics: Measurement, nonquantitative structure, and applied numerics. Journal of Managerial Psychology, 18, 421–439.
Barrett, P. (2005). What if there were no psychometrics? Constructs, complexity, and measurement. Journal of Personality Assessment, 85, 134–140.
Barrett, P. (2008). The consequence of sustaining a pathology: Scientific stagnation. A commentary on the target article ‘Is psychometrics a pathological science?’ by Joel Michell. Measurement, 6, 78–123.
Barrett, P., & Rolland, J.-P. (2009). The meta-analytic correlation between the Big Five personality constructs of emotional stability and conscientiousness: Something is not quite right in the woodshed. Retrieved from http://www.pbarrett.net/stratpapers/metacorr.pdf
Braun, F. (1877). Ueber Abweichungen vom Ohm’schen Gesetz in metallisch leitenden Körpern [About the deviations from Ohm’s law in solids with metallic conductivity].
Annalen der Physik, 237, 95–110.
Bridgman, P. W. (1921). Measurements of the deviation from Ohm’s law in metals at high current densities. Proceedings of the National Academy of Sciences of the United States of America, 7, 299–303.
Bylander, J., Duty, T., & Delsing, P. (2005). Current measurement by real-time counting of single electrons. Nature, 434, 361–364.
Campbell, N. (1920). Physics: The elements. Cambridge, UK: Cambridge University Press.
Corry, L. (2004). David Hilbert and the axiomatization of physics (1898–1918): From “Grundlagen der Geometrie” to “Grundlagen der Physik”. Dordrecht, The Netherlands: Kluwer Academic.
Davies, P. J., & Marsh, J. O. (1985). Ohm’s law and the Schuster effect. Physical Science, Measurement and Instrumentation, Management and Education – Reviews, IEE Proceedings A, 132, 525–532.
de Boer, J. (1994/1995). On the history of quantity calculus and the international system. Metrologia, 31, 405–429.
Dicker, G. (1998). Hume’s epistemology and metaphysics: An introduction. London, UK: Routledge.
Dingler, H. (1932). Der Glaube an die Weltmaschine und seine Überwindung [Overcoming the belief in the world-machine]. Stuttgart, Germany: Enke.
du Bois-Reymond, E. (1967). Über die Grenzen des Naturerkennens [On the limits of our understanding of nature]. Darmstadt, Germany: Wissenschaftliche Buchgesellschaft. (Original work published 1872)
Ellis, B. (1966). Basic concepts of measurement. Cambridge, UK: Cambridge University Press.
Eysenck, H. (1995). Genius: The natural history of creativity. Cambridge, UK: Cambridge University Press.
Fechner, G. T. (1831). Massbestimmungen über die galvanische Kette [Measurement determinations by use of the galvanic chain]. Leipzig, Germany: F. A. Brockhaus.
Feynman, R. (1985). “Surely you’re joking, Mr. Feynman!” Adventures of a curious character. New York, NY: Norton.
Fischer, G. H. (1989).
Spezifische Objektivität: Eine wissenschaftstheoretische Grundlage des Rasch-Modells [Specific objectivity: A scientific foundation of the Rasch model]. In K. D. Kubinger (Ed.), Moderne Testtheorie: Ein Abriß samt neuesten Beiträgen [Modern test theory: An outline with most recent contributions] (2nd ed., pp. 87–111). Weinheim, Germany: Beltz.
Fornasini, P. (2008). The uncertainty in physical measurements: An introduction to data analysis in the physics laboratory. New York, NY: Springer.
Gauß, C. F. (1995). Theoria combinationis observationum erroribus minimis obnoxiae [Theory of the combination of observations least subject to error]. Philadelphia, PA: Society for Industrial and Applied Mathematics. (Original work published 1821)
Goodeve, T. M., & Shelley, C. P. B. (1877). The Whitworth measuring machine. London, UK: Longmans, Green, and Co.
Grice, J. W., Barrett, P. T., Schlimgen, L. A., & Abramson, C. I. (2012). Toward a brighter future for psychology as an observation oriented science. Behavioral Sciences, 2, 1–22.
Gutjahr, W. (1977). Die Messung psychischer Eigenschaften [The measurement of psychological attributes]. Cologne, Germany: Kiepenheuer & Witsch.
Hambleton, R. K., & Swaminathan, H. (1985). Item response theory: Principles and applications. Boston, MA: Kluwer Nijhoff.
Hambleton, R. K., Swaminathan, H., & Rogers, H. J. (1991). Fundamentals of item response theory. Newbury Park, CA: Sage.
Hand, D. J. (2004). Measurement theory and practice: The world through quantification. London, UK: Edward Arnold.
Helmholtz, H. von. (1998). Zählen und Messen, erkenntnistheoretisch betrachtet [Numbering and measuring from an epistemological viewpoint]. In E. Bonk (Ed.), Schriften zur Erkenntnistheorie [Epistemological writings] (pp. 99–129). Vienna, Austria: Springer. (Original work published 1887)
Hölder, O. (1901).
Die Axiome der Quantität und die Lehre vom Mass [The axioms of quantity and the theory of measurement]. Berichte über die Verhandlungen der Königlich Sächsischen Gesellschaft der Wissenschaften zu Leipzig, Mathematisch-Physische Classe, 53, 1–64.
Home, R. W. (2005). Electricity. In J. L. Heilbron (Ed.), The Oxford guide to the history of physics and astronomy (pp. 94–96). Oxford, UK: Oxford University Press.
Hume, D. (2008). A treatise of human nature. Oxford, UK: Oxford University Press. (Original work published 1739)
Jacobi, M. H. (1837). On the application of electromagnetism to the movement of machines. In R. Taylor (Ed.), Scientific memoirs, selected from the transactions of foreign academies of science and learned societies, and from foreign journals (Vol. 1, pp. 503–531). London, UK: Richard and John E. Taylor.
Jacobi, M. H. (1839). Ueber das chemische und das magnetische Galvanometer [On the chemical and magnetical galvanometer]. Annalen der Physik, 124, 26–57.
Jacobi, M. H. (1842). Eine Methode, die Constanten der Volta’schen Ketten zu bestimmen [A method for determining the constants in a voltaic chain]. Annalen der Physik, 133, 85–100.
James, W. (1955). Pragmatism and four essays from The meaning of truth. New York, NY: Meridian Books. (Original work published 1907)
Johnson, H. M. (1945). Are psychophysical problems genuine or spurious? The American Journal of Psychology, 58, 189–211.
Kahneman, D. (2011). Thinking, fast and slow. London, UK: Penguin.
Kahneman, D., & Tversky, A. (1979). Prospect theory: An analysis of decision under risk. Econometrica, 47, 263–291.
Kipnis, N. (2009). A law of physics in the classroom: The case of Ohm’s law. Science & Education, 18, 349–382.
Koch, S. (1954). Clark L. Hull. In W. K. Estes (Ed.), Modern learning theory: A critical analysis of five examples (pp. 1–176). New York, NY: Appleton.
Krantz, D. H., Luce, R. D., Suppes, P., & Tversky, A. (1971). Foundations of measurement: Vol. 1.
Additive and polynomial representations. New York, NY: Academic Press.
Kyngdon, A. (2011). Plausible measurement analogies to some psychometric models of test performance. British Journal of Mathematical and Statistical Psychology, 64, 478–497.
Kyngdon, A. (2013). Descriptive theories of behaviour may allow for the scientific measurement of psychological attributes. Theory & Psychology, 23, 227–250.
Lambdin, C. (2012). Significance tests as sorcery: Science is empirical—significance tests are not. Theory & Psychology, 22, 67–90.
Lakatos, I. (1978). The methodology of scientific research programmes: Philosophical papers (Vol. 1). Cambridge, UK: Cambridge University Press.
Laming, D. (1987). Psychophysics. In R. L. Gregory (Ed.), The Oxford companion to the mind (pp. 655–657). Oxford, UK: Oxford University Press.
Laming, D. (1997). The measurement of sensation. Oxford, UK: Oxford University Press.
Lenz, E. (1837). On the laws of the conducting powers of wires of different lengths and diameters for electricity. In R. Taylor (Ed.), Scientific memoirs, selected from the transactions of foreign academies of science and learned societies, and from foreign journals (Vol. 1, pp. 311–324). London, UK: Richard and John E. Taylor.
Lenz, E. (1839). Ueber die Eigenschaften der magnetoelektrischen Ströme: Eine Berichtigung des Aufsatzes von Hrn. De la Rive über denselben Gegenstand [On the properties of magnetoelectric currents: A rectification of De la Rive’s article about the same topic]. Annalen der Physik und Chemie, 48, 385–423.
Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Reading, MA: Addison-Wesley.
Luce, R. D. (1979). Suppes’ contributions to the theory of measurement. In R. J. Bogdan (Ed.), Patrick Suppes (pp. 93–110). Dordrecht, The Netherlands: D. Reidel.
Luce, R. D., & Narens, L. (1994).
Fifteen problems concerning the representational theory of measurement. In P. Humphreys (Ed.), Patrick Suppes: Scientific philosopher (Vol. 2, pp. 219–249). Dordrecht, The Netherlands: Kluwer Academic.
Luce, R. D., & Suppes, P. (2002). Representational measurement theory. In H. Pashler & J. Wixted (Eds.), Stevens’ handbook of experimental psychology (3rd ed., Vol. 4, pp. 1–41). New York, NY: Wiley.
Luce, R. D., & Tukey, J. W. (1964). Simultaneous conjoint measurement: A new type of fundamental measurement. Journal of Mathematical Psychology, 1, 1–27.
Markus, K. A., & Borsboom, D. (2012). The cat came back: Evaluating arguments against psychological measurement. Theory & Psychology, 22, 452–466.
Maxwell, J. C. (1991). Matter and motion. New York, NY: Dover. (Original work published 1876)
Maxwell, J. C. (2001). Theory of heat. New York, NY: Dover. (Original work published 1871)
McNamara, T., & Knoch, U. (2012). The Rasch wars: The emergence of Rasch measurement in language testing. Language Testing, 29, 555–576.
Michell, J. (1990). An introduction to the logic of psychological measurement. Hillsdale, NJ: Erlbaum.
Michell, J. (1997). Quantitative science and the definition of measurement in psychology. British Journal of Psychology, 88, 355–383.
Michell, J. (1999). Measurement in psychology. Cambridge, UK: Cambridge University Press.
Michell, J. (2000). Normal science, pathological science, and psychometrics. Theory & Psychology, 10, 639–667.
Michell, J. (2008a). Is psychometrics pathological science? Measurement, 6, 7–24.
Michell, J. (2008b). Rejoinder. Measurement, 6, 125–133.
Michell, J. (2009). Invalidity in validity. In R. W. Lissitz (Ed.), The concept of validity: Revisions, new directions and applications (pp. 111–133). Charlotte, NC: Information Age Publishing, Inc.
Michell, J. (2011). Qualitative research meets the ghost of Pythagoras. Theory & Psychology, 21, 241–259.
Michell, J., & Ernst, C. (1996).
The axioms of quantity and the theory of measurement: Translated from Part I of Otto Hölder’s German text “Die Axiome der Quantität und die Lehre vom Mass.” Journal of Mathematical Psychology, 40, 235–252.
Mill, J. S. (1974). A system of logic: Ratiocinative and inductive. Toronto, ON: University of Toronto Press. (Original work published 1843)
Neville, R. G. (1962). The discovery of Boyle’s Law, 1661–62. Journal of Chemical Education, 39, 356–359.
Newton, I. (1999). Mathematical principles of natural philosophy. Berkeley: University of California Press. (Original work published 1687)
Ohm, G. S. (1826). Bestimmung des Gesetzes, nach welchem Metalle die Contaktelektricität leiten, nebst einem Entwurfe zu einer Theorie des voltaischen Apparats und des Schweiggerschen Multiplicators [Determination of the law in accordance with which metals conduct contact electricity, together with an outline of a theory of the voltaic apparatus and of Schweigger’s multiplier]. Journal für Chemie und Physik, 46, 137–166.
Orth, B. (1974). Einführung in die Theorie des Messens [Introduction to measurement theory]. Stuttgart, Germany: Kohlhammer.
Pouillet, M. (1837). Mémoire sur la pile de Volta et sur la loi générale de l’intensité que prennent les courants, soit qu’ils proviennent d’un seul élément, soit qu’ils proviennent d’une pile à grande ou à petite tension [Memoir on the voltaic pile and the general law of intensity of currents from a simple chain or a pile of large and small tension]. Comptes Rendus Hebdomadaires des Séances de l’Académie des Sciences, 4, 267–279.
Pylyshyn, Z. W. (1981). The imagery debate: Analog media versus tacit knowledge. Psychological Review, 88, 16–45.
Rasch, G. (1977). On specific objectivity: An attempt at formalizing the request for generality and validity of scientific statements. The Danish Yearbook of Philosophy, 14, 58–93.
Rasch, G. (1980).
Probabilistic models for some intelligence and attainment tests. Copenhagen, Denmark: Nielsen & Lydiche. (Original work published 1960)
Regnault, V. (1847). Relation des expériences entreprises par ordre de M. le Ministre des Travaux Publics, et sur la proposition de la Commission Centrale des Machines à Vapeur, pour déterminer les principales lois physiques et les données numériques qui entrent dans le calcul des machines à vapeur [Report on experiments undertaken by order of the minister of public works, and the proposal of the central committee of steam engines, to determine the main physical laws and numerical data used in the calculation of steam engines]. Mémoires de l’Académie Royale des Sciences de l’Institut de France, 21, 1748.
Reid, T. (1748). An essay on quantity. Transactions of the Royal Society of London, 14, 505–520.
Saint-Mont, U. (2011). Statistik im Forschungsprozess [Statistics in the scientific process]. Heidelberg, Germany: Physica-Verlag.
Saint-Mont, U. (2012). What measurement is all about. Theory & Psychology, 22, 467–485.
Sanner, C., Su, E. J., Keshet, A., Gommers, R., Shin, Y., Huang, W., & Ketterle, W. (2010). Suppression of density fluctuations in a quantum degenerate Fermi gas. Physical Review Letters, 105, 1–4.
Schönemann, P. H. (1994). Measurement: The reasonable ineffectiveness of mathematics in the social sciences. In I. Borg & P. Mohler (Eds.), Trends and perspectives in empirical social research (pp. 149–160). Berlin, Germany: Walter de Gruyter.
Schooley, J. F. (1986). Thermometry. Boca Raton, FL: CRC.
Sijtsma, K. (2011). Introduction to the measurement of psychological attributes. Measurement, 44, 1209–1219.
Sijtsma, K. (2012). Psychological measurement between physics and statistics. Theory & Psychology, 22, 786–809.
Sixtl, F. (1982). Messmethoden der Psychologie: Theoretische Grundlagen und Probleme [Measurement methods in psychology: Theoretical foundations and problems] (2nd ed.). Weinheim, Germany: Beltz.
Sixtl, F. (1985).
Notwendigkeit und Möglichkeit einer neuen Methodenlehre der Psychologie [The necessity and possibility of a new methodology in psychology]. Zeitschrift für experimentelle und angewandte Psychologie, 32, 320–339.
Sixtl, F. (1998). Der Abschied von Homme Moyen alias Average Person [Taking leave of the average person]. In W. Hacker & M. Rinck (Eds.), Bericht über den 41. Kongreß der Deutschen Gesellschaft für Psychologie in Dresden 1998 [Report on the 41st Congress of the German Psychological Association in Dresden, 1998] (pp. 519–526). Lengerich, Germany: Pabst Science.
Smedslund, J. (1997). The structure of psychological common sense. Mahwah, NJ: Erlbaum.
Smedslund, J. (2009). The mismatch between current research methods and the nature of psychological phenomena: What researchers must learn from practitioners. Theory & Psychology, 19, 778–794.
Smedslund, J. (2012). The bricoleur model of psychological practice. Theory & Psychology, 22, 643–657.
Stenner, A. J., Burdick, H., Sanford, E. E., & Burdick, D. S. (2006). How accurate are Lexile text measures? Journal of Applied Measurement, 7, 307–322.
Suppes, P., & Zinnes, J. L. (1963). Basic measurement theory. In R. D. Luce, R. R. Bush & E. H. Galanter (Eds.), Handbook of mathematical psychology (Vol. 1, pp. 3–76). New York, NY: Wiley.
Tarski, A. (1954). Contributions to the theory of models I. Indagationes Mathematicae, 16, 572–588.
Trendler, G. (2009). Measurement theory, psychology and the revolution that cannot happen. Theory & Psychology, 19, 579–599.
Tukey, J. W. (1986). Data analysis and behavioral science or learning to bear the quantitative man’s burden by shunning badmandments. In L. V. Jones (Ed.), The collected works of J. W. Tukey: Philosophy and principles of data analysis: 1949–1964 (Vol. 3, pp. 187–390). London, UK: Chapman & Hall.
Tversky, A., & Kahneman, D. (1992).
Advances in prospect theory: Cumulative representation of uncertainty. Journal of Risk and Uncertainty, 5, 297–323.
van der Linden, W. J., & Hambleton, R. K. (1997). Handbook of modern item response theory. New York, NY: Springer.
Wheatstone, C. (1843). The Bakerian lecture: An account of several new instruments and processes for determining the constants of a voltaic circuit. Philosophical Transactions of the Royal Society of London, 133, 303–327.
Wittgenstein, L. (1953). Philosophical investigations. Oxford, UK: Basil Blackwell.
Wood, R. (1978). Fitting the Rasch model: A heady tale. British Journal of Mathematical and Statistical Psychology, 31, 27–32.
Zinnes, J. L. (1969). Scaling. Annual Review of Psychology, 20, 447–478.

Author biography

Günter Trendler is currently working as a technician at a leading provider of technical solutions in mechanized tunneling. Previously he was involved as a graduate research assistant in the Schizophrenia Research Unit of the Central Institute of Mental Health Mannheim (CIMH) in research to improve the early detection of the onset of psychosis. Address: Robert-Schuman-Str. 15, 67549 Worms, Germany. Email: guenter@trendler.org