re-edited 11/5/19

Part II: The Capacity to Benefit from Formal Academic Schooling:
two ideologies of distribution
by Edward G. Rozycki, Ed.D.

The "gold standard" or "priors" one chooses as basic reference points for test accuracy comparisons can have a substantial effect on estimated test performance characteristics.[15] An educational philosophy or organizational ideology -- not infrequently viewed by some practitioners as dispensable verbiage -- can importantly determine what the priors will be.[16] For example, for public education in the U.S., let us consider two ideologies: a) DuBois' Talented Tenth, or TT; and b) what I will call public school ideology, or PS.[16b]

Let us suppose that W.E.B. DuBois had it right: every human group has at best a "talented tenth" who both need and can use a formal, theoretical education in order to maximize their social contribution.[17] Assuming a normal distribution of talent (ND) in a group, this would restrict academic schooling to students with an IQ of about 119 and above.[18] But public school ideology -- with recent amendments from Special Education Law -- has it that students with IQs above 85 (-1 s.d.) should be in the regular classroom with no special support.

Let us use DuBois' theory of prevalence to generate the "true" and "false" numbers, the gold standard for identifying those "in need" of academic schooling. Public school placement (PS) procedures, which assume the validity of ND, we will treat as a sorting test. That test must have a sensitivity that identifies students from -1 s.d. on up as talented. So the sum of true positives + false positives must equal about 84% of the total population. (This is a shift of 74% of the population to the "in need" group.)
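As a rough check on these figures, the following Python sketch computes the share of a normally distributed population scoring above 85 and fills in the resulting cells. It rests on an assumption not stated above: that TT membership and PS placement are both cutoffs on the same IQ scale, so that every TT student also passes the PS cutoff. The variable names are illustrative only.

```python
from statistics import NormalDist

# Assumed IQ scale: mean 100, 1 s.d. = 15 points (as in footnote 18).
iq = NormalDist(mu=100, sigma=15)

tt_prevalence = 0.10              # DuBois' "talented tenth" (the gold standard)
ps_positive = 1 - iq.cdf(85)      # share placed by PS: IQ above -1 s.d.

# Assumption: both classifications cut the same scale,
# so every TT student is also PS-positive (no false negatives).
true_pos = tt_prevalence
false_pos = ps_positive - true_pos
true_neg = 1 - ps_positive

ps_ppv = true_pos / (true_pos + false_pos)
ps_specificity = true_neg / (true_neg + false_pos)

print(f"PS-positive share: {ps_positive:.3f}")
print(f"false positives:   {false_pos:.3f}")
print(f"PPV:               {ps_ppv:.3f}")
print(f"specificity:       {ps_specificity:.3f}")
```

Under that assumption, about 74% of the population are false positives relative to the TT standard, and the PPV of PS placement comes to roughly 12%.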

Chart 3

If we look at Chart 3 above (which reintroduces Chart 2 from the previous essay, Classification Error in Evaluation Practice), we can see that with a TT prevalence of 10% we would need a test sensitivity above 95% to reach a PPV above 68%. We can get more accuracy with Chart 4, which treats sensitivity (or prevalence) as a function of PPV and prevalence (or sensitivity, respectively). Locating a PPV of .90 on the x-axis and, treating the y-axis as incidence (that is, prevalence), a TT incidence of .10, we find that we must use a test with sensitivity = .99.
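The two chart readings can be checked numerically. The sketch below uses the formulas given in footnote [19], which assume sensitivity = specificity; the function names are hypothetical and chosen only for this illustration.

```python
def ppv(s: float, p: float) -> float:
    """PPV when sensitivity and specificity both equal s (footnote 19)."""
    return (s * p) / (s * p + (1 - s) * (1 - p))


def sensitivity_for(target_ppv: float, p: float) -> float:
    """Sensitivity needed to reach target_ppv at prevalence p (footnote 19)."""
    return (p - 1) / ((2 * p - 1) - (p / target_ppv))


print(round(ppv(0.95, 0.10), 3))              # 0.679
print(round(sensitivity_for(0.90, 0.10), 3))  # 0.988
```

A sensitivity of .95 at 10% prevalence gives a PPV of about .68, and a PPV of .90 requires a sensitivity of about .99, in agreement with the readings from Charts 3 and 4.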

Chart 4

The colored areas of charts 2 and 4 map into each other. Note that for a given value of PPV, sensitivity and prevalence relate inversely.[19] For low-prevalence populations, specificity is used rather than sensitivity. This allows us to reach higher levels of NPV (negative predictive value), the probability of accurately rejecting those who lack a specific characteristic. Thus, by focussing on specificity, we have a higher probability with TT of correctly rejecting students who don't belong than we would have of correctly identifying those who do by using a highly sensitive test.
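To make the contrast concrete, here is a small sketch comparing PPV and NPV at the TT prevalence of 10%. It uses the standard definitions of the two predictive values; the 0.95 figures for sensitivity and specificity are illustrative assumptions, not values read from the charts.

```python
def predictive_values(s: float, sp: float, p: float) -> tuple[float, float]:
    """Return (PPV, NPV) for sensitivity s, specificity sp, prevalence p."""
    true_pos = s * p
    false_pos = (1 - sp) * (1 - p)
    true_neg = sp * (1 - p)
    false_neg = (1 - s) * p
    return (true_pos / (true_pos + false_pos),
            true_neg / (true_neg + false_neg))


tt_ppv, tt_npv = predictive_values(s=0.95, sp=0.95, p=0.10)
print(f"PPV = {tt_ppv:.3f}, NPV = {tt_npv:.3f}")  # PPV = 0.679, NPV = 0.994
```

With these assumed values, a rejection is correct about 99% of the time, while an admission is correct only about 68% of the time.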

However, in the real world of educational politics, it is difficult to persuade parents and their advocates that a test that rejects students from a desired category is to be preferred to a test that admits them. This consideration may help explain, despite parental objection, the inclination of school boards to let Gifted Education, admittance to which is often premised on an IQ of 135, languish unfunded.[20]



[15] See Lucas M. Bachman et al "Consequences of different diagnostic 'gold standards' in test accuracy research: Carpal Tunnel Syndrome as an example" in International Journal of Epidemiology 2005; 34:953-955 available online as pdf at yudkodw

[16] See Edward G. Rozycki, "Philosophy and Education: What's The Connection?" at

PPV is a function of three variables, i.e. PPV = F(s, sp, p), where s = sensitivity, sp = specificity, and p = prevalence; even so, s and sp may -- by empirical determination -- be independent of each other. However, the working assumptions in public education are biased toward treating s and sp as dependent. Also, the error of disregarding the effects of p is far from uncommon. I will speculate why this situation exists:

a. There are few uncontroversial "gold standards" in, for example, public education, for identifying either educational goals or deficits -- estimates tend to be based on tradition and anecdote; such dissensus is generally found in many publicly influenced institutions, for example, national defense policy or national health policy;

b. Separate testing for s and sp may be seen as unnecessarily increasing costs, especially since the classifications involved are binary -- also, letting s stand in for sp may be thought to be inconsequential;

c. There is a bias toward maximizing inclusion, especially if the treatment outcome is seen as desirable, or, mutatis mutandis, the need for treatment dire.

[16b] See Lynn Fendler and Irfan Muzaffar, "The History of the Bell Curve: sorting and the idea of the normal" in Educational Theory Vol 58, No 1 (2008), pp. 63-82.

[17] W. E. B. DuBois "The Talented Tenth" Chapter Two in The Negro Problem (New York: James Pott and Company, 1903)

[18] In a normal IQ distribution, 90% of the population falls below a point approximately +1.3 standard deviations from the mean of 100. Using 15 points = 1 s.d., we get an IQ of approximately 119 as the lower limit of the "talented tenth."
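The cutoff can be reproduced with Python's standard library. This is only a sketch under the footnote's own assumptions (a normal distribution with mean 100 and 15 points per standard deviation); the variable names are illustrative.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)       # assumed IQ scale: mean 100, 1 s.d. = 15

z_cutoff = NormalDist().inv_cdf(0.90)   # z-score with 90% of the population below it
iq_cutoff = iq.inv_cdf(0.90)            # the same point expressed on the IQ scale

print(f"z = {z_cutoff:.2f}, IQ = {iq_cutoff:.1f}")  # z = 1.28, IQ = 119.2
```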

[19] The formula that generates Chart 2 assumes sensitivity = specificity, as explained in footnote 6 of Part I. This assumption reflects public school ideology biased toward inclusivity. With s = sensitivity and p = prevalence, the formula PPV = (s*p)/((s*p)+((1-s)*(1-p))) can be manipulated algebraically to yield either
s = (p-1)/((2p-1)-(p/PPV)), or p = (s-1)/((2s-1)-(s/PPV)). Either formula generates Chart 4.
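For readers who want the intermediate steps, the algebra for s can be sketched as follows. Here V is shorthand for PPV (a symbol introduced only for this derivation), and products are written with an explicit dot to avoid confusion with sp, which denotes specificity in footnote [16].

```latex
\begin{align*}
V &= \frac{s \cdot p}{s \cdot p + (1-s)(1-p)} \\
V\,(2\,s \cdot p - s - p + 1) &= s \cdot p \\
s\,(2pV - V - p) &= V\,(p - 1) \\
s &= \frac{V\,(p-1)}{2pV - V - p} = \frac{p-1}{(2p-1) - p/V}
\end{align*}
```

Because the original formula is symmetric in s and p, exchanging the two symbols in the last line gives the corresponding expression for p.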

[20] See Edward G. Rozycki, "Justice Through Testing" available at