Published: April 6, 2011

So I'm back from spring break and I notice that the Thomas B. Fordham Institute had an op-ed in the Akron Beacon Journal on Sunday that was critical of my Monday, March 28, story on rising math scores on the National Assessment of Educational Progress (NAEP).

Here is the paragraph in the Fordham op-ed that got my attention:

It gets worse as you dig deeper. The black-white achievement gap has actually grown since the early 1990s. In 1992, the achievement gap in fourth-grade math was 15 percentage points; in 2009 it had yawned to 40 percentage points. In every grade and subject measured the gap has grown during the past two decades, leading one to wonder how one expert in the news story could declare that gains have been ''extraordinary'' and that we should be happy about the achievement among Ohio's African-American students.

A 40-point yawning achievement gap? Say what? I crunched those NAEP numbers pretty hard and I didn't recall seeing that kind of gap in fourth-grade math. So I went back and looked and sure enough, the gap between the average scores of white students and black students was 28 points in 1992 and 27 in 2009. Basically flat.

(If you want to crunch them yourselves, you can use the NAEP Data Explorer. Click on the green box [Main NDE] and then follow the steps. You'll have to noodle around a little to get the hang of selecting criteria, variables, and statistical options.)

Yes, the gap didn't close, but that was because both groups did significantly better. For the gap to close, black students' gains would have to dramatically outpace white students' gains.

However, Fordham said the gap actually grew to 40 percentage points. Where did they get that figure?

If you read the paragraph in the opinion piece just above the one I quoted, the Fordham authors are talking about the percentages of students scoring at the "Proficient" level. This is where I think their yawning 40-point gap comes from.

In 1992, 18 percent of white students scored Proficient or better on the NAEP and 3 percent of black students did. There's the 15 percentage points. In 2009, 14 percent of black students scored Proficient or better, while 54 percent of white students did. So there's the yawning 40 percentage points. (Note that both groups improved their scores, but whites improved theirs at this level much more.)
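For anyone who wants to check that arithmetic, here is a minimal sketch using only the four Proficient-or-better percentages quoted above. The variable names are mine, made up for illustration; nothing here comes from the NAEP Data Explorer itself.

```python
# Percent of Ohio fourth graders scoring Proficient or better in math,
# as quoted in the paragraph above.
proficient = {
    1992: {"white": 18, "black": 3},
    2009: {"white": 54, "black": 14},
}

# Gap in percentage points for each year (white minus black).
gaps = {year: pct["white"] - pct["black"] for year, pct in proficient.items()}

# How much each group's share grew between 1992 and 2009.
gains = {
    group: proficient[2009][group] - proficient[1992][group]
    for group in ("white", "black")
}

print(gaps)   # gap at the Proficient level, by year
print(gains)  # growth in each group's share, in percentage points
```

Both groups' shares went up, which is why the gap at this one cut score can widen even while black students' results improve.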

But at the Basic level (one level below Proficient) the gap is narrowing. In 1992, the gap between white and black students scoring at the Basic level or higher was 41 percentage points. In 2009, it was 27 points.

I'm not sure what a gap analysis at either level really tells us, though.

As my story pointed out, the insertion of these cut scores establishing proficiency levels has been controversial from the beginning. Here's some cutting-room floor material that I didn't include in the story because of space considerations:

The ''Nation's Report Card'' began in 1969 (state level reports began in 1990) as a ''solely descriptive'' sampling of the nation's students, according to Gerald Bracey, an independent researcher and national expert on testing who died in October of 2009.

''Its purpose was to provide an indicator of the nation's general education health by determining what students knew and didn't know in the same way that a health survey determines what proportion of the people have tuberculosis or low body fat,'' Bracey wrote in an article published in Educational Leadership the month after he died.

That changed in 1988 when Congress amended the law to permit state-by-state comparisons and decide what students at certain ages should know, which added a prescriptive element to the tests by categorizing scores as Below Basic, Basic, Proficient or Advanced. Bracey notes that the U.S. General Accounting Office, the National Academy of Sciences and the National Academy of Education all have criticized the achievement levels.

''These critiques point out that the methods for constructing the levels are flawed, that the levels demand unreasonably high performance, and that they yield results that are not corroborated by other measures,'' Bracey wrote.

Proficiency levels aside, however, it's clear that Ohio's white students and black students did not start out on an even field on this test. In 1992, the average score for white students was 222. The average score for black students was 194. In 2009, black fourth graders were getting the average score that white students got in 1992. White students scored 249 on the test in 2009.
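Here is a small sketch of those average scores side by side, for readers who want to see the gap and the gains together. The 2009 figure for black students (222) is taken from the statement above that it matched the 1992 white average; the names in the code are hypothetical.

```python
# Average NAEP fourth-grade math scores for Ohio, from the paragraph above.
averages = {
    1992: {"white": 222, "black": 194},
    2009: {"white": 249, "black": 222},  # black 2009 = white 1992 average
}

# Gap in scale-score points for each year (white minus black).
score_gaps = {yr: avg["white"] - avg["black"] for yr, avg in averages.items()}

# Points gained by each group from 1992 to 2009.
score_gains = {
    group: averages[2009][group] - averages[1992][group]
    for group in ("white", "black")
}

print(score_gaps)   # gap in average scores, by year
print(score_gains)  # points gained by each group
```

The gap barely moves because both groups gained nearly the same number of points.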

Are those gains a big deal? Social scientists measure these things using a statistical concept known as standard deviation. This gets a little complicated, but imagine your typical bell curve graph with the average score at the top of the bell, which then slopes out in either direction approaching zero at the extremes. In a normal distribution, about a third of all scores (roughly 34 percent) fall within a set distance above the average and about a third fall within the same distance below it. That distance either way is what they call a standard deviation. The rarer scores on either end make up the final third or so that fall more than one standard deviation from the average.

I interviewed Richard Rothstein at the Economic Policy Institute, who told me that in social science research, a gain of more than 0.3 of a standard deviation would be considered a success. A gain of a whole standard deviation would be "very, very rare."

The standard deviation on the 1992 math test for Ohio black fourth graders was 29 points. The average score for black fourth graders in 2009 was 28 points higher than it was in 1992.

That's almost an entire standard deviation difference, which is what prompted Rothstein's quote that the gain was "truly extraordinary."

What I needed to do was put that gain into simple English. Mr. Rothstein kindly shared a conversion chart that matches standard deviation units to percentile ranks.

That's what allowed me to write that "the average black fourth-grader in 2009 scored better than 83 percent of the black students who took the test in 1992."
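If you'd like to reproduce that 83 percent figure, here is a rough sketch. It is not Mr. Rothstein's chart. It assumes the 1992 scores followed a normal bell curve, which is my assumption, and it uses the normal-curve function in Python's standard library to turn standard deviation units into a percentile rank.

```python
from statistics import NormalDist

sd_1992 = 29  # standard deviation for Ohio black fourth graders, 1992
gain = 28     # rise in the average score from 1992 to 2009

# Express the gain in standard deviation units.
effect_size = gain / sd_1992

# Assuming a normal distribution, find the share of 1992 test-takers
# who scored below a student sitting that far above the 1992 average.
percentile = NormalDist().cdf(effect_size) * 100

print(f"Gain in standard deviations: {effect_size:.2f}")
print(f"Percentile rank in the 1992 distribution: {percentile:.0f}")
```

Under that assumption the gain works out to about 0.97 of a standard deviation, which lands at roughly the 83rd percentile of the 1992 distribution.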

Finally, I doubt anyone in education is satisfied with achievement for students of any race. But it's simply not true that educators have been spinning their wheels for two decades when it comes to math, at least according to the "gold standard" of testing.