OpenComps: Validity and Causal Inference

With my comprehensive exams beginning in 12 days, my studying has hit the homestretch. Thankfully, my advisor has inspired some confidence by telling me that my understanding of the math education literature is solid and I won't need any more studying in that area. That's good news for my study schedule, and something I take as a huge compliment. So now I can focus for a while on preparing for the exam question Derek Briggs is likely to throw my way. Typically, one of the three people on a comps committee is tasked with asking a question related to either the quantitative or qualitative research methodology we learn in the first year of our doctoral program. Derek is a top-notch quantitative researcher, and I enjoyed taking two classes from him last year: Measurement in Survey Research and Advanced Topics in Measurement. Where this gets slightly tricky is that Derek didn't actually teach either of my first-year quantitative methods courses, so there's a chance I could be surprised by something he normally teaches in those classes that I didn't see. It's a risk I was willing to take after working more recently and more closely with Derek in the two measurement courses last year.

It certainly won't be a surprise if Derek asks a question that focuses on issues of validity and causal inference. He mentioned it to me personally and put it in a study guide, so studying it now will be time well spent. I've had a tendency to read the validity literature a bit too quickly or superficially, so this is a good opportunity to revisit some of the papers I've looked at over the past couple of years. Here's the list I've put together for myself:

AERA/APA/NCME. (1999). Standards for educational and psychological testing. Washington, D.C.: American Educational Research Association. [Just the first chapter, "Validity."]

Angoff, W. H. (1988). Validity: An evolving concept. In H. Wainer & H. I. Braun (Eds.), Test validity (pp. 19–32). Hillsdale, NJ: Lawrence Erlbaum.

Borsboom, D., Cramer, A. O. J., Kievit, R. A., Scholten, A. Z., & Franic, S. (2009). The end of construct validity. In R. W. Lissitz (Ed.), The concept of validity: Revisions, new directions, and applications (pp. 135–170). Charlotte, NC: Information Age Publishing.

Brookhart, S. M. (2003). Developing measurement theory for classroom assessment purposes and uses. Educational Measurement: Issues and Practice, 22(4), 5–12. doi:10.1111/j.1745-3992.2003.tb00139.x

Chatterji, M. (2003). Designing and using tools for educational assessment (p. 512). Boston, MA: Allyn & Bacon. [Chapter 3, "Quality of Assessment Results: Validity, Reliability, and Utility"]

Cronbach, L. J. (1988). Five perspectives on validity argument. In H. Wainer & H. I. Braun (Eds.), Test validity (pp. 3–17). Hillsdale, NJ: Lawrence Erlbaum.

Eisenhart, M. A., & Howe, K. R. (1992). Validity in educational research. In M. D. LeCompte, W. L. Millroy, & J. Preissle (Eds.), The handbook of qualitative research in education (pp. 642–680). San Diego, CA: Academic Press.

Gorin, J. S. (2006). Test design with cognition in mind. Educational Measurement: Issues and Practice, 25(4), 21–35. doi:10.1111/j.1745-3992.2006.00076.x

Haertel, E. H., & Herman, J. L. (2005). A historical perspective on validity arguments for accountability testing. In J. L. Herman & E. H. Haertel (Eds.), Uses and misuses of data for educational accountability and improvement (104th Yearbook of the NSSE, pp. 1–34). Malden, MA: Wiley-Blackwell.

Holland, P. W. (1986). Statistics and causal inference. Journal of the American Statistical Association, 81(396), 945–960. doi:10.2307/2289069

Kane, M. T. (1992). An argument-based approach to validity. Psychological Bulletin, 112(3), 527–535. doi:10.1037/0033-2909.112.3.527

Leighton, J. P., & Gierl, M. J. (2007). Defining and evaluating models of cognition used in educational measurement to make inferences about examinees’ thinking processes. Educational Measurement: Issues and Practice, 26(2), 3–16. doi:10.1111/j.1745-3992.2007.00090.x

Linn, R. L., & Baker, E. L. (1996). Can performance-based student assessments be psychometrically sound? In J. B. Baron & D. P. Wolf (Eds.), Performance-based student assessment: Challenges and possibilities (pp. 84–103). Chicago, IL: The University of Chicago Press.

Messick, S. (1988). The once and future issues of validity: Assessing the meaning and consequences of measurement. In H. Wainer & H. I. Braun (Eds.), Test validity (pp. 33–45). Hillsdale, NJ: Lawrence Erlbaum.

Michell, J. (2009). Invalidity in validity. In R. W. Lissitz (Ed.), The concept of validity: Revisions, new directions, and applications (pp. 111–133). Charlotte, NC: Information Age Publishing.

Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference (p. 623). Boston, MA: Houghton Mifflin. [Probably Chapters 1–3 and 11, if not more.]

Shepard, L. A. (1993). Evaluating test validity. Review of Research in Education, 19(1), 405–450.

Shepard, L. A. (1997). The centrality of test use and consequences for test validity. Educational Measurement: Issues and Practice, 16(2), 5–24. doi:10.1111/j.1745-3992.1997.tb00585.x

Zumbo, B. D. (2009). Validity as contextualized and pragmatic explanation, and its implications for validation practice. In R. W. Lissitz (Ed.), The concept of validity: Revisions, new directions, and applications (pp. 65–82). Charlotte, NC: Information Age Publishing.

Thankfully, I've recently read some of these papers for my Advances in Assessment course, so the amount of reading I have to do is appreciably less than it might appear. In my typical fashion, I'll study these in chronological order in the hope of getting a sense for how the field has evolved its thinking and practice regarding these ideas over the past several decades.

Although I have little other graduate school experience to compare it to, I feel like this reading list is representative of what sets a PhD apart, particularly one earned at an R1 university. It's not necessarily glamorous, and its relevance to the day-to-day teaching and learning in classrooms might not be immediately obvious. But without attending to issues like validity and causal inference, we have a much more difficult time being sure about what we know and how we're using that knowledge. Issues of validity should be at the heart of any assessment or measurement, and when they're attended to properly we greatly improve our ability to advance educational theories and practice.

Jo Boaler, Standing Tall

Last night, Jo Boaler (whose work I've written about before) took to Twitter (welcome, Jo!) to share details of the "harassment and persecution" surrounding her research, an account she has also posted on Stanford's website (PDF). Those in the math education community had some understanding that this had been going on, and I applaud Boaler's decision to bring it out into the open.

I'm sure much will be said about this in the coming days, but I hope at least some small part of the conversation addresses the discoverability and sharability of academic work. When I search for "boaler railside" on Google, this is what I see:

[Screenshot: Google search results for "boaler railside"]

Instead of the first result pointing me to Boaler's 2008 article in Teachers College Record, I'm pointed directly to the Bishop, Clopton, and Milgram paper at the heart of this controversy. As Boaler has pointed out, that paper has never been published in a peer-reviewed journal. But it is published, in the modern sense, with perhaps something more important than peer review: a top ranking on Google. The second link points to Boaler's faculty profile, through which a couple of clicks will take you to Boaler's self-hosted copy of the Railside article. I'm linking directly to it here not only because it's an article you should keep and read, but because it obviously needs all the Google PageRank help it can get. The third link in my search also refers to the "refutation" of Boaler's work, although that site no longer appears to exist.

Why is Boaler's original work not easier to find? Let's look at the Copyright Agreement of Teachers College Record. According to TCR, it is their policy to "acquire copyright for all of the material published on TCRecord.org," and such a policy "is designed to promote the widest distribution of the material appearing on TCRecord.org while simultaneously protecting the rights of authors and of TCRecord.org as the publisher." For TCR, this "widest distribution" means putting the article behind a $7 paywall -- not an extravagant amount, but enough to keep most people from reading the work, which means not linking to it and not elevating its search rankings. (A search in Google Scholar, however, returns it as the top result.) Given the attacks on Boaler and her scholarship, has this copyright policy been "protecting the rights of authors"? In Boaler's case, it obviously hasn't. But then again, by signing over copyright, I'm not sure exactly what rights TCR says she has left to protect.

I'm glad Boaler is sharing the article on her website. If she weren't, I'd attempt to gain the rights to share it here, and that's not cheap:

[Screenshot: TCR's rights and permissions quote of $500 to republish the article]

Yes, republishing the article costs $500. Is it worth it for me to pay out of my own pocket? Probably not. But is it worth $500 to the greater mathematics education community to have it more discoverable, searchable, and sharable? Given what she's gone through, is it worth it to Jo Boaler? Yes, it is, and that's why I encourage all authors to publish in open access journals or otherwise negotiate their copyright agreements to retain greater rights over their own work, including the ability to post and share in ways that improve search rankings.

OpenComps CGI

No, I don't mean "computer-generated imagery." Or the "Clinton Global Initiative." Or "Common Gateway Interface." In the world of mathematics education, CGI stands for "Cognitively Guided Instruction," one of the most robust lines of research produced in the past several decades. If you study math education, you're probably going to study CGI. If you study math education and your advisor is from the University of Wisconsin, then you're definitely going to study CGI. Here's my reading list:

Carpenter, T. P., Fennema, E., & Franke, M. L. (1996). Cognitively guided instruction: A knowledge base for reform in primary mathematics instruction. The Elementary School Journal, 97(1), 3–20. doi:10.1086/461846

Carpenter, T. P., Fennema, E., Peterson, P. L., Chiang, C.-P., & Loef, M. (1989). Using knowledge of children’s mathematics thinking in classroom teaching: An experimental study. American Educational Research Journal, 26(4), 499–531. doi:10.3102/00028312026004499

Carpenter, T. P., & Moser, J. M. (1984). The acquisition of addition and subtraction concepts in grades one through three. Journal for Research in Mathematics Education, 15(3), 179–202. doi:10.2307/748348

Fennema, E., Carpenter, T. P., Franke, M. L., Levi, L., Jacobs, V. R., & Empson, S. B. (1996). A longitudinal study of learning to use children’s thinking in mathematics instruction. Journal for Research in Mathematics Education, 27(4), 403–434. doi:10.2307/749875

Franke, M. L., Carpenter, T. P., Levi, L., & Fennema, E. (2001). Capturing teachers’ generative change: A follow-up study of professional development in mathematics. American Educational Research Journal, 38(3), 653–689. doi:10.3102/00028312038003653

Knapp, N. F., & Peterson, P. L. (1995). Teachers’ interpretations of “CGI” after four years: Meanings and practices. Journal for Research in Mathematics Education, 26(1), 40–65. doi:10.2307/749227

This works out nicely because CGI also happens to be a topic of discussion this week in my "Advances in Assessment" class. (Related note: Due to Erin Furtak being out of town, Lorrie Shepard will be our "substitute teacher." That leads to the natural question: Great sub, or greatest sub?) CGI was also featured prominently in Randy Philipp's NCTM Research Handbook chapter on teacher beliefs and affect. Even though my knowledge of CGI is limited, I sense that lines of research like CGI are the stuff math education researchers dream about: long-lasting, productive, well-funded areas of study that help both students and teachers in measurable and meaningful ways.

OpenComps Study of Teacher Beliefs; MathEd.net Turns Three

A month from now I'll be in the midst of the written portion of my comprehensive exam. My last #OpenComps update (and several posts since then) listed several readings about teacher learning. With those complete, I'm now turning my attention to teacher beliefs with the following articles and chapters:

Fennema, E., & Franke, M. L. (1992). Teachers’ knowledge and its impact. In D. A. Grouws (Ed.), Handbook of research on mathematics teaching and learning (pp. 147–164). Reston, VA: National Council of Teachers of Mathematics.

Pajares, M. F. (1992). Teachers’ beliefs and educational research: Cleaning up a messy construct. Review of Educational Research, 62(3), 307–332. doi:10.3102/00346543062003307

Philipp, R. A. (2007). Mathematics teachers’ beliefs and affect. In F. K. Lester (Ed.), Second handbook of research on mathematics teaching and learning (pp. 257–315). Charlotte, NC: Information Age.

Thompson, A. G. (1992). Teachers’ beliefs and conceptions: A synthesis of the research. In D. A. Grouws (Ed.), Handbook of research on mathematics teaching and learning (pp. 127–146). Reston, VA: National Council of Teachers of Mathematics.

Villegas, A. M. (2007). Dispositions in teacher education: A look at social justice. Journal of Teacher Education, 58(5), 370–380. doi:10.1177/0022487107308419

Wilkins, J. L. M., & Brand, B. R. (2004). Change in preservice teachers’ beliefs: An evaluation of a mathematics methods course. School Science and Mathematics, 104(5), 226–232. doi:10.1111/j.1949-8594.2004.tb18245.x

As I usually do, I'm reading these in chronological order. I just finished the Pajares article and will move on next to Alba Thompson's well-regarded chapter from the 1992 NCTM research handbook. My advisor said I probably don't need to read the entire Fennema & Franke chapter, but there is a diagram near the end, and the context surrounding it, that I should be aware of.

MathEd.net Turns Three

Although I've been blogging my random thoughts and personal commentary since 2001, after starting graduate school I knew I'd be blogging more about education. Three years ago today, I decided it was time to split my identity: one blog and Twitter account for professional/educational content, and a separate blog and Twitter account for personal/miscellaneous content. It's been a good decision, one that has spared many of you from numerous updates about the Cubs, college wrestling, or my infrequent travels.

I'm creeping up on 40,000 page views, which I think is pretty good given how infrequently I sometimes post and how technical some of my writing has become. It reminds me of why I started this blog in the first place: as a teacher, I was willing to have my practice improved by knowledge from research, if only I could find it. The research literature was locked behind paywalls I couldn't afford, and as a lone math teacher in a rural district, I didn't have instructional coaches or curriculum staff to help me. But I knew smart people and resources existed online, and that social tools were allowing us to come together in new ways. The best ticket for admission in that social world is one's own contributions, and I'm trying to contribute something not easily found elsewhere.

I thank you all for reading, and I look forward to what the future brings -- not only for this blog and for myself, but also where this disintermediated online sharing of educational knowledge might take us.