Monday, April 6, 2009

Overview: Reliability and Validity

These related research issues ask us to consider whether we are studying what we think we are studying and whether the measures we use are consistent. To read more about these issues, click on the list below:

Reliability
Validity
Commentary
Key Terms
Annotated Bibliography
Related Links
Contributors to this Guide

Reliability
Reliability is the extent to which an experiment, test, or any measuring procedure yields the same result on repeated trials. Without the agreement of independent observers able to replicate research procedures, or the ability to use research tools and procedures that yield consistent measurements, researchers would be unable to satisfactorily draw conclusions, formulate theories, or make claims about the generalizability of their research. In addition to its important role in research, reliability is critical for many parts of our lives, including manufacturing, medicine, and sports.

Reliability is such an important concept that it has been defined in terms of its application to a wide range of activities. For researchers, four key types of reliability are:

Equivalency Reliability
Stability Reliability
Internal Consistency
Interrater Reliability

Reliability: Example


An example of the importance of reliability is the use of measuring devices in Olympic track and field events. For the vast majority of people, ordinary measuring rulers and their degree of accuracy are reliable enough. However, for an Olympic event, such as the discus throw, the slightest variation in a measuring device -- whether it is a tape, clock, or other device -- could mean the difference between the gold and silver medals. Additionally, it could mean the difference between a new world record and outright failure to qualify for an event. Olympic measuring devices, then, must be reliable from one throw or race to another and from one competition to another. They must also be reliable when used in different parts of the world, as temperature, air pressure, humidity, interpretation, or other variables might affect their readings.


Equivalency Reliability

Equivalency reliability is the extent to which two items measure identical concepts at an identical level of difficulty. Equivalency reliability is determined by relating two sets of test scores to one another to highlight the degree of relationship or association. In quantitative studies, and particularly in experimental studies, a correlation coefficient, statistically referred to as r, is used to show the strength of the correlation between a dependent variable (the subject under study) and one or more independent variables, which are manipulated to determine effects on the dependent variable. An important consideration is that equivalency reliability is concerned with correlational, not causal, relationships.

For example, a researcher studying university English students happened to notice that when some students were studying for finals, their holiday shopping began. Intrigued by this, the researcher attempted to observe how often, or to what degree, these two behaviors co-occurred throughout the academic year. The researcher used the results of the observations to assess the correlation between studying throughout the academic year and shopping for gifts. The researcher concluded there was poor equivalency reliability between the two actions. In other words, studying was not a reliable predictor of shopping for gifts.
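Since equivalency reliability is expressed through a correlation coefficient, the researcher's conclusion can be sketched numerically. The weekly counts below are hypothetical, invented purely for illustration:

```python
# A minimal sketch of computing the correlation coefficient r for the
# studying-vs-shopping example. All data here are hypothetical.

import statistics


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


# Hypothetical weekly observations: hours studied vs. gifts purchased.
hours_studied = [2, 5, 8, 20, 30, 4, 3, 25]
gifts_purchased = [0, 3, 1, 4, 0, 2, 5, 1]

r = pearson_r(hours_studied, gifts_purchased)
print(round(r, 2))  # a value near 0 indicates poor equivalency reliability
```

With data like these, r falls near zero, which is the numerical form of the researcher's conclusion that studying did not reliably predict gift shopping.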


Stability Reliability


Stability reliability (sometimes called test-retest reliability) is the agreement of measuring instruments over time. To determine stability, a measure or test is repeated on the same subjects at a future date. Results are compared and correlated with the initial test to give a measure of stability.

An example of stability reliability would be the method of maintaining weights used by the U.S. National Bureau of Standards. Platinum objects of fixed weight (one kilogram, one pound, etc.) are kept locked away. Once a year they are taken out and weighed, allowing scales to be reset so they are "weighing" accurately. Keeping track of how much the scales are off from year to year establishes a stability reliability for these instruments. In this instance, the platinum weights themselves are assumed to be perfectly stable.
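The Bureau's procedure can be sketched as a simple calibration check. The yearly scale readings below are hypothetical, invented for illustration:

```python
# A minimal sketch of the calibration idea: the platinum standard is
# assumed perfectly stable, so any deviation in a reading is scale drift.
# All readings below are hypothetical.

TRUE_MASS_KG = 1.000  # the locked-away platinum standard

# One reading of the standard per year from the same scale, before resetting.
yearly_readings = [1.0002, 0.9998, 1.0004, 0.9997, 1.0003]

# Each error is the scale's drift for that year; small, consistent errors
# mean the instrument has good stability reliability.
errors = [reading - TRUE_MASS_KG for reading in yearly_readings]
mean_abs_drift = sum(abs(e) for e in errors) / len(errors)

print(f"mean absolute drift: {mean_abs_drift:.4f} kg")
```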


Internal Consistency

Internal consistency is the extent to which tests or procedures assess the same characteristic, skill, or quality. It is a measure of the precision between observers or among the measuring instruments used in a study. This type of reliability often helps researchers interpret data and predict the value of scores and the limits of the relationship among variables.

For example, a researcher designs a questionnaire to find out about college students' dissatisfaction with a particular textbook. Analyzing the internal consistency of the survey items dealing with dissatisfaction will reveal the extent to which items on the questionnaire focus on the notion of dissatisfaction.
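One common statistic for this kind of analysis is Cronbach's alpha, sketched below for the textbook-dissatisfaction example. The 1-5 ratings are hypothetical:

```python
# A minimal sketch of Cronbach's alpha, a standard internal-consistency
# statistic. The survey ratings below are hypothetical.

def cronbach_alpha(items):
    """items: one list per survey item, respondents aligned by position."""
    k = len(items)       # number of items
    n = len(items[0])    # number of respondents

    def variance(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    sum_item_variances = sum(variance(item) for item in items)
    totals = [sum(item[i] for item in items) for i in range(n)]
    total_variance = variance(totals)
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)


# Four dissatisfaction items, five respondents (1 = satisfied, 5 = dissatisfied).
item_scores = [
    [4, 5, 2, 4, 1],
    [4, 4, 2, 5, 1],
    [3, 5, 1, 4, 2],
    [4, 5, 2, 5, 1],
]

alpha = cronbach_alpha(item_scores)
print(round(alpha, 2))  # values near 1 suggest the items tap one construct
```

Here the items rise and fall together across respondents, so alpha is high; items that measured unrelated notions would pull alpha down.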


Interrater Reliability


Interrater reliability is the extent to which two or more individuals (coders or raters) agree. Interrater reliability addresses the consistency of the implementation of a rating system.

A test of interrater reliability would be the following scenario: Two or more researchers are observing a high school classroom. The class is discussing a movie that they have just viewed as a group. The researchers have a sliding rating scale (1 being most positive, 5 being most negative) with which they are rating the students' oral responses. Interrater reliability assesses the consistency of how the rating system is implemented. For example, if one researcher gives a "1" to a student response while another researcher gives a "5," the interrater reliability would clearly be poor. Interrater reliability is dependent upon the ability of two or more individuals to be consistent. Training, education, and monitoring can enhance interrater reliability.
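One standard way to quantify this agreement is Cohen's kappa, which corrects raw agreement for chance. The two raters' 1-5 scores below are hypothetical:

```python
# A minimal sketch of Cohen's kappa for the classroom-rating scenario.
# The ratings below are hypothetical.

from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for chance agreement."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters pick the same category
    # if each rated independently at their own base rates.
    expected = sum(
        (counts_a[c] / n) * (counts_b[c] / n)
        for c in counts_a.keys() | counts_b.keys()
    )
    return (observed - expected) / (1 - expected)


ratings_a = [1, 2, 1, 3, 5, 2, 1, 4, 2, 1]
ratings_b = [1, 2, 2, 3, 5, 2, 1, 4, 3, 1]

kappa = cohen_kappa(ratings_a, ratings_b)
print(round(kappa, 2))  # 1.0 is perfect agreement; 0 is chance-level
```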


Validity
Validity refers to the degree to which a study accurately reflects or assesses the specific concept that the researcher is attempting to measure. While reliability is concerned with the accuracy of the actual measuring instrument or procedure, validity is concerned with the study's success at measuring what the researchers set out to measure.

Researchers should be concerned with both external and internal validity. External validity refers to the extent to which the results of a study are generalizable or transferable. (Most discussions of external validity focus solely on generalizability; see Campbell and Stanley, 1963. We include a reference here to transferability because many qualitative research studies are not designed to be generalized.)

Internal validity refers to (1) the rigor with which the study was conducted (e.g., the study's design, the care taken to conduct measurements, and decisions concerning what was and wasn't measured) and (2) the extent to which the designers of a study have taken into account alternative explanations for any causal relationships they explore (Huitt, 1998). In studies that do not explore causal relationships, only the first of these definitions should be considered when assessing internal validity.

Scholars discuss several types of internal validity. For brief discussions of several types of internal validity, click on the items below:

Face Validity
Criterion Related Validity
Construct Validity
Content Validity

Validity: Example


Many recreational activities of high school students involve driving cars. A researcher, wanting to measure whether recreational activities have a negative effect on grade point average in high school students, might conduct a survey asking how many students drive to school and then attempt to find a correlation between these two factors. Because many students might use their cars for purposes other than or in addition to recreation (e.g., driving to work after school, driving to school rather than walking or taking a bus), this research study might prove invalid. Even if a strong correlation was found between driving and grade point average, driving to school in and of itself would seem to be an invalid measure of recreational activity.


Face Validity


Face validity is concerned with how a measure or procedure appears. Does it seem like a reasonable way to gain the information the researchers are attempting to obtain? Does it seem well designed? Does it seem as though it will work reliably? Unlike content validity, face validity does not depend on established theories for support (Fink, 1995).


Criterion Related Validity


Criterion related validity, also referred to as instrumental validity, is used to demonstrate the accuracy of a measure or procedure by comparing it with another measure or procedure which has been demonstrated to be valid.

For example, imagine that a hands-on driving test has been shown to be an accurate test of driving skills. By comparing scores on a written driving test with scores from the hands-on test, the written test can be validated using a criterion related strategy: the established hands-on test serves as the criterion against which the written test is judged.


Construct Validity


Construct validity seeks agreement between a theoretical concept and a specific measuring device or procedure. For example, a researcher inventing a new IQ test might spend a great deal of time attempting to "define" intelligence in order to reach an acceptable level of construct validity.

Construct validity can be broken down into two sub-categories: convergent validity and discriminant validity. Convergent validity is the actual general agreement among ratings, gathered independently of one another, where measures should be theoretically related. Discriminant validity is the lack of a relationship among measures which theoretically should not be related.

To understand whether a piece of research has construct validity, three steps should be followed. First, the theoretical relationships must be specified. Second, the empirical relationships between the measures of the concepts must be examined. Third, the empirical evidence must be interpreted in terms of how it clarifies the construct validity of the particular measure being tested (Carmines & Zeller, 1991, p. 23).


Content Validity


Content validity is based on the extent to which a measurement reflects the specific intended domain of content (Carmines & Zeller, 1991, p. 20).

Content validity is illustrated using the following examples: Researchers aim to study mathematical learning and create a survey to test for mathematical skill. If these researchers only tested for multiplication and then drew conclusions from that survey, their study would not show content validity because it excludes other mathematical functions. Although the establishment of content validity for placement-type exams seems relatively straightforward, the process becomes more complex as it moves into the more abstract domain of socio-cultural studies. For example, a researcher needing to measure an attitude like self-esteem must decide what constitutes a relevant domain of content for that attitude. For socio-cultural studies, content validity forces the researchers to define the very domains they are attempting to study.


Commentary
The challenges of achieving reliability and validity are among the most difficult faced by researchers. In this section, we offer commentaries on these challenges.

Difficulties of Achieving Reliability
Comments on a Flawed, Yet Influential Study

Difficulties of Achieving Reliability


It is important to understand some of the problems concerning reliability which might arise. Ideally, we would reliably measure, every time, exactly those things which we intend to measure. In practice, however, researchers can go to great lengths to ensure accuracy in their studies and still face inherent difficulties in measuring particular events or behaviors. Sometimes, and particularly in studies of natural settings, the only measuring device available is the researcher's own observations of human interaction or human reaction to varying stimuli. As these methods are ultimately subjective in nature, results may be unreliable and multiple interpretations are possible. Three of these inherent difficulties are quixotic reliability, diachronic reliability, and synchronic reliability.

Quixotic reliability refers to the situation where a single manner of observation consistently, yet erroneously, yields the same result. It is often a problem when research appears to be going well. This consistency might seem to suggest that the experiment was demonstrating perfect stability reliability. This, however, would not be the case.

For example, if a measuring device used in an Olympic competition always read 100 meters for every discus throw, this would be an example of an instrument consistently, yet erroneously, yielding the same result. However, quixotic reliability is often more subtle in its occurrences than this. For example, suppose a group of German researchers doing an ethnographic study of American attitudes ask questions and record responses. Parts of their study might produce responses which seem reliable, yet turn out to measure felicitous verbal embellishments required for "correct" social behavior. Asking Americans, "How are you?" for example, would, in most cases, elicit the token, "Fine, thanks." However, this response would not accurately represent the mental or physical state of the respondents.

Diachronic reliability refers to the stability of observations over time. It is similar to stability reliability in that it deals with time. While this type of reliability is appropriate to assess features that remain relatively unchanged over time, such as landscape benchmarks or buildings, the same level of reliability is more difficult to achieve with socio-cultural phenomena.

For example, in a follow-up study one year later of reading comprehension in a specific group of school children, diachronic reliability would be hard to achieve. If the test were given to the same subjects a year later, many confounding variables would have impacted the researchers' ability to reproduce the same circumstances present at the first test. The final results would almost assuredly not reflect the degree of stability sought by the researchers.

Synchronic reliability refers to the similarity of observations within the same time frame; it is not about the similarity of things observed. Synchronic reliability, unlike diachronic reliability, rarely involves observations of identical things. Rather, it concerns itself with particularities of interest to the research.

For example, a researcher studies the action of a duck's wing in flight and the action of a hummingbird's wing in flight. Despite the fact that the researcher is studying two distinctly different kinds of wings, the action of the wings and the phenomenon produced are the same.


Comments on a Flawed, Yet Influential Study


An example of the dangers of generalizing from research that is inconsistent, invalid, unreliable, and incomplete is found in the Time magazine article, "On A Screen Near You: Cyberporn" (De Witt, 1995). This article relies on a study done at Carnegie Mellon University to determine the extent and implications of online pornography. Inherent to the study are methodological problems of unqualified hypotheses and conclusions, unsupported generalizations and a lack of peer review.

Ignoring the functional problems that manifest themselves later in the study, it seems that there are a number of ethical problems within the article. The article claims to be an exhaustive study of pornography on the Internet, but it was anything but exhaustive; it resembles a case study more than anything else. Marty Rimm, author of the undergraduate paper that Time used as a basis for the article, claims the paper was an "exhaustive study" of online pornography when, in fact, the study based most of its conclusions about pornography on the Internet on the "descriptions of slightly more than 4,000 images" (Meeks, 1995, p. 1). Some USENET groups see hundreds of postings in a day.

Considering the thousands of USENET groups, 4,000 images no longer carries the authoritative weight that its author intended. The real problem is that the study (an undergraduate paper similar to a second-semester composition assignment) was based not on pornographic images themselves, but on the descriptions of those images. This kind of reduction detracts significantly from the integrity of the final claims made by the author. In fact, this kind of research is comparable to doing a study of the content of pornographic movies based on the titles of the movies, then making sociological generalizations based on what those titles indicate. (This is obviously a problem for a number of types of validity, because Rimm is not studying what he thinks he is studying, but instead something quite different.)

The author of the Time article, Philip Elmer De Witt, writes, "The research team at CMU has undertaken the first systematic study of pornography on the Information Superhighway" (Godwin, 1995, p. 1). His statement is problematic in at least three ways. First, the research team actually consisted of a few of Rimm's undergraduate friends with no methodological training whatsoever. Additionally, no mention of the degree of interrater reliability is made. Second, this systematic study is actually merely a "non-randomly selected subset of commercial bulletin-board systems that focus on selling porn" (Godwin, p. 6). As pornography vending is actually just a small part of the whole concerning the use of pornography on the Internet, the entire premise of this study's content validity is firmly called into question. Finally, the use of the term "Information Superhighway" is a false assessment of what in actuality is only a few USENET groups and BBSs (bulletin board systems), which make up only a small fraction of the entire "Information Superhighway" traffic. Essentially, this is yet another violation of content validity.

De Witt is quoted as saying: "In an 18-month study, the team surveyed 917,410 sexually-explicit pictures, descriptions, short-stories and film clips. On those USENET newsgroups where digitized images are stored, 83.5 percent of the pictures were pornographic" (De Witt, p. 40).

Statistically, some interesting contradictions arise. The figure 917,410 was taken from adult-oriented BBSs--none came from actual USENET groups or the Internet itself. This is a glaring discrepancy. Out of the 917,410 files, 212,114 are only descriptions (Hoffman & Novak, 1995, p.2). The question is, how many actual images did the "researchers" see?

"Between April and July 1994, the research team downloaded all available images (3,254)...the team encountered technical difficulties with 13 percent of these images...This left a total of 2,830 images for analysis" (p. 2). This means that out of 917,410 files discussed in this study, 914,580 of them were not even pictures! As for the 83.5 percent figure, this is actually based on "17 alt.binaries groups that Rimm considered pornographic" (p. 2).

In real terms, 17 USENET groups is a fraction of a percent of all USENET groups available. Worse yet, Time claimed that "...only about 3 percent of all messages on the USENET [represent pornographic material], while the USENET itself represents 11.5 percent of the traffic on the Internet" (De Witt, p. 40).

Time neglected to carry the interpretation of this data out to its logical conclusion, which is that less than half of 1 percent (3 percent of 11 percent) of the images on the Internet are associated with newsgroups that contain pornographic imagery. Furthermore, of this half percent, an unknown but even smaller percentage of the messages in newsgroups that are 'associated with pornographic imagery', actually contained pornographic material (Hoffman & Novak, p. 3).
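The arithmetic behind these critiques is easy to verify, using only the figures quoted above:

```python
# Checking the arithmetic behind the critique, using the figures
# quoted from the study and from Hoffman & Novak.

total_files = 917_410      # files "surveyed" in the study
analyzed_images = 2_830    # images actually analyzed
non_images = total_files - analyzed_images
print(non_images)  # the number of files that were not analyzed images

usenet_share_of_internet = 0.115  # USENET as a share of Internet traffic
porn_share_of_usenet = 0.03       # pornographic share of USENET messages
porn_share_of_internet = usenet_share_of_internet * porn_share_of_usenet
print(porn_share_of_internet)  # well under half of one percent
```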

Another blunder can be seen in the avoidance of peer review, which suggests that there were some political interests being served in having the study become a Time cover story. Marty Rimm contracted the Georgetown Law Review and Time in an agreement to publish his study as long as they kept it under lock and key. During the months before publication, many interested scholars and professionals tried in vain to obtain a copy of the study in order to check it for flaws. De Witt justified not letting such peer review take place, and also justified the reliability and validity of the study, on the grounds that because the Georgetown Law Review had accepted it, it was therefore reliable and valid, and needed no peer review. What he didn't know was that law reviews are not edited by professionals, but by "third year law students" (Godwin, p. 4).

There are many consequences of the failure to subject such a study to the scrutiny of peer review. If it was Rimm's desire to publish an article about online pornography in a manner that legitimized his article, yet escaped the kind of critical review the piece would have to undergo if published in a scholarly journal of computer science, engineering, marketing, psychology, or communications, what better venue than a law journal? A law journal article would have the added advantage of being taken seriously by law professors, lawyers, and legally-trained policymakers. By virtue of where it appeared, it would automatically be catapulted into the center of the policy debate surrounding online censorship and freedom of speech (Godwin).

Herein lies the dangerous implication of such a study: Because the questions surrounding pornography are of such immediate political concern, the study was placed in the forefront of the U.S. domestic policy debate over censorship on the Internet (an integral aspect of current anti-First Amendment legislation) with little regard for its validity or reliability.

On June 26, the day the article came out, Senator Grassley (co-sponsor of the anti-porn bill, along with Senator Dole) began drafting a speech that was to be delivered that very day in the Senate, using the study as evidence. The same day, at the same time, Mike Godwin posted on the WELL (Whole Earth 'Lectronic Link, a forum for professionals on the Internet) what turned out to be the understatement of the year: "Philip's story is an utter disaster, and it will damage the debate about this issue because we will have to spend lots of time correcting misunderstandings that are directly attributable to the story" (Meeks, p. 7).

As Godwin was writing this, Senator Grassley was speaking to the Senate: "Mr. President, I want to repeat that: 83.5 percent of the 900,000 images reviewed--these are all on the Internet--are pornographic, according to the Carnegie-Mellon study" (p. 7). Several days later, Senator Dole was waving the magazine in front of the Senate like a battle flag.

Donna Hoffman, professor at Vanderbilt University, summed up the dangerous political implications by saying, "The critically important national debate over First Amendment rights and restrictions of information on the Internet and other emerging media requires facts and informed opinion, not hysteria" (p. 1).

In addition to the hysteria, Hoffman sees a plethora of other problems with the study. "Because the content analysis and classification scheme are 'black boxes,'" Hoffman said, "because no reliability and validity results are presented, because no statistical testing of the differences both within and among categories for different types of listings has been performed, and because not a single hypothesis has been tested, formally or otherwise, no conclusions should be drawn until the issues raised in this critique are resolved" (p. 4).

However, the damage has already been done. This questionable research by an undergraduate engineering major has been generalized to such an extent that even the U.S. Senate, and in particular Senators Grassley and Dole, have been duped, albeit through the strength of their own desires to see only what they wanted to see.


Annotated Bibliography
American Psychological Association. (1985). Standards for educational and psychological testing. Washington, DC: Author.

This work focuses on reliability, validity, and the standards that testers need to achieve in order to ensure accuracy.
Babbie, E.R. & Huitt, R.E. (1979). The practice of social research 2nd ed. Belmont, CA: Wadsworth Publishing.

An overview of social research and its applications.
Beauchamp, T. L., Faden, R.R., Wallace, Jr., R.J. & Walters, L. (1982). Ethical issues in social science research. Baltimore and London: The Johns Hopkins University Press.

A systematic overview of ethical issues in Social Science Research written by researchers with firsthand familiarity with the situations and problems researchers face in their work. This book raises several questions of how reliability and validity can be affected by ethics.
Borman, K.M. et al. (1986). Ethnographic and qualitative research design and why it doesn't work. American Behavioral Scientist, 30, 42-57.

The authors pose questions concerning threats to qualitative research and suggest solutions.
Bowen, K. A. (1996, Oct. 12). The sin of omission -punishable by death to internal validity: An argument for integration of quantitative research methods to strengthen internal validity. Available: http://trochim.human.cornell.edu/gallery/bowen/hss691.htm

An entire Web site that examines the merits of integrating qualitative and quantitative research methodologies through triangulation. The author argues that improving the internal validity of social science will be the result of such a union.
Brinberg, D. & McGrath, J.E. (1985). Validity and the research process. Beverly Hills: Sage Publications.

The authors investigate validity as value and propose the Validity Network Schema, a process by which researchers can infuse validity into their research.
Bussières, J-F. (1996, Oct.12). Reliability and validity of information provided by museum Web sites. Available: http://www.oise.on.ca/~jfbussieres/issue.html

This Web page examines the validity of museum Web sites, calling into question the validity of Web-based resources in general. It argues that all Web sites should be examined with skepticism about the validity of the information contained within them.
Campbell, D. T. & Stanley, J.C. (1963). Experimental and quasi-experimental designs for research. Boston: Houghton Mifflin.

An overview of experimental research that includes pre-experimental designs, controls for internal validity, and tables listing sources of invalidity in quasi-experimental designs. Reference list and examples.
Carmines, E. G. & Zeller, R.A. (1991). Reliability and validity assessment. Newbury Park: Sage Publications.

An introduction to research methodology that includes classical test theory, validity, and methods of assessing reliability.
Carroll, K. M. (1995). Methodological issues and problems in the assessment of substance use. Psychological Assessment, 7(3), 349-358.

Discusses methodological issues in research involving the assessment of substance abuse. Introduces strategies for avoiding problems with the reliability and validity of methods.
Connelly, F. M. & Clandinin, D.J. (1990). Stories of experience and narrative inquiry. Educational Researcher 19:5, 2-12.

A survey of narrative inquiry that outlines criteria, methods, and writing forms. It includes a discussion of risks and dangers in narrative studies, as well as a research agenda for curricula and classroom studies.
De Witt, P.E.. (1995, July 3). On a screen near you: Cyberporn. Time, 38-45.

The Time cover story reporting on a Carnegie Mellon study of online pornography conducted by Marty Rimm, an undergraduate electrical engineering student.
Fink, A., ed. (1995). The survey handbook, v. 1. Thousand Oaks, CA: Sage.

A guide to surveys; this is the first volume in a series referred to as the "survey kit." It includes bibliographical references and addresses survey design, analysis, reporting surveys, and how to measure the validity and reliability of surveys.
Fink, A., ed. (1995). How to measure survey reliability and validity v. 7. Thousand Oaks, CA: Sage.

This volume seeks to select and apply reliability criteria and select and apply validity criteria. The fundamental principles of scaling and scoring are considered.
Godwin, M. (1995, July). JournoPorn, dissection of the Time article. Available: http://www.hotwired.com

A detailed critique of Time magazine's Cyberporn, outlining flaws of methodology as well as exploring the underlying assumptions of the article.
Hambleton, R.K. & Zaal, J.N., eds. (1991). Advances in educational and psychological testing. Boston: Kluwer Academic.

Information on the concepts of reliability and validity in psychology and education.
Harnish, D.L. (1992). Human judgment and the logic of evidence: A critical examination of research methods in special education transition literature. In D.L. Harnish et al. eds., Selected readings in transition.

This article investigates threats to validity in special education research.
Haynes, N. M. (1995). How skewed is 'the bell curve'? Book Product Reviews. 1-24.

This paper claims that R.J. Herrnstein and C. Murray's The Bell Curve: Intelligence and Class Structure in American Life does not have scientific merit and claims that the bell curve is an unreliable measure of intelligence.
Healey, J. F. (1993). Statistics: A tool for social research, 3rd ed. Belmont: Wadsworth Publishing.

Inferential statistics, measures of association, and multivariate techniques in statistical analysis for social scientists are addressed.
Helberg, C. (1996, Oct. 12). Pitfalls of data analysis (or how to avoid lies and damned lies). Available: http://maddog/fammed.wisc.edu/pitfalls/

A discussion of things researchers often overlook in their data analysis and how statistics are often used to skew reliability and validity for the researcher's purposes.
Hoffman, D. L. and Novak, T.P. (1995, July). A detailed critique of the Time article: Cyberporn. Available: http://www.hotwired.com

A methodological critique of the Time article that uncovers some of the fundamental flaws in the statistics and the conclusions made by De Witt.
Huitt, W. G. (1998). Internal and external validity. Available: http://www.valdosta.peachnet.edu/~whuitt/psy702/intro/valdgn.html

A Web document addressing key issues of external and internal validity.
Jones, J. E. & Bearley, W.L. (1996, Oct 12). Reliability and validity of training instruments. Organizational Universe Systems. Available: http://ous.usa.net/relval.htm

The authors discuss the reliability and validity of training design in a business setting. Basic terms are defined and examples provided.
Cultural Anthropology Methods Journal. (1996, Oct. 12). Available: http://www.lawrence.edu/~bradleyc/cam.html

An online journal containing articles on the practical application of research methods when conducting qualitative and quantitative research. Reliability and validity are addressed throughout.
Kirk, J. & Miller, M. M. (1986). Reliability and validity in qualitative research. Beverly Hills: Sage Publications.

This text describes objectivity in qualitative research by focusing on the issues of validity and reliability in terms of their limitations and applicability in the social and natural sciences.
Krakower, J. & Niwa, S. (1985). An assessment of validity and reliability of the institutional performance survey. Boulder, CO: National Center for Higher Education Management Systems.

Addresses educational surveys, higher education research, and the effectiveness of organizations.
Lauer, J. M. & Asher, J.W. (1988). Composition Research. New York: Oxford University Press.

A discussion of empirical designs in the context of composition research as a whole.
Laurent, J. et al. (1992, Mar.). Review of validity research on the Stanford-Binet Intelligence Scale: Fourth Edition. Psychological Assessment. 102-112.

This paper looks at the results of construct and criterion-related validity studies to determine if the SB:FE is a valid measure of intelligence.
LeCompte, M. D., Millroy, W.L., & Preissle, J. eds. (1992). The handbook of qualitative research in education. San Diego: Academic Press.

A compilation of the range of methodological and theoretical qualitative inquiry in the human sciences and education research. Numerous contributing authors apply their expertise to discussing a wide variety of issues pertaining to educational and humanities research as well as suggestions about how to deal with problems when conducting research.
McDowell, I. & Newell, C. (1987). Measuring health: A guide to rating scales and questionnaires. New York: Oxford University Press.

This gives a variety of examples of health measurement techniques and scales and discusses the validity and reliability of important health measures.
Meeks, B. (1995, July). Muckraker: How Time failed. Available: http://www.hotwired.com

A step-by-step outline of the events which took place during the researching, writing, and negotiating of the Time article of 3 July, 1995 titled: On A Screen Near You: Cyberporn.
Merriam, S. B. (1995). What can you tell from an N of 1?: Issues of validity and reliability in qualitative research. Journal of Lifelong Learning, v4, 51-60.

Addresses issues of validity and reliability in qualitative research for education. Discusses philosophical assumptions underlying the concepts of internal validity, reliability, and external validity or generalizability. Presents strategies for ensuring rigor and trustworthiness when conducting qualitative research.
Morris, L. L., Fitzgibbon, C. T., & Lindheim, E. (1987). How to measure performance and use tests. In J. L. Herman (Ed.), Program evaluation kit (2nd ed.). Newbury Park, CA: Sage.

Discussion of reliability and validity as they pertain to measuring students' performance.
Murray, S., et al. (1979, April). Technical issues as threats to internal validity of experimental and quasi-experimental designs. San Francisco: University of California. 8-12.

(From Yang et al. bibliography--unavailable as of this writing.)
Russ-Eft, D. F. (1980). Validity and reliability in survey research. American Institutes for Research in the Behavioral Sciences, August, 227 151.

An investigation of validity and reliability in survey research, with an overview of the concepts of reliability and validity. Specific procedures for measuring sources of error are suggested, as well as general suggestions for improving the reliability and validity of survey data. An extensive annotated bibliography is provided.
Ryser, G. R. (1994). Developing reliable and valid authentic assessments for the classroom: Is it possible? Journal of Secondary Gifted Education Fall, v6 n1, 62-66.

Defines the meanings of reliability and validity as they apply to standardized measures of classroom assessment. The article defines reliability as scorability and stability, and validity as students' ability to use knowledge authentically in the field.
Schmidt, W., et al. (1982). Validity as a variable: Can the same certification test be valid for all students? Institute for Research on Teaching, July, ED 227 151.

A technical report that presents specific criteria for judging content, instructional and curricular validity as related to certification tests in education.
Scholfield, P. (1995). Quantifying language. A researcher's and teacher's guide to gathering language data and reducing it to figures. Bristol: Multilingual Matters.

A guide to categorizing, measuring, testing, and assessing aspects of language. A source for language-related practitioners and researchers, to be used in conjunction with other resources on research methods and statistics. Questions of reliability and validity are also explored.
Scriven, M. (1993). Hard-Won Lessons in Program Evaluation. San Francisco: Jossey-Bass Publishers.

A common sense approach for evaluating the validity of various educational programs and how to address specific issues facing evaluators.
Shou, P. (1993, Jan.). The Singer-Loomis Inventory of Personality: A review and critique. [Paper presented at the Annual Meeting of the Southwest Educational Research Association.]

Evidence for reliability and validity is reviewed. A summary evaluation suggests that the SLIP (developed by two Jungian analysts to allow examination of personality from the perspective of Jung's typology) appears to be a useful tool for educators and counselors.
Sutton, L.R. (1992). Community college teacher evaluation instrument: A reliability and validity study. Diss. Colorado State University.

Studies of reliability and validity in occupational and educational research.
Thompson, B. & Daniel, L. G. (1996, Oct.). Seminal readings on reliability and validity: A "hit parade" bibliography. Educational and Psychological Measurement, v. 56, 741-745.

Editorial board members of Educational and Psychological Measurement generated this bibliography of definitive publications in measurement research. Many articles are directly related to reliability and validity.
Thompson, E. Y., et al. (1995). Overview of qualitative research. Diss. Colorado State University.

A discussion of strengths and weaknesses of qualitative research and its evolution and adaptation. Appendices and annotated bibliography.
Traver, C. et al. (1995). Case Study. Diss. Colorado State University.

This presentation gives an overview of case study research, providing definitions and a brief history and explanation of how to design research.
Trochim, William M. K. (1996). External validity. Available: http://trochim.human.cornell.edu/kb/EXTERVAL.htm

A comprehensive treatment of external validity found in William Trochim's online text about research methods and issues.
Trochim, William M. K. (1996). Introduction to validity. Available: http://trochim.human.cornell.edu/kb/INTROVAL.htm

An introduction to validity found in William Trochim's online text about research methods and issues.
Trochim, William M. K. (1996). Reliability. Available: http://trochim.human.cornell.edu/kb/reltypes.htm

A comprehensive treatment of reliability found in William Trochim's online text about research methods and issues.
Validity. (1996, Oct. 12). Available: http://vislab-www.nps.navy.mil/~haga/validity.html

A source for definitions of various forms and types of reliability and validity.
Vinsonhaler, J. F., et al. (1983, July). Improving diagnostic reliability in reading through training. Institute for Research on Teaching, ED 237 934.

This technical report investigates the practical application of a program intended to improve the diagnoses of reading deficient students. Here, reliability is assumed and a pragmatic answer to a specific educational problem is suggested as a result.
Wentland, E. J. & Smith, K.W. (1993). Survey responses: An evaluation of their validity. San Diego: Academic Press.

This book looks at the factors affecting response validity (or the accuracy of self-reports in surveys) and provides several examples with varying accuracy levels.
Wiget, A. (1996). Father Juan Greyrobe: Reconstructing tradition histories, and the reliability and validity of uncorroborated oral tradition. Ethnohistory 43:3, 459-482.

This paper presents a convincing argument for the validity of oral histories in ethnographic research where at least some of the evidence can be corroborated through written records.
Yang, G. H., et al. (1995). Experimental and quasi-experimental educational research. Diss. Colorado State University.

This discussion defines experimentation and considers the rhetorical issues and advantages and disadvantages of experimental research. Annotated bibliography.
Yarroch, W. L. (1991, Sept.). The implications of content versus item validity on science tests. Journal of Research in Science Teaching, 619-629.

The use of content validity as the primary assurance of the measurement accuracy for science assessment examinations is questioned. An alternative accuracy measure, item validity, is proposed to look at qualitative comparisons between different factors.
Yin, R. K. (1989). Case study research: Design and methods. London: Sage Publications.

This book discusses the design process of case study research, including collection of evidence, composing the case study report, and designing single and multiple case studies.



Reliability and Validity: Links
Introduction to Validity.
William Trochim's introduction to validity in his comprehensive online textbook about research methods and issues.
http://trochim.human.cornell.edu/kb/introval.htm

Reliability.
Trochim's overview of reliability.
http://trochim.human.cornell.edu/kb/reliable.htm

External Validity.
William Trochim's discussion of external validity.
http://trochim.human.cornell.edu/kb/EXTERVAL.htm

Educational Psychology Interactive: Internal and External Validity.
A Web document addressing key issues of external and internal validity.
http://www.valdosta.peachnet.edu/~whuitt/psy702/intro/valdgn.html

Cultural Anthropology Methods Journal.
An online journal containing articles on the practical application of research methods when conducting qualitative and quantitative research. Reliability and validity are addressed throughout.
http://www.lawrence.edu/~bradleyc/cam.html

The sin of omission - punishable by death to internal validity: An argument for integration of quantitative research methods to strengthen internal validity.
This site examines the merits of integrating qualitative and quantitative research methodologies through triangulation. The author argues that improving the internal validity of social science will be the result of such a union.
http://trochim.human.cornell.edu/gallery/bowen/hss691.htm

Reliability and validity of information provided by museum Web sites.
This Web page examines the validity of museum Web sites, calling into question the validity of Web-based resources in general. It argues that all Web sites should be examined with skepticism about the validity of the information they contain.
http://www.oise.on.ca/~jfbussieres/issue.html

Internal Validity Tutorial.
An interactive tutorial on internal validity.
http://server.bmod.athabascau.ca/html/Validity/index.shtml

Sunday, April 5, 2009

Annotated Bibliography

Alvermann, D., O'Brien, D., & Dillon, D. (1996). On writing qualitative research. Reading Research Quarterly, 31(1), 114-120.

This article presents a "conversation" among the authors about issues in writing qualitative research reports. They address potential problems researchers may face when reporting their findings and discuss how theory and methodology shape qualitative research write-ups.

Anderson, G. L. (1994). The cultural politics of qualitative research in education: Confirming and contesting the canon. Educational Theory, 44, 225-237.

This article looks at different approaches to qualitative field research. It is also a critical review of the Handbook of qualitative research in education.

Andreas, D. (1992). Ethnography of Biography: Student Teachers Reflecting on 'Life-Stories' of Experienced Teachers. Paper presented at the Annual Meeting of the American Educational Research Association (San Francisco, CA, April 20-24).

Explores the use of ethnographic biography as a source of information and reflection for student teachers.

Balester, V. M. (1993). Cultural divide: A study of African-American college-level writers. Portsmouth, NH: Boynton/Cook.

This book is based on research Balester conducted on the spoken and written texts of African-American students. For her study, Balester did case studies of eight African-American students, looking specifically at the students' attitudes toward their own language and the language of academia.

Banning, J. (1995, Sept. 19). Qualitative research. Personal interview with professor at Colorado State University, Fort Collins.

Dr. Banning, a professor in the School of Education at Colorado State University, discusses in detail the workshop he and colleague Jeff Gliner conducted on qualitative research.

Bishop, W. (1992). I-Witnessing in Composition: Turning Ethnographic Data into Narratives. Rhetoric Review; v11 n1 p147-58 Fall.

Discusses problems with reconciling ethnographic research with positivistic methods.

Blair, K. (1995). Ethnography and the Internet: Research into Electronic Discourse Communities. Paper presented at the Annual Meeting of the Conference on College Composition and Communication (46th, Washington, DC, March 23-25, 1995).

Discusses the advantages of electronic ethnography.

Borman, K. M. (1986). Ethnographic and qualitative research design and why it doesn't work. American Behavioral Scientist, 30, 43-57.

Borman identifies the characteristics of qualitative research and its weaknesses, then offers solutions.

Brophy, J. (Nov. 1995). Thoughts on the qualitative quantitative debate. Chicago, IL: National Council for the Social Studies. (ERIC Document Reproduction Service No. 392 734)

The focus of this paper is on the goals of both qualitative and quantitative research and on developing effective studies for the classroom. Brophy asserts that qualitative and quantitative methods are simply "tools" and should be evaluated from the standpoint of what questions they can answer best.

Bruyn, S. T. (1970). The new empiricists: The participant observer and phenomenologist. In W. J. Filstead (Ed.), Qualitative methodology: Firsthand involvement with the social world. Chicago: Markham, 283-287.

This article discusses the importance of phenomenology to qualitative research.

Bullock, R. (1995). Classroom Research in Graduate Methods Courses. Paper presented at the Annual Meeting of the Conference on College Composition and Communication (46th, Washington, DC, March 23-25, 1995).

Examines first year graduate student-teachers and why they are distrustful of narrative or ethnographic research as opposed to empirical research.

Burroughs-Lange, S. G., & Lange, J. (1993). Denuded data! Grounded theory using the NUDIST computer analysis program: In researching the challenge to teacher self-efficacy posed by students with learning disabilities in Australian education. Paper presented at the annual meeting of the American Educational Research Association, Atlanta, GA. (ERIC Document Reproduction Service No. ED 364 193)

The authors evaluate the use of the NUDIST (Non-numerical, Unstructured Data Indexing, Searching and Theorising) computer program to organize coded, qualitative data. NUDIST was used in the authors' study to develop a theoretical understanding of the challenge that students with learning disabilities pose to neophyte teachers' newly formed images of effectiveness.

Collier, J., & Collier, M. (1986). Visual anthropology: Photography as a research method. Albuquerque: University of New Mexico Press.

This work discusses the benefits and possibilities of including photography in anthropological and ethnographic research. The book includes sections on the role of the photographer in documenting a culture or group, how photographs function in the interviewing process, analyzing images, and the psychological significance of photography and visual images in conveying meaning.

Connelly, F. M., & Clandinin, D. J. (1990). Stories of experience and narrative inquiry. Educational Researcher, 19 (5), 2-14.

This article is a theoretical work on conducting narrative inquiry that focuses on the issues of transferability and generalizability in this field of research.

Coulon, A. (1995). Ethnomethodology (J. Coulon & J. Katz, Trans.). London: Sage.

This text covers the history and issues related to ethnomethodology.

Cross, G. (1994). Ethnographic Research in Business and Technical Writing: Between Extremes and Margins. Journal of Business and Technical Communication; v8 n1 p118-34 Jan.

Explores the phenomenal context, the site's cultural context, the research community context, and the researcher's interior context in business and technical writing.

Doheny-Farina, S. (1986). Writing in an emerging organization: An ethnographic study. Written Communication, 3, 158-85.

This article, gleaned from the author's doctoral dissertation, discusses his study of collaborative writing among executives at a new software firm. His methods included participant-observations, open-ended interviews, and Discourse-Based interviews.

Doheny-Farina, S. & Odell, L. (1985). Ethnographic research on writing: Assumptions and methodology. In L. Odell & D. Goswami (Eds.), Writing in nonacademic settings. New York: Guilford, 503-535.

With a caution that researchers in English need to understand ethnography's basis in anthropology, this article outlines theoretical assumptions, methodologies, and the uses and limitations of ethnographic research.

Dyson, A. Haas. (1984). Learning to write/learning to do school: Emergent writers' interpretations of school literacy tasks. Research in the Teaching of English, 18, 233-264.

This article reports on an ethnographic study of kindergarten children that examined the relationship between their learning to write and their adapting to the culture of school. Data were collected several times per week over a fourteen-week period. The researcher was a participant-observer who selected three case study children during the first phase of observation and studied them in context.

Ember, C. R., & Ember, M. (1973). Anthropology. New York: Appleton, Century, Crofts.



Fetterman, D. M. (1989). Ethnography: Step by step. Newbury Park, CA: Sage.

As the title suggests, this is a how-to book on ethnographies and ethnographic research. The book answers the question "What is ethnographic research?" and outlines a step-by-step approach to conducting this type of research. Chapter subjects include methods and techniques of ethnographic fieldwork, equipment needed for ethnographic research, how to analyze your findings, the writing process, and ethics in ethnographic research.

Fielding, N. G., & Lee, R. M. (Ed.). (1991). Using computers in qualitative research. London: Sage.

This anthology contains 11 essays on computers and qualitative research. The topics include general information about types of qualitative research and software, implications for research, and qualitative knowledge and computing. This text provides valuable information on both the positive and negative aspects of using computers for qualitative research.

Filstead, W. J. (Ed.). (1970). Qualitative methodology: Firsthand involvement with the social world. Chicago: Markham.

This text is a collection of essays on qualitative methodologies.

Firestone, W. A. & Dawson, J. A. (June 1981). To ethnograph or not to ethnograph? Varieties of qualitative research in education. Philadelphia, PA: Research for Better Schools, Inc. (ERIC Document Reproduction Service No. ED 222 985)

This paper addresses the advantages and disadvantages of using ethnographic studies and outlines six criteria for successfully using ethnographies in education studies. The authors also discuss five ways in which qualitative approaches can vary in terms of data collection.

Fitch, K. (1994). Criteria for Evidence in Qualitative Research. Western Journal of Communication; v58 n1 p32-38 Win.

Examines the contributions and limitations of conversation analysis and postmodernism for the enterprise of ethnographic research. Proposes criteria both for qualitative data to serve as evidence for claims about social life and for a qualitative study to count as evidence.

Flake, C. (1992). Ethnography for Teacher Education: An Innovative Elementary School Social Studies Program in South Carolina. Social Studies; v83 n6 p253-57 Nov-Dec 1992.

Describes a teacher education program that utilizes an internship that includes an ethnographic research project. Explains that the teacher intern is required to conduct an in-depth analysis of the social studies being taught in their school as contrasted to that described in their textbooks. Includes resulting suggestions for improvement in the curriculum.

Gilbert, R. (1992). Text and context in qualitative educational research: Discourse analysis and the problem of contextual explanation. Linguistics and Education, 4, 37-57.

This article discusses methods of improving qualitative research in education.

Gilmore, D.D. (1991, Fall). Subjectivity and subjugation: Fieldwork in the stratified community. Human Organization, 215.

This article outlines an anthropologist's efforts to maintain scholarly neutrality in an agricultural town in Franco's Spain, where class conflict was severe.

Greenberg, J. H. (1954). A quantitative approach to the morphological typology of language. In R. F. Spencer (Ed.). Method and perspective in anthropology. Minneapolis, MN: University of Minnesota Press.

The author compares and contrasts typological methods of languages against the genetic-historical method.

Hammersley, M., & Atkinson, P. (1983). Ethnography: Principles in practice. London: Tavistock.

This work deals with what ethnographic research is, what its strengths and weaknesses are, and how to go about conducting the research for your own project.

Hammersley, M. (1990). Reading ethnographic research: A critical guide. New York: Longman.

This book is a how-to manual on ethnographic research emphasizing understanding within unspoken contexts.

Hasselkus, B. R. (1995). Beyond ethnography: Expanding our understanding and criteria for qualitative research. Occupational Therapy Journal of Research, 15, 75-84.

Hasselkus discusses the different methods of qualitative research.

Hathaway, R. (1995). Assumptions underlying quantitative and qualitative research: Implications for institutional research. Research in Higher Education, 36 (5), 535-562.

Hathaway argues that the choice between qualitative and quantitative approaches is less about methodology and more about aligning oneself with particular theoretical and academic traditions. He concludes that the two approaches address questions in very different ways, each having its own advantages and drawbacks.

Heath, S. B. (1983). Ways with words: Language, life, and work in communities and classrooms. New York: Cambridge University Press.

Heath studies two communities, one Black and one White, to analyze the citizens' language development.

Heath, S. B. (1993). The Madness(es) of Reading and Writing Ethnography. Anthropology and Education Quarterly; v24 n3 p256-68 Sep.

Describes how reactions to her earlier ethnography led the author to see things in the work that she had not seen before. The strengths and weaknesses of the book she identifies have implications for the conduct of future ethnographic research.

Hinsley, C. M. (1981). Savages and scientists: The Smithsonian Institution and the development of American anthropology. Washington, D.C.: Smithsonian Institution Press.

Hornberger, N. (1995). Ethnography in Linguistic Perspective: Understanding School Processes. Language and Education; v9 n4 p233-48.

Examines the perspectives and methodologies that sociolinguistics brings to ethnographic research in schools, including methodological contributions arising from linguistics that interactional sociolinguistics and microethnography share, such as the use of naturally occurring language data, the consultation of native intuition, and discourse analysis.

This short Web site briefly describes qualitative research and gives an example of how it can be used to supplement quantitative studies in health care.

Journal of Contemporary Ethnography (formerly Urban Life). Newbury Park, CA: Sage.

This is a quarterly publication containing recent ethnographic studies and what's new in ethnography. It is a good source of information on, and examples of, how other researchers are conducting their own ethnographic studies.

Kamil, M. L., Langer, J. A., & Shanahan, T. (1985). Ethnographic methodologies. Understanding research in reading and writing. Boston: Allyn and Bacon, 71-91.

The chapter defines ethnographic research, examines its theoretical underpinnings, and contrasts it with experimental research. It includes an extended example from Heath's "Questioning at Home and at School: A Comparative Study."

Kirk, J. & Miller, M. (1986). Reliability and validity in qualitative research. Beverly Hills, CA: Sage.

This book investigates how reliability and validity in qualitative research help to evaluate the objectivity of particular studies. The authors assert that, given the true meaning of validity, many studies, including "scientific" ones, are not really valid. Also included are guidelines for maintaining reliability in qualitative studies.

Lancy, D. E. (1993). Qualitative research in education. White Plains, NY: Longman.

This text explores the many issues of qualitative research.

Lauer, J. M., & Asher, J. W. (1988). Ethnographies. Composition research: Empirical designs. New York: Oxford University Press, 39-53.

This chapter provides an overview of ethnographic research applied to English. It includes examples from two studies, Florio and Clark's "The function of writing in an elementary classroom" and Lemke and Bridwell's "Assessing writing ability--an ethnographic study of consultant-teacher relationships."

Lawless, E.J. (1992, Summer). I was afraid someone like you...an outsider...would misunderstand: Negotiating interpretive differences between ethnographers and subjects. Journal of American Folklore, 302.

This article looks at the role of the ethnographer in the collection of field research and writing. A new approach called "reciprocal ethnography" allows for interaction with the ethnographer.

Lazarsfeld, P. F. (1972). Qualitative analysis: Historical and critical essays. Boston: Allyn and Bacon.

This text deals with the issues of qualitative research.

LeCompte, M. D., Millroy, W. L., & Preissle, J. (Ed.). (1992). The handbook of qualitative research in education. San Diego: Academic Press.

This anthology contains 18 essays on qualitative research in education. The topics range from the future of qualitative research to issues of validity and subjectivity in qualitative research. This text is a good source for those interested in current theories about and research on qualitative research itself.

van Lier, L. (1988). The classroom and the language learner. New York: Longman.

The author argues for collecting and interpreting classroom (L2 learning) data even in the presence of only limited knowledge of the process of teaching and learning in second language classrooms. The book sets out to define the problems of classroom research within second language acquisition study and within social science, and it offers a well-documented guide for conducting research in the context of the classroom.

Lincoln, Y.S., & Guba, E.G. (1985). Naturalistic inquiry. Beverly Hills, CA: Sage.

This text outlines the positivist and naturalist research paradigms.

Linstead, S. (1993, Jan.). From postmodern anthropology to deconstructive ethnography. Human Relations, 97.

This article studies the effects of ethnography and postmodern influences on organizations. Derridian deconstruction theory is applied in order to get a new angle on social interactions within organizations.

Manwar, A., Johnson, B. D., & Dunlap, E. (1994). Qualitative data analysis with hypertext: A case of New York City crack dealers. Qualitative Sociology, 17, 283-292.

The authors describe some of the problems of data management and analysis faced by a team of ethnographers researching cocaine and crack distribution in New York City. The researchers used FolioVIEWS, a hypertext software program, which proved to be more effective than other available programs in solving management and analytical problems.

Marshall, C. & Rossman, G. (1995). Designing qualitative research. (2nd ed.). Thousand Oaks, CA: Sage.

This book explains different types of qualitative studies and provides thorough instruction on how to design, conduct and evaluate a qualitative study. It also includes helpful information on managing time, personnel and financial resources for qualitative research.

Miles, M. B., & Huberman, A. M. (1994). Qualitative data analysis. Thousand Oaks, CA: Sage.

This text covers data analysis issues related to qualitative research.

Minnich, R. G. (Ed.). (1987). Aspects of Polish folk culture. Bergen, Norway: Department of Social Anthropology, University of Bergen.

This text is a good source of examples for work done in the field of ethnography dealing with culture and literature. The work is a compilation of ethnographic studies by different authors done on topics ranging from the role played by gifts in Polish weddings to the role of art in Polish society. Through the reports included in this work, Minnich draws a clearer picture of Polish folk culture.

Minnis, J. R. (1985). Ethnography, case study, grounded theory, and distance education research. Distance Education, 6, 189-198.

Minnis explores the possibility of expanding the research base through the use of accepted qualitative methodologies.

Moores, S. (1993). Interpreting audiences. Thousand Oaks, CA: Sage.

This text characterizes features of ethnography as a method of cultural investigation. It provides a discussion of the opposing, alternative perspectives on various forms of media reception and how ethnographic practice best equips researchers to map the media's varied uses and meanings for particular social subjects in particular cultural contexts.

Mortensen, P. & Kirsch, G. (Eds.). (1996). Ethics and Representation in Qualitative Studies of Literacy. Urbana, IL: ERIC.

Fourteen essays address questions faced by qualitative researchers today: how to represent others and themselves in research narratives; how to address ethical dilemmas in research-participant relations; and how to deal with various rhetorical, institutional, and historical constraints on research.

Narayan, K. (1989). Storytellers, saints, scoundrels: Folk narrative in Hindu religious teaching. Philadelphia: University of Pennsylvania Press.

The author relates Hindu stories and their significance to education, both moral and religious.

Newkirk, T. (1991). The politics of composition research: The conspiracy against experience. In R. Bullock & J. Trimbur (Eds.), The politics of writing instruction: Postsecondary. Portsmouth, NH: Heinemann, 119-135.

The author argues for the importance of ethnographic research in English education from a political perspective. He cites its key strengths over experimental research--particularity, involvement of the researcher, underlying ideology--the very characteristics which experimentalists criticize. Newkirk asserts that ethnographic research empowers practitioners.

Patton, M. Q. (1992). Ethnography and research: A qualitative view. Topics in Language Disorders, 12, 1-14.

This article describes the functions of ethnography in the fields of education and communication disorders.

Patton, M. Q. (1980). Qualitative evaluation methods. Beverly Hills: Sage.

This book is an in depth study of qualitative research from conceptual issues to data analysis.

Rice-Lively. (1994). Wired Warp and Woof: An Ethnographic Study of a Networking Class. Internet Research; v4 n4 p20-35 Win.

Describes an ethnographic study of the electronic community composed of master's and doctoral students involved in a seminar on networking. Ethnographic research facilitated observation and description of the networked learning community. The exploration of the cultural meaning of class events led to an enhanced understanding of online education and the applicability of ethnographic research.



Richards, L., & Richards, T. (1993). Qualitative computing: Promises, problems, and implications for research process. Qualitative Data Analysis Resources home page. [On-line]. Available: http://www.qsr.com.au/ftp/papers/qualprobs.txt

Based on their experience with qualitative research software, the authors examine both the positive and negative aspects of this technology.

Rosen, M. (1991, Jan.). Coming to terms with the field: Understanding and doing organizational ethnography. Journal of Management Studies, 1.

Ethnography is not well understood or applied as a methodology for studying organization culture. This article highlights problems and offers tools for effective research in this arena.

Sanday, P. R. (1979). The ethnographic paradigm(s). Administrative Science Quarterly, 24, 527-538.

Three styles of ethnography are examined: holistic, semiotic, and behavioristic.

Saville-Troike, M. (1989). The ethnography of communication (2nd ed.). Oxford: Basil Blackwell.

This text is a synthesis of the field of ethnography of communication, which studies the norms of communicative conduct in different communities and deals with methods for studying these norms.

Schmid, T. (1992). Classroom-based ethnography: A research pedagogy. Teaching Sociology, 20(1), 28-35.

Discusses difficulties of classroom-based research and obstacles to conducting classroom-based ethnographic research. Identifies temporal obstacles, personnel, safety, and traditional classroom orientation. Suggests experiential approaches for fieldwork instructors such as individual projects, a choice of group projects, or a single designated class project. Describes a cooperative project on homelessness.

Shanahan, T. (Ed.). Teachers thinking, teachers knowing: Reflections on literacy and language education. Urbana, IL: ERIC.

Thirteen essays share the insights of leading scholars and teacher-researchers regarding the re-emergence of teacher education as a central focus in the field of English education. Discusses methods of supporting teacher development such as the study of cases, teacher groups, ethnographic research in the classroom and community, and teacher lore.

Smith, G. W. (1990, Nov.). Political activist as ethnographer. Social Problems, 629.

Two studies that use Dorothy E. Smith's reflexive materialist method of sociology are presented; the studies examine the social organization of ruling regimes with an aim toward changing them.

Snyder, I. (1995). Multiple perspectives in literacy research: Integrating the quantitative and qualitative. Language and Education, 9 (1), 45-59.

This article explains a study in which the author employed quantitative and qualitative methods simultaneously to compare computer composition classrooms and traditional classrooms. Although there were some problems with integrating both approaches, Snyder says they can be used together if researchers plan carefully and use their methods thoughtfully.

Tallerico, M. (1992). Computer technology for qualitative research: Hope and humbug. Journal of Educational Administration, 30 (2), 32-40.

The author describes how computer technology offers new options for the qualitative researcher in education. Tallerico also identifies both the potential benefits and limitations of research software, drawing on a study of local educational governance. She also describes the ETHNOGRAPH, a data analysis program.

Tesch, R. (1991). Software for qualitative researchers: Analysis needs and program capabilities. In N. G. Fielding & R. M. Lee (Ed.), Using computers in qualitative research. London: Sage, 16-37.

Tesch begins by explaining the different types of qualitative research. She goes on to define the general categories of computer software available to qualitative researchers and gives advice on what functions and features to look for when choosing software.

Thornton, S., & Garrett, K. (1995). Ethnography as a bridge to multicultural practice. Journal of Social Work Education, 31(1), 67-74.

Describes how the ethnographic research method can be taught as a way of studying different cultural groups in a social work curriculum.

Turner, E. (1992). Experiencing ritual: A new interpretation of African healing. Philadelphia: University of Pennsylvania Press.

This anthropological text reports the story of a "visible spirit" from among the Ndembu of Zambia. The work gives an account of the ethnographer's experience living with the Ndembu and attempting to parallel Ndembu life.

Van Maanen, J. (1979). The fact of fiction in organizational ethnography. Administrative Science Quarterly, 24, 539-550.

Van Maanen discusses the need to distinguish whether the point of view reported is that of informant or of researcher.

Van Maanen, J. (1988). Tales of the field. Chicago: The University of Chicago Press.

In this book, the author provides an informal introduction to ethnography addressed to fieldworkers of sociology or anthropology.

Weitzman, E. A., & Miles, M. B. (1995). Computer programs for qualitative data analysis. Thousand Oaks, CA: Sage.

Weitzman and Miles discuss the different functions of qualitative research software. They also categorize the software currently available and explain and review each program. This text provides valuable information for any researcher who is choosing software for qualitative research.

Wu, R. (1994). Writing In and Writing Out: Some Reflections on the Researcher's Dual Role in Ethnographic Research. Paper presented at the Annual Penn State Conference on Rhetoric and Composition (University Park, PA, July 13-16).

Proposes "a more fluid, process-oriented definition of the ethnographer's role based on feminist standpoint theories to acknowledge the complexity of multicultural observers and observed."

Zaharlick, A. (1992). Ethnography in anthropology and its value for education. Theory into Practice, 31, 116-125.

This article examines the role of ethnography in anthropology.

Related Links

The following is a list of Internet links that are related to the field of qualitative observational research methods.

Books from SAGE publications:

http://www.sagepub.co.uk/books/details/b003409.html

The Association of Qualitative Research Practitioners

http://www.aqrp.co.uk/

Nova Southeastern University’s School of Social and Systematic Studies (go to their Homepage and do a search on Qualitative Research)

http://www.nova.edu/

General references on qualitative research

http://www.misq.org/misqd961/isworld/general.htm

Advanced graduate study and research at Lesley College

http://www.lesley.edu/courses/eagsr.html

ISWorld Net page for research and scholarship

http://www.umich.edu/~isworld/reshome.html

Glossary of Key Terms

Accuracy
A term used in survey research to refer to the match between the target population and the sample.

ANCOVA (Analysis of Co-Variance)
An extension of ANOVA that tests for differences among group means after statistically controlling for the effects of one or more covariates.

ANOVA (Analysis of Variance)
A method of statistical analysis broadly applicable to a number of research designs, used to determine differences among the means of two or more groups on a variable. The independent variables are usually nominal, and the dependent variable is usually interval.

Apparency
Clear, understandable representation of the data

Bell curve
A frequency distribution that, when graphed, takes the shape of a bell; see normal distribution.

Case Study
The collection and presentation of detailed information about a particular participant or small group, frequently including the accounts of subjects themselves.

Causal Model
A model which represents a causal relationship between two variables.

Causal Relationship
The relationship established when an independent variable, and nothing else, is shown to cause a change in a dependent variable. It also establishes how much of a change is shown in the dependent variable.

Causality
The relation between cause and effect.

Central Tendency
These measures indicate the middle or center of a distribution.
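
As a quick illustration, the three common measures of central tendency can be computed with Python's standard statistics module (the scores below are hypothetical):

```python
import statistics

# Hypothetical essay scores for ten students
scores = [72, 85, 85, 90, 78, 85, 60, 95, 88, 72]

mean = statistics.mean(scores)      # arithmetic average: 81
median = statistics.median(scores)  # middle value of the sorted scores: 85
mode = statistics.mode(scores)      # most frequent score: 85
```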

Confirmability
Objectivity; the findings of the study could be confirmed by another person conducting the same study

Confidence Interval
The range around a numeric statistical value obtained from a sample, within which the actual, corresponding value for the population is likely to fall, at a given level of probability (Alreck, 444).
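
As an illustrative sketch, a 95% confidence interval around a sample mean can be computed as below; the sample data and the use of the normal approximation with z = 1.96 are assumptions for the example, not part of the definition:

```python
import math
import statistics

# Hypothetical survey responses on a 1-5 scale
sample = [4, 5, 3, 4, 4, 5, 2, 4, 3, 5, 4, 4, 3, 5, 4, 4]

n = len(sample)
mean = statistics.mean(sample)
se = statistics.stdev(sample) / math.sqrt(n)  # standard error of the mean

# 95% confidence interval via the normal approximation (z = 1.96)
lower, upper = mean - 1.96 * se, mean + 1.96 * se
```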

Confidence Level
The specific probability of obtaining some result from a sample if it did not exist in the population as a whole, at or below which the relationship will be regarded as statistically significant (Alreck, 444).

Confidence Limits (Same as confidence interval, but is terminology used by Lauer and Asher.)
"The range of scores or percentages within which a population percentage is likely to be found on variables that describe that population" (Lauer and Asher, 58). Confidence limits are expressed in a "plus or minus" fashion according to sample size, then corrected according to formulas based on variables connected to population size in relation to sample size and the relationship of the variable to the population size--the larger the sample, the smaller the variability or confidence limits.

Confounding Variable
An unforeseen, and unaccounted-for variable that jeopardizes reliability and validity of an experiment's outcome.

Construct Validity Seeks an agreement between a theoretical concept and a specific measuring device, such as observation.

Content Validity The extent to which a measurement reflects the specific intended domain of content (Carmines & Zeller, 1991, p.20).

Context sensitivity Awareness by a qualitative researcher of factors such as values and beliefs that influence cultural behaviors

Continuous Variable A variable that may have fractional values, e.g., height, weight and time.

Control Group A group in an experiment that receives no treatment in order to compare the treated group against a norm.

Convergent Validity The general agreement among ratings, gathered independently of one another, where measures should be theoretically related.

Correlation 1) A common statistical analysis, usually abbreviated as r, that measures the degree of relationship between pairs of interval variables in a sample. The range of correlation is from -1.00 to zero to +1.00. 2) A non-cause and effect relationship between two variables.
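
A minimal sketch of the first sense, computing Pearson's r by hand for two hypothetical interval variables (the data are invented for illustration):

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation for paired interval data."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical paired data: hours studied and essay score
hours = [1, 2, 3, 4, 5]
score = [55, 62, 70, 74, 85]
r = pearson_r(hours, score)  # near +1: a strong positive relationship
```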

Covariate A variable related to the dependent variable whose effects are statistically controlled for; the covariance of two related variables is the product of their correlation times their standard deviations. Used in true experiments to adjust the measured difference of treatment between groups.

Credibility A researcher's ability to demonstrate that the object of a study is accurately identified and described, based on the way in which the study was conducted

Criterion Related Validity Used to demonstrate the accuracy of a measuring procedure by comparing it with another procedure which has been demonstrated to be valid; also referred to as instrumental validity.

Data Recorded observations, usually in numeric or textual form

Deductive A form of reasoning in which conclusions are formulated about particulars from general or universal premises

Dependability Being able to account for changes in the design of the study and the changing conditions surrounding what was studied.

Dependent Variable A variable that receives a stimulus and is measured for the effect the treatment has had upon it.

Design flexibility A quality of an observational study that allows researchers to pursue inquiries on new topics or questions that emerge from initial research

Deviation The distance between the mean and a particular data point in a given distribution.

Discourse Community A community of scholars and researchers in a given field who respond to and communicate with each other through published articles in the community's journals and presentations at conventions. All members of the discourse community adhere to certain conventions for the presentation of their theories and research.

Discrete Variable A variable that is measured solely in whole units, e.g., gender and siblings

Discriminant Validity The lack of a relationship among measures which theoretically should not be related.

Distribution The range of values of a particular variable.

Dynamic systems Qualitative observational research is not concerned with having straightforward, right-or-wrong answers. Change in a study is common because the researcher is not concerned with finding only one answer.

Electronic Text A "paper" or linear text that has been essentially "copied" into an electronic medium.

Empathic neutrality A quality of qualitative researchers who strive to be non-judgmental when compiling findings

Empirical Research "…the process of developing systematized knowledge gained from observations that are formulated to support insights and generalizations about the phenomena under study" (Lauer and Asher, 1988, p. 7)

Equivalency Reliability The extent to which two items measure identical concepts at an identical level of difficulty.

Ethnography Ethnographies study groups and/or cultures over a period of time. The goal of this type of research is to comprehend the particular group/culture through observer immersion into the culture or group. Research is completed through various methods, which are similar to those of case studies, but since the researcher is immersed within the group for an extended period of time more detailed information is usually collected during the research.

Ethnomethodology A form of ethnography that studies activities of group members to see how they make sense of their surroundings

Existence or Frequency This is a key question in the coding process. The researcher must decide if he/she is going to count a concept only once, for existence, no matter how many times it appears, or if he/she will count it each time it occurs. For example, "damn" could be counted once, even though it appears 50 times, or it could be counted all 50 times. The latter measurement may be of interest in how many times the concept occurs and what that indicates, whereas the former may simply be looking for existence, period.
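
The existence-versus-frequency choice can be sketched in a few lines of Python, echoing the "damn" example above (the transcript fragment is invented):

```python
# Hypothetical transcript fragment to be coded for the concept "damn"
text = "damn it, I said damn the plan, damn it all"

tokens = text.replace(",", "").split()

frequency = tokens.count("damn")          # count every occurrence: 3
existence = 1 if "damn" in tokens else 0  # count the concept once: 1
```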

Experimental Research A researcher working within this methodology creates an environment in which to observe and interpret the results of a research question. A key element in experimental research is that participants in a study are randomly assigned to groups. In an attempt to create a causal model (i.e., to discover the causal origin of a particular phenomenon), groups are treated differently and measurements are conducted to determine if different treatments appear to lead to different effects.

External Validity The extent to which the results of a study are generalizable or transferable. See also validity

Face Validity The extent to which a measure or procedure appears, on its surface, to measure what it is intended to measure.

Factor Analysis A statistical test that explores relationships among data. The test explores which variables in a data set are most related to each other. In a carefully constructed survey, for example, factor analysis can yield information on patterns of responses, not simply data on a single response. Larger tendencies may then be interpreted, indicating behavior trends rather than simply responses to specific questions.

Generalizability The extent to which research findings and conclusions from a study conducted on a sample population can be applied to the population at large.

Grounded theory Practice of developing other theories that emerge from observing a group. Theories are grounded in the group's observable experiences, but researchers add their own insight into why those experiences exist.

Holistic perspective Taking almost every action or communication of the whole phenomenon of a certain community or culture into account in research

Hypertext A nonsequential text composed of links and nodes

Hypothesis A tentative explanation based on theory to predict a causal relationship between variables.

Independent Variable A variable that is part of the situation under study and from which the stimulus given to a dependent variable originates; includes the treatment and states of the subject, such as age, size, weight, etc.

Inductive A form of reasoning in which a generalized conclusion is formulated from particular instances

Inductive analysis A form of analysis based on inductive reasoning; a researcher using inductive analysis starts with answers, but forms questions throughout the research process.

Internal Consistency The extent to which all questions or items assess the same characteristic, skill, or quality.

Internal Validity (1) The rigor with which the study was conducted (e.g., the study's design, the care taken to conduct measurements, and decisions concerning what was and wasn't measured) and (2) the extent to which the designers of a study have taken into account alternative explanations for any causal relationships they explore (Huitt, 1998). In studies that do not explore causal relationships, only the first of these definitions should be considered when assessing internal validity. See also validity.

Interrater Reliability The extent to which two or more individuals agree. It addresses the consistency of the implementation of a rating system.
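
One simple index is percent agreement, sketched below with hypothetical codes assigned by two raters; chance-corrected statistics such as Cohen's kappa are more robust but are omitted here for brevity:

```python
# Hypothetical codes assigned by two independent raters to ten excerpts
rater_a = ["pos", "neg", "pos", "pos", "neu", "neg", "pos", "neu", "pos", "neg"]
rater_b = ["pos", "neg", "neu", "pos", "neu", "neg", "pos", "pos", "pos", "neg"]

# Proportion of excerpts on which the raters assigned the same code
agreements = sum(a == b for a, b in zip(rater_a, rater_b))
percent_agreement = agreements / len(rater_a)  # 8 of 10 -> 0.8
```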

Interval Variable A variable in which both order of data points and distance between data points can be determined, e.g., percentage scores and distances

Interviews A research tool in which a researcher asks questions of participants; interviews are often audio- or video-taped for later transcription and analysis.

Irrelevant Information One must decide what to do with the information in the text that is not coded. One's options include either deleting or skipping over unwanted material, or viewing all information as relevant and important and using it to reexamine, reassess, and perhaps even alter one's coding scheme.

Kinesics Kinesic analysis examines what is communicated through body movement

Level of Analysis Chosen by determining which word, set of words, or phrases will constitute a concept. According to Carley, 100-500 concepts are generally sufficient when coding for a specific topic, but this number of course varies on a case-by-case basis.

Level of Generalization A researcher must decide whether concepts are to be coded exactly as they appear, or if they can be recorded in some altered or collapsed form. Using Horton as an example again, she could code profanity individually and code "damn" and "dammit" as two separate concepts. Or, by generalizing their meaning, i.e. they both express the same idea, she could group them together as one item, i.e. "damn words."

Level of Implication One must determine whether to code simply for explicit appearances of concepts, or for implied concepts, as well. For example, consider a hypothetical piece of text about skiing, written by an expert. The expert might refer several times to "???," as well as various other kinds of turns. One must decide whether to code "???" as an entity in and of itself, or, if coding for "turn" references in general, to code "???" as implicitly meaning "turn." Thus, by determining that the meaning "turn" is implicit in the words "???," anytime the words "???" or "turn" appear in the text, they will be coded under the same category of "turn."

Link In hypertext, a pointer from one node to another

Matched T-Test A statistical test used to compare two sets of scores for the same subject. A matched pairs T-test can be used to determine if the scores of the same participants in a study differ under different conditions. For instance, this sort of t-test could be used to determine if people write better essays after taking a writing class than they did before taking the writing class.

Matching The process of making the experimental and control groups comparable, feature for feature, on relevant variables.

Mean The average score within a distribution.

Mean Deviation A measure of variation that indicates the average deviation of scores in a distribution from the mean; it is determined by averaging the absolute values of the deviations.

Median The center score in a distribution.

Mental Models A group or network of interrelated concepts that reflect conscious or subconscious perceptions of reality. These internal mental networks of meaning are constructed as people draw inferences and gather information about the world.

Mode The most frequent score in a distribution.

Multi-Modal Methods A research approach that employs a variety of methods; see also triangulation

Narrative Inquiry A qualitative research approach based on a researcher's narrative account of the investigation, not to be confused with a narrative examined by the researcher as data

Naturalistic Inquiry Observational research of a group in its natural setting

Node In hypertext, each unit of information, connected by links

Nominal Variable A variable determined by categories which cannot be ordered, e.g., gender and color

Normal Distribution A frequency distribution representing the probability that a majority of randomly selected members of a population will fall within the middle of the distribution; represented by the bell curve.

Ordinal Variable A variable in which the order of data points can be determined but not the distance between data points, e.g., letter grades

Parameter A coefficient or value for the population that corresponds to a particular statistic from a sample and is often inferred from the sample.

Phenomenology A qualitative research approach concerned with understanding certain group behaviors from that group's point of view

Population The target group under investigation, as in all students enrolled in first-year composition courses taught in traditional classrooms. The population is the entire set under consideration. Samples are drawn from populations.

Precision In survey research, the tightness of the confidence limits.

Pre-defined or Interactive Concept Choice One must determine whether to code only from a pre-defined set of concepts and categories, or whether to develop some or all of these during the coding process. For example, using a predefined set, Horton would code only for profane language. But if Horton coded interactively, she may have decided halfway through the process that the text warranted coding for profane gestures as well.

Probability The chance that a phenomenon has a of occurring randomly. As a statistical measure, it shown as p (the "p" factor).

Qualitative Research Empirical research in which the researcher explores relationships using textual, rather than quantitative data. Case study, observation, and ethnography are considered forms of qualitative research. Results are not usually considered generalizable, but are often transferable.

Quantitative Research Empirical research in which the researcher explores relationships using numeric data. Survey is generally considered a form of quantitative research. Results can often be generalized, though this is not always the case.

Quasi-experiment Similar to a true experiment, with subjects, treatment, etc., but using nonrandomized groups. Incorporates interpretation and transferability in order to compensate for the lack of control of variables.

Quixotic Reliability Refers to the situation where a single manner of observation consistently, yet erroneously, yields the same result.

Random sampling Process used in research to draw a sample of a population strictly by chance, yielding no discernible pattern beyond chance. Random sampling can be accomplished by first numbering the population, then selecting the sample according to a table of random numbers or using a random-number computer generator. The sample is said to be random because there is no regular or discernible pattern or order. Random sample selection is used under the assumption that sufficiently large samples assigned randomly will exhibit a distribution comparable to that of the population from which the sample is drawn.
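
A sketch of the numbering-and-drawing procedure described above, using Python's standard random module (the population size, sample size, and seed are illustrative assumptions):

```python
import random

random.seed(42)  # fixed seed so the example draw is reproducible

# Hypothetical numbered population of 500 first-year composition students
population = list(range(1, 501))

# Draw a simple random sample of 25 students, without replacement
sample = random.sample(population, 25)
```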

Randomization Used to allocate subjects to experimental and control groups. The subjects are initially considered not unequal because they were randomly selected.

Range The difference between the highest and lowest scores in a distribution.

Reliability The extent to which a measure, procedure, or instrument yields the same result on repeated trials.

Response Rate In survey research, the actual percentage of questionnaires completed and returned.

Rhetorical Inquiry "entails…1) identifying a motivational concern, 2) posing questions, 3) engaging in a heuristic search (which in composition studies has often occurred by probing other fields), 4) creating a new theory or hypotheses, and 5) justifying the theory" (Lauer and Asher, 1988, p. 5)

Rigor Degree to which research methods are scrupulously and meticulously carried out in order to recognize important influences occurring in an experiment.

Sampling Error The degree to which the results from the sample deviate from those that would be obtained from the entire population, because of random error in the selection of respondents and the corresponding reduction in reliability (Alreck, 454).

Sampling Frame A listing that should include all those in the population to be sampled and exclude all those who are not in the population (Alreck, 454).

Sample The population researched in a particular study. Usually, attempts are made to select a "sample population" that is considered representative of groups of people to whom results will be generalized or transferred. In studies that use inferential statistics to analyze results or which are designed to be generalizable, sample size is critical--generally the larger the number in the sample, the higher the likelihood of a representative distribution of the population.

Selective Reduction The central idea of content analysis. Text is reduced to categories consisting of a word, set of words or phrases, on which the researcher can focus. Specific words or patterns are indicative of the research question and determine levels of analysis and generalization.

Serial Effect In survey research, a situation where questions may "lead" participant responses through establishing a certain tone early in the questionnaire. The serial effect may accrue as several questions establish a pattern of response in the participant, biasing results.

Short-term observation Studies that list or present findings of short-term qualitative study based on recorded observation

Skewed Distribution Any distribution which is not normal, that is, not symmetrical along the x-axis

Stability Reliability The agreement of measuring instruments over time.

Standard Deviation A term used in statistical analysis. A measure of variation that indicates the typical distance between the scores of a distribution and the mean; it is determined by taking the square root of the average of the squared deviations in a given distribution. It can be used to indicate the proportion of data within certain ranges of scale values when the distribution conforms closely to the normal curve.

Standard Error (S.E.) of the Mean A term used in statistical analysis. A computed value based on the size of the sample and the standard deviation of the distribution, indicating the range within which the mean of the population is likely to be from the mean of the sample at a given level of probability (Alreck, 456).

Survey A research tool that includes at least one question, which may be either open-ended or closed-ended, and employs an oral or written method for asking these questions. The goal of a survey is to gain specific information about either a specific group or a representative sample of a particular group. Results are typically used to understand the attitudes, beliefs, or knowledge of a particular group.

Synchronic Reliability The similarity of observations within the same time frame; it is not about the similarity of things observed.

T-Test A statistical test. A t-test is used to determine if the scores of two groups differ on a single variable. For instance, to determine whether writing ability differs among students in two classrooms, a t-test could be used.
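
A minimal sketch of the pooled-variance t statistic for two independent groups, with hypothetical classroom scores (a full test would also compare t against a critical value for the appropriate degrees of freedom):

```python
import math
import statistics

def t_statistic(group1, group2):
    """Pooled-variance t statistic for two independent samples."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = statistics.mean(group1), statistics.mean(group2)
    v1, v2 = statistics.variance(group1), statistics.variance(group2)
    pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled * (1 / n1 + 1 / n2))

# Hypothetical essay scores from two classrooms
class_a = [78, 85, 82, 90, 74, 88]
class_b = [70, 75, 68, 80, 72, 77]
t = t_statistic(class_a, class_b)  # positive: class_a scored higher on average
```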

Thick Description A rich and extensive set of details concerning methodology and context provided in a research report.

Transferability The ability to apply the results of research in one context to another similar context. Also, the extent to which a study invites readers to make connections between elements of the study and their own experiences.

Translation Rules If one decides to generalize concepts during coding, then one must develop a set of rules by which less general concepts will be translated into more general ones. This doesn't involve simple generalization, for example, as with "damn" and "dammit," but requires one to determine, from a given set of concepts, what concepts are missing. When dealing with the idea of profanity, one must decide what to do with the concept "dang it," which is generally thought to imply "damn it." The researcher must make this distinction, i.e. make this implicit concept explicit, and then code for the frequency of its occurrence. This decision results in the construction of a translation rule, which instructs the researcher to code for the concept "dang it" in a certain way.

Treatment The stimulus given to a dependent variable.

Triangulation The use of a combination of research methods in a study. An example of triangulation would be a study that incorporated surveys, interviews, and observations. See also multi-modal methods

Unique case orientation A perspective adopted by many researchers conducting qualitative observational studies; researchers adopting this orientation remember that every study is special and deserves in-depth attention. This is especially necessary for making cultural comparisons.

Validity The degree to which a study accurately reflects or assesses the specific concept that the researcher is attempting to measure. A method can be reliable, consistently measuring the same thing, but not valid. See also internal validity and external validity

Variable Observable characteristics that vary among individuals. See also ordinal variable, nominal variable, interval variable, continuous variable, discrete variable, dependent variable, independent variable.
Variance A measure of variation within a distribution, determined by averaging the squared deviations from the mean of a distribution.
Variation The dispersion of data points around the mean of a distribution.
Verisimilitude Having the semblance of truth; in research, it refers to the probability that the research findings are consistent with occurrences in the "real world."

Ethnography, Observational Research, and Narrative Inquiry: Commentary

Commentary on issues related to qualitative observational research is provided in this section. Check the items below to read the commentaries:

Advantages of Qualitative Observational Research

Qualitative observational research, especially ethnographies, can:
Account for the complexity of group behaviors
Reveal interrelationships among multifaceted dimensions of group interactions
Provide context for behaviors

Narrative inquiry, especially ethnographic, can:
Reveal qualities of group experience in a way that other forms of research cannot
Help determine questions and types of follow-up research

Observational study can:
Reveal descriptions of behaviors in context by stepping outside the group
Allow qualitative researchers to identify recurring patterns of behavior that participants may be unable to recognize

Qualitative research expands the range of knowledge and understanding of the world beyond the researchers themselves. It often helps us see why something is the way it is, rather than just presenting a phenomenon. For instance, a quantitative study may find that students who are taught composition using a process method receive higher grades on papers than students taught using a product method. However, a qualitative study of composition instructors could reveal why many of them still use the product method even though they are aware of the benefits of the process method.

Disadvantages of Qualitative Observational Research

Qualitative Observational Research
Researcher bias can enter into the design of a study.
Researcher bias can enter into data collection.
Sources or subjects may not all be equally credible.
Some subjects' prior experiences may influence their behavior and affect the outcome of the study.
Background information may be missing.
Study group may not be representative of the larger population.
Analysis of observations can be biased.
Any group that is studied is altered to some degree by the very presence of the researcher; therefore, any data collected are somewhat skewed (an effect sometimes compared to the Heisenberg Uncertainty Principle).
It takes time to build the trust with participants that facilitates full and honest self-representation. Short-term observational studies are at a particular disadvantage where trust building is concerned.

Ethnographic studies
The quality of the data alone is problematic (Lauer and Asher, 1988): ethnographic research is time-consuming, potentially expensive, and requires a well-trained researcher.
Too little data can lead to false assumptions about behavior patterns; conversely, a large quantity of data may not be processed effectively.
A data collector's first impressions can bias collection.

Narrative Inquiries
Narrative inquiries do not lend themselves well to replicability and are not generalizable.
Narrative inquiries are considered unreliable by experimentalists. However, ethnographies can be assessed and compared for certain variables to yield testable explanations; this is as close as ethnographic research gets to being empirical in nature.
Qualitative research is neither prescriptive nor definitive. While it provides significant data about groups or cultures and prompts new research questions, narrative studies do not attempt to answer questions, nor are they predictive of future behaviors.
http://writing.colostate.edu/guides/research/observe/com2d2.cfm

The Qualitative/Quantitative Debate

In Miles and Huberman's 1994 book Qualitative Data Analysis, quantitative researcher Fred Kerlinger is quoted as saying, "There's no such thing as qualitative data. Everything is either 1 or 0" (p. 40). To this, another researcher, D. T. Campbell, responds, "All research ultimately has a qualitative grounding" (p. 40). This back-and-forth banter between qualitative and quantitative researchers is "essentially unproductive," according to Miles and Huberman. They and many other researchers agree that these two research methods need each other more often than not. But, because qualitative data typically involve words and quantitative data involve numbers, some researchers feel that one method is better (or more scientific) than the other. Another major difference between the two is that qualitative research is inductive while quantitative research is deductive. In qualitative research, a hypothesis is not needed to begin research; quantitative research, however, requires a hypothesis before research can begin.

Another major difference between qualitative and quantitative research deals with the underlying assumptions about the role of the researcher. In quantitative research, the researcher is ideally an objective observer who neither participates in nor influences what is being studied. In qualitative research, however, it is thought that the researcher can learn the most by participating and/or being immersed in a research situation. These basic underlying assumptions of both methodologies guide and sequence the types of data collection methods employed.

Although there are clear differences between qualitative and quantitative approaches, some researchers maintain that the choice between using qualitative or quantitative approaches actually has less to do with methodologies than it does with positioning oneself within a particular discipline or research tradition. The difficulty in choosing a method is compounded by the fact that research is often affiliated with universities and other institutions. The findings of research projects often guide important decisions about specific practices and policies. Choices about which approach to use may reflect the interests of those conducting or benefiting from the research and the purposes for which the findings will be applied. Decisions about which kind of research method to use may also be based on the researcher's own experience and preference, the population being researched, the proposed audience for findings, time, money and other resources available (Hathaway, 1995).

Some researchers believe that qualitative and quantitative methodologies cannot be combined because the assumptions underlying each tradition are so vastly different. Other researchers think they can be used in combination only by alternating between methods; qualitative research is appropriate to answer certain kinds of questions in certain conditions and quantitative is right for others. And some researchers think that both qualitative and quantitative methods can be used simultaneously to answer a research question.

To a certain extent, researchers on all sides of the debate are correct; each approach has its drawbacks. Quantitative research often "forces" responses or people into categories that might not "fit" in order to make meaning. Qualitative research, on the other hand, sometimes focuses too closely on individual results and fails to make connections to larger situations or possible causes of the results. Rather than discounting either approach for its drawbacks, researchers should find the most effective ways to incorporate elements of both to ensure that their studies are as accurate and thorough as possible.

It is important for researchers to realize that qualitative and quantitative methods can be used in conjunction with each other. In a study of computer-assisted writing classrooms, Snyder (1995) employed both qualitative and quantitative approaches. The study was constructed according to guidelines for quantitative studies; the computer classroom was the "treatment" group and the traditional pen and paper classroom was the "control" group. Both classes contained subjects with the same characteristics from the population sampled. Both classes followed the same lesson plan and were taught by the same teacher in the same semester. The only variable used was the absence or presence of the computers. Although Snyder set this study up as an "experiment," she used many qualitative approaches to supplement her findings. She observed both classrooms on a regular basis as a participant-observer and conducted several interviews with the teacher both during and after the semester. However, there were problems in using this approach. The strict adherence to the same syllabus and lesson plans for both classes and the restricted access of the control group to the computers may have put some students at a disadvantage. Snyder also notes that in retrospect she should have used case studies of the students to further develop her findings. Although her study had certain flaws, Snyder insists that researchers can simultaneously employ qualitative and quantitative methods if studies are planned carefully and carried out conscientiously.

Newkirk (1991) argues for qualitative research in English education from a political point of view. He says that not only can teachers more readily identify with and accept such particularized studies, but also the work of observing-participants, who report classroom "lore," gives practitioners a voice in the conversations informing their discipline. In addition, he asserts that experimental research tends to support the hierarchical structure of education policy, which discounts the experience of practitioners by privileging the alleged objectivity and generalizability of experimental designs and removing research from context. Additionally, Newkirk points out that "ethnographic...research works from fundamentally different assumptions about knowledge." Essentially, ethnography's epistemological orientation is phenomenological (observation based) while experimental research's is ontological (investigates the metaphysical or essential nature of something).
http://writing.colostate.edu/guides/research/observe/com2d3.cfm

Ethical Considerations in Ethnography, Observational Research, and Narrative Inquiry
Ethical issues should always be considered when undertaking data analysis. Because qualitative observational research requires observation of and interaction with groups, it is understandable why certain ethical issues may arise. Miles and Huberman (1994) list several issues that researchers should consider when analyzing data, and they caution researchers to be aware of these and other issues before, during, and after the research is conducted. Some of the issues involve the following:

Informed consent (Do participants have full knowledge of what is involved?)
Harm and risk (Can the study hurt participants?)
Honesty and trust (Is the researcher being truthful in presenting data?)
Privacy, confidentiality, and anonymity (Will the study intrude too much into group behaviors?)
Intervention and advocacy (What should researchers do if participants display harmful or illegal behavior?)
http://writing.colostate.edu/guides/research/observe/com2d4.cfm