Module 2: Research Methods in Social Psychology
Module Overview
In Module 2 we will address the fact that psychology is the scientific study of behavior and mental processes. We will do this by examining the steps of the scientific method and describing the five major designs used in psychological research. We will also differentiate between reliability and validity and their importance for measurement. Psychology has very clear ethical standards and procedures for scientific research. We will discuss these but also why they are needed. Finally, psychology as a field, but especially social psychology as a subfield, is faced with a replication crisis and issues with the generalizability of its findings. These will be explained to close out the module.
Module Outline
- 2.1. The Scientific Method
- 2.2. Research Designs Used by Social Psychologists
- 2.3. Reliability and Validity
- 2.4. Research Ethics
- 2.5. Issues in Social Psychology
Module Learning Outcomes
- Clarify what it means for psychology to be scientific by examining the steps of the scientific method and the three cardinal features of science.
- Outline the five main research methods used in psychology and clarify how they are utilized in social psychology.
- Differentiate and explain the concepts of reliability and validity.
- Describe key features of research ethics.
- Clarify the nature of the replication crisis in psychology and the importance of generalizability.
2.1. The Scientific Method
Section Learning Objectives
- Define scientific method.
- Outline and describe the steps of the scientific method, defining all key terms.
- Identify and clarify the importance of the three cardinal features of science.
In Module 1, we learned that psychology is the scientific study of behavior and mental processes. We will spend quite a lot of time on the behavior and mental processes part, but before we proceed, it is prudent to elaborate on what makes psychology scientific. In fact, it is safe to say that most people outside our discipline or a sister science would be surprised to learn that psychology utilizes the scientific method at all.
So what is the scientific method? Simply, the scientific method is a systematic method for gathering knowledge about the world around us. The key word here is systematic, meaning there is a set way to use it. What is that way? Well, depending on the source you consult, it can include a varying number of steps. For our purposes, the following will be used:
Table 2.1: The Steps of the Scientific Method
Step | Name | Description |
0 | Ask questions and be willing to wonder. | To study the world around us you have to wonder about it. This inquisitive nature is the hallmark of critical thinking, or our ability to assess claims made by others and make objective judgments that are independent of emotion and anecdote and based on hard evidence. It is also required to be a scientist. We might wonder why our friend chose to go to a technical school or the military over the four-year university we went to, which falls under attribution theory in social psychology. |
1 | Generate a research question or identify a problem to investigate. | Through our wonderment about the world around us and why events occur as they do, we begin to ask questions that require further investigation to arrive at an answer. This investigation usually starts with a literature review, in which we conduct a literature search through our university library or a search engine such as Google Scholar to see what questions have been investigated already and what answers have been found, so that we can identify gaps or holes in this body of work. For instance, in relation to attribution theory, we would execute a search using those words as our parameters. Google Scholar and similar search engines would look for “attribution theory” in the keywords authors identify when writing their abstract. The search would likely return quite a few articles, at which time you would pick and choose which ones to read from the abstracts (the short summary of what the article is about; it is sort of like the description of a book found on the back cover or sometimes the inside cover of a book jacket). As you read articles, you would try to figure out what has and has not been done to give your future research project direction. |
2 | Attempt to explain the phenomena we wish to study. | We now attempt to formulate an explanation of why the event occurs as it does. This systematic explanation of a phenomenon is a theory and our specific, testable prediction is the hypothesis. We will know if our theory is correct because we have formulated a hypothesis which we can now test. In the case of our example, we are not really creating a theory, as one already exists to explain why people do what they do (attribution theory), but we can formulate a specific, testable prediction in relation to it. You might examine whether or not your friend made his choice because he is genuinely interested in learning a trade or serving his country, or if he was pushed to do this by his parents. The former would be a dispositional or personal reason while the latter would be situational. You might focus your investigation on the effect parents can have on the career choices children make. Maybe you suppose that a child who is securely attached to his parents will be more likely to follow their wishes than a child who is insecurely attached. This question would actually blend social and developmental psychology. |
3 | Test the hypothesis. | It goes without saying that if we cannot test our hypothesis, then we cannot show whether our prediction is correct or not. Our plan of action of how we will go about testing the hypothesis is called our research design. In the planning stage, we will select the appropriate research method to answer our question/test our hypothesis. In this case that is to what extent parenting and attachment serve as situational factors affecting career choice decisions. We will discuss specific designs in the next section but for now, we could use a survey and observation. |
4 | Interpret the results. | With our research study done, we now examine the data to see if the pattern we predicted exists. We need to see if a cause and effect statement can be made, assuming our method allows for this inference. The statistics we use take on two forms. First, there are descriptive statistics, which provide a means of summarizing or describing data and presenting the data in a usable form. You likely have heard of the mean or average, median, and mode. Along with standard deviation and variance, these are ways to describe our data. Second, there are inferential statistics, which allow for the analysis of two or more sets of numerical data to determine the statistical significance of the results. Significance is an indication of how confident we are that our results are due to our manipulation or design and not chance. Typically, we require that the probability of the results being due to chance be no higher than 5%. |
5 | Draw conclusions carefully. | We need to accurately interpret our results and not overstate our findings. To do this, we need to be aware of our biases and avoid emotional reasoning so that they do not cloud our judgment. How so? In our effort to stop a child from engaging in self-injurious behavior that could cause substantial harm or even death, we might overstate the success of our treatment method. In the case of our attribution study, we might not fudge our results like this but still need to make sure we interpret our statistical findings correctly. |
6 | Communicate our findings to the larger scientific community. | Once we have decided on whether our hypothesis is correct or not, we need to share this information with others so that they might comment critically on our methodology, statistical analyses, and conclusions. Sharing also allows for replication or repeating the study to confirm its results. Communication is accomplished via scientific journals, conferences, or newsletters released by many of the organizations mentioned in Section 1.4. As a note, there is actually a major issue in the field of psychology related to replication right now. We will discuss this in Section 2.5. |
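The descriptive statistics named in Step 4 can be illustrated with a short Python sketch. The scores are hypothetical (invented for illustration, not taken from any study), and the helper name `describe` is an assumption of this example.

```python
import statistics

# Hypothetical questionnaire scores from ten respondents (illustration only).
scores = [32, 41, 41, 28, 36, 45, 41, 30, 38, 28]


def describe(values):
    """Return common descriptive statistics for a list of numbers."""
    return {
        "mean": statistics.mean(values),          # arithmetic average
        "median": statistics.median(values),      # middle value when sorted
        "mode": statistics.mode(values),          # most frequent value
        "variance": statistics.variance(values),  # sample variance (divides by n - 1)
        "std_dev": statistics.stdev(values),      # sample standard deviation
    }


summary = describe(scores)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

The mean, median, and mode each describe the center of the scores in a different way, while the variance and standard deviation describe how spread out the scores are around the mean.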
Science has at its root three cardinal features, mentioned in Module 1, that we will see play out time and time again throughout this book. They are:
- Observation – In order to know about the world around us we must be able to see it firsthand. In relation to social psychology, we know our friend and his parents pretty well, and so in our time with them have observed the influence they exert on his life.
- Experimentation – To be able to make causal or cause and effect statements, we must be able to isolate variables. We have to manipulate one variable and see the effect of doing so on another variable. Experimentation is the primary method social psychology uses to test its hypotheses.
- Measurement – How do we know whether or not our friend is truly securely attached to his parents? Simply, we measure attachment. To do that, we could give our friend a short questionnaire asking about his attachment pattern to his parents. For this questionnaire, let’s say we use a 5-point scale for all questions (with 1 meaning the question does not apply and 5 meaning it definitely is true or matters). If there were 10 questions, then our friend would have a total score between 10 and 50. The 10 would come from him answering every question with a 1 and the 50 from answering every question with a 5. If you are not aware, there are four main styles of attachment (secure, anxious-ambivalent, avoidant, and disorganized-disoriented). We would have 2-3 questions assessing each of the 4 styles, meaning that if we had 2 questions for a style, the score for that style would range from 2 to 10. If 3 questions, the range would be 3 to 15. The higher the score, the more likely the person exhibits that style toward the parent, and our friend should have a high score on only one of the four styles if our scale correctly assesses attachment. We will discuss reliability and validity in Section 2.3.
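The scoring arithmetic in the measurement example can be sketched in Python. The assignment of questions to styles in `SUBSCALES` and the friend's answers are hypothetical; the text specifies only that each style has 2-3 questions rated from 1 to 5.

```python
# Hypothetical assignment of the 10 questions to the four attachment styles
# (assumed for illustration; the text does not say which items measure which style).
SUBSCALES = {
    "secure": [1, 5, 9],
    "anxious-ambivalent": [2, 6, 10],
    "avoidant": [3, 7],
    "disorganized-disoriented": [4, 8],
}


def score_subscales(responses):
    """Sum the 1-5 ratings for each style.

    `responses` maps a question number (1-10) to a rating from 1 to 5.
    """
    for question, rating in responses.items():
        if not 1 <= rating <= 5:
            raise ValueError(f"Question {question}: rating {rating} is outside 1-5")
    return {
        style: sum(responses[q] for q in questions)
        for style, questions in SUBSCALES.items()
    }


def possible_range(style):
    """Lowest and highest possible score for a style (all 1s versus all 5s)."""
    n_items = len(SUBSCALES[style])
    return (n_items * 1, n_items * 5)


# Hypothetical answers from our friend.
friend = {1: 5, 2: 1, 3: 2, 4: 1, 5: 4, 6: 2, 7: 1, 8: 1, 9: 5, 10: 1}
scores = score_subscales(friend)
total = sum(friend.values())

for style, score in scores.items():
    low, high = possible_range(style)
    print(f"{style}: {score} (possible range {low}-{high})")
print(f"total: {total} (possible range 10-50)")
```

In this made-up case the friend scores near the top of the range only on the secure items, which is the pattern we would expect if the scale works as intended.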
2.2. Research Designs Used by Social Psychologists
Section Learning Objectives
- List the five main research methods used in psychology.
- Describe observational research, listing its advantages and disadvantages.
- Describe case study research, listing its advantages and disadvantages.
- Describe survey research, listing its advantages and disadvantages.
- Describe correlational research, listing its advantages and disadvantages.
- Describe experimental research, listing its advantages and disadvantages.
- State the utility and need for multimethod research.
Step 3 called on the scientist to test their hypothesis. Psychology as a discipline uses five main research designs. These include observational research, case studies, surveys, correlational designs, and experiments.
2.2.1. Observational Research
In terms of naturalistic observation, the scientist studies human or animal behavior in its natural environment which could include the home, school, or a forest. The researcher counts, measures, and rates behavior in a systematic way and at times uses multiple judges to ensure accuracy in how the behavior is being measured. This is called inter-rater reliability as you will see in Section 2.3. The advantage of this method is that you witness behavior as it occurs and it is not tainted by the experimenter. The disadvantage is that it could take a long time for the behavior to occur and if the researcher is detected then this may influence the behavior of those being observed. In the case of the latter, the behavior of the observed becomes artificial.
Laboratory observation involves observing people or animals in a laboratory setting. The researcher might want to know more about parent-child interactions and so brings a mother and her child into the lab to engage in preplanned tasks such as playing with toys, eating a meal, or the mother leaving the room for a short period of time. The advantage of this method over the naturalistic method is that the experimenter can use sophisticated equipment and videotape the session to examine it at a later time. The problem is that since the subjects know the experimenter is watching them, their behavior could become artificial from the start.
2.2.1.1. Example of an observational social psychology study. Griffiths (1991) studied the gambling behavior of adolescents by observing the clientele of 33 arcades in the UK. He used participant (when the researcher becomes an active participant in the group they are studying) and non-participant observation methodologies and found that adolescent gambling depended on the time of day and the time of year, and regular players had stereotypical behaviors and conformed to specific rules of etiquette. They played for fun, to win, to socialize, for excitement, and/or to escape.
2.2.2. Case Studies
Psychology can also utilize a detailed description of one person or a small group based on careful observation. This was the approach the founder of psychoanalysis, Sigmund Freud, took to develop his theories. The advantage of this method is that you arrive at a rich description of the behavior being investigated but the disadvantage is that what you are learning may be unrepresentative of the larger population and so lacks generalizability. Again, bear in mind that you are studying one person or a very small group. Can you possibly make conclusions about all people from just one or even five or ten? The other issue is that the case study is subject to the bias of the researcher in terms of what is included in the final write up and what is left out. Despite these limitations, case studies can lead us to novel ideas about the cause of behavior and help us to study unusual conditions that occur too infrequently to study with large sample sizes and in a systematic way. Though our field does make use of the case study methodology, social psychology does not frequently use the design.
2.2.2.1. Example of a case study from clinical psychology. In 1895, the book Studies on Hysteria was published by Josef Breuer (1842-1925) and Sigmund Freud (1856-1939), and marked the birth of psychoanalysis, though Freud did not use this actual term until a year later. The book published several case studies, including that of Anna O., born February 27, 1859 in Vienna to Jewish parents Siegmund and Recha Pappenheim, strict Orthodox adherents and considered millionaires at the time. Bertha, known in published case studies as Anna O., was expected to complete the formal education of a girl in the upper middle class which included foreign language, religion, horseback riding, needlepoint, and piano. She felt confined and suffocated in this life and took to a fantasy world she called her “private theater.” Anna also developed hysteria, with symptoms including memory loss, paralysis, disturbed eye movements, reduced speech, nausea, and mental deterioration. Her symptoms appeared as she cared for her dying father, and her mother called on Breuer to diagnose her condition (note that Freud never actually treated her). Hypnosis was used at first and relieved her symptoms. Breuer made daily visits and allowed her to share stories from her private theater, a practice she came to call the “talking cure” or “chimney sweeping.” Many of the stories she shared were actually thoughts or events she found troubling, and reliving them helped to relieve or eliminate the symptoms. Breuer’s wife, Mathilde, became jealous of her husband’s relationship with the young girl, leading Breuer to terminate treatment in June of 1882 before Anna had fully recovered. She relapsed and was admitted to Bellevue Sanatorium on July 1, eventually being released in October of the same year.
With time, Anna O. did recover from her hysteria and went on to become a prominent member of the Jewish community, involving herself in social work, volunteering at soup kitchens, and becoming ‘House Mother’ at an orphanage for Jewish girls in 1895. Bertha (Anna O.) became involved in the German feminist movement, and in 1904 founded the League of Jewish Women. She published many short stories and a play called Women’s Rights, in which she criticized the economic and sexual exploitation of women, and in 1900 wrote a book called The Jewish Problem in Galicia, in which she blamed the poverty of the Jews of Eastern Europe on their lack of education. In 1935 she was diagnosed with a tumor, and in 1936 she was summoned by the Gestapo to explain anti-Hitler statements she had allegedly made. She died shortly after this interrogation on May 28, 1936. Freud considered the talking cure of Anna O. to be the origin of psychoanalytic therapy and what would come to be called the cathartic method.
To learn more about observational and case study designs, please take a look at our Research Methods in Psychology textbook by visiting:
https://opentext.wsu.edu/carriecuttler/chapter/observational-research/
For more on Anna O., please see:
https://www.psychologytoday.com/blog/freuds-patients-serial/201201/bertha-pappenheim-1859-1936
2.2.3. Surveys/Self-Report Data
A survey is a questionnaire consisting of at least one scale with some number of questions which assess a psychological construct of interest such as parenting style, depression, locus of control, attitudes, or sensation seeking behavior. It may be administered by paper and pencil or computer. For example, if you wanted to know about the prejudicial attitudes of a group of people, you could use the survey method. You could alternatively gather this information via an interview in a structured or unstructured fashion. Surveys allow for the collection of large amounts of data quickly, but the actual survey could be tedious for the participant, and social desirability, when a participant answers questions dishonestly so that he/she is seen in a more favorable light, could be an issue. For instance, if you are asking high school students about their sexual activity, they may not give genuine answers for fear that their parents will find out. Important to survey research is random sampling, in which everyone in the population has an equal chance of being included in the sample. This helps the sample to be representative of the population in terms of key demographic variables such as gender, age, ethnicity, race, education level, and religious orientation.
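Random sampling can be sketched in a few lines of Python. This is a minimal illustration of simple random sampling, assuming a complete list of the population is available; the student identifiers, the population size of 500, and the helper name `simple_random_sample` are all hypothetical.

```python
import random


def simple_random_sample(population, sample_size, seed=None):
    """Draw a sample in which every member has an equal chance of selection.

    Sampling is done without replacement, so no one is selected twice.
    A seed can be supplied to make the draw reproducible.
    """
    rng = random.Random(seed)
    return rng.sample(population, sample_size)


# Hypothetical population: 500 students identified by ID.
population = [f"student_{i:03d}" for i in range(1, 501)]

# Draw 50 of the 500 students to receive the survey.
sample = simple_random_sample(population, 50, seed=42)
print(len(sample), "students selected")
print(sample[:5])
```

Because chance alone decides who is selected, the researcher's preferences cannot systematically shape who ends up in the sample.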
To learn more about the survey research design, please take a look at our Research Methods in Psychology textbook by visiting:
https://opentext.wsu.edu/carriecuttler/chapter/7-1-overview-of-survey-research/
2.2.4. Correlational Research
This research method examines the relationship between two variables or two groups of variables. A numerical measure of the strength of this relationship, called the correlation coefficient, is derived and can range from -1.00, a perfect inverse relationship meaning that as one variable goes up the other goes down, to 0 or no relationship at all, to +1.00 or a perfect positive relationship in which as one variable goes up or down so does the other. In terms of a negative correlation, we might say that as a parent becomes more rigid, controlling, and cold, the attachment of the child to the parent goes down. In contrast, in terms of a positive correlation, as a parent becomes warmer, more loving, and provides structure, the child becomes more attached. The advantage of correlational research is that you can correlate anything. The disadvantage is that you can correlate anything. Yes, this is both an advantage and a disadvantage: variables that really do not have any relationship to one another could be viewed as related. For instance, we might correlate instances of making peanut butter and jelly sandwiches with someone we are attracted to sitting near us at lunch. Are the two related? Not likely, unless you make a really good PB&J, but then the person is probably only interested in you for food and not companionship. The main issue here is that correlation does not allow you to make a causal statement.
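To show how a correlation coefficient is obtained, here is a minimal Python sketch of the Pearson correlation, one common correlation coefficient (the text does not name a specific one). The ratings are hypothetical and deliberately constructed to produce perfect relationships.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # How the two variables vary together.
    co_variation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # How much each variable varies on its own.
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable does not vary")
    return co_variation / math.sqrt(ss_x * ss_y)


# Hypothetical ratings for five parent-child pairs (illustration only).
parental_warmth = [1, 2, 3, 4, 5]
parental_rigidity = [5, 4, 3, 2, 1]
child_attachment = [2, 4, 6, 8, 10]

r_warmth = pearson_r(parental_warmth, child_attachment)
r_rigidity = pearson_r(parental_rigidity, child_attachment)
print(f"warmth and attachment:   r = {r_warmth:+.2f}")
print(f"rigidity and attachment: r = {r_rigidity:+.2f}")
```

Real data rarely produce values of exactly +1.00 or -1.00, and even a strong coefficient says nothing about which variable, if either, causes the other.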
To learn more about the correlational research design, please take a look at our Research Methods in Psychology textbook by visiting:
https://opentext.wsu.edu/carriecuttler/chapter/correlational-research/
2.2.5. Example of a Study Using Survey and Correlational Designs
Roccas, Sagiv, Schwartz, and Knafo (2002) examined the relationship of the big five personality traits and values by administering the Schwartz (1992) Values survey, NEO-PI, a positive affect scale, and a single item assessing religiosity to introductory psychology students at an Israeli university. For Extraversion, it was found that values that define activity, challenge, excitement, and pleasure as desirable goals in life (i.e., stimulation, hedonism, and achievement) were important while valuing self-denial or self-abnegation, expressed in traditional values, was antithetical.
For Openness, values that emphasize intellectual and emotional autonomy, acceptance and cultivation of diversity, and pursuit of novelty and change (i.e. universalism, self-direction, and stimulation) were important while conformity, security, and tradition values were incompatible. Benevolence, tradition, and to a lesser degree conformity, were important for Agreeableness while power and achievement correlated negatively. In terms of Conscientiousness (C), there was a positive correlation with security values as both share the goal of maintaining smooth interpersonal relations and avoiding disruption of social order and there was a negative correlation with stimulation, indicating an avoidance of risk as a motivator of C.
Finally, there was little association of values with the domain of Neuroticism but a closer inspection of the pattern of correlations with the facets of N suggests two components. First, the angry hostility and impulsiveness facets could be called extrapunitive since the negative emotion is directed outward and tends to correlate positively with hedonism and stimulation values and negatively with benevolence, tradition, conformity, and C values. Second, the anxiety, depression, self-consciousness, and vulnerability facets could be called intrapunitive since the negative emotion is directed inward. This component tends to correlate positively with tradition values and negatively with achievement and stimulation values.
2.2.6. Experiments
An experiment is a controlled test of a hypothesis in which a researcher manipulates one variable and measures its effect on another variable. The variable that is manipulated is called the independent variable (IV) and the one that is measured is called the dependent variable (DV). A common feature of experiments is to have a control group that does not receive the treatment or is not manipulated and an experimental group that does receive the treatment or manipulation. If the experiment includes random assignment, participants have an equal chance of being placed in the control or experimental group. The control group allows the researcher to make a comparison to the experimental group, which makes a causal statement possible and strengthens it.
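Random assignment can be illustrated with a minimal Python sketch. The participant labels and the helper name `randomly_assign` are hypothetical, and splitting the shuffled list in half is just one simple way to give every participant an equal chance of landing in either group.

```python
import random


def randomly_assign(participants, seed=None):
    """Shuffle participants and split them into control and experimental groups.

    With an odd number of participants, the experimental group gets the extra person.
    """
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the original list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"control": shuffled[:half], "experimental": shuffled[half:]}


# Hypothetical participants P01 through P20.
participants = [f"P{i:02d}" for i in range(1, 21)]
groups = randomly_assign(participants, seed=7)

print("control:     ", groups["control"])
print("experimental:", groups["experimental"])
```

Because chance alone decides group membership, pre-existing differences among participants should, on average, be spread evenly across the two groups.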
2.2.6.1. Example of an experiment. Allison and Messick (1990) led subjects to believe they were the first of six group members to take points from a common resource pool and that they could take as many points as desired which could later be exchanged for cash. Three variables were experimentally manipulated. First, subjects in the low payoff condition were led to believe the pool was only 18 or 21 points in size whereas those in the high payoff condition were told the pool consisted of either 24 or 27 points. Second, the pools were divisible (18 and 24) or nondivisible (21 or 27). Third, half of the subjects were placed in the fate control condition and told that if the requests from the six group members exceeded the pool size, then no one could keep any points, while the other half were in the no fate control condition and told there would be no penalties for overconsumption of the pool. Finally, data for a fourth variable, social values, was collected via questionnaire four weeks prior to participation. In all, the study employed a 2 (fate control) x 2 (payoff size) x 2 (divisibility) x 2 (social values) between-subjects factorial design.
Results showed that subjects took the fewest points from the resource pool when the resource was divisible, the payoffs were low, and there was no fate control. On the other hand, subjects took the most points when the resource was nondivisible, the payoffs were high, and subjects were noncooperative. To further demonstrate this point, Allison and Messick (1990) counted the number of inducements to which participants were exposed. This number ranged from 0 to 4 inducements. Subjects took between one-fifth and one-fourth of the pool when there were one or two inducements, took about one-third when there were three inducements, and about half of the pool when all four were present. They state that an equal division rule was used when there were no temptations to violate equality but as the number of temptations increased, subjects became progressively more likely to overconsume the pool. The authors conclude that the presence of competing cues/factors tends to invite the use of self-serving rules to include “First-come, first-served” and “People who get to go first take more.”
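The structure of the 2 x 2 x 2 x 2 design can be laid out in a short Python sketch. It assumes that “divisible” means the pool splits evenly among the six group members, an interpretation that fits the pool sizes given but is not stated explicitly above. The two labels for social values are also assumed for illustration.

```python
from itertools import product

GROUP_SIZE = 6  # six group members, as described in the study summary

# Levels of each factor. The social values labels are assumed for illustration.
FACTORS = {
    "fate_control": ("fate control", "no fate control"),
    "payoff_size": ("low", "high"),
    "divisibility": ("divisible", "nondivisible"),
    "social_values": ("cooperative", "noncooperative"),
}

# Pool sizes reported for each combination of payoff size and divisibility.
POOL_SIZES = {
    ("low", "divisible"): 18,
    ("low", "nondivisible"): 21,
    ("high", "divisible"): 24,
    ("high", "nondivisible"): 27,
}


def design_cells(factors):
    """Return every combination of factor levels, one tuple per cell."""
    return list(product(*factors.values()))


def equal_share(pool_size, group_size=GROUP_SIZE):
    """Points each member would receive under an equal division rule."""
    return pool_size / group_size


def splits_evenly(pool_size, group_size=GROUP_SIZE):
    """True if the pool can be divided among the group in whole points."""
    return pool_size % group_size == 0


cells = design_cells(FACTORS)
print(len(cells), "cells in the design")

for (payoff, divisibility), pool in POOL_SIZES.items():
    print(payoff, divisibility, pool, equal_share(pool), splits_evenly(pool))
```

Under this reading, an equal share is a whole number of points (3 or 4) in the divisible pools and a fraction (3.5 or 4.5) in the nondivisible ones, which makes the equal division rule harder to apply in the latter.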
To learn more about the experimental research design, please take a look at our Research Methods in Psychology textbook by visiting:
https://opentext.wsu.edu/carriecuttler/chapter/experiment-basics/
2.2.7. Multi-Method Research
As you have seen above, no single method alone is perfect. All have their strengths and limitations. As such, for the psychologist to provide the clearest picture of what is affecting behavior or mental processes, several of these approaches are typically employed at different stages of the research process. This is called multi-method research.
2.2.8. Archival Research
Another technique used by psychologists is called archival research, in which the researcher analyzes data that has already been collected for another purpose. For instance, a researcher may request data from high schools about students’ GPAs and their SAT and/or ACT score(s) and then obtain their four-year GPAs from the universities they attended. This can be used to make a prediction about success in college and to determine which measure – GPA or standardized test score – is the better predictor.
2.2.9. Meta-Analysis
Meta-analysis is a statistical procedure that allows a researcher to combine data from more than one study. For example, Shariff et al. (2015) published an article on religious priming and prosociality in Personality and Social Psychology Review. The authors used effect-size analyses, p-curve analyses, and adjustments for publication bias (no worries, you don’t have to understand any of that) to evaluate the robustness of four types of religious priming, how religion affects prosocial behavior, and whether religious-priming effects generalize to those who are loosely religious or not religious at all. Results were presented across 93 studies and 11,653 participants and showed that religious priming has robust effects in relation to a variety of outcome measures, prosocial behavior included. It did not, however, affect non-religious people.
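The basic idea of combining results across studies can be sketched in Python. This is a simple fixed-effect, inverse-variance weighted average, chosen here as one common way to pool effect sizes. It is not the procedure used by Shariff et al. (2015), and the three effect sizes and variances are invented for illustration.

```python
import math


def fixed_effect_pool(effect_sizes, variances):
    """Pool effect sizes, weighting each study by the inverse of its variance.

    More precise studies (smaller variance) receive more weight.
    Returns the pooled effect size and its standard error.
    """
    if len(effect_sizes) != len(variances) or not effect_sizes:
        raise ValueError("need one variance per effect size, and at least one study")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effect_sizes)) / total_weight
    standard_error = math.sqrt(1.0 / total_weight)
    return pooled, standard_error


# Hypothetical effect sizes and sampling variances from three studies.
effects = [0.10, 0.30, 0.50]
variances = [0.04, 0.01, 0.04]

pooled, se = fixed_effect_pool(effects, variances)
lower, upper = pooled - 1.96 * se, pooled + 1.96 * se  # approximate 95% interval
print(f"pooled effect = {pooled:.2f}, SE = {se:.3f}")
print(f"approximate 95% interval: {lower:.2f} to {upper:.2f}")
```

The middle study has the smallest variance, so it counts four times as much as either of the others when the pooled estimate is computed.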
2.2.10. Communicating Results
In scientific research, it is common practice to communicate the findings of our investigation. By reporting what we found in our study other researchers can critique our methodology and address our limitations. Publishing allows psychology to grow its knowledge base about human behavior. We can also see where gaps still exist. We move it into the public domain so others can read and comment on it. Scientists can also replicate what we did and possibly extend our work if it is published.
There are several ways to communicate our findings. We can do so at conferences in the form of posters or oral presentations, through newsletters from APA itself or one of its many divisions or other organizations, or through research journals and specifically scientific research articles. Published journal articles represent a form of communication between scientists and in them, the researchers describe how their work relates to previous research, how it replicates and/or extends this work, and what their work might mean theoretically.
Research articles begin with an abstract, a 150-250 word summary of the entire article. Its purpose is to describe the experiment and allow the reader to decide whether he or she wants to read further. The abstract provides a statement of purpose, overview of the methods, main results, and a brief statement of the conclusion. Keywords are also given that allow students and other researchers alike to find the article when doing a search.
The abstract is followed by four major sections as described:
- Introduction – The first section is designed to provide a summary of the current literature as it relates to your topic. It helps the reader to see how you arrived at your hypothesis and the design of your study. Essentially, it gives the logic behind the decisions you made. You also state the purpose and share your predictions or hypothesis.
- Method – Since replication is a required element of science, we must have a way to share information on our design and sample with readers. This is the essence of the method section and covers three major aspects of your study – your participants, materials or apparatus, and procedure. The reader needs to know who was in your study so that limitations related to generalizability of your findings can be identified and investigated in the future. You will also state your operational definition, describe any groups you used, random sampling or assignment procedures, information about how a scale was scored, etc. Think of the Method section as a cookbook. The participants are your ingredients, the materials or apparatus are whatever tools you will need, and the procedure is the instructions for how to bake the cake.
- Results – In this section you state the outcomes of your experiment and whether they were statistically significant or not. You can also present tables and figures.
- Discussion – In this section you start by restating the main findings and hypothesis of the study. Next, you offer an interpretation of the findings and what their significance might be. Finally, you state strengths and limitations of the study which will allow you to propose future directions.
Whether you are writing a research paper for a class, preparing an article for publication, or reading a research article, the structure and function of a research article are the same. Understanding this will help you when reading social psychological articles.
2.3. Reliability and Validity
Section Learning Objectives
- Clarify why reliability and validity are important.
- Define reliability and list and describe forms it takes.
- Define validity and list and describe forms it takes.
Recall that measurement involves the assignment of scores to an individual which are used to represent aspects of the individual such as how conscientious they are or their level of depression. Whether or not the scores actually represent the individual is what is in question. Cuttler (2017) says in her book Research Methods in Psychology, “Psychologists do not simply assume that their measures work. Instead, they collect data to demonstrate that they work. If their research does not demonstrate that a measure works, they stop using it.” So how do they demonstrate that a measure works? This is where reliability and validity come in.
2.3.1. Reliability
First, reliability describes how consistent a measure is. It can be measured in terms of test-retest reliability, or how reliable the measure is across time; internal consistency, or the “consistency of people’s responses across the items on multiple-item measures” (Cuttler, 2017); and finally inter-rater reliability, or how consistent different observers are when making judgments. In terms of inter-rater reliability, Cuttler (2017) writes, “Inter-rater reliability would also have been measured in Bandura’s Bobo doll study. In this case, the observers’ ratings of how many acts of aggression a particular child committed while playing with the Bobo doll should have been highly positively correlated.”
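Internal consistency is often summarized with a statistic called Cronbach's alpha. The text does not name a specific statistic, so its use here is an assumption of this example, as are the helper name `cronbach_alpha` and the invented responses. A minimal Python sketch:

```python
import statistics


def cronbach_alpha(responses):
    """Cronbach's alpha for a multiple-item measure.

    `responses` is a list with one entry per respondent; each entry is a
    list of that respondent's ratings on the k items of the scale.
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    if any(len(r) != k for r in responses):
        raise ValueError("every respondent must answer every item")
    items = list(zip(*responses))             # regroup the ratings by item
    item_variances = [statistics.variance(item) for item in items]
    totals = [sum(r) for r in responses]      # each respondent's total score
    total_variance = statistics.variance(totals)
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings from four respondents on a three-item scale.
# Each respondent answers all items identically, so the items agree perfectly.
consistent = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
alpha = cronbach_alpha(consistent)
print(f"alpha = {alpha:.2f}")
```

Alpha approaches 1 when people who score high on one item also score high on the others, and it drops when responses to the items are unrelated.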
2.3.2. Validity
A measure is considered to be valid if its scores represent the variable it is said to measure. For instance, if a scale says it measures depression, and it does, then we can say it is valid. Validity can take many forms. First, face validity is “the extent to which a measurement method appears “on its face” to measure the construct of interest” (Cuttler, 2017). A scale purported to measure values should have questions about values such as benevolence, conformity, and self-direction, and not questions about depression or attitudes toward toilet paper.
Content validity is to what degree a measure covers the construct of interest. Cuttler (2017) says, “… consider that attitudes are usually defined as involving thoughts, feelings, and actions toward something. By this conceptual definition, a person has a positive attitude toward exercise to the extent that he or she thinks positive thoughts about exercising, feels good about exercising, and actually exercises.”
Oftentimes, we expect a person’s scores on one measure to be correlated with scores on another measure to which it should be related. The extent to which they are is called criterion validity. For instance, consider parenting style and attachment. We would expect that if a person indicates on one scale that their father was authoritarian (or dictatorial) then attachment would be low or insecure. In contrast, if the mother was authoritative (or democratic) we would expect the child to show a secure attachment style.
As researchers we expect that our results will generalize from our sample to the larger population. This was the issue with case studies as the sample is too small to make conclusions about everyone. If our results do generalize from the circumstances under which our study was conducted to similar situations, then we can say our study has external validity. External validity is also affected by how real the research is. Two types of realism are possible. First, mundane realism occurs when the research setting closely resembles the real world setting. Experimental realism is the degree to which the experimental procedures that are used feel real to the participant. It does not matter if they really mirror real life but that they only appear real to the participant. If so, his or her behavior will be more natural and less artificial.
In contrast, a study is said to have good internal validity when we can confidently say that the effect on the dependent variable (the one that is measured) was due solely to our manipulation of the independent variable. A confound occurs when a factor other than the independent variable leads to changes in the dependent variable.
To learn more about reliability and validity, please visit: https://opentext.wsu.edu/carriecuttler/chapter/reliability-and-validity-of-measurement/
2.4. Research Ethics
Section Learning Objectives
- Exemplify instances of ethical misconduct in research.
- List and describe principles of research ethics.
Throughout this module so far, we have seen that it is important for researchers to understand the methods they are using. Equally important, they must understand and appreciate ethical standards in research. The American Psychological Association identifies high standards of ethics and conduct as one of its four main guiding principles or missions. To read about the other three, please visit https://www.apa.org/about/index.aspx. So why are ethical standards needed and what do they look like?
2.4.1. Milgram’s Study on Learning…or Not
Possibly, the one social psychologist students know about the most is Stanley Milgram, if not by name, then by his study on obedience using shock (Milgram, 1974). Essentially, two individuals came to each experimental session but only one of these two individuals was a participant. The other was what is called a confederate and is part of the study without the participant knowing. The confederate was asked to pick heads or tails and then a coin was flipped. As you might expect, the confederate always won and chose to be the learner. The “experimenter,” who was also a confederate, took him into one room where he was hooked up to wires and electrodes. This was done while the “teacher,” the actual participant, watched and added to the realism of what was being done. The teacher was then taken into an adjacent room where he was seated in front of a shock generator. The teacher was told it was his task to read a series of word pairs to the learner. Upon completion of reading the list, he would ask the learner one of the two words and it was the learner’s task to state what the other word in the pair was. If the learner incorrectly paired any of the words, he would be shocked. The shock generator started at 15 volts and increased in 15-volt increments up to 450 volts. The switches were labeled with terms such as “Slight shock,” “Moderate shock,” “Danger: Severe Shock,” and the final two switches were ominously labeled “XXX.”
As the experiment progressed, the teacher would hear the learner scream, holler, plead to be released, complain about a heart condition, or say nothing at all. When the learner stopped replying, the teacher would turn to the experimenter and ask what to do, to which the experimenter indicated for him to treat nonresponses as incorrect and shock the learner. Most participants asked the experimenter whether they should continue at various points in the experiment. The experimenter issued a series of commands, including “Please continue,” “It is absolutely essential that you continue,” and “You have no other choice, you must go on.”
Any guesses as to what happened? What percent of the participants would you hypothesize actually shocked the learner to death? Milgram found that 65 percent of participants/teachers shocked the learner to the XXX switches which would have killed him. Why? They were told to do so. How do you think the participant felt when they realized that they could kill someone simply because they were told to do so?
Source: Milgram, S. (1974). Obedience to authority. New York, NY: Harper Perennial.
2.4.2. GO TO JAIL: Go Directly to Jail. Do Not Pass Go. Do Not Collect $200
Early in the morning on Sunday, August 14, 1971, Palo Alto, CA police began arresting college students for committing armed robbery and burglary. Each suspect was arrested at his home, charged, read his Miranda rights, searched, handcuffed, and placed in the back of the police car as neighbors watched. At the station, the suspect was booked, read his rights again, and identified. He was then placed in a cell. How were these individuals chosen? Of course, they did not really commit the crimes they were charged with. The suspects had answered a newspaper ad requesting volunteers for a study of the psychological effects of prison life.
After screening individuals who applied to partake in the study, a final group of 24 were selected. These individuals did not have any psychological problems, criminal record, history of drug use, or mental disorder. They were paid $15 a day for their participation. The participants were divided into two groups through a flip of a coin. One half became the prison guards and the other half the prisoners. The prison was constructed by boarding up each end of a corridor in the basement of Stanford University’s Psychology building. This space was called “The Yard” and was the only place where the prisoners were permitted to walk, exercise, and eat. Prison cells were created by removing doors from some of the labs and replacing them with specially made doors with steel bars and cell numbers. A small closet was used for solitary confinement and was called “The Hole.” There were no clocks or windows in the prison and an intercom was used to make announcements to all prisoners. The suspects who were arrested were transported to “Stanford County Jail” to be processed. It was there they were greeted by the warden and told what the seriousness of their crime was. They were strip searched and deloused, and the process was made to be intentionally degrading and humiliating. They were given uniforms with a prison ID number on them. This number became the only way they were referred to during their time. A heavy chain was placed on each prisoner’s right ankle which served the purpose of reminding them of how oppressive their environment was.
The guards were given no training and could do what they felt was necessary to maintain order and command the respect of the prisoners. They made their own set of rules and were supervised by the warden, who was played by another student at Stanford. Guards were dressed in identical uniforms, carried a whistle, held a billy club, and wore special mirrored sunglasses so no one could see their eyes or read their emotions. Three guards were assigned to each of the three eight-hour shifts and supervised the nine prisoners. At 2:30 am they would wake the prisoners to take counts. This provided an opportunity to exert control and to get a feel for their role. Similarly, prisoners had to figure out how they were to act and at first, tried to maintain their independence. As you might expect, this led to confrontations between the prisoners and the guards resulting in the guards physically punishing the prisoners with push-ups.
The first day was relatively quiet, but on the second day, a rebellion broke out in which prisoners removed their caps, ripped off their numbers, and put their beds against their cell doors creating a barricade. The guards responded by obtaining a fire extinguisher and shooting a stream of the cold carbon dioxide solution at the prisoners. The cells were then broken into, the prisoners stripped, beds removed, ringleaders put into solitary confinement, and a program of harassment and intimidation of the remaining inmates began. Since 9 guards could not be on duty at all times to maintain order, a special “privilege cell” was established and the three prisoners least involved in the rebellion were allowed to stay in it. They were given their beds and uniforms back, could brush their teeth and take a bath, and were allowed to eat special food in the presence of the other six prisoners. This broke the solidarity among the prisoners.
Less than 36 hours after the study began a prisoner began showing signs of uncontrollable crying, acute emotional disturbance, rage, and disorganized thinking. Though his emotional problems were initially seen as an attempt to gain release which resulted in his being returned to the prison and used as an informant, the symptoms worsened and he had to be released from the study. Then there was the rumor of a mass escape by the prisoners which the guards worked to foil. When it was revealed that the prisoners were never actually going to attempt the prison break, the guards became very frustrated and made the prisoners engage in menial work, pushups, jumping jacks, and anything else humiliating that they could think of.
A Catholic priest was invited to evaluate how realistic the prison was. Each prisoner was interviewed individually and most introduced himself to the priest by his prison number and not his name. He offered to help them obtain a lawyer and some accepted. One prisoner was feeling ill (#819) and did not meet with the priest right away. When he did, he broke down and began to cry. He was quickly taken to another room and all prison garments taken off. While this occurred, the guards lined up the other prisoners and broke them out into a chant of “Prisoner #819 is a bad prisoner. Because of what Prisoner #819 did, my cell is a mess. Mr. Correctional Officer.” This further upset the prisoner and he was encouraged to leave, though he refused each time. He finally did agree to leave after the researcher (i.e. Zimbardo) told him what he was undergoing was just a research study and not really prison. The next day parole hearings were held and prisoners who felt they deserved to be paroled were interviewed one at a time. Most, when asked if they would give up the money they were making for their participation so they could leave, said yes.
In all, the study lasted just six days. Zimbardo noted that three types of guards emerged: those who were tough but fair and followed the prison rules; “good guys” who never punished the prisoners and did them little favors; and finally those who were hostile, inventive in their employment of punishment, and who truly enjoyed the power they had. As for the prisoners, they coped with the events in the prison in different ways. Some fought back, others broke down emotionally, one developed a rash over his entire body, and some tried to be good prisoners and do all that the guards asked of them. No matter what strategy they used early on, by the end of the study they had disintegrated, both as a group and as individuals. The guards commanded blind obedience from all of the prisoners.
When asked later why he ended the study, Zimbardo cited two reasons. First, it became apparent that the guards were escalating their abuse of the prisoners in the middle of the night when they thought no one was watching. Second, Christina Maslach, a recent Stanford Ph.D. was asked to conduct interviews with the guards and prisoners and saw the prisoners being marched to the toilet with bags on their heads and legs chained together. She was outraged and questioned the study’s morality.
Source: http://www.prisonexp.org/
If you would like to learn more about the moral foundations of ethical research, please visit: https://opentext.wsu.edu/carriecuttler/chapter/moral-foundations-of-ethical-research/
2.4.3. Ethical Guidelines
Due to these studies, and others, the American Psychological Association (APA) established guiding principles for conducting psychological research. The principles can be broken down in terms of when they should occur during the process of a person participating in the study.
2.4.3.1. Before participating. First, researchers must obtain informed consent, which is when the person agrees to participate because they have been told what will happen to them. They are given information about any risks they face, or potential harm that could come to them, whether physical or psychological. They are also told about confidentiality, or the person’s right not to be identified. Since most research is conducted with students taking introductory psychology courses, they have to be given the option of doing something other than a research study to earn any required credits for the class. This is called an alternative activity and could take the form of reading and summarizing a research article. The amount of time taken to do this should not exceed the amount of time the student would be expected to participate in a study.
2.4.3.2. While participating. Participants are afforded the right to withdraw, or the person’s right to exit the study if any discomfort is experienced.
2.4.3.3. After participating. Once their participation is over, participants should be debriefed, which is when the true purpose of the study is revealed and they are told where to go if they need assistance and how to reach the researcher if they have questions. So can researchers deceive participants, or intentionally withhold the true purpose of the study from them? According to the APA, a minimal amount of deception is allowed.
Human research must be approved by an Institutional Review Board or IRB. It is the IRB that will determine whether the researcher is providing enough information for the participant to give consent that is truly informed, if debriefing is adequate, and if any deception is allowed or not.
If you would like to learn more about how to use ethics in your research, please read: https://opentext.wsu.edu/carriecuttler/chapter/putting-ethics-into-practice/
2.5. Issues in Social Psychology
Section Learning Objectives
- Describe the replication crisis in psychology.
- Describe the issue with generalizability faced by social psychologists.
2.5.1. The Replication Crisis in Social Psychology
Today, the field of psychology faces what is called a replication crisis. Simply put, many published findings in psychology fail to replicate, even though replicability is one of the hallmarks of science. Swiatkowski and Dompnier (2017) addressed this issue but with a focus on social psychology. They note that the field faces a confidence crisis due to events such as Diederik Stapel intentionally fabricating data over a dozen years, which led to the retraction of over 50 published papers. They cite a study by John et al. (2012) in which 56% of 2,155 respondents admitted to collecting more data after discovering that the initial statistical test was not significant and 46% selectively reported studies that “worked” in a paper to be published. They also note that Nuijten et al. (2015) collected a sample of over 30,000 articles from the top 8 psychology journals and found that 1 in 8 possibly had an inconsistent p value that could have affected the conclusion the researchers drew.
So, how extensive is the issue? The Reproducibility Project: Psychology was started to determine to what degree psychological effects from the literature could be replicated. Independent research teams attempted to replicate one hundred published studies drawn from different subfields in psychology. Only 39% of the findings were considered to be successfully replicated. For social psychology the results were worse. Only 25% were replicated.
Why might a study not replicate? Swiatkowski and Dompnier (2017) cite a few reasons. First, they believe that statistical power, or the probability of correctly rejecting the null hypothesis (H0, the hypothesis stating that there is no effect, meaning your research hypothesis is not correct) when it is actually false, is an issue in social psychology. Many studies are underpowered given the small effect sizes typically observed in the field, which inflates the rate of false-positive findings in the literature and leads to unreplicable findings.
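To see why small samples paired with modest effects are a problem, it helps to put rough numbers on power, understood here as the probability of rejecting H0 when it is actually false. The sketch below is a minimal approximation and not the authors' own procedure: it assumes two equal-sized groups, a two-sided test, and a normal approximation in place of the exact t distribution. The function name approx_power and the example effect and sample sizes are our own illustrative choices.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect_size_d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison.

    effect_size_d is the standardized mean difference (Cohen's d).
    Uses a normal approximation, so values are slightly optimistic
    for very small samples.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)             # cutoff for significance
    shift = effect_size_d * sqrt(n_per_group / 2)  # expected test statistic
    upper_tail = 1 - z.cdf(z_crit - shift)
    lower_tail = z.cdf(-z_crit - shift)
    return upper_tail + lower_tail


# Hypothetical scenarios: the same medium effect with different sample sizes.
for n in (20, 64, 100):
    print(f"d = 0.5, n per group = {n}: power is about {approx_power(0.5, n):.2f}")
```

Holding the effect size constant, power climbs as the sample grows. When the true effect is zero, the function returns alpha itself, which is the false-positive rate the researcher agreed to accept.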
Second, they say that some researchers use “unjustifiable flexibility in data analysis, such as working with several undisclosed dependent variables, collecting more observations after initial hypothesis testing, stopping data collection earlier than planned because of a statistically significant predicted finding, controlling for gender effects a posteriori, dropping experimental conditions, and so on” (pg. 114). Some also do undisclosed multiple testing without making adjustments, called p-hacking, or drop observations to achieve a significance level, called cherry picking. Such practices could explain the high prevalence of false positives in social psychological research.
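One of the practices quoted above, checking the data repeatedly and stopping as soon as the result is significant, can be illustrated with a short simulation. This is a sketch under simplifying assumptions that we chose for illustration: there is no true effect, scores come from a normal distribution with a known standard deviation of 1, and a hypothetical researcher looks at the data after 20, 30, 40, and 50 participants. It compares that researcher with one who tests only once, at 50.

```python
import random
from math import sqrt
from statistics import NormalDist

Z_CRIT = NormalDist().inv_cdf(0.975)  # two-sided cutoff for alpha = 0.05


def is_significant(sample):
    """Two-sided z-test of a mean of 0, assuming a known SD of 1."""
    z = sum(sample) / sqrt(len(sample))
    return abs(z) > Z_CRIT


def simulate(n_studies=2000, looks=(20, 30, 40, 50), seed=1):
    """Return false-positive rates for testing once versus peeking."""
    rng = random.Random(seed)
    fixed_hits = 0
    peek_hits = 0
    for _ in range(n_studies):
        # H0 is true by construction: the population mean is 0.
        data = [rng.gauss(0, 1) for _ in range(looks[-1])]
        if is_significant(data):
            fixed_hits += 1  # tested once, at the planned sample size
        if any(is_significant(data[:n]) for n in looks):
            peek_hits += 1   # stopped at the first significant look
    return fixed_hits / n_studies, peek_hits / n_studies


fixed_rate, peek_rate = simulate()
print(f"false-positive rate, one planned test: {fixed_rate:.3f}")
print(f"false-positive rate, with peeking: {peek_rate:.3f}")
```

Because every effect "found" here is a false positive by construction, the gap between the two rates shows how undisclosed flexibility alone can push the error rate above the nominal 5%.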
Third, some current publication standards may promote bad research practices in a few ways. Statistical significance has been set at p < 0.05 as the sine qua non condition for publication. According to Swiatkowski and Dompnier (2017) this leads to dichotomous thinking in terms of the “strict existence and non-existence of an effect” (pg. 115). Also, positive, statistically significant results are more likely to be published than negative, statistically non-significant results, which can be hard to interpret. This bias leads to a structural incentive to seek out positive results. Finally, the authors point out that current editorial standards show a preference for novelty or accepting studies which report new and original psychological effects. This reduces the importance of replications which lack prestige and inspire little interest among researchers. It should also be pointed out that there is a mentality of ‘Publish or perish’ at universities for full time faculty. Those who are prolific and publish often are rewarded with promotions, pay raises, tenure, or prestigious professorships. Also, studies that present highly novel and cool findings are showcased by the media.
The authors state, “In the long run, the lack of a viable falsification procedure seriously undermines the quality of scientific knowledge psychology produces. Without a way to build a cumulative net of well-tested theories and to abandon those that are false, social psychology risks ending up with a confused mixture of both instead” (pg. 117).
For more on this issue, check out the following articles:
- 2016 Article in The Atlantic – https://www.theatlantic.com/science/archive/2016/03/psychologys-replication-crisis-cant-be-wished-away/472272/
- 2018 Article in The Atlantic – https://www.theatlantic.com/science/archive/2018/11/psychologys-replication-crisis-real/576223/
- 2018 Article in the Washington Post – https://www.washingtonpost.com/news/speaking-of-science/wp/2018/08/27/researchers-replicate-just-13-of-21-social-science-experiments-published-in-top-journals/?noredirect=on&utm_term=.2a05aff2d7de
- 2018 Article from Science News – https://www.sciencenews.org/blog/science-public/replication-crisis-psychology-science-studies-statistics
2.5.2. Generalizability
Earlier we discussed how researchers want to generalize their findings from the sample to the population, or from a small, representative group to everyone. The problem that plagues social psychology is who makes up our samples. Many social psychological studies are conducted with college students working for course credit (Sears, 1986). They represent what is called a convenience sample. Can we generalize from college students to the larger group?
Module Recap
In Module 1 we stated that psychology studied behavior and mental processes using the strict standards of science. In Module 2 we showed you how that is done via adoption of the scientific method and use of the research designs of observation, case study, surveys, correlation, and experiments. To make sure our measurement of a variable is sound, we need to have measures that are reliable and valid. And to give our research legitimacy we have to use clear ethical standards for research to include gaining informed consent from participants, telling them of the risks, giving them the right to withdraw, debriefing them, and using nothing more than minimal deception. Despite all this, psychology faces a crisis in which many studies are not replicating and findings from some social psychological research are not generalizable to the population.
This concludes Part I of the book. In Part II we will discuss how we think about ourselves and others. First, we will tackle the self and then move to the perception of others. Part II will conclude with a discussion of attitudes.
2nd edition