PsychDisclosure.org provides a platform for authors of recently published articles in Psychology to publicly disclose four categories of methodological details that are not required to be disclosed under current reporting standards, but which are essential for interpreting research findings. Click here for our PPS article reporting on our initiative.

We're glad to report that the success of PsychDisclosure.org has influenced the journal Psychological Science to raise its reporting standards! Since January 1, 2014, all authors who submit a manuscript to Psychological Science must explicitly confirm that they have disclosed all information covered by the four categories (see here for more details on Psychological Science's new submission guidelines). Also, we're excited to announce that disclosure information available at PsychDisclosure.org will soon be integrated at CurateScience.org!

Disclosure categories:
  1. Exclusions: Disclosed total number of observations excluded and the criteria for doing so.
  2. Conditions: Disclosed all tested experimental conditions, including failed manipulations.
  3. Measures: Disclosed all administered measures and items.
  4. Sample size: Disclosed (a) basis for chosen sample sizes and (b) basis for stopping data collection.
Submit your disclosure information for any article in any journal! Please see the Contact Us page for more details.
Current response rate overall = 49% (308/630); PS = 44%, JPSP = 54%, JEPLMC = 50%, JEPG = 55%. [Website up-to-date as of May 18, 2014]
Issue | Authors | Article Title
Dec 2013 | de Nooijer, van Gog, Paas et al. | When Left Is Not Right: Handedness Effects on Learning Object-Manipulation Words Using Pictures With Left- or Right-Handed First-Person Perspectives
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Full Disclosure
Nov 2013 | Chen, Minson, Schöne et al. | In the Eye of the Beholder: Eye Contact Increases Resistance to Persuasion
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Full Disclosure
Goetz, Shattuck, Miller et al. | Social Status Moderates the Relationship Between Facial Structure and Aggression
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Study #1 was part of a larger protocol investigating associations between hormones (testosterone/cortisol), personality, and aggressive behaviour. The additional measures were not related to the research questions being addressed in this manuscript.
  4. Sample Size: We decided ahead of time to collect data from 125 men and 125 women, stopping data collection either at the end of the school year or once our targeted sample size was obtained. Although we did not perform formal power analyses, this was a relatively large sample size for human social neuroendocrinology and would enable us to test hormone/behaviour associations (some of these findings are reported in Carré et al., 2013, Psychoneuroendocrinology). For Study #2, we obtained all data available for National Hockey League players from the 2010-2011 season.
Little, Feinberg, DeBruine et al. | Adaptation to Faces and Voices: Unimodal, Cross-Modal, and Sex-Specific Effects
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: We did collect ethnicity and nationality information from participants but this was not reported or analysed. These variables were not related to the research question and the undergraduate Scottish university sample meant the sample was homogeneous (majority reported White, UK).
  4. Sample Size: There was no formal stopping rule although previous and unpublished studies suggested that 20 participants per condition in each experiment is sufficient to see effects in this type of experiment and this was used as a guide for minimum sample size for each experiment. Data from experiments 1-3 were collected across 2 semesters because minimum sample size was not achieved in 1 semester. Participants were randomly allocated to experiment 1-3. Experiment 4 was collected after data collection was complete for Experiments 1-3 and was collected over 1 semester.
Rice, Phillips, Natu et al. | Unaware Person Recognition From the Body When Face Identification Fails
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We have done many similar experiments and tested comparable (and relatively small) numbers of subjects. This avoids overpowering the statistics with a high n. All of our reported effects were large (and would have been significant with fewer subjects).
Rudman, McLean, Bunzl | When Truth Is Personally Inconvenient, Attitudes Change: The Impact of Extreme Weather on Implicit Support for Green Politicians and Explicit Climate-Change Beliefs
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: At Time 1, we used a behavioral measure in the lab that was not possible to administer to the online sample at Time 2. For Time 2, we reported all measures used.
  4. Sample Size: At Time 1, the sample size was determined by the number of participants who completed the study in the lab during the allotted time period (before classes ended). At Time 2, sample size was determined by matching as closely as possible the number of participants who completed the study online at Time 1 (to be able to compare the two groups).
Oct 2013 | Bates, Lewis, Weiss | Childhood Socioeconomic Status Amplifies Genetic Effects on Adult Intelligence
  1. Exclusions: There were no excluded observations.
  2. Conditions: Full Disclosure
  3. Measures: This is a large study with several thousand measures; a reference to the study is provided. No unreported measures were tested for this paper.
  4. Sample Size: All available subjects were used (archival data).
Farrelly, Slater, Elliott et al. | Competitors who choose to be red have higher testosterone levels
  1. Exclusions: Data from 4 participants were removed for having abnormally high or low testosterone levels (more than 2 SDs). This was not reported because of the brevity required for this short report.
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: The original sample size was approximately 90, based on the available salivary testosterone sampling kits. Data collection was stopped early because further recruitment of male participants at the university was proving extremely difficult.
LeBel & Campbell | Heightened sensitivity to temperature cues in highly anxiously attached individuals: Real or elusive phenomenon?
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Full Disclosure (details provided in the preregistered study protocols, accessible via links in the article)
Sep 2013 | Vishwanath & Hibbard | Seeing in 3-D With Just One Eye: Stereopsis Without Binocular Vision
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We chose a sample size based on comparable studies published in the field.
Beall & Tracy | Women Are More Likely to Wear Red or Pink at Peak Fertility
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: In addition to several measures not related to the research question, we also asked participants to respond to the question, “What percentage of clothing you are currently wearing is red?” using an 11-point Likert scale ranging from “0%- I am not wearing any red clothing” to “100%- Everything I am wearing is entirely red.” After beginning the research, however, we realized that this item was problematic for several reasons, most notably: (a) variance on this item was severely restricted; the mean response was 1.69 (on an 11-point scale), suggesting a major floor effect, apparently due to the fact that only 12% of participants reported wearing more than 10% red; (b) it was worded to include unobservable clothing such as socks and underwear, which, we realized, are irrelevant to our specific hypothesis that women wear red or pink during peak fertility as a way of increasing their apparent sexual attractiveness; and (c) it covered only red, not pink, clothing, and our hypothesis applies equally to both colors.
  4. Sample Size: We collected data from two samples of women, somewhat simultaneously. One sample consisted of women who were undergraduate students at the University of British Columbia participating in the Psychology Department Subject Pool (in the paper, this sample was labeled Sample B). We began collecting data from this sample in March, 2012, and aimed to recruit women to participate through the end of the school term--April, 2012. We successfully recruited 62 women during this time (we concluded data collection when the subject pool closed at the end of the term), but could include only 24 of these individuals, due to the restrictions commonly imposed to maximize the validity of self-reported ovulation measures (all restrictions were reported in our paper). We also simultaneously (beginning in February, 2012) recruited a larger sample of women on M-Turk (labeled Sample A). Our goal for the M-Turk study was to recruit participants until we obtained a useable, regularly ovulating sample (i.e., women who met all of our inclusion criteria) of N = 100; this target number was chosen to allow us to detect a medium to large sized effect. Of note, in addition to details provided in the Online Supplement that appeared with our paper, more methodological information can be found in this blog post, which was linked to a post that appeared on Slate.com: http://ubc-emotionlab.ca/2013/07/too-good-does-not-always-mean-not-true.
Le & Impett | When Holding Back Helps: Suppressing Negative Emotions During Sacrifice Feels Authentic and Is Beneficial for Highly Interdependent People
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Analyses were conducted on a subset of a larger dataset on sacrifice in daily life. We reported the measures we analyzed for the specific research questions of the current paper; however, the larger dataset also included measures that were not central to our research question, including other personality measures (i.e., Big Five, communal orientation, appreciation) and daily measures (i.e., motivations for sacrifice).
  4. Sample Size: Although we did not conduct a formal power analysis, we based our desired sample size on previous diary studies we've conducted and aimed for approximately 100 participants. At the daily level, 14 observations were decided on because it was deemed to be a good balance between maximizing the number of observations collected in daily life while minimizing participant attrition due to length of the study. With the goal of 100 participants, we began collecting data from a college student sample during the academic year and terminated data collection when the academic year ended (i.e., participant pool closed) leaving a final N = 73.
Scott | Corollary Discharge Provides the Sensory Content of Inner Speech
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: I conducted a pilot test to get a sense of the size of the effect, and thus the sample size necessary to show a significant result. My rule was therefore to stop when the predetermined sample size was reached.
Silver, Holman, Andersen et al. | Mental- and Physical-Health Effects of Acute Exposure to Media Images of the September 11, 2001, Attacks and the Iraq War
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Some measures were not related to the research question.
  4. Sample Size: Full Disclosure
Suri | Patient Inertia and the Status Quo Bias: When an Inferior Option Is Preferred
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We identified a desired sample size prior to launching the studies and measured effects. 
Vohs, Redden & Rahinel | Physical Order Produces Healthy Choices, Generosity, and Conventionality, Whereas Disorder Produces Creativity
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure, except we had another study that showed the predicted effect (and hence a replication) but it was not in the manuscript due to word count constraints. 
  3. Measures: Experiment 1 included a self-created measure of preferences for how to use one’s time and a hypothetical lottery task, the latter being exploratory. The former was intended to be a scale measuring predicted factors, but the scale items did not hang together. For Experiments 2 and 3, all measures were reported.
  4. Sample Size: Experiments 2-3: We used a rule-of-thumb stopping rule that 24+ participants per cell was sufficient.  
Warburton, Wilson, Lynch et al. | The Cognitive Benefits of Movement Reduction: Evidence From Dance Marking
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Full Disclosure
Aug 2013 | Milne, Chapman, Gallivan et al. | Connecting the dots: Object connectedness deceives perception but not movement planning.
  1. Exclusions: In the Methods section we mention the number of participants excluded and refer to the online supplemental material for a description of the removal criteria.
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: The sample size was determined based on our previous research, which has shown that the sample size we used provides sufficient power to establish effects in our reaching task. In fact, we collected more data than we did in our previous work because participants in the current study could be excluded from the entire analysis due to below threshold performance on just one of the tasks. The sample size that we chose was also sufficient for our perceptual task, having been based on earlier work by another lab whose work we were replicating.  The number of participants included in their experiments ranged from 15-26.
Attwood, Penton-Voak, Burton et al. | Acute anxiety impairs accuracy in identifying photographed faces.
  1. Exclusions: Two participants who were recruited did not complete the face recognition task (although they completed the other task conducted during the inhalation period - see below), in both cases because the pre-specified duration of inhalation ended before the task was completed. Complete data were therefore not available on these two participants for this task. We reported results for the 28 participants on whom complete data were available.
  2. Conditions: Full Disclosure
  3. Measures: Given the expense of the CO2 inhalation we typically include two unrelated tasks in the inhalation period, to answer unrelated questions. Here the additional task was a measure of speech perception. The order of tasks was counter-balanced between participants. These data have been reported elsewhere: Mattys, S. L., Seymour, F., Attwood, A., & Munafo, M. R. (2013). Effects of acute anxiety on speech perception: Are anxious listeners distracted listeners? Psychological Science, 24(8), 1606-1608.
  4. Sample Size: The sample size (n = 30) was pre-specified and we continued data collection until we achieved this (although, as described above, only 28 completed the face recognition task). Our sample size was adequate to provide > 99% power to detect an effect of CO2 inhalation on subjective anxiety equivalent to ~20 points on the Spielberger State-Trait Anxiety Inventory state sub-scale. We had no strong prediction regarding the magnitude of the effect on face recognition, and this aspect was therefore exploratory.
Conner, Morrison, Fishman et al. | A longitudinal cluster-randomized controlled study on the accumulating effects of individualized literacy instruction on students’ reading from first through third grade.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: In addition to the reading measures, we also administered language and executive functioning measures. These measures were collected to address other research questions and not the ones reported in our manuscript. An Excel sheet with the entire battery and the number of students receiving each assessment is available upon request.
  4. Sample Size: Full Disclosure
Finkel, Slotter, Luchies et al. | A brief intervention to promote conflict reappraisal preserves marital quality over time.
  1. Exclusions: Full Disclosure - The only exception was a 121st couple whom we eliminated from the dataset before conducting any data analysis. Our reason for doing so is that we learned after the fact that the wife had died at some point during the study and the husband had been completing her questionnaires in addition to his own. We weren’t able to discern which data from that couple were valid, and so we vacated all data from that one couple, treating them as if they were never involved in the study.
  2. Conditions: Full Disclosure - There were only two conditions, and we reported them both.
  3. Measures: The study is one of those major longitudinal studies with many measures. But we were systematic in our reports of the relevant dependent measures. Specifically, we assessed and reported Fletcher et al.’s (2000) six components of perceived relationship quality. That is, we fully reported the effects of our manipulation on relationship quality; the other measures were included in this study for reasons unrelated to the published report.
  4. Sample Size: I’d reported a power analysis in my initial grant proposal, but our procedures changed markedly from that point until the time we conducted the study. I’d proposed to run 100 couples, but then Erica Slotter and I developed an efficient data collection procedure that allowed us to collect ~25 couples on a given Saturday. I had enough funding to run another Saturday’s worth of sessions, so we decided to run five Saturdays’ worth of sessions instead of four (shooting for 125 instead of 100), although no-shows and the one flawed couple I mentioned previously led to a final sample of 120.
Tannenbaum, Valasek, Knowles et al. | Incentivizing wellness in the workplace: Sticks (not carrots) send stigmatizing signals.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: In Study 1a, we included the 14-item Fat Phobia Scale (Bacon, Scheltema, & Robinson, 2001) on the back-end of the study, after subjects responded to all dependent measures. The scale, which measured explicit attitudes toward the overweight, was included for exploratory purposes; neither the scale nor its results were reported, due to space limitations. At the end of the study we asked some basic demographic information (age, gender, and political orientation), as well as participant height and weight. These were not discussed in detail due to space limitations. On the back-end of Study 1b we included two self-report items asking subjects how satisfied they were with their present body shape and how satisfied they were with their present body weight (1 = very dissatisfied, 7 = very satisfied). We did not report these items because of space limitations, and because they were highly redundant with our primary moderator of interest, subjects' Body Mass Index (BMI was highly correlated with these items, and using the subjective satisfaction items instead of BMI yields qualitatively similar results). In Study 2 we mention that participants answered several questionnaires but did not specify these items, again due to space constraints. Subjects completed a 10-item Social Desirability scale (M-C Form 1; Strahan & Gerbasi, 1972), as well as a 20-item Positive and Negative Affect Scale (Watson, Clark, & Tellegen, 1988). The Social Desirability scale is discussed briefly in the Supplementary Materials. All data, including the measures not reported in the final paper, are publicly available at the following website: http://thedata.harvard.edu/dvn/dv/davetannenbaum/faces/study/StudyPage.xhtml?globalId=hdl:1902.1/19735&studyListingIndex=1_e2432d7668975e43ac87bc2f77fd
  4. Sample Size: For Studies 1a and 1b, sample size was based on rough intuition of adequate statistical power combined with practical considerations (e.g., "about 150 subjects or until data collection begins to slow down"). For Study 2 we collected as many subjects as we could until the end of the academic quarter. For all studies we terminated data collection before analyzing the results.
Jul 2013 | Pascucci, Turatto et al. | Immediate effect of internal reward on visual adaptation.
  1. Exclusions: Data from 1 participant in Experiment 1 and from 2 participants in Experiment 2 were excluded due to performances at chance level on the orientation discrimination task.
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: The sample size was determined by the statistical analyses and on the basis of the average sample size in other psychophysics experiments in the literature.
Rabagliati, Snedeker et al. | The truth about chickens and bats: Ambiguity avoidance distinguishes types of polysemy.
  1. Exclusions: Full Disclosure
  2. Conditions: We ran two pilot studies; both showed the same result but the design was less elegant.
  3. Measures: Full Disclosure
  4. Sample Size: (a) A guesstimate based on our pilot study; (b) when we reached our guesstimate.
Gibson, Piantadosi et al. | A noisy-channel account of crosslinguistic word-order variation.
  1. Exclusions: 2 participants in experiment 1 (English) were excluded for not following the instructions (only gestured the verbs); 53 trials (out of 592 trials) in experiment 1 (English) were excluded because either there was no patient gestured or the verb (action) was gestured on both sides of the patient; 12 trials (out of 263 trials) in experiment 1 (Japanese) were excluded because either there was no patient gestured or the verb (action) was gestured on both sides of the patient; 33 trials (out of 311 trials) in experiment 1 (Korean) were excluded because either there was no patient gestured or the verb (action) was gestured on both sides of the patient; 43 trials (out of 310 trials) in experiment 2 (Japanese) were excluded because either there was no patient gestured or the verb (action) was gestured on both sides of the patient; 42 trials (out of 311 trials) in experiment 2 (Korean) were excluded because either there was no patient gestured or the verb (action) was gestured on both sides of the patient; 1 participant in experiment 3 (English) was excluded for knowing ASL; 1 participant in experiment 3 (English) was excluded for not following the instructions; 11 trials (out of 341 trials) in experiment 3 (English) were excluded because either there was no patient gestured or the verb (action) was gestured on both sides of the patient.
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We determined our sample size by estimating based on previous work using the same method: Goldin-Meadow et al. (2009) and Langus & Nespor (2010).
Lee, Chabris et al. | General cognitive ability and the psychological refractory period: Individual differences in the mind's bottleneck.
  1. Exclusions: One additional participant was excluded from the main study. This subject was initially turned away for failing to provide documentation of SAT scores. The subject begged to be allowed to participate because he needed the credits for satisfactory completion of his psychology course; he promised to provide documentation in the coming days. When the subject failed to follow through on this promise, the first author discarded his data without inspecting them.
  2. Conditions: Full Disclosure
  3. Measures: We administered a brief questionnaire at the end asking whether the subject had participated in any of our previous studies, whether the subject had taken the IQ tests previously, and how much sleep the subject had gotten the previous night. We might have asked some other quick questions, but I cannot recall. These data have been lost. We did not include this information because (1) it contained nothing of interest (e.g., the two g groups did not differ significantly in hours slept) and (2) the strict space limitations.
  4. Sample Size: We initially decided to test 300 subjects for the main study. There was no compelling reason to settle on this sample size; a previous study that was somewhat similar (Schmiedek et al., 2007) had used a sample size of roughly 150, and we thought it would be good to improve upon that study by using a sample size twice as large. We learned that this goal would not be practically achievable, given the pace of subject recruitment. At this point we adjusted our goal and aspired to test at least 100 subjects. Again, no explicit rationale justified this choice; it just struck us as a "nice" number. However, because the first author tested nearly every subject himself but lived over four hours away from the lab located at Harvard University, he was finding the attainment of this goal extremely difficult. At a certain point he moved to a new residence that was over seven hours away from the lab; 70 subjects had been tested by then. The two authors had a discussion regarding whether to continue the study; they decided to write up the paper and submit it. After writing the first draft of the paper, the authors decided that the secondary study was necessary. This was carried out at Union College, a four-hour drive from the residence of the first author, who again tested all subjects. However, given the small sample of the secondary study, this long round trip was bearable. We decided to test eight subjects because in our experience this is enough to obtain significant within-subject effects equaling ~30 ms (in fact even one or two subjects may suffice). Because unpredictable numbers of subjects signed up for each session, we ended up with nine subjects.
Jun 2013 | Wilson-Mendenhall, Barrett et al. | Neural evidence that human emotions share core affective properties.
  1. Exclusions: We excluded 2 additional participants due to excessive head motion in the scanner (movement greater than 3mm during a functional scan), 1 additional participant who did not follow task instructions, and 3 additional participants who completed the first (behavioral) session, but who did not complete the MRI scan session.
  2. Conditions: Full Disclosure
  3. Measures: We included several short self-report measures for exploratory purposes that were unrelated to testing our main hypotheses (e.g., to collect pilot data for grant applications proposing studies designed to assess individual differences using larger samples) because acquiring neuroimaging data is expensive. The self-report questionnaires were administered at the end of the first training session, and included the Emotional Intensity Scale (Bachorowski & Braaten, 1994), the NEO-FFI (Costa & McCrae, 1992), the Trait Anxiety Inventory (Spielberger, Gorsuch, Lushene, Vagg, & Jacobs, 1983), and the Tellegen Absorption scale (Tellegen & Atkinson, 1974).
  4. Sample Size: Because this fMRI study tested a novel manipulation, we could not calculate statistical power using effect sizes from prior studies. We set a minimum sample size for the study (N =16) based on normative sample sizes used in the literature (typically 10-20 participants; Murphy & Garavan, 2004), which tend to be smaller due to data acquisition costs. We decided prior to running the study to collect data until this sample size was achieved, which we followed. We preprocessed the imaging data as we collected it so we could replace participants with artifacts in their data (e.g., excessive head movement) to achieve the predetermined sample size.
Anderson, Vogel et al. | A common discrete resource for visual working memory and visual search.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Our data collection stopping rule was established as follows: after checking the data for the presence of the predicted pattern of results, we doubled the number of subjects to ensure the observed effect was robust.
Baker, Shelton et al. | Low skin conductance activity in infancy predicts aggression in toddlers 2 years later.
  1. Exclusions: In year 1 we tested 100 infants and had follow-up data for 70; dropout was caused by participants refusing to take part again 2 years later, mothers being too busy, or families having moved outside the area.
  2. Conditions: Full Disclosure (N/A: No experimental conditions)
  3. Measures: This is a prospective longitudinal study involving measurement of different emotions, temperaments, and psychophysiological parameters; the scope of this paper, being a short report, made it impossible to include more data/results.
  4. Sample Size: Sample size was determined by power analysis, but we did not achieve it by the end of the term; the sample size of 100 was determined by grant funding.
Hoffman, von Helversen et al. | Deliberation's blindsight: How cognitive load can improve judgments.
  1. Exclusions: Full Disclosure
  2. Conditions: We first used a design with 5 cues for Study 2. Unfortunately, our participants failed to learn the judgment task in all conditions (even without load).
  3. Measures: Full Disclosure
  4. Sample Size: We decided in advance to collect 30 data points per condition.
Norman, Heywood et al. | Object-based attention without awareness.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We set our sample size based on the results from preliminary experiments in the lab, and this was followed.
Piazza, Pica et al. | Education enhances the acuity of the nonverbal approximate number system.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Since we were testing a remote population in the Amazon, our final sample size consisted of all subjects we managed to test during our stay in the indigenous territory. We initially hoped to be able to have at least 30 participants. That was not decided on the basis of power analysis, but as a "rule of thumb".
Yee, Chrysikou et al. | Manual experience shapes object representations.
  1. Exclusions: One participant from Experiment 2 was excluded for not following instructions, and one participant from Experiment 2 was excluded because of computer malfunction.
  2. Conditions: Full Disclosure
  3. Measures: We did not report one measure because of space limitations and because it was not central to our research question.
  4. Sample Size: We decided ahead of time to collect a minimum of 30 participants per condition (based on prior literature). After reaching this number, we continued collecting data until all subjects who had already been scheduled were tested.
May 2013 | Graham, Fisher et al. | What sleeping babies hear: A functional MRI study of interparental conflict and infants' emotion processing.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: We did not include a description of measures that were not related to our research question. 
  4. Sample Size: Full Disclosure
Vachon, Lynam et al. | Basic traits predict the prevalence of personality disorder across the life span: The example of psychopathy.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Not all measures from the larger study were relevant to this particular question.
  4. Sample Size: We decided in advance how many participants to collect. The study was very well powered.
Cook, Brewer et al. | Alexithymia, not autism, predicts poor recognition of emotional facial expressions.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Sample size was determined a priori based on anticipated access to autistic participants and our desire to counterbalance condition order.
Cowie, Makin et al. | Children's responses to the rubber-hand illusion reveal dissociable pathways in body representation.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We decided ahead of time to collect 15 participants per group, which we did. We didn't conduct a power analysis.
Otto, Gershman et al. | The curse of planning: Dissecting multiple reinforcement-learning systems by taxing the central executive.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Working memory capacity and trait impulsivity were measured but not used in analyses.
  4. Sample Size: We knew roughly how many participants this type of study would have, and we aimed for as many as we could get within a practical period of time, until the needed number was surpassed.
Zhao, Al-Aidroos et al. | Attention is spontaneously biased toward regularities.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Full Disclosure
Apr 2013 | Wacker, Mueller et al. | Dopamine-D2-receptor blockade reverses the association between trait approach motivation and frontal asymmetry in an approach-motivation context.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: We administered various additional measures not related to the research question, data from which was (or hopefully will be) reported elsewhere.
  4. Sample Size: Sample size was determined based on a power analysis included in the grant proposal (Deutsche Forschungsgemeinschaft grant #  WA2593/2-1) and this was followed.
Hamlin, Mahajan et al. | Not like me = Bad: Infants prefer those who harm dissimilar others.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We always plan to collect 16 subjects per condition (as is typical of previous research in our and other laboratories). In one instance in this particular study, as 100% of infants performed exactly the same, we only included 8 participants as the effect size was so large. Follow-up studies using 16 subjects each confirmed the reliability of that effect. In all other conditions we tested the planned 16 subjects per condition. 
Joshi, Fast et al. | Power and reduced temporal discounting.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: I had included a loss discounting task in study 3. However, due to space limitations and the general focus of the manuscript on gain discounting, this measure was not included in the manuscript.
  4. Sample Size: We requested data from at least 30 participants per condition. Participants who did not complete the manipulation or left measures incomplete were not included in our sample size.
Kelley, Hortensius et al. | When anger leads to rumination: Induction of relative right frontal cortical activity with transcranial direct current stimulation increases anger-related rumination.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: No, this information was not reported in the manuscript. An a priori power analysis was conducted with G*Power, power analysis software (Faul, Erdfelder, Lang, & Buchner, 2007). Using an expected effect size from previous research (Hortensius, Schutter, & Harmon-Jones, 2010; f = 0.35) and acceptable levels of power (1 – β = .80) and Type I error (α = .05), a sample of at least 84 subjects was required. As indicated in the manuscript we ran more than that (n = 115) to account for potential loss of participants due to technical failure or suspicion about the deception procedures. This left us with an adequately powered sample size (n = 90).
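For readers who want to reproduce this kind of a priori calculation without G*Power, below is a minimal sketch in Python using statsmodels. The number of groups (k_groups = 3) is our assumption, since the disclosure does not restate the design, so the resulting N should land near, but not necessarily exactly at, the 84 subjects reported.

    # Minimal sketch of an a priori ANOVA power analysis like the one described
    # above, using statsmodels rather than G*Power. k_groups = 3 is an assumption;
    # the exact required N depends on the design actually analyzed.
    from statsmodels.stats.power import FTestAnovaPower

    effect_size = 0.35   # Cohen's f, as reported in the disclosure above
    alpha = 0.05         # Type I error rate
    power = 0.80         # desired power (1 - beta)

    n_total = FTestAnovaPower().solve_power(
        effect_size=effect_size, alpha=alpha, power=power, k_groups=3
    )
    print(f"Required total sample size: {n_total:.1f}")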
Kim, Yi et al. | Out of mind, out of sight: Perceptual consequences of memory suppression.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We estimated the target sample size based on other independent work from our lab using the same manipulation (i.e., think/no-think training). Data collection was stopped once we had run roughly the same number of participants as our target sample size. It was not easy to run the exact number, because participants volunteered for course credit in the middle of the semester.
Mantyla | Gender differences in multitasking reflect spatial ability.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: The sample sizes were rather arbitrary and were based on practical considerations; recruiting females with specific inclusion criteria was challenging, and n = 20 was considered a reasonable minimum for stopping the data collection.
Urgolites, Wood et al. | Visual long-term memory stores high-fidelity representations of observed actions.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We decided ahead of time to collect data until the minimum sample size was achieved, and this was followed.
Mar 2013 | Ackerman, Kashy, Donnellan et al. | The Interpersonal Legacy of a Positive Family Climate in Adolescence
  1. Exclusions: We excluded some people from the sample because they did not have the observational data in adolescence or data about marital romantic partnerships in adulthood.  We did not exclude anyone who otherwise met the selection criteria for our study.
  2. Conditions: Full Disclosure (We had no manipulations. In our analyses we did examine a couple of parenting variables but then we learned that another team working with the same data set was using those outcomes so we dropped them.)
  3. Measures: No.  There are many more measures of different constructs in the dataset. We selected those most relevant to our investigation.
  4. Sample Size: The data were already collected so we simply analyzed what was available to us from this existing project.
Goldfarb & Treisman | Counting Multidimensional Objects: Implications for the Neural-Synchrony Theory
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We decided ahead of time to collect data until the minimum sample size was achieved, and this was followed.
Hehman, Leitner, Deegan et al. | Facial Structure Is Indicative of Explicit Support for Prejudicial Beliefs
  1. Exclusions: Full Disclosure
  2. Conditions: In Study 3, there was no difference between conditions on any of our dependent measures, nor interactions. Therefore, there was no evidence that this manipulation was effective or that it even functionally "existed."
  3. Measures: Full Disclosure
  4. Sample Size: In Study 1, we collected as many as possible during a semester. In Study 2, we aimed for 100 participants as we were unsure what effect size to expect, and stopped when we reached that goal. In Study 3, we based our sample goal on the size of the effect demonstrated in Study 2.
Schneider, Eerland, van Harreveld et al. | One Way and the Other: The Bidirectional Relationship Between Ambivalence and Body Movement
  1. Exclusions: Full Disclosure
  2. Conditions: Ambivalence is hard to manipulate experimentally among Dutch students; we pretested several self-written articles but used only the strongest manipulation. Because there were no differences in ambivalence, we did not analyze these data further, and as such, this information was not interesting.
  3. Measures: Some were significant, some were not, but the most important reason was doubts regarding validity. However, we mention the additional measures in the paper and interested researchers may contact us about these measures. 
  4. Sample Size: We decided ahead of time to collect data until the minimum sample size was achieved or the data collection period ended, and this was followed.
Sutin, Terracciano, Milaneschi et al. | The Effect of Birth Cohort on Well-Being: The Legacy of Economic Hard Times
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: The Baltimore Longitudinal Study of Aging (BLSA) is an on-going epidemiological study of normal aging. BLSA participants undergo extensive testing during each visit that lasts for 2-3 days. This testing includes numerous measures of physical, cognitive, and emotional health. The National Health and Nutrition Examination Survey (NHANES I) was likewise a large study that included numerous measures of health and nutrition. From both studies, we selected the measure that was relevant to our research question.
  4. Sample Size: We selected every participant who had completed the CES-D from the time it was introduced into the BLSA (1979) to the time of the initial data analysis (2010). BLSA participants continue to fill out the CES-D at every visit. From NHANES I, we selected adult participants who completed the CES-D.
Van der Burg, Awh, & Olivers | The Capacity of Audiovisual Integration Is Limited to One Item
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We included a predetermined number of subjects in each experiment. This was based on experience.
Feb 2013 | Caparos, Linnell, Bremner et al. | Do Local and Global Perceptual Biases Tell Us Anything About Local and Global Selective Attention?
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Participants performed two blocks of trials. Only the data obtained in the first block are reported in the paper. The effects (reported in the paper) were also present in the second block, however, there were also carry-over effects that were not sufficiently reliable to be reported. The paper format (brief report) was not adequate for us to discuss these effects.
  4. Sample Size: We aimed to test at least 50 participants in each group (a group of British participants and a group of traditional African participants). In Africa, two weeks of testing were dedicated to data collection for this experiment. We tested as many participants as we could during these two weeks (reaching a sample size of 58 in the African group). We then tested an equivalent number of British participants.
Jamieson, Koslov, Nock et al. | Experiencing Discrimination Increases Risk Taking
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: We collected pre-experiment questionnaires on-line that all participants completed and that were beyond the scope of the current article. Measures included items such as intergroup contact, personality measures, and other individual differences.
  4. Sample Size: As in all the studies in my lab, we decide on the targeted N based on previous studies and power analysis. We then run 10% over the targeted amount due to typical loss in physiological measures and biological samples (due to electrical interference, loss of signal, contaminated saliva samples, etc.). We *never* analyze our data until the study is complete, primarily because we send out biological samples in batch so that they are assayed at one time. I didn't respond (yes or no) above because this stopping rule is not stated explicitly, but we do cite standard articles and chapters that outline this protocol, and space constraints precluded including this type of extra information.
Laran & Salerno | Life-History Strategy, Food Choice, and Caloric Consumption
  1. Exclusions: Full Disclosure
  2. Conditions: An entire study, from the first submission, did not make the final version of the paper as per editorial request.
  3. Measures: In Study 2, we included a few filler questions, unrelated to our research questions, to support our cover story. These measures did not vary as a function of our experimental conditions.
  4. Sample Size: Study 1: We aimed to collect at least 25 participants per cell. We obtained our final sample by asking our undergraduate research assistants to recruit as many participants as they could over a two day period of a few hours each day and ended up with more participants than the 25 per cell initially expected (n = 121). Studies 2 and 3: We aimed to collect 40 subjects per cell given that the dependent variable was binary. Our total n was lower than expected in Study 2 (n = 238) and Study 3 (n = 144) based on fluctuation in attendance rates for the sessions held at our lab.
Marinovic, Pearce, & Arnold | Attentional-Tracking Acuity Is Modulated by Illusory Changes in Perceived Speed
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Based on past experience with motion adaptation phenomena, we decided on a sample size of 8, as this should be more than ample to detect a low-level visual aftereffect. We stopped testing once we had tested all the participants.
Simonsohn & Gino | Daily Horizons: Evidence of Narrow Bracketing in Judgment From 10 Years of M.B.A. Admissions Interviews
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Full Disclosure
Jan 2013 | Briñol, Gascó, Petty et al. | Treating Thoughts as Material Objects Can Increase or Decrease Their Impact on Evaluation
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: We excluded any items that were unrelated to the research questions or that were included for exploratory purposes. Furthermore, we focused on the items that were included in all the studies within the paper in order to maintain convergence across experiments.
  4. Sample Size: The number selected was based on our prior experience with this research topic and the number of participants that could be successfully recruited within an academic term (without crossing terms). Also, given that not all participants who signed up for the experiments in advance showed up to participate, the total number of subjects per cell was not identical in all cases. Also, we did not conduct any statistical tests until we were done collecting data.
Cook, Johnston, & Heyes | Facial Self-Imitation: Objective Measurement Reveals No Improvement Without Visual Feedback
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We decided ahead of time to collect data until the minimum sample size was achieved, and this was followed.
Kille, Forest & Wood | Tall, Dark, and Stable: Embodiment Motivates Mate Selection Preferences
  1. Exclusions: Data from 2 participants were excluded: 1 was unable to sit in either of our chairs (which constituted our manipulation of physical stability) due to his/her weight, and 1 did not comply with the researcher's assignment to condition. When these participants are included in our analyses (the participant who was unable to use our chair was assigned to a separate "stable" chair), results remained significant.
  2. Conditions: Full Disclosure
  3. Measures: We also gathered measures to address a separate research question regarding participants’ perceptions of the stability of their own singlehood status that we did not report. As we predicted, we found that participants in the physically unstable (vs. stable) condition felt that their singlehood was less likely to last. After participants completed all of the measures reported in the paper, they went on to complete measures assessing their preferences for products (e.g., Aerobics step bench) unrelated to relationships.   
  4. Sample Size: We recruited participants in a campus student center, which requires reserving time slots in advance. We reserved a number of timeslots that we felt would give us adequate access to participants to obtain at least 20 participants per cell in our design and collected the data until our slots were completed.
Lerner, Yi, & Weber | The Financial Costs of Sadness
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We aimed for 30 subjects per cell, based on past experience with these kinds of studies. We did not conduct a power analysis. Once we reached at least 30 per cell, we continued running until all previously scheduled subjects had been run.
Spunt & Lieberman | The Busy Social Brain: Evidence for Automaticity and Control in the Neural Systems Supporting Social Cognition and Action Understanding
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Participants in the study completed several validated personality questionnaires following their MRI session. To be perfectly honest, their inclusion was primarily motivated by convenience: given that MRI data is expensive to collect, we often include additional measurements that are secondary to the main purpose of the study but which will permit theoretically-related follow-up analyses (for instance, examining the moderating influence of a personality variable on the strength of an observed group effect). For the published study in question, I have not had the time to even begin to look at this individual difference data.
  4. Sample Size: We determined sample size in heuristic fashion based on our previously published studies using this paradigm (Spunt, Satpute, & Lieberman, 2012, Journal of Cognitive Neuroscience; Spunt, Falk, & Lieberman, 2010, Psychological Science). We collected a few more subjects than in those previous studies given that this study was examining the moderating effect of an additional manipulation (i.e., memory load). I completely acknowledge that this is a highly informal procedure; at the time it was unclear how best to formally determine sample size. While there are still many ambiguities in how best to determine sample size for fMRI studies, recent publications and software releases (e.g., http://fmripower.org/) are beginning to clarify things.
Dec 2012 | Berntsen, Johannessen, Thomsen et al. | Peace and War: Trajectories of Posttraumatic Stress Disorder Symptoms Before, During, and After Military Deployment in Afghanistan
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure (As stated in the supplementary material as well as in the paper: The reported study was part of a large survey conducted through the military. It included many questionnaires and we had to focus on the ones of key relevance. This is often the case with databases of this size, unlike the typical lab experiment.)
  4. Sample Size: Full Disclosure
Gaissmaier & Gigerenzer | 9/11, Act II: A Fine-Grained Analysis of Regional Variations in Traffic Fatalities in the Aftermath of the Terrorist Attacks
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure (Note: We analyzed publicly available observational data from the 50 US states (+DC) only and thus did not have any experimental conditions)
  3. Measures: In response to the editor’s and reviewers’ comments, we conducted and/or discussed some additional analyses, which were only presented to the editors and reviewers, but not included in the paper – either because they yielded redundant results (e.g., statistics per inhabitant with driver’s licence rather than per each inhabitant) or because the number of observations was too small to yield reliable results (e.g., number of drunk driving citations on a state-by-state level)
  4. Sample Size: Full Disclosure (Note: The sample size was simply determined by the number of states)
Korjoukov, Jeurissen, Kloosterman et al. | The Time Course of Perceptual Grouping in Natural Scenes
  1. Exclusions: In the “Size” experiment, described in the Appendix, we excluded data from 15 participants due to a technical error. Another data set, collected over 12 participants, was excluded due to a difference in procedure (difference in the overall number of trials and session duration).
  2. Conditions: We did not report the differences between the three d-conditions in the experiments because they are irrelevant to the main focus of the study.
  3. Measures: Full Disclosure
  4. Sample Size: We found that the results were highly significant after we tested a predetermined number of participants.
Rodeheffer, Hill & Lord | Does This Recession Make Me Look Black? The Effect of Resource Scarcity on the Categorization of Biracial Faces
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: For both studies, we decided ahead of time to aim for 30 participants per cell and stopped data collection once we reached that target. In Study 1 we went slightly over (N = 35) and in Study 2 we were slightly under (N = 27). Discrepancies were due to fluctuations in participant attendance rates. We did not look at the data before data collection ended, nor did we run more participants once we had decided when the last experiment session would be.
Nov 2012 | Matthews | How Much Do Incidental Values Affect the Judgment of Time?
  1. Exclusions: When participants were excluded on the basis of their responses to questions asked during the task (e.g., extreme values), I explained how many participants were excluded and the basis for exclusion in the Supplementary Materials for the paper (which describe the methods in detail). In addition, several of my studies were run on-line. For these studies, I applied eligibility criteria to determine whether the participant was eligible to be included in the sample. These included age (participants had to be at least 16), answering all questions (i.e., not choosing to withdraw from the task), and not having an ip address that appeared earlier in the study or in one of the earlier studies in the series. These eligibility criteria are fully documented in the Supplementary Materials. Ineligible responses were never analysed and there is no way of knowing, for example, how many actual participants they represent (e.g., duplicate ip addresses may be one person or several), so I did not report the precise numbers of responses that were screened on these grounds (although I am happy to provide that information).
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Sample sizes were based on power analysis and sampling continued until a minimum sample size was achieved. It was not possible to specify in advance precisely what sample size would be tested because I could not control exactly how many eligible people would sign up. The policy was to recruit more participants than were needed to give high power to detect the effect of interest (see Table 1 of the paper). That is, I aimed to “over-shoot” slightly so as to have high power after removing ineligible respondents. Samples were intentionally larger for on-line studies because (a) participants were easier to recruit, and (b) the more heterogeneous sample and testing environment might reduce effect size. There was no optional stopping.
Stallen, De Dreu, Shalvi et al. | The Herding Hormone: Oxytocin Stimulates In-Group Conformity
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: The number of participants was determined before starting the study, in line with typical sample sizes used in studies in this field. Data collection was terminated upon reaching the predefined sample size.
Weems, Scott, Banks & Graham | Is TV Traumatic for All Youths? The Role of Preexisting Posttraumatic-Stress Symptoms in the Link Between Disaster Coverage and Stress
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure (non-experimental research)
  3. Measures: Data were from a larger longitudinal study, and we reference this fact and the other work (previously published studies) in the paper. We also tested a number of alternative explanations with additional measures, and these are reported in our supplemental data available online.
  4. Sample Size: Full Disclosure
Oct 2012 | Berman & Small | Self-interest without selfishness: The hedonic benefit of imposed self-interest
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: We took additional measures that were not reported in the final manuscript. These measures were not included because: a) they were irrelevant to the main hypothesis and were not analyzed; b) they were removed during the review process; or c) reporting the results did not fit within the manuscript word count. 
  4. Sample Size: All of our sample sizes were determined in advance of collecting data and data collection stopped when the target sample sizes were reached. For study 2, we purchased a set of gift cards ahead of time in bulk, and stopped when we ran out of gift cards.
Brascamp & Blake | Inattention abolishes binocular rivalry: Perceptual evidence
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We empirically determined the amount of data needed to get a clear and interpretable data pattern in our two reference conditions (called 'Attended' and 'Absent'). For our condition of interest ('Unattended') we then collected the same amount of data.
Fairbanks, Way, Breidenthal et al. | Maternal and offspring dopamine D4 receptor genotypes interact to influence juvenile impulsivity in vervet monkeys
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: The sample size was not determined a priori. We tested all of the juvenile monkeys available in our colony during the 7 year time period of the research.
Grossman, Karasawa, Izumi et al. | Aging and wisdom: Culture matters
  1. Exclusions: Data from 22 American participants excluded for being outside the matching age range of the corresponding Japanese sample (25-75). The survey company in charge of subject recruitment in Japan did not recruit Japanese over 76 yrs. To match the age range of its Japanese equivalent, we reduced the American sample. Results remain virtually identical when examining all American adults. The full American sample was reported in the initial paper from this project (collected before the Japanese counterpart; Grossmann et al., 2010). Results are very similar across both types of samples.
  2. Conditions: Full Disclosure
  3. Measures: The study was part of a large-scale project examining cultural differences between Americans and Japanese in cognition and emotion. Thus, in other sessions participants completed a variety of instruments dealing with cultural constructs of independence vs. interdependence, holistic attention, positivity bias in memory, etc. Reporting these measures was outside the scope of the paper both thematically and in terms of page length. Further, many of these tasks were not yet entirely coded and analyzed at the time this paper was in press.
  4. Sample Size: We used an age-stratified random sample with oversampling. The latter was done to ensure that we had a comparable number of individuals of both genders, of different levels of education (junior high vs. college), and in each of the three age groups (25-40; 41-55; 60-75). Our goal was to have at least 25 people in each cell. In the U.S., we stopped collecting data when we achieved this quota for the cells we had to oversample; in Japan a survey company made a corresponding decision.
Hu, Rosenfeld, & Bodenhausen | Combating automatic autobiographical associations: The effect of instruction and training in strategically concealing information in the autobiographical Implicit Association Test
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Sample sizes were determined based on previous similar experiments conducted by the first author. We decided to stop data collection after we reached the predetermined number, which was N = 16 in each condition.
Shalvi, Eldar, & Bereby-Meyer | Honesty requires time (and lack of justifications)
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We instructed the RA to target 30-35 participants per cell, based on our experience with the strength of the studied effects and a general convention of good practice in the field. Data collection was stopped once this target was met. We ended up with a slightly higher n per cell due to good show-up rates in some experimental sessions.
Ybarra, Lee, & Gonzalez | Supportive social relationships attenuate the appeal of choice
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Study 1 contained measures not related to the research question.
  4. Sample Size: No (Disclosure statement coming soon.)
Sep 2012 | Cain, Vul, Clark et al. | A Bayesian optimal foraging model of human visual search.
  1. Exclusions: Full Disclosure
  2. Conditions: All between-subjects conditions and manipulations were reported. We attempted a within-subjects version but it was too difficult for participants and was canceled.
  3. Measures: We also collected additional demographic information (e.g., video game playing behavior) for future recruitment purposes. These were not analyzed in relation to the dependent measures of this study.
  4. Sample Size: We collected 10 participants per group (30 total) and examined the results. The results were unclear so we decided to collect an additional 5 participants per group. At that point a clearer picture had emerged and we stopped data collection. These values were informed by previous studies from our lab using related paradigms that tested 12 participants per group.
Emery, Finkel, & Pedersen | Pulmonary function as a cause of cognitive aging
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure (Not applicable.)
  3. Measures: Other measures were assessed but not reported because the data came from a longitudinal, population-based study of multiple outcomes.
  4. Sample Size: Coming soon.
Grant & Dutton | Beneficiary or benefactor: Are people more prosocial when they reflect on receiving or giving?
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: We collected additional questionnaire measures unrelated to the research question.
  4. Sample Size: For Study 1 the sample was the population of employees at the call center. For Study 2 we set our data collection termination rule in advance based on power calculations from Cohen (1992 PB) and sample size availability in the behavioral lab. We did not modify the rule in the course of the research.
O’Hara, Gibbons, Gerrard et al. | Greater exposure to sexual content in popular movies predicts earlier sexual debut and increased sexual risk taking
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure (non-experimental)
  3. Measures: These data came from an extensive longitudinal study of media and health and many measures were not related to this research question.
  4. Sample Size: A power analysis was used to determine that the analyses from the original grant proposal required successful follow-up with 2200 never-smokers at baseline resulting in an original sample of 6522 participants at Time 1.
Raby, Cicchetti, Carlson et al. | Genetic and caregiving-based contributions to infant attachment: Unique associations with distress reactivity and attachment security
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Our study was part of a longitudinal project that collected several other measures not relevant to our hypotheses. However, exploratory analyses (involving measures of infant temperament) were completed at a reviewer's request, but the results were not reported because the measures were not sufficiently reliable.
  4. Sample Size: Full Disclosure
Aug 2012 | Monti, Parsons, & Osherson | Thought beyond language: Neural dissociation of algebra and natural language
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Sample size (20 subs + 1 pilot) was decided at the time we put in our request for MRI scanning slots for the study and was set in line with typical sample sizes in the field. Collection was terminated upon reaching the target N.
Pleskac | Comparability effects in probability judgments.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: I sought to obtain 30 participants. The design was completely within subjects, so this sample size provides sufficient power (greater than 80%) at the aggregate level. Note also that I collected enough observations per subject (450) that I can actually treat each participant as his or her own experiment.
Jul 2012 | Bélanger, Slattery, Mayberry et al. | Skilled deaf readers have an enhanced perceptual span in reading.
  1. Exclusions: One participant was excluded from the experiment for not meeting our inclusion criterion on non-verbal IQ. This was not reported because of the limited space available to report more relevant results.
  2. Conditions: Full Disclosure
  3. Measures: We had a background test to assess ASL skills, but the test was not well suited to our adult population. Participants' scores reached ceiling or near ceiling, so the test could not be used as a covariate as originally planned.
  4. Sample Size: We ran the maximum number of people that we could find in our special population that also met our inclusion criteria (those were included in the paper).
Fuller-Rowell, Evans, & Ong | Poverty and health: The mediating role of perceived discrimination
  1. Exclusions: Full Disclosure (We used FIML estimation in order to be able to include all individuals who participated in W3 of the study in the models.)
  2. Conditions: Full Disclosure (Not applicable. Our study was not experimental.)
  3. Measures: Study included a large number of measures. We only discussed the measures relevant to the analyses presented in our paper.
  4. Sample Size: Coming soon.
Leander, Chartrand, & Bargh | You give me the chills: Embodied reactions to inappropriate amounts of behavioral mimicry
  1. Exclusions: Full Disclosure
  2. Conditions: In the third study we ran (now Study 1), I attempted to add a second, male experimenter, but his data were uninterpretable and only seemed to add error variance. That is why all studies specifically report using only a female experimenter.
  3. Measures: Scales/questionnaires unrelated to the research question were not reported. The DVs were the first things we assessed after the manipulations and we included additional questionnaires afterwards so as to make full use of the participants' time while we had them in the lab. It seems superfluous and distracting to report such information if it is independent of the study procedure and would not be meaningful for the purpose of someone trying to replicate the findings (which, in my mind, is the essence of how to write a research report).
  4. Sample Size: An a priori decision was made to stop data collection at the end of the given block/semester.
Longo, Long, & Haggard | Mapping the invisible hand: A body model of a phantom limb
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: This was a single case-study of an individual with congenital limb absence so the issue of determining sample size is not applicable.
Mazerolle, Régner, Morisset et al. | Stereotype threat strengthens automatic recall and undermines controlled processes in older adults
  1. Exclusions: We removed 4 participants (two young and two old participants) for being outliers (on Cook's D and SSD, following Judd & McClelland, 1989; McClelland, 2000). Because of the PS word count for short reports, we didn't mention this information.
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: A French version of the process dissociation procedure (PDP) was constructed, including four sets of words (similar in number of letters, number of syllables, and frequency, based on Jacoby's recommendations, 1998) corresponding to four instruction conditions (inclusion, inclusion filler, exclusion, exclusion filler). To account for possible differences between word sets, we counterbalanced each set, creating 4 PDP versions. Then, we counterbalanced each PDP version with threat conditions and age groups. We decided that 56 participants for each PDP version was sufficient to account for potential differences, resulting in 56 participants x 4 PDP versions = 224 participants. Analyses didn't show any difference between the 4 PDP versions. Sample size was decided ahead of time and was followed. Again, because of the PS word count for short reports, we didn't mention this information.
Parise & Csibra | Electrophysiological evidence for the understanding of maternal speech by 9-month-old infants
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: 14 infants per group were targeted, but the sample overshot this target because fewer infants than expected were excluded due to bad data.
Wolfe | Saved by a log: How do humans perform hybrid visual and memory search?
  1. Exclusions: We excluded outlier trials with RTs > 7000 msec. That seems to have eliminated 8 of 12000+ trials. Usually we report that exclusion. I think that must have fallen victim to the word count restriction in Psych Sci.
  2. Conditions: Frankly, this is a bit of a silly question. We would typically run various pilot versions of experiments to make sure that the code works, that we know how long the task takes, etc. Once we are sure we are not wasting our time collecting garbage, we would run a decently powered experiment. What you (I assume) really want to know is whether we ran more or less the same experiment 20 times and are only reporting the one time that p scraped over 0.05. That we did not do.
  3. Measures: Oh, come now, we record all sorts of things out of a sense of completeness. For example, in this experiment I know the location on the screen of every target item. There are undoubtedly effects of this variable on reaction time. Those effects might be interesting. We do not happen to have analyzed that variable. Is it unrelated to the research question? I don't know. Might be a good exploration for a rainy day or for someone who asks for our data.
  4. Sample Size: Many years of experience with experiments of this sort suggest that if we collect on the order of 50 data points in each cell of the experiment (in this case 50 trials target present and absent for each combination of visual and memory set size) and if we run 10-12 observers that our results will have sufficient power to see differences between conditions when such differences exist.
Jun 2012 | Donkin & Nosofsky | A power-law model of psychological memory strength in short- and long-term recognition
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We decided to collect four participants (each of whom completed 10 sessions) because our goal was to fit a quantitative model to individual subject response time distributions. This number is standard for this type of analysis. The experiment was a replication of a previous study and our own pilot studies revealed that the effect was remarkably robust in individuals (even when they completed the task for just one hour) which told us that we did not need to collect more participants than is standard.
Hirsh, Kang, & Bodenhausen | Personalized persuasion: Tailoring persuasive appeals to recipients' personality traits
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Sample size was determined a priori using power analysis software based on effect size estimates from previous research. Data collection was stopped once we achieved the target sample size.
Hofmann, Vohs, & Baumeister | What people desire, feel conflicted about, and try to resist in everyday life
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Other measures were assessed but not reported because they were unrelated to the research question (Large experience sampling project addressing multiple research questions not all of which were addressed in the above publication as explicitly stated.)
  4. Sample Size: The goal was to collect as many participants as possible with the available project funds.
Hulme, Bowyer-Crane, Carroll et al. | The causal role of phoneme awareness and letter-sound knowledge in learning to read: Combining
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Article reports analyses of a subset of measures from an earlier study. Article only reports data relevant to assessing a specific hypothesis that the effects of the reading and phonology training in the previous study were mediated by changes in phoneme awareness and letter-sound knowledge.
  4. Sample Size: The data come from a previously conducted randomized controlled trial wherein sample size was based on a power calculation and we recruited samples that were as large as possible within the time and resources available.
Keysar, Hayakawa, & An | The foreign-language effect: Thinking in a foreign tongue reduces decision biases
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: For experiments conducted on campus: We targeted around 30 subjects per condition in advance. For experiments abroad and out of state, we instructed RAs to recruit as many subjects as they could within their limited time-frame.
May 2012 | Bernard, Gervais, Allen et al. | Integrating sexual objectification with object versus person recognition: The sexualized-body-inversion hypothesis.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: We also measured potential moderators: ambivalent sexism (ASI), internalization of beauty standards (SATAQ), and self-objectification. We did not find any significant correlations, and we decided to report all tested experimental conditions (i.e., recognition of inverted males, upright males, inverted females, and upright females) without mentioning these additional measures (i.e., moderators). Participants were also asked to complete two other tasks unrelated to the body inversion paradigm we used.
  4. Sample Size: Before data collection we decided to test approximately 80 participants, based on past studies we have done using this task. We tested during several testing sessions and stopped data collection after the last testing session (when we had > 80 participants).
John, Loewenstein, & Prelec | Measuring the prevalence of questionable research practices with incentives for truth telling
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Full Disclosure (As stated in the supplement, “We stopped collecting data approximately ten days after the final follow-up email was sent. By this point, the rate of incoming responses had dropped off substantially (from over 100 per day in the days immediately following the first solicitation email, to on average fewer than one respondent per day).” That is, the decision to stop was independent of the results of data analysis.)
Sweeny & Vohs | On near misses and completed tasks: The nature of relief.
  1. Exclusions: We excluded one participant in Study 2 because the RA failed to record the experimental condition for that session.
  2. Conditions: Full Disclosure
  3. Measures: Other measures were assessed but were not reported because they were not related to the research question.
  4. Sample Size: We aimed for approximately 100 participants for each study and this was followed. In Study 1 we went slightly over (n = 114) before noting the sample size and cutting off recruitment and in Study 2 we didn't quite reach 100 (n = 79) by the end of the data collection period at the end of term.
Terburg, Aarts, & van Honk | Testosterone affects gaze aversion from angry faces outside of conscious awareness.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: A measure of digit ratio was included in the research design as a possible mediator of the effects of testosterone. This was, however, not the case, which was not reported for reasons of space/word limits. These additional data are, however, reported and discussed in: Terburg, D., & van Honk (in press). Approach-avoidance versus dominance-submissiveness: A multilevel framework on how testosterone promotes social status. Emotion Review.
  4. Sample Size: Sample size was predetermined based on earlier testosterone administration studies; we collected data until the sample size as written in the protocol (N = 20) was reached.
Vess | Warm thoughts: Attachment anxiety and sensitivity to temperature cues
  1. Exclusions: Full Disclosure (Criterion used for 2 participants excluded in Study 2 was studentized-deleted residual values greater than |3.0|. This criterion was provided in the original submission, but was excluded in the final version due to strict word limits)
  2. Conditions: Full Disclosure
  3. Measures: I included 2 items regarding participants' experience with the sentence-unscrambling task. These items assessed task difficulty and task enjoyment. Descriptions of these items were included in a supplement to the original submission but were excluded in the final version due to space restrictions and their null impact on the primary results.
  4. Sample Size: A minimum sample size was targeted and each study was opened on-line for a set amount of time. Because the minimum sample size was met in both studies after this set time, each study was stopped at that point.
Apr 2012 | Chandler & Pronin | Fast thought speed induces risk taking
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: In Study 2 we asked participants items from the CARE (a measure of risk taking) that pertained to sex, substance abuse, and disorderly conduct. However, due to researcher (my) error, participants also completed a single item from the sports subscale of the CARE (the extent to which participants enjoy skiing) at the end of the questionnaire. We did not report this item because (aside from the fact that we did not intend to include it) it seemed inappropriate to make claims about behaviors measured by this subscale when it was represented by only a single item. All descriptive results reported in the paper are virtually identical and all statistical tests of significance are unchanged if this item is included.
  4. Sample Size: We had a sense that the effect would likely be large based on earlier research using similar manipulations, so we were not too concerned about obtaining a large sample. This was really one of those situations where sample size was determined by resource limitations rather than a solid methodological rationale: I had to be present in the eating hall with the RA while data were collected, so we could only collect at times both of us were free. The rule was to collect until the end of the semester and see how things looked then. In both cases we collected a single semester's worth of data, checked, and terminated.
O'Brien & Ellsworth | More than skin deep: Visceral states are not projected onto dissimilar others
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: We included a question about perspective taking (To what extent did you step inside Jim's shoes while reading the story? from 0-9); there were no differences on this measure & it didn't affect the results, and a number of people left it blank or put question marks next to it. We also included 5 yes/no questions taken from Van Boven & Loewenstein 2003: Have you ever been lost in the woods? Engaged in backpacking? Engaged in mountaineering? Engaged in hiking? Engaged in wilderness activities? Almost everyone circled No, and many people left them blank. They were printed on the back page of the last sheet of the study packet, so I think some people forgot to flip it over. Hence, due to methodological difficulties and to fit word limits, we dropped these measures.
  4. Sample Size: The general rule of thumb is that we always try to get at least 20-30 people per cell before looking at any of the data, and if more data are needed, to run additional blocks of 20-30 before looking again. There are also some practical constraints. For example, I think we stopped data collection in Study 2 because we ran out of subject pool hours (the 20-30 rule had also been met).
Mar 2012 | Forest & Wood | When social networking is not working: Individuals with low self-esteem recognize but do not reap the benefits of self-disclosure on Facebook
  1. Exclusions: Full Disclosure (Criteria for exclusion were reported in the paper. In general, data from participants who complete a survey multiple times or double-submit pages of the survey are also discarded, or the first set of responses is retained and subsequent responses are discarded. Without going back to the raw user-input data, I cannot be certain which of these strategies was employed, or whether there were any such participants who completed the survey multiple times. However, discarding data from participants who submit multiple times is always done before any analyses are conducted.)
  2. Conditions: Full Disclosure
  3. Measures: Other measures included, for example, measures of the Big 5 personality traits and narcissism and questions about participants' Facebook settings; these were collected for purposes unrelated to the main research questions addressed in the paper and were therefore not reported in the paper.
  4. Sample Size: These data were collected several years ago and I do not recall the specific reasons for sample size decisions in these particular studies. In general, we terminated data collection at the end of an academic term or when a given study had reached its maximum credit allocation from the research participant pool, unless a sample size we deemed sufficient was reached prior to these cutoffs.
Gupta, Jang, Mednick et al. | The road not taken: Creative solutions require avoidance of high-frequency responses
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We collected the data during one term and then spent the next year applying mathematical models to these data. There was no modification of sample size.
Jackson, Thoemmes, Jonkmann et al. | Military training and personality trait development: Does the military make the man, or does the man make the military
  1. Exclusions: Full Disclosure
  2. Conditions: N/A (non-experimental study)
  3. Measures: There were scores of unreported measures and items given the study was part of a large, multi-wave longitudinal study.
  4. Sample Size: Sample size was determined by a power analysis for the initial grant, which took into account the number of schools that we would need to sample (assuming a particular response rate).
McCaffrey | Innovation relies on the obscure: A key to overcoming the classic problem of functional fixedness
  1. Exclusions: Full Disclosure
  2. Conditions: Another condition was testing a secondary hypothesis, which did not reach significance. We reported on the primary hypothesis but not the secondary hypothesis.
  3. Measures: Full Disclosure
  4. Sample Size: Full Disclosure
Feb 2012 | Frankenstein, Mohler, Bülthoff et al. | Is the map in our head oriented north
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We decided ahead of time to collect a sample size of 30 participants (15 male, 15 female). Due to time constraints of the project and the lab space being used by many research groups (i.e., we had only a limited amount of time to collect data in that lab facility), we were not able to run 30 participants and had a few fewer. All data collection was finished before starting any analyses; no additional participants were run after analyses started.
Hodson & Busseri | Bright minds and dark attitudes: Lower cognitive ability predicts greater prejudice through right-wing ideology and low intergroup contact.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure (We conducted secondary analyses of large-scale datasets with our analyses focusing on the key variables reported by the original authors.)
  4. Sample Size: Full Disclosure (We used the participant samples as used by the original authors in our secondary analyses.)
Howell & Shepperd | Reducing information avoidance through affirmation
  1. Exclusions: Full Disclosure
  2. Conditions: When we conducted Study 1, we included a manipulation intended to have the opposite effect of affirmation. However, manipulation check measures suggested that our manipulation failed to produce the intended psychological effect. Thus, we dropped it from the remaining studies and do not report it in the paper.
  3. Measures: We included some measures that were not related to the research question. We also included a variety of process variables which either were not reliable or did not predict any variance in our outcomes. We chose to stick to our primary effects for publication because of space constraints and to streamline our story.
  4. Sample Size: We determined that we were going to collect 20-25 participants per condition in advance of the study based on standard power recommendations for the analyses we intended. We ran our research in our university's human-subjects participant pool and uploaded blocks of participation slots each week. For the first study we stopped when the semester ended. Our second two studies included more than 25 participants per cell because of an unanticipated influx of signups at the end of the studies.
O'Brien & Ellsworth | Saving the last for best: A positivity bias for end experiences
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: We also measured current mood, current hunger level, and general enjoyment of chocolate (on 0-10 scales); there were no differences between groups, and none of these variables influenced the results; I think I dropped them to fit word limits.
  4. Sample Size: We stopped data collection simply because I had to travel to Poland for a summer research exchange program, and I wanted to finish data collection before I left. I collected data up until the last possible day before the trip.
Wang, Li, Fang et al. | Individual differences in holistic processing predict face recognition ability
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: In the first step we planned to collect 500 subjects. (For genetic studies you usually need three independent samples: the first sample is for exploratory investigation, and the second and third samples are for validation. That is, the sample size was determined by the genetic study.) However, we were only able to collect data from 337 subjects in this step. We happened to learn that our data were also capable of addressing the relation between face recognition and holistic face processing, so we used this set of data, which had been designated for another study (i.e., the genetic basis of face recognition). We used the full data set collected.
Jan 2012 | Duguid & Goncalo | Living large: The powerful overestimate their own height
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Full Disclosure
Hehman, Gaertner, Dovidio et al. | Group status drives majority and minority integration preferences.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: The article was based on data from a larger multi-year project; hence, multiple measures were collected, many unrelated to the research question addressed in the manuscript, and these were not included. Additionally, since we created the scale used for this research, several items originally intended to measure our construct of interest were cut from the analysis based on a confirmatory factor analysis (though the use of a CFA is reported in the manuscript).
  4. Sample Size: Our original goal was for 150 participants of each type of student (Black or White) at two universities or 600 participants. We quickly realized collecting data from 150 White participants at one university (a historically Black college) was unrealistic. For these groups we collected as many as possible and stopped collecting at the end of the semester. For the groups for which we were able to meet our target goal we stopped when hitting that goal (~150 participants).
Sandman, Davis, & Glynn | Prescient human fetuses thrive
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure (We didn't have experimental conditions but naturalistic observations)
  3. Measures: We also assessed hundreds of other measures in this longitudinal project, everything from fetal growth (using ultrasound) to childhood MRI; we did not report these other measures because it would not be feasible.
  4. Sample Size: Longitudinal study; the sample included all subjects for whom complete data were available for the variables of interest. For the NIH grants that supported the studies, detailed descriptions of power were included.
Sternberg & McClelland | Two mechanisms of human contingency learning
  1. Exclusions: Full Disclosure
  2. Conditions: For comparison with the causal framing instructions, we required a comparison framing condition that led to comparable learning of the direct contingencies in the training phase of the experiment across the causal framing and comparison framing condition. Two conditions that did not meet this requirement were tried before the object framing condition. Details are reported in the first author's dissertation (Sternberg, 2012).
  3. Measures: Participants were also asked to give subjective ratings about the probabilities of the outcome for each item at the very end of the experiment, immediately before debriefing. These are not reported in the paper, as we found early on that while they demonstrated participants had declarative knowledge of the direct contingencies, they were in general not reliably sensitive to the observed indirect effects in either task.
  4. Sample Size: As cue competition/indirect effects in fast-paced response time tasks have not to our knowledge been previously observed in the contingency learning literature, we could not perform a direct power analysis based on previous findings. However, we decided on specific target sample sizes (48 per condition in the RT task and 24 per condition, in the prediction task, respectively) from the outset, and stuck with them.
Szpunar, Addis, & Schacter | Memory for emotional simulations: Remembering a rosy future.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Sample size was determined on the basis of results of previous studies run in the lab. We aspired to reach 24 participants in each delay condition, half using the memory sampling technique and half using the list sampling technique. This goal was not modified in the course of the experiment.
Waytz & Young | The Group-Member Mind Trade-Off: Attributing Mind to Groups Versus Group Members
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Targeted sample sizes were based on previous similar studies, taking into account the particular design of the study at hand, and the total number of cells. We stopped data collection when we reached our pre-determined targets. The first two studies relied on item-wise analyses, so we aimed for approximately 20 subjects accounting for data loss/gain that results from typical discrepancies between MTurk's reported number of hits accepted and the actual numbers of participants that we identified as completing the study upon data inspection. Studies 3 and 4 relied on subject-wise analyses and either a mixed design (Study 3) or a within-subjects design (Study 4). The main analysis of Study 3 involved a 3x2 ANOVA (6 cells) as well as a between-subjects factor, included in an initial analysis, reported in the paper. The primary analysis of Study 4 was a 2x2 ANOVA (4 cells). We aimed for approximately 60 subjects and 30 subjects for Studies 3 and 4 respectively, again accounting for data loss/gain.
Issue | Authors | Article Title
Dec 2013 | Gere, MacDonald, Joel et al. | The independent contributions of social reward and threat perceptions to romantic commitment
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: In all studies, all of the variables relevant to the constructs considered in our article were reported. Other variables were also measured in the studies, but those were not relevant to our research question and the constructs that we examined in the paper.
  4. Sample Size: Study 1: The data were taken from a larger, longitudinal study on relationship satisfaction. A sample of over 500 people was collected to account for anticipated dropouts from the study and failures of quality checks. Such a large sample affords sufficient power to detect even small effects. Study 2: The data were taken from a larger, longitudinal study on relationship dissolution. A sample of over 1,100 was collected to account for anticipated dropouts from the study, failures of quality checks, and a relatively low proportion of people who were expected to break up over the course of the study, in order to allow the prediction of relationship dissolution over time with sufficient power. Study 3: We wanted to have at least 100 participants per condition for sufficient power, so we set MTurk to end recruitment once 400 participants had reported on MTurk that they completed the study.
Nov 2013 | Loersch & Arbuckle | Unraveling the mystery of music: Music as an evolved group process
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Additional measures were included to examine non-focal hypotheses that were related to the overarching ideas tested in the research.
  4. Sample Size: Rough sample sizes were chosen based on convention, and, once a number of studies had been conducted, on previously obtained effect sizes.
Sherman, Figueredo, Funder | The behavioral correlates of overall and distinctive life history strategy
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: The data analyzed were part of several much larger projects. We note in the method section that this is the case and reference other papers based on these large projects, but listing all of the measures gathered (including those not used) in each of those data sets would be unwieldy. The full set of measures used in those data sets are available online.
  4. Sample Size: We did not report this in this paper because these data were archival (i.e., the data had already been collected years ago). However, the sample sizes were determined a priori for all of the studies to be somewhere between 180-220 participants because this provides ample power to detect effect size rs of .15 at the p < .05 level (two-tailed).
Oct 2013 | Zaiser & Giner-Sorolla | Saying sorry: Shifting obligation after conciliatory acts satisfies perpetrator group members
  1. Exclusions: Full disclosure was given for Experiments 1 & 2, and the US-to-Bunce Island Apology Manipulation of Experiment 3. For the US-to-India apology study, Experiment 4, we were collecting a concurrent sample of Indian respondents using Mturk, which we excluded from the write-up because they were part of the group receiving the apology, not giving it, which was the focus of the paper.
  2. Conditions: Full disclosure was given for Experiments 1 & 2, and the US-to-Bunce Island Apology Manipulation of Experiment 3. For the US-to-India apology study, Experiment 4, there was a third condition, apology only, that was dropped. We did this because results looked stronger when only the two conditions – control and the maximally effective conciliatory act including both apology and reparations – were analyzed.
  3. Measures: For Experiment 1, the study originally included, as possible moderators, measures of national attachment and glorification, egalitarianism, assessments of the morality of the event and of the group; and as possible outcomes, measures of group-based emotions. These measures were not reported because they were not relevant to the eventual theoretical point of the paper, and because some of them gave nonsignificant or inconsistent results across studies. We decided instead to focus on the most straightforward outcomes (e.g. satisfaction) and the theoretically relevant mediators of power, image and obligation. At some point in the research program we kept including these items for consistency but gave up even analyzing them (e.g., in the Study 4). For Experiment 2, the study also originally included, as possible moderators, measures of national attachment and glorification, assessments of the morality of the event and of the group; and as possible outcomes, measures of group-based emotions. Using the same rationale as in Experiment 1, these were not reported in the final manuscript. For the US-to-Bunce Island apology study, Experiment 3, the study originally included, as possible outcomes, measures of specific group-based emotions, and as a possible moderator, perceptions of the sincerity of the apology and importance of future improved relations. These were not reported in the final manuscript, again for the same rationale of exclusion used in Study 1. It also included measures of personal willingness to take part in outreach (e.g. visit Bunce Island) but these were not reported because they did not correlate well with collective action measures, had a floor effect, and did not yield significant effects of manipulations. For the US-to-India apology study, Experiment 4, the study originally included, as possible moderators, national identification, and perceptions of who is responsible, as well as specific intergroup emotions. These were not reported in the final manuscript again for the same rationale of exclusion used in Study 1. Additionally, there were measures of perceived power similar to Study 1 and 2, which were not reported because they were nonsignificant and by then not relevant to the theoretical picture, Study 4 being reported to address a specific editor’s/reviewer’s point about the interpretation of obligation shifting.
  4. Sample Size: For Experiment 1, sample size was determined ad hoc based on availability of participants through the research participation pool (as many people as signed up were allowed to take the study). Sample size was not altered midway through based on looking at the results. For experiment 2, sample size was determined ad hoc based on availability of participants through the research participation pool (as many people as signed up were allowed to take the study). Sample size was not altered midway through based on looking at the results. For the US-to-Bunce Island apology study, Experiment 3, sample size was determined ad hoc based on availability of participants through the research participation pool (as many people as signed up were allowed to take the study). After an initial data collection yielded marginally significant results, we undertook an additional data collection, which improved the significance to conventional levels. For the US-to-India apology study of Experiment 4, sample size was determined based on availability of MTurk funds, and initially was 40 per condition minus attrition from noncompleted studies. It was not altered during the analysis process.
Geers, Rose, Fowler et al. | Why does choice enhance treatment effectiveness? Using placebo treatments to demonstrate the role of personal control
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: At the end of Studies 1 to 3, participants completed a series of exploratory items. These included items asking for summary evaluations of the pain task/aversive noises and items assessing anxiety, attention to the task, feelings of control, and experience with similar tasks, such as working around loud noises. Most items were non-significant, and the few significant items were unreliable across studies. Thus, we did not focus on these items. Interested researchers may contact us about these measures.
  4. Sample Size: We determined that we were going to collect 20-25 participants per condition in advance of data collection, based on studies we have conducted using similar procedures. For Study 3 we stopped early due to the end of the academic semester, and for Study 4 we collected a few more participants due to an unanticipated increase in participant signups at the end of the academic term.
Sep 2013 | Gerbasi & Prentice | The Self- and Other-Interest Inventory
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Study 1a: Participants also completed the AIQ, 5 items from the Visions of Morality scale (Shelton & McAdams, 1990), and 10 items from Bobo and Hutchings's (1996) zero-sum belief measure. The AIQ was not included because it was felt to be redundant with other measures; however, the pattern of results for students was similar to that reported for adults in Study 2 (regression and correlation tables included below). Visions of Morality was not included in the manuscript because its reliability was low (alpha = .48; however, the correlations did confirm hypotheses: r = .32, p < .01 with other-interest). The zero-sum beliefs measure (r = .22, p = .04 with self-interest) targeted a related research question outside the scope of the paper. Study 1b: Participants also completed the SOII thinking of how they believed the average Princeton undergraduate would respond. In order to study the differences between perceptions of SI and OI in the self versus "the average" target, additional measures (both from the perspective of the self and of the average other) were included: theories of self-relative-to-other behavior, zero-sum beliefs (Bobo & Hutchings), and responses to defectors in prisoner-dilemma-type vignettes. This set of measures targeted a different research question and therefore was not included in the paper. Study 1c: Participants completed measures about their perception of the policies supported by Barack Obama and John McCain, in terms of how beneficial to the self and to the other each set of policies was. They were also asked about their voting behavior and candidate preferences overall. Participants also completed measures about purported changes to Princeton policy, and reported how much they thought each change would benefit the self and others, and how much time and effort they would be willing to donate to determine the course of policy outcomes. These items were not included because they were pilot measures for projects relating the SOII to policy-driven behavior, and were deemed not to be robust measures.
  4. Sample Size: Studies 1a-c: We collected as much data as possible in a set period of time. Study 2: We based sample size calculation on a pilot study (as reported in the paper). Study 3: Sample sizes were based on the availability of participants in the research pool. Study 4: Sample sizes were based on the availability of participants in the research pool. Study 5: We based sample size calculation on our own previous research using the same paradigm.
Beck, Pietromonaco, DeBuse et al. | Spouses' attachment pairings predict neuroendocrine, behavioral, and psychological responses to marital conflict
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Excluded measures were not related to research questions.
  4. Sample Size: Decided ahead of time to collect data until minimum sample size of 225 recently married opposite-sex couples was achieved and this was followed. The sample size was estimated in advance based on power analyses.
Mata, Ferreira, Sherman | The metacognitive advantage of deliberative thinkers: A dual-process perspective on overconfidence
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: In all studies, we tried to get around 20 participants per cell of the design. For the lab studies the N was ultimately dependent on the availability of student participants. For the Mturk studies, we recruited an exact number so as to try to have 20 participants per cell. In Study 5, we recruited 60 participants but 65 ended up providing data. This most likely means that 5 participants did not sign up for compensation at the end and were therefore not registered on Mechanical Turk as having taken part in the study.
Aug 2013 | Sheikh & Janoff-Bulman | Paradoxical consequences of prohibitions.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: We did not include some items on state emotions and some demographic items, such as political orientation, because neither was relevant to the paper's research question.
  4. Sample Size: We determined the sample size before data collection using similar studies run by our lab and reported by other researchers, but we did not report this information in the paper.
Campbell, Overall et al. | Inferring a partner's ideal discrepancies: Accuracy, projection, and the communicative role of interpersonal behavior.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure (N/A: both studies used correlational designs)
  3. Measures: Both studies were large scale dyadic studies that included measures related, and not related, to the hypotheses tested/discussed in this publication.
  4. Sample Size: In study 1, Nickola Overall (PI) targeted 200 dyads at the beginning of data collection. In study 2, Lorne Campbell (PI) targeted 100 dyads at the beginning of data collection.
Kraus, Keltner et al. | Social class rank, essentialism, and punitive judgment.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Health measures in Study 1 were not related to the present research question. Study 3 included measures of emotion that were irrelevant to the study question and additional measures of punitive judgment that were unreliable, so we decided not to include these in our report of the research.
  4. Sample Size: Sample sizes for Studies 1, 2, and 4, which used online data collection, were based on available funds. Study 2 sample size was determined by stopping collection after one full academic semester of data collection.
Lee, Gregg et al. | The person in the purchase: Narcissistic consumers prefer products that positively distinguish them.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: One exception: we also assessed self-esteem in Study 4 but did not report the results, for reasons of narrative coherence (we didn't measure it in Studies 2 and 3), because we dealt with the narcissism/self-esteem relationship in Study 1 (so the matter was dealt with), and because we conducted a large number of analyses analyzing the components of narcissism (for reasons of space).
  4. Sample Size: We simply ran as many participants as we feasibly could in a semester, or stopped at a number we thought might reasonably be enough to detect an effect (aiming for about 100 participants in each case). It proved to be enough.
Jul 2013 | Horberg, Kraus et al. | Pride displays communicate self-interest and support for meritocracy.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Participants completed some measures unrelated to the research question (i.e., filler items or items part of an unrelated or exploratory investigation). Further information and examples are reported on p. 28 and 31 of the article.
  4. Sample Size: Studies 1 and 3: We aimed to enroll at least 25 participants per between-subjects condition based on sample sizes used in similar research. We removed the online posting for the study when this was achieved, but permitted people who had already enrolled by the time it was removed to participate. Study 2: Since this was a within-subjects design, we aimed to collect data from at least 25 participants per cell. We stopped enrolling participants when this was achieved. Study 4: Participants came from a previously-conducted study that was run for one academic semester. We had no eligibility requirements for this study, so we decided to analyze the entire data set.
Hui, Molden et al. | Loving freedom: Concerns with promotion or prevention and the role of autonomy in relationship well-being.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: We did include some other dependent measures for exploratory purposes (e.g., responses to partner's capitalization attempts in three studies; Studies 4a, 4b, and 5b). We did not include them in the final paper as we could not consistently get the autonomy x promotion interaction.
  4. Sample Size: We did not conduct any power analysis beforehand. In general, we ran a minimum of 40 participants for a correlational study and 20 participants per cell for an experiment. In the non-online studies (i.e., Studies 1a, 3a, 5a, and 5b), the sample size was additionally determined by the number of subject hours we had for the study. For the online studies, we usually continued to run the study until we observed the effects; we typically re-analyzed the data after data collection was completed each day.
Lahti, Raikkonen et al. | Trajectories of physical growth and personality dimensions of the Five-Factor Model.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Full Disclosure
Van Doesum, Van Lange et al. | Social mindfulness: Skill and will to navigate the social world.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: All laboratory studies (Study 1a, 2a, 3, and 4) contained additional measures/items that were not related to the research question and/or were exploratory in nature. However, as replications, all measures were reported for the online studies (Study 1b, 1c, and 2b), except for two further exploratory items in Study 1c that did not influence the conclusions. As mentioned in the paper, Study 3 was embedded in a larger project, and we only reported on the relevant measures.
  4. Sample Size: For all studies that were run in the laboratory (Study 1a, 2a, 3, and 4), a minimum sample size for reasonable power was determined in advance. However, as a default strategy, data collection was never stopped before the allotted time in the laboratory ended, which in all cases provided us with the required minimum. For the online studies (Study 1b, 1c, and 2b), sample size (or the amount of judgments ordered) was determined in advance.
Wagner, Gerstorf et al. | The nature and correlates of self-esteem trajectories in late life.
  1. Exclusions: As reported in the paper, our data analyses included all participants who (a) were deceased by July 2011 and (b) had contributed one or more waves of self-esteem data in the last 10 years of life resulting in a sample of 1,215 individuals from the original sample of 2,087 participants.
  2. Conditions: Full Disclosure
  3. Measures: There were many more measures included in the dataset (the study was part of a large, multi-wave longitudinal study). However, they were not related to our research question.
  4. Sample Size: Full Disclosure
Jun 2013 | Baker, McNulty et al. | When low self-esteem encourages behaviors that risk rejection to increase interdependence: The role of relational self-construal.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Also assessed other measures not related to the research question.
  4. Sample Size: Sample size was determined by power analysis; however, we did not describe this in the manuscript
Frimer, Biesanz et al. | Liberals and conservatives rely on common moral foundations when making moral judgments about influential people.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure (N/A; correlational design)
  3. Measures: Full Disclosure
  4. Sample Size: Our primary interest was in comparing effect sizes, so we did not emphasize this issue.  We aimed for samples of 100. This did happen in Study 1.  In Study 2, we recruited professors from liberal and conservative colleges, and found that the conservative colleges had somewhat liberal professors.  To achieve a reasonable sample, we over-recruited from the conservative colleges (which had some conservative professors).  By the time we had a reasonable balance and our RAs were thoroughly exhausted from gathering contact information of thousands of professors, the sample was closer to 200.  Study 3 was a replication of Study 2, so we matched the sample size of Study 2.
Guimond, Crisp et al. | Diversity policy, social dominance, and intergroup relations: Predicting prejudice in changing social and political contexts.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: The published paper is part of a larger project and measures not related to the particular hypotheses tested were not reported.
  4. Sample Size: It was decided in advance that a sample of approximately 200 participants would be tested and when this number was reached or nearly reached, we stopped data collection.
Hepler, Albarracin et al. | Attitudes without objects: Evidence for a dispositional attitude, its measurement, and its consequences.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: For studies conducted using an undergraduate subject pool, we collected as many subjects as possible during a single semester. For studies conducted on Mechanical Turk, we collected as many subjects as possible given funding limitations. We did not conduct any statistical tests for a study until we were completely done collecting data for that study.
Zelenski, Whelan et al. | Personality and affective forecasting: Trait introverts underpredict the hedonic benefits of acting extraverted.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Study 1: As part of an exploratory student project, participants imagined two additional scenarios at the end of the study where they rated how introverted and extraverted they would behave. Results were not reported because they address different hypotheses. Studies 2 - 5: Generally, these were studies that pursued multiple objectives. We focused reporting on relevant variables while trying to give a methodologically accurate account of procedures. In the published JPSP article we briefly allude to the fact that participants rated their own and others' (unless a confederate) behavior and completed a questionnaire that assessed general reactions to the discussion (provided by William Fleeson from his original acting extraverted study). More details are available for Study 2 & Study 3, where behavior results are reported in Zelenski, Santoro, & Whelan (2012) (as noted in our article). Very similar behavior questionnaires were included in Studies 4 & 5. Also, we note that Studies 2-5 were conducted in the order reported. Study 2: Participants also completed a Stroop test following the group discussion (reported in Zelenski et al., 2012). Study 3: Participants also completed a Stroop test following the interview. They also provided a drop of blood analyzed for glucose near the beginning of the study and just before the Stroop test (reported in Zelenski et al., 2012). Study 4: Participants completed an (unreported) mood questionnaire at the beginning of the session. (This omission from the procedure section was an oversight.) After the interaction (and data reported in our JPSP paper), confederates rated participants' affect and behavior, participants completed an exploratory implicit affect measure (Quirin, Kazen and Kuhl, 2009), and participants played the game 'Operation' where we recorded time to completion and errors. However, the task was implemented with significant variation, there were missing data due to malfunctions, and errors were difficult to record accurately. Study 5: Separate from and before the experimental session, but explicitly connected with it, participants completed additional personality measures of BIS/BAS, authenticity, and ideal self. Following the group interaction (and thus data presented in the JPSP paper), participants rated their state authenticity and effort, and completed a Stroop task. As the most recent study in the package, we anticipate reporting on some or all of these additional measures in future publications.
  4. Sample Size: Across studies 1-4, we determined rough goals for sample sizes, but they were ultimately largely determined by how much data we could collect in a given term. If sample size was far from those goals, data collection continued to another term. Loose consideration of power and a local rule of thumb suggested that sample sizes of about 110 gave a reasonable chance of detecting moderately sized personality x situation interactions. In no case did we decide to stop or continue collecting data based on the results of analyses. As part of D. Whelan's dissertation, Study 5 included some more formal power analyses prior to data collection (with regard to hypotheses beyond those reported in JPSP, however), but pragmatic issues of recruitment within a term still played a role.
May 2013Lucas, Lawless et al.Does life seem better on a sunny day? Examining the association between daily weather conditions and life satisfaction judgments.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure (N/A; no experimental conditions)
  3. Measures: Dataset used had many different questions, but only one was of interest to us, the single-item life satisfaction question we analyzed. We did provide a reference to the broader dataset that lists all questions, and we did not try running our analyses using any other measures before writing up the study. There were other weather variables that could have been analyzed; we used those that were most relevant for judging whether the day was a pleasant weather day or not. We did not download any additional weather variables from our sources that were analyzed but not reported.
  4. Sample Size: We used existing data and used all participants who we could match with weather conditions on the day of the survey.
Silvestrini, Gendolla et al.Automatic effort mobilization and the principle of resource conservation: One can only prime the possible and justified.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Actually, the answer is “yes”. However, our impedance cardiograph system (Cardioscreen 1000, Medis, Ilmenau, Germany) automatically assesses various measures of cardiac functioning that we don't analyze because they are not related to our predictions (e.g., stroke volume or left ventricular ejection time). For the pre-ejection period, our main effort-related cardiac measure, and heart rate, we don't use the automatic assessment of the Cardioscreen system but process the raw data with lab-made software specifically designed to allow visual inspection of the individual cardiac cycles and of the B-point location, as reported in the methods section. That is, the measurement device automatically assesses several indices of cardiac functioning, but we neither measured them actively nor analyzed them.
  4. Sample Size: For the first study, we determined the sample size according to the sample sizes of our previous studies (at least 10 valid participants per cell). For the second study, we determined our sample size according to the recently published article of Simmons, Nelson, and Simonsohn (2011), which recommends at least 20 participants per cell. Most relevant, we didn't analyze the data before the data collection was completed and we didn't add participants after the original data collection.
Todd, Burgmer et al.Perspective taking and automatic intergroup evaluation change: Testing an associative self-anchoring account.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Experiment 5 (two-session study) contained some exploratory measures as part of a pilot study for a different project to fill the remainder of the time allotted for the second session. Also, our initial submission contained an additional study; however, in the course of the review process, that study was replaced with what is now Experiment 4 (details about the dropped study are available from the first author).
  4. Sample Size: Sample sizes were based on prior work in this area, with the additional stipulation of running at least 20-25 participants per cell (as per Simmons et al., 2011). Experiment 5 had a target of 30-35 per cell (as is now typical in the first author’s lab). Data collection stopped once the target sample size had been reached or the remaining weekly lab sessions dedicated to these studies had been run.
Apr 2013Aknin, Barrington-Leigh et al.Prosocial spending and well-being: Cross-cultural evidence for a psychological universal.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full disclosure for all studies except for Study 1. Study 1 used data collected by the Gallup World Poll, which contains hundreds of questions. We reported all questions of interest and directed interested readers to the long list of items used.
  4. Sample Size: We recruited as many participants as possible in a given term or until study funds ran out.  
Shallcross, Ford et al.Getting better with age: The relationship between age, acceptance, and negative affect.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure (N/A: Our study did not involve any experiments/manipulations.)
  3. Measures: We only reported measures that pertained to our research question. The manuscript states that data reported were part of a larger research project. 
  4. Sample Size: (a) Our sample size was determined based on power analyses that were conducted to address a different research question than was the focus of this manuscript. (b) We discontinued data collection when this sample size was achieved.
Van den Akker, Dekovic et al.Personality types in childhood: Relations to latent trajectory classes of problem behavior and overreactive parenting across the transition into adolescence.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure (N/A; this is a longitudinal non-experimental study)
  3. Measures: Unreported measures were not related to research question
  4. Sample Size: The sample size was decided at the time that the study started (1999), based on power analysis, expected attrition, and practical concerns (available funding).
Mar 2013Van Dillen, Papies, & HofmannTurning a blind eye to temptation: How cognitive load can facilitate self-regulation.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: For participant selection purposes we implemented two additional measures (Body Mass Index and Dietary Restraint Questionnaire), but given that we observed no extreme scores on these measures which could have affected our results, we have not used them as a basis for excluding participants.
  4. Sample Size: We decided ahead of time on the minimum sample size and stopped recruiting participants when this number was reached.
Yang, Wu, Zhou et al.Diverging effects of clean versus dirty money on attitudes, values, and interpersonal behavior.
  1. Exclusions: Data from one participant in Experiment 4 were excluded because she left before finishing the tasks; 3 participants in Experiment 5 were excluded for answering their cell phones during the experiment; and 4 participants in Experiment 6 were excluded for skipping some trials during the main dependent measure.
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Sample sizes were determined based on pilot studies or previous studies and power analysis. The data-collection stopping rule was decided ahead of time, and we stopped when that number was reached.
Feb 2013Conway & GawronskiDeontological and utilitarian inclinations in moral decision making: A process dissociation approach.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Study 1 included exploratory measures of five additional constructs for which we did not have a priori hypotheses (administered at the end of the study). Because these measures failed to show any significant relations to the constructs of interest (i.e., deontology/utilitarianism), they were not reported in the final paper. Full disclosure for Studies 2 and 3.
  4. Sample Size: The sample size was determined a priori on the basis of previous work in this area. The data were collected in one shot until we attained the desired sample for all studies. 
Gurven, von Rueden, Massenkoff et al.How universal is the Big Five? Testing the five-factor model of personality variation among forager-farmers in the Bolivian Amazon.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: The personality interviews were administered as part of a larger series of interviews (demographic, economic, etc.) and medical checkup we conduct among the Tsimane. Subjects are seen once a year by anthropologists and physicians affiliated with the Tsimane Health and Life History Project (http://www.unm.edu/~tsimane/). We average an 85% participation rate among Tsimane adults in the villages that are part of the THLHP. The sample size of our personality interviews was not pre-determined. Our self-report sample (n=632) resulted from all adults who participated in our project during 2009 and 2010 and who agreed to take the personality questionnaire.  For the first half of 2009, our clinic was mobile and traveled to each of the 28 villages then part of the THLHP. Subsequently, our clinic was located in a fixed position and we transported Tsimane adults from those 28 villages (over 40 years of age) to our clinic on a village by village basis (in addition to their immediate family members and any other village residents reporting injury or illness).  We targeted older individuals for our fixed clinic given the larger goals of the THLHP to investigate senescence in a traditional population. Thus, most of the individuals in our self-report sample under the age of 40 were interviewed at their village in the first half of 2009. During the year 2010, we administered the personality questionnaire to any adults visiting our clinic who did not complete the questionnaire the prior year.  At that point we deemed the sample size (n=632) and representativeness of the sample in terms of age, sex, village affiliation, education, etc. to be sufficiently robust. The spouse-report sample (n=430) was obtained the same way at our fixed clinic during 2011 and 2012.  No data analysis (with respect to the principal results of the paper) was conducted prior to collection of the full datasets.
Peetz & WilsonThe post-birthday world: Consequences of temporal landmarks for temporal self-appraisal and motivation.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Studies 4, 5, and 6 were the earlier studies conducted. To explore potential DVs,  we included additional measures and exploratory items in these studies. Most DVs showed the same pattern but in reporting the studies, we focused on reporting those items that were assessed across all studies for simplicity.
  4. Sample Size: In Study 1, sample size was determined by inviting every student in an introductory psychology class present on a given day to participate (paper and pencil format). In Studies 2a-c and 3b, we posted an estimated number of participant slots on MTurk, aiming for around 30 participants per condition, accounting for data loss/gain that results from typical discrepancies between MTurk's reported number of hits accepted and the actual numbers of participants that completed the study. In Study 3a, sample size was determined by collecting 30 participants per condition (paper and pencil format). In Studies 4 and 6, we emailed a subset of potential respondents from an email list, aiming for around 25 participants per condition. Once the responses stopped coming in (fewer than two responses a day), we closed the study. In Study 5, sample size was determined by having a day of data collection assigned to this study and one other study. Three research assistants collected as many people as possible on that day for the two studies.
Sah, Loewenstein, & CainThe burden of disclosure: Increased compliance with distrusted advice.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Our minimum sample size was approximately 30 per cell and the research assistants collected as much data as they could on data collection days.
Syed & Seiffge-KrenkePersonality development from adolescence to emerging adulthood: Linking trajectories of ego development to the family context and identity formation.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: This analysis made use of a very large 10-year longitudinal study that contained many measures. We only included measures that were relevant to our research questions. 
  4. Sample Size: There was no stopping rule. We recruited as many participants as we could given the resources available. 
Jan 2013Bélanger, Lafrenière, Vallerand et al.Driven by fear: The effect of success and failure information on passionate individuals' performance.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: We measured an additional potential mediator (self-clarity). We did not find any significant correlation with this measure, and its reliability wasn't great either (alpha < .70); therefore, we decided not to discuss this measure.
  4. Sample Size: Sample size was determined based on our experience with this research topic. We stopped data collection at the end of the semester; this was decided a priori. Lastly, we did not conduct statistical tests until the end of data collection.
Cheng, Tracy, Foulsham et al.Two ways to the top: Evidence that dominance and prestige are distinct yet viable avenues to social rank and influence.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: We excluded measures that are unrelated to the present research question, but that were collected as part of a larger study.
  4. Sample Size: Data collection was terminated at the end of term when our team of research assistants could no longer conduct study sessions.
Meissner & RothermundEstimating the contributions of associations and recoding in the Implicit Association Test: The ReAL model for the IAT.
  1. Exclusions: Full Disclosure
  2. Conditions: In Experiment 7, we manipulated cognitive capacity during the behavioral measure. While half of the participants had to memorize a 2-digit number, the other half had to keep an 8-digit number in mind. We hoped that the latter would show stronger correlations between the association parameters and the behavioral measures. However, this was not the case (nonsignificant moderation). Furthermore, a manipulation check revealed no significant group differences. Actually, the pattern suggested that both experimental groups had been under (equally) high cognitive load. Full Disclosure for Experiments 1 to 6.
  3. Measures: Subsequent to the measures reported in the article, we collected additional questionnaire measures in Experiments 5, 6, and 7. As these items were unrelated to the research question, they were not included in the analysis.
  4. Sample Size: The sample size was determined a priori, based on prior studies in the research area as well as considerations of the maximum sample size that could be achieved within the period of data collection. Data analysis started after data collection had been completed.
Vohs, Park, & SchmeichelSelf-affirmation can enable goal disengagement.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: We included a thought perception measure in experiment 2 that was not germane to the present investigation.
  4. Sample Size: The stopping rules for Exps 1 and 4 were simple: the data came only from participants in certain courses and there were only a set number of students in those courses at the time. Exps 2-3 used a stopping rule of obtaining as many participants as we could during the course of the semester.
Vorauer & SucharynaPotential negative effects of perspective-taking efforts in the context of close relationships: Increased bias and reduced satisfaction.
  1. Exclusions: Full Disclosure (In footnote 2 we reported the criteria for exclusion, but we did not report the numbers excluded. Footnote 2 states: "Throughout the three studies reported in this paper, participant numbers do not include participants who failed to follow instructions (e.g., did not complete the manipulation or completed the measures incorrectly) or for whom data on entire measures were missing." The actual numbers of participants excluded were 16 in Study 1, 15 (pairs) in Study 2, and 1 (pair) in Study 3. We had much better control over data collection in Study 3 because only one pair participated (in person) at a time. In the online and group session studies (preferred by participants with less strong English language skills) participants were more apt to find the transparency measures confusing and were able to skip the manipulation if they so chose)
  2. Conditions: Full Disclosure
  3. Measures: We included some additional items on an exploratory basis that were not of central importance to the focal research question in the paper. We also asked additional background questions about participants' relationship with their acquaintance/friend/partner for the purpose of understanding our samples.
  4. Sample Size: In each study we determined a priori how many participants would be sufficient and feasible to run. We did not analyze or even enter the data into SPSS until we had come as close to our target N as we could and stopped data collection.
Dec 2012Chan, Allik, Nakazato et al.Stereotypes of age differences in personality traits: Universal and accurate
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Data were collected as part of a larger battery in 26 countries; other administered measures were previously published (e.g. De Fruyt et al., 2009, Assessment). Three additional NCS items were examined and discarded before analyses of age stereotypes began.
  4. Sample Size: Sample size determination was reported in the first publication of the larger study (De Fruyt et al., 2009, Assessment). Collaborators from each site were requested to collect a minimum sample size of 50 M and 50 F participants.
Galak, LeBoeuf, Nelson et al.Correcting the past: Failures to replicate psi
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Full Disclosure
Kochanska, Kim, & Koenig NordlingChallenging circumstances moderate the links between mothers' personality traits and their parenting in low-income families with young children
  1. Exclusions: Full Disclosure (One mother (out of 186) did not provide any questionnaire data, and one more did not provide self-reported power assertion data. Behavioral data are complete)
  2. Conditions: There were no experimental conditions that were relevant at the time (all those measures were collected during pretest in a large intervention study, prior to any randomization to conditions). At that point, it was simply one group of mothers.
  3. Measures: Multiple measures are collected in this very large research program (a large longitudinal study). We reported the measures pertinent to the questions asked in the paper.
  4. Sample Size: The sample size was proposed in the original NIH grant application, subject to further discussions with Program Officers (e.g., budget constraints).  We collected all the data we had promised. We never "stopped" collecting data for any reasons. 
Lemay, Overall, & ClarkExperiences and interpersonal consequences of hurt feelings and anger
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Measures were not related to the research question, and there were too many items to report all of them.
  4. Sample Size: We decided ahead of time to collect data until a minimum sample size was achieved, and this rule was followed.
Paluck & ShepherdThe salience of social referents: A field experiment on collective norms and harassment behavior in a school social network
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: We measured a few more aspects of the social network, and these seemed to add too much to the length of the survey; we received few responses on these questions in the first survey wave (they were some of the final survey questions; also one of them was a "negative" network question about conflict that people didn't seem to want to answer) so we subsequently dropped these questions in our later survey waves. We also tried out some questions about "narrative" that did not seem to be responsive to our intervention--two questions that we added to waves 2 and 3 of our survey that reflected some oft-repeated phrases that students used during their intervention assembly. Finally, we measured a few other items that we are reserving for future papers, one on girl vs. boy norms (we are writing a gender paper on that now), and some aspects of students' cultural tastes (e.g. music, internet use), for a potential future paper on the diffusion of taste in the network.
  4. Sample Size: Full Disclosure (we simply measured the whole network)
Sullivan, Landau, Kay et al.Collectivism and the meaning of suffering
  1. Exclusions: Full Disclosure
  2. Conditions: Study 3 originally included a control condition, in which we asked participants to focus on neutral aspects of a story (rather than on singular pronouns, to prime individualism, or plural pronouns, to prime collectivism). Whereas our individualism/collectivism primes were based on prior research and have been repeatedly validated, the additional control condition was our own creation and previously untested. Our control condition did not significantly differ from either individualism or collectivism conditions on our dependent measure. We removed the participants who were assigned to this condition, because their responses did not seem to illuminate the question at hand (the differences between individualism and collectivism), and because it is not conventional in research priming individualism and collectivism to include a control condition.
  3. Measures: In Study 2 we included a belief in a just world scale (based on Lipkus, 1991). Our prime of individualism-collectivism did not have an effect on this scale. We originally reported this finding, but the editor and a reviewer asked that we remove it to streamline our presentation (or to research this particular issue more thoroughly; since it was not our priority in this project, we chose the former option).
  4. Sample Size: Based on a power analysis, assuming a medium effect, we sought to obtain at least 20-25 participants per condition across the studies. In some instances we collected more participants to better ensure a robust effect. The editor asked us to explain our procedures in correspondence but did not require that we report them in the text.
Nov 2012Locke, Craig, Baik et al. Binds and bounds of communion: Effects of interpersonal values on assumed similarity of self and others
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: In the cross-cultural studies (studies 3-4) we included a few exploratory questions (concerning knowledge of the other country and imagined experiences as an exchange student in the other country) but did not mention them because they were unrelated to our hypotheses.
  4. Sample Size: Collected data until we obtained a predetermined sample size. (In Study 5 we kept collecting data after exceeding that sample size for reasons completely unrelated to the study.)
Oct 2012Becker The system-stabilizing role of identity management strategies: Social creativity can undermine collective action for social change.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: I also measured identification in all studies but did not report the full results due to a reviewer's request. However I mentioned some brief results in footnotes.
  4. Sample Size: Student assistants were instructed that they need at least N=20 participants per experimental condition. I used the full data set that student assistants collected within a given time.
Sep 2012Federico, Deason, & FisherIdeological asymmetry in the relationship between epistemic motivation and political attitudes
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Not all assessed measures were relevant to this study because data were from a large national survey addressing several research questions.
  4. Sample Size: We simply tried to go for the largest sample size possible within our budget; we contracted for a sample of N = 1500 and received a sample of that size from Knowledge Networks. No decisions were made along the way by us about when to terminate and no analysis was done until we received the full dataset. The sample size was well within the range needed in order to sufficiently power the analyses we did.
Lilienfeld, Waldman, Landfield et al. Fearless dominance and the U.S. presidency: Implications of psychopathic personality traits for successful and unsuccessful political leadership
  1. Exclusions: Full Disclosure (We reported all observations that were central tests of our original hypotheses.)
  2. Conditions: Full Disclosure
  3. Measures: We examined other personality measures that we didn't report but these weren't relevant to our hypotheses (and we didn't analyze these).
  4. Sample Size: Full Disclosure
Shiota & LevensonTurn down the volume or change the channel? Emotional effects of detached versus positive reappraisal.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: A battery of trait questionnaire measures completed prior to the study, along with a few additional thought-content and facial-expression measures, was not reported because these were not relevant to the research question addressed in the paper.
  4. Sample Size: Original target sample size for the entire study (including a 3rd experimental reported in the paper but not used in analyses) was 252 based on power analyses for the grant supporting the research. Final sample size for the entire study was 223 with data collection ended due to departure of the project director (Shiota) for a new job.
Aug 2012Cohrs, Kämpfe-Hargrave & RiemannIndividual differences in ideological attitudes and prejudice: Evidence from peer-report data
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure (non-experimental research)
  3. Measures: Analyses were of secondary nature; data stemmed from large-scale surveys involving numerous measures unrelated to the research questions
  4. Sample Size: Sample sizes were determined by practical issues of data collection; for Study 2, we refer to a publication describing the complete data collection procedure in detail (Stößel et al., 2006)
Hill, Rodeheffer, Griskevicius et al.Boosting beauty in an economic decline: Mating, spending, and the lipstick effect
  1. Exclusions: When we do studies on mate attraction (the current research), we routinely remove Ps who are not heterosexual and Ps who are married. Although we screen Ps for this ahead of time, we inevitably get a few who register for the experiments anyway. This information was included in our sample descriptions for most of the experiments, but it wasn't mentioned in all of them. This was simply an oversight on our part.
  2. Conditions: Full Disclosure
  3. Measures: Each experiment included some variables measuring potential mediators (i.e., process variables) and moderators (e.g., SOI) which we did not report. The last two of the experiments also included measures that were used to pilot a new study that we were developing.
  4. Sample Size: We typically go with the conventional n = 30 within each experimental cell. Because we are at the mercy of our research Ps to show up to experiments for which they are registered & not show up at experiments for which they are not, sometimes we get somewhat more than this & sometimes we get less.
Smillie, Cooper, Wilt et al.Do extraverts get more bang for the buck? Refining the affective-reactivity hypothesis of extraversion.
  1. Exclusions: Two participants were omitted from study 4 as they were disproportionately older than the rest of the sample and one reviewer was not happy about this (although this had no effect on the results).
  2. Conditions: Full Disclosure
  3. Measures: Additional measures were assessed but not reported because they were not related to the research question.
  4. Sample Size: We aimed for approximately 30-40 participants in each condition as a rough rule of thumb and the exact point of termination was decided according to convenience (e.g. once the student research participation period had ended for the year).
Jul 2012Aviezer, Trope, & TodorovHolistic person processing: Faces with bodies tell the whole story.
  1. Exclusions: Full Disclosure
  2. Conditions: One experiment was removed from the analysis following an editorial suggestion because the data showed a more complex picture. It was, however, by no means a failed manipulation. It is being prepared for publication in a different outlet.
  3. Measures: Full Disclosure
  4. Sample Size: In this paradigm (face in body context) I typically run up to 20 subjects per condition. This is based on several previous studies showing that an n of 15-20 is sufficient for reliably detecting the effect (it is a robust effect). We used the same rule of thumb here. I instructed the RA that we should run between 20-30 subjects and data collection stopped when the RA told me that we are in the range that we set for data collection. Data analysis only took place after all the data was collected, so we never stopped running as a function of reaching significance or particular data patterns.
Ermer, Kahn, Salovey et al.Emotional intelligence in incarcerated men with psychopathic traits.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: A variety of neuroimaging, personality, genetic, and neuropsychological measures were also assessed from participants. We only discussed measures related to the research question addressed in this paper.
  4. Sample Size: We used all available MSCEIT (emotional intelligence) data collected over a three year period (except for reported exclusions). Original sample sizes for the NIH grants were determined based on power analyses to detect the effects of interest for the grant's specific aims (largely focused on brain imaging research questions). However, we ended up collecting as many subjects as possible given the funds awarded.
Jun 2012Bergsieker, Leslie, Constantine et al.Stereotyping by omission: Eliminate the negative, accentuate the positive
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure (though there was a partially successful very similar parallel study with manipulated self-presentation concerns that didn't make it into the final paper, but which is reported elsewhere: Fiske, Bergsieker, Russell, & Williams, 2009)
  3. Measures: Other measures assessed but were not related to the research question
  4. Sample Size: Ran the max # of Ps available to us in the rather limited Princeton participant pool each term
Carter & GilovichI am what I do not what I have: The differential centrality of experiential and material purchases to the self.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Standard demographic variables (e.g. major or GPA) might have been assessed but weren't reported because they were not relevant to the research question.
  4. Sample Size: We would never take a first look at the data before hitting a rather high threshold for sample size (at *least* 25 per cell), and if we decided to run additional participants we would run until we hit a new target N, which was typically some multiple of the initial sample. In the event that I looked at the data before we hit that target (to make sure there were no problems with the survey), I did not stop data collection until we hit the target N, even if the data were already significant.
Danziger, Montal, & BarkanIdealistic advice and pragmatic choice: A psychological distance account.
  1. Exclusions: Full Disclosure
  2. Conditions: Throughout the course of the project we first tested several scenarios comparing choice and advice. Many of them yielded significant differences, some did not. We then focused on some of the scenarios that yielded significant differences, as reported in the accepted version of the manuscript.
  3. Measures: Full Disclosure
  4. Sample Size: Our decision was based on general conventions (central limit sample size vs. effect size) as well as on ease of recruiting subjects for the study and the number of conditions in the study. Studies ranged from ~40 to ~60 subjects per condition. In studies with 4 conditions we had slightly fewer subjects per condition than in studies with 2 conditions. In Study 5, in which there was a confederate, we had slightly fewer subjects per condition (~20). Pretests had fewer subjects.
Mumenthaler & SanderSocial appraisal influences recognition of emotions
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We defined the sample size in both experiments before collecting data, based on similar experiments and the availability of participants in our subject pool. Data analyses were performed once the minimum sample size was achieved.
Rothschild, Landau, Sullivan et al.A dual-motive model of scapegoating: Displacing blame to reduce guilt or increase control.
  1. Exclusions: 26 participants were excluded from Study 2. Of these excluded cases, 11 were excluded for indicating that they were over 25 years old. Although we tried to specifically recruit participants between the ages of 18 and 25 years, 11 participants outside that age range ended up participating anyway. Given that our manipulation specifically targeted young people ages 18-25, we only included participants that fell within that age range. The remaining 15 cases were excluded for failing an attention check question that followed the article that comprised our threat manipulation, which read "According to the article you just read, who is the primary cause of climate change? a) young people (ages 18-25), b) middle-aged people (ages 35-55), c) older people (ages 60-80), d) the primary cause of climate change is unknown." 16 participants were excluded from Study 3. Of these excluded cases, 5 were excluded for indicating that they were over 25 years old. The remaining 11 participants were excluded for failing the same attention check question used in Study 2.
  2. Conditions: Full Disclosure
  3. Measures: We did not report filler questions unrelated to our research questions that were included to support our cover story.
  4. Sample Size: For Study 1 we ran a power analysis assuming a medium to large effect size that indicated an optimal sample size of 111 participants. We gave 6 RAs 20 questionnaires each to hand out around campus. Our RAs returned with 114 questionnaires. For Study 2 we performed a power analysis assuming a large effect size that indicated an optimal sample size of 87 participants. Although we reached this target by the end of the semester, because we had to exclude 26 cases we ended up with only 61 participants in our final analysis. For Study 3 we were shooting for at least 20 participants per cell. We stopped collecting data once we had reached 80 participants. However, because 16 cases ended up being excluded from data analysis we ended up with 64 participants.
Shu & GinoSweeping dishonesty under the rug: How unethical actions lead to forgetting of moral rules
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Full Disclosure
May 2012Briñol, McCaslin, & PettySelf-generated persuasion: Effects of the target and direction of arguments
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: We excluded any items that were unrelated to the research questions or that were included for exploratory purposes. Furthermore we focused on the items that were included in all the studies within the paper in order to maintain convergence across experiments.
  4. Sample Size: The number selected was based on our prior experience with this research topic and the number of participants that could be successfully recruited within an academic term (without crossing terms). Also, given that not all participants who signed up for the experiments in advance showed up to participate, the total number of subjects per cell was not identical in all cases. Also, we did not conduct any statistical tests until we were done collecting data.
Apr 2012Kim, Schimmack, & OishiCultural differences in self- and other-evaluations and well-being: A study of European and Asian Canadians. 
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure (Not Applicable. Our research did not involve an experimental design. )
  3. Measures: The data sets included other measures (e.g., importance ratings of different domains in life, gratitude questions, etc.). Including the measure(s) would have made the model too complex.
  4. Sample Size: We did not determine sample size ahead of time, but our sample size was very large relative to other studies. We had sufficient power to detect small effects (r = .1, N = 780, power = 80%). The sample was based on five data sets, of which one was collected in 2010 until the end of the term.
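As a rough illustration (not part of the authors' disclosure), the quoted power figure can be checked with the usual normal approximation based on Fisher's r-to-z transformation, assuming a two-tailed test of a zero correlation at alpha = .05; under those assumptions it comes out at roughly 80% for r = .10 and N = 780.

```python
# Rough check of the quoted figure (r = .10, N = 780, two-tailed alpha = .05),
# using the normal approximation via Fisher's r-to-z transformation.
from math import atanh, sqrt
from scipy.stats import norm

r, n, alpha = 0.10, 780, 0.05
z_effect = atanh(r)                 # Fisher-transformed correlation
se = 1.0 / sqrt(n - 3)              # approximate standard error of z
z_crit = norm.ppf(1 - alpha / 2)    # two-tailed critical value (about 1.96)

# Power = probability of exceeding the critical value when the true correlation is r
power = norm.sf(z_crit - z_effect / se) + norm.cdf(-z_crit - z_effect / se)
print(f"approximate power = {power:.2f}")  # about 0.80
```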
Malle & HolbrookIs there a hierarchy of social inferences? The likelihood and speed of inferring intentionality, mind, and personality
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: The first study's sample size was determined by a rough power analysis on the basis of preliminary dissertation studies (which were underpowered but showed the general pattern), but we fell short of the target of 50 at the end of the semester. Sampling for the second, third, and fourth studies ended when just above the target of 50. For the fifth study we targeted more, around 70, but I can't recall the exact number. We expected more missing values and greater error variance because of the changed procedure.
Mar 2012Sullivan, Landau, Branscombe et al.Competitive victimhood as a response to accusations of ingroup harm doing
  1. Exclusions: Full Disclosure
  2. Conditions: Study 4 originally included an additional control condition, which claimed that neither Black Americans nor White Americans experience any discrimination in university admissions. This condition did not significantly differ from the other two conditions. Upon reflection we realized that this condition was ineffective because our outcome measure asked participants to compare the amount of discrimination their group experiences to that of the outgroup. Whereas in the threat condition participants increased their rating as a function of defensive processes (as evidenced by the condition’s differentiation from the other comparison condition, and by our mediational finding), in the control condition participants may have rated the groups as equal due to non-defensive processes. We dropped the condition due to its irrelevance to the overall project and its lack of significant differentiation.
  3. Measures: In Studies 2 and 4 we included items measuring the extent to which participants engaged in competitive victimhood with outgroups other than the group they were accused of harming. In one case there was an effect of condition on these items, in the other there was not. Based on these results and the results of other studies not included in this paper, it seems to be the case that the order of measurement makes the difference. When participants engage in competitive victimhood with the relevant outgroup first, condition has no effect on ratings of other outgroups. However, since we never tested the order effect in a full factorial design, we decided not to report these inconclusive findings.
  4. Sample Size: Based on past convention, we sought to obtain at least 15 participants in each condition. Once this was obtained, we would continue to collect data until the end of the semester. In Study 3 (a one-shot online sample), we simply collected a number of participants that seemed more than sufficient (approx. 150)
Zhou, He, Yang et al.Control deprivation and styles of thinking
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: We also assessed other measures at the end of the experiment but did not report these as they were meant for pretesting for another experiment.
  4. Sample Size: Sample sizes were predetermined by pilot studies.
Feb 2012Wohl, Hornsey, & BennettWhy group apologies succeed and fail: Intergroup forgiveness and the role of primary and secondary emotions
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: In the first two studies we included other measures (e.g. collective angst and SDO) for exploratory purposes. These were unrelated to the main hypothesis and were included to see if the DVs would respond to the variations in primary and secondary emotions for future studies. In subsequent studies in the paper we included measures of perceived ulterior motives, response satisfaction, and the perceived likelihood that the transgressor would re-offend. In the last study we included a measure assessing perceptions of the Canadian Military as a pilot for work that a person in Matthew Hornsey's lab was doing on dehumanization of the military.
  4. Sample Size: Number of participants was based on providing sufficient power (1 - β = .80) to detect a moderate effect size d = .5 as significant (p < .05 two-tailed; Cohen 1988). My rule of thumb to get this is to have 30 participants per cell in an experiment. Data collection continued until we could reach at least that number or until participation dropped off. For the studies in which we were assessing more complex models (moderated mediation) I like to up that number to between 40-50. As can be seen in the manuscript we had over 150 participants in studies where more complex models were tested. Analysis did not occur until we stopped collecting data.
Jan 2012Feinberg, Willer, & Keltner Flustered and faithful: Embarrassment as a signal of prosociality
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: We also conducted a study displaying a picture of President Obama expressing an ambiguous emotion with a caption indicating he was expressing embarrassment or amusement. When labeled as embarrassed, the president was rated as more prosocial. This study was included in the initial submission, but reviewers suggested results had confusing implications for our theory, so it was removed.
  4. Sample Size: Study 1a: Without prior precedent, we intended to run about 50. 57 participants had taken part when the data-collection week ended. Study 1b: Research assistants were directed to continually advertise the survey online for a week with the hope of collecting about 40 participants, a target based on Study 1a results. Study 2: Participants were recruited as part of a larger recruitment block. The sample size was a product of the number of participants who signed up during the block. We estimated 80-100 would sign up during this time, a total we thought would offer enough statistical power to find a significant result if one existed. Studies 3 & 4: Participants were recruited as part of two online surveys that involved the reported study questions and batteries that served as “prescreening” for unrelated lab studies (our items appeared first). Sample sizes were a product of the number of participants who took the surveys. Study 5: We used a rule of thumb of 15-20 participants per condition. Because the study was very labor-intensive – employing an experimenter and trained confederates – we looked at the data after approximately 15 participants to check the progress of the study. Results were promising so we continued collecting data until we reached the target sample range.
Johnson, Freeman, & PaukerRace is gendered: How covarying phenotypes and stereotypes bias sex categorization.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Data collection occurred for a pre-specified period of time and data from all participants who completed the study during that period were included in analyses.
Laurin, Kay, & FitzsimonsDivergent effects of activating thoughts of God on self-regulation
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure (with the exception of some demographic information such as country of birth)
  4. Sample Size: During the time at which these studies were conducted, our typical approach was to simply aim for 15-20 observations per cell, and to analyze the data at the end of the semester if the sample size had reached or nearly reached that threshold.
Oishi, Miao, Koo et al.Residential mobility breeds familiarity-seeking
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: We assessed other measures (e.g. 5-item satisfaction with life scale: SWLS) but did not report them as they weren't relevant to our hypotheses.
  4. Sample Size: The basis for terminating data collection was that we set out to collect x participants and terminated when that goal was reached. We never peeked at the data before terminating the data collection.
de Haan, Deković, & PrinzieLongitudinal impact of parental and adolescent personality on parenting
  1. Exclusions: Full Disclosure
  2. Conditions: N/A (non-experimental study)
  3. Measures: Measures not related to research question were not reported.
  4. Sample Size: We aimed for the largest sample size possible within our budget. The desired sample size was decided ahead of time, and the data collection termination rule was not modified during the course of data collection.
Issue | Authors | Article Title
Nov 2013Allen & GabbertExogenous social identity cues differentially affect the dynamic tracking of individual target faces
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: We tried to present an IAT but the programme malfunctioned.
  4. Sample Size: We reported the sample size (48 participants) but not the “stopping” criterion. This was a between-subjects experiment and therefore had 24 participants in each group. From past experience of our own and other work, this was determined to be a satisfactory number.
Finn & RoedigerInterfering effects of retrieval in learning new information
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We decided ahead of time to collect data until a minimum sample size was achieved, and this was followed.
KleinmanResolving semantic interference during word production requires central attention
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: In my lab, it is standard practice to run 48 subjects (or as close to it as possible given counterbalancing constraints) in experiments that are expected to have reasonable statistical power. As a result, I ran 48 subjects in Experiment 1 (which needed a number of subjects that was a multiple of 6) and 40 subjects in Experiment 2 (which needed a number of subjects that was a multiple of 20). These Ns were determined ahead of time.
Richard & WallerToward a definition of intrinsic axes: The effect of orthogonality and symmetry on the preferred direction of spatial memory
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We determined the sample size based on previous experiments from our research group. Once this goal was reached, we stopped data collection. 
Smith, Roediger, KarpickeCovert retrieval practice benefits retention as much as overt retrieval practice
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: We included a cued recall test at the end of Experiments 1, 2, and 3. The measure showed the same pattern of results as the free recall test and was contaminated because of the previous free recall test so we eliminated it.
  4. Sample Size: We determined how many subjects we would run before the experiment was run based on the expected effect size and the number of counterbalancing versions we needed to run. We stopped when we reached this sample size.
Sep 2013MatthewsRelatively random: Context effects on perceived randomness and predicted outcomes.
  1. Exclusions: Full Disclosure: See footnote 1 of the paper. The first step was to screen for duplicate ip addresses. In Experiment 1, there were no duplicates. However (as reported in footnote 1), 19 respondents in the raw data file downloaded from the on-line survey software did not have a full set of responses. In Experiment 2, there were 447 entries in the raw data file. A total of 11 were excluded because the ip address had appeared in Experiment 1 or earlier in Experiment 2 (if the timestamp for two occurrences of an ip within Experiment 2 overlapped, both instances were excluded). Of the remaining respondents, three were excluded because they did not provide a full set of responses.  All of these exclusions were prior to data analysis.
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Sample sizes were based on power calculations, partly using estimated effect sizes from similar previous studies/pilot work. Both experiments were very highly powered given the anticipated (and observed) effect sizes. There was no optional stopping; samples of a given size were requested from the crowdsourcing platform used to administer/distribute the tasks, and data collection proceeded until that sample was reached. (In fact, final samples are usually slightly larger than requested because the platform “overshoots”.)
McDaniel, Fadler, PashlerEffects of spaced versus massed training in function learning.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Sample sizes were determined from much previous work we had conducted with the basic task (function learning) as well as initial pilot work on the specific issue (massed versus spaced training).    Our initial sample sizes (in the first preliminary experiment) were sensitive to the training effects, so we used those as our target sample sizes for the subsequent experiments and stopped collecting data when we reached those levels.
Jul 2013 Fitzsimmons, Drieghe et al.How fast can predictability influence word skipping during reading?
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Data collection stopped when we knew from previous experience that we had sufficient statistical power.
Markovits, Brunet et al.Direct evidence for a dual process model of deductive inference.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Decided the appx. N beforehand.
Rerko, Oberauer et al.Focused, unfocused, and defocused information in working memory.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: For all experiments, we determined the sample size needed by estimating the size of the to-be-expected effect and the sample size needed by experience from previous experiments. The sample size was determined before starting data collection. For one of our experiments, we obtained ambiguous data after reaching the planned sample size (n = 24). We therefore decided to collect additional data (+n = 12). Which aspects of the results with the smaller sample were ambiguous can no longer be reconstructed at this point.
Yerramsetti, Marchette et al.Accessibility versus accuracy in retrieving spatial memory: Evidence for suboptimal assumed headings.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We piloted 4 participants to examine how variable their responses were. Based on these figures, we estimated that 30 participants would likely allow us to examine whether there was stability in the responses across the group. These were done with some simple estimates and simulations. Based on pre-determined exclusion criteria (as reported) we stopped running participants when we had 30 usable participants.
Klauer, Singmann et al.Does logic feel good? Testing for intuitive detection of logicality in syllogistic reasoning.
  1. Exclusions: Full Disclosure
  2. Conditions: There was a separate, smaller control group run partially in parallel to data collection for Exp. 4 with similar procedures, but replacing syllogisms by what we called pseudo-syllogisms. These data might have helped to interpret a significant effect in Exp. 4, if any had emerged. Because none emerged, the control study was pointless, and we did not report it; this was one of many measures taken to stay within the strict word limit.
  3. Measures: Full Disclosure
  4. Sample Size: For Exp. 1A, we decided to sample n=20 per group and stuck to that n for Exps. 1B and 1C. For Exp. 3, we expected a null effect in one condition and stepped n up to 30 per group to make a possible null effect somewhat more convincing. We stuck to that n as target for Exp. 2 conducted after Exp. 3. For Exp. 4, we needed a large n to really pin down a null effect observed in Exp. 3, and we felt that n=200 would be sufficiently convincing.
Lowder, Gordon et al.It's hard to offend the college: Effects of sentence structure on figurative-language processing.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: We administered the Author Recognition Test (ART) to all subjects, but did not report this information in the paper.  Administration of the ART was part of a larger-scale study where we are relating patterns of reading time to participants' scores on individual-difference measures.  We plan to report these findings at a later date.
  4. Sample Size: Sample sizes were determined based on previous experiments that have been run in our lab.
Röer, Bell et al.Is the survival-processing memory advantage due to richness of encoding?
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Full Disclosure (A priori power analysis was conducted with G*Power analysis software. Given our sample sizes (218; 102; 100) and α = .05, effects of size f = 0.27 (Experiments 1 and 2) and f = 0.24 (Experiment 3), respectively, could be detected with a probability of 1 - β = .95. Statistical analysis for each experiment was only performed after data collection was complete.)
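A sensitivity analysis of this kind can be reproduced with open-source tools. The sketch below, using Python's statsmodels, solves for the smallest detectable Cohen's f at the reported sample sizes with α = .05 and power = .95; the number of groups is an assumption made for illustration (the designs are not restated here), so the values obtained need not match the reported f exactly.

```python
# Sketch of a G*Power-style sensitivity analysis (k_groups = 2 is assumed).
# Solves for the smallest Cohen's f detectable at a given total N,
# alpha = .05, and power = .95.
from statsmodels.stats.power import FTestAnovaPower

power_solver = FTestAnovaPower()
for n_total in (218, 102, 100):  # total sample sizes reported above
    f = power_solver.solve_power(effect_size=None, nobs=n_total,
                                 alpha=0.05, power=0.95, k_groups=2)
    print(f"N = {n_total}: smallest detectable f ≈ {f:.2f}")
```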
Stewart, Haigh et al.Sensitivity to speaker control in the online comprehension of conditional tips and promises: An eye-tracking study.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: On the basis of similar experiments reported in the literature, we decided ahead of time to collect data from 36 participants.
Tempel, Frings et al.Resolving interference between body movements: Retrieval-induced forgetting of motor sequences.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Counterbalancing of four control factors (2x2x2x2) resulted in 16 combinations, to which two participants each were randomly assigned in Experiment 1. A sample size of 32 should suffice to detect a medium-sized population effect, which we considered to be of interest. In Experiment 2 and Experiment 3, the design comprised one additional between-subjects factor that operationalized a manipulation of interest. Therefore, the samples were twice the size of that in Experiment 1.
Weaver, Arrington et al.The effect of hierarchical task representations on task selection in voluntary task switching.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Sample size was decided prior to data collection and was chosen so that the number of participants in each condition matched that used in previous similar studies.
Jun 2013 O'Malley, Besner et al.Reading aloud: Does previous trial history modulate the joint effects of stimulus quality and word frequency?
  1. Exclusions: The number of observations excluded and the criteria for exclusion were originally reported by O'Malley and Besner (2008: JEPLMC).
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Sample size was determined by previous experiments using similar methods, and was designed to ensure we had an equal number of observations in each counterbalance.
May 2013CalvilloRapid recollection of foresight judgments increases hindsight bias in a memory design.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: I conducted an a priori power analysis based on predicted relationships and effects, but I did not report this in the manuscript.
Gomes, Brainerd et al.Effects of emotional valence and arousal on recollective and nonrecollective recall.
  1. Exclusions:  In Experiment 2, we collected data from 72 younger adults but reported data from 63 of them, as 9 subjects did not complete the task or did not understand it. 
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Sample sizes were estimated assuming high power (90-95%) to detect medium to large effect sizes with the conventional .05 significance level (it has been a while since the experiments were run, so I do not recall the exact values). We decided to stop collecting data as soon as the total sample size passed the estimated one (data were collected in small groups).
Little, Nosofsky et al.Logical rules and the classification of integral-dimension stimuli.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Although not explicitly stated, it is implicit in the article that we intended to collect 1 observer per boundary rotation condition in Experiment 1 and three times as many observers per boundary rotation condition in Experiment 2. The number of data sessions per subject (in Experiment 1) was based on the number of trials necessary to adequately represent RT distributions with a high degree of fidelity (based on relevant precedents; e.g., Fific, Little, & Nosofsky, 2010). In Experiment 2, the number of sessions was determined by whether or not a participant met the stated accuracy criterion; this was explicitly stated in the manuscript.
Masson, Kliegl et al.Modulation of additive and interactive effects in lexical decision by trial history.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We first established an approximate sample size based on prior experience using our planned analytic techniques.  The specific sample size was then determined by the number of counterbalancing conditions required by the design of the experiment.  We tested subjects until this sample size was reached.  This method was used for both experiments, and the sample size was the same in both experiments.
Miller, Lazarus et al.Spatial clustering during memory search.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We did not explicitly describe power analyses in the paper.  We ran the experiment over the course of one term and analyzed our data at the end of the term.  Each term we collect approximately the same amount of data.  Statistical tests were run on the full sample at the end of the semester.
Mitterer, Russell et al.How phonological reductions sometimes help the listener.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: As a rule of thumb, we want at least 400 observations per condition. As we had 40 observations per condition per participant, 16 participants were more than sufficient. So we started with 16 participants, as we hoped to collect data from 4 repeated blocks. Looking at the data after 16 participants, it became apparent that the repetitions did influence how participants behaved, and we had to focus on the first block (as indicated in the paper). As one block contained 10 items per condition per subject, our 400-observations rule of thumb suggested 40 participants. So we ran 24 additional participants, and after that, we stopped and looked at the data.
Otgaar, Scoboria et al.Experimentally evoking nonbelieved memories for childhood events.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: A) In our study, we were interested in a new memory phenomenon called nonbelieved memories. Since evidence for this phenomenon had only been anecdotal, our study was the first attempt to experimentally elicit these memories in the lab. Luckily, we succeeded. Hence, since no experimental work had been done on these memories, we could not determine what sample size would be sufficient. However, and this is related to B), we do know from previous research (Wade et al., 2002) that approximately 30% of participants develop false memories, and based on Mazzoni et al. (2010), our expectation was that, of this 30%, 20% would develop a nonbelieved memory. This meant that we needed to test many participants. Our intention was to test participants until we had about 20 or more nonbelieved memories. This number should be sufficient to run analyses such as ANOVA.
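The recruitment target implied by those rates can be made concrete with a small back-of-the-envelope calculation; the sketch below uses the rates cited above and is purely illustrative (the authors did not report this calculation).

```python
# Back-of-the-envelope estimate of the sample needed to obtain ~20
# nonbelieved memories, using the rates cited above (illustrative only).
p_false_memory = 0.30             # expected false-memory rate (Wade et al., 2002)
p_nonbelieved_given_false = 0.20  # expected share of those becoming nonbelieved (Mazzoni et al., 2010)
target_nonbelieved = 20           # desired number of nonbelieved memories

yield_per_participant = p_false_memory * p_nonbelieved_given_false  # = 0.06
required_n = target_nonbelieved / yield_per_participant             # ≈ 333 participants
print(f"Approximately {required_n:.0f} participants would be needed.")
```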
Slevc, Ferreira et al.To err is human; To structurally prime from errors is also human.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We chose a sample size of 48 (and thus our stopping rule was to halt after collecting data from 48 participants) following other work from our labs.  In particular, this seems to yield appropriate power and is conveniently divisible by four, allowing for straightforward counterbalancing in a 2x2 design. 
Stalinski, Schellenberg et al.Listeners remember music they like.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We had a target N in mind for the first experiment. For the other experiments, the sample sizes were similar but we sometimes stopped collecting data earlier when the response patterns were crystal clear.
Mar 2013Gillespie, Pearlmutter et al.Against structural constraints in subject-verb agreement production.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: There were subsets of items that were considered "fillers" in these experiments and that were not related to the research questions at hand. None of these have been reported elsewhere.
  4. Sample Size: In these experiments we ran as many participants as possible (far more than in previous research) in order to show that our null effects were not due to a lack of power. There was no set stopping rule, but we decided to collect over 100 participants' data in each experiment. In Experiment 2, we collected over 150 participants' data, while in similar studies it is common to collect data from 40-50 participants.
Halali, Bereby-Meyer et al.Pitfall or scaffolding? Starting-point pull in configuration tasks.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: At the end of the experiments we also measured individual thinking style with the 24-item REI thinking-style questionnaire (Pacini & Epstein, 1999). Since we did not find any systematic effects and it was not part of our hypotheses, we did not report the results of this questionnaire.
  4. Sample Size: We did not determine the sample size by power analysis, but given our previous experience with similar experiments we aim to collect 25-30 participants in each between-participants experimental condition. In Experiment 3 we ran 40 participants in each condition because we had a hypothesis about a null effect.
Lindsay, Gaskell et al.Lexical integration of novel words without sleep.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Experiment 1 aimed for 35-40 participants, which was based on similar previous studies and our pilot work. Experiments 2 and 3 aimed for a similar number of participants to Experiment 1. The exact number within that range was determined by availability of participants within the time window of testing.
Ortells, Mari-Beffa et al.Unconscious congruency priming from unpracticed words is modulated by prime-target semantic relatedness.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Full Disclosure
Schotter, Ferreira, & RaynerParallel object activation and attentional gating of information: Evidence from eye movements in the multiple object naming paradigm.
  1. Exclusions: Data from 7 subjects were excluded from the analyses due to data loss (blinking during the trial, which cannot be classified according to eyetracking data analysis methods). This was not reported in the manuscript to save space, to focus on the important results, and because it is peripheral to the theoretical focus of the study (i.e., the fact that subjects blinked excessively is independent of the purpose of the experiment). For all other data (i.e., those included in the analyses), data exclusion procedures and the numbers of included observations were reported.
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure (As noted in Rayner, Staub, Li, et al., this question is not entirely relevant in the context of eye movement research. There are tons of measures that could be examined - eye movements give you data extended over a large temporal window. We reported the most theoretically relevant eye movement measure.)
  4. Sample Size: We estimated the number of subjects needed based on the number of subjects and items used in previous, related research.
Jan 2013D'Angelo, Jiménez, Milliken et al.On the specificity of sequential congruency effects in implicit learning of motor and perceptual sequences.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We decided ahead of time to collect data until a sample size of 40 was achieved, based on previous work examining similar effects.
Grzyb & HübnerExcessive response-repetition costs under task switching: How response inhibition amplifies response conflict.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Target sample sizes of the experiments were based on pilot work. The target sample size of Experiment 3 was larger because of an additional manipulation. We stopped recruiting participants when the target sample size was reached and, if more participants had already enrolled, we also collected data from those participants.
Liesefeld & ZimmerThink spatial: The representation in mental rotation is nonvisual.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Participants of Experiment 2 also filled out a paper-and-pencil mental-rotation test (Peters et al., 2005). This test was administered to explore a research question unrelated to the topic of the article. It cannot have influenced the data presented in this article, because it was administered after participation in the experiment.
  4. Sample Size: We decided on a (generous) sample size based on our experience with previous experiments of this type. Because mental rotation is probably influenced by gender, we made sure to include the same number of men and women. In Experiment 2, a considerable number of participants were excluded based on our predetermined exclusion criteria and due to excessive EEG artifacts (as reported). In order to obtain our desired sample size and to balance the cells of the design, we therefore replaced excluded participants.
PecherNo role for motor affordances in visual working memory.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Based on sample sizes of comparable studies in the literature, I aimed to run 24 subjects in Experiment 1. I had a slightly higher number of subjects sign up to take no-shows into account; therefore, the number of actual participants was 26. The results of Experiment 1 indicated that this number provided enough power. Therefore, I used the same procedure for the other four experiments, that is, aiming for at least 24 subjects. The actual numbers were between 26 and 28.
Sulpizio, Arduino, Paizi et al.Stress assignment in reading Italian polysyllabic pseudowords.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: The number of participants was based on the psycholinguistic literature and on our previous experience with other studies addressing the same topic.
Nov 2012Atalay & MisirlisoyCan contingency learning alone account for item-specific control? Evidence from within- and between-language ISPC effects
  1. Exclusions: In Experiment 1 we excluded one participant because the participant's native language was not Turkish. In Experiment 2, we excluded 2 participants, one for being color-blind and the other for not reporting better proficiency in Turkish than in English. These participants were replaced.
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We estimated sample size from similar studies. For Experiments 1A and 1B, the sample sizes in the original submission were 34 and 20, respectively. To increase statistical power, as recommended by the reviewers, the sample size was increased to 52 in both experiments.
ChuderskiThe contribution of working memory to fluid reasoning: Capacity, control, or both
  1. Exclusions: Full Disclosure (Two data points were excluded from Experiment 5, and we reported that in the paper. However, we did not report on participants who withdrew during testing, treating them as if they had not agreed to participate in the experiment at all but had merely delayed their decision. As I remember, there could have been as few as 3-4 such persons in total across all our six experiments reported in JEPLMC (i.e., out of around 670 participants in total), but unfortunately we did not record that.)
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Our criterion for sample size in each experiment was exactly this: "decided ahead of time to collect data until minimum sample size achieved and this was followed". In correlational studies on WM and Gf like those reported in JEPLMC, it is more or less known how many observations one needs to obtain significant correlations (if they exist): from 100 to 150 participants (only in the last experiment, Experiment 6, did we examine 75 persons, because it was a simple follow-up study - a continuation of Experiment 5). Moreover, as we paid our participants, we had to estimate the necessary funds for each experiment in advance.
ClapperThe effects of prior knowledge on incidental category learning
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: I had run previous experiments of this type, and determined my sample size in advance based on that (just using the same number as in previous studies of this type). Data collection stopped when the target sample size was reached.
Pansky Inoculation against forgetting: Advantages of immediate versus delayed initial testing due to superior verbatim accessibility
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: The sample size was predetermined at 32 participants per experimental group (X 3 groups = 96) because it was the minimal sample size that satisfied the following two requirements: 1) n>30 to ensure normality of the sampling distribution and 2) n divisible by 8 so that all the counterbalancing combinations were equally repeated. I stopped data collection when n=32 for each experimental group.
Perfect & WeberHow should witnesses regulate the accuracy of their identification decisions: One step forward, two steps back
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure (We reported the existence of a manipulation of disguise in footnote 2, but we did not report the analyses because it did not interact with our key manipulation, and we were mindful of the word limit for a research report.)
  3. Measures: The computer program used to collect responses automatically recorded response latencies, but we did not look at these data because we had no theoretical interest in this measure.
  4. Sample Size: We based our sample size on effect sizes found in previous work in the lab. We estimated that we would need a sample of 50 per cell (400 in total) and the research assistant was instructed to stop once that target was reached. In fact they exceeded this total because they over-recruited on the final day, and these data were retained.
Pratte & Rouder Assessing the dissociability of recollection and familiarity in recognition memory
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Simulations were conducted before data collection to determine how many participants and trials were needed to obtain reasonable parameter estimates. We then collected data from about this many participants, perhaps with fewer in follow-up experiments in which highly accurate estimates were not needed. No statistical analyses were conducted before all data were collected.
Rayner, Staub, Li et al.Plausibility effects when reading one- and two-character words in Chinese: Evidence from eye movements
  1. Exclusions: Data from 2 participants were excluded due to poor calibration of the eye tracker.
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure (This question is not entirely relevant in the context of eye movement research. There are tons of measures that could be examined - eye movements give you data extended over a large temporal window. We did report all of the relevant/standard measures associated with the designated target words in the study.)
  4. Sample Size: The experiment had 4 conditions and we wanted 10 data points from subjects in each condition. We did not modify this goal during the course of the research.
Sep 2012Foster & SahakyanMetacognition influences item-method directed forgetting
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We decided to collect 32 participants in each group. If, by the end of data collection, we obtained an effect in any direction that was close to significance (e.g., p=.08), we continued to run an additional 16 participants.
Gaschler, Frensch, Cohen et al.Implicit sequence learning based on instructed task set
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Sample size for Exp. 1a was based on a prior study (Wenke & Frensch, 2005). Based on the effects in Exp. 1a, a larger sample size was chosen for Exp. 2a and following. In addition, an increase in sample size for Exp. 2b was suggested (and achieved) during the review process.
Hino, Lupker, & TaylorThe role of orthography in the semantic activation of neighbors
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: (a) We usually collect 30-40 participants' data in a single experiment. Thus, we collected data from 44 participants in each of the two experiments in this article. (b) We kept running the experiments to see whether the effect became significant not only in the subject analysis but also in the item analysis. After getting 44 subjects in these experiments, we stopped collecting data because the effect became significant in both the subject and item analyses.
KleinA role for self-referential processing in tasks requiring participants to imagine survival on the savannah
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: I determined sample size based on two factors (a) subject availability (doled out each quarter to each researcher) and (b) previous experience with memory research of this nature suggesting that a sample size of between 20-40 provides a reasonably stable estimate of population parameters.
Liefooghe, Wenke, & De HouwerInstruction-based task-rule congruency effects
  1. Exclusions: Full Disclosure
  2. Conditions: A randomization error occurred in one condition of Experiment 2; this condition was replaced by a properly randomized condition.
  3. Measures: Full Disclosure
  4. Sample Size: Approximate sample size was decided ahead of time on the basis of pilot work.
Monaghan, Mattock, & Walker The role of sound symbolism in language learning
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Sample size was determined on the basis of previous similar studies (in particular Monaghan, Christiansen, & Fitneva, 2011, JEPG); we stopped collecting data once we had 24 participants in each experiment.
Quinlan & CohenGrouping and binding in visual short-term memory
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Sample size in the original submission was set at 22 in both studies (estimated from previous similar studies). To address concerns about statistical power when revising the manuscript, the sample size was increased to 30 in both studies.
Uzer, Lee, & BrownOn the prevalence of directly retrieved autobiographical memories
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We decided sample size ahead of time based on previous studies
Jul 2012Bissett & LoganPost-stop-signal adjustments: Inhibition improves subsequent inhibition
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We decided to run 24 subjects in each experiment before we began collecting data.  This decision was based on the number of subjects we needed to complete counterbalancing (e.g., groups of 4) and the number of subjects we have run in similar experiments in the past that detected differences of the same magnitude we expected here.
Boywitt & MeiserThe role of attention for context-context binding of intrinsic and extrinsic features.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We usually aim for around 100 participants in two-group designs with the type of model used here because it gives adequate power. For Experiment 2, however, we did not reach that goal because we could not find any more volunteering participants. Therefore, the sample size in Experiment 2 (N=92) was slightly smaller than in Experiment 1 (N=100).
Davis, Love, & PrestonStriatal and hippocampal entropy and recognition signals in category learning: Simultaneous processes revealed by model-based fMRI
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Sample size was chosen to match the sample sizes of previous high-resolution fMRI studies after the expected number of excluded subjects was taken into account.
Pachur & ScheibehenneConstructing preference from experience: The endowment effect reflected in external information search
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We had originally planned for a sample size of 75 per condition (150 in total). Given that this was the first study on external search processes preceding WTA/WTP judgments, we aimed for a rather high number of participants. Eventually, 2 additional participants registered for the experiment on the institute’s website and we decided not to exclude them.
Rabovsky, Sommer, & Abdel RahmanImplicit word learning benefits from semantic richness: Electrophysiological and behavioral evidence.
  1. Exclusions: Full Disclosure
  2. Conditions: We also manipulated the number of associates (orthogonal with the number of features) but there were no effects and we did not report this manipulation in the paper due to the conciseness and brevity required by the short report format.
  3. Measures: Full Disclosure
  4. Sample Size: We decided ahead of time to collect data from 24 participants and this was followed.
May 2012ArndtThe influence of forward and backward associative strength on false recognition
  1. Exclusions: We completed a stimulus counterbalance at 104 subjects, and collected data from a total of 106 participants. One of the participants’ data was thrown out due to being part of an incomplete counterbalance (participant # 105), while the other was a participant number (#50 in our order of collection) that was mistakenly run by two research assistants. Thus, there were two peoples’ data in the same counterbalance condition at that point in our data collection. Our standard practice is to only use the first person’s data that were collected and to toss the other (as opposed to breaking the randomization of counterbalance orders we set up prior to running).
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: I generally seek to collect a large number of observations per condition (~1000) for a study. No a priori power analysis was done. If, at a sample size of 50 or 60, an effect looks very large (informally gauged by comparing the mean difference to the standard error of the mean) and we reach the end of data collection in a study for the semester (people are no longer signing up for that study very often or the semester is over), I'll often terminate the study and consider it finished. Otherwise, we typically try to run until we get to around 1000 observations per condition, look at the data then, and decide if it makes sense to collect more data (e.g., when effects are small, but potentially meaningful, and we're in something of an inferential "no man's land" with a small mean difference that isn't outside of the margin of error).
Fedor, Varga, & SzathmárySemantics boosts syntax in artificial grammar learning tasks with recursion
  1. Exclusions: Data from the last participant were not analyzed and not included. He participated in the study a week later than the rest of the participants, and his data were lost.
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We aimed to have 15-20 participants (in accordance with previous similar studies in the literature) in each group and tested as many as volunteered within the time frame we had.
Griffiths, Hayes, & NewellFeature-based versus category-based induction with uncertain categories
  1. Exclusions: Full Disclosure
  2. Conditions: In Experiment 1 we replicated the primary result of prior studies (i.e. that feature-based reasoning is frequently used to make predictions about instances whose category membership is uncertain). For pragmatic reasons, when we ran this study we also constructed a novel procedure (resulting in a between-subjects manipulation) to act as a pilot for a second (unrelated) research stream. This new procedure was a failure, but the existing procedure replicated prior data. We reported on the existing procedure, and omitted the data from the new, failed procedure. We did not collapse data across this manipulation.
  3. Measures: We conducted two “manipulation check” measures (after all the critical measures were taken). These were not reported because: (i) in hindsight these measures were redundant (performance on the critical measures provided sufficient validation of the manipulation), and (ii) any explanation would have been lengthy. Indeed, data from these measures have never been analyzed.
  4. Sample Size: We based our sample size on the samples used in closely related studies (several are cited in the paper) and on our experience with the pilot studies for this task. We chose n=20 in advance and complied with that as closely as possible (some conditions were more difficult, and thus fewer participants reached criterion performance in these groups; this was unsurprising and is fully described in the article, including the performance of the excluded participants). Data collection stopped once we reached the target sample size.
McVay & KaneDrifting from slow to doh: Working memory capacity and mind wandering predict extreme reaction times and executive control errors.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: One measure was not closely enough related to our central research questions to report. As in a previous article (McVay & Kane, 2009, JEPLMC), subjects here also completed a general retrospective self-report questionnaire about cognitive failures, the Cognitive Failures Questionnaire - Memory and Attention Lapses (CFQ-MAL). We did not report the results from this measure here because the current article was already trying to accomplish too much, both analytically and theoretically, to include them; however, the new findings closely replicated those of McVay and Kane (2009), with the CFQ-MAL score not correlating significantly with working memory capacity but correlating significantly positively with mind-wandering rates during the SART and significantly negatively with SART performance.
  4. Sample Size: We had aimed to test at least 100 subjects in each of our two between-subjects conditions. After running the study for one semester (Spring 2008) we were well below that target (with Ns in the 50-60 range in each group) and so we collected data throughout the entire subsequent semester (Fall 2008) which brought our samples well above the target. To be explicit we did not stop testing subjects until we had tested as many people as possible over two complete semesters.
Olds & WestermanCan fluency be interpreted as novelty? Retraining the interpretation of fluency in recognition memory
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Sample sizes were determined based on past experiments on this topic in our lab and whether the key variables were within-subjects or between-subjects, with one exception: Experiment 5 yielded ambiguous data when it was initially submitted. The reviewers asked us to collect more data, which we did (the results did not change--the results of that experiment are still somewhat unclear).
Pyc & RawsonWhy is test-restudy practice beneficial for memory? An evaluation of the mediator shift hypothesis.
  1. Exclusions: Data were excluded from analyses if participants failed to return for the second session (2 in Experiments 1a/1b and 5 in Experiment 2) or if they failed to follow task instructions (5 from Experiment 2).
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We decided ahead of time to collect data until we had 25 subjects per group.
Sprenger & DoughertyGenerating and evaluating options for decision making: The impact of sequentially presented evidence.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Full Disclosure
Mar 2012Hampton, Aina, Andersson et al.The Rumsfeld effect: The unknown unknown
  1. Exclusions: Full Disclosure (only participants who did not return for the second session were excluded, necessarily as their data were missing)
  2. Conditions: One study in the series was not reported. It was underpowered, and the results did not achieve significance. Other studies related to those published have been conducted and await publication.
  3. Measures: Full Disclosure
  4. Sample Size: Sample size for experiments after the first was based on a power analysis from the first experiment in the series (there was no prior way to estimate an effect size for this, so a large effect size was assumed). In no case were the data analysed before data collection had been completed, so no stopping rule problems arise. Sample size varied as a function of availability of suitable participants, and variable return rate for the second session.
Ingram, Mickes, & WixtedRecollection can be weak and familiarity can be strong
  1. Exclusions: Full Disclosure
  2. Conditions: We did not report pilot data that were collected to test different confidence scales (but these were not failed manipulations).
  3. Measures: Full Disclosure
  4. Sample Size: We chose a minimum sample size (e.g., n = 20) and then our lab assistants brought us data to analyze when they reached or exceeded that minimum (e.g., if they reached 23 by the end of the day, our actual n was 23).
Schuck, Gaschler, Keisler et al.Position-item associations play a role in the acquisition of order knowledge in an implicit serial reaction time task
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: After a first round of reviews, we increased the sample size upon the reviewers' suggestion.
Jan 2012Andrews & LoNot all skilled readers have cracked the code: Individual differences in masked form priming
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure (A second set of less well-matched items were included in the experiment but not reported in the analysis, as noted in the paper.)
  4. Sample Size: We decided ahead of time to collect data for approximately 100 participants and stopped recruitment when this total was reached.
Bowles, Harlow, Meeking et al.Discriminating famous from fictional names based on lifetime experience: Evidence in support of a signal detection model based on finite mixture distributions.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Data collection was terminated when it was clear that the effect we were interested in could be demonstrated consistently in most individual participants.
Herrera & MacizoSemantic processing in the production of numerals across notations
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Full Disclosure
Klein, Robertson, Delton et al.Familiarity and personal experience as mediators of recall when planning for future contingencies
  1. Exclusions: Full Disclosure
  2. Conditions: Coming soon.
  3. Measures: Full Disclosure
  4. Sample Size: Decided based on subject pool to allot equal numbers across cells. Had sufficient power -- as indicated in paper.
Mutter, Atchley, & PlumleeAging and retrospective revaluation of causal learning
  1. Exclusions: Experiment 1:  One young participant was replaced - failed to follow instructions. Experiment 2a:  One older adult was replaced - missing data. Experiment 2b:  Two older adults were replaced - 1 did not finish the session, 1 was colorblind and thus could not perceive the colors of the stimuli. Experiment 3:  Three older adults were replaced - 1 due to experimenter error (run in wrong experiment), 1 due to failure to follow instructions, 1 due to medical condition that could affect results (stroke)
  2. Conditions: Full Disclosure
  3. Measures: We collect data on several demographic and individual difference measures (e.g., SES, MMSI) as part of the screening protocol for our young and older participants.  We report only those measures that are relevant to the research question and to characterizing the typical differences that are observed between young and older adults.
  4. Sample Size: The number of participants in our young and older groups for all three experiments was determined in advance of data collection based on (1) the number of participants used in the original experiment upon which our procedure was based (i.e., Larkin et al., 1998) and (2) the number of participants needed for counterbalancing over 9 list orders (3 per list).  We did not modify these requirements over the course of data collection for any of the three experiments.
IssueAuthorsArticle Title
Nov 2013Bainbridge, Isola, OlivaThe intrinsic memorability of face photographs
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Our study is working to create an exhaustive dataset for the use of other researchers, so the sample sizes work toward the goal of big data (e.g., with memorability scores collected for 2,222 faces). The number of ratings per attribute was uniformly chosen as 15 raters per attribute and 15 ratings per antonym, to get a stable value for each attribute for each face and to accommodate noise in the data due to error that may arise from crowdsourced data collection. All data we collected are going to be made available online for other researchers' use.
Chapman, Johannes, Poppenk et al.Evidence for the differential salience of disgust and fear in episodic memory. 
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: For each of our studies we aimed for a target sample size of 20-30 for each between-subject condition. This number was based on our previous experience of the number of subjects needed to detect emotional enhancement of memory. Data were collected by student research assistants and the final sample size was effectively determined by the sample size achieved by the end of term. In no cases did we inspect the data and then add extra subjects in an effort to achieve significance. 
DuBrow & DavachiThe influence of context boundaries on memory for the sequential order of events
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: No. We did not report the results of a final recognition memory test in Experiment 3 because the results were not central to our research question. Instead, it addressed whether sequence reactivation has consequences for long-term item memory, which was not observed.
  4. Sample Size: No. We based our sample size on prior studies that have had sufficient power to observe effects of boundaries or context shifts in episodic memory tasks. Thus, we targeted 20+ usable subjects (i.e., those that followed instructions and had above-chance memory) for each experiment. 
Horner & BurgessThe associative structure of memory for multi-element events
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Novel analyses prevented a priori power analyses for determining appropriate sample sizes. Experiment 1 was exploratory, testing a relatively low number of participants (N=14) to identify whether any evidence for dependency could be observed. Given the success of Experiment 1, Experiment 2 had a similar N. Experiment 3 tested a higher number of participants (N=23), more comparable with previous studies (e.g., Trinkler et al., 2006), to increase power in our split-half analyses (split by subjective confidence at retrieval; not conducted for Experiments 1-2).
McLachlan, Marco, Light et al.Consonance and pitch
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Some demographic data was not found to be relevant to our research questions.
  4. Sample Size: We decided ahead of time what our minimum sample size should be and continued collecting until we had achieved it and our recruitment options ran out.
Rawson & DunloskyRelearning attenuates the benefits and costs of spacing
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: At the end of the last session of Experiment 2 (after all primary measures of interest had been collected), we had participants complete individual difference measures of reading comprehension, motivation orientation, vocabulary, self-regulated learning habits, and conscientiousness.  We did not report these outcomes in the published paper because they were not related to the primary research question (concerning normative effects of relearning and spacing).  These measures were collected to potentially serve as pilot data for subsequent individual differences studies but have not yet been analyzed (too busy).   
  4. Sample Size: In each experiment, we initially planned to collect around 30 participants per group. This plan was based on sample sizes we have used in similar studies in the past. In Experiment 1, we stopped data collection at the end of the spring semester (we didn't quite reach 30 per group but came close enough to decide to forge ahead with the data we had in hand rather than wait for the fall semester to run a few more). In Experiments 2-3, we ended up with slightly larger samples than originally planned because we had more participants sign up and complete the study than anticipated. In all three experiments, data collection was terminated before we completed data analyses (i.e., no peeking).
Reas & BrewerImbalance of incidental encoding across tasks: An explanation for non-memory-related hippocampal activations?
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We predetermined our sample size of ~20 subjects per study. This target was based on prior similar studies that suggested this sample size would provide sufficient power.
Aug 2013 Beckmann, Gropel et al.Preventing motor skill failure through hemisphere-specific priming: Cases from choking under pressure.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: We used one more short measure (a questionnaire with other scales) at the start of the experiments that was not related to the research question. The score on this scale neither affected the manipulations nor interacted with the effects reported.
  4. Sample Size: We only reported sample sizes. The sample sizes were determined before starting the experiments by a combination of power analysis and sampling possibilities (please note that we did not sample students but well-experienced athletes; in the case of taekwondo and badminton, we were limited by the number of available internationally experienced taekwondo athletes and badminton league players, respectively, and hence there were somewhat fewer athletes in the samples than optimal).
Chrobak, Zaragoza et al.When forced fabrications become truth: Causal explanations and false memory development.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We decided ahead of time to collect data until a minimum sample size was achieved and, if available, we ran additional participants because we anticipated having to exclude some participants (for failure to follow directions, interviewer mistakes, etc.). Adding more participants also served to increase power.
Crossley, Ashby et al.Erasing the engram: The unlearning of procedural skills.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We used a sample size that we knew from prior research would have sufficient power.
Grossman, Na et al.A route to well-being: Intelligence versus wise reasoning.
  1. Exclusions: Full Disclosure (Footnote 1 in the paper indicates the origin of the dataset - Grossmann et al., 2010. Proceedings of the National Academy of Sciences. As discussed in that paper, a few participants were excluded due to their performance on the minimental state examination. For further information, consult the publication above. Also see answers to Q4 below.)
  2. Conditions: Full Disclosure
  3. Measures: This paper is part of a large-scale project, with over 6 hours of tasks in different countries. We focused on all measures relevant to the questions of wise reasoning and well-being. As of now, there are no journals that would allow for publication of the full wealth of the data in the project. Other parts of the data are discussed in Grossmann et al., 2012, Psychological Science. Further, reports on cross-cultural variations in emotionality, self, and cognitive style, which were also collected in the current project, are currently under review.
  4. Sample Size: We used a random sampling procedure, aiming for 20 participants in each stratum of the 3 (age: 25-40, 41-59, 60+) X 2 (education: completed college vs. did not complete college) X 2 (gender: male vs. female) design. To achieve this goal, we had to oversample some of the strata. The final sample included 247 participants, of which 6 had to be excluded due to technical errors or scores below the cutoff on the minimental state exam.
Maglio, Trope et al. Distance from a distance: Psychological distance reduces sensitivity to any further psychological distance.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: For each study, we decided ahead of time on a minimum sample size per cell and collected data until that number was achieved. We used a variety of methods to collect data across our studies, and some (e.g., Amazon's Mechanical Turk) proved easier to hit a pre-determined number than others (e.g., field studies, in which we collected data until the end of the day and accordingly stopped at or above the target).
Mendes, Koslov et al.Brittle smiles: Positive biases toward stigmatized and outgroup targets.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: We typically collect a number of individual difference measures as part of a standard pre-screening that participants complete prior to coming to the lab. In the large majority of the studies in my lab this means we collect Big Five personality measures, optimism, self-esteem, and (in some cases) anxiety (BAI) and depression (BDI -- without the suicide question). For studies that include intergroup/inter-racial questions, we also obtain measures of intergroup contact, modern racism or social dominance orientation, and race rejection sensitivity. For the paper in question, we did not examine any individual differences as moderators (other than the IAT measure in Study 3). Once we have more intergroup studies completed, we plan to examine how individual difference moderators influence the effects across multiple studies, but we are waiting until we have over 1000 people before we launch this type of individual difference investigation.
  4. Sample Size: For psychophysiology studies we tend to have an expectation, based on our past studies and others' studies, that if the manipulation is an interracial setting, we will obtain effect sizes in the medium to large range (between .5 and .8; see Blascovich et al., 2001 for this range). So we decide ahead of time what our target number of participants is and then stop once our random assignment sheet says we are done. I want to point out that for those of us who do neuroendocrine studies there is not an option of opportunistic stopping, because we assay our data in batch (meaning all data are assayed at one time). So if there was ever a case in which something was just short of significant, for example, we don't have the option of starting the study back up. Once a study is done, it is done.
May 2013Berntsen, Staugaard et al.Why am I remembering this now? Predicting the occurrence of involuntary (spontaneous) episodic memories.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Based on previous studies, we had planned ahead how many participants we would need in each condition. We simply stopped when we had run the number we had planned to run.
Gu, Zhong et al.Listen to your heart: When false somatic feedback shapes moral behavior.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We decided ahead of time to collect data until minimum sample size achieved and this was followed.
Lawrence, Klein et al.Isolating exogenous and endogenous modes of temporal attention.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: At the time of this research, we were unaware of the importance of clear stopping rules. Exploration of my (ML) email archive shows that I collected data across two consecutive weekends, but it appears that I did look at the data between these sessions and decided to decline individuals volunteering for a third weekend on the basis of the data obtained from the first weekend. I don't have record of what criteria I used in making this decision, but at the time I was indeed using null-hypothesis significance testing methods and would have likely been looking for p<.05 for at least the interaction of two of the three variables manipulated in the study. The fact that I did not resume recruiting after the second weekend suggests that whatever criterion I was using held after the data from the second weekend was added. Despite our failure to define a clear stopping rule, it should be noted that while NHST methods may have been used to peek at the data during collection, the final reported analysis employs a likelihood-based evidence quantification approach for which stopping rules should be less important with regards to inference. The transition from NHST to likelihood methods was driven by my (ML) exposure to literature on the philosophical/inferential superiority of the likelihood approach over NHST (though more recently I've made yet another transition from likelihood to Bayesian methods), as well as by the fact that there is no consensus method to extract p-values from the advanced computational methods by which we accounted for the possibly-non-linear effects associated with the continuous variable manipulated in the study. We hope that if the latter barrier hadn't existed, we would have maintained our reluctance to employ NHST methods on the more important philosophical grounds.
Nordfang, Dyrholm et al.Identifying bottom-up and top-down components of attentional weight by experimental analysis and computational modeling.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Sample sizes were determined prior to data collection, based on our experience with comparable experimental setups.
Redick, Shipstead et al.No evidence of intelligence improvement after working memory training: A randomized, placebo-controlled study.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Also collected but not analyzed, because we had no hypotheses involving the information included: parents' native countries, whether or not subjects were native English speakers (if not, the age at which they learned English), subjects' race, subjects' handedness, and a questionnaire about subjects' videogame experience.
  4. Sample Size: Full Disclosure
Shteingart, Neiman et al. The role of first impression in operant learning.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We re-analyzed previously published data collected by another lab (now available online); we did not run the experiments ourselves. No data were excluded from this dataset.
Szpunar, Schacter et al.Get real: Effects of repeated simulation and emotion on the perceived plausibility of future experiences.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We were particularly interested in conducting the reported study in the context of an fMRI experiment. Hence, we set out to restrict our sample to 30 participants in order to ensure that the study would be feasible to conduct in the scanning environment. The effects were sufficiently large for us to follow through with an fMRI version of the study, which was not too surprising based on the fact that similar effects of repetition on event plausibility had been previously reported. The results of the study reported in Journal of Experimental Psychology: General are completely separate from the subsequent fMRI study that followed.
Feb 2013Bayliss, Murphy, Naughtin et al.Gaze leading: Initiating simulated joint attention influences eye movements and choice behavior.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Only participants in Experiment 1 completed a second task, concerning a research question that was, though related in topic, beyond the scope of the paper. Participants in Experiment 1 also completed the Autism Spectrum Quotient questionnaire (Baron-Cohen et al., 2001).
  4. Sample Size: For Experiment 1, we set an approximate target of between 24 and 30 participants. The sample sizes for Experiments 2 & 3 were based on power analyses and negotiations during the review process. A larger sample size of ‘about 40’ was targeted for Experiment 4. Final sample sizes were determined by stopping, for example, at the end of a week or when heading into a period when the laboratory was unavailable.
Bonnefon, Hopfensitz, & De NeysThe modular nature of trustworthiness detection.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure (We reported all the measures and items that we administered -- but note that other researchers administered other measures to the same subject pool. We had access to some of these measures, but did not use them in our analyses.)
  4. Sample Size: We used the whole subject pool that was made available to us by the university.
Callan, Ferguson, & BindemannEye movements to audiovisual scenes reveal expectations of a just world.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: There was a paper-based questionnaire after the eye tracking part of the experiment (participants made judgments of deservingness and immanent justice across various scenarios). This was to pilot test materials for use in subsequent, unrelated experiments; these data were never linked to, or analysed with, the eye tracking data.
  4. Sample Size: We aimed to test 10 participants in each experimental list, as is standard in this sort of research (we had 4 lists to rotate the items around the four within-subjects conditions), but the participant sign-ups were such that we over-recruited by two. We had to run two more participants to balance the lists out again (resulting in 44 participants in total). Data analysis didn't commence until the lists were balanced at 44 participants.
Cowan & SaultsWhen does a good working memory counteract proactive interference? Surprising evidence from a probe recognition task.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We knew approximately how many participants this type of study typically requires, and we aimed to test as many as we could within a practical period of time, until the needed number was surpassed.
Dshemuchadse, Scherbaum, & Goschke | How decisions emerge: Action dynamics in intertemporal decision making.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We increased the sample size due to a reviewer's request. 
Ferrante, Girotto, Straga et al. | Improving the past and the future: A temporal asymmetry in hypothetical thinking.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: In the first experiment, we included some additional measures but due to limits on the length of the paper we reported only the measures relevant to our research question. In the supplementary materials, we fully described the materials and procedure of both experiments so that interested researchers may contact us about measures which were not reported.
  4. Sample Size: The size of our samples was determined in advance and based on our previous studies (Girotto et al., 2007 in Psychological Science, Pighin et al. 2011 in Thinking and Reasoning).
Redden & Galak | The subjective sense of feeling satiated.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: For all studies, we periodically checked the sample size as participants completed each study. We stopped collecting data when approximately 300 participants had completed each study. The target of 300 was approximated from the sample sizes required in past studies of satiation.
Tentori, Crupi, & Russo | On the determinants of the conjunction fallacy: Probability versus inductive confirmation.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We decided on a target sample size ahead of time and collected data until it was reached.
Nov 2012 | Bijleveld, Custers & Aarts | Adaptive reward pursuit: How effort requirements affect unconscious reward responses and conscious reward decisions
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: For all experiments, we booked the lab for a fixed amount of time and we tried to run as many participants as possible during this period.
Cook, Dickinson, & Heyes | Contextual modulation of mirror and countermirror sensorimotor associations
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: The sample size was determined a priori based on i) the requirement to fully counter-balance a number of factors, and ii) previous effect sizes seen in similar experiments conducted within our lab over a number of years. 
Inzlicht & Al-Khindi | ERN and the placebo: A misattribution approach to studying the arousal properties of the error-related negativity
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: We administered a number of other measures unrelated to the research question but related to other questions (e.g., scales related to mindfulness, self-determined motivation, and construal level). We also administered standard demographic measures (age, race, handedness, etc.) that we did not report fully.
  4. Sample Size: We decided ahead of time to stop after we had collected around 20 subjects per cell, and this was what we did.
Lyons, Ansari, & Beilock | Symbolic estrangement: Evidence against a strong association between numerical symbols and the quantities they represent
  1. Exclusions: Data from 1 participant in Experiment 1 were removed because that participant performed at chance. Removing this participant did not affect the results but was deemed advisable because the participant's accuracy performance was uninterpretable (primary analyses were done on response times). Note that this information was originally included but was removed to meet journal space requirements; it can now be found in the updated online supplemental document.
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: The number of 'roughly 20 subjects' was initially chosen because it is in keeping with many articles in cognitive science that rely on simple within-subjects contrasts on RTs (where the correlation between conditions, rho, is expected to be very high). Indeed, in Experiment 1, 21 subjects yielded adequate power (>.95), so sample sizes for the subsequent Experiments (2-3) were yoked to this number (21). Significant effects were found in Experiment 2 but not Experiment 3. Note that a null result was predicted for Experiment 3, and we directly tested the relative size of effects in Experiments 2 and 3 via ANOVA interaction terms (which reached significance; see page 639 in the article for details).
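A power claim of this kind can be checked with any standard power calculator. Here is a minimal sketch in Python with statsmodels, treating the within-subjects RT contrast as a one-sample t-test on per-subject difference scores; the effect size (dz = 0.9) and alpha are illustrative assumptions, not values taken from the article.

```python
# Minimal sketch: power for a within-subjects (paired) contrast, treated as a
# one-sample t-test on per-subject difference scores.
# dz = 0.9 and alpha = .05 are illustrative assumptions, not the article's values.
from statsmodels.stats.power import TTestPower

analysis = TTestPower()

# Achieved power with n = 21 subjects under the assumed effect size
power = analysis.solve_power(effect_size=0.9, nobs=21, alpha=0.05,
                             alternative='two-sided')
print(f"power with n = 21: {power:.3f}")

# Conversely, the n needed to reach power = .95 under the same assumptions
n_needed = analysis.solve_power(effect_size=0.9, nobs=None, alpha=0.05,
                                power=0.95, alternative='two-sided')
print(f"n needed for power = .95: {n_needed:.1f}")
```

Under these assumed inputs, 21 subjects give power of roughly .97, which illustrates why a default of "roughly 20 subjects" can be adequate for high-rho within-subjects designs.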
Mirman & Graziano | Individual differences in the strength of taxonomic versus thematic relations
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Decided ahead of time to collect data from 15 participants in each of the 2 task versions based on previous experience with the paradigm.
Simmons & Massey | Is optimism real?
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Full Disclosure
Zhang & Epley | Exaggerated, mispredicted, and misplaced: When “it's the thought that counts” in gift exchanges
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: We included several additional items to gain more information about the gifts, such as questions about the relationship between the giver and receiver (e.g., response options for Experiment 1: friends, romantic partner or spouse, family members, colleague, other), or whether the participant enjoyed the conversation at the beginning of Study 3. We did not report these items because they do not bear directly on our hypotheses.
  4. Sample Size: All of our experiments were conducted at the Museum of Science and Industry in Chicago.  We targeted a specified number of participants in each experiment (20 per condition), but would run our experiment for the length of time we were scheduled for the space at the MSI.  Because the number of visitors at the Museum varies from one day to another, our sample sizes varied around that targeted number.  We did not describe this in the text.
Sep 2012 | Sewell & Lewandowsky | Attention and working memory capacity: Insights from blocking, highlighting, and knowledge restructuring
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Because we wanted to analyze our data with structural equation modeling, we aimed to have at least 100 usable data sets (i.e., data from people who performed well above chance and were clearly following instructions). Testing was carried out until we exceeded this aspirational sample size.
Aug 2012 | Bub & Masson | On the dynamics of action representations evoked by names of manipulable objects
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We used a sample size that we have found provides adequate power for effect sizes we expected to obtain in our experiments.  We did not make this point in the article.
Fenn & Hambrick | Individual differences in working memory capacity predict sleep-dependent memory consolidation
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: At the start of the study, we set out to obtain a sample size of approximately 100 individuals per condition, based on previous research on individual differences.  We always over-recruit for these types of studies because each participant must complete two separate sessions and one group must remain awake between the sessions.  We therefore lose a large number of participants due to attrition and napping. Our final sample sizes were 111 and 114 per condition.
Lakens, Semin & Foroni | But for the bad, there would not be good: Grounding valence in brightness through shared relational structures
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: The goal was 20 participants per within-subjects condition. All studies have slightly higher sample sizes because participants were recruited through flyers, and lab regulations required the experiment to continue accepting participants until 16:00.
Slezak & Sigman | Do not fear your opponent: Suboptimal changes of a prevention strategy when facing stronger opponents
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Full Disclosure
May 2012 | Burns, Caruso, & Bartels | Predicting premeditation: Future behavior is seen as more intentional than past behavior
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: In Studies 1 and 2 we administered additional items that were not reported because they did not relate to the main hypothesis. At the end of Study 3 we asked participants how believable they found the scenario to be. As we expected, there was no significant difference in believability between the past and future conditions (p=.83). We did not report this measure in the paper because of word limit constraints.
  4. Sample Size: Study 1: We decided ahead of time to administer the study to all students in an MBA course in decision making to which we had access. The results confirmed our hypotheses with respect to temporal perspective and outcome; however, there was a nonsignificant but suggestive temporal perspective X outcome interaction. To determine whether this interaction was reliable, we then conducted a second wave of data collection. We administered this second wave to all students in another MBA course in decision making (taught by a different instructor) the following quarter. The suggestive temporal perspective X outcome interaction did not emerge in this sample. Because the effect of the temporal perspective manipulation on intentionality ratings did not differ significantly across the two samples (p=.92), we combined results from the two waves in the paper. In both waves, all students agreed to complete the study. Study 2: We instructed research assistants to collect the survey from as many people as they could during specified hours (e.g., 11am and 5pm) on each of three consecutive days prior to and following tax day (April 15). Study 3: We instructed research assistants to collect 60 participants per condition and to stop after reaching that number.
Mason & Bar | The effect of mental progression on mood
  1. Exclusions: We excluded one participant from Study 1 because the experimenter gave the wrong instructions (more specifically, he or she gave instructions for one condition but ran the script for the second condition). I'm reasonably confident that we neither cleaned nor analyzed this participant's responses; in other words, this person was excluded in advance of our analyses. As far as I can tell, no participants were excluded from Study 2. An earlier version of the manuscript included a third study that went unreported in the published paper.
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We judged this intuitively, based on past experience with repeated-measures studies. I can't say for sure whether we checked and realized we needed more participants or whether the results were significant the first time we analyzed the data (it's very possible that the former happened with one or both studies). I do know that we collected more data to address a reviewer's request to rule out order effects. I believe this request came from someone who reviewed the piece for Psychological Science, not JEPG.
McVay & Kane | Why does working memory capacity predict variation in reading comprehension? On the influence of mind wandering and executive attention
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: The original proposal was to collect data from 200-300 participants.  We were able to meet that goal (actual sample size: 258) over the course of one semester and so we did not resume data collection the next semester.
Shaw & Olson | Children discard a resource to avoid inequity
  1. Exclusions: Full Disclosure
  2. Conditions: We excluded from the main text an additional condition (run with different participants and reported in a footnote) that was a replication of our Equality Condition. This condition was not a failure; rather, it ruled out an alternative explanation of the Equality Condition, namely that children simply did not like throwing out 2 erasers. In this condition four erasers were given out, 2 to Mark and 1 to Dan. Children were then asked what to do with the final eraser: give it to Dan or throw it away. In this condition 100% of participants said to give the eraser to Dan. Based on other experiments in the paper this condition was thought to be redundant, since later studies demonstrated that the explanation based on children not wanting to throw two erasers away was not plausible. If anything, the inclusion of this condition would have helped, not hindered, our case, since it provided a replication of a control condition.
  3. Measures: Full Disclosure
  4. Sample Size: We decided ahead of time to use 20 participants and used 20 participants in all of our studies, with the exception of the 3- to 5-year-olds from Experiment 1, who were run on a particularly busy day at the museum, so we ran an additional 4 participants. The pattern of results was the same (actually slightly more significant) if we stopped at 20 participants for the 3- to 5-year-olds.
Topolinski | The sensorimotor contributions to implicit memory, familiarity, and recollection
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Because the present experiments are conceptual replications and generalizations of earlier papers, for most of the experiments I decided sample size ahead of time using my experience with the manipulation from those earlier published studies (i.e., a minimum of 40 per between-subjects condition had always been necessary to obtain a significant interaction). Also, because some of the experiments ran in longer experimental sessions involving other unrelated tasks (as is reported), practical considerations and the required sample sizes for those other tasks also sometimes influenced sample-size decisions. For Experiments 3 and 5, I collected additional data during the revision process due to concerns about statistical validity.
Feb 2012 | Jensen, Vangkilde, Frokjaer et al. | Mindfulness training affects attention - or is it attentional effort?
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Full Disclosure
Issue | Authors | Article Title
Jun 2013 | Eerland, Engelen, & Zwaan | The Influence of Direct and Indirect Speech on Mental Representations
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: We collected some demographic information as part of a standard questionnaire that is not relevant to the hypotheses tested and is used to get a better sense of the subject population. 
  4. Sample Size: Full Disclosure
Jan 2013 | Cox & Hasselman | The case of Watson vs. James: effect-priming studies do not support ideomotor theory.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure (In fact, we predicted null results, and because of this a previous version of the manuscript was rejected at another journal. This prompted us to come up with the decisive last experiment. In a way we should be grateful for the rejection; our argument is much stronger now.)
  3. Measures: Full Disclosure
  4. Sample Size: We used the original sample size for the replication and used G*Power 3 to estimate the sample size for the further experiments.
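For readers unfamiliar with G*Power, the same kind of a priori estimate can be sketched in Python with statsmodels. This is only an illustration: the independent-samples design, effect size (d = 0.5), alpha, and target power below are assumptions for the example, not the inputs the authors entered into G*Power.

```python
# Minimal sketch of an a priori sample-size estimate, comparable in spirit to a
# G*Power 3 calculation for an independent-samples t-test.
# d = 0.5, alpha = .05, and power = .80 are illustrative assumptions only.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.80,
                                   ratio=1.0, alternative='two-sided')
print(f"required n per group: {n_per_group:.1f}")  # roughly 64 under these assumptions
```

Changing the assumed effect size or target power changes the estimate substantially, which is why disclosing the inputs to such calculations matters for interpreting a reported sample size.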
Dec 2012 | Zwaan & Pecher | Revisiting Mental Simulation in Language Comprehension: Six Replication Attempts
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: We collected some demographic information as part of a standard questionnaire that is not relevant to the hypotheses tested and is used to get a better sense of the subject population.
  4. Sample Size: Full Disclosure
Issue | Authors | Article Title
Mar 2013 | LeBel & Campbell | The interactive role of implicit and explicit partner evaluations on ongoing affective and behavioral romantic realities.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure (N/A; no experimental conditions)
  3. Measures: Several other measures were also assessed of relationship constructs unrelated to our research question. For the DV in our second model, several other (daily) positive relationship behaviors were assessed, but these went unreported because we didn't analyze them or because more complex patterns emerged with those behaviors.
  4. Sample Size: Recruited as many heterosexual couples as possible via advertisements placed in campus newspapers.
Issue | Authors | Article Title
Jan 2013 | Wijnants, Cox, Hasselman et al. | Does sample rate introduce an artifact in spectral analysis of continuous processes?
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: We used unpublished data whose measurement was disclosed in a previous paper. We refer to this previous paper, but did not spell out the context exactly.
  4. Sample Size: We used unpublished data whose sample size was disclosed in a previous paper. We refer to this previous paper, but did not spell out the context exactly.
Issue | Authors | Article Title
Apr 2010 | LeBel | Attitude accessibility as a moderator of implicit and explicit self-esteem correspondence
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Other measures were assessed (e.g., causal attribution measures) but went unreported given that they were unrelated to the research question at hand.
  4. Sample Size: Data collection stopped after reaching a predetermined sample size in both samples (which were eventually combined at editorial request). No analyses were performed (in either sample) before data collection was completed.
Issue | Authors | Article Title
Jul 2012 | Wijnants, Hasselman, Cox et al. | An interaction-dominant perspective on reading fluency and dyslexia
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: We based our sample size on the literature on 1/f noise in ADHD. For 30 participants, 550 naming latencies were measured.
Issue | Authors | Article Title
Sep 2012 | Giner-Sorolla, Caswell, Bosson et al. | Emotions in sexual morality: Testing the separate elicitors of anger and disgust.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Disclosed all measures for Study 1. For Study 2, ten emotion measures not relevant to anger and disgust (measuring positive affect, sympathy, fear, and contempt) were included as filler among the anger and disgust items but not analyzed. Among the antecedents, four measures were included but not analyzed because of interpretive problems. There were two items intended to measure character attributions (“To what extent do you think James behaved the way he did because of something about his personality?”; “To what extent do you think James behaved the way he did because of something about who he is as a person?”). These were ambiguous as to whether they measured the more theoretically relevant construct of bad character (not all acts were judged as bad), so we excluded them from the measure, relying on the two variables that explicitly referred to “bad person” and “weak/flawed character.” There was an item assessing how much James had harmed himself, which was not relevant to the moral emotions under scrutiny (cf. Gutierrez & Giner-Sorolla, 2007). We also had a single exploratory item, “To what extent does James’ behavior violate people’s values?” This proved to show generally high correlations with all the other measures and was not explored further.
  4. Sample Size: Study 1: Determined arbitrarily as a function of recruitment into the study; stopping was determined prior to looking at the results. Study 2: determined arbitrarily as a function of completion in a single class session; stopping was determined prior to looking at the results.
Issue | Authors | Article Title
Aug 2012 | Ratliff, Swinkels, Klerx et al. | Does one bad apple(juice) spoil the bunch? Implicit attitudes toward one product transfer to other products by the same brand.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: In both studies we aimed to get 300 participants (an arbitrary number decided a priori). Because data collection at Project Implicit is unpredictable, and because it takes 24 hours after stopping a study for data collection to actually end, we ended up a bit below (Study 1) and above (Study 2) this target N.
Issue | Authors | Article Title
Dec 2011 | Ratliff & Nosek | Negativity and outgroup biases in attitude formation and transfer.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Study 2 was part of the first author’s dissertation. There were several other measures collected as part of the dissertation project (prejudice, internal and external motivation to avoid prejudice) that were not reported in the final version of the published manuscript.
  4. Sample Size: In both studies we aimed to get 450 participants (an arbitrary number decided a priori). Because data collection at Project Implicit is unpredictable, and because it takes 24 hours after stopping a study for data collection to actually end, we ended up a bit below (Study 1) and above (Study 2) this target N.
Apr 2011 | Peters & Gawronski | Are we puppets on a string? Comparing the impact of contingency and validity on implicit and explicit evaluations.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: For Experiments 1 and 2, data collection stopped after reaching a predetermined sample size. For Experiment 3, N = 120 subjects were run in the first round (sample size pre-determined by power analysis). 14 subjects had to be dropped due to a programming error and 15 were dropped due to chance responding (as noted in the MS). Following initial analysis and marginal results, the programming bug was corrected and an additional 98 subjects were run (limited by subject availability), with the combined data analysis (N = 218) occurring only after all 98 had been run.
Issue | Authors | Article Title
Mar 2011 | Peters & Gawronski | Mutual influences between the implicit and explicit self-concepts: The role of memory activation and motivated reasoning.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Full Disclosure
  4. Sample Size: Experiment 1: The first sample stopped at a pre-determined size (N = 118), limited by subject availability. This was the sample submitted in the MS; at the editor's request, an additional 20 data points were collected for a revision. Experiment 2: The first sample stopped at a pre-determined size (N = 81), limited by subject availability; following analysis, another sample of N = 47 was collected before the analyses were repeated. This was the sample submitted in the MS; at the editor's request, an additional 20 data points were collected for a revision.
Sep 2010 | Ratliff & Nosek | Creating distinct implicit and explicit attitudes with an illusory correlation paradigm.
  1. Exclusions: Full Disclosure
  2. Conditions: Full Disclosure
  3. Measures: Study 1 included an additional measure of explicit illusory correlation (frequency estimates) that was not reported in the final version of the published manuscript. We also included an additional measure of implicit attitudes (the Sorting Paired Features Task; SPF) that was being pilot tested at the time.
  4. Sample Size: For both studies we collected as much data as we could during a semester either in the lab (Study 1) or on Project Implicit (Study 2).

Note: To protect the anonymity of non-respondents, only 50% of the authors in each issue have been randomly chosen to participate.