Results are presented from a study of batch-to-batch and panel-to-panel variability in delamination toughness data and its implications for research and for practical structural applications. Toughness data from various sets of double cantilever beam, single leg bending, and end-notched flexure tests are statistically compared. These data sets contain results from different batches of material, different panels within a batch, different test geometries, and different operators for manufacture and testing. For research, the focus is on the use of delamination toughness data to develop and validate delamination growth prediction methodologies. The issue examined here is the trade-off between creating a large data set and maintaining accuracy, given the variability in results. To this end, the t-test, the analysis of variance (ANOVA) test for means, and the k-sample Anderson-Darling test are used to examine the conditions under which the various data sets can be considered as belonging to the same population. It is found that the delamination toughness of new batches of material will not necessarily pool with existing test data. Thus, for research applications, it is concluded that new toughness values must be determined for each batch of material considered. For practical applications, it is assumed that B-basis values will be used for design, and the test data are examined for two different situations: when delamination toughness is used as an acceptance criterion, and when it is not. In the former case, three different criteria for determining whether new batches can be accepted are evaluated; in the latter case, physical and statistical tests are suggested to ensure that B-basis design allowables reflect the variability of the batches used in production.
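As an illustration of the kind of pooling check described above, the following minimal sketch computes the one-way analysis of variance F statistic for several batches and compares it with a tabulated critical value. It is not the procedure used in the study: the function name `one_way_anova_f`, the batch values, and the 5% significance level are all assumptions made for illustration, and the sketch covers only the test for equality of means, not the t-test or the k-sample Anderson-Darling test.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples.

    Hypothetical helper for illustration; each sample is a list of
    toughness values from one batch.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the batch means about the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual results about their own batch mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical toughness values in arbitrary units (NOT data from the study).
batch_a = [1.0, 2.0, 3.0]
batch_b = [2.0, 3.0, 4.0]
batch_c = [6.0, 7.0, 8.0]

f_stat, df_b, df_w = one_way_anova_f([batch_a, batch_b, batch_c])

# Critical F value at the 5% level for (2, 6) degrees of freedom,
# taken from standard statistical tables (assumed significance level).
F_CRIT_2_6 = 5.14

# The batches are treated as poolable only if the equality of means
# is not rejected at the chosen level.
poolable = f_stat <= F_CRIT_2_6
print(f"F = {f_stat:.2f} on ({df_b}, {df_w}) dof, poolable: {poolable}")
```

With these made-up values the third batch has a much higher mean than the other two, so the equality of means is rejected and the batches would not be pooled. In practice a statistics library would normally supply the p-value directly rather than relying on a table lookup.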