# A Simulation Study of Extrapolation Uncertainty in Exposure Assessment

## Use of Pilot Study Results for Site Investigation

- Characterizing sites with multiple decision units (DUs)
  - Exposure or remediation units (can be hundreds!)
- Practical constraints of environmental sampling
- Use of pilot studies to determine site compliance against a threshold action level

## Study Objective

Understand the uncertainty in extrapolating from a pilot study to site-wide compliance or non-compliance.

- Simulation allows us to compare pilot findings to true site-wide conditions
- Quantify the probability of making a decision error (false compliance, false non-compliance)
- Characterize sample statistics from the pilot study that can serve as reliable indicators of decision error rates
- Determine drivers in pilot study design and site conditions that impact decision error rates

## Potential Pilot Study Outcomes

| Site-Wide Condition | Pilot Study: NC | Pilot Study: C |
|---|---|---|
| NC | Correct | False compliance (public health concern) |
| C | False non-compliance (costly but protective error; unnecessary action) | Correct |

C = compliance; NC = not in compliance

## Defining Compliance

- Pilot study
  - Comparison of the 95% upper confidence limit on the arithmetic mean (95UCL) for each DU to the action level (AL)
  - Any exceedance triggers the need for further sampling
- True conditions
  - Comparison of the arithmetic mean for each DU to the AL
  - Any exceedance is actionable

## Pilot Study Sampling Approaches

Each DU is defined as having high or low concentrations, described by lognormal distributions.

- Incremental Sampling Methods (ISM)
  - Composite of increments in each DU (r = 1 or r = 3 replicates)
  - N = 30 per replicate
  - 95UCL calculated with Student's t UCL and Chebyshev UCL
- Discrete sampling
  - N = 15 samples per DU
  - 95UCL calculated following the ProUCL decision tree (USEPA, 2013)
- Key assumption: samples are independent and identically distributed

## Scenarios Evaluated

Table 1. Scenarios evaluated in this simulation study.

| Sampling Method | Lower Areas | Higher Areas | P(High) | AL | Pilot % of Site | ISM DU % r=3 |
|---|---|---|---|---|---|---|
| Discrete | Log(100, 50) (CV=0.5) | Log(600, 300) (CV=0.5) | U(5%, 25%) | U(10, 6000) | U(10%, 90%) | NA |
| Discrete | Log(100, 150) (CV=1.5) | Log(600, 900) (CV=1.5) | U(5%, 25%) | U(10, 6000) | U(10%, 90%) | NA |
| Discrete | Log(100, 300) (CV=3.0) | Log(600, 1800) (CV=3.0) | U(5%, 25%) | U(10, 6000) | U(10%, 90%) | NA |
| ISM | Log(100, 50) (CV=0.5) | Log(600, 300) (CV=0.5) | U(5%, 25%) | U(10, 6000) | U(10%, 90%) | U(10%, 66%) |
| ISM | Log(100, 150) (CV=1.5) | Log(600, 900) (CV=1.5) | U(5%, 25%) | U(10, 6000) | U(10%, 90%) | U(10%, 66%) |
| ISM | Log(100, 300) (CV=3.0) | Log(600, 1800) (CV=3.0) | U(5%, 25%) | U(10, 6000) | U(10%, 90%) | U(10%, 66%) |

Log(m, s) denotes a lognormal distribution with arithmetic mean m and standard deviation s; U(a, b) denotes a uniform distribution on [a, b].

## Simulation Approach: Overview

Key concepts:

1. Low, medium, and high variability scenarios (CV = 0.5, 1.5, and 3.0)
2. For ISM, some DUs use r = 1 instead of r = 3. For the r = 1 DUs, 95UCLs are calculated using the arithmetic mean of the coefficient of variation (CV = SD/mean) from the r = 3 DUs
3. Each pilot study samples a subset of the site area
4. Action levels are variable; this allows us to examine a range of conditions, including near exceedances
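The 95UCL calculations named in the sampling approaches can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's Crystal Ball or R code: `lognormal_params`, `ism_replicate`, and `ucl_from_borrowed_cv` are hypothetical helper names, an ISM replicate is modeled as the arithmetic average of its 30 increments, and the borrowed-CV form for r = 1 DUs (SD approximated as average CV times the single result) is an assumed reading of the approach. The Student's t and Chebyshev UCL formulas are the standard ones; the ProUCL decision tree used for discrete samples is not reproduced.

```python
import math
import random
import statistics

# One-sided 95% Student's t quantiles (standard table values):
# df = 2 for r = 3 ISM replicates, df = 14 for n = 15 discrete samples.
T_95 = {2: 2.920, 14: 1.761}

def lognormal_params(mean, sd):
    """Convert an arithmetic mean and SD to log-scale (mu, sigma)."""
    sigma2 = math.log(1.0 + (sd / mean) ** 2)
    return math.log(mean) - sigma2 / 2.0, math.sqrt(sigma2)

def students_t_ucl(values):
    """95UCL = mean + t(0.95, n-1) * s / sqrt(n)."""
    n = len(values)
    se = statistics.stdev(values) / math.sqrt(n)
    return statistics.mean(values) + T_95[n - 1] * se

def chebyshev_ucl(values, alpha=0.05):
    """95UCL = mean + sqrt(1/alpha - 1) * s / sqrt(n)."""
    n = len(values)
    se = statistics.stdev(values) / math.sqrt(n)
    return statistics.mean(values) + math.sqrt(1.0 / alpha - 1.0) * se

def ism_replicate(mu, sigma, increments=30, rng=random):
    """One ISM result, modeled (assumption) as the mean of the increments."""
    return statistics.mean([rng.lognormvariate(mu, sigma) for _ in range(increments)])

def ucl_from_borrowed_cv(result, avg_cv, multiplier):
    """UCL for an r = 1 DU; SD is approximated as avg_cv * result (assumed form)."""
    return result + multiplier * avg_cv * result

# Example: a lower-area DU from the CV = 0.5 scenario, Log(100, 50).
rng = random.Random(1)
mu, sigma = lognormal_params(100.0, 50.0)
replicates = [ism_replicate(mu, sigma, rng=rng) for _ in range(3)]
print("Student's t 95UCL:", round(students_t_ucl(replicates), 1))
print("Chebyshev 95UCL:  ", round(chebyshev_ucl(replicates), 1))
```

For the same data the Chebyshev multiplier (about 4.36) exceeds the Student's t quantile for df = 2 (2.920), so the Chebyshev UCL is always the higher, more conservative of the two for r = 3 replicates.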

## Simulation Approach: Define Scenario

Scenarios:

- Define two parent distributions
  - High and low concentrations differ by 10x (represents heterogeneity)
  - Variability described by lognormal distributions
- Divide site into DUs
  - Site area = 100 acres
  - Average DU size = 1 acre

## Simulation Approach: Simulation Parameters

10,000 iterations (analysis performed in Crystal Ball for ISM and in R for discrete sampling):

- Assign each DU true parameters (mean, SD) based on a random sample of the parent distribution (n = 15). The probability of selecting a high concentration is uniform (5%, 25%).
- Randomly select the action level from a uniform distribution (range is ~10x greater/less than the high mean)
- Randomly select DUs for the pilot study
- Apply the sampling design (ISM or discrete)
  - Random observations drawn from the DU-specific true distribution
  - 95UCL calculated for each DU

## Simulation Approach: Iteration Decisions

For each of the 10,000 iterations:

- Pilot study decision
  - If there is any exceedance of the AL, the site is non-compliant
  - If there is no exceedance of the AL, the site is compliant
- Compare the pilot study decision to true site-wide compliance to determine whether the correct decision was made

| Site-Wide Condition | Pilot Study: NC | Pilot Study: C |
|---|---|---|
| NC | Correct | False compliance |
| C | False non-compliance | Correct |

C = compliance; NC = not in compliance

## Simulation Approach: Iteration Statistics

- Compile match rates (correct or incorrect decision)
- Track predictor variables from each iteration:
  - Ratio of mean 95UCL from pilot study to AL
  - Proportion of site in pilot study
  - Probability that a DU has high concentration
  - Proportion of pilot study DUs characterized by r = 3 (ISM only)
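The iteration logic described in the simulation approach can be sketched as a small Monte Carlo loop. This is a simplified illustration for the discrete-sampling case only, with several assumptions that are not from the study: the function names are hypothetical, the site has exactly 100 equal DUs, a Student's t UCL stands in for the ProUCL decision tree, and only 200 iterations are run (the study used 10,000), so the printed rates are not expected to reproduce the reported results.

```python
import math
import random
import statistics
from collections import Counter

T95_DF14 = 1.761  # one-sided 95% Student's t quantile for df = 14 (n = 15)

def draw_lognormal(mean, sd, n, rng):
    """Draw n values from a lognormal with the given arithmetic mean and SD."""
    sigma2 = math.log(1.0 + (sd / mean) ** 2)
    mu = math.log(mean) - sigma2 / 2.0
    return [rng.lognormvariate(mu, math.sqrt(sigma2)) for _ in range(n)]

def classify(site_nc, pilot_nc):
    """Map the true site condition and the pilot decision to an outcome label."""
    if site_nc:
        return "true non-compliance" if pilot_nc else "false compliance"
    return "false non-compliance" if pilot_nc else "true compliance"

def run_iteration(rng, cv=1.5, n_dus=100, n=15):
    # Iteration-level parameters drawn from the uniform ranges in Table 1.
    p_high = rng.uniform(0.05, 0.25)
    action_level = rng.uniform(10.0, 6000.0)
    pilot_fraction = rng.uniform(0.10, 0.90)

    # True (mean, SD) of each DU: statistics of n = 15 draws from its parent.
    dus = []
    for _ in range(n_dus):
        parent_mean = 600.0 if rng.random() < p_high else 100.0
        sample = draw_lognormal(parent_mean, cv * parent_mean, n, rng)
        dus.append((statistics.mean(sample), statistics.stdev(sample)))

    # True condition: any DU whose true mean exceeds the AL is actionable.
    site_nc = any(mean > action_level for mean, _ in dus)

    # Pilot decision: any sampled DU whose 95UCL exceeds the AL.
    pilot = rng.sample(dus, max(1, round(pilot_fraction * n_dus)))
    pilot_nc = False
    for mean, sd in pilot:
        obs = draw_lognormal(mean, sd, n, rng)
        ucl = statistics.mean(obs) + T95_DF14 * statistics.stdev(obs) / math.sqrt(n)
        if ucl > action_level:
            pilot_nc = True
            break
    return classify(site_nc, pilot_nc)

rng = random.Random(2024)
counts = Counter(run_iteration(rng) for _ in range(200))
for outcome, count in sorted(counts.items()):
    print(f"{outcome}: {count / 200:.1%}")
```

Tracking the predictor variables would only require returning `pilot_fraction`, `p_high`, and the mean pilot 95UCL divided by `action_level` alongside the outcome label.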

## Pilot Study Outcomes: Discrete Sampling

Percent of iterations by site-wide condition (rows) and pilot study decision (columns). C = compliance; NC = not in compliance.

| CV | Site-Wide Condition | Pilot Study: NC | Pilot Study: C |
|---|---|---|---|
| 0.5 | NC | 12.2% | 0.3% |
| 0.5 | C | 3.9% | 83.6% |
| 1.5 | NC | 17.4% | 0.7% |
| 1.5 | C | 25.2% | 56.8% |
| 3.0 | NC | 24.9% | 1.2% |
| 3.0 | C | 37.6% | 36.2% |

## Pilot Study Outcomes: ISM Sampling (Student's t)

Percent of iterations by site-wide condition (rows) and pilot study decision (columns). C = compliance; NC = not in compliance.

| CV | Site-Wide Condition | Pilot Study: NC | Pilot Study: C |
|---|---|---|---|
| 0.5 | NC | 11.3% | 0.6% |
| 0.5 | C | 1.3% | 86.7% |
| 1.5 | NC | 16.9% | 1.6% |
| 1.5 | C | 4.6% | 77.0% |
| 3.0 | NC | 21.1% | 4.1% |
| 3.0 | C | 8.0% | 66.8% |

## Outcomes by Ratio (Average 95UCL/AL): Discrete

[Figure: outcomes by ratio, discrete sampling. Series: true non-compliance, false non-compliance, false compliance, true compliance.]

## Outcomes by Ratio (Average 95UCL/AL): ISM

[Figure: outcomes by ratio, ISM sampling. Series: true non-compliance, false non-compliance, false compliance, true compliance.]

## Outcomes by % Pilot Coverage: Discrete

[Figure: outcomes by percent pilot coverage, discrete sampling. Series: true non-compliance, false non-compliance, false compliance, true compliance.]

- False non-compliance is high (> 10%) and increases by 3x
- False compliance is low (< 5%) and decreases

## Outcomes by % Pilot Coverage: ISM

[Figure: outcomes by percent pilot coverage, ISM sampling. Series: true non-compliance, false non-compliance, false compliance, true compliance.]

All match outcomes distributed by pilot area, Student's t UCL for CV = 1.5:

| Pilot Coverage | Match 1 | Match 2 | Match 3 | Match 4 |
|---|---|---|---|---|
| 10% <= Pilot < 20% | 13.2% | 1.5% | 4.1% | 81.1% |
| 20% <= Pilot < 30% | 15.9% | 3.1% | 3.5% | 77.6% |
| 30% <= Pilot < 40% | 17.3% | 3.1% | 2.4% | 77.2% |
| 40% <= Pilot < 50% | 15.9% | 4.3% | 0.9% | 78.9% |
| 50% <= Pilot < 60% | 17.4% | 5.2% | 1.3% | 76.2% |
| 60% <= Pilot < 70% | 18.5% | 5.6% | 0.4% | 75.4% |
| 70% <= Pilot < 80% | 18.1% | 6.2% | 0.4% | 75.3% |
| 80% <= Pilot < 90% | 18.4% | 7.3% | 0.0% | 74.4% |

## False Compliance Error by Percent Coverage: Pilot Study Size Independence

The 95% confidence interval shows that < 5% of the site exceeds the AL (on average), regardless of pilot study size.

[Figure panels: Discrete (CV = 1.5); ISM (CV = 1.5)]

## False Compliance Error by Percent Coverage: Pilot Study Size Dependence

- Discrete (CV = 1.5): when the pilot study = 30% of the site, the 95% prediction interval shows < 5% probability that the area of exceedance is > 5%
- ISM (CV = 1.5): when the pilot study = 30% of the site, the 95% prediction interval shows < 5% probability that the area of exceedance is > 10%

## False Compliance Error by Probability of Higher Concentration Area

False compliance error is not sensitive to the portion of the site that has higher concentrations. This means the findings from this simulation apply to sites with heterogeneous concentrations.

## False Compliance Error by Percent of Pilot Study DUs Characterized by r = 3 Replicates

Site area of exceedance is not sensitive to the percent of DUs with r = 3.
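To compare the outcome percentages across methods, it can help to reduce each 2x2 result to an overall correct-decision rate and to error rates conditional on the true site condition. The sketch below does this arithmetic on the reported pilot study outcome percentages. It assumes that, in each reported table, rows are the site-wide condition and columns are the pilot study decision; `OUTCOMES` and `summarize` are hypothetical names introduced for illustration.

```python
# (method, CV) -> percent of iterations, in the order:
# (site NC & pilot NC, site NC & pilot C, site C & pilot NC, site C & pilot C)
OUTCOMES = {
    ("discrete", 0.5): (12.2, 0.3, 3.9, 83.6),
    ("discrete", 1.5): (17.4, 0.7, 25.2, 56.8),
    ("discrete", 3.0): (24.9, 1.2, 37.6, 36.2),
    ("ism_t", 0.5): (11.3, 0.6, 1.3, 86.7),
    ("ism_t", 1.5): (16.9, 1.6, 4.6, 77.0),
    ("ism_t", 3.0): (21.1, 4.1, 8.0, 66.8),
}

def summarize(cells):
    """Return overall and conditional rates (all in percent) for one 2x2 table."""
    true_nc, false_c, false_nc, true_c = cells
    return {
        # Share of iterations where the pilot decision matched the site.
        "correct": true_nc + true_c,
        # Among truly non-compliant sites, share declared compliant.
        "false_c_given_site_nc": 100.0 * false_c / (true_nc + false_c),
        # Among truly compliant sites, share declared non-compliant.
        "false_nc_given_site_c": 100.0 * false_nc / (false_nc + true_c),
    }

for (method, cv), cells in OUTCOMES.items():
    s = summarize(cells)
    print(f"{method:8s} CV={cv}: correct={s['correct']:.1f}%  "
          f"FC|NC={s['false_c_given_site_nc']:.1f}%  "
          f"FNC|C={s['false_nc_given_site_c']:.1f}%")
```

The conditional rates separate the two error types from how often each site condition occurred, which varies across scenarios because the action level is drawn at random.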

## Guidelines on Expected False Compliance Error Based on Calculated Ratio (Average 95UCL / AL)

1. If the average 95UCL is low (< 0.1 × AL) or moderate (> 0.4 × AL), there is a negligible probability of making a false compliance error with either discrete sampling or ISM sampling (with both Student's t and Chebyshev UCL methods).
2. If (0.1 × AL) < average 95UCL < (0.4 × AL), false compliance errors vary with CV:
   - If CV = 0.5, then < 5% error for both discrete and ISM sampling
   - If CV = 3.0, then < 5% error for discrete sampling, ~8% error for ISM with the Chebyshev UCL, and ~10% error for ISM with the Student's t UCL

## Guidelines on Expected Magnitude of Site Error When False Compliance Occurs

1. By definition, some portion of the site exceeds the AL when there is a false compliance error. Increasing the pilot study area decreases the probability and magnitude of the area of exceedance, while simultaneously increasing the chance of false non-compliance errors.
2. On average (based on a 95% confidence interval), < 5% of the site area exceeds the AL for both discrete and ISM sampling for all pilot study sizes evaluated (e.g., as low as 10% of the site).
3. When the pilot study area equals at least 30% of the site:
   - Discrete sampling: 95% probability that < 5% of the area exceeds the AL
   - ISM sampling: 95% probability that < 10% of the area exceeds the AL

## Guidelines on Sampling Design Options

1. The likelihood of a false non-compliance error is always greater than that of a false compliance error. This is because the pilot decision relies on 95UCLs, which are designed to sit above the true mean, and on random sampling.
2. ISM sampling reduces the false non-compliance error rate by a factor of two compared with discrete sampling applied to the same site conditions.
3. The false compliance error rates using ISM sampling are insensitive to the fraction of the pilot study DUs that are characterized with r = 3 replicates, so long as 95UCLs are calculated for the DUs with r = 1 replicate using the average CV from the sample statistics for the r = 3 replicate DUs. Use a minimum of 3 DUs with r = 3.
4. A pilot study covering 10% to 20% of the site achieves a false compliance error rate of no greater than 5% to 10%, while also minimizing the false non-compliance error rate.
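The ratio-based guidelines on expected false compliance error can be written as a small lookup. This is an illustrative sketch only: the function name and the method labels (`"discrete"`, `"ism_chebyshev"`, `"ism_t"`) are hypothetical. The guidelines do not say how to treat a ratio of exactly 0.1 or 0.4, so the sketch places those values in the middle band as a conservative assumption, and it returns `None` where no figure is stated (for example, CV = 1.5 in the middle band).

```python
def expected_false_compliance(ratio, cv, method):
    """Look up the stated false compliance guidance.

    ratio  -- average pilot 95UCL divided by the action level
    cv     -- scenario coefficient of variation (0.5 or 3.0 are covered)
    method -- "discrete", "ism_chebyshev", or "ism_t" (hypothetical labels)
    """
    # Guideline 1: low or moderate ratios carry negligible error probability.
    if ratio < 0.1 or ratio > 0.4:
        return "negligible"
    # Guideline 2: within the band, the error depends on CV and method.
    if cv == 0.5:
        return "< 5%"
    if cv == 3.0:
        return {"discrete": "< 5%", "ism_chebyshev": "~8%", "ism_t": "~10%"}.get(method)
    return None  # not stated in the guidelines

print(expected_false_compliance(0.05, 3.0, "ism_t"))        # negligible
print(expected_false_compliance(0.25, 3.0, "ism_t"))        # ~10%
print(expected_false_compliance(0.25, 1.5, "discrete"))     # None
```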