Wide-area data and algorithms in large power systems are creating new opportunities to implement measurement-based dynamic load modeling techniques. These techniques improve the accuracy of dynamic load models, which are an integral part of transient stability analysis. Measurement-based load modeling techniques commonly assume that response error is correlated with system or model accuracy, where response error is the difference between simulation output and phasor measurement unit (PMU) samples. This paper investigates how the similarity measures, output types, simulation time spans, and disturbance types used to generate response error affect its correlation with system accuracy. It addresses two questions: 1) can response error indicate whether the dynamic load model used at a bus is sufficiently accurate, and 2) can response error determine total system accuracy? The results show that only specific combinations of metrics yield statistically significant correlations, and that no clear pattern emerges among the combinations that do. These outcomes highlight concerns with common measurement-based load modeling techniques and raise awareness of the importance of carefully selecting and validating similarity measures and response output metrics. Naive or untested selection of metrics can deliver inaccurate and misleading results.
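
To make the central quantity concrete, the following is a minimal sketch of how a response error and its correlation with model accuracy might be computed. It is not the procedure used in this paper. The similarity measure (root-mean-square difference), the Pearson correlation with a permutation test, the function names, and the synthetic voltage traces are all assumptions chosen for illustration. The toy data are constructed so that response error grows with parameter error, and the abstract's point is that this relationship often fails to hold on real systems.

```python
import math
import random


def rms_response_error(simulated, measured):
    """RMS difference between simulation output and PMU samples (one assumed similarity measure)."""
    if len(simulated) != len(measured):
        raise ValueError("signals must have the same number of samples")
    squared = sum((s - m) ** 2 for s, m in zip(simulated, measured))
    return math.sqrt(squared / len(simulated))


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def permutation_p_value(x, y, trials=2000, seed=0):
    """Two-sided permutation p-value for the Pearson correlation of x and y."""
    rng = random.Random(seed)
    observed = abs(pearson(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if abs(pearson(x, shuffled)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (trials + 1)


# Synthetic stand-in for PMU voltage samples after a disturbance (hypothetical data).
times = [0.1 * k for k in range(50)]
measured = [1.0 - 0.2 * math.exp(-t) for t in times]

# Hypothetical load-model parameter errors, one per simulated case.
parameter_errors = [0.00, 0.02, 0.05, 0.08, 0.12, 0.15]
response_errors = [
    rms_response_error([1.0 - (0.2 + e) * math.exp(-t) for t in times], measured)
    for e in parameter_errors
]

r = pearson(parameter_errors, response_errors)
p = permutation_p_value(parameter_errors, response_errors)
print(f"correlation = {r:.3f}, permutation p-value = {p:.4f}")
```

Swapping `rms_response_error` for a different similarity measure, or changing the output signal or time window, changes `response_errors` and can therefore change whether the correlation is significant. That sensitivity is what the study examines.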