Grading ACEP's Report Card



The newly released ACEP report card model is not adequate to provide meaningful comparisons of the capabilities of the states or DC to provide access to emergency medicine. It certainly could not be used to justify the need for a large new hospital in DC. It does not provide much useful fodder for the opponents of the new hospital either, but it does confirm that DC is competitive in view of its much higher poverty levels, abnormal per capita use of emergency rooms, and below-average provision of preventative medicine. Future report cards need to be substantially improved over this first fledgling effort.


In January 2006, the American College of Emergency Physicians (ACEP) released its first "National Report Card on the State of Emergency Medicine", stating that "this effort would be the first of a series of report cards" providing "an assessment of the support each state provides for its emergency medicine system". The Washington Post noted sagely that both proponents and foes of the SE Hospital Plan (NCMC) were citing the report's findings. Foes claim it shows DC has enough hospitals, while advocates read into the scores a message to "stay ahead of the power curve". Virginia health officials, reacting to their very low grades, said they "would take a hard look at the perspective being shown to us".

In view of the continuing clamor over whether or not to build an expensive new National Capital Medical Center, NARPAC decided to take a hard look at the scoring system itself, and see whether in fact the report card accomplished its purpose. Our conclusion is that the grades themselves are probably meaningless, but that the underlying data can provide some useful insights. The user does have to accept the validity of comparing one relatively small city with all fifty US states, and should try to understand the peculiarities of the methodology itself. NARPAC, with considerable familiarity with, and skepticism of, complex weighting and performance-measurement systems, reluctantly concludes that this methodology, like the camel, was designed by a committee. We seriously hope that the summary "grades" will not be flaunted by either NCMC's advocates or foes as a basis for deciding on a $400M new hospital.

First we will describe the parameters of the scoring system;
then offer NARPAC's rather extensive negative comments; and
finally, offer a few positive comments and suggestions.

defining the components of the scoring system

The complicated chart below attempts to summarize all the various contributing aspects of this first-round ACEP scoring system, and the sections that follow explain the various problems inherent in this approach:

Basically, this methodology gathers data on 50 (count 'em, fifty) separate parameters considered germane to the provision of emergency services. These include some 24 quantitative performance measures (e.g., fully-staffed hospital beds per capita) and 26 yes/no questions (e.g., does the state have a motorcycle helmet law?), plus one bonus question (!). These contributing factors are clumped under four different, and quite disparate, headings, each of which has a certain weighting factor:

o 40% for "#1: Access to Emergency Care"; (11 quant. items)
o 25% for "#2: Quality (of medical personnel) and Patient Safety" (5 quant., 5 Y/N items)
o 10% for "#3: Public Health Care and Injury Prevention" (6 quant., 13 Y/N items)
o 25% for "#4: Medical Liability Environment" (2 quant., 8+1 "bonus" Y/N items).

Within each heading, each item is also assigned a weighting to assess its relative importance to that category. Hence under Access (#1), "ER visits per ER doctor" is weighted at 5 points (out of 40), while "registered nurses per capita" gets 3 points. At the other end of the spectrum, under Medical Liability (#4), a "yes" answer to "does the state have a $250K 'hard' cap on non-economic damages?" wins the lucky state 14 points (out of 25). But if a state has a very high percentage of its kids receiving pre-natal care (under Public Health, #3), it is rewarded with 0.5 points (out of 10). Note that all the weighting factors across all four categories add to 100.

converting data values into scores

In a rather interesting, but questionable, next step, the quantitative measures provided by all 51 "states" are ranked from "best" to "worst" for each of the 24 relevant items. Note that in some cases "best" is the highest value (e.g., trauma centers per capita), while in a few cases it is the lowest value (e.g., annual rise in doctors' medical liability insurance). But the next step is the more dubious one. Once the data are ranked by value, a score of 51 is given to the "tallest" and a 1 to the "shortest". The rationale given is to make each measure relative to the best, rather than absolute. Well and good, but this approach also eliminates any proportionality between the contestants. If the "best" state has 18.3 ER doctors per 100K population, it is scored as 51 times as good as the "worst" state, even if the "worst" has 15.7 ER doctors per 100K population.
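The rank-to-score step can be sketched in a few lines of Python (a hypothetical illustration; the function name and the three-state examples are ours, not ACEP's). Note that a wide spread and a narrow spread of raw values produce identical scores, which is exactly the proportionality loss described above:

```python
def rank_scores(values, higher_is_better=True):
    """Convert raw state values to ACEP-style rank scores.

    With n contestants, the best gets n points and the worst gets 1
    (51 down to 1 in the actual report). The score depends only on
    rank order, not on the size of the gaps between states.
    """
    order = sorted(values, reverse=higher_is_better)
    return [len(values) - order.index(v) for v in values]

# Two hypothetical distributions of "ER doctors per 100K population":
wide = [18.3, 9.0, 2.1]      # huge gap between best and worst
narrow = [18.3, 17.9, 15.7]  # nearly identical states
print(rank_scores(wide))    # [3, 2, 1]
print(rank_scores(narrow))  # [3, 2, 1] -- same scores despite tiny gaps
```

The narrow case shows why NARPAC objects: a state trailing the leader by 14% and one trailing by 89% receive exactly the same penalty.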

The yes/no questions are scored differently, with each "yes" getting 51 points, and each "no", zero. Each score is then multiplied by its weighting factor (out of 100). The distribution of scores is unavoidably quite different between the "relative" scores (between 51 and 1) and the "absolute" scores (51 or 0). It is not insignificant that the median composite score in the two categories with a predominance of yes/no answers (#3, #4) is well below the average score, while the median and average for the entirely or predominantly quantitative categories (#1, #2) are very close together. This is shown below, where the states' rankings for Categories #2 and #4 are displayed (clumsily reproduced by NARPAC from the ACEP report):
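As a hypothetical illustration of how the two score types combine with the item weights (the normalization by 51 is our assumption; the report is not explicit about the arithmetic), consider one ranked item and one yes/no item:

```python
def item_score(raw, weight):
    """Weight one item's score.

    raw: a rank score (51 = best .. 1 = worst) for quantitative items,
         or 51 ("yes") / 0 ("no") for yes/no items.
    weight: the item's points; weights sum to 100 across the four
            categories. Dividing by 51 (our assumption) makes the
            maximum contribution equal to the item's weight.
    """
    return raw / 51 * weight

# A state ranked 30th of 51 on "ER visits per ER doctor" (weight 5)
# earns raw score 51 - 30 + 1 = 22; a "yes" on the $250K liability
# cap (weight 14) earns the full 14 points.
quant = item_score(22, 5)     # about 2.16 points
yes_no = item_score(51, 14)   # 14.0 points
```

Note the asymmetry this produces: a single favorable "yes" answer is worth more than six times a middling quantitative performance.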

converting the scores into grades

The next complicating step is that the individual parameter scores are aggregated into each of the four categories (headings), and those results are then aligned again from "best" to "worst", with the "best" being ranked at 100% and all the others lined up in decreasing relative order, as shown immediately above. In a rather uneven conversion, these percentages are converted to letter grades as follows: states that reach at least 80% of the best score get an 'A'; 70%, a 'B'; 50%, a 'C'; 30%, a 'D'; and below 30%, an 'F'. (You'd better hope your kids aren't graded this way!)
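The uneven grade cutoffs quoted above can be written down directly (a minimal sketch; the function name is ours). Notice how unequal the grade bands are: the 'B' band is only 10 points wide, while the 'C' and 'D' bands are 20 points each:

```python
def letter_grade(pct_of_best):
    """Map a state's percentage of the best category score to a letter,
    using the report's cutoffs: 80% -> A, 70% -> B, 50% -> C, 30% -> D.
    """
    for cutoff, grade in [(80, "A"), (70, "B"), (50, "C"), (30, "D")]:
        if pct_of_best >= cutoff:
            return grade
    return "F"

# A state at 69% of the best and one at 50% both earn a 'C',
# while 79% and 70% share the much narrower 'B' band.
print([letter_grade(p) for p in (100, 75, 69, 50, 35, 20)])
```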

converting four category grades into a single state grade

The four grades (one for each category) are then converted to a single letter grade by weighted averaging. The same result would be achieved by aggregating the four (weighted) categories and applying the same equivalency rules. Given the composite ranking methods used, it is not surprising that the overall national average is a 'C' or 'C-'. Mathematically, it would have to be a 'C' were it not for the large number of yes/no questions (most of which turned out to be negative).
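The weighted averaging into a single grade can be sketched as follows (the category keys and the sample percentages are hypothetical; the 40/25/10/25 weights and the grade cutoffs are taken from the report):

```python
CATEGORY_WEIGHTS = {"access": 0.40, "quality": 0.25,
                    "public_health": 0.10, "liability": 0.25}

def overall_grade(category_pcts):
    """Weighted average of the four category percentages, then the
    same 80/70/50/30 letter cutoffs used for category grades."""
    composite = sum(category_pcts[c] * w for c, w in CATEGORY_WEIGHTS.items())
    for cutoff, grade in [(80, "A"), (70, "B"), (50, "C"), (30, "D")]:
        if composite >= cutoff:
            return grade
    return "F"

# A hypothetical state that is middling on access and quality but
# strong on liability: 22 + 12.5 + 4 + 22.5 = 61% composite, a 'C'.
print(overall_grade({"access": 55, "quality": 50,
                     "public_health": 40, "liability": 90}))  # C
```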

NARPAC Commentary (Negative):

This first-pass ACEP methodology is seriously flawed for a variety of reasons, and there is little point in flogging each one to death. Nevertheless, those tempted to use these scores to prove their points would do well to understand that:

o There is no attempt whatsoever to compare the resources provided to the resources needed by that state. For instance, there is little doubt that states with higher poverty rates (minority or not) use more emergency medical services than those with good insurance and family doctors. This is not considered here, but it is the essence of DC's NCMC issue. Note on the chart below how much higher DC's poverty burden is than that of any of the other nine "states" finishing in the "top ten" (just about double the average, shown in blue). This pattern will be evident in several of the subsequent charts:

o The weighting given to preventive medicine seems farcical (at only 2% total weighting). The parameters considered involve only pre-natal care and various vaccinations, and no consideration is given to primary care facilities and local clinics, another fundamental NCMC issue;

o The notion of purely additive component scores implies the existence of trade-offs between various parameters (more nurses, fewer doctors??), and can generate very erroneous grading. For instance, California wins the jackpot (as the best overall state) solely because it has a "hard cap" on medical liability payments, even though it devotes only 58% as many resources per capita to emergency care access as DC does. This is indicated on the chart below, which "stacks" the four weighted categories to show their contributions to the total. Leave the yellow (liability) tops off, and DC becomes #1 while California drops to #9; leave the blue (prevention) bands off, and DC becomes even more outstanding:

o The lack of multiplicative factors, which would indicate that the absence of a certain factor would be a "deal-breaker" (i.e., no doctors, zero score), is a serious shortcoming. In this methodology, DC could improve its relative standing (already probably too high) by eliminating all ERs, ER doctors, and nurses (-13 points) but adopting a $250K liability cap (+14 points);

o The unspoken fact is that if the capabilities in one certain area (say, access to ambulances) vary by only 20% from top to bottom among the 51 "states", their impact on scores is artificially exaggerated. The top state gets an 'A+', the middle state a 'C', and the bottom state an 'F' (in that specific category);

o In this model, there is no limit to having more of a good thing. If DC had four times as many fully-staffed hospital beds as it has now (already twice as many as the nearest contender!), its overall rating would stay the same, but it would drive down the grades of all 50 other states, because their standing relative to the top dog would decline. Shouldn't DC be penalized for having too many beds instead?

o There is another hidden assumption in this analysis: that 40% of all ER visits are for trauma cases. Hence, lowering traffic accidents, occupational accidents, and substance abuse is scored higher than treating asthma, diabetes, or high blood pressure in normal clinics/hospitals. NARPAC has not yet uncovered comparative data on this issue, but we are certain that, say, Michigan and DC have very different reasons for visiting an ER!

o Though there are indicators for access to ambulances and minor mention of ambulance diversions, there is no geographic factor whatsoever to indicate the average distance/time from any point in a state to the nearest ER/trauma center (whether in that state or a neighboring state). NARPAC cannot believe that (total) ambulance time en route is not important, or that the availability of helicopters vs ground vehicles should be ignored. Try to explain DC's need for an NCMC to a Texan! Surely there is a geography factor;

o NARPAC commends the ACEP on the one hand for including considerations of disaster relief but doubts the wisdom of subsuming its value within other more routine ER tasks. For the "Beat the Game Gang" (including NARPAC), it seems unwise to suggest a realistic trade-off between these disparate emergency needs. Tongue-in-cheek, NARPAC notes that DC would be better off with NO trauma centers and NO fully-staffed hospital beds, as long as it can say "yes", or even just "uh-huh", to a) offering citywide training in disaster responses, and b) having a so-so reporting system to itself on ambulance diversions.

o In addition to disaster-relief as an "outlier" for gauging access to emergency care, so certainly is the whole matter of the medical liability environment. Addressing these issues is of substantial importance (NARPAC might well have ignored them!), but stirring, if not baking, them into the mix of day-to-day emergency medicine is probably inappropriate. ACEP would do well in the development of future score/grading systems to keep separate those elements that deserve independent treatment.

o Finally, it is clearly necessary to understand the inherent characteristics of the methodology that has been adopted. It is embarrassingly inappropriate to cluck about the US getting a 'C-' on its national report card when the grading methodology almost assures the average grade will be near the dividing line between a 'C' and a 'D'. The implicit assumption that "there's lots of room for improvement" doesn't apply if the grading system is relative rather than absolute, and requires that the average result be 50%. In this case, the overall national grade would be even worse if the grading methodology made the passing grade 60% while the scoring methodology dictates that the average score be 50%!

The above criticisms would be 100% true in this case if all 50 chosen parameters had quantitative values ranked linearly from best to worst. It would also be true if the expected chances of answering "yes" or "no" to (sometimes loaded) questions were even-Steven. In this particular methodology, the maximum "yes" score amounts to 36% of the total, and is an "absolute" score. The remaining 64% is made up of "relative" scores. Thus the "best possible" overall national score is 36 + 32 = 68, which (apparently) equates to a 'C+' (or possibly a 'B-'?). States seeking to "beat the game" should focus more on trying to earn "yes" points than on pushing their way up the ranking line!
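The arithmetic behind that "best possible" score can be checked directly (the 36/64 split is from the paragraph above; the 50% average for ranked items follows from the 51-down-to-1 scoring, which fixes the mean rank score near half the maximum):

```python
YES_NO_WEIGHT = 36   # share of total weight on yes/no (absolute) items
RANKED_WEIGHT = 64   # share of total weight on ranked (relative) items

# Even if every state answered "yes" to everything (earning the full
# 36 points), the ranked items still average roughly half their
# maximum by construction, so the best possible *national average* is:
best_possible = YES_NO_WEIGHT + 0.5 * RANKED_WEIGHT
print(best_possible)  # 68.0
```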

NARPAC Commentary (Positive):

o Despite these seemingly "sour grapes" comments, NARPAC strongly favors the development of quantitative criteria and measurement of government performance at all levels. The danger lies in getting "too cute by half" in trying to merge disparate factors into some sort of single grade. There is no harm in handing out consistent, properly weighted grades in several different "subjects". And there would certainly be no discredit in trying to develop proper independent measures for both the relative needs and the relative satisfaction of those needs.

o But there is also a genuine benefit in providing access to the input materials themselves, and letting independent analysts apply their own weighting factors to their own specific applications. The chart below shows how DC (green bar) stacks up quantitatively relative to the other nine "states" that made it into the select 'B'/'B-' Club (there were no 'A's). These eleven parameters are all the variables that comprise the Access to Emergency Medicine in the ACEP methodology. DC leads in six of them, comes in second in two; third or fourth in two; and is below the median only in per capita funding for "SCHIP", which stands for State Children's Health Insurance Program.

In comparison to the other "states" (and their light blue average), however, DC has twice as many staffed beds per capita, pays twice as much for hospital care per capita, and spends three times as much on Medicare for adults under 65. It also has more than its share of emergency doctors per capita, but they do not treat an unusual number of ER visitors per day. Whether DC deserves 'B' for this performance or needs a new hospital is certainly not clear from these comparisons.

The final chart shows four more comparative variables from other categories. Clearly, DC has an abnormal number of ER visits per capita, has way more than its share of substance abusers to cope with, supports a very high number of emergency medicine residents per capita, but lags somewhat in the provision of preventative shots and pre-natal care. The problem with these charts, of course, as with the ones above, is that there is no baseline provided that reflects the specific characteristics of DC, or any state for that matter, to rationalize their ranking or more basically, their needs.





This page was updated on Mar 5, 2006



© copyright 2007 NARPAC, Inc. All rights reserved