There is considerable evidence that our handling of the onset of COVID-19 was less than adequate. If a broader portion of the population had been tested early on, a more accurate assessment of the infection could have been made, leading to more useful recommendations for action. Several scientists fear that the current pandemic is the leading edge of what may be a wave of more frequent infectious diseases. They suggest that this increase is due in part to changes in climate brought about by human activity. We need to pay attention to what works and remember those lessons. Insufficient testing is just one of the teachable moments; there are many other missteps that should be noted and corrected.
Even if testing had reached more people in the beginning, there were other practices that delayed our understanding of the true scope of the pandemic. Any data set has an inherent tendency for errors to creep in. The path of data from collection to loading into a database consists of a number of steps, each of which can be a source of ambiguity, alteration, and misrepresentation. The cumulative effect of errors on data and, consequently, on conclusions drawn from that data can be minimal and of little concern, or it can be substantial and lead to significant harm when actions are taken on the basis of the analysis (testing bias).
For examples of two types of errors, we can look at Texas and its experience with testing and the reporting of results. Other states are having similar problems, so Texas is not unique. The map below indicates the probability of an outbreak, calculated for each county.
The map was included in a report from April 5, 2020. The authors referenced a paper by researchers who had developed a tool to estimate the risk associated with the spread of infections. The results depended on the extent of infection reporting, the level of information about the transmission rate, and the possibility of "super-spreading events". As in most states early on, testing in Texas was minimal. The number of tests that returned positive probably did not represent the true count of cases at the time. The group estimated that if only one case had been reported in a county, there was a 51% chance that an outbreak was already taking place. An outbreak was defined as a "sustained local transmission that will continue to spread".
The second example deals with the ambiguity of the reporting process itself. The graph below shows the daily number of cases reported as of May 20, 2020.
The values for daily cases vary widely, so I added a rolling average, calculating the mean number of cases over each seven-day window (blue line). The line smooths out the graph so that the trend becomes visible. The values rose quickly from mid-March to the early part of April, then plateaued until the first week of May. The average cases per day then climbed quickly again, and by May 20 they had more than doubled from the first week of April. The governor of Texas began reopening retail establishments at the beginning of May. Many states that have started reopening businesses are now experiencing higher numbers of cases per day.
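To make the smoothing step concrete, here is a minimal sketch of a seven-day rolling average in plain Python. It assumes a trailing window (each point averages that day and the six days before it), which may not match the windowing used for the blue line in the graph. The function name `rolling_mean` and the daily counts are made up for illustration; they are not the Texas figures.

```python
def rolling_mean(values, window=7):
    """Trailing moving average.

    Positions that do not yet have a full window are returned as None.
    """
    result = []
    running_sum = 0.0
    for i, value in enumerate(values):
        running_sum += value
        # Drop the value that has just fallen out of the window.
        if i >= window:
            running_sum -= values[i - window]
        if i >= window - 1:
            result.append(running_sum / window)
        else:
            result.append(None)
    return result


# Hypothetical daily case counts, for illustration only.
daily_cases = [100, 140, 90, 160, 120, 80, 150, 170, 110, 190]
smoothed = rolling_mean(daily_cases)

for day, (raw, avg) in enumerate(zip(daily_cases, smoothed), start=1):
    label = "n/a" if avg is None else f"{avg:.1f}"
    print(f"day {day:2d}: raw={raw:4d}  7-day avg={label}")
```

The first six days have no average because a full seven-day window is not yet available. After that, the smoothed values move far less from day to day than the raw counts, which is what makes the underlying trend easier to see.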
This graph came from an article that reported testing in Texas was also increasing, to more than 20,000 tests per day. Most states are now using the percentage of new cases relative to total tests per day as an indication of whether the infection is improving or worsening. This value was chosen in acknowledgement that as testing increases, more cases will be found, whereas the percentage of new cases to tests better represents the trend of the infection in the population as a whole.
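The reasoning behind the percentage measure can be shown with a small sketch. The function name `percent_positive` and the two days of numbers are hypothetical, chosen only to show how raw case counts and the positive percentage can move in opposite directions when testing expands.

```python
def percent_positive(new_cases, tests):
    """Return new cases as a percentage of tests performed that day."""
    if tests <= 0:
        raise ValueError("tests must be a positive number")
    return 100.0 * new_cases / tests


# Hypothetical figures, for illustration only.
early = percent_positive(new_cases=1000, tests=10000)
later = percent_positive(new_cases=1500, tests=25000)

print(f"early: 1000 cases, {early:.1f}% positive")
print(f"later: 1500 cases, {later:.1f}% positive")
```

In this made-up example the daily case count rises by half, yet the share of tests that come back positive falls, because testing grew faster than the number of cases found.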
Unfortunately, the reporting guidelines set out by the CDC at the beginning of the pandemic were not specific enough to distinguish between viral diagnostic testing and antibody testing, which gauges the level of immunity in the population. It was not until after April 5th that the CDC revised its reporting form and clarified the definition of a confirmed case versus a probable case. Under the revised definitions, a positive antibody test made a case "probable", but the case was not counted as "confirmed". Confirmed cases are most often used in graphs and maps to indicate the spread of the disease.
Initially, some states, including Texas, had been reporting the results of viral and antibody tests without distinguishing between the two. According to the referenced article, the state continued to lump the two sets of results together even after the reporting form was revised to separate them. There can be accuracy problems with both tests (true and false results), but a positive viral test is usually considered to be reliable. Antibody tests, on the other hand, have been reported to be less reliable in determining the occurrence of infections, and they produce misleading test statistics when included with viral test results.
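A small sketch shows how combining the two kinds of tests can shift the headline percentage. All of the numbers below are invented for illustration, and the direction and size of the distortion in practice would depend on the actual mix of tests and their results. The helper `percent_positive` is a hypothetical name, not part of any reporting system.

```python
def percent_positive(positives, tests):
    """Return positives as a percentage of tests performed."""
    if tests <= 0:
        raise ValueError("tests must be a positive number")
    return 100.0 * positives / tests


# Hypothetical one-day totals, for illustration only.
viral_tests, viral_positives = 20000, 2000
antibody_tests, antibody_positives = 5000, 100

viral_only = percent_positive(viral_positives, viral_tests)
combined = percent_positive(
    viral_positives + antibody_positives,
    viral_tests + antibody_tests,
)

print(f"viral tests only:   {viral_only:.1f}% positive")
print(f"viral and antibody: {combined:.1f}% positive")
```

With these made-up figures the combined percentage is lower than the viral-only percentage, so the lumped statistic would make the current infection look less widespread than the diagnostic tests alone suggest. A different mix could push the number the other way, which is why the two test types need to be reported separately.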
The two types of errors described in this post, under-testing and ambiguous reporting, can confuse the public and leave policymakers with an inadequate basis for decisions about social restrictions. This puts people's lives at risk as we learn the hard way that data is critical for fighting this pandemic. The hope is that, in time, we will bring robust data to bear on the effort more clearly and carefully.