Thursday, June 25, 2020

Who's Getting Sick - Race Matters

On April 10, 2020, the CDC posted a report that discussed the geographic variations in the spread and mortality rate of COVID 19. These included the following differences in location that might be influencing the pattern of disease incidence occurring across the United States:
  • the timing of COVID 19 introduction into an area
  • the relative population density of cities compared to rural areas
  • demographic values such as prevalence of different age groups and those with existing conditions
  • the timing and extent of government recommendations to diminish public interaction
  • diagnostic testing capacity in different jurisdictions
  • the level of public health reporting consistency and prioritization.
I have written about several of these in my posts, but there is one that now intersects with current events in a significant way beyond the issues of health and economic upheaval: the demographics of race. In the midst of this global pandemic, an event occurred that burst into the forefront of the daily news cycles, moving COVID 19 updates into the background. George Floyd was living in Minneapolis, Minnesota, when on May 25, 2020, he was arrested by police after having been identified by a store clerk as having paid for his purchase with counterfeit money. Seventeen minutes later, Mr. Floyd was handcuffed and on the ground, held down by three police officers, one with a knee on Mr. Floyd's neck. After 8 minutes and 46 seconds of being held in that position, George Floyd had become unresponsive. An ambulance arrived a few minutes later and took Mr. Floyd to the hospital, where he was declared dead.

It was not just because George Floyd was black that this story resonated so strongly around the world. As Sherrilyn Ifill, president of the NAACP Legal Defense Fund, said in an interview with CBS's Bill Whitaker, "one of the reasons why the George Floyd video set us off so much was the realization that it's not different. We've-- we've seen the videos. And the videos seem not to make a difference. And that's why that officer could look like that. He wasn't afraid of being videotaped. He wasn't trying to hide what he was doing."

As Ms. Ifill said, we have seen this all before, many times. If we look at the numbers of men at risk of being killed by police, the imbalance between ethnic groups is overwhelming.

Adding insult to injury, George Floyd was tested for COVID 19 after his death and was found to be positive, though asymptomatic. That African Americans are victims of police brutality is bad enough; they are also almost five times more likely than white people to be hospitalized for COVID 19.

As can be seen in the two graphs above, it is not just blacks who are being treated more harshly by the police and the pandemic - all minorities are suffering at a greater rate than whites. While density of the population in urban places plays a role in increased rates of infection, it really comes down to whether you are rich enough to "shelter in place" or, if you are not, are forced to go out to work at frontline service jobs in close proximity to others. The maps below show that, even in 2015, well before the pandemic, minorities were less likely to find work than whites.

The values that are represented in these maps are based on the ratio between the rate of unemployment for the minority and the rate of unemployment for whites. The rate of unemployment is calculated by dividing the number of a group who are unemployed by the total labor force of that group. The "labor force" is defined as those who are currently employed plus those who are not working but are actively looking for work. The areas for which the values are aggregated are congressional districts.
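To make the arithmetic behind the maps concrete, here is a minimal sketch of the ratio as described above. The function names and the district counts are my own, made up purely for illustration; they are not taken from the map data.

```python
def unemployment_rate(unemployed, employed):
    """Unemployed people divided by the labor force.

    The labor force is everyone employed plus everyone who is not
    working but is actively looking for work (the unemployed).
    """
    labor_force = employed + unemployed
    if labor_force == 0:
        raise ValueError("labor force is empty")
    return unemployed / labor_force


def unemployment_ratio(minority_unemployed, minority_employed,
                       white_unemployed, white_employed):
    """Minority unemployment rate divided by white unemployment rate."""
    minority_rate = unemployment_rate(minority_unemployed, minority_employed)
    white_rate = unemployment_rate(white_unemployed, white_employed)
    if white_rate == 0:
        raise ValueError("white unemployment rate is zero; ratio undefined")
    return minority_rate / white_rate


# Hypothetical congressional district (illustrative counts only).
# Minority: 1,200 unemployed of a 12,000 labor force (10%).
# White:    2,000 unemployed of a 40,000 labor force (5%).
ratio = unemployment_ratio(1_200, 10_800, 2_000, 38_000)
print(f"Unemployment ratio: {ratio:.2f}")
```

A ratio of 1 would mean both groups are unemployed at the same rate; in this made-up district the minority rate is twice the white rate.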

When the rates of COVID-19 deaths for different races are compared to each group's proportion in a state and then combined, the result can be used to show how far variances in racial deaths diverge from the entire state population death rate. A map of these divergences was developed by the University of California, Berkeley.

A cursory examination of the map above and the ones of unemployment shows a probable cause-effect relationship between deaths and unemployment rates for certain states: Arizona, Georgia, Nevada, Michigan, Florida, and Missouri. In others, however, there seems to be no relationship: California, Texas, Oregon, and Wyoming. The unemployment rate itself is partly a function of a racial bias, which also reinforces several other circumstances that increase susceptibility to infection.

The CDC lists a number of race-related influences that affect health:
  • residential segregation that creates denser populations and greater distances to groceries and health care;
  • higher employment in essential industries requiring working outside the home and less paid sick leave;
  • poorer underlying health conditions like lack of health insurance and serious pre-existing illness.
The inequality that exists in our society has made a difficult situation even worse for those who, for no reason other than the color of their skin, face so many injustices already.

Monday, June 1, 2020

COVID 19 Testing - Learning From Our Mistakes

There is considerable evidence that our handling of the onset of COVID 19 was less than adequate. If a broader portion of the population had been tested early on, a more accurate assessment of the infection could have been made, leading to more useful recommendations for action. Several scientists fear that the current pandemic is the leading edge of what may be a wave of more frequent infectious diseases. They suggest that this increase is in part due to changes in climate brought about by human activity. We need to pay attention to what works and remember those lessons. Insufficient testing is just one of the teachable moments; there are many other missteps that should be noted and corrected.

Even if testing had reached more people in the beginning, there were other practices that delayed our understanding of the true scope of the pandemic. Built into any data set is the inherent tendency for errors to creep in. The path of data from collecting it to loading it into a database consists of a number of steps, each one of which can be a possible source of data ambiguity, alteration, and misrepresentation. The cumulative effect of errors on data and, consequently, on conclusions drawn from data, can be minimal and of little concern, or it can be substantial and lead to significant harm when actions based on data analysis are invoked (testing bias).

For examples of two types of errors, we can look at Texas and its experience with testing and the reporting of results. Other states are having similar problems, so Texas is not unique. The map below indicates the probability of an outbreak, calculated for each county.


The map was included in a report from April 5, 2020. The authors referenced a paper by researchers who had developed a tool to estimate the risk associated with the spread of infections. The results depended on the extent of infection reporting, level of information about the transmission rate and the possibility of "super-spreading events". As in most states early on, testing in Texas was minimal. The number of tests that returned positive probably did not represent the true count of cases at the time. The group estimated that if only one case was reported in a county, there was a 51% chance that an outbreak was already taking place. An outbreak was defined as a "sustained local transmission that will continue to spread".

The second example deals with the ambiguity of the reporting process itself. The graph below shows the daily number of cases reported as of May 20, 2020.

The values for daily cases vary widely, so I added a rolling average, calculating the average number of cases over each seven-day window (blue line). The line smooths out the graph so that the trend becomes visible. The values rose quickly from mid-March to the early part of April, then plateaued till the first week of May. The average cases per day once more climbed quickly until by May 20 they had more than doubled from the first week of April. The governor of Texas began reopening retail establishments at the beginning of May. Many states that have started reopening businesses are now experiencing higher numbers of cases per day.
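For readers who want to reproduce this kind of smoothing, here is a minimal sketch of a seven-day rolling average. It assumes a trailing window (each day is averaged with the six days before it), which is only one of several reasonable choices, and the daily counts are made up for illustration rather than taken from the Texas data.

```python
def rolling_average(daily_cases, window=7):
    """Trailing rolling average of a list of daily counts.

    Returns a list the same length as the input. Days that do not yet
    have a full window of history are reported as None.
    """
    averages = []
    for i in range(len(daily_cases)):
        if i + 1 < window:
            averages.append(None)  # not enough history yet
        else:
            chunk = daily_cases[i + 1 - window : i + 1]
            averages.append(sum(chunk) / window)
    return averages


# Hypothetical daily case counts (illustrative only).
cases = [10, 20, 30, 40, 50, 60, 70, 80]
smoothed = rolling_average(cases)
for day, (raw, avg) in enumerate(zip(cases, smoothed), start=1):
    print(day, raw, avg)
```

The first six days have no average because a full week of data is not yet available; from day seven on, each value is the mean of that day and the six before it.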

This graph came from an article that reported testing in Texas was also increasing to more than 20,000 tests per day. Most states are now using the percentage of new cases relative to total tests per day as an indication of the improving or worsening status of the infection. This value was chosen as an acknowledgement that as testing increases, more cases will be found, but the percentage of new cases to tests would better represent the trend of the infection in the population as a whole.
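A small sketch shows why the percentage can tell a different story than the raw count. The function name and the two days of numbers are hypothetical, chosen only to illustrate the point.

```python
def percent_positive(new_cases, tests):
    """New cases as a percentage of tests performed that day."""
    if tests <= 0:
        raise ValueError("tests must be greater than zero")
    return 100.0 * new_cases / tests


# Two hypothetical days (illustrative numbers only).
# Testing doubles, and the raw case count rises from 500 to 800.
day_1 = percent_positive(new_cases=500, tests=10_000)
day_2 = percent_positive(new_cases=800, tests=20_000)

print(f"Day 1: {day_1:.1f}% positive")
print(f"Day 2: {day_2:.1f}% positive")
```

In this made-up example more cases were found on the second day, but the share of tests coming back positive went down, which is the distinction the percentage is meant to capture.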

Unfortunately, reporting guidelines set forward by the CDC at the beginning of the pandemic were not specific enough to distinguish between viral diagnostic testing and antibody testing which gauges the level of immunity in the population. It was not until after April 5th that the CDC revised their reporting form and clarified the definition of a confirmed case versus a probable case. As a result of the change, a case was considered "probable" if an antibody test was positive, but was not counted as a "confirmed" case. Confirmed cases are most often used in graphs and maps to indicate the spread of the disease.

Initially, some states, including Texas, had been reporting the results of viral and antibody tests without distinguishing between the two. According to the referenced article, the state continued to lump the two test results together even after the reporting form was revised to separate the two. There can be accuracy problems with both tests (false positives and false negatives), but a positive viral test is usually considered to be reliable. Antibody tests, on the other hand, have been reported to have less reliability in determining the occurrence of infections and result in misleading test statistics when included with viral test results.
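To see how lumping the two test types together can distort the statistics, consider the following sketch. All of the counts are hypothetical, and the direction of the distortion depends entirely on the numbers involved; the point is only that the combined figure matches neither test on its own.

```python
def percent_positive(positives, tests):
    """Positive results as a percentage of tests performed."""
    if tests <= 0:
        raise ValueError("tests must be greater than zero")
    return 100.0 * positives / tests


# Hypothetical one-day totals (illustrative numbers only).
viral = {"positives": 1_000, "tests": 10_000}     # current infections
antibody = {"positives": 150, "tests": 5_000}     # past exposure

viral_rate = percent_positive(viral["positives"], viral["tests"])
antibody_rate = percent_positive(antibody["positives"], antibody["tests"])

# Lumping both test types into a single total, as described above.
combined_rate = percent_positive(
    viral["positives"] + antibody["positives"],
    viral["tests"] + antibody["tests"],
)

print(f"Viral only:    {viral_rate:.1f}%")
print(f"Antibody only: {antibody_rate:.1f}%")
print(f"Combined:      {combined_rate:.1f}%")
```

With these made-up numbers the combined rate falls between the two separate rates, so it understates the share of viral tests that were positive while also inflating the total number of tests performed.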

The two types of errors described in this post, under-testing and ambiguous reporting, can lead to confusion when presented to the public and to an inadequate basis for policy decisions regarding social restrictions. This puts people's lives at risk as we learn the hard way that data is critical for fighting this pandemic. The hope is that we will, in time, more clearly and carefully bring robust data to bear on the effort.