COVID-19: Doing It Right

We can defeat the COVID-19 epidemic. We’ve already been shown how.

New York state was hard hit by COVID-19, and hit early by U.S. standards. Because of the incompetent federal response, nobody in the U.S. was properly prepared. Responding quickly, New York State took the tough steps that brought the disease under control. They even forced the outbreak down — they didn’t just “flatten” the curve, they crushed it. And they haven’t wavered, but have stayed the course, holding on to those measures that have enabled them to lower the daily case load from over 400 per day per million population, to under 30.


Here’s the same data plotted on a logarithmic scale; it changes the y-axis so that as you go up, the numbers grow faster and faster, which makes exponential growth show up as a straight line. It emphasizes the most important aspect of the rise or fall of this disease: the rate of exponential growth. In addition to plotting the data on a logarithmic scale, I’ll estimate its exponential growth rate during various time periods, in the crudest way possible (shown by the red line):

As crude as it is, it gives us realistic estimates. The early invasion grew with a doubling time (the time required for the caseload to double) of less than 2 days. Then the rate slowed, so the doubling time increased to over 9 days. Since early April, the case load has been in decline so the rate is negative, with a half-life (the time required to cut the caseload in half) of 19 days.

That means that, on their best behavior with full measures in place, it took New York state roughly 19 days to cut the case load in half. But unchecked, it took less than two days for it to double.
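The conversion between a fitted exponential rate and a doubling time (or half-life) is simple enough to sketch. Here is a minimal Python illustration, using a hypothetical caseload and the same crude straight-line fit to the logarithm described above; the data are made up for demonstration only:

```python
import numpy as np

def growth_rate_and_doubling(days, cases):
    """Crude exponential-rate estimate: the slope of a straight-line fit
    to log(cases) vs. time. Doubling time = ln(2) / rate; a negative
    result is (minus) the half-life."""
    slope = np.polyfit(days, np.log(cases), 1)[0]
    return slope, np.log(2) / slope

# Hypothetical caseload doubling every 2 days:
days = np.arange(10)
cases = 100 * 2 ** (days / 2)
rate, t_double = growth_rate_and_doubling(days, cases)  # t_double is ~2 days
```

A declining caseload gives a negative slope, and the magnitude of the (negative) "doubling time" is then the half-life, 19 days in New York's case.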

New York today: the average number of daily new cases per million population is under 30. That’s low enough that outbreaks can be detected early, and when they arise can be contained with contact tracing and quarantine. AND — perhaps even more important — the caseload is still going down.

That’s why New York state is actually ready to re-open IF they do it right, which means taking small steps and watching the numbers like a hawk. You have to be ready to re-close if some re-opening step goes haywire. A responsible state government must make it clear to the people that re-opening steps are conditional, and can be reversed at any moment. That’s what New York is doing.

Because with an epidemic as hot as COVID-19, if we get too close we’ll get scalded. Smart people start slowly. Dip a toe in the water — and be ready to yank it back if it burns.


California was also hit early, but the first wave didn’t explode as fast as in New York, taking almost 4 days to double rather than under 2. By the time measures were in place, they hadn’t gotten into nearly as much trouble as New York; in early April California was barely over 30 cases per day per million population.

They certainly succeeded in flattening the curve. But they never made it go down. During April and May it was rising slowly, taking almost two months to double (doubling time 57 days), but it was still rising.

Then the re-opening steps kicked in and the rate got even faster, so lately the doubling time is a mere 17 days. The caseload is getting higher, even faster.

California lost ground by not forcing the curve down; during April and May they doubled the daily case load, which put them in a terribly vulnerable position when re-opening. Re-opening has been too much, too soon. Where once they had only a tenth as many cases per capita as New York, now the roles are reversed. And California’s problem is still on the rise, while New York’s still falls.


Texas was not hit as hard, or as early, as New York. When April arrived, they were barely over 20 cases per day per million population. But like California they failed to drive the curve down, flattening it but not reducing it. And like California, their re-opening has been too much too soon. In fact the Texas re-opening has been more of a disaster than California’s; the caseload now takes less than 10 days to double.


Florida also was not hit early, and by early April was still around 30 cases per day per million population. Unlike California and Texas, they actually managed to drive the curve down — albeit slowly, with a half-life of 91 days. That’s still a decline. But their re-opening has been even more of a disaster than Texas’, with cases now rising with a doubling time of just 9 days.


For comparison purposes, and to smooth out some of the random fluctuations, here are 7-day running averages of the caseload for all four states, plotted in the “ordinary” way (a linear scale rather than a logarithmic scale):

All four states have recently taken steps to re-open the economy. New York was ready, and still they’re proceeding cautiously, dipping a toe in the water to test things. California opened too early, they weren’t ready. Texas and Florida weren’t ready, but rather than go slowly they decided to jump in. Over their heads.
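The 7-day smoothing used for that comparison plot is just a running mean. A minimal sketch, with made-up daily counts for illustration:

```python
import numpy as np

def running_average_7day(daily_cases):
    """7-day running average of daily new cases, which smooths out
    day-of-week reporting artifacts and random fluctuations."""
    return np.convolve(daily_cases, np.ones(7) / 7, mode="valid")

# Made-up noisy daily counts:
daily = np.array([100, 140, 90, 120, 110, 80, 130, 150, 95, 125])
smooth = running_average_7day(daily)  # one value per full 7-day window
```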

I Love New York.

This blog is made possible by readers like you; join others by donating at My Wee Dragon.


35 responses to “COVID-19: Doing It Right”

  1. That’s low enough that outbreaks can be detected early, and when they arise can be contained with contact [tracing] and quarantine. AND — perhaps even more important — the caseload is still going down.

    How effective is contact tracing in America? You do not read much about it, even though it is the main way to fight the virus without locking down. Seeing how slowly cases drop after lockdown (if at all) makes me wonder how effective the contact tracing is.

    Are there any numbers on this? The percentage of infected people whose contacts are known, the percentage of contacts actually contacted (if I understand it right, in the UK this is only 25%), the percentage of people who use a contact tracing app.

    • Contact tracing in America varies by jurisdiction. Here in South Carolina, and I think many places, it’s primarily a state responsibility.

    • I wouldn’t take the UK as a good example of what Test&Trace can achieve. With the lockdown in the process of being reversed, many worried epidemiologists say it is too quick, given that the UK Test&Trace system is a joke. It has been set up in a hurry at the command of a government more concerned with forging good media stories than the actual job in hand. And they employ people and organisations who have zero experience of what such a system entails.

      Indeed, until yesterday almost all Local Authorities (who would normally have a lead role in Test&Trace of infectious disease) had no access to any local test results for Covid-19 which were conducted outside hospitals. That amounts to a third of all tests and the majority of tests over recent weeks.

      To see what is possible with Test&Trace, consider South Korea who squashed a serious peak (which was possibly two-thirds down to a crazy sect) and have kept the lid on an infection rate of a few dozen daily cases, all without lockdown although there were some restrictions in force.

      • I was not referring to the UK as a good example. Those times are long gone. More as an example where a quite strict lockdown only reduced infections slowly because the tracing was so poor.

        It would be valuable to have datasets on local policies and metrics on how well they work. That would help in studying how well measures work.

    • Greg Wellman

      To answer your question about the effectiveness of contact tracing in America – there are too many infected with too many contacts, and they’re not even cooperative.
      https://www.lawyersgunsmoneyblog.com/2020/07/meaningful-contract-tracing-in-the-united-states-is-hopeless

  2. The log-scale graph looks much like the one I did for Pennsylvania (using R package “Segmented”), having a break in slope March 25 and then beginning a downward trend April 4. Unfortunately for Pennsylvania, after about 70 days the trend turned upward again in mid-June.

  3. Happy Independence Day, everyone–whether it is a holiday for you, or not!

    Small joys, even in the midst of big troubles…

  4. Billy Pilgrim

    Tracking the number of new cases can be misleading because the number of tests performed has not been constant. Consider a population where 5% are infected with the virus. 1000 tests would result in about 50 new cases being reported, whereas 100 tests would result in only about 5 new cases being reported.

    Better to track the percentage of tests coming back positive:

    https://coronavirus.jhu.edu/testing/individual-states

    • No, that is not how it works. They are not testing random people.

      Plus there has not been a huge upsurge in testing the last weeks, but there has been in cases and infections.

      • Yep. And positivity rate has been on the upswing, from what I’ve seen–certainly that’s true here in SC.

        The linear trend over the last 30 days is up from ~12% to ~20%, as you can see in more detail here:

        https://www.scdhec.gov/infectious-diseases/viruses/coronavirus-disease-2019-covid-19/sc-testing-data-projections-covid-19

      • Billy Pilgrim

        Don’t get me wrong – the more tests the better, because among other things, those who are asymptomatic but test positive will know they need to self isolate.

        But if the goal is to learn how fast the virus is spreading, just conducting more tests is not particularly useful. As mentioned, more tests will lead to more cases being reported even when the number of infected is unchanged.

        “No, that is not how it works. They are not testing random people.”

        Ideally testing would be random, but that is a separate problem and unrelated to the one I’ve brought up.

        “Plus there has not been a huge upsurge in testing the last weeks….. “

        Please refer to the Johns Hopkins chart. Not a surge, but a steady increase.

        [Response: Testing in Florida has increased steadily since April. But cases have not. In fact they dropped in April and held steady in May (despite the rise in testing). Then they took off like a bat out of hell in June (despite testing rising no more rapidly than before).

        You really haven’t thought this through.]

  5. “Ideally testing would be random”

    Ah, no. When testing resources are limited, the best approach is to have a sampling design that maximizes the information you gather from each sample. Stratified samples, etc. allow you to use your existing knowledge to an advantage, instead of just taking a “shot in the dark/shotgun blast” approach to the problem.

    Assuming that the best sampling design is a purely random one is wrong scientifically, and wrong financially.

  6. Billy Pilgrim

    The Johns Hopkins link gives a 7-day moving average for the percentage of tests coming back positive. In early June, in Florida, this number was less than 5%. Today it’s around 18%, which explains why the increase in reported new cases there has greatly outpaced the increase in daily tests performed.

    [Response: You REALLY haven’t thought this through.]

    • Billy Pilgrim

      Daily new cases = number of new tests performed * the percentage of tests that come back positive.

      So, given that Florida is currently testing around 50,000 people per day (see chart), and around 18% of those tests are coming back positive, then we would expect about 9,000 new cases per day to be reported. This matches recent observations.

      Had the percentage of positives in Florida remained at less than 5% – early June levels – then 50,000 tests would yield less than 2,500 new cases. This does not match recent observations.

      Please let me know, specifically, what it is you disagree with?
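The arithmetic above is easy to check directly; a quick sketch using the approximate figures quoted in the comment:

```python
def expected_daily_cases(tests_per_day, positivity):
    """Expected reported new cases = tests performed x fraction positive."""
    return tests_per_day * positivity

# Approximate Florida figures quoted above:
expected_daily_cases(50_000, 0.18)  # ~9,000 cases/day at 18% positivity
expected_daily_cases(50_000, 0.05)  # ~2,500 cases/day at early-June positivity
```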

      • Billy Pilgrim

        Bob,
        Earlier I wrote, “But if the goal is to learn how fast the virus is spreading…..”

        That was the intended context for the quote you were critical of, “Ideally testing would be random”.

        What sort of sampling design would do a better job than a random sample for determining the percentage of a population infected with the virus?

      • OK. Billy, you had said:

        But if the goal is to learn how fast the virus is spreading, just conducting more tests is not particularly useful. As mentioned, more tests will lead to more cases being reported even when the number of infected is unchanged.

        Although it’s certainly true “more tests will lead to more cases being reported even when the number of infected is unchanged”, it doesn’t follow that your picture of the viral spread is completely independent of the testing you do. Obviously, the more you test, the more potentially accurate and detailed your picture of spread can be, because a) sample size matters, and b) capturing sub-populations can be incredibly important.

        But then you went on to say:

        The Johns Hopkins link gives a 7-day moving average for percentage of tests coming back positive. In early June, in Florida, this number was less than 5%. Today it’s around 18%, which explains why the increase in reported new cases there has greatly outpaced the increase in daily tests performed.

        Now, clearly this has nothing to do with your scenario in comment 1, because in that scenario, the positivity rate is assumed to be unchanged. That is, as you put it, “the number of infected is unchanged.” So, you’re changing the subject a bit, even if you aren’t aware of it.

        (I’d also note in passing that an increased positivity rate is a pretty trivial ‘explanation’ for the fact that “the increase in reported new cases there has greatly outpaced the increase in daily tests.” Hence my comment about going in circles.)

        Finally, you say:

        Daily new cases = number of new tests performed * the percentage of tests that come back positive.

        Hallelujah! Something we can all unequivocally agree on! But what I would vehemently *disagree* with is that this supports the contention that tracking case numbers is “misleading.” More tests are being given in large part because more patients are presenting with Covid-like symptoms. The causality here works both ways: more testing uncovers more cases, to be sure, but more cases drive more testing as well.

        No doubt, accounting for asymptomatic patients remains challenging. But confirmed cases is a valuable metric, to be considered in conjunction with other valuable information (such as the positivity rate).

    • And you’re going in circles.

    • Billy Pilgrim

      Bob,
      you may have taken what I wrote out of context.

      What sort of sampling design would do a better job than a random sample for determining the percentage of the population infected with the virus?

      • Billy Pilgrim

        Sorry for the redundancy. I did not realize that my previous reply to Bob had posted.

      • Bob Loblaw

        Billy: you ask about random vs. other sampling. Let’s start with a clear definition of random sampling: it’s when a subset of the population is sampled, and each individual has an equal probability of being selected for the sample – and each selection is independent.

        Let’s also stipulate that we can’t perform our total number of samples instantaneously – we can only test at a certain rate, and it will take time to get our sample size up to our target. In a random sample, that doesn’t matter, because as we take more tests, we ignore any previous results and keep to our prescribed random selection process. We can, however, use the results of previous tests to refine our sampling procedure – at which point it is no longer random.

        Now, let’s define the “population”. In a global pandemic, it seems reasonable to make this the global population. But we already know that different countries (or any geographical distribution you want to use) are affected at different rates. Would we really want to sample the researchers in Antarctica with the same probability as the people of New York City? Likely not, given that few people travel to Antarctica and there are currently no people there that are suffering from Covid-19. We’ll test there if someone becomes symptomatic, and we have reason to believe that it really has the potential to be Covid-19, but otherwise we’ll leave it out. So, we’ve stratified our sample – it’s no longer random.

        So, perhaps we are interested in only one country – our own. Although that is now a stratified sample, let’s redefine our “population” to be just our country. Mine happens to be Canada. Some parts of Canada are quite remote, so do we test those at the same probability as Toronto or Montreal? No – additional samples in remote areas with low risk and low incidence will not tell us much we don’t already know. Instead, we do more testing in areas where things are dynamic and rates are known to be higher. We prioritize, because there are areas where the extra information is more critical. We are no longer sampling randomly.

        We also know that the people “infected with the virus” is not a static value. Do we re-sample people that have been tested before? In a random sample, being tested yesterday does not affect the chance of being selected for a test today. Yes, a random sample will have some probability of re-testing the same individual, but we are better off if we reduce testing on people that test negative (unless they begin to exhibit symptoms), and we likely should quickly re-test people that tested positive, to see if the virus has cleared their system. That’s not random sampling. Again, we use the knowledge offered by the test results we have to focus our efforts.

        To stick blindly to a random sampling procedure is to ignore any information we have already collected. Learn from our results; apply that knowledge.

        Back to you. In particular, if you have a different definition of “random sampling” please elucidate.

  7. Yeah, holding NY up as an example doesn’t make sense. Stockholm has better numbers than NYC, and Stockholm did (almost) nothing. Cuomo’s response seems to have been worse than neutral, that is, it seems to have made things worse. Does anyone seriously think that any state in the US is going to catch up to NYC’s death total? Even on a per capita basis?

    [Response: You get a direct response.]

  8. Ignorant Guy

    Tamino, Doc Snow, Victor Venema and others.
    Now I am confused. You repeatedly tell Billy Pilgrim that he has got it all wrong. But as far as I can see he agrees with you on all points. The graphs he links to show exactly what you all say.

    [Response: He has repeatedly promoted the idea that increases in testing are responsible, in significant part, for increases in cases. He really hasn’t thought through the process of how many people get tested and why. That’s what we keep telling him he’s got wrong. I suppose somebody should post about this.]

    • Billy Pilgrim

      Doc
      “I’d also note in passing that an increased positivity rate is a pretty trivial ‘explanation’ for the fact that ‘the increase in reported new cases there has greatly outpaced the increase in daily tests.’ Hence my comment about going in circles.”

      ???
      It’s not a trivial explanation, it’s the only explanation. Had the positivity rate remained constant, the increase in new cases would have been directly proportional to the increase in testing.

      [Response: Your explanation is only true in the case of random testing. In the real world, it’s a lot more complicated.]

      • It’s not a trivial explanation, it’s the only explanation. Had the positivity rate remained constant, the increase in new cases would have been directly proportional to the increase in testing.

        A good explanation usually provides some account of causality. In the present instance, does the positivity rate trend ’cause’ (in part or in whole) the trend in new cases, or vice-versa? I would say ‘no’, it’s rather that they co-vary as linked parts of a larger whole. The real “explanation” appears to me to be the clear fact that the populace was getting sicker on average. That fact drives observed increases in positivity rate AND observed increases in diagnosed cases.

        (Viewed conversely, the consilience between those two parameters gives us increased confidence that the epidemic really is growing. Which is why I take issue with your statement that observed increases in case numbers are “misleading,” or just due to testing.)

  9. Billy Pilgrim,
    Part of your problem is that simply saying “random testing” doesn’t really go very far in specifying the testing protocol you are advocating. There are many ways to test randomly. Testing every third person who walks into the clinic is random in a sense, but the population you are testing may not be the same as the population at large.
    Even if you went out and offered tests at random, your sample could be biased by a broad range of factors: your locality, which people decide to be tested, even time of day. And if you institute a protocol to try to make it more random, that protocol can, in itself, create biases.
    The very fact that you are testing self-aware, “intelligent” beings can introduce biases, and it can be difficult to anticipate exactly how those biases will affect results. For instance, are people who decide to be tested more likely to be infected because they know they are engaging in risky behavior, or are they less likely because the decision to test in itself indicates concern, which might also be reflected in mitigations they take to lower risk of exposure?

    So: it ain’t simple.

  10. Billy Pilgrim

    Bob Loblaw,
    Thanks for the interesting comment!

    I see two competing ideas at play here.
    Suppose there’s a country made up of two islands, each with 10 million people, and we want to know what percent of the total population are infected with COVID.

    Idea 1) Even if we knew that island A had a worse problem than island B, we would still need to do random testing across both islands. Testing just one would lead to a biased total.

    Idea 2) Island B is known to have crushed the virus, and thought to have virtually no positives at all. Now we have a solid value to work with! 0%

    So, for example, if 6% of residents on island A are found through testing to be infected, and we already know that nobody on island B has it –
    then a simple average of the two rates, 0% and 6%, gives a value of 3% for the whole population (a simple average works here because the two islands have equal populations).
    No bias even though just one of the islands was tested.

    ———
    The latter is the sort of situation I hadn’t considered prior to reading your post. I’m sure there are many more.
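The two-island estimate generalizes to a population-weighted (stratified) average; the simple average above works only because the islands are equal in size. A minimal sketch:

```python
def stratified_prevalence(rates, populations):
    """Population-weighted average of per-stratum infection rates.
    Reduces to a simple average when all strata have equal populations."""
    total = sum(populations)
    return sum(r * p for r, p in zip(rates, populations)) / total

# Two islands of 10 million each, 6% and 0% infected:
stratified_prevalence([0.06, 0.0], [10_000_000, 10_000_000])  # 0.03
# With unequal islands, the weighting matters:
stratified_prevalence([0.06, 0.0], [15_000_000, 5_000_000])   # 0.045
```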

  11. Billy Pilgrim

    Doc,
    “Which is why I take issue with your statement that observed increases in case numbers are “misleading,” or just due to testing.”

    Oh no, I’m worried you’ve mistaken me for a Trump supporter. The horror!!!
    ——-

    Me: “Tracking the number of new cases can be misleading because the number of tests performed has not been constant.”

    This statement was intended to be a general criticism – one that goes both ways. Fewer tests, but the same number of new cases could lead people to underestimate the spread of a virus. Or yeah, an increase in new cases could just be an artifact of more testing. To tell the difference you need to look at BOTH metrics, tests and new cases. That’s what a positivity rate does.

  12. Billy:

    I still don’t think you’re understanding the aspect of non-random sampling design. In your two-island scenario, there is no reason to think that you have to do equal numbers of samples on both islands. In a purely-random sample design, you would expect that, because each island has the same number of people, each person has an equal probability of being selected for testing.

    But when you repeatedly find no positives on island B, you can reduce the amount of testing (without eliminating it) while focusing on increased sampling on island A. Same number of total tests, but more on island A where you can get more useful information.

    As soon as you shift from equal sampling to A>B, you have shifted away from a completely random sample. You have stratified your sampling design. You can still have a random component to your design within each island, but your overall sample design is no longer purely random. It emphasizes certain aspects of the situation based on knowledge you have from initial testing.

    I think it would help you to sit down and write out exactly what you think “random sample” really means – regardless of whether you decide to share it or not.

  13. Billy Pilgrim

    Bob
    “In your two-island scenario, there is no reason to think that you have to do equal numbers of samples on both islands.”

    No, I wasn’t trying to suggest that. My thought is that the population of each island could be sampled separately, and the two results averaged, or you could take a sample of the total population (20 million). Which option is better, or does it matter?

    “But when you repeatedly find no positives on island B, you can reduce the amount of testing (without eliminating it) while focusing on increased sampling on island A. Same number of total tests, but more on island A where you can get more useful information.”

    You’ve sort of lost me here. You still need to determine the percentage of positives on each island, right? And, again, use those values to find an average?

    Maybe the island with fewer positive cases requires fewer samples to achieve the same level of accuracy as the island where the virus is more widespread? Is this an idea I’m missing?

    “I think it would help you to sit down and write out exactly what you think “random sample” really means – regardless of whether you decide to share it or not.”

    A random sample is where each member of the population has an equal chance of being selected to the sample group (a subset of the larger population). There should be no bias in the selection process.

    An example off the top of my head – you could select the individuals for the sample by using a program that generates random social security numbers.

    • Bob Loblaw

      Billy: we agree on what is random.

      Maybe the island with fewer positive cases requires fewer samples to achieve the same level of accuracy as the island where the virus is more widespread? Is this an idea I’m missing?

      Yes. The number of samples required to achieve a statistically-significant result depends on two things: the variability in the population, and the size of the effect that you are testing for. This can be approached by doing a power test:
      https://en.wikipedia.org/wiki/Power_of_a_test

      The catch is, to do a good power test, you need to know a bit about the variability and the size of the effect. You can just guess, or you can use early data to estimate them and then revise your sampling to account for it. The key thing in good sample design is to use available information and tailor the sampling – which means moving away from a pure random sample.

      If the two islands show different variability (e.g. over time), and different amounts of infection, then sampling frequency does not need to be the same. It gets a bit tricky, because it also depends on exactly what information you want from the sample.
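A rough sketch of why the lower-prevalence island needs fewer samples: under a normal approximation, the sample size needed to pin down a prevalence p within a given margin scales with p(1−p). The function and figures below are illustrative, not from any official testing protocol:

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(p_guess, margin, confidence=0.95):
    """Normal-approximation sample size to estimate a prevalence near
    p_guess to within +/- margin at the given confidence level."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z ** 2 * p_guess * (1 - p_guess) / margin ** 2)

# Lower expected prevalence -> smaller p*(1-p) -> fewer samples needed
# for the same absolute margin:
n_low = sample_size_for_proportion(0.01, 0.005)   # low-prevalence island
n_high = sample_size_for_proportion(0.10, 0.005)  # high-prevalence island
```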

      The biggest mistake is to think you have a random sample and analyze it like a random sample, without realizing that the selection method is not actually random. Some examples of “not random” are:

      – “I randomly picked phone numbers from the phone listings.” (Not everyone has a phone, and the listing may be land lines only – not cell phones).

      – “I posted on a blog and invited comments”. (Not everyone reads your blog, and not everyone is motivated to comment.)

      – “I randomly selected from people that said they weren’t feeling well”. (Biased towards people that volunteered that they weren’t feeling well, and “not feeling well” is a subjective judgement.)

      In short – sample design is generally not simple, and there are a lot more ways of doing it wrong than there are of doing it right. And doing it right depends on the goal – there is not a universal solution (e.g. “random”).

  14. Billy Pilgrim

    Snarkrates

    I think you’ve brought up some of the issues that epidemiologists have been struggling with. Hard to know for sure how many people actually have, or have had, the virus.

  15. Billy Pilgrim

    Bob
    Thanks for your patience and the challenging link. Quoting Ignorant Guy,
    “As always it’s not so simple but irritatingly complicated.”