An interesting question that can be explored with the thousands of complete genomes sampled in the USA, Europe, Asia, and elsewhere is “How many introductions were there into the USA, or into Italy, and so on?” I have seen some blog posts and similar commentary suggesting that just 3 to 10 viruses “seeded the USA epidemic,” and that the virus evidently went to Washington State and New York City and from there to New Orleans and other places. However, my estimate from studying subsets of the data presented at Nextstrain.org or at the Broad Institute is that there must have been many dozens of introductions of the virus into the USA, and at least a dozen from the USA back to Europe and Asia, before travel was severely restricted. Likewise, there was more than one introduction into New Orleans, and we cannot be certain that any one of them came directly from New York City, although it is highly likely that at least one did.
So the point should not be to say that we have enough early isolate sequences to reconstruct the exact history, but only that we can be certain that the number of infected people traveling was between N1 and N2 at time x. Even that may get pretty fuzzy, because one infected person on an airplane may have infected 2 to 20 others during the flight, for example.
My opinion is that many of the papers doing mathematical modeling or reconstructions of this tend to overstate their confidence, reporting intervals that are too narrow, because the models assume things like random sampling and a well-mixed population. I am sure much of our earliest sampling was very nonrandom: testing was in short supply, so it had to be focused on symptomatic cases and the people close to them.
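
To make the sampling point concrete, here is a toy simulation of my own. It is only a sketch, and every number in it is invented for illustration rather than taken from real sequence data. It assumes 40 hypothetical introductions that each seeded an equally large outbreak, then compares how many of them show up among 100 sequenced cases when sampling is uniform versus when testing is concentrated on 5 of the outbreaks. Sampling is done with replacement for simplicity.

```python
import random

# All values below are hypothetical, chosen only to illustrate the argument.
N_INTROS = 40      # assumed true number of introductions
N_SAMPLES = 100    # assumed number of sequenced cases

# Relative chance that a sequenced case descends from each introduction.
uniform = [1.0] * N_INTROS                      # idealized random sampling
focused = [20.0] * 5 + [1.0] * (N_INTROS - 5)   # testing concentrated on 5 outbreaks


def observed_introductions(n_intros, n_samples, weights, rng):
    """Count the distinct introductions represented among the sampled cases."""
    # Each draw records which introduction a sequenced case descends from.
    sampled = rng.choices(range(n_intros), weights=weights, k=n_samples)
    return len(set(sampled))


def mean_observed(n_intros, n_samples, weights, trials=500, seed=0):
    """Average the observed count over repeated simulated sampling rounds."""
    rng = random.Random(seed)
    total = sum(
        observed_introductions(n_intros, n_samples, weights, rng)
        for _ in range(trials)
    )
    return total / trials


if __name__ == "__main__":
    print("true introductions:       ", N_INTROS)
    print("seen with uniform sampling:", mean_observed(N_INTROS, N_SAMPLES, uniform))
    print("seen with focused sampling:", mean_observed(N_INTROS, N_SAMPLES, focused))
```

In this toy setup the number of introductions seen in the sample can never exceed the true number, and focused sampling sees fewer of them on average than uniform sampling does. A model that treats the focused sample as if it were random would therefore tend to undercount introductions while still reporting a tight interval.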