Great comments. I think it’s worth noting that people underestimate how feasible it is to run multiple tests simultaneously, even when the tests interact. On this point:
Secondly, running multiple experiments across a user’s journey will lead to uncertainty about test results. Suppose you have an experiment on a product page, and one on search. If the experiment on search has a big impact on the type of traffic you send to the product page, the results from the product experiment will be skewed.
It can be perfectly fine if the experiment on search changes the type of traffic you send to the product page. As long as users are randomized into each test independently, you still have a 2 x 2 matrix: you can compare A vs. B on one axis and C vs. D on the other, and you can also measure how the two tests interact.
Running the tests sequentially, for instance, would actually decrease the reliability of the results. The changes still accumulate over time, so the interactions still exist; we would simply have failed to gather any data on them.
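To make the 2 x 2 idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the cell counts are invented, A/B are taken to be the search variants and C/D the product page variants, and the function name `factorial_effects` is hypothetical. It only computes point estimates; a real analysis would also need confidence intervals or a significance test.

```python
# Hypothetical 2 x 2 data: (search variant, product variant) -> (visitors, conversions).
# These numbers are invented for illustration only.
cells = {
    ("A", "C"): (1000, 100),
    ("A", "D"): (1000, 120),
    ("B", "C"): (1000, 110),
    ("B", "D"): (1000, 150),
}


def factorial_effects(cells):
    """Return main effects and the interaction for a 2 x 2 test on conversion rate."""
    # Conversion rate in each of the four cells.
    r = {key: conversions / visitors for key, (visitors, conversions) in cells.items()}

    # Main effect of search: B vs. A, averaged over both product variants.
    search_effect = ((r[("B", "C")] + r[("B", "D")]) - (r[("A", "C")] + r[("A", "D")])) / 2

    # Main effect of the product page: D vs. C, averaged over both search variants.
    product_effect = ((r[("A", "D")] + r[("B", "D")]) - (r[("A", "C")] + r[("B", "C")])) / 2

    # Interaction: how much the D-vs-C lift changes when search is B instead of A.
    interaction = (r[("B", "D")] - r[("B", "C")]) - (r[("A", "D")] - r[("A", "C")])

    return {
        "search_effect": search_effect,
        "product_effect": product_effect,
        "interaction": interaction,
    }


effects = factorial_effects(cells)
for name, value in effects.items():
    print(f"{name}: {value:+.3f}")
```

An interaction near zero suggests the two tests can be read independently. A clearly nonzero one is exactly the information that running the tests sequentially would never have collected.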