I put this earlier in the phrase "reflection completeness": https://sdrinf.com/reflection-completeness i.e., there are things that stop working once people know about them.
In particular with A/B testing, this means the initial A/B test intermingles at least three effects. It measures how the naive population's behavior changes as a function of new functionality being made available, and that measurement is heavily time-dependent: there's a "novelty effect" (early data will not be representative of long-term usage patterns), and there's a "reflection effect" (once the outcome of the test is widely known, people can change their behavior based on that knowledge). Controlling for the first is difficult, but possible; controlling for the second, beyond just "keeping everything secret", is significantly more so, as the timelines involved might be years in length.
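One crude way to see the novelty effect in the data itself is to split the test's readout into an early and a late window and compare the lifts. A minimal sketch, with entirely invented numbers (the function and figures below are hypothetical, not from any real test):

```python
# Hypothetical sketch: detecting a novelty effect by splitting an A/B
# test's results into early vs. late windows. All counts are invented.

def conversion_rate(conversions, visitors):
    """Fraction of visitors who converted."""
    return conversions / visitors

# Week 1 (novelty period): variant B looks like a big win.
early_a = conversion_rate(500, 10_000)   # 5.0%
early_b = conversion_rate(700, 10_000)   # 7.0%

# Weeks 5-6 (novelty worn off): the lift has largely evaporated.
late_a = conversion_rate(510, 10_000)    # 5.1%
late_b = conversion_rate(530, 10_000)    # 5.3%

early_lift = early_b - early_a
late_lift = late_b - late_a

# A naive two-week readout would report only the early lift; comparing
# windows shows most of it was novelty, not durable preference.
print(f"early lift: {early_lift:.3f}, late lift: {late_lift:.3f}")
```

This only addresses the novelty effect, of course; the reflection effect can't be seen in the test window at all, since by definition it kicks in after the results circulate.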
I strongly suspect GP was pointing at this timeline factor, and specifically that market engineering, as currently and widely practiced, is grounded in the immediately available signal of "does it increase sales over 2 weeks of an A/B test running". Which, given novelty effects, is heavily biased towards "yes"; and these people aren't incentivized (nor do they have the time/energy) to measure _very_ long-term effects beyond the novelty and reflection periods.
An A/B test just refers to observing how a dependent variable changes when an independent variable is in two different states, State A and State B.
Drug vs. placebo is an A/B test.
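For what it's worth, the drug-vs-placebo case is usually evaluated exactly like a web A/B test: compare the success proportion in each arm and ask whether the difference could plausibly be chance. A minimal stdlib-only sketch of a two-proportion z-test, with invented counts:

```python
# Sketch of evaluating an A/B test (e.g. drug vs. placebo) with a
# two-proportion z-test. The counts below are made up for illustration.
from math import sqrt, erf

def two_proportion_z(success_a, n_a, success_b, n_b):
    """Z statistic for the difference between two proportions,
    using the pooled standard error."""
    p_a, p_b = success_a / n_a, success_b / n_b
    p_pool = (success_a + success_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

def two_sided_p(z):
    """Two-sided p-value from the standard normal CDF."""
    return 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))

# Arm A (placebo): 30/200 improved; arm B (drug): 55/200 improved.
z = two_proportion_z(30, 200, 55, 200)
p = two_sided_p(z)
print(f"z = {z:.2f}, p = {p:.4f}")
```

Note this tells you nothing about the novelty or reflection effects discussed above; it only answers "is the difference in this window real", not "will it persist".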
To borrow the Lindy effect: whether someone likes the jacket in color A or B is of such short-lived value that it’s a huge waste of the resources that went into the pipeline needed to reach the conclusion.
Here’s an A/B test: rethink logistics to increase customization of outputs, or continue to create design jobs that define what’s trendy and acceptable?
In the context of what we're talking about, you can A/B test more than marketing; you can test variables like UI/UX.
Yes, clothes fall in and out of fashion, but the placement, color, or size of the "add to cart" button isn't something that's going to change frequently.
Another example might be adding a "trending" tab to the top navigation of a page, or testing whether "what's trending" or "what you like" drives more engagement as the default page.
YouTube recently tested randomly lowering people's video resolution to see who changed it back, to gauge how much resolution matters to their customers.
I wish they'd start gauging how frustrating such tests are, particularly for the test group. I've been cursing at YouTube many times over the past few weeks because of this very issue - and now I learn it's not even a bug, but an A/B test.
I disagree that I am “caught up” on anything.
I have a preference that’s been refined over time. Not a psychological error in perception.