There are a lot of issues and variables at play; this isn't a case of "it's always DNS". What tools do ISPs even have at their disposal, how accurate are they, and do they uncover the actual problem users are experiencing? This is the real issue that ISPs of all sizes have to deal with.
The worst that I have ever seen was about 30 seconds when visiting a foreign country where bufferbloat was occurring at peering links. The bufferbloat in peering links is likely visible from western countries if you ping residential IPs in developing countries and monitor the ping times over days. Some parts of the day will have very high ping times while others will not. The high ping times will be the buffer bloat.
In most western countries, the bufferbloat typically occurs at people’s home internet connections. As is the case in all cases of buffer bloat, the solution is to be willing to drop packets when the connection is saturated. If you limit the bandwidth just below what the connection can handle, you can do active queue management to solve the problem.
That said, I suggest you stop posting replies. Your crusade against the idea of buffer bloat makes you look bad to anyone with enough networking knowledge to understand what bufferbloat is. I also strongly suspect I wrote an explanation that you will take zero time to understand and rather than take my advice, you will post another reply to continue your crusade. :/
It is not yet a "solved" problem, but the last 10-15 years of work have started to make a dent and produced better tools to both observe and act on the problem.
This is seen everywhere, from the inclusion of CAKE ( https://man7.org/linux/man-pages/man8/tc-cake.8.html ) in some CPE / home routers to the use of fq_codel ( https://man7.org/linux/man-pages/man8/tc-fq_codel.8.html ) in routers along the way.
Other ISPs have to go even farther, because "content" might be 80-120ms away, and the ability to be more aggressive or less aggressive in tuning certain parameters can have a large impact on overall customer Quality of Experience. If there are any LEO hops along the way, problems with TCP and delayed signaling as a byproduct can also make throughput tank while latency spikes.
DPDK and VPP have contributed to a lot of new networking devices to help observe and act on traffic.
Every time you go from a big pipe to a small pipe (higher data rate to lower data rate), you will see this issue at varying levels.
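To put rough numbers on the big pipe to small pipe point, here is a minimal sketch of the arithmetic. The 1 MiB buffer and the two link rates are made-up illustration values, not measurements from any real device.

```python
def queue_delay_ms(buffer_bytes: int, link_rate_bps: int) -> float:
    """Time to drain a completely full buffer at the given link rate, in ms."""
    return buffer_bytes * 8 * 1000 / link_rate_bps


# Hypothetical example: the same 1 MiB buffer in front of two different links.
BUFFER_BYTES = 1_048_576

fast_ms = queue_delay_ms(BUFFER_BYTES, 1_000_000_000)  # 1 Gbit/s link
slow_ms = queue_delay_ms(BUFFER_BYTES, 20_000_000)     # 20 Mbit/s link

print(f"1 Gbit/s:  {fast_ms:.1f} ms of queueing when full")
print(f"20 Mbit/s: {slow_ms:.1f} ms of queueing when full")
```

The buffer is the same size in both cases; it only turns into hundreds of milliseconds of delay once it sits in front of the slower link.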
The worst that I have ever seen was about 30 seconds when visiting a foreign country where bufferbloat was occurring at peering links. The bufferbloat in peering links is likely visible from western countries if you ping residential IPs in developing countries and monitor the ping times over days. Some parts of the day will have very high ping times while others will not. The high ping times will be the buffer bloat.
Out of curiosity, did you have full observability of these peering links, or is this a hypothesis? I could think of a few scenarios where alternative explanations could explain what you're seeing.
In most western countries, the bufferbloat typically occurs at people’s home internet connections.
Says who? How is this measured? Do we have actual numbers on people experiencing real bufferbloat issues that are affecting their service?
That said, I suggest you stop posting replies. Your crusade against the idea of buffer bloat makes you look bad to anyone with enough networking knowledge to understand what bufferbloat is. I also strongly suspect I wrote an explanation that you will take zero time to understand and rather than take my advice, you will post another reply out of ignorance. :/
Look, I will cordially suggest a more tenable approach: consider disengaging from this thread, your vacuous and vapid post hasn't really brought anything to the table.
Edit: Seems I can't reply to the child comment, so I'll just say, you should've used your own advice and not reply. There's nothing of substance and you're still continuing with your daft misinterpretation of my take. I'll leave it at that.
The actual problem is I'm on a voip call and someone starts a big download (steam) and latency and jitter go to hell and the call is unusable. Bufferbloat test confirms that latency dramatically increases under load. Or same call but someone starts uploading something big.
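For anyone who wants to quantify that, a minimal sketch of what a latency-under-load comparison boils down to. It assumes you already have two lists of RTT samples (one taken while the line is idle, one while the download is running); the sample values and the function name are made up for illustration.

```python
from statistics import median


def latency_under_load(idle_rtts_ms, loaded_rtts_ms):
    """Compare RTT samples taken on an idle line vs. a saturated one."""
    idle = median(idle_rtts_ms)
    loaded = median(loaded_rtts_ms)
    # Jitter here is the mean absolute difference between consecutive samples.
    diffs = [abs(b - a) for a, b in zip(loaded_rtts_ms, loaded_rtts_ms[1:])]
    jitter = sum(diffs) / len(diffs) if diffs else 0.0
    return {
        "idle_ms": idle,
        "loaded_ms": loaded,
        "added_ms": loaded - idle,
        "jitter_ms": jitter,
    }


# Hypothetical samples: pings before and during a big download.
result = latency_under_load(
    idle_rtts_ms=[20, 22, 21, 20, 23],
    loaded_rtts_ms=[180, 260, 210, 400, 320],
)
print(result)
```

The "added_ms" figure is the number that matters for the call: it is the delay contributed by queues filling up, not by the path itself.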
If troublesome buffers are at the last mile connection and the ISP provides a modem/router, adding QoS limiting downloads and uploads to about 90% of the achieved physical connection will avoid the issue. The buffers are still too big, but they won't fill under normal conditions, so it's not a problem. You could still fill the buffers if there's a big flow that doesn't use effective congestion control, or a large enough number of flows that the minimum send rate is still too much, or when the physical connection rate changes, but it's good enough. Many ISPs do this, and so you hear a lot less complaining about bufferbloat on say Comcast these days; also, this is an effective best practice, so less need for papers, reports and case studies... it's a matter of getting the practices in the wild and maybe figuring out how to do it better for wireless systems with rapidly changing rates.
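A minimal sketch of the "about 90%" rule, assuming a Linux-based router with the CAKE qdisc available. The interface name `eth0` and the measured rate are placeholders, and the code only builds the command string; it does not run anything.

```python
def shaped_rate_kbit(measured_kbit: int, percent: int = 90) -> int:
    """Shaper rate set a bit below the measured link rate (integer math)."""
    return measured_kbit * percent // 100


def cake_command(interface: str, measured_kbit: int, percent: int = 90) -> str:
    """Build a tc command that shapes egress on `interface` with CAKE."""
    rate = shaped_rate_kbit(measured_kbit, percent)
    return f"tc qdisc replace dev {interface} root cake bandwidth {rate}kbit"


# Hypothetical uplink that measures 40 Mbit/s in a speed test.
cmd = cake_command("eth0", 40_000)
print(cmd)
```

This covers the upload direction only, since a root qdisc shapes egress. Shaping downloads on the same box usually means redirecting ingress traffic through an IFB device first, which is left out here.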
Otherwise, ISP visibility can be limited. Not all equipment will report on buffer use, and even if it does, it may not report on a per port basis, and even then, the timing of measurement might miss things. What you're looking for is a 'standing buffer' where a port always has at least N packets waiting and the buffer does not drain for a meaningful amount of time. Ideally, you'd actually measure the buffer length in milliseconds, rather than packets, but that's asking a lot of the equipment.
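As a rough sketch of what "standing buffer" detection could look like if the equipment does export per-port queue depth: the sample format, thresholds, and function names below are all assumptions for illustration, not any vendor's telemetry API.

```python
def has_standing_queue(samples, min_depth_packets, min_duration_s):
    """Return True if queue depth stays >= min_depth_packets for at least
    min_duration_s. `samples` is a time-sorted list of (timestamp_s, depth)."""
    run_start = None
    for ts, depth in samples:
        if depth >= min_depth_packets:
            if run_start is None:
                run_start = ts          # a new run above the threshold begins
            if ts - run_start >= min_duration_s:
                return True
        else:
            run_start = None            # queue drained, so the run is over
    return False


def depth_to_ms(depth_packets, avg_packet_bytes, port_rate_bps):
    """Convert a queue depth in packets to approximate queueing delay in ms."""
    return depth_packets * avg_packet_bytes * 8 * 1000 / port_rate_bps


# Hypothetical one-second samples from two ports.
bursty = [(0, 0), (1, 200), (2, 0), (3, 150), (4, 0)]
standing = [(0, 120), (1, 90), (2, 110), (3, 95), (4, 100)]

print(has_standing_queue(bursty, 50, 3))    # spikes that drain in between
print(has_standing_queue(standing, 50, 3))  # never drains below 50 packets
print(depth_to_ms(100, 1500, 100_000_000))  # 100 full-size packets at 100 Mbit/s
```

Note the caveat from above still applies: with coarse sampling, a queue can drain and refill between two samples and look like it never emptied, or a short standing period can be missed entirely.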
There's a balance to be struck as well. Smaller buffers mean packet drops, which is appropriate when dealing with standing buffers; but buffers that are too small lead to problems if your flows are prone to 'micro bursts': lots of packets at once, potentially on many flows, and then calm for a while. It's better to have room to buffer those.
Something I have always done is provision to account for packet overhead, so you might see speeds 2-3% higher than your plan limit in a speed test. Psychologically the customer is getting more than they paid for, and most seem to be very happy about that.
But rate limits were already in place long before anything about queue depth was even discussed, so that was nothing new. CAKE, OTOH, has had a very noticeable impact on the customer experience: their kid's Xbox can download that 250G update without impacting the voip call or wifi offloading that another member of the household is on. Alternatively, that same gamer can play while Mom is downloading something near max throughput without having latency spikes and packet loss.
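One way that 2-3% can fall out of the math, as a sketch only: the comment above doesn't say which overhead is being accounted for, so this assumes Ethernet framing on full-size packets (38 bytes per 1500-byte packet for preamble, header, FCS and inter-frame gap). The function name and plan rate are made up.

```python
def provisioned_rate_mbps(plan_mbps: float,
                          payload_bytes: int = 1500,
                          overhead_bytes: int = 38) -> float:
    """Rate to provision so the customer sees `plan_mbps` of payload.

    Assumption: per-packet overhead is Ethernet framing on full-size packets.
    """
    return plan_mbps * (payload_bytes + overhead_bytes) / payload_bytes


# Hypothetical 100 Mbit/s plan.
rate = provisioned_rate_mbps(100)
print(f"provision {rate:.2f} Mbit/s, about {rate - 100:.1f}% above the plan")
```

Smaller packets carry proportionally more overhead, so a real deployment would likely pick the margin from observed traffic rather than from the full-size case alone.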
Yes, you're on to something about the customer experience in general that I'm tracking down myself. Orb is also trying to get a look, but I'm not a fan so far of that tool/platform https://orb.net/