Clients usually do this by fetching all mentions of the user posted after the original tweet's timestamp, and keeping only those that are replies to that tweet. Since there's a limit on how many mentions you can retrieve, you might not get all the replies.
It's been like that at least since 2013.
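As a rough sketch of the filtering step, assuming the mentions have already been fetched and parsed into dicts: the `in_reply_to_status_id` field name follows the tweet objects returned by Twitter's v1.1 API, while `collect_replies` and the sample data are made up for illustration.

```python
from typing import Dict, Iterable, List


def collect_replies(mentions: Iterable[Dict], original_id: int) -> List[Dict]:
    """Keep only the mentions that are direct replies to original_id."""
    # Assumption: each mention carries "in_reply_to_status_id", which is
    # None (or missing) when the mention is not a reply to any tweet.
    return [m for m in mentions if m.get("in_reply_to_status_id") == original_id]


# Hypothetical data standing in for one fetched batch of mentions.
mentions = [
    {"id": 101, "in_reply_to_status_id": 100, "text": "@alice agreed"},
    {"id": 102, "in_reply_to_status_id": None, "text": "@alice hello"},
    {"id": 103, "in_reply_to_status_id": 99, "text": "@alice re: older tweet"},
    {"id": 104, "in_reply_to_status_id": 100, "text": "@alice me too"},
]

replies = collect_replies(mentions, original_id=100)
print([r["id"] for r in replies])  # [101, 104]
```

Anything that fell outside the fetched batch of mentions never reaches this filter, which is why the result can be incomplete.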