There is no easy way to open a window and render a string. Period! You either need to write it all yourself (as OP stated) or use Gtk/Qt or some other heavyweight "client library" (cairo and FreeType do not create Wayland windows and are therefore not applicable here).
Look at the code again! If you really think that the code at [1] is in any way a great solution compared to [2], we are going to disagree.
> No, in both modern X11 or Wayland, you should use the same API for screenshots: the XDG screenshot portal.
ZERO screenshot apps on X11 use the XDG screenshot portal; they all use XGetImage(). Mainly because the assumption that a dbus-daemon is always running everywhere is mostly false. Also, the XDG screenshot portal is simply not a good solution. It is cumbersome to use, contains tons of edge cases, and pulls in a D-Bus dependency for something that could be solved much more simply with onboard OS functionality, without the need for extra daemons and weird binary protocols.
> Client-side rendering is also the norm in X11, since decades ago when Xft was released
Beside the point, but you are still wrong. Xft does server-side rendering via XRender. The glyph cache is rendered only once on the client, but that's a technicality: spline tessellation was supposed to go into the server, but Keith Packard had more important things to do at the time.