Better HN

0 points | astronautas | 2y ago | 0 comments

Wow, make it open source quickly!!! :hype: Our service is a classic Python REST API for model serving, but we have very tight latency constraints. Rewriting it in a higher-performance backend language such as Go or Rust would substantially reduce resource usage by cutting how much we need to scale horizontally. Pre-baked model serving frameworks such as Nvidia's Triton aren't an option, since we have to query a feature store and do some input feature tracking in between. Go seemed like an efficient, developer-friendly choice, but to this day there aren't any well-maintained model inference libraries in Go...


huac | 2y ago

We used Triton Inference Server (with a Golang sidecar to translate requests) for model serving, plus a separate Go app that received each request, fetched features, sent them to Triton, did other stuff with the response, and served the result. This scaled to 100k QPS with pretty good performance, but it does require some extra hops.

In general, writing pure Go inference libraries sucks. It's not easy to do array/vector manipulation, not easy to do SIMD/CUDA acceleration, cgo is not Go, etc. I wrote a fast XGBoost library at least (https://github.com/stillmatic/arboreal). It's on par with C implementations, but doing anything more complex is going to be tricky.

astronautas (OP) | 2y ago

Cool, thanks for sharing!

ramoz | 2y ago

I’ve also run models in Go, including transformers, even T5. There wasn’t that much overhead, maybe some annoying compilation stuff, but nothing crazy.

This was TensorFlow, btw, which has Go bindings.

It is a smart and worthwhile move; we also needed to drop Python for performance/cost gains.

astronautas (OP) | 2y ago

Oh, awesome! Is it this one? https://github.com/galeone/tfgo. It has quite a lot of stars.

ramoz | 2y ago

I think it was just the native bindings, https://pkg.go.dev/github.com/tensorflow/tensorflow/tensorfl... but tfgo looks interesting.

Actually, the docs around this weren’t great. I took the train-in-Python, inference-in-Go approach, and only for versions greater than TF2.

