> because it is a mixture of experts model
Do you have a source for this? I also considered that possibility, but I never saw any evidence that this is how GPT-4 is implemented.
I've always wondered how a system of multiple specialized small LLMs (with a "router" LLM in front of them all) would fare against GPT-4. Do you know if anyone is working on such a project?
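
For concreteness, here is a minimal sketch of the kind of setup I mean. The model names and the `generate()` helper are hypothetical placeholders, not any real API; you'd swap in whatever inference backend you actually use:

```python
# Sketch of a "router LLM in front of specialist LLMs" setup.
# Model names and generate() are placeholders, not a real API.

SPECIALISTS = {
    "code": "small-code-model",        # hypothetical specialist checkpoints
    "math": "small-math-model",
    "general": "small-general-model",
}

ROUTER_PROMPT = (
    "Classify the user request into exactly one category: "
    "code, math, or general. Reply with only the category name.\n\n"
    "Request: {query}\nCategory:"
)


def generate(model: str, prompt: str) -> str:
    """Placeholder for a real inference call; replace with your own backend."""
    return f"[{model}] response to: {prompt[:40]}..."


def answer(query: str) -> str:
    # Step 1: ask a small router model which specialist should handle this.
    category = generate("small-router-model", ROUTER_PROMPT.format(query=query)).strip().lower()
    if category not in SPECIALISTS:
        category = "general"           # fall back if the router misbehaves
    # Step 2: forward the original query to the chosen specialist.
    return generate(SPECIALISTS[category], query)


if __name__ == "__main__":
    print(answer("Write a quicksort in Rust."))
```

The open question, of course, is whether a fleet of specialists plus a cheap router can match a single large model that handles the routing implicitly inside its own weights.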