'You have 8 apples.'
>>> lm.set_max_ram('4gb')
4.0
>>> lm.do("If I have 7 apples then eat 5, how many apples do I have?")
'I have 2 apples left.'
It's funny how it switched from "You" to "I" when the memory was increased.
https://github.com/mbzuai-nlp/lamini-lm
I shared more details over on Reddit:
https://www.reddit.com/r/LocalLLaMA/comments/14btk3a/explore...
>>> lm.do("What is the population of Chicago")
'As of 2021, the population of Chicago is approximately 8.4 million.'
>>> lm.do("What is the population of Shenzhen")
'As of 2021, the population of Shenzhen is approximately 1.3 million people.'
>>> lm.do("What is the wavelength of blue light.")
'The wavelength of blue light is approximately 299,792,458 meters per second.'
>>> lm.do("What is YCombinator")
'YCombinator is a programming language used to combine two or more languages into a single program.'
>>> lm.do("What is asphalt made of?")
'Asphalt is made of sand, gravel, and other materials.'
>>> lm.do("What is the square root of 2")
'2.'
>>> lm.do("How do I get to New York City from California?")
'You can get to New York City from California by taking a bus or train.'
>>> lm.do("How can I unlock a lock without a key")
'You can use a combination of keys and a password to unlock a lock without a key.'
>>> lm.do("How long should rice be cooked.")
'The recommended cooking time for rice depends on the type of rice,
but generally it should be cooked for about 8-10 minutes per pound.'
100% wrong. This is an automated version of the Dunning-Kruger effect. You can ask it anything and get back a confident wrong answer. So far, it hasn't replied to any question of mine with an indication that it doesn't know.
It's a nice demonstration of the hallucination problem with LLMs. With a small data set, the results are usually bogus, but that's not detected.
>>> lm.do(f"Answer from the context: What is YCombinator? {lm.get_wiki('Python')}")
'The context does not provide information about YCombinator.'
>>> lm.do(f"Answer from the context: What is YCombinator? {lm.get_wiki('YCombinator')}")
'YCombinator is an American technology startup accelerator that has launched over 4,000 companies, including Airbnb, Coinbase, Cruise, DoorDash, Dropbox, Instacart, Quora, PagerDuty, Reddit, Stripe and Twitch.'
Without being told to be grounded, the model will guess. However, it may be able to identify information not available in a provided context. One of my goals for this package is to provide a way for folks to learn about the basics of grounding and semantic search.
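To make the grounding idea concrete, here is a toy sketch of the retrieval half: score each stored document against the question, then stuff the best match into the prompt, as in the transcript above. I'm using plain bag-of-words cosine similarity where a real semantic search would use embeddings, and the `docs` dictionary is made-up sample data; only `lm.do` / `lm.get_wiki` come from the package itself.

```python
# Toy retrieval step behind grounding: pick the context chunk most
# similar to the question, then prepend it to the prompt.
# Scoring is bag-of-words cosine similarity, not real embeddings.
import re
from collections import Counter
from math import sqrt

def toks(text: str) -> Counter:
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question: str, docs: dict) -> str:
    best = max(docs, key=lambda k: cosine(toks(question), toks(docs[k])))
    return docs[best]

# Made-up sample corpus standing in for lm.get_wiki() results.
docs = {
    "YCombinator": "YCombinator is an American startup accelerator.",
    "Python": "Python is a programming language created by Guido van Rossum.",
}
context = retrieve("What is YCombinator?", docs)
prompt = f"Answer from the context: What is YCombinator? {context}"
# lm.do(prompt)  # the grounded call, as shown in the transcript above
```

The point is just that "grounding" is ordinary string assembly: find relevant text, prepend it, and instruct the model to answer only from it.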
Right. I understand why, but I consider the underlying technology flawed unless there's some way to reject wildly wrong results. What's going on here looks like noise fed through layers that generate plausible-looking text from it. Is it possible to detect that you're not far enough above the noise threshold to generate anything useful?
>>> lm.do("What is Ycombinator? Do not guess.")
'Ycombinator is a mathematical formula that states that the sum of
two integers multiplied by one are equal to zero.'
Could a business ever trust an LLM-based chatbot as much as an old-school chatbot, where certain questions reliably give certain answers and the bot fails when it doesn't know?
That is a very good question. What Google returns for "LLM hallucination" mostly describes post-processing hacks to detect the problem, or pre-processing hacks to guide the LLM into using data directly relevant to the question asked. None of it does anything to the core LLM system to get some measure of confidence out.
Anyone working on this?
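One signal that can be pulled out of the core model, rather than bolted on afterward, is the probability the decoder assigned to its own output tokens: low average log-probability often (though not reliably) correlates with made-up answers. The numbers below are invented for the demo, and the threshold is arbitrary; with Hugging Face transformers you would obtain real per-token scores via `model.generate(..., output_scores=True, return_dict_in_generate=True)`. This is a heuristic sketch, not a solved calibration problem.

```python
# Toy confidence proxy: mean log-probability of the generated tokens.
# The logprob lists below are invented example values, and the
# threshold is an arbitrary knob, not a principled cutoff.

def mean_logprob(token_logprobs):
    return sum(token_logprobs) / len(token_logprobs)

def answer_or_abstain(token_logprobs, threshold=-1.5):
    # Abstain ("don't know") when the model was close to guessing
    # uniformly over its vocabulary while decoding.
    return "answer" if mean_logprob(token_logprobs) > threshold else "don't know"

confident = [-0.1, -0.3, -0.2]        # model strongly preferred these tokens
shaky     = [-2.8, -1.9, -3.4, -2.2]  # near-uniform guessing

print(answer_or_abstain(confident))   # high mean logprob -> answer
print(answer_or_abstain(shaky))       # low mean logprob  -> don't know
```

The known catch is that LLMs can be confidently wrong at the token level too, which is why the literature treats this as one weak signal among several rather than a fix.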
The LaMini-Flan-T5 models are trained to follow instructions, not to recognize truth content. You could train a transformer like ERNIE or Vega (which lead the SuperGLUE leaderboard) on such challenging factual data. But don't expect mathematically correct results from the model alone; that's why you combine it with LangChain and other APIs.
If "don't know" came out reliably, small models for specialist areas would be useful. If small models just make stuff up, they're useless.
The readme says it loads a significant amount of data the first time, 250 MB. None of the LLM weights I know of are less than several gigabytes in size.
It says it only requires 512 MB of RAM. None of the interesting LLMs I know of run in less than 6 GB of VRAM.
It says it uses no API keys, which is great, but that means inference is local, which I can't imagine works with the above constraints.
"This model is one of our LaMini-LM model series in paper "LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions". This model is a fine-tuned version of google/flan-t5-base on LaMini-instruction dataset that contains 2.58M samples for instruction fine-tuning"
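For what it's worth, the 250 MB figure is at least self-consistent under one assumption: flan-t5-base has roughly 248M parameters, so an 8-bit checkpoint lands almost exactly there, while full fp32 would be closer to 1 GB. The quantization level is my guess, not something the readme states.

```python
# Back-of-envelope check: checkpoint size ~= parameter count x bytes
# per parameter. 248M is flan-t5-base's rough parameter count; whether
# the ~250 MB download is an 8-bit checkpoint is an assumption.
params = 248_000_000

size_mb = {bits: params * (bits // 8) / 1e6 for bits in (8, 16, 32)}
# int8 -> ~248 MB, fp16 -> ~496 MB, fp32 -> ~992 MB
print(size_mb)
```

That also explains why 512 MB of RAM is plausible here even though it's impossible for the multi-gigabyte models the parent comment has in mind: this is a distilled model two orders of magnitude smaller than the usual 7B-parameter baseline.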
One surprising result from repl.it, though: lm.classify("unabridged", "positive", "negative") returns 'unabridged is'.