A 100M-token context window means it could plausibly store everything you've ever told it, accumulated over years. Couple this with multimodal capabilities, like a robot encoding vision and audio into tokens, and you could get autonomous assistants that learn your house/habits/chores really quickly.
At least with other very large context windows, like the ones Claude offers, RAG is still very much preferable: it avoids confusion and collisions with information in the context that isn't correct or relevant.
Sure, you can also prune the context window, and for many existing models you need to do that anyway (I often use an LLM to summarize a long context to keep it going), but doing it with RAG still seems much easier. This especially holds true if you use good knowledge-management techniques to structure your RAG so your retrievals are optimized.
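For what it's worth, here's a rough sketch of the summarize-to-prune approach I mean, in Python. Everything in it is a placeholder assumption, not any particular vendor's API: `call_llm` stands in for whatever completion endpoint you use, and the word-split token count is a crude proxy for a real tokenizer.

```python
def call_llm(prompt: str) -> str:
    """Stand-in for whatever completion API you use (hypothetical)."""
    raise NotImplementedError

def rough_token_count(text: str) -> int:
    # Crude proxy; swap in your model's actual tokenizer.
    return len(text.split())

def prune_context(messages: list[str], budget: int = 4000, keep_recent: int = 10) -> list[str]:
    """If the running context exceeds `budget` tokens, compress the
    oldest messages into an LLM-written summary and keep only the
    most recent turns verbatim."""
    if len(messages) <= keep_recent:
        return messages
    if sum(rough_token_count(m) for m in messages) <= budget:
        return messages
    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    summary = call_llm(
        "Summarize this conversation so far, preserving facts, "
        "decisions, and open questions:\n\n" + "\n".join(old)
    )
    return ["[Summary of earlier conversation] " + summary] + recent
```

It works, but you're trading retrieval precision for lossy compression: anything the summarizer drops is gone for good, whereas a well-structured RAG store can always be re-queried.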
P.S. On a side note, how confident are we that these very-large-context-window models aren't just RAG in disguise? The models boasting very large windows are, at least for now, all locked behind API-only access.
Even GPT and Claude make glaring mistakes with short prompts.
1: https://github.com/hsiehjackson/RULER (RULER: What’s the Real Context Size of Your Long-Context Language Models)
They did mention it but didn't provide concrete benchmarks.
> We’ve raised a total of $465M, including a recent investment of $320 million from new investors Eric Schmidt, Jane Street, Sequoia, Atlassian, among others, and existing investors Nat Friedman & Daniel Gross, Elad Gil, and CapitalG.
Yeah, I guess that'd do it. Who are these people and how'd they convince them to invest that much?
Something doesn’t add up here.
> Then, it hit them. They thought, “What if we create bite-sized information, following the same scientific standards of peer-reviewed journals, to empower people to solve climate change?”
> Together, they started combing through climate science articles and turning them into social-media friendly content under the name ClimateScience. After two short months, ClimateScience went viral and grew to 40,000 followers on Instagram. People started sending in private messages, asking how they could help. A team of curious, kind and passionate people quickly grew, all dedicated to making climate education more understandable for everyone.
> Just a few years later, ClimateScience has grown into the world’s biggest climate education platform! We create educational courses, videos, resources and tools to improve climate understanding and education. It’s all completely free and just a few clicks away on any device.
According to LinkedIn, they have 50-200 employees. Is that plausible? How many of those are actually FTEs? Where is their revenue coming from if it's all completely free? Looking at the team page, this feels off, like it's a bunch of university students padding their CVs.
That was after dropping out of university, one year into a bachelor's in computer science, during which he apparently had a 5-month contract at Facebook AI where he "lead the development of 'DREAM', an algorithm that's 100x more data-efficient and trains faster than the previous state-of-the-art in model-free multi-agent Deep RL. Paper: https://arxiv.org/abs/2006.10410".
How does this lead to Magic.dev, and to third parties investing $465 million in it? Either this guy is a prodigy or this is the next Theranos.
edit: I looked into the other co-founder just now and I feel like I'm in the Twilight Zone.