I've always imagined AGI (perhaps naively) as being achieved by clever usage of ML, plus some utilization of classical/symbolic AI from pre-AI winter days, plus probably some unknown elements.
a). an internal feedback loop that evaluates a possible output without actuating it, and self-modifies the parameters if the possible output is not what is needed
b). the capability (based on a) to model its own behaviours without acting on them, and to model other agents' behaviours and incorporate that model into the feedback
c). the ability to switch between modelling its own behaviour and other agents' behaviour intentionally, by the model itself - as part of the feedback loop
i.e. what I feel is totally missing in self-driving cars today is the capability to model OTHER traffic participants' actions and intentions; an experienced and attentive human driver does this all the time: pays attention to pedestrians on the side in case they jump in front of the car, pays attention to where other cars are LIKELY to go, pays attention to how the bicyclist currently being overtaken may fall, even pays attention to a random soccer ball flying out of a courtyard because a kid may be chasing it. I am not seeing any self-driving car trying to model any agent other than itself.
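A toy sketch of the inner loop in (a), just to make the idea concrete: propose an action, run it through an internal model instead of actuating it, and nudge the parameters when the predicted outcome misses the goal. Everything here is made up for illustration; no real system is structured this simply.

```python
# Hypothetical sketch of (a): evaluate a candidate action with an internal
# model before actuating it, and self-modify the policy parameter otherwise.

def internal_model(state, action):
    # stand-in "world model": predicts the next state without acting
    return state + action

def feedback_loop(state, goal, param=0.5, lr=0.2, max_iters=50):
    for _ in range(max_iters):
        candidate = param * (goal - state)            # policy proposes an action
        predicted = internal_model(state, candidate)  # simulate, don't actuate
        error = goal - predicted
        if abs(error) < 1e-3:
            return candidate                          # good enough: actuate it
        param += lr * error / max(abs(goal - state), 1e-9)  # adjust parameters
    return None

print(feedback_loop(state=0.0, goal=10.0))  # converges to an action near 10.0
```

Points (b) and (c) would then amount to running the same kind of internal simulation over a model of another agent rather than over your own policy.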
If you are interested in self-driving cars, I can highly recommend their presentation from November 2021:
https://youtu.be/uJWN0K26NxQ?t=1467
For me it felt more convincing than Tesla's (a few months prior);
The thing that would convince me AGI is ready would be for it to play a convincing game of poker. Or join a conversation midway through, listen to it, and engage with it actively. Show that machines are able to pick up on social cues, understand them, and learn new ones. It's a high bar, yes, but it's in my opinion a prerequisite for a self-driving car that's able to share roadways with other cars, cyclists, and kids playing in the street.
"A robot modeled itself without prior knowledge of physics or its shape and used the self-model to perform tasks and detect self-damage."
The reasoning is that given enough training data the system would know the pedestrian is going to jump out or the cyclist is going to fall just based on sheer volume of training examples. It would have seen that scenario tons of times in the image data.
Whether that will actually work is the question though
Biology is glacially slow in comparison and one of the advantages from computing is being fast.
I believe that not modeling it is partially by design as a result of responsibility and blame frameworks. If you depend upon possible actions taken by others to be safe you are reckless. Extrapolating from current motions is more reliable than trying to profile everything. "They are moving towards the street at 3mph and 20 ft away, their vector will intersect with car, brake to avoid collision or accelerate enough to leave intersection zone before they can even reach us" seems a more reliable approach. It isn't like a kid will suddenly teleport into the road.
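A back-of-envelope version of that check, with the numbers from the comment (the deceleration and reaction-time values are my own assumptions, not anything a real planner uses):

```python
# Toy trajectory check: does the pedestrian's current vector reach the
# car's path before the car can stop or clear the crossing point?
FT_PER_MILE = 5280
SECONDS_PER_HOUR = 3600

def seconds_until_roadway(distance_ft, speed_mph):
    speed_ft_s = speed_mph * FT_PER_MILE / SECONDS_PER_HOUR
    return distance_ft / speed_ft_s

def stopping_distance_ft(speed_mph, decel_ft_s2=15.0, reaction_s=0.5):
    v = speed_mph * FT_PER_MILE / SECONDS_PER_HOUR
    return v * reaction_s + v * v / (2 * decel_ft_s2)

t_pedestrian = seconds_until_roadway(distance_ft=20, speed_mph=3)  # ~4.5 s
d_stop = stopping_distance_ft(speed_mph=25)                        # ~63 ft

print(f"pedestrian reaches roadway in {t_pedestrian:.1f} s")
print(f"car at 25 mph needs ~{d_stop:.0f} ft to stop")
# If the car cannot clear the intersection zone before t_pedestrian,
# brake now; no profiling of intent required.
```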
For what it's worth, this is my view as well. And I don't think it's particularly naive. Plenty of people have researched and/or are researching aspects of how to do this. But how to combine something like a neural network, with its distributed (and very opaque) representations, with an inference engine that "wants" to work with discrete symbols is non-obvious. Or at least it appears to be, since nobody apparently has figured out how to do it yet - at least not to the level of yielding AGI.
but I've never heard a compelling argument for why pure ML would get us there.
The simplistic argument would be that ML models are, in some sense, trying to replicate "what the brain does", and it stands to reason that if your current toy ANNs (and let's be honest - the largest ANNs built to date are toys compared to the brain) are something like the brain, then in principle, if you scale them up to "brain level" (in terms of numbers of neurons and synapses), you should get more intelligence. Now on the other hand, anybody working with ANNs today will tell you that they are at best "biologically inspired" and aren't even close to actually replicating what biological neural networks do. Soo... while people like Geoffrey Hinton have gone on record as saying that "ANNs are all you need" (I'm paraphrasing, and I don't have a citation handy, sorry), I tend to think that in the short term a valid approach is exactly what you suggested: combine ML and use it for what it's good at (pattern recognition, largely), and use "old fashioned" symbolic AI for the things that it is good at (reasoning / inference / etc.).
Now, to figure out how to actually do that. :-)
Playing chess at a grandmaster level was considered something only a human could do until the 1990s, and now no human has beaten the best computer in 17 years, while AGI seems further away than ever.
Mark my words: we'll create an AI that can pass the Turing test this decade, but we'll still be as far away from the badly defined general problem as we ever were.
My brother just became a grandpa, and I was watching his grandson navigate the world this past weekend. It's unbelievable how quickly the brain can extrapolate a new relationship between objects/actions/etc. and then apply it elsewhere. Minimally you see it in the drinking action applied to all sorts of things, this sort of repetitive clenching/releasing of the fingers to find things to grip without looking, and so on. Watching mom use a fork and very quickly understanding how to grasp and manipulate it. The model of just training everything from exogenous data into a flat network seems like it will hit some asymptotic limit.
I am a proponent of using a working theory that intelligence is an emergent property and that we can in principle create new intelligences in a lab (or ML warehouse) if we provide the proper conditions, but that finding and maintaining those conditions is extremely hard. Some state-of-the-art research today aims to integrate recognition capabilities (image recognition and object detection/tracking on video, voice extraction from audio, text) with advanced generative models for language and behavior, as well as realtime rendering systems that can create realistic humans.
If we combine those we can make a bot that appears fully interactive, passes all Turing tests, convinces a typical person it's another person... and still has nothing inside that researchers would call "artificial intelligence". It might even solve science problems that we can't, without having any spark of creativity or agency. Or maybe when we make a bot with all those properties, some uncanny valley is crossed and out pops something that has objective AGI?
As the wise robot once said, "if you can't tell the difference, does it really matter?". We should forge ahead with building datacenter-scale brains and feed them with data and algorithms, while also maintaining a cadre of research scientists who are attuned to the ethical challenges of doing so, an ops team trained to recognize the early signs of sentience, and an exec team with humanity.
Heuristically, we came to be by a very dumb process of piling up newer generations. If my pet would communicate with me on the level of GPTx, I would be very impressed. That's why nowadays I have some scepticism for the ANN critics' arguments, though think it would be neat if they were right.
The thing that I dislike the most in these discussions is the pervasiveness of the AGI concept and the assumption of a linear scale of intelligence. Again, I can intuitively say that I'm more intelligent than my pet: but to quantify this, we'd need to use something silly like brain size, or qualitative/arbitrary things like "this being can talk". I think that human intelligence is a somewhat random point in a very multi-dimensional space, one that technology may never even have a reason to visit. But people tend to subscribe to the notion that this is the very important "point where AGI happens".
GPTx is not communicating with anyone. It is generating text that resembles text it had in its training set. The fact that human text is normally a form of communication doesn't make generating quasi-random text communication in itself. GPTx is no more communicating than a printer is when printing out text.
A cat or dog leading you to their empty food bowl is actual communication, and they are capable of much more advanced communication as well (especially dogs). The fact that it doesn't look like written text is not that relevant. They are of course worse than GPTx at producing text, just like they are worse than a printer at writing it on a blank page.
My feeling is this is a PR push by Facebook. All tech companies keep touting AI, especially Google but also Microsoft, Apple and Amazon. In some sense I believe these businesses want to control how their own success is defined. That is, they are right now convincing everyone that tech dominance is equivalent to AI dominance, which is equivalent to ML dominance. In some sense this is turning into a purity test, like "which tech company is the most AI focused". I expect this kind of PR to accelerate as each company tries to prove its AI bona fides to the market.
I listened to the episode with Yann. Compared to other talks (e.g. the previous one with Brian Keating) it was a bit dull and uninteresting. The answers were not that insightful.
I do the same thing and feel the same way, like I'm astroturfing or something. If it's any consolation, I don't remember ever seeing your references, and I hope you don't remember mine.
I suppose the same will be true for most ML-related areas of research sooner or later, at least as far as applied ML is concerned.
Already, a substantial amount of research innovation in NLP and CV has been coming from big companies in recent years.
Of course there is a discussion to be had about what that means for society at large. At this point, a lot of said companies do publish their results at conferences etc. But what if at some point they decide to be as "open" as OpenAI (i.e., not)?
In the NLP space there's been a lot of work recently around reducing model sizes, since they've started to reach the point where model weights sometimes don't fit in the memory of most GPUs.
There are also projects like MarianNMT which completely abandon Python and write heavily optimized models in fast languages that can run quickly and accurately even without GPUs. I think we'll see a lot more of this, though of course there's a pretty big barrier in the sheer rarity of being good at both deep learning research and writing optimized low-level code.
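For a sense of what the simpler size-reduction tricks look like, here is post-training dynamic quantization in PyTorch, which stores Linear-layer weights as int8. This is a generic illustration of one technique in that space, not how MarianNMT or any particular production system does it:

```python
import torch
import torch.nn as nn

# Stand-in for a much larger transformer; only the Linear layers matter here.
model = nn.Sequential(
    nn.Linear(1024, 4096),
    nn.ReLU(),
    nn.Linear(4096, 1024),
)

# Post-training dynamic quantization: weights become int8, activations are
# quantized on the fly, so the Linear weights shrink roughly 4x without retraining.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 1024)
print(quantized(x).shape)  # same interface as the original model
```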
As for writing low-level code, I thought that was something usually handled by the compiler, or where even the advanced high-performance (high-price) offerings mostly tweaked the compiler after analyzing its output. Not my direct space, so I speak with no authority.
Constraints are the mother of creativity.
I don't see anything new here; these institutions that encourage people to share are old, so it must be a problem that has been recognized for a while.
Aside from some of the academics and the "gain and share knowledge for knowledge's sake" types they hire, why would they care?
For the record, I don't like the idea of scientific research becoming proprietary. At all. But is there anyone credulous enough to think these organizations would willingly risk their bottom line for principles like "openness", and not just play the PR games to make themselves appear open and concerned?
In other words "Don't LOOK evil but do evil when no one's looking".
The Frances Haugen leaks already show how damaging such openness can be.
If academics want to do research on expensive cutting-edge tech, they will have to join industrial labs or pool together resources, similar to particle physics or drug discovery research today.
Honestly ... this is a lot of GPUs ... but is it the biggest...?
> Model training is done with mixed precision on the NVIDIA DGX SuperPOD-based Selene supercomputer powered by 560 DGX A100 servers networked with HDR InfiniBand in a full fat tree configuration. Each DGX A100 has eight NVIDIA A100 80GB Tensor Core GPUs
So Nvidia used 4480 GPUs to train Megatron-Turing NLG 530B for example.
As of today, Nvidia has the very slightly smaller cluster you outlined at ~5k, Microsoft has a few of roughly that size, and Microsoft also built a 10k GPU cluster for OpenAI 2 years ago, but those are V100 GPUs.
So, is 6k A100 "bigger" than 10k V100? Depends exactly how you use them, in a perfect usage scenario yes, slightly. In real life maybe not.
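The rough arithmetic behind that, using published peak dense FP16 tensor-core throughput (A100 ~312 TFLOPS, V100 ~125 TFLOPS). These are paper numbers only; real utilization depends on interconnect, memory bandwidth, and the workload, which is exactly why "bigger" is ambiguous:

```python
# Back-of-envelope peak throughput comparison (spec-sheet numbers, not measured).
a100_tflops, v100_tflops = 312, 125

rsc_peak = 6_000 * a100_tflops       # ~1.9 EFLOPS peak (A100 cluster)
openai_peak = 10_000 * v100_tflops   # ~1.3 EFLOPS peak (V100 cluster)

print(f"{rsc_peak / 1e6:.2f} EFLOPS vs {openai_peak / 1e6:.2f} EFLOPS")
print(f"ratio ~{rsc_peak / openai_peak:.2f}x on paper")
```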
The point of making this machine is to have a lot of A100s going at the same time, and that will unblock some small set of researchers who are working on time-sensitive competitive research projects by giving them a slight throughput and latency advantage on the largest problems. The vast majority of users would be better served by a small number of cheaper, slower GPUs that they had exclusive access to for the longest time period they could afford to wait.
“The experiences we’re building for the metaverse require enormous compute power…and RSC will enable new AI models that can learn from trillions of examples, understand hundreds of languages, and more,” said Meta CEO Mark Zuckerberg.
I don't really understand how AI processing is going to make the 'experiences' any better? This seems to me like investor fluff, saying they have some insane capability that other 'VR providers' don't have...

- 3d worlds with style transfer on the textures, like maybe there's a cafe with the visual style of Starry Night or something
- NPCs with conversation models that are fine-tuned for each NPC's personality and keep some history for each person they talk to, for continuity
- Game-playing AI on NPCs that make them go around doing actual things or playing minigames with players
- The usual user tracking models, figuring out what people like to do in the metaverse and giving them more of that
- All the lower-level stuff that AI can do better - user inputs, rendering, etc.
Whether or not they can pull it off is a separate question - I think the tech is close but not quite there yet - but there's no doubt that the metaverse concept of "an expansive virtual world with lots of fun things to do" has many ways to use huge amounts of computation.
Example 2: Using AI upscaling (like Nvidia) to improve visual fidelity in games.
Example 3: Hand/body tracking for avatars.
The more AI compute, the more experimentation researchers can do.
In summary, rather than actually streaming video to the person you're chatting with, you send a keyframe, and then 'compressed' video is sent over the wire, and 'decompressed' at the receiver end.
I'm putting 'compression' in quotes because I'm not sure I'm comfortable calling it compression. Basically, you're remotely controlling an avatar of yourself.
While the obvious usage of this is reducing bandwidth used (in their example, an h264 stream at ~100KB/frame can be compressed to 0.1KB/frame, literally a thousandth of the bandwidth), it opens up some VERY interesting possibilities for a company like Meta (check from about 1:55 onwards in the video below).
You can view someone's face from any angle, not just the angle they're speaking from (as you might in a VR world), or you can even map the key points onto a completely different keyframe, allowing for hyper-realistic avatars or next-level virtual backgrounds (imagine: you send a keyframe of you sitting at your desk and hop on a video conference from the beach, and no-one's any the wiser as long as the sea is quiet enough)
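A sketch of that keyframe-plus-keypoints pipeline. The detector and generator below are trivial stand-ins (hypothetical, not the real models), but the data flow is the point: one full frame up front, then only a handful of floats per frame afterwards:

```python
import numpy as np

def extract_keypoints(frame):
    # stand-in: a real system runs a facial-landmark/keypoint network here
    return frame.reshape(-1)[:20].astype(np.float32)  # ~20 floats ≈ 0.1 KB

def synthesize_frame(keyframe, keypoints):
    # stand-in: a real system warps/animates the keyframe with a generative model
    return keyframe

# Sender: one keyframe (~100 KB as a compressed image), then tiny per-frame payloads.
keyframe = np.random.randint(0, 255, (256, 256, 3), dtype=np.uint8)
payloads = [extract_keypoints(keyframe) for _ in range(30)]  # one second @ 30 fps

# Receiver: reconstructs every frame locally from the keyframe + keypoints.
frames = [synthesize_frame(keyframe, kp) for kp in payloads]
print(len(frames), "frames reconstructed,", payloads[0].nbytes, "bytes per frame")
```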
The value add is that the engineering community that they employ has a job, the stock stays higher because of their perceived value add to the tech, and the push to control data continues unburdened by something as trivial as a lack of compute power. Hooray. Progress.
A lot of this stuff is like trickling tech from F1 teams down into consumer cars. Some of the tech will likely end up in commodity datacenter/cloud stuff.
https://m.youtube.com/watch?v=BTETsm79D3A
There is never enough compute power. Dwarf Fortress on a supercomputer?
> understand
I know this is CEO-talk, but I sometimes wonder if these pricks really think they are inventing AI.
Evidence: actual work experience at building latent representations to characterize customer behavior at FAANG. It's hard to come up with something that really gets you, but it's not hard to come up with something likely to make you spend more. You're surprisingly predictable on that axis and even if you aren't because you put the hours into being a crazy outlier, almost everyone else is, and you don't matter.
It's just a bunch of GPUs. It could be used for anything people can imagine, good or bad.
Yes, but anything that could do that, will be used for military robots and context-aware ubiquitous comms surveillance.
> It's just a bunch of GPUs. It could be used for anything people can imagine, good or bad.
And nuclear power can be used for good or ill, too. But when the ills grow big enough, it's still fair to worry about proliferation and possible end-of-civilization events. It's unhelpful to reassure someone building a bomb bunker "Don't worry, nuclear power is just a tool, it can be used for good OR for bad".
Which, of course, is great for accessibility.
[0]: https://deepmind.com/blog/article/alphafold-a-solution-to-a-...
So you’ll have small human crowds but loads of anonymous avatar androids taking all the good fishing spots, riding the trails backwards, etc.
I’m joking hopefully
I fear the amount of human information this AI is going to be free to analyze from Facebook, what it will deduce about us, and how Meta will then use it to generate capital.
For general tasks like language modeling, we are still seeing predictable improvements (on the next-token-prediction loss) with increasing compute. We will very likely be able to scale things up by 10,000x or so and continue to see increasing performance.
But what does this mean for end users? We are probably going to see sigmoid-like curves, where qualitative features of these models (like being able to do math, or tell jokes, or tutor you in French, or provide therapy, or mediate international conflicts) will suddenly get a *lot* better at some point in the scaling curve. We saw this for simple arithmetic in the GPT-3 paper, where the small <1B param models were terrible at it, and then with 100B scale suddenly the model could do arithmetic with 80%+ accuracy.
Personally I would not expect diminishing returns with increased scale; instead there will be sudden leaps in ability that will be very economically valuable. And that is why Meta and others are so interested in scaling up these models.
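A toy illustration of the "predictable improvements" part: a power-law curve of loss versus compute, with invented coefficients (not fitted values from any paper). The smooth curve is what the scaling-law papers measure; the sudden capability jumps sit on top of it:

```python
# Illustrative power-law scaling: loss ~ (C0 / C) ** alpha, constants made up.
compute = [1e0, 1e2, 1e4, 1e6]  # relative compute budgets (arbitrary units)
C0, alpha = 1.0, 0.05           # assumed constants, not fitted values

for c in compute:
    loss = (C0 / c) ** alpha
    print(f"compute x{c:>9,.0f}: predicted loss {loss:.3f}")
```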
The actual difference between the two has narrowed considerably compared to years past, and seems to come down more to how a collection of computers is used than to what it is.
Saturating a box with 500+ GB GPU RAM is fun. Only our gov users ask us for help on that typically: most of our users are commercial nowadays, but with much smaller/scaled down GPU rigs. I think that'll change as the fintechs keep improving and software gets easier, but they are still not there (outside of niches). Working on it :)
(If you like writing shaders, we are hiring :D )
In addition to the other responses, I like pointing people to this talk[1] by Jeff Hammond for a comprehensive answer to this question (you can skip to the 11:15 timestamp).
[1] https://uchicago.hosted.panopto.com/Panopto/Pages/Embed.aspx...
That sounds wicked evil. If ads, marketing and habit inducing platform designs are a problem now, imagine what this will lead to.
To understand what drives your users more than the users understand it themselves and to use that understanding for profit. Intensified.
And that's not to mention surveillance; you know DARPA and the NSA want their hands all over this.
> The company declined to comment on the location of the facility or the cost
It's generally common practice not to disclose the addresses of your data centers, but they can usually be discerned with a bit of research. Journos aren't going to dig that deep.
The best thing is, assuming the 'quality' of their product scales with the amount of work put into it, we'll get... 30% more accurate ads? Somehow they'll steal 30% of Google's lunch? Well, I don't know, but it sure looks like an incredible amount of engineering talent has been put toward getting us 30% more nothing.
If we increase the efficiency of something (let's say software) by 100%, all the good things that can be done with software become 100% more efficient. But that gain is not equivalent in effect to the same 100% gain applied to all the bad things that can be done with software: many destructive actions are already orders of magnitude more efficient than constructive ones, so the net result is that the world gets more dangerous.
For a more physical example, consider that a truck filled with powerful explosives could knock down a skyscraper. That is, for a handful of man-hours, it is possible to undo the work of hundreds of thousands of man-hours, plus the hundreds of thousands of man-hours society would need to divert to managing the aftereffects of that disaster, the emotional cost, and so on.
There's an underlying efficiency bonus that destructive actions have that is not being accounted for.
Adtech is still a bad joke.
Correction: terafo pointed out they shipped 12.7m cards in Q3 2021 alone.
I want to hate this idea, but it would be the same as hating machines replacing manual labor over the last 100 years.
I'm not sure what to think, nor how to prepare myself for the next 20 years.