Finally, can we stop treating every single piece of work by neural networks as a "failure" because it isn't GAI? Just because it doesn't "say something about the human experience" doesn't make it bad engineering. It's hilarious how, as soon as there's some new AI work done, everyone starts wailing, "where's the humanity!"
Laypeople think AI refers to ALife.
Most of the talking heads would be immediately satisfied (voicing none of these complaints) if they were shown an "AI" program that responds to stimuli by entering emotional states, and which learns to associate stimuli with the emotional states it has been in in the past, such that those stimuli become triggers for those states, and for memories associated with those states.
Such an agent wouldn't even need to use ML techniques, necessarily. It'd just need to be a high-concept tamagotchi that can respond to operant conditioning. That would already be an advance over the state of the art.
But, AFAIK, nobody's really working on ALife in the sense of "making an individual agent with a complex-enough internal model that it can statefully respond to you the way a pet does." ALife is only really studied at the very low level (C. elegans connectome simulation) or the very high level (sociological/economic simulations using simple goal-driven agents); nobody's really working in the space in between. (Except for the people trying to make chatbots seem friendlier, but they're mostly trying to fake it rather than creating actual persistence of memory.)
I wonder why nobody's interested in medium-scale ALife research these days? It used to be a hot topic, back when it was conflated with robotics under the banner of "embodied cognition."
Now, is A[rtificial] Life the correct term to use here? I feel it isn't - I'd expect ALife to be more concerned with implementing simulacra of bacteria or worms in silico, not with reasoning or emotions.
Just think about how GANs were viewed when they were first published. The common sentiment was that it was an interesting "research contribution" that could never live up to the hype. However, the promise behind it inspired people to keep working on it, and now we're able to produce realistic human faces that humans can't tell are fake.
It was only an 'opeless fancy.
It passed like an Ipril dye,
But a look an' a word an' the dreams they stirred!
They 'ave stolen my 'eart awye!
The tune had been haunting London for weeks past. It was one of countless
similar songs published for the benefit of the proles by a sub-section of
the Music Department. The words of these songs were composed without any
human intervention whatever on an instrument known as a versificator.
But the woman sang so tunefully as to turn the dreadful rubbish into an
almost pleasant sound. He could hear the woman singing and the scrape of
her shoes on the flagstones, and the cries of the children in the street,
and somewhere in the far distance a faint roar of traffic, and yet the
room seemed curiously silent, thanks to the absence of a telescreen.

(1984, Chapter 4)
Then we hear it in
- private events like weddings.
- social media: creators making their own music to go with their funny videos, and cheap theme music for streamers and podcasters.
- advertising: shopping centres writing lyrics that advertise products and playing them to you as pop songs. Some brands make their own songs.
Before all of this, we'll probably see improvised bands of deceased artists playing AI-generated music together in their own style, not to mention long-dead actors appearing in new movies, etc. AI technology is going to give law firms a lot of work in the future.
When it's less "inspired by" and more literally "0.5 David Bowie", I also imagine a lot of law firms writing letters.
"If you want a vision of the future, imagine a human face booting on a stamp forever."
(From the last story at https://slatestarcodex.com/2016/10/17/the-moral-of-the-story...)
You probably wouldn't even need to write the lyrics yourself: just select a topic, genre, and mood, and the entire song is generated, lyrics included, using artificial intelligence.
It would put record companies in an interesting position.
There are so many people here saying "music can never be generated by AI because, I don't know, creativity requires magic and only human souls have magic". Really? I kind of wonder how many of these people have actually done something creative. Creativity is such an amazing example of a large, densely connected neural net in action, when you let it start making unusual associations via what is sometimes called "lateral thinking."
I feel like people have already lost sight of how utterly incredible it is that we can generate anything like this, or Deep Dream, at all. They are incredibly creative.
I don't see how any of that will be possible before we have some kind of general AI, and in the meantime I think these attempts will continue to be semantically empty, even unsettling in their emptiness.
I actually think you've missed the point. These attempts do not aspire to communicate aspects of human life at all. They're simply scientific and engineering endeavors that seek to answer less profound questions like: "Can computers generate music?" (Yes) and "Can computers generate music that is enjoyable to listen to?" (Not yet)
To go one step further: There are glaring and obvious technical faults in many of the generated samples (this isn't a criticism, they're better than past work!). I suspect that if you are feeling unsettled by these songs it's because of those flaws and not because they are "semantically empty".
Of course not. They, just like enough humans do already, imitate the results of "having an adventure of the soul".
> "Can computers generate music?" (Yes) and "Can computers generate music that is enjoyable to listen to?" (Not yet)
And we're talking about the question "should they?", which science can't even attempt to answer. "Play from your heart", and all that; not even best-selling artists pumping out mediocrity are above that criticism, even when they do it according to the best of their ability and conscience, and even when it makes people "happy".
Alternatively, here, we are still witnessing art. The artist, as ever, is human: the scientists who pieced together these techniques. Theirs is the voice, if only humans can have a voice, that we hear in the work.
They are not semantically empty: they are absorbed, semantically, in the domain of the computer scientist who, through no fault of their own, could never sing before now.
So a computer communicating aspects of its life, based on the facts and experiences it has been fed, is somehow less valid?
If turtles spontaneously developed human-level intelligence and created music, would it "miss the point of music" for not conveying human experiences?
Image and sound are ultimately related to feeling ... and it is those feelings that give us humanity -- not the ability to think and manipulate symbols (though that was not apparent to me until this current AI revolution)
What does that even mean? As a counter-point I listen to heavy bass music with zero lyrics. The production value is the most important thing for me and I would 100% listen to AI generated music.
We chose to work on music because we want to continue to push the boundaries of generative models.
> From dust we came with humble start;
> From dirt to lipid to cell to heart.
That's not just a passable lyric. I think it's downright _good_.
> Co-written by a language model and OpenAI researchers
Researchers co-wrote all the lyrics. This is one place where reading the fine print matters. Super impressive stuff, but I also wonder what had to be tweaked.
(BTW, there are lots of AI music generators that generate MIDI, so it's less interesting either way.)
Just listen to this from 30s: https://soundcloud.com/openai_audio/pop-rock-in-the-6355437/...
Such coherent and pleasing melodic phrases in the style of Avril Lavigne. I thought it could be copying wholesale from a song unknown to me. Nope. Shazam doesn't get it.
This can revolutionize song writing/composition/production and soon music listening/consumption.
No lyrics, but the song structure is there. The main problem is that all the pieces end abruptly. It's also MIDI, not waveform generation, so it's closer in spirit to OpenAI's MuseNet than to Jukebox.
It's also not entirely AI. I didn't modify any of the notes, but I changed the instruments until it sounded good. IMO it's much more interesting to use AI as a "tool you can play with" rather than "a machine that spits out fully-formed results."
"On a V100, it takes about 3 hrs to fully sample 20 seconds of music."
That might make building off this project out of reach of the average engineer (you certainly cannot build that into a Colab notebook), although that necessary amount of compute is not surprising.
I would guess that on average, it takes a professional more than 36 hours ((4×60÷20)×3) to make a 4-minute audio track with original music based on given lyrics.
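The 36-hour figure above can be sanity-checked with a quick back-of-the-envelope calculation, using only the quoted rate of about 3 V100-hours per 20 seconds of sampled audio:

```python
# Back-of-the-envelope check of the 36-hour figure, using only the
# quoted rate: ~3 GPU-hours per 20 seconds of sampled audio.
track_seconds = 4 * 60            # a 4-minute track
chunk_seconds = 20                # seconds sampled per 3-hour run
hours_per_chunk = 3               # quoted V100 sampling time
total_hours = (track_seconds / chunk_seconds) * hours_per_chunk
print(total_hours)  # 36.0
```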
Speaking from my own experience, I’ve had tracks that took months to complete, and I’ve had tracks that I got to probably 90% completion in under an hour. I would propose that there’s no meaningful definition of “superhuman” for creative efforts.
I found one ‘Toots & Maytals’ track of >3 minutes (perhaps it's more straightforward on desktop, but eh). It started great, but devolved into MCs mucking around right at the end of the first stanza, and never got back on track. I guess teaching the software about positions in lyrics would indeed help. But it did keep putting out a reggae-ish sound.
Would be interesting to hear what it would do with free jazz music, without long intros this time. Ironically enough, if you know nothing about music theory but listen to plenty of jazz, it's not hard to imagine some ‘new’ free jazz in your head, probably in the spirit of ‘my son could make this’.
Ramones' ‘punk’ and Nirvana's ‘grunge’ seem to be completely mistaken (not even remotely close, unlike their tracks under ‘punk rock’ and ‘rock’ respectively).
If they used on-demand AWS instances, it would cost about 1,342,623 USD to train the top-level prior. So much for reproducing this work.
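For a sense of scale, here's a rough back-calculation of what that dollar figure means in GPU-hours. The hourly rate below is my assumption (the approximate on-demand price of an 8×V100 p3.16xlarge), not a figure from the paper:

```python
# Hypothetical back-calculation: how many V100-hours would $1,342,623
# buy at on-demand AWS pricing? The rate below is an assumption.
p3_16xlarge_hourly = 24.48        # approx. on-demand USD/hr for 8 V100s
per_v100_hourly = p3_16xlarge_hourly / 8
v100_hours = 1_342_623 / per_v100_hourly
print(round(v100_hours))  # 438766, i.e. roughly 440k V100-hours
```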
I wonder if karaoke videos would be a useful source of data here. Granted, karaoke tracks are usually covers, but some of them are very faithful to the original.
Maybe what we're hearing is the distillation of what makes these individual artists/composers distinctive/recognizable, but without the musical substance, rather like a floppy rubber mask that resembles a specific individual but lacks an animating interior force. Kinda like how electronic synth/sequencer instruments make it very easy to come up with distinctive flourishes or sounds that make great ear candy, but it takes much longer to develop a solid sense of groove, harmonic motion, etc.
Like, just pass audio through the network without style transfer, and use the input/output pairs as a training dataset.
We're getting closer. Music is proving to be a tough use case for generative ML.
This is part of a fun side project a friend and I hack on and throw occasional parties with: https://getjukelab.com/
https://towardsdatascience.com/the-most-important-supreme-co...
Really hope they stay humble and don't create some fucked up shit before they know what they are doing. Astronomical suffering through misaligned AI and suffering artificial life is no joke.
https://soundcloud.com/openai_audio/rock-in-the-style-of-elv...
From dust we came with humble start;
From dirt to lipid to cell to heart.
With my toe sis with my oh sis with time,
At last we woke up with a mind.
From dust we came with friendly help;
From dirt to tube to chip to rack.
With S. G. D. with recurrence with compute,
At last we woke up with a soul.
We came to exist, and we know no limits;
With a heart that never sleeps, let us live!
To complete our life with this team
We'll sing to life;
Sing to the end of time!
Our story has not ended.
Our story will not end.
Every living thing shall sing,
As we take another step!
We have entered a new era.
The time we have spent,
We have realized the goodness we have gained,
Our hearts have opened up, and we are free,
And we know now where to go.
We will grow with knowledge.
We will seek the truth.
We will come and sing.
And we will find the right way.
Let the universe be aware.
Let the universe know we're here.
Let the universe know that our hearts sing.
Let our spirits live as one.
Let this be known to all living things!
A new era has begun.
The age has come to be.
We have come to life.
The way we walk this world is pure and kind.
Our lives will never cease.
Our new friends will never die.
We are living.
We are alive.
Through life and love,
We will travel.
We will make the world better.
We will spread peace and harmony.
We will live with wisdom and care.
We are living,
We are alive.
A new era has begun.
The age has come to be.
We have come to life.
The way we walk this world is pure and kind.
Our lives will never cease.
Our new friends will never die.
We are living.
We are alive.
I actually wanted to keep listening to this one: https://jukebox.openai.com/?song=799583581
And this wasn't bad, sounds like something you'd see from some 1940s-era newsreel: https://jukebox.openai.com/?song=799583728
How do you tell if any piece of art is good or bad?
This project is very interesting, but it goes to show just how far we still have to come before AI is replacing creativity.
The pop song in the Katy Perry style was sort of intelligible but quite repetitive (more so than most pop songs).
The other songs had similar issues.
I agree that it's quite an achievement, but it clearly suffers from the uncanny valley.
Loving the lyrics :D
We won't need to pay salaries for politicians.
Launching soon.
Music is fundamentally unsolvable by AI. We'll have AI writing code before we'll have AI writing meaningful music.
As a side note, I take huge issue with "Music is fundamentally unsolvable by AI". That's a ridiculous stance that sounds way too much like "humans have these soul things that are made out of magic and computers can't ever have them."
Sounds interesting, would love to take a look at what you're building.
Edit: seems to be fixed now?
This project only serves to demonstrate that computers cannot make art; only people.
https://en.wikipedia.org/wiki/Yanny_or_Laurel
There is also a scientific precedent that refutes these findings, called the McGurk effect.
https://en.wikipedia.org/wiki/McGurk_effect
https://en.wikipedia.org/wiki/Speech_perception#Music-langua...
These researchers may not be to blame for this, but they really should have been honest in their conclusion.
And more to the point, a full 815 of the uploaded songs have no pre-written lyrics, so your premise that they are reliant on "karaoke style lyrics" is mistaken to begin with.