Way to bury the lede! What's this magic algorithm being spoken of, how does it work so well?
> A more detailed overview of these changes can be found in our <a>changelog</a>.
The changelog is even more terse, saying only which key combo activates it :(
A way bigger deal to me is mentioned at the bottom of the "Other changes" section of said changelog: Opus support!
Edit: wait, under "libraries" it says
> lib-time-and-pitch implements a time stretching algorithm originating in Staffpad. It currently is one of the highest-quality time stretching algorithms for music on the market.
Staffpad links to https://staffpad.net and seems to be some music app. The library name links to a Microsoft repository (https://github.com/audacity/audacity/tree/e4bc052201eb0e6e22...) with no readme. The source code is documentation enough :)
It's a granular time-stretch algorithm. By the sound of the artifacts at extreme settings it's akin to the 'Complex Pro' algorithm in Ableton Live (if you have any reference to that) and seems to be equally effective at a wide variety of sources (percussion, vocals, full records).
Is it better than most commercial offerings? Hard to say after a brief test, but it's not bad!
I suspect it's plenty good for the needs of Audacity's audience, who are unlikely to be demanding audio pros. As an audio professional I would never use Audacity, but if you need a quick, free, and (relatively) simple option to time-stretch a file, then it should fit the bill.
Whether you do or not, there are a hell of a lot of hawkers of software and "hardware" alike who stand ready to sell their overpriced "professional" tools to people who fail to make it in the music industry (which is going to be most of them, regardless of tools or even skill). A high sticker price is a way to prove one's commitment - to oneself, at least. The end market is less likely to care.
The basic idea is this. For a time-stretch factor of, say, 2x, the frequency spectrum of the stretched output at 2 sec should be the same as the frequency spectrum of the unstretched input at 1 sec. The naive algorithm therefore takes a short section of signal at 1s, translates it to 2s and adds it to the result. Unfortunately, this method generates all sorts of unwanted artifacts.
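The naive scheme can be sketched in a few lines of numpy (names and parameters here are illustrative, not Audacity's code): grains are read at the input hop but written at the stretched hop, cross-faded with a window.

```python
import numpy as np

def naive_stretch(x, factor, grain=1024, hop=256):
    """Naive time stretch: read grains of `x` every `hop` samples,
    write them every `hop * factor` samples, cross-faded with a
    Hann window. No phase correction, hence the artifacts."""
    win = np.hanning(grain)
    n_grains = (len(x) - grain) // hop
    out = np.zeros(int(len(x) * factor) + grain)
    for i in range(n_grains):
        r = i * hop                # read position (input time)
        w = int(i * hop * factor)  # write position (output time)
        out[w:w + grain] += x[r:r + grain] * win
    return out
```

Each grain lands at the right output time, but nothing guarantees that overlapping grains agree in phase, which is exactly the problem described next.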
Imagine a pure sine wave. Now take 2 short sections of the wave from 2 random times, overlap them, and add them together. What happens? Well, it depends on the phase of each section. If the sections are out of phase, they cancel on the overlap; if in phase, they constructively interfere.
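That interference is easy to reproduce with a pure tone (a toy demonstration, with made-up numbers, not code from any real stretcher): shift the second section by a whole period and the grains reinforce; shift it by half a period and they cancel.

```python
import numpy as np

sr, f = 1000, 100                  # toy sample rate and tone frequency
period = sr // f                   # 10 samples per cycle
t = np.arange(200)
x = np.sin(2 * np.pi * f * t / sr)

grain = x[:50]
in_phase  = grain + x[period:period + 50]            # whole-period shift
out_phase = grain + x[period // 2:period // 2 + 50]  # half-period shift

# in_phase: sections reinforce, amplitude roughly doubles.
# out_phase: sections cancel to (numerically) zero.
```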
The phase vocoder is all about overlapping and adding sections together so that the phases of all the different sine waves in the sections line up. Thus, in any phase vocoder algorithm, you will see code that searches for peaks in the spectrum (see _time_stretch code). Each peak is an assumed sine wave, and corresponding peaks in adjacent frames should have their phases match.
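A minimal version of that peak search (a sketch of the idea, not lib-time-and-pitch's actual `_time_stretch` code) just looks for local maxima in the magnitude spectrum:

```python
import numpy as np

def spectral_peaks(frame, threshold=1e-3):
    """Return FFT bin indices that are local maxima of the magnitude
    spectrum. Each peak is treated as one assumed sine wave whose
    phase should stay consistent across frames; real phase vocoders
    also lock the phases of the bins surrounding each peak."""
    mag = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    return [k for k in range(1, len(mag) - 1)
            if mag[k] > mag[k - 1] and mag[k] > mag[k + 1]
            and mag[k] > threshold * mag.max()]
```

Feed it a frame containing two sines and it finds a peak near each one; the stretcher then adjusts the phase at those bins so adjacent frames line up.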
The two I know of are Paulstretch (which is really only for super-duper-long stretching) and Lent 1989 ("An Efficient Method for Pitch Shifting Digital Sampled Sounds"), which can be thought of as a form of granular synthesis but isn't really what most people think of when they hear that term.
> Time stretching (2 of 6): make audio track stretching effective
You link to the official Audacity repository and chose to (incorrectly) describe it as ”a Microsoft repository”.
Why be dishonest?
Maybe I should simply have said "git", though - didn't think of that until just now, at the end of writing this reply. No dishonesty intended; quite the opposite, in fact.
A bit of an open ended question, but is there anything more I could do to process the audiobook to make it sound even better at 2x?
Sounds like you could download Audacity 3.4 and make a 2x version of your audiobook files using their new time stretching algorithm.
Staffpad is a recent acquisition, which is why they are now able to share technology like this: https://mu.se/
Depending on your needs, you'd want to favor one over the other. Granular stretches are far less CPU-intensive and have significantly lower latency than an FFT-based approach. The granular algorithm will likely have better transient fidelity at small time-stretch ratios (between 0.5x and 2x), whereas FFTs tend to smear the transient information.
Where FFT-based approaches really excel is in maintaining or transforming formants and harmonic information, especially at extreme pitch or time settings. Granular algorithms can only stretch so far before you hear the artifacts; FFTs are far more graceful for dramatic transforms.
I wonder how it compares to Ableton Live; warping was always a big part of Ableton.
Eventually, I held my nose and ran Audacity in a VM, and didn't get a single crash.
Only binaries from Audacity themselves had the telemetry, which (as usual) is why you should never use upstream binaries.
That’s kinda the blessing and the curse of FOSS. You absolutely can fork the repo, remove the telemetry, and republish it as a new app.
But fragmentation is confusing, requires a lot of maintenance, and really I’m not sure it was worth it. Those who are particularly conscious about the telemetry can block it with a single line in /etc/hosts.
- Overall great.
- The tempo stretching example in the video was too subtle for me. I listened a few times and had trouble telling the difference.
- The documentation at https://manual.audacityteam.org/index.html is still for 3.3 which is a bit frustrating when trying out new features. Also the link labeled Manual that is displayed in the splash screen 404s for me.
- It took a bit too long to scan my computer for plugins and at the end I was told some plugins were deemed incompatible but not why.
Suggestions on next steps:
- I want to download songs and map the tempo to the song. That way I can easily loop over a few bars when practicing an instrument.
- Today I use Ableton for this which can automatically detect the tempo of a clip, and align bar and beat markers to the song, without stretching the audio. It also does a decent job of following tempo variations within a clip. This all started working well in version 11.3.2.
I tried to use Audacity for this and these were my impressions:
- Opus support makes it easy to work with material from YouTube.
- Adding clips to tracks obscures beat and bar markings making them difficult to align with transients.
- Having to generate a metronome track is a bit clunky.
- Stem separation would be a nice addition so that I could easily mute the instrument I'm playing.
That's the point of it: being able to make 124 bpm samples cooperate with 110 bpm samples without anyone ever noticing that it happened.
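The required stretch is just the ratio of the two tempos; slowing a 124 bpm sample to 110 bpm means making it about 12.7% longer:

```python
# Duration scales inversely with tempo: new_len / old_len = old_bpm / new_bpm.
old_bpm, new_bpm = 124, 110
factor = old_bpm / new_bpm   # stretch factor, ~1.1273
```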
> - The documentation at https://manual.audacityteam.org/index.html is still for 3.3 which is a bit frustrating when trying out new features. Also the link labeled Manual that is displayed in the splash screen 404s for me.
The manual update job always takes forever to complete, but support.audacityteam.org is already updated.
> - It took a bit too long to scan my computer for plugins and at the end I was told some plugins were deemed incompatible but not why.
It tries to load the relevant VSTs in a child process and if the VST crashes the child process it gets flagged as incompatible. Audio plugins are awful and nobody ever follows the spec.
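That crash-isolation trick generalizes: run the risky load in a child process and treat any abnormal exit as "incompatible". A minimal sketch of the idea in Python (not Audacity's actual scanner, which does this in C++):

```python
import subprocess
import sys

def probe_in_child(snippet, timeout=10):
    """Run `snippet` (Python source standing in for "load this plugin")
    in a child process. A crash or hang kills only the child; any
    abnormal exit flags the target as incompatible."""
    try:
        result = subprocess.run([sys.executable, "-c", snippet],
                                timeout=timeout)
    except subprocess.TimeoutExpired:
        return False                  # hung: flag as incompatible
    return result.returncode == 0     # nonzero exit = crashed probe
```

A well-behaved probe passes; one that aborts mid-load is flagged without taking the scanner itself down.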
All of the other things you mentioned are in various stages of being planned.
That's what's always kept me from using Audacity in the past. I like the interface and operations and everything, but cleaning up audio (removing room tone mostly) has always been the first step in my workflow, and its built-in noise reduction has just been unusably terrible compared to basic commercial tools.
Or is there a common plugin people use with it that I've never known about?
For example, that also makes them vulnerable to "enshittification".
Even to the suitiest of corporate suits it's clear that the enshittification funnel (first it's awesome for users, then for partners like publishers and advertisers at the cost of users, then it's awesome for making money at the cost of everyone else) simply doesn't work with an open source program.
In the 1990s, in a long workroom of Sun workstations, we rigged an rlogin + sox script to play successive parts of some spooky music as a co-worker walked past each one late one night.
On the other hand, Inkscape is made with gtkmm (GTK), which also runs cross-platform.
Every now and again there'll be a hard-to-otherwise-source episode of something that turns up in two poor versions: one with good video but damaged or lower-quality sound, another with good sound but bad video .. and they each have differing frame rates and edit cuts.
To make a better version involves a bit of time stretching on the audio between marks.
I still have an eye out for the best open-source tool for merging and aligning video + audio + subtitle tracks .. the smoothly integrated intersection of MKVToolNix + SubtitleEdit + Audacity.
I loved Apple's Music Memos (it still works on some old iPhones): I could strum my guitar and sing my songs into it and, with one click, automagically add AI drums and bass (the tempo could be changed/edited). They discontinued the app a few years ago, unfortunately.