The single easiest and most effective thing Apple could do to improve its SWE org is to invest in Radar.
Radar's importance within Apple cannot be overstated. It subsumes what would be multiple tools in other orgs. As an Apple SWE you spend a massive amount of time in it. And yet Apple treats Radar as a cost center, developed by an outsourced offshore team. It's slow to search, supports only plain text, is hard to script, and is missing obvious features, e.g. automatic duplicate finding.
Hire five good SWEs, give them a mandate to serve the needs of the org, and you will massively increase the effectiveness of every other engineer.
(My experience is now rather stale and I hope they've already formed such a team!)
I thought there were some engineers who worked on Radar, but then I heard that they had moved development offshore. Is this a new thing?
> is missing obvious features, e.g. automatic duplicate finding
Last I heard, I think someone was working on this in their free time. I don't think anyone has tried to add this as an official feature, though.
To me, it feels like Apple hasn't resourced core pieces of infrastructure and engineering teams in line with upper management's plans for growth. While many teams are relatively sequestered, once you start talking to folks elsewhere in the company it becomes clear that many teams are struggling to stay above water. More still, everyone shrugs about it because it's not clear exactly what is wrong. The best description I've heard is that in many cases engineers are willing to offer hacks as a solution to meet management's demands, and management is either willing to accept those hacks or doesn't know better. That is far from a full picture, and it's an example drawn from a small slice of an enormous company. But it seems telling to me. I find it completely reasonable to imagine that most teams in a place to deal with customer-facing bugs don't have adequate time to do so. Not to their satisfaction, never mind customers'.
At the same time, I think it can be hard to appreciate the ways in which Apple is ahead of the curve as far as the categories of software projects it tackles. So I don't mean to imply that anyone is really to blame per se. And it's also a shame because Radar was the greatest bug tracker I have ever used. It is unclear to me if Apple doesn't prioritize menial things like handling external bug reports, or if problems like that are not visible to / perceived by those in the company who could do something about it.
Positive: Everything is searchable. It is organizational history going back decades. The company is built around this tool. Negative: The importance of such a tool goes unrecognized compared to the criticality of the system to Apple as a whole. So when it's slow or breaks, everyone is having a bad day.
Nail on the head.
We originally designed Radar so that bugs would be verified as closed by the person with most interest in seeing this happen: the tester assigned to that part of that project. Then management swooped in with an edict that bugs must be verified as closed by whoever originally reported them. This is a stupid idea, because it creates the perverse incentive that no one should report a problem if they are outside the team (because then you are committing to verify the fix, which just means more work for you that has nothing to do with any of your main responsibilities).
When I pointed out that the system would now discourage people on different teams from helping each other, the sponsoring director said "that's what pink slips are for." Direct quote. Soon after that I resigned from the design team.
Without reasonably skilled and principled leadership, you just don't get quality software. And "quality is everyone's job" is just an empty and childish slogan. Excellence is not transmitted through slogans and wishful thinking. You have to assign responsibility, provide resources and time (which means lowering velocity of new development), and follow-up.
The fundamental reason why it doesn't happen is the technology market is not efficient. Quality is, in fact, not as important as career testers wish it were. You can get away with doing terrible work and not lose your job. The fact that Apple pays no significant penalties for having buggy products insulates it from our slings and arrows.
Apple software quality is in serious danger precisely because of this type of community and infrastructure rot. They are not encouraging developers to help them, and a not-surprising number of serious issues have shown up in released products in recent years.
I am one of those people who gave up on BugReporter. Lots has been written on how tedious it can be, but even its “rewritten” version from a few years ago is frustrating (basically you could be reporting a damned spelling error and they would still want a “sysdiagnose”). Bugs stay open for years and still can somehow be Duplicate. Meanwhile, very obvious broken features make it into new versions.
I put a lot of effort into my app. I spend about 50% of my time on triaging / fixing / preventing bugs. We check every crash report that is logged, and we attempt to reproduce every issue reported by customers as quickly as possible.
By now, a significant fraction of bugs are bugs in Apple's frameworks. We try to report them to Apple, but they are ignored, or simply closed because they are related to deprecated APIs.
Of course, customers don't complain that Apple frameworks are buggy -- they complain that our app crashes! So Apple has no incentive to fix it. (If the bug affects Apple apps, it has a higher chance of being fixed. If a framework isn't used a lot by Apple, it will be full of bugs)
So we have dozens of workarounds in our code, and have stopped using some frameworks altogether (if possible).
But I can confirm that Corbin was quick to review bugs -- NSTableView related bugs were the only ones where I got fast feedback.
I received an email yesterday that a bug I filed in 2013 in the Android Google maps SDK has been assigned to someone. Might as well just not fix it now.
Additionally, something will work with the stock SDK, but will break on Samsung devices (e.g. camera SDK).
When I worked at Apple I practiced what I am now preaching. I would screen all my bugs within a day or so. I would verify bugs within a week or so. I would poke around at code and attempt to make theoretical fixes for bugs I couldn’t reproduce. I felt like it was part of my job, and I just did these things as a daily task.
From Corbin's about page https://www.corbinstreehouse.com/blog/about/ :
For 13 years I worked at Apple on Cocoa, mostly doing UI implementation in AppKit, such as NSTableView, NSWindow, NSVisualEffectView. I also did quite a bit of work in UIKit – mainly on the early iPhone releases, but also helping with the first few versions of the public SDK after that. I wrote and worked on the base classes, such as UILabel and UITableView. You may recall seeing me at WWDC in the labs, or up on stage giving a talk. Before Apple, I worked at Borland, primarily on Delphi.
With Free software, I just do the work myself and include a patch. But beyond the tongue-in-cheek possibility of scripting things to submit every bug thousands of times to make it ‘popular’, it sounds like there’s no workaround to make an Apple bug report worth the time taken to submit it.
It was frustratingly easy to reproduce. Clearly the process is broken internally (as the source article says) as well as externally. I guess when you sit on the highest volume pipe of money coming out of wallets for consumer tech, there is no reason to change?
This reminds me of the concept of "f* you" money that lets employees quit their jobs. It just dawned on me that some companies have that as well, w.r.t. their customers.
Ultimately, we just gave up and didn't release our app on their platform at all.
On a positive note, I have had some great experiences interacting with the maintainers of smaller projects, and even some larger ones like Emacs. But massive companies and projects? Not worth the effort.
(Context is Apple asking bug originators to verify whether or not the bug still occurs in the latest OS.)
This seems contradictory to the entire article. The point of logging a bug at all is to get it fixed. The fix would be coming in a future release. If you don’t want to take the latest updates that’s fine, but how are you going to get bug fixes?
So you say no, it wasn’t (it never is for me) and the bug goes back into limbo.
I've since disabled the input switching key combination because it conflicts with Emacs' set-mark-command.
Yeah, except Bug Reporter nags you multiple times for a sysdiagnose if you don't attach one (they have been slowly adding these, I've noticed), and if you ignore these, screening is going to send it back to you a week later, without reading the bug, asking you to add one. Sometimes if you do have the sysdiagnose they'll ask for something equally stupid (system profile, usually).
It may not be needed then, but it still saves time.
Even if it weren’t, at least some amount of thinking should be involved in requesting it, and e.g. don’t do it on feature requests or API defect problems. But they do, oh they do.
It should also be noted that it doesn’t save time: it off-loads some time wastage to an external developer who is filing the bug. And who has to pointlessly collect it (it takes several minutes and knocks your system off), then upload a 200-400MB file to a notoriously crappy form that will time out or error out, or sometimes say the upload was successful, only to get a bitchy QA reply a week later asking to upload a sysdiagnose.
A few years back, I had a sysdiagnose so huge that Radar just wouldn’t take it. I spent an hour trying to upload the damn thing and eventually uploaded it elsewhere and posted a URL.
To this day, that sysdiagnose file was not downloaded a single time.
a) Bugs get broken down into manageable tasks of roughly the same size (eg. about 1 day to fix).
b) You measure the average velocity that your team fixes these triaged bugs.
c) When the number of bug reports exceeds what the team's velocity can absorb, you know the excess will never be fixed.
d) Be brutal, and flag that many of the lowest-priority bugs as WONTFIX.
Now you have a manageable queue of bugs, say as many as your team can fix in two months. And the submitters know the score and know waiting isn't going to help. Lots of productivity and efficiency bonuses all around.
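For what it's worth, steps a) through d) are easy to sketch in code. This is only a rough illustration, not how any real tracker does it: the `Bug` class, the `triage` function, and the parameters `fixes_per_day` and `horizon_days` are all made-up names, and it assumes bugs have already been broken into roughly equal-sized tasks and that a lower `priority` number means more important.

```python
from dataclasses import dataclass


@dataclass
class Bug:
    # Hypothetical record; a real tracker would carry far more fields.
    bug_id: int
    priority: int          # assumption: 1 = most important, larger = less important
    status: str = "OPEN"


def triage(bugs, fixes_per_day, horizon_days):
    """Keep as many open bugs as the team can fix within the horizon.

    Everything beyond that capacity, lowest priority first, is marked WONTFIX.
    Returns (kept, dropped).
    """
    # Capacity = measured velocity times the length of the queue you want.
    capacity = int(fixes_per_day * horizon_days)
    # Most important first; bug_id breaks ties so the result is deterministic.
    open_bugs = sorted(
        (b for b in bugs if b.status == "OPEN"),
        key=lambda b: (b.priority, b.bug_id),
    )
    kept, dropped = open_bugs[:capacity], open_bugs[capacity:]
    for bug in dropped:
        bug.status = "WONTFIX"
    return kept, dropped


if __name__ == "__main__":
    # 100 open bugs, a team that fixes 1.5 a day, and a 40-working-day queue.
    backlog = [Bug(bug_id=i, priority=i % 5 + 1) for i in range(100)]
    kept, dropped = triage(backlog, fixes_per_day=1.5, horizon_days=40)
    print(len(kept), "kept,", len(dropped), "marked WONTFIX")
```

The only real design decision is the sort key. Anything smarter (age, number of duplicates, affected customers) would slot in there without changing the rest.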
But nobody does this or any of the similar variations, because people would rather leave bugs that everyone knows will never be fixed in the system 'because they might be', or because they don't want to listen to the butt-hurt submitters complain, since submitters place a higher priority on fixing the bug than the people they are trying to convince to fix it.
And yes, this approach might seriously piss customers off enough that it is a PR nightmare, so I'd be interested in anecdotes of this sort of approach being applied and how it worked out (even just internally).
He makes a reasonable case that not keeping the bug queue manageable leads to increased cost. I've definitely been in bug triage meetings where we spent more time discussing a bug than the code change would take.
IMO, this sentence is the key. For Apple to improve its software quality, it needs to stop trying to do many other things. We don't need new versions of Mac OS X and iOS every year anymore, each with incremental feature additions and new bugs.
So taking inspiration from the React scheduling work, would it make sense to give bugs a timeout based on priority to keep them from sitting around forever? Say you have a low priority bug, it sits around forever while more important stuff gets fixed, but if it's been sitting idle for 6 months it gets an automatic priority bump?
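A minimal sketch of what that timeout could look like, under some assumptions of my own: the `Bug` class, `age_priorities`, and the `idle_limit` parameter are hypothetical names, priority 1 is the highest, and a bump resets the idle clock so a bug climbs one level per idle period instead of jumping straight to the top. This is not how React's scheduler or any real tracker implements it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Bug:
    # Hypothetical record for illustration only.
    bug_id: int
    priority: int            # assumption: 1 = highest priority
    last_touched: datetime   # last time anyone worked on or re-prioritized it


def age_priorities(bugs, now, idle_limit=timedelta(days=180), top_priority=1):
    """Bump the priority of every bug that has sat idle for idle_limit or longer.

    Returns the list of bugs that were bumped.
    """
    bumped = []
    for bug in bugs:
        idle_for = now - bug.last_touched
        if bug.priority > top_priority and idle_for >= idle_limit:
            bug.priority -= 1        # one step closer to the top
            bug.last_touched = now   # restart the idle clock
            bumped.append(bug)
    return bumped


if __name__ == "__main__":
    now = datetime(2024, 1, 1)
    bugs = [
        Bug(1, priority=4, last_touched=now - timedelta(days=200)),  # stale
        Bug(2, priority=4, last_touched=now - timedelta(days=10)),   # recent
    ]
    for bug in age_priorities(bugs, now):
        print("bumped bug", bug.bug_id, "to priority", bug.priority)
```

You would run something like this from a nightly job. The obvious failure mode is that with enough stale bugs everything eventually ends up at top priority, which is the same overloaded queue with extra steps.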
How to reproduce the bug on iPhone/iPad:
- start playing some music
- open control center and swipe down on the volume to mute it
- turn up the volume one click on the physical buttons
- swipe down on the volume in control center
- repeat turning up the volume with the buttons and muting in the control center until the control center volume stops affecting the actual volume
I’ve reported that each iOS beta gnaws away system storage since they started the program.
My phone is currently giving up 176GB to system storage and seems to have no end in sight.
So many enthusiast users I know are so frustrated they want to give up.
This sounds like a great way to accrete unnecessary code, some of which will create even more bugs.
I worked for 5 years on a team who had just started using code reviews. What we found was that code reviews didn't often find bugs in the changes themselves, but they were very effective at preventing bad decisions that would cause bugs in the future.
There are lots of ways to make a change that gets the job done, and might even fix a bug, but can mean twice as much work to deal with in the future. These range from intentional but shortsighted hacks to simple mistakes by experienced engineers working in a large project.
You also mention style nits. I admit even I get annoyed by these. But bad style can make it harder for new hires to learn the codebase, which slows the team down. In rare cases, it can even result in bugs of its own, like the infamous Apple "goto fail" bug.
Every team is different, but I can't imagine working on one without code reviews now.
Then I worked on other teams that just gave up. Sometimes for good reason: fixing long-standing bugs could mean a major re-architecture of existing code that would likely induce more bugs than it would fix.
But I also saw teams give up due to sheer overwork and lack of time.
Management matters.
Making users their testers AND not giving them good tools to file bugs takes a special kind of hubris.