Ignoring the potential offensiveness and YOLOing through it is like swinging the bat wildly at every pitch.
What I'm suggesting is that Copilot should keep working when these words are present, but refuse to attach any significance to the specific word. This could probably be implemented by replacing the problematic words with randomly generated strings before processing the text, then swapping those strings back afterwards.
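A minimal sketch of that swap-in/swap-out idea. The watchword list and the placeholder scheme here are made up for illustration; the real filter list isn't public:

```python
import re
import secrets

WATCHWORDS = ["gender"]  # illustrative only; the actual list is unknown

def mask(text):
    """Replace each watchword with a random placeholder; return masked text and the mapping back."""
    mapping = {}
    for word in WATCHWORDS:
        def substitute(match, word=word):
            token = "x" + secrets.token_hex(8)  # random, hex-only, so it can't collide with watchwords
            mapping[token] = match.group(0)
            return token
        text = re.sub(re.escape(word), substitute, text, flags=re.IGNORECASE)
    return text, mapping

def unmask(text, mapping):
    """Swap the placeholders back after the model has produced its suggestion."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text

masked, mapping = mask("def get_gender(user):")
assert "gender" not in masked.lower()
assert unmask(masked, mapping) == "def get_gender(user):"
```

Of course, the model would then be completing against an opaque placeholder rather than the word itself, so any suggestion that depends on the word's meaning is lost by design.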
(It could be reasonable for Copilot to refuse to make suggestions at all if the output would contain truly offensive language, like unambiguous racial slurs or sexual terms. But "gender" clearly isn't that.)
Yeah -- it's unfortunate, but it's easy. It might be OK to tolerate if the blocked words were clearly outside the range of tokens that show up in suggestions, but the filtering doesn't operate on tokenized text.
> What I'm suggesting is that Copilot should keep working when these words are present, but refuse to attach any significance to the specific word. This could probably be implemented by replacing the problematic words with randomly generated strings before processing the text, then swapping those strings back afterwards.
The problem is that the trained model is much smarter than the keyword-based filtering. If you just white out the watchwords, it still has a pretty good chance of gleaning the context anyway and commenting on gender in ways Microsoft would rather not deal with.
> (It could be reasonable for Copilot to refuse to make suggestions at all if the output would contain truly offensive language, like unambiguous racial slurs or sexual terms. But "gender" clearly isn't that.)
Right now the list covers quite a large variety of things, mostly racial slurs and sexual terms. But letting an AI ramble on after "blacks" is kind of dangerous, as are various gender-related terms that do have innocuous interpretations. It's easy to put words on the filter list, and much harder to apply nuance to topics that even humans struggle to be nuanced about.