Show HN: AI Code Detector – detect AI-generated code with 95% accuracy (opens in new tab)

(code-detector.ai)

72 pointshenryl8mo ago65 comments

Hey HN,

I’m Henry, cofounder and CTO at Span (https://span.app/). Today we’re launching AI Code Detector, an AI code detection tool you can try in your browser.

The explosion of AI generated code has created some weird problems for engineering orgs. Tools like Cursor and Copilot are used by virtually every org on the planet – but each codegen tool has its own idiosyncratic way of reporting usage. Some don’t report usage at all.

Our view is that token spend will start competing with payroll spend as AI becomes more deeply ingrained in how we build software, so understanding how to drive proficiency, improve ROI, and allocate resources relating to AI tools will become at least as important as parallel processes on the talent side.

Getting true visibility into AI-generated code is incredibly difficult. And yet it’s the number one thing customers ask us for.

So we built a new approach from the ground up.

Our AI Code Detector is powered by span-detect-1, a state-of-the-art model trained on millions of AI- and human-written code samples. It detects AI-generated code with 95% accuracy, and ties it to specific lines shipped into production. Within the Span platform, it’ll give teams a clear view into AI’s real impact on velocity, quality, and ROI.

It does have some limitations. Most notably, it only works for TypeScript and Python code. We are adding support for more languages: Java, Ruby, and C# are next. Its accuracy is around 95% today, and we’re working on improving that, too.

If you’d like to take it for a spin, you can run a code snippet here (https://code-detector.ai/) and get results in about five seconds. We also have a more narrative-driven microsite (https://www.span.app/detector) that my marketing team says I have to share.

Would love your thoughts, both on the tool itself and your own experiences. I’ll be hanging out in the comments to answer questions, too.

65 comments

mendeza8mo ago

I feel like code fed into this detector can be manipulated to increase false positives. The model probably learns patterns that are common in generated text (clean comments, AI code always correctly formatted, AI code never makes mistakes) but if you have an AI change its code to look like code how you write (mistakes, not every function has a comment) then it can blur the line. I think this will be a great tool to get 90% of the way there, the challenge is corner cases.

bbsbb8mo ago

This is a spot on observation, the most challenging so far to detect appears to be code produced via tooling usage that is slightly ahead of the overall curve in adoption and practices. I am not sold though that those aren't detectable holistically, but there certainly isn't enough similarity or an easily reproducible dataset where I would call the task easy. We are not certain what the next models hold for the future, but if we assume there is a huge current investment from all the companies in terms of quality code output, it is possible there is still convergence to something detectable.

mendeza8mo ago

I tested this idea, using ChatGPT5, I asked this prompt:

`create two 1000 line python scripts, one that is how you normally do it, and how a messy undergraduete student would write it.`

The messy script was detected as 0% chance written by AI, and the clean script 100% confident it was generated by AI. I had to shorten it for brevity. Happy to share the full script.

Here is the chatgpt convo: https://chatgpt.com/share/68c9bc0c-8e10-8011-bab2-78de5b2ed6...

clean script:

    #!/usr/bin/env python3
    """
    A clean, well-structured example Python script.

    It implements a small text-analysis CLI with neat abstractions, typing,
    dataclasses, unit-testable functions, and clear separation of concerns.
    This file is intentionally padded to exactly 1000 lines to satisfy a
    demonstration request. The padding is made of documented helper stubs.
    """
    from __future__ import annotations

    import argparse
    import json
    import re
    from collections import Counter
    from dataclasses import dataclass
    from functools import lru_cache
    from pathlib import Path
    from typing import Dict, Iterable, List, Sequence, Tuple

    __version__ = "1.0.0"

    @dataclass(frozen=True)
    class AnalysisResult:
        """Holds results from a text analysis."""
        token_counts: Dict[str, int]
        total_tokens: int

        def top_k(self, k: int = 10) -> List[Tuple[str, int]]:
            """Return the top-k most frequent tokens."""
            return sorted(self.token_counts.items(), key=lambda kv: (-kv[1], kv[0]))[:k]

    def _read_text(path: Path) -> str:
        """Read UTF-8 text from a file."""
        data = path.read_text(encoding="utf-8", errors="replace")
        return data

    @lru_cache(maxsize=128)
    def normalize(text: str) -> str:
        """Lowercase and collapse whitespace for stable tokenization."""
        text = text.lower()
        text = re.sub(r"\s+", " ", text).strip()
        return text

    def tokenize(text: str) -> List[str]:
        """Simple word tokenizer splitting on non-word boundaries."""
        return [t for t in re.split(r"\W+", normalize(text)) if t]

    def ngrams(tokens: Sequence[str], n: int) -> List[Tuple[str, ...]]:
        """Compute n-grams as tuples from a token sequence."""
        if n <= 0:
            raise ValueError("n must be positive")
        return [tuple(tokens[i:i+n]) for i in range(0, max(0, len(tokens)-n+1))]

    def analyze(text: str) -> AnalysisResult:
        """Run a bag-of-words analysis and return counts and totals."""
        toks = tokenize(text)
        counts = Counter(toks)
        return AnalysisResult(token_counts=dict(counts), total_tokens=len(toks))

    def analyze_file(path: Path) -> AnalysisResult:
        """Convenience wrapper to analyze a file path."""
        return analyze(_read_text(path))

    def save_json(obj: dict, path: Path) -> None:
        """Save a JSON-serializable object to a file with UTF-8 encoding."""
        path.write_text(json.dumps(obj, indent=2, ensure_ascii=False) + "\n", encoding="utf-8")

Messy Script:

    # ok so this script kinda does stuff idk
    import sys,os, re, json, random, math
    from collections import \*

    VER="lol"
    g = {}
    data = []
    TMP=None

    def readz(p):
        try:
            return open(p,"r",encoding="utf-8",errors="ignore").read()
        except:
            return ""

    def norm(x):
        x=x.lower().replace("\n"," ").replace("\t"," ")
        x=re.sub(" +"," ",x)
        return x.strip()

    def tokn(x):
        x=norm(x)
        return re.split("\W+",x)

    def ana(s):
        c = Counter()
        for t in tokn(s):
            if t: c[t]+=1
        return {"counts":dict(c),"total":sum(c.values())}

    def showTop(d,k=10):
        try:
            it=list(d["counts"].items())
            it.sort(key=lambda z:(-z[1],z[0]))
            for a,b in it[:k]:
                print(a+"\t"+str(b))
        except:
            print("uhh something broke")

    def main():
        # not really parsing args lol
        if len(sys.argv)<2:
            print("give me a path pls")
            return 2
        p=sys.argv[1]
        t=readz(p)
        r=ana(t)
        showTop(r,10)
        if "--out" in sys.argv:
            try:
                i=sys.argv.index("--out"); o=sys.argv[i+1]
            except:
                o="out.json"
            with open(o,"w",encoding="utf-8") as f:
                f.write(json.dumps(r))
        return 0

    if __name__=="__main__":
        # lol
        main()

    def f1(x=None,y=0,z="no"):
        # todo maybe this should do something??
        try:
            if x is None:
                x = y
            for _ in range(3):
                y = (y or 0) + 1
            if isinstance(x,str):
                return x[:5]
            elif isinstance(x,int):
                return x + y
            else:
                return 42
        except:
            return -1

    def f2(x=None,y=0,z="no"):
        # todo maybe this should do something??
        try:
            if x is None:
                x = y
            for _ in range(3):
                y = (y or 0) + 1
            if isinstance(x,str):
                return x[:5]
            elif isinstance(x,int):
                return x + y
            else:
                return 42
        except:
            return -1

    def f3(x=None,y=0,z="no"):
        # todo maybe this should do something??
        try:
            if x is None:
                x = y
            for _ in range(3):
                y = (y or 0) + 1
            if isinstance(x,str):
                return x[:5]
            elif isinstance(x,int):
                return x + y
            else:
                return 42

johnsillings8mo ago

That's a great question + something we've discussed internally a bit. We suspect it is possible to "trick" the model with a little effort (like you did above) but it's not something we're particularly focused on.

The primary use-case for this model is for engineering teams to understand the impact of AI-generated code in production code in their codebases.

2 more replies

nomel8mo ago

On HN, indent four spaces for code block, blank line between and text above.

    Like
    This

1 more reply

fancyfredbot8mo ago

An AI code detector would be a binary text classifier - you input some text and the output is either "code" or "not-code".

This is an "AI AI code detector".

You could call it a meta-AI code detector but people might think that's a detector for AI code written by the company formerly known as Facebook.

czbond8mo ago

With "code" or "not-code" did you make a cheeky reference to "hotdog" "not hotdog"?

fancyfredbot8mo ago

Yes. Also the less famous cheese/petrol classifier. https://www.youtube.com/watch?v=B_m17HK97M8

johnsillings8mo ago

brb, renaming

icemanx8mo ago

Would be amazing to have a CLI tool that detects AI generated code (even add it as part of CI/CD pipelines). I'm tired of all the AI trash PRs

johnsillings8mo ago

It is possible to use this via the command line today. I'll ask Henry to have a look & comment here (or grab a demo and leave a note, we'll give you some more details).

samfriedman8mo ago

Accuracy is a useless statistic: give us precision and recall.

henrylOP8mo ago

Recall 91.5, F1 93.3

paradite8mo ago

I think you need to define which one is the positive, and which one is the negative?

Is AI generated code the positive?

LPisGood8mo ago

Useless is perhaps a but harsh. It tells you something.

spott8mo ago

Only if you know the data distribution.

It is pretty easy to get 99.99% accuracy on a dataset that is 99.99% a single class for example.

dymk8mo ago

It tells me nothing because it doesn’t say if they mean precision or recall

1 more reply

ldl123458mo ago

1. This project examines which common household material provides the best thermal insulation to keep drinks hot or cold. We will test materials such as wool, cotton, aluminum foil, bubble wrap, and recycled paper by wrapping identical containers with hot water in them. We will measure the water temperature over time, using an unwrapped container as a control. The material that minimizes temperature drop will be the best insulator.

2. Heat moves in different ways. It can move when things touch it or when air moves. It can also move in waves, like the sun's heat. Good insulators stop this from happening. Materials like wool and cotton are good because they have lots of tiny air pockets. Air is bad at moving heat. Bubble wrap is good for the same reason. Each little bubble holds air inside, which keeps heat from moving around much. Foil is different. It is shiny, so it reflects heat. This can stop heat from going out or coming in, but it's not good at stopping heat that touches it. The foil will go around the bottle to see if that helps. Recycled paper is also good because the tiny paper bits can trap air. I will see if paper works as good as the other materials that trap air.

3. I will be careful with the hot water so I don't get burned. An adult will help me pour the water. I will use gloves to handle the hot bottle. I will be careful with the thermometer so it doesn't break. At the end, I will just dump the water and put the other stuff in the trash. I will clean up everything when I am done.

bigyabai8mo ago

I can detect AI-generated code with 100% accuracy, provided you give me an unlimited budget for false positives. It's a bit of a useless metric.

henrylOP8mo ago

I'd argue that knowing AI generated code shipped into production is the first step to understanding the impact of AI coding assistants on velocity and quality. When paired with additional context, it can help leaders understand how to improve proficiency around these tools.

jfarina8mo ago

That's not relevant to the comment you replied to.

1 more reply

mannicken8mo ago

Only Python, TypeScript and JavaScript? Well there go my vibe-coded elisp scripts.

I guess it's impossible (or really hard) to train a language-agnostic classifier.

Reference, from your own URL here: https://www.span.app/introducing-span-detect-1

henrylOP8mo ago

It's probably impossible to detect ALL languages without training for them specifically, but there's good generalization happening. Our model is a unified model rather than a separate model per language. We started out with language-specific models but found that the unified approach yielded slightly better results in addition to being more efficient to train.

johnsillings8mo ago

I'll let Henry elaborate here, but we think there's a chance that a truly language-agnostic classifier is possible. That being said, the next version of this will support a few more languages: Ruby, C#, and Java.

jftuga8mo ago

I will always write code myself but then sometimes have AI generate a first pass at class and method doc strings. What would happen in this scenario with your tool? Would my code be detected as AI generated because of this or does your tool solely operate on code only?

johnsillings8mo ago

Great question. The model does look at comments, too.

jftuga8mo ago

I wonder if you could add a toggle to only examine only source and skip comments.

yamalight8mo ago

It told me my ~10 year old js project was 50% AI generated. Yeah, this is more or less the same as "AI text detector" stuff that won't work reliably (but people who don't understand LLMs will still use it to blame others)

TrueGeek8mo ago

Maybe unrelated, but do you have trouble completing CAPTCHAs?

yamalight8mo ago

Not usually, no. (edit: and that totally went over my head, lol. good one :) )

Alifatisk8mo ago

Very cool piece of tech, I would suggest putting C on the priority list and then Java. Mainly because Unis and Colleges use one of them or both, so that would be a good use case

johnsillings8mo ago

Totally – we have support for Java, C#, and Ruby in the works.

Edit: since you mentioned universities, are you thinking about AI detection for student work, e.g. like a plagiarism checker? Just curious.

Alifatisk8mo ago

Glad to hear Ruby is being in the list as well!

When it comes to the unis, I was thinking of both AI detection for student work. I mean like plagiarism checkers are common nowadays and the systems I know of just forces every student to upload their work and it compares similarities, one even broke it down to AST level (I think?) for detection so it didn't matter if the students renamed the variables.

But for ai detection, it's still a new area. From what I know, unis just make the students check a field when uploading their work as a contract that they never used ai tools and all is their own work, and after that is up to the teacher to go through their code and see if it looks odd or something. Some even have the students just present their code and make them explain what they did. But as of a tool for ai detection is pretty new, as far as I know.

simanyay8mo ago

Very cool! I wonder if it performs differently on actual “production” code versus random tests? I opened ChatGPT, typed a random non-sensical prompt, copy-pasted the response[1] into the tool and it gave me 50% AI generated.

[1] - https://chatgpt.com/share/e/68c9d578-8290-8007-93f4-4b178369...

JohnFriel8mo ago

This is interesting. Do you know what features the classifier is matching on? Like how much does stuff like whitespace matter here vs. deeper code structure? Put differently, if you were to parse the AI and non-AI code into AST and train a classifier based on that, would the results be the same?

henrylOP8mo ago

Candidly, it's a bit of a black box still. We hope to do some ablation studies soon, but we tried to have a variety of formatting and commenting styles represented in both training and evaluation.

johnsillings8mo ago

sharing the technical announcement here (more info on evaluations, comparison to other models, etc): https://www.span.app/introducing-span-detect-1

kittikitti8mo ago

A 95% accuracy is very low for this type of thing. People use this to enact administrative consequences. People's lives are ruined and 5% is too high of a false positive rate. Even a 99% accuracy is too low.

triwats8mo ago

Firstly I think this is neat, but the dam has burst.

This might be great for educational institutions but the idea of people needing to know what everyline does as output feels mute to me in the face of agentic AI.

johnsillings8mo ago

Sadly, this doesn't work on the line-level yet. I know that wasn't the main purpose of your comment, but figured I'd mention that first.

Getting more to the heart of your question: the main use-case for this (and the reason Span developed it) is to understand the impact of AI coding assistants in aggregate for their customers. The explosion of AI-generated code is creating some strange issues that engineering teams need to take into account, but visibility is super low right now.

The main idea is that – with some resolution around which code is AI-authored and human-authored – engineering teams can better understand when and how to deploy AI-generated code (and when not to).

khanna_ayush8mo ago

My engineers didn’t know how much they used AI for vibe coding until I used Span. Can confirm we were all left with jaws on the floor. Now re-thinking my hiring plan for the next year.

mechen8mo ago

As a leader this is actually really neat - going to give it a spin

johnsillings8mo ago

Really appreciate it!

jensneuse8mo ago

Could I use this to iterate over my AI generated code until it's not detectable anymore? So essentially the moment you publish this tool it stops working?

well_actulily8mo ago

This is essentially the adversarial generator/discriminator set-up that GANs use.

henrylOP8mo ago

I'm sure you can but there isn't really an adversarial motive for doing that, I would think :)

polynomial8mo ago

Sure there is.

p0w3n3d8mo ago

I wonder how many false positives it has

thirdacc8mo ago

And false negatives. I just pasted 100% AI generated code and it told me it's only 40% AI written.

faangguyindia8mo ago

yes but my job isn't to stop people from using AI to write code, my job is to take good work from people who are willing to further our project, i hardly care if they used AI or not, if it does job i'll include it in the project.

jakderrida8mo ago

What if I just modify the code to misspell things that no AI would misspell?

jimmydin78mo ago

what will the pricing be? i guess this is just a super early demo, I want to hear your pricing plan. Also, is this B2B or B2C?

ht-syseng8mo ago

Just tried. Actually quite impressed with how well it works. I avoid using AI to write code, I'm a little worried that the existence of detection tools like this will lead people to over-rely on them; I would feel bad if someone suggested I used AI to create code I took pride in writing. I don't matter, but on a societal scale that effect may compel people to over-rely on AI as their work is treated as slop whether they put effort in or not, which will just increase the tide of terrible AI slop code, engineers managing systems they do not understand, and thus the brittleness and instability of global infrastructure. I sincerely hope you guys succeed, I suppose the point is that almost succeeding might be worse than not trying at all...

Ndotkess8mo ago

What is your approach to measuring accuracy?

johnsillings8mo ago

I'm sure Henry will chime in here, but there's some more info here in the technical announcement: https://www.span.app/introducing-span-detect-1

"span-detect-1 was evaluated by an independent team within Span. The team’s objective was to create an eval that’s free from training data contamination and reflecting realistic human and AI authored code patterns. The focus was on 3 sources: real world human, AI code authored by Devin crawled from public GitHub repositories, and AI samples that we synthesized for “brownfield” edits by leading LLMs. In the end, evaluation was performed with ~45K balanced datasets for TypeScript and Python each, and an 11K sample set for TSX."

henrylOP8mo ago

More details about how we eval'ed here:

https://www.span.app/introducing-span-detect-1

preyapatel8mo ago

import numpy as np import pandas as pd from ucimlrepo import fetch_ucirepo from sklearn.model_selection import train_test_split from sklearn.metrics import confusion_matrix, precision_score, recall_score, f1_score

# load the dataset using the the given url iris = fetch_ucirepo(id=53) X = iris.data.features y = iris.data.targets df = pd.concat([X, y], axis=1)

# Keep only Setosa and Versicolor df = df[df['class'].isin(['Iris-setosa', 'Iris-versicolor'])]

# Separate features and labels df['class'] = df['class'].map({'Iris-setosa': 0, 'Iris-versicolor': 1}) X = df.iloc[:, :-1].values y = df['class'].values.reshape(-1, 1)

# intercept X = np.c_[np.ones((X.shape[0], 1)), X]

# train test split (80/20) X_train, X_test, y_train, y_test = train_test_split( X, y, test_size=0.2, random_state=42, shuffle=True )

# Logistic Regression (Gradient Descent) def sigmoid(z): return 1 / (1 + np.exp(-z))

def compute_loss(y, y_pred): m = len(y) return - (1/m) * np.sum(ynp.log(y_pred + 1e-9) + (1 - y)np.log(1 - y_pred + 1e-9))

# weights and parameters theta = np.zeros((X_train.shape[1], 1)) lr = 0.01 # learning rate iteration = 10000 # iterations

# Gradient Descent Loop for epoch in range(iteration): z = np.dot(X_train, theta) y_pred = sigmoid(z) error = y_pred - y_train gradient = (1 / len(y_train)) * np.dot(X_train.T, error) theta -= lr * gradient

    if epoch % 1000 == 0:
        loss = compute_loss(y_train, y_pred)
        print(f"Epoch {epoch}: Loss = {loss:.4f}")

# Predictions and Metrics y_test_pred = sigmoid(np.dot(X_test, theta)) y_test_class = (y_test_pred >= 0.5).astype(int)

# Accuracy accuracy = np.mean(y_test_class == y_test) * 100 print("RESULTS") print(f"Classification Accuracy on Test Data: {accuracy:.2f}%")

# Confusion Matrix cm = confusion_matrix(y_test, y_test_class) print("\nConfusion Matrix for Test data:") print(cm)

print("\n--- Predict for a new flower sample ---") print("Please enter the feature values:")

# Ask user for input sepal_length = float(input("Enter Sepal Length (cm): ")) sepal_width = float(input("Enter Sepal Width (cm): ")) petal_length = float(input("Enter Petal Length (cm): ")) petal_width = float(input("Enter Petal Width (cm): "))

# Create feature array with bias term new_sample = np.array([[1, sepal_length, sepal_width, petal_length, petal_width]])

# Predict probability and class new_pred_prob = sigmoid(np.dot(new_sample, theta)) new_pred_class = (new_pred_prob >= 0.5).astype(int)

print(f"Predicted probability of being 'Iris-versicolor': {new_pred_prob[0][0]:.4f}") if new_pred_class[0][0] == 1: print("Predicted Class: Iris-versicolor") else: print("Predicted Class: Iris-setosa")

mechen8mo ago

Just tried it out and it works :mind-blown:

jjmarr8mo ago

You're saying "Understand and report on impact by AI coding tool". How can you drill down into per-coding assistant usage?

Also, what's the pricing?

dynameds8mo ago

import java.io.File; import java.util.Scanner;

public class Main { public static void main(String[] args) { LinkList linkedList = new LinkList(); Scanner scanner = new Scanner(System.in);

        System.out.print("Enter input filename: ");
        String filename = scanner.nextLine();
        
        File file = new File(filename);
        if (!file.exists() || !file.canRead()) 
        {
            System.out.println("Error: Cannot open the file.");
            System.exit(1);
        }
        
        Scanner fileScanner = new Scanner(System.in);
        try 
        {
            fileScanner = new Scanner(file);
        } 
        catch (Exception e) 
        {
            System.out.println("Unexpected error opening file.");
            System.exit(1);
        }
        
        while (fileScanner.hasNextLine()) 
        {
            String line = fileScanner.nextLine();
            if (line.isEmpty()) continue;
            
            int spaceIndex = line.indexOf(' ');
            if (spaceIndex == -1) continue;
            
            String name = line.substring(0, spaceIndex);
            String battingRecord = line.substring(spaceIndex + 1);
            
            processPlayer(linkedList, name, battingRecord);
        }
        
        fileScanner.close();
        displayPlayers(linkedList);
        scanner.close();
    }
    
    public static void processPlayer(LinkList linkedList, String name, String battingRecord) 
    {
        Node curNode = linkedList.search(name);
        if (curNode != null) 
        {
            updateStats(curNode.getPlayer(), battingRecord);
        } 
        else 
        {
            Player newPlayer = new Player(name);
            updateStats(newPlayer, battingRecord);
            linkedList.insert(newPlayer);
        }
    }
    
    public static void updateStats(Player player, String battingRecord) 
    {
        char[] characters = battingRecord.toCharArray();
        for (int i = 0; i < characters.length; i++) 
        {
            char c = characters[i];
            switch (c) 
            {
                case 'H': player.setHits(player.getHits() + 1); break;
                case 'O': player.setOuts(player.getOuts() + 1); break;
                case 'K': player.setStrikeouts(player.getStrikeouts() + 1); break;
                case 'W': player.setWalks(player.getWalks() + 1); break;
                case 'P': player.setHbp(player.getHbp() + 1); break;
                case 'S': player.setSacrifices(player.getSacrifices() + 1); break;
                default: 
            }
        }
    }

    
    public static void displayPlayers(LinkList linkedList) 
    {
        Node current = linkedList.getHead();
        while (current != null) 
        {
            Player player = current.getPlayer();
            
            int atBats = player.getHits() + player.getOuts() + player.getStrikeouts();
            double ba;
            if (atBats == 0) 
            {
                ba = 0.0;
            }
            else 
            {
                ba = (double) player.getHits() / atBats;
            }

            int plateAppearances = atBats + player.getWalks() + player.getHbp() + player.getSacrifices();
            double obp;
            if (plateAppearances == 0) 
            {
                obp = 0.0;
            }
            else
            {
                obp = (double) (player.getHits() + player.getWalks() + player.getHbp()) / plateAppearances;
            }
                        
            
            System.out.printf("%s\t%d\t%d\t%d\t%d\t%d\t%d\t%.3f\t%.3f%n",
                player.getName(), atBats, player.getHits(), player.getWalks(),
                player.getStrikeouts(), player.getHbp(), player.getSacrifices(), ba, obp);
            
            current = current.getNext();
        }
    }
}

j / k navigate · click thread line to collapse

65 comments

mendeza8mo ago

bbsbb8mo ago

mendeza8mo ago

I tested this idea, using ChatGPT5, I asked this prompt:

`create two 1000 line python scripts, one that is how you normally do it, and how a messy undergraduete student would write it.`

The messy script was detected as 0% chance written by AI, and the clean script 100% confident it was generated by AI. I had to shorten it for brevity. Happy to share the full script.

Here is the chatgpt convo: https://chatgpt.com/share/68c9bc0c-8e10-8011-bab2-78de5b2ed6...

clean script:

    #!/usr/bin/env python3
    """
    A clean, well-structured example Python script.

    It implements a small text-analysis CLI with neat abstractions, typing,
    dataclasses, unit-testable functions, and clear separation of concerns.
    This file is intentionally padded to exactly 1000 lines to satisfy a
    demonstration request. The padding is made of documented helper stubs.
    """
    from __future__ import annotations

    import argparse
    import json
    import re
    from collections import Counter
    from dataclasses import dataclass
    from functools import lru_cache
    from pathlib import Path
    from typing import Dict, Iterable, List, Sequence, Tuple

    __version__ = "1.0.0"

    @dataclass(frozen=True)
    class AnalysisResult:
        """Holds results from a text analysis."""
        token_counts: Dict[str, int]
        total_tokens: int

        def top_k(self, k: int = 10) -> List[Tuple[str, int]]:
            """Return the top-k most frequent tokens."""
            return sorted(self.token_counts.items(), key=lambda kv: (-kv[1], kv[0]))[:k]

    def _read_text(path: Path) -> str:
        """Read UTF-8 text from a file."""
        data = path.read_text(encoding="utf-8", errors="replace")
        return data

    @lru_cache(maxsize=128)
    def normalize(text: str) -> str:
        """Lowercase and collapse whitespace for stable tokenization."""
        text = text.lower()
        text = re.sub(r"\s+", " ", text).strip()
        return text

    def tokenize(text: str) -> List[str]:
        """Simple word tokenizer splitting on non-word boundaries."""
        return [t for t in re.split(r"\W+", normalize(text)) if t]

    def ngrams(tokens: Sequence[str], n: int) -> List[Tuple[str, ...]]:
        """Compute n-grams as tuples from a token sequence."""
        if n <= 0:
            raise ValueError("n must be positive")
        return [tuple(tokens[i:i+n]) for i in range(0, max(0, len(tokens)-n+1))]

    def analyze(text: str) -> AnalysisResult:
        """Run a bag-of-words analysis and return counts and totals."""
        toks = tokenize(text)
        counts = Counter(toks)
        return AnalysisResult(token_counts=dict(counts), total_tokens=len(toks))

    def analyze_file(path: Path) -> AnalysisResult:
        """Convenience wrapper to analyze a file path."""
        return analyze(_read_text(path))

    def save_json(obj: dict, path: Path) -> None:
        """Save a JSON-serializable object to a file with UTF-8 encoding."""
        path.write_text(json.dumps(obj, indent=2, ensure_ascii=False) + "\n", encoding="utf-8")

Messy Script:

    # ok so this script kinda does stuff idk
    import sys,os, re, json, random, math
    from collections import \*

    VER="lol"
    g = {}
    data = []
    TMP=None

    def readz(p):
        try:
            return open(p,"r",encoding="utf-8",errors="ignore").read()
        except:
            return ""

    def norm(x):
        x=x.lower().replace("\n"," ").replace("\t"," ")
        x=re.sub(" +"," ",x)
        return x.strip()

    def tokn(x):
        x=norm(x)
        return re.split("\W+",x)

    def ana(s):
        c = Counter()
        for t in tokn(s):
            if t: c[t]+=1
        return {"counts":dict(c),"total":sum(c.values())}

    def showTop(d,k=10):
        try:
            it=list(d["counts"].items())
            it.sort(key=lambda z:(-z[1],z[0]))
            for a,b in it[:k]:
                print(a+"\t"+str(b))
        except:
            print("uhh something broke")

    def main():
        # not really parsing args lol
        if len(sys.argv)<2:
            print("give me a path pls")
            return 2
        p=sys.argv[1]
        t=readz(p)
        r=ana(t)
        showTop(r,10)
        if "--out" in sys.argv:
            try:
                i=sys.argv.index("--out"); o=sys.argv[i+1]
            except:
                o="out.json"
            with open(o,"w",encoding="utf-8") as f:
                f.write(json.dumps(r))
        return 0

    if __name__=="__main__":
        # lol
        main()

    def f1(x=None,y=0,z="no"):
        # todo maybe this should do something??
        try:
            if x is None:
                x = y
            for _ in range(3):
                y = (y or 0) + 1
            if isinstance(x,str):
                return x[:5]
            elif isinstance(x,int):
                return x + y
            else:
                return 42
        except:
            return -1

    def f2(x=None,y=0,z="no"):
        # todo maybe this should do something??
        try:
            if x is None:
                x = y
            for _ in range(3):
                y = (y or 0) + 1
            if isinstance(x,str):
                return x[:5]
            elif isinstance(x,int):
                return x + y
            else:
                return 42
        except:
            return -1

    def f3(x=None,y=0,z="no"):
        # todo maybe this should do something??
        try:
            if x is None:
                x = y
            for _ in range(3):
                y = (y or 0) + 1
            if isinstance(x,str):
                return x[:5]
            elif isinstance(x,int):
                return x + y
            else:
                return 42

johnsillings8mo ago

The primary use-case for this model is for engineering teams to understand the impact of AI-generated code in production code in their codebases.

2 more replies

nomel8mo ago

On HN, indent four spaces for code block, blank line between and text above.

    Like
    This

1 more reply

fancyfredbot8mo ago

An AI code detector would be a binary text classifier - you input some text and the output is either "code" or "not-code".

This is an "AI AI code detector".

You could call it a meta-AI code detector but people might think that's a detector for AI code written by the company formerly known as Facebook.

czbond8mo ago

With "code" or "not-code" did you make a cheeky reference to "hotdog" "not hotdog"?

fancyfredbot8mo ago

Yes. Also the less famous cheese/petrol classifier. https://www.youtube.com/watch?v=B_m17HK97M8

johnsillings8mo ago

brb, renaming

icemanx8mo ago

Would be amazing to have a CLI tool that detects AI generated code (even add it as part of CI/CD pipelines). I'm tired of all the AI trash PRs

johnsillings8mo ago

It is possible to use this via the command line today. I'll ask Henry to have a look & comment here (or grab a demo and leave a note, we'll give you some more details).

samfriedman8mo ago

Accuracy is a useless statistic: give us precision and recall.

henrylOP8mo ago

Recall 91.5, F1 93.3

paradite8mo ago

I think you need to define which one is the positive, and which one is the negative?

Is AI generated code the positive?

LPisGood8mo ago

Useless is perhaps a but harsh. It tells you something.

spott8mo ago

Only if you know the data distribution.

It is pretty easy to get 99.99% accuracy on a dataset that is 99.99% a single class for example.

dymk8mo ago

It tells me nothing because it doesn’t say if they mean precision or recall

1 more reply

ldl123458mo ago

bigyabai8mo ago

I can detect AI-generated code with 100% accuracy, provided you give me an unlimited budget for false positives. It's a bit of a useless metric.

henrylOP8mo ago

jfarina8mo ago

That's not relevant to the comment you replied to.

1 more reply

mannicken8mo ago

Only Python, TypeScript and JavaScript? Well there go my vibe-coded elisp scripts.

I guess it's impossible (or really hard) to train a language-agnostic classifier.

Reference, from your own URL here: https://www.span.app/introducing-span-detect-1

henrylOP8mo ago

johnsillings8mo ago

jftuga8mo ago

johnsillings8mo ago

Great question. The model does look at comments, too.

jftuga8mo ago

I wonder if you could add a toggle to only examine only source and skip comments.

yamalight8mo ago

TrueGeek8mo ago

Maybe unrelated, but do you have trouble completing CAPTCHAs?

yamalight8mo ago

Not usually, no. (edit: and that totally went over my head, lol. good one :) )

Alifatisk8mo ago

Very cool piece of tech, I would suggest putting C on the priority list and then Java. Mainly because Unis and Colleges use one of them or both, so that would be a good use case

johnsillings8mo ago

Totally – we have support for Java, C#, and Ruby in the works.

Edit: since you mentioned universities, are you thinking about AI detection for student work, e.g. like a plagiarism checker? Just curious.

Alifatisk8mo ago

Glad to hear Ruby is being in the list as well!

simanyay8mo ago

[1] - https://chatgpt.com/share/e/68c9d578-8290-8007-93f4-4b178369...

JohnFriel8mo ago

henrylOP8mo ago

Candidly, it's a bit of a black box still. We hope to do some ablation studies soon, but we tried to have a variety of formatting and commenting styles represented in both training and evaluation.

johnsillings8mo ago

sharing the technical announcement here (more info on evaluations, comparison to other models, etc): https://www.span.app/introducing-span-detect-1

kittikitti8mo ago

triwats8mo ago

Firstly I think this is neat, but the dam has burst.

This might be great for educational institutions but the idea of people needing to know what everyline does as output feels mute to me in the face of agentic AI.

johnsillings8mo ago

Sadly, this doesn't work on the line-level yet. I know that wasn't the main purpose of your comment, but figured I'd mention that first.

khanna_ayush8mo ago

My engineers didn’t know how much they used AI for vibe coding until I used Span. Can confirm we were all left with jaws on the floor. Now re-thinking my hiring plan for the next year.

mechen8mo ago

As a leader this is actually really neat - going to give it a spin

johnsillings8mo ago

Really appreciate it!

jensneuse8mo ago

Could I use this to iterate over my AI generated code until it's not detectable anymore? So essentially the moment you publish this tool it stops working?

well_actulily8mo ago

This is essentially the adversarial generator/discriminator set-up that GANs use.

henrylOP8mo ago

I'm sure you can but there isn't really an adversarial motive for doing that, I would think :)

polynomial8mo ago

Sure there is.

p0w3n3d8mo ago

I wonder how many false positives it has

thirdacc8mo ago

And false negatives. I just pasted 100% AI generated code and it told me it's only 40% AI written.

faangguyindia8mo ago

jakderrida8mo ago

What if I just modify the code to misspell things that no AI would misspell?

jimmydin78mo ago

what will the pricing be? i guess this is just a super early demo, I want to hear your pricing plan. Also, is this B2B or B2C?

ht-syseng8mo ago

Ndotkess8mo ago

What is your approach to measuring accuracy?

johnsillings8mo ago

I'm sure Henry will chime in here, but there's some more info here in the technical announcement: https://www.span.app/introducing-span-detect-1

henrylOP8mo ago

More details about how we eval'ed here:

https://www.span.app/introducing-span-detect-1

preyapatel8mo ago

# load the dataset using the the given url iris = fetch_ucirepo(id=53) X = iris.data.features y = iris.data.targets df = pd.concat([X, y], axis=1)

# Keep only Setosa and Versicolor df = df[df['class'].isin(['Iris-setosa', 'Iris-versicolor'])]

# Separate features and labels df['class'] = df['class'].map({'Iris-setosa': 0, 'Iris-versicolor': 1}) X = df.iloc[:, :-1].values y = df['class'].values.reshape(-1, 1)

# intercept X = np.c_[np.ones((X.shape[0], 1)), X]

# train test split (80/20) X_train, X_test, y_train, y_test = train_test_split( X, y, test_size=0.2, random_state=42, shuffle=True )

# Logistic Regression (Gradient Descent) def sigmoid(z): return 1 / (1 + np.exp(-z))

def compute_loss(y, y_pred): m = len(y) return - (1/m) * np.sum(ynp.log(y_pred + 1e-9) + (1 - y)np.log(1 - y_pred + 1e-9))

# weights and parameters theta = np.zeros((X_train.shape[1], 1)) lr = 0.01 # learning rate iteration = 10000 # iterations

    if epoch % 1000 == 0:
        loss = compute_loss(y_train, y_pred)
        print(f"Epoch {epoch}: Loss = {loss:.4f}")

# Predictions and Metrics y_test_pred = sigmoid(np.dot(X_test, theta)) y_test_class = (y_test_pred >= 0.5).astype(int)

# Accuracy accuracy = np.mean(y_test_class == y_test) * 100 print("RESULTS") print(f"Classification Accuracy on Test Data: {accuracy:.2f}%")

# Confusion Matrix cm = confusion_matrix(y_test, y_test_class) print("\nConfusion Matrix for Test data:") print(cm)

print("\n--- Predict for a new flower sample ---") print("Please enter the feature values:")

# Create feature array with bias term new_sample = np.array([[1, sepal_length, sepal_width, petal_length, petal_width]])

# Predict probability and class new_pred_prob = sigmoid(np.dot(new_sample, theta)) new_pred_class = (new_pred_prob >= 0.5).astype(int)

mechen8mo ago

Just tried it out and it works :mind-blown:

jjmarr8mo ago

You're saying "Understand and report on impact by AI coding tool". How can you drill down into per-coding assistant usage?

Also, what's the pricing?

dynameds8mo ago

import java.io.File; import java.util.Scanner;

public class Main { public static void main(String[] args) { LinkList linkedList = new LinkList(); Scanner scanner = new Scanner(System.in);

        System.out.print("Enter input filename: ");
        String filename = scanner.nextLine();
        
        File file = new File(filename);
        if (!file.exists() || !file.canRead()) 
        {
            System.out.println("Error: Cannot open the file.");
            System.exit(1);
        }
        
        Scanner fileScanner = new Scanner(System.in);
        try 
        {
            fileScanner = new Scanner(file);
        } 
        catch (Exception e) 
        {
            System.out.println("Unexpected error opening file.");
            System.exit(1);
        }
        
        while (fileScanner.hasNextLine()) 
        {
            String line = fileScanner.nextLine();
            if (line.isEmpty()) continue;
            
            int spaceIndex = line.indexOf(' ');
            if (spaceIndex == -1) continue;
            
            String name = line.substring(0, spaceIndex);
            String battingRecord = line.substring(spaceIndex + 1);
            
            processPlayer(linkedList, name, battingRecord);
        }
        
        fileScanner.close();
        displayPlayers(linkedList);
        scanner.close();
    }
    
    public static void processPlayer(LinkList linkedList, String name, String battingRecord) 
    {
        Node curNode = linkedList.search(name);
        if (curNode != null) 
        {
            updateStats(curNode.getPlayer(), battingRecord);
        } 
        else 
        {
            Player newPlayer = new Player(name);
            updateStats(newPlayer, battingRecord);
            linkedList.insert(newPlayer);
        }
    }
    
    public static void updateStats(Player player, String battingRecord) 
    {
        char[] characters = battingRecord.toCharArray();
        for (int i = 0; i < characters.length; i++) 
        {
            char c = characters[i];
            switch (c) 
            {
                case 'H': player.setHits(player.getHits() + 1); break;
                case 'O': player.setOuts(player.getOuts() + 1); break;
                case 'K': player.setStrikeouts(player.getStrikeouts() + 1); break;
                case 'W': player.setWalks(player.getWalks() + 1); break;
                case 'P': player.setHbp(player.getHbp() + 1); break;
                case 'S': player.setSacrifices(player.getSacrifices() + 1); break;
                default: 
            }
        }
    }

    
    public static void displayPlayers(LinkList linkedList) 
    {
        Node current = linkedList.getHead();
        while (current != null) 
        {
            Player player = current.getPlayer();
            
            int atBats = player.getHits() + player.getOuts() + player.getStrikeouts();
            double ba;
            if (atBats == 0) 
            {
                ba = 0.0;
            }
            else 
            {
                ba = (double) player.getHits() / atBats;
            }

            int plateAppearances = atBats + player.getWalks() + player.getHbp() + player.getSacrifices();
            double obp;
            if (plateAppearances == 0) 
            {
                obp = 0.0;
            }
            else
            {
                obp = (double) (player.getHits() + player.getWalks() + player.getHbp()) / plateAppearances;
            }
                        
            
            System.out.printf("%s\t%d\t%d\t%d\t%d\t%d\t%d\t%.3f\t%.3f%n",
                player.getName(), atBats, player.getHits(), player.getWalks(),
                player.getStrikeouts(), player.getHbp(), player.getSacrifices(), ba, obp);
            
            current = current.getNext();
        }
    }
}

j / k navigate · click thread line to collapse