The following quotes are from a Reddit comment here: https://www.reddit.com/r/LocalLLaMA/comments/1dkgjqg/comment...
> under International Data Transfers (in the Privacy Policy): """ The personal information we collect from you may be stored on a server located outside of the country where you live. We store the information we collect in secure servers located in the People's Republic of China . """
> under How We Share Your Information > Our Corporate Group (in the Privacy Policy): """ The Services are supported by certain entities within our corporate group. These entities process Information You Provide, and Automatically Collected Information for us, as necessary to provide certain functions, such as storage, content delivery, security, research and development, analytics, customer and technical support, and content moderation. """
> under How We Use Your Information (in the Privacy Policy): """ Carry out data analysis, research and investigations, and test the Services to ensure its stability and security; """
> under 4.Intellectual Property (in the Terms): """ 4.3 By using our Services, you hereby grant us an unconditional, irrevocable, non-exclusive, royalty-free, sublicensable, transferable, perpetual and worldwide licence, to the extent permitted by local law, to reproduce, use, modify your Inputs and Outputs in connection with the provision of the Services. """
At $0.14/$0.28 per million input/output tokens it's a no-brainer to use their APIs. I understand some people would have privacy concerns and would want to avoid their APIs, although I personally spend all my time contributing to publicly available OSS code bases, so I'm happy for any OSS LLM to use any of our code bases to improve their LLM and hopefully also improve the generated code for anyone using our libraries.
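To put those prices in perspective, here is a rough back-of-the-envelope sketch. It assumes the two figures are USD per one million input and output tokens respectively; the function name and the workload numbers are made up for illustration.

```python
# Assumed prices in USD per one million tokens (input, output).
INPUT_USD_PER_M = 0.14
OUTPUT_USD_PER_M = 0.28


def api_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the API cost in USD for a given token volume."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


# Hypothetical monthly workload: 50M input tokens, 10M output tokens.
print(f"${api_cost_usd(50_000_000, 10_000_000):.2f}")  # prints $9.80
```

Even a fairly heavy hypothetical workload stays under ten dollars at these rates, which is the point being made above.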
Since many LLM orgs are looking to build proprietary moats around their LLMs to maintain their artificially high prices, I'll personally make an effort to use the best OSS LLMs available first (i.e. from DeepSeek, Meta, Qwen or Mistral AI) since they're bringing down the cost of LLMs and aiming to render the technology a commodity.
[1] https://ollama.com/library/deepseek-coder-v2
[2] https://huggingface.co/deepseek-ai/DeepSeek-Coder-V2-Lite-In...
Is that even legal with regard to EU users?
Maybe OK for trying out stuff, a big no-no for real work.
43x cheaper is good, but my time is also worth money, and it unfortunately doesn't bode well for me that it's stumped by the first problem I throw at it.
Yi-Coder results, with Sonnet and GPT-3.5 for scale:
77% Sonnet
58% GPT-3.5
54% Yi-Coder-9b-Chat
45% Yi-Coder-9b-Chat-q4_0
Full leaderboard:
I wonder what's the reason.
You also need a knowledge of code to instruct an LLM to generate decent code, and even then it's not always perfect.
Meanwhile plenty of people are using free/cheap image generation and going "good enough". Now they don't need to pay a graphic artist or pay for a stock photo licence.
Any layperson can describe what they want a picture to look like so the barrier to entry and successful exit is a lot lower for LLM image generation than for LLM code generation.
and getting sandwich photos of ham blending into human fingers:
https://www.reddit.com/r/Wellthatsucks/comments/1f8bvb8/my_l...
This is not the case for art or music generators: they are marketed towards (and created by) laypeople who want generic content and don't care about human artists. These systems are a significant burden on productivity (and a fatal burden on creativity) if you are an honest illustrator or musician.
Another perspective: a lot of the most useful LLM codegen is not asking the LLM to solve a tricky problem, but rather to translate and refine a somewhat loose English-language solution into a more precise JavaScript solution (or whatever), including a large bag of memorized tricks around sorting, regexes, etc. It is more "science than art," and for a sufficiently precise English prompt there is even a plausible set of optimal solutions. The LLM does not have to "understand" the prompt or rely on plagiarism to give a good answer. (Although GPT-3.5 was a horrific F# plagiarist... I don't like LLM codegen but it is far more defensible than music generation)
This is not the case with art or music generators: it makes no sense to describe them as "English to song" translators, and the only "optimal" solutions are the plagiarized / interpolated stuff the human raters most preferred. They clearly don't understand what they are drawing, nor do they understand what melodies are. Their output is either depressing content slop or suspiciously familiar. And their creators have filled the tech community with insultingly stupid propaganda like "they learn art just like human artists do." No wonder artists are mad!
But many people use diffusion models in a much more interactive way, doing much more of the editing by hand. The simplest case is to erase part of a generated image, and prompt to infill. But there are people who spend hours to get a single image where they want it.
Good. The artists I know have zero interest in doing that work. I have sacrificed a small fortune to invest in my wife's development as an artist so she never had to worry about making any money. She uses AI to help with promoting and "marketing" herself.
She and all of her colleagues despise commissioned work, and they get a constant stream of it. I always tell her to refuse them. Some pay very well.
If you are creating generic "art" for corporations I have little more than a shrug for your anxiety over AI.
Artists put a ton of time into education and refining their vision inside the craft. Amateur efforts to produce compelling work always look amateur. With augmentation, suddenly the "real" artists aren't as differentiated.
The whole conversation is obviously extremely skewed toward digital art, and the ones talking about it most visibly are the digital artists. No abstract painter thinks AI is coming for their occupation or cares whether it is easier to create anime dreamscapes this year or the next.
If they can take a Jira ticket, debug the code, create a patch for a large codebase and understand and respect all the workarounds in a legacy codebase, I would have a problem with it.
I've commissioned tens of thousands of dollars in art, and spent many hundreds of hours working with Stable Diffusion, Midjourney, and Flux. What all the generators are missing is intentionality in art.
They can generate something that looks great at surface level, but doesn't make sense when you look at the details. Why is a particular character wearing a certain bracelet? Why do the windows on that cottage look a certain way? What does a certain engraving mean? Which direction is a character looking, and why?
The diffusers do not understand what they are generating, so they just generate what "looks right." Often this results in art that looks pretty but has no deeper logic, world building, meaning, etc.
And of course, image generators cannot handle the client-artist relationship as well (even LLMs cannot), because it requires an understanding of what the customer wants and what emotion they want to convey with the piece they're commissioning.
So - I rely on artists for art I care about (art I will hang on my walls), and image generators for throwaway work (such as weekly D&D campaign images.)
Once you engage agentic behaviour, it can take you way further than just the chats. We're already in the "resolving JIRA tickets" area - it's just hard to set up, not very well known, and may be expensive.
There isn't a good metaphor for the problem with AI art. I would say it is like some kind of chocolate cake that the first few bites seem like the best cake you have ever had and then progressive bites become more and more shit until you stop even considering eating it. Then at some point even the thought of the cake makes you want to puke.
I say this as someone who thought we reached the art singularity in December 2022. I have no philosophical or moral problem with AI art. It just kind of sucks.
Cursor/Sonnet on the other hand just blew my mind earlier today.
And I use Claude 3.5 Sonnet myself.
- There is a big demand for really complex software development, and an LLM can't do that alone. So software devs have to do lots of busywork, and like the opportunity to be augmented by AI.
- Conversely, there is a huge demand for not very high level art. E.g., lots of people want a custom logo or a little jingle, but not many people want to hire a concert pianist or commission the next Salvador Dalí.
So most artists spend a lot of time doing a lot of low level work to pay the bills, while software devs spend a lot of time doing low level code monkey work so they can get to the creative part of their job.
I'm not sure I can really point out a big difference here. Maybe the artists are more skewed towards not liking AI since they work with a medium that's not digital in the first place, but the range of responses really feels close.
I'm still waiting for a model that's highly specialised for a single language only - and either a lot smaller than these jack of all trades ones or VERY good at that specific language's nuances + libraries.
Of course, this is far from trivial, you don't just add more data and expect it to automatically be better for everything. So is time management for us mere mortals.
Source? I'm very curious how learning one language helps a model generate code in a language with different paradigms. Java, Markdown, JSON, HTML, Fortran?
You could finetune it on your codebases and specific docs for added perf.
However, I enjoy using various Lisp languages and I was pleased last night when I set up Emacs + ellama + Ollama + Yi-Coder. I experimented with Cursor last weekend, and it was nice for Python, not so great for Common Lisp.
It's a small model trained only on quality sources (i.e. textbooks).
Then I tried other questions from my past to compare... However, I believe the engineer who built the LLM just used the questions in the benchmarks.
In one instance, after an hour of use (I stopped then), it answered one question in 4 different programming languages, with answers that were in no way related to the question.
Unfortunately, this has always been my experience with all open source code models that can be self-hosted.
Also for the cloud models apart from GitHub Copilot, what tools or steps are you all using to get them working on your projects? Any tips or resources would be super helpful!
The setup is pretty simple:
* Install Ollama (instructions for your OS on their website - for macOS, `brew install ollama`)
* Download the model: `ollama pull yi-coder`
* Install and configure Continue on VS Code (https://docs.continue.dev/walkthroughs/llama3.1 <- this is for Llama 3.1 but it should work by replacing the relevant bits)
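Once the model is pulled, it can also be queried without an editor plugin. The following is a minimal sketch, assuming Ollama is running locally on its default port (11434) and that the model tag is `yi-coder` as in the pull command above; the helper names `build_request` and `generate` are made up for this example.

```python
import json
import urllib.request

# Assumption: Ollama is serving its REST API on the default local port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "yi-coder") -> urllib.request.Request:
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "yi-coder") -> str:
    """Send the prompt and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=120) as resp:
        return json.loads(resp.read())["response"]
```

With the server up, something like `print(generate("Write a function that reverses a string."))` should return the completion as a single string.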
This is easy to get "working" but difficult to configure for specific tasks due to docs being lacking or contradictory.
Or at least state what you configured toward and how?
An emerging trend people should look out for: model release sheets often contain comparisons with out-of-date models and don't so much inform as try to make the model look "best."
It's an annoying trend. Untrustworthy metrics betray untrustworthy morals.
I'm not so much interested in the response time (does anyone have a couple of spare A100s?), but it would be good to be able to try out different LLMs locally.
You can also use several GPU options, but they are not as easy to get working.
For practical reasons, I often like to know how much GPU RAM is required to run these models locally. The actual number of weights seems to only express some kind of relative power, which I doubt is relevant to most users.
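As a rough rule of thumb (not an exact figure), the memory needed just to hold the weights is the parameter count times the bytes per weight. The sketch below assumes nominal bit widths; real quantized formats store some extra scaling data, and the KV cache and runtime overhead come on top, so treat the results as lower bounds. The function name is made up for illustration.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# A 9B-parameter model at a few nominal precisions.
for label, bits in [("FP16", 16), ("Q8", 8), ("Q4", 4)]:
    print(f"9B @ {label}: ~{weight_memory_gb(9e9, bits):.1f} GB")
```

Under these assumptions a 9B model needs roughly 18 GB at FP16, 9 GB at 8-bit and 4.5 GB at 4-bit for the weights, before any context is loaded.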
Edit: reformulated to sound like a genuine question instead of a complaint.
I hope that Yi-Coder 9B FP16 and Q8 will be available soon for Ollama. Right now I only see the 4-bit quantized 9B model.
I'm assuming that these models will be quite a bit better than the 4-bit model.
Using SWE-agent + Yi-Coder-9B-Chat.
EDIT: Granted, Yi-Coder 9B is still smaller than any of these.
I can't find a post that I remember Google published just after all the ChatGPT SQL generation hype happened, but it felt like they were trying to counter that hype by explaining that most complex LLM-generated code snippets won't actually run or work, and that they were putting a code-evaluation step after the LLM for Bard.
(A bit like: why did they never put an old-fashioned rules-based grammar-check stage on Google Translate results?)
Fast forward to today and it seems it's a normal step for Gemini etc https://ai.google.dev/gemini-api/docs/code-execution?lang=py...
However, AFAIK it's only ever at inference time; an interpreter isn't included during LLM training? I wonder if it would be possible to fine-tune a model for coding with an interpreter. Though if no one has done it yet there is presumably a good reason why not.
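For illustration, a code-evaluation step placed after the LLM could look something like the sketch below. This is not how Bard or Gemini do it; it is a minimal assumption-laden version that simply runs each candidate snippet in a fresh Python subprocess and keeps the first one that exits cleanly. The function names are made up, and a subprocess is not a sandbox, so untrusted output would need real isolation.

```python
import subprocess
import sys
from typing import List, Optional


def runs_cleanly(code: str, timeout_s: float = 5.0) -> bool:
    """Return True if the snippet exits with status 0 within the timeout."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


def first_working(candidates: List[str]) -> Optional[str]:
    """Return the first candidate snippet that runs without error, if any."""
    for code in candidates:
        if runs_cleanly(code):
            return code
    return None
```

"Runs without error" is a much weaker check than "does what was asked", so in practice a step like this would be paired with tests or assertions on the output.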