Can't we just be honest and say that most of these are applied statistics jobs with a specialty in large volumes of data? Or is "statistics" just not fashionable enough nowadays?
IMO, the "data science" label is too broad to properly differentiate statistical engineers. A fine definition for a data scientist is someone who runs experiments on user/company data and can assess the results. It's important work, but you don't need a PhD in stats or ML to do basic hypothesis testing (a small example follows below).
You could simply call them "machine learning" experts, but that could be a bit too academic. People who are focused narrowly on theory or niche areas may be experts in ML, but they may also never do anything outside of running matlab simulations. It's unlikely that those people will make very good statistical engineers since they may never have had to think about the challenges involved in scaling algorithms.
My preferred term is "predictive analytics," which I feel kind of straddles statistics and machine learning, and also serves as a nod to a common difference -- "statistical" methods often yield understanding, while "machine learning" methods are often opaque to human insight but yield predictions.
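To make the "basic hypothesis testing" point concrete, here's a minimal sketch of the kind of A/B-test assessment a working data scientist does. The numbers and variant names are made up; it assumes Python with scipy installed.

```python
from scipy import stats

# Made-up A/B test results: did variant B convert better than variant A?
conversions_a, visitors_a = 120, 2400
conversions_b, visitors_b = 150, 2350

# Compare the two proportions with a chi-squared test on the 2x2 table.
table = [[conversions_a, visitors_a - conversions_a],
         [conversions_b, visitors_b - conversions_b]]
chi2, p, dof, expected = stats.chi2_contingency(table)

print(conversions_a / visitors_a, conversions_b / visitors_b)
print(p)  # a small p-value is evidence the conversion rates really differ
```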
The most value I got from this article was in the realization that, every few years or so, the academic globes align well enough (some paper du jour becomes well-read, I suppose) that .. for a brief instant .. terms are defined well enough, and gain enough agreement, that progress is made .. which progress attracts more eyeballs, who tend to want to break off a chunk for themselves, and the terms begin to differ again and we have a whole new 'sub-sub-sub-' variety of the subject.
So it's all about globes aligning, basically. I will now go off and implement an AI technique based entirely on the description of globes, alignment, and little chunks breaking off every now and then .. see you at the top of the AI heap in a year or ten.
I think many would agree that "machine learning" and/or "deep learning" are at least cornerstones for "artificial intelligence". After all, nobody singularly defines intelligence.
Statistics is not (just) opinion polling, there's a lot more to it than estimating observable properties of a population.
If you're trying to make decisions, predictions or estimates which involve any uncertainty at all (and in my experience big data almost always does), then it's definitely within the purview of statistics even if you have data for the whole population.
Sources of uncertainty include trying to say anything at all about the future (do you have data on the future population? no, didn't think so...), trying to make predictions which generalise to new data in general, and trying to uncover underlying trends or patterns behind the data you see which aren't directly or fully observed.
Often people expect big data to be able to answer big numbers of questions, estimate big numbers of quantities, or fit big, powerful predictive models with lots of parameters. In these cases statistics can be particularly important to avoid reporting false positives and to make sure you can quantify how certain you are about your results and your predictions. (Amongst other reasons).
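As a toy illustration of the false-positive point (my sketch, not from the comment above; assumes Python with numpy and scipy): ask enough questions of pure noise and a chunk of them come back "significant".

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# 1000 candidate "findings", none of which is real: both groups are pure noise.
n_tests, n_per_group = 1000, 50
false_positives = 0
for _ in range(n_tests):
    a = rng.normal(size=n_per_group)
    b = rng.normal(size=n_per_group)
    _, p = stats.ttest_ind(a, b)
    false_positives += p < 0.05

# Without any multiple-comparisons correction, roughly 5% of null tests
# come out "significant" anyway -- about 50 false discoveries here.
print(false_positives)
```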
The new machine learning is about building layers of components on top of each other, very much like circuits seen in EE. The "circuit" components being used are no longer well defined mathematical pieces built from the bottom up using ideal assumptions, but less well understood, somewhat black-box newer components that were built from the top down. Far more like a type of engineering than a type of statistics.
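A rough illustration of the "layers of components" framing (my sketch, not from the comment; assumes Python with numpy): each block is just a function, and the network is blocks wired in series like circuit stages.

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(x, W, b):
    # One "component": a linear map followed by a ReLU nonlinearity.
    return np.maximum(0, x @ W + b)

x = rng.normal(size=(4, 8))                     # a tiny batch of 8-dim inputs
W1, b1 = rng.normal(size=(8, 16)), np.zeros(16)
W2, b2 = rng.normal(size=(16, 2)), np.zeros(2)

# "Layers of components on top of each other": the output of one block
# simply becomes the input of the next.
h = layer(x, W1, b1)
y = h @ W2 + b2
print(y.shape)                                  # (4, 2)
```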
If you haven't been reading the latest arXiv papers, you're really missing out. It's evolved to look sharply different from statistics now.
In general, AI borrows many more techniques from mathematics than it does from statistics. However, the field of AI has been well established since the 1960s, and many techniques have been developed within that field as AI techniques. It's more about being accurate than about being fashionable, as AI simply isn't 'just' statistics.
But isn't this the approach Nature herself is taking?
The knowledge engine you carry in your head spends years just "learning" the world - which means, it absorbs huge amounts of input, sorting the good stuff from the bad. It "knows" what works simply because that stuff happens more often; it "knows" what doesn't work because that stuff doesn't happen very often.
And sure there are higher layers of integration there, but the whole process is strongly supported by a statistical approach.
The other driver is business. This is where management demands 'AI experts' when what they really want is data miners. In many cases management prides itself on 'AI algorithms', but we know that this is a term for anything that gets the results management wants; it may be far from intelligent, and in most corporate cases it's a bunch of SQL scripts.
I mean people go and study in response to demand. They learn data mining and AI at Universities. I think it's often people with backgrounds or aptitude in maths. What will the 22 year old with an aptitude for maths that is learning R, SQL, AI-for-business and such be doing in 10 years?
I don't know if the starting point matters much. "Results driven," even if it's optimising inventory or making ad-purchasing decisions or data mining old DBs, is not a bad place to "search" for advancements. Not everything needs to be fundamental research.
I wonder if this is a case where managers end up believing their own bullshit. "AI driven" is basically the marketing-speak for "a bunch of SQL scripts".
Most upper management sees IT only in a business support role, not as a business driver!
It is a very interesting field, but as a self-taught programmer I'm used to learning by building things, and it's hard for me to come up with a project that would be practically useful and yet doable.
Does anyone have any ideas?
The state of hobbyist robotics is very hardware-centric. I have not yet discovered anything appropriate for use by AI programmers.
So I decided to step back and work on theory. I did this for several years and recently finished. Hobbyist robots still aren't ready to implement my ideas so now what do I do? I started working on a software architecture based on my high-level research. But how to test it? So now I am working on software to simulate an environment for my AI to interact with because I don't have the time or skill to build a real robot.
I honestly don't think this is a bad thing because I would imagine that when we start to work on "real" AI (not this nonsense that passes for it today) testing behaviors in a simulated environment before deployment to hardware would save a huge amount of time (and physical damage.)
Now I am hoping that once I have taken my virtual habitat as far as it can go, hardware will be available for me to apply what I have been working on. My hopes are low.
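For what it's worth, the simulate-before-hardware idea can start absurdly small. Here's a hypothetical sketch (the Habitat name and reward numbers are made up) of the agent/environment loop in Python: a 1-D world, a placeholder agent, and a step function you can later swap a real controller into.

```python
import random

class Habitat:
    """A made-up 1-D world: the agent starts at cell 0 and wants the goal cell."""
    def __init__(self, size=10, goal=9):
        self.size, self.goal, self.pos = size, goal, 0

    def step(self, action):                 # action is -1 (left) or +1 (right)
        self.pos = max(0, min(self.size - 1, self.pos + action))
        done = self.pos == self.goal
        reward = 1.0 if done else -0.01     # small cost per step taken
        return self.pos, reward, done

def random_agent(observation):
    # Placeholder policy; the point is that it can be replaced by the real AI.
    return random.choice([-1, +1])

env = Habitat()
obs, done, total = env.pos, False, 0.0
while not done:
    obs, reward, done = env.step(random_agent(obs))
    total += reward
print("episode return:", total)
```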
- Sports: Predict game outcomes, player performance. Make money playing fantasy sports?
- Real estate: Build a better tool for house assessments that identifies "comps" and predicts what the house should cost (see the sketch below).
- Astronomy: Something about exoplanets?
- Amazon prices: Find opportunities to buy/sell.
- Stock prices: Inputs are news/Twitter, outputs are predictions of closing price.
I'd start with your hobbies and go from there!
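For the real-estate idea above, a hedged starting point: "comps" maps almost directly onto k-nearest-neighbours regression. The numbers below are invented and it assumes Python with scikit-learn; a real tool would need feature scaling and far more data.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Toy stand-in data: [square feet, bedrooms, age in years] -> sale price.
X = np.array([[1400, 3, 20],
              [2000, 4,  5],
              [1100, 2, 40],
              [1750, 3, 12],
              [2400, 4,  2]])
prices = np.array([230_000, 390_000, 160_000, 305_000, 450_000])

# k nearest neighbours is literally the "comps" idea: price a house by
# averaging the most similar recent sales.
model = KNeighborsRegressor(n_neighbors=3).fit(X, prices)

new_house = np.array([[1600, 3, 15]])
print(model.predict(new_house))       # rough estimate from the 3 closest comps
print(model.kneighbors(new_house))    # distances and indices of those comps
```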
This might sound a little uninspiring, but it's a solid start, and often you can make improvements over the authors' original stuff -- especially in areas where you may have more experience, e.g. computational efficiency.
If you have (condensed, especially) AI resources that you think would help bridge that gap, please share! Toy-scale project ideas would also be appreciated.
The edX class from Berkeley is pretty fun and hands-on. It uses Pacman as a running example and essentially teaches the agents material from AIMA:
https://www.edx.org/course/artificial-intelligence-uc-berkel...
The Stanford class by Thrun and Norvig himself (one of the authors of AIMA) is also good, but I prefer the edX one:
https://www.udacity.com/course/intro-to-artificial-intellige...
Edit: changed to direct links for the courses
(source: have tried to 'bridge the gap' for 2 years, including taking MSc courses, before admitting to myself that it's a lost battle. Am now starting to build a solid math foundation before revisiting ML applications.)
https://github.com/ChristosChristofidis/awesome-deep-learnin...
https://github.com/owainlewis/awesome-artificial-intelligenc...
https://honnibal.wordpress.com/2013/09/11/a-good-part-of-spe...
https://honnibal.wordpress.com/2013/12/18/a-simple-fast-algo...
> Used mainly to strip information value from people without compensation
> Who are dumping money into foundations to prevent the coming of its more true form
will be screaming 'It's the end of the world'. Until then, enjoy the algorithms. It's the nature of business to over-sell. Don't be too upset by it.
Second was the expert systems boom in the mid-1980s. This was fanned by Stanford professor Feigenbaum, who wrote the infamous book The Fifth Generation, about expert-system computers being the future and Japan building the best ones. These would be either LISP machines or machines running an interesting French niche language called Prolog. Prolog basically traversed a database of "if-then" rules (modus ponens). These machines went nowhere, and Japan's economy tanked in the early 90s. A lot of Silicon Valley VCs lost big on this.
Prof. Feigenbaum may still be correct, just 40 years early. However, the new A.I. is driven by massive database matching made possible by modern peta-scale computers, not so much by logical computing.
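To unpack "traversed a database of if-then rules (modus ponens)": here's a minimal forward-chaining sketch, in Python rather than Prolog. The facts and rules are invented purely for illustration.

```python
# Start from known facts and keep firing rules whose premises all hold.
facts = {"has_fever", "has_cough"}
rules = [
    ({"has_fever", "has_cough"}, "suspect_flu"),
    ({"suspect_flu"},            "recommend_rest"),
]

changed = True
while changed:                      # loop until no rule adds anything new
    changed = False
    for premises, conclusion in rules:
        if premises <= facts and conclusion not in facts:
            facts.add(conclusion)   # modus ponens: premises hold, assert conclusion
            changed = True

print(facts)   # {'has_fever', 'has_cough', 'suspect_flu', 'recommend_rest'}
```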
Pointing out that a bubble cycle has emerged in the means only accentuates the importance of the end: the desire for AI is so strong that repeated futility hasn't kept people from trying.
Virtual reality is another such example.
https://github.com/clojure/core.logic/wiki/A-Core.logic-Prim...
It tends to suffer from management-by-scalable-procedure disease. It's possible to successfully replace a human assembly-line worker with a robot arm and a very small shell script, which inevitably leads overactive imaginations to think of replacing engineers or doctors with an immense set of unfortunately undefinable, unscalable procedures and rulesets, so it always collapses under complexity at implementation time. It's like moths to a flame: you should be able to replace an engineer with a very long list of if/then statements, but it turns out to be impossible in practice. Meanwhile the more advanced techniques butt up against the rapidly scaling "DBA"/"IT" type of traditional solutions or non-traditional big-data techniques.
It's hard to find something to logic-program that isn't either less verbose in a non-logic language or unwritable in any language, logic programming included. It's like the Perl regex thing: you've got a problem, so you write a regex, and now you've got two problems. It's a very narrow although interesting niche. Finding something that fits would be pretty cool, although probably very difficult to maintain.
(Is anyone building an AI that can come up with its own agenda?)
(1) Take statistics, machine learning, neural nets, artificial intelligence (AI), big data, Python, R, SPSS, SAS, SQL Server, Hadoop, etc., set them aside, and ask the organization looking to hire: "What is the real world problem or collection of problems you want solved or progress on?"
Or, look at the desired ends, not just the means.
(2) Does the hiring organization really know what they want done that is at all doable with current technical tools or only modest extensions of them?
Or, since artificial intelligence is such a broad field (really, so far, mostly a collection of unanswered research questions), and the list of topics I mentioned is broader still, I question whether many organizations know in useful terms just what those topics would do for their organization.
So, for anyone with a lot of technical knowledge in, say, the AI, etc., topics, it is important for them to be able to evaluate the career opportunity. I.e., is there a real career opportunity there, say, one good to put on a resume and worth moving across country, buying a house, supporting a family, getting kids through college, meeting unusual expenses, e.g., special schooling for an ADHD child, providing for retirement, making technical and financial progress in the career, etc.?
So, some concerns:
(A) If an organization is to pay the big bucks for very long, e.g., for longer than some fashion fad, then they will likely need some valuable results on their real problems for their real bottom line. So, to evaluate the opportunity, should hear about the real problems and not just a list of technical topics.
(B) For the opportunity for the big bucks to be realistic, really should know where the money is coming from and why. That is, to evaluate the opportunity, need to know more about the money aspects than a $10/hour fast food guy.
(C) As just an employee, can get replaced, laid off, fired, etc. So, to evaluate the opportunity, need to evaluate how stable the job will be, and for that need to know about the real business and not just a list of technical topics.
(D) For success in projects, problem selection and description and tool selection are part of what is crucial. Is the hiring organization really able to do such work for AI, etc. topics?
Or, mostly organizations are still stuck in the model of a factory 100+ years ago, where the supervisor knew more and the subordinate was there to add muscle to the work of the supervisor. But in the case of AI, etc., which supervisors really know more, or much of anything? Which hiring managers know enough to do good problem and tool selection?
Or, if the supervisors don't know much about the technical topics, then usually the subordinate is in a very bad career position. This is an old problem: One of the more effective solutions is some high, well respected professionalism. E.g., generally a working lawyer is supposed to report only to a lawyer, not a generalist manager. Or there might be professional licensing, peer review, legal liability, etc. Or, being just an AI technical expert working for a generalist business manager promises in a year or so to smell like week old dead fish.
(E) If some of the AI, etc., topics do have a lot of business value, then maybe someone with such expertise really should be a founder of a company, harvest most of the value, and not be an employee. So, what are the real problems to be solved? That is, is there a startup opportunity there?
Really, my take is that the OP is, net, talking about a short term fad in some topics long surrounded with a lot of hype. Not good, not a good direction for a career.
AI and hype? Just why might someone see a connection there?
Ask them to explain what an SVM is. Ask them to explain how training a linear perceptron works. This kind of stuff.
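For reference, the linear perceptron training rule mentioned above fits in a dozen lines. A minimal sketch assuming Python with numpy, on an invented linearly separable toy problem:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)    # toy ground-truth labels (+1/-1)

w, b = np.zeros(2), 0.0
for _ in range(20):                           # a few passes over the data
    for xi, yi in zip(X, y):
        if yi * (xi @ w + b) <= 0:            # misclassified (or on the boundary)
            w += yi * xi                      # nudge the boundary toward xi
            b += yi

print(np.mean(np.sign(X @ w + b) == y))       # accuracy on the training set
```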
One way it happens is that you get a PhD in astrophysics with years of data-analysis experience coming in for a data science job. Have a software engineer interview her and he might find that she doesn't know a number of basic computer science concepts [traversing a linked list, tail recursion, implementing breadth-first search]. His knowledge background says these basic ideas are fundamental, so there are serious questions about the technical ability of the interviewee.
Or in other words...I'd be skeptical if a candidate hadn't learned programming on their own even if it wasn't required because it's pretty much impossible to get any practical experience otherwise.
People tried to do AI in the 60s-to-90s era. It is dubbed symbolic AI. It didn't work out, and there's a good chance it never will. Today, machine learning algos and a bunch of automated statistics are called "Artificial Intelligence". It's not intelligence at all. Intelligence implies something more than I/O computation.
So that fuzzy logic, that's not AI anymore, that's a footnote in the EE control-systems theory class, isn't it? And face recognition is a parallel-processing assignment in an FPGA class, speech recognition is an advanced section in a DSP theory class, etc.