This makes me wonder what kind of trading systems can actually have any kind of edge, since some kind of autoregressive time series forecasting system seems pretty unreliable.
On a more general note, how do you move beyond it being gambling? Just because a system backtests well doesn't mean a phenomenon will continue to happen, especially if your system will significantly impact the market you're in. If you make a trend-following system, every time you trade, you're gambling that the trend is more likely to continue than not. If you're right, you'll come out ahead over many trades. But if, like most beginners, you don't have enough capital to withstand drawdowns, you won't last long enough for whatever phenomenon you've found to average out.
It takes a lot of time, effort and risk to do all this, so, this is a long-winded way of saying I don't think it's for me. If you build a SaaS product and it fails, at least you can talk about what you learned from building it and use that in future endeavors. If you lose money trading because your algorithm doesn't work, what do you learn from that besides that your algorithm doesn't work?
Now that I have my alpha factor I backtest it and whatever. Since the mean of a zscore is zero, I know I'm market neutral, so (ignoring some stuff) my factor should have little exposure to the market.
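To make the z-score point concrete, here is a minimal sketch, assuming the factor is z-scored across the stocks in the universe and the z-scores are used directly as portfolio weights. The raw factor values and the function name are made up for illustration. Note that this only shows dollar neutrality (weights sum to zero); beta exposure is part of the "ignoring some stuff".

```python
import numpy as np

def zscore_weights(raw_factor):
    """Cross-sectional z-score of a raw factor, scaled so gross exposure is 1."""
    raw = np.asarray(raw_factor, dtype=float)
    z = (raw - raw.mean()) / raw.std()   # mean of z is zero by construction
    return z / np.abs(z).sum()           # rescale so |weights| sum to 1

# Hypothetical raw factor values for five stocks.
raw = [0.8, -1.2, 0.3, 2.1, -0.5]
weights = zscore_weights(raw)

net_exposure = weights.sum()            # longs minus shorts, should be ~0
gross_exposure = np.abs(weights).sum()  # longs plus shorts, 1 by construction
print(weights, net_exposure, gross_exposure)
```

Positive weights are longs and negative weights are shorts, so the long and short books have equal size.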
If I think it's good, I add it to my other alpha factors and combine them somehow. Could be as simple as adding them all up, or maybe something like using random forests to figure out the best way to combine them, or whatever. Now that I have a bunch of alpha factors all combined, I can run them through the optimization engine.
The optimization engine will adjust the weights of my "ideal" portfolio in order to reduce exposure to various risk factors (thus lowering volatility). My optimizer will also figure out how often I need to rebalance. There's generally a bunch of terms in there that try to reduce trading costs and zero out exposure while not diluting the "ideal" portfolio too much (or else the alpha could be wiped out).
Now, after all of this, I'm ready to trade.
In short, what we're trying to do is reduce our exposure to as many factors as possible and just get exposure to our alpha factor. We don't want the market, price of oil, sex scandal of a CEO, or anything else affecting our portfolio. We are trying to dig up this latent, unearthed, alpha that exists in the market, but doesn't belong to one company or asset.
It’s accepting that you’ll receive what the market gives you and not a dollar more, and that that’s the best you’re ever going to get.
I depend on an in-depth understanding of human psychology as one of my data sources. You can't turn something like that into data and feed it to a model. It is something learned through life experience and study.
I am not as familiar with the forex market as with the equity market. But I expect forex to be affected by political and economic news from the countries behind the relevant currency pairs. All of this needs to be coded into forex trading strategies.
If trading based on price alone were that simple, everybody would be doing it successfully.
A few months ago I tried to evaluate autoregressive behavior in stock returns. To my surprise it seemed strong in some periods, but then weak in others [1], and as you said, not consistent enough to rely on.
My impression is that a lot more information aggregation and processing is required to obtain a sustainable edge worth trading on than what a single developer can achieve in his/her spare time.
Top investment shops have dedicated teams of software engineers just to deal with the infrastructure that supports their data pipelines, financial model backtesting and deployment.
[1] https://thomasvilhena.com/2020/01/likelihood-of-autoregressi...
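For anyone who wants to repeat that kind of check, here is a minimal sketch, assuming "autoregressive behavior" means the lag-1 autocorrelation of returns measured over non-overlapping windows. This is not the method from the linked post; the data is a synthetic AR(1) series and the 250-observation window is an arbitrary choice.

```python
import numpy as np

def lag1_autocorr(returns):
    """Correlation between each return and the previous one."""
    r = np.asarray(returns, dtype=float)
    return float(np.corrcoef(r[:-1], r[1:])[0, 1])

def windowed_autocorr(returns, window):
    """Lag-1 autocorrelation in consecutive non-overlapping windows."""
    r = np.asarray(returns, dtype=float)
    starts = range(0, len(r) - window + 1, window)
    return [lag1_autocorr(r[i:i + window]) for i in starts]

# Synthetic data: AR(1) with coefficient 0.5, standing in for real returns.
rng = np.random.default_rng(0)
noise = rng.normal(size=5000)
ar1 = np.empty_like(noise)
ar1[0] = noise[0]
for t in range(1, len(noise)):
    ar1[t] = 0.5 * ar1[t - 1] + noise[t]

full_sample = lag1_autocorr(ar1)
per_window = windowed_autocorr(ar1, window=250)
print(full_sample, min(per_window), max(per_window))
```

Even with a fixed true coefficient, the per-window estimates move around, so some of the "strong in some periods, weak in others" pattern can be sampling noise rather than a changing market.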
[1] https://en.wikipedia.org/wiki/Momentum_investing
[2] https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2961979
(1) monitoring other people’s tomato transactions in as much detail as possible;
(2) spying on people who are about to buy tomatoes and rapidly make changes behind the scenes just before they make a purchase; or
(3) pointing voice analysis at The Food Network looking out for recipes that call for fresh tomatoes, tracking tomato tankers in major sea lanes, monitoring storm tracks in the top tomato growing zones, etc, and adjusting your position appropriately.
It sounds like you are trying (1) when (3) might be better, or even (2) if you are not jail-averse (or your local jurisdiction has institutionalized high speed market fiddling to the point of being legal.)
The secret is simply to have an edge.
If you're trading on behalf of clients, you don't care what happens to the market because you don't depend on the high or low to make money.
If you're buying or selling for yourself, same thing. Guess who's buying coal and oil: power plants, refineries and the like. They sell what they have and buy what they need.
If you're making money on arbitrage, you're making sure the New York and London exchanges have the same USD to GBP to EUR price and vice versa. You could make money, but you'd better be faster than other corporations and more careful at the same time, because you're not the only one doing that. Any time you buy one side, the other side might have changed before you can balance out.
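A rough sketch of the USD to GBP to EUR consistency check, assuming you have a quote for each leg of the loop. The rates below are invented, and fees, spreads and the risk of a leg moving before you complete the loop are all ignored.

```python
def round_trip_multiplier(rates, path):
    """Multiply the exchange rates along a currency path that ends where it starts."""
    value = 1.0
    for src, dst in zip(path, path[1:]):
        value *= rates[(src, dst)]
    return value

# Hypothetical quotes: units of the second currency per unit of the first.
rates = {
    ("USD", "GBP"): 0.80,
    ("GBP", "EUR"): 1.15,
    ("EUR", "USD"): 1.10,
}

loop = round_trip_multiplier(rates, ["USD", "GBP", "EUR", "USD"])
print(loop)  # above 1 means the quotes are inconsistent in your favor, before costs
```

When the prices are consistent the multiplier is 1 and there is nothing to collect; the whole game is being the first to notice when it is not.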
There are clear factors that drive many markets. When the weather is cold, people consume more energy for heating. When it's hot, they go out to barbecue and buy more sausages. When a drought or crop disease wipes out farms, prices of food and meat go up. Those are some examples that are easy to understand.
The stock market is not about speculation. It's about buying real items in the real world and providing services.
So inflation comes out at X% and then you try to jump ahead of other people reacting to the news.
Speaking very generally, you are looking for data that has information about future returns. So this may include past values of the time series (this is kind of complex though, because a stock price does trend: the company is investing capital to earn a return which compounds in the price, so stationarity is...complex) but may also include other time series and their past values, e.g. prices of other stocks, economic data, etc.
So this could be responding to changes in liquidity, it could be seeing some repeatable behaviour by investors and jumping ahead of it, etc.
Quant is not about adding to the efficiency of markets though. They aren't using these models to determine the value of something, they are more about looking at the value of other things to determine the value of a given asset. So these strategies end up being correlated to liquidity in a lot of instances (but not all). This is a generalisation but...it is a very odd thing to have occurring in society...would this exist if investors didn't have an irrational demand for microsecond liquidity? Probably not.
Also, determining whether something is a real signal is just part of statistics, isn't it? This has definitely been an area where there has been quite a lot of innovation, as increases in computational power have made non-parametric stuff more feasible (I am not an expert on this, it is just my understanding).
Btw, I should add I used to work in finance and I have some experience with this kind of thing, as I do quite a bit of "quant investing" but in gambling (it is far easier to just copy what people do in finance and apply it to gambling than come up with it yourself). And just based on my experience, it makes most sense to employ a mixed approach. So learn about business valuation, and then build a five-factor model...watch what it does, and then filter its picks with your knowledge. A lot of quant strategies are vaguely ludicrous if you have an understanding of the fundamentals of investing, like you are trying to use a computer to replicate a human...and people wonder why it doesn't work? It is an overcomplicated shortcut (to give you a concrete example, the blowup of value and funds like AQR was very obvious...you just had to look at the utter garbage stocks they owned). So I think a combination of human and computer beats either separately (one fund that does this is Marshall Wace).
Though it helped that the period was 1967 to 1974. The piranhas were a little slower back then
Case in point: Warren Buffett. All of his process is public knowledge, he constantly writes and talks about it. And yet somehow it didn't make him lose his edge.
Where higher frequency trades do become "self fulfilling" can be in intraday technical analysis. For example, there are many people that follow RSI signals in options. You'll see retail people trade this, and then market makers will step in and bring things back in line - because options don't trade on technicals...
He found success pursuing relative advantages, infrastructure advantages, and building custom tools from scratch.
But absolute vs relative advantages, plumbing together canned solutions vs building your own from scratch, infrastructure-level advantages vs decision making advantages...all of those contrasts exist in other businesses everywhere. None of those are specific to trading.
> “in my experience, nothing beats learning by doing or finding a mentor”
This hits the nail on the head.
The best way to become a profitable trader is with a mentor, but it’s nearly entirely luck. You drive an Uber or tend bar and happen to make friends with someone successful who is willing to guide you. Trying to seek out a mentor online is nearly impossible, as everyone who is findable and willing is almost certainly a better marketer than trader.
The other way to become a profitable trader is to start trading with real money. It’s amazing how quickly one can learn how to mend a boat, when the boat starts sinking.
> This paper examines the elements necessary for a practical and successful computerized horse race handicapping and wagering system. Data requirements, handicapping model development, wagering strategy, and feasibility are addressed. A logit-based technique and a corresponding heuristic measure of improvement are described for combining a fundamental handicapping model with the public's implied probability estimates. The author reports significant positive results in five years of actual implementation of such a system. This result can be interpreted as evidence of inefficiency in pari-mutuel racetrack wagering. This paper aims to emphasize those aspects of computer handicapping which the author has found most important in practical application of such a system
Arguably the paper describes the state of the art from three decades ago, applied to betting on Hong Kong horse races, not market price movements.
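A minimal sketch of what combining a fundamental model with the public's implied probabilities can look like, assuming a multinomial-logit form where each horse's score is a weighted sum of the two log-probabilities. I have not checked this against the paper's exact specification; the weights `alpha` and `beta` are hypothetical (in practice they would be fitted on past races), and the probabilities are invented.

```python
import math

def combine_probabilities(model_probs, public_probs, alpha, beta):
    """Softmax over alpha*log(model) + beta*log(public), one entry per horse."""
    scores = [alpha * math.log(f) + beta * math.log(p)
              for f, p in zip(model_probs, public_probs)]
    top = max(scores)                       # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Invented three-horse race.
model = [0.50, 0.30, 0.20]    # fundamental handicapping model
public = [0.40, 0.40, 0.20]   # implied by the pari-mutuel odds
combined = combine_probabilities(model, public, alpha=0.5, beta=0.5)
print(combined)
```

With `alpha=0` the result collapses to the public's estimate and with `beta=0` to the model's, so the fitted weights say how much the model adds beyond what the odds already know.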
Sports betting is essentially the same thing as proprietary trading in financial markets. The paper gives a good summary of a technique that was very successful in its day.
There is very little publicly available material on quantitative techniques that are useful for proprietary trading. Lo and MacKinlay's "A Non-Random Walk Down Wall Street" was good, but that's 20 years old.
The mathematical literature on gambling is a lot more accessible. It's also probably easier to consistently make at least small money gambling, because the barriers to entry are lower.
Questions like, how do you choose a stoploss? Well, you can pick it statistically based on history or you can use a supervised label. You can even use the stoploss calculated for stock A to pick the stoploss you use on stock B, because you found a condition under which those two stocks became almost identically correlated. How do you want to pick the supervised label? You can do spectral analysis to pick the stoploss too. You can use sentiment as a stoploss, sourced from google news or twitter or stocktwits.
It doesn't have to be, 'well I measured the average profitable stoploss to use over the last 10 years across all stocks and that isn't working so I quit'
Things like that, you get to fit the ideas together and then test them in the real world.
There are some things I would like to share.
1. A good forecast doesn't by itself translate into cash. It has to be paired with a trading strategy. This is probably why the author thinks the answer is RL, because coincidentally if you approach this problem with RL, it does the forecasting + strategy.
2. I have measured a correlation between heavier processing (using a higher big O) and better out of sample performance.
The criticisms with the NN approach like non stationary data have obvious solutions that a 'by the book' trading approach + ml approach don't really teach beginners so they dismiss it.
It is my belief right now that there are people who are prepping data from sources like iextrading then using things like sagemaker to develop good enough forecasting and combining it with a statistics+rules based trading strategy to make living wages.
That said, I have 5k account size for my NN obsessions, and my 401k is 'by the book'.
person_of_color is totally right when he says it is a Moby Dick of programming.
Exactly, this is one of the nice things about RL. You don't have to do a bunch of handwaving to turn your predictions into a strategy.
Any recommendations or hints on where to get started (assuming I’m decent with python/pandas etc)?
1. I would start in the numerai tournament. I did this for 3 years, after my first two years on the market by myself. It's useful because they provide ML-ready data, and you can iterate very quickly. If you do not have ML experience, numerai will teach you about many different types of overfitting and the many correct and incorrect ways to deal with them. An example: some ML people always apply dropout, but when you have a small signal to begin with, dropout can drop out the signal, and then there is only noise left for the model to fit and of course it will then perform poorly. The other thing it will help with is the hopelessness that you will encounter from hitting a wall (hitting a wall is common in ML, and should be expected): the scoreboard shows individuals who have broken through that wall, so you know it is possible. I stopped participating after stabilizing in the top 20 because they change the format of the tournament every so often and I wanted my Saturdays back. You don't need to reach the top 20; I hit a wall around rank 100 back when they used actual bitcoin to pay people. You just need to do well in one of the rounds where everyone else fails, so you can go through the process of 'what did I do that I'm not aware of that made me succeed where everyone else failed'.
2. Read Advances in Financial Machine Learning by Marcos Lopez de Prado. This goes over the false assumptions that outsiders make, and then outlines rookie mistakes (I made many of the mistakes described in the book, then read this book when it came out). It will also break you out of the thinking that leads to typical approaches, and explain why it is unrealistic to expect them to work.
3. Become familiar with retail trader mistakes like overtrading, improper sizing, and emotions, as well as the fact that you cannot rely upon regulating bodies to prevent fraud from occurring; they only act after it has occurred. (This is for scenarios where your model says short this stock, then you see that the stock is fraudulent but it continues to exist.) Learning blackjack probabilities + sizing helps with developing a strategy. Things like, do you want a trading system that has 60% accuracy and 10% profit each time, or one that has 45% accuracy but 200% profit each time? It's interesting because even if you have a 50% accurate 200% profit/50% loss strategy, you still need to calculate the probability of seeing a run of consecutive losses long enough to bankrupt you if you have the wrong size. In college, for me, this was covered in the Discrete Math class.
After steps 1+2+3 I think people who have some level of control over their emotions have the right foundation to code a system. There are people that should not trade because they don't have the right personality profile.
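The losing-streak calculation from step 3 can be sketched like this, assuming trades are independent with a fixed probability of losing. The function name and the example numbers (200 trades, a streak of 6) are illustrative only.

```python
def prob_losing_streak(n_trades, streak_len, loss_prob):
    """Probability of at least one run of `streak_len` straight losses in `n_trades`."""
    # state[j] = probability of currently sitting on j consecutive losses
    # without having hit the full streak yet.
    state = [0.0] * streak_len
    state[0] = 1.0
    hit = 0.0
    for _ in range(n_trades):
        nxt = [0.0] * streak_len
        for j, p in enumerate(state):
            nxt[0] += p * (1 - loss_prob)      # a win resets the streak
            if j + 1 == streak_len:
                hit += p * loss_prob           # this loss completes the streak
            else:
                nxt[j + 1] += p * loss_prob    # streak grows by one
        state = nxt
    return hit

# A 50% accurate system over 200 trades: how likely are 6 losses in a row?
p_six_in_a_row = prob_losing_streak(n_trades=200, streak_len=6, loss_prob=0.5)
print(p_six_in_a_row)
```

If each loss costs half of the stake and you risk a fraction `f` of capital per trade, a streak of `k` losses leaves you with `(1 - 0.5 * f) ** k` of your capital, which is the number to compare against how much drawdown you can survive.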
4. Find a way to fit the data you encounter into a DB. Early on I had to pay 100 a month to get daily csvs for stock data. I wrote code that answered questions for me from the csvs. This was wasted time, because you can write SQL to answer so many questions. Keep this DB on a separate computer from the one you do ML dev on, because the computer that ML dev happens on inevitably gets wiped (it will happen to you).
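As a small illustration of the CSV-to-DB point, here is a sketch using Python's built-in `sqlite3` with an in-memory database. The table name, columns and price rows are all made up; a real setup would point at a database on that separate machine and load the vendor's actual files.

```python
import csv
import io
import sqlite3

# Made-up daily bars standing in for a downloaded CSV file.
CSV_TEXT = """date,symbol,close,volume
2020-01-02,AAA,10.0,1000
2020-01-03,AAA,10.5,1200
2020-01-02,BBB,20.0,500
2020-01-03,BBB,19.0,800
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE daily_bars ("
    "date TEXT, symbol TEXT, close REAL, volume INTEGER, "
    "PRIMARY KEY (date, symbol))"
)

rows = [
    (r["date"], r["symbol"], float(r["close"]), int(r["volume"]))
    for r in csv.DictReader(io.StringIO(CSV_TEXT))
]
conn.executemany("INSERT INTO daily_bars VALUES (?, ?, ?, ?)", rows)

# One SQL query replaces a pile of ad hoc CSV-parsing code.
query = """
SELECT symbol, MAX(close) AS high_close, SUM(volume) AS total_volume
FROM daily_bars
GROUP BY symbol
ORDER BY symbol
"""
summary = conn.execute(query).fetchall()
print(summary)
```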
Then for you it's a matter of just leveraging python+pandas etc to code a solution that meets your criteria. There are three categories that you have to operate across: infrastructure + forecasting + trading strategy. When you see one of your model's predictions come true, it really is a different feeling. But to ease my conscience I should warn you, if you are the curious type and you try this once, you will always be curious about it.
For timeseries data I'm currently using iextrading even though it has downsides (they only have data for trades that route through their exchange). I used to use kibot and alphavantage, scrape yahoo, download stock data csvs from ebay, and save etrade realtime quotes. For placing trades I am currently using alpaca. (I've used IB, etrade, and robinhood's private api before they blocked it.)
What I would require from a trading platform:
1) decentralized and permissionless
2) provably fair trading
By 'provably fair trading' I mean the protocol should be such that it can be proven that traders are not simply held captive by an intermediary, in whatever shape or form. It should also be fair with respect to latency.
For example, consider a trading market where token X can be exchanged for token Y and vice versa. Each holder of X demands her minimum of Y per X, and each holder of Y demands his minimum of X per Y. What if everyone salt-hashed their demands and paid the market contract (proportional to how much they will actually be allowed to trade) to register their salted hash? When the round has closed, people reveal their salt and plaintext, and the incompatible trading offers get their money back (minus a usage fee perhaps). The compatible ones can have their trades go through at the rate of 'total compatible X offered' to 'total compatible Y offered' (or some variation thereof, say rewarding those that helped close the gap). In this way there is no high frequency trading, and you could have a family of such markets operating at different timescales...
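Here is a toy sketch of one round of that scheme, under several assumptions of my own: SHA-256 for the salted hash, a plain text encoding of the offer, and a clearing rule that repeatedly drops offers whose minimum is not met at the ratio of total Y to total X until nothing changes. The names `Offer`, `commit`, `verify` and `clear` are invented, and payments, fees and the on-chain contract are left out entirely.

```python
import hashlib
import secrets
from dataclasses import dataclass

@dataclass(frozen=True)
class Offer:
    side: str        # "X" if offering token X, "Y" if offering token Y
    amount: float    # how much of that token is offered
    min_rate: float  # minimum units of the other token demanded per unit offered

def commit(offer, salt):
    """Salted hash that is registered while the round is open."""
    payload = f"{salt}|{offer.side}|{offer.amount}|{offer.min_rate}".encode()
    return hashlib.sha256(payload).hexdigest()

def verify(commitment, offer, salt):
    """After the round closes, check a revealed offer against its commitment."""
    return commit(offer, salt) == commitment

def clear(offers):
    """Drop unmet offers until the rest all accept the total-Y / total-X rate."""
    live = list(offers)
    while True:
        total_x = sum(o.amount for o in live if o.side == "X")
        total_y = sum(o.amount for o in live if o.side == "Y")
        if total_x == 0 or total_y == 0:
            return None, []                      # nothing left to match
        y_per_x = total_y / total_x
        kept = [o for o in live
                if (o.side == "X" and o.min_rate <= y_per_x)
                or (o.side == "Y" and o.min_rate <= 1 / y_per_x)]
        if len(kept) == len(live):
            return y_per_x, kept
        live = kept

offer_a = Offer("X", 10.0, 1.5)   # wants at least 1.5 Y per X
offer_b = Offer("X", 5.0, 5.0)    # wants at least 5 Y per X
offer_c = Offer("Y", 30.0, 0.3)   # wants at least 0.3 X per Y

salt = secrets.token_hex(16)
commitment = commit(offer_a, salt)
rate, matched = clear([offer_a, offer_b, offer_c])
print(verify(commitment, offer_a, salt), rate, matched)
```

In this example `offer_b` is dropped and the other two clear at 3 Y per X. The iterative rule can also collapse to no trades at all, since removing offers shifts the rate, which is probably where the "some variation thereof" would have to do real work.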
I wonder nevertheless if there's a sweet spot where you can build a simple AI trading algorithm and get modest earnings from it.
In terms of strategies based purely on market data, you are definitely correct. Any publicly (freely/cheaply) available market data is low resolution, lacking the full data from any point in time, and generally based on poor approximations of the actual data (elsewhere in this thread someone mentions that IEX's data is based on trades that get routed through the IEX exchange, which obviously misses any data you could get from the markets that make up 99% of the volume, dark pools, etc.).
I think the "sweet spot" is simply coming up with a strategy that nobody else has thought about, or else executing a better-known strategy more effectively than other market participants. Both are hard, but somewhat in the realm of possibility. The problem is that many people think there's free money to be made without either of these.
We would like to find 1 or 2 more people to work on this project. We need people who can tolerate risk and are skilled at data engineering: data pipelines, psql, pandas, numpy, data visualisation, setting up servers. Ideally they are also skilled at machine learning / deep learning and have tried their hand at trading systems. If interested, my email is in my about info.
I'd love to see the actual data
Another thing to look at is correlation to the market. The less correlated to the market your strategy is, the more valuable it is. This is because investors like uncorrelated strategies. For example, let's say you have n strategies, each with a volatility of sigma and mean return of mu. Allocating all of your money to one strategy or two or all of them won't change your return, it will still be mu. But if the strategies are uncorrelated, and you equal weight each one, your vol will be sigma/sqrt(n) and your return will be mu. This is the essence of Modern Portfolio Theory (MPT): add as many uncorrelated assets as you can.
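The sigma/sqrt(n) claim is easy to check numerically. A minimal sketch, assuming every strategy has the same mu and sigma and every pair has the same correlation; the 8% return and 20% vol are arbitrary example numbers.

```python
import numpy as np

def equal_weight_stats(mu, sigma, n, correlation=0.0):
    """Return and volatility of an equal-weighted mix of n identical strategies."""
    weights = np.full(n, 1.0 / n)
    cov = np.full((n, n), correlation * sigma**2)   # off-diagonal covariances
    np.fill_diagonal(cov, sigma**2)                 # variances on the diagonal
    port_return = float(weights @ np.full(n, mu))
    port_vol = float(np.sqrt(weights @ cov @ weights))
    return port_return, port_vol

ret_uncorr, vol_uncorr = equal_weight_stats(mu=0.08, sigma=0.20, n=4)
ret_corr, vol_corr = equal_weight_stats(mu=0.08, sigma=0.20, n=4, correlation=1.0)
print(ret_uncorr, vol_uncorr)  # return stays at mu, vol drops to sigma / sqrt(4)
print(ret_corr, vol_corr)      # perfectly correlated: no vol reduction at all
```

Same return with half the vol means double the Sharpe ratio, which is why a strategy that is uncorrelated with everything else an investor holds is worth more than its standalone numbers suggest.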
In no particular order, here's a list of things that matter when evaluating an (equity) strategy: turnover, size of alpha being exploited, Sharpe ratio, correlation to market and sector, correlation to style factors (value, momentum, oil, etc), and net exposure (long or short).
Their returns kind of suck, but it's more to do with their trading frequency and correlations than anything else.
It starts off by picking a random market direction (up/down) and places a bid (sorry, I mean makes a trade). Then, based on lots of tuning/backtesting, it decides how long to stay in the position and what the stoploss is. I think in the end the most "profitable settings" were something like:
$proft_size = 0.38%
$stop_loss_size = 0.35%
Win-Continue-Direction = 3 rounds (after winning/losing, do we change direction)
So in the end it was probably a Markov model with a random start, if we had to label it :)
Oh and for fun it would also "martingale for x rounds" :P
Worked quite well for 3-4 days and was fun implementing it while watching "Billions" on TV in the background :D
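A quick sanity check on those settings, assuming the price behaves like a driftless random walk between entry and exit (so direction really is a coin flip). Under that assumption the chance of hitting the take-profit before the stop is stop / (profit + stop), and the function names here are just for illustration.

```python
def win_probability_no_drift(take_profit, stop_loss):
    """Chance of hitting the take-profit first when the price has no drift."""
    return stop_loss / (take_profit + stop_loss)

def expected_return_per_trade(take_profit, stop_loss, win_prob, cost=0.0):
    """Average return per trade given the win probability and a per-trade cost."""
    return win_prob * take_profit - (1 - win_prob) * stop_loss - cost

TAKE_PROFIT = 0.0038   # 0.38%
STOP_LOSS = 0.0035     # 0.35%

p_win = win_probability_no_drift(TAKE_PROFIT, STOP_LOSS)
edge = expected_return_per_trade(TAKE_PROFIT, STOP_LOSS, p_win)
edge_after_costs = expected_return_per_trade(TAKE_PROFIT, STOP_LOSS, p_win, cost=0.0005)
print(p_win, edge, edge_after_costs)
```

The slightly larger target is exactly offset by the slightly lower hit rate, so the expected return is zero before costs and negative after them. A few good days are well within what luck alone produces.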
Is this post about the one with decreasing profits, or a new one that is profitable?
Not only that, but there are many different order types besides "buy at market price, sell at market price". Then there's options, short sales, and more.
It goes deep. People devote 30 years of their career to this. Read the author's experience as a kind of warning, if you will.
[1] You can find a list of NYSE broker/dealers here: https://www.nyse.com/publicdocs/nyse/markets/nyse/members/NY... -- any firm where latency matters will need to be on this list to colo on the exchange.