An automated solution that reconciles statements based on LLM matches removes transparency into how your books are prepared and can create a false sense of trust in them. In an audit, people will be in real trouble when their answer is 'yeah, the AI booked everything.'
I think there is an opportunity here, but I don't see it ending well until accountability for your accounts is properly addressed.
The way you present the case in this comment is vastly better than in the post.
There are genuinely ambiguous cases in certain accounting treatments, and hinting at a resolution... well, it still makes me nervous, but it's not crazy.
I think this is a niche issue related to potential accounts receivable errors, as most companies I work with don't have the problem of constantly repeating transaction amounts.
We look at the data we have, and if it's sufficient we can "automatically" reconcile it - i.e. suggest a match with 100% certainty that the user(s) can then confirm. Otherwise we make informed suggestions based on all of the available data from the transaction(s), and sometimes the suggestion is a list of possible matches.
IME the biggest problems in recon are the edge cases around failures in the banking system or the flow that are very difficult to code around and require manual intervention:
* failures of payments X days after they have been reconciled, so now you have to pull things apart again
* the bank reverses transactions but then puts them back, and this appears in their intraday statements (MT942 files, for example) but doesn't show on their online portals, leading to "duplicates" in one system that aren't really duplicates
* statement and reference data is incomplete or just wrong (who knew that free-text fields could be problematic?)
* amounts simply don't match because you invoiced for X and were paid Y: payments are split up to get around constraints, amounts are rounded up, etc.
We deal with these every single day, and we are automating what we can - but you're always going to need a human to confirm the final step in these cases. Perhaps an LLM can improve suggestions, but when the data is just wrong or missing then I'm not so sure.
Similarly, matching invoices to purchase orders and authorising payments. This catches fraud and avoids paying for goods you didn't receive ... but it's another necessary evil rather than a value-adding differentiator for the business. So companies exist to take your PDF invoices and your ERP's "we recorded that we got x, y, z" and match them up and authorise payment for unexceptional invoices (we wanted 5, you said you sent 5, we said we got 5, let's pay you for 5).
It’s at least arguable that this task is the oldest documented use of writing, and from double-entry accounting to price/time precedence in modern market microstructure, we have algorithms that align very well with human intuition.
I can think of few cases where gratuitous application of even simple statistical methods would cause more harm than this one.
With all respect, the conversation becomes stupider with every post like this one.
A few weeks ago, I decided, what the hell, and I spent two days writing a ChatGPT-powered reconciler assistant.
It’s so damn accurate. By feeding it relevant examples, it suggests the right journal entry for each bank transaction nearly every time, including saying “no matching entry” for when the corresponding journal entry hasn’t been posted yet.
It would have taken me a lot longer to write an if/elseif/else-based reconciler, and it would have required a lot of manual attention… and the constant internal debate of whether the rules are code that should go in Git or data that should go in the DB.
I think ML models are a great fit for transaction reconciliation because they give good-enough results really fast at a reasonable price. I’d prefer that over continuing to spend my own time, or having to learn the more advanced algorithms you mentioned.
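For contrast, the rules-in-code-or-rules-in-DB alternative mentioned above might look like plain rule records (field names, descriptions, and account codes here are hypothetical, purely to illustrate the idea):

```python
# Hypothetical rules-as-data matcher: each rule is a plain record that could
# live in a DB table just as easily as in this list.
RULES = [
    {"desc_contains": "PAYROLL", "account": "6000-Salaries"},
    {"desc_contains": "AWS", "account": "6500-Hosting"},
]

def suggest_account(txn_description):
    """Return the first matching rule's account, or None for 'no match yet'."""
    for rule in RULES:
        if rule["desc_contains"] in txn_description.upper():
            return rule["account"]
    return None
```

The upside is that every suggestion is traceable to a rule; the downside is exactly the maintenance burden described above.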
Something is better than nothing.
Just get yourself a solver.
And if you wanna solve billion item sized problems hire an OR scientist to write a decomposed algo.
Literally after 2 minutes of search: https://arxiv.org/pdf/2002.00352.pdf
I wish I had better skills at searching academic papers for problems I'm trying to solve or that I'm just thinking about. I think just as there are some people that google better than others, I imagine a similar skill applies to academic papers. Anyone encounter this? How do I get better at it?
I thought at first it was an accessibility problem, and perhaps it still is. In that, I didn't have access to a library of academic papers. But, arxiv.org does make available a lot of content for free. The content seems to be growing too.
Another question I'm exploring is how do I decide which journals to subscribe to. I have a limited budget so have to pick wisely. What makes things difficult is that the papers that I have found interesting in the past, seemingly in a related field, are still published to various journals.
One more random comment. I really can't wait until LLMs are applied towards academic papers. Academic papers build on top of each other, and there are concepts that are considered "common knowledge" to experts and may require consuming a long history of papers to build a foundation of concepts and vocabulary. The difficulty is that, recursively, those papers introduce the same problem. A lot of the time the concepts are not that difficult, and it would be wonderful if an LLM could fill the gaps as if I were talking to an expert.
I guess there are sort of expository papers that act as a checkpoint for a particular topic. I'm not sure how to find these.
In fact, there are identical problems that are solved by different communities and you would not know because they use completely different lingo. Math optimization/dynamic programming/reinforcement learning is one of these.
Many accomplished scientists simply read papers from other domains and adapt them to their own, making huge progress.
So yes I see tremendous value to what you describe. A Google translate for academic work that can translate between domain specific lingos and common language.
do NOT subscribe to any academic journals. not worth it for an individual. find a way to get access for free, such as through a local library or institution, or by other means. also note that often google scholar gives a link to a PDF over on the right, or in alternate versions of the article.
there are services like perplexity.ai that can search arxiv, pull articles, and feed them through an LLM for you -- it's pretty much what you want. some of the LLM chat interfaces let you upload PDFs too. none of this actually works that well yet but sometimes useful.
That said:
> lol at ai for solving deterministic knapsacks.
> Just get yourself a solver.
I don't think these are necessarily in conflict. "Write me some Z3 code to solve this knapsack problem, then run it and tell me what the output means" seems like it might actually be in the right realm. The LLM isn't doing the solving, which makes sense because I agree there's no mechanism by which an LLM would be better at it than a solver, but as a UX to the solver it seems like it'd do okay. That's genuinely value added; I don't expect most accountants, or even most programmers, to be familiar with Z3.
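The division of labor is easy to see with a plain solver in hand. As a minimal stdlib sketch (no Z3 dependency; item names, weights, and values are made up), a 0/1 knapsack by dynamic programming:

```python
def knapsack(items, capacity):
    """0/1 knapsack via dynamic programming over integer weights.
    items: list of (name, weight, value); returns (best_value, chosen_names)."""
    best = [(0, [])] * (capacity + 1)  # best[w] = best (value, picks) within weight w
    for name, weight, value in items:
        # iterate weights downward so each item is used at most once
        for w in range(capacity, weight - 1, -1):
            cand_value = best[w - weight][0] + value
            if cand_value > best[w][0]:
                best[w] = (cand_value, best[w - weight][1] + [name])
    return best[capacity]

value, chosen = knapsack([("a", 3, 10), ("b", 4, 12), ("c", 2, 7)], capacity=6)
```

The LLM's role would be translating a messy business description into the `items`/`capacity` inputs and explaining the output, not replacing the solve itself.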
For example, let's say I ask you to sell 10 shares of Google if the price goes over $140 (this is typically called a limit order). Now your bank comes back and says they sold 2 shares at $140.02, 7 shares at $140.03, and 1 share at $139.77. Did they satisfy their obligation?
The answer is yes, but it's difficult to determine that, and you can't use exact math to do it easily. You expected $1400 from that sale, but you got $1400.02. Now do it again, but you have half a dozen orders at different prices. That's where it turns into the knapsack problem.
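The single-order version of that check can be sketched as a brute-force subset search within a rounding tolerance (fill data taken from the example above; the tolerance value is my own assumption):

```python
from itertools import combinations

def find_matching_fills(fills, expected, tol=0.05):
    """Brute-force subset search: return the first subset of (qty, price)
    fills whose total proceeds land within `tol` of the expected amount."""
    for r in range(1, len(fills) + 1):
        for combo in combinations(fills, r):
            total = sum(qty * price for qty, price in combo)
            if abs(total - expected) <= tol:
                return combo
    return None

fills = [(2, 140.02), (7, 140.03), (1, 139.77)]
match = find_matching_fills(fills, expected=1400.00)
```

With half a dozen orders sharing one pool of fills, this subset search is exactly where the combinatorial blow-up (and the knapsack comparison) comes from.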
The problem is severely compounded when you look at why you're reconciling (it's to make sure your assets changed the way you expected, and fix things when it didn't). Often banks will drop a transaction, or add an extra one (these systems are annoyingly manual, and subject to error). How do you find the exact error and track it down? Especially when the trade happened, but you don't have the actual record of it, and your records show that it didn't.
If you buy something for $0.99 today, they won't bill you immediately. If you buy another $0.99 item tomorrow, you'll get a consolidated charge for $1.98. You would need to do something like this to link the $1.98 to your app/song purchases.
The reason for this is that credit card processing often costs a flat service charge + a percentage of the bill: Stripe is 30 cents + 2.9% right now. The flat portion dominates for small charges, so you'd want to combine them if at all possible. (Apple certainly gets a better rate...but also has a scale where small savings add up).
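A rough illustration with the Stripe numbers above (30 cents + 2.9%, ignoring any other fees, and assuming the two $0.99 items would otherwise be billed separately):

```python
def card_fee(amount, flat=0.30, pct=0.029):
    """Approximate processor fee: flat charge plus a percentage of the bill."""
    return flat + pct * amount

separate = 2 * card_fee(0.99)        # two $0.99 charges billed individually
combined = card_fee(0.99 + 0.99)     # one consolidated $1.98 charge
savings = separate - combined        # consolidation saves one flat fee: $0.30
```

The percentage portion is the same either way; consolidation only eliminates the extra flat charge, which is why it matters most for sub-dollar purchases.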
In this example we have Apple’s charge (receipts) and the consumer’s bank withdrawals (statements). This example gives you an idea of a consumer’s purchases, which are simple to reconcile.
The post is about the Business to Business situation, which deals with greater volume and therefore more complex problems. The OP uses a toy example. If you’ve done financial reconciliation, then you will recognize the problem behind the trivial example.
This is something I'm familiar with! I refined the above algorithm to reconcile statements according to the statement end balance. It works well, and yes it's knapsack.
The reconciliation system that I work on tries to match data from different systems to make sure they agree, e.g. matching the Visa transaction file against what our system has internally recorded in its ledger. There is a unique key to match both records, so no math is involved.
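Under that assumption (a shared unique key; the `txn_id` field name is hypothetical), the matching reduces to set operations:

```python
def reconcile_by_key(visa_rows, ledger_rows, key="txn_id"):
    """Match two record sets on a unique key.
    Returns (matched_keys, visa_only_keys, ledger_only_keys)."""
    visa = {r[key]: r for r in visa_rows}
    ledger = {r[key]: r for r in ledger_rows}
    matched = visa.keys() & ledger.keys()
    return matched, visa.keys() - ledger.keys(), ledger.keys() - visa.keys()
```

The two "only" sets are the interesting output: those are the breaks a human has to chase down.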
Actually no. This post is about doing something akin to first in first out accounting to match the payments to the invoices.
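A minimal sketch of that FIFO-style application of payments to open invoices, using integer cents to sidestep float issues (invoice IDs and amounts are made up):

```python
from collections import deque

def fifo_apply(invoices, payments):
    """Apply payments to invoices oldest-first.
    invoices: list of (invoice_id, amount_cents) in chronological order.
    payments: list of payment amounts in cents.
    Returns (allocations, still_open_invoices)."""
    open_invoices = deque(invoices)
    allocations = []
    for pay in payments:
        while pay > 0 and open_invoices:
            inv_id, due = open_invoices[0]
            applied = min(pay, due)
            allocations.append((inv_id, applied))
            pay -= applied
            due -= applied
            if due == 0:
                open_invoices.popleft()      # invoice fully paid
            else:
                open_invoices[0] = (inv_id, due)  # partial payment
    return allocations, list(open_invoices)
```

This is deterministic and auditable, which is the point: each payment's allocation can be replayed and explained.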
I wonder why they aren't simply using virtual IBANs...
I really want an automatic inventory pickup system based on it
The way it's described, it's not a knapsack problem at all. The knapsack problem is to maximize the total value of the items you fit into the container.
In reconciliation, you presumably want to get the best matching between transactions, which is not defined here, and in any case is a completely different problem.
Ignoring the knapsack comparison, the article doesn't describe why you'd want to check each possible combination. Assuming the individual amounts are correct, you can handle each batch separately - no need to check every combination within one batch against every combination of a different batch. (And if you drop that assumption, it still won't be a sensible thing to do.)
I can imagine you can have a "scoring" algorithm that gives a confidence score for a match - then if you check every combination, you can pick the combination with the best overall score. But the article doesn't actually describe anything like that.
It also doesn't describe any alternatives to "AI". For example, what about a greedy algorithm? What about alternative methods to do address comparisons? I'm sure there are issues with those, but none of that is described here.
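To make that concrete, a toy greedy matcher with an invented scoring function (the article defines none; the field names, weights, and threshold below are all my own assumptions):

```python
from datetime import date

def score(txn, entry):
    """Toy confidence score: an exact amount match dominates, and
    closer posting dates add a small bonus."""
    s = 1.0 if txn["amount"] == entry["amount"] else 0.0
    days_apart = abs((txn["date"] - entry["date"]).days)
    s += max(0.0, 1.0 - days_apart / 30) * 0.5
    return s

def greedy_match(txns, entries, threshold=1.0):
    """Greedily pair each transaction with its best-scoring unmatched entry."""
    unmatched = list(entries)
    pairs = []
    for t in txns:
        if not unmatched:
            break
        best = max(unmatched, key=lambda e: score(t, e))
        if score(t, best) >= threshold:
            pairs.append((t["id"], best["id"]))
            unmatched.remove(best)
    return pairs
```

A greedy pass like this can lock in a locally good pair that blocks a globally better assignment, which is a real limitation, but it's the kind of baseline the article should be comparing "AI" against.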
Combinatorics: Check
Algorithms: Check
Real world problem: Check
Crowbarred connection between 1-3 to show how your AI algorithm is better? Check
Disclaimer: OpenEnvoy provides real-time auditing & reconciliation solutions but in front of the ERP so we don't directly compete with MT.