This project comes from personal experience: my wife and I trying to manage our finances and accounts. The premise is that you import your bank statements, and the application organizes and groups the transactions, through a learning algorithm, into ledgers that you define. It directly mirrors how we manage things in our personal life (ledgers in a cupboard), except that it's digital, scrollable, searchable and graphable. Other features include automatically extracting bank fees (where possible) into their own line items, to give you visibility of where all your money goes.
Development has migrated from a WPF application to a web-based solution, which now also supports importing PDF bank statements in addition to the older CSV and OFX file types. This means the user no longer has to manually download anything from online banking; all the information needed to drive the application should already arrive by email, in the form of bank statements.
I started looking at some new import options for BadaChing. Until now you could import the CSV format available for download on most internet banking sites, as well as the old OFX format, which was popularized by Microsoft Money about 15 years ago and is still offered by the local banks here. The problem is that those statement downloads are only available for the last 3 months, or 150 transactions or so, depending on your bank. So if you miss a month or two, your data is incomplete.
The solution is to read the PDF statements we all get sent by email. I have a label in Gmail with every bank statement ever sent to me, so I should be able to import any of these. All I had to do was read the PDF, which is easier said than done. Whichever library you use (free, paid or otherwise), what you get back is unstructured text. It is impossible to know all the different layouts that transactions are printed in, and some people will receive theirs in different languages! But the proof of concept worked well enough, at least for FNB statements. I hope to make this available soon with two additional features: a complete statement overview before committing the import (because it's unstructured and not guaranteed to be right), and a prompt that asks about duplicate transactions.
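To give a rough idea of what the text-to-transaction step involves, here is a minimal sketch in Python (not the code BadaChing uses). It assumes the PDF library has already produced plain text lines, and the line layout it matches (day, month abbreviation, description, amount) is a made-up example rather than the real FNB format. Anything that doesn't match is kept aside, which is the sort of thing a pre-import overview could show.

```python
import re
from decimal import Decimal

# Hypothetical layout: "01 Nov  Coffee Shop  45.00". Real statements differ per bank.
LINE = re.compile(
    r"^(?P<day>\d{2}) (?P<month>[A-Z][a-z]{2})\s+"
    r"(?P<description>.+?)\s+"
    r"(?P<amount>-?[\d,]+\.\d{2})$"
)

def parse_lines(lines):
    """Split extracted text lines into parsed transactions and unmatched lines."""
    parsed, unmatched = [], []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines entirely
        match = LINE.match(line)
        if match:
            amount = Decimal(match["amount"].replace(",", ""))
            parsed.append((match["day"], match["month"], match["description"], amount))
        else:
            # Headers, footers and anything unexpected end up here for review.
            unmatched.append(line)
    return parsed, unmatched

sample = ["01 Nov  Coffee Shop  45.00", "Page 1 of 3", "02 Nov  Salary  12,000.00"]
transactions, needs_review = parse_lines(sample)
```

A single pattern like this only covers one layout; each bank (and language) would need its own, which is why the import can't be treated as guaranteed.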
Currently only the OFX format supplies unique transaction identifiers, so if the user imports a CSV or PDF, all the program has to go on is the description, date and amount. I had a situation where the same amount was transacted twice at the same vendor on the same date, and BadaChing failed to import that statement correctly. A duplicate check is necessary, though, because CSV and OFX files can be downloaded at any time, meaning two downloads could contain the same transactions. PDF statements are of course based on monthly periods, so if you use these exclusively the check is superfluous.
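One way to handle this, sketched in Python purely as an illustration (the names are hypothetical and this is not necessarily how BadaChing does it), is to count matching transactions instead of only testing whether a match exists. An incoming transaction is flagged as a possible duplicate only while the ledger still holds an unmatched copy with the same date, description and amount, so two genuine identical purchases on one day are not collapsed into one.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from decimal import Decimal

@dataclass(frozen=True)
class Transaction:
    posted: date
    description: str
    amount: Decimal

def match_key(txn):
    # Without a unique id, date + description + amount is all there is to compare.
    return (txn.posted, txn.description.strip().lower(), txn.amount)

def split_new_and_suspect(existing, incoming):
    """Return (new, suspect); suspects are candidates to ask the user about."""
    unmatched_in_ledger = Counter(match_key(t) for t in existing)
    new, suspect = [], []
    for txn in incoming:
        key = match_key(txn)
        if unmatched_in_ledger[key] > 0:
            unmatched_in_ledger[key] -= 1  # pair it with one copy already stored
            suspect.append(txn)
        else:
            new.append(txn)
    return new, suspect

coffee = Transaction(date(2013, 11, 1), "Coffee Shop", Decimal("45.00"))
groceries = Transaction(date(2013, 11, 2), "Groceries", Decimal("310.50"))
new, suspect = split_new_and_suspect(existing=[coffee], incoming=[coffee, coffee, groceries])
```

Here the ledger already holds one coffee purchase and the import contains two, so one is flagged for the user to confirm and the other is treated as new.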
Check back for an update in, oh.. December?
One of the implementation strategies I used in the WPF app was to use the entity classes directly in the viewmodels. The entities come with all the metadata Entity Framework needs to know what to do with them, while the viewmodels simply wrapped the entity's properties without exposing the entity itself to the views. The upshot is that it's a simple step from updating an object to saving those changes back to the database. No mappers are required, and no separate loads are needed to check for existence in the database. You're always in context and you're always in memory. Of course, this app came with a local SQL Compact database, as opposed to running against a hosted API somewhere.
This doesn't work all that well on the web, however. In fact, it doesn't work at all, since you lose context completely whenever a GET or POST request completes. So I've had to update the code significantly to deal with the difference in platform. I've had to introduce mappers (via AutoMapper) to move from entities to viewmodels and back. I've had to introduce a manager layer to better marshal calls into the repository, and of course the repository layer around EF has had to change significantly: during a save it now first looks for an instance of the entity in the context, and only adds it if none is found. All this now also comes with a custom IoC implementation (one that doesn't hinge on a bootstrapper pattern).
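The map-then-find-before-add flow can be sketched roughly as follows. This is a language-neutral illustration in Python with an in-memory dictionary standing in for the database; the class and function names are hypothetical, and it does not use AutoMapper or Entity Framework.

```python
from dataclasses import dataclass

@dataclass
class LedgerEntity:
    id: int
    name: str

@dataclass
class LedgerViewModel:
    id: int
    name: str

def to_entity(view_model):
    # Stand-in for the mapper: copy the fields across by hand.
    return LedgerEntity(id=view_model.id, name=view_model.name)

class LedgerRepository:
    def __init__(self):
        self._store = {}  # stand-in for the tracked context / table, keyed by id

    def save(self, incoming):
        """Update the stored entity if one exists, otherwise add the incoming one."""
        existing = self._store.get(incoming.id)
        if existing is None:
            self._store[incoming.id] = incoming
            return "added"
        existing.name = incoming.name  # copy changes onto the stored instance
        return "updated"

    def get(self, ledger_id):
        return self._store.get(ledger_id)

repo = LedgerRepository()
first = repo.save(to_entity(LedgerViewModel(1, "Groceries")))
second = repo.save(to_entity(LedgerViewModel(1, "Food")))
```

Each web request arrives with a fresh viewmodel and no memory of the previous one, so the lookup inside `save` is what replaces the always-in-context behaviour of the desktop app.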
Why am I doing this? Well, a WPF app locks me into a Windows front-end. It's possible to build that into an RT app or whatever for a phone or Surface, but most people don't want to install stuff anyway. If it's on the web, it's accessible, it scales and I have access to Google Charts (although I was well advanced in producing my own WPF charting library). If it's on the web, it's also easier to show to people, so I figured it's better to combine the website and the app into a single entity.
I've now exceeded the feature set of the old WPF app, but I only completed a first round of CSS and front-end work last night, so there's still a long way to go. I'm not sure whether I'm even going to try to monetize this, release it for public use or just keep it among friends. We'll see how that pans out. First and foremost, this is something I like to work on (which is a primary factor), and I don't want overhead or management that will detract from that joy. Secondly, I'm building this mostly for ourselves to use, instead of manually updating spreadsheets on a continuous basis. Now that this is a web deployment, though, it needs a certain level of finesse before I send the link to someone, because they can send the link to everyone!