So far it’s handled all the weird cases that I see from my own financial history, but I expect there are a few more oddball scenarios (like wire-transfers or refunds) which may require additional tweaking as time goes on.
Last month, I tried to import some bank-account records (QFX/OFX formats) into the “You Need A Budget” accounting software, which involves telling it how to recognize certain transactions as “groceries”, “gas”, etc. This did not go as smoothly as I expected, even for an accounting chore, because many of the payee-name and memo fields had ridiculous values! Manually fixing a lot of scrambled data every month wasn’t what I had in mind when I set out to simplify my budgeting, so I decided to investigate.
<STMTTRN>
  <TRNTYPE>DEBIT
  <DTPOSTED>20140120120000[0:GMT]
  <TRNAMT>-29.41
  <FITID>201401020
  <NAME>SAFEWAY STORE 1234 MYCITY ST
  <MEMO>01/20 Purchase $9.41 Cash Back $
</STMTTRN>
Whatever steam-powered mainframes JP Morgan Chase uses don’t seem to have caught up with the current century: payee fields and memo fields are combined, truncated, arbitrarily split, and whitespace-trimmed, all presumably as a sacrifice to some dark and ancient internal decision that 32 characters (for name) and 32 characters (for memo) were long enough for anybody. Some folks say it’s the data format’s problem, but I disagree: Chase’s datafile says it complies with OFX v1.02, but if you crack open the spec (dated 1997) it clearly says that at least the memo field should support 255 characters, not 32.
Payee and Memo don’t really contain the right thing:
Payee: "Online Payment 1234567890 To Cap" Memo: "ital One Bank"
The rest of the memo would have had my cash-back amount (which might be handy in budgeting software) but is truncated:
Payee: "Grocer & Sons Inc. 12345 Exampl" Memo: "e road 01/18 Purchase $20.11 Cash b"
The split here occurs between two words, but the whitespace was trimmed! There’s no automatic way to know whether this is “Parklane” or “Park lane”:
Payee: "Marios Pizza and Plumbing 5442 Park" Memo: "lane NW"
Right now I have a series of Python classes which:
- Parse the original OFX file (example)
- Translate it into a much-more-convenient XML file with similar structure
- Visit every transaction in the XML file and apply custom logic to fix it up
- Write the XML file back out as OFX
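The steps above can be sketched roughly as follows. This is not my actual code: it skips the intermediate XML file and works on plain dicts, it only understands the SGML-style OFX 1.x layout shown earlier, and the names parse_transactions, render_transaction and rewrite_ofx are made up for illustration.

```python
import re

# One <STMTTRN>...</STMTTRN> block in OFX 1.x (SGML-style) text.
TRANSACTION_RE = re.compile(r"<STMTTRN>(.*?)</STMTTRN>", re.DOTALL)
# A leaf element such as "<NAME>SAFEWAY"; OFX 1.x leaves have no closing tag.
FIELD_RE = re.compile(r"<([A-Z0-9.]+)>([^<\r\n]*)")


def _fields(block):
    """Turn the inside of one transaction block into a dict of tag -> value."""
    return {tag: value.strip() for tag, value in FIELD_RE.findall(block)}


def parse_transactions(ofx_text):
    """Return one dict per transaction found in the OFX text."""
    return [_fields(block) for block in TRANSACTION_RE.findall(ofx_text)]


def render_transaction(fields):
    """Write a dict back out as a transaction block, skipping empty values."""
    lines = ["<STMTTRN>"]
    lines.extend("<%s>%s" % (tag, value) for tag, value in fields.items() if value)
    lines.append("</STMTTRN>")
    return "\n".join(lines)


def rewrite_ofx(ofx_text, visit):
    """Apply visit(fields) to every transaction and return the new OFX text."""
    def replace(match):
        fields = _fields(match.group(1))
        visit(fields)  # the visitor edits the dict in place
        return render_transaction(fields)
    return TRANSACTION_RE.sub(replace, ofx_text)
```

Everything outside the transaction blocks (headers, balances) passes through untouched, which is the main reason for rewriting in place instead of regenerating the whole file.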
So far I’m pretty happy with the result: All I have to do is code logic for a few of the common cases, and run the scripts after I download the OFX files from Chase. Here’s an example of a super-basic statement visitor that just tries to combine Payee and Memo.
def visitStatement(self, values):
    name = values.get("NAME", "")
    memo = values.get("MEMO", "")
    if len(name) < 32:
        # When the split occurred, there was whitespace which got trimmed, re-add it
        combined = name + " " + memo
    elif len(name) == 32:
        # The split was forced due to some size limit, and we don't really know if
        # there's a space between them or not...
        combined = name + memo
    else:
        # TODO warning, larger than ever expected; join as-is so "combined" is always set
        combined = name + memo
    values["NAME"] = combined
    values["MEMO"] = ""  # No more memo data, it's all inside Payee
From this humble beginning I can branch out into recognizing common patterns (like transfers) and payees and clean up the data appropriately. After generating a new QFX file, the YNAB software seems to handle the longer payee names just fine.
The biggest problem left is that the data still isn’t clean enough: Anything over 64 characters has been lost, and it’s not always clear if I need to reinsert whitespace between Payee and Memo. Fortunately, Chase does offer a CSV download, which isn’t as useful for importing into accounting applications but does contain the entire original. I just need to find some way to cross-reference between the two, perhaps based on dates, amounts, and some sort of non-whitespace similarity.
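Here is a minimal sketch of what that cross-referencing might look like. I haven't tried it against real downloads, so treat everything as an assumption: the dict keys "date", "amount" and "description" are hypothetical, the OFX "description" is the Payee and Memo simply glued together, and I'm assuming the CSV description starts with the same characters as the OFX fields.

```python
import re
from decimal import Decimal


def squash(text):
    """Drop whitespace and case so split or trimmed text still compares equal."""
    return re.sub(r"\s+", "", text).lower()


def find_csv_match(ofx_txn, csv_rows):
    """Return the single CSV row matching an OFX transaction, or None.

    Both sides are dicts with hypothetical keys "date" (YYYYMMDD),
    "amount" (a decimal string) and "description".
    """
    fragment = squash(ofx_txn["description"])
    candidates = [
        row for row in csv_rows
        if row["date"] == ofx_txn["date"]
        and Decimal(row["amount"]) == Decimal(ofx_txn["amount"])
        # The OFX text may be truncated, so only require a shared prefix.
        and squash(row["description"]).startswith(fragment)
    ]
    # Only trust an unambiguous match; otherwise leave it for a human.
    return candidates[0] if len(candidates) == 1 else None
```

Comparing with whitespace removed sidesteps the “Parklane” problem entirely: once a row matches, the CSV description can replace the guessed Payee wholesale.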
Once I have things a little more polished I plan to put them up on GitHub, but at the moment there are still a lot of hardcoded data-file paths and stuff.
Sorry, dear readers. Or reader, more likely. I’ve been slacking off for the last few months and now my self-guilt compels me to update.
My work on a branch of PackBSP to handle more game-engines didn’t go so well. I painted myself into some of the same “massive rewrite” corners I vowed to avoid, and then spent so long doing other things that it’s hard to pick up again. On the other hand, I think I’ve worked my way through some architectural problems, and the silver-lining of having no other code-contributors is that I don’t need to worry about backwards-compatibility very much.
Currently my interest is in the NetBeans Platform, and how I might be able to use it to streamline PackBSP and break away from the limitations of a wizard-centric interface. Actually, my daydream is to create a bunch of NetBeans plug-ins that turn the NetBeans IDE into a Source-engine-related powerhouse, but with the VIDE making a surprise return from the dead, perhaps that niche is already well on its way towards being filled.
Lastly, as a matter of interest, I temporarily fixed my video card with the “oven trick” (8 minutes at 375 Fahrenheit) but a week or two later it failed again, so I may just put it up on Craigslist as a challenge to anyone who thinks they can make it stick.
Well, it seems my graphics card (which has been limping along for years with intermittent glitches under load) has finally given up the ghost and my computer will no longer boot. Fortunately, this occurred after I finished wringing many hours of enjoyment and completionism from Deus Ex: Human Revolution, or else I would feel incredibly annoyed at the interruption. (Aside: It’s a good game. A worthy successor to Deus Ex.)
Since the card (an Nvidia 8800GT variant) is still decently-powerful and has no obvious damage, I’m going to see what I can do under the limited-lifetime warranty. Compared to its earlier foibles, RMAing now ought to be unambiguous and straightforward, given that the computer now refuses to even POST if the card is present.
I’m having second-thoughts on how to manage the dependency graph(s). I’ve been experimenting with a “directed multigraph”, but I worry that it adds too much complexity when it comes to determining what portions of it are connected when only a certain type of edge is considered, and whether I’m over-complicating things. Actually, I’m pretty sure I am over-complicating things, but freedom to experiment is part of what makes an independent project fun.
PackBSP’s profile-loading (a nested Spring ApplicationContext, really) is well underway, but I’m adding another level of indirection (some data-holding classes) to avoid too tight of an integration with certain HL2Parse innards. My main goal right now is to get a build which reproduces most of the current dependency-crawling features, and then worry later about how components are going to merge/override specific configuration data, like recognized shader parameters.
This will also mean integrating and testing the new dependency-graph classes, which unlike the old version are able to model crawling multiple maps at once, where each map may potentially have its own copy of a named asset in its pakfile. This is done by allowing multiple edges in the graph (directed acyclic multigraph) where each edge corresponds to the context of a particular map. In this way most nodes will be shared across maps (avoiding duplication of effort) but differences can still be modeled by having certain nodes only connected through map-specific edges.
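To make the idea concrete, here is a small sketch in Python (PackBSP itself is Java, and this is not its real class). The name DependencyMultigraph, its methods, and the asset paths are all invented for illustration; the only thing taken from the design is that every edge carries the map it belongs to.

```python
from collections import defaultdict


class DependencyMultigraph:
    """Directed multigraph whose edges are labelled with a map context."""

    def __init__(self):
        # source node -> set of (target node, map name) pairs
        self._edges = defaultdict(set)

    def add_edge(self, source, target, map_name):
        """Record that source depends on target while crawling map_name."""
        self._edges[source].add((target, map_name))

    def reachable(self, start, map_name):
        """Return every node reachable from start using only map_name's edges."""
        seen = set()
        stack = [start]
        while stack:
            node = stack.pop()
            for target, context in self._edges.get(node, ()):
                if context == map_name and target not in seen:
                    seen.add(target)
                    stack.append(target)
        return seen


graph = DependencyMultigraph()
# Both maps share the material node...
graph.add_edge("map_a.bsp", "materials/wall.vmt", "map_a")
graph.add_edge("map_b.bsp", "materials/wall.vmt", "map_b")
# ...but each map resolves it to a different texture.
graph.add_edge("materials/wall.vmt", "materials/wall.vtf", "map_a")
graph.add_edge("materials/wall.vmt", "materials/wall_custom.vtf", "map_b")
```

Asking for graph.reachable("map_a.bsp", "map_a") walks only map_a's edges, so the shared material node is visited but map_b's texture never shows up in the result.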
It’s still very much in the prototyping phase, but I think I have a way for PackBSP to hit the right mix of reliability and customization it may need long-term. Key word? “Profiles”.
Well, Source-descended games have me stumped. There’s a bunch of variation in behavior between them (consider L4D2 campaign creation versus Portal 2’s map transitions) and I’m trying to find the right technique to make PackBSP work differently depending on what the user has selected… even assuming I can correctly identify the features of what they picked.
Sometimes that means little things like specifically looking for certain entities that aren’t listed in the FGD files. Other times it almost means changing what GUI a user will see. Creating completely different “editions” of PackBSP seems like it would lead to its own difficulties keeping everything updated, so I’m hoping to find some solution with the Spring framework. I considered the Apache Commons-Configuration library for a while, but I really do need the ability to swap out different code and wiring as opposed to key/value settings.
This gets a little weirder when you consider how to handle new mods. Should it limp along, treating “Mod X” as if it were the same as default Half Life? How easy would it be to add support for Mod X? How would the program recognize it if the authors did a revamp and released “Mod X 1.5”?
Perhaps the best solution is to mostly divorce the detection of games from the identification of games. So the user sees that they can pick “Zombie Shooterz” as a game, but it is up to them to tell PackBSP to use the “Half Life 2 Episode 2” setting, or the “L4D2” setting, depending on what code-base the game is built from.
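As a rough illustration of that split (in Python for brevity, and entirely hypothetical: the Profile fields, the PROFILES table and profile_for are invented here, not anything PackBSP has today):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """Behaviour switches for one engine code-base (fields are hypothetical)."""
    name: str
    packs_campaigns: bool


PROFILES = {
    "Half Life 2 Episode 2": Profile("Half Life 2 Episode 2", packs_campaigns=False),
    "L4D2": Profile("L4D2", packs_campaigns=True),
}


def profile_for(detected_game, user_assignments):
    """Look up the profile the user assigned to a detected game.

    Returns None when the user has not chosen one yet, so the caller can ask
    instead of guessing from the game's name or version.
    """
    profile_name = user_assignments.get(detected_game)
    return PROFILES.get(profile_name)
```

Detection only has to produce a label like “Zombie Shooterz”; a revamp to “Mod X 1.5” just becomes a new label waiting for the user to assign it a profile.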
It seems that Steam can sometimes hold an exclusive lock on the clientregistry.blob file, which keeps anybody else from reading it in just about any way. This is an inconvenience since it means PackBSP cannot read the data it needs about game-assets. I’ve been looking into using JNI with the Microsoft Volume Shadow Copy Service, but it’s quite a bit out of my comfort zone in C/C++, and there may also be licensing issues that would make it very difficult to put the functionality into an open-source project.
It might be easier to lobby Valve for a workaround, like having Steam periodically attempt to dump a copy to some other filename.
In the meantime, simply turn off Steam, start PackBSP, choose the game you want to pack for, and then restart Steam before you reach the actual altering-the-BSP step.
Still alive, working on PackBSP when not creatively-drained from work. I may end up skipping past a few version numbers to represent all the rewriting going on. Among other initiatives:
- Some sort of system to handle packing multiple maps at once, in order to support the VPK/campaign workflow.
- Per-game config files, allowing various features to be toggled on and off.
- Foreign-language support. It’ll be possible for non-programmers to contribute translations.
- Ability to manually create a per-map config file that specifies assets to include/exclude.
On a lighter note, I went ahead and pre-ordered Deus Ex: Human Revolution, mainly on the strength of the favorable comments from everyone who downloaded the leaked press-beta, but unfortunately it won’t be out until August. (And let’s face it, with a title like “I Wanted Orange” I pretty much had to.)