I Wanted Orange (The machine would not make a mistake.)


Python script: Easily import screenshots to Steam

Recently I wanted to upload some very old TF2 screenshots with Steam's screenshot-tool, only to realize that it isn't as simple as dropping them into a folder. You've got to rename them correctly, generate thumbnails, and (sometimes) restart the Steam-client so it sees them. Rather than manually making dozens of specially-named thumbnails, I decided to automate it. Fortunately I already had the Python Imaging Library installed, which makes the thumbnail-generation easy.

I've uploaded the script to this GitHub gist.

Assuming you have Python and PIL, just drop the script into the Steam screenshots folder you want to populate, for example C:\Program Files\Steam\userdata\{userid}\780\remote\{gameid}\screenshots\. Then you can just drag-and-drop image files onto it (or pass them as command-line arguments) and it will do the rest. (780 is the app-id for the Steam screenshot tool.)

Steam Screenshot Migrator

Once the new files exist, you may need to restart Steam for it to notice. Unfortunately, all the effort of preserving the original date-information doesn't seem to matter after uploading: only an "uploaded on" date is visible through Steam's website. Oh well, at least my local records are in order.
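For illustration, here is a rough Python 3 sketch of just the renaming step. The `YYYYMMDDHHMMSS_N.jpg` pattern and the `thumbnails` subfolder are assumptions about what Steam expects, not details taken from the gist, and `plan_import` is a hypothetical helper. It only works out the destination names from each file's modification time; the copying and the PIL thumbnail generation are left out.

```python
import os
import time

def plan_import(paths, dest_dir):
    """Map each source image to a (screenshot, thumbnail) destination pair.

    Assumed naming scheme: <modification time as YYYYMMDDHHMMSS>_<n>.jpg,
    with the thumbnail under dest_dir/thumbnails using the same file name.
    """
    plan = {}
    used = set()
    for path in sorted(paths, key=os.path.getmtime):
        stamp = time.strftime("%Y%m%d%H%M%S", time.localtime(os.path.getmtime(path)))
        n = 1
        while (stamp, n) in used:  # files from the same second get _1, _2, ...
            n += 1
        used.add((stamp, n))
        name = "%s_%d.jpg" % (stamp, n)
        plan[path] = (os.path.join(dest_dir, name),
                      os.path.join(dest_dir, "thumbnails", name))
    return plan
```

Deriving the name from the modification time is what keeps the local records in their original order, even though the website ignores it.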


Spaceballs: The Python Script

"Before you die, there is something you should know about us, Lone-Star..."

def unfoldFilter(unfoldFunction, filterFunction, iterable):
    for item in iterable:           
        for result in unfoldFunction(item):            
            if filterFunction is None or filterFunction(result):
                yield result

assert darkHelmet in unfoldFilter(lambda p: p.roommates, lambda p: p.isFormer(), 
    unfoldFilter(lambda p: p.children, None, 
        unfoldFilter(lambda p: p.siblings, None, 
            unfoldFilter(lambda p: p.parents, None, 
                unfoldFilter(lambda p: p.children, lambda p: p.isMale(), 
                    unfoldFilter(lambda p: p.siblings, None, 
                        unfoldFilter(lambda p: p.siblings, lambda p: p.isMale(), 
                            unfoldFilter(lambda p: p.parents, lambda p: p.isMale(), [loneStar]))))))))

assert makesThem(darkHelmet, loneStar) is None

For those who don't get the joke, the movie "Spaceballs" parodied itself with a series of whimsical merchandising options and contained a gag referencing Star Wars' iconic "I am your father", except... well, a little more complex.


Uncertain Base64 – Overwatch Puzzle

As I mentioned in yesterday's post, there's an Overwatch graphic of some base64 text with an unclear font... But what's the point of writing something to solve a puzzle when you aren't sure you have the right question?
Overwatch Base64 Issues
So I made this little function which takes a base64-string and generates visually-similar versions:

import itertools
import binascii

def B64Variants(val, confusions = None):
    if confusions is None:
        confusions = ["Il1","O0"] # Use defaults
    val = val.translate(None, " \n\t") # Remove whitespace
    sections = []
    for char in val:
        confused = False
        for group in confusions:
            if char in group:
                confused = True
                sections.append(list(group)) # Any member of the group could be meant here
                break
        if not confused:
            if (len(sections) > 0) and isinstance(sections[-1],str):
                sections[-1] = sections[-1] + char # Extend the current unambiguous run
            else:
                sections.append(char) # Start a new unambiguous run
    # Tidy up so we have a consistent list-of-lists
    for i,v in enumerate(sections):
        if isinstance(v,str):
            sections[i] = [v]
    for varBits in itertools.product(*sections):
        yield "".join(varBits)

if __name__ == "__main__":
    sample = binascii.b2a_base64("Hello, World!")
    print "given", sample
    for var in B64Variants(sample):
        print "maybe", var    

Example output:

given SGVsbG8sIFdvcmxkIQ==

maybe SGVsbG8sIFdvcmxkIQ==
maybe SGVsbG8sIFdvcmxklQ==
maybe SGVsbG8sIFdvcmxk1Q==
maybe SGVsbG8slFdvcmxkIQ==
maybe SGVsbG8slFdvcmxklQ==
maybe SGVsbG8slFdvcmxk1Q==
maybe SGVsbG8s1FdvcmxkIQ==
maybe SGVsbG8s1FdvcmxklQ==
maybe SGVsbG8s1Fdvcmxk1Q==

I think the next step is to try hooking this up to code which tries to decode the cipher-stream byte by byte, skipping to a new password (or a new Base64 source-string) when the decoded byte goes outside conventional ASCII. (I'm gambling that it'll contain an ASCII message; if it doesn't, then it's hard to know if you've successfully cracked it.)
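A minimal Python 3 sketch of that early-exit idea, assuming the cipher really is RC4. It uses a hand-rolled RC4 keystream generator so that nothing beyond the standard library is needed. `rc4_keystream` and `decrypt_while_printable` are names made up for this example, and the 32..126 range is just one reading of "conventional ASCII".

```python
def rc4_keystream(key):
    """Yield RC4 keystream bytes for `key` (a bytes object), one at a time."""
    s = list(range(256))
    j = 0
    for i in range(256):  # key-scheduling pass
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    i = j = 0
    while True:  # output generation, lazily
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        yield s[(s[i] + s[j]) % 256]

def decrypt_while_printable(key, ciphertext):
    """Return the plaintext, or None as soon as a byte falls outside 32..126."""
    out = bytearray()
    for c, k in zip(ciphertext, rc4_keystream(key)):
        p = c ^ k
        if p < 32 or p > 126:
            return None  # give up early; the remaining bytes are never decrypted
        out.append(p)
    return bytes(out)

# Widely published RC4 test vector: key "Key", plaintext "Plaintext"
print(decrypt_while_printable(b"Key", bytes.fromhex("BBF316E8D940AF0AD3")))
```

Because the keystream is produced lazily, a wrong guess usually costs only a byte or two of work after the key setup.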

Finally, what happens if we apply this to one of the transcriptions people have made from the Overwatch Summer Games video?


78,732 variations in total. Uh oh. I'm happy with the output, but whatever I use to go through these, it needs to be able to memoize or somehow reuse whatever progress it can from variant-to-variant, rather than starting over with each new string.


A more-realistic number would be 6,561, since we know that the first 1 is good because it's part of the OpenSSL header, and because the letter O is visibly wider than zero.
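Both totals can be checked without generating a single variant, since the count is just the product of the group sizes at each ambiguous position. The sketch below is Python 3; `count_variants` is a made-up helper, and the transcription is the one quoted in the earlier decryption post, with its spaces removed.

```python
def count_variants(text, confusions=("Il1", "O0")):
    """Count the look-alike spellings of `text` without generating them."""
    total = 1
    for ch in "".join(text.split()):  # ignore whitespace
        for group in confusions:
            if ch in group:
                total *= len(group)  # each ambiguous character multiplies the count
                break
    return total

transcription = ("U2FsdGVkX1+vupppZksvRf5pq5g5XjFRlipRkwB0K1Y96Qsv2L"
                 "m+31cmzaAILwytX/z66ZVWEQM/ccf1g+9m5Ubu1+sit+A9cenD"
                 "xxqklaxbm4cMeh2oKhqlHhdaBKOi6XX2XDWpa6+P5o9MQw==")

print(count_variants(transcription))  # 78732 = 3**9 * 2**2

# Trust the "1" in the header (index 9) and stop treating O/0 as ambiguous.
print(count_variants(transcription[10:], confusions=("Il1",)))  # 6561 = 3**8
```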


Experiments in decryption – Overwatch puzzle

As part of its "Summer Games" update, Overwatch put an easter-egg into their video, embedding some base64-encoded text into a scene.
Overwatch Base64 hint


Unfortunately there may be errors since the font doesn't clearly distinguish between certain characters such as 1/l/I, but one interpretation comes out to binary data like:

>>> binascii.b2a_qp(binascii.a2b_base64("U2FsdGVkX1+vupppZksvRf5pq5g5XjFRlipRkwB0K1Y96Qsv2L m+31cmzaAILwytX/z66ZVWEQM/ccf1g+9m5Ubu1+sit+A9cenD xxqklaxbm4cMeh2oKhqlHhdaBKOi6XX2XDWpa6+P5o9MQw=="))

This seems to be from an OpenSSL command-line encryption utility, since it begins with the bytes for "Salted__". I'm really no crypto expert, but this is interesting. Maybe I can at least run a dictionary-attack against it?
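A quick way to check the header and look at the sizes involved, as a Python 3 sketch. `split_salted` is a hypothetical helper. It assumes the layout used later in this post: 8 bytes of "Salted__", then 8 bytes of salt, then the ciphertext.

```python
import base64

def split_salted(b64_text):
    """Decode base64 text and split it into (header, salt, body)."""
    raw = base64.b64decode("".join(b64_text.split()))  # drop spaces/newlines first
    return raw[:8], raw[8:16], raw[16:]

transcription = (
    "U2FsdGVkX1+vupppZksvRf5pq5g5XjFRlipRkwB0K1Y96Qsv2L "
    "m+31cmzaAILwytX/z66ZVWEQM/ccf1g+9m5Ubu1+sit+A9cenD "
    "xxqklaxbm4cMeh2oKhqlHhdaBKOi6XX2XDWpa6+P5o9MQw=="
)

header, salt, body = split_salted(transcription)
print(header)  # the marker written by the OpenSSL command-line tool
# Non-zero remainders mean the body is not a whole number of 8- or 16-byte blocks.
print(len(body), len(body) % 8, len(body) % 16)
```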

The first thing to note about the puzzle is that it has an odd number of bytes, implying that a "stream" cipher was used rather than a "block" cipher, or at least a block cipher used in a streaming mode. Our chances of guessing the right one are somewhat slim anyway, so why not start with a simple stream-cipher, RC4? First we need to figure out how it's implemented in the OpenSSL command-line tools. Let's try a simple example using "x" (one byte) as our secret message, and "test" as our key:

$ hexdump temp.txt
0000000 0078

$ openssl RC4 -S FFFFFFFFFFFFFFFF -k "test" -p -in temp.txt -a


By forcing the salt to a known 8-byte pattern (-S) and by telling OpenSSL to show us the key it created (-p) we know that somehow all those FF's and "test" combined to make D7BA581CCB7DBAFD5BD1C7DF8BDDE4E3. This is useful, because with a small amount of trial-and-error I can figure out how it is generated, leading to this Python script which gives the same key-output:

import binascii
from Crypto.Hash import MD5

password = "test"
salt = binascii.a2b_hex("FFFFFFFFFFFFFFFF")
key = MD5.new(password+salt).digest()
print binascii.b2a_hex(key) # Output: d7ba581ccb7dbafd5bd1c7df8bdde4e3

Now we're one step on the way to scripting up a compatible RC4 encoding routine, which ends up looking like:

import binascii
from Crypto.Hash import MD5
from Crypto.Cipher import ARC4

def encryptString(in_str, password):
    salt = binascii.a2b_hex('FF' * 8)
    tempkey = MD5.new(password+salt).digest()
    print binascii.b2a_hex(tempkey)
    cipher = ARC4.new(tempkey)
    enc = cipher.encrypt(in_str)
    return 'Salted__' + salt + enc

Great! Now we can implement decoding, and after a little more tinkering we've got a little Python class which can encrypt/decrypt the same as the command-line tool:

import binascii
from Crypto.Hash import MD5
from Crypto.Cipher import ARC4
from Crypto import Random

class SimpleRc4:
    def __init__(self):        
        self.random = Random.new()       
        self.header = "Salted__"
        self.saltLen = 8

    def encryptString(self, in_str, password):        
        salt =  self.random.read(self.saltLen)
        tempkey = MD5.new(password+salt).digest()
        cipher = ARC4.new(tempkey)
        enc = cipher.encrypt(in_str)
        return self.header + salt + enc
    def decryptString(self, in_str, password):        
        salt = in_str[len(self.header) : len(self.header)+self.saltLen]
        body = in_str[len(self.header)+self.saltLen:]
        tempkey = MD5.new(password+salt).digest()        
        cipher = ARC4.new(tempkey)
        dec = cipher.decrypt(body)
        return dec

    def selftest(self):
        password = "selftest"
        a = "Content"                    
        b = self.encryptString(a,password)                                        
        c = self.decryptString(b,password)        
        assert(a == c)

So what's this kind of thing useful for? Quickly trying lots and lots of decodings. Here, let's build up a brute-forcing framework, and prove that we can crack a simple RC4-encoded message... (Github Gist)

import sys
import itertools
import binascii
import StringIO

from Crypto.Hash import SHA, MD5
from Crypto.Cipher import AES, ARC4
from Crypto import Random

class Breaker:
    def __init__(self,e,puzzle):
        self.e = e
        self.puzzle = puzzle
        self.last = None
    def attempt(self, password):
        self.last = password
        result = self.e.decryptString(self.puzzle, password)                
        return result
    def comboAttack(self, sequence, tester):
        for pw in sequence:
            result = self.attempt(pw) # Use this instance, not the global "b"
            if tester(result):
                yield (pw, result)

    @staticmethod
    def CheckBoringAscii(result):
        # True only if every character code is in the range 32..127
        for c in result:
            d = ord(c)
            if d > 127:
                return False
            elif d < 32:
                return False
        return True

    @staticmethod
    def GenPasswordList(passwordFile):
        with open(passwordFile,'rb') as pwdict:
            for line in pwdict:
                pw = line.strip()
                yield pw

    @staticmethod
    def GenBrute(charset, maxlength):
        for i in range(1, maxlength + 1):
            for c in itertools.product(charset,repeat=i):
                yield ''.join(c)

class SimpleRc4:
    def __init__(self):        
        self.random = Random.new()       
        self.header = "Salted__"
        self.saltLen = 8

    def encryptString(self, in_str, password):        
        salt =  self.random.read(self.saltLen)
        tempkey = MD5.new(password+salt).digest()
        cipher = ARC4.new(tempkey)
        enc = cipher.encrypt(in_str)
        return self.header + salt + enc
    def decryptString(self, in_str, password):        
        salt = in_str[len(self.header) : len(self.header)+self.saltLen]
        body = in_str[len(self.header)+self.saltLen:]
        tempkey = MD5.new(password+salt).digest()        
        cipher = ARC4.new(tempkey)
        dec = cipher.decrypt(body)
        return dec

    def selftest(self):
        password = "selftest"
        a = "Content"                    
        b = self.encryptString(a,password)                                        
        c = self.decryptString(b,password)        
        assert(a == c)

if __name__ == "__main__":
    e = SimpleRc4()    
    sample = e.encryptString("brute decode challenge", "test")
    b = Breaker(e, sample)
    source = Breaker.GenBrute('abcdefghijklmnopqrstuvwxyz',4)
    tester = Breaker.CheckBoringAscii
    for pw,result in b.comboAttack(source, tester):
        print pw, result 
    #Output: test brute decode challenge

And... Yes! It manages to crack the sample.

I'll explore applying this framework to the actual Overwatch puzzle in a follow-up post. I'll need to put in some fuzzier logic to handle the fact that the source-data -- the ciphertext -- may be subtly corrupted by the humans who had to transcribe it from a video-frame.


CTF 2Fort Revamp B3 uploaded to Steam Workshop

After coming across another stolen/plagiarized copy, I decided to try uploading the map to the new (comparatively) Steam Workshop: CTF 2Fort Revamp

Dealing with the Source SDK again, I imagine this is a bit like how Chell felt at the start of Portal 2: You realize a whole bunch of time passed, and all the nice humming machinery you remember is in disrepair while a few new weird mechanics exist.

This also marks an attempt to overcome a sort of writer's block that uses the excuse of "it's not good enough to share yet", which usually ends up with all of my creative-energy going into pun-filled Reddit comments. We'll see.


Chase_fixer project now up on GitHub

Following up on my previous post, I've worked on packaging the scripts up a little more nicely and the result is now available on GitHub.

So far it's handled all the weird cases that I see from my own financial history, but I expect there are a few more oddball scenarios (like wire-transfers or refunds) which may require additional tweaking as time goes on.


Chase’s malformed transaction records

The problem:

Last month, I tried to import some bank-account records (QFX/OFX formats) into the “You Need A Budget” accounting software, which involves telling it how to recognize certain transactions as “groceries”, “gas”, etc. This did not go as smoothly as I expected, even for an accounting chore, because many of the payee-name and memo fields had ridiculous values! Manually fixing a lot of scrambled data every month wasn't what I had in mind when it came to simplifying my budgeting, so I decided to investigate.

<MEMO>01/20 Purchase $9.41 Cash Back $

Whatever steam-powered mainframes JP Morgan Chase uses don't seem to have caught up with the current century: Payee fields and memo fields are combined, truncated, arbitrarily split, and whitespace-trimmed, all presumably as a sacrifice to some dark and ancient internal decision that 32 characters (for name) and 32 characters (for memo) were long enough for anybody. Some folks say it's the data-format's problem, but I disagree: Chase's datafile says it complies with OFX v1.02, but if you crack open the spec (dated 1997) it clearly says that at least the memo-field should support 256 characters, not 32.


Payee and Memo don't really contain the right thing:

Payee: "Online Payment 1234567890 To Cap"
 Memo: "ital One Bank"

The rest of the memo would have had my cash-back amount (which might be handy in budgeting software) but is truncated:

Payee: "Grocer &amp; Sons Inc. 12345 Exampl"
 Memo: "e road 01/18 Purchase $20.11 Cash b"

The split here occurs between two words, but the whitespace was trimmed! There's no automatic way to know whether this is "Park lane" or "Parklane":

Payee: "Marios Pizza and Plumbing 5442 Park" 
 Memo: "lane NW"

Current progress

Right now I have a series of Python classes which:

  • Parse the original OFX file (example)
  • Translate it into a much-more-convenient XML file with similar structure
  • Visit every transaction in the XML file and apply custom logic to fix it up
  • Write the XML file back out as OFX
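To give a feel for the first two steps, here is a simplified Python 3 sketch (not the actual classes). It assumes the OFX 1.x file has one element per line and that leaf elements such as `<NAME>` carry a value but no closing tag. `ofx_to_xml` and the sample statement are made up for illustration, and a real file nests transactions much deeper.

```python
import re
import xml.etree.ElementTree as ET

# A leaf line such as "<NAME>Some payee" has a value but no closing tag.
LEAF = re.compile(r"^<([A-Za-z0-9.]+)>(.+)$")

def ofx_to_xml(ofx_text):
    """Turn OFX 1.x SGML into parseable XML by closing the leaf elements."""
    lines = [ln.strip() for ln in ofx_text.splitlines()]
    # Skip the "KEY:VALUE" header block that precedes the <OFX> root.
    start = next(i for i, ln in enumerate(lines) if ln.upper().startswith("<OFX>"))
    out = []
    for ln in lines[start:]:
        m = LEAF.match(ln)
        if m and not ln.endswith("</%s>" % m.group(1)):
            ln = "<%s>%s</%s>" % (m.group(1), m.group(2), m.group(1))
        if ln:
            out.append(ln)
    return "\n".join(out)

SAMPLE = """OFXHEADER:100
DATA:OFXSGML

<OFX>
<STMTTRN>
<TRNAMT>-20.11
<NAME>Marios Pizza and Plumbing 5442 Park
<MEMO>lane NW
</STMTTRN>
</OFX>
"""

root = ET.fromstring(ofx_to_xml(SAMPLE))
for txn in root.iter("STMTTRN"):
    print(txn.findtext("NAME"), "|", txn.findtext("MEMO"))
```

Once the data is real XML, the "visit every transaction" step is just a loop over the `STMTTRN` elements.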

So far I'm pretty happy with the result: All I have to do is code logic for a few of the common cases, and run the scripts after I download the OFX files from Chase. Here's an example of a super-basic statement visitor that just tries to combine Payee and Memo.

    def visitStatement(self, values):
        name = values.get("NAME", "")
        memo = values.get("MEMO", "")

        if len(name) < 32:
            # When the split occurred, there was whitespace which got trimmed, re-add it
            combined = name + " " + memo
        elif len(name) == 32:
            # The split was forced due to some size limit, and we don't really know if there's
            # a space between them or not...
            combined = name + memo
        else:
            # TODO warning, larger than ever expected
            combined = name + memo

        values["NAME"] = combined
        values["MEMO"] = "" # No more memo data, it's all inside Payee

From this humble beginning I can branch out into recognizing common patterns (like transfers) and payees and clean up the data appropriately. After generating a new QFX file, the YNAB software seems to handle the longer payee names just fine.

Future work

The biggest problem left is that the data still isn't clean enough: Anything over 64 characters has been lost, and it's not always clear if I need to reinsert whitespace between Payee and Memo. Fortunately, Chase does offer a CSV download, which isn't as useful for importing into accounting applications but does contain the entire original. I just need to find some way to cross-reference between the two, perhaps based on dates, amounts, and some sort of non-whitespace similarity.
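One possible shape for that cross-referencing, as a Python 3 sketch rather than working project code. It treats two records as candidates when date and amount agree and the OFX text, with all whitespace removed, is a prefix of the CSV description treated the same way. The dictionary keys (`date`, `amount`, `name`, `memo`, `description`) and both function names are hypothetical, and it assumes both exports report the same date for a transaction.

```python
def squash(text):
    """Drop all whitespace and case so split or trimmed text still compares equal."""
    return "".join(text.split()).lower()

def find_original(ofx_txn, csv_rows):
    """Return the CSV rows that plausibly correspond to one OFX transaction."""
    fragment = squash(ofx_txn["name"] + ofx_txn["memo"])
    matches = []
    for row in csv_rows:
        if row["date"] != ofx_txn["date"] or row["amount"] != ofx_txn["amount"]:
            continue
        # The OFX text is a truncated copy, so it should be a prefix of the original.
        if squash(row["description"]).startswith(fragment):
            matches.append(row)
    return matches

# Made-up records for illustration
ofx_txn = {"date": "2012-01-18", "amount": "-20.11",
           "name": "Marios Pizza and Plumbing 5442 Park", "memo": "lane NW"}
csv_rows = [
    {"date": "2012-01-18", "amount": "-20.11",
     "description": "Marios Pizza and Plumbing 5442 Park lane NW"},
    {"date": "2012-01-18", "amount": "-20.11",
     "description": "Online Payment 1234567890 To Capital One Bank"},
]
print(find_original(ofx_txn, csv_rows))
```

When exactly one row comes back, its description can replace the combined Payee and Memo, which also settles the missing-space question.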

Once I have things a little more polished I plan to put them up on GitHub, but at the moment there are still a lot of hardcoded data-file paths and stuff.


Still alive in 2012

Sorry, dear readers. Or reader, more likely. I've been slacking off for the last few months and now my self-guilt compels me to update.

My work on a branch of PackBSP to handle more game-engines didn't go so well. I painted myself into some of the same "massive rewrite" corners I vowed to avoid, and then spent so long doing other things that it's hard to pick up again. On the other hand, I think I've worked my way through some architectural problems, and the silver-lining of having no other code-contributors is that I don't need to worry about backwards-compatibility very much.

Currently my interest is on the Netbeans Platform, and how I might be able to use it to streamline PackBSP and break away from the limitations of a wizard-centric interface. Actually, my daydream is to create a bunch of Netbeans plug-ins that turn the Netbeans IDE into a Source-engine-related powerhouse, but with the VIDE making a surprise return from the dead, perhaps that niche is already well on its way towards being filled.

Lastly, as a matter of interest, I temporarily fixed my video card with the "oven trick" (8 minutes at 375 Fahrenheit) but a week or two later it failed again, so I may just put it up on Craigslist as a challenge to anyone who thinks they can make it stick.


Hardware failure: Need to RMA my video card.

Well, it seems my graphics card (which has been limping along for years with intermittent glitches under load) has finally given up the ghost and my computer will no longer boot. Fortunately, this occurred after I finished wringing many hours of enjoyment and completionism from Deus Ex: Human Revolution, or else I would feel incredibly annoyed at the interruption. (Aside: It's a good game. A worthy successor to Deus Ex.)

Since the card (an Nvidia 8800GT variant) is still decently-powerful and has no obvious damage, I'm going to see what I can do under the limited-lifetime warranty. Compared to its earlier foibles, RMAing now ought to be unambiguous and straightforward, given that the computer now refuses to even POST if the card is present.


Graph Design Blues

I'm having second-thoughts on how to manage the dependency graph(s). I've been experimenting with a "directed multigraph", but I worry that it adds too much complexity when it comes to determining what portions of it are connected when only a certain type of edge is considered, and whether I'm over-complicating things. Actually, I'm pretty sure I am over-complicating things, but freedom to experiment is part of what makes an independent project fun.
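To make the question concrete, here is a small Python 3 sketch of one way to ask "what is connected if only some edge types count?". It is not the project's code: `TypedMultigraph`, the node names and the edge types are all invented for the example.

```python
from collections import defaultdict, deque

class TypedMultigraph:
    """Directed multigraph: several typed edges may join the same pair of nodes."""

    def __init__(self):
        self._out = defaultdict(list)  # node -> [(edge_type, target), ...]

    def add_edge(self, source, target, edge_type):
        self._out[source].append((edge_type, target))

    def reachable(self, start, edge_types):
        """Nodes reachable from `start` using only edges whose type is in edge_types."""
        seen = {start}
        queue = deque([start])
        while queue:  # plain breadth-first search with a type filter
            node = queue.popleft()
            for edge_type, target in self._out.get(node, ()):
                if edge_type in edge_types and target not in seen:
                    seen.add(target)
                    queue.append(target)
        return seen

g = TypedMultigraph()
g.add_edge("map", "wall_material", "material")
g.add_edge("map", "wall_material", "script")  # second edge between the same pair
g.add_edge("map", "crate_model", "model")
g.add_edge("crate_model", "crate_material", "material")

print(sorted(g.reachable("map", {"material"})))
print(sorted(g.reachable("map", {"material", "model"})))
```

Filtering at traversal time keeps a single graph, at the cost of re-walking it for every combination of edge types.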