Using AutoitX as a DLL in Python

We’ve been using AutoIt for a while now at the shop.  We do a LOT of repetitive stuff; for instance every computer that we clean up gets pretty much the exact same process.  Any time you find yourself performing the same steps over and over again on a computer by hand, that’s a sign you should be figuring out a way to let the computer do it for you (that’s what they’re for!) and AutoIt is one of the easiest ways to automate GUI interaction for those pesky Windows programs that don’t allow automation via command line options.

For a while now though I’ve been struggling with the fact that while AutoIt is really good at GUI automation, it isn’t very good at things like manipulating complex configuration data: a job which, unsurprisingly, often goes hand-in-hand with automating a few mouse clicks.  Often you’ll use automated clicks to install a program, but find that directly editing a config file is an easier way to configure it, and that’s where AutoIt falls short.  It has file manipulation tools, but they are very basic.  After all, it’s a GUI automation kit, NOT a full-fledged programming language; I don’t blame the AutoIt folks one bit for doing one thing and doing it well.  That’s the mantra we ’nix folks live by!  So on and off over time I’ve peeked around for a different solution, one that gives me both solid GUI automation and a full-fledged programming language with lots of good modules/libraries for various tasks.

Other GUI automation tools were eliminated pretty quickly.  I didn’t find anything as solid or feature-complete as AutoIt.  That left me looking for a way to glue multiple tools together that did NOT result in a house-of-cards setup that would be nearly impossible to replicate in the case of a failure, or that would rely on the perfect alignment of planets to run reliably.  It didn’t take me long to realize that AutoItX was my best bet.

AutoItX (available on the main AutoIt download page) is basically a library version of AutoIt that can be used from other programming languages via a DLL or the Windows COM system.  It comes with some documentation for the interfaces, but for me the installer didn’t put it in the start menu; I had to dig in the program files folder to find the .chm file manually.  The trick was figuring out which programming language was best suited to the task of interfacing with the DLL and doing manipulation of config files in formats like INI and JSON.  The setup would have to be totally portable–we run our tools from a file share on our server and we can’t just go installing random runtimes on customer computers.  It also had to work well on Vista, 7, and 8.x, which makes things like PowerShell difficult since the varying versions provide different functions (e.g., PowerShell 1.0 doesn’t have native JSON support).  Recently my language of choice has been Python, and exploring that option is how I found what turned out to be a huge life saver: Portable Python.

Portable Python is exactly that: a portable Python environment that can run from a local folder, a USB mass-storage device like a flash drive, or a network file share.  Additional modules can be installed with relative ease, and it works on all the operating systems we support right now.  Python has lots of great modules for file management, file manipulation, config parsers for INI and JSON, pretty much everything I need.  Nicely enough, one can also easily call functions exported from a DLL file using an interface called ctypes.  They really mean for you to use AutoItX via COM in Python, but that requires AutoItX to be installed locally, which we aren’t going to do.

So here’s my setup: Portable Python and my scripts directory are stored in a file share on our server.  It’s easy to build a batch file that executes the portable Python interpreter, passing my scripts as command line arguments to get them running without doing messy file-association mods on the customer PCs.  The AutoItX DLL is also hosted on a file share, and my Python script can copy it to a local folder then manually load it using ctypes.  Here’s an example of one of my scripts:
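Something along these lines (the share path, local path, and window title below are placeholders for our real ones):

```python
import ctypes
import shutil

# Placeholder paths -- adjust for your own share and staging folder.
DLL_SOURCE = r"\\server\tools\AutoItX3.dll"
DLL_LOCAL = r"C:\Temp\AutoItX3.dll"

def load_autoit(source=DLL_SOURCE, local=DLL_LOCAL):
    """Copy the AutoItX DLL off the share and load it with ctypes."""
    shutil.copyfile(source, local)
    return ctypes.WinDLL(local)

def u(s):
    # AutoItX expects wide (UTF-16) strings; plain byte strings silently
    # fail to match any window titles, so wrap everything explicitly.
    return ctypes.c_wchar_p(s)

# Typical usage (Windows only):
# autoit = load_autoit()
# autoit.AU3_Run(u("notepad.exe"), u(""), 1)
# autoit.AU3_WinWait(u("Untitled - Notepad"), u(""), 10)
```

The `u()` wrapper is the important bit; every string that crosses the DLL boundary goes through it.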

Please note that the big hurdle that I had to cross was related to Unicode strings.  At first I was just passing regular strings to the AutoIt functions (like WinWait) and they never matched any windows no matter what.  After some digging I found that AutoIt is expecting Unicode strings and it is assuming that all strings passed in are already Unicode and interpreting them as such.  Explicitly passing all strings as Unicode fixes that problem.

Introducing: Firefox Extension Killer

At the shop we generally recommend our customers use any browser except Internet Explorer.  This probably doesn’t come as a surprise to anyone who has spent any amount of time fixing the things people ruin on their computers, or anyone who has ever wondered why their markup/CSS just doesn’t work right in one browser when it works right in every other browser (or why it only looks right in one browser!).  Because of its ability to behave more like the browsers of yesterday when configured as such (which is EXACTLY what our older customers want), we generally prefer Firefox.  As a result, cleaning up malicious addons has become an everyday chore for us when people bring in junked-up computers.

Out of the box, cleaning up extensions in Firefox is kind of a mixed bag.  Sometimes an extension will have a remove button, other times it won’t.  Why is this?  If an addon was installed via the AddOns Manager, it will have a remove button.  If the files were installed manually from outside Firefox, there will not be a remove button.  I wanted to link to the area of the Mozilla knowledge base where they explain this decision (which I read years ago), but I can’t seem to find it anymore.  I think the gist was that since a manually installed addon could have files that aren’t tracked by Firefox, they didn’t want people to think that removing the extension in Firefox’s AddOns manager would remove all files associated with the addon.

Of course what this has led to is that most malware addons will install manually, leaving the typical user with no way of removing them from Firefox.  Mozilla has an article that explains how to remove them manually, but your average Joe is never going to be able to go through this process.  Even if one is technically inclined enough to follow the directions it’s extremely tedious to look in so many places and it’s time consuming to boot, which is why I put together the Firefox Extension Killer.

I have actually tried to write this program several times.  At first I wanted to do it in C++ because it has so few system dependencies (aside from any libraries that are used, of course).  That was a hard thing to commit to considering my hatred for the Win32 API, but I just felt like it was the best choice at the time.  I’d thought of something like C#, which is much more elegant on Windows than C++, but when I first started working on the tool we were still doing a lot of Windows XP machines and they don’t all have the .NET Framework installed; I just couldn’t see spending 10 minutes installing .NET 3.5 to save 5 minutes of repair time.  Then there were the development tools on Windows: either Visual Studio Express, which is a huge resource hog, or a cobbled-together environment trying to emulate my beloved Linux work-flow.

When that got too cumbersome to deal with, I turned to cross-compiling from Linux to Win32.  Years ago, successfully cross-compiling code was similar to building a running internal combustion engine out of Lincoln Logs, but nowadays there are full tool-chain sets in the Ubuntu/Mint repos that “just work” after installation, even if the MinGW version names are unbearably confusing.  What this work-flow meant, though, was that testing the registry access parts of the code would be impossible in Linux.  I had the idea of writing some wrapper functions and implementing a virtual registry testing library for C++ (the perfect textbook solution) but very quickly realized that writing a library that could interact with and emulate the Windows Registry with any amount of configurability would take a LOT longer than the whole rest of Firefox Extension Killer.

After going through all this, I quickly became disillusioned.  I had at least gotten a CLI tool running that looked in (most of) the addon locations and just removed everything without any options.  This worked OK, since we mostly only wanted Adblock Plus installed and that was an easy reinstall.  Apart from that, I quit working on it for a year or so; it was just too painful.

Recently things have been slow at the shop, and I really started thinking about it again.  I’d piddled with a few designs on and off over time, but nothing really seemed to fit right.  The playing field is different now too; we’re pretty much working on Vista and higher now, which has at least .NET 3.0 built in.  It occurred to me that what was not an option before was the best option now: C# with Windows Forms.

My application design skills are the worst.  I do not exaggerate.  I realized recently that the designs I lay out before coding are at least as bad as the stuff I come up with when I just start typing, so I installed SharpDevelop in a Windows VM and just started typing.  Two days later I had a release of Firefox Extension Killer.

So head on over to GitHub and check it out (or clone it out, as it were).  Right now I only provide a SharpDevelop project file as a means of building it, but it looks like SharpDevelop squirted out a Visual Studio project file too.  I didn’t ask for that, but I left it in the repo anyway in case it works.  YMMV.

Let Git manage your indentation

Indentation in Ruby source code files is two spaces. It’s the rule.

I’m not much for two space indentation; it just feels cramped to me. That might change over time but for now that’s just the way it is. What that means for me is that it’s really hard to contribute to someone else’s Ruby project since my Vim config specifies four space indentation for everything. I could change my Vim config to use two spaces for .rb files but I figured I could have my cake and eat it too, aka let Git worry about it.

Git has the ability to use filters for various operations. Here’s an in-depth guide, but I’m just going to cut to the chase and show how I made it work.

In your working tree, edit the file “.git/info/attributes” (create it if it doesn’t exist) to look like this:
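For Ruby files and my ‘two_four’ filter, that’s just one line:

```
*.rb    filter=two_four
```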

These lines define the file extension(s) to which the filter applies, then specify which filter to use. The filter name ‘two_four’ is just a descriptive name I picked.

Now we need to define the actions for our filter which should do nothing more than accept standard input, modify it as needed, and spit it back out on standard output. There are ‘git config’ commands you can use to enter the filter actions into your git configuration but in my case it’s easier to enter them directly into ~/.gitconfig than try to escape everything for the bash shell. Here’s the appropriate section:
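Something like this (the perl one-liners assume leading whitespace is always an even multiple of the indent width; I make no promises beyond that):

```ini
[filter "two_four"]
	# smudge: repo -> working tree; double the leading spaces (2 -> 4)
	smudge = "perl -pe 's/^(( {2})+)/$1$1/'"
	# clean: working tree -> repo; halve the leading spaces (4 -> 2)
	clean = "perl -pe 's|^(( {4})+)|\" \" x (length($1)/2)|e'"
```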

That perl code-puke in there just looks for spaces at the beginning of the line and either doubles them or halves them depending on which direction we’re going. The “smudge” action says “Run this filter code just after checking the files out into my working tree”, whereas the “clean” action says “Run this filter code just before pushing the files to another repository”.

If you want to immediately see the difference in your working tree you can force all files to be checked out again by running:

git checkout HEAD -- **

Just make sure you commit all your work first!

**Disclaimer: I’ve only minimally tested this filter code. Use it at your own risk!

Loading Fixtures Inside Functional Tests

Since I last wrote about Winning Side Ministries’ website (the one I was writing a custom CMS for) I’ve changed my mind yet again. Yes, I tend to do that a lot. I’m currently rebuilding the website based on the Symfony2 PHP web framework. It’s based on MVC, presents a nice clean interface on which one can build pretty much anything web-based, and has lots of great built-in tools for getting the job done.

Annoyed by past experiences with code-first-test-last development flow I’ve decided to try Test Driven Development for this particular project and I’m finding that I really like it. I will admit that there have been times when I’ve had to sit and ponder over how to bring about certain changes by first writing tests, but they’ve all been situations that I’ll run into time and time again so it hasn’t been time lost. One of those places was the first time I needed to begin writing the code to drive a view (aka, the page that will be rendered in the browser). Since this involves combining multiple components, the tests preceding development must by definition be functional tests; of course this in turn requires that we write some data fixtures so we’ll have something to run the tests against.

The Symfony framework by default uses an ORM called Doctrine, and happily enough there is a bundle (Symfony term that more or less equates to a module) called DoctrineFixturesBundle that handles most of the grunt work of maintaining the fixtures for you.  On the documentation page for the bundle it outlines the steps required for writing the fixtures and loading them into the database, though it only shows how to load them manually from the command line.  This puzzled me since one would think that it would be wiser to reload the fixtures each time the functional tests are run so that in the tests we can do all the CRUD on the database we want without worrying about spoiling things for the next run of tests.  That’s when I set off to find out how I could clear the database and load fresh fixtures at the beginning of each test run.

To start off, here is the class I came up with:
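Something along these lines (the namespace is from my bundle, and a detail or two is trimmed for clarity; adjust both for your own app):

```php
<?php
namespace WSM\TestBundle\Tests;

use Symfony\Bundle\FrameworkBundle\Test\WebTestCase;
use Symfony\Bundle\FrameworkBundle\Console\Application;
use Symfony\Component\Console\Input\StringInput;
use Symfony\Component\Console\Output\StreamOutput;
use Doctrine\ORM\Tools\SchemaTool;

class FreshFixtureWebTestCase extends WebTestCase
{
    protected static $container;

    public static function setUpBeforeClass()
    {
        // Boot a kernel and grab the service container
        $kernel = static::createKernel();
        $kernel->boot();
        self::$container = $kernel->getContainer();
        $em = self::$container->get('doctrine')->getEntityManager();

        // Rebuild the schema straight from the Entity meta-data
        $metadata = $em->getMetadataFactory()->getAllMetadata();
        $schemaTool = new SchemaTool($em);
        $schemaTool->dropDatabase();
        $schemaTool->createSchema($metadata);

        // Run doctrine:fixtures:load as an in-process console command
        $application = new Application($kernel);
        $application->setAutoExit(false);
        $input = new StringInput('doctrine:fixtures:load --append');
        $tmpFile = tmpfile(); // deleted automatically at script exit
        $output = new StreamOutput($tmpFile);
        $application->run($input, $output);

        // Read the command output back and make sure nothing blew up
        fseek($tmpFile, 0);
        $result = stream_get_contents($tmpFile);
        \PHPUnit_Framework_Assert::assertNotContains('Exception', $result);
    }

    public static function tearDownAfterClass()
    {
        // Leave the database empty so stale data can't leak between classes
        $em = self::$container->get('doctrine')->getEntityManager();
        $schemaTool = new SchemaTool($em);
        $schemaTool->dropDatabase();
    }
}
```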

Let’s break it down a bit since a few of the constituent pieces of code are significant in their own right.

Base Class

We will be extending the class that has been provided for us in Symfony to facilitate functional tests: WebTestCase. It’s just a subclass of PHPUnit_Framework_TestCase that has a built-in web crawler to allow us to make requests and check the responses. The namespace defined there is where I keep my custom testing classes.

In PHPUnit 3.6.x, if we want to call code just before all the tests in a *_TestCase subclass are run, we put that code in a public static method called setUpBeforeClass() (as opposed to setUp(), which is called before every individual test). Its destructive counterpart is tearDownAfterClass(), just as you may expect. These methods will contain our code for loading the fixtures and then wiping them out when we’re done. If we need to use setUpBeforeClass() or tearDownAfterClass() in classes that extend FreshFixtureWebTestCase, we have to be sure to call the parent:: version as well or our code here won’t be executed.

Accessing the Entity Manager

To get started, we need to grab a reference to EntityManager. This is pretty common code inside a WebTestCase instance so I’ll not go into the explanation, but here it is. Notice that I’ve assigned our reference to the Container to a static class variable…this is just a convenience for our subclasses and is not required.
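The boilerplate in question (self::$container is that static convenience variable):

```php
$kernel = static::createKernel();
$kernel->boot();
self::$container = $kernel->getContainer();
$em = self::$container->get('doctrine')->getEntityManager();
```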

Retrieving Entity class meta-data

Now comes the fun stuff. We don’t want to have to worry about the state of the database schema or migrations in the tests, so I like to dump the database and totally reload the schema directly from the Entities before the tests run. To do that we get a reference to the MetadataFactory from our EntityManager and call its getAllMetadata() method. This returns an array containing the meta-data for all our Entity classes.
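In code that’s just:

```php
$metadataFactory = $em->getMetadataFactory();
$metadata = $metadataFactory->getAllMetadata();
```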

Using SchemaTool to modify the database

The question is what do we do with the meta-data once we have it? Doctrine comes with a nice helper class called SchemaTool. There are several different things we can do with this class, including dropping our current database and creating a schema from meta-data.
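The two calls we care about:

```php
$schemaTool = new SchemaTool($em);
$schemaTool->dropDatabase();          // out with whatever was there...
$schemaTool->createSchema($metadata); // ...in with a schema built from the Entities
```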

Creating a CLI application object

The doctrine-fixtures bundle only gives us the ability to load fixtures from the command line. In order to run Symfony CLI commands from PHP code we need to create an Application object. We have to make sure we set autoExit to false or else our entire PHP script will exit as soon as the command has run—this would be rather counterproductive when we’re trying to run tests!
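That part is only two lines:

```php
$application = new Application($kernel); // the kernel we booted earlier
$application->setAutoExit(false);        // don't kill PHP when the command finishes
```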

Running the doctrine:fixtures:load command

To actually run our command we create input and output objects and feed them to the Application object. The command to run goes into the input and the result (which would normally be printed to the terminal) will be read from the output. Our command will be in the form of a string so we’ll use StringInput. We want the output of the command in a string as well, but unfortunately there is no StringOutput class; there is, however, a StreamOutput which we can point to a temporary file. Once the command is done executing we can read the contents of the file back. Note that PHP automatically deletes files created with tmpfile() once the script exits so we don’t have to bother with it.
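Like so (I pass --append since the schema was just rebuilt empty anyway, and it skips the interactive purge prompt):

```php
$input = new StringInput('doctrine:fixtures:load --append');
$tmpFile = tmpfile(); // PHP deletes this automatically at script exit
$output = new StreamOutput($tmpFile);
$application->run($input, $output);

// rewind and read back whatever the command printed
fseek($tmpFile, 0);
$result = stream_get_contents($tmpFile);
```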

Checking for errors

If there were any errors the output of the command should contain the word ‘Exception’. I’m just running an assert to make sure we don’t see that in the output, though you may have to use a different method if you have a table or Entity with ‘Exception’ in the name.

Cleaning up

Once all our tests are done we want to clear the database again. This way if we accidentally forget to load something in another test class we’ll know immediately because all queries will fail.

Using FreshFixtureWebTestCase

To use this class we simply extend it the same way we would WebTestCase. All the fixture-loading code will be called automatically without so much as a thought from us, though I do want to reiterate my prior warning:

If we define setUpBeforeClass() or tearDownAfterClass() in a subclass of FreshFixtureWebTestCase, FreshFixtureWebTestCase’s method(s) will be overridden. To keep the fixture functionality we must call parent::[method_name] from within whichever method(s) we override.

PHP Code Formatting with sed

When I first started the Winning Side Ministries website I had one code style…now I prefer another. It happens, right? It shouldn’t be that big of a deal; there are plenty of free tools available that will iterate through source code files and alter the formatting for you. What I’ve found in the last few days though is that not many of them really work all that well.

Our website is written in PHP so naturally I began looking for a solution that was tailored specifically to that language. There are only a few of them and I really couldn’t get any of them working in a satisfactory way. At that point I began looking at more generic solutions like GNU indent which is actually targeted at C but works OK with other languages that use C-style syntax. After a bit of toying around I got indent to make the changes I wanted it to make but it was also making other changes that I didn’t want it to make.

That’s when I realized…there were really only 2 formatting styles I wanted to change. Why not just fix it with some sort of regex substitution program? Enter sed (the stream editor). It’s really a nifty little program that does one thing really well (per the Unix philosophy), and that is take the text it’s given and perform the requested modifications. Now it’s really meant as a line-based editor, and some of my changes required multi-line substitution, but it wasn’t really that hard to work around after googling for a bit.

Here’s the code style I wanted to convert from:

function myFunc($arg)
{
    print("some stuff");
    if($arg == 2)
    {
        print("Arg is 2!");
    }
}




function myFunc2()
{
    print("I'm func 2!");
}

And here’s the style I wanted to convert to:

function myFunc($arg) {
    print("some stuff");
    if($arg == 2) {
        print("Arg is 2!");
    }
}

function myFunc2() {
    print("I'm func 2!");
}

So we’re going from having 4 blank lines between functions to only 1, and going from having opening braces on a separate line to having them on the same line preceded by a space. Here’s the sed script I came up with:

# this line reads the whole file in so we can do multiline substitution
:a;N;$!ba;

# this gets rid of our 5 line breaks between functions
s/\n\n\n\n\n/\n\n/g

# this puts curly braces on the same line with the control structures/function definitions to which they belong
# be aware that if there is a set of lines like this:
#
# if(myTest) //here's some documentation
# {
#     next line of code
# ...
#
# the curly brace will end up in the end of the comment; watch out for that
s/\n[ ]*{/ {/g

That first line…I copied and pasted that from an answer to a Stack Overflow question. I don’t know sed well enough to really explain it, but there’s some explanation in the answer comments there so check it out if you want to know more.  The basic idea is that sed can’t work with multiple newlines at once without a workaround, since it expects to act upon every line individually.  The tr solution also proposed on that same question won’t work for me, since tr works character by character and doesn’t support full regexes as a result.

I did run into the problem described in the script’s comment, but only once. It wasn’t worth complicating the script to fix that though it may not be difficult. I don’t know…I didn’t bother.

To run the script you just call sed like so:

sed -i -f format_code.sed *.php

And you’re done! The “-i” makes it edit the files in place so you don’t have to do any kind of weird redirection or write a script around it. It just works.

PHP types, exceptions, and Smarty templates

Continuing on with the implementation of a custom CMS for the ministry website, I’ve been doing a lot more type checking than I used to do with PHP.  That’s actually one of the things that has always bothered me about languages like PHP; as a programmer I make a lot of mistakes and it’s a real boon if my programming language supports double-checking me without any extra intervention, but PHP doesn’t offer much in the way of enforcing variable types.  Starting with PHP 5.1 we’ve had “type hinting”, which means I can force a parameter to a function/method to be an object of a certain type or an array.  Well, this doesn’t really help if my function needs a string, an int, or an array containing strictly typed elements.

What I’ve been doing to get around this is using type hinting wherever I can, then doing extra manual checks on the types of other parameters via the is_* family of functions.  If I get an incorrect type, I throw an InvalidArgumentException.  I’ve been using exceptions extensively in this project because of the multi-layered nature of it.  If I relied on return values alone for monitoring errors there would be layer after layer of if-statements and checks passing data back up the call stack, all the while making it very difficult for my functions and methods to return useful data to their callers without the use of output parameters.

One of the most recent things I’d done was to set up the page loading code to generalize error handling.  I am currently using a layout similar to this (not very pretty, I know—I’m working on it):
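Roughly like this (the class, variable, and template names here are stand-ins for the real ones):

```php
try {
    $page = $contentManager->getPage($_GET['page']);
    $smarty->assign('page', $page);
    $output = $smarty->fetch($page->getTemplate());
} catch (Exception $e) {
    // show the real message only if it's marked safe for users
    $message = ($e instanceof UserSafeException)
        ? $e->getMessage()
        : 'Something went wrong. Please try again later.';
    $smarty->assign('error', $message);
    $output = $smarty->fetch('error.tpl');
}
echo $output;
```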

There’s actually a good bit more to it than that, but I’ve cut things out to make it less distracting.  Basically I get my page data, feed it to the template requested, and grab the parsed template output using $smarty->fetch().  If there is an exception thrown I check to see if the real exception message is safe to show the user or whether I need to use a general one, then I fetch the error page template to show the user instead.  Sounds OK, right?

When I tried testing this code by requesting an invalid photo album on the photo gallery page, I got two instances of the page header, then the error message and page footer.  I thought I’d goofed up somewhere and included something wrong.  After more testing, I found that if the exception was thrown before the call to $smarty->fetch() (for instance, if contentManager couldn’t find the page I fed it) there was no problem.  That got me to thinking about output buffering.

I knew Smarty used output buffering in the implementation of fetch(), but I didn’t initially consider the implications that brought with it.  I haven’t looked at the code so I don’t know exactly how it’s written, but the gist is something like this (again, my interpretation, and an oversimplified one at that):
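In pseudo-PHP:

```php
// NOT the actual Smarty source; just my mental model of fetch()
function fetch($template)
{
    ob_start();
    parseTheTemplate($template); // plugins (mine included) run in here and can throw
    return ob_get_clean();       // never reached if an exception escapes
}
```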

See what’s happening? In the parseTheTemplate() section of the code, however Smarty actually implements it, my template plugins get run as part of parsing the template. If one of my plugins throws an exception, it gets passed back up the call stack until it gets caught by a catch block. What gets skipped? Yep, ob_get_clean(). That means any output that’s already been printed is still in the output buffer.  On the subsequent call to $smarty->fetch() it gets returned along with the rest of the output.  In order to fix it, all I had to do was add a call to ob_clean() like this:
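Only the catch block changes; everything else stays the same:

```php
try {
    $output = $smarty->fetch($page->getTemplate());
} catch (Exception $e) {
    ob_clean(); // throw away the partial output Smarty left in the buffer
    // ...then pick a safe message and fetch the error template as before
}
```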

No more problems.  Smarty could have handled this on its own by clearing the output buffer at the beginning of fetch() before anything was parsed, but I’m sure there is a reason why it doesn’t.  Perhaps there are people putting things in the output buffer before calling fetch() on purpose.

Roll my own CMS?

I’d noticed recently that the ministry website (http://www.winningsideministries.org) had gotten woefully out of date in regard to its text content.  The news column on the main page is over a year old and the “about us” page doesn’t reflect the current state of the ministry at all.  I started to update the content but I got all hung up in figuring out whether the site was currently running the testing or stable version from svn, then making sure to upload the right file version, trying to remember to update the right repo…it was a mess.  I’d been using svn for the ministry website because I’d started working on it before I knew git; more than that, I couldn’t store it on my free GitHub account due to the fact that the database passwords would be there for the world to see.  The solution to that bit of the problem was simple.  I just set up my own git repo on the same server that hosts my svn repos and moved it there.  Since I was having so much trouble figuring out what version of what files were where, I just downloaded the current working version of the website and based the repo off that.  It’s possible that I’ve lost a little bit of work that way, but as badly disheveled as the codebase was, I wasn’t sure it mattered a whole lot in the long run.

After that I asked myself, “Why am I sitting here editing source code files to update the content of the website?”  Sure, building it that way got it up quickly without me having to learn a lot of new stuff beforehand, but it made the maintenance a nightmare.  At that point I realized that for there to be any sort of efficiency in the task of keeping the content up to date I was going to have to completely (or nearly so) separate the content from the code, aka, use a CMS.

There are lots of good CMSes out there, but most of them are really heavy from what I’ve seen.  A lot of them (like the WordPress CMS that powers this blog) are indeed very blog-oriented, making it a bit more complex to have widely varying page setups.  I realize that if I’d spent the time required I could very well make any one of the CMSes available do just as good a job of handling our static content pages as it would our dynamic ones, but I wasn’t sure that’s what I wanted to do.  We have a very simple website with (currently) 9 working pages.  It just felt weird to use a CMS that took five times more code than the whole rest of the website.

I came to the conclusion that for our purposes all I really needed was a simple CMS that would take a database full of “page” rows and automatically put the content into the right spots in the Smarty template.  After a bit of playing around, the idea evolved of using mod_rewrite and a single page-loader PHP file…a very elegant solution compared to what we had before.  What I needed then was a way to mark up the content in a way that was non-code-oriented…very legible on its own, outside of the page itself.  Markdown was the first thing that came to mind.
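The rewrite side is about as simple as it gets; something like this in the site’s Apache config or .htaccess (loader.php is a hypothetical name for the page-loader script):

```apache
RewriteEngine On
# hand anything that isn't a real file to the page loader
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ loader.php?page=$1 [L,QSA]
```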

I love markdown: it’s simple, elegant, and reads just like a plain-text email.  Everything about how it looked screamed, “You want to use me!  Pick me!  Pick me!” and so I did.  The only problem I ran into was the fact that markdown doesn’t support generating anything besides plain HTML tags—no class names, IDs, or anything.  That wasn’t a big deal until I tried to add link anchors to our “about us” page…oops.  Can’t do it.  I found that the “extra” version of the PHP Markdown implementation allows adding custom IDs to header tags, but I didn’t want to limit myself so badly that early on in the game.

Enter the next contender: textile.  Textile isn’t quite as plain-text looking as markdown, but it supports everything I need.  The issue I had with textile is that the version of the parser they have available for download on their website is ridiculously dated, though the link doesn’t say anything about that.  I was having weird problems, like the last element of a list being bumped down into its own list…as in a 5-element ordered list would show as elements 1, 2, 3, 4 and 1.  Doing some research, I found out that this bug had been fixed long ago.  When I read that I immediately looked in the source code for some indication of a date…it had been released in 2006.  That’s a problem.  I did find, however, that the textile-based CMS called TextPattern included the most recent version of the classTextile.php file (which is all you need to just use the textile parser) so I grabbed that, and the problem was solved.  You should be able to get the latest version here: TXP/classTextile.php:HEAD

I’ve already gotten the few completely static pages on the website converted so now all I have to do is get the mixed static & dynamic pages working.

It feels unbelievably good to confidently type `git rm` so many times in a console.  😉