squid patch accepted

Well, I’ve submitted my squid patch, and it was accepted. I’m just waiting for it to be merged! I can’t wait!

squid range_offset_limit mod progress…again

Well, here’s another progress update on getting my acl-based range_offset_limit mod to work.  For the longest time, I couldn’t get anything to work at all.  Ever since I first compiled the changes in December, it hadn’t worked right.

Well, it turns out it was my testing environment!  I had forgotten to set some critical squid.conf settings, and so nothing I did was working.  Once I figured this out a couple of weeks ago,  I was able to properly test my code.  It’s been really busy in the shop lately, so my squid development time has been sporadic at best.

Once real testing began, I did have to fix one troubling bug where I improperly specified a default value for range_offset_limit in a place that overrode all other range_offset_limit settings.  Bummer.   That’s all sorted now, though, and everything seems to be working according to plan.  I had hoped to catch the squid 3.2 release, but alas, I missed it.  It’s ok, though.

Right now I have the squid test-build suite running, and I’ll post the bzr patch to the mailing list when I get a good test run.  I don’t anticipate any errors, so hopefully this will be soon.  Monday looks like it will be really busy, so it will probably happen later on in the week.  I hate that it’s taken me so long, but I have learned a LOT (this being the first time I’ve worked on a collaborative project).

proxy server in general

Well, I got the squid modifications I was working on to compile…after a good bit more cleanup of the other pattern-based code. I haven’t tested it yet, though.

What I really wanted to give an update on was my caching of antivirus definitions. Everything is working fine except Avast updates. It seems like they may dynamically start/stop update servers based on demand. There are a few files passing back and forth that I didn’t notice before, and one seems at first glance to be a list of running servers. Either way, every once in a while I’ll have a day where Avast just gets 404s on every file it requests. If I turn off my url_rewrite, it works again. I’ll have to investigate further sometime, but it’s so hard to find the time.

Also, I found that I may have been doing my firewall rules wrong. The tutorials I’d seen suggested setting the rules in /etc/rc.local. This worked for me most of the time, but sometimes after rebooting the proxy server I would have to go in and manually flush iptables and load the rules again. It was like they were getting loaded at the wrong time. A tutorial I found that was specifically written for Ubuntu said to do it in a pre-up command in /etc/network/interfaces. Before, I was running the iptables commands directly in rc.local, which makes them easier to read at a glance. What that article suggested was just doing iptables-save > /etc/iptables.rules once, then adding a pre-up line that runs iptables-restore < /etc/iptables.rules in /etc/network/interfaces.
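For reference, here’s roughly what that stanza ends up looking like (the interface name and addressing are placeholders, not my actual config):

auto eth0
iface eth0 inet static
        address 192.168.2.1
        netmask 255.255.255.0
        pre-up iptables-restore < /etc/iptables.rules

The pre-up line runs before the interface comes up, so the rules are in place before any traffic can flow.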

I’ve rebooted the server 3 times, and the connection came back up by itself every time. Hopefully that got it fixed!

Squid range_offset_limit mod progress

I haven’t posted for a while, so I thought I’d write about the progress in getting that range_offset_limit mod working for squid. I’d gotten it working as a pattern, but I was asked to instead implement it using squid’s acl system. This gives much finer-grained control than a simple pattern, since it can match against other fields such as source address, destination address, time of day, etc.

It took me a bit to get the hang of the acl system, but I think I have it. The changes are compiling right now…aside from some errors about unused functions I had to clean up (squid uses -Werror), it has gone well so far.

So now the format of the option will be:

range_offset_limit (bytes) [[!]aclname]
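For example (the acl name and domains here are just for illustration, and I’m assuming first-match ordering like the pattern version had):

acl winupdates dstdomain .windowsupdate.com
range_offset_limit -1 winupdates
range_offset_limit 0

That would let Windows Update downloads be fetched in full while everything else keeps the default behavior.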

I’ve devised a pretty cool system for testing different versions of squid, too. In my squid init script, I subbed a few things with variables so I can comment out the Ubuntu-supplied squid file locations and instead point to /usr/local/squid/. I set up /usr/local/squid as a symbolic link to /usr/local/squid-whatever-version. Each of those uses /var/spool/squid3 as its cache, so the version 3 squids can all pull from the same cache_dir. This makes sure our Windows updates are always cached, even when I’m switching versions of squid.
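So switching versions is just repointing the symlink and restarting squid, something like:

sudo ln -sfn /usr/local/squid-whatever-version /usr/local/squid
sudo /etc/init.d/squid restart   # or whatever your modified init script is called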

Giving back to squid: range_offset_limit mod

I realized that if we ever want to be able to cache Vista updates, we’re going to need range_offset_limit set to -1.  Why it wants to download so many byte ranges, I have no idea.  The problem is that squid just eats our bandwidth with that option turned on, because it’s global.  Recently I contacted the squid developers and offered to (attempt to) add the ability to set range_offset_limit on a per-pattern basis.  My idea was to change the current syntax:

range_offset_limit (bytes)

To the following:

range_offset_limit (bytes) [-i] [pattern]

So that “-i” is a flag to make the pattern case insensitive and “pattern” is the pattern to match by.  I’m sort of patterning the code after the refresh_pattern code, as it is very similar in function.  Like the refresh_pattern rules, I’m maintaining the list of range_offset_limit rules in a linked list, in the order in which they appear in the config file.  Each time a check is made to see if a range request exceeds range_offset_limit, it will iterate through the list from top to bottom looking for a match.  As soon as a match is found, it will stop and return the appropriate value.  If none is found, it will return the default limit of 0.  In this way, the range_offset_limit rules will work just like acls or iptables rules.
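Here’s a minimal sketch in C of the lookup I have in mind (the struct and function names are mine for illustration, not squid’s actual internals):

#include <stddef.h>

/* One range_offset_limit rule, kept in config-file order. */
struct rol_rule {
    long limit;               /* bytes, or -1 for no limit             */
    int (*matches)(const struct rol_rule *rule, const char *url);
    struct rol_rule *next;    /* next rule in the linked list          */
};

/* Walk the list top to bottom; the first matching rule wins.
   If nothing matches, fall back to the default limit of 0. */
long effective_range_offset_limit(const struct rol_rule *head, const char *url)
{
    const struct rol_rule *r;
    for (r = head; r != NULL; r = r->next)
        if (r->matches(r, url))
            return r->limit;
    return 0;
}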

At least, that’s what’s supposed to happen.  😉  We’ll see.  I’m making the modifications to the 3.x branch, so I’ll have to make sure I’m careful to get that running right on our server to test it.  The Ubuntu repos are still serving up the 2.7 branch, so I’ve got to make sure I get the init scripts right and such.  I figure I can still use the old one, though.  I’ve already modified the Ubuntu-supplied squid init script so I can point it to a copy of squid installed anywhere on the disk.

squid and avast

This is one that I’ve been looking forward to solving.  The other programs I wanted to update always downloaded their files from the same url.  This made it easy to write refresh patterns that would cause them to retrieve the cached version.  Avast, on the other hand, seems to download its updates from a random mirror each time.  Some of the mirrors use urls in the form http://download[mirror number].avast.com/iavs4x/* while others use an ip address instead of a hostname, like http://[ip address]/iavs4x/*.

To allow squid to serve up cached versions, we have to redirect all the requests to the same url.  I chose a random mirror, http://download682.avast.com.  For the redirection, I used a program called squirm.  Squirm uses two configuration files: a list of addresses allowed to use redirects, and the redirect pattern file.

The documentation for squirm leaves a lot to be desired, but it wasn’t too hard to figure out.  For instance, the documentation says squirm.local (the allowable addresses) should contain a list of networks in the form xxx.xxx.xxx that would match a class C address range.  If you run it that way, though, you get an error.  Mine was 192.168.2, but the error said something like “Invalid IP address range 192.168.”  From that I figured out that it wanted a trailing period, but that’s not what the documentation said.  Using the correct syntax, mine is now “192.168.2.” (trailing period included).
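So the whole of my squirm.local is this one line, trailing period and all:

192.168.2.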

The installation is also pretty clunky, requiring the user to manually run make in a subfolder and move some files, and manually edit the main Makefile. The website says there is a new, undocumented version that fixes the installation issues, but if the current version is “documented” then I hate to see what “undocumented” means.

My biggest initial mistake after taking care of the installation was trying to use Perl-compatible regex syntax in squirm.patterns (the redirection pattern file).  It is in fact POSIX (extended?) syntax.  It uses the GNU regex library.

The way I got it to work was to use the following in my squirm.patterns:

regex ^http://.*/iavs4x(.*) http://download682.avast.com/iavs4x\1

Short, sweet, and to the point.  That line is actually the only thing in the file.  This will take any source url that contains the path element “iavs4x” (unique enough not to cause any problems, I think) and point it to the server I picked.

Now as for the squid configuration, that took just a tad of Google searching.  The guide I found wanted you to set a directive called “redirector_program”, and another called “httpd_accel_uses_host_header”.  These have changed a bit in the most recent version of squid (and I don’t know how far back).  The “redirector_program” directive has become “url_rewrite_program”, but it works the same way.  You’ll also want to set “url_rewrite_children”; squirm recommends 10 as a default.  That’s the number of instances of your redirector squid will keep on hand; each one handles 1 url at a time.  If you have a really busy server, you may have to increase this to keep your clients from having to wait.  As for “httpd_accel_uses_host_header”, it is supposed to be handled automatically as part of the http_port line.  The example I saw used “http_port 3128 transparent”, so I don’t know whether the redirector would work on a non-transparent squid proxy or not.
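Putting that together, the relevant squid.conf lines look something like this (adjust the squirm path for wherever you installed it):

url_rewrite_program /usr/local/squirm/bin/squirm
url_rewrite_children 10
http_port 3128 transparent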

The last part is the refresh pattern.  It’s just a variation on the patterns I used for the other anti-malware programs:

refresh_pattern http://download682.avast.com/.* 1440 100% 1441 ignore-reload override-lastmod override-expire

Note that we just matched it against the url AFTER the redirection. Cool, ne?

BTW, everything seems to be going well with my refresh pattern lines.  They seem to be downloading fresh copies once per day.  I’ll still keep an eye out, though.

squid and malwarebytes

One of the tools we use the most around here at the shop is Malwarebytes Antimalware. Today, I wrote a refresh pattern to get squid to cache its updates properly (or rather, convince malwarebytes to download them from the cache). I set malwarebytes to download a fresh copy every day, and I went back and set super antispyware to do the same.  Here’s what the malwarebytes rule looks like:

refresh_pattern http://mbam-cdn.malwarebytes.org/.* 1440 100% 1441 ignore-reload override-lastmod override-expire

The values for the superantispyware rule are also 1440, 100%, and 1441.

Let’s hope I understand the docs for refresh_pattern!  We’ll find out in a couple of days based on whether or not our definitions actually get updated.  😉