Planet OSL
Class inheritance in C
For a while now I have been slowly developing a touchscreen music player to replace the stereo in my car. Although that concept should be fairly simple, the project has turned into a bit of a playground for learning how to implement an X toolkit using nothing but C, XCB, and Cairo. One of the more recent legs this project has grown is an object-oriented system for C that supports class inheritance and method overriding, for use in my toolkit.
At this point I've got it almost the way I like it. Defining a class in a header looks something like this:
CLASS(mtk_widget, mtk_object)
    int x, y, w, h;
    struct mtk_window *window;
    struct mtk_widget *parent;
    cairo_surface_t *surface;
METHODS(mtk_widget, mtk_object, int x, int y, int w, int h)
    void (*init)(mtk_widget_t *this, mtk_widget_t *parent);
    void (*draw)(mtk_widget_t *this); /* children must implement this */
    void (*update)(mtk_widget_t *this);
    void (*mouse_press)(mtk_widget_t *this, int x, int y);
    void (*mouse_release)(mtk_widget_t *this, int x, int y);
    void (*mouse_move)(mtk_widget_t *this, int x, int y);
    void (*set_geometry)(mtk_widget_t *this, int x, int y, int w, int h);
    void (*set_parent)(mtk_widget_t *this, mtk_widget_t *parent);
END
This defines a new class named mtk_widget based on the base class mtk_object. The extra arguments to the METHODS macro are additional arguments for the object constructor.
To actually implement the class the corresponding C file will have something like this:
/* more methods up here */

static void set_parent(mtk_widget_t *this, mtk_widget_t *parent)
{
    this->parent = parent;
}

mtk_widget_t *mtk_widget_new(size_t size, int x, int y, int w, int h)
{
    mtk_widget_t *this = mtk_widget(mtk_object_new(size));
    SET_CLASS(this, mtk_widget);
    this->x = x;
    this->y = y;
    this->w = w;
    this->h = h;
    return this;
}

METHOD_TABLE_INIT(mtk_widget, mtk_object)
    METHOD(init);
    METHOD(draw);
    METHOD(set_geometry);
    METHOD(set_parent);
METHOD_TABLE_END
Methods can (and probably should) be static functions. The METHOD_TABLE_INIT macro defines the function _mtk_widget_class_init(), which fills in a hidden structure named _mtk_widget_class: the class's virtual method table. Each object then has a pointer tucked away inside to its class's method table (set by the SET_CLASS macro). One thing that is a bit clunky about this is that the program must somehow call _mtk_widget_class_init() before the class is ever used. I would like to do away with that by using GCC's constructor function attribute so it magically runs before main(). However, the class hierarchy must be initialized in order, and a way to order the constructors was not added until GCC 4.3. I'm using 4.1 on my system, so that feature will have to wait for now.
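The ordering constraint is easier to see in a toy model. The sketch below is Python rather than the toolkit's C, and it assumes (the post does not say so) that a child's method table starts as a copy of its parent's table before the child's own methods are filled in. The names CLASS_TABLES, class_init, new_object and call_method are hypothetical and exist only for this illustration.

```python
# Toy model of per-class method tables; not the toolkit's actual layout.
CLASS_TABLES = {}  # class name -> {method name: function}


def class_init(name, parent, methods):
    """Build the method table for `name`, inheriting from `parent` first."""
    if parent is None:
        table = {}
    else:
        if parent not in CLASS_TABLES:
            # The ordering problem: the parent's table must already exist.
            raise RuntimeError("%s must be initialized before %s" % (parent, name))
        table = dict(CLASS_TABLES[parent])  # inherit every parent method
    table.update(methods)  # the class's own methods override inherited ones
    CLASS_TABLES[name] = table


def new_object(class_name, **fields):
    """Create an object that carries a reference to its class's method table."""
    obj = dict(fields)
    obj["_methods"] = CLASS_TABLES[class_name]  # the role SET_CLASS plays in the post
    return obj


def call_method(obj, method, *args):
    """Look the method up in the object's table and pass the object as `this`."""
    return obj["_methods"][method](obj, *args)


def object_describe(this):
    return "object"


def object_is_object(this):
    return True


def widget_describe(this):
    return "widget %dx%d" % (this["w"], this["h"])


# Parent first, then child; swapping these two lines raises RuntimeError.
class_init("mtk_object", None, {"describe": object_describe, "is_object": object_is_object})
class_init("mtk_widget", "mtk_object", {"describe": widget_describe})

widget = new_object("mtk_widget", w=640, h=480)
print(call_method(widget, "describe"))   # overridden in mtk_widget
print(call_method(widget, "is_object"))  # inherited from mtk_object
```

If the tables really are built this way, an unordered constructor attribute could run the child's initializer before the parent's table exists, which is the failure the RuntimeError stands in for here.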
Actually using objects looks something like this:
int main (int argc, char *argv[])
{
    mtk_window_t *window;
    mtk_widget_t *widget;

    mtk_init();
    window = new(mtk_window, 640, 480);
    widget = mtk_widget(new(mtk_text, 0, 0, 640, 480, "WHEE"));
    call(window, mtk_container, add_widget, widget);
    mtk_main();
    mtk_cleanup();
    return 0;
}
Pretty straightforward, though call() is a bit weird. It takes the arguments (object, class name that defined the method, method name, method arguments). I would like to be able to get rid of the class argument somehow (it is used to cast object to the correct type) but I haven't come up with a solution yet. I have a similar clunkiness with my super() macro for calling a parent class's method when doing method overriding.
That's pretty much it. As with the rest of this project, it is a good example of making things more complicated than they need to be, but it was an interesting challenge to put together.
The code for the above macros and base class can be found in this header and c file.
Surpassed 100 Moodle sites
Wow. I am amazed at how quickly teachers are latching onto the Oregon Virtual School District stuff.
Since I set up a system to automatically create school Moodle servers on demand, 109 Moodle instances have been spawned on the shared-code Moodle system. That means at least one teacher from 109 different public schools and ESDs in Oregon has been interested enough in the ORVSD offerings to give it a try. Of course, only a small number of them are far enough along to be using it with their students, but there are quite a few teachers getting up to speed with how the tools work. I’m delighted to see the system seems to be scaling up remarkably well. It’s not having any problem at all dealing with so many separate vhosts/databases running on the same Moodle install.
I expect we’re going to see an explosion of use by the end of the summer and beginning of next fall. I guess I better get those disk arrays ordered … we’re probably going to need a lot more disk space soon.
Hi! I'm a Google SoC student for 2008!
You know, there was a man that lived here once that had a prize-fighting kangaroo.
This blog will house all my notes and ruminations along the way. This is also where I'll be posting about my volunteer work on the OLPC project.
cfengine 2.2.6
I've built updated cfengine 2.2.6 packages for el4 and el5. You can find them in the usual spot:
http://sheltren.com/downloads/cfengine/testing/
The same package has been built for Fedora development/rawhide and should be available there soon.
Some bug fixes as well as support for detecting Xen hosts/guests are now included. Let me know if you have any problems!
Ubuntu 8.04
Ubuntu 8.04 was just released. We're having fun watching the bandwidth graphs. See the graph below and try to figure out when they made the release announcement...

I should note that is just the traffic coming out of our FTP server here in Corvallis. We also have two remote FTP mirrors in Chicago and Atlanta which are each pushing ~350Mbit/sec.
Update: Here's an image showing each of our three FTP mirrors and how much traffic they are each pushing.

InnoTech Portland
Yesterday I was up in Portland at the InnoTech conference to give a "lightning talk" on the OSU Open Source Lab. It was a lot of fun to meet a lot of people I've been in contact with on the phone or via email and put a face to a name. There's a great group of people in the Portland area involved with Open Source, and I really need to get up there more often to see some of them!
The lightning talk format was interesting and pretty fun, and I think it really helped keep things from getting boring. I think the basic idea was borrowed from Ignite, which puts on events where people can talk, but the slides are pre-timed so you've got to go pretty fast. In our case, each presenter had five minutes total: 20 slides, timed at fifteen seconds per slide, and we had no control over moving the slides forward or backward. I enjoyed it, although I was trying to cram too much into each slide and was talking very fast the whole time!
Here's a list of the other people that were presenting during the same session:
Nate Angell, The rSmart Group
Jennifer Cloer, Page One Public Relations
Stuart Cohen, Collaborative Software Initiative
Selena Deckelmann, PostgreSQL
Bjorn Freeman-Benson, Eclipse Foundation
Brian Jamison, OpenSourcery
Steve Morris, OTBC and Managing Director, OregonStartups.com
Scott Kveton, Vidoop
Brandon Philips, Novell
It was very cool that a number of the other presenters also mentioned the OSUOSL in their talks! We're glad to be making such a large impact. If you are interested in checking out my slides, you can download them here as a PDF.
The session was part of the Open Source Summit at InnoTech, a full day of presentations related to Open Source. Raven Zachary from the 451 Group did a great job at organizing the Open Source part of the conference.
Just before the lightning talks was a panel on "Open Source in the Public Sector". This was led by Deb Bryant, and Greg Lund-Chaix was one of the panelists; both of them are also from the OSUOSL. This was an interesting discussion and the panelists all gave some good examples of how and why they are using Open Source. Greg was talking about his involvement in the Oregon Virtual School District, which is a really exciting partnership with the Oregon Department of Education to provide online courses and course materials at no cost to the teachers.
Go Comcast Go
Let's all give it up for Comcast, come on now:
Go Comcast Go!
Ra Ra Ree!
Kick Em In The Knee!
Ra Ra Rass
Kick Em In...The Other Knee!
A couple weeks ago I got several disturbing reports from the midwest, mainly of the "drupal.org has been down for 6 hours" variety. My response to which was frantically going to drupal.org, watching it load perfectly and then enjoying a large steaming cup of wtf.
This lovely period was capped by helping an employee of a Drupal company in Boston try to track down this issue when it happened to them. He was very patient indeed and sent me the numerous tcpdumps, traceroutes, ping requests and netstat printouts that I wanted. It was quite the interesting issue. Any computer in their office could ping the individual drupal.org webnodes, but not the master virtual IP. SYN packets were getting to the load balancer managing the master VIP, but the SYN,ACK was never getting back to them.
The most frustrating part of this was that every once in a while a connection would go through and get to ESTABLISHED... then die, and we would go back to connections waiting for an ACK.
We beat at this for a while, and then someone called Comcast and we discovered a lovely feature called "Smart Packet Detection." This "protects" the intertubes from clogging by noting when many packets are going to a single IP address and then blocking that IP for a while... except, apparently, for SYN packets? That part makes no sense.
Anyway, they requested that feature be turned off and connectivity immediately returned.
it’s a geeky meme
lars@bozeman:~$ history|awk '{a[$2]++} END{for(i in a){printf "%5d\t%s\n",a[i],i}}'|sort -rn|head
133 cd
114 ls
44 svn
31 vi
28 python
24 ssh
21 ./ConfigurationManager.py
17 make
13 rsync
It looks to me like I spend too much time moving around the file system. I should try to type more pathnames and stick around in one place…
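For anyone who finds the awk hard to read, here is a rough Python sketch of the same tally. It assumes each history line looks like "<number> <command> <arguments>", which is what the $2 in the one-liner relies on. The function name top_commands and the sample lines are made up for the example.

```python
from collections import Counter


def top_commands(history_lines, limit=10):
    """Count the second field of each line, like a[$2]++ in the awk version."""
    counts = Counter()
    for line in history_lines:
        fields = line.split()
        if len(fields) >= 2:  # field 1 is the history number, field 2 the command
            counts[fields[1]] += 1
    return counts.most_common(limit)  # highest counts first, like sort -rn | head


sample = [
    "  1  cd /tmp",
    "  2  ls -l",
    "  3  cd ..",
    "  4  svn up",
]
for name, count in top_commands(sample):
    print("%5d\t%s" % (count, name))
```

Feeding it real data would mean passing in the lines of your shell's history output instead of the sample list.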
Minefield?
Minefield, pfft… what kind of name is that?

From a funny love letter Mozilla received. It’s part of a nice picture series wired.com took at the Mozilla headquarters. Check it out.
GSoC application deadline extended
All right you students out there … here’s your chance to get paid to work on open source software projects. Google is offering to pay you $4,500 this summer to work on an open source project, all you have to do is apply! They’ve even extended the application deadline to Monday, April 7th to make sure you have time to get your application finished.
So what are you waiting for? Apply now at: http://code.google.com/soc/2008/
Easter Sunday With Heartbeat And MySQL
I admit that I'm feeling somewhat relieved today. I've had a large migration on my todo list for quite a while now. Last Sunday I finally sat down for the 16 hours it required and got it done (with the help of several local coffee shops). First, a little back story.
Drupal.org has had a somewhat standard DB setup: a simple master-master active/passive replication system, with DNS ptrs to control the failover. I've been unhappy with this for several reasons, mainly the lack of automated and instantaneous failover. However, I've also had issues with having a single MySQL instance on each server. These are very powerful servers, with multiple cores/CPUs, and they could be very highly concurrent. MySQL/InnoDB has known issues with lock contention at high levels of concurrency, and these issues prevent full utilization.
There is yet another issue, though: drupal.org and its subsites are a somewhat odd situation. Not only is this a "production infrastructure", but it is used as a test-case and in many ways a development environment. This means we have many copies of the drupal.org database, which are used for testing. MySQL doesn't know these aren't production and treats their pages the same as any other database in memory. This quickly became a problem: even with a very large InnoDB buffer pool, it fills after about a day, starving the production DB and the write buffer (among other things).
My solution to all of this was to split the various databases across three MySQL instances, one for drupal.org, one for all the other production sites and one for testing sites. I could then tune them individually and I would get less lock contention across the instances as concurrency scaled up.
So, I decided to put this plan into action Sunday and at the same time deploy Heartbeat (Linux-HA) to enable automated failover. This plan required more IP addresses than were available on the current VLAN, so roll switching VLANs into all of this. Looking back, I probably should have split these up into multiple days....but I do like a challenge (sadly).
I spent quite a bit of time planning out all of this beforehand. The first problem was Heartbeat. This software package comes in two versions (named version 1 and version 2, amazingly enough). The first is very simple and limited (it can only handle two nodes, can't have its config updated "online", etc.) and the second is very feature rich and very very complicated (and configured through daemon-managed XML files... hooray). So, I read the various papers checked into CVS for heartbeat and experimented with it on scratchvm.drupal.org and my workstation. Like many things, once you understand how the developers are thinking it becomes quite simple.
I sketched out a plan and on Sunday started things going:
Step 1: Fail DB1 to DB2
This was easy enough, I've done it many times.
Step 2: Bring DB1 up on the new VLAN and new IP
This was also simple. The only odd thing was that I changed the hostname to db1-static. Why? Because this "main" IP is one of the few on this box that won't eventually be moved around and managed by heartbeat.
Step 3: Bring up heartbeat
I had already written the cluster configuration for heartbeat, so all I had to do was check all of the configuration into our central management system, create authentication keys for the cluster...check them into our protected repository...etc..etc..etc. Eventually heartbeat came up and brought up six new IP addresses with it: db1-main-vip, db2-main-vip, db1-other-vip, db2-other-vip, db1-test-vip and db2-test-vip. The breakout is fairly obvious here: one IP for each MySQL instance on the master and the slave. Heartbeat manages all of these.
Step 4: Bring up the MySQL instances
Again...fairly simple. I wrote a new config for each instance and had the Gentoo initscript start each one in turn. The only odd part was splitting everything up into distinct subdirs (as the default MySQL configuration definitely assumes it is the only instance running).
Step 5: Load the DBs and start replicating off the slave
This was a bit tricky. Keep in mind that the slave is running a single MySQL instance and the master is now running 3...each of which will be replicating a subset of the databases on the slave. It wasn't that bad, but I was careful to get it working correctly.
Step 6: Set up the backend network and iptables
I brought up 3 more private IPs on the backend network between the two database boxes. These are to allow replication between distinct IPs on the backend network, each IP mapped to a MySQL instance. This brings up an interesting point. Let's look at the main MySQL instance. This will be listening on db1-main-vip. But wait...what if db2-main-vip gets failed over to the master server? This will bring up a new IP on the master which that instance has to listen on. Also, we now have a backend IP which that instance has to listen on. Simple enough, right? Ya..no. MySQL can only listen on one IP address (or all IP addresses, bug #14979). So, I had to put some iptables rules in place to rewrite these other IP addresses to one on which the MySQL instance was listening.
Step 7: Start replication and fail back to the MySQL instances on db1
Not much to say here, but I was very happy when I saw it working.
Step 8: Rinse and repeat for db2
Step 9: Profit?
This was a rather long day, but the entire setup is working quite well now. I still need to get cluster monitoring going and there are quite a few scripts I use that need to be ported to the idea of running multiple instances, but I'm finally mildly proud of our DB setup. (and it looks awesome when you diagram it)
OSUOSL Accepted into Google Summer of Code 2008
I was very excited to see that the OSUOSL was accepted to participate as a mentoring organization in this year's Google Summer of Code (GSoC) program. This gives us a great opportunity to work with students not only at Oregon State University, but at schools worldwide and help them to get involved with some great Open Source communities.
If you are not familiar with GSoC, I would recommend checking out the official Google website for the program. The basic idea is that Google sponsors students (to the tune of $4500 each, plus $500 per student that goes to the mentoring organization) to work on Open Source projects for the summer. This is great for a number of reasons: first of all, it gets a LOT of code written. Secondly, it attracts a lot of new contributors to many Open Source projects with the hope that they will continue working with the project even after the summer is over. If you are a student interested in participating in GSoC, check out the mentoring organizations listed here, and specifically, the OSL would love to have people work on some of the ideas we have posted on our GSoC ideas page. Remember, you are not limited to the ideas that are listed; if you have a cool idea, bring it up with us on our IRC channel (#osuosl on irc.freenode.net).
Last year, I participated as a GSoC mentor for the Fedora Project. It was a great experience and opened my eyes to what a wonderful project GSoC is, both for the mentoring organizations and for the students. I'm very excited to participate again this year, and I look forward to seeing some fun student projects!
cfengine package updates
I've finally gotten around to updating the EL4 and EL5 cfengine RPMs here. These are rebuilds of the cfengine 2.2.3 package from Fedora. You can get them at the new location:
http://sheltren.com/downloads/cfengine/testing/
Let me know what breaks...
HOWTO: Shared-code hosting for Moodle
When I first started working with Moodle servers, one of the things that bugged me was the fact that it required a complete install of the code for every site hosted on the system. While that’s fine for most circumstances, it really did not work well in our environment where we’re looking at potentially hosting hundreds of Moodle instances. So, in the fine open source tradition of scratching an itch by finding something someone else has done, modifying it, and then sharing it with the world … I give you shared-code Moodle, OSL-style.
First, though, credit where credit is due. Martin Langhoff posted almost all of what we needed to do here. All I needed to do is expand upon it to fit our needs.
Second, what the modified code actually does:
1) config.php looks in Moodle dirroot/multisite_config for an ini file matching the server name, e.g. fqdn.domain.org.ini
2) If found, the ini file is parsed and used to populate the Moodle $CFG
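To make those two steps concrete before the real PHP, here is a minimal sketch of the same lookup in Python. It is not part of the Moodle setup. The function name load_site_config is hypothetical, and the hostname check is an extra precaution I am assuming is wanted, since the host name comes from the client. Section headers are flattened away, which mirrors what PHP's parse_ini_file does when sections are not requested.

```python
import configparser
import os


def load_site_config(dirroot, hostname):
    """Return the settings for `hostname` as a flat dict, or None if there is no ini."""
    # Refuse names that could point outside multisite_config/ (e.g. "../other").
    if not hostname or hostname in (".", "..") or os.path.basename(hostname) != hostname:
        return None
    path = os.path.join(dirroot, "multisite_config", hostname + ".ini")
    if not os.path.isfile(path):
        return None  # the caller would redirect to a default site here
    parser = configparser.ConfigParser(interpolation=None)  # keep '%' in passwords literal
    parser.read(path)
    settings = {}
    for section in parser.sections():
        settings.update(parser[section])  # ignore section names, keep key/value pairs
    return settings
```

A caller would then copy the returned values (dbhost, dbname, dbuser, dbpass, wwwroot) into its own configuration object, the way the PHP fills in $CFG.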
On to the code!
1) Create a directory in your Moodle wwwroot named multisite_config and make sure it’s readable by the web server
2) Create a moodledata_shared directory to hold the various sites’ moodledata directories
3) Modify the wwwroot/config.php file to look like:
<?php /// Moodle Configuration File
unset($CFG);
$CFG = new stdClass();
$CFG->dirroot = '/var/www/moodle_shared';
// Determine hostname
$hostname = $_SERVER['HTTP_HOST'];
if (isset($_ENV['HTTP_HOST'])) { // this is to support cronjobs on a per-host basis
    $hostname = $_ENV['HTTP_HOST'];
}
// The host name is supplied by the client; keep it from escaping multisite_config/
$hostname = basename($hostname);
// Load multi-site configs
$multisite_config_filename = "$CFG->dirroot/multisite_config/$hostname.ini";
if (file_exists($multisite_config_filename)) {
    $sites_array = parse_ini_file($multisite_config_filename);
} else {
    // Whoops! No ini found, fall back to a default Moodle host
    $URL = "http://default-moodle.domain.org";
    header("Location: $URL");
    die("<pre>Unable to open site configuration file for '$hostname'. Has the config file been created?\n</pre>");
}
$CFG->dbtype = 'mysql';
$CFG->dbhost = $sites_array['dbhost'];
$CFG->dbname = $sites_array['dbname'];
$CFG->dbuser = $sites_array['dbuser'];
$CFG->dbpass = $sites_array['dbpass'];
$CFG->dbpersist = false;
$CFG->prefix = 'mdl_';
$CFG->wwwroot = $sites_array['wwwroot'];
$CFG->dataroot = '/var/www/moodledata_shared/'.$hostname;
$CFG->admin = 'admin';
4) Create an ini file for each Moodle instance you want to host and place it in the Moodle dirroot/multisite_config/ directory. An example ini file for a fictitious template.domain.org Moodle instance:
; multisite_config.ini
[template]
dbhost = localhost
dbname = template_db_name
dbuser = template_user
dbpass = dbpassword
wwwroot = http://template.domain.org/moodle
5) Create a moodledata_shared/hostname directory for the moodledata stuff.
I have a canned mysqldump file I use to create new Moodle sites instead of using the GUI (along with an automated system to create the databases and generate the ini files, but that’s a topic for another post another time). Since I wanted to avoid creating a vhost for every Moodle instance and having to bounce Apache every time I add a new one, I set up a wildcard domain to point at the moodle_shared webroot. But this should work equally well with explicitly-defined Apache vhosts.
Todo:
1) Modify the GUI config to have it write the ini files directly.
2) Look into having site-specific modules. As it stands right now, all sites get the same modules. So far we haven’t run into module conflicts or sites wanting customized versions of modules, but I expect it’s only a matter of time.
3) Performance and scaling testing. This seems to work well enough with the approximately 50 low traffic sites I’m running now, but I’m not sure how much of a penalty the reading and parsing of the ini file imposes. It may not scale well on high traffic sites.
3a) Not sure how well this might be adapted to a multi-server or clustered environment.
Sanity Compromised by Firefox and ssh X Forwarding
Try this in Linux: open Firefox on your local machine. Then open a terminal window and ssh to another machine using the -X option for X forwarding. On the remote machine, start Firefox. The behavior I get is so bizarre that it cannot be a bug — somehow this looks intentional. The Firefox process on the remote machine sits for a few moments and then dies. Then a new local Firefox window opens. WTF?
I thought I was going insane. The people at the OSL that I told about this thought I was insane. The Mozilla developers that I work with and tried to explain this to thought I was insane.
Some research shows this: the remote Firefox actually starts and communicates with the X server running on the local machine. The X server tells the remote Firefox that there is already a process called Firefox running. The remote Firefox then sends a message to the local one to open a new window and then the remote Firefox dies. This protects a user from creating too many instances of a Firefox process on their machine. Clever, huh? But totally WRONG and counterintuitive!
Apparently you can stop this behavior if you start the remote Firefox with the intuitively named “-no-remote” switch. That prevents the remote Firefox from “connecting” to the local Firefox.
Sigh, there goes an hour of my life that I’d like to have back…
Python Generators: Searching Java Jar Files
Here is an example of a utility that uses a recursive generator. It is a command line utility that assists Java programmers in finding missing classes. I wrote this script several years ago when I was dragged kicking and screaming into a Java project. The script recursively searches a directory tree for jar files. When it finds a jar file, it scans the jar’s table of contents for the target Java class.
#!/usr/bin/python
import sys, os, os.path
import fnmatch

def findFileGenerator(rootDirectory, acceptanceFunction):
    for aCurrentDirectoryItem in [ os.path.join(rootDirectory, x) for x in os.listdir(rootDirectory) ]:
        if acceptanceFunction(aCurrentDirectoryItem):
            yield aCurrentDirectoryItem
        if os.path.isdir(aCurrentDirectoryItem):
            for aSubdirectoryItem in findFileGenerator(aCurrentDirectoryItem, acceptanceFunction):
                yield aSubdirectoryItem

if __name__ == "__main__":
    rootOfSearch = '.'
    if sys.argv[1:]:
        rootOfSearch = sys.argv[1]
    if sys.argv[2:]:
        classnameFragment = sys.argv[2].replace('.', '/')
        def anAcceptanceFunction (itemToTest):
            return (not os.path.isdir(itemToTest) and fnmatch.fnmatch(itemToTest, '*.jar') and
                    classnameFragment in os.popen('jar -tf %s' % itemToTest).read())
    else:
        def anAcceptanceFunction (itemToTest):
            return not os.path.isdir(itemToTest) and fnmatch.fnmatch(itemToTest, '*.jar')
    try:
        for x in findFileGenerator(rootOfSearch, anAcceptanceFunction):
            print(x)
    except Exception as anException:
        print(anException)
The focus is on the generator function findFileGenerator. It creates an iterator for the results of a recursive search through a directory tree. It accepts as parameters a path to begin the search and a function to determine if a given file satisfies the search parameters.
Generators can be kind of confusing because even though they look like a function, they do not execute immediately when called. They return a reference to an object that works like an iterator. The code defined in the generator function is executed by that iterator object. The first time that the iterator’s ‘next’ function is called, execution begins at the beginning of the code and goes until it encounters a ‘yield’ statement. The ‘yield’ statement returns the next value of the iterator. The next time the ‘next’ function is called, execution resumes at the next statement after the ‘yield’.
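A tiny experiment makes the delayed execution visible. This is only an illustrative sketch in Python 3 syntax. The generator traced_range and its log list are invented for the demonstration and have nothing to do with the jar utility.

```python
def traced_range(n, log):
    """Yield 0..n-1, recording in `log` when each part of the body runs."""
    log.append("body started")
    for i in range(n):
        yield i
        log.append("resumed after %d" % i)
    log.append("body finished")


log = []
it = traced_range(2, log)
print(log)        # [] because calling the generator function ran none of its body
print(next(it))   # 0
print(log)        # ['body started']
print(next(it))   # 1
print(log)        # ['body started', 'resumed after 0']
```

The built-in next() is how Python 3 spells the call to the iterator's ‘next’ function. A for loop makes the same call behind the scenes.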
Let’s examine this example closely. Imagine that the first call to the generator has happened and we’ve got the resultant iterator-like object. The first call on that object to ‘next’ starts execution at this line:
for aCurrentDirectoryItem in [ os.path.join(rootDirectory, x) for x in os.listdir(rootDirectory) ]:
Here we’re getting our first directory listing of all the files in the current directory. Because the call to ‘os.listdir(rootDirectory)’ returns a list of file names with their paths stripped off, we’re going to have to re-attach them. The list comprehension (the code between the [ … ]) welds the current directory path to each of the files in the list and returns a new list. The for loop then sets us up to iterate through that list.
if acceptanceFunction(aCurrentDirectoryItem):
    yield aCurrentDirectoryItem
Here’s where we decide if the current entry in this directory is interesting or not. We call the acceptance function on the item. Since the acceptance function was passed in when we originally called this generator, it could be anything the programmer desired. In the case of this particular utility, we’re looking for Java jar files that meet certain criteria. But it really could have been anything at all: find all files that have vowels in their name, or all files that have a specific type or content.
If the acceptance function returns ‘True’, then we yield. The current file is returned by the iterator and execution stops until the ‘next’ function is called.
if os.path.isdir(aCurrentDirectoryItem):
If the acceptance function rejected the item, this is immediately the next line to execute. If the acceptance function accepted the item, this line won’t be called until after the next call to ‘next’. In either case, our goal is to find the next item for the iterator to return.
Since we’re iterating through a list of entries in a directory, some of those will be directories themselves. The item that we sent to the acceptance function could have been a subdirectory. Regardless of the outcome of the acceptance function, we need to recurse into subdirectories.
for aSubdirectoryItem in findFileGenerator(aCurrentDirectoryItem, acceptanceFunction):
    yield aSubdirectoryItem
Hang onto your hat, here’s where your brain may explode. We’ve got a sub-directory and we need to recurse into it and iterate through its entries. Well, we’ve got this handy generator that does exactly that: it returns an iterator that will cycle through the contents of a directory. ‘for’ statements in Python have a special relationship with iterators. You can provide one instead of a list and the ‘for’ loop will dutifully iterate through it for you. We recursively call the generator, passing in the subdirectory and the acceptance function. The generator returns an iterator to us and the for statement starts the iteration by silently calling the next function. Remember that the iterator returns only items that have passed the acceptance function, so each item that we get here we’re just going to pass on as the next item in our iterator. Hence, we yield every item that we get in this loop.
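For readers on Python 3.3 or later, the inner loop that re-yields every item can be written with ‘yield from’. The sketch below is a hypothetical variant, not the original script: find_files is a made-up name, and the sorted() call is added only so the output order is predictable.

```python
import os


def find_files(root, accept):
    """Recursively yield every path under `root` for which accept(path) is true."""
    for name in sorted(os.listdir(root)):  # sorted only to make the order predictable
        path = os.path.join(root, name)
        if accept(path):
            yield path
        if os.path.isdir(path):
            # Same effect as: for item in find_files(path, accept): yield item
            yield from find_files(path, accept)
```

Calling find_files('.', lambda p: p.endswith('.jar')) returns an iterator right away, and the directory walk only advances as the caller asks for items.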
The rest of the file is in the problem domain: a command line utility that will find Java Jar files with certain classes in them.
if __name__ == "__main__":
Perhaps someday in the future, we’ll want to use the generator in another application. By putting the code of the command line utility under this ‘if’, we’ll prevent it from executing when we use the ‘import’ statement on this file.
rootOfSearch = '.'
if sys.argv[1:]:
    rootOfSearch = sys.argv[1]
The root of the path that we’re to search is optional on the command line. If no path is specified, we’ll assume that we’re to start in the current working directory.
if sys.argv[2:]:
    classnameFragment = sys.argv[2].replace('.', '/')
    def anAcceptanceFunction (itemToTest):
        return (not os.path.isdir(itemToTest) and fnmatch.fnmatch(itemToTest, '*.jar') and
                classnameFragment in os.popen('jar -tf %s' % itemToTest).read())
The name of the class that we’re to search for is also optional. If the user does not provide one, then we’ll assume that we’re to just find all jar files regardless of their content.
This code fragment is the other case: a fragment of a class name has been given. It is our task here to create an acceptance function that meets the criterion.
First thing to do is cook the class name a bit. In Java, class names are qualified with package paths. Inside Java code, ‘.’ is used as a separator. However, inside jar files, ‘/’ is the separator. To be friendly, we want Java programmers to be able to use either notation. We make sure the command line argument is converted to the ‘/’ notation and stored in ‘classnameFragment’. Next we define an acceptance function that receives a pathname as a parameter. All we have to do is subject that pathname to some tests and give it either a thumbs up or down. In this case, we test that the pathname is not a directory, then test that it is a jar file, and finally run the command line tool ‘jar -tf’ to get a listing of the jar and see if our class name fragment is in there. Since Python does “short-circuit” evaluation of boolean expressions, if any of the earlier tests fail, the later tests do not get executed.
else:
    def anAcceptanceFunction (itemToTest):
        return not os.path.isdir(itemToTest) and fnmatch.fnmatch(itemToTest, '*.jar')
In the case where the user did not provide a class name fragment, we assume that we’re looking for all jar files. The acceptance function here just drops the additional criterion where we look into the content of the jar file.
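Both acceptance functions can also be built by one small factory. The sketch below swaps in a different technique from the script: instead of running ‘jar -tf’, it reads the jar's listing with Python's zipfile module, which works because jar files are zip archives. The names make_jar_acceptor and jar_contains are hypothetical, and treating unreadable archives as non-matches is my assumption.

```python
import fnmatch
import os
import zipfile


def jar_contains(path, fragment):
    """True if any entry name in the jar's listing contains `fragment`."""
    try:
        with zipfile.ZipFile(path) as jar:
            return any(fragment in name for name in jar.namelist())
    except (zipfile.BadZipFile, OSError):
        return False  # unreadable or corrupt archive: treat it as no match


def make_jar_acceptor(classname_fragment=None):
    """Return an acceptance function for jar files, optionally filtered by class name."""
    # Accept either 'org.example.Foo' or 'org/example/Foo' from the user.
    fragment = classname_fragment.replace(".", "/") if classname_fragment else None

    def accept(path):
        # Cheap tests first: `and` stops at the first false operand, so the
        # archive is only opened for paths that already look like jar files.
        return (
            not os.path.isdir(path)
            and fnmatch.fnmatch(path, "*.jar")
            and (fragment is None or jar_contains(path, fragment))
        )

    return accept
```

Passing make_jar_acceptor('org.example.Foo') as the acceptance function finds jars containing that class, while make_jar_acceptor() accepts every jar file.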
try:
    for x in findFileGenerator(rootOfSearch, anAcceptanceFunction):
        print(x)
except Exception as anException:
    print(anException)
Finally, we’re ready to actually use the tools. We call the generator function with the path from which to start the search and our acceptance function. That returns an iterator that we loop through, printing each matching jar file.
DrupalCon: Performance Tuning
Today I was sent a draft of the slides that several others and I will be working from at DrupalCon in Boston on Tuesday. This session will cover Performance Tuning for Drupal deployments. It will cover pretty much every aspect, from Linux tuning to opcode caches to MySQL tuning. I am particularly looking forward to David Strauss's section on database design and Scott Mattoon's on DTrace.
Speaking of which, one of the interesting parts of this session is that it's one of the panel sessions where multiple people are cooperating on bringing together a quality presentation. Seeing the slides and knowing the people involved, I'm quite hopeful. They may even make up for my being involved. The presenters are:
- Khalid Baheyeldin
- Jeremy Andrews
- David Strauss
- Narayan Newton
- Scott Mattoon
- Robert Douglass
This is quite a lineup (again with one rather disappointing exception). I'm looking forward to listening from the stage.
Open Document Panel Video Released
During several days of last October’s Government Open Source Conference, we captured some of the sessions on video. We can’t cover them all, but I try to pick what we think will be of greatest interest after the conference is wrapped.
My first Flick Pick of the Week is the Executive Panel on Open Document Formats. It may be a bit backwards to start with the closing panel, but this topic will change soon enough so we didn’t want to sit on it too long. In fact since the panel was taped, the OpenDocument Foundation, which made news by taking a position for a different format altogether, has retired as an entity.
Participants included Adobe’s James King; IBM’s Arnaud LeHors; Microsoft’s Jason Matusow; OpenDocument Foundation’s Paul “Buck” Martin; and Sun Microsystems’ Douglas Johnson. Thanks again to the panelists. (I’m sorry Jason has been swamped, but the other four panelists were able to take time to weigh in on questions that had been collected from the audience but not always fully answered during the limited time at the conference.)
The GOSCON site provides the slide set, video and an open discussion thread (the latter a first for the conference web site - we shall see.) Mo info mo betta; you be the judge.
Saving, Loading, and Interacting with Helix Producer Jobs
The encoding job is the object that the producer SDK revolves around. It contains the configuration for what, when and how something is encoded. It is defined by IHXTEncodingJob and implemented by CHXTEncodingJob. Most of the producer client logic will deal with creating and manipulating this object and its children.
This example shows an input being created and added to the job.
// Create the input
IHXTInput* pInput = NULL;
if (SUCCEEDED(res))
    res = m_pFactory->BuildInstance(IID_IHXTInput, pInitParams, (IUnknown**)&pInput);
// Set the input on the encoding job
if (SUCCEEDED(res))
    res = m_pJob->SetInput(pInput);
Encoding jobs can also be written to or read from XML files. CHXTEncodingJob implements IHXTUserConfigFile, which contains the methods WriteToFile(…) and ReadFromFile(…). This feature won’t be needed by all clients, but it’s easy and unobtrusive to add.
// Get the serialization interface
IHXTUserConfigFile* pSerializer = NULL;
res = m_pJob->QueryInterface(IID_IHXTUserConfigFile, (void**)&pSerializer);
// Save the job to a file
if (SUCCEEDED(res))
    res = pSerializer->WriteToFile(szInput);
IHXTEncodingJob isn’t the only interface that can be serialized. Audiences can be stored separately so they may be reused in different jobs.
This code snippet, taken from the command line producer app, shows an audience being read from a file.
IHXTAudiencePtr spAudDef;
res = m_spFactory->CreateInstance(IID_IHXTAudience, (IUnknown**)spAudDef.Adopt());
IHXTUserConfigFilePtr spSerial;
// retrieve serialization interface
if ( SUCCEEDED(res) )
    res = spAudDef->QueryInterface(IID_IHXTUserConfigFile, (void**)spSerial.Adopt());
// deserialize audience
if ( SUCCEEDED(res) )
    res = spSerial->ReadFromFile( audpath.c_str(), !m_Params.bDisableUpdateCodecs);









