Getting Started with Amazon EC2 using Python
April 27th, 2010
With the announcement of RHEL’s offering on Amazon Web Services, I wanted to write up some notes from the work I’ve done with EC2 and Python. Amazon provides a capable web console, but (not surprisingly) I’d rather do most of my work through a programmable API. The rest of this post covers the steps necessary to launch an instance, along with some other random notes from my experience.
Boto
The first step is to grab the boto library. Boto is a Python interface to Amazon’s web services (not just EC2 but S3 as well). Their site provides downloads, installation instructions, and source, so I won’t go into any more detail besides saying I use it.
Gather Amazon Information
There are a few things needed from your Amazon AWS account in order to create and connect to instances, all of which can be retrieved from the AWS web console.
Account Access Keys
The account access keys are effectively your username/password when connecting through boto. From the AWS console, click the Account tab at the top and navigate to Security Credentials. In the middle of the page you’ll find “Access Key ID” and “Secret Access Key”. Make note of these but be sure to keep them safe; these pretty much give full access to your environment.
Key Pairs
The key pairs are used for SSH authentication when connecting to your instances. These are generated through the AWS console itself (the Account tab from the previous section will simply link you back to the AWS console). Generating a key pair is pretty self-explanatory; just be sure to download the private key and keep it safe, since there is no way to retrieve a private key from Amazon other than that initial download link.
Image ID
To create an instance, you have to specify which Amazon Machine Image (AMI) to base the instance on. These can be found under the AMIs section of the web console. Determine which image you want based on what it provides and make a note of the ID. It will look something like “ami-12345678”.
Existing Red Hat customers can find more information on Red Hat’s Cloud Access page.
Connect to Amazon
There are two ways of passing your AWS Key and Secret Key to boto, either through environment variables or as arguments to the connect calls. If you choose the environment variable route, they must be named:
AWS_ACCESS_KEY_ID=foo
AWS_SECRET_ACCESS_KEY=bar
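If you go the environment variable route, a script can check up front that both variables are present instead of failing later at connect time. This is a minimal sketch using only the standard library; the helper name load_aws_credentials is made up for illustration and is not part of boto.

```python
import os

# The two variable names boto looks for, as listed above.
REQUIRED_VARS = ('AWS_ACCESS_KEY_ID', 'AWS_SECRET_ACCESS_KEY')


def load_aws_credentials(environ=None):
    """Return (access_key, secret_key) read from the environment.

    Hypothetical helper, not part of boto. Raises KeyError naming
    whichever variables are missing or empty.
    """
    if environ is None:
        environ = os.environ
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise KeyError('missing environment variables: ' + ', '.join(missing))
    return tuple(environ[name] for name in REQUIRED_VARS)
```

Passing a plain dictionary instead of os.environ makes the check easy to try out without touching your real settings.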
Once those are set, create a connection to EC2 in Python with the following snippet:
import boto
ec2conn = boto.connect_ec2()
If you choose to skip the environment variables, the keys can be passed directly to the connect call:
import boto
ec2conn = boto.connect_ec2(aws_access_key_id='foo', aws_secret_access_key='bar')
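In either case, it is important to realize that these calls default to the US East EC2 region. If you want to make this explicit or, more likely, connect to one of the other two regions, name the region when connecting. The region argument of connect_ec2 expects a RegionInfo object rather than a plain string, so the boto.ec2.connect_to_region helper is the simpler route:
import boto.ec2
region = 'us-west-1'  # one of 'us-east-1', 'us-west-1', 'eu-west-1'
ec2conn = boto.ec2.connect_to_region(region)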
That’s the main connection to EC2 and the one we’ll use for creating instances. There are others with different purposes, such as connecting to S3, the AWS load balancer features, and so on. Their names all start with “connect_”, so looking through the help for boto will give you a good idea of what’s available.
Create a new Security Group
A security group is basically Amazon’s firewall to your instances. The default security group is pretty restrictive, so we’ll create a new one that allows us access to SSH and HTTP:
name = 'SSH and HTTP Security Group'
description = 'Test security group'
ec2conn.create_security_group(name, description)
group = ec2conn.get_all_security_groups(groupnames=[name])[0]
group.authorize(ip_protocol='tcp', from_port='22', to_port='22', cidr_ip='0.0.0.0/0')
group.authorize(ip_protocol='tcp', from_port='80', to_port='80', cidr_ip='0.0.0.0/0')
Note: The create_security_group call returns a handle to the group, but I wanted to demonstrate retrieving an existing group as well.
The above should be pretty self-explanatory. The biggest thing to note is the line where the group is retrieved. Since a list is passed to groupnames we get back a list of matching groups. I can’t tell you how many times I attempted to act on the returned result without indexing a specific group inside of it. This is a common pattern all over boto, so you’d think I’d have learned after the first 30 times.
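One way to stop forgetting the index is to wrap the lookup in a tiny helper that insists on exactly one match. This is only a sketch: only_match is a hypothetical name, not something boto provides, and the list below stands in for the result of a real call such as get_all_security_groups.

```python
def only_match(results, what='result'):
    """Return the single item from a list-style lookup result.

    Hypothetical helper, not part of boto. Raises LookupError when the
    lookup returned zero items or more than one.
    """
    matches = list(results)
    if len(matches) != 1:
        raise LookupError('expected exactly one %s, got %d' % (what, len(matches)))
    return matches[0]


# Stand-in for ec2conn.get_all_security_groups(groupnames=[name]).
fake_lookup = ['SSH and HTTP Security Group']
group_name = only_match(fake_lookup, 'security group')
print(group_name)
```

The same wrapper would work around any of the boto calls that take a list and hand a list back.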
After this is complete, the web console will show a new security group with the firewall holes we created. This will come in handy when we want to SSH into our instance to, ya know, actually do stuff.
Create the Instance
We’re now ready to actually create an instance.
ami_id = 'ami-12345678'
ami = ec2conn.get_all_images([ami_id])[0]
ssh_key_name = 'my-key-pair'  # placeholder: name of the key pair created above
security_groups = ['SSH and HTTP Security Group']  # name of the security group created above; must be a list
instance_size = 'm1.large'  # or 'm1.xlarge', etc.; see the Amazon docs for more info
reservation = ami.run(key_name=ssh_key_name, security_groups=security_groups, instance_type=instance_size)
print('New instance [%s]' % reservation.instances[0].public_dns_name)
The call is pretty simple at this point; we just need to pass in the data we’ve been collecting. Remember the security_groups argument must be a list. Also, keep in mind a reservation is returned from the create call, not the instance itself. The boto documentation can provide more information on the distinction.
SSH into the Instance
The above code should have output the public DNS name of the newly created instance. Once it’s finished starting (you can watch the progress in the web console or there are ways to do it in boto, I just haven’t included them here) you can SSH into it by passing the SSH key created earlier (substitute in the relevant information):
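For the boto route, a polling loop along these lines is one option. It is a sketch under a few assumptions: it relies on the instance object having an update() method and a state attribute, as boto’s instance objects do, and the function name, poll interval, and FakeInstance stand-in are all made up here so the loop can be tried without touching EC2.

```python
import time


def wait_until_running(instance, poll_seconds=5, max_polls=60, sleep=time.sleep):
    """Poll an instance until its state is 'running'.

    Assumes the object offers update() to refresh its data and a state
    attribute. Returns True once running, False if the polls run out.
    """
    for _ in range(max_polls):
        if instance.state == 'running':
            return True
        sleep(poll_seconds)
        instance.update()
    return instance.state == 'running'


class FakeInstance(object):
    """Stand-in that turns 'running' after a set number of updates."""

    def __init__(self, polls_until_running):
        self.state = 'pending'
        self._remaining = polls_until_running

    def update(self):
        self._remaining -= 1
        if self._remaining <= 0:
            self.state = 'running'
        return self.state


fake = FakeInstance(polls_until_running=3)
print(wait_until_running(fake, poll_seconds=0))
```

With a real reservation the call would be wait_until_running(reservation.instances[0]). The public DNS name is typically only filled in once the instance is running, so it is worth reading public_dns_name again after the wait.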
ssh -i $SSH_KEY root@$INSTANCE_DNS
Conclusion
As you’d expect, there is a lot more to boto than just creating instances, such as creating/attaching Elastic Block Store (EBS) volumes, creating/configuring Elastic Load Balancers (ELB), and adding Autoscaling Groups to load balancers. Many of the APIs look similar to the code used in creating an instance, so it’s just a matter of figuring out what you want to do.
Django “no such column” error
February 13th, 2010
I’ve been getting into Django recently. I’ll go into it more in another entry, but I ran into a small issue where my database seemed to get out of sync with my model. Running syncdb didn’t throw any errors, but when I tried to access the model from the server I’d get an error about “no such column”, even though I could see it created in the generated DDL.
It took me a bit of digging (in other words, it wasn’t in the tutorial), but there’s a manage command to reset the database for a particular app. Running that and re-syncing my database got me moving again.
python manage.py reset [appname]
python manage.py syncdb
From Java to Python
January 13th, 2010
I’m not completely sure why, but I’m a bit embarrassed to admit to Planet Fedora how little Python experience I have; the majority of my experience is in Java. I was able to read and bug fix the Python code in Spacewalk, but I hadn’t really dug deep into my own project. Now that I’m not teaching any longer and have some free time (one of my main reasons for quitting), I can finally sit down and dork around with the language. After spending some time working on some basic games and a simple IRC bot, I figured I’d step back and think about what the transition from Java to Python has felt like.
Don’t Fear The Whitespace
I constantly hear people mention the indentation in Python as the first thing when talking about moving to the language. Not only is it not as jarring an experience as people make it out to be, it’s downright awesome. I’ve always been compulsive about my code format anyway, so the biggest difference is the lack of curly braces.
Collections Are Awesome
It’s much lighter-weight to throw things into a list or map (dictionary in Python) than it is in Java. Get out of the mentality that you have to jump through import hoops and rigid notation to create, access, or return collections. In Python, they even let you do cool things like assign multiple variables as a return from a call:
exceptionType, exceptionValue, exceptionTraceback = sys.exc_info()
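The same unpacking works for your own functions: return several values separated by commas and assign them to several names in one go. A small sketch, where the divide function is made up for illustration:

```python
import sys


def divide(numerator, denominator):
    """Return the quotient and remainder together as a tuple."""
    return numerator // denominator, numerator % denominator


# One call, two variables.
quotient, remainder = divide(17, 5)
print(quotient, remainder)

try:
    divide(1, 0)
except ZeroDivisionError:
    # sys.exc_info() returns a three-item tuple, unpacked the same way.
    exc_type, exc_value, exc_traceback = sys.exc_info()
    print(exc_type.__name__)
```

The number of names on the left has to match the number of values returned, otherwise Python raises a ValueError.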
Looping Feels Weird At First
I got a little thrown off by this initially. Most loops read really well:
for square in openSquares:
However, when looping through a set of numbers, you need to use the range function:
for i in range(0, 10):
Looking at both of those examples brings me to my next point…
Don’t Forget The Colon
This keeps throwing me off, but after declaring a function*, loop, or if statement, don’t forget to end the line with a colon. I’m happy to be rid of curly braces, but I get over-ambitious and forget the colon too.
Don’t Over-engineer Configuration
Depending on what you’re doing, you can likely just stuff configuration values into a script and import it (not needing to compile really is liberating in this respect). That’ll also give you the use of lists and maps by default. If you’re not reading between the lines I’ll spell it out: no need for XML-based configuration, which is one of the more evil trends in Java.
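As a rough sketch of what that can look like, the settings below are ordinary Python assignments, lists and dictionaries included. Every name and value here is invented for illustration. In everyday use the file would simply sit next to the script and be pulled in with import settings; the temporary file and importlib calls exist only so the example runs on its own.

```python
import importlib.util
import os
import tempfile

# Contents of a hypothetical settings.py: plain Python, no XML, no parser.
SETTINGS_SOURCE = '''
SERVER = 'irc.example.com'
PORT = 6667
CHANNELS = ['#fedora', '#python']
LIMITS = {'max_retries': 3, 'timeout_seconds': 30}
'''


def load_settings(path):
    """Import the Python file at path as a module and return it."""
    spec = importlib.util.spec_from_file_location('settings', path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# Write the settings to a throwaway directory and import them from there.
with tempfile.TemporaryDirectory() as workdir:
    settings_path = os.path.join(workdir, 'settings.py')
    with open(settings_path, 'w') as handle:
        handle.write(SETTINGS_SOURCE)
    settings = load_settings(settings_path)

print(settings.CHANNELS[0], settings.LIMITS['timeout_seconds'])
```

The trade-off is that a configuration file like this is executable code, so it only makes sense for files you control.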
There are definitely more things I could mention; don’t take these to be the only lessons I’ve learned (any other hints/tips are appreciated). But I do want to avoid a mammoth blog post that causes readers to go into a zombie-like trance, so I’ll stop it here for now. I do want to thank Devan (dgoodwin) and Jesus (zeus) for dealing with the Java-veteran-turned-Python-noob and not finding a way to crash my chat client to avoid more questions.
* I haven’t seen a solid explanation of “Call them ‘functions’ because you’ll sound like a Java guy calling them ‘methods’”, but this feels like something where using the wrong term will make me stand out as a Java developer in a Python world. So I’ve been advised to take a militant approach of “Yes, I’m a Java guy learning Python, deal with the occasional terminology missteps.”
Triple-quoted Strings
April 20th, 2009
In Python, if you use three double quotes (I know, that just looks weird when written) you don’t have to escape newlines. For instance, I’m working with a query (it’s much longer, I cut out the middle):
_packageStatement_remove = """
select distinct
    pn.name name,
    pe.epoch epoch,
    pe.version version,
    pe.release release,
    pa.label arch
from rhnActionPackage ap,
    rhnPackage p,
[snip]
    and ap.package_arch_id = pa.id(+)
    and p.id = cp.package_id"""
In Java, that’d be a lot uglier. I have no desire to convert the entire query, but it’d look something like:
String query =
    "select distinct " +
    "    pn.name name, " +
    "    pe.epoch epoch, " +
    "    pe.version version, " +
    "    pe.release release, " +
    "    pa.label arch " +
[snip]
Also keep in mind that in most cases, you have to be careful to add the space after each line within the quotes. Otherwise, when Java munges this all into a single constant, you’ll get two words merged into one:
String foo = "golden" +
    "monkey";
The content of foo is simply "goldenmonkey" without any spaces. Needless to say, that can really screw with your query.
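To be fair, Python has the same trap if you split a string across several ordinary quoted literals; it is the triple-quoted form that avoids it. A short sketch of both behaviours, with a made-up table name:

```python
# A triple-quoted string keeps its line breaks, so words on separate
# lines stay separated by whitespace.
query = """
select distinct
    pn.name name
from example_table pn"""

# Adjacent ordinary literals are joined with nothing in between,
# just like the Java concatenation.
joined = ("golden"
          "monkey")

print(query.split())
print(joined)
```

Splitting the query on whitespace shows every word still standing on its own, while the second string comes out as a single merged word.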
Score one for Python.
Equality – Part 1
September 14th, 2008
What is the proper way to compare two strings to determine if they contain the same text?
One of the common stumbling blocks for new object-oriented programmers is comparing two objects for equality. The confusion typically stems from first learning to compare two primitives:
int x = 0;
int y = 0;
boolean xyEqual = (x == y);
For primitives, the above code works correctly, in this case returning true. However, the == operator has a significantly different meaning when used to compare two objects. Take, for instance, the following small variation on the code:
Person x = new Person("Tyler Durden");
Person y = new Person("Tyler Durden");
boolean xyEqual = (x == y);
For simplicity, assume the Person constructor assigns the parameter to an internal attribute used to track the person’s name.
Based on the primitive example, the typical assumption is that the code would return true. As you might have guessed given the tone of this post, this is incorrect; the above condition will return false.
The reason lies in how the == operator is defined. When applied to objects, it performs a significantly different check than it does for primitives.
When two objects are passed to the == operator, the result indicates if their object references are equal. In other words, this check will determine if they both point to the exact same object in memory. Since primitives are not objects, it’s understandable that there would be a different meaning when applied to objects.
Keep in mind that when calling new, a new object is created. In this light, it is clear why the above code evaluates to false. Compare this to the following:
Person x = new Person("Marla Singer");
Person y = x;
boolean xyEqual = (x == y);
This changes the condition to evaluate to true. The assignment in the second statement does not result in a new object creation, but rather indicates that y should point to the same object as x. Given the above definition of == when applied to objects, it should be clear why the result is true.
So how are two objects compared to see if they are semantically equal? The answer lies in the equals method defined in the base Object class. The signature of this method is as follows:
public boolean equals(Object other);
Domain objects often need to override this method in their implementations to provide a meaningful equality comparison. One note on the Object implementation: it defaults to == behavior, which is why fleshed out domain objects will override this method. Before we get to a possible implementation of this method for the Person class, let’s look at the properties this method must honor, as defined by the Java API documentation:
- It is reflexive: for any non-null reference value x, x.equals(x) should return true.
- It is symmetric: for any non-null reference values x and y, x.equals(y) should return true if and only if y.equals(x) returns true.
- It is transitive: for any non-null reference values x, y, and z, if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) should return true.
- It is consistent: for any non-null reference values x and y, multiple invocations of x.equals(y) consistently return true or consistently return false, provided no information used in equals comparisons on the objects is modified.
- For any non-null reference value x, x.equals(null) should return false.
With that in mind, let’s assume a person’s name is enough to identify them for equality. In reality, we’d probably use something guaranteed to be unique, such as a social security number or student/employee ID. But to stick with the above code snippet which only takes a name, we’ll use that.
public boolean equals(Object other) {
    Person otherPerson = (Person)other;
    return this.getName().equals(otherPerson.getName());
}
Since this method uses the same signature as defined in Object, this implementation will be invoked in all cases where the object is a Person. Instead of simply checking object references, this code will do an equality comparison on the person’s name, giving us the desired logic.
Note that this further uses the String class implementation of equals, which is the correct mechanism to use when comparing strings.
Keep in mind this is the Java-specific mechanism. Other object-oriented languages have a similar construct; however, the syntax will vary. For instance, in Python the == operator tests the values of two variables for equality. To do object reference checking in Python, the is operator is used.
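For comparison, here is roughly how the Person example could look in Python. This is a sketch rather than a translation of any real code: the class mirrors the one assumed in this post, equality is defined by overriding the __eq__ method, and __hash__ is defined alongside it so that equal objects hash alike.

```python
class Person(object):
    """Python counterpart of the Person class assumed in this post."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # Returning NotImplemented for other types lets Python fall back
        # to its default comparison instead of raising an error.
        if not isinstance(other, Person):
            return NotImplemented
        return self.name == other.name

    def __hash__(self):
        # Keep hashing consistent with __eq__.
        return hash(self.name)


x = Person("Tyler Durden")
y = Person("Tyler Durden")
z = x

print(x == y)               # value comparison through __eq__
print(x is y)               # identity: two separate objects
print(x is z)               # identity: the same object
print(x == "Tyler Durden")  # a different type is never equal
```

Here == plays the role of Java’s equals method, while is plays the role of Java’s == on object references.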
There is much more to say on the topic, which will be covered in future posts. Future installments include:
- This post makes no mention of the hashCode method in the Object class. In many cases, both of these methods must be overridden at the same time.
- The Person class implementation provided above makes a few assumptions. For now, I’ll leave it as an exercise to the reader to determine in what cases the above method will fail.

