EC2 = Easy Cloud 2.0? Getting started with the Amazon Cloud

If you are a command-line-centric guy like me and you are on Ubuntu, this post is for you. Getting started with Amazon was a pain for me, although once you understand the basics it is relatively easy. BTW: there are of course other cloud systems too, like Rackspace or Azure.

If you want the official Ubuntu LTS Server (currently 10.04) running in the Amazon Cloud you can do:

ec2-run-instances ami-c00e3cb4 --region eu-west-1 --instance-type m1.small --key amazon-key

or go to this page and pick a different AMI. Hmm, are you already sick of all the jargon like AMI, EC2 and instances? OK, let's dig into the Amazon world.

Let me know if something is missing or incorrect:

  • AMI: Amazon Machine Image. In our case this is a highly tuned Linux distribution, and there are a lot of different types to choose from – e.g. on this page.
  • EC2: Elastic Compute Cloud – a highly scalable hosting solution where you have root access to the server. You can choose the CPU power and RAM of an instance (‘a server’) and start and stop instances as you like. In Germany Amazon is relatively expensive compared to existing hosting solutions (that is not the case in the US). And since those hosting services can also scale easily, there is nearly no advantage in using Amazon or Rackspace there.
  • EBS: Elastic Block Store – this is where we store our data. An EBS volume can be attached to any instance, but in my case I don’t need a separate volume: I can just use the default storage mounted at /mnt with ~150 GB or even the system partition / with ~8 GB. Note that, as far as I can tell, /mnt is ephemeral instance storage rather than EBS, so its contents are lost when the instance is stopped or terminated. From Wikipedia:
    EBS volumes provide persistent storage independent of the lifetime of the EC2 instance, and act much like hard drives on a real server.
    Also, if you choose a root device of type ‘ebs’ your instance can be stopped. If it is of type instance-store you can only clone the AMI and terminate the instance. If you try to stop it you’ll get “The instance does not have an ‘ebs’ root device type and cannot be stopped.”
  • A running instance is always attached to one key (a named public key). Once started you cannot change it.
  • S3: Simple Storage Service. Can be used e.g. for backup purposes and has its own API (REST or SOAP). Not covered in this mini post.
  • Region and availability zone: the datacenter location, e.g. the region eu-west-1 is Ireland and us-west-2 is Oregon; each region contains several availability zones such as eu-west-1a. The advantage of having different regions/zones is that if one datacenter crashes you have a fallback in a different one. The big disadvantage of different regions is that e.g. transferring your customized AMIs to a different region is a bit complex, and you’ll need to import your keys again etc.
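
To make the region/zone distinction concrete: zone names are a region name plus a letter suffix (for example eu-west-1a), so the region can be derived from a zone name. This is only a minimal sketch; the helper name zone_to_region is made up for illustration, and the two listing commands in the comments assume that your version of the ec2-api-tools provides them.

```shell
#!/bin/bash
# Hypothetical helper (not part of ec2-api-tools): derive the region from an
# availability zone name by stripping the trailing zone letter.
zone_to_region() {
  printf '%s\n' "$1" | sed 's/[a-z]$//'
}

# Usage:
#   zone_to_region eu-west-1a    # prints eu-west-1
#   ec2-describe-regions
#   ec2-describe-availability-zones --region "$(zone_to_region eu-west-1a)"
```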

But even now, after ‘understanding’ the wording, it is not that easy to get started; e.g. the above command will not work out of the box.

To make the above command work you’ll need:

  1. An Amazon account and a lot of money ;) or use the micro instance, which IMO is free for one year for a fresh account
  2. The ec2 tools installed locally: sudo apt-get install ec2-api-tools
  3. The Amazon credentials (private key and certificate) stored locally and exported as environment variables:
    export EC2_PRIVATE_KEY=/home/user/.ssh/certificate-privatekey.pem
    export EC2_CERT=/home/user/.ssh/certificate.pem
  4. Test the functionality via
    ec2-describe-instances --region eu-west-1
  5. Now you need to create a key pair and import the public key into your account (choose the right region, since key pairs are per region!)
    AWS Console -> EC2 -> Network & Security -> Key Pairs -> Import Key Pair and choose amazon-key as name
  6. Then feed your local ssh-agent with the private key:
    ssh-add /home/user/.ssh/amazon-key
  7. Now you should be able to run the above command. To view the instance from the web UI you’ll have to refresh the site.
  8. Open port 22 for the default security group:
    AWS Console -> EC2 -> Network & Security -> Security Groups -> click on the default one and then on the ‘Inbound’ tab -> type ‘22’ in port range -> Add Rule -> delete the other configurations -> Apply Rule Changes
  9. Now try to log in:
    ssh ubuntu@ec2-your-machine.amazonaws.com
    For the official Amazon AMIs you’ll have to use ec2-user as the login
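
Before running the test in step 4 it can help to check that the variables from step 3 are actually set and point to readable files. The following is a small sketch of such a check; both function names are hypothetical and the functions only inspect the local environment, they do not talk to Amazon.

```shell
#!/bin/bash
# Hypothetical helper: check that one credential variable is set and that
# the file it names is readable. Arguments: variable name, variable value.
check_credential_file() {
  local name="$1" path="$2"
  if [ -z "$path" ]; then
    echo "$name is not set" >&2
    return 1
  elif [ ! -r "$path" ]; then
    echo "$name points to an unreadable file: $path" >&2
    return 1
  fi
}

# Check both variables from step 3 and report every problem found.
check_ec2_credentials() {
  local status=0
  check_credential_file EC2_PRIVATE_KEY "${EC2_PRIVATE_KEY:-}" || status=1
  check_credential_file EC2_CERT "${EC2_CERT:-}" || status=1
  return "$status"
}

# Usage:
#   check_ec2_credentials && ec2-describe-instances --region eu-west-1
```

If the check passes but step 4 still fails, the problem is more likely the certificate itself or the region than the local setup.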

That was easy :) No?

Ok, now you’ll have to configure and install software as you like e.g.
sudo apt-get update && sudo apt-get upgrade -y

To proceed further you could

  • Attach a static IP to the instance so that external applications do not need to be changed after you move the instance – or use that IP for your load balancer – or use the Amazon load balancer etc.:
    AWS Console -> EC2 -> Network & Security -> Elastic IPs -> Allocate New Address
  • Open some more ports like port 80
  • Or you could create an AMI of your already configured system. You can even publish this custom AMI.
  • Run ElasticSearch as a search server in the cloud, e.g. via a Debian package, which makes it very easy.
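
The first three follow-ups can probably also be done from the command line instead of the web console. The sketch below only prints the commands instead of running them, because the command names and flags (ec2-allocate-address, ec2-associate-address -i, ec2-authorize -P/-p/-s, ec2-create-image --name) are written from memory of the ec2-api-tools and should be checked against --help on your installed version. The function name, instance ID, address and image name are placeholders.

```shell
#!/bin/bash
# Print (do not execute) the follow-up commands for one instance.
# Arguments: instance ID, Elastic IP, region. All values are placeholders.
print_followup_commands() {
  local instance="$1" ip="$2" region="$3"
  # Allocate a static (Elastic) IP, then attach it to the instance
  echo "ec2-allocate-address --region $region"
  echo "ec2-associate-address -i $instance $ip --region $region"
  # Open port 80 in the default security group
  echo "ec2-authorize default -P tcp -p 80 -s 0.0.0.0/0 --region $region"
  # Create an AMI from the configured instance
  echo "ec2-create-image $instance --name my-configured-server --region $region"
}

# Usage:
#   print_followup_commands i-12345678 203.0.113.10 eu-west-1
```

In practice the address passed to ec2-associate-address would be the one printed by ec2-allocate-address.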

Now, if you have several instances and you want to update software on all machines, how would you do that? Here is one possibility:

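# Collect the IP addresses (column 17 of the tab-separated output) of all running instances
ips=$(ec2-describe-instances --region eu-west-1 | grep running | cut -f17)

# Run the update script on each machine, forwarding the local ssh-agent (-A)
for IP in $ips
do
  echo "UPDATING $IP"
  ssh -A "ubuntu@$IP" "cd /somewhere && bash ./scripts/update.sh"
done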
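
The grep in the loop above matches any line that contains the word running. A slightly stricter variant, sketched here under the assumption that instance lines start with INSTANCE and that the output is tab-separated, filters with awk instead. The function name running_ips is made up, and the column number defaults to 17 only because that is the value used above.

```shell
#!/bin/bash
# Read ec2-describe-instances output on stdin and print one address per
# running instance. Optional argument: the column holding the address
# (defaults to 17; pass another number if your tool version differs).
running_ips() {
  awk -F '\t' -v col="${1:-17}" \
    '$1 == "INSTANCE" && /\trunning\t/ && $col != "" { print $col }'
}

# Usage:
#   for IP in $(ec2-describe-instances --region eu-west-1 | running_ips); do
#     echo "UPDATING $IP"
#   done
```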

10 thoughts on “EC2 = Easy Cloud 2.0? Getting started with the Amazon Cloud”

  1. No public AMI created, sorry. And Jetslide itself is normally hosted. But it is relatively easy to set up. E.g. use the latest master and install via the debian package.

  2. Thanks for the Jetslide app, I was able to successfully build it – reviewed some of the code today. I was interested in comparing Wicket to other options for interfacing with ES – django or node.js.

    What do you think of Wicket? Looking at your app it seemed Wicket was a complex way to write a webapp when compared to django and node. But I do not have much experience with any of the options.

    It would be nice if ES had authentication and you could just hit it directly from jquery w/o a middle layer. Thanks again for allowing others to see what an ES/Wicket app looks like

  3. > What do you think of Wicket?

    I think it does the UI thing very well (I like the component oriented approach) compared to node.js which has only templating stuff and is not nice IMO, but yes it has some quirks …

    For Java there is also another option: play framework. which might be better suited and also makes db access easier etc but not sure if there is such a component oriented UI layer

    > ES had authentication

    You could configure some proxy or just hack into the code to avoid access (this is not really difficult). But also be sure that you turn on https ;)

  4. >it does the UI thing very well

    I just need to create a set of public and authenticated REST/json endpoints to ES , for example, each user can only edit their own profile but everyone can see parts of it – my ui will be mobile so the Wicket UI objects don’t seem necessary (?), I might even do a native Android app instead of jqmobile+phonegap
    Do you see Wicket as a good solution to just be a REST server?

    >configure some proxy or just hack into the code to avoid access

    I looked at the node.es proxy, some use this for access to bigdesk – but I’d have to add a user layer and something to write the REST endpoints for the use case I described above

    It seems easy to create a REST server using node that proxies data and provided an authentication layer to ES – but I’m still picking through the pieces

    django has most of the pieces too. I’ll look at the Play framework, thanks for the pointer

    I know there’s no absolute right answer, but: what do you recommend?
    Thanks

  5. > Do you see Wicket as a good solution to just be a REST server?

    It is easily possible but there are more lightweight/faster solutions IMO (e.g. directly using the http server from elasticsearch or a separate netty – see below)

    > I might even do a native Android app instead of jqmobile+phonegap

    I wouldn’t recommend that :) although I now have robolectric (see my latest post). Or just use a WebView in the native app.

    > my ui will be mobile so the Wicket UI objects don’t seem necessary

    They aren’t necessary at all ;). But if you want to have a solution which is easily growable/maintainable then Wicket is not bad at all.

    > I know there’s no absolute right answer,but: what do you recommend?

    I’m still searching :)

    What I did in the past for our company: using a separate netty http server to proxy ES. Completely testable and fast + Java + Nio + Guice. But also probably a very low level solution – not sure. Although I implemented basic authentication within 5 minutes. You could easily put that stuff in a plugin and then you have only one thing to deploy. But probably the separation is not bad at all…

    The following is what I did for jetslide and the now outdated android app: Wicket as a proxy. This is nice if you need additional UI stuff and don’t want a separate layer or server. BUT: not nio + probably not that fast. And Wicket’s separate html+java tree, its stateful model and bad url customization sometimes annoy me :) But I haven’t found a better UI framework yet (to be honest: I haven’t tried play framework yet)

    Node.js is great as I can easily test it and move javascript between client + server but it is probably not that fast compared to java. This is the cleanest solution but also probably very low level. if you need additional UI components you should do this on the client side (ext3, …). On the server side you have only ugly templating systems (I don’t like this: partials, hacking, … – this is so much more complicated and error-prone compared to wicket.)

    play framework 1. probably perfect: java + nio + easy testable + not low level + good support etc. BUT: the latest play framework 2 won’t be in java. it is scala …

  6. Sorry to hijack this post with stuff unrelated to aws.

    >it is probably not that fast compared to java

    I have no experience with Play either, but found these posts you might be interested in:

    http://www.subbu.org/blog/2011/03/nodejs-vs-play-for-front-end-apps

    http://www.subbu.org/blog/2011/04/reflecting-on-nodejs-vs-play

    I think he attributed the speed difference to play’s use of Groovy templates. The benchmarks aren’t related to Java performance per se, just play.

    From the same blog I saw he released a new project (sponsored by ebay): ql.io
    It seems like it might provide a nice proxy for ES, but I haven’t tried it yet. Some ES users are looking at it though, I found one post on their mailing list: http://groups.google.com/group/qlio/browse_thread/thread/64a44b2d4d85114c/6b39d3a80a2bfc73

    Since I only want to build a quick prototype and don’t have much experience with JS or Java – I think I may give node.js + ql.io a try. If I needed to build a website I probably wouldn’t choose node (although people are moving it in this direction). But it seems useful when building rest/json endpoints for mobile clients.

    If you’re aware of any Java code (github, etc) that would allow me to easily accomplish the same thing directly on the ES server – authentication, rest endpoints – let me know. I saw this tutorial: http://www.elasticsearch.org/tutorials/2011/09/14/creating-pluggable-rest-endpoints.html
    but it seems to be much more work than ql.io as a proxy

    Thanks for the discussion : )

  7. > Sorry to hijack this post with stuff unrelated to aws.

    no prob

    > I found one post on their mailing list

    nice finding thanks!

    > authentication

    there is a plugin:

    https://github.com/Asquera/elasticsearch-http-basic

    but still you need a proxy for https. nginx?

    https://github.com/karmi/cookbook-elasticsearch/blob/master/templates/default/elasticsearch_proxy_nginx.conf.erb

    > but it seems to be much more work than ql.io as a proxy

    yes, also I think you need to restart ES to develop with this (slower than just restarting the http server). And I like the separation of ‘concerns’ more …

  8. >there is a plugin: https://github.com/Asquera/elasticsearch-http-basic

    I looked at this, but it doesn’t seem this would be appropriate as a way to manage 100’s of users and the slices of data they have edit privileges on.

    Although the user data should be stored in ES – it seems the access is better handled by node, django, play, etc.. Let me know if you think otherwise.

    >but still you need a proxy for https. nginx?

    That’s the plan – it seems to be the most popular option for node. I didn’t notice the conf file in cookbook previously, thanks for the pointer.

    >And I like the separation of ‘concerns’ more …

    So are you now considering trying node too? : )

    ql.io seems overkill for what I’m currently doing, but I could see it being handy for other systems that I might need to grab data from in the future – so I plan to try it.

    To just create a node REST proxy to ES the snippet provided here should be sufficient (using node-elastical):

    http://www.tikalk.com/incubator/fuse-day-911-application-nodejs-elastic-search
