Eucalyptus@Karmic910
Permalink Comments off
Permalink Comments off
Recently one of my friends I met from LinkedIn sent me several questions, about Cloud Computing, which looks like a Request for Proposal (RFP). I post my very personal response here for a mindshare.
About Cloud, Cluster/ Load Balancing, Data Center and Open Source, I could talk day and night without stop. I’m giving my personal but confident thoughts upon my friend’s email in brief.
- Open Source / Operating System / Framework
I’m Unix/ Linux guy. Other than Unix, Linux is the only operating system in Cloud management and Cloud resource pool, in terms of efficiency, green security, cost, and manageability. No Windows, No Mac… Certainly Windows could be one of computing services in resource pool as a kind of computing resource, that Infrastructure_as_a_Service (IaaS) provides and delivers.
Microsoft .Net framework doesn’t make much sense to Cloud @ IaaS, as Cloud is Open. That’s a defacto rule in Cloud. Windows isn’t “royal free” either, nor brings satisfied ROI in fact.
Most/ major components in Cloud are supposed to be operating system independent, like Java, Xen, Apache, MySQL, Linux/ Unix, JDBC, Mozilla/ Firefox, Eclipse, Tomcat… etc. As long as a software depends on a specific OS as a pre- requisite, it doesn’t fit in Cloud.
- Cloud Architecture
Several open source projects to recommend
http://www.opennebula.org > this is being used @ http://nebula.nasa.gov/
http://open.eucalyptus.com > this is a very AWS EC2- like open source cloud which I personally installed several time for Proof_of_Concept (PoC).
April 23rd, 2009, Ubuntu released 9.04 Server edition, which includes OpenNebula and Eucalyptus open source projects and supports Cloud environment >
http://doc.ubuntu.com/ubuntu/serverguide/C/opennebula.html
http://doc.ubuntu.com/ubuntu/serverguide/C/eucalyptus.html
5 cost- efficient flexible open source resources for Cloud
http://ostatic.com/blog/5-cost-efficient-flexible-open-source-resources-for-cloud…
Cloud hosting & storage toolbox
http://www.webresourcesdepot.com/cloud-hosting-storage-toolbox-options-tools/
- Virtualization and Virtual Machine Manager (VMM)
Virtualization is a key player in Cloud, but not all in Cloud. Virtualization technology is a carrier of computing resource dynamically managed and delivery in Cloud. In my opinion, its importance has been exaggerated in market. The keys of Virtualization are (1) standard (2) simple (3) easy to manage. You can’t deploy and manage multiple virtualization technologies within one Cloud environment.
Hyper-V doesn’t comply with the rule I mentioned earlier – Cloud is open. VMWare solution is expensive, isn’t it? Xen / XenSource is open source / free and recommended. Xen has been widely adopted in Cloud service provider in Internet. Xen is also compatible with monitoring system to be discussed below.
I’ve a blog describing the comparison between XenSource and VMWare > http://tr.im/mFFa
- Monitor
The ability of monitor determines how precisely and automatically Cloud detects from Cloud, and how fine- grained (granular) computing resource could be deployed to customer.
I recommend Ganglia and Nagios that are popular in Cloud, data center, server farm/ cluster, and widely adopted in many enterprises.
developerWorks > http://www.ibm.com/developerworks/opensource/library/l-ganglia-nagios-1/index.html
Nagios homepage > http://www.nagios.org/
- Request Driven Deployment / Provisioning
Deployment driven by request means the end user decides how much computing resource s/he needs from Cloud. Such as you specify number of CPU, # of gigabyte of memory and size of disk (priced @ http://aws.amazon.com/ec2 ) from Amazon Web Services (AWS) Elastic Compute Cloud (EC2). Very simple. We may learn a lot from AWS architecture > http://highscalability.com/amazon-architecture
Provisioning means that a specific software, like JVM or middleware software, or database, to be installed and configured in an deployed / existing virtual system on demand.
The scenario: eg. in AWS, you’re given a Linux operating system with 2 CPUs and 4G memory. But you can do nothing in a blank OS. You might need an Oracle database 11 or a middleware like JBoss or perhaps WebSphere. Then provisioning function will install and configure on the deployed virtual machine unattended.
For now, in that case @ AWS, you have to install by your own or you can select one available image from AWS image pool, with pre-loaded software you need, in which if someone else made one and published previously. Amazon doesn’t have provisioning service so far. That results that AWS has to manage thousand images in its image repository so that implies it is a challenge.
- SLA / Service Policy / Automation
An example of Service Policy: when an instance is running 80% disk utilization, a pre- defined service policy triggers a configured action – to add another 100G disk into such instance.
This is done by Service Policy, not by system operator and administrator. We can see how this differentiate from mainframe’s virtualization since 60′s.
In another word, Service Policy = Automation!
This kind of policy could be very specific in many and many scenarios. It does require customization effort to fit requirement. I do have some experience/ skill of elastic JVM/ JavaEE computing resource as being engaged with a banking customer now, @ Platform as a Service (PaaS) project.
- Load balancing (not F5)
Load balancing is a must in large enterprise and busy traffic internet. F5 is popular. But F5 throws URL (a request from web) as a token to go into web server cluster. In fact, URLs are not equal. F5 doesn’t know how much resource behind one URL might need from JVM in middleware. So F5 can round- robin requests / URLs, but not granularly down to CPU utilization level.
Cloud Computing @ each layers – IaaS, PaaS & SaaS – needs monitoring to know how much resource remains, how much needed, how much to deliver, how much to remove after use.
- Security/ Authentication / LDAP
Security is a big concern in Cloud, as well as alert and audit, in term of authentication, authority, data integrity, encryption/ decryption, etc.
LDAP (OpenLDAP) could be used @ Cloud portal where end user applies computing resource in resource pool, while Cloud operator manages in management portal. As long as a computing resource is deployed, end user is received an authority to access the delivered resource and doesn’t need LDAP authentication.
- Languages
Java is preferred as global languange
C is used in Eucalyptus Cloud Management
Ruby on Rail used to display @ Portal & UI
XML in data transaction, Python, etc.
- Approach
Have to have Proof_of_Concept – to build a small system to prove all works prior to production
Permalink Comments off
Source: http://berkeleyclouds.blogspot.com/…
One of the obstacles we identified in our paper was API-based vendor lock-in: If you’ve written your application to run on a particular cloud, and especially if you have automated management and provisioning tools, moving to a different provider, with different APIs, can be a significant expense. Mark Shuttleworth recently announced on the Ubuntu Developers mailing list that the next release of Ubuntu, Karmic Koala, “aims to keep free software at the forefront of cloud computing by embracing the API’s of Amazon EC2, and making it easy for anybody to set up their own cloud using entirely open tools.” Specifically they will be including support for Eucalyptus, the University of California, Santa Barbara project that allows operators to provide an EC2 compatible API to their own cluster. While there are a number of other efforts to create a standard this is definitely a step towards the adoption of the EC2 API as a de facto standard. To the best of our knowledge, this makes EC2′s API the first one with a second implementation — an open source implementation, to boot.
Permalink Comments off
Source: http://highscalability.com/eucalyptus-ready-be-your-private-cloud
Rich Wolski, professor of Computer Science at the University of California, Santa Barbara, gave a spirited talk on Eucalyptus to a large group of very interested cloudsters at the Eucalyptus Cloud Meetup. If Rich could teach computer science at every school the state of the computer science industry would be stratospheric. Rich is dynamic, smart, passionate, and visionary. It’s that vision that prompted him to create Eucalyptus in the first place. Rich and his group are experts in grid and distributed computing, having a long and glorious history in that space. When he saw cloud computing on the rise he decided the best way to explore it was to implement what everyone accepted as a real cloud, Amazon’s API. In a remarkably short time they implement Eucalyptus and have been improving it and tracking Amazon’s changes ever since.
The question I had going into the meetup was: should Eucalyptus be used to make an organization’s private cloud? The short answer is no.
The project is of high quality, the people are of the highest quality, but in the end Eucalyptus is a research project from a university. As an academic project Eucalyptus is subject to changes in funding and the research interests of the team. When funding sources dry up so does the project. If the team finds another research area more interesting, or if they get tired of chasing a continuous stream of new Amazon features, or no new grad students sign on, which will happen in a few years, then the project goes dark.
Fears over continuity have at least two solutions: community support and commercial support. Eucalyptus could become community supported open source project. This is unlikely to happen though as it conflicts with the research intent of Eucalyptus. The Eucalyptus team plans to control the core for research purposes and encourage external development of add-on service like SQS. Eucalyptus won’t go commercial as University projects must stay clear from commercial pretensions. Amazon is “no comment” on Eucalyptus so it’s not clear what they would think of commercial development should it occur.
Taken together these concerns imply Eucalyptus is not a good base for an enterprise quality private cloud. Which they readily admit. It’s not enterprise ready Rich repeats. It’s not that the quality isn’t there. It is and will be. And some will certainly base their private cloud on Eucalyptus, but when making a decision of this type you have to be sure your cloud infrastructure will be around for the long haul. With Eucalyptus that is not necessarily the case. Eucalyptus is still a good choice for it’s original research purpose, or as cheap staging platform for Amazon, or as base for temporary clouds, but as your rock solid private cloud infrastructure of the future Eucalyptus isn’t the answer.
The long answer is a little more nuanced and interesting.
The primary purpose for Eucalyptus is research. It was never meant to be our little untethered private Amazon cloud. But if it works, why not?
Eucalyptus implements most of EC2 and a little of S3. They hope to get community support for the rest. That of course makes Eucalyptus far less interesting as a development platform. But if your use for Eucalyptus is as an instant provisioning framework you are still in the game. Their emulation of EC2 is so good RightScale was able to operate on top of Eucalyptus. Impressive.
But even in the EC2 arena I have to wonder for how long they’ll track Amazon development. If you are a researcher implementing every new Amazon feature is going to get mighty old after a while. It will be time to move on and if you are dependent on Eucalyptus you are in trouble. Sure, you can move to Amazon but what about that $1 million data center buildout?
Developing software not tied to the Amazon service stack then Eucalyptus would work great.
As an Amazon developer I would want my code to work without too much trouble in both environments. Certainly you can mock the different services for testing or create a service layer to hide different implementations, but that’s not ideal and makes Eucalyptus as an Amazon proxy less attractive.
One of the uses for Eucalyptus is to make Amazon cheaper and easier by testing code locally without out having to deploy into Amazon all the time. Given the size of images the bandwidth and storage costs add up after a while, so this could make Eucalyptus a valuable part of the development process.
No kidding. Amazon has an army of sysadmins, network engineers, and programmers to make their system work at such ginormous scales. Eucalyptus was built on smarts, grit and pizza. It will never scale as well as Amazon, but Eucalyptus is scalable to 256 nodes right now. Which is not bad.
Rich thinks with some work they already know about it could scale to 5000 nodes. Not exactly Amazon scale, but good enough for many data center dreams.
One big limit Eucalyptus has is the self-imposed requirement to work well in any environment. It’s just a tarball you can install on top of any network. They rightly felt this was necessary for adoption. Saying to potential customers that you need to setup a special network before you can test this software tends to slow down adoption. By making Eucalyptus work as an overlay they soothed a lot of early adopter pain.
But by giving up control of the machines, the OS, the disk, and the network they limited how scalable they can be. There’s more to scalability than just software. Amazon has total control and that gives them power. Eucalyptus plans to make more invasive and more scalable options available in the future.
Organizations interested in a private cloud are often interested in:
Endless pixels have been killed defining clouds, grids, and how they are different enough that there’s really a whole new market to sell into. Rich actually makes a convincing argument that grids and clouds are different and do require a completely different infrastructure. The differences:
Permalink Comments off
Building a private open source’d Cloud Computing environment isn’t that difficult as you think. Get the code from http://eucalyptus.cs.ucsb.edu and it took me about 3~ 4 days @ my spare time to complete the installation and configuration. This environment shares code with Amazon Elastic Compute Cloud (EC2), which means your Amazon Machine Image (AMI) can run on top of this.
Look @ some screenshots



Permalink Comments off
Cloud-computing platforms such as Amazon’s Elastic Compute Cloud (EC2), Microsoft’s Azure Services Platform, and Google App Engine have given many businesses flexible access to computing resources, ushering in an era in which, among other things, startups can operate with much lower infrastructure costs. Instead of having to buy or rent hardware, users can pay for only the processing power that they actually use and are free to use more or less as their needs change.
However, relying on cloud computing comes with drawbacks, including privacy, security, and reliability concerns. So there is now growing interest in open-source cloud-computing tools, for which the source code is freely available. These tools could let companies build and customize their own computing clouds to work alongside more powerful commercial solutions.
One open-source software-infrastructure project, called Eucalyptus, imitates the experience of using EC2 but lets users run programs on their own resources and provides a detailed view of what would otherwise be the black box of cloud-computing services.
Another open-source cloud-computing project is the University of Chicago‘s Globus Nimbus, which is widely recognized as having pioneered the field. And a European cloud-computing initiative coordinated by IBM, called RESERVOIR, features several open-source components, including OpenNebula, a tool for managing the virtual machines within a cloud. Even some companies, such as Enomaly and 10gen, are developing open-source cloud-computing tools.
Rich Wolski, a professor in the computer-science department at the University of California, Santa Barbara, who directs the Eucalyptus project, says that his focus is on developing a platform that is easy to use, maintain, and modify. “We actually started from first principles to build something that looks like a cloud,” he says. “As a result, we believe that our thing is more malleable. We can modify it, we can see inside it, we can install it and maintain it in a cloud environment in a more natural way.”
Reuven Cohen, founder and chief technologist of Enomaly, explains that an open-source cloud provides useful flexibility for academics and large companies. For example, he says, a company might want to run most of its computing in a commercial cloud such as that provided by Amazon but use the same software to process sensitive data on its own machines, for added security. Alternatively, a user might want to run software on his or her own resources most of the time, but have the option to expand to a commercial service in times of high demand. In both cases, an open-source cloud-computing interface can offer that flexibility, serving as a complement to the commercial service rather than a replacement.
Indeed, Wolski says that Eucalyptus isn’t meant to be an EC2 killer (for one thing, it’s not designed to scale to the same size). However, he believes that the project can make a productive contribution by offering a simple way to customize programs for use in the cloud. Wolski says that it’s easier to assess a program’s performance when it’s possible to see how it operates both at the interface and from within a cloud.
Wolski says that Eucalyptus will also imitate Amazon’s popular Simple Storage Surface, which allows users to access storage space on demand, as well as its Elastic IP addresses, which keeps the address of Web resources the same, even if the physical location changes.
Ignacio Llorente, a professor in the distributed systems architecture group at the Universidad Complutense de Madrid, in Spain, who works on OpenNebula, says that Eucalyptus’s main advantage is that it uses the popular EC2 interface. However, he adds that “the open-source interface is only one part of the solution. Their back-end [the system's internal management of physical resources and virtual machines] is too basic. A complete cloud solution requires other components.” Llorente says that Eucalyptus is just one example of a growing ecosystem of open-source cloud-computing components.
Wolski expects many of Eucalyptus’s users to be academics interested in studying cloud-computing infrastructure. Although he doubts that such a platform would be used as a distributed system for ordinary computer users, he doesn’t discount the possibility. “You can argue it both ways,” he notes. But Wolski says that he thinks some open-source cloud-computing tool will become important in the future. “If it’s not Eucalyptus, I suspect [it will be] something else,” he says. “There will be an open-source thing that everyone gets excited about and runs in their environment.”
Copyright Technology Review 2008.
Permalink Comments off