Cloud Computing with Linux

Source: http://www.ibm.com/developerworks/linux/./l-cloud-computing

Author: M. Tim Jones, Consultant Engineer, Emulex Corp.

Click the source URL to view the original article. Below I summarize some key points, with diagrams.

Figure 1. Cloud computing migrates resources over the Internet

Figure 2. Virtualization and resource use

Figure 3. The layers of Cloud Computing

Figure 4. Cloud Computing landscape

Linux and open source in the Cloud

Software-as-a-Service

SaaS is the ability to access software over the Internet as a service. An early approach to SaaS was the Application Service Provider (ASP). ASPs provide subscriptions to software that is hosted or delivered over the Internet. The ASP delivers the software and charges fees based on its use. In this way, you don’t purchase the software but simply lease it on an as-needed basis.

Example SaaS
An interesting example of traditional versus SaaS applications is the application life cycle management tool from SoftwarePlanner.com. This company offers its tool using the traditional model, where customers host the application suite within their enterprise, or as SaaS, where the company hosts the application suite and makes it available over the Internet.

Another perspective on SaaS is the use of software over the Internet that executes remotely. This software can be in the form of services used by a local application (defined as Web services) or a remote application observed through a Web browser. One example of a remote application service is Google Apps, which provides several enterprise applications through a standard Web browser. Remotely executing applications commonly rely on an application server to expose needed services. An application server is a software framework that exposes APIs for software services (such as transaction management or database access). Examples include Red Hat JBoss Application Server, Apache Geronimo, and IBM® WebSphere® Application Server. Many other application servers exist, and an extensive list is included in Resources.

Another recent example of SaaS is Google’s Chrome browser. The browser is an ideal environment as a new desktop through which applications can be delivered (either locally or remotely) in addition to the traditional Web browsing experience. (For more information, see Resources.)

Platform-as-a-Service

PaaS can be described as an entire virtualized platform that includes one or more servers (virtualized over the set of physical servers), operating systems, and specific applications (such as Apache and MySQL for Web-based applications). In some cases, these platforms can be predefined and selected; in others, you can provide a VM image that contains all the necessary user-specific applications.

One interesting example of a PaaS is Google App Engine. App Engine is a service that allows you to deploy your Web applications on Google’s very scalable architecture. App Engine provides you with a sandbox for your Python application that can be referenced over the Internet (and additional languages will be supported in the future). App Engine provides Python APIs for persistently storing and managing data (using the Google Query Language, or GQL) in addition to support for authenticating users, manipulating images, and sending e-mail. The sandbox in which the Web application runs restricts access to the underlying operating system. Although App Engine limits the functionality available to your application, it supports the construction of useful Web services. Check out Resources for more information.

Note: Deploying applications in App Engine is free within certain bandwidth and storage constraints. To build production Web sites with App Engine, usage fees are assessed.

Another example of a PaaS is 10gen, which is both a cloud platform and a downloadable open source package for creating your own private cloud. A software stack similar to App Engine, 10gen provides similar functionality to App Engine—with certain differences. With 10gen, you can develop applications in Python as well as the JavaScript and Ruby programming languages. The platform also uses the sandbox concept to isolate applications and provide a reliable environment over a large number of computers (built, of course, on Linux) using their own application server.

Infrastructure-as-a-Service

IaaS is the delivery of computer infrastructure as a service. This layer differs from PaaS in that the virtual hardware is provided without a software stack. Instead, the consumer provides a VM image that is invoked on one or more virtualized servers. IaaS is the rawest form of computing as a service (outside of access to the physical infrastructure). The most well-known commercial IaaS provider is Amazon Elastic Compute Cloud (EC2). In EC2, you can specify a particular VM (operating system and application set), and then deploy your applications on it or provide your own VM image to execute on the servers. You’re then billed simply for compute time, storage, and network bandwidth.
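The pay-per-use billing described above can be made concrete with a small calculation. This is only a sketch: the unit prices and the `estimate_monthly_cost` helper are hypothetical placeholders chosen for illustration, not Amazon's actual rates or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Hypothetical unit prices in US dollars (not real EC2 rates)."""
    per_instance_hour: float = 0.10
    per_gb_month_storage: float = 0.15
    per_gb_transfer: float = 0.17


def estimate_monthly_cost(instance_hours, storage_gb, transfer_gb, rates=Rates()):
    """Sum the three metered components: compute time, storage, bandwidth."""
    compute = instance_hours * rates.per_instance_hour
    storage = storage_gb * rates.per_gb_month_storage
    bandwidth = transfer_gb * rates.per_gb_transfer
    return round(compute + storage + bandwidth, 2)


if __name__ == "__main__":
    # Two VMs running for a 30-day month (2 * 720 hours),
    # 50 GB stored, 100 GB transferred.
    print(estimate_monthly_cost(instance_hours=1440, storage_gb=50, transfer_gb=100))
```

The point of the model is that nothing is paid up front: a month with no running instances, no stored data, and no traffic costs nothing.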

The Eucalyptus project (Elastic Utility Computing Architecture for Linking Your Programs To Useful Systems) is an open source implementation of Amazon EC2 that is interface-compatible with the commercial service. Like EC2, Eucalyptus relies on Linux with Xen for operating system virtualization. Eucalyptus was developed at the University of California, Santa Barbara, for the purpose of cloud computing research. You can download it from the university’s Web site (see Resources), or you can experiment with it via the Eucalyptus Public Cloud with certain restrictions.

Another EC2 style of IaaS is the Enomalism cloud computing platform. Enomalism is an open source project that provides a cloud computing framework with functionality similar to EC2. Enomalism is based on Linux, with support for both Xen and the Kernel Virtual Machine (KVM). But unlike other pure IaaS solutions, Enomalism provides a software stack based on the TurboGears Web application framework and Python.


Open options for cloud computing

Source: http://www.linux.com/feature/144529

Author: Jack M. Germain

Some cloud computing vendors, such as 3tera and Nirvanix, push their own proprietary platforms and tools, which forces adopters to limit their options and work in a restricted or closed architecture. When these established vendors say cloud, they mean their cloud. As a result, Web developers may believe that, in order to use cloud computing, they must accept limitations in the way they write and build their applications. But that view is a misconception; open standards for cloud computing are already in place and are being tweaked.

This does not mean that a single cloud computing platform is universally available. But just as some vendors have developed their own proprietary platforms for working in the clouds, so have various open source companies and communities.

“We’re already there. That is the trend I’m seeing,” says Jim Zemlin, executive director of the Linux Foundation. “Most Web-based startups are not buying hardware or software. They are using open source middleware and programming products like Ruby on Rails and Perl.”

Among the most popular middleware products are JBoss Enterprise Middleware, WSO2, Iona Fuse, and IBM WebSphere Application Server Community Edition.

What’s in a name?

Cloud computing is more of a process than one set technology. The concept behind what is now referred to as cloud computing has been called a variety of things, including cluster computing, utility computing, grid computing, and on-demand computing.

In its current trappings, the cloud computing model involves distributing computing tasks such as data storage and data center contents to a variety of Internet connections, software, and services accessed over a network. This collection of servers enables users to access supercomputing features. The data is not anchored to one physical location.

The push toward open standards for cloud computing has been going on for some time. This trend toward using open source tools for accessing the clouds is continuing to grow, says Zemlin.

Which path?

Perhaps the most challenging factor for potential adopters of cloud computing services to consider is which path best meets their needs. According to Zemlin, many organizations are integrating open source products to offer choices for accessing cloud computing services.

For example, a team of developers in the Computer Science Department at the University of California, Santa Barbara, recently released the Eucalyptus Project, an open source infrastructure for cloud computing that mimics Amazon’s Elastic Compute Cloud (EC2), under the FreeBSD license. The name Eucalyptus stands for Elastic Utility Computing Architecture for Linking Your Programs To Useful Systems. This software infrastructure implements cloud computing on clusters. Its design supports multiple client-side interfaces. Eucalyptus uses Linux tools and basic Web service technologies.

Another example is the 10gen platform-as-a-service technology. Recently released in alpha, it is designed to help developers build dynamic, scalable, mission-critical Web sites and applications. According to the 10gen Web site, its software stack is analogous to Google’s App Engine in that it provides a new stack of database, grid management, and application server tools to run in a cloud environment. The application server supports JavaScript as its first development language, and it presently also supports Ruby. 10gen developers plan to build in support for other languages.

Not so fast

Some cloud computing players are not disputing the availability of open source products but question how much of a standard exists yet. They are not sure, for instance, how best to apply data management to the cloud.

“We are all for open standards via open source. But there is no clear path yet to what that standard should be,” says Aaron Darcy, director of product line management for the JBoss Division of Red Hat.

Darcy says his company’s top tier enterprise customers are pushing the envelope on cloud computing. They ask for cloudware and want extensions of existing software and standards.

“Our customers want to leverage cloud computing for its economy and quick deployment to market. But they don’t want to reinvent the wheel,” he says.

Red Hat, like other companies, is rushing products to market to fill customer demands in the clouds. For example, Amazon has worked a deal with Red Hat to run some of its open source products, such as JBoss Application Server, in the clouds. That makes sense, Darcy says, because it reflects a natural extension of what his company is doing on the enterprise level.

The goal of cloud computing should be access through open standards, according to Darcy. That is the only way the technology can adjust to new developments without locking in users to an inflexible platform, he believes.

“With cloud computing standards, no one has it right yet, including us. The market is still so young,” he says.

Rules clouded

The Linux Foundation’s Zemlin does not dispute the growing pains cloud computing is facing. Clearly, it is at an early stage of development. That means the industry has not yet put all of the needed tools in place.

The industry needs to solve a few things, however, he believes. For example, cloud operators need better manageability applications to allow cloud users to understand utilization rates. Numerous open source tools are doing this now, Zemlin says.

To that end, Zemlin wants to see a consistent way to meter and charge for cloud use. Cloud computing is a game of scale. Its real benefit comes from leveraging economies of scale, he notes.

Balancing act

At this point in the growth of cloud computing, companies that want to take advantage of cloud services have to consider whether to build their own computing clouds or subscribe to others’ cloud servers. These decisions often involve guesswork about which platform or cloud service offers the most reliability and longevity. This guessing game in part results from the lack of a clearly defined cloud standard supported by both proprietary and open source developers.

“Given budget constraints, customers need to evaluate the trade-offs,” says Red Hat’s Darcy. Cost differences are one of these trade-offs. For instance, potential adopters have to weigh the expenses associated with buying into a proprietary cloud platform or a community-sponsored or paid support open source product. Also, moving to the clouds could entail purchasing both hardware and software, which could result in additional upgrades of existing programs or time and money spent learning to adapt to new programs.

Another trade-off at this stage of the cloud computing game, Darcy says, is concern about security risks and performance hits. Placing a company’s data in somebody else’s cloud configuration, for instance, raises worries about how secure the data is. Even more troubling may be how the additional middleware layer affects the corporate computers.

“All of the hidden factors are not recognized yet,” Darcy concludes.

Many of the growing pains for cloud computing are similar to what the software as a service (SaaS) industry suffered in that technology’s early stages, according to Zemlin. As he sees it, SaaS was a first-generation technology for cloud computing. Soon, cloud computing may have similar benefits for smaller businesses and consumers.


Stateless computing: the future of the cloud?

This is an interesting post that I’ve read several times. My thoughts and understanding are highlighted in bold.

=-=-

Source: http://arstechnica.com/…future-of-the-cloud.html
Author: Ryan Paul

At the LinuxWorld and Next Generation Data Center Expo in San Francisco, Merrill Lynch’s chief technology architect, Jeffrey Birnbaum, discussed stateless computing and the evolution of the cloud. He envisions a future in which software processes are abstracted so far from the underlying hardware that companies will discuss processing capacity in terms of raw computational units rather than discrete servers.

The real underlying value of cloud computing, he says, is that it transparently makes software and data available everywhere. He contends that this “stateless” model facilitates much greater scalability than conventional computing and can be used in conjunction with virtualization to achieve maximum data center utilization. “Stateless will emerge as a core, basic tenet of what’s in the cloud,” he remarked.

During his presentation, he described the architecture of a globally distributed stateless software implementation and explained some of the steps that Merrill Lynch has taken to move towards this model for its own internal IT infrastructure.

Birnbaum believes that one of the key foundational elements of a stateless computing environment is a networked storage system that enables ubiquitous availability of software. The file paths of the individual applications should be based on clearly defined nomenclature, much like the domain of a web site. All application dependencies should be accessible through the network filesystem, and version numbers should be expressed with the path nomenclature.
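To make the path nomenclature idea concrete, here is a minimal sketch of how a versioned, domain-style application path might be built. The layout (`<root>/<reversed domain>/<app>/<version>`), the `/efs` mount point, and the function names are all assumptions for illustration; the talk does not spell out the actual scheme.

```python
from pathlib import PurePosixPath


def app_path(root, domain, app, version):
    """Build a versioned application path from a domain-style name.

    The layout <root>/<reversed domain>/<app>/<version> is an assumption
    made for illustration; the source does not specify the real scheme.
    """
    # Reverse the domain so the broadest owner comes first:
    # example.com -> com/example
    owner_parts = reversed(domain.split("."))
    return PurePosixPath(root, *owner_parts, app, version)


def dependency_paths(root, deps):
    """Resolve each (domain, app, version) dependency to its network path."""
    return [str(app_path(root, d, a, v)) for d, a, v in deps]


if __name__ == "__main__":
    print(app_path("/efs", "example.com", "pricing-engine", "2.1.0"))
    # Two versions of one library can coexist because the version is in the path.
    print(dependency_paths("/efs", [("example.com", "mathlib", "1.4"),
                                    ("example.com", "mathlib", "2.0")]))
```

Because the version is part of the path, an application can pin the exact dependency versions it was built against without installing anything locally.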

There are numerous advantages to this kind of network storage scheme. In addition to making the information and software universally accessible, it also obviates many of the technical challenges that are typically associated with software deployment. He also dubiously claims that this approach eliminates the need for preintegrated software stacks, a claim that I view with great skepticism (I’d argue that a big part of preintegration is harmonizing disparate components, and that entails a lot more than just deployment).

The obvious challenge posed by rolling out worldwide network storage infrastructure is scalability. If everyone in a global organization is depending on a network storage solution, then it needs to be fast and consistently reliable. The solution that Birnbaum proposes is regional mirroring and caching. The storage system would be universally synchronized between mirrors that have all the data. Caching can also be used at individual facilities to further improve performance. To achieve this kind of global scalability, he says, the best approach is similar to that of Akamai.
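The mirroring-plus-caching arrangement can be sketched as a read-through cache sitting in front of a regional mirror. This is a toy in-memory model under stated assumptions: the `Mirror`, `SiteCache`, and `synchronize` names are hypothetical, the data is a plain dict, and cache invalidation is left out entirely.

```python
class Mirror:
    """A regional mirror holding a full copy of the data (modeled as a dict)."""

    def __init__(self, name, data):
        self.name = name
        self.data = dict(data)
        self.reads = 0  # counts how often the mirror is actually contacted

    def read(self, path):
        self.reads += 1
        return self.data[path]


class SiteCache:
    """Read-through cache at one facility, backed by the nearest mirror."""

    def __init__(self, mirror):
        self.mirror = mirror
        self._cache = {}

    def read(self, path):
        # Serve locally when possible; otherwise fetch from the mirror once.
        if path not in self._cache:
            self._cache[path] = self.mirror.read(path)
        return self._cache[path]


def synchronize(source, mirrors):
    """Push the full data set to every mirror so all mirrors hold all the data."""
    for mirror in mirrors:
        mirror.data = dict(source)
```

Repeated reads of the same path at one facility contact the mirror only once. A real deployment would also need to expire cached entries when the mirrors change, which this sketch does not attempt.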

The system Birnbaum describes isn’t just theoretical. Merrill Lynch has been deploying its own implementation, which they call the Enterprise File System (EFS). Applications are streamed across the network and operate with no persistent state on local systems. This has worked well with Linux-based applications over NFS, but he says that they have encountered some challenges with Windows, which wasn’t designed to be used in that manner. For Windows applications, he suggests using virtualization—the user employs RDP to access applications running in virtualized environments in the data center.

These concepts don’t cover a whole lot of new ground yet. Much of this was already possible with conventional thin-client systems. The point at which it becomes immensely valuable, according to Birnbaum, is when all of these technologies are used together with virtualization to abstract the processes away from the hardware. Once this is done, individual operations can seamlessly float around data centers and balance out in a manner that offers a more optimal level of resource utilization.

He claims that 61 percent of a company’s enterprise server capacity goes completely unused and proposes an automated load balancing solution—a placement engine—which will manage the resources and the active processes, moving them between servers so that the existing capacity can be used more efficiently. His goal is to reach approximately 80 percent utilization (he cautions against going higher than that because he thinks it’s beneficial to always have spare capacity for unexpected circumstances).
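The numbers above invite a quick back-of-the-envelope check. If 61 percent of capacity is unused, 39 percent is in use, so running the same load at 80 percent utilization would need roughly 0.39 / 0.80, or about 49 percent, of the current servers. The sketch below computes that and adds a toy placement routine. It is an illustration only: first-fit-decreasing packing is a technique swapped in here, not the placement engine Birnbaum describes, and it ignores live migration, memory, and I/O.

```python
import math


def servers_needed(current_servers, used_fraction, target_utilization=0.80):
    """Servers required to carry the same load at the target utilization."""
    load = current_servers * used_fraction
    return math.ceil(load / target_utilization)


def place(loads, capacity=1.0, target_utilization=0.80):
    """First-fit-decreasing placement: pack process loads onto as few servers
    as possible without pushing any server past the utilization cap."""
    cap = capacity * target_utilization
    servers = []  # each entry is the list of loads placed on one server
    for load in sorted(loads, reverse=True):
        if load > cap:
            raise ValueError("a single process exceeds the per-server cap")
        for placed in servers:
            # Small tolerance so floating-point sums do not reject an exact fit.
            if sum(placed) + load <= cap + 1e-9:
                placed.append(load)
                break
        else:
            servers.append([load])
    return servers


if __name__ == "__main__":
    print(servers_needed(100, 0.39))
    # Six lightly loaded processes, each on its own server today.
    print(place([0.3, 0.3, 0.2, 0.5, 0.1, 0.2]))
```

The 80 percent cap leaves the headroom he recommends: no server is ever packed completely full.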

One of the big advantages of this approach is that it makes it possible for companies to use cheaper hardware. Each individual server doesn’t need to be extremely reliable because processing can always be moved elsewhere in the event of hardware failure. The placement engine, however, is still the missing piece of the puzzle. Virtualization management technologies just aren’t smart enough to do that kind of manipulation in a fully automated way yet. He also says that the industry needs “bigger pipes and lower-latency pipes” in order to handle the constant flow of data (we wouldn’t want the tubes to clog).

The ideas he presented reflect several emerging IT trends. Virtualization is becoming increasingly prominent in data centers because of its advantages for utilization and scalability. The underlying message of his presentation is that the central principles of cloud computing can be adapted for use with conventional software and leveraged to increase the overall efficiency of computation. These ideas could help shape the way that next generation data centers are architected.


What Does Cloud Computing Mean for You?

I found this article published at PCMag and thought I should highlight some points in red.

Source > http://www.pcmag.com/article2/0,2704,2320619,00.asp
Author > John Brandon

Cloud computing is set to take over the world, or at least possibly replace Microsoft Outlook. The cloud concept is simple: It’s a way to access your data and apps from anywhere, via the Internet (or “the cloud”). Yet everyone from Gartner Group to Google has a slightly different take on cloud computing: It can be anything from storing and sharing documents on Google Docs to running your entire company operations using a remote, third-party data center. Some envision it as a way to compute without operating systems, or pesky local client programs, and with minimal hardware needs (just a basic client machine).

“The most important single characteristic of a cloud is abstraction of the hardware from the service,” says John Willis, a noted cloud-computing expert and blogger, explaining that the location of the servers is not as important as easy access to the data. “However you define it, I think cloud technology will have a footprint in every business that does IT within the next five years.”

The particular type of cloud computing that the business world could take advantage of requires massive server cluster farms and superfast network bandwidth. It also requires that companies be ready to hand over their data to a third party. A few small companies, among them Zoho.com (which offers business apps, such as word processing and task lists) and Box.net (which supplies online file storage) have established themselves as SaaS (software as a service) providers, with varying degrees of success. But SaaS is primarily a race between Google and Microsoft to provide advertiser-supported cloud applications to customers.

Security is one critical issue that both companies must address. Depending on the SaaS provider, data can be encrypted from point to point, and since services are Web-based, they’re very easy to patch. Google, for example, can respond to a new security threat without customers even being aware of the problem—or the fix. But end users essentially would have to entrust their data to an outside entity, which is a big leap of faith. Dave Girouard, a VP and general manager at Google, says that the company is working to allay the fears that make trust difficult to achieve.

“Google is investing enormous amounts of capital and sweat equity to ensure that we can protect your data better than you can do yourself,” he says. “Cloud computing will be additive. Usage patterns will change, and users will look primarily to the cloud for most of the things they turn to their PCs for today.”

Yet others aren’t as optimistic about cloud computing. Forrester Research analyst Frank Gillett cautions that it’s not quite ready for prime time. He says that the framework is in an early phase of development—it’s almost experimental, rather than a reliable and trusted computing paradigm.

Ironically, even though Google is battling to dominate the cloud, some of its apps, such as Google Earth, still cache a tremendous amount of data locally to speed up operations. Add to that the privacy, network bandwidth, and political hurdles yet to address, and it looks as if cloud computing will have to drop down to earth a bit more before it can enjoy widespread adoption by both consumers and businesses.


The New York Times’ TimesMachine

What happened to the Titanic in 1912? Let’s check it out.

This image comes from http://timesmachine.nytimes.com/browser, the TimesMachine service from The New York Times. You can browse a scanned copy of the newspaper for any day from 1850 to 1920.

The storage is likely in the terabyte range. Behind The New York Times, Amazon’s technology makes this happen: the Simple Storage Service (S3) plus Apache Hadoop, running on top of Amazon’s own Elastic Compute Cloud (EC2). More than 405,000 JPEG or TIFF files are reported to be stored in this cloud environment.

It is a real commercial cloud computing service. See the details written by Derek Gottfrid.

What does TimesMachine look like?

What is Hadoop? What technology is used? MapReduce

How does MapReduce work?

See the overview from Google > http://labs.google.com/papers/mapreduce.html

And a comprehensive tutorial > http://code.google.com/edu/parallel/mapreduce-tutorial.html
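As a quick taste of the idea before reading those links, here is a minimal single-machine sketch of the MapReduce pattern, using the classic word-count example. The function names are illustrative and are not Hadoop's API; a real Hadoop job distributes the map and reduce steps across many machines, which this toy version does not.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map over every document, group by word, then reduce each group."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["Titanic sinks", "titanic news"]))
```

Each call to `map_phase` depends on only one document and each call to `reduce_phase` on only one key, which is why the work can be spread over many machines.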


Amazon EC2’s architecture & offerings

On June 17, 2008, Red Hat announced its second offering on Amazon EC2: JBoss Enterprise Application Platform. Now available on EC2’s flexible, pay-as-you-go computing environment, JBoss Enterprise Application Platform provides developers with the most popular clustered Java EE application server and next generation application frameworks to build innovative and scalable Java applications.

Earlier, in May, Sun Microsystems and Amazon Web Services began collaborating to offer two open source solutions on Amazon EC2: OpenSolaris and MySQL technical support. With OpenSolaris OS on Amazon EC2, you have access to a robust operating system on a scalable, cost-effective virtual computing environment. And now MySQL Enterprise customers can choose to deploy their database on Amazon EC2 and receive full database software and production support from MySQL. These new offerings extend the breadth and support of EC2’s on-demand, pay-as-you-go computing environment. Learn about OpenSolaris on Amazon EC2 and MySQL on EC2.

See other partners of Amazon in EC2 offering > http://www.amazon.com/b/ref=s…36L942TSJ2AJA

Information Source > http://www.amazon.com/b/ref=…A36L942TSJ2AJA

What should IBM do when we see these logos and services there?
