Technova

Understanding Grid Computing

Bhupendra Kumar

M.Tech. [wcc] 2003

bhupendra_wc03@iiita.ac.in

1. Introduction

The last decade has seen a substantial increase in the performance of commodity computers and networks, mainly as a result of faster hardware and more sophisticated software. Nevertheless, there are still problems in science, engineering, and business that cannot be dealt with effectively using the current generation of supercomputers. Because of their size and complexity, these problems are often numerically and data intensive, and consequently require a variety of heterogeneous resources that are not available on a single machine or within a single organization. These two factors, capable commodity systems on the one hand and demanding problems on the other, combine to make it possible to use distributed computers as a single, unified computing resource, an approach popularly known as Grid computing.

A Grid is a type of parallel and distributed system that enables the sharing, selection, and aggregation of geographically distributed autonomous resources (owned by different organizations) dynamically at runtime, depending on their availability, capability, performance, and cost, and on the users' quality-of-service requirements.

A high-level view of activities within the Grid is shown in the figure.

 

Users interact with the Grid resource broker to solve problems; the broker in turn performs resource discovery, scheduling, and the processing of application jobs on the distributed Grid resources. Building a Grid requires the development and deployment of a number of services. These include security, information, directory, resource allocation, and payment mechanisms in an open environment, as well as high-level services for application development, execution management, resource aggregation, and scheduling.

2. Virtual Organizations

The software tools and services that give a Grid its ability to link computing capability and data sources, in support of distributed analysis and collaboration, are collectively known as Grid middleware. Because Grid computing aims to provide users with a seamless computing environment, the Grid middleware has to handle several challenges. Some of them are:

Multiple administrative domains and autonomy: Grid resources are geographically distributed across multiple administrative domains and owned by different organizations. The autonomy of resource owners needs to be honored, along with their local resource management and usage policies.

Dynamic nature: In a Grid, resource failure is the rule rather than the exception. With so many resources in a Grid, the probability that some resource fails is high. Resource managers or applications must tailor their behavior dynamically and use the available resources and services efficiently and effectively.

Heterogeneity: A Grid involves a multiplicity of resources that are heterogeneous in nature and encompass a vast range of technologies.

Scalability: A Grid might grow from a few integrated resources to millions. This raises the problem of potential performance degradation as the size of the Grid increases.
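The claim that failure becomes the rule as a Grid grows can be illustrated with a small calculation. The sketch below assumes that every resource fails independently with the same probability during a job; that is an illustrative simplification, not something stated in this article, and the function name and the value 0.001 are hypothetical.

```python
def prob_any_failure(n_resources: int, p_fail: float) -> float:
    """Probability that at least one of n resources fails.

    Assumption (illustrative only): failures are independent and every
    resource has the same failure probability p_fail.
    """
    if n_resources < 0:
        raise ValueError("n_resources must be non-negative")
    if not 0.0 <= p_fail <= 1.0:
        raise ValueError("p_fail must be between 0 and 1")
    # P(no failure at all) = (1 - p)^n, so P(at least one) = 1 - (1 - p)^n
    return 1.0 - (1.0 - p_fail) ** n_resources


if __name__ == "__main__":
    # Hypothetical per-resource failure probability of 0.1%
    for n in (10, 1_000, 100_000):
        print(f"{n:>7} resources: {prob_any_failure(n, 0.001):.4f}")
```

Under this assumption, a per-resource failure probability of 0.1% gives roughly a 1% chance of some failure with 10 resources, about 63% with 1,000 resources, and practically 100% with 100,000.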

To tackle these challenges, the proposed Grid architecture is built around Virtual Organizations (VOs): different physical organizations come together to share resources and collaborate in order to achieve a common goal. A VO defines the resources available to participants and the rules for accessing those resources. Within a VO, participants belonging to member organizations are allocated a share of the resources based on the urgency and priority of each request, as determined by the objectives of the VO.
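How a VO might turn urgency and priority into shares can be sketched in a few lines. The example below is only one possible policy: it admits requests from member organizations and splits a fixed capacity in proportion to urgency times priority. The class names, the numeric scales, and the proportional rule are all assumptions made for illustration; the article does not prescribe a particular allocation formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    participant: str
    organization: str
    urgency: int   # hypothetical scale: higher means more urgent
    priority: int  # hypothetical scale: higher means more important


class VirtualOrganization:
    """Toy VO: a set of member organizations sharing a fixed capacity."""

    def __init__(self, name, member_orgs, capacity):
        self.name = name
        self.member_orgs = set(member_orgs)
        self.capacity = capacity  # e.g. CPU-hours available to the VO

    def allocate(self, requests):
        """Split capacity among admitted requests, weighted by urgency * priority."""
        # Access rule: only participants of member organizations are admitted.
        admitted = [r for r in requests if r.organization in self.member_orgs]
        total_weight = sum(r.urgency * r.priority for r in admitted)
        if total_weight == 0:
            return {}
        return {
            r.participant: self.capacity * r.urgency * r.priority / total_weight
            for r in admitted
        }


if __name__ == "__main__":
    vo = VirtualOrganization("physics-vo", {"org-a", "org-b"}, capacity=100.0)
    shares = vo.allocate([
        Request("alice", "org-a", urgency=3, priority=1),
        Request("bob", "org-b", urgency=1, priority=1),
        Request("eve", "org-x", urgency=5, priority=5),  # not a member
    ])
    print(shares)
```

A real VO would express its objectives through richer policies (quotas, deadlines, roles); the proportional rule is used here only because it is easy to follow.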

3. Grid Components

In a world-wide Grid environment, the capabilities that the infrastructure needs to support include:

  • Remote storage and replication of data sets
  • Publication of data sets using global logical names
  • Security: access authorization and uniform authentication
  • Uniform access to remote resources (data and computational resources)
  • Publication of services and access cost
  • Discovery mechanism for suitable data sets and computational resources
  • Mapping and scheduling of jobs (aggregation of distributed resources)
  • Submission and monitoring of job execution
  • Movement of data between the user machines and distributed resources
  • Enforcement of QoS requirements
  • Metering and accounting of resource usage

These capabilities play a significant role in a variety of scientific, engineering, and business applications. The Grid components that provide them are arranged into layers. Each layer builds on the services provided by the layer below it, in addition to interacting and co-operating with components at the same level. There are four layers: fabric, core middleware, user-level middleware, and applications and portals.

Grid fabric: This consists of all the globally distributed resources that are accessible from anywhere on the Internet. These resources could be computers (such as PCs or Symmetric Multi-Processors) running a variety of operating systems (such as UNIX or Windows), storage devices, databases, and special scientific instruments such as a radio telescope or a particular heat sensor.

A layered Grid architecture and components

Core Grid middleware: This offers core services such as remote process management, co-allocation of resources, storage access, information registration and discovery, security, and aspects of Quality of Service (QoS) such as resource reservation and trading. These services abstract the complexity and heterogeneity of the fabric level by providing a consistent method for accessing distributed resources.

User-level Grid middleware: This includes application development environments, programming tools, and resource brokers for managing resources and scheduling application tasks for execution on global resources. It uses the interfaces provided by the low-level middleware to provide higher-level abstractions and services.

Grid applications and portals: Grid applications are typically developed using Grid-enabled languages and utilities. An example application, such as a parameter simulation or a grand-challenge problem, would require computational power and access to remote data sets, and may need to interact with scientific instruments. Grid portals offer Web-enabled application services, where users can submit jobs to remote resources and collect the results through the Web.
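The layering described above, in which each layer uses only the services of the layer beneath it, can be sketched as a chain of small classes. This is a toy model under stated assumptions: the class and method names are hypothetical, a resource is reduced to a name and a count of free CPUs, and the "scheduling" simply picks the resource with the most free CPUs.

```python
class Fabric:
    """Fabric layer: the raw, heterogeneous resources."""

    def __init__(self, resources):
        # resource name -> number of free CPUs (hypothetical attribute)
        self.resources = dict(resources)


class CoreMiddleware:
    """Core middleware: a uniform way to discover and use fabric resources."""

    def __init__(self, fabric):
        self._fabric = fabric

    def discover(self):
        return sorted(self._fabric.resources)

    def free_cpus(self, name):
        return self._fabric.resources[name]

    def run(self, name, job):
        # Stand-in for remote process management
        return f"{job} ran on {name}"


class UserMiddleware:
    """User-level middleware: a toy resource broker built on the core services."""

    def __init__(self, core):
        self._core = core

    def schedule(self, job):
        # Pick the resource that currently reports the most free CPUs
        best = max(self._core.discover(), key=self._core.free_cpus)
        return self._core.run(best, job)


class Portal:
    """Application/portal layer: the user-facing entry point."""

    def __init__(self, broker):
        self._broker = broker

    def submit(self, job):
        return self._broker.schedule(job)


if __name__ == "__main__":
    fabric = Fabric({"unix-cluster": 64, "windows-pc": 2})
    portal = Portal(UserMiddleware(CoreMiddleware(fabric)))
    print(portal.submit("simulation"))
```

Because the portal never touches the fabric directly, the resources underneath can change without affecting user-facing code, which is the point of the layered design.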

4. Operational flow

For the resources that make up the Grid to be usable, they need to be accessible from different management domains. This can be achieved by installing core Grid middleware such as Globus in a Unix/Linux environment or Alchemi in a Windows environment. Multi-node resources can be presented to the Grid as a single resource by using a job management system such as Condor or PBS. Data Grid technologies such as SRB, Globus RLS, or the EU DataGrid may be deployed in environments where databases are to be federated for sharing among various parties.

The following steps outline the interaction between the various Grid components from the user's perspective.

  1. Users compose their application as a distributed application using development tools.
  2. Users specify QoS parameters and submit the job to the Grid resource broker.
  3. The broker discovers resources and their characteristics using the Grid information service.
  4. The broker identifies resource prices by querying the Grid market directory.
  5. The broker lists the data sets or replicas and selects the optimal ones.
  6. It also lists the computational resources that provide the required application services.
  7. The broker interacts with the user to check that the necessary credit and authorization are in place.
  8. The broker scheduler maps and deploys data analysis jobs on resources that meet the user's QoS requirements.
  9. The broker agent on a resource executes the jobs and returns the results.
  10. The Grid resource broker collates the results and passes them to the user.
  11. The metering system charges the user by passing resource usage information to the accounting system.
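The broker-side part of these steps can be condensed into a short sketch. It is a simplified model, not the behavior of any real broker: resource capacity and price stand in for what the information service and market directory would return, the only QoS parameter is a budget, scheduling is a cheapest-first greedy choice, and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    name: str
    cpu_hours_available: float  # as reported by an information service
    price_per_cpu_hour: float   # as listed in a market directory


def broker_plan(jobs, resources, budget):
    """Map each job to the cheapest resource that can still hold it.

    jobs: dict of job name -> CPU-hours required.
    Returns (placements, usage_records, total_cost); raises RuntimeError
    if a job cannot be placed or the plan exceeds the user's budget.
    """
    remaining = {r.name: r.cpu_hours_available for r in resources}
    cheapest_first = sorted(resources, key=lambda r: r.price_per_cpu_hour)

    placements = {}
    usage_records = []  # what a metering system would hand to accounting
    for job, hours in jobs.items():
        for res in cheapest_first:
            if remaining[res.name] >= hours:
                remaining[res.name] -= hours
                placements[job] = res.name
                usage_records.append((job, res.name, hours * res.price_per_cpu_hour))
                break
        else:
            raise RuntimeError(f"no resource can run job {job!r}")

    total_cost = sum(cost for _, _, cost in usage_records)
    # Credit check: refuse the plan if it would cost more than the user allows
    if total_cost > budget:
        raise RuntimeError("plan exceeds the user's budget")
    return placements, usage_records, total_cost


if __name__ == "__main__":
    resources = [Resource("cheap", 10.0, 1.0), Resource("fast", 100.0, 2.0)]
    placements, usage, cost = broker_plan({"a": 6.0, "b": 6.0}, resources, budget=20.0)
    print(placements, cost)
```

In this run the first job fills most of the cheaper resource, so the second job spills over to the more expensive one; the usage records are what the metering step would pass on for charging.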

5. Summary

Grid computing is becoming the preferred platform for next-generation eScience experiments that require the management of massive distributed data. This article has presented a basic overview of the architecture needed to understand Grid computing. It includes a layered software architecture in which components providing different services are assigned to different layers according to their function. Together, these provide a seamless computing environment in which users only need to submit jobs to the Grid.
