Product Details

Key Features and Functions
- Policy Driven Workload Scheduler
- High Resource Availability
- Open Architecture For Application Development And End User Access
- Open Architecture For Choice Of File Systems
- Supporting Multiple MapReduce Applications Running On The Same Cluster
- High Scalability
- Greater Monitoring And Troubleshooting Capabilities
- Support Rolling Upgrade
- Automatic Cleanup
- Faster Performance
- Flexible Resource Sharing
- Platform Symphony MapReduce MultiCore Optimizer
- Platform Symphony MapReduce Data Affinity
Policy Driven Workload Scheduler
The Platform Symphony MapReduce policy driven workload scheduler allows multiple jobs to be executed in parallel based on a powerful priority scheduling engine. Platform Symphony MapReduce provides 10,000 levels of priority and fair share scheduling of Mapper and Reducer jobs, all done at the job level to provide better granularity and control. In contrast, the open source Hadoop solution does not provide fair share scheduling of its Mapper and Reducer jobs in its default deployment; instead, it uses a FIFO methodology. Even with its fair share scheduling plug-in options, open source Hadoop is restricted to five levels of priority.
One of the distinct advantages of the Platform Symphony MapReduce policy driven workload scheduler is that it delivers resource priority for preemptive jobs. Once a preemptive job is requested, it gets all the resources needed to complete the job; existing jobs wait for additional resources until the preemptive job is done. Once the preemptive job has completed, the previously running jobs resume under fair share scheduling priorities.
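The interaction between fair share scheduling and a preemptive job can be illustrated with a small sketch. This is not the Platform Symphony scheduler or its API; the function name `allocate_slots`, the job names, and the proportional-to-priority rule are all assumptions chosen only to show the idea that a preemptive job is served first and the remaining slots are shared among the other jobs.

```python
def allocate_slots(total_slots, jobs, preemptive=None):
    """Split slots among jobs in proportion to their priority.

    Hypothetical illustration only, not the product's algorithm.
    jobs: dict mapping job name -> (priority, slots_wanted)
    preemptive: optional (name, slots_wanted) served before all other jobs
    """
    allocation = {}
    remaining = total_slots

    # A preemptive job takes what it needs before anyone else is served.
    if preemptive is not None:
        name, wanted = preemptive
        allocation[name] = min(wanted, remaining)
        remaining -= allocation[name]

    # The rest is shared in proportion to priority (integer division,
    # so a few slots may stay idle in this simplified version).
    total_priority = sum(priority for priority, _ in jobs.values())
    for name, (priority, wanted) in jobs.items():
        share = remaining * priority // total_priority if total_priority else 0
        allocation[name] = min(wanted, share)
    return allocation


jobs = {"etl": (3, 100), "report": (1, 100)}
normal = allocate_slots(100, jobs)
during_preemption = allocate_slots(100, jobs, preemptive=("urgent", 60))
```

Calling `allocate_slots` again without the `preemptive` argument models the point at which the preemptive job has completed and the earlier jobs return to their fair share.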
High Resource Availability
Platform Symphony MapReduce guarantees uptime within the distributed runtime engine, with no single point of failure. If the server running the Master Job Tracker fails, the Platform Symphony MapReduce job(s) continue: a Master Job Tracker is automatically restarted on a new server, and the currently running MapReduce tasks are then automatically restarted. Recovery of any tasks in progress at the time of failure is automatic, and completed tasks do not have to be re-run. The open source Hadoop implementation does not provide such capability. In addition, Platform Symphony MapReduce provides automatic recovery of individual Map and Reduce tasks if the compute server running them fails. They are either restarted on the current compute server (if possible) or restarted immediately on an alternative server.
For Hadoop file systems, Platform Symphony MapReduce offers automatic failover of the NameNode within the Hadoop Distributed File System (HDFS) and provides file system recovery and dependent job recovery.
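The statement that completed tasks do not have to be re-run after a Job Tracker failure can be pictured as a record of finished tasks that survives the restart. The sketch below is a minimal illustration under that assumption; the class `JobJournal` and the task identifiers are hypothetical and do not reflect how Platform Symphony MapReduce stores its state.

```python
class JobJournal:
    """Hypothetical record of finished tasks that outlives a tracker restart."""

    def __init__(self):
        self.completed = set()

    def mark_completed(self, task_id):
        # Called whenever a map or reduce task finishes successfully.
        self.completed.add(task_id)

    def tasks_to_run(self, all_tasks):
        # After a restart, only tasks that never finished are scheduled again.
        return [task for task in all_tasks if task not in self.completed]


journal = JobJournal()
journal.mark_completed("map-1")
journal.mark_completed("map-2")

# The tracker fails here; a new tracker consults the same journal.
to_restart = journal.tasks_to_run(["map-1", "map-2", "map-3", "reduce-1"])
```

In this example only `map-3` and `reduce-1` are scheduled again; the two finished map tasks are skipped.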
Open Architecture For Application Development And End User Access
Platform Symphony MapReduce is built on an open architecture to support multiple MapReduce applications, including 100% Hadoop application compatibility for Java based MapReduce jobs. Application integration for jobs built with Hadoop MapReduce technology (Java, Pig, Hive) requires no changes to the programming logic for execution on Platform Symphony MapReduce. With no changes to the programming logic, customers avoid any requirement to recompile code. Examples of applications supported by the Platform Application Adapter technology include Pig, Hive, Java, Oozie, Dumbo, natively written Java MapReduce programs, and others.
Open Architecture For Choice Of File Systems
Platform Symphony MapReduce provides a method for leveraging multiple file system types as well as database architectures. Platform Symphony MapReduce fully supports HDFS, GPFS and other distributed file system types and data types. In addition, for MapReduce processes, the input data source file system type can be different from the output data source file system type. This provides support for many uses, including extract, transform, and load (ETL) workflow logic. For example, the input data source could be HDFS while the output data source could be a database. This eliminates the need to stage data prior to a batch load process.
Supporting Multiple MapReduce Applications Running On The Same Cluster
Platform Symphony MapReduce supports up to 300 separate MapReduce applications (also known as Job Trackers) running on the same distributed file system cluster. Multiple applications (or Job Trackers) can run simultaneously and dynamically share resources across application boundaries. This eliminates siloed IT operations in which a cluster is dedicated to a single Job Tracker. Such capability also increases resource utilization dramatically while still maintaining a single management interface. In addition, Platform Symphony MapReduce supports mixed types of workloads (MapReduce as well as other distributed workloads) running on a single cluster. Such capability allows customers to leverage existing resources, drive up utilization of all resources, balance the workload across applications dynamically, and thus get the most out of their IT infrastructure.
High Scalability
Platform Symphony MapReduce offers 3 times scalability over its open source alternatives per distributed file system cluster:
- Up to 5,000 nodes and 40,000 cores
- 40,000 concurrent tasks
- 1,000,000 total tasks in a single job
- 1,000 concurrent jobs with 300 MapReduce (Job Tracker) applications
Platform Symphony MapReduce is architected for a virtually unlimited number of priority levels for job scheduling. Note that open source Hadoop can only support 5 levels.
Greater Monitoring And Troubleshooting Capabilities
Platform Symphony MapReduce monitors CPU and memory utilization levels and allocates resources accordingly. It provides the ability to pull log data from individual servers and manage it from a single interface.
Support Rolling Upgrade
Platform Symphony MapReduce supports multiple versions of MapReduce applications running on the same cluster, so there is no need to take down the entire cluster for a software upgrade. Customers have the option of selecting the servers to upgrade. Once upgraded, these servers can co-exist with the previous version of the product on other nodes, which allows upgrades to be done incrementally over a set of servers without taking down the entire cluster.
Automatic Cleanup
Platform Symphony MapReduce automatically cleans up intermediate and temporary files upon job completion. As part of the MapReduce logic, all temporary and intermediate files are removed from the local servers at the completion of the last reduce task associated with a job. With Platform Symphony MapReduce, cleanup is done automatically at the job level. In addition, all reduce tasks must complete before the cleanup task starts. This dependency ensures that files are not deleted early and can thus be used again if a failure occurs in the middle of a job.
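The dependency described above, in which cleanup waits for the last reduce task, can be sketched as follows. This is a simplified illustration, not the product's implementation: the class `JobScratch`, its method names, and the use of a local temporary directory are assumptions made for the example.

```python
import shutil
import tempfile
from pathlib import Path


class JobScratch:
    """Hypothetical per-job scratch area removed after the last reduce task."""

    def __init__(self, reduce_tasks):
        self.pending = set(reduce_tasks)
        self.directory = Path(tempfile.mkdtemp(prefix="mr_job_"))

    def reduce_finished(self, task_id):
        # Intermediate files stay in place while any reduce task is pending,
        # so a failed task can still be re-run against them.
        self.pending.discard(task_id)
        if not self.pending:
            shutil.rmtree(self.directory, ignore_errors=True)
        return not self.pending


scratch = JobScratch(["reduce-1", "reduce-2"])
(scratch.directory / "spill.tmp").write_text("intermediate data")

cleaned_after_first = scratch.reduce_finished("reduce-1")
kept_after_first = scratch.directory.exists()
cleaned_after_last = scratch.reduce_finished("reduce-2")
kept_after_last = scratch.directory.exists()
```

The scratch directory is still present after the first reduce task finishes and is removed only once the second, final one reports completion.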
Faster Performance
Platform Symphony MapReduce is the industry’s fastest distributed resource infrastructure solution. It has the ability to harness resources from distributed clusters in remote data centers. Platform Symphony MapReduce helps organizations run complex data simulations with sub-millisecond latency and a data throughput of over 7,300 tasks per second.
Flexible Resource Sharing
Platform Symphony MapReduce is able to react quickly and dynamically to changes based on application demand. It brings a flexible amount of computing power to your applications even with data streams to the distributed resources. Based on workload volume, Platform Symphony MapReduce distributed resources can grow or shrink by re-allocating up to 1,000 CPUs per second to adjust to the current workload, in order to reduce cost while maximizing results.
Platform Symphony MapReduce MultiCore Optimizer
Platform Symphony MapReduce MultiCore Optimizer helps increase application performance and lower infrastructure cost through higher utilization of multi-core servers. This capability makes the most out of multi-core servers for both multi-threaded and single-threaded applications. Application performance and scalability are improved through reduced I/O and memory contention in multi-core systems, increased utilization for multiple I/O-intensive tasks per core, and efficient matching of resources with non-uniform workloads.
Platform Symphony MapReduce Data Affinity
Platform Symphony MapReduce includes powerful data affinity capabilities to significantly improve application performance and resource utilization by taking into account data locality when scheduling MapReduce workloads. Its data affinity solution virtually eliminates the time it takes to access large data volumes required by data intensive MapReduce applications. It increases overall application performance by up to 400% through faster file access.
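Taking data locality into account when scheduling can be illustrated with a short sketch. It assumes a simple rule, which is not taken from the product documentation: prefer a node that already holds the input block and has a free slot, and fall back to any node with free capacity otherwise. The function `pick_node` and the node and block names are hypothetical.

```python
def pick_node(block_id, block_locations, free_slots):
    """Choose a node for a task, preferring nodes that hold its input block.

    Hypothetical illustration of locality-aware placement.
    block_locations: dict mapping block id -> list of nodes storing it
    free_slots: dict mapping node -> number of free task slots
    """
    # First choice: a node that stores the block and has a free slot.
    local = [node for node in block_locations.get(block_id, [])
             if free_slots.get(node, 0) > 0]
    if local:
        return max(local, key=lambda node: free_slots[node])

    # Fallback: any node with capacity; the data is then read remotely.
    candidates = [node for node, slots in free_slots.items() if slots > 0]
    if candidates:
        return max(candidates, key=lambda node: free_slots[node])
    return None


block_locations = {"block-1": ["node1", "node2"]}
free_slots = {"node1": 0, "node2": 1, "node3": 4}

local_choice = pick_node("block-1", block_locations, free_slots)
remote_choice = pick_node("block-2", block_locations, free_slots)
```

Here the task for `block-1` lands on `node2`, which stores the block, even though `node3` has more free slots; the task for the unknown `block-2` falls back to `node3`.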

