Intel® Cluster Ready FAQ

Author: Intel® Software Network
Published On: Friday, July 20, 2007 | Last Modified On: Friday, May 09, 2008

Contents
1. Intel® Cluster Ready Program

Q: What is the Intel® Cluster Ready program?
A: A program in which Intel®-based clusters are tested and certified together with independent hardware and software applications, following a documented recipe, to provide an interoperable solution stack. The goal of the program is to reduce the total cost of ownership (TCO) and the risk of deploying a cluster in the end user's environment.

Q: Does Intel® Cluster Ready restrict vendor-specific cluster value additions?
A: No. Intel® Cluster Ready is concerned with the base specification of the Intel cluster platform architecture, which provides a standardized, consistent, replicable way to run cluster applications. Provided the base specification is met, Intel® Cluster Ready allows vendor-specific additions such as cluster monitoring, management, hardware configuration, interconnect options, and/or runtime.

Q: How many independent software and hardware vendors are affiliated?
A: With the program just launching, we are excited to let our partners announce their support for Intel® Cluster Ready. You will also find a listing of our partners on our website. We are working with numerous partners and expect ongoing announcements over the coming months.

Q: As an OEM, Platform Integrator, or as an ISV, what do I get for joining?
A: You will receive a registration code providing access to:
  • The Intel® Cluster Ready Cluster Architecture and Specification
  • A list of pre-certified cluster recipes
  • Intel® Cluster Ready cluster test and tools software and documentation
  • Optional training on the Intel® Cluster Ready program and its tools
  • Documented procedures for certification and testing
  • Vendor certificates for all certified recipes
  • Registration of compliant ISV applications
  • Sales collateral templates
  • Opportunity to be listed in Intel® Cluster Ready marketing material

Q: What is a recipe?
A: A list of materials with a step-by-step deployment guide, written by an OEM or system integrator for each standard cluster configuration, which describes how to duplicate and deploy a hardware/software solution stack. The recipe is written in non-technical language and is thoroughly checked to ensure an accurate and working deployment.

Intel provides a list of pre-certified cluster recipes, as well as tools and processes to successfully create and test vendor recipes.

Q: What recipes are available?
A: View the complete Intel® Cluster Ready recipes in the catalog at http://www.esaa-members.com/index.php.

Q: What is a certificate?
A: An electronically signed document that provides proof of certification for a specific OEM/PI cluster. The certificate lists the OEM/PI's actual model name and number, as well as the application vendors supporting the configuration. This important document helps OEMs/PIs attract customers with the standardized Intel® Cluster Ready platform. It also states that the listed application vendors support this OEM/PI configuration and will stand behind the clustered solution.

Q: What is the process for joining?
A: Send an email to cluster@intel.com. Upon review and acceptance into the program, you sign a Memorandum of Understanding for the Intel® Cluster Ready Program and receive a registration code and a Service Level Agreement.


2. Intel® Cluster Checker

Q: What has actually been tested when Intel® Cluster Checker reports ‘Succeeded’ on a test?
A: Starting Intel® Cluster Checker as ‘cluster-check -verbose’ gives further information. The actual commands or tests are listed in detail if Intel® Cluster Checker is started with the XML config option <debug/> set per test. Each test is described in the Intel® Cluster Checker Users Guide and Module Reference documents with details such as synopsis and parameters.

Q: Does Intel® Cluster Checker support multiple head-, login-, or IO-node configurations?
A: Intel® Cluster Checker is started on a designated node and includes all nodes in a single list for all tests and checks. Currently, Intel® Cluster Checker 1.0 only understands the concept of a single head node. To test multiple head-, login-, or IO-nodes, list these nodes along with all compute nodes. To test multiple head node functionality, run Intel® Cluster Checker on each head node.

Q: Can Intel® Cluster Checker be run from a host which is not part of the cluster, i.e. neither head node nor compute node?
A: Yes. The only requirement is that this system must be able to ssh to the head node as well as to all the compute nodes.
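
The SSH requirement above can be checked before running Intel® Cluster Checker from an external host. The following is a minimal sketch, assuming an OpenSSH client and key-based login. The function name and the node names passed to it are hypothetical placeholders, not part of Intel® Cluster Checker.

```shell
# Report whether this host can open a non-interactive SSH session to each node.
# Node names are given as arguments (hypothetical placeholders in the usage below).
check_ssh_reachability() {
    failed=0
    for node in "$@"; do
        # BatchMode=yes makes ssh fail instead of prompting for a password;
        # ConnectTimeout limits how long an unreachable node blocks the loop.
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$node" true </dev/null >/dev/null 2>&1; then
            echo "ok      $node"
        else
            echo "FAILED  $node"
            failed=1
        fi
    done
    # Non-zero return status if any node could not be reached.
    return "$failed"
}
```

For example, 'check_ssh_reachability head node01 node02' prints one line per node and returns a non-zero status if any node is unreachable.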

Q: Can new or customized tests be added easily to Intel® Cluster Checker?
A: Yes. Intel® Cluster Checker provides a framework with over 60 pre-defined parallel cluster tests. Further modules can be added in two ways:

The easiest way, described in the manual section on user-defined tests, is to include commands directly as tests; predefined test scripts can be incorporated the same way. The more complex and flexible way is to create a Perl script interface, starting from a copy of the file ‘modules/template.pm’, which may be included at Intel® Cluster Checker invocation and executed with all other tests.

Q: How can a third party supply test modules to Intel® Cluster Checker for distribution in a future release of Intel® Cluster Checker?
A: Test modules that their original author has placed under the Common Public License (CPL) or a BSD/MIT/CMU-style license can be included in a future release of Intel® Cluster Checker. Intel could redistribute such cluster-checking modules in source form as part of a future Intel® Cluster Checker package.

Q: Can an OEM or Platform Integrator ship a cluster with Intel® Cluster Checker software on it?
A: It will be possible to pass the product version of Intel® Cluster Checker on to the end user. OEMs and Platform Integrators will be free to customize it and provide their own modules on top of the standard Intel® Cluster Checker. The right of redistribution is part of the Intel® Cluster Ready member agreement.

3. Intel® Cluster Ready ISV Application Registration

Q: What is the configuration of the registration cluster?
A: The registration cluster consists of a head node and 4 compute nodes.  The nodes contain S5000PAL (http://www.intel.com/design/servers/boards/s5000PAL/index.htm) server boards with 2 Xeon® processors and 4 GB of memory per node.

The cluster has both Ethernet and InfiniBand* interconnect fabrics.  InfiniBand* support is provided by OFED 1.1.  The systems have Red Hat Enterprise Linux 4 Update 4 installed; this OS was selected because it provides minimal Intel® Cluster Ready compliance.

The cluster has been certified as Intel® Cluster Ready compliant.

Q: Will I be the only user on the cluster?
A: Yes.  To protect your intellectual property, you will be the only user on the cluster during your registration period.  The registration cluster is freshly built for your application registration run and after you have completed your application registration runs, the hard drives will be reformatted.

Per the Intel® Cluster Ready Program Agreement signed between Intel and your company, Intel will retain a backup of your home directory prior to the reformat.  This backup may be used by Intel to test other Intel® Cluster Ready solutions with your application and workload(s).

Q: I try to connect, but the connection is never established.  What's going on?
A: The Intel firewall is configured to allow only specific IP addresses to connect to the registration cluster.  You must provide Intel with the IP address(es) you will use to connect to the registration cluster at least 5 days prior to your registration period.

If you have already provided your IP(s) and the connection is still failing, please contact Intel.

Q: What SSH authentication method should I use?
A: The registration cluster is configured to accept the “public key” and “password” authentication methods.

Q: Can I connect to the registration cluster using something other than SSH?
A: The short answer is no.  Only SSH connections are allowed through the Intel firewall to the registration cluster.

However, insecure TCP connections can be tunneled through the secure SSH connection.  For more information on SSH tunneling, which is also known as port forwarding, consult your SSH client documentation or see: http://www.ssh.com/support/documentation/online/ssh/winhelp/32/Tunneling_Explained.html
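
As an illustration of port forwarding, the sketch below builds a local-forwarding command line. It assumes an OpenSSH-style client (the '-N' and '-L' options); other clients use different syntax. The function name, ports, and login are hypothetical placeholders, and the function only prints the command so it can be reviewed before being run.

```shell
# Print an OpenSSH command that forwards a local TCP port through the SSH
# connection to a host and port reachable from the remote side.
# Arguments: local port, target host, target port, SSH login (user@host).
build_tunnel_cmd() {
    local_port=$1
    target_host=$2
    target_port=$3
    login=$4
    # -N: do not run a remote command, only forward ports.
    # -L: listen on local_port and forward connections to target_host:target_port.
    echo "ssh -N -L ${local_port}:${target_host}:${target_port} ${login}"
}
```

For example, 'build_tunnel_cmd 8080 localhost 80 user@registration-host' prints a command that, once run, makes port 80 on the remote host reachable as localhost:8080 on your own machine.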

Q: How can I customize my account environment?
A: By default, your account is set up to use the bash shell.  You may change your shell using the command 'chsh'.

Also by default, the environment of the Intel software tools is not set up.  For instance, you may need to add 'source /opt/intel/mpi-rt/3.0/bin64/mpivars.sh' to your .bashrc to use the Intel® MPI Library Runtime Environment.
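
A slightly more defensive variant for .bashrc is sketched below. It sources the script only if it is readable, so the shell still starts cleanly when the path differs. The function name is a hypothetical helper; the default path is the one quoted above.

```shell
# Load the Intel MPI runtime environment if its setup script is present.
# An alternative script path may be passed as the first argument.
load_intel_mpi_env() {
    mpivars=${1:-/opt/intel/mpi-rt/3.0/bin64/mpivars.sh}
    if [ -r "$mpivars" ]; then
        # Source the script so its variables apply to the current shell.
        . "$mpivars"
    else
        echo "Intel MPI environment script not found: $mpivars" >&2
        return 1
    fi
}
```

Placing the function in .bashrc followed by a call to 'load_intel_mpi_env' has the same effect as the plain 'source' line when the script exists.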

Q: What are the node names?
A: The names of the cluster nodes are returned by the command 'dbreport nodes'.  The nodes are also listed in the file /etc/hosts.
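
If only the host names are needed, for example to build an mpd.hosts file, they can be extracted from /etc/hosts. The following is a minimal sketch, assuming the usual 'address name [aliases]' layout. The function name is hypothetical, and the result may include hosts that are not cluster nodes, so compare it with the output of 'dbreport nodes'.

```shell
# Print the first host name of every non-loopback entry in a hosts file.
# Defaults to /etc/hosts; another file may be passed as the first argument.
list_hosts() {
    hosts_file=${1:-/etc/hosts}
    # Strip comments, then skip blank lines and loopback entries
    # (127.x.x.x and ::1) and print the first name on each remaining line.
    awk '{ sub(/#.*/, "") } NF >= 2 && $1 !~ /^(127\.|::1)/ { print $2 }' "$hosts_file"
}
```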

Q: Why won't my Intel® MPI Library mpds start?
A: Use the '-r ssh' command line option to launch your mpds using SSH, e.g., 'mpdboot... -r ssh ...'

Note that if you start your mpds on the head node, the head node is automatically included even if the head node is not explicitly listed in your mpd.hosts file.  Run the command 'mpdtrace' to examine the list of hosts included in your mpd ring.  You may need to start your mpds on one of the compute nodes to get your intended set of nodes.

Consult the Intel® MPI Library documentation for additional information.
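
The steps above can be combined into a small helper, sketched here under the assumption that 'mpdboot' accepts '-n' (total number of mpds, including the one on the local node) and '-f' (hosts file) in addition to '-r'. The function name is hypothetical; verify the options against your Intel® MPI Library documentation.

```shell
# Start an mpd ring over SSH and list the hosts that joined it.
# Arguments: total number of mpds to start (including the local node),
# and optionally the hosts file (defaults to mpd.hosts).
start_mpd_ring() {
    count=$1
    hosts_file=${2:-mpd.hosts}
    # -r ssh launches the remote mpds through SSH instead of rsh.
    # mpdtrace runs only if mpdboot succeeded and prints the ring members.
    mpdboot -n "$count" -f "$hosts_file" -r ssh && mpdtrace
}
```

Because the node on which the command runs always joins the ring, running 'start_mpd_ring 4' on a compute node listed in mpd.hosts is one way to get a ring of exactly four compute nodes.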

Q: How can I prevent my job from aborting because the connection to the registration cluster times out?
A: Use the screen(1) command.  The following is a short, introductory primer on how to use screen(1).  Consult its man page for additional information.
  1. Start screen with the command 'screen'.  A new shell is started.
  2. Start your job in the new shell, e.g., run 'top'.
  3. Detach from the screen by typing Control-a, then d.   You will be returned to your original shell and the message '[detached]' will be printed to the terminal.  Your job will continue to run in the detached screen even if you log out from the original shell.
  4. Re-attach to the screen with the command 'screen -r'.

You may repeatedly attach and detach from the screen using steps 3 and 4.
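
For long registration runs, the job can also be started directly in a detached, named session. This is a minimal sketch using the '-d -m' and '-S' options of screen(1). The function name, session name, and job script in the usage note are hypothetical placeholders.

```shell
# Start a command in a detached screen session with a given name.
# Arguments: session name, then the command and its arguments.
run_detached() {
    session=$1
    shift
    # -d -m: create the session without attaching to it.
    # -S: name the session so it can be re-attached by name later.
    screen -d -m -S "$session" "$@"
}
```

For example, 'run_detached regjob ./run_job.sh' starts the job, 'screen -ls' lists the running sessions, and 'screen -r regjob' re-attaches to it.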

Q: My application run is failing because the registration cluster is missing software that we require.  What should I do?
A: Please contact Intel and inform us that you need an additional package installed on the cluster.  If this package is not guaranteed to be present by the Intel® Cluster Ready Architecture and Specification, you will need to document this dependency as part of the registration process; see the application registration procedure for more information.

If you are aware of dependencies (hardware and/or software) that may not be present on the registration cluster, please inform Intel of this prior to your registration period.  Intel will install the necessary software packages and/or reconfigure the hardware to meet your needs.  Please consider that special configurations may not be immediately available.

4. Intel® Cluster Ready Certification of Reference Clusters

Q: What is the process to receive Intel® Cluster Ready certification for a cluster?
A: The vendor is required to physically build a reference cluster of at least 4 nodes, with a complete software stack installed that meets the Intel® Cluster Ready specification. The vendor then completes a certification form and submits it together with the output files of a successful certification run of the Intel® Cluster Checker test suite software.

Q: If a 5-node reference cluster is certified as Intel® Cluster Ready, how does this scale?
A: Clusters duplicated from the reference cluster that successfully pass the Intel® Cluster Checker are also considered certified, provided they are materially identical. The certification is transferable up to a cluster with “(N-1) x 2” nodes, where N is the number of nodes in the reference cluster; that is 8 nodes in this example.
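
The scaling rule can be worked through as simple arithmetic. The sketch below assumes N is the node count of the certified reference cluster, which matches the 5-node example. The function name is hypothetical.

```shell
# Largest cluster size covered by a certification, given the number of
# nodes in the certified reference cluster: (N - 1) x 2.
max_certified_nodes() {
    n=$1
    echo $(( (n - 1) * 2 ))
}
```

For the 5-node reference cluster, 'max_certified_nodes 5' prints 8; for the minimum 4-node reference cluster it prints 6.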

Q: If a reference cluster was certified with dual-core Xeon processors, does this automatically cover quad-core processors?
A: No. Different processors, or any other substantial change to the cluster system configuration, are considered a different target platform and therefore need to be covered by a separate recipe and a separate certification.

Q: How should system vendors manage updates and upgrades? Should the certification procedure be re-run at each patch or update? What happens for customers with an earlier version of the software stack already installed, since the patch may correct critical bugs?
A: The entire hardware and software stack (i.e., the cluster recipe) is certified. If a piece is changed, updated, or replaced in a way that falls outside the exceptions in the certification procedure, the resulting modified stack must also be certified. The second certificate does not invalidate the first, so customers using the first stack still have Intel® Cluster Ready certified systems.


* Other names and brands may be claimed as the property of others
