r36 - 17 Jun 2009 - 12:56:18 - EricT
AppLogic 2.1/2.2 Documentation. The latest production release is AppLogic 2.9.9.

AppLogic Release Notes


Version 2.1.0 - September 7, 2007

These are the release notes for AppLogic 2.1.0. Other versions.

Note: This release is the official AppLogic 2.1 production release; it is suitable for all customer deployments.

Overview

AppLogic is the first grid operating system that runs and scales existing web applications. It converts a set of commodity servers into a scalable grid that's easy to manage. With AppLogic, you can:

  • Deploy existing web applications on a grid without changing any code
  • Scale each application from a fraction of a server up to the whole grid
  • Manage whole racks of servers more easily than a single server today
  • Handle hardware failures automatically without losing data
  • Add or remove servers and storage without disrupting applications
  • Manage all applications, servers and storage with just a browser

AppLogic does not require a SAN or other expensive hardware, and is open and vendor-neutral. It supports Linux and all popular open-source middleware including Apache, MySQL, JBoss and Ruby on Rails, so there is no learning curve.

What's so special about AppLogic?

AppLogic does for web applications what Apache does for web content. By separating the application from the datacenter infrastructure required to run it, AppLogic makes it possible to:

  • Deploy the same web application on different grids
  • Run multiple different web applications on the same server
  • Scale a web application to multiple servers, up to the whole grid
  • Manage applications and hardware easily

Just as web servers made it possible to run web sites without owning a datacenter and servers, AppLogic makes it possible to run and scale web applications without the enormous expense of owning and operating scalable IT infrastructure. With AppLogic, you can host any web application on commodity servers rented on a month-to-month basis from your favorite hosting provider.

What is new in AppLogic 2.1 for AppLogic 1.2.x users

The AppLogic 2.1 release includes the following key features that will be new to AppLogic 1.2.14 users:

User Features
  • SMP Support: AppLogic now supports SMP appliances with up to 4 CPUs/appliance
  • Large grid size: Supported grid size is increased to 128 servers or 1024 CPU cores
  • Web Shell: The AppLogic GUI now contains a web shell used for accessing the grid controller's AppLogic shell and for accessing individual appliances (eliminates the need for ssh client and key setup)
    • Grid controller access: the main dashboard page contains a new "Shell login" button used for logging into the grid controller (also accessible through the application editor)
    • Appliance access: right click on an appliance and click on the new "Login" menu item; this will log the user into the running appliance (bash shell)
    • Support for clipboard copy/paste, scrollback and shell window resizing -- all from the web browser
  • Application Monitoring: Provides visibility into the operation and performance of AppLogic applications. Every appliance within an application can be monitored using the new MON appliance: insert MON into your application, connect the appliances to MON and start your application. The new monitoring GUI, opened by clicking the monitor button in the main GUI, displays visual graphs of the values of various counters within the appliances. The following counters are examples of what can be monitored for each appliance; see CatMonitoringMon for the full list of supported counters and AdvCustomCounters for how to define your own custom counters:
    • Key statistics: CPU, memory, scheduler; total/used CPU, total/used/free memory, number of running processes
    • Network statistics: Packets sent and received, total bytes sent and received
    • Virtual volume statistics: total bytes read/written, completed read/write requests
    • Appliance specific statistics:
      • WEB: number of active requests, number of hits per second
      • MYSQL: number of queries, number of connections
  • System catalog update: The system catalog now includes the following new/updated appliances:
    • NEW INSSL: HTTP gateway with SSL support providing a firewalled entry point for network traffic into an AppLogic application
    • NEW WEBx4/WEBx8: Scalable web server appliances for heavy traffic loads
    • NEW MON: Monitor appliance that collects counters from appliances within an application and displays them using visual graphs integrated into the AppLogic GUI
    • NEW LUX5: Generic Linux appliance based on CentOS 5
    • NEW LINUX5: Generic Linux server based on CentOS 5
    • NEW WEB5: Web server based on CentOS 5
    • UPDATED WEB: Added a new net output terminal used for subnet access and added real IP logging. WEB can also be configured with a user-defined script that is executed automatically after WEB has booted.
    • UPDATED NAS: Added a new nfs input terminal used to access storage using the NFS/3.0 protocol
  • New reference applications: AppLogic contains the following new reference applications, which make it easier to port your existing application onto AppLogic:
    • Lamp: Reference of a basic 2-tier non-scalable LAMP web application, includes the following appliances:
      • WEB for serving web content
      • NET for subnet access from within WEB
      • NAS for web content storage
      • MYSQL for database storage
      • INSSL to provide firewalled access to the web application
      • MON for monitoring the application
    • LampX4: A scalable LAMP application that uses 4 WEB servers with a load balancer (HLB) for heavy traffic load. The app can be easily modified to use up to 8 or more WEB servers. See the instructions for a 5-minute install of WordPress on a scalable LampX4 cluster (with small modifications, works for any PHP/LAMP app).
    • LampCluster: A scalable LAMP cluster providing easy modification of the web and database server infrastructure, as well as direct ssh support to appliances.
  • Support for default resources: All appliances and applications now have a setting for default resource values (in place of the sandbox resources which have been removed from AppLogic). The default resources are used when an application/appliance is started w/o any specific resource requirements. This allows appliances and applications to be configured easily with adequate resources by default, while maintaining low minimums.
  • Single-hop migration of applications: Application migration is now significantly faster as applications are migrated directly to remote grids (w/o having to use the impex volume for compression and transfer). A no-compress option has been added for even faster migrations over fast network connections.
  • Volume prefill: When a volume is created, a new --prefill option can be specified which causes the allocation of all blocks of the volume (using this new option will significantly increase the volume creation time). Prefilling a volume results in a non-sparse volume where all blocks in the volume are allocated ensuring higher read and write performance from the volume.

Maintainer Features
  • Improved hardware support: the AppLogic kernel now supports all devices supported by CentOS 4.4, providing support for more recent hardware. SMP and large memory support features include:
    • Dual CPUs, dual and quad cores per CPU
    • Large physical memory support (>2GB, 8GB has been tested)
    • SMP Linux Appliances: Appliances can have more than 1 CPU and >2GB memory enabling them to take advantage of higher-end hardware
    • Multiple hard disks per server, increasing the maximum size of volumes
  • Grid Branding: The grid maintainer can now customize various aspects of the AppLogic GUI:
    • defining a logo that is displayed on the grid dashboard
    • including a link to hosting provider Terms of Service that are accepted by the user during the license click-through (optional)
    • replacement of various links available from the grid dashboard: terms of service and AppLogic help
    • support page customizations: support icons and links
  • Grid message notifications: The grid maintainer can now designate one or more e-mail addresses to receive alert or summary notifications for important grid events
  • Server re-imaging and partitioning: maintainers no longer need to install special OS distro and configuration on the servers prior to installing an AppLogic grid. The AppLogic installer can now re-image a server with the proper base OS and partition schema as expected by AppLogic.
  • New rollback command: allows the maintainer to roll back an upgrade to the previous AppLogic version with minimal downtime (takes just a few minutes)
  • Improved upgrade to minimize downtime: upgrade is now split into 2 distinct stages. The first stage readies your grid for the upgrade by importing all of the new software onto your grid (during this stage your grid is fully operational). The second stage performs the actual upgrade and is designed to minimize the downtime of your grid (typically just a few minutes).
  • 128 servers per grid: an AppLogic grid can now contain up to 128 servers or 1024 CPU cores. (Certification tests for this release were performed with 62 servers.)
  • Logging of all executed AppLogic commands: All AppLogic commands that are executed through the command shell are now logged on the AppLogic controller. Entries include the name of the user issuing the command, time and the command itself. Only grid maintainers have access to this log.

See the full set of new features since AppLogic version 1.2.14 at ReleaseNotes-2-Features.

What is new in AppLogic 2.1 for AppLogic 2.0 beta users

To all our users and partners who participated in the beta program, thank you for helping us improve AppLogic. Your input was invaluable!

User Features
  • Large grid size: Supported grid size is increased to 128 servers or 1024 CPU cores
  • Support for default resources: All appliances and applications now have a setting for default resource values (in place of the sandbox resources which have been removed from AppLogic). The default resources are used when an application/appliance is started w/o any specific resource requirements. This allows appliances and applications to be configured easily with adequate resources by default, while maintaining low minimums.
  • Single-hop migration of applications: Application migration is now significantly faster as applications are migrated directly to remote grids (w/o having to use the impex volume for compression and transfer). A no-compress option has been added for even faster migrations over fast network connections.
  • 3 new prototype appliances based on CentOS 5: The following 3 new appliances are provided in the /proto catalog:
    • LUX5: Generic Linux appliance based on CentOS 5
    • LINUX5: Generic Linux server based on CentOS 5
    • WEB5: Web server based on CentOS 5
  • WEB enhanced to execute a user-defined script on boot: The WEB catalog appliance can be configured with a user-defined script that will be automatically executed after WEB has booted. This allows the user to execute their own script at WEB appliance startup and also to create simple application servers without branching the web server.
  • Improved web shell: Added scrollback, clipboard copy and paste, as well as improved window resize
  • Volume prefill: When a volume is created, a new --prefill option can be specified which causes the allocation of all blocks of the volume (using this new option will significantly increase the volume creation time). Prefilling a volume results in a non-sparse volume where all blocks in the volume are allocated ensuring higher read and write performance from the volume.
  • Light control for application monitoring: Users can now control the lighting of their graphs by choosing between a white and a black background. The colors in the graphs are automatically adjusted according to the selected lighting. The dark background can be used in a NOC-type setup to reduce the intensity of the screens and increase the contrast of the graphs.
  • Global pace control in application monitoring: Users can now change the pace of all graphs from a single setting.

Maintainer Features
  • Improved hardware support: the AppLogic kernel now supports all devices supported by CentOS 4.4, providing support for more recent hardware.
  • Server re-imaging and partitioning: maintainers no longer need to install special OS distro and configuration on the servers prior to installing an AppLogic grid. The AppLogic installer can now re-image a server with the proper base OS and partition schema as expected by AppLogic.
  • New rollback command: allows the maintainer to roll back an upgrade to the previous AppLogic version with minimal downtime (takes just a few minutes)
  • Improved upgrade to minimize downtime: upgrade is now split into 2 distinct stages. The first stage readies your grid for the upgrade by importing all of the new software onto your grid (during this stage your grid is fully operational). The second stage performs the actual upgrade and is designed to minimize the downtime of your grid (typically just a few minutes).
  • 128 servers per grid: an AppLogic grid can now contain up to 128 servers or 1024 CPU cores. (Certification tests for this release were performed with 62 servers.)
  • Logging of all executed AppLogic commands: All AppLogic commands that are executed through the command shell are now logged on the AppLogic controller. Entries include the name of the user issuing the command, time and the command itself. Only grid maintainers have access to this log.

Important Bug Fixes in AppLogic 2.1.0

The following defects reported by 1.2.x and 2.0 beta users have been resolved in 2.1:

  • SCR 1631: Product: During heavy I/O on a grid, grid lost connection to a server (usually when repairing large volumes)
  • SCR 1878: Product: Volume I/O can fail if one of the volume's mirrors is on a server that is rebooted
  • SCR 1910: Product: It is possible for a logged in user to elevate his rights to maintainer (security issue)
  • SCR 1912: Product: If app is stopped while app volume is being repaired, user cannot restart the app (volume is left mounted)
  • SCR 1687: Product: Occasionally appliances fail to boot within the boot timeout period

Hotfixes for AppLogic 2.1.0

This section describes all of the available hotfixes for the AppLogic 2.1.0 release. Make sure that your AppLogic 2.1.0 grid is updated with the mandatory hotfixes to ensure a properly working AppLogic grid.

Mandatory Hotfixes

  • e2064 - Install this first! Support for multi-version hotfixes. Required to install any hotfix
  • Hf2192 - Eliminates a vulnerability in AppLogic that allows properly authenticated grid users to obtain grid internal data
  • Hf1960 - Fixes a possible server crash during heavy broadcast traffic
  • Hf2061 - Fixes problems with building and starting large-size applications (16+ appliances)
  • Hf2069 - Fixes server overload during appliance startup

Optional Hotfixes

  • e1688 - Enables installing AppLogic on servers with an MPT SAS host bus adapter
  • Hf1936 - Resolves an issue with application migration not working for regular users and maintainers
  • Hf1978 - Enables installing AppLogic 2.1.0 on HP Blade servers
  • Hf2057 - Fixes problems with starting multiple appliances simultaneously on the same server
  • Hf2195 - Resolves issue with IN, OUT, and NET occasionally failing to start due to low memory conditions
  • Hf2200 - Workaround for failure to mount large volumes (750GB+)

Warning: Hotfix Hf1479 has been recalled; Hf1936 should be used in place of Hf1479 (Hf1936 cannot be applied on a 2.1.0 grid that already contains Hf1479). If you already have Hf1479 installed on your 2.1.0 grid, please contact 3tera's Technical Support for assistance in removing Hf1479 from your grid before applying Hf1936.

Warning: Hotfix Hf2197 has been recalled. Please do not install Hf2197 on your grid. If you have already installed Hf2197 on your grid, please contact 3tera support.
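
The rules in this section (install e2064 first, skip the recalled hotfixes, make sure every mandatory hotfix is present) can be checked mechanically before touching a grid. The following is a minimal sketch, not a 3tera tool. It assumes that hotfix archives follow the applogic-x.y.z-hfXXXX.tar.bz2 / applogic-x.y.z-eXXXX.tar.bz2 naming pattern described in the installation notes; the helper names parse_hotfix and plan_install and the example file names are hypothetical. Apart from putting e2064 first, the release notes do not prescribe an order, so the sketch keeps the order it is given.

```python
import re

# Assumed naming pattern: applogic-x.y.z-hfXXXX.tar.bz2, where "hf" may be
# "e" for a product enhancement and XXXX is the SCR number.
HOTFIX_RE = re.compile(
    r"^applogic-(?P<version>[0-9a-z.]+)-(?P<kind>hf|e)(?P<scr>\d+)\.tar\.bz2$",
    re.IGNORECASE,
)

INSTALL_FIRST = 2064                          # e2064: required to install any hotfix
MANDATORY = {2064, 2192, 1960, 2061, 2069}    # mandatory hotfixes for 2.1.0
RECALLED = {1479, 2197}                       # recalled hotfixes: never install


def parse_hotfix(filename):
    """Split a hotfix archive name into version, kind ("hf" or "e") and SCR number."""
    match = HOTFIX_RE.match(filename)
    if match is None:
        raise ValueError(f"not a hotfix archive name: {filename!r}")
    return {
        "version": match["version"],
        "kind": match["kind"].lower(),
        "scr": int(match["scr"]),
    }


def plan_install(filenames):
    """Return (install_order, missing_mandatory_scr_numbers).

    Recalled hotfixes are dropped, e2064 is moved to the front, and all
    other archives keep the order in which they were passed in.
    """
    parsed = [(parse_hotfix(name), name) for name in filenames]
    usable = [(info, name) for info, name in parsed if info["scr"] not in RECALLED]
    # list.sort is stable: only e2064 (key False) moves ahead of the rest.
    usable.sort(key=lambda item: item[0]["scr"] != INSTALL_FIRST)
    present = {info["scr"] for info, _ in usable}
    missing = sorted(MANDATORY - present)
    return [name for _, name in usable], missing


if __name__ == "__main__":
    # Hypothetical file names that follow the assumed pattern.
    order, missing = plan_install([
        "applogic-2.1.0-hf1960.tar.bz2",
        "applogic-2.1.0-hf2197.tar.bz2",
        "applogic-2.1.0-e2064.tar.bz2",
    ])
    print("install order:", order)
    print("mandatory hotfixes still missing (SCR numbers):", missing)
```

The sketch only plans; it does not install anything and does not check that the version in the archive name matches the grid version.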

Installation/Upgrade/Migration notes and issues for AppLogic 2.1

This section describes various issues/how-tos related to AppLogic 2.1 installation and upgrades. Carefully read and follow the instructions for each issue to ensure a properly working AppLogic 2.1 grid.

  • AppLogic Hotfixes: Be sure to install any hotfixes that come with an AppLogic release. The hotfixes are stored in the AppLogic distro that is downloaded from the 3tera download server. The hotfixes are usually named applogic-x.y.z-hfXXXX.tar.bz2; where x.y.z is the AppLogic version and hfXXXX is the SCR number in 3tera's bug database which corresponds to the hotfix (hfXXXX may also be eXXXX for a product enhancement). Some hotfixes are optional - be sure to read the hotfix documentation for hotfix installation instructions and important notes (see the previous section about hotfixes).

  • Installing a new 2.1 grid: Starting from AppLogic 2.1.0, AppLogic requires all servers to use a special partitioning schema that makes grid upgrades safer and faster. The AppLogic installer (ALD) contains a new ald-reimage script that automatically installs the OS and creates the necessary partitions on the specified servers. This makes it easier for maintainers to install new AppLogic grids and ensures that all servers contain the same software and configuration. Please be sure to read the latest ALD documentation before installing your new grid.

  • Upgrading from AppLogic 2.0.2/2.0.5 beta: AppLogic 2.1 supports upgrades from either 2.0.2 or 2.0.5 beta grids as long as those grids use the AppLogic partitioning schema. A grid is using the AppLogic partitioning schema if its servers have one or two /mnt/aplbootX partitions (log into one of the servers and execute "df" to view the server's partitions). Before upgrading your grid, be sure to install hotfix 1687 (applogic-2.0.x-hf1687.tar.bz2). This hotfix updates the domU kernels in all appliances to fix a bug that was causing the appliances to intermittently fail to boot. After applying the hotfix, upgrade your grid by simply executing the following command from your AppLogic 2.1 distro: ./aldo upgrade grid=mygrid. If your AppLogic 2.0.2/2.0.5 grid does not use the AppLogic partitioning schema, you must install a new AppLogic 2.1 grid on separate servers and migrate your applications, catalogs and users over to the new 2.1 grid.

  • Upgrading from AppLogic 1.2.x: AppLogic 1.2.x grids cannot be upgraded to AppLogic 2.1. A new AppLogic 2.1 grid must be installed on separate servers and the 1.2.x applications, catalogs and users must be migrated over to the new AppLogic 2.1 grid. See the next note on how to migrate your applications to your new AppLogic 2.1 grid. Also, after uninstalling an AppLogic 1.2.x grid, reboot its servers before using them for an AppLogic 2.1 installation.

  • Migrating applications from an AppLogic 1.2.14/2.0.2/2.0.5 grid to an AppLogic 2.1 grid: In order to migrate applications from an AppLogic 1.2.14/2.0.2/2.0.5 grid to an AppLogic 2.1.0 grid, follow the steps below:
    • First, make sure you create the users on your new 2.1 grid (maintainers and regular users). AppLogic currently does not support the migration of users.
    • Install hotfix e1720 on the 1.2.14/2.0.2/2.0.5 grid (applogic-x.y.z-e1720.tar.bz2). This hotfix enables faster application migration as applications are migrated directly to remote grids (w/o having to use the impex volume for compression and transfer). To install this hotfix, run the following command from your AppLogic 1.2.14/2.0.2/2.0.5 distribution server: ./aldo-fix grid=mygrid applogic-x.y.z-e1720.tar.bz2 (the applogic-2.0.2-e1720.tar.bz2 hotfix can be safely applied to both 2.0.2 and 2.0.5 grids). This hotfix does not require a grid reboot. Download the hotfix from the 3tera download server using the command that matches the AppLogic version you are migrating from:
      • 1.2.14: rsync -v --progress applogic@download.3tera.net:/home/applogic/1.2.14c/* /root/applogic-1.2.14c/
      • 2.0.2/2.0.5/2.0.5a: rsync -v --progress beta@download.3tera.net:/home/beta/2.0.x/* /root/applogic-2.0.x/
      • (/root/applogic-x.y.z/ in this case contains the AppLogic x.y.z distribution as downloaded from 3tera, be sure to use the correct AppLogic version)
    • Install hotfix Hf1936 on your AppLogic 2.1.0 grid. This hotfix resolves an issue with application migration not working for regular users and maintainers. This hotfix does not require a grid reboot. The download and installation of this hotfix is similar to the hotfix in the previous step.
    • After both hotfixes above are installed, ensure that all custom appliances contained on your 1.2.14/2.0.2/2.0.5 grid are migrated over to your new AppLogic 2.1.0 grid. If you don't have any custom appliances, this step can be skipped. Note that there is no migrate command for appliances. Use the following steps to move these custom appliances over to your new grid:
      • Export all custom appliances using the class export command: class export myclass mydir. Each custom appliance is exported to the system impex volume into the specified folder (/vol/_impex/mydir). Repeat the class export command for each custom appliance on the grid. Note that in case you have custom appliances in your own custom catalog, you can export the entire catalog using a single AppLogic command: class export mycatalog mydir.
      • Copy the custom appliances over to your new 2.1.0 grid using the Linux rsync command: rsync -r -v --progress /vol/_impex/mydir 2.1.0_controller_ip:/vol/_impex/. Repeat the rsync command for each exported custom appliance or copy the entire /vol/_impex folder.
      • Import all custom appliances using the class import command: class import myclass mydir. Note that in case you have custom appliances in your own custom catalog, you can import the entire catalog using a single AppLogic command: class import mycatalog mydir.
    • It is now time to migrate the applications from your 1.2.14/2.0.2/2.0.5 grid to your new 2.1.0 grid. Migrate the applications using the app migrate command executed from the 2.1.0 grid. The general syntax of the command is app migrate source-grid app-to-migrate. See the AppLogic online documentation or web shell help for more information.
    • If migrating from an AppLogic 1.2.x grid (if not, skip this step): AppLogic 2.1.0 uses a newer Linux kernel, version 2.6.16.33 (same as AppLogic 2.0.2/2.0.5). When migrating/importing applications from AppLogic 1.2.x to AppLogic 2.1, all branched appliances must be updated to use this new kernel (with related modules) as well as newer versions of VMA/VME (AppLogic's virtual machine agent and event generator). The AppLogic 2.1 release provides a script that automates the process of updating all branched appliances on a grid with the latest Linux kernel, modules and AppLogic binaries. In addition, the script updates your applications to use the updated WEB and NAS appliances. Please follow the steps below in order to update your grid after migrating/importing all of your applications/catalogs/appliances:
      • From within your AppLogic 2.1 distro, execute the following:
        • ./upgrade_apps.sh controller-IP vmlinuz-2.6.16.33-xenU [--force]
        • Be sure to specify the correct controller IP address
      • By default, the script will prompt you to confirm all changes to the appliances on your grid. To suppress the prompting, use the --force option.
      • After the script has completed, your new grid is ready for use.
      • The upgrade script may take a few minutes to complete.
    • At this point, all appliances and applications should be migrated to the new grid.
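
The migration steps above can be turned into a checklist of commands. The following is a minimal sketch that only builds and prints the command strings quoted in these notes (class export, rsync, class import, app migrate and upgrade_apps.sh); it does not run them. The function name migration_plan, the grid name, the IP address, the class and application names, and the choice of one export directory per class (named after the class) are all illustrative assumptions. The prerequisite steps (creating users, installing e1720 and Hf1936) are not covered.

```python
def migration_plan(source_grid, new_controller_ip, classes, apps,
                   from_1_2_x=False, force=False):
    """Build an ordered list of (where_to_run, command) tuples.

    classes: custom appliance classes to move (may be empty).
    apps:    applications to migrate.
    Assumption: each class is exported into a directory named after it.
    """
    steps = []

    # 1. Export each custom appliance on the old grid.
    for cls in classes:
        steps.append(("old grid shell", f"class export {cls} {cls}"))

    # 2. Copy each exported directory to the impex volume of the new grid.
    for cls in classes:
        steps.append((
            "old grid controller",
            f"rsync -r -v --progress /vol/_impex/{cls} {new_controller_ip}:/vol/_impex/",
        ))

    # 3. Import each custom appliance on the new grid.
    for cls in classes:
        steps.append(("new grid shell", f"class import {cls} {cls}"))

    # 4. Migrate the applications; app migrate is executed from the 2.1.0 grid.
    for app in apps:
        steps.append(("new grid shell", f"app migrate {source_grid} {app}"))

    # 5. Only when coming from 1.2.x: update branched appliances.
    if from_1_2_x:
        command = f"./upgrade_apps.sh {new_controller_ip} vmlinuz-2.6.16.33-xenU"
        if force:
            command += " --force"   # suppresses the confirmation prompts
        steps.append(("AppLogic 2.1 distro directory", command))

    return steps


if __name__ == "__main__":
    # All names and addresses below are made up for illustration.
    plan = migration_plan("oldgrid", "10.0.0.5", ["MYWEB"], ["shop"],
                          from_1_2_x=True)
    for where, command in plan:
        print(f"[{where}] {command}")
```

Printing the plan rather than executing it keeps the sketch safe to run anywhere and leaves each command to be reviewed against the AppLogic documentation before use.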

What's Included

This release of the AppLogic grid operating system includes the following key components.

Distributed Kernel

The AppLogic distributed kernel provides a set of system services required to support the distributed infrastructure and application model of AppLogic. The four most important system services include:

  • Global volume store: a scalable, distributed volume store using the built-in hard disks of the grid servers. The volume store keeps volumes mirrored across two or more servers, ensuring high availability and improved read performance. The hierarchical volume space is structured along applications and catalogs, so volumes become an integral part of those entities.
  • Distributed virtual machine manager: a runtime component that virtualizes the hardware resources used by applications.
  • Logical connection manager: a runtime component that provides the virtual network bindings between components of an application without the need to configure any IP addresses and network settings for distributed applications.
  • Application scheduler: a runtime component that selects and assigns hardware resources to applications, based on available grid resources, application constraints and user-provided configuration.

Grid Dashboard

The grid dashboard provides:

  • At-a-glance summary of the grid state, including grid name, version, state summary, resource use, messages, settings, etc.
  • List of currently installed applications, with the ability to create new apps, copy existing apps, etc.
  • Support page with important links to user documentation, support forums, Grid University, etc.

Application Configurator

The application configurator is a control panel for configuring application parameters: setting their hardware resources, network resources, tuning and other parameters. It is a single property sheet that includes all configurable parameters.

The application configurator can also be accessed through the command line shell or scripts using the app configure command.

Infrastructure Editor

The infrastructure editor is a visual tool that makes it easy to create, assemble and troubleshoot disposable infrastructure for AppLogic applications.

The user interface of the editor is highly interactive and is modeled after popular drawing programs: you assemble infrastructure by dragging components onto the canvas, wiring them together and configuring each component using a property sheet.

For running applications, the editor can be used to open the monitoring dashboard for the application, as well as to start the grid shell for the application or log in to individual appliances.

Command Line Shell

The command line shell gives you control of all aspects of an AppLogic grid. The shell runs on the AppLogic controller and can be accessed either through a browser, using the new web-based shell, or over SSH using any suitable SSH client package.

The shell commands are designed with the following objectives in mind:

  • make the shell easy to use by human users
  • provide simple means for scripting automation

All commands have a "batch" form of their output that makes it easy to parse programmatically, while the command's default output is structured for convenient interactive operation.

Application Infrastructure Build System

The infrastructure build system compiles the application infrastructure, producing a single entity for the application. It verifies resource and configuration constraints for each appliance and for the application as a whole, builds instance images and enforces the integrity of the application infrastructure. The infrastructure linker binds the application instance to the grid hardware resources just in time for application start, producing a ready-to-run application from the portable application format.

The infrastructure build system is automatically invoked when starting applications and is transparent for the grid operator.

Application Monitoring System

The application monitoring system provides a visual interface for monitoring performance and resource usage statistics of running AppLogic applications. The user interface of the Monitor is highly interactive and is accessible with a web browser. See RefMon for details.

System Catalog

The system catalog contains 13 appliance classes, ready to use in applications.

  • WEB: Apache-based web server with plug-in content/scripts volume
  • WEBx4, WEBx8: Scalable web servers
  • MYSQL: MySQL-based database server
  • HLB: Session-aware HTTP load balancer
  • NAS: Network attached storage / file server appliance (HTTP and CIFS file access)
  • INSSL, IN, OUT, NET: Firewalled network gateways based on iptables
  • LUX, LINUX: A tiny and a minimal Linux appliance that can be used as the basis for new appliances
  • MON: Application Monitor used to monitor running applications (collects and displays counters using visual graphs)

The system catalog is a global catalog, containing appliance classes that can be used by all applications on the grid. You can see the full documentation for each appliance in the catalog reference. The system catalog is read-only for AppLogic users and can be changed only by the grid maintainer.

AppLogic also includes an empty global catalog called the user catalog, for your own production-level appliances.

AppLogic also includes a proto global catalog for prototyping new appliances. Each AppLogic release may provide new appliances in this catalog. See catalog reference for list of included appliances and their data sheets.

The user and proto catalogs are freely modifiable by AppLogic users.

Sample Applications

The AppLogic release also includes the following 6 sample applications:

  • TWiki: web-based collaboration platform
  • SugarCRM: customer relationship management system
  • GSC: grid server
  • NEW Lamp: basic 2-tier non-scalable WEB application
  • NEW LampX4: scalable Lamp
  • NEW LampCluster: scalable Lamp cluster
    • This application is not installed by default on grid install/upgrade. It is included in the release image and can be installed manually.

The applications are ready to run, requiring only network settings to be configured. You can find details on each application in the Sample Applications reference.

Note: The cPanel reference application is no longer distributed with AppLogic.

AppLogic Grid Distribution System

The AppLogic Grid Distribution System (aldo) installs and configures the AppLogic grid operating system, the appliance catalogs and sample applications.

Aldo easily installs multiple grids from a single distribution server. It can also add and remove servers from existing grids, change grid configuration and upgrade AppLogic. In addition, aldo can "clean" servers, removing AppLogic and any user data stored on AppLogic volumes.

For more information on aldo, see Aldo Reference and Aldo Tutorial. Aldo and its documentation are available only to grid maintainers.

Installation

The following procedures are available only to grid maintainers. Hosted grid customers receive their grids pre-installed and pre-configured.

Pre-requisites

To install AppLogic, you need a set of servers (1-128) connected with a gigabit Ethernet network and a designated distribution server. See HardwareConfig, AldTutorial and RefAldSetup for more information.

Warning: Please read at least AldTutorial before choosing and setting up your servers and resources. Not reading or not following this document will likely result in a trial-and-error process which can be long and expensive. We want to make sure your installation is successful the first time - please contact Technical Support if any of the requirements is unclear.

In addition, you will need an ssh keypair to be used to authenticate the grid maintainer. The public key must also be provided to 3tera, so that you can gain access to the 3tera download server. Please e-mail your public key to 3tera Technical Support.

For more information on ssh keys, please see the man page on ssh-keygen or the Appendix in RefAldSetup.

Downloading the release

As user root from your chosen distribution server, run the following command:

rsync -v --progress applogic@download.3tera.net:/home/applogic/2.1.0/* /root/applogic-2.1.0/

Tip: Make sure you are running ssh-agent with the key that you provided to 3tera for downloads. If you don't have a key or would like to use a different key, please contact Technical Support.

Installing a grid

See AldTutorial for a quick step-by-step guide to installing a grid in its default configuration. See RefAldSetup for details on the various options available when setting up a grid (e.g., installing your logo, setting up defaults for installing multiple grids, etc.).

Product Characteristics

Dimensions

Key System Dimensions

  • 128 servers per grid (certified up to 62 servers)
  • 31 grids per back-end LAN
  • 1024 applications per grid, up to 1024 applications running simultaneously (certified only up to 256)
  • If you need different dimensions, give us a call

Other Dimensional Limits

  • Per application
    • 512 network interfaces per application

  • Per appliance
    • 8GB RAM
    • 4 CPU (400%)
    • 2000 Mbps bandwidth
    • 12 volumes
    • 15 network interfaces/terminals (including external and default interfaces)
    • 1 external network interface
    • 1 default network interface

  • Per server
    • 255 virtual volumes (counting each mirror as a separate virtual volume)
    • 255 shares (counting each mirror as a separate share)
    • 128 mounts (counting each mirror mounted as a separate mount; i.e., 64 if mirroring by two)
    • 40 appliances (AppLogic internal limit)

  • Per virtual volume
    • volumes up to 700GB

  • If you need different dimensions, give us a call
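As an illustration of the per-appliance and per-server limits listed above, the following minimal Python sketch checks a requested appliance configuration against them. The `ApplianceSpec` class, its field names, and the `check_appliance` and `max_mirrored_mounts` helpers are hypothetical and are not part of AppLogic; only the numeric limits come from this document.

```python
from dataclasses import dataclass
from typing import List

# Per-appliance limits taken from the list above.
MAX_RAM_GB = 8
MAX_CPU = 4
MAX_BANDWIDTH_MBPS = 2000
MAX_VOLUMES = 12
MAX_INTERFACES = 15
# Per-server limit on mounts (each mirror counts as a separate mount).
MAX_MOUNTS_PER_SERVER = 128


@dataclass
class ApplianceSpec:
    """Hypothetical description of the resources an appliance requests."""
    ram_gb: float
    cpu: float
    bandwidth_mbps: int
    volumes: int
    interfaces: int


def check_appliance(spec: ApplianceSpec) -> List[str]:
    """Return a list of limit violations (empty if the spec fits)."""
    problems = []
    if spec.ram_gb > MAX_RAM_GB:
        problems.append(f"RAM {spec.ram_gb}GB exceeds {MAX_RAM_GB}GB")
    if spec.cpu > MAX_CPU:
        problems.append(f"CPU {spec.cpu} exceeds {MAX_CPU}")
    if spec.bandwidth_mbps > MAX_BANDWIDTH_MBPS:
        problems.append(f"bandwidth {spec.bandwidth_mbps} Mbps exceeds {MAX_BANDWIDTH_MBPS} Mbps")
    if spec.volumes > MAX_VOLUMES:
        problems.append(f"{spec.volumes} volumes exceeds {MAX_VOLUMES}")
    if spec.interfaces > MAX_INTERFACES:
        problems.append(f"{spec.interfaces} interfaces exceeds {MAX_INTERFACES}")
    return problems


def max_mirrored_mounts(mirrors: int) -> int:
    """Number of mirrored volumes one server can mount, given the mirror count."""
    if mirrors < 1:
        raise ValueError("mirrors must be at least 1")
    return MAX_MOUNTS_PER_SERVER // mirrors


if __name__ == "__main__":
    print(check_appliance(ApplianceSpec(ram_gb=16, cpu=2, bandwidth_mbps=1000, volumes=13, interfaces=4)))
    print(max_mirrored_mounts(2))
```

With mirroring by two, `max_mirrored_mounts(2)` gives the 64 mounts mentioned in the per-server list.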

Hardware Compatibility

  • Single CPU, Multi-CPU (SMP), single, dual and quad core
    • Certified: Pentium P4, Intel Xeon/Woodcrest/Clovertown, AMD Opteron, AMD Athlon64
    • Supported: Intel Pentium P4 or better; AMD Athlon or better
    • Note: Intel hyperthreading is automatically disabled by AppLogic and is not used

  • Minimum 1GB RAM per server (2 or 4 GB recommended, 8GB tested)
  • Maximum 16GB RAM per server

  • 80 GB IDE/SATA HDD (250 GB or larger SATA drive recommended, multiple drives/server supported)
  • HDD controllers
    • Certified:
      • Intel Corp. 82801EB/ER (ICH5/ICH5R)
      • Silicon Integrated Systems [SiS] 5513
      • Advanced Micro Devices [AMD] AMD-8111 IDE
    • Supported: all IDE, SATA and SCSI devices supported by CentOS 4.4 (excl. Adaptec AHA-15xx). Details.

  • Dual Gigabit Ethernet adapter (Intel or Broadcom recommended)
    • Certified:
      • Intel Corp. 82541GI/PI Gigabit Ethernet Controller
      • Broadcom Corporation NetXtreme BCM5704 Gigabit Ethernet
      • Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (as external 10/100 NICs)
    • Supported: all Gigabit Ethernet network adapters supported by CentOS 4.4. Details.

  • Any single non-blocking gigabit Ethernet layer 2 switch (for private network; all ports must be on the same switch, no cascading)

  • If you have different network or storage devices, give us a call

Software Compatibility

  • Server
    • CentOS 4.4, 32-bit

  • Appliances
    • 32-bit Linux OS
    • Kernel 2.6.16-33 with Xen 3.0.4 support (included in AppLogic installation)
    • Tested: Fedora Core 3, CentOS 4.3, CentOS 5
    • Supported: all Linux distros based on recent 2.6 kernel (see Linux distros verified by AppLogic developers and users)

  • Appliance volumes
    • File systems supported: ext2, ext3, reiserfs, fat32
    • Swap volumes are supported and optional for appliances
    • Integration services for other file systems are available

  • If you are interested in other Linux distros, file systems and software infrastructure, give us a call.

Important Notes

  1. Before accessing the AppLogic GUI on a newly installed or upgraded AppLogic grid, the user should clear out the browser's cache. If the browser's cache is not cleared out, the AppLogic GUI may not behave properly.
  2. The grid shell can be accessed either through a web browser or using an ssh client. For increased security, password-based ssh logins are not supported except during grid installation. It is recommended to use the new web shell provided with the AppLogic GUI.
  3. When accessing the grid over ssh, the login user name is always root, regardless of the AppLogic user name. For the purpose of ssh logins, users and their roles are uniquely identified by their public ssh keys.
  4. JavaScript and pop-ups must be enabled in the web browser to use the web-based graphical user interface (dashboard, editor, documentation).
  5. Users are responsible for allocating, assigning and using externally visible IP addresses for applications; AppLogic takes care of all internal network assignments.
  6. While the AppLogic distribution system sets up all grid servers and controllers with carefully pre-configured firewalls and disables unnecessary network services, users and maintainers are encouraged to verify the security settings of their systems.
  7. Network performance between servers on the private network used for volume and inter-appliance communication has been measured at approximately 900 Mbps. TCP network performance between appliances residing on different servers has been measured at 720-900 Mbps.
  8. Limits on appliance hardware resources are enforced differently for each type of resource (CPU, memory, bandwidth). CPU is "no less than", memory is "exactly that much" (includes VM overhead), and bandwidth is enforced only to the degree of not scheduling components that require more bandwidth than is available (at appliance start time). CPU can be enforced as "exactly that much" by using the new --cap_cpu option when starting the application.
  9. When starting an application with a specified amount of minimum CPU, it is not guaranteed that the application will get exactly the amount of specified CPU. For example, if an application is started with cpu=2, it is possible that the application will receive 1.97 CPU as observed by adding up all of the assigned CPU to all components of the application. This is due to rounding errors that may occur while trying to assign CPU to each individual component.
  10. When an application start fails, not all messages related to the failure may be shown in the shell. Inspect the grid log for additional information, using the list log n=20 command.
  11. Grids in which linear scalability of performance is important should be built using servers that are as uniform as possible in CPU type/speed, memory size and disk capacity. AppLogic will work correctly in grids assembled from servers with different amounts of hardware resources; however, on such grids you may experience sub-linear performance.
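The rounding effect described in note 9 can be illustrated with a small Python sketch. The scheduler's actual algorithm is not described in these notes; the sketch simply assumes that CPU is assigned in whole percent of one CPU and that each component's share is rounded down. The `split_cpu` function and the equal weights are hypothetical.

```python
from typing import List


def split_cpu(total_cpu_percent: int, weights: List[int]) -> List[int]:
    """Split a CPU budget (in percent of one CPU) across components by weight.

    Assumption: each share is rounded down to a whole percent, so the
    shares may add up to slightly less than the requested total.
    """
    total_weight = sum(weights)
    return [total_cpu_percent * w // total_weight for w in weights]


if __name__ == "__main__":
    # cpu=2 expressed as 200%, divided among three equally weighted components
    shares = split_cpu(200, [1, 1, 1])
    print(shares, sum(shares))
```

Under these assumptions, three equal components each receive 66%, for a total of 198% (1.98 CPU) rather than the requested 200%.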

Known Problems and Limitations

Limitations

1. Grid size is limited to 128 servers per grid
This is a limitation of the current AppLogic release. This release has been certified up to 62 servers; configurations up to 128 servers are supported.
2. AppLogic's web-based user interface requires Firefox 2 or later, or Microsoft Internet Explorer 6.x or later
JavaScript, pop-ups and cookies should be enabled for the grid controller's host for proper operation of the user interface. Please use the latest available bugfix versions of these browsers, as they correct a number of browser defects that affect AJAX applications. Firefox 1.0 and 1.5 are also supported, with some minor known problems (printing and image caching).
3. The private network Ethernet switch is a single point of failure in the grid
If the Ethernet switch dies or loses power, the grid will stop operating and will need to be restarted after the switch is restored to operation or replaced.
4. Protocols are not enforced on appliance terminals, only endpoints are enforced
This means that an appliance can only talk to appliances connected to it (plus its own server and the grid controller). Nevertheless, protocols on new appliances should be properly specified in order to ensure application design integrity and compatibility with future versions of AppLogic.
5. The total available disk space does not take volume mirroring into account
The total available disk space reported by the grid info command is a raw estimate and does not take volume mirroring into account. The true available disk space is the reported available amount divided by the number of mirrors (2 mirrors by default). For example, if 1000GB of available disk space is reported and the grid is configured with 2 mirrors, the true available disk space is 500GB. Also, to mirror volumes successfully, there must be enough disk space on at least X servers, where X is the number of mirrors. AppLogic will not fail to create a volume if one of its mirrors cannot be created; instead, it displays a warning that the volume could not be mirrored.
6. A server failure during application start may cause the application start to fail
If an application is started and one of the grid's servers fails, the application start will fail if one or more of the application's appliances were scheduled to run on the failed server. If this situation occurs, simply restart the application.
7. Defect SCR 1921 - The AppLogic command line interface does not support decimals for memory
For example, 1.5G must be expressed as 1536M. This affects app start, app config, etc., and will be fixed in a future version of AppLogic.
8. During a grid upgrade, all of the current catalogs on the grid are renamed to aldsave_xxx before the new/updated catalogs are imported
Therefore, after the grid is upgraded, all custom appliances that were present in the proto catalog are now contained in aldsave_proto. These appliances should be moved back to the current proto catalog (after the grid has been upgraded). If the custom appliance is not used by any other application or appliance assembly, it can be moved by right-clicking on the appliance and choosing the move option. Otherwise, the appliance must be dragged into a new application, branched and then moved back into proto. After moving the appliances back into the proto catalog, all applications and appliance assemblies must be updated to use the appliances from proto (currently, your applications reference appliances from aldsave_proto). There is one exception for appliance assemblies. After branching the assembly, make sure all appliances in the assembly reference appliances from proto. An easy way to replace the aldsave_proto appliances with appliances from proto is by using the visual application editor. While holding the shift key, click on the proto appliance and drag it over the aldsave_proto appliance in your application. Drop the appliance and click the OK button and your appliance will be replaced (without having to re-parameterize or reconnect the appliance).
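The arithmetic behind limitations 5 and 7 above can be sketched in Python. Both helper functions are hypothetical conveniences and not part of AppLogic; they only reproduce the two worked examples given in the text (1000GB reported with 2 mirrors is 500GB usable, and 1.5G is written as 1536M).

```python
from decimal import Decimal


def usable_disk_gb(reported_gb: float, mirrors: int = 2) -> float:
    """True available disk space: reported raw space divided by the mirror count."""
    if mirrors < 1:
        raise ValueError("mirrors must be at least 1")
    return reported_gb / mirrors


def gigabytes_to_megabytes_arg(value: str) -> str:
    """Rewrite a memory size such as '1.5G' as a whole number of megabytes ('1536M')."""
    if not value.upper().endswith("G"):
        raise ValueError("expected a value ending in 'G'")
    megabytes = Decimal(value[:-1]) * 1024
    if megabytes != megabytes.to_integral_value():
        raise ValueError("value is not a whole number of megabytes")
    return f"{int(megabytes)}M"


if __name__ == "__main__":
    print(usable_disk_gb(1000, mirrors=2))
    print(gigabytes_to_megabytes_arg("1.5G"))
```

`Decimal` is used instead of `float` so that decimal inputs such as 1.5 are converted exactly.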

Known Problems and Issues

The following are the key known problems in this release:

1. Defect SCR 1471 - GUI times out and logs out the user while there is load on the grid controller
The GUI no longer automatically logs the user out when there is heavy load on the grid controller. Instead, the user will receive a message stating that there was a network error. In this case, however, the GUI is still fully functional. The network error message will only be received when there is heavy load on the controller, such as starting 4 applications at the same time AND copying a large multi-GB volume. In large grids, try assigning up to a full CPU core and 1GB RAM to the controller.

2. Defect SCR 857 - Grid reboot may degrade one or more system volumes
If a grid is rebooted using the grid reboot command, one or more of the system volumes may become degraded when the grid comes back up. If your grid needs to be rebooted, log in as a regular grid user after the grid comes back online, then check and repair the system volumes by executing the commands below. This ensures that the system volumes are in a clean state. This bug will be fixed in a future AppLogic release.
  • check vol # shows whether any volumes need repair; look specifically for the _SYSTEM volumes
  • repair vol _SYSTEM:boot
  • repair vol _SYSTEM:meta
  • repair vol _SYSTEM:impex

3. Defect SCR 1199 - Unable to migrate a volume whose streams are all on disabled servers
When migrating a volume, make sure that at least one of its streams is on an enabled server or else the migration command will fail. The volume can be completely migrated off of its original set of servers by migrating the volume twice.

4. Defect SCR 1233 - Grid automatic recovery (HA) may fail due to servers taking too long to reboot
Some physical servers may take a long time to reboot, which may cause AppLogic's automated grid recovery to fail. As a result, not all applications may be restarted automatically after the grid recovers from a failure. This is because the grid controller waits a maximum of 10 minutes for all servers to reboot and reconnect to the grid controller, which may not be enough time for all servers to reboot. The workaround is to manually restart applications after all servers have reconnected to the grid controller. Execute "list srv" to ensure that all servers are connected to the grid controller; they should all be in the UP state. In AppLogic 2.1, with a server boot timeout of 10 minutes, this may occur primarily if a server fails to boot due to hardware or BIOS malfunction.

5. Defect SCR 1234 - Grid flapping file is not always reset when the operator intentionally reboots the grid
When the operator reboots the grid, the grid flapping state is supposed to be reset and a message should be displayed on the dashboard stating that the operator rebooted the grid intentionally ("Grid has been restarted by operator on ..."). Occasionally when rebooting the grid, the grid flapping file is not reset and the dashboard message is not displayed. The only problem that this may cause is that, upon the next grid failure, the applications may not be automatically restarted (depending on how many times the grid has failed when this bug occurs). To work around this problem, if no dashboard message is displayed after an intentional grid reboot, contact technical support to have the grid flapping state reset.

6. Defect SCR 1219 - 3 to 15 minute system lockout on server failure
In most cases when a server fails, shell commands will hang for 3 minutes but the AppLogic controller will remain operational. After 3 minutes, the grid will return to normal operation. In rare cases where the failed server contains one of the mirrors of the AppLogic controller system volumes (boot, meta or impex) and the server fails to reboot, the user will be locked out of the controller for up to 15 minutes. After 15 minutes, the grid should return to normal operation. This bug will be fixed in a future release.

7. Defect SCR 1360 - Appliance shows slightly less memory and less disk size than allocated
The slightly reduced resources are due to space allocated for service areas. For memory, it is likely due to the memory map table that Xen maintains for a virtual machine. For disk, it is due to normal file system service areas (this is the same as on regular Linux servers).

8. Defect SCR 1896 - After closing the shell during volume resize, could not resize volume w/o maintainer access (general problem with any command)
If the web shell is closed during the execution of an AppLogic command, this may result in one or more volumes being left mounted on the grid controller. To recover from this situation, make sure all volumes that were involved with the particular operation are unmounted (note that this may include catalog volumes). You can use vol list --all to retrieve a list of all mounted volumes in the system.

9. Defect SCR 1936 - Application migration does not work for regular users and maintainers
Regular users and maintainers cannot migrate applications to an AppLogic 2.1.0 grid. Hotfix applogic-2.1.0-hf1936.tar.bz2 is now available that fixes this bug. After applying this hotfix, both regular users and maintainers can migrate applications between different AppLogic 2.1.0 grids. To install this hotfix, simply run the following command from your AppLogic 2.1.0 distro: ./aldo-fix grid=mygrid applogic-2.1.0-hf1936.tar.bz2 (the hotfix is available on 3tera's download server in the 2.1.0 distro). This hotfix does not require a grid reboot.

10. Defect SCR 1975 - Occasionally AppLogic operations involving the impex volume fail (i.e., class import/export)
Occasionally when an AppLogic grid is rebooted, the impex volume is not set up correctly on the grid controller. This causes failures when using AppLogic operations that involve the impex volume, such as mounting volumes, class/catalog/application import/export, etc. To work around this issue, execute the following commands as a grid maintainer (if the workaround does not work on your grid, please contact 3tera support).
  • /etc/init.d/nfs restart
  • /etc/init.d/3trsh-init stop
  • /etc/init.d/3trsh-init start
  • mount /dev/hda3 /vol/_impex
  • chown applogic:applogic /vol/_impex

Unreproducible Problems and Issues

The following problems have been observed in AppLogic 2.1 but are extremely difficult to reproduce (if at all) and have only been observed once or twice. If any of these issues appear on your grid, please send a bug report to 3tera describing which problem occurred and which AppLogic commands were executed that led to the failure.

1. Defect SCR 1631 - During repair of large volumes the grid loses connection to a server
This problem has been solved in the 2.1 release and has not been observed (the dom0 memory on each server has been increased to 384MB to resolve this issue). If this problem appears on your 2.1 grid, please send a bug report to 3Tera.

2. Defect SCR 1574 - During a concurrent build, volume instantiation hung, only recovery is to reboot the grid
While starting two applications at the same time, both of which had to create/destroy volcache volumes concurrently, one of the applications managed to start and the other remained blocked in the volume creation phase. After that the grid was unusable; all volume-related operations blocked and did not complete. Rebooting the grid resolved the issue. This failure has been observed only once, on a development build of AppLogic. If you observe this problem, please send a bug report to 3Tera.

3. Defect SCR 1582 - (maintainers only) After resizing and mounting a volume, bash rmdir on the volume hung, grid needed a reboot to recover
The user mounted a boot volume on the controller and copied a large folder to the volume (remotely using rsync) when there was not enough diskspace on the volume to hold the folder. The copy failed - the boot volume was unmounted, resized and mounted again on the controller. The user then tried to delete a folder on the boot volume (created by the previous copy) using the bash "rm" command and the command hung for 20+ minutes. The user had to reboot the grid in order to recover. This failure has been observed only once, on a development build of AppLogic. If you observe this problem, please send a bug report to 3Tera.

Contact Information

For questions about this release and its operation, please contact Technical Support:


Self-help Resources

These links are also accessible through the Support Tab of your grid dashboard.


3Tera Partner Resources

3Tera partners and direct licensees can also obtain contract-based support and additional information resources.

On-line

Live Support

  • e-mail: support@3tera.com
  • phone: (888) 492-4738
  • fax: (949) 305-0160, ATTN: Technical Support

Support hours are Monday through Friday, 9:00am to 6:00pm Pacific Time (GMT-0800). We may be able to respond outside these hours. Please mark urgent messages as such.

Tip: Before calling the emergency phone support, please e-mail support first; this ensures that all support engineers have access to your information. Keep in mind that the phone support rings several engineers in sequence; don't hang up while it is ringing.

Interactive Sessions

We can set up interactive help sessions using WebEx. To reach our WebEx site, go to http://3tera.webex.com/. You should also receive from us a meeting number to set up a successful session.

We have verified access with the following browsers/OS combinations:

  • Windows XP: MS Internet Explorer, Mozilla Firefox
  • Linux: Mozilla Firefox with Java plug-in

WebEx sessions require Java or ActiveX to work. For more information on system requirements and to test whether your browser can access WebEx, go to http://developers.webex.com/api/jointest/index.php.


-- EricT - 07 Sep 2007
 
Copyright © CA 2005-2010. All Rights Reserved.