CUG Workshop, October 6-9, 1997

The Cray User Group (CUG) cordially invites you to attend a special CUG Workshop in Bloomington, Minnesota, October 6-9, at the Minneapolis Airport Marriott Hotel. This CUG Workshop will feature the Origin2000. The registration fee of $175.00 includes breaks, continental breakfast, lunch, reception and theme banquet.

The Minneapolis Airport Marriott Hotel is located about 15 minutes from the Minneapolis/St. Paul Airport, right next door to the Mall of America. A hotel shuttle bus runs from the airport to the hotel, and shuttle bus service is also available from the hotel to the Mall of America. Dress will be business casual.

A Cray/SGI program committee is planning an agenda based around the hardware, software and tools available for the Origin2000. Migration to the Origin platform will also be addressed. The final agenda can be accessed at http://www.cug.org after September 1.

To attend the CUG Workshop or to see a preliminary agenda, please contact Linda Yetzer. The cutoff date to register is September 21, 1997, and the cutoff for hotel reservations is September 15, 1997.

Linda Yetzer

Sincerely,
Rene' G. Copeland
Cray Research VP Customer Relations

CUG Workshop
Updated: September 4, 1997

FINAL PROGRAM
==============================================================================

10/6 Evening: Cray Reception, 7:00-9:00

October 7: DAY 1: HOST: Rene Copeland
=================

AM:

8:00-8:15 Comments from Gary Jensen, CUG Board President

8:15-9:00 Opening Address

9:00-10:00 Comments from NCSA: Larry Smarr

10:00-10:30 BREAK

10:30-11:15 Service model for Cray Origin customers: Tom Boyle, and others.
Abstract: This session will provide customers with an update on SGI/Cray's current efforts and plans for service and support of the Cray Origin 2000, and will include an update from the customer perspective on early experience with Cray Origin 2000 support. The service update will cover various aspects of SGI Customer Service, including service offerings, delivery and installation planning and support, call center support, escalation, problem tracking and service information delivery.

11:15-12:00 Irix plans and status: Gabriel Broner.

12:00-1:00 LUNCH

PM:

1:00-2:00 DMF, TMF, NQE status and plans: Neil Bannister.

2:00-2:30 Resource Management on Origin systems: Diane Wengelski and Paul Mielke
Abstract: This session will briefly describe the direction that we are taking with Resource Management in Irix. Diane will discuss the traditional Cray features that Irix is being extended to support, and Paul will give a brief overview of the Miser scheduling environment available on Irix platforms.
* Interested customers are encouraged to attend the Resource Management BoF. Diane and Paul, along with several of their developers, will describe design plans in more detail and welcome customer feedback on the directions that we are taking with Irix Resource Management.

2:30-3:00 Resource Management on the O'2000 Using LSF: Thomas Klingner, LANL
Abstract: Load Sharing Facility (LSF) is a distributed set of system daemons/utilities for controlling the workload on a cluster of machines.
It is a commercial software product of Platform Computing. Although designed primarily for load leveling across clusters of mixed vendor workstations, it has facilities for SMPs and is implemented for SGI/Origin 2000, Cray PVPs, IBM SP2s, HP/Convex, DEC and other large scale platforms. It has been in use at Los Alamos National Laboratory on the open ACL/ASCI SGI O2000s for several months, where it serves to separate interactive work from jobs requiring dedicated resources in a dynamically configurable fashion, to schedule and control MPI and PVM jobs spanning machines, and in general to control the workload such that users have access to the resources they need without interfering with each other. An additional benefit is that the system is heterogeneous, and will enable job control in an environment that includes diverse platforms. Personnel at LANL have been working with developers at both Platform Computing and at SGI to increase capability and to address problem areas. This talk will give a candid presentation of the benefits and pitfalls of this approach to resource management.

3:00-3:30 BREAK

3:30-4:00 Parallelizing a local area ocean circulation model. University of Bergen, Parallab: Ragnhild Blikberg
Abstract: In this talk we report on the effort and the results of porting, optimizing and parallelizing a local area ocean model due to Berntsen, Skogen and Espelid to the Cray Origin 2000. We give a short description of the model, and then focus on the techniques applied for optimization and parallelization and the speedup obtained. A comparison of the behavior of two versions of the Fortran 90 compiler, 6.2 and 7.2, was also carried out during this work and will be presented in this talk.

4:00-5:00 Compiler technology and plans, Library status
Compiler technology: TBD
Library status: Suzanne LaCroix
Abstract: The library update will provide information on the current status of various projects, including Fortran support, Flexible File I/O (FFIO), the Message Passing Toolkit (MPT), and Silicon Graphics/Cray Scientific Libraries (SCSL). Current performance will be presented, and future plans will be discussed.

5:00-5:15 Wrapup

October 8: DAY 2: HOST: Vito Bongiorno
===================

AM:

8:00-8:45 Hardware update: Dan Lenoski
Abstract: During this session Dan Lenoski, one of the original architects of the Origin2000 computer systems, will speak. He will discuss the hardware performance features of the current Origin2000 system. A description of the direction of compatible future products will also be presented.

8:45-10:00 Clustering Origins: Dan Ferber, Ajit Dandapani, Andy Poupart
Abstract: With IRIX 6.5, Origin systems will run up to 128 processors in a single system image environment. Even so, the need for and interest in cluster configurations continues to grow. Some Origin clusters support large MPI sessions across HIPPI networks. Other cluster configurations promote load balancing of jobs and optimize throughput. Systems supporting high availability applications need shared data and unattended failover handling. Parallel database options increase overall performance of database queries. And many sites have accumulated IRIX systems over time and now have an interest in clustering those systems. This session gives an overview of SGI/Cray's current set of enterprise and technical computing cluster products and strategies. After this overview by Dan Ferber, Ajit Dandapani, High Availability and Cluster Infrastructure Engineering Manager, will describe IRIX Cluster Services and the FailSafe high availability product.
Then Andy Poupart, Engineering Manager for the IRIX workgroup management product, will talk about SGI/Cray and 3rd party enterprise management products and strategies.

10:00-10:30 BREAK

10:30-12:30 O'2000 optimization: Lawrence Hannon
Abstract: This session will cover performance optimization techniques for users of the Origin 2000. The time will be divided so that half is spent on uniprocessor tuning techniques and half on parallel tuning techniques. In each section, I will cover compiler flags, performance oriented libraries, performance pitfalls, and performance tools available on the Origin systems. As time permits, the following tools will be demo'd:
- Perfex
- SpeedShop
- CaseVision - cvperf (Performance Analysis Tool)
- CaseVision - cvpav (parallel analysis tool)
- Compiler Reports (software pipelining reports, pfa reports)

12:30-1:30 LUNCH

1:30-2:00 Comments on the Origin Program: Carol Woronow, SGI

2:00-2:30 Working through ASCI-BLUE O2000 Teething Problems:
Curtis V. Canada, LANL.
Abstract: We are at the end of our first nine months' gestation with our new Accelerated Strategic Computing Initiative, ASCI, and Advanced Computing Laboratory, ACL, "Blue Mountain" Origin 2000 systems (currently 512 processors, growing to at least 4096 processors). Though larger than most O2000 installations, the equipment is by no means unique, and most of our experiences with installation, acceptance testing, initial system configuration and tuning, and first production efforts will be of interest to other O2000 sites. As in any new machine break-in period, we expect to deal with significant and perplexing problems every few weeks (some hardware, some software, some from diagnostics being too primitive to fully elucidate problems, some from scalability issues, some from overzealous sales features), as the system environment evolves and we learn how to use the machine. Despite these, we have now completed several full machine production runs in support of our nuclear stockpile stewardship and grand challenge missions, with productive results. Equally important is our experience developing a partnering relationship with SGI/CRI to overcome the difficulties.

2:30-3:00 Supercomputing API: Marj Verstegen
Abstract: Version 1.0 of the Supercomputing API was made available to customers at the end of August. This document defines the set of language features and library functions that will be implemented across Cray and Silicon Graphics platforms including Cray T3E, Cray T90, Cray J90, Origin and future S2MP systems. This session will cover implementation plans and identify releases where specific Supercomputing API features and functions will be available. This session will also address any issues or questions that have been raised about the content of the document (if available by CUG).

3:00-3:30 BREAK

3:30-4:30 Caribou: The future in development tools: Peter Rigsbee
Abstract: Caribou is the project name for Silicon Graphics' next generation of debugging and analysis technology. Caribou combines the functionality of WorkShop and CrayTools into a single, flexible, integrated user interface. This talk will discuss the key goals and features of the project, with an emphasis on how Caribou will meet the needs of high-end customers.

4:30-5:00 OPEN FOR CUSTOMER TALK OR BOF

5:00-5:15 Wrapup

EVENING: Cray Dinner

October 9: DAY 3: (1/2 day) Host: Dan Hogberg
=============================

8:00-8:45 Porting MPP applications to Origin: Margaret Cahir
Abstract: Porting codes between the T3E and the Origin has been simplified due to efforts in creating a common user environment for these machines. However, due to hardware differences and library implementation issues, there are still some areas of incompatibility that users will want to be aware of. This session will cover the issues that users may encounter in migrating code for the CRAY T3E to Origin and future platforms based on MIPS/IRIX.

8:45-9:15 Benchmark comparisons: Origin 2000, PVP, MPP: Jeff Brooks

9:15-10:00 NASA Ames Research Center Origin 2000 Systems at NAS (NASA Ames)
by Archemedes F. de Guzman
Abstract: This talk presents the current hardware and software
configuration used at NAS. Major experiences to be covered are source
builds, NFS, compilers, MPI, CPR, DMF, and patches. It also gives an
overview of how SGI/CRI service level supports a neighboring customer
and, lastly, provides some early filesystem performance numbers.

10:00-10:30 BREAK

10:30-11:30 Programming Models on O'2000: Mike Heroux, Jeff McDonald
Abstract: There are a number of parallel programming environments available on, or being developed for, the Origin 2000. We have several varieties of directive-based shared memory environments. We have PVM and MPI. We also have two newer environments, OpenMP and F--, as well as the potential to combine more than one type of parallelism in a single application. In this talk we discuss the features of these programming environments, comparing the relative strengths and weaknesses of each. An important conclusion is that the "signature" of each environment, i.e., its list of attributes, is unique if we take into account all factors that are important to our users. Thus, each environment addresses the current or future needs of some group of users uniquely.

11:30-12:30 Wrapup: Gary Jensen
Abstract: Gary will address the results of this conference and try to develop answers for questions like: what did we learn about the Origin2000 capabilities and future plans, is SGI meeting our needs/expectations with this product, and where do we go from here? Gary will take this information and assemble an action item list for SGI/CRAY that will address our needs. This session will demand participation from the audience, and all input will be noted. Questions, thoughts and further needs should be submitted to Gary in writing prior to the session and will be included in the resulting action item list as backup material. Remember this is a CUG meeting, and please do not be bashful in your comments. This is the first CUG meeting focused on one single product since the CRAY 1. We will want input about the Origin2000 and the need for meetings like this in the future.

-----------

Tutorial: NQE Open scheduling, and overview:

Schedule:
The session will start at 1:00 and end by 5:00.
The goal of workload management is scheduling. Many customer requests
for new features revolve around scheduling needs. Scheduling needs for
sites differ and conflict, to the extent that NQE cannot provide for all
these needs in a timely fashion. Therefore, the solution is to provide
an architecture that allows for easy customization by the individual
site.
The solution is NQE Open Scheduling. The NQE scheduler is based on a
database and on Tcl (Tool Command Language). The use of a database
provides a standard interface for examining information in the database.
Tcl is a scripting language similar to other UNIX scripting languages,
such as the Bourne Shell (sh), the C Shell (csh), the Korn Shell (ksh)
and Perl. In particular, it provides an extension language to configure
and customize applications.
By using Tcl, NQE provides a mechanism for sites to write custom
schedulers. There is a testing mechanism that can be used to test a
scheduler before putting it into production. The model allows the choice
of scheduler to be updated dynamically at any time. This simplifies
changing schedulers for nights, weekends, or holidays.
This model holds job requests in a central location until they are ready
to be initiated. This allows system load to be examined at the time the
job request is ready to execute. The model is layered on top of NQS.
This preserves the current NQS environment and investment. It provides a
migration path that allows sites to continue in the old environment
while experimenting with the new one. Workload can be gradually
migrated to the new environment as needed and as work schedules allow.
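
As a rough illustration of the model described above (requests held centrally, load examined at dispatch time, and the scheduler replaceable while the system runs), here is a minimal sketch. It is written in Python rather than Tcl and does not use the NQE interfaces; the names JobRequest, CentralQueue, first_fit and largest_first are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class JobRequest:
    """Hypothetical job request: a name and the CPUs it needs."""
    name: str
    cpus: int


# A policy picks the next job to start, given pending jobs and free CPUs.
Policy = Callable[[List[JobRequest], int], Optional[JobRequest]]


def first_fit(pending: List[JobRequest], free_cpus: int) -> Optional[JobRequest]:
    """Example daytime policy: start the oldest request that fits."""
    for job in pending:
        if job.cpus <= free_cpus:
            return job
    return None


def largest_first(pending: List[JobRequest], free_cpus: int) -> Optional[JobRequest]:
    """Example night policy: prefer the biggest request that fits."""
    fitting = [job for job in pending if job.cpus <= free_cpus]
    return max(fitting, key=lambda job: job.cpus, default=None)


@dataclass
class CentralQueue:
    """Holds requests centrally until the current policy releases them."""
    policy: Policy
    pending: List[JobRequest] = field(default_factory=list)

    def submit(self, job: JobRequest) -> None:
        self.pending.append(job)

    def dispatch(self, free_cpus: int) -> Optional[JobRequest]:
        # Load (free CPUs) is examined here, at dispatch time, not at submit time.
        job = self.policy(self.pending, free_cpus)
        if job is not None:
            self.pending.remove(job)
        return job


queue = CentralQueue(policy=first_fit)
queue.submit(JobRequest("small", 2))
queue.submit(JobRequest("big", 8))
started_day = queue.dispatch(free_cpus=8)    # first_fit releases "small"

queue.submit(JobRequest("small2", 2))
queue.policy = largest_first                 # swap the scheduler while running
started_night = queue.dispatch(free_cpus=8)  # largest_first releases "big"
```

Keeping the policy as a plain replaceable function is what makes a nights-and-weekends switch cheap in this sketch: the queue and its pending requests are untouched when the policy changes.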
Tutorial Overview:
The tutorial is intended to help system administrators understand,
customize and implement site scheduling policies using the NQE open
scheduler. It would be useful for anyone supporting or administering
NQE.
Audience:
The target audience is support staff and system administrators
who would use, implement or customize the NQE open scheduler for their
environment.
Tutorial outline:
Introduction to NQE open scheduling
Scheduling requirements
NQE operational environments
NQE open scheduler architecture
Callbacks, events and components
Scheduler functions
Define an NQE environment
Sample scheduler
    requirements
    features
    configuration
    customization
Scheduler customization
Scheduler testing
Scheduler implementation
If you have questions about the tutorial, you are welcome to contact me,
Daryl Coulthart, at dbc@cray.com or 612-683-5587.

=============================================================================

BoFs:

The following BoFs have been proposed and will be scheduled during breakfast (7:00-8:00 each morning) as the event approaches.

Migration: Application Development and User Environment
Led by Peter Rigsbee and Marj Verstegen

This BoF will provide an opportunity to discuss issues and concerns about migration from legacy Cray systems to Origin and follow-on S2MP systems, with the focus on issues important to application developers and end users. What kinds of help do you need from SGI/Cray? What documentation, tools, or approaches did you find useful when moving between other systems?

--------------------------------------------------------

Origin Migration and Interoperability: Operating System and Administration Environment
Led by Laura Mikrut and Kathy Nottingham

This BoF will provide an opportunity to discuss OS and administration issues and concerns about migration from Cray MPP or PVP systems to Origin and follow-on S2MP systems. The focus will be on issues important to system administrators and data center managers. Please come prepared to discuss:
- Issues and concerns about the IRIX/UNICOS convergence roadmap
- Data and peripheral transition issues
- How can Cray/SGI better help you to prepare?
- What kinds of tools and documentation are needed?
- Experiences with adding an Origin within an existing Cray environment
- Migration experiences from other platforms (major pitfalls, useful tools)

--------------------------------------------------------

Resource Management:
Led by Diane Wengelski and Paul Mielke

Interested customers are encouraged to attend the Resource
Management BoF. Diane and Paul, along with several of their
developers, will describe design plans in more detail and welcome
customer feedback on the directions that we are taking with
Irix Resource Management.
--------------------------------------------------------

Roles of Technical Computing and Strategic Software Organization
Led by Laura Mikrut

We value your ideas and comments. To reach us, send e-mail to the CUG Office.

Page last modified: 16 Jul 03