Thursday, August 9, 2018

PowerAI 1.5.2 README File

Below is the README file that is installed along with PowerAI 1.5.2. I am posting it here for sharing.

Deep Learning Software Packages

Updated 2018-06-15
PowerAI 1.5.2 provides software packages for several Deep Learning frameworks, supporting libraries, and tools:
| Component    | Version |
| ------------ | ------- |
| DDL          | 1.0.0   |
| TensorFlow   | 1.8.0   |
| TensorBoard  | 1.8.0   |
| IBM Caffe    | 1.0.0   |
| BVLC Caffe   | 1.0.0   |
| PyTorch      | 0.4.0   |
| Snap ML      | 1.0.0   |
| Spectrum MPI | 10.2    |
| Bazel        | 0.10.0  |
| OpenBLAS     | 0.2.20  |
| HDF5         | 1.10.1  |
| Protobuf     | 3.4.0   |
PowerAI is optimized to leverage the unique capabilities of IBM Power Systems accelerated servers, and is not available on any other platforms. It is supported on:
  • IBM AC922 POWER9 system with NVIDIA Tesla V100 GPUs
  • IBM S822LC POWER8 system with NVIDIA Tesla P100 GPUs
PowerAI requires some additional 3rd-party software components (see below for more information):
| Component         | Required | Recommended |
| ----------------- | -------- | ----------- |
| Red Hat RHEL      | 7.5      | 7.5         |
| NVIDIA CUDA       | 9.2      | 9.2.88      |
| NVIDIA GPU driver | 396      | 396.26      |
| NVIDIA cuDNN      | 7.1      | 7.1.4       |
| NVIDIA NCCL       | 2.2      | 2.2.12      |
| Anaconda          | 5.1      | 5.1.0       |
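If you are unsure which GPU driver and CUDA versions are already present on a system, a quick check (assuming a default CUDA install under /usr/local/cuda):
    $ nvidia-smi --query-gpu=driver_version --format=csv,noheader
    $ cat /usr/local/cuda/version.txt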

New Features in 1.5.2

  • Python 3 support for the framework packages (in addition to the existing Python 2 support; not including Caffe).
  • A Technology Preview of IBM PowerAI Snap Machine Learning (Snap ML). Snap ML provides classical machine learning functionality exposed via an sklearn-like interface.
  • A Technology Preview of PyTorch - a Python library that enables GPU-accelerated tensor computation and provides a rich API for neural network applications.
  • A Technology Preview of Large Model Support (LMS), introduced for TensorFlow and enhanced for IBM Caffe. Large Model Support provides an approach to training large models and batch sizes that cannot fit in GPU memory.
Note that an NCCL v1.3.5 package is still included in the PowerAI distribution but is not installed by default. The other PowerAI components are now built against NCCL v2.2.12, which must be downloaded from NVIDIA. The NCCL 1 package is provided for compatibility with existing applications, but may be removed in future releases of PowerAI.

Additional information

For updates to this document please visit https://developer.ibm.com/linuxonpower/deep-learning-powerai/releases/.
More information about PowerAI is available at https://ibm.biz/powerai.
Developer resources can be found at http://ibm.biz/poweraideveloper.
Have questions?
You may find an answer already at the PowerAI space on developerWorks Answers.

System Setup

Operating System

The Deep Learning packages require RHEL 7.5 little endian for IBM POWER8 and IBM POWER9. The RHEL install image and license must be acquired from Red Hat.
https://www.redhat.com/en/technologies/linux-platforms/enterprise-linux

Operating System and Repository Setup

  1. Enable the 'optional' and 'extras' repo channels
      IBM POWER8:
          $ sudo subscription-manager repos --enable=rhel-7-for-power-le-optional-rpms
          $ sudo subscription-manager repos --enable=rhel-7-for-power-le-extras-rpms
    
      IBM POWER9:
          $ sudo subscription-manager repos --enable=rhel-7-for-power-9-optional-rpms
          $ sudo subscription-manager repos --enable=rhel-7-for-power-9-extras-rpms
  2. Install packages needed for the installation
      $ sudo yum -y install wget nano bzip2
  3. Enable EPEL repo
       $ wget https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
       $ sudo rpm -ihv epel-release-latest-7.noarch.rpm
  4. Load the latest kernel
      $ sudo yum update kernel kernel-tools kernel-tools-libs kernel-bootwrapper
      $ reboot
    Or do a full update
      $ sudo yum update
      $ sudo reboot
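After the reboot, confirm the system is running the updated kernel:
    $ uname -r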

NVIDIA Components

IBM POWER9 specific udev rules

Before installing the NVIDIA components, the udev Memory Auto-Onlining Rule must be disabled for the CUDA driver to function properly. To disable it:
  1. Copy the /lib/udev/rules.d/40-redhat.rules file to the directory for user-overridden rules.
      $ sudo cp /lib/udev/rules.d/40-redhat.rules /etc/udev/rules.d/
  2. Edit the /etc/udev/rules.d/40-redhat.rules file.
      $ sudo nano /etc/udev/rules.d/40-redhat.rules
  3. Comment out the following line and save the change:
      SUBSYSTEM=="memory", ACTION=="add", PROGRAM="/bin/uname -p", RESULT!="s390*", ATTR{state}=="offline", ATTR{state}="online"
  4. Optionally delete the first line of the file, since the file was copied to a directory where it won't be overwritten.
      # do not edit this file, it will be overwritten on update
  5. Reboot the system for the changes to take effect.
      $ sudo reboot
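For unattended setups, steps 2 and 3 can be scripted with sed instead of editing the file in nano (a sketch; review the resulting file before rebooting):
    $ sudo sed -i '/SUBSYSTEM=="memory"/s/^/#/' /etc/udev/rules.d/40-redhat.rules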

CUDA, GPU driver, cuDNN and NCCL

The Deep Learning packages require CUDA, cuDNN, NCCL, and GPU driver packages from NVIDIA (see the table above).
These components can be installed as follows:
  1. Download and install NVIDIA CUDA 9.2 from https://developer.nvidia.com/cuda-downloads
    • Select Operating System: Linux
    • Select Architecture: ppc64le
    • Select Distribution: RHEL
    • Select Version: 7
    • Select Installer Type: rpm (local)
    • Follow the Linux POWER installation instructions in the CUDA Quick Start Guide, including the steps describing how to set up the CUDA development environment by updating PATH and LD_LIBRARY_PATH.
    Note: The local rpm is preferred over the network rpm as it will ensure the version installed is the version downloaded. With the network rpm, "yum install cuda" will always install the latest version of the CUDA Toolkit.
  2. Download NVIDIA cuDNN v7.1.4 for CUDA 9.2 from https://developer.nvidia.com/cudnn (Registration in NVIDIA's Accelerated Computing Developer Program is required)
    • cuDNN v7.1.4 Library for Linux (Power8/Power9)
  3. Download NVIDIA NCCL v2.2.12 for CUDA 9.2 from https://developer.nvidia.com/nccl (Registration in NVIDIA's Accelerated Computing Developer Program is required)
    • NCCL 2.2.12 O/S agnostic and CUDA 9.2 and IBM Power
  4. Install the cuDNN v7.1.4 and NCCL v2.2.12 packages and refresh the shared library cache:
       $ sudo tar -C /usr/local --no-same-owner -xzvf cudnn-9.2-linux-ppc64le-v7.1.4.tgz
       $ sudo tar -C /usr/local --no-same-owner -xzvf nccl_2.2.12-1+cuda9.2_ppc64le.tgz
       $ sudo ldconfig
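A quick check that the dynamic loader can now find the new libraries:
    $ ldconfig -p | grep -E 'libcudnn|libnccl'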

Anaconda

A number of the Deep Learning frameworks require Anaconda. Anaconda is a platform-agnostic data science distribution with a collection of 1,000+ open source packages, backed by free community support.
Anaconda2 with Python 2 should be used to run the Python 2 versions of the Deep Learning frameworks. Anaconda3 with Python 3 is required to run the Python 3 versions of the Deep Learning frameworks.
| Anaconda  | Version | Download Location                                                   | Size | md5sum                           |
| --------- | ------- | ------------------------------------------------------------------- | ---- | -------------------------------- |
| Anaconda2 | 5.1.0   | https://repo.continuum.io/archive/Anaconda2-5.1.0-Linux-ppc64le.sh  | 267M | e894dcc547a1c7d67deb04f6bba7223a |
| Anaconda3 | 5.1.0   | https://repo.continuum.io/archive/Anaconda3-5.1.0-Linux-ppc64le.sh  | 286M | 47b5b2b17b7dbac0d4d0f0a4653f5b1c |
Download and Install Anaconda. Installation requires input for license agreement, install location (default is $HOME/anaconda2 or $HOME/anaconda3), and permission to modify the PATH environment variable (via .bashrc).
Example download and install setup for Anaconda2:
   $ wget https://repo.continuum.io/archive/Anaconda2-5.1.0-Linux-ppc64le.sh
   $ bash Anaconda2-5.1.0-Linux-ppc64le.sh
   $ source ~/.bashrc
If multiple users are using the same system, each user should install Anaconda individually.
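To confirm that Anaconda is active in the current shell:
    $ which python
    $ conda --version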

Installing the Deep Learning Frameworks

Software Repository Setup

The PowerAI Deep Learning packages are distributed in a tar.gz file containing an rpm and a README file. The tar.gz file must be extracted on the local machine. Installing the rpm creates an installation repository on the local machine.
Install the repository package:
    $ sudo rpm -ihv mldl-repo-*.rpm

Installing all frameworks at once

The Deep Learning frameworks can be installed all at once using the power-mldl meta-package:
    $ sudo yum install power-mldl
NOTE: This does not include the PowerAI Distributed Deep Learning (DDL) packages. See details of how to install DDL below.

Installing the Python 3 versions of the frameworks

The Python 3 versions of the frameworks can be installed all at once using the power-mldl-py3 meta-package:
    $ sudo yum install power-mldl-py3

Installing frameworks individually

The Deep Learning frameworks can be installed individually if preferred. The framework packages are:
  • caffe-bvlc - Berkeley Vision and Learning Center (BVLC) upstream Caffe, v1.0.0
  • caffe-ibm - IBM Optimized version of BVLC Caffe, v1.0.0
  • pytorch - PyTorch, v0.4.0
  • tensorflow - TensorFlow, v1.8.0
  • tensorboard - Web Applications for inspecting TensorFlow runs and graphs, v1.8.0
The Python 3 version of each framework appends '-py3' to the package name:
  • pytorch-py3 - PyTorch, v0.4.0
  • tensorflow-py3 - TensorFlow, v1.8.0
  • tensorboard-py3 - Web Applications for inspecting TensorFlow runs and graphs, v1.8.0
Each can be installed with:
    $ sudo yum install <framework>

Install IBM PowerAI Distributed Deep Learning (DDL) packages

We recommend PowerAI Distributed Deep Learning for distributing model training across a cluster of Power machines. DDL includes IBM Spectrum MPI for communication among machines.
Install the PowerAI Distributed Deep Learning packages using:
    $ sudo yum install power-ddl
Note: DDL is an optional component. Other PowerAI components can be installed and used without installing DDL.
To use InfiniBand for DDL communications, install the latest Mellanox OFED driver. See the Download tab at: http://www.mellanox.com/page/products_dyn?product_family=26

Install IBM PowerAI Snap ML packages

Install the PowerAI Snap ML packages using:
    $ sudo yum install power-snapml
Note: Snap ML is an optional component. Other PowerAI components can be installed and used without installing Snap ML.

Accept the PowerAI License Agreement

Read the license agreement and accept the terms and conditions before using any of the frameworks.
    $ sudo /opt/DL/license/bin/accept-powerai-license.sh
After reading the license agreement, future installs may be automated to silently accept the license agreement.
    $ sudo IBM_POWERAI_LICENSE_ACCEPT=yes /opt/DL/license/bin/accept-powerai-license.sh

Upgrading from PowerAI 1.5.1

PowerAI 1.5.1 should be uninstalled prior to installing PowerAI 1.5.2.

Upgrading from PowerAI 1.5.0

PowerAI 1.5.2 requires newer versions of NVIDIA CUDA, NVIDIA cuDNN, the GPU driver, and IBM Spectrum MPI than 1.5.0. To upgrade, the older versions should be uninstalled and the newer versions installed. Likewise, the PowerAI 1.5.0 software packages should be uninstalled and the PowerAI 1.5.2 packages installed.

Upgrading from PowerAI 1.5.0 Caffe

The Caffe packages in PowerAI 1.5.0 used the HDF5 library from Anaconda. That library is now packaged with PowerAI so the Anaconda copy is no longer needed. After upgrading to 1.5.2, it is safe to remove the library symlinks from the cache directory:
$ ls -l ~/.powerai/caffe-bvlc/
$ rm -r ~/.powerai/caffe-bvlc

$ ls -l ~/.powerai/caffe-ibm/
$ rm -r ~/.powerai/caffe-ibm

Tuning Recommendations

Recommended settings for optimal Deep Learning performance on the S822LC and AC922 for High Performance Computing are:
  • Enable Performance Governor
       $ sudo yum install kernel-tools
       $ sudo cpupower -c all frequency-set -g performance
  • Enable GPU persistence mode
       $ sudo systemctl enable nvidia-persistenced
       $ sudo systemctl start nvidia-persistenced
  • Set GPU memory and graphics clocks
    • S822LC with NVIDIA Tesla P100, set clocks to maximum
      $ sudo nvidia-smi -ac 715,1480
    • AC922 with NVIDIA Tesla V100, set clocks to NVIDIA defaults
      $ sudo nvidia-smi -rac
  • For TensorFlow, set the SMT mode
    • S822LC with NVIDIA Tesla P100, set SMT=2
      $ sudo ppc64_cpu --smt=2
    • AC922 with NVIDIA Tesla V100, set SMT based on DDL usage:
      $ sudo ppc64_cpu --smt=4 # for TensorFlow WITHOUT DDL
      $ sudo ppc64_cpu --smt=2 # for TensorFlow WITH DDL
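The resulting settings can be spot-checked with the following commands (output formats vary by driver and firmware level):
    $ ppc64_cpu --smt
    $ cpupower frequency-info --policy
    $ nvidia-smi --query-gpu=clocks.applications.memory,clocks.applications.graphics --format=csv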

Getting Started with MLDL Frameworks

General Setup

Most of the PowerAI packages install outside the normal system search paths (to /opt/DL/...), so each framework package provides a shell script to simplify environment setup (e.g. PATH, LD_LIBRARY_PATH, PYTHONPATH).
We recommend users update their shell rc file (e.g. .bashrc) to source the desired setup scripts. For example:
$ source /opt/DL/<framework>/bin/<framework>-activate
Each framework also provides a test script to verify some of its functions. These test scripts include tests and examples sourced from the various communities. Note that some of the included tests rely on datasets (e.g. MNIST) that are available in the community and are downloaded at runtime. Access to and availability of this data is subject to the community and may change at any time.
To run the test script for a particular framework, run:
$ <framework>-test
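For example, with TensorFlow:
    $ source /opt/DL/tensorflow/bin/tensorflow-activate
    $ tensorflow-test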

Note about dependencies

A number of the PowerAI frameworks (for example, TensorFlow, TensorBoard, and PyTorch) have their dependencies satisfied via Anaconda packages. The <framework>-activate script validates that these dependencies are installed and fails if any are missing.
For these frameworks, the /opt/DL/<framework>/bin/install_dependencies script must be run prior to activation to install the required packages.
For example:
$ source /opt/DL/tensorflow/bin/tensorflow-activate
Missing dependencies ['backports.weakref', 'mock', 'protobuf']
Run "/opt/DL/tensorflow/bin/install_dependencies" to resolve this problem.

$ /opt/DL/tensorflow/bin/install_dependencies
Fetching package metadata ...........
Solving package specifications: .

Package plan for installation in environment /home/rhel/anaconda2:

The following NEW packages will be INSTALLED:

    backports.weakref: 1.0rc1-py27_0
    libprotobuf:       3.4.0-hd26fab5_0
    mock:              2.0.0-py27_0
    pbr:               1.10.0-py27_0
    protobuf:          3.4.0-py27h7448ec6_0

Proceed ([y]/n)? y

libprotobuf-3. 100% |###############################| Time: 0:00:02   2.04 MB/s
backports.weak 100% |###############################| Time: 0:00:00  12.83 MB/s
protobuf-3.4.0 100% |###############################| Time: 0:00:00   2.20 MB/s
pbr-1.10.0-py2 100% |###############################| Time: 0:00:00   3.35 MB/s
mock-2.0.0-py2 100% |###############################| Time: 0:00:00   3.26 MB/s

$ source /opt/DL/tensorflow/bin/tensorflow-activate
$
Note: PyTorch and TensorFlow have conflicting Anaconda package dependencies. Create separate Anaconda environments for those frameworks.
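A minimal sketch of that separation for the Python 2 packages, assuming the default Anaconda2 install and that install_dependencies installs into the currently active environment (the environment name here is arbitrary):
    $ conda create -y -n pytorch-env python=2.7   # separate env just for PyTorch
    $ source activate pytorch-env
    $ /opt/DL/pytorch/bin/install_dependencies    # assumed to target the active env
    $ source /opt/DL/pytorch/bin/pytorch-activate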

Getting Started with DDL

The Caffe and TensorFlow sections below describe how to use the DDL support for each of those frameworks.
Some configuration steps are common to all use of DDL:
  • PowerAI frameworks must be installed at the same version on all nodes in the DDL cluster.
  • The DDL master node must be able to log into all the nodes in the cluster using ssh keys. Keys can be created and added by:
    1. Generate ssh private/public key pair on the master node using:
        $ ssh-keygen
    2. Copy the generated public key in ~/.ssh/id_rsa.pub to each node's ~/.ssh/authorized_keys file:
        $ ssh-copy-id -i ~/.ssh/id_rsa.pub $USER@$HOST
  • Linux system firewalls may need to be adjusted to pass MPI traffic. This could be done broadly as shown. Note: Opening only required ports would be more secure. Required ports will vary with configuration.
      $ sudo iptables -A INPUT -p tcp --dport 1024:65535 -j ACCEPT

Getting Started with Caffe

Caffe Alternatives

Packages are provided for upstream BVLC Caffe (/opt/DL/caffe-bvlc) and IBM optimized Caffe (/opt/DL/caffe-ibm). The system default Caffe (/opt/DL/caffe) can be selected using the operating system's alternatives system:
    $ sudo update-alternatives --config caffe
    There are 2 programs which provide 'caffe'.

      Selection    Command
    -----------------------------------------------
       1           /opt/DL/caffe-bvlc
    *+ 2           /opt/DL/caffe-ibm

    Enter to keep the current selection[+], or type selection number:
Users can activate the system default caffe:
    $ source /opt/DL/caffe/bin/caffe-activate
Or they can activate a specific variant. For example:
    $ source /opt/DL/caffe-bvlc/bin/caffe-activate
Attempting to activate multiple Caffe packages in a single login session will cause unpredictable behavior.

Caffe Samples and Examples

Each Caffe package includes example scripts, sample models, and other content. A script is provided to copy the sample content into a specified directory:
    $ caffe-install-samples <somedir>

More Info

Visit Caffe's website (http://caffe.berkeleyvision.org/) for tutorials and example programs that you can run to get started.

Optimizations in IBM Caffe

The IBM Caffe package (caffe-ibm) in PowerAI is based on BVLC Caffe and includes optimizations and enhancements from IBM, described in the sections below.
Note: DDL is to be installed separately, as mentioned above.

Command Line Options

IBM Caffe supports all of BVLC Caffe's options and adds a few new ones to control the enhancements. IBM Caffe options related to Distributed Deep Learning (options that start with the word "ddl") will work only if you have DDL installed.
  • -bvlc: Disable CPU/GPU layer-wise reduction
  • -threshold: Tune CPU/GPU layer-wise reduction. If the number of parameters for one layer is greater than or equal to threshold, their accumulation on CPU will be done in parallel. Otherwise, the accumulation will be done using one thread. It is set to 2,000,000 by default.
  • -ddl ["-option1 param -option2 param"]: Enable Distributed Deep Learning, with optional space-delimited parameter string. Supported parameters are:
    • mode <mode>
    • dump_iter <N>
    • dev_sync <0, 1, or 2>
    • rebind_iter <N>
    • dbg_level <0, 1, or 2>
  • -ddl_update: This option instructs Caffe to use a new custom version of the ApplyUpdate function that is optimized for DDL. It is faster but does not support gradient clipping, so it is off by default. It can be used with networks that do not require gradient clipping (the common case).
  • -ddl_align: This option ensures that the gradient buffers have a length that is a multiple of 256 bytes and start addresses that are multiples of 256. This ensures cache line alignment on multiple platforms as well as alignment with NCCL slices. Off by default.
  • -ddl_database_restart: This option ensures every learner always looks at the same data set during an epoch. This allows a system to cache only the pages that are touched by the learners contained within it. It can help size the number of learners needed for a given data set size by establishing a known database footprint per system. This flag should not be used when running Caffe on several hosts. Off by default.
  • -lms: Enable Large Model Support. See below.
  • -lms_size_threshold <size in KB>: Set LMS size threshold. See below.
  • -lms_exclude <size in MB>: Tune LMS memory utilization. See below.
  • -affinity: Enable CPU/GPU affinity (default). Specify -noaffinity to disable.
Use the command line options as follows:
    | Feature                         | -bvlc | -ddl | -lms  | -gpu          | -affinity |
    | ------------------------------- | ----- | ---- | ----- | ------------- | --------- |
    | CPU/GPU layer-wise reduction    |   N   |   X  |   X   | multiple GPUs | X         |
    | Distributed Deep Learning (DDL) |   X   |   Y  |   X   | N             | X         |
    | Large model support             |   X   |   X  |   Y   | X             | X         |
    | CPU/GPU affinity                |   X   |   X  |   X   | X             | Y         |

    Y: do specify
    N: don't specify
    X: don't care/matter
LMS gets enabled regardless of other options as long as -lms is specified. For example, you can use DDL and LMS together.
CPU/GPU layer-wise reduction is enabled only if multiple GPUs are specified and the solver setting layer_wise_reduce: false is used.
Use of multiple GPUs with DDL is specified via the MPI rank file, so the -gpu flag may not be used to specify multiple GPUs for DDL.
When running Caffe on several hosts, using shared storage for data can cause Caffe to hang.
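For example, a DDL training run that enables device synchronization and raises the debug level might look like this (the parameter values are illustrative; see /opt/DL/ddl/doc/README.md for the full syntax):
    $ ddlrun -H host1,host2 caffe train -solver solver.prototxt -ddl "-dev_sync 1 -dbg_level 1"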

About CPU/GPU Layer-wise Reduction

This optimization aims to reduce the running time of a multiple-GPU training by utilizing CPUs. In particular, gradient accumulation is offloaded to CPUs and done in parallel with the training. To gain the best performance with IBM Caffe, please close unnecessary applications that consume a high percentage of CPU.
If using a single GPU, IBM Caffe and BVLC Caffe will have similar performance.
The optimizations in IBM Caffe do not change the convergence of a neural network during training. IBM Caffe and BVLC Caffe should produce the same convergence results.
CPU/GPU layer-wise reduction is enabled unless the -bvlc command-line flag is used.

About IBM PowerAI Distributed Deep Learning (DDL)

See /opt/DL/ddl/doc/README.md for more information about using IBM PowerAI Distributed Deep Learning.

About Large Model Support (LMS)

IBM Caffe with Large Model Support loads the neural model and data set in system memory and caches activity to GPU memory only when needed for computation. This allows models and training batch size to scale significantly beyond what was previously possible. You can enable Large Model Support by adding -lms. Large Model Support is available as a technology preview.
The -lms_size_threshold <size in KB> option modifies the minimum memory chunk size considered for the LMS cache (default: 1000). Any chunk smaller than this value will be exempt from LMS reuse and will persist in GPU memory. The value can be used to control the performance trade-off.
The -lms_exclude <size in MB> option defines a soft limit on GPU memory allocated for the LMS cache (where limit = GPU-capacity - value). If zero, favors aggressive GPU memory reuse over allocation (default). If specified (> 0), enables aggressive allocation of GPU memory up to the limit. Minimizing this value -- while still allowing enough memory for non-LMS allocations -- may improve performance by increasing GPU memory utilization and reducing data transfers between system and GPU memory.
For example, the following command line options yield the best training performance for the GoogleNet model with high-resolution image data (crop size 2240x2240, batch size 5) using Tesla P100 GPUs:
    $ caffe train -solver=solver.prototxt -gpu all -lms -lms_size_threshold 1000 -lms_exclude 1400
Note that ideal tunings for any given scenario may differ depending on the model's network architecture, data size, batch size and GPU memory capacity.

Combining LMS and DDL

Large Model Support and Distributed Deep Learning can be combined. For example, to run on two hosts named host1 and host2:
    $ ddlrun -H host1,host2 caffe train -solver solver-resnet-152.prototxt -lms

Getting Started with TensorFlow

The TensorFlow homepage (https://www.tensorflow.org/) has a variety of information, including Tutorials, How Tos, and a Getting Started guide.
Additional tutorials and examples are available from the community.

High-Performance Models

A version of TensorFlow High-Performance Models which includes options to use Distributed Deep Learning is included in the tensorflow-performance-models package. For more information, see:
  • /opt/DL/tensorflow-performance-models/scripts/tf_cnn_benchmarks/README.md
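For example, a single-node run of the benchmarks might look like this (the model and batch size are illustrative; see the README above for the supported options):
    $ source /opt/DL/tensorflow/bin/tensorflow-activate
    $ cd /opt/DL/tensorflow-performance-models/scripts/tf_cnn_benchmarks
    $ python tf_cnn_benchmarks.py --model=resnet50 --batch_size=64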

Large Model Support (LMS)

This release of PowerAI includes a Technology Preview of large model support for TensorFlow. Large Model Support provides an approach to training large models and batch sizes that cannot fit in GPU memory. It does this by use of a graph editing library that takes the user model's computational graph and automatically adds swap-in and swap-out nodes for transferring tensors from GPU memory to system memory and vice versa during training.
For more information about TensorFlow LMS, see:
  • /opt/DL/tensorflow/doc/README-LMS.md

Distributed Deep Learning (DDL) Custom Operator for TensorFlow

The DDL custom operator uses IBM Spectrum MPI and NCCL to provide high-speed communications for distributed TensorFlow.
The DDL custom operator can be found in the ddl-tensorflow package. For more information about DDL and about the TensorFlow operator, see:
  • /opt/DL/ddl/doc/README.md
  • /opt/DL/ddl-tensorflow/doc/README.md
  • /opt/DL/ddl-tensorflow/doc/README-API.md
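As a sketch, a two-node DDL run of the benchmarks shipped in tensorflow-performance-models might look like the following; the exact flags (for example, the DDL variable update mode shown here) are documented in the READMEs above and should be confirmed there:
    $ ddlrun -H host1,host2 python tf_cnn_benchmarks.py --model=resnet50 --variable_update=ddl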

Additional TensorFlow Features

The PowerAI TensorFlow packages include TensorBoard. See: https://www.tensorflow.org/get_started/summaries_and_tensorboard
The TensorFlow 1.8.0 package also includes support for additional features.

TensorBoard Usage Notes

Additional usage notes are available from the community. See https://github.com/tensorflow/tensorboard.

Getting Started with Snap Machine Learning (Snap ML)

This release of PowerAI includes a Technology Preview of Snap Machine Learning (Snap ML). Snap ML is a library for training generalized linear models. It is being developed at IBM with the vision of removing training time as a bottleneck for machine learning applications. Snap ML supports a large number of classical machine learning models, scales gracefully to data sets with billions of examples and/or features, and offers distributed training, GPU acceleration, and support for sparse data structures.
"With Snap ML you can train your machine learning model faster than you can snap your fingers!"
The Snap ML library offers two different packages:

snap-ml-local

snap-ml-local is used for machine learning on a single machine.
For information on snap-ml-local, see /opt/DL/snap-ml-local/doc/README.md

snap-ml-mpi

snap-ml-mpi is used for distributed training of machine learning models across a cluster of machines.
For information on snap-ml-mpi, see /opt/DL/snap-ml-mpi/doc/README.md

Getting Started with PyTorch

This release of PowerAI includes a Technology Preview of PyTorch, a deep learning framework for fast, flexible experimentation.

PyTorch Examples

The PyTorch package includes a set of examples. A script is provided to copy the sample content into a specified directory:
    $ pytorch-install-samples <somedir>

More Info

The PyTorch homepage (https://pytorch.org) has a variety of information, including Tutorials and a Getting Started guide.
Additional tutorials and examples are available from the community.

Uninstalling MLDL Frameworks

The MLDL framework packages can be uninstalled individually if desired. Or to uninstall all MLDL packages and the repository package at once:
    $ sudo yum remove powerai-license
    $ sudo yum remove mldl-repo-local
    $ sudo yum autoremove
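Afterward, a quick check that no PowerAI packages remain (the command should print nothing):
    $ rpm -qa | grep -i -E 'mldl|powerai'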
© Copyright IBM Corporation 2017, 2018
IBM, the IBM logo, ibm.com, POWER, Power, POWER8, POWER9, and Power systems are trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at www.ibm.com/legal/copytrade.shtml.
Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.
The TensorFlow package includes code from the BoringSSL project. The following notices may apply:
    This product includes software developed by the OpenSSL Project for
    use in the OpenSSL Toolkit. (http://www.openssl.org/)

    This product includes cryptographic software written by Eric Young
    (eay@cryptsoft.com)
This document is current as of the initial date of publication and may be changed by IBM at any time. Not all offerings are available in every country in which IBM operates.
THE INFORMATION IN THIS DOCUMENT IS PROVIDED "AS IS" WITHOUT ANY WARRANTY, EXPRESS OR IMPLIED, INCLUDING WITHOUT ANY WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND ANY WARRANTY OR CONDITION OF NON-INFRINGEMENT. IBM products are warranted according to the terms and conditions of the agreements under which they are provided.
