Thursday, July 26, 2018

Building xgboost4j on ppc64le

First, install pyarrow by following the instructions at http://hwengineer.blogspot.com/2018/07/ppc64le-pyarrow-wheel-build.html . Then download the xgboost source.

[root@ING data]# git clone -b h2oai  https://github.com/h2oai/xgboost.git
Cloning into 'xgboost'...
remote: Counting objects: 25961, done.
remote: Compressing objects: 100% (86/86), done.
remote: Total 25961 (delta 83), reused 78 (delta 45), pack-reused 25828
Receiving objects: 100% (25961/25961), 9.44 MiB | 4.67 MiB/s, done.
Resolving deltas: 100% (15316/15316), done.

[root@ING data]# cd xgboost

[root@ING xgboost]# git submodule init
Submodule 'cub' (https://github.com/NVlabs/cub) registered for path 'cub'
Submodule 'dmlc-core' (https://github.com/dmlc/dmlc-core) registered for path 'dmlc-core'
Submodule 'rabit' (https://github.com/dmlc/rabit) registered for path 'rabit'

[root@ING xgboost]# git submodule update --recursive
Cloning into 'cub'...
remote: Counting objects: 32642, done.
remote: Total 32642 (delta 0), reused 0 (delta 0), pack-reused 32642
Receiving objects: 100% (32642/32642), 16.46 MiB | 6.82 MiB/s, done.
Resolving deltas: 100% (28620/28620), done.
Submodule path 'cub': checked out 'b20808b1b04ec3d6a625e51fbc1eb76f337754ad'
Cloning into 'dmlc-core'...
remote: Counting objects: 4975, done.
remote: Compressing objects: 100% (21/21), done.
remote: Total 4975 (delta 7), reused 9 (delta 2), pack-reused 4952
Receiving objects: 100% (4975/4975), 1.22 MiB | 1.44 MiB/s, done.
Resolving deltas: 100% (2976/2976), done.
Submodule path 'dmlc-core': checked out '459ab734d15acd68fd437abf845c7c1730b5a38f'
Cloning into 'rabit'...
remote: Counting objects: 3174, done.
remote: Total 3174 (delta 0), reused 0 (delta 0), pack-reused 3173
Receiving objects: 100% (3174/3174), 905.56 KiB | 378.00 KiB/s, done.
Resolving deltas: 100% (2058/2058), done.
Submodule path 'rabit': checked out '87143deb4c0a34302f727ba35497e3380b2cced8'

The first step is to run build.sh, followed by make -f Makefile2 libxgboost. These two steps build the xgboost wheel file. After that, go into jvm-packages and run create_jni.py to build libxgboost4j.so.

[root@ING xgboost]# ./build.sh
...
build/linear/updater_coordinate.o dmlc-core/libdmlc.a rabit/lib/librabit.a  -pthread -lm  -fopenmp -lrt  -lrt
Successfully build multi-thread xgboost


[root@ING xgboost]# make -f Makefile2 libxgboost
...
copying build/lib/xgboost/build-python.sh -> build/bdist.linux-ppc64le/wheel/xgboost
copying build/lib/xgboost/libxgboost.so -> build/bdist.linux-ppc64le/wheel/xgboost
running install_egg_info
Copying xgboost.egg-info to build/bdist.linux-ppc64le/wheel/xgboost-0.72-py3.6.egg-info
running install_scripts
creating build/bdist.linux-ppc64le/wheel/xgboost-0.72.dist-info/WHEEL

[root@ING xgboost]# ls -l ./python-package/dist/xgboost-0.72-py3-none-any.whl
-rw-r--r-- 1 root root 77640590 Jul 23 14:52 ./python-package/dist/xgboost-0.72-py3-none-any.whl

[root@ING xgboost]# pip install ./python-package/dist/xgboost-0.72-py3-none-any.whl

[root@ING xgboost]# cd jvm-packages

[root@ING jvm-packages]# python create_jni.py
...
[100%] Linking CXX shared library ../lib/libxgboost4j.so
[100%] Built target xgboost4j
cd demo/regression
/root/anaconda3/bin/python mapfeat.py
/root/anaconda3/bin/python mknfold.py machine.txt 1
copying native library
mkdir -p xgboost4j/src/main/resources/lib
cp ../lib/libxgboost4j.so xgboost4j/src/main/resources/lib
copying pure-Python tracker
cp ../dmlc-core/tracker/dmlc_tracker/tracker.py xgboost4j/src/main/resources
copying train/test files
mkdir -p xgboost4j-spark/src/test/resources
cd ../demo/regression
/root/anaconda3/bin/python mapfeat.py
/root/anaconda3/bin/python mknfold.py machine.txt 1
cp ../demo/regression/machine.txt.train xgboost4j-spark/src/test/resources
cp ../demo/regression/machine.txt.test xgboost4j-spark/src/test/resources
cp ../demo/data/agaricus.txt.test xgboost4j-spark/src/test/resources
cp ../demo/data/agaricus.txt.train xgboost4j-spark/src/test/resources

[root@ING jvm-packages]# ls -l ./xgboost4j/src/main/resources/lib/libxgboost4j.so
-rwxr-xr-x 1 root root 158562656 Jul 23 15:49 ./xgboost4j/src/main/resources/lib/libxgboost4j.so

[root@ING jvm-packages]# file ./xgboost4j/src/main/resources/lib/libxgboost4j.so
./xgboost4j/src/main/resources/lib/libxgboost4j.so: ELF 64-bit LSB shared object, 64-bit PowerPC or cisco 7500, version 1 (GNU/Linux), dynamically linked, BuildID[sha1]=510aede1975d463804199edd1b74457b8e48c4f8, not stripped
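
The architecture check done with `file` above can also be scripted. The following is a minimal sketch, not part of the original build procedure, that reads the ELF header directly. The helper names `describe_elf` and `is_ppc64le_library` are hypothetical; the value 21 is the ELF machine code for 64-bit PowerPC.

```python
import struct

EM_PPC64 = 21  # ELF e_machine value for 64-bit PowerPC


def describe_elf(header):
    """Decode the ELF header fields that `file` reports (needs the first 20 bytes)."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64bit = header[4] == 2        # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    little_endian = header[5] == 1   # EI_DATA: 1 = LSB, 2 = MSB
    byte_order = "<" if little_endian else ">"
    # e_type and e_machine are two consecutive 16-bit fields at offset 16
    e_type, e_machine = struct.unpack_from(byte_order + "HH", header, 16)
    return {
        "is_64bit": is_64bit,
        "little_endian": little_endian,
        "shared_object": e_type == 3,  # ET_DYN
        "ppc64": e_machine == EM_PPC64,
    }


def is_ppc64le_library(path):
    """True if `path` is a 64-bit little-endian PowerPC shared object."""
    with open(path, "rb") as f:
        info = describe_elf(f.read(20))
    return all(info.values())
```

For example, `is_ppc64le_library("./xgboost4j/src/main/resources/lib/libxgboost4j.so")` should return True for the library built above.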

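Finally, update h2o.jar with the libxgboost4j.so generated above, storing it under lib/linux64/ with the name libxgboost4j_gpu.so. Note that the commands below copy it as lib/linux64/libxgboost4j.so; adjust the target file name to whatever your h2o.jar expects.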

$ mkdir -p lib/linux64

$ cp ./xgboost4j/src/main/resources/lib/libxgboost4j.so lib/linux64/libxgboost4j.so

$ jar uf h2o.jar lib/linux64/libxgboost4j.so
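
Since a jar is a zip archive, the result of the update can be checked from Python. This is only a sketch under that assumption; `jar_has_entry` and `add_to_jar` are hypothetical helper names. Unlike `jar uf`, which replaces an existing entry, `add_to_jar` refuses to add an entry that is already present.

```python
import zipfile


def jar_has_entry(jar_path, entry):
    """Return True if the jar (a zip archive) contains `entry`."""
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()


def add_to_jar(jar_path, src_path, entry):
    """Append `src_path` to the jar as `entry`, unless that entry already exists."""
    if jar_has_entry(jar_path, entry):
        raise FileExistsError(entry + " already exists in " + jar_path)
    with zipfile.ZipFile(jar_path, "a", zipfile.ZIP_DEFLATED) as jar:
        jar.write(src_path, arcname=entry)
```

After the `jar uf` command above, `jar_has_entry("h2o.jar", "lib/linux64/libxgboost4j.so")` should return True.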

------------------------------

Below are some of the errors encountered during the build above, and how to work around them.

[root@ING xgboost]# make -f Makefile2 libxgboost
...
Collecting jmespath<1.0.0,>=0.7.1 (from botocore==1.10.62->awscli>=1.11.148->-r requirements_runtime.txt (line 74))
  Using cached https://files.pythonhosted.org/packages/b7/31/05c8d001f7f87f0f07289a5fc0fc3832e9a57f2dbd4d3b0fee70e0d51365/jmespath-0.9.3-py2.py3-none-any.whl
recommonmark 0.4.0 has requirement commonmark<=0.5.4, but you'll have commonmark 0.7.5 which is incompatible.
Installing collected packages: html5lib, bleach, execnet, qtconsole, tabulate, testpath, pyasn1, rsa, jmespath, botocore, s3transfer, awscli, feather-format, graphviz
  Found existing installation: html5lib 0.9999999
Cannot uninstall 'html5lib'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall.
make: *** [libxgboostp2nccl] Error 1


[root@ING xgboost]# rm -rf /root/anaconda3/lib/python3.6/site-packages/html5lib*

[root@ING xgboost]# make -f Makefile2 libxgboost
...
Installing collected packages: testpath, jmespath, botocore, pyasn1, rsa, s3transfer, awscli, feather-format, graphviz
Successfully installed awscli-1.15.63 botocore-1.10.62 feather-format-0.4.0 graphviz-0.8.4 jmespath-0.9.3 pyasn1-0.4.3 rsa-3.4.2 s3transfer-0.1.13 testpath-0.3.1
pip install -r requirements_build.txt
Could not open requirements file: [Errno 2] No such file or directory: 'requirements_build.txt'
make: *** [libxgboostp2nccl] Error 1

[root@ING xgboost]# find . -name "requirement*.txt"
./doc/requirements.txt
./requirements_buildonly.txt
./requirements_runtime.txt


[root@ING xgboost]# cp ./requirements_buildonly.txt ./requirements_build.txt

[root@ING xgboost]# make -f Makefile2 libxgboost
...
Collecting cmake>=0.8.0 (from -r requirements_build.txt (line 9))
  Cache entry deserialization failed, entry ignored
  Downloading https://files.pythonhosted.org/packages/79/06/e89052a7e65ab765bc5e279542853d043ec857e61f253973c05a80f2490f/cmake-3.11.4.tar.gz
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-install-cu87aoqb/cmake/setup.py", line 7, in <module>
        from skbuild import setup
    ModuleNotFoundError: No module named 'skbuild'

    ----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-cu87aoqb/cmake/
make: *** [libxgboostp2nccl] Error 1


[root@ING xgboost]# vi requirements_build.txt
#cmake>=0.8.0
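
The manual edit above (commenting out the cmake requirement) could also be done with a small script. This is a sketch only; `comment_out_requirement` is a hypothetical helper, and it assumes one requirement per line in the usual pip format.

```python
import re


def comment_out_requirement(text, package):
    """Prefix every requirement line for `package` with '#'."""
    # Match the bare package name, optionally followed by a version
    # specifier, extras, or an environment marker.
    pattern = re.compile(
        r"^\s*" + re.escape(package) + r"\s*([<>=!~\[;].*)?$", re.IGNORECASE
    )
    lines = [
        "#" + line if pattern.match(line) else line
        for line in text.splitlines()
    ]
    result = "\n".join(lines)
    return result + "\n" if text.endswith("\n") else result
```

Reading requirements_build.txt, passing its contents through `comment_out_requirement(contents, "cmake")`, and writing the result back reproduces the edit shown above.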



Wednesday, June 14, 2017

Setting up Anaconda python2 & 3 environments side by side on ppc64le - XGBoost, OpenCV, KoNLPy, MeCab, etc.

First, for installing miniconda on ppc64le, please refer to the post below.

http://hwengineer.blogspot.kr/2017/05/minsky-continuum-anaconda.html

Here we start from an environment where miniconda3 (python3.6) is already installed. The rest of this post is organized as questions and answers that commonly come up when using anaconda on ppc64le.


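Q1. If we have to use Miniconda3 instead of Anaconda, can the following packages still be installed with conda install?

numpy, pandas, sklearn (scikit-learn), joblib, flask, bs4 (beautiful soup), MKL (math kernel library)

A1. Yes, all of them install fine with conda, as shown below. The one exception is MKL, which is not available for ppc64le. However, numpy and scipy, which are commonly used together with it, install fine with conda as shown below, so MKL is not strictly required.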

u0017496@sys-87576:~$ conda install joblib
Fetching package metadata .........
Solving package specifications: .

Package plan for installation in environment /home/u0017496/miniconda3:

The following NEW packages will be INSTALLED:

    joblib: 0.11-py36_0

Proceed ([y]/n)? y

joblib-0.11-py 100% |#########################################################| Time: 0:00:00   7.17 MB/s

u0017496@sys-87576:~$ conda install scikit-learn
Fetching package metadata .........
Solving package specifications: .

Package plan for installation in environment /home/u0017496/miniconda3:

The following NEW packages will be INSTALLED:

    scikit-learn: 0.18.1-np112py36_1

Proceed ([y]/n)? y

scikit-learn-0 100% |#########################################################| Time: 0:00:00  14.48 MB/s

u0017496@sys-87576:~$ conda install pandas
Fetching package metadata .........
Solving package specifications: .

Package plan for installation in environment /home/u0017496/miniconda3:

The following NEW packages will be INSTALLED:

    pandas:          0.20.1-np112py36_0
    python-dateutil: 2.6.0-py36_0
    pytz:            2017.2-py36_0

Proceed ([y]/n)? y

pytz-2017.2-py 100% |#########################################################| Time: 0:00:00   7.85 MB/s
python-dateuti 100% |#########################################################| Time: 0:00:00   8.43 MB/s
pandas-0.20.1- 100% |#########################################################| Time: 0:00:02  10.59 MB/s

u0017496@sys-87576:~$ conda install flask
Fetching package metadata .........
Solving package specifications: .

Package plan for installation in environment /home/u0017496/miniconda3:

The following NEW packages will be INSTALLED:

    click:        6.7-py36_0
    flask:        0.12.2-py36_0
    itsdangerous: 0.24-py36_0
    jinja2:       2.9.6-py36_0
    markupsafe:   0.23-py36_2
    werkzeug:     0.12.2-py36_0

Proceed ([y]/n)? y

click-6.7-py36 100% |#########################################################| Time: 0:00:00   7.14 MB/s
itsdangerous-0 100% |#########################################################| Time: 0:00:00  11.12 MB/s
markupsafe-0.2 100% |#########################################################| Time: 0:00:00  15.42 MB/s
werkzeug-0.12. 100% |#########################################################| Time: 0:00:00   5.71 MB/s
jinja2-2.9.6-p 100% |#########################################################| Time: 0:00:00  14.30 MB/s
flask-0.12.2-p 100% |#########################################################| Time: 0:00:00   7.08 MB/s

u0017496@sys-87576:~$ conda install beautifulsoup4
Fetching package metadata .........
Solving package specifications: .

Package plan for installation in environment /home/u0017496/miniconda3:

The following NEW packages will be INSTALLED:

    beautifulsoup4: 4.6.0-py36_0

Proceed ([y]/n)? y

beautifulsoup4 100% |#########################################################| Time: 0:00:00   7.12 MB/s


u0017496@sys-87576:~$ conda install numpy scipy
Fetching package metadata .........
Solving package specifications: .

# All requested packages already installed.
# packages in environment at /home/u0017496/miniconda3:
#
numpy                     1.12.1                   py36_0
scipy                     0.19.0              np112py36_0
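
Once the packages are installed, a quick way to confirm that they are importable is to look them up without actually importing them. This is a minimal sketch; `missing_modules` is a hypothetical helper. Note that the import names differ from the conda package names for scikit-learn (sklearn) and beautifulsoup4 (bs4).

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Import names for the packages listed in Q1
    wanted = ["numpy", "scipy", "pandas", "sklearn", "joblib", "flask", "bs4"]
    print("missing:", missing_modules(wanted))
```

An empty list means every package from Q1 is visible to the current interpreter.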


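Q2. To support both Python 2 and 3, instead of installing Anaconda2 and Anaconda3 separately, is it possible to install only Anaconda3 and create a python2 virtual environment inside it, as described at the URL below?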

www.continuum.io/blog/developer-blog/python-3-support-anaconda

A2. Yes, it works fine.

u0017496@sys-87576:~$ conda create -n py2k python=2 anaconda
Fetching package metadata .........
Solving package specifications: .

Package plan for installation in environment /home/u0017496/miniconda3/envs/py2k:

The following NEW packages will be INSTALLED:

    alabaster:          0.7.10-py27_0
    anaconda:           4.4.0-np112py27_0
    anaconda-client:    1.6.3-py27_0
....
...
jupyter-1.0.0- 100% |#########################################################| Time: 0:00:00  10.50 MB/s
anaconda-4.4.0 100% |#########################################################| Time: 0:00:00   5.53 MB/s
#
# To activate this environment, use:
# > source activate py2k
#
# To deactivate this environment, use:
# > source deactivate py2k
#

To enter py2k, that is, the python2 virtual environment, do the following.

u0017496@sys-87576:~$ source activate py2k

You can see that python now runs as version 2.7.13.

(py2k) u0017496@sys-87576:~$ python
Python 2.7.13 |Anaconda 4.4.0 (64-bit)| (default, Mar 16 2017, 18:34:18)
[GCC 4.8.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://anaconda.org
>>>


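Q3. For XGBoost, installation via pip install supports only the CPU version of the package.
A GPU-capable XGBoost requires compiling the Github source code at the URL below together with the CUB files. Is this also possible on ppc64le?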

github.com/dmlc/xgboost/tree/master/plugin/updater_gpu

A3. Yes, it works fine. Download the package above with git clone and remove just the Intel-specific option -msse2 from the following four Makefiles.

u0017496@sys-87576:~/$ git clone --recursive https://github.com/dmlc/xgboost.git

u0017496@sys-87576:~/$ cd xgboost

u0017496@sys-87576:~/xgboost$ vi ./rabit/guide/Makefile ./rabit/Makefile ./dmlc-core/Makefile ./Makefile

u0017496@sys-87576:~/xgboost$ ./build.sh
...
g++ -std=c++11 -Wall -Wno-unknown-pragmas -Iinclude   -Idmlc-core/include -Irabit/include -I/include -O3 -funroll-loops -mcpu=power8 -fPIC -fopenmp -o xgboost  build/cli_main.o build/logging.o build/learner.o build/common/common.o build/common/hist_util.o build/metric/metric.o build/metric/rank_metric.o build/metric/elementwise_metric.o build/metric/multiclass_metric.o build/objective/multiclass_obj.o build/objective/objective.o build/objective/rank_obj.o build/objective/regression_obj.o build/data/sparse_page_dmatrix.o build/data/sparse_page_source.o build/data/sparse_page_writer.o build/data/simple_csr_source.o build/data/data.o build/data/sparse_page_raw_format.o build/data/simple_dmatrix.o build/tree/updater_prune.o build/tree/tree_updater.o build/tree/updater_histmaker.o build/tree/updater_refresh.o build/tree/updater_sync.o build/tree/updater_colmaker.o build/tree/updater_skmaker.o build/tree/tree_model.o build/tree/updater_fast_hist.o build/gbm/gbtree.o build/gbm/gblinear.o build/gbm/gbm.o build/c_api/c_api.o build/c_api/c_api_error.o dmlc-core/libdmlc.a rabit/lib/librabit.a  -pthread -lm  -fopenmp -lrt  -lrt
Successfully build multi-thread xgboost
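
The Makefile edit described above was done by hand in vi. As an illustration only, the same change could be scripted as follows; `strip_sse2` and `patch_makefiles` are hypothetical helpers, and the sketch assumes -msse2 always appears as a separate, whitespace-delimited flag.

```python
import re

# -msse2 as a whole flag, together with the whitespace (or line start) before it
_SSE2 = re.compile(r"(^|[ \t])-msse2(?=\s|$)", re.MULTILINE)


def strip_sse2(makefile_text):
    """Return the Makefile text with every standalone -msse2 flag removed."""
    return _SSE2.sub("", makefile_text)


def patch_makefiles(paths):
    """Rewrite each Makefile in place; return the paths that actually changed."""
    changed = []
    for path in paths:
        with open(path) as f:
            original = f.read()
        patched = strip_sse2(original)
        if patched != original:
            with open(path, "w") as f:
                f.write(patched)
            changed.append(path)
    return changed
```

Run from the xgboost directory, `patch_makefiles(["./rabit/guide/Makefile", "./rabit/Makefile", "./dmlc-core/Makefile", "./Makefile"])` would cover the four files edited above.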


Q4. konlpy requires a JDK and JPype1 to be set up, along with paths such as PATH and JAVA_HOME. Does it work in both python2 and python3?

A4. Yes, it works.

First, it works in the default python3 environment as follows.

u0017496@sys-87576:~/xgboost$ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-ppc64el
u0017496@sys-87576:~/xgboost$ python
Python 3.6.0 |Continuum Analytics, Inc.| (default, Mar 16 2017, 19:36:14)
[GCC 4.8.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from konlpy.tag import *
>>> Kkma()
<konlpy.tag._kkma.Kkma instance at 0x3fff9561be60>

Next, it also works with python2 in the virtual environment created above. Note, however, that environment variables must be set again inside the virtual environment.

u0017496@sys-87576:~$ source activate py2k
(py2k) u0017496@sys-87576:~$ python
Python 2.7.13 |Anaconda 4.4.0 (64-bit)| (default, Mar 16 2017, 18:34:18)
[GCC 4.8.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://anaconda.org
>>> from konlpy.tag import *
>>> Kkma()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: Kkma instance has no __call__ method

The error above occurred because JAVA_HOME was not set in this virtual environment. Once it is set, the error goes away.

(py2k) u0017496@sys-87576:~$ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-ppc64el
(py2k) u0017496@sys-87576:~$ python
Python 2.7.13 |Anaconda 4.4.0 (64-bit)| (default, Mar 16 2017, 18:34:18)
[GCC 4.8.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://anaconda.org
>>> from konlpy.tag import *
>>> Kkma()
<konlpy.tag._kkma.Kkma instance at 0x3fff9561be60>
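
To avoid forgetting the export in each environment, JAVA_HOME can also be set from inside Python before konlpy is imported. This is a sketch, not the method used above: `ensure_java_home` and `make_kkma` are hypothetical helpers, and it assumes that JPype looks up JAVA_HOME in the process environment when the JVM is started.

```python
import os


def ensure_java_home(default_path):
    """Set JAVA_HOME to `default_path` if it is unset or empty; return the value in effect."""
    if not os.environ.get("JAVA_HOME"):
        os.environ["JAVA_HOME"] = default_path
    return os.environ["JAVA_HOME"]


def make_kkma(java_home="/usr/lib/jvm/java-8-openjdk-ppc64el"):
    """Create a Kkma tagger after making sure JAVA_HOME is set."""
    ensure_java_home(java_home)
    # Imported here so that JAVA_HOME is already in place when konlpy loads
    from konlpy.tag import Kkma
    return Kkma()
```

An existing JAVA_HOME is left untouched, so an explicit export in the shell still takes precedence.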


Q5. Can mecab be installed additionally for KoNLPy, and does it work? For example, does the code below work in both python2 and python3?

from konlpy.tag import *
Mecab()

A5. Yes, it works as shown below. It failed with an error at first, but that turned out to be resolved by installing mecab-ko-dic separately.

First install mecab-python3 with pip, then download and install mecab-ko-dic.

u0017496@sys-87576:~$ pip install mecab-python3

u0017496@sys-87576:~$ wget https://bitbucket.org/eunjeon/mecab-ko-dic/downloads/mecab-ko-dic-1.6.1-20140814.tar.gz

u0017496@sys-87576:~$ tar -zxvf mecab-ko-dic-1.6.1-20140814.tar.gz
u0017496@sys-87576:~$ cd mecab-ko-dic-1.6.1-20140814
u0017496@sys-87576:~/mecab-ko-dic-1.6.1-20140814$ ./autogen.sh
u0017496@sys-87576:~/mecab-ko-dic-1.6.1-20140814$ ./configure
u0017496@sys-87576:~/mecab-ko-dic-1.6.1-20140814$ make
u0017496@sys-87576:~/mecab-ko-dic-1.6.1-20140814$ sudo make install

u0017496@sys-87576:~/mecab-ko-dic-1.6.1-20140814$ python
Python 3.6.0 |Continuum Analytics, Inc.| (default, Mar 16 2017, 19:36:14)
[GCC 4.8.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from konlpy.tag import *
>>> Mecab('/usr/lib/mecab/dic/mecab-ko-dic')
<konlpy.tag._mecab.Mecab object at 0x3fffa01773c8>
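
Because Mecab fails when the dictionary is missing, it can help to locate the dictionary directory before constructing the tagger. The following is a sketch; `find_mecab_dic` and `make_mecab` are hypothetical helpers. The first candidate path is the one used above, while the second (/usr/local/lib/mecab/dic/mecab-ko-dic) is an assumption about another common install location.

```python
import os


def find_mecab_dic(candidates):
    """Return the first candidate directory that exists, or None."""
    for path in candidates:
        if os.path.isdir(path):
            return path
    return None


def make_mecab(candidates=("/usr/lib/mecab/dic/mecab-ko-dic",
                           "/usr/local/lib/mecab/dic/mecab-ko-dic")):
    """Create a Mecab tagger using the first dictionary directory found."""
    dicpath = find_mecab_dic(candidates)
    if dicpath is None:
        raise FileNotFoundError("mecab-ko-dic not found; install it first")
    # Imported lazily so the path check works even without konlpy installed
    from konlpy.tag import Mecab
    return Mecab(dicpath)
```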

It of course works the same way in the python2 virtual environment.

u0017496@sys-87576:~/mecab-ko-dic-1.6.1-20140814$ source activate py2k
(py2k) u0017496@sys-87576:~/mecab-ko-dic-1.6.1-20140814$ python
Python 2.7.13 |Anaconda 4.4.0 (64-bit)| (default, Mar 16 2017, 18:34:18)
[GCC 4.8.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://anaconda.org
>>> from konlpy.tag import *
>>> Mecab('/usr/lib/mecab/dic/mecab-ko-dic')
<konlpy.tag._mecab.Mecab instance at 0x3fff754e7cb0>
>>>


Q6. For OpenCV & GraphViz, if they can be installed without using conda install, will they be installed in a form that can be imported from both python2 and python3?

import cv2
import graphviz

A6. Yes, both work. Download an opencv package such as opencv-3.1.0-np112py27_2.tar.bz2 from https://repo.continuum.io/pkgs/free/linux-ppc64le and install it.

First, install and test the opencv package for python3.

u0017496@sys-87576:~$ wget https://repo.continuum.io/pkgs/free/linux-ppc64le/opencv-3.1.0-np112py36_2.tar.bz2
--2017-06-14 03:35:25--  https://repo.continuum.io/pkgs/free/linux-ppc64le/opencv-3.1.0-np112py36_2.tar.bz2
Resolving repo.continuum.io (repo.continuum.io)... 104.16.19.10, 104.16.18.10, 2400:cb00:2048:1::6810:120a, ...
Connecting to repo.continuum.io (repo.continuum.io)|104.16.19.10|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 13145265 (13M) [application/x-tar]
Saving to: ‘opencv-3.1.0-np112py36_2.tar.bz2’

opencv-3.1.0-np112py36_2.t 100%[=====================================>]  12.54M  9.02MB/s    in 1.4s

2017-06-14 03:35:26 (9.02 MB/s) - ‘opencv-3.1.0-np112py36_2.tar.bz2’ saved [13145265/13145265]


u0017496@sys-87576:~$ mkdir opencv-3.1.0-py36
u0017496@sys-87576:~$ cd opencv-3.1.0-py36
u0017496@sys-87576:~/opencv-3.1.0-py36$ tar -jxvf ../opencv-3.1.0-np112py36_2.tar.bz2

u0017496@sys-87576:~$ export PYTHONPATH=$PYTHONPATH:/home/u0017496/opencv-3.1.0-py36/lib/python3.6/site-packages
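
The PYTHONPATH export above has to match the interpreter version (py36 here, py27 in the python2 environment). As an alternative sketch, the matching directory can be derived and appended to sys.path at run time. The helper names are hypothetical, and the directory layout is assumed to be the one created by the mkdir and tar commands in this post.

```python
import sys


def opencv_site_packages(base_dir, version=None):
    """Build the site-packages path of the unpacked opencv package for a Python version."""
    major, minor = (version or sys.version_info)[:2]
    return "{0}/opencv-3.1.0-py{1}{2}/lib/python{1}.{2}/site-packages".format(
        base_dir.rstrip("/"), major, minor
    )


def add_opencv_to_path(base_dir):
    """Append the matching opencv directory to sys.path (same effect as the export)."""
    path = opencv_site_packages(base_dir)
    if path not in sys.path:
        sys.path.append(path)
    return path
```

Calling `add_opencv_to_path("/home/u0017496")` before `import cv2` selects the py36 or py27 directory according to the running interpreter.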

u0017496@sys-87576:~$ python
Python 3.6.0 |Continuum Analytics, Inc.| (default, Mar 16 2017, 19:36:14)
[GCC 4.8.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
>>> import graphviz
>>>

Next, install and test the opencv package for python2.

u0017496@sys-87576:~$ wget https://repo.continuum.io/pkgs/free/linux-ppc64le/opencv-3.1.0-np112py27_2.tar.bz2
--2017-06-14 03:35:38--  https://repo.continuum.io/pkgs/free/linux-ppc64le/opencv-3.1.0-np112py27_2.tar.bz2
Resolving repo.continuum.io (repo.continuum.io)... 104.16.18.10, 104.16.19.10, 2400:cb00:2048:1::6810:130a, ...
Connecting to repo.continuum.io (repo.continuum.io)|104.16.18.10|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 13147104 (13M) [application/x-tar]
Saving to: ‘opencv-3.1.0-np112py27_2.tar.bz2’

opencv-3.1.0-np112py27_2.t 100%[=====================================>]  12.54M  9.76MB/s    in 1.3s

2017-06-14 03:35:39 (9.76 MB/s) - ‘opencv-3.1.0-np112py27_2.tar.bz2’ saved [13147104/13147104]

u0017496@sys-87576:~$ mkdir opencv-3.1.0-py27
u0017496@sys-87576:~$ cd opencv-3.1.0-py27
u0017496@sys-87576:~/opencv-3.1.0-py27$ tar -jxvf ../opencv-3.1.0-np112py27_2.tar.bz2

u0017496@sys-87576:~$ source activate py2k

(py2k) u0017496@sys-87576:~$ export PYTHONPATH=$PYTHONPATH:/home/u0017496/opencv-3.1.0-py27/lib/python2.7/site-packages

(py2k) u0017496@sys-87576:~$ python
Python 2.7.13 |Anaconda 4.4.0 (64-bit)| (default, Mar 16 2017, 18:34:18)
[GCC 4.8.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://anaconda.org
>>> import cv2
>>> import graphviz
>>>

Tuesday, June 13, 2017

Manually installing packages missing from the anaconda package list for the ppc64le architecture (the xgboost example)

Continuum's miniconda now supports the ppc64le architecture as well. However, not every package is available yet. For example, if you check the URL below, you will find that some packages are not provided for ppc64le. A typical example is xgboost.

https://repo.continuum.io/pkgs/free/linux-ppc64le/

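However, the fact that a package is not listed there does not mean it cannot be used on ppc64le. It can be installed quite easily.

First, let's see what error occurs when xgboost is simply installed with pip as-is. Start by confirming that pip is the one provided by anaconda.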

u0017496@sys-87576:~$ which pip
/home/u0017496/miniconda3/bin/pip

Then try installing xgboost with this pip.

u0017496@sys-87576:~$ pip install xgboost
...
  Using cached xgboost-0.6a2.tar.gz
    Complete output from command python setup.py egg_info:
    rm -f -rf build build_plugin lib bin *~ */*~ */*/*~ */*/*/*~ */*.o */*/*.o */*/*/*.o xgboost
    g++ -std=c++0x -Wall -O3 -msse2  -Wno-unknown-pragmas -funroll-loops -Iinclude   -Idmlc-core/include -Irabit/include -fPIC -fopenmp -MM -MT build/logging.o src/logging.cc >build/logging.d
    g++ -std=c++0x -Wall -O3 -msse2  -Wno-unknown-pragmas -funroll-loops -Iinclude   -Idmlc-core/include -Irabit/include -fPIC -fopenmp -MM -MT build/learner.o src/learner.cc >build/learner.d
    g++ -std=c++0x -Wall -O3 -msse2  -Wno-unknown-pragmas -funroll-loops -Iinclude   -Idmlc-core/include -Irabit/include -fPIC -fopenmp -MM -MT build/common/common.o src/common/common.cc >build/common/common.d
    g++ -std=c++0x -Wall -O3 -msse2  -Wno-unknown-pragmas -funroll-loops -Iinclude   -Idmlc-core/include -Irabit/include -fPIC -fopenmp -MM -MT build/metric/metric.o src/metric/metric.cc >build/metric/metric.d
    g++: error: unrecognized command line option ‘-msse2’

As you can see, the problem is the option for SSE2 instructions, which exist only on the Intel x86 architecture. This is easily solved by downloading the source and running python setup.py directly.

Download the xgboost source with the pip download command as follows.

u0017496@sys-87576:~$ pip download -d "./" xgboost

Extract it and edit the relevant Makefiles. Simply removing -msse2 would be enough, but here we also take the opportunity to specify that the CPU architecture is POWER8.

u0017496@sys-87576:~$ tar -zxvf xgboost-0.6a2.tar.gz

u0017496@sys-87576:~$ cd xgboost-0.6a2

u0017496@sys-87576:~/xgboost-0.6a2$ vi ./xgboost/Makefile  
export CFLAGS=  -std=c++0x -Wall -O3 -mcpu=power8 -Wno-unknown-pragmas -funroll-loops -Iinclude $(ADD_CFLAGS) $(PLUGIN_CFLAGS)
#export CFLAGS=  -std=c++0x -Wall -O3 -msse2 -Wno-unknown-pragmas -funroll-loops -Iinclude $(ADD_CFLAGS) $(PLUGIN_CFLAGS)

u0017496@sys-87576:~/xgboost-0.6a2$ vi ./xgboost/dmlc-core/Makefile
export CFLAGS = -O3 -Wall -mcpu=power8 -Wno-unknown-pragmas -Iinclude  -std=c++0x
#export CFLAGS = -O3 -Wall -msse2  -Wno-unknown-pragmas -Iinclude  -std=c++0x

u0017496@sys-87576:~/xgboost-0.6a2$ vi ./xgboost/rabit/Makefile
export CFLAGS = -O3 -mcpu=power8 $(WARNFLAGS)
#export CFLAGS = -O3 -msse2 $(WARNFLAGS)
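
The edits above hard-code the POWER8 flag. If the same build files had to work on both x86 and ppc64le, the architecture-specific flag could be chosen from the machine type instead. This is only an illustrative sketch: the `ARCH_FLAGS` mapping and both helper names are hypothetical, and -mcpu=power8 assumes the target is a POWER8 or newer CPU.

```python
import platform

# Hypothetical mapping from machine type to the architecture-specific flag
ARCH_FLAGS = {
    "x86_64": "-msse2",
    "ppc64le": "-mcpu=power8",
}


def arch_flag(machine=None):
    """Return the flag for `machine` (default: this host), or '' if none applies."""
    return ARCH_FLAGS.get(machine or platform.machine(), "")


def rabit_cflags(machine=None):
    """Assemble a CFLAGS value shaped like the rabit Makefile line above."""
    parts = ["-O3", arch_flag(machine), "$(WARNFLAGS)"]
    return " ".join(part for part in parts if part)
```

For ppc64le this yields the same value as the edited rabit Makefile line, and on an architecture not in the mapping the flag is simply omitted.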

Then run the 'python setup.py install' command directly.

u0017496@sys-87576:~/xgboost-0.6a2$ python setup.py install
...
Searching for scipy==0.19.0
Best match: scipy 0.19.0
Adding scipy 0.19.0 to easy-install.pth file

Using /home/u0017496/miniconda3/lib/python3.6/site-packages
Searching for numpy==1.13.0
Best match: numpy 1.13.0
Adding numpy 1.13.0 to easy-install.pth file

Using /home/u0017496/miniconda3/lib/python3.6/site-packages
Finished processing dependencies for xgboost==0.6a2

As a result, it installed successfully. Checking the site-packages directory /home/u0017496/miniconda3/lib/python3.6/site-packages, which is on the Python path, shows that the corresponding egg has been created.

u0017496@sys-87576:~/xgboost-0.6a2$ ls /home/u0017496/miniconda3/lib/python3.6/site-packages | grep xgboost
xgboost-0.6a2-py3.6.egg

The conda list command also shows the package as having been installed by pip.

u0017496@sys-87576:~/xgboost-0.6a2$ conda list | grep xgboost
xgboost                   0.6a2                     <pip>

Uninstalling it with pip also works normally.

u0017496@sys-87576:~/xgboost-0.6a2$ pip uninstall xgboost
Uninstalling xgboost-0.6a2:
  /home/u0017496/miniconda3/lib/python3.6/site-packages/xgboost-0.6a2-py3.6.egg
Proceed (y/n)? y
  Successfully uninstalled xgboost-0.6a2