20 changes: 12 additions & 8 deletions README.md
@@ -4,15 +4,19 @@

This repository contains examples of Docker images that are valid custom images for KernelGateway Apps in SageMaker Studio. These custom images enable you to bring your own packages, files, and kernels for use with notebooks, terminals, and interactive consoles within SageMaker Studio.

You can find more information about using Custom Images in the [SageMaker Developer Guide](https://docs.aws.amazon.com/sagemaker/latest/dg/studio-byoi.html).

### Examples

- [echo-kernel-image](examples/echo-kernel-image) - Uses the echo_kernel from Jupyter as a "Hello World" introduction into writing custom KernelGateway images.
- [javascript-tf-image](examples/javascript-tf-image) - [tslab](https://www.npmjs.com/package/tslab)-based kernels for JavaScript or TypeScript, including [TensorFlow.js](https://www.tensorflow.org/js) and CUDA GPU libraries.
- [jupyter-docker-stacks-julia-image](examples/jupyter-docker-stacks-julia-image) - Leverages the Data Science image from Jupyter Docker Stacks to add a Julia kernel.
- [r-image](examples/r-image) - Contains the `ir` kernel and a selection of R packages, along with the AWS Python SDK (boto3) and the SageMaker Python SDK, which can be used from R via `reticulate`.
- [rapids-image](examples/rapids-image) - Uses the official rapids.ai image from Docker Hub. Use with a GPU instance in Studio.
- [scala-image](examples/scala-image) - Adds a Scala kernel based on [Almond Scala Kernel](https://almond.sh/).
- [tf2.3-image](examples/tf23-image) - Uses the official TensorFlow 2.3 image from DockerHub and demonstrates bundling custom files along with the image.

### One-time setup

All examples share a one-time setup step to create an Amazon ECR repository.

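A sketch of that setup, assuming the shared `smstudio-custom` repository name used by the examples' push commands (set `REGION` to your Studio domain's region):

```shell
REGION=<aws-region>

# One-time: create the ECR repository that the example images are pushed to
aws --region ${REGION} ecr create-repository \
    --repository-name smstudio-custom
```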
@@ -28,4 +32,4 @@
See [DEVELOPMENT.md](DEVELOPMENT.md)

### License

This sample code is licensed under the MIT-0 License. See the LICENSE file.
195 changes: 195 additions & 0 deletions examples/javascript-tf-image/Dockerfile
@@ -0,0 +1,195 @@
# A CUDA-capable, tslab-based JS/TS kernel container for SageMaker Studio, including TensorFlow.js.
#
# Python & CUDA configuration with inspiration from the AWS TensorFlow Deep Learning containers, e.g:
# https://github.com/aws/deep-learning-containers/blob/master/tensorflow/training/docker/2.4/py3/cu110/Dockerfile.gpu
#
# Use with Jupyter kernel 'jslab' or 'tslab'; user config as per NB_UID/NB_GID; home folder /home/sagemaker-user
FROM nvidia/cuda:11.0-base-ubuntu18.04

ARG NB_USER="sagemaker-user"
ARG NB_UID="1000"
ARG NB_GID="100"

ARG NODEJS_VERSION=14.x
ARG PYTHON_VERSION=3.9.4
ARG PYTHON=python3.9
ARG PYTHON_PIP=python3-pip
ARG PIP=pip3

# Prevent setup prompts hanging on user input in our non-interactive environment:
ENV DEBIAN_FRONTEND=noninteractive
ENV DEBCONF_NONINTERACTIVE_SEEN=true

# Python config for logging, IO, etc:
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
ENV PYTHONIOENCODING=UTF-8
ENV LANG=C.UTF-8
ENV LC_ALL=C.UTF-8

# Optimizing TF for Intel/MKL, as per:
# https://software.intel.com/content/www/us/en/develop/articles/maximize-tensorflow-performance-on-cpu-considerations-and-recommendations-for-inference.html
# (May not be relevant for TensorFlow.js?)
ENV KMP_AFFINITY=granularity=fine,compact,1,0
ENV KMP_BLOCKTIME=1
ENV KMP_SETTINGS=0
ENV MANUAL_BUILD=0

USER root
WORKDIR /root

# Set up the NB user with root privileges.
RUN apt-get update && \
apt-get install -y sudo && \
useradd -m -s /bin/bash -N -u $NB_UID $NB_USER && \
chmod g+w /etc/passwd && \
echo "${NB_USER} ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers && \
# Prevent apt-get cache from being persisted to this layer.
rm -rf /var/lib/apt/lists/*

RUN apt-get update \
&& apt-get install -y --no-install-recommends --allow-unauthenticated \
ca-certificates \
cuda-command-line-tools-11-0 \
cuda-cudart-dev-11-0 \
libcufft-dev-11-0 \
libcurand-dev-11-0 \
libcusolver-dev-11-0 \
libcusparse-dev-11-0 \
curl \
libcudnn8=8.0.5.39-1+cuda11.0 \
# TensorFlow doesn't require libnccl anymore but Open MPI still depends on it
libnccl2=2.7.8-1+cuda11.0 \
libgomp1 \
libnccl-dev=2.7.8-1+cuda11.0 \
libfreetype6-dev \
libhdf5-serial-dev \
liblzma-dev \
libtemplate-perl \
libzmq3-dev \
git \
unzip \
wget \
libtool \
libssl1.1 \
openssl \
build-essential \
zlib1g-dev \
&& apt-get update \
&& apt-get install -y --no-install-recommends --allow-unauthenticated \
libcublas-11-0=11.2.0.252-1 \
libcublas-dev-11-0=11.2.0.252-1 \
# The 'apt-get install' of nvinfer-runtime-trt-repo-ubuntu1804-5.0.2-ga-cuda10.0
# adds a new list which contains libnvinfer library, so it needs another
# 'apt-get update' to retrieve that list before it can actually install the
# library.
# We don't install libnvinfer-dev since we don't need to build against TensorRT,
# and libnvinfer4 doesn't contain libnvinfer.a static library.
# nvinfer-runtime-trt-repo doesn't have a 1804-cuda10.1 version yet. see:
# https://developer.download.nvidia.cn/compute/machine-learning/repos/ubuntu1804/x86_64/
&& apt-get update && apt-get install -y --no-install-recommends --allow-unauthenticated \
nvinfer-runtime-trt-repo-ubuntu1804-5.0.2-ga-cuda10.0 \
&& apt-get update && apt-get install -y --no-install-recommends --allow-unauthenticated \
libnvinfer7=7.1.3-1+cuda11.0 \
&& rm -rf /var/lib/apt/lists/*

# Set default NCCL parameters
RUN echo NCCL_DEBUG=INFO >> /etc/nccl.conf

# /usr/local/lib/libpython* needs to be accessible for dynamic linking
ENV LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH

RUN apt-get update \
&& apt-get install -y --no-install-recommends \
libbz2-dev \
libc6-dev \
libffi-dev \
libgdbm-dev \
libncursesw5-dev \
libreadline-gplv2-dev \
libsqlite3-dev \
libssl-dev \
tk-dev \
&& rm -rf /var/lib/apt/lists/* \
&& apt-get clean

# Install specific Python version:
RUN wget https://www.python.org/ftp/python/$PYTHON_VERSION/Python-$PYTHON_VERSION.tgz \
&& tar -xvf Python-$PYTHON_VERSION.tgz \
&& cd Python-$PYTHON_VERSION \
&& ./configure --enable-shared && make && make install \
&& rm -rf ../Python-$PYTHON_VERSION*

RUN ${PIP} --no-cache-dir install --upgrade \
pip \
setuptools

# Provide a "python" binary for any tools that need it:
RUN ln -s $(which ${PYTHON}) /usr/local/bin/python \
&& ln -s $(which ${PIP}) /usr/bin/pip

# Install specific NodeJS version:
RUN curl --silent --location https://deb.nodesource.com/setup_$NODEJS_VERSION | bash -
RUN apt-get install --yes nodejs

# Optional Python packages for AWS/SageMaker/DataScience/C++ in case they're helpful:
RUN ${PIP} install --no-cache-dir \
pybind11 \
cmake==3.18.2.post1 \
# (Numpy & Pandas needed for SageMaker SDK)
numpy==1.20.0 \
pandas==1.2.4 \
# python-dateutil==2.8.1 to satisfy botocore associated with latest awscli
python-dateutil==2.8.1 \
# install PyYAML>=5.4 to avoid conflict with latest awscli
"pyYAML>=5.4,<5.5" \
requests==2.25.1 \
"awscli<2" \
"sagemaker>=2,<3" \
sagemaker-experiments==0.* \
smclarify \
smdebug==1.0.8

# More setup of optional Python packages:
ENV CPATH="/usr/local/lib/python3.9/dist-packages/pybind11/include/"
RUN apt-get update && apt-get -y install cmake protobuf-compiler

# Required Python packages:
RUN ${PIP} install --no-cache-dir \
# tslab brings a separate kernel, but uses Python during setup so needs ipykernel to be present:
"ipykernel>=5,<6" \
&& ${PYTHON} -m ipykernel install --sys-prefix

# Install NodeJS libraries:
# `npm install -g` pushes global installs to /usr/lib/node_modules in this case, which tslab doesn't seem
# able to resolve per the issue below - even if we `ENV NODE_PATH=/usr/lib/node_modules`... So instead,
# we'll install central/kernel-provided libs non-"globally" at the filesystem root:
WORKDIR /
RUN npm install \
aws-sdk@2 \
# (For performance, use -node-gpu where you can, else -node, else tfjs)
@tensorflow/[email protected] \
@tensorflow/[email protected] \
@tensorflow/[email protected] \
# tslab supports both TypeScript and JavaScript:
[email protected]

# The `tslab` kernel provider should be installed globally, and then hooked in to Jupyter:
RUN npm install -g [email protected] \
&& tslab install

# Now final user setup:
USER $NB_UID

# Set up user env vars:
# (Bash default shell gives a better Jupyter terminal UX than `sh`)
ENV SHELL=/bin/bash \
NB_USER=$NB_USER \
NB_UID=$NB_UID \
NB_GID=$NB_GID \
HOME=/home/$NB_USER

WORKDIR $HOME

# SageMaker will override the entrypoint when running in context - so just set bash for debugging:
CMD ["/bin/bash"]
107 changes: 107 additions & 0 deletions examples/javascript-tf-image/README.md
@@ -0,0 +1,107 @@
## JavaScript / TypeScript Image with TensorFlow.js

### Overview

> NOTE: This Dockerfile installs dependencies that may be licensed under copyleft licenses such as GPLv3. You should review the license terms and make sure they are acceptable for your use case before proceeding and downloading this image.

A SageMaker Studio-compatible notebook kernel image for JavaScript or TypeScript, based on [tslab](https://www.npmjs.com/package/tslab).

This example:

- Derives from [nvidia/cuda](https://hub.docker.com/r/nvidia/cuda) images (as some of the [AWS Deep Learning Containers](https://github.com/aws/deep-learning-containers) do), for GPU driver support
- Includes [TensorFlow.js](https://www.tensorflow.org/js) and the [AWS SDK for JavaScript](https://aws.amazon.com/sdk-for-javascript/)
- Packages some additional Python-oriented AWS and SageMaker utilities including the [AWS CLI](https://aws.amazon.com/cli/), and [SageMaker SDK for Python](https://sagemaker.readthedocs.io/en/stable/)


### Building the image

Build the Docker image and push to Amazon ECR.

> ⏰ **Note:** This image can take several minutes to build Python components from source, and typically reaches several GB in size.

```bash
# Modify these as required. The Docker registry endpoint can be tuned based on your current region from https://docs.aws.amazon.com/general/latest/gr/ecr.html#ecr-docker-endpoints
REGION=<aws-region>
ACCOUNT_ID=<account-id>
IMAGE_NAME=custom-jsts

aws --region ${REGION} ecr get-login-password | docker login --username AWS --password-stdin ${ACCOUNT_ID}.dkr.ecr.${REGION}.amazonaws.com/smstudio-custom
docker build . -t ${IMAGE_NAME} -t ${ACCOUNT_ID}.dkr.ecr.${REGION}.amazonaws.com/smstudio-custom:${IMAGE_NAME}
```

```bash
docker push ${ACCOUNT_ID}.dkr.ecr.${REGION}.amazonaws.com/smstudio-custom:${IMAGE_NAME}
```


### Using with SageMaker Studio

Once the image is pushed to Amazon ECR, you can attach it to your SageMaker Studio domain via either the console UI or the CLI. See the [SageMaker Developer Guide](https://docs.aws.amazon.com/sagemaker/latest/dg/studio-byoi-create.html) for more details.

The tslab installation in this image exports **two Jupyter kernels** you can use:

- `tslab`: For working in [TypeScript](https://www.typescriptlang.org/)
- `jslab`: For standard JavaScript

Both kernel options see the same globally-pre-installed npm packages, and use the same SageMaker Studio filesystem configuration (user, group, folder) as shown in [app-image-config-input.json](app-image-config-input.json).
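If you want to sanity-check which kernels the image exports before registering it, one option (a sketch, assuming the image was built and tagged `${IMAGE_NAME}` as in the build step above) is to list the kernelspecs inside the container:

```shell
# List the Jupyter kernelspecs registered inside the built image;
# 'jslab' and 'tslab' should both appear in the output.
docker run --rm ${IMAGE_NAME} jupyter kernelspec list
```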

At the time of writing, SageMaker Studio does not fully support multi-kernel images, so if you want to expose both JS and TS options to users in your domain, you'll need to create **two** ["SageMaker Images"](https://console.aws.amazon.com/sagemaker/home?#/images) referencing the same [ECR](https://console.aws.amazon.com/ecr/repositories/private/) URI.

For example, to create the SageMaker Image via AWS CLI:

```bash
# IAM Role in your account to be used for the SageMaker Image setup process:
ROLE_ARN=<role-arn>

# You may want an alternative SM_IMAGE_NAME if registering two SageMaker Images against the same ECR image URI:
SM_IMAGE_NAME=${IMAGE_NAME}

aws --region ${REGION} sagemaker create-image \
--image-name ${SM_IMAGE_NAME} \
--role-arn ${ROLE_ARN}

aws --region ${REGION} sagemaker create-image-version \
--image-name ${SM_IMAGE_NAME} \
--base-image "${ACCOUNT_ID}.dkr.ecr.${REGION}.amazonaws.com/smstudio-custom:${IMAGE_NAME}"

# Verify the image-version is created successfully.
# Do NOT proceed if image-version is in CREATE_FAILED state or in any other state apart from CREATED.
aws --region ${REGION} sagemaker describe-image-version --image-name ${SM_IMAGE_NAME}
```

...and then configure the container runtime settings in SageMaker:

```bash
# TODO: Edit the JSON to point to whichever of 'tslab' or 'jslab' kernel you intend to use
aws --region ${REGION} sagemaker create-app-image-config --cli-input-json file://app-image-config-input.json
```

The final step is to attach your SageMaker Custom Image(s) to your SageMaker Studio domain. You can do this from the [AWS Console for SageMaker Studio](https://console.aws.amazon.com/sagemaker/home?#/studio), or using the `aws sagemaker update-domain` [command](https://docs.aws.amazon.com/cli/latest/reference/sagemaker/update-domain.html) in the AWS CLI.
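For example, a CLI sketch of the attachment step might look like the following, assuming the `custom-jsts` SageMaker Image and `custom-ts` AppImageConfig names used earlier (replace the domain ID with your own):

```shell
DOMAIN_ID=<domain-id>

aws --region ${REGION} sagemaker update-domain \
    --domain-id ${DOMAIN_ID} \
    --default-user-settings '{
        "KernelGatewayAppSettings": {
            "CustomImages": [{
                "ImageName": "custom-jsts",
                "AppImageConfigName": "custom-ts"
            }]
        }
    }'
```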


### Further information

#### Magics and package installations

SageMaker Python users may be used to installing packages inline on kernels using `!pip install ...` commands.

[Magics](https://ipython.readthedocs.io/en/stable/interactive/magics.html) (like `%%bash` or `!` for running shell scripts) are generally specific to the IPython (Python) kernel and may not be available in other implementations (like `tslab` used here).

See [this discussion](https://github.com/yunabe/tslab/issues/35) for alternatives and roadmap for running shell commands like `npm install` in tslab.

You could, for example, run a cell like:

```js
const { execSync } = require("child_process");
// stdout from the shell command is returned as a string:
console.log(execSync("npm install lodash", { encoding: "utf-8" }));
```

#### A note on GPU-accelerated notebooks

This kernel includes CUDA libraries in case you want to experiment with GPU-accelerated instance types like `ml.g4dn.xlarge` in notebooks.

**However**, remember that a general best practice is to **package your high-resource code as SageMaker Jobs** (such as Processing, Training, or Batch Transform jobs) and keep your notebook environment resources modest (such as an `ml.t3.medium`). Working with SageMaker Jobs early in the build process can help:

- Optimize infrastructure costs (since these jobs spin up and release their infrastructure on-demand)
- Improve experiment tracking (since the SageMaker APIs automatically store history of training job inputs, parameters, metrics, and so on)
- Accelerate the path to production (for example, by training models in container environments that are already set up for inference deployment)
16 changes: 16 additions & 0 deletions examples/javascript-tf-image/app-image-config-input.json
@@ -0,0 +1,16 @@
{
"AppImageConfigName": "custom-ts",
"KernelGatewayImageConfig": {
"KernelSpecs": [
{
"Name": "tslab",
"DisplayName": "TensorFlow.js (TypeScript)"
}
],
"FileSystemConfig": {
"MountPath": "/home/sagemaker-user",
"DefaultUid": 1000,
"DefaultGid": 100
}
}
}