Merged
12 changes: 12 additions & 0 deletions llm/android/LlamaDemo/.gitignore
@@ -0,0 +1,12 @@
*.iml
.gradle
/local.properties
.idea
.DS_Store
/build
/captures
.externalNativeBuild
.cxx
local.properties
*.so
*.aar
170 changes: 170 additions & 0 deletions llm/android/LlamaDemo/README.md
@@ -0,0 +1,170 @@
# ExecuTorch Llama Android Demo App

This app serves as a valuable resource to inspire your creativity and provide foundational code that you can customize and adapt for your particular use case.

Please dive in and start exploring our demo app today! We look forward to any feedback and are excited to see your innovative ideas.


## Key Concepts
From this demo app, you will learn many key concepts such as:
* How to prepare Llama models, build the ExecuTorch library, and run model inference across delegates
* How to expose the ExecuTorch library through a JNI layer
* The current app-facing capabilities of ExecuTorch

The goal is for you to see the type of support ExecuTorch provides and feel comfortable with leveraging it for your use cases.

## Supported Models
The app supports the following models (availability varies by delegate):
* Llama 3.2 Quantized 1B/3B
* Llama 3.2 1B/3B in BF16
* Llama Guard 3 1B
* Llama 3.1 8B
* Llama 3 8B
* Llama 2 7B
* LLaVA-1.5 vision model (XNNPACK only)
* Qwen 3 0.6B, 1.7B, and 4B


## Building the APK
ExecuTorch currently supports three delegates. Once you have chosen a delegate, follow its README link below for complete end-to-end instructions covering environment setup, model export, and building the ExecuTorch libraries and app to run on device:

| Delegate | Resource |
| ------------- | ------------- |
| XNNPACK (CPU-based library) | [link](https://github.com/pytorch/executorch/blob/main/examples/demo-apps/android/LlamaDemo/docs/delegates/xnnpack_README.md) |
| QNN (Qualcomm AI Accelerators) | [link](https://github.com/pytorch/executorch/blob/main/examples/demo-apps/android/LlamaDemo/docs/delegates/qualcomm_README.md) |
| MediaTek (MediaTek AI Accelerators) | [link](https://github.com/pytorch/executorch/blob/main/examples/demo-apps/android/LlamaDemo/docs/delegates/mediatek_README.md) |


## How to Use the App

This section covers the main steps for using the app, along with code snippets of the ExecuTorch API.

We recommend Android Studio for loading the app, development, and running on device:
1. Open Android Studio and select "Open an existing Android Studio project" to open examples/demo-apps/android/LlamaDemo.
2. Run the app (^R). This builds and launches the app on the phone.

### Opening the App

Below are the UI features for the app.

Select the settings widget to get started with picking a model, its parameters, and any prompts.
<p align="center">
<img src="https://raw.githubusercontent.com/pytorch/executorch/refs/heads/main/docs/source/_static/img/opening_the_app_details.png" style="width:800px">
</p>



### Select Models and Parameters

Once you've selected the model, tokenizer, and model type, you are ready to click on "Load Model" to have the app load the model and go back to the main Chat activity.
<p align="center">
<img src="https://raw.githubusercontent.com/pytorch/executorch/refs/heads/main/docs/source/_static/img/settings_menu.png" style="width:300px">
</p>



Optional Parameters:
* Temperature: Defaults to 0. You can adjust the temperature for the model; the model reloads after any change.
* System Prompt: You can enter a system prompt without any formatting. For example, "you are a travel assistant" or "give me a response in a few sentences".
* User Prompt: For advanced users, you can manually edit the prompt by modifying `{{user prompt}}`, and you can also modify the special tokens. Once changed, go back to the main Chat activity to send.
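As a concrete illustration of how the system prompt, user prompt, and special tokens above fit together, here is a minimal sketch that assembles a Llama 3 style instruct prompt. The `PromptFormatter` class is hypothetical (the demo app has its own prompt handling); the special tokens follow the published Llama 3 chat template and would differ for other model families.

```java
// Hypothetical helper that builds a Llama 3 style instruct prompt from the
// system and user prompts entered in the Settings menu. The special tokens
// follow the Llama 3 chat template; adjust them for other model families.
class PromptFormatter {
  static String format(String systemPrompt, String userPrompt) {
    StringBuilder sb = new StringBuilder();
    sb.append("<|begin_of_text|>");
    if (systemPrompt != null && !systemPrompt.isEmpty()) {
      sb.append("<|start_header_id|>system<|end_header_id|>\n\n")
          .append(systemPrompt)
          .append("<|eot_id|>");
    }
    sb.append("<|start_header_id|>user<|end_header_id|>\n\n")
        .append(userPrompt)
        .append("<|eot_id|>")
        // Leave the assistant header open so the model continues from here.
        .append("<|start_header_id|>assistant<|end_header_id|>\n\n");
    return sb.toString();
  }
}
```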

#### ExecuTorch App API

```java
// Upon returning to the Main Chat Activity
mModule = new LlmModule(
    ModelUtils.getModelCategory(mCurrentSettingsFields.getModelType()),
    modelPath,
    tokenizerPath,
    temperature);
int loadResult = mModule.load();
```

* `modelCategory`: Indicates whether the model is text-only or vision
* `modelPath`: Path to the .pte file
* `tokenizerPath`: Path to the tokenizer file
* `temperature`: Model parameter that adjusts the randomness of the model's output
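To show how the `modelCategory` argument might be derived, here is a hedged sketch of a category lookup like the one `ModelUtils.getModelCategory()` performs. The constant values and the model-type string are assumptions for illustration only; consult `ModelUtils` in the demo app source for the real mapping.

```java
// Hypothetical sketch of the model-category lookup that feeds the first
// LlmModule constructor argument. The constants and the model-type string
// are illustrative assumptions; see ModelUtils in the demo app source.
class ModelCategory {
  static final int TEXT_MODEL = 1;
  static final int VISION_MODEL = 2;

  static int forModelType(String modelType) {
    // LLaVA-1.5 is the only vision model the app supports;
    // everything else is treated as text-only.
    return "LLAVA_1_5".equals(modelType) ? VISION_MODEL : TEXT_MODEL;
  }
}
```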


### User Prompt
Once the model has loaded successfully, enter any prompt and click the send (i.e., generate) button to send it to the model.
<p align="center">
<img src="https://raw.githubusercontent.com/pytorch/executorch/refs/heads/main/docs/source/_static/img/load_complete_and_start_prompt.png" style="width:300px">
</p>

You can also ask follow-up questions.
<p align="center">
<img src="https://raw.githubusercontent.com/pytorch/executorch/refs/heads/main/docs/source/_static/img/chat.png" style="width:300px">
</p>

#### ExecuTorch App API

```java
mModule.generate(prompt, sequence_length, MainActivity.this);
```
* `prompt`: User-formatted prompt
* `sequence_length`: Number of tokens to generate in response to the prompt
* `MainActivity.this`: Indicates that the callback functions (`onResult()`, `onStats()`) are implemented in this class
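Because `onResult()` delivers one token at a time, the callback class typically accumulates tokens into a buffer before displaying them. The sketch below defines a minimal stand-in for the callback interface to show this accumulation pattern; the real interface lives in the ExecuTorch Android package, so the types here are illustrative assumptions.

```java
// Sketch of the token-accumulation pattern used with a streaming callback.
// TokenCallback is a stand-in for the real ExecuTorch listener interface.
class StreamingCollector {
  interface TokenCallback {
    void onResult(String token);
  }

  private final StringBuilder response = new StringBuilder();

  // Returns a callback that appends each streamed token to the buffer.
  TokenCallback callback() {
    return token -> response.append(token);
  }

  String response() {
    return response.toString();
  }
}
```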

[*LLaVA-1.5: Only for XNNPACK delegate*]

For LLaVA-1.5, select the exported LLaVA .pte file and tokenizer file in the Settings menu and load the model. You can then send an image from your gallery, or take a live picture, along with a text prompt to the model.

<p align="center">
<img src="https://raw.githubusercontent.com/pytorch/executorch/refs/heads/main/docs/source/_static/img/llava_example.png" style="width:300px">
</p>


### Output Generated
Here is the model's complete response to the follow-up question.
<p align="center">
<img src="https://raw.githubusercontent.com/pytorch/executorch/refs/heads/main/docs/source/_static/img/chat_response.png" style="width:300px">
</p>

#### ExecuTorch App API

Ensure the callback class you passed to `mModule.generate()` implements the following functions. In this example, that class is `MainActivity`.
```java
@Override
public void onResult(String result) {
  // result contains one token of the response;
  // onResult is invoked repeatedly until the response is complete
}

@Override
public void onStats(String stats) {
  // stats is a JSON string; see extension/llm/stats.h for the field definitions
}

```
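Since `onStats()` receives a JSON string, the app needs to pull fields out of it before displaying anything. A production app would use the bundled Gson dependency; the self-contained sketch below uses a regex instead so it has no external dependencies. The field name `num_generated_tokens` is a placeholder assumption — consult `extension/llm/stats.h` for the real field names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Minimal sketch for extracting a numeric field from the stats JSON passed
// to onStats(). Field names here are placeholders; see extension/llm/stats.h
// for the real definitions. A production app would use Gson instead.
class StatsParser {
  // Returns the integer value of the given field, or -1 if it is absent.
  static long longField(String statsJson, String field) {
    Matcher m = Pattern
        .compile("\"" + Pattern.quote(field) + "\"\\s*:\\s*(-?\\d+)")
        .matcher(statsJson);
    return m.find() ? Long.parseLong(m.group(1)) : -1L;
  }
}
```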

## Instrumentation Test
You can run the instrumentation test as a sanity check. The test loads a model `.pte` file and a `tokenizer.bin` file
from `/data/local/tmp/llama`.

### Model preparation
From the ExecuTorch repository root, run:
```sh
curl -C - -Ls "https://huggingface.co/karpathy/tinyllamas/resolve/main/stories110M.pt" --output stories110M.pt
curl -C - -Ls "https://raw.githubusercontent.com/karpathy/llama2.c/master/tokenizer.model" --output tokenizer.model
# Create params.json
echo '{"dim": 768, "multiple_of": 32, "n_heads": 12, "n_layers": 12, "norm_eps": 1e-05, "vocab_size": 32000}' > params.json
python -m extension.llm.export.export_llm base.checkpoint=stories110M.pt base.params=params.json model.dtype_override="fp16" export.output_name=stories110m_h.pte model.use_kv_cache=True
python -m pytorch_tokenizers.tools.llama2c.convert -t tokenizer.model -o tokenizer.bin
```
### Push model
```sh
adb shell mkdir -p /data/local/tmp/llama
adb push stories110m_h.pte /data/local/tmp/llama
adb push tokenizer.bin /data/local/tmp/llama
```

### Run test
From `examples/demo-apps/android/LlamaDemo`, run:
```sh
./gradlew connectedAndroidTest
```

## Reporting Issues
If you encounter any bugs or issues while following this tutorial, please file an issue on [GitHub](https://github.com/pytorch/executorch/issues/new), or join our Discord [here](https://lnkd.in/gWCM4ViK).
94 changes: 94 additions & 0 deletions llm/android/LlamaDemo/SDK-quick-setup-guide.md
@@ -0,0 +1,94 @@
# Guide to set up Java/SDK/NDK for Android

Follow this doc if you haven't already set up Java, the Android SDK, and the NDK for Android
development. It provides a CLI tutorial; you can achieve the same thing through the
Android Studio GUI.

## Set up Java 17
1. Download the archive from the Oracle website.
Make sure you have read and agreed to the terms and conditions on the website before downloading.
```bash
export DEV_HOME=<path-to-dev>
cd $DEV_HOME
```
Linux:
```bash
curl https://download.oracle.com/java/17/archive/jdk-17.0.10_linux-x64_bin.tar.gz -o jdk-17.0.10.tar.gz
```
macOS:
```bash
curl https://download.oracle.com/java/17/archive/jdk-17.0.10_macos-aarch64_bin.tar.gz -o jdk-17.0.10.tar.gz
```
2. Extract the archive. The directory named `jdk-17.0.10` is the Java root directory.
```bash
tar xf jdk-17.0.10.tar.gz
```
3. Set `JAVA_HOME` and update `PATH`.

Linux:
```bash
export JAVA_HOME="$DEV_HOME"/jdk-17.0.10
export PATH="$JAVA_HOME/bin:$PATH"
```
macOS:
```bash
export JAVA_HOME="$DEV_HOME"/jdk-17.0.10.jdk/Contents/Home
export PATH="$JAVA_HOME/bin:$PATH"
```

Note: Oracle has tutorials for installing Java on
[Linux](https://docs.oracle.com/en/java/javase/17/install/installation-jdk-linux-platforms.html#GUID-4A6BD592-1840-4BB4-A758-4CD49E9EE88B)
and [macOS](https://docs.oracle.com/en/java/javase/17/install/installation-jdk-macos.html#GUID-E8A251B6-D9A9-4276-ABC8-CC0DAD62EA33).
Some Linux distributions ship a JDK package in their package manager. For example, Debian users can install
the `openjdk-17-jdk` package.

## Set up Android SDK/NDK
Android provides a command line tool, [sdkmanager](https://developer.android.com/tools/sdkmanager), which
helps users manage the SDK and other Android development tools.

1. Go to https://developer.android.com/studio and download the archive from the "Command line tools
only" section. Make sure you have read and agreed to the terms and conditions on the website.

Linux:
```bash
curl https://dl.google.com/android/repository/commandlinetools-linux-11076708_latest.zip -o commandlinetools.zip
```
macOS:
```bash
curl https://dl.google.com/android/repository/commandlinetools-mac-11076708_latest.zip -o commandlinetools.zip
```
2. Unzip.
```bash
unzip commandlinetools.zip
```
3. Specify a root for Android SDK. For example, we can put it under `$DEV_HOME/sdk`.

```bash
mkdir -p $DEV_HOME/sdk
export ANDROID_HOME="$(realpath $DEV_HOME/sdk)"
# Install SDK 34
./cmdline-tools/bin/sdkmanager --sdk_root="${ANDROID_HOME}" --install "platforms;android-34"
# Install NDK
./cmdline-tools/bin/sdkmanager --sdk_root="${ANDROID_HOME}" --install "ndk;26.3.11579264"
# The NDK root is then under `ndk/<version>`.
export ANDROID_NDK="$ANDROID_HOME/ndk/26.3.11579264"
```

### (Optional) Android Studio Setup
If you want to use Android Studio and have never set up Java/SDK/NDK before, or you
want Android Studio to use the newly installed ones, follow these steps to configure
it to use them.

Print these paths and copy them for use in Android Studio:
```bash
echo $ANDROID_HOME
echo $ANDROID_NDK
echo $JAVA_HOME
```

Open a project in Android Studio. In Project Structure (File -> Project
Structure, or `⌘;`) -> SDK Location:
* Set Android SDK Location to the path of `$ANDROID_HOME`
* Set Android NDK Location to the path of `$ANDROID_NDK`
* Set the JDK location (click the Gradle Settings link -> Gradle JDK -> Add JDK...) to the path of `$JAVA_HOME`
1 change: 1 addition & 0 deletions llm/android/LlamaDemo/app/.gitignore
@@ -0,0 +1 @@
/build
73 changes: 73 additions & 0 deletions llm/android/LlamaDemo/app/build.gradle.kts
@@ -0,0 +1,73 @@
/*
* Copyright (c) Meta Platforms, Inc. and affiliates.
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/

plugins {
id("com.android.application")
id("org.jetbrains.kotlin.android")
}

val qnnVersion: String? = project.findProperty("qnnVersion") as? String

android {
namespace = "com.example.executorchllamademo"
compileSdk = 34

defaultConfig {
applicationId = "com.example.executorchllamademo"
minSdk = 28
targetSdk = 33
versionCode = 1
versionName = "1.0"

testInstrumentationRunner = "androidx.test.runner.AndroidJUnitRunner"
vectorDrawables { useSupportLibrary = true }
externalNativeBuild { cmake { cppFlags += "" } }
}

buildTypes {
release {
isMinifyEnabled = false
proguardFiles(getDefaultProguardFile("proguard-android-optimize.txt"), "proguard-rules.pro")
}
}
compileOptions {
sourceCompatibility = JavaVersion.VERSION_1_8
targetCompatibility = JavaVersion.VERSION_1_8
}
kotlinOptions { jvmTarget = "1.8" }
buildFeatures { compose = true }
composeOptions { kotlinCompilerExtensionVersion = "1.4.3" }
packaging { resources { excludes += "/META-INF/{AL2.0,LGPL2.1}" } }
}

dependencies {
implementation("androidx.core:core-ktx:1.9.0")
implementation("androidx.lifecycle:lifecycle-runtime-ktx:2.6.1")
implementation("androidx.activity:activity-compose:1.7.0")
implementation(platform("androidx.compose:compose-bom:2023.03.00"))
implementation("androidx.compose.ui:ui")
implementation("androidx.compose.ui:ui-graphics")
implementation("androidx.compose.ui:ui-tooling-preview")
implementation("androidx.compose.material3:material3")
implementation("androidx.appcompat:appcompat:1.6.1")
implementation("androidx.camera:camera-core:1.3.0-rc02")
implementation("androidx.constraintlayout:constraintlayout:2.2.0-alpha12")
implementation("com.facebook.fbjni:fbjni:0.5.1")
implementation("com.google.code.gson:gson:2.8.6")
implementation("org.pytorch:executorch-android:1.0.0")
implementation("com.google.android.material:material:1.12.0")
implementation("androidx.activity:activity:1.9.0")
implementation("org.json:json:20250107")
testImplementation("junit:junit:4.13.2")
androidTestImplementation("androidx.test.ext:junit:1.1.5")
androidTestImplementation("androidx.test.espresso:espresso-core:3.5.1")
androidTestImplementation(platform("androidx.compose:compose-bom:2023.03.00"))
androidTestImplementation("androidx.compose.ui:ui-test-junit4")
debugImplementation("androidx.compose.ui:ui-tooling")
debugImplementation("androidx.compose.ui:ui-test-manifest")
}
21 changes: 21 additions & 0 deletions llm/android/LlamaDemo/app/proguard-rules.pro
@@ -0,0 +1,21 @@
# Add project specific ProGuard rules here.
# You can control the set of applied configuration files using the
# proguardFiles setting in build.gradle.
#
# For more details, see
# http://developer.android.com/guide/developing/tools/proguard.html

# If your project uses WebView with JS, uncomment the following
# and specify the fully qualified class name to the JavaScript interface
# class:
#-keepclassmembers class fqcn.of.javascript.interface.for.webview {
# public *;
#}

# Uncomment this to preserve the line number information for
# debugging stack traces.
#-keepattributes SourceFile,LineNumberTable

# If you keep the line number information, uncomment this to
# hide the original source file name.
#-renamesourcefileattribute SourceFile