Merged
12 changes: 12 additions & 0 deletions llm/android/LlamaDemo/.gitignore
@@ -0,0 +1,12 @@
*.iml
.gradle
/local.properties
.idea
.DS_Store
/build
/captures
.externalNativeBuild
.cxx
local.properties
*.so
*.aar
170 changes: 170 additions & 0 deletions llm/android/LlamaDemo/README.md
@@ -0,0 +1,170 @@
# ExecuTorch Llama Android Demo App

This app serves as a valuable resource to inspire your creativity and provide foundational code that you can customize and adapt for your particular use case.

Please dive in and start exploring our demo app today! We look forward to any feedback and are excited to see your innovative ideas.


## Key Concepts
From this demo app, you will learn many key concepts such as:
* How to prepare Llama models, build the ExecuTorch library, and run model inference across delegates
* How to expose the ExecuTorch library through a JNI layer
* The current app-facing capabilities of ExecuTorch

The goal is for you to see the type of support ExecuTorch provides and feel comfortable with leveraging it for your use cases.

## Supported Models
The app supports the following models (availability varies by delegate):
* Llama 3.2 Quantized 1B/3B
* Llama 3.2 1B/3B in BF16
* Llama Guard 3 1B
* Llama 3.1 8B
* Llama 3 8B
* Llama 2 7B
* LLaVA-1.5 vision model (XNNPACK only)
* Qwen 3 0.6B, 1.7B, and 4B


## Building the APK
ExecuTorch currently supports three delegates. Once you have chosen a delegate, follow its README link below for complete end-to-end instructions covering environment setup, model export, and building the ExecuTorch libraries and app to run on device:

| Delegate | Resource |
| ------------- | ------------- |
| XNNPACK (CPU-based library) | [link](https://github.com/pytorch/executorch/blob/main/examples/demo-apps/android/LlamaDemo/docs/delegates/xnnpack_README.md) |
| QNN (Qualcomm AI Accelerators) | [link](https://github.com/pytorch/executorch/blob/main/examples/demo-apps/android/LlamaDemo/docs/delegates/qualcomm_README.md) |
| MediaTek (MediaTek AI Accelerators) | [link](https://github.com/pytorch/executorch/blob/main/examples/demo-apps/android/LlamaDemo/docs/delegates/mediatek_README.md) |


## How to Use the App

This section covers the main steps for using the app, along with code snippets of the ExecuTorch API.

We recommend Android Studio for loading the app, development, and running on device:
1. Open Android Studio and select "Open an existing Android Studio project" to open examples/demo-apps/android/LlamaDemo.
2. Run the app (^R). This builds and launches the app on the phone.

### Opening the App

Below are the UI features for the app.

Select the settings widget to get started with picking a model, its parameters, and any prompts.
<p align="center">
<img src="https://raw.githubusercontent.com/pytorch/executorch/refs/heads/main/docs/source/_static/img/opening_the_app_details.png" style="width:800px">
</p>



### Select Models and Parameters

Once you've selected the model, tokenizer, and model type, you are ready to click on "Load Model" to have the app load the model and go back to the main Chat activity.
<p align="center">
<img src="https://raw.githubusercontent.com/pytorch/executorch/refs/heads/main/docs/source/_static/img/settings_menu.png" style="width:300px">
</p>



Optional Parameters:
* Temperature: Defaults to 0. You can adjust the temperature for the model; the model reloads after any change.
* System Prompt: You can enter a system prompt without any formatting. For example, "you are a travel assistant" or "give me a response in a few sentences".
* User Prompt: For advanced users, you can manually edit the prompt by modifying `{{user prompt}}`, and you can also modify the special tokens. Once changed, go back to the main Chat activity to send.
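As a concrete illustration of how the system prompt, user prompt, and special tokens above fit together, here is a minimal sketch that assembles a Llama 3 style instruct prompt. The `PromptFormatter` class is hypothetical (the demo app has its own prompt handling); the special tokens follow the published Llama 3 chat template and would differ for other model families.

```java
// Hypothetical helper that builds a Llama 3 style instruct prompt from the
// system and user prompts entered in the Settings menu. The special tokens
// follow the Llama 3 chat template; adjust them for other model families.
class PromptFormatter {
  static String format(String systemPrompt, String userPrompt) {
    StringBuilder sb = new StringBuilder();
    sb.append("<|begin_of_text|>");
    if (systemPrompt != null && !systemPrompt.isEmpty()) {
      sb.append("<|start_header_id|>system<|end_header_id|>\n\n")
          .append(systemPrompt)
          .append("<|eot_id|>");
    }
    sb.append("<|start_header_id|>user<|end_header_id|>\n\n")
        .append(userPrompt)
        .append("<|eot_id|>")
        // Leave the assistant header open so the model continues from here.
        .append("<|start_header_id|>assistant<|end_header_id|>\n\n");
    return sb.toString();
  }
}
```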

#### ExecuTorch App API

```java
// Upon returning to the Main Chat Activity
mModule = new LlmModule(
    ModelUtils.getModelCategory(mCurrentSettingsFields.getModelType()),
    modelPath,
    tokenizerPath,
    temperature);
int loadResult = mModule.load();
```

* `modelCategory`: Indicates whether the model is text-only or vision
* `modelPath`: Path to the .pte file
* `tokenizerPath`: Path to the tokenizer file
* `temperature`: Model parameter that adjusts the randomness of the model's output
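To show how the `modelCategory` argument might be derived, here is a hedged sketch of a category lookup like the one `ModelUtils.getModelCategory()` performs. The constant values and the model-type string are assumptions for illustration only; consult `ModelUtils` in the demo app source for the real mapping.

```java
// Hypothetical sketch of the model-category lookup that feeds the first
// LlmModule constructor argument. The constants and the model-type string
// are illustrative assumptions; see ModelUtils in the demo app source.
class ModelCategory {
  static final int TEXT_MODEL = 1;
  static final int VISION_MODEL = 2;

  static int forModelType(String modelType) {
    // LLaVA-1.5 is the only vision model the app supports;
    // everything else is treated as text-only.
    return "LLAVA_1_5".equals(modelType) ? VISION_MODEL : TEXT_MODEL;
  }
}
```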


### User Prompt
Once the model has loaded successfully, enter any prompt and click the send (i.e., generate) button to send it to the model.
<p align="center">
<img src="https://raw.githubusercontent.com/pytorch/executorch/refs/heads/main/docs/source/_static/img/load_complete_and_start_prompt.png" style="width:300px">
</p>

You can also ask follow-up questions.
<p align="center">
<img src="https://raw.githubusercontent.com/pytorch/executorch/refs/heads/main/docs/source/_static/img/chat.png" style="width:300px">
</p>

#### ExecuTorch App API

```java
mModule.generate(prompt, sequence_length, MainActivity.this);
```
* `prompt`: User-formatted prompt
* `sequence_length`: Number of tokens to generate in response to the prompt
* `MainActivity.this`: Indicates that the callback functions (`onResult()`, `onStats()`) are implemented in this class
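Because `onResult()` delivers one token at a time, the callback class typically accumulates tokens into a buffer before displaying them. The sketch below defines a minimal stand-in for the callback interface to show this accumulation pattern; the real interface lives in the ExecuTorch Android package, so the types here are illustrative assumptions.

```java
// Sketch of the token-accumulation pattern used with a streaming callback.
// TokenCallback is a stand-in for the real ExecuTorch listener interface.
class StreamingCollector {
  interface TokenCallback {
    void onResult(String token);
  }

  private final StringBuilder response = new StringBuilder();

  // Returns a callback that appends each streamed token to the buffer.
  TokenCallback callback() {
    return token -> response.append(token);
  }

  String response() {
    return response.toString();
  }
}
```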

[*LLaVA-1.5: Only for XNNPACK delegate*]

For LLaVA-1.5, select the exported LLaVA .pte file and tokenizer file in the Settings menu and load the model. You can then send an image from your gallery, or take a live picture, along with a text prompt to the model.

<p align="center">
<img src="https://raw.githubusercontent.com/pytorch/executorch/refs/heads/main/docs/source/_static/img/llava_example.png" style="width:300px">
</p>


### Output Generated
Here is the model's complete response to the follow-up question.
<p align="center">
<img src="https://raw.githubusercontent.com/pytorch/executorch/refs/heads/main/docs/source/_static/img/chat_response.png" style="width:300px">
</p>

#### ExecuTorch App API

Ensure the callback class you passed to `mModule.generate()` implements the following functions. In this example, that class is `MainActivity`.
```java
@Override
public void onResult(String result) {
  // result contains one token of the response;
  // onResult is invoked repeatedly until the response is complete
}

@Override
public void onStats(String stats) {
  // stats is a JSON string; see extension/llm/stats.h for the field definitions
}

```
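Since `onStats()` receives a JSON string, the app needs to pull fields out of it before displaying anything. A production app would use the bundled Gson dependency; the self-contained sketch below uses a regex instead so it has no external dependencies. The field name `num_generated_tokens` is a placeholder assumption — consult `extension/llm/stats.h` for the real field names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Minimal sketch for extracting a numeric field from the stats JSON passed
// to onStats(). Field names here are placeholders; see extension/llm/stats.h
// for the real definitions. A production app would use Gson instead.
class StatsParser {
  // Returns the integer value of the given field, or -1 if it is absent.
  static long longField(String statsJson, String field) {
    Matcher m = Pattern
        .compile("\"" + Pattern.quote(field) + "\"\\s*:\\s*(-?\\d+)")
        .matcher(statsJson);
    return m.find() ? Long.parseLong(m.group(1)) : -1L;
  }
}
```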

## Instrumentation Test
You can run the instrumentation test as a sanity check. The test loads a model `.pte` file and a `tokenizer.bin` file
from `/data/local/tmp/llama`.

### Model preparation
From the ExecuTorch repository root, run:
```sh
curl -C - -Ls "https://huggingface.co/karpathy/tinyllamas/resolve/main/stories110M.pt" --output stories110M.pt
curl -C - -Ls "https://raw.githubusercontent.com/karpathy/llama2.c/master/tokenizer.model" --output tokenizer.model
# Create params.json
echo '{"dim": 768, "multiple_of": 32, "n_heads": 12, "n_layers": 12, "norm_eps": 1e-05, "vocab_size": 32000}' > params.json
python -m extension.llm.export.export_llm base.checkpoint=stories110M.pt base.params=params.json model.dtype_override="fp16" export.output_name=stories110m_h.pte model.use_kv_cache=True
python -m pytorch_tokenizers.tools.llama2c.convert -t tokenizer.model -o tokenizer.bin
```
### Push model
```sh
adb shell mkdir -p /data/local/tmp/llama
adb push stories110m_h.pte /data/local/tmp/llama
adb push tokenizer.bin /data/local/tmp/llama
```

### Run test
From `examples/demo-apps/android/LlamaDemo`, run:
```sh
./gradlew connectedAndroidTest
```

## Reporting Issues
If you encounter any bugs or issues while following this tutorial, please file an issue on [GitHub](https://github.com/pytorch/executorch/issues/new), or join our Discord [here](https://lnkd.in/gWCM4ViK).
94 changes: 94 additions & 0 deletions llm/android/LlamaDemo/SDK-quick-setup-guide.md
@@ -0,0 +1,94 @@
# Guide to set up Java/SDK/NDK for Android

Follow this doc if you haven't already set up Java, the Android SDK, and the NDK for Android
development. It provides a CLI tutorial; you can achieve the same thing through the
Android Studio GUI.

## Set up Java 17
1. Download the archive from the Oracle website.
Make sure you have read and agreed to the terms and conditions on the website before downloading.
```bash
export DEV_HOME=<path-to-dev>
cd $DEV_HOME
```
Linux:
```bash
curl https://download.oracle.com/java/17/archive/jdk-17.0.10_linux-x64_bin.tar.gz -o jdk-17.0.10.tar.gz
```
macOS:
```bash
curl https://download.oracle.com/java/17/archive/jdk-17.0.10_macos-aarch64_bin.tar.gz -o jdk-17.0.10.tar.gz
```
2. Extract the archive. The directory named `jdk-17.0.10` is the Java root directory.
```bash
tar xf jdk-17.0.10.tar.gz
```
3. Set `JAVA_HOME` and update `PATH`.

Linux:
```bash
export JAVA_HOME="$DEV_HOME"/jdk-17.0.10
export PATH="$JAVA_HOME/bin:$PATH"
```
macOS:
```bash
export JAVA_HOME="$DEV_HOME"/jdk-17.0.10.jdk/Contents/Home
export PATH="$JAVA_HOME/bin:$PATH"
```

Note: Oracle has tutorials for installing Java on
[Linux](https://docs.oracle.com/en/java/javase/17/install/installation-jdk-linux-platforms.html#GUID-4A6BD592-1840-4BB4-A758-4CD49E9EE88B)
and [macOS](https://docs.oracle.com/en/java/javase/17/install/installation-jdk-macos.html#GUID-E8A251B6-D9A9-4276-ABC8-CC0DAD62EA33).
Some Linux distributions ship a JDK package in their package manager. For example, Debian users can install
the `openjdk-17-jdk` package.

## Set up Android SDK/NDK
Android provides a command line tool, [sdkmanager](https://developer.android.com/tools/sdkmanager), which
helps users manage the SDK and other Android development tools.

1. Go to https://developer.android.com/studio and download the archive from the "Command line tools
only" section. Make sure you have read and agreed to the terms and conditions on the website.

Linux:
```bash
curl https://dl.google.com/android/repository/commandlinetools-linux-11076708_latest.zip -o commandlinetools.zip
```
macOS:
```bash
curl https://dl.google.com/android/repository/commandlinetools-mac-11076708_latest.zip -o commandlinetools.zip
```
2. Unzip.
```bash
unzip commandlinetools.zip
```
3. Specify a root for Android SDK. For example, we can put it under `$DEV_HOME/sdk`.

```bash
mkdir -p $DEV_HOME/sdk
export ANDROID_HOME="$(realpath $DEV_HOME/sdk)"
# Install SDK 34
./cmdline-tools/bin/sdkmanager --sdk_root="${ANDROID_HOME}" --install "platforms;android-34"
# Install NDK
./cmdline-tools/bin/sdkmanager --sdk_root="${ANDROID_HOME}" --install "ndk;26.3.11579264"
# The NDK root is then under `ndk/<version>`.
export ANDROID_NDK="$ANDROID_HOME/ndk/26.3.11579264"
```

### (Optional) Android Studio Setup
If you want to use Android Studio and have never set up Java/SDK/NDK before, or you
want Android Studio to use the newly installed ones, follow these steps to configure
it to use them.

Print these paths and copy them for use in Android Studio:
```bash
echo $ANDROID_HOME
echo $ANDROID_NDK
echo $JAVA_HOME
```

Open a project in Android Studio. In Project Structure (File -> Project
Structure, or `⌘;`) -> SDK Location:
* Set Android SDK Location to the path of `$ANDROID_HOME`
* Set Android NDK Location to the path of `$ANDROID_NDK`
* Set the JDK location (click the Gradle Settings link -> Gradle JDK -> Add JDK...) to the path of `$JAVA_HOME`
1 change: 1 addition & 0 deletions llm/android/LlamaDemo/app/.gitignore
@@ -0,0 +1 @@
/build
73 changes: 73 additions & 0 deletions llm/android/LlamaDemo/app/build.gradle.kts
@@ -0,0 +1,73 @@
/*
* Copyright (c) Meta Platforms, Inc. and affiliates.
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/

plugins {
id("com.android.application")
id("org.jetbrains.kotlin.android")
}

val qnnVersion: String? = project.findProperty("qnnVersion") as? String

android {
namespace = "com.example.executorchllamademo"
compileSdk = 34

defaultConfig {
applicationId = "com.example.executorchllamademo"
minSdk = 28
targetSdk = 33
versionCode = 1
versionName = "1.0"

testInstrumentationRunner = "androidx.test.runner.AndroidJUnitRunner"
vectorDrawables { useSupportLibrary = true }
externalNativeBuild { cmake { cppFlags += "" } }
}

buildTypes {
release {
isMinifyEnabled = false
proguardFiles(getDefaultProguardFile("proguard-android-optimize.txt"), "proguard-rules.pro")
}
}
compileOptions {
sourceCompatibility = JavaVersion.VERSION_1_8
targetCompatibility = JavaVersion.VERSION_1_8
}
kotlinOptions { jvmTarget = "1.8" }
buildFeatures { compose = true }
composeOptions { kotlinCompilerExtensionVersion = "1.4.3" }
packaging { resources { excludes += "/META-INF/{AL2.0,LGPL2.1}" } }
}

dependencies {
implementation("androidx.core:core-ktx:1.9.0")
implementation("androidx.lifecycle:lifecycle-runtime-ktx:2.6.1")
implementation("androidx.activity:activity-compose:1.7.0")
implementation(platform("androidx.compose:compose-bom:2023.03.00"))
implementation("androidx.compose.ui:ui")
implementation("androidx.compose.ui:ui-graphics")
implementation("androidx.compose.ui:ui-tooling-preview")
implementation("androidx.compose.material3:material3")
implementation("androidx.appcompat:appcompat:1.6.1")
implementation("androidx.camera:camera-core:1.3.0-rc02")
implementation("androidx.constraintlayout:constraintlayout:2.2.0-alpha12")
implementation("com.facebook.fbjni:fbjni:0.5.1")
implementation("com.google.code.gson:gson:2.8.6")
implementation("org.pytorch:executorch-android:1.0.0")
implementation("com.google.android.material:material:1.12.0")
implementation("androidx.activity:activity:1.9.0")
implementation("org.json:json:20250107")
testImplementation("junit:junit:4.13.2")
androidTestImplementation("androidx.test.ext:junit:1.1.5")
androidTestImplementation("androidx.test.espresso:espresso-core:3.5.1")
androidTestImplementation(platform("androidx.compose:compose-bom:2023.03.00"))
androidTestImplementation("androidx.compose.ui:ui-test-junit4")
debugImplementation("androidx.compose.ui:ui-tooling")
debugImplementation("androidx.compose.ui:ui-test-manifest")
}
21 changes: 21 additions & 0 deletions llm/android/LlamaDemo/app/proguard-rules.pro
@@ -0,0 +1,21 @@
# Add project specific ProGuard rules here.
# You can control the set of applied configuration files using the
# proguardFiles setting in build.gradle.
#
# For more details, see
# http://developer.android.com/guide/developing/tools/proguard.html

# If your project uses WebView with JS, uncomment the following
# and specify the fully qualified class name to the JavaScript interface
# class:
#-keepclassmembers class fqcn.of.javascript.interface.for.webview {
# public *;
#}

# Uncomment this to preserve the line number information for
# debugging stack traces.
#-keepattributes SourceFile,LineNumberTable

# If you keep the line number information, uncomment this to
# hide the original source file name.
#-renamesourcefileattribute SourceFile