If you want to run Gemma 4 offline on your phone, this guide walks you through the full process from setup to practical usage.
Step 1: Get the App
Google AI Edge Gallery is currently not available on Google Play, so you need to install it via APK sideloading.
On your Android device, go to:
Settings -> Apps -> Special app access -> Install unknown apps
Then:
- Find your browser (for example, Chrome or Firefox) and enable “Allow from this source.”
- Open the Google AI Edge Gallery GitHub Releases page in your mobile browser.
- Download the latest .apk package.
- After the download completes, open the file from notifications or your file manager and follow the prompts.
With a stable connection, this step usually takes around 2 minutes.
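If you would rather locate the latest .apk from a computer before moving it to your phone, the lookup can be sketched in Python using GitHub's public REST API. This is a rough sketch, not the app's official install path: the repository slug `google-ai-edge/gallery` is an assumption you should verify on GitHub, and the helper names `fetch_latest_release` and `find_apk_assets` are made up for illustration.

```python
import json
import urllib.request

# Assumption: the app's repository slug. Verify it on GitHub before relying on it.
ASSUMED_REPO = "google-ai-edge/gallery"


def fetch_latest_release(repo: str = ASSUMED_REPO) -> dict:
    """Fetch metadata for the latest release via the GitHub REST API."""
    url = f"https://api.github.com/repos/{repo}/releases/latest"
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def find_apk_assets(release: dict) -> list[tuple[str, str]]:
    """Return (file name, download URL) pairs for every .apk asset in a release."""
    return [
        (asset["name"], asset["browser_download_url"])
        for asset in release.get("assets", [])
        if asset["name"].lower().endswith(".apk")
    ]
```

Calling `find_apk_assets(fetch_latest_release())` requires a network connection and returns an empty list if the release has no .apk attached.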
Step 2: Open the App and Grant Permissions
When you first open AI Edge Gallery, it will request storage permission to save model files. Allow it; without this permission, the app cannot download or load models.
You will typically see these sections on the home screen:
- Ask Image: Vision tasks (describe images, answer questions about photos)
- AI Chat: Standard text chat
- Summarize: Paste text and generate summaries
- Smart Reply: Generate reply suggestions
For most users, AI Chat is the primary entry point.
Step 3: Download a Gemma 4 Model
- Enter AI Chat.
- Tap Get Models when prompted.
- Choose a Gemma 4 model from the list (model size is shown).
- Pick based on your device capability; if your phone has 8GB RAM, start with Gemma 4 4B.
- Tap Download and let it run in the background.
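The "pick based on your device capability" step can be sketched as a small rule of thumb in Python. Everything here is an assumption for illustration: the model names and sizes are placeholders (read the real sizes from the app's model list), and the 50% RAM budget is an arbitrary starting point, not a figure from the app.

```python
def pick_model(models, ram_gb, budget_fraction=0.5):
    """Return the name of the largest model whose file fits the RAM budget.

    models: list of (name, size_gb) pairs, as read from the app's model list.
    budget_fraction: share of RAM allowed for the model file (assumed value).
    Returns None when no model fits.
    """
    if ram_gb <= 0 or not 0 < budget_fraction <= 1:
        raise ValueError("ram_gb must be positive and budget_fraction in (0, 1]")
    budget_gb = ram_gb * budget_fraction
    fitting = [(size, name) for name, size in models if size <= budget_gb]
    if not fitting:
        return None
    # max() compares by size first, so the largest fitting model wins.
    return max(fitting)[1]


# Placeholder entries only; these are not real Gemma model sizes.
HYPOTHETICAL_MODELS = [
    ("small-model", 1.5),
    ("medium-model", 3.0),
    ("large-model", 6.0),
]

print(pick_model(HYPOTHETICAL_MODELS, ram_gb=8))  # prints "medium-model"
```

If the first model you try loads slowly or the app closes under memory pressure, drop to the next smaller entry.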
Note: Larger models take longer to download. You can download multiple models and switch between them later. Downloaded models stay on your device, so you do not need to re-download them.
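To get a feel for how download time scales with model size, here is a minimal Python estimate. It assumes sizes in decimal gigabytes and a steady connection speed in megabits per second; real downloads vary, and the 3 GB and 40 Mbps inputs are hypothetical examples rather than actual model sizes.

```python
def estimate_download_minutes(size_gb: float, speed_mbps: float) -> float:
    """Estimate download time in minutes for a file of size_gb at speed_mbps."""
    if size_gb < 0 or speed_mbps <= 0:
        raise ValueError("size_gb must be >= 0 and speed_mbps must be > 0")
    megabits = size_gb * 8000  # 1 GB = 1000 MB = 8000 megabits
    seconds = megabits / speed_mbps
    return seconds / 60


# Hypothetical example: a 3 GB model over a 40 Mbps connection.
print(estimate_download_minutes(3.0, 40.0))  # prints 10.0
```

Doubling the model size doubles the estimate, which is why it helps to start the download on Wi-Fi and let it run in the background.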
Step 4: Start Chatting
After the model download is finished:
- Tap the model name to load it (the first load usually takes 10 to 30 seconds depending on model size and device performance).
- Enter your prompt in the chat box and send it.
- The model generates responses locally, and your data does not leave the phone.
The first reply is often slower due to model warm-up. Later messages in the same session are usually faster.
Step 5: Try Vision Features (Gemma 4 Multimodal)
If you downloaded a Gemma 4 multimodal variant:
- Go back to the main menu and open Ask Image.
- Select an image or take a photo.
- Ask a question (for example, “What’s in this image?” or “Is there any text I should pay attention to?”).
- Wait for the model to analyze the image locally and return a result.
This feature works offline, and your image is not sent to external servers.