Troubleshooting Guide

Common Issues and Solutions

Arduino / TensorFlow Lite Micro

“Tensor arena too small”

Symptom: AllocateTensors() fails because the arena is too small to hold the model’s tensors

Solution:

// Increase the arena size
constexpr int kTensorArenaSize = 100 * 1024;  // Try 100 KB

// Or find the minimum size: once AllocateTensors() succeeds,
// print how much of the arena is actually in use
Serial.print("Arena used: ");
Serial.println(interpreter->arena_used_bytes());
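
Once arena_used_bytes() reports a figure, one way to pick the final arena size is to add some headroom and round up. The following is a minimal sketch of that arithmetic, run on the host rather than on the board. The 20% headroom, the rounding to a multiple of 16 bytes, and the function name `suggest_arena_size` are illustrative assumptions, not values taken from TensorFlow Lite Micro.

```python
def suggest_arena_size(used_bytes: int, headroom_percent: int = 20, align: int = 16) -> int:
    """Return used_bytes plus headroom, rounded up to a multiple of `align`."""
    if used_bytes <= 0:
        raise ValueError("used_bytes must be positive")
    # Integer arithmetic avoids floating-point rounding surprises
    padded = used_bytes * (100 + headroom_percent) // 100
    # Round up to the next multiple of `align`
    return (padded + align - 1) // align * align


if __name__ == "__main__":
    used = 41_300  # hypothetical value printed by arena_used_bytes()
    print(f"constexpr int kTensorArenaSize = {suggest_arena_size(used)};")
```

If the model or the library version changes, measure again, since the reported usage can change with either.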

“Operator not supported”

Symptom: The model uses an op that is not registered with the op resolver, or that TFLite Micro does not implement

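Solution:

1. Check which ops your model uses
2. Either use AllOpsResolver, which registers every built-in op at the cost of flash, or register only the ops you need with MicroMutableOpResolver
3. Consider simplifying the model so it avoids the unsupported op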

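// Register only the ops the model needs.
// The template argument (10) is the maximum number of ops the resolver can hold.
static tflite::MicroMutableOpResolver<10> resolver;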
resolver.AddFullyConnected();
resolver.AddSoftmax();
resolver.AddReshape();
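
Once you know which ops the model uses, writing out the resolver lines is mechanical. The sketch below turns a list of builtin op names into the matching registration calls. The op list is assumed to come from whatever tool you use to inspect the model, the mapping table is a small, non-exhaustive subset, and `resolver_snippet` is a hypothetical helper name.

```python
# Builtin op name -> MicroMutableOpResolver method (non-exhaustive subset)
RESOLVER_METHODS = {
    "CONV_2D": "AddConv2D",
    "DEPTHWISE_CONV_2D": "AddDepthwiseConv2D",
    "FULLY_CONNECTED": "AddFullyConnected",
    "MAX_POOL_2D": "AddMaxPool2D",
    "AVERAGE_POOL_2D": "AddAveragePool2D",
    "RESHAPE": "AddReshape",
    "SOFTMAX": "AddSoftmax",
    "QUANTIZE": "AddQuantize",
    "DEQUANTIZE": "AddDequantize",
}


def resolver_snippet(op_names):
    """Return (C++ lines registering each distinct known op, ops with no mapping here)."""
    unique = list(dict.fromkeys(op_names))  # de-duplicate while keeping order
    known = [op for op in unique if op in RESOLVER_METHODS]
    unknown = [op for op in unique if op not in RESOLVER_METHODS]
    # Size the resolver template to the number of ops actually registered
    lines = [f"static tflite::MicroMutableOpResolver<{len(known)}> resolver;"]
    lines += [f"resolver.{RESOLVER_METHODS[op]}();" for op in known]
    return lines, unknown


if __name__ == "__main__":
    ops = ["CONV_2D", "MAX_POOL_2D", "CONV_2D", "FULLY_CONNECTED", "SOFTMAX"]
    lines, unknown = resolver_snippet(ops)
    print("\n".join(lines))
    if unknown:
        print("// No mapping in this table for:", ", ".join(unknown))
```

Ops reported as unknown are only missing from this table; check the TFLite Micro headers before concluding that an op is unsupported.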

Serial Monitor shows garbage

Solution: Make sure the baud rate set in the sketch matches the one selected in the Serial Monitor

Serial.begin(115200);  // Must match Serial Monitor setting

ESP32 / WiFi Issues

WiFi won’t connect

Checklist:

- [ ] Correct SSID/password
- [ ] 2.4GHz network (most ESP32 variants don’t support 5GHz)
- [ ] Router not blocking new devices

WiFi.begin(ssid, password);

int attempts = 0;
while (WiFi.status() != WL_CONNECTED && attempts < 20) {
    delay(500);
    Serial.print(".");
    attempts++;
}

if (WiFi.status() != WL_CONNECTED) {
    Serial.println("Failed to connect!");
    Serial.println(WiFi.status());  // Print error code
}
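
The number printed by WiFi.status() is a wl_status_t value. A small lookup table, sketched here for use on the host while reading the serial log, makes the code easier to interpret. The numeric values are those of the Arduino WiFi library's wl_status_t enum; the helper name `describe_wifi_status` is hypothetical.

```python
# wl_status_t values as defined by the Arduino WiFi library
WL_STATUS = {
    0: "WL_IDLE_STATUS",
    1: "WL_NO_SSID_AVAIL",
    2: "WL_SCAN_COMPLETED",
    3: "WL_CONNECTED",
    4: "WL_CONNECT_FAILED",
    5: "WL_CONNECTION_LOST",
    6: "WL_DISCONNECTED",
    255: "WL_NO_SHIELD",
}


def describe_wifi_status(code: int) -> str:
    """Return the wl_status_t name for a numeric code printed by WiFi.status()."""
    return WL_STATUS.get(code, f"unknown status {code}")


if __name__ == "__main__":
    for code in (1, 4, 6):
        print(code, describe_wifi_status(code))
```

A code of 1 (WL_NO_SSID_AVAIL) usually points back to the SSID and 2.4GHz items in the checklist, though the exact code reported for a given failure can vary between core versions.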

MQTT connection fails

Checklist:

- [ ] Broker address correct
- [ ] Port correct (usually 1883)
- [ ] Firewall not blocking

client.setServer(mqtt_server, 1883);

if (!client.connect("ESP32Client")) {
    Serial.print("MQTT failed, rc=");
    Serial.println(client.state());
}
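
The address, port and firewall items can be checked from a computer on the same network before touching the ESP32 code. The sketch below only tests whether a TCP connection to the broker can be opened. It does not speak MQTT, so a success says nothing about credentials or client IDs. The function name `can_reach` and the host used in the example are assumptions.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable host names
        return False


if __name__ == "__main__":
    broker = "localhost"  # replace with your broker address
    print(f"{broker}:1883 reachable:", can_reach(broker, 1883))
```

If this returns False from a laptop on the same network, the problem is likely the address, the port, or a firewall rather than the ESP32 sketch.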

Raspberry Pi

Camera not detected

# Check if camera is enabled
sudo raspi-config  # Interface Options > Camera

# Test camera (newer Raspberry Pi OS releases name this command rpicam-hello)
libcamera-hello

# Check permissions
ls -l /dev/video*

Out of memory

# Check memory usage
free -h

# Increase swap
sudo dphys-swapfile swapoff
sudo nano /etc/dphys-swapfile  # Set CONF_SWAPSIZE=2048
sudo dphys-swapfile setup
sudo dphys-swapfile swapon

TensorFlow too slow

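import tensorflow as tf

# Use TFLite instead of full TensorFlow
interpreter = tf.lite.Interpreter(model_path="model.tflite")

# Or enable multithreading (match num_threads to the number of CPU cores)
interpreter = tf.lite.Interpreter(
    model_path="model.tflite",
    num_threads=4
)
interpreter.allocate_tensors()  # Required before set_tensor() and invoke()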
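
To confirm that a change such as num_threads actually helps, measure it rather than assume. A minimal timing sketch follows; `median_latency_ms` is a hypothetical helper that times any callable, and the warm-up and run counts are arbitrary defaults. With a TFLite interpreter whose input has already been set, you would pass `interpreter.invoke`.

```python
import statistics
import time


def median_latency_ms(fn, warmup: int = 3, runs: int = 20) -> float:
    """Call fn() `warmup` times untimed, then return the median of `runs` timed calls in ms."""
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


if __name__ == "__main__":
    # Stand-in workload; replace with interpreter.invoke on a prepared interpreter
    print(f"median: {median_latency_ms(lambda: sum(range(100_000))):.3f} ms")
```

The median is used instead of the mean so that an occasional slow run does not dominate the result.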

Power / Energy

Power consumption too high

Symptom: Battery drains faster than expected, device runs hot

Checklist:

- [ ] Disable unused peripherals (WiFi, BLE, LEDs)
- [ ] Implement duty cycling (sleep between measurements)
- [ ] Reduce sampling frequency
- [ ] Use lower clock speeds when possible

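// ESP32 power saving example
#include <WiFi.h>
#include <esp_sleep.h>

// Sleep for the given number of seconds; the sketch restarts from setup() on wake
void enterDeepSleep(int seconds) {
    esp_sleep_enable_timer_wakeup(seconds * 1000000ULL);  // Argument is in microseconds
    esp_deep_sleep_start();
}

// Disable the radios when not needed (helper name is illustrative)
void disableRadios() {
    WiFi.mode(WIFI_OFF);
    btStop();  // Disable Bluetooth
}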

Battery life estimation wrong

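Solution: Account for every power state (active, sleep, and any radio activity), weighting each state’s current by the fraction of time spent in it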

# Power budget calculation
active_current_mA = 80      # During inference
sleep_current_mA = 0.01     # Deep sleep
duty_cycle = 0.01           # 1% active time

average_current = (active_current_mA * duty_cycle +
                   sleep_current_mA * (1 - duty_cycle))
battery_mAh = 3000
life_hours = battery_mAh / average_current
print(f"Expected battery life: {life_hours/24:.1f} days")
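
The same weighted average extends to more than two states. The sketch below generalizes the calculation to any number of power states. The WiFi transmit current and all time fractions in the example are hypothetical placeholders, and `usable_fraction` is an assumed derating knob for the capacity you cannot actually draw from the battery.

```python
def average_current_mA(states):
    """states: iterable of (current_mA, fraction_of_time); fractions must sum to 1."""
    states = list(states)
    total_fraction = sum(fraction for _, fraction in states)
    if abs(total_fraction - 1.0) > 1e-9:
        raise ValueError(f"time fractions sum to {total_fraction}, expected 1.0")
    return sum(current * fraction for current, fraction in states)


def battery_life_days(battery_mAh, states, usable_fraction=1.0):
    """Estimated runtime in days for a battery of the given capacity."""
    return battery_mAh * usable_fraction / average_current_mA(states) / 24


if __name__ == "__main__":
    states = [
        (80.0, 0.010),   # inference
        (160.0, 0.002),  # WiFi transmit (hypothetical figure)
        (0.01, 0.988),   # deep sleep
    ]
    print(f"Average current: {average_current_mA(states):.3f} mA")
    print(f"Expected battery life: {battery_life_days(3000, states, 0.8):.1f} days")
```

Even a state that is active for a fraction of a percent of the time can dominate the budget if its current is high, which is why leaving it out makes the estimate look too optimistic.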

INA219 readings incorrect

Checklist:

- [ ] Check wiring (VCC → V+, load → V-)
- [ ] Verify I2C address (default 0x40)
- [ ] Calibrate with known load

from ina219 import INA219
ina = INA219(shunt_ohms=0.1, address=0x40)
ina.configure()
print(f"Bus Voltage: {ina.voltage():.2f}V")
print(f"Current: {ina.current():.2f}mA")
print(f"Power: {ina.power():.2f}mW")
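
For the "calibrate with known load" step, Ohm's law gives the current to expect through a resistor of known value, which can then be compared with the sensor reading. A minimal sketch, assuming a purely resistive load connected between V- and ground so that the bus voltage is the voltage across the load. The function names and example numbers are hypothetical.

```python
def expected_current_mA(bus_voltage_V: float, load_ohms: float) -> float:
    """Current through a resistive load, from Ohm's law (I = V / R), in mA."""
    return bus_voltage_V / load_ohms * 1000.0


def correction_factor(measured_mA: float, bus_voltage_V: float, load_ohms: float) -> float:
    """Factor to multiply readings by so they match the known load."""
    return expected_current_mA(bus_voltage_V, load_ohms) / measured_mA


if __name__ == "__main__":
    # Hypothetical readings: 5.0 V bus, 100 ohm resistor, sensor reports 48.5 mA
    print(f"Expected: {expected_current_mA(5.0, 100.0):.1f} mA")
    print(f"Correction factor: {correction_factor(48.5, 5.0, 100.0):.4f}")
```

A factor far from 1.0 more likely indicates a wrong shunt_ohms value or a wiring problem than a sensor needing correction.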

Python / TensorFlow

CUDA out of memory

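import tensorflow as tf

# Allocate GPU memory on demand instead of reserving it all at startup.
# This must run before the GPU is first used.
gpus = tf.config.list_physical_devices('GPU')
for gpu in gpus:  # Safe even when no GPU is present
    tf.config.experimental.set_memory_growth(gpu, True)

# Or use CPU only: in a fresh process, set this before importing TensorFlow
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '-1'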

Model won’t convert to TFLite

Common causes:

1. Unsupported operations
2. Dynamic shapes
3. Custom layers

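# Allow a fallback to TensorFlow ops for operations that have no TFLite builtin
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,
    tf.lite.OpsSet.SELECT_TF_OPS  # Enable TF ops fallback (not available on TFLite Micro)
]
tflite_model = converter.convert()  # A failure here usually names the offending ops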

Federated Learning (Flower)

Clients won’t connect

Checklist:

- [ ] Server IP correct
- [ ] Port not blocked by firewall
- [ ] All devices on same network

# Server - listen on all interfaces
fl.server.start_server(
    server_address="0.0.0.0:8080",  # Not "localhost"
    ...
)

# Client - use server's actual IP
fl.client.start_numpy_client(
    server_address="192.168.1.100:8080",
    ...
)

Training stalls

Solutions:

- Reduce min_fit_clients
- Increase client timeout
- Check network connectivity
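
If a round never starts, a common reason is that fewer clients are connected than the strategy is configured to wait for. The sketch below is a simplified model of how a FedAvg-style strategy decides whether a fit round can begin. It is an assumption about the sampling logic, not Flower's actual code, and `fit_round_can_start` is a hypothetical helper. The parameter names mirror the strategy options fraction_fit, min_fit_clients and min_available_clients.

```python
def fit_round_can_start(connected: int, fraction_fit: float,
                        min_fit_clients: int, min_available_clients: int):
    """Return (ok, reason) under a simplified model of FedAvg-style client sampling."""
    if connected < min_available_clients:
        return False, (f"waiting: {connected} connected, "
                       f"min_available_clients={min_available_clients}")
    # Number of clients the strategy would try to sample for this round
    wanted = max(int(connected * fraction_fit), min_fit_clients)
    if wanted > connected:
        return False, f"cannot sample {wanted} clients from {connected} connected"
    return True, f"round can start with {wanted} clients"


if __name__ == "__main__":
    # Two devices connected, but the strategy asks for three per round
    print(fit_round_can_start(connected=2, fraction_fit=1.0,
                              min_fit_clients=3, min_available_clients=2))
```

Under this model, lowering min_fit_clients (and min_available_clients) to the number of devices you can reliably keep connected is what unblocks the round.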

General Tips

Debug Systematically

1. Isolate the problem: Does it work in simplest form?
2. Check dependencies: Version mismatches?
3. Read error messages: Often contain the solution
4. Search: Someone likely had the same issue

Version Compatibility

Component	Recommended Version
TensorFlow	2.10-2.14
TFLite Micro	Latest
Python	3.9-3.11
Arduino IDE	2.x
ESP-IDF	5.x
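
The Python and TensorFlow rows of this table can be checked automatically. The sketch below compares the running Python version and the installed TensorFlow package against the ranges above, looking only at the major and minor numbers. The helper names are hypothetical, and the package is assumed to be installed under the distribution name "tensorflow".

```python
import re
import sys
from importlib import metadata


def major_minor(version: str):
    """Extract (major, minor) from a version string such as '2.14.0' or '2.14.0rc1'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def in_range(version: str, lowest: str, highest: str) -> bool:
    """True if version's major.minor lies within [lowest, highest], inclusive."""
    return major_minor(lowest) <= major_minor(version) <= major_minor(highest)


if __name__ == "__main__":
    py = f"{sys.version_info.major}.{sys.version_info.minor}"
    print("Python", py, "OK" if in_range(py, "3.9", "3.11") else "outside 3.9-3.11")
    try:
        tf_version = metadata.version("tensorflow")
        status = "OK" if in_range(tf_version, "2.10", "2.14") else "outside 2.10-2.14"
        print("TensorFlow", tf_version, status)
    except metadata.PackageNotFoundError:
        print("TensorFlow is not installed in this environment")
```

Comparing tuples of integers rather than strings matters here: as strings, "2.9" would sort after "2.10".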

Getting Help

When stuck, follow this escalation path:

1. Check this troubleshooting guide first - Most common issues are documented here
2. Search the GitHub repository issues (https://github.com/ngcharithperera/edge-analytics-lab-book/issues) - Someone may have encountered the same problem
3. Review the relevant lab’s “Related Resources” section - Links to hardware guides and additional documentation
4. Ask on the course forum or discussion board - Instructors and peers can help
5. Post on Stack Overflow - Include:
   - Minimal reproducible example
   - Hardware/software versions
   - Error messages
   - What you’ve already tried

Lab-Specific Troubleshooting

For issues specific to individual labs, refer to:

LAB01-03: Model training and conversion issues
LAB04: Audio processing and speech recognition problems
LAB05: Deployment and TFLite Micro errors
LAB06: Security and adversarial attack challenges
LAB07: CNN and computer vision issues
LAB08-09: Arduino/ESP32 hardware and wireless problems
LAB10: EMG signal processing and biomedical sensors
LAB11: Profiling and performance measurement
LAB12-13: Streaming and database issues
LAB14: Anomaly detection and unsupervised learning
LAB15: Power measurement and energy optimization
LAB16: YOLO and object detection on edge devices
LAB17: Federated learning and Flower framework
LAB18: On-device learning and model adaptation
