I recently built a ham radio repeater and used an ESP32 for its control unit. I added Telegram control so I can get status information and force some actions remotely. Using FreeRTOS made the code more efficient, but the ReadTelegram task hangs at some point, and it has been driving me mad since the beginning. The main problem was handling WiFi disconnections. Fixing that reduced the resets from 5-10 a day to about one every 4-5 days, but the task still hangs and the watchdog restarts everything.
I don't know whether getUpdates() is the problem and needs to be handled differently, but this task is so short that it shouldn't hang anywhere. The watchdog timeout is set to 14 seconds.
void taskReadTelegram(void *pvParameters)
{
    // Register this task with the watchdog so it is monitored. If the task
    // stops resetting the watchdog periodically, a hang is detected and it triggers.
    esp_task_wdt_add(NULL);
    for (;;)
    {
        esp_task_wdt_reset();
        if (WiFiStatus == true)
        {
            int numNewMessages = bot.getUpdates(bot.last_message_received + 1);
            vTaskDelay(pdMS_TO_TICKS(500));
            if (numNewMessages != 0)
            {
                Serial.println("got response");
                // Send the message count to taskHandle through the queue so it can process them
                xQueueSend(colaNumMensajesTelegram, &numNewMessages, 0);
            }
        }
        else
        {
            Serial.println("ReadTelegram sin conexion"); // no WiFi connection
        }
        esp_task_wdt_reset();
        vTaskDelay(pdMS_TO_TICKS(6000)); // Poll for new messages every 6 seconds
    }
}
Console message when WDT got triggered last time:
08:20:05:982 -> E (183031094) task_wdt: Task watchdog got triggered. The following tasks did not reset the watchdog in time:
08:20:05:990 -> E (183031094) task_wdt: - ReadTelegram (CPU 1)
08:20:05:993 -> E (183031094) task_wdt: Tasks currently running:
08:20:05:999 -> E (183031094) task_wdt: CPU 0: Temperaturas
08:20:06:001 -> E (183031094) task_wdt: CPU 1: loopTask
08:20:06:007 -> E (183031094) task_wdt: Aborting.
08:20:06:010 ->
08:20:06:010 -> abort() was called at PC 0x400ed5a9 on core 0
08:20:06:012 ->
08:20:06:012 ->
08:20:06:012 -> Backtrace: 0x40083e99:0x3ffbed1c |<-CORRUPTED
08:20:06:018 ->
08:20:06:018 ->
08:20:06:018 ->
08:20:06:018 ->
08:20:06:018 -> ELF file SHA256: b56722619dc77b64
08:20:06:021 ->
08:20:06:340 -> Rebooting...
08:20:06:340 -> ets Jul 29 2019 12:21:46
08:20:06:343 ->
08:20:06:343 -> rst:0xc (SW_CPU_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
08:20:06:348 -> configsip: 0, SPIWP:0xee
08:20:06:348 -> clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
08:20:06:356 -> mode:DIO, clock div:2
08:20:06:356 -> load:0x3fff0030,len:1184
08:20:06:359 -> load:0x40078000,len:13232
08:20:06:362 -> load:0x40080400,len:3028
08:20:06:366 -> entry 0x400805e4
The most likely suspect is bot.getUpdates(). Nail it down by placing trace prints such as Serial.println("I am here") before and after the call. You posted 0x3ffbed1c |<-CORRUPTED in the backtrace. That smells like a buffer overflow, so you will have to dig into the functions that are called.
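The "I am here" idea can be sketched as a small wrapper that prints a checkpoint before and after a call and reports how long the call took. This is only an illustration in portable C++: tracedCall and fakeGetUpdates are hypothetical names that are not part of your code or of any library, and fakeGetUpdates merely stands in for bot.getUpdates(). On the ESP32 you would probably swap std::printf for Serial.printf and std::chrono for millis().

```cpp
#include <chrono>
#include <cstdio>
#include <thread>

// Hypothetical helper: runs `call`, prints a checkpoint before and after it,
// and returns the elapsed time in milliseconds. If the "after" line never
// shows up in the console, the hang is inside `call`.
template <typename F>
long long tracedCall(const char *label, F &&call)
{
    using clock = std::chrono::steady_clock;

    std::printf("[trace] before %s\n", label);
    std::fflush(stdout); // push the line out before a possible hang

    const auto start = clock::now();
    call();
    const auto elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(
                             clock::now() - start)
                             .count();

    std::printf("[trace] after %s (%lld ms)\n", label,
                static_cast<long long>(elapsed));
    std::fflush(stdout);
    return static_cast<long long>(elapsed);
}

// Hypothetical stand-in for bot.getUpdates(): waits briefly to imitate a
// network round trip and then reports two pending messages.
int fakeGetUpdates(long offset)
{
    (void)offset; // unused in this stand-in
    std::this_thread::sleep_for(std::chrono::milliseconds(20));
    return 2;
}
```

In the task, the call would then be wrapped along the lines of tracedCall("getUpdates", [&] { numNewMessages = bot.getUpdates(bot.last_message_received + 1); }). If the last line before the watchdog message is "before getUpdates", the call never returned. If both lines appear but the reported duration creeps toward the 14-second watchdog timeout, that would point to a slow network request rather than a hard hang.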