mod_openai Application Flow Analysis

This document analyzes the application flow through mod_openai, starting from the openai_json FreeSWITCH application entry point and following the execution path through the module's architecture.

Table of Contents

  1. High-Level Flow Overview
  2. Entry Point: openai_json_app()
  3. Core Execution: openai_execute()
  4. Session Creation: ai_session_create()
  5. Thread Architecture: launch_threads()
  6. Parallel Processing Threads
  7. Core AI Processing: ai_send_text()
  8. Function Call Processing
  9. State Management and Cleanup
  10. Event System Integration
  11. Error Handling and Recovery
  12. Flow Summary

High-Level Flow Overview

FreeSWITCH Call → openai_json_app() → openai_execute() → ai_session_create() → launch_threads() → Main Conversation Loop

1. Entry Point: openai_json_app() (mod_openai.c:3858)

Function: SWITCH_STANDARD_APP(openai_json_app)
Location: mod_openai.c:3858-3896

Key Operations

1. Session Initialization

init_ais_private(session);  // Initialize private AI session stack

2. Channel Management

switch_channel_answer(channel);  // Answer the call if not already answered
switch_ivr_sleep(session, 1000, SWITCH_FALSE, NULL);  // Brief pause

3. Concurrency Control

while (switch_channel_ready(channel) && 
       switch_channel_test_app_flag_key("AI", channel, AI_CF_PROCESSING)) {
    switch_ivr_sleep(session, 500, SWITCH_FALSE, NULL);  // Wait for other AI apps
}

4. Debug Mode Handling

if (debugging) {
    cJSON *tmp = cJSON_Parse(data);
    char *pretty = cJSON_Print(tmp);
    switch_log_printf(..., "AI JSON:\n%s\n", pretty);
}

5. Main Execution

openai_execute(session, data, SWITCH_TRUE);  // Execute with JSON flag

6. Post-Processing

#if USE_REPORTING_HOOK
    openai_on_reporting(session);
#else
    ai_session_t *ais = NULL;
    if (pop_ais_private(session, &ais) == SWITCH_STATUS_SUCCESS) {
        openai_on_reporting_thread(session, ais);
    }
#endif
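
The private AI session stack referenced in steps 1 and 6 is internal to ai_utils.c and not shown in the excerpts. As an illustration only, a per-channel stack of this kind is commonly built on switch_channel_set_private(); the node type, key name, and helper bodies below are assumptions, not the actual implementation:

typedef struct ai_session ai_session_t;          /* opaque; defined by mod_openai */

typedef struct ais_stack_node {
    ai_session_t *ais;
    struct ais_stack_node *next;
} ais_stack_node_t;

#define AIS_PRIVATE_KEY "_openai_ais_stack_"     /* assumed key name */

static void init_ais_private(switch_core_session_t *session)
{
    switch_channel_t *channel = switch_core_session_get_channel(session);
    switch_channel_set_private(channel, AIS_PRIVATE_KEY, NULL);   /* start with an empty stack */
}

static void push_ais_private(switch_core_session_t *session, ai_session_t *ais)
{
    switch_channel_t *channel = switch_core_session_get_channel(session);
    ais_stack_node_t *node = switch_core_session_alloc(session, sizeof(*node));

    node->ais = ais;
    node->next = (ais_stack_node_t *)switch_channel_get_private(channel, AIS_PRIVATE_KEY);
    switch_channel_set_private(channel, AIS_PRIVATE_KEY, node);
}

static switch_status_t pop_ais_private(switch_core_session_t *session, ai_session_t **aisP)
{
    switch_channel_t *channel = switch_core_session_get_channel(session);
    ais_stack_node_t *node = (ais_stack_node_t *)switch_channel_get_private(channel, AIS_PRIVATE_KEY);

    if (!node) {
        return SWITCH_STATUS_FALSE;
    }
    *aisP = node->ais;
    switch_channel_set_private(channel, AIS_PRIVATE_KEY, node->next);
    return SWITCH_STATUS_SUCCESS;
}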

2. Core Execution: openai_execute() (mod_openai.c:2707)

Function: openai_execute(session, application, is_json)
Location: mod_openai.c:2707-3400+

Phase 1: Session Setup and Validation

// Prevent recursive execution
if (switch_channel_test_app_flag_key(AI_APP_KEY, channel, AI_APP_RUNNING)) {
    switch_log_printf(..., "INCEPTION ERROR\n");
    return;
}
switch_channel_set_app_flag_key(AI_APP_KEY, channel, AI_APP_RUNNING);

// Create AI session
if (ai_session_create(&ais, application, session, is_json) != SWITCH_STATUS_SUCCESS) {
    switch_log_printf(..., "AI SETUP ERROR\n");
    return;
}

Phase 2: Call Context Preparation

1. Caller Information Processing

const char *cid_name = switch_channel_get_variable(ais->channel, "caller_id_name");
const char *cid_num = switch_channel_get_variable(ais->channel, "caller_id_number");

// Handle special cases
if (!strcmp(cid_name, "0000000000")) {
    cid_name = "Unknown Caller";
}

2. Prompt Variables Setup

ais->app->prompt_vars = cJSON_CreateObject();

// Add call context
cJSON_AddItemToObject(ais->app->prompt_vars, "call_direction", 
                     cJSON_CreateString(ais->app->outbound ? "outbound" : "inbound"));
cJSON_AddItemToObject(ais->app->prompt_vars, "caller_id_name", cJSON_CreateString(cid_name));
cJSON_AddItemToObject(ais->app->prompt_vars, "caller_id_number", cJSON_CreateString(cid_num));

3. Time and Date Context

char date[80] = "";
switch_strftime_tz(ais->app->local_tz, "%A, %B %d, %Y", date, sizeof(date), 0);
cJSON_AddItemToObject(ais->app->prompt_vars, "local_date", cJSON_CreateString(date));

switch_strftime_tz(ais->app->local_tz, "%I:%M:%S %p %z", date, sizeof(date), 0);
cJSON_AddItemToObject(ais->app->prompt_vars, "local_time", cJSON_CreateString(date));

Phase 3: System Prompt Construction

1. Template Loading and Processing

template = ai_load_template(ais, ais->app->model->template_file);
if (template) {
    char *expanded = switch_event_expand_headers(ais->app->template_params, template);
    // Variable expansion using prompt_vars
}

2. System Prompt Injection

ai_conversation_add(&ais->aic, "system-pvt", "en", template);

3. Cache Key Generation

if (ais->cache.active) {
    char *key = switch_mprintf("%s%s\n", template, ais->application);
    switch_md5_string(ais->cache.digest, (void *)key, strlen(key));
}
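
Putting Phase 3 together, here is a compressed, standalone sketch of the expand-then-digest sequence. The template text, variable names, and application string are invented for illustration, and template_params is assumed to be a switch_event_t holding the prompt variables:

static void build_system_prompt_example(void)
{
    switch_event_t *params = NULL;
    const char *template = "You are speaking with ${caller_id_name} on an ${call_direction} call.";
    char *expanded = NULL, *key = NULL;
    char digest[SWITCH_MD5_DIGEST_STRING_SIZE] = "";

    switch_event_create(&params, SWITCH_EVENT_REQUEST_PARAMS);
    switch_event_add_header_string(params, SWITCH_STACK_BOTTOM, "caller_id_name", "Alice Example");
    switch_event_add_header_string(params, SWITCH_STACK_BOTTOM, "call_direction", "inbound");

    /* ${var} references are replaced from the event headers */
    expanded = switch_event_expand_headers(params, template);

    /* cache key: MD5 digest over the raw template plus the application data */
    key = switch_mprintf("%s%s\n", template, "openai_json");
    switch_md5_string(digest, (void *)key, strlen(key));

    switch_log_printf(SWITCH_CHANNEL_LOG, SWITCH_LOG_DEBUG, "prompt: %s digest: %s\n", expanded, digest);

    switch_safe_free(key);
    if (expanded != template) {
        switch_safe_free(expanded);   /* expand_headers allocates only when it substitutes */
    }
    switch_event_destroy(&params);
}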

Phase 4: Conversation Initialization

1. Static Greeting Setup

if (ais->app->params) {
    const char *greeting = switch_event_get_header(ais->app->params, "static_greeting");
    if (greeting) {
        ais->app->static_greeting = expand_jsonvars(ais->app->prompt_vars, greeting, ...);
    }
}

2. Conversation Injection

if (ais->app->convo) {
    ai_conversation_inject(&ais->aic, ais->app->convo);
}
ai_conversation_mark(&ais->aic);  // Mark conversation start point

3. Event Firing and Billing

fire_relay_event(ais->session, "calling.ai.start", NULL);
ais->start_date = switch_time_now();

switch_mutex_lock(ais->billing_mutex);
if (ais_private_is_nested(ais->session, ais) != SWITCH_STATUS_SUCCESS) {
    start_wallet_ticks(ais->session, VOICE_AI_TICK, "lookup", 1);
    ais->billing_voice = 1;
}
switch_mutex_unlock(ais->billing_mutex);

Phase 5: Speech Detection and Cache Check

start_speech_detect(ais, SWITCH_FALSE);

if (ais->cache.enabled) {
    if (ais_cache_check(ais) != SWITCH_STATUS_SUCCESS) {
        ais->cache.collecting = SWITCH_TRUE;
    }
}

Phase 6: Main Conversation Loop

while (ai_session_ready(ais) && ais->offhook) {
    if (ais->listen_state != LISTEN_SLEEPING) {
        ai_session_set_listen_state(ais, LISTEN_READY);
    }

    char *response = NULL;
    check_steps(ais);  // Process any step-based logic

    // AI Response Generation
    if (ais->wait_for_user != 1 && ais->listen_state != LISTEN_SLEEPING && !ais->skip_ai_turn) {
        // Handle first interaction (static greeting or cache)
        if (!ais->aic.user_chats) {
            if (ais->app->static_greeting || (ais->cache.enabled && !ais->cache.collecting)) {
                // Use cached/static response
            }
        }

        // Generate AI response
        if (!response) {
            response = ai_send_text(ais, &ais->app->settings, SWITCH_TRUE, 1);
        }

        // Process response
        if (!zstr(response)) {
            ai_conversation_add_with_latency(&ais->aic, "assistant", ...);
            fire_relay_event(ais->session, "calling.ai.completion", &obj);
        }
    }

    // Function Execution
    if (ais->functions) {
        wait_for_speaking(ais, SWITCH_FALSE);
        // Execute all pending functions
        for (hp = ais->functions->headers; hp; hp = hp->next) {
            if (execute_user_function(ais, hp->name, hp->value, &user_function_retval)) {
                // Process function results
            }
        }
    }

    // User Input Processing
    if (wait_for_input(ais, ais->app->attn_timeout, SWITCH_TRUE)) {
        while (switch_queue_trypop(ais->input_queue, &pop) == SWITCH_STATUS_SUCCESS) {
            // Process user input from speech detection
        }
    }
}
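
The input-processing placeholder at the end of the loop amounts to draining the queue that the input thread fills. A minimal sketch, assuming the queue entries are heap-allocated utterance strings and that ai_conversation_add() takes the role/language/text arguments shown earlier:

void *pop = NULL;

while (switch_queue_trypop(ais->input_queue, &pop) == SWITCH_STATUS_SUCCESS) {
    char *utterance = (char *)pop;

    if (!zstr(utterance)) {
        /* record the caller's turn so the next ai_send_text() call sees it */
        ai_conversation_add(&ais->aic, "user", "en", utterance);
    }
    switch_safe_free(utterance);   /* assumed: queue entries are malloc'd copies */
    pop = NULL;
}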

3. Session Creation: ai_session_create() (ai_utils.c:4341)

Function: ai_session_create(&ais, application, session, is_json)
Location: ai_utils.c:4341-5121

Key Operations

1. Application Loading

if (is_json) {
    app = create_app_from_json(session, channel, application);
} else {
    app = load_config(session, application);
}

2. Memory Pool and Structure Setup

if (switch_core_new_memory_pool(&pool) != SWITCH_STATUS_SUCCESS) {
    abort();
}
ais = (ai_session_t *)switch_core_alloc(pool, sizeof(*ais));

3. Session Initialization

ais->running = 1;
ais->offhook = 1;
ais->session = session;
ais->channel = channel;
ais->consolidate_limit = 3500;  // Token limit for conversation consolidation

4. Mutex and Queue Setup

switch_mutex_init(&ais->state_mutex, SWITCH_MUTEX_NESTED, ais->pool);
switch_mutex_init(&ais->speech_mutex, SWITCH_MUTEX_NESTED, ais->pool);
switch_mutex_init(&ais->billing_mutex, SWITCH_MUTEX_NESTED, ais->pool);

switch_queue_create(&ais->input_queue, 10000, ais->pool);
switch_queue_create(&ais->output_queue, 10000, ais->pool);
switch_queue_create(&ais->manual_output_queue, 10000, ais->pool);

5. Codec Initialization (frame timing worked through after this list)

switch_core_session_get_read_impl(ais->session, &ais->read_impl);
ais->interval = ais->read_impl.microseconds_per_packet / 1000;
ais->rate = ais->read_impl.actual_samples_per_second;

switch_core_codec_init(&ais->speech_codec, "L16", NULL, NULL,
                      (int)ais->rate, ais->interval, ais->channels, ...);

6. Thread Launch

if (launch_threads(ais) == SWITCH_STATUS_SUCCESS) {
    push_ais_private(session, ais);
    *aisP = ais;
    return SWITCH_STATUS_SUCCESS;
}
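
The codec figures in step 5 fall straight out of the negotiated read implementation. A worked example with illustrative values:

/* e.g. 20 ms ptime at 16 kHz, L16 mono:
 *   interval = 20000 us / 1000          = 20 ms per frame
 *   samples  = 16000 * 20 / 1000        = 320 samples per frame
 *   bytes    = 320 * 2 bytes per sample = 640 bytes per frame
 */
uint32_t samples_per_frame = ais->rate * ais->interval / 1000;
uint32_t bytes_per_frame   = samples_per_frame * sizeof(int16_t) * ais->channels;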

4. Thread Architecture: launch_threads() (ai_utils.c:6856)

Function: launch_threads(ais)
Location: ai_utils.c:6856-6925

Input Thread Creation

switch_threadattr_create(&thd_attr_in, ais->pool);
switch_threadattr_name_set_printf(thd_attr_in, "ai/input");
switch_thread_create(&ais->input_thread, thd_attr_in, ai_input_thread, ais, ais->pool);

Output Thread Creation

switch_threadattr_create(&thd_attr_out, ais->pool);
switch_threadattr_name_set_printf(thd_attr_out, "ai/output");
switch_thread_create(&ais->output_thread, thd_attr_out, ai_output_thread, ais, ais->pool);

Video Thread (Optional)

if (ais->has_video && !ais->video_thread) {
    start_video_thread(ais);
}

Thread Synchronization

while (--sanity > 0 && (ais->input_running == 0 || ais->output_running == 0)) {
    switch_cond_next();  // Wait for threads to start
}

5. Parallel Processing Threads

A. Input Thread: ai_input_thread() (ai_utils.c:5859)

Purpose: Processes incoming audio frames for speech detection

Key Operations:
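
The source excerpt does not enumerate these operations. As a rough sketch only (the real ai_input_thread drives the module's speech-detection pipeline and is considerably more involved; feed_speech_detect() below is a hypothetical helper), a read loop of this shape looks like:

static void *SWITCH_THREAD_FUNC ai_input_thread(switch_thread_t *thread, void *obj)
{
    ai_session_t *ais = (ai_session_t *)obj;
    switch_frame_t *frame = NULL;

    ais->input_running = 1;

    while (ais->running && switch_channel_ready(ais->channel)) {
        switch_status_t status = switch_core_session_read_frame(ais->session, &frame, SWITCH_IO_FLAG_NONE, 0);

        if (!SWITCH_READ_ACCEPTABLE(status)) {
            break;
        }
        if (switch_test_flag(frame, SFF_CNG) || !frame->datalen) {
            continue;   /* skip comfort-noise and empty frames */
        }

        /* hand the raw L16 samples to speech detection; when an utterance
         * completes, the recognized text is queued for the main loop, e.g.
         * switch_queue_push(ais->input_queue, strdup(text)); */
        feed_speech_detect(ais, frame->data, frame->datalen);
    }

    ais->input_running = 0;
    return NULL;
}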

B. Output Thread: ai_output_thread() (ai_utils.c:6316)

Purpose: Handles TTS generation and audio playback

Key Operations:
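
Again a rough sketch only: the output side is essentially a consumer of ais->output_queue. switch_ivr_speak_text() stands in below for the module's own streaming TTS path, which handles barge-in and far more detail; the TTS engine name and voice are placeholders:

static void *SWITCH_THREAD_FUNC ai_output_thread(switch_thread_t *thread, void *obj)
{
    ai_session_t *ais = (ai_session_t *)obj;
    void *pop = NULL;

    ais->output_running = 1;

    while (ais->running && switch_channel_ready(ais->channel)) {
        if (switch_queue_pop_timeout(ais->output_queue, &pop, 100000) != SWITCH_STATUS_SUCCESS) {
            continue;   /* nothing queued yet; poll again after 100 ms */
        }

        if (!zstr((char *)pop)) {
            /* simplified: synthesize and play the assistant's text */
            switch_ivr_speak_text(ais->session, "tts_commandline", "default", (char *)pop, NULL);
        }
        switch_safe_free(pop);
    }

    ais->output_running = 0;
    return NULL;
}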

C. Video Thread (Optional)

Purpose: Processes video frames for vision model analysis

Key Operations:
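
Illustrative only; the real video path presumably samples frames at a low rate for the vision model rather than inspecting every frame:

static void *SWITCH_THREAD_FUNC ai_video_thread(switch_thread_t *thread, void *obj)
{
    ai_session_t *ais = (ai_session_t *)obj;
    switch_frame_t *frame = NULL;

    while (ais->running && switch_channel_ready(ais->channel)) {
        switch_status_t status = switch_core_session_read_video_frame(ais->session, &frame, SWITCH_IO_FLAG_NONE, 0);

        if (!SWITCH_READ_ACCEPTABLE(status)) {
            break;
        }
        if (!frame->img) {
            continue;   /* no decoded image on this frame */
        }
        /* periodically snapshot frame->img and hand it to the vision-model
         * request builder (details are not shown in the source excerpts) */
    }
    return NULL;
}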


6. Core AI Processing: ai_send_text() (ai_utils.c:1055)

Function: ai_send_text(ais, settings, stream, recur)
Location: ai_utils.c:1055-1569

Key Operations

1. Conversation Preparation

cJSON *convo = ai_conversation_json(&ais->aic, SWITCH_FALSE);

2. Model Selection and URL Construction

// OpenAI vs Azure endpoint selection
// Model-specific parameter handling

3. HTTP Request to AI Service

// Streaming response handling
// Token counting and billing
// Function call processing

4. Response Processing

// Parse streaming JSON responses
// Extract text, function calls, and metadata
// Handle interruptions and barge-in
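
The streaming handler is only summarized above. For orientation, an OpenAI-style chat completion stream arrives as server-sent-event lines of the form data: {...}, each carrying a small delta. A minimal cJSON extraction of the text delta (field names per the public OpenAI API; error handling trimmed):

/* chunk_json is one SSE payload, i.e. the text after "data: ".
 * Returns a newly allocated copy of the content delta, or NULL. */
static char *extract_stream_delta(const char *chunk_json)
{
    char *text = NULL;
    cJSON *root, *choices, *choice, *delta, *content;

    if (!strcmp(chunk_json, "[DONE]")) {
        return NULL;                       /* end-of-stream sentinel */
    }
    if (!(root = cJSON_Parse(chunk_json))) {
        return NULL;
    }

    choices = cJSON_GetObjectItem(root, "choices");
    choice  = choices ? cJSON_GetArrayItem(choices, 0) : NULL;
    delta   = choice  ? cJSON_GetObjectItem(choice, "delta") : NULL;
    content = delta   ? cJSON_GetObjectItem(delta, "content") : NULL;

    if (content && content->type == cJSON_String && content->valuestring) {
        text = strdup(content->valuestring);
    }

    cJSON_Delete(root);
    return text;
}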

7. Function Call Processing

Function Detection and Execution

if (ais->functions) {
    for (hp = ais->functions->headers; hp; hp = hp->next) {
        if (execute_user_function(ais, hp->name, hp->value, &user_function_retval)) {
            // Process function results
            // Add results to conversation
            // Handle SWML execution if returned
        }
    }
}
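
How a completed function call ends up on ais->functions is not shown. Since the loop above walks plain event headers (function name mapped to its argument string), one plausible shape, using the classic OpenAI function_call message format, is the hypothetical helper below:

/* msg is the parsed assistant message; functions is the header list that
 * the execution loop above iterates */
static void queue_function_call(switch_event_t **functions, cJSON *msg)
{
    cJSON *fc   = cJSON_GetObjectItem(msg, "function_call");
    cJSON *name = fc ? cJSON_GetObjectItem(fc, "name") : NULL;
    cJSON *args = fc ? cJSON_GetObjectItem(fc, "arguments") : NULL;

    if (!name || name->type != cJSON_String) {
        return;
    }
    if (!*functions) {
        /* bare header container; SWITCH_EVENT_CLONE is a common choice here */
        switch_event_create(functions, SWITCH_EVENT_CLONE);
    }
    switch_event_add_header_string(*functions, SWITCH_STACK_BOTTOM, name->valuestring,
                                   (args && args->type == cJSON_String) ? args->valuestring : "{}");
}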

SWAIG Protocol Integration


8. State Management and Cleanup

Session State Tracking

Resource Management


9. Event System Integration

Event Firing Throughout Flow

fire_relay_event(ais->session, "calling.ai.start", NULL);
fire_relay_event(ais->session, "calling.ai.completion", &obj);
fire_relay_event(ais->session, "calling.ai.function.call", &func_obj);
fire_relay_event(ais->session, "calling.ai.end", &summary_obj);
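
fire_relay_event() is internal to the module. A common way to emit such relay events in FreeSWITCH is a custom event that carries the JSON payload in its body; the subclass name and header below are assumptions for illustration:

static void fire_relay_event_example(switch_core_session_t *session, const char *name, cJSON *params)
{
    switch_event_t *event = NULL;

    if (switch_event_create_subclass(&event, SWITCH_EVENT_CUSTOM, "openai::relay") != SWITCH_STATUS_SUCCESS) {
        return;
    }

    /* attach channel data so consumers can correlate the event with the call */
    switch_channel_event_set_data(switch_core_session_get_channel(session), event);
    switch_event_add_header_string(event, SWITCH_STACK_BOTTOM, "AI-Event-Name", name);

    if (params) {
        char *body = cJSON_PrintUnformatted(params);
        switch_event_add_body(event, "%s", body);
        switch_safe_free(body);
    }

    switch_event_fire(&event);
}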

Debug and Webhook Integration


10. Error Handling and Recovery

Interruption Handling

if (ais->interrupted) {
    set_interrupted(ais, SWITCH_FALSE);
    // Resume conversation flow
}
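
set_interrupted() and its check are, in effect, a barge-in flag guarded by the session's state mutex. A minimal sketch under that assumption (the real functions may do more, such as flushing the output queue):

static void set_interrupted(ai_session_t *ais, switch_bool_t val)
{
    switch_mutex_lock(ais->state_mutex);
    ais->interrupted = val;
    switch_mutex_unlock(ais->state_mutex);
}

/* playback paths poll the flag between frames so barge-in is honored promptly */
static switch_bool_t check_interrupted(ai_session_t *ais)
{
    switch_bool_t val;

    switch_mutex_lock(ais->state_mutex);
    val = ais->interrupted;
    switch_mutex_unlock(ais->state_mutex);
    return val;
}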

Timeout Management


Flow Summary

The mod_openai application follows the high-level execution pattern summarized below.

Key Execution Phases:

  1. Initialization: Session setup, configuration loading, and resource allocation
  2. Thread Launch: Parallel processing threads for input/output/video
  3. Main Loop: Alternating between AI response generation and user input processing
  4. Function Processing: External function calls and SWML execution
  5. Cleanup: Thread termination and resource deallocation

Design Characteristics:

The system is designed for real-time, low-latency voice interactions with comprehensive error handling, billing integration, and extensive configurability through JSON parameters or XML configuration files.

Architecture Benefits:


This flow analysis provides the foundation for understanding how voice-based AI conversations are orchestrated within the FreeSWITCH telephony framework.