Exploring the Azure GenAI Tech Stack: Azure OpenAI and Azure AI Search



Azure OpenAI

Important

If you’re interested in playing around with the GPT-4o models, you have to choose the swedencentral Azure region (because only this region is “models complete”, i.e. offers the full model line-up).

Make sure your data always stays compliant with GDPR restrictions – therefore, choose the Data Zone Standard deployment type (and, as a piece of advice, this also saves you from being billed hourly 😃: the “Provisioned-Managed” option is GDPR-compliant as well, but way too expensive).

After deploying the appropriate model, you can play around with the generated API endpoint, e.g.:

https://espc2025-session.openai.azure.com/openai/deployments/gpt4o/chat/completions?api-version=2024-02-15-preview

The system role is crucial for how your bot acts:

{ 
    "messages": [
        { "role": "system",
            "content": "You are a grumpy sarcastic assistant!"
        },
        { "role": "user",
            "content": "Hello world!"
        }
    ]
}
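
For illustration, here is a minimal Python sketch (using the requests library) that sends this payload to the deployment endpoint above; the AZURE_OPENAI_KEY environment variable and the exact endpoint, deployment and API version values are assumptions you would replace with your own:

import os
import requests

# Assumed values – replace with your own resource, deployment and API version.
ENDPOINT = "https://espc2025-session.openai.azure.com"
DEPLOYMENT = "gpt4o"
API_VERSION = "2024-02-15-preview"
URL = f"{ENDPOINT}/openai/deployments/{DEPLOYMENT}/chat/completions?api-version={API_VERSION}"
HEADERS = {
    "api-key": os.environ["AZURE_OPENAI_KEY"],  # hypothetical environment variable name
    "Content-Type": "application/json",
}

payload = {
    "messages": [
        {"role": "system", "content": "You are a grumpy sarcastic assistant!"},
        {"role": "user", "content": "Hello world!"},
    ]
}

response = requests.post(URL, headers=HEADERS, json=payload)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])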

At the end of each conversation with the bot, you receive the token usage, which can, for example, be stored in a database (total_tokens) and billed to the corresponding customer or department:

"usage": {
    "completion_tokens": 9,
    "prompt_tokens": 10,
    "total_tokens": 19
}
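
A small sketch of how that could look – the log_usage helper, the SQLite file and the department name are purely hypothetical placeholders for whatever billing backend you actually use:

import sqlite3

def log_usage(department: str, usage: dict) -> None:
    # Hypothetical storage backend: a local SQLite table keyed by department.
    con = sqlite3.connect("usage.db")
    con.execute(
        "CREATE TABLE IF NOT EXISTS token_usage (department TEXT, total_tokens INTEGER)"
    )
    con.execute(
        "INSERT INTO token_usage VALUES (?, ?)",
        (department, usage["total_tokens"]),
    )
    con.commit()
    con.close()

# 'response' is the chat completions response from the sketch above.
log_usage("marketing", response.json()["usage"])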

Using JSON as structured output – "response_format": { "type": "json_object" } – is a reliable way to interact with the bot. With this declaration, we always receive a valid JSON object as the response (note that the prompt itself must mention JSON, as in the example below):

{ 
    "messages": [
        { 
            "role": "system",
            "content": "Give me 5 ideas for christmas presents for my 3 year old son.
                        Provide the answer as a JSON object in the following format: {\"ideas\":[\"idea1\"
                        ,\"idea2\"]}"
        }
    ], 
    "response_format": {
        "type": "json_object"
    }
}

Note

Without this setting, any result (content) would normally be formatted using Markdown syntax.
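
Reusing URL and HEADERS from the first sketch, sending the JSON-mode request above and parsing the answer might look like this (the parsing step is the whole point – json.loads never has to deal with Markdown):

import json

payload = {
    "messages": [
        {
            "role": "system",
            "content": "Give me 5 ideas for christmas presents for my 3 year old son. "
                       "Provide the answer as a JSON object in the following format: "
                       "{\"ideas\":[\"idea1\",\"idea2\"]}",
        }
    ],
    "response_format": {"type": "json_object"},
}

response = requests.post(URL, headers=HEADERS, json=payload)
response.raise_for_status()

# Thanks to response_format, the content is guaranteed to be parseable JSON.
ideas = json.loads(response.json()["choices"][0]["message"]["content"])["ideas"]
print(ideas)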

Function calling

Function calling is a very powerful option to extend the chat functionality of your bot. First, you have to define a function:

{
    "model": "gpt-40",
    "messages": [
        // intentionally omitted
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_current _weather",
                "description": "Get the current weather in a given location",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "The city and state, e.g. San Francisco, CA",
                        },
                        "unit": {
                            "type": "string",
                            "enum": ["celsius", "fahrenheit"]
                        }
                    },
                    "required": ["location"]
                },
                // further output intentionally omitted
            }
        }
    ]
}
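
Sending this tools definition together with a user question could look like the following sketch (again reusing URL and HEADERS from the first example; the Boston question is just a sample prompt):

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                },
                "required": ["location"],
            },
        },
    }
]

messages = [{"role": "user", "content": "What is the weather like in Boston, MA?"}]

response = requests.post(URL, headers=HEADERS, json={"messages": messages, "tools": tools})
response.raise_for_status()
choice = response.json()["choices"][0]
print(choice["finish_reason"])  # "tool_calls" when the model wants a function invoked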

The model does not call the defined function itself – it only tells you which function to call. You have to invoke it yourself, using the function name and the arguments it returns:

{
    "model": "gpt-40-mini"
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": null,
                "tool_calls": [
                    {
                        "id": "call_abc123",
                        "type": "function",
                        "function": {
                            "name": "get_current_weather",
                            "arguments": "{\n\"location\": \"Boston, MA\"\n}"
                        }
                    }
                ]
            },
        "logprobs": null,
        "finish_reason": "tool_calls"
        }
    ]
}
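
Continuing the previous sketch, the round trip could look roughly like this: read the tool call, run your own get_current_weather implementation (here a hypothetical stub), and send the result back with the role "tool" so the model can formulate the final answer:

import json

def get_current_weather(location: str, unit: str = "celsius") -> str:
    # Hypothetical implementation – in reality you would call a weather API here.
    return json.dumps({"location": location, "temperature": 22, "unit": unit})

message = choice["message"]
tool_call = message["tool_calls"][0]
arguments = json.loads(tool_call["function"]["arguments"])

# Append the assistant's tool call and our result, then ask the model again.
messages.append(message)
messages.append(
    {
        "role": "tool",
        "tool_call_id": tool_call["id"],
        "content": get_current_weather(**arguments),
    }
)

final = requests.post(URL, headers=HEADERS, json={"messages": messages, "tools": tools})
final.raise_for_status()
print(final.json()["choices"][0]["message"]["content"])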

Embeddings represent the meaning of a text as a (mathematical) vector. The similarity of two texts is measured as the “distance” between their vectors, compared across 1536 dimensions (depending on the embedding model).
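
As a rough illustration (reusing ENDPOINT, API_VERSION and HEADERS from the first sketch; the text-embedding deployment name is an assumption), an embedding can be fetched and two texts compared via cosine similarity:

import math

# Assumed embedding deployment name – use whatever you named your deployment.
EMBEDDING_URL = (
    f"{ENDPOINT}/openai/deployments/text-embedding-ada-002/embeddings"
    f"?api-version={API_VERSION}"
)

def embed(text: str) -> list[float]:
    response = requests.post(EMBEDDING_URL, headers=HEADERS, json={"input": text})
    response.raise_for_status()
    return response.json()["data"][0]["embedding"]

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

print(cosine_similarity(embed("Christmas presents"), embed("gift ideas")))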

By using Azure AI Search as a vector store for your documents’ embeddings and passing the retrieved documents as context to the GPT model, you can easily implement your own RAG (retrieval-augmented generation) system.
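
A very rough sketch of that flow, with many assumptions (search service name, index name, the contentVector and content fields, and the API version are all placeholders): embed the question with the embed helper from above, retrieve the closest chunks from Azure AI Search, and pass them as grounding context to the chat model:

# Assumed search service, index and field names – adjust to your own setup.
SEARCH_URL = (
    "https://espc2025-search.search.windows.net/indexes/espc-docs/docs/search"
    "?api-version=2023-11-01"
)
SEARCH_HEADERS = {"api-key": os.environ["AZURE_SEARCH_KEY"], "Content-Type": "application/json"}

question = "How do I request structured JSON output?"

# 1. Vector search: find the chunks whose embeddings are closest to the question.
search_response = requests.post(
    SEARCH_URL,
    headers=SEARCH_HEADERS,
    json={
        "vectorQueries": [
            {"kind": "vector", "vector": embed(question), "fields": "contentVector", "k": 3}
        ],
        "select": "content",
    },
)
search_response.raise_for_status()
context = "\n\n".join(doc["content"] for doc in search_response.json()["value"])

# 2. Generation: answer the question grounded in the retrieved chunks.
rag_payload = {
    "messages": [
        {"role": "system", "content": f"Answer only from this context:\n{context}"},
        {"role": "user", "content": question},
    ]
}
answer = requests.post(URL, headers=HEADERS, json=rag_payload)
answer.raise_for_status()
print(answer.json()["choices"][0]["message"]["content"])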

#BishopTells