Understanding parameter settings for supported foundational models

In AI models, parameters are numerical values that configure how a foundational model (LLM) processes a prompt and generates its response. The parameters available differ by model. Models that expose more parameters allow for finer-grained configuration, which can help you get better prompt responses.

Tuning the parameter configuration can help return more accurate responses. Let us explore the parameters available for models from these vendors:
  • Amazon Bedrock
  • Google Vertex AI
  • Azure OpenAI
  • OpenAI
Model Parameters
Amazon Bedrock
  • Document retrieval count
  • Max Tokens
  • Temperature
  • Top P
Note: See the AI21 Labs Chat AI action and Learn Models for parameter details.
Google Vertex AI
  • Document retrieval count
  • Max Output Tokens
  • Temperature
  • Top K
  • Top P
Note: See the Vertex AI: Chat AI action for parameter details.
Azure OpenAI
  • Frequency Penalty
  • Max Tokens
  • N
  • Presence Penalty
  • Temperature
  • Top P
Note: See Azure OpenAI: Chat AI for parameter details.
OpenAI
  • Frequency Penalty
  • Max Tokens
  • N
  • Presence Penalty
  • Temperature
  • Top P
Note: See the OpenAI: Chat AI action for parameter details.
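
The vendor lists above can be captured as a small lookup table, which makes it easy to check which vendors expose a given parameter. This is only an illustrative sketch: the parameter names come from this topic, but the dictionary layout and the `vendors_supporting` helper are hypothetical and not part of any product API.

```python
# Parameter names as listed in this topic. The dictionary layout and the
# helper function are illustrative assumptions, not a product API.
OPENAI_STYLE = [
    "Frequency Penalty", "Max Tokens", "N",
    "Presence Penalty", "Temperature", "Top P",
]

SUPPORTED_PARAMETERS = {
    "Amazon Bedrock": [
        "Document retrieval count", "Max Tokens", "Temperature", "Top P",
    ],
    "Google Vertex AI": [
        "Document retrieval count", "Max Output Tokens",
        "Temperature", "Top K", "Top P",
    ],
    "Azure OpenAI": list(OPENAI_STYLE),
    "OpenAI": list(OPENAI_STYLE),
}


def vendors_supporting(parameter):
    """Return the vendors whose parameter list includes the given name."""
    return [
        vendor
        for vendor, parameters in SUPPORTED_PARAMETERS.items()
        if parameter in parameters
    ]


print(vendors_supporting("Top K"))
print(vendors_supporting("N"))
```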

Let us look at the use and functionality of these parameters. Understanding parameter settings helps you balance a model's capability against its performance.

Model parameters explained

Foundational model parameters let you tune how the model processes complex prompt inputs so that it returns more accurate and refined responses. Select a model based on how complex your prompt input is, and then configure its parameters accordingly.

Document retrieval count
For hyperscaler vendors, this setting typically refers to the number of documents or data entries that can be retrieved in a single query or operation. It is important for managing and optimizing the performance of data retrieval, especially in large-scale environments such as those managed by hyperscaler vendors.
Frequency Penalty
This setting discourages repetition in the generated text by penalizing tokens based on how frequently they have already been used. The more often a token appears in the text, the less likely it is to be repeated. Choose a value between -2.0 and 2.0 with decimal value increments.
Max Tokens / Max Tokens To Sample / Max Output Tokens
This setting denotes the maximum number of tokens used in a generated response. You can choose a value between 1 and 2048; the default value is 2048. The value set here affects your prompt response: allotting more tokens allows a more comprehensive and detailed response.
Presence Penalty
This setting discourages repetition in the generated text by penalizing tokens that have already appeared, regardless of how many times they were used. Once a token has been used in the text, it is less likely to be used again, which encourages the model to introduce new content. Choose a value between -2.0 and 2.0 with decimal value increments.
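
To make the difference between the two penalties concrete, here is a minimal sketch of an additive adjustment of the kind commonly described for OpenAI-style APIs: the frequency penalty scales with how many times a token has appeared, while the presence penalty is applied once to any token that has appeared at all. The function name, the token scores, and the dictionary representation are assumptions for illustration; each vendor's actual implementation may differ.

```python
from collections import Counter


def apply_penalties(scores, generated_tokens,
                    frequency_penalty=0.0, presence_penalty=0.0):
    """Lower the score of tokens that already appear in the generated text.

    scores: dict mapping token -> raw score (hypothetical values).
    generated_tokens: tokens produced so far.
    """
    counts = Counter(generated_tokens)
    adjusted = dict(scores)
    for token, count in counts.items():
        if token in adjusted:
            # Frequency penalty grows with each repeat; presence penalty
            # is a flat amount applied once the token has appeared.
            adjusted[token] -= count * frequency_penalty + presence_penalty
    return adjusted


scores = {"cat": 2.0, "dog": 1.0, "bird": 0.5}
so_far = ["cat", "cat", "dog"]
print(apply_penalties(scores, so_far, frequency_penalty=0.5))
print(apply_penalties(scores, so_far, presence_penalty=0.25))
```

With a positive value, tokens that have already been used become less likely; a negative value has the opposite effect and encourages repetition.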
Temperature
This setting controls the randomness of the generated response. A higher value returns more diverse and less predictable responses; a lower value returns more focused and consistent ones. You can choose a value between 0 and 1 with decimal value increments.
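
The effect of temperature can be sketched with the standard temperature-scaled softmax, in which each token score is divided by the temperature before being converted to a probability. The scores below are made up, and treating a temperature of 0 as "always pick the top token" is an assumption of this sketch rather than documented vendor behavior.

```python
import math


def softmax_with_temperature(scores, temperature):
    """Convert raw token scores to probabilities, scaled by temperature."""
    if temperature <= 0:
        # Assumption: treat 0 as greedy selection of the highest score.
        best = scores.index(max(scores))
        return [1.0 if i == best else 0.0 for i in range(len(scores))]
    scaled = [score / temperature for score in scores]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


scores = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens
for temperature in (0.2, 0.6, 1.0):
    probabilities = softmax_with_temperature(scores, temperature)
    print(temperature, [round(p, 3) for p in probabilities])
```

At lower temperatures, almost all of the probability goes to the top-scoring token; at higher temperatures, the probabilities are spread more evenly, which is why the responses are more varied.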
Top P / Top K
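These settings determine the diversity of the generated response by limiting which tokens the model can choose from. Top K restricts the choice to the K most likely tokens, and Top P restricts it to the smallest set of tokens whose combined probability reaches P. A higher value returns more diverse responses. We recommend changing either the Top P/Top K value or the Temperature value, not both. For Top P, choose a value between 0 and 1 with decimal value increments.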
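
As a rough illustration of how these two filters narrow the set of candidate tokens, the sketch below applies top-k and top-p (nucleus) filtering to a made-up probability table and renormalizes what remains. The function names and the token probabilities are hypothetical; real models apply this step internally during generation.

```python
def top_k_filter(probabilities, k):
    """Keep the k most likely tokens and renormalize their probabilities."""
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    kept = ranked[:k]
    total = sum(prob for _, prob in kept)
    return {token: prob / total for token, prob in kept}


def top_p_filter(probabilities, p):
    """Keep the smallest set of top tokens whose probabilities sum to at least p."""
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, prob in ranked:
        kept.append((token, prob))
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(prob for _, prob in kept)
    return {token: prob / total for token, prob in kept}


# Hypothetical next-token probabilities.
probabilities = {"the": 0.5, "a": 0.25, "this": 0.125, "that": 0.125}
print(sorted(top_k_filter(probabilities, 2)))
print(sorted(top_p_filter(probabilities, 0.75)))
print(sorted(top_p_filter(probabilities, 1.0)))
```

A larger K or P leaves more tokens in the candidate set, so the responses become more diverse.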
N
This defines the number of responses the model generates for a specific prompt. Choose a value between 1 and 9; the default value is 1. For example, if you allot 2048 tokens, configure a higher temperature value, and set N to 9, you get 9 varied responses, each detailed enough to use up to 2048 tokens.
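
The value ranges described in this topic can be collected into a simple check that runs before settings are sent to a model. This is a minimal sketch: the ranges and defaults are the ones stated above, but the snake_case setting names and the `validate_settings` helper are hypothetical, and individual vendors or models may accept different limits.

```python
# Ranges as stated in this topic. The key names are illustrative
# assumptions and do not correspond to a specific vendor API.
RANGES = {
    "frequency_penalty": (-2.0, 2.0),
    "presence_penalty": (-2.0, 2.0),
    "max_tokens": (1, 2048),
    "temperature": (0.0, 1.0),
    "top_p": (0.0, 1.0),
    "n": (1, 9),
}

# Defaults stated in this topic; other settings have no documented default here.
DEFAULTS = {"max_tokens": 2048, "n": 1}


def validate_settings(settings):
    """Return a list of problems found in the given settings (empty if none)."""
    problems = []
    for name, value in settings.items():
        if name not in RANGES:
            problems.append(f"{name}: unknown setting")
            continue
        low, high = RANGES[name]
        if not low <= value <= high:
            problems.append(f"{name}: {value} is outside {low} to {high}")
    return problems


requested = {**DEFAULTS, "temperature": 0.8, "n": 9}
print(validate_settings(requested))
print(validate_settings({"n": 10}))
```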

Return to Create AI Skills.