Optional fields: GoogleVertexAIChatInput<GoogleAuthOptions<JSONClient>>
Help the model understand what an appropriate response is
Maximum number of tokens to generate in the completion.
Model to use
Sampling temperature to use
Top-k changes how the model selects tokens for output.
A top-k of 1 means the selected token is the most probable among all tokens in the model’s vocabulary (also called greedy decoding), while a top-k of 3 means that the next token is selected from among the 3 most probable tokens (using temperature).
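The top-k rule described above can be sketched in a few lines of TypeScript. This is only an illustration of the candidate-filtering step, not the Vertex AI implementation: `TopKToken`, `topKCandidates`, and the sample distribution are hypothetical names and values introduced here, and the final temperature-based draw among the surviving tokens is left out.

```typescript
// Hypothetical illustration only: not part of the LangChain or Vertex AI API.
type TopKToken = { token: string; prob: number };

// Return the k most probable tokens; the model would then sample
// the next token from this reduced set (using temperature).
function topKCandidates(dist: TopKToken[], k: number): TopKToken[] {
  if (!Number.isInteger(k) || k < 1) {
    throw new Error("k must be a positive integer");
  }
  // Sort a copy from most to least probable, then keep the first k entries.
  const sorted = [...dist].sort((a, b) => b.prob - a.prob);
  return sorted.slice(0, k);
}

// Made-up probabilities for a tiny vocabulary (assumption for the example).
const topKDist: TopKToken[] = [
  { token: "B", prob: 0.2 },
  { token: "A", prob: 0.3 },
  { token: "C", prob: 0.1 },
  { token: "D", prob: 0.05 },
];

// k = 1 keeps only the single most probable token (greedy decoding).
const greedyTokens = topKCandidates(topKDist, 1).map((t) => t.token);
// k = 3 keeps the three most probable tokens as sampling candidates.
const topThreeTokens = topKCandidates(topKDist, 3).map((t) => t.token);

console.log(greedyTokens, topThreeTokens);
```

With a top-k of 1 the candidate set collapses to one token, so the choice is deterministic regardless of temperature.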
Top-p changes how the model selects tokens for output.
Tokens are selected from most probable to least until the sum of their probabilities equals the top-p value.
For example, if tokens A, B, and C have a probability of .3, .2, and .1 and the top-p value is .5, then the model will select either A or B as the next token (using temperature).
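The top-p example above can be reproduced with a small TypeScript sketch. It assumes the rule is "keep adding tokens, most probable first, until their cumulative probability reaches top-p"; `NucleusToken` and `topPCandidates` are hypothetical names, the small tolerance guards against floating-point rounding, and the temperature-based draw among the kept tokens is not shown.

```typescript
// Hypothetical illustration only: not part of the LangChain or Vertex AI API.
type NucleusToken = { token: string; prob: number };

// Keep tokens from most to least probable until their cumulative
// probability reaches topP; the model would sample from the kept set.
function topPCandidates(dist: NucleusToken[], topP: number): NucleusToken[] {
  if (!(topP > 0 && topP <= 1)) {
    throw new Error("topP must be in the range (0, 1]");
  }
  const sorted = [...dist].sort((a, b) => b.prob - a.prob);
  const kept: NucleusToken[] = [];
  let cumulative = 0;
  for (const entry of sorted) {
    kept.push(entry);
    cumulative += entry.prob;
    // Small tolerance so sums such as 0.3 + 0.2 are treated as reaching 0.5.
    if (cumulative >= topP - 1e-9) {
      break;
    }
  }
  return kept;
}

// The distribution from the example in the text: A = .3, B = .2, C = .1.
const topPDist: NucleusToken[] = [
  { token: "A", prob: 0.3 },
  { token: "B", prob: 0.2 },
  { token: "C", prob: 0.1 },
];

// With top-p = .5, only A and B remain as candidates; C is excluded.
const nucleusTokens = topPCandidates(topPDist, 0.5).map((t) => t.token);

console.log(nucleusTokens);
```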
Creates an instance of the Google Vertex AI chat model.
The messages for the model instance.
A new instance of the Google Vertex AI chat model.
Static convert
Converts a prediction from the Google Vertex AI chat model to a chat generation.
The prediction to convert.
The converted chat generation.
Static convert
Enables calls to Google Cloud's Vertex AI API to access large language models in a chat-like fashion.
To use, you will need to have one of the following authentication methods in place:
The GOOGLE_APPLICATION_CREDENTIALS environment variable is set to the path of a credentials file for a service account permitted to the Google Cloud project using Vertex AI.
Example