I'm using Mistral Large to analyze PDFs. As input, I provide the PDF and a response format, which is a Pydantic class defining the output structure.
Here is my API call:
chat_response = self.client.chat.parse(
    model="mistral-large-latest",
    messages=messages,
    response_format=Report,
    temperature=0,
    max_tokens=3000,
)
Here are my classes:
from typing import List, Literal, Optional, Union

from pydantic import BaseModel

# Document, InterventionControl, Observation and the attribute modules
# are defined elsewhere in my project.


class Element(BaseModel):
    number: str
    page: int
    inspector: str
    name: str
    n_internal: Optional[int] = None
    factory: str
    building: str
    regulation: Literal["ascenseurs", "mont_charge"]
    # One attribute class per regulation type
    attribut: Union[
        ascenseurs_et_monte_charges_ascenseur.ascenseurs_et_monte_charges_ascenseur_0,
        ascenseurs_et_monte_charges_monte_charge.ascenseurs_et_monte_charges_monte_charge_0,
    ]


class Report(BaseModel):
    document: Document
    intervention_control: InterventionControl
    elements: List[Element]
    observations: List[Observation]
Now, I want the attribut field to accept a choice between several classes using Union. It works if I include only a few classes, but not all of them.
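For reference, one common way to express a choice between several classes in Pydantic v2 is a tagged (discriminated) union. The sketch below is only an illustration under assumptions: the class names AscenseurAttributs and MonteChargeAttributs are hypothetical, trimmed-down stand-ins for the real attribute classes, and the regulation tag is placed inside each attribute class, which differs from the Element layout above. Whether the Mistral structured-output endpoint honors the discriminator in the generated schema is not something this sketch can confirm.

```python
from typing import Annotated, Literal, Optional, Union

from pydantic import BaseModel, Field


# Hypothetical, trimmed-down attribute classes; the real ones have many more fields.
class AscenseurAttributs(BaseModel):
    regulation: Literal["ascenseurs"] = "ascenseurs"
    fabricant: Optional[str] = None


class MonteChargeAttributs(BaseModel):
    regulation: Literal["mont_charge"] = "mont_charge"
    charge_maximale_kg_var: Optional[int] = None


# Pydantic reads the "regulation" tag to decide which class to validate against.
Attribut = Annotated[
    Union[AscenseurAttributs, MonteChargeAttributs],
    Field(discriminator="regulation"),
]


class Element(BaseModel):
    name: str
    attribut: Attribut


element = Element.model_validate(
    {
        "name": "Monte charge",
        "attribut": {"regulation": "mont_charge", "charge_maximale_kg_var": 100},
    }
)
print(type(element.attribut).__name__)
```

With a tag on each class, validation no longer has to try every member of the Union in turn, but the JSON schema still contains every class, so this does not by itself make the schema smaller.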
I want to make this value dynamic. For example:
If regulation == "ascenseurs", then attribut should be an instance of the Ascenseur class (so the LLM follows this structure and fills attribut with the fields of the desired class). This could be done with a dictionary like this: [Your dictionary example would be here]
However, I can't get it to work. I've tried using validators, among other things. Maybe I'm doing something wrong.
My issue is that the LLM's output doesn't reflect the change, even though when I print the value from inside the class (in a function), I can see the change.
I'm not sure whether it's a serialization problem or whether the LLM simply isn't taking the change into account.
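A possible explanation, as far as I understand Pydantic: the JSON schema passed as the response format is generated from the class definition, while validators only run when data is parsed into the model, so a validator cannot change the structure the LLM is asked to follow. A minimal sketch of an alternative is to build the response class before the call, picking the attribute class from a dictionary. Everything named here (ATTRIBUT_BY_REGULATION, build_element_model, ElementBase and the two attribute classes) is hypothetical, and it assumes the regulation is already known before the request, for example from a first, lighter extraction pass.

```python
import json
from typing import Optional, Type

from pydantic import BaseModel, create_model


# Hypothetical, trimmed-down attribute classes.
class AscenseurAttributs(BaseModel):
    fabricant: Optional[str] = None


class MonteChargeAttributs(BaseModel):
    charge_maximale_kg_var: Optional[int] = None


class ElementBase(BaseModel):
    name: str
    regulation: str


# Hypothetical dictionary: regulation value -> attribute class.
ATTRIBUT_BY_REGULATION: dict[str, Type[BaseModel]] = {
    "ascenseurs": AscenseurAttributs,
    "mont_charge": MonteChargeAttributs,
}


def build_element_model(regulation: str) -> Type[BaseModel]:
    """Return an Element class whose attribut field is fixed to one class."""
    attribut_class = ATTRIBUT_BY_REGULATION[regulation]
    return create_model(
        f"Element_{regulation}",
        __base__=ElementBase,
        attribut=(attribut_class, ...),  # required field of the chosen class
    )


ElementMonteCharge = build_element_model("mont_charge")
schema_text = json.dumps(ElementMonteCharge.model_json_schema())
print("charge_maximale_kg_var" in schema_text, "fabricant" in schema_text)
```

The generated schema only contains the fields of the selected class, so it is the schema itself that changes, not just the value seen inside a validator. The cost is that a Report mixing several regulations would need one request per regulation, or a separate pass per element.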
When I make a Union of all my classes, I get this error:
{"object":"error","message":"Got 6 characters in schema, maximum allowed is 15000","type":"invalid_request_error","param":null,"code":null}
Error: Report analysis failed: API error occurred: Status 400
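Since the error mentions a maximum schema size, it may help to measure how large the generated schema is as classes are added to the Union. This is a rough sketch with hypothetical, trimmed-down classes; how the API actually counts characters (whitespace, wrapping, key order) is an assumption, so the number is only an estimate to compare against the limit quoted in the error message.

```python
import json
from typing import Optional, Union

from pydantic import BaseModel


# Hypothetical, trimmed-down attribute classes.
class AscenseurAttributs(BaseModel):
    fabricant: Optional[str] = None
    societe_de_maintenance: Optional[str] = None


class MonteChargeAttributs(BaseModel):
    charge_maximale_kg_var: Optional[int] = None
    motorisation: Optional[str] = None


class ElementSingle(BaseModel):
    attribut: AscenseurAttributs


class ElementUnion(BaseModel):
    attribut: Union[AscenseurAttributs, MonteChargeAttributs]


def schema_size(model: type[BaseModel]) -> int:
    """Length in characters of the model's JSON schema, serialized compactly."""
    return len(json.dumps(model.model_json_schema(), separators=(",", ":")))


single_size = schema_size(ElementSingle)
union_size = schema_size(ElementUnion)
print(f"single class: {single_size} chars, union: {union_size} chars")
```

Every class in the Union adds its full definition, including each Field description, to the schema, which would be consistent with a few classes working and the full set failing.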
Here is one of the classes:
class ascenseurs_et_monte_charges_ascenseur_0(BaseModel):
    type_de_control: Optional[Literal['Contrôle technique', 'Vérification périodique']] = Field(description="Le type de contrôle effectué sur l'élément, si présent.")
    date_de_mise_en_service: Optional[str] = Field(description="La date de mise en service de l'ascenseur, si présent.")
    date_de_fabrication: Optional[str] = Field(description="La date de fabrication (peut être appelé plaque) de l'ascenseur, si présent.")
    modification_importante: Optional[bool] = Field(description="La date de la dernière modification importante de l'ascenseur, si présent.")
    societe_de_maintenance: Optional[str] = Field(description="La société de maintenance de l'ascenseur, si présent.")
    fabricant: Optional[str] = Field(description="La marque du fabricant de l'ascenseur, si présent.")
I'm expecting this result:
{
    "number": "3",
    "page": 8,
    "inspector": "Mr COVAREL STEPHANE",
    "name": "Monte charge",
    "n_internal": null,
    "factory": "ALSACE MANUTENTION",
    "building": "Labos - Bâtiment B 215 - HZ581",
    "regulation": "mont_charge",
    "attribut": {
        "type_de_control": "Vérification périodique",
        "date_de_mise_en_service": "1983",
        "date_de_fabrication": "1983",
        "certification_ce": null,
        "nombre_d_etages": 3,
        "fabricant": "ALSACE MANUTENTION",
        "charge_maximale_kg_var": 100,
        "a_t_il_subi_une_modification_importante_var": null,
        "parachute": null,
        "societe_de_maintenance": "OTIS",
        "motorisation": "Hydraulique",
        "machinerie": "Haute",
        "vitesse_nominale_en_m_s_var": null,
        "en_location": null,
        "status": null,
        "type_de_traction_electrique": null,
        "type_de_traction_hydraulique": "Entraînement direct",
        "type_de_traction": "Hydraulique",
        "type_d_ouverture": "Automatique",
        "type_de_porte": "Porte Coulissante"
    }