I am trying to generate a JSON Schema from the following Pydantic models:

import json

from pydantic import BaseModel, Field

class VendorInfo(BaseModel):
    vendor_name: str = Field("", description= "Vendor Name")
    vendor_vat_no: str = Field("", description= "Vendor VAT Number")

class InvoiceHeader(BaseModel):
    invoice_number: str = Field("", description= "The unique invoice number")
    invoice_date: str = Field("", description= "The date invoice was created")
    vendor_info: VendorInfo = Field("", description= "Description of the vendor")

print(json.dumps(InvoiceHeader.model_json_schema(), indent=2))

This produces the following output, which contains $defs and $ref. How can I remove them from the schema?

{
  "$defs": {
    "VendorInfo": {
      "properties": {
        "vendor_name": {
          "default": "",
          "description": "Vendor Name",
          "title": "Vendor Name",
          "type": "string"
        },
        "vendor_vat_no": {
          "default": "",
          "description": "Vendor VAT Number",
          "title": "Vendor Vat No",
          "type": "string"
        }
      },
      "title": "VendorInfo",
      "type": "object"
    }
  },
  "properties": {
    "invoice_number": {
      "default": "",
      "description": "The unique invoice number",
      "title": "Invoice Number",
      "type": "string"
    },
    "invoice_date": {
      "default": "",
      "description": "The date invoice was created",
      "title": "Invoice Date",
      "type": "string"
    },
    "vendor_info": {
      "$ref": "#/$defs/VendorInfo",
      "default": "",
      "description": "Description of the vendor"
    }
  },
  "title": "InvoiceHeader",
  "type": "object"
}
  • What do you mean by "how to remove"? Simply use del my_schema["$defs"], I'd say. What is the problem with postprocessing the dictionary structure you get? Commented Nov 27, 2024 at 14:21
  • @lord_haffi I came to this question because I want a JSON schema without extensions. I am facing an API error or else I have to create a plain JSON schema struct which is less readable. Hope this helps. Commented Jan 3 at 16:50
  • I still don't understand which structure you want to achieve. Please update your question and include a (well-formatted) example of the desired JSON structure. E.g. where are the referenced models defined if you drop the $defs part? Commented Jan 4 at 16:50

1 Answer

If I understand you correctly, you want to inline the $ref entries using the definitions in $defs. You can do that with this Python function:

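def remove_defs_and_refs(schema: dict) -> dict:
    """Return a new schema with every $ref replaced by its $defs entry."""
    # Shallow copy so that popping '$defs' does not mutate the caller's dict
    schema = schema.copy()
    defs = schema.pop('$defs', {})

    def resolve(subschema):
        if isinstance(subschema, dict):
            ref = subschema.get('$ref', None)
            if ref:
                # '#/$defs/VendorInfo' -> 'VendorInfo'
                name = ref.split('/')[-1]
                # The definition may itself contain refs, so resolve it too
                return resolve(defs[name])
            return {
                key: resolve(value)
                for key, value in subschema.items()
            }
        if isinstance(subschema, list):
            return [resolve(ss) for ss in subschema]
        return subschema

    return resolve(schema)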

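In short, it replaces every $ref with the matching definition from $defs, recursively, so nested references are inlined too. Keys that sit next to a $ref (such as default and description on vendor_info) are dropped, because the whole referencing object is replaced by the definition.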
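If you need to keep the keys that sit next to a $ref, or want a clear error for self-referencing models instead of unbounded recursion, a variant along these lines might work. This is only a sketch: the name inline_refs, the hand-written sample schema, and the choice to let sibling keys override the inlined definition are my assumptions, not Pydantic behaviour, and it only handles local references of the form #/$defs/Name.

```python
import json


def inline_refs(schema: dict) -> dict:
    """Inline local '#/$defs/...' references, keeping the sibling keys of each $ref."""
    schema = dict(schema)  # shallow copy: popping '$defs' must not touch the input
    defs = schema.pop("$defs", {})
    prefix = "#/$defs/"

    def resolve(node, seen=()):
        if isinstance(node, list):
            return [resolve(item, seen) for item in node]
        if not isinstance(node, dict):
            return node
        ref = node.get("$ref")
        if isinstance(ref, str) and ref.startswith(prefix):
            name = ref[len(prefix):]
            if name in seen:
                # A recursive model cannot be fully inlined
                raise ValueError(f"circular reference to {name!r} cannot be inlined")
            merged = resolve(defs[name], seen + (name,))
            # Assumption: sibling keys (e.g. 'default', 'description') win over
            # the same keys in the definition
            siblings = {k: resolve(v, seen) for k, v in node.items() if k != "$ref"}
            merged.update(siblings)
            return merged
        return {key: resolve(value, seen) for key, value in node.items()}

    return resolve(schema)


# Hand-written schema shaped like the Pydantic output in the question
schema = {
    "$defs": {
        "VendorInfo": {
            "properties": {"vendor_name": {"type": "string"}},
            "title": "VendorInfo",
            "type": "object",
        }
    },
    "properties": {
        "vendor_info": {
            "$ref": "#/$defs/VendorInfo",
            "description": "Description of the vendor",
        }
    },
    "title": "InvoiceHeader",
    "type": "object",
}

inlined = inline_refs(schema)
print(json.dumps(inlined, indent=2))
```

Here vendor_info ends up with the properties, title and type of VendorInfo plus its own description. Whether sibling keys should override the definition or the other way round depends on what the consumer of the schema expects.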

For your use case (I added VendorAddress to increase the nesting depth and show that the function handles any number of levels):

from pydantic import BaseModel, Field

class VendorAddress(BaseModel):
    country: str = Field("", description="Vendor Original Country")
    state: str = Field("", description="Vendor Original State")
    city: str = Field("", description="Vendor Original City")

class VendorInfo(BaseModel):
    vendor_name: str = Field("", description= "Vendor Name")
    vendor_vat_no: str = Field("", description= "Vendor VAT Number")
    vendor_addr: VendorAddress = Field(None, description="Vendor Detailed Address")

class InvoiceHeader(BaseModel):
    invoice_number: str = Field("", description= "The unique invoice number")
    invoice_date: str = Field("", description= "The date invoice was created")
    vendor_info: VendorInfo = Field("", description= "Description of the vendor")

If you do:

import json

# model_json_schema() is a classmethod, so no instance is needed
schema = InvoiceHeader.model_json_schema()
inlined = remove_defs_and_refs(schema) # No `$defs` or `$ref`s
print(json.dumps(inlined, indent=2))

you will get:

{
  "properties": {
    "invoice_number": {
      "default": "",
      "description": "The unique invoice number",
      "title": "Invoice Number",
      "type": "string"
    },
    "invoice_date": {
      "default": "",
      "description": "The date invoice was created",
      "title": "Invoice Date",
      "type": "string"
    },
    "vendor_info": {
      "properties": {
        "vendor_name": {
          "default": "",
          "description": "Vendor Name",
          "title": "Vendor Name",
          "type": "string"
        },
        "vendor_vat_no": {
          "default": "",
          "description": "Vendor VAT Number",
          "title": "Vendor Vat No",
          "type": "string"
        },
        "vendor_addr": {
          "properties": {
            "country": {
              "default": "",
              "description": "Vendor Original Country",
              "title": "Country",
              "type": "string"
            },
            "state": {
              "default": "",
              "description": "Vendor Original State",
              "title": "State",
              "type": "string"
            },
            "city": {
              "default": "",
              "description": "Vendor Original City",
              "title": "City",
              "type": "string"
            }
          },
          "title": "VendorAddress",
          "type": "object"
        }
      },
      "title": "VendorInfo",
      "type": "object"
    }
  },
  "title": "InvoiceHeader",
  "type": "object"
}

instead of:

{
  "$defs": {
    "VendorAddress": {
      "properties": {
        "country": {
          "default": "",
          "description": "Vendor Original Country",
          "title": "Country",
          "type": "string"
        },
        "state": {
          "default": "",
          "description": "Vendor Original State",
          "title": "State",
          "type": "string"
        },
        "city": {
          "default": "",
          "description": "Vendor Original City",
          "title": "City",
          "type": "string"
        }
      },
      "title": "VendorAddress",
      "type": "object"
    },
    "VendorInfo": {
      "properties": {
        "vendor_name": {
          "default": "",
          "description": "Vendor Name",
          "title": "Vendor Name",
          "type": "string"
        },
        "vendor_vat_no": {
          "default": "",
          "description": "Vendor VAT Number",
          "title": "Vendor Vat No",
          "type": "string"
        },
        "vendor_addr": {
          "$ref": "#/$defs/VendorAddress",
          "default": null,
          "description": "Vendor Detailed Address"
        }
      },
      "title": "VendorInfo",
      "type": "object"
    }
  },
  "properties": {
    "invoice_number": {
      "default": "",
      "description": "The unique invoice number",
      "title": "Invoice Number",
      "type": "string"
    },
    "invoice_date": {
      "default": "",
      "description": "The date invoice was created",
      "title": "Invoice Date",
      "type": "string"
    },
    "vendor_info": {
      "$ref": "#/$defs/VendorInfo",
      "default": "",
      "description": "Description of the vendor"
    }
  },
  "title": "InvoiceHeader",
  "type": "object"
}

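NOTE: Inlining makes the schema larger, because a definition is copied into every place that references it. If you send the schema to a model API that bills per token, that means more tokens (= extra money), so be careful!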
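To get a feel for the size increase, you can compare the serialized length before and after inlining. The sketch below uses a hypothetical schema in which one definition (Address) is referenced from two properties, and repeats the inlining function so that it runs on its own. Character count is only a rough stand-in for token count; the real number depends on the tokenizer of the API you use.

```python
import json


def remove_defs_and_refs(schema: dict) -> dict:
    """Same inlining approach as above, repeated so this example is self-contained."""
    schema = schema.copy()
    defs = schema.pop("$defs", {})

    def resolve(node):
        if isinstance(node, dict):
            ref = node.get("$ref")
            if ref:
                return resolve(defs[ref.split("/")[-1]])
            return {key: resolve(value) for key, value in node.items()}
        if isinstance(node, list):
            return [resolve(item) for item in node]
        return node

    return resolve(schema)


# Hypothetical schema: a single definition referenced from two places
schema = {
    "$defs": {
        "Address": {
            "type": "object",
            "properties": {
                "country": {"type": "string"},
                "state": {"type": "string"},
                "city": {"type": "string"},
            },
        }
    },
    "type": "object",
    "properties": {
        "billing": {"$ref": "#/$defs/Address"},
        "shipping": {"$ref": "#/$defs/Address"},
    },
}

original_size = len(json.dumps(schema))
inlined_size = len(json.dumps(remove_defs_and_refs(schema)))
# The inlined schema is longer: Address now appears twice instead of once
print(original_size, inlined_size)
```

The more often a model is reused, and the larger it is, the bigger the gap becomes.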
