Update 10/5/2025 : Final version of the functions. New error detection scheme.
Note that for the (*ACCEPT) verb to work correctly for error detection,
a single central recursion function must be defined that contains the
other recursion function definitions embedded within it.
All (*ACCEPT) calls are made from inside this function.
Thus the function (?<Er_Obj> .. ) contains (?<Er_Ary> .. ); see the regex and its comments.
I am not sure why it has to be this way, or why calling (*ACCEPT) from stand-alone
parallel functions does not guarantee success.
The structure below has been fully tested and works.
For additional info on how this regex works see :
https://stackoverflow.com/a/79785886/15577665
These regex functions validate JSON strings and also locate errors in them.
Their granularity is such that any particular item at any particular level
can be found, removed, modified or replaced without any other help.
There are 2 main groups of functions: Validation and Error parsing.
The error parsing matches up to, and stops exactly at, the place where
the error is.
So if a JSON string is not valid, it can be examined with the error functions to identify
what the error is and where it is. This is accomplished with the (*ACCEPT) verb.
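To make the "match up to the error and stop" idea concrete, here is a minimal Python sketch. It is not the regex from this answer: Python's stdlib `re` has no (*ACCEPT), no (?(DEFINE)) and no recursion, so the sketch is restricted to flat arrays of scalar values and approximates (*ACCEPT) by making the tail of the pattern optional. The names `VALID`, `PREFIX` and `first_error_offset` are hypothetical helpers introduced only for this illustration; the sub-patterns are copied from Str, Numb and Sep_Ary with the atomic groups relaxed to plain groups.

```python
import re

# Sub-patterns taken from the Str, Numb and Sep_Ary definitions,
# with (?>...) relaxed to (?:...) so older Python versions accept them.
STR = r'"[^\\"]*(?:\\[\s\S][^\\"]*)*"'
NUMB = r'(?:[+-]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?|[eE][+-]?\d+)'
SCALAR = rf'(?:{NUMB}|true|false|null|{STR})'
SEP_ARY = r'\s*(?:,(?!\s*[}\]])|(?=\]))'

# Zero or more "value + separator" items, as in V_Ary / Er_Ary.
ITEMS = r'(?:\s*' + SCALAR + SEP_ARY + r')*'

# Strict form: the whole flat array must be well formed.
VALID = re.compile(r'\[' + ITEMS + r'\s*\]')

# Lenient form (assumption: stands in for (*ACCEPT)): everything after
# the last good item is optional, so the match ends where parsing stops.
PREFIX = re.compile(r'\[' + ITEMS + r'(?:\s*' + SCALAR + r')?(?:\s*\])?')


def first_error_offset(text):
    """Return None for a valid flat array, else the offset where matching stopped."""
    if VALID.fullmatch(text):
        return None
    m = PREFIX.match(text)
    return m.end() if m else 0


if __name__ == "__main__":
    for sample in ('[1, 2, 3]', '[1, 2 3]', '[1, 2,]', '[1, @]'):
        print(repr(sample), '->', first_error_offset(sample))
```

For `[1, 2 3]` the lenient match ends right after `2` (offset 5), which is where the missing comma is; for `[1, 2,]` it also ends after `2`, just before the trailing comma. The full regex in this answer does the same thing for nested objects and arrays by using (*ACCEPT) inside the recursive Er_Obj function.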
These regex functions can be used to write a JSON query app without much effort.
Function categories :
- Common :
(?&Sep_Ary), (?&Sep_Obj), (?&Str), (?&Numb)
- Validation :
(?&V_KeyVal), (?&V_Value), (?&V_Ary), (?&V_Obj)
- Error :
(?&Er_Obj), (?&Er_Ary), (?&Er_Value)
Free form Demo : https://regex101.com/r/wYoW7v/1
(?:(?:(?&V_Obj)|(?&V_Ary))|(?<Invalid>(?&Er_Obj)|(?&Er_Ary)))(?(DEFINE)(?<Sep_Ary>\s*(?:,(?!\s*[}\]])|(?=\])))(?<Sep_Obj>\s*(?:,(?!\s*[}\]])|(?=})))(?<Er_Obj>(?>{(?:\s*(?&Str)(?:\s*:(?:\s*(?:(?&Er_Value)|(?<Er_Ary>\[(?:\s*(?:(?&Er_Value)|(?&Er_Ary)|(?&Er_Obj))(?:(?&Sep_Ary)|(*ACCEPT)))*(?:\s*\]|(*ACCEPT)))|(?&Er_Obj))(?:(?&Sep_Obj)|(*ACCEPT))|(*ACCEPT))|(*ACCEPT)))*(?:\s*}|(*ACCEPT))))(?<Er_Value>(?>(?&Numb)|(?>true|false|null)|(?&Str)))(?<Str>(?>"[^\\"]*(?:\\[\s\S][^\\"]*)*"))(?<Numb>(?>[+-]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?|(?:[eE][+-]?\d+)))(?<V_KeyVal>(?>\s*(?&Str)\s*:\s*(?&V_Value)\s*))(?<V_Value>(?>(?&Numb)|(?>true|false|null)|(?&Str)|(?&V_Obj)|(?&V_Ary)))(?<V_Ary>\[(?>\s*(?&V_Value)(?&Sep_Ary))*\s*\])(?<V_Obj>{(?>(?&V_KeyVal)(?&Sep_Obj))*\s*}))
Regex
# ==========================================
# Validation and Error Detection ..
# ----------------------------------
# Find Valid JSON :
# https://regex101.com/r/pxC3Ph/1
# Find Errors of Failed JSON :
# https://regex101.com/r/pE0vPU/1
# (?m)
# ^
(?:
(?: # Valid JSON
(?&V_Obj)
| (?&V_Ary)
)
| # or,
(?<Invalid> # (1), Invalid JSON - Find the error
(?&Er_Obj)
| (?&Er_Ary)
)
)
# ------------------------------------
# JSON -- Function Types / Defines
# ------------------------------------
(?(DEFINE)
# ================
# Separators
# ==============
(?<Sep_Ary> # (2), Separator for Array
\s*
(?:
,
(?! \s* [}\]] )
| (?= \] )
)
)
(?<Sep_Obj> # (3), Separator for Object
\s*
(?:
,
(?! \s* [}\]] )
| (?= } )
)
)
# ========================
# ERROR Detection
# ======================
(?<Er_Obj> # (4), Object Error detection
(?>
{ # Open object brace {
(?:
\s* (?&Str) # Key
(?: # ------------------
\s* : # : Colon separator
(?: # Value
\s*
(?:
(?&Er_Value) # Strings, numbers, booleans, null
| # or,
(?<Er_Ary> # (5), Array Error detection
\[ # Open array bracket [
(?:
\s*
(?:
(?&Er_Value) # Strings, numbers, booleans, null
| (?&Er_Ary) # or, arrays
| (?&Er_Obj) # or, objects
)
(?: # Array separator or (*ACCEPT)
(?&Sep_Ary)
| (*ACCEPT)
)
)*
(?: \s* \] | (*ACCEPT) ) # Close array bracket ] or (*ACCEPT)
)
| # or,
(?&Er_Obj) # Object
) # ------
(?: # Object separator or (*ACCEPT)
(?&Sep_Obj)
| (*ACCEPT)
)
| (*ACCEPT) # Value error, just (*ACCEPT)
)
| (*ACCEPT) # No Colon separator, just (*ACCEPT)
)
)*
(?: \s* } | (*ACCEPT) ) # Close object brace } or (*ACCEPT)
)
)
(?<Er_Value> # (6), Values Error detection
(?>
(?&Numb) # Numbers
| (?> true | false | null ) # Boolean and null
| (?&Str) # String
)
)
# ========================
# Strings and Numbers
# ======================
(?<Str> # (7), String
(?>
" [^\\"]*
(?: \\ [\s\S] [^\\"]* )*
"
)
)
# To disallow control codes inside strings, use this instead :
# " [^\x00-\x1f\\"]*
# (?: \\ [^\x00-\x1f] [^\x00-\x1f\\"]* )*
# "
(?<Numb> # (8), Numbers
(?>
[+-]?
(?:
\d+
(?: \. \d* )?
| \. \d+
)
(?: [eE] [+-]? \d+ )?
| (?: [eE] [+-]? \d+ )
)
)
# ==========================
# Validation Detection
# =======================
(?<V_KeyVal> # (9), Validated Key : Value Pair
(?>
\s* (?&Str) \s* : \s* (?&V_Value) \s*
)
)
(?<V_Value> # (10), Validated Value
(?>
(?&Numb) # Numbers
| (?> true | false | null ) # Boolean and null
| (?&Str) # String
| (?&V_Obj) # Object
| (?&V_Ary) # Array
)
)
(?<V_Ary> # (11), Validated Array
\[
(?>
\s* (?&V_Value) (?&Sep_Ary)
)*
\s* \]
)
(?<V_Obj> # (12), Validated Object
{
(?>
(?&V_KeyVal) (?&Sep_Obj)
)*
\s* }
)
)
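The two leaf functions, Str and Numb, use no recursion, so they can be tried on their own. The following is a small Python sketch, assuming the stdlib `re` module and with the atomic groups relaxed to plain groups; `is_str` and `is_numb` are hypothetical helper names used only here. It also shows that Numb is more lenient than the JSON grammar: it accepts a leading `+`, a bare leading or trailing decimal point, and a bare exponent.

```python
import re

# Str: a double-quoted string where a backslash escapes any single character.
STR = re.compile(r'"[^\\"]*(?:\\[\s\S][^\\"]*)*"')

# Numb: optional sign, digits with an optional fraction (or a bare fraction),
# an optional exponent; alternatively a bare exponent.
NUMB = re.compile(r'[+-]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?|[eE][+-]?\d+')


def is_str(text):
    """True if the whole text matches the Str pattern."""
    return STR.fullmatch(text) is not None


def is_numb(text):
    """True if the whole text matches the Numb pattern."""
    return NUMB.fullmatch(text) is not None


if __name__ == "__main__":
    print(is_str(r'"a \" quoted \\ string"'))  # escaped quote and backslash
    print(is_str('"unterminated'))             # no closing quote
    for n in ('-12.5e+3', '+1', '.5', '1.', 'e5', '1e', '--1'):
        print(n, is_numb(n))
```

If strict JSON numbers are required, the Numb definition would need to be tightened; the strings alternative given in the regex comments already covers the control-code case for Str.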
json_decode and parse the result array for data you need?
42 would be valid JSON? As would "Hi There!"? Unless you restrict your json to be an encoded object only, it's pretty much impossible to detect ALL valid json forms.