PER encoding of BIT STRING with named bits and extensible size constraint

Question

Given:

X ::= BIT STRING {a(0), b(1)} (SIZE(2, ...))

what is the correct length for the PER encoding to use for the value '00'B?

I argue that the encoding should use a length of 0, because X.691 16.3 calls for trimming to "the smallest size capable of carrying this value and satisfies the effective size constraint", and the constraint on X, by being extensible, allows for values having length 0.

However, "satisfies the effective size constraint" doesn't seem to be very well defined, and I know of tools that encode using a length of 2.

Nonetheless, even if the above phrase is not well defined, suppose we had:

X-V2 ::= BIT STRING {a(0), b(1)} (SIZE(2, ..., 0))

then I think it becomes hard to say, under any interpretation, that a length of 0 does not satisfy the effective size constraint. That means for X-V2, '00'B would be encoded using a length of 0. Furthermore, I believe in order to enable canonical PER encodings, X and X-V2 must encode '00'B in the same way, which means also using a length of 0 for '00'B for type X.
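To make the competing answers concrete, here is a minimal sketch of the 16.3 trimming rule with the meaning of "satisfies the effective size constraint" passed in as a predicate. The function and predicate names are mine, not from X.691, and the bit string is modelled as a Python `str` of '0'/'1' characters. Which predicate is the right one for X is exactly the open question.

```python
def trimmed_length(bits, permitted, limit=16):
    """Smallest size that can carry `bits` (ignoring trailing 0 bits)
    and for which `permitted(size)` holds; None if none up to `limit`."""
    minimal = len(bits.rstrip("0"))  # size once every trailing 0 bit is removed
    for size in range(minimal, limit + 1):
        if permitted(size):
            return size
    return None

# Assumption: the extensible constraint SIZE(2, ...) is satisfied by any size.
def x_any_size(size):
    return True

# Assumption: only the root of SIZE(2, ...) counts.
def x_root_only(size):
    return size == 2

# X-V2, SIZE(2, ..., 0): sizes named explicitly in the root or the addition.
def x_v2_listed(size):
    return size in (0, 2)

print(trimmed_length("00", x_any_size))   # 0
print(trimmed_length("00", x_root_only))  # 2
print(trimmed_length("00", x_v2_listed))  # 0
```

Under the root-only predicate, X and X-V2 give different lengths for '00'B, which is the inconsistency I am pointing at.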

To my mind, the ambiguity arises from exactly what values are considered as making up, or being "present", in X, because X.691 gives these relevant definitions:

3.7.8 effective size constraint (for a constrained string type): A single finite size constraint that could be applied to a built-in string type and whose effect would be to permit all and only those lengths that can be present in the constrained string type.

10.3.10 The effective size constraint for a constrained type is a single size constraint such that a size is permitted if and only if there is some value of the constrained type that has that (permitted) size.

In a couple of places, X.680 leads us to believe that an extensible type consists of the values in the root plus any values in the extension additions:

Annex I 4.2.3 :

A1 ::= INTEGER (1..32, ... , 33..128)
A1 is extensible, and contains values 1 to 128 with 1 to 32 in the root and 33 to 128 as extension additions.

50.1:

NOTE 6 – The elements that are referenced by "ElementSetSpecs" is the union of the elements referenced by the "RootElementSetSpec" and "AdditionalElementSetSpec" (when present).

But then it also seems to say that an extensible type contains all of the same values that are in the type being constrained:

6:

In formal terms, an abstract syntax defined by the extensible type X contains not only the values of type X, but also the values of all types that are extension-related to X.

and 3.8.38:

extension-related: Two types that have the same extension root, where one was created by adding zero or more extension additions to the other.

I think section 6 rules here and the language in Annex I 4.2.3 and in 50.1 note 6 is being imprecise.

Update: Discussion of X.691 16.3 and 16.6

Alessandro's answer (thanks for your input) applies 16.3 only after 16.6 is applied. Here's my interpretation of the relationship between the two.

First, 16.3 is an unqualified statement. It does not say it applies only if the PER-visible constraint is not extensible, or, in the case where the constraint is extensible, only if the length of the BIT STRING falls inside the extension root of the constraint. It also does not say anything that requires us to look at 16.6 to understand how to apply 16.3. It uses the phrase "effective size constraint", but that is a phrase that is elsewhere defined without depending on 16.6, although the definition is somewhat unclear, as I noted above. Apparently, though, an "effective size constraint" can be extensible, because Annex B.3 has an IA5String example with an effective size constraint that is extensible:

A13 ::= IA5String (SIZE(1..10, ...) ^ FROM("A".."D"))
-- A13 has an extensible effective size constraint of SIZE(1..10,...)

Second, when 16.6 says "In the latter case the length and value shall be encoded as if no extension is present in the constraint." it does not say to go back to 16.1 and start over, or even to invoke 16.3 at that time. Rather, I take this phrase to simply mean that we should act as if there were no extension marker as we continue on to the next point, 16.7, which says "If an extension marker is not present in the constraint specification of the bitstring type, then 16.8 to 16.11 apply." Indeed, 16.8 - 16.11 are concerned with how to encode the length, which means that the two branches of 16.6 both end up specifying how the length is encoded. That makes this a natural reading of 16.6.

Third, 16.6 requires us to take one of two branches depending on whether the "length of this encoding" is in the extension root or not. The "length of this encoding" can only mean the "length of the string value" or the "length that is being used in the encoding". It can't mean the "length that is being used in the encoding" unless 16.3 logically applies before 16.6.

Now, suppose we have

Y ::= BIT STRING {a(0), b(1)} (SIZE(0..4, ...))

and suppose that in 16.6 we are to understand "length of this encoding" to mean simply the "length of the string" and that 16.3 only applies when the length of the string is in the extension root. Then, a Y of '11000'B is not trimmed (5 not in the root), but a Y of '1100'B is trimmed (to 2; 4 is in the root) - a result which seems somewhat odd (that one is trimmed but the longer one is not).

Moreover, for a Y value of '1100'B, in applying 16.6, we treated "the length of this encoding" as 4 (the length of the string) but then the length in the encoding is 2. That seems a very strange use of English indeed!
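A small sketch of that reading, to show the oddity with Y. This models only the hypothetical reading described above (16.6 tests the untrimmed string length, and 16.3 trims only when that length is in the root); the function name and the representation of the root as a Python `range` are my own.

```python
def length_if_16_6_tests_string_length(bits, root):
    """Length used in the encoding under the reading where 16.6 looks at
    the untrimmed string length and 16.3 applies only to root lengths."""
    if len(bits) in root:
        minimal = len(bits.rstrip("0"))
        # Trim to the smallest root size that can still carry the value.
        return min(n for n in root if n >= minimal)
    # Length outside the root: no trimming under this reading.
    return len(bits)

y_root = range(0, 5)  # SIZE(0..4, ...)

print(length_if_16_6_tests_string_length("1100", y_root))   # 2
print(length_if_16_6_tests_string_length("11000", y_root))  # 5
```

Two values that differ only in trailing 0 bits end up with different encoded lengths, and the shorter string is the one that gets trimmed.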

It is 16.3 that creates the possibility that the length of the string and the length used in the encoding might not be equal. When 16.6 then refers to the "length of this encoding", the natural reading is to assume that 16.3, which introduced the idea, logically precedes 16.6 and has therefore already been applied to determine the length that will be used in the encoding, which is what "length of this encoding" refers to. The wording itself - length of this encoding - suggests this reading.

Under my interpretation, there are cases where a value is encoded as an extension value when it would seem more sensible to encode it as a root value. This happens whenever the root excludes sizes in the range 0..k, because the trimming can then result in a size shorter than any size the root covers. Nonetheless, I think that is what the spec calls for, on the most natural reading. Personally, I think it is clear that 16.3 logically precedes 16.6 and that the ambiguity arises in what it means for a size to "satisfy" an "effective size constraint" in 16.3. Does 0 "satisfy" an effective size constraint of SIZE(2, ...)? I think we have to say it does; at least the spec is not clear that it doesn't, as far as I can see.
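My reading can be sketched as follows. It assumes, as argued above, that every size satisfies an extensible effective size constraint, so 16.3 removes all trailing 0 bits, and that 16.6 then tests the resulting length against the root. The function name and the tuple it returns are illustrative only; this is not an encoder.

```python
def trim_then_test_root(bits, root):
    """Apply 16.3 first (assumption: any size satisfies the extensible
    constraint), then let 16.6 test the trimmed length against the root.
    Returns (extension_bit, length_used_in_encoding)."""
    length = len(bits.rstrip("0"))
    extension_bit = 0 if length in root else 1
    return extension_bit, length

x_root = {2}          # SIZE(2, ...)
y_root = range(0, 5)  # SIZE(0..4, ...)

print(trim_then_test_root("00", x_root))     # (1, 0)
print(trim_then_test_root("10", x_root))     # (1, 1)
print(trim_then_test_root("11", x_root))     # (0, 2)
print(trim_then_test_root("1100", y_root))   # (0, 2)
print(trim_then_test_root("11000", y_root))  # (0, 2)
```

The first two lines show the cost: values of X that fit in the root size of 2 are nevertheless sent as extension values. The last two show the benefit: the two Y values that differ only in trailing 0 bits get the same length.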

(End of update)

Update 2: Counterarguments concerning 16.3 and 16.6

First, as Alessandro notes, 17.3 (for octet strings) and 20.4 (for sequence of) parallel 16.6, and the one refers to the "length of this encoding" while the other refers to the "number of components in this encoding", both of which really just refer to the length or to the number of components of the value. This gives us reason to think 16.6 also refers to just the length of the value when it says "length of this encoding". If it weren't for 16.3, they would be one and the same, but given 16.3, there is the potential for confusion. Still, this might just be an unintentional and unfortunate choice of words, and it could be they meant to refer to the length of the original value.

Second, it is pretty clear that when 16.6 requires encoding "as if no extension is present in the constraint", this applies not just to the following clauses but also to a prior clause, namely 16.4, since, otherwise, extensible types would never use an optimized encoding for the length (ub would be unset in such cases). So, if the "as if" applies to one clause before 16.6, then why not to another clause, namely 16.3? Then for type X, '00'B has length of 2, which is in the root, so according to 16.6 we encode it as if the extension weren't present. That means 16.3 does not trim the string, as the effective size constraint is then SIZE(2), not SIZE(2, ...).

This interpretation raises a new question: what about an X of '1110'B? The length is not in the extension root, so 16.6 does not have us encode it as if the extension were not present and, supposedly, 16.3 does not apply (though it is not absolutely clear that 16.3 only applies to "root" values). Thus, X.691 does not clearly require trimming the trailing zero bit in this case. However, Paul Thorpe argues it should be trimmed nonetheless. After all, X.680 22.7 tells us that '1110'B and '111'B should be treated (by application designers) as having the same semantics, and encoding rules can arbitrarily add or remove trailing bits. Therefore, some encoding rules could encode '1110'B using the exact same bits as for '111'B. However, this doesn't mean that all encoding rules must do so, nor does it dictate that PER, in particular, must do so. Still, it does mean that if a PER implementation trims '1110'B to '111'B when encoding, a user at least can't complain they got back an unexpected value of '111'B after decoding. However, this doesn't answer what the correct encoding is, and there must be a single correct encoding if we're to have canonical encodings.

A similar question arises for an X of '1100'B. Its length, 4, is not in the root, but the value is semantically equivalent to '11'B, whose length is in the root. It seems not unreasonable to encode '1100'B the same as '11'B, but the specs don't make it absolutely clear that this is what PER requires. Similarly for an X of ''B: its length, 0, is not in the root, but the value is semantically equivalent to '00'B, which, under this alternative interpretation, should be encoded as length 2.
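The problem cases for X under the alternative interpretation can be tabulated with a short sketch. It only reports facts about each value (its length, whether that length is in the root, and whether some form that differs only in trailing 0 bits has a root length); it deliberately does not decide how PER should encode them, since that is the unresolved part. The names are mine.

```python
def describe(bits, root):
    """Facts about a named-bit BIT STRING value relative to a size root."""
    minimal = len(bits.rstrip("0"))  # size with all trailing 0 bits removed
    return {
        "length": len(bits),
        "length_in_root": len(bits) in root,
        # True if adding or removing trailing 0 bits can reach a root size.
        "has_root_equivalent": any(n >= minimal for n in root),
    }

x_root = {2}  # SIZE(2, ...)

for value in ("00", "", "1100", "1110"):
    print(repr(value), describe(value, x_root))
```

Only '00'B has a root length as written. ''B and '1100'B have a root-length equivalent, while '1110'B has none, so it is an extension value under either interpretation and the only question for it is whether the trailing 0 bit is removed.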

IMO, ITU-T needs to make some clarifications here as I don't think the arguments one way or the other are decisive. I believe the specification is inherently vague.

(End of update 2)

Basically, I'm looking for a contrary argument or for someone to point out what I've missed, if I have missed something.

Do you want to know for curiosity, or do you really want to trim 2 bits from your data? Because it doesn't seem worth worrying about the 2 bits if you have something that works.
– John Bayko, Commented Nov 28, 2023 at 17:51
@JohnBayko I want to know for correctness. It's not supposed to be an encoder's option, whether to encode using a length of 0 or 2.
– Kevin, Commented Nov 28, 2023 at 19:06

Alessandro · Accepted Answer · 2023-11-28 23:19:09Z

1

Your interpretation is contradicted by clause 16.6, which deals with types extensible for PER encodings (especially the last phrase, "encoded as if no extension is present"). I believe clauses 16.1 to 16.5 refer to types that are not extensible for PER encodings, or to the cases in which the length and value are to be "encoded as if no extension is present" in accordance with 16.6. This also implies that the phrase "effective size constraint" in 16.3 is to be understood as referring to the root part of it.

edited Nov 28, 2023 at 23:19

answered Nov 28, 2023 at 22:32

Alessandro


26 Comments

Kevin Over a year ago

Applying 16.6, the type is extensible for PER encodings, so the next question is whether the "length of the encoding" is "within the range of the extension root". Note it is the length of the encoding not the length of the value that is in question. We have to apply 16.3 to determine the length of the encoding. If the encoding length is 0 (not in the extension root), then 16.6 specifies an extension bit of 1; if it is 2, then the extension bit will be 0. So, I don't think 16.6 contradicts my interpretation. Rather, the correct application of 16.6 depends on how 16.3 is understood.

Paul Thorpe Over a year ago

Please note that you are dealing with a BIT STRING with a NamedBitList. The encoder is permitted to add or remove bits to comply with the size constraint. If no bits are set, the length should still be 2 bits to comply with the size constraint for the extension root.

Alessandro Over a year ago

Clause 16.6 says, "In the latter case the length and value shall be encoded as if no extension is present in the constraint", which also applies to the phrase "satisfies the effective size constraint" in 16.3. That is, the reference to the "effective size constraint" in 16.3 should be understood as the root part of the effective size constraint because 16.6 says that the encoding must be as if no extension is present in the constraint. Clause 16.6 cannot depend on 16.3, otherwise it would be self-contradictory.

Alessandro Over a year ago

A further consideration is that if a value can be understood as being a root value (thanks to clause 16.3 allowing the addition or removal of trailing zero bits), then it should be encoded as a root value.

Kevin Over a year ago

@PaulThorpe Maybe you can create an answer that cites the spec as to why the length should be a length that is in the extension root, versus zero, because I wasn't able to do so myself. After all, strings of length zero are allowed, thanks to the extensibility.


Paul Thorpe · Accepted Answer · 2023-12-05 17:41:05Z

0

Please notice Rec. ITU-T X.691 | ISO/IEC 8825-2 clause 10.4.3 which is as follows:

10.4.3 When a constraint includes a value as an extension addition that is present in the root, that value is always encoded as a value in the root, not as a value which is an extension addition.

You then should use clause 16.3 which states the following:

16.3 Where there is a PER visible constraint and Rec. ITU-T X.680 | ISO/IEC 8824-1, 22.7, applies (i.e. the bitstring type is defined with a "NamedBitList"), the value shall be encoded with trailing 0 bits added or removed as necessary to ensure that the size of the transmitted value is the smallest size capable of carrying this value and satisfies the effective size constraint.

The encoding should be one with 2 zero bitstring bits to satisfy the size constraint.

Please also notice Rec. ITU-T X.680 | ISO/IEC 8824-1 clause 22.7 which reads as follows:

22.7 When a "NamedBitList" is used in defining a bitstring type ASN.1 encoding rules are free to add (or remove) arbitrarily any trailing 0 bits to (or from) values that are being encoded or decoded. Application designers should therefore ensure that different semantics are not associated with such values which differ only in the number of trailing 0 bits.

edited Dec 5, 2023 at 17:41

answered Dec 1, 2023 at 17:00

Paul Thorpe


2 Comments

Kevin Over a year ago

10.4.3 does not apply to this case. It covers the case where an extension addition value is present in the root, and it gives an example of such a case: INTEGER (0..10, ..., 5). But, that is not the situation in question. The size constraint is SIZE(2, ...) and the length of 0 is not in the root.

Paul Thorpe Over a year ago

@Kevin You are dealing with a BIT STRING with a NamedBitList. Please notice Rec. ITU-T X.680 | ISO/IEC 8824-1 clause 22.7 which is as follows: 22.7 When a "NamedBitList" is used in defining a bitstring type ASN.1 encoding rules are free to add (or remove) arbitrarily any trailing 0 bits to (or from) values that are being encoded or decoded. Application designers should therefore ensure that different semantics are not associated with such values which differ only in the number of trailing 0 bits. This means that a bitstring value with no bits set is in the root.
