0

I have the following problem: in array 3 I need the language names (not the hex code), and in array 4 I want only the audio codecs, without the hex values or anything else.

I have tried everything I can think of, but nothing works. Can someone help me?

Here is the input data:

Stream #0:1[0x1100](ger): Audio: dts (DTS) ([130][0][0][0] / 0x0082), 48000 Hz, 5.1(side), s16, 1536 kb/s
Stream #0:2(eng): Audio: dts (DTS-HD MA) ([134][0][0][0] / 0x0086), 48000 Hz, 5.1(side), s16, 1536 kb/s
Stream #0:3: Audio: mp2 ([3][0][0][0] / 0x0003), 48000 Hz, stereo, 192 kb/s
Stream #1:0: Audio: mp2, 41000 Hz, stereo, 48 kb/s

Here is my regex:

/Stream #([0-9\.]+)?:([0-9\.]+).([A-Za-z][A-Za-z]*)?.+Audio: ([^,]+?), ([0-9]+) Hz, ?([^\n,]*)/

Here is the output array:

Array
(
[0] => Array
    (
        [0] => Stream #0:1[0x1100](ger): Audio: dts (DTS) ([130][0][0][0] / 0x0082), 48000 Hz, 5.1(side)
        [1] => Stream #0:2(eng): Audio: dts (DTS-HD MA) ([134][0][0][0] / 0x0086), 48000 Hz, 5.1(side)
        [2] => Stream #0:3: Audio: mp2 ([3][0][0][0] / 0x0003), 48000 Hz, stereo
        [3] => Stream #1:0: Audio: mp2, 41000 Hz, stereo
    )

[1] => Array
    (
        [0] => 0
        [1] => 0
        [2] => 0
        [3] => 1
    )

[2] => Array
    (
        [0] => 1
        [1] => 2
        [2] => 3
        [3] => 0
    )

[3] => Array
    (
        [0] => 
        [1] => eng
        [2] => 
        [3] => 
    )

[4] => Array
    (
        [0] => dts (DTS) ([130][0][0][0] / 0x0082)
        [1] => dts (DTS-HD MA) ([134][0][0][0] / 0x0086)
        [2] => mp2 ([3][0][0][0] / 0x0003)
        [3] => mp2
    )

[5] => Array
    (
        [0] => 48000
        [1] => 48000
        [2] => 48000
        [3] => 41000
    )

[6] => Array
    (
        [0] => 5.1(side)
        [1] => 5.1(side)
        [2] => stereo
        [3] => stereo
    )

)

2 Answers

1

When parsing free-form text, you look for cues in the format. A small sample is usually not enough to go on, because you can't see the program that generates it.

With that caveat, this might address your basic concern. I would still break the line up into a few simple, known parts and parse those separately. The pattern below is written for free-spacing mode (the x modifier), so the whitespace and # comments are ignored by the regex engine.

Stream[ ]+\#
([0-9.]+)? : ([0-9.]+)          # 1,2 input file index : stream index
[^:(]* (?:\(([^)]*)\))?         # 3   language
[^:]* :
[ ]* Audio:
[^(\w,]* (\w*)                  # 4   audio codec
[^,]* ,
[ ]*([0-9]*)[ ]* (?i:[mkhz]+)   # 5   audio sample rate
[^,]* ,
[ ]* ([^\n,]*)                  # 6   audio channel layout
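
As a minimal sketch of how the pattern above could be used, assuming PHP with preg_match_all and the x modifier (the variable names $input, $pattern and $matches are only illustrative), run against the four sample lines from the question:

```php
<?php
// Sample ffmpeg lines copied from the question.
$input = <<<'TXT'
Stream #0:1[0x1100](ger): Audio: dts (DTS) ([130][0][0][0] / 0x0082), 48000 Hz, 5.1(side), s16, 1536 kb/s
Stream #0:2(eng): Audio: dts (DTS-HD MA) ([134][0][0][0] / 0x0086), 48000 Hz, 5.1(side), s16, 1536 kb/s
Stream #0:3: Audio: mp2 ([3][0][0][0] / 0x0003), 48000 Hz, stereo, 192 kb/s
Stream #1:0: Audio: mp2, 41000 Hz, stereo, 48 kb/s
TXT;

// Free-spacing pattern (x modifier). Comments inside the pattern must not
// contain the delimiter or a single quote, so they are kept plain.
$pattern = '/
    Stream[ ]+\#
    ([0-9.]+)? : ([0-9.]+)          # 1,2 file index : stream index
    [^:(]* (?:\(([^)]*)\))?         # 3   language
    [^:]* :
    [ ]* Audio:
    [^(\w,]* (\w*)                  # 4   audio codec
    [^,]* ,
    [ ]*([0-9]*)[ ]* (?i:[mkhz]+)   # 5   sample rate
    [^,]* ,
    [ ]* ([^\n,]*)                  # 6   channel layout
/x';

preg_match_all($pattern, $input, $matches);

print_r($matches[3]); // ger, eng, then two empty strings
print_r($matches[4]); // dts, dts, mp2, mp2
```

Streams without a language tag simply yield an empty string in group 3, so the array indexes stay aligned across all six groups.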

1

If you only want to match the immediate codec name after Audio: then remove all the extraneous match groups, and just look for alphanumeric characters:

 /Stream #([0-9\.]+)?:([0-9\.]+).([A-Za-z][A-Za-z]*)?.+Audio: (\w+)/

Alternatively, you could keep your original regex and call strtok($value, " ") on each entry of result array 4 to keep only the first word.
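
A small sketch of the strtok approach, assuming the codec strings are the entries of array 4 from the output shown in the question (the names $codecs and $names are only illustrative):

```php
<?php
// Entries of result array 4 as printed in the question.
$codecs = [
    'dts (DTS) ([130][0][0][0] / 0x0082)',
    'dts (DTS-HD MA) ([134][0][0][0] / 0x0086)',
    'mp2 ([3][0][0][0] / 0x0003)',
    'mp2',
];

// strtok() with a string argument starts a fresh tokenization each time,
// so every call returns the text before the first space.
$names = array_map(function ($value) {
    return strtok($value, ' ');
}, $codecs);

print_r($names); // dts, dts, mp2, mp2
```

This only cleans up the codec column; the missing language in array 3 still has to be fixed in the regex itself.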

2 Comments

No, I need all the regex capture groups, but array index 3 should hold only the language names and array index 4 only the audio codecs, e.g. [0] => dts, [1] => dts, and so on.
Aha, interesting. And how exactly does this regex not do that?
