5

I need to split the string below into an array, using the numbers in parentheses as keys.

$string = "(731) some text here with number 2 (220) some 54 number other text here";

should be converted into:

array( 
  '731' => 'some text here with number 2', 
  '220' => 'some 54 number other text here' 
);

I have tried:

preg_split( '/\([0-9]{3}\)/', $string ); 

and got:

array ( 
  0 => 'some text here', 
  1 => 'some other text here' 
); 
  • I think you should present the data in question in a slightly better format. It will be easier for the community to read it and reply. Commented Jul 29, 2016 at 9:15
  • The requirements are too vague. Could you at least provide the expected output for the provided input string? Commented Jul 29, 2016 at 9:15
  • And maybe what you tried yourself until now? Commented Jul 29, 2016 at 9:17
  • Did you try something? Commented Jul 29, 2016 at 9:28
  • I'm guessing you want the numbers in the string to be the keys in your array, correct? Commented Jul 29, 2016 at 9:29

6 Answers

8

Code

$string = "(731) some text here with number 2 (220) some 54 number other text here";

preg_match_all("/\((\d{3})\) *([^( ]*(?> +[^( ]+)*)/", $string, $matches);
$result = array_combine($matches[1], $matches[2]);

var_dump($result);

Output

array(2) {
  [731]=>
  string(28) "some text here with number 2"
  [220]=>
  string(30) "some 54 number other text here"
}


Description

The regex uses

  • \((\d{3})\) to match 3 digits in parentheses and capture them (group 1)
  •  * (a space followed by *) to match the spaces between keys and values
  • ([^( ]*(?> +[^( ]+)*) to match everything up to the next (, without trailing spaces, and capture it (group 2)
    This subpattern matches exactly the same as [^(]*(?<! ) but more efficiently, based on the unrolling-the-loop technique.

    Note that I am assuming a value field cannot contain a (. If that is not the case, let me know and I will modify it accordingly.

After that, we have $matches[1] with keys and $matches[2] with values. Using array_combine() we generate the desired array.
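
As a quick check of the claim that the unrolled subpattern matches the same text as [^(]*(?<! ), here is a minimal sketch that runs both patterns on the sample string and compares the captured values. The variable names $unrolled and $simple are illustrative only, and the comparison covers only this one input, so it is not a proof of equivalence.

```php
<?php
$string = "(731) some text here with number 2 (220) some 54 number other text here";

// Unrolled form used in the answer above.
preg_match_all('/\((\d{3})\) *([^( ]*(?> +[^( ]+)*)/', $string, $unrolled);

// Simpler form: take everything up to the next "(", then use a
// lookbehind to refuse a match that ends in a space.
preg_match_all('/\((\d{3})\) *([^(]*(?<! ))/', $string, $simple);

// Compare the captured values (group 2) of both patterns.
var_dump($unrolled[2] === $simple[2]);
```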

3 Comments

Upvoted only for the use of array_combine(). Two issues with your regex: 1. Unnecessary possessive quantifiers. 2. It ignores parentheses that may appear within the strings.
Why should you think there could be a failure case for \s*+ and [^( ]*+?
Hah! Right! No idea what I was thinking. Thanks for the heads-up
1

Try this:

$string = "(731) some text here with number 2 (220) some 54 number other text here";
$a = preg_split('/\s(?=\()/', $string);//split by spaces preceding the left bracket
$res = array();
foreach($a as $v){
    $r = preg_split('/(?<=\))\s/', $v);//split by spaces following the right bracket
    if(isset($r[0]) && isset($r[1])){
        $res[trim($r[0],'() ')] = trim($r[1]);//trim brackets and spaces
    }
}
print_r($res);

Output:

Array
(
    [731] => some text here with number 2
    [220] => some 54 number other text here
)

If you want to limit it only to those numbers in brackets that have 3 digits, just modify the lookarounds:

$a = preg_split('/\s(?=\([0-9]{3}\))/', $string);

1

You can try this one:

<?php
$str="(731) some text here (220) some other text here";
echo $str .'<br>';
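$arr1 = explode('(', $str); // every piece after the first starts with "key)"
$size_arr = count($arr1);
$final_arr = array();
for ($i = 1; $i < $size_arr; $i++) {
    // a limit of 2 keeps any later ")" inside the value
    $arr2 = explode(')', $arr1[$i], 2);
    $final_arr[$arr2[0]] = trim($arr2[1]);
}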
echo '<pre>';
print_r($final_arr);
?>

I tried to use simple syntax, so I hope it is easy to follow.

1

I'm pretty sure that defining the keys directly is not possible, as the regex will just add matches continuously. I would use two regex calls, one for the keys:

preg_match_all("/(\()([0-9]*)(\))\s/", $input_lines, $output_array);

You will find your keys in $output_array[2]. And one for the texts (using the same pattern):

preg_split("/(\()([0-9]*)(\))\s/", $input_lines);

After that, you can build your custom array iterating over both. Make sure to trim the strings in the second array when inserting.
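
The iteration described above could look roughly like the following sketch. It assumes the sample string from the question as $input_lines; the names $pattern, $keys, $texts and $result are illustrative and not part of the original answer. Because preg_split() returns one more piece than there are delimiters, the first piece (whatever precedes the first key) is dropped so that keys and texts line up by index.

```php
<?php
$input_lines = "(731) some text here with number 2 (220) some 54 number other text here";
$pattern = "/(\()([0-9]*)(\))\s/";

// First pass: collect the keys (capture group 2 holds the digits).
preg_match_all($pattern, $input_lines, $output_array);
$keys = $output_array[2];

// Second pass: split on the same pattern to get the texts.
$texts = preg_split($pattern, $input_lines);
// The first piece is the text before the first key (empty here); drop it.
array_shift($texts);

// Pair each key with the text that follows it, trimming stray spaces.
$result = array();
foreach ($keys as $i => $key) {
    $result[$key] = trim($texts[$i]);
}
print_r($result);
```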

1

Using preg_replace_callback() you can quickly achieve what you want (assuming the keys are always exactly 3 digits in parentheses):

$string = "(731) some text here with number 2 (220) some 54 number other text here";
$array = array();
preg_replace_callback('~(\((\d{3})\))(.*?)(?=(?1)|\Z)~s', function($match) use (&$array) {
    $array[$match[2]] = trim($match[3]);
}, $string);
var_dump($array);

Output:

array(2) {
  [731]=>
  string(28) "some text here with number 2"
  [220]=>
  string(30) "some 54 number other text here"
}

1

You could add the PREG_SPLIT_DELIM_CAPTURE flag to preg_split. From the preg_split manual page (http://php.net/manual/en/function.preg-split.php):

PREG_SPLIT_DELIM_CAPTURE

If this flag is set, parenthesized expression in the delimiter pattern will be captured and returned as well.

So if you change your code to:

$results = preg_split('/\(([0-9]+)\)/s', $data, -1, PREG_SPLIT_DELIM_CAPTURE);

You will obtain an array similar to:

Array
(
    [0] => KS/M/ 2013/1238 
    [1] => 220
    [2] =>  23/12/2013 
    [3] => 300
    [4] => 

    [5] => 731
    [6] =>  VALDETE BUZA ADEM JASHARI- PRIZREN, KS 
    [7] => 526
    [8] => 

    [9] => 591
    [10] => 

    [11] => 740
    [12] => 


    [13] => 540
    [14] =>  DEINA 
    [15] => 546
    [16] => 


    [17] => 511
    [18] =>  3 Preparatet për zbardhim dhe substancat tjera për larje rrobash; preparatet për pastrim, shkëlqim, fërkim dhe gërryerje; sapunët; parfumet, vajrat esencialë, preparatet kozmetike, losionet për flokë, pasta për dhembe
14 Metalet e cmueshme dhe aliazhet e tyre; mallrat në metale të cmueshme ose të veshura me to, që nuk janë përfshire në klasat tjera; xhevahirët, gurët e cmueshëm; instrumentet horologjike dhe kronometrike (për matjen dhe regjistrimin e kohës)
25 Rrobat, këpucët, kapelat
35 Reklamim, menaxhim biznesi; administrim biznesi; funksione zyre
)

You should then loop over the array, ignoring the first element (the text before the first key):

$myArray = array();
$myKey = '';
foreach ($results as $k => $v) {
  if ( ($k > 0) && ($myKey == '')) {
    $myKey = $v;
  } else if ($k > 0) {
    $myArray[$myKey] = $v; 
    $myKey = '';
  }
}
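
For comparison, here is a minimal sketch of the same idea applied to the shorter string from the question. It relies on the fact that, with PREG_SPLIT_DELIM_CAPTURE, the result alternates between captured keys and the text that follows them once the first element is skipped. The names $parts and $pairs are illustrative, and the trim() call is an addition not present in the loop above.

```php
<?php
$string = "(731) some text here with number 2 (220) some 54 number other text here";

// Split on "(digits)" and keep the digits as separate elements.
$parts = preg_split('/\(([0-9]+)\)/', $string, -1, PREG_SPLIT_DELIM_CAPTURE);

// $parts[0] is the text before the first key; from index 1 onward,
// odd indexes hold keys and even indexes hold the matching values.
$pairs = array();
for ($i = 1; $i + 1 < count($parts); $i += 2) {
    $pairs[$parts[$i]] = trim($parts[$i + 1]);
}
print_r($pairs);
```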

EDIT: This answer was written for the following input:

$data ='KS/M/ 2013/1238 (220) 23/12/2013 (300)
(731) VALDETE BUZA ADEM JASHARI- PRIZREN, KS (526)
(591)
(740)

(540) DEINA (546)

(511) 3 Preparatet për zbardhim dhe substancat tjera për larje rrobash; preparatet për pastrim, shkëlqim, fërkim dhe gërryerje; sapunët; parfumet, vajrat esencialë, preparatet kozmetike, losionet për flokë, pasta për dhembe
14 Metalet e cmueshme dhe aliazhet e tyre; mallrat në metale të cmueshme ose të veshura me to, që nuk janë përfshire në klasat tjera; xhevahirët, gurët e cmueshëm; instrumentet horologjike dhe kronometrike (për matjen dhe regjistrimin e kohës)
25 Rrobat, këpucët, kapelat
35 Reklamim, menaxhim biznesi; administrim biznesi; funksione zyre';
