I'm parsing a list of URLs from a website and want to build a hierarchical tree of nested arrays.

What I have so far (which works) is below. Because I don't know how deep the levels will go, I check the depth explicitly and then assign an empty array at that node.

How can I rewrite this so that any number of levels is accommodated?

$tree = array();
$tree[$domain] = array();  // this is the domain root

foreach ( $allMatches as $url ) {

    $foo = parse_url($url );

    // trim "/" from beginning and end
    $bar = trim($foo['path'],'/');
    // for every "/", add a level
    $parts = explode('/', $bar);
    // drop empty segments, then reindex so $parts[0] is always the first segment
    $parts = array_values(array_filter($parts, 'strlen'));

    // note: there is likely a bug in here.
    // If I process page-1/page-1-1 before page-1,
    // then the leaf or branch containing page-1-1 will be deleted

    if (count($parts) == 1){
        $tree[$domain][$parts[0]] = array();
    }
    if (count($parts) == 2){
        $tree[$domain][$parts[0]][$parts[1]] = array();
    }
    if (count($parts) == 3){
        $tree[$domain][$parts[0]][$parts[1]][$parts[2]] = array();
    }
    if (count($parts) == 4){
        $tree[$domain][$parts[0]][$parts[1]][$parts[2]][$parts[3]] = array();
    }

}

These are the input URLs:

domain.com/page-1
domain.com/page-1/page-1-1
domain.com/page-1/page-1-1/page-1-1-1
domain.com/page-1/page-1-2
domain.com/page-1/page-1-2/page-1-2-1
domain.com/page-2
domain.com/page-2/page-2-1

Note: I do not necessarily need to have domain.com/page-2 in the list in order to generate a leaf for domain.com/page-2/page-2-1

This is the desired resulting structure:

Array
(
    [domain.com] => Array
        (
            [page-1] => Array
                (
                    [page-1-1] => Array
                        (
                            [page-1-1-1] => Array
                                (
                                )
                        )

                    [page-1-2] => Array
                        (
                            [page-1-2-1] => Array
                                (
                                )
                        )
                )

            [page-2] => Array
                (
                    [page-2-1] => Array
                        (
                        )
                )
        )
)
  • Can you provide an example of input and desired output? Commented Feb 4, 2017 at 4:05
  • If you don't have page-2 in the list, but you do have page-2/page-2-1, do you want to create the parent node (i.e. page-2), or do you only want page-2-1 as a leaf node? Commented Feb 4, 2017 at 4:13
  • I would like to create the parent node as well as the leaf in this case. Commented Feb 4, 2017 at 4:19

1 Answer

You can do this with a recursive function that takes the array by reference.

$result = array();

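function build_array(&$arr, $parts, $i = 0) {
    // stop once every path segment has been consumed
    if ($i == count($parts)) {
        return;
    }
    // create the node only if it is missing, so existing children survive
    if (!isset($arr[$parts[$i]])) {
        $arr[$parts[$i]] = array();
    }
    build_array($arr[$parts[$i]], $parts, $i + 1);
}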

# Call it like so:
build_array($result, $parts);

Call this function once for each URL you have, passing that URL's $parts, and it should work.
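As a rough end-to-end sketch, the loop from the question could call the function like this. It assumes the URLs carry a scheme (so parse_url() can split host from path) and that $domain is already known. The sample URLs are deliberately listed child-first to show that the isset() check keeps existing children intact.

```php
<?php
// Recursive helper from the answer above.
function build_array(&$arr, $parts, $i = 0) {
    if ($i == count($parts)) {
        return;
    }
    if (!isset($arr[$parts[$i]])) {
        $arr[$parts[$i]] = array();
    }
    build_array($arr[$parts[$i]], $parts, $i + 1);
}

// Assumption: sample input with an explicit scheme, deepest URL first.
$allMatches = array(
    'http://domain.com/page-1/page-1-1/page-1-1-1',
    'http://domain.com/page-1',
    'http://domain.com/page-2/page-2-1',
);

$domain = 'domain.com';
$result = array($domain => array()); // domain root

foreach ($allMatches as $url) {
    $path = (string) parse_url($url, PHP_URL_PATH);
    // split on "/", drop empty segments, reindex from 0
    $parts = array_values(array_filter(explode('/', trim($path, '/')), 'strlen'));
    build_array($result[$domain], $parts);
}

print_r($result);
```

Because page-2 is created on the way down to page-2-1, the parent node appears even though page-2 itself is not in the list.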

Hint: use array_reduce.
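One possible reading of that hint, offered only as a sketch: array_reduce can fold a single path into a nested branch, and array_replace_recursive can then merge that branch into the tree. The helper name path_to_branch and the sample $paths are made up for illustration.

```php
<?php
// Hypothetical helper: array('a', 'b') becomes array('a' => array('b' => array())).
function path_to_branch(array $parts) {
    return array_reduce(
        array_reverse($parts),          // start from the leaf and wrap outward
        function ($carry, $part) {
            return array($part => $carry);
        },
        array()                         // a leaf is an empty array
    );
}

// Assumption: paths already split into segments.
$paths = array(
    array('page-1', 'page-1-1'),
    array('page-1'),
    array('page-2', 'page-2-1'),
);

$tree = array();
foreach ($paths as $parts) {
    // merges nested arrays key by key; an empty branch leaves existing children alone
    $tree = array_replace_recursive($tree, path_to_branch($parts));
}

print_r($tree);
```

This version avoids references entirely, at the cost of rebuilding a small branch array for every path.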

Note: If you're doing this in a web context with user input, I would add a depth limit, as you could easily run out of memory with this implementation given bad input.
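A depth limit could look something like the following sketch, which also swaps the recursion for a loop that walks the tree by reference. The function name add_path and the default cap of 32 are arbitrary choices for illustration, not part of the code above.

```php
<?php
// Hypothetical iterative variant with a depth cap.
// Returns false (and leaves the tree untouched) when the path is too deep.
function add_path(array &$tree, array $parts, $maxDepth = 32) {
    if (count($parts) > $maxDepth) {
        return false;
    }
    $node = &$tree;
    foreach ($parts as $part) {
        if (!isset($node[$part])) {
            $node[$part] = array();   // create the missing level
        }
        $node = &$node[$part];        // descend one level
    }
    return true;
}

$tree = array();
$ok = add_path($tree, array('page-1', 'page-1-1'));   // within the limit
$tooDeep = add_path($tree, array('a', 'b', 'c'), 2);  // three levels, limit of two
```

Rejecting the whole path up front keeps partially built branches out of the tree; truncating at the limit would be another option.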

1 Comment

Excellent. 1 minor modification: $result[$domain] = array(); And it works like a charm.
