
I have a list of mailing-list records whose addresses are not split into separate fields. I want to split each record into the street address and the city name.

For the first issue, how can I split the records on the street type, e.g. "St", "Drive", "Dr", "Trail", etc.?

String.Split "eats" the "Ct" in this example:

string source1 = "Cxxx, Kxxx,9999 Valleycrest Ct Allen TX 75002 ,,,,,,,,,";
// string source2 = "Cxxx, Mxxx Exxx,9999 Chesterwood Dr Little Elm, TX 75068 ,,,,,,,,,";

string[] stringSeparators = new string[] { "Drive", "St", "Dr", "Trail", "Ct" };
string[] result;

// ...
result = source1.Split(stringSeparators, StringSplitOptions.None);

foreach (string s in result)
{
    Console.Write("'{0}' ", String.IsNullOrEmpty(s) ? "<>" : s);
}

// Objective:
// "Cxxx, Kxxx,9999 Valleycrest Ct, Allen, TX, 75002 ,,,,,,,,,"

Here is a sample of the list.

"Pxxx, Sxxx","9999 Southgate Dr McKinney, TX 75070 ",,,,,,,,,
"Hxxxx, Mxxxx","9999 Glendale Ct Allen, TX 75013 ",,,,,,,,,
"Axxxx, Nxxxxx","99999 Balez Drive Frisco, TX 75035 ",,,,,,,,,
"Sxxx, Dxxxx","999 Pine Trail Allen, TX 75002 ",,,,,,,,,
"Vxxx, Sxxxx","9999 Richmond Ave Dallas, TX 75206 ",,,,,,,,,

My list does not include "St Louis" so that will not be an issue.

To simplify my issue: if I have the following string

"Cxxx, Kxxx,9999 Valleycrest Ct Allen TX 75002"

and I want to split on any of the strings "Ct", "Dr", or "Ave",

I want the following result[]:

result[0] = "Cxxx, Kxxx,9999 Valleycrest Ct"
result[1] = " Allen TX 75002"

String.Split does not include the delimiter strings in the elements of the returned array, but I want them kept rather than deleted. Is there another option I am missing?

In other words, don't remove the "Ct", "Dr", or whatever separator I find/use.

Thanks

  • That's going to be a difficult problem. What will happen with "St. Louis" or "New Haven"? Maybe you can use a Google or USPS API to do the hard work of parsing the address. Commented Aug 7, 2018 at 20:16
  • What do you want the output to be? Commented Aug 7, 2018 at 20:16
  • You're going to need something far more sophisticated to accomplish your task. Accurately parsing addresses is not a trivial task. Commented Aug 7, 2018 at 20:18
  • In fact, I once worked for a tax-reporting company and wrote just this program in VB6. You need to walk from the back of the address. You can easily get to the state. After that you need to look for "ave, avenue, ct, court", and in the case of "st", check twice, as in "Main st st Louis" or "street rd st louis". In other words, if you have any combination of two address specifiers, your address and state will split somewhere in between, depending on whether you have an apartment, building, etc. It is going to be a program, not just a routine. Commented Aug 7, 2018 at 20:24
  • Falsehoods Programmers Believe About Addresses. Commented Aug 7, 2018 at 20:53

2 Answers


You can try this RegEx:

@"(?<first>.+(?:Drive|St|Dr|Trail|Ct))(?<second>[^""]*)"

Now you can access Match.Groups["first"] and Match.Groups["second"]. However, this works without the quotes, as in your example.

BTW: You can try it out in RegExBuilder.

Edit:

(?<first> starts a named group.

.+ matches any character one or more times. Because it is greedy, the match extends to the last separator found in the string.

(?: starts a non-capturing group that matches any of the words inside; | means "or".

The named group "second" matches zero or more characters that are not a quote.
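For reference, here is a minimal sketch of how the pattern might be used from C#. It is not the answerer's code: the class and method names are hypothetical, and the pattern carries two assumed additions, "Ave" in the list and \b word boundaries so that a street type is only matched as a whole word.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexAddressSplitter
{
    // Pattern from the answer, plus two assumptions: "Ave" is added and
    // \b word boundaries keep "St" or "Dr" from matching inside longer words.
    private static readonly Regex Pattern = new Regex(
        @"(?<first>.+\b(?:Drive|St|Dr|Trail|Ct|Ave)\b)(?<second>[^""]*)");

    // Returns the street part (separator kept) and the remainder,
    // or the whole input and an empty remainder when nothing matches.
    public static (string Street, string Rest) Split(string input)
    {
        Match m = Pattern.Match(input);
        return m.Success
            ? (m.Groups["first"].Value, m.Groups["second"].Value)
            : (input, string.Empty);
    }

    public static void Main()
    {
        var (street, rest) = Split("Cxxx, Kxxx,9999 Valleycrest Ct Allen TX 75002");
        Console.WriteLine("'{0}' '{1}'", street, rest);
        // prints: 'Cxxx, Kxxx,9999 Valleycrest Ct' ' Allen TX 75002'
    }
}
```

The match is case-sensitive, so "ct" or "DR" would not be recognized unless RegexOptions.IgnoreCase is passed to the Regex constructor.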


1 Comment

Thanks Poul... I wish I could de-translate the Regex to know what the syntax means. But it does the job.

As a commenter pointed out, if you have a large database of addresses, you will likely come across some that won't parse correctly this way, and you'll have to make tweaks. For this reason I would contain the risk inside a separate class designed specifically for parsing the address. For the parsing itself, you just need to use IndexOf the old-fashioned way, in a loop:

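using System;

public class Address
{
    // Street types, tried in list order; the first one that occurs anywhere in the text wins.
    private static readonly string[] separators = new string[] { "Drive", "St", "Dr", "Trail", "Ct" };

    protected readonly string _text;

    public Address(string text)
    {
        _text = text;
        foreach (var s in separators)
        {
            // Ordinal comparison: a plain, culture-independent substring search.
            var i = text.IndexOf(s, StringComparison.Ordinal);
            if (i == -1) continue;
            var splitPoint = i + s.Length;
            StreetPart = text.Substring(0, splitPoint);
            // Skip the space after the separator; guard against a separator at the very end of the text.
            CityPart = splitPoint < text.Length ? text.Substring(splitPoint + 1) : string.Empty;
            return;
        }
        // No known street type found: leave the address unsplit.
        StreetPart = text;
        CityPart = null;
    }

    public string StreetPart { get; private set; }
    public string CityPart { get; private set; }

    public override string ToString()
    {
        return _text;
    }
}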

Then you can call it like this:

public class Program
{
    public static string[] tests = new string []
    {
        @"9999 Southgate Dr McKinney,TX 75070",
        @"Glendale Ct Allen, TX 75013",
        @"99999 Balez Drive Frisco, TX 75035",
        @"999 Pine Trail Allen, TX 75002",
        @"999 Richmond Ave Dallas, TX 75206"
    };

    public static void Main()
    {
        foreach (var t in tests)
        {
            var a = new Address(t);
            Console.WriteLine("Address: '{0}'  StreetPart: '{1}' CityPart: '{2}'", a, a.StreetPart, a.CityPart);
        }
    }
}

Output:

Address: '9999 Southgate Dr McKinney,TX 75070'  StreetPart: '9999 Southgate Dr' CityPart: 'McKinney,TX 75070'
Address: 'Glendale Ct Allen, TX 75013'  StreetPart: 'Glendale Ct' CityPart: 'Allen, TX 75013'
Address: '99999 Balez Drive Frisco, TX 75035'  StreetPart: '99999 Balez Drive' CityPart: 'Frisco, TX 75035'
Address: '999 Pine Trail Allen, TX 75002'  StreetPart: '999 Pine Trail' CityPart: 'Allen, TX 75002'
Address: '999 Richmond Ave Dallas, TX 75206'  StreetPart: '999 Richmond Ave Dallas, TX 75206' CityPart: ''

Example on DotNetFiddle
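Because IndexOf matches substrings, a street name that happens to contain "St" or "Dr" can split in the wrong place. One possible variation, sketched below under stated assumptions, compares whole whitespace-separated tokens and walks from the back of the address, as a commenter suggested. The class name, the added "Ave" entry, and the "Trail Ridge" test address are all hypothetical, and a city such as "St Louis" would still defeat it.

```csharp
using System;
using System.Collections.Generic;

public static class TokenAddressSplitter
{
    // Hypothetical list of street types; extend it to match your data.
    private static readonly HashSet<string> StreetTypes = new HashSet<string>(
        new[] { "Drive", "St", "Dr", "Trail", "Ct", "Ave" },
        StringComparer.OrdinalIgnoreCase);

    public static (string Street, string City) Split(string text)
    {
        string[] tokens = text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        // Walk from the back so that the last street type in the text wins.
        for (int i = tokens.Length - 1; i >= 0; i--)
        {
            // Ignore trailing punctuation such as "Dr." or "Ct,".
            string word = tokens[i].TrimEnd('.', ',');
            if (!StreetTypes.Contains(word)) continue;

            string street = string.Join(" ", tokens, 0, i + 1);
            string city = string.Join(" ", tokens, i + 1, tokens.Length - i - 1);
            return (street, city);
        }

        // No street type found: leave the address unsplit.
        return (text, string.Empty);
    }

    public static void Main()
    {
        var (street, city) = Split("9999 Trail Ridge Dr Allen, TX 75002");
        Console.WriteLine("'{0}' '{1}'", street, city);
        // prints: '9999 Trail Ridge Dr' 'Allen, TX 75002'
    }
}
```

Note that splitting and re-joining the tokens collapses runs of spaces and drops leading and trailing whitespace, which is usually acceptable for a mailing list.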

1 Comment

John, Very nice solution. Plus I never had seen DotNetFiddle
