0

In PHP and Java there are explode and tokenizer functions that convert a string into an array without punctuation. Are there functions, or some other way, to do this in Delphi? Suppose there is a large file containing "This is, a large file with punctuations,, and spaces and numbers 123...". How can we get the array "This is a large file with punctuations and spaces and numbers 123"?

Thank you very much in advance.

Yes, we want only [0..9], [a..z], [A..Z], like \w in regex. Can we use a regex with TPerlRegEx to extract the \w matches and put them in a TStringList, as if the TStringList were an array? That may not be very efficient, though. Thank you.

3
  • You might want to observe the fact that a string, in essence, is an array of characters. Commented Oct 22, 2010 at 14:42
  • Based on your example, it looks like you just want to remove commas from a string. Can you please edit your question to be more precise? Commented Oct 22, 2010 at 15:29
  • Thank you Andreas Rejbrand and Eugene Mayevski Commented Oct 22, 2010 at 15:30

3 Answers

4

If you need a function that takes a string and returns an array of strings, where these strings are the substrings of the original separated by punctuation, as Eugene suggested in a comment on my previous answer, then you can do

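type
  StringArray = array of string;
  IntegerArray = array of integer;
  TCharSet = set of char;

// Splits str at every character contained in delims and trims each piece.
// Note: consecutive or trailing delimiters produce empty strings in the result.
// Trim is declared in SysUtils.
function split(const str: string; const delims: TCharSet): StringArray;
var
  SepPos: IntegerArray;
  i: Integer;
begin
  // SepPos holds the positions of all separators, framed by two virtual
  // separators at positions 0 and length(str) + 1.
  SetLength(SepPos, 1);
  SepPos[0] := 0;
  for i := 1 to length(str) do
    if str[i] in delims then
    begin
      SetLength(SepPos, length(SepPos) + 1);
      SepPos[high(SepPos)] := i;
    end;
  SetLength(SepPos, length(SepPos) + 1);
  SepPos[high(SepPos)] := length(str) + 1;
  // Each pair of neighbouring separators encloses exactly one substring.
  SetLength(result, high(SepPos));
  for i := 0 to high(SepPos) -  1 do
    result[i] := Trim(Copy(str, SepPos[i] + 1, SepPos[i+1] - SepPos[i] - 1));
end;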

Example:

const
  PUNCT = ['.', ',', ':', ';', '-', '!', '?'];

procedure TForm4.FormCreate(Sender: TObject);
var
  str: string;
begin
  for str in split('this, is, a! test!', PUNCT) do
    ListBox1.Items.Add(str)
end;
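
If, as the question's follow-up says, only [0..9], [a..z] and [A..Z] should survive, a variant that treats every other character as a separator and skips empty tokens may be closer to what is wanted. The following is only a sketch under that assumption. The names TWordArray, IsWordChar and SplitWords are made up for illustration and do not come from the answer above, and the FPC directive is there only so the fragment also compiles with Free Pascal.

```pascal
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

type
  TWordArray = array of string;   // hypothetical name, for illustration only

// True for the characters the question asks to keep: 0..9, a..z, A..Z.
function IsWordChar(c: Char): Boolean;
begin
  case c of
    '0'..'9', 'a'..'z', 'A'..'Z': Result := True;
  else
    Result := False;
  end;
end;

// Splits Str into maximal runs of word characters. Everything else
// (punctuation, spaces, line breaks) acts as a separator, and empty
// tokens are never emitted.
function SplitWords(const Str: string): TWordArray;
var
  i, TokenStart, Count: Integer;
begin
  SetLength(Result, Length(Str) div 2 + 1);  // upper bound on the token count
  Count := 0;
  TokenStart := 0;                           // 0 means "not inside a token"
  // Run one step past the end so that a token ending the string is flushed.
  for i := 1 to Length(Str) + 1 do
    if (i <= Length(Str)) and IsWordChar(Str[i]) then
    begin
      if TokenStart = 0 then
        TokenStart := i;                     // a new token begins here
    end
    else if TokenStart > 0 then
    begin
      Result[Count] := Copy(Str, TokenStart, i - TokenStart);
      Inc(Count);
      TokenStart := 0;
    end;
  SetLength(Result, Count);                  // trim to the tokens actually found
end;
```

Called with the sentence from the question, SplitWords should return the twelve words without the commas, the doubled comma or the trailing dots. Unlike split above, it needs no delimiter set, at the cost of also cutting words at characters such as apostrophes or accented letters.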

2

This depends on the definition of "alphanumeric character" and "punctuation character".

If we, for instance, define the set of punctuation characters

const
  PUNCT = ['.', ',', ':', ';', '-', '!', '?'];

and consider all other characters alphanumeric, then you could do

function RemovePunctuation(const Str: string): string;
var
  ActualLength: integer;
  i: Integer;
const
  PUNCT = ['.', ',', ':', ';', '-', '!', '?'];
begin
  SetLength(result, length(Str));
  ActualLength := 0;
  for i := 1 to length(Str) do
    if not (Str[i] in PUNCT) then
    begin
      inc(ActualLength);
      result[ActualLength] := Str[i];
    end;
  SetLength(result, ActualLength);
end;

This function turns a string into a string. If you want to turn a string into an array of characters instead, just do

type
  CharArray = array of char;

function RemovePunctuation(const Str: string): CharArray;
var
  ActualLength: integer;
  i: Integer;
const
  PUNCT = ['.', ',', ':', ';', '-', '!', '?'];
begin
  SetLength(result, length(Str));
  ActualLength := 0;
  for i := 1 to length(Str) do
    if not (Str[i] in PUNCT) then
    begin
      result[ActualLength] := Str[i];
      inc(ActualLength);
    end;
  SetLength(result, ActualLength);
end;

(Yes, in Delphi, strings use 1-based indexing, whereas dynamic arrays use 0-based indexing. This is for historical reasons.)
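
To make the index shift explicit, here is a minimal sketch that copies a string into a dynamic array without any filtering. ToCharArray is a hypothetical helper name used only for this illustration, and the FPC directive is there only so the fragment also compiles with Free Pascal.

```pascal
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

type
  CharArray = array of Char;

// Copies a string into a dynamic array of characters, shifting from
// the string's 1-based indices to the array's 0-based indices.
function ToCharArray(const Str: string): CharArray;
var
  i: Integer;
begin
  SetLength(Result, Length(Str));
  for i := 1 to Length(Str) do      // string indices run 1..Length(Str)
    Result[i - 1] := Str[i];        // array indices run 0..Length(Str) - 1
end;
```

This is the same shift that the second RemovePunctuation performs implicitly by writing to result[ActualLength] before incrementing ActualLength.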

2 Comments

I believe the OP needs a parser function which will take a string and create an array of substrings, extracted by splitting on punctuation marks.
Ah, I see. (But why didn't he/she say so?)
0

There seems to be no built-in functionality like Java's tokenizer. A long time ago we wrote a tokenizer class similar to the Java one, which became part of the ElPack component suite (now LMD ElPack). Here's an implementation of a string tokenizer similar to the Java one (I just found this link on Google, so I can't comment on the code quality).
