[Java]Bencoding: decoding string

Question

I'm implementing a bencoding system for a torrent downloading system I'm making.

Bencoding a string is very easy: you take a string, for instance "hello", and encode it by writing the string's length, then a ':' character, followed by the string itself. Bencoded "hello" is "5:hello".

This is my current code:

public BencodeString(String string) {
    this.string = string;
}

public static BencodeString parseBencodeString(String string) {
    byte[] bytes = string.getBytes();
    int position = 0;
    int size = 0;
    StringBuilder sb = new StringBuilder();
    while (bytes[position] >= '0' && bytes[position] <= '9') {
        sb.append((char) bytes[position]);
        position++;
    }
    if (bytes[position] != ':')
        return null;

    size = Integer.parseInt(sb.toString());
    System.out.println(size);
    if (size <= 0)
        return null;
    return new BencodeString(string.substring(position + 1, size + position
            + 1));
}

It works, but I have the feeling that it could be done much better. What is the best way to do this?

Note: the string could be any size (so there may be more than one digit before the ':').

Solved already, thanks to everybody that replied here :)

This sounds like a question that would be better answered on codereview.stackexchange.com
– Stephen C, Commented Sep 15, 2012 at 23:16
You could parse the length manually while you're scanning for the end of the length; that would avoid building a new string that you're only going to parse anyway.
– user555045, Commented Sep 15, 2012 at 23:17
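
The suggestion to parse the length while scanning could look roughly like the sketch below. It works on a String, as the code in the question does. The class name `LengthPrefixSketch` and the method name `decode` are made up for illustration, and returning null on malformed input simply mirrors the question's convention.

```java
final class LengthPrefixSketch {
    /**
     * Decodes one bencoded string such as "5:hello".
     * Returns null if the input is malformed (hypothetical convention).
     */
    static String decode(String input) {
        int length = 0;
        int pos = 0;
        // Accumulate the decimal length digit by digit; no StringBuilder needed.
        while (pos < input.length()
                && input.charAt(pos) >= '0' && input.charAt(pos) <= '9') {
            int digit = input.charAt(pos) - '0';
            // Reject lengths that would overflow an int.
            if (length > (Integer.MAX_VALUE - digit) / 10) {
                return null;
            }
            length = length * 10 + digit;
            pos++;
        }
        // There must be at least one digit, and the next character must be ':'.
        if (pos == 0 || pos >= input.length() || input.charAt(pos) != ':') {
            return null;
        }
        int start = pos + 1;
        // The declared length must not run past the end of the input.
        if (length > input.length() - start) {
            return null;
        }
        return input.substring(start, start + length);
    }
}
```

Unlike the original code, this sketch accepts the empty string "0:" and checks bounds before indexing, so input without a ':' yields null instead of an exception.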

the8472 · Accepted Answer · 2012-11-02 22:44:01Z

2

Java strings represent Unicode character sequences. You shouldn't use them for bencoding/decoding since you'll run into encoding issues that way.

For example, dictionary keys must be sorted by their raw byte representation, not by string ordering. Bencoded data has no inherent character set, since values can contain raw binary (such as hashes) or UTF-8 encoded strings (UTF-8 places constraints on which byte sequences are valid).
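
As a rough illustration of sorting keys by their binary representation, here is a minimal comparator sketch. The names `RawKeyOrder` and `BY_RAW_BYTES` are hypothetical. The one Java-specific trap is that `byte` is signed, so each byte is masked to 0..255 before comparing.

```java
import java.util.Comparator;

final class RawKeyOrder {
    /** Compares keys lexicographically, treating each byte as unsigned (0..255). */
    static final Comparator<byte[]> BY_RAW_BYTES = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            // Mask with 0xff so that e.g. (byte) 0x80 sorts after 0x7f.
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        // If one key is a prefix of the other, the shorter key sorts first.
        return a.length - b.length;
    };
}
```

A comparator like this could be passed to a `TreeMap<byte[], V>` constructor to keep dictionary entries in that order.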

You should work with ByteBuffers or plain byte[] arrays instead.
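
A minimal sketch of what byte-based decoding might look like with a `ByteBuffer` follows. The class name `ByteStringDecoder` and method name `readByteString` are invented for this example, and throwing `IllegalArgumentException` on malformed input is an assumed convention, not something the question prescribes.

```java
import java.nio.ByteBuffer;

final class ByteStringDecoder {
    /**
     * Reads one bencoded byte string (e.g. the bytes of "5:hello") from the
     * buffer's current position and advances past it.
     * Returns the payload as raw bytes, without interpreting any charset.
     */
    static byte[] readByteString(ByteBuffer in) {
        int length = 0;
        int digits = 0;
        while (in.hasRemaining()) {
            byte b = in.get();
            if (b == ':') {
                if (digits == 0) {
                    throw new IllegalArgumentException("missing length");
                }
                if (length > in.remaining()) {
                    throw new IllegalArgumentException("truncated payload");
                }
                byte[] payload = new byte[length];
                in.get(payload); // bulk read; advances the buffer position
                return payload;
            }
            if (b < '0' || b > '9') {
                throw new IllegalArgumentException("expected digit or ':'");
            }
            int digit = b - '0';
            // Guard against lengths that would overflow an int.
            if (length > (Integer.MAX_VALUE - digit) / 10) {
                throw new IllegalArgumentException("length too large");
            }
            length = length * 10 + digit;
            digits++;
        }
        throw new IllegalArgumentException("missing ':'");
    }
}
```

Because the buffer tracks its own position, consecutive calls decode consecutive strings. The caller can turn the result into text with `new String(payload, StandardCharsets.UTF_8)` only where the value is known to be text.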


obataku · Accepted Answer · 2012-09-15 23:18:41Z

0

string.substring(string.indexOf(':') + 1)

That's all you need to do. You already know the size of the data since you have it all in a String.


2 Comments

Jon Over a year ago

Can't believe I didn't think of that haha, thanks good sir.

android developer Over a year ago

So he doesn't even need the ":" and the integer before it. He can simply put the string itself.

android developer · Accepted Answer · 2012-09-15 23:16:43Z

0

Use string.split(":") to get the two sides of the string,

then parse both however you wish.


2 Comments

Jon Over a year ago

Is there any way to split only on the first occurrence of ':' with regex? (The string could be a URL, which could contain colons.)

DNA Over a year ago

Yes, use the other form of split() which takes a limit parameter: docs.oracle.com/javase/1.4.2/docs/api/java/lang/…
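
For illustration, the two-argument `String.split(regex, limit)` with a limit of 2 splits only on the first ':'. The sketch below uses hypothetical names (`SplitOnFirstColon`, `payload`) and returns null on malformed input as an assumed convention.

```java
final class SplitOnFirstColon {
    /** Returns the payload of a bencoded string such as "3:a:b", or null if malformed. */
    static String payload(String encoded) {
        // A limit of 2 yields at most two parts, so colons inside the payload survive.
        String[] parts = encoded.split(":", 2);
        if (parts.length != 2 || !parts[0].matches("[0-9]+")) {
            return null;
        }
        int length;
        try {
            length = Integer.parseInt(parts[0]);
        } catch (NumberFormatException e) {
            return null; // too many digits to fit in an int
        }
        if (length > parts[1].length()) {
            return null; // declared length runs past the end of the input
        }
        return parts[1].substring(0, length);
    }
}
```

With this, `payload("16:http://host:80/x")` keeps the colons inside the URL intact.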

DNA · Accepted Answer · 2012-09-15 23:27:33Z

0

How about something like:

public static BencodeString parseBencodeString(String string)
{
    int colon = string.indexOf(":");
    if(colon >= 0)
    {
        return new BencodeString(string.substring(colon+1));
    }
    return null;
}
