I am having difficulty reading a file of stock prices that is stored in a binary format. I have searched here and googled for tutorials on using DataInputStream, but still no luck: none of the approaches I tried work.

I have also read about big- and little-endian conversion in Java, but I still get the wrong values. Does anyone have experience reading a *.mkt file in Java? I have code that works fine, but it is written in C, and the requirement is to rewrite it in Java.

The purpose of the method is to extract several fields from each block of binary data, as selected by

if (j == 1 || j == 4 || j == 9 || j == 11 || j == 12 || j == 13 || j == 14)

Below are the spec for the binary data and the code I wrote for testing.

HEADER

Transcode -> Short 2 Bytes
Timestamp -> Long 4 Bytes
Message -> Short 2 Bytes

DATA

Security Token -> Short 2 Bytes
Last Traded Price -> Long 4 Bytes
Best Buy Quantity -> Long 4 Bytes
Best Buy Price -> Long 4 Bytes
Best Sell Quantity -> Long 4 Bytes
Best Sell Price -> Long 4 Bytes
Total Traded Quantity -> Long 4 Bytes
Average Traded Price -> Long 4 Bytes
Open Price -> Long 4 Bytes
High Price -> Long 4 Bytes
Low Price -> Long 4 Bytes
ClosePrice -> Long 4 Bytes
Filler -> Long 4 Bytes (Blank)

Total 50 Bytes

public static void main(String[] args) throws Exception {
    FileInputStream inputStream = new FileInputStream(new File("<Path to the file>.mkt"));
    List<String> results = readPriceFromStream(inputStream);
    inputStream.close();
    System.out.println(results.get(0));
}

public static List<String> readPriceFromStream(InputStream sourceInputStream) throws Exception {
    List<String> result = new ArrayList<>();

    DataInputStream inputStream = new DataInputStream(sourceInputStream);

    int[] byteSequences = new int[]{2, 4, 2, 2, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4};
    int len = 50;

    for (int i = 1; i <= inputStream.available(); i += len) {
        StringBuilder sb = new StringBuilder();
        int read = 0;

        for (int j = 0; j < byteSequences.length; j++) {
            byte[] buffer = new byte[byteSequences[j]];

            if (j == 1 || j == 4 || j == 9 || j == 11 || j == 12 || j == 13 || j == 14) {
                try {
                    sb.append(Integer.valueOf(inputStream.readLong())).append(",");
                } catch (Exception e) {
                    e.printStackTrace();
                    sb.append("0").append(",");
                }
            } else {
                read = inputStream.read(buffer, 0, byteSequences[j]);
            }
        }

        if (read <= -1) {
            break;
        } else {
            String price = sb.toString();

            if (price.length() > 0) {
                price = price.substring(0, price.lastIndexOf(","));
                result.add(price);
            }
        }
    }

    if (result.size() > 0) {
        result.remove(0);
    }

    inputStream.close();
    return result;
}

And the following is the code snippet written in C:

for(i = 0; i <= fileLen; i=i+58) {
    fread(&TransCode, sizeof(signed short int), 1, input_filename);
    fread(&TimeStamp, sizeof(signed long int), 1, input_filename);

... Truncated for clarity ...

Sample data

Transcode, Timestamp, MessageLength, SecurityToken, LastTradedPrice, BestBuyQuantity, BestBuyPrice, BestSellQuantity, BestSellPrice, TotalTradedQuantity, AverageTradedPrice, OpenPrice, HighPrice, LowPrice, ClosePrice, Blank

5,1435905898,58,7,34600,1,34585,29,34600,47479,34777,34560,35100,34500,34670,0

Result from the main(String[] args)

-2416744146710362880,-615304298158882816,-7614115823107390437,149649579050240,22110525258626,139753974434839,144387995842138645

If this duplicates another question or has been answered before, please kindly point me to that question/answer, because I am desperate now (I have spent half a day trying to make it work) and I have limited knowledge of this kind of binary handling. Thanks.

  • Java's long is 8 bytes, not 4 bytes. Commented Jul 3, 2015 at 10:44
  • I tried switching to readInt and readShort, but I still got the wrong values. Commented Jul 3, 2015 at 10:53
  • Jeez, your code is horrible. Hold on... Commented Jul 3, 2015 at 10:56

3 Answers

Try

Path path = Paths.get("path/to/file");
byte[] byteArray= Files.readAllBytes(path);
ByteBuffer bbuffer = ByteBuffer.wrap(byteArray);
short numS = bbuffer.getShort();
System.out.println("short: " + numS);

If the byte order is wrong (e.g. you get 1280 instead of 5), try Short.reverseBytes(numS); for a single value, or bbuffer.order(ByteOrder.LITTLE_ENDIAN); to switch the byte order for all subsequent reads.
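As a small illustration of the 1280-versus-5 symptom, the sketch below reads the same two bytes in both byte orders. The class and method names (EndianDemo, readShort) are made up for this example; only ByteBuffer, ByteOrder and Short.reverseBytes come from the JDK.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class EndianDemo {
    // Reads the first two bytes of data as a short in the given byte order.
    static short readShort(byte[] data, ByteOrder order) {
        return ByteBuffer.wrap(data).order(order).getShort();
    }

    public static void main(String[] args) {
        byte[] raw = {0x05, 0x00}; // the value 5 as written by a little-endian machine

        short bigEndian = readShort(raw, ByteOrder.BIG_ENDIAN);      // ByteBuffer's default order
        short littleEndian = readShort(raw, ByteOrder.LITTLE_ENDIAN);

        System.out.println(bigEndian);                     // 1280 (0x0500)
        System.out.println(littleEndian);                  // 5    (0x0005)
        System.out.println(Short.reverseBytes(bigEndian)); // 5
    }
}
```

Setting the order once on the buffer is usually less error-prone than reversing every value by hand.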

java.nio.ByteBuffer also supports absolute reads at a given position, e.g. java.nio.ByteBuffer.getShort(int), and of course other data types. A binary file has no lines, so just read it record by record (in fixed-size chunks: the 50 data bytes from the spec plus the header bytes) with the ByteBuffer.
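A minimal sketch of such absolute reads follows. It assumes a 58-byte record (8 header bytes plus the 50 data bytes from the spec, which also matches the step in the C loop) and derives the field offsets from the spec; the class, constant and method names are invented for illustration, and the buffer is filled with synthetic values rather than read from a real file.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class AbsoluteReadDemo {
    static final int RECORD_SIZE = 58;           // assumption: 8 header bytes + 50 data bytes
    static final int SECURITY_TOKEN_OFFSET = 8;  // after Transcode (2) + Timestamp (4) + Message (2)
    static final int LAST_PRICE_OFFSET = 10;     // after Security Token (2)

    // Absolute reads: they do not move the buffer's position.
    static short securityToken(ByteBuffer buf, int recordIndex) {
        return buf.getShort(recordIndex * RECORD_SIZE + SECURITY_TOKEN_OFFSET);
    }

    static int lastTradedPrice(ByteBuffer buf, int recordIndex) {
        return buf.getInt(recordIndex * RECORD_SIZE + LAST_PRICE_OFFSET);
    }

    // Builds two synthetic records; only the two fields of interest are filled in.
    static ByteBuffer sampleBuffer() {
        ByteBuffer buf = ByteBuffer.allocate(2 * RECORD_SIZE).order(ByteOrder.LITTLE_ENDIAN);
        buf.putShort(SECURITY_TOKEN_OFFSET, (short) 7);
        buf.putInt(LAST_PRICE_OFFSET, 34600);
        buf.putShort(RECORD_SIZE + SECURITY_TOKEN_OFFSET, (short) 10);
        buf.putInt(RECORD_SIZE + LAST_PRICE_OFFSET, 30425);
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer buf = sampleBuffer();
        for (int i = 0; i < buf.capacity() / RECORD_SIZE; i++) {
            System.out.println(securityToken(buf, i) + "," + lastTradedPrice(buf, i));
        }
    }
}
```

Absolute reads are convenient when only a few fields per record are needed, because the skipped fields never have to be consumed.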

6 Comments

I tried your suggestion and got the same result as with readShort from DataInputStream. The printout above shows 1280, whereas it should be 5, as described in the sample data.
Can you post the hex values of that file (using a hex editor such as UltraEdit)? In Java, the short 5 is 0000000000000101 in binary or 0005 in hex; the short 58 is 0000000000111010 in binary or 003A in hex.
If it's an endianness problem, try Short.reverseBytes(numS);
Just use bbuffer.order(ByteOrder.LITTLE_ENDIAN);.
@hinneLinks Here are the hex values of the first few records from the file (I've tried reordering the bytes using LITTLE_ENDIAN and the reverseBytes method and it is still working). 05 00 6A 2F 96 55 3A 00 07 00 28 87 00 00 01 00 00 00 19 87 00 00 1D 00 00 00 28 87 00 00 77 B9 00 00 D9 87 00 00 00 87 00 00 1C 89 00 00 C4 86 00 00 6E 87 00 00 00 00 00 00 05 00 6A 2F 96 55 3A 00 0A 00 D9 76 00 00 C9 00 00 00 C5 76 00 00 0C 00 00 00 DE 76 00 00 C6 53 05 00 3E 77 00 00 F7 76 00 00 37 78 00 00 66 76 00 00 61 76 00 00 00 00 00 00 05 00 62 2F 96 55 3A 00 0D 00 C4 1B

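Lose all that for (int i = 1; i <= inputStream.available(); i += len) { stuff. available() only estimates how many bytes can be read without blocking, and that number shrinks as you read, so it is not a usable loop bound.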

After DataInputStream inputStream = new DataInputStream(sourceInputStream); create a loop that goes something like this...

try {
    while(true) {  // An EOFException is thrown when there's no more data
       short transcode = inputStream.readShort();
       int timestamp = inputStream.readInt();
       short message = inputStream.readShort();
       // and so on
    }
} catch(EOFException e) {
    // File processed
}

Keep in mind that Java's integer types are signed, whereas at least some of the data fields may be unsigned.
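If a field does turn out to be unsigned, one possible way to handle it is to widen the value after reading it. The sketch below assumes Java 8 or later for Short.toUnsignedInt and Integer.toUnsignedLong; the class and wrapper method names are hypothetical. Whether any field in this file is actually unsigned is not confirmed by the spec, and the C snippet declares its variables as signed.

```java
class UnsignedDemo {
    // Widens a 16-bit value to int, treating it as unsigned (0..65535).
    static int unsignedShort(short value) {
        return Short.toUnsignedInt(value);
    }

    // Widens a 32-bit value to long, treating it as unsigned (0..4294967295).
    static long unsignedInt(int value) {
        return Integer.toUnsignedLong(value);
    }

    public static void main(String[] args) {
        short rawShort = (short) 0xFFFE; // bytes FE FF in a little-endian file
        int rawInt = 0xFFFFFFFE;

        System.out.println(rawShort);                // -2
        System.out.println(unsignedShort(rawShort)); // 65534
        System.out.println(rawInt);                  // -2
        System.out.println(unsignedInt(rawInt));     // 4294967294
    }
}
```

On older Java versions, masking with value & 0xFFFF or value & 0xFFFFFFFFL gives the same results.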

Edit: Since your data is actually in little-endian form, it's better to use a ByteBuffer, as hinneLinks advised:

Path path = Paths.get("path/to/file");
byte[] byteArray= Files.readAllBytes(path);
ByteBuffer bbuffer = ByteBuffer.wrap(byteArray);
bbuffer.order(ByteOrder.LITTLE_ENDIAN); // Set the byte order
short numS = bbuffer.getShort();
System.out.println("short: " + numS);
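Extending that snippet to a whole record, the sketch below decodes the first 58 bytes of the hex dump posted in the comments, following the field order of the spec in the question. The class and helper names (MktRecordDemo, fromHex, decode) are invented for this example, and the hex string stands in for bytes that would normally come from Files.readAllBytes.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class MktRecordDemo {
    // First 58 bytes of the hex dump posted in the comments.
    static final String FIRST_RECORD_HEX =
        "05 00 6A 2F 96 55 3A 00 07 00 28 87 00 00 01 00 00 00 19 87 00 00 1D 00 00 00 "
      + "28 87 00 00 77 B9 00 00 D9 87 00 00 00 87 00 00 1C 89 00 00 C4 86 00 00 "
      + "6E 87 00 00 00 00 00 00";

    // Converts space-separated hex pairs into raw bytes.
    static byte[] fromHex(String hex) {
        String[] parts = hex.trim().split("\\s+");
        byte[] out = new byte[parts.length];
        for (int i = 0; i < parts.length; i++) {
            out[i] = (byte) Integer.parseInt(parts[i], 16);
        }
        return out;
    }

    // Decodes one record into a comma-separated line, in spec order.
    static String decode(ByteBuffer buf) {
        StringBuilder sb = new StringBuilder();
        sb.append(buf.getShort()).append(','); // Transcode
        sb.append(buf.getInt()).append(',');   // Timestamp
        sb.append(buf.getShort()).append(','); // Message length
        sb.append(buf.getShort());             // Security Token
        for (int i = 0; i < 12; i++) {         // twelve 4-byte fields: Last Traded Price .. Filler
            sb.append(',').append(buf.getInt());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.wrap(fromHex(FIRST_RECORD_HEX)).order(ByteOrder.LITTLE_ENDIAN);
        // Prints 5,1435905898,58,7,34600,1,34585,29,34600,47479,34777,34560,35100,34500,34670,0
        System.out.println(decode(buf));
    }
}
```

The printed line matches the sample row in the question. For a real file, calling decode in a loop while buf.remaining() is at least 58 would walk through all complete records.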

1 Comment

Your data has the "wrong" endianness (little endian 5 is big endian 1280). Guava has a LittleEndianDataInputStream.

Looking at the values in byteSequences, you're using readLong to read values that are 32-bit numbers (4 bytes), but a long in Java is a 64-bit value (8 bytes), so each call ends up consuming two fields at once.
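To make the effect concrete, the sketch below feeds two adjacent 4-byte fields (34600 and 1, as they appear in the posted hex dump) to a single readLong call. The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

class ReadLongDemo {
    // Two adjacent 4-byte little-endian fields: 34600 (28 87 00 00) and 1 (01 00 00 00).
    static final byte[] TWO_FIELDS = {0x28, (byte) 0x87, 0, 0, 0x01, 0, 0, 0};

    // Returns how many bytes are left after a single readLong().
    static int remainingAfterReadLong(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        in.readLong();
        return in.available();
    }

    // Interprets all eight bytes as one big-endian long, as readLong() does.
    static long asBigEndianLong(byte[] data) throws IOException {
        return new DataInputStream(new ByteArrayInputStream(data)).readLong();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(remainingAfterReadLong(TWO_FIELDS));            // 0: both fields were consumed
        System.out.println(Long.toHexString(asBigEndianLong(TWO_FIELDS))); // 2887000001000000
    }
}
```

The resulting number mixes the bytes of both fields, which is the kind of huge value seen in the question's output.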

1 Comment

As the specification says, e.g. for Timestamp, its value type is Long, 4 bytes. So do you have any trick or suggestion for how to consume that using one of the methods provided by DataInputStream?
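One possible way to stay with DataInputStream is to read each 4-byte "Long" with readInt and each 2-byte "Short" with readShort, then swap the bytes. This is only a sketch: the helper names (readShortLE, readIntLE, readHeader) are made up, and the header bytes are copied from the hex dump posted in the comments on another answer.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

class ReverseBytesDemo {
    // Reads a 2-byte little-endian field from a (big-endian) DataInputStream.
    static short readShortLE(DataInputStream in) throws IOException {
        return Short.reverseBytes(in.readShort());
    }

    // Reads a 4-byte little-endian field, i.e. a "Long 4 Bytes" in the spec.
    static int readIntLE(DataInputStream in) throws IOException {
        return Integer.reverseBytes(in.readInt());
    }

    // Decodes the 8-byte header into "transcode,timestamp,messageLength".
    static String readHeader(byte[] data) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            short transcode = readShortLE(in);
            int timestamp = readIntLE(in);
            short messageLength = readShortLE(in);
            return transcode + "," + timestamp + "," + messageLength;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] header = {0x05, 0x00, 0x6A, 0x2F, (byte) 0x96, 0x55, 0x3A, 0x00};
        System.out.println(readHeader(header)); // 5,1435905898,58
    }
}
```

With a real file, the same helpers would be called on a DataInputStream wrapping a FileInputStream, ideally with a BufferedInputStream in between.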
