
I'm trying to split a 64 MiB file, FileCarve.001, into 512-byte segments (each block is 512 bytes long). To check that the pieces contain the same data as the original, I cat all of them to standard output and pipe the result into sha256sum. There are a lot of files, so I do this with find and xargs.

Splitting the file into 512-byte segments seems to garble the data: the hash of the concatenated pieces differs from the hash of the original.

$ dd if=FileCarve.001 bs=512 | split -b512 - splits/img
131072+0 records in
131072+0 records out
67108864 bytes (67 MB, 64 MiB) copied, 4.10824 s, 16.3 MB/s
$ sha256sum FileCarve.001 
3e64100044099b10060f5ca3194d4d60414941c7cb26437330aba532852a60cd  FileCarve.001
$ find splits/ -type f -print0 | xargs -0 cat | sha256sum
25b37f28204895e5d0b1cb160c5fa599d15188baf7e529ccc92a10fdb3f0515a  -

But splitting the file into 1 KiB segments (-b1k, which split treats as 1024 bytes) seems to work just fine.

$ dd if=FileCarve.001 bs=512 | split -b1k - splits/img
131072+0 records in
131072+0 records out
67108864 bytes (67 MB, 64 MiB) copied, 2.06029 s, 32.6 MB/s
$ sha256sum FileCarve.001 
3e64100044099b10060f5ca3194d4d60414941c7cb26437330aba532852a60cd  FileCarve.001
$ find splits/ -type f -print0 | xargs -0 cat | sha256sum
3e64100044099b10060f5ca3194d4d60414941c7cb26437330aba532852a60cd  -

Why are they different? Is there something I don't understand about the way blocks work on a storage device?

In response to a comment: I did clear out the splits/ directory before each run.
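
One thing that may be worth ruling out in the check above: find makes no promise about the order in which it prints file names, so xargs cat may be concatenating the pieces in directory order instead of split order. Below is a minimal sketch of the same comparison with the names sorted first. It assumes GNU coreutils (sort -z, sha256sum), and sample.bin and demo_splits/ are hypothetical stand-ins for FileCarve.001 and splits/. It also assumes the suffixes generated by split sort lexically in creation order, which holds for the 128 pieces produced here.

```shell
#!/bin/sh
# Sketch only: sample.bin and demo_splits/ are made-up names, not from the question.
set -eu

workdir=$(mktemp -d)

# 64 KiB of random data stands in for the real 64 MiB image.
head -c 65536 /dev/urandom > "$workdir/sample.bin"

# Split into 512-byte pieces (128 files: imgaa, imgab, ...).
mkdir "$workdir/demo_splits"
split -b 512 "$workdir/sample.bin" "$workdir/demo_splits/img"

# Hash of the original file.
orig=$(sha256sum < "$workdir/sample.bin" | cut -d' ' -f1)

# Hash of the pieces, concatenated in sorted name order.
# LC_ALL=C makes the sort a plain byte-wise comparison.
sorted=$(find "$workdir/demo_splits" -type f -print0 \
  | LC_ALL=C sort -z \
  | xargs -0 cat \
  | sha256sum | cut -d' ' -f1)

echo "original:      $orig"
echo "sorted pieces: $sorted"

rm -rf "$workdir"
```

If the two hashes match with sort -z in the pipeline but differ without it, the split data is intact and only the concatenation order was off.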

