python base64 string - how to decode first 8 bytes

Question

I'm having some problems decoding what should be simple data.

I have a base64 string that represents a np.int64 followed by an array of np.float64. The size of the array is defined by the first np.int64. This pattern is then repeated for multiple arrays. So in order to decode all of the arrays, I need to be able to read the size in bytes to find the starting point of the next pair.

Here is a very simple example showing the first pair. The second pair starts straight after this - after 64 bytes or 88 base64 characters. Then rinse and repeat for the remaining arrays.

>>> import base64, struct
>>> test_data = 'OAAAAAAAAAAAAAAAAAAAAFVVVVVVVcU/VVVVVVVV1T8AAAAAAADgP1VVVVVVVeU/qqqqqqqq6j8AAAAAAADwPw=='
>>> struct.unpack('Qddddddd', base64.b64decode(test_data)) # 'Q7d' also works
(56,
 0.0,
 0.16666666666666666,
 0.3333333333333333,
 0.5,
 0.6666666666666666,
 0.8333333333333333,
 1.0)

My problem is that I need to extract the Int64 first to know the proper size array to be unpacked and the start of the next array which starts immediately after this.

I thought I could simply cut off the first 8 bytes from the base64 string using the 4/3 size relation and round to the nearest 4 to account for padding like so:

struct.unpack('Q', base64.b64decode(test_data[:12]))

But that always throws an error regardless of how big my slice is (I've tried 8 to 16 just to try and figure out what is going on):

struct.error: unpack requires a buffer of 8 bytes

There must be a simple way to extract just that first integer without knowing the length of the array it is describing?

@mkrieger1 well it will just be an integer followed by a certain number of other values (could be int or float) that will be used to form a numpy array
– jpmorr, Commented Mar 16 at 21:03
@jpmorr Use struct.unpack_from: e.g. b = base64.b64decode(test_data); struct.unpack_from('Q', b)[0] --> 56. There's also struct.calcsize if you want to get the offset into the bytes data. So you could also do struct.unpack('Q', b[:struct.calcsize('Q')])[0] (which is probably roughly equivalent to what the previous solution does).
– ekhumoro, Commented Mar 16 at 21:29
Then you have 9 bytes. So, indeed, either you use unpack_from, or you could just use almost your own code: struct.unpack('Q', base64.b64decode(test_data[:12])[:8])
– chrslg, Commented Mar 17 at 8:30
@ekhumoro Thanks. unpack_from is the magic I was looking for and didn't read carefully enough to see.
– jpmorr, Commented Mar 17 at 9:36
@chrslg That's what I was currently doing: reading extra data and then slicing out the first 8 bytes, but I thought that wasn't the best way to achieve what I needed.
– jpmorr, Commented Mar 17 at 9:38
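
The two suggestions in the comments can be checked with a small sketch. It uses the question's `test_data`; the names `prefix`, `size_in_bytes` and `size_from_full` are only illustrative, and the `'<Q'` format is an assumption that the data is little-endian (the question's native `'Q'` returning 56 suggests it is).

```python
import base64
import struct

test_data = 'OAAAAAAAAAAAAAAAAAAAAFVVVVVVVcU/VVVVVVVV1T8AAAAAAADgP1VVVVVVVeU/qqqqqqqq6j8AAAAAAADwPw=='

# Every 4 base64 characters encode 3 bytes, so a 12-character slice
# decodes to 9 bytes, one more than struct.unpack('Q', ...) accepts.
prefix = base64.b64decode(test_data[:12])
prefix_len = len(prefix)

# unpack_from reads only as many bytes as the format needs, starting
# at the given offset, and ignores whatever follows in the buffer.
# Assumption: little-endian data, hence '<Q' rather than native 'Q'.
size_in_bytes = struct.unpack_from('<Q', prefix, 0)[0]

# The same call works on the fully decoded buffer.
decoded = base64.b64decode(test_data)
size_from_full = struct.unpack_from('<Q', decoded, 0)[0]

print(prefix_len, size_in_bytes, size_from_full)  # 9 56 56
```

Note that the header value is a byte count (56), not an element count, so the number of float64 values is 56 // 8 = 7.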

mkrieger1 · Accepted Answer · 2025-03-16 21:46:14Z

1

You need to first decode the base64 string to retrieve the original binary data. Slicing the base64 text directly is awkward because each base64 character carries only 6 bits, so byte boundaries line up with character boundaries only every 4 characters (3 bytes). Once decoded, you can index and unpack the binary data by byte offset. Here is a solution that does that for multiple arrays.

import base64, struct

test_data = 'OAAAAAAAAAAAAAAAAAAAAFVVVVVVVcU/VVVVVVVV1T8AAAAAAADgP1VVVVVVVeU/qqqqqqqq6j8AAAAAAADwPw=='

decoded_data = base64.b64decode(test_data)

index = 0
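while index < len(decoded_data):
    # The leading uint64 is the payload length in bytes, not the element count.
    array_size = struct.unpack('Q', decoded_data[index : index + 8])[0]
    # The payload ends at index + 8 + array_size; the offset must include index
    # or every array after the first is sliced from the wrong position.
    data = struct.unpack('d' * (array_size // 8), decoded_data[index + 8 : index + 8 + array_size])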
    index += array_size + 8
    print(f'array size: {array_size // 8}')
    print(f'array data: {data}')
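
Since the question's sample holds only one pair, the offset arithmetic is easier to verify against a buffer with two records. The following is a sketch under stated assumptions, not the answer's original code: `iter_arrays` is a hypothetical helper name, the sample values are made up, the data is assumed to be little-endian (`'<Q'`, `'<d'`), and every payload is assumed to be float64.

```python
import base64
import struct

def iter_arrays(encoded):
    """Yield one tuple of floats per (byte count, payload) record."""
    raw = base64.b64decode(encoded)
    header = struct.Struct('<Q')  # assumed: little-endian uint64 byte count
    offset = 0
    while offset < len(raw):
        (nbytes,) = header.unpack_from(raw, offset)
        offset += header.size
        count = nbytes // 8  # assumed: payload consists of float64 values
        yield struct.unpack_from(f'<{count}d', raw, offset)
        offset += nbytes

# Build a two-record buffer to exercise the offsets (made-up sample values).
first = struct.pack('<Q2d', 16, 1.0, 2.0)
second = struct.pack('<Q3d', 24, 0.5, 0.25, 0.125)
encoded = base64.b64encode(first + second).decode('ascii')

arrays = list(iter_arrays(encoded))
print(arrays)  # [(1.0, 2.0), (0.5, 0.25, 0.125)]
```

If NumPy arrays are wanted rather than tuples, `np.frombuffer(raw, dtype='<f8', count=count, offset=offset)` should be usable in place of the `struct.unpack_from` call inside the loop.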

answered Mar 16 at 21:41 by fadicoder

edited Mar 16 at 21:46 by mkrieger1
