0

I have a Python script that prints an integer, given as a decimal string, as its 4-byte little-endian representation:

import sys
i4 = int(sys.argv[1])
sys.stdout.buffer.write(i4.to_bytes(4, 'little'))
# import struct; sys.stdout.buffer.write(struct.pack('<I', i4))

Is there a nice way to do this with pure Linux shell commands (e.g. printf, awk, sed)?

Thanks!

3
  • Is this what you're looking for? stackoverflow.com/q/77680291/2422776 Commented Aug 30, 2024 at 10:51
  • What's in argv[1] and what's the expected output? Commented Aug 30, 2024 at 15:56
  • shell is not really suited to manipulating binary data. Commented Aug 30, 2024 at 16:33

3 Answers

2

You can use printf to convert decimal to fixed-width hexadecimal. The result will be in standard most-significant to least-significant digit order:

printf '%08x' "$1"

If you want little-endian byte order then you need to capture that result and reorder the bytes, or else separate and re-order the bytes first, then convert to hex.

For example, with Bash and the GNU version of cut, you might do this:

bytes=($(printf '%08x' "$1" | cut -b1-2,3-4,5-6,7-8 --output-delimiter ' '))
echo "${bytes[3]}${bytes[2]}${bytes[1]}${bytes[0]}"

That does the hex conversion all in one piece, then splits the hex digit pairs from each other and gathers them into an array, which is finally (manually) output in reverse order.
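
If GNU cut or Bash arrays are not available, the same reordering can be sketched with sed, which the question mentions. This is only an illustration under the same assumption as above (a non-negative input that fits in 32 bits); the function name le32_hex is made up for this example.

```shell
# le32_hex is a hypothetical helper name, not part of the answer above.
# It prints the little-endian hex form of a non-negative 32-bit integer.
le32_hex() {
  # Format as 8 hex digits, then swap the four two-digit groups end to end.
  printf '%08x\n' "$1" | sed 's/\(..\)\(..\)\(..\)\(..\)/\4\3\2\1/'
}

le32_hex 258   # prints 02010000
```

Here 258 is 0x00000102, so the reversed byte order is 02 01 00 00.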

Alternatively, the split-first approach might look like this:

num=$1
for i in {1..4}; do
  printf '%02x' $((num % 256))
  num=$((num / 256))
done
echo

That picks off the bytes one by one, least-significant first, and separately converts each to hexadecimal.
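
The same byte-at-a-time idea can also emit raw bytes directly, without going through hexadecimal. The sketch below is one possible variant, not the answer's own code: it builds an octal escape for each byte and lets printf print it. The name le32_raw is hypothetical, and the use of `& 255` and `>> 8` assumes the shell's arithmetic uses two's-complement integers wider than 32 bits with an arithmetic right shift, which is what common shells do in practice.

```shell
# le32_raw is a hypothetical helper name used only for this sketch.
# The function body runs in a subshell so that num and i do not leak out.
le32_raw() (
  num=$1
  for i in 1 2 3 4; do
    # Build an octal escape such as \002 and let printf emit that single byte.
    printf "\\$(printf '%03o' "$((num & 255))")"
    # Drop the byte that was just written.
    num=$((num >> 8))
  done
)

# Inspect the result; od should show the bytes 02 01 00 00.
le32_raw 258 | od -An -tx1
```

Masking with `& 255` instead of taking `% 256` keeps each byte in the 0..255 range for negative inputs as well, so `le32_raw -2` should produce fe ff ff ff under the assumption stated above.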

Addendum - raw binary data

However you produce the hexadecimal, you can convert to raw binary data by passing it through xxd -r -p. For example,

bytes=($(printf '%08x' "$1" | cut -b1-2,3-4,5-6,7-8 --output-delimiter ' '))
xxd -r -p <<<"${bytes[3]}${bytes[2]}${bytes[1]}${bytes[0]}"
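
Whichever pipeline produces the bytes, it can help to confirm that exactly four raw bytes come out. A minimal check using od and wc might look like the following; hexbytes is a hypothetical helper, and the printf with octal escapes is only a stand-in for the real pipeline.

```shell
# hexbytes is a hypothetical helper: it prints stdin as one run of hex byte pairs.
hexbytes() {
  od -An -tx1 | tr -d ' \n'
  echo
}

# Stand-in for the real pipeline: the bytes 0x02 0x01 0x00 0x00 (258, little-endian).
printf '\002\001\000\000' | hexbytes   # prints 02010000
printf '\002\001\000\000' | wc -c      # reports 4 bytes
```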

4 Comments

Maybe I'm misunderstanding your answer, but I don't need a hexadecimal/ASCII representation, I need actual raw bytes containing the integer, as it is in two's complement format e.g. in C for int32_t, i.e. have exactly 4 bytes on the output...
Raw bytes represented how, @VadimKantorov? This provides a hexadecimal representation of the raw bytes. You cannot store bytes with value 0 in a shell variable, and it's at best tricky to print them. But I'll see what I can do.
My Python snippets achieve this purpose. I wonder if I can do it with more basic shell tools than Python... So how to output raw bytes (not hex) is precisely the question...
@VadimKantorov, you can achieve this by passing the hexadecimal through xxd -r -p. I have added that to this answer.
0

Something like this (Bash only) may help:

#! /bin/bash

n="$1"
r='(..)(..)(..)(..)'
h=$(printf '%08x\n' "$n")
if [[ "$h" =~ $r ]]; then
    printf '%b%b%b%b' "\\x${BASH_REMATCH[4]}" "\\x${BASH_REMATCH[3]}" "\\x${BASH_REMATCH[2]}" "\\x${BASH_REMATCH[1]}"
fi
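
One caveat with the script above is that the regex expects exactly eight hex digits. For a negative input, a shell printf with '%08x' typically prints the full 64-bit pattern (16 digits), so the first four pairs would not be the 32-bit value. A possible fix, sketched here under the assumption of 64-bit shell arithmetic, is to mask the value to 32 bits before formatting; to_u32_hex is a hypothetical name.

```shell
# to_u32_hex is a hypothetical helper: it prints the low 32 bits of an integer
# as exactly 8 hex digits, so negative inputs get their two's-complement form.
to_u32_hex() {
  printf '%08x\n' "$(( $1 & 0xFFFFFFFF ))"
}

to_u32_hex -2    # prints fffffffe
to_u32_hex 258   # prints 00000102
```

The result can then be fed to the same regex-and-printf step as before.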

-1

Here are two POSIX-compliant and fully portable awk functions: one for a hex representation, the other for actual binary bytes in either BE32 or LE32. Pass L or LE (case-insensitive) as the second function argument to request LE32; otherwise it defaults to BE32.

It also auto-detects byte mode vs. UTF-8 mode and adjusts the necessary offsets accordingly. hex32() assumes unsigned input only, but binary32() wraps its input into the unsigned 32-bit range (modulo 2^32), so negative inputs are handled too.

function hex32(__, _, ___) {

    return (__ = int(__)) < (___ = (_ += _ += _ ^= _<_)^(_ * _)) \
        ? sprintf("0x[%.*X]", _ + _, __) \
        : sprintf("0x[%.*X][%.*X]", _ += _, 
                            (__ - (__ %= ___)) / ___, _, __)
}
function binary32(__, _, _1_, _2_, _3_, _4_) {

             _1_ = int(__)
          __ = _ = (_ ~ /^[Ll][Ee]?$/)
      _ ^=   _4_ =  _ += _ += _ += !_
    _1_ += ((_1_ %= _4_ = _^_4_) < !_) * _4_

    _4_ = ((_3_ = ((_2_ = (_1_ - (_1_ %= _)) / _) \
            - (_2_ %= _)) / _) - (_3_ %= _)) / _

    return sprintf("%.*s%c%c%c%c", (__ && (_3_ += _2_ - (_2_ = _3_)) < \
                                          (_4_ += _1_ - (_1_ = _4_))) < !_,
                                   FLG_AWK_UTF8 ? _ *= _ * _ : _ = !_,
                                   _4_ + _, _3_ + _, _2_ + _, _1_ + _)
}
BEGIN { FLG_AWK_UTF8 = !+sprintf("%c", 5^5) }

        89 0x[00000059]
      4567 0x[000011D7]
     76543 0x[00012AFF]

  23456789 0x[0165EC15]
  61277761 0x[03A70641]

3221225473 0x[C0000001]
3745221223 0x[DF3B8A67]

FLG_AWK_UTF8 = !+sprintf("%c", 5^5)

The detection works by leveraging 8-bit wraparound in byte mode: for an input value of 5^5 (3125, and 3125 mod 256 = 53), "%c" yields the ASCII digit "5". In Unicode mode the same value prints the 3-byte UTF-8 Telugu character "వ" (U+0C35) instead.

3 Comments

Sorry, but this is not maintainable code. We should write software to be readable and maintainable and so that others have an easy time to learn and improve it. The way you wrote it just shows that you are a very capable hacker but it is not making it accessible for others.
Unlike every other solution above, mine is the only one that offers endian flexibility and has no loops, no arrays, and no hard-coded offsets or magic numbers of any sort. And unlike those multi-step messes that require intermediary hex strings for no reason, mine is the only solution that can go directly from decimal integer to BE32 or LE32 in raw binary form. Needing printf THEN cut with a whole pile of hard-coded offsets into a global shell array THEN echo to give you back the input in flipped order, or having to feed that into xxd, is what you consider "maintainable code"??
The bash approach has a wasteful division op in its last loop cycle. Mine is also the ONLY solution that can even take negative inputs of any size (precision allowing) and still return the BE32/LE32 represented by the 32 LSBs once properly re-adjusted to unsigned space by two's complement. In addition, none of my division ops involve floating-point division at all, even though awk doesn't have a dedicated integer-division operator like Python's //. But if your definition of "maintainable code" is hard-coding one set of code for LE32 and then reinventing the wheel for BE32, then yes, mine would fail.
