Crate data_encoding
Efficient and customizable data-encoding functions
This crate provides little-endian ASCII base-conversion encodings for bases of size 2, 4, 8, 16, 32, and 64. It supports:
- padded and unpadded encodings
- canonical encodings (e.g. trailing bits are checked)
- in-place encoding and decoding functions
- partial decoding functions (e.g. for error recovery)
- character translation (e.g. for case-insensitivity)
- most and least significant bit-order
- ignoring characters when decoding (e.g. for skipping newlines)
- wrapping the output when encoding
The performance of the encoding and decoding functions is similar to that of existing implementations (see how to run the benchmarks on github).
This is the library documentation. If you are looking for the binary, see the installation instructions on github.
Examples
This crate provides predefined encodings as constants. These constants are of type Encoding. This type provides encoding and decoding functions with in-place or allocating variants. Here is an example using the allocating encoding function of base64:
    use data_encoding::BASE64;
    assert_eq!(BASE64.encode(b"Hello world"), "SGVsbG8gd29ybGQ=");
Here is an example using the in-place decoding function of base32:
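    use data_encoding::BASE32;
    let input = b"JBSWY3DPEB3W64TMMQ======";
    // Allocate an output buffer large enough for the decoded data.
    let mut output = vec![0; BASE32.decode_len(input.len()).unwrap()];
    // Decode into the buffer; the returned length excludes the padding.
    let len = BASE32.decode_mut(input, &mut output).unwrap();
    assert_eq!(&output[0 .. len], b"Hello world");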
You are not limited to the predefined encodings. You may define your own encodings (with the same correctness and performance properties as the predefined ones) using the Specification type:
    use data_encoding::Specification;
    let hex = {
        let mut spec = Specification::new();
        spec.symbols.push_str("0123456789abcdef");
        spec.encoding().unwrap()
    };
    assert_eq!(hex.encode(b"hello"), "68656c6c6f");
If you use the lazy_static crate, you can define a global encoding:
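    #[macro_use]
    extern crate lazy_static;
    use data_encoding::{Encoding, Specification};

    lazy_static! {
        static ref HEX: Encoding = {
            let mut spec = Specification::new();
            spec.symbols.push_str("0123456789abcdef");
            // Accept uppercase digits when decoding by translating them to lowercase.
            spec.translate.from.push_str("ABCDEF");
            spec.translate.to.push_str("abcdef");
            spec.encoding().unwrap()
        };
    }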
You may also use the macro library to define a compile-time custom encoding:
    const HEX: Encoding = new_encoding!{
        symbols: "0123456789abcdef",
        translate_from: "ABCDEF",
        translate_to: "abcdef",
    };
    const BASE64: Encoding = new_encoding!{
        symbols: "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/",
        padding: '=',
    };
Properties
The base16, base32, base32hex, base64, and base64url predefined encodings conform to RFC 4648.
In general, the encoding and decoding functions satisfy the following properties:
- They are deterministic: their output only depends on their input
- They have no side-effects: they do not modify a hidden mutable state
- They are correct: encoding then decoding gives the initial data
- They are canonical (unless is_canonical returns false): decoding then encoding gives the initial data
This last property is usually not satisfied by common base64 implementations (like the rustc-serialize crate, the base64 crate, or the base64 GNU program). This is a matter of choice, and this crate has made the choice to let the user choose. Support for canonical encoding as described by the RFC is provided. But it is also possible to disable checking trailing bits, to add character translation, to decode concatenated padded inputs, and to ignore some characters.
Since the RFC specifies the encoding function on all inputs and the decoding
function on all possible encoded outputs, the differences between
implementations come from the decoding function which may be more or less
permissive. In this crate, the decoding function of canonical encodings
rejects all inputs that are not a possible output of the encoding function.
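To see why a canonical decoder rejects an input such as AAB=, it helps to look at the bits that do not fit into a whole byte. The following is a minimal std-only sketch, not this crate's implementation; the helper names `symbol_value` and `trailing_bits` are hypothetical and assume the standard base64 alphabet with padding already stripped.

```rust
/// Value (0..64) of a standard base64 symbol, or None for any other byte.
/// Hypothetical helper for illustration only.
fn symbol_value(c: u8) -> Option<u8> {
    match c {
        b'A'..=b'Z' => Some(c - b'A'),
        b'a'..=b'z' => Some(c - b'a' + 26),
        b'0'..=b'9' => Some(c - b'0' + 52),
        b'+' => Some(62),
        b'/' => Some(63),
        _ => None,
    }
}

/// Returns the bits of the last symbol that do not belong to a full byte.
/// `symbols` is a base64 input without padding. Returns None if the input
/// is empty or contains a byte that is not a base64 symbol.
fn trailing_bits(symbols: &[u8]) -> Option<u8> {
    // Validate every symbol and remember the value of the last one.
    let mut last = None;
    for &c in symbols {
        last = Some(symbol_value(c)?);
    }
    // Each symbol carries 6 bits; whatever does not fill a byte is unused.
    let unused = (symbols.len() * 6) % 8;
    let mask = (1u8 << unused) - 1;
    Some(last? & mask)
}
```

With this sketch, `trailing_bits(b"AAB")` is `Some(1)`: three symbols carry 18 bits, which is two bytes plus two leftover bits, and those leftover bits are not zero. The encoding function always emits zero trailing bits, so AAB= is not a possible output, whereas `trailing_bits(b"AAA")` is `Some(0)`.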
Here are some concrete examples of decoding differences between this crate, the rustc-serialize crate, the base64 crate, and the base64 GNU program:
Input | data-encoding | rustc | base64 | GNU base64 |
---|---|---|---|---|
AAB= | Trailing(2) | [0, 0] | [0, 0] | \x00\x00 |
AA\nB= | Length(4) | [0, 0] | Err(2) | \x00\x00 |
AAB | Length(0) | [0, 0] | [0, 0] | Invalid input |
A\rA\nB= | Length(4) | [0, 0] | Err(1) | Invalid input |
-_\r\n | Symbol(0) | [251] | Err(0) | Invalid input |
AA==AA== | [0, 0] | Err | Err(2) | \x00\x00 |
We can summarize these discrepancies as follows:
Discrepancy | data-encoding | rustc | base64 | GNU base64 |
---|---|---|---|---|
Check trailing bits | Yes | No | No | No |
Ignored characters | None | \r and \n | None | \n |
Translated characters | None | -_ mapped to +/ | None | None |
Check padding | Yes | No | No | Yes |
Support concatenated input | Yes | No | No | Yes |
This crate lets you disable checking of trailing bits, ignore some characters, translate characters, and use unpadded encodings. However, for padded encodings, support for concatenated inputs cannot be disabled, because using padding only makes sense when concatenated inputs must be supported.
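The handling of concatenated padded inputs can be sketched as decoding each 4-symbol block independently, so that padding may appear at the end of any block. This is a std-only illustration under stated assumptions, not this crate's implementation: the names `symbol_value`, `decode_block`, and `decode_concatenated` are hypothetical, the alphabet is standard base64, and errors are collapsed into `None` rather than reported with a position and kind.

```rust
/// Value (0..64) of a standard base64 symbol, or None for any other byte.
/// Hypothetical helper for illustration only.
fn symbol_value(c: u8) -> Option<u8> {
    match c {
        b'A'..=b'Z' => Some(c - b'A'),
        b'a'..=b'z' => Some(c - b'a' + 26),
        b'0'..=b'9' => Some(c - b'0' + 52),
        b'+' => Some(62),
        b'/' => Some(63),
        _ => None,
    }
}

/// Decodes one block of exactly 4 characters, which may end with padding.
fn decode_block(block: &[u8]) -> Option<Vec<u8>> {
    // Number of symbols before the first padding character.
    let n = block.iter().position(|&c| c == b'=').unwrap_or(block.len());
    // Everything after the first padding character must be padding,
    // and at least 2 symbols are needed to encode one byte.
    if block[n..].iter().any(|&c| c != b'=') || n < 2 {
        return None;
    }
    // Accumulate 6 bits per symbol, most significant bits first.
    let mut acc: u32 = 0;
    for &c in &block[..n] {
        acc = (acc << 6) | u32::from(symbol_value(c)?);
    }
    let bits = n * 6;
    let unused = bits % 8;
    // Canonical check: the trailing bits must be zero.
    if acc & ((1u32 << unused) - 1) != 0 {
        return None;
    }
    acc >>= unused;
    let bytes = bits / 8;
    Some((0..bytes).rev().map(|i| (acc >> (8 * i)) as u8).collect())
}

/// Decodes a padded input made of one or more concatenated encodings.
fn decode_concatenated(input: &[u8]) -> Option<Vec<u8>> {
    if input.len() % 4 != 0 {
        return None;
    }
    let mut output = Vec::new();
    for block in input.chunks(4) {
        output.extend(decode_block(block)?);
    }
    Some(output)
}
```

In this sketch, `decode_concatenated(b"AA==AA==")` yields `Some(vec![0, 0])`, matching the data-encoding column of the comparison table, while `AAB=` (non-zero trailing bits) and `AAB` (length not a multiple of 4) both yield `None`.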
Migration
The changelog describes the changes between v1 and v2. Here are the migration steps for common usage:
v1 | v2 |
---|---|
use data_encoding::baseNN | use data_encoding::BASENN |
baseNN::function | BASENN.method |
baseNN::function_nopad | BASENN_NOPAD.method |
Structs
Struct | Description |
---|---|
DecodeError | Decoding error |
DecodePartial | Decoding error with partial result |
Encoding | Base-conversion encoding |
Specification | Base-conversion specification |
SpecificationError | Specification error |
Translate | How to translate characters when decoding |
Wrap | How to wrap the output when encoding |
Enums
Enum | Description |
---|---|
BitOrder | Order in which bits are read from a byte |
DecodeKind | Decoding error kind |
Constants
Constant | Description |
---|---|
BASE32 | Padded base32 encoding |
BASE64 | Padded base64 encoding |
BASE32HEX | Padded base32hex encoding |
BASE32HEX_NOPAD | Unpadded base32hex encoding |
BASE32_DNSCURVE | DNSCurve base32 encoding |
BASE32_DNSSEC | DNSSEC base32 encoding |
BASE32_NOPAD | Unpadded base32 encoding |
BASE64URL | Padded base64url encoding |
BASE64URL_NOPAD | Unpadded base64url encoding |
BASE64_MIME | MIME base64 encoding |
BASE64_NOPAD | Unpadded base64 encoding |
HEXLOWER | Lowercase hexadecimal encoding |
HEXLOWER_PERMISSIVE | Lowercase hexadecimal encoding with case-insensitive decoding |
HEXUPPER | Uppercase hexadecimal encoding |
HEXUPPER_PERMISSIVE | Uppercase hexadecimal encoding with case-insensitive decoding |