ABI
The ABI (Application Binary Interface) is a specification that defines the encoding/decoding of data types and a standard for exposing and invoking methods in a smart contract.
The specification is defined in ARC-4.
At a high level, the ABI allows contracts to define an API with rich types and offer an interface description so clients know exactly what the contract is expecting to be passed.
Data Types
In the Algorand ABI (ARC-4), each data type has a precise encoding scheme, so contracts and client applications can exchange information without ambiguity. It’s crucial to understand how these types, such as integers, strings, arrays, and addresses, are structured and how each one is represented.
Keep in mind that the AVM only reads uint64 and bytes. The conversion of data types to these two is usually handled under the hood by Algorand Python, Algorand TypeScript, and AlgoKit Utils.
This section describes how ABI types can be represented as byte strings.
| Type | Description |
|---|---|
| uintN | An N-bit unsigned integer, where 8 <= N <= 512 and N % 8 = 0 |
| byte | An alias for uint8 |
| bool | A boolean value that is restricted to either 0 or 1. When encoded, up to 8 consecutive bool values will be packed into a single byte |
| ufixedNxM | An N-bit unsigned fixed-point decimal number with precision M, where 8 <= N <= 512, N % 8 = 0, and 0 < M <= 160, which denotes a value v as v / (10^M) |
| type[N] | A fixed-length array of length N, where N >= 0. type can be any other type |
| address | Used to represent a 32-byte Algorand address. This is equivalent to byte[32] |
| type[] | A variable-length array. type can be any other type |
| string | A variable-length byte array (byte[]) assumed to contain UTF-8 encoded content |
| (T1,T2,…,TN) | A tuple of the types T1, T2, …, TN, N >= 0 |
| reference type | account, asset, or application. Valid only as argument types, in which case they are an alias for uint8. See section “Reference Types” below |
Encoding for the data types is specified here.
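To make the table concrete, here is a minimal plain-Python sketch of how a few of these types map to byte strings. The helper names (`encode_uint`, `encode_bools`, `encode_string`) are hypothetical and are not part of any Algorand SDK; the sketch assumes big-endian integers, bools packed starting at the most significant bit, and a 2-byte length prefix for variable-length byte arrays.

```python
def encode_uint(value: int, bits: int) -> bytes:
    """Encode an unsigned integer as a big-endian uintN."""
    if bits % 8 != 0 or not 8 <= bits <= 512:
        raise ValueError("N must be a multiple of 8 between 8 and 512")
    if not 0 <= value < (1 << bits):
        raise ValueError(f"{value} does not fit in uint{bits}")
    return value.to_bytes(bits // 8, "big")


def encode_bools(values: list) -> bytes:
    """Pack consecutive bools, 8 per byte, first value in the most significant bit."""
    out = bytearray((len(values) + 7) // 8)
    for i, flag in enumerate(values):
        if flag:
            out[i // 8] |= 0x80 >> (i % 8)
    return bytes(out)


def encode_string(text: str) -> bytes:
    """Encode a string as byte[]: 2-byte big-endian length, then the UTF-8 bytes."""
    raw = text.encode("utf-8")
    return len(raw).to_bytes(2, "big") + raw


print(encode_uint(1, 64).hex())
print(encode_bools([True, False, True]).hex())
print(encode_string("hi").hex())
```

A ufixedNxM value can be handled by the same integer helper after scaling by 10^M: under that reading, 1.23 as ufixed16x2 would be `encode_uint(123, 16)`.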
Reference Types
Reference types may be specified in the method signature to refer to transaction parameters that must be passed. The encoded value is a uint8 index of the element in the relevant array (e.g. for account, the index in the foreign accounts array). These types are:
- account: represents an Algorand account, stored in the Accounts array
- asset: represents an Algorand Standard Asset (ASA), stored in the Foreign Assets array
- application: represents an Algorand Application, stored in the Foreign Apps array
The construction of these arrays and the handling of reference types are usually also taken care of by the high-level language tools in Algorand and AlgoKit.
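As an illustration of what those tools do on the contract's behalf, the following sketch resolves a uint8 reference into the value it points at. The function names are hypothetical. It also assumes the ARC-4 convention that index 0 of an account reference means the transaction sender and index 0 of an application reference means the called application, while asset indexes point directly into the Foreign Assets array.

```python
def resolve_account(index: int, sender: str, foreign_accounts: list) -> str:
    """Index 0 is assumed to be the sender; 1..n index the Accounts array."""
    if index == 0:
        return sender
    return foreign_accounts[index - 1]


def resolve_application(index: int, current_app_id: int, foreign_apps: list) -> int:
    """Index 0 is assumed to be the called app; 1..n index the Foreign Apps array."""
    if index == 0:
        return current_app_id
    return foreign_apps[index - 1]


def resolve_asset(index: int, foreign_assets: list) -> int:
    """Asset references are assumed to index the Foreign Assets array directly."""
    return foreign_assets[index]


# Placeholder values, for illustration only
print(resolve_account(2, "SENDER", ["ALICE", "BOB"]))
print(resolve_application(0, 1234, [5678]))
print(resolve_asset(0, [31566704]))
```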
Methods
Methods may be exposed by the smart contract and called by submitting an ApplicationCall transaction to the existing application ID.
A method signature is defined as a name, argument types, and return type. The stringified version is then hashed and the first 4 bytes are taken as a method selector.
For example:
A method signature for an add method that takes 2 uint64s and returns 1 uint128:
```
Method signature: add(uint64,uint64)uint128
```
The string version of the method signature is hashed and the first 4 bytes are its method selector:
```
SHA-512/256 hash (in hex): 8aa3b61f0f1965c3a1cbfa91d46b24e54c67270184ff89dc114e877b1753254a
Method selector (in hex): 8aa3b61f
```
Once the method selector is known, it is used in the smart contract logic to route to the appropriate logic that implements the add method.
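The selector derivation can be reproduced off-chain with a few lines of Python. This is a sketch rather than an SDK call: `method_selector` is a hypothetical helper, and it assumes the local Python build exposes SHA-512/256 through OpenSSL under the name `sha512_256` (this can be checked with `hashlib.algorithms_available`).

```python
import hashlib


def method_selector(signature: str) -> bytes:
    """Return the first 4 bytes of the SHA-512/256 hash of a method signature."""
    # Assumption: OpenSSL provides "sha512_256"; hashlib.new raises ValueError if not
    digest = hashlib.new("sha512_256", signature.encode("ascii")).digest()
    return digest[:4]


selector = method_selector("add(uint64,uint64)uint128")
print(selector.hex())  # should match the selector quoted above: 8aa3b61f
```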
The method pseudo-opcode can be used in a contract to do the above work and produce a method selector given the method signature string.
```teal
method "add(uint64,uint64)uint128"
```
Implementing a Method
Implementing a method is done by handling an ApplicationCall transaction where the first element of the application arguments matches its method selector and the subsequent elements are used by the logic in the method body.
The initial handling logic of the contract should compare the method selector passed in the call against the known selectors of the application's methods and route to the matching method.
The return value of the method must be logged with the prefix 151f7c75, which is the first 4 bytes of the SHA-512/256 hash of the string "return". Only the last logged element with this prefix is considered the return value of this method call.
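The routing and return-value conventions can be modelled off-chain as follows. This is a plain-Python simulation, not contract code: `handle_app_call` and `extract_return_value` are hypothetical names, the application arguments are passed as a list of byte strings, and logs are modelled as a returned list.

```python
from typing import List, Optional

ADD_SELECTOR = bytes.fromhex("8aa3b61f")   # add(uint64,uint64)uint128
RETURN_PREFIX = bytes.fromhex("151f7c75")  # marks a log entry as a return value


def handle_app_call(app_args: List[bytes]) -> List[bytes]:
    """Route on the first argument and return the logs the call would emit."""
    if not app_args:
        raise ValueError("bare call: no method selector supplied")
    selector, *args = app_args
    if selector == ADD_SELECTOR:
        if len(args) != 2 or any(len(arg) != 8 for arg in args):
            raise ValueError("add expects two 8-byte uint64 arguments")
        a, b = (int.from_bytes(arg, "big") for arg in args)
        result = (a + b).to_bytes(16, "big")  # uint128 return value
        return [RETURN_PREFIX + result]
    raise ValueError("unknown method selector")


def extract_return_value(logs: List[bytes]) -> Optional[bytes]:
    """Return the payload of the last log entry that carries the return prefix."""
    for entry in reversed(logs):
        if entry.startswith(RETURN_PREFIX):
            return entry[len(RETURN_PREFIX):]
    return None


logs = handle_app_call([ADD_SELECTOR, (2).to_bytes(8, "big"), (3).to_bytes(8, "big")])
print(int.from_bytes(extract_return_value(logs), "big"))
```

Widening the result to uint128 means the sum of two uint64 values can never overflow the return type, which is presumably why the example signature returns uint128.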
Interfaces
An Interface is a logically grouped set of methods. An Algorand Application implements an Interface if it supports all of the methods from that Interface.
For example, an Interface Calculator providing addition and subtraction of integer methods and an Interface NumberFormatting providing formatting methods for numbers into strings are likely to be used together. Interface designers should ensure that all the methods in Calculator and NumberFormatting have distinct method selectors.
For example:
```json
{
  "name": "Calculator",
  "desc": "Interface for a basic calculator supporting additions and multiplications",
  "methods": [
    {
      "name": "add",
      "desc": "Calculate the sum of two 64-bit integers",
      "args": [
        { "type": "uint64", "name": "a", "desc": "The first term to add" },
        { "type": "uint64", "name": "b", "desc": "The second term to add" }
      ],
      "returns": { "type": "uint128", "desc": "The sum of a and b" }
    },
    {
      "name": "multiply",
      "desc": "Calculate the product of two 64-bit integers",
      "args": [
        { "type": "uint64", "name": "a", "desc": "The first factor to multiply" },
        { "type": "uint64", "name": "b", "desc": "The second factor to multiply" }
      ],
      "returns": { "type": "uint128", "desc": "The product of a and b" }
    }
  ]
}
```
Contracts
A Contract is a declaration of what an Application implements. It includes the complete list of the methods implemented by the related Application. It is similar to an Interface, but it may include further details about the concrete implementation, as well as implementation-specific methods that do not belong to any Interface. In addition to the set of methods from the Contract’s definition, a Contract may allow bare Application calls (zero-argument application calls). The primary purpose of bare Application calls is to allow the execution of OnCompletion actions that require no inputs and have no return value, such as NoOp, OptIn, CloseOut, UpdateApplication, and DeleteApplication.
Here’s an example of a contract implementation:
```json
{
  "name": "Calculator",
  "desc": "Contract of a basic calculator supporting additions and multiplications. Implements the Calculator interface.",
  "networks": {
    "wGHE2Pwdvd7S12BL5FaOP20EGYesN73ktiC1qzkkit8=": { "appID": 1234 },
    "SGO1GKSzyE7IEPItTxCByw9x8FmnrCDexi9/cOUJOiI=": { "appID": 5678 }
  },
  "methods": [
    {
      "name": "add",
      "desc": "Calculate the sum of two 64-bit integers",
      "args": [
        { "type": "uint64", "name": "a", "desc": "The first term to add" },
        { "type": "uint64", "name": "b", "desc": "The second term to add" }
      ],
      "returns": { "type": "uint128", "desc": "The sum of a and b" }
    },
    {
      "name": "multiply",
      "desc": "Calculate the product of two 64-bit integers",
      "args": [
        { "type": "uint64", "name": "a", "desc": "The first factor to multiply" },
        { "type": "uint64", "name": "b", "desc": "The second factor to multiply" }
      ],
      "returns": { "type": "uint128", "desc": "The product of a and b" }
    }
  ]
}
```
The API of a smart contract can be published as an interface description object. A user may read this object and instantiate a client that handles the encoding/decoding of the arguments and return values using one of the SDKs or AlgoKit Utils.
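As a rough idea of what such a client does first, the sketch below rebuilds each method signature string from a description object. `method_signature` is a hypothetical helper, not an SDK function, and the embedded JSON is a trimmed copy of the Calculator description with the `desc` and `name` fields omitted.

```python
import json


def method_signature(method: dict) -> str:
    """Build 'name(argtype,...)returntype' from one entry of the methods list."""
    arg_types = ",".join(arg["type"] for arg in method["args"])
    return f'{method["name"]}({arg_types}){method["returns"]["type"]}'


description = json.loads("""
{
  "name": "Calculator",
  "methods": [
    {"name": "add",
     "args": [{"type": "uint64"}, {"type": "uint64"}],
     "returns": {"type": "uint128"}},
    {"name": "multiply",
     "args": [{"type": "uint64"}, {"type": "uint64"}],
     "returns": {"type": "uint128"}}
  ]
}
""")

signatures = [method_signature(m) for m in description["methods"]]
for signature in signatures:
    print(signature)
```

Each signature string can then be hashed into a method selector and used to encode the first application argument of the call.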
A full example of a contract JSON file might look like:
```json
{
  "name": "super-awesome-contract",
  "networks": {
    "MainNet": { "appID": 123456 }
  },
  "methods": [
    {
      "name": "add",
      "desc": "Add 2 integers",
      "args": [{ "type": "uint64" }, { "type": "uint64" }],
      "returns": { "type": "uint64" }
    },
    {
      "name": "sub",
      "desc": "Subtract 2 integers",
      "args": [{ "type": "uint64" }, { "type": "uint64" }],
      "returns": { "type": "uint64" }
    },
    {
      "name": "mul",
      "desc": "Multiply 2 integers",
      "args": [{ "type": "uint64" }, { "type": "uint64" }],
      "returns": { "type": "uint64" }
    },
    {
      "name": "div",
      "desc": "Divide 2 integers, throw away the remainder",
      "args": [{ "type": "uint64" }, { "type": "uint64" }],
      "returns": { "type": "uint64" }
    },
    {
      "name": "qrem",
      "desc": "Divide 2 integers, return both the quotient and remainder",
      "args": [{ "type": "uint64" }, { "type": "uint64" }],
      "returns": { "type": "(uint64,uint64)" }
    },
    {
      "name": "reverse",
      "desc": "Reverses a string",
      "args": [{ "type": "string" }],
      "returns": { "type": "string" }
    },
    {
      "name": "txntest",
      "desc": "just check it",
      "args": [{ "type": "uint64" }, { "type": "pay" }, { "type": "uint64" }],
      "returns": { "type": "uint64" }
    },
    {
      "name": "concat_strings",
      "desc": "concat some strings",
      "args": [{ "type": "string[]" }],
      "returns": { "type": "string" }
    },
    {
      "name": "manyargs",
      "desc": "Try to send 20 arguments",
      "args": [
        { "type": "uint64" }, { "type": "uint64" }, { "type": "uint64" }, { "type": "uint64" },
        { "type": "uint64" }, { "type": "uint64" }, { "type": "uint64" }, { "type": "uint64" },
        { "type": "uint64" }, { "type": "uint64" }, { "type": "uint64" }, { "type": "uint64" },
        { "type": "uint64" }, { "type": "uint64" }, { "type": "uint64" }, { "type": "uint64" },
        { "type": "uint64" }, { "type": "uint64" }, { "type": "uint64" }, { "type": "uint64" }
      ],
      "returns": { "type": "uint64" }
    },
    {
      "name": "min_bal",
      "desc": "Get the minimum balance for given account",
      "args": [{ "type": "account" }],
      "returns": { "type": "uint64" }
    },
    {
      "name": "tupler",
      "desc": "",
      "args": [{ "type": "(string,uint64,string)" }],
      "returns": { "type": "uint64" }
    }
  ]
}
```
Validating ABI Values
There are three main categories of ABI types:
- Fixed-length types
- Dynamic arrays
- Dynamic tuples
The implications of validation differ for each category and are detailed below. In summary, invalid encodings of ARC-4 values can lead to critical security issues.
Fixed-Length Types
If a fixed-length type is longer or shorter than it should be, this can lead to unintended memory access. For example, a StaticBytes<32> should always be 32 bytes, but if it’s longer it can be used to overwrite other values in an array.
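The effect can be modelled with plain byte strings. The sketch below is only an off-chain illustration, not a description of how the AVM or any compiler lays out memory: it concatenates a supposedly 32-byte value with a uint64 and then reads the uint64 at its expected offset. `read_second_element` and the value 42 are hypothetical.

```python
SUPER_IMPORTANT_VALUE = 42  # arbitrary placeholder


def read_second_element(static_bytes: bytes, validate: bool) -> int:
    """Build (byte[32], uint64) by concatenation, then read the uint64 at offset 32."""
    if validate and len(static_bytes) != 32:
        raise ValueError(f"expected 32 bytes, got {len(static_bytes)}")
    encoded = static_bytes + SUPER_IMPORTANT_VALUE.to_bytes(8, "big")
    # The reader trusts the declared layout: element 1 lives at bytes 32..40
    return int.from_bytes(encoded[32:40], "big")


honest = bytes(32)
oversized = bytes(32) + (1337).to_bytes(8, "big")

print(read_second_element(honest, validate=False))     # the real value
print(read_second_element(oversized, validate=False))  # attacker-controlled bytes
```

With validation enabled, the oversized input is rejected before the tuple is ever built.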
For example:
```python
@abimethod(validate_encoding="unsafe_disabled")
def static_value(self, static_bytes: arc4.StaticArray[arc4.Byte, Literal[32]]) -> arc4.UInt64:
    # ⚠️ VULNERABLE: If static_bytes is more than 32 bytes,
    # it will overflow into SUPER_IMPORTANT_VALUE
    array = arc4.Tuple((static_bytes.copy(), arc4.UInt64(SUPER_IMPORTANT_VALUE)))
    return array[1]
```
```typescript
@abimethod({ validateEncoding: "unsafe-disabled" })
staticValue(staticBytes: StaticBytes<32>): uint64 {
  const array: [StaticBytes<32>, uint64] = [
    // ⚠️ VULNERABLE: If static_bytes is more than 32 bytes,
    // it will overflow into SUPER_IMPORTANT_VALUE
    staticBytes,
    SUPER_IMPORTANT_VALUE,
  ];

  return array[1]; // Returns the last 8 bytes of staticBytes instead of SUPER_IMPORTANT_VALUE
}
```
Dynamic Arrays
Dynamic ABI arrays are always prefixed with their length, a 2-byte big-endian element count. For example, the ABI encoding of 0xdeadbeef as byte[] is 0x0004deadbeef because 0xdeadbeef is 4 bytes long. If the ABI length prefix is larger than the actual number of elements, this can lead to an AVM panic when trying to access out-of-bounds memory. If the ABI length prefix is smaller than the actual number of elements, this can lead to unintended behavior in contract logic.
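A length-prefix check for arrays of fixed-size elements might look like the sketch below. `split_dynamic_array` is a hypothetical helper written in plain Python to show the rule being enforced: the number of bytes after the prefix must equal the declared element count times the element size.

```python
def split_dynamic_array(encoded: bytes, element_size: int) -> list:
    """Validate the 2-byte length prefix and return the elements as byte strings."""
    if len(encoded) < 2:
        raise ValueError("missing length prefix")
    count = int.from_bytes(encoded[:2], "big")
    body = encoded[2:]
    if len(body) != count * element_size:
        raise ValueError(
            f"prefix declares {count} elements but body holds {len(body)} bytes"
        )
    return [body[i:i + element_size] for i in range(0, len(body), element_size)]


# byte[] example from the text: 4 one-byte elements
print(split_dynamic_array(bytes.fromhex("0004deadbeef"), 1))

# uint64[] whose prefix under-reports the element count is rejected
two_numbers = (4).to_bytes(8, "big") + (7).to_bytes(8, "big")
try:
    split_dynamic_array((1).to_bytes(2, "big") + two_numbers, 8)
except ValueError as err:
    print("rejected:", err)
```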
For example:
```python
even_numbers: GlobalState[DynamicArray[UInt64]]

def __init__(self) -> None:
    self.even_numbers = GlobalState(DynamicArray[UInt64])

@abimethod(validate_encoding="unsafe_disabled")
def store_numbers(self, numbers: DynamicArray[UInt64]) -> None:
    # If the ABI prefix for numbers is more than the actual amount of numbers, this will panic
    # If the ABI prefix for numbers is less than the actual amount of numbers, not all numbers will be validated
    for num in numbers:
        assert num % 2 == 0, "Only even numbers are allowed"
    self.even_numbers.value = numbers.copy()

@abimethod()
def get_even_number(self, index: UInt64) -> UInt64:
    # If the index is larger than what was given as the ABI prefix, this may potentially return an odd number that
    # bypassed the validation in store_numbers
    return self.even_numbers.value[index]
```
```typescript
evenNumbers = GlobalState<uint64[]>();

@abimethod({ validateEncoding: "unsafe-disabled" })
storeNumbers(numbers: uint64[]) {
  // If the ABI prefix for numbers is more than the actual amount of numbers, this will panic
  // If the ABI prefix for numbers is less than the actual amount of numbers, not all numbers will be validated
  for (const num of numbers) {
    assert(num % 2 === 0, "Only even numbers are allowed");
  }

  this.evenNumbers.value = numbers;
}

getEvenNumber(index: uint64): uint64 {
  // If the index is larger than what was given as the ABI prefix, this may potentially return an odd number that
  // bypassed the validation in storeNumbers
  return this.evenNumbers.value[index];
}
```
Dynamic Tuples
Tuples with dynamically sized elements are encoded as two sections of a byte array: the head, which contains offsets into the byte array where the values live, and the tail, which contains the actual values. For example, [0xdead, 0xbeef] encoded as (byte[], byte[]) is 0x000400080002dead0002beef because 0x0002dead starts at byte 4 and 0x0002beef starts at byte 8. If the offsets are larger than the byte length, the AVM can panic. Offsets that point to the wrong position can lead to unintended behavior in contract logic.
Most high-level languages, such as the ones that use the Puya compiler, will always use the head offsets to extract values from the tuple. This means that incorrect head offsets will lead to panics when attempting to use those values, which prevents unintended memory access. Care should still be taken when validation is disabled, because this behavior is not guaranteed to always hold.
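To show what validating a dynamic tuple involves, here is a minimal plain-Python sketch for the (byte[], byte[]) case used above. The function names are hypothetical. The decoder is deliberately strict: it requires the offsets to match the canonical layout exactly and rejects trailing bytes, which is one possible validation policy rather than a statement of what any particular compiler enforces.

```python
def _read_byte_array(encoded: bytes, offset: int) -> bytes:
    """Read a length-prefixed byte[] starting at offset, with bounds checks."""
    if offset + 2 > len(encoded):
        raise ValueError("offset points past the end of the encoding")
    length = int.from_bytes(encoded[offset:offset + 2], "big")
    end = offset + 2 + length
    if end > len(encoded):
        raise ValueError("length prefix runs past the end of the encoding")
    return encoded[offset + 2:end]


def encode_bytes_pair(first: bytes, second: bytes) -> bytes:
    """Encode (byte[], byte[]) as a head of two 2-byte offsets plus a tail."""
    tails = [len(value).to_bytes(2, "big") + value for value in (first, second)]
    head_length = 4  # two offsets, 2 bytes each
    offsets = [head_length, head_length + len(tails[0])]
    head = b"".join(offset.to_bytes(2, "big") for offset in offsets)
    return head + b"".join(tails)


def decode_bytes_pair(encoded: bytes) -> tuple:
    """Decode (byte[], byte[]), rejecting any non-canonical head offsets."""
    if len(encoded) < 4:
        raise ValueError("encoding is shorter than the tuple head")
    first_offset = int.from_bytes(encoded[0:2], "big")
    second_offset = int.from_bytes(encoded[2:4], "big")
    if first_offset != 4:
        raise ValueError("first offset must point just past the head")
    first = _read_byte_array(encoded, first_offset)
    if second_offset != first_offset + 2 + len(first):
        raise ValueError("second offset does not follow the first element")
    second = _read_byte_array(encoded, second_offset)
    if second_offset + 2 + len(second) != len(encoded):
        raise ValueError("unexpected trailing bytes")
    return first, second


encoded = encode_bytes_pair(bytes.fromhex("dead"), bytes.fromhex("beef"))
print(encoded.hex())
print(decode_bytes_pair(encoded))
```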