Caching Files in HyperBEAM
A beginner's guide to persistent caching with lazy loading
What You'll Learn
By the end of this tutorial, you'll understand:
- The Cache — Content-addressed storage for messages and results
- Lazy Loading — Loading data on-demand to save memory
- Read & Write — Storing and retrieving cached data
- Cache Control — HTTP-style directives for caching behavior
- How these pieces work together to create efficient storage
No prior HyperBEAM caching knowledge required. Basic Erlang helps, but we'll explain as we go.
The Big Picture
HyperBEAM's cache provides content-addressed storage for messages and computation results. Data is stored in three layers, with automatic deduplication through content hashing.
When you cache data, it's stored once regardless of how many times it's referenced. Links point to cached data, loading it only when needed.
Here's the mental model:
Message → Write → Content Hash → Store
↓ ↓ ↓
Data Link Deduplication
Think of it like a library:
- Cache = The library building
- Content Hash = The book's ISBN (unique identifier)
- Link = A catalog reference card
- Lazy Loading = Only fetching books when someone actually wants to read them
Let's build each piece.
Part 1: Writing to Cache
📖 Reference: hb_cache
The cache stores messages (maps) and binary data. Every piece of content gets a unique ID based on its content hash.
Writing a Message
%% Get a store (we'll use the default)
Store = hb_store:default(),
Opts = #{store => Store},
%% Create a message
Msg = #{<<"key">> => <<"value">>},
%% Write it to cache
{ok, ID} = hb_cache:write(Msg, Opts).
%% ID is the content-addressed identifier
The write/2 function:
- Computes content hash of your data
- Stores data at that hash (deduplication!)
- Returns the ID for future retrieval
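The three steps above can be pictured with a toy content-addressed store. This is only a sketch using the Erlang standard library, not HyperBEAM's implementation: the module name `cas_sketch`, the use of a plain map as the store, and the hex-encoded SHA-256 ID are all assumptions made for illustration.

```erlang
-module(cas_sketch).
-export([write/2, read/2]).

%% Toy content-addressed store: an Erlang map from a hex-encoded
%% SHA-256 digest to the stored binary. Hypothetical model only.

%% Store Bin under the hash of its own content and return the ID
%% together with the updated store.
write(Bin, Store) when is_binary(Bin), is_map(Store) ->
    ID = binary:encode_hex(crypto:hash(sha256, Bin)),
    %% Writing identical content again overwrites the same key,
    %% so the store never holds two copies.
    {ID, Store#{ID => Bin}}.

%% Look content up by ID, mirroring the {ok, _} | not_found shape.
read(ID, Store) ->
    case maps:find(ID, Store) of
        {ok, Bin} -> {ok, Bin};
        error -> not_found
    end.
```

Writing the same binary twice yields the same ID and leaves a single entry in the map, which is the deduplication property described above.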
Writing Binary Data
For raw binary data, use write_binary/3:
Binary = <<"Hello, HyperBEAM!">>,
Hashpath = <<"my-custom-path">>,
{ok, DataPath} = hb_cache:write_binary(Hashpath, Binary, Opts).
%% Creates a link at Hashpath pointing to the binary
Quick Reference: Write Functions
| Function | What it does |
|---|---|
| hb_cache:write(Msg, Opts) | Store a message, return ID |
| hb_cache:write_binary(Path, Binary, Opts) | Store binary at path |
| hb_cache:write_hashpath(Msgs, Opts) | Write hashpath for message chain |
Part 2: Reading from Cache
📖 Reference: hb_cache
Reading returns the first layer of data. Nested messages remain as links until you explicitly load them.
Basic Read
%% Write first
Msg = #{<<"name">> => <<"Alice">>},
{ok, ID} = hb_cache:write(Msg, Opts),
%% Read it back
{ok, ReadMsg} = hb_cache:read(ID, Opts).
%% ReadMsg contains the message (first layer only)
Handling Not Found
case hb_cache:read(SomeID, Opts) of
    {ok, Msg} ->
        %% Found it
        process(Msg);
    not_found ->
        %% ID doesn't exist in cache
        handle_missing()
end.
Reading Signed Messages
Signed messages can be read by both their unsigned and signed IDs:
Wallet = ar_wallet:new(),
%% Create and sign a message
Msg = hb_message:commit(
    #{<<"data">> => <<"test">>},
    Wallet
),
%% Get the signed ID
SignedID = hb_message:id(Msg, signed, Opts),
%% Write it
{ok, _} = hb_cache:write(Msg, Opts),
%% Read by signed ID
{ok, Read} = hb_cache:read(hb_util:human_id(SignedID), Opts).
Quick Reference: Read Functions
| Function | What it does |
|---|---|
| hb_cache:read(ID, Opts) | Read message by ID |
| hb_cache:read_resolved(Msg1, Msg2, Opts) | Read cached computation result |
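To make "read cached computation result" concrete, here is a minimal read-through cache sketch in plain Erlang. It is a simplified model, not HyperBEAM's code: the module `memo_sketch` and the function `get_or_compute/3` are hypothetical, and results are keyed by the `{Msg1, Msg2}` tuple rather than by a hashpath.

```erlang
-module(memo_sketch).
-export([get_or_compute/3]).

%% Cache is a plain map keyed by the {Msg1, Msg2} pair. Keying by the
%% tuple is a simplification made for this sketch.
get_or_compute(Key = {_Msg1, _Msg2}, Compute, Cache)
        when is_function(Compute, 0), is_map(Cache) ->
    case maps:find(Key, Cache) of
        {ok, Value} ->
            %% Hit: return the stored result without running Compute.
            {hit, Value, Cache};
        error ->
            %% Miss: run the computation once and remember its result.
            Value = Compute(),
            {miss, Value, Cache#{Key => Value}}
    end.
```

The first call for a given pair reports `miss` and stores the result; a second call with the same pair reports `hit` and never invokes the function.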
Part 3: Lazy Loading
📖 Reference: hb_cache
Lazy loading is the cache's superpower. Instead of loading entire nested structures into memory, HyperBEAM uses links—lightweight references that only load when accessed.
The Link Structure
%% A link looks like this:
{link, ID, LinkOpts}
%% LinkOpts contain loading hints
LinkOpts = #{
    <<"type">> => <<"link">>,
    <<"lazy">> => true,
    store => Store
}
Loading One Layer: ensure_loaded/2
Load just the first layer, keeping nested items as links:
%% Write nested data
Inner = #{<<"inner">> => <<"data">>},
Outer = #{<<"outer">> => Inner},
{ok, OuterID} = hb_cache:write(Outer, Opts),
%% Read returns first layer with links
{ok, ReadOuter} = hb_cache:read(OuterID, Opts),
%% ensure_loaded resolves the top-level link
Loaded = hb_cache:ensure_loaded(ReadOuter, Opts).
%% Loaded has first layer; nested "outer" is still a link
Loading Everything: ensure_all_loaded/2
When you need the complete data structure:
%% Fully resolve all nested links
FullyLoaded = hb_cache:ensure_all_loaded(ReadOuter, Opts).
%% Now you can navigate with direct map access
OuterVal = maps:get(<<"outer">>, FullyLoaded),
InnerVal = maps:get(<<"inner">>, OuterVal).
%% InnerVal = <<"data">>
⚠️ Performance Warning: ensure_all_loaded/2 recursively loads everything. For deeply nested messages, this can be expensive. Only use when you truly need all the data.
When to Use Each
| Function | Use When |
|---|---|
| ensure_loaded/2 | You only need top-level fields |
| ensure_all_loaded/2 | You need to traverse the entire structure |
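The difference between the two loading strategies can be sketched with a toy store in plain Erlang. This is an illustrative model under stated assumptions, not the real link format: links are written here as two-element `{link, ID}` tuples, the store is a map, and the module `lazy_sketch` with `load/2` and `load_all/2` is hypothetical.

```erlang
-module(lazy_sketch).
-export([load/2, load_all/2]).

%% Toy model: Store maps an ID to a value. A value may be a binary,
%% a {link, ID} tuple, or a map whose values are any of these.

%% Follow the top-level link (and any chain of links it points to).
%% Values nested inside the result are left exactly as stored.
load({link, ID}, Store) -> load(maps:get(ID, Store), Store);
load(Value, _Store) -> Value.

%% Resolve links recursively, at every depth of the structure.
load_all({link, ID}, Store) ->
    load_all(maps:get(ID, Store), Store);
load_all(Map, Store) when is_map(Map) ->
    maps:map(fun(_Key, V) -> load_all(V, Store) end, Map);
load_all(Value, _Store) ->
    Value.
```

With an outer message whose field points at an inner one, `load/2` returns the outer map with the inner link still in place, while `load_all/2` replaces that link with the inner map. The cost of `load_all/2` grows with the number of links reachable from the starting value.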
Complete Example
-module(lazy_loading_example).
-export([demo/0]).
demo() ->
    Store = hb_store:default(),
    Opts = #{store => Store},
    %% Create deeply nested structure
    Deep = #{<<"deep">> => <<"treasure">>},
    Middle = #{<<"middle">> => Deep},
    Top = #{<<"top">> => Middle},
    %% Write to cache
    {ok, TopID} = hb_cache:write(Top, Opts),
    %% Read (returns with links)
    {ok, ReadTop} = hb_cache:read(TopID, Opts),
    %% Load everything
    Loaded = hb_cache:ensure_all_loaded(ReadTop, Opts),
    %% Navigate to the treasure
    TopVal = maps:get(<<"top">>, Loaded),
    MiddleVal = maps:get(<<"middle">>, TopVal),
    DeepVal = maps:get(<<"deep">>, MiddleVal),
    io:format("Found: ~p~n", [DeepVal]).
%% Prints: Found: <<"treasure">>
Quick Reference: Loading Functions
| Function | What it does |
|---|---|
| hb_cache:ensure_loaded(Value, Opts) | Load first layer only |
| hb_cache:ensure_all_loaded(Value, Opts) | Recursively load everything |
Part 4: Cache Control
📖 Reference: hb_cache_control
Cache control determines when to cache and when to use cached results. HyperBEAM uses HTTP-style directives with clear precedence rules.
Cache Control Directives
| Directive | Effect |
|---|---|
| <<"always">> | Always store and lookup |
| <<"store">> | Enable storing |
| <<"no-store">> | Disable storing |
| <<"cache">> | Enable lookup |
| <<"no-cache">> | Disable lookup (force recompute) |
| <<"only-if-cached">> | Return error if not cached |
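One way to read the table is as a mapping from directives to two switches, store and lookup, plus an only-if-cached flag. The sketch below models that mapping in plain Erlang. The module `directive_sketch`, the settings keys, and the rule that a later directive in the same list overrides an earlier one are assumptions made for illustration, not the behaviour of hb_cache_control.

```erlang
-module(directive_sketch).
-export([settings/1]).

%% Fold a list of directives into a settings map. Keys that no
%% directive mentions are simply absent from the result.
settings(Directives) when is_list(Directives) ->
    lists:foldl(fun apply_directive/2, #{}, Directives).

%% Each clause mirrors one row of the directive table.
apply_directive(<<"always">>, Acc) -> Acc#{store => true, lookup => true};
apply_directive(<<"store">>, Acc) -> Acc#{store => true};
apply_directive(<<"no-store">>, Acc) -> Acc#{store => false};
apply_directive(<<"cache">>, Acc) -> Acc#{lookup => true};
apply_directive(<<"no-cache">>, Acc) -> Acc#{lookup => false};
apply_directive(<<"only-if-cached">>, Acc) -> Acc#{only_if_cached => true};
%% Unknown directives are ignored in this sketch.
apply_directive(_Unknown, Acc) -> Acc.
```

For example, `[<<"always">>]` turns on both switches, while `[<<"no-cache">>, <<"only-if-cached">>]` disables lookup and sets the only-if-cached flag.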
Using Cache Control
%% Always cache
Opts1 = #{cache_control => [<<"always">>]},
{ok, Result} = hb_ao:resolve(Msg1, Msg2, Opts1).
%% Result is automatically cached
%% Force fresh computation
Opts2 = #{cache_control => [<<"no-cache">>]},
{ok, Fresh} = hb_ao:resolve(Msg1, Msg2, Opts2).
%% Ignores any cached result
%% Only use cache, fail if missing
Opts3 = #{cache_control => [<<"only-if-cached">>]},
case hb_ao:resolve(Msg1, Msg2, Opts3) of
    {ok, Cached} -> Cached;
    {error, _} -> not_in_cache
end.
Conditional Store
Use maybe_store/4 to conditionally cache results:
Msg1 = #{<<"key">> => <<"value">>},
Msg2 = #{<<"path">> => <<"key">>},
Msg3 = <<"result">>,
Opts = #{store => Store, cache_control => [<<"always">>]},
case hb_cache_control:maybe_store(Msg1, Msg2, Msg3, Opts) of
    {ok, Path} ->
        io:format("Cached at: ~p~n", [Path]);
    not_caching ->
        io:format("Caching disabled~n")
end.
Conditional Lookup
Use maybe_lookup/3 to check cache before computing:
case hb_cache_control:maybe_lookup(Msg1, Msg2, Opts) of
    {ok, Cached} ->
        %% Cache hit!
        {cached, Cached};
    {continue, M1, M2} ->
        %% Cache miss, continue to compute
        compute(M1, M2);
    {error, #{<<"status">> := 504}} ->
        %% only-if-cached was set but not found
        {error, not_cached}
end.
Precedence Rules
Cache control settings come from multiple sources with clear precedence:
- Opts (highest) — Node operator has final say
- Msg3 — Result message from device
- Msg2 (lowest) — User request message
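The precedence order in the list above can be modeled as a merge in which the first source to define a setting wins. The following is a simplified sketch, assuming each source has already been reduced to a settings map. The module `precedence_sketch` and the `store`/`lookup` keys are hypothetical names, not part of HyperBEAM.

```erlang
-module(precedence_sketch).
-export([resolve/1]).

%% Sources is a list of settings maps ordered from highest to lowest
%% precedence, e.g. [FromOpts, FromMsg3, FromMsg2]. For each setting,
%% the first source that defines it wins.
resolve(Sources) when is_list(Sources) ->
    lists:foldl(
        fun(Source, Acc) ->
            %% maps:merge/2 keeps the second map's value on conflict, so
            %% settings already taken from a higher source are preserved.
            maps:merge(Source, Acc)
        end,
        #{},
        Sources
    ).
```

If the Opts-level map disables lookup and the Msg2-level map enables it, the result keeps lookup disabled, while a store setting defined only by Msg3 still comes through.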
%% Msg2 says "cache", but Opts says "no-cache"
%% Result: no-cache wins (Opts has highest precedence)
Msg2 = #{<<"cache-control">> => [<<"cache">>]},
Opts = #{cache_control => [<<"no-cache">>]}.
Async Caching
For performance, cache in the background:
Opts = #{async_cache => true},
hb_cache_control:maybe_store(Msg1, Msg2, Result, Opts).
%% Returns immediately, caches in background worker
Part 5: Listing and Matching
📖 Reference: hb_cache
The cache provides utilities for listing contents and finding messages by template.
Listing Keys
%% Write a message with multiple keys
Msg = #{<<"a">> => <<"1">>, <<"b">> => <<"2">>, <<"c">> => <<"3">>},
{ok, ID} = hb_cache:write(Msg, Opts),
%% List the keys
Keys = hb_cache:list(ID, Opts).
%% Keys = [<<"a">>, <<"b">>, <<"c">>]
Listing Numbered Keys
For sequential data (like scheduler slots):
Msg = #{
    <<"1">> => <<"first">>,
    <<"2">> => <<"second">>,
    <<"5">> => <<"fifth">>,
    <<"10">> => <<"tenth">>
},
{ok, ID} = hb_cache:write(Msg, Opts),
Numbers = hb_cache:list_numbered(ID, Opts).
%% Numbers = [1, 2, 5, 10] (sorted integers)
Matching Messages
Find messages matching a template (requires LMDB backend):
%% Write some messages
{ok, ID1} = hb_cache:write(#{<<"type">> => <<"user">>, <<"name">> => <<"Alice">>}, Opts),
{ok, ID2} = hb_cache:write(#{<<"type">> => <<"user">>, <<"name">> => <<"Bob">>}, Opts),
{ok, _} = hb_cache:write(#{<<"type">> => <<"post">>, <<"title">> => <<"Hello">>}, Opts),
%% Find all users
Template = #{<<"type">> => <<"user">>},
{ok, UserIDs} = hb_cache:match(Template, Opts).
%% UserIDs = [ID1, ID2]
Quick Reference: Utility Functions
| Function | What it does |
|---|---|
| hb_cache:list(Path, Opts) | List keys under a path |
| hb_cache:list_numbered(Path, Opts) | List numeric keys as sorted integers |
| hb_cache:match(Template, Opts) | Find messages matching template |
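To show what "numeric keys as sorted integers" and "matching a template" mean in practice, here is a small stand-alone sketch over ordinary Erlang data. It is a hypothetical model, not how the LMDB-backed match is implemented: the module `listing_sketch`, its two functions, and the map used as the store are assumptions for illustration.

```erlang
-module(listing_sketch).
-export([numbered/1, match/2]).

%% Keep only the keys that parse completely as integers, sorted
%% numerically (so 10 comes after 5, unlike a binary sort).
numbered(Keys) ->
    lists:sort([N || Key <- Keys, {N, <<>>} <- [string:to_integer(Key)]]).

%% Return the sorted IDs of stored messages that contain every
%% key/value pair of Template. Extra keys in a message are allowed.
match(Template, Store) when is_map(Template), is_map(Store) ->
    lists:sort([ID || {ID, Msg} <- maps:to_list(Store),
                      is_map(Msg),
                      matches(Template, Msg)]).

%% True when Msg holds the same value as Template for each template key.
matches(Template, Msg) ->
    maps:fold(
        fun(Key, Value, Ok) ->
            Ok andalso maps:find(Key, Msg) =:= {ok, Value}
        end,
        true,
        Template
    ).
```

A template is a partial message, so an empty template matches every stored map, and adding keys to the template narrows the result.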
Part 6: Test It
Create a test file src/test/test_hb5.erl:
-module(test_hb5).
-include_lib("eunit/include/eunit.hrl").
-include("include/hb.hrl").
%% Run with: rebar3 eunit --module=test_hb5
basic_write_read_test() ->
    Store = hb_test_utils:test_store(),
    hb_store:reset(Store),
    Opts = #{store => Store},
    %% Write a message
    Msg = #{<<"greeting">> => <<"Hello, World!">>},
    {ok, ID} = hb_cache:write(Msg, Opts),
    ?debugFmt("Written with ID: ~p", [ID]),
    %% Read it back
    {ok, Read} = hb_cache:read(ID, Opts),
    Loaded = hb_cache:ensure_all_loaded(Read, Opts),
    ?assertEqual(<<"Hello, World!">>, maps:get(<<"greeting">>, Loaded)),
    ?debugFmt("Basic write/read: OK", []).
nested_lazy_loading_test() ->
    Store = hb_test_utils:test_store(),
    hb_store:reset(Store),
    Opts = #{store => Store},
    %% Create deeply nested structure
    Level3 = #{<<"value">> => <<"treasure">>},
    Level2 = #{<<"nested">> => Level3},
    Level1 = #{<<"data">> => Level2},
    {ok, ID} = hb_cache:write(Level1, Opts),
    ?debugFmt("Wrote nested structure with ID: ~p", [ID]),
    %% Read returns links
    {ok, Read} = hb_cache:read(ID, Opts),
    %% Full load resolves all links
    Loaded = hb_cache:ensure_all_loaded(Read, Opts),
    %% Navigate to the treasure
    Data = maps:get(<<"data">>, Loaded),
    Nested = maps:get(<<"nested">>, Data),
    Value = maps:get(<<"value">>, Nested),
    ?assertEqual(<<"treasure">>, Value),
    ?debugFmt("Nested lazy loading: OK", []).
deduplication_test() ->
    Store = hb_test_utils:test_store(),
    hb_store:reset(Store),
    Opts = #{store => Store},
    %% Same content should get same ID (content-addressed!)
    Msg = #{<<"x">> => <<"same content">>},
    {ok, ID1} = hb_cache:write(Msg, Opts),
    {ok, ID2} = hb_cache:write(Msg, Opts),
    ?assertEqual(ID1, ID2),
    ?debugFmt("Deduplication verified: same ID for same content", []).
not_found_test() ->
    Store = hb_test_utils:test_store(),
    hb_store:reset(Store),
    Opts = #{store => Store},
    %% Try to read non-existent ID
    FakeID = hb_util:human_id(<<1:256>>),
    Result = hb_cache:read(FakeID, Opts),
    ?assertEqual(not_found, Result),
    ?debugFmt("Not found handling: OK", []).
cache_control_always_test() ->
    Store = hb_test_utils:test_store(),
    hb_store:reset(Store),
    Msg1 = #{<<"key">> => <<"cached-value">>},
    Msg2 = <<"key">>,
    %% First resolve with "always" to cache the result
    Opts1 = #{store => Store, cache_control => [<<"always">>]},
    {ok, Res1} = hb_ao:resolve(Msg1, Msg2, Opts1),
    ?assertEqual(<<"cached-value">>, Res1),
    ?debugFmt("Resolved and cached with 'always'", []),
    %% Now use only-if-cached - should hit cache
    Opts2 = #{store => Store, cache_control => [<<"only-if-cached">>]},
    {ok, Res2} = hb_ao:resolve(Msg1, Msg2, Opts2),
    ?assertEqual(<<"cached-value">>, Res2),
    ?debugFmt("Cache hit with 'only-if-cached': OK", []).
cache_control_no_store_test() ->
    Store = hb_test_utils:test_store(),
    hb_store:reset(Store),
    Opts = #{store => Store},
    Msg1 = #{<<"key">> => <<"value">>},
    Msg2 = #{<<"cache-control">> => [<<"no-store">>]},
    Msg3 = <<"result">>,
    %% Should not cache with no-store directive
    Result = hb_cache_control:maybe_store(Msg1, Msg2, Msg3, Opts),
    ?assertEqual(not_caching, Result),
    ?debugFmt("no-store directive respected: OK", []).
list_keys_test() ->
    Store = hb_test_utils:test_store(),
    hb_store:reset(Store),
    Opts = #{store => Store},
    %% Write message with multiple keys
    Msg = #{<<"alpha">> => <<"1">>, <<"beta">> => <<"2">>, <<"gamma">> => <<"3">>},
    {ok, ID} = hb_cache:write(Msg, Opts),
    %% List returns all keys
    Keys = hb_cache:list(ID, Opts),
    SortedKeys = lists:sort(Keys),
    ?assertEqual([<<"alpha">>, <<"beta">>, <<"gamma">>], SortedKeys),
    ?debugFmt("List keys: OK", []).
list_numbered_test() ->
    Store = hb_test_utils:test_store(),
    hb_store:reset(Store),
    Opts = #{store => Store},
    %% Write message with numbered keys
    Msg = #{
        <<"1">> => <<"first">>,
        <<"2">> => <<"second">>,
        <<"5">> => <<"fifth">>,
        <<"10">> => <<"tenth">>
    },
    {ok, ID} = hb_cache:write(Msg, Opts),
    %% Returns sorted integers
    Numbers = hb_cache:list_numbered(ID, Opts),
    ?assertEqual([1, 2, 5, 10], lists:sort(Numbers)),
    ?debugFmt("List numbered: OK", []).
complete_workflow_test() ->
    ?debugFmt("=== Complete Caching Workflow Test ===", []),
    Store = hb_test_utils:test_store(),
    hb_store:reset(Store),
    Opts = #{store => Store},
    %% 1. Create nested data
    Inner = #{<<"secret">> => <<"hidden treasure">>},
    Outer = #{<<"container">> => Inner, <<"label">> => <<"box">>},
    ?debugFmt("1. Created nested data structure", []),
    %% 2. Write to cache
    {ok, ID} = hb_cache:write(Outer, Opts),
    ?debugFmt("2. Cached with ID: ~p", [ID]),
    %% 3. Read back (lazy)
    {ok, Read} = hb_cache:read(ID, Opts),
    ?debugFmt("3. Read from cache (with links)", []),
    %% 4. Fully load
    Loaded = hb_cache:ensure_all_loaded(Read, Opts),
    ?debugFmt("4. Fully loaded all nested data", []),
    %% 5. Verify structure
    Label = maps:get(<<"label">>, Loaded),
    ?assertEqual(<<"box">>, Label),
    Container = maps:get(<<"container">>, Loaded),
    Secret = maps:get(<<"secret">>, Container),
    ?assertEqual(<<"hidden treasure">>, Secret),
    ?debugFmt("5. Verified nested structure", []),
    %% 6. Verify deduplication
    {ok, ID2} = hb_cache:write(Outer, Opts),
    ?assertEqual(ID, ID2),
    ?debugFmt("6. Verified content-addressed deduplication", []),
    ?debugFmt("=== All caching tests passed! ===", []).
Run the Tests
rebar3 eunit --module=test_hb5
Common Patterns
Pattern 1: Write → Read → Load
Msg = #{<<"key">> => <<"value">>},
{ok, ID} = hb_cache:write(Msg, Opts),
{ok, Read} = hb_cache:read(ID, Opts),
Loaded = hb_cache:ensure_all_loaded(Read, Opts).
Pattern 2: Cache Computation Results
%% Check cache first
case hb_cache:read_resolved(Msg1, Msg2, Opts) of
    {hit, {ok, Cached}} ->
        {cached, Cached};
    miss ->
        %% Compute and cache
        Result = compute(Msg1, Msg2),
        Hashpath = hb_path:hashpath(Msg1, Msg2, Opts),
        hb_cache:write_binary(Hashpath, Result, Opts),
        {computed, Result}
end.
Pattern 3: Conditional Caching with Control
%% Let cache control decide
case hb_cache_control:maybe_lookup(Msg1, Msg2, Opts) of
    {ok, Cached} ->
        Cached;
    {continue, M1, M2} ->
        Result = compute(M1, M2),
        hb_cache_control:maybe_store(Msg1, Msg2, Result, Opts),
        Result;
    {error, Reason} ->
        %% e.g. only-if-cached was set and nothing was cached
        {error, Reason}
end.
Pattern 4: Force Fresh Computation
Opts = #{cache_control => [<<"no-cache">>]},
{ok, Fresh} = hb_ao:resolve(Msg1, Msg2, Opts).
Pattern 5: Background Caching
Opts = #{async_cache => true, cache_control => [<<"store">>]},
hb_cache_control:maybe_store(Msg1, Msg2, Result, Opts).
%% Returns immediately
What's Next?
You now understand the caching fundamentals:
| Concept | Module | Key Functions |
|---|---|---|
| Write | hb_cache | write, write_binary, write_hashpath |
| Read | hb_cache | read, read_resolved |
| Lazy Load | hb_cache | ensure_loaded, ensure_all_loaded |
| Control | hb_cache_control | maybe_store, maybe_lookup |
| Utilities | hb_cache | list, list_numbered, match |
Going Further
- Storage Backends: Learn about hb_store_lmdb, hb_store_fs, and hb_store_rocksdb
- Message System: Deep dive into hb_message for signing and verification
- The Resolver: See how hb_ao:resolve/3 uses caching internally
Quick Reference Card
📖 Reference: hb_cache | hb_cache_control
%% === SETUP ===
Store = hb_store:default().
Opts = #{store => Store}.
%% === WRITE ===
{ok, ID} = hb_cache:write(Msg, Opts).
{ok, Path} = hb_cache:write_binary(Hashpath, Binary, Opts).
%% === READ ===
{ok, Msg} = hb_cache:read(ID, Opts).
{hit, Result} = hb_cache:read_resolved(Msg1, Msg2, Opts).
not_found = hb_cache:read(BadID, Opts).
%% === LAZY LOADING ===
Loaded = hb_cache:ensure_loaded(Link, Opts).
FullyLoaded = hb_cache:ensure_all_loaded(Msg, Opts).
%% === CACHE CONTROL ===
Opts1 = #{cache_control => [<<"always">>]}.
Opts2 = #{cache_control => [<<"no-cache">>]}.
Opts3 = #{cache_control => [<<"only-if-cached">>]}.
Opts4 = #{async_cache => true}.
{ok, Path} = hb_cache_control:maybe_store(Msg1, Msg2, Result, Opts).
{ok, Cached} = hb_cache_control:maybe_lookup(Msg1, Msg2, Opts).
{continue, M1, M2} = hb_cache_control:maybe_lookup(Msg1, Msg2, Opts).
%% === UTILITIES ===
Keys = hb_cache:list(ID, Opts).
Nums = hb_cache:list_numbered(ID, Opts).
{ok, IDs} = hb_cache:match(Template, Opts).
Now go cache something efficiently!
Resources
HyperBEAM Documentation
- hb_cache Reference: Cache functions
- hb_cache_control Reference: Cache control logic
- hb_store Reference: Storage abstraction
- Full Reference: All modules
- hb_store: Storage interface
- hb_store_lmdb: LMDB storage backend
- hb_message: Message creation and signing
- hb_link: Link structure and resolution
- HTTP Cache-Control: HTTP caching semantics (for reference)
- Content-Addressable Storage: The deduplication principle