
Fetching Data from Remote Sources

A beginner's guide to remote storage in HyperBEAM


What You'll Learn

By the end of this tutorial, you'll understand:

  1. Remote Storage Concept — Fetching data over the network
  2. Gateway Store — Reading from Arweave gateways
  3. Remote Node Store — Reading from other HyperBEAM nodes
  4. Local Caching — Speed up repeated reads
  5. How to combine remote and local stores for resilient data access

No prior HyperBEAM knowledge required. Basic Erlang helps, but we'll explain as we go.


The Big Picture

Every HyperBEAM store operates in one of two scopes:

  • Local — Data lives on your machine (filesystem, LMDB, RocksDB)
  • Remote — Data fetched over the network

Remote stores enable a powerful pattern: your node can transparently access data from anywhere—Arweave gateways, other HyperBEAM nodes, or both. When combined with local caching, you get the best of both worlds: network access with local performance.

Here's the mental model:

Your Node                          Remote Sources
    |                                    |
    +-- Local Cache (fast) --------------+
    |          | miss                    |
    +-- Gateway Store -------------------> Arweave Gateway
    |          or                        |
    +-- Remote Node Store ---------------> Other HyperBEAM Node

Think of it like a library system:

  • Gateway Store = The main archive (Arweave's permanent storage)
  • Remote Node Store = A branch library (another HyperBEAM node)
  • Local Cache = Your personal bookshelf (fast access to frequently used data)

Let's explore each component.


Part 1: Understanding Remote Scope

📖 Reference: hb_store

Every store in HyperBEAM has a scope that indicates where data lives:

%% Local stores return 'local'
hb_store_fs:scope(Opts).     %% => local
hb_store_lmdb:scope(Opts).   %% => local
 
%% Remote stores return 'remote'
hb_store_gateway:scope(Opts).      %% => remote
hb_store_remote_node:scope(Opts).  %% => remote

Why does scope matter?

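  • Performance: a remote read pays a network round trip (around 500 ms, versus under 1 ms for a local read)
  • Availability: remote sources may be temporarily unreachable, so callers must be ready for a failed read
  • Caching: remote data should be cached locally when possible, so repeated reads avoid the network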
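
One practical consequence of scope is ordering: local stores are cheap to consult, so it makes sense to try them before any remote store. The sketch below illustrates that idea in plain Erlang. The module name, the function local_first/1, and the {Name, Scope} pair representation are assumptions made for illustration; they are not part of HyperBEAM.

```erlang
-module(scope_order_sketch).
-export([local_first/1]).

%% Hypothetical helper, not a HyperBEAM API.
%% Stores is a list of {Name, Scope} pairs, where Scope is local or remote.
%% Returns the same stores with all local ones first. lists:partition/2
%% keeps the relative order inside each group.
local_first(Stores) ->
    {Local, Remote} =
        lists:partition(fun({_Name, Scope}) -> Scope =:= local end, Stores),
    Local ++ Remote.
```

For example, scope_order_sketch:local_first([{gateway, remote}, {fs, local}]) puts the fs entry ahead of the gateway entry.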

Part 2: Gateway Store

📖 Reference: hb_store_gateway

The gateway store reads data from Arweave gateways. It's read-only—you can fetch permanent data from Arweave, but writes go through different mechanisms (bundlers).

Basic Configuration

%% Minimal configuration - uses default gateway
Store = #{<<"store-module">> => hb_store_gateway}.

Reading Data

The gateway store only handles keys that are valid Arweave transaction IDs (43-character base64url strings):

Store = #{<<"store-module">> => hb_store_gateway},
 
%% Read a message by ID
ID = <<"BOogk_XAI3bvNWnxNxwxmvOfglZt17o4MOVAdPNZ_ew">>,
{ok, Message} = hb_store_gateway:read(Store, ID).
 
%% Message is now a map you can work with
AppName = maps:get(<<"app-name">>, Message).

Subpath Access

You can read nested values directly without fetching the entire message:

%% Message structure:
%% #{
%%   <<"user">> => #{
%%     <<"name">> => <<"Alice">>,
%%     <<"email">> => <<"alice@example.com">>
%%   }
%% }
 
%% Read nested value with path
{ok, <<"Alice">>} = hb_store_gateway:read(Store, [ID, <<"user">>, <<"name">>]).
 
%% Path not found returns not_found
not_found = hb_store_gateway:read(Store, [ID, <<"nonexistent">>]).

Checking Message Structure

%% Check if message has nested data
case hb_store_gateway:type(Store, ID) of
    simple ->
        %% Flat key-value pairs only
        io:format("Simple message~n");
    composite ->
        %% Contains nested structures
        {ok, Keys} = hb_store_gateway:list(Store, ID),
        io:format("Composite with keys: ~p~n", [Keys]);
    not_found ->
        io:format("Message not found~n")
end.
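
To make the simple/composite distinction concrete, here is a minimal sketch that classifies an in-memory map the same way the comments above describe: flat key-value pairs are simple, and any nested map makes the message composite. The module and the function classify/1 are hypothetical, and the real type/2 may apply additional rules.

```erlang
-module(msg_type_sketch).
-export([classify/1]).

%% Hypothetical helper, not a HyperBEAM API.
%% A message counts as composite if at least one of its values is a map,
%% and as simple otherwise.
classify(Msg) when is_map(Msg) ->
    case lists:any(fun erlang:is_map/1, maps:values(Msg)) of
        true -> composite;
        false -> simple
    end.
```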

ID Recognition

The gateway store only processes valid Arweave IDs. Any other key returns not_found:

%% Valid: 43-character base64url
<<"IYkkrqlZNW_J-4T-5eFApZOMRl5P4VjvrcOXWvIqB1Q">> %% => fetched
 
%% Invalid: wrong format or length
<<"shortkey">> %% => not_found
<<"not-a-valid-transaction-id">> %% => not_found
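
If you want to pre-check a key before handing it to the gateway store, the rule above (43 characters, base64url alphabet) is easy to express in plain Erlang. This is a sketch under that stated rule only: is_arweave_id/1 and its module are hypothetical names, and the check HyperBEAM performs internally may be looser or stricter.

```erlang
-module(arweave_id_sketch).
-export([is_arweave_id/1]).

%% Hypothetical helper, not a HyperBEAM API.
%% Returns true when Key is a binary of exactly 43 base64url characters.
is_arweave_id(Key) when is_binary(Key), byte_size(Key) =:= 43 ->
    lists:all(fun is_base64url_char/1, binary_to_list(Key));
is_arweave_id(_) ->
    false.

%% The base64url alphabet: A-Z, a-z, 0-9, '-' and '_'.
is_base64url_char(C) when C >= $A, C =< $Z -> true;
is_base64url_char(C) when C >= $a, C =< $z -> true;
is_base64url_char(C) when C >= $0, C =< $9 -> true;
is_base64url_char($-) -> true;
is_base64url_char($_) -> true;
is_base64url_char(_) -> false.
```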

Quick Reference: Gateway Store Functions

Function             What it does
scope(Opts)          Returns remote
read(Opts, Key)      Fetch message by ID
type(Opts, Key)      Check if simple or composite
list(Opts, Key)      List keys in composite message
resolve(Opts, Key)   Returns key unchanged (no-op)

Part 3: Remote Node Store

📖 Reference: hb_store_remote_node

The remote node store reads data from other HyperBEAM nodes via HTTP. Unlike gateway store, it can also perform writes (with proper authorization).

Basic Configuration

%% Connect to a remote HyperBEAM node
RemoteStore = #{
    <<"store-module">> => hb_store_remote_node,
    <<"node">> => <<"http://ao-node.example.com:8421">>
}.

Reading Data

RemoteStore = #{
    <<"store-module">> => hb_store_remote_node,
    <<"node">> => <<"http://localhost:8421">>
},
 
%% Read from remote node
{ok, Message} = hb_store_remote_node:read(RemoteStore, ID),
 
%% Messages may have unloaded parts - ensure everything is loaded
LoadedMsg = hb_cache:ensure_all_loaded(Message).

Writing Data (Authorized)

Write operations require authentication:

Wallet = hb:wallet(),
RemoteStore = #{
    <<"store-module">> => hb_store_remote_node,
    <<"node">> => <<"http://ao-node.example.com:8421">>,
    <<"wallet">> => Wallet
},
 
%% Write to remote node (requires server-side authorization)
ok = hb_store_remote_node:write(RemoteStore, Key, Value).

Creating Links

Links create aliases to existing data:

%% Create link: "my-alias" points to SourceID
ok = hb_store_remote_node:make_link(
    RemoteStore,
    SourceID,          % Existing key
    <<"my-alias">>     % New alias
).
 
%% Now readable via alias
{ok, Message} = hb_store_remote_node:read(RemoteStore, <<"my-alias">>).

HTTP Endpoints

Remote node store uses the cache device API:

Endpoint                       Method   Purpose
/~cache@1.0/read?target=Key    GET      Read data
/~cache@1.0/write              POST     Write data
/~cache@1.0/link               POST     Create link
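
As an illustration of the read endpoint, the sketch below builds the URL a GET request would target for a given node and key. It assumes the key is already URL-safe (as a base64url ID is) and does no escaping. The module and read_url/2 are hypothetical names, and the actual client code in HyperBEAM may construct the request differently.

```erlang
-module(cache_url_sketch).
-export([read_url/2]).

%% Hypothetical helper, not a HyperBEAM API.
%% Joins the node base URL, the cache read path from the endpoint list,
%% and the key as the target query parameter. Assumes Key needs no
%% URL escaping.
read_url(Node, Key) when is_binary(Node), is_binary(Key) ->
    <<Node/binary, "/~cache@1.0/read?target=", Key/binary>>.
```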

Quick Reference: Remote Node Store Functions

Function                     What it does
scope(Opts)                  Returns remote
read(Opts, Key)              Fetch from remote node
write(Opts, Key, Value)      Write to remote node
make_link(Opts, Src, Dst)    Create alias on remote
type(Opts, Key)              Check existence (simple only)
resolve(Opts, Key)           Returns key unchanged

Part 4: Local Caching

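Both remote stores accept a local-store option. When it is set, data fetched over the network is also written to that local store, so repeated reads can be served locally.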

Cache Configuration

%% Define local cache store
LocalCache = #{
    <<"store-module">> => hb_store_fs,
    <<"name">> => <<"gateway-cache">>
},
 
%% Gateway store with caching (local stores given as a list here)
GatewayStore = #{
    <<"store-module">> => hb_store_gateway,
    <<"local-store">> => [LocalCache]
},
 
%% Remote node store with caching (a single store map here, as in the tests in Part 7)
RemoteStore = #{
    <<"store-module">> => hb_store_remote_node,
    <<"node">> => <<"http://ao.computer:8421">>,
    <<"local-store">> => LocalCache
}.

Cache Behavior

First Read:
1. Check local cache -- miss
2. Fetch from remote source -- success
3. Write to local cache
4. Return data
 
Subsequent Reads:
1. Check local cache -- hit
2. Return data (fast!)
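
The steps above are the classic read-through cache. Here is a minimal, self-contained sketch of that flow, using a plain map in place of the local store and a fun in place of the remote fetch. Everything in it (the module, read/3, the {Result, NewCache} return shape) is an assumption made for illustration, not how HyperBEAM implements caching internally.

```erlang
-module(read_through_sketch).
-export([read/3]).

%% Hypothetical helper, not a HyperBEAM API.
%% Cache is a map standing in for the local store.
%% Fetch is a fun(Key) -> {ok, Value} | not_found standing in for the
%% remote read. Returns {Result, NewCache}.
read(Key, Cache, Fetch) ->
    case maps:find(Key, Cache) of
        {ok, Cached} ->
            %% Cache hit: return without touching the remote source.
            {{ok, Cached}, Cache};
        error ->
            %% Cache miss: fetch remotely and store the result on success.
            case Fetch(Key) of
                {ok, Fetched} ->
                    {{ok, Fetched}, Cache#{Key => Fetched}};
                not_found ->
                    {not_found, Cache}
            end
    end.
```

Calling read/3 twice with the cache returned by the first call shows the behavior: the second call is answered from the map and never invokes Fetch.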

Manual Caching

With the remote node store, you can also cache data manually and register extra link names (aliases) for it:

StoreOpts = #{<<"local-store">> => LocalCache},
Data = #{<<"content">> => <<"valuable data">>},
Links = [<<"alias1">>, <<"alias2">>],
 
%% Cache data with multiple access points
ok = hb_store_remote_node:maybe_cache(StoreOpts, Data, Links).

Part 5: Node Types (Gateway Store)

The gateway store can talk to different kinds of backend node. The node-type option selects which endpoint layout is used.

Arweave Gateway

#{
    <<"store-module">> => hb_store_gateway,
    <<"node">> => <<"https://arweave.net">>,
    <<"node-type">> => <<"arweave">>
}
 
%% Uses endpoints:
%% - GraphQL: https://arweave.net/graphql
%% - Raw: https://arweave.net/raw/id

AO Node

#{
    <<"store-module">> => hb_store_gateway,
    <<"node">> => <<"https://ao.computer">>,
    <<"node-type">> => <<"ao">>
}
 
%% Uses endpoints:
%% - GraphQL: https://ao.computer/~query@1.0/graphql
%% - Raw: https://ao.computer/id
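
The two endpoint layouts listed in the comments above can be summarized as a small lookup. The sketch below simply encodes those comments. The module and endpoints/3 are hypothetical, and the function only mirrors the URLs shown in this guide rather than the gateway store's actual request code.

```erlang
-module(node_type_sketch).
-export([endpoints/3]).

%% Hypothetical helper, not a HyperBEAM API.
%% Given a node-type, a node base URL, and an ID, returns the GraphQL
%% and raw-data URLs as listed in this guide.
endpoints(<<"arweave">>, Node, ID) ->
    #{graphql => <<Node/binary, "/graphql">>,
      raw => <<Node/binary, "/raw/", ID/binary>>};
endpoints(<<"ao">>, Node, ID) ->
    #{graphql => <<Node/binary, "/~query@1.0/graphql">>,
      raw => <<Node/binary, "/", ID/binary>>}.
```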

Part 6: Multi-Tier Storage

Combine local and remote stores for optimal performance and resilience.

Store Chain Pattern

%% Try stores in order: memory -> filesystem -> gateway
Stores = [
    #{<<"store-module">> => hb_store_memory},
    #{<<"store-module">> => hb_store_fs, <<"name">> => <<"cache">>},
    #{<<"store-module">> => hb_store_gateway}
],
 
%% hb_cache tries each store until data is found
{ok, Message} = hb_cache:read(ID, #{store => Stores}).
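
The "try each store until data is found" idea can be sketched without any HyperBEAM code at all. In the example below each store is represented by a fun that takes a key and returns {ok, Value} or not_found. The module and first_hit/2 are hypothetical names for illustration, and the real chain logic may handle errors and scopes in ways this sketch does not.

```erlang
-module(store_chain_sketch).
-export([first_hit/2]).

%% Hypothetical helper, not a HyperBEAM API.
%% Readers is a list of funs, each fun(Key) -> {ok, Value} | not_found.
%% Tries them in order and returns the first hit, or not_found if every
%% reader misses.
first_hit(_Key, []) ->
    not_found;
first_hit(Key, [Reader | Rest]) ->
    case Reader(Key) of
        {ok, Value} -> {ok, Value};
        _Miss -> first_hit(Key, Rest)
    end.
```

Putting the fastest reader first means a hit there short-circuits the slower ones, which is why the chain above is ordered memory, then filesystem, then gateway.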

Fallback Pattern

%% Primary: Remote node, Fallback: Gateway
case hb_store_remote_node:read(RemoteStore, ID) of
    {ok, Msg} ->
        {ok, Msg};
    _Miss ->
        %% not_found or any error result: fall back to the gateway
        hb_store_gateway:read(GatewayStore, ID)
end.

Redundant Gateways

%% Try multiple gateways for resilience
Stores = [
    #{
        <<"store-module">> => hb_store_gateway, 
        <<"node">> => <<"https://primary.arweave.net">>
    },
    #{
        <<"store-module">> => hb_store_gateway, 
        <<"node">> => <<"https://backup.arweave.net">>
    }
],
{ok, Message} = hb_store:read(Stores, ID).

Part 7: Complete Test

Create src/test/test_hb7.erl:

-module(test_hb7).
-include_lib("eunit/include/eunit.hrl").
-include("include/hb.hrl").
 
%% Run with: rebar3 eunit --module=test_hb7
 
scope_test() ->
    %% Gateway store is always remote scope
    GatewayStore = #{<<"store-module">> => hb_store_gateway},
    ?assertEqual(remote, hb_store_gateway:scope(GatewayStore)),
    ?debugFmt("Gateway store scope: remote", []),
    
    %% Remote node store is always remote scope
    RemoteStore = #{
        <<"store-module">> => hb_store_remote_node,
        <<"node">> => <<"http://localhost:8421">>
    },
    ?assertEqual(remote, hb_store_remote_node:scope(RemoteStore)),
    ?debugFmt("Remote node store scope: remote", []).
 
gateway_read_test() ->
    Store = #{<<"store-module">> => hb_store_gateway},
    
    %% Known Arweave transaction ID (aos module)
    ID = <<"BOogk_XAI3bvNWnxNxwxmvOfglZt17o4MOVAdPNZ_ew">>,
    
    try hb_store_gateway:read(Store, ID) of
        {ok, Message} ->
            ?assert(is_map(Message)),
            ?debugFmt("Gateway read success: ~p keys", [maps:size(Message)]);
        not_found ->
            ?debugFmt("Gateway read: not_found (ID may not exist)", [])
    catch
        _:_ ->
            ?debugFmt("Gateway read: network unavailable (skipped)", [])
    end.
 
gateway_invalid_id_test() ->
    Store = #{<<"store-module">> => hb_store_gateway},
    
    %% Short key - not a valid ID (should return not_found without network call).
    %% Note: Testing non-existent valid-format IDs requires network access
    %% and is covered by gateway_read_test when network is available.
    ?assertEqual(not_found, hb_store_gateway:read(Store, <<"shortkey">>)),
    ?debugFmt("Invalid ID rejected: OK", []).
 
remote_node_basic_test() ->
    %% Setup local store for testing
    LocalStore = #{
        <<"store-module">> => hb_store_fs,
        <<"name">> => <<"test-remote-basic">>
    },
    hb_store:reset(LocalStore),
    ?debugFmt("Local store reset: OK", []),
    
    %% Create test message
    TestData = #{<<"test-key">> => <<"test-value-", (integer_to_binary(rand:uniform(10000)))/binary>>},
    ID = hb_message:id(TestData),
    ?debugFmt("Test message ID: ~s", [hb_util:encode(ID)]),
    
    %% Write to local store
    {ok, ID} = hb_cache:write(TestData, #{store => LocalStore}),
    ?debugFmt("Wrote message to local store", []),
    
    %% Start HTTP server
    Node = hb_http_server:start_node(#{store => LocalStore}),
    ?debugFmt("Started HTTP server at: ~s", [Node]),
    
    %% Configure remote store
    RemoteStore = #{
        <<"store-module">> => hb_store_remote_node,
        <<"node">> => Node
    },
    
    %% Read via remote store
    {ok, Retrieved} = hb_store_remote_node:read(RemoteStore, ID),
    Loaded = hb_cache:ensure_all_loaded(Retrieved),
    ?assert(is_map(Loaded)),
    ?debugFmt("Remote read success", []),
    
    %% Verify content matches
    ?assertEqual(maps:get(<<"test-key">>, TestData), maps:get(<<"test-key">>, Loaded)),
    ?debugFmt("Content verified: OK", []).
 
remote_node_not_found_test() ->
    LocalStore = #{
        <<"store-module">> => hb_store_fs,
        <<"name">> => <<"test-remote-notfound">>
    },
    hb_store:reset(LocalStore),
    
    Node = hb_http_server:start_node(#{store => LocalStore}),
    RemoteStore = #{
        <<"store-module">> => hb_store_remote_node,
        <<"node">> => Node
    },
    
    %% Non-existent key
    ?assertEqual(not_found, hb_store_remote_node:read(RemoteStore, <<"nonexistent">>)),
    ?debugFmt("Remote not_found: OK", []).
 
local_caching_test() ->
    %% Setup stores
    LocalStore = #{
        <<"store-module">> => hb_store_fs,
        <<"name">> => <<"test-caching-remote">>
    },
    CacheStore = #{
        <<"store-module">> => hb_store_fs,
        <<"name">> => <<"test-caching-local">>
    },
    hb_store:reset(LocalStore),
    hb_store:reset(CacheStore),
    ?debugFmt("Stores reset: OK", []),
    
    %% Create and store test data
    TestData = #{<<"cached-key">> => <<"cached-value">>},
    ID = hb_message:id(TestData),
    {ok, ID} = hb_cache:write(TestData, #{store => LocalStore}),
    ?debugFmt("Test data written", []),
    
    %% Start server and configure remote store with caching
    Node = hb_http_server:start_node(#{store => LocalStore}),
    RemoteStoreWithCache = #{
        <<"store-module">> => hb_store_remote_node,
        <<"node">> => Node,
        <<"local-store">> => CacheStore
    },
    
    %% First read - fetches from remote, caches locally
    {ok, _Msg1} = hb_store_remote_node:read(RemoteStoreWithCache, ID),
    ?debugFmt("First read (from remote): OK", []),
    
    %% Verify data was cached locally
    {ok, CachedMsg} = hb_cache:read(ID, #{store => CacheStore}),
    LoadedCached = hb_cache:ensure_all_loaded(CachedMsg),
    ?assertEqual(<<"cached-value">>, maps:get(<<"cached-key">>, LoadedCached)),
    ?debugFmt("Data cached locally: OK", []).
 
maybe_cache_test() ->
    LocalStore = #{
        <<"store-module">> => hb_store_fs,
        <<"name">> => <<"test-maybe-cache">>
    },
    hb_store:reset(LocalStore),
    
    StoreOpts = #{<<"local-store">> => LocalStore},
    Data = #{<<"manual-cache-key">> => <<"manual-cache-value">>},
    ID = hb_message:id(Data),
    
    %% Manually cache data
    Result = hb_store_remote_node:maybe_cache(StoreOpts, Data),
    ?assertEqual(ok, Result),
    ?debugFmt("maybe_cache succeeded", []),
    
    %% Verify data was cached
    {ok, Cached} = hb_cache:read(ID, #{store => LocalStore}),
    LoadedCached = hb_cache:ensure_all_loaded(Cached),
    ?assertEqual(<<"manual-cache-value">>, maps:get(<<"manual-cache-key">>, LoadedCached)),
    ?debugFmt("Manual cache verified: OK", []).
 
maybe_cache_with_links_test() ->
    LocalStore = #{
        <<"store-module">> => hb_store_fs,
        <<"name">> => <<"test-cache-links">>
    },
    hb_store:reset(LocalStore),
    
    StoreOpts = #{<<"local-store">> => LocalStore},
    Data = #{<<"linked-data">> => <<"linked-value">>},
    ID = hb_message:id(Data),
    Links = [<<"link1">>, <<"link2">>, ID],
    
    %% Cache with links
    Result = hb_store_remote_node:maybe_cache(StoreOpts, Data, Links),
    ?assertEqual(ok, Result),
    ?debugFmt("maybe_cache with links succeeded", []),
    
    %% Verify data is readable by ID
    {ok, Cached} = hb_cache:read(ID, #{store => LocalStore}),
    ?assert(is_map(Cached)),
    ?debugFmt("Cache with links verified: OK", []).
 
multi_tier_storage_test() ->
    ?debugFmt("=== Multi-Tier Storage Test ===", []),
    
    %% Setup three tiers
    MemoryStore = #{<<"store-module">> => hb_store_memory},
    FsStore = #{
        <<"store-module">> => hb_store_fs,
        <<"name">> => <<"test-multi-tier">>
    },
    hb_store:reset(FsStore),
    
    %% Write to filesystem tier
    TestData = #{<<"tier-key">> => <<"tier-value">>},
    ID = hb_message:id(TestData),
    {ok, ID} = hb_cache:write(TestData, #{store => FsStore}),
    ?debugFmt("1. Wrote to filesystem tier", []),
    
    %% Read through multi-tier
    Stores = [MemoryStore, FsStore],
    {ok, Retrieved} = hb_cache:read(ID, #{store => Stores}),
    Loaded = hb_cache:ensure_all_loaded(Retrieved),
    ?assertEqual(<<"tier-value">>, maps:get(<<"tier-key">>, Loaded)),
    ?debugFmt("2. Multi-tier read: OK", []),
    
    %% Not found case
    NotFoundResult = hb_cache:read(<<"nonexistent-id">>, #{store => Stores}),
    ?assertEqual(not_found, NotFoundResult),
    ?debugFmt("3. Multi-tier not_found: OK", []),
    
    ?debugFmt("=== Multi-Tier Storage Test Complete ===", []).
 
complete_workflow_test() ->
    ?debugFmt("=== Complete Remote Storage Workflow ===", []),
    
    %% 1. Setup infrastructure
    LocalStore = #{
        <<"store-module">> => hb_store_fs,
        <<"name">> => <<"test-workflow-source">>
    },
    CacheStore = #{
        <<"store-module">> => hb_store_fs,
        <<"name">> => <<"test-workflow-cache">>
    },
    hb_store:reset(LocalStore),
    hb_store:reset(CacheStore),
    ?debugFmt("1. Infrastructure setup complete", []),
    
    %% 2. Create original data
    OriginalData = #{
        <<"type">> => <<"test-message">>,
        <<"content">> => <<"Hello from remote storage!">>,
        <<"timestamp">> => erlang:system_time(millisecond)
    },
    ID = hb_message:id(OriginalData),
    {ok, ID} = hb_cache:write(OriginalData, #{store => LocalStore}),
    ?debugFmt("2. Original data stored, ID: ~s", [hb_util:encode(ID)]),
    
    %% 3. Start remote node
    Node = hb_http_server:start_node(#{store => LocalStore}),
    ?debugFmt("3. Remote node started at: ~s", [Node]),
    
    %% 4. Configure remote store with caching
    RemoteStore = #{
        <<"store-module">> => hb_store_remote_node,
        <<"node">> => Node,
        <<"local-store">> => CacheStore
    },
    ?debugFmt("4. Remote store configured with local cache", []),
    
    %% 5. Read via remote (triggers caching)
    {ok, RemoteMsg} = hb_store_remote_node:read(RemoteStore, ID),
    LoadedRemote = hb_cache:ensure_all_loaded(RemoteMsg),
    ?assertEqual(<<"Hello from remote storage!">>, maps:get(<<"content">>, LoadedRemote)),
    ?debugFmt("5. Remote read successful", []),
    
    %% 6. Verify local cache
    {ok, CachedMsg} = hb_cache:read(ID, #{store => CacheStore}),
    LoadedCached = hb_cache:ensure_all_loaded(CachedMsg),
    ?assertEqual(<<"Hello from remote storage!">>, maps:get(<<"content">>, LoadedCached)),
    ?debugFmt("6. Local cache verified", []),
    
    %% 7. Verify scope
    ?assertEqual(remote, hb_store_remote_node:scope(RemoteStore)),
    ?debugFmt("7. Scope is remote: OK", []),
    
    ?debugFmt("=== Complete Workflow Test Passed! ===", []).

Run the Tests

rebar3 eunit --module=test_hb7

Common Patterns

Pattern 1: Read with Cache-Through

LocalCache = #{
    <<"store-module">> => hb_store_lmdb,
    <<"name">> => <<"cache">>
},
GatewayStore = #{
    <<"store-module">> => hb_store_gateway,
    <<"local-store">> => [LocalCache]
},
 
%% First read: slow (network)
%% Subsequent reads: fast (local)
{ok, Msg} = hb_cache:read(ID, #{store => [GatewayStore]}).

Pattern 2: Check Before Fetch

Store = #{<<"store-module">> => hb_store_gateway},
 
case hb_store_gateway:type(Store, ID) of
    simple -> 
        {ok, Data} = hb_store_gateway:read(Store, ID),
        process_simple(Data);
    composite ->
        {ok, Keys} = hb_store_gateway:list(Store, ID),
        process_composite(Keys);
    not_found ->
        handle_missing()
end.

Pattern 3: Authenticated Remote Writes

Wallet = hb:wallet(),
RemoteStore = #{
    <<"store-module">> => hb_store_remote_node,
    <<"node">> => <<"http://trusted-node:8421">>,
    <<"wallet">> => Wallet
},
 
%% Write with authentication
ok = hb_store_remote_node:write(RemoteStore, <<"key">>, <<"value">>),
 
%% Create alias
ok = hb_store_remote_node:make_link(RemoteStore, <<"key">>, <<"alias">>).

Pattern 4: Error Handling

case hb_store_remote_node:read(Opts, Key) of
    {ok, Msg} ->
        process(hb_cache:ensure_all_loaded(Msg));
    not_found ->
        %% Key doesn't exist, or the request failed and was reported as a miss
        handle_not_found();
    {error, Reason} ->
        %% Explicit HTTP or connection error, if the store surfaces one
        log_error(Reason),
        try_fallback()
end.

What's Next?

You now understand remote storage in HyperBEAM:

Concept         Module                  Key Functions
Gateway Store   hb_store_gateway        read, type, list
Remote Node     hb_store_remote_node    read, write, make_link
Local Caching   local-store option      Automatic on read
Store Chains    hb_store                Multi-tier fallback

Going Further

  1. Local Storage — Learn about hb_store_fs, hb_store_lmdb, hb_store_rocksdb
  2. Cache System — Explore hb_cache for content-addressed storage
  3. HTTP Client — Understand hb_http and hb_gateway_client

Quick Reference Card

📖 Reference: hb_store_gateway | hb_store_remote_node

%% === GATEWAY STORE ===
GatewayStore = #{<<"store-module">> => hb_store_gateway}.
{ok, Msg} = hb_store_gateway:read(GatewayStore, ID).
{ok, Keys} = hb_store_gateway:list(GatewayStore, ID).
Type = hb_store_gateway:type(GatewayStore, ID).
 
%% === REMOTE NODE STORE ===
RemoteStore = #{
    <<"store-module">> => hb_store_remote_node,
    <<"node">> => <<"http://node:8421">>
}.
{ok, Msg} = hb_store_remote_node:read(RemoteStore, ID).
ok = hb_store_remote_node:write(RemoteStore, Key, Val).
ok = hb_store_remote_node:make_link(RemoteStore, Src, Dst).
 
%% === WITH LOCAL CACHING ===
LocalCache = #{
    <<"store-module">> => hb_store_fs,
    <<"name">> => <<"cache">>
}.
CachedGateway = #{
    <<"store-module">> => hb_store_gateway,
    <<"local-store">> => [LocalCache]
}.
 
%% === MULTI-TIER ===
Stores = [
    #{<<"store-module">> => hb_store_memory},
    #{<<"store-module">> => hb_store_lmdb, <<"name">> => <<"cache">>},
    #{<<"store-module">> => hb_store_gateway}
].
{ok, Msg} = hb_cache:read(ID, #{store => Stores}).
 
%% === NODE TYPES ===
%% Arweave gateway
#{<<"node">> => <<"https://arweave.net">>, <<"node-type">> => <<"arweave">>}.
%% AO node
#{<<"node">> => <<"https://ao.computer">>, <<"node-type">> => <<"ao">>}.

Resources

HyperBEAM Documentation

Related Modules