Adds an alternative HTTPDoer backed by valyala/fasthttp for high-throughput
bots. Cuts per-call allocs from 102 to 56 in the cross-library bench
(within 8 of telego, which uses fasthttp by default), and per-call bytes
from 11.1 KiB to 6.6 KiB.
bot := client.New(token,
client.WithHTTPClient(client.NewFastHTTPDoer()),
)
Implementation notes:
- Wraps *fasthttp.Client behind the existing HTTPDoer (Do *http.Request)
interface, so RetryDoer, custom transports, observability middleware,
and the 1428 generated tests all keep working as-is.
- Translates *http.Request -> fasthttp.Request once per call and
returns a *http.Response whose Body releases the pooled fasthttp
response on Close (net/http contract).
- Recognises the bufferReadCloser / readerReadCloser shapes produced
by buildRequest and passes their underlying bytes straight to
SetBodyRaw -- no io.ReadAll, no copy.
- Honours ctx.Deadline via DoDeadline, falls back to WithFastHTTPReadTimeout
when no deadline is set. fasthttp.ErrTimeout maps to
context.DeadlineExceeded for errors.Is compatibility.
Default stays net/http: fasthttp is HTTP/1.1 only, doesn't compose with
the http.RoundTripper middleware ecosystem, and most users don't have
the throughput to notice. Bots making thousands of API calls/sec should
opt in.
Multipart/file-upload path remains on net/http per the agreed scope --
the perf bottleneck was JSON-method round-trip, not file uploads.
Time numbers in the report deferred until a quiet-system bench run;
allocs/bytes numbers (which are deterministic per code path) are
already updated.
Two changes that together cut allocs/call from 15 to 13 (client-internal
bench) and per-call CPU from 600ns to 455ns (-24%) on the no-HTTP path:
1. Codec gets an optional BodyEncoder extension (MarshalTo io.Writer).
When present, encodeJSONBody stream-encodes the request directly into
a pooled *bytes.Buffer instead of allocating a [2-step] Marshal+Reader
pair. DefaultCodec implements it via goccy/go-json.NewEncoder.
2. *Bot caches the parsed base URL on construction. buildRequest skips
net/http.NewRequestWithContext for the common case and constructs
*http.Request manually — clones the URL by value, sets the method
path, and populates ContentLength + GetBody from the body's concrete
type so RetryDoer's body-replay across attempts still works.
Cross-library bench (sendMessage round-trip vs httptest.Server): -2
allocs/call (104 -> 102), bytes -1.2%, time within noise (real HTTP
plumbing dominates). The CPU win is visible on synthetic stub-doer
benches and translates to lower GC pressure on sustained-throughput
workloads.
Slow-path fallback preserved for codecs that don't implement BodyEncoder
and for *Bot instances where url.Parse on the configured base failed —
they take the original NewRequestWithContext path.
Adds test/benchmarks/ as a separate Go module so competitor deps
(go-telegram-bot-api/v5, telebot.v3, go-telegram/bot, telego,
echotron/v3) stay out of the root go.mod.
Hot paths covered:
- Webhook decode (small Update -> typed Update struct)
- Large unmarshal (Update with entities + reply markup + photo array)
- API round-trip (sendMessage against httptest.Server)
- Dispatch route (20 handlers, last-registered matches)
Results on Apple M4 Max / go1.26.2: ours wins 3 of 4 paths and is
2nd of 5 in the round-trip path. Full report at
docs/benchmarks/2026-05-10-comparison.md, raw output committed under
test/benchmarks/results/.
Caveats called out in the report:
- codec asymmetry (we ship goccy/go-json; competitors mostly stdlib)
- echotron call bench skipped — built-in rate limiter not externally
configurable; would measure throttling, not the library
- dispatch bench limited to libs with a public sync entry point
(ours, telebot, gobot); gotba has no dispatcher, telego/echotron
use channel/per-chat paradigms not directly comparable
Also gitignores docs/superpowers/ (local brainstorm/spec scratch)
and regenerates docs/reference/dispatch.md after the new
Router.Process method.