const engineer = {
  name: "Danilo Souza",
  role: "Senior Software Engineer",
  focus: "Backend Architecture, AI Systems & Distributed Computing",
  location: "Brazil 🇧🇷 · Remote-first",
  what_i_build: [
    "Decentralized AI Training Protocols",
    "High-Throughput AI Orchestration Platforms",
    "High-Performance Job Queue Systems",
    "Autonomous AI Agents & SaaS Infrastructure",
  ],
  core_stack: {
    languages: ["TypeScript", "Python", "Rust", "Go"],
    backend: ["Node.js", "Fastify", "NestJS", "FastAPI"],
    ai_ml: ["PyTorch", "LLMs", "XState", "BullMQ"],
    databases: ["PostgreSQL", "Redis", "MongoDB"],
    infra: ["Docker", "Prometheus", "Grafana", "GitHub Actions"],
  },
  engineering_philosophy: [
    "Understand systems from the ground up, not just the abstractions.",
    "Prioritize performance, scalability, and security from day one.",
    "Write pragmatic, maintainable code that solves real problems.",
    "Constant learning through building complex architectures.",
  ],
} as const;

Focus: AI Orchestration · Distributed Systems · Real-Time Streaming · Production Observability
The Vision: A high-performance AI orchestration platform designed to manage complex multi-agent pipelines with full observability, real-time streaming, and fault recovery — built for production environments.
Key Architecture:
- State Machine Orchestration (XState): Deterministic execution flows with explicit state transitions and advanced error handling, eliminating unpredictable agent behavior.
- Decoupled Architecture: API layer fully separated from execution workers, ensuring system stability under heavy load.
- Real-Time Mission Streaming: Live mission tracking via Server-Sent Events (SSE), delivering immediate `task.*` and `step.*` event feedback to the UI.
- Industrial Observability: End-to-end telemetry pipeline with Prometheus and Grafana, monitoring worker health, queue status, and cost/latency metrics.
- High-Concurrency Processing: Distributed execution powered by BullMQ + Redis for massive scalability, with full state persistence via Prisma + PostgreSQL.
- Smart Context Injection: Automatic skill discovery and context enrichment injected directly into agent prompts.
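
The deterministic-flow and streaming ideas above can be sketched in a few lines. This is a plain TypeScript illustration, not the platform's XState machine: the state names, the event names, and the helpers `transition` and `toSseFrame` are assumptions made for the example. Only the `step.*` event prefix and the SSE transport come from the description above.

```typescript
// Hypothetical step states and events; the real machine definition is not shown here.
type StepState = "queued" | "running" | "succeeded" | "failed";
type StepEvent = "START" | "COMPLETE" | "FAIL" | "RETRY";

// Explicit transition table: any (state, event) pair not listed is rejected.
const transitions: Record<StepState, Partial<Record<StepEvent, StepState>>> = {
  queued: { START: "running" },
  running: { COMPLETE: "succeeded", FAIL: "failed" },
  failed: { RETRY: "queued" },
  succeeded: {},
};

function transition(state: StepState, stepEvent: StepEvent): StepState {
  const target = transitions[state][stepEvent];
  if (target === undefined) {
    throw new Error(`Illegal transition: ${stepEvent} in state ${state}`);
  }
  return target;
}

// Serialize one Server-Sent Events frame: an "event:" line, a "data:" line,
// and a blank line that terminates the frame.
function toSseFrame(eventName: string, payload: unknown): string {
  return `event: ${eventName}\ndata: ${JSON.stringify(payload)}\n\n`;
}

const nextState = transition("queued", "START");
const sseFrame = toSseFrame(`step.${nextState}`, { stepId: "step-1", state: nextState });
console.log(sseFrame);
```

Rejecting unlisted transitions is what makes the flow deterministic: an agent step cannot jump from `queued` to `succeeded` without passing through `running`.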
Stack: TypeScript · Fastify · Node.js · BullMQ · Redis · PostgreSQL · XState · Prometheus · Grafana · React
Focus: Distributed Systems · AI Optimization · P2P Networks · Privacy Engineering
The Vision: Democratizing AI development by enabling ordinary users to collectively fine-tune large language models on consumer GPUs — no expensive centralized cloud required.
Key Architecture:
- SCAO Integration: Powered by a custom 2nd-order optimizer (SCAO) that reduces network synchronization frequency by converging up to 2.4x faster than standard methods.
- P2P Network: High-performance Rust daemon (libp2p) for decentralized peer discovery and GossipSub gradient propagation.
- Privacy & Security: Rényi Differential Privacy (DP-SGD) + Krum aggregation to protect raw data and defend against Byzantine faults.
- Desktop UX: Tauri v2 application for seamless miner onboarding.
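
For readers unfamiliar with Krum, here is a minimal sketch of the selection rule as it is commonly described: each update is scored by its summed squared distance to its `n - f - 2` nearest peers, and the lowest-scoring update is chosen. It is written in TypeScript for illustration only (the project itself uses Python and Rust), and the function names `squaredDistance` and `krumSelect` are hypothetical.

```typescript
// Squared Euclidean distance between two equally sized gradient vectors.
function squaredDistance(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return sum;
}

// Krum selection: returns the index of the update whose n - f - 2 nearest
// neighbours are closest to it. The usual robustness guarantee assumes
// n >= 2f + 3; this sketch only enforces that at least one neighbour exists.
function krumSelect(updates: number[][], byzantineCount: number): number {
  const n = updates.length;
  const neighbours = n - byzantineCount - 2;
  if (neighbours < 1) {
    throw new Error("Krum needs at least f + 3 updates");
  }
  let bestIndex = 0;
  let bestScore = Infinity;
  for (let i = 0; i < n; i++) {
    const distances: number[] = [];
    for (let j = 0; j < n; j++) {
      if (j !== i) distances.push(squaredDistance(updates[i], updates[j]));
    }
    distances.sort((x, y) => x - y);
    const score = distances.slice(0, neighbours).reduce((acc, d) => acc + d, 0);
    if (score < bestScore) {
      bestScore = score;
      bestIndex = i;
    }
  }
  return bestIndex;
}

// Four honest updates clustered near (1, 1) and one outlier.
const peerUpdates = [[1, 1], [1.1, 0.9], [0.9, 1.1], [1, 1.05], [50, -50]];
const chosenIndex = krumSelect(peerUpdates, 1);
console.log(chosenIndex, peerUpdates[chosenIndex]);
```

Because the outlier is far from every other update, its score is dominated by large distances and it is never selected.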
Stack: Python · PyTorch · Rust (libp2p) · Tauri · TypeScript
(Private Repository — available for review upon request)
Focus: ML Algorithms · CUDA Engineering · Performance Optimization
The Vision: Solving the memory bottlenecks of 2nd-order optimizers for LLM training on consumer hardware — making advanced optimization accessible without requiring A100-class GPUs.
Key Architecture:
- Adaptive Rank Selection: Compresses curvature matrices retaining ≥95% spectral mass — 16–32x memory reduction vs. full-rank methods like Shampoo.
- Int8 EMA Quantization: Custom technique storing moving averages in int8, saving 4x memory without perplexity degradation.
- Custom CUDA Kernels: Fused CUDA kernels resolving complexity bottlenecks in projection implementations.
- Proven Results: 100% stability on 3B parameter model fine-tuning on a single 16GB GPU, outperforming AdamW baselines.
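
Two of the ideas above can be illustrated generically. The sketch below is not the SCAO implementation: it assumes "spectral mass" means the cumulative share of eigenvalue magnitude, and it uses ordinary symmetric absmax int8 quantization as a stand-in for the custom EMA technique. The function names `selectRank` and `quantizeInt8` are hypothetical, and TypeScript is used purely for illustration.

```typescript
// Smallest rank k whose top-k eigenvalue magnitudes reach the mass threshold.
function selectRank(eigenvalues: number[], massThreshold = 0.95): number {
  const sorted = eigenvalues.map((v) => Math.abs(v)).sort((a, b) => b - a);
  const total = sorted.reduce((acc, v) => acc + v, 0);
  if (total === 0) return 0;
  let cumulative = 0;
  for (let k = 0; k < sorted.length; k++) {
    cumulative += sorted[k];
    if (cumulative / total >= massThreshold) return k + 1;
  }
  return sorted.length;
}

// Symmetric absmax quantization: one float scale plus one int8 per value,
// which is where a 4x saving over float32 storage comes from.
function quantizeInt8(values: Float32Array): { q: Int8Array; scale: number } {
  let maxAbs = 0;
  for (let i = 0; i < values.length; i++) {
    maxAbs = Math.max(maxAbs, Math.abs(values[i]));
  }
  const scale = maxAbs === 0 ? 1 : maxAbs / 127;
  const q = new Int8Array(values.length);
  for (let i = 0; i < values.length; i++) {
    q[i] = Math.round(values[i] / scale);
  }
  return { q, scale };
}

const rank = selectRank([60, 25, 12, 2, 1]);
const ema = new Float32Array([0.5, -1, 0.25]);
const quantized = quantizeInt8(ema);
console.log(rank, quantized.q, quantized.scale);
```

With a fast-decaying spectrum, a handful of directions carries most of the mass, which is why a low rank can stand in for the full curvature matrix.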
Stack: Python · PyTorch · CUDA
Focus: Distributed Systems · Database Engineering · Fault Tolerance · Systems Design
The Vision: A high-performance, low-latency job orchestration system using PostgreSQL 16 as the sole broker — eliminating the operational complexity of managing Redis or RabbitMQ as an extra dependency.
Why PostgreSQL? In many production scenarios, introducing Redis just for queues creates an extra point of failure and backup complexity. This project demonstrates how to leverage powerful database primitives to achieve enterprise-grade queuing without the overhead:
- Guaranteed Transactionality: A job is only created if the business data is saved successfully — no dual-write inconsistencies.
- Atomic Concurrency: Multiple workers with zero collisions via `FOR UPDATE SKIP LOCKED`.
- Near-Zero Latency: Sub-millisecond reactivity via PostgreSQL `LISTEN/NOTIFY`.
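
A claim query built on `FOR UPDATE SKIP LOCKED` might look like the sketch below. The table name `jobs`, the `pending` and `running` status values, the `Queryable` interface, and the `claimNextJob` helper are assumptions for illustration; only `lease_expires_at` and `payload` are taken from this project's description. The project itself is Python, so treat the TypeScript wrapper as a stand-in for whatever driver is in use.

```typescript
// Hypothetical minimal client shape; adapt a real driver to it as needed.
interface Queryable {
  query(sql: string, params: unknown[]): Promise<{ rows: Array<Record<string, unknown>> }>;
}

// Claim one job atomically. Rows locked by another worker are skipped rather
// than waited on, so concurrent workers never receive the same job.
const CLAIM_JOB_SQL = `
  UPDATE jobs
     SET status = 'running',
         lease_expires_at = now() + make_interval(secs => $1)
   WHERE id = (
     SELECT id
       FROM jobs
      WHERE status = 'pending'
         OR (status = 'running' AND lease_expires_at < now())
      ORDER BY id
      LIMIT 1
      FOR UPDATE SKIP LOCKED
   )
  RETURNING id, payload`;

// Returns the claimed job row, or null when nothing is available.
async function claimNextJob(
  db: Queryable,
  leaseSeconds: number,
): Promise<Record<string, unknown> | null> {
  const { rows } = await db.query(CLAIM_JOB_SQL, [leaseSeconds]);
  return rows.length > 0 ? rows[0] : null;
}
```

Selecting and updating in a single statement keeps the claim inside one transaction, so a worker either owns the job with a fresh lease or gets nothing.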
Key Architecture:
- Lease Expiry System: If a worker crashes mid-processing, its lease lapses once the `lease_expires_at` timestamp passes, and another worker claims the job on the next fetch cycle, with zero manual intervention.
- Exponential Backoff with Jitter (±20%): Temporary failures (e.g., network timeouts) don't create retry storms. Randomized delay prevents thousands of failed jobs retrying simultaneously.
- Dead Letter Queue (DLQ): Persistently failing jobs move to `dead` status, preserving the full `payload` and `last_error` for manual audit and reprocessing via the dashboard.
- React Dashboard: Modern UI for monitoring queue health, worker status, and DLQ management.
- ADR Documentation: Architectural Decision Records detailing every technical trade-off made during design.
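
The backoff rule can be sketched as a small pure function. The base delay, the cap, and the name `retryDelayMs` are assumptions for illustration; only the exponential growth and the ±20% jitter come from the description above. The random source is injectable so the behaviour can be checked deterministically.

```typescript
// Delay before retry number `attempt` (1-based): exponential growth capped at
// maxMs, then scaled by a random factor in [1 - jitterRatio, 1 + jitterRatio].
// Jitter is applied after the cap, so the result can exceed maxMs by up to 20%.
function retryDelayMs(
  attempt: number,
  baseMs = 1000,
  maxMs = 60000,
  jitterRatio = 0.2,
  random: () => number = Math.random,
): number {
  const exponential = Math.min(maxMs, baseMs * Math.pow(2, attempt - 1));
  const jitterFactor = 1 + jitterRatio * (2 * random() - 1);
  return Math.round(exponential * jitterFactor);
}

// Print the randomized delay for the first five attempts.
for (let attempt = 1; attempt <= 5; attempt++) {
  console.log(attempt, retryDelayMs(attempt));
}
```

Without jitter, every job that failed during the same outage would retry at the same instant; spreading each delay across a ±20% window smooths that spike.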
Benchmarks:
| Metric | Result |
|---|---|
| Fetch Latency | < 3ms |
| Throughput | > 1,200 jobs/sec (Single RDS instance) |
| Concurrency | 50+ parallel workers — zero deadlocks |
Stack: Python · FastAPI · PostgreSQL 16 · React · Docker
Focus: AI Orchestration · SaaS Architecture · Full-Stack Development
The Vision: A platform for deploying and managing autonomous AI agents with real business impact — starting with an e-commerce market intelligence agent for the Etsy ecosystem.
Key Architecture:
- Market Intelligence Agent: Automated agent using Mistral Large 2411 to analyze the Etsy USA market, identifying high-demand, low-competition product opportunities.
- Platform Infrastructure: Scalable SaaS architecture supporting multi-tenant agent deployment and lifecycle management.
Stack: TypeScript · Node.js · Mistral LLM
I'm available for full-time roles, contract work, and technical collaborations — especially in AI infrastructure, distributed systems, and backend architecture.