Hello 👋

It's me, AJ Dahal

📄 Download Resume

Ajaya Dahal

About Me

Bridging Hardware Logic, Embedded Software & AI Systems

$> whoami
Sr. FPGA & Network System Engineer @ AMD
Specializing in High-Performance Computing (HPC) & Low-Latency Networking

I serve as the technical lead for AMD's Ethernet reference designs, ensuring data integrity across Versal ACAP and Zynq MPSoC platforms. My core expertise lies in bridging the gap between FPGA Logic (Verilog/RTL) and Embedded Software (Linux/Drivers).

I was formerly a Graduate Researcher in Software Defined Radio (SDR) and Autonomous Systems. Outside of FPGAs, I design production-grade software systems, including a 24,700-line AI content pipeline orchestrated by Temporal and a self-hosted AI infrastructure stack running 12 models across 15 Docker services on a single GPU. I enjoy solving problems where nanoseconds matter, and problems where 55 distributed activities need to compose reliably.

⚡ Low-Latency Specialization

✓ Sub-microsecond packet processing
✓ SerDes up to 58 Gbps/lane
✓ Zero-copy DMA architectures
✓ 400G/800G Ethernet (DCMAC)

📋 Quick Info

  • Name: Ajaya Dahal
  • Location: 📍 Austin, TX
  • Email: ajayadahal1000@gmail.com

💼 Open to Opportunities?

📄 Download My Resume

What People Say About Me

Recommendations from colleagues and mentors

Work Experience

Jan 2023 - Present

Sr. FPGA & Network System Engineer

Advanced Micro Devices, Inc - Remote - Based in Austin, TX

Core Engineering Focus:

• System Engineering: Technical Lead for 15+ Ethernet reference designs (10G – 800G), integrating Hard IP (DCMAC/MRMAC) with DMA subsystems.
• Physical Layer Tuning: Expert in debugging GTY/GTM transceivers, optimizing Eye Diagrams, and resolving signal integrity issues in 400G links.
• Full Stack Integration: Debugging from the RTL (Verilog) up to the OS (PetaLinux/Yocto) and Kernel Drivers.

Lab & Integration Focus

Directed FPGA and digital design labs that bridged reference design HDL, embedded Linux, and lab-based signal integrity tuning for every Ethernet IP from 10M to 800G.

Versal ACAP · Embedded Linux · Signal Integrity

Hardware & IP Ecosystem:

  • Versal ACAP: Deep architectural experience with VMK180, VPK120, VCK190. Configuring NoC (Network on Chip) and PL-to-PS AXI interfaces.
  • Zynq UltraScale+ & RFSoC: ZCU102, ZCU106, and ZCU111/ZCU208 RFSoC (handling high-speed ADCs/DACs).
  • Ethernet MAC IP:
    • 800G/400G: DCMAC (Dual Converter MAC) & MRMAC (Multirate MAC).
    • 100G/50G/40G: CMAC (UltraScale+ Hard IP) & Soft 100G cores.
    • 10G/25G: 10G/25G Ethernet Subsystem & XXV Ethernet.
    • Embedded: PS GEM (Gigabit Ethernet MAC) via EMIO/MIO.

Technical Publications & Knowledge Base:

Authored official AMD Answer Records (ARs) and Technical Blogs to resolve global customer blockers:

📘 MRMAC Tutorial Series

Complete step-by-step guide for building a Versal MRMAC 4x25G Baremetal Ethernet Design from scratch.

Versal ACAP · MRMAC · Tutorial

🚀 DCMAC Survival Guide

Comprehensive 100G DCMAC reference design for VPK120 with CAUI-4 interface configuration.

100G · DCMAC · VPK120

⚡ GTY/GTYP Transceiver Latency

Versal GTY/GTYP Transceivers: TX and RX Latency Values for sub-microsecond Ethernet systems.

Versal ACAP · SerDes Latency · HFT Critical · Team Contribution

📚 Additional Technical Publications
  • AR 000036641 10G/25G High Speed Ethernet Subsystem - Timing closure strategies
  • AR 000038505 MRMAC Linux Driver - Block Lock Issue resolution
  • AR 000037069 PS-GTR GEM Driver Patch - Auto-negotiation fixes
  • AR 000037257 MCDMA Device Tree - Critical device tree modifications for low-latency Ethernet designs

🔬 My Hardware Lab & Active Setups

Versal VCK190 Lab Setup

Versal VCK190

100G Live Traffic Test With MRMAC

XCVC1902 · GTM SerDes · 100G

Zynq UltraScale+ ZCU111 Lab Setup

Zynq UltraScale+ ZCU111

100G Live Traffic Test with CMAC

XCZU28DR · RFSoC · 100G CMAC

IBERT Eye Diagram

100G SerDes Tuning Results

Signal Integrity · GTY · 28 Gbps

Engineering Excellence Awards:

Public Repositories I Maintain:

Aug 2019-Jan 2024

Graduate Research Assistant

Mississippi State University - Starkville, MS

Collaborated with Center for Advanced Vehicular Systems (CAVS) researchers on autonomous vehicle perception systems using LIDAR, radar, and low-cost cameras. Used GPUs with TensorFlow and OpenCV to detect and track lanes under diverse road conditions, and applied ML techniques such as SqueezeSeg for camera-LIDAR sensor fusion.

Led multiple SDR-based projects:
  • An AI-powered triangulation system to locate contraband cell phones in prisons.
  • A spectrum scanning system that detects RF activity from IQ data for passive microwave sensing.
  • Wi-Fi-based human activity recognition using ML.

Selected as 1 of 10 for the MSU/USDA Summer Research Experience. Contributed to 5G research with 5 universities and National Instruments, experimenting with srsLTE/RAN, OAI, and Amarisoft while serving as an FAA Part 107 certified drone pilot.

SDR & Wireless Sensing Research

Architected real-time RF signal processing systems for contraband detection and human activity recognition, combining software-defined radio with machine learning to push wireless sensing into new applications.

Software Defined Radio · Machine Learning · Spectrum Scanning
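
The spectrum scanning idea described above, detecting RF activity from IQ data, can be illustrated with a simple energy detector. This is a minimal sketch under stated assumptions, not the research code: the function names, the 6 dB margin, and the synthetic captures are all hypothetical and chosen only for illustration.

```python
import cmath
import math
import random


def power_db(samples):
    """Mean power of complex IQ samples in dB, relative to an amplitude of 1.0."""
    mean_power = sum(abs(s) ** 2 for s in samples) / len(samples)
    return 10 * math.log10(max(mean_power, 1e-20))  # guard against log10(0)


def detect_activity(samples, noise_floor_db, margin_db=6.0):
    """Flag a capture whose power exceeds the noise floor by an assumed margin."""
    return power_db(samples) > noise_floor_db + margin_db


def synth_capture(n, noise_sigma, tone_amp=0.0, cycles_per_sample=0.05, seed=0):
    """Build a synthetic capture: Gaussian noise plus an optional complex tone."""
    rng = random.Random(seed)
    capture = []
    for k in range(n):
        noise = complex(rng.gauss(0, noise_sigma), rng.gauss(0, noise_sigma))
        tone = tone_amp * cmath.exp(2j * math.pi * cycles_per_sample * k)
        capture.append(noise + tone)
    return capture


if __name__ == "__main__":
    quiet = synth_capture(1024, noise_sigma=0.01, seed=1)
    busy = synth_capture(1024, noise_sigma=0.01, tone_amp=0.5, seed=2)
    floor_db = power_db(quiet)  # noise floor estimated from a signal-free capture
    print("quiet band active:", detect_activity(quiet, floor_db))
    print("busy band active:", detect_activity(busy, floor_db))
```

A real scanner would sweep the SDR across center frequencies and apply this kind of test per band; the fixed margin here stands in for whatever threshold calibration the actual system used.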
Jan 2020-Jan 2022

PCB Design Engineer Co-op

Hunter Engineering Company - Raymond, MS

Designed system-level functional testers for PCBs used in Hunter Engineering products (e.g., car lifts, wheel balancers, and tire changers). Created end-to-end test systems, including custom PCBs, displays, and interfaces, to verify board functionality before assembly. Developed intuitive LabVIEW GUIs and C/C++ backend code to ensure ease of use by operators with no technical background, minimizing user error. Integrated camera-based defect detection using OpenCV and built testers compatible with Aegis FactoryLogix. Delivered robust, operator-friendly systems that improved quality control and production efficiency.

Engineering Practice

Applied principles from FPGA and digital design courses to architect automated PCB testers, pairing hardware control with embedded software for fast cycle times.

LabVIEW · Embedded Control · Digital Verification

March 2019-August 2019

Quality Control

Amazon - Woot! Merch by Amazon - Dallas, TX

Ensured quality assurance in a high-volume T-shirt manufacturing facility by inspecting printed garments for accuracy, color consistency, and print defects. Verified customer specifications and maintained production standards. Played a key role in minimizing defective shipments and supporting efficient fulfillment operations.

Inspection & Process Focus

Standardized visual audits across shifts, partnered with fulfillment leads to capture repeat failure modes, and accelerated corrective loops before assembly.

Visual QA · Root Cause · Shift Handoffs

Education

Academic Journey in Electrical & Computer Engineering

Master's Degree
2021 - 2023

Master's in Electrical and Computer Engineering

Mississippi State University - Starkville, MS

📄 Master's Thesis

Software Defined Radio (SDR) Based Sensing

This thesis explores Software-Defined Radio (SDR) applications including Spectrum Scanning Systems, Contraband Cellphone Detection, and Human Activity Recognition via Wi-Fi signals. SDRs empower spectrum scanning systems to monitor and analyze radio frequencies in real-time, optimizing spectrum allocation for seamless wireless communication. The research demonstrates SDR-based identification of unauthorized signals in restricted areas and leverages Raspberry Pi 3B+ for tracking movement patterns via Wi-Fi signals. Additionally, a comparative analysis of Wi-Fi-based Human Activity Recognition versus Radar systems was conducted for accuracy assessment, showcasing the versatility of SDR platforms in real-time signal processing and wireless sensing applications.

📖 Read Full Thesis →

SDR · RF Sensing · Machine Learning · Signal Processing

Bachelor's Degree
2019 - 2022

Bachelor of Science in Electrical and Computer Engineering

Mississippi State University - Starkville, MS

Academic Focus

I was deeply involved with FPGA and digital design coursework alongside embedded systems labs that translated HDL theory into practical, board-level projects.

FPGA Design · Digital Systems · Embedded Systems

Coursework
Advanced Coursework

Specialized Technical Training

💻 FPGA & Digital Design

  • Verilog for Digital System Design
  • Digital System Design

🖥️ Computer Systems & Architecture

  • Computer Architecture (MIPS)
  • Embedded System Design

🤖 Advanced Topics

  • Machine Learning and Artificial Intelligence
  • Sensor Processing for Autonomous Vehicles

Certifications

Professional Credentials & Technical Training

Featured Certifications

The Fundamentals of RDMA Programming
NVIDIA Deep Learning Institute
September 2025 • Credential ID: 14143
View Certificate (PDF) →

Jenkins: Beginner To Pro
AMD Learning (Udemy)
September 2025
View Certificate → | External Link

Verilog HDL: VLSI Hardware Design Comprehensive Masterclass
AMD Learning (Udemy)
Completed
View Certificate → | External Link

Linux Device Drivers
LinkedIn Learning
Completed
View Certificate → | External Link

Embedded Linux using Yocto
AMD Learning (Udemy)
Completed
View Certificate → | External Link

LPIC-2 Linux Engineer (201-450) Cert Prep
LinkedIn Learning
Completed
View Certificate → | External Link

Projects

Open-source FPGA reference designs and hardware projects spanning high-speed networking, embedded systems, and robotics.

⚡ FEATURED PROJECT
DECEMBER 2025 • R&D INITIATIVE

72ns Gateway: A Versal ACAP HFT Accelerator

Nanosecond-Latency Network Processing on FPGA

Personal R&D exploration into sub-100ns packet processing on Xilinx Versal ACAP. Implements a complete hardware network stack: UDP filtering at wire speed, clock domain crossing with Gray code synchronizers, and protocol parsing for NASDAQ TotalView-ITCH 5.0 market data feeds. Demonstrates mastery of heterogeneous computing, metastability prevention, and low-latency architecture design. Target latency: 72ns (9 clock cycles @ 125 MHz).

  • Target Latency: 72 ns
  • Clock Frequency: 125 MHz
  • Throughput: 1 Gbps
  • Header Stripped: 42 bytes
Versal VCK190 · UDP Filter RTL · Async FIFO CDC · Gray Code Sync · NASDAQ ITCH 5.0 · MoldUDP64 · AXI Stream · Vivado 2024.2
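
The latency target and the Gray code pointer scheme mentioned in this project can be sanity-checked with a few lines of Python. This is a minimal model under stated assumptions, not the project's RTL: the function names are hypothetical, the 4-bit pointer width is an assumed value, and the 42-byte figure is assumed to be the usual Ethernet + IPv4 (no options) + UDP header sum.

```python
# Illustrative model only; the actual design is RTL and these names are hypothetical.

CLOCK_HZ = 125_000_000      # clock frequency quoted in the project summary
PIPELINE_CYCLES = 9         # cycle count quoted in the project summary
HEADER_BYTES = 14 + 20 + 8  # assumed breakdown: Ethernet + IPv4 (no options) + UDP


def latency_ns(cycles: int, clock_hz: int) -> float:
    """Convert a cycle count into nanoseconds at the given clock frequency."""
    return cycles * 1e9 / clock_hz


def bin_to_gray(value: int) -> int:
    """Binary to Gray code: consecutive values differ in exactly one bit."""
    return value ^ (value >> 1)


def gray_to_bin(gray: int) -> int:
    """Invert bin_to_gray by XOR-ing together all right shifts of the input."""
    value = 0
    while gray:
        value ^= gray
        gray >>= 1
    return value


def single_bit_steps(width: int) -> bool:
    """Check that every pointer increment, including wrap-around, flips one bit."""
    size = 1 << width
    for i in range(size):
        diff = bin_to_gray(i) ^ bin_to_gray((i + 1) % size)
        if bin(diff).count("1") != 1:
            return False
    return True


if __name__ == "__main__":
    print("latency (ns):", latency_ns(PIPELINE_CYCLES, CLOCK_HZ))
    print("header bytes:", HEADER_BYTES)
    print("one bit per step:", single_bit_steps(4))  # assumed 4-bit FIFO pointer
```

The one-bit-per-step property is what makes Gray-coded pointers safe to pass through a synchronizer: a sample taken mid-transition can only resolve to the old or the new pointer value, never to an unrelated one.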
🤖 FEATURED PROJECT
2024–2025 • PRODUCTION SYSTEM

Social Media Automation Engine

Temporal-Orchestrated AI Content Pipeline: Script to Published Video, Zero Human Typing

A 24,700-line Python system that autonomously discovers trending topics, generates scripts via multi-provider LLM orchestration (GPT-4o, DeepSeek, Claude), renders broadcast-quality vertical videos with FFmpeg compositing (karaoke captions, code blocks, Mermaid diagrams), and publishes to 4 platforms. Orchestrated by Temporal durable workflows with 55 activities, 5 QA gates, cascading fallback chains, and a signal-based human-in-the-loop approval via Telegram. Self-learning intelligence layer feeds performance data back into content generation. Runs on a local GPU server at $0/month.

  • Lines of Code: 24,700
  • Temporal Workflows: 5
  • Activities: 55
  • Infrastructure Cost: $0/mo
Python 3.11 · Temporal.io · GPT-4o / DeepSeek · FFmpeg · Docker Compose · FastAPI + HTMX · Whisper / XTTS · ComfyUI / FLUX
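
The cascading fallback chains mentioned in this project can be sketched in plain Python. This is an illustration of the general pattern only, not the pipeline's code: it does not use the Temporal SDK or any real LLM client, and every name here (`generate_with_fallback`, `AllProvidersFailed`, the stand-in providers) is hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

# A provider is a (name, callable) pair; the callable maps a prompt to text.
Provider = Tuple[str, Callable[[str], str]]


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the chain has failed."""


def generate_with_fallback(prompt: str, providers: Sequence[Provider]) -> Tuple[str, str]:
    """Try providers in order; return (provider_name, text) from the first success."""
    errors: List[str] = []
    for name, call in providers:
        try:
            text = call(prompt)
        except Exception as exc:  # any provider failure moves on to the next one
            errors.append(f"{name}: {exc}")
            continue
        if text.strip():
            return name, text
        errors.append(f"{name}: empty response")
    raise AllProvidersFailed("; ".join(errors))


def flaky_primary(prompt: str) -> str:
    """Stand-in for a hosted model that is currently unavailable."""
    raise TimeoutError("simulated timeout")


def local_backup(prompt: str) -> str:
    """Stand-in for a local model that always answers."""
    return f"script for: {prompt}"


if __name__ == "__main__":
    chain = [("primary", flaky_primary), ("backup", local_backup)]
    print(generate_with_fallback("FPGA basics", chain))
```

In a durable-workflow setting each provider call would typically be its own activity, so that retries and timeouts are handled by the orchestrator rather than by a bare try/except.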
🧠 FEATURED PROJECT
2024–2025 • PRIVATE INFRASTRUCTURE

Cognitive Silo: Private AI Infrastructure

15 Docker Services · 12 AI Models · 48 GB VRAM · $0/Month Self-Hosted AI Stack

A single docker compose up deploys LLMs, voice cloning, image generation, video generation, music generation, and persistent AI memory behind a unified OpenAI-compatible gateway. Runs on an AMD Radeon PRO W7900 (48 GB VRAM, ROCm 6.0) with VRAM budget management, init container pattern for idempotent deployments, LiteLLM fallback routing to GPT-4o, and a 544-line interactive setup wizard. The infrastructure layer that powers the SMA pipeline and all local AI workloads.

  • Docker Services: 15
  • AI Models: 12
  • VRAM Managed: 48 GB
  • Infrastructure Cost: $0/mo
Docker Compose · Ollama (ROCm) · LiteLLM Gateway · AMD W7900 48GB · ComfyUI / FLUX · Whisper / XTTS · Qdrant + Mem0 · PostgreSQL + Redis
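
One simple way to think about VRAM budget management on a single 48 GB card is a priority-ordered admission check. The sketch below is an assumption about how such a planner could look, not the stack's actual logic: the `Model` class, `plan_vram`, the model names, and every size and priority are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Model:
    name: str
    vram_gb: float
    priority: int  # lower number = more important (assumed convention)


def plan_vram(models: Iterable[Model], budget_gb: float) -> Tuple[List[Model], List[Model]]:
    """Greedily admit models by priority until the VRAM budget is exhausted."""
    loaded: List[Model] = []
    deferred: List[Model] = []
    remaining = budget_gb
    for model in sorted(models, key=lambda m: m.priority):
        if model.vram_gb <= remaining:
            loaded.append(model)
            remaining -= model.vram_gb
        else:
            deferred.append(model)  # would be loaded on demand or routed elsewhere
    return loaded, deferred


if __name__ == "__main__":
    # Hypothetical workloads and sizes, for illustration only.
    catalog = [
        Model("chat-llm", 24.0, priority=1),
        Model("image-gen", 16.0, priority=2),
        Model("speech", 6.0, priority=3),
        Model("video-gen", 20.0, priority=4),
    ]
    loaded, deferred = plan_vram(catalog, budget_gb=48.0)
    print("loaded:", [m.name for m in loaded])
    print("deferred:", [m.name for m in deferred])
```

A greedy pass like this is easy to reason about but not optimal; a real deployment might also unload idle models or fall back to a hosted model when a request targets something that did not fit.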

Life Outside Work

When I'm not building FPGA designs, I'm chasing adventures in the skies and on the road

✈️ Aviation Passion

FAA Part 107 Certified Drone Pilot | Aviation Enthusiast

Watch on YouTube

💍 Aviation Milestones

Where My Love for Aviation Met My Love Story

✈️ Proposal at 1,500 Feet

She Said Yes... In The Sky!

As a student pilot working on my private pilot license, I proposed to my now-wife during a flight at 1,500 feet in a single-engine Cessna 172. I had "MARRY ME" written on the ground in huge letters. She said yes!

🚁 Wedding Grand Entry

Arrived By Helicopter

Made a grand entrance to our wedding ceremony via helicopter. When you love aviation, why not make it part of your special day?

Helicopter Wedding Entry

🏍️ Life on Two Wheels

Weekend Rides | Open Roads | Freedom on Two Wheels

Watch on YouTube

📸 Miscellaneous Gallery

Travel | Photography | Adventures

Watch on YouTube

Get In Touch

Let's discuss FPGA engineering, network systems, or potential opportunities

Location

📍 Austin, TX

Global Visitors

🌍 Thank you for visiting from around the world!