CV
Contact Information
| Name | Songting (Michael) Wang |
| Professional Title | ECE Student & Software / ML Systems Engineer |
| Email | stw183164761@gmail.com |
| Phone | +1 (412) 916-4409 |
Professional Summary
M.S./B.S. student in Electrical and Computer Engineering at Carnegie Mellon University, focused on ML systems, distributed systems, and infrastructure.
Experience
-
2025 - 2025 Boston, MA
Software Engineer Intern – Core Infra
S&P Global – Kensho
- Implemented structured logging for the OpenSearch log-router, enabling field-level queries and reducing log analytics time by 50%.
- Delivered endpoint-level FastAPI metrics for 7+ product teams using Grafana and Prometheus.
- Built and managed a centralized, automated alert visualization system on Kubernetes for 60+ application teams.
-
2024 - 2024 Shanghai, China
Software Engineer Intern – ML Infra
ZKH Industrial Supply
- Developed an LLM Benchmark System using Flask and Streamlit, reducing training evaluation time by 25%.
-
2023 - 2024 Pittsburgh, PA
Software Engineer – Backend
EnterviewAI
- Led a 10-member team to deliver production-grade backend services (e.g., Recording, Question Bank) using Django and Azure.
- Implemented AI features (e.g., Study Plans, Brainstorming, Mock Interviews, Feedback) with AutoGen and LangChain.
-
2023 - 2024 Shanghai, China
Software Engineer Intern – Backend
United Imaging Intelligence
- Enhanced ShanghaiTech University’s medical questionnaire app, extending question abstractions and backend update semantics.
Education
-
2025 - 2026 Pittsburgh, PA
-
2022 - 2025 Pittsburgh, PA
B.S.
Carnegie Mellon University
Electrical and Computer Engineering, Minor in Computer Science
- Coursework: Distributed Systems, AI/ML Systems, Deep Learning Systems, CPU/GPU Architecture, Algorithms, Functional Programming
Research
-
2026 - Pittsburgh, PA
Graduate Researcher
Google CoreML × CMU Catalyst
- Building a compiler to lower Mirage-generated computation graphs into efficient Google TPU kernels (advised by Prof. Zhihao Jia).
-
2025 - Pittsburgh, PA
ML Systems Research with Prof. Zhihao Jia
CMU Catalyst Group
- Working on Mirage Persistent Kernel, a Compiler and Runtime for Mega-Kernelizing Tensor Programs.
- Implemented FlashInfer’s optimized Gumbel-Max Sampling CUDA kernels in Mirage Persistent Kernel and verified with unit tests.
- Working on Expert-Parallel Mixture-of-Experts CUDA kernels in Mirage Persistent Kernel using all-to-all (A2A) communication.
-
2024 - Pittsburgh, PA
Data Systems Research with Prof. Vyas Sekar
CyLab Security & Privacy Institute, CMU
- Invited speaker at Current 2025, the world’s largest data streaming conference, attended by 3,500+ industry professionals.
- Open-sourced FlinkSketch, a high-performance library of sketching algorithms for Apache Flink.
- Building ProjectASAP, a set of low-latency, cost-efficient data pipelines to support next-gen agentic workloads.
Teaching
-
2024 - 2025 Pittsburgh, PA
Teaching Assistant
Carnegie Mellon University
- Distributed Systems (15-440/640), Spring 2025: Supported 200 students on distributed protocols; led recitations; designed/graded coursework.
- Computer Systems (18-213/613), Spring 2024 & Fall 2024: Supported 150 students on assembly, memory, I/O, concurrency; led small groups; graded work.
Projects
-
Needle Deep Learning Framework
Built core deep learning framework components in C++ and Python.
- Built core components of Needle, including autodiff, tensor IR, backend dispatch, and common neural network modules.
- Implemented FlashAttention in CUDA kernels, improving memory and throughput efficiency.
-
Distributed File System
Distributed file system with RPC and concurrent LRU caching in C and Java.
- Implemented Remote Procedure Call (RPC) and concurrent LRU caching to support efficient client-server communication.
- Achieved reliability and scalability through two-phase commit, write-ahead logging, and dynamic scaling.
-
RISC-V CPU Microarchitecture
7-stage pipelined RISC-V CPU in C and Verilog.
- Implemented a 7-stage pipelined RISC-V CPU microarchitecture, with an LRU cache and optimizations to improve performance per watt.
Skills
Programming Languages (Expert): Python, C++, Java, C, Jsonnet, Scala, JavaScript, Standard ML, C#, Verilog, Rust
Frameworks (Proficient): CUDA, JAX Pallas, Kubernetes, Flink, PyTorch, NumPy, Django, Flask, FastAPI, React, Node.js, Streamlit
Tools & Platforms (Proficient): Git, Grafana, Prometheus, Terraform, Jenkins, OpenSearch, Azure, AWS, Docker, Postman, Databases
Languages
English: Fluent
Chinese (Mandarin): Native speaker