CV


Contact Information

Name Songting (Michael) Wang
Professional Title ECE Student & Software / ML Systems Engineer
Email stw183164761@gmail.com
Phone +1 (412) 916-4409

Professional Summary

M.S./B.S. student in Electrical and Computer Engineering at Carnegie Mellon University, focused on ML systems, distributed systems, and infrastructure.

Experience

  • 2025 - 2025

    Boston, MA

    Software Engineer Intern – Core Infra
    S&P Global – Kensho
    • Implemented structured logging for the OpenSearch log router, enabling field-level queries and reducing log analytics time by 50%.
    • Delivered endpoint-level FastAPI metrics for 7+ product teams using Grafana and Prometheus.
    • Built and managed a centralized, automated alert visualization system on Kubernetes for 60+ application teams.
  • 2024 - 2024

    Shanghai, China

    Software Engineer Intern – ML Infra
    ZKH Industrial Supply
    • Developed an LLM Benchmark System using Flask and Streamlit, reducing training evaluation time by 25%.
  • 2023 - 2024

    Pittsburgh, PA

    Software Engineer – Backend
    EnterviewAI
    • Led a 10-member team to deliver production-grade backend services (e.g., Recording, Question Bank) using Django and Azure.
    • Implemented AI features (e.g., Study Plans, Brainstorming, Mock Interviews, Feedback) with AutoGen and LangChain.
  • 2023 - 2024

    Shanghai, China

    Software Engineer Intern – Backend
    United Imaging Intelligence
    • Enhanced ShanghaiTech University’s medical questionnaire app, extending question abstractions and backend update semantics.

Education

  • 2025 - 2026

    Pittsburgh, PA

    M.S.
    Carnegie Mellon University
    Electrical and Computer Engineering
  • 2022 - 2025

    Pittsburgh, PA

    B.S.
    Carnegie Mellon University
    Electrical and Computer Engineering, Minor in Computer Science
    • Coursework: Distributed Systems, AI/ML Systems, Deep Learning Systems, CPU/GPU Architecture, Algorithms, Functional Programming

Research

  • 2026 -

    Pittsburgh, PA

    Graduate Researcher
    Google CoreML × CMU Catalyst
    • Building a compiler to lower Mirage-generated computation graphs into efficient Google TPU kernels (advised by Prof. Zhihao Jia).
  • 2025 -

    Pittsburgh, PA

    ML Systems Research with Prof. Zhihao Jia
    CMU Catalyst Group
    • Working on Mirage Persistent Kernel, a compiler and runtime for mega-kernelizing tensor programs.
    • Implemented FlashInfer’s optimized Gumbel-Max Sampling CUDA kernels in Mirage Persistent Kernel and verified with unit tests.
    • Working on Expert-Parallel Mixture-of-Experts CUDA kernels in Mirage Persistent Kernel using all-to-all (A2A) communication.
  • 2024 -

    Pittsburgh, PA

    Data Systems Research with Prof. Vyas Sekar
    CyLab Security & Privacy Institute, CMU
    • Invited speaker at Current 2025, the world’s largest data streaming conference, attended by 3,500+ industry professionals.
    • Open-sourced FlinkSketch, a high-performance library of sketching algorithms for Apache Flink.
    • Building ProjectASAP, low-latency, cost-efficient data pipelines to support next-gen agentic workloads.

Teaching

  • 2024 - 2025

    Pittsburgh, PA

    Teaching Assistant
    Carnegie Mellon University
    • Distributed Systems (15-440/640), Spring 2025: Supported 200 students on distributed protocols; led recitations; designed/graded coursework.
    • Computer Systems (18-213/613), Spring 2024 & Fall 2024: Supported 150 students on assembly, memory, I/O, concurrency; led small groups; graded work.

Projects

  • Needle Deep Learning Framework

    Built core deep learning framework components in C++ and Python.

    • Built core components of Needle, including autodiff, tensor IR, backend dispatch, and common neural network modules.
    • Implemented FlashAttention as CUDA kernels, improving memory and throughput efficiency.
  • Distributed File System

    Distributed file system with RPC and concurrent LRU caching in C and Java.

    • Implemented Remote Procedure Call (RPC) and concurrent LRU caching to support efficient client-server communication.
    • Achieved reliability and scalability through two-phase commit, write-ahead logging, and dynamic scaling.
  • RISC-V CPU Microarchitecture

    7-stage pipelined RISC-V CPU in C and Verilog.

    • Implemented a 7-stage pipelined RISC-V CPU microarchitecture, with an LRU cache and optimizations to improve performance per watt.

Skills

Programming Languages (Expert): Python, C++, Java, C, Jsonnet, Scala, JavaScript, Standard ML, C#, Verilog, Rust
Frameworks (Proficient): CUDA, JAX Pallas, Kubernetes, Flink, PyTorch, NumPy, Django, Flask, FastAPI, React, Node.js, Streamlit
Tools & Platforms (Proficient): Git, Grafana, Prometheus, Terraform, Jenkins, OpenSearch, Azure, AWS, Docker, Postman, Databases

Languages

English: Fluent
Chinese (Mandarin): Native speaker