Michael Feldman

Research Software Engineer · Data Infrastructure · Open Science

I build the pipelines, databases, and architectures that make research data usable at scale. My work sits at the intersection of scientific computing, data engineering, and open-source community building.

Previously RSE at Stanford University (METER-AI / Andrew Ng, Doerr School of Sustainability, SRCC), where I architected cloud data pipelines processing 14M+ records from 200+ sources and led the open-sourcing of a methane emissions dataset now used by Planet and CarbonMapper for climate mitigation. Before that, nearly three years at UW-Madison Radiology building serverless containerized tools for neuroimaging data management (BIDS, NIfTI, DICOM, Flywheel.io) and GB-range deidentification pipelines under HIPAA/IRB compliance.

Currently exploring open neuroscience infrastructure — studying how platforms like brainlife.io and FreeSurfer are architected, and how standards like BIDS and NWB govern the flow of data from scanner to archive.

Areas of Interest

  • Neuroimaging & neurophysiology data — BIDS, NIfTI, DICOM, NWB, EEG pipelines, deidentification
  • Data versioning & reproducibility — Git internals, containerized workflows, CI/CD for research
  • Graph-based data models — Neo4j, provenance modeling (PROV), brain connectivity, cross-archive metadata linking
  • Agentic AI for science — LLM-powered development workflows, automated data validation, standards compliance
  • Cloud data engineering — GCP (certified), AWS, BigQuery, Terraform, Docker, Kubernetes

Open Source

  • MassGen1 — Framework for AI-augmented workflows
  • Contributions and explorations in neuroinformatics tooling, including study of the DataLad, DANDI, and HeuDiConv ecosystems

Certifications

NVIDIA Certified Professional: Agentic AI (2026) · Google Cloud Professional Data Engineer (2025) · Neo4j Certified Professional (2025) · Neo4j Graph Data Science (2025) · Google Cloud Professional ML Engineer (2020) · Google Cloud Professional Cloud Architect (2018, 2020)

Education

M.S. Applied Statistics, Penn State University (GPA 3.77) · B.S. Finance, Minor in Statistics, Penn State University

Publications & Presentations

  • Stanford SRCC, 2023 — METER-AI: BigQuery pipelines, serverless architecture, SAM (Segment Anything Model) integration
  • SIIM, 2019 — Healthcare interoperability, serverless functions, and big data analytics

Interested in open-source neuroscience infrastructure, reproducible science, and building tools that make research data FAIR and accessible. Always looking to connect with people working on these problems.
